On the limits of low-numerical-aperture imaging scatterometry

Although imaging scatterometry has been demonstrated to be a powerful technique for characterization of nano-gratings when high lateral resolution is required, some limits of this novel technique are still undisclosed yet, such as the constraint for the imaging numerical aperture (NA), the number of unit cells for accurate grating reconstruction, and the analyzability of image pixels associated with the grating region. To this end, we establish a vectorial image formation (VIF) model for imaging scatterometry based on the finite-difference time-domain (FDTD) method and vectorial diffraction theory. According to the established VIF model and the simulation results of a Si grating sample with finite numbers of unit cells, we find that accurate grating reconstruction by routine RCWA (rigorous coupled-wave analysis) -based data analysis requires an upper limit for the NA of the employed objective. And enough numbers of unit cells are also required to be covered in the illumination spot. Only in these conditions, the zeroth-order diffraction information of the grating under test can be exclusively and completely collected by the imaging system. Moreover, only the image pixels off the edge of the grating region are analyzable by routine RCWA-based data analysis due to the effect of edge scattering. The required number of grating unit cells and the size of the analyzable region are closely related with the imaging NA and the ratio between the illumination spot size and the size of the grating region D/L. Higher imaging NA or smaller D/L typically requires fewer grating unit cells and meanwhile allows a larger analyzable region. The investigation in this paper promises to provide valuable insights into the application of imaging scatterometry. © 2020 Optical Society of America under the terms of the OSA Open Access Publishing Agreement


Introduction
Imaging scatterometry is a hybrid of optical scatterometry and optical microscopy aiming to solve the low lateral resolution of traditional non-imaging optical scatterometry. Optical scatterometry is currently one of the major in-line workhorse metrology solutions in addressing devices with deep sub-wavelength feature sizes in semiconductor industry [1,2]. Nevertheless, the lateral resolution of optical scatterometry is limited by its illumination spot size and is typically lower than 30 µm, which restricts its applications in characterization of gratings patterned in regions smaller than the illumination spot, such as micropolarier array [3] and multi-view holographic 3D display for augmented reality [4]. In comparison, imaging scatterometry allows for direct visualization of a sample and intuitive selection of interested regions from the illumination area for data analysis, even when the illumination spot size is larger than the grating region to be analyzed [5][6][7][8].
It should be pointed out that, although sample images can be acquired in imaging scatterometry, its imaging principle is different from that in traditional optical microscopy, since only the zeroth-order diffraction beam of a grating under test is collected in imaging scatterometry (achieved by a small NA objective) with high-order diffraction beams excluded. Therefore, the acquired images in imaging scatterometry essentially show only contrast of different sample regions but not profile details of the grating under test. The strategy for grating reconstruction in imaging scatterometry totally borrows from that in traditional optical scatterometry [9], which involves first establishing a diffraction model according to RCWA (also named as Fourier modal method) [10,11] to relate the profile parameters of a grating under test with its optical signature and then solving an inverse diffraction problem pixel-by-pixel with the objective of finding an optimal input to the established model whose simulated signature can best match the measured one. The signature here can be in forms of reflectance, ellipsometric angles, Mueller matrix elements, etc. [12][13][14][15][16][17][18] Note that the above routine RCWA-based grating reconstruction approach does not consider the imaging process in imaging scatterometry. Although experiments have demonstrated the great potential of imaging scatterometry [5][6][7][8], some limits of this novel technique are still undisclosed yet to the best of our knowledge, which are summarized as follows: (1) The constraint for the imaging NA. The NA of the employed objective in an imaging scatterometer determines its lateral resolution. A larger NA objective gives a higher lateral resolution. On the other hand, an upper limit should be imposed for the NA of the employed objective to ensure that the zeroth-order diffraction information of a grating under test is exclusively collected.
(2) The number of unit cells for accurate grating reconstruction. RCWA calculations assume that the number of grating unit cells is infinite. However, in imaging scatterometry, the illumination spot size is usually larger than the grating region to be analyzed. In this case, how many unit cells are required so that the collected diffraction beams approximate to be plane waves and the grating reconstruction process does not lose accuracy?
(3) The analyzability of image pixels associated with the grating region. As mentioned above, the illumination spot size in imaging scatterometry can be larger than the grating region to be analyzed. In this case, the light-matter interaction near the edge of the grating region is no longer a routine diffraction problem but a general scattering problem, which will lead to errors in routine RCWA-based data analysis for pixels near the edge of the grating region.
In order to investigate the above issues and disclose the inherent limits of imaging scatterometry, we first establish a VIF model for imaging scatterometry based on the FDTD method [19] and vectorial diffraction theory [20][21][22][23]. Considering that RCWA is primarily suitable for calculation of diffraction fields of periodic structures, the FDTD is used as a general method to deal with the light-matter interaction in the near filed in imaging scatterometry in this work. Although the imaging NA in imaging scatterometry is usually small, considering that polarization is sometimes taken into consideration, the vectorial theory is adopted to describe the light propagation from the near field to the image plane in imaging scatterometry. We then use the established VIF model to simulate imaging signature of a grating under test with finite numbers of unit cells and take the simulated imaging signature as the measured data by an imaging scatterometer for further data analysis. Figure 1 depicts two commonly used configurations for an imaging scatterometer. One is the oblique objective configuration revamped from an ellipsometer [5,6], as shown in Fig. 1(a). And another one, as shown in Fig. 1(b), is the vertical objective configuration revamped from an optical microscope [7,8]. As depicted in Fig. 1, the light beam emitted from a light source is first collimated by a collimating lens and then incident upon a grating sample under test. After interacted with the sample, the diffracted (or scattered) beam is collected by an objective and finally enters into a detector. For ellipsometric measurements, a polarization state generator and a polarization state analyzer that are both comprised of polarizers and retarders are incorporated into the incidence arm and the detection arm of the instrument, respectively. Despite of the differences between the above two configurations, their imaging principles are identical, which can be abstracted as a VIF model using the pupil approach as described in Fig. 2. According the Hopkins' work [24], the imaging system that satisfies the Abbe sine condition can be simplified as consisting of an entrance pupil and an exit pupil. The entrance and exit pupils are represented by the Gaussian reference spheres with their centers located on the origins of the object and image planes, respectively. As shown in Fig. 2, the diffraction beams from the object plane will pass through the entrance and exit pupils, and then converge to the image plane. So, the VIF model can be divided into four parts: (1) The light-matter interaction in the near field (the object plane); (2) The propagation of diffraction beams from the object plane to the entrance pupil; (3) The propagation of diffraction beams in the imaging system, i.e. from the entrance pupil to the exit pupil; (4) The imaging on the image plane. Next, we will introduce each part separately in detail.

Light-matter interaction in the near field by FDTD method
When the size of the sample under test approaches to the illuminating light wavelength, traditional geometric optics principle is no longer applicable. And the mechanism of the light-matter interaction can be described by the well-known Maxwell's equations. However, there is no analytical solution to most light-matter interaction calculations except for several special cases.
The FDTD method is a method for numerically solving the Maxwell's equations. It is a timedomain technique, so a broadband optical source can be used and the response of system with different wavelengths can be obtained in a single calculation.
The differential time-domain Maxwell's curl equations for nonmagnetic materials are given as follows where E and H are the electric and magnetic field vector, respectively, and ε and µ are the complex dielectric constant and magnetic permeability, respectively. The above curl equations can be rewritten in terms of electric and magnetic field components in the Cartesian coordinate system as As shown in Fig. 3, the FDTD method decomposes the object into discrete spatial and temporal Yee cells. The electric field components are located on the edges and the magnetic field components are located on the faces of the Yee cell. Then, let E m (i, j, p) denotes the electric field on the cell indexed as (i, j, p) at the time instants indexed as m. And using the central difference approximation [19], we can rewrite Eq. (6) as where ∆y, ∆z and ∆t denote the dimension of the Yee cell in the y-direction, z-direction and time interval, respectively. Equation (9) is called the updating equation in FDTD. We can acquire updating equations for other electric and magnetic field components in a similar manner. Then, the FDTD method uses a time-marching algorithm to update the electric field at the time instant (m+1)∆t and magnetic field at the time instant (m+0.5)∆t with a specific boundary condition. And the current field components are stored as output data. This process is repeated over and over again until certain stopping conditions are satisfied.

Propagation of diffraction beams from the object plane to the entrance pupil
With the obtained the electric field E(x, y, 0) on the object plane, the plane-wave spectrum of E(x, y, 0) can be further calculated by [25] where k x and k y are the corresponding spatial frequencies and i is the imaginary unit defined as i 2 = −1. Equation (10) can be regarded as the series expansion of an arbitrary electromagnetic field in terms of plane waves with different amplitudes and propagation directions. As shown in Fig. 4, for an diffraction beam from the object plane, its unit direction vector is assumed to bê r = (r x , r y , r z ). Since the objective is usually telecentric on the object side, the entrance pupil can be seen located at infinity. According the method of stationary phase and [25,26], the electric field of the intersection ofr with the entrance pupil can be obtained by [25] where k = 2π/λ is the wave number, λ is the illuminating wavelength, and r is the distance from the point on the entrance to the origin of the object plane. Note that, the right side of Eq. (11) depends on the r z . Equation (11) is valid for the angular spectra of all the diffraction orders.
For the zeroth-order diffracted light (r z = 1), according to Eq. (11), we can see that the far field representation E(r x , r y , r z ) is equivalent to the angular spectra. However, for other diffraction light (r z 1), the far field representation E(r x , r y , r z ) has a deviation of factor r z compared to the angular spectra.

Propagation of diffraction beams in the imaging system
Due to the finite aperture size of optical elements, only part of diffraction beams can be collected and propagated in a practical imaging system. As shown in Fig. 4, for the diffraction beam with the unit direction vectorr = (r x , r y , r z ), the angle betweenr and the optical axis (assuming the z-axis here) is θ. The meridional plane is formed by the direction vectorr and the optical axis. And the meridional plane forms an angle φ with the x-axis. First, we transform (r x , r y , r z ) to (θ, Then, only the beams that satisfy the following equation can be collected where n is the refractive index in the object side, and NA is the NA of the objective. As illustrated in Fig. 4, when the beam passes through the imaging system, the direction vectorr will rotate tô s. The corresponding angles forŝ are denoted as θ and φ , respectively. According to Abbe's theory [27], an aplanatic imaging system satisfies the following equations where n is the refractive index in the image side and M is the magnification factor of the imaging system. As for the electric field vector of the diffraction beam, we traced each beam passing through the optical system and leaving from the exit pupil. According to the works of B. Richards and E. Wolf [21,22], the angle between the electric field vector and the meridional plane will remain unchanged when the beam passes through the optical system if the angle of incidence at the entrance pupil is small. Since the NA of the employed objective in imaging scatterometry is usually small, the angle of incidence at the entrance pupil satisfies Richards and Wolf's theory. In addition, the amplitudes of the electric field vector on the exit pupil can be obtained according to the law of energy conservation as follows [20] where E s (θ, φ) and E p (θ, φ) are the electric field components in the object side, which can be obtained by Eqs. (11)-(13), and E s (θ , φ ) and E p (θ , φ ) are the electric field components in the image side. As shown in Fig. 4, E s (θ, φ) and E s (θ , φ ) are the components that are vertical to the meridional plane, and E p (θ, φ) and E p (θ , φ ) are the components that are parallel to the meridional plane and vertical to the direction vector.

Imaging on the image plane
For the imaging on the image plane, we use the Debye-Wolf integral to calculate the electric field distribution on the image plane for an aplanatic optical system by [21,22] where Ω represents the integration area, i.e., the solid angle limited by the exit pupil, E (s x , s y ) represents the electric field vector of the beam leaving from the exit pupil, which is calculated by Eqs. (17) and (18), (s x , s y , s z ) represents the unit direction vector and can be calculated by (s x , s y , s z ) = (sinθ cosφ , sinθ sinφ , cosθ ). According to Eq. (19), we can further acquire the Jones matrix distribution on the image plane by assuming a coherent imaging system. The relationship between the Jones vector on the image plane and the Jones vector of the incident beam with a uniform electric field is given by where [E 0,p , E 0,s ] T refers to the Jones vector of the incident light and [E image,p (x , y ), E image,s (x , y )] T refers to the Jones vector distribution on the image plane. According to the acquired Jones matrix distribution, the reflectance image can be calculated by The ellipsometric angle images Ψ and ∆ can be calculated by The Mueller matrix image can be calculated by [28] M where ⊗ denotes the Kronecker product, J* is the complex conjugate of J, and the matrix A is given by The Mueller matrix is usually normalized to the (1, 1)-st element M 11 , with the normalized Mueller matrix elements defined as m ij = M ij /M 11 .

Sample description and simulation conditions
Figure 5(a) shows the cross-section profile of a Si grating sample examined in simulation, which has a pitch of Λ = 800 nm, a linewidth of W = 400 nm, and a line height of H = 500 nm, respectively. The refractive index of Si refers to [29]. Figure 5(b) depicts the scheme of the illumination setup in simulation. The illuminating light is assumed to be plane wave with the incident angle denoted as α. The Mueller matrices images are calculated by scanning the wavelength λ from 400 to 700 nm in increments of 20 nm and making the plane of incidence perpendicular to grating lines. The magnification factor of the imaging system is fixed at M = 10.
Both the refractive indices in the object and image sides are assumed to be 1. In addition, the illumination spot size is larger than the grating region to be analyzed that consists of N unit cells. Figure 5(c) shows the illumination area with a diameter of D on the Si grating sample and the length of the grating region along the x-axis is denoted as L (NΛ = L). Note that here the illumination area on the Si grating sample is simply drawn as a circle even though it should be an ellipse at oblique incidence. According to the VIF model, the distributions of reflectance, ellipsometric angles, and Mueller matrix on the image plane can be calculated. Among these signatures, Mueller matrix contains more sample information, and moreover, the reflectance and ellipsometric angles can also be derived from the Mueller matrix (in case of not being normalized to the (1, 1)-st element M 11 ). Therefore, we take the Mueller matrix image as an example in the subsequent analysis.
We used the established VIF model to simulate Mueller matrix images of the Si grating sample and take the VIF-simulated results as the actual measured data by an imaging scatterometer. Considering that the routine data analysis in imaging scatterometry relies on the matching between RCWA-calculated and measured data, the difference between the RCWA-calculated and the VIF-simulated Mueller matrices indicates errors in grating reconstruction by routine RCWA-based data analysis in imaging scatterometry. To evaluate the difference between VIF-simulated and RCWA-calculated Mueller matrices, we define a root-mean-squared error (RMSE) as follows where m VIF ij,k and m RCWA ij,k represent the VIF-simulated and RCWA-calculated (i, j)-th Mueller matrix elements at the k-th wavelength, respectively, and K denotes the total number of wavelengths. The constant 15 in Eq. (26) refers to the 15 normalized Mueller matrix elements except the (1, 1)-st element m 11 . Larger RMSE values indicate larger errors in the reconstructed grating profile by routine data analysis.
We used a commercial software "FDTD solutions" (Lumerical Inc., Canada) to perform FDTD calculations [30]. In order to reduce FDTD calculation time, we assumed the length of the grating line along the y-axis was infinite and just made a two-dimensional calculation. For the FDTD model, we used perfectly matched layers to truncate computational regions and absorb electromagnetic fields at the edges of truncated regions. The length of the FDTD computational region was set to be 1 µm larger than the diameter of the illumination area [labeled as D in Fig. 5(c)] in the following simulations. We used an electromagnetic field monitor to acquire the electromagnetic fields on the object plane of the Si grating sample.
Since the FDTD is used as a reference algorithm in the simulations, its accuracy is very important. As a numerical method for the solution of Maxwell's equations, the accuracy of the FDTD model mainly depends on the mesh grid size. Generally, a smaller mesh grid size yields a higher accuracy, but meanwhile increases computation burden on time and memory. In order to estimate an appropriate mesh grid size to ensure the FDTD model achieve enough accuracy, we calculated Mueller matrices of the zeroth-order diffraction beam of the Si grating sample by FDTD. Since the RCWA is a semi-analytical method and could yield a more accurate result, we used it as a reference to evaluate the accuracy of the FDTD. In the calculations, the truncation order in RCWA was set to be 30, which was enough to make sure the convergence of RCWA. Then, we compared the FDTD-calculated results m FDTD  Figure 6 shows the RMSE as a function of the mesh grid size. As expected, the RMSE decreases with the decreasing mesh grid size. The mesh grid size of 3 nm yields a RMSE of about 0.0061. Considering that the practical measurement precision of imaging Mueller matrices are about 0.01 [6], we therefore chose a mesh grid size of 3 nm in subsequent simulations to make a trade-off between the calculation accuracy and the computation burden.

The constraint for the imaging NA
As mentioned in the Introduction section, there should be an upper limit for the NA of the employed objective to ensure that the zeroth-order diffraction beam of a grating under test is exclusively collected. According to the well-known grating formula, the imaging NA should satisfy the following equation where λ min is the minimal wavelength of a spectral range.
To investigate the influence of the imaging NA quantitatively, we simulated Mueller matrix images of the Si grating sample under the following simulation conditions: N = 50, D = 80 µm (D/L = 2), and λ = 400∼700 nm. To ensure a negligible effect of edge scattering, we chose the Mueller matrices associated with the central point (x = 0, y = 0) of the image plane and made a comparison with those calculated by RCWA according to Eq. (26). Figure 7(a) presents the RMSE as a function of the imaging NA under α = 0°. According to Eq. (27), the theoretical limit for the imaging NA of the investigated Si grating sample (Λ = 800 nm) should be NA t = 0.5. However, as can be observed from Fig. 7(a), the RMSE is beyond the noise level when the imaging NA reaches about 0.49. Within the range from 0.49 to 0.5, as shown in the inset of Fig. 7(a), the RMSE increases significantly with the increasing imaging NA.
The large RMSE for the imaging NA above the practical upper limits is primarily attributed to the finite bandwidth of the angular spectra associated with the first-order diffraction beams. Figure 7(b) presents the angular spectra of the 0th-and ±1st-order diffraction beams propagated to the entrance pupil calculated under α = 0°and λ = 400 nm by Eqs. (11) and (12). As can be observed, the bandwidth of the angular spectra of the first-order diffraction beams is not ideally zero as assumed in Eq. (27). In this case, partial information of the first-order diffraction beams is collected by the imaging system. The practical upper limit for the imaging NA, NA p as indicated in Fig. 7(b), should therefore take the finite bandwidth of the angular spectra of the first-order diffraction beams into account to ensure that the zeroth-order diffraction beam is exclusively collected. It should be noted that the bandwidth of the angular spectra of diffraction beams is closely related with the number of grating unit cells N as well as the ratio between the illumination spot size and the size of the grating region D/L, which will be discussed in detail in Section 4.2. Figure 7(c) further presents the theoretical and practical upper limits for the imaging NA at different incident angles for the investigated Si grating sample. As shown in Fig. 7(c), the upper limit of the imaging NA also depends on the incident angle. The upper limit of the imaging NA decreases with the increasing incident angles for incident angles below about 15°a nd increases for incident angles above 15°.

The number of unit cells for accurate grating reconstruction
In imaging scatterometry, the illumination spot size is usually larger than the grating region to be analyzed. The number of grating unit cells covered by the illumination spot determines the distribution of the collected diffraction information. In the extreme case, when there are only one or several grating unit cells, the light-matter interaction becomes a diffraction problem similar to the single-slit or multiple-slit diffraction with many subsidiary maxima in the diffraction pattern besides the principal maxima corresponding to the diffraction orders. The routine data analysis based on RCWA will inevitably lead to great deviation in the reconstructed grating profile in the extreme case, since RCWA calculations assume that the number of grating unit cells is infinite. Therefore, there should exist a minimal number of unit cells required for accurate grating reconstruction by routine RCWA-based data analysis in imaging scatterometry. It is noted that the effects of finite illumination spot size and finite target size have also been discussed in traditional non-imaging optical scatterometry [31,32]. . Although all NAs in the simulation are below the upper limit as described in Section 4.1, as can be observed from Fig. 8(a), the RMSEs at different NAs are below the noise level only when N > 40. When N < 40 and take N = 20 as an example, as shown in Fig. 8(a), the RMSEs corresponding to larger NA (NA = 0.3 or 0.4) are still below the noise level while those associated with smaller NA (NA = 0.1 or 0.2) are beyond the noise level. To interpret it, we calculated the angular spectra of diffraction beams propagated to the entrance pupil under N = 20, D/L = 1.6, α = 0°, and λ = 400 nm. As can be observed, the angular spectra of the 0th-and ±1st-order diffraction beams shown in Fig. 8(b) are quite different from those shown in Fig. 7(b) that were calculated under N = 50. The former angular spectra have many subsidiary maxima (side lobes) accompanying with the zeroth-order diffraction beam. In this case, only relatively large NA could completely collect the zeroth-order diffraction information and thus yield RMSE below the noise level. It therefore indicates that there should exist a lower limit for the imaging NA, NA (Obviously NA = 0.3 when N = 20), when there are a few numbers of unit cells. It is also noted from Fig. 8(a) that all the NAs from 0.1 to 0.4 fail to yield RMSEs below the noise level when the number of unit cells is decreased to N = 10. It is because in this case the subsidiary maxima in the angular spectra of diffraction beams become more striking and none of the above NAs could completely collect the zeroth-order diffraction information and meanwhile exclude the first-order diffraction information. In real experiments, the subsidiary maxima may be damped due to the finite NA of illumination and the finite bandwidth of illuminating wavelength. Nevertheless, the angular spectra of diffraction beams are still continuous when the grating has finite unit cells. It is therefore still necessary to consider this effect to select an appropriate imaging NA in real experiments. It is worth pointing out that the required number of unit cells for accurate grating reconstruction is also related with the ratio between the illumination spot size and the size of the grating region D/L, since D/L also affects the angular spectra of diffraction beams. Figure 9 further presents the RMSE as a function of N and D/L under different NAs. For a certain number of unit cells, we varied D to control D/L. As can be observed, for larger values of D/L (i.e., larger illumination spots and smaller grating regions), more numbers of unit cells are required for accurate grating reconstruction. This phenomenon is more evident under smaller NAs. It is because the subsidiary maxima in the angular spectra of diffraction beams become more striking for larger D/L, and larger NA or larger N is required to completely collect the zeroth-order diffraction information.

The analyzability of image pixels associated with the grating region
The results shown in Figs. 7-9 are only related with Mueller matrices of the Si grating sample associated with the central point (x = 0, y = 0) of the image plane. We further examined Mueller matrix images of the Si grating sample over a large area. Since we assumed that the length of the grating line along y-axis is infinite, as described in Section 3, we only simulated Mueller matrices along the line y = 0 of the image plane. Figure 10 presents the comparison between the VIF-simulated results and the RCWA-calculated zeroth-order Mueller matrices under N = 50, D/L = 2, α = 0°, λ = 500 nm, and NA = 0.4. For clarity, the grating region is marked in Fig. 10. As shown in Fig. 10, the Mueller matrices associated with the substrate are diagonal matrices diag(1, 1, −1, −1), which just corresponds to the Mueller matrix for reflection at normal incidence. The two 2 × 2 off-diagonal blocks of the Mueller matrices associated with the grating region vanish because the incident beam is perpendicular to grating lines in simulation. In addition, because the phase difference between pand s-polarizations are zero at normal incidence, the values of Mueller matrix elements m 34 and m 43 corresponding to the grating region also vanish. Note that the values of the Mueller matrix elements are approximately constant in the grating region, because only the zeroth-order diffraction of the grating is collected while higher-order diffractions are excluded by the employed low-NA objective. Nevertheless, the contrast between the grating region and the substrate can be easily observed from Fig. 10. As can also be observed from Fig. 10, for pixels off the edge of the grating region, the associated VIF-simulated Mueller matrices match well with those calculated by RCWA. However, for pixels near the edge of the grating region, distinct discrepancy can be observed between the VIF-simulated and RCWA-calculated data due to the effect of edge scattering. The application of the routine RCWA-based data analysis to these pixels will inevitably lead to great deviation in the reconstructed grating profile and these pixels usually are abandoned in practice. Figure 11(a) presents the RMSE along y = 0 of the image plane between the VIF-simulated and RCWA-calculated Mueller matrices according to Eq. (26) under N = 50, D/L = 2, α = 0°, λ = 400∼700 nm, and NA = 0.4. For clarity, Fig. 11(b) presents an enlarged view of the dotted rectangle shown in Fig. 11(a). As can be observed from Fig. 11, only the pixels off the edge of the grating region within a range of about [−115.2, 115.2] µm yield RMSEs below the noise level. From the viewpoint of routine RCWA-based data analysis, only the measured data associated with the pixels within the above range are analyzable in imaging scatterometry. As indicated in Fig. 11(b), the analyzable region is much smaller than the whole grating region and actually only occupies about 57.6% of the whole grating region. Figure 12(a) further presents the RMSE along y = 0 of the image plane under different NAs at α = 0°. According to Fig. 12(a) Figure 12(b) reveals that the size of the analyzable region is closely related with NA and D/L. As can be observed, the length of the analyzable region increases with the increasing NA and the decreasing D/L. Similarly, according to Fig. 13    regions are asymmetric with respect to the central point (x = 0) due to the oblique incidence but have almost the same size.
As summarized from the above results, not all of the pixels associated with the grating region are analyzable from the viewpoint of routine RCWA-based data analysis, and there will be errors in grating reconstruction for the pixels near the edge of the grating region. The size of the analyzable region for accurate grating reconstruction that depends on both NA and D/L is much smaller than the actual size of the grating region. This is primarily attributed to the scattering arising from the edge of the grating region. The collected information in imaging scatterometry is in fact a combination of the zeroth-order diffraction information from the grating region under test and the scattering information from the edge of the grating region. In routine RCWA-based data analysis, only the zeroth-order diffraction information is taken into consideration. Consequently,  the scattering information leads to large deviation in the reconstructed grating profile for pixels near the edge of the grating region.
From another perspective, if we take the effect of edge scattering into account in the solution of the inverse problem in imaging scatterometry, the pixels near the edge of the grating region are expected to yield accurate grating reconstruction results and a wider analyzable region can thus be achieved. To verify this, we added the Gaussian noise with a zero mean and a standard deviation of 0.01 to VIF-simulated data and took the compound data as the actual measured data. We then replaced the RCWA model in routine data analysis with the VIF model to reconstruct the grating profile from the above measured data. In the solution of the inverse problem, we only let the width of the Si grating sample W float to reduce calculation time. Figure 14 present the error in W between the true and the best-match values at each pixel of the grating region achieved by RCWA-based and VIF-based data analysis, respectively. As can be observed, the errors of the VIF-based data analysis are nearly 0 at all pixels of the grating region. It therefore demonstrates the feasibility of the VIF-based data analysis for accurate grating reconstruction even for pixels near the edge of the grating region. In real experiments, some nonidealities such as the finite NA of illumination and the finite bandwidth of illuminating wavelength may lead to errors in the VIF-based data analysis, which will be further investigated in the future work.

Conclusions
Three limits of imaging scatterometry have been investigated in this paper. We have established a VIF model for imaging scatterometry based on FDTD and vectorial diffraction theory. Using the established model, we simulated imaging Mueller matrices of a Si grating sample with finite numbers of unit cells covered by the illumination spot. According to the simulation results, we have found that in imaging scatterometry (1) An upper limit should be imposed for the NA of the employed objective to ensure that the zeroth-order diffraction beam of the grating under test is exclusively collected. The theoretical upper limit can be estimated according to the well-known grating formula; however, the practical upper limit should also take into account of the finite bandwidth of the angular spectra of the first-order diffraction beams due to the finite number of grating unit cells.
(2) Enough numbers of unit cells are required to be covered in the illumination spot of an imaging scatterometer to achieve accurate grating reconstruction. The required minimal number of grating unit cells is closely related with the imaging NA and the ratio between the illumination spot size and the size of the grating region D/L. More grating unit cells are usually required for smaller NA or larger D/L to ensure that the zeroth-order diffraction information is completely collected by the imaging system.
(3) Not all of the pixels associated with the grating region are analyzable by the routine RCWA-based data analysis due to the effect of edge scattering, which will lead to large deviation in the reconstructed grating profile for pixels near the edge of the grating region. The size of the analyzable region for accurate grating reconstruction depends on both imaging NA and D/L. Typically, the size of the analyzable region increases with the increasing NA or the decreasing D/L. Nevertheless, the replacement of the RCWA model in routine data analysis with the VIF model could still achieve accurate grating reconstruction for the pixels near the edge of the grating region.
It should be pointed out that, although the simulation results are specific to the investigated Si grating sample, the established VIF model as well as the revealed limits are suitable for other grating samples, simulation conditions, and imaging scatterometry techniques. The established VIF model and the revealed limits are thus expected to provide valuable insights into the application of imaging scatterometry.