Optical wafer defect inspection at the 10 nm technology node and beyond

The growing demand for electronic devices, smart devices, and the Internet of Things constitutes the primary driving force for marching down the path of decreased critical dimension and increased circuit intricacy of integrated circuits. However, as sub-10 nm high-volume manufacturing is becoming the mainstream, there is greater awareness that defects introduced by original equipment manufacturer components impact yield and manufacturing costs. The identification, positioning, and classification of these defects, including random particles and systematic defects, are becoming more and more challenging at the 10 nm node and beyond. Very recently, the combination of conventional optical defect inspection with emerging techniques such as nanophotonics, optical vortices, computational imaging, quantitative phase imaging, and deep learning has opened new possibilities for the field. Hence, a thorough review that discloses new perspectives and exciting trends, building on earlier reviews of defect inspection methods, is needed. In this article, we give a comprehensive review of the emerging topics in the past decade with a focus on three specific areas: (a) the defect detectability evaluation, (b) the diverse optical inspection systems, and (c) the post-processing algorithms. We hope this work can be of importance to both new entrants in the field and people who are seeking to use it in interdisciplinary work.


Introduction
Growing demand for smartphones, tablets, digital televisions, wireless communication infrastructure, network hardware, computers, and electro-medical devices is stimulating the global demand for semiconductor chips [1]. Moreover, the Internet of Things (IoT), also known as the internet of connected devices, is in its infancy but will contribute significantly to the demand for semiconductor chips in the long term, as will the growth of smart grids, smart cities, and automated smart manufacturing. These pressing demands, together with the endless pursuit of lowering both costs per wafer and energy consumption, constitute the primary driving forces for marching down the path of decreased critical dimension (CD) and increased circuit intricacy [1]. Very recently, Taiwan Semiconductor Manufacturing Company and its Research Alliance partners announced the 3 nm breakthrough [2], which offers a path to delivering chips with significant improvements on today's leading 5 nm chips. It is a big win for fabs and manufacturers all around because it will only be two years for 5 nm to have fully settled in the market, but it is also a nightmare to the entire community of process control, especially for wafer defect inspection: ever-decreasing sizes of features and spaces in these patterns have dramatically strained the capabilities of all the current solutions in balancing the sensitivity, specificity, processing speed, and capture rate. As double patterning, triple patterning, or even quadruple patterning ultraviolet (UV) lithography are now widely used, the number of inspection steps scales up with the increase of patterning steps, which potentially decreases the throughput and increases the risk of device failure because missed defects will be carried through to the end of the process.
To make things worse, the extremely complex fin field-effect transistor and gate-all-around nanowire devices are now employed to reduce leakage current and improve device stability beyond the 22 nm technology node [3]. As a result, the key defects of interest in these three-dimensional (3D) architectures are typically sub-surface (especially voids), buried in the stack, or residues in high aspect ratio structures [4]. Overall, as the industry starts large-scale sub-10 nm high-volume manufacturing, there is greater awareness that defects introduced by original equipment manufacturer components impact yield and manufacturing costs. This grand challenge undoubtedly affects the entire semiconductor manufacturing supply chain. Therefore, wafer defect inspection systems have become increasingly important to the fab.
The wafer defect inspection system detects physical defects and pattern defects on wafers and obtains the position coordinates of the defects. Defects can be divided into two categories, i.e. random defects and systematic defects [5]. Random defects are mainly caused by particles that attach to a wafer surface, so their positions cannot be predicted. The major role of wafer defect inspection systems is to detect and locate defects on a wafer. Systematic defects are primarily caused by the variations of the mask and exposure process and will occur in the same position on the circuit pattern of all the projected dies. They occur in locations where the exposure conditions are very difficult and require fine adjustment. Typically, wafer defect inspection systems detect defects by comparing the images of the circuit patterns of adjacent dies [6]. As a result, systematic defects sometimes cannot be detected using a conventional wafer defect inspection system. Depending on whether the inspection is performed on a patterned process wafer or on a bare wafer, wafer defect inspection systems have different configurations. For bare wafers, optical inspection systems, especially darkfield microscopy [7], are the workhorse, because the primary defects (i.e. particles and scratches on the wafer) can be detected with high sensitivity through their high-frequency scattering components. For patterned wafers, by contrast, defect inspection is much more complex and challenging due to the complex topography of patterns and various materials on the wafer [8]. Therefore, sophisticated instruments alongside advanced modeling and post-processing algorithms are playing an increasingly important role in patterned wafer defect inspection.
Generally speaking, the most direct way to inspect defects on a patterned wafer is to see them. Indeed, there are currently quite a few tools that are capable of resolving sub-10 nm structures. Scanning near-field optical microscopy has the potential for extremely large resolution gains, for example, resolution down to λ/20 [9,10]. The potential drawbacks of near-field microscopy are that the sensitivity of the technique to defect signals within stacks is unknown and that the working distance control requirements are aggressive for a technique that has a much tighter depth of focus (DOF) than conventional far-field microscopy. Multi-photon and fluorescence microscopy techniques offer radically different contrast mechanisms when compared with conventional far-field imaging and are widely used in biological microscopy [11,12], but both techniques have low photon yields and unknown DOF, presenting potentially major systems and source engineering challenges. Most importantly, they will contaminate the wafer due to the requirement of fluorescent dyes and a liquid environment. Atomic force microscope (AFM) is a non-contact solution, but the challenges in fabricating probes that can reach inside a sub-10 nm trench and the low inspection rate due to the scanning mode hinder its application in defect inspection [13,14]. Transmission electron microscope (TEM) can resolve atomic structures [15,16], but it requires a vacuum environment and cross-section sampling; thus, it is only widely used in the semiconductor industry as a reference tool to label data. Hence, today, there are only two basic tool technologies to find patterned defects in the fab: e-beam and optical far-field wafer inspection [17,18]. E-beam inspection, which is a type of scanning electron microscopy (SEM), can locate and characterize tiny defects with feature sizes down to 1 nm [19]. However, the ultra-small field of view limits its throughput, thus hindering its online application.
To address this issue, the multiple column e-beam inspection technique [20], which boosts the scanning rate by an array of beams rather than a single beam, was developed recently as a potential candidate for online inspection. However, because electrons repel one another, the pitch of the e-beam array cannot be made arbitrarily small; thus image-stitching and precise alignment are critical in the system. To the best of our knowledge, state-of-the-art e-beam inspection is still far slower than optical inspection. Therefore, optical far-field inspection, which takes about 88% of the $2.1B wafer inspection market, is still the workhorse in the fab.
Optical far-field inspection, which is in essence brightfield microscopy, has a large field of view and low-dose exposure. Although it suffers from the Rayleigh limit [21], the key to defect inspection is not resolution, but rather the signal-to-noise ratio (SNR) and contrast [22]. In Rayleigh scattering, a $d^6/\lambda^4$ scaling of the detectable far-field signal is inevitable for a scatterer of size $d$ [23,24]. Noise from the imaging components (such as defects in lenses, mechanical instability, and shot and readout noise in the camera sensors) and from the sample (such as line edge roughness and line width roughness of the background nanopattern) degrades the image contrast and overwhelms the scattering signal of the nanoscale defects [25][26][27]. Fortunately, Rayleigh scattering is built upon the assumption that particles are surrounded by a homogeneous medium, whereas defects on a patterned wafer clearly do not meet this condition. In fact, the defect-substrate coupling may result in a stronger signal than its Rayleigh counterpart [28], which indicates that defect inspection at single-digit nodes is possible under the condition of low-dose exposure. Here, we use the terms 'may' and 'possible' because an analytic formula for describing defect-substrate coupling is not yet available, and rigorous numerical modeling to determine the scattering signal is inevitable [29][30][31], though there are already considerable investigations from the perspective of electrostatic approximation [32]. Because the materials of defects (especially the systematic ones) are usually identical to those of the patterns, and the CDs of patterns and defects are much smaller than the illumination wavelength, defect contrast enhancement is as important as SNR enhancement.
To boost the defect contrast, the image of the circuit pattern of a die is conventionally subtracted from that of its adjacent dies, but this may erase the signal of systematic defects, leaving only the random defects in the difference image. Depth-scanning-based imaging microscopy [33][34][35], in contrast, inspects systematic defects by detecting the local wavefront and amplitude perturbation that they induce along the optical axis. Because a systematic defect differs from its adjacent patterns in topography and dimension, its out-of-focus scattering field along the optical axis also differs from that of the patterns. This is why the depth-scanning technique may enable reference-free inspection of systematic defects (i.e. no reference die), especially for defect inspection in memory arrays [36]. Post-processing algorithms, which extract defect information from raw optical images, are also critical because defect inspection is essentially a problem of SNR and contrast. Image differencing, Gaussian filtering, convolution, and many other conventional algorithms have been widely applied in defect inspection, not to mention the recent bloom of deep learning. However, the grey area for optical far-field inspection with a state-of-the-art deep ultraviolet (DUV) source and advanced post-processing algorithms is somewhere between 20 nm and 10 nm [37]; stretching the technology to the limit below 10 nm and even 5 nm will increase the number of false positives and missed detection events. Although the industry is preparing for extreme ultraviolet (EUV) defect inspection systems from the perspective of Rayleigh scattering (i.e. the shorter the wavelength, the stronger the scattering) [38,39], the requirement of a vacuum chamber increases the technical complexity and difficulty, let alone the increased absorption of materials at shorter wavelengths.
For all these reasons, we believe that optical defect inspection on patterned wafers will remain a challenging but interesting topic that urgently needs to be addressed. Moreover, as conventional optical defect inspection is running into barriers, we feel that a thorough review disclosing new perspectives and exciting trends from the academic point of view, built on the foundation of earlier reviews, is of great importance to both new entrants in the field and people who are seeking to use it in interdisciplinary work.

Defect detectability evaluation
Sensitivity, which is the minimal size of a defect that can be identified by an optical defect inspection system, is widely adopted as the primary figure of merit in the field. The sensitivity depends strongly on the SNR and image contrast of defects. The scattering signal of a deep-subwavelength defect can be qualitatively expressed by the Rayleigh formula [40], i.e.

$$ I = I_0 \, \frac{1 + \cos^2\theta}{2R^2} \left( \frac{2\pi}{\lambda} \right)^4 \left| \frac{N^2 - 1}{N^2 + 2} \right|^2 \left( \frac{d}{2} \right)^6 \qquad (1) $$

where θ is the incident angle, $R$ is the distance between the observer point and the spherical scatterer, and λ is the wavelength of light. $N$ and $d$ are the complex refractive index and the diameter of the object, respectively. $I_0$ is the intensity of incident light.
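The size and wavelength scaling implied by the Rayleigh formula can be checked numerically. The following is a minimal sketch, assuming the textbook Rayleigh expression for a small sphere under unpolarized illumination; the function name `rayleigh_intensity` and the refractive index 1.5 − 0.1i are illustrative placeholders, not values taken from the cited works.

```python
import math


def rayleigh_intensity(i0, wavelength, d, n_complex, theta, r):
    """Scattered intensity of a small sphere (textbook Rayleigh form).

    i0: incident intensity; wavelength, d, r: lengths in the same unit;
    n_complex: complex refractive index; theta: angle in radians.
    """
    # Polarizability term |(N^2 - 1) / (N^2 + 2)|^2
    material = abs((n_complex**2 - 1) / (n_complex**2 + 2)) ** 2
    # Angular and distance dependence (1 + cos^2(theta)) / (2 R^2)
    angular = (1 + math.cos(theta) ** 2) / (2 * r**2)
    return i0 * angular * (2 * math.pi / wavelength) ** 4 * material * (d / 2) ** 6


# Placeholder index, assumed only for illustration.
N_PLACEHOLDER = 1.5 - 0.1j

base = rayleigh_intensity(1.0, 193e-9, 20e-9, N_PLACEHOLDER, 0.0, 1.0)
half_size = rayleigh_intensity(1.0, 193e-9, 10e-9, N_PLACEHOLDER, 0.0, 1.0)
double_wavelength = rayleigh_intensity(1.0, 386e-9, 20e-9, N_PLACEHOLDER, 0.0, 1.0)

size_ratio = base / half_size                # halving d costs a factor 2**6
wavelength_ratio = base / double_wavelength  # doubling wavelength costs 2**4
```

Under these assumptions, halving the defect diameter reduces the scattered intensity by a factor of 64, while halving the wavelength recovers only a factor of 16, which illustrates why shrinking defects outpace the gain from shorter wavelengths.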
As for the advanced design rule (DR) pattern composed of non-birefringent materials, the intensity of its scattering signal can be approximately estimated by the reflection and polarization response from the stratified construct with the property of form birefringence [41]. Correspondingly, the scattering signal $I_\mathrm{pattern}$ depends on the effective refractive index $N_P$ of the P-polarization and the effective refractive index $N_S$ of the S-polarization, as shown in equation (2), where $N_1$ and $N_2$ are the complex refractive indices of line and space in the pattern, respectively, and $a$ and $b$ are the widths of line and space, respectively. According to equations (1) and (2), the SNR depends on the wavelength, the complex refractive index of both the defect material and the pattern material, and the size of both the defect and the pattern. The image contrast, in turn, depends primarily on the optical resolution of the platform and the difference between the complex refractive indices of defect and pattern. Since the optical resolution $R = 0.61\lambda/\mathrm{NA}$ of the inspection platform, the difference of the complex refractive index $\mathrm{Diff}_N = |N_\mathrm{defect} - N_\mathrm{background}|$, and the scattering properties of the pattern structure are all strongly dependent on the light wavelength, the inspection wavelength has become one of the most important parameters for tuning the image contrast. Moreover, the choice of inspection wavelength for ensuring both a high SNR and an appropriate image contrast is essentially dominated by the complex refractive index of the materials and the topography of the pattern structure. Therefore, it is vital to investigate the influence of the material and the topography of the wafer pattern on the sensitivity and contrast of defects.
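To make the notion of form birefringence concrete, the sketch below evaluates the standard zeroth-order effective-medium approximation for a line/space grating whose pitch is much smaller than the wavelength. This is an assumption chosen for illustration and is not necessarily the exact form of equation (2); in particular, which of the two returned indices corresponds to $N_P$ and which to $N_S$ depends on the orientation of the lines relative to the plane of incidence, so the code labels them by field direction instead. The function name and sample values are hypothetical.

```python
import cmath


def effective_indices(n1, n2, a, b):
    """Zeroth-order effective-medium indices of a line/space grating.

    n1, n2: (complex) refractive indices of line and space;
    a, b: line and space widths (same unit, pitch << wavelength assumed).
    Returns (n_parallel, n_perpendicular): effective index for the electric
    field parallel / perpendicular to the grating lines.
    """
    fill = a / (a + b)               # volume fraction of the line material
    eps1, eps2 = n1**2, n2**2        # permittivities
    n_parallel = cmath.sqrt(fill * eps1 + (1 - fill) * eps2)
    n_perpendicular = cmath.sqrt(1 / (fill / eps1 + (1 - fill) / eps2))
    return n_parallel, n_perpendicular


# Illustrative lossless example: index-2 lines, index-1 spaces, equal widths.
n_par, n_perp = effective_indices(2.0, 1.0, 10.0, 10.0)
birefringence = (n_par - n_perp).real
```

Even though both constituent materials are isotropic, the two effective indices differ, which is why the polarization of the illumination matters for the background signal of a dense line/space pattern.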

Effect of materials on defect detectability
On the basis of the diffraction limit $0.61\lambda/\mathrm{NA}$ and the Rayleigh scattering approximation [42], it is natural to extend the inspection wavelength from the DUV spectrum to the vacuum ultraviolet spectrum, and even to the EUV spectrum, for enhancing both the image contrast and the SNR. However, the defect detectability is not linearly proportional to the illumination wavelength, due to the wavelength-dependent contrast of index between the defect and the background pattern. Figure 1 presents a preliminary summary of the complex refractive index $N = n - ik$, the reflectivity under normal incidence $R$, and the penetration depth δ of the typical bulk materials that are widely used in integrated circuit (IC) devices. The complex refractive index is cited from [43,44]. As for the defects immersed in the background pattern, equations (1) and (2) can only qualitatively describe the influence of index and defect size on SNR and image contrast. In fact, because the defects are buried inside the background pattern and the size of patterns is much smaller than the wavelength, the difference in the image contrast between defects and background pattern is then primarily determined by the difference in the optical properties of materials, i.e. refractive index and reflectivity. In other words, the reflectivity contrast between the defect material and the pattern material shown in figure 1(c) could help in seeking the optimal inspection spectrum. As shown in figure 1(c), different materials usually exhibit distinct reflectivity at most wavelengths, but also present the same reflectivity at several specific wavelength points. For example, the reflectivity of bulk Si and Cu are the same at λ = 390 nm, the reflectivity of bulk Si and Co are the same at λ = 330 nm, the reflectivity of bulk Si$_3$N$_4$ and Cu are the same at λ = 170 nm, the reflectivity of bulk Si and Ni are the same at λ = 310 nm, and so on.
Correspondingly, the reflectivity contrast between these material combinations above is quite small around the intersection points of reflectivity curves. If the materials of defect and background pattern are, say, Si and Cu, respectively, the SNR of the scattering signal of the defects will be rather weak and the image contrast will be quite small. On the contrary, if the detection spectrum is far away from these intersection points, defects can be distinguished clearly from the background pattern. For example, Cu and Co are widely used in the metallic interconnection layer (the M1 layer is most likely Cu or Co pattern structures on the Si substrate) at the 28 nm node and beyond. Defects can be detected much more easily in the range of 430-500 nm than in the range of 360-430 nm, due to the larger reflectivity contrast in the 430-500 nm band. Moreover, defects in Cu patterns on Si substrate can be detected much more easily than those in Co patterns on Si substrate in the range of 290-340 nm. Hence, the material composition has a significant impact on the defect inspection sensitivity. Because the thickness of each patterned layer is not infinite in the actual IC chips, it is much more appropriate to use the reflectivity of thin films rather than bulk materials in the model to estimate the reflectivity. Considering that the reflectivity of the patterned layer highly depends on the refractive index, the extinction coefficient, and the thickness, it is of great importance to evaluate their impact on defect detectability, as shown in figures 1(a), (b) and (d). Meanwhile, at some wavelength points shown in figure 1(d), the penetration depth of some materials varies a lot, namely, the material can be either opaque or transparent. Correspondingly, by choosing the wavelength above 500 nm, the widely used Si substrate is almost transparent, and the background signal and wafer noise caused by the Si substrate or Si pattern can be suppressed.
Therefore, the SNR of the defect can be greatly enhanced.
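The bulk quantities discussed above follow directly from the complex refractive index. The sketch below is a minimal illustration, assuming normal incidence from vacuum onto a semi-infinite bulk material, the convention $N = n - ik$, and the intensity 1/e definition of penetration depth, δ = λ/(4πk). The function names are hypothetical and the numerical inputs are arbitrary placeholders rather than tabulated material data.

```python
import math


def normal_reflectivity(n, k):
    """Normal-incidence reflectivity of a bulk material in vacuum, N = n - ik."""
    return ((n - 1) ** 2 + k**2) / ((n + 1) ** 2 + k**2)


def penetration_depth(wavelength, k):
    """Depth at which the intensity falls to 1/e (same unit as wavelength)."""
    return wavelength / (4 * math.pi * k)


def reflectivity_contrast(nk_defect, nk_pattern):
    """Absolute reflectivity difference between defect and pattern materials."""
    return abs(normal_reflectivity(*nk_defect) - normal_reflectivity(*nk_pattern))


# Placeholder (n, k) pairs, assumed only for illustration.
r_glass_like = normal_reflectivity(1.5, 0.0)           # lossless dielectric
contrast_same = reflectivity_contrast((2.0, 1.0), (2.0, 1.0))
contrast_diff = reflectivity_contrast((2.0, 1.0), (1.5, 0.0))
depth = penetration_depth(400.0, 0.1)                  # wavelength in nm
```

Evaluating `reflectivity_contrast` over a wavelength grid with tabulated $n(\lambda)$ and $k(\lambda)$ would locate the crossing points where the contrast vanishes; a thin-film model would be needed to account for finite layer thickness, as noted in the text.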
Here, we use the Cu pattern on Si substrate as an example to find the optimal band. As shown in figure 2, the defect detectability corresponding to the range of 360-450 nm may be worse than the one corresponding to the spectrum of 470-580 nm. Indeed, a shorter wavelength will induce a better optical resolution and a stronger scattering signal, but the image contrast becomes very weak, so that a defect can hardly be distinguished from the background. Furthermore, according to the simulation results shown in figure 2, when the starting wavelength is fixed at 360 nm, the normalized signal first decreases and then increases as the ending wavelength increases from 360 nm to 710 nm. Moreover, the local minimum of the normalized signal is located at λ = 450 nm, and the corresponding normalized signal is less than the detectability threshold determined by the shot noise in the time delay integration camera. If the starting wavelength is chosen as λ = 410 nm, for example, the normalized signal increases linearly as a function of ending wavelength, and all the signal is above the detectability threshold. Therefore, the result has confirmed the validity of the method that is built upon the analysis of the difference in optical properties, as shown in figure 1. As a result, it is worth mentioning that shortening the detection spectrum may not necessarily result in an enhanced inspection sensitivity. Similar results have been reported by Barnes et al [45]. In their results, the defect sensitivity at λ = 47 nm is better than that at λ = 13 nm, which can be attributed to the worsening of image contrast as the wavelength shrinks. The results demonstrate again that the defect detectability is affected not only by the optical resolution, but also by the image contrast. Here we should remind our readers that the primary goal of optical defect inspection is to identify and locate the defect rather than to clearly 'see' the defects.
Therefore, finding an optimal spectrum range in which the image contrast and sensitivity are high enough is more important than improving the optical resolution. This is especially critical at advanced technology nodes [3].
The above analysis indicates that the defect material, the surrounding pattern material, and the substrate material will affect the defect detectability by the way of tuning the image contrast. By selecting the optimal spectral range reasonably, it is possible to achieve high sensitivity, even at the 10 nm technology node and beyond.

Effect of topography on defect detectability
For patterned wafer inspection, the SNR and image contrast are primarily affected by the sizes and types of defects. Figure 3 presents several typical defects in the periodic line/space nanostructure, which is widely seen in memory devices. The eight subfigures successively present the schematic diagrams of cutting, the horizontal bridge defect at the edge, the intrusion, the zig-zag bridge, the horizontal bridge in the line, the particle, the protrusion, and the perpendicular bridge defect. Up to now, the effects of topography on defect detectability have been widely investigated [46][47][48][49], usually in association with the optimization of defect inspection configurations. For example, both the horizontal and vertical bridging are rather sensitive to the polarization of the illumination beam [46]. With the same defect inspection configuration, different types of defects such as bridge and line cutting present different defect detectability [47]. Certainly, the sizes of defects and patterns also directly affect defect detectability [48]. Correspondingly, in order to clarify the effects of the topography of patterns with defects on defect detectability, the following two aspects need to be considered: the size and the type of defect.
It is generally accepted that the size of defects is typically in balance with the design rule [49]. In our discussion, the dependence of the defect inspection sensitivity on the defect size will be considered in the case when the defect size is equivalent to the CD size of the pattern. Figure 4 shows the aerial images and the corresponding differential images of Cu pattern on Si substrate with bridge defects calculated by using an in-house developed simulation tool based on the Hopkins imaging theory [50][51][52]. In the simulation, the size of a bridge defect varies from 90 nm to 10 nm, the band is chosen from 120 nm to 220 nm, the numerical aperture (NA) of the objective lens is 0.90, and the probing configuration is the conventional illumination aperture. In order to evaluate the defect detectability, we propose a criterion for defect inspection, i.e.
where $\mathrm{Image}_\mathrm{diff}$ is the differential image and $\mathrm{Image}_\mathrm{non}$ is the aerial image of the pattern without defects. $\mathrm{Noise}_\mathrm{shot}$ and FWC are the shot noise and the full well capacity of the camera used in the optical far-field inspection platform, respectively. $\mathrm{SNR}_\mathrm{defect}$ represents the SNR of the defect signal. If $\mathrm{SNR}_\mathrm{defect}$ is larger than 1, the defect can be detected by the optical system. As for the bridge defects shown in figure 4, the $\mathrm{SNR}_\mathrm{defect}$ values are 25.4, 11.0, 3.9, and 0.3, respectively. According to the decision criterion in equation (3), all the bridge defects with sizes above 22 nm can be detected by the optical system. Moreover, the bridge defect with a size larger than 15 nm can be reliably detected under the configuration of conventional illumination and collection pupils. If annular illumination and conventional collection pupils are used, the $\mathrm{SNR}_\mathrm{defect}$ of the defect with a size larger than 10 nm is larger than 1.4, which indicates that the defect can be reliably detected as well. These results clearly demonstrate that the defect size can strongly affect defect detectability. As the defect size shrinks, the defect detectability decreases dramatically. Meshulach et al reported similar effects of the size of particles, bridge defects, and line-cutting defects on the scattering signals of defects [53]. Theoretically, the scattering cross-section (SCS) of the defect is approximately proportional to $d^6/\lambda^4$, thus the scattering signal of the defects decreases with the shrinkage of defect size. Meanwhile, the scattering angles of the high diffraction orders from the background pattern will enlarge as the CD size shrinks [54]. Therefore, the SNR of the defect scattering signal will decrease with the shrinkage of defect size.
Numerous studies have reported the effects of the defect type on defect detectability [55][56][57][58][59]. Barnes et al used optical volumetric inspection experiments to verify the different detectability of horizontal and vertical bridge defects of the same size [55]. Silver et al compared the different detectability of the central particle and the corner extension by using finite-difference time-domain (FDTD) simulations alongside an inspection experiment [56]. The results indicate that the SNR of the scattering signal from the corner extension is larger than that from the central particle. Fujii et al have discussed the different detectability of short and open defects in multiple layered patterns by using an optical simulation tool [57]. Their results show that the short defect is easier to detect than the open defect. Relying on the in-house developed simulation tool, we have compared the detectability of bridge, cutting, and particle defects in the Cu pattern on a Si substrate; the normalized signal of each defect is shown in figure 5.
The results shown in figure 5 indicate that the bridge defect, the cutting defect, and particle with the size respectively larger than 15 nm, 24 nm, and 37 nm can be detected under the conventional illumination configuration. If the annular instead of conventional illumination configuration is utilized, the detectable minimal size of bridge defect, cutting defect, and particle can be shrunk down to 10 nm, 13 nm, and 17 nm, respectively. Hence, the defect type has a non-negligible effect on defect detectability.

Diversity in optical inspection systems
Light is electromagnetic (EM) radiation within the portion of the EM spectrum that is perceived by the human eye or artificial detectors. An arbitrary light field can be fully described by four fundamental quantities, i.e. frequency, amplitude, phase, and polarization [60]. Typically, optical defect inspection is implemented in the regime of linear optics. Therefore, different from amplitude, phase, and polarization, frequency is independent of light-matter interactions [61]. Accordingly, optical inspection systems can be categorized by the measurands of light in practical use.

Amplitude-based optical inspection systems
Artificial detectors, such as charge-coupled devices (CCD) and complementary metal-oxide-semiconductor (CMOS) sensors, can only detect the intensity of light. Therefore, the most direct way for optical inspection is to extract the defect signals from raw intensity frames. Top players in the market, including KLA-Tencor, Applied Materials, Hitachi High-Technologies, JEOL, and ASML, offer amplitude- or intensity-based optical inspection systems. In their systems, bright-field illumination, dark-field illumination, or a combination of both are widely used for defect detection [62]. Bright-field illumination is the more commonly applied lighting geometry, which usually involves mounting and orienting lights between 90° and 45° from the imaging surface (off horizontal) [63]. Conversely, dark-field lighting involves orienting lights between 0° and 45° off horizontal [64], which is particularly effective when imaging highly reflective surfaces or generating edge effects; see figure 6(a). As schematically shown in figure 6(b), patterned wafer inspection systems compare the image of a test die with that of an adjacent die on the wafer [65]. Any random defect in one of the dies will not zero out in the subtraction process, and therefore shows up in the subtracted image [66]. The positions of the defects allow a defect map to be generated over the wafer, similar to the maps generated for non-patterned wafers. Patterned wafer inspection requires precise and repeatable motion control of both the wafer stage and the optical components of the inspection system. To address the issue that systematic defects may not be detected in the die-to-die comparison, one can consider fabricating a 'golden' die known to be defect-free [67], although it may be difficult because of the variations in the fabrication line. Another strategy, which is known as 'die-to-database' [68,69], compares the experimental image of a die to the simulated one of a modeled defect-free die.
The simulated image can be computed efficiently through Fourier optics theory [70,71], which converts the near field of a die to the image space. However, it is never easy to mimic experimental conditions. First of all, because the size of a die is much larger than the wavelength of the illumination source in the inspection system, computing the near field of a die is extremely time- and resource-consuming for a logic circuit [72]. Although the open-boundary problem of near-field computation can be approximated by periodic boundary conditions for the cases of defects in periodic arrays (i.e. memory circuits) [73], the size of a unit cell should be large enough to eliminate any possible boundary effects induced by the non-physical boundary assumptions [74]. As the line-edge and line-width roughness may have a size comparable to that of the defects of interest at advanced technology nodes [75], it should be included in the physical modeling procedure. However, we should mention that the topography of line-edge and line-width roughness is random in real cases, thus one can never precisely predict their impact on the defect inspection procedure. This problem may become more serious as the wavelength of the illumination source keeps shrinking in the near future. The finite element method [76], FDTD method [77], and the method of moments are well-developed numerical EM solvers for computing the near field of patterned wafers [78,79], but the requirement of a fine mesh for nanoscale features (especially the roughness) at short wavelengths will lead to large computation overhead.
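The comparison strategies described above share the same core operation: subtract a reference image from the test image and flag the pixels that do not cancel. The following is a minimal sketch under simplifying assumptions (perfectly registered images, a fixed scalar threshold, tiny integer images); the function name `die_to_die_defects` and the sample data are illustrative and do not represent any vendor's algorithm. The reference may equally be an adjacent die, a golden die, or a simulated die.

```python
def die_to_die_defects(test_die, reference_die, threshold):
    """Return (row, col, difference) for pixels whose |test - reference| > threshold.

    Both images are lists of rows with identical shape and are assumed to be
    perfectly aligned.
    """
    defects = []
    for r, (test_row, ref_row) in enumerate(zip(test_die, reference_die)):
        for c, (t, ref) in enumerate(zip(test_row, ref_row)):
            diff = t - ref
            if abs(diff) > threshold:
                defects.append((r, c, diff))
    return defects


# Clean line/space pattern (arbitrary grey levels).
clean = [[10, 50, 10, 50],
         [10, 50, 10, 50]]

# Random defect: present in the test die only, so it survives the subtraction.
random_defect_die = [[10, 50, 10, 50],
                     [10, 50, 22, 50]]
found_random = die_to_die_defects(random_defect_die, clean, threshold=5)

# Systematic defect: repeated in both dies, so it cancels against an adjacent
# die but is recovered against a defect-free (golden or simulated) reference.
systematic_die = [[10, 50, 10, 50],
                  [10, 38, 10, 50]]
found_adjacent = die_to_die_defects(systematic_die, systematic_die, threshold=5)
found_golden = die_to_die_defects(systematic_die, clean, threshold=5)
```

The example makes the limitation explicit: the same systematic defect yields an empty result against an adjacent die and a detection against a defect-free reference.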
Compared to the die-to-die or die-to-database strategy, 'self-comparison' may enable the inspection of systematic defects without time-consuming computation [80]. Self-comparison, as the name suggests, is a strategy that inspects defects by comparing the optical image of a die with itself. If the patterns in a die are horizontally periodic (for example, memory circuits), self-comparison works well in theory [81], because shifting the die horizontally by $nP$ (where $n$ is an integer and $P$ denotes the pitch of patterns) does not change the optical images, i.e. $I_p(x) = I_p(x + nP)$ [82], where $x$ and $I_p$ denote the horizontal position of a die and the optical image of the pattern in a die, respectively. However, a defect is not periodic in general and is therefore not shift-invariant in an optical image, i.e. $I_d(x) \neq I_d(x + nP)$, where $I_d$ denotes the optical image of the defect. Therefore, if a die has defects, subtracting its optical image from the one shifted by $nP$ will only show up the defect signal [83], i.e.

$$ \Delta I(x) = \left[ I_p(x) + I_d(x) \right] - \left[ I_p(x + nP) + I_d(x + nP) \right] = I_d(x) - I_d(x + nP) \qquad (4) $$
where $\Delta I$ is the difference image. This method is mathematically simple, and most importantly, it avoids the possible long-range instability induced by mechanical displacement. However, the aforementioned self-comparison strategy only works for memory circuits [83], not logic circuits, because it requires shift-invariance, i.e. $I_p(x) = I_p(x + nP)$. To the best of our knowledge, die-to-die or die-to-database is still the workhorse in logic circuit defect inspection [84]. Conventionally, patterned wafer defect inspection in brightfield illumination mode is implemented at the best focal plane of the imaging system [85], which implicitly assumes that the best resolution and contrast of defects occur at the best focal plane. However, the strong EM coupling between the pattern/defect and the wafer substrate may introduce a vertical shift of the peak signal of the defect [86], which can be physically explained by the theory of electrostatic approximation [87]. Therefore, implementing the die-to-die comparison along the vertical direction rather than the lateral direction may result in a better SNR for defect inspection. Through-focus scanning optical microscopy (TSOM) [88,89], which allows conventional optical microscopes to collect dimensional information down to the nanometer level by combining two-dimensional (2D) optical images captured at several through-focus positions, has lateral and vertical measurement sensitivity of less than a nanometer [90]. As the grey area of conventional brightfield techniques used in semiconductor manufacturing is around the 11 nm node, TSOM may extend the capability of a brightfield microscope by utilizing the enriched information obtained through z-axis slicing [91]. As shown in figure 7, through-focus images are stacked as a function of focus position, resulting in a 3D space containing optical information [36,55,92].
From this 3D space, cross-sectional 2D TSOM images are extracted through the location of interest in any given orientation. Defect signals can then be obtained by subtracting the defect-perturbed TSOM image from the baseline TSOM image [93]. The 3D EM scattered field captured by through-focus scanning can also be combined with optimized illumination configurations and advanced post-processing algorithms for sub-10 nm defect inspection [94]. Note that the potential improvement of defect sensitivity in the through-focus scanning method comes at the expense of more time. Although a deformable mirror [95], a liquid lens [96], or a spatial light modulator may enable motion-free vertical scanning [97], the method is still more time-consuming than conventional brightfield microscopy.
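The self-comparison strategy discussed earlier in this section reduces to a shift-and-subtract operation. The sketch below is a minimal one-dimensional illustration under assumed values (a cosine pattern with a 16-pixel pitch and a small Gaussian defect); the function name `self_comparison` is hypothetical, and a circular shift stands in for the physical shift of the die.

```python
import numpy as np

def self_comparison(image, shift_px):
    """Return I(x) - I(x + shift_px), using a circular shift along the last axis."""
    return image - np.roll(image, -shift_px, axis=-1)

# Toy data (assumed values): periodic pattern plus one localized defect.
n, pitch_px = 256, 16
x = np.arange(n)
pattern = 1.0 + 0.5 * np.cos(2 * np.pi * x / pitch_px)
defect = 0.05 * np.exp(-((x - 100) / 3.0) ** 2)

delta_clean = self_comparison(pattern, pitch_px)            # pattern cancels
delta_defect = self_comparison(pattern + defect, pitch_px)  # defect survives

print("max residual without defect:", np.abs(delta_clean).max())
print("positive peak at", int(np.argmax(delta_defect)),
      "and negative ghost at", int(np.argmin(delta_defect)))
```

The difference image contains the defect twice, once with positive sign at its true position and once with negative sign one shift away, so a practical implementation would need a rule for telling the two apart.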

Phase-based optical inspection systems
From the viewpoint of Rayleigh scattering, the amplitude of light scattered by any deep-subwavelength nanostructure with primary dimension $d$ from a beam of unpolarized light of wavelength $\lambda$ is proportional to $d^3/\lambda^2$ [98,99]. However, the phase change $\varphi$ due to a ridge with height $h$ on a surface can be approximately expressed as $\varphi = 4\pi h/\lambda$ [100], indicating that the phase is more sensitive than the amplitude to vertical subwavelength dimensions. For a defect on the wafer, the refractive index difference $\Delta n$ between the background and the defect could result in a phase change $\varphi = 4\pi h\Delta n/\lambda$ [101]. From the above discussion, we may expect that the phase information is sensitive to both on-surface and under-surface defects.
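The two scaling laws quoted above can be compared numerically. The following sketch only evaluates the expressions $d^3/\lambda^2$ and $4\pi h\Delta n/\lambda$; the function names and the 193 nm wavelength are illustrative assumptions, and the amplitude value is meaningful only as a ratio.

```python
import math

def rayleigh_amplitude_scale(d_nm, wavelength_nm):
    """Relative scattered amplitude, proportional to d^3 / lambda^2."""
    return d_nm ** 3 / wavelength_nm ** 2

def reflection_phase_shift(height_nm, wavelength_nm, delta_n=1.0):
    """Phase change 4*pi*h*delta_n / lambda (delta_n = 1 gives the ridge case)."""
    return 4 * math.pi * height_nm * delta_n / wavelength_nm

wavelength_nm = 193.0  # assumed DUV illumination
amp_drop = rayleigh_amplitude_scale(20, wavelength_nm) / rayleigh_amplitude_scale(10, wavelength_nm)
phase_drop = reflection_phase_shift(20, wavelength_nm) / reflection_phase_shift(10, wavelength_nm)
print(f"halving the size: amplitude falls {amp_drop:.0f}x, phase falls {phase_drop:.0f}x")
```

Halving the feature size costs a factor of eight in scattered amplitude but only a factor of two in phase, which is the sense in which phase is the more forgiving observable.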
To reconstruct the optical phase, interferometric techniques, such as phase-shifting interferometry and digital holography microscopy [102][103][104][105][106], are widely used. However, because vibrational noise in conventional dual-path interferometers may disturb the weak defect signals [107], not all interferometric techniques are suitable for defect inspection. Common-path interferometry [108], a type of quantitative phase imaging technique that has traditionally been an active field in biomedical applications [109], has great potential in patterned wafer defect inspection due to its robustness to vibrational noise and its measurement speed, which reaches the millisecond level. To implement defect inspection, Zhou et al built a specialized diffraction phase microscope (DPM) in epi mode [110]. The DPM is a common-path interferometer, which is created using a diffraction grating in conjunction with a 4f lens system [113]. In this geometry, the interferometer is very stable, allowing highly sensitive time-resolved measurements. Because of the periodic nature of the diffraction grating, multiple copies of the image are created at different angles. The 0th and +1st orders are selected to create a final interferogram at the camera plane [114]. The other orders either do not pass through the first lens or are filtered out in the Fourier plane; see figure 8(a) [110]. This compact configuration inherently cancels out most mechanisms responsible for noise and is single-shot, meaning that the acquisition speed is limited only by the speed of the camera employed.
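A single-shot off-axis interferogram of the kind produced by such a grating-based common-path setup is commonly demodulated by Fourier filtering of one sideband. The sketch below shows this generic procedure in one dimension on synthetic data; it is not the processing chain of [110], and the function name `retrieve_phase`, the carrier frequency, and the test phase profile are assumptions.

```python
import numpy as np

def retrieve_phase(interferogram, carrier_cycles, half_width):
    """Recover the phase of a 1D off-axis interferogram by sideband filtering."""
    n = interferogram.size
    spectrum = np.fft.fft(interferogram)
    k = np.fft.fftfreq(n, d=1.0 / n)                    # integer frequency indices
    mask = np.abs(k - carrier_cycles) <= half_width     # keep only the +1 sideband
    sideband = np.fft.ifft(spectrum * mask)
    x = np.arange(n)
    baseband = sideband * np.exp(-2j * np.pi * carrier_cycles * x / n)  # remove carrier
    return np.angle(baseband)

# Synthetic interferogram (assumed values): smooth phase bump on a 64-cycle carrier.
n, carrier_cycles = 512, 64
x = np.arange(n)
phi = 0.3 * np.exp(-((x - 256) / 20.0) ** 2)
interferogram = 2.0 + 2.0 * np.cos(2 * np.pi * carrier_cycles * x / n + phi)

recovered = retrieve_phase(interferogram, carrier_cycles, half_width=32)
print("max phase error (rad):", np.abs(recovered - phi).max())
```

The filter half-width trades noise rejection against spatial resolution of the recovered phase, so it would have to be matched to the optical bandwidth of a real system.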
Various types of defects, including parallel bridge defects, perpendicular bridge defects, isolated dot defects, and perpendicular line extension defects with a minimal size down to 20 nm, were successfully detected using this epi-DPM combined with a comprehensive post-processing strategy [115][116][117], which consists of second-order difference, image stitching, and convolution. The grating-induced common-path interferometer can be naturally extended to a white-light illumination configuration [112], so more types of defects may be detected than with a monochromatic interferometer. However, in a broadband interferometry system the +1st order spreads out all the colors in the Fourier filter plane due to dispersion, making it impossible to low-pass filter it. Thus, the broadband interferometry system uses the 1st order as the image order and suffers from astigmatism.
Optical Pseudo Electrodynamics Microscopy (OPEM) (shown in figure 8(b)) [111,118], which is a method built upon the scattering force of nanostructures induced by optical illumination and obtained by solving a 2D Poisson equation [119], can be regarded as a special type of phase imaging technique [120]. This is because the scattering force on a nanoscale object illuminated by a slowly varying electric field (such as a plane wave) is directly proportional to the spatial gradient of the phase [121]. However, the scattering force in OPEM is a quasi-force rather than a real one, because it is impossible to decouple the SCS of an object from the measured intensity [122]. Fortunately, optical patterned defect inspection only requires the positioning of defects in a qualitative manner. The successful sensing of 2.3 nm scale height differences and various semiconductor defects with sub-10 nm width, without complicated instruments or noise-reducing post-processing algorithms [111], indicates that OPEM is robust to systematic noise and sensitive to nanoscale perturbations. However, although OPEM has shown the capability of detecting nanoscale defects, so far it cannot quantitatively determine the height of nanostructures, unlike quantitative phase imaging techniques. Further theoretical investigation is required to make the measurement of the optical scattering force feasible.

Polarization-based optical inspection systems
Conventional optical wafer defect scanning is the standard methodology in the industry because of its high throughput and sensitivity [123]. However, the defects reported by a standard optical defect inspection tool require a separate SEM review to determine whether they are patterning defects [124]. Optical scatterometry [125,126], which is also referred to as optical critical dimension metrology, measures profile parameters of periodic nanostructures by leveraging the polarization properties of light diffracted by the wafer [127][128][129][130], i.e. by comparing the measured change of polarization state to the simulated one. If patterned defects exist, they break the geometrical periodicity and introduce an additional signature into the optical response [78], thus lowering the quality of the spectral fit. However, because scatterometry is a non-imaging technique and the reported minimal size of the illumination spot is still larger than 5 µm, scatterometry cannot precisely tell where the defect is on the wafer, making it less attractive for advanced nodes than conventional brightfield solutions. The imaging ellipsometer [131][132][133], which combines polarization-contrast microscope imaging with the measurement principles of spectroscopic ellipsometry, has the potential to position patterned wafer defects. State-of-the-art imaging ellipsometers have reached a spatial resolution of less than 1 µm [134,135], and most importantly, imaging ellipsometry is an imaging technique, thus avoiding the time-consuming spot scan required in scatterometry (or spectroscopic ellipsometry). However, because commercially available imaging ellipsometers were designed for thin-film characterization, their optical path design, camera noise, and systematic calibration may not be optimal for patterned wafer defect inspection. As a result, customized optical systems, rather than commercially available imaging ellipsometers, are more common in academia.
Chi et al revisited the null ellipsometry principle as an outstanding image-contrast enhancement method for darkfield imaging [136]. They demonstrated that by simply adding polarizers, compensators, and a photodiode sensor to a conventional darkfield imaging system and applying the null principle, gap defects as small as 14.6 nm and bridge defects as small as 21.9 nm on 40 nm line and 40 nm space patterns, which are invisible in conventional darkfield imaging, can be distinguished from scattered noise; see the schematic and experimental results in figure 9.

Orbital angular momentum-based optical inspection systems
Orbital angular momentum (OAM) [137,138], which differs from the spin angular momentum that manifests macroscopically as circularly polarized light [139], describes the orbital rotation of photons [140]. OAM describes structured waves possessing a helical wavefront, whose number of twists identifies each OAM state [141]. As a result, OAM-carrying waves have distinctive phase structures and allow for promising applications [142,143], ranging from classical to quantum regimes. Wang et al proposed a defect inspection strategy that uses an OAM beam as the probe in coherent Fourier scatterometry (CFS) [144]. As shown in figure 10(a), conventionally, the defective pattern is illuminated by a beam with a Gaussian spatial profile, after which the captured back-reflection signal is compared with that of a golden reference pattern. By comparison, OAM CFS is unique because it does not rely on referencing to a pre-established database, provided that the patterned structures have reflection symmetry. As shown in figure 10(b), when the defect (red dot) is illuminated by a single OAM beam $P_q(x, y)$ with integer OAM charge $q = +1$ or $q = -1$, the resulting diffraction patterns from OAM beams exhibit an obvious asymmetry, which may be leveraged to perform defect inspection. However, although the OAM beam has an advantage over Gaussian illumination in terms of reference-free inspection, the strategy is built upon the assumption of geometrical symmetry, which may be valid only in memory arrays. Moreover, OAM illumination cannot evade the intrinsic physical limit, i.e. Rayleigh scattering: if the size of a defect is much smaller than the illumination wavelength, the perturbation of the wavefront induced by the defect is very weak. The presence of noise such as surface roughness and systematic errors (including optical aberration and defects in optical components) may easily overwhelm the signal of the defect. As the wavefront of an OAM beam has a spatial phase distribution [145], the sensitivity to defects should depend on the spatial coordinate in an OAM defect inspection system. In fact, an OAM beam with arbitrary topological charge can be decomposed into many plane waves with various wavevectors [146], which indicates that OAM-based inspection, similar to the diverse illumination configurations used in conventional brightfield defect inspection systems, is intrinsically a form of illumination engineering [147] from the standpoint of angular spectrum theory [148]. Overall, OAM-based defect inspection is a promising technique, but more detailed investigations, from both the technical and the market point of view, are required.

Figure 10. (a) ... $O(x, y)$, which is chosen to be a uniform planar substrate for discussion, with a point defect $D(x, y)$, marked in red, and the far-field diffraction patterns, $I_{q=0}$, are recorded on the detector plane. The complex wavefront of the Gaussian beam $P_{q=0}(x, y)$ is plotted with amplitude and phase being represented as brightness and hue, respectively, as shown in the color wheel of the inset. The green dashed box shows the center of the diffraction pattern. (b) Model-based OAM CFS, model-based differential OAM CFS, and model-free OAM CFS. Reprinted with permission from [144]. Copyright The Optical Society.

Defect inspection systems using terahertz (THz) waves
Localized surface plasmon resonances (LSPRs) are collective electron charge oscillations in metallic nanoparticles that are excited by light [151,152]. Metal particles such as Au and Ag particles are well known for their enhanced nearfield amplitude at the resonance wavelength once illuminated by EM waves in the visible or near-infrared (NIR) regime [153,154]. This field is highly localized at the nanoparticle and decays rapidly away from the nanoparticle/dielectric interface into the dielectric background, though far-field scattering by the particle is also enhanced by the resonance [155]. Light intensity enhancement is a very important aspect of LSPRs [156], which indicates that metal particles are typically much easier to detect optically than their dielectric counterparts at resonant wavelengths. Analogously, it is natural to imagine that if the silicon wafer under inspection can be excited at its corresponding resonant wavelengths, the scattering signature of defects may be enhanced by several orders of magnitude, similar to metal nanoparticles in the LSPR state [157]. Indeed, because most semiconductors have plasma frequencies in the THz domain [158], THz surface plasmon polaritons (SPPs) are anticipated to be useful tools for the non-destructive detection of defects on semiconductor surfaces [159]. Since the energy confinement of SPPs is related to the degree of overlap between the frequency of the SPPs and the semiconductor's plasma frequency [160,161], high-sensitivity inspection can be achieved by using a THz wave with frequencies slightly lower than the plasma frequency of the semiconductor. Yang et al developed a THz-band defect inspection system whose principle is shown schematically in figure 11(a) [162]. The THz wave launched at a gap between a razor blade and the semiconductor surface is scattered at the gap, and the scattered waves comprise a continuum of both propagating and evanescent fields [150,163], which makes the excitation of SPPs possible. The SPPs propagate along the semiconductor surface mostly in directions normal to the razor blade. A second blade placed at a distance from the first one is used to couple the SPPs back into freely propagating radiation, which is then detected. If there is a defect along the semiconductor surface (for example, a particle located above or an air bubble beneath the surface), the SPPs are scattered or reflected by the defect when passing through it. Thus, the intensity of the output THz wave differs depending on the existence of defects. Generally speaking, this SPP-based prototype cannot form an 'image' of the defective area, but rather generates a one-dimensional signal in which the defect is observable only by comparison with the signal of a reference wafer. Therefore, this system may be more suitable for bare wafer defect inspection. To implement patterned wafer defect inspection, an LSPR-based THz imaging system is more favorable, as shown in figure 11(b) [149]. In such a system, a customized THz wave strikes the patterned wafer and thus excites the LSPR of targeted types of defects. Due to the strong nearfield confinement and absorption of LSPR, the sensitivity and contrast of defects in a THz image may be strongly enhanced. However, although SPR or LSPR can significantly enhance the SCS or nearfield confinement of defects, patterned wafer defects at single-digit-nanometer nodes are much smaller than the THz wavelength. As a result, the SCS of a sub-10 nm patterned defect would be very small [164], and the enhancement of the scattered field at LSPR wavelengths may be neutralized by the extremely small SCS due to Rayleigh scattering. Hence, a more detailed investigation is needed before THz-band defect inspection systems can be applied in practice.

Figure 11. Schematic of (a) the THz wave-based defect inspection system and (b) the THz real-time imaging system. From [150]. Reprinted with permission from Society of Photo-Optical Instrumentation Engineers (SPIE).
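The size of the Rayleigh penalty mentioned above can be estimated with a few lines of arithmetic. The sketch below assumes that the scattering cross section scales as $d^6/\lambda^4$ (the square of the $d^3/\lambda^2$ amplitude scaling), takes 1 THz and 193 nm as representative illumination choices, and ignores material dispersion and any resonance; these are illustrative assumptions, not values from the cited works.

```python
C_M_PER_S = 299_792_458.0  # speed of light in vacuum

def rayleigh_scs_scale(d_nm, wavelength_nm):
    """Relative Rayleigh scattering cross section, proportional to d^6 / lambda^4."""
    return d_nm ** 6 / wavelength_nm ** 4

thz_wavelength_nm = C_M_PER_S / 1e12 * 1e9  # free-space wavelength at 1 THz
duv_wavelength_nm = 193.0                   # assumed DUV reference wavelength
defect_nm = 10.0

penalty = (rayleigh_scs_scale(defect_nm, duv_wavelength_nm)
           / rayleigh_scs_scale(defect_nm, thz_wavelength_nm))
print(f"1 THz wavelength: {thz_wavelength_nm / 1e3:.0f} um")
print(f"SCS penalty relative to 193 nm: {penalty:.1e}")
```

Under these assumptions a resonance would have to recover roughly twelve to thirteen orders of magnitude in cross section before a 10 nm defect scatters THz light as strongly as it scatters DUV light.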

Defect inspection using hyperbolic Bloch modes
The aforementioned optical defect inspection systems (i.e. the ones in sections 3.1-3.5) rely heavily on innovations in optical microscopy techniques. Optical imaging-based defect inspection technology acquires position-registered images of a wafer specimen [161,165] while the specimen base plate scans over a region of interest. Defect signals are detected by a fast image-comparison algorithm that compares the acquired images with reference defect-free images [165]. Generally speaking, conventional defect inspection systems (especially the ones used in the fab) are able to inspect all types of defects. However, they may lose sensitivity for some defects, for example, deeply buried defects. Yoon et al proposed an innovative method that takes advantage of the sample side rather than the instrument side [166]. In their proposal, the sample under investigation is a 3D NAND (NOT AND) flash memory, which has hierarchical structures with micrometer-scale height, a 100 nm-scale overall period, and a minimal CD on the order of 10 nm, as shown in figures 12(a) and (b). Akin to one-dimensional hyperbolic metamaterials, Bragg gratings, or one-dimensional photonic crystals [167][168][169], such a hierarchical structure presents periodically coupled surface-plasmon modes (in other words, vertical Bloch modes) in the IR domain of $\lambda > 1$ µm due to the opposite-sign condition $\mathrm{Re}(\varepsilon_\parallel) \cdot \mathrm{Re}(\varepsilon_\perp) < 0$ [170], which provides a robust signal-magnification mechanism for buried defects including fume spouts, bridges, residues, voids, etc.; see figures 12(c) and (d). Moreover, because semiconductor materials absorb much more weakly in the IR domain than in the visible or DUV domain [171], the proposed method can identify subsurface defects at a depth that is around ten times deeper than the conventional optical skin depth limit.
However, we should emphasize that although the proposed method has significant advantages, it is limited to hierarchical structures only, due to the requirement of periodically coupled surface-plasmon modes that are only available in vertically periodic structures.

Defect inspection by using x-ray ptychography
Due to the well-known diffraction barrier of optical microscopy, the aforementioned defect inspection systems operated in the DUV and visible regimes cannot image deep-subwavelength nanostructures, let alone classify various defects. To boost the resolution, the most direct way is to shrink the illumination wavelength down to the single-digit nanometer scale [39,174]. However, optical imaging at such a short wavelength cannot be implemented as in conventional lens-based systems, due to the ionization effect and strong material absorption [39]. X-ray ptychography, however, paves a very attractive way to the direct 3D imaging of entire wafers with nanometric resolution [173,175,176]. Holler et al have demonstrated that ptychography operated in the hard x-ray regime can create 3D images of ICs of known and unknown designs with a resolution in all directions down to 14.6 nm [172]. As shown in figure 13(a), the entire area of multilayer structures below the active layer of an Intel processor, including the source and drain contacts, gate contacts, and gate with fins, can be nondestructively reconstructed. Even more impressively, nanostructures with sub-20 nm dimensions can be clearly seen in the zoomed-in images, as shown in figures 13(c)-(g). Therefore, hard x-ray ptychography [177], to the best of our knowledge, is the only optical method that can directly image both surface and undersurface sub-20 nm defects for the entire wafer. As of now, x-ray ptychography cannot be directly applied in the fab due to its demanding requirements, including a synchrotron x-ray light source, a massive amount of data, and low speed. However, we believe that once the measurement geometry is combined with foreseeable improvements in x-ray sources and instrumentation, rapid and nondestructive imaging of ICs at resolutions better than 10 nm will be within reach.
As of now, amplitude-based inspection techniques, especially brightfield microscopy, are still the workhorse in the fab, due to their intrinsic advantages such as high speed (for example, the inspection speed of the KLA-Tencor 39XX series can be up to two 12 inch wafers per hour), economic efficiency, and general applicability to various types of defects. However, as the SNR and sensitivity of defects are becoming more and more critical at advanced technology nodes, phase-, polarization-, and OAM-based inspection strategies, which in theory are not bound by the Rayleigh scattering law, may find their place in the field by offering a higher SNR and sensitivity, yet their general applicability to more defect types needs to be investigated in the near future. Because phase-, polarization-, and OAM-based inspection systems can all achieve single-shot measurement, their inspection speed can be as high as that of conventional optical defect inspection tools in the line (i.e. brightfield inspection). The THz wave-based inspection system, whose speed can be as high as that of conventional optical solutions, has not been proven to find defects on patterned wafers. However, because most semiconductors have plasma frequencies in the THz band, we may boost the SNR of defects by stimulating the LSPR of defects under certain illumination conditions, which is similar to the role of metals (e.g. Au and Ag, whose LSPR is in the visible or NIR regime) in nanophotonics. However, the THz-based defect inspection system shown in figure 11(a) behaves like an AFM, in which the detection is implemented in a raster scanning mode. This indicates that the inspection speed is at least one order of magnitude slower than that of conventional optical defect inspection tools. The speed of patterned defect inspection in the THz band can be improved via an imaging-based mode, although this has not been experimentally validated yet.
The hyperbolic Bloch modes-based inspection system, unlike any of the aforementioned systems, is not a universal solution for patterned wafer inspection, but can only be applied where the structure under investigation has a Bragg-grating-like geometry whose period is comparable to the wavelength of the source. As a result, it is more applicable to memory chips such as NAND flash devices, although its speed is comparable to that of conventional optical defect inspection tools. X-ray ptychography, to the best of our knowledge, is the only optical method that can directly image both surface and undersurface sub-20 nm defects for the entire wafer. To date, x-ray ptychography has a very slow working speed (a rough estimate shows that at least 236 days are required to scan an entire 12 inch wafer) compared to the workhorse (i.e. brightfield microscopy) in the fab, but we believe that this technique could provide a revolutionary solution for patterned wafer inspection once its drawbacks, including the synchrotron x-ray light source, the massive amount of data, and the low speed, are overcome.

Post-processing algorithms
From the simplest image difference operator to the complex image synthetic algorithm [178], the post-processing algorithm plays a critical role in optical defect inspection in terms of improving SNR and contrast of defects. This is especially the case as deep learning algorithms emerge as a ubiquitous part of our daily life. In this section, we give a brief review of state-of-the-art post-processing algorithms in the field of optical defect inspection.

Traditional defect inspection algorithms
The die-to-die inspection method compares the image of defect-free dies with that of faulty ones to identify defects in logic chips, and is also called random inspection [179]. Cell-to-cell inspection compares the image of a cell with that of an adjacent cell in the same die to identify defects in memory chips [180], and is also called array inspection. Die-to-database inspection uses differential images, which are obtained by subtracting the image of a target from the modeled image database of the design layout, to identify systematic defects on the wafer [181]. To identify a defect from a raw image, the key is to ensure that the area containing a defect in the post-processed image (for instance, the differential image) is noticeably larger than a predefined threshold [182]. Henn et al proposed a method based on the standard deviation of the differential image and an area threshold $A_{\min} = 3\lambda$ to determine the thresholds that separate the defect signature from its background [183]. Zhou et al used the 2nd order frame difference $F_n(x, y) - 2F_{n-1}(x, y) + F_{n-2}(x, y)$ to eliminate the periodic background patterns, followed by a convolution of the panoramic image with a matched tripole pattern [110], which greatly improved the signal contrast. In this method, they defined the peak signal-to-noise ratio (PSNR) to evaluate the detectability of defects of interest, with $\mathrm{PSNR} = 20\log_{10}\left[|D_{\max} - D_{\min}|/(\sigma_n + \sigma_i)\right]$ [103], where $D_{\max}$ and $D_{\min}$ are the maximum and minimum signals around the defect region, respectively, and $\sigma_i$ and $\sigma_n$ are the standard deviations of the panoramic image and the noise, respectively. KLA-Tencor also proposed the multi-die auto-threshold (MDAT) detection algorithm and the inline defect organizer (IDO) filtering algorithm to enable high-efficiency and high-sensitivity in-line defect inspection [184,185].
The MDAT algorithm uses information from multiple dies to create a median image, which is utilized as the threshold reference to reduce the process noise and improve the defect extraction [184]. The IDO filtering algorithm takes advantage of several defect properties such as feature vectors to automatically organize and eliminate nuisance defects [185]. As sub-10 nm manufacturing is entering the mainstream [186,187], advanced lithography techniques such as multiple patterning and EUV lithography will be used in the fab. The traditional optical brightfield inspection method needs to combine several inspection algorithms, such as die-to-database and resolution enhancement, to improve both sensitivity and efficiency.
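The second-order frame difference and the PSNR metric quoted in this section are simple enough to sketch directly. The code below is an illustration on synthetic data, not the implementation of [110]: the function names, the frame stack, and the noise levels are assumptions, and the frames are assumed to be registered so that the periodic background is identical in each.

```python
import numpy as np

def second_order_difference(frames):
    """F_n - 2*F_(n-1) + F_(n-2) along the first (frame) axis."""
    frames = np.asarray(frames, dtype=float)
    return frames[2:] - 2.0 * frames[1:-1] + frames[:-2]

def psnr_db(defect_region, sigma_noise, sigma_image):
    """PSNR = 20*log10(|D_max - D_min| / (sigma_n + sigma_i))."""
    d = np.asarray(defect_region, dtype=float)
    return 20.0 * np.log10(abs(d.max() - d.min()) / (sigma_noise + sigma_image))

# Synthetic stack (assumed values): 5 frames, identical pattern, slow linear drift.
x = np.arange(64)
pattern = 1.0 + 0.5 * np.cos(2 * np.pi * x / 8)
frames = np.stack([pattern + 0.01 * i for i in range(5)])
frames[2, 30] += 0.2  # defect visible in one frame only

result = second_order_difference(frames)
print("defect column:", result[:, 30])  # a (+1, -2, +1)-shaped response
print("PSNR (dB):", psnr_db(result[:, 30], sigma_noise=0.05, sigma_image=0.05))
```

The operator removes both the static pattern and any linear drift between frames, and a defect present in a single frame leaves a three-lobed response, which is presumably why a matched tripole kernel is a natural follow-up filter.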

Deep learning in wafer defect inspection
Current state-of-the-art optical and e-beam inspection systems based on image differences for detection and rule-based binning for classification are rigid and are not invariant to defect type, size, and substrate material [67,188]. Optical defect inspection tools operated in the DUV and visible regimes have even greater difficulty classifying sub-10 nm defects due to the well-known diffraction barrier [189]. Furthermore, every new technology offers a new challenge and requires numerous hours of setup, debugging, and manual tuning of process parameters by integrated chip manufacturers. Deep learning [190], by comparison, provides a relatively easy-to-implement way to tackle the challenges in defect inspection, including, but not limited to, defect identification, location, and classification [191]. Generally speaking, the workflow of deep learning-based defect inspection is quite straightforward: capture enough e-beam or optical images of wafers (which can be either experimental or simulated), train a chosen neural network to extract useful features directly from the images, test the trained model with a small set of samples, and decide whether the training should be repeated according to a pre-defined cost function that characterizes the confidence level of the neural network [192]. According to the type and order of nonlinear transformation in the neural networks, deep learning models can be cast into different categories [193]. The convolutional neural network (CNN) [194], suitable for data with a hierarchical structure such as images, has become an attractive choice for defect inspection. Chien et al have used a CNN to identify and classify surface defects including center, local, random, and scratch [195]. In their method, 25 464 raw images with visible defects were collected online from the WM-811K dataset, which contains 811 457 semiconductor wafer images from 46 393 lots with eight defect labels.
For uniformity, each raw image was processed to extract only the area containing the wafer by blackening the image areas outside the wafer. Experimental results showed that CNN can reach an accuracy of above 98%. Cheon et al used a single CNN model to extract effective features for identifying defect classes that were not seen using conventional automatic defect classification systems based on SEM images [196]. More literature on deep learning-based defect inspection can be found in [197,198]. However, to the best of our knowledge, most of the reported studies were based on raw images in which the defects are at least barely visible (for example, SEM images). For the optical images where defects are much smaller than the wavelength, useful signatures are immersed in the background and perturbed by different types of errors stemming from either the hardware side or the software side, making the defect hardly seen from a raw optical image. Therefore, locating and classifying a deep-subwavelength defect from an optical image is quite challenging. Purandare et al introduced a machine learning technique that combines principal component analysis and simulated optical images to excavate the signatures of defects from raw optical images [8]. Specifically, they transform a few approximate EM simulation defect images to generate synthetic noisy defect images with trainable parameters such that its principal components can sufficiently capture variance related to the defect features. Experimental results on the inspection of parallel and perpendicular defects at 22 nm and 9 nm nodes have demonstrated that the classification system can accurately locate and classify the defects even though they are an order of magnitude smaller than the diffraction limit. However, this method has not yet proven its robustness against other types of defects such as isolated and buried defects. 
Henn et al studied the impact of linear classifiers and CNNs on defect detection based on simulated optical images from the model of 3D field-effect transistors defined by SEMATECH [199]. Their simulation demonstrated that the CNN outperforms both the linear classifier and the SNR, and that it is possible to extend the defect detectability down to a scale that is 20 times smaller in one dimension than the illumination wavelength. In conclusion, although deep learning has emerged as an attractive tool for image processing, it has not been widely accepted in the actual production line, especially for optical inspection. The reasons may include not only its 'black-box nature' and lack of interpretability, but also its unproven capability of positioning and classifying deep-subwavelength defects from purely optical images. To make deep learning techniques truly applicable to optical defect inspection in the fab, more work is needed, especially on the grey area of deep learning in optical defect inspection and on the boundary between deep learning and optical physics.

Conclusions and outlook
This article reviewed the recent developments in the field of optical patterned wafer defect inspection. The conventional approaches in optical defect inspection, such as the amplitude-based one alongside its post-processing algorithms, have been thoroughly discussed. The novel inspection mechanisms, including phase-, orbital angular momentum-, THz wave-, and hyperbolic Bloch modes-based ones, have been highlighted to remind readers of their potential for opening up new directions in the field. Undoubtedly, amplitude-based optical defect inspection systems are the workhorse in the fab. From the point of view of Rayleigh scattering, the amplitude of light scattered by any deep-subwavelength nanostructure with primary dimension d from a beam of unpolarized light of wavelength λ is proportional to d³/λ². Therefore, amplitude-based systems are inherently insensitive to deep-subwavelength defects. To boost the SNR and contrast of defects, hyperbolic Bloch modes, which are conventionally associated with metamaterials, have been applied to the detection of buried defects in 3D NAND flash memory. The strong resonance modes induced by the back-and-forth reflection of EM waves in the Bragg-grating-like cavity play a critical role in the amplification of defect signals. In phase-based defect inspection systems, unlike amplitude-based ones, the signal is linearly proportional to the height of defects, which makes phase measurement a potentially high-sensitivity approach for defect inspection. Akin to the orbital angular momentum of light, the polarization state is now well understood as a consequence of the spin angular momentum of light. For monochromatic light, the spin and orbital angular momentum densities are functions of the EM field and its spatial gradient, which indicates, at least in principle, that the defect sensitivity can be optimized by customizing the illumination light field for a given nanopattern.
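As a rough numerical illustration of the Rayleigh scaling quoted above, the sketch below compares the relative scattered amplitude of two defect sizes. It assumes only the stated proportionality with the prefactor dropped; the function name `relative_amplitude`, the 193 nm wavelength, and the 10 nm and 20 nm defect sizes are illustrative choices made here, not values taken from a specific tool.

```python
def relative_amplitude(d_nm: float, wavelength_nm: float) -> float:
    """Relative Rayleigh-regime scattered amplitude, proportional to d^3 / lambda^2.

    The material- and geometry-dependent prefactor is dropped, so only
    ratios of the returned values are meaningful.
    """
    return d_nm ** 3 / wavelength_nm ** 2

wavelength_nm = 193.0  # assumed illumination wavelength (illustrative)

# Ratio of scattered amplitudes when the defect size is halved from 20 nm to 10 nm.
amp_ratio = (relative_amplitude(10.0, wavelength_nm)
             / relative_amplitude(20.0, wavelength_nm))

# Intensity scales as the square of the amplitude, i.e. as d^6 / lambda^4.
intensity_ratio = amp_ratio ** 2

print(f"amplitude ratio: {amp_ratio:.6f}, intensity ratio: {intensity_ratio:.6f}")
```

Under this scaling, halving the defect size reduces the scattered amplitude by a factor of 8 and the scattered intensity by a factor of 64 at fixed wavelength, which is why shrinking defects so quickly fall below the noise floor of amplitude-based systems.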
X-ray ptychography, unlike any of the aforementioned optical defect inspection techniques, is the only optical method that can directly image both surface and undersurface sub-20 nm defects for the entire wafer. The schematics of the aforementioned systems are presented in figure 14. As the complexity of both materials and geometry in modern ICs keeps increasing, the combination of diverse systems for meeting diverse challenges is likely to become a trend. In conclusion, we forecast three potentially important topics relating to optical defect inspection from an academic point of view. The first is the 3D computational imaging of a patterned wafer at very short wavelengths (for example, in the hard x-ray regime). As a hard x-ray beam is only weakly absorbed by silicon wafers, x-ray ptychography has the potential to enter the field by providing revolutionary 3D resolution and sensitivity once its drawbacks, including the reliance on a synchrotron x-ray light source, the massive amount of data, and the low speed, are overcome. The second is the structured light field-based inspection mechanism that treats the inspection as an optimization problem for maximizing defect sensitivity. Brightfield, annular, and dipolar illumination modes are widely adopted in conventional defect inspection tools. However, the choice of these illumination modes is largely experience-driven, and a physical connection between the illumination modes and the defects is missing. As the patterns surrounding the defects fundamentally consist of lines and circles, it is possible to customize a structured light field to suppress the background scattering without sacrificing the SNR of defects by exciting the background patterns into dark modes [200]. The last is the sample-oriented inspection, in which the characteristics of samples (such as geometrical structures and optical properties of materials) are pre-investigated for optimizing an optical inspection system. This is also what the industry is doing.
Figure 14. Schematic of optical systems that are capable of tackling the diverse challenges in patterned defect inspection. (a) Brightfield/darkfield imaging system, (b) dark-field imaging with null ellipsometry, (c) through-focus scanning imaging microscopy, (d) epi-diffraction phase microscopy, (e) patterned wafer containing logic dies and 3D NAND memory dies, (f) x-ray ptychography, (g) THz wave-based defect inspection system, and (h) CFS techniques using different OAM illumination beams.
Optical defect inspection, though a long-standing engineering problem, has regained vitality with the explosive growth of consumer electronic devices and the fusion of emerging techniques such as nanophotonics, structured optical fields, computational imaging, quantitative phase imaging, and deep learning. We should emphasize that the aforementioned optical inspection systems, although originally designed for patterned wafer defect inspection, are also critical to many other fields, including but not limited to photonic sensing, biosensing, and turbid photonics.