Diffraction-based overlay metrology using angular-multiplexed acquisition of dark-field digital holograms

In semiconductor device manufacturing, optical overlay metrology measures pattern placement between two layers in a chip with sub-nm precision. Continuous improvements in overlay metrology are needed to keep up with shrinking device dimensions in modern chips. We present first overlay metrology results using a novel off-axis dark-field digital holographic microscopy concept that acquires multiple holograms in parallel by angular multiplexing. We show that this concept reduces the impact of source intensity fluctuations on the noise in the measured overlay. With our setup we achieved an overlay reproducibility of 0.13 nm, and measurements on overlay targets with known programmed overlay values showed good linearity with R² = 0.9993. Our data show potential for significant improvement and that digital holographic microscopy is a promising technique for future overlay metrology tools. © 2020 Optical Society of America under the terms of the OSA Open Access Publishing Agreement


Introduction
The rapid developments in the micro-electronics industry have been achieved by a continuous and aggressive reduction of the dimensions of semiconductor devices, a trend known as Moore's law [1]. Currently, semiconductor devices like memory and logic CPUs are built 'layer-by-layer' in a sequence of repeating steps of lithography, etching, deposition, etc. In order to keep defects under control, metrology of the fabricated features is performed at various stages of the manufacturing process [2].
Multilayered devices with feature sizes of only a few nanometers are in production and require robust metrology of parameters like pattern placement (overlay / OV) and pattern shape (critical dimension / CD) with sub-nm precision (Fig. 1). In addition, these parameters need to be measured at high throughput on many points on a wafer, requiring short measurement times in the millisecond range [3]. Our paper presents Digital Holographic Microscopy as a promising new technique for OV metrology. In order to clarify the context of our work we will first present a short summary of existing overlay metrology techniques.

Overlay metrology
Overlay (also called registration error) refers to the lateral displacement of the lithographically exposed and developed pattern in one layer with respect to a previously created underlying structure in another layer. The measurement and control of the overlay between subsequent lithography steps is one of the critical steps in high volume semiconductor manufacturing. Currently the overlay in high-end manufacturing is of the order of 1-2 nm and the precision of overlay metrology is of the order of 0.1-0.2 nm [4].
For many years, Image-Based-Overlay metrology (IBO) was used to measure overlay on so-called box-in-box (BiB) targets. BiB targets consist of an inner box and an outer box in, respectively, the developed resist layer and a previously created lower layer [5]. IBO uses a bright-field microscope to create an image of this BiB target and overlay is determined from this image by measuring the position of the inner box edges relative to the outer box edges. Over time these BiB targets were replaced by gratings since a grating image has more edges, which improves overlay metrology precision. In practice the optics in an IBO tool have very low aberrations that are well below the level of what is needed for "normal" imaging since aberrations can introduce a pattern shift that leads to overlay metrology errors.
In order to deal with the highly demanding overlay requirements of today's semiconductor devices, Diffraction-Based Overlay metrology (DBO) [6][7][8] was successfully introduced a few years ago. An overlay target used in DBO consists of small (approximately 5×5 µm²) overlapping gratings in the resist layer and an underlying layer.
An overlay error between these gratings changes the intensities of the +1st diffraction order ($I_{+1}$) and the −1st diffraction order ($I_{-1}$). The intensities $I_{+1}$ and $I_{-1}$ as function of overlay are periodic with a period equal to the grating pitch. In practice, the overlay is much smaller than the pitch so we can linearize the response of $I_{+1}$ and $I_{-1}$ to a small overlay $OV$:
$$I_{\pm 1} = I_{ill} \times DE \times (1 \pm K \times OV) \tag{1}$$
Here $I_{ill}$ is the intensity of the illumination beam, $DE$ is the diffraction efficiency of the combined gratings at zero overlay and $K$ is an overlay sensitivity term. Taking the difference between $I_{+1}$ and $I_{-1}$ yields:
$$\Delta I = I_{+1} - I_{-1} = 2 \times I_{ill} \times DE \times K \times OV \tag{2}$$
In practice, $DE$ and $K$ are usually not known in advance since they depend on the stack of materials in which the top and bottom gratings are embedded. Moreover, the intensity of the illumination beam can also show some unknown variations. However, these terms can be eliminated by measuring the intensity difference on two pairs of overlapping gratings where a small known shift ("bias") of, respectively, $+d$ and $-d$ is added to the unknown overlay (Fig. 2). This results in two measured intensity differences:
$$\Delta I_{+d} = 2 \times I_{ill} \times DE \times K \times (OV + d), \qquad \Delta I_{-d} = 2 \times I_{ill} \times DE \times K \times (OV - d) \tag{3}$$
We have now two measured intensity differences from which we can eliminate the unknown term ($I_{ill} \times DE \times K$) and determine the overlay [9]:
$$OV = d \times \frac{\Delta I_{+d} + \Delta I_{-d}}{\Delta I_{+d} - \Delta I_{-d}} \tag{4}$$
The intensity differences $\Delta I_{+d}$ and $\Delta I_{-d}$ can be measured using, for example, a scatterometer [10] or a dark-field microscope [11]. The advantage of DBO with respect to IBO is that the overlay information is encoded in an intensity difference instead of the location of edges in an image. This makes the measurement in DBO less sensitive to aberrations in the optics between the target and the detector. Moreover, in DBO the individual grating lines no longer need to be resolved, which allows us to use smaller grating pitches which further improve metrology precision. By using two off-axis illumination beams and a large Numerical Aperture (NA) (≈ 0.8) of the collection optics, pitches as small as 400 nm can be used.
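The bias-based overlay extraction can be checked numerically. The following is a minimal sketch, assuming the linearized signal model I(±1) = I_ill × DE × (1 ± K × (OV + bias)) and the ratio OV = d × (ΔI(+d) + ΔI(−d)) / (ΔI(+d) − ΔI(−d)); the function names and all numerical values (DE, K, bias, overlay) are hypothetical and chosen only for illustration.

```python
def dbo_intensities(ov_nm, bias_nm, i_ill=1.0, de=0.01, k_per_nm=0.01):
    """Linearized +1st and -1st order intensities of one biased grating pair."""
    shift_nm = ov_nm + bias_nm
    i_plus = i_ill * de * (1.0 + k_per_nm * shift_nm)
    i_minus = i_ill * de * (1.0 - k_per_nm * shift_nm)
    return i_plus, i_minus


def overlay_from_differences(delta_i_pos, delta_i_neg, bias_nm):
    """Overlay from the intensity differences of the +d and -d grating pairs."""
    return bias_nm * (delta_i_pos + delta_i_neg) / (delta_i_pos - delta_i_neg)


bias_nm = 20.0      # programmed bias d (hypothetical value)
true_ov_nm = 1.5    # overlay that the measurement should recover

i_plus_pos, i_minus_pos = dbo_intensities(true_ov_nm, +bias_nm)
i_plus_neg, i_minus_neg = dbo_intensities(true_ov_nm, -bias_nm)
recovered_ov_nm = overlay_from_differences(
    i_plus_pos - i_minus_pos, i_plus_neg - i_minus_neg, bias_nm
)
print(f"recovered overlay: {recovered_ov_nm:.3f} nm")
```

Because the common factor I_ill × DE × K cancels in the ratio, the recovered value does not depend on the assumed illumination intensity, diffraction efficiency or sensitivity, as long as the linearization holds.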
However, differential intensity noise between these two off-axis illumination beams must be low since overlay is encoded in an intensity difference between a +1st order and a −1st order.
In practice overlay metrology faces a few challenges that lead to demanding sensor requirements:
1. Targets are small (5×5 µm²) and are surrounded by patterns that may impact the measured overlay. Therefore sufficient spatial resolution is needed to separate the target from its environment. For dark-field microscopy this means that we need a large NA for the image formation.
2. The bottom grating in DBO is often covered by absorbing layers like amorphous silicon and the top grating is formed in a thin resist layer. This results in low diffraction efficiencies of the +1st and −1st diffraction orders that are used in DBO. In addition, overlay needs to be measured on many points on a wafer for many wafers at high throughput. In practice this means that acquisition times for a DBO image must be in the order of a few milliseconds. The combination of weak signals, short acquisition times and sub-nm precision requirements drives the need for high-brightness light sources.
3. The large variety of materials used in semiconductor devices leads to a strong wavelength-dependency of the signal strength. Moreover, wafer-processing steps like etching can lead to an asymmetric deformation of a grating profile which results in a wavelength-dependent measured overlay. In order to adequately deal with this a DBO tool often measures at multiple wavelengths over a large wavelength range [12][13][14].
Single-mode lasers have sufficient brightness to deal with the second challenge but they do not have the wavelength flexibility to deal with the third challenge. Supercontinuum sources (SCS), in combination with an Acousto-Optic Tunable Filter (AOTF), are a very attractive alternative to lasers since they offer high brightness and fast and flexible wavelength tuning over a large range. Supercontinuum sources, however, tend to show high intensity noise levels [15,16] that can potentially degrade a DBO-based overlay measurement.
Figure 3 shows two possible versions of a dark-field microscope for DBO that both use two oblique illumination beams to illuminate the overlapping grating pairs from opposite sides. In both methods the objective captures only the +1st and −1st diffraction orders and discards specular reflections. These methods form a +1st order image and a −1st order image of the overlapping grating pairs on an image sensor from which the overlay of these pairs is derived.
The sensor shown in Fig. 3(a) uses a wedge in the pupil plane of the objective that separates the +1st and −1st order images on a camera. This allows parallel measurement of the two images which makes it robust against intensity noise of the light source since this noise is common-mode for the +1st and −1st order images. However, each image can use only half the NA of the objective which lowers the imaging resolution. This resolution loss is solved in the sensor shown in Fig. 3(b) where the +1st and −1st order images are measured sequentially in time. In this method, the full NA is available for imaging but the sequential acquisition makes this sensor more susceptible to intensity fluctuations of the light source that can degrade the measurement precision. In the following section, we present in detail a method that enables the full NA for imaging and at the same time offers parallel acquisition of the two diffraction order images.

Dark-field digital holographic microscopy
In this paper we present dark-field off-axis Digital Holographic Microscopy (df-DHM) [17][18][19] using an SCS plus AOTF as a novel solution for the challenges that we mentioned in the previous section. Our df-DHM achieves a high spatial resolution by using the full NA while at the same time being robust against intensity noise of the source since both −1st and +1st order holograms are acquired in parallel.
Moreover, our df-DHM uses only a simple uncoated plano-asphere imaging lens which offers a high transmission over a large spectral range. Such a simple lens will introduce aberrations but these can be computationally corrected [20] since DHM retrieves the complex field of an object image [21][22][23]. The large spectral range offered by this single imaging lens in combination with the SCS and AOTF also offers a path towards fast multi-wavelength imaging over a large wavelength range.
Another advantage of df-DHM over existing OV measurement techniques is the coherent amplification of the object image that is achieved by the coherent mixing with the reference beam. This coherent amplification lifts the image above the sensor read-out noise and allows quantum noise limited imaging of weakly diffracting overlay targets. This feature is especially of interest for OV metrology applications in the near-IR range where image sensors still tend to have relatively high detector noise levels.
In order to achieve sub-nm precision levels with df-DHM we need a good understanding of various challenges that come with the realization of such a tool. For example, in a previous paper [24], we have reported on the impact of the coherence length of the light source on the field of view (FoV). In this paper we extend our df-DHM investigation towards the parallel acquisition technique that makes the overlay metrology more robust against differential intensity noise in the two off-axis illumination beams.
Parallel acquisition of multiple holograms was already demonstrated in 1976, in a detailed review of two-reference-beam holographic interferometry by Dändliker et al. [25]. That paper shows that two-reference-beam holography can be used for quantitative measurements of surface displacements. Pedrini et al. [26] extended this approach for quantitative evaluations of image-plane holograms or digital Fresnel holograms to determine both in-plane and out-of-plane deformations at the same time.
Recently, D. Cohoe et al. have also reported a DHM technique that offers a parallel acquisition of three off-axis holograms at three wavelengths [27]. A three-wavelength beam illuminates a transparent sample inside a water-filled chamber and a relay lens is used to project the transmitted light on a camera. A color filter inside the chamber creates a separate reference beam for each wavelength, which results in three overlapping holograms that each have a different fringe orientation and fringe density. The authors used this method to improve the phase reconstruction of digital holograms of protozoa by reducing 2π phase ambiguities. Similarly, T. Tahara et al. presented a single-shot multi-wavelength off-axis DHM with a wide field-of-view using a large reference angle of more than 40 degrees [28]. This yields a very dense fringe pattern and the intentional aliasing that is introduced by this dense fringe pattern in combination with the relatively large pixel pitch results in a wide separation of the multi-colored signal in the spatial frequency domain.
In this paper we report the use of this parallel image acquisition technique for overlay metrology using df-DHM. In contrast to the other papers on this topic we use this parallel imaging technique to measure small intensity differences between the two images. We present the first measured overlay data on a test sample that demonstrates that parallel imaging reduces the impact of intensity noise of the light source and thus improves the measurement reproducibility. In what follows, Section 2.1 provides a very brief review of DHM, followed by an explanation of how we realize the parallel acquisition of multiple holograms. This is then followed by a detailed description of the novel off-axis dark-field DHM setup that we have built (Section 2.2). Section 3 presents the first experimental results that demonstrate the parallel acquisition (Section 3.1), the first overlay measurements with off-axis df-DHM (Section 3.2), and the verification of the noise correlation of the presented setup along with measured overlay reproducibility (Section 3.3). Section 4 concludes the paper.

Off-axis dark-field digital holographic microscope
This section presents the details of how we realize the parallel acquisition of multiple holograms using off-axis df-DHM. We first briefly present some theoretical background of our technique. Then we describe the setup that we have built to demonstrate the capabilities of this technique for semiconductor overlay metrology.

Theoretical analysis of off-axis df-DHM
In df-DHM the object is usually illuminated at an oblique angle of incidence. The specular reflection is discarded and part of the diffracted light is captured by a lens with a numerical aperture (NA). The captured light creates a dark-field image on an image sensor with magnification M and an off-axis reference beam is coherently added to this dark-field image which results in an intensity pattern on the image sensor given by:
$$I(\vec{r}) = |E_{obj}(\vec{r})|^2 + |E_{ref}(\vec{r})|^2 + \gamma E_{obj}^*(\vec{r}) E_{ref}(\vec{r}) + \gamma E_{obj}(\vec{r}) E_{ref}^*(\vec{r}) \tag{5}$$
where $E_{obj}(\vec{r})$ is the complex amplitude distribution of the object wave, $E_{ref}(\vec{r})$ is the reference wave and $\gamma$ is the degree of coherence between the object beam and the reference beam.
The last term of Eq. (5) is of interest as it contains the complex object field, $E_{obj}(\vec{r})$, that we need to retrieve. It is however multiplied by the (conjugate) reference field. In off-axis DHM the reference beam is usually a plane wave with amplitude $A$ that is incident on the image sensor at an angle of incidence of $\theta_{ref}$ and an azimuthal angle $\phi_{ref}$. Applying a 2D-Fourier Transform (FT) to the detected image and assuming an infinite plane wave reference beam yields the spatial frequency spectrum in k-space (constant prefactors omitted):
$$\mathcal{F}\{I\}(\vec{k}) = E_{obj}(\vec{k}) \otimes E_{obj}^*(-\vec{k}) + A^2 \delta(\vec{k}) + \gamma A\, E_{obj}^*(\vec{k}_{ref} - \vec{k}) + \gamma A\, E_{obj}(\vec{k} + \vec{k}_{ref}) \tag{6}$$
where $\otimes$ denotes a convolution. Of these four terms, the first is the auto-correlation of the object beam $E_{obj}(\vec{k})$ which has a radius of $4\pi(NA/M)/\lambda$ and is centered at the origin. Likewise, the second term is the auto-correlation of the reference beam $E_{ref}(\vec{k})$, resulting in a delta function placed at the origin (Fig. 4(b)). These two terms are the DC components of the recorded intensity. The last two terms are cross-correlation terms that describe the interference between the object and reference beams. These terms are the Fourier transforms of the complex conjugate of the object beam and of the object beam itself, shifted in opposite directions by $\vec{k}_{ref}$. The magnitude of the shift, $|\vec{k}_{ref}|$, is given by:
$$|\vec{k}_{ref}| = \frac{2\pi}{\lambda} \sin\theta_{ref} \tag{7}$$
where $\theta_{ref}$ is the angle of the reference beam. In order to fully recover the object wave distribution, these side-bands need to be completely separated from the base-band. In case of incomplete separation, imaging artifacts may occur which can only be removed with advanced algorithms [29]. As a side-band separation is preferred we can determine the required angle of the reference beam. According to Eq. (6) the side-bands are completely separated from the base-band term if
$$|\vec{k}_{ref}| > \frac{6\pi (NA/M)}{\lambda},$$
that is, if the shift exceeds the sum of the base-band radius, $4\pi(NA/M)/\lambda$, and the side-band radius, $2\pi(NA/M)/\lambda$. At the same time, the Nyquist sampling criterion [30] sets an upper limit on the maximum angle-of-incidence of the reference beam. For an image sensor pixel pitch of $p_x$ we require that $|\vec{k}_{ref}| + 2\pi(NA/M)/\lambda < 2\pi/(2p_x)$.
These two conditions are summarized in the following expression:
$$\frac{3\,NA}{M} < \sin\theta_{ref} < \frac{\lambda}{2 p_x} - \frac{NA}{M} \tag{8}$$
If this condition is satisfied then the hologram is adequately sampled and the two side-bands are separated from the base-band in the Fourier domain. For example, an NA = 0.5, a magnification M of 100x, a wavelength of 550 nm and a typical image sensor pixel size of 4 µm results in a reference angle between 0.86° and 3.65°.
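The numerical example can be reproduced with a few lines of code. This is a small sketch, assuming that the side-band separation and Nyquist conditions combine into 3·NA/M < sin(θ_ref) < λ/(2·p_x) − NA/M; the function name is hypothetical.

```python
import math


def reference_angle_bounds_deg(na, magnification, wavelength_um, pixel_pitch_um):
    """Lower and upper limit of the reference beam angle, in degrees."""
    na_image = na / magnification                 # NA on the image (sensor) side
    sin_min = 3.0 * na_image                      # side-bands clear of the base-band
    sin_max = wavelength_um / (2.0 * pixel_pitch_um) - na_image   # Nyquist limit
    if sin_max <= sin_min:
        raise ValueError("no valid reference angle for these parameters")
    return math.degrees(math.asin(sin_min)), math.degrees(math.asin(sin_max))


theta_min_deg, theta_max_deg = reference_angle_bounds_deg(0.5, 100, 0.550, 4.0)
print(f"{theta_min_deg:.3f} deg < theta_ref < {theta_max_deg:.3f} deg")
```

With the values quoted in the text the limits come out close to 0.86° and 3.65°. A smaller pixel pitch or a longer wavelength widens the allowed range, since only the upper limit depends on them.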
Retrieving the complex object field from an off-axis hologram requires only a few simple signal processing steps. Fourier transforming the acquired hologram yields the 2D-spatial frequency spectrum. Then by selecting only one of the two side-bands and filtering out the rest of the signal we obtain the spectrum of the object field. Finally we obtain the complex object field with an inverse Fourier Transform of this object spectrum.
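The three processing steps can be sketched with NumPy on a synthetic hologram. This is an illustrative example only, not the processing code used for the results in this paper: the band-limited random object field, the integer carrier frequency of the reference wave and all parameter values are assumptions made so that the reconstruction can be compared against a known field.

```python
import numpy as np

N = 256                      # image size in pixels (assumed square)
A_REF = 2.0                  # reference amplitude (arbitrary units)
CARRIER = (40, 40)           # reference tilt as (row, col) cycles per image
OBJ_RADIUS = 10              # radius of the object spectrum in FFT pixels

rows, cols = np.indices((N, N))
rng = np.random.default_rng(0)

# Synthetic band-limited object field standing in for a dark-field image.
in_band = (rows - N // 2) ** 2 + (cols - N // 2) ** 2 <= OBJ_RADIUS ** 2
obj_spectrum = np.zeros((N, N), dtype=complex)
obj_spectrum[in_band] = rng.normal(size=in_band.sum()) + 1j * rng.normal(size=in_band.sum())
obj = np.fft.ifft2(np.fft.ifftshift(obj_spectrum))
obj /= np.abs(obj).max()

# Tilted plane-wave reference and the recorded (real-valued) hologram.
ref = A_REF * np.exp(2j * np.pi * (CARRIER[0] * rows + CARRIER[1] * cols) / N)
hologram = np.abs(obj + ref) ** 2


def reconstruct(hologram, carrier, radius):
    """Recover the complex object field from one side-band of the hologram."""
    n = hologram.shape[0]
    spectrum = np.fft.fftshift(np.fft.fft2(hologram))        # step 1: FFT
    r, c = np.indices(hologram.shape)
    center = (n // 2 - carrier[0], n // 2 - carrier[1])      # E_obj * conj(E_ref) term
    window = (r - center[0]) ** 2 + (c - center[1]) ** 2 <= radius ** 2
    sideband = np.where(window, spectrum, 0)                 # step 2: select side-band
    sideband = np.roll(sideband, carrier, axis=(0, 1))       # shift it to the origin
    return np.fft.ifft2(np.fft.ifftshift(sideband))          # step 3: inverse FFT


recovered = reconstruct(hologram, CARRIER, OBJ_RADIUS + 2) / A_REF
print("max reconstruction error:", np.abs(recovered - obj).max())
```

Shifting the selected side-band to the origin removes the carrier fringe, so the inverse transform returns the object field multiplied only by the reference amplitude.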
This concept can be extended to retrieve multiple dark-field images in a single multiplexed hologram. Our approach is schematically shown in Fig. 4. Instead of using one illumination beam and a corresponding reference beam we now use two illumination beams that generate two object beams that are projected on the image sensor. To these object beams we add two corresponding reference beams and we use coherence gating [31,32] to ensure that each object beam only adds coherently with its corresponding reference beam. In addition, we also give the two reference beams different azimuthal angles resulting in a different orientation of the side-bands of the spectra of the resulting holograms as shown in Fig. 4(b). With this approach, two holograms are captured by the image sensor using only one image acquisition, and the two object fields can be retrieved with only three Fast Fourier Transforms (FFTs).
This method is essentially a form of frequency multiplexing, which is already used in telecommunications for multiplexing signals [33]. Compared to time domain signals, images have 2D spatial frequency spectra which offer a lot of room to fill the frequency spectrum with multiple object fields that can all be acquired in a single image acquisition. The number of object fields that can ultimately be packed in the spatial frequency domain depends on the pixel size of the image sensor (according to Eq. (8)) and on its dynamic range. With the continuous improvements of image sensor technology we expect that this technique can be scaled up to ultimately acquire more than four object fields in parallel.
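The multiplexing scheme described above can be illustrated with a small simulation. In this sketch coherence gating is modelled, as an assumption, by adding the two holograms as intensities, so that each object field interferes only with its own reference wave. The carrier frequencies, amplitudes and random object fields are hypothetical. The retrieval uses one forward FFT and one inverse FFT per object field.

```python
import numpy as np

N = 256
A_REF = 2.0
OBJ_RADIUS = 10
# Hypothetical carriers that place the side-bands at roughly 45 and 135 degrees azimuth.
CARRIERS = {"-1st": (40, 40), "+1st": (40, -40)}

rows, cols = np.indices((N, N))
rng = np.random.default_rng(1)
in_band = (rows - N // 2) ** 2 + (cols - N // 2) ** 2 <= OBJ_RADIUS ** 2


def random_object_field():
    """Band-limited complex field standing in for one diffraction-order image."""
    spectrum = np.zeros((N, N), dtype=complex)
    spectrum[in_band] = rng.normal(size=in_band.sum()) + 1j * rng.normal(size=in_band.sum())
    field = np.fft.ifft2(np.fft.ifftshift(spectrum))
    return field / np.abs(field).max()


def reference(carrier):
    """Tilted plane-wave reference with an integer carrier frequency."""
    return A_REF * np.exp(2j * np.pi * (carrier[0] * rows + carrier[1] * cols) / N)


fields = {name: random_object_field() for name in CARRIERS}

# Mutually incoherent holograms add in intensity on the sensor.
multiplexed = sum(np.abs(fields[name] + reference(c)) ** 2 for name, c in CARRIERS.items())

# One forward FFT, shared by both reconstructions ...
spectrum = np.fft.fftshift(np.fft.fft2(multiplexed))

recovered = {}
for name, carrier in CARRIERS.items():
    center = (N // 2 - carrier[0], N // 2 - carrier[1])
    window = (rows - center[0]) ** 2 + (cols - center[1]) ** 2 <= (OBJ_RADIUS + 2) ** 2
    sideband = np.roll(np.where(window, spectrum, 0), carrier, axis=(0, 1))
    # ... and one inverse FFT per object field (three FFTs in total).
    recovered[name] = np.fft.ifft2(np.fft.ifftshift(sideband)) / A_REF

for name in CARRIERS:
    print(name, "max error:", np.abs(recovered[name] - fields[name]).max())
```

The two fields are recovered independently because their side-bands occupy different regions of the spectrum; if the carriers were chosen too close together the selection windows would overlap and cross-talk would appear.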
In our off-axis DHM the reference beams are placed at azimuthal angles of nominally +45° and +135° relative to the x-axis. For convenience we refer to these beams as Ref−1 and Ref+1. The target is illuminated from opposite directions which generates two object beams ill−1 and ill+1 that are imaged on the camera.

Off-axis df-DHM setup
Figure 5 shows the novel off-axis dark-field digital holographic microscope that was built to demonstrate the scheme for parallel acquisition of multiple holograms that was presented in the previous section.
The off-axis df-DHM setup comprises a fiber-coupled supercontinuum white light source (LS; Leukos Rock 400 5) combined with an Acousto-Optical Tunable Filter (AOTF; Gooch & Housego TF550-300-4-6-GH57A). This AOTF device provides beams with a bandwidth in the range of 4-7 nm and covers the whole visible range (400-700 nm). We also added dichroic mirrors and bandpass filters in front of the AOTF to block longer wavelengths of the source. This AOTF was selected based on considerations presented in [24], where we showed that the required Field-of-View (FoV) (100 µm) for semiconductor metrology in a df-DHM setup can be achieved with light sources with a bandwidth of about 5 nm. With this combination we get a horizontally polarized laser beam with an optical power of more than 1 mW. The AOTF introduces some wavefront distortions which we remove using a spatial filter to improve the wavefront quality and to obtain a beam with a Gaussian beam profile.
For the parallel acquisition of the +1st and −1st orders we separated the beam in two paths using a 50/50 beam splitter (BS1; Thorlabs BS013) and for each path we added a 90/10 beam splitter (BS2; Thorlabs BS025) that splits 10% of the light in the reference beam and the remaining 90% of the light in the illumination beam. As shown in Fig. 5, we matched the distances of each beam pair (illumination beam and reference beam) with the use of delay lines so that the optical path difference (OPD) between each pair is minimized. In addition, we have also introduced an optical path length difference between the two pairs of approximately 10 cm. This is well beyond the 24 µm coherence length of the light source which ensures that we have two mutually incoherent holograms on the camera.
Our df-DHM setup uses four polarization-maintaining (PM) fibers (PM; Schäfter+Kirchhoff PMC-400Si-2.3-NA014) with a nominal length of 1.5 meters to couple the two illumination beams and the two reference beams to the holographic microscope head with collimation lenses (CL; Schäfter+Kirchhoff 60FC-M10-01). This separates the "light source" from the "sensor head" and offers flexibility to build a compact df-DHM inside a metrology system where volume constraints can be challenging. Realizing a stable fiber coupling, however, is a challenge since small beam pointing fluctuations can lead to variations in coupling efficiency [34,35] that create small additional beam intensity fluctuations at the fiber output. In order to minimize this effect we have placed the optics in front of the fibers in a separate housing that reduces beam pointing fluctuations. This resulted in coupling efficiencies of approximately 60% for all four arms. The first results that we got from this setup and that we will present in Section 3.3 indicate that this simple approach already reduces the intensity noise in the illumination beams to acceptable levels.
The microscope has two off-axis illumination arms which illuminate the target from opposite directions at an incident angle of ≈ 70° with respect to the normal of the sample plane. Each illumination arm consists of a collimation lens (CL2; Schäfter+Kirchhoff 60FC-M3.1-51) resulting in a large illumination spot of ≈ 500 µm. Two adjustable mirrors were used for fine-tuning the angle of incidence using the approach that is described in [24]. The objective lens is a plano-asphere lens with an effective focal length of 8 mm and an NA of 0.5 (SL; Thorlabs A240TM). With this NA the setup of Fig. 5 is capable of imaging small gratings with pitches down to 400 nm. We chose a nominal magnification of 100x by placing our detector 800 mm away from the SL. A camera (Basler acA4112-8gm) with a 12 Mpixel CMOS image sensor with 3.45 µm pixel size was used.
According to Eq. (8) the 3.45 µm pixel size and the 100x magnification yield an upper and lower limit of $\theta_{ref}$ of, respectively, 3° and 0.86° (for the shortest wavelength). In our setup we set $\theta_{ref}$ to approximately 1° for both reference beams. This was measured from the angular spectrum of a measured hologram. The azimuthal angles of the two holograms were measured from the angle between a horizontal line and the line through the origin of the spectrum and the center of the side-band. The azimuthal angles were 37.5° and 142.8° respectively.
Our microscope uses a single imaging lens which adds a parabolic phase profile to the object image on the camera [36]. We can compensate for this curvature by also using a spherical reference wavefront. We realize this by directing the spherical wave that is radiated from the reference fiber directly to the image sensor without using a collimation lens.

Experimental results
This section presents the first experimental results that demonstrate the diffraction-based overlay metrology capabilities of our df-DHM setup. For these experiments we used an ASML test wafer with multiple targets. The square gratings printed on this wafer have various pitches, P, ranging from 400 to 1200 nm and various sizes ranging from 58×58 µm² to 5×5 µm². Firstly, we present the parallel acquisition of multiple holograms followed by the first OV measurements on targets with a known pre-programmed overlay where we compare the measured values with these pre-programmed values. Finally, we present measured overlay reproducibility data that demonstrate that our parallel image acquisition setup reduces the impact of intensity noise.

Parallel acquisition of multiple holograms
Firstly, we measured the actual magnification M of the setup. We measured M by collecting dark-field images of a known target. The silicon wafer that we used contained square gratings with dimensions of 58×58 µm² and an etch depth of 90 nm. From the measured image size of this target on the camera we obtained a 100x magnification with an uncertainty of approximately ± 0.5 which is based on five repeated measurements.
In order to demonstrate our concept of parallel acquisition, we set the center wavelength at 532 nm with the AOTF. At this wavelength the bandwidth B is approximately 6 nm. This wavelength ensured that most of the diffracted light travels through the center region of the lens. This improved the imaging performance since lens aberrations are not yet corrected for in the experiments that we report here.
For the image acquisition, the delay lines of the reference arms (Fig. 5) were adjusted to ensure that the maxima of the observed fringe patterns were centered in the images. Then we acquired several holographic images of different targets of this test wafer. Figure 6(a) shows two overlapping holograms of two square gratings in one camera image. The presented targets are two square top-only (resist only) gratings with dimensions of 38×38 µm² and an etch depth of 90 nm.
The large 1° tilt angle of the reference beam results in a large fringe density and only if we enlarge the area of interest can we see the fringe pattern that is formed by the two overlapping off-axis holograms. Figure 6(b) presents the angular spectrum of the detected image, obtained with a Fast Fourier Transform (FFT). This image shows that the cross-correlation terms of the two overlapping fringe patterns are fully separated. Figure 6 presents the reconstruction steps needed to retrieve the amplitude and phase of the obtained object field. From Fig. 6(b) we selected the two cross-correlation terms located on the top-left and top-right of the image. These areas contain the cross-correlation information for image reconstruction of the −1st and +1st order images. As shown, top left is the side-band of the −1st order image and top right is the side-band of the +1st order image while the side-bands at the bottom of the image are their complex conjugates, as explained in Section 2.1.
The side-bands in the spatial frequency spectrum show multiple displaced bright spots that correspond to the different pitches of the gratings that are present in the image. Figure 6(c),(e) show the intensity and Fig. 6(d),(f) show the phase of the retrieved images of the −1st order and +1st order respectively. The phase images represent a phase profile of the optical field reflected from the surface of the target. For targets with different pitches we expect different phase profiles, similarly displayed in the spatial frequency domain.

Overlay measurements with off-axis df-DHM
After demonstrating the parallel acquisition of multiple holograms, we move to the next step where we demonstrate the use of our DHM concept for overlay metrology. The presented off-axis df-DHM can simultaneously measure the +1st and −1st diffraction orders which can be used to calculate overlay using Eq. (4).
The silicon test wafer that was used for our measurements contains various diffraction-based overlay targets with programmed overlay values in a range of −20 nm to +20 nm. Each target consists of two overlapping grating pairs as shown in Fig. 2 with a bias d of +20 nm and −20 nm that was added to the programmed overlay. The size of each grating pair is 38×38 µm², the same as the ones shown in Fig. 6. Figure 7(a) presents the retrieved −1st and +1st order images for targets with various programmed OV errors at a wavelength of 532 nm. These retrieved images were obtained using the method described in Section 3.1.
In our setup the two illumination spots have a small intensity difference that leads to significant overlay offsets. In order to correct for this we also measured a reference resist grating without the etched grating underneath. For such a grating an unbalance in the measured +1st order and −1st order images can only come from an unbalance in the two illumination intensities. The measured intensities for this reference grating have therefore been used to correct for the illumination unbalance in the overlay measurements.
In order to come to an overlay number we determined the total signal level for the −1st order image and the +1st order image in a Region-of-Interest (RoI). The RoI size was 10×10 µm² and was centered on the two grating pairs. This is done for the positively biased grating pair (d = +20 nm) and the negatively biased grating pair (d = −20 nm). This yields two intensity difference signals $\Delta I_{+d}$ and $\Delta I_{-d}$ that allow us to calculate overlay with Eq. (4). Figure 7(b) presents measured OV as a function of the programmed OV. The black dashed line shows the expected OV while the red dotted line is a linear fit of the experimental data. The measured and programmed OV values differ by approximately 1 nm.
Our measurements show good matching to a linear fit of the measured data (R² = 0.9993), which only shows small deviations from the expected value. The observed average −1 nm offset between measured and programmed overlay can be explained by the estimated 1 nm OV uncertainty that can occur during the lithographic patterning step of the resist grating. In addition we have also observed a small drift in our breadboard that may also contribute to this offset.
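The evaluation chain described in this section (RoI signals, illumination-unbalance correction, bias-based overlay, linear fit) can be sketched on synthetic data. The text does not spell out the exact form of the unbalance correction, so the division of the +1st order signal by a balance factor is an assumption, as are the sensitivity, the 5% unbalance and the −1 nm offset used to generate the data. None of the numbers below are measured values.

```python
import numpy as np

K_PER_NM = 0.01          # overlay sensitivity (assumed)
BIAS_NM = 20.0           # bias d of the two grating pairs
UNBALANCE = 1.05         # assumed intensity ratio of the +1st to the -1st illumination beam
OFFSET_NM = -1.0         # assumed constant offset between printed and programmed overlay

programmed_nm = np.linspace(-20.0, 20.0, 9)


def roi_signals(ov_nm, bias_nm):
    """Summed RoI signals of the +1st and -1st order images for one grating pair."""
    shift = ov_nm + bias_nm
    return UNBALANCE * (1.0 + K_PER_NM * shift), 1.0 - K_PER_NM * shift


def overlay(ov_nm, balance):
    """Bias-based overlay after dividing the +1st order signal by `balance`."""
    plus_pos, minus_pos = roi_signals(ov_nm, +BIAS_NM)
    plus_neg, minus_neg = roi_signals(ov_nm, -BIAS_NM)
    delta_pos = plus_pos / balance - minus_pos
    delta_neg = plus_neg / balance - minus_neg
    return BIAS_NM * (delta_pos + delta_neg) / (delta_pos - delta_neg)


# The balance factor would come from the resist-only reference grating, where
# the +1st/-1st signal ratio reflects the illumination unbalance alone.
measured_balance = UNBALANCE
uncorrected_nm = overlay(programmed_nm + OFFSET_NM, balance=1.0)
corrected_nm = overlay(programmed_nm + OFFSET_NM, balance=measured_balance)

# Linear fit of measured versus programmed overlay and its R^2 value.
slope, intercept = np.polyfit(programmed_nm, corrected_nm, 1)
residuals = corrected_nm - (slope * programmed_nm + intercept)
ss_res = np.sum(residuals ** 2)
ss_tot = np.sum((corrected_nm - corrected_nm.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot

print(f"slope={slope:.4f}, intercept={intercept:.3f} nm, R^2={r_squared:.6f}")
print(f"mean uncorrected offset: {np.mean(uncorrected_nm - programmed_nm):.2f} nm")
```

In this noise-free model the corrected data fall on a line with unit slope and an intercept equal to the assumed offset, while the uncorrected data show an additional constant overlay error caused by the illumination unbalance.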

Differential intensity noise and measured overlay reproducibility
DBO encodes overlay in a small intensity difference between the +1st and −1st images which makes it sensitive only to differential intensity noise in the illumination beams since common-mode noise in the measured +1st and −1st order images will drop out in the difference. We use a single light source for measuring these images so ideally the differential noise in the measured images should be zero. However, our setup uses single-mode fibers to guide the two illumination and two reference beams to the microscope head and, as noted in Section 2.2, small fluctuations in the fiber coupling efficiency can introduce some differential intensity noise to the illumination and reference beams.
The impact of intensity fluctuations in the illumination beams on the measured overlay can be modeled with the DBO signal formation model that we presented in the introduction. Denoting the normalized intensity fluctuations on the +1st and −1st order illumination beam by $\epsilon_{+1}$ and $\epsilon_{-1}$ yields for the detected DBO signals on the two biased grating pairs:
$$I_{+1}^{\pm d} = I_{ill} (1 + \epsilon_{+1}) \times DE \times [1 + K (OV \pm d)], \qquad I_{-1}^{\pm d} = I_{ill} (1 + \epsilon_{-1}) \times DE \times [1 - K (OV \pm d)] \tag{9}$$
Taking the difference $\Delta I_{\pm d} = I_{+1}^{\pm d} - I_{-1}^{\pm d}$ and substituting these differences in Eq. (4) using the approximation that $1/(1 + \epsilon) \approx 1 - \epsilon$ for $\epsilon \ll 1$, yields for the noise $\delta OV$ in the measured overlay:
$$\delta OV \approx \frac{\epsilon_{+1} - \epsilon_{-1}}{2K} - \frac{\epsilon_{+1}^2 - \epsilon_{-1}^2}{4K} \tag{10}$$
The noise terms $\epsilon_{\pm 1}$ are of the order of 1% so we ignore the second term on the right-hand side. It can be seen that only differential noise will result in an overlay error. In practice $K$ is of the order of 10⁻² nm⁻¹ [37] so for overlay errors less than 0.1 nm the normalized differential noise $\epsilon_{+1} - \epsilon_{-1}$ must be of the order of 10⁻³ or below.
Figure 8(a) shows the measured normalized intensity fluctuations of the two illumination beams. This data was obtained by redirecting the two illumination beams to a CCD camera (Vimba Prosilica GT2300 with 4.1 Mpixels of 4.5 µm pixel size) and acquiring 100 images with an acquisition time of 50 µs over a time interval of about 100 seconds. The intensity in each beam was determined by summing the signal levels of those pixels that are covered by an illumination spot. As a final step we normalized the measured intensity to the mean value of all 100 measurements. The result shown in Fig. 8(a) clearly shows highly correlated intensity fluctuations in the "+1st illumination" and "−1st illumination" beams. The differential noise has a standard deviation of 6.7 × 10⁻⁴ and the two beams have a correlation coefficient of 0.98. This encouraging result shows that the impact of fiber coupling fluctuations is small compared to the intensity noise of the source.
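The noise propagation can be checked with a short Monte Carlo sketch. It assumes the linearized signal model with multiplicative noise factors (1 + ε) on the two illumination beams, for which the first-order overlay noise works out to (ε(+1) − ε(−1)) / (2K). The 1% common-mode level and the 6.7 × 10⁻⁴ differential level are taken from the text; the bias, true overlay and sample count are hypothetical.

```python
import numpy as np

K_PER_NM = 0.01     # overlay sensitivity, order of magnitude quoted in the text
BIAS_NM = 20.0      # bias d
TRUE_OV_NM = 1.0    # hypothetical true overlay


def measured_overlay(eps_plus, eps_minus):
    """Bias-based overlay when the illumination beams fluctuate by eps_plus / eps_minus."""
    def delta_i(bias_nm):
        shift = TRUE_OV_NM + bias_nm
        i_plus = (1.0 + eps_plus) * (1.0 + K_PER_NM * shift)
        i_minus = (1.0 + eps_minus) * (1.0 - K_PER_NM * shift)
        return i_plus - i_minus

    d_pos, d_neg = delta_i(+BIAS_NM), delta_i(-BIAS_NM)
    return BIAS_NM * (d_pos + d_neg) / (d_pos - d_neg)


rng = np.random.default_rng(0)
common = rng.normal(0.0, 1e-2, 1000)          # ~1 % common-mode source noise
differential = rng.normal(0.0, 6.7e-4, 1000)  # differential noise level from the text
eps_plus = common + differential / 2
eps_minus = common - differential / 2

ov_error = measured_overlay(eps_plus, eps_minus) - TRUE_OV_NM
first_order = (eps_plus - eps_minus) / (2 * K_PER_NM)

print("std of overlay error [nm]:", ov_error.std())
print("std of first-order estimate [nm]:", first_order.std())
print("pure common-mode error [nm]:", measured_overlay(0.01, 0.01) - TRUE_OV_NM)
```

The simulated overlay error follows the first-order estimate to within a few percent, and a purely common-mode fluctuation produces no overlay error at all, in line with the statement that only differential noise matters.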
In holography, however, intensity noise in the reference beams also shows up as noise in the first order images, so we also measured the intensity fluctuations of the two reference beams using the same procedure as described above for the illumination beams. Since the reference beams have lower intensities we used a longer acquisition time of 500 µs. The measured normalized intensity fluctuations of the reference beams are shown in Fig. 8(b). This measurement was done separately from the illumination beam measurement, so a point-to-point comparison of the noise in the illumination and reference beams is not possible. Figure 8(b) shows that the standard deviation of the observed noise is comparable to the noise in the illumination beams (the standard deviation of the differential noise is 1.5 × 10⁻³). However, the correlation between the two reference beams is clearly not as good as what we observe for the illumination beams. The measured correlation is only 0.69, which indicates that we suffer from some fluctuations in the fiber coupling efficiency in one (or potentially both) of the reference fibers. This is supported by the fact that one of the reference beams in the current setup travels a much longer path through air before it is coupled into the fiber. Since we have achieved such encouraging results on the illumination fibers, we are confident that we can ultimately also achieve a good noise correlation between the reference beams.
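The beam noise analysis described above (normalize each beam to its mean, then compare the two beams) can be sketched in a few lines of Python. This is a minimal illustration and not the authors' processing code. The function names, the synthetic input series, and the noise amplitudes are hypothetical. The conversion to overlay noise assumes the scaling implied by the text, in which a differential noise of 10⁻³ corresponds to about 0.1 nm for K = 10⁻² nm⁻¹.

```python
import random
import statistics


def normalize(intensities):
    """Normalize a series of summed spot intensities to its mean value."""
    mean = statistics.fmean(intensities)
    return [value / mean for value in intensities]


def pearson(x, y):
    """Pearson correlation coefficient of two equally long series."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def beam_noise_stats(beam_a, beam_b):
    """Return (std of the differential noise, correlation) for two beams."""
    norm_a, norm_b = normalize(beam_a), normalize(beam_b)
    differential = [a - b for a, b in zip(norm_a, norm_b)]
    return statistics.stdev(differential), pearson(norm_a, norm_b)


# Synthetic stand-in for 100 camera frames (hypothetical amplitudes):
# a shared source fluctuation plus a small independent term per beam.
rng = random.Random(0)
common = [rng.gauss(0.0, 3e-3) for _ in range(100)]
beam_a = [1000.0 * (1.0 + c + rng.gauss(0.0, 4e-4)) for c in common]
beam_b = [800.0 * (1.0 + c + rng.gauss(0.0, 4e-4)) for c in common]

diff_std, corr = beam_noise_stats(beam_a, beam_b)

K = 1e-2  # nm^-1, order of magnitude quoted in the text
overlay_noise_nm = diff_std / K  # assumed first-order scaling

print(f"differential noise std: {diff_std:.2e}")
print(f"correlation: {corr:.3f}")
print(f"implied overlay noise: {overlay_noise_nm:.3f} nm")
```

Normalizing each beam to its own mean removes the difference in absolute power between the beams, so only the relative fluctuations enter the differential noise and the correlation.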
After characterizing the differential noise in the illumination and reference beams, we characterized the differential noise in the +1st and −1st order images acquired with our df-DHM setup. Figure 9 shows the normalized signal variations in the retrieved +1st and −1st order images. These data were obtained by collecting 100 holograms at an acquisition time of 30 ms and measuring the signal levels in the +1st and −1st order signals. The signal is formed by a coherent addition of a reference beam and an object beam, so we can expect noise contributions from the illumination and reference beams and possibly also from air turbulence, which can cause fluctuations in the contrast of the interference pattern. The data in Fig. 9 show that the −1st order and +1st order signals have an encouraging correlation of 0.91. However, we also observe a difference in the magnitude of the noise, which results in an incomplete cancellation of the noise in the measured intensity difference. The measured data indicate that a combination of sub-optimal fiber coupling and beam pointing fluctuations currently prevents a complete suppression of the differential noise. We believe that better fiber couplers and shorter beam paths before the fiber couplers will solve this problem.
Figure 10(a) presents measured overlay reproducibility data on nine different overlay targets for both parallel and sequential acquisition of the +1st and −1st order images. These results were obtained by performing 150 overlay measurements on each target and calculating the standard deviation of the measured overlay. Parallel acquisition consistently shows better reproducibility, which is due to the suppression of the common-mode intensity noise.
The measured reproducibility of the parallel acquisition is of the order of 0.13 nm, which is promising since it already approaches the level needed for overlay metrology. Moreover, this result was obtained on a simple breadboard setup where we could not yet achieve a complete suppression of the differential noise in the illumination beams. We expect that a more stable setup will significantly improve the reproducibility, which will make the benefit of parallel over sequential acquisition even more pronounced.
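The benefit of parallel over sequential acquisition can be illustrated with a small Monte Carlo sketch. It is not a model of the actual setup. It assumes a linearized DBO signal in which the normalized ±1st order intensities are (1 + ϵ)(1 ± K(OV + b)/2) for a pad with bias b, and it models sequential acquisition as two exposures with independent source fluctuations. The bias, the true overlay, and the noise amplitudes are hypothetical. Only the number of repeats (150) and the order of magnitude of K are taken from the text.

```python
import random
import statistics

K = 1e-2        # nm^-1, order of magnitude quoted in the text
D_BIAS = 20.0   # nm, hypothetical programmed bias
OV_TRUE = 2.0   # nm, hypothetical true overlay


def measure_overlay(eps_plus, eps_minus):
    """Overlay from two biased pads under the assumed linear DBO model."""
    asym = {}
    for bias in (+D_BIAS, -D_BIAS):
        i_plus = (1.0 + eps_plus) * (1.0 + 0.5 * K * (OV_TRUE + bias))
        i_minus = (1.0 + eps_minus) * (1.0 - 0.5 * K * (OV_TRUE + bias))
        asym[bias] = i_plus - i_minus
    total = asym[+D_BIAS] + asym[-D_BIAS]
    delta = asym[+D_BIAS] - asym[-D_BIAS]
    return D_BIAS * total / delta


def reproducibility(parallel, n=150, seed=1,
                    sigma_common=3e-3, sigma_diff=4e-4):
    """Standard deviation of n simulated overlay measurements (nm)."""
    rng = random.Random(seed)
    results = []
    for _ in range(n):
        common_plus = rng.gauss(0.0, sigma_common)
        # Parallel: both orders see the same source fluctuation.
        # Sequential: the second exposure sees an independent one.
        if parallel:
            common_minus = common_plus
        else:
            common_minus = rng.gauss(0.0, sigma_common)
        eps_plus = common_plus + rng.gauss(0.0, sigma_diff)
        eps_minus = common_minus + rng.gauss(0.0, sigma_diff)
        results.append(measure_overlay(eps_plus, eps_minus))
    return statistics.stdev(results)


print(f"parallel:   {reproducibility(parallel=True):.3f} nm")
print(f"sequential: {reproducibility(parallel=False):.3f} nm")
```

In this sketch a purely common-mode fluctuation leaves the retrieved overlay unchanged, so only the small independent term limits the parallel case, whereas the full source noise enters the sequential case.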

Conclusion
We have presented a novel concept for measuring overlay based on spatial frequency multiplexed dark-field Digital Holographic Microscopy (df-DHM), together with first measured overlay data obtained with this concept. We built a breadboard version using standard off-the-shelf components and only a single imaging lens. This setup was not yet optimized for mechanical stability and alignment. Moreover, the retrieved images were not yet corrected for the aberrations introduced by the single imaging lens. Despite these imperfections, the measured data show the potential of df-DHM for overlay metrology.
After an experimental demonstration of the capability to acquire two holograms in parallel, we presented data showing good agreement between measured and programmed overlay on a test sample. The linearity was very good (R² = 0.9993), and the maximum observed deviation of about 1 nm between the measured and the expected overlay can be explained by a combination of overlay uncertainty in the test sample (≈ 1 nm) and overlay errors (≈ 1 nm) introduced by various small imperfections in our breadboard setup. For example, a small mechanical misalignment of the stage on which the sample was mounted, in combination with a small illumination beam inhomogeneity, can already introduce a 1% relative intensity imbalance between the +1st order and −1st order images. For our sample this 1% imbalance would already lead to a 1 nm error.
Diffraction-based overlay measurements are sensitive to differential noise in the illumination beams that generate the +1st and −1st order images, so we have also looked into the noise performance of our df-DHM setup. The two illumination beams and the two reference beams originate from the same source, so ideally the noise in these four beams is common mode with no differential noise terms. Measured data on the two illumination beams show a strong common-mode noise term that cancels in the overlay measurement and only a small differential noise term that has a negligible impact on the overlay measurement. Unfortunately, the two reference beams show a significant differential noise term on top of the common-mode noise. After carefully studying this effect we have concluded that it most likely comes from a small fiber misalignment in combination with small beam pointing fluctuations of the beams that are coupled into the single-mode fibers. Despite these imperfections we have been able to show 0.13 nm overlay reproducibility using parallel acquisition.
During the course of our investigations we have identified various improvements to our setup that we will implement in future experiments. For example, more stable fiber couplers in combination with shorter beam paths in the illuminator part are expected to significantly reduce the differential noise. Better aligned sample stages in combination with a more homogeneous illumination beam are expected to significantly improve the overlay metrology precision to the levels required in the semiconductor industry. Once we have calibrated the lens aberrations we can also apply image corrections, enabling overlay measurements over a larger wavelength range. In this paper, we have shown that df-DHM is a very promising candidate for future overlay metrology, and we are working in many directions on further improvements.