Robust overlay metrology with differential Mueller matrix calculus

Overlay control is of vital importance to good device performance in semiconductor manufacturing. In this work, the differential Mueller matrix calculus is introduced to investigate the Mueller matrices of double-patterned gratings with overlay displacements, which helps to reveal six elementary optical properties hidden in the Mueller matrices. We find and demonstrate that, among these six elementary optical properties, the linear birefringence and dichroism along the ±45° axes, LB′ and LD′, show a linear response to the overlay displacement and are zero when the overlay displacement is absent at any conical mounting. Although the elements of the two 2 × 2 off-diagonal blocks of the Mueller matrix have been reported in the literature to have a property similar to that of LB′ and LD′, we demonstrate that this property is only valid at a special conical mounting with the plane of incidence parallel to the grating lines. Because LB′ and LD′ keep this property in the presence of overlay displacement at any conical mounting, whereas the off-diagonal block elements do not, they are a more robust indicator for diffraction-based overlay metrology.
© 2017 Optical Society of America
OCIS codes: (120.3940) Metrology; (120.2130) Ellipsometry and polarimetry; (050.1950) Diffraction gratings.
Vol. 25, No. 8 | 17 Apr 2017 | OPTICS EXPRESS 8491. #286087, https://doi.org/10.1364/OE.25.008491. Journal © 2017. Received 3 Feb 2017; revised 12 Mar 2017; accepted 28 Mar 2017; published 3 Apr 2017.
References and links
1. M. Maenhoudt, J. Versluijs, H. Struyf, J. Van Olmen, and M. Van Hove, “Double patterning scheme for sub-0.25 k1 single damascene structures at NA = 0.75, λ = 193 nm,” Proc. SPIE 5754, 1508–1518 (2005).
2. Y. S. Jung, J. B. Chang, E. Verploegen, K. K. Berggren, and C. A. Ross, “A path to ultranarrow patterns using self-assembled lithography,” Nano Lett. 10(3), 1000–1005 (2010).
3. S. H. Park, D. O. Shin, B. H. Kim, D. K. Yoon, K. Kim, S. Y. Lee, S. H. Oh, S. W. Choi, S. C. Jeon, and S. O. Kim, “Block copolymer multiple patterning integrated with conventional ArF lithography,” Soft Matter 6(1), 120–125 (2010).
4. E. Vogel, “Technology and metrology of new electronic materials and devices,” Nat. Nanotechnol. 2(1), 25–32 (2007).
5. J. Finders, M. Dusa, B. Vleeming, B. Hepp, M. Maenhoudt, S. Cheng, and T. Vandeweyer, “Double patterning lithography for 32 nm: critical dimensions uniformity and overlay control considerations,” J. Micro/Nanolith. MEMS MOEMS 8(1), 011002 (2009).
6. W. Yang, R. Lowe-Webb, S. Rabello, J. Hu, J. Y. Lin, J. Heaton, M. Dusa, A. den Boef, M. van der Schaar, and A. Hunter, “A novel diffraction based spectroscopic method for overlay metrology,” Proc. SPIE 5038, 200–207 (2003).
7. C. H. Ko and Y. S. Ku, “Overlay measurement using angular scatterometer for the capability of integrated metrology,” Opt. Express 14(13), 6001–6010 (2006).
8. A. J. den Boef, “Optical wafer metrology sensors for process-robust CD and overlay control in semiconductor device manufacturing,” Surf. Topogr.: Metrol. Prop. 4(2), 023001 (2016).
9. S. Peterhänsel, M. L. Gödecke, V. F. Paz, K. Frenner, and W. Osten, “Detection of overlay error in double patterning gratings using phase-structured illumination,” Opt. Express 23(19), 24246–24256 (2015).
10. Y. N. Kim, J. S. Paek, S. Rabello, S. Lee, J. Hu, Z. Liu, Y. Hao, and W. McGahan, “Device based in-chip critical dimension and overlay metrology,” Opt. Express 17(23), 21336–21343 (2009).
11. J. Li, Y. Liu, P. Dasari, J. Hu, N. Smith, O. Kritsun, and C. Volkman, “Advanced diffraction-based overlay for double patterning,” Proc. SPIE 7638, 76382C (2010).
12. C. Fallet, T. Novikova, M. Foldyna, S. Manhas, B. H. Ibrahim, A. De Martino, C. Vannuffel, and C. Constancias, “Overlay measurements by Mueller polarimetry in back focal plane,” J. Micro/Nanolith. MEMS MOEMS 10(3), 033017 (2011).
13. C. J. Raymond, “Scatterometry for Semiconductor Metrology,” in Handbook of Silicon Semiconductor Metrology (Marcel Dekker Inc., 2001).
14. V. F. Paz, S. Peterhänsel, K. Frenner, and W. Osten, “Solving the inverse grating problem by white light interference Fourier scatterometry,” Light Sci. Appl. 1(11), e36 (2012).
15. M. A. Henn, H. Gross, F. Scholze, M. Wurm, C. Elster, and M. Bär, “A maximum likelihood approach to the inverse problem of scatterometry,” Opt. Express 20(12), 12771–12786 (2012).
16. J. Zhu, S. Liu, X. Chen, C. Zhang, and H. Jiang, “Robust solution to the inverse problem in optical scatterometry,” Opt. Express 22(18), 22031–22042 (2014).
17. J. Li, J. J. Hwu, Y. Liu, S. Rabello, Z. Liu, and J. Hu, “Mueller matrix measurement of asymmetric gratings,” J. Micro/Nanolith. MEMS MOEMS 9(4), 041305 (2010).
18. T. Novikova, P. Bulkin, V. Popov, B. H. Ibrahim, and A. De Martino, “Mueller polarimetry as a tool for detecting asymmetry in diffraction grating profiles,” J. Vac. Sci. Technol. B 29(5), 051804 (2011).
19. D. J. Dixit, V. Kamineni, R. Farrell, E. R. Hosler, M. Preil, J. Race, B. Peterson, and A. C. Diebold, “Metrology for block copolymer directed self-assembly structures using Mueller matrix-based scatterometry,” J. Micro/Nanolith. MEMS MOEMS 14(2), 021102 (2015).
20. X. Chen, C. Zhang, S. Liu, H. Jiang, Z. Ma, and Z. Xu, “Mueller matrix ellipsometric detection of profile asymmetry in nanoimprinted grating structures,” J. Appl. Phys. 116(19), 194305 (2014).
21. X. Chen, H. Jiang, C. Zhang, and S. Liu, “Towards understanding the detection of profile asymmetry from Mueller matrix differential decomposition,” J. Appl. Phys. 118(22), 225308 (2015).
22. R. M. A. Azzam, “Propagation of partially polarized light through anisotropic media with or without depolarization: A differential 4×4 matrix calculus,” J. Opt. Soc. Am. 68(12), 1756–1767 (1978).
23. R. Ossikovski, “Differential matrix formalism for depolarizing anisotropic media,” Opt. Lett. 36(12), 2330–2332 (2011).
24. N. Ortega-Quijano and J. L. Arce-Diego, “Mueller matrix differential decomposition,” Opt. Lett. 36(10), 1942–1944 (2011).
25. O. Arteaga and B. Kahr, “Characterization of homogenous depolarizing media based on Mueller matrix differential decomposition,” Opt. Lett. 38(7), 1134–1136 (2013).
26. R. Ossikovski and V. Devlaminck, “General criterion for the physical realizability of the differential Mueller matrix,” Opt. Lett. 39(5), 1216–1219 (2014).
27. S. R. Cloude, “Conditions for the physical realizability of matrix operators in polarimetry,” Proc. SPIE 1166, 177–185 (1990).
28. D. G. M. Anderson and R. Barakat, “Necessary and sufficient conditions for a Mueller matrix to be derivable from a Jones matrix,” J. Opt. Soc. Am. A 11(8), 2305–2319 (1994).
29. O. Arteaga, “Useful Mueller matrix symmetries for ellipsometry,” Thin Solid Films 571, 584–588 (2014).
30. Z. Sekera, “Scattering matrices and reciprocity relationships for various representation of the state of polarization,” J. Opt. Soc. Am. 56(12), 1732–1740 (1966).
31. L. Li, “Symmetries of cross-polarization diffraction coefficients of gratings,” J. Opt. Soc. Am. A 17(5), 881–887 (2000).
32. W. Ludwig and C. Falter, Symmetries in Physics: Group Theory Applied to Physical Problems (Springer, 1988).
33. I. T. Jolliffe, Principal Component Analysis, 2nd ed. (Springer, 2002).
34. M. G. Moharam, E. B. Grann, D. A. Pommet, and T. K. Gaylord, “Formulation for stable and efficient implementation of the rigorous coupled-wave analysis of binary gratings,” J. Opt. Soc. Am. A 12(5), 1068–1076 (1995).
35. L. Li, “Formulation and comparison of two recursive matrix algorithms for modeling layered diffraction gratings,” J. Opt. Soc. Am. A 13(5), 1024–1035 (1996).
36. X. Chen, S. Liu, H. Gu, and C. Zhang, “Formulation of error propagation and estimation in grating reconstruction by a dual-rotating compensator Mueller matrix polarimeter,” Thin Solid Films 571, 653–659 (2014).
37. S. Liu, X. Chen, and C. Zhang, “Development of a broadband Mueller matrix ellipsometer as a powerful tool for nanostructure metrology,” Thin Solid Films 584, 176–185 (2015).


Introduction
The unceasing demand for smaller semiconductor device features drives the development of many new optical lithography techniques, of which double patterning or multi-patterning lithography has emerged as a promising enhancement technique to reduce the critical dimension (CD) in the pattern on a wafer [1–3]. For the double patterning technique, the 2nd grating feature must be accurately aligned and printed at the half pitch of the 1st grating. A misalignment, which results in a so-called overlay error, will lead to a non-yielding device [4,5]. According to the ITRS (International Technology Roadmap for Semiconductors, www.itrs2.net), the maximum allowable overlay error is 4 nm (3σ) at today's feature sizes, but will approach values as small as 2 nm in 2018. The demand for tighter overlay control poses serious challenges to traditional image-based overlay metrology. Over the past years, diffraction-based overlay (DBO) metrology techniques have been developed to address these challenges [6–12].
The implementation of DBO metrology techniques mainly involves two key issues: a specially designed DBO target, and a specific measurement and analysis method. The design of the DBO target, which typically consists of one-dimensional gratings in the X and Y directions, is usually closely tied to the specific measurement and analysis method. A general requirement for the DBO target is that its size should be as small as possible because of the limited space reserved for the target in the pattern on a wafer. Most current overlay measurement techniques, such as normal incidence spectroscopic reflectometry (NISR) [6], angle-resolved scatterometry [7,8], the phase-structured illumination technique [9], and Mueller matrix ellipsometry (MME) [10–12], arise from optical scatterometry, also termed optical critical dimension (OCD) metrology, which has been developed for the measurement of CD in semiconductor manufacturing [13]. For this reason, the data analysis methods developed for OCD metrology can be naturally applied to overlay metrology. Moreover, in this case, a DBO target that contains only a single cell per direction (X and Y) is enough to obtain the overlay error, which also reduces the total target size. However, this data analysis approach involves computation-intensive inverse problem solving [14–16], which is typically ill-posed, especially when multiple parameters including the overlay error are floated together to achieve a best fit. Thus, with the data analysis approach inherited directly from OCD metrology, it is usually difficult to extract the overlay accurately from a single cell. To this end, an empirical analysis approach that does not need to solve the inverse problem was adopted to measure the overlay error by using DBO targets that usually consist of multiple cells per direction [6,8,11,12].
The empirical approach relies on a linear relation between the optical signature of the sample under test and the overlay displacement within a small range. Here, the term signature represents a property of light, which can be in the form of amplitude, phase, polarization, reflectance, ellipsometric parameters, etc. The NISR measurement technique [6], which collects the reflectance spectra of the zeroth-order diffracted light, requires a DBO target that contains at least three cells per direction with suitably designed shifts to measure the overlay error. Moreover, to improve the sensitivity, the designed shifts between two resist lines in the DBO target of NISR are typically set to 25–35% of the pitch, which makes the target less similar to device structures and is especially troublesome for double patterning processes. Other measurement techniques that use fewer cells per target were also developed. Angle-resolved scatterometry, which collects the ±1st diffraction orders, uses only two cells per target [8]. Its measurement is based on the fact that the overlay error induces an asymmetric intensity distribution, which is particularly pronounced in the higher diffraction orders. However, ±1st order scatterometry poses a very tight requirement on the pupil uniformity and on the quality of the calibration method used to reduce the effect of pupil non-uniformity. MME also uses a target that consists of only two cells per direction but collects the zeroth-order diffracted light [11,12]. Since the zeroth-order diffraction usually has a larger intensity than the higher orders, the measurement provides a good signal-to-noise ratio. In addition, the 16 elements of a 4 × 4 Mueller matrix provide much more information than a single intensity in angle-resolved scatterometry, and the overlay displacement introduces a composite structural asymmetry [17–21], which makes MME a good candidate for DBO metrology.
Current DBO metrology using MME is typically based on the property that the elements of the two 2 × 2 off-diagonal blocks of the Mueller matrix are zero when the overlay displacement is absent, and otherwise deviate from zero and respond to the overlay displacement linearly [10–12]. However, we will show in this work that this property of the off-diagonal block elements is essentially only valid at the azimuthal angle of φ = 90°, i.e., with the plane of incidence parallel to the grating lines. Furthermore, it is also necessary to accurately align the sample at φ = 90° when performing the measurement, since even a minor offset of 0.1° from φ = 90° will lead to a large measurement error. This makes DBO metrology using MME lack robustness. In this work, we applied the differential Mueller matrix calculus [22] to investigate the Mueller matrices of double-patterned grating structures with overlay displacements. The differential Mueller matrix, which can be obtained by the recently developed Mueller matrix differential decomposition [23,24], summarizes the elementary optical properties of the medium that influence the Stokes vector, including the linear birefringence along the p-s and ±45° axes, the linear dichroism along the p-s and ±45° axes, and the circular birefringence and dichroism. We found and demonstrated that, among the above elementary optical properties, the linear birefringence and dichroism along the ±45° axes exhibited a linear response to the overlay displacement and were zero when the overlay displacement was absent at any conical mounting (with the plane of incidence no longer perpendicular to grating lines). The simulation results demonstrated that the linear birefringence or dichroism along the ±45° axes was a more robust overlay indicator than the off-diagonal block elements of the Mueller matrix for DBO metrology.

Theory
The polarization state of an electromagnetic wave can be described by a Stokes vector that consists of four elements,
$$\mathbf{S} = \left[ I_p + I_s,\; I_p - I_s,\; I_{+45^\circ} - I_{-45^\circ},\; I_R - I_L \right]^{\mathrm{T}}, \tag{1}$$
where $I_p$ and $I_s$ are the light intensities of linear polarization in the p and s directions (the p-s axes and the wave vector constitute a right-handed orthogonal coordinate system that is conventionally used to describe polarization); $I_{+45^\circ}$ and $I_{-45^\circ}$ are the light intensities of linear polarization at +45° and −45° (with respect to the p-s axes); $I_R$ and $I_L$ are the light intensities of right-circular and left-circular polarization, respectively. Under the Stokes-Mueller formalism, the transformation of the Stokes vector of incident light after interaction with a sample can be described by a 4 × 4 Mueller matrix M (also called the sample Mueller matrix),
$$\mathbf{S}_{\mathrm{out}} = \mathbf{M}\,\mathbf{S}_{\mathrm{in}}, \tag{2}$$
where $\mathbf{S}_{\mathrm{in}}$ and $\mathbf{S}_{\mathrm{out}}$ denote the Stokes vectors of the incident and emerging light, respectively.
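As a small numerical illustration of the Stokes-Mueller formalism, the sketch below assembles a Stokes vector from six intensities, assuming the conventional ordering S = [I_p + I_s, I_p − I_s, I_+45° − I_−45°, I_R − I_L], and applies a Mueller matrix to it. The function name `stokes_from_intensities` and the ideal s-polarizer used as the sample are illustrative assumptions and do not come from the paper.

```python
import numpy as np

def stokes_from_intensities(i_p, i_s, i_p45, i_m45, i_r, i_l):
    """Assemble a Stokes vector from six analyzer intensities (conventional ordering assumed)."""
    return np.array([i_p + i_s, i_p - i_s, i_p45 - i_m45, i_r - i_l], dtype=float)

# Fully p-polarized light of unit intensity: half of it passes a +45 deg,
# -45 deg, right-circular, or left-circular analyzer.
s_in = stokes_from_intensities(1.0, 0.0, 0.5, 0.5, 0.5, 0.5)

# Textbook Mueller matrix of an ideal linear polarizer that transmits
# s-polarized light (illustrative sample only).
m_pol_s = 0.5 * np.array([[1.0, -1.0, 0.0, 0.0],
                          [-1.0, 1.0, 0.0, 0.0],
                          [0.0, 0.0, 0.0, 0.0],
                          [0.0, 0.0, 0.0, 0.0]])

# Stokes vector of the emerging light: S_out = M S_in.
s_out = m_pol_s @ s_in
```

The p-polarized input is fully blocked by the crossed polarizer, so every element of `s_out` is zero.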
In the differential Mueller matrix calculus, the differential Mueller matrix m relates the Mueller matrix M to its spatial derivative along the light propagation direction z as [22]
$$\frac{d\mathbf{M}}{dz} = \mathbf{m}\,\mathbf{M}. \tag{3}$$
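The solution of Eq. (3) can be obtained by taking the logarithm of M if m does not depend on z, that is [23,24],
$$\mathbf{L} = \ln \mathbf{M}, \tag{4}$$
which is also called the Mueller matrix differential decomposition. In Eq. (4), L is the accumulated differential Mueller matrix, L = ml, with l being the optical path length in the medium. The matrix L can be split into a G-antisymmetric term $\mathbf{L}_m$ and a G-symmetric term $\mathbf{L}_u$, $\mathbf{L} = \mathbf{L}_m + \mathbf{L}_u$, with
$$\mathbf{L}_m = \tfrac{1}{2}\left(\mathbf{L} - \mathbf{G}\mathbf{L}^{\mathrm{T}}\mathbf{G}\right), \qquad \mathbf{L}_u = \tfrac{1}{2}\left(\mathbf{L} + \mathbf{G}\mathbf{L}^{\mathrm{T}}\mathbf{G}\right),$$
where G = diag(1, −1, −1, −1). The superscript "T" denotes the matrix transpose. The term $\mathbf{L}_m$ contains the mean values of the elementary optical properties, while $\mathbf{L}_u$ contains the isotropic absorption as well as the uncertainties of the corresponding elementary optical properties resulting from depolarization. It is shown that, if M is a nondepolarizing Mueller matrix, we will have $\mathbf{L}_u = \mathbf{0}$ within the experimental error, provided that the isotropic absorption has been subtracted beforehand from the diagonal elements of L. The term $\mathbf{L}_m$ is given by [25]
[Eq. (5): 4 × 4 matrix form of $\mathbf{L}_m$ in terms of LB, LD, LB′, LD′, CB, and CD; not recoverable from the source]
where LB and LD refer to the linear birefringence and dichroism along the p-s axes of the reference frame, LB′ and LD′ refer to the linear birefringence and dichroism along the ±45° axes, and CB and CD refer to the circular birefringence and dichroism.
The accumulated differential Mueller matrix L can be readily calculated by Eq. (4) for a given experimental Mueller matrix $\mathbf{M}_{\mathrm{ex}}$; however, it should be pointed out that the associated L is not always physical [26]. To avoid an unphysical L, it is recommended to perform the sum decomposition of $\mathbf{M}_{\mathrm{ex}}$ first. According to the sum decomposition, any Mueller matrix M (nondepolarizing or depolarizing) can be represented as a weighted sum of four nondepolarizing Mueller matrices $\mathbf{M}_k$ (k = 1, 2, 3, 4) [27],
$$\mathbf{M} = \sum_{k=1}^{4} \lambda_k \mathbf{M}_k, \tag{6}$$
where $\lambda_k$ (k = 1, 2, 3, 4) are the eigenvalues of the following coherency matrix C,
[Eq. (7): coherency matrix C constructed from the elements $m_{ij}$ and the Pauli spin matrices $\sigma_i$; not recoverable from the source]
in which $\sigma_i$ are the Pauli spin matrices and $m_{ij}$ is the (i, j)th element of M.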
The (i, j)th element $m_{k,ij}$ of the kth nondepolarizing Mueller matrix $\mathbf{M}_k$ can be calculated from the eigenvectors of C,
[Eq. (8): expression for $m_{k,ij}$ in terms of $\mathbf{v}_k$, $\mathbf{v}_k^{\dagger}$, and a matrix trace; not recoverable from the source]
where $\mathbf{v}_k$ (k = 1, 2, 3, 4) are the corresponding eigenvectors of C, Tr(⋅) denotes the matrix trace, and the superscript "†" stands for the Hermitian conjugate. It is shown that for a physically realizable Mueller matrix M, its coherency matrix C should be positive semidefinite and therefore have non-negative eigenvalues $\lambda_k$. Should negative eigenvalues occur for an experimental Mueller matrix $\mathbf{M}_{\mathrm{ex}}$, we can set these negative eigenvalues to zero and then use the resulting weighted sum, denoted $\mathbf{M}'$, in place of $\mathbf{M}_{\mathrm{ex}}$. It is shown that $\mathbf{M}'$ is the best least-squares estimate of $\mathbf{M}_{\mathrm{ex}}$ in the space of physically realizable Mueller matrices [28]. We can then calculate the logarithm L of $\mathbf{M}'$ using Eq. (4).
It is also worth pointing out that the differential Mueller matrix m (L) has a direct physical interpretation only in transmission measurements. Although we can also calculate the logarithm L of a sample Mueller matrix obtained in reflection using Eq. (4), here L will not be a direct reading of the elementary optical properties of the sample as it is in transmission [21,29]. The calculation of L by Eq. (4) in a reflection mode, as we will see in the remainder of this paper, should be understood as a purely mathematical operation used to reveal the minimum number of independent parameters that are necessary to describe the change of polarization upon reflection. Moreover, we will show in Section 3 that the lack of a direct physical meaning for the elementary optical properties obtained in reflection does not prevent them from serving as indicators for overlay monitoring.
We denote by $\mathbf{M}(\delta, \varphi)$ the Mueller matrix of the grating with overlay displacement δ at azimuthal angle φ, and by $\mathbf{M}(\delta, \varphi + \pi)$ the Mueller matrix obtained after a rotation of the sample by π around the z-axis. According to the principle of electromagnetic reciprocity [30], when a light beam passes through the medium (or is reflected or scattered) in the reverse direction, the reciprocal Mueller matrix can be represented by
$$\mathbf{M}(\delta, \varphi + \pi) = \mathbf{O}\,\mathbf{M}^{\mathrm{T}}(\delta, \varphi)\,\mathbf{O}^{-1}, \tag{9}$$
where O = diag(1, 1, −1, 1) accounts for the change in the coordinate system due to the reversal of motion and $\mathbf{O}^{-1}$ is the inverse of O. Since the rotation of the sample by π around the z-axis is equivalent to a change of the direction of the overlay displacement, Eq. (9) is thus also equivalent to
$$\mathbf{M}(-\delta, \varphi) = \mathbf{O}\,\mathbf{M}^{\mathrm{T}}(\delta, \varphi)\,\mathbf{O}^{-1}. \tag{10}$$
According to Eq. (10), we can further derive that
$$\mathbf{L}_m(-\delta, \varphi) = \mathbf{O}\,\mathbf{L}_m^{\mathrm{T}}(\delta, \varphi)\,\mathbf{O}^{-1}, \tag{11}$$
where $\mathbf{L}_m(\delta, \varphi)$ is the G-antisymmetric term of the logarithm of $\mathbf{M}(\delta, \varphi)$. According to Eqs. (5) and (11), we have
$$L(-\delta, \varphi) = L(\delta, \varphi), \qquad L'(-\delta, \varphi) = -L'(\delta, \varphi), \qquad C(-\delta, \varphi) = C(\delta, \varphi), \tag{12}$$
where $L = LB - i\,LD$, $L' = LB' - i\,LD'$, and $C = CB - i\,CD$, which represent the linear anisotropy along the p-s and ±45° axes as well as the circular anisotropy, respectively.
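The processing chain described above (eigenvalue filtering of the coherency matrix, matrix logarithm, and the split into G-antisymmetric and G-symmetric parts) can be sketched numerically as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the coherency matrix is taken in the standard Cloude form C = (1/4) Σ m_ij (σ_i ⊗ σ_j*), the ordering of the Pauli matrices is a choice made here, and the logarithm is computed by a plain eigendecomposition (a library routine such as `scipy.linalg.logm` could be used instead). All function names are hypothetical.

```python
import numpy as np

# Pauli basis with sigma_0 = identity (ordering is an assumption of this sketch).
PAULI = [np.eye(2, dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex),
         np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex)]
# Basis matrices sigma_i (x) conj(sigma_j), indexed as BASIS[i][j].
BASIS = [[np.kron(si, sj.conj()) for sj in PAULI] for si in PAULI]
G = np.diag([1.0, -1.0, -1.0, -1.0])

def coherency(M):
    """Hermitian 4x4 coherency matrix of a Mueller matrix (assumed Cloude form)."""
    return 0.25 * sum(M[i, j] * BASIS[i][j] for i in range(4) for j in range(4))

def mueller_from_coherency(C):
    """Inverse map: m_ij = Tr[(sigma_i (x) conj(sigma_j)) C]."""
    return np.array([[np.trace(BASIS[i][j] @ C).real for j in range(4)]
                     for i in range(4)])

def cloude_filter(M):
    """Set negative coherency eigenvalues to zero and rebuild the Mueller matrix."""
    lam, vec = np.linalg.eigh(coherency(M))
    lam = np.clip(lam, 0.0, None)
    return mueller_from_coherency((vec * lam) @ vec.conj().T)

def matrix_log(M):
    """Principal logarithm via eigendecomposition; assumes M is diagonalizable
    and has no eigenvalues on the negative real axis."""
    w, V = np.linalg.eig(M)
    return (V @ np.diag(np.log(w.astype(complex))) @ np.linalg.inv(V)).real

def differential_decomposition(M):
    """Return (L_m, L_u): G-antisymmetric and G-symmetric parts of ln(M')."""
    L = matrix_log(cloude_filter(M))
    Lm = 0.5 * (L - G @ L.T @ G)
    Lu = 0.5 * (L + G @ L.T @ G)
    return Lm, Lu
```

For a nondepolarizing Mueller matrix generated as the exponential of a G-antisymmetric matrix, `differential_decomposition` should return that matrix as `Lm` and a vanishing `Lu`, which is a convenient self-check before applying the routine to simulated or measured data.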
Considering the electromagnetic reciprocity as well as the reflection symmetry of the double-patterned grating in Fig. 1 relative to the plane that is perpendicular to the grating lines, we can restrict the azimuthal angle φ to the range of 0 to 90°. In addition, we assume that the grating is composed of only reciprocal materials and that the investigated Mueller matrices are exclusively associated with the zeroth-order diffracted light. Based on Eqs. (9)-(12), we can derive the following properties that will be useful in overlay measurement.
When δ ≠ 0, the overlay displacement breaks the $C_{2z}$ and $\sigma_y$ symmetry, which thereby leads to L′ ≠ 0 at φ ≠ 0° [31]. Here, $C_{2z}$ and $\sigma_y$ are standard symbols in group theory used to denote symmetry operations and symmetry types [32]. Property 1 indicates that we can judge whether or not the grating sample has an overlay displacement from the absolute value of L′, and meanwhile we can distinguish the direction of the overlay displacement from the sign of L′. The proof of Property 2 is given in Appendix A. According to Eq. (10) as well as Properties 1 and 2, we can obtain the following relation for any element of the two 2 × 2 off-diagonal blocks of the Mueller matrix at φ = 90° when δ ≠ 0:
$$m_{ij}(-\delta) = -\,m_{ij}(\delta) \quad \text{at } \varphi = 90^\circ. \tag{13}$$
The relation in Eq. (13) was also the starting point for most of the reported literature dealing with overlay measurement based on the Mueller matrix formalism [10–12].
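The sign behaviour that underlies Property 1 can be checked numerically without fixing a sign convention for the elementary properties. The sketch below applies the transformation L_m → O L_m^T O^{-1} with O = diag(1, 1, −1, 1) to an arbitrary G-antisymmetric matrix and lists which elements change sign. The random test matrix is purely illustrative; the only assumption about layout is that, with 0-based indices, positions (0, 2) and (1, 3) and their transposes are the ones conventionally occupied by LD′ and LB′ [25].

```python
import numpy as np

O = np.diag([1.0, 1.0, -1.0, 1.0])

# Arbitrary G-antisymmetric matrix: symmetric first row/column,
# antisymmetric lower-right 3x3 block (random illustrative values).
rng = np.random.default_rng(0)
a, b, c, d, e, f = rng.normal(size=6)
Lm = np.array([[0, a, b, c],
               [a, 0, d, e],
               [b, -d, 0, f],
               [c, -e, -f, 0]])

# Transformation associated with reversing the overlay displacement.
Lm_reversed = O @ Lm.T @ np.linalg.inv(O)

# Elements whose sign is flipped by the transformation.
sign = np.sign(Lm_reversed * Lm)
flipped = [(int(i), int(j)) for i, j in zip(*np.where(sign < 0))]
```

Only the four positions associated with the ±45° linear anisotropy end up in `flipped`, which is consistent with LB′ and LD′ being odd in δ while the remaining four properties are even.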
However, we should note that Eq. (13) is only valid at φ = 90°. According to Property 1, we can adopt LB′ (or LD′) as the indicator for overlay monitoring. In actual measurements, the sample Mueller matrices are typically collected over a spectral range, so the LB′ extracted from the collected Mueller matrices also varies over that range. For overlay measurement, it is necessary to define a scalar indicator that has a linear relation with the overlay displacement within a small range. In this work, we define the scalar overlay indicator as
$$\gamma_{LB'} = \sum_{i=1}^{N} \omega_i\, LB'_i, \tag{14}$$
where N denotes the number of spectral points and $\omega_i$ is the weight associated with $LB'_i$ at the ith spectral point. Different $\omega_i$ lead to different overlay indicators. When $\omega_i \equiv 1$, Eq. (14) corresponds to the mean weighting approach that was commonly adopted for the Mueller matrix elements of the 2 × 2 off-diagonal blocks in the reported literature [10,11]. In this work, we propose another approach, which we term the principal component (PC) weighting approach, based on principal component analysis (PCA) [33]. In the PC weighting approach, the $\omega_i$ are taken as the components of the normalized eigenvector associated with the largest eigenvalue of the covariance matrix of $LB'_i$. The covariance matrix of $LB'_i$ can be estimated by calculating $LB'_i$ at different overlay displacements. In this case, Eq. (14) is essentially the 1st PC of $LB'_i$, which is expected to show a linear relation with the overlay displacement within a small range according to PCA. We will also show in Section 3 that the PC-based scalar indicator is more robust than the mean-based indicator in most cases.
With the scalar overlay indicator, we can design the DBO target to realize overlay measurement. Figure 2 illustrates the layout of the DBO target, where d (or −d) is a known value that represents the designed shift and ε is the actual overlay error induced in the manufacturing processes. According to Fig. 2, the total overlay displacement is δ = d + ε (or −d + ε). Based on the assumed linear relation between the scalar indicator $\gamma_{LB'}$ and the overlay displacement δ, we can obtain the actual overlay error ε by
$$\varepsilon = d\,\frac{\gamma_{LB'}(d+\varepsilon) + \gamma_{LB'}(-d+\varepsilon)}{\gamma_{LB'}(d+\varepsilon) - \gamma_{LB'}(-d+\varepsilon)}. \tag{15}$$
The thickness of the BARC (bottom anti-reflective coating) layer is t = 30 nm. The total overlay displacement is denoted as δ, which is the sum of the designed shift d and the actual overlay error ε, as illustrated in Fig. 2. In the simulation, the Mueller matrices are calculated using rigorous coupled-wave analysis [34,35] in the spectral range of 250 to 800 nm with an increment of 5 nm and with the incidence angle fixed at θ = 65°. The Mueller matrices at azimuthal angles varied from 0 to 90° will be examined.
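The two-cell estimate of the overlay error follows directly from the linear assumption. If γ = kδ, the two cells give γ₊ = k(d + ε) and γ₋ = k(−d + ε), so that γ₊ + γ₋ = 2kε and γ₊ − γ₋ = 2kd, and hence ε = d(γ₊ + γ₋)/(γ₊ − γ₋) independently of the unknown slope k. A minimal sketch, with a hypothetical function name and made-up numbers:

```python
def overlay_error(gamma_plus, gamma_minus, d):
    """Overlay error from the indicator values of the +d and -d cells,
    assuming the indicator is proportional to the overlay displacement."""
    return d * (gamma_plus + gamma_minus) / (gamma_plus - gamma_minus)

# Made-up example: slope k of the indicator, designed shift d, true error eps (nm).
k, d, eps = 0.8, 20.0, 3.0
estimate = overlay_error(k * (d + eps), k * (-d + eps), d)
```

Because the slope cancels, any constant rescaling of the weights in the scalar indicator leaves the estimate unchanged.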

Sensitivity analysis
We first performed a simulation for the double-patterned grating given in Fig. 3 to give an intuitive understanding of Properties 1 and 2. Figure 4 presents the spectra of the six elementary optical properties extracted from the simulated Mueller matrices at azimuthal angles varied from 0 to 90° with an increment of 30°. We can observe from Fig. 4 that when φ = 0°, LB′ = LD′ = CB = CD = 0, and when φ ≠ 0°, only LB′ and LD′ exhibit sensitivity to the direction of the overlay displacement, while the other optical properties are at most sensitive to the absolute value of the overlay displacement. In addition, when φ = 90°, CB = CD = 0 even if the overlay displacement δ ≠ 0. However, when φ ≠ 90° and φ ≠ 0°, CB and CD are no longer equal to zero. The fact that CB and CD are nonzero for 0 < φ < 90° at oblique incidence can be used to assist the alignment of the azimuthal angle at φ = 90°. The necessity of accurate alignment of the azimuthal angle at φ = 90° will be discussed in Section 3.4. The specific alignment process is to adjust the sample stage until the associated CB and CD are zero within the experimental error.
We also checked the spectral response of the Mueller matrix elements of the 2 × 2 off-diagonal blocks when the grating sample has an overlay displacement. We took the elements $m_{13}$ and $m_{14}$ as an example. Considering that the off-diagonal block elements are not equal to zero when φ ≠ 0° and φ ≠ 90°, we calculated the difference spectra $\Delta m_{13}$ and $\Delta m_{14}$, defined by $\Delta m_{ij} = m_{ij}(\delta \neq 0) - m_{ij}(\delta = 0)$, to examine whether they have a property similar to that of LB′ and LD′. Figure 5 presents the difference spectra of $m_{13}$ and $m_{14}$ calculated at different azimuthal angles. As shown in Fig. 5, when φ = 0°, $\Delta m_{13} = \Delta m_{14} = 0$, since in this case the off-diagonal block elements are always equal to zero regardless of the overlay displacement. We also observe from Fig. 5 that the relation $\Delta m_{ij}(-\delta) = -\Delta m_{ij}(\delta)$ for the off-diagonal block elements is only valid at φ = 90°, which can be readily interpreted according to Eq. (13). In this respect, LB′ (or LD′) is a much better choice as the overlay indicator, since the relation L′(−δ) = −L′(δ) is valid at any azimuthal configuration.
Since LB′ is always equal to zero when δ = 0, we can compare the norm of the LB′ spectra, ‖LB′‖, for a given overlay displacement at different azimuthal angles to choose a more sensitive azimuthal configuration. Figure 6 presents the variation of ‖LB′‖ of the investigated grating sample at azimuthal angles varied from 0 to 90° with an increment of 15°. According to Fig. 6, ‖LB′‖ shows larger values at φ = 60° and φ = 90°, which indicates that φ = 60° and φ = 90° are the better azimuthal configurations. Most of the overlay measurements based on the Mueller matrix formalism in the reported literature were performed at φ = 90° [10–12]. However, as can be observed from Fig. 6, ‖LB′‖ at φ = 60° is even larger than that at φ = 90°, which suggests that φ = 90° might not always be the best choice of azimuthal configuration in overlay measurement.
Considering that there is no large difference between the values of ‖LB′‖ at φ = 60° and φ = 90°, both azimuthal configurations are selected in the following analysis in order to make a comparison.
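The selection of a sensitive azimuthal configuration from the norm of the LB′ spectra can be sketched as below. The text only refers to "the norm", so the Euclidean norm is an assumption here, and the spectra in the example are made-up placeholder numbers rather than simulated values; the function names are hypothetical.

```python
import math

def spectrum_norm(values):
    """Euclidean norm of a spectrum sampled at N wavelengths (norm type assumed)."""
    return math.sqrt(sum(v * v for v in values))

def rank_azimuths(spectra_by_azimuth):
    """Return the azimuthal angles sorted from most to least sensitive."""
    return sorted(spectra_by_azimuth,
                  key=lambda phi: spectrum_norm(spectra_by_azimuth[phi]),
                  reverse=True)

# Placeholder LB' spectra (three wavelengths per azimuth, made-up numbers).
example = {
    0: [0.0, 0.0, 0.0],
    30: [0.01, -0.02, 0.01],
    60: [0.05, -0.04, 0.03],
    90: [0.04, -0.04, 0.03],
}
ranking = rank_azimuths(example)
```

With real data, the dictionary would hold one LB′ spectrum per azimuthal angle, all computed for the same nonzero overlay displacement.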

Linearity verification
Since the overlay measurement by Eq. (15) is based on the assumption that the scalar indicator $\gamma_{LB'}$ has a linear relation with the overlay displacement δ, it is necessary to check the linearity assumption prior to the measurement. If the linearity assumption is verified, we also need to check the range of δ over which it is valid. To this end, we varied δ in the range from -50 to 50 nm. We first calculated $\gamma_{LB'}$ using the PC and mean weighting approaches at φ = 60° and φ = 90°, respectively. To make a comparison, we also adopted a Mueller matrix element of the 2 × 2 off-diagonal blocks (we took $m_{13}$ as an example) as the overlay indicator and calculated the corresponding PC-based and mean-based indicator values using an equation similar to Eq. (14), i.e.,
$$\gamma_{m_{13}} = \sum_{i=1}^{N} \omega_i\, m_{13,i},$$
where $\omega_i$ is the weight associated with $m_{13,i}$ at the ith spectral point. Note that the mean of the off-diagonal block element of the Mueller matrix over a spectral range was also the commonly adopted scalar overlay indicator in the reported literature [10,11]. We performed linear fitting to the indicator values at different overlay displacements. To achieve a good fitting result, we excluded the ranges of δ that showed poor linearity (see Figs. 7(d) and 8(d)). A fit with a large coefficient of determination R² and small fitting errors over a large range of δ indicates a good linear relation between the overlay indicator and the overlay displacement.
To apply the PC weighting approach, we need to calculate $LB'_i$ at different overlay displacements. For the results given in Fig. 7, we can estimate the covariance matrix of $LB'_i$ by using the $LB'_i$ (or $m_{13}$ for Fig. 8) calculated at all the overlay displacements varied from -50 to 50 nm. However, when we try to measure the overlay error ε based on the DBO target shown in Fig. 2, we will only have two overlay displacements (i.e., δ = d + ε and −d + ε). Therefore, it is necessary to check whether or not we can still obtain the covariance matrix of $LB'_i$ in this case. It is also necessary to compare the weights $\omega_i$ estimated by using the $LB'_i$ calculated at all the overlay displacements from -50 to 50 nm with those estimated by using the $LB'_i$ calculated at two overlay displacements.
Fig. 9 (caption, beginning lost in the source): (a) … estimated by using all the overlay displacements (varied from -50 to 50 nm); the red solid line (with a legend title of "δ = (15, -40)") corresponds to the $\omega_i$ estimated by using the overlay displacements of 15 and -40 nm; the blue dash-dot line (with a legend title of "δ = (37, -12)") corresponds to the $\omega_i$ estimated by using the overlay displacements of 37 and -12 nm. (b) The red solid line corresponds to the difference between the weights $\omega_i$ estimated by using all the overlay displacements and those estimated by using the overlay displacements of 15 and -40 nm; the blue dash-dot line corresponds to the difference between the weights $\omega_i$ estimated by using all the overlay displacements and those estimated by using the overlay displacements of 37 and -12 nm.
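The linearity check described in this section, i.e., a linear fit of the indicator against δ judged by the coefficient of determination R², can be sketched with an ordinary least-squares fit. The indicator values below come from a hypothetical saturating model (a sine of δ), chosen only to mimic an indicator that is linear near δ = 0 and bends at larger displacements; they are not simulation results from the paper.

```python
import math

def linear_fit_r2(x, y):
    """Ordinary least-squares line fit; returns (slope, intercept, R^2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot

# Hypothetical indicator that saturates at large |delta| (delta in nm).
deltas = list(range(-50, 51, 5))
gamma = [math.sin(dl / 40.0) for dl in deltas]

# Fit over the full range and over a restricted range |delta| <= 20 nm.
_, _, r2_full = linear_fit_r2(deltas, gamma)
narrow = [(dl, g) for dl, g in zip(deltas, gamma) if abs(dl) <= 20]
_, _, r2_narrow = linear_fit_r2([p[0] for p in narrow], [p[1] for p in narrow])
```

Comparing `r2_full` with `r2_narrow` mirrors the practice of excluding ranges of δ with poor linearity before fitting.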
We took the estimated weights $\omega_i$ associated with $LB'_i$ in the PC-based scalar overlay indicator at φ = 60° (corresponding to the results given in Fig. 7(a)) as an example. In the analysis, we randomly chose two pairs of overlay displacements from the range of -50 to 50 nm, namely δ = (15, -40) nm and (37, -12) nm. Then, we estimated the corresponding weights using each of these two pairs and compared them with the weights estimated by using all the overlay displacements varied from -50 to 50 nm. Figure 9 presents the comparison results. According to Fig. 9, there is only a minor difference between the weights estimated by using two overlay displacements and those estimated by using all the overlay displacements. This suggests that two overlay displacements are enough to estimate the weights $\omega_i$ used in the PC-based overlay indicator.
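The estimation of PC weights from a set of LB′ spectra can be sketched as follows. The sketch assumes that the weights are the unit-norm first principal direction of the spectra, obtained here from a singular value decomposition of the mean-centered data, which is equivalent to taking the leading eigenvector of the sample covariance matrix. The spectra are generated from a hypothetical, exactly linear model LB′_i(δ) = s_i δ, for which the weights estimated from two displacements coincide with those estimated from the full range; the function name `pc_weights` and the sensitivity profile `s` are illustrative assumptions.

```python
import numpy as np

def pc_weights(spectra):
    """Unit-norm first principal direction of spectra with shape
    (n_displacements, n_wavelengths)."""
    X = np.asarray(spectra, dtype=float)
    Xc = X - X.mean(axis=0)
    # Right singular vectors of the centered data are the eigenvectors
    # of the sample covariance matrix.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    w = vt[0]
    # Resolve the sign ambiguity: make the largest-magnitude component positive.
    return w if w[np.argmax(np.abs(w))] > 0 else -w

# Hypothetical linear model on the 250-800 nm grid with 5 nm steps (N = 111).
wavelengths = np.arange(250, 801, 5)
s = np.sin(wavelengths / 60.0)            # made-up spectral sensitivity

def lb_prime(delta):
    return s * delta

all_deltas = np.arange(-50, 51, 5)
w_all = pc_weights([lb_prime(dl) for dl in all_deltas])
w_two = pc_weights([lb_prime(15.0), lb_prime(-40.0)])

# Scalar indicator as the weighted sum over wavelengths for one displacement.
gamma_at_20 = float(w_two @ lb_prime(20.0))
```

With weakly nonlinear spectra the two weight vectors would differ slightly rather than coincide, which is the situation compared in Fig. 9.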

Overlay measurement
Based on the above analysis, in this section we measure the overlay error using the DBO target given in Fig. 2. In the DBO target, d is a designed shift. Considering that the value of d might have an influence on the final measurement results, we first investigated the measurement errors at different shift values using the proposed overlay indicators. In the analysis, we fixed the overlay error at ε = 10 nm while varying the designed shift d from 2 to 50 nm with an increment of 2 nm. We then estimated the overlay error at each shift value by Eq. (15). The measurement error is defined as the absolute difference between the estimated overlay error and the given overlay error (ε = 10 nm). When estimating the overlay errors, we used both $\gamma_{LB'}$ and $\gamma_{m_{13}}$ to make a comparison, and both were calculated using the PC and mean weighting approaches, respectively. In addition, we estimated the overlay errors using $\gamma_{LB'}$ at φ = 60° and φ = 90°, respectively.
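The influence of the designed shift on the measurement error can be illustrated with a toy model. The sketch below sweeps d from 2 to 50 nm at a fixed ε = 10 nm and evaluates the two-cell estimate ε = d(γ₊ + γ₋)/(γ₊ − γ₋) for a hypothetical indicator that contains a weak cubic term in δ, which stands in for a limited linear range. The model and its coefficients are assumptions made for illustration only and are not derived from the simulated grating.

```python
def overlay_error(gamma_plus, gamma_minus, d):
    """Two-cell overlay estimate under the linear assumption gamma = k * delta."""
    return d * (gamma_plus + gamma_minus) / (gamma_plus - gamma_minus)

def indicator(delta, k=1.0, c=-2.0e-5):
    """Hypothetical indicator: linear term plus a weak cubic term that
    limits the linear range (coefficients are made up)."""
    return k * delta + c * delta ** 3

eps_true = 10.0  # nm
errors = {}
for d in range(2, 51, 2):
    estimate = overlay_error(indicator(d + eps_true), indicator(-d + eps_true), d)
    errors[d] = abs(estimate - eps_true)
```

In this toy model the error grows with d once d exceeds ε, because both cells are pushed further outside the linear range; this is the kind of behaviour that makes the choice of designed shift matter.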
Considering the bad linear performance for γ′ to estimate overlay errors, the azimuthal configuration of φ = 60° exhibits higher measurement accuracy than that of φ = 90° for both the PC- and mean-based overlay indicators, especially for large shift values. This might be attributed to the relatively small linear range of the overlay displacements at φ = 90°, as predicted in Fig. 7.
We can also observe from Fig. 10(a) that, when using γ_LB′ to estimate overlay errors at φ = 60°, the PC and mean weighting approaches show comparable accuracy at small shift values (d < 20 nm), while for large shift values (d > 20 nm) the mean weighting approach exhibits higher accuracy than the PC weighting approach, which we did not expect. In comparison, when using γ_LB′ to estimate overlay errors at φ = 90° (see Fig. 10(b)), the mean weighting approach exhibits considerably worse accuracy than the PC weighting approach for large shift values (d > 30 nm). This might be because of the relatively poor linear performance of the mean weighting approach when the overlay displacement δ is larger than 40 nm, as predicted in Figs. 7(b) and 7(d). As shown in Fig. 10(c), when using γ_m13 to estimate overlay errors at φ = 90°, the PC weighting approach shows better accuracy than the mean weighting approach over the whole range of designed shifts. The poor performance of the mean weighting approach even suggests that it might not be suitable for overlay measurement.

We then tried to measure the overlay error based on the results shown in Fig. 10. In the simulation, we fixed the designed shift d at 20 nm. We varied the overlay error from 2 to 30 nm with an increment of 2 nm and compared the overlay errors measured by Eq. (15) with the input (given) overlay errors. Random noise, generated using the signal-dependent noise model given in our previous work [36,37], was also added to the simulated Mueller matrices. The generated noise depended on both the sample under test and the measurement configuration (the combination of the wavelength and the incidence and azimuthal angles). We then applied the sum decomposition to the simulated Mueller matrices by Eqs. (6)-(8) and calculated the logarithm of the estimated physically realizable Mueller matrices by Eq. (4). We also used γ_LB′ and γ_m13

approaches can lead to good measurement results, as predicted in Fig. 10(a). The PC and mean weighting approaches can only achieve good measurement results for overlay errors less than about 24 nm when using γ_LB′ to measure overlay errors at φ = 90°, as illustrated in Figs.

γ_m13 to measure overlay errors at φ = 90°, the PC weighting approach is still applicable to achieve good measurement results. Recalling the results presented in Fig. 8, we know that the PC weighting approach exhibits better linear performance than the mean weighting approach even at the azimuthal configuration of φ = 60°. We therefore want to examine whether we could achieve good measurement results using

The black dash-dot line corresponds to the input overlay error.
Considering that the off-diagonal block elements of the Mueller matrix are nonzero at δ = 0 even when the azimuthal angle φ has only a minor offset from 0° or 90°, we further examined whether good measurement results can be achieved using γ_m13 based on the PC weighting approach at azimuthal configurations that deviate slightly from 90°. We again fixed the designed shift d at 10 nm and assumed an input overlay error of ε = 5 nm. We performed the test at azimuthal angles varied from 89° to 90° with an increment of 0.1°. We first used the original indicator γ_m13 to obtain the measured overlay errors at different azimuthal angles. As expected, the results shown in Fig. 12 indicate that even a small offset of 0.1° from 90° in the azimuthal angle leads to a large measurement error of nearly 1 nm (a relative measurement error of nearly 20%). We then replaced the original indicator with a corrected one, γ′_m13. As can be observed from Fig. 12, the corrected indicator leads to much better measurement results than the original one. We also used the corrected indicator γ′_m13 to obtain the measured overlay error at φ = 60°. The result was found to be ε′ = 4.641 nm, which is also much better than the previous result of −24.560 nm obtained by using the original indicator γ_m13. However, we should note that the corrected indicator involves the off-diagonal block element values of the Mueller matrix at δ = 0, which cannot be obtained from the DBO target given in Fig. 2. This suggests that, if we want to use the off-diagonal block elements of the Mueller matrix as the overlay indicator, we can only perform overlay measurement at φ = 90°. Moreover, the azimuthal angle should be accurately aligned; otherwise, a large measurement error will be generated. Accurate alignment of the azimuthal angle at φ = 90° might be achieved according to Property 2. Alternatively, we can adopt LB′ as the overlay indicator, since it proves more robust than the off-diagonal block elements of the Mueller matrix even when φ has a large offset from 90°, as indicated in Figs. 10(a), 11(a), and 11(d).
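The effect of a nonzero indicator value at δ = 0 can be made explicit with a small sketch. It assumes the same two-cell estimator as a linear indicator would imply, ε = d(γ₊ + γ₋)/(γ₊ − γ₋), which is a stand-in because Eq. (15) is not reproduced in this section. The slope `k` and offset `b` are hypothetical numbers, not values from the simulations. For γ(δ) = kδ + b, the estimator returns ε + b/k, so the bias is set by the offset-to-slope ratio and does not depend on d. Subtracting the value at δ = 0 removes it.

```python
def estimate_overlay(gamma_plus, gamma_minus, d):
    """Two-cell overlay estimate, assuming an indicator proportional to delta."""
    return d * (gamma_plus + gamma_minus) / (gamma_plus - gamma_minus)

# Hypothetical linear response with a nonzero value at delta = 0.
k = 0.02    # slope of the indicator per nm of displacement (assumed)
b = 0.1     # indicator value at delta = 0 (assumed)

def indicator(delta):
    return k * delta + b

d, epsilon = 10.0, 5.0    # designed shift and input overlay error in nm

# Original indicator: the offset b biases the estimate by b / k.
raw = estimate_overlay(indicator(d + epsilon), indicator(-d + epsilon), d)

# Corrected indicator: subtract the value at delta = 0 before estimating.
zero_level = indicator(0.0)
corrected = estimate_overlay(indicator(d + epsilon) - zero_level,
                             indicator(-d + epsilon) - zero_level, d)

print("original indicator :", raw)
print("corrected indicator:", corrected)
```

The correction needs the indicator value at δ = 0, which is exactly the quantity that the two-cell target with shifts ±d does not provide.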

Conclusions
In this work, the differential Mueller matrix calculus was first introduced to investigate the Mueller matrices of a double-patterned grating structure, which helps to reveal the six elementary optical properties, i.e., LB, LD, LB′, LD′, CB, and CD, hidden in the Mueller matrices. We found that, among the six elementary optical properties, only LB′ and LD′ are sensitive to both the direction and amplitude of the overlay displacement δ. We therefore adopted the weighted sum of LB′ (LD′ could also be adopted), namely γ_LB′, as the scalar overlay indicator. For comparison, we also adopted the weighted sum of the Mueller matrix elements of the 2 × 2 off-diagonal blocks (taking m_13 as an example), namely γ_m13, as the scalar overlay indicator. Two weighting approaches were used to calculate the indicator values: the commonly used mean weighting approach and our proposed PC weighting approach based on PCA. Of these, the indicator γ_m13 using the mean weighting approach is the scalar indicator commonly adopted for overlay measurement in the reported literature. The performances of the overlay indicators were examined at azimuthal angles φ varied from 0 to 90°, and we found that:

(1) The overlay indicator γ_m13 can only be used at φ = 90°. Moreover, even at φ = 90°, only the PC weighting approach can achieve good measurement results, while the mean weighting approach commonly adopted in the reported literature leads to inaccurate results.

(2) When using γ_m13 to measure the overlay error based on the PC weighting approach at φ = 90°, the azimuthal angle must be accurately aligned, since a minor offset of 0.1° from φ = 90° leads to a large relative measurement error of nearly 20%.

(3) The overlay indicator γ_m13 can only be used at φ = 90° because the Mueller matrix elements of the 2 × 2 off-diagonal blocks are no longer equal to zero at a general conical mounting (0° < φ < 90°) even when δ = 0.

(4) In contrast, the overlay indicator γ_LB′ can achieve good measurement results using both the PC and mean weighting approaches even at a general conical mounting, since when δ = 0, we have LB′ = 0 for any azimuthal configuration, while when δ ≠ 0, we always have LB′ ≠ 0 for any conical mounting.

The above findings lead to the conclusion that LB′ (or LD′) performs better as the overlay indicator than the 2 × 2 off-diagonal block elements of the Mueller matrix. We therefore suggest replacing the off-diagonal elements of the Mueller matrix with LB′ (or LD′) for robust overlay measurement. It is also worth pointing out that the above conclusions are not limited to double-patterned structures; they are also valid for the overlay between features at different layers. In addition, although the results given in Section 3.4 were obtained by simulation, the conclusions derived in this paper could be readily applied to experimental data for actually designed DBO targets. For experimental Mueller matrices, it is recommended to perform the sum decomposition first and then the differential decomposition, to avoid possible unphysical differential Mueller matrices. Finally, Eq. (15) can be used to obtain the actual overlay error.
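The differential decomposition step mentioned above can be sketched as follows. This is an illustration under stated assumptions rather than the implementation used in the paper: Eqs. (4)-(8) are not reproduced in this section, so the sketch takes L = ln M, splits it into its G-antisymmetric part L_m and G-symmetric part L_u with the Minkowski metric G = diag(1, −1, −1, −1), and reads the six elementary properties from L_m using one common sign convention, which may differ from the paper's. The function names and numerical values are hypothetical. The eigendecomposition-based logarithm is a simple stand-in for a library matrix logarithm, and the sum decomposition that should precede it for experimental data is not shown.

```python
import numpy as np

G = np.diag([1.0, -1.0, -1.0, -1.0])   # Minkowski metric

def matrix_log(m):
    """Principal matrix logarithm via eigendecomposition. Assumes m is
    diagonalizable and has no eigenvalue on the negative real axis."""
    vals, vecs = np.linalg.eig(m)
    log_m = vecs @ np.diag(np.log(vals.astype(complex))) @ np.linalg.inv(vecs)
    return log_m.real

def differential_decomposition(m):
    """Split L = ln(M) into its G-antisymmetric part Lm and G-symmetric part Lu."""
    L = matrix_log(m)
    Lm = 0.5 * (L - G @ L.T @ G)
    Lu = 0.5 * (L + G @ L.T @ G)
    return Lm, Lu

def elementary_properties(Lm):
    """Read the six properties from Lm (assumed sign convention, see build_Lm)."""
    return {"LD": Lm[0, 1], "LD'": Lm[0, 2], "CD": Lm[0, 3],
            "CB": Lm[1, 2], "LB'": Lm[3, 1], "LB": Lm[2, 3]}

def build_Lm(p):
    """Assumed layout of Lm in terms of the six elementary properties."""
    return np.array([
        [0.0,       p["LD"],   p["LD'"],  p["CD"]],
        [p["LD"],   0.0,       p["CB"],  -p["LB'"]],
        [p["LD'"], -p["CB"],   0.0,       p["LB"]],
        [p["CD"],   p["LB'"], -p["LB"],   0.0],
    ])

def matrix_exp(a, terms=30):
    """Truncated Taylor series of exp(a); adequate for the small toy values used here."""
    result = np.eye(a.shape[0])
    term = np.eye(a.shape[0])
    for n in range(1, terms):
        term = term @ a / n
        result = result + term
    return result

# Toy check with hypothetical property values and a small depolarizing part.
props_true = {"LD": 0.10, "LD'": 0.02, "CD": 0.01,
              "LB": 0.30, "LB'": 0.05, "CB": -0.02}
Lu_true = np.diag([0.0, -0.01, -0.02, -0.03])
M = matrix_exp(build_Lm(props_true) + Lu_true)

Lm, Lu = differential_decomposition(M)
props = elementary_properties(Lm)
print({name: round(value, 6) for name, value in props.items()})
```

In this convention LB′ and LD′ are read from single entries of L_m, so a per-wavelength spectrum of either follows from repeating the decomposition for each wavelength.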

Appendix A. Proof of Property 2
Besides the Stokes-Mueller formalism, we can also use the Jones matrix formalism to describe the change in the polarization state of incident light after interaction with a sample. Under the Jones matrix formalism, the Jones matrix J associated with the zeroth-order diffracted light of a grating sample connects the incident Jones vector with the reflected one, where E_s,p refer to the electric field components that are perpendicular and parallel to the plane of incidence, respectively. Moreover, it is shown that the cross-polarization reflection coefficients r_ps and r_sp of the Jones matrix J can be written as in Eq. (17) [21], provided that the Jones matrix J is normalized by a complex phase shift term induced by isotropic phase retardation and amplitude absorption.
When the azimuthal angle φ = 0°, all diffraction orders lie within the plane of incidence (this configuration is usually called planar diffraction) and the cross-polarization coefficients r_ps and r_sp are always zero for any periodic nanostructure. According to Eq. (17), we have C = L′ = 0. When φ = 90°, all diffraction orders lie on the surface of a cone with revolution symmetry around the direction of the grating lines (the x-direction shown in Fig. 1). According to Theorem 3 in [31], we have r_ps = r_sp at this special symmetrical conical mounting. Moreover, when the overlay displacement δ = 0, we have r_ps = r_sp = 0, while when δ ≠ 0, we have r_ps = r_sp ≠ 0. Therefore, according to Eq. (17), we always have C = 0 whether δ is zero or not. This is the conclusion given by Property 2. It is also worth pointing out that, at oblique incidence, C is no longer zero when φ is not equal to 0° or 90°, as illustrated in Fig. 4.
We should note that a nonzero C does not necessarily indicate that the grating sample exhibits chirality or bianisotropy; it might instead be induced by the linear anisotropy along the p-s and ±45° axes. Nevertheless, the nonzero C at oblique incidence can be adopted as a

) terms of L, respectively. Here, G is the Minkowski metric and

Fig. 1. Representation of polarized light incident upon a double-patterned grating structure with a pitch of Λ and an overlay displacement of δ, where E_i and E_r denote the incident and reflected electric fields, and E_s,p refer to the electric field components that are perpendicular and parallel to the plane of incidence, respectively.

Figure 1 depicts polarized light incident upon a double-patterned grating structure with a pitch of Λ and an overlay displacement of δ. We assume that δ = 0 corresponds to the

Fig. 2. Layout of the DBO target, which consists of two cells per direction used to measure the overlay errors ε_x and ε_y along the X and Y directions, respectively. In the DBO target, d is a designed shift, and Λ denotes the pitch of the double-patterned structure. Without loss of generality, ε_x and ε_y are denoted as ε for simplicity.


where the weights ω_i are the elements of the eigenvector corresponding to the largest eigenvalue of the covariance matrix of LB′_i

Figure 3 depicts the schematic of the double-patterned grating investigated in the simulation, which has a pitch of Λ = 660 nm. The profile of Resist1 is characterized by top CD w_1 = 120 nm, line height h_1 = 50 nm, and sidewall angle α_1 = 87°, and the profile of Resist2 is characterized by top CD w_2 = 100 nm, line height h_2 = 140 nm, and sidewall angle α_2 = 87°. The thickness of the BARC (bottom anti-reflective coating) layer is t = 30 nm. The total overlay displacement is denoted as δ, which is the sum of the designed shift d and the actual overlay error ε, as illustrated in Fig. 2. In the simulation, the Mueller matrices are calculated using rigorous coupled-wave analysis [34,35] in the spectral range of 250 to 800 nm with an increment of 5 nm, with the incidence angle fixed at θ = 65°. The Mueller matrices at azimuthal angles varied from 0 to 90° will be examined.

Fig. 4. The spectra of the elementary optical properties LB, LD, LB′, LD′, CB, and CD extracted from the simulated Mueller matrices. The green solid curves with open circles and the red solid curves correspond to the double-patterned grating with an overlay displacement of δ = -10 nm and of δ = 10 nm, respectively. The blue dashed-dotted curves correspond to the grating with δ = 0. The spectra of LB′ and LD′ obtained at φ = 30° are multiplied by a factor of 5 for clarity.

Fig. 5. The difference spectra of the Mueller matrix elements m_13 and m_14 between those calculated for the investigated double-patterned grating with overlay displacements of δ = ±10 nm and of δ = 0 at different azimuthal angles. The green solid curves with open circles and the red solid curves correspond to the grating with an overlay displacement of δ = -10 nm and of δ = 10 nm, respectively. The difference spectra obtained at φ = 30° are multiplied by a factor of 2 for clarity.

Fig. 7. Linearity verification results for γ_LB′. Each subfigure contains two curves, of which the top one shows the linear fitting result while the bottom one presents the corresponding fitting error. (a) and (b) are the results obtained using the PC weighting approach at φ = 60° and φ = 90°, respectively. (c) and (d) are the results obtained using the mean weighting approach at φ = 60° and φ = 90°, respectively. The R² inserted in each subfigure represents the coefficient of determination of the linear fit.
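The linearity check summarized by R² can be sketched as below. The indicator response is a hypothetical toy model with a small cubic term (the coefficients are made up), used only to show how the coefficient of determination of a straight-line fit changes when the displacement range is narrowed.

```python
import numpy as np

def linear_fit_r2(delta, gamma):
    """Least-squares straight-line fit; returns slope, intercept, and R^2."""
    delta = np.asarray(delta, dtype=float)
    gamma = np.asarray(gamma, dtype=float)
    slope, intercept = np.polyfit(delta, gamma, 1)
    residual = gamma - (slope * delta + intercept)
    ss_res = np.sum(residual**2)                  # residual sum of squares
    ss_tot = np.sum((gamma - gamma.mean())**2)    # total sum of squares
    return slope, intercept, 1.0 - ss_res / ss_tot

# Hypothetical indicator response that saturates slightly at large |delta|.
delta = np.arange(-50.0, 51.0, 2.0)
gamma = 0.02 * delta - 2e-6 * delta**3

_, _, r2_full = linear_fit_r2(delta, gamma)
inner = np.abs(delta) <= 40
_, _, r2_inner = linear_fit_r2(delta[inner], gamma[inner])

print("R^2 over -50..50 nm:", r2_full)
print("R^2 over -40..40 nm:", r2_inner)
```

The intercept returned by the fit is also informative, because a nonzero intercept means that the indicator does not vanish at δ = 0.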

Figure 7 presents the results for γ_LB′. According to Fig. 7, both the PC- and mean-based scalar indicators show a good linear relation with respect to δ varied in the range from -50 to 50 nm at φ = 60°. In comparison, at φ = 90°, the PC- and mean-based scalar indicators only show a good linear relation with respect to δ within the range from -40 to 40

Fig. 8. Linearity verification results for γ_m13. Each subfigure contains two curves, of which the top one shows the linear fitting result while the bottom one presents the corresponding fitting error. (a) and (b) are the results obtained using the PC weighting approach at φ = 60° and φ = 90°, respectively. (c) and (d) are the results obtained using the mean weighting approach at φ = 60° and φ = 90°, respectively.

In addition, in the PC-based scalar overlay indicator, we need to obtain the weights ω_i in advance in order to calculate the indicator value. The calculation of the weights ω_i in the PC-based indicator involves the estimation of the covariance matrix of LB′_i (or m_13 for γ_m13).

Fig. 9. (a) The weights ω_i associated with γ_LB′ in the PC-based overlay indicator estimated at φ = 60°. The discrete black circles (with a legend title of "all data") correspond to the ω_i

Figure 10 presents the measurement errors at different shift values estimated by γ_LB′ and γ_m13.

7(b) and 7(d). As shown in Fig. 10(c), when using γ_m13

to measure overlay errors at φ = 60° and φ = 90° to make a comparison, and both γ_LB′ and γ_m13 were calculated using the PC and mean weighting approaches, respectively.

Figure 11 presents the measurement results obtained by γ_LB′ and γ_m13 at φ = 60° and φ = 90°, respectively. As can be observed from Fig. 11, except for the case of γ_m13 using the mean weighting approach at φ = 90°, which was predicted in Fig. 10(c), the overlay indicators achieve good measurement results in most cases. As shown in Figs. 11(a) and 11(d), when using γ_LB′ to measure overlay errors at φ = 60°, both the PC and mean weighting

11(b) and 11(e). The result is consistent with that shown in Fig. 10(b), which is attributed to the relatively small linear range of the overlay displacements, as predicted in Figs. 7(c) and 7(d). If we want to measure overlay errors using γ_m13 at φ = 90°, Figs. 11(c) and 11(f) suggest that we can only use the PC weighting approach. In addition, the linearly fitted equation inserted in each subfigure of Fig. 11 also reveals the corresponding measurement error, since in an ideal situation the linearly fitted equation should have a slope of 1 and an intercept of 0. The measurement error shown in Fig. 11 might be induced by the random error as well as by the not strictly linear relation between the scalar indicator and the overlay displacement, as revealed by the fitting errors shown in Figs. 7 and 8. Figures 11(c) and 11(f) indicate that, although the mean weighting approach does not work when using γ_m13

γ_m13 based on the PC weighting approach at φ = 60° or at other conical mountings, just as the indicator γ_LB′ did in Fig. 11(a). Considering that the PC-based overlay indicator might perform better for a small designed shift value, as illustrated in Fig. 10, we fixed d at 10 nm in the test. We assumed an input overlay error of ε = 5 nm. The total overlay displacement δ (δ = d + ε or δ = −d + ε) then falls in the linear range, as indicated in Fig. 8(a). We then used γ_m13 based on the PC weighting approach to obtain the measured overlay error ε′ at φ = 60°. The measured overlay error was found to be ε′ = −24.560 nm, which is clearly wrong. The wrong overlay error obtained at φ = 60° might be because of the large intercept in the fitted linear curve, as revealed in Fig. 8(a), induced by the nonzero off-diagonal block element values of the Mueller matrix at φ = 60° even when δ = 0.

Fig. 12. The measured overlay errors at different azimuthal angles using the original indicator γ_m13 and the corrected indicator γ′_m13.

overlay errors at different azimuthal angles. As expected, the results shown in Fig.