Efficient Source Mask Optimization with Zernike Polynomial Functions for Source Representation

In 22 nm optical lithography and beyond, source mask optimization (SMO) becomes vital for the continuation of advanced ArF technology node development. The pixel-based method permits a large solution space, but involves a time-consuming optimization procedure because of the large number of pixel variables. In this paper, we introduce the Zernike polynomials as basis functions to represent the source patterns, and propose an improved SMO algorithm with this representation. The source patterns are decomposed into the weighted superposition of some well-chosen Zernike polynomial functions, and the number of variables decreases significantly. We compare the computation efficiency and optimization performance between the proposed method and the conventional pixel-based algorithm. Simulation results demonstrate that the former can obtain substantial speedup of source optimization while improving the pattern fidelity at the same time.

References and Links
"Performance of FlexRay: a fully programmable illumination system for generation of freeform sources on high NA immersion systems," Proc. SPIE 7640, 76401P (2010).
"Applicability of global source mask optimization to 22/20nm node and beyond," Proc. SPIE 7973, 79730C (2011).
5. N. Jia and E. Y. Lam, "Pixelated source mask optimization for process robustness in optical lithography," Opt.
"Improved mask and source representations for automatic optimization of lithographic process conditions using a genetic algorithm," Proc.
"Efficient source and mask optimization with augmented Lagrangian methods in optical lithography," Opt.
"Machine learning for inverse lithography: using stochastic gradient descent for robust photomask synthesis," J.
"Experimental result and simulation analysis for the use of pixelated illumination from source mask optimization for 22-nm logic lithography process," Proc.
"Experimental verification of source-mask optimization and freeform illumination for 22-nm node static random access memory cells," J.
"Optimum mask and source patterns to print a given shape," J.
"Kernel-based parametric analytical model of source intensity distributions in lithographic tools," Appl.
"Gradient-based source and mask optimization in optical lithography," IEEE Trans.
"Pixelated source and mask optimization for immersion lithography," J.
"Convolution-variation separation method for efficient modeling of optical lithography," Opt.
"Fast aerial image simulations using one basis mask pattern for optical proximity correction," J.
"Fast source optimization involving quadratic line-contour objectives for the resist image," Opt.


Introduction
As the critical dimension (CD) continues to shrink in the semiconductor industry, the continuation of ArF optical lithography depends heavily on resolution enhancement techniques (RETs) [1]. Source mask optimization (SMO), one of the RETs, becomes critical at the 22 nm technology node and beyond since it provides a viable and powerful approach to scale down the resolution [2]. This is because highly customized sources are available by using a diffractive optical element (DOE) or programmable illumination, which can shape the light into free-form patterns with little throughput loss [3,4]. At the same time, the SMO process is carried out by various algorithms including the gradient-based method, the genetic algorithm, and more recently the augmented Lagrangian method for speed enhancement [5][6][7]. The algorithm is extended to take robustness to process variations into account [8]. Simulations and experiments of SMO are also performed to demonstrate its applicability in integrated circuit fabrication [9,10].
SMO is usually carried out through analysis of the aerial image generated on the wafer plane and inverse optimization for the mask and source designs [11][12][13]. In this process, source representation methods play a critical role, and they affect the optimization performance and the efficiency significantly [14]. As shown in Fig. 1, the source pattern in optical lithography has evolved from the traditional circular, annular, dipole, and quadrupole sources, to more complicated shapes such as sectors/tracks, and more recently to pixel-based sources. The traditional source patterns need only one or several parameters for their description [15]; more customized sources represented by arcs, sectors/tracks and so on, however, use dozens of variables and bring larger flexibility [6,16]. For these sources, the number of variables to represent the source patterns is small, and the source optimization problem is also of a limited scale. However, these representation methods lead to a nonlinear relationship between the aerial image and the variables, and thus the optimization requires solving a nonlinear problem. More importantly, the source patterns described by these methods are binary, largely limiting the freedom of the source patterns and the optimization performance [14]. A kernel-based parametric model is also proposed that can represent the physical distribution of real-world illumination sources [17]. However, the nonlinear relationship still remains, which makes it difficult to incorporate in source optimization.
In contrast, most of the recent SMO algorithms compute a source pattern represented by grayscale pixels [18][19][20]. In these algorithms, the source patterns are discretized into matrices according to a specified pixel size, and each entry of the matrices is a variable [21]. The grayscale pixel can represent a continuum of real numbers from 0 to 1, and the freedom of the solution space is then greatly enlarged. These methods can take advantage of the linear relationship between the aerial image and the source patterns, and the source optimization can be formulated as a quadratic problem. The drawback of the pixel-based method is that the number of pixel variables can be very large, leading to a computationally intensive optimization problem.
Recently, a library-based method was proposed by Yu et al. for efficient source optimization with a large mask pattern [22]. This method employs the illumination cross coefficient (ICC) for source optimization on small mask patterns, builds a library from the optimized source patterns, and then optimizes the source pattern within this library for large mask patterns. Yet, constructing the library still requires substantial computation. Thus, it is still highly desirable to find a source representation method that better balances representation freedom and optimization efficiency.
In this paper, we develop an SMO algorithm by representing the source patterns using the superposition of weighted Zernike polynomial functions. In optical lithography, these functions are widely used for the phase representation of the wavefront aberration [23,24], and can be easily incorporated in efficient modeling of the imaging process [25]. The coefficients can be computed through matrix inversion or 2-D integration, making it straightforward to transform between the pixel images and the Zernike coefficients [26]. In addition, we can take advantage of the source pattern characteristics, such as symmetry, to choose the Zernike polynomials. Thus the number of variables in source optimization can be significantly reduced compared with pixel-based algorithms. The relationship between the aerial image and the Zernike coefficients is also linear, and the former can be calculated efficiently.
In the following sections, we first model the forward aerial image formation with partially coherent imaging systems using the Zernike polynomials. We then develop an algorithm for a fast transmission cross coefficient (TCC) calculation with a similar linear relationship, which helps to reduce the computation of mask optimization by introducing the sum of coherent systems (SOCS) [27]. Next, we cast the source optimization as a quadratic problem, which can be efficiently solved through convex optimization tools. We further analyze the source optimization results with different terms of the Zernike polynomial functions. Finally, simulations of a sequential SMO are performed to show the computation efficiency and optimization performance improvement of the proposed algorithm compared with the pixel-based SMO algorithm.

Fast aerial image calculation
The optical lithography imaging process is usually modeled as a partially coherent system, which consists of an extended source, a condenser, a mask pattern, a projection lens, and an aerial image on the wafer plane. Generally, the aerial image $I$ on the wafer plane can be expressed, using Abbe's formulation, as [28]

$$ I(x,y) = \iint J(f,g)\,\left| \mathcal{F}^{-1}\left\{ H(f+f', g+g')\, O(f',g') \right\} \right|^2 df\, dg, \tag{1} $$

or with Hopkins' formulation as

$$ I(x,y) = \iiiint T(f',g';f'',g'')\, O(f',g')\, O^{\dagger}(f'',g'')\, e^{i 2\pi [(f'-f'')x + (g'-g'')y]}\, df'\, dg'\, df''\, dg'', \tag{2} $$

where $(x, y)$ are the spatial coordinates, $(f, g)$ are the spatial frequency coordinates, $J$ is the illumination source, $O$ is the mask spectrum, $H$ is the projection pupil, $\mathcal{F}$ denotes the Fourier transform (its inverse $\mathcal{F}^{-1}$ is taken over $(f', g')$), $\dagger$ denotes the complex conjugate, and $T$ is the transmission cross coefficient (TCC) defined by

$$ T(f',g';f'',g'') = \iint J(f,g)\, H(f+f', g+g')\, H^{\dagger}(f+f'', g+g'')\, df\, dg. \tag{3} $$

In this system, $J(f, g)$ is conventionally a circular function with radius $\sigma$, also known as the partial coherence factor. To improve the resolution, lithographers have developed off-axis illumination, including annular, dipole and quadrupole sources, and more recently the technology has enabled the use of more customized sources with a variety of shapes. These customized sources can be represented by several methods, including sector/track-based and pixel-based representations. The former uses a small number of variables to represent the source patterns, but with limited flexibility. The latter permits much more flexible designs, offering the greatest potential to improve the resolution. However, this representation method requires a large number of pixel variables.
To cope with this, we make use of the Zernike polynomial functions, which are a sequence of orthogonal basis functions [24,28]. We denote the source patterns as the expansion of $P$ terms of Zernike polynomials, i.e.

$$ J(f,g) = \sum_{l=1}^{P} \psi_l Z_l(f,g), \tag{4} $$

where $\psi_l$ is the corresponding Zernike coefficient, and $Z_l$ is the $l$th Zernike polynomial. For convenience, we can change this to a matrix representation using lexicographic ordering. Equation (4) can then be rewritten as

$$ \mathbf{J} = Z\Psi, \tag{5} $$

where $\mathbf{J}$ is the vector form of $J$, $\Psi$ is the vector of Zernike coefficients $\Psi = [\psi_1, \psi_2, \ldots, \psi_P]^T$, and $Z$ is an $N_s^2 \times P$ matrix generated by stacking the vectors of the Zernike polynomials together.
In the pixel-based SMO method, the source patterns are discretized to a square grid, and the intensity of each grid location is represented by a pixel value. In order to better represent the source patterns with the Zernike polynomial functions, we also discretize these basis functions into matrices of the same size as the source patterns, as shown in Fig. 2. For our purposes here, we assume that they are of size $N_s \times N_s$. Moreover, we note that for a partially coherent imaging system, the effective source intensity is limited to a unit disk, and only the pixels within the circle shown in Fig. 2 are of interest. Thus, we only need to consider the values within the unit disk for the Zernike polynomial functions.
The number of terms $P$ can be quite large to represent a free-form source, which would require significant computation as a result. However, realistic source patterns in optical lithography often have some characteristics such as symmetry to reduce the pattern placement shift [9,29]. This property can reduce the number of polynomials because we can restrict ourselves to those symmetrical to both the horizontal and vertical axes. The first 21 Zernike polynomial functions satisfying this requirement are shown in Fig. 3.
Using the Zernike polynomial representation, we substitute Eq. (4) into Abbe's aerial image formulation in Eq. (1). Since the aerial image is linearly related to the source pattern, we separate the Zernike coefficients by changing the position of integration and summation such that

$$ I(x,y) = \sum_{l=1}^{P} \psi_l\, \hat{I}_l(x,y), \tag{6} $$

where

$$ \hat{I}_l(x,y) = \iint Z_l(f,g)\,\left| \mathcal{F}^{-1}\left\{ H(f+f', g+g')\, O(f',g') \right\} \right|^2 df\, dg. \tag{7} $$

This equation can be considered as the basis aerial image corresponding to the Zernike polynomial function $Z_l$. In matrix form, Eq. (6) can be written as

$$ \mathbf{I} = \hat{I}\Psi, \tag{8} $$

where $\mathbf{I}$ is the vector form of $I$, and $\hat{I}$ is a matrix generated in the same way as $Z$, with the vectorized basis aerial images as its columns.
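The discretization and the linear relation between source pixels and Zernike coefficients can be illustrated with a short NumPy sketch. It is a minimal illustration under stated assumptions, not the authors' code: the function names, the ordering of the terms, the unnormalized polynomials, and the use of least squares for the pixel-to-coefficient transform are all choices made here. The doubly symmetric terms are taken to be the even-order cosine terms $R_n^m(\rho)\cos(m\theta)$ with even $m$.

```python
import numpy as np
from math import factorial

def radial(n, m, rho):
    """Zernike radial polynomial R_n^m(rho) for n >= m >= 0 with n - m even."""
    out = np.zeros_like(rho, dtype=float)
    for k in range((n - m) // 2 + 1):
        coef = ((-1) ** k * factorial(n - k)
                / (factorial(k)
                   * factorial((n + m) // 2 - k)
                   * factorial((n - m) // 2 - k)))
        out += coef * rho ** (n - 2 * k)
    return out

def symmetric_zernike_basis(n_s, n_orders):
    """Sample the even-m cosine Zernike terms on an n_s x n_s grid.

    Returns (Z, inside): Z has one column per term, restricted to the
    grid points inside the unit disk; inside is the boolean disk mask.
    """
    axis = np.linspace(-1.0, 1.0, n_s)
    f, g = np.meshgrid(axis, axis)
    rho, theta = np.hypot(f, g), np.arctan2(g, f)
    inside = rho <= 1.0
    cols = []
    for n in range(0, 2 * n_orders, 2):       # even radial orders 0, 2, 4, ...
        for m in range(0, n + 1, 2):          # even azimuthal orders, cosine only
            cols.append((radial(n, m, rho) * np.cos(m * theta))[inside])
    return np.column_stack(cols), inside

# 6 radial orders give 21 terms (the count shown in Fig. 3).
Z, inside = symmetric_zernike_basis(n_s=65, n_orders=6)

# Arbitrary illustrative coefficients: a constant plus a defocus-like term.
psi = np.zeros(Z.shape[1])
psi[0], psi[1] = 0.5, 0.2

J = Z @ psi                                        # source pixels, Eq. (5)
psi_rec, *_ = np.linalg.lstsq(Z, J, rcond=None)    # pixels back to coefficients
```

Because only cosine terms with even $m$ are kept, every column of `Z` is mirror-symmetric about both axes, so any source built as `Z @ psi` inherits that symmetry automatically.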
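In a similar way, we also substitute Eq. (4) in the TCC expression, getting

$$ T(f',g';f'',g'') = \sum_{l=1}^{P} \psi_l\, \hat{T}_l(f',g';f'',g''), \tag{9} $$

where the basis TCC matrix $\hat{T}_l$ is

$$ \hat{T}_l(f',g';f'',g'') = \iint Z_l(f,g)\, H(f+f', g+g')\, H^{\dagger}(f+f'', g+g'')\, df\, dg. \tag{10} $$

The equivalent matrix form of Eq. (9) is

$$ \mathbf{T} = \hat{T}\Psi. \tag{11} $$

Again, the generation of this matrix form of TCC follows the same method as Eqs. (5) and (8).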
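We would like to point out that the basis TCCs can be pre-computed because they only involve the corresponding basis Zernike polynomial functions and the pupil functions. Then, only the coefficients $\psi_l$ vary when the source pattern changes, and the new TCC can be calculated through a linear combination of these bases, which can be very efficient. We depict this in Fig. 4, where Fig. 4(a) shows the Zernike coefficients, Figs. 4(b), 4(c), and 4(d) are the basis Zernike polynomial functions, basis TCCs, and basis images, respectively, and Figs. 4(e), 4(f), and 4(g) are correspondingly the source pattern, the TCC, and the aerial image in the imaging system.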
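The benefit of pre-computing the basis TCCs follows from linearity in the source, which can be checked numerically on a small example. The sketch below is a hypothetical 1-D toy, not the 2-D lithography model: it uses an ideal low-pass pupil, and even monomials stand in for the Zernike terms. The grids and coefficient values are arbitrary.

```python
import numpy as np

def pupil(f):
    """Ideal low-pass pupil: passes |f| <= 1 (an assumption for this toy)."""
    return (np.abs(f) <= 1.0).astype(complex)

def tcc(source, shifts, freqs):
    """1-D TCC: T[i, j] = sum_s J[s] H(f_s + f_i) conj(H(f_s + f_j))."""
    h = pupil(shifts[:, None] + freqs[None, :])      # h[s, i] = H(f_s + f_i)
    return np.einsum("s,si,sj->ij", source, h, np.conj(h))

shifts = np.linspace(-1.0, 1.0, 33)    # source sample points
freqs = np.linspace(-2.0, 2.0, 41)     # mask frequency samples

# Stand-in basis: even monomials 1, s^2, s^4, s^6 in place of Zernike terms.
Z = np.stack([shifts ** (2 * l) for l in range(4)], axis=1)

# Pre-compute one basis TCC per basis function (done once).
basis_tccs = np.stack([tcc(Z[:, l], shifts, freqs) for l in range(Z.shape[1])])

psi = np.array([0.6, -0.3, 0.4, 0.1])                # arbitrary coefficients
t_direct = tcc(Z @ psi, shifts, freqs)               # recomputed from the source
t_linear = np.tensordot(psi, basis_tccs, axes=1)     # weighted sum of the bases
```

Once `basis_tccs` is stored, updating the TCC for a new coefficient vector costs only the weighted sum in the last line, with no further integration over the source.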

Mask optimization
In this section, we formulate the inverse optimization scheme for SMO based on the forward imaging model derived above. First, we consider the mask optimization. In previous SMO algorithms, Abbe's formulation is usually used for aerial image calculation, since the calculation of the TCC involves multiple integrations. Here, we can use the sum of coherent systems (SOCS) theory for fast aerial image calculation since the TCC can be calculated efficiently through Eq. (11) without computationally intensive integrations [27].
In this theory, the TCC matrix is decomposed into kernels through singular value decomposition (SVD), and the singular values descend rapidly. Only the few largest singular values and their corresponding eigenfunctions, which are considered as kernels in the imaging system, are maintained for the aerial image calculation. Thus the amount of computation can be significantly reduced. Let $K$ be the number of singular values used for the computation. After the TCC is computed efficiently through Eq. (9), the decomposition can be expressed as

$$ T(f',g';f'',g'') \approx \sum_{n=1}^{K} \lambda_n\, \phi_n(f',g')\, \phi_n^{\dagger}(f'',g''), $$

where $\lambda_n$ is the $n$th eigenvalue, and $\phi_n$ is the corresponding eigenvector. Then the aerial image can be calculated through

$$ I(x,y) \approx \sum_{n=1}^{K} \lambda_n \left| \Phi_n(x,y) * M(x,y) \right|^2, $$

where $M$ is the mask pattern in the spatial domain, $\Phi_n$ is the Fourier transform of the eigenfunction $\phi_n(f, g)$, and $*$ denotes 2-D convolution.
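The kernel truncation can be sketched in a hypothetical 1-D setting. Here the TCC of a uniform source with an ideal pupil is eigendecomposed, and the image built from the $K$ largest kernels is compared with the full Hopkins sum. All grids, the source extent, and the sinc-shaped mask spectrum are illustrative assumptions. The coherent fields are formed in the frequency domain, which is equivalent to the spatial convolution $\Phi_n * M$.

```python
import numpy as np

def coherent_fields(O, freqs, x):
    """E[x, i] = O_i * exp(i 2 pi f_i x): mask spectrum terms at position x."""
    return np.exp(2j * np.pi * np.outer(x, freqs)) * O

def hopkins_image(T, O, freqs, x):
    """Full bilinear sum: I(x) = sum_ij T[i, j] E[x, i] conj(E[x, j])."""
    E = coherent_fields(O, freqs, x)
    return np.einsum("xi,ij,xj->x", E, T, E.conj()).real

def socs_image(T, O, freqs, x, K):
    """Sum of coherent systems using the K largest eigenpairs of T."""
    lam, phi = np.linalg.eigh(T)                      # eigenvalues ascending
    lam, phi = lam[::-1][:K], phi[:, ::-1][:, :K]     # keep the K largest
    fields = coherent_fields(O, freqs, x) @ phi       # one field per kernel
    return (np.abs(fields) ** 2) @ lam

freqs = np.linspace(-2.0, 2.0, 41)
shifts = np.linspace(-0.7, 0.7, 29)                   # uniform source points
H = (np.abs(shifts[:, None] + freqs[None, :]) <= 1.0).astype(float)
T = H.T @ H / len(shifts)                             # symmetric, PSD toy TCC
O = np.sinc(0.5 * freqs)                              # toy mask spectrum
x = np.linspace(-2.0, 2.0, 101)

I_full = hopkins_image(T, O, freqs, x)
err = {K: np.max(np.abs(socs_image(T, O, freqs, x, K) - I_full))
       for K in (3, 10, len(freqs))}
```

With all kernels retained the two images agree to rounding error. Since each discarded kernel contributes a non-negative term, the truncation error cannot increase as `K` grows.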
For the cost function, we define the difference between the resist image $I_r$ and the target pattern $I_t$ as a measure of the image fidelity. The resist image is obtained from the aerial image with a sigmoid function modeling the resist effect,

$$ I_r(x,y) = \frac{1}{1 + \exp\{-\alpha\,[I(x,y) - t_r]\}}, $$

where $t_r$ is the threshold in the photoresist effect, and $\alpha$ indicates the steepness of the sigmoid function. Then the image fidelity term $R_m$ is given by

$$ R_m = \left\| I_r - I_t \right\|_2^2. $$

In addition, to enhance image contrast, we define a penalty term $R_a$ on the aerial image as

$$ R_a = \left\| I - 2 t_r I_t \right\|_2^2. $$

This term forces the aerial image toward 0 where the target is 0, and toward $2t_r$ where the target is 1. More explanations about this term can be found in Ref. [5]. The overall cost function $L_m$ for mask optimization can be represented as

$$ L_m = R_m + \tau R_a, $$

where $\tau$ is a weight assigned to the image contrast term. Therefore the mask optimization can be formulated as

$$ \hat{M}(x,y) = \arg\min_{M} L_m. $$

A conjugate gradient method can be employed to optimize the mask pattern iteratively [5].
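A minimal sketch of the resist model and the mask cost function follows. The squared L2 norms and the form of the contrast penalty are assumptions inferred from the description above (the penalty vanishes when the aerial image is 0 on the background and $2t_r$ on the features). The function names and the toy target are illustrative, and the default parameter values are those quoted in the simulations.

```python
import numpy as np

def resist_image(aerial, t_r=0.3, alpha=85.0):
    """Sigmoid resist model: close to 0 below the threshold, close to 1 above."""
    return 1.0 / (1.0 + np.exp(-alpha * (aerial - t_r)))

def mask_cost(aerial, target, t_r=0.3, alpha=85.0, tau=0.1):
    """Fidelity term plus tau times the contrast penalty (assumed L2 forms)."""
    r_m = np.sum((resist_image(aerial, t_r, alpha) - target) ** 2)
    r_a = np.sum((aerial - 2.0 * t_r * target) ** 2)
    return r_m + tau * r_a

# Toy binary target: a bright rectangle on a dark background.
target = np.zeros((8, 8))
target[2:6, 3:5] = 1.0

ideal = 2.0 * 0.3 * target             # aerial image at the penalty's minimum
flat = np.full_like(target, 0.3)       # featureless image sitting at threshold

cost_ideal = mask_cost(ideal, target)
cost_flat = mask_cost(flat, target)
```

The featureless image sits exactly at the threshold everywhere, so its resist image is 0.5 at every pixel and both terms penalize it, whereas the idealized image drives both terms close to zero.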

Source optimization
We formulate the source optimization as a quadratic problem as introduced by Yu et al. [30]. Note, however, that in our method, the source patterns are fully characterized by the Zernike coefficients, and therefore we only need to optimize them instead of the pixel variables. The cost function consists of two terms, namely the contour awareness term $R_c$ and the side-lobe suppression term $R_0$. The former forces the intensity on the contour to be equal to the threshold, as defined by

$$ R_c = \left\| \hat{I}_c \Psi - \mathbf{t}_r \right\|_2^2, $$

where $\hat{I}_c$ is an $N_c \times P$ matrix denoting the aerial images extracted from $\hat{I}$ by choosing those located on the mask edge positions, $N_c$ is the number of points on these positions, and $\mathbf{t}_r$ is an $N_c$-element vector whose values are all $t_r$. The latter suppresses the side-lobes by forcing the aerial image around the main features to be small, as given by

$$ R_0 = \left\| \hat{I}_0 \Psi - \boldsymbol{\varepsilon} \right\|_2^2, $$

where $\hat{I}_0$ is an $N_0 \times P$ matrix denoting the aerial images located on a closed curve surrounding the main features of the mask, $N_0$ is the number of points on the curve, and $\boldsymbol{\varepsilon}$ is an $N_0$-element vector with all its values equal to $\varepsilon$, which is a small positive value. The distance between the curve and the main features is half a pitch for periodic patterns and $0.61\lambda/\mathrm{NA}$ for isolated and semi-isolated patterns, where $\lambda$ is the wavelength of the source, and NA is the numerical aperture.
Adding these two terms together, we get the overall cost function

$$ L_s = R_c + \mu R_0, $$

where $\mu$ signifies the relative importance of the two terms. This cost function can be written as a quadratic form

$$ L_s = \Psi^T Q \Psi + b^T \Psi + c, $$

where

$$ Q = \hat{I}_c^T \hat{I}_c + \mu\, \hat{I}_0^T \hat{I}_0, \qquad b = -2\left( \hat{I}_c^T \mathbf{t}_r + \mu\, \hat{I}_0^T \boldsymbol{\varepsilon} \right), \qquad c = \mathbf{t}_r^T \mathbf{t}_r + \mu\, \boldsymbol{\varepsilon}^T \boldsymbol{\varepsilon}. $$

The sizes of $Q$ and $b$ are $P \times P$ and $P \times 1$, respectively, and $c$ is a scalar.
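The step from the two least-squares terms to a single quadratic form can be verified numerically. The sketch below assumes the combined cost is the contour term plus $\mu$ times the side-lobe term, both as squared L2 norms. Random matrices stand in for the sampled basis images, so the sizes and values are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
P, n_c, n_0 = 6, 12, 9
I_c = rng.random((n_c, P))     # stand-in: basis images sampled on the contour
I_0 = rng.random((n_0, P))     # stand-in: basis images on the surrounding curve
t_r, eps, mu = 0.3, 1e-3, 0.1
t_vec = np.full(n_c, t_r)      # threshold vector
e_vec = np.full(n_0, eps)      # small-intensity target vector

# Expand ||I_c psi - t||^2 + mu ||I_0 psi - e||^2 into psi'Q psi + b'psi + c.
Q = I_c.T @ I_c + mu * I_0.T @ I_0
b = -2.0 * (I_c.T @ t_vec + mu * I_0.T @ e_vec)
c = t_vec @ t_vec + mu * e_vec @ e_vec

def cost_direct(psi):
    """Cost evaluated from the two residual norms."""
    return (np.sum((I_c @ psi - t_vec) ** 2)
            + mu * np.sum((I_0 @ psi - e_vec) ** 2))

def cost_quadratic(psi):
    """The same cost evaluated from the quadratic form."""
    return psi @ Q @ psi + b @ psi + c

psi_test = rng.normal(size=P)
```

`Q` is a sum of Gram matrices and is therefore symmetric positive semidefinite, which is what makes the resulting problem convex.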
The source intensities in lithography tools are non-negative, real-valued functions. On the other hand, the Zernike polynomial functions contain negative values, and their summation is not necessarily positive. Another issue is that to avoid sharp spikes that can damage the lenses, the source intensities are limited to some value $S_{\max}$. As the source patterns with the Zernike representation can be expressed as $Z\Psi$, these two requirements can be satisfied by setting a linear constraint on the source patterns as $0 \le Z\Psi \le S_{\max}$.
In addition, the dose variation in optical lithography can be characterized by either the total intensity of the illumination source, or the threshold value $t_r$ in the resist model. Here, we fix the threshold value, and limit the total intensity of the illumination source to a certain value $D_{\max}$. The total intensity of the source can be calculated as the summation of all the pixel values of the source patterns, that is, $EZ\Psi$, where $E$ is an $N_s^2$-element row vector whose values are all 1. Thus, this requirement can be expressed as a linear constraint $EZ\Psi \le D_{\max}$ in the optimization process.
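Overall, the source optimization can be formulated as

$$ \begin{aligned} \underset{\Psi}{\text{minimize}} \quad & \Psi^T Q \Psi + b^T \Psi + c \\ \text{subject to} \quad & 0 \le Z\Psi \le S_{\max}, \\ & EZ\Psi \le D_{\max}. \end{aligned} $$

This is a quadratic problem with linear constraints, and can be conveniently solved by convex optimization tools such as CVX [31].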
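As a rough illustration of solving such a linearly constrained quadratic problem, the sketch below uses SciPy's general-purpose SLSQP solver in place of CVX, on a small synthetic instance. The basis matrix, the least-squares objective, and the bounds are made-up stand-ins, not the lithography quantities.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
n_pix, P = 40, 5
# Stand-in basis: a constant column plus random columns (not real Zernike terms).
Z = np.column_stack([np.ones(n_pix), rng.normal(size=(n_pix, P - 1))])
A = rng.random((15, P))            # stand-in for sampled basis images
target = np.full(15, 0.3)          # desired intensity at the sampled points

Q = A.T @ A
b = -2.0 * A.T @ target
c = target @ target
s_max, d_max = 1.0, 20.0
E = np.ones(n_pix)                 # summation vector for the total intensity

def cost(psi):
    return psi @ Q @ psi + b @ psi + c

def grad(psi):
    return 2.0 * Q @ psi + b

constraints = [
    {"type": "ineq", "fun": lambda psi: Z @ psi},                # Z psi >= 0
    {"type": "ineq", "fun": lambda psi: s_max - Z @ psi},        # Z psi <= S_max
    {"type": "ineq", "fun": lambda psi: d_max - E @ (Z @ psi)},  # dose <= D_max
]

psi0 = np.zeros(P)                 # feasible start: an all-zero source
res = minimize(cost, psi0, jac=grad, method="SLSQP", constraints=constraints)
psi_opt = res.x
source = Z @ psi_opt
```

Because the objective is convex and the constraints are linear, any local minimizer returned by the solver is also a global one. A dedicated QP or conic solver would scale better for problems of realistic size.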

Selection of the Zernike polynomials
In the above derivations, we take advantage of the prior information of the source patterns to choose the symmetric Zernike polynomial functions. There is a tradeoff between the amount of computation and the optimization performance, i.e., fewer terms can lead to faster computation, at the expense of lower source pattern flexibility. Here, we first perform simulations to evaluate this quantitatively. We compute the source optimization using different numbers of Zernike polynomial functions on several line array mask patterns with different densities. The line width of the line array patterns, also known as the CD, is 51.5 nm. For different densities, the ratio of CD to pitch ranges from 1:2 to 1:5, thus we have four mask patterns in total. It is well known that the Zernike polynomials have radial components and azimuthal components [24]. For the chosen symmetric Zernike polynomials, the number $P = L(L + 1)/2$ if the first $L$ orders of radial components are selected. Here, we let $L$ range from 2 to 17, and the corresponding $P$ equals 3, 6, 10, ..., 136. Each Zernike polynomial, and therefore the source pattern, is represented by a 65 × 65 pixel image. The number of source pixels of interest located within the unit disk is 3785. The source wavelength is 193 nm, and the NA is 1.35. The $t_r$ and $\alpha$ in the sigmoid function to calculate the resist image are 0.3 and 85, respectively. We also set the weight $\mu = 0.1$ and $\varepsilon = 0.001$ in the source optimization, the maximum pixel value $S_{\max} = 1$, and the maximum total intensity $D_{\max} = 500$.
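The count $P = L(L+1)/2$ can be checked by enumerating the index pairs directly. The snippet assumes that the doubly symmetric terms are the cosine terms with even azimuthal order $m$ (which also forces the radial order $n$ to be even), and that the "first $L$ orders" are $n = 0, 2, \ldots, 2(L-1)$. The function name is illustrative.

```python
def symmetric_terms(L):
    """(n, m) pairs of the even-m cosine Zernike terms for the first L even radial orders."""
    return [(n, m) for n in range(0, 2 * L, 2) for m in range(0, n + 1, 2)]

# Radial order n contributes n/2 + 1 terms, so the total is 1 + 2 + ... + L.
counts = {L: len(symmetric_terms(L)) for L in (2, 3, 4, 6, 12)}
```

Under this enumeration, $L = 6$ gives the 21 functions of Fig. 3 and $L = 12$ gives the $P = 78$ terms used in the later simulations.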
Figure 5 plots the source optimization runtime and the optimized cost function value versus the number of Zernike polynomials. Each line in Fig. 5(a) plots the relationship between the runtime and the number of Zernike polynomials, while Fig. 5(b) displays the optimized values of the cost function corresponding to mask patterns at various densities. The runtime increases almost linearly over the simulated range, while the optimized value decreases with more Zernike terms. This demonstrates the tradeoff mentioned above between the amount of computation and the optimization performance. We also note that once the number of Zernike polynomials exceeds a certain value, the optimized cost function decreases only slowly. This indicates that the low-order Zernike polynomials account for the main features of the optimized source patterns. The high-order Zernike polynomials contribute little to the reduction of the cost function, and can be neglected in source optimization. To better balance the amount of computation and the optimization performance, we choose $P = 78$ in our following simulations of source mask optimization. This is because the optimization process can approximate the largest solution space when $P = 78$, and adding further terms only increases the computational burden.

Optimization results
Source optimization alone is insufficient to obtain the required pattern fidelity in computational lithography. Thus, we apply the above technique to a sequential SMO process, and compare the performance and efficiency between the pixel-based algorithm and the Zernike polynomial based algorithm under the same conditions. The sequential SMO process is carried out by first performing source optimization using the target pattern as the initial mask, and then performing mask optimization with the optimized source obtained earlier [5]. This process can be repeated several times until convergence. Note that the source representation method here can be applied to other SMO algorithms, such as the simultaneous and hybrid algorithms.
We evaluate the optimization performance by measuring both the pattern fidelity and the robustness to process variations. The pattern fidelity is evaluated by computing the pattern error (PE) defined as the difference between the output pattern and the target pattern, and the edge placement errors (EPE) at critical places. In order to evaluate the robustness to process variations, we estimate the process windows and normalized image log slopes (NILS) at critical places. We also assess the optimization efficiency by the runtime of the optimization process.
Figure 6 shows the two target mask patterns, namely, a brick contact array and a regular contact array, which we use to test the SMO algorithm. Both patterns are of size 201 × 201 pixels, and each pixel represents 4.47 nm. The size of each contact is 40.2 nm × 125.1 nm, and the distances between neighboring contacts are 107.2 nm and 180.9 nm in the horizontal and vertical directions, respectively. The critical places to calculate the EPE are located at the cutlines, which are at the center of each contact region. As stated above, we choose 78 terms of Zernike polynomial functions as the basis functions to represent the source patterns. Each is represented as a 65 × 65 pattern. The source pattern is therefore also of this size, hence $N_s = 65$, and the spatial frequencies range from −1 to 1 after normalization by $\mathrm{NA}/\lambda$. Similar to the earlier simulation, we set $S_{\max} = 1$, $D_{\max} = 500$, and $\mu = \tau = 0.1$. Furthermore, in mask optimization, the number of kernels maintained for the aerial image calculation is $K = 10$.
The optimization results of the brick contact array are shown in Fig. 7. The source optimization results using the pixel-based (PB) method and the Zernike polynomial-based (ZPB) method [...] results compared with the PB algorithm in terms of robustness to process variations.
After evaluating the optimized image performance of the two algorithms, we now assess the optimization efficiency. Table 1 summarizes the runtime for both the source optimization (SO) and the mask optimization (MO) steps with the two test patterns. With both algorithms, the SO converges to a global solution in about 30 iterations. The total runtime of the ZPB algorithm is about 40 to 50 times shorter than that of the PB algorithm. This is attributed to the fact that the number of source variables in the PB algorithm is 3785, including all the pixels inside the unit circle represented by a 65 × 65 image, which is about 48 times the number of bases used in the ZPB algorithm. In addition, the MO also records a slight speedup with the latter. This is because the TCC can be calculated efficiently from the linear equations in Eq. (11), while ordinarily it would need multiple integrations.

Conclusions
In this paper, we propose an efficient SMO algorithm using the Zernike polynomial functions to represent the source patterns. We demonstrate that the source patterns can be represented with a small number of Zernike polynomials, and the source optimization problem can be formulated as a quadratic problem. We show that this can deliver similar performance to that provided by the pixel-based algorithm in enhancing both the pattern fidelity and robustness; at the same time, the optimization efficiency can be significantly improved due to the smaller number of source variables in source optimization and the use of the linear relationship to calculate the TCC in mask optimization.

Fig. 2. The sampling method of the source patterns and the Zernike polynomial functions.

Fig. 3. The first 21 Zernike polynomial functions chosen to represent the source patterns.

Fig. 4. Theory of aerial image simulation with Zernike polynomial-based source representation. Fig. 4(a) shows the Zernike coefficients; Figs. 4(b), 4(c), and 4(d) are the basis Zernike polynomial functions, basis TCCs, and basis images, respectively; and Figs. 4(e), 4(f), and 4(g) are correspondingly the source pattern, the TCC, and the aerial image in the imaging system.

Table 1. Comparison of the optimization performance and efficiency.