Single spherical mirror optic for extreme ultraviolet lithography enabled by inverse lithography technology

Traditionally, aberration correction in extreme ultraviolet (EUV) projection optics requires the use of multiple lossy mirrors, which results in prohibitively high source power requirements. We analyze a single spherical mirror projection optical system in which aberration correction is built into the mask itself, through Inverse Lithography Technology (ILT). Having fewer mirrors would reduce the power requirements for EUV lithography. We model a single spherical mirror system with orders of magnitude more spherical aberration than would ever be tolerated in a traditional multiple mirror system. By using ILT (implemented by an adjoint-based gradient descent optimization algorithm), we design photomasks that successfully print test patterns in spite of these enormous aberrations. This mathematical method was tested with an illumination source consisting of six plane waves. Nonetheless, it would have poor power throughput from a totally incoherent source.
© 2014 Optical Society of America
Received 30 Jul 2014; revised 17 Sep 2014; accepted 24 Sep 2014; published 7 Oct 2014. Optics Express, Vol. 22, No. 21, p. 25027 (20 October 2014), DOI:10.1364/OE.22.025027.
OCIS codes: (110.4235) Nanolithography; (100.3190) Inverse problems; (340.7480) X-rays, soft x-rays, extreme ultraviolet (EUV); (110.5220) Photolithography.
References and links
1. H. J. Levinson, Principles of Lithography, 3rd ed. (SPIE, 2010).
2. L. Pang, Y. Liu, and D. Abrams, “Inverse lithography technology (ILT), what is the impact to photomask industry?” Luminescent Technologies, Inc. (2006).
3. Y. Borodovsky, W. Cheng, R. Schenker, and V. Singh, “Pixelated phase mask as novel lithography RET,” Proc. SPIE 6924, 69240E (2008).
4. P. S. Davids and S. B. Bollepalli, “Generalized inverse problem for partially coherent projection lithography,” Proc. SPIE 6924, 69240X (2008).
5. V. Singh, B. Hu, K. Toh, S. Bollepalli, S. Wagner, and Y. Borodovsky, “Making a trillion pixels dance,” Proc. SPIE 6924, 69240S (2008).
6. W. Cheng, J. Farnsworth, W. Kwok, A. Jamieson, N. Wilcox, M. Vernon, K. Yung, Y. Liu, J. Kim, E. Frendberg, S. Chegwidden, R. Schenker, and Y. Borodovsky, “Fabrication of defect-free full-field pixelated phase mask,” Proc. SPIE 6924, 69241G (2008).
7. R. Schenker, S. Bollepalli, B. Hu, K. Toh, V. Singh, K. Yung, W. Cheng, and Y. Borodovsky, “Integration of pixelated phase masks for full-chip random logic layers,” Proc. SPIE 6924, 69240I (2008).
8. G. Kim, J. A. Domínguez-Caballero, and R. Menon, “Design and analysis of multi-wavelength diffractive optics,” Opt. Express 20(3), 2814–2823 (2012).
9. J. R. Fienup, “Iterative method applied to image reconstruction and to computer-generated holograms,” Opt. Eng. 19(3), 297–305 (1980).
10. C. Jacobsen and M. R. Howells, “A technique for projection x-ray lithography using computer-generated holograms,” J. Appl. Phys. 71(6), 2993–3001 (1992).
11. J. A. Domínguez-Caballero, S. Takahashi, S. J. Lee, and G. Barbastathis, “Design and fabrication of computer generated holograms for Fresnel domain lithography,” in Digital Holography and Three-Dimensional Imaging, Vancouver, Canada, April 26–30 (2009).
12. Y. Cheng, A. Isoyan, J. Wallace, M. Khan, and F. Cerrina, “Extreme ultraviolet holographic lithography: initial results,” Appl. Phys. Lett. 90, 023116 (2007).
13. J. S. Jensen and O. Sigmund, “Topology optimization for nano-photonics,” Laser Photon. Rev. 5(2), 308–321 (2011).
14. P. Seliger, M. Mahvash, C. Wang, and A. F. J. Levi, “Optimization of aperiodic dielectric structures,” J. Appl. Phys. 100(3), 034310 (2006).
15. W. R. Frei, D. A. Tortorelli, and H. T. Johnson, “Geometry projection method for optimizing photonic nanostructures,” Opt. Lett. 32(1), 77–79 (2007).
16. V. Liu and S. Fan, “Compact bends for multi-mode photonic crystal waveguides with high transmission and suppressed modal crosstalk,” Opt. Express 21(7), 8069–8075 (2013).
17. G. Veronis, R. W. Dutton, and S. Fan, “Method for sensitivity analysis of photonic crystal devices,” Opt. Lett. 29(19), 2288–2290 (2004).
18. O. D. Miller, “Photonic design: from fundamental solar cell physics to computational inverse design,” Ph.D. Thesis, EECS Department, Univ. of California, Berkeley (2012).
19. C. M. Lalau-Keraly, S. Bhargava, O. D. Miller, and E. Yablonovitch, “Adjoint shape optimization applied to electromagnetic design,” Opt. Express 21(18), 21693–21701 (2013).
20. V. Ganapati, O. D. Miller, and E. Yablonovitch, “Light trapping textures designed by electromagnetic optimization for subwavelength thick solar cells,” IEEE J. Photovolt. 4(1), 175–182 (2014).
21. M. P. Bendsoe and O. Sigmund, Topology Optimization Theory, Methods and Applications (Springer, 2003).
22. G. Strang, Computational Science and Engineering (Wellesley-Cambridge, 2007).
23. S. Krantz, A Guide to Complex Variables (2007).
24. Y. Borodovsky, “EUV lithography at insertion and beyond,” http://www.euvlitho.com/2012/P1.pdf.
25. C. Solomon and T. Breckon, Fundamentals of Digital Image Processing: a Practical Approach with Examples in MATLAB (Wiley-Blackwell, 2011).
26. J. W. Goodman, Introduction to Fourier Optics, 2nd ed. (McGraw-Hill, 1996).
27. M. Born and E. Wolf, Principles of Optics, 7th ed. (Cambridge University, 1999).
28. V. N. Mahajan, Aberration Theory Made Simple, 2nd ed. (SPIE, 2011).


Introduction
Extreme ultraviolet (EUV) lithography is the leading contender to become the next industrial-scale lithography technology in the semiconductor industry. Nonetheless, source power requirements are a major challenge that must be overcome [1]. In EUV lithography, multiple multilayer mirrors are used instead of lenses. Since the maximum reflectivity of a single mirror is 70% [1], projection optics systems employing 6 mirrors for aberration correction transmit less than 12% of the illumination power to the wafer. To address this problem, we consider a single mirror system in which the aberration correction is built into the mask design. This could result in a $(1 - 0.7^5) \approx 83\%$ reduction in the EUV source power required, but the mathematical procedure will constrain the source incoherence.
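The throughput figures quoted above follow from simple arithmetic, which the short sketch below reproduces. It assumes only the 70% per-mirror reflectivity stated in the text; the function and variable names are illustrative.

```python
# Illustrative arithmetic only; 0.70 is the per-mirror reflectivity quoted in the text.
REFLECTIVITY = 0.70


def throughput(n_mirrors: int, reflectivity: float = REFLECTIVITY) -> float:
    """Fraction of the illumination power that survives n_mirrors reflections."""
    return reflectivity ** n_mirrors


six_mirror = throughput(6)   # conventional six-mirror projection optics
one_mirror = throughput(1)   # single-mirror system considered here

# Source power needed for the same wafer dose scales as 1 / throughput,
# so the fractional saving is 1 - (six_mirror / one_mirror) = 1 - 0.7**5.
source_power_saving = 1.0 - six_mirror / one_mirror

print(f"six mirrors transmit {six_mirror:.3f} of the power")
print(f"one mirror transmits {one_mirror:.3f} of the power")
print(f"source power saving: {source_power_saving:.3f}")
```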
To design masks with built-in aberration correction, we employ the optimization approach called Inverse Lithography Technology (ILT), which was developed independently by Luminescent Inc. [2] and Intel [3][4][5][6][7]. This approach has the ability to explore a large design space and systematically find unintuitive, yet high-performing solutions to mask design that would not otherwise be found. We use the adjoint method, a gradient descent optimization algorithm that has great advantages over algorithms used previously for photomask design. For example, the use of gradient descent lets the algorithm converge orders of magnitude faster than non-gradient methods such as the binary search algorithm used in [8]. The adjoint method also provides more in-depth information than either the Gerchberg-Saxton algorithm in [9][10][11], or the back propagation technique in [12], allowing gradient descent to optimize more complex figures of merit.
In this paper, we begin by describing the general form of the adjoint method, which has been used to successfully design all manner of electromagnetic components [13][14][15][16][17][18][19][20]. We then present a specific way to apply the adjoint method to Inverse Lithography Technology. Finally, we apply this form of ILT to a single spherical mirror system with orders of magnitude greater aberrations than would ever be tolerated in a traditional multiple mirror system. The adjoint method allows us to design photomasks with non-intuitive shapes that nonetheless successfully print test patterns, in spite of these enormous aberrations.

Adjoint method for electromagnetic design
The adjoint method is a gradient descent optimization algorithm for designing the geometry of dielectric or metal electromagnetic devices under Maxwell's equations. Adjoint methods have been employed in the design of optical and photonic components [13][14][15][16][17][18][19][20], and mathematical derivations of the adjoint method are available in optimization textbooks [21,22]. The adjoint method converges to an optimum much more rapidly than popular heuristic optimization methods such as genetic algorithms and particle swarm optimization, since it follows the gradient, that is, the derivative of the Figure-of-Merit with respect to all geometric parameters.
The adjoint method calculates the gradient at all points in space with only 2 simulations, regardless of the size of the system. Absent the adjoint method, N simulations would be required to calculate the gradient using finite differences, where N is the number of geometrical parameters. For general geometry at all points in space, the adjoint method makes calculation of the gradient tractable when it would not be otherwise. For example, if a geometry is represented by a 1000 × 1000 pixel grid, and each pixel is a separate parameter, the adjoint method speeds up calculation of the gradient by 500,000×. A large number of parameters is desirable because this provides more degrees of freedom to the optimizer, and hence makes a better optimum achievable.
In our implementation, the adjoint inverse solver is a small subroutine that wraps around a forward solver. This means any existing commercial Maxwell forward solver can be used.
A flowchart describing the adjoint inverse solver is shown in Fig. 1. In a given iteration, the forward simulation provides the electromagnetic fields for the current geometry. Then the adjoint simulation calculates the gradient. In gradient descent, a local change in geometry is made, proportional to the calculated gradient, in preparation for the next iteration.

Fig. 1. A flowchart showing one iteration in the adjoint method. First, electric and/or magnetic fields are found for the current geometry through the forward simulation. Then, the geometry gradient is found through the adjoint simulation. The gradient is used to make an update to the geometry.

Adjoint method applied to ILT
This section describes our mathematical approach for applying the adjoint method to ILT for photomasks. We have adopted the mathematical formulation of the adjoint method previously presented in Owen Miller's Ph.D. thesis [18]. Miller's thesis contains a more general form of the present derivation that accounts for vector forms of both electric and magnetic fields.
Only scalar electric fields are considered here.
A reflective projection optics system with one mirror is depicted in Fig. 2(a). Equivalently, we model the system with a refractive lens as shown in Fig. 2(b). We will find the gradient of the Figure-of-Merit (the total image error) with respect to the mask transmission factor (which defines where the mask is opaque or transmissive). The mask transmission factor is in the mask plane, while the Figure-of-Merit is a function of the electric field in the wafer plane. To find the gradient of the Figure-of-Merit with respect to the mask transmission factor, we apply the chain rule of calculus: first find the gradient with respect to the mask plane electric field, and then the derivative of the electric field with respect to the transmission factor.
Fig. 2. Projection optics with one mirror (a), and an equivalent system with one lens (b). S is the distance from the mirror to the mask, and S' is the distance from the mirror to the wafer (not to scale). D is the diameter of the mirror/lens.

Gradient with respect to electric field
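The Figure-of-Merit is a sum of errors in the wafer plane image, and has the general form
$$\mathrm{FoM} = \int f\big(E_W(\mathbf{r}_W)\big)\, d^2\mathbf{r}_W, \tag{1}$$
where $f$ represents a local error in the image at point $\mathbf{r}_W$, the subscript $W$ denotes a variable in the wafer plane, $E_W$ is the wafer plane electric field, $\mathbf{r}_W$ is the two-dimensional spatial position vector in the wafer plane, and bold face denotes a vector quantity. The local Figure-of-Merit $f$ is a step-like function of the local electric field $E_W$, which might be larger or smaller than a desired target electric field. Differentiating Eq. (1) with respect to the mask plane electric field $E_M$, we obtain
$$\frac{\partial \mathrm{FoM}}{\partial E_M(\mathbf{r}_M)} = \int \frac{\partial f}{\partial E_W(\mathbf{r}_W)}\, \frac{\partial E_W(\mathbf{r}_W)}{\partial E_M(\mathbf{r}_M)}\, d^2\mathbf{r}_W, \tag{2}$$
where the subscript $M$ denotes a variable defined in the mask plane. During optimization, we adjust the mask to vary $E_M$ to achieve the best possible image. To determine the partial derivative $\partial E_W(\mathbf{r}_W)/\partial E_M(\mathbf{r}_M)$, we must first express the wafer plane field in terms of the mask plane field:
$$E_W(\mathbf{r}_W) = \int \mathrm{PSF}_{M\to W}(\mathbf{r}_W - \mathbf{r}'_M)\, E_M(\mathbf{r}'_M)\, d^2\mathbf{r}'_M. \tag{3}$$
Equation (3) is a convolution integral with the point spread function for propagation from the mask to the wafer plane, $\mathrm{PSF}_{M\to W}$, which would generally require a solution of Maxwell's equations, but we use the paraxial and other approximations to determine $\mathrm{PSF}_{M\to W}$. Substituting Eq. (3) into Eq. (2), we obtain
$$\frac{\partial \mathrm{FoM}}{\partial E_M(\mathbf{r}_M)} = \int \frac{\partial f}{\partial E_W(\mathbf{r}_W)}\, \frac{\partial}{\partial E_M(\mathbf{r}_M)} \left[ \int \mathrm{PSF}_{M\to W}(\mathbf{r}_W - \mathbf{r}'_M)\, E_M(\mathbf{r}'_M)\, d^2\mathbf{r}'_M \right] d^2\mathbf{r}_W, \tag{4}$$
where $\mathbf{r}'_M$ is a dummy variable for convolution. We are interested in the derivative of the term in square brackets with respect to the variable $E_M$ at one particular position $\mathbf{r}_M$. Since $E_M(\mathbf{r}_M)$ and $E_M(\mathbf{r}'_M)$ are independently controlled variables, the derivative with respect to $E_M(\mathbf{r}_M)$ produces a delta function $\delta(\mathbf{r}'_M - \mathbf{r}_M)$ and Eq. (4) becomes
$$\frac{\partial \mathrm{FoM}}{\partial E_M(\mathbf{r}_M)} = \int \frac{\partial f}{\partial E_W(\mathbf{r}_W)}\, \mathrm{PSF}_{M\to W}(\mathbf{r}_W - \mathbf{r}_M)\, d^2\mathbf{r}_W. \tag{5}$$
Equation (5) nearly looks like a convolution integral, but $\mathrm{PSF}_{M\to W}$ is an operator that only operates on functions defined in the mask plane, and $\partial f/\partial E_W(\mathbf{r}_W)$ is in the wafer plane. This can be resolved by the reciprocity of Maxwell's equations, which dictates the reciprocal relation
$$\mathrm{PSF}_{M\to W}(\mathbf{r}_W - \mathbf{r}_M) = \mathrm{PSF}_{W\to M}(\mathbf{r}_M - \mathbf{r}_W), \tag{6}$$
where $\mathrm{PSF}_{W\to M}$ is the point spread function for propagation from the wafer plane back to the mask plane. Plugging Eq. (6) into Eq. (5), we obtain
$$\frac{\partial \mathrm{FoM}}{\partial E_M(\mathbf{r}_M)} = \int \mathrm{PSF}_{W\to M}(\mathbf{r}_M - \mathbf{r}_W)\, \frac{\partial f}{\partial E_W(\mathbf{r}_W)}\, d^2\mathbf{r}_W. \tag{7}$$
Equation (7) is indeed a convolution integral, and it is the important result that we have been seeking. It states that $\partial f/\partial E_W(\mathbf{r}_W)$ can be treated as a source electric field and propagated from the wafer plane to the mask plane to obtain $\partial \mathrm{FoM}/\partial E_M(\mathbf{r}_M)$. This is the adjoint simulation step shown on the right side of Fig. 1.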
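The forward and adjoint steps described above can be illustrated on a small periodic grid. The sketch below is a toy under stated assumptions, not the implementation used for the results in this paper: the point spread function is random, the local error is a simple quadratic intensity mismatch (an assumed stand-in for $f$), and all function names are hypothetical. It obtains the gradient at every pixel from one forward and one adjoint convolution, then spot-checks three pixels against finite differences.

```python
import numpy as np


def circ_conv2(field, kernel):
    """Circular 2-D convolution computed with FFTs (discrete analogue of Eq. (3))."""
    return np.fft.ifft2(np.fft.fft2(field) * np.fft.fft2(kernel))


def reciprocal_kernel(kernel):
    """Return k'(r) = k(-r) on a periodic grid, the discrete analogue of Eq. (6)."""
    return np.roll(kernel[::-1, ::-1], shift=(1, 1), axis=(0, 1))


def toy_fom(e_mask, psf, target_intensity):
    """Assumed toy Figure-of-Merit: summed squared intensity error at the wafer."""
    e_wafer = circ_conv2(e_mask, psf)
    return float(np.sum((np.abs(e_wafer) ** 2 - target_intensity) ** 2))


def adjoint_gradient(e_mask, psf, target_intensity):
    """dFoM/dE_M at every pixel from one forward and one adjoint convolution."""
    e_wafer = circ_conv2(e_mask, psf)                        # forward simulation
    # df/dE_W, treating E_W and its complex conjugate as independent variables
    df_dew = 2.0 * (np.abs(e_wafer) ** 2 - target_intensity) * np.conj(e_wafer)
    return circ_conv2(df_dew, reciprocal_kernel(psf))        # adjoint simulation


rng = np.random.default_rng(0)
n = 8
psf = (rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))) / n
e_mask = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
target = rng.uniform(size=(n, n))

grad = adjoint_gradient(e_mask, psf, target)


def finite_difference(pixel, eps=1e-5):
    """Central difference of the FoM with respect to the real part of one pixel."""
    step = np.zeros((n, n), dtype=complex)
    step[pixel] = eps
    return (toy_fom(e_mask + step, psf, target)
            - toy_fom(e_mask - step, psf, target)) / (2.0 * eps)


check_pixels = [(0, 0), (3, 5), (7, 2)]
fd_values = np.array([finite_difference(p) for p in check_pixels])
# For a real FoM, a real perturbation of E_M changes it by 2 Re[dFoM/dE_M].
adjoint_values = np.array([2.0 * grad[p].real for p in check_pixels])
print("largest mismatch:", np.max(np.abs(fd_values - adjoint_values)))
```

The finite-difference check needs two forward evaluations per pixel, whereas the adjoint route needs two convolutions in total, which is the scaling advantage discussed in the text.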
For a spatially incoherent system that is modeled as a sum of coherent systems, the above procedure must be executed for every angle of illumination. The gradients of the FoM with respect to the electric field must then be properly weighted over the angles of illumination.

Gradient with respect to mask transmission factor
In the geometry update, changes in mask geometry must be derived from changes in mask plane electric field. In our simple model, each pixel in the mask is either perfectly opaque or perfectly transmitting. Thus, the mask is represented by a transmission factor, $T_M$, which has values of either 0 or 1, and multiplies the incoming field. To include mask edge effects, a more complete electromagnetic model would be required. A method for including electromagnetic effects in the optimization is described in [18]. To update the mask geometry, represented by a transmission factor $T_M$, the Figure-of-Merit derivative with respect to local electric field, $\partial \mathrm{FoM}/\partial E_M$, must be related to the derivative with respect to transmission factor, $\partial \mathrm{FoM}/\partial T_M$. The mask plane field is related to the mask transmission factor by
$$E_M = T_M\, E_o \exp(i\phi_{EM}), \tag{8}$$
where $E_o$ is the normalized incident electric field magnitude and $\phi_{EM}$ is the corresponding phase.
$\partial \mathrm{FoM}/\partial T_M$ can be found by using the chain rule on $\partial \mathrm{FoM}/\partial E_M$. Care must be taken because $E_M$ is generally complex. One could take derivatives with respect to the real and imaginary parts of $E_M$. An equivalent and more convenient method is to take derivatives with respect to $E_M$ and its complex conjugate as follows:
$$\frac{\partial \mathrm{FoM}}{\partial T_M} = \frac{\partial \mathrm{FoM}}{\partial E_M}\frac{\partial E_M}{\partial T_M} + \frac{\partial \mathrm{FoM}}{\partial E_M^*}\frac{\partial E_M^*}{\partial T_M}, \tag{9}$$
where the asterisk * denotes complex conjugation. Since FoM and $T_M$ must be real, the two terms on the right hand side of Eq. (9) are complex conjugates of each other. Thus, their imaginary parts cancel out, resulting in
$$\frac{\partial \mathrm{FoM}}{\partial T_M} = 2\,\mathrm{Re}\left[\frac{\partial \mathrm{FoM}}{\partial E_M}\frac{\partial E_M}{\partial T_M}\right]. \tag{10}$$
Plugging Eq. (8) into Eq. (10), we obtain
$$\frac{\partial \mathrm{FoM}}{\partial T_M} = 2\,\mathrm{Re}\left[\frac{\partial \mathrm{FoM}}{\partial E_M}\, E_o \exp(i\phi_{EM})\right], \tag{11}$$
which translates from electric field gradient to the more operational mask transmission factor gradient.
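The conjugate-pair chain rule can be checked numerically on a single pixel. The sketch below assumes a toy real-valued Figure-of-Merit, $(|E_M|^2 - I_t)^2$, and arbitrary illustrative values for the incident magnitude, phase and target; none of these numbers come from the paper. It compares the expression $2\,\mathrm{Re}[(\partial \mathrm{FoM}/\partial E_M)\, E_o \exp(i\phi_{EM})]$ with a direct finite difference in the transmission factor.

```python
import cmath

E_O = 1.3          # assumed incident field magnitude (illustrative)
PHASE = 0.7        # assumed incident phase in radians (illustrative)
I_TARGET = 0.4     # assumed target intensity for the toy FoM


def mask_field(t):
    """Mask plane field for transmission factor t: E_M = t * E_o * exp(i * phase)."""
    return t * E_O * cmath.exp(1j * PHASE)


def toy_fom(e_field):
    """Assumed toy real-valued Figure-of-Merit of one complex field value."""
    return (abs(e_field) ** 2 - I_TARGET) ** 2


def dfom_de(e_field):
    """Derivative with respect to E, treating E and its conjugate as independent."""
    return 2.0 * (abs(e_field) ** 2 - I_TARGET) * e_field.conjugate()


def dfom_dt(t):
    """Transmission gradient via the conjugate-pair chain rule."""
    return 2.0 * (dfom_de(mask_field(t)) * E_O * cmath.exp(1j * PHASE)).real


t0 = 0.6
eps = 1e-6
finite_diff = (toy_fom(mask_field(t0 + eps)) - toy_fom(mask_field(t0 - eps))) / (2 * eps)
chain_rule = dfom_dt(t0)
print(finite_diff, chain_rule)
```

Because the imaginary parts of the two conjugate terms cancel, only the real part survives, and it agrees with the finite difference taken directly in the real variable.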
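For a spatially incoherent system modeled as a sum of coherent systems, $\partial \mathrm{FoM}/\partial T_M$ can be expressed as the total derivative with respect to the electric field of equally weighted angles of illumination:
$$\frac{\partial \mathrm{FoM}}{\partial T_M} = \sum_n 2\,\mathrm{Re}\left[\frac{\partial \mathrm{FoM}}{\partial E_{Mn}}\, E_o \exp(i\phi_{EMn})\right], \tag{12}$$
where $E_{Mn}$ is the mask plane electric field for the angle of illumination indexed by the integer n.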
$\partial \mathrm{FoM}/\partial T_M$ is the gradient with respect to the operational mask design parameters, and provides information about how the Figure-of-Merit changes as the transmission factor $T_M$ changes at each point in space. Gradient descent, as in Newton's method for solving polynomial equations, operates by changing the mask transmission in proportion to the rate of change of the Figure-of-Merit, in the direction that reduces the error:
$$\Delta T_M \propto -\frac{\partial \mathrm{FoM}}{\partial T_M}, \tag{13}$$
where $\Delta T_M$ is the change in $T_M$ at a given iteration. As an optimum is approached, and the derivative approaches zero, the changes in $T_M$ become smaller and smaller.
To model a binary amplitude mask, such as those used in EUV lithography, we constrain $T_M$ to only take values of 0 or 1. Since the mask transmission is binary and does not take continuous values, the geometry update differs slightly from conventional gradient descent. Pixels in the mask are flipped only with the correct sign of $\partial \mathrm{FoM}/\partial T_M$, and only when the gradient magnitude exceeds a threshold. The threshold is adjusted several times within each iteration to find the best improvement in the Figure-of-Merit. In this way, the iterative optimization procedure is well defined.
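The binary update rule can be sketched as follows. This is a minimal illustration under assumptions: the Figure-of-Merit is an error to be minimized, the candidate thresholds are taken as fractions of the peak gradient magnitude (the text does not say how thresholds are chosen), and `flip_pixels` and `best_threshold_update` are hypothetical names.

```python
import numpy as np


def flip_pixels(mask, grad, threshold):
    """Flip binary pixels whose gradient favours a flip and exceeds the threshold.

    mask: array of 0/1 transmission values (T_M)
    grad: dFoM/dT_M for an error-type FoM that is to be minimized
    """
    mask = np.asarray(mask, dtype=int)
    grad = np.asarray(grad, dtype=float)
    # Opaque pixels can only increase T_M, which helps when the gradient is negative.
    open_up = (mask == 0) & (grad < -threshold)
    # Transmitting pixels can only decrease T_M, which helps when the gradient is positive.
    close_off = (mask == 1) & (grad > threshold)
    new_mask = mask.copy()
    new_mask[open_up] = 1
    new_mask[close_off] = 0
    return new_mask


def best_threshold_update(mask, grad, fom, fractions=(0.75, 0.5, 0.25)):
    """Try several thresholds (assumed fractions of the peak |gradient|), keep the best."""
    best_mask, best_value = np.asarray(mask, dtype=int), fom(mask)
    peak = np.max(np.abs(grad))
    for fraction in fractions:
        candidate = flip_pixels(mask, grad, fraction * peak)
        value = fom(candidate)
        if value < best_value:
            best_mask, best_value = candidate, value
    return best_mask, best_value


# Tiny demonstration with a made-up gradient and a made-up error measure.
start = np.array([0, 1, 0, 1])
gradient = np.array([-1.0, 0.8, 0.5, -0.2])
wanted = np.array([1, 0, 0, 1])
flipped = flip_pixels(start, gradient, 0.4)
updated, error = best_threshold_update(
    start, gradient, lambda m: int(np.sum(np.abs(np.asarray(m) - wanted))))
print(flipped, updated, error)
```

Pixels whose gradient sign would require moving $T_M$ outside the range 0 to 1 are left alone, which is what distinguishes this update from a continuous gradient step.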

Figure-of-merit
The Figure-of-Merit that we have preferred in these optimizations is the total area of the error region, in which the printed pattern differs from the desired pattern. That area must be minimized. This Figure-of-Merit is illustrated by the grey region in Fig. 3. Thus, the Figure-of-Merit is the integral over the error region,
$$\mathrm{FoM} = \int \big| P_d(\mathbf{r}_W) - P_a\big(E_W(\mathbf{r}_W)\big) \big|\, d^2\mathbf{r}_W, \tag{14}$$
where $P_d$ and $P_a$ are binary functions defining the desired and actual printed patterns, respectively. The printed pattern is defined as
$$P_a = \begin{cases} 1, & |E_W|^2 > I_{th} \\ 0, & \text{otherwise,} \end{cases} \tag{15}$$
where $I_{th}$ is the exposure threshold for electric field intensity. Anywhere the intensity is greater than $I_{th}$, $P_a$ is set to 1. Otherwise, its value is 0.
The integrand of Eq. (14), $|P_d(\mathbf{r}_W) - P_a(E_W(\mathbf{r}_W))| \equiv f$, must be differentiated to obtain the wafer plane gradient $\partial f/\partial E_W$. Unfortunately, $P_a$ is not differentiable. Therefore, it is replaced by the continuous logistic function
$$P'_a = \frac{1}{1 + \exp\left[-A\left(|E_W|^2 - I_{th}\right)\right]}, \tag{16}$$
where $A$ is a parameter defining the slope of the continuous differentiable function $P'_a$. To differentiate $f$, we replace the absolute magnitude with the square root of its square.
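The smoothed local error and its derivative can be sketched numerically. The code below assumes the standard logistic form with slope parameter $A$ and threshold $I_{th}$; the derivative is worked out here from those assumptions rather than copied from the paper's own equations, and the sample field values, slope and function names are illustrative.

```python
import numpy as np


def printed_pattern_smooth(e_wafer, i_th, slope):
    """Logistic stand-in for the binary printed pattern (assumed standard form)."""
    intensity = np.abs(e_wafer) ** 2
    return 1.0 / (1.0 + np.exp(-slope * (intensity - i_th)))


def local_error(e_wafer, desired, i_th, slope):
    """f = sqrt((P_d - P'_a)^2), the absolute value written as a root of a square."""
    return np.sqrt((desired - printed_pattern_smooth(e_wafer, i_th, slope)) ** 2)


def local_error_gradient(e_wafer, desired, i_th, slope):
    """df/dE_W, treating E_W and its complex conjugate as independent variables."""
    p = printed_pattern_smooth(e_wafer, i_th, slope)
    # d|E|^2/dE = conj(E); dP'/dI = slope * P' * (1 - P'); d|x|/dx = sign(x)
    return -np.sign(desired - p) * slope * p * (1.0 - p) * np.conj(e_wafer)


# Illustrative values only.
e_wafer = np.array([0.9 + 0.2j, 0.3 - 0.4j, 0.6 + 0.5j])
desired = np.array([1.0, 0.0, 1.0])
I_TH, SLOPE = 0.5, 10.0

grad = local_error_gradient(e_wafer, desired, I_TH, SLOPE)

eps = 1e-7
finite_diff = (local_error(e_wafer + eps, desired, I_TH, SLOPE)
               - local_error(e_wafer - eps, desired, I_TH, SLOPE)) / (2 * eps)
# A real perturbation of E_W changes the real-valued f by 2 Re[df/dE_W].
print(finite_diff, 2.0 * grad.real)
```

A larger slope makes the logistic closer to the hard threshold but concentrates the gradient near the printed edge, so the choice of $A$ trades fidelity against how far from the edge the optimizer receives a usable signal.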
For a spatially incoherent system modeled as a sum of coherent systems, $|E_W|^2$ in Eq. (15) is replaced by $\sum_n |E_{Wn}|^2$, where $E_{Wn}$ is the wafer plane electric field resulting from one angle of illumination, indexed by the integer n. Differentiation with respect to $E_{Wn}$ proceeds similarly to Eqs. (16)-(23).

Results
To test the method outlined in the previous sections, we consider a single lens lithography system as shown in Fig. 2(b) that incorporates the aberrations to be expected in an equivalent single mirror EUV system as shown in Fig. 2(a). The magnification is 0.25, as is the convention in photolithography. The lens/mirror diameter is D = 30 cm, with a numerical aperture at the wafer plane $NA_W = 0.33$. This leads to a mirror surface-to-wafer distance $S' = (D/2)/\tan(\sin^{-1} NA_W) = 42.9082$ cm. The mirror focusing equation, $2/R = 1/S' + 1/S$, leads to a mirror surface-to-mask distance S = 171.6328 cm, and a mirror radius of curvature R = 68.6532 cm. For these dimensions, a spherical mirror, relative to an ideal parabolic mirror, has aberrations amounting to >10000λ for λ = 13.5 nm. We assign 6 significant figures to the mirror radius of curvature owing to the need to specify the mirror surface within ~0.1λ, as is common in high precision optics. Indeed, we have found that even a ~0.1λ phase shift at the edge of the mirror produces ~10% errors in the test pattern features, unless the mask is redesigned to account for the newly shifted mirror surface.
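The system dimensions quoted above can be reproduced from the three stated inputs (diameter, wafer-side numerical aperture and magnification). The sketch below is plain arithmetic; the variable names are illustrative.

```python
import math

DIAMETER_CM = 30.0       # lens/mirror diameter D
NA_WAFER = 0.33          # numerical aperture at the wafer plane
MAGNIFICATION = 0.25     # image size / object size

half_aperture = DIAMETER_CM / 2.0

# Mirror-to-wafer distance from the marginal ray angle, S' = (D/2) / tan(asin(NA_W)).
s_wafer = half_aperture / math.tan(math.asin(NA_WAFER))

# Mirror-to-mask distance from the magnification, S = S' / 0.25.
s_mask = s_wafer / MAGNIFICATION

# Radius of curvature from the mirror focusing equation, 2/R = 1/S' + 1/S.
radius_of_curvature = 2.0 / (1.0 / s_wafer + 1.0 / s_mask)

print(f"S' = {s_wafer:.4f} cm, S = {s_mask:.4f} cm, R = {radius_of_curvature:.4f} cm")
```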
Six discrete plane waves are used for illumination. These points were chosen to give the illumination some of the characteristics of an extended dipole source. The illumination pattern used is shown in Fig. 4. Our ILT mask solutions do correct aberrations very well, within the diffraction limit of the six selected illumination angles, but our solutions fail to accommodate the broad power from an extended incoherent source. We can model the incoherent source with more plane waves, but within the diffraction limit the number of plane waves would eventually equal the number of pixels. Each incident illumination angle imposes an additional constraint. For a totally incoherent source, the computation would not be manageable, nor would there be enough pixels in the mask to satisfy the multi-faceted constraints. Thus, ILT aberration correction is most suited to a partially coherent illumination source, like a laser.
Calculations of the basic Eq. (3) are executed in MATLAB, using fast Fourier transforms to compute the convolution with the point spread function. More mathematical details are included in Appendix A.
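An FFT-based image calculation of this kind can be sketched in a few lines. The version below is written in Python with NumPy rather than MATLAB and is a simplified illustration, not the authors' code: the mask is sampled directly in wafer-scale coordinates on a periodic grid, each illumination plane wave is handled by shifting the pupil in frequency space, the intensities are averaged with equal weights, and the pupil phase sign convention, the source points and the function name `aerial_image` are assumptions.

```python
import numpy as np


def aerial_image(mask, pixel_nm, wavelength_nm, na, sigmas, opd_waves=None):
    """Wafer intensity for a periodic mask under a few illumination plane waves.

    mask:       2-D transmission array sampled at wafer scale (assumption)
    sigmas:     list of (sigma_x, sigma_y) source points in units of the NA
    opd_waves:  optional function of normalized pupil radius returning the
                optical path difference in wavelengths (assumed sign convention)
    """
    n_y, n_x = mask.shape
    fx = np.fft.fftfreq(n_x, d=pixel_nm)            # spatial frequency, cycles/nm
    fy = np.fft.fftfreq(n_y, d=pixel_nm)
    fxx, fyy = np.meshgrid(fx, fy)
    cutoff = na / wavelength_nm                     # coherent cutoff frequency
    mask_spectrum = np.fft.fft2(mask)
    image = np.zeros(mask.shape)
    for sigma_x, sigma_y in sigmas:
        # A tilted plane wave is equivalent to a pupil shifted by the source frequency.
        rho = np.hypot(fxx + sigma_x * cutoff, fyy + sigma_y * cutoff) / cutoff
        pupil = (rho <= 1.0).astype(complex)
        if opd_waves is not None:
            pupil *= np.exp(2j * np.pi * opd_waves(rho))
        field = np.fft.ifft2(mask_spectrum * pupil)  # convolution with the PSF via FFT
        image += np.abs(field) ** 2                  # incoherent sum of intensities
    return image / len(sigmas)


# Hypothetical two-point source and a single rectangular opening on a 64 x 64 nm cell.
source_points = [(-0.8, 0.0), (0.8, 0.0)]
clear_field = aerial_image(np.ones((64, 64)), 1.0, 13.5, 0.33, source_points)
test_mask = np.zeros((64, 64))
test_mask[21:43, 25:39] = 1.0                        # a 14 nm x 22 nm opening
feature_image = aerial_image(test_mask, 1.0, 13.5, 0.33, source_points,
                             opd_waves=lambda rho: 0.05 * rho ** 4)
print(clear_field.mean(), feature_image.max())
```

Normalizing by the clear-field result gives intensities on the same scale as the figures, where an exposure threshold can then be applied.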

Correcting severe spherical aberration
We use a test pattern from an industry presentation [24], which is shown by the dashed lines in Fig. 5(b). The pattern consists of six 14 nm × 22 nm features and one 14 nm × 44 nm feature. The features are placed 50 nm apart in the x-direction, and 22 nm apart in the y-direction. These dimensions should be compared with a diffraction limit $\lambda/(4NA_W) \approx 10$ nm for an EUV wavelength λ = 13.5 nm. The features are ellipsoidal to avoid sharp corners below the diffraction limit. The pattern in Fig. 5(a) and 6(a) is one unit cell of a periodic naïve mask, identical to the desired test pattern. The exposure threshold is taken to be half the clear field intensity.
For an un-aberrated case, the resulting wafer plane intensity and printed pattern are shown in Fig. 5(b). For the spherically aberrated case (corresponding to a 30 cm diameter focusing mirror), the wafer plane intensity and printed pattern are shown in Fig. 6(b). In the un-aberrated case, Fig. 5, all the features print. In Fig. 6, the high spherical aberration produces 4 missing features and 3 unacceptable features. This spherical aberration relative to a perfect parabolic reflector has a peak value of >10000λ (>140 μm) based on Eq. (31) in Appendix A. The wafer is readjusted to the plane of best focus for this level of spherical aberration, ~2.6 mm closer to the mirror than the best focus in the un-aberrated case. The NA of the system is 0.33, the demagnification is 4, and the wavelength is 13.5 nm. The lateral radius of the mirror is 15 cm. The image was taken at the center of the field. This naïve mask is used as the starting geometry for the optimization.
Using the test pattern in Fig. 6(a) as a starting point, we used our adjoint inverse solver, Eq. (13), to optimize the mask. The pixel size for the simulation during optimization is 0.25 nm at the wafer plane and 1 nm at the mask plane. After optimization, the mask solution is tested with a smaller pixel size of 0.16 nm at the mask plane, for validation. This change in pixel size is done to ensure the critical dimensions of the final shape are computed to within less than 1% accuracy during final analysis. This accuracy is not critically needed during optimization, but is important in validation. The optimized mask appearance is shown in Fig. 7(a). In the intensity profile of Fig. 7(b) all critical dimensions were achieved to within 5%. The optimization took 148 iterations to converge. Roughly 3 pixel-flip thresholds were compared per iteration, as discussed at the end of section 3.2. Additionally, a radius of curvature constraint <12 nm was imposed on the mask at each iteration. This radius of curvature constraint is applied by morphological opening as described in [25].
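Morphological opening removes mask features that are too small to contain a chosen structuring element. The sketch below is one plain NumPy way to do this with a disk on a periodic grid; it is an illustration rather than the routine used for the results here, and the disk radius in pixels, the periodic wrap-around and the function names are assumptions.

```python
import numpy as np


def disk_offsets(radius_px):
    """Pixel offsets (dy, dx) that lie inside a disk of the given radius."""
    r = int(radius_px)
    return [(dy, dx)
            for dy in range(-r, r + 1)
            for dx in range(-r, r + 1)
            if dy * dy + dx * dx <= radius_px ** 2]


def erode(mask, offsets):
    """Keep a pixel only if the whole disk centred on it lies inside the mask."""
    out = np.ones(mask.shape, dtype=bool)
    for dy, dx in offsets:
        out &= np.roll(mask, shift=(dy, dx), axis=(0, 1))
    return out


def dilate(mask, offsets):
    """Set every pixel covered by a disk centred on any mask pixel."""
    out = np.zeros(mask.shape, dtype=bool)
    for dy, dx in offsets:
        out |= np.roll(mask, shift=(dy, dx), axis=(0, 1))
    return out


def opening(mask, radius_px):
    """Morphological opening: erosion followed by dilation with the same disk."""
    offsets = disk_offsets(radius_px)
    return dilate(erode(np.asarray(mask, dtype=bool), offsets), offsets)


# Demonstration: a 12 x 12 block survives with rounded corners, a lone pixel is removed.
demo = np.zeros((32, 32), dtype=bool)
demo[8:20, 8:20] = True
demo[26, 26] = True
opened = opening(demo, radius_px=2)
print(int(demo.sum()), int(opened.sum()))
```

Applying the opening after each geometry update keeps every transmitting region at least as wide as the disk; applying the same operation to the inverted mask would constrain the opaque regions in the same way.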
A coarser, pixelated version of the optimized mask in Fig. 7 is shown in Fig. 8, with the mask constructed from 14 nm × 15 nm rectangles to make the mask conform to Manhattan geometry. After pixelation, all critical dimensions are still achieved to within 8%. This demonstrates the robustness of the mask design.
Fig. 8. The same optical system as in Fig. 7, with the mask pixelated. The pixels are 14 nm × 15 nm. All critical dimensions are within 8% of their target.

Depth-of-focus optimization
We have also performed mask optimization as a function of focal depth. To do this, we began with the optimal mask at focus, and then optimized the Figure-of-Merit at 4 planes: −50 nm, −30 nm, −10 nm, and +10 nm relative to the initial optimal plane, to investigate a 60 nm depth of focus. The Figure-of-Merit is the sum of area errors, as in Eq. (14), summed over all 4 image planes. Figure 9 shows the mask and wafer field at nominal focus resulting from this optimization. Roughly 5 pixel-flip thresholds were compared per iteration, as discussed at the end of section 3.2. This optimization required 339 iterations. Figure 10(a) shows a Bossung plot for the worst performing feature for the mask optimized at focus, and Fig. 10(b) for the mask simultaneously optimized at the four different planes from −50 nm to +10 nm. The sharp jumps seen in the plots correspond to changes in the location of the worst performing feature. For the mask optimized through focus, all critical dimensions remain within 11% for the full 60 nm of defocus at nominal dose. The dose sensitivity, not optimized here, could be expected to improve if optimized.
Fig. 10. Bossung plots for the worst performing feature for the masks optimized (a) at focus, and (b) for depth-of-focus. For the mask optimized through focus, all critical dimensions remain within 11% of their targets for 60 nm of defocus at nominal dose. The sharp jumps seen in the plots correspond to changes in the location of the worst performing feature.

Off-axis aberration correction
We have considered severe spherical aberrations, combined with a depth of field requirement. Now we consider off-axis imaging, which includes spherical aberration, coma, and astigmatism, all severe owing to the use of an uncorrected spherical optic. The additional aberrations change the point spread function used in simulation according to Eqs. (25), (26), and (31) in Appendix A. The off-axis points are in a 33 mm × 26 mm wafer and are shown in Fig. 11. The mid-field point 6.5 mm off-center is ~0.9° off-axis, and the field edge point 13 mm off-center is ~1.8° off-axis. The mid-field point experiences >4000λ of coma and >230λ of astigmatism in addition to the spherical aberration present in the on-axis case. The field edge point experiences >9000λ of coma, and >900λ of astigmatism. For simplicity we do not account for aberration variation within the 150 nm × 132 nm test pattern unit cell. This variation is relatively small, but must be taken into account in an industrial application.
The results for the mid-field optimization are shown in Fig. 12. The optimized mask for the on-axis case, Fig. 7(a), was used as the starting point for this optimization. After the final iteration, the critical dimensions are within 2% of the desired target. During optimization, the simulation pixel size was 2 nm at the mask plane. As before, the pixel size was reduced to 0.16 nm at the mask plane to accurately validate the critical dimensions after the last iteration. This optimization took 313 iterations with roughly 3 pixel-flip thresholds tested per iteration, as discussed at the end of section 3.2.
The results for the field edge case are shown in Fig. 13. The optimized mask for the mid-field case was used as the starting mask for this optimization. After the final iteration, all critical dimensions are within 3% of their desired target. During optimization, the simulation pixel size was 2 nm at the mask plane. For validation after the last iteration, the pixel size was reduced to 0.16 nm at the mask plane. This optimization took 185 iterations with roughly 3 pixel-flip thresholds tested per iteration, as discussed at the end of section 3.2.
A comparison of Fig. 7(a), on-axis, Fig. 12(a), 6.5 mm off-axis, and Fig. 13(a), 13 mm off-axis, shows completely different mask solutions, even though the test pattern was identical. The mask solution is sensitive to the exact level of aberrations. A mask solution at the center of a chip would be different from a mask solution at the edge of a chip. Even with a repeating pattern as in a DRAM chip, the mask would be aperiodic, and computationally intensive to design.
We have considered spherical aberration, and the off-axis aberrations coma and astigmatism. Additional aberrations can be trivially included in the current model by adding more terms to the phase shift at the mirror, thereby changing the point spread function. Additionally, in an industrial application, the Figure-of-Merit should include tolerances toward exposure dose and errors in photomask fabrication. Since off-axis aberrations vary across the chip-field, a global optimization across the whole chip would be required.
In addition, electromagnetic edge effects in the mask, and angle-dependent mirror reflectivity, must also be accounted for in the simulation. This does not pose problems to the optimization method, since our implementation of the adjoint method can wrap around any Maxwell solver.

Conclusion
We have shown that, under a partially coherent source, like a laser, Inverse Lithography Technology can allow EUV lithography to proceed in spite of severe aberrations (as would be produced by a single-mirror imaging system). By reducing from 6 mirrors to 1 mirror, the power wasted by the projection optics would be reduced by ~7×, owing to the diminished mirror losses with fewer mirrors. Since ILT is needed for mask optimization, the strategy of also using it for aberration correction seems well warranted. We have successfully designed photomasks to print test patterns in the presence of severe spherical aberration, including off-axis coma and astigmatism, and with the requirement of a 60 nm depth of focus.
If we force current incoherent EUV sources to produce a six-beam illumination pattern as in Fig. 4, the throughput would be very limited. Thus a partially coherent EUV source, like a laser, should warrant more scientific and technological effort.

Appendix A: Linear system model of the lithographic imaging system
This section describes the linear system model based on the paraxial approximation used in our simulations. This model can be found in textbooks [1,26,27].
To simulate the projection optics, contributions to wafer intensity from different angles of illumination are considered. The electric field transmitted from the mask from one illuminating plane wave, indexed by the integer n, is denoted $E_{Mn}$. The optical transfer function, OTF, is expressed in terms of the normalized radial coordinate $\rho$ and the azimuthal angle $\phi$; $f_x$ and $f_y$ are the spatial frequencies in the x and y directions, respectively, $NA_W$ is the numerical aperture at the wafer plane, and OPD is the optical path difference defined by the aberrations present in the system. The intensity at the wafer plane is
$$I_W(\mathbf{r}_W) = \sum_n \left| \int \mathrm{PSF}_{M\to W}(\mathbf{r}_W - \mathbf{r}'_M)\, E_{Mn}(\mathbf{r}'_M)\, d^2\mathbf{r}'_M \right|^2, \tag{27}$$
where $\mathrm{PSF}_{M\to W}$ is the inverse Fourier transform of OTF in Cartesian coordinates. $I_W$ is the sum of the intensities from each plane wave, $E_{Mn}$. Equation (27) is a modification of Eq. (3) for a spatially incoherent system.

A.1. Aberration wavefronts
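Figure 2(a) shows a one-mirror imaging system. The relationship between radius of curvature R, mirror-to-mask distance S, and mirror-to-wafer distance S′ is
$$\frac{2}{R} = \frac{1}{S} + \frac{1}{S'}. \tag{28}$$
If the imaging is on-axis and the height of the mirror at the edges is ignored, the numerical aperture at the wafer plane is
$$NA_W = \sin\left[\tan^{-1}(a/S')\right], \tag{29}$$
with the corresponding expression $NA_M = \sin\left[\tan^{-1}(a/S)\right]$ at the mask, where $NA_W$ is the numerical aperture at the wafer, $NA_M$ is the numerical aperture at the mask, and a is the lateral radius of the mirror (a = D/2 from Fig. 2). The magnification of the system is, in magnitude,
$$|M| = \frac{S'}{S}. \tag{30}$$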

Fig. 3. An example Figure-of-Merit calculation at the wafer plane. The color map shows electric field intensity. The desired pattern, $P_d$, is outlined by the black dashed line. The actual printed pattern, $P_a$, is outlined in cyan. The "error region", $|P_d - P_a|$, is shown in gray. This error region is integrated to obtain the Figure-of-Merit.

Fig. 4. Illumination pattern. $\sigma_x = \sin\theta_x/NA_W$ and $\sigma_y = \sin\theta_y/NA_W$. These six plane waves were chosen to give the illumination some of the characteristics of an extended dipole source, such as the one outlined in black. The four $\sigma_x$ values are −0.8182, −0.2727, 0.2727, and 0.8182. The three $\sigma_y$ values are −0.3099, 0, and 0.3099.

Fig. 5. (a) Mask and (b) wafer plane intensity (normalized to clear field) for an optical system with a naïve mask and no aberrations in the on-axis position. The pattern is periodic, with one unit cell shown. The NA of the system is 0.33, the demagnification is 4, and the wavelength is 13.5 nm.

Fig. 7. (a) Mask and (b) wafer plane intensity (normalized to clear field) for an optical system with an optimized mask. The simulation conditions are the same as in Fig. 6. With this optimized mask, all critical dimensions are within 5% of their target.

Fig. 9. (a) Mask and (b) wafer plane intensity (normalized to clear field) for an optical system with a mask optimized to perform through 60 nm of defocus. The simulation conditions are the same as in Fig. 5. With this mask, all critical dimensions are within 7% of their target at focus, and remain within 11% through 60 nm of defocus.

Fig. 11. A diagram showing the three points on the wafer we designed masks for. The wafer was assumed to be 33 by 26 mm. The mid-field point is displaced 6.5 mm from the optical axis and has >4000 wavelengths of coma and >230 wavelengths of astigmatism (peak value, using the convention in Eq. (31) in Appendix A). The field edge point is displaced 13 mm and has >9000 wavelengths of coma, and >900 wavelengths of astigmatism.

Fig. 12. (a) Mask and (b) wafer plane intensity (normalized to clear field) after optimization for the mid-field location 6.5 mm off-axis. The mask resulting from the on-axis optimization was used as the starting mask for this optimization. All critical dimensions are within 2% of their target.