Generalization of Zernike polynomials for regular portions of circles and ellipses

Zernike polynomials are commonly used to represent the wavefront phase on circular optical apertures, since they form a complete and orthonormal basis on the unit circle. Here, we present a generalization of this Zernike basis for a variety of important optical apertures. On the contrary to ad hoc solutions, most of them based on the Gram-Smith orthonormalization method, here we apply the diffeomorphism (mapping that has a differentiable inverse mapping) that transforms the unit circle into an angular sector of an elliptical annulus. In this way, other apertures, such as ellipses, rings, angular sectors, etc. are also included as particular cases. This generalization, based on in-plane warping of the basis functions, provides a unique solution, and what is more important, it guarantees a reasonable level of invariance of the mathematical properties and the physical meaning of the initial basis functions. Both, the general form and the explicit expressions for most common, elliptical and annular apertures are provided.


I. INTRODUCTION
The problem of finding complete orthonormal systems to represent functions defined on finite supports with a given geometry appears in many areas of Physics and Engineering. In particular, Zernike circle polynomials [ 1 ] are widely used to represent optical path differences (phase differences or wave aberrations) in wavefronts, or even the sag of optical surfaces (such as the human cornea [2,3]) as they are well adapted to the circular shape of a majority of conventional optical systems. There is an infinite number of possible systems, but Zernike polynomials (ZPs) (or lineal combinations of them [4]) show important advantages and interesting properties. Among these properties, ZPs permit to establish a link with the traditional Seidel theory of aberrations [5], which is based on a third order Taylor series expansion, and with further extensions of the Seidel theory to 5th order, etc. On the one hand, the monomials of the Taylor series have a clear physical meaning as they represent different types of aberrations (defocus, astigmatism, coma, etc.) but are not orthogonal, which limits both theoretical and practical developments. On the other hand, higher order Zernike polynomials contain lower order terms, as a necessary balance to get zero average [6]. As a result of this cross-talk between higher and lower orders, the link of the ZPs to the Seidel theory is not evident. Nevertheless, the theoretical and practical advantages of orthonormal polynomials, make that ZPs became the standard way to describe the phase of wavefronts [7] (or the wave aberration or optical path differences) in many fields ranging from atmospheric optics [8], optical design and testing [9] or visual optics (the ANSI Z80.28 standard for reporting aberrations in the human eye is based on ZPs). Even though the circle is the most common optical aperture, there are other geometries, such as the annular pupils in large telescopes [ 10 ] or rectangular in anamorphic systems, etc. Furthermore, even in the case of circular apertures, the effective pupil becomes elliptical for off-axis field angles [11]. The eccentricity of the ellipse increases with field angle, and can reach high values for wide angle lenses. The importance of this problem motivated the development of a series of ad hoc solutions, most of them based on the Gram-Smith (G-S) method to obtain orthonormal basis on different types of apertures [12,13] such as ellipses [11], rectangles [14], annuli [15], circular sectors [16], etc. In some particular cases, such as rectangles, Legendre [14] or Chebyshev polynomials [17] were also proposed, but their integration within Seidel or Zernike theoretical frameworks is difficult. An extensive catalogue of polynomial basis functions on different types of apertures can be found in [18]. The G-S method has several advantages but also important drawbacks. The main advantage is that it is a quite general linear method, so that the resulting basis functions are linear combinations of the initial Zernike circle polynomials. The main drawbacks are that there is not a unique solution (the final basis may depend on the ordering of the initial system, or on the particular implementation [19], refining algorithms [20,21], etc.) and that one has to find an ad hoc solution for every type of aperture [12][13][14][15][16][17]. These, in turn, hinder the physical interpretation of the associated expansion coefficients (especially for the higher orders due to a cumulative effect associated to the G-S method). In addition, the G-S method is especially well-suited for numerical implementation, which means even further optimization for specific parameters of the apertures (eccentricity or orientation of the ellipse, radius of the central obscuration, etc.). Somewhat more general analytical expressions can be obtained too using a nonrecursive method [22], but the computational cost may dramatically increase with the order of the polynomial, which may become an effective limitation [11].
In this context, our goal was to develop a general framework able to provide a common formulation under a unique criterion and providing a unique general solution for most of the usual optical apertures. Our approach is based in finding the mapping that transforms the unit circle into the desired aperture geometry. That is finding the diffeomorphism (i.e. a mapping that has a differentiable inverse mapping) [23] that transforms the unit circle into the connected set within the plane that represents the optical aperture. This mapping means warping (and rotating within the plane) the input basis functions so that they fit into the new aperture geometry. On the contrary that ad hoc solutions, that warping permits not only unicity, but also a high level of invariance of the mathematical properties and physical meaning of the basis functions (tilt, defocus, astigmatism, coma, etc.) and hence a natural generalization of the aberration theory. One of the simplest possible mappings consists of the affine transformation (composed of scaling along x and y and rotation) which maps the circle into the ellipse. Here the resulting basis functions are linear combinations of the initial ones (i.e. polynomials), and the associated metrics are proportional (i.e. Euclidean). Other mappings, in particular those transforming the circle into (circular or elliptical) annuli, are non-linear and thus, in general, the warped basis functions are not polynomials and the associated metric may not be Euclidean.
Here we will consider the angular sector of an elliptical annulus with arbitrary orientation as our most general case of mapping of the circle, since other geometries such as circles, ellipses, annuli, sectors, etc. correspond to particular values of the parameters of the general sector. Square, rectangular, hexagonal, etc. geometries are not considered in the present study.

II. THEORETICAL BASIS
The generalization of unit circle polynomials (or, in general, any complete and orthogonal set of basis functions in the circle) to deformations or partitions of the disc can be achieved by applying a diffeomorphism of the unit circle (or disc) D into a connected set M within the plane (see Figure 1): The diffeomorphism is an especially useful transformation in this context, since it is a bijection and its inverse exists and is differentiable as well: and the Jacobian of the inverse transformation 1   will be ( , ) : ; and taking into account that ( , ) dudv J x y dxdy  , then we have: where is the area of the unit circle D. This means that the new functions resulting from this change of variables 1 ( , ) : are orthonormal on M with metric ( , ) J x y dxdy . We can obtain a further generalization by considering the product of these functions with a continuous function ( , ) Q x y so that: The resulting functions are orthonormal on M with metric 2 ( , ) ( , ) Q x y J x y dxdy  . In particular, we can take the trivial case ( , ) 1 Q x y  , so that ( , ) ( , ) ( ( , )) ( ( , )) ( , ) that is the expansion of f on the basis set Z j , where the coefficients are given by the projections of ( , ) f u v on the basis functions: If we now apply the change of coordinates to the definition of the new basis functions, and solve for Z j we have: As we discuss further below, despite the complexity added by the non-Euclidean metric to the computation of inner products or to the weighted least squares fit, it seems more convenient to set simple re-normalization factor.

III. STANDARD PORTIONS OF CIRCLES AND ELLIPSES
In this section we particularize the above general formulation to standard partitions of circles and ellipses (annuli, angular sectors, etc.). To this end we consider the angular sector G of an elliptical annulus with arbitrary orientation as the most general mapping  considered here. As shown in Table I, the geometry of this general sector G is given by 6 parameters, whereas other shapes (annuli, ellipses, etc.) are obtained as particular cases for specific values of these parameters. The mapping from the unit circle (radius R = 1) into the ellipse can be obtained through a linear affine transformation that is the composition of scaling x and y (to obtain semi axes A and B with B ≤ A) and rotation by angle  (formed by the major axis A with the x axis). The (concentric) elliptical annulus requires another parameter 0 1 h   , that is the proportionality constant between its inner and outer elliptical boundaries, are the inner semiaxes). Finally the angular sector will be the area inside the angular interval  1 2 ,   . Thus, our general mapping will be determined by six parameters: A, B, , h, 1  and 2  . This also includes the mapping of the unit circle into another circle with arbitrary radius We can obtain a more compact expression for the mapping  by considering it as the composition of two mappings 2 1      . In the second mapping 2  we apply a rotation by angle  so that: that is a change to the scaled variables ( , ) : ( , Now we can apply the mapping 1  that transforms the unit circle into the angular ring sector: were defined above. Then, the complete inverse mapping 1 As a result of the composition of two transformations, we have an intermediate change of , are now confined within the sector G depicted in Figure 2. The angular interval is 1 In Eq. 14a, we can see that The Jacobian of this transform is It is noteworthy that the Jacobian is not constant only for annuli and annular sectors (see Table I).

A. General system
Now, we can write the new basis functions on G. We will consider ( , ) 1 Q x y  : Functions G j form a complete orthonormal system on G, the general angular sector of the elliptical ring, with metric (differential of surface area) : ( , ) ( cos , sin ) ds J x y dxdy J r r rdrd . In the Appendix it is shown that functions G j : (1) are orthonormal (i.e. their inner products are Kronecker's deltas); (2) any square-integrable function, defined on G can be expressed as a linear combination of functions G j ; and (3) when the general sector of an elliptical annulus tends to the unit circle, that is when . Table I lists the ranges of the parameters as well as the Jacobian (Eq. 15) for the different particular geometries (annulus, sectors, ellipses, etc.)

B. Zernike polynomials
In what follows we will consider that If we apply the mapping described above, we obtain

IV. PARTICULAR CASES: ELLIPSES AND ANNULI
In this Section we analyze in detail two cases mostly relevant in optics: ellipses and annuli, departing from the system of Zernike circle polynomials.

A. Elliptical apertures
As we said above, the mapping from the unit circle into an ellipse is a linear transformation involving three parameters A, B (it is common to use the eccentricity were m n N and | | m n R were given in Eq.17b and the variables ( , ) r    were defined right after Eq. 18.
Since this is a particular case of G, these functions (1) and its inverse: With this mapping     Table I, are obtained by composing one, two, three or four of these transformations. The associated Jacobians are constant except for the radial mapping  , since the Jacobian to pass from Cartesian to polar coordinates is not constant. Thus, the simplicity of this framework, based on simple changes of variables is another interesting aspect (even though the composition of up to four of these changes may yield to long expressions).
These systems are especially important in the wave theory of aberrations and optical image formation [7,5]. As we said in the Introduction, there is not a complete agreement on a unified theory yet. Nevertheless, the good mathematical properties ZPs are making that they are becoming Table III. Expressions of the orthogonal annular basis functions (polynomial quotients) up to order n = 4. A and a =hA are the outer and inner radii. ( 1)  6 e a standard among the scientific community, but even more in technologies embedded in many industrial tools (optical design), devices (surface metrology, optical testing) and even in clinical apparatus in ophthalmology (corneal topographers, ocular aberrometers, etc.) In addition to form a complete orthonormal system they have compact expressions both in polar and Cartesian coordinates, and they also show an interesting list of additional properties. The main drawbacks come from the fact that they represent a highly convenient but arbitrary choice, and from some difficulties for unification with the Seidel (third order and further extensions to 5th and higher orders) theory of aberrations. The wave aberration theory based on orthogonal systems, such as Zernike polynomials, is superior to the Seidel theory for nearly diffraction-limited optical system because the image quality, as measured by the Strehl ratio (peak of the point spread function normalized to that of the Airy disk) [5] can be predicted from the wavefront variance, and hence from the expansion coefficients as:  (Maréchal criterion for diffraction-limited optical quality) [5]. Thus, the wavefront variance 2  , or the root mean square (RMS) wavefront error , are good metrics for optical image only when   . Thus the quadratic sum of the wavefront coefficients 2  , or the RMS  , are good metrics for the wavefront quality, but in aberrated systems they are not good metrics for predicting image quality.
Nevertheless, the need for a unified theoretical framework for describing wavefronts (optical aberrations) and optical surfaces, in terms of complete orthogonal systems, is patent. Even in the context of Zernike polynomials one can find that different authors use a variety of units (wavelengths, micrometers, etc.), normalizations (use of m n N or not), ordering (since they have two indexes n, m there are many different possibilities to define a single index j which were used by different authors), or even opposite sign conventions. Among all these possibilities the ordering proposed by Noll [8] seems widely accepted, but the Optical Society of America adopted a different ordering for reporting aberrations in the human eye (ANSI standard Z80.28). This disparity of criteria existing even for the Zernike circle polynomials may explode when extending the Zernike theory to other optical apertures, such ellipses, annuli, sectors, etc. In fact there is a large list of publications with different polynomials which are ad hoc solutions for every type of optical apertures [11][12][13][14][15][16][17][18]. Most of these polynomials are obtained through G-S orthonormalization. A relevant improvement over the classical G-S method can be obtained by a non-recursive algorithm [19] which shows better numerical stability. It consists of inverting the Gram matrix, but considering only up to a given order n, and using the Cholesky decomposition. Additional numerical stability can be obtained through re-orthogonalization [20] or by an iterative process [21] up to machine precision accuracy. This non-recursive G-S algorithm provided probably the best results published in the literature [18]. The main limitation is that the symbolic implementation of these methods requires heavy loads of computer memory [11]. We want to remark that the systems of orthogonal polynomials obtained with these improved G-S methods are perfectly valid for ad hoc applications. However, unicity is not guaranteed as the result depends on the initial ordering (index j), or on the particular algorithm. In addition, the physical meaning of each polynomial could change after orthogonalization, especially for higher order polynomials, due to the cumulative effect inherent to the G-S method. As a result, the basis functions obtained for different apertures may have different properties and physical meaning, which seems far from the goal of a unified theory.
We believe that the mapping method proposed here, implemented as a change of variables, overcomes most of these difficulties and drawbacks, and provides a common framework, especially well-suited for a unified theory of aberrations. In addition to the theoretical relevance, the unified formulation may be the starting point for developing standards for academic, industrial or even clinical (ophthalmic optics) use.
The two particular cases of Section IV are especially relevant. In the first case, the importance is clear if we realize even the circular aperture (most common in optics) becomes elliptical off-axis. The size (A) eccentricity (e) and orientation () of the ellipse changes continuously with field angle and azimuth. These changes can be large in wide angle optical systems, and especially important in vision (both biological and artificial), where the field of view can be of the order of 180º or even more. Thus, the need for having a unified method to describe wavefronts corresponding to different field angles seems clear. The proposed change of variables (mapping) is a simple affine transformation from the circle to the ellipse, to adapt the modes (polynomials) to the scales (A and B) and orientation of the effective aperture, which changes as we move from the optical axis to a peripheral field angle. This affine transformation (scaling and rotation) is general for virtually any type of aperture (annuli, polygons, etc.) while it keeps a reasonable level of invariance, and hence a similar physical meaning of the expansion coefficients. This mapping should be applied for representing off-axis wavefronts, especially when the field of view is significantly wide.
The case of annular apertures is relevant not only for being a typical aperture in telescopes, but also due to important differences associated to its particular topology. Other apertures, such as circles, ellipses, or even angular sectors are simply connected sets, whereas the central obscuration of annuli makes that they are connected (but not simply) sets. This has important consequences. The deformation (mapping) of the circle necessary to arrive at the annulus (intuitively this would be a sort of stretching of an infinitesimal central hole to form the finite obscuration) is not uniform, as it depends on the distance to the center (). This lack of uniformity makes that the original polynomials become polynomial quotients (see Table III) and that the Jacobian is not a simple constant normalization factor. Then here we have a non-Euclidean metric and hence the original plane element of area becomes curved now. That curvature becomes patent if we compare the upper left panels of E the mapping is linear and hence we can see tilted (rotated) but straight fringes, whereas 1 1 O shows vertical but curved fringes. As we said in Section II, it is possible to choose ( , ) ( , ) Q x y J x y  but this does not eliminate the curvature of the fringes. This choice does not seem compatible with a unified framework as it can potentially change the properties of the basis functions and the physical meaning of the associated coefficients.