An exploration of the freeform two-mirror off-axis solution space

We use a differential construction procedure to construct three starting solutions for a thermal infrared telescope design problem. We further refine these solutions with a simple optimization procedure. We show that this hybrid method is interesting by its flexibility that makes it suitable to find solutions to problems with tight volume constraints and its ease of use with little designer interaction required. Taking advantage of a fully automated process, we are able to investigate for each solution and its starting geometry, the impact of the Zernike order on the final performance obtained. Finally we show the systems proposed are sufficient for thermal infrared applications and while not as performant as the three mirror system used for comparison, their simplicity makes them an interesting alternative.


Introduction
The trend for aerospace optical payloads is compact and less expensive instruments without compromise on optical performance. In the infrared spectrum, reflective systems are an interesting option being lightweight and achromatic with high transmission over a wide spectrum. However, they are usually bulky and classical designs present a central obscuration.
Off axis telescopes have been proposed [1] to address central obscuration and can reach high performances in both straylight reduction and resolution. For classical uses, rotationally symmetric surfaces are sufficient to ensure a good image quality [2] because the surfaces are only slightly tilted, by no more than a few degrees. But to reach fast F-numbers, required for uncooled infrared detectors and large field of view while maintaining tight volume constraints, tilts are increased. In this case one cannot do without freeform surfaces, surfaces that do not present an axial symmetry (though usually some form of planar symmetry is conserved).
Freeform surfaces (without rotational symmetry) give the designer more degrees of freedom to correct for aberrations in constrained optical problems at the price of a complex optical surface.
When designing optical systems, freeform or not, few problems can be solved without optimization algorithms. Global optimization algorithms are currently unpractical, freeform design being by nature a problem with a large number of degrees of freedom. On the other hand, local optimization algorithms can cope with very large dimensionality, especially when one can do without finite-difference gradients [3], but the final outcome will depend on the suitability of the starting point chosen.
When looking for starting points, designers will often turn to literature but freeform design being a relatively recent field; these starting points are classical in nature. Other options are: starting from an on axis solution ignoring obscuration and vignetting [4], or an off axis TMA with minimal aberration with axial-symmetric surfaces [5]. Finally 3D-SMS method introduced to solve illumination problems has shown promise in imaging as well [6].
In all cases Kevin Thompsonʼs Nodal Aberration Theory (NAT) [7,8] is of great help, for the starting point generation or to guide the designer when gradually modifying the system by adding off axis constraints and increasing field and aperture until the desired solution is obtained.
Similarly, our research interest also lies in exploring new methods to design freeform starting points and their impact on the final outcome of the design process. In an earlier work [9] we demonstrated a new differential Original content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence.
Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
integration method which can be used to design axially stigmatic and aplanatic two-mirror freeform designs from a very limited set of parameters.
This method is not without limitations, and in this article we will tackle the case of the axial stigmatism condition which is not necessarily a desirable feature. Indeed designers will prefer a balanced optical system where residual on-axis aberrations are used to improve off-axis performance. Such systems can be designed using a hybrid procedure described in this article which follows three steps: 1. Starting point generation using differential methods.
2. Conversion from point cloud to a Zernike sag representation.
We propose three solutions of two-mirror systems (figure 1) that we name α, U and Z based on the resemblance of the letters chosen with the geometrical layouts considered.
The literature usually proposes three mirrors for similar problems so we conclude the article by discussing the trade-off between the two and three mirror solutions, showing that in thermal observation where imaging requirements are relaxed two-mirror systems are a viable and simpler option. However they are not significantly more compact than equivalent freeform three-mirror solutions.

Previous work
We showed previously [9] that the derivation of a system of equations and its subsequent resolution using numerical methods can be used to compute the partial derivatives of freeform surfaces and of functions representative of the ray propagation. This work is currently limited to two-mirror freeform systems and draws inspiration from the design of an aplanetic microscope objective by Wassermann and Wolf [10].
One can think of this procedure as performing a multivariate Taylor expansion, first computed around the parabasal ray, a ray originating from the center of the field of view (FoV) and intersecting the pupil in its center.
To perform this expansion we use the fact that Fermat path principle states that rays travel along a stationary path. It allows us to define F, the Fermat error function. Starting point for optimization, plotted on the same scale. The system considered in earlier work [9] was an α configuration but with convergent-convergent mirrors. In this article the F-number is larger (1.5) so the α configuration becomes divergentconvergent. The newly introduced U keeps the convergent-convergent configuration but requires an image plane on the other side of the incoming beam. The Z configuration is more classical and share similarities with a Schwarzchild configuration.
Since the Fermat error function is always equals to 0, the derivatives of the Fermat error function are also 0. By differentiation and expanding of equation (1) we set a nonlinear system of equation where the surfaces derivatives of the optical surfaces appear together with derivatives of the ray propagation with respect to the pupil and object position. The paraxial properties of the system (focal length, image and pupil position) and the aplanetic condition translate to known values of some of the derivatives of the ray propagation. Let k=(k x , k y ) the object coordinates, p=(p x , p y ) the pupil coordinates, w=x i , y i the local coordinates of a ray intersection at the image plane and f the focal length, the following relations hold: At the parabasal ray, the system of equations with the paraxial and aplanetic constraints can be solved with twomirrors by adding the tilt of the image plane as a supplementary degree of freedom. This allows integrating numerically the high order partial derivatives to construct a 3D representation of the surfaces. It has been shown in literature that a closed-form solution of the surfaces could be found in the afocal case [11] or axial-symmetric case [12,13].
While integrating, the degrees of freedom of the nonlinear system are reduced since we have to maintain consistency with previously found rays [9]. The system becomes over constrained, rendering full exact aplanetism impossible.
This was mitigated by unconstraining aplanetism on three distinct pupil, field and image plane axis combinations [9]. This observation was made earlier in literature in a different formalism and the authors have introduced the concept of semi-aplanetic to describe such systems [11].
In this article we chose instead a simpler approach: we approximate aplanetism in the least square sense by minimizing the quadratic sum of the derivatives.
As a consequence, the performance obtained is stigmatic on axis and the degradation of the performance with field is constrained even though only on-axis field rays are constructed. An illustration of the results for the three configurations studied in this article is shown in figure 1.
The solutions of these algorithms are expressed as a dense point cloud for each optical surface, which needs to be converted to a suitable representation in order to evaluate the performance obtained via ray tracing or to further optimize the system.

Conversion to sag representation
We represent points from the dataset constructed by integration by ) a vector in global coordinate expressed in an affine formalism. Let P=(x, y, z, 1) represent the same point expressed in a local coordinate system specific to the surface under consideration. Let θ be the geometrical parameters of the surface (radius of curvature, conicityK), and A the matrix representative of the affine transformation from the local to the global coordinate system.
In optical design software such as Zemax or CodeV, an optical surface is represented with a sag function defined in equation (3).
In the following we will refer to (x, y, z) as the coordinates in local coordinate system and to (x, y, z) as vectors representative of the local coordinate system axis expressed in the global coordinate system. To determine A we perform a Principal Component Analysis (PCA) of the point cloud. PCA finds the plane that best fits the point cloud representative of an optical surface under consideration, as shown in figure 2(a). Let v 1 , v 2 be the two principal components found by PCA, i the incident direction of the parabasal ray, ×denotes the cross product and · the dot product, we derive (x, y, z) in equation (4).
In equation (4) we use the sign function to orient z collinear to i, ensuring a consistent optical surface orientation. We can set x since the solutions considered are symmetric such that´= v v 1, 0, 0 0 . Let x 1 , x 2 , x 3 the coordinates of x and similarly for y, z and let x 0 , y 0 , z 0 the coordinates of the origin of the local coordinate system in global coordinates, we derive A in equation (5). We set x 0 , y 0 , z 0 to the centroid of the point cloud. We convert A to Euler angles and decenters, a representation compatible with optical design software.

Linear interpolation
Zemax supports defining a sag function as a bi-linear interpolation of a dense regular Cartesian grid, this feature is called Grid Sag. These surfaces are not suitable for further optimization of the solution, but are useful to check its performance. Indeed, in the next section, we will use a system with bi-linear surfaces as a reference to quantify the impact of the Zernike fit inaccuracies on the optical performance. The point cloud obtained is unstructured so we need to interpolate it on a dense rectangular grid. To maximize the accuracy of the interpolation we first fit the x, y coordinates to a second order polynomial. The sag surface is defined in equation (6) where I denotes interpolation of the fit residuals using a piecewise cubic interpolation method (SciPy). The Grid Sag based model obtained is used for a performance cross check, and to compute the actual optical footprint on each optical surface. The optical footprints are used to discard points that will not contribute to the useful optical area before the Zernike fitting procedure, as shown in figure 2(b). The presence of these points is due to a shortcoming of the integration algorithm, as the method constructs only rays originating from the on-axis field point. Integration is performed on a larger pupil than required, with a margin defined large enough to accommodate the Field of View.

Zernike polynomial
To optimize the solution we fit the surfaces to a sag function composed of a spherical base and Zernike decomposition.
The fit is performed in several steps: first we fit the best sphere to the point cloud as shown in figure 3(a). This can be done in the global coordinate system using linear algebra.
Next, to determine the local coordinate system, we define a plane tangent to the best fit sphere at the point closest to the centroid of the point cloud and on the sphere.
The departure from the best sphere along z (shown in figure 3(b)) is fitted on a basis of Zernike polynomials with the normalization radius and the center of the Zernike contribution computed as the center and radius of the smallest encircling circle (with a small margin).

Reoptimization
We setup a simple merit function, optimizing root-mean-square spot size and constraining marginal rays in a box to prevent unrealistic volume increase.
We also add constraints to ensuring that the system remains unobscured and that the focal length is maintained.
The focal length is maintained by constraining the X centroid position of the X marginal field image to be at the correct location and similarly for the two Y marginal field points. Intermediate field points images position are not constrained, effectively leaving distortion unconstrained.
The system is maintained unobscured by constraining the clearances between bundle of rays and mirror to be larger than 10 mm. The clearances are computed using marginal rays as described in literature [14] with the modification that in our case we used dummy surfaces instead of trigonometric formulas to compute the clearances. This was done in order to simplify the merit function.
All Zernike coefficients resulting in a deformation symmetric with the x = 0 plane are optimized (except piston and tilt), as well as the coordinate break parameters, radii and conic constants. The normalization radius of the Zernike polynomials is kept constant as is the location of the center of the Zernike contribution.
The number of Zernike coefficients used is the same as in the fitting procedure, extrapolation of Zernike polynomials was allowed. To evaluate the number of Zernike coefficient necessary we repeated the fitting and reoptimization procedure with increasing order n, meaning that we performed the calculation multiple times with Zernike polynomials used such that  n n max with n max increasing from 3 to 10. The reoptimization was performed in Zemax using the damped least squares algorithm.

Results
In table 1 the construction parameters for each of the α, U and Z configurations are listed. These parameters are the trajectory of the parabasal ray which is required as input from the user by the method described in section 2.1. Requirements are adapted from our previous work to study faster systems with a F-number of 1.5. We keep the focal length requirement of 150 mm. The field of view considered is larger: 6.7°× 0.4°c orresponding to a micro-bolometer detector of 1024 pixels with a pixel size of 17 μm. The use case is earth observation with a push-broom satellite hence the rectangular field-of-view, since only a strip on earth is imaged at a time.
Results before optimization ( figure 4(a)) indicate that more exotic solutions such as α requires more Zernike polynomials to converge to a suitable fit. Indeed the minimum rms spot radius (which should be 0) as the integration method produces stigmatic systems is very large for few Zernike polynomials and decreases more rapidly for configuration U and Z than for configuration α.
Similarly we see that the performance after optimization ( figure 4(b), systems depicted in figure 5) does not significantly improves with maximum Zernike radial degree for configurations U and Z while the performance of α continues to improve up to maximum Zernike radial degree such that n max =10.
A designer usually wants to minimize the number of polynomials used, as large orders require denser ray sampling in the pupil which slows the merit function computation which slows in turn the convergence. Additionally, a larger number of degrees of freedom increases the number of optimization iterations which also slows the convergence. We deduce from figures 4(a) and (b) that the optimum Zernike order does not only depend on the application considered, but also on the design under consideration.  No clear trend is apparent for U and Z. A significant variation of outcomes is observed for Z, which indicates that the merit function landscape contains convergence difficulties (e.g. local minimas, saddle points). The Airy disk radius is indicated by a red line for l m = 9 m.
In figure 6 we see that the dominant aberrations before re-optimization are a variable focus term (z 4 ) and astigmatism (z z , 5 6 ). Some residual coma is present (z 7 , z 8 ), especially for the U configuration as well as a trefoil term (z 10 ).
We also investigate the performance on a larger FoV. Results in figure 7 show that re-optimization is necessary to create the asymmetric performance profile required by the specification.
The performance is summarized in table 2, we observe that the two more exotic configuration U and α perform better than the more classical Z. This observation is consistent with results from the Nodal aberration theory [15] which have shown that off-axis configurations where the two mirrors folding angles are of the same sign can correct for astigmatism. Since astigmatism is the dominant aberration before reoptimization, unsurprisingly the U and α configuration are better corrected, even if some residual astigmatism is still present in figure 6.

Discussion
The telescopes presented in this article achieve image quality suitable for thermal observation in the infrared with a linear FoV. For these applications, where the resolution requirements are relaxed compared to the visible domain they are a reasonable alternative to freeform TMA that have been proposed [4,[16][17][18] to be used with uncooled infrared detectors. However compared to a three mirror telescope system, the footprint of the best performing system (U) is similar as seen in figure 8, so while using only two surfaces for small pushbroom satellites might help to reduce alignment complexity and manufacturing costs, compactness might not be improved that way. Future studies including tolerancing and stray light analysis will shed more light on the relative comparison of two mirror and three mirror systems.
Additionally we showed that the construction algorithm that we introduced in earlier work is compatible with classical optimization and can supply good and varied starting point for optimization. Figure 7. Rms spot radius before and after re-optimization evaluated on a larger field of view than specified (in dashed lines). Reoptimization produces an asymmetric field performance due to the asymmetric field specification. The U system degrades faster and to preserve contrast values above 200 μm are truncated. Table 2. Performance summary. Rms spot radius performance over the field expressed by its minimum, average and maximum value. Distortion relative to a reference wavelength of 150 mm and volume of a minimum enclosing box. The box orientation is such the entrance aperture is coplanar with one of the sides. Results of lines fit and optimized are for = n 10 max . Solution U exhibits the worst field dependent performance degradation before optimization but is the best performing after optimization.

Rms spot radius (μm)
Distortion ( Figure 8. Size comparison of the U configuration and a TMA. The two systems have similar footprints. The TMA performance is much better (lower than 1.5 μm rms spot radius over the field) but for infrared applications this is unnecessary and comes at the price of an extra component.