Analytical expression of aperture efficiency affected by Seidel aberrations

The effect of aberrations on the aperture efficiency has not been discussed analytically, though aberrations determine the performance of a wide field-of-view system. Expansion of a wavefront error and a feed pattern into a series of the Zernike polynomials enables us to calculate the aperture efficiency. We explicitly show the aperture efficiency affected by the Seidel aberrations and derive the conditions for reducing the effects of the spherical aberration and coma. In particular, the condition for coma can reduce a pointing error. We performed Physical Optics simulations and found that, if the Strehl ratio is higher than 0.8, the derived expression provides the aperture efficiencies with a precision of better than 2%.


Introduction
Most existing radio telescopes were designed as single-beam telescopes. Multi-pixel detectors have since appeared and are still being developed [1][2][3][4]. Wide field-of-view (FOV) radio telescopes have also been designed with ray-tracing simulation [5][6][7][8][9]. However, it is not obvious whether ray tracing is a sufficient tool to design a wide FOV radio telescope because it can assess aberrations but cannot take diffraction into consideration. Since a radio telescope generally transmits an electromagnetic wave with a long coherence length compared to the telescope size, a wide FOV system should be evaluated and optimized in terms of both aberrations and diffraction.
The detectors used in a radio telescope have an angular response, i.e., a feed pattern. Since diffraction in a telescope can be evaluated to some extent by using quasioptics, especially by the Gaussian beam theory, a feed pattern is usually approximated by a Gaussian function as a first step. Some literature, e.g. [10,11], has discussed the wavefront errors of Gaussian beams. The balanced aberrations for Gaussian beams have also been reported [12,13]. They are, however, inadequate to assess a wide FOV system because an incident wave from a celestial object enters it as a uniform plane wave. The incident plane wave is only partially coupled to the feed pattern, i.e. not all the incident energy is detected. Therefore, the aperture efficiency is introduced as a figure of merit, which gives the ratio of the detected energy to the entering energy. To the best of our knowledge, there is no literature discussing the coupling between a feed pattern and a plane wave analytically as a function of a wavefront error.
Olmi and Bolli [14] addressed the relation between wavefront error and aperture efficiency. They pointed out that aperture efficiency was a function of a wavefront error and examined the relation between an apodized Strehl ratio and aperture efficiency. They, however, assumed that the direction of the peak gain is known, though it cannot be determined in a practical design phase due to aberrations. Moreover, though a direction from the exit pupil toward a feed is intrinsically independent of an incident direction of a plane wave even in the paraxial limit, they fixed the relation between them, which was a special case. Their one-to-one correspondence between the aperture efficiency and the apodized Strehl ratio was obtained under this special assumption, which hinders us from deriving some useful conditions for the cancellation of aberrations.
Recently, Nagai and Imada [15] revealed that the aperture efficiency is determined by two spillover efficiencies and coupling efficiency. The coupling efficiency is determined by two electric fields, an incident wave that holds the information on a wavefront error and an imaginary field illuminated by a feed. In this paper, we will show that the coupling efficiency is analytically expressed as a function of aberrations and a feed pattern, which depend on the incident direction and the detector position, respectively. In Section 2, the technical terms and assumptions used in this paper are described. In Section 3, the aperture efficiency is analytically calculated and the conditions for reducing the effects of spherical aberration and coma are derived. We verify the analytical expression using Physical Optics (PO) simulation in Section 4. In Section 5, the precision and applications of the analytical expression are addressed.

System settings
We assume an axially symmetric optical system with an annular aperture. The time dependence of an electromagnetic wave is assumed to be exp(jωt), where j = √(−1), ω = ck, c is the speed of light, and k = 2π/λ is the wave number. The radius of curvature of the wavefront is positive when the wavefront is convex, as seen from the negative part of a coordinate. An incident wave is linearly polarized and parallel to the detector polarization.

Telescope aperture, pupil, and pupil plane
A telescope aperture is an opening of the first optical element that defines the energy going into the telescope. It sometimes works as an aperture stop or otherwise the telescope has an aperture stop at a different position. In the latter case, the incident energy is cut out by the aperture stop. A pupil is a fundamental concept defined as an image of the aperture stop [16]. The aperture stop determines the electromagnetic field that forms an image on the focal plane. We refer to the infinite plane including a pupil as the pupil plane.
Once the entrance and exit pupils are considered, we can consider an equivalent system that has no optical elements between the object and the entrance pupil and between the exit pupil and the image. Although the object, the entrance pupil, and the exit pupil are sufficient for discussion of aberrations, we need the telescope aperture to define the incident energy if it is located before the entrance pupil. Thus, we regard the telescope aperture, the entrance pupil, and the exit pupil as essential components in this paper. The radii of the aperture, the entrance pupil, and the exit pupil are denoted by R_ap, R_en, and R_ex, respectively. We may consider holes at the center of each pupil if need be. The radii of the holes are denoted by a dimensionless parameter, 0 ≤ ε < 1, which is multiplied by each pupil radius. It is assumed that the propagation from the sky to the entrance pupil is described with geometrical optics. On the other hand, the feed beam from the focal plane to the exit pupil is assumed to be described with the Gaussian beam theory. Fig. 1 shows the coordinates used in this paper. An incident plane wave comes from the direction specified by the incident and azimuthal angles (Θ, Φ). Cylindrical coordinates (r, φ, z) are used in the image space and z = 0 is located at the exit pupil. Another set of coordinates (ρ, ψ) is introduced on the pupils; these are the common parameters used between the pupils. A dimensionless parameter ρ (ε ≤ ρ ≤ 1) denotes a radial distance normalized by each pupil radius and ψ denotes the azimuthal angle. Another radial parameter on the telescope aperture, , normalised by R_ap, is introduced. The focal length of the optical system is f. The incident and azimuthal angles (Θ, Φ) relate to a corresponding Gaussian image point [16], (r_g, φ_g, z_g) = (f tan Θ, Φ + π, f R_ex/R_en), if there are no aberrations.
For convenience, the following vectors are introduced: p = (sin Θ cos Φ, sin Θ sin Φ) specifying the direction of the incident wave, = ( , ψ) on the telescope aperture, ρ = (ρ, ψ) on the pupils, and r = (r, φ, z) from the exit pupil center.
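The mapping from an incident direction to the direction vector p and to the Gaussian image point can be sketched in a few lines of Python. This is only an illustration of the relations quoted above; the function names and the numerical values in the usage note are assumptions made for the example, not quantities from the paper.

```python
import math


def direction_vector(Theta, Phi):
    """Return p = (sin(Theta) cos(Phi), sin(Theta) sin(Phi)); angles in radians."""
    return (math.sin(Theta) * math.cos(Phi), math.sin(Theta) * math.sin(Phi))


def gaussian_image_point(Theta, Phi, f, R_ex, R_en):
    """Return (r_g, phi_g, z_g) = (f tan(Theta), Phi + pi, f R_ex / R_en).

    Valid only in the aberration-free case described in the text.
    """
    r_g = f * math.tan(Theta)
    phi_g = Phi + math.pi
    z_g = f * R_ex / R_en
    return r_g, phi_g, z_g
```

For instance, with an assumed focal length f = 500 mm, equal pupil radii, and Θ = 1°, Φ = 0, the image point lies about 8.73 mm off axis on the opposite side of the axis (φ_g = π).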

Wavefront error
Let us focus on the wavefront of an incident wave at the exit pupil. When no aberrations exist, it is a spherical shape whose radius equals the distance between the exit pupil center and the Gaussian image point r_g; however, the actual wavefront deviates from it. Moreover, the actual wavefront is not necessarily compared to the sphere centered on r_g. As a consequence, we introduce a reference sphere centered on r_ref, which is in the vicinity of r_g, and the wavefront error W between the actual wavefront and the reference sphere along a ray. If the actual wavefront deviates to the direction of beam propagation, W is positive. The wavefront error W depends on the incident direction p, position on the pupil ρ, and reference sphere center r_ref. The wavefront error can be expanded into a series of the Zernike annular polynomials (ZAPs, Appendix A) where m and n are integers, and n − |m| ≥ 0 is an even integer. In this paper, we focus only on the Seidel aberrations and do not discuss random wavefront errors caused by the atmospheric fluctuation or surface roughness of the optical elements. To consider these errors, statistical methods are needed as seen in [17][18][19], and therefore, these errors are beyond the scope of this paper. Thus, we employ ZAPs up to the third order aberrations (0 ≤ n + |m| ≤ 4).
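To make the index restriction concrete, the short Python sketch below enumerates the (n, m) pairs that satisfy n − |m| ≥ 0 even and 0 ≤ n + |m| ≤ 4. The function name and the aberration labels are illustrative additions; the labels follow the usual naming and agree with how tip-tilt, defocus, astigmatism, coma, and spherical aberration are referred to elsewhere in the text.

```python
def seidel_indices(max_sum=4):
    """List (n, m) with n - |m| >= 0 even and n + |m| <= max_sum."""
    pairs = []
    for n in range(max_sum + 1):
        for m in range(-n, n + 1):
            if (n - abs(m)) % 2 == 0 and n + abs(m) <= max_sum:
                pairs.append((n, m))
    return pairs


# Conventional names of the retained terms (labels are illustrative).
NAMES = {
    (0, 0): "piston",
    (1, 1): "tip-tilt",
    (2, 0): "defocus",
    (2, 2): "astigmatism",
    (3, 1): "coma",
    (4, 0): "spherical aberration",
}

if __name__ == "__main__":
    for n, m in seidel_indices():
        print(n, m, NAMES[(n, abs(m))])
```

Nine terms survive the restriction: one each for piston, defocus, and spherical aberration, and a ± pair each for tip-tilt, astigmatism, and coma.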

Incident wave
An incident wave at the telescope aperture is assumed to be uniform and the electric fields at the telescope aperture and the entrance pupil can be expressed as respectively. When ρ > 1 and > 1, E_ap = 0 and E_en = 0, respectively. They are normalised by the power passing through the telescope aperture whose area is πR_ap². The electric field distribution at the exit pupil is assumed to be a spherical wave with small aberrations where represents a spherical wave centered on r_ref. The scaling R_en/R_ex is determined according to [20]. Expanding the Taylor series of exp[jkW(p; ρ; r_ref)] into ZAPs, we obtain When the Taylor series is taken up to the second order and the Seidel aberrations are considered, i.e. 0 ≤ n + |m| ≤ 4 for A_n^m in Eq. (1), the coefficients B_n^m are explicitly given in Appendix B.
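For reference, the Taylor series of the phase factor referred to above is the standard exponential series, written out here before any projection onto ZAPs. The first three terms are those retained by the second-order truncation, and the cubic term is the leading omitted one, which the precision discussion later estimates.

```latex
\exp\!\left[ jkW \right]
  = 1 + jkW - \frac{(kW)^{2}}{2} - j\,\frac{(kW)^{3}}{6} + \cdots ,
\qquad W = W(\boldsymbol{p};\boldsymbol{\rho};\boldsymbol{r}_{\mathrm{ref}})
```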

Feed pattern
A feed pattern is often assumed to be a Gaussian function when a telescope is designed. We here consider a feed beam propagated from a beam waist position r_bw = (r_bw, φ_bw, z_bw) toward the exit pupil center. The angle between the optical axis and the beam propagation axis is given by tan θ_bw = r_bw/z_bw. We expand a feed pattern at the exit pupil plane, which is given as a superposition of the Laguerre-Gaussian beam modes E_p^q(ρ; r_bw; w_bw), into ZAPs, i.e., The coefficients C_p^q depend on the beam waist size of the Laguerre-Gaussian beam, w_bw, and the beam waist position r_bw = (r_bw, φ_bw, z_bw). When |sin θ_bw| ≪ 1, the Laguerre-Gaussian beam can be expanded into a Taylor series up to the first order of sin θ_bw and the coefficients C_p^q are calculated from Eq. (22) in Appendix C, where The polynomial R_p^{|q|}(I_u) is defined in Appendix D. The coefficient F_{±1} corresponds to tip-tilt and distortion of the intensity distribution, which arise because the field is evaluated on a plane tilted with respect to the beam axis. The coefficient F_2 denotes defocus, which can be cancelled out in principle. The coefficient F_{±3} is also a distortion of the intensity distribution.
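The geometry of the feed beam at the exit pupil can be sketched with the textbook fundamental-mode Gaussian beam relations. The Python snippet below is a minimal illustration, not the expansion of Eq. (22): it assumes the beam propagates a distance d = sqrt(r_bw² + z_bw²) from the waist to the pupil center, uses the standard formulas for the beam radius and the radius of curvature, and defines the edge taper as the on-axis-to-edge power ratio in dB while ignoring the tilt of the beam. The function name and the returned keys are hypothetical.

```python
import math


def feed_beam_at_exit_pupil(w_bw, r_bw, z_bw, wavelength, R_ex):
    """Fundamental-mode Gaussian beam quantities at the exit pupil center.

    Assumes textbook Gaussian beam propagation over the distance d from
    the beam waist to the pupil center (d must be positive).
    """
    z_c = math.pi * w_bw ** 2 / wavelength         # confocal distance
    d = math.hypot(r_bw, z_bw)                     # waist-to-pupil distance
    w = w_bw * math.sqrt(1.0 + (d / z_c) ** 2)     # beam radius at the pupil
    R_curv = d * (1.0 + (z_c / d) ** 2)            # radius of curvature
    theta_bw = math.atan2(r_bw, z_bw)              # tan(theta_bw) = r_bw / z_bw
    # Power falls as exp(-2 r^2 / w^2), so the taper at r = R_ex in dB is
    # 10 log10(exp(2 (R_ex / w)^2)) = (20 / ln 10) (R_ex / w)^2.
    edge_taper_db = 20.0 / math.log(10.0) * (R_ex / w) ** 2
    return {"w": w, "R_curv": R_curv, "theta_bw": theta_bw,
            "edge_taper_db": edge_taper_db,
            "small_tilt": abs(math.sin(theta_bw)) < 0.1}
```

The `small_tilt` flag uses an arbitrary threshold of 0.1 as a stand-in for the condition |sin θ_bw| ≪ 1; the appropriate bound depends on the accuracy required of the first-order expansion.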

Aperture efficiency evaluated at pupil
Aperture efficiency η_A is factorized into the entrance and exit pupil spillover efficiencies, η_{sp,en} and η_{sp,ex}, and the beam coupling efficiency η_bcp [15]. The beam coupling efficiency η_bcp takes the same value at every pupil, which allows us to calculate it at whichever pupil makes the calculation easier. We evaluate η_bcp at the entrance pupil to see the relation between the factorization and beam properties, and also at the exit pupil to relate aberrations to η_A. Only the fundamental-mode Gaussian beam case is considered in this section.

Spillover efficiency
The entrance and exit pupil spillover efficiencies are given as follows: Eqs. (2) and (3) are used for η_{sp,en}. Since the propagation between the telescope aperture and the entrance pupil is described with geometrical optics in this paper, an actual entrance pupil spillover efficiency might be considerably different from that calculated geometrically due to diffraction. The exit pupil spillover efficiency with blockage ε for a fundamental-mode Gaussian beam is already known [21] and Eq. (11) is equal to
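As an illustration of the exit pupil spillover for a fundamental-mode Gaussian beam, the sketch below uses the textbook result for the fraction of Gaussian beam power inside an annulus ε ≤ ρ ≤ 1. It assumes an untilted beam whose power falls as exp(−2r²/w²) and an edge taper defined as the on-axis-to-edge power ratio in dB; whether this matches the normalization of the closed form referred to above is an assumption, and the function name is hypothetical.

```python
import math


def exit_spillover_efficiency(edge_taper_db, eps=0.0):
    """Fraction of Gaussian beam power passing the annulus eps <= rho <= 1.

    edge_taper_db: power at the pupil edge relative to the axis, in dB
                   (positive number), i.e. exp(-2 alpha) = 10**(-T/10)
                   with alpha = (R_ex / w)**2.
    eps:           central blockage ratio, 0 <= eps < 1.
    """
    two_alpha = edge_taper_db * math.log(10.0) / 10.0
    inner = math.exp(-two_alpha * eps ** 2)   # power fraction outside the hole
    outer = math.exp(-two_alpha)              # power fraction spilling past the edge
    return inner - outer
```

With no blockage and a 10 dB edge taper, one tenth of the power misses the pupil, so the function returns 0.9. A central hole lowers the value further.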

Evaluating at the entrance pupil
The beam coupling efficiency evaluated at the entrance pupil is written as where the electric field E_det(ρ; r_bw, w_bw) is the field on the entrance pupil originating from the feed. The beam coupling efficiency is the most important quantity of the three because it has a close relation to a beam pattern. Let us introduce the beam pattern as a function of direction cosines l and m, P(l, m; r_bw) (cf. [22]). The numerator in Eq. (13) is proportional to P(l, m) because the exponential function is equivalent to the incident field E_en(p). As a result, we can relate the beam coupling efficiency with the beam pattern as follows: where A_en = πR_en²(1 − ε²), P_n(l, m; r_bw) = P(l, m; r_bw) are the area of the entrance pupil, a normalized beam pattern, and a beam solid angle, respectively. The direction cosines (l_0, m_0) denote the direction that maximizes P(l, m). When P_n = 1, i.e., p_0 = (l_0, m_0), Eq. (14) reduces to a simple form,

Evaluating at the exit pupil
We make an analytical expression of the aperture efficiency with the coefficients in Appendix B. By using Eqs. (5) and (6), the coupling efficiency is written as The coefficients B_n^m hold information about the wavefront distorted by aberrations. In terms of the denominator, Eqs. (4) and (5) Since we are considering the fundamental-mode Gaussian beam, the following is derived: The fundamental-mode Gaussian beam propagated obliquely through the system is expressed with Eqs. (6) and (7) with D_0^0 = 1 for p = q = 0, and D_p^q = 0 for the others. The coefficients C_p^q for the fundamental mode are obtained, The coefficient F_2 corresponds to defocus as mentioned. We therefore assume compensation by the longitudinal adjustment of a feed such that F_2 = 0. Using the aberration coefficients A_n^m and Eqs. (10), (12), (16), and (17), we obtain the analytical expression of the aperture efficiency affected by the Seidel aberrations, where
The products A_1^{±1} C_1^{±1}* and A_2^0 C_2^0* correspond to the effects of tip-tilt and defocus, respectively, which are strongly dependent on r_bw. Let us focus on the terms of first order in A_n^m. If the beam waist of the feed is located such that the following conditions are satisfied, then the first-order terms vanish. Eqs. (20) and (21) represent the conditions for reducing the effect of spherical aberration and coma, respectively. We can calculate the coefficients C_p^q for an arbitrary feed, though we have limited ourselves to the fundamental-mode Gaussian beam case. If an asymmetric feed pattern is considered, e.g., a diagonal horn, we will obtain the conditions for reducing the effect of astigmatism, A_2^{±2}.
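The coupling between an aberrated incident wave and a Gaussian feed field can also be checked by brute-force integration over the pupil, which is a useful cross-check for any closed-form expression. The Python sketch below is not Eq. (18): it assumes the beam coupling efficiency is the normalized overlap |⟨E_inc, E_det⟩|² / (⟨E_inc, E_inc⟩⟨E_det, E_det⟩) over the annular pupil, a unit-amplitude incident field exp(jkW), and an untilted, phase-matched Gaussian feed field exp(−αρ²) with α = (R_ex/w)². All names are hypothetical.

```python
import cmath
import math


def beam_coupling(kW, alpha, eps=0.0, n_rho=200, n_psi=64):
    """Normalized overlap of exp(j kW) with a Gaussian field on the pupil.

    kW(rho, psi): wavefront error in radians on the normalized pupil.
    alpha:        (pupil radius / beam radius)**2 of the feed field.
    Midpoint rule on a polar grid over eps <= rho <= 1.
    """
    d_rho = (1.0 - eps) / n_rho
    d_psi = 2.0 * math.pi / n_psi
    overlap, p_inc, p_det = 0j, 0.0, 0.0
    for i in range(n_rho):
        rho = eps + (i + 0.5) * d_rho
        amp = math.exp(-alpha * rho * rho)        # real Gaussian feed field
        dA = rho * d_rho * d_psi                  # area element
        for k in range(n_psi):
            psi = (k + 0.5) * d_psi
            e_inc = cmath.exp(1j * kW(rho, psi))  # unit-amplitude incident field
            overlap += e_inc * amp * dA           # feed field is real: no conjugate needed
            p_inc += dA
            p_det += amp * amp * dA
    return abs(overlap) ** 2 / (p_inc * p_det)


def unaberrated_coupling(alpha):
    """Closed form of the same overlap for kW = 0 and eps = 0."""
    return 2.0 * (1.0 - math.exp(-alpha)) ** 2 / (alpha * (1.0 - math.exp(-2.0 * alpha)))
```

Calling `beam_coupling(lambda rho, psi: 0.0, alpha)` reproduces `unaberrated_coupling(alpha)`, and passing, for example, `lambda rho, psi: 0.3 * (6 * rho**4 - 6 * rho**2 + 1)` shows the loss caused by a spherical-aberration-like phase term. Multiplying by a spillover factor then gives an efficiency estimate under the same assumptions.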

Verification
Eq. (18) and the conditions in Eqs. (20) and (21) are verified with numerical simulations. We compare the aperture efficiency evaluated using ray tracing [23] and the PO simulation [24].

Model and calculation
We use a simple system composed of a spherical mirror with a circular aperture (ε = 0, Fig. 2). The radius of curvature of the mirror is −1000 mm and its diameter is 300 mm.

Results
Tables 1 and 2 are for p = (0, 0) and (sin 1°, 0), respectively. Both tables contain the beam waist position r_bw, Strehl ratio, edge taper, aperture efficiency from ray tracing, η_{A,an}, aperture efficiency from the PO simulations, η_{A,PO}, and difference η_{A,an}/η_{A,PO} − 1. The aperture efficiencies estimated using Eq. (18) agree with those calculated from the PO simulation for the higher Strehl ratios. Fig. 3 shows the points obtained from the PO simulation and the theoretical curves predicted by Eq. (18) as a function of the edge taper for both incident angles. The red lines represent the aperture efficiency without any aberrations for reference. The green lines (case 1) give the highest aperture efficiency in the cases considered here with the same values of A_n^m. Note that the difference between cases 1 and 2 was small but case 1 provided the higher aperture efficiency. That is, the optimization of the feed position in terms of the Strehl ratio does not necessarily maximize the aperture efficiency. Fig. 4 shows the beam patterns on the meridional plane for p = (sin 1°, 0). The peak gains and positions are different among the cases 1, 2, and 3. When the condition in Eq. (21) holds, Fig. 4 indicates that we can reduce pointing errors due to the third-order coma. The condition in Eq. (21) is practically useful to design a wide FOV radio telescope.

Approximation precision and the Strehl ratio
In this subsection, we address how precise the analytical expression is and how the Strehl ratio relates to it. The results in Section 4.2 indicate that the aperture efficiency estimated with Eq. (18) agrees with that calculated from the PO simulation. Eq. (18) has been derived under two approximations: only the Seidel aberrations are taken into account and the Taylor series is terminated at the second order. The higher order aberrations are quite small in our verification. Therefore, let us focus on the order of the Taylor expansion in Eq. (5). The omitted terms were the third or higher orders of W. The absolute value of the largest term, −jk³W³/6, can be estimated by replacing W with the standard deviation of the wavefront error, W_dev. Table 3 shows the deviation of the wavefront error, the third-order term estimated from W_dev, and the corresponding Strehl ratio. The differences in Tables 1 and 2 seem close to the magnitude of the third-order term in Table 3 for higher Strehl ratios. The Strehl ratio thus indicates the precision of Eq. (18). If the Strehl ratio is 0.8, we will be able to estimate the aperture efficiency with a precision of 2% or so.

Table 3. Relative magnitude of the third-order term with respect to the zeroth-order term and corresponding Strehl ratio.
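The link between the Strehl ratio and the size of the omitted third-order term can be illustrated numerically. The Python sketch below assumes the widely used approximation S ≈ exp[−(kW_dev)²] to convert a Strehl ratio into kW_dev; the text does not state which relation was used for Table 3, so this choice and the function name are assumptions made for illustration.

```python
import math


def third_order_magnitude(strehl):
    """Estimate |(k W_dev)^3 / 6| from a Strehl ratio.

    Assumes S ~ exp(-(k W_dev)^2), which is reasonable for small
    wavefront errors (0 < strehl <= 1).
    """
    k_w_dev = math.sqrt(-math.log(strehl))
    return k_w_dev ** 3 / 6.0


if __name__ == "__main__":
    for s in (0.95, 0.9, 0.8, 0.7):
        print(f"Strehl {s:.2f}: third-order term ~ {third_order_magnitude(s):.4f}")
```

Under this assumption a Strehl ratio of 0.8 corresponds to a third-order term slightly below 2% of the zeroth-order term, which is in line with the precision quoted above.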

Applications
We can extract various information on an optical system by selecting a proper parameter as a free parameter. We briefly look into potential applications in this subsection.
When an incident direction p is a free parameter and the other parameters are fixed, we can obtain the beam pattern as shown in Eq. (14). All we have to do is to calculate A_n^m(p; r_ref; ε) as a function of p using ray tracing software. The higher orders of A_n^m may be added if need be. Let us focus on the focal plane and consider the case when aperture efficiency is a function of the beam waist position r_bw, which can be regarded as a detector position, and the other parameters are fixed. The detector position determines the coefficients C_p^q. The dependence of aperture efficiency on r_bw allows us to estimate the tolerance of the detector position. In a special case, when a detector has an isotropic sensitivity, we will obtain a point spread function.
The aberration coefficients A_n^m can be free parameters with p and r_bw fixed. This situation happens when the optical elements are misaligned and deformed. In that case, we can apply Eq. (18) to tolerance analysis with ray tracing. Generally, tolerance analysis requires numerous cases of misalignment and deformation, and therefore, it is unreasonable to use full-wave simulation, which consumes a considerable amount of computing resources. Eq. (18) can give the aperture efficiencies without full-wave simulation.
Finally, the limitations of this analytical expression are addressed. The assumptions used in this study are that the propagation from the telescope aperture to the exit pupil can be described with geometrical optics and that the feed beam is described with Gaussian beam theory, which is an approximation equivalent to the Fresnel diffraction theory. Therefore, we need full-wave simulation in the following cases: diffraction effects at the edges of optical elements are significant, a higher order approximation of diffraction is needed compared to the Fresnel diffraction theory, or polarization has to be evaluated.

Conclusion
Aperture efficiency is one of the figures of merit of a radio telescope. We explicitly show that it depends on the incident direction p, the position of detectors r_bw, and the feed pattern. The wavefront errors and feed pattern are expanded into series of the Zernike annular polynomials, whose coefficients are given as functions of p and r_bw, respectively. The expansion enables us to derive the analytical expression of the aperture efficiency affected by the Seidel aberrations. If the Strehl ratio without apodization is greater than 0.8, this expression gives the aperture efficiency from ray tracing simulation with a precision of about 2%. In addition, we derive the useful conditions required to reduce the effects of spherical aberration and coma. In particular, the condition for reducing coma avoids the pointing error caused by coma. If a non-axially symmetric feed pattern is assumed, a condition to reduce astigmatism may be derived. The expression can be applied to the evaluation of a beam pattern and to tolerance analysis.

Acknowledgements
This work was supported by Grant-in-Aid for JSPS Fellows and MEXT KAKENHI Grant Number 15K17598. We are grateful to the National Institute of Information and Communications Technology for supporting the PO simulation and to Professor Naomasa Nakai at University of Tsukuba for supporting the ray tracing simulation. We are also grateful to Alvaro Gonzalez at the National Astronomical Observatory of Japan and Yutaro Sekimoto at Institute of Space and Astronautical Science, Japan Aerospace Exploration Agency for the fruitful discussions.

A. Zernike annular polynomials
The Zernike annular polynomials [25] are of the form where m and n are integers such that n ≥ |m| and n − |m| is even. The domain is 0 ≤ ε ≤ ρ ≤ 1 (ε < 1) and 0 ≤ θ ≤ 2π. The normalization is as follows: The polynomials which were not demonstrated in [25]
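For an unobstructed pupil (ε = 0) the annular polynomials reduce to the ordinary Zernike circle polynomials, whose radial part has a well-known closed-form sum. The Python sketch below implements that sum in the common convention R_n^m(1) = 1, together with the annular defocus radial polynomial in the same convention; the normalization adopted in this appendix may differ by constant factors, and the function names are hypothetical.

```python
from math import factorial


def zernike_radial(n, m, rho):
    """Radial Zernike circle polynomial R_n^|m|(rho) with R_n^|m|(1) = 1."""
    m = abs(m)
    if m > n or (n - m) % 2 != 0:
        raise ValueError("require n >= |m| and n - |m| even")
    total = 0.0
    for s in range((n - m) // 2 + 1):
        coeff = ((-1) ** s * factorial(n - s)
                 / (factorial(s)
                    * factorial((n + m) // 2 - s)
                    * factorial((n - m) // 2 - s)))
        total += coeff * rho ** (n - 2 * s)
    return total


def annular_defocus_radial(rho, eps):
    """Annular defocus radial polynomial, (2 rho^2 - 1 - eps^2) / (1 - eps^2)."""
    return (2.0 * rho ** 2 - 1.0 - eps ** 2) / (1.0 - eps ** 2)
```

For example, `zernike_radial(4, 0, rho)` evaluates 6ρ⁴ − 6ρ² + 1 (spherical aberration) and `zernike_radial(3, 1, rho)` evaluates 3ρ³ − 2ρ (coma), while `annular_defocus_radial(rho, 0.0)` coincides with `zernike_radial(2, 0, rho)`.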

B. Coefficients B_n^m for the Seidel aberrations
The notation for the ε polynomials in Eq. (19) is used.