A new temporal control approach for SCAO systems

Adaptive optics (AO) systems for ground-based telescopes use deformable mirrors to physically correct wavefront distortions induced by atmospheric turbulence. Due to time delays caused by different parts of the AO system, the process of turbulence correction becomes even more difficult since the earth’s atmosphere changes continuously. In this paper we propose a new temporal control approach for the computation of optimal mirror configurations based on the solution of a sequence of inverse problems for the wavefront sensor operator. Our mathematical formulation of the underlying problem allows the incorporation of computationally efficient wavefront reconstruction methods and a wavefront prediction step. Based on the frozen flow assumption, the prediction of a future wavefront relies on a suitable shift of the reconstructed wavefront. The performance of our temporal control algorithm is demonstrated in the context of a single conjugate adaptive optics system on a 37 meter telescope using a Shack–Hartmann wavefront sensor. Numerical results of the proposed control method are provided using OCTOPUS, the official end-to-end simulation tool of the European Southern Observatory.


Introduction
In observing the sky with ground-based telescopes the atmosphere of the earth poses a major challenge. Through the interaction with the solar radiation and the wind, certain regions in the atmosphere are heated differently and get mixed up, leading to turbulent eddies [31], which introduce spatial and temporal variations in the air's refractive index. As a result, the image taken by a ground-based telescope appears blurred in the scientific camera.
A remedy to counteract the turbulence-related image degradation is provided by a technology which is known as adaptive optics (AO). AO physically compensates the distorted incoming light by reflecting it on one or several deformable mirror(s) (DMs), which are steered according to measurements provided by wavefront sensors (WFSs). An essential part of the AO system is the control unit, which represents the link between the WFSs and the DMs. For a classical system (using only one WFS and one DM), the task of the AO control unit is basically concerned with the calculation of the new DM actuator commands from WFS measurements. However, due to unavoidable temporal delays in the AO correction loop, the control of the DM becomes even more complicated. In particular, the control algorithm should have a prediction ability in order to follow the rapid changes in the atmospheric turbulence. Different control strategies have been proposed, ranging from simple, low-computational integrator-based control methods to complex linear quadratic Gaussian (LQG) optimal control algorithms which utilize statistical priors on the atmosphere and the measurement noise. In contrast to integrator-based control methods, LQG control relies on a dynamical model of the AO system, which allows to predict the turbulent perturbations in the incoming light at a future time. However, due to its complexity, optimal control has its price of beeing (in general) computational costly. Furthermore, such methods additionally require statistical priors.
In this paper, we present a new temporal control method. It relies on a more general approach, where the calculation of the DM actuator commands is done by first solving an inverse problem of reconstructing the incoming WF from noisy WFS data and then projecting the reconstruction onto the DM. The proposed method is based on the consideration of the whole AO correction process in terms of operator equations and allows to incorporate computationally efficient reconstruction methods. For the wavefront reconstruction, we particularly focus on the cumulative reconstructor with domain decomposition (CuReD) [32]. The CuReD method is a highly efficient reconstructor which does not require any statistical priors. Furthermore, the proposed approach allows to include an additional prediction step, which is carried out by a shift of the reconstructed wavefront. The required shift vector is calculated based on a sequence of previously reconstructed wavefronts. In contrast to the LQG-based control method, the proposed temporal control method is fully deterministic and does not need any additional statistical knowledge on the atmosphere nor the measurement noise. All informations used in this approach are purely obtained from the reconstructed WFs.
The reminder of this paper is organized as follows. In section 2 we give an introduction to atmospheric turbulence and discuss its effect on telescope imaging. Section 3 describes the principle of AO in more detail. In section 4 we give a mathematical model for the SCAO system and describe our temporal control approach. Section 5 presents the results obtained in simulation and in section 6 we give our conclusions.

Atmospheric turbulence and perturbation of light
From the physical point of view, light is described as an electromagnetic wave U (r r r, t) = A(r r r, t)e iφ(r r r,t) with amplitude A and phase φ, depending on spatial coordinates r r r ∈ R 2 and time t ∈ [0, ∞). In case of a wave with globally constant amplitude, a surface of equal phase is called wavefront (WF), travelling through the three-dimensional space. Before the light (stemming from a distant point source) enters the earth's exosphere, it can be described as a planar WF. While the light is passing through several atmospheric regions with different temperature inhomogeneities, it gets perturbed by random variations in the air's refractive index and fluctuations in the intensity, leading to a perturbed incoming WF ϕ tur on the telescope aperture 4 Ω A := {r r r = (r 1 , r 2 ) ∈ R 2 : a int |r r r| a out } with outerior radius a out and radius a int of the central obstruction (through which the collected light is led to the optical devices). While variations in the refractive index mainly affect the phase (in form of local phase shifts), intensity fluctuations (called scintillation) causes variations of the amplitude. However, due to the aperture averaging effect [14], the effect of scintillation is usually very low and thus is neglected, i.e. A = 1.
In astronomical observations, the image quality is degraded by the phenomenon of diffraction (depending on the size of the telescope aperture) and the effect of atmospheric turbulence. While the image quality is always limited by diffraction, the effect of the atmosphere can be reduced by compensating the shifts in the phase φ.
An important characteristic of the turbulence is represented by the so-called refractive index structure coefficient C 2 n , which varies in dependence of the height and describes the turbulence distribution in the atmosphere [10,31]. Based on the C 2 n profile integrated over height, the Fried parameter r 0 [30] represents another important astronomic parameter which provides a measure of the optical strength of the turbulence. While the atmosphere extends over a height of about 300 000 m, the refractive index structure coefficient decreases for increasing altitude. Actually, the C 2 n profile is non-zero only up to a certain height which is located at about h max = 30 000 m [10]. Furthermore, the turbulence strength of the atmosphere is not distributed uniformly in height. In fact, the C 2 n profile shows that most of the turbulence is located within separated layers at different altitudes [30]. Therefore, the atmosphere can be modeled by a finite number n L ∈ N of thin and statistically independent turbulent layers located at heights 0 h 1 < · · · < h nL h max , where each layer is described by an appropriate statistical atmospheric model (von Karman or Kolmogorov model, see e.g. [10]). While the light propagates through turbulent regions with different refractive indices, parts of the wavefront will be delayed compared to others, leading to local shifts in the WF. According to the layered structure of the atmosphere, the incoming WF gets aberrated by the turbulence contributions ϕ tur,l of the single turbulent layers along the optical path in direction α α α ∈ R 2 (measured in radians from the zenith axis), i.e. the shape of the distorted incoming WF can be written as the sum [29] ϕ tur α α α (r r r, t) = nL l=1 γ l ϕ tur,l (r r r + h l α α α, t) , where γ l denotes the normalized turbulence strength of the lth layer (according to the C 2 n profile). Note that γ l = 1 [29]. The phase is related to the WF aberrations via φ = (2π/λ) ϕ tur α α α , depending on the wavelength λ.
Similar to the motion pattern of clouds, the temporal fluctuations of atmospheric turbulence are mainly driven by two processes: fast deterministic changes due to wind shifts and slow random motions due to heat dissipation in the air, known as boiling [3]. A common approach to model the temporal evolution of the atmosphere is the Taylor frozen flow assumption [10,30,31], which basically states that, for short time periods (in the order of 10 ms), the temperature inhomogeneities are considered as frozen within the turbulence layers, and the only change of such a layer is due to the wind shift. In particular, boiling effects (which evolve on time scales greater than 50 ms [4]) are neglected. According to the frozen flow assumption, each turbulent layer propagates with its own wind speed and shift direction. Considering that the lth layer is moving with a wind velocity vector v v v l , then ϕ tur,l at time t + τ can be related to the same layer function at time t with an additional shift τv v v l in the spatial coordinate [31], i.e.
Typical values of the wind velocities range from 5 m s −1 up to a peak velocity of 40 m s −1 . An important temporal parameter represents the Greenwood time delay [30], defined as beeing the turbulence weighted average of the layers velocities. τ 0 , also known as turbulence coherence time, represents the characteristic time scale for changes in the WF aberrations. For time scales shorter than τ 0 , the perturbations in the WF and thus the turbulence in the atmosphere are considered to be approximately constant.

Adaptive optics
Essentially, an AO system consists of three main components. WFSs measure the incoming light of reference light sources, i.e. bright and distant astronomical objects. Usually, they correspond to stars (called natural guide stars, short NGSs) located in the near vicinity of the observed object. If NGSs are not available, or if they are too faint for measuring, artificially produced laser guide stars can be used. The second AO component are DMs, each conjugated to a certain altitude. The shapes of the DMs are calculated based on WFS measurement data and are adjusted via actuators. The link between the sensors and mirrors is given by the control unit, which represents the third AO system component. In dependence of the arrangement of the sensors and mirrors, an AO system can be driven in different modes. If the WFSs are located in front of the DMs, then one speaks of an open loop configuration. In contrast, the AO system operates in closed loop if the mirrors and sensors are placed the other way around.
In the latter case, the residual wavefront (=incoming WF already corrected by the DMs) gets measured.
Depending on the number of the considered WFSs and DMs and the type of the used guide stars, we distinguish between different AO systems. Consisting of only one DM optically conjugated to the ground layer and one WFS directed at a natural guide star, the single conjugate adaptive optics system, short SCAO, represents the most simple AO system, which is usually operated in closed loop. In this configuration, the telescope is positioned in such a way that the reference light source is located in the center of the field of view. Basically, the light emitted from the NGS travels in direction α α α to the telescope aperture (see figure 1). However, for notational simplification, we assume that the optical axis coincides with the zenith axis, i.e. α α α = (0, 0). Thus, the turbulent incoming WF can be written as ϕ tur (r r r, t) = nL l=1 γ l ϕ tur,l (r r r, t).
(4) SCAO highly depends on the direction of the observation and only provides a good correction in an extremely limited area surrounding the guide star. Furthermore, by using SCAO, only small areas of the sky can be observed because there are not enough bright guide stars. For a classical AO system like SCAO, the task of the AO control unit is basically concerned with the calculation of the new DM actuator commands from WFS measurements. Note that the WFS, which is also optically conjugated to the ground layer, measures the cumulation of the contributions of the turbulence layers in the direction of the optical axis (see (4)). This is also the only information available to control the DM. Based on the WFS sampling period, the whole AO correction process is considered in discrete time steps (see section 4.3). Let a a a i = (a i,1 , . . . , a i,nact ) be the reconstructed DM actuator commands with respect to the ith time instant, where n act ∈ N denotes the number of actuators. Then, the corre sponding DM shape is given by where h j : Ω A → R denotes the DM influence function of the jth actuator. In general, the mirror shape extracted from currently available measurements does not correspond to an appropriate approximation of the incoming WF at a future time, since the atmosphere changes rapidly. Hence, any time delay in the control loop results in a temporal error, leading to a degraded image correction. Time delays are unavoidable and are mainly caused by the WFS read-out time, the time used for the calculation of the DM actuator commands and the adjustment of the DM. Another error source, that contributes to the overall temporal error, arises due to the finite bandwidth of the AO system. The WFS sampling rate represents the clocking period of the AO system and thus specifies the correction bandwidth. In order to provide accurate measurement data, the WFS has to collect a sufficient large amount of photons within the sampling period. This requires a sampling period of a certain length, which, on the negative side, causes a larger temporal error. Basically, the bandwidth-related temporal error can be reduced by shortening the sampling period. However, speeding up the frame rate will in general lead to inaccurate measurement data as the WFS can only collect a smaller number of photons. A sufficiently bright reference light source would provide a remedy. But, as the sky is barely covered with very bright stars, an appropriate NGS is not available in general. Due to the considerations above, control of the DM poses some challenges. In order to keep up with the rapidly changing turbulence, the DM has to be updated approximately every millisecond. For this reason, the control action has to be sufficiently simple to allow a fast computation of the new DM shape such that the delay-related error is small. Nevertheless, a control strategy should also be able to reduce the overall temporal error that limits the performance of the AO system. This requires a certain complexity of the control algorithm. In order to keep the computational costs low, many AO systems utilize simple control strategies in which the spatial and temporal dynamics of the AO control system are considered separately. Hence, the action of the control system can be split into a spatial control step and a temporal control step. At first, the DM actuator commands are calculated from WFS data. To this end, the so-called interaction matrix, which maps the actuator commands to the WFS measurements, gets measured. The DM commands are then computed by multiplying the (stabilized) pseudo-inverse of the poke matrix (called control matrix) onto the measurements [22]. It is noted that, in general, the poke matrix and thus also the control matrix varies in time. Therefore, the control matrix has to be updated every time step. In the second step, a temporal control method is applied to ensure a good disturbance rejection and temporal stability in the AO correction process. A standard temporal control method is the so-called integrator control. In this approach, the previous DM actuator commands are combined with the currently reconstructed ones to obtain the new commands (see e.g. [15,22]). For stability reasons, the actuator commands are multiplied with damping factors before they get combined. For an AO system running in closed loop configuration, another approach known as pseudo open loop control (POLC) [28] has been introduced. Here, open loop like data are generated from originally given closed loop data. From these data, the new actuator commands are computed, which are then used to update the previous commands. In addition, the whole process is stabilized over time using e.g. integrator control (see e.g. [2]). In order to reduce the overall temporal error, optimal control strategies have been introduced. In this approach, the AO system is represented by a state-space model which describes the dynamical behaviour of the system and its outputs. Such a state-space model consists of a state evolution equation and a measurement equation and includes (statistical) prior informations on the system's dynamics and uncertainities. A very successful control technique is the linear quadratic Gaussian (LQG) controller [5,23], which is based on a linear state-space model including priors that follows Gaussian statistics and aims at the minimization of a quadratic cost function corresponding to the residual phase variance. Under the assumption of a DM with instantaneous response, i.e. neglecting dynamics of the DM, the action of the LQG controller consits of a two-step procedure [15,33]. At first, the WF is reconstructed from noisy measurements by solving a stochastic minimum variance estimation problem. This is done in the framework of the Kalman filter (e.g. [15,20,25,33]). The corresponding mirror actuator commands are then computed by projecting the reconstructed WF onto the DM space, which is done by solving an optimal deterministic control problem [15,33].
The Kalman filter (KF) corresponds to a system of linear, finite-dimensional equations (including statistical priors) that evaluates the best state estimate in the least-squares sense including all measurements available up to the current time step. The discrete representation of the phase can either be done in a zonal approach (the phase is represented by a set of discrete values at given points of the telescope aperture, e.g. at the mirror actuator positions) or a modal approach (the phase is represented as an expansion with respect to some set of basis functions (modes), e.g. Zernike polynomials) [8,9,30,33,15]. The dynamics of the turbulence can be modeled by a simple first-order autoregressive model [15,33] or a more sophisticated model that reflects the turbulence evolution according to the Taylor frozen flow assumption [26,27,36]. For the latter case, a near-Markov approach has been proposed [36], which allows to not keep all the prior history of measurement data in memory. In its original form, the KF involves the computation of the so-called Kalman gain, which has to be calculated for each time step and is obtained by the inversion of a generally dense matrix. However, in AO applications the asymptotic formulation of the KF is usually implemented (see e.g. [33,35]), without any remarkable loss of optimality. In this case, the state transition matrix (representing the evolution of the state vector) and the measurement process matrix are considered as stationary during a part of the observation, which allows the use of the constant, asymptotic Kalman gain. As a consequence, the Kalman gain can be calculated off-line by solving a discrete algebraic Riccati equation (DARE) and only has to be updated if the atmosphere is changing. However, since the off-line computational costs are still proportional to the 6th power of the telescope diameter [8,20], the calculation of the Kalman gain can take a long time and will be challenging for the furture generation of extremely large telescopes.
Several approaches have been proposed to reduce the computational costs of solving the DARE. For example, in [26], a Fourier decomposition of the atmospheric turbulence is used. Under the frozen flow assumption, the Fourier modes can be assumed to be statistically independent, which reduces the computational burden. Another Fourier-based method has been proposed in [20]. Here, the Fourier transform (FT) is used to decompose an infinite-order system into an infinite set of low, finite-order ones. The FT is used as an off-line tool which accelerates the computation of the Kalman gain. The on-line computation is improved by sparsity properties in the zonal representation of the phase. In [5], the computational costs are reduced by replacing vector-matrix multiplications by spectral iterative methods that use a sparse approximation to the Riccati equation solution. Further considerations have been done, e.g. in [21]. Also improvements in the on-line computation of the Kalman gain have been made, see e.g. [8,9].
In comparison to simple control strategies, LQG control has many advantages. As already stated, the KF provides the best estimate of the turbulence phase in the mean-square sense, since prior knowledge on the turbulence statistics and the measurement noise are taken into account in its formalism. Actually, it has been demonstrated that LQG control provides superior performance in terms of noise rejection and phase reconstruction in classical AO [5]. It has also been demonstrated that LQG can be used to compensate for non-turbulent perturbations like wind shake and vibrations [24,35] or DM dynamics [6], and also allows to consider arbitrary, real-valued time delays in the AO loop [27]. A significant benefit of Kalman filterbased controllers rely on the ability of phase prediction. In [4,12,19], wind predictive control methods have been proposed. According to the frozen flow assumption, the prediction relies on a suitable shift of the turbulence according to a wind shift profile. On the plus side, prediction-based optimal control efficiently reduces the temporal related WF error, allows longer integration times and weakens brightness requirements on the reference light sources [11,12]. In particular, (wind) predictive control supports the optimal use of limited photons collected by the WFS from faint guide stars. However, optimal control also has its price. Beside the computational burdens, these methods require prior informations on the atmospheric statistics and the measurement noise and rely on an accurate dynamical model that reflects the temporal behaviour of the turbulence.
As described above, the reconstruction of the new mirror commands can be basically classified into two categories, where the computation is either done on only the previously obtained measurement data or on the whole sequence of obtained data up to the current time step [7]. In general, the performance of 1st category reconstructors is diminished at the benefit of the computational efficiency. For reconstructors of the 2nd category, the behaviour is exactly the opposite. In order to optimize the control action, any control method should be able to balance the performance and the computational complexity, i.e. the method should be somewhere in between this two categories. One way in this direction is to combine a fast and efficient reconstruction method with an explicit prediction step.
We present a fully deterministic temporal control method, which relies on the more general approach (similar to the two-step approach for LQG control, mentioned above), where the calculation of the DM actuator commands is split into two separate steps: at first, the incoming wavefront is reconstructed from noisy WFS measurement data s s s i . In this regard, we basically have to solve an inverse problem of the form Γϕ i = s s s i for ϕ i , where Γ corresponds to the mathematical model of the WFS. The reconstructed WF is then used to calculate the actuator commands for the DM. This is done via a deterministic projection step. The decoupling of the WF reconstruction process from the DM has two main advantages: on the one hand, by exploiting the mathematical model of the WFS, this approach allows to include matrix-free, model-based reconstruction methods. On the other hand, this approach also provides more flexibility regarding the geometry or the used influence functions of the DM. This approach also allows to include an additional prediction step.

Temporal control for SCAO systems
As mentioned above, an AO system does not have the ability to respond instantaneously to changes in the wavefront. As a result, any delay between the beginning of the measurement process right up to the DM adjustment causes a deteriorated image correction. Furthermore, also the finite bandwidth of the AO system contributes to the total temporal error budget. In order to counteract temporal related errors, we need to modify the reconstructed incoming wavefront such that the resulting mirror shape represents a good approximation of the unknown turbulent wavefront at a future time.
In the following we outline a temporal control law which is capable of reducing the temporal error of a SCAO system in closed loop configuration, sketched in figure 2.

AO performance criterion
In the sequel let ϕ dm denote the shape of the DM. In order to assess the performance of an AO system, the Strehl ratio, which is related to the residual wavefront ϕ tur − ϕ dm , provides the most important quality measure. The Strehl ratio will also be used for the AO performance optimization. For a classical AO system like SCAO, the performance criterion is given by the minimization of the space-averaged square of the residual wavefront, averaged over a suffi- According to this criterion, the turbulence correction aims at the determination of an appropriate mirror shape ϕ dm such that J(ϕ dm , ϕ tur ) gets minimized, where Since 1/T and 1/ |Ω A | only represent scaling factors we may neglect them. As indicated above, the functional J also depends on the unknown incoming wavefront. Additionally required information on ϕ tur are obtained from measurements provided by the WFS.

Function spaces for wavefronts and mirror shapes
Note that Ω A actually defines a compact set in R 2 . However, for the theoretical considerations we will focus on an open and bounded set Ω A = {r r r ∈ R 2 : a int − < |r r r| < a out + } with > 0 sufficiently small. In the following let H 1 (Ω A ) be the Hilbert space and the induced norm Here, ∇ r r r φ corresponds to the weak formulation of the gradient, satisfying the condition ΩA ∇ r r r φ(r r r)· (r r r) dr r r = − ΩA φ(r r r)div r r r (r r r) dr r r (8) Figure 2. Diagram of an AO system in closed loop configuration. The residual WF is split up into two paths using a beam splitter (BS). While one path is led to the WFS the other one leads to the scientific camera (SC). Based on noisy measurement data, the control unit computes the mirror commands, which are used to steer the DM. The introduced correction WF (partially) compensates for the perturbations in the incoming WF.
Since any function ϕ(r r r, t) can also be interpreted as a family of functions ϕ(r r r)(t) parametrized in time, the definition of Bochner-measureable functions will be of great importance for our mathematical modeling. A function σ : Here, χ Tj correspond to the index function, which is 1 if t ∈ T j and zero otherwise. Furthermore, a function ϕ : For our purposes, we consider the linear Bochner-Lebesgue space containing all Bochner-measureable functions ϕ : is also a Hilbert space with inner product given by (7) integrated over time.
For the rest of this paper we consider ϕ tur ,

Temporal modeling
While the turbulent wavefront ϕ tur evolves over time, it gets measured by the WFS within fixed sampling periods. For the sake of simplicity we consider equidistant sampling intervals of length ∆ T ∈ R + . More precisely, ∆ T corresponds to the length of the period between the beginning of a measurement phase and the start of the next measurement process. The measurement read-out of the first sampling period takes place in the subsequent sampling phase (see figure 3). Strictly speaking, a WFS is not able to collect data continuously: due to periods of sensor resetting, so-called dead times arise. Within these phases no measurement data are obtained. However, for all commonly used sensors, the actual dead times are close to zero and they are usually not considered in the modeling process.
In the following we use these sampling periods to discretize the control functional (6) in time. Introducing the temporal nodes with t i := i∆ T , and letting F i := [t i−1 , t i ) define the ith frame spanned by two consecutive nodes, we obtain a decomposition [0, T) = nF i=1 F i of the exposure time T into a number of n F time frames F i , each of length |F i | = ∆ T . Since ∆ T is usually very small (at about 0.0003 to 0.002 s), the incoming turbulent wavefront ϕ tur restricted to F i can be approximately described by its mean value, i.e.
Together with the theorem of Bochner [34], it immediately follows that ϕ tur i ∈ H 1 (Ω A ). Due to representation (4), we further obtain with ϕ tur,l i ∈ H 1 (Ω A ) (according to the linearity of the Bochner-Lebesgue space (10)). As already mentioned above, the DM gets adjusted according to WFS measurements. Due to dynamics caused by the readjustment of the DM, the shape of the mirror will, in general, not be constant over a whole time frame. However, in practice, the DM shaping process requires much less time compared to the sampling period ∆ T , which itself is very short. Thus, for the sake of simplicity, the temporal dynamics of the DM are usually neglected. Defining ϕ dm i as the mirror shape corresponding to the ith frame F i , we may write Since (13) corresponds to a simple function (see (9)), it immediately follows from the definition of the Bochner-Lebesgue space (10) that ϕ dm is in L 2 ([0, T), H 1 (Ω A )) if and only if ϕ dm i ∈ H 1 (Ω A ) for all 1 i n F . For the choice τ = j∆ T with j ∈ N 0 , the frozen flow equation (1) averaged over F i is given by we obtain the discrete-time frozen flow equation:

Measurement data process
In the following let ϕ meas ∈ L 2 ([0, T), H 1 (Ω A )) denote the wavefront which gets sampled by the WFS. Independently of the type of the used sensor, only discrete information on ϕ meas is provided. Let ), c ∈ {1, 2, 3, 4}, as the coordinates of its corner nodes and g ∈ R + as the edge length. Furthermore, let n N ∈ N denote the total number of nodes given by the subaperture grid Ω S . Then, for each n ∈ {1, . . . , n N }, we define r r r (n) := (r 2 ) as a grid node, which is connected to a corner node via s : c ←→ n = n(s, c) .
In case of a Shack-Hartmann (SH) wavefront sensor, only discrete information about the derivative of ϕ meas gets measured (see [29]), i.e.
In practice, the measurement data provided by the WFS include information on ϕ meas within a period of time (sampling period). Therefore, analogously to (11), we consider the mean value of s s s with respect to one time frame, i.e. Based on the theorem of Fubini-Tonelli [1] (in combination with Bochner's theorem [34] and the definition of the weak gradient (8)), the spatial derivative and the integral with respect to time can be interchanged, i.e.

∆ T |Ω s | Fi Ωs
∇ r r r ϕ meas (r r r, t) dr r r dt = 1 |Ω s | Ωs 1 ∆ T Fi ∇ r r r ϕ meas (r r r, t) dt dr r r . Introducing the operator , which corresponds to the Shack-Hartmann wavefront sensor, the discrete-time measurement equation (derived from (15)) is then given by the formula with ϕ meas i analogously defined as (11). In case of an AO system driven in closed loop configuration we have ϕ meas Due to the data read-out time, measurements are usually not obtained at the end of the sampling period, i.e. at time t i . Actually, s s s i is available at a later time point with d 1 > 0.

Calculation of the mirror shapes considering time delays
Concerning the calculation of the new DM shape, we first consider that, for ϕ(r r r, t) = nF i=1 χ Fi (t) ϕ i (r r r), the functional J(ϕ, ϕ tur ) can be rewritten as whereφ tur (r r r, t) : and · L 2 (ΩA) denotes the L 2 -norm on the telescope aperture. The proof of (17) is based on the considerations in [15]. According to the temporal discretization (11) and the definition (18), φ tur is piecewise constant and thus defines only an approximation of the turbulent WF ϕ tur (which evolves over time). Hence, the first term in (17) does not vanish in general, leading to a loss of optimality in the DM shape calculation. However, since J(φ tur , ϕ tur ) does not depend on ϕ, the original AO performance criterion can be considered in discrete time without any additional loss of optimality.

Wavefront reconstruction.
In order to keep J(ϕ, ϕ tur ) small, it is sufficient to determine for all 1 i n F an appropriate approximation of the turbulent wavefront ϕ tur i . Hence, the turbulence correction aims at the calculation of ϕ dm i , i ∈ {1, . . . , n F }, that optimally fits the unknown turbulent wavefront ϕ tur i . In order to extract information on ϕ tur i from WFS measurements, we basically have to solve an inverse problem of the form However, due to imperfections in the sensing process, a WFS only provides noisy data. For our considerations we assume additive noise described by the random process η η η i ∈ R nS×2 . Hence, instead of the exact measurement data s s s i , only an approximation s s s η i = s s s i + η η η i is known. Note that, in closed loop configuration, any dynamics of the DM causes distorted measurements. However, since mirror dynamics are neglected in the modeling process, these disturbances are implicitly considered in the measurement process noise η η η i . Since the problem of WF reconstruction is in general ill-posed (see [13]), any reconstruction method, described as operator R, has to meet the following requirements: • R has to be stable with respect to noise in the measurements, i.e. small perturbations in the data should not cause arbitrarily large changes in the solution R s s s η . • R has to be an appropriate approximation of the (generalized) inverse Γ † of the WFS operator, i.e. RΓϕ ≈ ϕ for all ϕ.
A reliable reconstruction algorithm for Shack-Hartmann sensors is the cumulative reconstructor with domain decomposition (CuReD, [32]), which will be the reconstruction method of choice in this paper. Basically, the CuReD method provides discrete information on the measured wavefront with respect to the corner nodes r r r (n) of the subaperture grid. Introducing the reconstruction operator the discrete reconstruction φ ϕ ϕ tur i of the turbulent wavefront ϕ tur i is then given bỹ ϕ ϕ ϕ tur where P defines the collocation operator which evaluates the continuous mirror shape on the nodes r r r (n) of the subaperture grid and γ gain ∈ (0, 1] defines the so-called loop gain, which is used for the stability control of the AO correction process. The parameter γ gain is found empirically and mainly depends on the reconstruction method and the accuracy of the measurement data. If we assume that the measurements are not contaminated by noise, R perfectly reconstructs the measured WF and the DM response has no effect on the data, then ϕ ϕ ϕ rec i = Rs s s η i = Rs s s i = RΓ(ϕ ϕ ϕ tur i − ϕ ϕ ϕ dm i ) = ϕ ϕ ϕ tur i − ϕ ϕ ϕ dm i . In this case, (20) combined with γ gain = 1 would provide an exact reconstruction of the incoming WF, i.e. φ ϕ ϕ tur i = ϕ ϕ ϕ tur i . However, in real world applications, non of these assumptions will be fulfilled. Hence, the reconstruction process needs to be stabilized, which is controlled by a parameter γ gain < 1. In particular, stabilization is necessary in order to damp error propagations caused by uncertainities in the reconstruction. If not stabilized, errors in the reconstruction gets transferred to the DM shape which further affects the measurement data.
Note that, due to the time used for the reconstruction, φ ϕ ϕ tur i is available at time t i + (d 1 + d 2 )∆ T with d 1 given in (16) and d 2 ∈ R + .

Determination of the new mirror shape.
Based on the sampling period, the WFS read-out time and the reconstruction process, we have to consider a time delay of length (1 + d 1 + d 2 )∆ T . However, due to the time d 3 ∆ T , d 3 ∈ R + , required for the calculation and adjustment of the new mirror shape, we have to expect a total delay of length d∆ T , where d : figure 5). For the sake of simplicity we assume that d ∈ N, i.e. the delay extends over a fixed number of frames, and is considered as constant.
In this paper we use a simplified DM model, where the actuators are placed on the corner nodes of the subapertures (according to the Fried geometry), and the shape is determined via (bilinear) interpolation between the actuators. According to (5), the actuator commands a j are simply given by the values of the WF at the corresponding subaperture corner nodes and the DM space gets spanned by continuous shape functions h j . To a certain approximation, such a DM model is sufficient, especially for an AO system running in closed loop configuration [38]. Furthermore, this choice enables to consider the continuous function φ tur i , which is generated via interpolation between the values φ ϕ ϕ tur i at the subaperture corner nodes, directly as DM shape, i.e. ϕ dm =φ tur .
Basically, the reconstructor allows to extract information on ϕ tur i from given data s s s η i . However, due to the temporal delays we actually require an estimate of the unknown turbulent WF ϕ tur i+d based on the same measurement set. Thus, in general, φ tur i does not represent an appropriate approximation of ϕ tur i+d , i.e. ϕ dm i may not appropriately compensate for WF distortions at time frame i + d. Consequently, the reconstructed wavefront has to be further modified. As a remedy, AO correction may be improved by including additional knowledge of the temporal evolution of the incoming wavefronts to obtain a prediction φ tur i+d of the turbulent wavefront ϕ tur i+d based on the reconstruction φ tur i . Once the predicted WF is available, the new mirror shape (corresponding to the (i + d)th frame) is then given by ϕ dm i+d =φ tur i+d . Due to the considerations above, it immediately follows that ϕ dm 1 = · · · = ϕ dm d , where ϕ dm 1 defines the initial mirror shape. Since, in general, no information on ϕ dm 1 is available, we set ϕ dm 1 = 0.

Wavefront prediction
In general, the incoming wavefront is affected by several turbulent layers, which, according to the frozen flow assumption, are blown across the telescope aperture by the wind in different horizontal directions with different speeds. However, since the turbulence strength varies in dependence of the altitude according to the C 2 n profile, some of these layers have a stronger contribution to the incoming WF than the others. From (12) together with the discrete-time frozen flow equation (given in the alternative form ϕ tur,l i (r r r) = ϕ tur,l i+j (r r r + j∆ T v v v l )), we have Assuming one strong turbulent layer ϕ tur, i with ∈ {1, . . . , n L } dominating all the others, i.e. γ > γ l for all l ∈ {1, . . . , n L } \ { }, the incoming WF is mainly affected by this layer. In particular, we have that ϕ tur, i contributes the most to ϕ tur i and ϕ tur i+j is dominated by ϕ tur, i+j . Hence, according to the frozen flow assumption, the temporal evolution of the incoming wavefront is approximately determined by the temporal evolution of the dominant turbulence layer. Consequently, we may assume that with turbulence weighted average propagation vector v v v, which is mainly dictated by the shift vector v v v of the strongest layer. From (21) and (22), we finally obtain i.e. the temporal evolution of the incoming wavefront can be approximately described by a shift with a turbulence weighted propagation vector. Since, typically, the ground layer contributes the most to the incoming wavefront, a shift-based predictive control approach for a SCAO system is also possible in presence of multiple turbulent layers. In order to counteract temporal delays, the prediction of a future wavefront is essential. As described above, a shifted version of φ tur i can be taken as a prediction of ϕ tur i+d . However, this approach involves two main difficulties: on the one hand, the shift requires the identification of the actual windshift vector, and on the other hand we need information on the wavefront outside of the telescope aperture Ω A . In the following we outline a method which allows to compute an approximation of ϕ tur i+d based on the discrete reconstruction φ ϕ ϕ tur i . By using interpolation, φ ϕ ϕ tur i can be extended to a continuous function on Ω S . However, in order to obtain a function over the whole telescope aperture or a shifted version of Ω A , we additionally need information outside of Ω S . Therefore, we use extrapolation of the reconstructed data.
In the following, let v v v i ∈ R 2 denote the windshift vector corresponding to the ith frame, which, at the moment, is assumed to be known. In the next subsection we will deal with the extraction of v v v i . The domain of the predicted wavefront is then given by the shifted telescope aperture figure 6). Furthermore, let n N,i ∈ N define the number of grid nodes additionally gained by the extension. In order to extend the reconstructed data we use extrapolation. For any n ∈ {n N + 1, . . . , n N + n N,i }, let B ρn (r r r (n) ) := {r r r ∈ R 2 : |r r r − r r r (n) | ρ n } define the circle with radius ρ n > 0, surrounding the corner node r r r (n) of a newly added subaperture. Based on that, we further define N n := ñ : r r r (ñ) ∈ B ρn (r r r (n) ) Ω S .
Then, for given weights ωñ ∈ R satisfying ñ∈N n ωñ 1, we introduce the extrapolation operator which enables to extend the reconstructed data onto Ω + i . Now, let Q κ define the linear space of all polynomials of degree κ in each of the coordiantes r (1) , r (2) . Furthermore, let define the space of all ν -times differentiable, piecewise polynomial functions on the extension Ω + i . In order to extend ψ ψ ψ = E i [φ φ φ] to a continuous function, we define the operator where β s (ψ ψ ψ; r r r) := 4 c=1 ψ ψ ψ(n(s, c))β (s,c) (r r r), which generates a function in V i (Ω + i ) by means of shape functions β (s,c) ∈ Q κ , which interpolates the function values between the discrete values ψ ψ ψ(n(s, c)) obtained at the corner nodes of a subaperture. Thereby, the shape functions satisfies the conditions β (s,c) (r r r (s,c ) ) = δ c,c , where δ denotes the Kronecker symbol. Based on (27), we finally introduce the operator Note that any function φ ∈ V A,i has support Ω A . According to (28), the operator extends ψ ψ ψ ∈ R nN +nN,i to a continuous function ψ = S d∆Tv v vi i (ψ ψ ψ) which corresponds to a shifted version of the continuous extension of φ φ φ ∈ R nN . As one can see from the definition, any function φ in V A,i inherits the properties of the functions in V i (Ω + i ), i.e. φ is ν -times continuously differentiable and piecewise polynomial. As an immediate result, it follows that V A,i ⊂ H 1 (Ω A ).
Once the windshift vector v v v i is available, an approximation of the turbulent WF ϕ tur i+d (and thus also ϕ dm i+d ) can be determined (according to (25) and (28), see figure 7) bŷ

Calculation of the propagation vector
A method to obtain information on the windshift profile of the atmospheric layers using multiple Shack-Hartmann wavefront sensors is the so-called slope detection and ranging method, Figure 6. The subaperture grid Ω S (dark grey area) gets extended by additional subapertures (light grey areas) such that the shifted telescope aperture Ω + A (d∆ T v v v i ) (area restricted by the dashed circles) gets overlapped by the extension Ω + i .
short SLODAR [37]. In this method, phase aberrations of the wavefront are used to estimate the turbulence profile. More precisely, SLODAR uses wavefront slopes of two or more guide stars and their cross correlation to extract informations on the height, strength and velocity of the turbulence layers. However, in case of SCAO, only one guide star and one WFS is used, i.e. SLODAR can not be used. Furthermore, SCAO only provides information on the sum of the turbulence layer contributions. Therefore, a shift vector, which describes the propagation of the cumulated turbulence contributions, is required. Following the method proposed in [4,12], the problem of propagation vector identification is related to a cost function which has to be minimized. Here, the shift vector is calculated based on a reduced telescope aperture Ω − A . Beside the reduction of the computational costs, there is another important reason for computing the propagation vector on a reduced set. As we have seen in the wavefront prediction above, we have to extend the data outside of the telescope aperture to apply a shift on it. However, if we focus on a reduced telescope aperture we are able to circumvent this problem (see figure 8). As the turbulent flow is assumed to be constant over a certain number of time frames, it is also possible to calculate the propagation vector based on a sequence of previously reconstructed wavefronts, leading to more sophisticated cost functions, e.g. [19] with q, m ∈ N. In order to make the translation effect of small shifts more visible, it is recommended to use m > 1 [19]. In comparison to (30), this approach will in general increase the approximation quality of the reconstructed vector as more informations are used in the calculation. As a drawback, it also increases the computational costs to minimize (31). In [4,12,19], the methods above are proposed to calculate the wind velocity vectors of single turbulent layers. However, as discussed above (see (23)), these methods can also be used to determine the propagation vector in case of a SCAO system with multi-layered atmosphere. Basically, the cost functions (30) and (31) can be considered independently of the way how the reconstructed wavefronts are obtained, i.e. it does not matter if the approximations φ tur i are calculated based on a deterministic or a Bayesian reconstruction approach. However, in case of a deterministic approach, one has to keep in mind that the reconstruction method  (25) and (28). Here, the reconstructed values of the incoming wavefront are given by the green points. The red points correspond to the extrapolated data. ony minimizes effects of the measurement noise. In particular, the noise is not filtered out. To achieve a smoothing effect, it is recommended to average the reconstructions. Due to that, we consider an alternative approach in which we first compute the mean values of two bundles of reconstructed wavefronts and then use them to determine the propagation vector. Furthermore, the calculation is directly done on the discrete data provided by the reconstructor.
define the mean change of the reconstructed wavefronts φ ϕ ϕ tur i−u ,φ ϕ ϕ tur i−u−1 . . . ,φ ϕ ϕ tur i−(u+v)+1 over v time frames. Furthermore, let φ ϕ ϕ tur i−p , p ∈ N, be the most recent reconstruction among those used to compute the windshift vector. In order to calculate v v v i , we focus on a set of reconstructed wavefront data within a fixed, reduced telescope aperture with some a priori chosen constant a ∈ R + . For a set I ⊂ {1, . . . , n S } of indices let Ω I := ∪ s∈IΩs define a reduced subaperture grid such that Ω I ⊆ Ω − A . Furthermore, let N I := {ñ ∈ {1, . . . , n N } : r r r (ñ) ∈Ω I } be the set of indices of those nodes which correspond to the corners of the subapertures in Ω I . Consider w w w ∈ R 2 such that r r r (n) − w w w ∈ Ω S for all n ∈ N I . Introducing the operator the shift vector v v v i is calculated by solving the minimization problem with q i := min{i − (m + p), q max }, where q max ∈ N defines the maximal number of reconstructions used for averaging (32), and m ∈ N is used to make the translation effect of small windshifts more visible. Note that, if w w w equals the zero vector, then T 0 0 0 only corresponds to a restriction operator. By the definition of q i and (32), v v v i can only be calculated for i > m + p . For i m + q we define v v v i as the zero vector. In this case, no shift will be done, i.e. φ tur i+d =φ tur i . For any given vector v v v ∈ R 2 , it holds that {r r r ∈ R 2 : r r r + m∆ T would be an appropriate criterion for the constant, since the windshift hardly changes over a certain amount of time. However, for practical use, we may replace the factor max{|v v v j |} nF−d j=m+p+1 by an upper bound v max for the wind speed, and choose the constant according to a > m∆ T v max . A possible choice for v max would be the wind velocity at which the observation is no longer possible. Note that the choice of a represents a compromise between the reduction of the computational cost and the loss of information used for the calculation of the shift vectors.
Note that the parameter p in (35) has no effect on the computational cost of the minimization problem. Actually, p relates to the number of time frames needed to solve a problem of the form (35). If d 4 ∆ T , d 4 ∈ R + , defines an upper bound on the required time needed to calculate a shift vector, then we may choose p = min{p ∈ N :p d 4 }.
To guarantee a higher stability in the calculation of the shift vectors, one can also think of using exponential smoothing [19], i.e. if ṽ v v i denotes the solution of (35), then, for a fixed

Predictive control algorithm
Summarized, we obtain the following temporal control algorithm:  Beside the length ∆ T of the sampling period and the number d of frames of delay, the algorithm requires the input parameters p (which indicates the most recent reconstruction among those used to compute the windshift vector), m and q max (introduced in (35)), the loop gain γ gain , and the damping factor γ smooth , which is used for the calculation of the shift vector (see (36)). Furthermore, the algorithm involves the reconstruction operator R (19), the collocation operator P (see (20)), the operator E i (25) used for extrapolation, and the shift operators (28), respectively. We also use the operator B, which is similar defined as the interpolation operator B i (27) but only on the original aperture Ω A .

Numerical validation
For our simulations we consider a SCAO system operated in closed loop configuration on an assumed 37 m telescope with a central obstruction of 28%, i.e. a out = 18.5 m and a int = 18.5 m * 0.28. The WFS measurements from the observed guide star are received by a Shack-Hartmann wavefront sensor with 74 × 74 subapertures, which measure the wavefront slopes at a frequency of 500 Hz (i.e. ∆ T = 0.002 s) and a sensing wavelength of 700 nm. The edge length of such a subaperture is g = 0.5 m. Since not all subapertures are exposed equally, only n S = 3832 of them are active. Overall, we have n N = 4204 grid nodes.
To demonstrate the qualitative performance of the proposed temporal control method, we use European Southern observatory's (ESO's) official end-to-end simulation tool OCTOPUS [17,18]. The software enables to simulate a certain number of atmospheric turbulence layers located at different altitudes. Each layer is described by a von Karman model with outer scale L 0 = 25 m. For our purpose, we simulate median atmosphere conditions with Fried parameter r 0 = 0.157 m. The evolution of the atmosphere is considered as a frozen flow and is simulated by shifting the constant layers according to a wind shift profile. For more information on the parameters r 0 , L 0 and the von Karman turbulence model we refer the reader to [10,30,31].
For wavefront reconstruction we use the CuReD algorithm [32]. The loop gain γ gain given in (20) depends on the used reconstruction method among other things and is found empirically. In our case, we consider γ gain ∈ {0.1, 0.2, . . . , 0.9}. In order to extend the reconstructed data we extrapolate the given values on the boundary of the subaperture grid Ω S . Depending on the position of the node r r r (n) outside of Ω S , we choose ρ n as a multiple of the edge length g such that B ρn (r r r (n) ) Ω S (see (24)) only contains boundary nodes of the subaperture grid. The weights in (25) are chosen as ωñ = 1 |Nn|ω with some damping factor ω. In our case, we consider ω = 0.7. The DM, which is conjugated to the ground layer, is modeled by continuous (ν = 0 in (26)), piecewise bilinear functions, i.e. β (s,c) ∈ Q 1 = span{1, r 1 , r 2 , r 1 r 2 }, with a total number of 4204 degrees of freedom. In order to calculate the shift vector we focus on a reduced data set with |N I | = 1538 given values. This corresponds to an approximately 60% reduced aperture Ω − A in (33). Furthermore, we use the parameters p = 4, q max = 8, m = 2 in (35) and γ smooth = 0.4 in (36) for all test cases. To calculate the shift vectors (i.e. to solve problem (35)) we use the MATLAB c function fminsearch, which is based on the downhill simplex method (see [16]).
We simulate two seconds of real time for different configurations of atmospheric layers, where we assume d = 2, 3, 4 frames of delay. The achieved AO correction quality is evaluated in terms of long exposure Strehl ratios ∈ [0, 1] for a wavelength of 2200 nm after n F = 1000 iteration steps, where 1 is best and 0 worst. We run simulations for a NGS photon flux of 10 000 photons per subaperture per frame (ppspf) (high flux) and also for 10 ppspf (low flux).
Furthermore, we assume that the WFS measurements suffer from photon noise and read-out noise.
As test cases, we focus on single layer atmospheres and atmospheres with n L = 2,3,9 turbulence layers. In the former case we use one atmospheric layer located at an altitude of 30 m, which is simulated to move with a wind speed of 25 m s −1 , 15 m s −1 and 5 m s −1 , respectively. For the two-and three-layer atmospheres, we simulate two different layer configurations for demonstration purposes. Finally, we also consider the nine-layer ESO median seeing atmosphere.
All main parameters used for the simulations are summarized in table 1. A detailed list of the heights, wind speeds and shift directions of the single layers, and also the normalized layer strengths γ l can be found in table 2.
For the numerical validation, we compare the results of the AO correction using the mirror shape ϕ dm i+d =φ tur i+d determined by the prediction step (29) with those obtained by only using ϕ dm i+d =φ tur i (integrator-type control), where φ tur i corresponds to the continuous extension of the reconstruction (20).
As it can be seen from figures 9 and 10, we obtain a strong boost in performance in the low flux case for all simulated layer configurations. In the high flux case, we only get an improved AO correction in the cases where the wind speeds and also the time delays are large. It is noted that, in the high flux case, the main temporal limitation is due to time delays. From (14) and (23) together with the Greenwood time delay (2), we obtain that the perturbations in the incoming WF are approximately constant for j∆ T < τ 0 , i.e. for j ∈ N 0 with  (37) and |v v v| given as in (3). Hence, in the high flux case, predictive control may not always lead to an improvement in performance. For all simulated test cases, the values of |v v v| and J are given in table 3. However, since prediction-based control uses the limited photons obtained from faint guide stars optimally, an improvement is expected in the low flux case. As already mentioned, the parameter γ gain in (20) (which is used for stability purposes) is found empirically and mainly depends on the reconstruction method and the accuracy of the measurement data. In particular, γ gain will be different for high flux and low flux considerations and has to be determined for each simulation case separately. In table 4, we have listed the values of γ gain for all simulated test cases. At a first glance, it may be confusing that γ gain is smaller in case of low wind velocities and less time delays as for the other cases. However, since SCAO is operated in closed loop configuration, formula (20) can also be seen as update of the discrete DM shape Pϕ dm i with γ gain ϕ ϕ ϕ rec i . Since perturbations in the incoming WF are approximately constant for low wind velocities and less time delays (see discussion above), the DM shape remain approximately constant with respect to a certain number of time frames and thus an essential update of the DM shape is not required, i.e. γ gain is usually small. For a SCAO system, the WFS measures the sum of the turbulence contributions of the single atmospheric layers along the optical axis. This is also the only information available for the computation of the new mirror shape. Therefore, the use of shift-based predictive control methods becomes less accurate as the number of turbulence layers increases, which can move in different directions. Based on the approximations φ tur , only one shift vector can be determined, which mainly includes information of the shift direction of the most dominant layer. Nonetheless, we obtain a good AO correction in the low flux case for the nine-layer ESO median seeing atmosphere (see figure 10).  Long exposure (LE) Strehl ratios for different configurations of the simulated single-layer (above) and two-layer (below) atmospheres (defined in table 2) using the CuReD method combined with 1) integrator-type control and with 2) shift-based predictive control.  table 2) using the CuReD method combined with 1) integrator-type control and with 2) shiftbased predictive control.

Conclusion and outlook
In this paper, we consider a new, fully deterministic temporal control approach for SCAO systems operated in closed loop configuration. The derivation of the method is based on a detailed mathematical model of the AO correction process in terms of operator equations. The proposed method is based on the separation of the correction process into subproblems, including a prediction step, which is used to reduce the temporal error in the AO system. In particular, this approach enables to incorporate computationally efficient WF reconstruction algorithms like the CuReD method. The prediction step is carried out by an appropriate shift of the reconstruction. The corresponding shift vector is calculated based on a sequence of previously reconstructed wavefronts. The qualitative performance of the proposed temporal control method is illustrated in high flux and low flux regimes.
In view of the numerical results, the proposed control algorithm is a promising method to counteract temporal delays in SCAO. Due to its simplicity and flexibility, the algorithm can easily be adapted to a pyramid wavefront sensor and other WF reconstruction methods without major changes. One can also incorporate other methods for the computation of the shift vector and it is also possible to adopt the algorithm to other DM models. We believe this is also a very promising temporal control method adaptable to multi conjugate adaptive optics (MCAO) systems, which use multiple numbers of wavefront sensors and DM. Future work includes the development of more efficient methods for calculating the shift vectors and a comparison of the proposed method to other control algorithms-in particular with Kalman filter-based control algorithms.