Eye-pupil displacement and prediction: effects on residual wavefront in adaptive optics retinal imaging.

This paper studies the effect of pupil displacements on the best achievable performance of retinal imaging adaptive optics (AO) systems, using 52 trajectories of horizontal and vertical displacements sampled at 80 Hz by a pupil tracker (PT) device on 13 different subjects. This effect is quantified in the form of minimal root mean square (rms) of the residual phase affecting image formation, as a function of the delay between PT measurement and wavefront correction. It is shown that simple dynamic models identified from data can be used to predict horizontal and vertical pupil displacements with greater accuracy (in terms of average rms) over short-term time horizons. The potential impact of these improvements on residual wavefront rms is investigated. These results make it possible to quantify the part of the disturbances corrected by retinal imaging systems that is caused by relative displacements of an otherwise fixed or slowly-varying subject-dependent aberration. They also suggest that prediction has a limited impact on wavefront rms and that taking into account PT measurements in real time improves the performance of AO retinal imaging systems.


Introduction
Adaptive optics (AO) systems, which combine a wavefront sensor (WFS) and a deformable mirror (DM) inserted into the telescope's optical path, have been used since the early 1990s to counter the effects of atmospheric turbulence on ground-based telescopes [1]. This technique, together with associated DM and Hartmann-Shack (HS) WFS technology, has subsequently been successfully adapted to correct optical aberrations in retinal imaging. In 1994, Liang et al. demonstrated for the first time the feasibility of wavefront sensing in the eye [2]. This work was extended in 1997 by Liang and Williams, who measured the eye aberrations up to a very high order (65 Zernike modes) and were the first to close an AO loop on an eye in vivo and to obtain sharper images of a human retina [3] (see also [4,5]). In the 2000s, as noted in a 2010 review of emerging clinical applications of this technique, 'AO imaging has changed the way vision scientists and ophthalmologists see the retina, helping to clarify our understanding of retinal structure, function, and the etiology of various retinal pathologies' [6]. Existing retinal imaging AO systems use the same integral-action controller popular in astronomical AO to compute DM controls from WFS measurements. However, several more recent works have investigated and/or tested improved controller structures, including Smith predictors [7], adaptive controller tuning [8] and minimum-energy control for dual-deformable-mirror 'woofer-tweeter' systems [9,10].
In astronomy, accurate models of atmospheric turbulence make it possible to construct detailed 'error budgets', which enable designers of new AO systems to translate image quality requirements of end-users (i.e., astronomers) into detailed performance requirements for all components and elements of the AO loop. Clearly, such an understanding of the physical nature and statistical properties of the disturbances to be compensated would be hugely beneficial for future developments of retinal imaging AO systems. Studies of the trajectories of aberrations measured by HS WFSs suggested that different optical modes exhibit complex temporal behavior, hinting at a combination of diverse underlying mechanisms [11,12]. Another experiment showed the non-negligible contribution of the tear film [13]. However, a number of authors noted that eye motion played a major role, see, e.g., [14,15]. Thus, at least for the purpose of efficient DM control computation, a plausible conjecture is that a major part of performance degradation can be modeled as resulting from the relative displacements of a fixed or slow-moving and subject-dependent pupil aberration, as stated in [16]. One way to investigate the implications of this conjecture in terms of achievable performance is to consider an ideal case where the aberration seen by the AO system results only from the horizontal and vertical relative displacements (with respect to the imaging system) of an otherwise fixed pupil aberration. Throughout this paper, we shall call this simplified scenario the 'moving aberration' assumption.
For an ideal retinal imaging AO controller achieving perfect sensing of the aberration through the WFS and perfect compensation by the DM, the residual variance would be equal to the tracking error variance. In addition, under the 'moving aberration' assumption, this tracking error variance can be evaluated by taking a representative sample of fixed aberrations and moving them by appropriately distributed horizontal and vertical displacements during the total AO loop's delay between WFS measurement and DM correction.
The moving aberration hypothesis also suggests that a sensor capable of monitoring pupil displacements in real time could be used to improve aberration correction. In recent years, such camera-based 'pupil tracker' (PT) devices have been integrated into retinal imaging systems and have been used for a number of purposes, for example to detect and forecast where a subject is looking within a scene. In 2006, Hammer et al. used a pupil tracker integrated into a scanning laser ophthalmoscope system to actively compensate pupil movements in real time using a flat two-degree-of-freedom field stabilization mirror [17]. A WFS-based pupil tracker, which eliminates the need for a separate PT camera, was proposed in [18].
In 2012, Sahin et al. [16] implemented a PT-based real-time control scheme where DM's inputs were computed by shifting a previously estimated eye aberration across the horizontal and vertical axis according to pupil displacement measurements. This experiment conducted on a robotic 'model eye' and three human test subjects showed that the PT-based control achieved a level of correction performance broadly similar to a conventional AO controller. This pioneering work provided additional (if somewhat circumstantial) evidence in support of the moving aberration assumption. This PT-based control experiment used a PT system developed by the company Imagine Eyes.
The analysis presented in this paper extends the preliminary results obtained in [19], with more statistical considerations and in-depth performance evaluation. We used a set of pupil-displacement trajectories measured by the same PT device (see Acknowledgments section), together with measured wavefronts, to derive quantitative assessments of the tracking error variance term entering an AO error budget under the moving aberration assumption. The displacement data set comprises a total of 52 displacement trajectories, each about 13 seconds long, recorded by Betul Sahin at Imagine Eyes on 13 different healthy subjects, with a PT based on a camera running at 80 frames per second. The wavefront data set corresponds to 500 wavefronts measured on healthy subjects by Imagine Eyes, in the form of Zernike coefficients for radial orders 1 to 6. Using these data sets, we simulate and quantify the tracking error due to the displacement of the aberration between the time its last position has been measured by the system and the moment the DM correction is applied. As explained above, this corresponds to the ultimate performance (expressed in terms of residual wavefront rms) that could be achieved by a standard AO loop fitted with ideally accurate WFS and DM and operating at a given WFS/DM sampling rate, or alternatively by a purely PT-based controller endowed with an exact knowledge of the fixed aberration.
In both cases, the tracking error is expressed centrally as a function of the total delay between the last available WFS/PT acquisition and DM correction (in our case, for delays ranging from 12.5 ms to 85 ms). Also, this computation is predicated on the implicit assumption that the last available pupil position measurement is used to estimate the pupil position at the time when the control is actually applied. An alternative would be to use a more elaborate method to predict the pupil's displacement in this short time interval. To this effect, we have tested several standard short-time prediction methods based on simple dynamic stochastic models of eye pupil displacement.
In the domain of retinal imaging, whenever a patient is fixating on a target, involuntary residual movements are still present, due to a combination of eye and head movements. For instance, the human eye is known to have micro-movements [15] which are essential to our vision as they make it possible to maintain the transmission of the image fixed in our brain [20]. Such micro-movements are for example studied in [21], where the authors seek a mono-dimensional Auto-Regressive (AR) model of tremor movements. Martinez-Conde et al. [15] mention three main types of eye movements during visual fixation: tremor, drifts and micro-saccades. Quantitatively, pupil displacements occur with different translational/torsional amplitudes and frequencies (approximately 0.001-1 degrees and 0.1-100 Hz), and are known to vary significantly from individual to individual [22,18], and also for the same individual observed at different times. Dynamic stochastic models of eye pupil displacement therefore need to be adapted to each individual, and thus identified from each displacement trajectory.
This paper is organized as follows. Section 2 describes the PT set-up and the data sets used throughout the paper; a preliminary statistical analysis of the pupil displacement trajectories is presented, focusing on the issues of saccade detection in real time and PT frame-rate selection. Section 3 is devoted to construction and identification from PT data of stochastic pupil displacement models. Section 4 discusses the performance of these identified models for short-term prediction of pupil displacements. In Section 5, the tracking errors (residual wavefront rms) corresponding to the recorded pupil displacement trajectories applied to the measured wavefront data set are evaluated, and the impact of pupil prediction on performance is discussed. Some conclusions and perspectives for future work are presented in Section 6.

Description of the PT data and basic setup
The pupil tracking system (see [23] for more details) images the pupil with a CCD camera using near-infrared (NIR) LEDs as light sources, followed by digital image post-processing. The system works for 6-9 mm diameter pupils with an accuracy of ±20 μm (3σ). The measurement sampling frequency $F_s$ is 79 Hz or 80 Hz (depending on the acquisitions), with a CCD exposure time of 10 ms. The subject's head is stabilized with a standard ophthalmic chinrest (see Fig. 1). Measurements from this PT correspond to the eye pupil's positions along the horizontal and vertical axes. Our PT dataset consists of 52 trajectories, corresponding to data acquired from 13 different people, each trajectory being about 13 s long. Figure 2 shows an example of position measurements for a subject in our sample. These horizontal and vertical trajectories show abrupt changes in measured positions due to saccades (zones delimited by the dashed lines), which can also occur after eye blinking. These data are used for a study on short-term prediction of eye pupil positions, where the identified dynamic stochastic models also allow for automatic detection of saccades in real time. Then, we evaluate the impact on residual wavefronts of pupil displacement and its predictions. A total of 500 phase screens measured on different subjects by Imagine Eyes using a 32×32 high-resolution Shack-Hartmann (haso 32-eye Wavefront Sensor) were used for the study. They have been compared with synthetic ones obtained by Thibos' model [22] in Section 5.
We present in Fig. 3 two trajectories of horizontal and vertical pupil displacements, corresponding to the same subject, during fixation of a fixed target. This illustrates the high variability of behavior for the same subject, and the presence of the different types of eye micro-movements noted in the introduction. These can be described as follows:
• Drifts: slow motions of the eye (up to a few Hertz), on average 2-5 arcmin in amplitude with a mean speed of about 6 arcmin/s.
• Tremors: very high frequency oscillations. Tremor amplitudes and frequencies are usually in the range of the recording system's noise. Motion amplitudes average 5-60 arcsec at frequencies up to ≈ 90 Hz.
In these two recordings, one can distinguish clearly between blinks and saccades. A blink of the eye is distinguished from a saccade by the absence of measurements from the PT. Tremors correspond to smaller amplitude movements. It may also be noted that after a blink the position may differ from that before the blink. In other words, a blink is often accompanied by a saccade. Also, a saccade along the horizontal axis does not necessarily correspond to a saccade along the vertical axis, and vice versa.
With the selected sampling frequency of about 80 Hz, there will be aliasing of the tremor and micro-saccade spectra. (Shannon-Nyquist sampling theorem: the sampling frequency $F_s$ must be greater than twice the maximum frequency of the analog signal to avoid aliasing of frequencies > $F_s/2$ and to allow a perfect reconstruction from the sampled values.) In the case of tremor, its very low amplitude (less than the diameter of a cone, as reported in [15]) means that a large part of this movement falls within the system's noise range (20 μm), so that the aliasing of frequencies above 40 Hz will in any case have little effect. Micro-saccades occur randomly and are too short to be corrected in such an AO loop, but they are detectable thanks to their high amplitudes and will be regarded as outliers.
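To make the aliasing argument concrete, the following minimal sketch computes the apparent frequency of a sinusoidal component after sampling. The function name `alias_frequency` and the example frequencies are illustrative choices, not taken from the PT software.

```python
def alias_frequency(f_hz: float, fs_hz: float) -> float:
    """Apparent frequency, in [0, fs/2], of a sinusoid at f_hz sampled at fs_hz."""
    f_mod = f_hz % fs_hz               # fold into [0, fs)
    return min(f_mod, fs_hz - f_mod)   # reflect about the Nyquist frequency fs/2


# A tremor component near 90 Hz sampled at 80 Hz shows up at 10 Hz,
# while a 30 Hz component (below the 40 Hz Nyquist limit) is unchanged.
print(alias_frequency(90.0, 80.0))
print(alias_frequency(30.0, 80.0))
```

Under this folding rule, any tremor energy above 40 Hz is mapped back into the 0-40 Hz band, which is why its low amplitude relative to the PT noise matters.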
It has to be noted that measurements made by the PT result from the computation of the center of the pupil in the PT plane, the pupil positions resulting from the projection on this plane of a combination of eye and head movements. The center of the pupil is calculated taking the center of an ellipse fitted to the image of the pupil taken by the PT's camera [23]. Head movements may also induce in the measurements slow drifts and kinds of saccades, which become indistinguishable from the movements produced by the sole eye.
According to Thibos [22], one subject can present a high diversity in pupil movements; we therefore consider all the PT data analyzed here as independent from each other. Moreover, instead of modeling the eye pupil's positions, we will consider one-step displacements (i.e. differences between positions). This eliminates the need to estimate the absolute reference position, enables prediction algorithms to lock in immediately when saccades occur, and allows the process to be modeled around a zero value. Moreover, in an AO system, only displacements are necessary to update the control values. As these displacements are computed from noisy position measurements, they will in turn be affected by a stronger noise component.
Prior characterization of the movement amplitudes is necessary for the improvement of the position estimation process. Let us here consider all one-step displacements as an ensemble. Displacements depend on (and increase with) the time increment for which they are evaluated. We can then plot the absolute value of the one-step displacements as a function of the time delay between PT measurements. Figure 4 shows the upper bounds within which lie 99%, 95% and 90% of the absolute values of the one-step displacements (dashed, dot-dashed and dotted lines, respectively), and also the curve of their average values (dot-dashed line), as a function of the delay between measurements; the plots show that the amplitudes of practically all one-step displacements are concentrated below 200 μm, and that extreme changes in the trajectories represent only a few events. Also, it can be seen that at Δt = 12.5 ms, 95% of the one-step displacements have a value in the range of the PT measurement noise (20 μm). Thus, with this noise level, it would be pointless to increase the PT rate.
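The bounds described here can be computed along the following lines. This is a sketch under stated assumptions: the trajectory is a synthetic random walk standing in for one axis of a PT recording (the real data are not reproduced), and `abs_displacements` and `upper_bound` are hypothetical helper names.

```python
import math
import random


def abs_displacements(positions, lag):
    """|p[k] - p[k-lag]| for one axis; lag is in samples (lag * 12.5 ms at 80 Hz)."""
    return [abs(positions[k] - positions[k - lag]) for k in range(lag, len(positions))]


def upper_bound(values, fraction):
    """Smallest observed value below which at least `fraction` of the samples lie."""
    ordered = sorted(values)
    index = max(0, math.ceil(fraction * len(ordered)) - 1)
    return ordered[index]


# Hypothetical stand-in for one PT axis (microns): a random walk, 13 s at 80 Hz.
random.seed(0)
trajectory = [0.0]
for _ in range(13 * 80 - 1):
    trajectory.append(trajectory[-1] + random.gauss(0.0, 5.0))

for lag in (1, 2, 4):
    d = abs_displacements(trajectory, lag)
    print(f"delay {lag * 12.5:5.1f} ms: mean {sum(d) / len(d):6.1f} um, "
          f"95% bound {upper_bound(d, 0.95):6.1f} um")
```

Running the same two helpers on every recorded trajectory, for each delay, would give curves of the kind plotted in Figure 4.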

Models and algorithms for displacement prediction
The choice of a model implies a choice of the most suitable variables to work on. As noted above, we will consider one-step displacements defined as
$$\delta p_k = p_k - p_{k-1}, \qquad (1)$$
where $p_k$ is the vector of horizontal and vertical positions at sample time $k\Delta t$. Likewise, the measured one-step displacement $\delta p^m_k$ is defined as
$$\delta p^m_k = p^m_k - p^m_{k-1}, \qquad (2)$$
where $p^m_k$ is the PT measurement at time $k\Delta t$. In order to correct for aberrations, one needs to predict the actual position $p$. However, in order to assess the quality of this prediction, one would need to have access, at least afterwards, to the true values of the eye pupil positions. In our setup, this ground truth is not available to evaluate prediction performance. Performance is thus evaluated by comparing predictions with future measurements. Assuming that the measurement noise is an additive zero mean white noise with standard deviation $\sigma$ for both horizontal and vertical positions, the performance indicators based on position errors are affected by an additional uncertainty. This will not have a significant impact on the relative performance of prediction methods: the squared rms of the prediction error is equal to $\mathrm{rms}^2 = \|\hat{p} - p^*\|^2 + \sigma^2$, where $p^*$ is the true position (which is unknown). Therefore, the measurement noise affects the rms computation in the same way for all methods.
As a consequence of Eq. (2), the position at time index $k+\ell$, $\ell \geq 1$, is given by
$$p_{k+\ell} = p_k + \sum_{j=1}^{\ell} \delta p_{k+j}. \qquad (3)$$
An estimate of $p_{k+\ell}$ based on all the measurements available until time $k$ will then be computed as
$$\hat{p}_{k+\ell|k} = p^m_k + \sum_{j=1}^{\ell} \delta\hat{p}_{k+j|k}, \qquad (4)$$
where the $\delta\hat{p}_{k+j|k}$ are the predicted values of the one-step displacements.
We will consider in the following three different algorithms for predicting pupil displacements. The reference one, called 'dummy', simply takes the last available measurement as the prediction. It is the reference because it is the simplest, and because when the PT frame rate is of the order of the AO loop frequency, the dummy predictor is equivalent to the ideal AO loop, where perfect WFS measurements are used for perfect DM correction. We will also consider two predictors based on Kalman filters, with simple autoregressive models of order 1.

A simple (dummy) predictor
Since one-step displacements are weakly correlated, a simple way to predict the future positions is to take $\delta\hat{p}_{k+j|k} = 0$ for any $j \geq 1$. This leads to taking the last measured position $p^m_k$ as the predicted value for all future positions:
$$\hat{p}_{k+\ell|k} = p^m_k. \qquad (5)$$
The underlying position model is a random walk driven by a white noise sequence. This will hereafter be called the dummy predictor.
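Because predictions are scored against later measurements, the dummy predictor's error at horizon ℓ is simply the difference between two measurements ℓ samples apart. A minimal sketch for one axis follows; the function name `dummy_prediction_rms` and the toy data are illustrative only.

```python
import math


def dummy_prediction_rms(measured, horizon):
    """rms of p_m[k + horizon] - p_m[k], i.e. the dummy prediction error
    evaluated against the measurement taken `horizon` samples later."""
    errors = [measured[k + horizon] - measured[k]
              for k in range(len(measured) - horizon)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Toy one-axis trajectory (microns), not PT data: a steady 1 um/sample drift.
positions = [0.0, 1.0, 2.0, 3.0, 4.0]
print(dummy_prediction_rms(positions, 1))
print(dummy_prediction_rms(positions, 2))
```

For a pure drift the error grows linearly with the horizon, which is the behaviour a model-based predictor is meant to improve on.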

Observer with parameter estimation
We reformulate the prediction problem in standard state-space form:
$$x_{k+1} = A x_k + v_k, \qquad (6)$$
$$y_k = C x_k + w_k, \qquad (7)$$
where $v$ and $w$ are independent Gaussian white noises with covariance matrices $\Sigma_v$ and $\Sigma_w$ respectively. The vector $y_k$ denotes the measured output, in our case $y_k = \delta p^m_k$. Assuming uncorrelated horizontal and vertical one-step displacement measurements with the same variance, we take here $\Sigma_w = \sigma_w^2 I$. The so-called state vector $x$ contains the quantities that need to be estimated. In order for $x$ and $y$ to be stationary processes, $A$ should have all its eigenvalues inside the unit circle. The magnitude of the eigenvalues expresses how fast the variable $x$ decorrelates in time. A simple autoregressive model of order 1 (AR1) has been chosen for both horizontal and vertical one-step displacements. A scalar AR1 process $\{\eta\}$ is defined as
$$\eta_{k+1} = a\,\eta_k + v_k,$$
where $v$ is a zero mean Gaussian white noise and $|a| < 1$. Using such a model for horizontal and vertical displacements leads to $x = \delta p$ and
$$A = \begin{pmatrix} a_h & 0 \\ 0 & a_v \end{pmatrix}, \qquad C = I.$$
Assuming uncorrelated horizontal and vertical displacements, the covariance matrix $\Sigma_v$ is taken as $\Sigma_v = \sigma_v^2 I$. When $A$ and $C$ are known, the state $x$ can be estimated optimally (in the sense of minimizing the variance of the estimation error) using an optimal observer, the widely used Kalman filter (see, e.g., [24], or [25] for an example in the different context of gaze prediction and anatomical-based models). It shall hereafter be referred to as the SKF, for standard Kalman filter. The SKF is described by recursive equations summarized as
$$\hat{x}_{k|k} = \hat{x}_{k|k-1} + H_k \left( y_k - C \hat{x}_{k|k-1} \right), \qquad \hat{x}_{k+1|k} = A\,\hat{x}_{k|k},$$
where the so-called Kalman gain $H_k$ is itself computed recursively together with $\Sigma_{k+1|k}$, the covariance matrix of the prediction error (see below).
As said above, this formulation requires the matrix $A$ to be known. This is not the case here and therefore an on-the-fly identification is required along with the prediction. In our case, matrix $A$ contains two unknown parameters, $a_h$ and $a_v$. The parameter vector to be estimated is therefore
$$\theta = (a_h, a_v)^T,$$
and we shall denote as $A(\theta) = \mathrm{diag}(\theta)$ the corresponding value of $A$. Two solutions with low computational burden have been tested: Kalman filtering with separate parameter identification, and the extended Kalman filter.
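Since the horizontal and vertical axes are modeled as uncorrelated, the filter can be illustrated on a single axis. The sketch below assumes a scalar AR1 state with C = 1 and a known coefficient a; the names `skf_step`, `predict_position`, `q` and `r` (process and measurement noise variances) are hypothetical. The multi-step position prediction uses the fact that, for an AR1 model, the displacement predicted j steps ahead is a^j times the current filtered displacement.

```python
def skf_step(x_pred, s_pred, y, a, q, r):
    """One scalar Kalman iteration for x[k+1] = a*x[k] + v, y[k] = x[k] + w.

    x_pred, s_pred: predicted displacement and its error variance (given data up to k-1).
    Returns the filtered displacement, the next prediction and its error variance.
    """
    gain = s_pred / (s_pred + r)            # Kalman gain H_k for C = 1
    x_filt = x_pred + gain * (y - x_pred)   # measurement update
    s_filt = (1.0 - gain) * s_pred
    x_next = a * x_filt                     # time update (one-step prediction)
    s_next = a * a * s_filt + q
    return x_filt, x_next, s_next


def predict_position(p_meas, x_filt, a, horizon):
    """Last measured position plus the sum of predicted displacements a**j * x_filt."""
    return p_meas + sum(a ** j * x_filt for j in range(1, horizon + 1))


# Illustrative values only (microns and microns squared).
x_filt, x_next, s_next = skf_step(x_pred=0.0, s_pred=1.0, y=2.0, a=0.5, q=0.1, r=1.0)
print(x_filt, x_next, s_next)
print(predict_position(p_meas=10.0, x_filt=x_filt, a=0.5, horizon=2))
```

Setting a = 0 in `predict_position` recovers the dummy predictor, which makes the comparison between the two approaches explicit.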

Standard Kalman filter with recursive least squares
We now use a recursive least-squares (RLS) procedure (see, e.g., [26]) to update the estimation of parameter $\theta$ based on all the data obtained until time $k\Delta t$. This method is applicable to models that are linear in parameters, i.e. with measurements that can be expressed as
$$y_k = r_k \theta + w_k.$$
For our AR1 models with independent displacements, $r_k$ is a diagonal matrix with diagonal terms $\delta p_{k-1}$, and the covariance matrix of $\{w\}$ is also diagonal. Thus, the RLS algorithm can be split into two independent scalar estimators. We give here the more general version, which gives at each time step $k$ an estimated value $\theta_k$:
1. Get $y_k$ (new measurement).
2. Using the previous estimate $\theta_{k-1}$, calculate the output error $e_k = y_k - r_k \theta_{k-1}$.
3. Compute the estimation error covariance matrix $\Sigma^{mc}_k$: $\Sigma^{mc}_k = \left( (\Sigma^{mc}_{k-1})^{-1} + r_k^T r_k \right)^{-1}$.
4. Define the gain $L_k$ as $L_k = \Sigma^{mc}_k r_k^T$.
5. Compute the new estimate as $\theta_k = \theta_{k-1} + L_k e_k$.
6. Increment $k$ and go to step 1.
The algorithm starts at $k = k_0$ such that the matrix $r_{k_0}^T r_{k_0}$ is invertible, and with suitable initial values for $\theta$ and $\Sigma^{mc}$. One iteration of the SKF+RLS identification/prediction procedure can then be summarized as follows:
1. Get $y_k$ (new measurement).
2. Compute the estimation gain.
6. Run an RLS step to update parameter $\theta_k$.
8. Compute one-step and two-step predictions.
9. Increment $k$ and go to step 1.
All variables are initialized to zero, except for covariance matrices, which are set to $\lambda I$, with $\lambda \gg 1$. Note that $\hat{x}_{k+1|k}$ and $\Sigma_{k+1|k+1}$ are no longer optimal, nor do they correspond anymore to conditional expectations; $\hat{x}_{k+1|k}$ is only an estimate based on all measurements until time index $k$. This will also be true for the extended Kalman filter described below.
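As noted earlier, the RLS estimator splits into two independent scalar estimators, one per axis. The sketch below shows one such scalar recursion, initialized as described above (estimate at zero, covariance λ ≫ 1). It assumes the standard information-form covariance update, which is an assumption of this example; `rls_step` and the noise-free test sequence are hypothetical.

```python
def rls_step(theta, sigma, regressor, y):
    """One scalar RLS update for the model y[k] = regressor[k] * theta + noise."""
    error = y - regressor * theta                          # output error e_k
    sigma = 1.0 / (1.0 / sigma + regressor * regressor)    # updated error covariance
    gain = sigma * regressor                               # gain L_k
    return theta + gain * error, sigma


# Noise-free AR1 displacements with a = 0.6 (illustrative, not PT data).
a_true = 0.6
displacements = [1.0]
for _ in range(20):
    displacements.append(a_true * displacements[-1])

theta, sigma = 0.0, 1e6        # zero initial estimate, covariance lambda >> 1
for k in range(1, len(displacements)):
    theta, sigma = rls_step(theta, sigma, displacements[k - 1], displacements[k])
print(theta)
```

With a large initial covariance the estimate moves almost entirely to the data after the first sample, which is the intended effect of choosing λ ≫ 1.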
When using the SKF+RLS method, we have set some limits on the diagonal values of the estimated transition matrix. (This matrix is 2×2, comprising the two axes of movement.) They were forced to lie between 0.3 and 0.9 (a value of one is equivalent to the random walk case), and then the state transition matrix was updated (at each time step) by taking the average of the 40 latest estimated values for its diagonal elements. This ensured that the parameter values would not diverge.
Also, in order to get the most out of the models described in Section 3, we had to scan through a set of input parameters, such as the relative strength of the noise and process covariance matrices. We screened a wide range of values, and found that taking $\sigma_v^2/\sigma_w^2 \approx 100$ gave the best results in terms of rms of the difference between prediction and measured data. Performance is however not very sensitive to this value. It should nevertheless not be too small, otherwise filters may be destabilized.
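The bounding and smoothing of the estimated coefficients can be sketched as follows for one axis. The class name `BoundedParameter` is hypothetical, and the order of operations (clip first, then average the last 40 clipped values) is our reading of the description above.

```python
from collections import deque


class BoundedParameter:
    """Clip each raw estimate to [low, high], then average the most recent values."""

    def __init__(self, low=0.3, high=0.9, window=40):
        self.low = low
        self.high = high
        self.history = deque(maxlen=window)   # keeps only the latest `window` values

    def update(self, raw_estimate):
        clipped = min(self.high, max(self.low, raw_estimate))
        self.history.append(clipped)
        return sum(self.history) / len(self.history)


a_h = BoundedParameter()
for raw in (1.4, 0.8, 0.1, 0.7):     # illustrative raw RLS outputs
    smoothed = a_h.update(raw)
print(smoothed)
```

The value returned by `update` is what would be placed on the diagonal of the transition matrix at each time step.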

Extended Kalman filter
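Another classical approach, very close to the previous one, consists in using an extended Kalman filter (EKF, see e.g. [24]). In this formulation, parameters to be estimated are also considered as state variables, so that an extended state denoted by $x^e$ is formed:
$$x^e = \begin{pmatrix} x \\ \theta \end{pmatrix}.$$
Based on Eq. (6)-(7), we can write:
$$x_{k+1} = A(\theta_k)\, x_k + v_k, \qquad \theta_{k+1} = \theta_k,$$
which can be rewritten in a compact matrix form as
$$x^e_{k+1} = f^e(x^e_k) + v^e_k,$$
where
$$f^e(x^e_k) = \begin{pmatrix} A(\theta_k)\, x_k \\ \theta_k \end{pmatrix} \quad \text{and} \quad v^e_k = \begin{pmatrix} v_k \\ 0 \end{pmatrix}.$$
The function $f^e(x^e_k)$ mixes the data and parameters in a nonlinear way. The extended Kalman filter based on such a system corresponds to an observer in the form:
$$\hat{x}^e_{k+1|k} = f^e(\hat{x}^e_{k|k}).$$
The observer gain $L_k$ is computed using a first order Taylor expansion of $f^e$ around the current estimated value:
$$f^e(x^e_k) \approx f^e(\hat{x}^e_{k|k}) + A_k \left( x^e_k - \hat{x}^e_{k|k} \right),$$
where $A_k$ is the Jacobian matrix evaluated at $\hat{x}^e_{k|k}$ and defined as
$$A_k = \left. \frac{\partial f^e}{\partial x^e} \right|_{x^e = \hat{x}^e_{k|k}}.$$
In the case of the AR1 models in Eq. (6), we have
$$A_k = \begin{pmatrix} A(\hat{\theta}_{k|k}) & \mathrm{diag}(\hat{x}_{k|k}) \\ 0 & I \end{pmatrix}.$$
The gain matrix $L_k$ is computed using this linear approximation and the measurement Eq. (7). Using the extended state, this equation can be rewritten as
$$y_k = C x^e_k + w_k,$$
with $C = (I \;\; 0)$. One iteration of the complete procedure can be summarized as:
• Given the previously estimated state vector $\hat{x}^e_{k|k-1}$ and a new measurement, calculate the innovation $\tilde{y}_k = y_k - C \hat{x}^e_{k|k-1}$.
• Update the state: $\hat{x}^e_{k|k} = \hat{x}^e_{k|k-1} + H_k \left( y_k - C \hat{x}^e_{k|k-1} \right)$.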
The updated state vector and the covariance matrix of the prediction error are calculated through an algorithm adapted from Yi Cao [27]. It is worth noticing that an extension to any ARn model is possible through this formulation, using the original A and C matrices of the SKF model to build A. For further details refer to [24].
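For the two-axis AR1 case, the extended transition function and its Jacobian are small enough to write out explicitly. The sketch below orders the extended state as [δp_h, δp_v, a_h, a_v], which is a choice made for this example, and checks the analytic Jacobian against central finite differences; all function names are hypothetical and this is not the implementation adapted from [27].

```python
def f_ext(x_ext):
    """Extended transition: displacements are scaled by their AR1 coefficients,
    and the coefficients themselves are propagated unchanged."""
    dp_h, dp_v, a_h, a_v = x_ext
    return [a_h * dp_h, a_v * dp_v, a_h, a_v]


def jacobian(x_ext):
    """Analytic Jacobian of f_ext: blocks [[diag(a), diag(dp)], [0, I]]."""
    dp_h, dp_v, a_h, a_v = x_ext
    return [
        [a_h, 0.0, dp_h, 0.0],
        [0.0, a_v, 0.0, dp_v],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def numerical_jacobian(func, x, eps=1e-6):
    """Central finite-difference Jacobian, used here only as a cross-check."""
    n = len(x)
    jac = [[0.0] * n for _ in range(n)]
    for j in range(n):
        plus, minus = list(x), list(x)
        plus[j] += eps
        minus[j] -= eps
        f_plus, f_minus = func(plus), func(minus)
        for i in range(n):
            jac[i][j] = (f_plus[i] - f_minus[i]) / (2.0 * eps)
    return jac


x0 = [2.0, -1.0, 0.6, 0.8]   # illustrative extended state
max_diff = max(abs(a - b)
               for row_a, row_n in zip(jacobian(x0), numerical_jacobian(f_ext, x0))
               for a, b in zip(row_a, row_n))
print(max_diff)
```

The upper-right block is where the nonlinearity enters: it couples each coefficient estimate to the current displacement estimate on the same axis.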

Masking high amplitude movements
Sudden, large-amplitude movements can lead the eye pupil to be only partially seen by the wavefront sensor, producing poor quality measurements. The identification of such events is important not only for the post-processing (in which images acquired during that particular period are not taken into account: they are generally completely blurred), but also for the real-time parameter estimation process. Bearing Fig. 2 in mind, we have set up an automatic procedure for identifying abrupt changes in the trajectories, which were masked out in real time during the RLS parameter estimation process. Hereafter we consider that the PT is working at an 80 Hz frame rate, and the masking is designed for this sampling rate, in real time. The same mask is used in the EKF estimation, where we freeze the state vector during such events; this is done to have a fair basis of comparison between the methods.
Once a sudden, large-amplitude movement is identified, the masking is applied to both axes of the PT data (i.e. for horizontal and vertical movements). At time $k$, the standard deviation $\sigma_k$ of all $\delta p^m$ up to that moment is computed, and a flag of 'non-useful data' is ascribed to a temporal window comprising $k \pm 2$ if $|\delta p^m_k| > 3.5\sigma_k$. This procedure masks out up to about 16% of the trajectories in our data sample. Absolute displacements beyond $3.5\sigma_k$ are rare when firmly fixating a target and can be masked in the post-processing (i.e. discarding the corresponding science images; see Section 5 on wavefront errors).
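The masking rule can be sketched as below. Several details are assumptions of this example rather than statements from the text: σ_k is taken as the population standard deviation including the current sample, flagged samples are not excluded from later σ_k values, and the two axes are combined with a logical OR. The function names and the test sequence are hypothetical.

```python
import math


def axis_mask(displacements, threshold=3.5, half_window=2):
    """Flag samples k-2..k+2 whenever |dp[k]| exceeds threshold * sigma_k,
    with sigma_k the running standard deviation of dp[0..k]."""
    n = len(displacements)
    masked = [False] * n
    total = 0.0
    total_sq = 0.0
    for k, d in enumerate(displacements):
        total += d
        total_sq += d * d
        count = k + 1
        variance = max(0.0, total_sq / count - (total / count) ** 2)
        sigma = math.sqrt(variance)
        if count > 1 and abs(d) > threshold * sigma:
            for j in range(max(0, k - half_window), min(n, k + half_window + 1)):
                masked[j] = True
    return masked


def saccade_mask(dp_h, dp_v):
    """A sample is non-useful on both axes if either axis triggers the rule."""
    return [h or v for h, v in zip(axis_mask(dp_h), axis_mask(dp_v))]


# Illustrative data: small alternating displacements with one abrupt jump at index 30.
dp_h = [1.0, -1.0] * 15 + [50.0] + [1.0, -1.0] * 5
dp_v = [1.0, -1.0] * 20 + [1.0]
mask = saccade_mask(dp_h, dp_v)
print([i for i, flagged in enumerate(mask) if flagged])
```

In a real-time setting the samples at k-2 and k-1 have already been processed when the jump is detected, so flagging them mainly affects post-processing.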

Description of prediction error rms data
Adopting the state vector defined in Section 3, we compare here the measured and predicted positions for multiples of Δt = 12.5 ms, the sampling period of the PT. We call horizontal (H) and vertical (V) the two position axes measured by the PT, in order to avoid confusion with the variables x and y used previously. Performance of the dummy predictor and of the predictors based on SKF+RLS and on EKF was evaluated by computing, for each of the 52 trajectories, the rms (in microns) of the prediction errors for horizontal and vertical movements and for prediction horizons ranging from one to five time steps. Table 1 presents the mean rms value, standard deviation, maximum value for each direction and prediction horizon, together with the success rates for both Kalman filters, i.e. the percentage of trajectories for which they perform better than the dummy predictor.

Improvement in mean rms
For most of the 10 combinations of direction and prediction horizon, both Kalman predictors give lower average rms and standard deviation than the dummy predictor (the extended Kalman filter has higher average and standard deviation for horizontal displacements and prediction horizons of 4 and 5 steps). The Kalman filters also have lower maximum rms in all 10 cases. However, the values for the dummy and Kalman predictors are not very different (they differ on average by less than 2 μm, i.e. much less than the PT measurement error itself), and it is necessary to perform a statistical test.
For the two Kalman predictors, for every direction (horizontal and vertical) and prediction horizon (1 to 5 steps), we performed a standard one-sided t-test (based on Student's statistics, using the Matlab function ttest.m) to decide whether the mean rms of the prediction error for SKF+RLS or EKF is significantly lower than the mean rms for the dummy predictor. Table 2 presents the p-values for each of these 10 tests (i.e., the probability of observing a Student test statistic as extreme as, or more extreme than, the observed value under the null hypothesis that the two mean rms are equal). For vertical displacements, both Kalman filters perform significantly better (at the 5% significance level) than the dummy predictor for all prediction horizons. For horizontal displacements, SKF+RLS and EKF perform significantly better than the dummy predictor for horizons ranging respectively from Δt to 4Δt and from Δt to 3Δt. Better results are obtained for horizontal displacement predictions for all methods, with smaller average rms, standard deviations and maximum values. This is in agreement with [28], where the authors noticed that, for fixational eye movements, the horizontal components are much more strongly correlated than the vertical ones at the short time scale. They explain this by the fact that micro-saccades are controlled by different brainstem regions, as reported in [29].
Table 1. Comparison of the performance of the predictors for all trajectories. Each cell gives the mean, standard deviation and maximum value of the series of 52 horizontal and vertical prediction error rms values (in μm) computed for all trajectories; the success rate (r) of both Kalman predictors over the dummy predictor for the 52 trajectories is displayed in the last two lines. Saccades and abrupt changes in the trajectories have been masked out in the estimation process, as described in the text. H: horizontal; V: vertical. Δt = 12.5 ms.
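The test statistic behind such a comparison can be sketched as follows. This example assumes a paired design (one rms value per trajectory for each predictor), which is how Matlab's ttest treats two input vectors; whether the original analysis used exactly this form is an assumption. The function name and the four rms values are illustrative, and the p-value itself would be read from a Student distribution with the returned degrees of freedom.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(rms_kalman, rms_dummy):
    """t statistic and degrees of freedom for the per-trajectory differences
    rms_kalman - rms_dummy; a large negative t favours the Kalman predictor."""
    diffs = [k - d for k, d in zip(rms_kalman, rms_dummy)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


# Illustrative rms values in microns (not taken from Table 1).
t_stat, dof = paired_t_statistic([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 5.0, 5.0])
print(t_stat, dof)
```

For the one-sided alternative "Kalman mean rms is lower", the null hypothesis is rejected when the left-tail probability of t_stat falls below the chosen level.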
Also, larger prediction horizons lead to non-significant differences: for large horizons, the performance of the Kalman filters is not improved with respect to the dummy predictor. This is not surprising, as the models of the Kalman filters have been kept simple and built for short-term prediction (at most 2Δt). Examples with and without good rms improvement are given in Figs. 5 and 6 respectively, where a zoomed part of one horizontal position trajectory is plotted along with predictions for Δt and 2Δt. The solid black line denotes the measurements, whereas the dashed one represents the different predictions. Longer horizons introduce a larger gap between predicted and measured trajectories, but also spurious peaks, especially in the Kalman cases, as illustrated by Fig. 5, where a peak starts to appear between 3 and 3.2 s for Kalman predictors with 2Δt, significantly increasing the rms.
Table 3 is another way to look at the results, by taking the success rate of the Kalman-based predictions over the dummy prediction along the trajectories themselves. For each horizontal and vertical displacement trajectory, we compute the percentage of sample times where the Kalman prediction is closer to the measurement than the dummy prediction (in terms of absolute value of the prediction error for each sample). These percentages are then averaged over the 52 trajectories, and standard deviation, maximum and minimum values are given. Any mean percentage above 50% indicates a prediction improvement on average over the dummy predictor. Maximum and minimum values when Kalman predictors are better than dummy lie within the mean ± 2 standard deviation interval, so there are no outliers. The first maximum value of 77% (SKF+RLS, H, max. when better) means that over all 52 trajectories, for the horizontal displacements, the prediction with SKF+RLS is better than dummy for at most 77% of the samples. Similarly, the first minimum value of 40% (SKF+RLS, H, min. when better) indicates that the prediction with SKF+RLS is better than dummy for at least 40% of the samples.
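The per-sample success rate described above can be sketched as follows. This is only an illustration: the function names and the toy numbers are hypothetical, ties are counted as failures (an assumption), and the sample standard deviation is used because the text does not specify which estimator was applied.

```python
import statistics

def success_rate(measured, kalman_pred, dummy_pred):
    """Percentage of samples where the Kalman prediction is strictly closer
    to the measurement than the dummy prediction (absolute error)."""
    wins = sum(
        abs(m - k) < abs(m - d)
        for m, k, d in zip(measured, kalman_pred, dummy_pred)
    )
    return 100.0 * wins / len(measured)

def summarize(rates):
    """Mean, sample standard deviation, max and min of per-trajectory rates."""
    return {
        "mean": statistics.mean(rates),
        "std": statistics.stdev(rates),
        "max": max(rates),
        "min": min(rates),
    }

# Toy positions in micrometres, made up for illustration (not from the paper).
measured = [10.0, 12.0, 11.0, 15.0]
kalman = [10.5, 11.0, 12.5, 13.0]
dummy = [9.0, 10.0, 12.0, 11.0]

rate = success_rate(measured, kalman, dummy)   # one rate per trajectory
summary = summarize([40.0, 60.0, 77.0])        # hypothetical set of rates
```

In this reading, a mean rate above 50% corresponds to a predictor that beats the dummy one on most samples.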

Relative performance improvements and degradations
Finally, it is interesting to appreciate one- and two-step prediction performance in terms of rms when Kalman predictors perform better than dummy (for SKF+RLS H and V, 92% and 69% of the cases, see the last two lines of Table 1). We compare, in Table 4, (i) the relative performance improvement obtained by the Kalman predictors when they perform better than dummy, and the relative performance degradation when they perform worse, and (ii) the relative performance improvement obtained by the dummy predictor when it performs better than Kalman predictors, and the relative performance degradation when it performs worse. Maximum values lie within the mean ± 3 standard deviation interval (except for the first value of 95% in the last line).
Table 3. Prediction error along trajectories: mean, standard deviation, maximum and minimum values of the percentage of sample times in which the Kalman filters give a smaller prediction error than the dummy predictor. Any mean percentage above 50% indicates a better prediction than with dummy.
The values presented here confirm that, on average, Kalman predictors perform better than dummy: for one-step-ahead prediction, the mean improvement for 92% of the horizontal trajectories is around 24%, and is around 11% for about 70% of the vertical trajectories. For these trajectories, where dummy is worse than the Kalman predictors, the performance degradation is about 33% and 14% for horizontal and vertical trajectories respectively, while the maximum degradation is very high (95% and around 44% respectively). For the trajectories where dummy has a better rms (8% for horizontal and around 30% for vertical, as deduced from Table 1), the improvement is very low (at most 7%), and the performance degradation induced by Kalman predictors on these trajectories is also very low.

Pupil displacement prediction: summary of results
Our experimental results confirm that a better prediction of the pupil displacements can be achieved with the use of the Kalman tools. This improvement (measured as the mean rms of the prediction error) is statistically significant for vertical displacements at all prediction horizons and for horizontal displacements at prediction horizons below 4Δt (with a slight advantage to the SKF+RLS method). However, this also suggests that even at an 80 Hz PT rate, the measurements are weakly correlated, so the Δp process at this rate tends to be close to a random walk, leading to results only slightly better than the ones obtained with the dummy model. The goal of the next section is to evaluate the impact of pupil motion on the residual wavefront and the level of improvement brought by the predictors presented above.

Experimental data: impact of pupil motion on residual wavefront errors
In order to evaluate the impact of pupil motion on imaging quality, we have performed a series of simulations of the displacement of eye aberrations. The goal is to evaluate, given a frozen aberration, the contribution to the residual wavefront due only to pupil displacement (moving aberration hypothesis). Note that, since we focus on the residual wavefront error budget, hardware limitations are not taken into account and the frozen aberration is assumed to be known (i.e., there is no fitting error for the correction and aberrations are perfectly represented, with no spatial discretization due to a wavefront sensor).

Generation of phase screens
Phase screens were generated using sets of normalized Zernike coefficients obtained from wavefronts measured on real healthy subjects by Imagine Eyes. The coefficients ranged from radial orders 2 to 6, totaling 25 modes. The phase screens were generated inside a larger frame than the actual pupil size of the instrument, so we could always see aberrations when shifting the phase screens to mimic movements. When setting up the size of the phase screen frame, the statistics of the eye movements (Section 2) were taken into account, and extreme eye movements were actually not considered in the final computation of the rms (as the images acquired during such events can be discarded in post-processing).
The statistics of the generated phase screens were compared with a parameterization given in the literature. Based on Fig. 9A of [22] and a rough estimate of its parameters for our phase screen size, let us assume that the wavefront variance $\sigma_n^2$ (a 'partial' variance, in the sense that it is due only to the modes of one radial order) depends on the Zernike radial order $n$ as $\sigma_n^2 = 8.96\,\exp(-1.5\,n)$. Using this expression, synthetic phase screens can be generated by taking a set of normalized Zernike modes ($Z_k$) and linearly combining them with Gaussian random weights with zero mean and standard deviation $[\sigma_n^2/(n+1)]^{1/2}$. The histograms of synthetic and real data wavefront rms are shown in Fig. 7. The distributions are not consistent (two-sample Kolmogorov-Smirnov test (Matlab function kstest.m) at 5% significance level) due to the large scattering of the real data, but have consistent median values (as attested by the Mann-Whitney U-test (Matlab function signtest.m) at 5%). We have considered that this data set obtained from measured wavefronts was more reliable for performance evaluation than synthetic data, and have used it for our performance analysis.
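As an illustration of the synthetic generation rule, the sketch below draws one set of Zernike weights for radial orders 2 to 6 from the variance law quoted above. It stops at the coefficients: evaluating the Zernike polynomials on a grid is left out, and the function names, the seed and the use of Python's standard random generator are assumptions made for this example only.

```python
import math
import random

def sigma2(n):
    """Partial wavefront variance for radial order n (law quoted in the text)."""
    return 8.96 * math.exp(-1.5 * n)

def draw_zernike_coeffs(rng, n_min=2, n_max=6):
    """Draw zero-mean Gaussian weights, one per Zernike mode.

    Radial order n contains n + 1 modes; each gets a standard deviation of
    sqrt(sigma2(n) / (n + 1)), so the order as a whole carries sigma2(n).
    Returns a list of (radial_order, weight) pairs.
    """
    coeffs = []
    for n in range(n_min, n_max + 1):
        std = math.sqrt(sigma2(n) / (n + 1))
        coeffs.extend((n, rng.gauss(0.0, std)) for _ in range(n + 1))
    return coeffs

rng = random.Random(0)             # fixed seed, for reproducibility only
coeffs = draw_zernike_coeffs(rng)  # 25 modes for radial orders 2 to 6

# With orthonormal modes, the expected wavefront variance is the sum of the
# partial variances, in the (squared) units of the fitted law.
expected_rms = math.sqrt(sum(sigma2(n) for n in range(2, 7)))
```

A phase screen would then be obtained as the weighted sum of the normalized modes $Z_k$ evaluated on the screen grid.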

Wavefront error analysis
The advantage of a PT such as the one analyzed here is that it can run at a faster rate than the wavefront sensor, as mentioned in [16]. Any error budget analysis therefore requires computing differences between a reference phase screen (the corrected one) and a displaced phase screen. This difference has to be piston and tip-tilt corrected, because these modes introduce only an image translation. Since the final science image is produced by stacking individual frames, the translation can easily be accounted for in post-processing. In this section we focus on the impact of the PT frame rate and of the prediction models on the error budget.
Once the phase screens have been generated, it is possible to predict the wavefront error resulting from a random pupil displacement. Figure 8 was built from 50 phase screens randomly selected among 500; for each of them, 10 random directions were generated for every displacement amplitude; displacement amplitudes ranged from 0 to 10% of the pupil size, in steps of 1%. The reference phase screen is kept centered, and the rms of the difference between the latter and the displaced one is computed after piston and tip-tilt modes have been removed. The figure shows average rms values along with their standard deviations (curve and error bars, respectively). The error increases almost linearly with displacement: for a displacement of 10% of the pupil size not accounted for by the system correction, the wavefront error can reach, on average, about 140 nm, with a large scatter around this value (bars correspond to ±σ, with σ the empirical standard deviation computed over 500 values).
We can connect this result with the displacement statistics of Section 2, Fig. 4. Those statistics were built based on the delay between PT position measurements, so such delays can be considered as the inverse of the PT frame rate. The expected residual wavefront rms as a function of this rate is shown in Fig. 9, which has been obtained by combining Figs. 8 and 4. For the latter, both horizontal and vertical displacements were considered together (calculating the square root of the sum of the squared displacements in both directions), instead of using the statistics for separate axes. The sampling frequency of the WFS is shown in the figure as a reference point. It corresponds in our set-up to the case without PT, that is, a delay of 5Δt between WFS measurement and DM correction (and thus image acquisition), which gives an eye pupil position sampling frequency of 16 Hz. We used the mean/median rms values and made a linear fit on the logarithmic scale. From that we get a general expression for the residual rms expected for a given PT frame rate (in Hz):
$$\text{wavefront rms} \approx \alpha \times (\text{PT frame rate})^{\beta} \quad [\text{nm}] \qquad (28)$$
with α = 63.8 and β = −0.69 using the mean wavefront rms, and α = 42.1 and β = −0.72 using the median ones. Such fits are shown as red and green lines in Fig. 9, respectively. The parameters have been estimated from the 500 realizations mentioned in Fig. 8, and can vary slightly around these values depending on the phase screens randomly selected for the simulations (about ±3 for α and ±0.01 for β). From this analysis, we confirm that, in order to minimize the residual error, we have to keep the PT frame rate as high as possible. The power around −0.7 obtained above tells us that we double the mean rms error if, for example, we drop the PT frequency from 80 Hz to 30 Hz. The curves in Fig. 9 thus chart the gain in performance to be expected from the use of a PT at a faster rate than the WFS. The case where the PT rate is equal to the WFS rate (about 10 Hz) indicates the level of performance that could be achieved without PT. It shows that for 95% of the one-step displacements, the performance loss in terms of wavefront rms is lower than 50 nm (and lower than 100 nm for 99% of the displacements).
This is to be compared with the rms of the wavefront itself (without displacement), which lies between 0.1 and 8 μm approximately. Depending on the trajectory, the rms of the aberration caused by the eye displacement may thus represent a large percentage of the total aberration.
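The log-log fit behind Eq. (28) and the "doubling" statement can be checked with a few lines. This is a minimal sketch: the sample points are synthetic values generated from the fitted law itself (they are not the paper's data), and the function names and the choice of rates 80/k Hz are assumptions for illustration.

```python
import math

def fit_power_law(rates_hz, rms_nm):
    """Least-squares fit of log(rms) = log(alpha) + beta * log(rate)."""
    xs = [math.log(r) for r in rates_hz]
    ys = [math.log(v) for v in rms_nm]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    beta = sxy / sxx
    alpha = math.exp(my - beta * mx)
    return alpha, beta

def predicted_rms(rate_hz, alpha=63.8, beta=-0.69):
    """Eq. (28) with the parameters reported for the mean wavefront rms."""
    return alpha * rate_hz ** beta

# Synthetic, noiseless points at rates 80/k Hz (delays of k PT frames).
rates = [80.0 / k for k in range(1, 6)]
samples = [predicted_rms(r) for r in rates]

alpha, beta = fit_power_law(rates, samples)

# Ratio of expected rms when the PT rate drops from 80 Hz to 30 Hz.
ratio = predicted_rms(30.0) / predicted_rms(80.0)
```

With β close to −0.7, the ratio evaluates to slightly less than 2, consistent with the statement that the mean rms error roughly doubles between 80 Hz and 30 Hz.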
In [16], the authors present an AO system running at about 10 Hz (8.4 Hz precisely), for which the AO loop led to an average rms of 0.12 ± 0.05 μm for the three studied subjects (a PT was used in this experiment at the same sampling frequency of 8.4 Hz to verify that eye aberrations were mainly frozen wavefronts moving with the eye). If we suppose that the error budget due to the eye displacement is at this frequency at most 40 nm rms (the value at 95% in Fig. 9), this represents a third of 0.12 μm. With a PT running at, e.g., 80 Hz, this budget could decrease to 7 nm rms (the 95% value at 80 Hz), which lies in the error range of the experiment. For an imaging wavelength of 850 nm, going from 120 nm rms to about 113 nm ($= \sqrt{120^2 - 40^2 + 7^2}$) would lead to an increase of about 5 points of Strehl ratio (SR), which is very significant in terms of image quality. (The Strehl ratio was computed with the Maréchal approximation $\mathrm{SR} \approx \exp(-\sigma_{\phi_{\mathrm{rms}}}^2)$, where the phase variance (in rad²) is given by $\sigma_{\phi_{\mathrm{rms}}}^2 = \left(\frac{2\pi}{\lambda_{\mathrm{im}}}\,\mathrm{rms}\right)^2$, with $\lambda_{\mathrm{im}}$ the imaging wavelength.)
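The arithmetic of this numerical example can be reproduced with the short sketch below. It assumes, as the expression for the 113 nm value does, that the error terms are independent and add in quadrature; the function names are hypothetical.

```python
import math

def strehl_marechal(rms_nm, wavelength_nm=850.0):
    """Marechal approximation: SR ~ exp(-sigma^2), sigma = 2*pi*rms/lambda (rad)."""
    sigma_phi = 2.0 * math.pi * rms_nm / wavelength_nm
    return math.exp(-sigma_phi ** 2)

def swap_budget_term(total_nm, old_term_nm, new_term_nm):
    """Replace one quadrature term of an rms error budget by another."""
    return math.sqrt(total_nm ** 2 - old_term_nm ** 2 + new_term_nm ** 2)

rms_before = 120.0                              # nm, AO loop residual from [16]
rms_after = swap_budget_term(120.0, 40.0, 7.0)  # nm, about 113 nm

# Strehl gain expressed in points (percent of the diffraction-limited peak).
gain_points = 100.0 * (strehl_marechal(rms_after) - strehl_marechal(rms_before))
```

With these inputs the approximation gives a gain of a few Strehl points, of the same order as the figure quoted above; the exact value is sensitive to the rms assumed for the displacement term.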

Impact of position predictions on wavefront error budget
With a PT working at 80 Hz, we can now check the impact of the position predictions on the wavefront errors. These can be obtained by computing the root mean square of the difference between a reference wavefront displaced to the real position $p$ and the same wavefront displaced to the estimated position $\hat{p}$, at each PT frame. This reflects the wavefront error contribution due to the position estimation error, when computed inside the fixed and centered pupil of the imaging camera, after piston and tip-tilt removal. The larger phase screen frames have been generated with a resolution of about 6 μm/pixel, so they were well-suited to track the impact of small prediction errors in the positions. The simulations consist of computing the average rms of the difference between displaced screens along a whole trajectory, and this is done for 20 phase screens. Absolute displacements of the phase screen larger than about 500 μm have not been taken into account in the computation of the rms error, as the corresponding images would be discarded. The histograms of the overall rms average values are shown for the 52 pupil trajectories in Fig. 10 for prediction horizon Δt.
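The rms computation described above can be sketched as follows. This is not the simulation code used for the paper: the aberrations are simple analytic polynomials chosen for illustration, positions are expressed in pupil radii rather than micrometres, and the screen is evaluated analytically instead of being shifted pixel by pixel inside a larger frame. All names are hypothetical.

```python
import numpy as np

def pupil_grid(n_pix=65):
    """Coordinates (in pupil radii) and mask of a centered circular pupil."""
    c = np.linspace(-1.0, 1.0, n_pix)
    x, y = np.meshgrid(c, c)
    return x, y, (x ** 2 + y ** 2) <= 1.0

def remove_piston_tilt(values, x, y):
    """Least-squares removal of piston, tip and tilt from in-pupil samples."""
    basis = np.column_stack([np.ones_like(x), x, y])
    coeffs, *_ = np.linalg.lstsq(basis, values, rcond=None)
    return values - basis @ coeffs

def prediction_error_rms(phase_fn, p_true, p_est, n_pix=65):
    """rms of W(r - p_true) - W(r - p_est) inside the pupil,
    after piston and tip-tilt removal."""
    x, y, mask = pupil_grid(n_pix)
    xm, ym = x[mask], y[mask]
    diff = (phase_fn(xm - p_true[0], ym - p_true[1])
            - phase_fn(xm - p_est[0], ym - p_est[1]))
    resid = remove_piston_tilt(diff, xm, ym)
    return float(np.sqrt(np.mean(resid ** 2)))

def defocus(x, y):
    """Illustrative defocus-like aberration (arbitrary units)."""
    return x ** 2 + y ** 2

def coma(x, y):
    """Illustrative coma-like aberration (arbitrary units)."""
    return (x ** 2 + y ** 2) * x

small = prediction_error_rms(coma, (0.05, 0.0), (0.04, 0.0))
large = prediction_error_rms(coma, (0.05, 0.0), (0.00, 0.0))
defocus_rms = prediction_error_rms(defocus, (0.05, 0.0), (0.0, 0.0))
```

In this toy setting, a displaced pure defocus differs from the reference only by piston and tilt, so its residual vanishes, whereas for the coma-like term the residual grows in proportion to the position error.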
Mean values of the wavefront error rms, shown in the histograms as dashed lines, are gathered in Table 5. Comparing these values with the non-PT case in Fig. 9, we see once again that the performance improvement is important when using a PT with a fast rate. In the case of a fast rate, predicting the pupil position can be made simple by keeping the last measured value. Using more complicated schemes does not bring significant improvement here (less than 0.5 point of SR for the numerical example presented in the previous section).
Table 5. Mean wavefront residuals (nm rms) obtained from the prediction models, for three different time delays. These numbers are to be compared with the non-PT case, corresponding to an approximate delay of 5Δt and which leads to a mean and median rms of 12

Discussion and conclusions
We have investigated in this work the impact that a pupil tracker working at a different rate than the WFS can have on a retinal imaging instrument equipped with adaptive optics. Using a data set of 52 trajectories provided by Imagine Eyes, we started by analyzing the statistics of the eye movements, from which we could derive constraints both for real-time and post-processing of science images. Through this analysis, it has been verified that the vast majority of pupil one-step displacements lie (in absolute value) below 200 μm, and on average below 30 μm for measurement delays inside the AO cycle. Three models have been presented for pupil position estimation, assuming a pupil tracker working at 80 Hz. The first was the 'dummy' one, which corresponds to keeping the latest measurement as a predictor for the next position (and which represents the best solution if the one-step displacements behave as a random walk). The two other predictors are of Kalman filter type, based on an auto-regressive model (order 1) of the position, with parameters estimated in real time either through a recursive least squares (RLS) approach or by including the parameter in the state vector (extended Kalman filter, EKF, approach). Extreme displacements have been masked out automatically in the real-time process, since they represent the abrupt changes known as saccades.
Although one-step pupil displacement trajectories are weakly correlated, the pupil position prediction obtained using identification/prediction tools such as Kalman filters can improve results with respect to the dummy solution in terms of position prediction error rms. Improving prediction performance further would require more complex models and predictors, capable of detecting and adapting to the different displacement behaviors in real time.
We have used 500 phase screens from healthy subjects, obtained by linearly combining Zernike modes with measured aberration coefficients provided by Imagine Eyes. The idea was to test the impact of the position predictors on the wavefront error, neglecting any other external contributors to the system error budget. The difference between a reference phase screen and the displaced one gives, inside the system's pupil, the basis for the calculation of the wavefront error.
Considering a fixed reference at the pupil center, we have checked how pupil displacement affects the wavefront error. The statistics of the absolute value of the one-step displacements allow us to assess the impact of the pupil tracker frame rate on the wavefront error. We have derived a relation in which the expected wavefront error rms is proportional to the pupil tracker rate raised to the power −0.7. This means, for instance, that if the pupil tracker rate drops from 80 Hz to 30 Hz, one can expect the rms wavefront error to double. Comparing what is expected from pupil tracking at 10 Hz (which corresponds to an ideal AO set-up at 10 Hz with perfect WFS measurement and DM correction) with pupil tracking at 80 Hz, the performance improvement in terms of wavefront rms is significant. We have compared the error budget caused by the pupil displacement with real experiments presented in [16], for a system running at around 10 Hz. It is shown that for such a system, the maximum degradation due to pupil displacement, when compared with the use of a PT at 80 Hz, may be significant. The numerical example led to an error budget of a third of the given experimental wavefront rms, and a Strehl ratio decrease of about 5 points, which is significant in terms of image quality.
Rates higher than 80 Hz have not been considered, as at this rate the one-step displacements have a value in the range of the PT measurement noise (20 μm). Other PT data with higher rates (as in [30], where aberration dynamics are studied at 236 Hz) and lower measurement noise could allow the results to be extended to higher PT sampling frequencies.
We have also checked the impact of the pupil position predictors on the residual wavefront, assuming a pupil tracker rate of 80 Hz. Comparing the performance of the one- and two-step Kalman predictors with that of the dummy one, we found no significant improvement on average in terms of residual wavefront rms. In this sense, just keeping the latest measurement should be enough for a good wavefront prediction. However, for other new imaging systems like AO SLO/OCT, the methods presented here could be of interest, for example to compensate for in-frame motion distortions, as reported in [31].
Further developments include the establishment of the error budget for a complete system. The error budget due to pupil movement then needs to be completed with all the other budget terms (WFS measurement errors, deformable mirror fitting error, error due to WFS spatial discretization, precision of the fixed aberration estimation, calibration errors). Only the residual wavefront and the fixed ocular aberration estimation depend on pupil movement. Taking into account all the remaining sources of rms wavefront error is essential to assess the complete performance of a particular AO system. The next step is then to perform experiments combining AO retinal imaging with a PT device operating at a higher rate, and to compare retinal images in terms of visual quality. This is clearly beyond the scope of this paper and left for future research. The results presented here nevertheless make it possible to evaluate, for any system, the possible improvement brought by a pupil tracker at different rates, when the static wavefront is supposed to be known (under the moving aberration assumption as proposed in [16]). In that sense, they are independent of system components and estimation algorithms altogether, and can thus be used as part of a global system error budget analysis.