High-resolution, ultrafast, wide-field retinal eye-tracking for enhanced quantification of fixational and saccadic motion

We introduce a novel, noninvasive retinal eye-tracking system capable of detecting eye displacements with an angular resolution of 0.039 arcmin and a maximum velocity of 300°/s across an 8° span. Our system is designed around a confocal retinal imaging module similar to a scanning laser ophthalmoscope. It utilizes a 2D MEMS scanner to achieve image frame acquisition rates up to 1.24 kHz. In contrast with leading eye-tracking technology, we measure eye displacements by collecting the observed spatial excursions over all the times corresponding to a full acquisition cycle, thus obviating the need for both a baseline reference frame and absolute spatial calibration. Using this approach, we demonstrate the precise measurement of eye movements with magnitudes exceeding the spatial extent of a single frame, which is not possible with existing image-based retinal trackers. We describe our retinal tracker and tracking algorithms, and assess the performance of our system using programmed artificial eye movements. We also demonstrate the clinical capabilities of our system with in vivo subjects by detecting microsaccades with angular extents as small as 0.028°. The rich kinematic ocular data provided by our system, with its exquisite degree of accuracy and extended dynamic range, opens new and exciting avenues in retinal imaging and clinical neuroscience. Several subtle features of ocular motion, such as saccadic dysfunction, fixation instability and abnormal smooth pursuit, can be readily extracted and inferred from the measured retinal trajectories, thus offering a promising tool for identifying biomarkers of neurodegenerative diseases associated with these ocular symptoms.


Introduction
The human eye is an optical instrument in constant motion. Even during stable fixation, eye movements exhibit a broad range of magnitudes and frequencies [1,2]. Early research on eye movement was pioneered by Hering [3] and Lamare [4] at the end of the 19th century. Emerging techniques such as mechanical recording and photography were subsequently replaced by suction caps [5], scleral search coils [6], and the dual Purkinje image eye tracker [7]. Notably, the scleral search coil approach has an excellent signal-to-noise ratio (SNR) and a typical resolution of 0.25 arcmin [2,6]. However, it is a highly invasive method involving the use of topical anesthesia and specialized contact lenses. Currently, owing to their non-invasive nature and ease of use, the most popular eye trackers are video-based devices that utilize anterior eye features. Their typical tracking accuracy ranges from 18 to 30 arcmin [8,9].
Consequently, the range of measured displacements is not bounded by the size of the acquired frames. Moreover, the eye displacement calculations are performed with inherent immunity to the accumulation of tracking error, which is achieved by an algorithm built around the concept of Key Frames (KFs). In the upcoming sections, we describe in detail the optical design of our device and the engineering of its quantitative algorithms, including an explanation of the KF concept, and we assess its tracking performance and accuracy. We demonstrate the capabilities of the FET by showing examples of saccadic and fixational eye movements. The performance of the FET in terms of accuracy and dynamic range makes it a tool well suited for clinical diagnostics in ophthalmology and neurology.

Optical setup
The FET's design consists of a retinal scanner with confocal detection inspired by the SLO. In order to achieve kilohertz frame acquisition rates, we acquire relatively small images with 4,432 pixels per frame. The schematic diagram of the FET optical setup is shown in Fig. 1. The illumination source is a 785-nm laser diode (LP785-SAV50, Thorlabs, USA) coupled into a single-mode fiber and collimated to a beam diameter of 0.7 mm (1/e²) by an aspherical lens CL. The pellicle beam splitter BS reflects the beam and directs it onto a 2D scanning mirror with a 1-mm microelectromechanical systems (MEMS) active aperture (VC3141/5/48.4, VarioS 2D microscanner, Fraunhofer IPMS). After reflecting off the scanning mirror, the beam passes through a 4f telescope composed of lenses L4 and L3. The telescope conjugates the scanner's aperture with a pair of galvanometric mirrors FET PS (GVS002, Thorlabs, USA), which steer the scanning pattern to the selected region of interest on the retina (for example images, refer to Fig. 2). The conjugate plane of the MEMS scanning mirror is then imaged onto the eye pupil plane by a second 4f telescope composed of lenses L2 and L1. Lens L2 is mounted on the translation stage MS for the correction of spherical refractive error. The beam reflected off the retina retraces the same path, is de-scanned by the MEMS mirror, passes through the pellicle beam splitter, and is collected by lens L5. The pinhole PH in the focal plane of L5 is conjugate with the retinal plane and rejects out-of-focus light. The signal from the retina is detected by the avalanche photodiode APD (A-Cube-S1500-01, Laser Components, Germany) and is processed by a PC. System synchronization and data acquisition are performed using custom software engineered in the LabVIEW environment (National Instruments, USA).
A critical element of our design for ultrafast image acquisition is the MEMS 2D scanning mirror. Its maximum operating scanning frequencies are 20 kHz and 620 Hz in the fast and slow axes, respectively. The maximum achievable frame rate of 1,240 fps derives from the shortest scanning half-period possible in the slow axis. This scanning frequency is achievable for an aperture size of 1 mm, with larger apertures requiring longer acquisition times. This tradeoff poses a design challenge requiring a compromise between the maximum scanning angle, the beam size at the cornea, and the throughput parameter T, defined as the ratio of the scanning mirror aperture to the diameter of the beam reflected by the retina [50]. In our design, the relay optics L1-L4 serve to balance the system's magnification. Ideally, for T ≥ 1, no light reflected off the eye is lost at the mirror aperture. The returning light exits the eye pupil with its full aperture in the range 4-7 mm [51]. Assuming scotopic measuring conditions with a 7-mm pupil, a magnification of 1/7 yields T = 1. However, this case would require a beam 4.9 mm in diameter entering the eye, which would drastically reduce the scanning angles and, in turn, the lateral resolution of the images [52].
In order to achieve maximum MEMS scanner deflections, the optical scanning angles were set to ±4.64° and ±4.29° in the fast and slow axes, respectively. For our studies, we set the FOV to 3.37° × 3.24°, as measured using the USAF 1951 test target. The imaging beam diameter at the cornea was 1.96 mm. The calculated on-axis beam diameter on the retina in this case was 8.5 µm. In our setup, the design throughput was T = 0.7 for a 4-mm pupil and T = 0.4 for a 7-mm pupil. By design, the confocality of the system was traded off in order to gain sensitivity by using a 100-µm pinhole. The Airy disc diameter at the detection was 36.4 µm and the times-diffraction-limited (TDL) number of the optical system is 2.75 [50].

Fig. 1. Schematic diagram of the optical design of the FET. LD - laser diode, CL - collimating lens, BS - pellicle beam splitter, L1-L9 - achromatic doublet lenses, MEMS 2D - two-axis resonant MEMS scanning mirror, FET PS - positioning galvanometric mirrors, D - variable iris aperture, PH - pinhole, APD - avalanche photodiode, HIL - halogen lamp, BST - Badal stage, AES - artificial eye scanner set, AEO - artificial eye objective, AET - artificial eye test target. Stars and circles denote conjugated plane pairs. Inset A: FT1, FT2 - targets for static fixation and saccadic measurements, respectively. The diameter of FT1 subtends 1.5°. The diameters of individual targets in FT2 subtend 0.6° and their variable baseline separation range is 1-8°. Inset B: artificial eye for system testing and calibration, described in subsection 2.3.

Fig. 2. Retinal images acquired in the living eye (1-15) and in the artificial eye (I-III). Images 1-2: part of an optic nerve; images 3-4: fovea; images 5-15: retinal vasculature with different sizes. Images I-III in the red frame were acquired using an artificial eye (see description in subsection 2.3) and are shown here for visual comparison with images of the living eye. Angular extent of scale bars is 1°.
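The throughput values above follow from simple relay-optics geometry. A minimal numeric check, under the assumption that the returning beam is demagnified by the same factor that expands the 0.7-mm source beam to 1.96 mm at the cornea (an assumption consistent with the quoted figures):

```python
# Throughput T = (scanner aperture) / (returning beam diameter at the scanner).
# The returning beam fills the eye pupil and is demagnified by the relay optics
# on its way back to the 1-mm MEMS aperture. The magnification is inferred from
# the 0.7-mm beam at the scanner expanding to 1.96 mm at the cornea.
APERTURE_MM = 1.0            # MEMS scanner active aperture
MAGNIFICATION = 1.96 / 0.7   # scanner plane -> eye pupil plane (= 2.8)

def throughput(pupil_diameter_mm: float) -> float:
    """Fraction of the returning beam accepted by the scanner aperture (capped at 1)."""
    beam_at_scanner = pupil_diameter_mm / MAGNIFICATION
    return min(1.0, APERTURE_MM / beam_at_scanner)

print(round(throughput(4.0), 2))  # -> 0.7, the design value for a 4-mm pupil
print(round(throughput(7.0), 2))  # -> 0.4, scotopic 7-mm pupil
```

With the 1-mm aperture and a magnification of 2.8, this reproduces T = 0.7 for a 4-mm pupil and T = 0.4 for a 7-mm pupil, matching the design values quoted above.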
In order to perform the experiments with human subjects, the system was equipped with a fixation path, separated by a dichroic mirror (DM) from the imaging/tracking path. A halogen lamp was used to illuminate the target (Fig. 1, Inset A) projected onto the retina through a Badal optometer setup [53]. The Badal setup consisted of lenses L7-L8 and mirrors M3-M4, mounted on a movable stage BST to correct the subject's spherical refractive error without magnifying the target. This feature keeps the angular size of the target constant regardless of the refraction of the subject's eye. Additionally, the system is designed to be combined with other imaging modalities, such as SLO and/or OCT, that can provide high-resolution imaging of the subject's retina and additional information such as cyclotorsion measurement [54]. Adding a module with an extra pair of scanners to imaging modalities connected to the FET via a dichroic mirror (inserted, for example, between M5 and L6) opens up the possibility of active motion correction of the acquired images of the eye.

Retinal motion tracking algorithm
The MEMS scanner sweeps the beam following the path of the Lissajous scanning pattern shown in Fig. 3(a). Acquired raw images are then re-sampled to produce the uniformly sampled rectangular images shown in Fig. 3(b). This step is performed using the matrix-vector multiplication

f = M f_L,    (1)

where f is an R × R rectangular image organized in a single row-by-row vector of size R², M is the sparse resampling matrix of size R² × K, and f_L = [f_L,1, f_L,2, ..., f_L,K] is the intensity data vector acquired by the APD along the Lissajous scanning path, f_L,i is a single APD reading, i ∈ {1, 2, ..., K}, and K is the number of data points per frame. The matrix M is constructed from the Lissajous coordinates so that each row of M has s non-zero elements at indices that correspond to the s closest distances between the Lissajous coordinates and the coordinates in the resulting image.
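A minimal sketch of this resampling step. The values of R, K, s and the Lissajous frequency ratio are illustrative rather than the instrument's actual scan parameters, and each row of M is given uniform 1/s weights, which the text leaves unspecified (distance-based weights would be equally plausible):

```python
import numpy as np
from scipy.sparse import csr_matrix

# Sketch of the Lissajous-to-rectangular resampling f = M @ f_L (Eq. 1).
R, K, s = 32, 4432, 4
t = np.arange(K)
# Illustrative Lissajous path in the unit square [0, 1]^2
lx = 0.5 + 0.5 * np.sin(2 * np.pi * 20 * t / K)
ly = 0.5 + 0.5 * np.sin(2 * np.pi * 19 * t / K + 0.5)
path = np.column_stack([lx, ly])                      # (K, 2)

# Rectangular grid coordinates of the output image (row-by-row vector)
gy, gx = np.mgrid[0:R, 0:R] / (R - 1)
grid = np.column_stack([gx.ravel(), gy.ravel()])      # (R^2, 2)

# Each row of M averages the s nearest samples on the Lissajous path
rows, cols, vals = [], [], []
for i, p in enumerate(grid):
    d = np.linalg.norm(path - p, axis=1)
    nearest = np.argpartition(d, s)[:s]               # s closest path points
    rows += [i] * s
    cols += nearest.tolist()
    vals += [1.0 / s] * s
M = csr_matrix((vals, (rows, cols)), shape=(R * R, K))

f_L = np.random.rand(K)           # stand-in for one frame of APD readings
f = (M @ f_L).reshape(R, R)       # resampled rectangular image
```

Since M depends only on the scan geometry, it can be precomputed once and reused for every frame, which is what makes the per-frame resampling a single sparse multiply.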
The algorithm that reconstructs the retinal trajectory operates in two stages. In the first stage, the retinal trajectory is estimated by the N-back algorithm, in which each new frame is aligned with n previous frames, as shown in Fig. 3(c-f). This stage consists of two steps: in the first step, the trajectory estimate produced by the N-back algorithm is used to remove motion artifacts from the frames, which are then used in a second iteration of the N-back algorithm. In the second stage, estimation errors in the trajectory reconstructed in the first stage are corrected.
In the N-back algorithm, each new retinal trajectory point is calculated from displacements measured between the most recently acquired frame and N previous frames. In the simplest case, N = 1, the trajectory point is computed from only one previous frame:

t_m = t_{m-1} + p_{m,m-1},    (2)

where t_m = {x_m, y_m} is the trajectory point in Euclidean space, m is the frame index, and p_{a,b} denotes the displacement between frames a and b. We estimate p_{a,b} by minimizing a functional criterion D(f_a, f_b) which quantifies the quality of the registration of frames f_a and f_b. In particular, we use the enhanced correlation coefficient (ECC) [55] criterion, which readily provides sub-pixel precision. However, alternative criteria can be implemented interchangeably based on hardware and computation latency requirements. In order to reduce the effect of stochastic noise on the trajectory point calculation, Eq. (2) can be applied to a number N of previously acquired frames and the results averaged. This defines the full version of the N-back algorithm, where the resulting point is the weighted average of positions calculated from displacements measured using frames from an empirically chosen subset B, with some acquired up to half a second earlier:

t_m = ( Σ_{n∈B} w_{m,m-n} (t_{m-n} + p_{m,m-n}) ) / ( Σ_{n∈B} w_{m,m-n} ),    (3)

where n is the index of frames counted backwards from the newly acquired frame, with n = 1 for the frame directly preceding the new frame.

Fig. 3 caption (panels b-h): b) Image frame created by re-sampling the data to a rectangular grid with equidistant pixels, without motion compensation. c) Image frame corrected for motion artifacts. d) Calculation of displacement between consecutive frames using image correlation. e) Displacement calculated for a number of previously acquired frames. f) Trajectory recovery using the N-back algorithm. g) Trajectory correction using Key Frames. h) Saccade detection using eye motion velocity and calculation of saccade magnitude.
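The recursion of Eqs. (2)-(3) can be sketched as follows. The pairwise registration is replaced by a hypothetical `displacement(m, n)` callback, here driven by a simulated ground-truth trajectory with additive noise standing in for the ECC estimator, and the offset subset B is illustrative:

```python
import numpy as np

# Sketch of N-back trajectory recovery (Eqs. 2-3): each new point is the
# weighted average of (previous point + measured displacement) over a subset
# B of historical frame offsets.
B = [1, 2, 4, 8]   # illustrative subset of historical offsets

def n_back(displacement, num_frames):
    """displacement(m, n) -> (p_xy, weight) between frames m and m-n."""
    traj = [np.zeros(2)]
    for m in range(1, num_frames):
        acc, wsum = np.zeros(2), 0.0
        for n in B:
            if m - n < 0:
                continue
            p, w = displacement(m, n)
            if w > 0:                      # w = 0 marks unusable frame pairs
                acc += w * (traj[m - n] + p)
                wsum += w
        traj.append(acc / wsum if wsum else traj[-1].copy())
    return np.vstack(traj)

# Simulated ground-truth trajectory and a noisy pairwise "registration"
rng = np.random.default_rng(0)
true = np.vstack([np.zeros(2),
                  np.cumsum(rng.normal(0, 0.1, size=(199, 2)), axis=0)])

def displacement(m, n):
    noise = rng.normal(0, 0.01, 2)         # sub-pixel registration error
    return true[m] - true[m - n] + noise, 1.0

est = n_back(displacement, 200)
print(np.abs(est - true).max())            # small residual drift
```

Averaging over several historical frames dilutes the per-pair registration noise, though (as the text notes below) a slow random-walk drift still accumulates and must be corrected by the Key Frames.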
If the calculated criterion D(f_m, f_n) is below a threshold, set at 0.8 in our experiments, the corresponding weighting coefficient w_{m,n} is set to zero. A drop in the criterion value can result from low SNR (e.g., caused by a blink or accidental vignetting of the scanning beam) or from a displacement that is impossible to calculate, either due to a lack of satisfactory retinal features or to movement exceeding the size of the frame. It is worth emphasizing that even if the criterion D(f_m, f_n) cannot be calculated for a given pair of distant frames that essentially image different retinal regions, the retinal position will still be estimated from the remaining collection of frames with coinciding regions.
The above procedure can be repeated to take into account the retinal motion, which may geometrically distort frames acquired during high-velocity motion. For example, for a velocity of 1000°/s and a 3°-wide frame, the last scan line will be displaced by almost one third of the frame width relative to the first line. Therefore, the velocity of retinal motion is estimated from the first run of the N-back algorithm, and a geometrical correction is applied to each frame: shear mapping in the case of horizontal motion and scaling in the case of vertical motion. The second run of the N-back algorithm is then performed with the criterion D(f_m, f_n) calculated using frames corrected for geometrical distortion. We have empirically selected the subsets B_1 = {1, 2} and B_2 = {1, 2, 4, 8, 16, 32, 64, 128, 256, 512} of historical frame indices for the first and second runs of the N-back algorithm, respectively.

The alignment of frames described above is prone to small errors in p_{a,b} which propagate into the trajectory points t_m over time according to Eqs. (2)-(3). Due to the recursive nature of these equations, the trajectory estimation error can be modeled as a random walk process and, for an ideally calibrated system, is therefore a non-stationary, zero-mean error whose variance increases linearly with time (a drift). In order to suppress the drift, we exploit the fact that the eye returns from time to time to the same location, so the error accumulated through Eqs. (2)-(3) can be corrected by a new displacement calculation. This technique introduces Key Frames (KFs), a subset of all frames in the acquired dataset for which translations p^KF_{a,b} can be calculated (please refer to Fig. 3(g)). This means that the frames f_a and f_b correspond to closely spaced locations on the retina, whereas the time separation between them is not important. Next, the algorithm calculates corrections to the KF positions that minimize the error on the calculated p^KF_{a,b}; see the red arrow in Fig. 3(g).
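As an arithmetic check, at 1,240 fps a frame takes about 0.8 ms, so motion at 1000°/s displaces the last line by roughly 0.8°, close to a third of a 3° frame. A toy version of the horizontal (shear) correction follows, using an integer `np.roll` per line where a real implementation would interpolate at sub-pixel precision (vertical motion would use scaling of the line spacing instead):

```python
import numpy as np

def unshear(frame: np.ndarray, vx_px_per_frame: float) -> np.ndarray:
    """Roll each scan line back by the displacement accrued since line 0."""
    rows = frame.shape[0]
    out = np.empty_like(frame)
    for r in range(rows):
        # Lines are assumed to be acquired uniformly in time across the frame
        shift = int(round(vx_px_per_frame * r / (rows - 1)))
        out[r] = np.roll(frame[r], -shift)
    return out

# A frame whose feature drifts one pixel per line (9 px across 10 lines)
frame = np.zeros((10, 30))
for r in range(10):
    frame[r, 5 + r] = 1.0
fixed = unshear(frame, 9.0)   # feature re-aligned in column 5 on every line
```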
These corrections to the KF positions are performed using the multidimensional scaling (MDS) mathematical framework described in Ref. [56]. In other words, a distance matrix P^KF is constructed from the norms of the displacements, |p^KF_{a,b}|, between those KFs for which displacements can be calculated. The error minimization for the trajectory T_KF is performed with respect to the KF positions using the stress function

σ(T_KF) = Σ_{a<b} w_{a,b} ( |t_a − t_b| − P^KF_{a,b} )²,    (4)

where P^KF_{a,b} = |p^KF_{a,b}| are the distances computed between the a-th and the b-th KF, and w_{a,b} is a weighting coefficient used to indicate missing values, as described earlier. The KF-trajectory T_KF has an inherently low sampling rate; however, it has zero-mean, stationary error. Since the KF-trajectory has missing values for non-overlapping frames, the final retinal trajectory T_FET is estimated by casting the N-back trajectory T_N-back onto the KF-trajectory T_KF, using linear interpolation for trajectory points in between KFs, as depicted by the yellow arrows in Fig. 3(g).
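The stress minimization of Eq. (4) can be illustrated with a generic quasi-Newton optimizer standing in for the SMACOF-style MDS scheme of Ref. [56]; the KF positions, distances, and weights below are synthetic:

```python
import numpy as np
from scipy.optimize import minimize

# Sketch of the Key-Frame correction: KF positions are adjusted so that
# their pairwise distances match the measured displacement norms P[a, b].
def stress(flat, P, W):
    X = flat.reshape(-1, 2)
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    iu = np.triu_indices(len(X), k=1)          # each pair counted once
    return np.sum(W[iu] * (d[iu] - P[iu]) ** 2)

rng = np.random.default_rng(1)
true = rng.uniform(0, 5, size=(8, 2))          # ground-truth KF positions
P = np.linalg.norm(true[:, None] - true[None, :], axis=2)
W = np.ones_like(P)                            # all pairs measurable here

x0 = true + rng.normal(0, 0.3, true.shape)     # drifted N-back estimate
res = minimize(stress, x0.ravel(), args=(P, W), method="L-BFGS-B")
corrected = res.x.reshape(-1, 2)
print(stress(corrected.ravel(), P, W))         # near zero after correction
```

Setting an entry of W to zero simply drops that pair from the stress, which is how missing displacements between non-overlapping Key Frames are handled.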

Eye blink detection
In most cases, blinks, as well as other incidents leading to a low value of the criterion D(f_m, f_n) that quantifies the quality of frame registration, have little effect on trajectory correction in the KF algorithm: they are incorporated into the distance matrix P^KF through the weighting coefficients w_{a,b}, as described in the previous section. During eye blinks, the overall intensity of the frames drops significantly. Therefore, for the saccade detection algorithm, we remove the frames that correspond to blinks by thresholding the measured mean frame intensity.
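A minimal sketch of this blink-rejection step; the paper states only that mean frame intensity is thresholded, so the 50%-of-median threshold used here is an assumption:

```python
import numpy as np

# Blink rejection by mean-intensity thresholding. The threshold fraction
# (50% of the median intensity) is an assumed value for illustration.
def blink_mask(mean_intensities: np.ndarray, frac: float = 0.5) -> np.ndarray:
    """True for frames to keep, False for blink frames."""
    baseline = np.median(mean_intensities)
    return mean_intensities > frac * baseline

# Mean intensities of seven frames; two near-dark frames simulate a blink
intensities = np.array([1.0, 0.98, 0.95, 0.1, 0.05, 0.9, 1.02])
print(blink_mask(intensities))   # the two low-intensity frames are rejected
```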

Saccade detection and quantification procedure
The design of our saccade detection algorithm is based on the velocity-thresholding principle reported in Ref. [57]. In our implementation, the x- and y-components of the detected retinal trajectory T_KF are denoised using a 21-point moving average filter spanning 17 ms. The first and second derivatives are computed on the x- and y-trajectory components separately using the finite difference method, and are then used to calculate the magnitudes of the absolute velocity and acceleration.
Saccades are first identified by points of local maxima of absolute velocity, shown as an orange solid circle in Fig. 3(h). The initial boundaries of each saccade are determined on the opposite sides of the detected velocity maximum as the two sample points with a velocity below the threshold value empirically set to 1.24°/s. This value allows the elimination of random noise and associated spurious peak velocities mimicking saccades. Both starting and ending boundaries of the saccade are next expanded by 12 ms to compensate for saccade trimming due to velocity thresholding. The final boundaries are depicted by the green and red solid circles in Fig. 3(h) and Figs. 5-6.
The saccade magnitude is calculated using the Euclidean distance given by the denoised x, y coordinates. Next, the upper and lower boundaries are set to the maximum and minimum values and expanded into bands with a width equal to 5% of the extrema. The average values of the points located in these bands are calculated, and the total saccade magnitude is computed as the difference between them. This procedure is visually depicted in Fig. 3(h) with bands marked with grey rectangles.
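The detection steps above can be sketched as follows. The constants follow the text (21-point moving average, 1.24°/s velocity threshold, 12 ms boundary expansion at a 1.24 kHz frame rate), but the peak picking is simplified to a single global maximum, the magnitude step with its 5% extrema bands is omitted, and the test signal is a synthetic sigmoid saccade rather than measured data:

```python
import numpy as np

FS = 1240.0                 # frames per second
V_THRESH = 1.24             # velocity threshold, deg/s
EXPAND = int(0.012 * FS)    # 12 ms boundary expansion, in samples

def smooth(a):
    """21-point moving average with edge padding (spans ~17 ms at 1.24 kHz)."""
    return np.convolve(np.pad(a, 10, mode="edge"), np.ones(21) / 21,
                       mode="valid")

def detect_saccade(x, y):
    xs, ys = smooth(x), smooth(y)
    vx, vy = np.gradient(xs) * FS, np.gradient(ys) * FS   # finite differences
    v = np.hypot(vx, vy)                                  # absolute velocity
    peak = int(np.argmax(v))                              # velocity maximum
    start = peak
    while start > 0 and v[start] > V_THRESH:              # walk out to the
        start -= 1                                        # threshold crossing
    end = peak
    while end < len(v) - 1 and v[end] > V_THRESH:
        end += 1
    # Expand both boundaries by 12 ms to undo thresholding-induced trimming
    return max(0, start - EXPAND), min(len(v) - 1, end + EXPAND), v[peak]

# Synthetic 4-degree horizontal saccade centered at t = 0.2 s
t = np.arange(0, 0.4, 1 / FS)
x = 4.0 / (1 + np.exp(-(t - 0.2) / 0.005))   # sigmoid position profile
y = np.zeros_like(t)
start, end, vmax = detect_saccade(x, y)
print(start / FS, end / FS, vmax)            # boundaries bracket t = 0.2 s
```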

Artificial eye experiments
In order to evaluate the FET system's tracking performance, a simplified artificial eye composed of a set of X-Y galvanometric scanners (AES), an imaging lens (AEO), and a test target (AET) was installed in the system, as shown in Fig. 1, Inset B. The imaging lens is identical to L1 and is arranged in a 4f system in order to preserve the scanning angles. The scanners were positioned in the eye pupil plane, and a voltage-to-optical-angle calibration was performed using the USAF 1951 test target. For tracking validation, an in vivo human eye fundus image was obtained by a scanning laser ophthalmoscope, printed, and used as the test target. A visual comparison of the artificial eye images and the in vivo human eye images is shown in Fig. 2. By steering the AES with a known voltage, we introduce controlled movements of the retinal image for calibration. The AES control circuit provides feedback outputs used to monitor the actual position of the galvanometric mirrors and compare them with the FET measurements. The AES is programmed to mimic a diversity of eye movements. Back-and-forth saccade sequences, for example, were generated using the model described in Ref. [58]. Eight 20-second sequences of 200 horizontal back-and-forth saccades and eight 20-second sequences of 200 vertical back-and-forth saccades were imaged. The measurement times of the sequences were chosen arbitrarily, resulting in 25,000 collected FET image frames per sequence. The magnitudes of the saccades during each sequence were constant and spanned the range of 1-8° in 1° increments. Furthermore, six waveforms of fixational eye movements, 20 s each, were generated according to the model described in Ref. [59]. The artificial eye was programmed to move according to these waveforms during FET frame acquisition.

Human eye experiments
For our in vivo measurements, we have enlisted three healthy subjects (age group 25-40 years) with emmetropic vision and no reported or diagnosed fixation problems. The study adhered to the tenets of the Declaration of Helsinki. After the nature of the study was explained to all the participants, they gave their informed consent. Non-scanning beam power measured at the pupil plane was approximately 100 µW, which is significantly below the safety exposure limits [60].
The experiments were conducted in a dimly lit room. The eye to be measured was chosen at random for each subject. Neither mydriatic nor cycloplegic drugs were used. Each subject was directed to place their head in a chin-rest mounted in front of the device, and the line of sight of their eye was aligned with the optical axis of the instrument [61]. Subjects were allowed to blink during the measurement as needed. After each measurement, the subjects were asked to withdraw their head from the chin-rest and rest for at least one minute.

Fixations
The first experimental goal is to demonstrate our system's capability to image and detect in vivo fixational eye movements. For this purpose, subjects were directed to focus their sight on the center of a fixation target consisting of a cross-hair and bull's-eye combination with a diameter subtending a visual angle of 1.5°. This target is shown as target FT1 in Fig. 1(A) [62]. The target was projected onto the subjects' retina via a Badal system, which allows for the correction of defocus without altering the angular magnification of the target. To find the subjective far point, the target was optically moved away from the eye to the last position before the subjects could perceive a "just noticeable" blur [63]. Initially, in such an alignment, the system images the subjects' fovea. Using the pair of FET positioning galvanometric mirrors, the operator moves the scanning pattern across different retinal regions. Once the scanning pattern is positioned on the desired retinal region, subjects are directed to blink and then fixate on the target for 20 s while the measurement is acquired. The experiment is repeated three times using different retinal features: the fovea, optic nerve, and retinal vessels. Typical examples of retinal features are shown in Fig. 2.

Saccades
In the second part of the experiment, subjects were directed to continuously switch their gaze between two fixation points separated by a known angular distance, ranging from 1° to 8° in gradually increasing steps of 1°. Each fixation point has an angular extent of 0.6°. The goal of this experiment is to register and detect the saccades corresponding to the angular separation between the fixation points. This target is shown as target FT2 in Fig. 1(A) [62]. For each subject, we select a vascularized area of the retina as the region of interest for imaging and tracking. Before the measurements, subjects are instructed to blink and then perform the periodic saccades with the aid of a regular auditory metronome set to 70 beats per minute. Each measurement lasted 20 s and was repeated twice for each angular separation of the fixation points.

Tracking algorithm evaluation
According to Eqs. (2)-(3), the N-back tracking algorithm accumulates errors in the registration of each pair of frames over time. A full model of the error for the T_N-back trajectory must include sources of error representing different inputs to the algorithm: errors introduced during stable fixation and the fixation periods in between saccades, when eye displacements and velocities are relatively small, and errors introduced during the larger excursions of saccades. The former can be modeled as a random walk process with a constant proportionality coefficient, because in the velocity range characteristic of fixations the distortions of frames due to eye motion are negligible and the frame overlap is high. The error increases significantly during saccades because the N-back frames at higher velocities start to show geometrical distortions and the overlapping areas become smaller. As a result, the error of the system is not stationary, and its variance increases with time. In our experiments, the error was measured as the difference between the true position returned by the GVSC monitor and the position estimated by the algorithm. A typical accumulation of root squared error (RSE) for the experiment with saccades 4° in magnitude is shown in Fig. 4. The increase of accumulated RSE during the intersaccadic parts of the trajectories is interleaved with abrupt changes occurring during the saccades, clearly illustrating the non-stationary nature of the T_N-back error. This error model is not valid for the trajectory T_KF, estimated by minimizing Eq. (4) and subsequently sub-sampling its result over T_N-back. Here, the accumulated errors are corrected by the MDS procedure (see Ref. [56]), and the error is expected to be stationary. Indeed, the augmented Dickey-Fuller (ADF) test with trend adjustment rejects the hypothesis that T_KF is a non-stationary process, with DF_37 = -2.870 and p = 0.047, whereas the same test for T_N-back fails to reject it, with DF_37 = -1.259 and p = 0.647.
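The random-walk error model and the effect of KF corrections can be illustrated with a purely synthetic simulation (the noise level and correction interval are arbitrary): the variance of the uncorrected error grows linearly with the frame index, while resetting to a Key Frame every 100 frames keeps it bounded:

```python
import numpy as np

# Per-pair registration noise accumulates as a random walk in the N-back
# trajectory (variance ~ linear in time); periodic Key-Frame corrections
# reset the accumulated drift.
rng = np.random.default_rng(2)
steps = rng.normal(0, 0.05, size=(1000, 5000))   # 1000 trials, 5000 frames
walk = np.cumsum(steps, axis=1)                  # uncorrected N-back error

var = walk.var(axis=0)
# Linear growth: variance at frame 4000 is ~4x the variance at frame 1000
print(var[3999] / var[999])                      # ~4

# With a KF correction every 100 frames the error never accumulates further
kf_err = walk - np.repeat(walk[:, ::100], 100, axis=1)[:, :5000]
print(kf_err.var(axis=0).max())                  # bounded, quasi-stationary
```

This mirrors the ADF result above: the uncorrected series has a unit-root (non-stationary) character, while the corrected one does not.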
The results of the tests indicate that T_KF is free from accumulative tracking error, and thus a figure for RMSE can be reported for the whole trajectory. Figure 4 shows the error time series of T_KF and T_N-back for a 4° saccade experiment. Note how the increasing N-back error is eliminated once the KF correction is performed.
Two strategies for choosing the KFs were tested. In the first strategy, the KFs are selected at fixed intervals (20 ms, 80 ms and 160 ms). This strategy is most effective for fixation experiments, when the motion amplitude is small compared to the frame size and the probability that the frames will overlap is high. In such cases, the typical (minimum) RMSE for trajectories is 0.039 arcmin for a KF interval of 20 ms and 0.045 arcmin for intervals of 80 and 160 ms. In the experiments with forced saccades, acceptable results are achieved by taking a single KF from a fixation period occurring between saccades. The KF is chosen from the time between velocity peaks (i.e., after the end of or in between the saccades). The interval between the KFs in this strategy is almost 900 ms; therefore, the achieved RMSE is higher, increasing from 0.36 arcmin for a 1° saccade to 5 arcmin for an 8° saccade. For in vivo eye movement data, both strategies for choosing the KFs are combined.
In the current stage of the device development, all the computations are implemented in the C language and run on a multithreaded CPU. The trajectory reconstruction is performed in post-processing. Reconstructing 20 s of eye motion recording (25,000 frames and 1,000 KFs) requires approximately one minute for the N-back stage and three minutes for the calculation of KF displacements followed by the MDS trajectory optimization. Our preliminary experiments show that the N-back part of the algorithm can be executed in less than 100 µs per frame when implemented on a graphics processing unit (GPU), which makes a real-time implementation feasible.

Figure 5 illustrates a typical saccade of 4° in magnitude from the experiment described in Methodology subsection 3.2. In this case, retinal vessels were chosen as the tracking features. Selected frames are shown corresponding to the solid blue circles in the saccadic plot. Solid green and red circles represent the start and end of the saccade, respectively. For clarity of presentation, the start of the saccade is moved to the origin of the coordinate system. The green region indicates a gap in the trajectory due to a blink. Notice that the trajectory reconstruction is not affected by the partial loss of data during the blink. Green and red solid circles represent the detected starts and ends of the saccades, respectively. Horizontal grey-shaded stripes mark the angular size of the fixation targets, also shown to scale on the right side of the plot. The time between the detected saccades corresponds well with the setting of the metronome beats, and the majority of the saccade magnitudes are within the range of the fixation targets.

Human eye experiments
Follow-up correcting saccades were observed in all the trajectories, with saccadic undershoot or overshoot. Because the task involved performing horizontal saccades (x-coordinate), the vertical component (y-coordinate) is small in comparison with the horizontal component. Nevertheless, the correlation of both x-and y-retinal trajectories is clearly evident, which indicates that the eye motion deviates from a horizontal line during the saccade.
Motion parameters such as velocity and acceleration can be readily calculated from the saccade trajectories. Figure 7 summarizes the displacement angular magnitudes, velocities, and accelerations of all 42 saccades from both 4°-forced-saccade experiments on Subject 1, with a distinction between temporal-nasal and nasal-temporal directions. A clear asymmetry can be observed between the directions, especially in the acceleration plots. Correcting saccades, which are comparatively much smaller in magnitude, are not shown in Fig. 7. Figure 9 shows typical results of the in vivo fixation experiments for Subject 1. Figures 9(a) and (d) show the vertical and horizontal eye motion components, respectively. Green-shaded regions represent blinks that occurred during the measurement. Figure 9(b) shows the retinal position density map for this measurement. Although eye drift and microsaccades can be easily identified, it is clear that most of the time the eye is focused on a definite region, likely the center of the fixation target. The target is drawn to scale in the same figure panel. These eye excursions, typical drift and microsaccades, are clearly visible in the individual x- and y-trajectories, as well as in Fig. 9(c), which shows the entire detected x-y retinal trajectory with color-coded velocity. For further examples and results, please refer to Supplementary Materials Visualization 9, Visualization 10, Visualization 11, and Visualization 12, which show videos of FET imaging and the retrieved trajectories during fixation periods. In these examples, tracking was performed on retinal vessels, the optic disc, larger retinal vessels near the optic disc, and the fovea, respectively.
A summary of all detected saccades and microsaccades from all the measurements performed in this study is presented in Fig. 10 in the form of a saccadic main sequence [64]. Outliers were removed by fitting the main sequence formula proposed by Baloh [65] and by further visual inspection of the data points in the sequence. The final plot consists of 5,159 points, each corresponding to a saccade or microsaccade.
Results shown in Fig. 10 are in good agreement with the previously published results using different instrumentation [2,35]. Notably, our tracking system extends the range of detectable magnitudes of microsaccades and increases the accuracy of their measurement. The inset of Fig. 10 shows the smallest microsaccade magnitude detected in this study, with a magnitude of 0.028°.
The results of this study demonstrate our system's capability for the accurate reconstruction of retinal motion with a maximum angular resolution of 0.039 arcmin RMSE and a temporal resolution of up to 790 µs. Further parametric characterization of eye motion, including intersaccadic intervals and their statistical distribution, as well as the number and duration of fixations, is currently being conducted in a clinical setting with a more statistically significant population.

Conclusions
We have demonstrated a novel, noninvasive eye-tracking system capable of detecting retinal displacements as small as 0.028° with an angular resolution of 0.039 arcmin and a maximum velocity of 300°/s across an angular span as wide as 8°. Our tracking algorithms quantify eye displacements using the shifts of a subset of frames in a sequence spanning the full acquisition cycle, obviating the need for a single reference frame and allowing for the precise measurement of eye movements exceeding the spatial extent of a single acquired frame. Our system therefore overcomes the limits on maximum detectable saccade magnitude and velocity characteristic of current image-based retinal trackers, and allows the detection of finer features of eye motion, enabling new, promising opportunities in retinal imaging and clinical neuroscience. Furthermore, our system offers the ability to perform precise measurement of both microsaccades occurring during fixation and large saccades, without the need for any additional external imaging devices such as a wide-field SLO. The subtle features of saccadic dysfunction, fixation instability and abnormal smooth pursuit can be readily extracted and quantified in greater detail, thus offering a promising tool set for the early identification of biomarkers of neurodegenerative diseases. Moreover, the FET can be readily combined with other eye imaging modalities, such as SLO or OCT, to provide eye motion correction without major hardware changes to these modalities.