Computational aberration compensation by coded-aperture-based correction of aberration obtained from optical Fourier coding and blur estimation

We report a novel generalized optical measurement system and computational approach to determine and correct aberrations in optical systems. The system consists of a computational imaging method capable of reconstructing an optical system ’ s pupil function by adapting overlapped Fourier coding to an incoherent imaging modality. It re-covers the high-resolution image latent in an aberrated image via deconvolution. The deconvolution is made robust to noise by using coded apertures to capture images. We term this method coded-aperture-based correction of aberration obtained from overlapped Fourier coding and blur estimation (CACAO-FB). It is well-suited for various imaging scenarios where aberration is present and where providing a spatially coherent illumination is very challenging or impossible. We report the demonstration of CACAO-FB with a variety of samples including an in vivo imaging experiment on the eye of a rhesus macaque to correct for its inherent aberration in the rendered retinal images. CACAO-FB ultimately allows for an aberrated imaging system to achieve diffraction-limited performance over a wide field of view by casting optical design complexity to computational algorithms in post-processing.


INTRODUCTION
A perfect aberration-free optical lens simply does not exist in reality. As such, all optical imaging systems constructed from a finite number of optical surfaces are going to experience some level of aberration issues. This simple fact underpins the extraordinary amount of optical design efforts that have gone into the design of optical imaging systems. In broad terms, optical imaging system design is largely a complex process by which specialized optical elements and their spatial relationships are chosen in order to minimize aberrations and provide an acceptable image resolution over a desired field of view (FOV) [1]. The more optical surfaces available to the designer, the greater the extent the aberrations can be minimized. However, this physical system improvement approach for minimizing aberrations has reached a point of diminishing returns in modern optics. Microscope objectives with 15 optical elements have become commercially available in recent years [2], but it is unlikely that another order of magnitude of optical surfaces will be supported within the confines of an objective in the foreseeable future. Moreover, this strategy for minimizing aberration is never expected to accomplish the task of completely zeroing out aberrations. In other words, any optical system's spatial bandwidth product (SBP), which scales as the product of system FOV and inverse resolution, can be expected to remain a design bound dictated by the residual aberrations in the system.
The issue of aberrations in simpler optical systems with few optical surfaces is, unsurprisingly, more pronounced. The eye is a very good example of such an optical system. While it does a fair job of conveying external scenes onto our retinal layer, its optical quality is actually quite poor. When a clinician desires a high-resolution image of the retinal layer itself for diagnostic purposes, the human eye lens and cornea aberrations would have to be somehow corrected or compensated for. The prevalent approach by which this is currently done is through the use of adaptive optics (AO) [3,4]. This is in effect a sophisticated way of physically correcting aberrations where complex physical optical elements are used to compensate for the aberrations of the lens and cornea. AO forms a guide star on the retina and uses a wavefront detector (e.g., Shack-Hartmann sensor) and a compensation device (e.g., deformable mirror) to correct for the aberrations affecting the guide star and a small region around it as it is under similar aberrations. This region is known as the isoplanatic patch [5], and its size varies depending on the severity of aberrations. To image a larger area beyond the isoplanatic patch, AO needs to be raster-scanned [6]. Since AO correction is fast isoplanatic patches quickly. However, the AO system can be complicated as it requires the active feedback loop between the wavefront measurement device and the compensation device and needs a separate guide star for the correction process [8].
Fourier ptychography (FP) circumvents the challenges of adding more optical elements for improving an optical system's performance by recasting the problem of increasing the system's spatial bandwidth product (SBP) as a computational problem that can be solved after image data have been acquired. Rather than striving to get the highest-quality images possible through an imaging system, FP acquires a controlled set of low-SBP images, dynamically determines the system's aberration characteristics computationally, and reconstitutes a high-SBP, aberrationcorrected image from the original controlled image set [9][10][11][12][13][14][15]. FP shares its roots with ptychography [16][17][18] and structured illumination microscopy (SIM) [19][20][21], which numerically expand the SBP of the imaging system in the spatial and spatial frequency domain, respectively, by capturing multiple images under distinct illumination patterns and computationally synthesizing them into a higher-SBP image. One way to view FP is to note its similarity to synthetic aperture imaging [22][23][24]. In a standard FP microscope system, images of the target are collected through a lownumerical-aperture (NA) objective with the target illuminated with a series of angularly varied planar or quasi-planar illumination. Viewed in the spatial frequency domain, each image represents a disc of information with its offset from the origin determined by the illumination angle. As with synthetic aperture synthesizing, we then stitch the data from the collected series in the spatial frequency domain. Unlike synthetic aperture imaging, we do not have direct knowledge of the phase relationships between each image data set. In FP, we employ phase retrieval and the partial information overlap among the image set to converge on the correct phase relationships during the stitching process [9]. At the end of the process, the constituted information in the spatial frequency domain can be Fourier transformed to generate a higher-resolution image of the target that retains the original FOV as set by the objective. It has been demonstrated that a sub-routine can be weaved into the primary FP algorithm that will dynamically determine the pupil function of the imaging system [11]. In fact, the majority of existing FP algorithms incorporate some versions of this aberration determination function to find and subsequently correct out the aberrations from the processed image [25][26][27][28][29][30][31][32]. This particular sub-discipline of FP has matured to the level that it is even possible to use a very crude lens to obtain high-quality images that are typically associated with sophisticated imaging systems [33]-this drives home the fact that correcting aberration computationally is a viable alternative to physical correction.
The primary objective of this paper is to report a novel generalized optical measurement system and computational approach to determine and correct aberrations in optical systems. This computational approach is coupled to a general optical scheme designed to efficiently collect the type of images required by the computational approach. Currently, FP's ability to determine and correct aberration is limited to optical setups with a well-defined, spatially coherent field on the sample plane [9,11,[34][35][36][37][38][39][40][41][42]. We developed a computational imaging method capable of reconstructing an optical system's pupil function by adapting the FP's alternating projections as used in overlapped Fourier coding [10] to an incoherent imaging modality, which overcomes the spatial coherence requirement of the original pupil function recovery procedure of FP. It can then recover the high-resolution image latent in an aberrated image via deconvolution. The deconvolution is made robust to noise by using coded apertures to capture images [43]. We term this method: coded-aperture-based correction of aberration obtained from overlapped Fourier coding and blur estimation (CACAO-FB). It is well suited for various imaging scenarios where aberration is present and where providing a spatially coherent illumination is very challenging or impossible. CACAO-FB ultimately allows for an aberrated imaging system to achieve diffraction-limited performance over a wide FOV by casting optical design complexity to computational algorithms in post-processing.
The removal of spatial coherence constraints is vitally important in allowing us to apply computational aberration correction to a broader number of imaging scenarios. These scenarios include: (1) optical systems where the illumination on a sample is provided via a medium with unknown index variations; (2) optical systems where space is so confined that it is not feasible to employ optical propagation to create quasi-planar optical fields; (3) optical systems where the optical field at the sample plane is spatially incoherent by nature (e.g., fluorescence emission).
CACAO-FB is substantially different from other recent efforts aimed at aberration compensation. Broadly speaking, these efforts can be divided into two major categories: blind and heuristic aberration recovery. Blind recovery minimizes a cost function, typically an image sharpness metric or a maximum-likelihood function, over a search space, usually the coefficient space of Zernike orthonormal basis [44][45][46][47][48][49], to arrive at the optimal aberration function. However, blind recovery is prone to converging towards multiple local minima and requires the aberrated sample image to be a complex field because blind aberration recovery with an intensity-only sample image is extremely prone to noise for any aberrations [45] other than simple ones such as a camera motion blur or a defocus blur [50]. Heuristic recovery algorithms rely on several assumptions, such as assuming that the captured complex-field sample image has diffuse distribution in its Fourier spectrum such that each sub-region in the Fourier domain encodes the local aberrated wavefront information [51][52][53][54]. Thus, heuristic methods are limited to specific types of samples, and their performance is highly sample dependent.
CACAO-FB is capable of achieving a robust aberration recovery performance in a generalized and broadly applicable format. In Section 2, we describe the principle of CACAO-FB. In Section 3, we report the demonstration of CACAO-FB with a crude lens and an eye model as imaging systems of interest. Finally, in Section 4, we demonstrate the potential of using CACAO-FB for retinal imaging in an in vivo experiment on a rhesus macaque's eye and discuss the current challenges it needs to address to become a viable alternative to other AO retinal imagers. We summarize our findings and discuss future directions in Section 5.

PRINCIPLE OF CACAO-FB
To best understand the overall operation of CACAO-FB processing, we start by examining the optical scheme (see Fig. 1). Suppose we start with an unknown optical system of interest (target system). This target system consists of a lens (unknown lens) placed approximately at its focal length in front of a target sample (unknown sample). The sample is illuminated incoherently. For the sake of simplicity in this thought experiment, we will consider the illumination to occur in the transmission mode. The CACAO-FB system collects light from the target system using relay lenses L1, L2, and L3, and an aperture mask in the pupil plane, which is the conjugate plane of the target system's pupil with coordinates u, v, that can be modulated into different patterns. Our objective is to resolve the sample at high resolution. It should be clear from this target system description that our ability to achieve the objective is confounded by the presence of the unknown lens and its unknown aberrations. A good example of such a system is the eye-the retinal layer is the unknown sample, and the lens and cornea can be represented by the unknown lens.
From this thought experiment, we can see that, to accomplish our objective, we would need to first determine the aberration characteristics of the unknown lens and then use the information to somehow correct out the aberration effects from the final rendered image. CACAO-FB does this by using three primary computational imaging algorithmic components that operate in sequence: 1) local aberration recovery with blur estimation; 2) full aberration recovery with a FP-based alternating projections algorithm; and 3) latent image recovery by deconvolution with coded apertures. The first two steps determine the target system's aberrations, and the third step generates an aberration-corrected image. This pipeline is summarized in Fig. 2. The sample plane, which has coordinates x, y, is divided into small tiles within which the aberration can be assumed to be spatially invariant, and CACAO-FB processes each corresponding tile on its image plane, which has coordinates ξ, η, to recover a high-resolution image of the sample tile. In the following analysis, we focus our attention to one tile t. CACAO-FB begins by capturing a series of images with varying mask patterns in its pupil plane, which has coordinates u, v. The patterns consist of two kinds: a set of small circular apertures W m u, v, which collectively spans the pupil of the unknown lens, and a set of big apertures A n u, v, which includes coded apertures and a full circular aperture with their diameters equal to the unknown lens's pupil's size. m and n are integers ranging from 1 to the total number of the respective aperture. The images captured with W m u, v are labeled as i m,t ξ, η, and they encode the local aberration of the unknown lens's pupil function in their point spread functions (PSF). The blur estimation algorithm (Section 2.B) extracts these PSFs; b m,t ξ, η. These intensity values of the spatially filtered pupil function can be synthesized into the full pupil function P t u, v with an FP-based alternating projections algorithm (Section 2.C). The images captured with A n u, v, labeled ϕ n,t ξ, η, are processed with the reconstructed pupil function and the knowledge of the mask patterns to generate the latent, aberration-free image of the sample o t x, y (Section 2.D).
The next four sub-sections will explain the mathematical model of the image acquisition process and the three imaging algorithmic components in detail.

A. Image Acquisition Principle of CACAO-FB System
We consider a point on the unknown sample sx, y and how it propagates to the camera plane to be imaged. On the sample plane, a unit amplitude point source at x 0 , y 0 can be described by where U 0 x, y; x 0 , y 0 is the complex field of the point on the sample plane, and δx − x 0 , y − y 0 is the Dirac delta function describing the point located at x 0 , y 0 . We then use Fresnel propagation to propagate it to the unknown lens's plane and apply the phase delay caused by the unknown lens, assuming an idealized thin lens with the estimated focal length f 0 [Eqs. (S2) and (S3) of Supplement 1]. Any discrepancy from the ideal is incorporated into the pupil function Pu, v; x 0 , y 0 , which is usually a circular bandpass filter with a uniform modulus and some phase modulation. Thus, the field right after passing through the unknown lens is where λ is the wavelength of the field and u, v are the coordinates of both the plane right after the unknown lens and the CACAO-FB system's aperture plane as these planes are conjugate to each other. Thus, we refer to the aperture plane as the pupil plane. The spatially varying nature of a lens's aberration is captured by the pupil function's dependence on x 0 , y 0 . We divide our sample into small tiles of isoplanatic patches (e.g., t 1, 2, 3, …) and confine our analysis to one tiled region The CACAO-FB system consists of three tube lenses (L1, L2, and L3) to relay the image from the target system for analysis. The target system consists of an unknown lens and an unknown sample with spatially incoherent field. The CACAO-FB system has access to the conjugate plane of the target system's pupil, which can be arbitrarily modulated with binary patterns using a spatial light modulator. The images captured by the CACAO-FB system are intensity only. f 0 , f 1 , f 2 , and f 3 are the focal lengths of the unknown lens, L1, L2, and L3, respectively. d is an arbitrary distance smaller than f 3 .
t on the sample plane that contains x 0 , y 0 and other points in its vicinity such that the spatially varying aberration can be assumed to be constant in the analysis that follows (i.e., Pu, v; x 0 , y 0 P t u, v). This is a common strategy for processing spatially variant aberration in wide-FOV imaging [55,56]. We can see from Eq. (2) that the field emerging from the unknown lens is essentially its pupil function with additional phase gradient term defined by the point source's location on the sample plane. At the pupil plane, a user-defined aperture mask M u, v is applied to produce where we dropped the constant factor C F x 0 , y 0 . After further propagation to the camera plane [Eqs. (S5)-(S9) of Supplement 1], we obtain the intensity pattern i PSF,t ξ, η that describes the mapping of a point on the sample to the camera plane as follows: where h t ξ, η jF fM u, vP t u, vgξ, ηj 2 is the intensity of the PSF of the combined system in Fig. 1 for a given aperture mask M u, v and within the isoplanatic patch t. We observe from Eq. (4) that PSFs for different point source locations are related to each other by simple lateral shifts, such that an image i t ξ, η captured by this system of an unknown sample function within the isoplanatic patch s t x, y can be represented by where o t ξ, η is the intensity of s t ξ, η, is the convolution operator, and we ignore the coordinate scaling for the sake of simplicity. This equation demonstrates that the image captured by the detector is a convolution of the sample's intensity field with a PSF associated with the sub-region of the pupil function defined by an arbitrary mask at the pupil plane. This insight allows us to Research Article capture images of the sample under the influence of PSFs that originate from different sub-regions of the pupil. We have aperture masks of varying shapes and sizes, mainly categorized into small masks and big masks. Small masks sample small regions of the pupil function to be used for reconstructing the pupil function, as will be described in detail in the following sections. Big masks include a mask corresponding to the full pupil size and several coded apertures that encode the pupil function to assist in the latent image recovery by deconvolution. To avoid confusion, we label the mth small mask, its associated PSF in isoplanatic patch t, and the image captured with it as W m u, v, b m,t ξ, η, and i m,t ξ, η, respectively. The nth big mask (coded aperture or a full aperture), its associated PSF, and the image captured with it are labeled as A n u, v, h n,t ξ, η, and ϕ n,t ξ, η, respectively. The CACAO-FB system captures i m,t ξ, ηs and ϕ n,t ξ, ηs in the data acquisition process, and these data are relayed to postprocessing algorithms to recover o t ξ, η, the underlying aberration-free image of the sample. The algorithm pipeline begins with the blur estimation algorithm using i m,t ξ, ηs as described below. In all the following simulations, there is one full aperture and four coded apertures A n u, v with the diameter of 4.5 mm; 64 small masks W m u, v with the diameter of 1 mm; an unknown lens L1 and L2 with the focal length of f 0 f 1 f 2 100 mm; a tube lens with the focal length of f 3 200 mm; an image sensor with the pixel size of 6.5 μm (3.25 μm effective pixel size); and a spatially incoherent illumination with the wavelength of 520 nm.

B. Local Aberration Recovery with Blur Estimation
The blur function b m,t ξ, η associated with the small mask W m u, v applied to the pupil P t u, v is also referred to as the local PSF, and it contains valuable information about the target system's pupil function that we wish to recover. The size of W m u, v is set small enough such that W m u, v applied to a region on P t u, v shows a local phase map that resembles a linear phase gradient, as shown in Fig. 3(b1). In such case, the associated b m,t ξ, η approximates a diffraction-limited spot with a spatial shift given by the phase gradient. W m u, v applied to other regions on P t u, v may have b m,t ξ, η, whose shape deviates from a spot if the masked region contains more severe aberrations, as shown in Figs. 3(b2) and 3(b3). In general, the aberration at or near the center of an imaging lens is minimal, and it becomes severe near the edge of the aperture because the lens's design poorly approximates the parabolic shape away from the optical axis [57]. Thus, the image captured with the center mask i 1,t ξ, η is mostly aberration free with its PSF defined by the diffraction-limited spot associated with the mask's aperture size. Other i m,t ξ, ηs have the same frequency band limit as i 1,t ξ, η but are under the influence of additional aberration encapsulated by their local PSFs b m,t ξ, ηs.
We adopt an image-pair-based blur estimation algorithm widely used in computational photography discipline to determine b m,t ξ, η. In this algorithm, one of the image pairs is assumed to be blur free while the other is blurred [58,59]. The blur kernel can be estimated by an iterative PSF estimation method, which is iterative Tikhonov deconvolution [60] in the Fourier domain, adopting the update scheme in Yuan's blur estimation algorithm [58] and adjusting the step size to be proportional to jI 1,t u, vj∕jI 1,t u, vj max for robustness to noise [61], where I 1,t u, v is the Fourier spectrum of i 1,t ξ, η. The blur estimation process is described in Algorithm 1 as shown in Fig. 4.
The recovered b m,t ξ, ηs are the intensity information of the different masked pupil regions' Fourier transforms. They can be synthesized into the full pupil function P t u, v using an FPbased alternating projections algorithm, as described in the following section.

C. Full Aberration Recovery with Fourier-Ptychography-based Alternating Projections Algorithm
FP uses the alternating projections algorithm to synthesize a sample's Fourier spectrum from a series of intensity images of the sample captured by scanning an aperture on its Fourier spectrum [9,10]. In our implementation, the full pupil's complex field P t u, v is the desired Fourier spectrum to be synthesized, and the local PSFs b m,t ξ, ηs are the aperture-scanned intensity images to be used for FP-based alternating projections, as shown in the bottom half of Fig. 2(b). Therefore, reconstructing the pupil function from a series of local PSFs' intensity information in our algorithm is completely analogous to reconstructing the complex spatial spectrum of a sample from a series of its low-passed images. The FP-based alternating projections algorithm is Algorithm 2, and it is described in Fig. 5.
The FP-based alternating projections algorithm requires that the scanned apertures during image acquisition have at least 30% overlap [62] for successful phase retrieval. Thus, the updating b m,t ξ, ηs in Algorithm 2 are ordered in a spiral-out pattern, each having an associated aperture W m u, v that partially overlaps (40% by area) with the previous one's aperture. The influence of the overlap on the reconstruction is illustrated in Fig. S1 of Supplement 1. For the simulated pupil diameter of 4.5 mm, there are 64 W m u, vs of 1 mm diameter to span the pupil with 40% overlap.
We simulate the image acquisition by an aberrated imaging system and our pupil function reconstruction process in Fig. 6. Algorithms 1 and 2 are able to estimate the local PSFs from the 64 images captured with the small masks W m u, v [ Fig. 6(c))], and they reconstruct the complex pupil function P t u, v successfully [ Fig. 6(e)]. A simple Fourier transformation of P t u, v generates the PSF of the aberrated imaging system. On a Macbook Pro with 2.5 GHz Intel Core i7 and 16 GB of RAM, it takes 2 min for Algorithm 1 and 20 s for Algorithm 2 to operate on the 64 images (1000 by 1000 pixels) taken with W m u, v. To gauge our method's performance among other computational blur estimation methods, we attempt PSF reconstruction with two blind deconvolution algorithms. One is MATLAB's deconvblind, which is a standard blind deconvolution algorithm based on the accelerated, damped Richardson-Lucy algorithm, and the other is the stateof-the-art blind blur kernel recovery method based on variational Bayesian approach by Fergus et al. [63,64]. They both operate on a single blurred image [ Fig. 6(d)] to simultaneously extract the blur function and the latent image [ Fig. 6(f )]. For our purpose, we compare the reconstructed PSFs to gauge the performance. As shown in Fig. 6(f ), the reconstructed blur kernels by MATLAB and Fergus et al. both show poor fidelity to the true PSF. This clearly demonstrates the effectiveness of our algorithm pipeline in reconstructing a complicated PSF, which would otherwise be impossible to recover by a blind deconvolution method. The  Research Article absolute limit of our aberration reconstruction method, assuming an unlimited photon budget, is essentially determined by the number of pixels inside the defined full aperture. However, in real-life settings with limited photon budget and a dynamic sample, the smallest subaperture we can use to segment the full aperture is determined by the allowable exposure time and the shotnoise-limited condition of the camera. One has to consider the number of photons required by the camera for the signal to overcome the camera noise and the length of exposure permissible to capture a static image of the sample.

D. Latent Image Recovery by Deconvolution with Coded Apertures
With the knowledge of the pupil function obtained from Algorithms 1 and 2, it is possible to recover o t x, y from the aberrated image ϕ t ξ, η taken with the full pupil aperture. In the Fourier domain, the image's spectrum is represented as Φ t u, v H t u, vO t u, v, where H t u, v and O t u, v are the spatial spectrum of h t ξ, η and o t x, y, respectively. H t u, v is also called the optical transfer function (OTF) of the optical system and, by Fourier relation, is an auto-correlation of the pupil function P t u, v. In the presence of severe aberrations, the OTF may have values at or close to zero for many spatial frequency regions within the bandpass, as shown in Fig. 7. These are due to the phase gradients with opposite slopes found in an aberrated pupil function, which may produce values at or close to zero in the auto-correlation process. Thus, the division of Φ t u, v by H t u, v during deconvolution will amplify noise at these spatial frequency regions since the information there has been lost in the image acquisition process. This is an ill-posed inverse problem.
There are several deconvolution methods that attempt to address the ill-posed problem by using a regularizer [60] or a priori knowledge of the sample, such as by assuming sparsity in its total variation [65,66]. However, due to their inherent assumptions, these methods work well only on a limited range of samples, and the parameters defining the a priori knowledge need to be manually tuned to produce successful results. Fundamentally, they do not have the information in the spatial frequency regions where the OTF is zero, and the a priori knowledge attempts to fill in the missing gaps. Wavefront coding using a phase mask in the Fourier plane has been demonstrated to remove the null regions in the OTF such that a subsequent deconvolution by the precalibrated PSF can recover the latent image [67][68][69][70]. We adopt a similar method called coded aperture proposed by Zhou and Nayar [43] that uses an amplitude mask in the Fourier domain to achieve the same goal. With the amplitude-modulating SLM already in the optical system, using the amplitude mask over a phase mask is preferred. Combined with the knowledge of the pupil function reconstructed by Algorithms 1 and 2, no a priori knowledge is required to recover the latent image via deconvolution. A coded aperture designed by Zhou and Nayar at the pupil plane with a defocus aberration can generate a PSF whose OTF does not have zero values within its NA-limited bandpass. The particular coded aperture is generated by a genetic algorithm that searches for a binary mask pattern that maximizes its OTF's spatial frequency content's modulus. The optimum aperture's pattern is different depending on the amount of noise in the imaging condition. We choose the pattern as shown in Fig. 7 since it performs well across various noise levels [43].
The pupil function in our imaging scenario does not only consist of defocus, as the imaging lenses have severe aberrations. Therefore, our pupil function can have an unsymmetrical phase profile unlike the defocus aberration's symmetric bullseye phase profile. Thus, rotating the coded aperture can generate PSFs with different spatial frequency distribution, resulting in a different PSF shape beyond the mere rotation of the PSF. Therefore, in the data capturing procedure, we capture a series of images with a sequence of big masks A n u, v consisting of four coded

Research Article
apertures and a standard circular aperture at the pupil plane, as represented in Fig. 2(c). This ensures that we obtain all spatial frequency information within the NA-limited bandpass. The PSF associated with each A n u, v applied to P t u, v is easily obtained by h n,t ξ, η jF −1 fA n u, vP t u, vgξ, ηj 2 and its OTF by H n,t u, v F fh n,t ξ, ηgu, v. With the measured full and coded aperture images ϕ n,t ξ, ηs and the knowledge of the OTFs, we perform a combined deconvolution using iterative Tikhonov regularization, similar to Algorithm 1, to recover the object's intensity distribution o t x, y, as described by Algorithm 3 in Fig. 8 and represented in Fig. 2(d).
A simulated deconvolution procedure with the coded aperture on a Siemens star pattern is shown in Fig. 7. The combined OTF of a circular aperture and the coded aperture at four rotation Fig. 7. Simulation that demonstrates the benefit of coded-aperture-based deconvolution. (a1)-(a5) Masked pupil functions obtained by masking the same pupil function with the full circular aperture and coded apertures under different rotation angles (0°, 45°, 90°, 135°), their associated OTFs along one spatial frequency axis, and captured images. Each coded aperture is able to shift the null regions of the OTF to different locations. (b) Comparison between the OTF of a circular-aperture-masked pupil function and the summed OTFs of the circular-and coded-aperture-masked pupil functions. Null regions in the frequency spectrum are mitigated in the summed OTF, which allows all the frequency content of the sample within the band limit to be captured with the imaging system. The OTF of an ideal pupil function is also plotted. (c1) Deconvolved image with only a circular aperture shows poor recovery with artifacts corresponding to the missing frequency contents in the OTF's null regions. (c2) A recovered image using one coded aperture only. Reconstruction is better than (c1) but still has some artifacts. (c3) A recovered image using circular and multiple coded apertures is free of artifacts since it does not have missing frequency contents.

Research Article
angles is able to eliminate the null regions found in the circularaperture-only OTF and thus produce a superior deconvolution result. The deconvolution performance across different frequency components is correlated to the combined OTF's modulus. We observe that the signal-to-noise ratio (SNR) of at least 40 in the initial aberrated image produces a deconvolution result with minimal artifacts. On a Macbook Pro with 2.5 GHz Intel Core i7 and 16 GB of RAM, it takes 2 s for Algorithm 3 to process the 5 images (1000 by 1000 pixels) taken with A n u, v (one full aperture, four coded apertures) to generate the deconvolution result.

EXPERIMENTAL DEMONSTRATION ON ARTIFICIAL SAMPLES A. Demonstration of CACAO-FB on a Crude Lens
The CACAO-FB prototype system's setup is simple, as shown in Fig. 9. It consists of a pair of 2 inch, f 100 mm achromatic doublets (AC508-100-A from Thorlabs) to relay the surface of an imaging lens of interest to the surface of the ferroelectric liquidcrystal-on-silicon (LCOS) spatial light modulator (SLM) (SXGA-3DM-HB from 4DD). A polarized beam splitter (PBS) (PBS251 from Thorlabs) lies in front of the SLM to enable binary modulation of the SLM. A polarizer (LPVISE100-A from Thorlabs) is placed after the PBS to further filter the polarized light to compensate for the PBS's low extinction ratio in reflection. A f 200 mm tube lens (TTL200-A from Thorlabs) Fourier transforms the modulated light and images it on a sCMOS camera (PCOedge 5.5 CL from PCO). To determine the orientation of the SLM with respect to the Fourier space in our computational process, we us a phase-only target, such as a microbead sample, illuminated by a collimated laser source to perform an overlapped-Fourier-coding phase retrieval [10]. With the correct orientation, the reconstructed complex field should have the expected amplitude and phase. The imaging system to be surveyed is placed in front of the CACAO-FB system at the first relay lens's focal length. The imaging system consists of a crude lens and a sample it is supposed to image. The crude lens in our experiment is a 6D trial lens (26 mm diameter, f 130 mm) from an inexpensive trial lens set (TOWOO TW-104 TRIAL LENS SET). A resolution target is placed at less than the lens's focal length away to introduce more aberration into the system. The sample is flood-illuminated by a monochromatic LED light source (520 nm, UHP-Microscope-LED-520 from Prizmatix) filtered with a 10 nm bandpass filter.
The relayed lens surface is modulated with various binary patterns by the SLM. The SLM displays a full aperture (5.5 mm diameter), a coded aperture rotated at 0°, 45°, 90°, and 135°with the maximum diameter matching the full aperture, and a series of limited apertures (1 mm diameter) shifted to different positions in a spiraling-out pattern within the full aperture dimension. The camera's exposure is triggered by the SLM for synchronization. Another trigger signal for enabling the camera to begin a capture sequence is provided by a data acquisition board (NI ELVIS II from National Instrument), which a user can control with MATLAB. Multiple images for each SLM aperture are captured and summed together to increase their signal-to-noise ratio (SNR). The full-aperture image has SNR 51, with other aperture-scanned images having SNR approximately proportional to the square root of their relative aperture area. SNR is estimated by calculating the mean and variance values in a uniformly bright patch on the image.
To quantify the resolution performance of CACAO-FB, we image a Siemens star pattern with the crude 6D lens. The pattern consists of 40 line pairs radiating from the center such that the periodicity along a circle increases with increasing radius. The smallest circle along which the periodic structure is barely resolvable determines the resolution limit of the optical system [71]. For the focal length of 130 mm, the aperture diameter of 5.5 mm, and the illumination wavelength of 520 nm, the expected resolution is between λ∕NA 24.6 μm (coherent) and λ∕2NA 12.3 μm (incoherent) periodicity, defined by the spatial frequency cutoff in the coherent and incoherent transfer functions, respectively. As shown in Fig. 10, CACAO-FB can resolve features up to 19.6 μm periodicity, which is within the expected resolution limit.
The 6D lens is expected to have poor imaging performance that varies across its FOV since it is an inexpensive single element Fig. 9. Experimental setup of imaging a sample with a crude lens (i.e., unknown lens). Sample is illuminated by a monochromatic LED (520 nm), and the lens's surface is imaged onto the SLM by a 1∶1 lens relay. The part of light modulated by the SLM is reflected by the PBS and is further filtered by a polarizer to account for the PBS's low extinction ratio in reflection (1:20). The pupil-modulated image of the sample is captured on the sCMOS camera. L, lens; P, polarizer; PBS, polarizing beam splitter. lens. We image a sample slide consisting of an array of USAF targets and select three tiled regions, each containing a USAF target pattern, to demonstrate CACAO-FB's ability to address spatially variant aberration in its latent image recovery procedure. As shown in Fig. 11, the recovered PSFs in the three different regions are drastically different, which demonstrates the spatially variant nature of the aberration. Deconvolving each region with the corresponding PSF can successfully recover the latent image. The expected resolution limits as calculated above correspond to a range between Group 5 Element 3 (24.8 μm periodicity) and Group 6 Element 3 (12.4 μm periodicity). Features up to Group 5 Element 5 (19.68 μm periodicity) are resolved after deconvolution as shown in Fig. 11, which matches closely with the resolution determined by the Siemens star pattern.

B. Demonstration of CACAO-FB on an Eye Model
We use an eye model from Ocular Instruments to simulate an in vivo retinal imaging experiment. We embed a cut-out USAF resolution target (2015a USAF from Ready Optics) on the model's retina and fill the model's chamber with de-ionized water, as shown in Fig. 12. We make adjustments to our CACAO-FB system as shown in Fig. 13 to accommodate the different imaging scenario. First, it has to illuminate the retina in a reflection geometry via the same optical path as imaging. A polarized beam  splitter (PBS) is used to provide illumination such that the specular reflection from the eye's cornea, which mostly maintains the s polarization from the PBS, is filtered out of the imaging optical path. The scattered light from the retina is depolarized and can partially transmit through the PBS. The light source is a fiber-coupled laser diode (520 nm) (NUGM01T from DTR's Laser Shop), which is made spatially incoherent by propagating through a 111 m long, 600 μm core diameter multimode fiber (FP600URT from Thorlabs), following the method in Ref. [72]. The laser diode is triggered such that it is on only during camera exposure. Images are captured at 50 Hz, ensuring that the flashing illumination's frequency lies outside of the range that can cause photosensitive epilepsy in humans (i.e., between 15 and 20 Hz [73]). We add a pupil camera that outputs the image of the eye's pupil with fiduciary marks for aligning the eye's pupil with our SLM. Finally, a motion-reference camera (MRC) that has the identical optical path as the encoded-image camera (EIC) aside from pupil modulation by SLM is added to the system to account for an in vivo eye's motion between image frames. The amount of light split between the MRC and EIC can be controlled by the PBS and a quarter-wave plate before the SLM.
In Fig. 14, the recovered images show severe spatially varying aberration of the eye model but good deconvolution performance throughout the FOV, nonetheless. The tile size is set such that it is the biggest tile that could produce an aberration-free image, judged visually. The full aperture in this scenario had a 4.5 mm diameter, and its associated aberrated image had a SNR of 126.
In this imaging scenario, the blur kernels of the limitedaperture images had a significant impact on the deconvolution result, as shown in Fig. 15. The aberration of the eye model was severe such that the retrieved blur kernels of the limitedaperture images had distinct shapes in addition to lateral shifts.  Illumination is provided by a fiber-coupled laser diode (520 nm), and the eye's pupil is imaged onto the SLM by a 1∶1 lens relay. The sample is slightly defocused from the focal length of the crude lens to add additional aberration into the system. Pupil alignment camera provides fiduciary to the user for adequate alignment of the pupil on the SLM. PBS2 helps with removing corneal reflection. The motion-reference camera is synchronized with encoded-image camera to capture images not modulated by the SLM. BS, beam splitter; L, lens; M, mirror; P, polarizer; PBS, polarized beam splitter; QWP, quarter-wave plate.

Research Article
We observe a much better deconvolution result with the reconstructed pupil that takes blur kernels' shapes into account compared to the one that does not. The latter is analogous to Shack-Hartmann wavefront sensing method, which only identifies the centroid of each blur kernel to estimate the aberration. Thus, this demonstrates the importance of the blur kernel estimation step in our algorithm and the distinct difference of our aberration reconstruction from other wavefront sensing methods.

ADAPTING CACAO-FB TO AN IN VIVO EXPERIMENT ON THE EYE OF A RHESUS MACAQUE
The same setup as in Section 3.B is used for the in vivo experiment on a rhesus macaque's eye. The animal is anesthetized with 8-10 mg/kg ketamine and 0.02 mg/kg dexdomitor IM. Two drops of tropicamide (0.5%-1%) are placed on the eye to dilate the pupil. To keep the eye open for imaging, a sanitized speculum

Research Article
is placed between the eyelids. A topical anesthetic (proparacaine 0.5%) is applied to the eye to prevent any irritation from the speculum placement. A rigid gas permeable lens is placed on the eye to ensure that the cornea stays moist throughout imaging. The light intensity is kept below a level of 50 mW∕cm 2 on the retina in accordance with ANSI recommended safe light dosage.
Due to the safety limitation on the illumination power, the captured images of the retina have low SNR (e.g., the fullaperture image has SNR 7.5). We increase the SNR by capturing multiple redundant frames [213 frames for A n u, vs, 4 frames for W m u, vs] and adding them together (Fig. S3 of Supplement 1). Thus, a long sequence of images (∼45 s) has to be captured, and these images with weak retinal signals have to be registered for motion prior to CACAO-FB since the eye has residual motion even under anesthesia. Due to a long averaging window, aberration of high temporal frequency is washed out, but we still expect to be able to resolve the photoreceptors, albeit at a lower contrast [7]. Motion registration includes both translation and rotation, and these operations need to be done such that they do not apply any spatial filter that may alter the images' spatial frequency spectra. Rotation is performed with fast discrete sincinterpolation [74], which is a series of Fourier transform operations that can be accelerated by GPU programming. The frames from the motion-reference camera are used for the registration process ( Fig. S2 of Supplement 1 and Visualization 1). A center region with half the dimensions of the full frame is selected from one of the frames as a template for registration. The normalized cross-correlation (NCC) value is found between the template and each frame [75] for every rotation angle (−1.5 to 1.5 deg, 0.0015 deg step size). The set of rotation angle and lateral shift values that produces the maximum NCC value at a pixel resolution for each frame corresponds to the motion registration parameters for that frame and the corresponding encoded-image camera's frame. The registration parameters for all the frames are applied to the images of the encoded-image camera, and the images are grouped by different apertures to be summed together (Fig. S3 of Supplement 1).
The deconvolution result is shown in Fig. 16. Although the sensor size of 2560 × 2160 pixels is used for capturing raw images, the resultant averaged images are only 1262 × 1614 pixels after the motion registration. The input full-aperture image had SNR 109. Photoreceptors are much better resolved after aberration removal. We expect the entire visible region to have an even spread of photoreceptors, but we observe well-resolved photoreceptors mostly in the brighter regions. This may be due to the lower SNR in the darker regions leading to a poorer deconvolution result. We cannot capture more frames of the retina to further increase the SNR because the animal's eye's gaze drifts over time and the original visible patch of the retina is shifted out of our system's FOV. Furthermore, non-uniform specular reflections from the retina add noise to part of the captured data, leading to sub-optimal latent image recovery by the CACAO-FB algorithm pipeline.

DISCUSSION
We developed a novel method to characterize the aberration of an imaging system and recover an aberration-free latent image of an underlying sample in post-processing. It does not require the coherent illumination necessary in other computational, aberration-compensating imaging methods. It does not need separate wavefront detection and correction devices found in many conventional adaptive optics systems, as the hardware complexities are off-loaded to the software regime, which can harness the ever-increasing computational power. Its principle is based on incoherent imaging, which obviates sensitivity issues such as phase fluctuations and incident angles associated with coherent imaging and allows for characterizing an integrated optical system where the sample plane is only accessible via the target system's lens. Our demonstrations of CACAO-FB on sub-optimal lenses in benchtop and in vivo experiments show its viability in a broad range of imaging scenarios. Its simple hardware setup is also a key advantage over other aberration-correction methods that may allow for its wide adoption.
More severe aberration can be addressed readily by shrinking the scanned aperture size on the SLM so that the aberration within each windowed pupil function remains low order. This comes at the expense of the acquisition speed as more images need to be captured to cover the same pupil diameter.

Research Article
If the masks on the pupil are shrunk smaller with no overlap, this pupil masking process becomes a Shack-Hartmann (SH) sensing method. This illustrates the key advantages of our scheme over a SH sensor: using bigger masks allows for fewer image acquisitions and increases the images' SNR. A bigger mask of an aberrated pupil no longer encodes for a simple shifted spot in the spatial domain as would be the case for a SH sensor but rather a blur kernel as shown in Fig. 3. Therefore, reconstructing the blur kernels of the limited aperture images is critical for CACAO-FB's performance, as is demonstrated in Fig. 15.
Using an aperture mask in the Fourier plane discards a significant amount of photons in the image acquisition process. One possible way to improve the photon efficiency of our system would be to use a phase mask instead of an amplitude mask to code the Fourier plane as has been demonstrated in Ref. [70] to remove nulls in the OTF of an aberrated imaging system.
Although the recovered retinal image in Section 4 is not on par with what one can achieve with a typical AO retinal imager, it showcases the proof of concept of using CACAO-FB to correct for aberrations in a general optical system. There are several challenges of imaging an in vivo eye that can be addressed in future works to allow CACAO-FB to be a viable alternative to AO retinal imagers. First, increasing the SNR by averaging multiple retinal images of the rhesus macaque's eye in vivo is challenging as its gaze continues to drift even under general anesthesia. There is a finite number of frames we can capture before the original patch of retina shifts out of our system's FOV. Imaging a human subject would be less susceptible to this issue as an operator can instruct the subject to focus on a target and maintain the same patch of the retina within the system's FOV as done in Ref. [7]. The small lateral shifts between captured frames due to the eye's saccade can be digitally registered prior to averaging. Using a different wavelength invisible to the eye will allow the subject to maintain his/her gaze throughout an extended acquisition time. Second, non-uniform specular reflections from the retinal layer corrupt the captured images. The flood illumination provided on the retina through the pupil does not have sufficient angular coverage to even out the specular reflections such that some images captured with a small aperture contained specular reflections while others did not. A flood illumination with a wider divergence angle can mitigate this problem.
On the other hand, our method will be able to provide a readily viable solution for imaging static targets such as wide FOV imaging of fluorescent samples under sample-and systeminduced aberrations since the fluorescent signals are inherently spatially incoherent and the CACAO-FB principle can be applied to the imaging process. With its simple optical setup, we believe CACAO-FB can be easily incorporated into many existing imaging systems to compensate for the limitations of the physical optics design.