Modeling classical wavefront sensors

We present an image formation model for deterministic phase retrieval in propagationbasedwavefront sensing, unifying analysis for classical wavefront sensors such as Shack-Hartmann (slopes tracking) and curvature sensors (based on Transport-of-Intensity Equation). We show how this model generalizes commonly seen formulas, including Transport-of-Intensity Equation, from small distances and beyond. Using this model, we analyze theoretically achievable lateral wavefront resolution in propagation-based deterministic wavefront sensing. Finally, via a prototype masked wavefront sensor, we show simultaneous bright field and phase imaging numerically recovered in real-time from a single-shot measurement. © 2020 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

We refer to classical wavefront sensors as sensor-side coded wavefront sensors under collimated illumination.Classical wavefront sensors are considered as deterministic for phase retrieval, since their image formation models are simple and numerically easy to invert.However, deterministic models are hard to derive, and different sensors are usually considered separately, for example using centroid tracking (slope tracking) for modeling Shack-Hartmann sensors, or using the Transport-of-Intensity Equation (TIE) for modeling curvature sensors.
In this paper, we model classical wavefront sensors in a unified framework, extending our conference paper [23] with extended analysis and results.We consider wavefront sensors working under collimated white light illumination with sensor-side coding using custom optics.Figure 1 shows such a general modeling of single-shot or dual-plane wavefront sensors, where an optical element (e.g., a microlens array, or a phase mask, see Table 1 for examples) is placed distance z in front of a bare image sensor.Two images are captured, one without the sample (reference image I 0 (r)), another with the sample (measurement image I(r)).A numerical solver recovers the unknown wavefront φ(r) from image pair I 0 (r) and I(r).In §2, we derive a formula relating I 0 (r) and I(r).

Ray optics derivations
Damberg and Heidrich [24] proposed a warping based phase optimization image formation model for computational lens design, in which the core idea is that the energy conservation law is valid in each local differentiable area, and consequently a ray optics based formula is derived relating caustic images and the freeform lens.Similar approach has been proposed in [25] to yield the same result as as [24], including higher order terms.Here, we revisit these approaches to further account for sample absorption A(r) and general custom optics (i.e., different I 0 (r)) as in Table 1, resulting in a ray optics model to general classical wavefront sensors.See §A for more details from a wave optics perspective.Consider monochromatic free space light propagation in Fig. 2, for a single ray of interest, for a unknown object A(r) exp[jφ(r)], we have: where we assume a small angle so that λ|∇φ(r)| 2π.Rearranging yields where the approximation comes from small local curvature assumption that λz|∇ 2 φ(r)| 2π, as required for weak diffraction to hold (see Eq. ( 27) derived from Eq. ( 16) in §A).In our case J(r) is the diffraction pattern of the optics at distance z, and hence is the reference image under collimated illumination, i.e., J(r) = I 0 (r).Therefore Equation ( 3) is the main model of this paper, and is a natural consequence of [24,25].Under quasi-monochromatic lighting, Eq. ( 3) could be reformulated in terms of optical path differences (OPD), i.e., without wavelength λ.That said, strictly temporally coherent light sources (such as lasers) are not necessary and this formula works under broadband illumination, as been experimentally verified [22].With d(r) denoting the OPD, we get φ(r) = 2πd(r)/λ, and Eq. ( 3) can be rewritten as: To convert OPD back to wavefront or phase, a nominal wavelength (e.g., λ = 500 nm) can be used, as normally seen in other white light wavefront sensing techniques such as Shack-Hartmann.Finally, we make no specific assumptions about optics, thus Eq. (3) presumably works for all wavefront sensors in a similar configuration of Fig. 1.Some examples are in Table 1.

Connection to previous formulas
We now see how Eq. ( 3) is connected to previous formulas for single-shot or dual-plane wavefront sensing.Table 2 summarizes these formulas, which will be discussed in the following.The well-known Transport-of-Intensity Equation (TIE), a.k.a. the irradiance transport equation [13], serves as the basic principle for curvature wavefront sensors and variants [14,15,[26][27][28][29][30].
By performing a 1 st -order Taylor expansion for I(r + λz 2π ∇φ) around r, and setting I 0 (r) = 1 since there are no coding optics in traditional TIE setups, Eq. (3) reduces to: where I 1 (r) = | A(r)| 2 and I 2 (r) = I(r) are the first and second images captured on the optics and the sensor plane, respectively.Let I 1 (r) ≈ I 2 (r) ≈ Ī(r) and we obtain TIE with z → 0 justifying the finite difference approximation.As a result, Eq. ( 3) generalizes TIE to large distances z, i.e., to regions beyond the convergence radius of the 1 st -order Taylor expansion, hence beyond the finite difference range.Compared to TIE, Eq. ( 3) claims following advantages: • Our model is accurate for two planes that could be separated by a large distance (at mm level), whereas TIE is only valid at one particular transversal plane, and suffers from finite difference approximation for the z-axis derivative.
• Our model contains an image warping operation, which preserves the nonlinearity with respect to φ(r), i.e., I(r + λz 2π ∇φ) is nonlinear is a non-linear function of ∇φ.This non-linearity allows us to handle large displacements [35], which is crucial property for large-scale wavefront distortions.At the same time, a 1 st -order Taylor expansion of I(r + λz 2π ∇φ) yields the familiar linear formulations in TIE.• Our model extends the degrees of freedom for classical TIE systems to allow for a custom modulation mask (reflected as a customizable reference image I 0 (r), for which in TIE is usually uniform), thus making it possible for single-shot measurement.

Flow tracking
By dropping the sample amplitude and wavefront curvature, Eq. ( 3) reduces to: which we recognize as the famous optical flow formulation in computer vision [36], or the classical centroid tracking model for Shack-Hartmann, as well as the underlining model for previous speckle-pattern tracking phase imaging techniques [16-18, 20, 21, 31].

Amplitude-combined flow tracking
If dropping out only the wavefront curvature term, Eq. ( 3) reduces to which we recognize as models typically employed in X-ray phase imaging applications [19].
Another commonly seen formulation is which can be derived by assuming A r − λz∇φ/(2π) ≈ A(r) as in Eq. ( 31) of §A.

Dark field model
It is worth noting that Eq. ( 3) only considers refraction and absorption effects, excluding scattering contributions, i.e., the dark fields.Yet, we provide a connection to Zanette et al.'s dark field model [34], illustrating the feasibility of future development on Eq. ( 3).Zanette et al. [34] modified Eq. ( 8) by incorporating a dark field term D(r) that represents a scattering contribution: where Ī0 is the mean value of I 0 (r), and ∆I 0 (r) = I 0 (r) − Ī0 .With Eq. ( 3), we know part of D(r) originates from the caustic term 1 − λz∇ 2 φ/(2π) that contains wavefront curvature ∇ 2 φ, representing sample diffraction under a distorted wavefront along z, which is an effect of refraction.Therefore, the dark field term D(r) in Eq. ( 9) can be further refined to contain only scattering terms.As such, Eq. ( 3) provides new insights for incorporating dark field term into the image formation model.

Lateral wavefront resolution and numerical conditioning analysis
Based on Eq. ( 3), we would like to gauge the maximal lateral wavefront resolution that a wavefront sensor (e.g., the ones in Table 1) can achieve, and under what condition would it be possible.In this section, we analyze what the roles of distance z and coding optics (as in Fig. 1) are for a wavefront sensor.For a given optical system (fixed relay lens and pixel size), it turns out that distance z decides the theoretically achievable lateral wavefront resolution, whereas specific coding optics decide the numerical conditioning of the wavefront retrieval problem.In the remainder of this paper, without ambiguity, we refer to wavefront resolution as lateral wavefront resolution, i.e., the spatial wavefront resolution.

Theoretical resolution bound by distance z
Recall the approximation conditions when deriving Eq. ( 3), in terms of OPD d(r): Using Eq. ( 10) as constraints, one can derive a wavefront (or OPD) transfer function D(ρ) in terms of Fourier harmonics of frequency ρ.Let d(r) = D(ρ) cos(2πρ • r), and: Equation ( 11) fully characterizes the attainable wavefront resolution of classical wavefront sensors based on Eq. ( 3).From Eq. ( 11), at low frequencies, the wavefront transfer function is fixed regarding |ρ|, whereas at high frequencies the transfer function depends on z.We now discuss how z affects Eq. (11).Let ρ max be the maximum frequency that a diffraction-limited broadband optical system can achieve.Thus, ρ max depends on the numerical aperture (NA) and the image sensor pixel size, and is not considered related to the coding optics employed by the wavefront sensor.As an example, for an image sensors of pixel pitch = 5 µm, by Nyquist Theorem, ρ max = 100 mm −1 .It thus requires z ≈ 0.628 µm such that 1/z > 2πρ max .This distance, however, is too small for practical slope tracking sensors, but explains typical µm level defocusing distances in previous studies of curvature sensing [37].Finally, notice a similar bound was derived in [21] assuming non-overlapping speckles, which turns out to be a loose assumption that ∇ 2 d(r) ≤ 1/z.  3) to hold decreases with an increase of z.This z-|ρ| resolution trade-off applies to all classical wavefront sensors based on Eq. ( 3), especially for slope tracking sensors whose coding optics are usually mm away.
To visualize Eq. ( 11), we draw D(ρ) in Fig. 3 for two different z configurations.To obtain a tight bound from Eq. ( 11), let equality be attained at 10 %, i.e., ∇ 2 d(r) ≤ 0.1/z, denoted by the shaded regions in Fig. 3.In other words, for Fourier harmonic OPD d(r) within the shaded regions, Eq. ( 3) holds.Small z (≈10 µm, as in curvature sensing) achieves wavefront resolution at pixel sampling rate of 100 mm −1 , whereas larger z (≈1 mm, as for most slope tracking sensors) achieves approximately 1/5 pixel resolution in this case.Similar results have previously been obtained in experiments for speckle-tracking wavefront sensors [21,22].As a short conclusion, Eq. ( 11) defines the maximum theoretical resolution for classical deterministic phase retrieval wavefront sensors based on Eq. (3).

Numerical conditioning by coding optics
Previously we analyze the theoretically achievable wavefront resolution for Eq. ( 3) to hold at a fixed distance z.We now discuss how coding optics (hence reference image I 0 (r)) affects wavefront resolution in terms of numerical stability.
Recall we would like to solve φ(r) from Eq. (3) (or d(r) from Eq. (4) equivalently), given I 0 (r) and I(r).Unfortunately, sample intensity | A(r)| 2 and local wavefront curvature ∇ 2 φ(r) are coupled (hence correlated), meaning we can only rely on estimating ∇φ(r) for lateral recovery of φ(r).This property defines the basic principle for slope tracking wavefront sensors.As a result, one may reduce Eq.(3) to Eq. ( 7) or (8), either would provide the same analysis.For simplicity in terms of Eq. ( 8), by expanding the nonlinear term up to higher order terms (H.O.T.): To solve ∇φ(r), for non-singular samples that A(r) 0, we identify a purely coding optics dependent diagonal linear system with diagonals as ∇I 0 (r).Consider the condition number: In terms of numerical stability, κ can be used as a figure of merit for the performance of the optics in Fig. (1): the smaller κ, the better the optics chosen for slope tracking wavefront sensing.
As such, the "best" optics would produce uniform gradients and κ = 1.However, this design is not practical due to limited dynamic range of image sensors (usually of 8 to 10 bits).The other extreme is the classical Shack-Hartmann sensor, for which the microlens array produces only bright spots array with black background, and κ → ∞.In this case, the effective wavefront resolution is limited to the number of microlens, instead of the number of pixels.

Simulation results
To verify the proposed formula Eq. ( 3), we simulate a curvature sensor propagating through varies distance z, assuming no optics in Fig. 1.In other words, I 0 (r) = 1 in Eq. ( 3).In this way, we eliminate the influence of coding optics and evaluate only the physical correctness.In Fig. 4, we simulate a Gaussian wavefront (peak-valley ≈ 24λ) propagating through different z using the angular spectrum method [38,39] at wavelength λ = 550 nm, pixel size = 5 µm, under a sampling rate of 0.5 µm < λ.With oversampling, numerical propagation error is small compared to model errors of interest here.Given φ(r) and the obtained I(r) at different z, we numerically evaluate the absolute fitting errors of different formulas (TIE: Eq. ( 5); Flow: Eq. ( 7); Ours: Eq. ( 3)), with 1 -norm of the error maps shown in Fig. 4(b).Image warpings are implemented using cubic spline interpolation with piece-wise spline coefficients computed from I 0 (r).As revealed by Fig. 4, our formula maintains a low error throughout z range, whereas TIE fails for large z and flow-tracking formula fails for small z.This superior performance is the consequence of combining both TIE and flow-tracking as discussed previously.Notice all formulas start to fail for very large z because of diffraction, as a consequence of violating ∇ 2 d(r) 1/z in Eq. ( 10).For small wavefront slopes, the advantage of our formula Eq. ( 3) is not obvious compared to TIE, but still maintain its advantage comparing to flow-tracking Eq. ( 7).However, once the assumption ∇ 2 d(r) 1/z is violated, both TIE and our formula amplify the defocusing error, showing I(r) as a blurry image.
Similarly in Fig. 5, we simulate a real-captured cheek cell wavefront (dataset from [40]) propagates through different distance z, and evaluate the absolute fitting errors of different formulas.TIE and our formula produce almost the same error curves due to small wavefront slopes, and displacement is smaller than one pixel, z |∇d(r)| < , justifying the linear Taylor approximation from Eq. (3) to TIE.As in Fig. 4, both TIE and our formula fit well for small z ≤ 2 mm.However, at large z, the assumption that ∇ 2 d(r) 1/z does not hold anymore, and I(r) appears blurry.As a result, both TIE and our formula produce large errors because the caustic term 1 − z∇ 2 d(r) is not small and amplifies the defocusing error.From Fig. 4 and Fig 5, we conclude our formula combines the advantage of both approaches for short distance z when Eq. ( 10) holds.

Experimental results
In this section we show experimental results using a speckle-tracking wavefront sensor [20,22].
0.17 µm 0 Fig. 6.The "optics" for our wavefront sensor (variant of [20]) in Fig. 1, and its capability to recover amplitude A(r) and OPD d(r) = λ 2π φ(r) from reference I 0 (r) and measurement I(r).The condition number of our mask as defined in §3.2 is κ = 12.2, justifying a "good" enough coding optics for wavefront sensing.Notice the round biconcave bell shape of the human red blood cell has been successfully reconstructed.

Hardware prototype
We implemented a speckle-tracking wavefront sensor [20] using a binary phase mask.The mask was placed z = 1.43 mm in front of a bare image sensor (pixel size = 6.45 µm, 1501M-USB-TE, ThorLabs) with a cooling system to prevent thermal expansion of the mask (hence stabilizing the speckle pattern).Figure 6 shows the "optics" of our wavefront sensor (mask pixel size 12.9 µm), and the two experimentally obtained images I 0 (r) and I(r), from which by solving Eq. ( 3) we are able to recover sample amplitude A(r) and phase φ(r).We also modified a common bright field microscope by taking off the condenser, keeping a collimated illumination by a halogen lamp (HPLS245, ThorLabs).The original image sensor was replaced by our wavefront sensor.

Numerical solver
After discretization, Eq. ( 3) could be solved as a joint optimization problem a * , φ * = arg min a, φ where denotes element-wise multiplication, R 1,2 (•) are smoothness and gradient sparsity priors regarding a and φ, and the two variables are discretized versions of: We implemented the wavefront solver using an optimization scheme that solves for a and φ in an alternating fashion [22], with source code available online [41].The real-time solver was implemented in CUDA, running efficiently in parallel, requiring approximately 100 ms solving time per frame for a megapixel input image on a Nvidia Titan X (Pascal) GPU.

Quantitative phase imaging application
Figure 7 shows simultaneous amplitude and phase recovery for transparent thin cells.Given single shot measurement I(r), the numerical solver in §5.1.2recovers sample amplitude A(r) and phase φ(r).Notice how the speckle pattern in I(r) has been computationally removed, revealing sharp amplitude images in the reconstruction.Also notice our recovered wavefront map is slightly blurry, not only because of the fundamental limit analyzed as by Eq. ( 11), but partially due to diffraction-limited small cutoff spatial frequency of the microscopy objective.

Discussion
The derived Eq. (3) in §2.1 fully characterizes deterministic image formation in classical wavefront sensors such as Shack-Hartmann and curvature sensors.Using its ray optics nature as constraints, achievable wavefront resolution follows Eq. ( 11) in §3, revealing a z-|ρ| trade-off curve, offering a theoretical bound for future wavefront sensor designs.For slope tracking sensors, due to mm level distance z, wavefront resolution is fundamentally limited as in Fig. 3, and hence we suggest classical slope tracking sensors be applicable to large-scale smooth aberrations, e.g., large-scale adaptive optics [42] and autorefraction metrology [43].If higher resolution is desired, possible workarounds are scanning optics [44,45] for ptychography, combining spatial light modulators for computational sensing [46] beyond simple optics, or using numerical propagation based inversion [47,48].We may further extend Eq. ( 3) to include minor effects, e.g., dark fields from indirect scattering.This will yield straightforward extension to existing scattering models in propagation-based wavefront sensing, for example the Fokker-Planck equation in paraxial X-ray imaging [49,50].Another direction is to impose stronger assumptions on samples, e.g., weak phase that φ 1.We expect to achieve similar results of contrast [51] and mixed transfer function [52].
We may also take into account the higher-order infinitesimals in §A ignored when deriving Eq. (3).These small amounts render the diffraction effects, although not considered easy for wavefront sensing via the inversion numerical scheme in §5.1.2,could be helpful to forward modeling in freeform lens designs for caustic imaging [24,53,54] to reduce blurring artifacts.

Conclusion
To conclude, we derive a new image formation model Eq. ( 3) for general classical wavefront sensors consist of a simple coding optics and an image sensor as in Fig. 1.This new model generalizes TIE to far distance regions beyond the finite difference approximation as shown in §2.2, verified by simulations in §4.The validity of this model depends on propagation distance z, which in turn defines achievable spatial wavefront resolution as analyzed in §3.Finally, using the new model, we show a quantitative phase imaging application in §5.2 via a speckle-tracking wavefront sensor.We believe this more general model could be useful in propagation-based deterministic wavefront sensing applications, for those TIE is replaceable by Eq. (3).

A. Wave optics derivations
In this appendix, we derive Eq. ( 3) from first principles, in wave optics.
A.1.Diffraction models Let r = (x, y).Consider in free space monochromatic scalar field u 0 (r) propagates through a short distance z and becomes u z (r).Their relationship is [13,38]: (spatial domain) (frequency domain) (15) where ρ is the Fourier dual of r, and U 0 (ρ) and U z (ρ) are Fourier transforms of u 0 (r) and u z (r) respectively.Denote F as Fourier transform, then U 0 (ρ) = F {u 0 (r)} and U z (ρ) = F {u z (r)}.λ is wavelength, k = 2π/λ is wave number, and ∇ 2 is the Laplacian operator.We follow the convention that | • | denotes the 2 -norm of a vector, e.g., |ρ| = (ρ 2 x + ρ 2 y ) 1/2 .The Fresnel diffraction formula can be derived by 1 st -order Taylor expansion the matrix exponential in Eq. ( 15) [13].Specifically, near field (0 th and 1 st -order) can be written as Fresnel propagator u 0 (r) ≈ 1 + j z 2k ∇ 2 0 th and 1 st -order u 0 (r), where we neglect constant phase exp(jk z), and the approximations lead to constraint max 1 where we safely assume We will be using Eq. ( 15) for total field diffraction (potentially of high frequency), whereas using Eq. ( 16) for analyzing sample-related diffraction (potentially of low frequency).
In the following, small letters denote scalar fields in spatial domain, and capital or calligraphic letters denote scalar fields in Fourier domain, with subscript 0 and z denote the original and diffracted fields, respectively.

A.2. Diffraction from optics to sensor
The initial field u 0 (r) is a multiplication of general (amplitude or phase) complex transfer function p 0 (r) of the optics and a general sample scalar field f 0 (r) = A(r) exp[jφ(r)]: For diffraction field u z (r) at distance z, using Eq. ( 15) via expanding the propagation in the Fourier domain, we have [20]: where the third equality of Eq. ( 20) results from the introduction of variable ρ = ρ − ρ , and • denotes inner product.The approximation comes from (for λ = 500 nm and pixel size = 6.45 µm, at Nyquist frequency λ 2 |ρ| 2 ∼ 0.0015 1) [13]: To simplify notation, we neglect constant phase, and substitute ρ with ρ in Eq. ( 20): Notice that Eq. ( 22) is a general formula and is applicable to any field that u 0 (r) = p 0 (r) f 0 (r).
We now focus on deriving sample diffraction field f z (r − λzρ).

A.3. Sample diffraction
Let g 0 (r) = exp[jφ(r)], f 0 (r) = A(r)g 0 (r) and Eq. ( 22) is applicable to expressing f z (r): where A z (ρ) is the Fourier transform of A z (r), which is the diffraction field of A(r).Assuming near field for A(r) and g 0 (r) respectively, by Eq. ( 16) we have: ≈ g z (r) ≈ Since we assume near field for computing g z (r), the conditions are, following Eq.( 17): z 2k ∇ 2 φ(r) (17) 1 Notice we have obtained the two identical constraints when deriving Eq. ( 3) in ray optics, justifying the conditions in §2.1.For sufficiently small λz|ρ| compared to pixel size , we have valid 1 st -order approximation for φ(r − λzρ), and 0 th -order approximation for higher orders: By Eq. ( 28), we may obtain g z (r − λzρ) from Eq. ( 25), and hence f z (r) in Eq. ( 23) derives as: We further assume smooth sample amplitude A(r) with a small spectrum with respect to sampling frequency 1/ .As a result, A(r − λzρ) ≈ A(r).Thus, and given Eq. ( 29), we may obtain f z (r − λzρ) as: where we neglect constant phase exp[jφ(r)] in the expression.We should point out another way of deriving f z (r − λzρ) is to assume near field directly on f 0 (r), instead of imposing separately on its amplitude and phase as shown previously.Thus, its condition of validity is considered more restrictive.As such, f z (r − λzρ) derives as: Notice the small difference in the amplitude term, i.e., A(r) versus A r − (z/k)∇φ(r) in Eq. (30).

Fig. 1 .
Fig. 1.General single-shot wavefront sensor model.The unknown wavefront φ(r) is numerically solved from cached I 0 (r) and single-shot measurement I(r).

Fig. 2 .
Fig. 2. Monochromatic free space light propagation as image formation model.

Fig. 3 .
Fig. 3. Wavefront transfer function for different optics-sensor distance z configurations.Area of valid regions (shaded) for Eq.(3) to hold decreases with an increase of z.This z-|ρ| resolution trade-off applies to all classical wavefront sensors based on Eq. (3), especially for slope tracking sensors whose coding optics are usually mm away.

Fig. 4 .
Fig. 4. Errors of different formulas for a Gaussian wavefront propagating through z.As a mixed approach, our formula Eq. (3) outperforms alternative formulations in terms of 1 -fitting errors.With a gradual violation of ray optics and weak diffraction assumption (increasing z, and hence breaking of ∇ 2 d(r) 1/z), all formulas start to fail.

Fig. 5 .
Fig.5.Errors of different formulas for a real-captured cheek cell wavefront propagating through z.For small wavefront slopes, the advantage of our formula Eq. (3) is not obvious compared to TIE, but still maintain its advantage comparing to flow-tracking Eq. (7).However, once the assumption ∇ 2 d(r) 1/z is violated, both TIE and our formula amplify the defocusing error, showing I(r) as a blurry image.

Fig. 7 .
Fig. 7. Experimental results using the proposed quantitative phase imaging pipeline.Images were taken under a ×100 Mitutoyo plan apochromat objective, 0.70 NA.Inset close-up images indicate that the speckle patterns have been fully removed from the original raw data.Wavefronts are shown in terms of OPD.

Table 1 .
Optics and corresponding reference image of different wavefront sensors.

Table 2 .
Theoretical models used in classical wavefront sensing research.