Iterative aperture mask design in phase space using a rank constraint

We present an iterative camera aperture design procedure, which determines an optimal mask pattern based on a sparse set of desired intensity distributions at different focal depths. This iterative method uses the ambiguity function as a tool to shape the camera’s response to defocus, and shares conceptual similarities with phase retrieval procedures. An analysis of algorithm convergence is presented, and experimental examples are shown to demonstrate the flexibility of the design process. This algorithm potentially ties together previous disjointed PSF design approaches under a common framework, and offers new insights for the creation of future application-specific imaging systems. ©2010 Optical Society of America OCIS codes: (110.1220) Apertures; (050.5082) Phase space; (110.1758) Computational imaging. References and links 1. E. R. Dowski, Jr., and W. T. Cathey, “Extended depth of field through wave-front coding,” Appl. Opt. 34(11), 1859–1866 (1995). 2. W. Chi, and N. George, “Electronic imaging using a logarithmic asphere,” Opt. Lett. 26(12), 875–877 (2001). 3. H. Zhao, and Y. Li, “Optimized sinusoidal phase mask to extend the depth of field of an incoherent imaging system,” Opt. Lett. 35(2), 267–269 (2010). 4. A. Greengard, Y. Y. Schechner, and R. Piestun, “Depth from diffracted rotation,” Opt. Lett. 31(2), 181–183 (2006). 5. A. Levin, R. Fergus, F. Durand, and W. T. Freeman, “Image and depth from a conventional camera with a coded aperture,” ACM Trans. Graph. 26(3), 70 (2007). 6. M. D. Stenner, D. J. Townsend, and M. E. Gehm, “Static Architecture for Compressive Motion Detection in Persistent, Pervasive Surveillance Applications,” in Imaging Systems, OSA technical Digest (CD) (Optical Society of America, 2010), paper IMB2. http://www.opticsinfobase.org/abstract.cfm?URI=IS-2010-IMB2 7. A. Ashok, and M. Neifeld, “Pseudo-random phase masks for superresolution imaging from subpixel shifting,” Appl. Opt. 46(12), 2256 (2007). 8. R. W. Gerchberg, and W. O. Saxton, “A practical algorithm for the determination of phase from image and diffraction plane figures,” Optik (Stuttg.) 35, 237–246 (1972). 9. J. R. Fienup, “Iterative method applied to image reconstruction and to computer generated holograms,” Opt. Eng. 19, 297–305 (1980). 10. D. Elkind, Z. Zalevsky, U. Levy, and D. Mendlovic, “Optical transfer function shaping and depth of focus by using a phase only filter,” Appl. Opt. 42(11), 1925–1931 (2003). 11. W. D. Furlan, D. Zalvidea, and G. Saavedra, “Synthesis of filters for specified axial irradiance by use of phasespace tomography,” Opt. Commun. 189(1-3), 15–19 (2001). 12. A. Papoulis, “Ambiguity function in Fourier optics,” J. Opt. Soc. Am. 64(6), 779–788 (1974). 13. K. H. Brenner, A. Lohmann, and J. Ojeda-Castaneda, “The ambiguity function as a polar display of the OTF,” Opt. Commun. 44(5), 323–326 (1983). 14. M. Testorf, B. Hennelly, and J. Ojeda-Castaneda, Phase Space Optics: Fundamentals and Applications, (McGraw-Hill, (2010). 15. J. Ojeda-Castañeda, L. R. Berriel-Valdos, and E. Montes, “Ambiguity function as a design tool for high focal depth,” Appl. Opt. 27(4), 790–795 (1988). 16. M. G. Raymer, M. Beck, and D. F. McAlister, “Complex wave-field reconstruction using phase-space tomography,” Phys. Rev. Lett. 72(8), 1137–1140 (1994). #134103 $15.00 USD Received 27 Aug 2010; revised 22 Sep 2010; accepted 27 Sep 2010; published 8 Oct 2010 (C) 2010 OSA 11 October 2010 / Vol. 18, No. 21 / OPTICS EXPRESS 22545 17. J. Tu, and S. Tamura, “Wave-field determination using tomography of the ambiguity function,” Phys. Rev. E Stat. Phys. Plasmas Fluids Relat. Interdiscip. Topics 55(2), 1946–1949 (1997). 18. X. Liu, and K. H. Brenner, “Reconstruction of two-dimensional complex amplitudes from intensity measurements,” Opt. Commun. 225(1-3), 19–30 (2003). 19. M. R. Teague, “Deterministic phase retrieval: a Green’s function solution,” J. Opt. Soc. Am. 73(11), 1434–1441 (1983). 20. L. Waller, L. Tian, and G. Barbastathis, “Transport of Intensity phase-amplitude imaging with higher order intensity derivatives,” Opt. Express 18(12), 12552–12561 (2010). 21. J. W. Goodman, Introduction to Fourier Optics (McGraw-Hill, 1982), Chap. 4. 22. E. Wolf, “New theory of partial coherence in the space-frequency domain. Part I: spectra and cross spectra of steady-state sources,” J. Opt. Soc. Am. 72(3), 343–351 (1982). 23. H. M. Ozaktas, S. Yüksel, and M. A. Kutay, “Linear algebraic theory of partial coherence: discrete fields and measures of partial coherence,” J. Opt. Soc. Am. A 19(8), 1563–1571 (2002). 24. L. Hogben, Handbook of Linear Algebra (Chapman & Hall, 2007) Chap. 5. 25. J. R. Fienup, “Phase retrieval algorithms: a comparison,” Appl. Opt. 21(15), 2758–2769 (1982). 26. R. Piestun, and J. Shamir, “Synthesis of three-dimensional light fields and applications,” Proc. IEEE 90(2), 222– 244 (2002). 27. A. W. Lohmann, “Three-dimensional properties of wave fields,” Optik (Stuttg.) 51, 105–117 (1978).


Introduction
Computational imaging systems use a co-designed optical pre-processing element and postprocessing algorithm to achieve new camera modalities. Examples include extended depth-offield (EDOF), depth detection, object tracking, and super-resolution, among others [1][2][3][4][5][6][7]. The optical pre-processing element typically takes the form of a phase and/or amplitude mask placed in the pupil plane of the imaging setup. This aperture mask uniquely modifies the point spread function (PSF) response of the camera at any plane of defocus. For example, an EDOF system uses a mask that causes the PSF to be invariant to defocus, while depth detection systems design PSFs to change significantly with defocus, but in a predictable manner. Determining an optimal mask for a given system is connected with the specifics of the desired functionality, camera parameters and imaging environment.
In this paper, we explore a method of designing aperture masks that offers a high degree of control over a camera's PSF response. An iterative algorithm based on Woodward's ambiguity function (AF) takes a set of desired PSF intensities at different defocus planes (i.e., at different planes along the direction of propagation), and solves for the optimal amplitude and phase distribution for a mask at the pupil plane. If the desired PSF intensities are not physically realizable from a thin mask, the algorithm converges to a nearby solution that is realizable.
Unlike traditional phase retrieval iteration algorithms [8,9], our approach uses a unique constraint: the mutual intensity function. Working in this space allows us to apply a global constraint to the entire light field each iteration, as opposed to just a planar slice of the field (Fig. 1). While aperture mask optimization has previously been attempted using a phase retrieval algorithm or phase space tomography for EDOF [10,11], we will show how our algorithm can provide approximate solutions to unique and rapidly varying 3D PSFs that are often desired in computational imaging applications. Following, we present an explanation of how we use the AF as an iterative design tool, an analysis of our algorithm's performance, and an experimental demonstration of an example aperture mask.

Contributions
1. An iterative technique named "mode-selective" to estimate the amplitude and phase of a wavefront from a sparse number of intensities along the direction of propagation.
2. The application of mode-selective iteration to design amplitude and/or phase masks for PSF engineering 3. An analysis of the design process for masks that create arbitrarily desired PSF responses at different defocus depths. All can determine the amplitude and phase at one plane from 2 or more intensity distributions. (a) Phase retrieval techniques iterate between a few planes and the mask plane, one at a time.
(b) Phase-space tomography techniques do not iterate, but require intensity distributions from many planes. (c) Our method iterates between the mask plane and a few planes viewed simultaneously, using a global rank-1 constraint on the mutual intensity function.

Mask design procedure
For simplicity, we assume quasi-monochromatic light in the paraxial region and consider a camera and mask setup in 1D geometry.

Ambiguity function representation of a camera
The ambiguity function (AF) is considered a notably useful function for modeling a camera's response to defocus [12,13]. It is a "phase space" function of both space (x) and spatial frequency (x'), and has close ties to the well known ray space of geometric optics [14]. From Fig. 2(a), we represent the 1D plane wave incident upon the pupil plane mask, U(x), with the 2D ambiguity function A, where x and x' are the space and spatial frequency coordinates at the pupil plane, respectively, u is a second parameter proportional to defocus, and the asterisk represents complex conjugation. Note that since we are considering an incident plane wave, U(x) at the pupil plane is equivalent to the function that defines the amplitude and phase of the aperture mask. The wavefront's mutual intensity function is obtained from the AF through an inverse Fourier transform and coordinate transformation to center-difference coordinates x 1 and x 2 : In practice, this transformation is performed on a discrete AF function as an inverse Fourier transform along u, a rotation of 45°, and a coordinate re-scaling by one half along x 1 . This function will be used as a constraint in the iteration process in Section 2.2. Setting the x 2 coordinate to zero yields, which shows the wavefront U(x), and hence mask pattern, can be recovered from the AF up to a constant phase factor. The AF of the aperture mask function U(x) can represent all defocused OTFs of the imaging system [13]. Specifically, a defocused OTF H at any plane z n along the direction of propagation in Fig. 2 is given by, where k is the wavenumber and W 20 is a defocus coefficient [1] here defined as, Equation (5) assumes an object plane at infinity, where ∆z denotes defocus distance, r is the radius of the lens, and f is its focal length. The complicated OTF function in Eq. (4) can be simply represented as a slice through the middle of the AF function: In other words, a camera's optical response H at any plane of defocus is given as a slice through the center of its aperture function's 2D complex AF at an angle   20 tan , Wk   as shown in Fig. 2(b). The utility of this special property was primarily noted while designing apodizing masks for EDOF purposes, where one wishes to establish a depth-invariant OTF [15]. What this paper attempts to do is move away from using the AF as a function to view the performance of a given mask, but to instead design a mask output from a desired set of OTF or PSF inputs. This design process is an inverse problem, where we will attempt to establish the AF that best matches a desired set of OTFs at different planes of defocus. Once this optimized AF is known, the mask pattern to place at the aperture can then easily be determined, up to a constant phase factor, from Eq. (3).

Ambiguity function for design
From above, it is clear that if we are able to design a physically valid AF from desired inputs, we can establish an optimal mask to place at the pupil plane. Not surprisingly, a large field of work has been dedicated to determining a full phase space function of a given wavefront from multiple intensity measurements at different planes along the direction of propagation. Phasespace tomography [16][17][18] borrows tools like the Radon transform and filtered backprojection from tomography to reconstruct a 2D phase space function from a set of its 1D slices. Unfortunately, a tomographic approach typically requires many experimental measurements, unlike our few desired input intensities, which may not even be physically realistic. Similarly, methods based on the transport-of-intensity equation [19,20] generate a phase space function from two or more closely spaced measurements, but do not facilitate the design of arbitrary intensities at widely spaced planes. To map a few desired input values to a fully valid AF, we propose an iterative solution using constraints available in the mutual intensity domain. We begin our procedure by defining a set of n desired OTFs at planes of different defocus z n , which we would like our imaging system to achieve (Fig. 3(a)). Note that these OTFs can be directly determined from desired PSF intensity patterns through a well-known Fourier relationship [21]. There are no fundamental restrictions on n, z n , or the shape of the curves, although Sections 3 and 4 will examine how performance varies with these parameters. An "approximate" AF function is populated with these desired OTFs at slices from Eq. (6), each filling two slices in Fig. 3(b) given a symmetric aperture, which is then used as an input to an iteration procedure. To obtain a more realistic initial AF approximation (Fig. 3(c)), a one-time interpolation is performed between input slices based on a Taylor power series expansion with respect to u, which is similar to a previously used expansion along W 20 [15]. Equation (7) simply fills in zeros between populated slices to better pose the function for an iteration process, and is typically carried out to the second order. A constraint must be applied to verify this approximate AF obeys Eq. (1) for a given wavefront U(x). Since our model is for PSF measurements, we can assume prior knowledge of a spatially coherent point source propagating to the lens. Thus, our final AF solution at the pupil plane originates from a spatially coherent wavefront U C (x). It is well known that a spatially coherent wavefront has a fully separable mutual intensity, here given as Γ [22]: This separation is identical to the transformation result of Eq. (2). The importance of Eq. (8) becomes clear from a linear algebra viewpoint. Ozaktas et al. [23] demonstrate that a discrete coherent mutual intensity matrix must fulfill a rank-1 condition, whose value can be given by the first singular value of a singular value decomposition (SVD): Here we show Γ decomposed into the well known SVD matrices S and V, with the coherence restriction on a discrete U c allowing us to represent it as the outer-product between the first column of S (s 1 ), and the first row of V (v 1 ). Since a spatially coherent wave is composed of a single mutual intensity mode, all singular values besides λ 1 are 0. This constraint reduces our redundant 2D phase space representation to the two 1D vectors s 1 and v 1 , which are equal if Γ is positive semi-definite (Fig. 4). In summary, the algorithm first creates an "approximate" mutual intensity (Fig. 3(d)) from the approximate AF using Eq. (2), which is then constrained to a single coherent SVD mode (Fig. 3(e)). Continuing with linear algebra notation, a coherent AF (Fig. 3(f)) can now be created from the mutual intensity's first singular value: where R is a 45° matrix rotation, L scales the axis by two, and F is a Fourier transform along one dimension in the rotated coordinate system. Equation (10) is an implementation of Eq. (1) in a discrete matrix operation form. After this procedure, the AF provides a physically realistic representation of our PSF measurement setup, but may not optimally match the desired inputs. Once again, originally desired OTFs are used to populate the new AF guess (which is normalized to unity) at their respective slices and the outer product constraint of Eq. (9) is applied. This procedure iterates until convergence to a final AF, which will match desired responses within a specified error threshold. This AF can be inverted using Eq. (3) to solve for the optimal aperture mask function up to a constant phase factor, or can directly determine an OTF at any other defocus plane from Eq. (6). Since the SVD in Eq. (9) provides a rank-1 approximation of the original matrix with minimized Euclidean error, from the Eckart-Young Theorem [24], quick convergence is expected. As noted earlier, this iterative approach of replacing OTF values shares many similarities with the iterative replacement of amplitude values in the well-known phase retrieval methods of Gerchburg and Saxton [8] and Fienup [9], and could indeed be applied outside of a camera in a holographic setup. However, instead of cycling through the system one plane at a time, this mode-selective approach replaces values and constrains the entire system at each iteration step. Benefits of this include an even weighting of error in the presence of noise and direct control over the system's state of coherence. Fig. 4. The decomposition of a mutual intensity function. An initial mutual intensity "guess" of a wavefront incident upon an open aperture (left) is decomposed into multiple modes using an SVD (right), with weights given by their singular values. For example, λ 1 = 1, λ 2 = 0.21, and λ 3 = 0.13 after the algorithm's first iteration, but quickly approach a single large value. Our constraint simply takes the first mode of this decomposition.

Iterative mode selection
The mode-selection algorithm restricts a set of desired intensity patterns, which may or may not obey the constraints of propagation, to a solution that follows coherent wave propagation. Therefore, two regimes of performance evaluation are necessary. The regime considered in this section will test performance for a set of OTF inputs that are known to obey the constraints of propagation, which is equivalent to testing the algorithm's ability to recreate an entire AF from a few OTF inputs generated from a known mask. This will demonstrate iterative mode-selection's accurate convergence. One could imagine using a known mask as a design starting point, and then altering PSF or OTF responses to determine a new mask for a specific application. In the next section, we will test the algorithm's ability to converge to arbitrary desired sets of intensity distributions, which may be impossible to recreate exactly.
As a first example, we model the binary amplitude mask distribution in Fig. 5(a), whose form is similar to masks previously used for depth detection. Three of the OTFs it generates are used as algorithm input: one in-focus, one at W 20 = 0.25λ, and one at W 20 = 0.5λ (Fig. 5(b)). For a 10mm mask and a lens with 50mm focal length, this corresponds roughly to 0.1mm and 0.2mm of sensor defocus, respectively. After 25 iterations, the algorithm converges to the AF, mask function and OTFs in Fig. 5(c) and Fig. 5(d). Since the original inputs obey propagation, we expect iterative mode-selection to approach an exact reproduction of the OTFs, which it nearly achieves. The performance metric of mean-squared error (MSE) from desired OTFs is 0.007, which is on the order of error from phase retrieval approaches [25].
As a second example, we use the well-known continuous Cubic Phase Mask (CPM) [1] to generate three 1D OTFs for algorithm input. Unlike the previous example, this mask provides a depth-invariant blur for EDOF systems. In its basic form, the 1D mask has a phase , where α is a parameter that controls the phase deviation of the mask. Figure 6(c) displays the reconstructed AF from using three depth-invariant OTFs as input. Since the CPM is a phase-only element, we place a restriction on the algorithm at each iteration to keep the phase-only contribution. The output mask, given as a separable 2D distribution in Fig. 6(d), shows a clear cubic phase profile. The reconstructed OTFs in Fig. 6(a) show a total MSE of 0.004 from expected. By providing a method to alter and optimize the AF at numerous planes of defocus, the proposed algorithm has a large potential for assisting the design process of EDOF systems.  Since the above examples originate from OTF inputs that obey propagation, the algorithm can converge to a solution that matches the originally desired OTFs, up to a negligible amount of error, upon iteration. Figure 7 displays this convergence to a low MSE, which approaches zero for very large iterations, as well as the mutual intensity function's ability to approach a single mode (one large singular value). Likewise, both examples used three inputs at three easily definable, uniformly separated depths, for demonstration purposes. In fact, any number of inputs at any plane of depth could be used, and algorithm performance will vary as input parameters change. Clearly, if OTF slices that obey propagation are used, it is desirable to fill in more of the AF with a larger number of estimates n. Likewise, a larger maximum defocus distance ∆z (i.e., a larger W 20 value) will allow for a wider wedge area of the AF to be filled in, as is known in tomographic reconstruction problems. Both of these trends are demonstrated in Fig. 8, but do not remain valid in an arbitrary design situation. values, representing modes of partial coherence, approach a single mode with increased iteration n (shown for the binary mask example). This single mode implies spatially coherent light, which follows from our assumption of modeling a camera's response to a point source. Fig. 8. A demonstration of algorithm performance as a function of two free parameters: maximum input defocus parameter and number of input slices. (a) MSE between all input and output OTFs decreases as the maximum input defocus parameter is increased for both example masks. Each MSE value is an average MSE for 3 to 8 equally spaced input planes, each after 15 iterations. (b) MSE of input vs. output OTFs also decreases as the number of pre-determined equally spaced input planes is increased. Here, each MSE value is an average over a maximum W 20 value of 2λ-7/2λ, also after 15 iterations.

Mode selection for desired PSFs
The above examples confirm convergence to known PSF responses. In the case of arbitrary inputs which may not obey propagation, the mode-selection algorithm becomes a method of sculpting an approximate 3D PSF from a desired 3D PSF intensity response with the AF. Arbitrary OTFs yield an AF guess at the initiation of the algorithm that does not obey Eq. (1). The constraint in Eq. (9) that takes the first SVD mode guarantees a valid coherent AF function, but at the expense of altering the desired response at each plane z n . MSE minimization depends heavily upon the input PSF (or corresponding OTF) complexity along z. A set of PSFs that vary rapidly with defocus will be difficult to recreate due to the constraints of propagation. We refer the reader to discussions of 3D wavefront restrictions presented in [26,27] for more insight into the limitations of designing an intensity pattern at multiple planes along the direction of propagation.
Iterative mode-selection can be applied to find an aperture mask to approximate any set of desired PSF intensity patterns. One arbitrary but demonstrative PSF set of one point in-focus, two points at an initial defocus plane, and three points at a further defocus plane is considered in the next two sections. This type of "counting" 3D PSF has a potential application in depth detection, but is mostly used as an illustrative example. Figure 9 presents the process of our algorithm in simulation, with the same camera parameters as last section. The three desired OTFs from equally separated defocus planes of 0.1mm and 0.2mm populate the AF, which iterates to yield optimized OTFs with an MSE of 0.032 from desired responses. The optimal amplitude and phase distribution to generate the responses in Fig. 9(d) is in Fig. 10(a). Figure 10 also includes optimal amplitude-only and phase-only masks determined using a different restriction each iteration, which generate similar but slightly different PSF responses.
The influence of the number of inputs n and their defocus distance ∆z n becomes less predictable for the unknown mask case. From simulation and through experiment, it appears that the complexity (i.e., rate of change) of the desired OTF set is the most significant influence on MSE performance. MSE versus maximum input plane distance does not follow a general trend, and has been examined in part in [26]. Likewise, as opposed to a set that follows the propagation equation, increasing the number of arbitrary desired OTFs can overconstrain the design problem. Since the approximated output wavefield must be compatible with the Fresnel propagation process that ties all planes of defocus together, more inputs indicates a riskier search. Even specifying intensities at two different defocus planes may not offer an approximate solution, which is especially relevant for closely spaced planes.

Experimental verification
To verify the performance of our algorithm, the optimal amplitude mask pattern described in the previous section is tested with a PSF measurement experiment. An aperture mask similar to Fig. 10(c) is printed as a binary pattern on a 25μm-resolution transparency and placed on the aperture stop of a Nikon 50mm f/1.8 lens, with the stop opened to a 1.5cm diameter. The expect 1 PSF peak to turn into 4, and then into 9 with defocus. Discrepancy between actual and predicted PSFs can be attributed to a low SNR due to long required integration time, a non-zero extinction of the printed mask, the finite size of the point source and possible aberrations. However, the general trend of an in-focus peak splitting into 4 and then 9 spots is observable. Fig. 11. Three simulated PSFs (a) in focus, (b) at a defocus plane of 0.1mm, and (c) at a defocus plane of 0.2mm. (d)-(f) PSF measurements obtained with the experimental setup described above for the same amounts of defocus as each PSF in (a)-(c) above. The scale bar in the lower right represents 50μm.

Limitations and future work
While iterative mode-selection can recover known mask patterns and find desired mask patterns with a high degree of accuracy, there are still a number of failure cases and areas for future improvement. To begin, simply taking the first singular value for a rank-1 mutual intensity function is an approximate restriction. While rarely observed, this first value could be negative, or could lead to mode values with a dynamic range too large to fabricate. Furthermore, while an exact solution is approached when the input OTFs are known to obey propagation constraints, arbitrary OTFs may iterate to local instead of global minima. These problems may be overcome with an exact solution to phase space function inversion. Another limitation of current PSF generation examples is their restriction to 1D. Extension to 2D requires a 4D AF, which increases computational complexity, but is needed for application of mode-selective iteration to areas like holographic display. Additionally, a 4D mode-selective process will need to be modified to perform a 4D SVD while not over-constraining astigmatic regions of the mutual intensity function. Finally, the potential drawbacks of limiting the aperture masks to be either amplitude-only or phase-only due to fabrication constraints was not fully considered. Future work could consider the benefits of using either amplitude or phase masks for different desired 3D PSFs. In general, a major advantage of using iterative mode-selection over phase retrieval or phase-space tomography is its direct extension to the use of multiple mutual intensity modes. Selecting more than one mode during iteration allows for optimization of partially coherent wavefields, which will be the focus of future efforts.