Computational multifocal microscopy

Despite recent advances, high performance single-shot 3D microscopy remains an elusive task. By introducing designed diffractive optical elements (DOEs), one is capable of converting a microscope into a 3D"kaleidoscope", in which case the snapshot image consists of an array of tiles and each tile focuses on different depths. However, the acquired multifocal microscopic (MFM) image suffers from multiple sources of degradation, which prevents MFM from further applications. We propose a unifying computational framework which simplifies the imaging system and achieves 3D reconstruction via computation. Our optical configuration omits chromatic correction grating and redesigns the multifocal grating to enlarge the tracking area. Our proposed setup features only one single grating in addition to a regular microscope. The aberration correction, along with Poisson and background denoising, are incorporated in our deconvolution-based fully-automated algorithm, which requires no empirical parameter-tuning. In experiments, we achieve the spatial resolutions of $0.35$um (lateral) and $0.5$um (axial), which are comparable to the resolution that can be achieved with confocal deconvolution microscopy. We demonstrate a 3D video of moving bacteria recorded at $25$ frames per second using our proposed computational multifocal microscopy technique.


Introduction
Typically, an optical microscope focuses on one focal plane at a time. Thus, recovering 3D information with sufficient resolution usually requires sequentially z-scanning the focal planes. This process is time consuming and requires precise mechanical control. The long acquisition time of focal scanning microscopy fundamentally prevents the applications in in vivo imaging, especially when the 3D movement of biomedical objects is desired. To realize single-shot 3D microscopy, a standard microscope has to be modified in such a way that 3D information can be encoded onto a 2D plane.
One of the few modifications to achieve single-shot 3D microscopy, for example, is by placing a diffractive optical element (DOE) in the Fourier plane of a standard microscope (Fig. 1a). The DOE is designed as a distorted phase gratings [1][2][3], also referred to as a multifocal gratings (MFG). The MFG projects different depth layers in a 3D volume onto different sub-regions of the imaging plane simultaneously. In particular, if the imaging plane is divided into l × l tiles and each tile is focused on one focal plane by design, then a 3D volume of l 2 depths can be retrieved by z-stacking all the tiles. An l = 3 case is shown in Fig. 1(c). This imaging technique is also referred as multifocal microscopy (MFM) [4].
MFM stands out among other single-shot 3D microscopy, e.g. light fields microscopy [5,6]   ). In (a), a conventional microscope is augmented by a 4 f system. An MFG (b) is inserted at the Fourier plane of the 4 f system to produce an array of l × l differently focused tile images in a single exposure (c; l = 3). Note that our CMFM system discards CA corrective optics, significantly reducing the system complexity and cost compared to conventional MFM. We correct for CA computationally rather than optically.
(d-f) The pipeline of proposed computational framework: by capturing a z-stack 3D PSFs (d), the algorithm can simultaneously recover the background noise b * , the optimal regularizer parameter λ * and a high resolution 3D image (f) from a single captured 2D MFM image (e). and lensless 3D coded microscopy [7][8][9], for its capability of achieving diffraction-limited 3D resolution comparable to conventional focal scanning microscopy. However, as an imaging system, MFM faces several imaging issues, some of which are even more severe that prevents MFM from broader applications: • Out-of-focus blur. MFM captures an array of multifocal planes. Like standard wide-field microscopy, each focal plane image is convolved by out-of-focus light. Estimating and removing out-of-focus blur requires implementation of a 3D deconvolution algorithm.
Our proposed algorithm builds on the regularized Rechardson-Lucy (RL) deconvolution, and features automatic estimation of the background noise and the optimal regularizer parameter.
• Chromatic aberration (CA). Conventionally, CA is optically corrected by adding a chromatic correction gratings (CCG) and a multifacet prism after the MFG [4], resulting in excessive hardware cost. In our work, we propose a simplified version by abandoning the CCG and the prism. Instead, we place a 10 nm bandpass filter in front of the detector to mitigate CA, as shown in Fig. 1(a). We demonstrate that the resolution loss due to CA can be computationally compensated. This comes from a fact that CA from MFG is directional due to tile geometry. In each tile, the PSF is stretched mainly in one direction, but preserves imaging quality in the orthogonal direction. Therefore, by jointly incorporating all the PSFs in each tile, our proposed 3D deconvolution algorithm enables restoring the resolution loss due to CA.
• Poisson noise due to limited photon budget. There are three factors contribute to the limited photon budget of the MFM. First, as mentioned before, a narrower bandpass filter of 10nm is used to mitigate CA, and thus, the total number of emission photons arriving at the detector is smaller than when using CA corrective optics. Second, the theoretical maximum efficiency of the MFG is smaller than one. For example, the theoretical limits are 68% for a binary phase-only (0 or π) MFG with 3 × 3 diffraction orders, and 78% for a MFG with 5 × 5 diffraction orders, which means 32% or 22% of the total emission light will be lost and cannot be collected by the detector. The efficiency can be improved by using a multi-phase MFG [10] at the expense of the complicate fabrication process. Third, each tile only shares about 1/l 2 of the total photons. The limited photon budget requires a noise modeling of Poisson process. We incorporate the Poisson noise modeling for MFM, for the first time.
• Field-of-view (FOV). As described above, MFM trades lateral FOV for depth information.
Conventional MFM designs use large tile spacings for large objects. We address here that this design is not optimal for tracking dynamic objects, which are usually small in size but require large FOV. We analyze the FOV performance theoretically, and demonstrate that by modifying the MFG design to create a smaller tile spacing, small objects (e.g. bacteria) can be tracked over a larger lateral area than the tile width.

Related works
The 2D sensing limitation imposed by optical sensors makes 3D imaging an interesting and active research topic both in macroscopic and microscopic regime. In microscopy, several approaches have been proposed for a variety of imaging conditions. One scheme for achieving 3D imaging is to use coherent illumination, e.g. digital holographic microscopy [11][12][13], variably-controlled light-emitting diodes (LED) arrays [14,15], etc. Yet the coherent modeling of light propagation does not apply to incoherent/fluorescent objects. For the incoherent case, one seminal idea is to introduce self-interference by projecting a set of Fresnel patterns [16][17][18]. However, most of the existing methods, especially for incoherent cases, require multiple exposures and massive processing time. This limitation prevents broader applications such as in vivo imaging as the motion of the objects during capture process is hard to circumvent and thus, deteriorates the image quality [19,20]. Recent works have proposed to reduce the measuring requirement by computationally exploiting spatial-temporal redundancy of the scenes [21,22]. Here, we review two types of computational microscopy that enable single-shot 3D fluorescence imaging. Light field microscopy (LFM). By introducing a microlens array on the primary image plane of a microscopy, LFM encodes 4D (spatial-angular) information into a 2D image and computationally recover the 3D objects [5,6]. Because LFM trades spatial resolution for single-shot capture, its spatial resolution is lower than that of the conventional microscope. Recently, a RL deconvolution method based on the wave optics theory [23] is derived and demonstrated to improve the resolution for LFM. So far, the best resolutions experimentally achieved by LFM are ∼ 1.4um and 2.6um [5] in the lateral and axial dimensions respectively. In addition, due to the variation on sampling density, the lateral resolution decreases to ∼ 3.75um at the focal plane [5].
Lensless 3D microscopy. Traditionally, lens-based cameras/microscopes map a point in the scene to a pixel on the detector. Lensless imaging architectures, instead, replace the lens with other encoding elements such that a point is mapped to many points on the detector and thus, require computation to recover. Several seminal literatures have proposed to place a single encoding element, such as a coded mask [7] or diffuser [9], directly in front of a detector. Therefore, the imaging system is compact and cost-effective, and has large FOV. A computational algorithm is then designed to make use of the physical effects, e.g. the Point Spread Functions (PSF), to recover the 3D information of the scene. However, the spatial resolution of lensless 3D microscopes is restricted to the pixel pitch of the detector. In particular, a lens-free fluorescent imaging platform was reported to achieve the spatial resolution of ∼ 10um based on a compressive sensing algorithm for sparse objects [8]. The recovered 3D volume consisted of two or three depth layers with intervals of 50um or 100um.

Computational multifocal microscopy
Our computational imaging framework, which we call computational multifocal microscopy (CMFM), balances and optimizes the processing capabilities of optics and computation. We demonstrate a CMFM system [ Fig.1(a)] that utilizes simplified optics, as well as algorithms that correct for CA and low-photon counts. The pipeline of our computational framework is shown in Fig.1(d-f).
We perform two experiments to demonstrate the effectiveness of our CMFM system. First, we capture a static image of several frozen periplasms in 3D to demonstrate the spatial resolutions of 0.35um and 0.5um in the lateral and axial dimensions [see Fig. 7(d)], which are comparable to those achievable with confocal deconvolution microscopy [see Fig. 7(b)]. Note that we compare 0.5s captures with our CMFM instrument to a 20s confocal scan taken with a dual spinning disk confocal microscope (Model: CSU-W1) made by Yokogawa Electric Corporation. Our CMFM results show similar 3D image quality, but achieve a 40x reduction in acquisition time. However, we kindly remind readers that the principle of the axial resolution improvement is different for the two techniques; Further discussion will appear in section 4.1. Second, we record a video of in vivo bacterium at 25 frames per second (see Visualization 1) and perform high resolution 3D reconstruction with tracking results (see Visualization 2) using our CMFM technique.

Image formation model
The image formation of our CMFM fluorescence system can be modeled as the axial integral of the 2D convolution of the PSF with its corresponding depth layer. In the noiseless case, such model can be written as: where g is the observed image, o is the 3D object we want to recover, and h is a z-stack 3D PSF. o z and h z are 2D slices of o and h at axial depth z, respectively. Here, we discretize o into N x × N y × N z voxels in three dimensions. Each voxel has size ∆ x × ∆ y × ∆ z . The 2D image g on the detector is sampled with M x × M y pixels. Each pixel has size ∆ x × ∆ y . For a band-limited system, the Nyquist sampling rate has to be satisfied in order to avoid aliasing. In MFM, the lateral cut-off frequency is f c = 2N A/λ, and therefore ∆ x ≤ 1/(2 f c ) and ∆ y ≤ 1/(2 f c ). For simplicity, we set ∆ x = ∆ x and ∆ y = ∆ y in practice. In order to match o and g dimensions, h is therefore of M x × M y × N z voxels, with each voxel size of ∆ x × ∆ y × ∆ z . Then the discrete form of Eq. (1) can be written as a matrix-vector multiplication form: where g is an M × 1 column vector, in which M = M x × M y , o is an N × 1 column vector, in which N = N x × N y × N z , and H is the sensing matrix of a size M × N. Note that each H z is a Toeplitz matrix representing 2D convolution, and can be constructed from h z . Note that, in our case, two types of noise are considered. First, our imaging process has Poisson noise due to the limited photon budget resulted from (1) the narrower bandpass filter, (2) the lower diffraction efficiency of the MFG and (3) splitting of light. Second, the background noise (room stray light, or the sample itself) is also considered. We assume a uniform background photon noise across all the pixels in the acquired MFM image. Therefore, the image formation model under the Poisson noise and additive background noise model can be expressed as where P represents Poisson statistics originated from signal photons, and b models the uniform background noise. Note that b is an M × 1 column vector and each entry of b is a same constant, denoted by b.

Joint regularized RL deconvolution
Equation (3) leads to the following likelihood function according to Poisson distribution where i stands for the pixel coordinate in g. In RL deconvolution, the optimal solution o * is found from the observation g by maximizing Eq. (4), or equivalently minimizing its negative logarithm, subject to all the voxels of the restored image have non-negative values, that is, where is the negative Poisson log-likelihood of p(g|o, b) in Eq. (4), in which a constant term log (g i !) is omitted.
In regularized RL algorithm, a regularizer term is added to the objective function in Eq. (5). Here, we consider total variation (TV) regularization because it can avoid the noise amplification problem in RL deconvolution by allowing to recover a smooth and stable solution with sharp edges. The objective function in the TV regularized RL algorithm is therefore written as where λ is the regularization parameter and TV(o) = N j=1 |∇o j |, in which j denotes the voxel coordinate in the o. Our joint algorithm estimates the 3D image, the background noise and the regularization parameter simultaneously by minimizing Φ in Eq. (7) subject to all the voxels of the restored 3D image have non-negative values, that is, The minimization of Φ(o, b, λ) in Eq. (8) with respect to all three unknown variables can be performed by an iterative alternating gradient descent method. More specifically, at each iteration, the three variables are updated sequentially, and when one variable is updated, the other two variables are fixed as constants.

Estimation of 3D image
To update o for (k + 1)-th iteration, we substitute o k , b k and λ k estimated from k-th iteration in Eq. (7), and take partial derivatives with respect to each voxel o k where the divisions are element wise, h i, j is the entry in the i-th row and the j-th column of H, and M i=1 h i, j = 1 if H is column normalized (which is equivalent to normalizing PSFs h z ), and div stands for the divergence operator. Given the partial derivative in Eq. (9), We update o using the gradient-based algorithm (or equivalently by using EM algorithm), defined by [24,25] Algorithm 1 Joint regularized RL algorithm for MFM 1: inputs: g and H 2D MFM data and 3D PSFs 2: initialize: o 0 , b 0 , and λ 0 3: set k = 0 4: while not convergence do 5: update o k+1 using o k , b k and λ k in Eq. (10) 6: update b k+1 using o k+1 , and b k in Eq. (12) 7: update λ k+1 using o k+1 , and b k+1 in Eq. (14) 8: where the second line in Eq. (10) is to enforce the non-negativity on j-th reconstructed voxel. Note that for 1 ≤ j ≤ N, all the voxels in o can be updated simultaneously by storing them in a vector.

Estimation of the uniform background
When we have o k+1 , we substitute o k+1 , b k and λ k in Eq. (7), and take partial derivatives with Note that each pixel in the captured MFM image is assumed to have the same background value b. Then we update b using the same gradient-based algorithm as is in 3D image estimation

Estimation of optimal regularization parameter
The approach of estimating optimal λ is based on the fact that when the optimal solution o * is found, the partial derivatives with respect to each restored voxel o k j in Eq. (9) should be equal or close to zero. So the λ is obtained by minimizing the sum of the square of Eq. (9) over all the reconstructed voxels [26] Because the objective function in Eq. (13) is a quadratic form of λ, the close form solution of λ can be found by equating the derivatives of Eq. (13) with respect to λ to zero The joint regularized RL deconvolution algorithm is summarized in Algorithm 1. We performed a series of simulations to evaluate the effectiveness and convergence of the joint regularized RL algorithm for our CMFM system. The simulation results and convergence plots are shown in Appendix A.

System analysis
A schematic of our CMFM imaging setup is shown in Fig. 1(a). A regular microscope is augmented by a 4 f relay system (L1 = 200mm and L2 = 400mm). The total magnification of the whole system is 120× with the use of 60× objective lens. An electron multiplying charge coupled device (EMCCD) has 1024 × 1024 pixels with pixel pitch 13um × 13um. Thus, each pixel corresponds to a size of 108nm × 108nm in the object domain, which satisfies the Nyquist sampling criterion. A multifocal gratings (MFG) is placed at the Fourier plane of the 4 f relay system. As an example, schematics of the MFG used in our first experiment is shown in Fig. 1(b). This MFG is optimized and designed to produce 3 × 3 differently focused tile images on a single 2D detector as shown in Fig. 1 Next, we provide analysis on the Point Spread Function (PSF), the chromatic aberration (CA), and the FOV of our CMFM system.

Point Spread Function (PSF) and resolution
We image a z-stack of a small fluorescent bead (about 170nm in diameter), which approximates an ideal light-emitting point, to measure PSF. Figure 2 (first row) shows CMFM lateral PSF imaged under five different axial positions (columns). Each lateral PSF consists of 3 × 3 tiles, with one tile in focus (outlined by a green box; also zoomed in second row) and eight tiles out of focus. From left to right, as the focus depth increases, we can observe a focal shift on the axial direction of the x-z plots of PSFs (third row). The Optical Transfer Functions (OTFs) are plotted as reference (bottom two rows).
We measure the full-width half-magnitude (FWHM) values to quantify the resolution. Note that the PSFs are affected by chromatic aberration and the lateral resolution is anisotropic. Therefore 55}um. The variation of the in-focus PSF sizes results from directional chromatic aberration. Based on the PCA results, each horizontal and vertical diffraction order tile has a resolution loss of 0.55um due to CA. Each diagonal diffraction order tile has a resolution loss of 0.85um due to geometry. We show, in the next sub-section, that this observation is consistent with the theoretical calculation.

Chromatic aberration (CA)
Due to the diffractive nature of the MFG, the CMFM system suffers from obvious CA effect, which causes the loss of the intensity and resolution. Mathematically, the lateral CA per tile can be expressed as [27]: where m and n are horizontal and vertical diffraction orders of the tile, f is the focal length of the second lens of the 4 f system, d x and d y are periods of the MFG in the x and y directions, and ∆λ is the bandpass spectrum of the emission collected by the detector. The axial CA for each tile, i.e., different wavelengths are focused at different distances from the lens, can also be expressed as [27]:  where λ c the wavelength used to design the MFG, and f m,n = (m + ln)∆z is the focusing distance of each individual tile.
From Eq. (15) and Eq. (16), we can see that for the central tile where m = n = 0, δ x = δ y = δ z = 0, which means the central tile's PSF and image are free of CA. However, for off-axis tile, both lateral and axial CA exist. Each horizontal and vertical diffraction oder tile has a dispersion of m f ∆λ/d x , which is equal to 0.59um when m = 1 for our CMFM system of f = 400mm, ∆λ = 10nm, d x = d y = 56um andM = 120. Each diagonal diffraction order tile has a dispersion of √ m 2 + n 2 f ∆λ/d x . In addition, the dispersion direction varies for different diffraction order tiles due to geometry. If left uncorrected, the CA would cause a loss of the the intensity and resolution in the chromatic dispersion direction. In principle, CA can be optically corrected by using CA corrective optics. However, the corrective optics increases system cost and complexity [4]. Here, we show that CA can be effectively compensated by computation. This is possible because the CA is directional due to geometry of the tile distribution in our CMFM system. For example, in Fig. 2, the eight non-centric tiles exhibit directional stretching towards the image center. At depth z = 0 (first column), the PSF is CA-free. On the other focal planes, from ∆z to 4∆z (second column to last column), even though the green boxes indicate in-focus tiles, the PSFs stretch in different directions at different diffraction orders. However, the stretching does not occur at the perpendicular direction of CA. This implies that although a certain tile is affected by CA on one direction, the perpendicular direction is less affected and can be used to compensate another tile which suffers CA at this direction. Therefore, by jointly utilizing all the tiles in the model, 3D deconvolution can preserve 2D spatial frequencies that are CA blur vs defocus blur. Note here that the blur resulted from CA is different than the defocus blur. Fig. 3 presents a comparison of the two types of blurs. In Fig. 3(a), the red box highlights the in-focus tile. This in-focus tile is not located at the image center and has CA blur in horizontal dimension. Meanwhile, the blue box highlights the tile that suffers both defocus blur and CA blur but in vertical dimension. However, in horizontal dimension, this out-of-focus tile only has defocus blur. An Optical Transfer Function (OTF) comparison is shown in Fig. 3(b). The defocus blur size is smaller than the CA blur size in horizontal direction, resulting in a wider OTF lobe, which implies that higher resolution reconstruction can be achieved by using all the information provided from the PSFs. On the right panel, we present numerical simulation results. The simulated resolution target [ Fig. 3(d)] contains bars pointing four directions. By using only the in-focus PSF, as shown in Fig. 3(f), only horizontal bars can be resolved, as the high frequency has only been preserved in vertical direction [ Fig. 3(b)]. On the other hand, by making full use of the PSFs, signals on different directions can be well-recovered [ Fig. 3(g)].  4. (a) Conventionally, a MFG is designed to project the diffraction pattern occupying the entire imaging space. (b) We propose to use a smaller tile spacing, so as to achieve a small lateral FOV that can be tracked over a large area for MFM tracking applications.

Field-of-View (FOV)
In conventional MFM designs [ Fig. 4(a)], the lateral FOV is the same as the lateral size of each tile image, which can be expressed in the sample domain as: where L x and L y are detector size in the x and y directions, l is the number of tiles in each dimension, andM is the magnification of the MFM. The axial FOV is limited by the focused imaging range of the designed MFG as From Eq. (17) and Eq. (18), we can see that by increasing the number of tiles l 2 , the axial FOV will increase but at the expense of the reduced lateral FOV for a fixed detector size. Thus, MFM trades its lateral FOV for the depth resolution. The maximum recovered 3D volume or tracking space in MFM is FOV x × FOV y × FOV z .

Enlarging lateral tracking space
In conventional MFM designs, large tile spacings are used to ensure that large objects can be imaged [see Fig. 4(a)]. However, this comes at the cost of a large reduction in lateral tracking area. This, however, is not optimal for the MFM 3D tracking applications where the objects of interest (bacteria or molecule, etc.) are quite small but move freely in a large 3D space. In single molecule tracking, larger tracking space is more desirable than the lateral FOV. By modifying our MFG design to produce a smaller tile spacing, we can achieve a small lateral FOV that can be tracked over a large area [see Fig. 4(b)]. The comparison of MFM image diagrams obtained by two different design methods are shown in Fig. 4. When optimizing the MFG for tracking applications, the dimensions of the trackable area in x and y dimensions, where s x and s y are tile spacing in x and y dimensions. Note that Eq. (19) is a general form of Eq. (17). From Eq. (19), we can clearly see that the lateral tracking area T x and T y can be enlarged by designing a smaller tile spacing s x and s y . Therefore, our method can achieve a larger lateral tracking area for MFM tracking applications.

Numerical validation
We In conventional MFM, since the horizontal position of the synthetic ellipsoid exceeds the tile spacing, the left two-column tiles produced by MFM which contains axial depth information from −12∆z to −3∆z are shifted out of the detector's FOV and therefore can not recorded by the detector [Fig. 5(e)]. However, in our new MFM design, the 5 × 5 tiles are recorded in their entirety by the detector [Fig. 5(f)]. For reconstruction, we recovered a 3D space on a grid of 1024 × 1024 × 71 with each grid size being 108nm × 108nm × 200nm for both methods. The reconstructed ellipsoid (cropped from the larger 3D recovery space) and its xz slice by different design methods are shown in Figs. 5(b) and (c), respectively. The 1D axial profiles of the reconstructed ellipsoids are also compared in Fig. 5(d) (black and blue lines). We can clearly see that since the left two-column tiles of the ellipsoid are missing in the conventional MFM measurement, the ellipsoid is reconstructed poorly while our design provides a good reconstruction. We also compare the signal loss as a function of lateral position of the tracked object in Figs. 5(g) and (h). Similar to vignetting effect, the signal falls off when approaching the edges. Our proposed design alleviates peripheral signal loss and preserves full-sized PSF array over an enlarged lateral space.

Experimental results
We present two experimental results with different biological samples to test the spatial and temporal performance of our system. As shown in Fig. 6, we divide the whole image into different sets of tiles by modifying the MFG pattern. However, the focal step ∆z is designed to be 0.25um for both cases. In Fig. 6(a), the image is divided into 3 × 3 tiles so that 9 focal planes are in focus, while in Fig. 6(b), 5 × 5 focal planes are obtained for the purpose of tracking.

Static scene: CMFM vs confocal microscopy
We prepare several periplasms (static) as our imaging sample for 3D spatial resolution characterization. The captured raw image is shown in Fig. 6(a). The exposure time is 0.5s. The 3D PSFs are measured by recording a z-stack of a small fluorescent bead with z focal step of 50nm. The 3D image are reconstructed on a 300 × 300 × 51 grid with each grid size being 108nm × 108nm × 50nm. Figure 7 shows our computational 3D reconstruction [ Fig. 7(d)] in comparison with the captured focal stack [ Fig. 7(c)] assembled by z-stacking 9 tiles from the 2D raw MFM image. For ground truth, we captured the 3D image of periplasms using a scanning confocal microscope [ Fig. 7(a)] and performed deconvolution of the confocal image [ Fig. 7(b)] by using a commercial Huygens software. For each 3D image, a XY slice at z = 0 (first row) and a YZ slice at x = −6.5um (third row) are displayed. The comparisons of vertical line profiles indicated by the red solid line in XY slice, and comparison of axial profiles indicated by the red dotted line in YZ slice are shown in the second and fourth rows, respectively. From Fig. 7(c), we can see that without computational reconstruction, the 3D image assembled by z-stacking nine tile images from 2D MFM measurement is degraded by noise, out-of-focus blur and low spatial resolution in all three dimensions. The outer membranes are blurred and the empty space between them cannot be recognized. In addition, due to the out-of-focus blur, the FWHM of the axial profile is about 2um. However, after deconvolution, the reconstructed 3D image [ Fig. 7(d)] is much cleaner, less out-of-focus blur, and more importantly, has high resolution in three dimensions. The empty space between outer membranes can be clearly recognized. The outer membranes which are vertically blurred to a single peak can be well separated by a dip of 84% between two peaks with the lateral FWHM of 0.35um [second row of Fig. 7(d)], which is consistent with FWHM of the measured PSF. Due to removal of out-of-focus blur, the axial FWHM is about 0.5um [fourth row of Fig. 7(d)], with about two times improvement over that of the raw MFM image. Note that we compare 0.5s captures with our CMFM instrument to a 20s confocal scan taken with a dual spinning disk confocal microscope (Model: CSU-W1) made by Yokogawa Electric Corporation. Our CMFM results show similar 3D image quality, but achieve a 40x reduction in acquisition time. The automatically recovered background noise and optimal regularizer parameter at each iteration of the joint RL-TV deconvolution process is shown in Appendix B Fig. 11.
Cautionary remarks. Note that although both our CMFM and confocal deconvolution microscopy achieve 0.5um axial resolution in the experiment, the principle of the axial resolution improvement is different for two techniques. Confocal microscopy increases axial resolution by means of using a (or multiple) spatial pinhole(s) to block out-of-focus light in image detection process. Therefore, the axial extent of confocal PSF is narrower than that in the widefield microscope, and thus a high contrast and resolution image can be obtained. However, for CMFM, the axial resolution improvement is based on sparsity-based super-resolution microscopy techniques [28][29][30] by utilizing the signal sparsity in the arbitrary known domain (i.e., gradient domain in our case). Recently, a new method called SPARCOM [31]: sparsity-based superresolution correlation microscopy by utilizing sparsity in the correlation domain, is also reported to achieve spatial resolution comparable to PALM and STORM.

3D tracking of moving bacterium
In the second experiment, we demonstrate the 3D video reconstruction of a moving bacterium (see Visualization 2). The MFG is modified to focus on 5 × 5 planes, as shown in Fig. 6(b). A video is captured at 25 frame per second (fps) (see Visualization 1). According to our first experimental reconstruction where the axial FWHM can achieve 0.5um, we reconstruct the 3D image of the bacterium on a grid of 150 × 150 × 41 with each grid size being 108nm × 108nm × 200nm, which corresponds to a FOV of 16.2um × 16.2um × 8um in 3D space for each video frame. Figures  8(a-e) show five out of fifty frames 3D reconstructions. Note that for the visualization purpose, we cropped the reconstructed 3D volume and just showed 4um × 6um × 4um region around the bacterium. By computing the center of mass for each frame reconstruction, a 3D trajectory of the moving bacterium can be tracked [ Fig. 8(f)]. The colorbar in Fig. 8(f) indicates the frame index over time. The automatically recovered background value and the optimal regularizer parameter for each video frame is shown in Appendix B Fig. 12.

Conclusion
We have presented computational multi-focus microscopy (CMFM), a framework that balances the imaging system design and computational processing to achieve single-shot 3D microscopy. Our optical design significantly reduces the system complexity and experimental alignment by discarding CA correction optics. We correct for CA computationally rather than optically. This comes from the fact that CA occurs in different directions at different diffraction order tiles. Therefore, by jointly utilizing all the tiles in the model, 3D deconvolution can preserve 2D spatial frequencies that are lost by CA. By incorporating TV regularization, our algorithm can not only compensate for CA, but also perform high quality 3D reconstructions from noisy data. We build on a joint regularized RL deconvolution algorithm and incorporate two types of noise. Notice that our proposed algorithm is free of any parameter tuning by automatically estimating the background noise and the optimal reularization parameter using alternating gradient descent. We experimentally demonstrate that the out-of-focus blur along z can be significantly suppressed and the axial resolution as high as 0.5um is achievable. The lateral resolution of 0.35um, which is consistent with the diffraction-limited resolution, is also experimentally demonstrated. A high resolution 3D video of a movable bacterium at 25fps is also computationally reconstructed to verify the proposed CMFM framework.
Finally, we propose a new design method of MFG to enlarge the lateral tracking area of MFM tracking applications without sacrificing its axial FOV and single-shot capture speed. The benefits of the simple system design and high resolution image recovery offered by the proposed CMFM will broaden its applications in 3D single-shot imaging.

Appendix A: algorithm evaluation
To verify the effectiveness of the joint regularized RL algorithm for our CMFM and also quantify the reconstruction quality, we performed a series of simulations by using an experimentally captured CMFM PSFs. The synthetic 3D image [ Fig. 9(top left)] was obtained from the confocal microscope image of the 3D bacterial that was imaged under the CMFM. The CMFM measurement image was generated based on image formation model of Eq. (3) and then degraded with background and Poisson noise. The maximum signal and background photon counts are set to be 50 and 5, respectively. The Poisson noise was then added to the measurement by using Matlab's Poisson random number generator.
To quantify the reconstruction quality, we computed peak signal-to-noise ratio (PSNR) and I-divergence [25] between ground truth image u and the reconstructed image v: where MAX u is the maximum pixel value of the image u, MSE is the mean square error (MSE) between two images u and v, and i stands for the voxel index. Figure 9(top right) shows the reconstruction of the standard RL deconvolution without TV constraint by setting λ = 0. Because TV prior is not used, the reconstruction can not preserve the smooth edge of the original image and contains the artifacts. To investigate the effect of the background noise on the CMFM reconstruction quality, we performed RL-TV deconvolution but with an incorrect background value of 10 (the truth value is 5). The reconstructed image is shown in Fig. 9(bottom left). Since the used background value is two times bigger than the real one, the recovered 3D image suffers from substantial signal loss. The reconstructed 3D image by joint RL-TV algorithm is shown in Fig. 9(bottom right). The PSNRs between the original image and reconstructed images by three methods are 34.2dB, 27.5dB, 40.2dB, and I-divergences are 204.3, 517, and 96.7, respectively. Clearly, the joint RL-TV algorithm, which simultaneously recovers 3D image, the background noise value and optimal regularizer parameter, gave the best Fig. 9. Simulations that demonstrate the capability of the joint RL-TV algorithm to simultaneously recover the 3D image, background noise and the optimal regularizer parameter for CMFM. Top left: a ground truth image. Top right: the standard RL deconvolution without TV regularizer (λ = 0). Bottom left: RL-TV deconvolution by using an incorrect background values (b = 10). Bottom right: joint RL-TV deconvolution can simultaneously recover a 3D image, background noise and the optimal regularizer parameter. The PSNRs for three methods are 34.2dB, 27.5dB, 40.2dB, and I-divergences are 204.3, 517, and 96.7, respectively. reconstruction quality of CMFM among three methods. Figures 10(a-d) show the reconstructed background value b, the optimal regularizer parameter λ, PSNR and I-divergence at each iteration of the deconvolution process. Note that both the background value and regularizer parameter were initialized to be 100, but they gradually converged to 4.99 and 2.2 × 10 −3 , respectively. From Fig. 10(c-d), we can also see that the image quality is dramatically improved during the iterative reconstruction process, proving the effectiveness of joint RL-TV algorithm for the 3D reconstruction of our snapshot CMFM system.
For the second experiment, we captured an MFM video of a moving bacterium at 25 frames per second (fps) using our CMFM techniques. The complete 3D video reconstruction from the first frame to the last frame is shown in Visualization 2. The automatically recovered background value and the optimal regularizer parameter for each video frame is shown in Fig. 12.

Appendix B: experimental results
We also tested the algorithm on two types of real data. The 3D reconstruction of static periplasms is shown in Fig. 7. The automatically recovered background noise and optimal regularizer parameter at each iteration of the joint RL-TV deconvolution process is shown in Fig. 11. Fig. 11. The recovered background values (left) and optimal regularizer parameter λ (right) at each iteration of the joint RL-TV deconvolution process for the first experiment.

Disclosures
The authors declare that there are no conflicts of interest related to this article.