Fourier light-field microscopy

Observing the various anatomical and functional information that spans many spatiotemporal scales with high resolution provides a deep understanding of the fundamentals of biological systems. Light-field microscopy (LFM) has recently emerged as a scanning-free, scalable method that allows for high-speed, volumetric imaging ranging from single-cell specimens to the mammalian brain. However, the prohibitive reconstruction artifacts and severe computational cost have thus far limited broader applications of LFM. To address the challenge, in this work, we report Fourier LFM (FLFM), a system that processes the light-field information through the Fourier domain. We established a complete theoretical and algorithmic framework that describes light propagation, image formation and system characterization of FLFM. Compared with conventional LFM, FLFM fundamentally mitigates the artifacts, allowing high-resolution imaging across a two- to three-fold extended depth. In addition, the system substantially reduces the reconstruction time by roughly two orders of magnitude. FLFM was validated by high-resolution, artifact-free imaging of various calibration and biological samples. Furthermore, we proposed a generic design principle for FLFM, as a highly scalable method to meet broader imaging needs across various spatial levels. We anticipate FLFM to be a particularly powerful tool for imaging diverse phenotypic and functional information, spanning broad molecular, cellular and tissue systems.


Introduction
Light-field microscopy (LFM) simultaneously captures both the 2D spatial and 2D angular information of the incident light, allowing computational reconstruction of the full 3D volume of a specimen from a single camera frame [1][2][3][4][5]. Unlike other fluorescence imaging techniques that accumulate spatial information in a sequential or scanning fashion, this 4D imaging scheme effectively liberates volume acquisition time from the spatial scales (e.g. the field of view (FOV) and spatial resolution), thus making LFM a highly scalable tool for high-speed, volumetric imaging of biological systems with low photodamage across several spatial levels [5][6][7][8][9]. Towards the tissue level, the latest LFM techniques have demonstrated promising results for functional brain imaging with a cellular-level spatial resolution of several micrometers across a depth of tens to hundreds of micrometers, and at a fast volume acquisition time on the order of 10 milliseconds [5][6][7][8]. At the other extreme, the method has recently been demonstrated for observing structures and dynamics in single-cell specimens with a near-diffraction-limited 3D spatial resolution, an imaging depth of several micrometers to cover a significant volume of single cells, and a volume acquisition time of milliseconds [9].
For conventional LFM, a microlens array (MLA) is placed at the native image plane (NIP) of a wide-field microscope, and the optical signal is recorded in an aliased manner across the microlenses at the back focal plane of the MLA [2][3][4]. Mitigating the tradeoff between the spatial and angular information, the recent development of wave-optics models allows for the reconstruction of densely aliased high-spatial frequencies through point-spread function (PSF) deconvolution [4,5]. However, to date, there are two major limitations that restrict LFM from broader applications. First, the sampling pattern of the spatial information is uneven for LFM. Especially near the NIP, it becomes redundant, resulting in prohibitive reconstruction artifacts [4]. Existing LFM techniques mainly circumvent the problem by imaging on one side of the focal plane [4,5,7] or engineering the image formation or the wavefront [10][11][12], which, however, compromise either the imaging depth or resolution. Meanwhile, the focused plenoptic scheme has also been reported, which removes the artifacts near the NIP but may impair the refocusing (or volumetric imaging) capability due to the restricted sampling geometry of the angular information [13]. Alternatively, a defocused optical design has been proposed to reject the artifact region away from the FOV, recovering the image quality across the NIP [9]. However, the problem remains fundamentally unresolved. Second, the volumetric reconstruction employs PSF deconvolution using the wave-optics model [4,5]. The PSF of conventional LFM is spatially-varying in both the lateral and axial dimensions, thus described by a 5D matrix that projects the 3D object domain onto the 2D camera plane [4]. This aggravates the computational cost, making the reconstruction considerably slow and impractical for rapid observation of the dynamic or functional data. 
While several algorithms have been reported to accelerate the process [6,7], their applications are largely limited to improving the post-processing of calcium imaging data.
Recently, Fourier imaging schemes have been proposed as a promising path to improve the current LFM techniques [14][15][16]. However, the initial works rely on the ray-optics integral model, thus compromising the image quality for high-resolution microscopy [14,15]. Another Fourier approach has lately been demonstrated for recording sparse neural activities in larval zebrafish with an extended FOV [16]. The work employed experimentally measured PSFs for reconstruction. However, the system lacks a complete algorithmic framework for modeling light propagation and has thus not been fully explored to gain optimum performance for diverse applications beyond calcium imaging. Hence, the advancement and generalization of the optical design and computational framework of LFM for artifact-free and fast light-field imaging and reconstruction of broad volumetric data are still highly desirable.
Here, we introduce Fourier light-field microscopy (FLFM) to achieve high-quality imaging and rapid light-field reconstruction. By recording the 4D light field in the Fourier domain (FD), the imaging scheme transforms LFM in two main ways. First, the FD system allocates the spatial and angular information of the incident light in a consistently aliased manner, effectively avoiding any artifacts due to redundancy. Second, because the FD scheme processes the signals in parallel, image formation can be depicted by a unified 3D PSF, reducing the computational cost by more than two orders of magnitude. The system was validated by high-resolution imaging and artifact-free reconstruction of various calibration and biological samples. In addition, we constructed a complete model for light propagation, image formation and reconstruction. Using this model, we established a generic design principle to facilitate further development of FLFM, as a highly scalable method to meet broader imaging needs across various spatial levels.

Experimental setup
We constructed FLFM on an epifluorescence microscope (Nikon Eclipse Ti-U) using a 40×, 0.95 NA objective lens (Nikon CFI Plan Apochromat Lambda 40×, 0.95 NA) (Fig. 1(a)). The sample stage was controlled by a nano-positioning system (Prior). The samples were excited with either a 647-nm or a 561-nm laser (MPB). The corresponding emitted fluorescence was collected using a dichroic mirror and an emission filter (T660lpxr (Chroma) and ET700/75 (Chroma), and T560lpxr (Chroma) and FF02-617/73 (Semrock), respectively). The imaging area on the NIP was adjusted by an iris (SM1D12, Thorlabs) to avoid overlapping light-field signals on the camera plane. A Fourier lens (FL, f_FL = 75 or 100 mm, Thorlabs) performed an optical Fourier transform of the image at the NIP, forming a 4f system with the tube lens (TL). A microlens array (MLA, S600-f28, RPC Photonics; f_MLA = 16.8 mm; pitch = 600 µm) was placed at the back focal plane of the FL, thus conjugated to the back pupil of the objective. The light field was imaged using a 1:1 relay lens (Nikon AF-S VR Micro-Nikkor 105 mm f/2.8G IF-ED) and recorded on a scientific complementary metal-oxide-semiconductor (sCMOS) camera (Andor Zyla 4.2 PLUS sCMOS) at the back focal plane of the MLA.

Image formation and model for system characterization
Image formation in the FLFM system is illustrated in the inset of Fig. 1(a) and described in detail in Appendices A-E. As seen, the light field from the image at the NIP is transformed to the FD by the FL, conjugated to the wavefront at the back pupil of the objective lens. The MLA segments the wavefront, thus transmitting the corresponding spatial frequencies (i.e. the angular information) after individual microlenses to form images in different regions on the camera. Using this scheme, the spatial and angular components can be recorded in an uncompromised and well-aliased manner. This advantage results in image formation free of redundancy near the NIP and uniform across the depth of focus (DOF), permitting high-quality volumetric reconstruction without artifacts.
We developed the theoretical framework that integrates the complete set of FLFM parameters and thus allows for model-based system characterization (Appendices B-E). The model links the input parameters of FLFM, such as the wavelength, NA, magnification, camera pixel and sensor size, and the focal length of the FL, with the system performance parameters, including the lateral (R_xy) and axial (R_z) resolution, the FOV, and the DOF. For example, for the setup described above, using the FL of f_FL = 75 mm, these system performance values can be obtained as R_xy = 2.12 µm, R_z = 4.70 µm, FOV = 67 µm and DOF = 64 µm. The performance indicates high resolution over a two- to three-fold extended depth, compared to conventional LFM using a similar objective [4,5]. The model and its results were validated using various numerical and experimental methods as demonstrated in Section 3 and extended to generate a generic FLFM design principle in Section 4.
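The characterization above can be cross-checked numerically with the closed-form relations collected in the appendices (the sub-aperture diffraction limit R_xy = Nλ/(2NA) and FOV = d_MLA f_FL/(f_MLA M)). A minimal sketch, assuming a standard 200-mm tube lens and a 680-nm emission wavelength:

```python
# System parameters of the present setup (f_tl = 200 mm is our assumption
# for the Nikon tube lens; emission wavelength taken as 680 nm)
lam, NA, M, f_tl = 680e-9, 0.95, 40, 200e-3
P, f_fl, f_mla, d_mla = 6.5e-6, 75e-3, 16.8e-3, 600e-6

f_eff   = (f_tl / M) * f_fl / f_tl   # objective focal length relayed to the MLA plane
D_pupil = 2 * NA * f_eff             # effective pupil diameter at the MLA
N       = D_pupil / d_mla            # occupancy ratio (microlenses across the pupil)
R_xy    = N * lam / (2 * NA)         # sub-aperture diffraction limit
M_T     = M * f_mla / f_fl           # total magnification of the FLFM system
S_r     = R_xy * M_T / P             # pixel sampling rate
FOV     = d_mla * f_fl / (f_mla * M) # field of view (see Appendix D)
print(R_xy, FOV, M_T)                # R_z and DOF follow from Appendices C and E
```

This reproduces the R_xy ≈ 2.12 µm and FOV ≈ 67 µm values quoted above.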

Model of light-field propagation
Projecting the 3D volume in the object domain to the 2D image space, the wavefunction at the NIP produced by the high-NA objective is predicted by the Debye theory as [17]:

U_i(x, p) = (M/(f_obj^2 λ^2)) exp(−iu/(4 sin^2(α/2))) ∫_0^α P(θ) exp(−iu sin^2(θ/2)/(2 sin^2(α/2))) J_0(v sin θ/sin α) sin θ dθ,   (1)

where f_obj is the focal length of the objective, and J_0 is the zeroth-order Bessel function of the first kind. The variables v and u represent normalized radial and axial coordinates, defined by v = k[(x_1/M − p_1)^2 + (x_2/M − p_2)^2]^{1/2} sin α and u = 4k p_3 sin^2(α/2); p = (p_1, p_2, p_3) ∈ R^3 is the position of a point source in a volume in the object domain; x = (x_1, x_2) ∈ R^2 represents the coordinates on the NIP; M is the magnification of the objective; the half-angle of the NA is α = sin^{−1}(NA/n); and the wavenumber k = 2πn/λ was calculated using the emission wavelength λ and the refractive index n of the immersion medium. For Abbe-sine-corrected objectives, the apodization function of the microscope P(θ) = cos^{1/2}(θ) was used. Next, the image at the NIP, U_i(x, p), is optically Fourier transformed onto the back focal plane of the FL, described as OFT[U_i(x, p)], which is then modulated by the MLA, described using the transmission function φ(x′), where x′ = (x′_1, x′_2) ∈ R^2 represents the coordinates on the MLA. Specifically, the aperture of a microlens can be described as an amplitude mask rect(x′/d_MLA), combined with a phase mask exp(−(ik/(2f_MLA))|x′|^2). The modulation induced by a microlens is then described as:

φ(x′) = rect(x′/d_MLA) exp(−(ik/(2f_MLA))|x′|^2),   (2)

where f_MLA is the focal length of the MLA, and d_MLA is the diameter of a single microlens (or the pitch of the MLA if the microlenses are tiled in a seamless manner). Thus, the modulation of the entire MLA, composed of periodic microlenses, can be described by convolving φ(x′) with a 2D comb function:

Φ(x′) = φ(x′) ⊗ comb(x′/d_MLA),   (3)

where ⊗ is the convolution operator. The light field propagating from the MLA to the camera can be modelled using the Fresnel propagation over a distance of f_MLA [18]:

h(x″, p) = F^{−1}{ F{ Φ(x′) · OFT[U_i(x, p)] } · exp(−iπλ f_MLA (f_x^2 + f_y^2)) },   (4)

Research Article, Vol. 27, No. 18 / 2 September 2019 / Optics Express 25577

where x″ = (x″_1, x″_2) ∈ R^2 represents the coordinates on the camera plane, the exponential term is the Fresnel transfer function, f_x and f_y are the spatial frequencies in the camera plane, and F{} and F^{−1}{} represent the Fourier transform and inverse Fourier transform operators, respectively. OFT{} is the optical Fourier transform performed by the FL. In practice, the Fresnel propagation over the distance of f_MLA is divided into multiple steps and calculated sequentially for computational accuracy. The final intensity image O(x″) at the camera plane is described by:

O(x″) = ∫ |h(x″, p)|^2 g(p) dp,   (5)

where h(x″, p) is the amplitude distribution at the camera given by the propagation above, and p ∈ R^3, as defined in Eq. (1), is the position in a volume containing isotropic emitters, whose combined intensities are distributed according to g(p).
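The propagation chain above (defocus wavefront at the pupil plane, MLA modulation, Fresnel propagation to the camera) can be sketched numerically. In this sketch the grid sizes, the paraxial quadratic defocus phase (standing in for the full Debye computation), and the square-lattice lenslet layout are simplifying assumptions, not the exact model used in the paper:

```python
import numpy as np

# Optics parameters follow the setup in this work; f_tl is assumed
lam   = 680e-9          # emission wavelength (m)
NA    = 0.95
M     = 40
f_tl  = 200e-3          # tube lens focal length (m), assumed standard
f_obj = f_tl / M
f_fl  = 75e-3           # Fourier lens focal length
f_mla = 16.8e-3         # microlens focal length
d_mla = 600e-6          # microlens pitch

n_pix, dx = 1024, 6e-6  # sampling grid at the MLA (pupil) plane
x = (np.arange(n_pix) - n_pix // 2) * dx
X, Y = np.meshgrid(x, x)
R = np.hypot(X, Y)

# Effective pupil at the MLA plane (objective pupil relayed by TL and FL)
f_eff   = f_obj * f_fl / f_tl
D_pupil = 2 * NA * f_eff
pupil   = (R <= D_pupil / 2).astype(float)

# Paraxial defocus phase for an on-axis emitter at depth z
k = 2 * np.pi / lam
z = 5e-6
defocus = np.exp(1j * k * z * (R / f_eff) ** 2 / 2)

# MLA modulation: periodic lenslet phase (square-lattice approximation)
cx = np.mod(X + d_mla / 2, d_mla) - d_mla / 2
cy = np.mod(Y + d_mla / 2, d_mla) - d_mla / 2
mla = np.exp(-1j * k / (2 * f_mla) * (cx ** 2 + cy ** 2))

# Fresnel propagation over f_MLA via the transfer-function method
U_mla = pupil * defocus * mla
fx = np.fft.fftfreq(n_pix, dx)
FX, FY = np.meshgrid(fx, fx)
H = np.exp(-1j * np.pi * lam * f_mla * (FX ** 2 + FY ** 2))
U_cam = np.fft.ifft2(np.fft.fft2(U_mla) * H)

psf = np.abs(U_cam) ** 2   # one z-layer of the 3D FLFM intensity PSF
print(psf.shape)
```

Sweeping z over the axial range of interest yields the full 3D PSF stack used for reconstruction.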
Using this model, the light-field propagation in FLFM at varying depths was demonstrated (Figs. 1(b) and 1(c)) and compared with the experimental data acquired by recording 200-nm fluorescent beads (T7280, ThermoFisher) at the same depths (Fig. 1(d)). As seen, the numerical and experimental data agreed well across the entire DOF and exhibited several unique features of the system. First, as in conventional telecentric microscopes, the lateral translation of an emitter produces no parallax in the images formed by the individual microlenses. Second, unlike the orthographic views produced by conventional microscopes, different axial depths lead to variations of the wavefront (i.e. composite spatial frequencies) at the back pupil. This axial information is coupled to the lateral displacement and recorded in a radially asymmetric manner (except for the microlens centered at the optical axis). Therefore, such parallel recording of signals in the FD allows describing the imaging system using a unified 3D PSF (e.g. Figs. 1(b)-1(d)), rather than a five-dimensional matrix as in the conventional, spatially-varying LFM system. This provides a substantial computational benefit in the reconstruction process. Furthermore, the model is fully compatible with various modules for phase variations (e.g. PSF engineering [10], aberrations, customized MLAs, etc.) to accurately describe the actual performance of the FLFM system. In addition, such a complete framework of light propagation enables the reconstruction of volumetric data where experimentally measured PSFs may not be readily available [16], such as in intact, dynamic, or time-evolving biological samples.

Reconstruction algorithm
Mathematically, the reconstruction is an inverse problem: recovering the radiant intensity at each point in the 3D volume, denoted by g, from the camera image O. They satisfy the relationship O = Hg, where the measurement matrix H is determined by the PSF, and its elements h_{j,k} represent the projection of the light arriving at camera pixel O(j) from the kth voxel g(k) of the discretized object space. To obtain g, the inverse problem is solved by the iterative update:

g^{(k+1)} = diag(H^T Hg^{(k)})^{−1} diag(H^T O) g^{(k)},   (6)

where the operator diag{} forms a diagonal matrix from a vector. This expression is a modified deconvolution algorithm based on the Richardson-Lucy iteration scheme [19,20]. In our case, the sampling intervals for the reconstruction of the object space are ∆_xy = 0.943 µm and ∆_z = 0.5 µm. For visualization, we additionally interpolated 4×4 pixels into each reconstructed pixel in x and y. The 3D deconvolution conducts the forward projection (Hg^{(k)}) and backward projections (H^T O and H^T Hg^{(k)}) iteratively between the 3D object space and the 2D camera plane. Furthermore, because of the spatially-invariant nature of the system, the 3D PSF for reconstruction can be simplified to PSF(x″, z) = |h(x″, p)|^2, where the PSF is described by an emitter located on the optical axis, i.e. p = (0, 0, z). As a result, the forward projection can be computed as a sum of 2D convolutions, layer by layer, over an axial range [z_0, z_1]:

Hg^{(k)} = Σ_{z=z_0}^{z_1} PSF(x″, z) ⊗ g^{(k)}(z),

where g^{(k)}(z) is the single layer of the 3D object located at depth z. The back projection can similarly be given as:

H^T O = Σ_{z=z_0}^{z_1} PSF′(x″, z) ⊗ O,

where PSF′(x″, z) is obtained by rotating PSF(x″, z) by 180 degrees. As seen, the simplified computational scheme using the unified 3D PSF effectively reduces the dimensions of the deconvolution algorithm, allowing for a substantial reduction of the reconstruction time by orders of magnitude.
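A minimal numerical sketch of this layer-wise scheme follows. It assumes FFT-based circular convolutions with PSF layers stored in FFT order (origin at pixel (0, 0)); boundary handling and the sub-pixel alignment of the flipped PSF are glossed over, so this is an illustration of the update rule rather than the paper's exact implementation:

```python
import numpy as np
from numpy.fft import fft2, ifft2

def forward(psf_stack, vol):
    """Hg: sum over depth of 2D convolutions of each PSF layer with
    the matching volume layer (circular convolution via FFT)."""
    img = np.zeros(psf_stack.shape[:2])
    for z in range(psf_stack.shape[2]):
        img += np.real(ifft2(fft2(psf_stack[:, :, z]) * fft2(vol[:, :, z])))
    return img

def backward(psf_stack, img):
    """H^T O: per-depth convolution with the 180-degree-rotated PSF."""
    vol = np.empty_like(psf_stack)
    for z in range(psf_stack.shape[2]):
        psf_rot = psf_stack[::-1, ::-1, z]
        vol[:, :, z] = np.real(ifft2(fft2(psf_rot) * fft2(img)))
    return vol

def deconvolve(img, psf_stack, n_iter=30, eps=1e-12):
    """Multiplicative update g <- g * (H^T O) / (H^T H g)."""
    g = np.ones(psf_stack.shape)
    ht_img = backward(psf_stack, img)
    for _ in range(n_iter):
        g *= ht_img / (backward(psf_stack, forward(psf_stack, g)) + eps)
        g = np.clip(g, 0, None)
    return g
```

Because every depth uses the same small set of 2D FFTs, the cost per iteration scales with the number of z-layers rather than with a full 5D measurement matrix.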
Furthermore, owing to the reliable alignment of the system and the consistency between the numerical and experimental PSFs, no image registration or matching was performed in the reconstruction process; such steps, however, could be employed to further improve the image quality. An algorithm flow chart is provided in Appendix F (Fig. 14), and the code is available from the corresponding author upon request.

Imaging calibration structures
Using the 647-nm laser, we first imaged sub-diffraction-limit 200-nm dark-red fluorescent beads (T7280, ThermoFisher) and measured the 3D reconstructed images at varying depths (Fig. 2(a)). The full-width at half-maximum (FWHM) values of these reconstructed images at each depth were ∼2 µm and ∼4-5 µm in the lateral and axial dimensions, respectively, consistent with the theoretical values of 2.12 µm and 4.70 µm (Section 2.2, Appendices B and C), respectively. Furthermore, two beads separated by 2.90 µm can be resolved using FLFM, in good agreement with the FWHM values (Fig. 2(b)). It should be noted that the wave-optics model of FLFM allows for high-resolution reconstruction through PSF deconvolution. In contrast, the conventional ray-optics-based integral model cannot provide sufficient resolution [14,15] (Fig. 2(b)).
Next, we imaged surface-stained, 6-µm fluorescent microspheres (F14807, ThermoFisher), whose hollow structure was clearly resolved using FLFM (top row, Fig. 2(c)). The corresponding lateral and axial cross-sectional profiles exhibited FWHM values of the stained surface of 1-2 µm and 2-4 µm in the lateral and axial dimensions, respectively, consistent with the theoretical values and the experimental measurements in Fig. 2(a). Notably, the structure of the microsphere revealed by FLFM agreed well with our numerical results (Appendix G). We further analyzed in Appendix G the corresponding resolving capability of FLFM and the influence of any intrinsic cross-talk among the overlapping 3D information, which can be readily mitigated by adjusting f_FL, the focal length of the FL.
For comparison, we reconstructed the same structure using the integral imaging model; the structure was, however, poorly resolved due to the limited 3D resolution (middle row, Fig. 2(c)). Furthermore, we imaged the microsphere using conventional LFM, i.e. by placing the MLA (MLA-S125-f30, RPC Photonics; f_MLA = 3.75 mm; pitch = 125 µm) at the NIP (bottom row, Fig. 2(c)). In this case, prohibitive artifacts were observed near the NIP, extending over a significant range of the DOF. The result showed substantially degraded volumetric imaging capability, failing to display the structure in the lateral dimension and producing erroneously reconstructed patterns in the axial dimension.
Using FLFM, we also imaged and reconstructed 200-nm fluorescent beads distributed in 3D space (Fig. 2(d)). Four emitters located at depths of −28 µm, −22 µm, −9 µm and −5 µm were reconstructed from one camera frame, without the axial scanning required by wide-field microscopy. Lastly, we tracked the movement of 200-nm fluorescent beads suspended in water at various volume acquisition times of 10 ms (Fig. 2(e) and Visualization 1), 40 ms (Visualization 2) and 100 ms (Visualization 3). The 3D positions and trajectories of the particles were determined by localizing the reconstructed particles using Gaussian fitting with nanometer-level precision in all three dimensions. The measurements of the reconstructed particles are consistent with the values in Fig. 2(a), showing no compromise in spatial precision as the volume acquisition rate is varied.
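The Gaussian-fitting localization step can be sketched along one dimension as follows; the function names and the 1D simplification are ours (the actual pipeline fits the reconstructed 3D intensity distributions in all three dimensions):

```python
import numpy as np
from scipy.optimize import curve_fit

def gaussian_1d(x, amp, mu, sigma, offset):
    """1D Gaussian with a constant background."""
    return amp * np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) + offset

def localize_1d(profile, coords):
    """Fit a 1D Gaussian to an intensity profile and return the
    sub-pixel center position."""
    p0 = [profile.max() - profile.min(),   # amplitude guess
          coords[np.argmax(profile)],      # center guess: brightest pixel
          (coords[-1] - coords[0]) / 8,    # width guess
          profile.min()]                   # background guess
    popt, _ = curve_fit(gaussian_1d, coords, profile, p0=p0)
    return popt[1]
```

Applying this fit to the x, y and z profiles of each reconstructed bead yields center estimates well below the reconstruction voxel size, which is what enables the sub-voxel tracking precision described above.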

Imaging biological samples
We then demonstrated FLFM by imaging mixed pollen grains stained with hematoxylin and phloxine B (304264, Carolina Biological Supply) using the 561-nm laser (Fig. 3). As seen, the raw data behind the individual 3 × 3 microlenses reveal different perspective views of the sample (Fig. 3(a)). The system captured a FOV >67 × 67 µm and allows reconstruction across a DOF >50 µm, enclosing the entire pollen grain, at a volume acquisition time of 0.1 s. The full 4D light-field information recorded in a single camera frame allows synthesizing the focal stacks of the full volume of the specimen (Figs. 3(b) and 3(c)). Remarkably, FLFM recovered structures that were out of focus and poorly sectioned by wide-field microscopy (e.g. insets in Fig. 3(c)). The high spatial resolution allows us to visualize the fine spine structures of the pollen, which exhibited FWHM values of ∼1-2 µm in width and ∼5 µm in length and were separated by as little as a few micrometers in all three dimensions (Figs. 3(c) and 3(d)). It should be mentioned that these measured system parameters are consistent with our theoretical predictions in Section 2.2, and that denoising or thresholding strategies should further reduce the crosstalk between the axial stacks caused by the limited axial resolution and the overlapping 3D data during projection onto the 2D camera sensor. Next, we imaged a mouse kidney tissue slice (F24630, ThermoFisher) using the 561-nm laser, where the filamentous actins, prevalent in glomeruli and the brush border, were stained with Alexa Fluor 568 phalloidin (Fig. 4 and Visualization 4). FLFM recorded the 4D light-field information of the proximal tubule brush border (Fig. 4(a)) and allows reconstruction of the entire thickness of the tissue slice (∼20 µm) from one camera frame (Fig. 4(b)). In the reconstructed image, the pattern of proximal tubules and brush borders can be well visualized in all three dimensions.
It is also noticed that the apical domain of the brush border of the tubules exhibited brighter staining, known to arise from the denser distribution of actin bundles. Compared to wide-field microscopy, FLFM is capable of capturing the volumetric signals from a single camera frame and recovering synthesized axial stacks. The high resolution and sectioning capability of FLFM allow visualizing finer 3D structural variations (Figs. 4(c) and 4(d)). Tubular structures of a few micrometers across the DOF can be well resolved in all three dimensions (Figs. 4(c)-4(f)). Lastly, it should be noted that for all the above imaging tasks, we used standard GPU computation on a desktop computer (Intel CPU i7-6850K (3.60 GHz), 128 GB RAM, NVIDIA GeForce GTX 1080 Ti (default/turbo clock 1480.5/1582 MHz)). For example, conventional LFM typically takes an average of 31 s per deconvolution iteration to reconstruct a volume of 67 × 67 × 30 µm with a voxel size of 0.943 × 0.943 × 0.5 µm from a single camera frame of 210 × 210 pixels. In comparison, FLFM takes an average of 0.34 s per iteration for an image of the same size, an improvement in computational cost of roughly two orders of magnitude (both methods use a similar total of 10-100 iterations for reconstruction). The reconstruction speed can be further accelerated using cluster computing and advanced deconvolution algorithms.
The scheme thereby paves the way toward overcoming the rate-limiting computational step for live imaging and rapid, video-rate reconstruction.

General design principle of FLFM
In this work, we have established the complete theoretical and algorithmic framework underlying light propagation, image formation and system characterization of FLFM. Here, we further distilled this framework into a general design principle for FLFM, essential for advancing this highly scalable approach to meet broader imaging needs across various spatial levels.
As seen in Fig. 1(a), as well as in the top panel of Table 1, the FLFM system is composed of two main modules: the original wide-field imaging module including the camera, and the light-field module including the FL and the MLA. Given a standard wide-field system, designing an FLFM system essentially amounts to identifying the parameters of the light-field module. First, we summarized the complete set of FLFM parameters, which can be organized into three categories: 1) the input parameters of the wide-field imaging module, including the emission wavelength λ, the numerical aperture NA and the magnification M of the objective, the camera pixel size P, and the physical size of the sensor D_cam; 2) the performance parameters of the intended FLFM imaging capability, including the lateral and axial resolution R_xy and R_z, the pixel sampling rate S_r (i.e. the ratio between R_xy and the effective pixel size P/M_T, where M_T is the magnification of the FLFM system), the FOV, and the DOF; 3) the design parameters describing the light-field imaging module, including the focal length of the Fourier lens f_FL, the diameter of the microlens d_MLA, the occupancy ratio N (i.e. the ratio between the effective pupil size at the MLA, D_pupil, and d_MLA), the focal length of the microlens f_MLA, the pitch of the MLA d_1, and the distance d_max from the outmost microlens covered by the illumination beam to the center of the MLA. It should be mentioned that we here considered hexagonal MLAs (e.g. the right of the top panel of Table 1) for the best radial symmetry and dense packing to gain optimum light-field reconstruction, although the strategy is applicable to various MLA patterns. The goal of the design is thus to identify the design parameters in the third category.
Next, we derived theoretical relationships between these parameters as an inverse problem based on the model of system characterization developed in Section 2.2. As listed in Table 1 and derived in Appendix H, the design parameters can be determined from the input and performance parameters using these relationships. Specifically, the input parameters are usually fixed by the wide-field imaging system, while the developers can decide on the desired performance parameters of FLFM. It should be noted that the identification of the proper performance and design parameters can be iterative due to their interdependence (e.g. the lateral and axial resolution R_xy and R_z are intrinsically connected with d_MLA and d_max). In practice, only three independent performance parameters, such as R_xy, S_r and the FOV, are initially required in order to determine all the design parameters and the remaining performance parameters. The developers can then adjust the initial values and iterate the process to obtain an optimum combination of elements that satisfies the imaging need.
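Our reading of these relationships can be collected into a small solver that maps the three chosen performance parameters (R_xy, S_r, FOV) plus the input parameters to the design parameters. The closed-form relations below are inferred from the model in this work and should be checked against Table 1 and Appendix H:

```python
def flfm_design(lam, NA, M, P, R_xy, S_r, FOV):
    """Map input parameters (lam, NA, M, P) and three chosen performance
    parameters (R_xy, S_r, FOV) to the light-field design parameters.
    All quantities in SI units (m)."""
    M_T = S_r * P / R_xy            # total magnification, from S_r = R_xy / (P / M_T)
    d_mla = FOV * M_T               # from FOV = d_MLA / M_T
    N = 2 * NA * R_xy / lam         # occupancy ratio, from R_xy = N * lam / (2 NA)
    D_pupil = N * d_mla             # effective pupil size at the MLA
    f_fl = D_pupil * M / (2 * NA)   # pupil relay: D_pupil = 2 NA f_FL / M
    f_mla = M_T * f_fl / M          # from M_T = M * f_MLA / f_FL
    return dict(M_T=M_T, d_mla=d_mla, N=N, D_pupil=D_pupil,
                f_fl=f_fl, f_mla=f_mla)

# Reproduce (approximately) the system reported in this work
d = flfm_design(lam=680e-9, NA=0.95, M=40, P=6.5e-6,
                R_xy=2.12e-6, S_r=2.92, FOV=66.96e-6)
print(d)
```

Feeding in the measured performance of the present setup (R_xy = 2.12 µm, S_r ≈ 2.92, FOV ≈ 67 µm) recovers f_FL ≈ 75 mm, f_MLA ≈ 16.8 mm and d_MLA ≈ 600 µm, consistent with the components actually used.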
As an illustration, we can customize the performance of the FLFM system reported in this work using this design protocol. First, the input parameters are given as λ = 680 nm, NA = 0.95, M = 40, P = 6.5 µm, and D_cam = 13.3 mm. Next, our desired performance, for instance, is a 3D resolution of ∼2-3 µm, a large FOV >500 µm, and a DOF >50 µm for a specific imaging need. We first set D_pupil equal to D_cam = 13.3 mm for maximum use of the camera sensor. Then, using the model, we can generate various combinations of the performance parameters that meet the requirement; one such set is R_xy = 2.14 µm and R_z = 2.79 µm.

Conclusion
In summary, we have developed FLFM to achieve high-quality light-field imaging and rapid volumetric reconstruction. Imaging the light field in the FD fundamentally mitigates the prohibitive artifacts, providing a two- to three-fold larger DOF than the previous LFM design. The design also substantially reduces the computational cost, by more than two orders of magnitude. We have established the theoretical and algorithmic models to describe the system and demonstrated high-resolution, artifact-free imaging of various calibration and biological samples. Furthermore, the work provided a generic design principle for constructing and customizing FLFM to address broad imaging needs at various spatial scales.
The study has overcome the two major limitations in the current LFM methods and may advance new imaging physics and applications for LFM and MLA-facilitated microscopy. The advancements in both imaging capability and computation speed promise further development toward high-resolution, volumetric data acquisition, analysis and observation at the video rate and ultimately in real time. Combining molecular specificity, great scalability, engineered MLAs and advanced computing, we anticipate FLFM to be a particularly powerful tool for imaging diverse phenotypic and functional information, spanning broad molecular, cellular, and tissue systems.

Appendix A: Image formation in traditional LFM and FLFM
Here, we demonstrated both the numerical and experimental data for light propagation in traditional LFM and FLFM (Figs. 5 and 6). We also provided the corresponding image formation for traditional LFM (Fig. 7). As seen, in contrast to the result in Fig. 2(a), severe reconstruction artifacts can be observed for traditional LFM near the NIP.

Appendix C: Axial resolution.
We utilized theoretical and numerical models to analyze the axial resolution R_z in this section. As shown in Fig. 10(a), we considered two patterns on the camera plane, generated respectively from two emitters located at different axial positions on the optical axis. We reasoned that if the elemental images of these two patterns behind the outmost microlenses, labeled (−1, −1), (−1, 1), (1, −1), and (1, 1) (top right panel, Fig. 10(a)), can be resolved (i.e. separated by the diffraction limit λ/(2NA_ML) of each microlens), the axial positions of the two emitters can be resolved through the reconstruction process. We assume that the PSF experiences negligible change within the axial range of interest, i.e. it maintains the width λ/(2NA_ML). In this sense, the axial resolution can be calculated from ray optics directly. In our case, based on the paraxial approximation, it is given as R_z = λ f′_obj^2/(d_MLA d_max), where f′_obj = f_obj f_FL/f_TL is the effective focal length of the objective relayed to the MLA plane, and d_max is the distance from the outmost (±1, ±1) microlenses to the optical axis. Plugging in the system parameters, we obtain the axial resolution R_z = 4.7 µm. We assessed the theoretical axial resolution by simulating the PSF using our wave-optics model for light propagation as described in Section 2.3 of the main text (Figs. 10(b) and 10(c)). We first simulated light propagation in the 3D space (Fig. 10(b)), where the lateral displacement of the PSF generated by the (−1, 1) microlens as a function of the axial position can be identified by Gaussian fitting of the beam profile (Fig. 10(c)). The fitted curve exhibited a linear relationship with a slope of p = 0.2763, indicating that a one-pixel change in the z direction leads to a shift of 0.2763 pixels in the x-y plane. In practice, as mentioned in Section 2.4, the pixel size in the x-y plane in object space is ∆_xy = 0.943 µm, and the pixel size in z is ∆_z = 0.5 µm. The axial resolution can thus be determined as R_z = R_xy/(p × ∆_xy) × ∆_z = 4.1 µm, consistent with the above theoretical value of 4.7 µm.
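Both estimates can be checked in a few lines; the 200-mm tube lens and the d_max = √2 · d_MLA geometry for the (±1, ±1) microlenses are our assumptions from the derivation above:

```python
import math

# Theoretical estimate: R_z = lam * f_eff^2 / (d_MLA * d_max), with f_eff
# the objective focal length relayed to the MLA plane (f_tl = 200 mm assumed)
# and d_max = sqrt(2) * d_MLA for the diagonal (+-1, +-1) microlenses
lam, f_tl, M, f_fl, d_mla = 680e-9, 200e-3, 40, 75e-3, 600e-6
f_eff = (f_tl / M) * f_fl / f_tl
d_max = math.sqrt(2) * d_mla
R_z_theory = lam * f_eff ** 2 / (d_mla * d_max)

# Slope-based estimate from the simulated PSF displacement (Section 2.4 sampling)
R_xy, p_slope, dxy, dz = 2.12e-6, 0.2763, 0.943e-6, 0.5e-6
R_z_slope = R_xy / (p_slope * dxy) * dz

print(R_z_theory * 1e6, R_z_slope * 1e6)
```

The two estimates come out near 4.7 µm and 4.1 µm respectively, matching the values quoted above.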
Next, we performed another numerical study to verify the axial resolution (Fig. 11). We studied two axially separated point emitters at (4 µm and 0 µm), (2 µm and −2 µm), and (0 µm and −4 µm), respectively. In Fig. 11, the left panel shows the raw light-field images generated by the two point emitters at different depths; the middle panel shows the 3D reconstructed emitters and their perspective and axial views; the right panel shows the profiles of the emitters along the dashed lines in the axial images. As seen, these emitters separated by 4 µm at varying depths can be resolved. In practice, due to noise and other system imperfections, the axial resolution in our experiments was measured to be close to 5 µm, still consistent with the theoretical value of 4.7 µm. Like the lateral resolution, the axial resolution also varies along the axial dimension, as shown in Fig. 9. As seen, the axial FWHM values increase from ∼5 µm to 10 µm as the emitter is moved from the focal plane to 20 µm away from it.

Appendix D: Field of view (FOV).
The FOV around the focal position in the object space can be identified from the model shown in Fig. 12. As seen, the FOV is determined by the image area behind each individual microlens. Therefore, the FOV depends on the size and focal length of the microlens and the focal length of the FL, given as FOV = d_MLA × (f_FL / f_MLA) × (1/M). Using our system parameters, the value is given as FOV = 66.96 µm. In addition, the FOV can be customized by adjusting the FL and MLA while maintaining the same spatial resolution. Furthermore, the FOV can be expressed as a function of S_r; this suggests a type of FLFM design that does not compromise the FOV by engineering S_r. This condition can be fulfilled in most high-resolution wide-field fluorescence microscopy systems. Lastly, in our FLFM system, we noticed that the FOV moderately decreases in a linear fashion as the sample is moved away from the focal plane.

Appendix E: Depth of field (DOF).

First, the light-field information on the camera becomes out of focus and its intensity decreases to a threshold value as the emitter moves away from the focal plane along the axial direction. Therefore, the DOF can be calculated as the range of detectable intensity, considering the diffraction effect in the axial dimension. As shown in Figs. 13(a) and 13(b), this depth can be obtained from the FWHM value of the PSF behind the central microlens on the optical axis along the axial direction. Theoretically, the axial FWHM value of the PSF of the central microlens is 2S_r/NA² = 32 µm. Numerically, the measurement using our wave-optics model yielded an FWHM value of 33 µm (Fig. 13(b)), in good agreement with the theoretical result. In this work, because the reconstruction relies on deconvolution, which is known to recover the diffracted information outside of the Rayleigh range of the axial PSF, we used the full width of the axial PSF (i.e. 2 times the FWHM value in the axial dimension) to describe the DOF.
Therefore, both our theoretical and numerical models resulted in a DOF of approximately 64 µm.
Second, when the object moves away from the focal plane, the light-field pattern spreads out or shrinks in the lateral dimension. The change in pattern size is thus limited by the pitch of the MLA (Fig. 13(c)). This limit is given as DOF = d_MLA × (f_FL / f_MLA) × (1/tan θ) × (1/M²), where tan θ = d_MLA / f_FL, yielding a value of 209 µm in our experimental system. Beyond this value, the light-field pattern of an emitter is not fully captured by one microlens and thus generates cross talk into the neighboring microlenses. Here this value is much greater than the diffraction-limited DOF of 64 µm.
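Substituting tan θ = d_MLA/f_FL into the geometric expression simplifies it to DOF_geo = f_FL² / (f_MLA × M²). A short sketch of the formula follows; the example parameter values are hypothetical placeholders, not the parameters of our system.

```python
def dof_geometric_um(d_mla_mm, f_fl_mm, f_mla_mm, M):
    """Geometric DOF limit: d_MLA*(f_FL/f_MLA)*(1/tan(theta))*(1/M^2),
    with tan(theta) = d_MLA/f_FL; lengths in mm, result in micrometers."""
    tan_theta = d_mla_mm / f_fl_mm
    return d_mla_mm * (f_fl_mm / f_mla_mm) / tan_theta / M**2 * 1e3

# Hypothetical example values (NOT this work's system parameters):
print(round(dof_geometric_um(1.0, 75.0, 25.0, 30.0), 1))  # -> 250.0
```

Note that d_MLA cancels, so the geometric limit depends only on f_FL, f_MLA and the magnification M.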
These two factors set the restriction on the DOF of FLFM; in this work, we used the minimum of the two values to determine the final DOF (i.e. 64 µm).
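The selection of the final DOF from the two limits (using the values from the text: a 32 µm theoretical axial FWHM, hence a 64 µm diffraction-limited full width, and a 209 µm geometric limit) can be sketched as:

```python
def flfm_dof_um(axial_fwhm_um, dof_geometric_um):
    # Diffraction-limited DOF = 2x the axial FWHM of the central-lens PSF
    # (deconvolution recovers information beyond the Rayleigh range);
    # the final DOF is capped by the geometric (MLA-pitch) limit.
    return min(2.0 * axial_fwhm_um, dof_geometric_um)

print(flfm_dof_um(32.0, 209.0))  # -> 64.0
```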
Appendix F: Algorithm flow chart.

Appendix G: Analysis for reconstruction of the surface-stained structure.
In this section, we analyzed the structure of the reconstructed surface-stained microsphere observed in Fig. 2(c) of the main text. We simulated such a structure (diameter = 20 µm) and assessed the reconstruction results using different axial sampling step sizes (Fig. 15). As progressively denser sampling slices were taken from Figs. 15(a) to 15(c), the reconstructed microsphere exhibited a stretched axial pattern, and the rugby-ball-like profile was clearly observed in Fig. 15(c), similar to our experimental observation in the top row of Fig. 2(c). The stretch is attributed to cross talk between the axial slices as they become denser: more volumetric information is projected onto the limited 2D imaging space on the sensor, and the resulting interference of the light field distorts the original structure of the object during the deconvolution process.
This problem can be readily resolved by adjusting the FL and the MLA with different parameters to improve the resolution of the system. Here, we numerically reconstructed the continuous microsphere (diameter = 20 µm) with a different FL (i.e. changed from f_FL = 75 mm to 50 mm), in which case the resolution is improved based on the analysis given above. This improvement mitigated the cross talk of the light field, recovering the original structure after reconstruction (Fig. 16).

Fig. 15. (a) Top row, from left to right: the structure sampled with 5 slices, the raw light-field image, the reconstructed image of the surface-stained structure, and cross-sectional images in x-z of the sampled structure, the reconstructed image and their overlay, showing the difference between the original and reconstructed structures due to the limited sampling step size and overlapping spatial information. (b) Same images as in (a) but with 11 slices. (c) Top row: same images as in (a) but with infinite slices (or, to be exact, a sufficiently small step size), showing an increasingly continuous structure and an enhanced stretching effect in the axial dimension (e.g. the 3rd image from the left in the top row of (c)) compared with (a) and (b). The bottom row of (c) demonstrates the agreement of the lateral patterns in x-y on the focal plane between the raw (bottom left), reconstructed (bottom middle) and overlaid (bottom right) structures. The numerical results are consistent with our experimental observation in Fig. 2(c). Scale bars: 20 µm (c, left on the top row), 5 µm (c, right on the top row and the bottom row).