Snapshot volumetric imaging with engineered point-spread functions

The biological world involves intracellular and intercellular interactions that occur at high speed, at multiple scales and in three dimensions. Acquiring 3D images, however, typically requires a compromise in either spatial or temporal resolution compared to 2D imaging. Conventional 2D fluorescence imaging provides high spatial resolution but requires plane-by-plane imaging of volumes. Conversely, snapshot methods such as light-field microscopy allow video-rate imaging, but at the cost of spatial resolution. Here we introduce 3D engineered point-spread function microscopy (3D-EPM), enabling snapshot imaging of real-world 3D extended biological structures while retaining the native resolution of the microscope in space and time. Our new computational recovery strategy is the key to volumetrically reconstructing arbitrary 3D structures from the information encapsulated in 2D raw EPM images. We validate our technique on both point-like and extended samples, and demonstrate its power by imaging the intracellular motion of chloroplasts undergoing cyclosis in a sample of Egeria densa. Our technique represents a generalised computational methodology for 3D image recovery which is readily adapted to a diverse range of existing microscopy platforms and engineered point-spread functions. We therefore expect it to find broad applicability in the study of rapid biological dynamics in 3D.


Introduction
Video-rate volumetric imaging is a key challenge in fluorescence microscopy. Optically probing the underlying dynamics of fast biological processes such as neuronal activity, the beating hearts of small animals or the flow of blood requires fast imaging techniques able to resolve image features in 3D. Conventional approaches to volumetric imaging, such as confocal or light-sheet microscopy, rely on scanning to sequentially build a 3D image from a series of focal slices [1,2]. However, the time needed to scan limits the temporal resolution of these methods, rendering them ineffective for applications where the observed dynamics occur over shorter timescales. For these purposes, there is a vital need for scan-free methods that are able to reconstruct volumes from snapshot images.
The challenge of imaging an entire volume in a single snapshot is twofold. Firstly, the narrow depth-of-field (DoF) inherent to conventional imaging systems means that often only a few microns of the sample depth can be imaged in-focus with sufficiently high resolution at any one time, compared with typical requirements to image throughout volumes with depths of tens or even hundreds of microns. Secondly, since we wish to reconstruct a 3D volume from 2D camera images, active steps must first be taken to encode depth information into the acquired data. For snapshot volumetric imaging, therefore, we require an imaging system that features an extended DoF while simultaneously encoding depth information into the 2D data.
Several strategies exist to encode and extract depth information from snapshot microscopy data. These include multiplane imaging, light-field microscopy, and pupil-plane engineering techniques. However, each of these methods suffers from significant drawbacks that have prevented its widespread use in fast volumetric imaging.
Perhaps the most straightforward strategy is multiplane imaging. Here, several focal planes spaced throughout the sample depth are imaged simultaneously, either onto different cameras [3], or onto the same camera using diffractive optical elements [4-7], spatial light modulators [8] or beamsplitters [9,10]. However, these methods typically suffer from severe anisotropy in voxel size: because only a small number of focal planes is imaged, the information available along the optical axis is limited. They also suffer from reduced image contrast, since planes imaged in-focus in one sensor region contribute out-of-focus light to others unless this is actively suppressed with image deconvolution [10].
Alternatively, light-field microscopy (LFM) has gained popularity for fast biological imaging applications [11-15]. By inserting a microlens array (MLA) into the imaging path, usually at the native image plane, the light-field microscope simultaneously samples spatial and angular information across an extended DoF in a highly aliased manner, permitting volumetric reconstruction to be performed post-acquisition via light-field deconvolution [16]. However, reconstructed volumes suffer from characteristic artefacts and significant resolution loss, owing to a non-uniform transfer of information across the sample depth, with the native focal plane in particular suffering from poor reconstruction quality [16]. Although more complicated optical designs have been proposed to tackle some of these issues [15,17-20], fundamentally the depth of field and optical sectioning capabilities of LFM are provided at the expense of reduced lateral resolution [16]. This limits LFM to cellular-level imaging [13,14] or sub-cellular imaging in a restricted depth range close to a coverslip with ultra-high numerical aperture (NA) oil-immersion lenses [18], thus hampering the wider study of biological dynamics.
An alternative line of attack is pupil-plane engineering, used extensively for 3D point-localisation applications [21]. Here, the PSF of the microscope is modified such that its appearance changes with depth over an extended axial range. Example PSFs include the Double-Helix, Tetrapod, single-Airy or Twin-Airy [22-25]. Raw images recorded with engineered PSFs are then subject to position-recovery algorithms to localise sparse, discrete point-emitters in 3D. While recent multi-fitting algorithms and deep-learning based approaches have offered incremental improvements in the acceptable emitter density [26-28], all these methods are fundamentally constructed around the concept of localising individual point-emitters. Complementary-kernel-matching (CKM) [29] can recover a patch-based range-map of an observed scene from a pair of simultaneously acquired images, each using an extended-depth-of-field PSF. However, the reported strategy was fundamentally limited to measuring the relief of a textured 2D surface, and was incapable of reconstructing images of extended three-dimensional structures. The challenge of imaging truly extended structures in a snapshot has remained unsolved until now.
Here we report a fundamentally different approach that, for the first time, permits PSF-engineering methods to be used for snapshot volumetric 3D imaging of truly extended fluorescent structure, with potential uses spanning a diverse range of biological imaging applications wherever 3D imaging with high temporal resolution is required. Our method, 3D-EPM, retains the benefits of imaging with engineered PSFs, namely an extended DoF and near-diffraction-limited resolution, while solving the problem of extracting the 3D information encoded into one or more snapshot raw images of an observed scene consisting of extended and diffuse 3D biological features. Our strategy can use any PSF that uniquely encodes information about the depth of a feature in the imaging volume: the PSF should exhibit an appearance that changes with depth and be asymmetric about the focal plane, while maintaining a high-valued modulation transfer function when the PSF is defocused.
In keeping with the optical methods presented in [24,29], we chose in this work to image using a cubic PSF, projected onto two cameras that are focused at marginally different sample depths.
The two independent imaging channels are essential for 3D-resolved imaging with the cubic PSF, since a single cubic PSF does not effectively encode depth information into acquired images. Indeed, the extended-DoF property of the cubic PSF can be interpreted as acting to minimise the variation of the PSF with depth: its appearance remains largely unchanged throughout the DoF, despite exhibiting a depth-dependent lateral translation. However, by fusing together information from the two independent imaging channels, depth information becomes encoded through disparity between images, since the depth-dependent translation differs between the two cameras [29]. Figure 1(a) illustrates the optical setup used to implement this two-camera system, and Fig. 1(b) compares a conventional microscope PSF (left) to the PSF formed at the two cameras by the optical setup described here (center and right). Figure 1(c) displays the superimposed PSF from both imaging channels, illustrating the depth-dependent appearance of the overall system PSF. In the remainder of this work we will refer to this combined two-camera system PSF as the differential-Airy PSF. We note, however, that other PSFs suited for use with our method include the twin-Airy or Tetrapod [23,25], which encode depth information into a single frame and would therefore permit our method to be implemented using only a single camera. In our two-camera differential-Airy setup we found a differential defocus of approximately 25% of the extended DoF to be optimal for enabling effective volume reconstruction. A differential defocus that is too small results in too little disparity between images, preventing depth information being encoded into them and therefore inhibiting reconstruction performance; whereas one that is too large results in sample features being imaged within the DoF of one imaging channel but outwith the DoF of the other, again limiting the depth information content and inhibiting reconstruction performance.

Fig. 1. Differential-Airy PSF permits depth-encoding over an extended DoF. (a) Experimental setup: epi-illumination is delivered through the imaging objective. A cubic phase-mask is inserted at the re-imaged back pupil-plane. Images are projected onto two cameras simultaneously with a differential defocus introduced with the relay lenses. (b) Normalised xy slices from a simulated 0.5NA diffraction-limited PSF (left) and the two cubic PSFs, offset by −5 µm (center) and +5 µm (right), that form the differential-Airy PSF over an axial range of 30 µm. (c) The two cubic PSFs from (b) superimposed over the same axial range, with one displayed in green and the other in magenta, illustrating how the two PSFs respond differently to defocus, thereby encoding depth information into the system. (d) Phase-retrieval improves contrast in experimentally-measured PSFs. Left: zx MIP of an experimentally measured 0.5NA cubic PSF. Right: the regenerated PSF after phase-retrieval and Zernike decomposition are used to estimate the pupil function. Plot shows traces along the lines shown. Scale bar is 10 µm.
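The behaviour described above can be illustrated with a simple Fourier-optics simulation: a cubic phase mask is applied in the pupil, and the two channels observe the same sample defocus offset by a differential amount. The sketch below is illustrative only; the mask strength `alpha`, the pupil sampling and the defocus values are assumed parameters, not those of our experimental system.

```python
import numpy as np

def cubic_psf(defocus_waves, alpha=3.0, n=256, na_frac=0.8):
    """Simulate an incoherent PSF for a cubic phase mask at a given defocus.

    alpha: cubic-phase strength in waves at the pupil edge (illustrative).
    defocus_waves: peak defocus aberration in waves (illustrative).
    """
    u = np.linspace(-1, 1, n)
    U, V = np.meshgrid(u, u)
    rho2 = U**2 + V**2
    pupil_mask = rho2 <= na_frac**2                  # circular aperture
    phase = 2 * np.pi * (alpha * (U**3 + V**3)       # cubic mask
                         + defocus_waves * rho2)     # defocus aberration
    pupil = pupil_mask * np.exp(1j * phase)
    field = np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(pupil)))
    psf = np.abs(field)**2
    return psf / psf.sum()

# Two channels focused at marginally different depths: the same sample plane
# is seen with a differential defocus offset between the cameras.
offset = 1.0                                         # differential defocus (waves)
psf_cam1 = cubic_psf(defocus_waves=0.0)
psf_cam2 = cubic_psf(defocus_waves=offset)
```

Comparing `psf_cam1` and `psf_cam2` shows how a common sample defocus produces differing PSF appearances in the two channels; this disparity is what encodes depth.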
In the sections that follow, we first provide a theoretical basis for our reconstruction process and an overview of our experimental methods. We then validate our method experimentally, comparing a reconstructed volume of fluorescent beads to a light-sheet image of the same sample to quantify our performance. Finally, we demonstrate our method experimentally and in simulation across a range of samples, both static and dynamic, and explore its limitations.

Volumetric reconstruction
Here we introduce our strategy for volumetric reconstruction of a 3D volume from snapshot images. A discrete 2D image measurement I of a 3D volume O can be represented as the result of a transfer matrix H operating on the volume, with image formation described by I = HO. While the nature and dimensions of H prevent direct inversion for recovery of O, the use of an xy spatially-invariant PSF introduces the sub-structure

H = [ H_1 \; H_2 \; \cdots \; H_{N_Z} ],

where N_Z is the number of z planes in the reconstructed volume and each sub-matrix H_n has a Toeplitz structure, the effect of which is to perform a 2D convolution. With this assumption, and by considering the dominant noise sources encountered in fluorescence imaging, image formation can instead be modelled, more compactly, as the series of convolutions given by

I(x, y) = \mathrm{Pois}\!\left( \sum_{k=1}^{N_Z} O(x, y, z = k) \circledast \mathrm{PSF}(x, y, z = k) \right) + N, \quad (1)

where k is the z-plane index, PSF is the 3D PSF stack, N is a noise term describing background and readout sources, Pois represents the Poisson distribution that describes photon shot-noise, and ⊛ denotes convolution. Our reconstruction strategy attempts to invert this model of image formation to yield an estimate Ô of the true object O via a modified form of iterative Richardson-Lucy (RL) deconvolution [30,31], which we have constructed to map between a 3D volume and M separate 2D images:

\hat{O}_{n+1}(x, y, z) = \frac{\hat{O}_n(x, y, z)}{1 - \lambda_{\mathrm{TV}} R(x, y, z)} \cdot \frac{1}{M} \sum_{m=1}^{M} B^m_n(x, y, z), \quad (2)

with

B^m_{n, z=k}(x, y) = \frac{1}{C^m_{z=k}(x, y)} \left[ \frac{I_m(x, y)}{\sum_{k'} \hat{O}_n(x, y, z = k') \circledast \mathrm{PSF}_m(x, y, z = k')} \right] \circledast \mathrm{PSF}_m(-x, -y, z = k), \quad (3)

where n is the iteration index, m is the view index (in imaging modes where multiple views of the sample are required to encode depth information, such as when imaging with the differential-Airy PSF), B is the back-projected volume, C is a corrective mask applied to mitigate edge artefacts and R is a total-variation regularisation term, given by

R(x, y, z) = \mathrm{div}\!\left( \frac{\nabla \hat{O}_n(x, y, z)}{\left| \nabla \hat{O}_n(x, y, z) \right|} \right), \quad (4)

where div denotes divergence, ∇ is the vector differential operator and λ_TV is a small constant [32]. The regularisation term is applied to prevent the over-fitting of noise components before convergence is reached, since the solution generally converges faster laterally than it does axially.
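A minimal sketch of the multi-view Richardson-Lucy update is given below, implemented with FFT-based circular 2D convolutions. For clarity it omits the total-variation regularisation term R and the edge-corrective mask C, and all array sizes and parameter values are illustrative; it is a simplified sketch, not the full pipeline described here.

```python
import numpy as np
from numpy.fft import rfft2, irfft2

def conv2(a, b):
    """Circular 2D convolution of equally-sized real arrays via FFT."""
    return irfft2(rfft2(a) * rfft2(b), s=a.shape)

def corr2(a, b):
    """Circular 2D correlation (the adjoint of conv2)."""
    return irfft2(rfft2(a) * np.conj(rfft2(b)), s=a.shape)

def forward(volume, psf_stack):
    """Noise-free forward model: sum of per-z-plane 2D convolutions."""
    return sum(conv2(volume[k], psf_stack[k]) for k in range(volume.shape[0]))

def rl_multiview(images, psf_stacks, n_iter=50, eps=1e-9):
    """Richardson-Lucy update mapping M 2D views onto one 3D volume estimate.

    images:     list of M raw 2D camera images
    psf_stacks: list of M 3D PSF stacks, each of shape (Nz, Ny, Nx)
    """
    nz, ny, nx = psf_stacks[0].shape
    est = np.ones((nz, ny, nx))                      # flat initial estimate
    for _ in range(n_iter):
        back = np.zeros_like(est)                    # back-projected volume B
        for img, psf in zip(images, psf_stacks):
            ratio = img / (forward(est, psf) + eps)  # measured / predicted
            for k in range(nz):                      # back-project to each z-plane
                back[k] += corr2(ratio, psf[k])
        est *= back / len(images)                    # averaged multiplicative update
    return est
```

In the full method the update is additionally damped by the total-variation term and corrected by the edge mask before the multiplicative step.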
The corrective mask C is applied to allow for the fact that some light recorded by the camera may have originated outwith the confines of the volume O, owing to the extended nature of the Airy PSF; equally, some fluorescence emitted from inside O may not be confined to the lateral extent of I. Further description of C is provided in Supplement 1. Equation (3) combines the forward and backward operators of the RL deconvolution, which are both implemented as series of 2D convolutions, while Eq. (2) represents a general result that can be extended to imaging modalities where an arbitrary number of views of the sample are obtained simultaneously. Equation (2) is the key innovation in our reconstruction process, but the complete pipeline requires careful consideration of several other issues. For instance, effective deconvolution depends on good-quality, high-SNR PSF measurements that reflect system aberrations. However, experimentally-measured 3D PSFs are subject to noise corruption from multiple sources and may also suffer from motion during acquisition. To mitigate these effects, we refine our experimentally-measured PSFs with computational post-processing, using phase retrieval to estimate the pupil function, which is then low-pass filtered in the Zernike basis. We then re-simulate our PSFs using the filtered pupil function, leaving a smooth, noise-free PSF that captures system aberrations and maximises deconvolution performance. This procedure is summarised in Fig. 1(d), which illustrates a PSF before and after this refinement process, and is detailed fully in Supplement 1, Fig. S1. Other key considerations in our reconstruction pipeline are the registration of the cameras and background subtraction. We summarise the pipeline in Table 1, and complete details of each step can be found in Supplement 1.

Table 1. Calibration and imaging pipeline. Prior to imaging, an xy affine transformation between the cameras is calculated for registration. A system PSF is then measured by imaging a sub-diffraction fluorescent bead. The PSF is then refined with the phase-retrieval procedure outlined in Section 2. Experimental data are then recorded. The edge corrective factors are calculated and the data flat-field corrected, before finally being deconvolved.

1. Calculate affine transform between cameras
2. Measure system PSF by imaging a sub-diffraction fluorescent bead
3. Process and refine PSF with phase-retrieval
4. Record experimental data
5. Calculate edge corrective factors and flat-field correct data
6. Deconvolve data
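The PSF-refinement step can be sketched as a Gerchberg-Saxton-style iteration that estimates the pupil function from a measured PSF, followed by smoothing of the recovered pupil before re-simulating a clean PSF. In this sketch a Gaussian filter stands in for the Zernike-basis low-pass filtering used in practice, and the single-plane iteration is a simplification of the full procedure detailed in Supplement 1; all parameter values are illustrative.

```python
import numpy as np
from numpy.fft import fft2, ifft2, fftshift, ifftshift
from scipy.ndimage import gaussian_filter

def refine_psf(measured_psf, n_iter=100, smooth_sigma=3.0):
    """Estimate a pupil function from a measured PSF, smooth it, and
    re-simulate a clean PSF (a simplified stand-in for Zernike filtering)."""
    amplitude = np.sqrt(np.clip(measured_psf, 0, None))  # field magnitude at camera
    pupil = np.ones_like(measured_psf, dtype=complex)    # initial flat pupil
    for _ in range(n_iter):
        field = fftshift(fft2(ifftshift(pupil)))         # pupil -> image plane
        field = amplitude * np.exp(1j * np.angle(field)) # impose measured magnitude
        pupil = fftshift(ifft2(ifftshift(field)))        # image -> pupil plane
    # Smooth the recovered pupil to suppress noise (Zernike low-pass stand-in).
    phase = gaussian_filter(np.angle(pupil), smooth_sigma)
    mag = gaussian_filter(np.abs(pupil), smooth_sigma)
    clean_field = fftshift(fft2(ifftshift(mag * np.exp(1j * phase))))
    clean_psf = np.abs(clean_field)**2
    return clean_psf / clean_psf.sum()
```

The returned PSF is smooth and noise-free while retaining the aberration content captured by the recovered pupil, which is what the deconvolution requires.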

Point-source validation
We first demonstrate the ability of our 3D-EPM method to extract depth information from 2D images acquired with differential-Airy PSFs. We acquired snapshot images of a volume of fluorescent beads (Bangs Laboratories, mean diameter 0.19 µm, Dragon Green, excited at 488 nm, emission spectrum peaking at approximately 520 nm, suspended in a 0.5% agarose gel) at 0.8NA, and then applied our deconvolution scheme to reconstruct a 3D volume. Figures 2(a) and 2(b) show zy and zx maximum-intensity projections (MIPs) through the reconstructed volume after 200 deconvolution iterations, as well as traces through a single reconstructed bead, where we see full-widths at half-maximum (FWHM) of 0.16 µm laterally and 0.88 µm axially. We note that the quoted FWHMs do not directly equate to the resolution of our microscope, since deconvolution reduces the size of single points in reconstructed images but does not necessarily improve the overall resolution provided by the system. To explore the resolution provided by our method, we simulated imaging a pair of closely-located sub-diffraction point sources, separated first laterally and then axially, before reconstructing the sample volume. The distance between the points was increased until they were clearly resolved in the reconstruction. We found that at 0.8NA, a separation distance of 0.31 µm laterally or 1.6 µm axially was required to reconstruct both points; see Fig. 2(e) and Supplement 1, Fig. S3. At separation distances beneath these values, the implicit sparsity prior of Richardson-Lucy deconvolution causes the solution to converge to a single point; see Supplement 1, Fig. S3 for further illustration.
To validate the reconstruction experimentally, we then imaged the same volume via sequential planar imaging using light-sheet excitation on the same microscope. The light sheet was delivered via a 0.3NA objective oriented orthogonally to the detection objective, with an estimated full-width-half-maximum of 2.6 µm. We extracted the 3D positions of beads in both the light sheet image and reconstructed volume using the Trackpy python library [33]. Matching beads were identified with a custom nearest-neighbour matching algorithm, detailed in Supplement 1. Figures 2(c) and 2(d) show zy and zx MIPs through the light-sheet image, with markers overlaid to indicate the positions of identified beads. Beads identified in the reconstruction are marked with circles and those identified in the light-sheet image are marked with crosses.
Matched beads are colour coded. Unmatched beads in the light-sheet image and reconstruction are marked with green squares and red pentagons respectively. Of the beads identified in the reconstruction, 93% were matched to a bead in the light-sheet image, and among matched beads we saw a root-mean-square (RMS) localisation error of under 0.11 µm in xy and under 0.16 µm in z, confirming high fidelity between the light-sheet image and the reconstructed volume. Full details of the matching algorithm and further performance analysis are provided in Supplement 1.
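Our matching algorithm is detailed in Supplement 1; a plausible minimal sketch of this kind of mutual nearest-neighbour matching between two 3D bead-position sets is shown below. This is not our exact implementation, and the function names and the `max_dist` threshold are illustrative assumptions.

```python
import numpy as np
from scipy.spatial import cKDTree

def match_beads(coords_a, coords_b, max_dist=0.5):
    """Mutual nearest-neighbour matching of two 3D point sets (units: µm).

    Returns (i, j) index pairs where coords_a[i] and coords_b[j] are each
    other's nearest neighbour and lie within max_dist of one another.
    """
    tree_a, tree_b = cKDTree(coords_a), cKDTree(coords_b)
    _, nn_ab = tree_b.query(coords_a)   # nearest b for each a
    _, nn_ba = tree_a.query(coords_b)   # nearest a for each b
    pairs = []
    for i, j in enumerate(nn_ab):
        if nn_ba[j] == i and np.linalg.norm(coords_a[i] - coords_b[j]) <= max_dist:
            pairs.append((i, j))
    return pairs

def rms_error(coords_a, coords_b, pairs):
    """Per-axis RMS separation of matched pairs."""
    d = np.array([coords_a[i] - coords_b[j] for i, j in pairs])
    return np.sqrt((d**2).mean(axis=0))
```

Requiring the match to be mutual avoids spurious pairings when bead densities differ between the two volumes.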

Snapshot volumetric imaging of extended structures
Next, we demonstrate the ability of our method to reconstruct extended structure in 3D by imaging, in a snapshot, the fibrous structure of a sample of lens tissue. The sample was prepared by soaking the tissue in a solution of fluorescein salts and leaving it to dry, such that it would fluoresce under 488 nm illumination. We imaged the lens tissue with a custom-built upright microscope adapted to exhibit a differential-Airy PSF, based around a 10X 0.3NA objective (Nikon) with a cubic phase-mask placed at the re-imaged pupil plane. At this magnification, the tissue fibres appear as thin filaments with extended regions of fluorescence emission. The extension to the DoF provided by the phase-mask allowed the entire sample depth to be imaged in-focus, so the raw camera images were not significantly degraded by out-of-focus light. Figures 3(a) and 3(b) depict the raw snapshot images. Note that the differential defocus introduces a small translation of image features between images (insets, Figs. 3(a) and (b)), and this is how depth information is encoded. Figure 3(c) depicts an xy summed-intensity projection of the reconstructed volume after 500 deconvolution iterations, colour-coded for depth, illustrating the intertwined 3D structure of the tissue fibres spanning a depth of 100 µm. Figure 3(d) displays xy planes from the region marked by the red box in Fig. 3(a) at different depths within the reconstructed volume, with arrow markers indicating image features that are most prominent at certain depths. Figures 3(e) and (f) display zy and zx MIPs through the reconstructed volume, clearly showing the fibrous structure of the sample. In order to visualize the full dynamic range of the reconstructed image, we applied a gamma intensity correction of γ = 0.75 prior to display in Figs. 3(c-f).
To further explore the ability to reconstruct extended structure, as well as the dynamic capabilities of our method, we simulated imaging a volume in which filaments were growing during acquisition. Simulated differential-Airy images of the volume at 60 time-points were generated according to Eq. (1), with both Poisson and Gaussian noise added. From each pair of differential-Airy images a volume was reconstructed; see Visualization 1 for time-lapse videos of the ground-truth volumes, the raw snapshot images and the reconstructed volumes. Simulation parameters are detailed in Supplement 1. In the reconstructed volumes the 3D structure of the filaments is clearly resolved, and the simulated dynamics are clear in all three dimensions, demonstrating how our method is well suited to observing biological processes in 3D.
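Generating such simulated raw data amounts to evaluating the forward model of Eq. (1) and adding shot and readout noise. A minimal sketch follows; the photon count, background level and read-noise values are illustrative assumptions, not the parameters used for the simulations reported here (those are given in Supplement 1).

```python
import numpy as np
from numpy.fft import rfft2, irfft2

rng = np.random.default_rng(0)

def simulate_raw_image(volume, psf_stack, photons=1000.0,
                       background=5.0, read_noise=2.0):
    """Simulate one noisy 2D camera image of a 3D volume per Eq. (1)."""
    clean = np.zeros(volume.shape[1:])
    for k in range(volume.shape[0]):        # sum of per-plane 2D convolutions
        clean += irfft2(rfft2(volume[k]) * rfft2(psf_stack[k]),
                        s=volume.shape[1:])
    clean = np.clip(clean, 0, None) * photons + background
    noisy = rng.poisson(clean)              # photon shot noise (Pois term)
    noisy = noisy + rng.normal(0, read_noise, noisy.shape)  # readout noise (N term)
    return noisy
```

Running this once per view and per time-point, with the volume updated between time-points, yields a simulated differential-Airy time-lapse.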

Video-rate volumetric imaging of biological dynamics
Finally, we apply our method to real dynamic biological scenes in which the sample undergoes continuous motion during acquisition, highlighting the value of being able to perform volumetric imaging in a snapshot. We imaged the 3D motion of chloroplasts undergoing cytoplasmic streaming in a leaf cut from a sample of Egeria densa. The sample was imaged via a custom upright microscope based around a 40X 0.8NA water-immersion objective (Nikon) for 80 s at a frame rate of 5 Hz. Auto-fluorescence of chlorophyll was excited in the leaf using laser illumination at 488 nm, with fluorescence emission detected through a 630 nm emission filter. From each time-point an 83.2 × 83.2 × 21.1 µm volume was reconstructed over 300 deconvolution iterations; see Visualization 2 for a 3D visualization of the reconstructed time-lapse data, in which the 3D motion of chloroplasts is apparent. Figure 4(a) depicts a 3D visualization of the reconstruction at the t = 0 time-point, and Fig. 4(b) displays an xy summed-intensity projection of the same volume, colour coded for depth, illustrating how the organelles are distributed through the 3D volume. Figure 4(c) displays the xy region marked by the white box in Fig. 4(b) at a depth of 8.8 µm, where the chloroplasts are positioned differently at various time-points, demonstrating the time-evolution of the sample. Some underlying structure of the chloroplasts, each approximately 3-4 µm in diameter, is visible in the reconstructions, demonstrating the high resolution provided by our method. The same region is shown in Fig. 4(d) at various depths, demonstrating the optical sectioning provided by the reconstruction. Figure 4(e) displays estimated motion tracks of chloroplasts in the region outlined by the white box in Fig. 4(a).

Discussion
The factor limiting temporal resolution in conventional volumetric imaging methods is the time needed for acquisition: in confocal and light-sheet imaging this is dictated by the need to scan, whereas in single-molecule localisation microscopy (SMLM) it is the need for sparsity in the image domain. SMLM techniques such as photo-activated localisation microscopy (PALM) [34] and stochastic optical reconstruction microscopy (STORM) [35] have become invaluable tools for imaging cellular function and structure; however, the need to identify individual emitters in images, imposed by the fitting algorithms, requires that only a sparse subset of fluorophores is active during each acquisition. This means that full reconstructions require many individual camera acquisitions to localise a sufficient number of activated fluorophores. Reconstructed SMLM images built from tens of thousands of individual frames are not uncommon, which fundamentally limits the temporal resolution of these methods. By removing the need for either (i) imaging individual isolated point sources, or (ii) scanning to obtain depth information, our method overcomes these limitations. Additionally, our method provides a resolution equal to the full resolution of the parent imaging objective, offering significant advantages over LFM, which sacrifices lateral resolution to allow 3D imaging.
Because our deconvolution is performed with volume-calibrated PSFs, the apparent blur resulting from the cubic phase-aberration can be successfully reversed across the full volume depth, offering further advantages over LFM. In this work we have demonstrated our method by imaging with a differential-Airy PSF. While our method is, in principle, not limited to any particular pupil phase-function, we chose a PSF based on the Airy PSF as it provides several desirable features: (1) an extension to the DoF through its diffraction-free propagation; (2) an absence of zeros in its Fourier-domain representation, or modulation-transfer function (MTF) [36], ensuring complete information transfer; and (3) a parabolic xy translation of the PSF that varies with defocus [37]. The combination of the depth-dependent translation and the differential defocus yields an overall system PSF that is asymmetric about the native focal plane, thus enabling effective extraction of depth information. We emphasize, however, that our method is suitable for use with any PSF that effectively encodes depth information across an extended DoF. Additionally, our method is not limited to a particular NA, magnification or immersion medium, allowing it to be tailored to suit individual experiments.
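The parabolic translation noted in (3) follows from a standard completion-of-cubes argument. The sketch below uses normalised pupil coordinates (u, v), a mask strength α and a defocus coefficient ψ, none of which are tied to our experimental values.

```latex
% Cubic pupil phase:
\phi(u, v) = \alpha\,(u^3 + v^3)

% Adding a defocus aberration \psi (u^2 + v^2) and completing the cube in each
% coordinate (shown for u; v is identical):
\alpha u^3 + \psi u^2
  = \alpha\left(u + \frac{\psi}{3\alpha}\right)^{3}
    - \frac{\psi^2}{3\alpha}\,u
    - \frac{\psi^3}{27\alpha^2}

% The cubic profile is unchanged apart from a pupil-coordinate shift; the
% residual linear (tilt) term has coefficient proportional to \psi^2, so the
% PSF translates laterally by an amount quadratic in defocus -- the parabolic
% translation.
```

This is why, to leading order, defocus translates the Airy-type PSF rather than blurring it, which underpins both its extended DoF and the disparity-based depth encoding used here.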
While our method delivers volumetric imaging of extended structure, a modest requirement for sparsity in the sample does remain. Because depth information is encoded through disparity between camera images, the reconstruction may be less effective if the sample exhibits texture-free regions of uniform intensity that are larger than the lobes of the cubic PSF, since such regions produce minimal disparity. Certain samples with large areas of uninterrupted, uniform fluorescence emission may therefore be less well suited to imaging with this method. However, we anticipate that future work using 3D-patterned illumination will further broaden the versatility of our method.
The number of iterations required for reconstruction varies depending on image characteristics including sparsity and SNR. Deconvolution should be stopped before the Richardson-Lucy algorithm begins over-fitting high-frequency noise components, which degrades the quality of the final solution: in practice we find the optimal number of iterations to be in the range 50-500. Because the deconvolution is implemented as a series of 2D fast Fourier transforms, the time needed for reconstruction scales with the size of the deconvolved volume. It also scales linearly with the number of iterations: the 200 iterations required to reconstruct the bead sample in Fig. 2 took approximately 450 s on a machine equipped with a single Nvidia GeForce GTX 1080 GPU.

Conclusion
Our proposed hybrid method provides high-resolution, scan-free volumetric imaging applicable to a wide range of biological samples. Our results show that volumetric deconvolution enables effective exploitation of the depth information encoded in snapshot 2D images to reconstruct a full 3D volume of extended objects, while maintaining the lateral resolution of the parent imaging objective. We have used images acquired with a differential-Airy PSF to demonstrate snapshot volumetric imaging across a large, extended depth of field, for samples that feature extended structure and dynamic motion. Because our method can be tailored to the size and structure of an individual sample, it has the potential to unlock a broad range of biological imaging applications where high temporal resolution is required.