Confocal super-resolution microscopy based on a spatial mode sorter

Spatial resolution is one of the most important specifications of an imaging system. Recent results in quantum parameter estimation theory reveal that an arbitrarily small distance between two incoherent point sources can always be efficiently determined through the use of a spatial mode sorter. However, extending this procedure to a general object consisting of many incoherent point sources remains challenging, due to the intrinsic complexity of multi-parameter estimation problems. Here, we generalize the Richardson-Lucy (RL) deconvolution algorithm to address this challenge. We simulate its application to an incoherent confocal microscope, with a Zernike spatial mode sorter replacing the pinhole used in a conventional confocal microscope. We test different spatially incoherent objects of arbitrary geometry, and we find that sorter-based microscopy can achieve more than 5-fold resolution enhancement over a diffraction-limited image. In addition, the resolution enhancement of sorter-based microscopy is on average over 30% higher than that of a conventional confocal microscope using the standard RL deconvolution algorithm. Our method could potentially be used in diverse applications such as fluorescent microscopy and astronomical imaging.


Introduction
Enhancing spatial resolution is a persistent goal for imaging systems. The resolution of an incoherent far-field imaging system was previously believed to be limited by Rayleigh's criterion [1]. In recent decades, a multitude of super-resolution methods have been demonstrated to break the diffraction limit, such as stimulated-emission depletion (STED) [2], photoactivated localization microscopy (PALM) [3], and stochastic optical reconstruction microscopy (STORM) [4]. However, these methods generally require the use of specially prepared fluorescent molecules, and the data collection in an experiment can take a long time. In addition to these classical methods, various quantum effects have been investigated to enhance the imaging resolution. Optical centroid measurement [5][6][7][8][9] is another quantum approach that can improve the resolution by up to 41% via detecting the centroid of entangled bi-photons. The anti-bunching effect has also been exploited to enhance the spatial resolution when imaging single-photon sources such as quantum dots through the use of coincidence measurement [10,11]. Nonetheless, these non-classical methods typically require the use of quantum, low-brightness light sources (e.g., single-photon sources and entangled-photon sources) as well as slow, high-order intensity correlation measurements, which limits their widespread adoption in real-world imaging systems.
In recent years, a quantum-inspired super-resolution imaging method based on spatial mode sorting (called SPADE) has been proposed [12] and experimentally demonstrated [13][14][15][16][17]. The Rayleigh diffraction limit can be broken through the use of an appropriate spatial mode sorter, and an arbitrarily small separation between two spatially incoherent, equally bright point sources can be well resolved. Although the theory for SPADE was developed in the framework of quantum metrology, it can be interpreted classically [18] and does not need non-classical light sources or high-order coincidence detection, which is the major advantage over the aforementioned super-resolution methods. The theoretical treatment of SPADE approaches the super-resolution task as a parameter estimation problem, which works well when imaging a scene with a small number of point objects. However, it is non-trivial to apply the theory to a general scene that contains many point sources or continuous objects due to the complexity of multi-parameter estimation problems [19,20]. A few theoretical attempts have been made towards sorter-based super-resolution imaging for a scene with a very small number of unknown parameters [20][21][22][23][24][25][26][27][28][29]. However, these previous works mainly focus on the Fisher information analysis of the mode sorter, which exhibits intractable complexity in the calculation of quantum Fisher information. Furthermore, even if the quantum Fisher information can be computed, it is challenge to determine if the quantum Fisher information can be achieved by practical measurements for all parameters. Hence, to the best of our knowledge, no method has yet been reported to super-resolve an object of arbitrary geometry using a mode sorter. Here we address this challenge by treating the sorter-based super-resolution imaging as a deconvolution problem. We propose to replace the pinhole in a standard confocal microscope with a spatial mode sorter. We generalize the standard RL deconvolution algorithm [30,31] to digitally process the multiple outputs of the mode sorter in order to reconstruct a super-resolved image. In Section 2, we introduce the conceptual schematic of the sorter-based confocal microscopy as well as the algorithm for image reconstruction. In Section 3 we present the numerical simulation results. The conclusion of this work is discussed in Section 4.

Generalized Richardson-Lucy deconvolution algorithm
The schematic of a confocal microscope is shown in Fig. 1(a,b). Conventional confocal microscopy uses a pinhole in the image plane before the single-pixel detector. Here we assume that the illumination beam is spatially coherent and the light scattered by the object is spatially incoherent, which is common in fluorescence microscopy. The illumination and reflected beam wavelengths are assumed to be the same for simplicity, although they can be different in fluorescence microscopy. We use 0 to describe the object brightness profile and use to denote the point spread function (PSF) of the conventional confocal microscope using a pinhole of diameter .
Here we assume a circular aperture of the objective lens, and the PSF of the objective lens is thus an Airy disk [32]. Additional details of are presented in Supplementary Section 1. By raster scanning the object, a 2D image can be obtained, and the resultant confocal image con can be described by the convolution of 0 and as con = * 0 . In our model we consider only the fundamental quantum noise which leads to Poissonian photon statistics; we ignore other technical sources of noise. Therefore, the shot-noise-limited image that can be experimentally measured is described by exp con = Poisson( con ), where Poisson(·) denotes one random realization of the Poisson distribution for a given mean. The standard RL deconvolution algorithm can be expressed as [30,31] where is the deconvolved image in the -th iteration and * denotes convolution. In general, the iterative deconvolution algorithm begins with an image of uniform intensity =1 = const, and the term inside the parentheses in the above equation can be understood as a correction to during each iteration. The proposed sorter-based confocal microscope is shown in Fig. 1(b). It can be seen that a Zernike mode sorter is used in the Fourier plane. Here Zernike modes are adopted because they have been shown to be the optimal basis for an imaging system with a circular aperture [22,33]. In particular, we choose the six lowest-order Zernike modes ( 0 0 , −1 1 , 1 1 , −2 2 , 0 2 , and 2 2 ), as shown in Fig. 1(c). The Zernike mode sorter projects the collected photons onto each Zernike mode in the Fourier plane, and each output port of the Zernike mode sorter produces a 2D image by raster scanning the object. The image is given by the convolution of the original object image 0 and as = 0 * , where is the effective PSF when projecting into the mode : where +1 (·) is the Bessel function of order + 1; ( 1 , 1 ) are the Cartesian coordinates at the object plane, and ( 1 , 1 ) are the corresponding polar coordinates; 0 is the photon number in the illumination beam at each raster scanning step; = 2 / is the wave number, is the wavelength, and NA is the collection numerical aperture of the objective lens. In this equation, =0, =0 represents the PSF of the illumination beam, and is the  intensity profile of the Fourier transform of in the image plane as shown in the bottom row in Fig. 1(c). Here, we assume that the illumination beam has a flat spatial profile before being focused by the objective lens and that the illumination NA is the same as the collection NA. We use this assumption to simplify the calculation and simulation, but we note that this assumption can be relaxed and is not necessary to the result. Derivations of the analytical form of , and are presented in Supplementary Section 2. We use exp = Poisson( ) to denote the randomly generated shot-noise-limited image that can be measured in an experiment. As one can see, a major difference between the conventional confocal microscopy and the sorter-based confocal microscopy is that multiple images (six in our case) can be obtained simultaneously when using a mode sorter. Here we propose a generalized RL deconvolution algorithm, which can be expressed as Compared to the conventional RL deconvolution algorithm (Eq. (1)), it can be seen that the correction term inside the parentheses in the above equation is the sum of contributions from different modes.

Numerical simulation
We next present the results of numerical simulations that implement the generalized deconvolution algorithm and compare its performance to that of the conventional deconvolution algorithm. One of the objects we use (pattern A) is shown in Fig. 2(a) and has 256 × 256 pixels. The original image is zero padded to avoid the diffraction-induced boundary effect, which is irrelevant to this study. A 2D image exp can be obtained at each output port of the mode sorter when raster scanning the translation stage by ( , ). Here we choose the scanning step size to be 1 pixel and the total scanning steps to be 256 × 256, resulting in a 2D image exp ( , ) of 256 × 256 pixels.
At each scanning step, we assume that 0 photons are used to illuminate the object, and thus the total illumination photon number = 256 × 256 × 0 . The results for different Zernike mode  outputs are shown in Fig. 2(b-g). One can see that the output of high-order modes has a lower photon count and is thus more susceptible to Poisson noise. We use =1 = const as the starting point and run the iterative deconvolution algorithm based on Eq. (3). We choose the commonly used peak signal-to-noise ratio (PSNR) to quantify the quality of the reconstructed image . The definition of PSNR is given by [34] PSNR = 10 log 10 max( 0 ) 2 where ( , ) are the integer pixel indices of the digital image and = 256 is the pixel size along one dimension. We stop the deconvolution algorithm at a maximum iteration number ite = 10 4 , which is limited by time and computational power constraints. In general, the PSNR increases with increasing iteration number . However, if the data is noisy, the noise can be amplified when the iteration number is large, and thus the PSNR can decrease if continues to increase. In our implementation, we monitor the PSNR as a function of the iteration and choose the maximum PSNR for 1 ite for each implementation. The reconstructed image is shown in Fig. 2(h). More details on the PSNR as a function of the iteration number are provided Supplementary Section 3. For practical applications where the ground truth is not available, a stopping criterion [35] must be used.
We next characterize the performance of the generalized deconvolution algorithm under different levels of Poisson noise by adjusting the total illumination photon number . The ground truth for pattern A is shown in Fig. 3(a1), and we choose the Rayleigh-criterion resolution 0 = 1.22 /( NA) to be 80 pixels. The noiseless, diffraction-limited confocal image without deconvolution con is shown in Fig. 3(a2), which is too blurry to reveal the details of the ground truth. We next vary the total photon number and test the performance of the deconvolution algorithm with different . The reconstructed images by the sorter-based deconvolution algorithm are presented in Fig. 3(a3,a4). We also test another pattern B made of four handwritten digits (MNIST handwritten digit database [36]) with non-uniform intensity profile as shown in Fig. 3(b1). The conventional confocal image con without deconvolution is shown in Fig. 3(b2), and the digits cannot be resolved based on this image. The sorter-based deconvolved images are presented in Fig. 3(b3,b4). It can be seen that the digits '0' and '1' using the sorter-based approach are visually resolvable. Fig. 3(a5,b5) are 1-D cross-sections through the images (shown by the white line), comparing the reconstructions to the ground truth. We can see how the reconstructions evolve as the photon number increases. At a large photon number the 1-D cross-section shows a higher contrast and is more similar to the ground truth than at low photon numbers.
In Fig. 4 we compare the performance of the conventional deconvolution algorithm to the generalized deconvolution algorithm in terms of PSNR under different total illumination photon number for patterns A and B. For each , we run the simulation six times with randomly generated Poisson noise to obtain the mean and the standard deviation of the PSNR of the reconstructed images. It can be seen that the generalized deconvolution algorithm based on the mode sorter consistently provides higher PSNR than the conventional confocal approach. Also, the PSNR of the reconstructed image generally increases when increases. Although PSNR is a widely used metric for quantifying the image quality, the PSNR of reconstructed images for different ground truths cannot be compared directly. In addition, the PSNR does not provide an intuitive understanding of the reconstructed resolution. We next translate PSNR to the effective resolution enhancement in order to answer the frequently asked question "what is the resolution enhancement of your super-resolution method?". For a ground truth image 0 , we blur it with PSFs of different resolutions as blur = 0 * ( ), where ( ) is the PSF of the conventional confocal microscopy given a particular Rayleigh resolution . We then numerically calculate the PSNR of blur using Eq. (4) to obtain the relation between resolution and PSNR, i.e. PSNR = ( ). Therefore, for each reconstructed image , we can calculate its effective resolution based on its PSNR via eff = −1 (PSNR), where −1 is the inverse function of . Hence, the effective resolution enhancement (EFE) can be calculated as and this quantity is shown on the right-hand side axis in Fig. 4. It can be seen that at the maximum total photon number = 10 10 , the effective resolution enhancement of sorter-based approach is higher than 5.0 for both pattern A and B. Moreover, the effective resolution enhancement of the sorter-based approach is on average 38% and 30% higher than that of the conventional approach for pattern A and pattern B, respectively. We also test nine additional images which are shown in Supplementary Section 4. It can be seen that the mode sorter can provide on average 24% higher resolution enhancement over the conventional approach for the nine additional objects. We also present the reconstructed images obtained by the conventional deconvolution algorithm and the sorter-based deconvolution algorithm in Supplementary Section 5. It can be seen that the sorter-based deconvolution algorithm provides a visibly higher resolution as compared to the conventional deconvolution algorithm, in particular at low total photon number. We believe that our method can be readily applied to confocal fluorescent microscopy by using a Zernike mode sorter, and the Zernike mode sorter can in principle be experimentally realized by the multi-plane light conversion [37]. Another potential application of our method is the astronomical imaging where the collected light field is spatially incoherent. However, since the confocal scheme cannot be used in astronomical imaging, the formulas developed here need to be adjusted accordingly to account for the non-confocal scheme used in astronomical imaging.

Conclusion
In conclusion, we generalize the standard RL deconvolution algorithm and apply it to enhance the resolution of sorter-based confocal microscopy. We test our algorithm with general scenes, which has not previously been realized by spatial mode sorting, to the best of our knowledge. The effective resolution enhancement of the sorter-based approach can be as large as ≈5.6 when the total illumination photon number is = 10 10 . For both patterns we test, the average effective resolution enhancement of the sorter-based approach is more than 30% higher that that of the conventional deconvolution algorithm. Hence, our generalized deconvolution algorithm can achieve robust super-resolution for general scenes compared to the conventional RL deconvolution algorithm. In particular, our generalized deconvolution algorithm allows for super-resolving strongly blurred images of digits, which could be used a front-end to a machine learning-based digit identification task [36]. Furthermore, our method does not require non-classical quantum light sources, and thus our generalized deconvolution algorithm can be potentially useful to applications such as fluorescent microscopy and astronomical imaging. Given the simplicity and generality of our generalized deconvolution algorithm, it is possible to integrate our method with existing quantum or classical super-resolution methods to increase the resolution even further.

S1. Derivation of Effective PSF of the Conventional Microscopy
In a conventional confocal microscope, the Airy-disk-shaped PSF of the illumination beam ( 1 , 1 ) can be written as where ( 1 , 1 ) are the Cartesian coordinates and ( 1 , 1 ) are the corresponding polar coordinates in the object plane, NA is the numerical aperture, is wavelength, and = 2 / is the wave number. Here the total energy of the illumination beam is ∬ ( 1 , 2 ) 1 2 = 0 , and thus the illumination beam contains 0 photons. It can be noticed that (see Eq. (2) in the manuscript): To obtain a 2D image, one needs to raster scan the object. This 2D translation is denoted as ( , ). The intensity profile of the object is denoted by 0 ( 1 , 1 ). When the object is excited by the illumination beam, the excited intensity distribution at the object plane is 1 ). For the reflected beam, we assume the imaging system has a magnification of unity, and the intensity PSF at the image plane can be described as where ( 2 , 2 ) are the Cartesian coordinates at the image plane, ( 2 , 2 ) are the polar coordinates at the image plane, and the energy of ( 2 , 2 ) is normalized to unity as ∬ ( 2 , 2 ) 2 2 = 1. Hence, the resultant image at the image plane can be written as In a conventional confocal microscope, we use a pinhole with diameter at the image plane in front of a bucket detector to measure the photon number. For the conventional confocal microscopy simulation, we use a pinhole of diameter = 0 , which is commonly used in experiments [1]. Therefore, by raster scanning the object, a 2D image con ( , ) can be obtained by the bucket detector, which can be written as Hence, the expression for con ( , ) can be rewritten in the form of a convolution as where

S2. Derivations of Effective PSF of the Mode Sorter-Based Microscopy
For our sorter-based method, we use a Zernike mode sorter to perform spatial mode decomposition. The expression for the normalized Zernike modes in the Fourier plane is where ( , ) are the scaled, dimensionless polar coordinates at the Fourier plane with 0 1; . Consider a point source located at ( 1 , 1 ). Then the electric field on the scaled pupil plane is 1 √ exp( NA( 1 + 1 )), where ( , ) is the dimensionless Cartesian coordinate at the Fourier plane with 2 + 2 1. In the polar coordinate, this field can be rewritten as 1 √ exp( NA 1 cos( 1 − )). Now we project this field to the Zernike modes ( , ), and the overlap integral is [2] ( 1 , 1 ) = (S10) It can be seen that the overlap integral coincides with the Fourier transform of Zernike modes. Therefore, when the object 0 ( 1 − , 1 − ) is excited by the illumination beam ( 1 , 1 ), the sorter output is where (S12)

S3. Quality of Iterative Reconstructions
We allow the algorithm to run until it reaches a maximum iteration number ite = 10 4 , which is limited by time and computational power constraints. We monitor the PSNR at each iteration and choose the maximum PSNR for 1 ite for each implementation. In general, the PSNR increases with increasing iteration number , as shown in Fig. S1(h). However, noisy data can cause the PSNR to decrease with increasing , as the noise can be amplified when the iteration number is large. This effect is shown in Fig. S1(d). In Fig. S1(a-c) we see the progression of the image quality as the iteration number is increased. Fig. S1(e-g) illustrates the same progression at a larger illumination photon number. These images are reconstructions of pattern A.

S4. Performance characterization for Additional Images
The main goal of our approach is to super-resolve an object of arbitrary geometry using a mode sorter, as to the best of our knowledge this has not yet been accomplished. In order to test our algorithm, we fed it many images to reconstruct. In Fig. S2 we compare the performance of the conventional algorithm to the generalized algorithm in terms of PSNR and the effective resolution enhancement. Among all the data points shown in the inset of Fig. S2 (a-i)

S5. Comparison of conventional and sorter-based algorithm
For the conventional confocal microscopy simulation, we use a pinhole of diameter = 0 , which is a value commonly used in experiments [1]. The images reconstructed by the conventional deconvolution algorithm and the sorter-based deconvolution algorithm are shown below.