Quantum-limited estimation of the axial separation of two incoherent point sources

Spatial mode demultiplexing has recently been proposed as a method to overcome the diffraction limits and can be used to precisely measure the transverse separation between two point sources. Here we show that superresolution can also be achieved along the axial direction through the use of a radial mode sorter, i.e. a device that can decompose the incident field into an orthonormal set of radial modes. We demonstrate axial superresolution in the laboratory through use of a binary radial mode sorter, a device that can sort odd-order and even-order radial modes into two separate output ports. We also show theoretically and experimentally that our method allows for saturating the the quantum Fisher information in the limit of a zero axial separation between two incoherent point sources. Our method presents an important step towards three-dimensional superresolution imaging via spatial mode demultiplexing and can be useful to a variety of applications such as far-field fluorescence microscopy.

Optical microscopy is one of the most important imaging devices and has been broadly applied in various areas. One crucial metric for an optical microscope is the spatial resolution, which is fundamentally constrained by the diffraction limit. Generally, when the separation between two point sources is smaller than the diffraction spot size, it becomes rather challenging to localize the two point sources, a phenomenon referred to as "Rayleigh's curse" [1]. In recent decades, various methods have been proposed to surpass the diffraction limit [2][3][4]. In fluorescence microscopy, a widely used approach is to activate each fluorescence molecule individually, and therefore the overlap between neighboring molecules is avoided and the localization precision can be improved to tens of nanometers. However, these techniques usually require specially prepared samples and the activation and detection of individual fluorophores can take a long time. More recently, a coherent detection strategy is proposed to overcome the diffraction limit [1]. While measuring in the position basis leads to a vanishing Fisher information when the separation of two point sources is close to zero, decomposing the field into the Hermite-Gaussian mode basis set [5] can reach a constant, non-zero Fisher information for a Gaussian point spread function (PSF). In addition, such a strategy can reach the quantum Cramér-Rao lower bound (CRLB) and therefore is considered to be theoretically optimal. Many works have been done to verify and generalize this theory [6][7][8][9][10][11][12][13][14].
While transverse spatial resolution can be enhanced by these methods, the axial superresolution remains challenging and is highly desirable for resolving three-dimensional structure. A variety of methods have been proposed to realize axial super-localization [15], such as an interferometric microscope [16,17], PSF engineering [18][19][20], and multi-plane detection [21]. In spite of these advances, it remains difficult to determine a small axial separation between two simultaneously emitting incoherent point sources. Nonetheless, our calculations show that the quantum Fisher information does not vanish for an arbitrarily small axial separation, and therefore it should be possible to achieve superresolution in the axial direction. In the following we first present the theory of axial superresolution and then introduce the corresponding experimental realization.
A conceptual diagram of two different imaging methods is shown in Fig. 1. A traditional microscope employs an objective to collect photons and then use a tube lens to form an image of the object as shown in Fig. 1(a), and we refer to such a configuration as direct imaging. Alternatively as proposed in [1], one can decompose the optical field into a complete and orthonormal basis set as shown in Fig. 1(b), which can be realized by a spatial mode sorter and is referred to as sortedbased measurement. For a more tractable analysis and experiment here we assume a Gaussian PSF and the field distribution in the pupil plane for an on-axis point source is denoted by |ψ , where where NA is the numerical aperture, z is the axial position of the point source, k = 2π/λ is the wavenumber, λ is the wavelength, r 0 is the normalized radial coordinate in the pupil plane, and |r 0 is the corresponding position eigenstate. Here we define r 0 = r p /(f 1 NA), where r p is the radial coordinate in the pupil plane and f 1 is the objective focal length. This pupil plane field distribution can be viewed as a paraxial, Gaussian approximation to the pupil function of a hard-edged circular aperture [22,23]. For direct imaging a tube lens is used to perform a Fourier transform to the pupil function, and the intensity distribution on the image plane becomes where r is the radial coordinate in the image plane, w(z) denotes the Gaussian beam waist width on the image plane, and M is the magnification of the imaging system. By measuring the beam size we can estimate the axial position z. Similar to the case of transverse superresolution [1], here we assume a priori knowledge of two on-axis, equally bright incoherent point sources with the centroid located at z = 0 plane, and the axial separation between them is s. The density matrix of these two point sources at the pupil plane can be written as ρ = (|ψ 1 ψ 1 | + |ψ 2 ψ 2 |)/2, where r 0 |ψ 1 = ψ(r 0 ; s/2) and r 0 |ψ 2 = ψ(r 0 ; −s/2). The normalized total intensity at the image plane can be calculated as I s (r) ≡ r| ρ |r = [I(r; s/2) + I(r; −s/2)]/2, where |r is the position eigenstate in the image plane, and the image plane is related to the pupil plane by the Fourier transform. For sorter-based measurement, the incident field is decomposed to an orthonormal basis set, and here we consider the radial Laguerre-Gaussian (LG) basis because we notice that the axial position only affects the radial profile of pupil function. The radial LG basis in the pupil plane can be denoted as |LG p , where r 0 |LG p = LG p (r 0 ) and LG p ( where L p (·) is the Laguerre polynomial. While the two-dimensional LG basis involves another azimuthal index , this radial subset with = 0 can still form a complete basis to describe the pupil function because of the rotational symmetry of the pupil function as shown in Eq. (1). Decomposing the pupil function ψ(r 0 ; z) to this basis leads to the following radial mode distribution where z R = πw 2 0 /λ and w 0 = λ/πNA [1]. For two equally bright sources separated by s, the output radial mode distribution becomes P s (p) ≡ LG p | ρ |LG p = [P (p; s/2) + P (p; −s/2)]/2.
It can be noticed that for direct imaging and sorter-based measurement, the two incoherent point sources have the same response because Eq. (2) and Eq. (4) are even function of z, which implies that the analysis presented here can be readily applied to single point localization.
We next compare the performance of direct imaging and the sorter-based measurement by calculating the Fisher information for both techniques. The Fisher information for direct imaging which is independent of the magnification M . The Fisher information for the sorter-based measurement is The upper bound of Fisher information for any type of measurement is referred to as quantum Fisher information, which in our case can be calculated as [23]  where |∂ s ψ 1 = ∂ |ψ 1 /∂s and it can be readily shown that K s = 1/4z 2 R . We also follow the typical way of using the symmetric logarithmic derivative to calculate the quantum Fisher information and the details are presented in Supplementary Section I, which give the same result. The reciprocal of quantum Fisher information is the quantum CRLB which gives the lower bound of classical CRLB for any measurement techniques. We notice that the sorter-based measurement can reach the quantum Fisher information when the separation goes to zero, i.e. J sorter (0) = K s [see Eq. (6)], therefore it can be considered to be an optimal measurement for s close to zero. However, in a realistic experiment, a mode sorter can only access a finite-dimensional Hilbert space. Therefore we follow the procedure in [24] to construct other possible optimal measurements that can reach the quantum Fisher information in the limit of s = 0. In Supplementary Section II we show that a binary radial mode sorter can perform one of the optimal measurements. A binary sorter has two output ports, and all odd-order radial modes are directed to one output port while all even-order modes are directed to another output port. Therefore, the photon probability distribution at two output ports is Therefore the Fisher information for a binary sorter is .
The plot of Fisher information for different methods is shown in Fig. 3(a). It can be readily seen that this binary sorter can also achieve the quantum Fisher information while the direct imaging has a vanishing Fisher information when s approaches zero. The reciprocal of the Fisher information is the corresponding CRLB, which is plotted in Fig. 3(b).
Having analyzed the performance of each method, now we need to establish the estimator of separation. For direct imaging, it can be verified that the maximum likelihood estimator iŝ where r m is the radial coordinate of m-th photon in the image plane and N is the total detected photon number. However, the estimator ofŵ does not take the detector noise and pixelation into account and thus this estimator may not be robust in a realistic experiment. Therefore, we apply the algorithm in [25] to realize a robust, efficient Gaussian width estimatorŵ in our experiment.
For binary sorter-based measurement, the maximum likelihood estimator iŝ where m 0 and m 1 are the photon numbers in the two output ports and m 0 + m 1 = N .
A schematic for experimental setup is shown in Fig. 2. We use an attenuated laser source to illuminate the spatial light modulator (SLM) to generate the Gaussian pupil function produced by a point source. An acousto-optic modulator (AOM) is driven by a signal generator to produce 3 µs pulses, and the driving signal is also connected to an intensified charge coupled device (ICCD, PI-Max 4 1024i) for synchronization. The average detected photon number in each pulse is around 2,000. We use the calibration factor provided by the manufacturer to calculate the photon number in each pixel of the camera. We emulate two incoherent sources by mixing the data for z = ±s/2 that is generated by SLM separately. A computer-generated hologram is displayed on SLM 1 to generate the Gaussian pupil function of an objective of NA = 0.1 and f 1 = 4 mm at the first diffraction order [26]. The calibration data of SLM 1 is presented in Supplementary Section III.
A polarizer sets the photons to be diagonally polarized. The binary radial mode sorter [27][28][29] consists of a polarization-sensitive SLM (Hamamatsu X10468-02). In our experiment we use two different areas on a single SLM to act as two SLMs for reduced experimental complexity. The phase pattern on both SLMs is set to be the phase of a spherical lens of focal length 46.5 cm, and the separation between two SLMs is 65.8 cm. One can check that even-order radial modes remain diagonally polarized, while odd-order radial modes become anti-diagonally polarized [27].
Through the use of a half wave-plate (HWP) and a polarizing beamsplitter (PBS) one can efficiently separate odd-and even-order radial modes to distinct output ports. More details about the principle of radial mode sorter can be found in [27][28][29]. In our experiment we direct the photons from different output ports to different areas of an ICCD. For direct imaging, we use a 10 cm tube lens to form the image on the ICCD detector plane. For each separation we repeat the experiment 400 times and calculate the standard deviation from the collected data.
The measured separation and the standard deviation as a function of actual separation for different measurement methods are presented in Fig. 4. The Monte Carlo simulation results are also provided as comparisons and they agree well with the experimental data. In the simulation we set the detected photon number to 2,000 which is equal to that of our experiment, and the statistical results are retrieved by averaging 4,000 simulations. We use a noiseless detector with a sufficiently high spatial resolution in the simulation, and the estimators for direct imaging and binary sorter- based measurement are given by Eq. (10) and Eq. (11) respectively. As can be seen in Fig. 4(a), the measured separation of direct imaging is biased when the actual separation is close to zero.
This bias is not due to experimental imperfections as indicated by the Monte Carlo simulation and should be attributed to the bias of the estimator [6]. This bias also explains why the standard deviation of direct imaging does not blow up as predicted by CRLB as analyzed in Ref. [6]. Compared to the direct imaging, the standard deviation of the binary sorter-based measurement does not increase when the actual separation is close to zero. In addition, it can be noticed that the standard deviation in the simulation drops to zero when the separation is sufficiently small as shown in Fig. 4(d), which is a well known phenomenon called superefficiency [1,30]. This phenomenon is caused by the bias of the maximum likelihood estimator for a finite number of samples and is not considered to be significantly useful in statistics [1]. While this phenomenon is not visible in our experimental data possibly due to the experimental imperfections such as dark noise of the detector and misalignment of the sorter, it is apparent that the sorter-based measurement can outperform the conventional direct imaging. We also note that the ICCD used for sorter-based measurement can be replaced by single-pixel detectors, which can have a higher signal-to-noise ratio and consequently lead to better experimental results.
In this work we assume a Gaussian PSF for more tractable analysis and experiment. While Gaussian PSF is a widely adopted approximation [31,32], for a high-NA imaging system a more accurate PSF model may be needed. In this case, one can always establish a complete and orthonormal basis based on the PSF model and construct a sorter accordingly to achieve superresolution accordingly [12]. Very recently, it has been proposed theoretically that for the pupil function of a hard-edged aperture, the optimal measurement basis turns to be the Zernike basis [23,24], and we discuss other optimal measurements that are easier to implement in Supplementary Section II. Therefore, based on our result it can be anticipated that three-dimensional superresolution can be realized as long as a Zernike mode sorter is available. Moreover, given the widely used PSF engineering technique [18][19][20], we can also use a Gaussian amplitude mask or SLM to convert the pupil function of a hard-edged aperture to a Gaussian and then apply the radial mode sorter accordingly.
In conclusion, we theoretically and experimentally demonstrate the axial superresolution based on a radial mode sorter. The binary radial mode sorter employed in out experiment can reach the quantum Cramér-Rao lower bound for an arbitrarily small axial separation. Our method makes three-dimensional superresolution imaging promising and can be potentially useful for enhancing the resolution of optical microscopes.

I. DERIVATION OF THE QUANTUM FISHER INFORMATION
Following the procedure in the supplement of Ref. [23], the quantum Fisher information in our case can be directly calculated as and it can be readily verified that K s = 1/4z 2 R for the Gaussian PSF. Here we also follow a typical method to calculate the quantum Fisher information based on the symmetric logarithmic derivative. For two point sources that are located at z = ±s/2 respectively, the density matrix of a single photon can be expressed as where r 0 |ψ 1 = ψ(r 0 ; s/2) and r 0 |ψ 1 = ψ(r 0 ; −s/2). The orthonormal bases in which the density matrix is diagonal are found to be Hence the density matrix can be rewritten as Here we also define the following orthonormal bases where |∂ s ψ 1 = ∂ |ψ 1 /∂s, |∂ s ψ 2 = ∂ |ψ 2 /∂s, and the coefficients are calculated to be Here the symmetric logarithmic derivative of density matrix is where D j = e j | ρ |e j and the matrix elements of L(ρ) can be calculated as where L jk = e j | L(ρ) |e k . After more algebra one can calculate that the quantum Fisher information is (S9) In our experiment we use SLM 1 to generate Gaussian point spread function, and we calibrate SLM 1 to compensate the aberration and experimental imperfection. The pupil function we want to generate is where z = ±s/2 and this pupil function is generated at the first diffraction order of the computergenerated hologram impressed on the SLM. We apply Seidel abberations to the computergenerated hologram to improve mode quality and tune the parameter z to fit the expected curve.
For direct imaging, the measured width w 0 of the Gaussian PSF should increase with z as expressed in Eq. (2), and the corrected experimental result is shown in Fig. S1(a). For binary sorter-based measurement, the value of Q calculated from the output photon numbers of the binary sorter defined in Eq. (11) can be written as a function of z as We correct the SLM to fit this curve and the experimental result is shown in Fig. S1(b). Due to the detector noise and misalignment of radial mode sorter, the output Q has a small, nonzero value at z = 0, and we treat this experimentally measured nonzero value as a bias term and subtract it before estimating the axial position.

III. SATURATION OF THE QUANTUM CRAMÉR-RAO BOUND
We are interested in the saturating the quantum Cramér-Rao bound in the limit of s = 0. Similar to the proof of Corollary 1 in Ref. [24], one can show that a measurement consisting of projectors Π k = α |π kα π kα | can saturate the quantum Cramér-Rao bound at s = 0 if and only if for all regular projectors (defined as | ψ 1 Π k ψ 1 s=0 | > 0) we can have where |∂ 0 s ψ 1 s=0 = (|∂ s ψ 1 − ψ 1 |∂ s ψ 1 |ψ 1 ) s=0 . Such an optimal measurement can be constructed by choosing a proper trial basis {π kα } and then follow the procedure in Ref. Assemble regular basis vectors satisfying Eq. (S12) as a regular projector Π k = α |π kα π kα |.
(iv) A null basis vector |π kα is flexible if ∂ s ψ 1 |π kα s=0 = 0. The rank one flexible projector Π kα formed by a flexible basis vector can be added to any of the previous regular projectors or the following null projectors. (v) Assemble null basis vectors that are not flexible as a null projector Π k = |π kα π kα | defined as ψ 1 Π k ψ 1 s=0 = 0. One can check that any measurement constructed from the previous procedure can satisfy Eq. (S12).

A. Optimal measurement basis set for the Gaussian pupil function
In the main text, we consider the case of the Gaussian pupil function, It is straightforward to find ψ 1 (r 0 ) s=0 = LG 0 (r 0 ), (S14) We take radial Laguerre-Gaussian modes r 0 |LG p = LG p (r 0 ) as a trial basis. So we find that for step (i) in the limit s = 0, the only regular basis vector is |LG 0 and the remaining other basis vectors are null. (ii) It can be readily shown that LG 0 |LG 0 = 0.
(iii) We obtain a regular projector Π 0 = |LG 0 LG 0 |. (iv) From Eq. (S15), we find ∂ s ψ 1 |LG p = 0 for p ≥ 2. Thus the basis vectors with p higher than one are all flexible and therefore can freely added to any regular or null projector. (v) The only non-flexible null basis vector is |LG 1 , thus we can form a null projector Π 1 = |LG 1 LG 1 |. We add the projector formed by flexible basis vectors of even and odd order to the previous regular projectors Π 0 and Π 1 respectively. Therefore the final optimal measurements can be Alternatively one can add the flexible projectors to Π 0 or Π 1 to give rise to optimal measurements of Π 0 = |LG 0 LG 0 | , or Any sorter that can efficiently perform the above measurements can be used to reach the quantum Fisher information when s approaches 0.
B. Optimal measurement basis set for the pupil function of a hard-edged aperture The construction of optimal measurement for the Gaussian pupil function can be analogously done for the pupil function of hard-edged aperture. Consider the pupil function ψ 1 (r 0 ) = 2/π · circ( √ 2r 0 ) exp −iksNA 2 r 2 0 /4 , where circ(r) = 1 if 0 ≤ r ≤ 1 and circ(r) = 0 if r > 1. The quantum Fisher information is still expressed by Eq. (S1). It is straightforward to find ψ 1 (r 0 ) s=0 = Z 0 ( √ 2r 0 ), (S19) where Z n ( √ 2r 0 ) = 2(n + 1)/πR n ( √ 2r 0 )circ( √ 2r 0 ) and R n is the Zernike polynomial. We take radial Zernike basis r 0 |Z n = Z n ( √ 2r 0 ) as the trial basis. Following the previous procedure, we find the following optimal measurements: or or Any sorter that can efficiently perform the above measurements can be used to reach the quantum Fisher information when s approaches 0.