Tempering Rayleigh's curse with PSF shaping

It has been argued that, for a spatially invariant imaging system, the information one can gain about the separation of two incoherent point sources decays quadratically to zero with decreasing separation, an effect termed Rayleigh's curse. Contrary to this belief, we identify a class of point spread functions with a linear information decrease. Moreover, we show that any well-behaved symmetric point spread function can be converted into such a form with a simple nonabsorbing signum filter. We experimentally demonstrate significant superresolution capabilities based on this idea.

These experiments dispel Rayleigh's curse, but require sophisticated equipment. In this Letter, we revisit the scenario of direct detection, for it is the cut-and-dried method used in the laboratory. We show that by using a simple phase mask-a signum filter-direct detection makes the Fisher information drop linearly. This scaling law opens new avenues for boosting resolution, as we demonstrate here with a proofof-principle experiment. Moreover, this means that the advantage of the aforementioned quantum schemes, with a separation-independent Fisher information, over classical techniques is smaller than previously thought.
We work with a spatially invariant imaging system and two equally bright incoherent point sources separated by a distance s. We assume quasimonochromatic paraxial waves with one specified polarization and one spatial dimension, with x denoting the image-plane coordinate in direction of the separation. This simplified 1D geometry is sufficient to make clear our procedure and it works for some applications, such as, e.g., spectroscopy. The more realistic 2D case can be worked along the same lines, although at the price of dealing with a two-parameter estimation.
If I(x) is the spatial distribution of the intensity in the image from a point source, commonly called the point-spread function (PSF) [15], the direct image can be written as where p(x|s) is the probability density for detecting light at x conditional on the value of s. We model light emission (detection) as a random process (shot noise) [16]. The precision in estimating s is governed by the Fisher information where ∂ s is the partial derivative with respect to s. In this stochastic scenario, N is the number of detections, which can be approximately taken as Poissonian with a mean N p(x|s)dx. Without loss of generality, we set N = 1 and evaluate the Fisher information per single detection. Reintroducing N into the final results is always straightforward. The CRLB ensures that the variance of any unbiased estimatorŝ of the quantity s is bounded by the reciprocal of the Fisher information; viz, (∆ŝ) 2 ≥ 1/F(s).
Since we are chiefly interested in the case of small separations, we expand p(x|s) in s, getting p(x|s) = I(x) where a prime denotes derivative respect to the variable. Observe that the odd powers of s make no contribution, because the contributions from the two PSF components cancel each other. The associated Fisher information becomes Commuting the order of integration and summation immediately yields a quadratic behavior for all PSFs: F(s) ∝ s 2 + O(s 4 ). However, such an operation is not always admissible [17], which leaves room for tempering Rayleigh's curse with PSF shaping techniques.
To illustrate this point, let us assume, for the time being, that our PSF is well approximated by a parabolic profile near the origin; i.e., I(x) αx 2 , which implies p(x|s) α(x 2 + s 2 /4) , x, s 1 .
When this holds true, the integrand in Eq. (2) reduces to a Lorentzian function Because of the strong peak at x = 0, when s 1 the tails of the Lorentzian do not contribute appreciably and can be ignored. As a result, we get F(s) λ s , with λ = πα/2, and the information is indeed linear rather than quadratic at small separations. Note that the proper normalization of p(x|s) is guaranteed by higher-order terms in the expansion Eq. (4), but they do not affect the scaling in Eq. (6). Next, we show that any PSF can be converted to the form (5) by applying a simple nonabsorbing spatial filter at the output of the system. In what follows, Ψ(x) indicates the amplitude PSF, so that I(x) = |Ψ(x)| 2 , and Ψ( f ) its Fourier transform. We process the image by a coherent processor, such as, e.g., a standard 4 f system [15] schematized in Fig. 1. In the Fourier plane, each point source gives rise to Ψ(x ± s/2) → Ψ( f )e ±iπ f s . In that plane, we apply a signum mask: sgn( f )Ψ( f )e ±iπ f s , where for a real number sgn(t) = |t|/t for t = 0 and sgn(0) = 0. As the signum is a pure phase filter, no photons are absorbed. The signal components are then convolved with the inverse Fourier transform of the signum function, which is In this way, the processor performs which is the Hilbert transform of the signal. The optical implementation of this transform has a long history [18][19][20]. It has been used in several fields, but most prominently in image processing for edge enhancement, because it emphasizes the derivatives of the image.
Applying the change of variable ξ = x − x, expanding Ψ to the second order in the small quantity x ± s/2, and using the spatial symmetry of Ψ, we approximate the output amplitudes after the signum mask by The detection probability density near the origin now takes the parabolic shape, as discussed before, viz, with α = [ Ψ (ξ )/ξ dξ ] 2 /π 2 . Note carefully that the parabolic behavior (9) is general, but the value of the coefficient α depends on the explicit form of the PSF Ψ(x). We thus have a linear Fisher information as in Eq. (6). In physical terms, this happens because the Fourier-space processing incorporates phase information. In addition, the combined system consisting of the imaging and PSF reshaping step remains spatially invariant and so the information about the separation is not degraded by misaligning the signal and detection devices, as it happens, for example, when the centroid of the two-component signal is not perfectly controlled. We recall that the Hilbert transform is the essential tool for getting the dispersion relations [21], which relate the real and imaginary parts of the response function (i.e., susceptibility) of a linear causal system. If we think of Ψ(x) as an absorption profile near resonance , we realize that the real part of susceptibility shows anomalous dispersion-linear slope-near the resonance: after squaring we then get a parabolic p(x, s) near the origin.
Let us ellaborate our proposal with the relevant example of a system characterized by a Gaussian PSF: Ψ(x) = (2πσ 2 ) −1/4 exp(− 1 4 x 2 /σ 2 ), where σ is an effective width that depends on the wavelength. Henceforth, we take σ as our basic unit length, so the corresponding magnitudes (such as separation, variance, etc) appear as dimensionless. Apart from its computational efficiency, the Gaussian PSF approximates fairly well the Airy distribution when the illumination is done by a Gaussian distribution that apodises the circular aperture.  Fig. 3. Fisher information about separation for imaging with a Gaussian PSF with (red concave line) and without (blue convex line) the signum filter. The asymptotic behavior of the superresolution given by the right hand side of Eq. (12) is also shown (broken red line).
The Fisher information associated with the direct imaging is obtained from Eq. (2); the result reads confirming once again the quadratic scaling of Rayleigh's curse. This is to be compared with the information accessible by signum-filter enhanced detection. We first perform the Hilbert transform of the Gaussian PSF; where D(z) denotes Dawson's integral [22]. Similar results have been reported for the dispersion relations of a Gaussian profile [23]. In particular, D(−z) = −D(z) and D(z) z(1 − 2 3 z 2 ) for z → 0, so the dominant behavior is indeed linear. Therefore, the expansion in Eq. (9) holds with α = (2π 3 ) −1/2 σ −3 and, in consequence, The detection probabilities and Fisher information densities typical for a signum-enhanced detection with a Gaussian PSF are shown in Fig. 2 for two different values of σ . Notice that nonzero separation is evidenced by nonzero readings at the center of the image. Interestingly, most of the information on the separation comes from detections near the origin and this region shrinks with decreasing s.
The superresolution potential of our technique is illustrated in Fig. 3. The linear scaling can provide big advantages in terms of the resources required to measure very small separations. For example, to measure a 10× smaller separation to a given precision requires 100× more detection events with the conventional setup, while just 10× would do with the new technique. For a fixed photon flux this translates into shorter detection times. At the same time, the new technique is simple to implement in existing imaging devices, such as telescopes, microscopes or spectrometers.
We have implemented the method with the setup sketched in Fig. 4. Two mutually incoherent equally bright point sources were generated with a controlled separation. After preparation, this signal was detected using a signum spatial-frequency filter, by which the original Gauss PSF is reshaped into the Dawson form as in Eq. (11).
A spatially coherent, intensity-stabilised Gaussian beam was used to illuminate a digital micro-mirror chip (DMD, Texas Instruments) with a mirror pitch of 10 µm. Two sinusoidal grating patterns with very close spatial frequencies were created by the DMD, which allows for a very precise control of the angular separation in a chosen diffraction order. Angular separations as small as 4.6 µrad were realized; these correspond to a linear separation of 0.042σ . To ensure incoherence, one pattern was ON at a time, while keeping the switching time well below the detector time resolution. Imaging with an objective of focal length f = 300 mm gave rise to two spatially separated Gaussian spots of σ = 33.2 µm. An aperture stop was used to cut-off unwanted diffraction orders. This completes the direct imaging stage.
In the signum-enhanced imaging part, a phase spatial light modulator (SLM) (Hammamatsu) with square pixels of 20 × 20 µm was operated in the Fourier plane of a standard 4 f optical system. The SLM implemented the signum mask hologram calculated as an interference pattern between a phase unit-step and a blaze grating, allowing for over 0.9 transfer efficiency. Finally, the output signal was measured by a CCD camera (Basler) with 7.4 × 7.4 µm pixels positioned at the output of the 4 f processor. Vertical 4-pixel binning was activated to reduce the readout noise, so the effective pixel size is 7.4 × 29.6 µm. The corresponding signals used in the reconstruction process, resulting from summing three pixels, were in the range of 120-253 photoelectrons, in comparison to a sum of 3 × 7 photoelectron readout noise. The camera exposure time was set to 100 ms to keep the dark noise contribution negligible.
Several separations, ranging from about 0.042σ (1.4 µm) to 0.18σ (6 µm), were measured. Two hundred intensity scans were recorded for each separation setting. One typical 2D scan is shown in the inset of Fig. 4. Since the two incoherent points are separated horizontally, no information about separation is lost by collecting pixel counts columnwise. The resulting 1D projections are samples from the theoretical intensity distribution p sgn (x, s), see Fig. 2. Notice that for small separations only the central parts of the projections contribute significant information. In particular all pixel columns, except the central one, can be ignored in the raw data in the inset of Fig. 4. Therefore, each 2D intensity scan is reduced to a single datum-the total number of detections in the central pixel column. We express the response of the real measurement by a second-order polynomial on the separation p(s) = a + bs 2 and estimate the coefficients from a best fit of the mean experimental detections. For each separation, we calculate the estimator mean s and variance(∆ s) 2 .
Experimental results are summarized in Figs. 5 and 6. Figure 5 compares the experimentally determined variances with the theoretical limits of the direct and signum-enhanced imaging for a Gaussian PSF and 434, 000 detections per measurement. Reciprocal quantities (precisions) are also shown. Signum-enhanced imaging clearly breaks the quadratic Rayleigh's curse in the whole range of measured separations, with an variance improvements up to 10× compared with the direct imaging. Notice also the apparent linear behavior of experimental precisions (red symbols) as compared to the quadratic lower bound predicted for the direct imaging (red broken line).
More estimator statistics is shown in Fig. 6. Experimental estimates are nearly unbiased and not much worse than the theoretical limit calculated for the finite pixel size used in the experiment. Engineering the PSF brings about reliable separation estimates in the region where direct imaging fails, as for example for separations s 0.07σ in Fig. 6.
In summary, we have demonstrated a robust experimental violation of Rayleigh's curse. Experimental imperfections prevent one from achieving the ultimate limit shown in Fig. 3. For larger separations, systematic errors and setup instability make important contributions to the total (small) error. For very small separations, the measured signal is very weak and background noise becomes the limiting factor. Further improvements are possible by optimizing the noise statistics and resolution of the camera.
Finally, one could wonder whether a different filter could yield a better scaling of the Fisher information using direct imaging. The dispersion relations suggest that this behavior is largely determined by the zeros of the PSF. Additional work is needed to explore all these issues, but the simplicity of the signum mask makes it very attractive for superresolution applications.
We acknowledge financial support from the Grant Agency of the Czech Republic   Fig. 6. Estimation of the separation from signum-enhanced imaging. Estimator means (dots) and standard deviations (error bars) are shown. Same statistics is provided for the best unbiased estimators from direct (blue lines) and signum-enhanced (red lines) imaging as given by the CRLB. The latter takes in account the finite pixel size used in the experiment.