Resolution-enhanced quantum imaging by centroid estimation of biphotons: supplementary material

The spatial resolution of an optical system is limited by diffraction. Various schemes have been proposed to achieve resolution enhancement by employing either a scanning source/detector configuration or a two-photon response of the object. Here, we experimentally demonstrate a full-field resolution-enhancing scheme, based on the centroid estimation of spatially quantum-correlated biphotons. Our standard-quantum-limited scheme is able to image a general non-fluorescing object, using low-energy and low-intensity infrared illumination (i.e., with <0.001 photon per pixel per frame at 710 nm), achieving 41% of the theoretically available resolution enhancement. Images of real-world objects are shown for visual comparison, in which the classically bound resolution is surpassed using our technically straightforward quantum-imaging scheme.


A. Camera settings
The EMCCD camera (Andor ULTRA 888) was operated as a single-photon detector, by using a single discriminating threshold (T = 510) above which an analogue count for a pixel would be considered as one detected event. The threshold T for binary detection of photons was chosen according to $T \geq 2\sigma_{\mathrm{readout}} + \mu_{\mathrm{readout}}$, where $\sigma_{\mathrm{readout}}$ and $\mu_{\mathrm{readout}}$ are the standard deviation and the mean of the electronic readout-noise of the EMCCD camera, measured in ADC counts [1][2][3]. The detailed acquisition settings are reported here: −95 °C cooling (15 °C refrigerating liquid); frame transfer enabled; 0.039733 s exposure time; 0.040908 s cycle time; kinetics frame transfer mode; 356 × 356 pixel frame size; 1.13333 µs vertical shift speed; 10 MHz pixel readout rate; baseline clamp on; normal clock amplitude; 1000 electron multiplying gain level; 1.0 preamplifier gain.
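The thresholding rule above can be illustrated with a minimal sketch. It assumes that the readout noise is characterised from a list of dark ADC counts; the function names and the sample values are hypothetical and are not taken from the experiment.

```python
import statistics

def binary_threshold(readout_samples, k=2.0):
    """Smallest threshold allowed by T >= k*sigma_readout + mu_readout (ADC counts)."""
    mu = statistics.fmean(readout_samples)
    sigma = statistics.pstdev(readout_samples)
    return mu + k * sigma

def binarise(frame, threshold):
    """Convert an analogue frame (rows of ADC counts) into 0/1 detection events."""
    return [[1 if count > threshold else 0 for count in row] for row in frame]

# Hypothetical readout-noise samples, for illustration only (not measured values).
dark_samples = [498, 502, 500, 495, 505, 500, 497, 503]
T = binary_threshold(dark_samples)

# Hypothetical 2 x 2 analogue frame: only the 650-count pixel exceeds T.
frame = [[499, 650], [505, 501]]
events = binarise(frame, T)
```

Any threshold at or above the returned value satisfies the inequality, so a fixed setting such as T = 510 can be checked against it.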

B. Resolution enhancement
A broadening of the FWHM of the MTF corresponds to a narrowing of the point spread function (PSF) [4]. In the case of our standard-quantum-limited centroid estimation of biphotons, the theoretically attainable broadening of the MTF is a factor of ≈ 1.41 (i.e. a factor of $\sqrt{N} = \sqrt{2}$, where N = 2 is the number of jointly-detected photons), which corresponds to a narrowing of the PSF by ≈ 29% (i.e. a factor of $1/\sqrt{2}$). By comparing the experimentally determined MTF curves, our scheme was found to achieve a 17% ± 1.2% MTF advantage, corresponding to a 12% resolution enhancement in terms of the narrowing of the FWHM of the PSF, which is 41% of the theoretically available resolution enhancement, as set by the standard quantum limit.
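The percentages quoted above can be related to one another with a short calculation. This is one possible reading, under the assumption that the achieved fraction is evaluated on the MTF-broadening scale and then applied to the theoretical PSF narrowing; the source does not spell out this step, and the variable names are illustrative.

```python
import math

N = 2  # number of jointly-detected photons

# Theoretical standard-quantum-limited figures.
mtf_gain_theory = math.sqrt(N) - 1             # MTF broadening, about 0.414
psf_narrowing_theory = 1 - 1 / math.sqrt(N)    # PSF narrowing, about 0.293

# Measured MTF advantage reported in the text.
mtf_gain_measured = 0.17

# Assumed reading: fraction of the available enhancement that was achieved.
fraction_achieved = mtf_gain_measured / mtf_gain_theory            # about 0.41
psf_narrowing_achieved = fraction_achieved * psf_narrowing_theory  # about 0.12
```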

C. Simulated model
A simulated model based on our theoretical description was used to investigate the influence of the detection efficiency and of the detector's mean number of noise events on the performance of our resolution-enhancing imaging scheme. The detector's mean number of noise events parameter is the sum of dark-counts and clock induced charge. The detection efficiency parameter (or total effective quantum efficiency) refers to the product of the quantum efficiency (QE) of the electron-multiplying CCD (EMCCD) (including the losses induced by applying a binary threshold to the frames) and the transmission of the optical elements after the crystal. The optical system was simulated by generating sparse binary frames, like those retrieved from the EMCCD camera after applying an event-discriminating threshold, starting from the following experimentally determined parameters: detection efficiency; mean number of noise events; mean number of biphotons/photons per frame; transverse correlation-width of biphotons in the plane of the object; Gaussian beam-size; diffraction limited spot-size; position, orientation and size of the resolution test-target. The bottom-up approach of our model allows us to simulate the detected arrival positions of individual photons, accounting for their interaction with both an object and a non-ideal imaging system placed between the object and the detector. A statistically significant number of simulated frames (1 million per modulation transfer function (MTF) curve) was processed using the same treatment as for the experimentally generated frames, eventually producing MTF curves of a synthetic slanted-edge object, for different simulated scenarios.
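As an illustration of how sparse binary frames of this kind might be generated, the following is a heavily simplified sketch and not the model used in this work. It assumes a fixed number of biphotons per frame, a uniform illumination profile, independent detection of each photon with probability equal to the detection efficiency, and uniformly distributed noise events; the object, the Gaussian beam and the diffraction-limited spot are omitted. All names and parameter values are hypothetical.

```python
import math
import random

def simulate_frame(size, n_pairs, corr_width, qe, noise_per_pixel, rng):
    """Return one sparse binary frame as a set of (x, y) pixel events."""
    events = set()
    for _ in range(n_pairs):
        # Biphoton position, drawn uniformly over the frame (assumption).
        cx, cy = rng.uniform(0, size), rng.uniform(0, size)
        for _ in range(2):
            # Each photon is displaced according to the correlation width ...
            x = math.floor(rng.gauss(cx, corr_width))
            y = math.floor(rng.gauss(cy, corr_width))
            # ... and is detected independently with probability qe.
            if rng.random() < qe and 0 <= x < size and 0 <= y < size:
                events.add((x, y))
    # Spatially uncorrelated noise events (dark counts, clock induced charge).
    for x in range(size):
        for y in range(size):
            if rng.random() < noise_per_pixel:
                events.add((x, y))
    return events

# Example: a 64 x 64 frame with experiment-like QE and noise level.
frame = simulate_frame(64, 5, 0.7, 0.35, 0.0055, random.Random(1))
```

Representing a frame as a set of coordinates keeps the data sparse and enforces the limit of at most one event per pixel.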
Firstly, the influence of the detection efficiency on the computed MTF-curves was compared for ideal QE (i.e. QE=100%) and experiment-like QE (i.e. QE=35%), as shown in Fig. S1. In order to isolate the effects of non-ideal QE, the number of noise-events in the simulated frames was fixed to zero. Whereas in the case of the simple average of all frames a lower QE left the MTF-curves unchanged, the resolution of images produced by centroid estimation of biphotons was worse. In fact, a non-ideal QE translates into the presence of uncorrelated events and therefore into a number of erroneously estimated centroids. It should also be noted that the degradation of the MTF was not as sharp as one would expect for a QE=35%. This is due to the strong rejection of ambiguous event-pairs enforced by our biphoton-finding algorithm.
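To give a sense of why a non-ideal QE produces uncorrelated events, the following sketch computes how often both, one or neither photon of a biphoton is detected. It assumes that the two photons are detected independently, each with probability equal to the total effective QE; the function name is illustrative.

```python
def pair_detection_probabilities(qe):
    """Probabilities that both, exactly one, or neither photon of a pair is detected."""
    both = qe * qe
    one = 2 * qe * (1 - qe)
    none = (1 - qe) ** 2
    return both, one, none

# Experiment-like total effective QE of 35%.
both, one, none = pair_detection_probabilities(0.35)
# both = 0.1225, one = 0.455, none = 0.4225
```

Under this assumption, single surviving photons outnumber intact pairs at QE=35%, which is why rejecting ambiguous event-pairs matters.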

Fig. S1. The influence of the detection efficiency (or total effective QE) on the resolution enhancement. The solid MTFs were computed for ideal QE, whereas the dotted MTFs were computed for QE=35% (i.e. matching the experimental conditions). In order to isolate the role of QE, the mean number of noise events of the simulated frames was set to zero.
A quantitative analysis based on our model was also carried out to identify what level of optical loss the centroid estimation of biphotons is able to withstand. A series of reconstructed images of the slanted-edge was produced for increasing levels of optical loss (here incorporated in the total effective QE parameter of the model), while keeping constant both the overall light-level and the total number of estimated centroids in the final reconstructed images. This last requirement is important to make sure that reconstructed images for different levels of loss have the same fill-factor (measured in centroids per pixel). From these reconstructed edges we computed the slanted-edge MTFs, which were used to extract the percentage of achieved advantage over the theoretically available $1/\sqrt{2}$ standard quantum limit. The results are shown in Fig. S2 for a range of total effective QE spanning from 100% to 6%. As expected, the achievable resolution advantage is adversely affected by increasing levels of loss. This dependence is readily understood by considering the final reconstructed image as comprising two contributions: 1) the resolution-enhanced coordinates of actual estimated centroids, associated with detected biphotons; 2) the classically-limited (i.e. without a resolution enhancement) coordinates of accidentally detected centroids, associated with detected events that are spatially uncorrelated.
Fig. S2. The achieved percentage of theoretically available advantage was computed from simulated frames for decreasing levels of total effective QE (i.e. for increasing levels of loss) as shown in (a). Simulated frames of a slanted-edge object were used to reconstruct centroid estimated images, which were then used to compute the MTF curves and the associated resolution-enhancements. The extra frames required to maintain a constant number of detected centroids for higher levels of spatially uncorrelated events are shown in (b). In order to isolate the role of QE, the mean number of noise events of the simulated frames was set to zero.
Secondly, the influence of the detector's mean number of noise-events on the computed MTF-curves was compared for ideal noise-free and experiment-like noise (i.e. noise=0.0055 events per pixel per frame), as shown in Fig. S3.
In order to isolate the effects of noise, the QE of the simulated frames was fixed to 100%. The presence of noise events was found to have a general negative impact on the resolution of the reconstructed images, as shown by the dotted MTF-curves in Fig. S3. In the case of reconstructed images using our centroid estimation of biphotons, the presence of uncorrelated noise-events causes a number of erroneously estimated centroids to be detected, as shown by the dotted-blue MTF in Fig. S3. In the case of reconstructed images using the classical simple average of all events, the degradation in resolution associated with the presence of noise-events is due to a higher background intensity. In other words, the resolution of the reconstructed images, assessed by computing the slanted-edge MTF, was found to be adversely affected by the presence of randomly distributed noise. This salt-and-pepper noise present in the individual frames resulted in a higher background intensity of the final reconstructed images, especially in the case of the classical simple average of all detected events, as shown by the dotted-red MTF in Fig. S3.
The significance of this model in relation to the attainable resolution-enhancement of our imaging scheme lies in its ability to reliably predict the effects of the performance of the detector and of a non-ideal imaging system placed between the object and the detector, as well as in allowing the effects of different biphoton-finding algorithms to be explored.

D. Reconstructed images with fewer frames (i.e. 50,000 frames)
In Figure S4 we show resolution-enhanced images reconstructed using 50,000 frames (i.e. 5% of the number used in the images reported in the main manuscript). According to the specifications of our camera and using the optimised settings listed in the methods section of the main manuscript, we are able to estimate how long it would take to produce a final reconstructed image made of 50,000 binary frames for differently sized canvases: 256 × 256 pixel² and 128 × 128 pixel². The estimated times are respectively 9 and 3 minutes and correspond to 92.7 and 282 frames per second (fps). Such elevated acquisition rates are made possible by the 'optically-centred crop-mode' of our EMCCD camera.
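The quoted acquisition times follow directly from the frame count and the frame rates, as the short check below shows. The function name is illustrative; the numbers are those given above.

```python
def acquisition_minutes(n_frames, fps):
    """Time in minutes needed to acquire n_frames at a given frame rate."""
    return n_frames / fps / 60.0

t_256 = acquisition_minutes(50_000, 92.7)  # 256 x 256 canvas, roughly 9 minutes
t_128 = acquisition_minutes(50_000, 282)   # 128 x 128 canvas, roughly 3 minutes
```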
A resolution advantage using our centroid estimation of biphotons is apparent in the MTF curves, although the noise floor becomes larger at high spatial frequencies, due to the higher pixel-to-pixel noise of the sparse centroid-estimated reconstructed images. The MTF curves generated using 50,000 frames are shown in Fig. S5, where the sparsity of the centroid-estimated images is responsible for a greater noise-floor in the blue and green curves.
This noise-induced artefact was confirmed by a simulated model, in which different levels of shot-noise were added to the image of a slanted edge, as shown in Fig. S6.
The amount of shot-noise was quantified by computing a signal-to-noise-ratio (SNR) metric, chosen to be the ratio of the mean $\bar{I}$ and the standard deviation SD(I) of the pixel intensities, over a uniform area of the slanted-edge, according to: $\mathrm{SNR} = \bar{I}/\mathrm{SD}(I)$. More specifically, the image of a slanted edge with perfect SNR (i.e. for which SD(I) = 0 and SNR → ∞) was deteriorated by iteratively adding 10% of the maximum intensity to pixels picked at random. Thus, a higher number of iterations resulted in greater noise and hence a lower SNR.
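The SNR metric and the noise-adding procedure described above can be sketched as follows. This is a minimal illustration, assuming that the uniform area is stored as a flat list of pixel intensities and that the added step is 10% of the maximum intensity of the noise-free image; the function names and the example values are hypothetical.

```python
import random
import statistics

def snr(pixels):
    """Ratio of the mean to the standard deviation of the pixel intensities."""
    sd = statistics.pstdev(pixels)
    return float("inf") if sd == 0 else statistics.fmean(pixels) / sd

def add_noise(pixels, iterations, rng, step_fraction=0.10):
    """Add step_fraction of the maximum intensity to randomly picked pixels."""
    noisy = list(pixels)
    step = step_fraction * max(pixels)  # fixed step, from the noise-free image
    for _ in range(iterations):
        noisy[rng.randrange(len(noisy))] += step
    return noisy

# Hypothetical uniform area of 400 pixels with perfect SNR.
uniform_area = [100.0] * 400
rng = random.Random(0)
lightly_noisy = add_noise(uniform_area, 50, rng)
heavily_noisy = add_noise(uniform_area, 5000, rng)
```

Comparing `snr(lightly_noisy)` with `snr(heavily_noisy)` shows the drop in SNR as more iterations are applied.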
The MTF curves are shown in Fig. S7, confirming that high levels of shot-noise cause a higher noise-floor in the MTF at high spatial frequencies, as well as a marginal degradation in resolution.

E. Calculation of acquisition time for an idealised camera
The acquisition time of our system is currently limited by both the detection efficiency and the frame-rate of our detector. A low detection efficiency means that, in order to improve the ability to unambiguously detect biphotons, it is necessary to operate with very few photons per frame. This issue is exacerbated by the presence of dark events (caused by clock induced charge and thermalised electrons), which in a binary detection scheme constitute accidental detections. Although these current technical limitations do not prevent the proof of principle of our resolution-enhancing scheme, they are enough to affect its performance, both in terms of the absolute advantage and in terms of the time required to generate a resolution-enhanced image.
Here we estimate the time required to acquire enough frames for a resolution-enhanced image with a pixel-to-pixel signal-to-noise ratio (SNR) of 10. We assume a frame size of 128 × 128 pixels, and an acquisition rate of 282 frames per second (fps), currently achievable by our EMCCD camera using the frame-centred crop-mode and our optimised acquisition settings. Moreover, we assume a 100% detection efficiency and the absence of dark events. We still assume that the camera can only detect at most one event per pixel, and accept that according to our pair-finding algorithm a safety margin of 2 pixels must be ensured around each event-pair in a 3 × 3 pixel kernel.
According to this description, the footprint, in pixels (p), required to unambiguously detect a 2 × 2 event-pair is 36 p², whereas for a 3 × 3 event-pair it is 49 p². Thus, the numbers of potentially detectable event-pairs in a 128 × 128 p² frame are 455 and 334 respectively, which also correspond to the same numbers of estimated centroids (c), according to: $\frac{128 \times 128\,\mathrm{p}^2}{36\,\mathrm{p}^2} \approx 455$ and $\frac{128 \times 128\,\mathrm{p}^2}{49\,\mathrm{p}^2} \approx 334$. We assume that a Gaussian distribution can satisfactorily approximate the transverse correlation lengths of detected biphotons. According to the properties of our system, we also assume 68% of detected centroids to be from 2 × 2 event-pairs and 100% − 68% = 32% of detected centroids to be from 3 × 3 event-pairs. Accordingly, the total number of potentially detectable centroids per frame is 419. This implies that the number of frames required for at least 1 centroid per pixel in the reconstructed image is 39, according to: $\frac{128 \times 128}{419} \approx 39$.
Fig. S4. Image resolution comparison using 50,000 frames. A resolution-enhancement is still visible even though the presence of shot-noise in the centroid-estimated images makes a visual comparison of the spatial resolution more subjective. Features are mapped over the differently sized images using the coloured arrows.
We assume a satisfactory image to have an SNR of 10, i.e. to have at least 100 centroids per pixel in the reconstructed image, since in the case of a perfect detector with zero noise, Poissonian shot-noise is the only source of noise and the SNR scales as the square root of the number of centroids per pixel. It follows that the number of required frames for such an image is 39 × 100 = 3,900 frames.
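The arithmetic of this estimate can be reproduced with a short script. It is a sketch that takes the figure of 419 centroids per frame and the 282 fps frame rate as given in this section, and assumes Poissonian statistics so that the SNR equals the square root of the number of centroids per pixel; the variable names are illustrative.

```python
frame_pixels = 128 * 128

# Event-pairs that fit in one frame, from the 36 p^2 and 49 p^2 footprints.
pairs_2x2 = frame_pixels // 36  # 455
pairs_3x3 = frame_pixels // 49  # 334

# Total detectable centroids per frame, taken as quoted in the text.
centroids_per_frame = 419

# Frames needed for about one centroid per pixel in the reconstructed image.
frames_for_one = round(frame_pixels / centroids_per_frame)  # 39

# Poissonian shot-noise: SNR = sqrt(centroids per pixel), so SNR 10 needs 100.
target_snr = 10
centroids_per_pixel = target_snr ** 2
frames_needed = frames_for_one * centroids_per_pixel  # 3900

# Acquisition time at the 282 fps frame rate assumed in this section.
seconds = frames_needed / 282  # about 13.8 s
```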
According to the currently attainable frame-rate of our camera of 282.25 fps, the time required for an image with an SNR = 10 is about 13 seconds (s), as shown below: $\frac{3900\,\mathrm{f}}{282\,\mathrm{fps}} \approx 13.8\,\mathrm{s}$. We have shown that for an ideal camera with 100% detection efficiency and zero dark-count, a 128 × 128 resolution-enhanced image with an SNR = 10 may be acquired in about 13 s. This time may be reduced further using current state-of-the-art single-photon avalanche diode array detectors, which have a detection efficiency similar to current EMCCD cameras, and have been shown to operate at 100,000 frames per second, for a frame size of 256 × 256 pixels [5]. With such a device the acquisition of 50,000 frames would only take 0.5 s, enabling sub-second acquisition of resolution-enhanced images.
Fig. S5. Quantitative assessment of resolution via slanted-edge MTF, using 50,000-frame datasets. The resolution of our centroid estimation of biphotons is compared against the resolution of an equivalent classical imaging system, revealing an enhancement. The error bars were computed over ten datasets, each comprising 50,000 frames. Each 50,000-frame dataset can be acquired in less than 3 minutes, using the 'optically-centred crop-mode' of the EMCCD camera.
Fig. S6. The pixel-to-pixel SNR of the images is degraded (left to right) according to an increasing amount of shot-noise, added to randomly chosen pixels.
Fig. S7. Link between shot-noise and higher noise-floor of the slanted-edge MTF curves. Shot-noise is measured in terms of pixel-to-pixel intensity fluctuations, here shown by different SNR values. The MTF curves for images with high shot-noise (i.e. low SNR) are characterised by a higher noise-floor.