Single photon imaging and sensing of obscured objects around the corner

Non-line-of-sight (NLOS) optical imaging and sensing of objects imply new capabilities valuable to autonomous technology, machine vision, and other applications. Existing NLOS imaging methods rely heavily on the prowess of computational algorithms to reconstruct the images from weak triply scattered signals. Here, we introduce a new approach to NLOS imaging and sensing using picosecond-gated single photon detection generated by quantum frequency conversion. With exceptional signal isolation, this approach can reliably sense obscured objects around the corner and substantially simplify the data processing needed for position retrieval and surface profiling. For each pixel, only $4 \times 10^{-3}$ photons need to be detected per pulse to position and profile occluded objects with high resolution. Furthermore, the vibration frequencies of different objects can be resolved by analyzing the photon number fluctuation received within a ten-picosecond window, allowing NLOS acoustic sensing. Our results highlight the prospect of photon-efficient NLOS imaging and sensing for real-world applications.

Introduction
The capacity of optical detection and imaging technology is ever expanding to keep pace with emerging autonomous technology and evolving sensing needs. In particular, the desire to see around corners has attracted much research interest from various fields, with the prospect of unlocking new imaging modalities over a breadth of applications, such as non-line-of-sight (NLOS) imaging and NLOS tracking for machine vision and sensing, autonomous driving, and biomedical imaging 1 . The ability to sense, track, and image occluded objects with sufficient resolution and accuracy is valuable for autonomous technology and machine vision when direct line-of-sight is prohibited or split second decision-making is needed for preemptive safety measures 2, 3 . Practical NLOS imaging and sensing is an interdisciplinary problem at the intersection of physics, optics, and signal processing. It requires a sophisticated optical measurement system for capturing information-carrying photons, combined with an appropriate light transport model for efficient and reliable reconstruction of hidden scenes with reasonable computational overhead.
Over the last decade, we have witnessed considerable progress in NLOS imaging and sensing based on advanced measurement systems such as streak cameras 4 , single-photon sensitive avalanche diodes (SPADs) [5][6][7][8][9] , and interferometric detection [10][11][12][13] . Leveraging their high sensitivity, a diverse toolbox of NLOS reconstruction algorithms has been developed based on various light transport models for recovering hidden scenes 4,6,14,15 . Nevertheless, those methods, with perhaps a few exceptions, require a priori information on the hidden scene and are hence restricted to imaging hidden and obscured objects of known geometry. Arguably, all those NLOS reconstruction approaches rely on faithful measurements of the optical signals that carry the information of the scene.
In a typical NLOS scenario, the probe laser beam first bounces off a single point on a diffusive wall, with some photons redirected towards the hidden scene. A small portion of those photons are back-scattered by the scene and redirected again by the wall to reach a detector. The detector can consist of a separate receiver, which captures photons from a different point on the wall. It can also use a transceiver to capture those from the same point, for coaxial NLOS measurement. In either case, as the intensity of light scattered from a diffusive surface is bounded by the inverse-square (distance) law, the triply-bounced information-carrying photons in such a NLOS scenario decay several orders of magnitude faster and arrive amid much brighter photons returning directly from the wall. Many existing optical NLOS imaging and NLOS tracking systems, built around a single-pixel 7,8,16 or 2D 4-6 single-photon detector, capture the back-scattered photons using a separate receiver to avoid receiving the photons returning directly from the wall, which may saturate the single photon detector and cause the pile-up effect 17 . However, NLOS with a separate receiver has to aggressively illuminate and image pairs of distinct points on the wall for the time-resolved single photon detection 18 . On the other hand, coaxial NLOS systems, which use a monostatic single-transceiver setup, benefit from a straightforward geometrical relation between the time-of-flight measurement and the hidden scene. They can utilize simpler algorithms with much less computational complexity, such as the light-cone transformation, to reconstruct the scene 14,19 . The drawback, however, is the strong pile-up effect. To avoid this issue, in the previous confocal NLOS setup, the targeted object was placed far away from the wall, and the receiving field-of-view of the SPAD was carefully aligned to be slightly off the illumination point of the outgoing probe beam on the wall 14 . Also, there was no obscurant between the targeted object and the wall, because any obstacle in front of the object will further attenuate the information-carrying photons while also increasing background photons that are hard to reject based on time-of-flight, complicating the reconstruction of the hidden scene 20-24 . The above restrictions pose a significant challenge in practical NLOS imaging and sensing, preventing deployment in complex scenes where obscurants may be present.
Here, we aim to overcome this challenge by distinguishing the information-carrying photons from overwhelming background photons in a coaxial NLOS setting, and we introduce a new optical detection modality for NLOS imaging and sensing. We demonstrate a single-pixel NLOS imaging and sensing system based on time-correlated single-photon counting through nonlinear optical gating 25 . It achieves an absolute 10 ps temporal resolution for object imaging, positioning, and surface normal retrieval, as well as vibration sensing of highly obscured objects around the corner. Our method employs highly efficient and low-noise quantum frequency conversion of single photons in a nonlinear waveguide, where a time-correlated 6-ps pump pulse acts as a narrow nonlinear optical gate to up-convert 6-ps signal photons via sum-frequency generation (see supplementary 1). Crucially, the pump pulses create a high-extinction picosecond photon detection window much narrower than the timing jitter of the detector and its associated electronics, thus realizing a mechanism to isolate and distinguish information-carrying NLOS photons from background photons that are usually several orders of magnitude stronger 25,26 . This picosecond photon detection window minimizes the photon count distortion from pile-up and detector saturation, which otherwise plague applications based on conventional single photon detection 27,28 . The system thus provides picosecond-precision recovery of NLOS objects even in highly obscured scenarios. The capability of exclusively capturing the photons from the target opens the path towards NLOS vibration sensing, which will underpin the prospect of new hybrid imaging modalities, such as acousto-optic imaging or photoacoustic remote sensing, for NLOS applications 29 .
The proof-of-principle experiments demonstrating NLOS imaging, positioning and sensing of highly obscured objects are shown in Fig. 1. The setup consists of a mode-locked laser (MLL), a micro-electromechanical-system (MEMS) scanning mirror, a single-mode fiber (SMF) coaxial optical transceiver, a programmable optical delay line (ODL) and a silicon SPAD. The scene around the corner is realized by using a 2-inch diameter metallic diffuser as the wall and an aluminum mesh (1 mm diameter wire grid with 2 x 2 mm openings) as the obscurant in front of the hidden object. A probe pulse train derived from the MLL is collimated and sent out via the transceiver, and steered by the MEMS mirror to perform a 2D raster scan over different points on the diffuser. After three bounces, a small number of the information-carrying NLOS photons are scattered back in the retro direction and coupled into the coaxial transceiver. Meanwhile, the pump pulse train, derived from the same MLL and synchronous with the probe, is sent through the ODL for a temporal scan along the depth dimension to facilitate time-resolved photon counting with high resolution. It is combined with the received photons in a dense wavelength-division multiplexer (DWDM) and coupled into a quasi phase-matched nonlinear waveguide for frequency up-conversion. Only when the received photons are temporally aligned with the pump can the up-conversion process achieve high efficiency. Subsequently, the up-converted photons are detected by a silicon SPAD. The entire system thus realizes nonlinear gated single photon detection (NGSPD) 2 , which distinguishes the information-carrying NLOS photons while rejecting the photons scattered back directly from the diffuser or obscurant even in this coaxial transceiver setup.

NLOS imaging of highly obscured target
To probe and image the targeted scene, the MEMS scanning mirror steers the probe laser beam for raster scanning 32×32 points on the diffuser, while recording the photon count as a function of temporal delay of the pump at each scanning point. This results in a temporally resolved 3-dimensional photon count array whose axes are x, y (scanning coordinates on the diffuser) and t (relative temporal delay of the pump).
The 3-dimensional photon count array is then processed to reconstruct the NLOS scene.
Prior to image reconstruction, we pre-process the raw data by first compensating the relative time-of-flight difference caused by the tilt angle of the diffuser, since the time-of-flight from the transceiver to different scanning points on the diffuser varies with optical path (see supplementary Note 3). Then the time-resolved photon counting histogram at each scanning point is filtered individually by minimizing a one-dimensional convex target function with the CVX toolbox 31,32 for Matlab. In the target function, $y$ is the time-resolved histogram measurement (Fig. 2(c) as an example) at one scanning point, $e$ is the average background noise level, $x$ is the filtered time-resolved histogram (the target), and $A$ is the impulse response matrix of a single-point object, where the impulse response of the system is measured to be 10 ps FWHM (see supplementary Fig. 2). This optimization procedure has a form similar to compressive sensing recovery, and it removes the background noise due to the intrinsic dark count of the NGSPD and the ambient light. The term $\lambda \|x\|_1$ is added as an $\ell_1$ regularizer to prevent over-fitting of the processed data, and $\lambda$ is set to a low value (0.1) to preserve the signal response and avoid overly sparsifying the target. Subsequently, the targeted scene can be recovered from the processed data by using the 3-dimensional reconstruction algorithm based on the light-cone transformation 14 .
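The filtering step can be illustrated with a minimal sketch. The target function itself is not reproduced in this text, so the sketch assumes a common $\ell_1$-regularized least-squares form, $\min_{x \ge 0} \tfrac{1}{2}\|Ax + e - y\|_2^2 + \lambda \|x\|_1$, and solves it with a plain proximal-gradient (ISTA) loop in Python instead of CVX for Matlab. The objective form, the non-negativity constraint, the solver, the 2 ps bin width and the synthetic histogram are all assumptions made for illustration; only the 10 ps FWHM impulse response and $\lambda = 0.1$ come from the text.

```python
import numpy as np

def gaussian_irf_matrix(n_bins, bin_ps, fwhm_ps=10.0):
    """Matrix A whose column k is a Gaussian impulse response centred on bin k."""
    sigma = fwhm_ps / (2.0 * np.sqrt(2.0 * np.log(2.0)))  # FWHM -> standard deviation
    t = np.arange(n_bins) * bin_ps
    return np.exp(-0.5 * ((t[:, None] - t[None, :]) / sigma) ** 2)

def l1_filter(y, A, e, lam=0.1, n_iter=500):
    """ISTA for min_x 0.5*||A x + e - y||_2^2 + lam*||x||_1 subject to x >= 0.

    The objective is an assumed form, not taken from the paper."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1 / Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x + e - y)
        # Gradient step, then soft-threshold and clip to non-negative values.
        x = np.maximum(x - step * (grad + lam), 0.0)
    return x

# Synthetic, noise-free histogram (hypothetical values): one reflector at bin 30
# on top of a constant background level e.
n_bins, bin_ps = 60, 2.0
A = gaussian_irf_matrix(n_bins, bin_ps)
x_true = np.zeros(n_bins)
x_true[30] = 50.0
e = 5.0
y = A @ x_true + e
x_hat = l1_filter(y, A, e)
print("recovered peak bin:", int(np.argmax(x_hat)))
```

Bins far from the reflector come back as exactly zero because the constant background is modeled explicitly and the $\ell_1$ term suppresses small residuals, which is the behavior the text attributes to the filter.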
A typical retroreflective 14 arrowhead is used as the imaging target, shown in the inset of [...] to the transceiver is blocked. We first perform the NLOS imaging as is, and afterward insert the obscurant about 1 cm in front of the target. The obscurant blocks a considerable amount of the information-carrying photons from the target while inducing substantial back-scattered photons ahead of them, and is thus likely to conceal the target from non-gated single photon detection 14,27 . Utilizing NGSPD to negate the drawbacks due to the obscurant, we are able to reconstruct the image of the NLOS arrowhead behind the obscuring aluminum mesh in close agreement with the actual arrowhead, as shown in Fig. 2. Considering that the temporal resolution of the NGSPD is $\Delta t \approx 10$ ps, the spatial resolution of this coaxial NLOS imaging system based on NGSPD can be estimated as $\Delta w = \frac{c\sqrt{w^2+z^2}}{2w}\,\Delta t \approx 1.1$ cm 14 , where $z$ is the distance from the diffuser to the object, $w$ is half of the spatial scanning range on the diffuser and $\Delta t$ is the temporal resolution. As the size of the arrow is on the centimeter scale, it is remarkably well resolved in the reconstructed images except at the sharp tips of the arrow, where the feature size is well below 1 cm. The total acquisition time for one image is about 15 minutes at a rate of 10 ms dwell time per delay point.
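The quoted resolution and acquisition figures can be cross-checked with a short calculation. The sketch below evaluates the lateral resolution expression $c\sqrt{w^2+z^2}\,\Delta t/(2w)$ and the number of delay points per pixel implied by a 15 minute scan of 32×32 points at 10 ms per delay point. The values of `w` and `z` are hypothetical, chosen only because they give a result close to 1.1 cm; the text does not state them. The delay-point count ignores any scanning overhead.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def lateral_resolution(w, z, dt):
    """Lateral resolution c*sqrt(w^2 + z^2)/(2w)*dt for half-scan-range w and depth z."""
    return C * math.sqrt(w ** 2 + z ** 2) / (2.0 * w) * dt

def delay_points_per_pixel(total_s, n_pixels, dwell_s):
    """Delay points per pixel implied by total time, pixel count and dwell time."""
    return total_s / (n_pixels * dwell_s)

# Hypothetical geometry (not from the paper): w = 2 cm, z = 14.5 cm.
resolution_m = lateral_resolution(w=0.02, z=0.145, dt=10e-12)
points = delay_points_per_pixel(total_s=15 * 60, n_pixels=32 * 32, dwell_s=10e-3)
print(f"lateral resolution: {resolution_m * 100:.2f} cm")
print(f"delay points per pixel: {points:.1f}")
```

Under these assumptions the scan corresponds to roughly 88 delay points per pixel, so a 10 ps delay step would cover a depth window of a little under 1 ns.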

NLOS position and orientation retrieval of obscured targets
Identifying the position and surface normal of the obscured NLOS target requires the capability of isolating or identifying the information-carrying photons from the target rather than from the obscurant 5,9 . This can be achieved via NGSPD, by acquiring a pristine, picosecond-resolved histogram of photon arrival times.
In this experiment, we place two retroreflective bars 4 cm apart, both about 12 cm in front of the diffuser, with the obscurant in between, as shown in Fig. 3(a). We scan the probe on the diffuser along a single horizontal row of points and record the histogram of photon arrival times. The NGSPD temporally resolves the back-scattered NLOS photons from different objects with a small time-of-flight difference, which enables retrieving each bar's position. An example of the time-resolved histogram at one scanning point is shown in Fig. 3(c), where the NLOS photon counts from the two target bars are isolated from the obscurant and clearly distinguishable despite being separated by only about 60 picoseconds. To best assess the capability of the NGSPD in locating the obscured targets, we use two 5 mm wide bars whose width is smaller than the spatial resolution of the system. This width minimizes the "long tail" in the histogram attributed to late-arriving photons back-scattered off the target. With the narrow bar width, the first returning photon counting peaks can also be identified for estimating the nearest distance from the bars to the diffuser with minimal ambiguity 34 . Meanwhile, the simple coaxial single-transceiver setup uses the same scanning point on the diffuser for both illumination and photon capture, giving a simpler spherical geometry for the light path between the diffuser and the object, rather than an ellipsoidal geometry 5 , for a given coordinate (x, z) on the plane. In this evaluation, the ensemble of probable positions for the object $r_{oj}$ retrieved from one scanning point $r_{di}$ forms a spherical surface centered at $r_{di}$ with radius $c t_{ei}/2$. With N scanning points, N probability distribution spheres are defined.
The (x, z) point with the least sum-square distance to all the spheres, i.e. the minimum of err(x, z), gives the most probable position for the object. The simple geometry of the first-returning photons follows from the coaxial single-transceiver setup, in contrast to the separate-receiver case 35 . We use the joint probability density 5 of the least-sum-square to approximate the position of the two bars. Since the NGSPD system has a Gaussian-like impulse response with 10 ps FWHM, the joint probability density is approximated in Gaussian form as [...], where $\sigma_t$ is the standard deviation of the time-resolved measurement, approximated as FWHM/2 = 5 ps. The joint probability of the object position on the x-z plane is labeled in [...] difference of the first-arriving peak of the two bars is 60 ps, as observed in Fig. 3(c).
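The position retrieval described above can be sketched as a grid search. The exact definition of err(x, z) and the Gaussian density are not reproduced in this text, so the sketch assumes err(x, z) is the sum over scanning points of the squared difference between the distance to the scanning point and the sphere radius $c t/2$, and that the density is $\exp(-\mathrm{err}/2\sigma_r^2)$ with $\sigma_r = c\sigma_t/2$. Both forms, the function names, the scan geometry and the target position are assumptions for illustration; arrival times are taken relative to the diffuser surface.

```python
import numpy as np

C = 299_792_458.0  # speed of light in m/s

def sphere_misfit(scan_x, t_first, xs, zs):
    """Assumed err(x, z): summed squared distance from (x, z) to each sphere."""
    X, Z = np.meshgrid(xs, zs, indexing="ij")
    err = np.zeros_like(X)
    for xd, t in zip(scan_x, t_first):
        dist = np.hypot(X - xd, Z)        # distance from scan point (xd, 0) to (x, z)
        err += (dist - C * t / 2.0) ** 2  # sphere radius is c*t/2 (round trip)
    return err

def locate(scan_x, t_first, xs, zs):
    """Grid point with the smallest misfit to all spheres."""
    err = sphere_misfit(scan_x, t_first, xs, zs)
    i, j = np.unravel_index(np.argmin(err), err.shape)
    return xs[i], zs[j]

def gaussian_density(err, sigma_t):
    """Assumed Gaussian form of the joint density, normalized over the grid."""
    sigma_r = C * sigma_t / 2.0
    p = np.exp(-(err - err.min()) / (2.0 * sigma_r ** 2))
    return p / p.sum()

# Hypothetical noise-free example: one bar at (1 cm, 12 cm), nine scan points.
scan_x = np.linspace(-0.02, 0.02, 9)
true_x, true_z = 0.01, 0.12
t_first = 2.0 * np.hypot(true_x - scan_x, true_z) / C
xs = np.linspace(-0.05, 0.05, 101)  # 1 mm grid in x
zs = np.linspace(0.05, 0.20, 151)   # 1 mm grid in z
x_hat, z_hat = locate(scan_x, t_first, xs, zs)
density = gaussian_density(sphere_misfit(scan_x, t_first, xs, zs), sigma_t=5e-12)
print(f"estimated position: x = {x_hat * 100:.1f} cm, z = {z_hat * 100:.1f} cm")
```

Because the transceiver is coaxial, each measurement constrains the target to a circle in the x-z plane centered on the scan point, so no ellipse foci need to be tracked.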
The vibration signals from the two bars are isolated, as demonstrated in Fig. 4. In each spectrogram, only one actuation frequency is manifested, which highlights another advantage of NGSPD for targeted NLOS acousto-optic sensing with high selectivity and spatial resolution 29,37 . High-extinction isolation of undesirable photons is enabled by the picosecond temporal gating and the single-mode fiber transceiver, which captures very few photons other than those from the intended target. Note that noise peaks can be observed in the Fourier-transform figures at 120 Hz, due to the power-line frequency supplied to the ambient LED lighting, and at 335 Hz, due to the resonant frequency of the MEMS mirror.
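A minimal sketch of how a vibration frequency could be read out from gated photon counts: the counts recorded in a fixed detection window are treated as a time series, and the dominant peak of its Fourier spectrum is taken as the vibration frequency. The sampling rate, mean count, modulation depth and the 200 Hz actuation frequency below are hypothetical values chosen for illustration, not parameters from the experiment.

```python
import numpy as np

def dominant_frequency(counts, sample_rate_hz):
    """Frequency of the largest spectral peak in a photon-count time series."""
    counts = np.asarray(counts, dtype=float)
    spectrum = np.abs(np.fft.rfft(counts - counts.mean()))  # drop the DC term
    freqs = np.fft.rfftfreq(counts.size, d=1.0 / sample_rate_hz)
    return freqs[np.argmax(spectrum)]

# Hypothetical data: a target vibrating at 200 Hz modulates the mean count rate,
# and the detected counts follow Poisson statistics.
rng = np.random.default_rng(0)
sample_rate_hz = 2000.0
t = np.arange(4000) / sample_rate_hz
mean_counts = 50.0 * (1.0 + 0.3 * np.sin(2.0 * np.pi * 200.0 * t))
counts = rng.poisson(mean_counts)
f_peak = dominant_frequency(counts, sample_rate_hz)
print(f"dominant frequency: {f_peak:.1f} Hz")
```

In a real measurement, fixed interference lines such as the 120 Hz and 335 Hz peaks mentioned above would need to be excluded before picking the maximum.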

Discussion
Existing techniques for NLOS imaging 6,8,14 and tracking 5,7 are overly restrictive for practical uses, and rely heavily on the prowess of data post-processing. By combining nonlinear optical gating with single photon detection, we have demonstrated a novel approach that achieves picosecond single-photon time gating while rejecting background noise that is orders of magnitude stronger. It eliminates the otherwise detrimental detection pile-up effects 27 and allows coaxial NLOS measurement to provide direct time-of-flight information of hidden objects. As such, hidden NLOS scenes, even those additionally occluded, can be reliably reconstructed at centimeter resolution, removing the need for intense computational imaging or complicated, scene-specific propagation models 20 . The same approach also enables non-interferometric NLOS acousto-optic sensing capable of locating hidden objects by their vibrational frequencies. These results highlight the prospect of hybrid or cross-modality NLOS imaging and sensing, by applying far-reaching acoustic waves to excite objects around the corner and using NLOS single photon detection to read the acoustic response 29 . One major drawback of the current NGSPD approach is the need to temporally delay the gating pump pulse to retrieve photon arrival time information, which makes the data acquisition time-consuming and limits the imaging depth.
Several improvements can be applied to decrease the data acquisition time. For example, multi-wavelength phase matching enables up-converting two or more wavelength bands in one waveguide 38 , which would reduce the acquisition time by a factor equal to the number of phase-matching peaks. In addition, by using a synchronized pump pulse train with a higher repetition rate, combined with a correlated time tagger 39 for acquiring the macro arrival time of triply-bounced photons, the maximum imaging and sensing depth of the NGSPD system can be improved significantly.
With the above advantages, this NGSPD system can perform NLOS imaging and sensing in realistic, complex environments, including those with obscured and partially occluded objects, without complex reconstruction models. Meanwhile, nonlinear gated single photon detection presents a new optical measurement modality for various potential NLOS applications in imaging, sensing, and communications 40,41 . An interesting future study of this NLOS imaging technique is to exploit the pristine, picosecond-resolved photon arrival time histogram for reconstructing NLOS spatial information with only a single illumination point, aided by machine learning 42 , which is expected to significantly improve its functionality and imaging speed.

Author Contributions
S.Z., Y.M.S., P.R., and Y.H. contributed extensively to the work presented in this paper.

Additional Information
Competing interests: The authors declare no competing interests.

Correspondence
Correspondence and requests for materials should be addressed to (email: ysua@stevens.edu and yhuang5@stevens.edu).

Data availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.

The nonlinear gated single photon detector contains a quasi phase-matched nonlinear waveguide module and a silicon SPAD. The system transmits the probe laser and receives signal photons using the transceiver; the hidden object obscured by the aluminum mesh is then imaged and sensed.

Besides the constant background noise from the NGSPD, our system rejects external noise photon counts far better than conventional single photon detection. By inserting a noise source (the amplified spontaneous emission (ASE) noise of another EDFA, filtered using the same filter as the probe pulse) which has a spectral distribution identical to that of the signal, we find that our NGSPD shows 36 dB higher noise rejection than a 1-ns gated InGaAs detector 2 . Thus the external noise count can be neglected, and the background noise can be treated as a constant. The retrieved temporal signal counts $y(t)$ can be treated as $y(t) = P(Ax(t) + e)$, where $A$ is the impulse response, $e$ is the background noise level, $x(t)$ is the reflection distribution from the hidden object, and $P$ denotes the Poisson distribution. Currently the signal photon count at certain temporal delays is about 10 times higher than the background noise, so the photon count fluctuation is much lower than the signal itself. In the pre-processing for NLOS imaging, we only consider $y(t) \approx Ax(t) + e$ to retrieve filtered data, in order to remove the background noise and to obtain finer temporal data for image reconstruction.
The first returning signal peak of each pixel is picked as the first-arrival signal photons from the bar. Although the width of the bars is less than the spatial resolution of the system, the time of flight is still different for the two edges of the bar. Considering the geometry of the setup as Fig. 8b shows, the time-of-flight uncertainty on the $i$-th scanning point for the $j$-th bar can be interpreted as $ds$ [...], where $A_jB_j$ is the bar width (about 5 mm), and $O_iP_j$ is the distance from the scanning point to the middle point of the bar (about 12 cm).
The complement of the tilt angle, $\angle O_iP_jA_j$, is different at each scanning point. In the experiment, the two bars were facing the center of the wall, such that the largest tilt angle is about 12° ($\angle O_iP_jA_j$ about 78°), which corresponds to $ds_{max} \approx 1$ mm or a 6.7 ps temporal difference.
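The quoted numbers can be checked with a short calculation. The expression for $ds$ is not reproduced in this text, so the sketch below assumes the simplest reading that is consistent with the quoted values: the path spread is the bar width projected onto the line of sight, $ds \approx A_jB_j \cos(\angle O_iP_jA_j)$, and the temporal difference is the round-trip delay $2\,ds/c$. Both the projection formula and the function names are assumptions.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def path_spread(bar_width_m, angle_deg):
    """Assumed path spread: bar width projected onto the line of sight."""
    return bar_width_m * math.cos(math.radians(angle_deg))

def round_trip_delay(ds_m):
    """Extra round-trip time of flight for a one-way path difference ds."""
    return 2.0 * ds_m / C

ds_max = path_spread(bar_width_m=5e-3, angle_deg=78.0)
dt_max = round_trip_delay(ds_max)
print(f"ds_max = {ds_max * 1e3:.2f} mm, dt_max = {dt_max * 1e12:.1f} ps")
```

Under this reading the spread comes out at about 1 mm and the delay at roughly 7 ps, in line with the quoted $ds_{max} \approx 1$ mm and 6.7 ps once $ds$ is rounded to 1 mm.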
We used a simple least-squares approximation for the bar positioning, which does not assume a specific shape of the surface. So the largest error of one scanning point can be $\Delta t$ = [...]

[...] taped with retroreflector is used as the object, and is put onto a rotational stage in front of the diffuser. First, its surface normal is adjusted to be parallel to the surface normal of the diffuser, which is labelled as 0°. Then the bar is yawed using the rotational stage. We measure the time-resolved histogram of the bar on a row of scanning points, at each of the three yaw angles (-10°, 0° and 10°). The earliest returning peak on the time-resolved histogram is chosen and plotted for each scanning point as shown in the last row in Fig. 9.

Figure 9: NLOS orientation measurement for different tilt angles at -10°, 0° and 10°. The first three rows of plots are original data; each row corresponds to one pixel and each column corresponds to one tilt angle. At each pixel, we pick the time of the earliest arriving signal peak, and compare it with the simulation result at the same scanning points, which are plotted in the last row. For the pixels on the edge, the experiment and simulation differ most, which may be because the intensity from the edge is lower than from other parts, so the real earliest peak can be buried.