Non-line-of-sight tracking of people at long range

A remote-sensing system that can determine the position of hidden objects has applications in many critical real-life scenarios, such as search and rescue missions and safe autonomous driving. Previous work has shown the ability to range and image objects hidden from the direct line of sight, employing advanced optical imaging technologies aimed at small objects at short range. In this work we demonstrate a long-range tracking system based on single laser illumination and single-pixel single-photon detection. This enables us to track one or more people hidden from view at a stand-off distance of over 50 m. These results pave the way towards next-generation LiDAR systems that will reconstruct not only the direct-view scene but also the main elements hidden behind walls or corners.


Introduction
Recent advances in light sensing and computational imaging technologies are providing solutions to the problem of looking around obstacles [1][2][3][4][5][6][7][8][9][10][11]. In particular, devices that can detect the arrival of light at the single-photon level with extremely high temporal resolution have enabled the ranging and reconstruction of images of small, static hidden objects using methods based on laser-illuminated detection and ranging (LiDAR) [3,6,8]. LiDAR is an active remote-sensing technique that uses the round-trip time-of-flight information of light signals backscattered from objects, typically in the line of sight, to determine their positions [12]. LiDAR-based detection and ranging of objects hidden from view is achieved by illuminating the objects and detecting the backscattered signals via an intermediary scattering surface such as a wall or the floor. These additional, intermediate scattering events and their isotropic nature greatly reduce the available signal for detection, leading to the need for long acquisition or processing times, and/or the need for advanced detection devices.
Buttafava et al. recently studied the possibility of determining the full three-dimensional profile of static objects using a single-pixel single-photon avalanche diode (SPAD) and scanning laser illumination [6]. Although the system did not have the necessary speed to track a moving object in real-time, single-pixel SPADs do have a distinct advantage over SPAD cameras [13][14][15][16] in that they provide close to 100% coupling of light onto the sensitive detector area, compared to the few percent currently available in visible−near-infrared wavelength SPAD cameras [17,18].
For some applications, object identification and position or motion information are the main points of interest; a three-dimensional reconstruction of the object is not crucial [2,8]. Rather, the scenario may demand, first and foremost, knowledge of the presence and position of objects moving in a hidden environment. Example applications include search and rescue, and autonomous driving. Gariepy et al. have shown that it is possible to detect and track a moving hidden object, albeit with no information of the object's form [8]. Their setup uses a 1024-pixel SPAD camera [13,15,17] to detect indirect light scattered back to the system from their target object around a corner. Each detector pixel in the 32 × 32 array has single-photon sensitivity and can time each photon's arrival with 110 ps resolution. Tracking of the hidden object uses this precise photon-arrival timing information together with knowledge of the camera's field of view and the geometry of the setup in the line of sight.
Here we present a tracking system that extends beyond the lab-based near-range setups implemented thus far. Our system enables the detection of people moving outside of the direct line of sight at stand-off distances greater than 50 m. The active imaging system uses one single-pixel SPAD detector and a single pulsed laser to "look" down a corridor and around a blind corner. In a first experiment, we demonstrate that we can accurately detect and locate a single person around the corner. We then demonstrate that we can accurately do this for two hidden people. The expected decrease in spatial resolution due to the use of a single- or few-pixel detection system compared to previous camera-based approaches is compensated by the arbitrarily large effective numerical aperture that is provided by using independently focusable detectors: even moving objects at large distances can be located with high precision. These results, therefore, show that single-photon counting technology can indeed perform outside the lab and over length scales that are relevant for large stand-off distance detection and location of hidden moving objects, such as humans, in real-life applications.

Experimental setup
In our experiments we use a corridor that forms a T-junction. Figure 1 shows a schematic of the scenario as seen from above. Compared to the experiment conducted by Gariepy et al. where an intermediary scattering surface in the plane of the object's movement (the floor) was used, here we use a vertical surface (screen), as in previous work (see e.g. [1][2][3][4]). Our screen is a mobile chalkboard that we can freely position at the T-junction. The laser and the field of view of the detector are directed onto viewing screens attached to the board to improve the signal level of our returns.
We send a train of light pulses from the transceiver onto our screen positioned at the far end of the corridor (∼ 53 m away from the transceiver), as illustrated in Fig. 1(a). The 100 ps pulses from the portable laser hit the wall with a ∼ 7 cm beam diameter and scatter from the wall approximately as a spherical wavefront that propagates in all directions. Some of this light reaches the hidden objects (persons) and is scattered back again towards the wall. A discrete position with a spot size diameter of ∼ 4 cm on the wall is imaged to the SPAD using the telescope. Our TCSPC module measures the photon arrival time (4 ps time binning) for the signal returning to the detector and a histogram is built up in one second of acquisition time over 40 million laser pulses.
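The timing figures quoted above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch that uses only the numbers stated in the text (53 m stand-off, 40 million pulses in 1 s, 4 ps bins, 100 ps pulses); the variable names are illustrative and the vacuum speed of light is used as an approximation for propagation in air.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s (approximation for air)

standoff_m = 53.0      # transceiver-to-screen distance quoted in the text
pulses = 40e6          # laser pulses accumulated per histogram
acquisition_s = 1.0    # acquisition time per histogram
bin_s = 4e-12          # TCSPC time-bin width
pulse_s = 100e-12      # laser pulse duration

rep_rate_hz = pulses / acquisition_s        # implied repetition rate
period_s = 1.0 / rep_rate_hz                # time between successive pulses
bins_per_period = round(period_s / bin_s)   # histogram bins spanning one period
round_trip_s = 2.0 * standoff_m / C         # direct transceiver-screen-transceiver time
path_per_bin_m = C * bin_s                  # optical path length covered by one bin
pulse_length_m = C * pulse_s                # spatial extent of one laser pulse

print(f"repetition rate : {rep_rate_hz / 1e6:.0f} MHz")
print(f"pulse period    : {period_s * 1e9:.0f} ns ({bins_per_period} bins)")
print(f"round trip      : {round_trip_s * 1e9:.1f} ns")
print(f"path per bin    : {path_per_bin_m * 1e3:.2f} mm")
print(f"pulse length    : {pulse_length_m * 1e2:.1f} cm")
```

Under these numbers the direct round trip to the screen spans many pulse periods, so the absolute position of a peak within the histogram depends on a timing reference that has to be fixed by calibration.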

Position retrieval
In order to precisely locate a hidden object, we successively image four discrete positions on the screen to the SPAD as indicated in Fig. 1(a). This is equivalent to simultaneously imaging four detector positions to four SPADs with their associated collection optics; i.e. with four independent detectors and imaging systems, we would, in effect, be able to perform the total data acquisition in 1 s. We use the temporal information between the laser signal and the SPAD detector signal for each of the four pixels (detector positions) to reconstruct the position of the hidden persons.
In our system, we can simply choose our point of reference (origin of the Cartesian coordinate system), for example the right-hand-side corner of the T-junction, and we then use an extension of the approach presented in [8]. By imaging four discrete positions on the screen (see Fig. 1(a)) we obtain four ellipses whose overlap corresponds to the target position in two dimensions.
To calibrate our measurements, we apply an offset to the histograms recorded for each pixel (detector position) and define the timing of all events relative to the start of the histograms. We then limit the histogram to our time window of interest. In our current method, we pre-acquire a signal of the background scene with the person(s) absent, for each detector pixel, and subtract this from the return signal for the corresponding pixel when we have the person(s) present in the scene. Alternatively we can consider using the median of the histograms for each pixel [8,20,21] or frame-to-frame change detection [2,22] as a way to approximate the background signal.
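The calibration steps described here (timing offset, time window, background subtraction) can be sketched as follows. This is an illustrative sketch rather than the processing code used for the measurements: the function name `preprocess`, the `zero_bin` parameter and the clipping of negative differences to zero are assumptions made for the example.

```python
def preprocess(signal, background, zero_bin, window, bin_width_s=4e-12):
    """Background-subtract one pixel's histogram and crop it to a time window.

    signal, background : photon counts per bin, with and without the person(s)
    zero_bin           : bin index taken as t = 0 (the calibration offset)
    window             : (start, stop) bin indices of the window of interest
    """
    if len(signal) != len(background):
        raise ValueError("signal and background must have the same number of bins")
    start, stop = window
    times, counts = [], []
    for k in range(start, stop):
        times.append((k - zero_bin) * bin_width_s)  # time relative to the offset
        # Clipping negative differences to zero is an assumption of this sketch.
        counts.append(max(signal[k] - background[k], 0))
    return times, counts


# Toy histogram: a flat background with a small return peak in bins 2 and 3.
signal = [5, 6, 20, 30, 7, 5]
background = [5, 5, 5, 5, 5, 5]
times, counts = preprocess(signal, background, zero_bin=1, window=(2, 5))
print(counts)
```

The median-based and frame-to-frame alternatives mentioned in the text would replace only the `background` argument; the rest of the pipeline stays the same.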
For each detector pixel i, we perform a Gaussian fit on the histogram to locate the position of the peak(s) in the return signal corresponding to our scattering source(s) of interest (see insets in Fig. 2). Each fitted peak gives us our photon time-of-flight t_i, with uncertainty σ_i, that we measure between the initial instant when the train of laser pulses hits the wall at r_l and the final instant when it has scattered to and from a hidden person at r_o, back into the field of view of the detector at r_i. Using t_i, our method back-projects the photons onto the hidden scene. We create an ensemble of discrete positions in Cartesian space for where our hidden person could be located. Defining an appropriate scattering height z for our person reduces the search space from a volume (x, y, z) to an area (x, y) with a significant increase in computation speed. For each discrete position r_o = (x_o, y_o, z), we then calculate a probability P_i(r_o). The probability density that we obtain for our two-dimensional problem is maximised for a set of (x, y) that describes an ellipse with r_l = (x_l, y_l, z_l) and r_i = (x_i, y_i, z_i) as the foci, such that |r_o − r_l| + |r_o − r_i| = c t_i. Here, c is the speed of light. Assuming independence between the measured times of flight t_i, the final distribution of r_o is obtained by multiplying the probability density functions (PDFs) P_i(r_o) associated with each pixel. Since the counts in our target signal have contributions from the same scattering source, the resulting distribution will present a region of high probability where the individual densities overlap.
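The back-projection and the product of per-pixel densities can be illustrated with a short sketch. It assumes a Gaussian dependence of each per-pixel density on the mismatch between the two-leg path length and c t_i, with width c σ_i; this form is an assumption chosen to be consistent with the ellipse condition and the fitted uncertainties, not a transcription of the expression used in the experiment. The geometry (screen in the plane y = 0, laser and detector positions, target position, σ_i = 100 ps) is hypothetical.

```python
import math

C = 299_792_458.0  # speed of light, m/s


def path_length(r_o, r_l, r_i):
    """Two-leg path |r_o - r_l| + |r_o - r_i| in metres."""
    return math.dist(r_o, r_l) + math.dist(r_o, r_i)


def pixel_pdf(r_o, r_l, r_i, t_i, sigma_i):
    """Unnormalised P_i(r_o); ASSUMED Gaussian in the path-length mismatch."""
    mismatch = path_length(r_o, r_l, r_i) - C * t_i
    return math.exp(-0.5 * (mismatch / (C * sigma_i)) ** 2)


def locate(r_l, pixels, times, sigmas, z, xs, ys):
    """Return the grid point (x, y) that maximises the product of the P_i."""
    best, best_xy = -1.0, None
    for x in xs:
        for y in ys:
            p = 1.0
            for r_i, t_i, s_i in zip(pixels, times, sigmas):
                p *= pixel_pdf((x, y, z), r_l, r_i, t_i, s_i)
            if p > best:
                best, best_xy = p, (x, y)
    return best_xy


# Hypothetical geometry: screen in the plane y = 0, scattering height z = 1.0 m.
r_l = (0.0, 0.0, 1.5)                                   # laser spot
pixels = [(-0.5, 0.0, 1.2), (0.5, 0.0, 1.2),
          (-0.25, 0.0, 1.8), (0.25, 0.0, 1.8)]          # four detector positions
target = (0.8, 1.2, 1.0)                                # ground truth for the demo
times = [path_length(target, r_l, r_i) / C for r_i in pixels]  # noiseless t_i
sigmas = [100e-12] * len(pixels)                        # assumed timing uncertainty

xs = [-2.0 + 0.05 * k for k in range(81)]               # 5 cm search grid in x
ys = [0.1 + 0.05 * k for k in range(59)]                # 5 cm search grid in y > 0
estimate = locate(r_l, pixels, times, sigmas, 1.0, xs, ys)
print(estimate)
```

Restricting the grid to y > 0 removes the mirror solution behind the screen, and fixing z is what reduces the search from a volume to an area, as described above.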

Tracking of a single person
A single person is standing around the right-hand-side corner, hidden from the line of sight of our transceiver. The screen is angled at ∼ 40 degrees to the corridor on the right to mimic, for example, an open door. The corridor lights are switched off, yet full daylight illumination enters from lateral windows, as seen in Fig. 1(b); no significant differences were observed with the artificial corridor lights switched on or off. We acquire data for 1 s with the person located at each of four positions along the corridor, and we do this for all four pixels. The signal of the background scene is also acquired for each pixel. Figure 3 shows the joint probability density functions (PDFs) that we retrieve for the hidden person at each position. These are overlaid with the ground truth. The agreement between the PDFs and the person's true positions shows that our system is able to locate a stationary person situated up to ∼ 1.8 m outside of the direct line of sight, and determine their position with an uncertainty that is always less than ∼ 0.5 m, although this uncertainty increases the further away the person is from the screen.
In order to assess how location accuracy and precision vary with the baseline (i.e. the separation of pixel positions on the screen), we perform a computer simulation of the experiment for the simplified case where we have a single laser, two SPADs and a hidden object (see Fig. 4). [...] ∼ 0.7 m in the x- and y-directions respectively (see Figs. 5(c) and 5(d)). By introducing further pixels, a 1 m baseline will therefore be sufficient to guarantee a location precision comparable to the typical shoulder width of a human being. In most cases, the precision increases very rapidly with increasing baseline but then tails off. In other words, there is not much gain in going to large baselines: baselines of the order of 1 m are optimal in terms of requirements on the environment (access to a sufficiently large visible area) and location accuracy. These results are verified in our experiments in which we have baselines of the order of 1 m.
Figure 5. Simulation results for a two-detector system. We investigate how separating out the pixels in one direction affects the accuracy and precision of the target position retrieval. (a) and (b) show the respective change in accuracy of the retrieved x- and y-coordinates ("error_x" and "error_y") as the separation between pixels increases, while (c) and (d) show the change in precision in the x- and y-directions ("σ_x" and "σ_y") respectively.
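The dependence of precision on baseline can be explored with a toy Monte Carlo for one laser and two detectors. This is a simplified two-dimensional sketch, not the simulation behind Fig. 5: it assumes the laser spot sits midway between the two detector positions on a straight screen (the line y = 0), a timing uncertainty of 100 ps, and a hypothetical target at (0.5, 1.5) m. With the laser at the origin, each ellipse condition r + |r_vec − f_i| = L_i reduces to a relation that is linear in r and x, so the two-ellipse intersection has a closed form.

```python
import math
import random
import statistics

C = 299_792_458.0  # speed of light, m/s


def solve_position(L1, L2, a1, a2):
    """Intersect two ellipses sharing a focus at the origin (the laser spot).

    The detector foci lie at (a1, 0) and (a2, 0); L1 and L2 are the total
    path lengths. Squaring r + |r_vec - f_i| = L_i gives
    L_i * r - a_i * x = (L_i**2 - a_i**2) / 2, solved here by Cramer's rule.
    """
    k1 = (L1 ** 2 - a1 ** 2) / 2
    k2 = (L2 ** 2 - a2 ** 2) / 2
    det = a1 * L2 - a2 * L1
    r = (a1 * k2 - a2 * k1) / det
    x = (L1 * k2 - L2 * k1) / det
    y = math.sqrt(max(r * r - x * x, 0.0))  # keep the solution in front of the screen
    return x, y


def precision(baseline, target, sigma_t=100e-12, trials=2000, seed=0):
    """Standard deviation of retrieved (x, y) under Gaussian timing jitter."""
    rng = random.Random(seed)
    a1, a2 = -baseline / 2, baseline / 2
    r = math.hypot(*target)
    L1 = r + math.dist(target, (a1, 0.0))
    L2 = r + math.dist(target, (a2, 0.0))
    xs, ys = [], []
    for _ in range(trials):
        x, y = solve_position(L1 + rng.gauss(0.0, C * sigma_t),
                              L2 + rng.gauss(0.0, C * sigma_t), a1, a2)
        xs.append(x)
        ys.append(y)
    return statistics.pstdev(xs), statistics.pstdev(ys)


for b in (0.2, 0.5, 1.0, 2.0):
    sx, sy = precision(b, (0.5, 1.5))
    print(f"baseline {b:.1f} m: sigma_x = {sx:.3f} m, sigma_y = {sy:.3f} m")
```

The numerical values depend entirely on the assumed jitter and geometry, so they should not be compared directly with Fig. 5; the sketch is only meant to show how timing noise maps onto position spread as the detector separation changes.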

Tracking of multiple persons
The previous experimental measurements are repeated with two people in the same hidden scene. Again, data are acquired for 1 s for each detector position. We show the results in Fig. 6. We also perform a measurement with two people located on opposite sides of the T-junction. We use only two detector positions and the screen is now placed at normal incidence with respect to the transceiver. The retrieved positions are shown in Fig. 7. The system now retrieves the target positions with larger uncertainty, and we observe that if either person moves further than ∼ 2 m away from the laser spot on the wall, the signal-to-noise ratio of the return signal degrades to the point that position reconstruction is no longer possible within the integration time. Preliminary measurements showed that person orientation has no notable effect on scattering. Instead, we attribute this largely to a non-uniform scattering distribution of the photons from the screen surface, with significantly less light scattered parallel to the wall itself (i.e. in the direction of the hidden persons in this configuration). However, the presence of hidden people and their correct positions can still be identified by the system, showing that no particular tuning of the angle of the screen is required. The limitations observed here can, in principle, be overcome by using detectors with lower noise (the detector used here has a dark count rate of roughly 1000 counts/second and systems with a factor ∼ 10× less are available). The signal may also be increased by collecting the return scatter from a larger spot size, e.g. by optimising the collection optic numerical aperture. The spot size diameter could be increased by a factor ∼ 2-3× (therefore increasing signal by a factor ∼ 10×) with respect to the current configuration without significantly compromising precision. This would imply (when combined with lower-noise detectors) a total increase in the signal-to-noise ratio of ∼ 100× and the potential to detect hidden objects that are located up to several metres behind the corner while maintaining the same 50 m stand-off distance. The detected signal scales as 1/d^4, where d is the distance between the screen and the object(s) of interest. With our current setup and technology, our method is limited to resolving two simultaneous objects in the hidden scene; any increase will require increased temporal resolution.
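The link between the quoted ∼ 100× signal-to-noise improvement and a reach of "several metres" follows from the 1/d^4 scaling. The sketch below works through the arithmetic under two stated assumptions: that the collected signal grows with the area of the imaged spot (diameter squared), and that the maximum usable distance is set by a fixed signal-to-noise threshold, so that it grows as the fourth root of the gain. The ∼ 2 m starting range is the limit reported for the two-person measurement.

```python
spot_diameter_factors = (2.0, 3.0)   # proposed increase in imaged spot diameter
signal_gains = tuple(f ** 2 for f in spot_diameter_factors)  # assumes signal ∝ spot area
snr_gain = 100.0                     # total improvement quoted in the text
current_range_m = 2.0                # range beyond which reconstruction failed

# With detected signal ∝ 1/d^4, a gain G at fixed threshold extends d by G**(1/4).
range_gain = snr_gain ** 0.25
projected_range_m = current_range_m * range_gain

print(f"signal gain from larger spot : {signal_gains[0]:.0f}x to {signal_gains[1]:.0f}x")
print(f"range extension factor       : {range_gain:.2f}")
print(f"projected range              : {projected_range_m:.1f} m")
```

The steep fourth-power law is why a hundredfold gain buys only about a threefold increase in reach under these assumptions.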

Discussion
Photons returning via objects located outside of the line of sight undergo three or more scattering events, yet the timing information of return signals can be used to accurately retrieve the positions of these objects. The flexibility of single-pixel detection provides an increased field of view that allows detection and tracking with precision at long range, > 50 m. The use of single-pixel detectors also has the advantage of high detection efficiency. The results from our measurements show that by using a few single-pixel SPADs in parallel and TCSPC, real-time tracking at large stand-off distances is possible. The ability to perform non-line-of-sight detection at long range, using experimental components that can be made both compact and portable, takes us one step closer to developing a solution that is usable for real-life scenarios. A further interesting possibility is to use just one single pixel: while this will not provide sufficient information to locate the exact position of a hidden object, it is sufficient to identify the existence of a moving object, its distance from the laser beam spot, its direction of motion and its velocity.