Three-dimensional single-photon imaging through obscurants

We investigate the depth imaging of objects through various densities of different obscurants (water fog, glycol-based vapor, and incendiary smoke) using a time-correlated single-photon detection system which had an operating wavelength of 1550 nm and an average optical output power of approximately 1.5 mW. It consisted of a monostatic scanning transceiver unit used in conjunction with a picosecond laser source and an individual Peltiercooled InGaAs/InP single-photon avalanche diode (SPAD) detector. We acquired depth and intensity data of targets imaged through distances of up to 24 meters for the different obscurants. We compare several statistical algorithms which reconstruct both the depth and intensity images for short data acquisition times, including very low signal returns in the photon-starved regime. Published by The Optical Society under the terms of the Creative Commons Attribution 4.0 License. Further distribution of this work must maintain attribution to the author(s) and the published article’s title, journal citation, and DOI.


Introduction
In recent years, the Light Detection and Ranging (LiDAR) method has been the technique of choice for many remote sensing applications which require three-dimensional mapping [1,2]. Single-photon time-of-flight (ToF) depth imaging based on the time-correlated single-photon counting (TCSPC) technique has emerged as a strong candidate technology for obtaining both depth and intensity information from targets in challenging environments, such as for longrange imaging [3][4][5][6][7][8]. The benefits of the TCSPC approach include the high optical sensitivity of single-photon detectors and the picosecond time resolution, which results in excellent surface-to-surface resolution. The effectiveness of TCSPC has also been demonstrated for use in extreme environments such as underwater imaging in highly attenuating conditions [9,10] at visible wavelengths. CMOS SPAD detector arrays in the visible and near-infrared regions have also been used in range-gated operation to achieve ranging and depth profiling in the single-photon domain [11,12]. The availability of SPAD detectors in the short-wave infrared (SWIR) has allowed single-photon LiDAR and depth profiling in this spectral region of low atmospheric attenuation and reduced solar background [13][14][15]. Long range single-photon measurements in the SWIR region have shown the excellent surface-to-surface resolution, which has been shown to be highly effective in demonstrating target identification in cluttered environments such as camouflage [16,17].
An increasing subject of interest is the high-resolution imaging of targets in visually degraded environments, which is essential for a number of emerging application areas, including automotive LiDAR. The high levels of particulate scattering in fog or smoke can greatly diminish image contrast, spatial resolution and, of course, return signal strength [18][19][20][21]. Several studies have investigated the propagation of light through various types of atmospheric obscurants [22][23][24][25]. These studies, which included both simulations and experimental measurements, clearly illustrate the detrimental effects of obscurants on laser beam propagation and image quality.
There is ongoing debate in the imaging community regarding the wavelength dependence of optical propagation through water-based fog. Some authors suggest that SWIR wavelengths have lower levels of attenuation through fog than wavelengths in the visible band [15,26] while others suggest that, depending on the nature of the fog, there is no clear benefits to using such longer wavelengths [27,28]. Laser beam propagation through waterbased fog is complex and is dependent on a variety of factors, particularly the distribution of particle sizes. One other such parameter is fog density, which can vary substantially over the duration of a single measurement. Fog density can range from hazy fog (visibility > 2 km) to dense fog (visibility < 50 m) [29].
In this paper, we present a SWIR scanning system based on the TCSPC detection technique to obtain depth and intensity images of targets in various types of highly obscuring media at low optical power levels. The system had an illumination wavelength of 1550 nm operating with an average optical power level of approximately 1.5 mW. Measurements were performed in a 26-meter-long obscurant chamber so that an environmentally stable, consistent, and empirically measureable optical attenuation could be achieved. This active imaging system was successfully used to reconstruct depth and intensity profiles of targets placed in obscurants such as canister smoke, glycol-based smoke, and water-based fog, at stand-off distances of up to 24 meters within the obscurant chamber. Moreover, we present experimentally obtained results which give a comparison of the attenuation through each type of obscurant for both λ = 1550 nm and the visible region. We compare three state-of-the-art algorithms designed for sparse photon returns and apply them to reconstruct both depth and intensity images of targets in high levels of atmospheric obscurants. These three algorithms all exploit the statistics and spatial correlations of the data.
In Section 2 of the paper, there are details describing the transceiver unit and outlining the key system parameters used for these measurements. Section 3 describes the experimental layout of the targets within the obscurant chamber and the different obscurants used in this study. Section 4 presents experimentally obtained results for the attenuation of light in each obscurant for both λ = 1550 nm and the visible band. The computational methods used to reconstruct depth and intensity information from the single-photon data are described in section 5. Section 6 shows results obtained using each of the image processing algorithms. Finally, conclusions and a discussion are presented in section 7.

System description
A ToF monostatic scanning transceiver sensor was used in conjunction with the TCSPC technique to obtain high resolution depth and intensity images of the target scene. The TCSPC technique was used to determine the time difference between the outgoing optical signal and the occurrence of a photon detection event. This is typically recorded over many laser pulses for each depth measurement (i.e. each "pixel" in the scene) in order to acquire reliable ToF information, and hence a high resolution depth estimate of the target.
A schematic of the monostatic single-photon imaging system used in these measurements is shown in Fig. 1 and a summary of the key system parameters is listed in Table 1. This system, operated using an illumination wavelength of 1550 nm, was broadly similar to that used in previous work where depth and intensity measurements were obtained of targets over long ranges (> 1 km) [5], to identify targets obscured by highly scattering surfaces such as camouflage netting [16], and in a very preliminary report by this group for use in obscurants [30]. Fig. 1. Schematic of the single-photon depth imaging system which was operated at a wavelength of 1550 nm. The system comprised of a pulsed supercontinuum laser source, a InGaAs/InP SPAD detector, a custom-designed monostatic scanning transceiver unit, and a TCSPC timing module. Optical components include: polarizing beam splitter (PBS); fiber collimation packages (FC1, FC2, FCR, FCT); scanning galvanometer mirrors (G1, G2); relay lenses (RL1, RL2, RL3); objective lens (OBJ); longpass filters (LP1, LP2); shortpass filter (SP); bandpass filters (BP1, BP2). The target was a polystyrene head placed at distances of up to approximately 24 meters into an obscurant chamber, which was subsequently filled with various types of obscurants. The output from a broadband supercontinuum illumination source (SuperK EXTREME EXW-12, NKT Photonics) was used in conjunction with a series of optical filters (indicated in Fig. 1) to obtain a fiber-coupled illumination wavelength centered on λ = 1550 nm. This set of high performance filters comprised of a longpass filter (LP1) with a cut-on wavelength of 1500 nm, a shortpass filter (SP) with a cut-off wavelength of 1800 nm, and a 10 nm fullwidth half-maximum (FWHM) bandpass filter (BP1) centered on 1550 nm. This filter combination allowed an illumination wavelength of 1550 nm to be transmitted whilst providing sufficient out-of-band rejection. The illumination beam used for all measurements reported here had a maximum average optical power of approximately 1.5 mW. The repetition rate of the illumination source was 15.6 MHz, giving a maximum stand-off distance without range ambiguity of approximately 9.6 meters. In applications requiring absolute range information, other methods of source illumination such as the use of pseudorandom pulse trains are required to remove this ambiguity [31]. Alternatively, the maximum unambiguous range of the target could have been extended by reducing the repetition rate of the illumination source or by using combinations of different repetition frequencies. Due to the frequent need to change the stand-off distance of the target within the obscurant chamber, a parallax-free system configuration was beneficial. Therefore, the transmit and receive paths of the custom transceiver unit were configured to be coaxial with a polarizing beam splitter used to de-multiplex the return signal from the common channel. Two scanning galvanometer mirrors (G1 and G2 in Fig. 1) were used to raster scan the beam on the target scene.
An objective lens (OBJ in Fig. 1) was used to both focus the outgoing illumination beam on the target and collect photons, which had been scattered back from the target. In all measurements, the transceiver was positioned 17 meters from the obscurant chamber. An objective lens with an effective focal length of 200 mm and an aperture of 50 mm was used for scenes at 5 meters into the obscurant. For targets at 24 meters into the obscurant, an objective lens with an effective focal length of 500 mm and an aperture size of 80 mm was used. The beam was raster scanned over an area of approximately 350 mm × 275 mm for the stand-off distance in obscurants of 5 meters and 335 mm × 270 mm for a stand-off distance of 24 meters. The collected scattered return photons were routed to the receive channel and fiber-coupled to the single-photon detector via a 10 µm diameter core armored optical fiber. In these measurements, we used an electrically gated InGaAs/InP SPAD detector (Micro Photon Devices) with an operating wavelength range of 900 -1700 nm. The detector had a single-photon detection efficiency of 30% at 1550 nm. The overall timing jitter of the system was measured as being approximately 220 ps (FWHM). The detector was a significant source of the overall system jitter, however the laser source and data acquisition hardware also contribute.
Due to the monostatic configuration of the system, light scattered from optical components within the transceiver unit can cause significant levels of back-reflections, which could result in saturation of the highly sensitive SPAD detector, rendering the detector unable to record a target return. To avoid detection of these spurious back-reflections, the detector was electronically gated, or activated, approximately in synchronization with the expected pulsed laser return and de-activated at other times. For the measurements described in this paper, a 30 ns gate duration was chosen. Afterpulsing effects are typically found in InGaAs/InP SPAD detectors, where charge carriers are trapped in defects, which are subsequently released much later causing spurious avalanches. This results in an increased background count level, which directly leads to a reduction in the signal-to-noise ratio in LiDAR applications. For these measurements, the detector was de-activated for a predetermined amount of time, or hold-off time, of 40 µs after a photon detection event. This allowed any trapped charge carriers to be released without resulting in further avalanche events, thus reducing the effects of afterpulsing. To reduce the effects of ambient light on the background level, the receive channel was spectrally filtered (as indicated in Fig. 1) using a longpass filter (LP2) with a cut-on wavelength of 1500 nm, and a 10 nm FWHM bandpass filter (BP2).
The pulsed laser source provided an electrical start trigger to the TCSPC counting module (PicoQuant HydraHarp 400) while the stop trigger was provided by an electrical output from the SPAD module. The TCSPC module was configured to record time-tags for the detection events and this data was then transferred via USB to the control computer.

Target scene within the obscurant chamber
The measurements reported in this paper were taken using targets placed within an obscurant chamber housed in an indoor facility at the French-German Research Institute of Saint-Louis. The 26-meter-long obscurant chamber was approximately 2.5 meters tall by 2.3 meters wide as shown in Fig. 2. The chamber was lined with plastic sheeting in order to contain the obscurant during measurements. Black fabric curtains were used to block the open ends until a sufficient obscurant density was achieved, then the front curtain was opened to allow for measurements of the target scene. The use of this indoor facility allowed for stable atmospheric conditions, experimental repeatability, and slow obscurant dispersion. Previously, this obscurant chamber experimental layout was used to examine range-gated active imaging at SWIR wavelengths [24,25]. However, this earlier work did not utilize single-photon detection based approaches as described in this paper. The target scene for all the measurements comprised of a life-size polystyrene head mounted in front of a smooth wooden backboard as shown in Fig. 2. A metal railing was used to mount the target and allow for precise movement of the scene within the chamber as required. Two sets of calibration panels (indicated in Fig. 2), set A(1,2,3,4) for λ = 1550 nm measurements and set B(1,2,…,8) for the visible band, were used to calculate the attenuation coefficient at set points within the chamber for each measurement. Set A comprised four 150 × 150 mm calibration panels which were mounted on the metal railing to the left of the polystyrene head at one meter increments. These 150 × 150 mm calibration panels consisted of a sheet of aluminum, half of which had a 94% reflectance Permaflect coating [32] and the other half with 18% reflectance Permaflect. These coatings act as near Lambertian reflectors to provide a contrast measurement. Calibration target A3 was placed nominally in the same plane as the polystyrene head so that an accurate measurement of the attenuation coefficient at λ = 1550 nm could be made at the target location. The purpose of Set B was to measure the visibility from contrast analysis in the visible part of the spectrum using passive imaging. For this, we used a Si-based camera (Prosilica GT1380) using a 100 mm objective lens (NIKON 18-300 zoom lens). This passive imaging approach uses a combination of broadband illumination and detection in the wavelength band 400 -800 nm, which approximated the responsivity of the human eye. This visible camera system was placed directly in front of the obscurant chamber, approximately 27 meters from the entrance. This second set of eight panels extended from the front to the rear of the chambereach panel was 250 × 250 mm in size, and was coated with 5% and 94% Permaflect.
Two different target configurations were used for the measurements presented in this paper. In the first configuration, the target was placed at a distance of ~5 meters from the front of the chamber. Four calibration targets were mounted on the rail at 1 meter equidistant spacing with the third Permaflect target nominally in the same plane as the front of the polystyrene head. In the second target configuration, the target and the rails were moved to the end of the obscurant chamber. The target and calibration targets were mounted in the same configuration as in set-up 1, but at a distance of 24 meters from the front of the chamber.

Obscurants used in this study
Three different types of obscurants were used in this study: white canister smoke; glycol-based smoke; and water-based fog. The white canister smoke was generated using Björnax AB Ventilax smoke canisters, which are based on the combustion of potassium chlorate and ammonium chloride. The average particle diameter for the white canister smoke was measured in situ to be approximately 400 nm using a GRIMM Mini-Las 11-R. This was further confirmed by taking smoke particles on a glass slide and imaging with a Scanning Electron Microscope (SEM), where a particle diameter distribution of 390 ± 100 nm was measured. This is typical of the type of smoke found in most smoke grenades. A transmission spectrum for this particular smoke can be found in [24]. The glycol-based smoke had a particle size distribution of under 1 µm and was generated using a smoke machine (JB Systems FX-700) vaporizing Universal Effects ST-Smoke Fluid Light. The water-based fog was generated using a system of high pressure nozzles mounted along the ceiling of the full length of the obscurant chamber. These nozzles were capable of creating a very fine mist, with an average particle size distribution of 18 ± 8 µm (measured using a LISST-100X granulometer (Sequoia Scientific Inc)) which is a realistic particle size for atmospheric water particles [33]. Further details on these obscurants can be seen in [34]. For each measurement set, the obscurant was pumped into the chamber until a sufficient density was achieved. Over the duration of the measurement set, the obscurant would slowly disperse out of the open ends of the chamber until the scene was fully visible. Figure 3 shows a photograph of the target scene obscured by the glycol-based smoke, near the beginning and end of the measurement set. Fig. 3. On the left is a photograph of the target scene (located at 24 meters) obscured by glycol-based smoke near the beginning of the measurement set. In this case target B1 is visible and B2 is partially visible. On the right is a photograph near the end of the measurement set (approximately 10 minutes later) once the obscurant has mostly dispersed from the chamber. The whole of Set B, Set A, and the target is visible.

Attenuation measurements for λ = 1550 nm and the visible band
Due to variations in the obscurant over the duration of the measurement sets, it was necessary to obtain the attenuation coefficient during each measurement for both λ = 1550 nm and the visible band. These measurements were made using the two sets of calibration targets (Set A and Set B) placed in the chamber alongside the target, as discussed in the previous section of this paper. This section details the methods used to calculate the attenuation coefficient for both imaging systems and presents the results for both SWIR and visible wavelengths. It is important to note that the calculation of the attenuation coefficients for both the visible and SWIR wavelengths will be affected differently by obscurant inhomogeneity since very different spatial averaging are used in these two measurement approaches.

Calculating the attenuation coefficient for the visible band
The attenuation coefficient for the visible part of the spectrum was obtained through visibility measurements taken from calibration panels in Set B. In the obscurant chamber, 8 contrast targets in Set B were placed at the following distances from the tunnel entrance: 2. A measurement of the contrast between a selected region the 5% and the 94% Permaflect was made for each measurement in the presence of obscurant and then compared with the same region in a reference set of color images with good visibility (i.e. no obscurant). The first measurement set taken for each obscurant, when the density is highest, shows very poor contrast while as time passes and the obscurant disperses from the chamber the contrast improves. As defined by Koschmieder [35], atmospheric visibility V is related to the attenuation coefficient α (m −1 ) by the law: where ln is the natural logarithm, C 0 corresponds to the target contrast (C 0 = 1) and C th is set equal to 0.05, i.e. the human eye's minimal perceptible contrast as defined by the CIE (International Commission on Illumination). This limit can be calculated through use a polynomial fit to find the point at which the contrast is 5%. Given these values, the relation between visibility and attenuation can be given as: 3 .
Using this method, we noted a good agreement with estimations with the naked eye and measurements taken with a visibility meter.

Calculating the attenuation coefficient for λ = 1550 nm
The attenuation coefficient, and hence the number of attenuation lengths between the target and the transceiver, was calculated for each λ = 1550 nm scan by examining the single-photon return from known areas of a calibration panel in Set A when compared to a free-space (i.e. no obscurant) reference measurement of exactly the same part of the panel. The number of attenuation lengths (N AL ) between the transceiver and target was then calculated from the Beer-Lambert law [36] as follows: where ln is the natural logarithm, n 0 is the number of returned photons measured in the absence of obscurant, n is the number of returned photons in the presence of obscurant, d is the (one-way) distance of propagation in the obscurant, and α represents the attenuation coefficient for the level of obscurant present in the chamber.
A comparison of the attenuation coefficient α (m −1 ) in the obscurant as a function of time for both λ = 1550 nm and the visible band for white canister smoke, glycol-based smoke, and water-based fog for one full measurement set is shown in Fig. 4. These measurements were attempted for the highest attenuation level to the lowest as the fog dispersed. show the attenuation coefficients for measurements taken through approximately 24 meters of white canister smoke and glycol-based smoke, respectively. Due to the high level of obscurant, a very low return -resulting in large variance in the signalwas obtained from the target at the beginning of each measurement set. Thus, for the λ = 1550 nm attenuation data we show only those data points where there was a sufficient signal-to-noise ratio to make a reliable attenuation coefficient measurement. The measurements were taken every 60 seconds for an overall duration of 540 seconds for the white canister smoke and every 90 seconds for an overall duration of 630 seconds for the glycol-based smoke. The longer measurement set duration is shown for glycol-based smoke to account for its slower dispersion. The attenuation coefficient at λ = 1550 nm is much lower than in the visible band over the entire duration of the measurements for both glycol and canister smoke, demonstrating a clear advantage of SWIR operation compared with visible detection approaches. Figure 4(c) shows the attenuation coefficient taken through approximately 24 meters of water-based fog. The measurement set duration in this case was 480 seconds with measurements taken every 60 seconds. The water-fog results show that there may be a slight advantage in working in the SWIR region for this particular particle size and density of water-based fog, as the measurements shown in Fig. 4(c) indicate that the operating wavelength of λ = 1550 nm may have slightly lower attenuation under these conditions than the visible band. However, it should be noted that entirely different measurement approaches are used for the λ = 1550 nm and the visible bands.

Computational methods
During each measurement, photons time-tagged by the HydraHarp 400 were streamed to the control computer via a USB 3.0 interface. Each pixel's timing histogram, with 2 ps timing bins, was constructed via software and stored in memory once the scan was complete. Recently there has been great interest in the development of reconstruction algorithms for single-photon data processing. Several state-of-the-art algorithms have demonstrated good reconstruction results of single-photon data in free-space scenarios where the return signal is very low and the background is relatively high [37][38][39][40]. In this work, three different reconstruction algorithms of varying complexity were investigated as potential candidates for successful depth and intensity reconstruction of targets in obscuring media. These three algorithms were chosen as they have been previously demonstrated to be successful in the reconstruction of single-photon data at both long ranges and through scattering surfaces such as camouflage netting [4,16,37,41]. However, this is the first time that they have been applied to scenarios with such high levels of atmospheric particulate scattering.

Pixel-wise cross-correlation
Pixel-wise cross-correlation provides an estimate of depth and intensity for each pixel by calculating the cross-correlation, C, between an acquired timing histogram, y, and the known instrumental response of the system, R, such that: where y t is the timing histogram value at the tth bin and T is the total number of timing bins. Using this method, a time position corresponding to the highest cross-correlation can be found for each pixel, which represents the target's position and is related to the depth measurement Z. Once the relevant time positions (or bin location) are picked up for each pixel, through combining the spatial (i.e. X and Y) information and time-correlated depth measurement (i.e. Z), a depth profile of the scanned scene can be reconstructed. The instrumental response function used for this analysis was acquired using a single-pixel measurement with a 100 second integration time, on the 94% reflectivity region of panel A3 providing a uniform response. This is a simple and computationally inexpensive method commonly used in single-photon three-dimensional image reconstruction [3][4][5]9,16,42,43]. The cross-correlation estimate approximates the maximum likelihood estimate of the when assuming the absence of the background noise and Gaussian statistics. Therefore, this algorithm performs poorly in the presence of high levels of obscurants due to low levels of photon returns from the target in comparison to the background levels. Figure 5 shows two timing histograms taken from a single pixel in measurements of the same target under the same illumination conditions taken in glycol-based smoke at α = 0.08 m −1 (i.e. low level of obscurant) and α = 0.18 m −1 (i.e. high level of obscurant). Figure 5(a) indicates that in clear conditions, the timing bins containing the target return peak are easily discernible, while in the presence of an obscuring media [ Fig. 5 (b)] the target bins may contain a similar number of photons counts to the background level due to high levels of scattering. Since crosscorrelation is only effective at finding the highest peak, this can potentially result in an incorrect estimate. This highlights the need to design advanced processing algorithms to deal with scenarios involving high levels of scatter. Figures 5(c) and 5(d) also show aggregated timing histograms of the target and calibration Set A through 24 meters of glycol-based smoke obtained with cross-correlation. In these aggregated histograms, data from all 10695 pixels in the image are summed and displayed. Figure 5(c) shows that with a lower level of obscurant in the chamber, the return peaks from the target and the calibration panels are very pronounced and have an average background of approximately 5 counts per timing bin. Figure 5(d) shows that when the glycol-based smoke is much denser, the signal-to-noise ratio is reduced, and the average background is increased to approximately 25 counts per timing bin. Satat et al. [21] previously observed a non-uniform histogram background due to scatter at very short range (~1 meter) at 580 nm wavelength, however, a uniform background was found in these medium range (several 10's meters) measurements. Also, the detector gating meant that close ranges to the transceiver (i.e. < 25 meters) were not measured.

RDI-TV algorithm
As previously discussed, the presence of obscuring media leads to an increase in noise in both depth and intensity measurements, and can also lead to missing pixels in the reconstructed image due to low levels of photon returns. In order to improve the quality of the estimated depth and intensity images a second, more sophisticated algorithm was investigated. The Restoration of Depth and Intensity using the Total Variation (RDI-TV) approach has been previously shown to be a good candidate algorithm for the reconstruction of single-photon data which contains multiple surfaces but a single depth layer per-pixel [4,44]. The algorithm aims to both reduce the effect of Poisson noise affecting the observed histograms and reconstruct multiple target surfaces. It achieves this by assuming several approximations, namely: (i) the presence of a single depth layer per pixel, (ii) the absence of background counts, (iii) a Gaussian impulse response of the system, (iv) known positions of the corrupted pixels, and finally (v) that the target lies far from the edge of the observation window. Under these considerations, it is possible to show that the problem reduces to the restoration of two preliminary estimates of depth (D init ) and intensity (I init ) images without using the cumbersome observed histograms. This makes RDI-TV a computationally fast and robust way to analyze single-photon data which contains a single depth layer. The cost function C f , which should be minimized with respect to D and I, is given by: where L(D init , I init ) is the log-likelihood of the Poisson distributed data, D and I are the restored depth and intensity images, and TV(D) and TV(I) are the total variation regularization terms [45]. Halimi et al. [44] contains further details regarding the RDI-TV algorithm.

Unmixing algorithm (UA)
While the RDI-TV algorithm has demonstrated good results on single-photon data with a single surface [4,44], it may not perform as well in the presence of high background levels resulting from the presence of obscurants. In this case, a better solution is the Unmixing Algorithm (denoted UA) proposed by Rapp and Goyal [37], which is considered as the stateof the-art algorithm for data containing one surface per pixel. First, the algorithm aims to unmix the signal counts from those resulting from background by considering a data-driven adaptive super-pixel approach, and subsequently use the isolated signal counts to provide improved depth and reflectivity estimates of the target. These estimates are then further improved by adopting a regularized maximum likelihood approach that accounts for data statistics and spatial correlations. Note that the algorithm assumes: (i) the presence of a single depth layer per pixel, (ii) that the target lies inside the observation window and (iii) some known parameters from a calibration step. In contrast to RDI-TV, the UA algorithm is designed to deal with high background levels and has shown good performance when dealing with complex scenes as indicated in [37]. We invite the reader to consult [37] for further details on the UA algorithm.

M-NR3D algorithm
To deal with high background levels, a solution is to consider the full histogram cube as it contains more information than the preliminary estimates of D init and I init used by RDI-TV. The Multidimensional Nonlocal Reconstruction of 3D (M-NR3D) images method is a new statistical algorithm, which uses the full histogram to restore multilayered data corrupted by high background levels. The approach is based on the minimization of a cost function accounting for the Poisson statistics of the single-photon data, and improves the methodology used in the RDI-TV and UA algorithms by (i) accounting for non-local spatial correlation between reflectivities; (ii) using a local collaborative sparse prior [46] to assume the presence of few number of peaks in each group of local pixels [47][48][49]; (iii) accounting for the correlations between multi-temporal 3D images (3D videos). Note that the algorithm will be denoted by NR3D when performing an independent processing of the data and M-NR3D for a joint processing of the multi-temporal frames. The cost function C f used to restore multitemporal 3D images composed of K frames is given by: where L k (Y k ,X k ) is the log-likelihood of the data, X k is the matrix representing the cloud points of the data after denoising for the kth frame, and φ 1 and φ 2 are two regularization terms promoting our assumptions described in (i), (ii), and (iii). Note that in contrast to RDI-TV and UA that process the frames independently, the new M-NR3D algorithm performs a joint processing to improve the restoration quality. Note also that both RDI-TV and M-NR3D minimization problems are solved using variants of the alternating direction method of multipliers (ADMM) algorithm [50,51], which is a fast algorithm that has shown good results in several applications [52][53][54]. The interested reader is invited to consult [41] for more details on the M-NR3D algorithm.

Reconstruction of depth and intensity images
This section presents results from the trial obtained using the pixel-wise cross-correlation, RDI-TV, UA, and M-NR3D methods discussed in the previous section. The signal-toreconstruction error (SRE) is used in order to obtain a quantitative metric for comparison between the algorithms in each obscurant [41]. The SRE is given by: where d GT is the reference depth profile (or ground-truth image) and d is the reconstructed depth profile, and Np is the total number of pixels in the scan. The ground truth image is taken from data with the highest quality depth reconstruction, with the longest acquisition time available. The SRE values are given in decibels (dB) with a higher SRE representing a better reconstruction.

White canister smoke
A series of measurements were performed with the target placed at a stand-off distance of 24 meters within the obscurant chamber (approximately 41 meters from the transceiver unit). Canisters discharging white smoke were placed at several points within the obscurant chamber and ignited to achieve the desired density while three fans were used to homogenize the smoke over the full length of the chamber. The dispersion rate of the smoke allowed for a 30 second measurement duration repeated every 60 seconds for an overall period of approximately 10 minutes. The scan area was 335 × 270 mm mapped by 115 × 93 pixels which was equivalent to a pixel-to-pixel pitch of approximately 2.9 mm in both X and Y. Figure 6 shows results obtained in both clear conditions and in high density white canister smoke. In order to provide a good visual representation of the polystyrene head, depth data were only selected from a range of 0.2 meters around the target and a subset of pixels containing only target information from the head is shown. Figures 6(a) and 6(d) shows RGB photographs of the target scene at the point of measurement for both clear conditions and in white canister smoke; Figs. 6(b) and 6(e) shows the depth profiles obtained using the pixelwise cross-correlation approach. Figures 6(c) and 6(f) show the intensity maps obtained by summing events over a 200 bin range centered on the highest value in the cross-correlation. The results shown in Figs. 6(b) and 6(e) show that, even at the very first measurement point of 30 seconds, the target is fully discernible using the λ = 1550 nm scanning system. As shown in Fig. 4(a), the attenuation coefficient in the target plane for the very first measurement of 30 seconds was α = 0.10 m −1 for λ = 1550 nm compared to α = 0.5 m −1 in the visible, which is equivalent to 2.4 attenuation lengths and 12.0 attenuation lengths, respectively, between the transceiver and target. As seen in Fig. 6, the long acquisition time of the measurement (approximately 30 seconds) resulted in an image with a far greater amount of photon returns than required for successful reconstruction of the target. For each measurement, both the time from start of scan (known as macro time) and the time between the corresponding start signal and the recorded photon arrival time (micro time) were recorded for each detection event. This meant that shorter durations of measured data could be extracted from each pixel's entire measurement data so that the level of depth reconstruction achievable at shorter acquisition times could be investigated. The resulting depth profiles and SRE values for the target at 24 meters in white canister smoke, when α = 0.10 m −1 , for per-pixel acquisition times of 0.01 ms, 0.05 ms, 0.1 ms, and 3 ms, which correspond to image acquisition times of ~0.11 s, 0.55 s, 1.1 s, and 30 s, respectively are shown in Fig. 7. In each case, reconstructions using the cross-correlation, RDI-TV, UA, and NR3D algorithms are shown where the parameters of each algorithm have been optimized to provide the best performance. The SRE values use a ground truth, which was reconstructed using cross-correlation of the full acquisition time data (i.e. 3 ms per-pixel). , obtained using the pixel-wise cross-correlation (first row), RDI-TV (second row row), UA (third row), and NR3D algorithms (fourth row). These depth profiles were reconstructed from the full data set (3 ms per-pixel) and data with reduced acquisition times of 0.01 ms per-pixel, 0.05 ms per-pixel, and 0.1 ms per-pixel.
As expected, at shorter acquisition times there is a correspondingly lower number of collected photon returns resulting in a decrease in quality of the depth profile reconstruction. At shorter acquisition times, the depth profiles obtained using the pixel-wise cross-correlation algorithm exhibit high levels of noise and have many 'missing pixels' where a depth estimate could not be made. However, with the use of the RDI-TV, UA, and NR3D algorithms we are able to achieve good levels of depth reconstruction, even at acquisition times of approximately 0.55 seconds. Generally, these three algorithms demonstrate superior performance compared with cross-correlation in the sparse photon regime, with the best performance provided by RDI-TV. However, while the RDI-TV algorithm succeeds in reconstructing the missing pixel data, it has a tendency to over smooth the depth profile resulting in a loss of detail. However, both NR3D and UA algorithms achieve good depth reconstruction while retaining the finer details of the target such as the nose, although in the case of very sparse photon returns of 0.01 ms, the UA does not appear as good as NR3D. It should be noted that the average number of photons per pixel in the 0.01 ms per-pixel data is only approximately 0.1 photon per pixel with an estimated signal to background ratio (SBR) of 6.2. The SBR is computed by estimating the average background in a region that does not contain target returns and then deducing the average signal counts accordingly.
The computational time for the algorithms for each of the reconstructed depth profiles shown in Fig. 7 is approximately 10s of seconds for RDI-TV, 1000s of seconds per image for NR3D, and approximately 1 second for UA. For the most photon-starved measurements, RDI-TV appears to give the best reconstruction at a moderate processing time. These results highlight the interest in such image processing algorithms for improving single-photon data taken in the sparse photon regime, and for allowing for more rapid data acquisition.

Glycol-based smoke
Measurements were repeated through 24 meters of glycol-based smoke using the same scan parameters as for canister smoke. Due to the relatively slow dispersion of the glycol-based smoke out of the obscurant chamber, a total period of 14 minutes with a measurement format of 30 seconds scans every 60 seconds was used. The optical attenuation at λ = 1550 nm was found to be considerably less than the attenuation in the visible band, as shown in Fig. 4(b). Figure 8 shows the estimation of the number of attenuation lengths for both λ = 1550 nm and for visible wavelengths, and the cross-correlated depth and intensity profiles of the target measured at several points of measurement set over a duration of approximately 5 minutes. The results show that the target depth and intensity profiles are becoming discernible at approximately N AL (1550 nm) = 4.0. At N AL (1550 nm) = 3.5, the target is fully visible in both the intensity and depth profiles. At this time, the attenuation is equivalent to 11.4 attenuation lengths in the visible. As discussed in section 4, the attenuation coefficient at λ = 1550 nm is lower than in the visible band, clearly demonstrating the advantages of SWIR operation with this type of obscurant. Figure 9 shows 3D point cloud representations and SRE values of the target in glycolbased smoke at three levels of attenuation (N AL (1550 nm) = 3.5, 3.1, and 2.7) reconstructed using cross-correlation, RDI-TV, UA, and M-NR3D. As in Fig. 7, the ground truth image was reconstructed using cross-correlation of the full acquisition time data (i.e. 3 ms per-pixel), taken at the end of the measurement set under clear conditions.  Figure 9 shows that the advantage of processing algorithms such as UA and M-NR3D becomes apparent at very high levels of attenuation in glycol-based smoke with an SRE of 6 dB in comparison to −3 dB for both cross-correlation and RDI-TV. At N AL (1550 nm) = 3.5, both the cross-correlation and the RDI-TV algorithms give poor results with the finer features of the target mostly unrecognizable. The multi-depth M-NR3D algorithm appears to give a slightly more accurate representation of the target than UA at the highest attenuation level, being successful in reconstructing the facial features. However, the M-NR3D algorithm has a much longer computational time of approximately 2500 seconds to jointly process the three images, compared to the faster RDI-TV algorithm, which takes approximately 100 seconds per image, with UA being even faster. In the results shown in Fig. 9, the average number of photons per pixel is 5 for N AL (1550 nm) = 3.5, but with a much lower SBR of only 0.4, due to the much higher scattering background compared to the results shown in Fig. 7.

Water-based fog
This measurement set was taken using a much shorter stand-off distance of 5 meters in waterbased fog due to the high density of the obscurant at the start of the measurement set (visible attenuation coefficient = 0.94 m −1 ). In order to account for the relatively faster dispersion of the water-based fog, the scans were taken for a duration of 30 seconds, with a separation of 30 seconds between scans for a total measurement set period of 8 minutes. A target area of 350 × 275 mm mapped by 117 × 92 pixels was used to achieve a similar image format to the 24 meter measurements. For the 5 meter results the attenuation measurements showed that there appears to be no advantage in operating at λ = 1550 nm in this particular water fog environment when compared to the visible band. It should be noted that different measurement approaches were used in obtaining the attenuation measurements for each wavelength band, which may lead to a possibility of a consistent systematic error.
Due to the very high attenuation levels at λ = 1550 nm, the measurements did not commence until 3 minutes after the start of the measurement set. Figure 10 shows the number of attenuation lengths for the SWIR wavelength, RGB images of the scene taken at the time of the SWIR measurement, and the cross-correlated depth and intensity profiles of the head at several different N AL . The results show that at N AL (1550 nm) = 3.8 the target becomes recognizable in the depth and intensity profiles using simple cross-correlation. At N AL (1550 nm) = 3.1 the target becomes fully discernible. Due to the short range and rapidly changing obscurant density of this measurement, it was difficult to make a reliable calibration in the visible region. However, the 24-meter data indicates that the 1550 nm wavelength will have slightly less attenuation. Further simultaneous measurements in both wavelength bands are required in naturally occurring fog environments.
The 3D point cloud reconstruction and corresponding SRE are shown using the crosscorrelation, RDI-TV, UA, and M-NR3D algorithms in Fig. 11 for three values of attenuation (N AL (1550 nm) = 3.8, 3.1, and 2.8) in water fog. As above, the SRE values were calculated using a ground truth image, which was reconstructed using cross-correlation of the full acquisition time data (i.e. 3 ms per-pixel), taken at the end of the measurement set under clear conditions. As with glycol-based smoke, the cross-correlation and RDI-TV algorithms produce satisfactory results at lower levels of attenuation lengths (as seen in Fig. 11) but are unable to successfully reconstruct the target at higher levels of attenuation. However, the use of the M-NR3D and UA algorithms allows for improved reconstruction of data obtained at up to 3.8 attenuation lengths in water fog. The average photons per pixel for N AL (1550 nm) = 3.8 is 4.6 and the SBR is 0.3, the lowest SBR of images reconstructed for the three obscurants. The M-NR3D algorithm provides improved reconstruction compared with UA for the highest attenuation. As for the case of white canister smoke and glycol-based smoke shown above, the RDI-TV algorithm provides a computational time of approximately 100 seconds to process each image, while the UA algorithm is only ~1 second per image, and the M-NR3D algorithm takes approximately 1300 seconds for the joint processing of the three images. The results shown in Fig. 9 and Fig. 11 demonstrates the potential for the proposed UA and M-NR3D algorithms for the reconstruction of single-photon data in highly scattering environments.

Conclusions and outlook
This paper presents high-resolution depth and intensity profiling of targets in high levels of obscurants for the first time using time-correlated single-photon counting operating in the SWIR region. A time-of-flight scanning imager using the TCSPC technique was used to perform three-dimensional imaging at up to 24 meters through several forms of obscurants including canister smoke, glycol-based smoke, and water-based fog. The results demonstrate that the use of 1550 nm wavelength illumination shows significant benefits over the visible band for both smoke types used and little or no benefit for the case of the water-based fog used in these measurements. In both the visible and SWIR bands, absorption effects are a relatively small contribution to the overall attenuation. At these wavelengths, scattering effects, such as Mie scattering, from particles suspended in the atmosphere dominate [55]. These scattering effects are dependent on the particle size of the fog in comparison to the wavelength of the transmitted beam. Therefore, the use of SWIR wavelengths will prove highly beneficial compared to visible wavelengths for imaging applications in high densities of obscurants where the particle size is small, as in the case of smoke. The measurements presented in this report had initial visibilities in the order of meters, which can be regarded as propagation in high levels of obscurants. However, further studies are necessary to examine a natural fog environment with a range of droplet sizes, in order to examine the imaging scenarios of most interest, i.e. target distances of 100 meters or more, consistent with automotive LiDAR, for example. This paper also presented results from several image processing algorithms which reconstruct depth and intensity images by considering spatial correlations in single-photon data. Each algorithm's performance depends on the considered scenario, where RDI-TV appears to be more suitable for measurements containing a reduced background level. The UA algorithm provides good reconstruction in the presence of scattering background with very fast processing times. The M-NR3D algorithm is more general in the sense that it accounts for the presence of multiple peaks, it jointly processes a sequence of images (i.e., 3D videos) and it performs very good data restoration even in presence of high levels of scattering background. The M-NR3D algorithm will be particularly useful in reconstructing distributed targets composed of several surfaces in the presence of obscurants, and this will be a feature of future work. However, M-NR3D takes the longest computational time of all algorithms examined. The use of such image processing algorithms indicate significant improvements in image reconstruction in the sparse photon regime, which will assist in the move towards imaging at higher levels of atmospheric attenuation and reduced data acquisition time in future work. This paper has also demonstrated the use of the TCSPC technique for depth profiling through obscurants at short acquisition times (less than 1 second for the entire scan) using a single-pixel detector making this method more suitable for a number of depth imaging applications. However, in recent years, large format InGaAs/InP SPAD detector arrays integrated with TCSPC electronics [56][57][58][59] have become of great interest for rapid threedimensional imaging and the use of these detectors in SWIR 3D imaging through obscurants will be a focus of future work. This work has demonstrated the reconstruction depth and reflectivity measurements obtained through water-based fog at up to 3.8 attenuation lengths using the M-NR3D reconstruction algorithm. A limiting factor for these measurements was the optical power level of our pulsed illumination source (~1.5 mW average power). The use of a higher power laser would allow imaging of targets at a higher number of attenuation lengths whilst remaining eye-safe, potentially at more than five attenuation lengths between transceiver and target. Future work will also involve a comparison between the visible spectrum and SWIR wavelengths through high levels of obscurants in outdoor scenarios with high ambient background levels. Further future work will investigate the potential of such detectors for faster acquisition leading to real-time imaging of moving targets through obscurants.