Towards multispectral, multi-sensor indoor positioning and target identification

A concept and first results of combining multispectral light detection and ranging (LiDAR) with positioning sensors to produce spatially resolved target identification in indoor environments are presented. The aim is to enhance sensor-based indoor localisation with multispectral target identification and mapping. There is a growing need for automatic and mobile mapping and surveillance in buildings and locations where satellite positioning is not available. LiDAR is a common sensor in feature-based simultaneous localisation and mapping. As multispectral LiDARs are emerging and becoming increasingly popular in research applications, multi/hyperspectral point clouds are likely to improve object recognition and enable a new level of autonomous surveillance in the near future. The first results show that a position solution can be obtained using sensors attached to the LiDAR.

Introduction: Indoor positioning for global navigation satellite system (GNSS)-denied environments has been studied increasingly during the past few years [1]. Sensor positioning is based on propagating the known initial position and heading with motion measurements obtained from the sensor. However, sensor measurements suffer from errors, which deteriorate the solution over time. Therefore, fusing sensors with different error characteristics enables the solution to stay accurate for a longer time. This is particularly the case when using low-cost sensors, especially micro-electro-mechanical sensors [2-4].
Indoor positioning methods can be roughly categorised into three approaches: radio frequency fingerprinting, motion-based, and visual [1,5]. As radio fingerprinting requires an existing infrastructure (such as a wireless network or separate identification tags), sensor positioning is more widely applicable in any environment. Inertial sensors, namely accelerometers and gyroscopes forming an inertial measurement unit (IMU), have been used for sensor positioning. IMUs provide translation and heading measurements, but they must be calibrated with some absolute positioning means or fused with motion measurements obtained from different sources, such as visual odometry [4], to improve the quality.
Combining positioning sensors and light detection and ranging (LiDAR) has become a widely used approach for mobile mapping and robotics [4,6]. LiDAR is commonly utilised in mobile laser scanning (MLS). The MLS systems also rely on GNSS, mainly for direct georeferencing of point clouds [7], but the increasing number of autonomous robotic platform applications [4,8] has emphasised the need for better location accuracy. The growing demand for robotic and situational awareness applications in GNSS-denied environments will emphasise the role of multi-sensor indoor positioning [9]. Three-dimensional (3D) detection of features with LiDAR is also common in simultaneous localisation and mapping (SLAM) [6]. SLAM provides infrastructure-free, accurate, and reliable localisation and information on the environment by means of adaptive integration of data from multiple sensors [8,10]. In SLAM, LiDAR is typically combined with other optical and odometry sensors, IMU [4], and GNSS in outdoor applications [11].
The common estimation frameworks to produce a SLAM solution are based on the Kalman filter and the particle filter (PF) [8]. PF was implemented in [2] to integrate motion (heading and translation) measurements from a monocular camera, foot-mounted IMU, sound navigation and ranging (SONAR), and a barometer to produce an accurate and reliable 3D localisation solution for SLAM. The horizontal accuracy was 3.14 m with a standard deviation of 2.82 m. The results could be improved with more accurate error modelling and additional sensors, such as multiple IMUs or visual LiDAR odometry [4].
Iterative closest point (ICP) is a method for estimating the translation and rotation between two point clouds [12]. However, ICP needs a good a priori estimate of the transformation to converge. Also, when the LiDAR is moving, the point clouds suffer from motion distortion, which should be corrected using other measurements. When these challenges are tackled and the two point clouds are matched, the difference between their position and orientation may be used to obtain motion information via LiDAR odometry [4].
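As a minimal sketch of the alignment step at the core of ICP, the following Python example estimates the rotation and translation between two point sets in closed form. It assumes that the point correspondences are already known and it is restricted to 2D for brevity; the function name `rigid_align_2d` and the example coordinates are hypothetical and are not taken from [12]. A full ICP would alternate this step with nearest-neighbour matching, which is where the need for a good initial estimate arises.

```python
import math

def rigid_align_2d(source, target):
    """Least-squares rotation angle and translation mapping source onto target.

    Assumes source[i] corresponds to target[i]; the matching step of ICP is omitted.
    """
    n = len(source)
    # Centroids of both point sets
    sx = sum(p[0] for p in source) / n
    sy = sum(p[1] for p in source) / n
    tx = sum(q[0] for q in target) / n
    ty = sum(q[1] for q in target) / n
    # Accumulate dot and cross products of the centred coordinates
    dot = 0.0
    cross = 0.0
    for (px, py), (qx, qy) in zip(source, target):
        ax, ay = px - sx, py - sy
        bx, by = qx - tx, qy - ty
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    # Rotation angle that maximises the overlap of the centred point sets
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    # Translation that maps the rotated source centroid onto the target centroid
    dx = tx - (c * sx - s * sy)
    dy = ty - (s * sx + c * sy)
    return theta, (dx, dy)

# Hypothetical example: a scan rotated by 10 degrees and shifted by (0.5, -0.2) m
scan = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0), (0.0, 2.0)]
angle = math.radians(10.0)
moved = [(math.cos(angle) * x - math.sin(angle) * y + 0.5,
          math.sin(angle) * x + math.cos(angle) * y - 0.2) for x, y in scan]
theta, shift = rigid_align_2d(scan, moved)
```

With exact correspondences and no noise, the recovered angle and shift reproduce the applied motion; with real scans, the estimate is only as good as the matching and the motion-distortion correction.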
Visual odometry measurements from a monocular camera are often used to solve the challenges related to LiDAR odometry, and to obtain an accurate localisation solution. Monocular cameras have advantages for visual perception compared with stereo cameras; they have a wider field-of-view and provide faster image processing capabilities [13]. However, traditionally stereo cameras have been mainly used, because methods based on monocular cameras usually suffer from the so-called ambiguous scale problem [14]. In our previous studies, we have developed a concept called visual gyroscope and visual odometer [13], which resolves the scale problem and provides absolute translation and rotation information to be used for localisation.
There is a trade-off between accuracy and computing time cost in SLAM solutions, and numerous methods have been introduced to optimise the positioning solution [10,15]. With post-processing, centimetre-level accuracy can be achieved, while a real-time application could be improved to decimetre level at least for feature-rich environments [15].
Laser scanning applications have reached a new level as multiwavelength LiDAR applications have emerged [16-18]. The output from a multispectral LiDAR comprises the point cloud (x, y, z, I), where the intensity I contains values at multiple wavelengths. The multiwavelength aspect has enabled a new level of detail in, e.g. vegetation studies, where hyperspectral sensing is a well-established method for identification and classification of targets and monitoring different plant activity [19]. Active hyperspectral sensing based on supercontinuum lasers has also been applied for long-range target characterisation [20].
In this Letter, we demonstrate the potential of multispectral target identification combined with multi-sensor indoor positioning, to provide one-shot spectral identification and position information for the targets. In our previous paper [21], we extended the target identification from 3D hyperspectral point clouds into industrial targets and built environment. We have shown in our earlier studies that the hyperspectral LiDAR (HSL) is capable of measuring different phenomena in 3D, such as leaf-level moisture in vegetation and its distribution over extended targets [16]. We have also developed algorithms for automatic classification of targets with both spectral and spatial features [22]. The ultimate aim of our research is to combine the HSL target detection with sensor positioning for a real-time SLAM method with improved optical sensing (point-wise target identification using spectral libraries) and autonomous indoor mapping. We also discuss the accuracy of our method and its future prospects of providing autonomous target characterisation from 3D features and spectral data in addition to autonomous mobility.
Fig. 1 The FGI hyperspectral LiDAR setup. Red beam: laser input. R: 2D scanner. M: off-axis parabolic mirror. S: spectrograph. A: photodiode array
ELECTRONICS LETTERS 20th July 2017 Vol. 53 No. 15 pp. 1008-1011
Data and methods: The Finnish Geospatial Research Institute Hyperspectral LiDAR (FGI HSL) is a prototype multi-wavelength laser scanner [16], with a supercontinuum laser light source (420-2400 nm, 41 mW average optical power, 5 kHz pulse rate). The operation principle is the same as in a monochromatic pulse-based terrestrial LiDAR, but the output point cloud (x, y, z, I) contains the intensity I as a function of wavelength, i.e. an eight-channel spectrum (500-1000 nm) is associated with each point (x, y, z). The range measurement is based on the time-of-flight of the reflected laser waveform. An off-axis parabolic mirror is used as a primary optic to gather the returning laser pulses. The detector consists of a spectrograph (Specim Imspector V10), placed in front of a 16-element avalanche photodiode (APD) array. A high-speed (1 ns) digitiser enables data storage at eight wavelength bands. Thus, the detector system is multispectral, but the wavelength channels can be selected by adjusting the spectrograph position with respect to the APD array. A monochromator (Oriel, Cornerstone 74125) was used to calibrate the spectral responses of the APD elements. The scanning over the target was performed with a 2D scanner to produce a point cloud. See Fig. 1 and [16] for more details on the instrument and data processing.
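The time-of-flight principle mentioned above can be illustrated with a short Python sketch. It only shows the basic relation between round-trip time and range, r = c t / 2; the actual waveform processing of the instrument is described in [16] and is not reproduced here. The function name `tof_range` and the 40 ns echo delay are hypothetical.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s

def tof_range(round_trip_s):
    """Range in metres for a given round-trip time: the pulse travels out and back."""
    return C * round_trip_s / 2.0

# Range covered by one digitiser sample at 1 ns sampling (about 0.15 m)
sample_interval_s = 1e-9
range_per_sample = tof_range(sample_interval_s)

# Hypothetical echo arriving 40 ns after the outgoing pulse (about 6 m)
echo_delay_s = 40e-9
target_range = tof_range(echo_delay_s)
```

One sample at 1 ns therefore corresponds to roughly 15 cm in range, which suggests why the shape of the returning waveform, and not only the sample index of its peak, matters for range accuracy.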
Three different targets were investigated (see also [21]): two were cardboard samples, one of which was sprayed wet with water, and the third was a wooden panel with some mould on it. The targets were placed hanging on a wire in the middle of the area to be scanned. The HSL wavelength channels in this Letter were 536, 589, 634, 688, 741, 793, 848, and 951 nm. We scanned the entire corner of the laboratory to produce a point cloud (Fig. 2), which was processed with MATLAB R2013a software (The MathWorks, Inc.). The samples were manually cropped from the point cloud to obtain the mean backscattered reflectance of all the echoes from each target. In further applications of this method, it is possible to replace the manual identification with algorithms developed for HSL data for automatic target identification [22,23].
The multi-sensor 3D indoor positioning solution in this experiment is computed by fusing the measurements obtained using a monocular camera, IMU, a barometer, and SONAR. The goal of our research is to develop a hyperspectral multi-sensor SLAM solution, providing improved accuracy and reliability for localisation.
Heading and velocity may be computed from the IMU measurements, and when the initial position and orientation are known, a continuous position solution is obtained by propagating the measurements in time. However, an IMU alone is not sufficient for accurate positioning and therefore the measurements are fused with visual odometry, which also provides heading and velocity measurements. With a special configuration of the camera, namely knowing its height and being able to compute its orientation using image processing means [13], absolute translation measurements may be obtained. Visual odometry does not provide accurate vertical motion information and therefore a barometer will be used for obtaining height measurements. A barometer provides accurate height information by measuring changes in air pressure. However, the ambient pressure indoors may change significantly due to opening a window or due to air conditioning, and therefore changes in height measured by a SONAR sensor pointing down to the floor will be used to evaluate the reliability of the obtained height measurements. More details of the measurement are given in [2].
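The propagation of a known initial position and heading with motion measurements can be sketched in Python as follows. This is a simplified 2D illustration under stated assumptions, not the implementation of [2]: each measurement is taken to be a heading change and a travelled distance, heading is measured counterclockwise from the x axis, and the function name `propagate`, the step values, and the 2 degree heading bias are hypothetical.

```python
import math

def propagate(position, heading_rad, steps):
    """Dead reckoning: accumulate (heading_change_rad, distance_m) measurements."""
    x, y = position
    track = [(x, y)]
    for dpsi, dist in steps:
        heading_rad += dpsi                 # update heading first
        x += dist * math.cos(heading_rad)   # then move along the new heading
        y += dist * math.sin(heading_rad)
        track.append((x, y))
    return track, heading_rad

# Error-free measurements: 1 m ahead, turn left 90 degrees, then 2 x 1 m
ideal = [(0.0, 1.0), (math.pi / 2, 1.0), (0.0, 1.0)]
track, _ = propagate((0.0, 0.0), 0.0, ideal)

# The same walk with a constant 2 degree heading bias on every measurement
bias = math.radians(2.0)
biased = [(dpsi + bias, dist) for dpsi, dist in ideal]
drifted, _ = propagate((0.0, 0.0), 0.0, biased)
drift_m = math.hypot(drifted[-1][0] - track[-1][0],
                     drifted[-1][1] - track[-1][1])
```

Because each heading error is carried into every later step, the position error accumulates along the route, which is the motivation for fusing sensors with different error characteristics.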
Particle filtering [24] was used here for fusing the heading and translation measurements from a monocular camera and the IMU, a barometer, and SONAR.
The automatic target identification algorithms developed for the HSL data are based on 3D features extracted from the point cloud, combined with spectral identification using spectral correlation mapping [23] or spectral indices [22]. As an example of spectral indices in identifying target characteristics, we computed two different water indices. The first is the water concentration index WI [25], which is based on a water absorption band at 970 nm and a reference wavelength (900 nm). As 900 and 970 nm were not available in this measurement, we used 848 and 951 nm instead to obtain WI = 0.96 for the dry and WI = 0.94 for the wet sample, using the average spectra plotted in Fig. 4. We also compared the normalised water index (NWI) [26]. Again, we used 951 and 848 nm instead and obtained NWI = 0.02 for the dry and NWI = 0.33 for the wet sample, using the values averaged over the entire sample. However, the NWI values show some variation over the targets, which is seen in the plotted extract of the point cloud in Fig. 5, showing the NWI for both cardboard samples. Nevertheless, there is a visible difference between the wet and dry cardboard samples.
The fused position solution was computed by navigating from outdoors to indoors into the room where the HSL sensor was located. The navigation trajectory is presented in Fig. 6. The starting position and the attitude were obtained with a Novatel SPAN reference system, including a GPS receiver with a Honeywell HG1700 AG58 tactical grade IMU. After initialising the system, the SPAN measurements were used only as a reference; the results presented below were obtained by comparing the fused position solution with the one computed from the reference system measurements.
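For the water indices described earlier in this section, a two-band computation can be sketched in Python. The index equations are not reproduced in this text, so the forms below are assumptions: WI is taken as the ratio of the absorption-band reflectance to the reference-band reflectance, and NWI as the normalised difference of the two, oriented so that WI decreases and NWI increases with moisture, as in the values reported above. The definitions in [25, 26] may be oriented differently. The function names and the reflectance values are hypothetical and are not measured data.

```python
def water_index(r_abs, r_ref):
    """Assumed ratio form: absorption-band over reference-band reflectance."""
    return r_abs / r_ref

def normalised_water_index(r_abs, r_ref):
    """Assumed normalised-difference form of the same two bands."""
    return (r_ref - r_abs) / (r_ref + r_abs)

# Hypothetical mean reflectances keyed by HSL channel wavelength in nm;
# 848 nm stands in for the 900 nm reference, 951 nm for the 970 nm water band.
dry = {848: 0.50, 951: 0.48}
wet = {848: 0.40, 951: 0.30}

wi_dry = water_index(dry[951], dry[848])
wi_wet = water_index(wet[951], wet[848])
nwi_dry = normalised_water_index(dry[951], dry[848])
nwi_wet = normalised_water_index(wet[951], wet[848])
```

Applied point by point instead of to a mean spectrum, the same functions would give a per-point index map of the kind plotted for the cardboard samples.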
An Osmium MIMU22BT foot-mounted multiple IMU and a GoPro Hero 3 action camera were used for horizontal positioning, and the barometer of an XSENS MTi-G-700 inertial navigation system and an HRUSB-MaxSonar SONAR for vertical positioning. The positioning experiment was carried out at the premises of the FGI in Masala. The route walked was ∼200 m and contained both outdoor and indoor parts, also including features that were challenging for both visual odometry and inertial sensing, such as spiral stairs. The distance root mean squared (DRMS) error of the computed horizontal position solution for the route was 3.4 m. For comparison, we computed the horizontal position solution using an IMU only, and the resulting DRMS was almost 6 m. Fig. 7 shows the horizontal positioning solution for fused (blue) and reference (green) solutions. The mean error for the fused vertical position solution was 0.9 m, with a standard deviation of 0.6 m, whereas the IMU-only solution gave 4.2 and 2.1 m, respectively. Fig. 8 shows the vertical positioning for fused (blue), IMU only (red), and reference (green) solutions.
Discussion: Our results are similar (in terms of error levels) to those from non-GNSS positioning: Fourati [5] obtained a 5 m position drift in 1100 m using non-GPS pedestrian navigation with a foot-mounted IMU. The results were of the same order as in [27], where the mean error was 4.87 m for a set of half-an-hour walks, where the route varied from rectangular to random, and the step detection method was calibrated before each test. In this Letter, the route was challenging and contained height changes, such as spiral stairs, which complicated the measurement. Our method also aims at seamless outdoor-indoor navigation, which reduces the temperature stabilisation time for the sensors.
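The horizontal and vertical error figures quoted above can be computed from time-matched position pairs along the following lines. This Python sketch assumes that DRMS is the square root of the mean squared horizontal distance between the fused and reference positions, and that the vertical statistics are the mean and population standard deviation of the absolute height error; the function names and the sample coordinates are hypothetical and are not the data of this experiment.

```python
import math

def drms(estimated, reference):
    """Root of the mean squared horizontal distance between matched positions."""
    if len(estimated) != len(reference):
        raise ValueError("position lists must be time-matched and equally long")
    squared = [(ex - rx) ** 2 + (ey - ry) ** 2
               for (ex, ey), (rx, ry) in zip(estimated, reference)]
    return math.sqrt(sum(squared) / len(squared))

def mean_and_std(values):
    """Mean and population standard deviation of a list of errors."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(variance)

# Hypothetical horizontal positions in metres (fused vs. reference)
reference_xy = [(0.0, 0.0), (10.0, 0.0), (20.0, 0.0)]
fused_xy = [(0.0, 0.0), (10.0, 3.0), (20.0, 4.0)]
horizontal_drms = drms(fused_xy, reference_xy)

# Hypothetical heights in metres (fused vs. reference)
reference_h = [0.0, 3.0, 6.0]
fused_h = [0.5, 4.0, 7.5]
height_errors = [abs(f - r) for f, r in zip(fused_h, reference_h)]
mean_err, std_err = mean_and_std(height_errors)
```

Because DRMS squares the errors before averaging, a few large deviations, for example on the spiral stairs, weigh more heavily than they would in a plain mean error.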

Conclusion: The clear difference in the NWI for wet and dry samples indicates the potential for using the NWI and other spectral indices in automatic target detection algorithms. In the case of moisture-based detection, the water absorption is stronger at wavelengths further in the near-infrared (cf. [20]), which means that in future implementations of a hyperspectral instrument for indoor mapping, wavelengths >1000 nm should also be considered.
We also presented the results for infrastructure-free indoor positioning to obtain the position of the HSL and hence the targets. The accuracy of the fused position solution was already feasible for most indoor positioning applications. However, in the future we will fuse the sensor positioning algorithms with measurements obtained from the HSL for an accurate SLAM solution. We anticipate the SLAM localisation accuracy to improve to decimetre level. This Letter is the first step towards spatially resolved target identification and mapping. Although these results are preliminary, they show the potential of using multispectral-LiDAR-enhanced indoor mapping in localising different targets and phenomena, such as humidity or mould in building structures. In future, this will enable the use of indoor SLAM not only for navigation of people or autonomous vehicles but also for autonomous surveillance in construction or security applications.

Fig. 2 Targets hanging on a wire in the scanned area. Wet and dry cardboard samples are on the left and in the middle, respectively (left). HSL point cloud of the room corner, showing the targets (right)

Fig. 3 Targets cropped from the original point cloud. Intensity for each point is plotted in HSL channel 4: 720 nm (left) and 6: 818 nm (right). Dry and wet cardboard samples are on the left and in the middle, respectively (cf. Fig. 2), while the wood sample is on the right