Inferring the most probable maps of underground utilities using Bayesian mapping model

Mapping the Underworld (MTU), a major initiative in the UK, is focused on addressing the social, environmental and economic consequences arising from the inability to locate buried underground utilities (such as pipes and cables) by developing a multi-sensor mobile device. The aim of the MTU device is to locate different types of buried assets in real time with the use of automated data processing techniques and statutory records. The statutory records, even though typically inaccurate and incomplete, provide useful prior information on what is buried under the ground and where. However, the integration of information from multiple sensors (raw data) with these qualitative maps and their visualization is challenging and requires the implementation of robust machine learning/data fusion approaches. An approach for the automated creation of revised maps was developed as a Bayesian Mapping model in this paper by integrating the knowledge extracted from raw sensor data and available statutory records. The combination of statutory records with the hypotheses from sensors was used for an initial estimation of what might be found underground and roughly where. The maps were (re)constructed using automated image segmentation techniques for hypotheses extraction and Bayesian classification techniques for segment-manhole connections. The model, consisting of an image segmentation algorithm and various Bayesian classification techniques (segment recognition and the expectation maximization (EM) algorithm), provided robust performance on various simulated as well as real sites in terms of predicting linear/non-linear segments and constructing refined 2D/3D maps. © 2018 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).


Introduction
The costs associated with street works in the UK are a critical consideration, because the vast majority of utilities are buried beneath roads and must be repaired or (re)installed there (£7b annually) (Mcmahon et al., 2005). The types of utilities buried under the ground are diverse and their number is notoriously large, which makes excavation to upgrade these underground networks a challenging task. In addition, the statutory records of underground networks are typically incomplete and inaccurate, particularly for old street works (Burtwell and E. A., 2004). It is therefore important to develop schemes that detect what is buried underground, associate the detections with the existing records, and thereby reduce costs. A multi-sensor mobile laboratory, MTU (Underworld, 2011), was developed, which consists of multiple sensors capable of deploying several approaches to detect different types of buried infrastructure. The MTU device was designed to assess the feasibility of a range of potential technologies that can be combined into a single device to accurately locate buried pipes and cables. The potential technologies included ground penetrating radar (GPR), low-frequency quasi-static electromagnetic fields (LFEM), passive magnetic fields (PMF) and low-frequency vibro-acoustics (VA), and significant advances have already been made (Royal et al., 2011; Royal Acd et al., 2010).
The location estimation approaches combined by MTU provide significant advantages over other commercially available techniques (Ashdown, n.d.) for detecting a wide variety of utilities, and control trials were carried out at commercial test sites. As a result, the excavations necessary for maintenance and repair can be largely reduced using such a device. A further important task is to use the heterogeneous information from these sensors to build refined maps of buried utilities in real time. However, due to the heterogeneity in the features of utilities and in ground properties, it is challenging to develop a general technique that can assess heterogeneous information and handle the uncertainties associated with this task. The integration of information obtained from the multiple sensors on MTU is of critical importance in order to make sense of the data before providing precise information on a site. The knowledge obtained from different sensors presents itself non-symbolically, i.e. the delivered data is essentially an image representing what the sensor "sees" underground. In contrast, utility records are almost universally represented symbolically, i.e. they are stored in a spatial database as records with a vectorized representation of their spatial position, along with attribute information (such as material and diameter). It is therefore challenging to provide a useful and accurate representation of the data acquired from a variety of sensors, and a data fusion approach consisting of automated techniques for data extraction and integration was imperative.
The map (re)construction model developed in this work is an improvement over (Chen and Cohn, 2011), which was initially designed only for 2D construction of the map under the assumption that it consists of linear segments only. In addition, the data preprocessing for hypotheses extraction in (Chen and Cohn, 2011) was not integrated into a complete model, and it was assumed that the hypotheses had been extracted from GPR images using iterative clustering/classification techniques prior to the data fusion tasks. Simple clustering/classification algorithms for hypotheses extraction, such as k-means or DBSCAN, are restricted in several ways for the asset classification problem when developing real-time maps. For example, the traditional k-means clustering algorithm creates clusters based on the Euclidean distance of each data point to the centroids (initially selected randomly). Also, the number of clusters to be created must be specified in advance in the k-means algorithm. Depending on statutory records to identify the number of segments is not reliable because, even though they provide valuable information, they are inaccurate and may be incomplete. DBSCAN (Sander et al., 1998) also separates clusters based on Euclidean distance, but does not require the desired number of clusters as a prior. However, DBSCAN requires a neighborhood radius to differentiate the clusters, and this radius acts as the criterion that decides the number of clusters. Both approaches therefore rely on the Euclidean distance between data points, which is helpful only in situations where clustering is purely distance based.
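To illustrate the radius dependence of DBSCAN noted above, the following minimal sketch implements a basic DBSCAN in NumPy and applies it to two hypothetical parallel rows of detections. It is not the implementation used in this work; the function names `dbscan` and `n_clusters`, the parameters `eps` and `min_pts`, and the toy coordinates are assumptions introduced only for illustration.

```python
import numpy as np

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN; returns one label per point, -1 marks noise."""
    n = len(points)
    # Pairwise Euclidean distances (adequate for small illustrative inputs).
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    # Each neighbourhood includes the point itself (distance 0).
    neighbours = [np.flatnonzero(dist[i] <= eps) for i in range(n)]
    labels = np.full(n, -1)
    visited = np.zeros(n, dtype=bool)
    cluster = 0
    for i in range(n):
        if visited[i]:
            continue
        visited[i] = True
        if len(neighbours[i]) < min_pts:
            continue  # not a core point; stays noise unless claimed later
        labels[i] = cluster
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # core or border point joins the cluster
            if visited[j]:
                continue
            visited[j] = True
            if len(neighbours[j]) >= min_pts:
                queue.extend(neighbours[j])  # expand only through core points
        cluster += 1
    return labels

def n_clusters(labels):
    """Number of clusters found, ignoring noise."""
    return len(set(labels.tolist()) - {-1})

# Hypothetical detections: two parallel rows 3 m apart, 0.5 m point spacing.
xs = np.arange(0.0, 4.5, 0.5)
row_a = np.column_stack([xs, np.zeros_like(xs)])
row_b = np.column_stack([xs, np.full_like(xs, 3.0)])
points = np.vstack([row_a, row_b])

for eps in (0.4, 0.6, 3.5):
    print(eps, n_clusters(dbscan(points, eps, min_pts=2)))
```

On this toy data the same points yield no clusters, two clusters, or a single merged cluster depending only on the chosen radius, which is why a fixed radius is difficult to justify when the spacing of buried assets is unknown.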
Bayesian data fusion models have been utilized for numerous applications, and there is a large body of literature proposing Bayesian modeling for data fusion and uncertainty management, which provides motivation for the work proposed in this study. To date, Bayesian modeling has been successfully implemented in similar applications, such as seismic/magnetotelluric inversion (Dettmer et al., 2014; Guo et al., 2011), water distribution management, modeling for rock-physics analysis, and gas and buried near-surface utility mapping (Ristić et al., 2017; Ji et al., 2016; Wang and Lu, 2016; Ren et al., 2017; Aleardi et al., 2017; Fernández-Martínez et al., 2013). Among several impactful studies using Bayesian modeling, the combination of multiple data sources with Bayesian data fusion for bedrock tracking has been of significant interest (Fiannacca et al., 2017; Christensen et al., 2015; Oldenborger et al., 2016). These studies proposed automated tracking of bedrock depth and orientation by combining data from different inversion models and borehole data (Christensen et al., 2015), and by utilizing time-domain electromagnetic data (Oldenborger et al., 2016), to systematically handle uncertainties in data of a heterogeneous nature and reconstruct estimated maps of bedrock. An application of the Bayesian data fusion approach to the prediction of water pipe failures was developed by (Oldenborger et al., 2016), with the capability to be integrated with the geographical information system of water resources and to automatically predict pipes at risk of failure. Another application, based on neural networks and pattern recognition, used only ground penetrating radar (GPR) data (images) to train a model on the hyperbolic features of buried objects and to predict the locations and depths of buried solid objects, followed by automatic construction of maps of underground solid objects (pipes and cables) (Ristić et al., 2017; Al-Nuaimy et al., 2000).
It is noted that, in addition to including GPR image analysis as proposed by (Al-Nuaimy et al., 2000), the work proposed in this paper provides wider applicability because it includes the multiple sensors of the MTU device and applies Bayesian models, which (unlike neural networks) are capable of incremental learning as new knowledge is acquired.
In other similar works, Neira and Tardos (2001) developed a model for robust data association in simultaneous vehicle localization and map building. It was an improvement over the gated nearest neighbor (NN) approach (Bar-Shalom, 1987) for tracking problems, as it successfully rejects spurious matches and provides optimal solutions in terms of matching pairs in cluttered environments. The correlation between measurement prediction errors in 2D space in a cluttered environment provides robust data association with an efficient traversal of the solution space. However, the directional errors (linearity) caused mismatching of the segments with manholes when using the hypotheses extracted from the sensors. Bhalerao and Wilson (2001) used a Multi-resolution Fourier Transform (MFT) for capturing the shape and orientation of objects within a given image. Statistical analysis and camera projections have also been used to estimate the locations and orientations of line segments in a 3D image for similar linear segment construction problems (Dong-Min and Dong-Chul, 2009; Chen and Wang, 2010). However, these approaches are limited to a single image of objects and segments, which is used to reconstruct a 3D image. MTU mapping, on the other hand, is a multi-source data fusion approach that integrates information from multiple sources and produces the most probable maps utilizing advanced machine learning/data mining techniques. For linear segment fitting, a significant amount of literature reports the use of different regression models, including the EM algorithm, that can fit efficiently at higher accuracy levels (Ward et al., 2009; Ester et al., 1996; Sanquer et al., 2011; Delicado and Smrekar, 2007; Werman and Keren, 1999; Friedman and Popescu, 2004).
The classification of data samples according to their source, as distinguished by the MTU sensors, is, however, lacking in these approaches, as the algorithms were developed for regression scenarios. In addition, the establishment of manhole-segment connections was not considered, as only general regression was covered.
The Bayesian mapping model is capable of using automated techniques for hypotheses extraction, classification, segment recognition and connection establishment with the associated manholes. We associate a probability distribution with every hypothesis, reflecting possible errors in the measurements (uncertainty due to the fusion of data from multiple sources) and in the hypothesis extraction process. The geographical positions (x, y) and depths (z) of the hypotheses were used as input to the next stage of the mapping system. A variety of Artificial Intelligence (AI) techniques and algorithms were implemented, such as Bayesian Data Fusion (BDF), image segmentation, orthogonal distance hyperbolic fitting, and weighted variation. The algorithms for automated data processing and map (re)construction were developed to give the MTU device a real-time operating capability. A complete use case can be tested using the real-time mapping model, in which hypotheses extraction techniques are combined with iterative connection establishment and visualization techniques. Several simulated as well as real sites were tested, and it was demonstrated that the model is robust in various conditions in which statutory records were unavailable and the sensor readings were sparse. The segments were recognized and noise was removed successfully in various situations, demonstrating the ability of the model to map utilities in complex real-time situations.
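As an illustration of associating a probability distribution with each hypothesis, the sketch below assumes that a statutory-record position and a sensor hypothesis are both Gaussian estimates of the same (x, y, z) point and fuses them by precision weighting. The Gaussian assumption, the function name `fuse_gaussian` and all numerical values are hypothetical and are not taken from the model described in this paper.

```python
import numpy as np

def fuse_gaussian(mean_a, cov_a, mean_b, cov_b):
    """Fuse two independent Gaussian estimates of the same position.

    The fused estimate is the normalized product of the two densities:
    precisions (inverse covariances) add, and the fused mean is the
    precision-weighted average of the two means.
    """
    prec_a = np.linalg.inv(cov_a)
    prec_b = np.linalg.inv(cov_b)
    cov = np.linalg.inv(prec_a + prec_b)
    mean = cov @ (prec_a @ mean_a + prec_b @ mean_b)
    return mean, cov

# Hypothetical statutory record: position (x, y, z) in metres, 1.0 m std dev.
record_mean = np.array([10.0, 5.0, 1.0])
record_cov = np.eye(3) * 1.0 ** 2

# Hypothetical sensor hypothesis: more precise, 0.5 m std dev.
sensor_mean = np.array([10.4, 5.2, 1.2])
sensor_cov = np.eye(3) * 0.5 ** 2

fused_mean, fused_cov = fuse_gaussian(record_mean, record_cov,
                                      sensor_mean, sensor_cov)
print(fused_mean)           # approximately (10.32, 5.16, 1.16)
print(np.diag(fused_cov))   # approximately 0.2 on each axis
```

Under these assumptions the fused position lies closer to the more precise sensor hypothesis, and its variance is smaller than that of either input, which is the qualitative behavior expected when an inaccurate record is used as a prior for sensor data.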

Materials and methods
The model for Bayesian mapping followed the workflow depicted in Fig. 1. The sequential steps in the model workflow were as follows: (1) data preprocessing, (2) segment recognition, and (3) segment-manhole