Elevation Angle Estimations of Wide-Beam Acoustic Sonar Measurements for Autonomous Underwater Karst Exploration

This paper proposes a solution for merging the measurements from two perpendicular profiling sonars with different beam-widths, in the context of underwater karst (cave) exploration and mapping. This work is a key step towards the development of a full 6D pose SLAM framework adapted to karst aquifer, where potential water turbidity disqualifies vision-based methods, hence relying on acoustic sonar measurements. Those environments have complex geometries which require 3D sensing. Wide-beam sonars are mandatory to cover previously seen surfaces but do not provide 3D measurements as the elevation angles are unknown. The approach proposed in this paper leverages the narrow-beam sonar measurements to estimate local karst surface with Gaussian process regression. The estimated surface is then further exploited to infer scaled-beta distributions of elevation angles from a wide-beam sonar. The pertinence of the method was validated through experiments on simulated environments. As a result, this approach allows one to benefit from the high coverage provided by wide-beam sonars without the drawback of loosing 3D information.


Introduction
The context of this paper is the development of robotic systems to explore karstic environments. Karsts are networks of underground natural conduits resulting from the dissolution of soluble rocks, limestone, dolomite and gypsum that drain groundwater on large scales. This is illustrated in Figure 1. Karst aquifers are of upmost importance as they are the main sources of drinking water for millions of people worldwide. They are gigantic tanks of renewable fresh water, but their exploitation requires deeper investigation of their characteristics: location, geomorphology and dynamics. Active groundwater management is a major issue for public authorities involved in the prospection, protection and management of the groundwater resource and hydrogeologic risk prevention. In order to have a better understanding of karst aquifers, it is essential to know the geometry of the flow paths, which may be difficult to acquire. Currently, this is done by cave divers with a limited range due to physiological constraints. An autonomous robotic solution would provide the capacity to go further and deeper in the karst conduit while acquiring dense information on the environment.
However, there are still many scientific and technical challenges to overcome before reaching this goal. This paper is related to the robot localization problem in a SLAM framework. Recently, a lot of studies have focused on vision-based localization approaches. Although they can provide accurate and dense measurements, they are subject to the water conditions and the appearance of the environment. Thus, for extensive exploration of an unknown environment, we need to complement it with robust sensors. For terrestrial robots, one can use a scan matching technique such as ICP (iterative closest point) using LIDAR [1]. However, in underwater robotics, LIDAR requires water transparency, which is not guaranteed in a karstic environment. Then, the solution relies on acoustic sonars. This has two major drawbacks: • As sonar deals with acoustic waves, the time-of-flight is always higher than for a laser. Moreover, mechanical pencil-beam sonars are slow, and unlike LIDAR, for which full scan are instantaneous, they take several seconds. • A sonar beam can also have non-negligible beam width which leads to high uncertainty on the 3D positions of the observations.
Our final objective was then to develop a variant of ICP adapted to those constraints. In particular, that requires dealing with the 3D estimation of sonar measures with non-negligible beam width. In our setup, similarly to [2][3][4], the robot was mounted with two mechanically scanning sonar systems (MSIS). A narrow-beam sonar is mounted vertically while the another wide-beam sonar is mounted horizontally. In [4], the vertical sonar is only used for 3D reconstruction of the environment while the horizontal sonar is used to estimate the 2D robot motion using a probabilistic ICP approach. We introduce in this paper a new approach for merging both sonars' information to estimate the 3D positions of points in the environment measured with the wide-beam sonar. Indeed, the advantage of a wide acoustic beam is to obtain a large coverage of the environment, increasing the surface covered by two successive full 360 • scans. Obviously, it comes at the price of having a high uncertainty in the true observed 3D point. The idea was to use the narrow vertical sonar to learn a local probabilistic model of the surface with Gaussian process regression. This model was then leveraged to estimate the probabilistic distribution of each horizontal sonar measurement.  [5]).
The first contribution of this paper is to propose a 3D Gaussian process regression adapted to karst environment and more generally applicable to approximately elliptic cylindrical shaped surfaces. We analyzed the results obtained with different kernels to better comprehend their effects on the estimated surfaces. The second contribution is a principled approach to estimating the scaled-beta distribution of elevation angles from a wide-beam sonar by exploiting the previously estimated environment. The pertinence of the method was quantitatively assessed through simulations wherein the estimated distributions could be evaluated against the environment ground truth.
The structure of the paper is as follows. Section 2 gives an overview on related works in underwater robotics. Our robotic system and related definitions used throughout the paper are described in Section 3. Section 4 is the main part of this paper in which we develop our approach. Finally, Section 5 describes evaluations of our work in simulated environments. The conclusion and future working directions wrap up in Section 6.

Related Works
In the last few decades, a wide range of simultaneous localization and mapping (SLAM) techniques have been developed [1,6]. In particular, a large focus has been put in the robotic field on terrestrial and aerial applications based on camera sensors [7,8]. Some efforts were also made to use optical sensors for underwater applications [9][10][11], but were rather limited as they were highly dependent on water conditions (turbidity, luminosity). Hence we cannot guarantee the robot localization in an unknown environment based solely on vision sensors. However, those approaches can be used when applicable to complement more robust sonar-based methods.
Thus, a large part of the underwater SLAM literature is based on acoustic sonars. References [12] and [13] proposed a pose-graph SLAM approach using a dual-frequency identification sonar (DIDSON) for the inspection of underwater structures. The advantage of their approach is to compensate for the low coverage of their sonar by sweeping it in a stationary position. However, this is not applicable in the context of karst exploration where the robot moves while sensing the environment. Closer to our application, Fairfield et al. [14,15] first developed a SLAM approach for cenote (underwater vertical tunnel) exploration. It is based on a Rao-Blackwellized particule filter (RBPF) with a mapping is represented by a 3D evidence grid in an octree structure. They used an array of 54 pencil-beam sonars placed around their vehicle in order to map the environment and recognize previously mapped region for SLAM purposes. Rahman et al. [10] proposed a system for underwater cave exploration. They notably leverage profiling sonar, and inertial and depth sensors to strengthen their approach based on stereo camera vision. However, the system is not designed to work in degraded mode when vision is not available.
Palomer et al. [3] used an extended Kalman filter (EKF) SLAM framework for seafloor mapping. They used a multi-beam echosounder and the resulting scans were matched using probabilistic ICP [16], as proposed by [2] in a similar context. While this is pertinent for bathymetry applications, we cannot fix a multibeam echosounder in a fixed direction for the exploration of a karst aquifer. Similarly, Burguera [4] developed a method for registering data from mechanically scanned imaging sonar (MSIS). Elevation angles from beam measurements and thus 3D data are obtained through the assumption that the robot moves parallel to the seafloor. Obviously we cannot make the same assumption in our context. Chen et al. [17] also recently proposed a Rao-Blackwellized particle filter (RBPF) SLAM framework using MSIS for the same application. They considered a beam enpoint model to compute the likelihood of an observation which is classical in the occupancy grid approach. They focus on the 2D localization of the robot and the lost of 3D information from the wide-beam sonar was ignored.
Unlike previous works which focused either on inspection or bathymetry in open space, we are interested in the exploration and mapping of confined underwater environments (karstic caves). Mallios et al. [18,19] presented experimental results of the exploration of an underwater cave based on their earlier work [2]. They also provided a dataset from this experiment. Thanks to the MSIS vertical beam width and by supposing a quasi-translational displacement between two full 360 • scans, they could restrict the estimation to a 2D problem. Depth, pitch and roll absolute values were directly read from the precision and the IMU sensor. Our work was inspired by this approach and aimed at considering the full 3D problem. For instance, this approach cannot deal with L-shaped conduits where measured points cannot be consider coplanar. Furthermore, even in coplanar cases, it is not theoretically sound to separate the rigid transformation estimation to SE(2) and then set the other values (depth, pitch and roll) with the sensor's data. The problem should be cast as a constrained optimization on SE(3). Finally, Martins et al. [20] developed a robotic system called UX-1 under the European project UNEXMIN designed to explore flooded, abandoned mines. It is equipped with a camera, a four-customized-laser-based structured light system and multibeam profiler sonar. They extract natural landmarks from the different sensors which are then integrated into a graph-based SLAM framework. While their context was similar to ours, they mainly relied on the fusion of vision and a structured light system. Note that while we focus solely on sonar sensors in this paper, we ultimately aim at complementing already existing methods.
To the best of our knowledge, karstic exploration by AUV remains rather unexplored territory. The main difficulty is that the karst can be composed with narrow galleries or large caves. In particular, we need MSIS sonar with a wide beam to increase the chance that a previously mapped region will be encountered when revisited. Meanwhile, we also need 3D data from the sonar as we cannot restrict the problem to 2D here. Indeed, in a karstic environment, the robot would have to follow non-planar trajectories. Furthermore, it gives the possibility to further improve the relative displacement estimation by considering a 3D ICP constrained in depth, pitch and roll. This means that we have to estimate the elevation angle from the sonar measurements. It is the same problem expressed recently by Wang et al. [21]. Their work was similar to [3] but improved on it by searching to estimate the sonar elevation angles which were assumed null in the previous works. They did so by directly tracking features in the multibeam sonar images. They also estimated the terrain height by modeling it with Gaussian processes directly integrated in their factor graph SLAM. Note that in their case, the elevation angle was indirectly obtained through the terrain height estimate. In our case, we used MSIS (single beam) sonar. Besides, due to the karstic environment, we had to fully model the environment. The assumptions, such as elevation angle with null mean, are not applicable in our case.

Notation
We define here the notation used throughout this paper. Vectors are represented in bold lower case (e.g., x) and matrices in upper case (e.g., M). Integer intervals are denoted a, b . If X is a random variable, X ∼ q(λ) means that X follows a law defined by the probability density function (pdf) q with parameters λ. With an abuse of notation, we denote as P(X = x) = P(x) the probability of X taking the value x (in the continuous case , P(X = x) is always zero (null set). In this case, P(X = x) = q(x|λ) where q is its pdf.). We also denoteX the estimator of a random variable X.
The operator ⊕ (resp. ) represents the pose-pose and pose-point composition (resp. inverse composition) operator [22]. We use the notation x |{R} to express that a vector x is expressed in the basis (frame) {R}.
Let M be a d-manifold embedded in R n . We denote T x M the tangent plane (d-dimensional vector space) at x ∈ M. In particular, if M is a Riemaniann manifold, we denote < ., . > x the inner product on T x M. SE(3) is the special Euclidean group representing the rigid transformations of space. It is a Lie group and we denote se(3) its Lie algebra. We denote Euler angles [ψ, χ, φ] T in yaw-pitch-roll convention. R(q) (resp. R(ψ, χ, φ)) is the rotation matrix corresponding to a quaternion q (resp. the Euler angles [ψ, χ, φ] T ). When needed, we also use the Matlab notation x(i : j).

System Description
In the sequel we will consider the underwater robotic system Telemaque, depicted in Figure 2a. Telemaque is equipped with 8 thrusters, affording the robot with omni-directional capability. A complete sensor suite (9-axis IMU and DVL) allows for computing dead-reckoning navigation. Two profiling sonars horizontally and vertically scan the environment and will be used for SLAM. As a preliminary step, the merging between the 2 sonar measurements is performed. This is the subject of this paper. Equipped with those frames definitions, let: The robot is equipped with navigation sensors (IMU and DVL) which allow for performing dead-reckoning navigation and provide, regularly, an estimation of the global system pose η. It is assumed to follow a normal Gaussian distribution, i.e., η ∼ N (η, Σ η ), where N (η, Σ η ) stands for a Gaussian distribution of mean η and covariance matrix Σ η . We call discretized trajectory the set of system pose H = {η i } , i ∈ 1, n r regularly obtained by dead-reckoning along its n r first poses.

Sonars and Measurements
The two sonars embed an acoustic rotating head and are mounted on the robot according to Figure 3 with poses in {B} denoted p H and p V . We denote Θ H (resp. Θ V ) the vertical aperture and Ψ H (resp. Ψ V ) the horizontal aperture for the horizontal (resp. vertical) sonar.
Due to those apertures, a single beam from those sonars can correspond to several echos with different ranges, as illustrated in Figure 4. In our case, according to the specification of the sonar used, we consider Ψ H = Ψ V = Θ V = 0. In other words, only the vertical aperture of the horizontal sonar is taken in account. As a corollary, a single beam from the vertical sonar provides a unique measure expressed as spherical coordinates in its local frame {S V } and acquired from the robot pose η i (in fact, it is slightly different from the classic spherical coordinates in the θ definition. Here it is defined as the oriented angle between the XY-plane and the measured ray. See Figure 3.  For the horizontal sonar, due to the vertical aperture, a single insonification can provide n different echos corresponding to n different ranges ρ j with unknown elevation angles θ j , j ∈ 1, n . Hence the i-th measurement of the horizontal sonar is represented by corresponds to an observed point on the environment surface. Note that the angle ψ h i corresponding to the sonar head rotation angle is the same for the n points coming from the i-th beam.
Similarly to [4], only the horizontal sonar can be used to estimate the relative poses using a variant of ICP. Thus we define a complete scan S = {M h , M v } when the horizontal scan has completed a full 360 • measurement. M h = M h i , i ∈ 1, n h is the set of measurements from the horizontal sonar during a complete scan. Similarly, we define M v = m v k , k ∈ 1, n v the set of measurements from the vertical sonar obtained during the same interval of time.
Note that in [4] the vertical sonar is only used to add points to the reconstructed 3D environment. In this work, the vertical sonar is also used to estimate the elevation angles from the horizontal sonar.
As the dead-reckoning navigation and both sonar measurements are asynchronous, the i-th sonar measurements M h i , M v i are not acquired simultaneously at the i-th robot pose η i . We then define the following mapping linking the sonar measurements to the robot pose at which they have been acquired. We denote σ h : 1, n h → 1, n r (resp. σ v : 1, n v → 1, n r ) the function which returns the pose index at which a horizontal (resp. vertical) sonar measurement has been made. This means for example that the i-th horizontal measurement M h i has been acquired at the robot pose η σ h (i) . We also define the functions g h and g v which maps local spherical coordinates in the horizontal (resp. vertical) sonar frame to the Cartesian coordinates in the global frame {0}: where Ω is the environment surface. All the definitions are illustrated in Figure 2.

Gaussian Process Regression for Surface Estimation
A Gaussian process (GP) [23] is a set of random variables such that every finite subset of those variables are normally distributed. A GP regression is thus a non-parametric method for function regression. In our case, it is used to model the environment surface through the relation between the vertical sonar output (range ρ v ) and inputs (rotation angle ψ v and curvilinear abscissa s along the robot's trajectory as explained in the Section 4.1): where is the observation noise. Let f p denote the prior function which models our prior knowledge of the environment. The definition of this function in our case will be discussed in Section 4.2.1. Gaussian processes involve kernel functions which measure the similarity between data points. A kernel is simply defined as a function mapping a pair of inputs x, x into R. In the case of Gaussian processes, the notion of similarity is defined by the covariance. Hence we are interested in symmetric positive definite kernels. More details on Gaussian processes and kernel properties can be found in [23]. The choice of the kernel has a great incidence on the training result. The radial basis function (RBF) kernel is generally the default kernel used in GP: where σ 2 p is the output length scale and l 2 is the input length scale. It gives smooth results which are appropriate for smooth surface but not for rougher one. This kernel is member of the broader Matern class of covariance functions K M ν [24]. Those kernels depend on a positive parameter ν which controls the smoothness of the function. Only values of the form ν = p + 1 2 , p ∈ N are considered, as it allows one to considerably simplify the general expression of Matern kernel. Besides, values above 7 2 are ignored as they give results almost identical to the case ν = +∞ (see [23], pp. 84-85). In particular, the case ν = 1 2 corresponds to the squared exponential kernel l and the case ν = +∞ gives the RBF as in Equation (5), e.g., K RBF = K M +∞ . A trade-off between those two values is obtained for ν = 3 2 as also explained in [25]. The expression of the Matern kernel for ν = 3 2 is given by All the hyperparameters l, σ p and the observation noise variance σ n (Equation (4)) can be set manually and/or obtained by maximum likelihood estimation (MLE) on a training data set. Note that products and sums of kernels are also kernels. This allows one to build customized kernels for a specific problem based on basic kernels such as those presented in this section.
The original GP regression problem considers the full set of training input and considers them as noise-free. Some recent work [26][27][28] proposes extensions for sparse GP regression and/or taking into account input uncertainties. Here, as we are only locally modeling the surface through the vertical sonar measurements, our training data are relatively few; thus, a sparse formulation is not mandatory. However, as our robot accumulates position uncertainties (due to dead-reckoning drift) during the scan acquisition, taking in account input uncertainties could significantly improve our estimation. In a first step, we currently assume noise-free inputs.

Environment Model
We model the surface of the environment based on virtual vertical sonar measurements. More precisely, a 3D point on the surface is parameterized by the 6D pose of the robot and the vertical sonar rotation angle at which the point could have been seen by the vertical sonar. Formally, the surface is defined as a function which maps a 6D pose η and a scan rotation angle ψ v to an estimated distance ρ v : This means that ρ v is the distance expressed in the vertical sonar plan (XY-plan in {S V }) that would have been acquired for a measurement at the robot pose η and with the sonar rotation angle ψ v . We cannot use here a terrain model of the form z = f (x, y) like in [21,29] as it is a non-functional relation in our karstic context.
As the robot trajectory is a 1D submanifold of SE (3), we can parameterize it with a simple scalar u ∈ R. We recall that kernels measure similarity between two inputs (cf. Section 3.4). If we consider a kernel K(., .), we would like to have The first idea is to use the time t to describe the position of the trajectory manifold, as done in [25]. However, in this case Equation (8) is not verified. Indeed, the kernel length scale l as in (5) in the Gaussian process regression will be depending on time and not on space. The problem can be illustrated by considering an immobile robot at two instants t 1 and t 2 such that t 2 t 1 and To avoid this, we use the curvilinear abscissa as an input parameter based on the length of the trajectory in SE (3). We use the left-invariant Riemannian metric induced by the following point-independant Riemannian metric on GA(3) (general affine group) ( [30], Section 3.2) where e is the identity element of SE (3), T A SE(3) the tangent plane at A and s u , s v ∈ se (3).
The length of a curve η(t) : R → SE(3) between two instants t 1 and t 2 is then given by The curvilinear abscissa s is simply given by In our work, we consider a discrete trajectory regularly updated from dead-reckoning fusion algorithms, fed with DVL and IMU measurements. Note that we only consider the trajectory during one full scan acquisition. The full trajectory estimation in a graph SLAM framework is out of the scope of this paper. Given the discrete trajectory H = {η i }, we approximate the length as follows.
∀i ∈ 2, n r , where log is the logarithm map of SE(3) and T k ∈ GA (3) is the transformation matrix equivalent to the pose η k . Similarly to [25,29], our objective is to estimate f by training a Gaussian process on the vertical sonar measurements. The learned function is then used to estimate the most likely angles θ h ij inside the horizontal sonar beams measurements.

Algorithm Description
In this section, we provide a detailed description of our algorithm which is summarized in Algorithm 1.
First, we compute the curvilinear abscissa for each pose of the trajectory based on Equation (13) (line 1). The surface prior f p is then estimated by fitting an elliptic cylinder to the vertical sonar points (line 2). The Gaussian process (GP) modeling the environment surface is trained using the vertical sonar measurements and their corresponding curvilinear abscissa (line 3). Once done, for each horizontal sonar data, we estimate the most likely elevation angles θ h ij (line 5) and compute their corresponding uncertainties (line 7). In the following, we detail the process for each step.

Environment Prior f p
In this section, we explain how our environment prior f p is defined. Based on typical karstic environment, the natural choice is to use an elliptic cylinder as a prior shape. For a given pose η on the trajectory (equivalently its curvilinear abscissa) and a given sonar rotation angle ψ v , f p (η, ψ v ) returns the distance to the prior shape.
In order to fit this primitive on the vertical sonar measurements, we follow an approach similar to [31]. First, as for any non-linear optimization approach, we need a good initial shape. It is obtained by computing robust principal component analysis (RPCA) [32] on the vertical sonar point cloud. It provides three principals vectors which, in the case of a perfect elliptic cylinder, should correspond to the cylinder axis, the major and the minor elliptic axis. If the length of the estimated cylinder is higher than its width, then the eigenvector with the highest eigenvalue is the cylinder axis. However, we cannot make this assumption in our case. To find the cylinder axis among those principal vectors, we used the fact that every tangent on a cylinder surface is orthogonal to its axis. This means that by approximating tangent vectors by the vector formed by two successive points, their projections on the PCA bases should be minimal along the cylinder axis.
Then, the elliptic section is estimated by fitting the projections of all the points in the plane spanned by the two other principals axis [33]. The PCA base is then rotated around the cylinder axis to be aligned with the ellipse axis.
As can be seen with the orange elliptic cylinder in Figure 5, the dissymetric distribution of the vertical sonar points has a huge influence on the RPCA result. To further improve our estimation, we use the Levendberg-Marquardt (LM) algorithm to obtain the shape which minimizes orthogonal distances to the elliptic cylinder. The estimation vector x ∈ R 8 representing our elliptic cylinder is composed of a 6D pose in SE(3) and the scales along the minor and major ellipse axis. Here, we use on-manifold optimization on SE(3), as described in [22]. The final prior can be seen as the purple cylinder in Figure 5.

Gaussian Process Regression and Kernel Choices
In this section, we explain the choice of kernels for the following Gaussian process regression. Based on our formulation of the surface through the functional f (cf. Section 4.1), we exploit its cylindrical structure by combining a kernel on the trajectory abscissa and a kernel on the sonar rotation angle: K s can be any kernel as proposed in Section 3.4. For K ψ , as we deal with angles, we cannot use a simple euclidean distance. When choosing the metric to use, we have to ensure that the resulting kernel is positive definite [34,35]. We can define two distance d chord and d geo based respectively on the chordal distance and the geodesic distance on S 1 : In [35], it is shown that for geodesic distance d geo a C 2 -Wendland function [36] can be used to define a valid kernel For d chordal , any kernel can be used.
In this paper, we only consider exponential K exp and Matern52 K M 5 2 kernels with chordal distance for angles. In the experimental section, we investigate the effects of those different kernels and distances on the surface estimation.

Maximum Likelihood Estimation of θ
Originally, we cannot obtain the 3D points observed by the horizontal sonar as any point in its beam may return the same measurement. We exploit the estimated surface by the vertical sonar to infer the elevation angles θ for each horizontal sonar measurements. The idea is that elevation angles corresponding to 3D points near the estimated surface are good estimate. This is illustrated in Figure 6. Hence, for each horizontal sonar measurement ψ h k , ρ h k , we search for the elevation angles solutions of: whereρ v is the estimated range we would have obtained if this point was observed by the vertical sonar. In fact, depending on the environment configuration, several angles can correspond to one measure as previously illustrated in the Figure 4. Thus, instead of solving Equation (18), we search for local log-maximum likelihood valuesΘ: The corresponding algorithm is described in Algorithm 2 and detailed in the following.
As it is difficult to obtain a closed-form expression of p ρ v | f , ρ h k , ψ h k , η σ h (k) , θ relative to θ, we uniformly sample θ values in the range Θ H . In practice, Θ H = − b 2 , b 2 where b is the beam vertical aperture. We denote those samples θ q , q ∈ 1, n q . For each sample θ q , we compute the corresponding virtual 3D points x(θ q ) with g h defined by Equation (1). To assess the likelihood of having measured this point on the surface, we need to use the previously trained GP model f . We therefore have to express this point in terms of f inputs and output. In other words, we need the curvilinear abscissaŝ q , the rotation angleψ v q and rangeρ v q as if the point was virtually seen by the vertical sonar. This is illustrated in Figure 7. Figure 6. Illustration of elevation angle maximum likelihood estimation on a vertical slice. The green curve is the mean surface (cofounded here with the true surface for visibility) and blue curves are the confidence interval at ±3σ. Note that in this particular configuration . . . To obtain those parameters, we proceed as follows. First, let us denote n v |{S V } the rotation axis of the vertical sonar expressed in its local frame. It is also the normal vector of the sonar plane. We are searching a poseη q along the trajectory such that x(θ q ) belongs to the vertical sonar plane. To do so, we need to interpolate the robot trajectory. The first step is to determine in which segments of the trajectoryη q could belong to (several segments can be found). We recall that we note the global robot position by τ i and its orientation by the quaternion q i . The vertical sonar orientation quaternion is also defined by q V . The rotation axis of the vertical sonar expressed in the global frame n v |{0},i when the robot is in its i-th pose η i is given by A segment (represented by indexes [i, i + 1]) is considered if the projection of the vector x(θ q ) − τ i on the vertical rotation axis n v |{0},i has a change of sign between the beginning and end of the segment (Figure 8). Indeed, thanks to the intermediate value theorem, such a segment contains the pose for which the projection is null. Formally, we consider the set I q of starting index such that For each selected segment [i, i + 1], i ∈ I q , we denote η [i,i+1] : [0, 1] → SE(3) the linear interpolation between the poses η i and η i+1 . τ [i,i+1] is defined by the first three coordinates of η [i,i+1] and is simply given by Let define the relative angular pose express in yaw-pitch-roll Euler angles between η i and η i+1 : We then have the associated interpolated rotation matrix δR(t): The global rotation matrix R(t) is given by We can now define the linear interpolation of the vertical sonar rotation axis n v |{0},[i,i+1] : We search for the interpolation valuet such that x(θ q ) belongs to the sonar plane with normal We assume small angle variation so that we can approximate δR to the first order t is the root in [0, 1] of the following order 2 polynomial expression We then haveη q = η [i,i+1] (t) the robot pose corresponding to the computedt. We can now compute the curvilinear abscissaŝ corresponding tot To computeψ v q , we express x(θ q ) in the vertical sonar frame coordinates corresponding to the robot poseη q,ix Finally, the estimated rangeρ v q is simply the L2 norm ofx( We can now compute the log-likelihood J( The expressions for Equation (35) can be found in ( [23], p17 Equation (2.25/2.26)) and are obtained using the GP regression trained to estimate f (Section 3.4).

Estimation Uncertainty
In the previous section, we explained how the setΘ is obtained. We are also interested in its related uncertainties. In other words, we want to define the probability distributions p(θ). A normal distribution is not adapted as its support is infinite while the width of our horizontal sonar beam is restricted to its aperture. A good choice is a scaled beta distribution [37] defined by two parameters α, β θ ∼ Beta (α, β, θ min , θ max ) (37) In our case, we consider θ max = −θ min = b 2 where b is the horizontal sonar beam width. We then have the following probability density function where Γ is the gamma function which extends the factorial function to complex numbers and is defined by For each estimatedθ from the previous section, the parameters α, β can be computed from a given meanθ and variance σ 2 θ : In our case, the mean is simply the estimated θ ieθ =θ. The variance σ 2 θ requires more computation and is estimated as the inverse of the Fisher information I(θ): Direct computation of Equation (43) gives us (see Appendix A for the derivation): Note that a uniform distribution can still be represented by a beta distribution with shape parameters α = β = 1. Indeed, from Equations (39) and (38) we have which is the density probability for a uniform distribution on − b 2 , b 2 .
We now define a threshold σ l to filter out variances σ θ leading to invalid Beta distribution parameters . First, the parameters α, β of a Beta distribution must be strictly positive: Furthermore, as we consider each distribution as unimodal, we must have α ≥ 1 or β ≥ 1. (In the case α, β < 1, the beta distribution is bimodal with modes at {− b 2 , b 2 }. A mode corresponds to a local maxima of a probability density function (pdf).) From the expressions (40) and (41), we have This implies It is also easy to verify that the first constraint α, β ≥ 0 is included in the second as we have We also add another threshold σ m to filter out valid but too uncertain estimations. It is defined by The parameter k controls the filter sensibility, with higher values being more restrictive. In our implementation, we used a default value k = 3. The final threshold σ T is then (52) Figure 9 shows the graphs of σ l , σ m and σ T . Figure 10 illustrates the uncertainty estimation process. The colors used to represent the beta distributions directly on the measurement arc are obtained through the following mapping m : Note that the central color (turquoise) is obtained for x = 1 b , which is the pdf for a uniform distribution.

Experimental Results
To validate the proposed method described previously, we made experiments on simulated environments shown in Figure 11. Simulation allows to modify the experimental configuration such as the measurements density or the smoothness of the environment surface. It also provides a ground truth so that quantitative analysis can be carried out.
We are currently working on a novel version of pICP adapted to the presented methodology. It will be integrated in a pose-graph SLAM framework and implemented on the Telemaque AUV shown in Section 3.2. Real experiments will then be performed in the Lez karst aquifer in the South of France [38,39].

Implementation
Our current implementation consists of a simulation engine implemented in C++ based on the MRPT library [40]. The Gaussian process regression is done in python using the GPy library [41]. In this paper, we do not focus on the algorithm complexity and computation time. It will be part of our future work as a mandatory step towards its use in real applications. Note that final implementation is planned to be fully C++ using the Limbo library [42] for the Gaussian process regression.

Settings
We present here the settings for which we test our algorithm. We consider three variables which are the surface smoothness, the measure density and the kernel choices.
We can also include the robot trajectory as a fourth variable. If not explicitly stated, we consider a straight linear robot trajectory as it results in clearer display easier to understand, especially for qualitative analysis. Quantitative results are also provided for random trajectories.
To apprehend the effect of environment surface smoothness on the surface estimation, we also evaluate our approach on a chaotic model where we have added some random noise to a cylindrical karst model. We also consider two measurement densities. While fixing the robot movement at each simulation step, we generate dense measurements by considering a large karst (radius of 22m) and sparse measurements with a scaled down karst (radius of 1m). Note that the dense case is the similar to the general density of measurements expected in real karst exploration. The sonar measurements, used as training data for the surface estimation, are computed using ray-tracing on the environment mesh.
For the kernels, we consider the products between kernels on trajectory abscissa K s ∈ {K exp , K M 3 2 , K M 5 2 , K RBF } and kernels on the sonar rotation angle K ψ ∈ {K exp • d chordal , K M 5 2 • d chordal }. This is summarized in Table 1.

Effect of Kernel Choices
In order to quantitatively assess the performance of our algorithm, we have modeled our karst environment using the 3D modeling software Blender (http://www.blender.org). Unless otherwise stated, we optimize the GP with all the kernel parameters presented in Section 3.4. We discuss when needed the benefit and/or effect from fixing some of those parameters. As we are dealing with surface in 3D, it is hard to have a clear representation of the GP results. Figure 12 shows some examples of estimated surface with the mean (green) and the lower/upper bound at −/ + 3σ (dark blue/blue). To assess the effect of kernels on the surface estimation, we represent the surface estimation by considering upper contours in the XY, XZ and ZY planes (The selected ZY plane is set at the middle of the trajectory).
One should choose kernels which provides the best trade-off between confidence (variance) in the estimation and accuracy. Ideally, the variance in the surface estimation should be high enough to contain the real surface in its confidence interval. However, too high variance will lead to uniform distribution for the elevation angle estimation.
Note that as our surface estimation operates per full scan and thus is local, one can adapt dynamically the kernels used during operation depending on a priori information on the robot environment.

Kernel K s
We fix here K ψ = K M 5 2 • d chordal and consider the cases where the lengthscale l ψ is learned or fixed to 0.5. The objective here is to apprehend the behavior of each kernel K s . Figure 13a (resp. Figure 13b) shows the Gaussian process regression results in the sparse and dense configuration for l ψ = 0.5 (resp. learned l ψ in Table 2b). From those graphics, we see that when l ψ = 0.5, the uncertainties in the surface estimations are different in the sparse and dense cases. In the former, we have by increasing order of uncertainty the Matern32, Matern52, rbf and exponential kernels. In the latter, we have Exponential, Matern32, Matern52 and RBF. Based solely on kernels smoothness, we expect the results seen in the dense case. This shows that the kernels behavior depends heavily on the environment configuration. We explain this difference by the bigger variations in measured ranges in the sparse case compared to the dense case ( Figure 12). Obviously, the accuracy of the estimated surface is worse in the sparse case as large part of the surface are unseen by the vertical sonar.
When the l ψ is learned, we see results similar between the sparse and dense case. The variability of the results obtained shows the importance to consider both kernels K s and K ψ simultaneously.   Table 2). • d chordal and several K s with different lengthscale l ψ .

Kernel K ψ
We now fix K s = K M 5 2 . Figure 14a • d chordal kernel for different fixed lengthscales. Table 2 contains the values of lengthscale l ψ obtained when learning them. We observe that the exponential kernel gives lower variance than required to take into account for the unknown environment. The first row of Table 2 shows that we obtain very high values for learned lengthscale. It shows that the GP failed to find appropriate value for this parameter. On the other hand, the kernel based on Matern52 gives more appropriate uncertainties controlled by the lengthscale parameter. The learned lengthscales are in this case more consistent.
The main conclusion of those results is that the choice of K s is not obvious. In other works dealing with surface estimation [21,25], K M 5 2 and K M 5 2 were chosen as being a compromise between K exp and K RBF . However, depending on the configuration, it could be interesting to consider other kernels.
Concerning K ψ , K M 5 2 • d chordal seems to be the best choice. Note that we are not exhaustive on the tested kernels. We use here generic kernel used extensively in the machine learning field. It would be interesting to further investigate on other kernels or creation of customized kernels adapted to the task at hand.
From the previous results, the pair of kernels K s = K M 5 2 , K ψ = K M 5 2 • d chordal seems to be a reasonable choice and we use them in the following experiments.
We also tested our surface estimation approach on a more chaotic karst environment as shown in Figure 15.

Random Trajectories
In the previous sections, we only considered straight line trajectory to facilitate the visualization. We present here the surface estimation for a robot with a randomly generated trajectory. This trajectory is a simple translation along x with an added Gaussian noise. It is illustrated in Figure 16a. Figure 16 shows the corresponding surface estimation. Note that the estimated surface cannot be directly compared with results in the previous section: the trajectories being differents, the points observed by the vertical sonar are different.

Evaluation of Elevation Angle Estimation
In this section, we aim at quantitatively assessing the proposed approach. We simulated random trajectories in modeled karsts in the following configurations: smooth karst with sparse/dense measurements and chaotic karst with dense measurements. The surfaces are estimated using the method described previously. Each simulated horizontal sonar beams is sampled by considering the ranges measured at 5 equally spaced elevation angles. Note that each of those samples corresponds to an arc of measure for which we estimate the elevation angle distribution as represented in Figure 17. For the robot trajectories, we used a RNG generator with a fixed seed for simulation in a given environment. It means that for the sparse, dense and chaotic cases each trial with a different kernel is based on the same randomly generated trajectory. However, note that as the estimated surface depends on the kernel used in the GP regression, the number of non-uniform estimated distributions will vary.
The performance of our algorithm is difficult to measure directly. One possible metric would be to compute the likelihood of the true elevation angle relative to the estimated distribution. However, as discussed in Section 3.3, one range can correspond to different real points and hence to different distributions. We choose to measure the error of an estimated non-uniform distribution (A non-uniform distribution arises when the beam measurement concerns an uncertain part of the estimated surface.) as the distance expectation from the measurement beam to the environment mesh. We denote d(θ) the distance from a point at elevation angle θ to the surface mesh and call this expected error the posterior error E post (d). We compare it to the prior error E prior (d) which is simply the distance expectation of the same measurement beam with a uniform distribution. Formally those errors are defined as follows In practice, those expectations are numerically approximated using the same n q samples θ q defined in Section 4.2.3 so that Dense Sparse l ψ = 2.77, learned In the following experiments, we use n q = 200 and beam width b = 35 deg. For each fixed settings (smooth karst with sparse and dense measurements, chaotic karst), we compute the errors for several kernels K s while fixing K ψ = K M 5 2 • d chordal . Error distributions are represented using box-and-whisker diagram. The rectangular box extends from the lower Q1 to the upper quartile Q3 of the distribution. The filled (resp. dot) line corresponds to the median (resp. mean). The upper whisker is drawn above the upper quartile up to the largest value at a distance below 1.5 times the interquartile range IQR Q3 − Q1 (It corresponds to the rectangular box vertical length.). Similarly, the lower whisker is drawn at the lowest value value above 1.5 times the IQR. Values above and below the whiskers are drawn as circles. For each kernel K s , the non-uniformly distributed beams considered for error computation are different. Indeed, as the surface estimation varies, some beam are estimated as uniform distribution for some kernel but not for others. Hence the need to compute the prior error for each kernel. The results are represented by pair of corresponding diagrams in Figure 18. We also compare the number of non-uniform distributions estimated with each kernel in Figure 18d. Table 3 provides the values of medians and means for each case.  We globally observe that in all cases regardless to the kernel considered our approach improves significantly upon the prior configuration. In particular, we have a relatively high percentage of the distribution with low errors. Besides, the number of estimated non-uniform distributions can vary depending on the kernel used. In our context, the estimated distributions are expected to be further processed by higher level algorithms to reject outliers.
In the sparse case (Figure 18a), the intermediate kernels Matern32 and Matern52 give the best results. We notice some discrepancy in the number of relevant estimations (non-uniform distribution) as can be seen in Figure 18d. This relates to results observed in Table 2a,b where the surface estimated with the exponential Kernel has less uncertainty. In this particular case, this kernel gives an overconfident surface estimation.
In the dense case (Figure 18b), the Matern32 and Matern52 kernels also give the best results with the RBF kernel being more erroneous. There is also a drop in the number of non-uniform distributions obtained for the RBF kernel.  Finally, Figure 18c shows the results obtained for a chaotic karst. Note that in this case, the difference between the prior and posterior are relatively small compared to the previous results (beam measurements are generally intersecting the environment several times). Even in this more difficult configuration, an important proportion of the estimated distribution gives low distance expectation error.
Overall those different observations validates what was theoretically expected. In particular, it also justifies the use of Matern52 kernel for K s as the best default choice. Our experiments is however limited to non exhaustive simulation results. Those results are thus a guideline to better understand the effect of kernel choices and this will be refined based on futur experiments on real data.  The filled (resp. dot) line corresponds to the median (resp. mean). The upper (resp. lower) whisker corresponds to the largest (resp. lowest) value at a distance below (resp. above) 1.5 times the IQR. Values above and below the whiskers are outliers drawn as circles.

Conclusions
In this paper, we presented a method for probabilistic sonar scan merging in the context of underwater karstic exploration. It consists of using the measurements of a high resolution sonar (narrow beam) to estimate the measurement of another lower resolution sonar (wide beam). This is done using the cofounding variable linking both sonar measurements which is the environment. The first sonar measurements are used to obtain a probabilistic approximation of the surrounding surface through Gaussian process regression. It was then leveraged to estimate the probability distribution of the second sonar measurements.
In this work, we validated our approach to obtaining a probabilistic estimate of elliptic cylindrical shaped surface such as a karst aquifer. We analyzed the behavior of different kernel choices on the surface estimation. The results obtained were generally in agreement with the smoothness of considered kernels, but some interesting differences were also noted, notably between the sparse and dense measurements case. We also showed quantitative results based on the expected distances between measurement beams and the simulated mesh. We compared the distances obtained with our estimated elevation angles and a uniformly distributed prior distribution. On our simulated smooth karst, we improved the median expected distance by a ratio of 6.1 in the dense measurement case which corresponds to the measurements expected in real experiments and similar to those obtained in the dataset provided by [19]. We also show significant improvements in more difficult settings with a ratio of 4.1 for sparse measurement in a smooth karst and a ratio of 1.62 on a difficult chaotic environment.
This paper aimed at validate our approach and we will focus on improving it in our future works. There are still areas of improvement to explore. We are currently not taking in account uncertainties on the robot trajectory during a full scan acquisition and the noise on the sonar measurements. Besides, we can also improve the GP regression computation time using sparse GP based on inducing points. Another direction is to develop non-stationary kernel (Kernel which does not solely depends on the distance between its two inputs. Informally, it can modeled with variable lengthscale.) as proposed in [29]. It could be particularly interesting to deal with an environment containing strong discontinuities.
The proposed methodology is only a first step towards a real application for underwater karstic exploration. We are currently working on its usage by adapting probabilistic ICP [16] to our estimated point distributions inside a pose graph SLAM framework.