MASAT: A fast and robust algorithm for pose-graph initialization

In this paper, we propose a novel algorithm to compute the initial structure of pose-graph based Simultaneous Localization and Mapping (SLAM) systems. We perform a Breadth-First Search (BFS) on the graph in order to obtain multiple votes regarding the location of a certain robot position from all of its previously processed neighbors. Next, we define the initial location of a pose as the average of the multiple alternatives. By adopting the proposed initialization approach, the number of iterations needed for optimization is significantly reduced while the computational complexity remains lightweight. We perform quantitative evaluation on various 2D and 3D benchmark datasets to demonstrate the advantages of the proposed method. © 2019 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).


Introduction
The robustness of Simultaneous Localization and Mapping (SLAM) algorithms depends heavily on the key features tracked across consecutive frames and on the graph optimization methods used to connect intermittent key-frames along the trajectory. On the feature side, point-based methods can be outperformed by descriptors based on structural elements [1] or objects [2], while the position network (pose-graph) mostly requires iterative methods and an efficient initial guess of the network.
Numerous modern SLAM algorithms follow the pose-graph optimization formulation of the problem [3], where the nodes of the graph (the variables to be estimated) represent discrete robot positions sampled along the trajectory, and each edge (constraint) represents a measurement between a pair of poses. The measurements can originate either from ego-motion estimation (odometry) or from detecting loop-closure situations whenever the robot returns to a previously visited place [4].
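For reference, this formulation is commonly written as a least-squares problem over the edges of the graph. The notation below (error term e_ij, measurement z_ij, information matrix Ω_ij) is an assumption borrowed from the standard g2o-style formulation and is not defined in this paper:

```latex
\chi^2(\mathbf{x}) = \sum_{(i,j) \in E}
    \mathbf{e}_{ij}(\mathbf{x}_i, \mathbf{x}_j)^{\top} \,
    \Omega_{ij} \,
    \mathbf{e}_{ij}(\mathbf{x}_i, \mathbf{x}_j),
\qquad
\mathbf{x}^{*} = \operatorname*{arg\,min}_{\mathbf{x}} \chi^2(\mathbf{x})
```

Here e_ij is the difference between the measured relative pose z_ij and the relative pose predicted from the current estimates of x_i and x_j, and Ω_ij weights that difference by the confidence of the measurement. Because e_ij is non-linear in the poses, the minimizer found by an iterative solver depends on the starting point.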
The structure of the pose-graph is iteratively refined by nonlinear optimization (e.g., Gauss-Newton) starting from an initial guess [5]. Hence, a good initialization provides important benefits: if the initial estimate is close to the optimal solution, the optimization converges faster and the risk of convergence to a local minimum is reduced [6]. Conversely, a bad initial guess increases the computational time of the optimization and might lead the algorithm to converge to a local minimum, cf. Figs. 1 and 2.
Source code and dataset: this paper is accompanied by the source code of the proposed algorithm, available at https://github.com/karoly-hars/MASAT_IG_for_SLAM. The dataset used in this work is available at http://mplab.sztaki.hu/masat_slam/masat_slam_data.zip. Handled by Associate Editor Antonio Fernández-Caballero.
Commonly, the initial guess is computed by heuristic methods, using either the odometry measurements or a minimum spanning tree search. Both methods are computationally lightweight and have low complexity. To avoid the pitfalls of a bad initial guess, several high-complexity initialization algorithms have been proposed, most recently the Cauchy algorithm [8]. However, all these algorithms are computationally complex and often include prior optimization steps, resulting in increased computational time. In contrast, the proposed approach is lightweight and has low complexity. Furthermore, it can be applied as a preprocessing step before running more complex algorithms, in order to speed them up and improve their results. Since the proposed algorithm uses all the previously computed nodes of the pose-graph to estimate a new location (cf. Fig. 3), we name our method Multi-Ancestor Spatial Approximation Tree (MASAT).
To summarize, this paper advances the state of the art of pose-graph optimization algorithms with the following contributions:
• [...] proposed algorithm even outperforms high-complexity initialization algorithms. This is notably true in the case of 3D pose-graph structures.
• An extensive evaluation and comparison is reported using three 2D and two 3D datasets (with multiple noise levels) between the proposed method and five other baseline algorithms. The best results in terms of average normalized error, rate of successful convergence (robustness), and average number of iterations are achieved when the proposed algorithm is applied as a preprocessing step of the Cauchy algorithm [8].
• We release the source code of the proposed algorithm, and provide all the data used to perform the comprehensive evaluation and comparison of the different approaches.
Fig. 1. The results of Gauss-Newton optimization from different initial guesses on the noisy version of the sphere dataset (independent Gaussian noise with zero mean and standard deviations of 0.03 and 0.06 on the coordinates and angles, respectively). Left to right: odometry, spanning tree, TORO [7], Cauchy algorithm [8], MASAT (proposed).

Related work
Numerous modern image-based visual odometry systems (e.g., the SVO algorithm [9]) and visual SLAM systems (e.g., LSD-SLAM [10] and ORB-SLAM2 [11]) use the pose-graph representation to solve the underlying SLAM problem. Therefore, there is a great need for robust and efficient pose-graph optimization back-ends. Generally, the main task of a pose-graph based SLAM back-end is to minimize the accumulated error of the measurement flow subject to the constraints provided by the loop closures along the trajectory.
SLAM was first formulated as a pose-graph optimization problem in the seminal work of [12]. Since then, great effort has been spent on studying this topic; a comprehensive overview is presented in [3]. In addition, several open-source, computationally efficient pose-graph optimization back-ends have been proposed in the literature: the Georgia Tech Smoothing and Mapping library (GTSAM) [13], the General Graph Optimization framework (g2o) [5], and the Efficient Compact Pose SLAM library (SLAM++) [14].
Generally, non-linear optimization tends to diverge, or to converge to a local minimum, if its initial state (the starting point of the optimization) is too far from the ground truth. In order to avoid a bad initial guess, complex initialization techniques (also known as bootstrappers) have been proposed: LAGO [15], TORO [7], and more recently the Cauchy algorithm [8]. LAGO is a linear approximation technique that provides an accurate initial estimate for 2D pose-graphs; however, it can produce inadequate results if the measurements are noisy. TORO is an optimization framework that is robust against bad initial guesses; thus, it can be used as a bootstrapper for other non-linear optimization techniques. Finally, the Cauchy algorithm is an iterative approach based on M-estimation which produces accurate and reliable initial guesses even in scenarios with noisy measurements.
It has been shown that estimating rotations first during pose-graph initialization has significant advantages in terms of computational cost and robustness [16]. As rotation estimation is a major issue in pose-graph optimization, a survey of its advantages for 3D SLAM is given in [6]. Following the Lagrangian relaxation proposed in [17] for the 3D case with a tight relaxation, it is shown in [18] that remarkably good initialization can be achieved even for non-tight relaxations. Applying these relaxation-based methods to large-scale, real-world problems is difficult because of their high computational complexity. In special cases, some additional features can make the iteration process faster and more accurate:
• Sparsification is one possible way to speed up convergence; see [19], which exploits particularities of pose-graph SLAM in a novel factor descent iterative optimization method, achieving 80% node reduction.
• The absolute orientation of a pose can be estimated better if well-defined local structural cues, such as architectural elements, can be detected [20].
The proposed algorithm differs from the aforementioned relaxation-based approaches: instead of performing a preliminary optimization step, our method traverses the pose-graph only once, similarly to the straightforward odometry and spanning tree techniques. During the traversal, it approximates each node's location based on its already positioned neighbors. The details of the proposed approach are described in the next section.

Proposed algorithm
The proposed algorithm approximates the position of a node from the votes on its location cast by its previously processed neighbors. We illustrate this major difference between the proposed method and its competitors in Fig. 3: in contrast to other low-complexity algorithms, the proposed algorithm uses all the previously computed nodes of the pose-graph to estimate a new location. Formally, we run a Breadth-First Search (BFS) on our graph G = (V, E) starting from node x_1.
Let N_i denote the set of the neighbors of node x_i. In each step, for every node x_j ∈ N_i that was previously processed during the BFS, we get a vote (p_1, ..., p_d) regarding the position of x_i, based on the measured distance between the two nodes and the angle of view from node x_j. Here d stands for the number of dimensions of our problem.
In practice, the value of d is usually 2 or 3, but in theory the algorithm works efficiently for other values as well. Once all the votes have been collected, we define the new location of x_i as the average of the votes on its position. The algorithm ends when all the nodes have been processed by the BFS (see Algorithm 1).
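The traversal described above can be sketched in Python for the 2D case (d = 2). This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (compose, invert, average, masat_init) are hypothetical, nodes are indexed from 0, each vote is assumed to be the neighbor's pose composed with the relative measurement stored on the edge, and headings are assumed to be averaged on the circle.

```python
import math
from collections import deque


def compose(pose, delta):
    """Apply a relative 2D measurement `delta` in the frame of `pose`."""
    x, y, theta = pose
    dx, dy, dtheta = delta
    c, s = math.cos(theta), math.sin(theta)
    return (x + c * dx - s * dy, y + s * dx + c * dy, theta + dtheta)


def invert(delta):
    """Inverse of a relative 2D measurement, so an edge can vote both ways."""
    dx, dy, dtheta = delta
    c, s = math.cos(dtheta), math.sin(dtheta)
    return (-c * dx - s * dy, s * dx - c * dy, -dtheta)


def average(votes):
    """Mean position; headings use a circular mean (an assumption)."""
    n = len(votes)
    x = sum(v[0] for v in votes) / n
    y = sum(v[1] for v in votes) / n
    theta = math.atan2(sum(math.sin(v[2]) for v in votes),
                       sum(math.cos(v[2]) for v in votes))
    return (x, y, theta)


def masat_init(num_nodes, edges, root=0):
    """Return {node: (x, y, theta)} for every node reachable from `root`.

    edges: iterable of (i, j, (dx, dy, dtheta)), the pose of node j
    measured in the frame of node i.
    """
    # voters[k] lists (neighbor, pose of k expressed in the neighbor's frame).
    voters = {k: [] for k in range(num_nodes)}
    for i, j, z in edges:
        voters[j].append((i, z))
        voters[i].append((j, invert(z)))

    poses = {root: (0.0, 0.0, 0.0)}  # the root anchors the coordinate frame
    discovered = {root}
    queue = deque([root])
    while queue:
        i = queue.popleft()
        if i != root:
            # One vote from every neighbor that already has a pose.
            votes = [compose(poses[j], z) for j, z in voters[i] if j in poses]
            poses[i] = average(votes)
        for j, _ in voters[i]:
            if j not in discovered:
                discovered.add(j)
                queue.append(j)
    return poses


# Usage: a unit square traversed counter-clockwise, closed by a loop edge.
step = (1.0, 0.0, math.pi / 2)
square = [(0, 1, step), (1, 2, step), (2, 3, step), (3, 0, step)]
initial_guess = masat_init(4, square)
```

In this example node 2 is processed last and receives two votes (from nodes 1 and 3), whereas a spanning-tree initialization would rely on only one of them. Nodes that are not connected to the root receive no pose in this sketch.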

Results
We compared our algorithm to the low-complexity heuristic methods mentioned above (odometry/ODO and spanning tree/SPT) and to high-complexity approaches (the LAGO, TORO, and Cauchy bootstrapping methods) on several well-known 2D and 3D benchmark datasets.
In detail, we used the following process to generate scenarios with different noise levels: for each dataset used in our experiments, we added independent Gaussian noise with zero mean and (σ_c, σ_a) standard deviation (for the coordinates and angles, respectively) to every measurement, and we examined how the Gauss-Newton algorithm performs from the different initial guesses. To perform the experiments, we used the g2o framework [5], since it is one of the most popular and widely adopted pose-graph optimization back-ends.
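The noise generation step can be sketched as follows for 2D measurements. This is an illustrative assumption rather than the procedure used in the experiments: the function name add_noise and the edge layout (i, j, (dx, dy, dtheta)) are hypothetical, and the noise is assumed to be simply added to each component of the measurement.

```python
import random


def add_noise(edges, sigma_c, sigma_a, rng=None):
    """Return a copy of `edges` with zero-mean Gaussian noise added.

    sigma_c is the standard deviation applied to the coordinates (dx, dy),
    sigma_a the standard deviation applied to the angle (dtheta).
    """
    if rng is None:
        rng = random.Random()
    noisy = []
    for i, j, (dx, dy, dtheta) in edges:
        noisy.append((i, j, (dx + rng.gauss(0.0, sigma_c),
                             dy + rng.gauss(0.0, sigma_c),
                             dtheta + rng.gauss(0.0, sigma_a))))
    return noisy


# Usage: one noisy realization of a single measurement, with a fixed seed
# so that the scenario can be regenerated.
noisy_edges = add_noise([(0, 1, (1.0, 0.0, 0.0))],
                        sigma_c=0.1, sigma_a=0.1, rng=random.Random(42))
```

Passing a seeded generator makes each repetition reproducible, which is convenient when the same noisy scenario has to be handed to several initialization methods.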
We define an attempt to find the optimal placement of the nodes as successful if the change in the error function χ² after an iteration step is less than a threshold of 10⁻⁶. The maximum number of iteration steps is fifty; an attempt that reaches this limit is defined as unsuccessful. For successful experiments, we measured the average χ² error (the χ² error divided by the number of measurements) after the algorithm finished, and the number of Gauss-Newton iterations from start to end. In order to quantitatively evaluate the proposed algorithm, we repeated this process 50 times for every dataset and every noise level. The outcome of our experiments is summarized in Tables 1-6, where 'conv.' refers to the ratio of successful experiments to the 50 repetitions, while 'avg. iter.' and 'avg. χ²' show the average number of iterations before convergence and the average χ² error over the successful cases. A low χ² error indicates that the final pose-graph is close to the ground truth, and the runtime of the optimization is directly proportional to the number of Gauss-Newton iterations.
Table footnotes: 1 On an Intel Core i7-8700 CPU. 2 100 TORO iterations. 3 50 Cauchy iterations.
Table 3. Quantitative evaluation on the City10K dataset.
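The success criterion and the three reported quantities can be sketched as below. The function names and the trial record layout are hypothetical, and the criterion is assumed to be the absolute change of χ² between two consecutive iterations.

```python
def iterations_to_converge(chi2_per_iter, tol=1e-6, max_iter=50):
    """Return the first iteration k whose chi2 change is below `tol`.

    chi2_per_iter[0] is the error of the initial guess and chi2_per_iter[k]
    the error after k iterations. Returns None if no iteration within
    `max_iter` meets the criterion (an unsuccessful attempt).
    """
    last = min(len(chi2_per_iter) - 1, max_iter)
    for k in range(1, last + 1):
        if abs(chi2_per_iter[k] - chi2_per_iter[k - 1]) < tol:
            return k
    return None


def summarize(trials, num_measurements):
    """Compute (conv., avg. iter., avg. chi2) over repeated attempts.

    trials: list of (converged, iterations, final_chi2) tuples.
    The averages are taken over the successful attempts only.
    """
    ok = [(it, chi2) for converged, it, chi2 in trials if converged]
    conv = len(ok) / len(trials)
    if not ok:
        return conv, None, None
    avg_iter = sum(it for it, _ in ok) / len(ok)
    avg_chi2 = sum(chi2 / num_measurements for _, chi2 in ok) / len(ok)
    return conv, avg_iter, avg_chi2


# Usage: three hypothetical attempts on a graph with 10 measurements.
conv, avg_iter, avg_chi2 = summarize(
    [(True, 4, 20.0), (True, 6, 40.0), (False, 50, 999.0)], 10)
```

Excluding the unsuccessful attempts from the averages mirrors the description above, so a method with a low convergence rate can still show a small average χ² error; the two columns have to be read together.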

Evaluation on 2D datasets
We summarize the aggregated results on the 2D datasets in Table 1 for Manhattan3500 [21], in Table 2 for Manhattan10000 [7], and in Table 3 for City10k [22]. Note from the tables that when the Gaussian noise is moderately large (noise (σ_c; σ_a) = (0.1; 0.1)), the proposed low-complexity algorithm (MASAT) performs similarly to the high-complexity methods (LAGO, TORO, and Cauchy) in terms of the average number of iterations until convergence. However, the results are obtained an order of magnitude faster (in less than 0.1 seconds) with the proposed approach. If the Gaussian noise is larger (noise (σ_c; σ_a) = (0.2; 0.2)), the high-complexity Cauchy algorithm slightly outperforms MASAT at the cost of a significantly longer running time. Interestingly, the proposed approach performs best on the Manhattan10000 dataset. The other high-complexity methods (LAGO, TORO) are outperformed by MASAT in terms of convergence rate. On the 2D datasets, the localization accuracy, in terms of average χ² error, after 50 iterations of Gauss-Newton optimization is roughly the same whether we start from an estimate made by MASAT, TORO, or Cauchy.
In contrast, MASAT surpasses the other fast (low-complexity) methods by a large margin in terms of convergence and average number of iterations in all of our experiments. Note from the data presented in the tables that in the case of large noise (noise (σ_c; σ_a) = (0.2; 0.2)), the other simple methods (ODO, SPT) and LAGO have a very low convergence rate in comparison to the proposed approach. This is notably true for large pose-graphs such as the Manhattan10000 and City10k datasets, which contain ten thousand nodes.
Furthermore, if there is a large rotation error (noise (σ_c; σ_a) = (0.15; 0.3)), these methods fail to compute a proper initialization, and the pose-graph optimization does not converge. Therefore, the aggregated results on the 2D datasets indicate that the proposed algorithm provides a good balance between speed and accuracy.
We illustrate some representative results in Fig. 2 . The comparison shows that by applying the proposed initialization algorithm the grid-like structure of the Manhattan dataset is successfully computed even in case of large rotation errors (bottom right).

Evaluation on 3D datasets
Next, we used the simulated Torus and Sphere datasets (generated with g2o [5]) and the real-world multi-level Parking Garage dataset [17] to evaluate the performance of the bootstrapping methods on 3D measurements. The detailed evaluation shows that MASAT is superior to all other low- or high-complexity approaches in terms of both convergence and accuracy (cf. Tables 4, 5, and 6) at all three noise levels, while the runtime of the proposed algorithm is two orders of magnitude shorter than that of the high-complexity approaches.
Although TORO fails on the simulated Torus and Sphere datasets, it achieves good performance on the real-world Parking Garage dataset. Conversely, the Cauchy algorithm produces good results on the Torus and Sphere datasets and performs worse on the Parking Garage dataset, notably in the case of large noise levels (noise (σ_c; σ_a) = (0.04; 0.04) and (σ_c; σ_a) = (0.03; 0.06)). This might be due to the fact that in the simulated datasets the nodes are well connected to each other. In contrast, the Parking Garage dataset contains fewer links between the consecutive and loop-closure positions recorded onboard a vehicle driving within a multi-storey parking facility. Better performance might be achievable by fine-tuning the parameters of the TORO and Cauchy algorithms. In our experiments, we used the standard set of parameters originally proposed by the authors. Another major benefit of the proposed MASAT algorithm is that it does not rely on any predefined parameter or threshold.
A sample result obtained with the proposed algorithm on the real-world Parking Garage dataset is shown in Fig. 4. The ground truth trajectory of the vehicle (left) was successfully retrieved after optimization (right) from very noisy measurements (second left) using the initial guess computed with the proposed MASAT algorithm (second right).
Finally, since the runtime of MASAT is rather low, we examined the possibility of running MASAT before the Cauchy bootstrapping, thus providing an initial guess using the combination of the two methods. Our findings show that the initialization has a large effect on the accuracy of the Cauchy algorithm, and that Cauchy initialized with MASAT outperforms the regular Cauchy algorithm (initialized with odometry as suggested in [8]) in every aspect.

Conclusions
In this paper, we examined the problem of initial guess computation for off-line SLAM algorithms. We introduced MASAT, a heuristic, fast, and low-complexity method for bootstrapping. The utility of the proposed algorithm has been demonstrated through extensive experiments in both 2D and 3D settings. The method has been tested thoroughly on artificial datasets and on a real-life experimental data recording. Compared to high-complexity state-of-the-art solutions, the proposed method makes the pose-graph calculus for SLAM more efficient, and its excellent performance and precision in our experiments show that it can be effective when applied in real-time evaluation.

Declaration of Competing Interest
None.