DETECTION OF HIDDEN EDGES AND CORNERS IN SLAM-BASED INDOOR POINT CLOUDS

Mobile mapping systems are commonly used for surveying buildings. The acquisition of the buildings' indoor spaces with laser scanning or photogrammetry generates data in the form of point clouds. These point clouds are often used to create a model of those buildings, but so far with a low degree of automation. To automate this process, it is important to extract geometric information about corners, edges, and planes from unorganized indoor point clouds. In an indoor scenario consisting of several rooms including furniture and other objects, a point cloud is expected to show occlusions. Therefore, the detection of hidden corners and edges is of importance. In this work, one approach based on contour point clouds and one approach based on planes are examined for the detection of corners and edges. Both approaches use RANSAC to extract either straight lines or planes. Through their intersection, edges and corners are determined. To examine the influence of the data quality on the results, the approaches are applied to and evaluated on different datasets of the same area of a building, which are captured by various measurement methods, including mobile mapping systems and terrestrial laser scanning. For this purpose, we create a ground truth for parts of the building to evaluate the completeness and correctness of the corner detection. The approach based on planes proves to be more reliable in noisy and incomplete point clouds. The approach based on contour point clouds shows advantages when the building's indoor geometry is complex.


INTRODUCTION
Mobile scanning systems can be used for the complete and fast collection of building inventory data. Different laser scanning systems and photogrammetric systems can be used for this task. The acquisition generates point clouds of the buildings. The resulting point clouds can be organized or unorganized depending on the acquisition system and, depending on the point density, form large datasets. From these point clouds, 3D models can be created, for example, which are part of the Building Information Modeling (BIM) method (López Iglesias et al., 2020). Those models can be used, e.g., for building inventory management. For this, it is necessary to extract the relevant information. So far, the process of creating those models is typically done manually, with a low degree of automation. For a complete and correct modelling, it is important to detect corners and edges of objects in the point cloud automatically. Since occlusions typically occur in an indoor scene, it is of particular importance to detect hidden edges and corners. This work focuses on the extraction of edges and corners in indoor point clouds using contour points and planes, with an emphasis on occluded and hidden corners and edges. The goal is to extract the base shape of an indoor room fully automatically and to allow for a complete and correct polygon or wire-frame-like description of indoor scenarios. Through this representation, the amount of data is reduced and the information content of the raw point clouds increases. Two different geometric approaches, a plane based and a contour point based approach, are applied and evaluated in this work. The contour point based approach uses Random Sample Consensus (RANSAC) (Fischler and Bolles, 1981) to estimate edge candidates in a contour point cloud (Ahmed et al., 2018) and intersects these to obtain corner candidates. The plane based approach detects planes within the point cloud (Schnabel et al., 2007) and intersects orthogonal planes to obtain corner and edge candidates.
The scope of this work is to analyze the capability of the two approaches to detect the edges and corners in the point cloud. In order to evaluate the results, we acquire indoor point clouds with different measurement methods: two mobile mapping systems and a terrestrial laser scanner. The aim is to analyze how the results of the two approaches differ with the varying quality of the point clouds.

RELATED WORK
In this chapter, a brief overview of the topics indoor modelling, contour point extraction and plane detection is given. One solution to detect contours of buildings is the use of two-dimensional corners as an intermediate step (Lu et al., 2019). For this purpose, the point cloud is converted into an image-like binary representation. Edges in these images are extracted with edge detection filters. The resulting two-dimensional edges are re-projected into the point cloud to classify contour points. Iwaszczuk et al. (2017) use images in combination with depth maps to extract semantic information about contours and planes in 2D and also enrich the point cloud with it. They use k-means clustering and RANSAC to detect planes in this semantically enriched point cloud.
To detect contour points in the point cloud directly, Hackel et al. (2016) use geometric features based on eigenvalues and eigenvectors. A feature vector is calculated for every point in the point cloud, and from it the probability of being a contour point is derived. These contour candidates are represented in a graph based on the neighborhood relationships of geometric features. Within the graph, subgroups are selected which represent the contour of a building. As an alternative to the eigenvalues, the symmetry of the local neighborhood can be used to detect contour points (Ahmed et al., 2018). The center of gravity is calculated in a local neighborhood. The distance between a point and this center of gravity is used to classify contour points. Curvature vectors are then used to further divide these contour points into edge and corner points. Other solutions use neural networks to classify contour points. Himeur et al. (2021) use the roughness, the curvature and the normal as features at different scales to classify contour points. They distinguish between smooth edges, which lie close to a corner, and sharp edges, which lie directly at a corner. This can be helpful to detect the exact edges and allows the use of points close to edges, for example in noisy point clouds. This approach requires only a small amount of training data and thus allows the contour detection to be fitted to varying point clouds from different sensor systems for a more robust detection.
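The symmetry criterion described above can be illustrated with a minimal sketch. It is not the code published by Ahmed et al. (2018); the function names, the brute-force neighbor search, and the parameters `radius` and `min_shift` are assumptions chosen for illustration, and plain Python is used only to keep the example self-contained.

```python
import math


def centroid_shift(points, index, radius):
    """Distance between a point and the center of gravity of its neighbors.

    The neighborhood contains all points within `radius`, including the
    point itself, so it is never empty.
    """
    p = points[index]
    neighbors = [q for q in points if math.dist(p, q) <= radius]
    centroid = [sum(coords) / len(neighbors) for coords in zip(*neighbors)]
    return math.dist(p, centroid)


def contour_candidates(points, radius, min_shift):
    """Indices of points whose neighborhood is asymmetric (large shift)."""
    return [
        i for i in range(len(points))
        if centroid_shift(points, i, radius) > min_shift
    ]
```

On a flat, evenly sampled patch, interior points have a symmetric neighborhood and a shift close to zero, whereas points on the border of the patch are shifted toward the interior and are flagged as contour candidates.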
A standard approach to detect indoor geometry is the use of planes to detect flat surfaces like walls, ceilings, and floors (Lehtola et al., 2021). Such planes can be detected using, for example, RANSAC or Local Hough Voting (Sommer et al., 2020). Sommer et al. (2020) use the intersection of orthogonal planes to detect edges and corners in a point cloud. The intersection line between two planes is interpreted as an edge and the intersection of three planes as a corner. It is also possible to use semantic segmentation to detect planes in point clouds. Castagno and Atkins (2020) segment surfaces in meshes based on normal vectors. The outlines of these surfaces are determined and represented as polygons which depict structures in an indoor scenario. Instead of plane detection, Liu et al. (2021) present a deep learning approach to directly generate wire-frame models from point clouds. They use a feed-forward neural network to detect corners and to link these corners with edges. The neural network is developed and trained for CAD models, without consideration of occlusions.
Wang et al. (2019) emphasize the need for a higher degree of automation in modeling in order to make the BIM method more efficient and advantageous. The basis for an accurate BIM model is the correct extraction and detection of segments and elements in a point cloud to enable a correct positioning of the fitting parts of the model, such as walls, floors, doors, windows and more complex geometries. This process is also known as Scan-to-BIM. López Iglesias et al. (2020) present some semiautomatic methods to detect objects in indoor point clouds. Occlusions are named as a decisive factor concerning indoor point clouds. Most of the presented methods consist of a segmentation of the point cloud and a bounding and refining procedure. The segmentation of the point clouds is based on the detection of planes, mostly using RANSAC. Region growing algorithms are also presented but categorized as insufficient for point clouds with lower density. The methods for refining and boundary mapping are mostly based on the use of binary images. The goal here is to eliminate noisy points. López Iglesias et al. (2020) state that those approaches still need to be improved.
In this research, we present an approach which detects edges and corners directly from contour point clouds, using RANSAC, in an indoor scenario. We created our own dataset for the evaluation, which also contains a ground truth for hidden corners. While most other works evaluate their approaches mainly on datasets from a single type of sensor and oftentimes only qualitatively, we present quantitative evaluation results on a range of different sensor modalities capturing the same building spaces in an empty as well as a furnished state.

METHOD
We present two approaches to calculate edges and corners from a point cloud. The contour point based approach uses contour points along the edges, while the plane based approach uses planes fitted to the point cloud. The corner and edge detection is based on the intersection of straight lines estimated from contour points and on the intersection of planes. This allows for the detection of hidden corners and edges. The approaches are based on previous research (Schmidt et al., 2023) and are extended mainly in the evaluation part: they are tested on new and larger datasets, and the completeness evaluation is more thorough.

Contour point based approach
The contour based approach uses the method described by Ahmed et al. (2018) to extract contour points. They use the symmetry of the local neighborhood, measured as the distance between the point under consideration and the center of gravity of its neighborhood. A large distance indicates a contour point. We use the code published by Ahmed et al. (2018) to create a contour point cloud. For the contour extraction, the datasets have to be downsampled before they can be processed with the algorithm developed by Ahmed et al. (2018), due to the computer memory restriction of 64 GB. For the conference room, which contains fewer points overall, the point clouds are downsampled to 1 point per 2 cm. The much larger ground floor datasets are downsampled to 1 point per 5 cm. A RANSAC algorithm is used on the contour point cloud to iteratively fit straight lines to the point cloud (Fig. 1). The RANSAC algorithm picks two random points from the contour point cloud and calculates a straight line from them. The distance of every point in the contour point cloud to this line is calculated, and the points which lie within a threshold of 5 cm are counted and considered to be inliers. In every iteration, 200 lines are calculated and the line with the most inliers is picked as an edge candidate. The points which lie within the threshold of the line are excluded from further calculation to allow for a complete edge candidate detection. This straight line extraction step is repeated until 250 lines are found for each room in the contour point cloud. These lines are considered edge candidates. The edge candidates are filtered under the Manhattan World condition, which assumes that every edge candidate is approximately parallel to one of the x, y, or z axes. This condition is not universally applicable but can be regarded as true in the shown experiments. The filtered edge candidates are intersected to calculate corner candidates. Since the straight lines are typically skew, the midpoint of the shortest connecting segment between two lines is considered to be a corner candidate. In an indoor scenario, typically three edges meet in the corners of a room. Therefore, we impose the condition that a third edge candidate has to be within t = 20 cm of a corner candidate to validate the corner. However, this also makes it more difficult to detect openings in walls, as these are sometimes only mapped in two dimensions in point clouds.
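The line extraction and intersection steps described above can be sketched as follows. This is a minimal illustration rather than the implementation used in the experiments. The function names, the fixed random seed, the 10° axis tolerance in `is_axis_aligned`, and the assumption that coordinates are given in metres are all choices made for this example. The validation with a third edge candidate is omitted.

```python
import math
import random


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def point_line_distance(p, origin, direction):
    """Distance of p to the line origin + s * direction."""
    c = cross(sub(p, origin), direction)
    return math.sqrt(dot(c, c) / dot(direction, direction))


def ransac_line(points, threshold=0.05, trials=200, rng=None):
    """Best of `trials` two-point line hypotheses, scored by inlier count."""
    rng = rng or random.Random(0)
    best = (None, None, [])
    for _ in range(trials):
        a, b = rng.sample(points, 2)
        d = sub(b, a)
        if dot(d, d) == 0.0:  # duplicate points define no line
            continue
        inliers = [p for p in points
                   if point_line_distance(p, a, d) <= threshold]
        if len(inliers) > len(best[2]):
            best = (a, d, inliers)
    return best


def extract_edge_candidates(points, n_lines, threshold=0.05, trials=200, seed=0):
    """Repeat the line fit, removing the inliers of each accepted line."""
    rng = random.Random(seed)
    remaining, lines = list(points), []
    while len(lines) < n_lines and len(remaining) >= 2:
        origin, direction, inliers = ransac_line(remaining, threshold, trials, rng)
        if origin is None:
            break
        lines.append((origin, direction))
        used = set(inliers)
        remaining = [p for p in remaining if p not in used]
    return lines


def is_axis_aligned(direction, tol_deg=10.0):
    """Manhattan World filter: direction is close to one coordinate axis."""
    norm = math.sqrt(dot(direction, direction))
    return max(abs(c) for c in direction) / norm >= math.cos(math.radians(tol_deg))


def skew_line_midpoint(o1, d1, o2, d2):
    """Midpoint of the shortest segment between two non-parallel lines."""
    w = sub(o1, o2)
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:  # parallel lines have no unique closest pair
        return None
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    p1 = tuple(o + s * x for o, x in zip(o1, d1))
    p2 = tuple(o + t * x for o, x in zip(o2, d2))
    return tuple((x + y) / 2 for x, y in zip(p1, p2))
```

With the values stated in the text, `extract_edge_candidates` would be called with `n_lines=250`, `threshold=0.05` and `trials=200` per room. Corner candidates then result from applying `skew_line_midpoint` to pairs of lines that pass `is_axis_aligned`.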

Plane based approach
In the plane based approach, M-estimator sample consensus (MSAC), a derivative of RANSAC which weights inliers according to their distance to a plane, is used. Planes are iteratively fitted to the point cloud, and the points which belong to a plane are excluded from further calculation to allow for a complete plane detection (Fig. 1). The maximum inlier distance is set to 5 cm and the process is repeated until only 1000 points are left. These planes are interpreted as potential wall elements. To calculate edge and corner candidates, orthogonal planes are detected using the cross product of the normal vectors (Sommer et al., 2020). A tolerance is allowed within the orthogonality constraint. Two orthogonal planes are intersected and the intersection lines are interpreted as edge candidates. The intersection point of three orthogonal planes is interpreted as a corner candidate. The corner and edge candidates are filtered using a bounding box with a threshold of 10 cm around the point cloud to allow for noise in the point clouds. Points that lie outside the building part are cut off with a bounding box which encloses the point cloud at a distance of 20 cm.
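The plane intersection step can be sketched as follows, assuming each detected plane is given as a pair `(n, d)` that satisfies n · x = d. The function names and the 5° orthogonality tolerance are assumptions for illustration; the text above does not state the tolerance that was used. The plane fitting with MSAC itself is not shown.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def are_orthogonal(n1, n2, tol_deg=5.0):
    """Orthogonality test via the cross product of the normals.

    |n1 x n2| / (|n1| |n2|) is the sine of the enclosed angle; it is close
    to 1 when the planes are approximately orthogonal.
    """
    c = cross(n1, n2)
    sin_angle = math.sqrt(dot(c, c) / (dot(n1, n1) * dot(n2, n2)))
    return sin_angle >= math.cos(math.radians(tol_deg))


def edge_from_planes(plane1, plane2):
    """Intersection line of two planes as (point, direction), or None."""
    (n1, d1), (n2, d2) = plane1, plane2
    direction = cross(n1, n2)
    det = dot(direction, direction)
    if det < 1e-12:  # parallel planes
        return None
    n11, n12, n22 = dot(n1, n1), dot(n1, n2), dot(n2, n2)
    a = (d1 * n22 - d2 * n12) / det
    b = (d2 * n11 - d1 * n12) / det
    point = tuple(a * x + b * y for x, y in zip(n1, n2))
    return point, direction


def corner_from_planes(plane1, plane2, plane3):
    """Intersection point of three planes, or None if it is not unique."""
    (n1, d1), (n2, d2), (n3, d3) = plane1, plane2, plane3
    det = dot(n1, cross(n2, n3))
    if abs(det) < 1e-12:
        return None
    c23, c31, c12 = cross(n2, n3), cross(n3, n1), cross(n1, n2)
    return tuple((d1 * u + d2 * v + d3 * w) / det
                 for u, v, w in zip(c23, c31, c12))
```

Because the planes are intersected as unbounded surfaces, a corner candidate is obtained even where the corner itself is occluded in the point cloud. This is also why filtering with a bounding box is needed afterwards.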

EXPERIMENTS
To test the approaches, datasets were created from two conference rooms, a hallway and three lecture rooms. Mobile mapping systems and a terrestrial laser scanner were used to evaluate the influence of the data quality, caused by different acquisition techniques, on the approaches. To evaluate the completeness of the corner detection, a ground truth is generated which contains the true corners in the datasets.

Data collection
In the data collection step, a point cloud of one conference room is captured both with and without furniture. The whole set of rooms is called ground floor. For the acquisition, three different measuring systems are used: a mobile laser scanner (MLS) in the form of a NavVis VLX, the terrestrial laser scanner (TLS) Zoller & Fröhlich Imager 5016, and a visual simultaneous localization and mapping (VSLAM) approach with the Intel RealSense D455. The NavVis VLX is a portable backpack system with two laser scanners and four integrated cameras, which captures point clouds with a field of view of 360° × 360° and 2 × 300000 points/second. The acquisition of the data was performed without fixed points or targets and is based on a SLAM algorithm. The registration of the point cloud is done in IVION, a cloud platform by NavVis. The second mobile acquisition method is based on an Intel RealSense D455 depth camera used in combination with ORB-SLAM3 (Campos et al., 2021; Hou et al., 2023) as a minimalist visual SLAM system. In addition, a further dataset was created using the terrestrial laser scanner Imager 5016 by Zoller & Fröhlich. The Imager 5016 is a laser scanner with an integrated HDR camera and a field of view of 320° × 360°, which captures 1 million points per second. The scans were performed from various scanner positions without the use of fixed points or targets. The point clouds were registered using the software Scantra (Technet GmbH), which is a program for geodetic registration of laser scan point clouds based on identical planes and points. The processed datasets were exported as E57 files.

Datasets
A total of 8 self-created datasets are used (Tab. 1). With every sensor system, a dataset of a conference room in an emptied state (CRe) and with furniture (CRf) was collected. In this scenario, the VSLAM acquisition consists of around 1.6 mio points. The MLS point cloud is about 4 times that size with around 6 mio points. The TLS dataset of the conference room is very large in comparison; it contains 74.2 to 155 mio points. The size of the TLS point cloud strongly depends on the number of scan positions, whereas the size of the MLS and VSLAM point clouds mostly depends on the size of the scanned area. Additionally, the scan of the whole ground floor created 56.3 mio points with the MLS and 1.8 bn points with the TLS scanner. A ground truth is created from the empty state of the conference room and hallway. Therefore, the empty states are seen as ground truth and the states with furniture as test data. The point clouds from the NavVis VLX are used to measure the ground truth, because they represent the true geometry of the conference room most completely, especially at the ceiling. The ground truth is measured by hand and represents the corners of the empty rooms in the point cloud. This point cloud is automatically filtered in the preprocessing step, and corners and edges are rounded. Thus, this ground truth is not free from errors in terms of position accuracy. The main purpose of the ground truth is to check for completeness of the detection. In order to align the point clouds acquired with the different acquisition systems to the ground truth, they are co-registered using the iterative closest point (ICP) algorithm and, if necessary, manual adjustment. This is why, besides measurement errors, registration errors exist as well. An accuracy value for the ground truth is currently not assessed. The ground truth represents all 49 corners in one conference room and 34 corners in the hallway. The whole ground floor contains 83 corners. 8 corners are defined as base corners, which determine the rough shape of the room. In addition, there are 2 hidden corners, which are occluded in the dataset with furniture. From the datasets, it is evident that the point cloud from the NavVis VLX is much more complete, especially on the ceiling, which is partly suspended. The TLS data, on the other hand, contains less noise and is more accurate. Especially at the windows, the datasets are very noisy and the border between the window itself and the noise is barely distinguishable. The dataset gathered with the VSLAM based acquisition using the Intel RealSense shows very noisy point clouds. It is already difficult to properly register the point cloud with the ground truth. Thus, the point cloud is not expected to fit the ground truth very well.

Evaluation
For the evaluation, both approaches are applied to the point clouds of the dataset. The conference room is evaluated separately with and without furniture. For the MLS and TLS data, the ground floor is also evaluated. In order to evaluate the strengths and weaknesses of both approaches in terms of completeness and correctness, the number of corners in the conference room which are correct according to the ground truth is collected. To evaluate whether a ground truth corner has been detected, its distance to the nearest corner candidate is calculated. If a ground truth corner is within 20 cm of a corner candidate, it is seen as correctly detected. The completeness of the edges is not evaluated, since the edge extraction is necessary for the corner detection. Therefore, the completeness of the corner detection is transferable to the edges.
The results for the contour extraction show noisy results for the TLS point clouds with furniture in the ground floor and the conference room (Fig. 4 and 5). Depending on the quality of the input data, an incorrect geometry extraction can be seen in the results from VSLAM (Fig. 5). It is important to note that the contour points along edges are not straight but curvy.
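The matching criterion used in the evaluation can be sketched as follows. The 20 cm threshold is taken from the text. The function name, the returned keys, the RMSE computed over the matched ground truth corners only, and the overdetection expressed as the ratio of candidates to ground truth corners are assumptions for illustration; the exact definitions behind the reported values may differ.

```python
import math


def evaluate_corners(ground_truth, candidates, max_dist=0.20):
    """Completeness, RMSE and candidate ratio for detected corners.

    A ground truth corner counts as detected when its nearest corner
    candidate lies within `max_dist` (same unit as the coordinates).
    """
    matched = []
    for gt in ground_truth:
        nearest = min((math.dist(gt, c) for c in candidates), default=math.inf)
        if nearest <= max_dist:
            matched.append(nearest)

    completeness = len(matched) / len(ground_truth)
    # RMSE over the distances of the matched corners only (assumption).
    rmse = (math.sqrt(sum(d * d for d in matched) / len(matched))
            if matched else math.nan)
    # Number of candidates per ground truth corner (assumption).
    candidate_ratio = len(candidates) / len(ground_truth)
    return {"detected": len(matched),
            "completeness": completeness,
            "rmse": rmse,
            "candidate_ratio": candidate_ratio}
```

Base corners and hidden corners can be evaluated in the same way by passing only the corresponding subset of the ground truth.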

RESULTS
The extraction of edges and corners from contour lines shows a large overdetection (Fig. 6 and 7). With MLS and TLS, the rough shape of the room is extracted. The noisy point cloud of the TLS conference room with furniture, however, is missing some edges and therefore some corners. The poor data quality of the VSLAM carries over to the edge and corner extraction. Furniture on the right side of Fig. 7 leads to wrong edges and corners within the room.
The plane based extraction of corners and edges leads to an even larger overdetection in the ground floor data, even more so for the TLS data, which has a higher point density (Fig. 8).
Using the contour approach on the MLS point cloud of the empty conference room, 80 % of the total corners could be detected, as well as all of the hidden edges and the base shape of the room (Tab. 2). For all corners, this is more than double what we achieved with the plane based approach. The rough shape and the hidden corners could be detected with both approaches. This difference decreases with furniture, but is still around 10 %. With furniture, the contour based approach detects only 1 of the 2 hidden edges and only 4 of the 8 base corners, whereas the plane based approach detects all hidden corners and all base corners. With the TLS data without furniture, around 45 % of all edges of the conference room could be detected with both approaches. The contour approach, however, detected only 4 base corners and no hidden corners, in contrast to the plane based approach, which detected all base corners and all hidden corners. In the VSLAM point cloud, only a few corners could be detected in the conference room with and without furniture, and only 1 of the base corners. Although the number of detected points is very low, the plane based approach detected 6 times as many as the contour based approach. For the ground floor, the plane based approach detected around 3 to 4 times as many corners as the contour based approach.
The extraction worked better with the MLS point cloud and detected double the number of corners. This is also true for hidden and base corners, where every single one could be detected in the MLS point cloud. The overdetection is high with both approaches. Between 10 and roughly 100 times the number of ground truth corners were detected. Without furniture, the overdetection with the contour point approach and laser scanning data is almost double that of the plane based approach. With the datasets with furniture, however, the overdetection becomes 3 to 10 times larger with the plane based approach than with the contour based approach.
The RMSE for the detected corners is around 8 to 15 cm. The calculated RMSE cannot be used for comparison between datasets, because the basis is not the same due to registration errors. It only allows a comparison between the two approaches. The MLS and TLS ground floor and the TLS conference room with furniture show larger differences, with the RMSE of the plane based approach being 2 to 5 cm smaller.

DISCUSSION
The different approaches could detect roughly half of the corners. The main reason seems to be the complex and partly obstructed ceiling construction. In this area, it is on the one hand difficult to capture the complete structure; on the other hand, the different scales of edges at the walls and the floor compared to the ceiling pose problems in terms of the correct parameter choice for the plane and edge detection based on RANSAC. Thus, the approaches are not optimally tuned to detect complex structures. The results show that an almost complete room acquisition, which was achieved with the MLS point cloud, is very important for the contour approach. In this case, substantially more corners could be detected with the contour approach than with the plane approach. On the other hand, the plane based approach worked much better with the less complete TLS data. In the settings with furniture, the plane based approach mostly works better, especially in terms of hidden corners and the base shape of the room. In that regard, the plane based approach is much more reliable according to the presented results. For the ground floor, the plane based approach worked better than the contour approach. This could be due to an incomplete contour detection for large scenarios, since the point cloud had to be downsampled in order to be used with the contour approach.
Both approaches show very large amounts of overdetection, which requires efficient filtering.

CONCLUSION
We presented two approaches for the extraction of corners and edges from indoor point clouds, including hidden corners and edges, and tested them on datasets from different sensors. The contour based approach works well for complex room shapes but struggles with incomplete point clouds and obstacles such as furniture within indoor spaces. The plane based detection of edges and corners seems to be more reliable and robust than the contour based approach, especially with noisy data, which is how furniture can be regarded in this scenario. For a complete corner detection, a judicious choice of parameters in RANSAC and MSAC is important, but it will always have to balance a complete extraction against a low overdetection. Filtering of the overdetected corners and edges is one of the biggest challenges. Restricting the intersection condition for planes and edges in both methods allows for an efficient but also somewhat weak pre-filtering in the datasets used. Additionally, it also limits the generality of the model in the process.

OUTLOOK
In order to improve the corner detection with the contour based approach, first of all, the contour extraction has to be made more efficient. Consequently, the input point clouds would not need to be downsampled as much. This would lead to a much more complete contour extraction, which is essential for the edge and corner detection with RANSAC. Moreover, the edge detection with RANSAC needs to be improved to produce more reliable results. A solution could be efficient RANSAC (Schnabel et al., 2007) or Local Hough Voting (Sommer et al., 2020). Local Hough Voting is also an alternative to the MSAC used in the plane extraction. For both approaches, adapting RANSAC and MSAC to the input point cloud could lead to an improvement of the corner detection. For this, clustering of the surfaces or the use of roughness for the choice of the parameters is an option. Alternatively, semantic segmentation could be used for the detection of planes, as well as for the filtering and classification of objects (Iwaszczuk et al., 2018). For a reasonable use of the approaches, the filtering of the edge and corner candidates needs to be improved. This could be achieved with a geometric optimization using a graph and the neighbourhood relationships. Additionally, a weighting or filtering based on the point density and distribution could be integrated, e.g. in the edge extraction from contour points and in the plane detection.
Deep learning could likewise be used to optimize the edge and corner detection by exploiting the relationships between them. One deep learning network in which the presented approaches could be used to extend the method to hidden edges and corners is PC2WF by Liu et al. (2021). However, the collected datasets have to be extended for this. Additional training data has to be collected and a more reliable ground truth has to be created, which also contains the relations between corners, represented by edges, and the relations between the edges and the walls, ceilings and floors.

Figure 1. Flow chart of the approaches based on the contour (left) and based on planes (right)

Figure 6. Contour based approach ground floor MLS (left) and TLS (right), with edge (magenta) and corner (red) candidates [m]

Table 2. Completeness of corners in conference room (CR