Elsevier

Computer-Aided Design

Volume 41, Issue 10, October 2009, Pages 701-710

3D terrestrial LIDAR classifications with super-voxels and multi-scale Conditional Random Fields

https://doi.org/10.1016/j.cad.2009.02.010

Abstract

In this paper, we propose a new method for the classification of 3D terrestrial laser range data. Classification is the first step towards reconstructing virtual city models from range data and is particularly useful for scene understanding. Classifying outdoor terrestrial range data into different data types (for example, building surface, vegetation and terrain) is challenging because of certain properties of the data: occlusions due to obstructions, density variation due to the different distances of the scanned objects from the laser scanner, multiple multi-structure objects and cluttered vegetation. In addition, the acquired range data are massive and demand substantial computation and memory. Recognizing the redundancy of labeling every individual data point, we propose over-segmenting the raw data into adaptive support regions: super-voxels. The super-voxels are computed using 3D scale theory and adapt to the above-mentioned range data properties. Colors and reflectance intensities acquired from the scanner system are combined with geometry features (saliency features and normals) extracted from the super-voxels to form the feature descriptors for the supervised learning model. We propose using discriminative Conditional Random Fields for the classification problem and modify the model to incorporate multiple scales for super-voxel labeling. We validated the proposed strategy with synthetic data and real-world outdoor LIDAR (Light Detection and Ranging) data acquired with a Riegl LMS-Z420i terrestrial laser scanner. The results showed a great improvement in the training and inference rate while maintaining classification accuracy comparable to previous approaches.

Introduction

Virtual urban environment modeling is an important research area, with extensive applications that include regional planning, virtual reality, precise navigation and disaster management. Data of an urban environment can be acquired via photogrammetry, laser scanning or a combination of both techniques. Although photogrammetry is the most cost-effective method, laser scanning is faster and more accurate. By integrating calibrated images of the scene [1], color information can be obtained to provide a realistic visualization of the virtual city model.

A complete city model usually requires the registration of several scans from different locations, and each scan produces a large amount of data. Storing or visualizing the raw data is memory demanding, and processing it takes a long time. To reduce the size of the data and to visualize it, geometric fitting of point clouds into polyhedral models is very desirable. Several point cloud reduction techniques exist, such as the coarse-to-fine point cloud simplification of Moenning and Dodgson [2], which uses Fast Marching farthest point sampling. Farthest point sampling is based on the idea of minimizing the reconstruction error by repeatedly placing the next sample point in the middle of the least-known area of the sampling domain. Similarly, Alexa et al. [3] reduce the point cloud by removing the points with the least contribution to the moving least squares (MLS) representation of the underlying surface. Point cloud simplification can be a useful step to speed up the subsequent surface reconstruction process.
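The farthest-point idea can be sketched in a few lines. The following is a minimal illustration of the greedy Euclidean variant, not the Fast Marching formulation of Moenning and Dodgson [2]; the function name `farthest_point_sample` is our own for this sketch:

```python
import numpy as np

def farthest_point_sample(points, n_samples):
    """Greedy farthest-point sampling: repeatedly pick the point
    farthest from the set already chosen, i.e. the point in the
    least-covered area of the sampling domain."""
    points = np.asarray(points, dtype=float)
    chosen = [0]                                   # seed with an arbitrary point
    dist = np.linalg.norm(points - points[0], axis=1)
    for _ in range(n_samples - 1):
        idx = int(np.argmax(dist))                 # least-covered point so far
        chosen.append(idx)
        # distance of every point to its nearest chosen sample
        dist = np.minimum(dist, np.linalg.norm(points - points[idx], axis=1))
    return points[chosen]
```

On a unit square with a center point, sampling four points recovers the four corners, since each corner is maximally far from the samples already chosen.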

Another way to reduce the data is triangulation (or meshing), a common method for object reconstruction from range data [4], [5], [6]. However, outdoor terrestrial laser-scanned point clouds have very different properties: occlusions due to obstructions, varying density due to the different distances of the scanned objects from the laser scanner, multiple multi-structure objects and cluttered vegetation. Direct reconstruction is therefore difficult. Triangulating the vegetation data often causes unwanted spikes (as shown in Fig. 1), or even connects nearby vegetation and building data into the same surface. Moreover, it is difficult to represent edges and corners properly with simple triangulations, and extra knowledge is required to recover occlusions on building walls, which are often obstructed by vegetation.

Therefore, it is useful to classify the raw data points into different groups, for example planar, linear and cluttered data groups. With this, different data types can be processed with the most appropriate representation method. For example, tree data, which often take up a large memory space (even in the case of triangulated vegetation models), can be removed and be replaced with simple or more realistic looking generic models, depending on the required level of detail. On the other hand, a set of planar surfaces can be fitted to the planar data group, whereas cylinders can be fitted to the higher curvature data group. Another advantage of data labeling is to provide a better understanding of the scene; this is particularly useful for applications such as robotic navigation.
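A standard way to distinguish planar, linear and cluttered neighborhoods is through the eigenvalues of the local covariance matrix (the saliency features mentioned in the abstract). The sketch below, with our own hypothetical function name `saliency_features`, assumes the eigenvalues are sorted λ0 ≥ λ1 ≥ λ2; scatter-ness behaves like λ2, linear-ness like λ0 − λ1, and surface-ness like λ1 − λ2:

```python
import numpy as np

def saliency_features(neighborhood):
    """Eigenvalue-based saliency of a local point neighborhood.
    For sorted eigenvalues l0 >= l1 >= l2 of the 3x3 covariance:
    scatter ~ l2, linear ~ l0 - l1, surface ~ l1 - l2."""
    pts = np.asarray(neighborhood, dtype=float)
    cov = np.cov(pts.T)                            # 3x3 covariance of x, y, z
    l = np.sort(np.linalg.eigvalsh(cov))[::-1]     # descending eigenvalues
    return {"scatter": l[2], "linear": l[0] - l[1], "surface": l[1] - l[2]}
```

On a flat patch the surface feature dominates; on a pole-like neighborhood the linear feature dominates; in vegetation all three eigenvalues are comparable, so the scatter feature is relatively large.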

3D outdoor data classifiers are not new. Many previous attempts at urban modeling use aerial LIDAR (Light Detection and Ranging) data, where the acquired data consist of 2D (or 2.5D) bird's-eye view point clouds. Methods applied in aerial urban modeling to classify the data include linear classifiers or clustering methods with features extracted from height differences [7], [8], [9], [10], variations of surface normal vectors [8] and colors [11]. These methods are often inappropriate for data classification in terrestrial urban modeling. For example, vegetation removal is commonly performed by filtering on changes in height; however, rooftop information is generally unavailable in terrestrial- or ground-based data acquisition.

More recently, urban modeling via terrestrial laser scanning for the reconstruction of virtual city models has become increasingly common, owing to its ability to capture building façade details and to increase the realism of the final model. This is especially useful for applications that involve walkthrough animations, where the acquired calibrated images are texture-mapped onto the building models. Because of changing lighting conditions and moving objects (humans and vehicles), the calibrated images have to be pre-processed to clean up these inconsistencies. The inconsistency in the range data, caused by objects moving more slowly than the laser scanning frequency during data acquisition, can be removed by taking more than a single scan at every location. The scans at the same location are then combined by selecting the greatest depth for every point.
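The per-location merge can be sketched as follows; `merge_scans` is a hypothetical helper operating on aligned range images (2D arrays of depths) taken from the same scanner position, assuming the static background is visible in at least one scan:

```python
import numpy as np

def merge_scans(range_images):
    """Combine repeated range images from one scanner position by
    keeping, per measurement direction, the greatest depth.  A moving
    object appears closer than the static scene in only some of the
    scans, so taking the per-pixel maximum suppresses it."""
    stack = np.stack([np.asarray(r, dtype=float) for r in range_images])
    return stack.max(axis=0)
```

For example, a pedestrian that intercepts one ray in one scan (depth 2 m instead of the 5 m background) disappears after merging with a second, unobstructed scan.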

This paper focuses on the classification of outdoor range data acquired with a terrestrial laser scanner. We propose using the discriminative Conditional Random Fields (CRFs) model instead of the traditional generative model (see Section 2.3 for the reasons). However, training the learning model on the available large training data is time consuming, and inference (classification) with the learned parameters also takes a long time. Recognizing this problem, we have developed a method that adaptively reduces the number of data items: the training and inference of the learning model are based on a reduced data set of the original point clouds. The idea is that the redundancy among multiple data items with similar features is omitted from training and inference, so the total processing time required for training and testing is reduced. The proposed algorithm combines the extraction of distinct features from the point clouds, a multi-scale discriminative graphical probabilistic model and 3D scale theory for classification.

The paper is organized as follows: previous work on the classification of range data is summarized in Section 2; the proposed model architecture is explained in Section 3; the experimental setup and results are discussed in Section 4. We validated our method with synthetically generated data and urban data acquired from a terrestrial Riegl laser scanner. The results showed that the training and inference time is greatly reduced while comparable classification accuracy is maintained. We also compared our results with direct model fitting without classification; the comparison showed the benefit of classification over the previous approaches.


Background

In data classification via supervised learning, previous work has shown the advantages of global classification (which takes neighboring points into account) over local classification. However, the 3D data labeling in previous work was mostly point-based [12].

Super-voxels

Similar to He's model [28] for 2D image segmentation, we over-segment the 3D data into super-voxels before classifying it, using algorithms modified from 3D scale theory [24], [36]. Super-voxels are the result of an over-segmentation of the 3D point cloud. A super-voxel reduces the complexity of the raw data and provides longer-range interactions among the data. It is also a perceptually consistent unit that is uniform in underlying data structure and color.
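As a rough illustration of over-segmentation (not the paper's scale-adaptive algorithm, which sizes the support regions with 3D scale theory), points can simply be bucketed into a fixed voxel grid; the hypothetical `voxel_oversegment` below returns groups of point indices, one group per occupied voxel:

```python
import numpy as np
from collections import defaultdict

def voxel_oversegment(points, voxel_size):
    """Naive over-segmentation: bucket points into a regular voxel
    grid.  Each occupied voxel becomes one segment, so subsequent
    labeling operates on far fewer units than raw points."""
    points = np.asarray(points, dtype=float)
    keys = np.floor(points / voxel_size).astype(int)   # integer voxel coordinates
    groups = defaultdict(list)
    for i, key in enumerate(map(tuple, keys)):
        groups[key].append(i)
    return list(groups.values())
```

Features (mean color, reflectance, saliency, normal) would then be computed per segment rather than per point, which is what reduces the training and inference cost.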


Results

We tested our algorithm on synthetic data to show the advantage of over-segmentation (super-voxel labeling vs. individual data labeling) and the improvement gained by taking the dependencies among neighboring data into account (Conditional Random Field vs. logistic regression).

We then validated our algorithm with two sets of complicated real-world data acquired from a terrestrial laser scanner, shown in Fig. 6. The scanner has an accuracy of approximately 10 mm and a range of 250 m.

Conclusion

We have presented an efficient and accurate method for the classification of 3D terrestrial range data. By over-segmenting the raw point clouds into super-voxels, we reduce the amount of data (in most cases) to 5% of the original. We implemented the multi-scale Conditional Random Field to provide connectivity at the local, edge and regional levels. The gain in labeling precision of global classification (CRF) over local classification (logistic regression) has been demonstrated.

References (42)

  • Y. Yemez et al. A volumetric fusion technique for surface reconstruction from silhouettes and range data. Computer Vision and Image Understanding (2007)
  • N. Haala et al. Extraction of buildings and trees in urban environments. ISPRS Journal of Photogrammetry and Remote Sensing (1999)
  • P.D. White et al. A cost-efficient solution to true color terrestrial laser scanning. Geosphere (2008)
  • Moenning C, Dodgson NA. A new point cloud simplification algorithm. In: 3rd IASTED international conference on...
  • Alexa M, Behr J, Cohen-Or D, Fleishman S, Levin D, Silva T. Point set surfaces. In: Proc. 12th IEEE visualization conf....
  • R.T. Whitaker et al. On the reconstruction of height functions and terrain maps from dense range data. IEEE Transactions on Image Processing (2002)
  • Tanaka HT, Kishino F. Adaptive sampling and reconstruction for discontinuity preserving texture-mapped triangulation....
  • Krishnamoorthy P, Boyer KL, Flynn PJ. Robust detection of buildings in digital surface models. In: Proceedings 16th...
  • F. Rottensteiner. Automatic generation of high-quality building models from lidar data. IEEE Computer Graphics and Applications (2003)
  • Matikainen L, Hyyppa J, Hyyppä H. Automatic detection of buildings from laser scanner data for map updating. In: ISPRS...
  • Vosselman G. Fusion of laser scanning data, maps, and aerial photographs for building reconstruction. In: International...
  • Anguelov D, Taskar B, Chatalbashev V, Koller D, Gupta D, Heitz G, et al. Discriminative learning of Markov random...
  • Stamos I, Allen PK. Integration of range and image sensing for photo-realistic 3D modeling. In: Proceedings 2000 ICRA....
  • D.D. Lichti. Spectral filtering and classification of terrestrial laser scanner point clouds. Photogrammetric Record (2005)
  • Triebel R, Kersting K, Burgard W. Robust 3D scan point classification using associative Markov networks. In:...
  • J.D. Boissonnat et al. Coarse-to-fine surface simplification with geometric guarantees. Computer Graphics Forum (2001)
  • Dey TK, Li G, Sun J. Normal estimation for point clouds: A comparison study for a Voronoi based method. In:...
  • Xuming H, Zemel RS, Ray D. Learning and incorporating top-down cues in image segmentation. Lecture Notes in Computer...
  • Unnikrishnan R, Hebert M. Robust extraction of multiple structures from non-uniformly sampled data. In: IROS 2003....
  • Wolf DF, Sukhatme GS, Fox D, Burgard W. Autonomous terrain mapping and classification using hidden markov models. In:...
  • Tony J. Action-reaction learning: Analysis and synthesis of human behaviour. In: Program in media arts and sciences....