Dense real-time mapping of object-class semantics from RGB-D video

Stückler, Jörg; Waldvogel, Benedikt; Schulz, Hannes; Behnke, Sven

doi:10.1007/s11554-013-0379-5

Special Issue Paper
Published: 22 November 2013

Volume 10, pages 599–609, (2015)
Journal of Real-Time Image Processing

Abstract

We propose a real-time approach to learn semantic maps from moving RGB-D cameras. Our method models geometry, appearance, and semantic labeling of surfaces. We recover camera pose using simultaneous localization and mapping while concurrently recognizing and segmenting object classes in the images. Our object-class segmentation approach is based on random decision forests and yields a dense probabilistic labeling of each image. We implemented it on GPU to achieve a high frame rate. The probabilistic segmentation is fused in octree-based 3D maps within a Bayesian framework. In this way, image segmentations from various view points are integrated within a 3D map which improves segmentation quality. We evaluate our system on a large benchmark dataset and demonstrate state-of-the-art recognition performance of our object-class segmentation and semantic mapping approaches.
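
The Bayesian fusion step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a uniform class prior, treats the per-view segmentations as conditionally independent, and uses a flat dictionary of fixed-size voxels as a stand-in for the octree. The function name `fuse_observations`, the `voxel_size` parameter, and the input layout are hypothetical choices made for this example.

```python
import numpy as np

def fuse_observations(observations, num_classes, voxel_size=0.05):
    """Fuse per-point class distributions into per-voxel class posteriors.

    observations: iterable of (point_xyz, class_probs) pairs, where
    class_probs is the segmentation output for the pixel that was
    back-projected to point_xyz (hypothetical input layout).
    """
    log_post = {}  # voxel index -> unnormalised log posterior (uniform prior)
    for point, probs in observations:
        # Quantise the 3D point to a voxel index (stand-in for an octree leaf).
        idx = np.floor(np.asarray(point, dtype=float) / voxel_size)
        key = tuple(int(i) for i in idx)
        # Clip to avoid log(0) when a class gets zero probability in one view.
        probs = np.clip(np.asarray(probs, dtype=float), 1e-6, 1.0)
        acc = log_post.setdefault(key, np.zeros(num_classes))
        # Independent observations multiply, so their logs add (in place).
        acc += np.log(probs)

    posteriors = {}
    for key, lp in log_post.items():
        p = np.exp(lp - lp.max())  # subtract the max for numerical stability
        posteriors[key] = p / p.sum()
    return posteriors


if __name__ == "__main__":
    views = [
        ((0.01, 0.01, 0.01), [0.6, 0.4]),  # view 1 of a surface point
        ((0.02, 0.02, 0.02), [0.7, 0.3]),  # view 2, same voxel
    ]
    for voxel, posterior in fuse_observations(views, num_classes=2).items():
        print(voxel, posterior)
```

Under these assumptions, two views that agree weakly on a class yield a posterior that is more confident than either view alone, which matches the abstract's claim that integrating segmentations from several viewpoints improves labeling quality.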

Author information

Authors and Affiliations

Computer Science Institute VI, Autonomous Intelligent Systems, University of Bonn, Bonn, Germany
Jörg Stückler, Benedikt Waldvogel, Hannes Schulz & Sven Behnke

Corresponding author

Correspondence to Jörg Stückler.

About this article

Cite this article

Stückler, J., Waldvogel, B., Schulz, H. et al. Dense real-time mapping of object-class semantics from RGB-D video. J Real-Time Image Proc 10, 599–609 (2015). https://doi.org/10.1007/s11554-013-0379-5

Received: 22 May 2013
Accepted: 26 October 2013
Published: 22 November 2013
Issue Date: December 2015
DOI: https://doi.org/10.1007/s11554-013-0379-5
