3D Vehicle Detection Based on LiDAR and Camera Fusion

Automotive Innovation

Abstract

Deep learning for object detection has become increasingly popular and is now widely adopted in many fields. This paper studies LiDAR and camera sensor fusion for vehicle detection, with the goal of high detection accuracy. The proposed network architecture exploits deep features from both the LiDAR point cloud and the RGB image. First, the LiDAR point cloud and the RGB image are fed into the system. Then, a high-resolution feature map is used to generate reliable 3D object proposals for both the LiDAR point cloud and the RGB image. Finally, 3D box regression is performed to predict the extent and orientation of vehicles in 3D space. Experiments on the challenging KITTI benchmark show that the proposed approach achieves favorable detection results, with a detection time of about 0.12 s per frame. The approach could serve as a basis for further research on autonomous vehicles.
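To make the last step of the pipeline concrete, the sketch below shows one common way to represent the "extent and orientation" of a vehicle as a 3D box and to recover its eight corners. The parameterization (centre, length/width/height, yaw about the vertical axis) and the function name `box_corners` are assumptions for illustration; the abstract does not state which encoding the paper regresses.

```python
import math
from typing import List, Tuple

Point3 = Tuple[float, float, float]


def box_corners(center: Point3, size: Point3, yaw: float) -> List[Point3]:
    """Return the 8 corners of a box rotated by `yaw` about the vertical (z) axis.

    Assumed convention (not taken from the paper): size = (length, width, height),
    with length along the box's local x axis and `center` at the geometric centre.
    """
    cx, cy, cz = center
    length, width, height = size
    cos_y, sin_y = math.cos(yaw), math.sin(yaw)

    corners: List[Point3] = []
    for dz in (-0.5, 0.5):  # bottom face first, then top face
        for dx, dy in ((0.5, 0.5), (0.5, -0.5), (-0.5, -0.5), (-0.5, 0.5)):
            # Corner offset in the box's local frame.
            lx, ly = dx * length, dy * width
            # Rotate the offset by yaw, then translate to the box centre.
            x = cx + lx * cos_y - ly * sin_y
            y = cy + lx * sin_y + ly * cos_y
            corners.append((x, y, cz + dz * height))
    return corners


if __name__ == "__main__":
    # Hypothetical car-sized box, 10 m ahead, turned 30 degrees.
    for corner in box_corners((10.0, 0.0, 0.75), (4.0, 1.8, 1.5), math.radians(30)):
        print(tuple(round(v, 3) for v in corner))
```

A seven-number box of this kind is compact, but other encodings (for example, regressing corner offsets directly) are also used in the literature, so this should be read as one plausible representation rather than the paper's.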

Abbreviations

BEV: Bird’s-eye view of the LiDAR point cloud

IOU: Intersection over union

ROI: Region of interest

AOS: Average orientation similarity
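As a brief illustration of the IOU term defined above, the following sketch computes intersection over union for two axis-aligned rectangles, such as box footprints in a BEV grid. It is a simplification chosen for clarity: boxes with arbitrary orientation require a polygon intersection instead, and the name `iou_axis_aligned` and the `(x_min, y_min, x_max, y_max)` layout are assumptions, not the paper's code.

```python
from typing import Tuple

# Assumed layout: (x_min, y_min, x_max, y_max) in metres on the ground plane.
Box2D = Tuple[float, float, float, float]


def iou_axis_aligned(a: Box2D, b: Box2D) -> float:
    """Intersection over union of two axis-aligned rectangles."""
    # Overlap along each axis, clamped at zero when the boxes are disjoint.
    overlap_x = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    overlap_y = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    intersection = overlap_x * overlap_y

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - intersection

    # Guard against degenerate (zero-area) inputs.
    return intersection / union if union > 0.0 else 0.0


if __name__ == "__main__":
    predicted = (0.0, 0.0, 2.0, 2.0)
    ground_truth = (1.0, 1.0, 3.0, 3.0)
    print(iou_axis_aligned(predicted, ground_truth))
```

A detection is typically counted as correct when its IOU with a ground-truth box exceeds a benchmark-defined threshold; the threshold used in the paper's experiments is not given in this excerpt.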

Acknowledgements

This work was supported by the National Key Research and Development Program of China (2017YFB0102603, 2018YFB0105003), the National Natural Science Foundation of China (51875255, 61601203, 61773184, U1564201, U1664258, U1764257, U1762264), the Natural Science Foundation of Jiangsu Province (BK20180100), the Six Talent Peaks Project of Jiangsu Province (2018-TD-GDZB-022), the Key Project for the Development of Strategic Emerging Industries of Jiangsu Province (2016-1094), and the Key Research and Development Program of Zhenjiang City (GY2017006).

Author information

Corresponding author

Correspondence to Hai Wang.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

About this article

Cite this article

Cai, Y., Zhang, T., Wang, H. et al. 3D Vehicle Detection Based on LiDAR and Camera Fusion. Automot. Innov. 2, 276–283 (2019). https://doi.org/10.1007/s42154-019-00083-z

  • DOI: https://doi.org/10.1007/s42154-019-00083-z
