Survey of Pedestrian Action Recognition in Unmanned-Driving

Chen, Li; Ma, Nan; Wang, Pengfei; Pang, Guilin; Shi, Xiaojun

doi:10.1007/978-981-13-7983-3_44

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1005)

Included in the following conference series:

International Conference on Cognitive Systems and Signal Processing

Abstract

With the development of unmanned-driving technology, the requirements for the intelligence, safety and stability of intelligent vehicles have risen. Especially in complex and uncertain environments, accurately detecting pedestrian actions is what allows a driverless car to drive autonomously in an effective way. This requires the vehicle to first detect pedestrians, then identify their body language, try to understand their intentions, and predict their actions, so that human and vehicle can form a good interactive cognition. In this paper, we give a detailed survey of recent and state-of-the-art research methods in the field of human action recognition and discuss their advantages and limitations. We analyze the main framework of motion recognition and summarize the common datasets of this field. Finally, we offer suggestions for future research directions, which we expect to benefit follow-up research.

Acknowledgment

We sincerely thank the anonymous reviewers for their constructive suggestions. This study was partially funded by the National Natural Science Foundation of China (grant numbers 61871038 and 61672178) and the Beijing Natural Science Foundation (grant number 4182022).

Author information

Authors and Affiliations

College of Robotics, Beijing Union University, Beijing, China
Nan Ma, Guilin Pang & Xiaojun Shi
Beijing Key Laboratory of Information Service Engineering, Beijing Union University, Beijing, China
Li Chen
Communication and Information Centre, State Administration of Work Safety, Beijing, China
Pengfei Wang

Corresponding author

Correspondence to Nan Ma.

Editor information

Editors and Affiliations

Department of Computer Science and Technology, Tsinghua University, Beijing, China
Fuchun Sun
Department of Computer Science and Technology, Tsinghua University, Beijing, China
Huaping Liu
College of Mechatronics and Automation, National University of Defense Technology, Changsha, Hunan, China
Dewen Hu

About this paper

Cite this paper

Chen, L., Ma, N., Wang, P., Pang, G., Shi, X. (2019). Survey of Pedestrian Action Recognition in Unmanned-Driving. In: Sun, F., Liu, H., Hu, D. (eds) Cognitive Systems and Signal Processing. ICCSIP 2018. Communications in Computer and Information Science, vol 1005. Springer, Singapore. https://doi.org/10.1007/978-981-13-7983-3_44

DOI: https://doi.org/10.1007/978-981-13-7983-3_44
Published: 28 April 2019
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-7982-6
Online ISBN: 978-981-13-7983-3
eBook Packages: Computer Science (R0)
