Abstract
With the development of unmanned-driving car technology, there are higher requirements for the intelligence, safety and stability of intelligent vehicle driving. Especially in complex and uncertain environments, accurately detecting pedestrian actions is essential for effective autonomous driving. This requires that vehicles first detect pedestrians, then identify their body language, try to understand their intentions, and predict their actions, forming a good interactive cognition between human and vehicle. In this paper, we give a detailed survey of recent and state-of-the-art research methods in the field of human action recognition and discuss their advantages and limitations. We analyze the main framework of motion recognition and summarize the common datasets of this field. Finally, we offer suggestions for future research directions, which are expected to benefit follow-up research.
Acknowledgment
We thank the anonymous reviewers for their constructive suggestions. This study was partially funded by the National Natural Science Foundation of China (grant numbers 61871038 and 61672178) and the Beijing Natural Science Foundation (grant number 4182022).
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Chen, L., Ma, N., Wang, P., Pang, G., Shi, X. (2019). Survey of Pedestrian Action Recognition in Unmanned-Driving. In: Sun, F., Liu, H., Hu, D. (eds) Cognitive Systems and Signal Processing. ICCSIP 2018. Communications in Computer and Information Science, vol 1005. Springer, Singapore. https://doi.org/10.1007/978-981-13-7983-3_44
DOI: https://doi.org/10.1007/978-981-13-7983-3_44
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-7982-6
Online ISBN: 978-981-13-7983-3
eBook Packages: Computer Science (R0)