Toward Surroundings-Aware Temporal Prediction of 3D Human Skeleton Sequence

  • Conference paper
Pattern Recognition, Computer Vision, and Image Processing. ICPR 2022 International Workshops and Challenges (ICPR 2022)

Abstract

Temporal prediction of human pose sequences is vital for robot applications such as human-robot interaction and autonomous robot control. Recent methods predict future skeletons from a 3D human skeleton sequence. Even if the starting motions of two skeleton sequences are very similar, their future motions may differ because of the objects surrounding the person, so it is difficult to predict future skeleton sequences from a given skeleton sequence alone. The presence of surrounding objects, however, is an important clue for the prediction. This paper proposes a method of predicting future skeleton sequences by incorporating surrounding information into the skeleton sequence. We assume that the surroundings of a target person do not change significantly within a few seconds and use an image feature around the target person as the surrounding information. Evaluations on a public dataset confirm a performance improvement.
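
As a rough illustration of the idea of incorporating surrounding information into a skeleton sequence, the following Python sketch attaches one fixed feature vector to every frame of an observed sequence. It is only a sketch under stated assumptions: the abstract does not say how the two inputs are fused, so simple per-frame concatenation is assumed here, and the names `flatten_frame` and `attach_surroundings` are hypothetical. Reusing a single feature for all frames mirrors the stated assumption that the surroundings do not change significantly within a few seconds.

```python
from typing import List, Sequence

Joint = Sequence[float]   # one 3D joint: (x, y, z)
Frame = Sequence[Joint]   # one skeleton: a list of joints


def flatten_frame(frame: Frame) -> List[float]:
    """Flatten a skeleton frame into a single coordinate vector."""
    return [coord for joint in frame for coord in joint]


def attach_surroundings(
    skeleton_seq: Sequence[Frame],
    surround_feature: Sequence[float],
) -> List[List[float]]:
    """Append the same surroundings feature to every frame.

    ASSUMPTION: fusion by concatenation is an illustrative choice,
    not the method described in the paper. The feature vector stands
    in for an image feature extracted around the target person.
    """
    feature = list(surround_feature)
    return [flatten_frame(frame) + feature for frame in skeleton_seq]


# Two frames, two joints each, plus a 2-dimensional stand-in feature.
observed = [
    [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
    [(0.0, 0.0, 1.0), (1.0, 1.0, 2.0)],
]
model_input = attach_surroundings(observed, [0.5, 0.25])
```

Each row of `model_input` then holds the joint coordinates of one frame followed by the shared surroundings feature, which a sequence predictor could consume in place of the skeleton-only input.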

Author information

Corresponding author

Correspondence to Tomohiro Fujita.

Copyright information

© 2023 Springer Nature Switzerland AG

About this paper

Cite this paper

Fujita, T., Kawanishi, Y. (2023). Toward Surroundings-Aware Temporal Prediction of 3D Human Skeleton Sequence. In: Rousseau, JJ., Kapralos, B. (eds) Pattern Recognition, Computer Vision, and Image Processing. ICPR 2022 International Workshops and Challenges. ICPR 2022. Lecture Notes in Computer Science, vol 13643. Springer, Cham. https://doi.org/10.1007/978-3-031-37660-3_10

  • DOI: https://doi.org/10.1007/978-3-031-37660-3_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-37659-7

  • Online ISBN: 978-3-031-37660-3

  • eBook Packages: Computer Science (R0)
