Periodic-Aware Network for Fine-Grained Action Recognition

  • Conference paper
Pattern Recognition and Computer Vision (PRCV 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14432)

Abstract

Skeleton-based action recognition has recently gained increasing attention and achieved remarkable results on coarse-grained action recognition. These methods are less effective, however, in scenarios that require a detailed comparison between fine-grained classes, e.g. different moves during a vault. In particular, existing methods struggle to distinguish the subtle differences between actions that differ in their number of repetitions. To address this problem, we introduce periodicity into fine-grained action classification and propose a novel network architecture, the periodic-aware network (PAN), which distinguishes fine-grained actions with different numbers of repetitions. First, a periodicity feature extraction module (PFEM) captures periodicity information and extracts periodicity features at different levels. Then, a periodicity fusion module (PFM) fuses the periodicity features with spatiotemporal features; multiple PFMs are applied to fuse features at different levels. Finally, the fused features are classified to obtain the result. Extensive experiments on two fine-grained skeleton-based action recognition datasets, FineGym and Diving48, show that the proposed method outperforms previous skeleton-based action recognition methods.
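The abstract does not say how the PFEM computes periodicity, so the following is only a minimal illustration of why periodicity information can separate actions that differ in repetition count. It is not the authors' method. It assumes a hypothetical one-dimensional per-frame signal (for example, a single joint angle) and uses plain autocorrelation in place of any learned module; all function names are invented for this sketch.

```python
import math


def autocorrelation(signal, lag):
    """Biased, mean-centred autocorrelation of a 1-D sequence at one lag."""
    n = len(signal)
    mean = sum(signal) / n
    centred = [x - mean for x in signal]
    energy = sum(c * c for c in centred)
    if energy == 0:
        return 0.0  # constant signal: no structure to correlate
    overlap = sum(centred[t] * centred[t + lag] for t in range(n - lag))
    return overlap / energy


def dominant_period(signal):
    """Lag of the strongest autocorrelation peak beyond lag 0, or None."""
    acf = [autocorrelation(signal, lag) for lag in range(len(signal) // 2 + 1)]
    # Skip the trivial peak around lag 0 by advancing until the
    # autocorrelation first stops being positive.
    lag = 0
    while lag < len(acf) and acf[lag] > 0:
        lag += 1
    if lag == 0 or lag == len(acf):
        return None  # constant signal, or no sign change within the window
    return max(range(lag, len(acf)), key=lambda k: acf[k])


def count_repetitions(signal):
    """Estimate the number of repetitions as sequence length / period."""
    period = dominant_period(signal)
    return 0 if period is None else round(len(signal) / period)


def synthetic_joint_angle(num_frames, repetitions):
    """Hypothetical stand-in for a joint-angle track: a pure sine wave."""
    step = 2 * math.pi * repetitions / num_frames
    return [math.sin(step * t) for t in range(num_frames)]


# Two clips of equal length that differ only in how often the motion repeats.
two_cycles = synthetic_joint_angle(48, 2)
three_cycles = synthetic_joint_angle(48, 3)
print(count_repetitions(two_cycles), count_repetitions(three_cycles))
```

The two synthetic clips have the same length and amplitude, so a descriptor that ignores temporal repetition would find them hard to tell apart, while the estimated period separates them directly. A learned module operating on real skeleton features would need to be far more robust than this toy estimator.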

S. Luo and J. Xiao: These authors contributed equally to this work and should be considered co-first authors. This work was supported by the Guangdong Basic and Applied Basic Research Foundation (No. 2021A1515011867); the National Natural Science Foundation of China (NSFC) (61976123); the Taishan Young Scholars Program of Shandong Province; and the Key Development Program for Basic Research of Shandong Province (ZR2020ZD44).

Author information

Corresponding author

Correspondence to Dong Li.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Luo, S., Xiao, J., Li, D., Jian, M. (2024). Periodic-Aware Network for Fine-Grained Action Recognition. In: Liu, Q., et al. Pattern Recognition and Computer Vision. PRCV 2023. Lecture Notes in Computer Science, vol 14432. Springer, Singapore. https://doi.org/10.1007/978-981-99-8543-2_9

  • DOI: https://doi.org/10.1007/978-981-99-8543-2_9

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-8542-5

  • Online ISBN: 978-981-99-8543-2

  • eBook Packages: Computer Science, Computer Science (R0)
