
Temporal attention learning for action quality assessment in sports video

  • Original Article
  • Signal, Image and Video Processing

Abstract

This paper proposes an end-to-end temporal attention learning method to improve action quality assessment in sports video. For temporally weighted training, an attention-learning module is built to simulate the attention mechanism and judgement preferences of human perception in action quality assessment. The weights are learned from the loss of the segment-wise prediction errors and are used to balance the significance of the segment features. We evaluate the proposed method on the diving and gym-vault actions of the benchmark AQA-7 dataset. The experimental results show that the proposed attention-aware feature training is more effective than temporal aggregation and existing temporal relationship learning methods. Furthermore, using only the distance loss between the predicted score and the ground-truth score for training, without a ranking loss across different videos, the method achieves state-of-the-art performance on both the Spearman rank correlation and the mean Euclidean distance between the predicted scores and the judges' scores.
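The abstract does not give the architecture of the attention module or the exact form of its loss, so the following is only an illustrative sketch of the two ideas it names: weighting segment features by learned attention, and scoring predictions with the Spearman rank correlation and a mean distance. All function names are hypothetical, the softmax normalisation of the weights is an assumption, and the mean Euclidean distance is read here as the mean absolute difference between scalar scores.

```python
import math


def softmax(logits):
    """Turn raw attention logits into positive weights that sum to 1 (assumed form)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(segment_features, attention_logits):
    """Attention-weighted sum of per-segment feature vectors (hypothetical helper)."""
    weights = softmax(attention_logits)
    dim = len(segment_features[0])
    return [
        sum(w * feat[d] for w, feat in zip(weights, segment_features))
        for d in range(dim)
    ]


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(predicted, target):
    """Spearman rank correlation, computed as the Pearson correlation of the ranks."""
    rp, rt = average_ranks(predicted), average_ranks(target)
    n = len(rp)
    mean_p, mean_t = sum(rp) / n, sum(rt) / n
    cov = sum((a - mean_p) * (b - mean_t) for a, b in zip(rp, rt))
    var_p = sum((a - mean_p) ** 2 for a in rp)
    var_t = sum((b - mean_t) ** 2 for b in rt)
    return cov / math.sqrt(var_p * var_t)


def mean_distance(predicted, target):
    """Mean absolute gap between scalar scores (Euclidean distance in one dimension)."""
    return sum(abs(p - t) for p, t in zip(predicted, target)) / len(predicted)


# Made-up numbers for illustration only; they are not results from the paper.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pooled = attention_pool(features, [0.0, 0.0, 0.0])  # equal logits reduce to mean pooling
predicted = [80.0, 65.5, 92.0, 70.0]
judged = [78.5, 60.0, 95.0, 72.0]
rho = spearman(predicted, judged)
dist = mean_distance(predicted, judged)
```

In this toy example the predicted and judged scores have the same ordering, so the rank correlation is perfect even though the mean distance is not zero. This is why the two metrics are reported side by side.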





Funding

This work was supported by the National Natural Science Foundation of China (61871196, 62001176), the Natural Science Foundation of Fujian Province, China (2019J01082, 2020J01085), and the Promotion Program for Young and Middle-aged Teachers in Science and Technology Research of Huaqiao University (ZQN-YX601).

Author information


Corresponding author

Correspondence to Jixiang Du.


About this article


Cite this article

Lei, Q., Zhang, H. & Du, J. Temporal attention learning for action quality assessment in sports video. SIViP 15, 1575–1583 (2021). https://doi.org/10.1007/s11760-021-01890-w


  • DOI: https://doi.org/10.1007/s11760-021-01890-w
