Sample Less, Learn More: Efficient Action Recognition via Frame Feature Restoration

Cheng, Harry; Guo, Yangyang; Nie, Liqiang; Cheng, Zhiyong; Kankanhalli, Mohan

arXiv:2307.14866 (cs)

[Submitted on 27 Jul 2023]

Abstract: Training an effective video action recognition model poses significant computational challenges, particularly under limited resource budgets. Current methods primarily aim to either reduce model size or utilize pre-trained models, which limits their adaptability to various backbone architectures. This paper investigates the issue of over-sampled frames, a problem that is prevalent in many approaches yet has received relatively little attention. Although using fewer frames is a potential solution, it often results in a substantial decline in performance. To address this issue, we propose a novel method to restore the intermediate features for two sparsely sampled and adjacent video frames. This feature restoration technique brings a negligible increase in computational requirements compared to resource-intensive image encoders, such as ViT. To evaluate the effectiveness of our method, we conduct extensive experiments on four public datasets: Kinetics-400, ActivityNet, UCF-101, and HMDB-51. With the integration of our method, the efficiency of three commonly used baselines improves by over 50%, with a mere 0.5% reduction in recognition accuracy. In addition, our method surprisingly helps improve the generalization ability of the models under zero-shot settings.
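
As a rough illustration of the idea in the abstract, the sketch below runs an expensive encoder only on sparsely sampled frames and fills in the features of the skipped frames from the two adjacent encoded ones. It is a minimal sketch under stated assumptions, not the authors' method: the function name `restore_features`, the `stride` parameter, and the toy encoder are hypothetical, and plain linear interpolation stands in for the learned restoration module the paper proposes.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def restore_features(
    frames: Sequence[Sequence[float]],
    encode: Callable[[Sequence[float]], Vector],
    stride: int = 2,
) -> Tuple[List[Vector], int]:
    """Encode every `stride`-th frame and fill in the skipped positions.

    Returns one feature vector per input frame plus the number of
    encoder calls made. Linear interpolation is an ASSUMED stand-in
    for a learned restoration module.
    """
    if stride < 1:
        raise ValueError("stride must be >= 1")
    if not frames:
        return [], 0

    # Indices of the frames that go through the expensive encoder.
    sampled = list(range(0, len(frames), stride))
    if sampled[-1] != len(frames) - 1:
        sampled.append(len(frames) - 1)  # always encode the last frame
    encoded = {i: encode(frames[i]) for i in sampled}

    features: List[Vector] = []
    for left, right in zip(sampled, sampled[1:]):
        for t in range(left, right):
            # Weight 0 reproduces the left encoded feature exactly.
            w = (t - left) / (right - left)
            features.append(
                [(1 - w) * a + w * b
                 for a, b in zip(encoded[left], encoded[right])]
            )
    features.append(encoded[sampled[-1]])
    return features, len(sampled)


def toy_encoder(frame: Sequence[float]) -> Vector:
    """Hypothetical stand-in for an image encoder such as ViT."""
    return [2.0 * x for x in frame]


# Five toy "frames", each a 2-dimensional vector.
frames = [[float(t), float(t) + 1.0] for t in range(5)]
features, encoder_calls = restore_features(frames, toy_encoder, stride=2)
```

Here the encoder runs on three of the five frames, and the two skipped positions are filled from their neighbours. A learned restoration module would replace the interpolation step; the cost argument is the same as long as that module is much cheaper than the encoder.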

Comments:	13 pages. Code and pretrained weights will be released at this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
Cite as:	arXiv:2307.14866 [cs.CV]
	(or arXiv:2307.14866v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2307.14866

Submission history

From: Harry Cheng
[v1] Thu, 27 Jul 2023 13:52:42 UTC (1,385 KB)
