Paper (Open access)

Segments-Based 3D ConvNet for Action Recognition


Published under licence by IOP Publishing Ltd
Citation: Wei Li et al 2020 J. Phys.: Conf. Ser. 1621 012042, DOI 10.1088/1742-6596/1621/1/012042


Abstract

Learning to capture both long-range and short-range temporal information is crucial for the action recognition task. Previous works use 3D ConvNets to capture short-range temporal dynamics as a replacement for optical flow, which requires time-consuming extraction. However, the dramatically increased number of parameters limits their capacity for modeling long-term interactions. In this paper, we propose the Segments-based 3D ConvNet (S3D) to integrate both long-term and short-term temporal dynamics. First, we use a 3D ResNet without temporal downsampling to capture short-range video content. Second, we integrate a sparse sampling strategy to model long-range temporal structure. Finally, experiments on the UCF-101 and HMDB-51 datasets show the effectiveness of our S3D compared with the corresponding 3D ConvNet.
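As an illustration of the sparse sampling idea, the following is a minimal sketch and not the authors' implementation. It assumes a segment-based scheme in which the video is split into equal segments, one short clip of consecutive frames is drawn at random from each segment, and per-clip class scores from a 3D network are averaged. The function names, the default of 3 segments and 8 frames per clip, and the averaging step are all assumptions made for this example; the abstract does not specify them.

```python
import random


def sample_segment_clips(num_frames, num_segments=3, clip_len=8, rng=None):
    """Return one list of consecutive frame indices per segment.

    Hypothetical helper: segment count and clip length are illustrative.
    """
    if num_frames < num_segments * clip_len:
        raise ValueError("video too short for the requested segments and clip length")
    rng = rng or random.Random()
    seg_len = num_frames // num_segments  # trailing leftover frames are ignored
    clips = []
    for k in range(num_segments):
        seg_start = k * seg_len
        # Largest offset that keeps the whole clip inside segment k.
        max_offset = seg_len - clip_len
        start = seg_start + rng.randint(0, max_offset)
        clips.append(list(range(start, start + clip_len)))
    return clips


def average_consensus(segment_scores):
    """Average per-segment class scores into one video-level score vector.

    Averaging is an assumed aggregation; the abstract does not name one.
    """
    n = len(segment_scores)
    return [sum(col) / n for col in zip(*segment_scores)]
```

In such a setup, each sampled clip would be passed through the same 3D network (short-range modeling), and the consensus over segments would supply the long-range view of the video.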


Content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
