ABSTRACT
Conventional reinforcement-learning-based video summarization approaches suffer from the problem that the reward is only received after the entire summary has been generated. Such a reward is sparse and makes reinforcement learning hard to converge. A further problem is that labelling each shot is tedious and costly, which usually prohibits the construction of large-scale datasets. To address these problems, we propose a weakly supervised hierarchical reinforcement learning framework that decomposes the whole task into several subtasks to enhance summarization quality. The framework consists of a manager network and a worker network. For each subtask, the manager is trained to set a subgoal using only a task-level binary label, which requires far fewer labels than conventional approaches. Guided by the subgoal, the worker predicts importance scores for the video shots in the subtask via policy gradient, according to both a global reward and newly defined sub-rewards that overcome the sparsity problem. Experiments on two benchmark datasets show that our approach achieves the best performance, even surpassing supervised approaches.
Index Terms
- Weakly Supervised Video Summarization by Hierarchical Reinforcement Learning