Abstract
With the boom in online video uploading, video tagging has become an important means of video indexing. However, existing text-based video tagging methods ignore either the genre labels or the temporal differences of videos, which degrades their results. Fortunately, a new type of video, the time-sync commented video, carries a large amount of information in its user comments, and this information can help video tagging. In this paper, we propose a supervised dynamic Latent Dirichlet Allocation model that uses the variational topics of time-sync comments to extract both genre labels and keywords as tags. We also conduct experiments on large-scale real-world datasets, and the results demonstrate the effectiveness of our model in both genre label classification and keyword extraction compared with baseline models.
Notes
- 5. These words and topic names were manually translated into English by the authors.
- 6. A Chinese internet slang term meaning "laughing".
Acknowledgments
This work is partially supported by the National Key Research and Development Program of China.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Zeng, Z., Xue, C., Gao, N., Wang, L., Liu, Z. (2018). Learning from Audience Intelligence: Dynamic Labeled LDA Model for Time-Sync Commented Video Tagging. In: Cheng, L., Leung, A., Ozawa, S. (eds) Neural Information Processing. ICONIP 2018. Lecture Notes in Computer Science, vol 11303. Springer, Cham. https://doi.org/10.1007/978-3-030-04182-3_48
DOI: https://doi.org/10.1007/978-3-030-04182-3_48
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-04181-6
Online ISBN: 978-3-030-04182-3
eBook Packages: Computer Science, Computer Science (R0)