ABSTRACT
TV advertising is ubiquitous, persistent, and economically vital, and TV commercials affect the living and working habits of millions of people. In this paper, we present a multimodal ("visual + audio + text") commercial video digest scheme that segments individual commercials in TV streams and performs semantic content analysis within each detected commercial segment. Two challenging issues are addressed. First, we propose a multimodal approach to robustly detect the boundaries of individual commercials. Second, we attempt to classify each commercial by its advertised product or service. For the first issue, boundary detection is reduced to binary classification of shot boundaries using mid-level features derived from two concepts: Image Frames Marked with Product Information (FMPI) and Audio Scene Change Indicator (ASCI). Accurate boundaries of individual commercials also enable commercial identification by clip matching with a spatial-temporal signature. For the second issue, commercial classification is formulated as text categorization, in which the sparse text obtained from ASR/OCR is expanded with external knowledge. Our boundary detection achieves F1 = 93.7% on a dataset of 499 individual commercials from the TRECVID'05 video corpus, and commercial classification attains a promising accuracy of 80.9% on 141 distinct commercials. These results can support applications such as an intelligent digital TV set-top box that enhances viewers' ability to monitor and manage commercials in TV streams.
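The reduction of boundary detection to binary classification of shot boundaries, and the F1 measure used to report it, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `ShotBoundary` record, the score ranges, the weights, the threshold, and the toy data are hypothetical, and the linear rule merely stands in for whatever trained classifier the paper uses.

```python
from dataclasses import dataclass


@dataclass
class ShotBoundary:
    time_s: float  # timestamp of the shot cut, in seconds
    fmpi: float    # hypothetical FMPI-derived score in [0, 1] near the cut
    asci: float    # hypothetical ASCI-derived score in [0, 1] near the cut


def is_commercial_boundary(b, w_fmpi=0.5, w_asci=0.5, threshold=0.5):
    """Toy linear decision rule: a stand-in for a trained binary classifier."""
    return w_fmpi * b.fmpi + w_asci * b.asci >= threshold


def f1_score(predicted, actual):
    """F1 = harmonic mean of precision and recall over sets of boundary times."""
    true_positives = len(predicted & actual)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(actual)
    return 2 * precision * recall / (precision + recall)


# Made-up candidate shot boundaries; not data from the paper.
shots = [
    ShotBoundary(0.0, 0.9, 0.8),
    ShotBoundary(7.5, 0.2, 0.1),
    ShotBoundary(15.0, 0.8, 0.9),
    ShotBoundary(22.0, 0.7, 0.4),
    ShotBoundary(30.0, 0.3, 0.9),
]
ground_truth = {0.0, 15.0, 30.0}  # hypothetical true commercial boundaries

predicted = {b.time_s for b in shots if is_commercial_boundary(b)}
score = f1_score(predicted, ground_truth)
print(sorted(predicted), round(score, 3))
```

In this toy run the rule accepts one false boundary (at 22.0 s), so precision is 3/4 while recall is 1, which shows how F1 penalizes over-segmentation even when no true boundary is missed.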
Index Terms
- Segmentation, categorization, and identification of commercial clips from TV streams using multimodal analysis