Article

Topic transition detection using hierarchical hidden Markov and semi-Markov models

Authors:
Dinh Q. Phung, Curtin University of Technology, Perth, Western Australia
T. V. Duong, Curtin University of Technology, Perth, Western Australia
S. Venkatesh, Curtin University of Technology, Perth, Western Australia
Hung H. Bui, SRI International, Menlo Park, CA

MULTIMEDIA '05: Proceedings of the 13th annual ACM international conference on Multimedia, November 2005, Pages 11–20, https://doi.org/10.1145/1101149.1101153

Published: 06 November 2005

ABSTRACT

In this paper we introduce a probabilistic framework that exploits hierarchy, structure sharing and duration information for topic transition detection in videos. The framework combines a shot classification step with a detection phase that uses hierarchical probabilistic models. We consider two models: the extended Hierarchical Hidden Markov Model (HHMM) and the Coxian Switching Hidden semi-Markov Model (S-HSMM). Both allow the natural decomposition of semantics in videos, including shared structures, to be modeled directly, which enables efficient inference and reduces the sample complexity of learning. The S-HSMM additionally incorporates duration information, so the modeling of long-term dependencies in videos is enriched through both hierarchical and duration modeling. Furthermore, the use of the Coxian distribution in the S-HSMM makes it tractable to deal with long sequences in video. Our experiments with the proposed framework on twelve educational and training videos show that both models outperform the baseline cases (flat HMM and HSMM) and the performance reported in earlier work on topic detection. The superior performance of the S-HSMM over the HHMM supports our belief that duration information is an important factor in video content modeling.
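
To make the tractability remark about the Coxian distribution more concrete, here is a minimal sketch of a discrete Coxian duration distribution. The abstract does not give the parameterization, so this assumes one common discrete form: with probability `mu[i]` the duration starts in phase `i` and equals the sum of independent geometric variables for phase `i` through the last phase. The names `geometric_pmf`, `coxian_pmf`, `mu`, `lam` and `max_d` are illustrative and are not taken from the paper.

```python
import numpy as np


def geometric_pmf(lam, max_d):
    """P(X = d) = lam * (1 - lam)**(d - 1) for d = 1..max_d; index 0 is P(X = 0) = 0."""
    d = np.arange(max_d + 1)
    pmf = lam * (1.0 - lam) ** np.clip(d - 1, 0, None)
    pmf[0] = 0.0  # a geometric phase lasts at least one time step
    return pmf


def coxian_pmf(mu, lam, max_d):
    """Truncated pmf of an assumed discrete Coxian with M = len(mu) phases.

    mu[i]  : probability of starting in phase i (should sum to 1)
    lam[i] : geometric parameter of phase i
    Returns an array p with p[d] = P(duration = d) for d = 0..max_d.
    """
    mu = np.asarray(mu, dtype=float)
    lam = np.asarray(lam, dtype=float)
    total = np.zeros(max_d + 1)
    tail = None  # pmf of the summed phase durations from phase i to the last phase
    for i in range(len(mu) - 1, -1, -1):
        g = geometric_pmf(lam[i], max_d)
        # Adding an independent phase corresponds to convolving the pmfs;
        # truncating at max_d leaves all entries up to max_d exact.
        tail = g if tail is None else np.convolve(g, tail)[: max_d + 1]
        total += mu[i] * tail
    return total


if __name__ == "__main__":
    # Three phases: six parameters describe durations of any length.
    p = coxian_pmf(mu=[0.5, 0.3, 0.2], lam=[0.1, 0.2, 0.3], max_d=200)
    print("mass covered up to d = 200:", p.sum())
    print("most likely duration:", int(p.argmax()))
```

Under this assumed form, an M-phase Coxian needs only 2M parameters regardless of the longest duration, whereas an explicit multinomial over durations needs one parameter per possible duration. That is the kind of saving that could keep long video sequences manageable.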

Index Terms

Topic transition detection using hierarchical hidden Markov and semi-Markov models
1. Information systems
  1. Information retrieval
    1. Document representation


Published in
MULTIMEDIA '05: Proceedings of the 13th annual ACM international conference on Multimedia
November 2005
1110 pages
ISBN:1595930442
DOI:10.1145/1101149
General Chairs:
Hongjiang Zhang, Microsoft Research Asia, China
Tat-Seng Chua, National University of Singapore, Singapore
Program Chairs:
Ralf Steinmetz, Technische Universität Darmstadt, Germany
Mohan Kankanhalli, National University of Singapore, Singapore
Lynn Wilcox, FXPAL
Copyright © 2005 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
coxian
educational videos
hierarchical Markov (Semi-Markov) models
topic transition detection
Qualifiers
- Article
Conference

Acceptance Rates
MULTIMEDIA '05 Paper Acceptance Rate: 49 of 312 submissions, 16%
Overall Acceptance Rate: 995 of 4,171 submissions, 24%