Audio-visual segmentation for content-based retrieval

Pye, David; Hollinghurst, Nicholas J.; Mills, Timothy J.; Wood, Kenneth R.

doi:10.21437/ICSLP.1998-598

This paper reports recent work at ORL on segmentation of digital audio/video recordings. First, we describe an audio segmentation algorithm that partitions a soundtrack into manageably sized segments for speech recognition. Second, we present an algorithm for detecting camera shot-break locations in the video. The outputs of these two algorithms are combined to produce a semantically meaningful segmentation of audio/video content, appropriate for information retrieval. We report the success of the algorithms in the context of television news retrieval.

Cite as: Pye, D., Hollinghurst, N.J., Mills, T.J., Wood, K.R. (1998) Audio-visual segmentation for content-based retrieval. Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998), paper 0517, doi: 10.21437/ICSLP.1998-598

@inproceedings{pye98_icslp,
  author={David Pye and Nicholas J. Hollinghurst and Timothy J. Mills and Kenneth R. Wood},
  title={{Audio-visual segmentation for content-based retrieval}},
  year=1998,
  booktitle={Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998)},
  pages={paper 0517},
  doi={10.21437/ICSLP.1998-598}
}