Mining movie archives for song sequences

Doudpota, Sher Muhammad

doi:10.1007/s11042-012-1021-4

Published: 24 February 2012

Volume 69, pages 359–382, (2014)

Multimedia Tools and Applications

Sher Muhammad Doudpota¹

Abstract

Music and songs are integral parts of Bollywood movies. Every movie, two to three hours long, contains three to ten songs, each 3–10 min long. Music lovers like to listen to the music and songs of a movie; however, searching a movie manually for all of its songs is time-consuming and error-prone. The task becomes much harder when songs must be extracted from a large archive containing hundreds of movies. This paper presents an approach to automatically extract music and songs from archived musical movies. We used a song grammar to construct a Markov chain model that differentiates song scenes from dialogue and action scenes in a movie. We tested our system on Bollywood, Hollywood, Pakistani, Bengali, and Tamil movies. A total of 20 movies from different industries were selected for the experiments. On Bollywood movies, we achieved 97.22% recall in song extraction, whereas the recall on Hollywood musical movies was 80%. The test result on Pakistani, Tamil, and Bengali movies was 87.09%.
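
The abstract does not give the song grammar or the model parameters, so the following is only a minimal sketch of how a Markov chain could separate song scenes from other scenes. It assumes that each scene has already been reduced to a sequence of audio segment labels. The labels (music, vocal, speech), both transition tables, and the function names are hypothetical illustrations, not the paper's actual model. A scene is labeled a song when its label sequence is more likely under the assumed song chain than under the assumed non-song chain.

```python
import math

# Hypothetical transition probabilities P(next | current).
# These values are illustrative assumptions, not taken from the paper.
SONG = {
    "music":  {"music": 0.50, "vocal": 0.45, "speech": 0.05},
    "vocal":  {"music": 0.40, "vocal": 0.55, "speech": 0.05},
    "speech": {"music": 0.50, "vocal": 0.30, "speech": 0.20},
}
NON_SONG = {
    "music":  {"music": 0.30, "vocal": 0.05, "speech": 0.65},
    "vocal":  {"music": 0.20, "vocal": 0.10, "speech": 0.70},
    "speech": {"music": 0.15, "vocal": 0.05, "speech": 0.80},
}


def log_likelihood(labels, transitions):
    """Sum the log transition probabilities along a label sequence."""
    return sum(
        math.log(transitions[current][nxt])
        for current, nxt in zip(labels, labels[1:])
    )


def is_song(labels):
    """Return True if the song chain explains the sequence better."""
    return log_likelihood(labels, SONG) > log_likelihood(labels, NON_SONG)


if __name__ == "__main__":
    song_like = ["music", "vocal", "music", "vocal", "vocal", "music"]
    dialogue_like = ["speech", "speech", "music", "speech", "speech"]
    print(is_song(song_like))
    print(is_song(dialogue_like))
```

In practice the transition probabilities would be estimated from labeled scenes rather than set by hand; comparing log-likelihoods avoids numerical underflow on long sequences.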

Author information

Authors and Affiliations

Computer Science and Information Management, Asian Institute of Technology, Bangkok, Thailand
Sher Muhammad Doudpota

Corresponding author

Correspondence to Sher Muhammad Doudpota.

Additional information

The preliminary idea of this paper was published in the Proceedings of International Conference on Database and Data Mining (ICDDM), July 2010, Manila, Philippines.

About this article

Cite this article

Doudpota, S.M. Mining movie archives for song sequences. Multimed Tools Appl 69, 359–382 (2014). https://doi.org/10.1007/s11042-012-1021-4

Published: 24 February 2012
Issue Date: March 2014
DOI: https://doi.org/10.1007/s11042-012-1021-4
