
Mining movie archives for song sequences

Multimedia Tools and Applications

Abstract

Music and songs are integral parts of Bollywood movies. Every movie, two to three hours long, contains three to ten songs, each 3–10 min long. Music lovers like to listen to the music and songs of a movie; however, manually searching a movie for all of its songs is time-consuming and error-prone. The task becomes much harder when songs are to be extracted from a large archive containing hundreds of movies. This paper presents an approach to automatically extract music and songs from archived musical movies. We use a song grammar to construct a Markov chain model that differentiates song scenes from dialogue and action scenes in a movie. We tested our system on Bollywood, Hollywood, Pakistani, Bengali, and Tamil movies. A total of 20 movies from different industries were selected for the experiments. On Bollywood movies, we achieved 97.22% recall in song extraction, whereas the recall on Hollywood musical movies is 80%. The test result on Pakistani, Tamil, and Bengali movies is 87.09%.
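As a rough illustration of the idea named in the abstract, the sketch below scores a sequence of segment labels under two first-order Markov chains, one for songs and one for non-song scenes, and picks the more likely one. The abstract does not give the actual song grammar, so the labels, both transition tables, and the function names here are assumptions made for illustration only. The `recall` helper shows the standard definition of the metric reported above, with made-up counts.

```python
import math

# Hypothetical segment labels; the paper's actual song grammar is not given here.
STATES = ["intro", "verse", "chorus", "dialogue"]

# Illustrative transition probabilities for song scenes (each row sums to 1).
SONG_MODEL = {
    "intro":    {"intro": 0.10, "verse": 0.70, "chorus": 0.15, "dialogue": 0.05},
    "verse":    {"intro": 0.05, "verse": 0.20, "chorus": 0.70, "dialogue": 0.05},
    "chorus":   {"intro": 0.05, "verse": 0.60, "chorus": 0.30, "dialogue": 0.05},
    "dialogue": {"intro": 0.30, "verse": 0.10, "chorus": 0.10, "dialogue": 0.50},
}

# Illustrative model for non-song scenes: almost every transition leads to dialogue.
NON_SONG_MODEL = {
    s: {"intro": 0.05, "verse": 0.05, "chorus": 0.05, "dialogue": 0.85}
    for s in STATES
}


def log_likelihood(sequence, model):
    """Sum of log transition probabilities along consecutive label pairs."""
    return sum(math.log(model[a][b]) for a, b in zip(sequence, sequence[1:]))


def looks_like_song(sequence):
    """True when the song chain explains the sequence better than the non-song chain."""
    return log_likelihood(sequence, SONG_MODEL) > log_likelihood(sequence, NON_SONG_MODEL)


def recall(true_positives, false_negatives):
    """Fraction of actual songs that were extracted."""
    return true_positives / (true_positives + false_negatives)


if __name__ == "__main__":
    print(looks_like_song(["intro", "verse", "chorus", "verse", "chorus"]))
    print(looks_like_song(["dialogue"] * 5))
    print(recall(4, 1))  # illustrative counts, not taken from the paper
```

Comparing log-likelihoods rather than raw products avoids numerical underflow on long label sequences. A real system would also need a front end that assigns labels to audio or video segments, which this sketch leaves out.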



Author information


Correspondence to Sher Muhammad Doudpota.

Additional information

The preliminary idea of this paper was published in the Proceedings of International Conference on Database and Data Mining (ICDDM), July 2010, Manila, Philippines.


About this article

Cite this article

Doudpota, S.M. Mining movie archives for song sequences. Multimed Tools Appl 69, 359–382 (2014). https://doi.org/10.1007/s11042-012-1021-4


  • DOI: https://doi.org/10.1007/s11042-012-1021-4
