Abstract
When people are attached to or interested in something, they tend to interact with it frequently, and music accompanies people from the day they are born. As music repositories grow, people face many challenges, such as finding a song quickly, categorizing and organizing their collections, and listening to a track again whenever they want. Because of this, people turn to electronic solutions. To index music, most researchers use content-based information retrieval mechanisms, since content-based classification needs no information beyond the audio features embedded in the signal itself. It is also the most suitable way to search for music when the user does not know the metadata attached to it, such as the author of the song. The most valuable application of this kind of audio recognition is copyright-infringement detection. Throughout this survey we present approaches proposed by various researchers to detect and recognize music using content-based mechanisms, and we conclude by analyzing the current status of this area.
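The content-based identification the abstract describes boils down to extracting features from the audio signal itself and matching them against an indexed collection, rather than relying on metadata. As a minimal sketch of that idea, the toy example below hashes the dominant spectral peaks of each short-time frame and matches a query against a reference by hash overlap. The function names, parameters, and the peak-hashing scheme are illustrative assumptions, not the method of any specific system surveyed.

```python
import numpy as np

def fingerprint(signal, frame=1024, hop=512, n_peaks=3):
    """Toy content-based fingerprint (illustrative only): for each
    windowed frame, hash the indices of the strongest spectral peaks."""
    window = np.hanning(frame)
    hashes = []
    for start in range(0, len(signal) - frame, hop):
        spectrum = np.abs(np.fft.rfft(signal[start:start + frame] * window))
        peaks = np.argsort(spectrum)[-n_peaks:]        # dominant frequency bins
        hashes.append(hash(tuple(sorted(int(p) for p in peaks))))
    return hashes

def match(query_hashes, reference_hashes):
    """Fraction of query frames whose hash also occurs in the reference."""
    ref = set(reference_hashes)
    return sum(h in ref for h in query_hashes) / max(len(query_hashes), 1)
```

A real system would add robustness to noise, tempo, and encoding (e.g. peak-pair hashing with time offsets), but even this sketch shows why no metadata is needed: two recordings with similar spectral content match, and dissimilar ones do not.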
Additional information
Authors’ profiles
Nishan Senevirathna (B.Sc. in Computer Science (SL)) obtained his B.Sc. (Hons) in Computer Science from the University of Colombo School of Computing (UCSC), Sri Lanka, in 2013. He currently works as a Senior Software Engineer at CodeGen International (Pvt) Ltd and is following an M.Phil. degree program at UCSC. His research interests include Multimedia Computing, Image Processing, High Performance Computing and Human-Computer Interaction.
Dr. Lakshman Jayaratne (Ph.D. (UWS), B.Sc. (SL), MACS, MCS (SL), and MIEEE) obtained his B.Sc. (Hons) in Computer Science from the University of Colombo (UCSC), Sri Lanka, in 1992. He obtained his Ph.D. in Information Technology in 2006 from the University of Western Sydney, Sydney, Australia. He works as a Senior Lecturer at the UCSC, University of Colombo, and was President of the IEEE Sri Lanka Chapter in 2012. He has wide experience in IT consultancies for public- and private-sector organizations in Sri Lanka and worked as a Research Advisor to the Ministry of Defense, Sri Lanka. He was awarded in recognition of excellence in research in 2013 at the Postgraduate Convocation of the University of Colombo, Sri Lanka. His research interests include Multimedia Information Management, Multimedia Databases, Intelligent Human-Web Interaction, Web Information Management and Retrieval, and Web Search Optimization, as well as Audio Music Monitoring for Radio Broadcasting and Computational Approaches to Training on Music Notation for the Visually Impaired in Sri Lanka.
This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Senevirathna, E., Jayaratne, L. Audio Music Monitoring: Analyzing Current Techniques for Song Recognition and Identification. GSTF J Comput 4, 15 (2015). https://doi.org/10.7603/s40601-014-0015-7
DOI: https://doi.org/10.7603/s40601-014-0015-7