Analysis of Vowels in Sung Queries for a Music Information Retrieval System

Mellody, Maureen; Bartsch, Mark A.; Wakefield, Gregory H.

doi:10.1023/A:1023501817044

Published: July 2003

Volume 21, pages 35–52, (2003)

Journal of Intelligent Information Systems

Abstract

A method for analyzing and categorizing the vowels of a sung query is described and evaluated. The method combines spectral analysis with parametric clustering to divide a single query into distinct vowel regions. It is applied separately to each query, so no training or repeated measures are necessary. The vowel regions are then transformed into strings, and string search methods are used to compare the results from various songs. We apply this method to a small pilot study consisting of 40 sung queries from each of 7 songs. Approximately 60% of the queries are correctly identified with their corresponding song, using only the vowel stream as the identifier.
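The last two steps of the abstract (vowel regions turned into strings, then compared by string search) can be illustrated with a minimal sketch. The abstract does not name the string method, so everything here is an assumption made for illustration: the helper names `labels_to_string`, `edit_distance`, and `best_match` are hypothetical, unit-cost edit distance stands in for whatever comparison the authors used, and the cluster labels are assumed to have already been mapped onto an alphabet shared by all queries.

```python
from itertools import groupby


def labels_to_string(frame_labels):
    """Collapse runs of identical per-frame cluster labels into one symbol each."""
    return "".join(label for label, _ in groupby(frame_labels))


def edit_distance(a, b):
    """Levenshtein distance with unit costs (an assumption, not the paper's metric)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def best_match(query, templates):
    """Return the name of the template string closest to the query string."""
    return min(templates, key=lambda name: edit_distance(query, templates[name]))


# Hypothetical data: letters stand for cluster indices, not phonetic vowels.
frames = list("AAABBBBAACCCC")
query = labels_to_string(frames)
templates = {"song_1": "ABAC", "song_2": "CBCA", "song_3": "ABC"}
print(query, best_match(query, templates))
```

Collapsing runs discards how long each vowel was held, which makes the comparison insensitive to tempo but also throws away duration cues; whether the original system keeps durations is not stated in the abstract.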

Author information

Authors and Affiliations

Applied Physics Program, University of Michigan, Ann Arbor, Michigan, USA
Maureen Mellody
Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, Michigan, USA
Mark A. Bartsch & Gregory H. Wakefield

Corresponding author

Correspondence to Mark A. Bartsch.

About this article

Cite this article

Mellody, M., Bartsch, M.A. & Wakefield, G.H. Analysis of Vowels in Sung Queries for a Music Information Retrieval System. Journal of Intelligent Information Systems 21, 35–52 (2003). https://doi.org/10.1023/A:1023501817044

Issue Date: July 2003
DOI: https://doi.org/10.1023/A:1023501817044
