Analysis of Vowels in Sung Queries for a Music Information Retrieval System

Journal of Intelligent Information Systems

Abstract

A method for analyzing and categorizing the vowels of a sung query is described and evaluated. The method uses a combination of spectral analysis and parametric clustering techniques to divide a single query into different vowel regions. It is applied separately to each query, so no training or repeated measures are necessary. The vowel regions are then transformed into strings, and string search methods are used to compare the results from various songs. We apply this method in a small pilot study consisting of 40 sung queries from each of 7 songs. Approximately 60% of the queries are correctly matched to their corresponding song using only the vowel stream as the identifier.
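
The final matching step, comparing a query's vowel string against the vowel strings of candidate songs, can be illustrated with a minimal sketch. The abstract does not say which string search method is used, so plain Levenshtein edit distance stands in here as an assumption. The function names, song titles, and vowel strings are all hypothetical, and the spectral analysis and clustering stages that would produce the strings are not shown.

```python
from typing import Dict


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two vowel strings (unit costs).

    Assumption: unit-cost insert/delete/substitute; the article's own
    string comparison method may differ.
    """
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def best_match(query: str, songs: Dict[str, str]) -> str:
    """Return the title whose vowel string is closest to the query."""
    return min(songs, key=lambda title: edit_distance(query, songs[title]))


# Hypothetical vowel strings, one symbol per vowel region.
songs = {"song_a": "aeiaou", "song_b": "iiaoea", "song_c": "ouaeii"}
print(best_match("aeaou", songs))
```

A distance-based ranking like this tolerates a dropped or mislabeled vowel region in the query, which is one plausible reason to compare strings approximately rather than exactly.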

Author information

Corresponding author

Correspondence to Mark A. Bartsch.

About this article

Cite this article

Mellody, M., Bartsch, M.A. & Wakefield, G.H. Analysis of Vowels in Sung Queries for a Music Information Retrieval System. Journal of Intelligent Information Systems 21, 35–52 (2003). https://doi.org/10.1023/A:1023501817044

  • DOI: https://doi.org/10.1023/A:1023501817044
