Abstract
Trained musicians intuitively produce expressive variations that add to their audience’s enjoyment. However, there is little quantitative information about the kinds of strategies used in different musical contexts. Since the literal synthesis of notes from a score is bland and unappealing, there is an opportunity for learning systems that can automatically produce compelling expressive variations. The ESP (Expressive Synthetic Performance) system generates expressive renditions using hierarchical hidden Markov models trained on the stylistic variations employed by human performers. Furthermore, the generative models learned by the ESP system provide insight into a number of musicological issues related to expressive performance.
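The abstract describes sampling expressive variations from a two-level generative model: phrase-level hidden states govern note-level hidden states, which emit per-note deviations. The toy sketch below illustrates that hierarchical-HMM idea only; all state names, transition probabilities, and deviation values are invented for illustration and are not the ESP system's learned parameters.

```python
import random

# Toy two-level generative model in the spirit of a hierarchical HMM.
# All states, probabilities, and deviations are made-up illustrations,
# NOT the ESP system's trained model.

# Top level: phrase-shaping states.
PHRASE_TRANS = {"build": {"build": 0.6, "relax": 0.4},
                "relax": {"build": 0.3, "relax": 0.7}}

# Bottom level: note states, conditioned on the active phrase state.
NOTE_TRANS = {
    "build": {"steady": {"steady": 0.5, "push": 0.5},
              "push":   {"steady": 0.3, "push": 0.7}},
    "relax": {"steady": {"steady": 0.7, "pull": 0.3},
              "pull":   {"steady": 0.4, "pull": 0.6}},
}

# Each note state emits a tempo deviation (fraction of nominal duration).
EMIT = {"steady": 0.0, "push": -0.05, "pull": 0.08}

def sample(dist):
    """Draw a key from a dict mapping keys to probabilities."""
    r, acc = random.random(), 0.0
    for k, p in dist.items():
        acc += p
        if r < acc:
            return k
    return k  # guard against floating-point shortfall

def render(n_phrases=2, notes_per_phrase=4, seed=0):
    """Sample a sequence of per-note tempo deviations."""
    random.seed(seed)
    phrase, devs = "build", []
    for _ in range(n_phrases):
        phrase = sample(PHRASE_TRANS[phrase])
        note = "steady"  # bottom level re-enters at each new phrase
        for _ in range(notes_per_phrase):
            note = sample(NOTE_TRANS[phrase][note])
            devs.append(EMIT[note])
    return devs

print(render())
```

In a trained system the transition and emission parameters would be estimated from human performances (e.g., via EM) rather than written by hand; the hierarchy lets phrase-scale shaping and note-scale timing be modeled jointly.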
Editor: Gerhard Widmer
Grindlay, G., Helmbold, D. Modeling, analyzing, and synthesizing expressive piano performance with graphical models. Mach Learn 65, 361–387 (2006). https://doi.org/10.1007/s10994-006-8751-3