Abstract
In this paper, we present two neural-network methods for the automatic transcription of polyphonic piano music. The input to these methods consists of live piano music acquired by a microphone, while the output is the pitch of every note in the corresponding score. The aim of this work is to compare the accuracy achieved by a feed-forward neural network, the MLP (MultiLayer Perceptron), with that achieved by a recurrent neural network, the ENN (Elman Neural Network). Signal processing techniques based on the CQT (Constant-Q Transform) are used to create a time-frequency representation of the input signals. The processing chain includes non-negative matrix factorization (NMF) for onset detection. Since large-scale tests were required, the whole process (synthesis of audio data from MIDI files and comparison of the results with the original score) has been automated. Training, validation and test sets have been generated from three different musical styles, represented respectively by J. S. Bach's inventions, F. Chopin's nocturnes and C. Debussy's preludes.
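To make the contrast between the two network types concrete, the following is a minimal sketch, not the authors' implementation. It assumes a single tanh hidden layer, sigmoid outputs and no biases; the function names, layer sizes and weight values are hypothetical, since the abstract does not specify the architecture. An Elman network feeds a copy of the previous hidden activations (the context units) back into the hidden layer, so its output for a frame depends on earlier frames, whereas the MLP treats every frame independently.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def elman_forward(frames, w_in, w_ctx, w_out):
    """Run an Elman network over a sequence of input frames.

    w_in:  hidden x input weights
    w_ctx: hidden x hidden weights applied to the previous hidden state
    w_out: output x hidden weights
    Biases are omitted for brevity (an assumption of this sketch).
    """
    hidden = [0.0] * len(w_in)  # context units start at zero
    outputs = []
    for frame in frames:
        pre = [a + b for a, b in zip(matvec(w_in, frame), matvec(w_ctx, hidden))]
        hidden = [math.tanh(p) for p in pre]
        outputs.append([1.0 / (1.0 + math.exp(-o)) for o in matvec(w_out, hidden)])
    return outputs


def mlp_forward(frames, w_in, w_out):
    """Feed-forward baseline: the same network with the context weights zeroed."""
    zero_ctx = [[0.0] * len(w_in) for _ in w_in]
    return elman_forward(frames, w_in, zero_ctx, w_out)


if __name__ == "__main__":
    # Hypothetical toy weights: 2 inputs, 2 hidden units, 1 output.
    w_in = [[0.5, -0.2], [0.1, 0.3]]
    w_ctx = [[0.4, 0.0], [0.0, 0.4]]
    w_out = [[1.0, -1.0]]
    frames = [[1.0, 0.0], [1.0, 0.0]]  # the same frame presented twice
    print(mlp_forward(frames, w_in, w_out))
    print(elman_forward(frames, w_in, w_ctx, w_out))
```

When the same frame is presented twice, the MLP returns the same output both times, while the Elman output changes on the second frame because the context units carry information forward. In a transcription setting, each frame would presumably be one time slice of the CQT representation and each output unit one candidate pitch, but that mapping is an assumption here.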
Copyright information
© 2010 Springer Science+Business Media B.V.
About this paper
Cite this paper
Costantini, G., Todisco, M., Carota, M. (2010). Improving Piano Music Transcription by Elman Dynamic Neural Networks. In: Malcovati, P., Baschirotto, A., d'Amico, A., Natale, C. (eds) Sensors and Microsystems. Lecture Notes in Electrical Engineering, vol 54. Springer, Dordrecht. https://doi.org/10.1007/978-90-481-3606-3_78
DOI: https://doi.org/10.1007/978-90-481-3606-3_78
Publisher Name: Springer, Dordrecht
Print ISBN: 978-90-481-3605-6
Online ISBN: 978-90-481-3606-3
eBook Packages: Engineering, Engineering (R0)