Abstract
This work proposes a deep learning-based note detection model to assist and evaluate students during piano intonation teaching and training. Traditional musical note recognition algorithms rely on either time-domain or frequency-domain analysis alone. These methods are poorly suited to signals with time-varying frequency content: they are vulnerable to noise, have high algorithmic complexity, and demand considerable computation for preprocessing and feature extraction. This paper therefore uses the constant Q transform (CQT), which provides joint time-frequency analysis, for preprocessing and feature extraction. For classification it adopts MobileNet, a lightweight deep CNN designed for mobile applications, whose depthwise separable convolution structure satisfies the demands of both accuracy and inference speed. First, spectrograms are computed from the piano music signals with the CQT, and candidate note onset times are identified. The spectrogram regions centered on these onset times are then fed into the depthwise separable convolutional neural network, which outputs a probability vector over the 88 piano notes. Finally, experiments varying slice length and data overlap settings are performed to improve the performance of the deep learning architecture, with Precision, Recall, and F1-score used to assess the model.
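The building block the abstract refers to can be illustrated concretely. Below is a minimal numpy sketch of a depthwise separable convolution of the kind MobileNet stacks: a per-channel depthwise filter followed by a 1x1 pointwise mix across channels. The function name, shapes, and 'valid' padding are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def depthwise_separable_conv(x, dw_kernels, pw_kernels):
    """MobileNet-style depthwise separable convolution ('valid' padding).

    x:          input feature map, shape (H, W, C_in)
    dw_kernels: one k x k filter per input channel, shape (k, k, C_in)
    pw_kernels: 1x1 pointwise filters mixing channels, shape (C_in, C_out)
    """
    H, W, C_in = x.shape
    k = dw_kernels.shape[0]
    Ho, Wo = H - k + 1, W - k + 1

    # Depthwise step: each input channel is filtered independently,
    # so spatial filtering costs only k*k*C_in weights.
    dw = np.zeros((Ho, Wo, C_in))
    for c in range(C_in):
        for i in range(Ho):
            for j in range(Wo):
                dw[i, j, c] = np.sum(x[i:i + k, j:j + k, c] * dw_kernels[:, :, c])

    # Pointwise step: a 1x1 convolution combines the channels,
    # costing C_in*C_out weights.
    return dw @ pw_kernels  # shape (Ho, Wo, C_out)
```

The efficiency the abstract alludes to comes from factoring the weight count: a standard convolution needs k·k·C_in·C_out parameters, while the separable form needs only k·k·C_in + C_in·C_out, which is why it suits mobile inference budgets.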
Data Availability
The datasets used and analyzed during the current study are available from the corresponding author upon reasonable request.
Funding
This paper is a phased result of the 2022 Hunan Provincial Department of Education Scientific Research Project, Outstanding Youth Project, “Exploration and Research on the Linkage Development of Hunan Universities in ‘Five Dimensions in One’” (Project Number: 22B0079).
Ethics declarations
Conflict of interest
The authors declare that they have no competing interests.
Cite this article
Peng, L. Piano Players’ Intonation and Training Using Deep Learning and MobileNet Architecture. Mobile Netw Appl (2023). https://doi.org/10.1007/s11036-023-02175-x