
Piano Players’ Intonation and Training Using Deep Learning and MobileNet Architecture

Mobile Networks and Applications

Abstract

This work proposes a deep learning-based note detection model to assist and evaluate students during piano teaching and intonation training. Traditional musical note recognition algorithms rely on either time-domain or frequency-domain analysis alone. These methods are inadequate for signals whose frequency content varies over time: they are vulnerable to noise, have high algorithmic complexity, and require considerable computation for preprocessing and feature extraction. This paper therefore uses the constant-Q transform (CQT) for preprocessing and feature extraction, as it provides joint time-frequency analysis. For classification, the paper adopts MobileNet, a lightweight deep convolutional neural network designed for mobile applications, whose depthwise separable convolution structure meets the demands of both accuracy and inference speed. First, spectrograms are computed from the piano music signals using the CQT, and candidate note onset times are identified. The spectrogram regions centered at these onset times are then fed into the depthwise separable convolutional neural network, which outputs a vector of probabilities over the 88 piano notes. Finally, experiments with varying slice lengths and data-overlap settings are reported to tune the performance of the deep learning architecture. Precision, recall, and F1-score are used to assess the model.
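To make the pipeline concrete, below is a minimal sketch of the three stages the abstract describes (CQT spectrogram, onset-centered slicing, and a MobileNet-style classifier), assuming the librosa and TensorFlow/Keras libraries. The file name, hop length, number of CQT bins, slice width, and layer sizes are illustrative assumptions, not values reported in the paper.

```python
# Hedged sketch of the abstract's pipeline; all hyperparameters are assumptions.
import librosa
import numpy as np
import tensorflow as tf

def cqt_onset_slices(path, hop_length=512, n_bins=352, bins_per_octave=48,
                     slice_width=9):
    """Compute a constant-Q spectrogram and cut out fixed-width slices
    centered on candidate note onsets."""
    y, sr = librosa.load(path, sr=None)
    # Constant-Q transform: log-spaced frequency bins aligned with musical
    # pitch, giving the joint time-frequency analysis the paper relies on.
    C = np.abs(librosa.cqt(y, sr=sr, hop_length=hop_length,
                           n_bins=n_bins, bins_per_octave=bins_per_octave))
    # Candidate onset frames from librosa's spectral-flux onset detector.
    onset_frames = librosa.onset.onset_detect(y=y, sr=sr,
                                              hop_length=hop_length)
    half = slice_width // 2
    slices = [C[:, f - half:f + half + 1]
              for f in onset_frames if half <= f < C.shape[1] - half]
    # Shape: (n_onsets, n_bins, slice_width, 1), ready for a 2-D CNN.
    return np.stack(slices)[..., np.newaxis]

def build_note_classifier(input_shape, n_notes=88):
    """A small MobileNet-style network built from depthwise separable
    convolutions, ending in per-note scores for the 88 piano keys."""
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=input_shape),
        tf.keras.layers.Conv2D(32, 3, strides=2, padding="same",
                               activation="relu"),
        tf.keras.layers.SeparableConv2D(64, 3, padding="same",
                                        activation="relu"),
        tf.keras.layers.SeparableConv2D(128, 3, strides=2, padding="same",
                                        activation="relu"),
        tf.keras.layers.GlobalAveragePooling2D(),
        # Sigmoid rather than softmax so several notes can sound at once
        # (chords); the paper's exact output formulation may differ.
        tf.keras.layers.Dense(n_notes, activation="sigmoid"),
    ])

# Example usage with a hypothetical recording:
# X = cqt_onset_slices("recording.wav")
# model = build_note_classifier(X.shape[1:])
# note_probabilities = model.predict(X)  # (n_onsets, 88)
```

In this sketch, varying the slice_width parameter corresponds to the slice-length experiments the abstract mentions, and overlap between consecutive slices is governed by how densely candidate onsets fall in time.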


Data Availability

The datasets used and analyzed during the current study are available from the corresponding author upon reasonable request.


Funding

This paper is a phased result of the 2022 Hunan Provincial Department of Education Scientific Research Project, Outstanding Youth Project, “Exploration and Research on the Linkage Development of Hunan Universities in ‘Five Dimensions in One’” (Project No. 22B0079).

Author information

Corresponding author

Correspondence to Linlin Peng.

Ethics declarations

Conflict of interest

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Peng, L. Piano Players’ Intonation and Training Using Deep Learning and MobileNet Architecture. Mobile Netw Appl (2023). https://doi.org/10.1007/s11036-023-02175-x

