Abstract—
Starting from the definition of the pitch (fundamental tone) of a speaker's voice as the lowest frequency in the line power spectrum of the voiced segments of the speech signal, the potentially achievable accuracy of its measurement in the presence of background interference in the form of white Gaussian noise is estimated. Based on this estimate, a suboptimal algorithm for measuring the pitch frequency from a short speech frame is developed. The effectiveness of the algorithm is confirmed by the results of an experiment carried out with the authors' software.
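To make the measurement task concrete, the following is a minimal sketch of a generic DFT-based pitch estimate on one short noisy frame. It uses plain harmonic summation over the power spectrum, which is a swapped-in textbook technique and not the suboptimal algorithm of the article. The function names, the 60–400 Hz search range, the 8 kHz sampling rate, the 64 ms frame, and the synthetic test signal are all assumptions made for illustration.

```python
import numpy as np


def synth_voiced_frame(f0, fs, n, snr_db, rng):
    """Hypothetical test signal: 10 harmonics with 1/k amplitudes plus white Gaussian noise."""
    t = np.arange(n) / fs
    k = np.arange(1, 11)[:, None]                      # harmonic numbers as a column
    clean = np.sum(np.cos(2 * np.pi * k * f0 * t) / k, axis=0)
    noise_power = np.mean(clean ** 2) / 10 ** (snr_db / 10)
    return clean + rng.normal(0.0, np.sqrt(noise_power), n)


def estimate_pitch_dft(frame, fs, f_min=60.0, f_max=400.0, n_harm=5, n_fft=8192):
    """Harmonic-summation pitch estimate from the DFT power spectrum of one frame."""
    windowed = frame * np.hanning(len(frame))          # Hann window limits leakage
    power = np.abs(np.fft.rfft(windowed, n_fft)) ** 2  # zero-padded power spectrum
    df = fs / n_fft                                    # frequency step between DFT bins
    candidates = np.arange(f_min, f_max, 0.5)          # candidate pitch values, Hz
    harmonics = np.arange(1, n_harm + 1)
    # DFT bin nearest to k * f for every candidate f and harmonic number k
    bins = np.rint(np.outer(candidates, harmonics) / df).astype(int)
    bins = np.clip(bins, 0, len(power) - 1)
    scores = power[bins].sum(axis=1)                   # summed power at the harmonics
    return float(candidates[np.argmax(scores)])


rng = np.random.default_rng(0)
frame = synth_voiced_frame(f0=120.0, fs=8000, n=512, snr_db=10.0, rng=rng)
f0_hat = estimate_pitch_dft(frame, fs=8000)
print(f"estimated pitch: {f0_hat:.1f} Hz")
```

Summing only a few harmonics and restricting the search range is a simple guard against octave errors here; a practical estimator would also need a voiced/unvoiced decision, which this sketch omits.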
Notes
At the output of the whitening filter, the packet of signal components has a rectangular shape in the frequency domain [9].
The concept of the envelope of the fine structure of the speech signal power spectrum, or its spectral envelope, is widely used in the field of ASP and is described in detail in, e.g., [13].
The program is freely available on the authors' website at https://sites.google.com/site/frompldcreators/produkty-1/phonemetraining.
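The whitening filter and the spectral envelope mentioned in the notes above can be illustrated with a short sketch. It assumes that the envelope is modeled by linear prediction (autocorrelation method) and that whitening means applying the corresponding inverse filter; this is a common textbook construction and is not claimed to be the authors' implementation. All function names, the synthetic two-formant signal, and the prediction order are hypothetical choices.

```python
import numpy as np


def lpc_autocorr(x, order):
    """Predictor coefficients a_1..a_p from the autocorrelation normal equations."""
    r = np.array([np.dot(x[: len(x) - k], x[k:]) for k in range(order + 1)])
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1:])


def whiten(x, a):
    """Apply the inverse (whitening) filter A(z) = 1 - sum_k a_k z^-k."""
    return np.convolve(x, np.concatenate(([1.0], -a)))


def spectral_flatness(x, n_fft=4096):
    """Geometric-to-arithmetic mean ratio of the power spectrum (1.0 means flat)."""
    p = np.abs(np.fft.fft(x, n_fft)) ** 2
    return float(np.exp(np.mean(np.log(p))) / np.mean(p))


# Hypothetical voiced frame: a 125 Hz pulse train shaped by two resonances.
fs, n, period = 8000, 1024, 64
rng = np.random.default_rng(1)
excitation = np.zeros(n)
excitation[::period] = 1.0
excitation += 0.01 * rng.standard_normal(n)        # weak white Gaussian noise
poles = 0.95 * np.exp(2j * np.pi * np.array([700.0, 1800.0]) / fs)
ar = np.real(np.poly(np.concatenate((poles, poles.conj()))))  # [1, c1, ..., c4]

x = np.zeros(n)
for i in range(n):                                  # all-pole (envelope) filter 1/A(z)
    past = sum(ar[k] * x[i - k] for k in range(1, 5) if i - k >= 0)
    x[i] = excitation[i] - past

a = lpc_autocorr(x, order=4)
residual = whiten(x, a)
print("flatness before whitening:", spectral_flatness(x))
print("flatness after whitening: ", spectral_flatness(residual))
```

After inverse filtering, the harmonics of the pitch frequency have roughly equal levels, which is the sense in which the packet of components becomes rectangular in the frequency domain; the flatness ratio is only one rough way to quantify that.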
REFERENCES
L. R. Rabiner and R. W. Schafer, Theory and Applications of Digital Speech Processing (Pearson, Boston, 2011).
D. Hirst and C. Looze, Handbooks in Language and Linguistics (Cambridge Univ. Press, Cambridge, 2021), p. 336.
B. N. Schenkman and V. K. Gidla, Appl. Acoust. 163, Article No. 107214 (2020). https://doi.org/10.1016/j.apacoust.2020.107214
A. R. Allam, A. S. Ashour, M. A. Elnaby, and F. E. El-Samie, in Proc. 7th Int. Japan-Africa Conf. Electronics, Communications and Computations (JAC-ECC), Alexandria, Egypt, December 15–16, 2019 (IEEE, New York, 2019). https://doi.org/10.1109/JAC-ECC48896.2019.9051338
G. V. Souza, J. M. Duarte, F. Viegas, et al., J. Voice 34 (4), 641 (2020). https://doi.org/10.1016/j.jvoice.2018.12.007
J. Stahl and P. Mowlaee, Speech Commun. 111, 1 (2019). https://doi.org/10.1016/j.specom.2019.05.001
G. Sharma, K. Umapathy, and S. Krishnan, Appl. Acoust. 158, Article No. 107020 (2020). https://doi.org/10.1016/j.apacoust.2019.107020
W. Zhang, R. Wang, Q. Zhang, and S. Fang, Appl. Acoust. 166, Article No. 107338 (2020). https://doi.org/10.1016/j.apacoust.2020.107338
A. V. Savchenko and V. V. Savchenko, Meas. Tech. 65, 453 (2022). https://doi.org/10.1007/s11018-022-02104-6
I. C. Yadav, S. Shahnawazuddin, and G. Pradhan, Digital Signal Process. 86, 55 (2019). https://doi.org/10.1016/j.dsp.2018.12.013
S. Kumar, Int. J. Speech Technol. 22, 885 (2019). https://doi.org/10.1007/s10772-019-09634-5
V. V. Savchenko, Radioelectron. & Commun. Syst. 63, 532 (2020). https://doi.org/10.3103/S0735272720100039
M. Tohyama, Acoustic Signals and Hearing (Acad. Press, Kanagawa, Japan, 2020), p. 89. https://doi.org/10.1016/B978-0-12-816391-7.00013-9
J. D. Gibson, Information 7 (2), 32 (2016). https://doi.org/10.3390/info7020032
Yu. Gu and H. L. Wei, Inform. Sci. 451–452, 195 (2018). https://doi.org/10.1016/j.ins.2018.04.007
S. Cui, E. Li, and X. Kang, in IEEE Int. Conf. Multimedia and Expo (ICME), London, United Kingdom, 2020 (ICME, 2020), p. 1. https://doi.org/10.1109/ICME46284.2020.9102765
S. R. Smith, J. Acoust. Soc. Am. 150, Article No. A113 (2021). https://doi.org/10.1121/10.0007806
V. V. Savchenko and A. V. Savchenko, J. Commun. Technol. Electron. 65, 1311 (2020). https://doi.org/10.1134/S1064226920110157
V. V. Savchenko and A. V. Savchenko, Radioelectron. & Commun. Syst. 62 (5), 276 (2019). https://doi.org/10.3103/S0735272719050042
H. B. Kashani and A. Sayadiyan, Comput. Speech & Language 50, 105 (2018). https://doi.org/10.1016/j.csl.2017.12.008
V. V. Savchenko and L. V. Savchenko, J. Commun. Technol. Electron. 66, 1266 (2021). https://doi.org/10.1134/S1064226921110085
R. D. Kent and H. K. Vorperian, J. Commun. Disorders 74, 74 (2018). https://doi.org/10.1016/j.jcomdis.2018.05.004
J. D. Gibson, Information 10 (5), 179 (2019). https://doi.org/10.3390/info10050179
J. D. Markel and A. H. Gray, “Linear Prediction of Speech,” Commun. and Cybernetics 12 (1976). https://doi.org/10.1007/978-3-642-66286-7_8
J. Sueur, Sound Analysis and Synthesis with R (Springer, Cham, 2018). https://doi.org/10.1007/978-3-319-77647-7_12
M. Esfandiari, S. A. Vorobyov, and M. Karimi, Signal Process. 171, Article No. 107480 (2020). https://doi.org/10.1016/j.sigpro.2020.107480
A. E. Jaramillo, J. K. Nielsen, and M. G. Christensen, in Proc. 27th Eur. Signal Processing Conf. (EUSIPCO), 2019 (EUSIPCO, 2019), p. 1. https://doi.org/10.23919/EUSIPCO.2019.8902763
A. Palaparthi and I. R. Titze, Speech Commun. 123, 98 (2020). https://doi.org/10.1016/j.specom.2020.07.003
Radio-Electronic Systems: Design Basis and Theory: Handbook, Ed. by Ya. D. Shirman, (Radiotekhnika, Moscow, 2007) [in Russian].
R. Sinha and S. Shahnawazuddin, Comput. Speech & Language 48, 103 (2018). https://doi.org/10.1016/j.csl.2017.10.007
J. Zeremdini, M. Messaoud, and A. Bouzid, Appl. Acoust. 120, 45 (2017). https://doi.org/10.1016/j.apacoust.2017.01.013
D. Jouvet and Y. Laprie, in 25th Eur. Signal Processing Conf. (EUSIPCO), 2017 (EUSIPCO, 2017), p. 1614. https://doi.org/10.23919/EUSIPCO.2017.8081482
A. V. Oppenheim and R. W. Schafer, IEEE Signal Process. Mag. 21 (5), 95 (2004). https://doi.org/10.1109/MSP.2004.1328092
S. L. Marple, Digital Spectral Analysis with Applications, 2nd ed. (Dover, Mineola, New York, 2019).
C. Parlak and Yu. Altun, Math. Probl. Eng. 2021, Article ID 6658951 (2021). https://doi.org/10.1155/2021/6658951
A. V. Savchenko, V. V. Savchenko, and L. V. Savchenko, Optimiz. Lett., No. 7, 1 (2021). https://doi.org/10.1007/s11590-021-01790-5
D. G. Levkov, A. G. Panin, and I. I. Tkachev, Astrophys. J. 925 (2), 109 (2022). https://doi.org/10.3847/1538-4357/ac3250
A. V. Savchenko and L. V. Savchenko, Meas. Tech. 64, 319 (2021). https://doi.org/10.1007/s11018-021-01935-z
M. B. Akcay and K. Oğuz, Speech Commun. 116, 56 (2020). https://doi.org/10.1016/j.specom.2019.12.001
Funding
The work was supported by the Russian Science Foundation, project no. 20-71-10010.
Ethics declarations
The authors declare that they have no conflicts of interest.
Additional information
Translated by N. Petrov
About this article
Cite this article
Savchenko, V.V., Savchenko, L.V. Suboptimal Algorithm for Measuring Pitch Frequency Using Discrete Fourier Transform of a Speech Signal. J. Commun. Technol. Electron. 68, 757–764 (2023). https://doi.org/10.1134/S1064226923060128
DOI: https://doi.org/10.1134/S1064226923060128