Behavioral features of the speech signal as part of improving the effectiveness of the automatic speaker recognition system

Andrzej DOBROWOLSKI; Dominik MAŁY

doi:10.37105/iboa.187

No 4 (2023), Articles

Published December 5, 2023

Andrzej DOBROWOLSKI

Military University of Technology, Warsaw, Poland

https://orcid.org/0000-0002-0593-158X

Dominik MAŁY

Military University of Technology, Warsaw, Poland

https://orcid.org/0000-0002-1682-8553

Keywords

automatic speaker recognition
automatic speaker recognition systems
physical features
behavioral features
speech signal

Abstract

Intelligent telecommunications solutions are now ubiquitous, and automatic speaker recognition (ASR) systems are an integral part of many of them. They are widely used in sectors such as banking, telecommunications and forensics. Because the distinctive characteristics of the human voice can be analyzed automatically and extracted efficiently, these systems can identify, verify and authorize the speaker under investigation. Currently, the vast majority of speaker recognition systems rely on distinctive features that result from the structure of the speaker's vocal tract (laryngeal sound analysis), known as the physical features of the voice. Although such systems are highly effective, with effectiveness exceeding 95%, their further development is very difficult because the potential of distinctive physical features has been exhausted. The effectiveness of ASR systems based on physical features can be increased further by additionally taking into account the behavioral features of the speech signal, which is the subject of this article.

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.