Abstract
In this paper we present the results of our experiments on building a binary classifier to identify stammering in Hindi speech data. We train several Sequential CNN models, varying parameters such as colour mode, image size, and training-data shape to tune classification performance. Our experimental pipeline converts speech samples into spectrograms using Librosa and trains the Sequential CNN classifier on the resulting image data using TensorFlow Lite. Our classification models achieve more than 95% accuracy on this task.
Data availability
This article forms part of an ongoing doctoral project. To avoid data breaches and to uphold intellectual property safeguards, the researchers plan to make the research dataset available upon reasonable request (https://shivamdwivedi.com/resources) once the thesis is complete.
Acknowledgements
First and foremost, we wish to express our profound gratitude to our research subjects. Their dedication and active involvement not only made the data collection drive a success but also enriched this research with invaluable speech data. Their unwavering support has been the bedrock upon which this work stands. Equally essential to the completion of this work was the guidance of Dr. Anil Thakur. His insights and direction have been instrumental in shaping our research. Additionally, we are deeply indebted to Dr. Sukomal Pal, whose discerning critiques and constructive feedback have been invaluable in refining our approach and processes. To all of you who have been part of this journey with us, we extend our heartfelt thanks.
Ethics declarations
Disclosures
Research participants were recruited for this study on a voluntary basis, and none of them received monetary compensation for their involvement. The project garnered no external funding; all research-related expenses were borne by the authors themselves. The authors declare that they have no conflict of interest pertaining to this research. It is further confirmed that external entities had no involvement in the study's design, data collection, analysis, interpretation of results, or the decision to publish. The findings presented here are derived solely from the data collected and analyzed by the authors, independent of any external influence.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Dwivedi, S., Ghosh, S. & Dwivedi, S. Binary classifier for identification of stammering instances in Hindi speech data. Int J Speech Technol 26, 765–774 (2023). https://doi.org/10.1007/s10772-023-10046-9