Automatic Speech Recognition in Taxi Call Service Systems

Rustamov, Samir; Akhundova, Natavan; Valizada, Alakbar

doi:10.1007/978-3-030-23943-5_18

Part of the book series: Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering (LNICST, volume 285)

Included in the following conference series:

International Conference for Emerging Technologies in Computing

Abstract

This research investigates the application of an automatic speech recognition system in taxi call services. Compared with traditional query handling systems such as live agents, Interactive Voice Response systems, type-based websites and mobile applications, speech recognition, the newest trend in artificial intelligence, can make conversations more natural. Kaldi and CMUSphinx, two open-source speech recognition tools, were used to develop, train and test the system. Approximately 4 h of speech data in Azerbaijani were processed for both tools. Testing was carried out in two ways: recognizing a dataset from unknown speakers, and recognizing a shuffled dataset. During these tests, variance and speed were investigated along with accuracy. Kaldi showed accuracy between 97.3 and 99.6 with variance ranging between 0.03 and 4.8. CMUSphinx, on the other hand, attained accuracy between 95.6 and 97.8 with variance values of 0.2 and 3.8 in relatively less training time. The results were compared and used to define appropriate parameters for the investigated models.
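
The two testing modes named in the abstract, and the accuracy and variance summary, can be sketched roughly as follows. This is a minimal illustration only, not the authors' pipeline: the `(speaker_id, utterance_id)` record layout, the function names, the test fraction, and the choice of population variance are all assumptions, since the abstract does not specify them.

```python
import random
import statistics
from typing import List, Set, Tuple

# Hypothetical record layout: (speaker_id, utterance_id).
Utterance = Tuple[str, str]


def split_by_speaker(
    utts: List[Utterance], test_speakers: Set[str]
) -> Tuple[List[Utterance], List[Utterance]]:
    """Unknown-speaker test: hold out every utterance of the chosen speakers."""
    train = [u for u in utts if u[0] not in test_speakers]
    test = [u for u in utts if u[0] in test_speakers]
    return train, test


def split_shuffled(
    utts: List[Utterance], test_fraction: float, seed: int
) -> Tuple[List[Utterance], List[Utterance]]:
    """Shuffled test: mix all utterances, then cut off a test portion.

    Speakers may appear in both train and test under this split.
    """
    pool = list(utts)
    random.Random(seed).shuffle(pool)
    n_test = int(len(pool) * test_fraction)
    return pool[n_test:], pool[:n_test]


def summarize(accuracies: List[float]) -> Tuple[float, float]:
    """Mean and population variance of per-run accuracy values.

    Population variance is an assumption; the abstract does not say
    which variance estimator was used.
    """
    return statistics.mean(accuracies), statistics.pvariance(accuracies)
```

The speaker-based split measures how the recognizer generalizes to voices it has never heard, while the shuffled split typically gives a more optimistic figure because the same speakers occur on both sides of the cut.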

Acknowledgment

This work was carried out at the Center for Data Analytics Research at ADA University and at the Research and Development Laboratory at ATL Tech.

Author information

Authors and Affiliations

ADA University, Baku, AZ, 1008, Azerbaijan
Samir Rustamov
Institute of Control Systems, Baku, Azerbaijan
Samir Rustamov
ATL Tech, Baku, AZ, 1022, Azerbaijan
Natavan Akhundova & Alakbar Valizada

Corresponding author

Correspondence to Samir Rustamov.

Editor information

Editors and Affiliations

CFRED, Chinese University of Hong Kong, Hong Kong, China
Mahdi H. Miraz
Glyndwr University, Wrexham, UK
Peter S. Excell
Faculty of Computing, Engineering and Science, University of South Wales, Pontypridd, Mid Glamorgan, UK
Andrew Ware
AMA International University, Salmabad, Bahrain
Safeeullah Soomro
University of Essex, Colchester, UK
Maaruf Ali

About this paper

Cite this paper

Rustamov, S., Akhundova, N., Valizada, A. (2019). Automatic Speech Recognition in Taxi Call Service Systems. In: Miraz, M., Excell, P., Ware, A., Soomro, S., Ali, M. (eds) Emerging Technologies in Computing. iCETiC 2019. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 285. Springer, Cham. https://doi.org/10.1007/978-3-030-23943-5_18

DOI: https://doi.org/10.1007/978-3-030-23943-5_18
Published: 14 July 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-23942-8
Online ISBN: 978-3-030-23943-5
eBook Packages: Computer Science, Computer Science (R0)
