Automatic Speech Recognition in Taxi Call Service Systems

  • Conference paper
Emerging Technologies in Computing (iCETiC 2019)

Abstract

This research investigates the application of an automatic speech recognition system in taxi call services. Compared with traditional query-handling systems such as live agents, Interactive Voice Response systems, type-based websites, and mobile applications, speech recognition, the newest trend in artificial intelligence, can make conversations more natural. Kaldi and CMUSphinx, both open-source speech recognition tools, were used to develop, train, and test the system. Approximately 4 h of Azerbaijani speech data were processed with both tools. Testing was carried out in two ways: recognizing a dataset from unknown speakers, and recognizing a shuffled dataset. These tests measured variance and speed along with accuracy. Kaldi showed accuracy between 97.3 and 99.6 with variance ranging from 0.03 to 4.8. CMUSphinx attained accuracy between 95.6 and 97.8 with variance values of 0.2 and 3.8 in relatively less training time. The results were compared and used to define appropriate parameters for the investigated models.
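
The two test settings named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the abstract does not say how the splits were built or how variance was computed, so the utterance records, the function names, the 20% test fraction, and the use of sample variance over per-run accuracies are all assumptions made for illustration.

```python
import random
import statistics


def shuffled_split(utterances, test_fraction=0.2, seed=0):
    """Shuffle all utterances and cut off a test portion.

    Speakers may appear in both train and test (assumed meaning of
    the "shuffled dataset" setting).
    """
    rng = random.Random(seed)
    items = list(utterances)
    rng.shuffle(items)
    n_test = max(1, int(len(items) * test_fraction))
    return items[n_test:], items[:n_test]


def unknown_speaker_split(utterances, test_speakers):
    """Hold out every utterance of the given speakers.

    Test speakers are never seen in training (assumed meaning of
    the "unknown speakers" setting).
    """
    held_out = set(test_speakers)
    train = [u for u in utterances if u["speaker"] not in held_out]
    test = [u for u in utterances if u["speaker"] in held_out]
    return train, test


def summarize(accuracies):
    """Return (mean, sample variance) of per-run accuracy values."""
    return statistics.mean(accuracies), statistics.variance(accuracies)


# Hypothetical toy data: 4 speakers with 5 utterances each.
utterances = [
    {"speaker": f"spk{s}", "utt_id": f"spk{s}_utt{i}"}
    for s in range(4)
    for i in range(5)
]

train_s, test_s = shuffled_split(utterances)
train_u, test_u = unknown_speaker_split(utterances, ["spk3"])
mean_acc, var_acc = summarize([97.0, 99.0])  # made-up accuracies, not paper results
```

In the speaker-held-out split no speaker is shared between training and test data, which is what makes it the harder and more realistic of the two settings. The shuffled split usually mixes utterances from the same speaker across both sets.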

Acknowledgment

This work was carried out at the Center for Data Analytics Research at ADA University and at the Research and Development Laboratory at ATL Tech.

Author information

Corresponding author

Correspondence to Samir Rustamov.

Copyright information

© 2019 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

About this paper

Cite this paper

Rustamov, S., Akhundova, N., Valizada, A. (2019). Automatic Speech Recognition in Taxi Call Service Systems. In: Miraz, M., Excell, P., Ware, A., Soomro, S., Ali, M. (eds) Emerging Technologies in Computing. iCETiC 2019. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 285. Springer, Cham. https://doi.org/10.1007/978-3-030-23943-5_18

  • DOI: https://doi.org/10.1007/978-3-030-23943-5_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-23942-8

  • Online ISBN: 978-3-030-23943-5

  • eBook Packages: Computer Science (R0)
