Lexicon Size and Language Model Order Optimization for Russian LVCSR

Kipyatkova, Irina; Karpov, Alexey

doi:10.1007/978-3-319-01931-4_29

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8113)

Included in the following conference series:

International Conference on Speech and Computer

Abstract

This paper compares 2-, 3-, and 4-gram language models built with various lexicon sizes. The text data forming the training corpus was collected from recent Internet news sites; the corpus totals about 350 million words (2.4 GB of data). The language models were built using recognition lexicons of 110K, 150K, 219K, and 303K words. To evaluate these models, perplexity, out-of-vocabulary (OOV) word rate, and n-gram hit rate were computed. Experimental results on continuous Russian speech recognition are also given.
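
The three evaluation characteristics named in the abstract can be illustrated with a minimal Python sketch. It uses common textbook definitions, which are an assumption here since the abstract does not spell out the exact formulas; the function names and the toy data are hypothetical and do not come from the paper.

```python
from typing import Iterable, Sequence, Set, Tuple


def oov_rate(tokens: Sequence[str], lexicon: Set[str]) -> float:
    """Fraction of running words in the text that are missing from the lexicon."""
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if t not in lexicon) / len(tokens)


def ngram_hit_rate(tokens: Sequence[str],
                   known_ngrams: Set[Tuple[str, ...]],
                   n: int) -> float:
    """Fraction of n-grams in the text that the model stores explicitly.

    Assumed definition: an n-gram counts as a hit when it is present in the
    model's n-gram set, so no back-off to a lower order would be needed.
    """
    total = len(tokens) - n + 1
    if total <= 0:
        return 0.0
    hits = sum(1 for i in range(total)
               if tuple(tokens[i:i + n]) in known_ngrams)
    return hits / total


def perplexity(log10_probs: Iterable[float]) -> float:
    """Perplexity from per-word base-10 log probabilities:
    PPL = 10 ** (-(1/N) * sum(log10 p_i)).
    """
    probs = list(log10_probs)
    if not probs:
        raise ValueError("at least one log probability is required")
    return 10 ** (-sum(probs) / len(probs))


# Toy data, purely illustrative.
text = ["a", "b", "c", "a", "b", "d"]
lexicon = {"a", "b", "c"}
bigrams = {("a", "b"), ("b", "c")}

toy_oov = oov_rate(text, lexicon)               # 1 unknown word out of 6
toy_hits = ngram_hit_rate(text, bigrams, n=2)   # 3 of the 5 bigrams are stored
toy_ppl = perplexity([-1.0, -1.0, -1.0, -1.0])  # every word has p = 0.1
```

Under these definitions, enlarging the lexicon can only lower the OOV rate on a fixed test text, while perplexity and hit rate depend on how well the n-gram statistics cover that text.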

Author information

Authors and Affiliations

SPIIRAS, 39, 14th line, St. Petersburg, Russia
Irina Kipyatkova & Alexey Karpov
Saint-Petersburg State University, 7-9, Universitetskaya nab., St. Petersburg, Russia
Irina Kipyatkova & Alexey Karpov

Editor information

Editors and Affiliations

Faculty of Applied Sciences, Department of Cybernetics, University of West Bohemia, Univerzitní 8, 306 14, Plzeň, Czech Republic
Miloš Železný
University of West Bohemia, 306 14, Pilsen, Czech Republic
Ivan Habernal
Speech and Multimodal Interfaces Laboratory, St. Petersburg Institute of Informatics and Automation for the Russian Academy of Sciences, 14-th line, 39, 199178, St. Petersburg, Russia
Andrey Ronzhin

About this paper

Cite this paper

Kipyatkova, I., Karpov, A. (2013). Lexicon Size and Language Model Order Optimization for Russian LVCSR. In: Železný, M., Habernal, I., Ronzhin, A. (eds) Speech and Computer. SPECOM 2013. Lecture Notes in Computer Science, vol 8113. Springer, Cham. https://doi.org/10.1007/978-3-319-01931-4_29

DOI: https://doi.org/10.1007/978-3-319-01931-4_29
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-01930-7
Online ISBN: 978-3-319-01931-4
eBook Packages: Computer Science, Computer Science (R0)
