Using Part of Speech N-Grams for Improving Automatic Speech Recognition of Polish

Pohl, Aleksander; Ziółko, Bartosz

doi:10.1007/978-3-642-39712-7_38

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7988)

Abstract

This paper investigates the usefulness of a part-of-speech language model for the task of automatic speech recognition. The developed model uses part-of-speech tags as categories in a category-based language model. The model is used to re-score the hypotheses generated by the HTK acoustic module. The probability of a given sequence of words is estimated using n-grams with Witten-Bell backoff.
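The re-scoring idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a bigram model over tags only, uses the interpolated form of Witten-Bell smoothing rather than the backoff variant named in the abstract, and assumes each hypothesis has already been tagged. The class `WittenBellBigram`, the function `rescore`, the `lm_weight` parameter, the toy training data and the acoustic scores are all hypothetical.

```python
import math
from collections import Counter, defaultdict


class WittenBellBigram:
    """Bigram model over POS tags with interpolated Witten-Bell smoothing."""

    def __init__(self, tagged_sentences):
        self.unigrams = Counter()           # counts of predicted tags
        self.bigrams = Counter()            # counts of (previous tag, tag)
        self.followers = defaultdict(set)   # distinct tags seen after a context
        self.context_counts = Counter()     # how often each context occurred
        for tags in tagged_sentences:
            seq = ["<s>"] + list(tags) + ["</s>"]
            self.unigrams.update(seq[1:])
            for prev, cur in zip(seq, seq[1:]):
                self.bigrams[(prev, cur)] += 1
                self.followers[prev].add(cur)
                self.context_counts[prev] += 1
        self.total = sum(self.unigrams.values())
        # One extra slot reserves probability mass for a tag never seen in training.
        self.vocab_size = len(self.unigrams) + 1

    def unigram_prob(self, tag):
        # Witten-Bell at the unigram level, interpolated with a uniform distribution.
        types = len(self.unigrams)
        lam = self.total / (self.total + types)
        return lam * self.unigrams[tag] / self.total + (1 - lam) / self.vocab_size

    def prob(self, prev, tag):
        count = self.context_counts[prev]
        if count == 0:
            # Unseen context: fall back to the unigram estimate.
            return self.unigram_prob(tag)
        types = len(self.followers[prev])
        lam = count / (count + types)
        return (lam * self.bigrams[(prev, tag)] / count
                + (1 - lam) * self.unigram_prob(tag))

    def log_prob(self, tags):
        seq = ["<s>"] + list(tags) + ["</s>"]
        return sum(math.log(self.prob(p, t)) for p, t in zip(seq, seq[1:]))


def rescore(hypotheses, model, lm_weight=1.0):
    """Return the hypothesis with the best combined score.

    Each hypothesis is a tuple (words, tags, acoustic_log_score).
    """
    return max(hypotheses,
               key=lambda h: h[2] + lm_weight * model.log_prob(h[1]))


# Toy training data: tag sequences only (tag names are illustrative).
training_tags = [
    ["subst", "fin", "adj", "subst"],
    ["adj", "subst", "fin"],
    ["subst", "fin", "prep", "subst"],
]
model = WittenBellBigram(training_tags)

# Two competing hypotheses with made-up acoustic log scores.
hypotheses = [
    (["stary", "dom", "stoi"], ["adj", "subst", "fin"], -10.5),
    (["stary", "dom", "stój"], ["adj", "subst", "impt"], -10.0),
]
best = rescore(hypotheses, model)
print(" ".join(best[0]))
```

In this toy setting the second hypothesis has the better acoustic score, but its tag sequence ends in a tag that never occurred in training, so the combined score prefers the first one. Setting `lm_weight` to zero disables the language model and returns the acoustically best hypothesis.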

The experiments presented in this paper were carried out for Polish. The best results show that the part-of-speech-only language model, trained on a manually tagged 1-million-word corpus, reduces the word error rate by more than 10 percentage points.
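For readers unfamiliar with the metric, the following sketch shows how word error rate is commonly computed (word-level edit distance divided by the reference length) and how a reduction in percentage points differs from a relative reduction. The function name `word_error_rate` and the example sentences are hypothetical, and the numbers are toy values, not results from the paper.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed reference
    # prefix and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, 1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, 1):
            cur.append(min(
                prev[j] + 1,                           # deletion
                cur[j - 1] + 1,                        # insertion
                prev[j - 1] + (ref_word != hyp_word),  # substitution or match
            ))
        prev = cur
    return prev[-1] / len(ref)


# Toy example: one reference and two recognizer outputs.
reference = "stary dom stoi na wzgórzu"
baseline_wer = word_error_rate(reference, "stary tom stoi wzgórzu")
rescored_wer = word_error_rate(reference, "stary dom stoi wzgórzu")

# Absolute reduction, in percentage points.
points = (baseline_wer - rescored_wer) * 100
# Relative reduction, as a fraction of the baseline error.
relative = (baseline_wer - rescored_wer) / baseline_wer
print(f"{points:.1f} percentage points, {relative:.0%} relative")
```

A reduction stated in percentage points is the absolute difference between two error rates, so the same number of points can correspond to very different relative improvements depending on the baseline.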

Author information

Authors and Affiliations

Department of Electronics, AGH University of Science and Technology, al. Mickiewicza 30, Kraków, Poland
Aleksander Pohl & Bartosz Ziółko
Department of Computational Linguistics, Jagiellonian University, ul. Łojasiewicza 4, 30-348, Kraków, Poland
Aleksander Pohl

Editor information

Editors and Affiliations

Institute of Computer Vision and Applied Computer Sciences, IBaI, Leipzig, Germany
Petra Perner

About this paper

Cite this paper

Pohl, A., Ziółko, B. (2013). Using Part of Speech N-Grams for Improving Automatic Speech Recognition of Polish. In: Perner, P. (eds) Machine Learning and Data Mining in Pattern Recognition. MLDM 2013. Lecture Notes in Computer Science, vol 7988. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39712-7_38

DOI: https://doi.org/10.1007/978-3-642-39712-7_38
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-39711-0
Online ISBN: 978-3-642-39712-7
eBook Packages: Computer Science, Computer Science (R0)
