On the Root-Based Lexicon for Polish

Chapter in Aspects of Natural Language Processing

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5070)

Abstract

In this paper we present the concept of an electronic lexicon based on morphological roots. The idea of the root-based lexicon returns to the traditional linguistic division of a word into a stem and an inflectional suffix. The only difference from a purely linguistic description is that an electronic resource must adapt to the text being analyzed. We assume that the lexicon will be used in the analysis (or synthesis) of written text, and therefore we operate on grapheme objects.

We used the lexicon of the inflectional analyzer AMOR as the empirical foundation for the root-based lexicon. In the second part of the paper we describe the automatic conversion of the analyzer's data into the assumed format. The conversion covers the major inflecting parts of speech: nouns, adjectives and verbs. The result is a set of entries based on two-level morphology that carry the full morphological information about lexemes. In this form, however, no generalization about Polish inflection or root-internal alternations is available. We therefore rebuilt the lexicon of roots. As a result, we obtained a compressed lexicon that can serve not only inflectional analysis but also applications of word-formation descriptions.
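As a rough illustration of the stem-plus-suffix idea described above, the following Python sketch stores one noun under its graphemic root alternants together with a table of endings. The entry layout, the names `RootEntry`, `generate` and `analyze`, the tag strings, and the choice of the noun ręka ('hand') are all assumptions made for illustration; this is not the AMOR data format or the lexicon format of the paper, and only a simplified singular paradigm is shown.

```python
from dataclasses import dataclass


@dataclass
class RootEntry:
    """Hypothetical root-based entry: root alternants plus an ending table."""
    lemma: str
    roots: tuple    # graphemic root alternants, e.g. ("ręk", "ręc")
    endings: dict   # tag -> (index into roots, inflectional suffix)


# Simplified singular paradigm of "ręka" ('hand').
# The k/c root alternation shows up in the dative and locative.
REKA = RootEntry(
    lemma="ręka",
    roots=("ręk", "ręc"),
    endings={
        "sg:nom": (0, "a"),
        "sg:gen": (0, "i"),
        "sg:dat": (1, "e"),
        "sg:acc": (0, "ę"),
        "sg:inst": (0, "ą"),
        "sg:loc": (1, "e"),
        "sg:voc": (0, "o"),
    },
)


def generate(entry, tag):
    """Synthesis: concatenate the selected root alternant and the suffix."""
    root_index, suffix = entry.endings[tag]
    return entry.roots[root_index] + suffix


def analyze(form, lexicon):
    """Analysis: return sorted (lemma, tag) pairs whose generated form matches."""
    return sorted(
        (entry.lemma, tag)
        for entry in lexicon
        for tag in entry.endings
        if generate(entry, tag) == form
    )
```

With this entry, `generate(REKA, "sg:gen")` builds the form ręki, and `analyze("ręce", [REKA])` returns both the dative and the locative reading, since the two forms coincide in writing. Keeping the alternants in one entry is what lets a single record serve both analysis and synthesis in this sketch.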

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Rabiega-Wiśniewska, J. (2009). On the Root-Based Lexicon for Polish. In: Marciniak, M., Mykowiecka, A. (eds) Aspects of Natural Language Processing. Lecture Notes in Computer Science, vol 5070. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04735-0_3

  • DOI: https://doi.org/10.1007/978-3-642-04735-0_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-04734-3

  • Online ISBN: 978-3-642-04735-0

  • eBook Packages: Computer Science, Computer Science (R0)
