A Dependency Treebank for Serbian: Initial Experiments

  • Conference paper
Speech and Computer (SPECOM 2014)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8773)

Abstract

The paper presents the development of a dependency treebank for the Serbian language, intended for various applications in natural language processing, primarily natural language understanding in human-machine dialogue. The treebank is built by adding syntactic annotation to the part-of-speech (POS) tagged AlfaNum Text Corpus of Serbian. The annotation follows the standards set by the Prague Dependency Treebank, which has also served as a starting point for the development of treebanks for some other related languages in the region. Initial dependency parsing experiments on the currently annotated portion of the corpus, which contains 1,148 sentences (7,117 words), yielded relatively low parsing accuracy, as expected for a preliminary experiment on a treebank of this size.
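
The abstract does not say how parsing accuracy was measured. As a minimal sketch, assuming the attachment scores commonly reported for dependency parsers (unlabeled, UAS, and labeled, LAS), accuracy could be computed along the following lines. The function name, the token representation and the toy sentence are illustrative assumptions, not taken from the paper.

```python
from typing import List, Tuple

# One token is (head index, dependency label); head 0 stands for the artificial root.
# This representation is an assumption made for the sketch.
Token = Tuple[int, str]


def attachment_scores(gold: List[List[Token]],
                      predicted: List[List[Token]]) -> Tuple[float, float]:
    """Return (UAS, LAS) as fractions over all tokens in the test set."""
    total = correct_heads = correct_both = 0
    for gold_sent, pred_sent in zip(gold, predicted):
        if len(gold_sent) != len(pred_sent):
            raise ValueError("gold and predicted sentences must have the same length")
        for (g_head, g_label), (p_head, p_label) in zip(gold_sent, pred_sent):
            total += 1
            if g_head == p_head:
                correct_heads += 1          # head is right: counts towards UAS
                if g_label == p_label:
                    correct_both += 1       # head and label are right: counts towards LAS
    if total == 0:
        raise ValueError("no tokens to score")
    return correct_heads / total, correct_both / total


# Toy three-word sentence with made-up annotations (hypothetical data).
gold = [[(2, "Sb"), (0, "Pred"), (2, "Obj")]]
pred = [[(2, "Sb"), (0, "Pred"), (1, "Obj")]]
uas, las = attachment_scores(gold, pred)
print(f"UAS = {uas:.3f}, LAS = {las:.3f}")
```

With a test set as small as the one described, each wrongly attached token moves these scores noticeably, which is one reason results on a treebank of this size are best read as preliminary.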

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Jakovljević, B., Kovačević, A., Sečujski, M., Marković, M. (2014). A Dependency Treebank for Serbian: Initial Experiments. In: Ronzhin, A., Potapova, R., Delic, V. (eds) Speech and Computer. SPECOM 2014. Lecture Notes in Computer Science, vol 8773. Springer, Cham. https://doi.org/10.1007/978-3-319-11581-8_5

  • DOI: https://doi.org/10.1007/978-3-319-11581-8_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-11580-1

  • Online ISBN: 978-3-319-11581-8

  • eBook Packages: Computer Science, Computer Science (R0)
