Tatar WordNet: The Sources and the Component Parts

  • Conference paper
Data Analytics and Management in Data Intensive Domains (DAMDID/RCDL 2019)

Abstract

We describe an ongoing project to construct the Tatar Wordnet. The Tatar Wordnet is being built on the basis of three source resources that we developed. The first source is TatThes, a bilingual Russian-Tatar Social-Political Thesaurus. TatThes, in turn, was constructed by manually translating and extending RuThes, a linguistic ontology for Russian. The second source is a Tatar translation of RuWordNet, a wordnet for Russian. This translation was produced automatically on the basis of a Russian-Tatar dictionary and then manually verified. The third source is a semantic classification of Tatar verbs, developed from scratch. We discuss the structure, the compilation methodology, and the current state of these source resources, and justify choosing them as the initial resources for building the Tatar Wordnet. Our ultimate goal is to publish the Tatar Wordnet on the Linguistic Linked Open Data cloud and integrate it into the Global WordNet Grid.

Acknowledgments

This work was funded by the Russian Science Foundation, research project no. 19-71-10056.

Author information

Corresponding author

Correspondence to Alexander Kirillovich.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Kirillovich, A., Galieva, A., Nevzorova, O., Shaekhov, M., Loukachevitch, N., Ilvovsky, D. (2020). Tatar WordNet: The Sources and the Component Parts. In: Elizarov, A., Novikov, B., Stupnikov, S. (eds) Data Analytics and Management in Data Intensive Domains. DAMDID/RCDL 2019. Communications in Computer and Information Science, vol 1223. Springer, Cham. https://doi.org/10.1007/978-3-030-51913-1_13

  • DOI: https://doi.org/10.1007/978-3-030-51913-1_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-51912-4

  • Online ISBN: 978-3-030-51913-1

  • eBook Packages: Computer Science, Computer Science (R0)
