Dataset Open Access

INEL Evenki Corpus

Däbritz, Chris Lasse; Gusev, Valentin

Data manager(s)
Ferger, Anne; Jettka, Daniel; Lazarenko, Elena; Lehmberg, Timm; Riaposov, Aleksandr
Researcher(s)
Wagner-Nagy, Beáta; Arkhipov, Alexandre; Gusev, Valentin; Däbritz, Chris Lasse

Corpus Citation

Däbritz, Chris Lasse & Gusev, Valentin. 2021. INEL Evenki Corpus. Version 1.0. Publication date 2021-12-31. Archived at Universität Hamburg. https://hdl.handle.net/11022/0000-0007-F43C-3. In: The INEL corpora of indigenous Northern Eurasian languages. https://hdl.handle.net/11022/0000-0007-F45A-1

Corpus Description

The INEL Evenki Corpus has been created within the long-term INEL project (Grammatical Descriptions, Corpora and Language Technology for Indigenous Northern Eurasian Languages), 2016–2033.
The corpus enables typologically aware, corpus-based grammatical research on the Evenki (< Tungusic) language and expands the documentation of the lesser-described indigenous languages of Northern Eurasia.
The INEL Evenki Corpus covers Northern (Taimyr, Khantayskoe Ozero, Ilimpi, Erbogachon) and Southern (Sym) Evenki dialects, which are or were in contact with other languages covered by the INEL project, first and foremost Dolgan and Selkup. The INEL Evenki Corpus is composed of texts from different sources:

  1. Published texts from different text collections, inter alia "Sbornik materialov po evenkijskomu (tungusskomu) fol'kloru" (Vasilevich 1936), covering all named dialects.
  2. Transcripts of recordings obtained from the Taimyr House of National Arts (TDNT) in Dudinka (2000s) as well as transcripts of recordings made by and from Tat’yana V. Bolina, both representing the Khantayskoe Ozero dialect.
  3. Texts from the handwritten archive of the Russian ethnographer and linguist Konstantin M. Rychkov recorded in the 1900s/1910s, covering the Taimyr, Ilimpi and Sym dialects.

Each text in the corpus is provided with morphological glossing, translations into English, Russian and German, and annotation of Russian borrowings. Some texts also have annotations for syntactic functions, semantic roles and information status.

Funding

The corpus has been produced in the context of the joint research funding of the German Federal Government and Federal States in the Academies’ Programme, with funding from the Federal Ministry of Education and Research and the Free and Hanseatic City of Hamburg. The Academies’ Programme is coordinated by the Union of the German Academies of Sciences and Humanities.

Contributions/Acknowledgements

  • The Taimyr House of National Arts (TDNT) provided valuable audio material (see above).
  • Tat’yana V. Bolina (TDNT Leading Methodologist for Evenki folklore & culture) recorded some further Evenki material in 2018 and 2019.
  • The Institute of Oriental Manuscripts of the Russian Academy of Sciences (IOM RAS; Институт восточных рукописей РАН) in Saint Petersburg provided scanned manuscripts from the Rychkov archive (The Archives of the Orientalists of IOM RAS, Coll. 49, inv. 1, items 4, 5, 6а, 6б, 6в).
  • The web-based search interface uses the Tsakorpus platform developed by Dr. Timofey Arkhangelskiy.

Files (3.0 GB)

Name                          Size      MD5
evenki-1.0-documentation.pdf  1.9 MB    6e9a8d48446135d4923f196c815077d0
evenki-1.0-mp3only.zip        662.9 MB  bf40160dba31c4e5174b30db2ed27c91
evenki-1.0-noaudio.zip        323.8 MB  69fab6fbbba25923b38154e80a9cac4e
evenki-1.0.zip                2.0 GB    b006bf14db414b7b6b290c91e76099ad
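
The MD5 digests listed for the files above can be used to check that a download arrived intact. The following is a minimal sketch, not part of the corpus distribution: the file names and digests are copied from the list above, while the helper names `md5_of` and `verify_downloads` and the assumption that the files sit in one local directory are illustrative only.

```python
import hashlib
from pathlib import Path

# MD5 digests as listed in the file list of this record.
EXPECTED_MD5 = {
    "evenki-1.0-documentation.pdf": "6e9a8d48446135d4923f196c815077d0",
    "evenki-1.0-mp3only.zip": "bf40160dba31c4e5174b30db2ed27c91",
    "evenki-1.0-noaudio.zip": "69fab6fbbba25923b38154e80a9cac4e",
    "evenki-1.0.zip": "b006bf14db414b7b6b290c91e76099ad",
}


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Map each listed file name to True/False (digest match) or None (file absent)."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = (md5_of(path) == expected) if path.is_file() else None
    return results


if __name__ == "__main__":
    # Hypothetical usage: run from the directory that holds the downloaded files.
    for name, ok in verify_downloads().items():
        status = {True: "OK", False: "MISMATCH", None: "not found"}[ok]
        print(f"{name}: {status}")
```

A mismatch usually points to an incomplete or corrupted download, in which case fetching the file again is the simplest remedy.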

Cite record as