Reconstructing phonetics behind the graphic system of Evenki texts from the Rychkov archive

This paper discusses the graphic system of manuscripts by Konstantin Rychkov (ca. 1910) containing texts in several dialects of Evenki (Tungusic) with Russian translation. The letters г and һ arguably denote voiced velars or post-velars, stops and fricatives alike. The former is only used before front vowels, the latter otherwise. The letter һ is also found quite often in place of the velar nasal. Palatalization is denoted by three means: a dedicated diacritic after the consonant, an umlaut on the vowel, or the alternation of the vowel letters і/ы. Rychkov substituted Latin letters for the Russian letters denoting hissing sibilants, which might reflect a different (“lisping”) pronunciation. A special diacritic under sibilants and б probably indicates semi-voicing.

Лингвистика. ISSN 2500-2953

1. Evenki-Russian texts in the archive of Konstantin Rychkov

1.1. The Rychkov archive
Konstantin Rychkov (1882-1923) was a Russian ethnographer, linguist and journalist, among his other varied activities. Born into a poor family in Ust-Kamenogorsk (currently in Kazakhstan), he was exiled for revolutionary activities to the Turukhansk Krai, which spanned huge territories in the north of Western Siberia. Working as a teacher in settlements of the Far North, he developed a keen interest in the language, culture and development of the indigenous peoples, particularly the Evenki, but also the Dolgan, Ket, Selkup and others.
The archive of Konstantin Rychkov, largely unpublished, is preserved at the Institute of Oriental Manuscripts (IOM RAS), St. Petersburg. This archive includes 1341 manuscript pages of folklore and other texts in several Evenki dialects of Western Siberia, collected between 1905 and 1913 and kept in several folders (Folders 4, 5, 6a, 6b, 6v). The text collection is accompanied by an Evenki-Russian dictionary on cards [Rychkov, n.d.] (4329 cards). Apart from the Evenki materials, the archive also holds data on other languages, notably including a Dolgan dictionary (on 1535 cards) currently being prepared for publication by Prof. Setsu Fujishiro.
In what follows, we will concentrate on aspects of Evenki phonetics as reflected in Rychkov's transcription, using principally data from "Folder 5" [Rychkov, 1911], "Folder 6b" [Rychkov, 1913] and "Folder 6v" [Rychkov, 1912]. We will be mostly interested in establishing correspondences between Rychkovʼs notation and particular sounds or phonetic features, rather than in describing the phonetic variation observed in the texts.
The analysis of the Rychkov manuscripts is part of the Evenki section within the INEL project, which leverages archival data in order to develop digital text corpora for a number of language varieties indigenous to Siberia (see [Arkhipov, Däbritz, 2018] for a description of the project). We should stress that the analysis presented here is based on partial data and thus cannot be considered final.

Text content
The main body of the Folders 5, 6a, 6b, 6v is formed by both traditional and spontaneous monological texts, mostly with Russian translations. The former group includes indigenous tales and legends, but also a number of likely Russian tales (such as "Firebird"). The latter ranges from local history texts (on the origin and migration of different Evenki tribes) to descriptions of hardships in everyday life, short life stories and personal narratives such as hunting stories.
There is also a smaller amount of other kinds of data, such as elicited sentences, short songs, and riddles. Standing apart from the others, Folder 4 contains transcripts and descriptions (in Russian) of three shamanistic rites. Texts in Folder 4 are more complex in aspects of language, structure and layout; they will not be treated in the present paper.

Text layout, metadata and dialectal attribution
As follows from the inscription on the cover, the texts in Folder 4 were collected in 1905-1909 and rewritten in 1911 by Rychkov himself. Though this is not overtly indicated, we can assume that the remaining folders also contain texts rewritten from original fieldnotes. The Evenki texts and their Russian translations are written in parallel, with the Evenki text on the left of each page and the Russian on the right. The Russian translation is missing in Folder 5 on ff. 155-321, in Folder 6a on ff. 157-330 and in Folder 6v on ff. 301-434.
The metadata provided with the texts are generally scarce and sometimes altogether missing. In particular, it remains largely unknown whether two particular texts in a folder were recorded from the same speaker or not. Dialect groups can be identified easily; however, a more precise dialectal attribution demands further investigation. Note that the geographical distribution of dialectal features in Evenki has not been stable in the past, due to migrations as well as to dialect shifts in the local populations (see for instance [Vasilevich, 1948, p. 56-59; Maksimova, 2016; Mishchenkova, 2019]). Thus the dialects documented by Rychkov may or may not have been found in the same areas in later periods. Additionally, families could cover long distances during seasonal nomadic migrations, and thus be encountered far from their 'home' region.
In our data, we will be concerned with the dialects encountered by Rychkov in the territory of the former Turukhansk Krai, namely the Ilimpi dialect of the Northern group and the Sym dialect of the Southern "hushing" group. The standard literary Evenki belongs to the Southern "hissing" group.
Let us briefly characterize the folders listed above according to the available metadata and the two selected phonetic features.
Metadata (Folder 5): Text titles are often provided. The folder has no date. Only one text in the middle (p. 121) has a date (09.04.1911) and place (river Kemchug) mentioned; no other data on speakers or locations are found in this folder. Note that the river Kemchug, a tributary of the Chulym, is better known as an area of the Sym dialect (see also Folder 6a).
Texts in Folder 5 are rather heterogeneous linguistically. They show much variation, many unexpected forms (e.g. the 1sg marker -w used for all persons) and probably some Dolgan/Sakha influence (e.g. a sequential converb -mətəmi used similarly to Dolgan -An). This folder could be identified with the "North-Eastern group of tribes of the Turukhansk Krai (dying breed)" from Rychkov's letter to V. Kotvich (17.11.1913) [Voskoboinikov, 1967], i.e. those Evenki who ultimately became part of the Dolgans.
Metadata (Folder 6a): The folder is not dated. No text titles are provided. For some texts, location, speaker name and tribe are given. The name 'Barhahan' ("Барһаһáнское нарѣчіе") is not identified; however, Rychkov's dictionary mentions the "dialect of Kemchug, or Warhahan" (unnumbered card before № 3378). The river Kemchug, while lying to the west of Enisey and not to the east, is indeed found in the metadata to some texts; however, the dialect documented here apparently differs from the Sym dialect in Folder 6b.
Metadata (Folder 6b): The folder as a whole is dated 1913. No other metadata are present anywhere in the folder, and the texts have no titles.
Although the name 'Hojon' could not be identified, the texts in Folder 6b appear to be quite homogeneous linguistically and typical of the Sym dialect.
Metadata (Folder 6v): Texts in this folder have the most complete metadata, usually including date, location, speaker name, age and tribe. The dates range from July 5, 1912, to August 5. Most of the locations, apart from the 'Kutynda ridge' ("Хр. Кутында"), are names of rivers that are tributaries of the Lower Tunguska.
The texts in Folder 6v are linguistically much closer to those from the Ilimpi dialect in the folklore collection [Vasilevich, 1936] than those in Folder 5, seemingly with less variation and fewer unexpected forms.
'So it was told.' (lit.: 'There are such old news.') [F. 5: 15]
Interestingly, the Latin l also sporadically appears in the Russian text instead of Russian л. In Folder 5 it occurs especially after о, as in оlень 'reindeer', поlожилъ '(has) put down'; in Folder 6b also after а, as in плясать стаlи '(they) started to dance'.
Evenki words such as proper names and specific cultural terms, when they occur in Russian translations, are generally spelled in the same way as they are in the Evenki text (see below), in some cases preserving stress marks, and are often separated from a following Russian inflectional suffix with an apostrophe:
(2) Тутъ сестра […] стукнула по лбу Ететы́р̀ʼа.

Phonetic features as reflected in Rychkov's transcription
The values of most letters in Rychkov's Evenki transcription are straightforwardly correlated with the phonemic inventory. Despite the high number of dialects and their huge geographical spread, as pointed out by Vasilevich [Vasilevich, 1948, p. 5], the number of phonemes is the same in all dialects, and variation is essentially limited to their allophones. Slightly simplified common inventories of vowels and consonants adapted from [Vasilevich, 1948, p. 6] are given in Tables 1 and 2, along with the corresponding characters used by Rychkov in angle brackets. The most important peculiarities in his inventory are highlighted in bold. The most salient variation parameter is the correspondence /s/~/š/~/h/. Recall that standard literary Evenki (henceforth StE) has /s/, while the dialects recorded by Rychkov belong to two other groups, Northern (/h/) and Southern "hushing" (/š/).
In what follows, we will discuss some non-trivial correspondences between Rychkov's notation and the above inventories, starting with the vowels. First, Rychkov does not mark vowel length as such. Although his vertical accent mark usually occurs on vowels which should be long (based on other sources), it generally appears only once per orthographic word. An apparent exception is the cliticized negative forms, which he spells as one word with the host, yet often with two accent marks, cf. ешӓẃһýчӧ '(he) didn't say' [F. 5: 62]. We can thus conclude that he indeed used the mark as a stress marker, rather than as a marker of vowel quantity, which could occur on more than one vowel in a (simple) word.
As to vowel qualities, the letter e is used for both /ə əː/ and (long) /eː/. Only і is used for the Evenki /i/ vowel, unlike in the Russian translations, where both Cyrillic и and і regularly appear, following the standard pre-reform orthography. (This helps to partly disambiguate the handwriting in the Evenki part, since in the Russian part all three of а о и can be confused.) However, the letter ы also competes to denote the same phoneme /i/, presumably reflecting allophonic variation depending on whether the preceding consonant is palatalized or not, as does the umlaut (see 3.2).
Among the consonants, the voiceless and voiced affricates are denoted with the letters ч and џ, respectively. Note that both in the Northern and in the Sym dialects the voiced member of the pair is a palatalized stop [dʼ] rather than an affricate [ǯʼ]. The velar nasal /ŋ/ is denoted by ң. Cyrillic х represents (laryngeal) /h/, as in the modern orthography. This sound appears, first, as an independent phoneme in most dialects, alternating with zero in some of them (3a); second, as the Northern Evenki correlate of the Southern /s~š/ (3b).
(3) a. /h~0/:
Such usage of х is in line with the modern orthography. However, Rychkov also uses another letter, һ, which is clearly distinct from х and more intriguing.
It turns out that the letter г, in spite of being the direct counterpart of Latin g, is rarely used by Rychkov in words known to have /ɡ/ in literary Evenki and across dialects. We find it only before front vowels, predominantly in the sequence (-)гі- (4a), less frequently in the sequences (-)гӓ- (in Folders 5 and 6v) or (-)ге- (in Folder 6b), cf. (4b). Before non-front vowels, һ appears instead (see examples above), including in Russian loans (4c). In word-medial clusters, too, г only appears when followed by a front vowel (5a); otherwise һ is used (5b). Only һ appears in word-final position (with only a few occurrences in the texts, see (6)).
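The complementary distribution of г and һ described here can be restated as a small consistency check. The Python sketch below is only an illustration under stated assumptions: the set of "front vowel" letters is limited to the three attested in the sequences just listed (і, ӓ, е), the function names are hypothetical, and the use of һ in place of ң is not modelled, so such tokens would be reported as deviations.

```python
import unicodedata

G = "\u0433"     # Cyrillic г
SHHA = "\u04bb"  # Cyrillic һ

# Assumption: only the vowel letters attested after г in the text (і, ӓ, е).
FRONT_VOWELS = {"\u0456", "\u04d3", "\u0435"}


def expected_letter(following: str) -> str:
    """Letter predicted by the rule: г before a front vowel, һ elsewhere.

    `following` is the next character, or "" in word-final position.
    """
    return G if following in FRONT_VOWELS else SHHA


def deviations(word: str) -> list:
    """Return (index, found, expected) for each г/һ that departs from the rule."""
    # NFC so that а + combining diaeresis is compared as the single letter ӓ.
    w = unicodedata.normalize("NFC", word)
    found = []
    for i, ch in enumerate(w):
        if ch in (G, SHHA):
            following = w[i + 1] if i + 1 < len(w) else ""
            expected = expected_letter(following)
            if ch != expected:
                found.append((i, ch, expected))
    return found
```

Applied to a transcribed word, `deviations` returns an empty list when every г and һ follows the rule; any non-empty result only flags a token for manual inspection and is not by itself evidence of a scribal error.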
(4) a. гірку- 'to walk' (StE гирку-мӣ)
But these are not all of the uses of г. Surprisingly enough, it is also found intervocalically in place of /h/ alternating with /s~š/, always in the same complementary distribution with һ depending on the following vowel:
Note that /ɡ/ is normally realized as a stop word-initially and in clusters, but as a fricative [ɣ] intervocalically and word-finally in most dialects [Vasilevich, 1948; Tsintsius, 1949, p. 48-49]. We find no evidence for a fricative realization of /ɡ/ in word-initial position, either in the past or in the modern Northern or Sym dialects. We must thus assume that Rychkov does not distinguish between the two sounds ([ɡ] and [ɣ]), using һ and г [Vasilevich, 1948, p. 90-91].

Yakutia and Chita oblast). However, such strengthening of [h] into [ɣ] or [ɡ] has not been reported, to our knowledge, for the Northern dialects, and is not observed, at least at first sight, in the available recordings. It should also be mentioned that the strengthening reported by Vasilevich is conditioned by close vowels, i.e. [i] and [u], while Rychkov makes the distinction between front and back vowels instead.
To sum up, the most reliably reconstructed difference between the word-initial /h/ (symbolized by х) and the intervocalic /h/ and /ɡ/ (both symbolized by һ and г) is that of voicing. On the other hand, Rychkov apparently makes no difference between the (voiced) laryngeal fricative [ɦ] in еlаһа 'when' and the velar stop [ɡ] in һуско 'wolf'. As an unconfirmed hypothesis, Rychkov himself might have spoken a Southern variety of Russian, which notoriously features a fricative [ɣ] (or [ɦ], depending on the specific variety) in place of standard Russian [ɡ]. This seems possible given the heterogeneous origins of the population in Eastern Kazakhstan at the time, and it could explain his non-distinction of [ɡ] and [ɣ ɦ].
Taking into account the complementary relations between һ and г, and disregarding for the moment the relations between һ and ң, we can then reconstruct the distribution of the letters х һ г as follows (see Table 3): <х> stands for a voiceless laryngeal, irrespective of the vocalic context; <һ> stands for a voiced non-palatalized velar, uvular or laryngeal, irrespective of the stop/fricative distinction and of the specific place of articulation within the velar/post-velar zone; <г> stands for a voiced palatalized velar, uvular or laryngeal, again irrespective of the stop/fricative distinction and of the specific place of articulation within the velar/post-velar zone.
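The reconstructed values of х, һ and г can also be written down as a small feature table. The following Python sketch is merely one possible encoding of the three statements above: the names `Reading`, `RECONSTRUCTION` and `letters_for` are hypothetical, the manner "fricative" for х is an assumption (the text specifies only "voiceless laryngeal"), and the relation between һ and ң is left out, as in the text.

```python
from typing import NamedTuple, Optional


class Reading(NamedTuple):
    voiced: bool
    palatalized: Optional[bool]  # None: irrespective of the vocalic context
    places: tuple
    manners: tuple


POST_VELAR_ZONE = ("velar", "uvular", "laryngeal")

RECONSTRUCTION = {
    # х: voiceless laryngeal (manner "fricative" is an assumption)
    "\u0445": Reading(False, None, ("laryngeal",), ("fricative",)),
    # һ: voiced, non-palatalized, stop or fricative alike
    "\u04bb": Reading(True, False, POST_VELAR_ZONE, ("stop", "fricative")),
    # г: voiced, palatalized, stop or fricative alike
    "\u0433": Reading(True, True, POST_VELAR_ZONE, ("stop", "fricative")),
}


def letters_for(voiced: bool, palatalized: bool, place: str) -> list:
    """Letters whose reconstructed reading is compatible with the features."""
    return [
        letter
        for letter, reading in RECONSTRUCTION.items()
        if reading.voiced == voiced
        and (reading.palatalized is None or reading.palatalized == palatalized)
        and place in reading.places
    ]
```

Under this encoding each combination of voicing, palatalization and place is compatible with at most one of the three letters, which is the sense in which they are in complementary distribution.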
One cannot be sure which of the sounds potentially symbolized by һ is (are) meant by Rychkov in these cases. While a fricative [ɣ] or de-buccalized [ɦ] is more likely intervocalically, a denasalized stop [ɡ] might appear in clusters.
The principal palatalization marker for /nʼ/ is a diacritic similar to a grave accent placed slightly above and to the right of the letter (in contrast to the accent mark, which is closer to vertical and placed directly above the vowel, and at a greater distance from it). But the palatalization diacritic is not limited to нˋ. It is also regularly present on рˋ, both in common nouns (8a) and in proper names (8b), including loans:
In Folder 6b, the palatalization mark is regularly present on final тˋ [tʼ] corresponding to StE [t] in some suffixes such as the instrumental: шулакитˋ 'fox (instr.)' (StE сулакит), тарітˋ 'therefore' (StE тарит), аjатˋ 'well' (StE аят), шотˋ 'very (much)' (StE сот). This is characteristic of the Sym dialect; however, [Vasilevich, 1948, p. 65] describes the Sym variant as -ч [č].

Sibilants, voicing and an obscure diacritic
All in all, the following spellings for sibilants are found in different contexts: с s ш сш sш з z; other combinations involving sibilant letters, including сч, sч, arguably correspond to clusters. The two Russian letters for hissing sibilants, с and з, are only used in Folder 5, and most occurrences of с in loans are corrected into Latin s. In Folders 6b and 6v, only s and z are used (with very few exceptions), including in native words, in particular in the sequence sш (~сш in Folder 5).
Single letters с s з z can also bear additional diacritics, e.g. c̬ s̬ s̬ˋ з̬ з̬ˋ z̬ zˋ. Note that з, z appear only in loanwords, and с, s mainly in loanwords except some clusters like -ск-, -ст-, since the dialects documented by Rychkov generally lack /s/ in their inventory and [s] can only appear in native items as an allophone of either /h/ or /š/.
The following can be found in Folders 5 and 6v:
− Word-final /s/ in 2sg markers is regularly written with ш ([š]): ечӓш бакаџӓнде 'you will not find'; хініш дундаду 'to your land'; interestingly, the same ending is regularly noted with the reflexive possessive: моніш џӱду 'to his own home'.
− Word-final /s/ in roots was not found in the texts.
− Word-medial postconsonantal /s/ (usually in the cluster -ks-) is generally found as ш: ерікшӓ- 'to breathe' (StE эрӣксэ), ірˋакшӓ 'reindeer skin' (StE ирэ ксэ), xукшіlда 'ski' (StE сӯксилла). Exceptionally, StE тукса- 'to run' is regularly recorded as тупса- in Folder 5; it is a variant not reflected in the dictionary [Vasilevich, 1958]. The same root in Folders 6v and 6b has the expected shape, тукша-.
Three aspects of Rychkov's rendering of sibilants call for an explanation: 1) the deliberate replacement с, з > s, z; 2) the sequence сш~sш; 3) the mysterious 'low caron' diacritic. The interpretations suggested below are only tentative, and they all assume that there is some phonetic motivation behind these peculiarities of the writing system, which is however ultimately unknown.
2. The sequence сш~sш between vowels corresponds to a long [šː] or to a [čʼ] in other data. The following vowel bears the umlaut in the majority of cases, indicating a palatalized articulation [š(ː)ʼ]. However, if it were a matter of palatalization alone, Rychkov's regular palatalization mark ( ˋ ) would have been sufficient. The two-letter sequence might also mark the length of the consonant articulation. On the other hand, the preconsonantal sш in іsшта in Folder 6v is probably neither palatalized nor long. Yet another alternative would be a kind of affricated articulation with a burst component in the middle, as e.g. in the older Russian pronunciation of ещё [-šʼčʼ-] 'yet', now simplified into [š(ː)ʼ]; however, this remains speculative and does not follow from the spelling itself.
Second, in a few cases the 'low caron' is found under the letter б. The only occurrence in the texts is a borrowing, б̬орох 'powder' (Rus.