Linguistic Paradox and Diglossia: the emergence of Sanskrit and Sanskritic language in Ancient India

Abstract “We know that Middle Indian (Middle Indo-Aryan) makes its appearance in epigraphy prior to Sanskrit: this is the great linguistic paradox of India.” In these words Louis Renou (1956: 84) referred to a problem in Sanskrit studies for which so far no satisfactory solution had been found. I will here propose that the perceived “paradox” derives from the lack of acknowledgement of certain parameters in the linguistic situation of Ancient India which were insufficiently appreciated in Renou’s time, but which are at present open to systematic exploration with the help of by now well established sociolinguistic concepts, notably the concept of “diglossia”. Three issues will here be addressed in the light of references to ancient and classical Indian texts, Sanskrit and Sanskritic. A simple genetic model is inadequate, especially when the ‘linguistic area’ applies also to what can be reconstructed for earlier periods. The so-called Sanskrit “Hybrids” in the first millennium CE, including the Prakrits and Epics, are rather to be regarded as emerging “Ausbau” languages of Indo-Aryan with hardly any significant mutual “Abstand” before they will be successfully “roofed,” in the second half of the first millennium CE, by “classical” Sanskrit.

1 For their invitation to participate in a fascinating and unique event of comparative historical sociolinguistic research, the Conference "Strategies of Language Variation: Transcultural Perspectives" (Vienna, 24-25 April 2015), the author thanks Vincent Eltschinger and the organizers of the Conference; he thanks Chiara Barbati and Christian Gastgeber for their invitation to contribute to the proceedings. He thanks Wouter Henkelman for a discussion on the Behistun (Bīsutūn) inscriptions and on Old Persian. He is grateful to two anonymous reviewers for their suggestions for improvement.
[not in this author's abstract etc.: Representation of characters: R and r with subscript circle have been replaced (in the full, published version) by R and r with subscript dot ( >>> Ṛ,  >>> ṛ); these and a few other obvious replacements by typographically less problematic diacritics ( >>> 'ā,  >>> å) were necessary to produce a readable online pdf.]
1. Introduction
1.1 In this paper I propose that the perceived "paradox" of Sanskrit (Renou 1956: 84) derives from the lack of acknowledgement of certain parameters in the linguistic situation of Ancient India which were insufficiently appreciated in Renou's time, but which are at present open to systematic exploration with the help of by now well established sociolinguistic concepts, notably the concept of "diglossia" (Deshpande 1985; Houben 1996a). This I will do by addressing three issues specified by the organizers of the Conference "Strategies of Language Variation: Transcultural Perspectives" (Vienna, April 2015). In a broad sense, Sanskrit can here be taken to include its predecessors, Vedic or "the older dialects of Veda and Brāhmaṇa" (Whitney 1888) and Pāṇini's and Patañjali's bhāṣā 'conversational language'. In a strict sense, Sanskrit refers to classical Sanskrit which arose in the first centuries CE and flourished throughout the first millennium in South Asia and beyond, until the beginning of the second, "vernacular" millennium (Pollock 2006), when its niche became more restricted (Houben 2008).
1.2 Just as structural and generative linguistics have a predecessor in Ferdinand de Saussure (1857-1913), who has several times been explicitly mentioned as such, 2 sociolinguistics has an important predecessor in Antoine Meillet, for whom language was "éminemment un fait social" (Meillet 1921: 230). However, he has rarely been recognized as such. 3 Meillet added that language enters exactly into the definition of a "social fact" as given by Émile Durkheim in his 1895 essay on the method of sociology. 4 Although Meillet intended the "social" nature of language to be applicable both to current and historical forms of language use, sociolinguistics is at present mainly developed with regard to current language use. To systematically apply a sociolinguistic approach to historical texts and languages is therefore an innovative move from which both sociolinguistics and the study of ancient languages and literatures can be expected to profit and progress in new ways.
2. Sanskrit: matrix, or outcome of related idioms?
2.1 The first question touches on a fundamental issue in the study of Sanskrit of which the earliest Western students of Sanskrit were well aware, but which has until now not been addressed properly, partly on account of the extent of the subject, partly on account of the lack of adequate concepts.
a. In addition to the 'matrix language' (the most used one), which other languages or varieties can be found in your treated texts (e.g. in the form of style shifts and code-switching)?
2 The importance of Saussure's work for structural and generative linguistics was frequently pointed out in the latter half of the previous century and has become so trite that it now normally remains implicit. As early as 1957, N.C.W. Spence observed "it can be said that 'we are all Saussureans now'." Noam Chomsky placed himself explicitly in the tradition of Saussurean linguistics. A study and analysis of the importance of Saussure's work for 20th century linguistics (till the mid-eighties) is Kaldewaij 1986.
3 I am only aware of L.-J. Calvet 1998 who briefly highlighted the importance of Antoine Meillet for modern sociolinguistics in the beginning of his book and drew attention to mild theoretical divergences with Saussure.
4 First chapter of Durkheim 1895, entitled "Qu'est-ce qu'un fait social?"
The question as formulated presupposes that we have sufficient access both to a "'matrix language' (the most used one)" and to "other languages or varieties" which are its outcome. In our case, the language to which we have access quite extensively is (classical) Sanskrit, but it is not demonstrated, and a priori unlikely, that Sanskrit would be the "'matrix language' (the most used one)" for languages and idioms with which it had a dynamic relationship. Throughout its long history, Sanskrit, as the language that is well formed or well prepared (saṁ-s-kṛta), presupposes varieties of language that are less well formed, either because the linguistic norms are imperfectly realized, or because different linguistic norms are followed. As the name of a language or of a variety of language, the term Sanskrit (saṁ-s-kṛta) is relatively late (from the first centuries CE onwards), but it is linguistically more or less identical with the language used and discussed in Patañjali's Mahābhāṣya ('Great Commentary', 2nd century BCE) as the bhāṣā 'conversational language' described in Pāṇini's grammar, the Aṣṭādhyāyī (AA), 4th century BCE.
5 The language used and described in the Mahābhāṣya will become exemplary in the period of "classical" Sanskrit. In addition, a more archaic variety is described by Pāṇini: the language of the ancient Vedic hymns. The oldest, very extensive collection of Vedic hymns is the Ṛgveda, rich in poetic eulogies of Vedic gods such as Agni, Indra, Varuṇa, but also containing "philosophical" reflections on, for instance, the place of man in the world and in the universe (Renou 1957b). The verbal root kṛ 'to do' has the present stem kṛ-ṇu in the entire Ṛgveda, with three exceptions, all in the tenth and last Maṇḍala, generally regarded as the latest one (Whitney 1888: 260). In verse 2 of ṚV 10.145, for instance, the imperative kur-u is used by a woman "conjuring against her co-wife for the affections of their joint husband" (Jamison & Brereton 2014: 1630). This does not represent a systematic and generally accepted style shift in the Ṛgveda, as in other sentences in ṚV 10 attributed to women we find that only the older stem kṛ-ṇu is used (as in ṚV 10.95, the dialogue between Purūravas and Urvaśī). In his extensive study under the title "Tracing the Vedic dialects," Michael Witzel (1989: 101) refers to this rare use of kuru in the Ṛgveda and to several other indications of "social levels of language" in Vedic texts. Subsequently, however, Witzel's study (see also Witzel 1987, 1997) is focused on the parameters "geography" and "time" and the parameter "social levels of language" is no longer taken into account. Forms derived from kur/kur-u rather than kṛ-ṇu become more prominent in the Atharvaveda, and kur-u is the normal present stem in the prose of the Brāhmaṇas and in classical Sanskrit. While kuru may be regarded, formally, as a "later" form in the tenth Maṇḍala, invoking a later stage of the language, viz. classical Sanskrit, cannot explain the form synchronically.
It has been proposed that it derives from a "Vedic Prakrit", an otherwise unattested "Middle Indo-Aryan" form *kuṇu, from Vedic kṛṇu (Mayrhofer 1951: 136, with a reference to Wackernagel's Altindische Grammatik I, 1896). This amounts to the contemporaneous availability to the users of the language, at least at the time of the tenth Maṇḍala of the Ṛgveda, of two levels of speech, one level of high prestige normally used to address the gods, and a middle or lower, "Prakritic" one, used, for instance, by women. In the other hymn containing kuru (ṚV 10.19, verse 2b), which may also represent a more popular register as it deals with the returning home of cattle in the evening, this form co-occurs with another linguistic form, the simple nominative plural ending in devā́ḥ ... yajñíyās (10.19.7c). This form could be regarded as simply "later" if we arrange the linguistic forms exclusively according to a "time-line" of predominant usage, but it points, contemporaneously, to an apparently widespread distinction between levels of language known not only in Vedic but also in Old Iranian: the distinction between the simple nominative plural ending of -o/-a stems (Skt. -āḥ, Av -/-a, OP -ā ḥ ) and the double ending -āsaḥ (Skt. -āsaḥ, Av -ŋhō, OP -ā ḥ /-āha ḥ ). To the discussion by Witzel (1989: 212) is to be added that the double ending -āha ḥ in OP is found only very exceptionally, namely in the expression aniyāha bagāha 'the other gods', which "seems to come from the language of religion" (Kent 1950: 9), in other words, from a "higher prestige" level of language. On the Indian side, the simple ending -āḥ is the one that, in the formulation of Witzel, "has gained prominence in all Prākṛts (-āḥ > -ā), except for -āse in Pāli verses" (with a reference to von Hinüber 1986: 144, §312). Again, the double ending is found in the "higher level" context of poetry.
In view of the variation attested in Vedic and in Avestan and OP, the simple ending -āḥ probably never "gained in prominence" but had always remained available next to the "higher level" use of language in poetry and in religious contexts. Traces of an actual "matrix language" are rare in the transmitted texts, but they are sufficiently attested to infer that it was current, including in "Prakritic" or "Middle Indo-Aryan" language use contemporaneous with the composition of the Veda. The double ending (Skt. -āsaḥ, Av -ŋhō, OP -ā ḥ /-āha ḥ ) was, on the other hand, a pre-Vedic and pre-Avestan "hyper prestigious" form, not to say a "hypersanskritized" form, if we allow ourselves to take the term "sanskritization" in a generic, linguistic sense and apply it to a situation long before the emergence of classical Sanskrit or Sanskrit in the strict sense of the word.
2.2 In fact, as early as in 1896, Jakob Wackernagel was well aware of a distinction in language according to what he called in his time "Volksklassen". Ca. fifty years later his statement to this effect was rendered as follows by Louis Renou (1957a: 7): Ainsi la scission du langage d'après les classes sociales, qui s'observe partout mais n'est nulle part plus forte que dans l'Inde, se laisse attester dès l'époque védique. (Thus the division of language according to social classes, which is observed everywhere but is nowhere stronger than in India, can be witnessed from the Vedic period.) Renou was able to add to this statement a new note 89 on "linguistic stratification with social origin (stratification linguistique d'origine sociale)" with several bibliographical references, at a time that it would still take around a decade before sociolinguistics would emerge as an academic discipline. Renou's most recent reference was to Marcel Cohen's Pour une sociologie du langage (1956) which explored the possibilities for sociolinguistic studies and for a sociology of language.
2.3 In the next period for which language use is sufficiently accessible, the one to which the grammarians Pāṇini (4th century BCE) and Patañjali (2nd century BCE) belonged, the role of "'matrix language' (the most used one)" accrued, again, not to "Sanskrit", referred to as bhāṣā, the 'conversational language', but, again, to some form of Prakrit, a continuation of the "Prakritic" language use inferred for the Vedic period, and a language variety on which we have, for Pāṇini's and Patañjali's period, still only very limited direct information, mainly in the inscriptions of king Aśoka (3rd century BCE). Patañjali's commentary the Mahābhāṣya or Vyākaraṇa-Mahābhāṣya is itself an excellent example of conversational Sanskrit, although the term saṁ-s-kṛta is still nowhere used to refer to this language or idiom. In addition, there are the extensive texts of early Buddhism, which, however, have been fixed in writing a few centuries later, long after the discourses and discussions of the Buddha which are supposed to be reported in many of these texts.
2.4 Although clapping is never done with a single hand, from the current perspective we perceive over a period of around two millennia, starting with Patañjali's Mahābhāṣya, a single "Sanskrit" hand clapping. 6 From the clapping itself we have to infer that there was, according to time and circumstances, another "proto-Prakrit" or "Prakrit" or "approximative Sanskrit" 7 hand clapping of which we often have no direct information at all, sometimes only a limited amount of evidence (as in Aśoka's inscriptions), and only for later periods in the course of the second millennium CE somewhat detailed information; but by that time Prakrit (and Pali) no longer represent a Prakritic language in current use but have developed into codified, mostly literary languages in their own right. The extrapolations to which this uneven distribution of the evidence and its frequent distortion through transmission continuously invite us are unavoidably informed by our understanding of linguistic processes in better documented areas and periods. Hence, we cannot afford to neglect either the exploration of primary sources, or the reflection on fundamental theoretical issues connected to their interpretation.
6 In the final discussion of the ISS seminar in 1994 it was, as far as I remember, Professor H.H. Hock who used this metaphor for the relationship between Sanskrit and a not always easily recognizable other language or other form of linguistic usage with which it interacts. That we are justified in distinguishing a dynamic interaction over time of a limited number of languages was recently demonstrated by Andrew Ollet, who further observed that "a dichotomy between Sanskrit and Prakrit" was "[a]t the foundation of this language order" of three literary languages in India mentioned by Mīrzā Khān in the 17th century (Ollet 2017: 1-4).
7 The expression sanskrit approximatif des bouddhistes was proposed by Helmer Smith (1954: 3) as equivalent, or rather as a gentle, terminological corrective, to the title Buddhist Hybrid Sanskrit which F. Edgerton gave to his extensive study published in three volumes (Grammar, Dictionary, Reader) in 1953.
2.5 The language described by Pāṇini and Patañjali was limited to the "high prestige" form of linguistic usage current in their time. Those current idioms contained a whole range of linguistic forms in contemporaneous use, from "Prakritic" to various degrees of approximation of the high standard Sanskrit: next to bhavati 'he is', for instance, both bhoti and hoti. This can be inferred from, inter alia, the language attested in Aśokan inscriptions found throughout the Indian subcontinent and dated in the 3rd century BCE, in between Pāṇini and Patañjali. Although the grammarians decided to describe only the desirable "high prestige" forms of the language, and not to bother about indicating all possible lower forms (apaśabda), sporadic references in Patañjali's commentary give an idea of these forms regarded as having a lower prestige (see below).
2.6 An important domain of sociolinguistic variation is ancient Indian theatre. A number of "classical" Indian dramas have been transmitted over the centuries and are available, the most important ones dating from the middle of the first millennium CE onwards. The dramas follow patterns and rules which have been set forth in texts such as the Nāṭyaśāstra (2nd or 3rd century CE? Kane 1971: 43-47;S.K. De 1960: 18). The rules also concern which language is to be used by which character. In larger classical dramas, "Sanskrit is spoken mainly by the educated, upperclass male protagonists, while various types of Prākrits are used by most women and by males of lower rank and education" (Hock & Pandharipande, 1976: 113). The earliest dramas that are fragmentarily preserved are those by a Buddhist author, Aśvaghoṣa (ca. 100 CE), otherwise known as author of a poetic biography of the Buddha in Sanskrit, the Buddhacarita. Of Aśvaghoṣa's play Śāriputraprakaraṇa only fragments of the last two Acts (out of nine in total) are preserved. The story of the play concerns the conversion to the Buddhist doctrine of Maudgalyāyana and Śāriputra. Sanskrit, in prose and in verse, is spoken in this drama by the Buddha and his disciples, Maudgalyāyana and Śāriputra, and a Śramaṇa; the Vidūṣaka, who is a Brahmin, speaks Prakrit (Keith 1924: 82).
3. Sanskrit and the frequency of its varieties
3.1 The next question consists of two closely related ones, of which we will address here first the following.
b-A. How frequently do these languages/varieties occur?
Ancient India had very sophisticated techniques for memorization (Scharfe 2002, Houben & Rath 2012) and was very late in accepting writing for the transmission of its sacred texts, in comparison with its neighbours, China and Mesopotamia. After an initial, predominantly oral period, the various religious and philosophical systems accepted writing in an environment in which orality nevertheless remained predominant for a long time. Due to the conditions of the Indian climate, Indian manuscripts (mostly prepared from palm leaves or, in the north, from birch bark) deteriorate after a relatively short period of two to three centuries and are to be copied if a subsequent generation considers their content sufficiently important. This situation has led to a very uneven distribution of quantitative manuscript survival (for some Vedic texts: oral tradition plus manuscript tradition). For older periods there are therefore no direct data available that allow us to answer the question "How frequently do these languages/varieties occur?" with precision, specifying place and time. In a limited domain such as epigraphy, some quantitative observations can be made. The oldest inscriptions, starting with those of Aśoka, are in an early Prakrit. They continue to be written in Prakrit until the middle of the second century CE, when the Śaka ruler Rudradāman had a text inscribed in perfect Sanskrit in which he "celebrates his own cultural and political achievements" (Pollock 2006: 68). In subsequent centuries, we find not only inscriptions in Sanskrit, but also in Prakrit, and in intermediate forms, for which the term Epigraphical Hybrid Sanskrit has been coined (Damsteegt 1978). The element "Hybrid" in this term would suggest that the language use reflected in the inscriptions is generated from separate and disparate linguistic entities or processes, for which, however, there is no evidence.
"Hybridity" is as questionable here as it is in "Buddhist Hybrid Sanskrit" (Edgerton 1953), the Sanskritic language of a large body of Buddhist texts, and in "Gāndhārī Hybrid Sanskrit" (Salomon 2001). The tenacity of the "Hybrid" in reflection on ancient Indian language use is parallel to the hardly less tenacious, and hardly less problematic, conceptually underlying biological metaphor of languages as living organisms. The inscriptions of both "Buddhist" and "non-Buddhist" character of the first till the fifth century in what appears to us as approximative (see above) or intermediate Sanskrit, or even approximative or intermediate Prakrit, show, however, that religious affiliation was not a decisive parameter in the choice of idiom. If different idioms are combined in a single inscription, a neat division is seen: the one of higher prestige is used in the praśasti, the part in which the king or another donor is praised, his genealogy given, his achievements celebrated; and the idiom of lower prestige is used in the management part which records in widely understandable terms the donation etc. Another, equally problematic, employment of the term "Hybrid Sanskrit" concerns the language used in a mathematical text -again in a context where religious affiliation is insignificant -which is fragmentarily available in a single manuscript, the Bakhshali manuscript, so called after the village where it was found in what is now north-west Pakistan.
3.2 More certainty in the establishment of the date of this and other undated manuscripts would be a great help in contextualizing the linguistic evidence for specific idioms and registers of language use in pre-modern South Asia, which would be a prerequisite for judging "how frequently" a language or register, including Sanskritic language use, occurs. The language is a quite particular one in the case of the Bakhshali manuscript. Scholars have again referred to this language uncritically as "Hybrid Sanskrit" in the recent article "The Bakhshālī Manuscript: A Response to the Bodleian Library's Radiocarbon Dating" (Plofker et al. 2017). Among the variables of carbon dates, variation in script and linguistic variation, the first is the most objective but still much in need of calibration for relatively recent, historical dates. In view of the strong normativity of linguistic usage within the dimension "sanskrit - approximative sanskrit" it is difficult to derive a linear chronological difference from the observed linguistic variation. Writing, too, is a normative activity and is moreover dependent on some amount of individual variation from scribe to scribe. However, writing has been much less subject either to the intensive study of early scripts by later generation scribes 8 or to the conscious reintroduction of archaisms in later forms of writing (something we see in language, most famously the studied archaizing "Vedic" language use in parts of the Mahābhārata and in the Bhāgavatapurāṇa). The "hardest" evidence to judge the date of a manuscript such as the Bakhshali and its sections would therefore be the palaeographic evidence. Other evidence, including the laboratory results of radiocarbon dating, is to be interpreted in the light of the results reached by careful palaeographic study.
3.3 With regard to the question "How frequently do these languages/varieties occur?" we can conclude that much relevant material is available but that quantification is not obvious and rendered difficult in the absence of sufficiently reliable dating and contextualization of texts. For the period of India's pre-literary (pre-Aśokan) orality, only estimates can be proposed on the basis of indications found in early Vedic texts and in early Buddhist and Jaina texts which originated in that period. Another impediment is the unreflective use by modern scholars of outdated concepts, and, more generally, conceptual poverty in linguistic reflection about Sanskrit which was still excusable a century ago or even fifty years ago but not in modern times.

4. Sanskrit and related varieties: wide apart or nearly indistinguishable?
4.1 The query regarding frequency of languages or language varieties (b-A) presupposes that these can be distinguished. Hence, it cannot be dissociated from the following question:
b-B. Are the different linguistic systems easily distinguishable, or are some words difficult to assign to a specific language/variety?
Sanskrit and forms of Prakrit such as Aśokan Prakrit and Pali, in spite of numerous well-defined differences, are nevertheless still to be regarded as very close. They were no doubt to a large extent mutually understandable.
4.2 How (a) Sanskrit and (b) its counterparts, Prakrit and numerous gradations of approximative Sanskrit, are different but also very close is clear from brief narrative passages in Patañjali's Mahābhāṣya, the extensive commentary on Pāṇini's grammar. We select here a few of these passages, first of all Bhāṣya (Section) 23 of the introductory chapter in the Mahābhāṣya which gives one out of several reasons why grammar should be studied:
Those (rival) Asuras uttering (the words) he'layo he'layaḥ have been defeated. Therefore a brahmin must not speak barbaric language (mlech-), that is, he must not use corrupt words. Mleccha 'barbaric language' indeed is (the same as) apaśabda 'corrupt speech'. So that we should not become mleccha 'barbarians, users of barbaric language': that is (also a reason) why one should study grammar.
The passage is roughly parallel to the Śatapatha-Brāhmaṇa in the Mādhyandina recension, 3.2.1.23-24, where it is part of a more extensive narrative starting at Śatapatha-Brāhmaṇa (Mādh.) 3.2.1.18. The Devas and Asuras, 9 the "divine counterparts of the vedic Aryans and their rivals" (Parpola & Parpola 1975: 212), are in fierce competition in the context of a ritual. After having lost the adherence of the goddess Speech (Vāc) who was initially at their side, the Asuras shout something. Instead of he'layo he'layaḥ of the grammarians, Śatapatha-Brāhmaṇa (Mādh.) has he'lavo he'lavaḥ, which the medieval commentator Sāyaṇa glosses as he'rayo he'rayaḥ. Whether the exclamation was he'layo he'layaḥ or he'lavo he'lavaḥ, in both cases it would correspond to Prakrit versions of Sanskrit he'rayo he'rayaḥ "hey, enemies!" (Thieme 1938). 10 Either way, the "barbaric language" of the Asuras would be very close to the high standard required by the gods, both in the narrative of the Śatapatha-Brāhmaṇa and in that of the grammarians. This remains valid even if the word mleccha has no Indo-European etymology, and is perhaps, together with Pali milakkha 'barbarian', a continuation of the toponym Meluḫḫa found in Akkadian and Old Babylonian cuneiform sources where it refers to a distant foreign country engaged in sea trade. It also remains valid irrespective of whether the undeciphered symbols of the Indus Civilization are taken as representing a language, usually either a (proto-) Dravidian one (Parpola 1994, 2015), or an early Indo-Aryan one, or as not being in any way linguistic (Farmer et al. 2004). Whatever the linguistic reality in terms of languages from very different language families, the Brāhmaṇa and Mahābhāṣya authors perceived only the variation between Sanskrit and gradations of approximative Sanskrit. If we assume that some (proto-) Dravidian language was, in their time, somehow geographically near, its presence apparently remained largely unperceived.
4.3 The next passage in the Mahābhāṣya, Bhāṣya 24, gives another example of incorrect use of language, which is discussed in the paper.
4.4 Still another example of incorrect use of language is given in the Mahābhāṣya, Bhāṣya 119:
There were sages (a group of sages) (nick-) named yarvāṇas-tarvāṇas. Their perception of dharma was direct, they knew the far and the near, they knew what could be known and they had come to realize ultimate reality. These worthy persons used the expressions yarvāṇastarvāṇas when they should have used yad vā naḥ tad vā naḥ "whatever (happens) to us, (let) that (happen) to us." Still, they did not use incorrect words at the time of sacrificial ritual. But the Asuras did use incorrect words at the time of sacrificial ritual. That is why they were defeated.
In this example the situation is the inverse of the preceding two: here a group of sages uses correct language within the ritual and wrong language in daily life, with distortion or wrong euphonic combination of a few syllables (r instead of d, ṇ instead of n: an excess of cerebralisation).
4.5 Since the ancient Indians had a highly developed system of grammatical analysis and their reflection on language and grammar was on a high level, it is legitimate to ask what their own view was on the distinguishability or difference or closeness of the two linguistic structures, Sanskrit (or its predecessor, Patañjali's bhāṣā) and Prakrit.
At the end of book 1 of the Vākyapadīya (VP 1.175-183) the relation between correct and substandard words and their capacity to express meaning are discussed. The VP-verses of this passage envisage two situations: I. The speaker sincerely tries to speak correct language (śabda), but produces substandard words (apabhraṁśa). II. The speaker is in a community in which the substandard apabhraṁśa words have become generally known and accepted on account of a (non-Sanskrit, Prakrit) tradition. Under (I), the correct word, śabda, is vācaka 'expressive of meaning'; the substandard apabhraṁśa word is not itself vācaka 'expressive of meaning', but it brings to mind the intended correct word, śabda. Under (II), śabda, the correct word, is not, or is no longer, expressive of meaning: in the (non-Sanskrit, Prakrit) community it is the Prakrit word that has become directly expressive of meaning. In this regard, Patañjali's Mahābhāṣya and Bhartṛhari's Vākyapadīya explicitly accept that śabda and apaśabda or apabhraṁśa, the "correct" and the "incorrect" word, can be equally expressive (MBh 1:8.21 samānāyām arthagatau śabdena cāpaśabdena ca; VP 1.27 arthapratyāyayanābhede; 3.3.30, asādhur ... vācakatvāviśeṣe vā). The relevant passages have been discussed in detail in Houben 1995: 237-242 and 1997: 336-341, where a difference in orientation was demonstrated between the verses of the Vākyapadīya (by Bhartṛhari) and the ancient Vṛtti, or, more precisely, the longer ancient Vṛtti (bṛhatī, to distinguish it from the, in significant respects different, laghuvṛtti). In the paper, some further details are added to this discussion. On a theoretical level, Bhartṛhari's position, according to his own statements as found in his magnum opus, the Vākyapadīya (VP), corresponds to the "hocus-pocus" position rather than to the "God's truth" position.
11 On the basis of the oft-cited words of Sir William Jones (1786): "The Sanskrit language, whatever may be its antiquity, is of a wonderful structure", and on the basis of Saussure's view that Sanskrit is an "ultra-grammatical" language (1916: 183), that is, for Saussure, an extremely systematic one, one should rather have expected that Bhartṛhari would accept and deal with the presence of "real structures" in Sanskrit. The presence of such structures, however, is precisely what Bhartṛhari emphatically denies (Houben 1993, 2009). His position on apabhraṁśa or our "Prakrit" is part and parcel of this denial of a given structure in language: it is the individual words that are substandard, incorrect or simply different. In view of the structural and lexical closeness briefly illustrated in the paper, it is hardly justified to speak of different "languages" or even of different "dialects" with regard to the idioms of "old" and "middle" Indo-Aryan which were contemporaneously in active use at the time of the early grammarians and Aśoka.
4.6 Several centuries later, when Sanskrit and several Prakrits have become literary languages, skilled poets are able to write verses which can be read in Sanskrit and one or more Prakrits at the same time. In view of the structural and lexical closeness briefly illustrated in the paper, it is hardly justified to speak of different "languages" or even of different "dialects" with regard to the idioms of literary Sanskrit and the literary Prakrits.
4.7 With this result we may go back to the "great linguistic paradox" of Louis Renou: the fact that "Middle Indian (Middle Indo-Aryan) makes its appearance in epigraphy prior to Sanskrit" (Renou 1956: 84). Renou links this to the choice of the Buddha, two centuries before Aśoka, to impart his teaching, at the basis of all later Buddhist doctrine, in Middle Indo-Aryan, "a Māgadhī or pre-Māgadhī dialect" (ib.).
Similarly, Mahāvīra had decided to impart his teaching, at the basis of all later Jaina doctrine, in Middle Indo-Aryan. King Aśoka, inspired by and converted to Buddhism, would therefore have ordered his inscriptions to be in Prakrit dialects as well: this would have remained the habit for inscriptions for centuries to come. For the Buddha's choice to teach in a Middle Indo-Aryan dialect, Renou refers, in the next paragraph, to a well-known narrative found in various Buddhist canonical texts according to which two monks, converted Brahmins, propose to put the discourses of the Buddha into chandas. In terms of linguistic knowledge available in the Buddha's time this can only mean: to transpose them into a text with Vedic metre and Vedic accents (in accordance with the phonetic, grammatical and metrical rules of some early Prātiśākhya treatise). The Buddha rejects the proposal, and encourages the monks, on the contrary, to transmit the speech of the Buddha sakāya niruttiyā, i.e., "in one's own mode of expression" (ib.). 12 Retrospectively, scholars have read in this story the rejection by the Buddha of the use of Sanskrit for his teaching. However, as no Sanskrit, in the strict sense of the term, can have been available in the Buddha's time as an identifiable linguistic option for communication, there can have been no rejection of this not yet existing Sanskrit by the Buddha. The passage is, moreover, clear in specifying that the rejection concerned chandas, which was at that time indeed an identifiable linguistic option, not so much for colloquial communication but for perpetuating a teaching. 
In a recent, extensive and brilliant analysis of several versions of the sakāya niruttiyā passage according to canonical texts of various Buddhist schools, Vincent Eltschinger has rightly drawn attention again to the interpretation of this passage in two schools whose canons are not in Pali but in Sanskrit: the Sarvāstivādins and the Mūlasarvāstivādins (Eltschinger 2017: 315f 13 ). The relevant passages of these schools, unfortunately available only in Chinese translation, clearly imply the rejection by the Buddha not of Sanskrit but of the adoption of metrical chanting and intonation of chandas for the transmission of the Buddha's teaching. This interpretation, which equally suits the well-known Pali version quoted and discussed for instance by Edgerton in 1953, 14 is not the result of an adjustment to the Sarvāstivādins' and the Mūlasarvāstivādins' use of Sanskrit as the language of their canons: it reflects the generosity of the Buddha's allowance to his monks to teach sakāya niruttiyā, "in one's own mode of expression," which should have included a whole range of Sanskritic and Prakritic language use, comprising also any predecessor of classical Sanskrit available in his time. These varieties, as we have seen, were in any case very close to each other and to a very large extent mutually understandable.
If, however, there is no indication that the Buddha would ever have rejected Sanskrit as an available linguistic option, the apparent "adoption" of Sanskrit by later generations of Buddhists necessarily appears in an entirely different light as well: this was then rather a matter of relative strength and growth of Buddhist communities or sects that were prone to accepting and developing Sanskritic language (grammar) and literature. The important and foundational contributions to the development of Sanskrit literature and grammar in the early centuries CE (e.g. by Aśvaghoṣa, mentioned above, by the Buddhist grammarian Candrācārya, referred to in Bhartṛhari's Vākyapadīya, and by the lexicographer Amarasiṁha) are then no longer betrayals of a linguistic choice of the Buddha, but legitimate explorations of one of the available options of language use, originally perhaps a minority option, left open by the Buddha.
4.8 Renou's linguistic paradox is therefore to a large extent based on an optical illusion, a trompe-l'oeil, as Renou himself to some extent realized (Renou 1956: 84), if we accept, on the one hand, that language options in the Indo-Aryan realm, from pre-Vedic times onwards, included a range of contemporaneous linguistic forms to which different levels of prestige were attached; and, on the other hand, that the different varieties were actually extremely close and to a large extent mutually comprehensible. The linguistic situation in ancient India evoked, to Louis Renou, German Switzerland, "where the normal means of communication is the dialect, and nevertheless German has the position of a spoken language" (Renou 1956: 87). A few years later, Swiss German would be one of the defining languages in Ferguson's definition of diglossia (1959), next to Arabic/Egyptian Arabic, Haitian Creole and Greek. The situation in India as reflected in literature and in the grammarians' examples and analyses, fully applicable at least in the area defined as Āryāvarta (according to the Manusmṛti, between the Himalayas and the Vindhya mountains and between the eastern and western sea 15), has indeed a remarkable parallel in the coexistence and interpenetration of High German and Schweizerdeutsch in Switzerland and matches the classical definition of diglossia, as demonstrated in Houben 1996a. The non-Indo-Aryan languages that must have been spoken by some communities in that realm apparently remained under the threshold of perception. The "extreme superposition" perceived by Pollock (2006: 50) and, in different terms, by Robert (2012), refers to a clearly distinct situation, in which Sanskrit (in Karnataka) or Chinese (in Japan) is incorporated into cultural and linguistic life while all actors are and remain sharply aware of its otherness and distant origin.

Conclusion
5.1 A new exploration of the language situation in ancient India at the time of the Buddha and the early grammarians is required in terms of the emergence of dialects, sociolects, and new languages, with, as a contrasting parallel, the contemporaneous development in ancient Persia as reflected in Old Persian inscriptions and in the Avesta. The following methodological observations have been made regarding fundamentally different ways of seeing languages.
Scholars are recognizing that languages are not always easily nor best treated as discrete, identifiable, and countable units with clearly defined boundaries between them .... Rather, a language is more often comprised of continua of features that extend across time, geography, and social space. There is growing attention being given to the roles or functions that language varieties play within the linguistic ecology of a region or a speech community. ... Languages can be viewed, then, simultaneously as discrete units (particles) amenable to being listed and counted, as continua of features across time and space (waves) that are best studied in terms of variational tendencies as examples of 'change in progress', and as parts of a larger ecological matrix (field), where functional roles and usage of the linguistic codes for a wide range of purposes are more in focus. (Lewis 2009)

Madhav Deshpande (2006: 141) rightly explained that [t]he notion of language family implies that languages B and C are branchings of a common ancestor A, and this fact of a genetic connection accounts for certain features. On the other hand, the notion of a linguistic area implies that languages A and B, though belonging to different language families and originally possessing different linguistic features come to share some of each others features over a long period of time through intense contact.

15 Subsequent descriptions of this area (esp. those in the Mahābhāṣya and in the Manusmṛti) point to an ecological transformation (from still largely forested area suitable to agro-pastoralism to an urbanized environment), which goes hand in hand with major transitions in ritual and religion (from Vedic to Buddhism): Houben 2011.
Here too, the 'linguistic area' model (in which languages appear in a 'field') is superimposed on a 'family' model (in which languages are discrete units generating new units over time). However, the latter's priority cannot always and everywhere be taken for granted.

5.2 Extensive research since the 19th century suggests that within the period that interests us, from 1000 BCE to 1000 CE, Old Persian, Avestan, Vedic, Middle Indo-Aryan and classical Sanskrit evolved within a large area of Indo-Iranian dialect continuity (Meillet 1908: 24-30), from 'linguistic area' to 'linguistic area', with several shifts of the geographical centre of gravity, from Persepolis to Gandhara and from the northwest of the Indian subcontinent to the central Gangetic plain, and to India's southern states (the Deccan and further south). Apart from "time" and "geography", it is indispensable to take into account a third parameter throughout this period and throughout the large area of the partly overlapping Iranian and Indian "worlds": the parameter of sociolinguistic variation between a pole of high prestige characterized by elaboration and sophistication (for instance in a "Dichtersprache"), and a pole of lower prestige characterized by ease of access and solidarity. The term āriya / ārya 'noble' as a qualifier of speech is occasionally attested in this large area, in a multilingual context (in the multilingual Behistun inscription) in the Iranian part, and in a diglossic context (ŚĀ, passage on the language spoken in assemblies 16 ) in the Indian part. Almost contemporaneously with these uses but far in the east, the Buddha proposed an ethical focus or reinterpretation of ārya 'noble', with 'nobility' being dependent on behaviour and effort, and independent of acceptance of a hereditary 'nobility'. 
5.3 Under some conditions it may be appropriate to attribute primary status to a model of "family" relationships between languages as "particles" or as discrete units, for instance with regard to languages that survive and remain relatively stable in mountainous areas. 17 For languages that flourish in areas of intensive contacts a simple genetic model may be entirely inadequate, especially when the 'linguistic area' applies also to what can be reconstructed for earlier periods (cf. Pinault 2002). When studying emerging languages such as the early stages of classical Sanskrit and literary Prakrit, these should obviously not be posited as discrete units. Invoking the concept of "hybridity" in connection with the name of a well-defined language (as in "Buddhist Hybrid Sanskrit," "Epigraphical Hybrid Sanskrit") was only a stopgap solution when dealing with languages or idioms that were emerging, successfully or without lasting success, as standards or as roofing language, or that were disappearing. The study of the emergence and disappearance of new standard languages is a large field of study to which sociolinguistics has contributed significantly in recent decades. Concepts used with regard to the evolution of new standards in Germanic (Goossens 1985) or Romance languages (Muljačič 1986, 1989, 1993) can and, for a better scientific grasp on the subject, should be applied and tested with regard to Indo-Iranian, Indo-Aryan and Sanskrit. 
The so-called Sanskrit "Hybrids" in the first millennium CE, including the Prakrits and Epic Sanskrit from the time of Aśoka onwards, are then rather to be regarded as emerging "Ausbau" languages of Indo-Aryan with hardly any significant mutual "Abstand" before they will be successfully "roofed," in the second half of the first millennium CE, by "classical" Sanskrit, for which Pāṇini and Patañjali, filtered by the work of Buddhist grammarians such as Candrācārya (contributing, inter alia, to a definitive abandonment of linguistic accent and of the subjunctive), will become authoritative. The appropriate question to ask with regard to Pāṇini "as a variationist" and his period would then not just be: what was "actual Sanskrit usage" giving the "best possible fit" with the rules (Kiparsky 1979: 5-6), but rather what was the diglossic range within which he and the intended public of his grammar were functioning.