A corpus study of grammatical differences between Uruguayan and Argentinian Spanish

Isogloss 2020, 6/6. David Ellingson Eddington

This paper explores five grammatical features in Argentinian and Uruguayan Spanish using the Corpus del español (Davies 2017). The goal is to find features that distinguish the speech of the two countries. The features studied are: (1) stress variation in 2nd person singular present subjunctive forms (e.g. téngas ~ tengás), (2) number agreement with había (e.g. habían ~ había muchos casos), (3) use of vos following prepositions (e.g. con vos ~ contigo), (4) use of present perfect versus preterite to express completed actions (e.g. recién he comido ~ comí), (5) use of the present or past subjunctive in embedded clauses preceded by a matrix clause containing a subjunctive trigger in the past tense (e.g. Nos mandaron que rellenáramos ~ rellenemos los papeles anoche). Statistical analyses were carried out on the proportion of each variant across the two countries.


Introduction
It is often the case that the capital city of a country is not only its economic center but also home to the prestige variety of the language. In this regard, it is interesting that the capitals of Uruguay and Argentina are separated by only 203 kilometers. The short distance itself is not what makes the case unusual. The distance between San Salvador and Guatemala City is only 240 kilometers, but that interval is inhabited by speakers, which allows for a dialect continuum between the two cities. In contrast, the 203 kilometers between Buenos Aires and Montevideo are filled with the waters of the Plate River rather than with speakers, which makes the two capitals essentially contiguous.
Rioplatense is the name commonly applied to the variety of Spanish spoken in both cities, suggesting that there is a unity between them (Lipski 1994, Lope Blanch 1968). If one asks an inhabitant of Buenos Aires or Montevideo whether they can tell which city someone is from by their speech, some will say it is impossible. Others will vigorously affirm that speakers in the other city are easily distinguished because their speech is more subdued or more singsong, attributes that are difficult to quantify. Differences in intonation may actually be a marker. For example, Colantoni & Gurlekian (2004) argue that Buenos Aires intonation differs significantly from other Spanish varieties, but whether their findings distinguish the speech of Montevideo has yet to be determined. Of course, there are lexical items that serve as regional shibboleths as well (Table 1). In addition to these lexical differences, a well-documented distinction is found in second person singular terms of address and their corresponding verbal inflections. Whereas forms such as vos tenés 'you have' are firmly entrenched in Buenos Aires, forms of address vary to some degree to include tú tenés and tú tienes in Montevideo (Bertolotti & Coll 2003, Bertolotti 2011, Weyers 2013), although these forms are more commonly encountered in the interior regions farther from Montevideo. Elizaincín (1984) asked whether it is possible to find characteristics in Uruguayan speech that set it apart as a distinct rioplatense variety. The purpose of the present paper is to attempt to answer that question. It will do so in two ways. The first is to use corpora to uncover other differences that may not have been considered previously. The second is to use the same corpora to examine some of the between-country variations that have already been discussed, and to shed some quantitative light on them.
Of course, given the geographic proximity, as well as the historical and cultural similarities between these two countries, any differences are expected to be a matter of degree rather than binary. Such is the nature of language, which is why it merits a statistical approach, since even some of the purported lexical regionalisms are gradient when observed more closely (Table 2).

The corpus study
All data were gleaned from the Corpus del español / Web dialects (Davies 2016), except when noted otherwise. This corpus was compiled recently and 60% of it derives from blogs, meaning it covers more informal registers quite well. This is important because highly edited materials from printed sources are less likely to demonstrate the regional differences explored in the present paper. The Corpus del español / Web dialects includes 38.7 million words from Uruguay and 169.4 million from Argentina. Unfortunately, the city or province of the speakers in the corpus is not recoverable, only their country of origin. However, since roughly a third of all Uruguayans live in the greater Montevideo area and a third of all Argentinians reside in or close to Buenos Aires, a good deal of the data could be classified as rioplatense. The case could also be made that city dwellers are more likely than rural inhabitants to have internet access and to keep blogs, which is another argument suggesting that the speech of the capital cities is well represented in the corpora. Nevertheless, each country houses a number of language varieties, and the findings of this paper ultimately reflect linguistic differences based on political rather than linguistic boundaries.

As a non-linguistic example of the usefulness of corpora, consider the question of consumption of yerba mate 'Ilex paraguariensis'. Which country is more matero 'yerba mate drinking'? A search for tomar mate yielded 261 instances in Uruguay (UY) and 550 in Argentina (AR). Of course, the raw totals cannot be compared on equal footing since they are derived from corpora of different sizes. However, when divided by their respective corpus size in millions of words, tomar mate (or its inflectional variants) occurs 3.25 times per million in Argentina and 6.74 times per million in Uruguay, suggesting that Uruguayans talk about mate, and probably drink it, more than their neighbors to the west.
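The normalization used for tomar mate can be reproduced with a few lines of Python. This is only a minimal sketch: the hit counts and corpus sizes are the ones reported above, while the function name `per_million` and the dictionary layout are illustrative assumptions, not part of the original study.

```python
def per_million(hits: int, corpus_size_millions: float) -> float:
    """Return the number of occurrences per million words."""
    return hits / corpus_size_millions

# Figures reported in the text: (hits for "tomar mate", corpus size in millions of words).
counts = {"UY": (261, 38.7), "AR": (550, 169.4)}

# Normalize each raw count by the size of its own sub-corpus.
rates = {country: per_million(hits, size) for country, (hits, size) in counts.items()}

for country, rate in rates.items():
    print(f"{country}: {rate:.2f} per million words")
```

Dividing by corpus size is what makes the two countries comparable: Argentina has more raw hits, but its sub-corpus is more than four times larger.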

Non-standard present subjunctive forms with final stress
The present subjunctive vos forms of verbs have been observed to vary in their stress placement (Fontanella de Weinberg 1979). This is most likely a change due to analogy with present indicative vos inflections, which are finally stressed (e.g. podés, conversás). The alternation is between subjunctive forms such as téngas and tengás, and entiéndas and entendás (where accent marks indicate stress placement). Bertolotti & Coll (2014) as well as Elizaincín (1984) argue that the finally stressed variants are characteristic of Buenos Aires speech, but are not found in Uruguay.
Subjunctives stressed on the final syllable are observed to occur more often in negative imperatives (Fontanella de Weinberg 1979), but are not limited to that context. Therefore, in order to examine this alternation, the variant forms of the 2nd person singular present subjunctive of 29 frequent verbs were searched for in the corpus (Table 3), regardless of whether they appeared in an imperative or not. These verbs were chosen because a finally stressed version appeared in the corpus. There are, of course, some issues with using the corpus for this task. The difference between the forms is stress, which in written form must be marked with an accent mark on the non-standard forms. Given the less formal nature of much of the corpus data, writers are less likely to adhere strictly to orthographic norms and may omit accent marks on finally stressed forms such as vengás. In a similar vein, they may not even perceive that the stress is final in their own speech, much less mark it with an accent mark. In Uruguay, where vos forms alternate with tú forms, the mere appearance of comas in the corpus cannot reveal whether the word is meant to be tú cómas or vos cómas / comás. For these reasons the results from this corpus study must be considered tentative. Nevertheless, keeping those issues in mind, the results of the corpus search indicate that in Uruguay .012 of the present subjunctive vos forms have final stress, while the proportion in Argentina is somewhat higher at .018. The proportion of finally stressed forms is higher in Argentina in 27 of the 29 forms, and the difference between the countries is statistically significant. The effect is small to medium (z = 28.5, p < .001, Cohen's d = -.739). As already noted, the web dialects corpus is comprised of 60% blogs. The question is whether these non-standard subjunctive forms may be found in more formal and more carefully edited materials.
The Corpus del Español / News On the Web corpus contains 6.9 billion words derived from newspapers and magazines. Searches for the same words were conducted in this corpus (Table 4). Once again, 26 of the 29 words had final stress more often in Argentina than in Uruguay. The difference is significant, but the effect size in these data is so small as to be negligible (z = 53, p < .001, Cohen's d = -.154). Perhaps the most important finding here is not the difference between the two countries, but the fact that the non-standard stress pattern on these subjunctive inflections is not limited to Argentina, as has been suggested previously. It is in fact found on both sides of the River Plate, although to varying degrees. More careful research along the lines of Johnson & Grinstead (2011) must be carried out in both Argentina and Uruguay before definite answers to this question can be found.
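The per-verb tallies behind comparisons such as "27 of the 29 forms" can be sketched as follows. The counts below are hypothetical placeholders for three verbs only, invented for illustration and not taken from Tables 3 or 4. The helper name `final_stress_share` is likewise an assumption.

```python
def final_stress_share(final: int, penultimate: int) -> float:
    """Proportion of tokens with final stress (e.g. tengás) out of all tokens of the form."""
    return final / (final + penultimate)

# HYPOTHETICAL counts per verb and country: (finally stressed, penultimately stressed).
counts = {
    "tengás": {"UY": (3, 297), "AR": (9, 291)},
    "vengás": {"UY": (2, 198), "AR": (5, 195)},
    "podás": {"UY": (4, 96), "AR": (2, 98)},
}

# Convert raw counts to proportions for each verb and country.
shares = {
    verb: {country: final_stress_share(*pair) for country, pair in by_country.items()}
    for verb, by_country in counts.items()
}

# Count the verbs whose finally stressed variant is proportionally more common in Argentina.
higher_in_ar = sum(1 for s in shares.values() if s["AR"] > s["UY"])
print(f"{higher_in_ar} of {len(shares)} verbs show more final stress in AR")
```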

Number agreement with haber
In the present indicative tense, the existential use of haber has a single inflection, hay, which has no plural counterpart. In the imperfect, however, había alternates with habían, although the latter is considered incorrect in prescriptive grammars (Real Academia Española 2005: 330-331). Speakers who use había before both singular and plural arguments appear to interpret the subject of había as an unexpressed element in the language. On the other hand, when habían is used, speakers interpret the plural argument following the verb as its grammatical subject. The extant variation has been shown to be influenced by linguistic and social factors in a number of other countries (Bouzouita & Pato 2019, Claes 2016). The corpora were searched for plural nouns and adjectives appearing after había and habían, as well as 15 other plural modifiers (Table 5).¹ A great deal of variation is observed in the use of habían before plurals. It is less common in Argentina (.05), while in the Uruguay corpus the proportion is .11. A Wilcoxon signed-rank test indicates that the .06 difference between Uruguay and Argentina is not only significant (z = 123, p < .001), but also that the size of the effect is not small (Cohen's d = .524), suggesting that this grammatical usage is one that may distinguish the varieties of the two countries.

¹ There are only 7 total instances of habemos + past participle in the two countries studied, which is not enough to warrant inclusion of this inflection.

Use of vos or ti following prepositions
In varieties of Spanish that use voseo there is a good deal of variation as to which form of the stressed pronoun appears after prepositions. As far as para is concerned, Fontanella de Weinberg (1999) and the Real Academia Española and Asociación de Academias de la Lengua Española (RAE & ASALE 2009: 1264) note variation in Montevideo between para ti and para vos. According to RAE & ASALE (2009: 1264), in Argentinian Spanish vos appears following para and con, while in Uruguay there is more variation between para ti and para vos as well as between contigo and con vos. Weyers and Canale (2013) found contigo to be the preferred form in Montevideo, while con vos was preferred in Buenos Aires. The use of ti and vos was observed in the corpus following six prepositions (sin, hacia, de, por, para, con). As Figure 1 indicates, sin ti, hacia ti, de ti, and por ti are more frequent than their counterparts with vos in both countries. As far as para is concerned, Argentinians are about equally split between para ti and para vos, while Uruguayans use para ti somewhat more. This is not in line with previous studies that suggest that para ti is rare in Argentina, at least in Buenos Aires. The most pronounced difference between the countries is in their use of con vos and contigo. Argentinians rarely use contigo and strongly prefer con vos. While it is true that contigo is much more common in Uruguay, con vos is still more frequent in that country than contigo.

Figure 1. Proportion of vos following six prepositions.
The difference between the two countries is not significant for sin (z = -.067, p = .944), hacia (z = .113, p = .912), or de (z = -1.295, p = .194). For por, however, there is a trend (z = -1.953, p = .051); por vos is somewhat more common in Argentina than in Uruguay. In like manner, para vos is more common in Argentina than in Uruguay (z = -4.715, p < .0001). Figure 1 illustrates that con vos is more frequent than contigo in both countries, but significantly more common in Argentina (z = -30.729, p < .0001). It is fair to say that the use of contigo is a Uruguayan shibboleth.
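The text does not name the test behind these z values. One common way to compare two proportions is the pooled two-proportion z-test, sketched here as an assumption rather than as the method actually used. The counts are hypothetical and the function name `two_proportion_z` is illustrative. Subtracting the Argentinian proportion from the Uruguayan one makes z negative when vos is more common in Argentina. This matches the direction of the signs reported above, but the sign convention is also an assumption.

```python
import math

def two_proportion_z(x1: int, n1: int, x2: int, n2: int) -> tuple[float, float]:
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # proportion under the null of no difference
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# HYPOTHETICAL counts: tokens of "para vos" out of all "para vos" + "para ti" tokens.
uy_vos, uy_total = 40, 100
ar_vos, ar_total = 50, 100

z, p = two_proportion_z(uy_vos, uy_total, ar_vos, ar_total)
print(f"z = {z:.3f}, p = {p:.3f}")
```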

Preterite versus present perfect
Three tenses are used to express past actions in Spanish depending on factors such as aspect: imperfect, preterite, and present perfect. Research has shown that in Peninsular Spanish the present perfect is encroaching on the preterite (Schwenter & Torres Cacoullos 2008). In this regard, it lags behind other Romance languages such as Standard French and Italian that have already ousted the simple past tense in favor of the perfect. In Latin American varieties, on the other hand, the preterite more clearly dominates, although in some American varieties the present perfect has taken over certain functions of the preterite (Howe & Schwenter 2003). In Argentina, the preterite has been gaining ground over the present perfect since the 19th century (Rodríguez Louro 2009). The decline of the present perfect in Buenos Aires is further attested by the fact that it is used much more by older speakers (Burgos 2004, Rodríguez Louro 2009), suggesting an apparent time change in which the preterite is coming to dominate. In Uruguay, the preterite is also more prevalent than the present perfect (Caviglia & Malcuori 1994); however, Fløgstad (2016) and Henderson (2010) provide evidence that the present perfect is more common in Uruguay than in Argentina.
One context in which the preterite and present perfect vary is when expressing recently occurring actions and past actions that have relevance for the present moment. These contexts occur with adverbials such as recientemente and esta mañana. The proportion of preterite and present perfect tenses following 11 adverbials of this kind was taken from the corpus and calculated for all countries. Unsurprisingly, Spain prefers the perfect over the preterite at a rate of .64, while Uruguay (.34), Paraguay (.30), and Argentina (.29) occupy the last three places. The use of the perfect in Uruguay is not only significantly higher than in Argentina (t(10) = -4.37, p = .0014), but Cohen's d (1.32) indicates that the effect of country is large. Uruguayans use the perfect tense in these cases more than Argentinians. Why is it that Uruguay and Argentina, along with Paraguay, are at the forefront of the rise of the preterite at the expense of the present perfect? All three countries had large numbers of Italian immigrants (Calafut 1977, Oddone 1994, Pidoux de Drachenberg 1975). Fløgstad (2016) argues that the loss of the perfect tense in Argentina, but not in Uruguay, is the result of simplification due to language contact. She does not discuss Paraguay, but notes that there were large influxes of Italians from the northern regions of Italy, where the present perfect dominated, as well as from the south, where the preterite was more common, so assigning the change in Argentina to a particular variety of Italian is difficult. Standard Italian has lost the preterite in favor of the present perfect, which further complicates the issue. Instead of attributing the change to Italian influence, Fløgstad rightly suggests that the stage for the demise of the present perfect may already have been set, and language contact only served to intensify it (182).
She further claims that the reason the present perfect is more common in Uruguay is that between 1905 and 1914 Uruguay had welcomed only a tenth of the new immigrant population that Argentina had (184), which suggests the change may be attributable not to a large Italian population, but to a large overall immigrant population. This is supported by Moya (2008), who counts 6,501,000 European immigrants to Argentina between 1840 and 1930, 713,000 to Uruguay, and only 21,000 to Paraguay. A rough estimate of the proportion of immigrant population may be calculated by dividing these numbers of immigrants by the population of each country in 1939.² This yields .47 in Argentina, .37 in Uruguay, and only .02 in Paraguay. Therefore, this kind of grammatical simplification may simply be the result of significant numbers of adult immigrants learning a foreign language (McWhorter 2015) rather than of Italian influence per se.

Sequence of tense in past subjunctive
In general, when a matrix clause contains a trigger for subjunctive in the embedded clause, the tense of the matrix clause determines that of the embedded clause (e.g. Nos mandaron que rellenáramos los papeles anoche. Nos mandan que rellenemos los papeles ahora). There are, however, instances where the sequence of tense may be violated, which have been studied in detail (Carrasco Gutiérrez 1999, Guajardo 2018, Laca 2010, Suñer & Padilla-Rivero 1987, Quer 1998). In countries such as Bolivia, Paraguay, and Argentina, the present subjunctive in the embedded clause is much more likely to be found even when the matrix clause is in the past tense (Guajardo 2017, Sessarego 2008). This phenomenon was studied using the corpus. In order to do this, sentences containing 17 triggers of subjunctive in a matrix clause were taken from the corpus. After that, the number of present and past subjunctive verbs in the embedded clause was tallied, and the proportion of past and present subjunctive was calculated (Table 10). As observed in previous studies, the countries with the highest use of present subjunctive in embedded clauses preceded by past tense matrix clauses are Bolivia (.57), Ecuador (.56), Paraguay (.42), and Argentina (.38). On the other end, Cuba (.06) is the least likely to use the present subjunctive in these cases. In stark contrast to Argentina, Uruguay does not really participate in the attrition of the past subjunctive; the proportion of present subjunctive forms there is only .14. This difference between Uruguay and Argentina is not only significant (Z = 16, p < .006), but the effect size of the difference is large (Cohen's d = -1.05). In other words, there is a major usage difference between the east and west sides of the Plate River. In countries with large Native American populations such as Bolivia, Paraguay, and Ecuador, the loss has been attributed to the large numbers of people who acquired Spanish as a second language (Guajardo 2017).
Since the Native American population is much sparser in Argentina, Guajardo argues that the encroachment of the present subjunctive on the past subjunctive in embedded clauses in Argentina is due to the influence of Italian immigrants. However, one difficulty with pinning the change on Italian is that Standard Italian uses both past and present subjunctive in embedded clauses just as Standard Spanish does:

(1) a. Voglio che tu venga alla festa (Italian)
       want.1SG that you come.SBJV.2SG to.the party
    b. Quiero que tú vengas a la fiesta (Spanish)
       want.1SG that you come.SBJV.2SG to the party
       'I want you to come to the party.'

(2) a. Volevo che venissi alla festa (Italian)
       want.1IPFV.SG that come.SBJV.IPFV.2SG to.the party
    b. Quería que vinieras/vinieses a la fiesta (Spanish)
       want.1IPFV.SG that come.SBJV.IPFV.2SG to the party
       'I wanted you to come to the party.'

If this is the case, why would maintaining the sequence of tense cause problems for Italian immigrants when acquiring Spanish? In order to claim that transfer from Italian is the source of the change, one would need to show that the dialect of the Italian immigrants does not have the same sequence of tense as Spanish does. The data here do not support this idea.
There was, of course, a massive influx of Italian speakers into Argentina in the late 19th and early 20th centuries, and their acquisition of Spanish may have played some role in the change. The fact that this change is much less prevalent in Uruguay casts some doubt on Italian influence, since Uruguay participated in much the same Italian immigration as Argentina (Di Tullio & Kailuweit 2011). However, there was a much larger total immigrant influx between 1840 and 1930 in Argentina than in Uruguay (Moya 2008). Once again, this may be a case of the language simplification that occurs when a large number of immigrants learn a new language as adults (McWhorter 2015). That is, it may have been the overall immigrant population, not necessarily the Italian-speaking population, that brought about this change. If immigration is the cause, one would expect the change to have begun during the mass immigration phase. The historical aspect of this change needs to be explored to test this. However, if the loss of the past subjunctive in Argentina is actually more recent, it simply may not have existed long enough to spread into Uruguay. This is an issue which certainly merits further attention.

Conclusions
The purpose of this study has been to use corpus data to explore five grammatical features that may serve to distinguish the linguistic variety of Uruguay from that of Argentina. One finding is that existential haber in the imperfect tense agrees with the following plural argument more often in Uruguay (e.g. Habían varias maneras) than it does in Argentina (e.g. Había varias maneras). The most significant results, however, have to do with the use of the present perfect, which in Latin America has generally been giving way to the preterite. This is especially the case in Argentina, while Uruguay conserves more present perfect usages than its neighbor. The other noteworthy way in which grammar in Argentina and Uruguay differs is in the use of the past subjunctive in embedded clauses. In Argentina, along with a number of other countries, the present subjunctive is replacing the past subjunctive in embedded clauses following a matrix clause in the past tense (e.g. Nos pidieron que lo hagamos, y decidimos que no nos correspondía la tarea). Lack of sequence of tenses occurred at a rate of .38 in Argentina, but only .14 in Uruguay. Since there are a few cases where the contravention of sequence of tense rules is grammatical, these numbers suggest that the elimination of the past subjunctive has not reached into eastern rioplatense.
In some ways, sociolinguistics has not made wide use of corpora because they lack information about important variables such as gender, age, and social class. However, corpora can suggest broad regional differences, which may in turn serve as the basis for more fine-grained dialectal and sociolinguistic investigation.