Nominalized adverbs in Spanish: the intriguing case of detrás mío and its cohorts

Instances of adverbs modified by adjectives (e.g. detrás mío, delante tuyo) were extracted from the Corpus del Español. The corpus analysis reveals that these constructions are attested in all 21 Spanish-speaking countries to varying degrees, but are most frequent in Argentina and Uruguay. Adjectives following the adverbs in question are predominantly masculine; however, in Peninsular varieties feminine forms are quite common. Although alrededor and lado are both adverbs as well as masculine nouns, they are occasionally followed by feminine adjectives (e.g. al lado suya), which is arguably due to the use of the feminine in other constructions such as encima mía and debajo nuestra.


INTRODUCTION
The use of locative adverbs followed by stressed possessive pronouns is extremely common in Spanish: detrás tuyo, delante nuestro, encima mía, arriba suyo. It has been documented throughout the Spanish-speaking world (Kany 1976; Llorente 1980; Caravedo 1996; Hernández Alonso 1996; Vaquero 1996; Aleza Izquierdo 2010; Gómez Torrego 2011; Salgado and Bouzouita 2017). However, some note that it is more common in Latin American Spanish, especially in rioplatense (Kany 1976, Real Academia 1973, Real Academia 2009). Almost as common as the construction itself are the many prescriptive language purists who denounce this usage as a barbarism (e.g. Butt and Benjamin 2011, Gómez Torrego 2000, González Calvo 2006, Jiménez Arias 2011). The Real Academia Española is no exception (2005). The complaint of these language mavens is that delante, debajo, cerca and the like are clearly adverbs, yet unscrupulous Spanish speakers treat them as if they were nouns that allow modification by possessive pronouns.
As proof that adverbs such as detrás are not nouns, the Real Academia Española suggests this test:¹ Estoy al lado suyo is acceptable because lado may appear after a possessive pronoun: Estoy a su lado. However, since phrases such as *estoy a su detrás are not observed, detrás is demonstrably not a noun, and phrases such as *estoy detrás suyo are vehemently proscribed. Yet, if we follow this line of reasoning, it actually leads to a quite different conclusion, much to the chagrin of language purists. The following are a few examples of the kind of constructions that are actually attested.²
(1) Ella también se para y yendo hacia él, se pone en su detrás y lo abraza.³
'She stands up too, and going toward him she gets behind him and hugs him.'
(2) Los dos caballos iban tranquilos por mi delante algo distraídos.⁴
'The two horses were going calmly in front of me somewhat distracted.'
(3) Apagó la luz se puso en mi encima claro yo estaba muy exitada.⁵
'He turned off the light, got on top of me. Of course I was very turned on.'
(4) Aunque por mi dentro me decía ...⁶
'Even though inside of me I said ...'
Alrededor is both a noun and an adverb, although as a noun it generally appears in the plural, and just as alrededor has a dual noun/adverb status, adverbs such as detrás have, at least for many speakers, undergone nominalization.
This situation is reminiscent of the word fun in English, which began as a noun (De Smet 2012). For years prescriptivists insisted that funner could not be a word since -er only attaches to adjectives. But of course, fun had also begun to function as an adjective, hence funner, the occasional more fun and even funnest. It is worth mentioning that although most decry the use of detrás mío and the like, some express more liberal leanings and describe it as incorrect but "explicable" (Gómez Torrego 2009: 166) or grant that such unpalatable expressions still have "su razón de ser" (González Calvo 2006: 68).
The question is no longer whether detrás and its bedfellows are becoming nouns, but why and how they have progressed in the language. Some suggest that since adverb+possessive pronoun constructions such as devant nostre are rampant in Catalan, they must have spread to Spanish (Kany 1976, Martínez de Sousa 1996). The same construction is also attested in Galician (Silva Domínguez 1995). Perhaps Peninsular Spanish was sandwiched between these two languages in such a way that it could not help but catch this construction from both of its neighbors' doorknobs. However, other than geographic proximity it is unclear how this could be proven.
The other possibility is that it was an inside job, analogy no less. In the same way neogrammarians supposed that historical changes would have been perfectly regular had analogy not reared its ugly head, analogy may have put another construction into disarray. Here, the analogy could be from two sources. The first involves analogy from the possessive construction (Davies 1966, Real Academia Española,⁷ Hernández Alonso 1996).
In the second, the source is the Jekyll and Hyde expressions such as alrededor and al lado that sometimes appear as adverbs, but at others transform themselves into nouns (Gómez Torrego 2000).
Regardless of the origin of the detrás mío type constructions, they exist, and borrowing the words of George Mallory, detrás mío deserves to be studied "because it's there." The present study employs a corpus approach to this topic using the newly updated Corpus del Español, which now includes data by country. It attempts to answer a number of questions: Which countries use the nominalized adverbs most often? Which adverbs are nominalized the most? What gender is assigned to the newly converted adverbs? How is their gender distributed by country? Do adverbs ending in -a take feminine possessives more than those with other endings? Before proceeding to the corpus study itself, a review of the findings of previous studies is in order.

PREVIOUS CORPUS STUDIES
Marttinen Larsson and Alvarez López (2017) extracted 541 instances of adelante, abajo, arriba, atrás, cerca, debajo, delante, detrás, encima, enfrente, junto and lejos ('in front, below, above, behind, close, below, in front, behind, on top, in front, next to, far') followed by mío/a, suyo/a and tuyo/a from three corpora: Corpus Diacrónico del Español,⁸ Corpus de Referencia del Español Actual,⁹ and Corpus del Español del Siglo XXI.¹⁰ They excluded nuestro/a because in a string such as detrás nuestro it is possible that nuestro does not modify detrás (e.g. Una vez dentro nuestro trabajo no ha concluido). They found instances of the constructions under discussion in all Spanish-speaking countries with the exception of Honduras, the Dominican Republic and the Philippines, but this is most likely due to the scarcity of data from those countries in the consulted corpora. Spain and Argentina presented the highest number of instances, but the consulted corpora did not include equal representation from each country, so these raw data cannot be used for comparison. The historical data they uncovered are more revealing. Their study revealed one case of detrás suyo from 1542, but the construction did not appear again until the first decade of the twentieth century. From then on the trend was generally upward, with some fluctuations in each successive decade. As far as the gender of the possessive pronoun is concerned, they observed in their corpus that feminine forms were exclusive to Peninsular Spanish, with the first attestations in the 1950s.
Santana Marrero (2014) built a corpus by searching Google Noticias for delante, detrás, encima, debajo, cerca, lejos, enfrente and alrededor followed by stressed possessive pronouns. She limited the data by including only instances from the first 20 pages resulting from each search. This yielded 187 instances of adverb+possessive pronoun. This construction was most common with the adverbs delante and detrás and least common with debajo and lejos. The adjectives suyo/a and mío/a appeared most frequently following the adverbs (51% and 31%, respectively), while cases of nuestro/a and vuestro/a were unusual (6% and 0.5%, respectively). However, it is not clear if she excluded cases with nuestro/a and vuestro/a that do not modify the adverb. Marttinen Larsson and Alvarez López (2017) did not include any instances of nuestro/a and vuestro/a in their study due to their inherent ambiguity.
In about 94% of the cases in Santana Marrero's (2014) corpus, the possessive pronoun was masculine. No cases of feminine possessives were observed after debajo, cerca or alrededor. The idea that the possessive may agree with the gender of the referent finds no support in her data either. As far as geographical distribution is concerned, she found cases throughout the Spanish-speaking world, but more so in Latin America, especially Argentina. Once again, this finding needs to be moderated by the fact that the internet contains widely varying amounts of data from each Spanish-speaking country, a fact that was not controlled for.
The goal of Benítez Burraco (2016) was an analysis of spoken, rather than written language. Her corpus consisted of 1,957 constructions. The principal purpose of her paper was to demonstrate how to use AntConc to carry out a corpus analysis. The source of her corpus was customer comments from www.tripadvisor.com, which are usually quite informal in tone. She observed cases of abajo, adelante, adentro, alrededor, arriba, atrás, cerca, debajo, delante, dentro, derecha, detrás, en medio, en torno, encima, enfrente, lado and metros ('below, in front, inside, around, above, behind, close, below, in front, inside, right, behind, in the middle, around, on top, in front, side, meters'), of which delante, lado and detrás were most frequent. Mío/a, tuyo/a and nuestro/a were the most frequent possessives, although it is again unclear how or whether ambiguous cases of nuestro/a were dealt with. Among her findings are the existence of both masculine and feminine possessives, along with the fact that the gender of the possessive does not correspond to the biological gender of the referent. Her corpus does not include the country of origin of the speakers.
Ruiz Tinoco (2013) created a corpus of tweets from Twitter to study a number of different topics in Spanish usage including detrás mío type constructions. His focus was on US Spanish, and for this reason his corpus contained tweets from East Los Angeles, San Diego, Tucson, Albuquerque, El Paso, San Antonio, McAllen and Houston. For contrastive purposes he also included the Mexican cities of Tijuana, Ciudad Juárez, Chihuahua, Monterrey and Reynosa. The principal finding of the two pages he dedicates to the topic is that such constructions exist on both sides of the border, and both masculine and feminine adjectives modify the nominalized adverbs.
Salgado and Bouzouita (2017) based their study on corpora of Peninsular Spanish, where they found 187 cases of constructions such as delante mío in many different parts of the country. In Spain, adjectives following adverbs tended to be feminine regardless of whether the adverbs end in -a or -o. This tendency was especially strong in Andalusia.
There are two principal differences between the aforementioned studies. One is whether they cover a particular country, many different countries or combine data from all countries in the analysis. The other difference is the particular kind of corpus used: spoken versus written, Twitter versus Google Noticias, etc. Since corpora have only been used to study nominalized adverbs in Spanish in the last few years, these studies were generally carried out as initial forays into the topic. As a result, most were undertaken in relative isolation from each other.

METHODOLOGY OF THE CORPUS STUDY
The present study is motivated by the small numbers of instances of the adverb+possessive pronoun constructions on which previous analyses were based. Of course, Benítez Burraco (2016) compiled 1,957 instances, a respectable number that allows differences between the 18 words she obtained data for to be examined. In her study, there was an average of 108.7 instances for each of the 18 words. However, she did not include country as a variable. If the 18 words were multiplied by 21 countries, that would yield 378 cells, and in that case 1,957 instances would average only 5.2 instances per cell.
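The cell arithmetic in the preceding paragraph can be checked with a short sketch. The function name mean_per_cell is a hypothetical helper introduced here for illustration only; the figures (1,957 instances, 18 words, 21 countries) are those reported above.

```python
def mean_per_cell(instances: int, words: int, countries: int = 1) -> float:
    """Average number of instances per cell in a words-by-countries grid.

    Hypothetical helper for illustration; not part of any study's tooling.
    """
    cells = words * countries
    if cells <= 0:
        raise ValueError("the grid must contain at least one cell")
    return instances / cells


# Figures reported above for Benítez Burraco (2016).
per_word = mean_per_cell(1957, 18)                # 18 cells: one per word
per_cell = mean_per_cell(1957, 18, countries=21)  # 378 cells: word x country

print(f"{per_word:.1f} instances per word")
print(f"{per_cell:.1f} instances per word-country cell")
```

Adding country as a second dimension multiplies the number of cells by 21, so the same sample becomes far too sparse to compare individual words across countries.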
In order to address these issues more fully, the present study sought a much larger data set that included data from 21 countries. The data were derived from the web dialect section of the Corpus del Español,¹¹ a two-billion-word corpus whose size, in millions of words, differs from country to country. It was hoped that many more instances of detrás mío type constructions would result from this corpus when compared to the smaller samples the extant corpus studies were based on. One advantage of this corpus is that about 60% of the data come from blogs, so that it covers informal registers quite well.
Of course, corpora do have their limitations and drawbacks. Typographical errors in the source documents, as well as errors introduced in the compilation and tagging process, are always an issue for corpora. The country of origin in this particular corpus was determined by Google's algorithm, which is not foolproof and may assign a document to the wrong country. In like manner, the fact that a blog was written in one country does not exclude the possibility that its author may actually be from another. Another potential problem is that the corpus is divided along national boundaries. Although the data are taken from individual countries, one country may house several dialects that differ in how they deal with the constructions being considered. Additionally, if the corpus happens to incorporate a document in which one particular author uses many of these constructions, that author may skew the results for that country. In spite of these issues, the new Corpus del Español is currently the best source for looking at adverb+possessive pronoun variation by country.
Searches were performed for abajo, adelante, adentro, al lado, alrededor, arriba, atrás, cerca, debajo, delante, dentro, detrás, encima, enfrente and lejos ('below, in front, inside, beside, around, above, behind, close, below, in front, inside, behind, on top, in front, far') followed by mío/a, tuyo/a, suyo/a, nuestro/a and vuestro/a.¹² Alternate spellings such as alado, ensima and mio/a without accent marks, etc. were also included in the searches. This was necessary since the bulk of the corpus consists of blogs and other informal materials that are not typically edited for orthographic errors. Given that nuestro/a and vuestro/a may modify either the preceding nominalized adverb or a following noun, all 5,000+ instances of this sort were examined by hand, and those cases in which these adjectives do not modify the preceding adverb were eliminated from the results. Once this was done the search resulted in 12,085 instances of the construction in question, a much greater number than reported in previous studies (Table 1). It results in 315 cells of data (21 countries by 15 words), each of which is filled by an average of 38.4 instances.
¹¹ www.corpusdelespanol.org
¹² Searches for junto and bajo were also carried out at first, but they yielded only 3 and 7 instances, respectively.
Previous studies either did not include frequency data by country or based their results on raw data without normalizing for corpus size. The raw data from the present study are found in Appendix 1. Normalizing in the present study was done by dividing the number of instances in a given country by the number of million words the corpus contains from that country. When this was done, a number of interesting findings emerged (Figure 1). The first is that the constructions under consideration are found in every country. The second is that Spain falls at neither extreme, but in the middle. If Spain found itself outranked by all, or most, of the other countries, the case could be made that these constructions are principally Latin Americanisms. As it stands, it is difficult to assert that Peninsular and Latin American varieties differ widely in their usage. Third, the notion of their widespread use in Argentina is confirmed, and Uruguay follows close behind, which is unsurprising given the wealth of other attributes these rioplatense varieties have in common (Lipski 1994). The fact that Chile and Paraguay are not far behind suggests that this construction is most typical of varieties in or close to the Southern Cone.
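The normalization described above (instances in a country divided by the number of million words the corpus contains from that country) may be sketched as follows. The function per_million and the two placeholder countries with their counts are invented for illustration; they are not the figures in Appendix 1 or Figure 1.

```python
def per_million(instances: int, corpus_millions: float) -> float:
    """Instances per million words, given the corpus size in millions of words."""
    if corpus_millions <= 0:
        raise ValueError("corpus size must be positive")
    return instances / corpus_millions


# HYPOTHETICAL placeholder data, not taken from the Corpus del Español.
raw_counts = {"Country A": 300, "Country B": 400}        # raw instances
corpus_sizes = {"Country A": 150.0, "Country B": 400.0}  # millions of words

rates = {c: per_million(raw_counts[c], corpus_sizes[c]) for c in raw_counts}

# Rank countries by normalized frequency rather than by raw count.
ranked = sorted(rates, key=rates.get, reverse=True)

for country in ranked:
    print(f"{country}: {rates[country]:.2f} per million words")
```

In this invented case Country B has more raw instances, yet Country A ranks higher once corpus size is taken into account, which is why raw counts alone cannot support comparisons between countries.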

GENDER OF THE NOMINALIZED ADVERBS
Adverbs such as al lado include the noun lado, which has grammatical masculine gender. Alrededor is historically composed of al+rededor, and rededor is masculine as well. However, most adverbs do not share this structure, which means that when they are nominalized, their grammatical gender is not immediately apparent. A number of the studies cited in section 2 above demonstrate that the gender of the possessive does not correspond to the referent of the phrase, so that may be eliminated as the source. One could assume that feminine gender would be associated with adverbs ending in -a, but arriba, encima and cerca clearly do not top the chart in Figure 2; delante does. One surprising finding is that al lado and alrededor, which arguably possess grammatical masculinity, find themselves followed by feminine possessives on occasion. Perhaps the best explanation for this is that they are processed as simple adverbs like detrás and its counterparts. Since other nominalized adverbs may occasionally take feminine possessives, nominalized al lado and alrededor appear to have adopted that pattern as well. Of the total number of constructions extracted from the corpus, only 7% are feminine. This number varies across countries (Figure 3). In Spain this number reaches 22%, which may explain why the percentage of feminine al lado is highest in that country (11%, followed by the US at 7%). However, the same is not true of alrededor. In Spain it is feminine in 5% of the cases, but in 11% and 8% of the cases in Honduras and Nicaragua, which only have overall rates of feminine gender of 4% and 3%, respectively. At the bottom of the chart is Panama, with no attestations of feminine forms. The percentage of feminine forms for each construction is given by country in Appendix 2.
The availability of large corpora in recent years makes it possible to test ideas that in the past were often only anecdotal observations (e.g. Kany 1976). While some more recent studies into detrás mío type constructions have adopted a data-oriented approach, they were limited to a few hundred to under two thousand instances on which to base their conclusions. In contrast, the present study is based on over 12,000 cases. These constructions are documented to varying degrees in all 21 countries examined, but are much more common in Argentina and Uruguay, something a number of researchers had suggested. Although there is no shortage of prescriptivists who decry this usage as sloppy or uneducated, it is clearly neither infrequent nor geographically limited. Since adverbs have no grammatical gender, it is interesting to note that when they are nominalized, they favor the masculine in about 93% of the cases. It is tempting to suppose that encima, cerca and arriba would be more prone to be modified by feminine possessive pronouns since they end in -a, but they are feminine to about the same extent as are other adverbs that do not end in -a. As far as the distribution of feminine forms is concerned, it is predominantly a Peninsular usage, although some feminine instances were observed throughout Latin America, with the exception of Panama. On another note, al lado and alrededor are words that are widely attested as both adverbs and masculine nouns. In spite of this fact, even these words are occasionally modified with feminine pronouns, most likely due to the other adverbs that also take the feminine on occasion.
The corpus from which these data were extracted consists of about 60% blogs, which tend to represent a less formal style. While the Corpus del Español allows variables such as country and frequency to be examined, it is unfortunate that by its nature, it does not contain information about the writer's social class, gender, educational level or age. The use of nominalized adverbs is proscribed in grammars, and perhaps for this reason more educated speakers use it less often. As sociolinguists have demonstrated, where there is variation, it tends to become associated with particular social variables such as gender and age. It is entirely possible that such factors may play a part in the use of nominalized adverbs, but only further studies will be able to determine their role.