Language disintegration under conditions of severe formal thought disorder

On current models of the language faculty, the language system is taken to be divided by an interface with systems of thought. However, thought of the type expressed in language is difficult to access in language-independent terms. Potential inter-dependence of the two systems can be addressed by considering language under conditions of pathological changes in the neurotypical thought process. Speech patterns seen in patients with schizophrenia and formal thought disorder (FTD) present an opportunity to do this. Here we reanalyzed a corpus of severely thought-disordered speech with a view to capturing patterns of linguistic disintegration comparatively across hierarchical layers of linguistic organization: 1. Referential anomalies, subcategorized into NP type involved, 2. Argument structure, 3. Lexis, and 4. Morphosyntax. Results showed significantly higher error proportions in referential anomalies against all other domains. Morphosyntax and lexis were comparatively least affected, while argument structure was intermediate. No differential impairment was seen in definite vs. indefinite NPs, or 3rd Person pronouns vs. lexical NPs. Statistically significant differences in error proportions emerged within the domain of pronominals, where covert pronouns were more affected than overt pronouns, and 3rd Person pronouns more than 1st and 2nd Person ones. Moreover, copular clauses were more often anomalous than non-copular ones. These results provide evidence of how language and thought disintegrate together in FTD, with language disintegrating along hierarchical layers of linguistic organization and affecting specific construction types. A relative intactness of language at a procedural, morphosyntactic surface level masks a profound impairment in the referential functioning of language.


Introduction
In neurotypical speech no sentence is uttered without a thought expressed in it: the absence of such a link would be a sign of pathology, as for example in the echolalic speech seen in parts of the autism spectrum (Prizant 1983). In line with this basic design feature of language, current architectural models of the language faculty posit an interface between two systems identified as language and thought, respectively (Chomsky 1995; Jackendoff 2002). Addressing the empirical problem of how this interface is structured, however, faces considerable methodological obstacles, including the obvious difficulty of studying the specific kind of thought expressed in language in language-independent terms. Moreover, which system or theory would account for thought itself in its human-specific form, if not language, remains unclear in empirical terms, though a Language of Thought (LOT) has long been postulated to this effect (Fodor 1975; 2008; Burton-Roberts 2011).
Tovar Torres, Antonia, et al. 2019. Language disintegration under conditions of severe formal thought disorder. Glossa: a journal of general linguistics 4(1): 134. 1-24. DOI: https://doi.org/10.5334/gjgl.720
One tradition in linguistic theory has considered language to be the generative principle behind the relevant kind of thought itself: Ancient Indian grammar (Chaturvedi 2009); late Medieval Modistic grammar (Covington 2009); and un-Cartesian linguistics (Hinzen & Sheehan 2015); see also Humboldt (1836) and Mueller (1887). This tradition broadly contrasts with a more rationalist or Cartesian tradition, in which language is conceptualized as an expressive system, whose essential function is to encode or communicate a rational thought process that is as such given independently and grounded in language-independent principles (Arnauld & Lancelot 1660; Chomsky 1966; Pinker & Jackendoff 2005; Fodor 2008). Considerable light could be cast on this historical and foundational dichotomy by considering patterns of language variation not merely under conditions of cognitive uniformity and neurotypicality, but under conditions of changes in the thought process as seen in neurological and neuropsychiatric disorders, where linguistic diversity co-occurs with clinical cognitive diversity. Delineating Universal grammar in the technical sense of a language-specific biological endowment ultimately depends on clarifying its relation to the species-typical thought system. Without considering linguistic changes under conditions of changes in this other system, we would deprive ourselves of variation that could address this relation.
In acquired language disorders such as post-stroke aphasia, the co-existence of cognitive decline with language impairment remains debated. Though cognitive decline is difficult to test when language impairment will typically interfere with task demands in language-based tests, considerable evidence indicates that some aspects of nonverbal cognition decline along with language in acquired aphasia (Baldo et al. 2005; Baldo et al. 2010; Fonseca et al. 2016), as well as in primary progressive aphasia (Fittipaldi et al. 2019). Nonetheless, clinical impression often suggests that the thinking process is surprisingly preserved in aphasia: patients seem to struggle to get normal thoughts across linguistically, but not with the thoughts themselves (Varley 2014). In line with this clinical impression, single-case studies have documented dissociations in aphasia between language and other cognitive domains such as arithmetic, theory of mind, music, or scientific and spatial reasoning (Fedorenko & Varley 2016), though it remains debatable how much language was preserved in the patients in question, to what extent some of the tasks could not be solved by lower-level perceptual mechanisms, and to what extent the forms of thinking involved in these nonlinguistic tasks and in language are comparable (arithmetic and music in particular involve no referential concepts of the sort seen in language). Aphasia, moreover, affects people that have had normal language for many decades. The degree to which aphasic performance reveals processing limitations rather than a fundamental language deficit (knowledge or competence) has also long been debated (Linebarger et al. 1983; Bates et al. 1991). In this regard, a more telling case is that of the 25-30% of individuals on the autism spectrum who never develop language in the first place, in either production or comprehension and in any modality (Pickett et al. 2009; Tager-Flusberg & Kasari 2013; Slusna et al. 2018).
The little evidence that exists about this population suggests that normal intelligence (largely even in nonverbal IQ) and social cognition (including nonverbal communication) effectively collapse, pointing to a fundamental integration of early cognitive and linguistic development (Maljaars et al. 2011; Norrelgen et al. 2015; Slusna et al. 2018). The critical role of language in categorization and learning in preverbal infants independently supports this integration (Perszyk & Waxman 2018).
Here we will consider a different neuropsychiatric condition affecting adults who have had normal language development but are affected by cognitive decline in early adulthood: formal thought disorder (FTD) in patients with schizophrenia (Andreasen 1979). While not exclusive to schizophrenia, FTD is one of schizophrenia's criterial symptoms and objective signs in the Diagnostic and Statistical Manual of Mental Disorders (DSM-5, American Psychiatric Association 2013). Detected at the level of linguistic form, it contrasts with "disorders of content" clinically identified as delusions (e.g. a patient's expressed convictions that he is Jesus or that he came to earth in a cosmic bubble). FTD is undoubtedly linked to a language dysfunction insofar as it is diagnosed as such. Moreover, meta-analyses point to dysfunction in language areas as a neural correlate (Wensing et al. 2017; Cavelti et al. 2018). However, the disorder remains conceptualized within psychiatry as being located at the level of thought, of which the clinically manifest language dysfunction is widely regarded as only an overt expression. In line with this Cartesian viewpoint, linguistic studies of spontaneous speech in this syndrome, though inaugurated by Chaika (1974) early on, have remained scarce and have often been confined to minimal or small samples of patients with FTD (e.g. Chaika 1974, N = 1; Rochester & Martin 1979, N = 6; Harvey 1983, N = 10; Oh et al. 2002, N = 10). Today, productive speech in FTD thus remains largely characterized clinically through terms such as derailment, incoherence, tangentiality, or "word salad". Since none of these are linguistic terms, it remains a challenge to determine the more properly linguistic variables that might identify such speech and distinguish it both from non-thought-disordered speech in schizophrenia and from that of neurotypical controls.
Current cognitive neuropsychological approaches to FTD still largely seek to identify neurocognitive deficits in non-verbal cognitive domains, particularly in semantic memory and executive functioning, though identifying such deficits specific to FTD has proved elusive (McKenna & Oh 2005).
More linguistic studies of FTD are required to assess the role of language dysfunction in the neurocognitive basis of FTD. Language as a neurocognitive domain plays a role not merely in FTD, but in other core symptoms as well, particularly in auditory verbal hallucinations (Tovar et al. 2019), but arguably also in delusions (Hinzen, Rosselló & McKenna 2016). Recent work in computational linguistics has suggested considerable potential for language as a biomarker in schizophrenia, as automated linguistic measures can predict symptoms of schizophrenia including FTD (Elvevåg et al. 2010; Bedi et al. 2015; Holshausen et al. 2014). Experimental psycholinguistic studies have also revealed numerous language processing anomalies in schizophrenia, largely in comprehension/perception (Titone et al. 2007; Kuperberg 2010; Kuperberg et al. 2017; but see Kuperberg et al. 2018, for a recent study of semantic priming in a naming task), and in part specific to FTD (Kuperberg et al. 1998).
Studies of FTD inspired by theoretical linguistic models, in the case of language production, fall into two main traditions. In the first of these, starting from Rochester & Martin (1979), the focus has been on the discourse level using the theoretical framework of Halliday & Hasan (1976). The authors targeted the use of various linguistic devices for establishing 'cohesion' across sentences, given the assumption at the time that schizophrenic speech at lexical and single sentence levels was largely normal (McKenna & Oh 2005). The markers of cohesion in question were a mixed set comprising anaphoric pronominal reference, substitution, ellipsis, conjunction and lexical cohesion. Differences between patients with schizophrenia with and without FTD were mainly found in the misuse of anaphoric pronouns and demonstratives leading to unclear reference to objects or persons, but not in the quantity of such cohesion markers. This broad finding was replicated in several later studies (Wykes & Leff 1982; Harvey 1983; Docherty et al. 1996; Docherty et al. 2003). This tradition conceptualized such anomalies as communication/discourse disturbances, with the exact link to the linguistic substrate in which they occur still unclear. A second linguistic tradition in the study of schizophrenic speech has documented less syntactic complexity and more syntactic errors (Faber & Reichstein 1981; Morice & Ingram 1982; Morice & McNicol 1986; Hoffman & Sledge 1988), but with some evidence that syntactic anomalies may characterize language in schizophrenia generally, i.e. without being specific to FTD (Oh et al. 2002; Stirling et al. 2006; Moro et al. 2015; but see Cokal et al. 2018 for evidence that they are more pronounced in FTD as compared with either patients without FTD or controls). Oh et al. (2002) argued, though based on a small sample of six patients with FTD, that it is semantic anomalies at a sentence-level which are characteristic of FTD.
Our goal here was to investigate language in FTD with a particular view to how it may illuminate the thought-language relation. From this point of view, the referential use of Noun Phrases (NPs) is a natural focus. At this referential level, language inherently connects to thought: normal language use is always referential, with speakers picking out objects and events and saying something about them; just as referential thinking is always expressible in language (but only partially in music or imagery). Use of NPs also connects to the first of the above traditions of the study of language in FTD, since different types of NPs naturally serve different functions in discourse, with definite NPs in particular often being anaphoric, i.e. picking up on a referent identified before. It also connects to the second tradition, since NPs are a particular instance of syntactic complexity and NPs that serve different referential functions also exhibit different forms of syntactic complexity (Hinzen & Sheehan 2015; Martín & Hinzen 2016). Recent linguistic work on FTD further supports a focus on NPs. One recent study (Sevilla et al. 2018) compared the proportions of anomalous NPs in a group of Spanish-speaking patients with FTD (N = 20) against a second group with schizophrenia without FTD (N = 20) and neurotypical controls (N = 14), with data obtained from a fairytale retelling task. This study reported a significant difference between groups when anomalies in the referential NPs were annotated as occurring in definite NPs and pronouns, but not when annotated as occurring in indefinite NPs and lexical NPs (NPs containing a lexical noun), suggesting a specific linguistic signature of FTD speech.
Although the grammatical categories "lexical NP" and "definite NPs" overlap (a lexical NP like the man is definite, but need not be, as in a man, while a definite NP can be a lexical NP or pronominal), the exact linguistic distinction involved thus matters when seeking to linguistically distinguish these groups. The result of Sevilla et al. is consistent with the fact that unclear reference and poverty of content are among the terms clinically identifying FTD (Andreasen 1979;1986): although these terms reflect clinical judgements, at a linguistic level they naturally correspond to an anomalous indefiniteness (or lack of specificity) of referential phrases: either it is unclear what object, person or event is being referred to (unclear reference), or it is so indefinite that the impression of a lack of proper content arises (poverty of content). Quantity and quality of use of definite NPs is thus an appropriate and promising focus for linguistic studies of FTD. The results of Sevilla et al. (2018) on misuses of definite vs. indefinite NPs, furthermore, are broadly in line with another study in an English-speaking sample of patients with and without FTD (Cokal et al. 2018).
We also aimed to illuminate the language-thought relation by contextualizing deficits in NP use against anomalies in other levels of linguistic organization. In language, a complete thought is built in layers, starting from a selection of lexical concepts and then some initial structure-building that integrates objects or persons into verb phrases: argument structure, which reflects a layer of meaning intermediate in hierarchical complexity between lexis and full propositional information at the level of utterances that come with referential meaning. While anomalies at the lexical level (paraphasias and neologisms) are well-established in schizophrenia and FTD in particular (McKenna & Oh 2005), as are syntactic anomalies as per the second linguistic tradition above, degrees of impairments across these levels have not yet been systematically compared. Our annotation scheme thus covers (i) referential anomalies as linked to their linguistic substrates (NP types in which they occur), (ii) argument structure, (iii) lexis, and (iv) (morpho-)syntax. Based on Sevilla et al. (2018) and Cokal et al. (2018) we predicted that:
• Proportions of anomalies in definite NPs and pronouns would outweigh those in indefinites and lexical (non-pronominal) NPs;
• Despite evidence for lexical (word)-level and formal syntactic anomalies in FTD in the literature, referential anomalies would be a more indicative marker of language impairment in FTD when these respective layers of linguistic organization are compared with one another.
We further explored with post-hoc analyses whether a more fine-grained sub-classification of NP types involved in referential anomalies and of clause types could further illuminate the patterns found in the main analysis. We specifically explored the following linguistic distinctions: (i) covert vs. overt pronouns, (ii) 1st Person vs. non-1st Person pronouns, (iii) animate vs. inanimate pronouns; and finally, (iv) copular vs. non-copular clause types. This was motivated, in the case of (i), by different functions of covert and overt pronouns in Romance (particularly discourse and anaphoric functions in the former case, see Sorace et al. 2009; Camacho 2013; Jiménez-Fernández 2016); in the case of (ii), by a potential influence of self-referential (1st Personal) discourse on referential anomalies, given the importance that the 1st Person plays across other core symptoms of schizophrenia, including auditory verbal hallucinations (Tovar et al. 2019) and delusions (Hinzen, Rosselló & McKenna 2016); and, in the case of (iii), by the question whether language disintegration is "content-sensitive" in the sense that it matters whether the NP in question denotes animate entities or not. In the case of (iv), finally, we inquired whether sentence type plays a role in how anomalous sentences are: copular clauses like She is my mother are more based on grammar than on lexical information: they do not contain a lexical verb and often express identities (of one thing with another), about which patients in our sample appear to be very often confused.

Participants and corpus
The basis for this study was a historical corpus collected from 38 Spanish- and Catalan-speaking stable in-patients with schizophrenia by a local psychiatrist, Dr. Moya, for purposes of a PhD dissertation on the language of formal thought disorder (Moya 1989). Speech samples consisted of free conversations with an interviewing doctor. To make an extremely time-intensive annotation procedure manageable and avoid confounds between Catalan and Spanish-speaking patients, annotations were restricted to a total of 15 Spanish-speaking but otherwise randomly selected participants and the first five pages of transcriptions from their speech, resulting in a mean number of 888.6 words per participant (standard deviation: SD = 384.9). Audios on which the original transcriptions were based were still available and provided to us. Existing transcriptions were checked against the audios, and in many cases completed. The mean age of these 15 patients was 46.13 (SD = 16.2), and the mean length of illness in years was 22.4 (SD = 11.17). Seven were male. Clinical records were still available for each patient, capturing family history, clinical history, disease progression, medication, speech samples, and justification of clinical diagnosis through DSM-III criteria. DSM-III diagnostic codes were: "Paranoid schizophrenia" (295.3, 8 patients), "Undifferentiated schizophrenia" (295.9/92, 2 patients), "Disorganized schizophrenia" (3 patients), "Residual" (1 patient). One last patient had no diagnostic code. The DSM-III A-criterion of "incoherence and notable loss of the associative capacity" was noted to be fulfilled in each case; disorganization of speech, in many cases with detailed examples, was mentioned in most of the case reports.
To allow comparability with other studies of FTD, participants were formally rated by a psychiatrist not involved in this study (Dr. Edith Pomarol-Clotet, FIDMAG Germanes Hospitalàries, Barcelona) using the canonical Thought, Language, and Communication (TLC) scale of Andreasen (1986). TLC ratings confirmed the FTD diagnosis in all cases. According to the TLC scale, 12 of the 14 participants scored an extreme (4) on the "Global" rating, defined as "TLC disorder so severe that communication is difficult or impossible most of the time". In computing the Global score, the TLC suggests that "some TLC disorders are more pathological than others", in the sense of "likely to suggest severe psychopathology" (Andreasen 1986: 481). The above 12 participants all scored either "moderately severe" or "severe" on two or more of the "more pathological" items, i.e. incoherence, derailment, tangentiality and loss of goal. One participant was given a TLC-Global rating of "moderate" (2); determination of the degree of FTD in this patient was problematic because the main abnormality he showed was delusional confabulation. The final participant was given a TLC-Global score of (1) as she principally showed so-called negative FTD or alogia and scored mostly on the criteria of poverty of speech and perseveration. Given the different FTD profiles of these last two participants, post hoc analyses were performed to determine whether results changed when linguistic variables were compared across the group without including these participants.

Annotation scheme
Annotations proceeded first at the level of clauses, and secondly at the level of the four different linguistic strata distinguished here, namely use of different types of referential nominals, argument structure, lexis, and morpho-syntax. As a first step, at the clausal level, annotators were instructed to identify all clauses with a finite verb (in matrix or embedded positions) and to classify them as either copular clauses (with a predicate to be) or non-copular clauses (codes [cop] and [ncop]). They then had to make a first-pass judgement on whether each of these contained an anomaly or not (codes [g] or [b] for 'good' or 'bad', resulting in codes such as [copg] or [ncopb] annotated directly after the relevant finite verbs). Three criteria were individually or jointly sufficient for an anomaly rating: (i) the clauses contained any kind of formal-grammatical error, e.g. Él es ángeles (lit. 'he is angels'), which involves a violation of grammatical agreement; (ii) they involved three or more repetitions or were echolalic, e.g. a participant repeating the phrase no hay dinero en la casa ('there is no money in the house') at the end of his utterances as a kind of meaningless stock phrase; (iii) they had NPs with referents that could not be identified or were misplaced and contributed to false or plainly nonsensical statements, e.g. nací por aquí, por este mundo (lit. 'I was born around here, around this world'), where the place indication (este mundo) is vague; or Bulle a mi alrededor una distracción (lit. 'Boils around me a distraction'), where it is unclear what the noun distracción refers to, in the context of the verb bulle; or one participant's claim Usted estaba allí ('you were there'), which misplaces the interviewing doctor (usted) as participating in a scene that took place years ago in her house (with the consternation of the doctor indicating that this was not the case). See detailed annotation samples for further examples of referential anomalies.
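The clause-level codes described above lend themselves to simple automatic counting. The following is a minimal sketch, not the authors' tooling: it assumes that clause codes appear in the transcript as [copg], [copb], [ncopg] or [ncopb], written either in square brackets or in parentheses, and it deliberately ignores all other codes (for example NP-level codes). The function names are hypothetical and chosen for illustration only.

```python
import re
from collections import Counter

# Clause codes: optional "n" (non-copular), "cop", then "g" (good) or "b" (bad).
# Accepting both [..] and (..) delimiters is an assumption of this sketch.
CLAUSE_CODE = re.compile(r"[\[(](n?cop)([gb])[\])]")


def count_clause_codes(annotated_text):
    """Count (clause_type, rating) pairs found in an annotated transcript."""
    counts = Counter()
    for clause_type, rating in CLAUSE_CODE.findall(annotated_text):
        counts[(clause_type, rating)] += 1
    return counts


def anomaly_proportion(counts, clause_type):
    """Share of clauses of one type rated 'b'; None if no such clause occurs."""
    bad = counts[(clause_type, "b")]
    total = bad + counts[(clause_type, "g")]
    return bad / total if total else None


if __name__ == "__main__":
    # Invented toy string; NP-level codes such as (podxs1g) are skipped.
    toy = "Bulle (ncopb) a mi (podxs1g) noto (ncopb) me doy cuenta (ncopg) es [copb]"
    tallies = count_clause_codes(toy)
    print(anomaly_proportion(tallies, "ncop"), anomaly_proportion(tallies, "cop"))
```

Returning None for a clause type that never occurs keeps an undefined proportion distinct from a proportion of zero, which matters when a participant produces no copular clauses at all.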
Annotations were based on the crucial insight that reference in human language is always a relational phenomenon, in the sense that a word or NP in isolation (e.g. man, or the doctor) would never refer to anything: reference is always a sentence- (and indeed utterance-) level phenomenon, which depends on the lexical description of the referent provided, the grammatical relations in which the NP stands, and context (Hinzen & Sheehan 2015). Referential anomalies were therefore determined for NPs as occurring in their utterance-context. This point is also relevant to reference in the 1st Person. Although it may appear that I as used in isolation cannot possibly be misused referentially (it cannot fail to pick out the speaker), it arguably can become anomalous, e.g. when used in a copular clause in which the referent of I is identified with another person, as in a female speaker's assertion Yo era mi marido (lit. 'I was my husband'), where it is unclear who the speaker actually refers to, herself or another (male) person, i.e. her (his?) husband; or another speaker's assertion Me mataron a mí en el psiquiátrico (lit. 'They killed me in the psychiatric'), which clearly cannot have happened to the referent of 'me' except metaphorically.
In the case of an anomaly based on (iii) (referential anomaly), annotators next identified the grammatical type of the NP affected, according to the following NP sub- ) they had. NPs were annotated even in truncated utterances or hanging topics (non-sentences), in which case no anomaly was annotated at the sentence (matrix) level (since there was no sentence). An exception to this rule was when this type of anomalous NP occurred in an embedded truncated clause, in which case the anomaly at the sentential level was annotated at the level of the superior clause (e.g. En esta vida [g] pasa [b] que la la música, lit. 'In this life it happens that music', where the [b] annotates the matrix clause anomaly, while the two NPs are unobjectionable). All instances of NPs that were not referential but predicational were disregarded, for example NPs in appositions, since they resume the same referent (e.g. the second NP in Thor, the guy of the thunder, …), NPs forming predicates of copular sentences (e.g. He was a policeman), and secondary predications (e.g. Me trajeron la última, lit. 'They brought me the last one', meaning 'I was the last one they brought'). Crucially, referential NPs were not annotated as anomalous merely in virtue of reflecting rarified beliefs (e.g. of a religious nature, like having seen the virgin Mary, or simply unlikely to be true but not verifiable by annotators, like having talked to Mariano Rajoy in the Palacio de Congresos).
Turning to the remaining linguistic strata, annotations of lexis included anomalies that could be detected at the lexical level alone (without considering grammatical context), in the form of neologisms (e.g. espárramo, genitación), clanging, lexical decompositions such as arma de dedo (lit. 'weapon of finger') for pistola ('pistol'), or anomalies relating to the use of light verbs, e.g. hacer convencimiento (lit. 'make conviction') for convencer ('to convince'). Argument structure was understood in terms of local selectional relations between a verbal head and its thematic arguments; these could be anomalous if either the arguments were wrongly subcategorized at the level of grammatical category of the selected dependent, e.g. hablé esto ('I spoke this'), when hablar ('to speak') requires a prepositional complement; or else the selection was wrong semantically, e.g. quisiera estar en la consideración y naturaleza de mi vida (lit. 'I would hope to be in the consideration and the nature of my life'), where estar cannot subcategorize the following NP; or Sí, están tramitando (lit. 'Yes, they are processing'), where an object NP is grammatically obligatory but missing. Finally, errors in morphosyntax comprised agreement and other errors compromising the formal-grammatical integrity of a phrase or clause, disregarding its meaning, as in Mi madre son muy monjas (lit. 'My mother are very nuns').
Note that these four annotation levels (referentiality, argument structure, lexis, and morphosyntax) are not orthogonal to one another but can be at least partially ordered in a hierarchy. Thus, while morphosyntax is the most strictly formal level of grammatical organization, lexis involves lexical level semantic organization, while argument structure is grammatical but lexically projected, and reference in the present sense is a full-utterance level phenomenon. Referentially evaluable utterance-level propositional meaning presupposes a syntax matching this meaning, which in turn includes argument structure, which itself includes lexical meaning. In Her grandmother broke a leg, we need to understand the general lexical concepts grandmother, break or leg, before understanding which persons, objects, or event in the world the content words are used to refer to; and we need to grasp what event it is, conceptually, i.e. one of a grandmother breaking a leg (argument structure). Note that lexical items as such and as occurring in isolation (grandmother, leg, break) only have general conceptual meaning and cannot pick out particular persons, legs, or accidents; even complex NPs occurring in isolation, e.g. her favourite grandmother, do not as such give us any idea of who is being referred to. Both reference and truth are utterance level (root) phenomena, which require multiple grammatical elements to come together configurationally in the right way so as to allow identifying a referent and event in context (Hinzen & Sheehan 2015; Hinzen 2017).
Annotations were first made by three annotators (AT, CM, SS) working independently on sub-samples of 5 transcripts each, focused on the referential analysis only, which was the most complex. To ensure strict adherence to the same criteria, all three annotators then met to check all annotations in the entire sample, under the supervision of a senior rater who was not involved in the writing of this paper (Joana Rosselló, JR), so that all annotations were checked by four raters in total. All disagreements were resolved by consensus among all raters. Domains other than reference were pursued in the form of three bachelor theses completed at the Universitat de Barcelona in 2017 under the direction of JR, who verified every annotation made in these three domains.

Samples
We provide three extracts of the conversations to give the reader a sense of the type of speech investigated here and use of the annotation scheme. It needs to be kept in mind that in general, schizophrenic speech, particularly at this level of disorganization, is very difficult to translate, and errors in one language may not be errors in the best available English translation. Below, comments justifying codes are restricted to those codes ending with [b] (bad, i.e. anomalous by the above criteria). Superscripts identify such codes and are repeated in the comments. Clause-level annotations identifying whether a clause (specified as copular or non-copular) was good or bad are annotated directly behind the clause's finite verbs; NP annotations behind the relevant NPs. PAT: patient; INT: interviewer.

[…] Bulle (ncopb) 1 a mi (podxs1g) alrededor (ndstb) 2 una distracción (nistb) 3 y un aliciente (nistb) 4 que seguramente debe ser (ncopb) 5 pero (pcds1g) no la (pods3tb) 6 noto (ncopb) 7 en mí (pods1b) 8 , sino que (pcds1g) me doy cuenta (ncopg) que no (pcds1g) la (pods3tb) 9 tengo (ncopb) 10

Comments: 1 Deficient (non-copular) clause because of vague/unclear references in 2-4. 5 Non-copular clause deficient for formal-grammatical reasons. 6 Reference of clitic unclear. 7 Deficit at clausal level inherited from referential deficits in its nominals. 8 Una distracción y un aliciente cannot be located 'in me'. 9 Reference of the clitic continues to be unclear. 10 Deficit at clausal level inherited from referential deficits. 11 The verb bulle, possibly a Catalanism, has transformed into a noun, which it cannot be in Catalan. 12 Inappropriate reference in this sentence context. 13 See previous comment (11). 14 Deficit inherited from referential and formal-grammatical problems inside the clause. 15 Adjective wrongly subcategorized. 16 Reference of clitic unclear. 17 Wrong verb form.
Literal translation: I started taking Melleril and it produced me imbalance, what I suggested, I have had in my life, imbalance. -What does this mean, 'imbalance'? -Imbalance is that you are not in balance. You go to a place and you … and for example, we go for a walk on the walkers [neologism] and there are people who have balance, who turn the equilibrium away and they can pass because their body is alright, they have balance. However, other people we stagger, someone gets out of control. And then, well, I have said it many times to the friends, that I will fall into the stream or who says a slope, places of danger, the so distant places.
Comments: 1 Deficient grammatical selection of desequilibrio by tener. 2 Deficiency of the clause inherited from deficient NP: 3 pasera seems to be an instance of "clanging", originating from pasar (to pass), with the intended meaning 'places where you can go for a walk'; the word exists in Spanish, but only with a different meaning. 4 See comment (1) above.

5-7 Deficits at clausal level inherited from infelicitous selections inside, with the NPs themselves in good shape. 8 Impossible reference after plural inclusive in previous clause. 9 Deficit at clausal level inherited from (8). 10 A missing preposition a introducing the object NP in se lo he dicho las compañeras: an argument structure violation. 11 Deficit at clausal level inherited from infelicitous definite reference to some stream not previously introduced (12).

Sample 3
INT: ¿Cuánto tiempo estuvo usted en Barcelona ¿porque, usted, ¿dónde ha nacido usted? PAT3: Yo (pods1g) nací (ncopb) 1 por aquí, por este mundo (nistb) 2 , por el campo (nistb) 3 . Translation: How long did you live in Barcelona? Where were you born? -I was born around here, in this world, in the countryside. -In the countryside? -In the countryside. My mother wore a white coat and she went to a ravine and I was born. -Your mother wore a white coat? -Yes, she is over there, in the kitchen. -Your mother? -Yes. -Is your mother alive? -Yes, she is here, in the kitchen.

5 Deficits at clausal level inherited from infelicitous reference in (6) to a white coat worn by the mother when giving birth, which the speaker presumably cannot know. 7 While there is nothing wrong per se in reference to un barranco (a ravine), the mother presumably did not go to a ravine when giving birth. 8 Deficit at clausal level inherited from anomalous reference in first person in (9), where the speaker misplaces herself as being born in the context outlined. 10-15 The covert subject is misplacing the mother as being in the kitchen.

Statistical analysis
Variables compared here are proportions of errors on a specific linguistic unit. For instance, the proportion of anomalous definites was calculated as the number of anomalous definite nouns or pronouns over the total number of definite nouns or pronouns produced. This was necessary to account for quantitative differences in the total number of words produced by patients. Paired-samples t-tests were applied within patient where normality, as determined by a Shapiro-Wilk test, and symmetry of the data allowed this. Wilcoxon signed-rank tests were applied where only normality was violated, and sign tests where neither condition held. Cohen's d for dependent samples was used to quantify the effect size of significant differences. According to Cohen's (1988) suggested interpretation of this measure, almost all effect sizes reported are large (defined as > 0.80), only one being medium (0.50 < d < 0.80). All reported p-values are two-tailed, and the significance level was set at 0.05.
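The quantities described above can be illustrated with a minimal sketch. It assumes that Cohen's d for dependent samples is the mean of the paired differences divided by their standard deviation (often written d_z); the text does not state which variant was used. The function names are hypothetical, the cut-off of 0.20 for "small" is Cohen's conventional value rather than one given here, and the choice between t-test, Wilcoxon and sign test is not reproduced.

```python
import math
import statistics


def error_proportion(n_anomalous, n_total):
    """Proportion of anomalous units among all units of one type, for one patient."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return n_anomalous / n_total


def paired_t_and_d(v1, v2):
    """Return (t, df, d_z) for two paired samples of per-patient proportions.

    Assumption: d for dependent samples is the mean paired difference
    divided by the sample SD of the differences (d_z).
    """
    if len(v1) != len(v2) or len(v1) < 2:
        raise ValueError("need two equal-length samples with n >= 2")
    diffs = [a - b for a, b in zip(v1, v2)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD, n - 1 in the denominator
    t = mean_diff / (sd_diff / math.sqrt(n))
    d_z = mean_diff / sd_diff
    return t, n - 1, d_z


def effect_size_label(d):
    """Conventional labels after Cohen (1988), applied to the absolute value of d."""
    size = abs(d)
    if size > 0.80:
        return "large"
    if size >= 0.50:
        return "medium"
    if size >= 0.20:
        return "small"
    return "negligible"
```

Because both t and d_z are built from the same mean and SD of the differences, they are related by t = d_z × √n, which gives a quick way to cross-check reported values.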

Proportion of anomalous definite vs. indefinite NPs
A paired-samples t-test revealed that, contrary to our predictions, the proportion of anomalous definites over the total of definites (M = 0.250, SD = 0.137) was not significantly different from that of anomalous indefinites over all indefinites (M = 0.314, SD = 0.186), t(14) = -1.41, p = .180.

Proportion of anomalous nominals vs. pronouns
A Wilcoxon signed-rank test showed, again contrary to our predictions, that the proportion of anomalous nominals out of all nominals (M = 0.339, SD = 0.176) was significantly higher than that of anomalous pronominals out of all pronominals (in all grammatical Persons) (M = 0.222, SD = 0.118), V = 4, p < .001, Cohen's d = 0.85.

Anomalies across linguistic strata
Pairwise comparisons of anomalies divided by linguistic strata showed that language was affected over all strata distinguished here, and additionally that there was a linear progression between them in terms of mean proportion of anomalies. Specifically, starting from the most impaired, the pattern (with Means and SDs) was: NP (0.283 ± 0.134) >* Argument Structure (0.042 ± 0.034) >* Lexis (0.006 ± 0.006) > Morphosyntax (0.005 ± 0.004), where * indicates a statistically significant difference (see Figure 1 and Table 1).

Fine-grained comparisons
Results from a series of paired-samples t-tests are summarized in Table 2; the corresponding boxplots can be found in Figure 2. There was a significant difference between the proportion of anomalous 1st and 2nd person pronouns out of all pronouns as compared with the proportion of anomalous 3rd person NPs, the latter being more affected. This in turn motivated restricting the comparison of the respective proportions of anomalous pronouns and lexical NPs to 3rd Person pronominals only, which eliminated the significant difference between anomalies in pronouns and lexical NPs found in the main comparisons. The difference between the proportion of anomalous covert pronouns out of all covert pronouns and that of anomalous overt pronouns out of all overt pronouns was also significant, with covert pronouns more affected than overt ones. When narrowing down this last comparison to 1st person pronouns only, on the other hand, the difference between covert and overt instances of 1st person pronouns trended in the opposite direction (p = .06). The proportion of anomalous animate NPs out of all animate NPs was significantly lower than that of anomalous inanimate NPs over all inanimate NPs. Finally, when comparing proportions of anomalous copular and non-copular clauses out of the total of copular and non-copular clauses, another trend (p = .06) emerged, with copular clauses more affected by anomalies than non-copular ones.
Post hoc analyses of the sample with two participants removed due to their different profile of FTD as determined by Andreasen's (1986) TLC (see Section 2.1) by means of paired t-tests showed no differences in the pattern of results except in two cases where trends converted into significant results: covert (M = 0.098, SD = 0.084) vs. overt (M = 0.162, SD = 0.114) anomalous instances of 1st person pronouns (p = .035, t(12) = -2.37, Cohen's d = -0.66) with the latter more anomalous than the former; copular (M = 0.686, SD = 0.238) vs. non-copular (M = 0.524, SD = 0.137) anomalous clauses out of the total of copular and non-copular clauses (p = .010, t(12) = 3.06, Cohen's d = 0.85), with copular clauses more affected by anomalies than non-copular ones.
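As a rough consistency check on the two effect sizes just reported, and assuming that Cohen's d for dependent samples was computed as the mean paired difference divided by the SD of the differences (an assumption; the formula is not stated in the text), d relates to the paired t statistic through the sample size n = df + 1 = 13:

```latex
d_z = \frac{\bar{D}}{s_D} = \frac{t}{\sqrt{n}}, \qquad
\frac{-2.37}{\sqrt{13}} \approx -0.66, \qquad
\frac{3.06}{\sqrt{13}} \approx 0.85
```

Both values agree with the reported effect sizes under this assumption.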
In order to ensure that results are not driven by possible outlier participants, we further searched for outliers in every comparison using a common technique and, if one was found, re-ran the analysis omitting the outlier. Concretely, since our analyses are paired, for every comparison between variables V1 and V2 we looked for outliers in their paired difference (V1 - V2). We calculated the interquartile range (IQR) of this variable, which is the difference between the 75th and 25th percentiles. We then defined two cutoff points for outliers: k times the IQR above the 75th percentile and k times the IQR below the 25th percentile. Any patient lying beyond these points was considered an outlier. A common value for k is 1.5 (Tukey 1977).
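The outlier rule described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' script: the function name is hypothetical, k defaults to the common value of 1.5, and percentiles are computed by linear interpolation (the "inclusive" method of Python's statistics module), a choice the text does not specify and which can shift the fences slightly in small samples.

```python
import statistics


def tukey_outliers(v1, v2, k=1.5):
    """Return indices of patients whose paired difference lies beyond Tukey's fences.

    v1, v2: per-patient values of the two variables being compared.
    k: multiple of the IQR used for the fences (1.5 is assumed here).
    """
    if len(v1) != len(v2) or len(v1) < 2:
        raise ValueError("need two equal-length samples with n >= 2")
    diffs = [a - b for a, b in zip(v1, v2)]
    # Quartiles by linear interpolation; the percentile method is an assumption.
    q1, _median, q3 = statistics.quantiles(diffs, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [i for i, d in enumerate(diffs) if d < lower or d > upper]
```

Any indices returned would then be dropped from both variables before the paired comparison is re-run.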
A number of outliers were found with this method (2 in the Argument Structure -Lexis comparison, 2 in the Argument Structure -Morphosyntax comparison, 1 in the 3rd person pronouns -Lexical NPs comparison, and 2 in the covert -overt pronouns comparison). However, analyses excluding them resulted in very similar or smaller p-values to those of analyses using the full sample, and did not change their significance, showing that our results are not strongly driven by their influence (see plots in the Supplementary Materials showing the values of the compared error rates by patient).

Discussion
These results shed new light on language disintegration across different linguistic strata under conditions of clinical thought disintegration. Results partially supported and partially contradicted our main predictions. They did not support our expectations motivated by previous studies (Rochester & Martin 1979;Wykes & Leff 1982;Harvey 1983;McKenna & Oh 2005), which had highlighted problems with pronouns and vague and unclear reference in spontaneous schizophrenic speech, while Cokal et al. (2018) and Sevilla et al. (2018) specifically highlighted problems with definiteness. Our results suggest that at least in severe FTD of the kind studied here and in a conversational task of this nature, the referencing problem seen in such patients is more general and reaches deeper into the organization of language, as opposed to primarily affecting pronominal or definite forms of reference, as we had predicted. Pronouns or definite NPs mediate specific discourse functions such as anaphoricity and (in the case of overt and covert pronouns) aspects of information structure (Sorace et al. 2009;Camacho 2013;Jiménez-Fernández 2016). The results suggest, therefore, that the referential problem is located at a more fundamental level, affecting the entire process of reference generation from the initial retrieval of a lexical content word to the final configuration of an act of reference via a full NP in a sentential context, without being restricted to anaphoric or discourse functions.
This failure to replicate results on definiteness in previous studies may be partially due to the fact that the two studies mentioned above that have investigated this issue most directly (Sevilla et al. 2018; Cokal et al. 2018) used narrative tasks, namely a fairytale retelling task and a retelling of a visually presented comic strip, respectively. These studies found that anomalies in definite NPs (Sevilla et al. 2018), and in the quantitative proportions of definite vs. indefinite NPs (Cokal et al. 2018), are linguistic identifiers of FTD as compared with controls and patients without FTD. But the tasks in question constrain the referential process more than the conversational task used here: a fairytale already provides a plot that is memorized, and in the case of the comic strip, the referents were visually present as and when the story was told. By contrast, in the present study, referencing was restricted only through the prompting questions of the interviewer, providing fewer constraints with regard to which lexical concepts are to be retrieved for reference.
Unlike in Sevilla et al. (2018), the proportion of anomalous lexical NPs turned out here to be significantly higher than that of pronouns. This could initially suggest that the problem increases when lexical content is involved, rather than when reference is not lexically mediated and in this sense more grammatically mediated, as in the case of pronouns. However, in the present study, when comparisons between anomalous pronouns and lexical NPs were restricted to 3rd Person pronouns as compared with lexical NPs (which are always 3rd or non-Person), significant differences in relative proportions of anomalies crucially disappeared (Table 2). In short, the initial appearance that lexical NPs are significantly more affected than pronominal ones is likely based on mixing in the other grammatical persons (1st and 2nd), which showed fewer anomalies in the domain of pronouns when compared with 3rd Person pronominals. Since personal pronouns usually function deictically, this also suggests that, within the domain of pronominals where lexical-descriptive content is absent, a specific difficulty with anaphora (referential dependencies) may indeed manifest itself: such a difficulty would naturally affect 3rd Person pronouns in their most typical uses more than personal ones. Comparisons between the use of covert and overt pronouns reported above support this interpretation, since the former were more affected than overt ones (Table 2) and they tend to function anaphorically in Spanish. Interestingly, moreover, within the domain of 1st Person, this relation between anomalies in covert and overt 1st Person pronominals reversed, with overt 1st Person pronouns being more affected. This may be because there is no clear sense in which the 1st Person realized as a covert pronoun is anaphoric as opposed to deictic.
On the other hand, this interpretation of the pattern seen within pronominals should be qualified by the fact that no significant differences in respective proportions of anomalies between definite and indefinite NPs were found, even though the former tend to be anaphoric in their functions, unlike the latter. That is, if we include all NPs, whether lexical or not, the problem still does not appear to be a problem of one NP type (e.g. NPs with anaphoric functions) primarily: it affects definite NPs as much as indefinites.
Earlier studies have also supported the existence of formal syntactic anomalies in both FTD and schizophrenia at large, as compared with control subjects (Faber & Reichstein 1981;Hoffman & Sledge 1988;Oh et al. 2002;Moro et al. 2015, a.o.). Our data, on the other hand, suggest the relative preservation of morphosyntactic aspects of linguistic organization in even severe FTD. To put this insight in a different way, if all content words were replaced by pseudo-words in the speech of the patients studied here, particularly nouns, resulting in a radical version of Jabberwocky-style speech, very few anomalies would be noticeable. 1 We interpret this relative preservation of morphosyntactic aspects as showing that insofar as even severe FTD can exhibit relatively fluent discourse, the "fluency" in question is largely procedural in nature: it reflects language at the level of learned patterns in procedural memory of how phrases are built (Ullman et al. 1997). It is simply that, in terms of referential content, these phrases have become idle wheels, often effectively conveying no content at all. In short, what is surprisingly robust when our thought capacity is fundamentally lost is morphosyntax in the sense of a learned routine, independent of the role that grammar plays in mediating a specific kind of content.
The type of content that is lacking concerns meaning that arises when lexical-level content is turned into referential expressions via grammar, which is in line with earlier suggestions of an anomalously lexically-driven speech generation process, contravening a proper "balance" between such lexical and grammatical processes of encoding meaning (Ditman et al. 2011). In accordance with this interpretation, lexis as such (disregarding its referential use in context) was as comparatively unaffected in the present study as morphosyntax was. Put differently, from the viewpoint of our comparative results across different variables and linguistic layers, it is difficult to detect FTD, even at this level of severity, by looking at a lexical level only (neologisms, clanging, etc.) while abstracting from the normal referential function of lexical items retrieved in language use. The problem does not lie so much in lexical content per se as in the grammatical meaning that arises when grammar accesses the lexicon so as to produce referential and propositional meaning on an occasion of language use. Such meaning is inherently contextual insofar as it locates given abstract and general concepts (man, birth, village, etc.) in specific objects or events existing in space and time as identified relative to the time and space of speech.
Although personal (1st or 2nd Person) pronouns were proportionally less affected than 3rd Person ones, it is worth noting that, at a qualitative level, remarkable anomalies showed in the former as well, which precisely relate to contextual embedding. An example is from the participant of sample 3 in Section 2.3, who insists that her mother wore a white coat (bata blanca) during her birth, upon which the interviewer asks how she could know this, given that she had just been born. The patient answers: Porque yo nací por el campo y me dijo: "Estate aquí que yo ahora vengo" ('Because I was born around the countryside and she told me: "Stay here as I come now"'). We interpret these as misinterpretations of when a speaker is an addressee, i.e. 2nd Person, and hence as a mis-localization of herself as a 1st Person. Problems with felicitous uses of personal pronouns thus deserve further study, in line with theoretical approaches stressing the importance of disturbances of deixis to the psychopathology of schizophrenia (Crow 2010;Hinzen & Rosselló 2015;Hinzen et al. 2017). Deictic disturbances clearly extend beyond personal pronouns, reflecting remarkable problems of these patients in locating events or themselves as event participants in space and time, e.g. a patient saying Yo nací por aquí, por este mundo ('I was born around here, in this world'), another specifying a time incomprehensibly as la hora de víctimas, a third commenting: Llevo aquí un mes o bien han adelantado el calendario? Yo llevaba la cuenta de los días y la he perdido ('I am here for a month, or have they advanced the calendar? I kept track of the days but I have lost track now').
We speculate that the special role of referentially anomalous NPs in the linguistic profile of FTD may also explain the interesting and novel result that copular clauses were proportionally more often anomalous than non-copular ones (see Table 2 and post hoc results with two outliers removed). Copular clauses lack a lexical verb, hence they necessarily rely on NPs for their lexical structure more than any other part of speech (e.g. 'I was my husband', 'This is equilibrium'). As a consequence, they also have a restricted range of possible meanings, which particularly includes statements of identity, as just illustrated. This is what might make copular clauses more anomalous as compared with non-copular clauses, which have lexical support in their verbs and in this sense depend less on the lexical content of NPs only. Investigating clause structure is an important task in future work. A completely unexpected post-hoc result (Table 2), on the other hand, was that reference to inanimate entities was proportionally more impaired than reference to animate ones. We do not know how to interpret this result. Very speculatively, reference to persons will often be deictic and rooted in the 1st and 2nd Person (e.g. reference to speaker and hearer, or persons directly relating to them, e.g. my sister), which were less impaired. While referencing is unstable in this population across all forms of reference including deixis, reference to non-personal objects without anchoring in the immediate speech context may become particularly unstable.
Although the lexical level showed a low proportion of anomalies comparatively to the other strata distinguished here, two phenomena transpired in the course of these annotations that have to our knowledge not been noted before and bear brief mentioning here to motivate future research. Firstly, a recurring phenomenon in this subsample were lexical decompositions of nouns or verbs into their conceptual ingredients, e.g. hace convencimiento (lit. 'make conviction') in the place of convencer ('to convince'), son de credo (are of faith) in the place of creyentes ('believers'), artistas de hielo (artists of ice) in the place of patinadores ('ice skaters'), arma de dedo ('finger weapon') for pistola, corrida de la vida ('course of life') for prostituta (prostitute), general del tráfico (traffic general) for policía de tráfico ('traffic police'), nervio de hombre ('man's nerve') for pene ('penis'). A second noteworthy phenomenon was the pervasive existence of lexical NP repetitions or stacked NPs (see e.g. the end of sample 2 above).
Sandwiched in between lexis and referential and deictic meaning lies argument structure, as an early layer of grammatical complexity encoding basic thematic structure: participants organized around an event. In line with the above interpretation, statistically significant differences in the proportion of anomalies seen with respect to both lexis and morphosyntax appeared at this layer already, though by no means as severe as in referential, utterance-level meaning. Reference in this latter sense is where language and thought connect: language cannot be used except referentially, i.e. with words used so as to pick out objects, persons or events, which the thoughts expressed in the sentences are about. Though abstract poetry takes this idea to its limits, language never functions in the way that music, say, does. It does allow us to talk about fictions, yet only if these fictions are distinguished from reality and appropriately placed in relation to existing objects and a shared deictic frame relative to which fictions are recognized as such. Again, the absence of such anchoring in a shared space of reference would be a sign of pathology, as in delusional speech.
In line with this, reference in the normal (i.e. a declarative) sense has long been linked to language, given its essential absence in non-human primates (Butterworth 2003;Tomasello 2006;Tomasello & Call 2018), its close association with language development even in its nonverbal forms in humans (Iverson & Goldin-Meadow 2005;Colonnesi et al. 2010), and given its severe reduction or absence in non- or minimally verbal children with autism spectrum disorders (Maljaars et al. 2011;Slusna et al. 2018). Since, in turn, thought that was not expressible in language as used normally would not be thought of the same kind (but might be emotion, imagery, music, or pathological thought), it is arguable that language, thought, and reference are inseparable in humans, forming an integrated, single species-specific scheme, in which they are all co-dependent (Davidson 2004;Hinzen & Sheehan 2015;Hinzen 2017). From this point of view, it makes sense that language in FTD is seen to disproportionally disintegrate at this referential end, the level of grammatical complexity where thought becomes referentially anchored in speech. Referential language is unthinkable without thought; as is thought without reference.
Overall, then, we conclude that the present results suggest that, in formal thought disorder, language and thought disintegrate together: the language disintegration seen cannot be made sense of independently of the thought that language inherently conveys, nor can the thought disturbance be separated from specific linguistic dimensions and parameters in which it is manifest. Language and thought in this sense imply a conceptual distinction that ceases to be empirically meaningful. To be sure, data reported here and elsewhere (Cokal et al. 2018;Sevilla et al. 2018) do not rule out that referential anomalies seen in FTD might be due to some language-independent, currently unknown cognitive mechanism, in which case a language-thought dissociation would be re-vindicated. However, it seems unnatural to split the referential function of language off from language, when referentiality is intrinsic to all linguistic functioning and grammar is systematically sensitive to referentiality. Language cannot be used other than referentially, and it never resembles a system like music, where referential meaning of the same type is not seen. Moreover, reference as investigated here concerns a specific type of meaning arising configurationally, i.e. from an NP in a grammatically referential position within a structured utterance; and significant differences in the use of specific NP types are seen in the present results, as they were in other studies (Rochester & Martin 1979;Docherty et al. 1996;Docherty et al. 2003;Cokal et al. 2018;Sevilla et al. 2018). This suggests that language dysfunction in FTD should be studied at a linguistic level, though it is also true that language functioning is always integrated with other domain-general cognitive functions such as attention, executive functioning, or working memory.
Docherty (2012) in particular found significant correlations between "communication failures", which often relate to reference in the present sense, and measures of attention, working memory, and conceptual sequencing. Nonetheless, whether such mechanisms can illuminate the specific and differentiated linguistic pattern seen here is unclear. Current studies on pronoun resolution specifically find correlations between reference skills and executive functions (Hendriks et al. 2014;Sorace 2016;Ladányi et al. 2017), yet as noted, pronouns were not specifically more impaired than lexical nominals in the present study. A primary linguistic deficit in how grammar configures reference clearly remains an option to be considered in the neuropsychology of FTD. This would be consistent with current evidence from meta-analyses of neuroimaging studies about the neural correlates of FTD, which center on core language territory in the brain (Wensing et al. 2017;Cavelti et al. 2018), though interconnected with other cognitive functions.
A limitation of this study is that it lacks comparable data from a neurotypical control group using the same measures. The study focused on relative differences between error rates to profile a particular dataset of clinical speech. We therefore cannot assert to what extent the same types of errors would also be found in controls, nor whether a similar progression from the levels of Lexis and Morphosyntax to Argument structure and Referential errors might be seen there. Regarding absolute proportions of errors, some previous studies have found no differences in the proportion of syntactic errors between schizophrenia and control groups, while others have (e.g. Cokal et al. 2018). By contrast, both of these studies and many others have documented significant increases in referential errors in schizophrenia groups vs. controls, particularly in FTD. It is nonetheless empirically possible that, in controls, a significant difference between syntactic and referential errors could be found, too, though we are not aware of data on this. When comparing formal syntactic with referential errors specifically, a reason for a gradient of increased error proportions towards referential errors might be a greater cognitive demand imposed when language is put to a referential use, as opposed to merely being produced in a formally correct manner. Cokal et al. (2018) (Supplementary Materials) reported means of ratios of referential errors divided by total utterances to be .35 in a group with FTD and .11 in neurotypical controls. By contrast, means of ratios of syntactic errors were .11 in both schizophrenia groups and .07 in the control group, suggesting a much smaller gap. In the present study, the means were .28 for ratios of referential errors and about .05 for syntactic errors (including both argument structure and morphosyntactic errors). However, criteria of annotation were partially different in Cokal et al. (2018), the task was a picture description rather than free conversation, and referential errors certainly had a distinctively different quality and scale in the present group. Nonetheless, data from that study and the present one certainly suggest remarkably low rates of syntactic errors in both schizophrenia and control groups, despite the severity of FTD involved in the present study; and only slightly more elevated mean rates in referential errors in the controls of Cokal et al. (2018) relative to syntactic errors. In a recent study of an elderly Spanish-speaking neurotypical control group (Martínez-Ferreiro et al. 2017), participants (N = 15) produced only 2.3% of ungrammatical utterances; though different from a ratio of syntactic errors, this figure again suggests such errors to be relatively rare in neurotypical speech. Overall, it seems reasonable to conclude that compared to potential differences in (low) rates of syntactic errors between people with schizophrenia (whether with or without FTD) and controls, which may or may not exist, a wide gap opens at the other end of our spectrum of linguistic strata, i.e. in referential errors, with a steep slope of the gradient from syntactic to referential errors, particularly at the levels of severity of FTD studied here.
In sum, this study has revealed, for a rare corpus of severely thought-disordered speech and a conversational task, that the disorganization of thought in question affects the organization of language differently at different levels: proportionally the fewest anomalies are seen at the morphosyntactic and lexical levels, while proportions increase the moment that meaning is involved at a structural level (argument structure), which is still at least partially lexically driven by the meaning of the selecting verbal head. Anomalies peak when lexical meaning and argument structures are put to a referential use at the level of sentences and utterances, in a way that affects NPs in their referential uses in general, though within pronouns, distinctive patterns of differential impairment can also be seen. This result informs theories of the language-thought interface by showing that, and how, language is affected as thought disintegrates. Future work needs to confirm and fine-grain the referential anomalies seen in such speech, and determine the neural basis of the gradient of decline across the four strata distinguished here.

Additional Files
The additional files for this article can be found as follows: •