Going off the rails: Impaired coherence in the speech of patients with semantic control deficits

The ability to speak coherently, maintaining focus on the topic at hand, is critical for effective communication and is commonly impaired following brain damage. Recent data suggests that executive processes that regulate access to semantic knowledge (i.e., semantic control) are critical for maintaining coherence during speech. To test this hypothesis, we assessed speech coherence in a case-series of fluent stroke aphasic patients with deficits in semantic control. Patients were asked to speak about a series of topics and their responses were analysed using computational linguistic methods to derive measures of their global coherence (the degree to which they spoke about the topic given) and local coherence (the degree to which they maintained a topic from one moment to the next). Compared with age-matched controls, patients showed severe impairments to global coherence and milder impairments to local coherence, suggesting that semantic control deficits give rise to being “led up the garden path”, i.e., one sentence automatically cueing another, with the topic becoming increasingly less relevant to the original question. Global coherence was strongly correlated with the patients’ performance on tests of semantic control, with poorer ability to maintain top-down global coherence being associated with greater semantic control deficits. Other aspects of speech production were also impaired but were not correlated with semantic control deficits. This is the first study to investigate the impact of semantic control impairments within a naturalistic setting, and indicates patients with these impairments are likely to find maintaining focus in everyday conversation difficult.


Introduction
In TBI patients, for example, poor coherence is observed in the context of minimal impairments to micro-linguistic aspects of speech, such as phonology, lexical and syntactic processing (Coelho, 2002; Marini et al., 2014).
In these patients, however, correlations have been reported between performance on the Wisconsin card-sorting test and errors of local and global coherence (Marini et al., 2014) and more general impairments of narrative organisation (Coelho, 2002), suggesting that an executive deficit may underpin their difficulties in regulating their speech. Similarly, coherence deficits in frontotemporal dementia are present in patients with executive impairment but not in those with the non-fluent or semantic variants of the disorder (Ash et al., 2006). In healthy individuals, age-related coherence declines have also been attributed to impaired executive function (North et al., 1986; Gold et al., 1988; Kintz et al., 2016).
These results suggest that domain-general executive control processes are involved in the monitoring and selection of topics during speech. One view holds that poor coherence, at the global level in particular, results from a reduced ability to inhibit irrelevant information, such that people are less able to prevent irrelevant or off-topic ideas from becoming automatically activated and intruding into their discourse (Arbuckle and Gold, 1993; Marini and Andreetta, 2016). Recent work has developed this hypothesis by investigating the specific role of semantic control, defined as the top-down processes that control retrieval and selection of semantic knowledge (Jefferies, 2013; Lambon Ralph et al., 2017; Hoffman et al., 2018b). The role of semantic control has rarely been studied in the context of real-world conversation. In recent work, however, Hoffman et al. (2018a) hypothesised that semantic control processes are critically involved in maintaining coherence because semantic knowledge is central to all propositional speech. This is obviously true at the lexical level, since the selection of words for production is determined by their meanings. But at the broader "message" level, the content of our speech is also determined by our general semantic knowledge about the topic under discussion. For example, if someone asks you how they would catch a train somewhere, you must access stored semantic knowledge about the typical characteristics of railway stations, the methods for buying tickets and so on. To ensure coherent discourse, this knowledge retrieval process must be regulated such that topic-relevant knowledge is used to drive speech production and any irrelevant associations that come to mind are avoided (e.g., retrieving information about trains might bring to mind other forms of transport not relevant to the discussion). For these reasons, we predicted that the coherence of an individual's discourse is influenced by the efficiency of their semantic control processes, in addition to more general executive function. Hoffman et al. (2018a) tested these predictors in a group of 60 healthy young and older adults. In line with previous studies, we found that the coherence of speakers was predicted by their performance on a test of domain-general executive function (the Trails test). Importantly, however, we also found that semantic control abilities were a significant independent predictor of coherence. Participants who produced more coherent speech were more skilled at selecting task-relevant semantic information in a semantic judgement task.
The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission. This version posted February 11, 2020. https://doi.org/10.1101/2020.02.10.20020685
The results of Hoffman et al. (2018a) suggest that semantic control processes, particularly those governing the selection of task-relevant semantic knowledge, contribute to the maintenance of coherence during natural speech production. Deficits of semantic control may therefore be a key factor in explaining the coherence deficits frequently observed following brain damage. Here, we tested this hypothesis by investigating extended speech in a case-series of stroke aphasic patients who presented with clear deficits of semantic control.
We collected samples of patients speaking about different topics and determined their coherence using previously-validated methods from computational linguistics (Hoffman et al., 2018a). We predicted that patients would display impaired coherence relative to healthy controls, and that the severity of the semantic control impairment would predict the degree to which coherence was disrupted in individual patients. By assessing naturalistic speech production in patients with semantic control deficits for the first time, we aimed to establish how these deficits impact on patients' everyday conversational ability.

Method
Participants: Seven patients with deficits of semantic control were recruited from Yorkshire, Surrey, Sussex and Manchester community stroke groups. We refer to these as stroke aphasia (SA) patients. All had suffered a left-hemisphere stroke at least four years previously. In line with the inclusion criteria adopted by Jefferies and Lambon Ralph (2006), patients were selected to show difficulties accessing semantic knowledge in both verbal and non-verbal tasks. Two other SA patients attempted to take part in the study but were unable to complete the speech elicitation task. Their speech was considerably less fluent than that of the patients who successfully completed the task (on the Cookie Theft description, they produced 9 and 12 words per minute, cf. a mean of 61 words per minute in the patients who completed the task). We also included one right-hemisphere stroke patient as a control case. This individual suffered a right-hemisphere stroke but was not aphasic and exhibited no evidence of semantic deficits.
Age, years of education and time since stroke for all patients are reported in Table 1. All patients provided informed written consent to participate in this study. Ethical approval was obtained from the University of Surrey Ethics Committee (UEC/2016/090/FHMS). The study also received ethical approval from the NRES Committee Yorkshire and The Humber (ref: 12/YH/0323).
The characteristics of speech in the patients were compared with control data from healthy older adults, taken from Hoffman et al. (2018a). The control sample comprised all participants reported by Hoffman et al. (2018a) who were aged between 60 and 75 (N=12).
The control group were matched to the SA patients for age (SA mean = 63.0 years; control mean = 67.4 years; t(17) = 1.44, p = 0.17) and years of education (SA mean = 13.1 years; control mean = 13.5 years; t(17) = 0.31, p = 0.76). MRI scans were available for five of the seven SA patients (see Table 1 for description of lesion locations and Supplementary Figure 1 for scan images). Patients presented with damage to a range of left-hemisphere sites. This is in line with previous studies showing that similar semantic control deficits can arise from either left prefrontal or posterior temporal and parietal damage (Jefferies and Lambon Ralph, 2006; Noonan et al., 2010; Thompson et al., 2018).
Background neuropsychological tests: The SA patients were examined on a range of general neuropsychological assessments, including forwards and backwards digit span (Wechsler, 1987), dot counting and number location from the Visual Object and Space Perception (VOSP) battery (Warrington and James, 1991), the Coloured Progressive Matrices test of non-verbal reasoning (Raven, 1962), the Trail Making Test parts A and B (Reitan, 1992) and the Brixton Test of Spatial Anticipation (Burgess and Shallice, 1997).
Semantic processing was assessed using a number of standard tests. These included three components of the Cambridge 64-item semantic test battery (Bozeat et al., 2000): spoken word-picture matching using 10 semantically-related response options, picture naming, and the picture version of the Camel and Cactus Test (a four-alternative forced-choice test tapping semantic associations). The 96-item three-alternative forced-choice synonym judgement task was also run (Jefferies et al., 2009). The object use test (Corbett et al., 2011) involved selecting an object for a task (e.g., BASH A NAIL INTO WOOD). The target was either the canonical tool (e.g., HAMMER), or a non-canonical option (e.g., BRICK), presented among a set of five unrelated distractors. The Cookie Theft picture description test was run (Goodglass, 1983), giving a measure of verbal fluency for each patient. Additionally, category and letter fluency tasks were run, using eight categories (animals, fruit, birds, breeds of dog, household objects, tools, vehicles, types of boat) and three letters (F, A and S).
Tests of semantic control: Semantic control was assessed using a 2 x 2 manipulation of semantic control demands in two different tasks (Hoffman, 2018) (following Badre et al., 2005). In the first task, participants made semantic decisions based on global semantic association. They were presented with a probe word and asked to select its closest semantic associate from either two or four alternatives (see Supplementary Figure 2 for examples). The strength of relationship between the probe and target was manipulated: the associate was either strongly associated with the probe (e.g., town-city) or weakly associated (e.g., iron-ring). The Weak Association condition was assumed to place greater demands on semantic control, specifically the controlled retrieval of semantic information, as automatic spreading of activation in the semantic network would be insufficient to identify the correct response (Badre and Wagner, 2007).
In the second task, participants were asked to match items based on specific features. At the beginning of each block, participants were given a feature on which to base their decisions (e.g., colour). On each trial, they were given the name of an object and were asked to select another item that was most similar on the specified feature. We manipulated the semantic congruency of the probe and target. On Congruent trials, the probe and target shared a pre-existing semantic relationship, as well as matching on the currently relevant feature (e.g., cloud-snow are semantically related in addition to matching in colour). In contrast, on Incongruent trials the probe and target shared no meaningful relationship, other than their similarity on the specified feature (e.g., salt-dove are both typically white but otherwise semantically unrelated). Moreover, on Incongruent trials one of the foils had a strong semantic relationship with the probe, but did not match on the currently relevant feature (salt-pepper). Incongruent trials placed high demands on semantic control, and particularly on semantic selection processes, for two reasons. First, because there was no pre-existing semantic relationship between probe and target, participants could only identify the correct response if they focused selectively on the pre-specified features of the items and not on their other semantic properties. Second, participants were required to inhibit the strong but irrelevant relationship between the probe and foil.
Speech elicitation: Procedures for obtaining speech samples were based on those used by Hoffman et al. (2018a) with healthy participants, but were adapted for use with aphasic patients. Hoffman et al. presented participants with 14 different prompts to elicit speech on different topics. To avoid patient fatigue, we used only six prompts here:
1. Describe the steps you would need to take if going somewhere by train.
2. What do the police do when a crime has been committed?
3. Which is your favourite season and why?
4. Do you think it's a good idea to send people to live on Mars?
5. What sort of things do you have to do to look after a dog?
6. Why do people go to Scotland on holiday?
In the study with healthy participants, written prompts were presented on a computer screen and participants spoke about each one for a fixed period of 60s. Here, the prompts were read aloud by the researcher and no time limit was imposed on responses. We did not use a time limit because we expected speech rate to be reduced in our aphasic patients and therefore that they would produce significantly less speech than controls if only allowed to speak for 60s. Instead, patients were encouraged to speak for as long as they chose about each subject. If they provided little information spontaneously, the researcher encouraged them to elaborate with a neutral comment like "Can you say anything else about that?". To ensure that potential differences between groups in quantity of speech could not account for our results, we recorded the length of each response in number of words and included this as a covariate in all analyses (see Results). In addition, we performed a second analysis to control for differences in the time available to produce a response. To do this, we computed coherence values using only speech produced in the first 60s of each patient response (thus matching the time period to the control group). We then repeated our main analysis of coherence effects using these values (see Results).
Speech processing and computation of coherence: Spoken responses were digitally recorded for later transcription. Non-lexical fillers (umm, ah etc.) were not transcribed and pauses were not marked. However, all lexical items were transcribed, including lexical fillers (e.g., "like", "I mean").
We also included incomplete or aborted phrases ("We wanted to…we always went to the shops"), repetitions of words or phrases ("But it's not, it's not got air there") and comments about the task itself or the patients' language problems ("I don't know what the word is"). However, analyses that excluded such utterances produced very similar results to those reported here. Our main dependent variables were computed measures of global and local coherence (GC and LC), as described below. A number of other speech markers were also computed and were included in supplementary analyses (see Statistical Analyses).
Measures of GC and LC were generated using an automated computational linguistic approach, first described by Hoffman et al. (2018a). Analyses were implemented in R; the code is publicly available and can easily be applied to new samples (https://osf.io/8atfn/). Our approach made use of latent semantic analysis (LSA) (Landauer and Dumais, 1997). LSA provides the user with vector-based representations of the meanings of words, which can be combined linearly to represent the meanings of whole passages of speech or text (Foltz et al., 1998). Using similar methods to other researchers (Elvevag et al., 2007; Foltz, 2007), we used these representations to characterise the content of each speech sample and to quantify its coherence. Coherence calculated in this way has high internal reliability and test-retest reliability and is highly correlated with human ratings of coherence (Hoffman et al., 2018a).
The computation process is illustrated in Supplementary Figure 3. Our overall strategy was to divide each speech sample into smaller windows (of 20 words each) and to use LSA to generate vector representations of the semantic content of each window. Coherence was then assessed with a moving window approach. LC was assessed by measuring the similarity of the vector for each window with that of the patient's previous window. Therefore, in common with other researchers (Elvevag et al., 2007; Foltz, 2007), we define LC as the degree to which adjoining utterances convey semantically related content. A low LC value would be obtained if a patient switched abruptly between topics during their response. We used the cosine between vectors as the measure of their similarity, so a value of 0 indicates no semantic relationship between windows and 1 indicates identical content.
GC was assessed by comparing the semantic content of each window with a vector representing the prototypical semantic content produced by healthy participants in response to the same prompt. To generate this prototypical representation, we took all the responses made by healthy older adults in Hoffman et al. (2018a) and computed an LSA representation for each one. We then averaged these to give a composite vector that represented the typical semantic content that people produced when responding to the prompt (for further details of the LSA space and averaging procedure, see Hoffman et al., 2018a).
For example, the composite vector for the prompt "Describe the steps you would need to take if going somewhere by train" would be similar to the vectors for train, railway, ticket, station and so on, as these words were frequently used in responses to this prompt. The GC for each window was defined as the similarity between its vector and the composite vector. Therefore, GC was a measure of how much a patient's response matched the typical semantic content of responses to that prompt. A low GC value would be obtained if a patient tended to talk about other topics that were semantically unrelated to the topic being probed. Thus, our measure of GC captured the degree to which patients maintained their focus on the topic under discussion, in line with the definition used by other researchers (Glosser and Deser, 1992; Wright et al., 2014).
Once GC and LC had been calculated, the window moved one word to the right and the process was repeated, until all windows had been assessed. GC and LC values were averaged across windows to give overall values for each response.
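The moving-window procedure described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' R implementation: the toy two-dimensional word vectors stand in for a real LSA space, all function names are hypothetical, and the sketch assumes that the "previous window" used for LC is the immediately preceding, non-overlapping block of words.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def window_vector(words, word_vectors, dim):
    """Sum the vectors of the words in one window; unknown words contribute nothing."""
    total = [0.0] * dim
    for word in words:
        for i, value in enumerate(word_vectors.get(word, ())):
            total[i] += value
    return total


def coherence(words, word_vectors, prototype, window=20):
    """Return (mean GC, mean LC) for one response.

    GC: cosine between each window and the prompt's prototype vector.
    LC: cosine between each window and the window ending where it starts
        (an assumption about what "previous window" means).
    The window advances one word at a time, as in the text.
    """
    dim = len(prototype)
    n_windows = len(words) - window + 1
    if n_windows < 1:
        return None, None  # response too short to fill a single window
    vectors = [window_vector(words[s:s + window], word_vectors, dim)
               for s in range(n_windows)]
    gc_values = [cosine(v, prototype) for v in vectors]
    lc_values = [cosine(vectors[s], vectors[s - window])
                 for s in range(window, n_windows)]
    mean_gc = sum(gc_values) / len(gc_values)
    mean_lc = sum(lc_values) / len(lc_values) if lc_values else None
    return mean_gc, mean_lc


# Hypothetical toy space: "train" and "ticket" point one way, "dog" another.
toy_vectors = {"train": [1.0, 0.0], "ticket": [1.0, 0.0], "dog": [0.0, 1.0]}
toy_prototype = [1.0, 0.0]  # stands in for the averaged control-response vector
on_topic = coherence(["train", "ticket", "train", "ticket"], toy_vectors, toy_prototype, window=2)
off_topic = coherence(["train", "ticket", "dog", "dog"], toy_vectors, toy_prototype, window=2)
```

In this toy case the response that drifts from trains to dogs receives lower GC and LC than the response that stays on topic, which is the pattern the measures are designed to detect.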
Statistical analyses: Group analyses compared performance in the SA group (not including the non-aphasic RH patient) with the age-matched healthy control group. At the group level, analyses were performed using linear mixed effects (LME) models. Accuracy on the semantic control task was analysed using a generalised LME model with a logit link function. The model had a 2 x 2 x 2 (group x task x control demands) factorial structure and included random intercepts for participants and test items, as well as random slopes for all factors varying within-unit (Barr et al., 2013). The significance of fixed effects was assessed using a likelihood ratio test to compare the full model with a reduced model identical except for the removal of the effect of interest (Barr et al., 2013).
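The likelihood ratio comparison of a full and a reduced model can be illustrated as follows. This is a sketch under stated assumptions rather than the analysis code used in the study: the function name and the log-likelihood values are hypothetical, and the sketch assumes that the two models differ by a single parameter, so that the test statistic is referred to a chi-square distribution with 1 degree of freedom.

```python
import math


def likelihood_ratio_test_1df(loglik_full, loglik_reduced):
    """Likelihood ratio test for nested models differing by one parameter.

    Returns the test statistic and its p-value. For 1 degree of freedom the
    chi-square survival function equals erfc(sqrt(x / 2)).
    """
    statistic = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical log-likelihoods from a full model and a model without the effect of interest.
statistic, p_value = likelihood_ratio_test_1df(loglik_full=-100.0, loglik_reduced=-102.0)
```

The log-likelihoods themselves would come from fitting the two mixed effects models in a dedicated package; only the comparison step is shown here.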
To test for group differences in coherence, LME models were estimated separately for GC and LC values, with a factorial manipulation of group (SA vs. controls, again excluding the RH patient) and including response length as a control predictor (since longer responses may be more likely to deviate off-topic; Hoffman et al., 2018a). These models included random intercepts for participants and prompts and random by-prompt slopes for the effect of group.
Effects of group were assessed using a t-test with Satterthwaite approximation of degrees of freedom, as there was a smaller number of observations here, which could cause likelihood ratio tests to be anti-conservative (Huber et al., 2015). To determine whether the coherence scores of individual patients differed from the control group, we repeated the LME analyses, each time comparing the control group to a single patient. This method extends the modified t-test approach to the LME framework, allowing us to take item effects into account when testing for impairments (Huber et al., 2015). We also computed correlations between coherence measures and semantic task performance in the patient group.
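For readers unfamiliar with the modified t-test that the single-case LME analysis extends, the classical statistic (often attributed to Crawford and Howell) can be sketched as below. This is not the LME-based method used in the study, which additionally models item effects; the function name and the scores are hypothetical and serve only to show the arithmetic.

```python
import math
import statistics


def modified_t(case_score, control_scores):
    """Classical modified t-test for comparing one case with a small control sample.

    t = (case - control mean) / (control SD * sqrt((n + 1) / n)), with n - 1
    degrees of freedom, where n is the number of controls.
    """
    n = len(control_scores)
    control_mean = statistics.mean(control_scores)
    control_sd = statistics.stdev(control_scores)  # sample SD (n - 1 denominator)
    t_value = (case_score - control_mean) / (control_sd * math.sqrt((n + 1) / n))
    return t_value, n - 1


# Hypothetical scores: one patient compared with three controls.
t_value, df = modified_t(case_score=2, control_scores=[4, 5, 6])
```

The inflation term sqrt((n + 1) / n) treats the control mean and SD as sample estimates rather than population values, which makes the test more conservative than a z-score when the control group is small.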
Finally, we calculated seven other markers of the lexical-semantic properties of speech: the mean frequency, concreteness, age of acquisition, semantic diversity and length of words produced, type:token ratio and the proportion of closed-class words (see Supplementary Materials for full details). To analyse these data, we performed a principal components analysis to reduce the individual measures to four latent factors (following the same method as our previous studies; Hoffman et al., 2018a; Hoffman, 2019). Consistent with previous work, the four factors appeared to index specificity of semantic content, complexity of vocabulary, coherence and lexical diversity (see Table 4 for factor loadings). We performed LME analyses on the scores for each factor to determine whether patients differed from controls, and tested whether correlations with semantic performance were present.
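Two of the markers listed above, the type:token ratio and the proportion of closed-class words, are simple enough to illustrate directly. The sketch below uses the conventional definitions (distinct word forms divided by total words, and closed-class words divided by total words); the exact definitions used in the study are given in its Supplementary Materials and may differ, and the short closed-class word list here is a hypothetical stand-in for a full list.

```python
# Illustrative, deliberately incomplete list of closed-class (function) words.
CLOSED_CLASS = {"the", "a", "an", "and", "but", "or", "of", "to", "in", "on",
                "it", "is", "was", "i", "you", "he", "she", "we", "they", "that"}


def type_token_ratio(tokens):
    """Number of distinct word forms divided by the total number of words."""
    if not tokens:
        return 0.0
    lowered = [t.lower() for t in tokens]
    return len(set(lowered)) / len(lowered)


def closed_class_proportion(tokens, closed_class=CLOSED_CLASS):
    """Proportion of words that belong to the supplied closed-class list."""
    if not tokens:
        return 0.0
    return sum(t.lower() in closed_class for t in tokens) / len(tokens)


sample = "The dog chased the cat".split()
ttr = type_token_ratio(sample)                 # 4 distinct forms out of 5 words
closed = closed_class_proportion(sample)       # "the" occurs twice in 5 words
```

Because the type:token ratio falls as responses get longer, comparisons across speakers are only meaningful when response length is controlled in some way, which is one reason length is worth recording alongside these markers.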
Data availability: The data reported in this paper are available at https://osf.io/cvuqz/

Results
Background neuropsychological tests: Scores on the background neuropsychological tests are presented in Table 2. In all tables and figures, SA patients are ordered by the severity of their semantic control impairment (i.e., mean performance on the high control conditions of the semantic control task). SA patients showed evidence of multimodal semantic impairments on background testing. Semantic impairment in SA is typically accompanied by domain-general executive impairments (Jefferies and Lambon Ralph, 2006; Thompson et al., 2018) and this was also the case here. Patients frequently showed impairment on tests of general executive function, such as the Brixton test of spatial anticipation, the Trails test part B and Raven's progressive matrices. All SA patients performed below the normal range on at least one of these tests, with the exception of SA1 and SA2 (who also showed the most intact performance on tests of semantic control; see below). The right-hemisphere patient, RH1, performed well on the semantic tasks, displaying mild impairment only on the synonym judgement task.
Tests of semantic control: Semantic control was assessed using a forced-choice semantic judgement task with a 2 x 2 design that manipulated the type of semantic judgement (global vs. specific feature) and the need for controlled semantic processing. Accuracy on this task is presented in Figure 1. The data were analysed with generalised mixed effects models. SA patients were significantly impaired on the task as a whole, compared with the healthy control group (B = 0.58, se = 0.22, p = 0.004). There was a main effect of the semantic control manipulation (B = -0.95, se = 0.20, p < 0.001), indicating poorer performance in the high control conditions. Importantly, there was also a marginal interaction between control demands and group (B = -0.39, se = 0.19, p = 0.062), suggesting that manipulating the need for semantic control had a larger effect on the patients than on the controls. There was no effect of task and no interactions with other factors, suggesting that the two different manipulations of control in this experiment (controlled retrieval vs. semantic selection) had similar effects on our participants.
Speech elicitation: Basic information on the speech samples collected from patients is provided in Table 3. We presented either five or six prompts to each patient, depending on the time available for testing. For SA3, we were only able to analyse data from four prompts because in the other two cases she failed to produce enough words to provide reliable coherence data (7 and 17 words only). The mean duration of patients' responses varied between 62 and 115s (in contrast, a 60s time limit was imposed on controls). Despite this, patients produced fewer words on average than controls (t(17) = 2.69, p = 0.016) because they spoke at a much slower rate (t(17) = 4.69, p < 0.001). Examples of patients' responses are provided as Supplementary Materials. Patients appeared to deviate frequently from the topic about which they were asked, behaviour which we analyse formally in the next section.
Speech coherence: Our formal assessment of coherence involved using previously-validated computational linguistics methods to quantify the global coherence (GC) and local coherence (LC) of each response. Mean GC and LC values for each patient are shown in Table 3, with SA and Control means in Figure 2. LME analyses of these data (including length of responses as a covariate) revealed that SA patients were significantly less globally coherent than controls (t(8.4) = 4.78, p = 0.001) and, to a lesser extent, less locally coherent (t(12.5) = 2.21, p = 0.047).
At the level of individual patients, all patients except SA1 showed significantly impaired GC scores relative to controls (p < 0.05). SA2 and SA7 also exhibited LC scores that were significantly lower than the control group (p < 0.05).
Correlations between the coherence measures, response length and scores on the high-control semantic conditions are shown in Figure 3. There was a strong tendency for patients who performed better on the high-control semantic tasks to score more highly for GC in speech (Incongruent Features: r = 0.75, p = 0.03; Weak Associations: r = 0.55, p = 0.16).
In contrast, semantic control performance did not predict LC values, nor did it predict the average length of responses.
Other characteristics of speech: Finally, we computed a range of other speech markers to determine the degree to which semantic control impairment affected other aspects of speech content (see Supplementary Materials for details). Following the same method as previous studies (Hoffman et al., 2018a; Hoffman, 2019), we performed a principal components analysis on these data, which identified four factors corresponding to distinct aspects of speech production (see Table 4 for factor loadings):
1. Semantic specificity, the tendency to produce highly concrete, low frequency words with low semantic diversity
2. Complexity of vocabulary, the degree to which participants produced long, low frequency, late-acquired words
3. Coherence, which was strongly related to the GC and LC measures analysed above
4. Lexical diversity, the degree to which participants produced a wide range of different lexical items and open-class words.
Table 4 also shows the mean factor scores for each group and the effects of group, assessed using LME models of the same form used to analyse GC and LC. In addition to differing from controls on the latent coherence factor, SA patients also tended to use less complex vocabulary than controls and to exhibit less lexical diversity. Correlations between patients' performance on tests of semantic control and their scores on the latent speech factors are shown in Figure 4. The coherence factor showed the strongest correlations with semantic control ability, replicating the main analysis of GC. However, there was also a weaker tendency for patients with greater semantic control ability to use more complex vocabulary.
This suggests that semantic control ability influences the coherence of speech more than it does other aspects of speech content (in line with Hoffman et al.'s (2018a) findings in healthy participants).

Discussion
Maintaining coherence when speaking is a critical skill which is thought to depend on the ability to retrieve and select currently-relevant knowledge from semantic memory (known as semantic control). We investigated the coherence of connected speech in a case-series of aphasic stroke patients with deficits of semantic control. We predicted that poor control over their access to semantic knowledge would have detrimental effects on their coherence. In line with our prediction, patients showed gross impairments to global coherence: compared to controls, their spoken responses were less relevant to the topics we asked them to speak about. They showed milder deficits in local coherence, indicating that they were also less able to maintain a coherent topic from moment to moment when speaking. Moreover, we found that global coherence was correlated with performance on tests of semantic control, with the most impaired patients exhibiting the lowest global coherence values. As one would expect in an aphasic group, other aspects of speech production were also impaired but none showed such a strong correlation with semantic control abilities. Our findings provide converging support for the idea that semantic control processes are central to maintaining coherence in speech and indicate that poorly coherent speech is a major issue for patients with deficits of semantic control.
Our patients exhibited greater impairments to global coherence than to local coherence. In other words, they were mostly able to maintain semantic relationships between consecutive utterances, but they were poor at ensuring their utterances were related to the topic at hand.
The result was that their verbal output often had the character of an undirected stream of consciousness: they began coherently but over time, they drifted away from the topic they had been asked about (see Supplementary Materials for examples).
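The dissociation between preserved local coherence and declining global coherence can be illustrated with a toy sketch. This is not the analysis pipeline used in the study: the two-dimensional vectors, the use of cosine similarity, and the function names `global_coherence` and `local_coherence` are all assumptions made purely for illustration. Each utterance is represented as a hypothetical "semantic" vector that rotates a fixed 20 degrees away from its predecessor, mimicking a speaker whose every sentence is closely related to the last one but whose speech gradually leaves the topic.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def global_coherence(topic, utterances):
    """Similarity of each utterance to the topic (hypothetical measure)."""
    return [cosine(topic, u) for u in utterances]


def local_coherence(utterances):
    """Similarity of each utterance to the one before it (hypothetical measure)."""
    return [cosine(a, b) for a, b in zip(utterances, utterances[1:])]


# Hypothetical data: the topic vector, and four utterances that each
# rotate a further 20 degrees away from it (20, 40, 60, 80 degrees).
topic = (1.0, 0.0)
drifting = [
    (math.cos(math.radians(20 * k)), math.sin(math.radians(20 * k)))
    for k in range(1, 5)
]

gc = global_coherence(topic, drifting)
lc = local_coherence(drifting)

print("global:", [round(x, 2) for x in gc])
print("local: ", [round(x, 2) for x in lc])
```

In this sketch every consecutive pair of utterances is equally similar, so the local values stay constant, while similarity to the topic falls with each step. A chain of individually sensible associations is therefore enough to produce a large global deficit alongside a small local one.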
Our interpretation of this behaviour is that automatic semantic retrieval processes are relatively intact in these cases. When speaking, the patients successfully activate a chain of meaningful associations which drives their speech output. However, in order to maintain focus on the topic at hand, top-down control processes must act on the information retrieved from semantic memory, to ensure that information selected for production is relevant to the topic under discussion. Disruption to these selection processes appears to be responsible for the patients' global coherence deficits.
A distinction has often been drawn between two forms of semantic control: semantic selection of task-relevant knowledge vs. controlled retrieval of less salient knowledge (Badre and Wagner, 2007). Our account of the SA patients' coherence deficits holds that the selection element plays the key role in maintaining global coherence. This is consistent with our previous study in healthy participants (Hoffman et al., 2018a) and with an fMRI study, again in healthy older people, indicating that high coherence during speech is correlated with increased activation in the pars triangularis (BA45) region of left inferior frontal gyrus (Hoffman, 2019). This area is strongly implicated in semantic selection (Thompson-Schill et al., 1997; Badre et al., 2005; Badre and Wagner, 2007). In the present study, however, we found that tasks probing selection and controlled retrieval showed similar correlations with global coherence. SA patients typically show parallel impairments to both of these aspects of semantic control, making it difficult to tease apart the contributions of each in this patient group. However, the converging evidence across studies suggests that poor selection of semantic information is at the root of coherence impairments.
Another outstanding question is the degree to which the semantic selection processes we measured here overlap with executive functions in other cognitive domains. This is not an issue we were able to address directly in this study because patients did not complete comprehensive assessments of executive function in other domains. However, other studies in SA patients indicate that general executive deficits typically co-occur with, and are correlated with, semantic control impairments (Jefferies and Lambon Ralph, 2006). Furthermore, patients who present with a general dysexecutive syndrome also show the hallmarks of a semantic control disorder (Thompson et al., 2018). It is not clear to what degree these co-occurring deficits indicate a common functional system, or are simply a consequence of concomitant damage to distinct but neurally proximate systems. However, recent work in healthy individuals suggests that semantic selection ability patterns closely with performance on non-semantic executive tasks, while controlled retrieval is more closely related to semantic abilities (Hoffman, 2018). Thus, it is possible that inhibition of irrelevant semantic information is closely related to inhibitory functions in other cognitive domains.
For the present study, we deliberately used a relatively unconstrained speech elicitation task. Patients were given a topic to speak about and were required to structure their own response with minimal support from the experimenter. We chose this method to provide a strong test of patients' ability to regulate their own speech output. However, it is worth noting that SA patients show positive effects of increasing task constraints across a range of semantically-driven tasks and we would expect to see similar effects here (Noonan et al., 2010; Corbett et al., 2011). For example, in everyday life the patient's conversational partners may play an important role in providing verbal and non-verbal cues that direct them back towards topic-relevant discourse. There is a parallel here with studies in healthy older adults, in which coherence impairments are greatest when verbal topic prompts are used and less prominent when participants are asked to describe pictures or comic strips, where there is a visual reminder of the subject matter throughout (James et al., 1998; Wright et al., 2014).
One way to improve coherence in this patient group may therefore be to provide external cues that maintain their focus on the current topic.
In conclusion, the unique contribution of this research is to show the nature of the impairments to everyday conversation that semantic control deficits bring. In particular, SA patients have difficulty maintaining global coherence: their unregulated retrieval of semantic information leads them "off the rails" and progressively away from the topic at hand. A better understanding of these processes will be critical in uncovering the root causes of conversational difficulties experienced by aphasic patients and will be useful in guiding rehabilitation in hospitals, clinics and the home.
medRxiv preprint doi: https://doi.org/10.1101/2020.02.10.20020685; this version posted February 11, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission.