High-level language outcomes three and twelve months after awake surgery in low-grade glioma and cavernoma patients

OBJECTIVE
Knowledge about the long-term outcome of high-level language ability in awake surgery patients with low-grade gliomas or cavernomas in language eloquent regions is limited, particularly regarding subtle changes in high-level language abilities.


PATIENTS AND METHODS
The study group consisted of 27 patients with LGG or cavernoma involving language eloquent regions in the left hemisphere. A comprehensive assessment battery was used to target subtle changes in overall high-level language ability as well as in language subskills. Assessments were made preoperatively and at 3 and 12 months postoperatively.


RESULTS
Overall high-level language ability had not decreased significantly at group level at 3 or 12 months postoperatively. The proportions of patients with a decline of 5% or more at the 3- and 12-month follow-ups were 13% and 9%, respectively. There was a marked decline in semantic fluency (animals and verbs) at 3 and 12 months postoperatively. Phonemic fluency, while not significantly reduced at 3 months, improved markedly in the interval between 3 and 12 months. At 12 months, the only significant decline relative to preoperative scores was seen in semantic fluency for animals and verbs. Verbal cognitive speed did not decline significantly postoperatively, but approximately 40% of the patients had a decline of 5% or more at 12 months.


CONCLUSIONS
Overall high-level language ability was not significantly affected postoperatively at 3 and 12 months in LGG and cavernoma awake surgery patients. Semantic word fluency had deteriorated at the 3- and 12-month follow-ups. Taken together, our results indicate a decline in processing speed of verbal material postoperatively in the patient group.


Introduction
Patients with low-grade gliomas (LGG) in the left hemisphere of the brain, in or near regions critical for language processing, are at risk of permanent postoperative language deficits [1,2]. One method used to reduce that risk is to monitor language functions during the critical part of the operation with the patient awake, so-called awake surgery [e.g. 3,4,5]. The common procedure in awake surgery is a sleep-awake-sleep method, in which the patient is asleep during the craniotomy, awake during the tumor resection, and asleep during the closing of the skull. While the patient is awake, language functions are mapped by direct electrical stimulation (DES) of the brain tissue in and around the area of the surgery while the patient performs one or several language tasks. This allows the surgeon to plan the best way to initiate the resection and to continuously monitor the effects of the surgery on language function. The method allows language eloquent brain regions to be mapped, thus permitting surgery to be performed with a significantly reduced risk of impairment of language functions [6][7][8].
The proportion of patients operated in the left hemisphere who are affected by aphasia postoperatively is estimated to be 30-50% [9]. The typically reported trend is an early decline in language function that has, for the most part, recovered within three months. Long-term neurological deficits have been reported to be significantly less frequent in patients operated awake [10]. For the majority of the affected patients in this group the language deficits are reported to be relatively mild, and in most cases they have recovered after three months [11][12][13].
However, several studies have pointed out that there is a lack of data describing subtle changes in language functions postoperatively [13][14][15][16][17]. To detect subtle changes, the test methods need to be sensitive enough and provide adequate levels of difficulty to avoid ceiling effects [16,17]. Traditional aphasia tests are in general designed to screen for obvious symptoms that are often seen in acute cerebrovascular incidents, e.g. stroke. In tumors with slow growth in language eloquent cortical areas, such as LGGs, language symptoms may be much less pronounced due to brain plasticity [e.g. 18], and it is therefore necessary to use test methods that are sufficiently sensitive to detect even subtle symptoms or changes postoperatively. Another important aspect is that performance on language tests may have decreased postoperatively but still be within the normative range [16,19]. If results within the normal range are relied upon as a criterion for good outcome, there is a risk that subtle declines in language functions that may be of critical importance for the patient in their work and everyday life are not detected. Some recent studies have used test batteries designed to detect subtle postoperative changes in language functions [16,17,20,21]. In summary, the language functions covered in these studies were language comprehension (instructions, text passages etc.), word fluency (semantic and phonological), picture naming, rapid automatic naming, verbal repetition (sentences, non-words), reading ability (accuracy, speed, and comprehension), and writing ability (spelling). In these studies the patients were assessed up to 3 months postoperatively. In two studies, language outcomes were examined after 12 months, but they were not designed to investigate subtle changes [15,22]. Knowledge about the long-term outcome of language functions is thus limited.
Consequently, the aim of this study was to investigate language functions in awake-surgery patients not only preoperatively and at 3 months postoperatively, but also at 12 months. A comprehensive assessment battery was used to target changes in language functions, including subtle differences in high-level language abilities on long-term follow-up. An additional aim of this study was to report standardized effect size estimates for the changes found, which have generally been lacking in past studies, and to report the proportions of patients with changes.
The specific research questions were:
- Are there postoperative changes in language functions in awake-surgery patients after three and twelve months, respectively?
- What are the most important changes in language functions in terms of effect size?
- What proportion of patients showed a change in language functions?

Participants
Patients registered in the Department of Neurosurgery at the Karolinska University Hospital in Stockholm between February 2015 and November 2018 with probable LGG or cavernoma involving language eloquent regions in the left hemisphere [23] were considered for awake surgery. The decision that they were eligible for awake surgery was taken during a therapy round with a neurologist, a neuroradiologist and a neurosurgeon. Patients were included in the study cohort if they were fluent in Swedish, had not undergone previous brain surgery, and did not have aphasia prior to the surgery. During the study 30 patients were offered the option of awake surgery and all of them accepted. However, two patients were operated subacutely due to suspected risk of malignant transformation of the tumor, and for one patient awake surgery was contraindicated for medical reasons. The final study cohort thus consisted of 27 patients. At the language assessment at 12 months postoperatively, 2 patients declined the assessment and 1 patient twice failed to attend the scheduled appointment. Demographic data for each participant in the study group and the corresponding tumor characteristics are shown in Table 1 [for the definition of eloquent area referred to in the table, see 23].

Language monitoring procedures
To localize language eloquent areas all the patients were examined preoperatively with navigated Transcranial Magnetic Stimulation (nTMS). The nTMS data of cortical localization of motor functions and of language functions were used initially in the planning of the surgery and during the surgery. The pictures that the patient had safely named during the baseline of the nTMS assessment were used in the language monitoring during the awake part of the operation.
The awake surgery procedure applied was sleep-awake-sleep. Language monitoring during the awake part of the surgery consisted of picture naming during direct electrical stimulation (DES) [24]. DES was applied by a bipolar electrode for 3 s (57 Hz, amplitude cortically 3−4 mA, and subcortically 3−6 mA). A speech-language pathologist familiar with each patient's language ability from the nTMS assessment and the preoperative language assessment gave the surgeon continuous feedback on the patient's performance during the awake part of the surgery. Performance errors were reported to the surgeon as "one", a possible error (e.g. slight hesitation or latency) or "two", a definite error (e.g. long latency, speech arrest or semantic error). An initial cortical mapping of language function was performed prior to tumor resection. The nTMS points were stimulated as well as an extensive area covering the tumor and surrounding cortical area. Cortical locations with speech errors during two out of three stimulations were considered eloquent and marked. Continuous clinical assessment (picture naming of the nTMS pictures, spontaneous conversation about the or on topics known to be of specific interest for the patient) was performed and alternated with cortical and subcortical mapping during resection of the tumor. The tumor resection was continued to completion or stopped when it was required to preserve language function.

Data collection
The medical data were based on radiological examinations (MRI and DTI), operative records, and medical records of postoperative treatment such as radiation and/or adjuvant chemotherapy. Language data were collected preoperatively and at three and twelve months postoperatively by an experienced speech-language pathologist. The mean time interval between the preoperative language assessment and the surgery was 3 weeks (SD 2.7), the mean time interval between surgery and the three-month assessment was 3.4 months (SD 0.8), and the mean time interval between surgery and the twelve-month assessment was 12.3 months (SD 0.7). The language tests used in the assessment were selected to cover a broad spectrum of language functions at sufficiently high levels of difficulty: confrontation naming, phonemic and semantic word fluency, defining the meaning of words, verbal cognitive speed, comprehension of instructions, of logico-grammatical sentences and of ambiguous sentences, ability to construct sentences, and reading speed. Information on the test materials used in these assessments is shown in Table 2.
To obtain an overall measure of high-level language skills, BeSS, a test battery designed to target subtle language disorders was used [25]. It includes seven subtests with ten items each: Repetition of long sentences, Recreating sentences, Making inferences, Comprehension of logico-grammatical sentences, Comprehension of ambiguous sentences (lexically and syntactically), Comprehension of metaphors, and Word definitions [26].
BeSS has been validated in patients with Parkinson's disease, multiple sclerosis and LGG [20,21,25,27]. Unlike BeSS, all the other language tests in this study examine specific language skills. Results are therefore reported for each of these tests. Each test session took approximately two and a half hours with a short break. The tests were always presented in the same order (as listed in Table 2).

Data analysis
Language data from the pre- and postoperative language assessments were compiled and analyzed. The Wilcoxon signed-rank test was used to study differences between pre- and postoperative performance at group level on each test. Effect size was estimated as r = Z/√N [28]. These analyses were based on results for which there were data from all three assessments on each test. The level of significance was Bonferroni-corrected for three repeated comparisons to p ≤ 0.017 (preop. to postop. 3 months; preop. to postop. 12 months; and postop. 3 months to postop. 12 months).
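As a rough illustration of the group-level analysis described above, the following Python sketch computes a Wilcoxon signed-rank Z statistic, the effect size r = Z/√N, and a two-sided p value that is compared against the Bonferroni-corrected threshold of 0.05/3 ≈ 0.017. The function names, the example scores, the use of the normal approximation without continuity correction, and taking N as the number of patients are all assumptions made for illustration; the study does not state which software or conventions were used.

```python
import math

ALPHA_BONFERRONI = 0.05 / 3  # three pairwise comparisons, approx. 0.017


def signed_rank_z(pre, post):
    """Wilcoxon signed-rank Z (normal approximation, no continuity correction).

    Zero differences are dropped and tied absolute differences receive
    average ranks; both conventions are assumed here for illustration.
    """
    diffs = [b - a for a, b in zip(pre, post) if b != a]
    n = len(diffs)
    if n == 0:
        return 0.0
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        t = j - i + 1
        tie_term += t**3 - t  # tie correction for the variance
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean_w = n * (n + 1) / 4
    var_w = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    return (w_plus - mean_w) / math.sqrt(var_w)


def two_sided_p(z):
    """Two-sided p value from the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


def effect_size_r(z, n):
    """Effect size r = |Z| / sqrt(N); N is assumed to be the number of patients."""
    return abs(z) / math.sqrt(n)


# Hypothetical preoperative and 12-month scores for eight patients.
pre = [24, 20, 27, 18, 22, 25, 19, 23]
post = [23, 18, 24, 14, 17, 19, 12, 15]

z = signed_rank_z(pre, post)
p = two_sided_p(z)
r = effect_size_r(z, len(pre))
significant = p <= ALPHA_BONFERRONI
print(f"Z = {z:.2f}, p = {p:.4f}, r = {r:.2f}, significant = {significant}")
```

With small samples such as the present one, an exact test may give a somewhat different p value than the normal approximation, so the sketch should be read as a description of the logic rather than a reproduction of the reported statistics.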
To comprehensively explore changes in language abilities, we also analyzed data at the individual level. For this purpose, the results of each individual on each language test were categorized into performance levels on the preoperative assessments, and into levels of change postoperatively in relation to preoperative performance. Preoperative data were categorized into performance levels based on test norms or cut-off guidelines for each test. For AQT, Boston Naming Test (BNT), Months Backwards, BeSS, FAS, Animals, Verbs, and Token Test the performance categories were: −1 SD and above, below −1 to −2 SD, below −2 to −3 SD, and below −3 SD. For TROG the performance categories were: 20−17 correct blocks, 16−14 blocks, and 13 and below. For LS and DLS the performance categories were: stanine ≥4, stanine 3, 2, and 1.
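The preoperative performance levels described above can be sketched in Python as follows. The function names and the example norm values are hypothetical, the handling of scores that fall exactly on a band boundary is an assumption, and scores are assumed to be scaled so that higher values mean better performance.

```python
def sd_band(score, norm_mean, norm_sd):
    """Band for tests with SD-based norms (e.g. BNT, BeSS, FAS, Token Test).

    Assumes higher scores are better; boundary values are assigned to the
    better band, which is an illustrative choice.
    """
    z = (score - norm_mean) / norm_sd
    if z >= -1:
        return "-1 SD and above"
    if z >= -2:
        return "below -1 to -2 SD"
    if z >= -3:
        return "below -2 to -3 SD"
    return "below -3 SD"


def trog_band(correct_blocks):
    """Band for TROG, based on the number of correct blocks (maximum 20)."""
    if correct_blocks >= 17:
        return "20-17 blocks"
    if correct_blocks >= 14:
        return "16-14 blocks"
    return "13 blocks and below"


def stanine_band(stanine):
    """Band for LS and DLS, based on stanine scores (1 to 9)."""
    if stanine >= 4:
        return "stanine >= 4"
    return f"stanine {stanine}"


# Hypothetical example: a score of 35 against a norm mean of 50 (SD 10).
print(sd_band(35, 50, 10))
print(trog_band(15))
print(stanine_band(2))
```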
To illustrate changes in performance postoperatively, at 3 and 12 months, in relation to the preoperative assessment on each test, data were categorized by performance change in percent compared with the corresponding result preoperatively: performance increase > 10%, 5-10%, no change, performance decrease 5-10%, 11-20% or > 20%. The categories from both pre- and postoperative data were then color-coded and compiled into corresponding figures (Figs. 1 and 2).

Table 2. Language tests used in the assessments.
Test [reference]; language function; task
[3,4]; Vocabulary and word retrieval; Confrontation naming of 60 pictures
Months Backwards Test (MBT) [5]; Verbal cognitive speed; Reciting the months of the year backwards
Token Test [6,7]; Comprehension of instructions; Execute instructions (36) of increasing complexity with tokens combined of 2 different sizes, 2 different shapes, and 4 different colors
BeSS [8]; High-level language functions; Seven subtests with ten items in each subtest: 1) Repetition of long sentences, 2) Recreating sentences from three given words with a defined context (syntactically and semantically correct), 3) Making inferences from short text passages, 4) Comprehension of logico-grammatical sentences, 5) Comprehension of ambiguous sentences, 6) Comprehension of metaphors, 7) Defining the meaning of words
FAS [9]; Phonemic word fluency; Generate words with initial letters f, a and s in 1 min
Animals [9]; Semantic word fluency; Generate words of the category animals in 1 min
Verbs [9]; Semantic word fluency; Generate words of the category action verbs in 1 min
AQT [10]; Verbal cognitive speed; Rapid naming of form and color of 40 items, combined of 4 different geometrical shapes and 4 different colors
DLS [11]; Reading speed; Reading informative text with regularly recurring brackets in which a matching word (1/3) should be underscored (time limit 4 min)
LS [12]; Vocabulary; Word comprehension; matching a given word with 1/
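The percent-change categories used for the individual-level analysis, and the proportion of patients with a decline of 5% or more, can be sketched as below. The function names and scores are hypothetical. Two points are assumptions rather than details given in the text: how values between the listed ranges are assigned, and the sign flip for timed tests, where a longer completion time is taken to mean poorer performance.

```python
def performance_change_pct(pre, post, higher_is_better=True):
    """Percent change in performance relative to the preoperative score.

    For timed tests (assumed: lower is better) the sign is flipped so that a
    negative value always means a decline. A preoperative score of zero is
    not handled in this sketch.
    """
    change = 100 * (post - pre) / pre
    return change if higher_is_better else -change


def change_category(pct):
    """Map a percent change onto the six categories described in the text."""
    if pct > 10:
        return "increase > 10%"
    if pct >= 5:
        return "increase 5-10%"
    if pct > -5:
        return "no change"
    if pct >= -10:
        return "decrease 5-10%"
    if pct >= -20:
        return "decrease 11-20%"
    return "decrease > 20%"


def proportion_declined(pre_scores, post_scores, threshold=5.0,
                        higher_is_better=True):
    """Share of patients whose performance declined by `threshold` % or more."""
    changes = [
        performance_change_pct(a, b, higher_is_better)
        for a, b in zip(pre_scores, post_scores)
    ]
    return sum(c <= -threshold for c in changes) / len(changes)


# Hypothetical fluency scores (number of words produced) for four patients.
pre_scores = [20, 20, 20, 20]
post_scores = [17, 20, 23, 14]
for a, b in zip(pre_scores, post_scores):
    print(a, b, change_category(performance_change_pct(a, b)))
print(proportion_declined(pre_scores, post_scores))
```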

Results
The mean results at group level for each of the language tests pre- and postoperatively are displayed in Table 3, and the results of the inferential analyses are displayed in Table 4. There was no significant change in overall high-level language ability (BeSS) at 3 or 12 months postoperatively (Table 4), and the proportion of patients with a decline in performance of 5% or more was 13% and 9%, respectively (Fig. 2). There was a significant decline in semantic fluency (Verbs and Animals) at 3 and 12 months postoperatively (Table 4), and the proportion of patients with a decline of 5% or more was 71-81% (Fig. 2). Phonemic fluency (FAS), while not significantly reduced at three months, improved markedly in the interval between 3 and 12 months (Table 4), and the proportion of patients with a decreased performance fell from 59% to 23% (Fig. 2). There were no significant changes in outcome scores on MBT and AQT at 3 or 12 months postoperatively, but 39-45% of the patients had a decreased performance at 3 and 12 months. Reading speed (DLS) was not significantly slower than preoperatively, but 20% of the patients had declined at 3 and 12 months postoperatively. At 12 months, the only significant decline relative to preoperative scores was seen in semantic fluency for animals and verbs. A very small but significant increase in word comprehension (LS) and in grammar (TROG) was seen at 3 months postoperatively. Finally, the effect sizes of significant outcome variables ranged from moderate to strong [29] (Table 4).

Discussion
The aim of this study was to examine postoperative language abilities in patients with LGGs or cavernomas three and twelve months after awake neurosurgery. Of particular interest was to investigate occurrence of subtle impairments in language abilities postoperatively, and to estimate the effect sizes of any observed changes.
Overall high-level language ability, assessed by BeSS, had not decreased significantly at group level at three or twelve months postoperatively. At the individual level (displayed in Fig. 2), a small proportion of the patients had decreased by 5% or more at the 3- and 12-month follow-ups (13% and 9%, respectively). In a recent study, high-level language was assessed with BeSS three months postoperatively in patients with LGGs in language eloquent areas in the left hemisphere, and the results were comparable to those that we found at three months [20].
Semantic fluency for both nouns (Animals) and action verbs had decreased significantly at group level after three months.
Word comprehension and comprehension of grammar had improved significantly. However, the clinical significance of these findings is likely to be negligible, as they reflect a very small difference in raw scores at a level close to the ceiling on both tests.
At twelve months, the only significant decrements relative to preoperative scores were seen in semantic fluency for verbs and animals. The strongest effect size was found for verbs (r = 0.77). The proportions of patients who had a decline of 5% or more postoperatively at 3 and 12 months were 71% and 81%, respectively, for verbs, and 71% and 76%, respectively, for animals. The results thus show that awake surgery patients with LGGs and cavernomas commonly have a decline in semantic fluency postoperatively, and that it persists up to 12 months postoperatively, with the strongest long-term decrement for verbs. Studies of healthy individuals have shown that verb fluency performance has unique variance unrelated to Animals and phonemic fluency, and that it is primarily linked to vocabulary and speed of information processing [30]. Since there was no change on the two measures of vocabulary postoperatively in our patients (BNT and LS), the finding by Ross et al. [30] may suggest that the marked deterioration of verb fluency we found reflects a decline in verbal processing speed. Phonemic fluency, while not significantly reduced at three months, improved in the interval between three and twelve months. The effect size of this improvement was strong (r = 0.65). The corresponding proportions of patients with a decline in phonemic fluency of 5% or more at the 3- and 12-month follow-ups were 59% and 23%, respectively, which further illustrates the improvement that took place.

Fig. 2. Postoperative performance, 3 and 12 months, on each language test at group level. Performance changes in relation to preoperative results are color coded into six categories: purple, 11% improvement or more; blue, 6-10% improvement; green, no change (±5%); yellow, 6-10% decline; orange, 11-20% decline; red, 21% decline or more.
A common finding reported in earlier studies is reduced performance at three months or earlier postoperatively [16,31]. Since most studies have not reported follow up at twelve months, the recovery between 3 and 12 months that we found for phonemic fluency is interesting. To our knowledge a catch-up specifically of phonemic fluency postoperatively at 12 months or longer, and in contrast to semantic fluency, has not been reported earlier.
A large proportion of patients declined at 3 and 12 months postoperatively on the two tasks that specifically measure verbal cognitive speed, MBT and AQT. The inferential analyses of these results were not significant, possibly due to the very high variability in performance. However, the proportions of patients with a decline of more than 5% in their scores at 3 and 12 months were 49% and 40% for MBT, and 39% and 39% for AQT, respectively. Both MBT and AQT have been widely used to assess verbal cognitive speed in neurological patients (e.g. [32][33][34]), but to our knowledge there are no studies of the performance on these tests in the present patient group.
It is noteworthy that a large proportion of patients had a decline in verbal cognitive speed (MBT and AQT) and in semantic word finding abilities (Verbs and Animals), skills linked to executive functions. Verbal fluency tests are commonly used to identify deficits in executive functions in neurological patients [e.g. 35], and verbal cognitive speed subserves and plays an integral role in executive functions [36]. Our results indicate that there is a postoperative effect on processing speed and that this may also influence aspects of executive functions, a hypothesis which, however, was not specifically investigated in the study.
Evaluating the clinical significance of the reported results in the patients' everyday lives was not within the scope of this study. However, it may be noted that in general the patients did not report any marked changes in their language abilities when interviewed at the follow-ups. A majority of them reported that word finding fluency had deteriorated, but usually not to the extent that conversation was impeded. Some patients with verbally demanding professions reported that, to some extent, the deterioration in word finding affected their ability to perform their work.
Two methodological aspects that should be considered are ceiling effects and learning effects. There were possible ceiling effects in sentence comprehension (TROG) and in comprehension of instructions (Token Test). Although the results did not show any negative outcome in these tests, the mean performance was close to maximum, and it is possible that a higher level of difficulty in the design of these tests might have revealed changes that the current test versions could not detect. Some degree of learning effects cannot be ruled out, since there are no test-retest data available for most of the tests employed in the study. Learning effects on word fluency tests have been investigated and a small effect is reported on executive functions but not on phonemic fluency over a 12-month interval [37], and very small learning effects on word fluency have been reported at test-retest intervals of one month [30].
Mental fatigue (MF) is a common co-occurring symptom following various forms of brain damage such as tumors, stroke and TBI, and it is linked to various forms of cognitive problems including slow processing speed [38]. In this study MF was not assessed but it should be noted that in the informal interviews postoperatively with the patients in our study group it was commonly reported that MF to some degree had influenced their quality of life and their ability to return to work. We believe that in future studies it would be valuable to include assessment of MF to get a more complete picture of postoperative outcome in this patient group.

Conclusion
In summary, the results of this study show that high-level language skills are not significantly affected at 3 and 12 months postoperatively in LGG and cavernoma patients who are operated on awake. Semantic word fluency had deteriorated at the 3- and 12-month follow-ups, whereas phonemic word fluency showed a non-significant decrease at 3 months that had recovered at 12 months. The results for semantic fluency and verbal cognitive speed taken together indicate a decline in processing speed of verbal material postoperatively in the patient group.

Ethical approval
Application approved 10 March 2017 by the Stockholm Ethical Review Authority (no. 2017/592-31).

Table 4. Statistical differences between language test results for corresponding patients pre- and postoperatively at 3 and 12 months, and between 3 and 12 months postoperatively (Wilcoxon signed-rank test; significance level Bonferroni-corrected, p < 0.017). Arrows indicate direction of change in performance. Effect sizes (r) are written within parentheses.


Declaration of Competing Interest
No competing interests.