Anatomy of phonemic and semantic fluency: A lesion and disconnectome study in 1231 stroke patients

Disturbances of semantic and phonemic fluency are common after brain damage, as a manifestation of language, executive, or memory dysfunction. Lesion-symptom mapping (LSM) studies can provide fundamental insights in shared and distinct anatomical correlates of these cognitive functions and help to understand which patients suffer from these deficits. We performed a multivariate support vector regression-based lesion-symptom mapping and structural disconnection study on semantic and phonemic fluency in 1231 patients with acute ischemic stroke. With the largest-ever LSM study on verbal fluency we achieved almost complete brain lesion coverage. Lower performance on both fluency types was related to left hemispheric frontotemporal and parietal cortical regions, and subcortical regions centering on the left thalamus. Distinct correlates for phonemic fluency were the anterior divisions of middle and inferior frontal gyri. Distinct correlates for semantic fluency were the posterior regions of the middle and inferior temporal gyri, parahippocampal and fusiform gyri and triangular part of the inferior frontal gyrus. The disconnectome-based analyses additionally revealed phonemic fluency was associated with a more extensive frontoparietal white matter network, whereas semantic fluency was associated with disconnection of the fornix, mesiotemporal white matter, splenium of the corpus callosum. These results provide the most detailed outline of the anatomical correlates of phonemic and semantic fluency to date, stress the crucial role of subcortical regions and reveal a novel dissociation in the left temporal lobe.


Introduction
Verbal fluency is defined as the ability to generate unique verbal responses according to a specified criterion, typically including either a semantic category (i.e., semantic fluency) or starting with a specific phoneme (i.e., phonemic fluency) (Schmidt et al., 2017). Cognitive processes involved in both types of fluency are executive processes such as energization (i.e., the ability to generate non-overlearned responses), selfmonitoring, attention and working memory, and language and memory (Baldo, SChwartz, Wilkins, & Dronkers, 2006;Biesbroek et al., 2016). A crucial distinction lies in the differential involvement of the mental lexicon network, since semantic fluency more strongly engages semantic memory, whereas phonemic fluency engages phonological memory (Baldo et al., 2006;Biesbroek et al., 2016). Furthermore, strategies applied for an effective memory search might be different for both types of fluency, which could result in differential involvement of executive processes (Biesbroek et al., 2016).
Reflecting these shared and distinct cognitive components, semantic and phonemic fluency have shared and distinct anatomical correlates (Baldo et al., 2006;Biesbroek et al., 2016;Birn et al., 2010;Costafreda et al., 2006;Henry & Crawford, 2004). However, even though the anatomy underlying phonemic and semantic fluency has been extensively studied, many aspects of their anatomical underpinnings remain controversial. A widely accepted notion is that both phonemic and semantic fluency are associated with a large left hemispheric frontotemporal network, and that phonemic fluency depends more strongly on left frontal regions, whereas semantic fluency depends more strongly on left temporal regions (Baldo et al., 2006;Biesbroek et al., 2016;Costafreda et al., 2006;Henry & Crawford, 2004;Schmidt et al., 2019). However, some of the findings of prior lesion and functional MRI (fMRI) studies conflict regarding the exact regions within the frontal and temporal lobes that are involved in either type of verbal fluency, regarding the involvement of subcortical, parietal and occipital regions, and regarding the potential contribution of the right hemisphere (see Table 1 for a summary of findings from previous lesion studies and Table 2 for a summary of the corresponding methodological designs).
An important cause of inconsistent results across prior lesion studies is their limited sample size. This is crucial in lesion studies, because only regions that are damaged in a sufficient number of patients can be included in the analyses. Achieving sufficient "brain lesion coverage" remains difficult even in studies with hundreds of subjects .
Consequently, each prior lesion study on verbal fluency has to some degree focused on a subset brain regions, reflecting the inclusion criteria and sample size of the study cohort, thus ignoring potential contributions of undamaged brain regions. Furthermore, in relatively small samples that do allow for the inclusion of a specific region in the analysis, the statistical power for detecting associations between lesions and verbal fluency may still be limited. The largest lesion study in verbal fluency to date included 191 patients and the remaining studies all included <100 patients (see Supplementary Table  A.1 for an overview). Thus, a large lesion study that covers nearly the entire brain and optimizes statistical power is warranted to obtain comprehensive knowledge on the anatomy underlying verbal fluency. Furthermore, methods to map not only the location of the lesion itself, but also the anatomical network that is disrupted by the lesion, have recently emerged and might provide complementary information (Foulon et al., 2018;Fox, 2018). To the best of our knowledge, an anatomical disconnection study has not yet been performed for verbal fluency.
The aim of the current study is to determine the anatomical correlates of semantic and phonemic fluency by performing a multicenter lesion-symptom mapping study in 1231 unselected ischemic stroke patients using multivariate support vector regression-based lesion-symptom mapping. A complementary disconnection-based approach, in which not the lesion itself but the anatomical network that is affected by the lesion is mapped, is applied to identify the subcortical networks involved in verbal fluency. The scale of this study ensures that nearly the entire brain can be included in the analyses instead of focusing on prespecified areas of interest.

Study population
Patients were selected from the Hallym vascular cognitive impairment (VCI) and Bundang VCI cohorts (Lim et al., 2014;Yu et al., 2013). A flowchart is provided in Fig. 1. All patients were admitted to either Hallym University Sacred Heart Hospital or Seoul National University Bundang Hospital (Republic of Korea) with acute ischemic stroke between January 2007 and December 2018. 1591 patients were initially selected based on the following inclusion criteria: (1) availability of brain magnetic resonance imaging (MRI) showing the acute symptomatic infarct(s) on diffusion weighted imaging (DWI) and/or fluid-attenuated inversion recovery (FLAIR), (2) successful infarct segmentation and registration (see section c o r t e x 1 4 3 ( 2 0 2 1 ) 1 4 8 e1 6 3 below), and (3) available post-stroke neuropsychological assessment within one year after stroke. Of the 1591 patients, 49 patients were excluded because of insufficient scan quality, and 65 patients were excluded because the acute infarct was insufficiently delineable, most commonly because of a suboptimal interval between stroke onset and MRI acquisition (see "image processing steps" below) or because the infarct could not be reliably distinguished from old brain infarcts or from extensive white matter hyperintensities). Next, 45 patients were excluded because of unsuccessful lesion registration. In a final step, 191 patients were excluded because of missing cognitive data on either phonemic or semantic fluency, or missing demographic data on age, sex and level of education. This resulted in the final inclusion of 1231 patients from the two cohorts; 588 from the Hallym VCI cohort and 643 from the Bundang VCI cohort. The institutional review board approved the study protocol and waived patient consent because of the retrospective nature of the study and minimal risk to participants. No part of the study procedures or analyses was pre-registered prior to the research being conducted. We report how we determined our sample size, all data exclusions, all inclusion/exclusion criteria (which were all established prior to data analysis), all data manipulations, and all measures in the study.

Magnetic resonance imaging
A 3.0 T MRI scanner (Achieva, Philips Healthcare, Eindhoven, the Netherlands) was used in both the Hallym and Bundang hospital. The Hallym VCI cohort MRI protocol consisted of DWI, Anatomical correlates found in the six prior lesion-symptom mapping studies (Almairac, Herbet, Moritz-Gasser, de Champfleur, & Duffau, 2015;Baldo et al., 2006;Biesbroek et al., 2016;Chouiter et al., 2016;Li et al., 2017;Schmidt et al., 2019) that directly compared semantic with phonemic fluency and reported correlates at the level of specific gyri, subcortical nuclei or white matter connections. The methodological design of these studies is described in Table 2. When a study is listed under shared, the corresponding region was associated with both fluency types. When a study is listed under semantic or phonemic fluency, the corresponding region was either uniquely associated with this fluency type (in that case the study is not also listed under shared) or it was preferentially associated with this fluency type (i.e., after correction for the other fluency type or substantially more significant voxels than for the other fluency type). a Defined as regions identified in at least 2 studies and non-conflicting results. Abbreviations: IFO: inferior fronto-occipital fasciculus, ant: anterior, post: posterior.
c o r t e x 1 4 3 ( 2 0 2 1 ) 1 4 8 e1 6 3 axial T1-and T2-weighted spin echo, FLAIR, gradient-echo imaging, and coronal T2-weighted spin echo imaging. The following acquisition parameters were set for FLAIR images: repetition time, 11,000 msec; echo time, 125 msec; inversion time, 2800 msec; slice thickness, 5 mm; intersection gap, 2 mm; matrix, 512 Â 512; flip angle, 90 ; reconstructed voxel size  Methodological design of the the six prior lesion-symptom mapping studies (Almairac et al., 2015;Baldo et al., 2006;Biesbroek et al., 2016;Chouiter et al., 2016;Li et al., 2017;Schmidt et al., 2019) that directly compared semantic with phonemic fluency and reported correlates at the level of specific gyri, subcortical nuclei or white matter connections. a Stroke subtype not further specified. b Qualitative description of brain regions that were included in the studies, based on the study inclusion criteria and visual inspection (by J.M.B.) of the reported lesion prevalence maps. FDR: false discovery rate. MCA: middle cerebral artery. PCA: posterior cerebral artery. ROI: region of interest.

Generation of infarct maps
Acute infarct segmentation and subsequent registration to the montreal neurological institute (MNI)-152 brain template was performed in accordance with a previously published protocol that allows for harmonization of images from computed tomography (CT) and heterogeneous MRI scan protocols . First, acute infarcts were manually segmented on DWI (n ¼ 1201) or FLAIR (n ¼ 30) sequences with in-house developed software based on MeVisLab (MeVis Medical Solutions AG, Bremen, Germany) (Ritter et al., 2011). Infarct segmentations were performed by two trained investigators (A.K.K. and G.A.). They were subsequently checked and adapted (if necessary) by an experienced rater (N.A.W.), and further revised by another experienced rater (J.M.B.) in the event of uncertainty regarding lesion location or classification. Discrepancies between ratings were discussed in consensus meetings throughout the segmentation process. All four raters were blinded to the neuropsychological data during the segmentation process. During segmentation, the raters had access to all available MRI sequences and were aware of the time interval between stroke onset and MRI. In the acute stage ( 2 weeks after stroke onset), DWI was used as a preferred modality for segmentation because it is considered the optimal sequence (i.e., is generally the most sensitive sequence and provides high contrast between infarcted and normal brain tissue) to visualize acute infarcts within this time frame (Ricci, Burdette, Elster, & Reboussin, 1999). If DWI was not available, or if a comparison of the FLAIR and DWI revealed that the FLAIR provided a more accurate image of the infarct, or if the MRI was performed in a more chronic stage, infarcts were segmented on FLAIR sequences (n ¼ 26; 2.2%). Apparent diffusion coefficient (ADC) and T1-weighted sequences were used as reference sequences to support identification and delineation of the infarcts. On DWI, the hyperintense brain region with corresponding low signal on ADC was delineated as infarcted brain tissue. If the infarct was segmented on FLAIR in the subacute stage (>48 h after stroke, but within the first few weeks), the hyperintense brain region was delineated as infarcted brain tissue. If the infarct was segmented on FLAIR in a more chronic stage when cavitation has occurred, both the cavity and the surrounding hyperintense signal (representing gliosis) were delineated as infarcted brain tissue . Intra-observer (two ratings by N.A.W. with 1 year time interval) and interobserver (between A.K.K./G.A. and N.A.W.) agreement were determined using the Dice Similarity Coefficient (Crum, Camara, & Hill, 2006;Dice, 1945), which expresses the degree of overlap between the segmented regions on a scale from 0 to 1, on a random subset of 30 DWI scans. Both inter-observer (dice similarity coefficient (DSC) .80; standard deviation (SD) .12) and intra-observer (DSC .87; SD .07) agreement were excellent. Presence of other vascular lesions, specifically old cortical infarcts, large subcortical infarcts (>15 mm), hemorrhages (>10 mm), and lacunes (<15 mm), was assessed by an experienced rater (N.A.W.) and classified according to the STRIVE criteria (Wardlaw et al., 2013). Following lesion segmentation, all scans and the corresponding lesion maps were transformed to the T1 1-mm MNI-152 brain template (Fonov et al., 2011) with the RegLSM tool (publicly available at www. metavcimap.org; most recent version described in Weaver et al. (2019)). In short, the registration procedure consisted of linear registration followed by nonlinear registration, using the elastix toolbox (Klein et al., 2010). To improve registration accuracy, an intermediate registration step to an age-specific T1 template was performed (Rorden, Bonilha, Fridriksson, Bender, & Karnath, 2012). All registration steps were combined into a single transformation, to prevent intermediate interpolations and thereby improve registration accuracy. Quality checks of all registration results were performed by one rater (N.A.W.). Manual adaptations were made in case of minor registration errors (43% of cases; n ¼ 523) using MeVi-sLab software.

Generation of anatomical disconnectome maps
Disconnectome maps were calculated using the BCBtoolkit (Foulon et al., 2018). This approach utilises connectome maps derived from diffusion weighted imaging data from 10 healthy controls (Rojkova et al., 2016) and uses the patient's lesion maps as seed region for tractography to identify the white matter fibers that are anatomically connected to the lesion (Foulon et al., 2018). In a previous validation study, these 10 healthy controls proved to be enough to obtain disconnectome maps that match the overall population (more than 70% shared variance) and that the reliability of these disconnection maps did not decrease with increasing age of the patient (Foulon et al., 2018). For each healthy control in the toolkit, tractography was estimated as previously described elsewhere Thiebaut de Schotten et al., 2011). The lesion map in MNI-152 space from each patient was registered to the native space of the healthy controls using affine and diffeomorphic deformations (Avants et al., 2011;Klein et al., 2010) and subsequently used as seed for the tractography in Trackvis (Wang & Benner, 2007). Tractographies from the lesions were transformed into visitation maps (De Schotten et al., 2011), binarized and registered back to MNI-152 space using the inverse of precedent deformations. Finally, a percentage overlap map was generated for each patient by summing the normalized visitation map of each healthy subject for each voxel in MNI-152 space. Hence, in the resulting disconnectome map, the value in each voxel indicates the probability of disconnection from 0 to 100% for each voxel in the brain. The disconnectome maps were subsequently dichotomized using a threshold of !50%. Thus, the binary disconnectome maps indicate for each voxel in the brain whether an anatomical fiber connection with the lesioned brain region is likely to exist in healthy individuals and, consequently, if the voxel is disconnected by the lesion.

Neuropsychological assessment
The procedure for neuropsychological assessment in the Hallym and Bundang VCI cohort has been previously described (Lim et al., 2014). Verbal fluency tests were administered as part of the K-VCIHS-NP (Korean Vascular Cognitive Impairment Harmonization Standards: Neuropsychology Protocol) (Hachinski et al., 2006;Yu et al., 2013). Semantic fluency was tested by asking patients to name from memory as many animals as possible in 1 min. Phonemic fluency was tested by asking patients to name as many words as possible starting with a specific phoneme in 1 min (three trials, each with a different phoneme: giut, iung, siut, corresponding to international phonemic alphabet (IPA) phonemes g, s, s respectively). These verbal fluency tests have been previously validated and were applied in the Korean language (Kang, Chin, Na, Lee, & Park, 2000). Trained clinical neuropsychologists, who were blinded to the clinical and neuroradiological profiles of each patient, administered the tests. Performance on the phonemic (sum of three trials) and semantic (animal) fluency was corrected for age and education and transformed into percentiles using previously established normative data based on a healthy population-based sample (Kang, Jang, & Na, 2012).

Statistical analysis
The association between infarct location and verbal fluency was determined at the level of individual voxels in MNI-152 space (resolution 1 Â 1 Â 1 mm) using a previously described support vector regression-based lesion-symptom mapping (SVR-LSM) Zhao et al., 2018). SVR-LSM takes the intervoxel correlations caused by nonrandom distributions of infarcts (i.e., infarct patterns follow the vascular tree) into account, thereby resulting in higher spatial accuracy of identified neuroanatomical correlates (Mah, Husain, Rees, & Nachev, 2014;Zhang, Kimberg, Coslett, Schwartz, & Wang, 2014;Zhao et al., 2018). We corrected for total infarct volume by weighting the lesioned voxels in inverse proportion to the square root of total infarct volume. A linear SVR model with feature selection was used, in accordance with previous studies Zhao et al., 2018). Only voxels damaged in at least 5 patients were included in the analysis. First, the individually normcorrected phonemic and semantic fluency scores were also corrected for sex, using linear regression, and additionally for the presence of old ischemic or hemorrhagic stroke lesions and lacunes. In the feature selection step, noise voxels in which the presence of a lesion is not univariately associated with phonemic or semantic fluency (t-test, uncorrected p-value >.05) were excluded. We previously demonstrated that this feature selection step improves the prediction accuracy of lesion-symptom mapping in ischemic stroke (Zhao et al., 2018). Subsequently, parameter training of the linear SVR model was realized to determine the optimal regularization parameter (C) and epsilon to maximize the prediction performance of the verbal fluency scores. The prediction performance was calculated for each combination of C and epsilon values by determining the mean Pearson correlation coefficient between the real and predicted fluency scores with 5-fold cross-validations, as previously described Zhao et al., 2018). The resulting optimal C (regularization parameter) was .03125 and the optimal epsilon was .1, corresponding with a prediction accuracy of .60 (p ¼ 1*10 À123 ) for semantic fluency and a prediction accuracy of .61 (p ¼ 7*10 À124 ) for phonemic fluency. Statistical inference was performed by shuffling the observations of norm-corrected z-scores to create pseudo weight coefficients, and the significance level of each voxel was calculated by counting the number of pseudo weights larger than the real weight in 5000 permutations. The voxels with permutation-based p < .01 were treated as significant voxels in the multivariate analysis. The location of significant voxels was visually inspected and quantified in relation to cortical and subcortical anatomical structures based on the HarvardeOxford (Desikan et al., 2006) and ICBM-DTI-81 (Mori et al., 2008) atlas. We performed a post-hoc voxelwise power calculation for detecting lesionedeficit relations in the t-test analysis that was used in the feature analysis step. In line with previous work in the field of lesion-symptom mapping, the maximum lesionedeficit relationship observed in the sample (operationalized as the 99th percentile of observed effect sizes in the voxels that were included in the analyses), was used as effect size in the power calculation. Thus, in the resulting power analysis, the effect size and significance level were fixed, and the number of patients with and without a lesion varied per voxel (Kimberg, Coslett, & Schwartz, 2007;Rudrauf, Mehta, & Grabowski, 2008). Two additional LSM analyses were performed. First, a univariate voxel-wise t-test was performed and corrected for multiple comparisons using permutations (Rorden, Karnath, & Bonilha, 2007), and separately reported for reference and comparison with the main SVR-LSM results. Second, the SVR-LSM analyses were repeated after additionally regression out the variance in fluency performance related to age and elapsed time between stroke and neuropsychological assessment (categorized as <3 months vs > 3 months) to determine whether differences in timing of neuropsychological assessment and resulting potential for cognitive recovery impacted the main results. The association between anatomical disconnections (as identified in the disconnectome maps) and verbal fluency was also analyzed at the level of individual voxels using the same statistical approach as described above. The only difference was that, instead of correcting for total infarct volume, we corrected for total volume of disconnections, by weighting the disconnected voxels in inverse proportion to the square root c o r t e x 1 4 3 ( 2 0 2 1 ) 1 4 8 e1 6 3 of total disconnection volume. The resulting optimal C was .03125 for semantic fluency, and .015625 for phonemic fluency, the optimal epsilon was .05 for both fluency types, corresponding with a prediction accuracy of .41 (p ¼ 1*10 À51 ) for semantic fluency and a prediction accuracy of .34 (p ¼ 2*10 À34 ) for phonemic fluency.

2.7.
Data and materials availability The source data could not be made publicly available due to legal constraints. No explicit informed consent for public archiving of the pseudonymized source data has been obtained, in which case local regulations preclude public archiving of the data. Pseudonymized data that support the findings of this study are available from the corresponding author upon request, subsequent approval from the local Institutional Review Board, and completion of a legal data sharing agreement. Legal copyright restrictions prevent public archiving of the various neuropsychological tests described in Section 2.5, which can be obtained from the copyright holders in the cited references. The RegLSM software for image processing is publicly available at https://metavcimap.org/ features/software-tools/. LSM analyses were performed in JupyterLab 2.2.9 using Python 3.8.6. The feature selection step was performed using SciPy 1.5.3 (Virtanen et al., 2020) and statsmodels .12.1 (Seabold & Perktold, 2010). The SVR was performed with scikit-learn version .24.0 (Virtanen et al., 2020). All source code components are publicly available from the respective package websites and all custom analyses scripts that were used in the current study are publicly archived at https://github.com/Meta-VCI-Map/LSM.

Results
The mean age of the 1231 included patients was 66.6 years (SD 11.6), 62.3% were male and the median years of education was 12. Brain MRI and cognitive evaluation was performed on average 3.6 days (SD 3.4) and 107 (SD 69) days after stroke respectively. A complete list of the characteristics of the included patients is provided in Table 3.

Lesion-symptom mapping: infarct location
The SVR-LSM results are shown in Fig. 2, the location of the significant voxels in relation to anatomical structures is provided in Table 4 and Supplementary Table A.2, and the location of the voxel with the maximum statistic for each cluster is reported in Supplementary Table A.3. The lesion prevalence map is shown in Fig. 2. Lesion distribution was similar in the left and right cerebral hemispheres, and 71.4% of the entire brain (including cerebrum, cerebellum and brain stem) could be included in the analyses. Median statistical power in tested voxels was .97 for phonemic (range .71e1.0) and .95 for semantic fluency (range .66e1.0). Parts of the medial frontal and parietal cortex (vascularized by the anterior cerebral artery), the most caudal parts of the temporal lobes, the mesencephalon and parts of the cerebellum were not included due to insufficient lesion prevalence in these regions. Shared significant voxels for both phonemic and semantic were located almost exclusively in the left hemisphere, including mainly left frontotemporal cortical regions and to a lesser extent parietal and occipital cortical regions, the left thalamus, basal ganglia and surrounding white matter. Regions in which lesions were associated with phonemic fluency, but not (or to a much lesser extent) with semantic fluency, were the anterior divisions of the middle and inferior temporal gyri. Regions in which lesions were associated with semantic fluency, but not phonemic fluency, were the temporo-occipital part of the middle and inferior temporal gyri, the entire fusiform cortex, the posterior part of the parahippocampal gyrus, the triangular part of the inferior frontal gyrus, the frontal orbital cortex, and the splenium of the corpus callosum.
The results of the univariate LSM analysis were largely in line with the SVR-LSM analysis ( Supplementary Fig. 1), with the exception that fewer significant voxels in inferior temporal regions were found for both fluency types and that the involvement of left medial temporal and fusiform regions in semantic fluency was not detected. After additional correction of the SVR-LSM analysis for timing of cognitive assessment, the results were essentially unchanged compared to the main results ( Supplementary Fig. 2).

Lesion-symptom mapping: anatomical disconnections
The lesion prevalence map and the results of the disconnectome-based analysis are shown in Fig. 3. Median statistical power in tested voxels was 1.0 for phonemic (range .53e1.0) and 1.0 for semantic fluency (range .51e1.0). In line with the lesion-symptom mapping analyses, nearly all significant voxels (i.e., voxels in which a disconnection is associated with lower fluency performance) were located in the left hemisphere. Overlapping correlates for both phonemic and semantic fluency were found predominantly in connections from the left thalamus to the prefrontal cortex, and to a lesser degree in dispersed left hemispheric frontoparietal and temporal white matter regions. A more extensive left frontoparietal, perisylvian white matter network was found specifically for phonemic fluency, while several clusters of left temporal voxels and clusters in thalamic nuclei that project to the temporal lobe were found specifically for semantic fluency, as well as bilateral orbitofrontal white matter connections. Disconnection of the left fornix was exclusively associated with semantic fluency. Regarding interhemispheric connections, anterior disconnections (genu of the corpus callosum) were associated with both fluency types, while posterior interhemispheric disconnections (splenium of the corpus callosum with connections to right hemispheric parietal, occipital and temporal white matter) were associated with semantic fluency.

Discussion
The results of the lesion-symptom mapping and disconnectome-based analyses provide converging evidence for a large, partially shared and distinct left hemispheric cortical-subcortical network involved in phonemic and semantic fluency. Distinct anatomical correlates of semantic c o r t e x 1 4 3 ( 2 0 2 1 ) 1 4 8 e1 6 3 fluency are the left posterior middle and inferior temporal gyri, the fusiform and parahippocampal gyri, then left fornix, the triangular part of the inferior frontal gyrus, and splenium of the corpus callosum. Distinct anatomical correlates of phonemic fluency are the anterior part of the middle and inferior temporal gyri and a more extensive frontoparietal and perisylvian white matter network. These results, based on an unprecedented sample size and the use of complementary high-resolution lesion analyses, provide a detailed outline of the anatomy of phonemic and semantic fluency. Furthermore, the detailed anatomical map of phonemic and semantic fluency could be useful in interpreting deficits in either phonemic or semantic fluency in individual patients in terms of the suspected location of brain injury.

Left temporal correlates
Our findings expand the literature and help to clarify several controversies. First of all, though our findings fit the thoroughly substantiated notion that phonemic fluency depends more on left frontoparietal regions and semantic fluency more on left temporal regions (Schmidt et al., 2019) (in line with the motor-phonological and lexical-semantic streams of the dual route model of speech processing (Fridriksson et al., 2016)), they also demonstrate that this differential involvement of temporal brain regions is more fine-grained than previously suggested. Our results demonstrate that the left superior temporal gyrus and temporal pole are involved in both verbal fluency types and reveal a dissociation within the left temporal lobe, with posterior and medial temporal regions being involved in semantic fluency, and anterolateral temporal regions in phonemic fluency. To our knowledge, this left temporal anterioreposterior dissociation for phonemic and semantic fluency has not been explicitly reported before, though several studies listed in Table 1 did actually observe that infarcts in posterior inferior and middle temporal gyri were uniquely associated with semantic fluency (Baldo et al., 2006;Chouiter et al., 2016), and one study found that the anterior part of the middle temporal gyrus was uniquely associated with phonemic fluency (Chouiter et al., 2016). There are several possible explanations for this dissociation. First of all, there is prior evidence for a top-down control for selecting appropriate responses based on a semantic constraint, which has been termed 'semantic control' (Ralph, Jefferies, Patterson, & Rogers, 2016). This is a critical process in semantic, but not in 28.0 (26.0; 0e99.7) Phonemic fluency (number of words, phoneme giut), mean (SD; range) 6.0 (4.1; 0e22) Phonemic fluency (number of words, phoneme iung), mean (SD; range) 5.9 (3.9; 0e20) Phonemic fluency (number of words, phoneme siut), mean (SD; range) 6.3 (4.1; 0e20) Phonemic fluency (norm-corrected percentile), mean (SD; range) 29.6 (26. c o r t e x 1 4 3 ( 2 0 2 1 ) 1 4 8 e1 6 3 phonemic fluency (which depends more on bottom-up retrieval (Katzev, Tü scher, Hennig, Weiller, & Kaller, 2013)) and has been attributed to posterior temporal regions and the triangular part of the inferior frontal gyrus (Ralph et al., 2016). Both these regions were uniquely associated with semantic fluency in the current study. Second, the involvement of the left dorsal (temporo-occipital) and medial (parahippocampal and fusiform) cortex in semantic fluency likely reflects semantic representations. There is a large body of literature implicating these brain regions in lexical semantic memory (Binder & Desai, 2011;Faulkner & Wilshire, 2020;Forseth et al., 2018;Patterson, Nestor, & Rogers, 2007). In particular semantic memory for living things (including animals) is strongly associated with visual semantic processing (Cree & McRae, 2003), which is localized in posterior temporal and occipital brain regions (Ralph et al., 2016). It is important Fig. 2 e Results of the support vector regression-based voxel-wise lesion-symptom mapping analysis (SVR-LSM). The upper panel shows the lesion overlay map depicting the number of patients with a lesion for a specific voxel in MNI space, and the statistical power for detecting lesionedeficit relations per voxel in the feature selection step (t-test, uncorrected p-value <.05). The lesion overlay map is thresholded at n ¼ 5 to show only the voxels that were included in the analysis. The lower panel shows the significant voxels from the multivariate SVR-LSM analysis. Significant voxels are either exclusively associated with phonemic (blue) or semantic (red) fluency, or with both (purple). The results are norm-corrected for age and sex, and additionally corrected for education, infarct volume, chronic stroke lesions, lacunes using linear regression. The left hemisphere is depicted on the left and the z-coordinates in MNI space of the transversal slices are shown.
c o r t e x 1 4 3 ( 2 0 2 1 ) 1 4 8 e1 6 3 (continued on next page) c o r t e x 1 4 3 ( 2 0 2 1 ) 1 4 8 e1 6 3 to acknowledge that, though phonemic fluency is not primarily driven by a semantic constraint, it does require integration with lexical semantic memory to ensure that a specific string of phonemes represents an existing word (Dilkina, McClelland, & Plaut, 2010;Majerus, 2019). Therefore, both phonemic and semantic fluency engage semantic memory to some extent, though the precise nature of the involved semantic processes are different. As such, it is possible that the anterioreposterior dissociation in the temporal lobe might reflect differential involvement of semantic representations. Since the anterior temporal lobe is more involved in abstract and general semantic processing, this could indicate that phonemic fluency engages more general semantic representations, though there is no prior literature to support this notion as far as we are aware. A third possible explanation is that the dissociation in the left temporal lobe might reflect a lemma (semanticesyntactic properties)/lexeme (morphophonological properties) distinction (Crepaldi et al., 2011;Levelt, Roelofs, & Meyer, 1999;Roelofs, Meyer, & Levelt, 1998), which would fit previous observations that anterior temporal stimulation selectively disrupt phonological processing while leaving access to semantic information intact, while inferior and posterior temporal stimulation can selectively disrupt semantic processing (Hamberger et al., 2016). Fourth, involvement of anterior temporal cortex in phonemic fluency might reflect higher load on executive processes as indicated by the fact that in the current study, patients were able to name approximately twice as many items per minute in semantic compared to phonemic fluency (Rusn akov a, Daniel, Chl adek, Jur ak, & Rektor, 2011;Schmidt et al., 2019).
The involvement of the left fornix in semantic fluency is a novel finding and might reflect episodic memory processes that might be appropriate in semantic fluency where episodic memory can provide a context for semantic retrieval (a so called scene reconstruction process (Hassabis & Maguire, 2007) e.g., remembering a situation in the zoo or on safari) that might not be relevant for phonemic fluency (Hodgetts et al., 2017).

4.2.
Left frontal and perisylvian correlates The notion that phonemic fluency draws more on perisylvian and frontal brain regions than semantic fluency was most apparent from the disconnectome-based analyses, showing a specific involvement of the posterior part of the external capsule and a larger extent of perisylvian and frontoparietal white matter in phonemic fluency. This can be explained by a stronger dependency of phonemic fluency on the so-called phonological loop that involves the superior temporal gyrus, supramarginal gyrus and inferior frontal regions compared to semantic fluency (Dilkina et al., 2010). We found the triangular part of the inferior frontal gyrus to be uniquely involved in semantic fluency. This critical role of the triangular part of the inferior frontal gyrus in semantic but not phonemic fluency has been previously reported in fMRI studies (Costafreda et al., 2006;Katzev et al., 2013), but the current study is the first to provide evidence in patients with brain lesions.

Right hemispheric correlates
Regarding right hemispheric contributions to verbal fluency, we identified very few relevant right hemispheric voxels in the LSM analyses on infarct location, even though both hemispheres were equally represented in the study sample. The region with the most significant voxels in the right hemisphere was the precentral gyrus (135 voxels for phonemic and 167 for semantic amounting to less than 1% of tested voxels), while the remaining regions had even far less significant voxels, possibly only reflecting noise. This might suggest that the contribution of the right hemisphere amounts to no more than speech-related motor activity. On the other hand, we did find significant clusters in the forceps major (mostly for semantic fluency), which suggests a role of interhemispheric connections. Furthermore, the disconnectome-based analyses showed that disconnection of the forceps minor (which connects the frontal lobes) was associated with fluency (mostly phonemic), and disconnection of the forceps major (which connects posterior cortical regions), and several clusters in the right parietal and occipital white matter were uniquely associated with semantic fluency. In light of these findings a contribution of the right hemisphere to cognitive functions underlying verbal fluency cannot be excluded and would be in line with one previous lesion-symptom mapping study in which it was speculated that the right hemispheric contribution could reflect a mental imagery strategy (Biesbroek et al., 2016), though it is clear that any right hemispheric involvement in verbal fluency is very limited. This table shows the results of the voxel-wise SVR-LSM analysis by quantifying the number of significant voxels per region of interest. The total number of significant voxels and the total number of tested voxels per brain region are provided and additionally expressed as a proportion (significant voxels divided by tested voxels). a Brain region specifically associated with phonemic fluency. b Brain region specifically associated with semantic fluency. Note that only regions with a proportion of at least 1% significant voxels and at least a total of 50 significant voxels for either phonemic or semantic fluency are shown here; a full list of all significant voxels is provided as Supplementary c o r t e x 1 4 3 ( 2 0 2 1 ) 1 4 8 e1 6 3 4.4.

Study strengths, methodological considerations, and limitations
The main strengths of the current study are the large sample size e allowing for the inclusion of 71.4% of the entire brain (including left and right cerebrum, cerebellum and brain stem) in the lesion-symptom mapping analyses e as well as the multicenter nature of the study and the homogeneous MRI acquisition protocol. Furthermore, we used a recently developed multivariate lesion-symptom mapping method that takes intervoxel correlations into account. This approach has been shown to provide superior sensitivity and specificity compared to traditional mass-univariate lesion-symptom mapping methods (Zhang et al., 2014;Zhao et al., 2018). As reference, we provided the results of mass-univariate LSM (using a t-test and permutation-based correction for multiple Fig. 3 e Results of the support vector regression-based voxel-wise disconnectome analysis. The upper panel shows the lesion overlay map depicting the number of patients with an anatomical disconnection for a specific voxel in MNI space, and the statistical power for detecting lesionedeficit relations per voxel in the feature selection step (t-test, uncorrected p-value <.05). The lesion overlay map is thresholded at n ¼ 5 to show only the voxels that were included in the analysis. The lower panel shows the significant voxels from the SVR analysis. Significant voxels are either exclusively associated with phonemic (blue) of semantic (red) fluency or with both (purple). The results are norm-corrected for age and sex, and additionally corrected for education, disconnection volume, chronic stroke lesions, lacunes using linear regression. The left hemisphere is depicted on the left and the z-coordinates in MNI space of the transversal slices are shown. c o r t e x 1 4 3 ( 2 0 2 1 ) 1 4 8 e1 6 3 comparisons) in Supplementary Fig. 1 in order to facilitate comparisons with prior and future univariate LSM studies on verbal fluency. The most notable difference was that the significant voxels in de left fusiform and medial temporal regions for semantic fluency were not detected in the univariate analysis. Finally, the complementary disconnectome-based analyses allowed us to not only study the impact of the lesion on cognition, but also to map the lesion to the anatomical connectome and to study the impact of remote disconnections on cognition, an approach that had not previously been applied in verbal fluency (Fox, 2018).
It should be acknowledged that all lesion-symptom methods are prone to a certain degree of error in pinpointing critical anatomical regions (Ivanova, Herron, Dronkers, & Baldo, 2021). An important source of anatomical bias is the intervoxel correlations resulting from non-random infarct patterns that follow the vascular tree (Mah et al., 2014). We used a multivariate SVR method that takes the intervoxel correlations into account to improve accuracy (Zhang et al., 2014;Zhao et al., 2018). Other important factors that improve the precision of LSM are correction for infarct volume (which we did with direct lesion volume control (Zhao et al., 2018)), sample size (i.e., increasing sample size improves precision (Ivanova et al., 2021)), and exclusion of voxels that are affected in few patients, and it was recently shown that smoothing of lesion masks may improve accuracy (Ivanova et al., 2021). Another issue is that the optimum threshold for how many patients should have a lesion in a particular voxel for it to be included in the analyses is unclear. We only included voxels affected in at least five patients. A recent study in which LSM simulations in 100 patients were performed showed that using a minimum lesion frequency of five patients compared to a minimum lesion frequency of one patient per voxel significantly improved the spatial accuracy of LSM (Sperber & Karnath, 2017). The optimum threshold may be different when including 1231 patients as was the case in the current study, but as far as we are aware, this has never been tested. Selecting a cut-off involves a trade-off between including as many brain regions as possible on the one hand, and optimizing statistical power in the included voxels on the other hand. Since we may expect more very large-scaled LSM studies in the future Weaver et al., 2021), the optimum threshold for the minimum number of patients with a lesion in each voxel in LSM studies should ideally be reassessed in simulation studies involving thousands of patients.
Several limitations should also be noted. First, due to the standardized nature of the cognitive evaluation, no more specific tests for probing cognitive subprocesses underlying phonemic and semantic fluency were available. Therefore, interpretations regarding the cognitive processes that take place in specific overlapping and distinct anatomical correlates for phonemic and semantic fluency are based on the literature instead of empirical data from the current study. In particular, the observed anterioreposterior left temporal dissociation in phonemic and semantic fluency warrants further research with more fine-grained neuropsychological tests to clarify the underlying cognitive processes. A second limitation is that, while identification of the anatomical correlates of phonemic and sematic fluency was based on rigorous statistical tests, the comparison of the anatomical correlates of the two fluency types to highlight shared and unique correlates is of a qualitative nature. This commonly used approach is sometimes complemented by an additional analysis in which the other fluency type is added as nuisance variable, but this approach can introduce statistical artefacts (Sperber, Nolingberg, & Karnath, 2020), in particular when analyzing complex cognitive constructs such as verbal fluency. Ideally, both fluency types would be statistically modeled together in an analysis that is multivariate not only at the level of voxel lesion status (as is the case in the SVR-LSM method that we used), but at the level of cognitive outcome as well, but these method are not yet available. Third, the time between stroke and neuropsychological examination ranged from one to 365 days (mean 107 days). Over time, brain plasticity and cognitive recovery can occur, and the degree of recovery may be different depending on which brain region is damaged (Karnath, Rennig, Johannsen, & Rorden, 2011). As such, neuropsychological assessment should ideally be performed in the acute stage when aiming to identify the anatomical correlates of a given cognitive function (de Haan & Karnath, 2018). We chose to include all patients with cognitive evaluation up to one year, which means that we cannot rule out false negative findings due to brain plasticity and cognitive recovery. However, the analysis with additional correction for timing of cognitive assessment ( Supplementary Fig. 2) suggests that the differences in timing of cognitive assessment across subjects did not significantly affect the main results. Of note, both fluency types were always assessed on the same day for a single patient. Regarding the timing of brain scan acquisition, this was performed in the acute stage as has been advocated to avoid anatomical distortions due to cavitation in the chronic stage (de Haan & Karnath, 2018). Fourth, despite extensive manual checks and corrections during image processing, possible distortion due to edema in the acute stage and limited spatial resolution of the brain scans should be acknowledged.

Conclusion
In summary, we determined the anatomical correlates of phonemic and semantic fluency by applying lesion-and disconnectome-symptom mapping in 1231 patients with ischemic stroke. The observed shared anatomical correlates in a large left hemispheric cortical-subcortical network reflect shared cognitive processes involved in both fluency types. In contrast, a dissociation within the left temporal lobe was found, as medial and posterior lateral cortex is involved uniquely in semantic fluency and anterior lateral cortex in phonemic fluency. Furthermore, the triangular part of the left inferior frontal gyrus is uniquely involved in semantic fluency whereas phonemic fluency draws on more extensive perisylvian and frontoparietal white matter regions. The identified distinct anatomical correlates most likely reflect differential involvement of lexical semantic and phonological memory and possibly differential involvement of executive processes. Our results provide the most comprehensive map of the anatomical correlates of phonemic and semantic fluency based on patients with brain c o r t e x 1 4 3 ( 2 0 2 1 ) 1 4 8 e1 6 3 lesions to date and shed new light on the anatomical structures and cognitive functions implicated in phonemic and semantic fluency.

Funding
This work was supported by Vici grant 918.16.616 from ZonMw, The Netherlands, Organisation for Health Research to GJB. The work of JMB was supported by a Young Talent Fellowship from the UMC Utrecht Brain Center and by a VIMP grant (project number 7330505031) from ZonMw, the Netherlands, Organisation for Heath Research. The funding sources had no involvement in the study design, the collection, analysis and interpretation of data, the writing of the report, or the decision to submit this article for publication.
CRediT author statement

Declaration of competing interest
None.