Mental flexibility depends on a largely distributed white matter network: Causal evidence from connectome-based lesion-symptom mapping

Mental flexibility (MF) refers to the capacity to dynamically switch from one task to another. Current neurocognitive models suggest that since this function requires interactions between multiple remote brain areas, the integrity of the anatomic tracts connecting these brain areas is necessary to maintain performance. We tested this hypothesis by assessing with a connectome-based lesion-symptom mapping approach the effects of white matter lesions on the brain's structural connectome and their association with performance on the trail making test, a neuropsychological test of MF, in a sample of 167 patients with a first unilateral stroke. We found associations between MF deficits and damage to i) left lateralized fronto-temporo-parietal connections and interhemispheric connections between left temporo-parietal and right parietal areas; ii) left cortico-basal connections; and iii) left cortico-pontine connections. We further identified a relationship between MF and white matter disconnections within cortical areas composing the cognitive control, default mode and attention functional networks. These results, which support a central role of white matter integrity in MF, extend the current literature by providing causal evidence for a functional interdependence among the regional cortical and subcortical structures composing the MF network. Our results further emphasize the need to consider connectomics in lesion-symptom mapping analyses to establish comprehensive neurocognitive models of high-order cognitive functions.


Introduction
"Mental flexibility" (MF) refers to the executive processes involved in the dynamic switch from one task set to another. Functional neuroimaging investigations of task-switching suggest the involvement of a bilateral fronto-parietal network in MF, while providing less consistent evidence for a role of other temporal, occipital, or basal ganglia associative areas in both hemispheres (Dajani & Uddin, 2015; Jamadar et al., 2015; Varjacic et al., 2018; Worringer et al., 2019). While the lesion-symptom mapping literature largely corroborates these findings from functional studies, it points to a larger influence of the left hemisphere. A recent review by Varjacic et al. (2018) notably indicates that MF mainly depends on a prominently left-lateralized network comprising frontal (dorsal and lateral prefrontal cortex, anterior cingulate cortex), parietal (inferior and superior parietal lobules, precuneus), and temporal areas (middle temporal gyrus), and the insula. A series of studies examining whether the frontal lobe, classically associated with high-order cognitive functions, constitutes the main substrate of MF and of associated cognitive processes (Stuss, 2011; Stuss & Alexander, 2007; Stuss et al., 2001; though see Chan et al., 2015) supports the existence of a specialized role of frontal lobes in controlling specific sub-processes contributing to MF (task-setting, monitoring, etc.), though with an additional involvement of posterior non-frontal areas.
The involvement of such a largely distributed network likely pertains to the complexity of the MF processes, which typically require interactions between multiple cognitive components [conflict detection, working memory updating, error correction, etc. (Dajani & Uddin, 2015; Niendam et al., 2012; Salthouse, 2011; Sanchez-Cubillo et al., 2009)]. This distribution of MF-related processes within superordinate networks is also in line with observations that MF scores are predictors of general intelligence and executive capacities, correlate with performance on other executive tasks and are impaired by damage to neural substrates shared with other cognitive functions according to the lesion-symptom mapping literature (Barbey et al., 2012; Gläscher et al., 2010, 2012; Nevado et al., 2022; Oosterman et al., 2010; Salthouse, 2011).
Yet, while the literature reviewed above provides information on the cortical areas supporting MF, the involvement of the anatomic tracts connecting these areas remains largely unresolved (Barbey et al., 2012; Cochereau et al., 2020; Cristofori et al., 2015; Duering et al., 2014; Hartung et al., 2021; Petersen et al., 2022). In addition to the involvement of a set of associative tracts in MF, current data suggest a central role of the left superior longitudinal fasciculus (SLF) and bilateral arcuate fasciculus (AF) (Barbey et al., 2012; Cochereau et al., 2020; Cristofori et al., 2015; Hartung et al., 2021; Petersen et al., 2022).
By connecting homolateral frontal, temporal and parietal regions, these tracts may indeed support the interactions between the cortical nodes of the fronto-parietal network reported by the functional literature. Further support for this hypothesis comes from resting-state fMRI studies showing that performance in MF tasks is associated with the level of connectivity within functional networks, notably the cognitive control and default network bilaterally (Mrah et al., 2022; Seeley et al., 2007; Varjacic et al., 2018; Vatansever et al., 2016).
Here, we tested the hypothesis that anatomical connections between the identified MF cortical nodes play a crucial role because this function requires highly integrated associative processing across multiple remote cortical and subcortical structures. Thus, identifying the white matter (WM) tracts involved in MF would improve the fundamental understanding of MF. We addressed this question by assessing in a large sample of 167 stroke patients the effects of WM lesions on the brain structural connectome and their associations with scores on the trail making test, an index of MF, with a 'connectome-based lesion-symptom mapping' approach [CLSM (Gleichgerrcht et al., 2017; Griffis et al., 2021)].
This method estimates individual patients' lesion-induced structural disconnection based on reference tractography and gray matter parcellation atlases. Statistical analyses of the relationship between these disconnection models and neuropsychological scores of interest make it possible to assess whether and to what extent the severity of damage to WM tracts and the amount of disconnection between distant brain regions influence behavior.
In comparison to studies focusing on the behavioral consequences of damage to predefined sets of WM tracts, this approach allows multilayered analyses providing complementary information on connections between gray matter areas and the underlying WM tracts. CLSM procedures are also better suited than voxel-based lesion-symptom mapping (VLSM) studies to detect diaschisis effects and the involvement of specific brain networks, because units of volume in tractography atlases inform on interregional anatomical connectivity and not only local information as do voxels in VLSM.
As an index of MF, we focused on the Trail Making Test score (TMT), one of the most widely used tests to assess this function in clinical and research settings (Army Individual Test Battery, 1944; Sanchez-Cubillo et al., 2009).
We conducted a retrospective analysis of the lesions and neuropsychological scores gathered mainly during the subacute phase of routine clinical assessments in a cohort of 167 first unilateral stroke patients.
We first tested the hypothesis that MF, as a multicomponent process relying on the integration of inputs of multiple brain areas, depends on the integrity of the WM tracts connecting structures from the prefrontal, parietal, and temporal regions, as well as from basal ganglia (BG; Barbey et al., 2012; Cristofori et al., 2015; Respino et al., 2019). We specifically expected a prominent role of intrahemispheric left connections, with a potential involvement of interhemispheric connections with homologous right contralateral areas (Cristofori et al., 2015; Petersen et al., 2022; Respino et al., 2019; Varjacic et al., 2018). Based on observations that MF might rely on connections within well-established resting-state functional networks (Mrah et al., 2022; Seeley et al., 2007; Varjacic et al., 2018; Vatansever et al., 2016), we also applied multivariate machine learning analyses to investigate how disconnections between regions composing these networks would affect MF. We expected to confirm an involvement of connections within the cognitive control and default networks.
Cortex 165 (2023) 38–56
We further conducted classical VLSM analyses on our dataset to replicate previous literature and to test for the consistency and complementarity between these results and those from the CLSM. We expected the VLSM to identify an association between TMT deficits and lesions restricted to areas pointed out by recent reviews on the topic (Varjacic et al., 2018; Worringer et al., 2019), notably the prominently left dorsal premotor cortex, anterior cingulate cortex, inferior frontal gyri and parietal lobules, precuneus, superior temporal gyri, and BG. In addition to these effects, we further expected to improve the analyses when controlling for age and educational level. We finally explored the impact of concomitant impairments in the major cognitive components involved in the TMT test.

Materials and methods
No part of the study analyses or procedures was preregistered in a time-stamped, institutional registry prior to the research being conducted.

Participants
We retrospectively collected information on patients hospitalized for a first unilateral stroke between 2011 and 2019 at the Cantonal Hospital of Fribourg, Switzerland. The study was approved by the Research Ethics Committee of the Canton de Vaud (CER-VD), Switzerland (protocol #2019-02416). Patients have neither given their consent to share their coded data on a public repository nor to anonymize their data for such purpose. The President of the CER-VD (e-mail: secretariat.cer@vd.ch), is ready to answer any query regarding this issue, and to reach the data sharing agreement and legal agreement necessary to obtain the data (please mention the study number #2019-02416 in each related correspondence to the CER-VD).
Our sample size was based on previous literature indicating the database size required to obtain sufficient brain coverage and analytical power in lesion-symptom mapping (Kimberg et al., 2007) and on the availability of clinical data. All inclusion/exclusion criteria were established prior to data analysis.
Inclusion criteria were: i) first unilateral ischemic or hemorrhagic symptomatic stroke; ii) post-stroke neuropsychological evaluation with both parts B and A of the Trail Making Test (the color TMT version was also accepted); iii) available brain CT or MRI images performed during clinical routine stroke assessment.
Exclusion criteria were: i) age ≥ 86 years; ii) antecedents of neurologic or psychiatric disorders; iii) cerebellar, bihemispheric stroke, or subarachnoid hemorrhage. A total of 167 patients were eventually included in the analyses. No data were excluded from the analyses. Demographic and clinical-related data were retrospectively collected and are reported in Table 1.
We report how we determined our sample size, all data exclusions, all inclusion/exclusion criteria, whether inclusion/exclusion criteria were established prior to data analysis, all manipulations, and all measures in the study.

Trail making test
In the present study, we used the raw TMT B-A scores collected during the post-stroke clinical assessment as the index of MF.
The TMT (Army Individual Test Battery, 1944) is available on the publicly accessible digital repository "Zenodo" (https://zenodo.org/, https://doi.org/10.5281/zenodo.7845740) and consists of two parts. In TMTA, the participant is asked to draw lines connecting an ascending sequence of circled numbers distributed on a paper sheet. In TMTB, a series of circled letters in alphabetical order is combined with numbers on the same sheet, and the participant is asked to alternate between numbers and letters (1, A, 2, B, etc.). The participant is instructed to complete each part separately, as fast and as accurately as possible; the completion time is measured for each part and used to index performance. Correct and rapid alternation between numbers and letters in TMTB requires MF. Since cognitive and physical functions other than MF are also involved in performing the TMT (upper limb motor control, visuo-spatial search, working memory, inhibitory control, etc.), the MF component of the task is isolated by subtracting the time taken to complete the TMTA from the time taken to complete the TMTB (Arbuthnott & Frank, 2000; Kortte et al., 2002). Whenever an error is committed, the clinician points it out to the participant, who then corrects it autonomously, all without stopping the timer. In this way, the number of errors committed is reflected in the completion time. Given its equivalence to the standard TMT form (Guo, 2021; Lee & Chan, 2000), a color version of the TMT, used for some patients with severe language impairments, was also included in the analyses (D'Elia et al., 1996; legal copyright restrictions prevent public archiving of the Color Trail Test, which can be obtained from the copyright holders in the cited references; Table 1).
Importantly, the TMT shows good sensitivity and reliability and low intra-operator and inter-operator variability (Bowie & Harvey, 2006); it is thus well suited to retrospective analyses of clinical scores such as those conducted in the present study. The raw TMT B-A score represented the main variable of the study: for CLSM analyses, we regressed the raw TMT score on age and education, two factors known to strongly influence TMT performance (Corrigan & Hinkeldey, 1987; Siciliano et al., 2019; Tombaugh, 2004). For VLSM analyses we also used the raw TMT B-A value but, to improve our statistical power, we performed a post hoc analysis using the same score binarized on the median. Both variables were combined with age and level of education as covariates. As an additional post hoc VLSM we considered a normalized TMT B-A Z-score used in clinical practice (St-Hilaire et al., 2018). This score is obtained by normalizing the individual TMT score of a patient to the distribution of TMT normative values from a healthy population adjusted for age, education, and gender (St-Hilaire et al., 2018). We then binarized the resulting Z-values based on the fifth percentile cut-off (corresponding to a Z-score = −1.645), which provided us with a variable expressing a clinically meaningful distinction between normal and pathological performance. The same binary score, but based on the 10th percentile cut-off (corresponding to a Z-score = −1.282), was also used for the network-level CLSM analyses with a machine learning approach; from here on we will refer to this score as 'normalized TMT score'. Further details on the use of the different variables are presented in paragraphs 2.3, 2.5 and 2.6.
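The scoring variants described above can be illustrated with a minimal Python sketch. It is not the scoring code used in the study: the function names, the normative mean and SD passed in the usage example, and the sign convention (Z inverted so that slower completion gives a lower Z) are assumptions made for illustration only.

```python
from statistics import NormalDist, median


def percentile_cutoff(percentile):
    """Z-value below which the given fraction of a standard normal lies."""
    return NormalDist().inv_cdf(percentile)


def tmt_z_score(raw_b_minus_a, norm_mean, norm_sd):
    """Normalize a raw TMT B-A time against hypothetical normative values.

    Assumption: the sign is inverted so that slower (worse) performance
    yields a more negative Z, matching negative clinical cut-offs.
    """
    return -(raw_b_minus_a - norm_mean) / norm_sd


def is_impaired(z, percentile=0.05):
    """Binarize a Z-score at a percentile cut-off (1 = pathological)."""
    return int(z <= percentile_cutoff(percentile))


def binarize_on_median(scores):
    """Binarize raw scores on the sample median (1 = above the median)."""
    m = median(scores)
    return [int(s > m) for s in scores]
```

For example, with hypothetical norms of mean 60 s and SD 40 s, a raw B-A time of 120 s gives Z = −1.5, which is classified as normal at the fifth percentile cut-off but as impaired at the 10th percentile cut-off.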

Further neuropsychological evaluations
A board-certified neuropsychologist converted the summary conclusions of neuropsychological evaluation reports to a three-level ordinal scale of deficit severity (three expressing the highest severity) for each of the cognitive domains listed in Table 2. The clinical neuropsychological evaluation was based on a set of neuropsychological tests (Table S1, supplementary material) whose scores were referred to the distributions of healthy populations to determine the level of impairment severity in each domain (SD < 1 = mild; SD 1–2 = moderate; SD > 2 = severe).

2.3. Correlation and principal component analyses
The analyses described in this section were performed with the software Jamovi version 1.2 (The jamovi project, 2020).

2.3.1. Correlation between TMT score and time of TMT assessment after stroke onset
While no restrictive criterion for the delay between the stroke onset and the TMT evaluation was adopted during patients' inclusion, when TMT assessments at multiple time points were available for a patient, we used the score closest to 5 days post-stroke. The majority of TMT evaluations were performed during the acute phase (3rd quartile = 9 days), although the delay ranged from 1 to 415 days after the stroke onset (Table 1). Similarly, most of the brain images were acquired within the acute phase (Table 1). As a sanity check, we examined Pearson's correlation between the raw TMT score and the post-stroke delay. A significant negative correlation would suggest a contribution of post-lesional recovery to the TMT score (the more time elapsed since the stroke, the better the functional state), possibly influencing the inferences on lesion-behavior causal relations. We found a significant but small positive correlation (r(165) = .17, p = .03), indicating worse performance with a longer delay. This effect likely follows from the TMT assessments with long delays being associated with more severe clinical conditions, which did not allow the patient to perform the TMT at earlier post-stroke stages. As explained later in section c) of this paragraph, we corrected for the influence of more severe collateral cognitive impairments on the TMT score that might have been associated with later assessments. Given our control of these potential confounders, the probable absence of large effects of early recovery and the small effect size we found, we assume that the delay between stroke onset and the TMT assessment had at most a negligible residual influence on our results.

Correlations between TMT, age and education
To verify previous evidence for an association of raw TMT scores with patients' age and education level (three categories corresponding to ≤10, 10–15, >15 years of education), we computed independent Pearson's correlations between these two variables and the TMT scores. As expected, the TMT score (higher scores meaning worse performance) showed a moderate positive correlation with age (r(165) = .37, p < .001), and a small negative correlation with the education level (r(165) = −.22, p = .004). These results replicate previous literature (Corrigan & Hinkeldey, 1987; Siciliano et al., 2019; Tombaugh, 2004) and corroborate the importance of correcting for these two variables because of their potential effect on the VLSM and CLSM results.
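As a reading aid for the reported statistics, the following Python sketch shows how a Pearson correlation and its degrees of freedom relate to the sample size: with 167 patients, the test has 167 − 2 = 165 degrees of freedom, hence the notation r(165). The functions are illustrative and are not the Jamovi routines used in the study.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t value and degrees of freedom for testing r against zero."""
    df = n - 2
    return r * math.sqrt(df / (1.0 - r ** 2)), df


# Correlation between TMT score and age reported in the text: r = .37, N = 167
t_value, df = t_statistic(0.37, 167)
```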

Correlation between TMT and principal component analysis-derived cognitive domains
To assess the influence of coexisting neuropsychological deficits on the raw TMT score, we performed a principal component analysis (PCA) with the varimax rotation method to reduce the number of neuropsychological variables submitted to the subsequent analyses (Revelle, 2019, psych: Procedures for Psychological, Psychometric, and Personality Research, R package version 1.9.12, Northwestern University, Evanston; /package=psych). This was done to synthesize the shared variance of clinical syndromes due to the low-dimensional structure of impairment, and in turn to improve the power of our analyses of interaction with TMT (Bisogno et al., 2021; Corbetta et al., 2015). Bartlett's test of sphericity was significant (p < .001) and the Kaiser-Meyer-Olkin Measure of Sampling Adequacy had an overall value of .59, indicating that the dataset was suitable for a PCA. Based on an eigenvalue > 1 criterion, a five-component model best summarized the information of the ten neuropsychological variables, with 70.5% of total variance explained (Table 2). We calculated the scores for the components by averaging the values of the corresponding pair of neuropsychological variables for each patient (bold values in Table 2; e.g. ideomotor praxis = 1, visual gnosis = 2 → Component 1 = 1.5). We followed this approach instead of using the PCA scores because the input data were clinical diagnosis reports converted to low-dimensional ordinal scales; the simpler calculation was thus likely optimal to capture any structure in these noisy data while improving the interpretability of the results from a clinical perspective.
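The eigenvalue > 1 criterion and the averaging of paired ratings can be sketched in Python with NumPy as follows. This is an illustrative approximation only: the study used Jamovi and the psych R package, the varimax rotation is not reproduced here, and the function names are hypothetical.

```python
import numpy as np


def n_components_kaiser(data):
    """Count components whose eigenvalue of the correlation matrix exceeds 1.

    data: array of shape (patients, variables).
    Returns the count and the eigenvalues sorted in descending order.
    """
    corr = np.corrcoef(np.asarray(data, dtype=float), rowvar=False)
    eigenvalues = np.linalg.eigvalsh(corr)[::-1]
    return int(np.sum(eigenvalues > 1.0)), eigenvalues


def component_score(ratings):
    """Average the ordinal ratings of the variables loading on one component."""
    return sum(ratings) / len(ratings)


# Example from the text: ideomotor praxis = 1, visual gnosis = 2
score = component_score([1, 2])  # 1.5
```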
We excluded the executive functions/attention component because it included redundant information related to MF. The language production/comprehension component was also omitted from further analyses because we did not expect deficits in these domains to have an influence on TMT execution, especially given the use of color trail making to accommodate patients with severe language problems.
We subsequently employed the remaining three components (ideomotor praxis/visual gnosis, constructional praxis/ memory, neglect/general cognitive slow down) as covariates in the VLSM analyses and tested them for correlation with the TMT scores (raw and normalized Z-score) to identify potential interactions with TMT and therefore, to better interpret their impact on lesion-symptom inference.
Based on these results, only the components showing correlation with TMT were likely to influence the lesion-symptom inference, while those with no correlation possibly had a negligible effect. Further details on the interpretation of these correlations are reported in the Discussion section.

Image processing
We retrospectively extracted structural CT/MRI brain images (12 CT/155 MRI) acquired as part of the routine clinical stroke investigations (Table 1) from the database of the Cantonal Hospital of Fribourg, Switzerland. We adopted a mixed approach for lesion mask creation based on the quality of the original brain image (Fiez et al., 2000). High-quality images underwent a semi-automated procedure in SPM12 (SPM12 - Statistical Parametric Mapping; alternate URLs: http://www.nitrc.org/projects/spm, https://www.fil.ion.ucl.ac.uk/spm/software/spm12/, https://bio.tools/SPM) using the toolbox Clusterize (procedure detailed in Clas et al., 2012; de Haan et al., 2015) for lesion demarcation and the "clinical" toolbox for automated normalization on the provided old population template (Rorden et al., 2012). Low-quality images were manually recreated on the T1 single-subject template of the Montreal Neurological Institute using the software MRIcron (Rorden & Brett, 2000), a procedure detailed in Chouiter et al. (2016) and Manuel et al. (2013).

Disconnection severity estimation
The software Lesion Quantification Toolbox was used to estimate the impact of the lesion on the structural brain connectome (Griffis et al., 2021). Alternative dedicated code is also available to produce similar outputs for disconnectivity analyses (Disconets_flow, 2022). Since Griffis et al. (2021) fully describe the procedures with the Lesion Quantification Toolbox, we provide only the essentials here. Based on the previously created lesion mask, the toolbox generates the following outputs for each patient:
-Tract disconnection severity: The value is produced by modeling the intersection between the lesion mask and the streamlines composing each WM tract of the chosen tractography atlas. The value is expressed as the percentage of interrupted streamlines of the tract.
-Parcel-wise disconnection severity: A structural connectome is modeled by combining the information from the chosen parcel-wise atlas with the tractography atlas. This measure indexes the severity of the estimated lesion-induced structural disconnections between pairs of gray matter regions of interest (ROIs) with direct structural connections in the atlas-based structural connectome. The value of disconnection severity for each ROI–ROI direct connection is calculated as the percentage of reduction of the streamlines induced by the lesion compared to the structural connectome.
-Voxel-wise disconnection map: By modeling the intersection between the lesion mask and a voxel-wise tract density image (TDI) derived from the tractography atlas, the software provides a 3D disconnection map in which the value of each voxel represents the percentage of reduction of the streamline density compared to the atlas. This value indexes the severity of disconnection at the voxel-wise level, though with no information on the directionality of the fibers traversing the voxel.
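The logic behind the first two measures can be sketched in Python with NumPy as shown below. This is a simplified illustration and not the Lesion Quantification Toolbox code: the function names are hypothetical, streamlines are assumed to be already expressed as integer voxel coordinates in the space of the lesion mask, and a ROI–ROI connection is assumed to be defined by the parcels in which the two endpoints of a streamline fall.

```python
import numpy as np


def tract_disconnection_severity(streamlines, lesion_mask):
    """Percentage of streamlines crossing at least one lesioned voxel.

    streamlines: list of (n_points, 3) integer arrays of voxel coordinates.
    lesion_mask: 3D boolean array (True = lesioned voxel).
    """
    if len(streamlines) == 0:
        return 0.0
    interrupted = 0
    for points in streamlines:
        points = np.asarray(points, dtype=int)
        if lesion_mask[points[:, 0], points[:, 1], points[:, 2]].any():
            interrupted += 1
    return 100.0 * interrupted / len(streamlines)


def pair_disconnection_severity(streamlines, lesion_mask, parcellation, roi_a, roi_b):
    """Percentage of interrupted streamlines among those linking two parcels.

    parcellation: 3D integer array of parcel labels (0 = no parcel).
    A streamline is assigned to the pair when its two endpoints lie in
    roi_a and roi_b (an endpoint-based criterion, assumed for this sketch).
    """
    selected = []
    for points in streamlines:
        points = np.asarray(points, dtype=int)
        first = int(parcellation[tuple(points[0])])
        last = int(parcellation[tuple(points[-1])])
        if {first, last} == {roi_a, roi_b}:
            selected.append(points)
    return tract_disconnection_severity(selected, lesion_mask)
```

The voxel-wise disconnection map follows the same principle, with the percentage computed per voxel on the streamline density rather than per tract or per ROI pair.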
For the present study we ran all the analyses using an "End" connection criterion and the deterministic HCP842 tractography template (Yeh et al., 2018). As for the parcel-wise disconnection severity, we performed two distinct analyses. First, we evaluated general disconnections between the 135 cortical and subcortical gray matter areas of the fMRI-based parcel atlas (Schaefer et al., 2018). Furthermore, to obtain network-level disconnectivity information to subsequently relate to TMT scores with a multivariate machine learning approach, we estimated the total lesion-induced disconnection load within each of the 17 networks of Yeo's parcellation atlas (Yeo et al., 2011). To this aim, for each of the 17 networks, we conducted a parcel-wise disconnection severity analysis and calculated the total disconnection load as the sum of all ROI–ROI disconnection values within the individual network. The atlas from Schaefer et al. is composed of 135 cortical and subcortical gray matter areas, while the atlas from Yeo et al. is composed of 200 cortical gray matter areas divided into 17 networks based on functional connectivity fMRI. Because of their functional approach, these atlases better reflect the physiological configuration of functional connections between ROIs compared to anatomy-based atlases.
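The network-level disconnection load described above amounts to summing the ROI–ROI disconnection values over all pairs of parcels belonging to the same network. A minimal Python sketch follows, assuming a symmetric parcel-by-parcel disconnection matrix and a vector of network labels; both the function name and the data layout are illustrative assumptions.

```python
import numpy as np


def network_disconnection_load(disconnection, network_labels, network):
    """Sum of ROI-ROI disconnection values within one network.

    disconnection: symmetric (n_rois, n_rois) array of disconnection severities.
    network_labels: length-n_rois sequence giving the network of each ROI.
    network: label of the network of interest.
    Each ROI pair is counted once (upper triangle, diagonal excluded).
    """
    labels = np.asarray(network_labels)
    members = np.flatnonzero(labels == network)
    within = np.asarray(disconnection, dtype=float)[np.ix_(members, members)]
    return float(np.triu(within, k=1).sum())
```

Applying the function once per network yields, for each patient, one load value per network, which can then serve as a predictor in the multivariate analysis.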

Relation between TMT and white matter lesions
On the basis of the aforementioned measures of disconnectivity, we investigated the association between impairment in MF and damage to WM tracts, whole-brain direct ROIeROI structural connections and WM fibers with no a priori subdivision of the fibers into predefined anatomical units (Voxel-wise disconnection map) allowing for purely data-driven results. We used a mass univariate general linear model approach for these analyses. We further explored how lesion-induced disruption of large-scale functional networks would be related to MF impairment analyzing the data of network-level disconnection loads with multivariate machine learning techniques. This allowed us to evaluate the effect of each network while considering the influence of the other networks analyzed.
-Mass univariate general linear model: We adapted the publicly available Matlab scripts used in Sperber et al. (2022) for tracts and whole-brain parcel-wise disconnection severities [adapted version and link to the original script are available on the publicly accessible digital repository "Zenodo" (https://zenodo.org/, https://doi.org/10.5281/zenodo.7845740)]. The script computes a general linear model between the lesion-induced WM disconnection measure and the TMT score with a mass-univariate approach (each tract and each ROI–ROI pair is independently tested), and applies a "maximal statistic permutation" correction of significance thresholds for the estimated parameters to control for family-wise error rate [FWER (Nichols & Holmes, 2002)]. We assumed that lesions cause only deficits, and therefore performed one-tailed statistical tests at a corrected alpha threshold of .05 for tract disconnection analyses and .01 for parcel-wise disconnections. For both analyses we considered only structures damaged in at least 10 patients. Within the limits of this restriction, we thus tested for a linear relation between TMT scores and lesion load on each tract of the tractography atlas or on each possible connection of the structural connectome. We performed an analogous analysis (general linear model with mass univariate approach for each voxel) on the voxel-wise disconnection maps, running a VLSM with DTI modality using the software Niistat (Niistat. URL: https://www.nitrc.org/Plugins/Mwiki/Index.Php/:MainPageNiistat). We ran this analysis at a one-tailed .05 alpha threshold corrected with the "maximal statistic permutation" procedure and considered a minimum lesion overlap of 15 patients.
For all analyses we performed 5000 permutations for FWER correction. To account for age and education level, these factors were combined with the raw TMT score in a multiple linear regression model before the mass univariate analyses. The resulting residuals for the TMT score were then used for the three WM analyses.
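The two steps described above, residualizing the behavioral score and correcting mass-univariate tests with the maximal statistic permutation procedure, can be sketched in Python with NumPy as follows. This is not the Matlab script used in the study: the function names are hypothetical, and the Pearson correlation is used here as a stand-in test statistic, which may differ from the statistic computed by the original script.

```python
import numpy as np


def residualize(score, covariates):
    """Residuals of a multiple linear regression of score on the covariates."""
    score = np.asarray(score, dtype=float)
    design = np.column_stack([np.ones(len(score)), covariates])
    beta, *_ = np.linalg.lstsq(design, score, rcond=None)
    return score - design @ beta


def max_statistic_permutation(disconnection, score, n_perm=5000, alpha=0.05, seed=0):
    """One-tailed mass-univariate test with max-statistic FWER correction.

    disconnection: (patients, features) array; every feature must vary.
    score: residualized behavioral score (higher = worse performance).
    Returns observed statistics, the corrected threshold and a boolean mask.
    """
    rng = np.random.default_rng(seed)
    score = np.asarray(score, dtype=float)
    centered = disconnection - disconnection.mean(axis=0)
    normed = centered / np.linalg.norm(centered, axis=0)

    def correlations(y):
        yc = y - y.mean()
        return normed.T @ (yc / np.linalg.norm(yc))

    observed = correlations(score)
    # Null distribution of the maximum statistic across all features
    max_null = np.array(
        [correlations(rng.permutation(score)).max() for _ in range(n_perm)]
    )
    threshold = np.quantile(max_null, 1.0 - alpha)
    return observed, threshold, observed > threshold
```

Because each permutation retains only the largest statistic across all tracts or connections, a single threshold controls the family-wise error rate over the whole set of tests.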
-Multivariate machine learning: We trained a random forest classifier with the scikit-learn library in Python (https://scikit-learn.org/stable/) to predict the presence of a deficit in mental flexibility based on the previously estimated values of disconnection loads within the 17 cortico-cortical networks from Yeo's parcellation atlas and the patients' age, for a total of 18 variables. The presence of a deficit in mental flexibility was expressed using the binary normalized TMT score with a 10th percentile cut-off (see paragraph 2.2a). Although the fifth percentile is considered the formal clinical cut-off, we chose the 10th percentile to reduce the inequality of groups, which would otherwise limit the application of random forest classification. The 10th percentile still expresses performances observed in a minority of the healthy population, and thus would not radically increase the number of false positives in our cohort.
We evaluated the classifier accuracy with the area under the receiver operating characteristic curve (AUC) in an out-of-sample classification. For 100 repetitions, we randomly selected training subsamples of 75% of the patients (N = 125) and trained a random forest classifier with 100 trees and default hyperparameters. We evaluated model performance in the test subsample with the remaining 25% of patients (N = 42). We averaged the AUC across all 100 repetitions to assess the overall classifier performance. We tested model performance against chance by a permutation approach in which we assessed the distribution of AUCs 1000 times under the null hypothesis with randomly shuffled data labels and considered an α = .05. We further interpreted the model decisions by two strategies. First, we computed the importance of each of the 18 variables across the 100 training repetitions by assessing the default variable importance measure in the scikit-learn library (https://scikit-learn.org/stable/). The importance expresses how much the model depends on the variable and corresponds to the mean decrease in impurity. Second, we modeled a non-random single-tree classifier in the total sample of 167 patients with a maximum tree depth of 5 layers and visualized the decision structure.
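The evaluation logic (train/test split sizes, AUC, and a label-shuffling test against chance) can be illustrated with the NumPy sketch below. It deliberately omits the random forest itself: the scores passed to these functions stand for the out-of-sample predictions of any classifier, and, as a simplification, the null distribution is obtained by shuffling labels against fixed scores rather than by retraining the model on shuffled labels. All function names are hypothetical.

```python
import numpy as np


def split_sizes(n_patients, train_fraction=0.75):
    """Number of patients in the training and test subsamples."""
    n_train = int(n_patients * train_fraction)
    return n_train, n_patients - n_train


def auc_score(labels, scores):
    """Rank-based AUC (Mann-Whitney U), with average ranks for ties."""
    labels = np.asarray(labels).astype(bool)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    for value in np.unique(scores):
        tied = scores == value
        ranks[tied] = ranks[tied].mean()
    n_pos = int(labels.sum())
    n_neg = len(labels) - n_pos
    u = ranks[labels].sum() - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)


def permutation_p_value(labels, scores, n_perm=1000, seed=0):
    """Proportion of label shufflings reaching the observed AUC."""
    rng = np.random.default_rng(seed)
    observed = auc_score(labels, scores)
    null = np.array(
        [auc_score(rng.permutation(labels), scores) for _ in range(n_perm)]
    )
    p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, p_value
```

With 167 patients, `split_sizes(167)` returns 125 and 42, matching the subsample sizes reported above.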

2.6. Voxel-based lesion symptom mapping analyses
We conducted the VLSM analyses using the software Niistat [procedure described in (Niistat. URL: https://www.nitrc.org/Plugins/Mwiki/Index.Php/:MainPageNiistat)]. In VLSM the behavioral impact of damage to a specific voxel is not tested against healthy individuals, but against patients with lesions in other brain regions. The main advantage of this type of reference class is that it controls for effects of brain lesions that are unrelated to their location. We refer the reader to Rorden et al. (2009) and Sperber & Karnath (2018) for further discussion of the methodological challenges related to the type of comparison implemented in VLSM approaches. We performed all analyses voxel-wise in lesion modality and analyzed only voxels that were damaged in at least 3 patients. This number was chosen to obtain adequate brain coverage given the prevalence of small lesions in our cohort. To exclude from the analysis voxels located outside the brain template because of imprecisions in the normalization procedure, we ran all VLSM analyses using the custom "Explicit Voxel Mask" available on the publicly accessible digital repository "Zenodo" (https://zenodo.org/, https://doi.org/10.5281/zenodo.7845740). We adopted a one-tailed .05 alpha threshold with a permutation-based correction for FWER with 5000 permutations. The Freedman-Lane method with the same number of permutations and alpha was used to account for the effect of covariates (Winkler et al., 2014).

TMT corrected for age and education
We combined the raw TMT score from each patient with age and education level as covariates in the VLSM. We also performed the same analysis using the TMT B-A score binarized on the median of the sample. Because binarization considerably reduces the variance of the variable distribution, this procedure was done to compensate for the potential low power of the VLSM due to the conservative permutation correction approach used, and to the low lesion frequency we observed in parts of the brain in our sample. For both analyses, we reduced years of education to an ordinal scale with 3 levels (1: < 10 years, 2: 10–15 years, 3: > 15 years).
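The two recodings described above can be sketched as follows. This is a hedged illustration: the function names are hypothetical, and two details are assumptions not specified in the text, namely that exactly 10 and exactly 15 years of education both fall in the middle level, and that scores strictly above the sample median are coded as 1.

```python
from statistics import median


def education_level(years):
    """Map years of education to a 3-level ordinal scale (assumed cut-offs)."""
    if years < 10:
        return 1
    if years <= 15:
        return 2
    return 3


def binarize_on_median(scores):
    """Code each score as 1 if strictly above the sample median, else 0."""
    cutoff = median(scores)
    return [int(score > cutoff) for score in scores]


# Hypothetical TMT B-A difference scores (in seconds) for four patients.
example_scores = [10, 50, 30, 90]
example_binary = binarize_on_median(example_scores)  # median is 40
```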

TMT corrected for the neuropsychological components
To test the effect of impairments in other cognitive domains on TMT, we performed additional VLSM analyses combining separately the raw and binarized TMT scores with the three PCA-derived neuropsychological components together with age and education level as covariates. Moreover, the normalized TMT Z-score described in section 2.2 was used to perform a VLSM alone or accounting for the influence of the neuropsychological components. Details and results of the aforementioned investigations are presented as supplementary material.

Connectome-based lesion-symptom mapping analyses
The lesion coverage map of the voxel-wise TDI tractography atlas (see 'voxel-wise disconnection map' in paragraph 2.5.1) is shown in Fig. 1. The coverage depicts the overlap of the voxel-wise disconnection maps of all patients composing our sample.
Fig. 1 – Coverage map of the voxel-wise disconnections for voxels with disconnection in at least ten patients: The color of each voxel represents the number of patients whose lesions produced a reduction of streamline density passing through the voxel. MNI z coordinates of the axial sections are reported.

Tract disconnection severity
Fifty of the 70 tracts from the HCP842 tractography atlas were damaged in at least 10 patients and thus analyzed (Table S2 of the supplementary material). The univariate linear regression between the raw TMT score (adapted for age and education) and the severity of disconnection reached our significance threshold for 14 tracts (one-tailed corrected α = .05). Results included six association and seven projection tracts within the left hemisphere as well as the posterior portion of the corpus callosum (Table 3).

Parcel-wise disconnection severity
- Whole-brain disconnections: Among 1115 ROI–ROI direct disconnections analyzed (6% of the total, damaged in at least 10 patients), 28 were positively related to the TMT score adjusted for age and education (one-tailed corrected α = .01), meaning that greater disconnection severity was linearly related to worse TMT scores. Disconnections are shown in a 3D representation in Fig. 2 and presented in detail along with their respective linear slope and correlation coefficients in Table S3 of the supplementary material.
Referring to the single subject anatomical parcellation atlas from Tzourio-Mazoyer et al. (2002), intrahemispheric disconnections within the left hemisphere were found for the BG with premotor areas, Rolandic operculum, ventral postcentral sulcus and supramarginal gyrus (SG). Interhemispheric disconnections were relevant between SG, temporal operculum and right superior parietal lobule (SPL), precuneus (PC) as well as the lingual gyrus (LG) and calcarine sulcus. The left temporal lobe showed disconnections with the right SPL and intraparietal sulcus while the left insula revealed disconnections with the contralateral dorsal postcentral sulcus and gyrus. Finally, the left paracentral lobule and middle cingulate cortex (MCC) showed disconnections with contralateral MCC and right SPL.
Correlation coefficients ranged from .36 to .41, indicating small to moderate effect sizes. The slope coefficients ranged from .7 to 1.4, indicating that a 10% increase in disconnection severity predicts an increase of 7 to 14 s in the completion time difference between TMT B and A.
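To make the interpretation of the slope explicit, the following sketch fits a simple least-squares line and converts its slope into a predicted change in the TMT B-A difference. The data points and function names are hypothetical, and the sketch assumes that the slope is expressed in seconds per percentage point of disconnection severity, as the interpretation above implies.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares slope, intercept and Pearson r for paired data."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


def predicted_increase(slope, delta_percent):
    """Change in TMT B-A (s) predicted for a change in disconnection (% points)."""
    return slope * delta_percent


# Hypothetical, noise-free example: disconnection severity (%) vs. TMT B-A (s).
severity = [0, 10, 20, 30, 40]
tmt_delta = [20, 27, 34, 41, 48]
slope, intercept, r = fit_line(severity, tmt_delta)

low = predicted_increase(0.7, 10)   # lower bound of the reported slope range
high = predicted_increase(1.4, 10)  # upper bound of the reported slope range
```

For the reported range of slopes, a 10-point increase in disconnection severity therefore corresponds to roughly 7 s (slope .7) to 14 s (slope 1.4) of additional completion time difference.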

Voxel-wise disconnection maps
Analysis of voxel-wise disconnection maps (α = .05 corrected) revealed main clusters within the left hemisphere stemming from the SG, posterior temporal lobe and dorsal postcentral sulcus. In the right hemisphere two small clusters were found in the subcortical white matter of the SPL and angular gyrus (AG) (Fig. 4). Given the discontinuity of the clusters due to the highly conservative FWER correction, results at an uncorrected α = .00001 are also shown in Fig. 4. With this threshold, the clusters described above expand and extend in the left hemisphere to the superior and middle temporal gyri (STG, MTG), insula, BG, dorsal premotor and lateral prefrontal areas. Projection pathways towards the brainstem and the posterior portion of the corpus callosum are also revealed.

Voxel-based lesion symptom mapping analyses
Only voxels damaged in at least three patients were analyzed. The resulting map of brain lesion coverage is shown in Fig. 5.

TMT corrected for age and education
The VLSM analysis of raw TMT with age and education level as covariates revealed a small significant cluster in the left external capsule (MNI xyz coordinates −31, −26, 21) with the ...
Fig. 3 – Disruption of functional networks impairing the TMT performance: non-random single-tree classifier with a depth of 5 layers. Patients who meet the condition of a decision node branch to the left (green lines). Within each node the following information is presented in descending order: classification condition, number of patients within the node, occurrence of observed conditions within the node sample [patients without TMT impairment, patients with TMT impairment]. The condition for networks is expressed as the % disconnection value. Node conditions are organized so that patients classified as having no TMT impairment correspond to leaf nodes with a green branch. Cont = cognitive control network; Default = default mode network; DorsAttn = dorsal attention network; SomMot = somato-motor network.

TMT corrected for the neuropsychological components
The addition of neuropsychological components as covariates produced patterns of results very similar to the analyses of TMT corrected for age and education, while the VLSM with the normalized TMT score yielded no significant results. Details and results are presented in Supplementary Figs. S1 and S2.

Discussion
We identified the white matter tracts whose damage impaired mental flexibility [MF, as measured by performance on the Trail Making Test (TMT)], by applying a whole-brain Connectome-based Lesion-Symptom Mapping analysis (CLSM) on a large cohort of 167 first unilateral stroke patients, the majority of whom underwent TMT testing in the acute phase (3rd quartile = 9 days post-stroke). We confirmed our general hypothesis that since MF requires interactions between multiple remote network components, the integrity of distributed brain anatomical connectivity tracts is required to maintain performance.

Connectome-based lesion-symptom mapping analyses
We found associations between impairments in mental flexibility (MF) and damage to three main groups of anatomic connections: i) associative tracts connecting parietal, temporal and frontal areas within the left hemisphere, as well as an interhemispheric connection between left temporo-parietal and right parieto-occipital cortices; ii) left intrahemispheric connections of inferior parietal and dorsal premotor regions with basal ganglia (BG); iii) cortico-pontine projection tracts from the left hemisphere.

The role of fronto-temporo-parietal connections in mental flexibility
Confirming our main hypothesis, we found that a large left-lateralized fronto-temporo-parietal network (FTP) connected to the right parietal, posterior cingulate and occipital regions (calcarine sulcus, lingual gyrus) was involved in MF (Figs. 2 and 4).
Specifically, we found evidence for a role of the left hemisphere extreme capsule, frontal aslant tract, superior longitudinal (SLF), arcuate (AF) and uncinate (UF) fasciculi (Table 3), which reveals the importance of intrahemispheric communication between associative frontal, temporal and parietal areas. We also found involvement of the posterior corpus callosum and numerous direct interhemispheric connections of the left temporo-parietal cortex with right parietal and occipital regions (Table 3, Fig. 2).
Further focusing on the functional network disconnection analysis, we revealed the contribution of the cognitive control, dorsal and ventral attention and default mode networks, which typically involve multiple areas of the frontal, temporal and parietal lobes with associative roles [Fig. 3 (Yeo et al., 2011)].
Although limited, previous reports on the effect of lesions on white matter connections in MF also suggest a role of SLF, AF and UF, especially in the left hemisphere (Barbey et al., 2012;Cristofori et al., 2015;Petersen et al., 2022). Moreover, recent works focusing on patients with low-grade glioma also found a role of SLF and AF (Cochereau et al., 2020;Hartung et al., 2021;Mandonnet et al., 2020). However, apart from a single case report study of low-grade glioma which found an association between lesions intersecting the corpus callosum and post-ablation worsening of TMT score (Mandonnet et al., 2020), no interhemispheric connection has so far been identified as relevant, in contrast with our findings and with fMRI reports of bilateral cortical activity in MF (Varjacic et al., 2018;Worringer et al., 2019).
Our results also corroborate previous lesion and functional literature for a prominent role of left frontal and temporal areas, left supramarginal gyrus (SG), insula, anterior cingulate cortex (ACC), and mostly right parietal areas in MF (Barbey et al., 2012;Cristofori et al., 2015;Moll et al., 2002;Stuss et al., 2001;Talwar et al., 2020;Varjacic et al., 2018;Worringer et al., 2019;Zakzanis et al., 2005). Importantly, by combining information from both gray matter and tractography atlases, our CLSM allowed us to better delineate the structural network connecting these areas and to provide further insights into the way they interact in MF. In other words, it not only allowed us to infer which areas or which structural connection tracts might be important for MF, but also revealed that communications between a specific cluster of nodes are critical for MF. In particular, we confirmed that the integrity of anatomical connections within the left FTP network and with right parietal regions is crucial for MF.
These results provide new causal evidence that the areas composing this network likely operate with an important degree of functional interdependence during MF tasks. This finding is in line with previous evidence for an association of the FTP network with key cognitive subcomponents of MF (Dajani & Uddin, 2015;Kim et al., 2012;Periáñez et al., 2004;Rushworth et al., 2005;Wager et al., 2004;Worringer et al., 2019).
Thanks to the use of highly controlled experimental tasks, stimulus-response task-switching paradigms have allowed decomposition of MF into multiple cognitive subprocesses supported by a distributed bilateral FTP network. An interesting model by Jamadar et al. (2010) suggests that anticipatory task-set reconfiguration (e.g. for TMT B, switching from letters to numbers) depends predominantly on prefrontal areas and precuneus (predominantly left hemisphere), which would predispose bilateral parietal regions to retrieve the task-related sets of rules (letter vs number sequences). The actual switching process is then mainly implemented by bilateral parietal and premotor regions in association with ACC and prefrontal areas (predominantly right hemisphere) for conflict detection and inhibition of interference (inhibiting the number sequence to follow the letter sequence and vice versa). The insula, a central cognitive hub, has been proposed to support the maintenance of goal-directed behavior and attention during MF tasks, especially in association with other medial cortical and subcortical regions (ACC, medial superior frontal gyrus, basal ganglia; Menon & Uddin, 2010;Dosenbach et al., 2007, 2008;Varjačić et al., 2018). Although we found a left lateralization of the MF network, the right homologous structures may also contribute to the switching process, as indicated by the relevant interhemispheric connections we found (Vallesi et al., 2022). There is evidence suggesting that the left hemisphere would be dominant for task-set reconfiguration processes, while the right hemisphere would preferentially support interference inhibition and response monitoring roles (Aron et al., 2004;Mecklinger et al., 1999;Vallesi, 2012, 2021).
Relative to TMT, a deficit of task-set reconfiguration capacity would be responsible for the slowing of planning and consequent implementation of switching, while impaired interference inhibition would produce a greater occurrence of errors, two effects capable of reducing the TMT performance. Therefore, our finding of a major involvement of the left hemisphere, given the similar distribution of lesions between the two hemispheres (Fig. 5), suggests that TMT completion time largely depends on predictive task reconfiguration capacities while being less influenced by error-making.
However, while a central role of the FTP network has been largely confirmed by functional studies, the specific contributions of the regions composing the network as well as their interaction still remain unclear, as reflected by the existence of multiple anatomo-functional models of MF that have been proposed (Jamadar et al., 2015;Karayanidis et al., 2010;Ruge et al., 2013). Despite some contrasting results, causal evidence from lesion studies also suggests the involvement of both frontal and non-frontal areas in MF (Barbey et al., 2012;Chan et al., 2015;Varjacic et al., 2018; but see Stuss et al., 2001), and, although limited, some reports point to a contribution of localized frontal regions to specific MF-related processes (Stuss & Alexander, 2007;Stuss et al., 2001). The nonuniformity of the tasks used to test the MF, which involve variable combinations of cognitive and perceptual processes might explain the variability in the anatomo-functional correlations observed in the literature (Friedgen et al., 2022;Jamadar et al., 2015;Rushworth et al., 2005). However, Stuss et al. found in various populations with focal frontal lesions, that well-defined regions in this lobe were associated with task setting and monitoring processes (two functions involved in MF tasks) assessed with multiple tasks featuring variable demands in terms of perceptual modality and cognitive associated functions (Stuss, 2011;Stuss & Alexander, 2007). This consistency of results suggests that the frontal lobe includes substrates of subfunctions purely related to the MF with little or no influence of task modality variations, to which the activity of non-frontal areas might in contrast be more susceptible.
Overall, based on these observations and the mentioned results from the task-switching literature, some authors have suggested that MF cannot be identified with a specific unitary process, but rather results from the integrated activity of multiple anatomo-functional brain units capable of flexibly adapting their computational outputs to meet different behavioral and cognitive goals (Dajani & Uddin, 2015;Medaglia et al., 2018;Niendam et al., 2012). This interpretation seems particularly interesting in the light of reports showing high correlation between tests measuring different executive functions and the evidence of high sharing of anatomical substrates between executive processes suggested by lesion mapping studies on fluid intelligence (Barbey et al., 2012;Gläscher et al., 2010, 2012). Different executive processes might dynamically recruit a specific combination of modules from a superordinate common network setting the whole system to a relatively unique configuration of activation as proposed by these studies (Niendam et al., 2012;Stuss, 2011;Stuss & Alexander, 2007;Zink et al., 2021).
Another element favoring the idea of a dynamic recruitment of modules within a shared superordinate network comes from the observation that MF is associated with the integrity of anatomical connections within various well-established resting state networks (Mrah et al., 2022;Seeley et al., 2007;Varjacic et al., 2018;Vatansever et al., 2016). In this regard, while our functional network analysis found that damage to the somato-motor network was the most relevant to predict deterioration of the MF performance, it also confirmed the importance of the executive control, default mode, as well as the ventral and dorsal attention networks in MF. While the former two systems have been already associated with MF (Mandonnet et al., 2020;Mrah et al., 2022;Seeley et al., 2007;Varjacic et al., 2018;Vatansever et al., 2016), there are no substantial reports of an association between MF and connectivity of the attentional networks. It is however understandable that directed attention would be highly relevant for switching tasks focused on external stimuli, and in a broader perspective it has been proposed that the executive control might be involved in alternating between internally and externally focused cognitive activities (associated with activity of the default and attention networks respectively) during executive tasks (Spreng et al., 2010). Our finding of a relevant influence of the somatomotor networks might be due to the TMT difference score not being totally selective for MF-related processes. However, because the somatomotor network is located centrally to the FTP network, it is possible that the most common middle cerebral artery strokes frequently involved the somatomotor regions along with those associated with MF (belonging to the default, control, and attention networks) that are more variably injured, thus producing an artefactual association between the MF and the somatomotor areas.
Nonetheless, we excluded this possibility because the co-occurrence of disconnection between the somatomotor and the default, control, or attention networks was observed in a low proportion of patients (between 6.6 and 13.8%).
Once more, these results support a central role of FTP networks in MF with more limited involvement of structures associated with a large spectrum of cognitive activities, which testifies to the large variety of dynamic interactions underlying MF. Furthermore, the fact that, largely replicating the random forest approach of Mrah et al. (2022), we confirmed in a different population (stroke instead of glioma) and database the importance of the cognitive control A network for MF corroborates the validity of these findings and of the random forest method for the study of network-based functions.

The role of basal ganglia connections in mental flexibility
We also found a key role of cortico-basal connections in MF. Specifically, we identified an involvement of left intrahemispheric connections of frontal premotor areas, SG and ventral postcentral sulcus with BG, as well as an extended involvement of cortico-thalamic and cortico-striatal tracts.
The only TMT VLSM study with bilateral BG lesion patients did not identify this structure as relevant, but rather pointed to a cluster in the left insula and corona radiata (Varjačić et al., 2018). This might be due to their use of a measure of accuracy with focus on switching errors, but no time restrictions on a shape-based TMT version. Since switching errors are usually present in patients with more severe deficits, such an index is probably more specific, but less sensitive than completion time in detecting MF load on TMT. However, our VLSM analysis also showed no clusters in BG (the lesions in our sample covered the bilateral striatum, but not the thalamus), and it is possible that the WM lesions causing disconnections from these centers have a larger impact than direct lesions usually occurring in BG as small lacunar damage. Further corroborating our finding, activation of BG during TMT B or other set-switching tasks was observed with fMRI in healthy populations (Jacobson et al., 2011;Kim et al., 2012).
At the cognitive level, BG would exert a regulatory activity on cortical outputs, specifically regarding the implementation of the switching command (Berry et al., 2018;Bonnavion et al., 2019;Ea et al., 2006;Hoshi, 2013;Shafritz et al., 2005). This hypothesis might explain MF impairments and perseveration behaviors in patients with BG degeneration like in Parkinson's and Huntington's diseases (Aron et al., 2003;Cools et al., 2001;Hughes et al., 2013). Moreover, it is consistent with our finding for a role of BG connections with left premotor and inferior parietal areas, two structures involved in motor planning and inhibition (Bečev et al., 2021;Haar & Donchin, 2020;Króliczak et al., 2016;Marvel et al., 2019;Omata et al., 2018). Overall, our results constitute an important confirmation of the strong involvement of cortico-basal connections in MF regulation, especially mediated by the interaction with the FTP network.

The role of cortico-pontine connections in mental flexibility
Finally, we identified a role for left hemispheric cortico-pontine connections from frontal, parietal, temporal and occipital lobes. These tracts project to the contralateral cerebellum via the pontine nuclei of the brainstem to then enter complex cortical loops with the thalamus. Compatible with previous evidence for an involvement of the cerebellum in high-level executive functions, these findings suggest a contribution of this structure to the associative MF network (Clark et al., 2021).

4.2. Voxel-based lesion symptom mapping analyses
VLSM analysis of the TMT B-A (corrected for age and education level) revealed a main contribution of deep periventricular WM in MF, in addition to the involvement of lateral frontal and parietal cortex, and lenticular nucleus in the left hemisphere (Fig. 6).
With the exclusion of the left lenticular nucleus, these areas have already been associated with TMT and MF in previous VLSM studies (Barbey et al., 2012;Cristofori et al., 2015;Varjacic et al., 2018). However, our results did not confirm an involvement of other structures found in VLSM literature on the subject, like the left insula, or more extended portions of frontal and parietal regions bilaterally (Gläscher et al., 2012;Miskin et al., 2016;Varjacic et al., 2018;Varjačić et al., 2018). This discrepancy likely follows from the specific lesion distribution in our sample. Interestingly, VLSM results were consistent with our CLSM findings but limited to a smaller number of structures, which demonstrates how CLSM could advantageously complement VLSM while being less dependent on the lesion distribution in specific samples of patients.

The influence of coexisting cognitive deficits on mental flexibility
An important limitation of lesion-symptom mapping studies in stroke populations pertains to the coexistence of multiple functional impairments. We addressed this issue by combining PCA-derived neuropsychological components for neglect, memory, praxis, gnosis and general processing speed as covariates in the VLSM of TMT. Yet, controlling for these components in the analyses did not change the results (see supplementary material). While our findings might have been robust to the noise induced by other cognitive impairments, it is more likely that this null result was due to the use of indirect measures of cognitive performance, which were derived from qualitative descriptions in clinical reports and further submitted to a PCA. Supporting this latter interpretation, a significant but small correlation with TMT was found only for two of the three components, probably resulting in an underestimation of their influence on TMT.
Only one other VLSM study of MF in a large cohort of patients accounted for the covariance of verbal and spatial abilities with TMT and found significant results in the ACC (Gläscher et al., 2012). However, because a comparative VLSM analysis of TMT without covariates was not performed in that work, assessing the impact of coexisting cognitive deficits was not possible. Further studies are thus necessary to resolve this question.

Limitations
- The specific distribution of lesions in our cohort (Fig. 5), with higher concentration (and thus statistical power) around the BG and along the lateral and central fissures, might have biased our results toward these areas. Yet, the same lesion distribution on the tractography atlas resulted in a more extended WM brain coverage (Fig. 1), making CLSM possibly less influenced by lesion distribution than classical VLSM analyses. In addition, the considerable overlap between our findings and previous literature on the topic, despite the fact that outcomes of lesion-symptom mapping approaches are influenced by clinical and analytical variability, supports the validity of our results.
- The TMT task involves a multitude of cognitive and sensorimotor processes, some of which are partially shared between TMT A and B (Karimpoor et al., 2017;Talwar et al., 2020). We focused on the difference score TMT B-A to sort out these common components. However, it is possible that TMT B places a greater demand than part A on some processes involved in both parts (attention, visual search, processing speed, motor control etc.). Indeed, our findings of an involvement of the left corticospinal tract and cortico-pontine connections, as well as the importance of the somato-motor functional networks, might suggest a different contribution of motor integration in TMT B compared to part A. Furthermore, TMT B might still engage other cognitive functions not needed for part A, like working memory, inhibition, language etc. Because of these components, the delta score might not have been sufficient to isolate the unique contribution of MF. However, our TMT-related results are similar to those revealed by fMRI investigations of more elementary stimulus-response tasks designed to be selective for MF processes, which suggests that our score was effective in removing a considerable portion of the contribution from collateral processes (Kim et al., 2012;Worringer et al., 2019).

Conclusions
Overall, our results confirmed the main hypothesis of MF relying on a prominent left hemisphere fronto-temporo-parietal and subcortical associative network with important interhemispheric connections. We further demonstrated the importance of WM connections in MF: the degree of damage to multiple associative tracts and direct connections between gray matter areas was linearly related to MF deficits. The damage to WM connections within the cognitive control, default mode, ventral and dorsal attention networks was also associated with impaired MF. Altogether, these results support the idea that MF involves a complex integrative process of sensorimotor and cognitive subcomponents that relies on a widespread system of highly interacting anatomo-functional substrates. This system seems to be partially shared between various high-order executive functions and engages various superordinate functional networks. To our knowledge, no previous study has applied whole-brain CLSM techniques to explore the relationships between focal symptomatic lesions and TMT scores of MF. This approach allowed us to reconcile results from lesion-symptom mapping and fMRI literature to reach conclusions compatible with recent theoretical interpretations of MF and higher cognitive brain processing. Our results concur with the idea that MF cannot be defined as a discrete high-order process, but rather as consisting of multiple interacting cognitive subfunctions, which rely on a functionally distributed, domain-general network involving multimodal sensorimotor and associative integration (Barbey et al., 2012;Dajani & Uddin, 2015;Zink et al., 2021).
From a technical point of view, our multi-layered analytical approach succeeded in overcoming the difficulties inherent to the study of a complex cognitive function like MF and provided a deeper representation of its anatomo-functional correlates compared to other lesion-symptom mapping studies. Especially combined with VLSM or other techniques, CLSM analyses expand the amount of information derived from lesion-symptom investigations.

Declaration of competing interest
None.