Language networks in aphasia and health: A 1000 participant activation likelihood estimation meta-analysis

Aphasia recovery post-stroke is classically and most commonly hypothesised to rely on regions that were not involved in language premorbidly, through 'neurocomputational invasion' or engagement of 'quiescent homologues'. Contemporary accounts have suggested, instead, that recovery might be supported by under-utilised areas of the premorbid language network, which are downregulated in health to save neural resources ('variable neurodisplacement'). Despite the importance of understanding the neural bases of language recovery clinically and theoretically, there is no consensus as to which specific regions are more likely to be activated in post-stroke aphasia (PSA) than healthy individuals. Accordingly, we performed an Activation Likelihood Estimation (ALE) meta-analysis of language functional neuroimaging studies in PSA. We obtained coordinate-based functional neuroimaging data for 481 individuals with aphasia following left-hemisphere stroke and 530 linked controls from 33 studies that met predefined inclusion criteria. ALE identified regions of consistent, above-chance spatial convergence of activation, as well as regions of significantly different activation likelihood, between participant groups and language tasks. Overall, these findings dispute the prevailing theory that aphasia recovery involves recruitment of novel right hemisphere territory into the language network post-stroke. Instead, multiple regions throughout both hemispheres were consistently activated during language tasks in both PSA and controls. Regions of the right anterior insula, frontal operculum and inferior frontal gyrus (IFG) pars opercularis were more likely to be activated across all language tasks in PSA than controls. Similar regions were more likely to be activated during higher than lower demand comprehension or production tasks, consistent with them representing enhanced utilisation of spare capacity within right hemisphere executive-control related regions. 
This provides novel evidence that 'variable neurodisplacement' underlies language network changes that occur post-stroke. Conversely, multiple undamaged regions were less likely to be activated across all language tasks in PSA than controls, including domain-general regions of medial superior frontal and paracingulate cortex, right IFG pars triangularis and temporal pole. These changes might represent functional diaschisis, and demonstrate that there is not global, undifferentiated upregulation of all domain-general neural resources during language in PSA. Such knowledge is essential if we are to design neurobiologically-informed therapeutic interventions to facilitate language recovery.



Introduction
Post-stroke aphasia (PSA) is prevalent and debilitating ( Engelter et al., 2006 ) and recovery of function tends to be variable and often incomplete ( Yagata et al., 2017 ). Compensatory changes in patterns of neural activity, reflecting increased utilisation of surviving neural regions, are hypothesised to contribute to aphasia recovery ( Murphy and Corbett, 2009 ;Stefaniak et al., 2020 ;Turkeltaub et al., 2011 ). While previous studies have explored which set of regions are consistently activated in PSA ( Turkeltaub et al., 2011 ), multiple key questions remain unanswered. These include: (a) which regions, if any, are more or less likely to be activated in PSA than healthy individuals across all language tasks and do these regions differ between language tasks of different nature (comprehension vs. production); (b) are regions upregulated in PSA also modulated by task difficulty (higher vs. lower demand); and (c) do the differentially activated regions vary between different stages of recovery. Such knowledge will be essential to understand the mechanisms underlying language network plasticity and thus design neurobiologically-informed therapeutic interventions to aid language recovery. Accordingly, this study tackled these targeted questions through the largest Activation Likelihood Estimation (ALE) meta-analysis, to date, of functional neuroimaging studies in PSA (n = 481) and healthy controls (n = 530). We define the language network as regions consistently activated during language, which might include both language-specific regions, reportedly activated during language but not non-language tasks ( Fedorenko et al., 2011 ;Pritchett et al., 2018 ), as well as domain-general regions activated during both language and non-language tasks ( Fedorenko et al., 2013 ;Geranmayeh et al., 2017 ). There were several specific questions we sought to address. We consider these briefly, below, with respect to three major themes.
First, even though recovery of language after stroke has perplexed researchers since the seminal studies of aphasia in the nineteenth century (Finger et al., 2003), there have been very few formal, implemented models (Chang and Lambon Ralph, 2020) and hypotheses have rarely been tested in relation to large patient datasets (Stefaniak et al., 2020). Some proposed mechanisms of partial language recovery in PSA hold that neural networks unused during language in health can adapt after stroke to perform a function similar to the one normally supported by the now damaged neural network(s) (Stefaniak et al., 2020), for instance through immediate engagement of quiescent homologues (Finger et al., 2003) or through neurocomputational invasion of non-language regions via experience-dependent plasticity (Keidel et al., 2010; Southwell et al., 2016). Alternatively, variable neurodisplacement (Stefaniak et al., 2020) proposes that 'well engineered' language and cognitive networks dynamically balance performance demand against energy expenditure, downregulating spare capacity under standard performance demands in health but running the remaining system 'harder' after partial damage (as the intact system can do when under increased performance demands; Jung and Lambon Ralph, 2016; Rice et al., 2018; Robson et al., 2014; Sharp et al., 2010). These mechanisms are not mutually exclusive and might include both language-specific (Fedorenko et al., 2011; Pritchett et al., 2018) and non-language networks, including domain-general executive networks (Fedorenko et al., 2013), in both hemispheres (Stefaniak et al., 2020). Key predictions of variable neurodisplacement are that compensatory language network changes in PSA are due to upregulation of spare capacity within the pre-existing language network, and that these same upregulated neural regions show increased activation for harder over easier tasks in both PSA and healthy individuals.
Second, there is a tendency to treat 'language' and its recovery as a single, homogeneous cognitive function. Instead, language refers to a diverse range of expressive and receptive activities. Different language activities are supported by interactions between various more general neurocognitive computations (Gordon et al., 2002; Mementi et al., 2011; Patterson and Lambon Ralph, 1999) which can be damaged independently of each other to generate the graded, multidimensional nature of post-stroke aphasia (Alyahya et al., 2020; Butler et al., 2014; Kummerer et al., 2013; Mirman et al., 2015). Consequently, theories of recovery need to consider not only how each primary neurocognitive system might recover, but also how changes in their interactivity can support improved performance across different language activities. Changes in the division of labour across systems can occur not only between language networks (Ueno et al., 2011) but also between language and multi-demand executive systems (Geranmayeh et al., 2017; Hartwigsen, 2018).
An important second aspect of this issue is that different subcomponents of language, such as those subserving comprehension versus production, might have differently distributed networks, including degrees of lateralisation, premorbidly ( Lidzba et al., 2011 ). For instance, the language network is often described as unilateral ( Mazoyer et al., 2014 ) but several lines of evidence suggest it is at least partially bilateral but asymmetric ( Fedorenko et al., 2011 ;Lambon Ralph et al., 2001 ). This has significant implications as many studies have highlighted a role for the right hemisphere in recovery ( Crinion and Price, 2005 ;Skipper-Kallal et al., 2017a , b). Depending on the degree of premorbid asymmetry, right hemisphere activation might reflect engagement of pre-existing right hemispheric regions of the language network via variable neurodisplacement versus novel recruitment of non-language regions via neurocomputational invasion ( Chang and Lambon Ralph, 2020 ;Warburton et al., 1999 ). It is important, therefore, to compare activation patterns in post-stroke aphasia with the natural distribution of the same language subcomponent(s) in healthy individuals.
Third, language recovery is dynamic and occurs most rapidly during the first few months post-stroke (Pedersen et al., 1995; Yagata et al., 2017), with spontaneous language changes being slower and smaller by the 'chronic' stage after approximately 6-12 months (Hope et al., 2017). Thus, in order to identify language network changes that are associated with recovery, it is important to compare language networks at subacute vs. chronic stages of recovery.
Given these many outstanding questions, this study sought to identify regions of consistent, above-chance spatial convergence of activation, as well as regions of significantly different activation likelihood, between participant groups and language tasks. The omnibus ALE meta-analysis considered which specific regions are more likely to be activated in PSA than healthy individuals across all language tasks. Subsequent subgroup analyses investigated differences based on: comprehension versus production tasks; for each task type, higher versus lower demand tasks; and time post-stroke (i.e., sub-acute vs. chronic PSA). Unfortunately, there were too few studies of sub-acute patients in the literature to contrast them against chronic PSA in this meta-analysis. If language recovery reflects neurocomputational invasion or engagement of quiescent homologues, then the post-stroke language network should expand to include novel regions that are not consistently activated in healthy individuals, even under increased task difficulty. Conversely, variable neurodisplacement predicts that the networks observed in PSA should also be observed in healthy controls, particularly when the healthy system is placed under greater performance demands.

Study search and selection
We searched the databases Medline, Embase and PsycINFO up to April 2020. Terms relating to aphasia (aphasia OR dysphasia OR language OR fluency OR phonology OR semantics OR naming OR repetition OR comprehension OR speaking), stroke (stroke OR ischaemia OR ischemia OR infarct) and neuroimaging (fMRI OR PET OR neuroimaging OR imaging OR functional) were used. We identified eligible articles reporting observational studies that had: a) more than one person with language impairment at any time following a single left hemispheric stroke; b) more than one healthy control; and c) performed fMRI or ¹⁵O-PET during language task-based functional neuroimaging. We extracted coordinate data for inclusion in this ALE meta-analysis that: related to activation (not deactivation) during a language task-based functional neuroimaging experiment; was provided in standard space; was derived from whole-brain mass-univariate analyses without regions of interest (ROIs), small volume corrections (SVC), or conjunctions (Müller et al., 2018); was reported separately for PSA and control groups; and was calculated using the same significance thresholds in the PSA and control groups. We excluded coordinate data from survivors of right hemisphere strokes or with multiple previous strokes. Full details are reported in the Supplementary Information. If coordinates meeting these criteria for both the PSA and control groups were not provided in the publication, the authors were contacted to request unpublished coordinates.

ALE meta-analysis
Peak coordinates pertaining to language activation were extracted from each included article and double-checked by the same author (JDS). Coordinates in Talairach space were converted to Montreal Neurological Institute (MNI) space using the Lancaster transformation (Lancaster et al., 2007). GingerALE 3.0.2 was used to perform ALE (http://brainmap.org/ale/), which is a random-effects coordinate-based meta-analytic technique that identifies neural regions at which activation peaks converge above-chance across participant groups within a single dataset (Eickhoff et al., 2011; Eickhoff et al., 2009; Turkeltaub et al., 2012). Individual studies might have reported activation coordinates for multiple subgroups of participants and thus contributed more than one participant group to the meta-analysis. Briefly, we grouped together activation peaks from all imaging tasks performed by the same participant group. Each peak was modelled as a 3D Gaussian distribution of activation probability with a Full Width at Half Maximum (FWHM) based on empirical estimates of spatial uncertainty derived from the number of participants in the group, with larger sample sizes modelled by narrower, taller Gaussians providing a more reliable approximation of the true activation location (Eickhoff et al., 2009). Each voxel within a default grey matter mask was assigned the activation probability from the peak within the shortest Euclidean distance, producing a Modelled Activation (MA)-map for each participant group (Turkeltaub et al., 2012). The voxel-wise union of all MA-maps from all participant groups included in a single dataset produced an ALE-map, in which ALE values represent the likelihood that at least one participant group activated a given voxel (Turkeltaub et al., 2012).
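The modelling steps described above can be illustrated with a toy Python sketch. It is not the GingerALE implementation: the function names, the 2 mm isotropic voxel size and the FWHM values passed in are assumptions made purely for illustration, and the empirical FWHM estimates of Eickhoff et al. (2009) are not reproduced here.

```python
import math

VOXEL_VOLUME_MM3 = 8.0  # assumption: 2 mm isotropic voxels


def peak_probability(distance_mm, fwhm_mm):
    """Probability mass of a normalised 3D Gaussian falling in one voxel
    located distance_mm from the peak (approximated by density * volume)."""
    sigma = fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    density = math.exp(-(distance_mm ** 2) / (2.0 * sigma ** 2))
    density /= (2.0 * math.pi * sigma ** 2) ** 1.5
    return density * VOXEL_VOLUME_MM3


def modelled_activation(voxels, foci, fwhm_mm):
    """MA-map for one participant group: each voxel takes the probability
    contributed by its nearest focus (shortest Euclidean distance)."""
    ma_map = []
    for voxel in voxels:
        nearest = min(math.dist(voxel, focus) for focus in foci)
        ma_map.append(peak_probability(nearest, fwhm_mm))
    return ma_map


def ale_union(ma_maps):
    """Voxel-wise union across groups: 1 - prod(1 - MA_i), i.e. the
    likelihood that at least one group activated the voxel."""
    ale_map = []
    for values in zip(*ma_maps):
        p_none = 1.0
        for ma in values:
            p_none *= 1.0 - ma
        ale_map.append(1.0 - p_none)
    return ale_map


if __name__ == "__main__":
    voxels = [(0, 0, 0), (4, 0, 0), (20, 0, 0)]          # hypothetical MNI voxels
    small_group = modelled_activation(voxels, [(0, 0, 0)], fwhm_mm=11.0)
    large_group = modelled_activation(voxels, [(4, 0, 0)], fwhm_mm=9.0)
    print(ale_union([small_group, large_group]))
```

In this sketch a smaller FWHM (standing in for a larger participant group) yields a narrower, taller Gaussian, so that group's foci contribute more strongly to the ALE value near the peak.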
For single dataset analyses, we tested the null hypothesis of random spatial association between participant groups ('spatial independence'), namely that any spatial convergence of activation between different participant groups in a dataset is only occurring by chance. In order to compute, analytically, the null distribution of ALE values under the assumption of spatial independence between participant groups, each participant group's MA-map was first converted into a histogram representing the probability of observing each MA value in that map. Histograms representing MA-maps of individual participant groups were iteratively combined to produce a final histogram representing the probability of observing any given ALE value under the null hypothesis of spatial independence between participant groups. The null distribution and ALE-map were combined to produce a p-value map for each dataset. The p-value map was thresholded with a voxel-wise uncorrected p < 0.001 cluster-forming threshold and a cluster-wise family-wise error (FWE) corrected threshold of p < 0.05 based on 1000 random permutations (Eickhoff et al., 2016). Briefly, the null distribution of cluster sizes given a cluster-forming threshold of p < 0.001 was obtained through random simulation in which, for every participant group, a matched simulated participant group was created containing the same number of participants and foci but with foci randomly located throughout the grey matter mask. The above ALE meta-analytical algorithm was performed on each simulated dataset and each simulated ALE-map thresholded at the cluster-forming threshold of p < 0.001. The size of each contiguous cluster of suprathreshold voxels was recorded for each of 1000 such randomly simulated ALE-maps to produce a distribution of cluster sizes that would be expected under the null hypothesis of spatial independence between participant groups.
Suprathreshold clusters in the real dataset's ALE-map that were larger than 95% of the null distribution clusters were significant at FWE p < 0.05 and taken to represent regions in which spatial convergence of activation between different participant groups was significantly above chance, which we define herein as regions of consistent activation.
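The analytic null distribution described above can be sketched as follows. This is a simplified illustration, not GingerALE's code: the function names and the rounding of values into bins (`decimals`) are assumptions, and the cluster-level simulation step is omitted.

```python
from collections import Counter


def ma_histogram(ma_map, decimals=3):
    """Probability of observing each (binned) MA value in one MA-map."""
    counts = Counter(round(value, decimals) for value in ma_map)
    n_voxels = len(ma_map)
    return {value: count / n_voxels for value, count in counts.items()}


def combine(hist_a, hist_b, decimals=3):
    """Combine two histograms under spatial independence: every pair of
    values is united as 1 - (1 - a)(1 - b) with probability p_a * p_b."""
    combined = {}
    for a, p_a in hist_a.items():
        for b, p_b in hist_b.items():
            ale = round(1.0 - (1.0 - a) * (1.0 - b), decimals)
            combined[ale] = combined.get(ale, 0.0) + p_a * p_b
    return combined


def null_ale_distribution(ma_maps, decimals=3):
    """Iteratively combine the histograms of all participant groups."""
    histograms = [ma_histogram(m, decimals) for m in ma_maps]
    null = histograms[0]
    for histogram in histograms[1:]:
        null = combine(null, histogram, decimals)
    return null


def p_value(null, observed_ale, tol=1e-9):
    """Probability under the null of an ALE value at least as large."""
    return sum(p for value, p in null.items() if value >= observed_ale - tol)


if __name__ == "__main__":
    # Two hypothetical MA-maps of four voxels each.
    maps = [[0.0, 0.0, 0.0, 0.5], [0.0, 0.0, 0.0, 0.5]]
    null = null_ale_distribution(maps)
    print(sorted(null.items()))
    print(p_value(null, 0.75))
```

Working on histograms rather than on permuted maps is what allows the null distribution to be obtained analytically; only the cluster-size distribution then requires random simulation.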
Coordinates from tasks at different timepoints on the same participant group were not pooled; only tasks performed at the longest timepoint post-stroke for each group were included. If coordinates were available for separate groups within the same study (e.g., for stroke survivors with aphasia as individuals or sub-groups), each individual/subgroup was counted as being from a separate participant group in the meta-analysis. Single participants were included as 'participant groups' of size n = 1; as explained above, the FWHM of the Gaussian probability distribution of each peak was weighted to take account of the increasing spatial uncertainty associated with decreasing group size ( Eickhoff et al., 2009 ).
Conjunction images identifying regions in which two datasets both showed consistent activation were computed as the intersection of the thresholded ALE-maps (Eickhoff et al., 2011). Contrast analyses were performed to identify regions where activation likelihood differed significantly between two datasets. ALE-maps from the two datasets being contrasted were subtracted from each other and thresholded at p < 0.05 (uncorrected) using 10,000 p-value permutations with a minimum cluster threshold of 200 mm³. Each permutation involved pooling all participant groups contributing to either dataset alone and randomly dividing them into two datasets of the same size (i.e. number of participant groups) as the two original datasets being contrasted (Eickhoff et al., 2011). ALE-maps for these two randomly assembled datasets were calculated and the difference between these 'random' ALE-maps computed. Repeating this 10,000 times produced a null-distribution for the difference in ALE values between the two datasets expected under the null hypothesis of label exchangeability at each voxel in the brain (Eickhoff et al., 2011). The observed difference in ALE values at each voxel was compared to its null distribution, yielding a p-value map that was thresholded at p < 0.05 with a minimum cluster threshold of 200 mm³ and inclusively masked to voxels that were significant during single dataset meta-analysis of either included dataset (Eickhoff et al., 2011). This method of permutation testing accounted for differences in the number of participant groups between each dataset being contrasted.
The Harvard-Oxford atlas (Desikan et al., 2006) defined anatomical labels and the Talairach Daemon atlas (Lancaster et al., 2000) determined the Brodmann Area label associated with each peak coordinate.
We performed a set of pre-planned ALE meta-analyses that are set out below. For the omnibus ALE meta-analysis comparing all language tasks between PSA and control groups, we required single datasets to have at least 17 participant groups, as recommended by empirical simulations suggesting this number was needed to ensure adequate power (Eickhoff et al., 2016). Given the scarcity of functional neuroimaging studies in PSA, we required 10 participant groups for single datasets to be included in ALE meta-analyses for more specific contrasts between subgroups of participants or tasks, as per previous recommendations (Eickhoff and Bzdok, 2013). Single datasets never contained data from the same participants as separate participant groups. If the same participant group performed multiple imaging tasks which were divided into different datasets during contrast analyses (e.g. both higher and lower demand comprehension tasks), the coordinates for both imaging tasks were included in their respective datasets. Since contrast subgroup analyses were designed to look for regions of significantly different activation likelihood between groups, the inclusion of coordinates from the same group in both subgroup datasets being contrasted would, if anything, reduce the likelihood of finding differences and thus should not increase the false positive rate.
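The label-exchange permutation logic of the contrast analysis can be illustrated at a single voxel. The sketch below is an assumption-laden simplification rather than the GingerALE procedure: each participant group is reduced to its MA value at that voxel, the test is one-sided, and the function names are hypothetical.

```python
import random


def ale_value(ma_values):
    """ALE at one voxel: union of the groups' MA values."""
    p_none = 1.0
    for ma in ma_values:
        p_none *= 1.0 - ma
    return 1.0 - p_none


def contrast_p_value(ma_a, ma_b, n_perm=10000, seed=0):
    """One-sided permutation p-value for ALE(A) - ALE(B) at one voxel.

    Groups from both datasets are pooled and randomly re-divided into two
    datasets of the original sizes, so unequal dataset sizes are respected.
    """
    rng = random.Random(seed)
    observed = ale_value(ma_a) - ale_value(ma_b)
    pooled = list(ma_a) + list(ma_b)
    n_a = len(ma_a)
    at_least_as_large = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = ale_value(pooled[:n_a]) - ale_value(pooled[n_a:])
        if diff >= observed - 1e-12:  # tolerance for floating-point noise
            at_least_as_large += 1
    return observed, at_least_as_large / n_perm


if __name__ == "__main__":
    # Hypothetical MA values for five groups per dataset at one voxel.
    dataset_a = [0.3, 0.3, 0.3, 0.3, 0.3]
    dataset_b = [0.0, 0.0, 0.0, 0.0, 0.0]
    print(contrast_p_value(dataset_a, dataset_b, n_perm=2000))
```

In the full analysis this comparison is repeated at every voxel, and the resulting p-value map is then thresholded and masked as described above.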
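The minimum dataset sizes stated above amount to a simple eligibility rule, sketched here for illustration. The constant and function names are hypothetical, and the subgroup threshold is read as "at least 10".

```python
MIN_GROUPS_OMNIBUS = 17   # omnibus analysis (Eickhoff et al., 2016)
MIN_GROUPS_SUBGROUP = 10  # subgroup contrasts (Eickhoff and Bzdok, 2013)


def has_enough_groups(n_participant_groups, omnibus):
    """Return True if a single dataset meets the minimum number of
    participant groups required for the given type of analysis."""
    minimum = MIN_GROUPS_OMNIBUS if omnibus else MIN_GROUPS_SUBGROUP
    return n_participant_groups >= minimum


if __name__ == "__main__":
    # Hypothetical dataset sizes.
    print(has_enough_groups(20, omnibus=True))
    print(has_enough_groups(8, omnibus=False))
```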

Differences between PSA and control groups

All language tasks in PSA vs. controls (omnibus analysis)
This analysis combined all data available. Thus, it consisted of single dataset, conjunction and contrast ALE meta-analyses comparing all language tasks in all PSA against all language tasks in all controls. The included coordinates did not contain duplicated data from the same participants.

Comprehension and production tasks in PSA vs. controls
PSA participants might activate different neural regions relative to controls for a subset of language tasks. Such differences may have been obscured by grouping all language tasks together in the omnibus ALE meta-analysis. Participant groups were therefore divided according to whether their functional neuroimaging tasks involved 'production' (including either overt or covert production of sublexical, lexical or sentence level speech components) or solely 'comprehension' without production (e.g. sentence listening, semantic judgement, picture-word matching). Single dataset, conjunction and contrast ALE meta-analyses were conducted to compare comprehension tasks in PSA against controls, and production tasks in PSA against controls.

Comprehension > production and production > comprehension tasks in PSA vs. controls
Changes in the division of labour between networks subserving distinct underlying language functions might support improved language performance post-stroke. Conjunction and contrast ALE meta-analyses were performed to compare comprehension vs. production tasks, separately within PSA and control groups. Significant clusters for 'comprehension > production' and 'production > comprehension' were then qualitatively compared between PSA and control groups.

Higher versus lower processing demand tasks
Variable neurodisplacement proposes that neural spare capacity is downregulated to save energy under standard performance demands in health but is upregulated when performance demands increase post-stroke. If this occurs, we would expect the neural regions upregulated in PSA to be more likely to be activated during more difficult compared to less difficult tasks in both PSA and controls. Therefore, comprehension and production tasks were each subdivided according to task difficulty. Higher demand comprehension tasks were defined as tasks requiring a linguistic decision to be made; e.g., whether a stimulus is a word or pseudoword, concrete or abstract, or related to some other semantic or syntactic property. Lower demand comprehension tasks either did not require a linguistic decision or required a very simple identity match; e.g., passive listening or simple word-picture matching. Higher demand production tasks required production of > 1 word, such as propositional speech or category fluency tasks. Lower demand production tasks required production of single words, such as picture naming or single item repetition. Single dataset, conjunction and contrast ALE meta-analyses were conducted to compare higher versus lower demand comprehension tasks, and higher versus lower demand production tasks. These contrasts were initially performed separately within PSA and control groups. However, there were too few participant groups to contrast higher versus lower processing demand comprehension or production tasks in controls, so a third set of analyses combined PSA and control participant groups together. Significant clusters representing demand-responsive regions were compared to regions of significantly different activation likelihood between PSA and control groups identified by the meta-analyses in Section 2.3.
Clusters identified in the above analyses were also compared for spatial overlap with the Multiple Demand (MD) network (Duncan, 2010), a set of domain-general neural regions activated during a diverse range of executively demanding language and non-language cognitive tasks (Fedorenko et al., 2013), and with the semantic control network known to be involved during executively demanding semantic cognition in healthy individuals (Jackson, 2021).
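The task-demand definitions above can be expressed as a small decision rule. The sketch below is only an illustration of those definitions: the function name and its arguments are hypothetical, and in practice each task was classified by reading its description rather than by code.

```python
def classify_demand(task_type, requires_linguistic_decision=False, words_produced=1):
    """Label a task 'higher' or 'lower' demand.

    Comprehension: 'higher' if a linguistic decision is required; passive
    listening or a very simple identity match counts as 'lower'.
    Production: 'higher' if more than one word is produced, else 'lower'.
    """
    if task_type == "comprehension":
        return "higher" if requires_linguistic_decision else "lower"
    if task_type == "production":
        return "higher" if words_produced > 1 else "lower"
    raise ValueError("task_type must be 'comprehension' or 'production'")


if __name__ == "__main__":
    # Lexical decision (word vs. pseudoword) requires a linguistic decision.
    print(classify_demand("comprehension", requires_linguistic_decision=True))
    # Picture naming requires production of a single word.
    print(classify_demand("production", words_produced=1))
```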

Time post-stroke
Language recovery occurs most rapidly during the first six months post-stroke (Pedersen et al., 1995; Yagata et al., 2017). PSA groups were therefore categorised according to whether their mean time post-stroke was before or after 6 months. Unfortunately, there were too few studies of sub-acute patients to contrast them formally with chronic PSA.

Statistical analysis
We compared mean ages of the PSA and control groups using Mann-Whitney U tests implemented in SPSS version 25 with statistical significance defined as p < 0.05 with Bonferroni correction.
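For readers unfamiliar with the test, a Mann-Whitney U comparison of group mean ages can be sketched in plain Python. The analysis reported here was run in SPSS; this sketch is an assumption-based illustration that uses a tie-corrected normal approximation without continuity correction, so its p-values may differ slightly from those of SPSS, and the ages in the usage example are invented.

```python
import math


def mann_whitney_u(x, y):
    """Return (U_x, U_y, two-sided p) using a tie-corrected normal approximation."""
    pooled = sorted((value, label) for label, sample in enumerate((x, y)) for value in sample)
    n = len(pooled)
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        average_rank = (i + 1 + j) / 2.0      # mean of ranks i+1 .. j
        tied = j - i
        tie_term += tied ** 3 - tied
        rank_sum_x += average_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n_x, n_y = len(x), len(y)
    u_x = rank_sum_x - n_x * (n_x + 1) / 2.0
    u_y = n_x * n_y - u_x
    mean_u = n_x * n_y / 2.0
    variance = n_x * n_y / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u_x - mean_u) / math.sqrt(variance) if variance > 0 else 0.0
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return u_x, u_y, p_two_sided


def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance threshold under Bonferroni correction."""
    return alpha / n_comparisons


if __name__ == "__main__":
    psa_mean_ages = [55.1, 57.4, 60.2, 62.0, 49.8]      # invented values
    control_mean_ages = [54.0, 57.0, 58.3, 61.5, 52.2]  # invented values
    print(mann_whitney_u(psa_mean_ages, control_mean_ages))
    print(bonferroni_alpha(0.05, 10))
```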

Data availability
Group level coordinate data supporting the findings of this study are available on figshare (doi: 10.6084/m9.figshare.12582935).

Descriptive statistics
10,169 unique references were obtained from the systematic search. 79 papers were eligible for inclusion; useable foci were obtained from 33/79 included papers. A flowchart of the search and selection process is shown in Fig. 1 . Details of the included/excluded papers, reasons for excluding eligible papers, and information on the PSA groups included in the ALE meta-analysis are provided in Supplementary Tables S1-3. Across all language tasks, 1521 foci were obtained from 481 PSA in 64 groups, and 809 foci were obtained from 530 healthy controls in 37 groups (Supplementary Tables S3, 4). Foci relating to 172 of the 481 PSA had not been published but were provided after personal communication with the corresponding authors ( Barbieri et al., 2019 ;Geranmayeh et al., 2016 ;Hallam et al., 2018 ;Meier et al., 2019 ;Radman et al., 2016 ;Schofield et al., 2012 ;Tao and Rapp, 2019 ;Wilson et al., 2018 ).
The 64 PSA groups did not have significantly different mean ages compared to the 37 control groups (median 57.4 [IQR 9.0] years in PSA groups vs. 57.0 [IQR 8.2] years in control groups; Mann-Whitney U test, U = 878, two-sided p = 0.18). Every pair of datasets contrasted in this paper had mean ages that were not statistically significantly different (Supplementary Table S33). Fig. 2 contains histograms of the mean ages of the groups.

Differences between PSA and control groups
Our first aim was to investigate which, if any, regions are more or less likely to be activated in PSA than healthy individuals across all language tasks and do these regions differ between language tasks of different nature (comprehension vs. production).

All language tasks in PSA vs. controls (omnibus analysis)
Single datasets from the omnibus meta-analysis comparing all language tasks in all PSA against control groups are reported in the Supplementary Information and illustrated in Fig. 3 . A conjunction demonstrated that both PSA and control groups consistently activated overlapping regions in: left frontal lobe (frontal operculum cortex, IFG  Table S7). This highlights that multiple regions throughout both hemispheres were consistently activated in PSA but were also involved in language pre-morbidly rather than being recruited 'de novo' post-stroke. Conjunction clusters in the left frontal lobe (frontal operculum cortex, IFG pars opercularis/triangularis, MFG), midline cortex (SFG, SMC, paracingulate cortex) and right frontal lobe (frontal operculum, frontal orbital cortex) at least partially overlap with the MD network ( Fedorenko et al., 2013 ), suggesting that the language network includes domain-general regions in both controls and PSA.
Contrast analyses revealed that multiple regions were less likely to be activated during language in the PSA group than controls, including midline SFG, SMC, and paracingulate gyrus as well as right IFG pars triangularis and right temporal pole (Supplementary Table S8). The midline SFG and paracingulate gyrus cluster overlaps with the MD network ( Fedorenko et al., 2013 ), suggesting it is domain-general in controls ( Fig. 6 A), whereas the right IFG pars triangularis and right temporal pole clusters do not. Since all strokes were restricted to the left hemisphere, this result demonstrates that a set of undamaged language and domain-general regions are less likely to be activated in PSA than controls.
The PSA group were more likely to activate the right anterior insula, frontal operculum and IFG pars opercularis during language than controls (Supplementary Table S8). This cluster overlaps with the Multiple Demand (MD) network ( Fedorenko et al., 2013 ) in the right frontal operculum and anterior insula, suggesting it is domain-general in controls ( Fig. 6 A). Parts of the right anterior insula, frontal operculum and IFG pars opercularis were consistently activated across all language tasks in controls (Supplementary Table S6).

Comprehension tasks in PSA vs. controls
Single datasets from the subgroup meta-analysis comparing comprehension tasks in all PSA against control groups are reported in the Supplementary Information and illustrated in Fig. 4 . A conjunction demonstrated that both PSA and controls consistently activated overlapping regions during comprehension in left frontal lobe ( Fig. 4 A, IFG pars opercularis/triangularis, frontal orbital cortex) and left posterior MTG (Supplementary Table S11).
Contrast analyses revealed that multiple regions were less likely to be activated during comprehension in the PSA group than controls, including midline cortical regions (SFG, paracingulate gyrus) that are unlikely to be damaged following a middle cerebral artery (MCA) stroke ( Fig. 4 B, Supplementary Table S12). This midline SFG/paracingulate gyrus cluster does not overlap with the MD network ( Fig. 6 B) ( Fedorenko et al., 2013 ). PSA were more likely to activate the right anterior insula and frontal operculum during comprehension than controls ( Fig. 4 B, Supplementary Table S12); this cluster overlaps with the MD network ( Fedorenko et al., 2013 ), suggesting it is domain-general in controls ( Fig. 6 B).

Production tasks in PSA vs. controls
Single datasets from the subgroup meta-analysis comparing production tasks in all PSA against control groups are reported in the Supplementary Information and illustrated in Fig. 4 . A conjunction demonstrated that both PSA and controls consistently activated overlapping regions during production in: left IFG pars triangularis; midline cortex (SFG, SMC, paracingulate gyrus); and right posterior STG ( Fig. 4 C, Supplementary Table S15). This highlights that multiple regions throughout both hemispheres are consistently activated during language production in PSA that were involved in language pre-morbidly rather than being recruited 'de novo' post-stroke. Conjunction clusters in the midline SFG, SMC and paracingulate gyrus overlap with the MD network ( Fedorenko et al., 2013 ), suggesting that the language production network includes domain-general regions in both controls and PSA.
Contrast analyses revealed that PSA were less likely than controls to activate the following midline and right hemisphere regions during production: midline cortex (SFG, SMC, paracingulate gyrus); right frontal lobe (frontal orbital cortex, precentral gyrus); right insula; and right temporal lobe (Heschl's gyrus, posterior STG, temporal pole) ( Fig. 4 D, Supplementary Table S16). Again, these regions fall outside of the left MCA territory and thus were unlikely to have been lesioned by the stroke. The midline SFG/SMC/paracingulate gyrus, right frontal orbital cortex and anterior insula clusters overlap with the MD network ( Fedorenko et al., 2013 ), suggesting they are domain-general in controls ( Fig. 6 C). No regions were more likely to be activated during production in the PSA group than controls (Supplementary Table S16).
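The contrast analyses reported here rest on permutation testing of differences in ALE values (the figure captions state 10,000 permutations). The following is a minimal sketch of that idea, not the authors' pipeline. It assumes a one-dimensional grid, a fixed-width Gaussian kernel scaled by a hypothetical `peak` so that values behave like probabilities, and a one-sided test for "group A > group B". Real ALE uses 3-D kernels whose width depends on sample size. All function and variable names are illustrative.

```python
import numpy as np

def ma_map(foci, grid, sigma=1.0, peak=0.5):
    """Modelled activation map for one experiment.

    At each grid point, take the maximum kernel value across the foci.
    """
    foci = np.asarray(foci, dtype=float)
    dist = grid[None, :] - foci[:, None]
    return peak * np.exp(-0.5 * (dist / sigma) ** 2).max(axis=0)

def ale(ma_maps):
    """ALE as the voxel-wise union of modelled activation probabilities."""
    return 1.0 - np.prod(1.0 - np.asarray(ma_maps), axis=0)

def contrast_p(ma_a, ma_b, n_perm=1000, seed=0):
    """One-sided permutation p-values for ALE(A) - ALE(B) at each voxel."""
    rng = np.random.default_rng(seed)
    ma_a, ma_b = np.asarray(ma_a), np.asarray(ma_b)
    observed = ale(ma_a) - ale(ma_b)
    pooled = np.concatenate([ma_a, ma_b])
    n_a = len(ma_a)
    exceed = np.zeros(observed.shape)
    for _ in range(n_perm):
        # Randomly reassign experiments to two groups of the original sizes.
        order = rng.permutation(len(pooled))
        null = ale(pooled[order[:n_a]]) - ale(pooled[order[n_a:]])
        exceed += null >= observed
    return observed, (exceed + 1.0) / (n_perm + 1.0)

# Toy data: six experiments peaking at position 5, six peaking at position 15.
grid = np.arange(20.0)
group_a = [ma_map([5.0], grid) for _ in range(6)]
group_b = [ma_map([15.0], grid) for _ in range(6)]
diff, p = contrast_p(group_a, group_b)
```

In this toy case the p-value is small at position 5, where group A converges, and close to 1 at position 15, where group B converges. A cluster-extent threshold would then be applied to the surviving voxels.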

Comprehension > Production tasks in PSA vs. controls
Changes in the division of labour between networks subserving distinct underlying language functions might support improved language performance post-stroke ( Stefaniak et al., 2020 ).
A conjunction demonstrated that controls consistently activated overlapping regions during both comprehension and production tasks in: left IFG pars opercularis/triangularis; and left temporal lobe (posterior MTG, temporooccipital MTG) (Supplementary Table S19). Compared to controls, PSA had additional clusters of conjunction during both comprehension and production tasks in the midline cortex (SFG, SMC) and right frontal lobe (frontal operculum cortex, frontal orbital cortex) (Supplementary Table S17).
Contrast analyses revealed that the left frontal lobe (frontal orbital cortex, frontal pole), left temporal lobe (temporal pole, temporooccipital inferior temporal gyrus) and midline SFG/paracingulate gyrus were significantly more likely to be activated during comprehension than production in controls ( Fig. 4 E, Supplementary Table S20). PSA had additional clusters of greater activation likelihood during comprehension than production in the right anterior insula and right MFG that were not observed in controls ( Fig. 4 F; see Supplementary Table S18 for full details). These two PSA-specific clusters overlap with the semantic control network ( Jackson, 2021 ) ( Fig. 7 A) and with the MD network ( Fedorenko et al., 2013 ) ( Fig. 6 D). The right anterior insula cluster overlaps with the region of greater activation likelihood in PSA than controls during comprehension tasks. Taken together, these results suggest that comprehension tasks in PSA make greater use of specific right frontal domain-general regions than both production tasks in PSA (right anterior insula, MFG) and comprehension tasks in controls (right anterior insula).

Production > Comprehension tasks in PSA vs. controls
Contrast analyses revealed that the left frontal lobe (IFG pars opercularis/triangularis, frontal orbital cortex, precentral gyrus), left insula, left temporal lobe (planum temporale, temporooccipital MTG), left  Table S20). PSA had an additional cluster of increased activation likelihood during production than comprehension in the right precentral gyrus that was not observed in controls ( Fig. 4 Table S18 for full details). This PSA-specific right precentral gyrus cluster did not overlap with the MD network ( Fedorenko et al., 2013 ). These results suggest that production tasks in PSA make greater use of a specific right precentral gyrus region than comprehension tasks in PSA, but this differential activation was not present in controls.

Summary
Multiple regions throughout both hemispheres, including domain-general regions, are consistently activated during language in both PSA and controls. PSA are more likely to activate the following regions than controls: right anterior insula, right frontal operculum (all language tasks, comprehension tasks) and right IFG pars opercularis (all language tasks). PSA are less likely to activate the following regions than controls: midline SFG/SMC/paracingulate gyrus (all language tasks, comprehension tasks, production tasks); right IFG pars triangularis (all language tasks); right frontal orbital cortex, precentral gyrus, anterior insula, Heschl's gyrus, posterior STG (production tasks); and right temporal pole (all language tasks, production tasks). The networks subserving comprehension vs. production tasks diverge in PSA relative to controls. Comprehension tasks in PSA make greater use of specific right frontal regions than both production tasks in PSA (right anterior insula, MFG) and comprehension tasks in controls (right anterior insula). Conversely, production tasks in PSA make greater use of a right precentral gyrus region than comprehension tasks in PSA, but this differential activation was not present in controls.

Regions modulated by task difficulty
Our second aim was to investigate whether regions upregulated in PSA are also modulated by task difficulty, in keeping with variable neurodisplacement.

Higher versus lower demand comprehension tasks
Single datasets from the meta-analysis comparing higher vs. lower demand comprehension tasks in PSA are reported in the Supplementary Information.
Contrast analyses revealed that clusters in the left frontal lobe (IFG pars opercularis/triangularis, MFG), right frontal lobe (frontal operculum cortex, IFG pars opercularis/triangularis, frontal orbital cortex, MFG) and right anterior insula had greater activation likelihood during higher demand than lower demand comprehension tasks in PSA ( Fig. 5 A, Supplementary Table S23).
Only 110 foci were obtained from 78 controls in 7 participant groups performing lower demand comprehension tasks. Accordingly, there were too few groups to perform ALE meta-analyses contrasting higher versus lower demand comprehension tasks in controls ( Eickhoff et al., 2016 ). Thus, a third set of analyses combined PSA and control participant groups together. The single datasets from the meta-analysis comparing higher vs. lower demand comprehension tasks in PSA and control participants combined are reported in the Supplementary Information.
These regions of increased activation likelihood during higher than lower demand comprehension tasks closely align with the semantic control network known to be involved during executively demanding semantic cognition in healthy individuals ( Jackson, 2021 ) ( Fig. 7 B and C) and with the MD network ( Fedorenko et al., 2013 ) ( Fig. 6 E, 6 F). Critically, they overlap with clusters of greater activation likelihood in PSA than controls, across all language tasks and during comprehension tasks, in the right anterior insula and frontal operculum. They also overlap with PSA-specific clusters of increased activation likelihood during comprehension relative to production in the right anterior insula and MFG (Supplementary Table S18).
Contrast analyses revealed that activation was more likely in the left temporal pole in lower than higher demand comprehension tasks in PSA ( Fig. 5 A, Supplementary Table S23) and in both left and right temporal poles in lower than higher demand comprehension tasks in PSA and controls combined ( Fig. 5 C, Supplementary Table S26).

Higher versus lower demand production tasks
Single datasets from the meta-analysis comparing higher vs. lower demand production tasks in PSA are reported in the Supplementary Information.
Fig. 4. ALE meta-analysis of comprehension and production tasks in PSA and healthy controls. A: ALE maps of comprehension tasks in PSA (green clusters) and in controls (red clusters), and conjunction map of comprehension tasks in both PSA and controls (yellow clusters). B: ALE maps of 'Comprehension tasks: controls > PSA' (violet clusters) and 'Comprehension tasks: PSA > controls' (cyan clusters). C: ALE maps of production tasks in PSA (green clusters), in controls (red clusters) and conjunction map of production tasks in both PSA and controls (yellow clusters). D: ALE maps of 'Production tasks: controls > PSA' (violet clusters). E: ALE maps of 'Controls: production > comprehension tasks' (red clusters), and 'Controls: comprehension > production tasks' (violet clusters). F: ALE maps of 'PSA: production > comprehension tasks' (green clusters), and 'PSA: comprehension > production tasks' (cyan clusters). Panels A and C: ALE single dataset analyses thresholded at p < 0.001 uncorrected voxel-wise, FWE p < 0.05 cluster-wise, 1000 permutations. Panels B, D, E and F: ALE contrast analyses thresholded at p < 0.05, 10000 permutations, minimum cluster extent 200 ml.
Contrast analyses revealed that the right frontal lobe (frontal operculum cortex, IFG pars opercularis/triangularis, precentral gyrus) and right temporal lobe (planum temporale, Heschl's gyrus) had greater activation likelihood during higher demand than lower demand production tasks in PSA ( Fig. 5 B, Supplementary Table S29).
Only 189 foci were obtained from 185 controls in 8 groups performing higher demand production tasks. Accordingly, there were too few groups to perform ALE meta-analyses contrasting higher versus lower demand production tasks in controls ( Eickhoff et al., 2016 ). Thus, a third set of analyses combined PSA and control participant groups together.
The single datasets from the meta-analysis comparing higher vs. lower demand production tasks in PSA and control participants combined are reported in the Supplementary Information.
Contrast analyses revealed that a similar set of clusters in the left IFG (frontal operculum, IFG pars opercularis/triangularis), left posterior MTG, right IFG (frontal operculum, IFG pars opercularis/triangularis, frontal orbital cortex), and right temporal lobe (Heschl's gyrus, planum temporale) had greater activation likelihood during higher demand than lower demand production tasks in PSA and controls combined ( Fig. 5 D,  Supplementary Table S32).

Fig. 5.
Higher versus lower demands comprehension and production tasks. A: ALE maps of 'PSA comprehension tasks: higher > lower processing demands' (yellow clusters) and 'PSA comprehension tasks: lower > higher processing demands' (cyan clusters). B: ALE maps of 'PSA production tasks: higher > lower processing demands' (yellow clusters). C: ALE maps of 'PSA and healthy controls combined comprehension tasks: higher > lower processing demands' (yellow clusters) and 'PSA and healthy controls combined comprehension tasks: lower > higher processing demands' (cyan clusters). D: ALE maps of 'PSA and healthy controls combined production tasks: higher > lower processing demands' (yellow clusters) and 'PSA and healthy controls combined production tasks: lower > higher processing demands' (cyan clusters). All ALE contrast analyses thresholded at p < 0.05, 10000 permutations, minimum cluster extent 200ml.
Right IFG clusters from the above difficulty-modulated production contrasts overlap with the MD network ( Fedorenko et al., 2013 ) ( Fig. 6 E, 6 F). Critically, they are also adjacent to clusters of greater activation likelihood in PSA than controls, across all language tasks and during comprehension tasks, in the right anterior insula and IFG. The right precentral gyrus difficulty-modulated production cluster in PSA alone overlapped with the PSA-specific cluster of increased activation likelihood during production relative to comprehension in the right precentral gyrus (Supplementary Table S18).

Summary
As predicted by variable neurodisplacement, right anterior insular and frontal opercular regions of greater activation likelihood in PSA than controls are more likely to be activated during more difficult than less difficult language tasks.

Time post-stroke
Our third aim was to investigate whether regions differentially activated in PSA relative to controls vary between different stages of recovery. However, we found that the literature is strongly biased, as most PSA underwent neuroimaging in the chronic phase post-stroke. The 64 PSA groups had a median time post-stroke of 38.0 (IQR 34.5) months ( Fig. 2 ). Only five papers, representing six of the 64 PSA groups, repeated functional neuroimaging longitudinally at multiple timepoints ( Cardebat et al., 2003 ;Long et al., 2018 ;Nenert et al., 2018 ;Radman et al., 2016 ;Stockert et al., 2020 ). When counting the 'earliest' timepoint at which each PSA group was scanned, only 9/64 groups had mean times post-stroke of less than 6 months ( Cardebat et al., 2003 ;Geranmayeh et al., 2016 ;Long et al., 2018 ;Mattioli et al., 2014 ;Nenert et al., 2018 ;Qiu et al., 2017 ;Radman et al., 2016 ;Stockert et al., 2020 ). Accordingly, there were too few groups to contrast PSA before versus after six months ( Eickhoff et al., 2016 ).
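The descriptive summary above (median, IQR, and the count of groups first scanned within six months) can be sketched as follows. The group labels and scan times are entirely hypothetical and serve only to show the 'earliest timepoint' counting rule. Whether the reported median was computed on earliest timepoints is not stated in the text, so that choice is an assumption here.

```python
import numpy as np

# Hypothetical months post-stroke at each scan, per participant group.
scan_times = {
    "group_1": [0.5, 4.0, 12.0],  # longitudinal: three timepoints
    "group_2": [38.0],
    "group_3": [60.0],
    "group_4": [3.0, 9.0],        # longitudinal: two timepoints
    "group_5": [24.0],
}

# Use the earliest timepoint at which each group was scanned.
earliest = np.array([min(times) for times in scan_times.values()])

median_months = float(np.median(earliest))
q1, q3 = np.percentile(earliest, [25, 75])
iqr_months = float(q3 - q1)

# Number of groups whose earliest scan fell within six months of stroke.
n_early = int((earliest < 6).sum())
```

With these toy values, two of the five groups count as scanned before six months, even though only their first timepoint qualifies.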

Discussion
In order to identify the specific regions that are more likely to be activated in PSA than healthy individuals, and to investigate whether there are differences in activation likelihood across different language tasks and between recovery timepoints, we performed a large-scale ALE meta-analysis of functional neuroimaging studies in PSA. We obtained coordinate-based functional neuroimaging data for 481 PSA, which is over four times larger than the last ALE meta-analysis on this topic ( n = 105) ( Turkeltaub et al., 2011 ). The results provide novel insights into the mechanisms underlying language network changes post-stroke that might hitherto have been obscured by the limited sample size of any individual study in this area.
PSA were more likely to activate various regions of the right anterior insula and IFG than controls across all language tasks (anterior insula, frontal operculum, IFG pars opercularis) and during comprehension tasks (anterior insula, frontal operculum). These right anterior insular/IFG regions seem to be implicated in task difficulty as they are more likely to be activated during higher than lower demand comprehension tasks (right anterior insula, frontal operculum, IFG pars opercularis/triangularis, frontal orbital cortex, in PSA and controls combined) and during higher than lower demand production tasks (right frontal operculum, IFG pars opercularis/triangularis, frontal orbital cortex, in PSA and controls combined). The networks subserving comprehension vs. production diverge in PSA relative to controls. Comprehension tasks in PSA make greater use of specific right frontal regions than both production tasks in PSA (right anterior insula, MFG) and comprehension tasks in controls (right anterior insula). Conversely, production tasks in PSA make greater use of a right precentral gyrus region than comprehension tasks in PSA, and this differential activation was not present in controls.
Fig. 6. Overlaps between clusters identified in the ALE meta-analyses and the Multiple Demand Network. A: ALE maps of 'Omnibus analysis: controls > PSA' (yellow clusters) and 'Omnibus analysis: PSA > controls' (red clusters). B: ALE maps of 'Comprehension: controls > PSA' (yellow clusters) and 'Comprehension: PSA > controls' (red clusters). C: ALE maps of 'Production: controls > PSA' (yellow clusters). D: ALE maps of 'Comprehension > production in PSA' (cyan cluster). E: ALE maps of 'Comprehension higher > lower processing demands in PSA' (green cluster) and 'Production higher > lower processing demands in PSA' (blue cluster). F: ALE maps of 'Comprehension higher > lower processing demands in healthy controls and PSA combined' (green cluster) and 'Production higher > lower processing demands in healthy controls and PSA combined' (blue cluster). All panels include the outline of the Multiple Demand network (pink) ( Fedorenko et al., 2013 ). All ALE contrast analyses thresholded at p < 0.05, 10000 permutations, minimum cluster extent 200 ml.
Fig. 7. Overlaps between clusters identified in the ALE meta-analyses and the Semantic Control Network. A: ALE maps of 'Comprehension > production in PSA' (cyan cluster). B: ALE maps of 'Comprehension higher > lower processing demands in PSA' (green cluster) and 'Production higher > lower processing demands in PSA' (violet cluster). C: ALE maps of 'Comprehension higher > lower processing demands in healthy controls and PSA combined' (green cluster) and 'Production higher > lower processing demands in healthy controls and PSA combined' (violet cluster). All panels include the outline of the Semantic Control Network (red) ( Jackson, 2021 ). All ALE contrast analyses thresholded at p < 0.05, 10000 permutations, minimum cluster extent 200 ml.
A previous ALE meta-analysis in PSA concluded that the language network in controls is left-lateralised, whereas PSA consistently activate additional homotopic right hemisphere regions that are not consistently activated in controls ( Turkeltaub et al., 2011 ). The clear picture that emerges from the current, much larger ALE meta-analysis is different in a fundamental way. Whilst one can find reliably different levels of activation likelihood between the PSA and control groups, these differences all fall within regions that are found to activate in both groups; in classical neuropsychological terminology ( Shallice, 1988 ), there is not a classical dissociation between PSA and control groups. Thus, in the omnibus language ALE meta-analysis, the conjunction demonstrated that both PSA and controls consistently activated overlapping regions across the left and right frontal and temporal lobes, right parietal lobe, and midline cortex. Two important implications are that (a) right as well as left hemisphere areas make important contributions to language and (b) regions that are consistently activated by language tasks in PSA are also involved in language pre-morbidly. This runs counter to the view that these areas are recruited 'de novo' post-stroke.
Irrespective of how the language tasks were divided (all language tasks, comprehension, production), we found that in PSA certain regions are less likely to be activated than in controls. These areas were not only left hemisphere regions that might have been lesioned directly by the stroke (i.e., within the left hemisphere MCA territory; cf. Phan et al., 2005 ;Zhao et al., 2020 ) but also domain-general regions of midline superior frontal and paracingulate cortex, right insula and right fronto-temporal cortex. This result implies that the language and cognitive deficits observed in PSA might not be a simple reflection of the lesioned areas but might result from combinations of lesioned and under-engaged areas.
Accordingly, the use of task-based fMRI may be an important addition for future studies that aim to explore the neural bases of aphasia or build prediction models ( Saur et al., 2010 ;Skipper-Kallal et al., 2017a ;van Oers et al., 2018 ). Less consistent activation in regions distant to the lesions might reflect functional diaschisis, i.e., reduced task-related engagement throughout a connected network where one or more nodes have been compromised by damage ( Carrera and Tononi, 2014 ). Alternatively, from a more functional viewpoint, these distant regions may be less engaged because language is performed sub-optimally in PSA and therefore the full extent of the distributed language network is underutilised.
Neurocomputational invasion would predict that the post-stroke language network should expand to include novel non-language regions that were not consistently activated in healthy individuals ( Keidel et al., 2010 ;Stefaniak et al., 2020 ). This mechanism is complementary to the classical notion that right hemisphere homologues of left hemisphere language regions are quiescent in health but become activated to perform similar language computations following left hemisphere stroke ( Finger et al., 2003 ;Turkeltaub et al., 2011 ). A second linked idea is the notion of transcallosal disinhibition ( Heiss and Thiel, 2006 ;Marshall, 1984 ). This proposes that right hemisphere, homologous regions are quiescent in health because they are inhibited transcallosally by the dominant left hemisphere, but can be 'released' when these dominant areas are damaged. This idea has been an important motivation for trials of non-invasive brain stimulation to inhibit the right IFG pars triangularis to aid language recovery through a shift back to left hemisphere areas ( Bucur and Papagno, 2019 ;Ren et al., 2014 ). Previous work ( Stefaniak et al., 2020 ) has noted that these hypotheses appear to be biologically-expensive (areas are maintained but not used, except in people who happen to have the right type and location of damage), computationally underspecified (e.g., how right hemisphere regions can develop language functions when they are being constantly inhibited), and are an untested extension of findings from low-level, non-language motor circuitry ( Di Lazzaro et al., 1999 ;Ferbert et al., 1992 ). Additional counter evidence includes: chronic language weaknesses can be found following right hemisphere damage ( Gajardo-Vidal et al., 2018 ); and, residual language abilities in PSA have been related to the level of right hemisphere activation ( Crinion and Price, 2005 ;Griffis et al., 2017 ;Skipper-Kallal et al., 2017b ). 
The current study adds to these observations in that multiple regions throughout both hemispheres are consistently activated during language in both PSA and controls. Looking across these studies, it would seem that there is a solid empirical basis to move beyond oversimplified discussions of 'left versus right' language lateralisation and, instead, to explore how a bilateral, albeit asymmetrically left-biased, language network supports healthy function and generates aphasia after damage and partial recovery.
Variable neurodisplacement postulates that aphasia recovery involves increased utilisation of spare capacity within regions that are part of the premorbid language network but downregulated in health to save neural resources. Dynamic responses to performance demands in health and after damage could involve upregulation of language-specific and/or domain-general executive functions ( Stefaniak et al., 2020 ). Accordingly, variable neurodisplacement encompasses the hypothesis that increased utilisation of domain-general executive regions aids language recovery post-stroke ( Geranmayeh et al., 2014 ;Sharp et al., 2010 ). As noted above, a key finding from these ALE analyses was that bilateral regions, including domain-general parts of the MD network, were commonly engaged by PSA and control groups. Even where there were graded differences in favour of PSA over controls (e.g., greater activation likelihood in the right anterior insula and IFG), these are consistent with enhanced utilisation of demand-control regions due to increased task difficulty rather than 'expansion' into new territory via neurocomputational invasion. Thus, in the PSA group as well as PSA and controls combined, there was greater activation likelihood of the right anterior insula/operculum and IFG during higher than lower demand comprehension and production tasks. 
These same right anterior insula/IFG regions are known to be recruited during difficult tasks in healthy individuals: the right IFG has been implicated in domain-general top-down control in health ( Baumgaertner et al., 2013 ;Koechlin and Jubault, 2006 ;Meinzer et al., 2012 ); a previous ALE meta-analysis found that effortful listening under difficult conditions in healthy individuals is associated with consistent activation in the bilateral insulae ( Alain et al., 2018 ); and all ALE-identified right hemisphere regions overlap with either domain-general regions of the MD network ( Fedorenko et al., 2013 ) or regions of the semantic control network known to be involved during executively-demanding semantic cognition in healthy individuals ( Jackson, 2021 ).
The results do not suggest that there is a global, undifferentiated upregulation of all domain-general neural resources in PSA. Indeed, we repeatedly found lower activation likelihood in midline regions of the SFG/paracingulate gyrus in PSA compared to controls. These midline clusters overlap with at least some definitions of the domain-general executive network ( Fedorenko et al., 2013 ). In contrast to our findings, increased activation in the same midline region has been associated with language recovery between two weeks and four months post-stroke ( Geranmayeh et al., 2017 ). It is not clear what the basis of these opposing results is, but one possibility is that this ALE meta-analysis was predominantly based on data collected from patients in the very chronic (see below) rather than sub-acute stage. If correct, it may be the case that the executive functions supported by medial prefrontal regions (e.g., response conflict, task planning ( Dosenbach et al., 2008 ;Mansouri et al., 2017 )) are critical during early phases of recovery when performance is at its most impaired, but in relatively well-recovered, chronic PSA these mechanisms are not required (indeed continued involvement might signal poor recovery).
Activation was more likely in the anterior temporal lobes during lower than higher demand comprehension tasks. Previous task fMRI studies in healthy participants found minimal influence of semantic control demands in the anterior temporal lobe, unlike prefrontal or posterior temporal regions ( Jackson, 2021 ). However, the anterior temporal lobe is more active for coherent, consistent contexts and combinatorial meanings, while inconsistent context or combinations of meaning require increased activation in semantic control and executive demand areas ( Branzi et al., 2020 ;Hoffman et al., 2015 ). Consequently, the anterior temporal lobe result in the current ALE meta-analysis might reflect comprehension processes for coherent contexts and combinations during lower demand comprehension tasks.
As is commonly the case in stroke research ( Fareed et al., 2012 ;Thomalla et al., 2017 ), the median age of the 64 included PSA participant groups (57.4 years) was lower than that of the average stroke patient (e.g., the median age of the UK stroke population was 77 in 2017 ( SSNAP, 2017 )). This may limit the generalisability of results obtained from functional neuroimaging studies to the 'real world', and future studies should investigate patterns of activation in older PSA who are more representative of the average stroke survivor.
We identified areas of enquiry that have had little attention in the literature to date. It was not possible to ascertain whether there are consistent activation differences between subacute and chronic PSA. The 64 PSA groups had median times post-stroke of 38.0 months and even when counting the 'earliest' timepoint at which each PSA group was scanned, only 9/64 PSA groups were less than 6 months post-stroke. This dearth of data meant it was not possible to use ALE to explore differences between sub-acute and chronic PSA. Importantly, this indicates a pressing need for future studies of this early period, when there is the fastest rate of language recovery ( Pedersen et al., 1995 ;Yagata et al., 2017 ). Additionally, it was not possible to explore longitudinal fMRI changes given the extremely limited number of longitudinal PSA fMRI studies. Even among papers that reported longitudinal information, several were small (n < 10 participants) and there was considerable variation with respect to which language or non-language cognition was explored and the timing of the first imaging timepoint (from the first few days to a few months post-stroke). The relative lack of studies and small sample sizes are unsurprising given the considerable logistic challenges involved in imaging subacute stroke patients. However, longitudinal studies are a powerful approach for exploring the neural bases of recovery (because the different starting points and inter-participant variations are controlled), and particularly for exploring whether language network changes observed in the chronic phase occur immediately or over time. Such information will be critical for understanding the mechanisms underpinning both instantaneous resilience to the effects of damage, degeneracy, and longer-term experience-dependent plasticity ( Chang and Lambon Ralph, 2020 ;Price and Friston, 2002 ;Sajid et al., 2020 ;Stefaniak et al., 2020 ;Ueno et al., 2011 ).
The results of this large-scale meta-analysis argue against classical neurocomputational invasion accounts of PSA language, i.e., expansion of the language network post-damage into new territories. Instead, (a) there is considerable overlap between the bilateral language-related functional networks observed in PSA and controls; (b) PSA participants are less likely than controls to activate certain regions, including areas beyond their core lesions in the left MCA territory; and (c) PSA participants are more likely to engage executive-control related regions of the right anterior insula and IFG. These results fit with a view that language is supported by a dynamic, bilateral albeit left-asymmetric network, and are consistent with the variable neurodisplacement hypothesis. The size of this (random-effects) analysis (including data pertaining to 481 PSA with a heterogeneous variety of lesion locations and aphasia profiles) should mean that the results will generalise to the wider patient population.
Despite its size and clear results, inevitably this study has limitations. First, all included PSA participants had a single left hemisphere stroke, so it is possible that left hemisphere clusters of lower activation likelihood in PSA might be a direct effect of tissue damage. Relatedly, left hemisphere lesions might have biased single dataset meta-analyses of PSA participants towards consistency in the right hemisphere, although there would have been no right hemisphere biasing effect on single dataset meta-analyses of controls, nor on any of the contrast meta-analyses. Second, decreased neurovascular coupling post-stroke could generate false activation differences between patients and controls, although this is less likely in chronic patients and undamaged cortical regions ( Geranmayeh et al., 2015 ). Third, 'neural reprogramming' might entail differences in utilisation that are only observable using connectivity ( Meier et al., 2018 ;Schofield et al., 2012 ) or multivariate analyses ( Fischer-Baum et al., 2017 ;Lee et al., 2017 ), although very few studies have used such techniques in PSA to date. Finally, the meta-analysis rests on studies reporting the full set of whole brain responses from both PSA and controls, and differences seen in meta-analyses might not be replicated in individual studies.

Declaration of Competing Interests
None.

Author Contributions
JDS and RSWA contributed to study design, data collection, analysis and write-up. MALR contributed to study design and write-up.

Data availability
Group level coordinate data supporting the findings of this study are available on figshare (doi: 10.6084/m9.figshare.12582935).