Overlapping Neural Correlates Underpin Theory of Mind and Semantic Cognition: Evidence from a Meta-Analysis of 344 Functional Neuroimaging Studies

Key unanswered questions for cognitive neuroscience include whether social cognition is underpinned by specialised brain regions and to what extent it simultaneously depends on more domain-general systems. Until we glean a better understanding of the full set of contributions made by various systems, theories of social cognition will remain fundamentally limited. In the present study, we evaluate a recent proposal that semantic cognition plays a crucial role in supporting social cognition. While previous brain-based investigations have focused on dissociating these two systems, our primary aim was to assess the degree to which the neural correlates are overlapping, particularly within two key regions, the anterior temporal lobe (ATL) and the temporoparietal junction (TPJ). We focus on activation associated with theory of mind (ToM) and adopt a meta-analytic activation likelihood approach to synthesise a large set of functional neuroimaging studies and compare their results with studies of semantic cognition. As a key consideration, we sought to account for methodological differences across the two sets of studies, including the fact that ToM studies tend to use nonverbal stimuli while the semantics literature is dominated by language-based tasks. Overall, we observed consistent overlap between the two sets of brain regions, especially in the ATL and TPJ. This supports the claim that tasks involving ToM draw upon more general semantic retrieval processes. We also identified activation specific to ToM in the right TPJ, bilateral anterior mPFC, and right precuneus. This is consistent with the view that, nested amongst more domain-general systems, there is specialised circuitry that is tuned to social processes.


Introduction
The capacity to understand and respond appropriately to the thoughts and actions of others is of vital importance to our daily lives. When this ability breaks down, there are profound consequences for an individual's ability to thrive in society (Frith, 2007; Frith and Frith, 2007). Therefore, a key challenge for neuroscience is to develop a full account of the cognitive and brain basis of social interaction.
The dominant mode within social neuroscience has been to seek out specialised neural subsystems dedicated to processing social (as opposed to more general kinds of) information (Apperly et al., 2005; Happé et al., 2017; Saxe and Powell, 2006; Spunt and Adolphs, 2017). This approach has uncovered evidence for the existence of category-sensitive cortex: regions that preferentially activate during the perception of certain social stimuli, such as faces (Kanwisher and Yovel, 2006), bodies (Downing and Kanwisher, 2010), and dyadic social interactions (Landsiedel et al., 2022). It has been argued that more complex inferential processes such as mental state attribution, or Theory of Mind, also engage highly specialised social brain areas (Apperly et al., 2005; Brüne and Brüne-Cohrs, 2006; Dodell-Feder et al., 2011; Gweon et al., 2012; Jacoby et al., 2016; Jenkins et al., 2014; Koster-Hale and Saxe, 2013; Richardson and Saxe, 2020; Ross and Olson, 2010; Saxe and Baron-Cohen, 2006; Saxe and Kanwisher, 2003; Saxe and Wexler, 2005; Scholz et al., 2009; Simmons et al., 2010). However, the extent to which 'higher-order' systems (e.g., declarative memory; cognitive control) exhibit domain-specificity of this kind is hotly debated (e.g., Apperly et al., 2005; Binney and Ramsey, 2020; Ramsey and Ward, 2020). One factor keeping this debate from being resolved is that, to date, the role of domain-general systems in social cognition has received comparatively little attention and is not well understood. Consequently, neurobiological accounts of human social behaviour fall short of being comprehensive (Diaz et al., 2013; Arioli et al., 2021; Eickhoff et al., 2011).
Recently, however, there has been increased interest in the involvement of a set of distributed domain-general networks in social processing. This includes the 'multiple-demand network' (MDN), a set of brain areas engaged by cognitively challenging tasks that span numerous cognitive domains (Assem et al., 2020; Duncan, 2010; Fedorenko et al., 2013; Hugdahl et al., 2015). MDN activity increases with working memory load and task switching demands, for example, and it has been suggested that this reflects the implementation of top-down attentional control to meet immediate task goals (Duncan, 2010, 2013). MDN regions have been implicated in social processes, including working memory for social content (Meyer et al., 2012), social conflict resolution (Zaki et al., 2010) and mental state attribution (e.g., Rothmayr et al., 2011; Samson et al., 2005; Van der Meer et al., 2011). A further set of co-activated brain regions, referred to collectively as the 'default mode network' (DMN), has also garnered a widely appreciated role in social cognition (Darda and Ramsey, 2019; Diveica et al., 2021; Zaki et al., 2010; Duncan, 2010; Fedorenko, 2014; Fedorenko et al., 2013; Hughes et al., 2019; Jackson et al., 2022; Mars et al., 2012; Schilbach et al., 2006; Spreng and Grady, 2010). The DMN is a large-scale functional network that tends to activate in the absence of an explicit task, and it has been proposed that it is ideally suited for supporting self-generated, internally-orientated cognition, as opposed to externally-orientated cognition (Margulies et al., 2016; Smallwood et al., 2013). The DMN appears to comprise as many as three subsystems, and it is well accepted that at least one of these (which includes dorsomedial and ventrolateral prefrontal cortex, inferior parietal and lateral temporal regions) consistently activates during social processes like mental state attribution, which may in part relate to access to social knowledge (Spreng and Andrews-Hanna, 2015).
Indeed, it has recently been argued that a network known as the semantic cognition network (SCN; Humphreys et al., 2015; Jackson et al., 2019) has a crucial role in supporting social cognition (Balgova et al., 2022; Binney and Ramsey, 2020; Diveica et al., 2021). Semantic cognition (supported by the SCN) refers to neurocognitive systems involved in the acquisition and flexible retrieval of conceptual-level knowledge that exists to transform sensory inputs into meaningful, multimodal experiences. Conceptual knowledge critically underpins our capacity to recognise and interact with objects, words, people, and events in our environment (Patterson et al., 2007; Lambon Ralph et al., 2017), and Binney and Ramsey (2020) have argued that it should play a pivotal role in social cognition given that social interaction is, at its core, a process of meaningful exchange between persons. Support for this hypothesis has long existed within neuropsychological and comparative neuroscience literature, where there appears to be a tight coupling of general semantic deficits and social impairments (Bertoux et al., 2020; Irish et al., 2014; Klüver and Bucy, 1937; Miller et al., 2012; Souter et al., 2021; for a review see Olson et al., 2013; Rouse et al., 2024). Evidence at the level of whole-brain networks has yet to be conclusively obtained.
The SCN comprises the IFG and posterolateral temporal cortex (inclusive of the posterior MTG and posterior ITG), which play a particular role in control-related processes, as well as the anterior temporal lobes (ATL), which underpin semantic representation processes (Jackson, 2021; Jefferies, 2013; Noonan et al., 2013; Lambon Ralph et al., 2017). There is a generally accepted notion that there is some degree of overlap between SCN regions and those brain regions involved in social cognition (Binney and Ramsey, 2020; Spreng and Andrews-Hanna, 2015), yet only very recently have there been direct explorations of this relationship. Moreover, most of these studies have focused on differences and divergence (Baetens et al., 2013; Hyatt et al., 2015), and relatively little discussion was given to the implications of any overlap. The matter of convergence has been elucidated in a series of recent targeted studies (Balgova et al., 2022; Binney and Ramsey, 2020; Diveica et al., 2021; Hodgson et al., 2022), although each of these is limited in particular ways. For instance, Balgova et al. (2022) was an fMRI study which employed a limited task set and thus could lack generalisability. Hodgson et al. (2022) and Diveica et al. (2021) used a meta-analytic approach to extract reliable findings from across large numbers of functional neuroimaging studies, and thereby circumvent the limitations of individual studies (Cumming, 2014; Eickhoff et al., 2012), which include low statistical power (Button et al., 2013) and vulnerability to idiosyncratic design/analysis choices (Botvinik-Nezer et al., 2020; Carp, 2012). Nonetheless, Hodgson et al. (2022) restricted their analyses to a limited brain volume. Diveica et al. (2021), on the other hand, while taking a whole-brain approach, focused on regions of the SCN that respond to increased cognitive control demands, and they did not formally compare or contrast the whole SCN with activation maps from social cognitive tasks. These limitations (and others; see below) were addressed in the present study.
The present study addressed the overlap between brain networks involved in social and semantic cognition. It focused on one key aspect of social cognition, namely mental state attribution or 'theory of mind' (ToM), for three reasons. First, ToM is considered fundamental to successful social interactions (Apperly, 2012; Brüne and Brüne-Cohrs, 2006; Frith and Frith, 2005; Heleven and van Overwalle, 2018; van Hoeck et al., 2014). Second, there is a large body of literature, as is requisite for meta-analytic investigation. Third, ToM abilities enable one to describe, explain, predict, and infer the intentions, beliefs, and affective states of others (Adolphs, 2009; Brüne and Brüne-Cohrs, 2006; Frith and Frith, 2007, 2010, 2012; Happé et al., 2017; Premack and Woodruff, 1978). As such, ToM includes inferential processes that allow one to go beyond what is directly observable through the senses, thus appearing to be comparable to, and perhaps explained by, more general semantic processes that are specialised for the extraction of all types of meaning (Binney and Ramsey, 2020).
Most neural accounts of ToM implicate the temporoparietal junction (TPJ) alongside medial prefrontal cortex (mPFC) and the precuneus. Some accounts also include the posterior superior temporal sulcus (pSTS) and the ATL (Amodio and Frith, 2006; Mar, 2011; Molenberghs et al., 2016; Saxe and Kanwisher, 2003; Saxe and Wexler, 2005; Saxe, 2006; Saxe and Powell, 2006; Schurz et al., 2014, 2017). It is key to note that the term 'TPJ' is less frequently used in the semantic cognition literature than in social neuroscience, and the corresponding definition can be vague and heterogeneous. For present purposes, we interpret the label TPJ to refer to a large area that includes the posterolateral temporal cortex and the inferior parietal lobe, including the angular gyrus (AG) (Hodgson et al., 2022; Seghier, 2013, 2022; Seghier et al., 2010). Some accounts of semantic cognition include the AG and argue either that it is involved in the integration and storage of conceptual knowledge (Kuhnke et al., 2020) or that it acts as a temporary buffer (Humphreys and Tibon, 2022). However, the AG has also been implicated in other domain-general processes that extend beyond semantic processing (Cabeza et al., 2012; Geng and Vossel, 2013; Humphreys et al., 2021; Humphreys and Tibon, 2022). In the present study, we specifically anticipated overlap between the SCN and ToM network in the ATL and the TPJ, as both regions are frequently implicated in putatively domain-specific social processes as well as semantic cognition (Balgova et al., 2022; Diveica et al., 2021; Humphreys et al., 2021; Olson et al., 2013; Seghier et al., 2010).
We also aimed to investigate a potential hemispheric dissociation between social and semantic cognition at these sites. In semantic cognition, the role of the ATL is viewed as bilateral (albeit with a leftwards asymmetry when probed with verbal semantic information; Lambon Ralph et al., 2001; Rice et al., 2015a), whereas the role of the ATL in social cognition has been ascribed right lateralisation (Younes et al., 2022; Zahn et al., 2009). Evidence for this distinction is limited, however, because claims that the right, but not the left ATL, is key for social processing are based chiefly upon patient studies (Borghesani et al., 2019; Gainotti, 2015; Gorno-Tempini et al., 2003; Irish et al., 2014). Individual fMRI studies, on the other hand, typically indicate bilateral involvement or possibly a leftward asymmetry (Balgova et al., 2022; Binney et al., 2016b; Rice et al., 2018; Ross and Olson, 2010; but see Zahn et al., 2009; also see Arioli et al., 2021; Catricalà et al., 2020; Lin et al., 2018; Pobric et al., 2016; Rice et al., 2015b). The laterality of TPJ involvement in social cognition is unclear. In neuroimaging studies, it is often observed bilaterally (Molenberghs et al., 2016; Schurz et al., 2014), but selectivity of this region for ToM is argued to be limited to the right hemisphere by some authors (Perner et al., 2006; Saxe and Wexler, 2005), while others have reported greater selectivity in the left (Aichhorn et al., 2006, 2009). In semantic cognition, activation of regions within the TPJ tends to be left lateralised (Handjaras et al., 2017; Kuhnke et al., 2022; Seghier, 2013; Seghier et al., 2010). Collectively, these findings paint a complex picture regarding how the ToM and semantic networks converge and diverge at these ATL and TPJ sites.
Laterality differences may be of critical importance to differentiating semantic and social cognition networks. Alternatively, they could reflect a methodological confound: typical neuroimaging assessments in the two domains tend to use different types of stimuli and, for example, language-based tasks tend to drive greater left-hemisphere activation (Binder et al., 2009; Rice et al., 2015b). A key aim of this study, therefore, was to investigate whether methodological factors give rise to a skewed pattern of activity in each domain. Most fMRI studies probing semantic cognition have used verbal stimuli (e.g., words/sentences) (Rice et al., 2015b; Visser et al., 2010b). In contrast, nonverbal stimuli such as animations, vignettes, or free-viewing movie paradigms are popular in the ToM literature (Diveica et al., 2021; Molenberghs et al., 2016). Although both semantic cognition and ToM are typically viewed as modality-independent processes (Gallagher et al., 2000), these prevalent methodological differences could mar between-domain comparisons because activation patterns within each domain shift according to the stimulus presentation format. For example, a meta-analysis of fMRI studies found that non-verbal ToM tasks, compared to verbal ToM tasks, evoke greater activation in the left precentral gyrus and left and right IFG, and lower activation in the mPFC, precuneus, and bilateral TPJ (Molenberghs et al., 2016). Similarly, Visser et al.'s (2010b) meta-analysis of semantic cognition found that the laterality of ATL activation depends on whether stimuli were presented in the auditory versus visual modality (also see Krieger-Redwood et al., 2015; Rice et al., 2015b). Thus, left unaccounted for, these kinds of systematic methodological differences could create the appearance of divergence between the two task-associated networks when there is, in fact, a common system with meaningful covariation driven by properties of the stimuli. In the present study, we controlled for stimulus format (verbal, non-verbal) and input modality (visual, auditory) to disentangle pervasive differences between networks from context-dependent differences. In the same vein, we controlled for inter-domain differences in the types of baseline/control tasks used (e.g., active versus passive) and screened for the presence of social stimuli in the studies of semantics.
In summary, to determine the degree to which ToM and semantic cognition share an underlying neural basis, we performed a large-scale neuroimaging meta-analysis to systematically compare the ToM-related brain network with the SCN, with a primary focus on the ATL and TPJ. Moreover, we assessed the effect of stimulus format and sensory input modality on network overlap. To our knowledge, this is the first direct comparison of these two large-scale networks via these means (see Hodgson et al., 2022 for a region-specific analysis, and Diveica et al., 2021 for data specific to semantic control).

Literature selection and inclusion criteria
We leveraged a Theory of Mind (ToM) dataset curated by Diveica et al. (2021), and a Semantic Cognition (SCN) dataset compiled by Jackson (2021). Both of these studies performed a comprehensive and up-to-date literature review and followed best practice guidance for conducting meta-analyses (Müller et al., 2018). Below, we provide a brief description of each of these original datasets.
The general semantics analysis (257 studies, 415 contrasts, 3606 peaks) reported by Jackson (2021) was designed to capture all aspects of semantic cognition, including activation of conceptual-level knowledge, as well as engagement of control processes that guide context- or task-appropriate retrieval of concepts. Studies were included if they compared a (more) semantic with a non- (or less-) semantic task or meaningful (or known) with meaningless (or unknown) stimuli. It included studies published between 2008 and 2019. The ToM analysis (136 experiments, 2158 peaks, 3452 participants) reported by Diveica et al. (2021) included studies published between 2014 and 2020 that employed a primary task involving inferences about the mental states of others, including their beliefs, intentions, and desires (but not sensory or emotional states). These studies were also required to compare the ToM task to a non-ToM task. Studies that looked at the passive observation of actions, social understanding, mimicry or imitation were not included unless the primary task included a clear ToM component. Studies investigating irony comprehension, those that employed trait inference tasks, and those that employed interactive games were also excluded. Both Jackson and Diveica et al. excluded contrasts that made comparisons between sub-components of the process of interest (but see the final paragraph in this Section). For example, Diveica et al. excluded affective ToM > cognitive ToM contrasts from the ToM studies, and Jackson excluded abstract semantics > concrete semantics contrasts. This was critical for the present study because we were interested in common, core semantic/ToM processes that are subtracted out by these contrasts.
For these two datasets to be compared, it was essential to ensure that a similar, if not identical, set of general exclusion criteria (i.e., those pertaining to the sample demographics, the imaging method, etc.) was applied. To this end, we initially planned to use the general inclusion/exclusion criteria described by Diveica et al. (2021) and reapply them to both the ToM and SCN datasets. In practice, we needed to implement a few minor modifications to these criteria. Below we summarise the final set of general criteria that we applied in the present study and highlight discrepancies from the approaches of Diveica et al. (2021) and Jackson (2021).
1. We included only peer-reviewed articles in English, and studies that employed task-based fMRI or PET, and only those that reported whole-brain activation coordinates localised in one of two standardised stereotactic spaces (Talairach (TAL) or Montreal Neurological Institute (MNI)). Coordinates reported in TAL space were converted into MNI space using the Lancaster transform (tal2icbm transform (Lancaster et al., 2007) embedded within the GingerALE software version 3.0.2; http://brainmap.org/ale). Results from region-of-interest or small-volume correction analyses were excluded.
2. We included only studies that tested healthy adults to control for age-related changes in neural networks supporting cognition (e.g., see Hoffman and Morcom, 2018). A deviation from Diveica et al. (2021) was that we only considered studies reporting data from participants aged 18-40 years. If the age range of participants in a given study was not stated, we included the results in our datasets as long as the mean age of the participants was less than 40 years (if stated) and there was no clear indication that adults outside the range of 18-40 were included in the sample. This was a similar criterion to that used by Jackson (2021).
3. Diveica et al. (2021) included contrasts between the experimental task (i.e., ToM processing) and either an active control condition or rest/passive fixation. Jackson (2021) only included contrasts against active baselines. Therefore, we added additional contrasts involving rest/passive fixation into the SCN dataset. In the present study, active control conditions were characterised as either a high-level or low-level baseline; thus, over and above Diveica et al. (2021) and Jackson (2021), the present study differentiated low-level active baselines (e.g., visual stimulation with a string of hashmarks as a control for sentence reading) from rest/passive fixation. With these extra steps, we aimed to better account for methodological differences across domains (see more detail in Section 2.3.1).
4. Where present, multiple contrasts from the same group of participants were included if they met all the other inclusion criteria. We controlled for within-group effects by pooling contrasts into a single experiment (Müller et al., 2018; Turkeltaub et al., 2012), as did Diveica et al. (2021) and Jackson (2021). This means that, when we refer to the numbers of experiments that constituted the units of input, we have counted contrasts from a single participant sample as one single experiment. In follow-up contrast analyses that compared different conditions (e.g., stimulus format or input modality), initially pooled contrasts related to these different conditions were separated (see more detail in Section 2.2). While Diveica and colleagues excluded the contrast with a smaller number of peaks after separating, we retained both of these contrasts to maximise the use of all available data.
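The pooling step in criterion 4 can be illustrated with a minimal Python sketch. It is not the authors' code: the `Contrast` and `Experiment` classes, the `sample_id` field, and the example coordinates are hypothetical, and the sketch only assumes that each contrast can be tagged with the participant sample it came from.

```python
from dataclasses import dataclass, field

@dataclass
class Contrast:
    sample_id: str        # hypothetical label identifying the participant sample
    n_participants: int
    peaks: list           # list of (x, y, z) coordinates in MNI space

@dataclass
class Experiment:
    sample_id: str
    n_participants: int
    peaks: list = field(default_factory=list)

def pool_by_sample(contrasts):
    """Merge all contrasts that share a participant sample into one experiment."""
    pooled = {}
    for c in contrasts:
        exp = pooled.setdefault(
            c.sample_id, Experiment(c.sample_id, c.n_participants)
        )
        exp.peaks.extend(c.peaks)
    return list(pooled.values())

# Illustrative input only: two contrasts from one sample, one from another.
contrasts = [
    Contrast("study_A_sample_1", 20, [(-54, 6, -24), (48, -58, 22)]),
    Contrast("study_A_sample_1", 20, [(-50, -62, 26)]),
    Contrast("study_B_sample_1", 16, [(4, 52, 20)]),
]
experiments = pool_by_sample(contrasts)
```

Under this scheme the unit of input is the participant sample, so a sample that contributes several contrasts is still counted as a single experiment.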
Two further adjustments were made to the SCN dataset to make it optimally comparable to the ToM dataset. As discussed above, both Jackson and Diveica et al. excluded contrasts that made comparisons between sub-components of the process of interest and thus could subtract away core processes associated with ToM and semantic cognition. In the case of ToM, this left only those contrasts comparing ToM tasks with non-ToM tasks. Jackson, however, also included a small number of contrasts that compared more semantic tasks with less semantic tasks (e.g., an identity classification task using faces with varying degrees of familiarity used by Rotshtein et al. (2005), or a task contrasting personal familiar and famous familiar faces used by Sugiura et al. (2006)). In the present study, we excluded these because they could subtract out some core processes or common regions. While this was likely of little consequence in Jackson's (2021) study, the inclusion of these contrasts could, in principle, weaken the comparison of SCN data with the ToM data. An exception was applied to contrasts that pitted intelligible sentences against scrambled sentences because they were an important source of data in the verbal and auditory domain, and we reasoned that, while there is meaning present in both stimulus types at the single-word level, the critical difference was meaningfulness at the sentence level. Finally, we identified and excluded a small number of experiments in Jackson's SCN dataset (n = 4) that used contrasts that could be viewed as probing ToM-related processing.
The final ToM dataset used in the present study comprised 114 experiments from 2800 participants, 159 contrasts, and 1893 peaks. The final SCN dataset used in the present study comprised 214 experiments, including data from 3934 participants, 410 contrasts, and 3803 peaks.

Categorising contrasts by stimulus format and sensory input modality
In line with our secondary aim of accounting for the effects of stimulus format and sensory input modality on network overlap, individual contrasts from both the ToM and SCN datasets were further categorised as being within the verbal domain or the non-verbal domain. Verbal paradigms used spoken or written language stimuli. Examples of non-verbal paradigms include those using pictures (e.g., of objects or actions), animations, videos, or environmental sounds (see Rice et al., 2015a,b, for a similar approach). Moreover, contrasts were independently categorised according to whether stimuli were presented in the visual or auditory modality. In this case, pictorial stimuli, as well as written words and sentences, were counted as visual stimuli (see Molenberghs et al., 2016; Visser et al., 2010b for similar approaches). In cases where both types of stimuli (e.g., verbal and non-verbal) were used in the same task, the contrast was excluded (e.g., Sommer et al., 2010). The reader is referred to Table 1 and the Supplementary Information for the number of studies and a list of excluded contrasts in each of these categories.
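The categorisation rule described above can be sketched in Python as follows. This is an illustrative assumption rather than the procedure actually used: the stimulus labels and the `categorise` function are hypothetical, and the sketch treats a contrast that mixes both types along a dimension as excluded (`None`) from the comparison on that dimension.

```python
# Hypothetical stimulus labels; the text describes the categories only in prose.
VERBAL = {"spoken_language", "written_language"}
NONVERBAL = {"pictures", "animations", "videos", "environmental_sounds"}
AUDITORY = {"spoken_language", "environmental_sounds"}

def categorise(stimuli):
    """Return (format, modality) for a contrast.

    An entry is None when the contrast mixes both types on that dimension,
    which marks it for exclusion from the corresponding comparison.
    """
    kinds = set(stimuli)
    unknown = kinds - (VERBAL | NONVERBAL)
    if not kinds or unknown:
        raise ValueError(f"cannot categorise stimuli: {sorted(unknown)}")
    formats = {"verbal" if s in VERBAL else "non-verbal" for s in kinds}
    modalities = {"auditory" if s in AUDITORY else "visual" for s in kinds}
    fmt = next(iter(formats)) if len(formats) == 1 else None
    mod = next(iter(modalities)) if len(modalities) == 1 else None
    return fmt, mod

sentence_task = categorise(["written_language"])
sound_task = categorise(["environmental_sounds"])
mixed_task = categorise(["pictures", "written_language"])
```

Keeping format and modality as two independent labels mirrors the text: written words are verbal but visual, whereas environmental sounds are non-verbal but auditory.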

Further methodological considerations
Following the application of general inclusion/exclusion criteria and the categorization described in Section 2.2, we took additional steps to further characterize the two revised datasets and evaluate the potential for other confounds to influence their comparison. As we shall describe below, this led to further refinement which improved the suitability of the datasets for addressing our key research questions.

Controlling for type of baseline
In semantic cognition research, it is widely accepted that the results of neuroimaging studies are affected, in important ways, by the choice of baseline task; a failure to perform adequate matching of baselines to experimental conditions in terms of perceptual input, response and attentional/executive demands, decreases sensitivity of subtractive designs to activation in brain areas associated with cross-modal integration, semantic processing and response selection (Price et al., 2005).

Table 1
The number of experiments, contrasts, and peaks split according to the stimulus format, input modality, type of baseline, and presence of social content.

Indeed, the use of passive rest or simple fixation as a baseline results in failure to reveal task-positive activation in anterior temporal areas (Binder et al., 2009; Price et al., 2005; Visser et al., 2010b), because minimal baseline task demands increase the opportunity for spontaneous semantic processing (associated with daydreaming and inner speech) to occur at an equal or greater depth/magnitude than that associated with more focused task-related semantic processing (Andrews-Hanna et al., 2014; Binder et al., 2009, 2016; Chiou et al., 2020; Humphreys et al., 2015; Visser et al., 2010b). While it is not typically discussed in the literature, this is also an important consideration for neuroimaging studies of social cognition because various forms of social inference are likely to occur during a state of mind-wandering (see, e.g., Diaz et al., 2013). We observed that our SCN and ToM datasets differed considerably in the types of baselines used and that there was a higher degree of variability among semantic cognition studies (see Table 1 and the Supplementary Information). This could have led to a confound in the inter-domain comparisons, namely a difference in the sensitivity to activation associated with cross-modal processing. To explore these issues, we (a) quantified these differences using three categories of baseline and (b) mapped the effect of including/excluding contrasts that used these baselines on the outcomes of ALE analysis within each domain. The results of this preliminary analysis informed our final approach to defining the datasets used for the inter-domain comparisons (see below). Previous attempts to deal with this issue have only distinguished between two types of baselines (e.g., Visser et al., 2010a), but with a view to capturing greater specificity in these effects, we operationalized three, as follows.
1. High-level baselines were defined as those including an active task designed to approximate the demands of the main/experimental task without engaging the process of interest (ToM or semantic processing). This includes being generally well-matched to the experimental task in terms of perceptual (visual, auditory) properties, and means of behavioural output (overt/covert).
2. Low-level baselines were defined as having a task that required active engagement but one that differed from the main task in numerous ways, including perceptual properties, means of behavioural output, or difficulty.
3. Finally, the third category of baselines were those which required only passively watching a blank screen or maintaining visual fixation.
Our chief motivation for this finer differentiation of baseline types was to arrive at an optimal scenario in which we could remove cross-domain confounds while retaining as many data points, and therefore as much power, as possible. We decided on a stepwise approach in which we would compute the ALE map for each domain (i) with all contrasts included, then (ii) without contrasts involving rest/fixation, and finally, (iii) with neither the rest/fixation nor low-level baseline contrasts included. We visually compared the ALE maps generated at each step, as well as the associated output tables, paying attention to the gain or loss of suprathreshold clusters. We decided a priori that if inclusion/exclusion resulted in minimal change to the activation maps, then we would opt to retain contrasts in the sample. The authors acknowledge the arbitrariness of this criterion, but setting a more specific rule a priori was impractical because the quality and quantity of these changes were likely to vary across different comparisons (due to sample size, etc.). Therefore, to mitigate this and ensure transparency, we (i) opted to fully report the results both prior to and following exclusions, and (ii) ensured that all datasets and subsets were publicly available so that the community can reproduce our results and explore the consequences of certain methodological decisions (all data can be found at https://osf.io/ydnxh/).
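The three nested subsets used in this stepwise comparison can be sketched as below. This is a minimal illustration under stated assumptions: the dictionary keys, the baseline labels ("high", "low", "rest"), and the `stepwise_subsets` function are hypothetical, and the ALE computation itself (run in GingerALE in the present study) is deliberately left out.

```python
# Hypothetical baseline labels for the three categories defined above.
STEPS = {
    "step_i_all_contrasts": {"high", "low", "rest"},
    "step_ii_no_rest": {"high", "low"},
    "step_iii_high_only": {"high"},
}

def stepwise_subsets(contrasts):
    """Return the contrasts retained at each step of the baseline comparison."""
    return {
        step: [c for c in contrasts if c["baseline"] in allowed]
        for step, allowed in STEPS.items()
    }

# Illustrative contrasts only; the identifiers are not taken from the datasets.
example_contrasts = [
    {"id": "c1", "baseline": "high"},
    {"id": "c2", "baseline": "low"},
    {"id": "c3", "baseline": "rest"},
    {"id": "c4", "baseline": "high"},
]
subsets = stepwise_subsets(example_contrasts)
```

Each subset would then be submitted to a separate ALE analysis, and the resulting maps and cluster tables compared across steps.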
We found that, in the case of the SCN dataset, excluding passive/resting baselines resulted in additional activation in the left inferior temporal lobe and in right medial temporal areas (see Supplementary Figure R2b and Supplementary Table R2). The exclusion of contrasts utilising low-level baselines did not lead to any appreciable differences in the distribution of activations, but the size of clusters was reduced owing to the reduced sample and power. In the case of the ToM data, the impact of these exclusions was negligible due to a very low number of experiments with low-level and passive baselines (Supplementary Figure R2a and Supplementary Table R1). Overall, these outcomes are consistent with an expectation that the inclusion of passive baselines would occlude activation within parts of the SCN (Binder et al., 2009, 2016; Humphreys et al., 2015; Visser et al., 2010b). Exclusion of lower-level baselines, on the other hand, might be an overly conservative approach that prohibits the detection of activation that is common across domains. We therefore excluded only contrasts involving rest/passive baselines, and did so from both datasets.

Controlling for the 'socialness' of semantic stimuli
Twenty studies (48 contrasts) in Jackson's (2021) original SCN dataset, having otherwise met our revised exclusion/inclusion criteria, involved a task or stimuli that were, to some degree, social in nature. For example, some studies used social or emotion concepts, and others probed person knowledge through famous faces (e.g., Elfgren et al., 2006; Grabowski et al., 2001; Leveroni et al., 2000). We identified contrasts as being social if they used stimuli that consistently referred to (i.e., this was a defining feature for the stimuli) social characteristics of persons or groups of people, a social behaviour or interaction, or any other socially-relevant concept (Diveica et al., 2022). These studies required further consideration, particularly because of an ongoing debate concerning whether social semantics and general semantics depend upon independent or overlapping representational systems (Arioli et al., 2021; Binney et al., 2016b; Binney and Ramsey, 2020; Olson et al., 2013; Pexman et al., 2023). It is possible that ToM tasks engage social concepts and therefore the same regions engaged by social semantic processing (e.g., the dorsal ATL; Binney and Ramsey, 2020; Ross and Olson, 2010; Zahn et al., 2007) without relying on general semantic areas. In this case, if we were to pool social semantic contrasts and general semantic contrasts, then we might obtain an exaggerated picture of the extent to which the ToM network overlaps with the general semantic processing network (Rouse et al., 2024). However, there was also a pragmatic reason for including these studies: they are a key source of data related to non-verbal semantic processing (see Table 1) and excluding them could compromise our ability to remove the confounding effect of stimulus type. To account for this, we examined the effect of including/excluding these studies in our general semantic dataset. These results are fully reported in the Supplementary Information No 2. Briefly, the overall pattern remained almost the same when the social contrasts were excluded, apart from the loss of a small cluster in the right IFG and slightly less extensive left temporopolar activation. These differences are likely to be due to the reduction in the number of studies included and concern brain regions that were not the focus of the present study, and thus are not central to the conclusions made. Therefore, we decided to retain the social contrasts as part of our SCN dataset and include them in the cross-domain comparisons reported in the Results section.

Data analysis
We performed coordinate-based meta-analyses, using the revised activation likelihood estimation (ALE) algorithm as implemented in the GingerALE 3.02 software (http://brainmap.org/ale) (Eickhoff et al., 2009, 2012, 2017; Laird et al., 2005). To ensure sufficient statistical power, analyses were only performed on samples comprising a minimum of 17 experiments (Eickhoff et al., 2016). Nonetheless, meta-analyses performed on small sample sizes are susceptible to potential publication bias, and caution should be exercised when interpreting results from samples with fewer than 30 studies (Acar et al., 2018). The minimum sample size in this report, however, is n = 37. Each analysis comprised two stages. The first stage consisted of independent analyses of the ToM and SCN datasets, which were used to identify areas of consistent activation within each domain. Here, the ALE meta-analytic method treats the activation coordinates reported by each experiment as the centre points of three-dimensional Gaussian probability distributions, which differ in width to account for the reliability of the peak estimate based on the size of the participant sample (Eickhoff et al., 2009). These spatial probability distributions are aggregated, creating a voxel-wise modelled activation (MA) map for each experiment in the sample. Then, the voxel-wise union across the MA maps of all experiments is computed, resulting in an ALE map that quantifies the convergence of results across experiments (Turkeltaub et al., 2012). GingerALE tests for above-chance convergence (Eickhoff et al., 2012), thus permitting random-effects inferences. Following the recommendations of Müller et al. (2018), ALE maps of both the ToM and SCN domains were thresholded using cluster-level family-wise error (FWE) correction of p < 0.05 with a prior cluster-forming threshold of p < 0.001 (uncorrected), which was estimated via 5000 permutations. Cluster-level FWE correction has been shown to offer the best compromise between sensitivity to detect true convergence and spatial specificity (Eickhoff et al., 2016).
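The first-stage computation described above can be illustrated with a minimal sketch. It is not GingerALE's implementation: the grid is a toy line of voxels, `toy_fwhm` is a hypothetical stand-in for the empirical sample-size model of Eickhoff et al. (2009), and combining an experiment's foci by taking the voxel-wise maximum is an assumption of this sketch. All function and variable names are illustrative.

```python
import math

def toy_fwhm(n_subjects):
    # Hypothetical stand-in: larger samples give narrower kernels.
    # This is NOT the empirical model of Eickhoff et al. (2009).
    return 8.0 + 10.0 / math.sqrt(n_subjects)

def ma_map(foci, n_subjects, grid, voxel_volume=8.0):
    """Modelled activation (MA) map for one experiment.

    Each focus is the centre of a 3D Gaussian; the per-voxel value is the
    Gaussian density times the voxel volume (mm^3). Foci within an experiment
    are combined by the voxel-wise maximum (assumption of this sketch).
    """
    sigma = toy_fwhm(n_subjects) / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    norm = voxel_volume / ((2.0 * math.pi * sigma ** 2) ** 1.5)
    values = []
    for voxel in grid:
        best = 0.0
        for focus in foci:
            d2 = sum((v - f) ** 2 for v, f in zip(voxel, focus))
            best = max(best, norm * math.exp(-d2 / (2.0 * sigma ** 2)))
        values.append(best)
    return values

def ale_map(ma_maps):
    """Voxel-wise union across experiments: ALE = 1 - prod(1 - MA_i)."""
    ale = []
    for voxel_values in zip(*ma_maps):
        p_none = 1.0
        for p in voxel_values:
            p_none *= 1.0 - p
        ale.append(1.0 - p_none)
    return ale

# Toy data: a line of voxels along x (2 mm steps) and three made-up experiments.
grid = [(x, 0.0, 0.0) for x in range(-40, 42, 2)]
experiments = [
    {"foci": [(-2.0, 0.0, 0.0)], "n": 12},
    {"foci": [(0.0, 0.0, 0.0)], "n": 20},
    {"foci": [(2.0, 0.0, 0.0), (30.0, 0.0, 0.0)], "n": 16},
]
ma_maps = [ma_map(e["foci"], e["n"], grid) for e in experiments]
ale = ale_map(ma_maps)
peak_index = max(range(len(ale)), key=ale.__getitem__)
print("peak voxel:", grid[peak_index], "ALE:", round(ale[peak_index], 4))
```

In this toy example the ALE score is highest where foci from different experiments converge, and lower at the isolated focus reported by a single experiment, which is the sense in which the ALE map quantifies convergence.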
The ALE maps generated in this first stage were used as inputs for the second stage of analysis, which comprised conjunction and contrast analyses. These analyses were aimed at identifying similarities and differences, respectively, in neural activation between the SCN and ToM sets of studies. Conjunction images were generated using the voxel-wise minimum value of the ALE maps (Nichols et al., 2005). Contrast images were created by directly subtracting one ALE map from the other (Eickhoff et al., 2011). Differences in ALE scores were compared to a null distribution that was estimated via a permutation approach with 5000 repetitions. Given that there are no established methods for multiple comparison correction of ALE contrast maps (see Eickhoff et al., 2011), the contrast maps were thresholded using a more conservative threshold of p < 0.001 (uncorrected) and a minimum cluster size of 100 mm³. Moreover, we masked the contrast maps with the cluster-level FWE-corrected ALE maps resulting from the independent ALE analysis of the respective cognitive domain. Thresholded ALE maps were plotted on an MNI152 template brain using MRIcroGL (https://www.nitrc.org/projects/mricrogl). We used FSL maths commands and FSLView (https://www.nitrc.org/projects/fsl) to binarise the ALE maps for better visual clarity when displaying the conjunction.
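As a rough illustration of this second stage, the sketch below computes a conjunction as the voxel-wise minimum and a one-directional contrast (A minus B) with a permutation null. Building the null by shuffling experiment labels between the two datasets is an assumption of this sketch, as are all names and the toy MA values; the minimum cluster-size criterion is omitted because the toy data have no spatial structure.

```python
import random

def ale_union(ma_maps):
    """Voxel-wise union across experiments: 1 - prod(1 - MA_i)."""
    ale = []
    for voxel_values in zip(*ma_maps):
        p_none = 1.0
        for p in voxel_values:
            p_none *= 1.0 - p
        ale.append(1.0 - p_none)
    return ale

def conjunction(ale_a, ale_b):
    """Conjunction image: voxel-wise minimum of two (thresholded) ALE maps."""
    return [min(a, b) for a, b in zip(ale_a, ale_b)]

def contrast_p_values(ma_a, ma_b, n_perm=5000, seed=0):
    """Observed ALE difference (A - B) and permutation p-values per voxel.

    Null distribution: experiments are pooled and randomly reassigned to two
    groups of the original sizes (assumption of this sketch).
    """
    rng = random.Random(seed)
    observed = [a - b for a, b in zip(ale_union(ma_a), ale_union(ma_b))]
    pooled = ma_a + ma_b
    n_a = len(ma_a)
    exceed = [0] * len(observed)
    for _ in range(n_perm):
        rng.shuffle(pooled)
        null_a, null_b = ale_union(pooled[:n_a]), ale_union(pooled[n_a:])
        for i, d_obs in enumerate(observed):
            if null_a[i] - null_b[i] >= d_obs:
                exceed[i] += 1
    return observed, [(count + 1) / (n_perm + 1) for count in exceed]

def threshold_contrast(observed, p_values, mask, alpha=0.001):
    """Keep differences with p < alpha that also fall inside the domain mask."""
    return [d if (p < alpha and m) else 0.0
            for d, p, m in zip(observed, p_values, mask)]

# Toy MA maps over four voxels (made-up values): six experiments per group.
group_a = [[0.01, 0.01, 0.0, 0.0] for _ in range(6)]
group_b = [[0.0, 0.01, 0.01, 0.0] for _ in range(6)]

overlap = conjunction(ale_union(group_a), ale_union(group_b))
diff, p_vals = contrast_p_values(group_a, group_b, n_perm=5000, seed=0)
mask_a = [a > 0.0 for a in ale_union(group_a)]  # stand-in for the FWE-corrected map
print("conjunction:", [round(v, 4) for v in overlap])
print("A > B after thresholding:", threshold_contrast(diff, p_vals, mask_a))
```

Here the second voxel is active in both groups and so survives the conjunction, whereas only the first voxel can show an A > B difference. The reverse contrast would be obtained by swapping the two arguments.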
In a final step, we conducted post hoc cluster analyses that afforded a complementary approach to evaluating whether clusters of activation identified in the two independent ALE analyses of the SCN and ToM data were driven by certain methodological characteristics (i.e., input modality and stimulus format). We examined the list of experiments that contributed at least one peak to each cluster and computed the likelihood of contribution of a given experiment type. For these purposes, we used Fisher's exact tests of independence and post hoc pairwise comparisons in RStudio Version 1.4.1106 (https://www.rstudio.com).
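The logic of this cluster analysis can be sketched as a two-sided Fisher's exact test on a 2 x 2 table of experiment type by contribution to a cluster. The analyses reported here were run in R; the Python function below is only an illustrative re-implementation, and the counts are made up for the example.

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Rows: experiment type (e.g., verbal, nonverbal).
    Columns: contributed a peak to the cluster, did not contribute.
    The p-value sums the probabilities of all tables with the same margins
    that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def weight(x):
        # Unnormalised hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    w_obs = weight(a)
    extreme = sum(weight(x) for x in range(lowest, highest + 1)
                  if weight(x) <= w_obs)
    return extreme / total

# Hypothetical counts: 3 of 20 verbal and 12 of 20 nonverbal experiments
# contributed at least one peak to a given cluster.
p_value = fisher_exact_2x2(3, 17, 12, 8)
print("two-sided p:", p_value)
```

Integer arithmetic is used for the table weights so that the comparison against the observed table does not depend on floating-point tolerance.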
In summary, our analysis pipeline proceeded as follows. To address our primary question about similarities in the brain networks underpinning semantic and social cognition, we conducted independent ALE analyses on the ToM and SCN datasets which generated whole-brain activation maps. These maps were then used to create conjunction and contrast analyses aimed at identifying overlap and differences in the topology of activation between the two domains. We repeated these analyses having divided the SCN and ToM datasets into subsets containing experiments that used VERBAL stimuli on one hand, and NON-VERBAL stimuli on the other. This allowed examination of the effect of stimulus format. Then we split the datasets into subsets containing experiments that used VISUAL and AUDITORY stimuli and repeated the analyses to investigate the impact of sensory input modality. Finally, we performed cluster analyses to check whether the likelihood of finding activation within each cluster identified in the primary ALE analyses of the ToM and SCN data depends on experiment type (VERBAL, NON-VERBAL, VISUAL, AUDITORY).

General overlap between networks subserving theory of mind and semantic cognition
Our principal analyses explored the extent to which neural networks engaged by ToM and semantic cognition tasks overlap (and diverge). Overall, the results reveal extensive areas of overlap, including at key areas of interest (see Fig. 1 and Table 2; also see the independent ALE analysis results for each separate domain in Supplementary Information No. 2: Supplementary Figure R1 and Supplementary Table R1). Specifically, there was a conjunction of ToM and SCN activity within the bilateral ATL that covered the temporal pole (TP) and the banks of the anterior STS, the MTG and STG in both hemispheres. In the left but not the right hemisphere, the area of overlap extended along the whole length of the MTG/STG towards the lateral temporoparietal junction (including the AG) as well as medial portions of the IPL. There was also a conjunction of activation in the left posterior ventral temporal lobe (ITG/FG), and in the lateral frontal cortex including pars orbitalis, triangularis and opercularis of the left IFG and the ventral precentral gyrus. There were smaller clusters on the bank of the right inferior frontal sulcus (pars triangularis), the left dorsomedial frontal cortex and left inferior precuneus.
In the context of this large overlap, the contrast analyses revealed key differences between ToM and SCN (Fig. 1). On the lateral surface of the bilateral ATL, the activation for ToM included an area of anterior MTG that the SCN did not. Moreover, in the right IPL/AG (within the TPJ), activation was only consistently identified for ToM. While both ToM and semantic cognition elicit reliable activation in the left TP, as well as the IPL/AG (TPJ), the contrast analyses revealed that voxels in these same areas had significantly higher ALE values for ToM compared to the SCN. Beyond our key areas of interest, compared to the semantic studies, ToM studies also showed higher convergence of activation in the right IFG, right precentral gyrus, bilateral anterior mPFC, left precuneus and left cerebellum. On the other hand, SCN experiments showed increased convergence of activation in the ventral portion of the left pMTG stretching to the posterior ITG and FG, in the left MFG/IFG spreading towards the insula, and in the left inferior precuneus and right dorsal mPFC.

The role of stimulus format (VERBAL versus NON-VERBAL)
In this next set of analyses, we explored the extent to which differences between the activation maps associated with ToM and semantic cognition could be explained by systematic differences in the types of tasks and stimuli used in each domain. We repeated the above comparisons, this time excluding contrasts involving nonverbal stimuli (i.e., only retaining those involving verbal stimuli). Both samples were large enough for the purposes of meta-analysis, although there were many more experiments using verbal stimuli in the domain of semantic cognition than there were in the ToM dataset (VERBAL ToM: n = 46; VERBAL SCN: n = 175). Nonetheless, this analysis revealed a very similar pattern of conjunction to the principal set of comparisons reported in Section 3.1, including the bilateral ATL (anterior MTG/STS) and the left TPJ (see Fig. 2 and Supplementary Information No. 2: Supplementary Table R4). However, there ceased to be any IFG activation for ToM tasks, and thus overlap between the two domains was absent in this region. A similar observation was made at the left posterior STS/MTG, and other small clusters of conjunction were no longer present. This could simply be due to the substantial reduction in size of the ToM experiment sample (from 113 to 46). However, we looked at the cluster analyses for the IFG and found that verbal ToM experiments were significantly less likely than nonverbal ToM experiments to contribute to the clusters in the bilateral IFG (see Supplementary Information No. 2: Supplementary Figure CA1 and Supplementary Table CA1 for more detail). This is contrary to expectations given that the left IFG is strongly engaged in language processing (Friederici, 2011). It is, however, consistent with prior results from the false belief > false photograph contrast employed in many of these verbal ToM studies (see Diveica et al., 2021; Schurz et al., 2014), and it is possible this result reflects the large number of these contrasts included in this meta-analysis. One explanation for this observation could be differences between the ToM tasks (e.g., false belief) and the control/baseline (e.g., false photograph) task in terms of the semantic/syntactic operations that need to be performed. Should there be greater or equivalent difficulty in the control task, then IFG activation could be subtracted away (see Diveica et al., 2021 for analyses that address this issue).
In the corresponding contrast analyses, the differences between ToM and the SCN in the bilateral ATL and left TPJ were less pronounced, yet they remained. There also continued to be more consistent activation of the right TPJ for ToM. This was also true of the left anterior mPFC and left precuneus. Indeed, while the extent of the clusters changed because of the reduced sample size in the ToM dataset, we continued to find more consistent involvement of the right TPJ, mPFC and left precuneus in ToM as compared to semantic cognition. Given that there were fewer studies in the ToM than the SCN dataset, it is unlikely that these cross-domain differences could be attributed to lower statistical power in the case of SCN. For more detail, see Supplementary Information No. 2: Supplementary Figures R4 & R5 Panel A and Supplementary Tables R6 & R7.
When we limited the datasets to experiments utilising nonverbal stimuli, the results of the ALE analysis for ToM remained mostly unchanged from that seen in Section 3.1. In the case of semantic cognition, the number and extent of clusters was greatly diminished, which reflects the reduced sample size (see Fig. 3). Indeed, there were more experiments using nonverbal stimuli in the domain of ToM than there were exploring semantic cognition (ToM: n = 71; SCN: n = 37) and, as a consequence of the reduced sample size in the SCN domain and consequent lack of initial convergent ALE activation, there was no conjunction between the domains in the left TPJ. Overlap was still present in key regions of interest including the left pMTG, left ITG and some small aspects of the left IFG. Notably, even though visual inspection of the independent ALE maps for each domain suggests a large difference in terms of bilateral ATL activation, there were no significant differences revealed by the contrast analysis. The bilateral TPJ responded selectively to ToM in this analysis, while the posterior ITG was only present in the SCN. The ALE maps for each domain can be found in Supplementary Information No. 2: Supplementary Figures R4 & R5 Panel B and Supplementary Tables R6 & R7. In the cluster analysis of either the ToM or semantic domain, we found that the likelihood of finding activation in the respective ATL or TPJ areas did not depend on the verbal/non-verbal nature of the stimuli. This finding suggests that the inability to identify convergent left TPJ activation in the non-verbal SCN sample, and, consequently, overlap with non-verbal ToM, is indeed due to reduced statistical power. The cluster analysis showed that nonverbal experiments did, however, contribute more to the bilateral IFG and SFG in the ToM domain (see more detail in Supplementary Information No. 2: Supplementary Figure CA1 and Supplementary Table CA1).
Panel A displays the conjunction alongside statistically significant differences revealed by the contrast analyses. The contrast maps in Panel A were thresholded with a cluster-forming threshold at p < 0.001 and a minimum cluster size of 100 mm³. In Panel B, we have overlaid the binarised versions of the complete ALE maps resulting from independent analysis of ToM and semantic cognition studies. This allows for full visualisation of the topography of the two networks and consideration of the relationship between them (also see Supplementary Figure R1 and Supplementary Table R1). The independent ALE maps were treated to a cluster-forming threshold at p < 0.001, and an FWE-corrected cluster-extent threshold at p < 0.05. The sagittal and coronal sections are chosen as representative slices positioned over peak coordinates at which there is the greatest conjunction in the bilateral anterior temporal lobes (left y = 12; right y = 14).
E. Balgova et al.

The role of sensory input modality (VISUAL versus AUDITORY)
We also investigated the impact of sensory input modality. Importantly, both domains were dominated by experiments using visually presented stimuli. Comparisons limited to the auditory experiments were not possible due to a very small sample of ToM data. Overall, the pattern and extent of the common activation for VISUAL experiments (ToM: n = 106; SCN: n = 152) remained highly similar to our original analysis (Section 3.1), with common clusters of activation in key semantic areas (see Fig. 4 and Supplementary Information No. 2: Supplementary Table R10), including the left ATL, left IFG, left pMTG and ITG/FG and the left IPL/AG. There were also clusters of conjunction in the left medial SFG and precuneus. However, unlike in the initial analysis, there was no right ATL activation for the visual SCN experiments, and therefore no overlap between domains in the right ATL. Indeed, the cluster analyses revealed that visual relative to auditory SCN contrasts were less likely to contribute to the right ATL cluster, suggesting that it is unlikely that the absence of right ATL activation for visual SCN can be explained by reduced power per se. Instead, it seems more likely that the auditory contrasts were driving this cluster in the case of semantic cognition. One possibility is that this reflects increased effort in studies that use auditory stimuli (see Discussion). Other minor differences to the initial analyses are a diminished area of conjunction in the left middle STG and an absence of a conjunction in the right IFG (see Supplementary Information No. 2: Supplementary Figure CA1 and Supplementary Table CA1 for more detail).
As in our full analysis, the contrast analysis found more consistently identified activation in visual ToM than visual semantic cognition studies in the right TPJ, right IFG, precentral gyrus, anterior mPFC and precuneus. A small portion of the bilateral IFG remained more reliably engaged across SCN studies, as did the MFG, anterior mPFC and left precuneus. For more detail see the VISUAL and AUDITORY ToM and VISUAL and AUDITORY SCN ALE maps in Supplementary Information No. 2: Supplementary Figures R8 & R9 and Supplementary Tables R11 & R12.
Although they do not directly relate to the study's main questions, for the sake of completeness and to allow for comparisons with prior meta-analyses (Molenberghs et al., 2016; Rice et al., 2015b; Visser et al., 2010b), we also performed conjunctive and contrastive analyses within each domain which compare each stimulus format and sensory modality (e.g., comparisons of the VERBAL SCN and NON-VERBAL SCN data sets, the VISUAL SCN and AUDITORY SCN data). The results of these analyses can be found in the supplementary information (see Supplementary Information No. 2: Supplementary Figures R6, R7 and R10 and Supplementary Tables R8, R9, R13 and R14).
Independent ALE analyses: cluster-forming threshold p < 0.001; cluster-extent FWE p < 0.05. The contrast analyses were further thresholded with a cluster-forming threshold at p < 0.001 and a minimum cluster size of 100 mm³. Anatomical labels are derived from the Automatic Anatomical Labelling Atlas. AG = angular gyrus, MTG = middle temporal gyrus, IFG = inferior frontal gyrus, TP = temporal pole, SMA = supplementary motor area, SFG = superior frontal gyrus, ITG = inferior temporal gyrus, IPL = inferior parietal lobule, MOG = middle occipital gyrus, STG = superior temporal gyrus.

Discussion
The present study aimed to glean a clearer understanding of the contribution of the general semantic system to social cognition. To this end, we took a neuroimaging meta-analytic approach to assess the degree to which engagement in ToM tasks shares neural correlates with semantic processes. The key findings were as follows.
1. Overall, there was a strikingly large degree of overlap between the activation likelihood maps for ToM and the SCN. This was most evident in the bilateral ATL, the left STS, left MTG, left TPJ, and left IFG, which are all key regions for semantic processing (Binder et al., 2009). This suggests that semantic processes are integral to performing theory of mind tasks.
2. Most differences that emerged were mainly a matter of the extent of regional activation, which is likely driven by discrepancies in the sample size contributing to each ALE map. Nonetheless, there were a few notable exceptions.
3. The right TPJ, anterior aspects of the bilateral MTG, bilateral mPFC, and the bilateral precuneus were consistently identified in ToM but not SCN studies. Significant differences remained even after controlling for methodological factors, including the type of experimental stimuli, input modality and baseline condition used to probe each domain. This is consistent with claims that the function of these regions (e.g., the right TPJ) is tuned towards processing social stimuli.
4. The posterior ITG and dorsal IFG (both in the left hemisphere) were consistently identified in SCN studies but not in ToM studies. This difference was even more pronounced after controlling for stimulus format and modality. One possibility is that this reflects differences in task difficulty, which we did not account for (see Diveica et al., 2021).
5. Activation in bilateral IFG and SFG, irrespective of domain, appears to be driven by stimulus format. Right ATL activation could be driven by input modality. However, there are other uncontrolled methodological confounds that may have also played a role (e.g., task difficulty, processing effort, experiment number differences across domains). These findings highlight the need for future studies, whose aim is to contrast different cognitive domains, to systematically control for these types of methodological factors.
We interpret these results as generally supporting a recent proposal that social cognition draws upon a set of domain-general systems and processes dedicated to semantic cognition (Binney and Ramsey, 2020). We elaborate on these arguments and discuss each of the key findings in the following paragraphs.
The initial ALE maps were treated to a cluster-forming threshold at p < 0.001, and an FWE-corrected cluster-extent threshold at p < 0.05 prior to the conjunction and contrast analyses. The contrast maps in Panel A were additionally thresholded with a cluster-forming threshold at p < 0.001 and a minimum cluster size of 100 mm³. Panel A displays the conjunction alongside statistically significant differences. In Panel B, we have overlaid the binarised versions of the complete ALE maps resulting from independent analysis of VERBAL ToM and VERBAL SCN studies. This allows for full visualisation of the topography of the associated networks (also see Supplementary Figures R4 & R5 and Supplementary Tables R6 & R7). The sagittal and coronal sections are chosen as representative slices positioned over peak coordinates at which there is the greatest conjunction in the bilateral anterior temporal lobes (left y = 12; right y = 14).

Two sides of the same coin? The relationship between semantic cognition and theory of mind
It is argued that progress in social neuroscience theory will rapidly accelerate if the field embraces established models of other, more general domains of cognition (Amodio, 2019; Binney and Ramsey, 2020; Spunt and Adolphs, 2017). Theoretical advances in, for example, the domain of human learning and memory, are not always (immediately) incorporated within the social neuroscience literature, yet they are valuable opportunities to generate new hypotheses and more detailed models of social cognition, both in terms of mechanisms and neural bases (Amodio, 2019). Binney and Ramsey (2020) argue that reflections on theories of semantic cognition could prove particularly fruitful in this regard. They also highlight the striking similarities between the topologies of brain regions activated during neuroimaging studies of social cognition and semantic cognition, drawing particular attention to the ATL, the TPJ (including the angular gyrus and posterolateral temporal lobe), and the inferior frontal cortex. Prior to the present study, however, these activation maps had not been formally compared at the level of the whole brain (see Hodgson et al., 2022 for a region-specific analysis). Our results confirm a large degree of overlap, which raises questions about the nature of the various processes that afford the theory of mind ability (for related discussion, see Arioli and Canessa, 2021; Deschrijver and Palmer, 2020). We specifically argue that it suggests that theory of mind processes involve cognitive mechanisms related to conceptual retrieval and semantic inference.
The initial ALE maps were treated to a cluster-forming threshold at p < 0.001 and an FWE-corrected cluster-extent threshold at p < 0.05 prior to the conjunction and contrast analyses. The contrast maps in Panel A were additionally thresholded with a cluster-forming threshold at p < 0.001 and a minimum cluster size of 100 mm³. Panel A displays the conjunction alongside statistically significant differences. In Panel B, we have overlaid the binarised versions of the complete ALE maps resulting from independent analysis of VERBAL ToM and VERBAL SCN studies. This allows for full visualisation of the topography of the associated networks (also see Supplementary Figures R4 & R5 and Supplementary Tables R6 & R7). The sagittal and coronal sections are chosen as representative slices positioned over peak coordinates at which there is the greatest conjunction in the bilateral anterior temporal lobes (left y = 12; right y = 14).
What does semantic processing contribute to theory of mind? Semantic memory, or conceptual knowledge, refers to a database of the meaning of words, objects, events and behaviours (Lambon Ralph et al., 2017). Thus, it is essential for recognising social signals, both verbal and nonverbal, that provide clues to someone's cognitive or affective state. Moreover, it provides a means of cognitive abstraction that enables inference and representations of complex beliefs and intentions that we cannot directly observe (Adolphs, 2010; Binney and Ramsey, 2020). Finally, it guides the generation of responses that are appropriate to the observed behaviour, having considered the identity and social roles of the other agent or agents, as well as the wider social context. For example, should one see someone appear to laugh at a funeral, one must interpret the audiovisual signals and resolve any potential ambiguities (e.g., could it, in fact, be crying?). Then one must infer their likely mental state, particularly given their identity/role (e.g., the bereaved next of kin), and generate a context-appropriate social response (e.g., in this case, suppression of smiling or laughter). Now imagine the possible consequences of having impaired semantic knowledge (e.g., in semantic dementia, Rouse et al., 2024). Failure to correctly recognise the identity and/or the actions of the agent could lead to a misattribution of mental state, and/or socially inappropriate behaviour.
Fig. 4. Common and differential activation for VISUAL ToM (N = 106) and VISUAL SCN (N = 152). The initial ALE maps were treated to a cluster-forming threshold at p < 0.001, and an FWE-corrected cluster-extent threshold at p < 0.05 prior to the conjunction and contrast analyses. The contrast maps in Panel A were additionally thresholded with a cluster-forming threshold at p < 0.001 and a minimum cluster size of 100 mm³. Panel A displays the conjunction alongside statistically significant differences. In Panel B, we have overlaid the binarised versions of the complete ALE maps resulting from independent analysis of VISUAL ToM and VISUAL SCN studies. This allows for full visualisation of the topography of the associated networks (also see Supplementary Figures R8 & R9 and Supplementary Tables R11 & R12). The sagittal and coronal sections are chosen as representative slices positioned over peak coordinates at which there is the greatest conjunction in the bilateral anterior temporal lobes (left y = 12; right y = 14).
How tightly coupled are theory of mind and semantic processes? We argue that our findings, together with prior patient (Binney et al., 2016a; Ding et al., 2020; Edwards-Lee et al., 1997; Snowden et al., 2018), animal (Klüver and Bucy, 1937) and neuroscientific studies involving healthy populations (Balgova et al., 2022; Diveica et al., 2021), suggest the underlying systems are closely linked (also see Binney and Ramsey, 2020; Olson et al., 2013; Rouse et al., 2024). One possibility is that theory of mind can be considered a case of semantic processes, rather than something distinct, and this means it would operate upon the same basic principles (and neural underpinnings; Binney and Ramsey, 2020). An alternative possibility is that theory of mind draws partly on general semantic processes (e.g., in the act of representing another's cognitive/affective states), but also on distinct, more specialised processes which are supported by regions outside the SCN (e.g., systems involved in detecting the extent to which there is a mismatch between those states and one's own; Arioli and Canessa, 2021; Deschrijver and Palmer, 2020). Akin to this is the finding that the cortical network supporting language processes (which involve activating links between words and meaning) is partially overlapping with, but also separable from, the theory of mind network (Fedorenko and Varley, 2016; Paunov et al., 2022). Our results align more straightforwardly with the second possibility, although they do not rule out the first (see Section 4.2). At the very least, as we shall elaborate in the following paragraphs, our study advances understanding of the precise functional contribution of different brain regions to theory of mind. This level of specificity is sometimes missing from, and is important for development of, existing neurobiological accounts of theory of mind (Saxe and Kanwisher, 2003; Saxe and Wexler, 2005).

Functional fractionation of the 'social brain'
We observed pervasive differences between the activation likelihood maps for ToM and SCN. Specifically, the right TPJ, anterior aspects of the bilateral MTG, bilateral mPFC, and the bilateral precuneus appear more attuned to ToM tasks. All these regions are included in descriptions of putative brain regions specialised for theory of mind (Saxe, 2006; Saxe and Powell, 2006; Schurz et al., 2014, 2017). However, they are also considered part of the default-mode network (DMN) (Andrews-Hanna et al., 2014; Buckner et al., 2008; Spreng et al., 2009; Spreng and Grady, 2010), a resting-state network proposed to support various forms of internally orientated cognition (i.e., cognition that is decoupled from sensory processing; Margulies et al., 2016; Smallwood et al., 2013), including memory-driven cognition (Murphy et al., 2018). The DMN has been explicitly linked to social cognition (Mars et al., 2012; Spreng et al., 2009; Schilbach et al., 2006), although it has also been shown that regions activated by social tasks are, to some degree, distinct from what are considered 'core' regions of the DMN (Jackson, 2021; Jackson et al., 2016; Mars et al., 2012). In the present study, however, it was core DMN regions (especially those around the sagittal midline) that showed differences between semantic cognition and ToM. 'Core' regions have been argued to represent information related to the self and to allow for integration of Self and Other information via interaction with other DMN subsystems (Spreng and Andrews-Hanna, 2015).
Our results shed new light on the relationship between the 'social brain' and domain-general networks by highlighting significant overlap with the SCN. Important clues might also be gleaned from the way in which activation patterns diverge, and the fact that this occurs most notably within the right hemisphere homologues of left-lateralised SCN regions (e.g., the TPJ). One possible account of these observations is that engaging in ToM recruits the SCN plus additional regions that are more tuned to social processes. Alternatively, these regions may all comprise one widely distributed but nonetheless functionally integrated network that exhibits systematic variation in the involvement of some of its nodes (particularly across hemispheres) owing to task-related or stimulus-related factors (e.g., input modality). 'Socialness' of a task (or perhaps the degree of involvement of Self- and Other-related processes; Chiou et al., 2022; Platek et al., 2004; Quesque and Brass, 2019) could be one such task-related factor (Binney and Ramsey, 2020; Pexman et al., 2023). Further research is needed to directly probe these factors and how they drive network involvement within and across domains. In the remainder of this discussion, we expand on debates surrounding the ATL and the TPJ because they are ascribed key roles in both ToM and in semantic cognition.

The role of anterior temporal lobes in theory of mind
Convergent neuropsychological and neuroimaging evidence strongly implicates the ATL in semantic knowledge representations. Semantic knowledge underpins a wide range of meaning-imbued behaviours, including language use, action understanding and interactions with objects (Patterson and Lambon Ralph, 2016; Lambon Ralph et al., 2017). By extension, we argue that the contribution of the ATL to ToM, and to social cognition more generally, is the supply of conceptual level information which constrains inferences about the intentions and actions of other agents (Binney and Ramsey, 2020). The current study revealed reliable overlap between ToM and semantic processing in the ATLs, which supports this hypothesis. The present findings also complement those of prior fMRI studies that directly explored the relationship between social and general semantic processing in the ATL. Across all these studies, two consistent findings have emerged. First, a ventrolateral portion of the left ATL responds equally to socially relevant concepts and more general concepts (both concrete and abstract), and this is irrespective of whether concepts are probed via verbal or pictorial stimuli (Binney et al., 2016b; Rice et al., 2018). The same ventrolateral region also activates during three different verbal and nonverbal ToM tasks, which suggests that conceptual information is accessed during ToM (Balgova et al., 2022). Second, there are some differences between social and general semantic tasks within the dorsolateral ATL (Binney et al., 2016; Rice et al., 2018; also see Arioli et al., 2021; Lin et al., 2018; Lin et al., 2018; Mellem et al., 2016; Ross and Olson, 2010; Zahn et al., 2007), although the location of this difference moves around across studies. Importantly, the differences are small compared to the amount of overlap. Indeed, the ATL subregion differentiating between ToM and SCN in the present study was abutting a much larger left ATL cluster which was activated consistently across both domains (also see Beauchamp, 2015; Deen et al., 2015 for comparisons of social perception with language and voice perception).
This overall pattern is consistent with the graded semantic hub account (Bajada et al., 2019; Binney et al., 2012; Rice et al., 2015a), which characterises the bilateral ATL as a unified representational space, all of which is engaged by the encoding and retrieval of semantic information of any kind. The centre of this hub lies over the ventrolateral ATL and its engagement in semantic processing is largely invariant to stimulus factors (e.g., modality). Towards the edges of this space, however, there are gradual shifts in semantic function such that regions on the periphery are more sensitive to certain types of semantic features (for a computational exploration of this general hypothesis, see Plaut, 2002). Why exactly ToM tasks would engage the dorsolateral ATL more than general semantic tasks is unclear. One possibility is that the meaning conveyed by typical ToM stimuli (i.e., the state of mind of an actor in the absence of explicit descriptors) is not directly observable and, therefore, must be inferred to a greater extent than in a typical semantic task. This may rely heavily on verbally mediated semantic information, which has been shown to engage the dorsolateral ATL more (Binder et al., 2009; Rice et al., 2015a; Visser and Lambon Ralph, 2011). Another possibility is that it reflects a proximity to and strong connectivity with the limbic system (via the uncinate fasciculus; Bajada et al., 2017; Binney et al., 2012; Papinutto et al., 2016) and a role of this ATL region in processing semantic features related to emotion (Olson et al., 2007; Rice et al., 2015a; Vigliocco et al., 2014).
The ventrolateral areas of the ATL implicated in recent studies of semantic processing (Binney et al., 2016b; Rice et al., 2018) and theory of mind (Balgova et al., 2022) sit posterior to Brodmann's area 38, and include the anterior ITG (including its basal surface) as well as the anterior fusiform. In the present study, the ATL subregions implicated were limited to the MTG and STG, and there was no evidence of more ventral involvement. This can be accounted for by the signal distortion and signal loss that are typically observed with conventional forms of the fMRI technique. ATL-optimised distortion-corrected fMRI studies, on the other hand, detect robust ventral ATL activation during both semantic and ToM tasks (Balgova et al., 2022; Binney et al., 2010; Castelli et al., 2000; Devlin et al., 2000; Sharp et al., 2004). This methodological factor may also be particularly important for understanding the lack of left ATL activation for nonverbal stimuli. Prior distortion-corrected fMRI studies have shown that activation to nonverbal stimuli is almost entirely limited to ventral and ventromedial ATL structures, which are the regions that suffer the most from signal dropout (Rice et al., 2015b; Visser et al., 2010a).
There were also differences in the extent to which the right ATL was engaged, with a greater proportion of the right anterior MTG involved in ToM. Moreover, the involvement of the right ATL in semantic processing was dependent on including studies using auditory verbal stimuli. This is in line with prior studies which also found that auditory verbal (or 'spoken') stimuli activate the ATL bilaterally, whereas written stimuli show a left bias (Marinkovic et al., 2003; Rice et al., 2015b). Thus, while ATL involvement in ToM appears always to be bilateral, right-sided involvement in semantic processing appears to be related to stimulus factors. This could be understood more broadly in terms of processing effort. Indeed, auditory semantic stimuli are typically sentences, which require both rapid processing of individual tokens and processing of combinatorial meaning, and which could work the semantic system more vigorously than other types of stimuli (see Visser et al., 2010b for similar arguments). In a similar vein, the bilateral ATL activation during ToM tasks could reflect the semantic richness of stimuli. These observations are, however, not consistent with the right ATL having a distinctly social function (Bonnì et al., 2015; Gainotti, 2015; Gainotti et al., 2003; Gainotti and Marra, 2011; Pobric et al., 2016).

The temporo-parietal junction
The TPJ has been associated with a variety of cognitive domains, including attention, language, and episodic memory, and many of them bilaterally (Binder et al., 2009; Humphreys and Lambon Ralph, 2015; Igelström and Graziano, 2017; Özdem et al., 2017). It is also now becoming clear that these functions fractionate along an anterior-posterior, as well as a dorsal-ventral, axis (Bzdok et al., 2013; Hodgson and Lambon Ralph, 2008; Humphreys and Lambon Ralph, 2015). The present study shows that STS/STG and inferior parietal involvement in ToM is bilateral (Bzdok et al., 2012; Hodgson et al., 2022; Molenberghs et al., 2016; Schurz et al., 2014, 2020). The inferior parietal lobe (including the angular gyrus) is involved in semantic processing bilaterally (Binder et al., 2009; see also Bonner et al., 2013; Kuhnke et al., 2022), whereas posterior MTG/STS involvement is left-lateralised (Jackson, 2021). Taken together, these results suggest that parts of the left TPJ serve a function common to ToM and SCN (Numssen et al., 2020). For example, the left angular gyrus has been implicated in the integration and storage of conceptual knowledge by some authors (Binder et al., 2009; Kuhnke et al., 2020) and attributed a more domain-general role by others (e.g., the multi-sensory buffering of spatio-temporally extended representations; Humphreys et al., 2021; Humphreys and Tibon, 2022). The left MTG/STS appears to be involved in processes that constrain semantic retrieval and which could also be engaged during ToM (Diveica et al., 2021). The right TPJ does not appear to be engaged by semantic processing, which is consistent with claims that it has a selective role in social and moral processing (Numssen et al., 2020; Saxe and Kanwisher, 2003; Saxe and Wexler, 2005; Young et al., 2010). However, the present study cannot rule out involvement in other cognitive domains.

Concluding remarks and future directions
In conclusion, we observed considerable overlap between the cortical networks engaged by semantic tasks and theory of mind tasks. These observations add to a growing set of convergent findings from across neuropsychology, comparative and cognitive neuroscience which suggest this overlap reflects shared underlying processes and, further, that ToM relies in part on processes related to semantic cognition (Binney and Ramsey, 2020). Alternatively, this overlap could, on closer inspection, turn out to reflect tightly yet separately packed cognitive functions that only dissociate when investigated at higher spatial resolutions or at the level of individual participants (Lee and McCarthy, 2016). Further research is needed to explore these alternatives. Furthermore, inferences afforded by functional neuroimaging data are merely correlational and, therefore, the field needs to increasingly turn to patient models such as stroke, temporal lobe epilepsy, and frontotemporal dementia (Kumfor et al., 2017a; Kumfor et al., 2017b; Rankin, 2020, 2021), and to non-invasive techniques like transcranial magnetic stimulation, to directly probe whether certain brain regions are necessary for both social and semantic cognition.

Fig. 1. Common and differential activation for ToM (N = 113) and SCN (N = 211). Panel A displays the conjunction alongside statistically significant differences revealed by the contrast analyses. The contrast maps in Panel A were thresholded with a cluster-forming threshold at p < 0.001 and a minimum cluster size of 100 mm³. In Panel B, we have overlaid the binarised versions of the complete ALE maps resulting from independent analysis of ToM and semantic cognition studies. This allows for full visualisation of the topography of the two networks and consideration of the relationship between them (also see Supplementary Figure R1 and Supplementary Table R1). The independent ALE maps were thresholded with a cluster-forming threshold at p < 0.001 and an FWE-corrected cluster-extent threshold at p < 0.05. The sagittal and coronal sections are chosen as representative slices positioned over peak coordinates at which there is the greatest conjunction in the bilateral anterior temporal lobes (left y = 12; right y = 14).
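The conjunction and binarised overlay described in this caption can be illustrated with a minimal sketch. It assumes that each thresholded ALE map is stored as an array that is zero outside significant clusters, that the conjunction is taken as the voxel-wise minimum of the two maps (a common choice in ALE software, not confirmed here as the authors' exact procedure), and that voxels are 2 mm isotropic. The toy values and all function names are hypothetical.

```python
import numpy as np

def conjunction(map_a, map_b):
    """Voxel-wise minimum: non-zero only where both thresholded maps are non-zero."""
    return np.minimum(map_a, map_b)

def binarise(ale_map):
    """1 where a voxel survived thresholding, 0 elsewhere."""
    return (ale_map > 0).astype(np.uint8)

def overlay_labels(bin_a, bin_b):
    """Label voxels for display: 0 = neither, 1 = A only, 2 = B only, 3 = both."""
    return bin_a + 2 * bin_b

def min_voxels_for_volume(min_volume_mm3, voxel_size_mm):
    """Number of voxels needed to reach a minimum cluster volume."""
    return int(np.ceil(min_volume_mm3 / voxel_size_mm ** 3))

# Toy one-dimensional "maps" standing in for thresholded ALE volumes (assumed values).
tom = np.array([0.0, 0.02, 0.03, 0.0, 0.01])
scn = np.array([0.0, 0.01, 0.0, 0.04, 0.03])

conj = conjunction(tom, scn)
labels = overlay_labels(binarise(tom), binarise(scn))
k_min = min_voxels_for_volume(100, 2.0)  # 100 mm^3 at an assumed 2 mm isotropic voxel size
```

Under the assumed 2 mm voxel size, the 100 mm³ minimum corresponds to 13 voxels; a different voxel size would change this count.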

Fig. 2. Common and differential activation for VERBAL ToM (N = 46) and VERBAL SCN (N = 175). The initial ALE maps were thresholded with a cluster-forming threshold at p < 0.001 and an FWE-corrected cluster-extent threshold at p < 0.05 prior to the conjunction and contrast analyses. The contrast maps in Panel A were additionally thresholded with a cluster-forming threshold at p < 0.001 and a minimum cluster size of 100 mm³. Panel A displays the conjunction alongside statistically significant differences. In Panel B, we have overlaid the binarised versions of the complete ALE maps resulting from independent analysis of VERBAL ToM and VERBAL SCN studies. This allows for full visualisation of the topography of the associated networks (also see Supplementary Figures R4 & R5 and Supplementary Tables R6 & R7). The sagittal and coronal sections are chosen as representative slices positioned over peak coordinates at which there is the greatest conjunction in the bilateral anterior temporal lobes (left y = 12; right y = 14).

Fig. 3. Common and differential activation for NON-VERBAL ToM (N = 71) and NON-VERBAL SCN (N = 37). The initial ALE maps were thresholded with a cluster-forming threshold at p < 0.001 and an FWE-corrected cluster-extent threshold at p < 0.05 prior to the conjunction and contrast analyses. The contrast maps in Panel A were additionally thresholded with a cluster-forming threshold at p < 0.001 and a minimum cluster size of 100 mm³. Panel A displays the conjunction alongside statistically significant differences. In Panel B, we have overlaid the binarised versions of the complete ALE maps resulting from independent analysis of NON-VERBAL ToM and NON-VERBAL SCN studies. This allows for full visualisation of the topography of the associated networks (also see Supplementary Figures R4 & R5 and Supplementary Tables R6 & R7). The sagittal and coronal sections are chosen as representative slices positioned over peak coordinates at which there is the greatest conjunction in the bilateral anterior temporal lobes (left y = 12; right y = 14).

Table 2
Conjunction and contrast analyses of the ToM (N = 113) and SCN (N = 211) experiments.