Unraveling the functional attributes of the language connectome: crucial subnetworks, flexibility and variability

Language processing is a highly integrative function, intertwining linguistic operations (processing the language code intentionally used for communication) and extra-linguistic processes (e

To combine both effectiveness (Gibson et al., 2019) and utility (Jaeger and Tily, 2011) of language production and comprehension, several essential abilities are required. The first is combinatory skill (Boer et al., 2012; Friederici et al., 2017; Zuidema and de Boer, 2018). Language is compositional and recursive, implying specialized processing of intra-linguistic aspects (i.e., an aptitude to handle various combinatorics: perceptive, syntactic, or semantic/conceptual; Pylkkänen, 2019). A second important ability relates to multisensory integration, which facilitates spoken communication and enhances speech intelligibility (Chandrasekaran et al., 2009; Ghazanfar and Schroeder, 2006; Luo et al., 2010; Noppeney et al., 2008; Schroeder and Foxe, 2005; Schwartz et al., 2004; Sumby and Pollack, 1954). Beyond multisensory facilitation (or low-level multimodal integration; Holler and Levinson, 2019), high-level cognitive abilities related to top-down multimodal mechanisms have also been emphasized. A shared understanding, and a relevant and contextually adapted discourse, require aligning the partners' representations, considering shared knowledge and past experiences, or even making assumptions about the other's perspective. Establishing "common ground" between conversational partners (Clark and Marshall, 1981) relies on a wide range of "high-level" cognitive functions such as working memory (resonance-based theory of common ground; Horton, 2007), long-term memory (Brown-Schmidt and Duff, 2016), and theory of mind or mentalizing (Vanlangendonck et al., 2018). Language use is therefore adapted online, enabling communication in a range of environmental and social contexts, meeting various cognitive demands and individual metacognitive needs. Given the pressure exerted by communication, cognitive and metacognitive demands, language has evolved as an adaptive and complex system, which requires considering external (context) and internal (individual needs and goals) signals while processing both intra- and extra-linguistic signals (e.g., Holler and Levinson, 2019, for a multimodal language-in-situ framework).
How are these various abilities integrated to sustain language functions, and how are they implemented in the brain? The exploration of brain networks, which offer a unique lens for understanding cognitive function, has become an important part of the cognitive neuroscience landscape (Fornito et al., 2013). Brain network descriptions have revealed that the brain is organized as a "small-world" network (Achard and Bullmore, 2007), favoring optimization of information transfer (Laughlin and Sejnowski, 2003). This organization is characterized by a balance between segregation and integration: short communication pathways create specialized subsystems (segregation), whose interconnectivity is coordinated by distant, highly connected brain regions (integration; van den Heuvel and Sporns, 2013a). Functionally, local systems or highly connected "modules" for the processing of information in a given modality (visual, auditory, etc.) are linked together by sparse and specific fiber paths over long distances, according to a connectivity principle of "local richness and long-range sparseness" (Pulvermüller, 2018). This organization allows efficient serial, parallel, and distributed brain activity (Herbet and Duffau, 2020). Integrative areas (sometimes referred to as connector hubs) at the interface between local systems are significant for the multimodal neural integration of information (Cocchi et al., 2013; Fornito et al., 2015; van den Heuvel and Sporns, 2013b). Regarding language circuitry, integration/segregation subsystems and specific connector hubs have been previously identified for both language production and comprehension (e.g., Friederici, 2012; Hagoort, 2016; Hertrich et al., 2020; Roger et al., 2022). However, language is multi-faceted, and a comprehensive analysis of the functional properties of the language connectome in a broader framework would contribute to a more accurate description.
To fill this gap, we propose in this study to examine the functional attributes of language at the brain level through an integrative perspective, combining several linguistic tasks explored in the light of graph theory. This research is naturally framed within the substantial legacy of the study of language and its brain foundations, whose growing and diverse observations have been accompanied by the evolution of investigative techniques. In the past decades, many authors have highlighted brain function and structure associated with language through theoretical neurocognitive models (e.g., Duffau et al., 2014; Friederici et al., 2017; Hagoort, 2016, 2019; Hickok and Poeppel, 2007; Indefrey, 2011; Levelt, 1989; Price, 2012; Rauschecker and Scott, 2009). This study, in direct continuity with that legacy, adds further value by investigating functional cerebral connectivity (FC) based on task data. The anatomo-functional substrates associated with language are indeed highly task-dependent (Hickok and Poeppel, 2000).
The task-based FC analyses presented here rely on an fMRI database compiling a broad spectrum of language-related tasks (Interactive networks of Language database: InLang database). InLang comprises thirteen language tasks, performed cross-sectionally by 150 right-handed neurotypical adults. The database is unique in that it covers a broad spectrum of language features: semantic and conceptual processing, decoding (phonology, sound), lexico-syntactic formulation (production), dialogality (social aspects of language), monitoring of self and others, and unintentional speech (Fig. 1A; see also the Supplementary Material for more details). Such a database is essential to uncover the functional architecture of the multifaceted language processes in an integrative approach. It makes it possible to model a comprehensive connectomic atlas of language and to explore its fundamental properties in depth. To this end, we analyzed task connectomes (where interregional FC was estimated by beta-series correlations) using graph metrics applied at multiple scales. This allowed us to expose: (1) the overarching FC profile of different language tasks and latent subprocesses; (2) the architecture of the general language connectome (LANG); (3) the functional roles of crucial language subnetworks and brain regions; (4) the anatomo-functional correlates; and (5) the flexibility and variability exerted on the LANG connectome. In terms of interindividual FC variability, gender (e.g., Filippi et al., 2012; Zhang et al., 2018) and age (e.g., Edde et al., 2021; Jockwitz and Caspers, 2021) are potentially important criteria to consider. Without any a priori hypothesis on the pattern of potential changes related to these factors or their behavioral relevance, we attempted to identify, in an exploratory way, the main variations in the distribution of key regions belonging to the LANG connectome.
Fig. 1 provides an overview of the InLang database and the methodology used to address these 5 main axes.

Task-based connectomes
Regions of interest (ROIs) covering both the brain and the cerebellum were defined as spherical regions of 6 mm radius built around the 264 coordinates in MNI space proposed by Power et al. (2011). The images used for signal extraction (beta values) were the statistical parametric maps containing the linear contrasts between the HRF parameter estimates for the conditions of interest. Nilearn (https://nilearn.github.io/stable/index.html; Abraham et al., 2014) was used to delineate ROIs and extract the signal. The mean signal for each of these ROIs was extracted by participant and task. Functional connectivity (FC) between brain ROIs was derived by correlating (Pearson correlations) the extracted beta values across subjects, within a given task. The process was repeated for each task separately, yielding thirteen 264 × 264 matrices of interregional FC based on beta-series correlation (codes and derivatives are available here: https://osf.io/6xm8n/). These matrices of interregional FC were thresholded to obtain the task-based connectomes. We applied a 5% threshold, retaining the 5% highest positive correlation values, which were considered to represent non-spurious internodal connections. The matrices were binarized: 1 was assigned to internode connections that survived the density threshold, and 0 otherwise. Task-based connectomes were built from these binarized matrices, reduced to a fixed number of edges (top 5%, 3485 edges). Several graph metrics were computed on the task-based connectomes at the global (network-wide), intermediate (modularity), and local (nodal) levels, using Networkx (https://networkx.org/). The metrics are described in the following subsections, and Fig. S2 (Appendix) shows the evolution of the main graph metrics as a function of the density of connections included in the connectomes (i.e., as a function of more or less permissive thresholding).

Fig. 1. (A) [...] S2 contains the subjects' characteristics, by task. (B) Summary of the steps performed to obtain the task connectomes. For a given task, we extracted the beta values from the individual functional activation maps, on a parcellation covering the whole brain (Power et al., 2011). The beta values were then used to compute the task-specific connectivity matrix (correlation matrix). The same procedure was repeated for all tasks to obtain the respective functional matrices and connectomes. (C) Outline of the multi-level statistical analyses performed on functional connectivity (FC) measures (i.e., graph theory parameters) to address 5 main axes.
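The thresholding and binarization step described in this section can be sketched as follows. This is a minimal illustration in plain Python, not the authors' code (which is available at the OSF link above). The function name `binarize_top_density` is hypothetical, and the density is taken here as a fraction of the N(N-1)/2 possible undirected edges, which is an assumption and may differ from the convention behind the edge count reported in the text.

```python
from itertools import combinations


def binarize_top_density(fc, density=0.05):
    """Keep the strongest positive correlations and return a binary adjacency matrix.

    fc      : symmetric list of lists of correlation values (ROI x ROI)
    density : fraction of the possible undirected edges to keep
              (assumed convention, see the lead-in)
    """
    n = len(fc)
    # Candidate edges: upper triangle only, positive correlations only.
    candidates = [(fc[i][j], i, j)
                  for i, j in combinations(range(n), 2)
                  if fc[i][j] > 0]
    candidates.sort(reverse=True)  # strongest correlations first
    n_keep = round(density * n * (n - 1) / 2)

    adj = [[0] * n for _ in range(n)]
    for _, i, j in candidates[:n_keep]:
        adj[i][j] = adj[j][i] = 1  # surviving connection
    return adj


# Toy 4-ROI correlation matrix (hypothetical values).
toy_fc = [
    [1.0, 0.9, 0.2, -0.5],
    [0.9, 1.0, 0.4, 0.1],
    [0.2, 0.4, 1.0, 0.8],
    [-0.5, 0.1, 0.8, 1.0],
]
toy_adj = binarize_top_density(toy_fc, density=0.5)
```

With a fixed density, every task connectome ends up with the same number of edges, so that graph metrics can be compared across tasks without a confound from edge count.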

Global connectivity profiles / Main FC parameters
After modeling the different task-based connectomes, we extracted four main and complementary graph parameters from them. These parameters were computed at the local level (i.e., at the node or ROI level) and then averaged at the global level (i.e., at the connectome or graph level), allowing network-wide estimates. The four main global metrics used to describe each of the task-based connectomes were:
-The mean global efficiency (E_glob) as proposed by Latora and Marchiori (2001), which is the average of the unweighted efficiencies over all pairs of nodes:
$$E_{glob}(G) = \frac{1}{N(N-1)} \sum_{i \neq j \in G} \frac{1}{d(i,j)}$$
where N is the total number of nodes in the network G, and the distance d(i,j) corresponds to the number of edges in a shortest path between any two nodes i and j. E_glob represents the capacity of a given network to efficiently integrate and transmit information between the network components or subnetworks (e.g., Bullmore and Sporns, 2012; Roger et al., 2019b; Stanley et al., 2015). The higher the value, the more likely that information transfer is fast.
-The average local efficiency (E_loc), which is the average of the local efficiencies of each node. The local efficiency of a node i corresponds to the global efficiency of the subgraph $G_i$ induced by the neighbors of i (Latora and Marchiori, 2001):
$$E_{loc} = \frac{1}{N} \sum_{i \in G} E_{glob}(G_i)$$
The average E_loc reveals the network's tendency to effectively share information within immediate local communities, or the capacity of a given network to segregate the information processing (e.g., Bullmore and Sporns, 2012; Roger et al., 2019b; Stanley et al., 2015). The higher the value, the more locally efficient the network is.
-The mean integration-segregation balance (I:S), expressed as the difference between E_glob and E_loc (E_glob - E_loc). The integration-segregation balance makes it possible to estimate how the functional organization of a task promotes either (1) more independent processing of specialized subsystems (i.e., segregation) or (2) cooperation between different subsystems (i.e., integration; Wang et al., 2021). A positive balance reflects a network with a general tendency toward functional integration (E_glob > E_loc), while a negative balance reflects a general tendency toward functional segregation (E_loc > E_glob).
-The mean geodesic distance (d) of the functional connections. We extracted the relative spatial layout of regions along the cortical surface by using existing scripts (https://github.com/margulies/topography/tree/master/utils), based on an algorithm developed to approximate the exact geodesic distance from triangular meshes (Oligschläger et al., 2017). "Physical" geodesic distances between pairs of nodes, estimated in mm, were quantified from the Power node coordinates projected onto a template surface mesh (fsaverage5). This resulted in a node-by-node matrix of geodesic cortical distance. From the geodesic distance matrix, we averaged the distribution of distance-to-connected-areas of the relevant functional connections identified from the thresholded FC matrices. We thus obtained the global geodesic distance of the functional connectivity for each of the task-related connectomes. The higher the mean geodesic distance between functionally connected regions, the further apart the connected regions are on average along the cortical surface.
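The efficiency-based metrics defined above (E_glob, E_loc and the I:S balance) can be illustrated with a short sketch. It is written in plain Python for transparency, whereas the study used Networkx; the function names and the dict-of-sets graph representation are choices made here for illustration only. Pairs of nodes with no connecting path contribute 0 to the efficiency sum.

```python
from collections import deque


def shortest_path_lengths(adj, source):
    """Breadth-first search: number of edges from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def global_efficiency(adj):
    """Average of 1/d(i, j) over all ordered pairs of distinct nodes."""
    n = len(adj)
    if n < 2:
        return 0.0
    total = 0.0
    for i in adj:
        dist = shortest_path_lengths(adj, i)
        total += sum(1.0 / d for j, d in dist.items() if j != i)
    return total / (n * (n - 1))


def local_efficiency(adj):
    """Mean over nodes of the global efficiency of each node's neighbor subgraph."""
    values = []
    for i in adj:
        nbrs = adj[i]
        # Subgraph induced by the neighbors of i (node i itself is excluded).
        sub = {u: adj[u] & nbrs for u in nbrs}
        values.append(global_efficiency(sub))
    return sum(values) / len(values)


def integration_segregation_balance(adj):
    """I:S balance, i.e. E_glob minus E_loc."""
    return global_efficiency(adj) - local_efficiency(adj)


# Toy graph: a triangle (0, 1, 2) with a pendant node 3 attached to node 2.
toy_graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
e_glob = global_efficiency(toy_graph)
e_loc = local_efficiency(toy_graph)
balance = integration_segregation_balance(toy_graph)
```

On this toy graph the balance is positive, which under the definition above would be read as a tendency toward integration.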

General LANG connectome and subprocesses
From the main global graph theory (GT) parameters described in the previous section (extracted at the connectome level, for each task), we assessed the similarity of the FC profiles of the language task connectomes. The similarity was quantified using the Euclidean distance. We applied a data-driven hierarchical clustering approach to the similarity matrix and estimated the partitioning. Thus, we identified categories of tasks with similar FC profiles or global network topology. The internal composition and nature of the tasks assigned to each of the identified groups can reveal putative linguistic subprocesses that may be latent and common to several tasks (see also the Supplementary Material for a rationale and a similar method applied to functional activation maps). Starting from the partitioning, we computed the FC matrices of each of these main task groups (hereafter referred to as subprocess connectomes) from the scans of the respectively involved tasks, using the same procedure described above (Section 2.2). In addition, and still following the same procedure, we generated the general task-based language connectome (abbreviated LANG), corresponding to the FC matrix derived from all language tasks. The global FC parameters reported in the previous section (Section 2.3.1) were also extracted for the LANG and the respective subprocess connectomes.
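As an illustration of the similarity step, the pairwise Euclidean distances between task profiles could be computed as below. This is only a sketch: the function name, the task labels and the numeric values are hypothetical, and no rescaling of the metrics is applied here because the text does not say whether one was used. The resulting distances would then be passed to a hierarchical clustering routine.

```python
import math


def profile_distances(profiles):
    """Pairwise Euclidean distances between task profiles.

    profiles : dict mapping a task label to a tuple of global graph metrics
               (all tuples must have the same length)
    Returns a dict keyed by (task_a, task_b).
    """
    labels = sorted(profiles)
    return {(a, b): math.dist(profiles[a], profiles[b])
            for a in labels for b in labels}


# Hypothetical two-metric profiles for three tasks.
toy_profiles = {
    "T1": (0.0, 0.0),
    "T2": (3.0, 4.0),
    "T3": (0.0, 1.0),
}
toy_distances = profile_distances(toy_profiles)
```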
Note that only scans corresponding to the young/middle-aged adults of the InLang cohort (n = 114) were considered for the calculation of the FC matrices (of the tasks, subprocesses, and the general LANG connectome) and the main statistical analyses. Scans of "older" adults were included exclusively for statistics regarding the effect of age on LANG hubs (see Section 2.4.5 for a description of the age-related variability analysis).

Intermediate composition of LANG
We performed modularity analyses to determine the community structure of the general LANG connectome. We used the Louvain community detection algorithm (Blondel et al., 2008) implemented in Networkx. Louvain's method is widely used for community detection in neuroscience and has previously been shown to outperform other community detection methods (Yang et al., 2016). To ensure the stability of the final partition, we repeated the modular partitioning process 100 times (Schedlbauer and Ekstrom, 2019) and evaluated the best LANG partition (Aynaud, 2018) on the matrix averaging the results of all iterations. Each ROI was assigned to a specific community (i.e., a subnet, here denoted LANG "Net"). To facilitate subsequent analyses and interpretations, the Power coordinates of LANG ROIs were mapped to the HCP multimodal parcellation (version 1.0: HCP_MMP1.0, proposed by Glasser et al., 2016), which consists of 180 cortical parcels per hemisphere. Cerebellum coordinates were mapped to the probabilistic human cerebellum atlas SUIT (Diedrichsen et al., 2009).
Still concerning intermediate configurations between the global level of the network and the nodal ROI level, we focused on identifying densely interconnected subgraphs of LANG. Complete subgraphs, called cliques, are all-to-all connected sets of brain regions providing an architecture that isolates information transmission processes (Giusti et al., 2016) and supports efficient and specialized processing (Sizemore et al., 2018). A maximal clique is one to which no further node can be added; we focused on the largest one. Using Networkx, we estimated ω(G), the number of nodes in the largest maximal clique of G, and marked the relevant nodes.
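To make the clique definitions concrete, the sketch below enumerates maximal cliques with the basic Bron-Kerbosch recursion (without pivoting) and takes the size of the largest one. It is a plain-Python illustration, not the Networkx routine used in the study, and the function names are chosen here for illustration.

```python
def maximal_cliques(adj):
    """Yield every maximal clique of an undirected graph (dict node -> set of neighbors)."""
    def expand(r, p, x):
        # r: current clique, p: candidates that can extend it, x: already explored nodes.
        if not p and not x:
            yield r  # nothing can be added: r is maximal
            return
        for v in list(p):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    yield from expand(set(), set(adj), set())


def clique_number(adj):
    """Number of nodes in the largest clique (0 for an empty graph)."""
    return max((len(c) for c in maximal_cliques(adj)), default=0)


# Toy graph: a triangle (0, 1, 2) with a pendant node 3 attached to node 2.
toy_graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
toy_cliques = sorted(sorted(c) for c in maximal_cliques(toy_graph))
toy_omega = clique_number(toy_graph)
```

In this toy graph both {0, 1, 2} and {2, 3} are maximal, but only the first reaches the clique number.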

Nodal properties of LANG
We calculated the degree centrality (DC; denoted k) of each node i as the number of edges adjacent to the node (k_i), from the reduced and unweighted LANG adjacency matrix. DCs are a convenient metric for highlighting brain regions with a high degree of connectivity, or "hubs", and form the basis for other nodal graph theory measures.
From the DCs (k), we computed the within-component degree z-score (z_i), which expresses the extent to which node i is connected to other nodes in its respective component and is calculated as follows (Guimerà and Nunes Amaral, 2005):
$$z_i = \frac{k_{iS} - \bar{k}_S}{\sigma_{k_S}}$$
with $k_{iS}$ the number of connections of node i to the other nodes in the subgraph component S (i.e., the Net), and $\bar{k}_S$ and $\sigma_{k_S}$ respectively the mean and SD of the within-component DC over all nodes in S.
To quantify to what extent a node connects across all components, we measured the participation coefficient (PC_i). The following conventional formula (Guimerà and Nunes Amaral, 2005) was applied, with m the set of components S or Nets (here 4):
$$PC_i = 1 - \sum_{S \in m} \left( \frac{k_{iS}}{k_i} \right)^2$$
Following Schedlbauer and Ekstrom (2019), and because of the narrow distribution of the PCs, we z-scored the coefficients (zPC_i) from each network.
The z_i and zPC_i values made it possible to assign a specific role to each of the LANG ROIs. The nodes were classified according to their type of functional communication within the connectome as follows: connector (high z_i / high zPC_i; high intra-Net and high inter-Net FC); provincial (high z_i / low zPC_i; high intra-Net FC); satellite (low z_i / high zPC_i; high inter-Net FC); or peripheral (low z_i / low zPC_i; low intra-Net and low inter-Net FC). We applied this classification as proposed in previous studies (Bertolero et al., 2015; Cohen and D'Esposito, 2016; van den Heuvel and Sporns, 2013b), with z_i > 0 corresponding to "high z_i" and zPC_i > 0 corresponding to "high zPC_i".
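The nodal role assignment can be sketched as follows. This is an illustrative plain-Python version under stated assumptions: the population SD is used for the z-scores, nodes in a component with zero variance receive z = 0, and the participation coefficients are z-scored over all nodes passed to `zscore` (the text does not fully specify these choices). All function names are hypothetical.

```python
from statistics import mean, pstdev


def within_module_z(adj, module):
    """z_i: within-module degree of each node, z-scored inside its own module."""
    k_in = {i: sum(1 for j in adj[i] if module[j] == module[i]) for i in adj}
    z = {}
    for s in set(module.values()):
        members = [i for i in adj if module[i] == s]
        vals = [k_in[i] for i in members]
        mu, sd = mean(vals), pstdev(vals)  # population SD (assumption)
        for i in members:
            z[i] = (k_in[i] - mu) / sd if sd > 0 else 0.0
    return z


def participation(adj, module):
    """PC_i = 1 - sum over modules S of (k_iS / k_i)^2; 0 for isolated nodes."""
    pc = {}
    for i in adj:
        k = len(adj[i])
        counts = {}
        for j in adj[i]:
            counts[module[j]] = counts.get(module[j], 0) + 1
        pc[i] = 1.0 - sum((c / k) ** 2 for c in counts.values()) if k else 0.0
    return pc


def zscore(values):
    """z-score a dict of node values over all nodes (assumed scope)."""
    vals = list(values.values())
    mu, sd = mean(vals), pstdev(vals)
    return {i: (v - mu) / sd if sd > 0 else 0.0 for i, v in values.items()}


def node_role(z, zpc):
    """Role from the sign of z_i and zPC_i, following the classification above."""
    if z > 0:
        return "connector" if zpc > 0 else "provincial"
    return "satellite" if zpc > 0 else "peripheral"


# Toy graph: node 0 is the center of a star in module A and also links to module B.
toy_graph = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0, 5}, 5: {4}}
toy_module = {0: "A", 1: "A", 2: "A", 3: "A", 4: "B", 5: "B"}
toy_z = within_module_z(toy_graph, toy_module)
toy_pc = participation(toy_graph, toy_module)
toy_zpc = zscore(toy_pc)
toy_roles = {i: node_role(toy_z[i], toy_zpc[i]) for i in toy_graph}
```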
Finally, we were interested in quantifying the rich club organization of the networks. A rich club is a set of nodes whose level of interconnectivity (i.e., richness) exceeds the level of FC that can be expected by chance. For each degree k, the rich-club coefficient Φ(k) is the ratio of the number of actual edges to the number of potential edges among nodes with degree greater than k (Colizza et al., 2006):
$$\Phi(k) = \frac{2E_k}{N_k(N_k - 1)}$$
where $N_k$ is the number of nodes with degree larger than k, and $E_k$ is the number of edges among those nodes. We compared and normalized the rich club coefficient to sets of "equivalent" random networks. An empirical null value, $\Phi_{rand}(k)$, was obtained as the average over 1000 random networks of equal size and degree distribution.
The ratio between $\Phi(k)$ and $\Phi_{rand}(k)$ gave the normalized rich club coefficient $\Phi_{norm}(k)$:
$$\Phi_{norm}(k) = \frac{\Phi(k)}{\Phi_{rand}(k)}$$
In line with previous work (Colizza et al., 2006; Grayson et al., 2014; van den Heuvel and Sporns, 2011), a network was considered to have rich club organization when $\Phi_{norm}$ was greater than 1 for a continuous range of increasing k (rich club regime). Rich club nodes were brain regions taking part in these densely connected networks (or rich clubs), forming a functional unit. We considered as rich club hubs the nodes taking part in the club at the value of k where the strongest rich club effect was observed.
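A minimal sketch of the rich-club computation is given below, in plain Python. The generation of degree-preserving random networks is not reproduced here; `normalized_rich_club` simply takes a list of coefficients already computed on random networks, which is an assumption about how the pieces would be wired together. Function names are illustrative.

```python
from statistics import mean


def rich_club_coefficient(adj, k):
    """Density of connections among nodes whose degree is greater than k.

    Returns None when fewer than two nodes have degree > k (coefficient undefined).
    """
    rich = {i for i in adj if len(adj[i]) > k}
    n_k = len(rich)
    if n_k < 2:
        return None
    # Each edge inside the club is counted once from each endpoint.
    e_k = sum(len(adj[i] & rich) for i in rich) // 2
    return 2 * e_k / (n_k * (n_k - 1))


def normalized_rich_club(phi, null_values):
    """Observed coefficient divided by the mean coefficient of the random networks."""
    return phi / mean(null_values)


# Toy graph: a triangle (0, 1, 2) with a pendant node 3 attached to node 2.
toy_graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
phi_k0 = rich_club_coefficient(toy_graph, 0)
phi_k1 = rich_club_coefficient(toy_graph, 1)
phi_k2 = rich_club_coefficient(toy_graph, 2)
# Hypothetical null coefficients, for illustration only.
phi_norm_k1 = normalized_rich_club(phi_k1, [0.5, 0.5])
```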

Hemispheric asymmetry
The FC hemispheric asymmetry of the ROIs was estimated with the DCs. We derived a connectivity-based lateralization index (LI) by contrasting the k values of homotopic nodes (comparison of FC between mirror areas), according to the following formula:
$$LI = \frac{LH(k) - RH(k)}{LH(k) + RH(k)}$$
with LH(k) being the DC of the ROI in the left hemisphere and RH(k) the DC of the homotopic ROI in the right hemisphere.
We also calculated global lateralization indexes at the connectome or Net level by averaging the corresponding nodal LIs. LI values can range continuously from -1 to +1, and the following landmarks were considered for interpretation: -1 = complete RH dominance; +1 = complete LH dominance; and between -0.2 and +0.2 = no clear dominance (Roger et al., 2019a; Rolinski et al., 2020; Seghier, 2008).
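The lateralization index and its interpretation landmarks can be sketched as follows. The function names are illustrative, and returning 0 when both degrees are zero is an assumed convention that the text does not specify.

```python
def lateralization_index(k_left, k_right):
    """Connectivity-based LI from the degrees of two homotopic ROIs."""
    total = k_left + k_right
    if total == 0:
        return 0.0  # assumed convention when neither ROI has any connection
    return (k_left - k_right) / total


def dominance(li, cutoff=0.2):
    """Interpretation landmark: LH, RH, or no clear dominance."""
    if li > cutoff:
        return "LH"
    if li < -cutoff:
        return "RH"
    return "none"


# Hypothetical degree values for three homotopic pairs.
li_left = lateralization_index(30, 10)
li_right = lateralization_index(0, 12)
li_balanced = lateralization_index(11, 9)
```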

Functional and structural matching
We used mappings provided by previously published tools to estimate the spatial concordance between LANG and (1) the neurotransmitter pathways; or (2) the terminations of large white matter (WM) bundles. The "functional" maps were issued from nuclear imaging-derived neurotransmitter maps implemented in the JuSpace toolkit (Dukart et al., 2021), specifically designed to link neuroimaging (MRI data) with underlying neurotransmitter information (as revealed by PET and SPECT tracers). The "structural" maps came from the deep-learning algorithm TractSeg (Wasserthal et al., 2018), which offers the segmentation of the main long-range WM brain bundles. It also allows the generation of grey matter masks that are linked by the bundles (ending masks). These ending masks were used here to define the structural connection maps of each bundle or combination of bundles.
The functional and structural maps were registered to the surface template and binarized. Each parcel of the HCP_MMP1.0 template was coded according to the presence/absence of map coverage, with 1 corresponding to at least 40% coverage of the parcel surface (> 40%), and 0 to less than 40% coverage (< 40%).
We then used the simple matching coefficient (SMC; Boriah et al., 2008; Sokal and Michener, 1958) to quantify the spatial concordance between LANG and each of the functional and structural binary maps. The SMC is the ratio between the number of mutual presences and mutual absences and the length of the binary sequences: 0% means that the labels have nothing in common and 100% that the sequences are identical. Only coefficients exceeding 2/3 of the total agreement (SMC > 0.67) were considered relevant.
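The parcel coding and the simple matching coefficient can be illustrated with the short sketch below. The function names and the coverage values are hypothetical; coverage is expressed as a fraction of the parcel surface, and a parcel at exactly 40% is coded 1 here, following the "at least 40%" wording.

```python
def binarize_coverage(coverage, threshold=0.40):
    """Code each parcel 1 if the map covers at least `threshold` of its surface, else 0."""
    return [1 if c >= threshold else 0 for c in coverage]


def simple_matching(a, b):
    """Share of positions where two binary sequences agree (mutual 1s and mutual 0s)."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Hypothetical coverage of four parcels by a map, and LANG membership of the same parcels.
toy_map = binarize_coverage([0.75, 0.10, 0.40, 0.95])
toy_lang = [1, 0, 0, 1]
toy_smc = simple_matching(toy_map, toy_lang)
is_relevant = toy_smc > 0.67
```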

Cross-processes flexibility
We computed a flexibility index using a multilayer network model, with a method close to that of Betzel et al. (2017). The layers of the model were constituted from the matrices of the 5 groups of tasks (i.e., subprocesses) identified with data-driven clustering analyses (see Section 2.3.2). To keep a common reference between layers, the matrices were restricted to the 131 LANG ROIs and re-estimated on this basis. We applied the generalized Louvain package (Jeub et al., 2011), suited to determining community structure in multiplex graphs (Bassett et al., 2011, 2013; Mucha et al., 2010). This method has the advantage of preserving the community labels consistently across layers (here the task groups), thus avoiding the issue of community matching (Yang et al., 2021).
From the communities assigned across layers, we calculated a flexibility score as previously proposed by Bassett et al. (2011). The flexibility f_i of a node corresponds to the number of times that the node changes its modular assignment between layers, normalized by the total number of possible changes (i.e., the total number of layers minus 1, here 4). In short, the f-score reflects the frequency with which a brain region changes its community assignment. It ranges from 0 to 1, where 0 corresponds to a region that never changes module whatever the subprocess/task involved (stable on all layers), and 1 corresponds to a region that never belongs to the same module on the 5 layers. We also calculated the mean flexibility (F) over all nodes in the network to examine the global flexibility of the system.
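The flexibility score can be sketched as below. The community labels are assumed to be already consistent across layers (as provided by the multilayer community detection), and changes are counted between consecutive layers in the order given, which is an assumption about how the layers are compared. Function names and label values are hypothetical.

```python
from statistics import mean


def flexibility(labels):
    """Share of layer-to-layer transitions at which a node changes community.

    labels : community label of one node in each layer, in layer order
    """
    if len(labels) < 2:
        raise ValueError("at least two layers are required")
    changes = sum(a != b for a, b in zip(labels, labels[1:]))
    return changes / (len(labels) - 1)


def mean_flexibility(assignments):
    """Global flexibility F: mean of the nodal scores over all nodes."""
    return mean(flexibility(labels) for labels in assignments.values())


# Hypothetical community labels of three nodes across 5 layers.
toy_assignments = {
    "stable": [1, 1, 1, 1, 1],
    "switching": [1, 2, 1, 2, 1],
    "one_change": [1, 1, 2, 2, 2],
}
toy_f = {node: flexibility(labels) for node, labels in toy_assignments.items()}
toy_global_f = mean_flexibility(toy_assignments)
```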

Inter-individual variability
We assessed inter-subject variability by considering the individual signal values, extracted for each individual and on each of the LANG ROIs. To remove the variability induced by the task, we normalized the beta values, considering the mean and standard deviation of the other subjects who performed the given task (z-scored betas). When individuals performed multiple tasks (for subjects enrolled in the same protocol), we averaged the z-scored betas for these subjects to avoid accounting for additional intra-individual variability. We thus measured inter-individual variability by considering the subjects and not the scans. In addition, we considered only the young/middle-aged cohort of participants in which LANG was modeled, to (1) prevent a large task effect (older adults primarily performed the NAM object naming task); and (2) not overestimate the variability of LANG by adding older participants (seniors). The average and absolute z scores of each region were then divided into 3 bins of increasing interindividual variability.

Age effect
To highlight the LANG ROIs that are the most resilient/vulnerable to the aging effect, we examined the subjects of the whole InLang cohort, representative of a wide range of ages (young/middle-aged and older adults), who performed the same task (Object Naming: NAM). Age was considered continuously in our statistical analyses, from age 20 to 85 (n = 82). We computed standard correlation coefficients (Pearson r) between age and DC (here estimated based on individual connectivity matrices). This allowed us to observe a positive (r positive and high) or negative (r negative and high in absolute value) age effect on task-based FC.
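For reference, the Pearson coefficient between age and the degree centrality of one ROI can be computed as in the sketch below. The function name and the data are hypothetical and serve only to show the direction of the effect; no significance test is included.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical ages and degree centralities of one ROI across four subjects.
toy_ages = [20, 40, 60, 80]
toy_dc_declining = [10, 8, 6, 4]
toy_dc_increasing = [3, 5, 7, 9]
r_negative = pearson_r(toy_ages, toy_dc_declining)
r_positive = pearson_r(toy_ages, toy_dc_increasing)
```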

Gender effect
We estimated the gender effect by generating the FC matrices reduced to the LANG ROIs for males (M) and females (F; self-reported gender) separately.The average DCs obtained for each ROI were then compared between the two groups.We estimated significant differences between the two populations by means of standard t tests.For illustrative purposes, only the top 10% of the largest differences (Males > Females and Females > Males, independently) were retained to define the most diverging ROIs.

Global profile of the tasks and latent subprocesses
Data-driven clustering based on the similarity of global measures across conditions/tasks (the global efficiency: E_glob, the average local efficiency: E_loc, the mean geodesic distance: d, and the total number of nodes: N; Fig. 2A) revealed an optimal 5-cluster solution. This solution was consistent with that obtained based on BOLD functional activations (Fig. S1, Appendix). We used the internal composition of the five task groups to label them according to the underlying language subprocess that might be primarily involved (Fig. 2A), namely: G1 = MONITORING (MS, MO, DO tasks); G2 = DECODING (PHON, RHYM and PROS tasks); G3 = SEMANTIC (SEM and SP tasks); G4 = PRODUCTION (NAM, FLU, GENE, and REP tasks); G5 = WANDERING (VMW task). Indeed, the monologal and dialogal inner speech with own and other voices (MS, MO, DO) engage MONITORING processes (inner voice control). Phoneme detection (PHON), rhyme judgment (RHYM), and prosodic detection (PROS) first involve phonology and/or prosody DECODING (sound control). Semantic categorization (SEM) and speech perception (SP) respectively engage word and sentence comprehension. They both primarily require SEMANTIC processing (conceptual knowledge). Object naming (NAM), categorical fluency (FLU), sentence generation (GENE), and word repetition (REP) rely on lexical/lexico-syntactic formulation or word PRODUCTION (conceptual knowledge). Finally, verbal mind wandering (VMW) involves spontaneous speech production underpinned by introspective WANDERING processes (or unintentional thought). Details of the tasks and of the language subprocesses theoretically and primarily targeted according to the statistical fMRI contrasts performed are provided in the Supplementary Material (see also the summary Table S1, Appendix).

Fig. 2. (A) [...] S1) and for the clustering applied to BOLD functional activations (Fig. S1). Table S3 summarizes the global measures for each task and subprocess. (B) Non-linear relationship between the integration/segregation balance (I:S) and the geodesic distance of the functional connections of the different connectomes. Plain dots correspond to the mean values of each task-related connectome, empty dots to the mean values of subprocess-related connectomes. The k-means clustering revealed 3 distinct types of connectivity according to the I:S/geodesic distance profile (short-range, middle-range, and long-range). The colored lines come from the centroids (crosses) estimated from the observed data (at the level of brain region), in relation to the regression polynomial curve. (C) Global topology of the LANG task-based connectome. The 131 regions of interest (ROIs) of LANG (Power et al., 2011; LH = 61%, RH = 35%, CER = 4%) are projected here in a reduced two-dimensional space (PCA layout) consisting of the first two components: PC1 and PC2. The LANG atlas, node coordinates and properties are described in Table S4. The package including the atlas can be downloaded here: https://osf.io/6xm8n/.
Interestingly, language tasks and subprocesses can also be grouped according to the physical distance of their functional connectivity within the brain. By contrasting the nodal integration/segregation balance (I:S) with the nodal geodesic distance d of each task and subprocess, we observed a significant linear relationship between the two parameters (r = 0.8, p < .001). However, the best-fit function indicated that the relationship was slightly better described by a nonlinear function (i.e., a polynomial curve; Fig. 2B). Furthermore, the unsupervised classification (k-means method) applied to these data identified 3 main clusters, denoting a gradual organization of tasks and subprocesses according to 3 canonical profiles of average connectivity that can be interpreted as C1 = long-range connections; C2 = middle-range connections; C3 = short-range connections (Fig. 2B). Overall, the more task- or process-based connectomes were segregated rather than integrated on average (i.e., a negative difference in favor of E_loc), the shorter the physical internodal distance (short-distance functional connections, as for the control tasks of language involving DECODING and MONITORING subprocesses). Conversely, the more task- or process-based connectomes were integrated rather than segregated on average (i.e., a positive difference in favor of E_glob), the longer the physical internodal distance (long-range functional connectivity, as for the WANDERING and SEMANTIC task groups).

Global topology of the general LANG connectome
After excluding irrelevant functional connections (see Material and Methods, Section 2.2), LANG was composed of 131 non-isolated regions of interest (ROIs; Power's parcellation: Power et al., 2011), distributed over the two hemispheres (nLH = 80; nRH = 46) and the cerebellum (nCER = 5). Connectivity between LANG ROIs appeared balanced between integration and segregation (I:S = 0.049), associated with a rather long-range connectivity profile (d = 68.1). Table S3 (Appendix) summarizes the global network properties of the tasks, the subprocesses, and the general LANG connectome. Fig. 2C shows the LANG connectome as a graph projected into a reduced two-dimensional space based on principal component analysis (PCA).

LANG partition and hubs (intermediate and local scale)
Community detection applied to the LANG connectome defined 4 distinct components (or functional subnetworks, called Nets; see Fig. 3A for the projection into the reduced PCA space). Fig. 3B shows the mapping of the Nets onto the brain and cerebellum templates. Considering its composition, Net1 could correspond to the core component of language, engaged in the coding-decoding of linguistic signals of multiple natures, e.g., acoustic, syntactic, conceptual, articulatory (Coding-Decoding system). Net2 is represented by executive-attentional functional networks (Control-Executive system). Net3 is mainly composed of regions of the default mode network (DMN) known to be involved in high-level cognitive abstraction and can thus be regarded as a "conceptual" knowledge network (Abstract-Knowledge system). Finally, Net4 involves a large majority of perceptual and motor brain areas, suggesting that it is the "Sensori-Motor" system of language. The putative functional roles of these LANG Nets are argued and discussed in depth in the Discussion (Section 4).
The Nets' composition, coupled with their topological organization in the PCA reduced space, have provided evidence for the possible mean- ing of the 2 main PCA axes (i.e., the two principal components; PC1 and PC2 of Fig. 2 C).PC1 extended from auditory to sensorimotor components of language and may reflect the axis of externally oriented cognition (from verbal-specific to domain-general somatosensory systems).PC2 progressively involved control executive regions to semantic associative regions and may represent the axis of high-level internal cognition associated with language ( Fig. 3 A).
Interestingly, the core Net1 was located at the crossroads of these two internal-external axes. Moreover, Net1 was the component with the highest proportion of connector nodes (Net1 = 40.7%, distributed in both hemispheres; Table S4, Appendix), reflecting a high capacity to integrate information from regions belonging to the same network (intra-FC) as well as from other specialized networks (inter-FC; high zi, high Pi class). Net1 also exhibited a "rich club" organization (rich club regime from k > 17 to k < 26; Φnorm(k) > 1, p < .001; 10,000 permutations). Restricting the analysis to the level of k where the strongest rich club effect was observed (k = 24), we found a set of 5 left perisylvian hubs constituting the "rich club" of Net1 (areas: STGa, 45, 55b, PFm, STV; Fig. 3C). These nodes also formed the maximal Net1 clique (i.e., the maximal complete subgraph, of size 5 for LANG/Net1).
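As a minimal sketch of the two nodal measures behind the "high zi, high Pi" connector class, the functions below compute a within-module degree z-score and a participation coefficient on a binary, undirected graph. The toy graph and module labels are hypothetical; the cutoffs that separate hub classes, and the choice of population rather than sample standard deviation, are assumptions that vary across studies and are not specified in this section.

```python
from statistics import mean, pstdev

def participation_coefficient(adj, modules, i):
    """P_i = 1 - sum_s (k_is / k_i)^2, where k_is counts i's links to module s."""
    k_i = sum(adj[i])
    if k_i == 0:
        return 0.0
    per_module = {}
    for j, connected in enumerate(adj[i]):
        if connected:
            per_module[modules[j]] = per_module.get(modules[j], 0) + 1
    return 1.0 - sum((k / k_i) ** 2 for k in per_module.values())

def within_module_z(adj, modules, i):
    """z_i: within-module degree of node i, standardized over its own module."""
    same = [n for n, m in enumerate(modules) if m == modules[i]]
    within = {n: sum(adj[n][j] for j in same) for n in same}
    values = list(within.values())
    sd = pstdev(values)  # population SD (an assumption of this sketch)
    return 0.0 if sd == 0 else (within[i] - mean(values)) / sd

# Hypothetical 6-node graph with two modules of three nodes each.
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (3, 4), (4, 5)]
modules = [0, 0, 0, 1, 1, 1]
adj = [[0] * 6 for _ in range(6)]
for a, b in edges:
    adj[a][b] = adj[b][a] = 1

p0 = participation_coefficient(adj, modules, 0)  # links split evenly across modules
z0 = within_module_z(adj, modules, 0)            # best connected node of its module
p5 = participation_coefficient(adj, modules, 5)  # all links inside its own module
print(p0, z0, p5)
```

In this toy graph, node 0 combines a high within-module z-score with links spread over both modules, which is the profile described above for connector hubs.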
Table S4 (Appendix) includes information about the LANG modules and hubs for each region. Fig. S3 (Appendix) presents the LANG connectomic atlas including the right hemisphere (RH), as well as details of its components.

Functional correlates
On average, the FC laterality index of LANG indicated a slight LH predominance (LI(LANG) = +0.23), but hemispheric asymmetry varied across the Nets. The proportion of more strongly connected nodes was higher in LH for Net1 (LI(Net1) = +0.41) than for the other Nets (Fig. 4A). By comparison, the FC of the nodes belonging to the "sensory-motor component" was bilaterally distributed (LI(Net4) = -0.12). Overall, the FC asymmetry of LANG Nets (from bilateral to LH) was arranged along the following gradient: Net4 < Net3 < Net2 < Net1.
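The sign convention of the laterality index can be illustrated with the classical formula LI = (L - R) / (L + R). This is a sketch under the assumption that this standard form applies; the exact quantities entering the LI in the study are defined in its Methods, not in this section, and the input values below are hypothetical.

```python
def laterality_index(left, right):
    """LI = (L - R) / (L + R): +1 is fully left, -1 fully right, 0 balanced."""
    total = left + right
    if total == 0:
        raise ValueError("LI is undefined when both hemispheres contribute zero")
    return (left - right) / total

# Hypothetical hemispheric contributions (e.g., summed nodal strength).
li_left_dominant = laterality_index(70.5, 29.5)
li_bilateral = laterality_index(44.0, 56.0)
print(round(li_left_dominant, 2), round(li_bilateral, 2))
```

A positive value denotes left-hemisphere predominance and a value near zero a bilateral distribution, matching the reading of the gradient reported above.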
In addition, some LANG Nets were spatially congruent with the mapping of neurotransmitter receptor pathways. In particular, the LH nodes of Net2 and Net3 showed a high spatial matching with the serotonin 5HT2a receptors (SMC Net2/5HT2a = 0.68; SMC Net3/5HT2a = 0.74). Those of Net4 overlapped with the noradrenergic transporters NAT_MRB (SMC Net4/NAT_MRB = 0.7). However, based on the SMC relevance criterion (SMC > 0.67), Net1 did not show significant spatial correspondence with any of the neurotransmitter maps. Fig. 4B shows the distribution of the LANG ROIs that matched (or did not match) the PET receptor maps (see also Fig. S5, Appendix).
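The simple matching coefficient (SMC) used for these spatial comparisons is, in its standard form, the share of positions at which two binary vectors agree. The sketch below assumes that each ROI is coded 1/0 for Net membership and 1/0 for sufficient coverage by a map; this binarization and the example vectors are hypothetical, and only the 0.67 criterion comes from the text.

```python
def simple_matching_coefficient(a, b):
    """Share of positions where two binary vectors agree (both 1 or both 0)."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    agree = sum(1 for x, y in zip(a, b) if bool(x) == bool(y))
    return agree / len(a)

# Hypothetical coding over 10 ROIs.
in_net = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]          # ROI belongs to the Net
covered_by_map = [1, 1, 0, 0, 0, 0, 1, 0, 1, 1]  # ROI sufficiently covered
smc = simple_matching_coefficient(in_net, covered_by_map)
is_relevant = smc > 0.67  # relevance criterion quoted in the text
print(smc, is_relevant)
```

Unlike the Jaccard index, the SMC counts joint absences as agreement, so its value depends on which set of ROIs is taken as the reference.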

Structural correlates
The endings of some large white matter (WM) bundles were spatially concordant with the LANG Nets (Figs. 4C and S4, Appendix). The best overlap between the bundles and Net1 (LH ROIs) was obtained by combining the ending masks of the left arcuate fascicle (AF), superior longitudinal fascicle branch III (SLFIII), inferior longitudinal fascicle (ILF), and the thalamo-premotor (T_PREM) projections (SMC Net1/AF-SLFIII-ILF-T_PREM complex = 0.9). The concordance rate increased to 92% when the ending masks of the middle cerebellar peduncle (MCP) and the cerebellar ROIs of Net1 were included. At a more restricted level, the left AF alone provided a high spatial concordance with the lateral LH nodes of Net1 (SMC Net1/AF = 0.72). Regarding Net2 (LH ROIs), the best match was reached with the combination of the left superior longitudinal fascicle branch II (SLFII) and the cingulum (CG) bundle (SMC Net2/SLFII-CG = 0.76). The concordance between the SLFII taken individually and the lateral LH ROIs of Net2 was close to 70% (SMC Net2/SLFII = 0.68). Net3 (LH ROIs) had almost complete coverage when considering the combination of the CG, the middle longitudinal fascicle (MLF), and the fornix (FX; SMC Net3/CG-MLF-FX = 0.98). Finally, Net4 (LH ROIs) was spatially well covered by the combination of the cortico-spinal (CST) and the striato-precentral (ST_PREC) tracts (SMC Net4/CST-ST_PREC = 0.78).
Fig. 4 (caption, partial): [...] (Dukart et al., 2021) were tested, but only spatial matches considered significant are shown here (SMC > 0.67; see Method). LANG regions with sufficient coverage (> 40% of overlap) are in red, while those with no or insufficient coverage (< 40%) are in gray. (C) Structural concordance with large white matter (WM) bundle terminations provided by TractSeg (Wasserthal et al., 2018). Only the best bundle combinations allowing for the highest match are displayed here. The red/gray color code corresponds to the same definition as for Panel B.

Flexibility and variability
The module assignments of LANG ROIs varied according to the linguistic subprocesses involved. We calculated flexibility coefficients to capture the FC versatility of the ROIs engaged in the different Nets, depending on the subprocess at work. The average flexibility coefficients (F) were low for Net1 (F Net1 = 0.21) and Net4 (F Net4 = 0.32), while those for Net2 and Net3 were roughly twice as high (F Net2 = 0.55; F Net3 = 0.64). Ordering LANG networks according to their functional versatility, from invariant to highly flexible, yielded: Net1 < Net4 < Net2 < Net3 (Fig. 5A).
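One common way to quantify this kind of versatility is the fraction of transitions between successive partitions in which a node changes module. The sketch below uses that definition as an assumption, since the exact formula applied in the study is not restated in this section; it also assumes that module labels have already been matched across subprocesses. All assignments are hypothetical.

```python
def flexibility(assignments):
    """Fraction of successive partitions in which the node changes module."""
    if len(assignments) < 2:
        raise ValueError("at least two partitions are needed")
    changes = sum(1 for prev, cur in zip(assignments, assignments[1:]) if prev != cur)
    return changes / (len(assignments) - 1)

def mean_flexibility(node_assignments):
    """Average flexibility over the nodes of one subnetwork."""
    values = [flexibility(a) for a in node_assignments]
    return sum(values) / len(values)

# Hypothetical module labels of two ROIs across five subprocess-specific partitions.
stable_roi = [1, 1, 1, 1, 1]    # never leaves its module
flexible_roi = [2, 3, 2, 2, 3]  # changes module in 3 of 4 transitions
f_stable = flexibility(stable_roi)
f_flexible = flexibility(flexible_roi)
f_mean = mean_flexibility([stable_roi, flexible_roi])
print(f_stable, f_flexible, f_mean)
```

Because subprocesses have no natural order, the value obtained with this definition depends on how the partitions are sequenced; counting changes over all pairs of partitions is an order-free alternative.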
Although modest, there was also some inter-individual variability in FC when individuals performed the language tasks (Fig. 5B). We found the highest inter-subject variability in Net3, but the variance remained low on average (mean z score = 0.56). FC variability between participants was more visible at the regional level than at the network scale. In addition, we found a high matching coefficient between the "universal language network" (ULN; as proposed by Malik-Moraleda et al., 2022) and the lateral LH ROIs of Net1 (SMC Net1/ULN = 0.84; Fig. S5, Appendix), suggesting some between-individual and cross-cultural consistency in key language network involvement.
However, the LANG connectome underwent changes with age (Fig. 5C). We observed both positive and negative correlations between age and degree centralities (DCs). Net3 and Net2 were the components showing the strongest modulations. More specifically, the DCs of Net2 ROIs were negatively correlated with age (mean r = -0.39), while those of Net3 ROIs were, on average, positively correlated with age (mean r = 0.31). Thus, the older the individuals, the less functionally connected the Net2 regions were (i.e., a decrease in functional hubs for this network in LANG). By contrast, the Net3 regions tended to be more strongly interconnected in LANG with age.
Finally, gender also modulated LANG connectivity. The strongest LANG ROIs for males compared to females (M > F) in terms of DCs were distributed between Net1 (62.5%) and Net2 (37.5%) in LH. The strongest LANG ROIs for females compared to males (F > M) were practically all located in Net3 (91.67%). Fig. 5D shows the LANG ROIs with the most divergent FC. Interestingly, the top 10% were located in LH (but see also Table S5, Appendix, for a complete picture of statistical differences).

Fig. 5 (caption, partial): (C) Variability determined by age. Illustration of the correlation coefficients between age and ROI DCs, as well as their distribution in each of the Nets. The correlations were performed on the naming task (NAM), which includes healthy participants over a wide age range: 82 subjects aged 18 to 84 years (see also Fig. S9). (D) Variability induced by gender. Top 10% of the most different LANG ROIs between males (M) and females (F; self-reported gender). In cyan, the top 10% of ROIs where nodal connectivity (DC) was higher in males than in females. In magenta, the top 10% of regions where nodal connectivity was comparatively higher in females than in males. See Table S5 for a detailed summary of significant differences.

Discussion
The main objective of this study was to provide an in-depth, multiscale view of the organization of brain function associated with language from a connectomic perspective. We leveraged an extensive fMRI database of multi-paradigm language tasks (InLang [Interactive networks of Language] database), and we applied a state-of-the-art functional connectivity (FC) methodology that provided unique insights into brain networks related to language. The central finding of this research was that the general language connectome (LANG) could be objectively partitioned into four main non-overlapping subnetworks (referred to as "Nets"), each possessing distinctive and marked features. Table 1 below provides a complete overview of these Nets.
The most extensive subnetwork, Net1, appears to correspond to a "specialized" language system shaped for the encoding-decoding of auditory-verbal signals. Indeed, Net1 consisted primarily of areas belonging to the intrinsic networks previously designated as "auditory" and "language" (CAB-NP RSNs: Ji et al., 2019; Fig. 3C). It included a set of primary, secondary, and associative areas previously noted as specialized for language (e.g., Labache et al., 2019; Price, 2012). In addition, a subset of Net1 composed of crucial, densely interconnected brain regions formed a critical set for information integration and communication during language tasks (Fig. 3C). These core network structures were located in the left perisylvian zone, namely: the anterior part of the superior temporal gyrus (STGa), the posterior part of the pars triangularis of the inferior frontal gyrus (pIFG, 45/44), the posterior middle frontal gyrus/premotor cortex (pMFG, 55b area), the inferior parietal cortex (supramarginal gyrus, PF/PFm), and the temporo-parieto-occipital junction (STV/TPOJ1). Our analyses to determine the functional role of brain regions within networks identified them all as "connector" hubs, which is consistent with previous observations (e.g., IFG/TPJ/STG: Goucha et al., 2017; Hagoort, 2016; PF/SMG: Braga et al., 2013; pMFG, 55b: Hazem et al., 2021). Connector hubs were located at the contact points of several white matter (WM) fascicles, actively supporting long-distance information transport and processing (e.g., at the AF/SLF convergence areas for the IFG and TPJ; Roger et al., 2022). Indeed, we observed a robust match between the AF endpoints (whose involvement in language has been widely and more directly reported; e.g., Forkel et al., 2022, for a meta-analysis) and the lateral perisylvian part of Net1. Nevertheless, rather than a one-to-one relationship between structure and function, it was a combination of various WM bundle terminations that underlay the entire network (Fig. 4C). Beyond the clique, Net1 embedded other connector areas, some of which were in the right hemisphere and others in the basal ganglia (the anterior parts of the thalamus and the putamen) or in the right cerebellum (Crus I; Table S4, Appendix). Large cortico-basal ganglia-cerebellar loops would thus be involved during language tasks, supporting a substantial role of subcortical structures in high-level cognition, including language (Murphy et al., 2022).
Individually, brain areas have their own anatomical and microstructural properties (cytoarchitectonic features; Zilles and Amunts, 2010) and may thus be biased, under normal conditions, to respond efficiently and preferentially to certain input types. They can be tuned for functional selectivity to phonological, syntactic, lexical, or even semantic linguistic units (Friederici, 2011). However, the underlying computational processing (i.e., the functional role) of regions belonging to the same Net could be deeply similar. The computational building blocks of Net1 (called primitives: Poeppel, 2012; elementary linguistic operations: Hagoort, 2019; or neural operations: Buzsáki, 2020) could here involve the segmentation-fusion of the linguistic signal, yielding a verbal information stream of increasing and ordered complexity (Zaccarella and Friederici, 2015). Multiple combinatorial operations on different linguistic representations have already been reported (e.g., the combinatorial network of language: Pylkkänen, 2019). Net1 and its constituents could represent the foundation of these combinatorics in the task. The modularity analysis we applied to multiple language tasks may indeed have captured a common "language combinatorial" computational mechanism for Net1, making this network a cornerstone of a "language-specialized" encoding-decoding system (Hagoort, 2017). Consistent with a central system, Net1 was topologically situated at the interface of the other components, between internally and externally oriented cognition (Fig. 3A). Moreover, Net1 was found to be a globally inflexible (unchanged) configuration regardless of task and linguistic demand (Fig. 5A). It also appeared spatially consistent with the "universal language network" proposed by Malik-Moraleda et al. (2022) as an invariant, cross-cultural, functional language network (see Fig. S5, Appendix). Several universals of language (apart from the "universal grammar"; Chomsky, 1995, which is debated) have been reported (Coupé et al., 2019) and concern semantics (Gibson et al., 2017), syntax (Futrell et al., 2015), and even pragmatics (Piantadosi et al., 2011). The constraints that shape languages seem to follow common rules of optimization of coding and information transfer, towards a fundamental principle of efficiency. The functional selectivity of Net1 regions is likely to be inherited from our ancestors and to be part of a language-ready brain (Boeckx and Benítez-Burraco, 2014). It is also supported by a specific brain architecture already present in children (Friederici, 2017), whose functional connectivity is genetically encoded (Mekki et al., 2022, for the genetic regulation specifically involved in the perceptual-motor and semantic pathways of language).
At the boundaries of Net1, we also detected two networks that are integral parts of the general LANG connectome (Nets 2-3). Net2 was dominated by intrinsic attentional and executive control networks (cingulo-opercular and frontoparietal networks; Fig. 3C). First, the cingulo-opercular network (CON) is a superordinate system encompassing the salience network (Ji et al., 2019), involved in external-signal-driven attentional control or top-down, "exogenous" redirection of attention (Matthen, 2005). The specialization of such a network in the actively controlled integration of exteroceptive information may lead to the provision of appropriate information in working memory (Parr and Friston, 2017) in order to construct an internal representation of the external world relevant to the individual at a specific time. Second, the frontoparietal network (FPN; close to the Multiple Demand Network [MDN]: Smith et al., 2021, or to the Central Executive Network [CEN]: Doucet et al., 2019) is involved in all processing requiring controlled attention directed toward internal cues and goals. This network operates for endogenous and top-down attentional redirection (Perrone-Bertolotti et al., 2020) and is engaged in verbal working memory and "fluid" cognition (Assem et al., 2020). Overall, Net2 is a controlled executive language system that captures both endogenous and exogenous attentional aspects. Net3, on the other hand, was almost exclusively composed of DMN regions (Fig. 3C). At rest, the default state is thought to be involved in "random episodic silent thought," promoting creativity (Andreasen, 2011). Previous studies exploring intrinsic connectivity have shown that some components of the DMN are tightly and specifically coupled with the language network (particularly the anterolateral subnetworks; Gordon et al., 2020). Task-based studies have also shown its involvement in natural language processing (Simony et al., 2016). As a foundation of the episodic-semantic memory spectrum (and more broadly, language-memory; Roger et al., 2022), the DMN is a multimodal experiential system (Xu et al., 2017) that fosters resonance and binding between environmental features and those derived from similar prior knowledge and states (Binder and Desai, 2011; Constantinescu et al., 2016). For these reasons, Net3 has been referred to here as the "Abstract-Knowledge system" of language.
Even if their functional roles in cognition were distinct, Net2 and Net3 were both involved in high-level cognition. They displayed similar network features in terms of hub properties, with a remarkably high proportion of "satellite" key regions compared to the other Nets (Table S4, Appendix). Satellite centers are regions whose functional communication supports dialogue between components (van den Heuvel and Sporns, 2013b). In our case, they favor communication with regions belonging to other Nets, facilitating multimodal integration or information linking throughout the task. Moreover, we observed in Net2 and Net3 a clear tendency to reconfigure according to linguistic demand (i.e., versatile networks with a flexible modular configuration that depends on the language subprocesses involved; Fig. 5A). This is consistent with studies showing that these systems are auxiliary and differentially involved depending on the nature of the task (Fedorenko and Thompson-Schill, 2014). For instance, the FPN/MDN is functionally active in controlled and challenging semantic tasks but not in less demanding linguistic tasks (Diachek et al., 2020), which is consistent with the role of the FPN in attentional processes and fluid cognition (Assem et al., 2020; Perrone-Bertolotti et al., 2020). Along the same lines, the Net3 configuration was more likely to be engaged in tasks involving the projection of spontaneous and self-oriented thoughts, such as verbal mind wandering (Andrews-Hanna et al., 2014; Binder and Desai, 2011; Humphreys and Lambon Ralph, 2015; Konishi et al., 2015; Lau et al., 2013; Raichle, 2015; Wang et al., 2020). Interestingly, the spatial distribution of Net2 and Net3 corresponded to the mapping of the 5HT2a receptors (Fig. 4B) involved in serotoninergic transmission (Savli et al., 2012; see also Beliveau et al., 2017, for a high-resolution, in vivo brain atlas of the serotoninergic system), which are capable of amplifying or sustaining cortical excitation (Puig and Gulledge, 2011). These receptors indeed modulate whole-brain connectivity and promote flexibility between brain states and processes (Jancke et al., 2021). They could therefore be a relevant biomarker of the functional flexibility revealed in Nets 2-3.
However, the two networks were distinct, underpinned by specific anatomical connectivity (Fig. 4C). In addition, they were differentially sensitive to gender (Fig. 5D). For example, some hubs in Net3 showed higher FC during tasks in females compared to males, consistent with recent findings on differential DMN connectivity in resting-state studies (Liang et al., 2021). Finally, Net2 and Net3 were both subject to the pressures of age, but here again differently (Fig. 5C). Net2 was negatively impacted by aging: the older the individuals, the less functionally connected the Net2 regions were. This age-related reduction in attentional-controlled FC was consistent with the alteration in executive functioning traditionally observed in older subjects (Reuter-Lorenz et al., 2016). Net3, on the contrary, had a high share of hubs that were more densely connected with age, which may reflect a compensatory pathway traditionally observed in aging concerning language (i.e., a semantic strategy: Baciu et al., 2021).
The last system, Net4, comprised bilateral sensorimotor cortico-subcortical brain areas. This fourth component of the language connectome was distinct from the perceptual and motor auditory-verbal structures included in Net1 (Fig. 2B) but could be an essential part of the action-perception circuits of language. The brain regions involved in Net4 have already been described as engaged in several sensorimotor aspects related to language production, in particular: general action selection (premotor); motor execution (supplementary motor area); orofacial motor activity (precentral and postcentral language areas); or even timing of motor outputs (putamen and cerebellum; see Price, 2012, for an exhaustive overview). Besides the primary and secondary sensorimotor regions, Net4 also encompassed a large part of the precuneus. The precuneus supports a prominent level of interconnectivity with other brain regions, which has led to the identification of functional subdivisions (posterior-visual; central-cognitive/associative; anterior-sensorimotor; Margulies et al., 2009), and it has indeed been considered a crucial sensorimotor connector hub of the LANG connectome (Table S4, Appendix). It is also an important site of production-comprehension coupling in natural speech (Silbert et al., 2014). In addition to speech production, Net4 can be engaged in language comprehension. Semantic grounding (i.e., the semantic links between words and their actions, referent objects, and related concepts) appears to depend on semantic circuits that bring together both the circuits related to word form (perisylvian, Net1) and conceptual circuits that underlie, among other aspects, sensory-motor experience (extrasylvian, including Net4). The involvement of the motor system in speech perception and understanding has been observed in various contexts (Fernandino et al., 2022; Schomers and Pulvermüller, 2016; Skipper et al., 2017; and see Pulvermüller, 2018, for the hypothesis of neural reuse of action-perception circuits in language), which may explain why Net4 was globally engaged regardless of the subprocess involved in the language tasks (Fig. 5A). Finally, several neurotransmitters are involved in regulating the activity of sensorimotor regions. However, we observed a specific spatial matching between the sensorimotor language system and the noradrenergic receptor mapping (noradrenaline transporter: Hesse et al., 2017; Fig. 4B). The catecholamine noradrenaline has substantial projections to somatosensory and motor areas, including primary cortices, and the modulatory effects of noradrenaline on sensorimotor processing are diverse. While its contribution to modulating arousal states (Holland et al., 2021) and adapting sensory circuits for optimal behavior in animals is well documented (see Jacob and Nienborg, 2018, for a review), its precise function in humans and in language remains to be investigated.
Overall, our observations reinforce and complement past observations about the neurocognitive architecture of language. For example, the concept of multiple language networks (Hagoort, 2019) or the "theoretical" subdivision of the vast language network into a critical system accompanied by several additional systems (or margins; Hertrich et al., 2020) has been discussed. In an opinion article, Fedorenko and Thompson-Schill (2014) reconsidered functional specialization from the perspective of the dynamic engagement of interactive networks to support different goals and linguistic complexity. Our observations about the flexibility of the different LANG components are entirely in line with these considerations. For example, some networks may be central (i.e., they are composed of regions that maintain their allegiance through time, such as Net1). In contrast, others are composed of peripheral regions that are not necessarily specialized for language but participate in its elaboration (such as Nets 2-3). In the same vein, Chai et al. (2016) studied the dynamic flexibility of language regions in a comprehension task. Beyond a core set of regions in the left hemisphere, analogous to our perisylvian hubs belonging to the rich club (Fig. 3C), they observed flexible reconfigurations of "peripheral" regions, primarily located in the right hemisphere. Within this extended language network, the angular gyrus and the anterior temporal lobe, belonging to Net3 of the LANG atlas and assigned to the DMN (e.g., Smallwood et al., 2021), exhibited a singular and highly flexible pattern of activity that may drive the functional interhemispheric coordination associated with language (Blank et al., 2016; Chai et al., 2016; Mahowald and Fedorenko, 2016). Whereas other regions demonstrated weakly left-lateralized FC within the LANG atlas, these regions indeed showed more balanced connectivity between hemispheres (Fig. 4A). They are known to play an active role in the multimodal integration of information in semantic cognition (e.g., Lambon Ralph et al., 2017; Seghier, 2013), which is widely distributed in the brain (Huth et al., 2016).
Other data also support the proposal of a functional architecture of language composed of multiple systems/subsystems. They corroborate our observations on LANG of discrete language-related components by demonstrating specific signatures in terms of intrinsic connectivity (Labache et al., 2019) or dissociated cortical distributions revealed by intraoperative cortical stimulation, which goes beyond the traditional correlational framework (Corrivetti et al., 2019). These segregated functional networks appear to dynamically integrate into larger interactive configurations (Roger et al., 2022). This echoes the concept of encapsulation or nested networks (Hilgetag and Goulas, 2020), where domain-specific and domain-general circuits interface to underlie a given mental process (Fedorenko and Thompson-Schill, 2014). Even if the number of proposed networks varies according to the methods used or the primary theoretical frameworks, the task-based networks or systems defined in the LANG atlas were consistent with previous partitioning proposals.
The great benefit of a partition emerging directly from the data is that it pinpoints latent mechanisms that transcend our classical cognitive descriptions (see interesting discussions on the current problem of brain-behavior concordance or the blurriness and ambiguity associated with the terminology and definition of psychological constructs: Anderson, 2011; Buzsáki, 2020). Data-driven ontology provides an independent view, here from a neuro-centric perspective (Roger et al., 2022), and can serve as a "lingua franca across disciplines and theoretical gaps" (Eisenberg et al., 2019). However, since they are derived from the observations, these partitions depend directly on the quantity, quality, sensitivity, and validity of the data used. This study has the advantage of being based on a database including a rich diversity of fMRI protocols, varying in a wide range of language characteristics (Fig. 1). However, task paradigms are less easy to implement than the resting state. They are typically very controlled, require multiple repetitions to obtain a robust signal-to-noise ratio, are more prone to movement artifacts, and induce higher interindividual variability (Park et al., 2020). This often leads to smaller final samples, which may be a prominent issue for subsequent analyses. Taking into consideration the need to maximize observations to ensure robust results, we focused most of our analyses on the general connectome or on the subprocesses common to several tasks (and not on individual tasks). The compilation of even larger databases will allow for broader and more detailed investigations. For example, it would be important to consider the pragmatic aspects of language (Rasgado-Toledo et al., 2021), which are not specifically represented in the InLang database. Moreover, the current trend is to extend the framework of fMRI paradigms traditionally employed in "laboratory" settings to less controlled and more ecological protocols (Verga and Kotz, 2019). As evidenced by the recent Neuroimage Special Issue (Finn et al., 2022), several initiatives and datasets are steering toward accounts of cognition in more natural settings (e.g., Bhattasali et al., 2020; LeBel et al., 2021; Nastase et al., 2020 2017) data collections]. Multimodal datasets would then allow more direct confirmation of the neurofunctional relevance of the spatial correspondences we observed regarding the anatomo-functional correlates of LANG. In our study, the analysis of the links between LANG and anatomical connectivity and neurotransmitter maps could not be performed in the same individuals. Multimodal data would provide more subtle indicators of the correspondence between structure, function, and neurotransmitter pathways than the simple matching coefficient. This is essential when considering highly interactive complex networks where one-to-one relationships cannot be a viable principle (Suárez et al., 2020).
In addition, considering behavioral and cognitive performance is crucial for a better description of the functional repertoire. We must acknowledge that one of the main limitations of the proposed database concerns the cross-sectional aspect of the cohort and, consequently, the lack of (homogeneous) behavioral/cognitive data (i.e., inconsistent response modalities; some tasks were designed without directly measurable behavioral output, without overt speech production or with inner speech, depending on the targeted linguistic processes or to limit movement artifacts). While databases built from pre-existing data are essential to address current data reuse concerns (see the FAIR principles; Wilkinson et al., 2016), building highly multimodal databases, coupled with improved methods of acquisition, processing, and statistical investigation, is a practical approach to tackle the issue of brain-behavior correspondence more effectively. Such multimodal data acquired systematically across subjects would also limit the inter-subject variability inherent in cross-sectional databases, which may lead to greater stability and accuracy of results. However, the presence of functional (language) tasks and/or behavioral data in the multimodal neurocognitive datasets available to date is still generally limited.
We have attempted to comply with the best practices for analyzing and sharing FC data following the guidelines of the COBIDAS report (Nichols et al., 2017), but some methodological choices may influence the estimation of FC and should be further emphasized. The LANG atlas was based on independent resting-state and multimodal brain parcellations (Glasser et al., 2016; Power et al., 2011), limiting the circular analyses related to ROI selection (Kriegeskorte et al., 2009). Furthermore, the regions belonging to the atlas were identified without a priori assumptions, preventing another bias that is a similar form of circular logic in which assumptions are retrofitted to results (also called HARKing; Button, 2019). However, it is essential to note that we estimated the FC through the correlation of interregional beta-series extracted from the tasks rather than using time-series methods like those applied to resting-state data (Cao et al., 2014). This approach to interregional FC has been implemented, validated (Göttlich et al., 2015, 2017), and successfully applied in previous task-based fMRI studies (e.g., for recent studies: Antonucci et al., 2020; Franzmeier et al., 2018; Pang et al., 2022). Nevertheless, as it has also been argued that removing inter-event/block variance could help to reduce the chances of circularity in the analysis (Cole et al., 2019), replicating the LANG analyses after removing the average evoked responses from each task would be an option to consider in order to measure the potential bias induced by the FC estimation method.
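To make the beta-series idea concrete, the sketch below correlates trial-wise beta estimates between pairs of regions with Pearson's r. It is an illustration under simplifying assumptions, not the pipeline of the study: the region names and beta values are hypothetical, and steps such as the GLM estimation of single-trial betas or any Fisher transformation are omitted.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equally long series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)

def beta_series_fc(betas):
    """FC matrix as nested dicts: correlation of beta series for each ROI pair."""
    rois = list(betas)
    return {a: {b: pearson_r(betas[a], betas[b]) for b in rois if b != a}
            for a in rois}

# Hypothetical trial-wise beta estimates (one value per trial) for three ROIs.
betas = {
    "roi_a": [0.2, 0.5, 0.1, 0.9, 0.4],
    "roi_b": [0.3, 0.6, 0.2, 1.0, 0.5],  # roi_a shifted by a constant
    "roi_c": [0.9, 0.1, 0.5, 0.2, 0.4],
}
fc = beta_series_fc(betas)
print(fc["roi_a"]["roi_b"], fc["roi_a"]["roi_c"])
```

Two regions whose trial-to-trial response amplitudes co-vary are treated as functionally connected, irrespective of their mean activation level.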
Finally, the matrix thresholding method is a well-known limitation in resting-state FC. In binary networks, the inclusion of weaker correlations as functional edges leads to the inclusion of noisier connections, which can significantly bias the graph metrics (van den Heuvel et al., 2017). Including false-positive edges has a more deleterious effect than excluding false-negative connections (Zalesky et al., 2016), suggesting that a restrictive threshold should be preferred over a more permissive one. The most widespread practice for choosing an "optimal" threshold is to extract the parameters of interest over a given threshold range and estimate their change (a range between 5 and 25% for most studies, according to a recent methodological review; Hallquist and Hillary, 2019). In our case, the network parameters were robust and stable over a threshold range between 5 and 10% (Fig. S2, Appendix). The chosen threshold of 5% of the highest positive correlations kept the LANG network balanced, neither too sparse nor completely interconnected, and composed of relatively high correlation values, limiting the inclusion of false-positive connections.
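A proportional threshold of this kind can be sketched as follows. The function keeps the strongest positive correlations up to a target density and binarizes them; taking the density relative to all possible edges (rather than to the positive correlations only) is an assumption of this sketch, and the small matrix is hypothetical.

```python
import math

def proportional_threshold(corr, density):
    """Binarize a symmetric correlation matrix, keeping the strongest positive
    edges up to `density` (a proportion of all possible undirected edges)."""
    n = len(corr)
    # The matrix is symmetric, so the upper triangle is enough.
    candidates = [(corr[i][j], i, j)
                  for i in range(n) for j in range(i + 1, n)
                  if corr[i][j] > 0]
    n_possible = n * (n - 1) // 2
    n_keep = min(len(candidates), math.floor(density * n_possible))
    candidates.sort(reverse=True)  # strongest correlations first
    adj = [[0] * n for _ in range(n)]
    for _, i, j in candidates[:n_keep]:
        adj[i][j] = adj[j][i] = 1
    return adj

# Hypothetical 4-region correlation matrix.
corr = [
    [1.0, 0.9, 0.2, -0.4],
    [0.9, 1.0, 0.7, 0.1],
    [0.2, 0.7, 1.0, 0.5],
    [-0.4, 0.1, 0.5, 1.0],
]
adj = proportional_threshold(corr, density=0.5)  # keep 3 of the 6 possible edges
for row in adj:
    print(row)
```

Repeating the call over a range of densities (for instance 0.05 to 0.25) and recomputing the graph metrics each time is the usual way to check that results do not hinge on a single cutoff.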
The present study offers a language atlas that relies on a thorough topological (i.e., spatial) analysis of FC. One of the next steps is to identify the causal organization (e.g., effective hierarchies/heterarchies) and the precise timeframes in which its components/regions engage, depending on the subprocesses/mechanisms at work. MEG (such as in the MOUS dataset: Schoffelen et al., 2019) or electroencephalography (EEG; recorded from the scalp, the cortex, or intracranially) are valuable tools for evaluating such dynamics. To this end, a recent study used intracranial EEG recordings to examine the dynamic organization of naming (Forseth et al., 2021). The authors observed that regions were co-activated during extended periods, confirming that complex behaviors such as speech production require the coordination of discrete network states (defined as a set of reference dynamics that coordinate the generation and transmission of information throughout the cortex; see also the concept of meta-networking of Herbet and Duffau (2020)). They were able to sequence, map, and identify the temporality of the different transient states, globally confirming the seminal model of word production proposed by Indefrey and Levelt (2004). Defining a comprehensive repertoire of language states and causal relationships (i.e., effective and directed functional connectivity; e.g., Deco et al., 2021) in various tasks may extend our understanding of language functioning.
Although we explored several modulators, such as age and gender, a specific investigation of these factors remains to be done to gain a complete picture of the changes they induce. As discussed above, we observed a change in some hubs with age. However, the effect of age can be refined by attempting to capture critical evolutionary trajectories of the LANG networks. Establishing accurate age-related benchmarks requires more representative databases, both in terms of age and tasks performed, but would mutually benefit models of language and aging. Our cohort of so-called young or middle-aged adults (< 60 years) includes a wide variety of ages (mean = 36.87; SD = 13.29), and it would be interesting to see whether significant age-related changes in language networks are already present in this sample (i.e., before age 60). Studies of aging have indeed long shown that brain structure evolves with age and that certain regions, mainly frontostriatal, experience an early decline (Buckner, 2004; Head et al., 2005; Raz et al., 2005; Raz and Rodrigue, 2006; Ziegler et al., 2011). This decline could result in subtle but quantifiable neurofunctional changes in the language networks. Exploring the LANG atlas from this angle would allow age categories to be defined not by traditional "chronological age" but by observable differences in functional brain connectivity, which constitutes an exciting perspective.
There are many opportunities for analysis and application of the LANG atlas. These may involve considering other quantitative metrics or methods for evaluating the properties of LANG Nets/ROIs and their synergies. Given the current view of brain functional organization as nested hierarchies (e.g., Hilgetag and Goulas, 2020, for a review), assessing the "coreness centrality" property of brain networks is timely and of particular interest. Recently, Stanford and colleagues estimated the stability and resilience of brain networks by decomposing networks into hierarchically ordered subgraphs with progressively increasing connectivity (Stanford et al., 2022), a method also known as k-core decomposition (Batagelj and Zaversnik, 2003; Seidman, 1983). This method has shown sensitivity, validity, and promise for assessing high-level neurocognitive functioning (see also Arese Lucini et al., 2019; Lahav et al., 2016; Li et al., 2020; Peng et al., 2019). Using diverse graph-based methods could lead to a comprehensive, multilevel evaluation of the LANG atlas. Exploring a wide range of connectomic properties of LANG, especially in populations where the importance of language is central (such as multilingual individuals, pediatric populations for language acquisition, or pathologies with aphasic manifestations), represents an additional avenue for a deep appreciation of the functioning of language systems.
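For readers unfamiliar with k-core decomposition, the following sketch assigns each node its core number (the largest k such that the node belongs to a subgraph in which every node has at least k neighbours). It uses a simple peeling procedure chosen for readability rather than the linear-time algorithm of Batagelj and Zaversnik (2003); the function name and toy graph are hypothetical and unrelated to the LANG data.

```python
def core_numbers(adj):
    """Return the core number of every node of an undirected graph.

    adj maps each node to the set of its neighbours (no self-loops).
    Nodes are peeled in order of increasing remaining degree; the core
    number of a node is the largest minimum degree seen up to its removal.
    """
    degree = {v: len(nb) for v, nb in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        v = min(remaining, key=degree.get)  # node with smallest remaining degree
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    return core


# Hypothetical toy graph: a 4-node clique (0-3) with a two-node tail (4, 5).
toy_graph = {
    0: {1, 2, 3},
    1: {0, 2, 3},
    2: {0, 1, 3},
    3: {0, 1, 2, 4},
    4: {3, 5},
    5: {4},
}
cores = core_numbers(toy_graph)
```

Here the densely interconnected clique forms the innermost core, while the tail nodes sit in the outermost shell, which is the kind of hierarchical ordering that coreness is meant to capture.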

Conclusion
Language is a multi-faceted cognitive function. To account for the multidimensionality of language, we performed functional connectivity analyses on a multi-paradigm fMRI database (InLang [Interactive networks of Language] database) gathering thirteen different language tasks. This allowed us to inspect the language connectome in depth, particularly its spatial properties and functional attributes. This study reaffirms that high-level cognition, such as language, emanates from synergistic exchanges of external and internal information across specialized systems. Our results highlighted the involvement of essential discrete networks (or components) organized around a core "language-related" system. Furthermore, the flexible engagement of some key regions depending on several modulating factors, such as linguistic demand, pointed to the dynamic nature of language. Importantly, and based on ontology-based data integration, we have proposed a connectomic atlas of the "language mosaic" (LANG atlas), which can be considered a reference for investigating additional conditions or pathologies altering language functioning.

Fig. 1. Overview of the InLang database and methodological outlines. (A) The Interactive networks of Language database (InLang database): the thirteen language tasks and the main dimensions manipulated by the protocols. The protocols have been previously published and the MRI data were acquired between 2010 and 2019. InLang gathers, in a single database, a cross-sectional cohort of 150 different healthy individuals and 359 functional scans (see Materials and Method, Section 2.1 and the Supplementary Material for more details about tasks and protocols). Table S2 contains the subjects' characteristics, by task. (B) Summary of the steps performed to obtain the task connectomes. For a given task, we extracted the beta values from the individual functional activation maps, on a parcellation covering the whole brain (Power et al., 2011). The beta values were then used to compute the task-specific connectivity matrix (correlation matrix). The same procedure was repeated for all tasks to obtain the respective functional matrices and connectomes. (C) Outline of the multi-level statistical analyses performed on functional connectivity (FC) measures (i.e., graph theory parameters) to address 5 main axes.

Fig. 2. Global connectomic profiles of the tasks, the subprocesses and the LANG connectome. (A) Hierarchical clustering of the parameters used to define global FC profiles (global efficiency, average local efficiency, d = mean geodesic distance, N = total number of nodes; top), and of individual tasks, to cluster them into groups of underlying subprocesses (bottom). See also the Supplementary Material for the rationale of the subprocess labels (Table S1) and for the clustering applied to BOLD functional activations (Fig. S1). Table S3 summarizes the global measures for each task and subprocess. (B) Non-linear relationship between the integration/segregation balance (I:S) and the geodesic distance of the functional connections of the different connectomes. Plain dots correspond to the mean values of each task-related connectome; empty dots correspond to the mean values of subprocess-related connectomes. K-means clustering revealed 3 distinct types of connectivity according to the I:S/geodesic distance profile (short-range, middle-range, and long-range). The colored lines come from the centroids (crosses) estimated from the observed data (at the level of brain regions), in relation to the polynomial regression curve. (C) Global topology of the LANG task-based connectome. The 131 regions of interest (ROIs) of LANG (Power et al., 2011; LH = 61%, RH = 35%, CER = 4%) are projected here in a reduced two-dimensional space (PCA layout) consisting of the first two components: PC1 and PC2. The LANG atlas, node coordinates and properties are described in Table S4. The package including the atlas can be downloaded here: https://osf.io/6xm8n/.
Fig. 3C highlights their internal composition in terms of discrete intrinsic networks (as previously characterized by Ji et al. (2019); Cole-Anticevic Brain-wide Network Partition: CAB-NP).

Fig. 3. Global, intermediate and local connectomic features of the LANG connectome. (A) Reproduction of the LANG connectome in the 2D space of PC1/PC2 (cf. Fig. 2C). Four main components were identified within the LANG network (optimal partition, Louvain method). The components (Nets) are displayed on the connectome and the ROIs are colored according to these components. The number of ROIs per Net (Power ROIs; including both hemispheres and cerebellum) was distributed as follows: Net1 = 54 (41.2%), Net2 = 28 (21.4%), Net3 = 26 (19.8%), Net4 = 23 (17.6%). ASSO = associative; EXE = executive; AUD = auditory; SMN = sensorimotor. (B) Illustration of the LANG connectome and its 4-Net functional brain subdivision. Distribution on a multimodal parcellation of the brain (HCP_MMP1.0; Glasser et al., 2016) and cerebellum (SUIT; Diedrichsen et al., 2009). Only the left hemisphere (LH) is represented here (see Fig. S3 for a complete representation of the connectomic atlas). (C) Intrinsic functional composition of LANG Nets (top), in accordance with the resting-state networks (RSNs) proposed by Ji and collaborators (Cole-Anticevic brain-wide network partition; CAB-NP; Ji et al., 2019) on the same template (HCP_MMP1.0 borders). Nodal properties of LANG (bottom), showing the distribution of the main hubs and connector hubs, as well as the regions belonging to the maximal clique (complete subgraph; diamond). Only the LH is represented here. The labelled ROIs correspond to the 5 left perisylvian hubs constituting the "rich-club": STGa = the anterior part of the superior temporal gyrus; 45 = area 45 (pars triangularis) of the inferior frontal gyrus; 55b = area 55b of the posterior middle frontal gyrus; PFm = inferior parietal cortex, supramarginal gyrus; STV = superior temporal visual area of the temporo-parieto-occipital junction.

Fig. 4. Functional attributes and structural underpinnings of the LANG connectome. (A) Asymmetry of FC estimated on LANG ROIs and their distribution as a function of Nets. Left hemispheric FC dominance can be considered if IL > +0.2 (see Method). (B) Spatial concordance with neurotransmitter receptor maps of serotonergic (5HT2a [F18]altanserin PET; Savli et al., 2012) and catecholaminergic/noradrenergic (NET (S,S)-[(11)C]O-methylreboxetine (MRB) PET; Hesse et al., 2017) pathways. All the neurotransmitter maps implemented in JuSpace (Dukart et al., 2021) were tested, but only spatial matches considered significant are shown here (SMC > 0.67; see Method). LANG regions with sufficient coverage (> 40% of overlap) are in red, while those with no or insufficient coverage (< 40%) are in gray. (C) Structural concordance with large white matter (WM) bundle terminations provided by TractSeg (Wasserthal et al., 2018). Only the best bundle combinations allowing for the highest match are displayed here. The red/gray color code follows the same definition as in Panel B.

Fig. 5. Variability of the functional attributes of the LANG connectome. (A) Variability induced by linguistic demand (i.e., the linguistic subprocess involved in the task). Representation of the flexibility score of each of the LANG ROIs as well as their distribution in each of the Nets (see also Fig. S7). (B) Inter-individual variability. Representation of the beta z-scores estimated on the LANG ROIs and their distribution in each of the Nets (see also Fig. S8). (C) Variability determined by age. Illustration of the correlation coefficients between age and ROI DCs, as well as their distribution in each of the Nets. The correlations were performed on the naming task (NAM), which includes healthy participants over a wide age range: 82 subjects aged 18 to 84 years (see also Fig. S9). (D) Variability induced by gender. Top 10% of the most different LANG ROIs between males (M) and females (F; self-reported gender). In cyan, the top 10% of ROIs where nodal connectivity (DC) was higher in males than in females. In magenta, the top 10% of regions where nodal connectivity was higher in females than in males. See Table S5 for a detailed summary of significant differences.

Table 1
Summary of the properties associated with each of the language networks.