Brain connections of words, perceptions and actions: A neurobiological model of spatio-temporal semantic activation in the human cortex

Neuroimaging and patient studies show that different areas of cortex respectively specialize for general and selective, or category-specific, semantic processing. Why are there both semantic hubs and category-specificity, and why do they emerge in different cortical regions? Can the activation time-course of these areas be predicted and explained by brain-like network models? In the present work, we extend a neurocomputational model of human cortical function to simulate the time-course of cortical processes of understanding meaningful concrete words. The model implements frontal and temporal cortical areas for language, perception, and action along with their connectivity. It uses Hebbian learning to semantically ground words in aspects of their referential object- and action-related meaning. Compared with earlier proposals, the present model incorporates additional neuroanatomical links supported by connectivity studies and downscaled synaptic weights in order to control for functional between-area differences purely due to the number of in- or output links of an area. We show that learning of semantic relationships between words and the objects and actions these symbols are used to speak about leads to the formation of distributed circuits, which all include neuronal material in connector hub areas bridging between sensory and motor cortical systems. Therefore, these connector hub areas acquire a role as semantic hubs. By differentially reaching into motor or visual areas, the cortical distributions of the emergent 'semantic circuits' reflect aspects of the represented symbols' meaning, thus explaining category-specificity. The improved connectivity structure of our model entails a degree of category-specificity even in the 'semantic hubs' of the model.
The relative time-course of activation of these areas is typically fast and near-simultaneous, with semantic hubs central to the network structure activating before modality-preferential areas carrying semantic information.


Introduction
The human brain is able to acquire and store knowledge about people, facts, objects, actions, and culture through experiences in everyday life. Much of this knowledge comes in units, as 'conceptual' or 'semantic representations', and carries symbolic linguistic labels in language, whereby the relationship between word-forms and semantic meaning appears arbitrary. When semantic functions are damaged, serious consequences in daily cognitive activity can arise, being manifest as impairments of language and verbal communication and in some cases extending to domains such as planning, object recognition, or goal-directed action such as drinking a glass of water (Bak and Chandran, 2012; Damasio et al., 1996; Gainotti, 2010; Kemmerer et al., 2012; Pulvermüller and Fadiga, 2010). Given the centrality of semantics in human life, it is crucial to understand the neural mechanisms underlying the nature of semantic knowledge in the brain, which, despite decades of research, is still one of the most controversial issues among cognitive neuroscientists, who propose quite diverging perspectives on this issue.
One view puts forth that one or more area(s) is/are active during meaning processing in the brain, which appear to function as general convergence zones or semantic hubs and process the meaning of all types of signs and symbols. 'Semantic hubs' have been proposed to be situated in the frontal, temporal and parietal cortices, especially in the left language-dominant hemisphere (Bookheimer, 2002; Patterson et al., 2007; Price, 2000; Pulvermüller, 2013). For example, evidence for a multimodal semantic hub in anterior-inferior temporal cortex comes from patients suffering from semantic dementia, because damage in this region seems to be the best predictor of their semantic deficit (Mion et al., 2010). Although there is strong evidence for semantic hub areas, that is, for cortical regions which are generally important for meaning processing, an explanation of why several regions seem to play a role as semantic hubs and, especially, why they are localised in their specific cortical areas, is necessary.
A second important observation is that some additional cortical areas contribute to semantic processing in a more selective fashion, being particularly relevant for specific semantic categories, such as words typically used to speak about animals, tools, or actions and their related concepts. Some evidence also indicates that when recognizing a word such as run, activity in motor cortex, and even more specifically in leg-motor cortex, emerges, whereas, when hearing an object- and visually-related word such as sun, activity in visual areas is relatively more pronounced (Boulenger et al., 2009; Damasio et al., 1996; Gainotti, 2010; Hauk et al., 2004; Pulvermüller et al., 2009). Support for category-specific semantic processes is provided by a number of neurocognitive empirical studies that have focused on the importance of the motor and premotor cortex during conceptual processing, demonstrating for example that perceiving action words and sentences evokes activity in motor and premotor cortices (Boulenger et al., 2009; Hauk and Pulvermüller, 2004; Hauk et al., 2004, 2008; Pulvermüller, 1999, 2001; Rüschemeyer et al., 2007; Shtyrov et al., 2004). Furthermore, activation in the premotor and motor cortex is so fine-grained that semantic subcategories of action-related words can be differentiated somatotopically (Grisoni et al., 2016; Hauk and Pulvermüller, 2004; Hauk et al., 2004). Category-specific effects have also been seen in the visual areas, especially in the ventral temporal-occipital areas, when visually-related words are being processed (e.g. animal, colour or object-related words) (Chao et al., 1999; Kiefer, 2005; Sim and Kiefer, 2005). Importantly, category-specific semantic effects are also documented in the lesion literature, where sometimes rather small lesions in modality-preferential areas can selectively impair the processing of specific semantic categories (Dreyer et al., 2015; Warrington and Shallice, 1984). A neurobiological explanation of category-specificity has been proposed, which relates the differential activation patterns and lesion signatures to the functional level of cortical circuits with different distributions across areas. Accordingly, widely distributed cortical circuits for word forms carried by neuronal assemblies in the perisylvian language areas are linked with neuronal ensembles storing semantic information. These semantic circuits reach into modality-preferential motor and/or sensory areas depending on whether the perceptual or action-related information is relevant for grounding the meaning of the words (Barsalou, 2008; Martin, 2007; Pulvermüller and Fadiga, 2010; Pulvermüller, 2001, 2005). The different distribution of the semantic circuits across the cortex, therefore, explains aspects of category-specificity.
Notably, some studies reported that both category-general and category-specific semantic activation in the brain emerges rather fast, i.e. within ~200 ms after a meaningful symbol can be recognized (Hoenig et al., 2008; Penolazzi et al., 2007; Pulvermüller et al., 2000, 2004, 2005; Shtyrov et al., 2014). For example, Moseley et al. (2013) recorded brain signals using magnetoencephalography (MEG) and found different responses for action-related, object-related and abstract written words already at 150 ms after their onset, with gradually stronger activations for the action/object items in motor/visual regions, respectively. An explanation of category-specificity has been offered in terms of neurobiological principles. However, in order to integrate theory and data about semantic hubs with established knowledge about category-specificity, it is necessary to develop formal models of cortical structure and function that explain the presence of both.
An effort towards such an explanation was recently made by Garagnani and Pulvermüller (2016), who used a network implementation of cortical areas and their connectivity to mimic the function of the perisylvian language cortex, in particular inferior frontal and superior temporal cortex, along with the function of general visual and motor areas, in order to simulate the binding of phonological/lexical and semantic information. Using Hebbian mechanisms for synaptic modification, this model was used to simulate the emergence of neuronal circuits that process information about word forms and their related action- vs. object-related meanings. However, the model used a simplified connectivity structure, and was applied to make predictions about magnitude and topography of brain activation, but not its time course. Here, we improve on this earlier architecture by incorporating additional cortico-cortical connections documented by neuroanatomical studies. This neuroanatomically more appropriate model was used, as in the earlier version, to predict the cortical distribution of the memory circuits for words with object- and action-related meaning. However, this type of model can be used to predict not only where in the brain linguistic and semantic brain activity occurs, but also when these processes take place, i.e., the time course of such activation. Although the spatio-temporal dimension was already present in the previous network architecture (Garagnani and Pulvermüller, 2016), we provide here, for the first time, a precise activation time course analysis of different areas of the network. Furthermore, the previous model included connector hub areas, which exhibited increased numbers of links compared with other areas. To make sure that the specific activation signatures that we observed there (in particular, the generally strong activation seen in connector hub areas) were not just a result of an increased weighted sum of incoming and outgoing synaptic connections to and from neighbouring areas ('more and stronger links, more semantics'), an in-degree normalization across areas was used here to balance the overall input across areas and emphasise the role of network topology (or connection structure) as a factor influencing circuit topographies (or cell assembly distributions).
To investigate word meaning processing in the human brain, we used a neural network model implementing realistic anatomical and physiological features of the human cortex. The model simulates primary and secondary sensorimotor areas in frontal, temporal and occipital cortex along with 'connector hub' areas interfacing between different sensory and motor systems (Garagnani and Pulvermüller, 2011, 2013, 2016; Garagnani et al., 2008, 2009; Pulvermüller and Garagnani, 2014). The short and long distance connections between model areas are based on existing neuroanatomical evidence. Functionally, the model takes advantage of realistic Hebbian learning mechanisms (Hebb, 1949). The network was trained with repeatedly presented specific sensorimotor patterns coding for the articulatory and acoustic phonological structure of single words and some of their action- or perception-related semantic features. As a result of learning and area/connectivity structure, distributed 'semantic circuits' emerged in the network, spanning different areas. Importantly, the topographies of these circuits showed similarities and differences between semantic types (action vs. object words), which can be related to the semantic information stored. We document circuit distributions and their dynamic activation and discuss the results in the context of specific model features, existing experimental evidence, and novel predictions for future research.

Materials and methods
We applied a neurobiologically grounded computational model replicating structure and functional properties of the human cortex to investigate the neural mechanisms underlying word meaning acquisition and processing in the perception and action systems of the mind and brain. The model's architecture mimics the left perisylvian cortex involved in spoken word processing, corresponding to articulatory and acoustic phonological word forms (Fadiga et al., 2002; Fry, 1966; Pulvermüller and Fadiga, 2010; Pulvermüller, 1999; Zatorre et al., 1996), areas outside the perisylvian cortex involved in processing visual object identity (Ungerleider and Haxby, 1994), and the execution of manual actions (Deiber et al., 1991; Dum and Strick, 2002, 2005; Lu et al., 1994). The model mimics a range of biologically realistic properties of the human cortex including the following features:
1. Area structure: 12 cortical areas were modelled, including modality-preferential sensory and motor ones as well as connector hub areas interlinking sensory and motor systems.
2. Between-area connectivity: different areas were linked based on neuroanatomical principles and data, realising sparse, random, initially weak and topographic connectivity.
3. Within-area connectivity: similarly sparse, random and initially weak connectivity was implemented locally, along with a neighbourhood bias towards local links (Braitenberg and Schüz, 1998; Kaas, 1997).
4. Local lateral inhibition and area-specific global regulation mechanisms (local and global inhibition) (Braitenberg, 1978; Palm et al., 2014; Yuille and Geiger, 2003).
5. Synaptic modification by way of Hebbian type learning, including both long-term potentiation and depression (LTP, LTD) (Artola and Singer, 1993).
6. Neurophysiological dynamics of single cells including temporal summation of inputs, nonlinear transformation of membrane potentials into neuronal outputs, and adaptation (Matthews, 2001).
7. Constant presence of uniform uncorrelated white noise in all neurons during all phases of learning and retrieval, and additional noise added to the stimulus patterns to mimic realistic noisy input conditions during retrieval (Rolls and Deco, 2010).
Word learning processes in the model are based entirely on mechanisms of Hebbian plasticity, often summarized by the phrase "cells that fire together, wire together", although the learning rule applied (see above and Appendix A) implements 'anti-Hebb' learning too, colloquially described by the phrase "cells out of sync delink" (for discussion, see Garagnani et al., 2009). Accordingly, within a network of interconnected neurons, repeatedly and consistently co-active subpopulations of cells strengthen their connections, forming so-called cell assemblies (CAs) (Hebb, 1949). According to Hebb (1949), assemblies can be considered functional units in the brain representing the building blocks of cognitive functions, including language (Braitenberg, 1978; Palm et al., 2014; Pulvermüller, 1996). In principle, the emerging neuronal assemblies can be local, that is, restricted to a small area or even cortical column of a fraction of a cubic millimetre or, alternatively, be spread out across wide cortical regions, and it is not clear a priori whether a given network and input pattern leads to the formation of local or distributed circuits. Different cortical distributions, or topographies, of cell assemblies have been postulated for symbols with different meaning. Standard postulates are that words related to actions include neurons in the motor cortex (which control the movements a word such as run is typically used to speak about), while words referring to objects (such as sun) will include neurons in areas along the ventral visual stream of object processing (Huyck and Passmore, 2013; Pulvermüller and Preissl, 1991; Pulvermüller, 1999). Previous simulation studies have already shown the formation of distributed neuronal assemblies exhibiting differential cortical distributions as a result of repeated concomitant presentation of activation patterns and Hebbian plasticity mechanisms (Garagnani and Pulvermüller, 2011, 2013, 2016; Garagnani et al., 2008, 2009; Wennekers et al., 2006).

Model architecture
The model consists of 12 cortical areas of artificial neurons with area-intrinsic connections and mutual connections between them. In the left perisylvian language cortex, we identify six cortical areas divided into two sub-systems: auditory and articulatory systems (areas highlighted in blue and red in Fig. 1A). The auditory system includes the primary auditory cortex (A1), auditory belt (AB), and parabelt areas (PB), whereas the articulatory system includes the primary articulatory motor cortex (inferior part of primary motor cortex, M1i), inferior premotor (PMi) and prefrontal motor cortex (PFi). Six additional areas outside the perisylvian cortex (which we call 'extrasylvian') were included to model the ventral visual stream and dorsolateral motor system (green and yellow highlighted areas). The ventral visual system is relevant for processing visual object identity and includes, apart from primary visual cortex (V1), temporo-occipital (TO) and anterior-temporal (AT) areas. Finally, the motor system which, for example, is relevant for the execution of manual actions, includes the dorsolateral fronto-central motor (M1L), premotor (PML), and prefrontal cortices (PFL).
Fig. 1. (A) [...] perisylvian cortex include an articulatory system (red colours), including inferior prefrontal (PFi), premotor (PMi) and primary motor cortex (M1i) and auditory system (areas in blue), including auditory parabelt (PB), auditory belt (AB) and primary auditory cortex (A1). These areas can store correlations between neuronal activations carrying articulatory-phonological and corresponding acoustic-phonological information, for example when phonemes, syllables and spoken word forms are being articulated (activity in M1i) and acoustic features of these spoken words are simultaneously perceived (stimulation of primary auditory cortex, A1). (ii) Extrasylvian areas include a motor system (yellow to brown), including dorsolateral prefrontal (PFL), premotor (PML) and primary motor cortex (M1L) and a "what" visual stream of object processing (green), including anterior-temporal (AT), temporo-occipital (TO) and early visual areas (V1). Together with the perisylvian areas, these extrasylvian areas can store correlations between neuronal activations carrying semantic information, for example when words are used (activity in all perisylvian areas) to speak about objects present in the environment (activity in V1, TO, AT) or about actions that the individual engages in (activity in M1L, PML, PFL). Numbers indicate Brodmann Areas (BAs). (B) Schematic illustration of all 12 model areas and the known between-area connections implemented. The colours indicate correspondence between cortical and model areas. (C) Microconnectivity structure of one of the 7500 single excitatory neural elements modelled (labeled "e"). Within-area excitatory links (in grey) to and from "cell" e are limited to a local (19×19) neighbourhood of neural elements (light-grey area). Lateral inhibition between e and neighbouring excitatory elements is realised as follows: the underlying cell 'i' inhibits e in proportion to the total excitatory input it receives from the 5×5 neighbourhood (dark-purple shaded area); by means of analogous connections (not depicted), e inhibits all of its neighbours. Each pair (e,i) of model cells is taken to represent an entire cluster or column (grey matter under approximately 0.25 mm² of cortical surface) of pyramidal cells and the inhibitory interneurons therein. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
Each model area consists of two layers of 25×25 excitatory and inhibitory artificial neurons (e- and i-cells) (see Fig. 1C). Each e-cell represents a cluster of excitatory pyramidal cells, and the underlying i-cell represents the cluster of inhibitory interneurons situated within the same cortical column (Eggert and van Hemmen, 2000; Wilson and Cowan, 1972). As is typical for the mammalian cortex, the connectivity between and within model areas is sparse, patchy and topographic (Amir et al., 1993; Braitenberg and Schüz, 1998; Gilbert and Wiesel, 1983). To regulate and control activity in the network, local and area-specific inhibition is implemented (Bibbig et al., 1995; Palm, 1982; Wennekers et al., 2006). Details of the model functions and of the Hebbian learning mechanism (including LTD and LTP) are summarized in previous works (Garagnani and Pulvermüller, 2011, 2013, 2016; Garagnani et al., 2008, 2009). For completeness, we recapitulate them in Appendix A.
Long distance cortico-cortical links between sub-systems (see purple arrows in Fig. 1B) are realised between all pairs of multimodal hub areas (PB, PFi, AT and PFL). This is motivated by evidence for neuroanatomical connections between inferior prefrontal (PFi) and auditory parabelt (PB) areas, carried by the arcuate and the uncinate fasciculus (Catani et al., 2005; Makris and Pandya, 2009; Meyer et al., 1999; Parker et al., 2005; Paus et al., 2001; Rilling et al., 2008; Romanski et al., 1999a,b) and, in the extrasylvian system, connections between anterior-temporal (AT) and lateral prefrontal (PFL) areas, carried by the uncinate fascicle (Bauer and Jones, 1976; Chafee and Goldman-Rakic, 2000; Eacott and Gaffan, 1992; Fuster et al., 1985; Parker, 1998; Ungerleider et al., 1989; Webster et al., 1994). The peri- and extrasylvian systems are also linked by means of long distance cortico-cortical connections across the central hub areas; likewise parabelt (PB) and lateral prefrontal cortex (PFL) are reciprocally connected (Pandya and Barnes, 1987; Romanski et al., 1999a,b) as well as the anterior/middle-temporal (AT) and inferior prefrontal (PFi) areas (Pandya and Barnes, 1987; Petrides and Pandya, 2009; Rilling, 2014; Romanski, 2007; Ungerleider et al., 1989; Webster et al., 1994). A recent simulation study adopting a similar network architecture did not implement connections between inferior and lateral prefrontal or between auditory parabelt and anterior temporal cortex (Garagnani and Pulvermüller, 2016). We added both links because of the evidence for reciprocal connectivity between anterior-temporal (AT) and parabelt (PB) areas (Gierhan, 2013) and between inferior and lateral prefrontal (PFi, PFL) areas (Yeterian et al., 2012). This also led to a more symmetric network structure. The asymmetries in the earlier network may account for some of its functional properties, which, as we discuss below, were not seen in the present network based on a (slightly) more realistic structure (see Section 4).
The previous study (Garagnani and Pulvermüller, 2016) found that semantic circuits included a massively enhanced number of neurons in connector hub areas compared with primary or secondary areas, which was seen as an explanation of semantic hub status. However, there are different mechanisms that could underlie the observation. One way to explain it is by way of topological network structure, especially the fact that 'connector hub' areas hold a central role in interlinking sub-systems. At the same time, and partly independent from their role as connector hubs, the same areas are also the targets and origins of an increased number of connections to other areas (i.e. a higher 'degree' of connectivity). In the case of our present model, two between-area connections exist for most areas (primary ones have input plus one connection), but connector hubs have four of them, thereby entailing larger amounts of activation reaching these areas when activity waves spread through the network from its different ends during learning. Any specific functional properties of hub areas, including their great involvement in carrying semantic circuit members, may therefore result either from network topology, or from the number of area input connections from other areas, or from both. If it is just the number of inputs to, and thus amount of activation in, an area (its 'in-degree') that is relevant for an increased importance in semantics, the explanation of semantic hubs may trivially be based on the formula 'what activates most, is most relevant for cognition'. However, an explanation based on network topology and connectivity structure per se becomes plausible if general semantic relevance can be documented for hubs that have an overall input comparable to that of other areas. Therefore, we normalized the overall amount of input of all (equal-sized) areas by dividing the contribution of all long-distance connections (all links among the 'rich club' of connector hubs, central quadruplet in Fig. 1B) by 3. After this in-degree normalization (which in the present symmetric architecture also implies out-degree normalization), each of the 12 areas receives two equal quantities of inputs (either 1 × 1 or 3 × 1/3), one from the left and one from the right side of the model. This procedure preserved differences in topology while normalising for amount of input activation per area.
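The effect of the in-degree normalization can be illustrated with a minimal sketch. It assumes, purely for illustration, that every between-area link and every external sensorimotor input carries unit weight before normalization; the names `CHAINS`, `HUBS`, `build_links` and `weighted_input` are hypothetical and not taken from the model code.

```python
from fractions import Fraction

# Illustrative sketch only. Area names follow the text; unit link weights are
# an assumption standing in for the model's actual synaptic weight scaling.
CHAINS = {
    "auditory": ["A1", "AB", "PB"],
    "articulatory": ["M1i", "PMi", "PFi"],
    "visual": ["V1", "TO", "AT"],
    "motor": ["M1L", "PML", "PFL"],
}
HUBS = ["PB", "PFi", "AT", "PFL"]  # connector hubs, all pairs interlinked


def build_links():
    """Map each area to the set of areas it is reciprocally linked with."""
    links = {area: set() for chain in CHAINS.values() for area in chain}
    # Next-neighbour links within each sub-system (primary-secondary-hub).
    for chain in CHAINS.values():
        for a, b in zip(chain, chain[1:]):
            links[a].add(b)
            links[b].add(a)
    # Long-distance links between all pairs of connector hubs.
    for i, a in enumerate(HUBS):
        for b in HUBS[i + 1:]:
            links[a].add(b)
            links[b].add(a)
    return links


def weighted_input(links, hub_scale):
    """Summed input weight per area; hub-to-hub links are scaled by hub_scale."""
    hubs = set(HUBS)
    primaries = {chain[0] for chain in CHAINS.values()}
    totals = {}
    for area, neighbours in links.items():
        # Primary areas receive one external (sensorimotor) input.
        total = Fraction(1) if area in primaries else Fraction(0)
        for other in neighbours:
            is_hub_link = area in hubs and other in hubs
            total += hub_scale if is_hub_link else Fraction(1)
        totals[area] = total
    return totals


links = build_links()
before = weighted_input(links, Fraction(1))     # hubs: 1 + 3 = 4, others: 2
after = weighted_input(links, Fraction(1, 3))   # hubs: 1 + 3 * 1/3 = 2
```

Under these assumptions every area ends up with a total input of 2 after scaling, while the hubs keep their four links, so topology is preserved and only the summed input is equalised.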

Simulations
The simulations were carried out in two steps. After learning the semantic relationships between articulatory and acoustic information about the word form (perisylvian activity patterns in M1i and A1) and 'grounding' action or object information (extrasylvian activity pattern either in M1L or in V1) (Section 2.2.1), the network was used to simulate the neurophysiological correlates of word recognition and understanding (Section 2.2.2).

Learning phase
The network architecture described above (Fig. 1B) was initialized at random before the learning phase began (see Appendix A): 12 different, randomly initialized networks were created, each with 12 different sets of sensorimotor patterns representing object- and action-related words. These 'word-learning patterns' represented six object-related and six action-related words. Each pattern consisted of a fixed set of 19 cells chosen at random from the 25×25 cells of an area (ca. 3% of the cells) which were simultaneously presented to the primary areas of the network. At the linguistic and semantic levels, the cells in M1i and A1 represented articulatory and acoustic phonetic features and their values (e.g., [+labial]) and those in M1L and V1 action-related and visually-related semantic features plus values of the words (e.g., [+LEG ACTION], [+ROUND SHAPE]). Each word in our training set was grounded in input to three of the four primary areas of the model: apart from perisylvian A1 and M1i activity, object-related words received concordant visual (V1) grounding activity and action words concordant lateral motor area (M1L) grounding activity. This mimics a typical situation of word learning, whereby the word is uttered while the referent object is present (Vouloumanos and Werker, 2009) or the relevant action is being performed (Tomasello and Kruger, 1992). Note that white noise was always present and overlaid all learning patterns (in addition to that already present in all areas of the network). This was implemented to account for variability of perceptions and actions of the same type. The model was set up to learn the correlation between word and referential semantic information; the critical question was which type of representations develops in the network as a consequence of learning.
Each word-learning pattern of 3×19 activated cells (57 cells in total) was simultaneously presented to the respective primary areas 3000 times. Some trial-to-trial variability of patterns was due to noise overlay (see below). The number of presentations was chosen on the basis of previous simulations (Garagnani and Pulvermüller, 2016). While three primary areas were directly activated by each learning pattern, the fourth, non-relevant area (M1L for object- and V1 for action-related words) received additional variable noise input, i.e. a further pattern, consisting of 19 randomly chosen cells that changed inconsistently over learning episodes, was presented to that primary area. This was done to make sure that the correlation of the word-form activity in the perisylvian cortex with that of the semantic information was high in the relevant modality (motor system for action words, visual system for object words) but low in the non-relevant one. A learning trial involved presentation of a word pattern for 16 time-steps, followed by a period during which no input was given (inter-stimulus interval, ISI). The next stimulus was presented to the network only when the global inhibition of the PFi and PB areas decreased below a specific fixed threshold; this allowed the activity in the network to return to a predefined baseline value, so as to minimize the possibility of one trial affecting the next one. During each ISI, only the inherent baseline noise (simulating spontaneous neuronal firing) was present in the neural network.
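The construction of the learning input can be sketched as follows. This is a simplified illustration, not the original simulation code: it covers only the choice of active cells per primary area, omits the white-noise overlay and the network dynamics, and uses hypothetical names (`make_word_pattern`, `learning_trial_input`) and an arbitrary random seed.

```python
import random

AREA_CELLS = 25 * 25      # cells per model area, as stated in the text
PATTERN_SIZE = 19         # active cells per stimulated primary area
PRIMARY = ["A1", "M1i", "V1", "M1L"]


def make_word_pattern(word_type, rng):
    """Fixed learning pattern: 19 cells in each of the three relevant primary areas."""
    semantic_area = "V1" if word_type == "object" else "M1L"
    areas = ["A1", "M1i", semantic_area]
    return {a: frozenset(rng.sample(range(AREA_CELLS), PATTERN_SIZE)) for a in areas}


def learning_trial_input(pattern, rng):
    """Input for one trial: the fixed word pattern plus a freshly drawn random
    19-cell pattern in the fourth, non-relevant primary area."""
    trial = dict(pattern)
    (irrelevant,) = [a for a in PRIMARY if a not in pattern]
    trial[irrelevant] = frozenset(rng.sample(range(AREA_CELLS), PATTERN_SIZE))
    return trial


rng = random.Random(0)  # seed chosen arbitrarily for reproducibility
words = [make_word_pattern(t, rng) for t in ["object"] * 6 + ["action"] * 6]
trial = learning_trial_input(words[0], rng)  # one of 3000 presentations
```

Because the pattern in the non-relevant area is redrawn on every call, it is uncorrelated with the word form across trials, whereas the three fixed components recur on every presentation.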

Cell assembly definition
During the learning phase, we noticed the gradual formation of cell assembly circuits with different assemblies responding to different input patterns. After 3000 presentations in which three of the four sub-systems were co-activated by stimulating specific neurons in their respective primary cortex, distributed neuronal circuits spontaneously emerged within the network areas, linking up word-form information in the perisylvian language areas (auditory and articulatory sub-systems) with referential-semantic information in the sensorimotor areas (visual and motor sub-systems) (this is further explained in Section 3.1).
To identify and quantify the neurons forming the 12 CA circuits across the network areas, we computed the average firing rate of each excitatory cell (7500 e-cells) over the 15 time-steps subsequent to a single presentation of the learned sensorimotor patterns (no semantic input was provided in the primary areas of the extrasylvian system). An e-cell was defined as a member of a given CA circuit only if its time-averaged rate (output value or "firing rate") reached a threshold θ which was area- and cell-assembly specific, and defined as a fraction γ of the maximal single-cell's time-averaged response in that area (A) to pattern w. More formally,
$$\theta = \theta_A(w) = \gamma \, \max_{x \in A} \overline{O(x,t)}_w$$
where $\overline{O(x,t)}_w$ is the estimated time-averaged response of cell x to word pattern w (see Eq. (A4.1) in Appendix A) and γ∈[0,1] is a constant (we used γ=0.5 on the basis of previous simulation results; see Garagnani et al., 2008, 2009). This was computed for each of the 12 trained networks and the number of CA cells per area was averaged over the six object- and six action-related words. CA distributions across areas were analysed statistically as described in Section 2.3.
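The membership criterion can be sketched in a few lines. The function name `ca_members`, the input layout (one list of time-averaged rates per area) and the toy values are assumptions made for illustration; the guard excluding silent cells is likewise an added assumption to handle an area with no response.

```python
def ca_members(avg_rates, gamma=0.5):
    """Return, per area, the indices of cells whose time-averaged rate reaches
    the area- and pattern-specific threshold theta = gamma * max rate.

    avg_rates: dict mapping area name -> list of time-averaged firing rates
               (one value per e-cell) in response to one word pattern.
    """
    members = {}
    for area, rates in avg_rates.items():
        theta = gamma * max(rates)
        # r > 0 is an added guard (assumption): silent cells never qualify,
        # even if the whole area stays silent and theta is therefore 0.
        members[area] = [i for i, r in enumerate(rates) if r >= theta and r > 0]
    return members


# Toy example with five cells per area (values are made up).
toy_rates = {
    "PB": [0.0, 0.2, 0.9, 0.4, 0.6],    # theta = 0.45
    "V1": [0.1, 0.0, 0.04, 0.06, 0.0],  # theta = 0.05
}
toy_members = ca_members(toy_rates)
```

Because the threshold is relative to the strongest response within each area, weakly driven areas can still contribute CA cells, which is what makes the criterion area-specific.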

Neurophysiological word recognition simulations
After training, we used the network to simulate the process of perceiving, recognizing and understanding object- and action-related words and the neurophysiological mechanisms underlying these processes. To this end, each 'testing trial' started with primary auditory area (A1) stimulation using only the A1 component of the learning pattern of one learnt 'word'. Stimulation was for 2 time-steps, followed by 50 time-steps during which no input was provided and another 10 used as a baseline for the subsequent trial. To ensure that all testing trials started from analogous baselines, network activity was reset before the baseline. In order to obtain better signal-to-noise ratios, each of the auditory patterns was presented in 12 different testing trials. Results for each CA were obtained by averaging the 12 "trials" of its sensorimotor pattern presentation.
During word recognition, we recorded the area-specific "within-cell assemblies (CA) activity" per simulation time-step during the 10 time-steps preceding the stimulus onset and the 50 time-steps following offset. The within-CA activity was computed as the sum of the output values (cumulative firing rates, CFRs) of the emerging CA cells in each area produced by stimulation of area A1 as a function of time. By "CA cells", we mean here the cells forming the CA (as defined in Section 2.2.2 above); through Hebbian learning, these cells become strongly and reciprocally connected, forming the CA circuits. After this, we identified the "peak amplitude" as the maximum value reached by the CA's cumulative firing rates during the 50 post-stimulus time-steps, and the "peak delay" as the latency of the peak upon stimulation. These values were computed for each of the 12 learned networks, averaged over the two word-types and across network areas; results were submitted to statistical analysis as described below.
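The two dependent measures can be sketched as follows. The function names and the toy data are hypothetical, and the delay convention (counted in time-steps from stimulus offset, first maximum taken in case of ties) is an assumption, since the text does not spell out the reference point.

```python
def within_ca_activity(outputs, ca_cells):
    """Summed output of the CA cells of one area at each time-step.

    outputs:  list over time-steps; each entry is the list of per-cell outputs.
    ca_cells: indices of the cells belonging to the cell assembly in this area.
    """
    return [sum(step[i] for i in ca_cells) for step in outputs]


def peak_amplitude_and_delay(activity):
    """Peak of the post-stimulus within-CA activity and its latency.

    activity: values for the post-stimulus time-steps, index 0 being the first
              step after stimulus offset (assumed convention).
    Returns (peak amplitude, delay in time-steps, counted from 1).
    """
    peak = max(activity)
    return peak, activity.index(peak) + 1


# Toy example: 4 post-stimulus time-steps, 3 cells, CA consists of cells 0 and 2.
toy_outputs = [
    [0.1, 0.0, 0.2],
    [0.5, 0.1, 0.4],
    [0.9, 0.0, 0.8],
    [0.3, 0.2, 0.1],
]
toy_activity = within_ca_activity(toy_outputs, ca_cells=[0, 2])
toy_peak, toy_delay = peak_amplitude_and_delay(toy_activity)
```

In the simulations described above, the same computation would run over the 50 post-stimulus time-steps of each area, after averaging the 12 testing trials per word.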

Statistical analysis
Statistics were performed on the six object- and six action-related words learnt by one network and across the 12 different network instances. To statistically test for the presence of significant differences in the topographical CA distribution and activation dynamics, we performed repeated-measures Analyses of Variance (ANOVAs). A 4-way ANOVA was run with factors WordType (two levels: Object vs. Action), PeriExtra (two levels: Perisylvian={A1, AB, PB, M1_i, PM_i, PF_i}, Extrasylvian cortex={V1, TO, AT, M1_L, PM_L, PF_L}), Temporal Frontal (TempFront) (two levels: temporal areas={A1, AB, PB, V1, TO, AT}, frontal areas={M1_L, PM_L, PF_L, M1_i, PM_i, PF_i}) and Areas (three levels: Primary={A1, V1, M1_L, M1_i}, Secondary={TO, AB, PM_L, PM_i} and Central={PB, AT, PF_L, PF_i} areas). We further performed a second statistical analysis on the data of the two systems separately (six perisylvian and six extrasylvian areas), with factors "WordType", "TempFront" and "Areas", as described above. Analysis was performed on 3 different sets of data: (i) the CA cell distributions that emerged from word acquisition, (ii) the peak amplitudes, and (iii) the peak delays during word recognition simulations. Finally, we performed Bonferroni-corrected planned comparison tests (24 comparisons, corrected critical p < .0020) to further explore the significant differences in CA cell distributions and peak delay data across the four subsystem areas.

The emerging CA circuits are spread out to the same degree across the perisylvian language areas for object- and action-related words, whereas motor and visual areas of the extrasylvian cortex seem to exhibit different CA cell distributions. These distributions indeed appear to show a double dissociation. Object-related words extend more into the visual (V1, TO) areas, whereas they extend only minimally into the extrasylvian motor (PM_L, M1_L) areas; the reverse pattern emerges for the action-related words.
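The factorial design and the corrected significance threshold described in the statistical analysis can be written down compactly. The sketch below is only illustrative: the dictionary layout and function names are hypothetical, and the uncorrected significance level of 0.05 is an assumption (the text states only the corrected value).

```python
# Factor levels per area as listed in the text: (PeriExtra, TempFront, Areas).
AREA_FACTORS = {
    "A1":   ("perisylvian",  "temporal", "primary"),
    "AB":   ("perisylvian",  "temporal", "secondary"),
    "PB":   ("perisylvian",  "temporal", "central"),
    "M1_i": ("perisylvian",  "frontal",  "primary"),
    "PM_i": ("perisylvian",  "frontal",  "secondary"),
    "PF_i": ("perisylvian",  "frontal",  "central"),
    "V1":   ("extrasylvian", "temporal", "primary"),
    "TO":   ("extrasylvian", "temporal", "secondary"),
    "AT":   ("extrasylvian", "temporal", "central"),
    "M1_L": ("extrasylvian", "frontal",  "primary"),
    "PM_L": ("extrasylvian", "frontal",  "secondary"),
    "PF_L": ("extrasylvian", "frontal",  "central"),
}


def areas_with(level):
    """Set of areas that carry the given factor level."""
    return {area for area, levels in AREA_FACTORS.items() if level in levels}


def bonferroni_threshold(alpha, n_comparisons):
    """Per-comparison critical p-value under Bonferroni correction."""
    return alpha / n_comparisons


# Assumption: uncorrected alpha of 0.05 with the 24 planned comparisons.
critical_p = bonferroni_threshold(0.05, 24)
```

With these assumptions the critical value is 0.05/24 ≈ 0.00208, in line with the corrected threshold of about .0020 quoted in the text.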

Learned CA topographies for object-and action-related words
Fig. 3 illustrates the distribution of the CA circuits, given as the number of CA cells per area averaged across 12 trained networks, for object- (light grey) and action-related words (dark grey). The extrasylvian system involved in processing visual-object identity and motor action seems to exhibit a double dissociation between the two word types, as already noted above and in Fig. 2. The perisylvian language cortices seem to show no significant differences between the circuits for the two word types. Note also that there is a larger number of CA cells in the multimodal hub areas (PB, PF_i, AT, and PF_L) than in the secondary areas (AB, PM_i, TO, PM_L), where in turn there are more cells than in primary areas (A1, M1_i, V1, M1_L). This appears independent of whether an object- or action-related word is represented.

[Figure caption] Results of one typical instantiation of the model in Fig. 1 are shown, using the same area labels. Each set of 12 squares (in black) illustrates the distribution of "cells" of one specific CA across the 12 network areas. Each white pixel in a square indexes one CA cell. CAs for object-related words extend into higher and primary visual cortex (V1, TO, but not M1_L), linking information about spoken word forms (perisylvian pattern) with information from the visual modality (neural pattern in V1). Network correlates of action-related words extend into lateral motor cortex (M1_L, PM_L, but not V1), thus semantically grounding words in information about actions. For convenience, the area structure of the network is repeated at the top.
The observations described above were confirmed by the 4-way ANOVA. A main effect of Areas (F2,24=1226.424, p < .0001) emerged, which confirms that the CA cell densities differed across areas, with CA cell densities being higher in hub than in secondary areas (p < .0001), and higher in secondary than in primary areas (p < .0001). In addition, we found a significant interaction between the factors WordType, PeriExtra, TempFront and Areas (F2,24=130.795, p < .0001), indicating that the distributions of the two types of word-related CA circuits across the network differed. Because the interaction also demonstrates that CA distribution differed between perisylvian and extrasylvian systems, we ran further statistical analyses on the data from the two systems separately, now using 3-way ANOVAs. We found a main effect of Areas for both perisylvian (F2,24=2091.116, p < .0001) and extrasylvian systems (F2,24=3959.92, p < .0001), as revealed by the 4-way ANOVA analysis. As expected, the perisylvian system did not show any significant differences between CA distributions of the two word types across the 6 areas (F2,24=0.38, p=0.68). In contrast, the extrasylvian system revealed a highly significant interaction of all three factors WordType, TempFront and Areas (F2,24=156.555, p < .0001), confirming the word category differences in the CA topographies and local cell-density distributions across visual, motor and multimodal areas as suggested by Figs. 2 and 3. To further investigate the differences between CA types across the network, we ran Bonferroni-corrected planned comparison tests (24 comparisons, corrected critical p < .0020); these confirmed the presence of a larger number of CA cells in visual (V1, TO and AT) than in motor (M1_L, PM_L, and PF_L) areas for object-related words (p < .001), and the opposite for action-related words (p < .001). Post-hoc analysis of the data from the connector hubs (AT, PF_L) also showed a significant difference between the two word types there, i.e. stronger action-related word CA cell densities in PF_L compared to AT (p < .0001), and the opposite for object-related words (p < .001). Differences in CA-cell densities between word types and pairs of areas in the semantic systems were all significant (p < .002), as described in Fig. 2. In contrast, no significant differences emerged in the perisylvian system (p > .87).

Neurophysiological word recognition results
To obtain a simulation of spoken word recognition and comprehension processes, we analysed the time-course of the network's response to presentation of the learned auditory word-form patterns to area A1. To this end, we computed the sum of all CA cell activity values (quantified as the cumulative firing rates, CFRs, see Section 2.2.3) as a function of time across the entire network or for specific areas. Activation time courses showed an initial "ignition" of CA circuits, a strong activation, which peaked at time-step ~16 and included a majority of the circuits' neurons (Fig. 4). Replicating, in part, the structural distributions of semantic circuits depicted in Fig. 3, both types of circuits were similarly spread out across all perisylvian areas of the model; by contrast, differences between semantic circuit types were present in extrasylvian cortex: object-related words (blue pixels) elicited activation in the visual system and less in the motor system, while the reverse happened for the action-related words (red pixels). Note also the low degree of overlap between CAs of the two different word types (yellow pixels) for these two specific CA instances.
In extrasylvian areas, maximal area-specific activation levels significantly differed between the circuits carrying the two semantic word types. A significant double dissociation showed that circuits for object-related words produced higher amplitude in the visual sub-system (cumulative firing rates (CFRs)=9.10) than in the lateral (hand) motor system (CFRs=5.23), and, vice versa, action-related words activated the lateral motor system (CFRs=8.38) more strongly than the visual system (CFRs=4.86, see Fig. 5B bar plot left-hand side). As visual inspection indicates, the auditory and articulatory motor sub-systems (see Fig. 5A bar plot left-hand side) did not show any differences in activity levels between semantic word types. Furthermore, comparing activity levels between areas of the network (see Figs. 6A-B and 7A-B), multimodal hub areas (AT, PF_L, PB, PF_i) seemed to show the strongest activation dynamics (CFRs~15) in comparison with secondary (CFRs~10) and primary areas (CFRs~5).
The statistical analyses of the dynamic functional activation of the circuits confirmed these observations, which are in line with the CA-distribution results described in Section 3.1. In particular, the 4-way ANOVA performed on peak activation levels per area and word type revealed a main effect of Areas (F2,22=630.246, p < .001), again with maximal CA activation in 'central' connector hub areas. In addition, a significant interaction of factors WordType, PeriExtra, TempFront and Areas (F2,22=137.433, p < .001) emerged, confirming different activation levels between word type circuits across the network's areas. Because of the differences between the peri- and the extrasylvian systems, we also ran a second statistical analysis on each of the two systems separately. The 3-way ANOVA revealed a main effect of Areas on both perisylvian (F2,22=667.146, p < .001) and extrasylvian (F2,22=268.1345, p < .001) systems. Whereas the perisylvian areas did not show any significant differences in peak amplitude between the two circuit types (F1,11=0.98, p=.76), the extrasylvian system revealed significant interactions of factors WordType and TempFront (F1,11=518.7315, p < .001), and of WordType, TempFront and Areas (F2,22=109.3367, p < .001), showing different activation dynamics across the extrasylvian areas between the circuits of the two word categories (Fig. 5 left-hand side).

[Figure caption] The extrasylvian areas whose cells can be seen as circuit correlates of word meaning show a double dissociation, with relatively more strongly developed CAs for object- than for action-related words in primary and secondary visual areas (V1, TO), but stronger CAs for action-related than for object-related words in dorsolateral primary motor and premotor cortices (PM_L, M1_L). Asterisks indicate that, within a given area, the number of CA neurons significantly differed between the circuits of action and object words (Bonferroni-corrected planned comparison tests, 24 comparisons; critical threshold p < .0020).

Area-specific activation time-course: peak delay results
Figs. 6 and 7 delineate the area-specific activation time courses of semantic circuits of object- (A) and action-related words (B) across the network. The activation in different areas peaked at different times and showed different maximal amplitudes. The schematic brains at the top of each panel illustrate the area-specific peak delay, and the boxplots indicate the latency of maximal activation together with their standard errors (boxes) and standard deviations (whiskers).
The activation time-courses in the perisylvian language areas exhibited a similar, cascade-like pattern for both object- and action-related CA circuits (see Fig. 6A-B). Area A1 peaked early (2 time-steps after stimulus onset) because it was driven by the sensorimotor pattern presented there. The auditory-belt (AB) area peaked at ~6 time-steps, shortly followed by the parabelt (PB~7) and inferior prefrontal (PF_i~10) areas, and finally the premotor (PM_i~12) and primary motor (M1_i~13) areas. This time-course was the same for both circuit types. By contrast, the extrasylvian semantic system (Fig. 7A-B) seemed to exhibit different temporal activation patterns for the two types of semantic circuits. The extrasylvian connector hub areas (PF_L, AT) reached peak activation at latencies similar to those of the perisylvian hubs (PF_i, PB) central to the network structure (12-13 time-steps). Interestingly, the multimodal prefrontal area (PF_L) revealed similar activation dynamics (~13 simulation time-steps) for both word types, whereas the anterior-temporal hub area (AT) peaked 1 time-step earlier for action-related words (~12) than for object-related ones (~13). Massive activation time-course differences were apparently present in non-central extrasylvian areas, i.e. in the primary and secondary visual and dorsolateral motor areas of the network. Object-related words activated their lateral premotor and temporo-occipital areas shortly after the connector hubs (PM_L~15, TO~15), closely followed by the primary visual (V1~16) area. In contrast, the circuits underpinning action-related words first activated the lateral premotor (PM_L) area (~15), closely followed by temporo-occipital (TO) and lateral primary motor (M1_L) areas (~16). Both object- and action-related words activated the primary areas of the relevant system approximately 15 time-steps after word onset and at the end of the activation cascade. As visible in Fig. 7A-B, different activation dynamics can be observed for object- and action-related words in the secondary areas of the non-relevant system (PM_L for object-related words and TO for action-related words). However, we note that the activation peaks were quite flat in these cases, thus leading to some variance in latencies.
To confirm these observations about the activation time-course across areas for the different word-related CAs, we ran the same 4-way ANOVA as in the previous sections, but now using peak activation latencies. The statistical analysis revealed a significant interaction of factors WordType, PeriExtra, TempFront and Areas (F2,22=3615.08, p < .0001), which confirms the different area-specific activation time-courses between the two word type circuits. Once again, the perisylvian cortex showed no significant differences between circuit types across the six areas (F2,22=0.4, p=.68). The extrasylvian cortex revealed a significant interaction of the factors WordType, TempFront and Areas (F2,22=4791.15, p < .0001), which confirms a different activation time-course of the extrasylvian areas for object- and action-related words, as described above.
We further ran a Bonferroni-corrected planned-comparison test (24 comparisons, corrected p < .0020) to investigate the possible difference in temporal activation between the two word types across the neural network. Similar activation time-courses for the two word types/circuits were found across the network areas, except for the temporo-occipital (TO, p=0.001) and the anterior-temporal (AT, p=0.0002) visual areas. Activation times for each word/circuit type showed no significant differences between the extrasylvian connector hub areas (AT, PF_L: p > .0080), which, however, activated significantly earlier than the modality-preferential ones (p < .001). Intriguingly, comparisons between modality-preferential cortices showed significant differences, except between TO and PM_L (p=0.66) for object-related word circuits and between TO and M1_L (p=0.77) for action-related ones. In the perisylvian language cortex, all comparisons between area-peak activation times showed significant differences (p < .001) (see Figs. 6 and 7, i.e. brain/boxplot).
For putative comparison of model data with experimental data (see Section 4), a further analysis of the activation dynamics was performed. Activation to both word types across sub-systems unfolded symmetrically in the perisylvian and extrasylvian cortex ("Motor"-then-"Visual" vs. "Visual"-then-"Motor"; see Fig. 5, right-hand side). These observations were fully confirmed by the 2-way ANOVA run on the data of the two systems separately (i.e. peri- and extrasylvian systems), with factors WordType (2 levels: object vs. action) and TempFront (2 levels: temporal areas vs. frontal areas). The statistical analysis showed a significant interaction of WordType and TempFront (F1,11=24.52, p < .0004; action words, dorsal motor sub-system: 25 simulation time-steps, ventral visual sub-system: 24.37; object words, dorsal motor sub-system: 24.14, ventral visual sub-system: 25.27) in the extrasylvian systems, confirming the symmetrical time-course of activation of the two word types, with no differences in the perisylvian language cortex (F1,11=0.6, p < .46). Notably, the significant interaction was due to slower average activation times in the relatively more relevant semantic system (dorsal action sub-system for action words, ventral visual sub-system for object words) compared with the less relevant sub-systems, a feature due to the absence of (slow) activation in the respective primary areas (see Fig. 5).

Discussion
A neurocomputational model implementing a range of cortical areas in frontal, temporal and occipital lobes, along with main features of their connectivity structure and neurophysiologically realistic learning mechanisms, offers an explanation of known facts about the cortical basis of meaning processing, in particular the fact that some areas serve a general role in semantic processing, whereas others primarily take a category-specific role. When the model was used to mimic semantic grounding of word forms in action and perceptual information in motor and visual cortex, distributed neuronal assemblies developed, which functioned as 'semantic circuits' insofar as they interlinked information about word form and meaning. Intriguingly, these semantic circuits showed different distributions across extrasylvian modality-preferential areas, as already found in a previous simulation study (Garagnani and Pulvermüller, 2016). This replicates the category-specificity of action and object words, which, in a range of neuroimaging studies, more strongly activated dorsolateral motor and ventral-stream visual areas, respectively. In contrast to the category-specific behaviour of modality-preferential areas outside the perisylvian domain, substantial amounts of neuronal machinery in connector hub areas in prefrontal and anterior temporal cortex were involved to similar degrees in both kinds of cell assemblies, consistent with a role of these connector hubs as 'semantic hubs'. As in-degree normalization was used in the present simulations, we argue below that this functional segregation into general and category-specific semantic areas resulted from connectivity structure, and especially the high 'degree' of connector hubs, rather than from overall strength of the input. In fact, in contrast to earlier work (Garagnani and Pulvermüller, 2016), area function only gradually changed from category-specificity towards a category-general role, with even connector hubs exhibiting a degree of category-specificity, a feature which may be due, in part, to the inclusion of additional connections based on neuroanatomical evidence; we return to this issue below. Finally, the novel analysis of the time courses of activation indicated that in word recognition and comprehension, auditory areas are (trivially) activated first, closely followed by connector hub and modality-preferential frontal and temporal areas. Another intriguing observation was that the extrasylvian sub-systems carrying category-specific semantic information about a given word type (i.e., the dorsolateral motor sub-system for action words and the ventral visual sub-system for object words) showed a tendency toward delayed activation relative to the other areas. Moreover, a direct comparison of the activation dynamics of the model with real cortical activations observed during spoken word processing exhibits a degree of consistency (see Section 4.2 and Fig. 8). Below we discuss these findings in light of empirical data, previous neurocomputational work, and future research.

It should also be emphasized that the present model tests, and demonstrates the validity of, a neurobiological theory of language, which claims that semantic content is stored in the brain by cell assembly circuits (CAs) distributed across cortical areas, and that the specific cortical distribution (topography) of these circuits across the network reflects semantic information, in particular semantic category-specificity (see, for example, Pulvermüller, 1999). The semantic models most popular at present still stipulate semantic hubs as the main seat of conceptual and semantic processing without providing neurobiological explanations for such hubs or for their specific cortical locations. A purely verbal description of a distributed semantic circuits theory (in terms of "what fires together must also bind together") would already provide some plausibility, but one might still object that a working model of relevant cortical areas might give rise to entirely different mechanisms, for example to the emergence of local semantic processing in a single 'interface system' rather than distributed circuits that bind semantic information. Similarly, even if one is inclined to accept that distributed circuits reach into specific sensory and/or motor cortices, it would still be unclear, solely on the basis of a logical argument, whether such 'category-specific' distribution is restricted to primary areas, should include primary and secondary ones, or whether semantic specificity (as indicated by the present results) reaches the highest level of connection hubs, which, as most models postulate, are category-general and relevant for all semantic categories to the same degree.

Fig. 7. Spatio-temporal activation patterns of the six extrasylvian model areas. As described in Fig. 6, all curves (bottom part of each panel) illustrate area-specific activation dynamics plotted against time, and the boxplot (upper part) shows the latency of maximal activation. Brain schematics highlight the area-specific activation dynamics. Two or more areas are plotted into the same brain schematic if there were no significant delay differences between their peak activations (Bonferroni-corrected for 24 comparisons; critical threshold p < .0020). Averages and statistics are calculated across 12 different networks.

Semantic hubs vs. category-specificity in the human brain: explaining both by a neuromechanistic circuit-level model
Diverging theories of semantic representation have been proposed to explain the extensive empirical findings about the brain basis of meaning processing revealed by neuropsychological and imaging studies in patients and healthy subjects. As mentioned in the introduction, cognitive neuroscience has posited the existence of several convergence areas or "semantic hubs" that enable associating different aspects of conceptual and semantic knowledge. These areas have been located in the inferior and dorsolateral prefrontal, inferior parietal, superior temporal and anterior ventral temporal cortex, and postulated to equally process the meaning of all types of signs and symbols (Bookheimer, 2002; McCrory et al., 2000; Patterson et al., 2007; Pulvermüller, 2013). A complementary position emphasizes the importance of other cortical regions for semantic processing which are particularly relevant for specific word types related to specific semantic categories, such as animals, tools or actions. A range of relevant neuroimaging studies have shown the relevance of the motor cortex during conceptual processing of action-related words (Dreyer et al., 2015; Grisoni et al., 2016; Hauk and Pulvermüller, 2004; Hauk et al., 2004; Shtyrov et al., 2014) and of the sensory cortex during conceptual processing of visually related words (e.g. colours, animals or object-related words) (Damasio et al., 1996; Tranel et al., 1997). Furthermore, recent neurophysiological studies (EEG-MEG) show early (< 200 ms) and automatic brain activation reflecting semantic differences (e.g., Moseley et al., 2013; Pulvermüller et al., 2005). This evidence, which we discussed extensively in the introduction above, is consistent with the claim that semantic processing is distributed across, and divided up between, category-general hubs and category-specific areas. The frequently emphasized need for an integrative explanation of both general and category-specific semantic areas along with their location (Binder and Desai, 2011; Pulvermüller, 2013) is now being answered by results from the network simulations we report here.
The explanation of hubs and category-specificity requires reference to an intermediate level of computational simulation of neuronal circuits which bind together specific word forms and their semantic, meaning-related features (Pulvermüller et al., 2014). The formation of these semantic circuits results from (i) the correlation structure of 'grounding' sensorimotor semantic information and co-occurring word forms, (ii) the neurobiologically realistic learning and therefore mapping of the correlations on neuronal connection strengths and (iii) the structural information immanent to the neuroanatomy of cortical areas and their connectivity. As these circuits map sensorimotor correlations, they bridge between those neurons in sensory and motor areas where information (and thus correlated activation) is present during learning. This leads to category-specificity of circuit topographies, with action words such as "run" yielding cell assemblies reaching into motor systems and object words such as "sun" being implemented as circuits strongly linking up with neurons in visual cortices (Kiefer et al., 2008; Pulvermüller, 2013). These distributed word-related CA circuits did not extend into the non-relevant sub-systems (M1_L for object- and V1 for action-related words) because neural activity of these areas presented a low degree of correlation. This is because during training these areas were stimulated with random patterns that changed in every learning episode (see Section 2). Consequently, following the correlation-based learning rule, object-related CA circuits exhibited a larger density in the visual (V1, TO, AT) than in the motor areas (M1_L, PM_L, PF_L), and vice versa for action-related words (Fig. 3).
It should be clarified here that the presence of a random-noise pattern in the non-relevant sub-systems was necessary to prevent the extension of the semantic circuits into motor areas for object-related and visual areas for action-related words. In fact, in an additional set of word learning simulations, network training without the random noise pattern being present in the non-relevant sub-systems failed to produce a category-specific distribution. This observation further documents the important function of neuronal noise in the brain and in brain-like networks (Doursat and Bienenstock, 2006), which prevents excessive CA growth. We conclude that noise in primary areas is critical for obtaining semantic cortical circuits with category-specific signatures. In essence, as it is important to learn that the word "run" relates to certain motor patterns, it is likewise important to learn that variable visual inputs ('noise') typically occur during running so that specific visual features are de-correlated from the word form. We note that under deprived conditions, for example in blind language learners, this type of de-correlating sensory-related noise is missing in the deprived primary cortex. Resultant CA growth into the ventral stream may explain why blind individuals activate visual areas in linguistic and semantic processing (see Bedny et al., 2011 and Neville and Bavelier, 2002).
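The de-correlating effect of variable input can be made concrete with a toy count of co-activations. The sketch below is not the learning rule of the model; it is a deliberately simplified illustration, with hypothetical sizes (25 cells, 5 active per episode, 100 episodes) and a hypothetical function name, of why a fixed grounding pattern co-occurs with the word form in every episode while cells driven by changing random patterns do so only occasionally.

```python
import random


def coactivation_counts(n_cells, n_active, n_episodes,
                        fixed_pattern=None, seed=0):
    """Count, per cell, the episodes in which it is active with the word form.

    The word form is assumed active in every episode. If fixed_pattern is
    given, the same cells are stimulated each time; otherwise a new random
    set of n_active cells is drawn per episode (toy stand-in for noise).
    """
    rng = random.Random(seed)
    counts = [0] * n_cells
    for _ in range(n_episodes):
        if fixed_pattern is not None:
            active = fixed_pattern
        else:
            active = rng.sample(range(n_cells), n_active)
        for cell in active:
            counts[cell] += 1
    return counts


# Relevant primary area: the same 5 cells are stimulated in all 100 episodes.
grounded = coactivation_counts(25, 5, 100, fixed_pattern=[0, 1, 2, 3, 4])
# Non-relevant primary area: a new random 5-cell pattern in every episode.
noise = coactivation_counts(25, 5, 100)
```

In the grounded case, five cells are co-active with the word form in all 100 episodes, which is the kind of consistent correlation a Hebbian rule would strengthen. In the noise case, the same total amount of activity is spread thinly over all cells, so no cell is consistently paired with the word form.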
In order to connect information about actions and perceptions available in the primary cortices, activity must run through connector hub areas. Therefore, neurons in multimodal cortices are included in all types of semantic circuits to a similar degree. This explains the existence and cortical location of semantic hubs in inferior and dorsolateral prefrontal cortex and in anterior and superior temporal cortex. Our model did not include areas of the parietal cortex, but if it did, it is foreseeable that the same localisation mechanisms would apply to the additional lobar system, so that an additional 'semantic hub' in posterior parietal cortex (posterior supramarginal gyrus, intraparietal sulcus and angular gyri) might emerge. A new finding of the present work is the emergence of a degree of category-specificity also in extrasylvian hub areas. Earlier simulations by Garagnani and Pulvermüller (2016) had found no category differences in any of the hub areas. This may have been due, in part, to the reduced input to extrasylvian hub areas implicated by the absence of connections between ventral and dorsolateral prefrontal cortex and likewise between anterior inferior and posterior superior temporal cortex. As these connections have meanwhile been documented by anatomical studies (Gierhan, 2013; Yeterian et al., 2012), they were included in the present simulations, and a small but significant degree of category-specificity in these hub areas was the consequence.
A fruitful target for future research will be to investigate the possibility of category-specific semantic deficits after lesions in anterior temporal and dorsolateral prefrontal cortex. In this context, a closer look at patients in early stages of semantic dementia may be crucial, because these patients sometimes show lesions restricted to anterior and inferior temporal areas (Patterson et al., 2007). Some work in this field suggests no differences in processing different semantic categories (Lambon Ralph et al., 2007), but other studies have reported some differences, for example between colour- and form-related words (Gainotti, 2012; Pulvermüller et al., 2010). Stroke- and encephalitis-induced lesions of the multimodal parts of the left temporal lobe (corresponding to area AT in the network) have also been found to cause category-specific word processing deficits for animals, persons, and living things (Damasio et al., 1996; Gainotti, 2012; Pulvermüller et al., 2010; Warrington and Shallice, 1984). Thus, it seems that there is at least some evidence for category-specificity in the extrasylvian anterior-temporal connector hubs. Only future research can validate or falsify the model's prediction about a slight but significant category difference between object- and action-related words after focal anterior-temporal and dorsolateral prefrontal damage.
There is considerable debate about the prominence of different areas for semantic processing. Some approaches hold that true semantic processing is only present in the multimodal hubs, and that modality-preferential areas only serve an optional, 'enriching' or 'colouring' function (Mahon and Caramazza, 2008). The network model we present here offers no justification for such a view, because all parts of the distributed semantic circuits contribute to their function and there is no basis for excluding circuit parts when it comes to function. It does, however, offer an explanation of why some areas across which the circuits are distributed are functionally more important than others. Factors which come in here are the general location of an area's neurons with respect to the network's connectivity structure (topology), with gradually more functional contributions of 'central' areas than 'peripheral' ones, and, importantly, the relative CA neuron density a circuit shows across areas. In this context, the generally observed main effect of the level of area, with relatively more CA neurons in secondary than in primary areas and many more neurons in connector hub areas than in secondary ones, is of critical importance. In the previous study (Garagnani and Pulvermüller, 2016), it was not entirely clear whether the relatively high number of CA neurons (and thus circuit neuron densities) in connector hub areas was due to the stronger input these areas generally received (higher 'in-degree'), to the network topology, or to both. Here, we performed in-degree normalization (see Section 2) and thus excluded the sheer amount of activity entering an area as an explanatory factor. In spite of in-degree normalization across sub-systems, which ensured that all network areas received (on average) equal quantities of inputs, circuit cell density was still higher in the connector hub areas in the centre of the network architecture, where phonological and semantic word circuits converge. This result is consistent with the statement that network topology plays a major role in determining the prominence of connector hubs for general semantic processing. However, we note that larger circuit densities in the 'centre' of networks have also been observed with next-neighbour between-area connections only, suggesting that, apart from its 'degree' and resultant hub status as such, the 'centrality' of an area within the network is a relevant factor (Garagnani et al., 2008).
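The idea behind in-degree normalization can be sketched on a toy graph. The connectivity, area labels and the simple "divide by the number of afferent areas" scheme below are all assumptions made for illustration; the actual normalization used in the simulations is specified in the methods and may differ in detail.

```python
# Hypothetical toy connectivity: for each area, the areas projecting to it.
IN_LINKS = {
    "hub": ["a", "b", "c"],  # a hub receives from three areas
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub"],
}


def raw_input(activity, area):
    """Summed activity arriving from all afferent areas."""
    return sum(activity[src] for src in IN_LINKS[area])


def normalized_input(activity, area):
    """Afferent input scaled by the area's in-degree (assumed scheme)."""
    return raw_input(activity, area) / len(IN_LINKS[area])


# With equal activity everywhere, the hub receives three times more raw
# input than a peripheral area, but the same input after normalization.
activity = {"hub": 1.0, "a": 1.0, "b": 1.0, "c": 1.0}
raw = {area: raw_input(activity, area) for area in IN_LINKS}
norm = {area: normalized_input(activity, area) for area in IN_LINKS}
```

Under such a scheme, any remaining advantage of hub areas cannot be attributed to the sheer quantity of incoming activity, which is the logic of the argument made above.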
In sum, the present neural network simulations exhibit the spontaneous formation of semantic CA circuits distributed over modality-preferential and "higher" multimodal convergence areas and mechanistically explain the emergence in the cortex of both category-specific and general semantic processes. In addition, the use of a more realistic architecture leads to the presence of moderate category-specificity in connector hub areas outside the perisylvian region. The spontaneous formation of these semantic circuits is based on, and explained by, well-documented learning mechanisms of Hebbian synaptic plasticity and by cortical area and connectivity structure. These simulation results explain why modality-preferential areas are activated relatively more strongly by specific semantic categories, and why the connector areas become semantic hubs with relevance, to a similarly great degree, for processing all kinds of meanings.

Neurophysiological mechanisms underlying word recognition and understanding: simulating the time-course of semantic activation
The semantic circuits that had formed as a consequence of correlation learning were reactivated from the acoustic phonological end to simulate the area-specific cortical activation dynamics of spoken word understanding and to provide a functional estimate of category-general and category-specific semantic activation strength, topography, and timing. Comparison of maximum circuit activity levels per area and word type revealed a dissociation similar to that found in the structural analysis of circuit topographies reported above. In particular, object-related words activated the visual system (V1, TO) more strongly than the motor system (M1 L, PM L) and, for action-related words, motor system peak activation was relatively stronger (Fig. 5B, left-hand side). As before, the perisylvian auditory and articulatory sub-systems did not show any significant difference in amplitude between word types (see Fig. 5A, left-hand side). Activation was stronger in the connector hubs (AT, PF L, PB, PF i) than in secondary areas (TO, AB, PM L, PM i), and stronger in secondary than in primary areas (A1, V1, M1 L, M1 i). The word-category dissociation and the different activity levels predicted during simulated word-recognition processes are a direct consequence of the distinct cortical topographies of object- and action-related semantic circuits, which emerged in the model during learning, with more CA cells leading to correspondingly more activity during CA circuit ignition.
The area-specific activation time-course of the multi-area network illustrated in Figs. 6 and 7 (Brain and boxplot upper part) showed similar activation dynamics for object- and action-related words. For both word types, the perisylvian language system exhibited a cascade of activations whose peaks unfold sequentially over a period of approximately 12 simulation time-steps. Activation was first present in the primary auditory area A1, driven by the external stimulus, and then spread across the perisylvian areas, terminating in the primary articulatory area (M1 i). In contrast, activation in the sensorimotor semantic areas is near-simultaneous, with all peaks concentrated within a period of just 5 simulation time-steps (hub areas activate first, regardless of word type). This "near-simultaneous" activation of CA cells in sensorimotor areas is caused by the rich neuroanatomical connections of the convergence hub areas, which link together the different modality-preferential cortices. Therefore, upon reaching the language hubs (PB-PF i), activity leads to the simultaneous "ignition" of the CA cells present in the anterior-temporal (AT) and dorsolateral prefrontal (PF L) hub cortices, which, in turn, quickly activate the modality-preferential CAs. Thus, the inherent connectivity structure of the model leads to a near-simultaneous activation of the most richly connected hub areas as compared to the primary and secondary cortices. The multimodal hubs can be seen as a "crossroad" where information from different modality-preferential systems converges; after full ignition, CA activity gradually disappears in the multi-area network (see Fig. 4), ending in the modality-preferential areas, i.e. primary hand-motor area (M1 L) for action-related and primary visual cortex (V1) for object-related words. In other words, the modality-preferential cortices (V1 and TO for object words, M1 L and PM L for action words) activate after all other areas. Hence, on the basis of the activation dynamics exhibited by the present model, we would predict that during semantic information retrieval, activation should spread in a cascade-like fashion across the perisylvian language areas; sensorimotor areas should then activate near-simultaneously, with semantic hubs activating before the modality-preferential areas, where additional semantic information is held.
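The distinction between a sequential cascade and near-simultaneous activation can be quantified as the spread of peak latencies across a set of areas. The following Python sketch shows one way such a measure could be computed. The function names are hypothetical, and the traces are synthetic Gaussian bumps with made-up peak times; they are not simulation output and are not meant to reproduce the reported values.

```python
import numpy as np


def peak_latencies(activity):
    """Map each area name to the time-step at which its activity is maximal."""
    return {area: int(np.argmax(trace)) for area, trace in activity.items()}


def latency_spread(latencies, areas):
    """Number of time-steps between the earliest and latest peak in `areas`."""
    peaks = [latencies[a] for a in areas]
    return max(peaks) - min(peaks)


def bump(time, centre, width=2.0):
    """Synthetic activation curve peaking at `centre` (illustration only)."""
    return np.exp(-0.5 * ((time - centre) / width) ** 2)


t = np.arange(40)
# Made-up peak times, chosen only to contrast a wide and a narrow spread.
toy_activity = {
    "A1": bump(t, 5), "AB": bump(t, 8), "PB": bump(t, 11),
    "AT": bump(t, 14), "TO": bump(t, 16), "V1": bump(t, 18),
}
latencies = peak_latencies(toy_activity)
auditory_spread = latency_spread(latencies, ["A1", "AB", "PB"])
visual_spread = latency_spread(latencies, ["AT", "TO", "V1"])
```

A smaller spread for one group of areas than for another corresponds to the "near-simultaneous" versus "cascade" contrast described above; applying the same measure to averaged network output per area would yield the peak delays summarised in the boxplots.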
In a recent study (McNorgan et al., 2011), the authors argue, on the basis of within- and cross-modality feature- and concept-relatedness judgment data, that 'deep' models of semantic grounding (i.e., models which involve several processing steps between sensory, and between sensory and motor, components) are necessary to explain their results. Because our model is neuroanatomically realistic and, as such, incorporates indirect multi-step links between modality-preferential sensorimotor regions, it can be considered a neurobiologically motivated 'deep' semantic model in the sense of McNorgan et al. Therefore, we conjecture that it might also be compatible with their results, although, as our present focus was on modelling neurophysiological mechanisms, we have not attempted to replicate the outcome of their specific work.
Experimental studies analysing the latency of semantic processes in language perception suggest that semantic information provided by words is already retrieved within ~200 ms after stimulus presentation (Brown and Lehmann, 1979; Hauk et al., 2008; Preissl et al., 1995; Pulvermüller et al., 1999). Moreover, recent MEG-EEG recordings have shown that different semantic categories (visually presented) activated different cortical areas within ~150 ms; at this point in time, action words activated mostly the motor system and object words activated the visual system (Moseley et al., 2013). However, neuroimaging techniques with high temporal resolution (such as MEG and EEG) do not offer a sufficiently high spatial resolution to detect fine-grained differences between the multimodal semantic hubs and modality-preferential areas implemented in the neural network (for example, between premotor and prefrontal areas). Therefore, we further investigated the activation dynamics of the four sub-systems implemented in the model (auditory, articulatory, visual and motor) and compared their respective average activation time courses with each other and with real cortical activations observed during spoken word processing. Fig. 8 reports results from a magnetoencephalography (MEG) study investigating the temporal activation dynamics evoked by action-related words (Pulvermüller et al., 2005) and relates them to the activation time courses obtained from our model after stimulating area A1 with the 'acoustic patterns' of action- and object-related 'words'. Although the alignment of simulation time-steps and real time is always to a degree tentative, the near-simultaneous but still fast-cascading activation from superior temporal to inferior frontal and finally dorsal action-related areas exhibited by the cortical sources estimated from the MEG recordings is paralleled by the model results. Note, however, that the delay between superior temporal and inferior frontal activations is relatively longer in the simulations than in the MEG sources, thus also indicating a discrepancy. For relating simulation results more directly to empirical data, it might be advantageous to perform analogous semantic learning experiments in healthy subjects and then compare the brain and network responses to the learnt items (see also below).
In sum, the model shows a "near-simultaneous" activation time-course of the semantic areas; the semantic hubs, anterior-temporal (AT) and dorsolateral prefrontal areas (PF L), activate first, and are then followed by the modality-preferential areas carrying category-specific semantic information. The perisylvian language areas exhibited a cascade of activations, with no word type effects.
Most of the empirical studies of semantic processing performed in the past used words from real natural language, making it impossible to control the way these words have been learned, or to isolate the relevant semantic features from the many other putatively confounding psycholinguistic and psychological features distinguishing the different lexical classes from each other (Kemmerer, 2014; Pulvermüller, 1999; Vigliocco et al., 2011). A well-designed word learning experiment employing neuroimaging methods with high spatial and temporal resolution (EEG/MEG and fMRI) is needed to test the validity of the present model's results and predictions, and to identify where the neural correlates of novel object- and action-related words emerge in the brain, and at which point in time of the recognition process their activation occurs.

Summary and conclusions
Current neurosemantic theories still diverge about the role of category-specific and category-general semantic mechanisms and about the contribution of modality-preferential and multimodal ('amodal') brain systems in semantic processing (Barsalou, 2008; Bookheimer, 2002; Devlin et al., 2003; Gallese and Lakoff, 2005; Martin and Chao, 2001; Patterson et al., 2007; Pulvermüller, 2005; Warrington and McCarthy, 1987). Here we applied a neural-network model replicating anatomical and physiological features of a range of cortical areas, including sensorimotor, multimodal and language areas, to investigate the neurobiological mechanisms underlying conceptual semantic grounding of words in action- and object-related information. The word learning simulations documented the spontaneous emergence of word/symbol-specific, tightly interconnected cell assemblies within the larger networks, each binding articulatory-acoustic word forms to sensorimotor semantic information. Due to network structure, connectivity, and Hebbian associative learning, which maps neuronal correlations, the emerging 'semantic circuits' for object- and action-related words exhibited category-specificity primarily in modality-preferential areas; the "higher" multimodal connector hub areas central to the network architecture showed only moderate category-specificity (Figs. 3 and 4). Due to their central position in the model architecture, connector hubs showed the highest cell densities of both types of semantic circuits, therefore acting as 'semantic hubs'. Word category dissociations were confirmed by the reactivation of the cell assembly circuits during simulated word recognition and comprehension processes. The model's results, which can be compared with real experimental data (see Fig. 8), predict a symmetrical temporal activation for object- and action-related words, with the semantic hub areas activating first and modality-preferential ones slightly later (Figs. 6 and 7). Interestingly, the extrasylvian system relevant for semantic processing of a given word category activated with a delay relative to the other, non-relevant system: strong dorsal motor system activations were preceded by weak ventral visual system activations to action words, while strong ventral visual activations to object words were preceded by weak dorsal motor processes (Fig. 5). This observation (prediction) also calls for future experimental testing. The present simulations demonstrate that realistic neurocomputational models can elucidate aspects of semantic processing in the cortex and integrate findings from neuroimaging studies. In sum, the model illustrates the spontaneous emergence of both category-specific and general semantic hub areas and, on the basis of well-established neuroscience principles, offers a mechanistic explanation of where and when meaning is processed in the brain.
Cells produce a graded response that represents the average firing rate of the neuronal cluster; in particular, the output (transformation function) of an excitatory cell e at time t is:
$$O(e,t) = \begin{cases} 0 & \text{if } V(e,t) \le \varphi \\ V(e,t) - \varphi & \text{if } \varphi < V(e,t) \le \varphi + 1 \\ 1 & \text{otherwise} \end{cases} \qquad \text{(A2)}$$
O(e,t) represents the average (graded) firing rate (number of action potentials per time unit) of cluster e at time t; it is a piecewise-linear sigmoid function of the cell's membrane potential V(e,t), clipped into the range [0,1] and with slope 1 between the lower and upper thresholds φ and φ+1. The output O(i,t) of inhibitory cell i is 0 if V(i,t) < 0, and V(i,t) otherwise. In excitatory cells, the value of the threshold φ in Eq. (A2) varies in time, tracking the recent mean activity of the cell so as to implement neuronal adaptation (Kandel et al., 2000). Thus, stronger activity leads to a higher threshold in subsequent time-steps. More precisely,
$$\varphi(e,t) = \alpha \cdot \omega(e,t) \qquad \text{(A3)}$$
where ω(e,t) is the time-average of cell e's recent output and α is the "adaptation strength" (see below for the exact parameter values used in the simulations). For an excitatory cell e, the approximate time-average ω(e,t) of its output O(e,t) is estimated by integrating the linear differential equation Eq. (A4.1) with time constant τ_A, assuming initial average ω(e,0) = 0.
Local (lateral) inhibitory connections (see Fig. 1C) and area-specific inhibition are also implemented, realising, respectively, local and global competition mechanisms (Duncan, 1996, 2006) and preventing activation from falling into non-physiological states (Braitenberg and Schüz, 1998). More formally, in Eq. (A1) the input V_In(e,t) to each excitatory cell of the same area includes an area-specific ("global") inhibition term k_S·ω_S(e,t), which is subtracted from the total sum of the inhibitory and excitatory postsynaptic potentials (I/EPSPs) V_In in input to the cell, with ω_S(e,t) defined by Eq. (A4.2). The low-pass dynamics of the cells (Eqs. (A1), (A2) and (A4.1,2)) are integrated using the Euler scheme with step size Δt, where Δt = 0.5 ms.
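The output functions and the adaptive threshold described above can be sketched in Python as follows. The piecewise-linear output, the threshold φ = α·ω and the Euler step of 0.5 ms follow the text. The leaky-integrator form assumed for the running average, τ_A·dω/dt = −ω + O, is an assumption (a standard low-pass equation consistent with the verbal description), and the values chosen for `alpha` and `tau_a` are hypothetical placeholders, not the parameters used in the simulations.

```python
import numpy as np


def output_excitatory(v, phi):
    """Piecewise-linear sigmoid: 0 below phi, slope 1, saturating at 1."""
    return np.clip(v - phi, 0.0, 1.0)


def output_inhibitory(v):
    """Inhibitory output: 0 for negative potentials, v otherwise."""
    return np.maximum(v, 0.0)


def simulate_adaptation(v_trace, alpha, tau_a, dt=0.5):
    """Euler-integrate the ASSUMED form tau_a * d(omega)/dt = -omega + O.

    Returns the cell's output and its running average at every step.
    """
    omega = 0.0                                # initial average omega(e, 0) = 0
    outputs, averages = [], []
    for v in v_trace:
        phi = alpha * omega                    # adaptive threshold
        out = float(output_excitatory(v, phi))
        omega += (dt / tau_a) * (out - omega)  # one Euler step of size dt
        outputs.append(out)
        averages.append(omega)
    return np.array(outputs), np.array(averages)


# Constant membrane potential; alpha and tau_a are placeholder values.
outputs, averages = simulate_adaptation(np.full(200, 0.8), alpha=0.5, tau_a=10.0)
```

Under a constant membrane potential the output starts at its unadapted value and then declines as the running average, and with it the threshold, builds up, which is the adaptation effect described in the text.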
Excitatory links within and between (possibly non-adjacent) model areas are established at random and limited to a local (topographic) neighbourhood; weights are initialized at random, in the range [0, 0.1]. The probability of a synapse being created between any two cells falls off with their distance (Braitenberg and Schüz, 1998) according to a Gaussian function clipped to 0 outside the chosen neighbourhood (a square of size n=19 for excitatory and n=5 for inhibitory cell projections). This produces a sparse, patchy and topographic connectivity, as typically found in the mammalian cortex (Amir et al., 1993; Braitenberg and Schüz, 1998; Douglas and Martin, 2004; Kaas, 1997).
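A minimal Python sketch of this connectivity scheme for a single projecting cell is given below. The neighbourhood size n=19 and the weight range [0, 0.1] are taken from the text, and the 25×25 grid follows from 7500 excitatory elements spread over 12 areas. The Gaussian width `sigma`, the peak probability `p_max`, the function name and the treatment of area borders (no wrap-around) are assumptions made only for illustration.

```python
import numpy as np


def sample_projection(src, grid=25, n=19, sigma=4.5, p_max=0.3, rng=None):
    """Sample the links of one cell at position `src` onto a grid x grid area.

    Returns the connection probabilities and the sampled weight matrix;
    a weight of 0 means that no link was created.
    """
    rng = np.random.default_rng() if rng is None else rng
    rows, cols = np.indices((grid, grid))
    dr, dc = rows - src[0], cols - src[1]

    # Gaussian fall-off with distance from the projecting cell.
    prob = p_max * np.exp(-(dr ** 2 + dc ** 2) / (2.0 * sigma ** 2))

    # Clip to 0 outside the n x n square neighbourhood.
    half = n // 2
    prob[(np.abs(dr) > half) | (np.abs(dc) > half)] = 0.0

    links = rng.random((grid, grid)) < prob
    weights = np.where(links, rng.uniform(0.0, 0.1, size=(grid, grid)), 0.0)
    return prob, weights


prob, weights = sample_projection((12, 12), rng=np.random.default_rng(0))
```

Calling the function with n=5 would give the narrower neighbourhood used for inhibitory projections; repeating it for every cell yields the sparse, topographic pattern described above.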
The Hebbian learning mechanism implemented simulates well-documented synaptic plasticity phenomena of long-term potentiation (LTP) and depression (LTD), as implemented by Artola, Bröcher and Singer (Artola and Singer, 1993; Artola et al., 1990). This rule, which covers both "true" Hebbian co-occurrence ("what fires together wires together") and decorrelative "anti-Hebb" ("neurons out of sync delink") plasticity, provides a realistic approximation of known experience-dependent neuronal plasticity and learning (Finnie and Nader, 2012; Malenka and Bear, 2004; Rioult-Pedotti et al., 2000). In the model, we discretized the continuous range of possible synaptic efficacy changes into two possible levels, +Δw and −Δw (with Δw≪1 and fixed). Following Artola et al., we defined as "active" any link from an excitatory cell x such that the output O(x,t) of cell x at time t is larger than θ_pre, where θ_pre ∈ ]0,1] is an arbitrary threshold representing the minimum level of presynaptic activity required for LTP (or LTD) to occur. Thus, given any two cells x and y connected by a synaptic link with weight w_t(x,y), the new weight w_{t+1}(x,y) is calculated as follows:
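As a rough illustration of a discretized rule of this kind, the following Python sketch assumes a three-case scheme with two postsynaptic thresholds: potentiation when an active link meets strong postsynaptic depolarization, depression when an active link meets intermediate depolarization, and depression when an inactive link meets strong depolarization. The thresholds `theta_minus` and `theta_plus`, the function name and all numeric defaults are assumptions introduced here for illustration; only θ_pre and the fixed step ±Δw are taken from the text.

```python
def update_weight(w, pre_out, post_pot, dw=0.001,
                  theta_pre=0.05, theta_minus=0.15, theta_plus=0.25):
    """Return the new weight of a link from cell x to cell y.

    pre_out  : output O(x, t) of the presynaptic cell
    post_pot : membrane potential V(y, t) of the postsynaptic cell
    All threshold values are hypothetical placeholders.
    """
    link_active = pre_out > theta_pre          # "active" link, as in the text

    if link_active and post_pot >= theta_plus:
        return w + dw                          # LTP
    if link_active and theta_minus <= post_pot < theta_plus:
        return w - dw                          # assumed homosynaptic LTD
    if not link_active and post_pot >= theta_plus:
        return w - dw                          # assumed heterosynaptic LTD
    return w                                   # no change


w_ltp = update_weight(0.05, pre_out=0.5, post_pot=0.30)
w_ltd_homo = update_weight(0.05, pre_out=0.5, post_pot=0.20)
w_ltd_hetero = update_weight(0.05, pre_out=0.0, post_pot=0.30)
w_same = update_weight(0.05, pre_out=0.0, post_pot=0.00)
```

Correlated pre- and postsynaptic activity therefore strengthens a link by one fixed step, whereas uncorrelated activity weakens it by the same step, matching the "fires together" and "out of sync" descriptions given above.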

Fig. 1. Model of lexical and semantic mechanisms: The 12 cortical areas modelled (A), their global connectivity architecture (B), and aspects of the micro-structure of their connectivity (C) are illustrated. (A) Six perisylvian (i) and six extrasylvian (ii) model areas are shown, each including a dorsolateral (frontal) and a ventral (temporal) part: (i) perisylvian cortex includes an articulatory system (red colours), comprising inferior-prefrontal (PF i), premotor (PM i) and primary motor cortex (M1 i), and an auditory system (areas in blue), comprising auditory parabelt (PB), auditory belt (AB) and primary auditory cortex (A1). These areas can store correlations between neuronal activations carrying articulatory-phonological and corresponding acoustic-phonological information, for example when phonemes, syllables and spoken word forms are being articulated (activity in M1 i) and acoustic features of these spoken words are simultaneously perceived (stimulation of primary auditory cortex, A1). (ii) Extrasylvian areas include a motor system (yellow to brown), comprising dorsolateral prefrontal (PF L), premotor (PM L) and primary motor cortex (M1 L), and a "what" visual stream of object processing (green), comprising anterior-temporal (AT), temporo-occipital (TO) and early visual areas (V1). Together with the perisylvian areas, these extrasylvian areas can store correlations between neuronal activations carrying semantic information, for example when words are used (activity in all perisylvian areas) to speak about objects present in the environment (activity in V1, TO, AT) or about actions that the individual engages in (activity in M1 L, PM L, PF L). Numbers indicate Brodmann Areas (BAs). (B) Schematic illustration of all 12 model areas and the known between-area connections implemented. The colours indicate correspondence between cortical and model areas. (C) Micro-connectivity structure of one of the 7500 single excitatory neural elements modelled (labelled "e"). Within-area excitatory links (in grey) to and from "cell" e are limited to a local (19×19) neighbourhood of neural elements (light-grey area). Lateral inhibition between e and neighbouring excitatory elements is realised as follows: the underlying cell 'i' inhibits e in proportion to the total excitatory input it receives from the 5×5 neighbourhood (dark-purple shaded area); by means of analogous connections (not depicted), e inhibits all of its neighbours. Each pair (e,i) of model cells is taken to represent an entire cluster or column (grey matter under approximately 0.25 mm² of cortical surface) of pyramidal cells and the inhibitory interneurons therein. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Fig. 2 illustrates six of the twelve CA-cell distributions for object- (A) and action-related (B) words, as they spontaneously emerged during simulated word learning (the other CAs produced similar results). Each set of 12 squares is a snapshot of the CA distribution

Fig. 2. Distributions of cell assemblies (CAs) emerging in the 12-area network during simulation of word learning in the semantic context of visual (A) and action (B) perceptions. Results of one typical instantiation of the model in Fig. 1 are shown, using the same area labels. Each set of 12 squares (in black) illustrates the distribution of "cells" of one specific CA across the 12 network areas. Each white pixel in a square indexes one CA cell. CAs for object-related words extend into higher and primary visual cortex (V1, TO, but not M1 L), linking information about spoken word forms (perisylvian pattern) with information from the visual modality (neural pattern in V1). Network correlates of action-related words extend into lateral motor cortex (M1 L, PM L, but not V1), thus semantically grounding words in information about actions. For convenience, the area structure of the network is repeated at the top.

Fig. 3. Average distributions of CAs emerging in 12 instantiations of the 12-area network architecture during simulation of word learning in the semantic context of actions and visual perceptions. Bars show average numbers of CA neurons per area for object- (dark grey) and action-related (light grey) word representations; error bars show standard errors over networks. (A) Data from the six perisylvian areas, whose cells can be seen as circuit correlates of spoken word forms, do not show category-specific effects. (B) The extrasylvian areas, whose cells can be seen as circuit correlates of word meaning, show a double dissociation, with relatively more strongly developed CAs for object- than for action-related words in primary and secondary visual areas (V1, TO), but stronger CAs for action-related than for object-related words in dorsolateral primary motor and premotor cortices (PM L, M1 L). Asterisks indicate that, within a given area, the number of CA neurons significantly differed between the circuits of action and object words (Bonferroni-corrected planned comparison tests, 24 comparisons; critical threshold p < .0020).

Fig. 4. Activation spreading in the 12-area network showing the simulated recognition of object- (blue pixels) and action-related (red pixels) words. Yellow pixels illustrate the overlap between the two word-related CAs. Network responses to stimulation of A1 with the "auditory" patterns of two of the learned words are shown; each set of 12 "squares" depicts a selected snapshot of the entire network's activity (as in Fig. 2). Cell activity levels are indicated by brightness of pixels; snapshot numbers indicate simulation time-steps of the network output. See main text for details. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Fig. 5. Bar plots illustrating the amount of activity ("peak amplitude", left-hand side) and the activation time-course ("peak delay", right-hand side) of auditory and articulatory (A) and visual and motor (B) areas for object- and action-related words during auditory word recognition.

Fig. 6. Spatio-temporal activation patterns of the six perisylvian model areas. All curves (bottom part of each panel) illustrate area-specific activation dynamics plotted against time during the neurophysiological word recognition processes (time is in simulation time-steps). Brain schematics (at the top of each panel) highlight the cortical locations of the areas for each specific activation curve and peak. The latency of maximal activation together with standard errors (boxes) and standard deviations (whiskers) are illustrated by a given boxplot. The small horizontal segment indicates stimulus onset and offset.

Fig. 8. Comparison of real and simulated brain activations elicited by specific semantic word categories. (A) Time course of activation of cortical areas elicited by passive presentation of spoken action words and determined using magnetoencephalography (MEG) and distributed source localizations. Action words elicited sequential but near-simultaneous activations in left superior temporal, inferior frontal and superior central cortex. The average latency of maximal activation in the four ROIs is reported together with the standard errors (boxes; bars indicate 1.96 SE, data adapted from Pulvermüller et al., 2005). The boxplots in panels B and C illustrate results from the corresponding simulated activation time-courses. The point in time at which stimulus-evoked activity peaks in each of the four modelled sub-systems (auditory, articulatory, visual and motor systems) is plotted against time given in simulation time-steps. Boxes give standard errors and whiskers standard deviations. The average was computed across the 12 different networks and calculated separately for (B) action- and (C) object-related words. Notice that the respective non-relevant sub-systems (visual for action- and motor for object-related words) are not illustrated here, as the activation levels are relatively low.