Hierarchical modular structure enhances the robustness of self-organized criticality in neural networks

One of the most prominent architectural properties of neural networks in the brain is their hierarchical modular structure. How does this structural property constrain or improve brain function? It is thought that operating near criticality can be beneficial for brain function. Here we find that, compared to non-modular networks, networks with modular structure can extend the parameter region of coupling strength over which critical states are reached. Moreover, we find that one aspect of network function, the dynamical range, is highest in the same parameter region. Thus, hierarchical modularity enhances the robustness of criticality as well as function. However, too much modularity constrains function by preventing the neural networks from reaching critical states, because the modular structure limits the spreading of avalanches. Our results suggest that the brain may take advantage of its hierarchical modular structure to attain criticality and enhanced function.


Introduction
In recent years, there has been much interest in the structure of complex networks arising from real-world systems [1][2][3][4]. The structure of a network often determines its function as well as the character of the dynamics occurring on it. Neural networks in the brain, as typical complex systems, have a highly intricate architecture. Anatomical studies have found that one prominent feature of the architecture of the brain is that the cortical network is organized in a hierarchical modular way, from cellular micro-circuits in cortical columns, via cortical areas, to clusters of highly connected brain regions [5][6][7][8]. Neurons within a column, an area or a region are more densely linked with each other than with neurons in the rest of the network. It has been demonstrated that the hierarchical modular structure satisfies constraints imposed by evolution and development [9]. The structure of neural networks determines assemblies of neurons among which interactions are more effective [10][11][12]. Can the brain utilize the hierarchical modular structure to improve its functional performance, and how? Here we use a numerical model to study how the hierarchical modular structure affects cascades of activity, i.e. neuronal avalanches, and the dynamical range of neural networks.
Self-organized criticality (SOC) is one of the key concepts for describing the emergence of complexity in systems consisting of a large number of components. The concept asserts that a system of interconnected nonlinear elements self-organizes into a state with properties like those exhibited by critical states [13]. SOC is usually identified by a power-law distribution of avalanche sizes that is obtained without fine-tuning of parameters. At the critical state, a small perturbation can induce a sizable response in the system. SOC has been found in many complex systems in nature, such as earthquakes, forest fires and avalanches [13,14]. Neural networks, as typical complex systems, have been shown to operate at SOC states: SOC of neural systems was obtained in neural network models [15][16][17][18][19], observed in experiments in vitro [20][21][22][23][24], and has potential functional advantages [20,23,[25][26][27][28][29][30].
In the brain, critical states are robust, as found in several experiments in vivo [31][32][33][34]. Criticality is robustly present in environments and configurations much more complex than those in numerical models, which often ignore many aspects of real neural systems. For example, in the real brain, the average synaptic strength increases during learning [35]. This raises the question of how the brain utilizes its structural and dynamical properties to maintain and improve the robustness of critical states. In a recent work, the robustness of SOC was investigated by considering more realistic dynamical properties in neural network models. It was shown that a biologically realistic type of synaptic plasticity (short-term depression) can extend the critical region in a globally coupled network and can therefore enhance the robustness against variation of the coupling strength between neurons [17]. The effect of network structure on SOC has been studied on complex networks, including scale-free [36] and small-world networks [37]. However, little is known about the effect of hierarchical modular networks (HMNs) on SOC. Here we study the role of HMN structure in improving the robustness of SOC in neural networks. Intuitively, the modular structure limits the size of avalanches in neural networks because of the sparser connections between modules. In this work, using computer simulations, we show that, surprisingly, the critical region can nevertheless be extended significantly by the modular structure. The avalanche dynamics on modular networks is distinct from that on homogeneous networks. The extended criticality brings high functional performance in terms of a larger dynamical range in the response to stimuli. This work also raises several fundamental questions for future investigation.

Model
To study the effect of HMNs on the robustness of SOC, we use the neural network model proposed in [17] and modify the underlying network structure. We begin with a directed random network in which each possible directed connection is built with probability p. The average degree of the network is k = pN, which equals both the average input and the average output degree. The adjacency matrix is denoted by {a_ij}, with a_ij = 1 if neuron j has an output link to neuron i and a_ij = 0 otherwise. We modify the random network into an HMN in the following way. First, the network is divided into two modules of the same size by rewiring the links between them: we randomly divide the neurons into two groups of equal size, and then, with probability R, each inter-group link (i ← j) is cut and node j is connected instead to a node randomly selected from its own group. In this way we obtain a one-level modular network. To generate a network with (l + 1) levels, each module of the l-level network is split into two submodules and the inter-module connections are rewired into the modules with the same probability R. For an l-level network, the number of modules is m = 2^l. The model can easily be generalized by splitting a module into more than two submodules. With this rewiring method, a homogeneous network is changed into a modular network while the total number of edges is maintained, so the modular networks can be compared directly with homogeneous networks. The matrix of connection densities between modules and an example of the adjacency matrix of the neurons are shown in figure 1. Note that the average number of connections per node does not change as the network becomes modular with heterogeneous connection density within and between modules.
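The rewiring construction above can be sketched as follows. This is a minimal illustration under our own assumptions: the function name and NumPy representation are our choices, each module is halved deterministically rather than by a random split, and a rewired link is dropped when no free target remains in the source's group (which can very slightly reduce the edge count in dense networks).

```python
import numpy as np

def hierarchical_modular_network(N, p, R, levels, seed=None):
    """Sketch of the HMN construction: start from a directed
    Erdos-Renyi network (a[i, j] = 1 means j -> i), then at each level
    split every module in two and, with probability R, redirect each
    inter-module link back into the source node's own half."""
    rng = np.random.default_rng(seed)
    a = (rng.random((N, N)) < p).astype(int)
    np.fill_diagonal(a, 0)                       # no self-loops
    modules = [np.arange(N)]                     # current partition
    for _ in range(levels):
        next_modules = []
        for mod in modules:
            half = len(mod) // 2
            g1, g2 = mod[:half], mod[half:]      # simplification: deterministic split
            for src, dst in ((g1, g2), (g2, g1)):
                # links i <- j with j in src, i in dst are inter-module
                rows, cols = np.nonzero(a[np.ix_(dst, src)])
                for r, c in zip(rows, cols):
                    if rng.random() < R:
                        i, j = dst[r], src[c]
                        a[i, j] = 0              # cut the inter-module link
                        # reconnect j to a free node within its own group
                        cand = src[(src != j) & (a[src, j] == 0)]
                        if cand.size:
                            a[rng.choice(cand), j] = 1
            next_modules += [g1, g2]
        modules = next_modules
    return a, modules
```

With R = 1 every inter-module link at each level is redirected, so the one-level network decomposes into two disconnected halves; with R = 0 the original random network is returned unchanged.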
(In figure 1, the connection density within a module is the ratio of the number of intramodule links to the number of possible links within the module; the connection density between two modules is the ratio of the number of links between them to the number of possible links between them.)

The network consists of N integrate-and-fire neurons, characterized by membrane potentials V_i(t). Whenever a membrane potential crosses the spiking threshold θ = 1, an action potential is generated and the membrane potential is reset by subtracting the threshold, V_r = V − θ. The spike is delivered to the postsynaptic neurons after a fixed delay τ_d. The dynamics of the network is described by the equation [17]

$$\dot V_i(t) = I^{\mathrm{ext}} \sum_{t^{\mathrm{ext}}} \delta_{i,\xi(t^{\mathrm{ext}})}\,\delta(t - t^{\mathrm{ext}}) + \frac{1}{k}\sum_{j=1}^{N} \sigma_{ij}\,\delta(t - t^{\mathrm{sp}}_{j} - \tau_d), \qquad (1)$$

where k is the average degree of the nodes. The network receives external inputs through a random process ξ ∈ {1, . . . , N} that selects a neuron ξ(t^ext) = i at the input times t^ext, arriving at a slow rate (on a timescale τ ≫ τ_d), and advances its membrane potential V_i by an amount I_ext. σ_ij is the interaction strength between neurons within the network. Experiments [38, 39] show that synapses have short-term plasticity: synaptic transmission changes because of the consumption and recovery of neurotransmitter. In this model, σ_ij = a_ij u J_ij. The variable J_ij represents the synaptic efficacy and obeys the dynamics

$$\dot J_{ij}(t) = \frac{1}{\tau_J}\left(\frac{\alpha}{u} - J_{ij}\right) - u\,J_{ij}\,\delta(t - t^{\mathrm{sp}}_{j}). \qquad (2)$$

Upon arrival of an action potential from neuron j at time t_j^sp, neurotransmitter is released with a certain probability u, and the synaptic efficacy is decreased by u J_ij in equation (2). If the presynaptic neuron is silent, the synapse recovers and its strength J_ij approaches the value α/u on the slow timescale τ_J. The parameter α determines the maximal connection strength and hence the strength of each synaptic signal u J_ij, so α is called the coupling strength. Owing to the plasticity, the effective coupling in the network, J_ij u, is expected to be smaller than α. It also depends on the network activity, which is in turn controlled by the coupling strength α in a non-trivial manner. In the following we report results for the parameters p = 0.012, I_ext = 0.025, u = 0.2 and τ_J = 2000; the qualitative features are not sensitive to these values. In most simulations we consider networks with N = 2^10 = 1024 nodes; the effects of the network size N are also analyzed. When a neuron spikes, the membrane potentials of its network neighbors are advanced and they may fire in turn. The firing propagates through the network and an avalanche is activated. In this model, an avalanche is defined as a period of activity that is initiated by an external input and terminates when no further neuron becomes activated [17].
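To make the dynamics concrete, here is a minimal discrete-time sketch of the avalanche process described above. It is a simplification under our own assumptions, not the authors' code: synaptic recovery is applied once per external-drive step rather than continuously, neurons are driven until one crosses threshold (slow-drive limit), and a size cap guards against runaway supercritical avalanches; all names are illustrative.

```python
import numpy as np

def simulate_avalanches(a, alpha, n_avalanches=200, u=0.2, tau_J=2000.0,
                        I_ext=0.025, theta=1.0, seed=0):
    """Sketch of the integrate-and-fire dynamics with depressing
    synapses: drive slowly, then propagate each avalanche to
    completion and record its size (number of spikes).
    a[i, j] = 1 means neuron j projects to neuron i."""
    rng = np.random.default_rng(seed)
    N = a.shape[0]
    k = a.sum() / N                        # average degree
    V = rng.uniform(0.0, theta, N)         # membrane potentials
    J = np.full((N, N), alpha / u)         # synaptic efficacies
    sizes = []
    for _ in range(n_avalanches):
        # slow external drive until some neuron crosses threshold
        while V.max() < theta:
            V[rng.integers(N)] += I_ext
            J += (alpha / u - J) / tau_J   # slow recovery between inputs
        size = 0
        active = np.flatnonzero(V >= theta)
        while active.size and size < 50 * N:   # cap guards against runaway
            size += active.size
            V[active] -= theta                 # reset by the threshold
            inp = (a[:, active] * (u * J[:, active])).sum(axis=1) / k
            J[:, active] *= (1.0 - u)          # short-term depression
            V += inp                           # deliver synaptic signals
            active = np.flatnonzero(V >= theta)
        sizes.append(size)
    return np.array(sizes)
```

For weak coupling the resulting avalanche-size distribution is dominated by small events (subcritical); increasing alpha shifts weight toward large events, qualitatively as described in the next section.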

Criticality is achieved on hierarchical modular networks
First we compute the distribution of avalanche sizes P(L) on HMNs generated with rewiring probability R = 0.9 and l = 3 (as shown in figure 1). In figure 2, the avalanche distributions for different coupling strengths α are compared. When the coupling strength is weak, the number of large avalanches extending to the scale of the whole network is negligible, corresponding to subcritical states. When the coupling strength is strong, the distribution of avalanche sizes is non-monotonic and the number of large avalanches is excessive; this state is called supercritical. At intermediate values of the coupling strength α, the distribution of avalanche sizes follows a power law with an exponential tail. The transition of the network dynamics is similar to that on globally coupled networks [17]. Thus, neural networks with hierarchical modular structure can exhibit critical avalanche dynamics similar to that of homogeneous networks.
Next we study whether the modular structure enhances or weakens the robustness of critical states compared with homogeneous networks. The robustness of critical states is represented by the width of the parameter region of the coupling strength α over which networks reach approximately critical states. To evaluate the critical region, we fit the distributions of avalanche sizes P(L) obtained in simulations to a power-law function, ignoring the cutoff of the distributions (L > N/2) as in [17]. The mean-square deviation (denoted by Δ) of the distribution from an exact power law is used to evaluate whether the distribution follows a power-law function [17]. Figure 3(a) shows the relation between Δ and the coupling strength α for random networks and modular networks with various rewiring probabilities R. Notably, the region of small deviation is extended by the modular structure compared with random networks. When the rewiring probability is too large, e.g. R = 0.99, the value of Δ increases dramatically and the distribution of avalanche sizes is far from a power-law function. The exponent γ of the avalanche distributions as a function of the coupling strength is shown in figure 3(b). In the region of small deviation Δ, the exponent of the distribution for both modular networks and random networks is close to −1.5. Thus the avalanche dynamics on HMNs belongs to the same class of critical states as that on globally coupled networks [17], i.e. it behaves effectively as a critical branching process [13]. For networks of finite size, the α value corresponding to the minimal deviation Δ_min is regarded as the critical state. The relation between the minimal deviation Δ_min and the rewiring probability R is shown in figure 4(a). Δ_min first decreases and then increases as R increases.
A smaller Δ_min indicates that the modular network structure can bring the distribution of avalanche sizes closer to an exact power-law function than random networks. The distributions of avalanche sizes at the critical point of random networks and of HMNs with R = 0.9 are shown in the inset of figure 4(a). The cutoff of the distribution is moved to the right by HMNs, so the deviation of the distribution from the power-law function is reduced; this is the reason for the smaller Δ_min. The width of the critical region can be used as an indication of the robustness of SOC with respect to the coupling strength α. To measure the width of the critical region, we use Δ_th = 2 × Δ_min(R = 0) as an ad hoc threshold on the value of Δ for both random networks and HMNs of a given network size, where Δ_min(R = 0) is the minimal deviation for a random network of the same size. Such a choice is reasonable because it takes the effects of finite network size into account (in general, Δ_min is larger for smaller N). Figure 4(b) shows the relation between the width of the critical region and the rewiring probability R. The width of the critical region is significantly extended by HMNs.
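The deviation criterion can be sketched as follows. This is our own minimal reading of the procedure: a least-squares fit of log P(L) against log L over the observed sizes up to the cutoff L = N/2, with the mean-square residual as the deviation; function and variable names are illustrative.

```python
import numpy as np

def powerlaw_deviation(sizes, N):
    """Fit log10 P(L) versus log10 L up to the cutoff L = N/2 and
    return the mean-square deviation of the data from the fitted
    power law, together with the fitted exponent gamma."""
    sizes = np.asarray(sizes)
    counts = np.bincount(sizes, minlength=N // 2 + 1)[1:N // 2 + 1]
    L = np.arange(1, N // 2 + 1)
    mask = counts > 0                     # only observed sizes enter the fit
    x = np.log10(L[mask])
    y = np.log10(counts[mask] / sizes.size)
    gamma, intercept = np.polyfit(x, y, 1)
    deviation = np.mean((y - (gamma * x + intercept)) ** 2)
    return deviation, gamma
```

Applied to sizes drawn from an exact power law with exponent −1.5, the fitted exponent comes out near −1.5 and the deviation is small; for strongly non-power-law data the deviation grows.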
Additionally, the distributions of avalanche sizes of the networks at the boundaries of the critical region (Δ = Δ_th) are shown in figures 4(c) and (d) for random networks and HMNs, respectively. Notably, at the tail of the distributions, the distances between the curve of the critical state and the curves at the boundaries are larger for HMNs than for random networks. This results from the fact that the cutoff of the HMNs' distribution function moves to the right. We will show later that the extension of the approximate critical region manifests itself in the networks' functional properties in terms of response sensitivity to external stimuli. The distribution on the upper boundary of the critical region of HMNs (figure 4(d)) appears supercritical. We can select lower thresholds to reduce the difference between the boundary and the critical state, but the conclusion that HMNs can enhance the range of effective criticality remains valid, as seen in the inset of figure 4. The modular network that extends the critical region has distinctly different structural properties from a homogeneous network. We calculate the fraction f_ext of connections between the m (= 2^l) modules of the l-level network. For three-level networks, when R = 0 the network is a random network and f_ext = (m − 1)/m = 7/8. When R = 0.9 the fraction f_ext decreases to about 0.14. The formation of modules can be quantified by the modularity Q [40], defined as Q = Σ_k (e_kk − a_k²), where e_kl is the fraction of all links in the network that connect nodes of module l to nodes of module k, and a_k = Σ_l e_kl [10]. When R = 0.9, the modularity of the m modules is 0.73 for three-level HMNs with network size N = 1024.
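The modularity Q can be computed directly from the adjacency matrix and a module partition. A minimal sketch follows; for the directed networks used here we take a_k as the row sum of e, which is one convention among several for directed graphs, and the names are illustrative.

```python
import numpy as np

def modularity(a, modules):
    """Newman modularity Q = sum_k (e_kk - a_k^2) for a given
    partition, where e[k, l] is the fraction of all links that run
    from module l to module k and a_k = sum_l e[k, l]."""
    labels = np.empty(a.shape[0], dtype=int)
    for idx, mod in enumerate(modules):
        labels[mod] = idx
    m = len(modules)
    E = a.sum()                           # total number of links
    e = np.zeros((m, m))
    rows, cols = np.nonzero(a)            # link j -> i stored at a[i, j]
    for i, j in zip(rows, cols):
        e[labels[i], labels[j]] += 1.0 / E
    a_k = e.sum(axis=1)
    return float(np.sum(np.diag(e) - a_k ** 2))
```

For two equal disconnected clusters the partition into those clusters gives Q = 0.5, the maximum for a two-module split.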
In figure 5, we show the relation between the minimal deviation Δ_min at critical states and the modularity Q of HMNs with different levels and sizes. The enhancement of robustness compared with random networks is obtained in hierarchical networks with different numbers of levels. For networks of size N = 512, the optimal numbers of levels, which give the lowest values of Δ_min, are l = 2 and 3, as shown in figure 5(a). For network size N = 1024, the optimal number of levels is l = 3, as shown in figure 5(b). So the optimal level that gives the lowest Δ_min depends on the network size. Figure 5 suggests that the larger the network, the more modules are needed for optimal enhancement of the robustness of criticality in HMNs. The results indicate that the optimal number of levels increases slowly with the network size, likely following a logarithmic dependence on N. A systematic exploration of this relationship is currently beyond our capacity, because the amount of computation increases very rapidly with network size. An analytical treatment might help to reach a better understanding of this important issue in the future. For HMNs with the optimal number of levels, the lowest Δ_min occurs at an intermediate modularity Q. Similarly, it has been shown that modular networks with a certain intermediate modularity support the capacity of the cortex to segregate and integrate multisensory information simultaneously [41, 42].
We also note that the critical states observed on HMNs (l > 1) can be recovered in modular networks without hierarchy (l = 1). However, hierarchical organization allows the networks to display criticality over a broader range of larger modularity (figure 5), in general broader with more levels in the hierarchy. This is another indication of robustness with respect to the variation of network structure: with hierarchical organization, the system can achieve SOC over a broad range of rewiring probabilities R.

Avalanche dynamics on hierarchical modular networks
In this section, we study the effect of HMN structures on the avalanche dynamics of the neural networks. The difference in avalanche dynamics between a random network and three-level modular networks with the same coupling strength α = 1.2 is illustrated by the spike rasters in figure 6. Compared with the random network (figure 6(a)), the avalanche pattern is changed significantly by modular structures (figures 6(b) and (c)). On the modular network with R = 0.9, some avalanches are confined within modules and large avalanches are fewer (figure 6(b)). When R = 0.99, large avalanches are reduced further and more avalanches are confined within modules, as shown in figure 6(c).
We can quantify the effect of modular structure on the spreading of avalanches by counting how many modules are involved in an avalanche. For avalanches of moderate size (200 < L < 300), the distribution of the number of modules involved is presented in figure 7. For random networks, avalanches always spread to all the subnetworks that correspond to the modules in modular networks. When the rewiring probability R = 0.9, most of the avalanches involve all of the modules in the network, and avalanches spread to only seven or six modules with small probabilities. The results suggest that the mean-field theory of criticality could still be a reasonable approximation for the dynamics on HMNs if the inter-module connections are not extremely sparse. When the rewiring probability R = 0.99, most of the avalanches cannot spread to all eight modules. As an example of this effect, distributions of avalanche sizes are shown in the inset of figure 7. The distribution is far from a power-law function. The modules are in supercritical states, as indicated by the peak at the size of the module (N/8 = 128), while the whole network resembles a subcritical state because large avalanches are lacking.
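The counting procedure above can be sketched directly: given the participating neuron indices of each avalanche and a module label per neuron, count the distinct modules touched, restricted to the moderate-size window. The function names and the list-of-spike-indices representation are our own illustrative choices.

```python
import numpy as np

def modules_involved(avalanche_neurons, labels):
    """Number of distinct modules touched by one avalanche, given the
    indices of the neurons that spiked and a module label per neuron."""
    return np.unique(labels[np.asarray(avalanche_neurons)]).size

def module_count_distribution(avalanches, labels, size_range=(200, 300)):
    """Histogram of modules_involved over avalanches whose size L
    (number of spikes) falls inside the moderate-size window used in
    the text."""
    m = labels.max() + 1
    counts = np.zeros(m + 1, dtype=int)   # counts[c]: avalanches touching c modules
    for neurons in avalanches:
        if size_range[0] < len(neurons) < size_range[1]:
            counts[modules_involved(neurons, labels)] += 1
    return counts
```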
It is interesting to understand the relation between the avalanches in modules and those in the whole network, and how the avalanches in modules combine into the avalanches of the whole network. We investigate the distribution of the number of neurons in one module that are involved in an avalanche. For random networks, we consider the network as coupled subnetworks and compute the distribution for each subnetwork as well. Figure 8(a) shows the distributions of avalanche size for the whole random network and for its subnetworks. Avalanches are distributed uniformly over all subnetworks, and a part of the network behaves like the whole network: the distribution of avalanche sizes within subnetworks shows the same critical state as the whole network, but with a cutoff at the size of the subnetwork. In modular networks, the distribution of avalanche sizes within modules differs from that in random networks. Figure 8(b) shows that large events within modules have a relatively higher probability than avalanches of similar sizes in the whole network, as indicated by the different slopes of these two distributions. This is because in HMNs some large events may involve many neurons in one module together with some neurons in other modules. For R = 0.95, the distributions in figure 8(c) show that the modules are in supercritical states, but the mixing of large and small avalanches within modules yields a distribution of avalanche sizes of the whole network that is close to a power law. It is important to point out that the results shown in figure 8(c) are consistent with the experimental results of [20], where the distributions of avalanche sizes for a part of the electrode array monitoring the activity of cortical networks have a peak at the tail, while the distribution for the whole array does not. This consistency indicates that modular organization likely plays an important role in the avalanche dynamics of biological neural networks.

Understanding using the branching process and the largest eigenvalue approach
A deeper understanding of critical phenomena can be obtained using the branching process [13]. Recently, the branching process on various networks was studied and it was proposed that the largest eigenvalue of the network adjacency matrix governs the critical state [43]. This theory was proved for random networks and scale-free networks. It is interesting to examine whether the theory applies to the biologically more realistic neural dynamics and modular network structure considered in this work.
Based on a mean-field approach, we can regard our system as a branching process on the network with an effective coupling strength. In neural networks with dynamical synapses, the synaptic strengths adapt their values as the available neurotransmitter resource is consumed and recovered. The mean synaptic efficacy u J_ij is the effective coupling and determines the statistical behavior of the networks [17]. The membrane potentials of the neurons before an avalanche are uniformly distributed on the interval [V_min, θ], where θ is the spiking threshold of the membrane potential and V_min is a lower bound of the reset membrane potential V_r = V(t^sp) − θ. (The lower bound V_min → 0 for N → ∞ and k → ∞, because the synaptic signal tends to 0 as k → ∞.) The probability that a neuron is activated by a synaptic signal is therefore u J_ij (1/k) (1/(θ − V_min)). We denote the probability that neuron i is activated by external or synaptic signals by p_i. Examining the linear stability of the fixed point around p_i = 0, we obtain p_i^{t+1} = Σ_{j=1}^{N} p_j^t A_ij [43], where A_ij = a_ij u J_ij (1/k) (1/(θ − V_min)). This form translates the dynamics of the neural system into a branching process. Diagonalizing the system using eigenvectors, namely assuming p_i^t = v_i λ^t, we obtain λ v_i = Σ_{j=1}^{N} v_j A_ij. Therefore, the critical state should occur when the largest eigenvalue λ_max of the matrix {A_ij} equals 1, so that one eigenmode neither decays nor grows. The theory should apply to any network for which the mean-field approximation is valid.
This theoretical prediction can be compared with the numerical estimation of the largest eigenvalue λ_max. The effective coupling J_ij u and the lower bound V_min at the critical point (Δ_min) of finite-size networks can be obtained from simulations. We found that for random networks λ_max = 0.992, slightly smaller than 1.0, most likely due to the finite system size. Therefore, in our more realistic neural dynamics model, the theory of [43] is verified on random networks.
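The eigenvalue criterion can be checked numerically along these lines. This is a sketch: here uJ stands for the mean effective coupling u J_ij and v_min for the lower bound of the membrane-potential distribution, both assumed to be measured from simulation; the function name is illustrative.

```python
import numpy as np

def branching_eigenvalue(a, uJ, theta=1.0, v_min=0.0):
    """Largest eigenvalue (in modulus) of the linearized branching
    matrix A_ij = a_ij * uJ / (k * (theta - v_min)); the mean-field
    prediction is lambda_max = 1 at the critical point."""
    k = a.sum() / a.shape[0]              # average degree
    A = a * (uJ / (k * (theta - v_min)))
    return float(np.abs(np.linalg.eigvals(A)).max())
```

For a directed random network the leading eigenvalue of the adjacency matrix is close to the average degree k, so with uJ = theta the criterion gives lambda_max close to 1, consistent with the near-unity value reported above for random networks.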
For modular networks, λ_max is insensitive to R over a broad range. However, λ_max increases drastically once the modularity becomes significant, i.e. when R is very close to 1.0 (figure 9). λ_max > 1 at the critical point of the whole modular network indicates that the local dynamics is already in the unstable regime, consistent with the observation in figure 8(c) that the avalanches within modules are slightly supercritical. Thus the theory associating the largest eigenvalue with the overall critical point of the whole network is no longer valid when the network becomes highly modular. Intuitively, the deviation from the theory arises because perturbations cannot spread through the whole modular network at the point where the quiescent state first turns unstable locally. A larger coupling strength is needed to generate an overall power-law distribution, but it leads to a slightly supercritical state within modules. The mean-field approach and local stability analysis are not sufficient to capture the avalanche dynamics across modules. A new theory is needed for a more comprehensive understanding, which is an interesting topic for future research.

Mechanism for enhancing the robustness of criticality
Now let us gain some insight into the mechanism underlying the enhanced robustness of criticality. It arises from the interplay between short-term depression and the modular network structure. Previously, it was shown numerically in [17] that the states of homogeneous networks are determined by the effective coupling strength J_ij u. Here we can show that the dependence of Δ on J_ij u in HMNs is very similar to that in random networks, except that the curve is shifted to larger values of J_ij u (figure 10(a)). Therefore, the effective coupling can still be used as the control parameter of network states, provided the modularity is not so strong as to prevent the system from reaching criticality.
The nonlinear relation between α and J_ij u (figure 10(b)) results in the asymmetry of the critical region with respect to α. Notably, the slope of the curve of J_ij u versus α decreases as α increases (figure 10(b)). This is a result of activity-dependent synaptic adaptation: a larger synaptic strength tends to shorten the inter-spike intervals, the synapses then have less time to recover, and the average strength settles at a lower level [17]. As a result of this relation between J_ij u and α, the boundaries of the critical region are not symmetric around the critical state but stretch further to the right along the α axis, since a larger change in α is needed there to produce the same deviation of the effective coupling J_ij u from its value at the critical point. This is shown by the gray region in figure 10(b) for random networks.
The effect of synaptic adaptation is more pronounced in modular networks. The dense connections within modules can induce more temporally synchronized firing (figure 6), which leads to stronger depression of the synapses. As a result, the effective coupling at a given α is smaller in modular networks than in random networks (figure 10(b)). On the other hand, avalanches are confined by the modules, so the other parts of the network receive input only from activated modules rather than from all modules. Therefore, at the critical effective coupling strength of random networks, the HMNs as a whole are in subcritical states, and a larger effective coupling strength J_ij u is needed to achieve the critical state. Importantly, the slope of the curve of J_ij u versus α decreases as α increases. When the critical point moves to the right, it therefore corresponds to a broader range of α for the effective critical region in modular networks, as shown by the blue region in figure 10(b).

Dynamical range in the response to stimuli
In a neural network model based on the branching process, it was shown that the critical state can support an optimal response range to external perturbations [25], because the response has both sensitivity and high differentiation for stimuli. In [25], each susceptible neuron is activated with a probability s at each step. In subcritical states, the response can differentiate signals (linear response) but cannot amplify small signals, so the response curve has a long flat section for small signals. In supercritical states, small signals are amplified but the response cannot differentiate signals, and again there is a large flat region. For very large signals, networks in all states generate saturated responses. In [25], these flat regions for weak and strong signals are effectively discarded by considering the range of signals [s_0.1, s_0.9] over which the response interval [r_0.1, r_0.9] lies between 10 and 90% of the total response range [r_0, r_max] for a given network state. The dynamical range was then defined as 10 log(s_0.9/s_0.1), which measures the signal-processing ability by the signal interval over which the signal is amplified and well differentiated by the response. The dynamical range is found to be maximal at the critical state in the branching process.
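In the terms of [25], the dynamical range can be read off a measured response curve roughly as follows. This is a sketch under our own assumptions: the response curve is monotonic and sampled at stimulus intensities s, the 10% and 90% levels are located by linear interpolation, and we use base-10 logarithms (decibels); names are illustrative.

```python
import numpy as np

def kinouchi_copelli_range(s, r):
    """Dynamical range in the sense of [25]: find the stimuli s_0.1
    and s_0.9 whose responses reach 10% and 90% of the response range
    [r_0, r_max], then return 10 * log10(s_0.9 / s_0.1)."""
    s = np.asarray(s, dtype=float)
    r = np.asarray(r, dtype=float)
    r10 = r.min() + 0.1 * (r.max() - r.min())
    r90 = r.min() + 0.9 * (r.max() - r.min())
    # interpolate the monotonic response curve at the two levels
    s10 = np.interp(r10, r, s)
    s90 = np.interp(r90, r, s)
    return 10.0 * np.log10(s90 / s10)
```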
Such an advantage of the critical states, if it also exists in more realistic neural networks in the regime of SOC, would be beneficial for neural information processing using SOC. In the following, we study the response of neural networks to stimuli. Unlike in the branching-process model, where the probability that one firing node activates its neighbors is fixed, in our model the coupling strength evolves adaptively during the response process. With long-lasting or large stimuli, the neurotransmitters are consumed, and the response is expected to become weak after a transient. Figure 11 shows the adaptation of the response to continuous stimuli. Here the simulations are performed on random networks: at each step of the fast timescale (i.e. each τ_d), one neuron is randomly selected and activated. The number of spikes in the network at each step is plotted for subcritical, critical and supercritical states. We can see that the differences between the responses of the three states tend to vanish after a maximum. Neural networks with dynamical synapses are thus suited to processing short-lasting signals. Therefore, in networks with dynamical synapses, the average firing rate of the response to continuous stimuli over a long time window, as used in the branching-process model [25], is not a suitable measure of the response. An experiment demonstrating that cortical networks at criticality have maximal dynamical range [23] also used transient stimuli and responses to characterize the dynamical range. Here we consider the response of the neural network model to transient stimuli. We randomly select s neurons and activate them; the number of selected neurons s is regarded as the strength of the stimulus. The size of the avalanche evoked by a stimulus is taken as the strength of the response, denoted by r. Figure 12(a) shows the relation between the response r and the stimulus s.
Here we limit the signal strength to a small region s ∈ [1, 50], i.e. s_low = 1 and s_high = 50, in order to keep s ≪ N. r_low and r_high are the responses to the stimuli s_low and s_high, respectively. The response properties differ from those of the branching process in [25] in two ways. (i) For weak signals, the network in the supercritical region does not show a long flat region (figure 12(a)). (ii) If the stimuli are too strong, short-term depression exhausts the synapses; the response is then dominated by the depleted synapses and becomes linearly proportional to the stimulus strength, with a different ratio for sub- and supercritical states. Because of these differences, the definition of dynamical range in [25] is not directly applicable for capturing the interval of signals over which the response achieves both sensitivity and differentiation. We therefore propose a modified definition. Note that r_low/s_low shows whether the network is sensitive to the smallest signal considered, and r_high/r_low measures the differentiation ability as the signal changes from a small to a large value. A combination of the two, namely δ = 10 × log(r_high/r_low) × log(r_low/s_low), characterizes whether the network response can both amplify and differentiate the range of signals considered. Although the details differ, our definition measures the same ability to process signals with both sensitivity and differentiation as the dynamical range in [25]. With this definition of the dynamical range, we can see that the critical states have the largest dynamical range, as shown in figure 12(b). Comparing the HMN with R = 0.9 and l = 3 to the corresponding random network, one sees that the parameter region of α with a large dynamical range is extended by the HMN structure. The enhanced robustness of criticality on HMNs thus manifests itself in the response functions of neural networks.
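The modified measure δ can be evaluated from a measured response curve as follows. A sketch under stated assumptions: we take log to base 10 (the text leaves the base unspecified), and we read r_low and r_high off the sample points nearest s_low and s_high; names are illustrative.

```python
import numpy as np

def modified_dynamical_range(s, r, s_low=1.0, s_high=50.0):
    """delta = 10 * log(r_high / r_low) * log(r_low / s_low):
    the first factor measures differentiation between small and large
    stimuli, the second measures sensitivity to the smallest stimulus.
    s and r are matched arrays of stimulus strengths and mean evoked
    avalanche sizes."""
    s = np.asarray(s, dtype=float)
    r = np.asarray(r, dtype=float)
    r_low = r[np.argmin(np.abs(s - s_low))]
    r_high = r[np.argmin(np.abs(s - s_high))]
    return 10.0 * np.log10(r_high / r_low) * np.log10(r_low / s_low)
```

A purely linear (subcritical-like) response r = s gives r_low = s_low and hence δ = 0, while an amplifying, saturating (critical-like) response gives δ > 0, in line with the qualitative picture above.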

Effects of network size
In studying critical phenomena, it is important to understand how the scaling properties depend on the system size N. In the original study of globally coupled neurons with short-term synaptic plasticity [17], the effects of network size were analyzed under the condition that the time constant τ_J of the recovery of the synapse strength is proportional to the network size, i.e. τ_J = 10N. Under this assumption, globally coupled networks and random networks can reach critical states in the thermodynamic limit N → ∞ for α ≥ 1. The proportionality between τ_J and the network size N, however, is not biologically realistic, because synaptic recovery has characteristic times [44, 45], and τ_J is often fixed in models [45]. Here we take the value of τ_J to be fixed, independent of the network size. In this case, using the self-consistency equation of the critical state proposed in [17], we can show that the critical state of random networks in the thermodynamic limit can only be reached at one particular point, α = 1, rather than over a broad parameter region (see the appendix). Numerical simulations with different network sizes at a fixed connection density confirm this theoretical prediction. Figure 13(a) shows that the critical point in random networks approaches α = 1 and the critical region shrinks as the network size N increases. The shrinking of the critical region can be interpreted through the relation between the effective coupling J_ij u and the value of α (see figure 10). The slope of J_ij u versus α around the critical point increases as the critical point shifts to the left; thus the flat bottom of the curve of J_ij u versus α becomes narrower. Approximately critical states are therefore hard to reach in random networks as the network size N increases.
When studying the impact of network size on SOC in modular networks, we encounter a fundamental theoretical problem: how should the network be scaled as its size grows? In particular, how should the topological parameters, such as the depth of the hierarchical levels, the size of the densest modules and the rewiring probability (or the overall modularity), be varied with the system size so that modular networks of different sizes can be regarded as belonging to the same class? To our knowledge, this question has not been addressed and we do not yet have a clear answer to it. This problem, of both theoretical and practical significance in the statistical physics of complex systems, deserves a systematic study in the future.
In this paper, we study the effect of the network size of HMNs on SOC numerically and heuristically. We compare a set of HMNs with the corresponding random networks of the same size in figure 13(b). Here we fix the size of the basic module (128 neurons) and increase the hierarchical level l with the system size N, i.e. l = 2, 3, 4 for N = 512, 1024, 2048, respectively. For a fixed rewiring probability (e.g. R = 0.9), the region of approximately critical states in HMNs is larger than that of the corresponding random network, which is also manifested by the ratio of the width of the critical region of HMNs to that of random networks in figure 13(c). But, similarly, the region shrinks as the network size becomes larger. Intuitively, this is because the avalanches cannot spread to scales proportional to the network size. As discussed above, to achieve an approximately critical state in the whole network, the avalanches within the dense modules need to move slightly into the supercritical regime. It is therefore possible that the critical state in larger networks can be restored if the avalanches within the modules are pushed further into the supercritical regime, for example by further increasing the connection density within the modules. Indeed, we can see from figure 13(b) that the width of the critical region does not decrease clearly with network size N if the rewiring probability R is also adjusted: R = 0.90, 0.92 and 0.94 for HMNs with N = 512, 1024 and 2048, respectively. These results are consistent with those in figure 5, where networks with more levels and larger sizes show critical states at larger modularity Q.
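The construction used here (a fixed basic-module size, l hierarchical levels and a rewiring probability R) can be sketched as follows. This is our illustrative reading of the procedure, not the paper's code: starting from a random graph, cross-module edges are moved inside modules level by level with probability R, which preserves the network size and overall connection density.

```python
import random

def hierarchical_modular_network(n_basic=128, levels=3, k_avg=16, R=0.9,
                                 rng=random):
    """Build an HMN by rewiring a random network into 2**levels basic
    modules of n_basic neurons each (N = n_basic * 2**levels in total).

    At each hierarchical level, every edge crossing a module boundary is,
    with probability R, rewired to fall inside the module of one of its
    endpoints.  Edges are only moved, never created or deleted, so the
    size and overall connection density are preserved.
    """
    N = n_basic * 2 ** levels
    m = N * k_avg // 2                       # number of undirected edges

    edges, edge_set = [], set()
    while len(edges) < m:                    # Erdos-Renyi-like random graph
        i, j = rng.randrange(N), rng.randrange(N)
        e = (min(i, j), max(i, j))
        if i != j and e not in edge_set:
            edges.append(e)
            edge_set.add(e)

    for level in range(1, levels + 1):
        size = N // 2 ** level               # module size at this level
        for idx, (i, j) in enumerate(edges):
            if i // size != j // size and rng.random() < R:
                lo = (i // size) * size      # module containing endpoint i
                for _ in range(100):         # bounded retry to avoid stalls
                    j2 = rng.randrange(lo, lo + size)
                    e = (min(i, j2), max(i, j2))
                    if j2 != i and e not in edge_set:
                        edge_set.remove((i, j))
                        edge_set.add(e)
                        edges[idx] = e
                        break
    return N, edges
```

Fixing n_basic and increasing levels grows N exactly as in the size comparison of figure 13(b), while raising R with levels corresponds to the adjusted rewiring probabilities R = 0.90, 0.92, 0.94 discussed above.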

Conclusion and discussion
We investigated the effect of HMN structure on the robustness of SOC in neural networks with biologically realistic short-term synaptic plasticity. By rewiring inter-module connections into modules, we changed homogeneous networks into HMNs with the same size and overall connection density, and compared the avalanche dynamics on HMNs with that on random networks. The hierarchical modular structure affects the avalanche dynamics in different ways as the modularity increases. We found that critical states can be reached on HMNs. More importantly, networks with significant modular structure extend the parameter region of coupling strength in which approximately critical states are reached. When the modularity is too large, however, the modular structure prevents the neural networks from reaching critical states. The finding that the robustness of SOC is maximally enhanced at a certain intermediate modularity resembles the balance between information segregation and integration achieved in the brain by modular structure with intermediate modularity [41, 42]. It will be interesting to associate the avalanches within modules and across modules with information segregation and integration.
One effect of the modular structure on the avalanche dynamics is to limit the spreading of avalanches in the modular networks, as has also been demonstrated in the study of spreading dynamics on networks [46]. Beyond this limiting effect, we showed several new features of the avalanche dynamics on modular networks. With intermediate modularity, large events in one module mix with small events in other modules and the whole network exhibits criticality. Because the modular structure limits the spreading of avalanches, a larger coupling strength is needed for criticality compared to the corresponding random networks. As a result of the dynamical synapses, i.e. the relation between effective couplings and synapse strength, the critical parameter region is enlarged in modular networks. In particular, an important dynamical regime in the mixture of states within modules is that the modules are slightly supercritical while the whole network is close to critical states. This finding explains, through the modular network organization, the same observation in experimental results from cortical neural systems [20], where modularity is a prominent structural feature. When the modularity is too large, avalanches from different modules cannot combine effectively: modules exhibit avalanches clearly in the supercritical regime, while the whole network displays subcritical states, with avalanches seldom reaching the scale of the whole system.
We found that, with suitable hierarchical levels, the effect of HMN structure on the robustness of criticality can be more significant for larger networks. For larger networks with more hierarchical levels, stronger avalanches within the basic modules are required for the avalanches to spread to larger scales, so that the critical state can be established in the whole network. In contrast, the critical region shrinks in larger random networks. Our numerical results also suggest that, to obtain optimal enhancement of the robustness of criticality, more levels in the hierarchical organization are needed for larger networks. It will be an interesting challenge to develop an analytical theory of the optimal HMNs for SOC.
Our analysis demonstrated that the avalanche dynamics in this more realistic model of neural dynamics with synaptic plasticity can be understood using the branching process when the network is random or when the modularity is not very strong; in these cases the critical states are determined by the largest eigenvalue of the matrix of effective coupling strengths. For modular networks with significant modularity, however, the largest eigenvalue cannot predict criticality, owing to the new features of the avalanche dynamics. A new theoretical criterion for criticality in highly modular networks needs to be developed, a topic we reserve for future study.
We also studied the response of the neural network model to stimuli. We showed that neural networks with dynamical synapses are suited to processing transient rather than continuous signals. The critical states on both random networks and HMNs have the largest dynamical range, suggesting that SOC could be employed by the neural system to achieve high sensitivity to signals together with significant amplification. More importantly, the parameter region with a large dynamical range is enlarged by network modularity, owing to the enhanced robustness of criticality.
To summarize, a system without robust SOC could easily move into subcritical or supercritical states when the system parameters change, where it cannot perform efficient information processing. Here we showed that the parameter region for SOC can be significantly extended in HMNs compared with homogeneous networks. Our work reveals that neural systems in the brain could utilize the HMN structure to enhance the robustness of the critical states. With a much larger range of coupling strengths supporting SOC due to the modular structure, the network can maintain sensitivity to external stimuli when the average coupling strength is modified during learning [35]. This property is especially important when the critical state itself is crucial for learning [30]. In the future, we will study the interplay between SOC and learning by incorporating spike-timing-dependent plasticity [47], which will likely increase the heterogeneity of coupling strengths in the network, together with possible global increases or decreases.
A recent work [48] has also shown the importance of hierarchical modular structure for a type of criticality related to effective synchronization of modules with coherent within-module activation. In that model, a dc current and spike-timing-dependent plasticity were used to achieve synchronized firing of the neurons within each module, called a module spike.
The HMN organization was found to be important in supporting a power-law-like distribution of the number of modules displaying module spikes simultaneously, which is the size of the synchronized cluster of modules. This type of synchronization-related criticality cannot be obtained in random networks, but only in HMNs, owing to the inherent self-similarity of the structure [48]. In contrast, the neuronal avalanche dynamics studied here is self-organized through the coupling of synapses with short-term depression, and criticality can be obtained on globally coupled networks and random networks that do not have any built-in scale-free structural properties. However, modular organization, especially hierarchical modular structure, can significantly enhance the robustness of criticality. It is important to note that the dynamics considered in [48] and in our work can be associated with different states of the brain: one with external stimuli, represented by the dc current, and one with spontaneous background activity. It will be very interesting to elucidate in the future how these two kinds of dynamics are related to each other. A significant difference between them can be expected; for example, in contrast to synchronization within modules, neuronal avalanche dynamics has a large repertoire of spatiotemporal activity patterns [49]. Furthermore, the enhanced sensitivity to external signals conferred by SOC in modular networks may play a constructive role in organizing the dynamics into synchronized patterns in the presence of structured stimuli.
The avalanche dynamics is similar in different SOC systems, and modular organization has been shown to be very common in many complex network systems [40]; our findings may therefore provide new insights into the dynamics of complex systems beyond neural networks. Our work has also raised several important theoretical problems for future investigation, such as how to perform finite-size scaling and how to identify the structural determinant of criticality in highly modular networks (where the largest-eigenvalue criterion fails). These open problems are of fundamental importance for understanding many realistic complex systems in which modular and hierarchical modular organizations are commonly present, because mean-field approximations can only be valid up to certain levels. For example, in the brain, the total number of neurons is huge (much larger than the sizes one can consider in any finite-size analysis), but the basic functional modules, such as mini-columns, have only hundreds to thousands of neurons. Fluctuations due to finite size may play crucial roles in information processing, which cannot be captured by scaling the system to the thermodynamic limit.
For globally coupled networks at the critical states, the mean synaptic strength ⟨J_ij⟩ and the mean inter-spike interval ⟨isi⟩ can be derived following the method in [17], giving the self-consistency equation (A.1). In the limit N → ∞, the external input should be scaled as I_ext ∼ N^{−w} with w > 0. Assuming ⟨isi⟩ ∼ N^{1+ε}, the state of the system in the thermodynamic limit depends on the values of w and ε. For any value of ε > −1, the left-hand side of the equation tends to α. If ε ≥ w, the right-hand side is 0; if ε < w, the right-hand side tends to 1. So the only possible solution is α = 1.
Similarly, the self-consistency equation of critical states, equation (A.2), can be derived for random networks with the average degree scaled as ⟨k⟩ = cN^η.
In this equation, we used the expected avalanche size on infinite globally coupled networks as an approximation to the expected value on random networks; after rescaling the synapse strength by the average degree, this approximation is reasonable for random networks. At η = 1 and c = 1, i.e. when the average degree equals N, equation (A.2) recovers the result for the globally coupled network, equation (A.1). The left-hand side of equation (A.2) always tends to α. If ε > w, there are two cases: if η < 1, the right-hand side tends to 0; if η = 1, the right-hand side is negative (tending to −∞ when c = 1). When ε = w, the right-hand side tends to 0. If ε < w, the right-hand side is negative if η < 1 − (w − ε), or tends to 1 if η ≥ 1 − (w − ε). The only possible solution for the random network is α = 1.