The relationship between structure and function in locally observed complex networks

Recently, studies of the small-scale interactions taking place in complex networks have started to unveil the wealth of interactions that occur between groups of nodes. Such findings call for a new systematic methodology to quantify, at the node level, how dynamics are influenced (or differentiated) by the structure of the underlying system. Here we define a new measure that, based on the dynamical characteristics obtained for a large set of initial conditions, compares the dynamical behavior of the nodes present in the system. Through this measure, we find that the geographic and Barabási–Albert models have a high capacity for generating networks that exhibit groups of nodes with distinct dynamics compared to the rest of the network. The application of our methodology is illustrated with respect to two real systems. In the first, we use the neuronal network of the nematode Caenorhabditis elegans to show that the interneurons of the ventral cord of the nematode present a very large dynamical differentiation when compared to the rest of the network. The second application concerns the SIS epidemic model on an airport network, where we quantify how different the distributions of infection times of high and low degree nodes can be when compared to the expected value for the network.


Introduction
Given that complex systems are almost invariably composed of a large number of interacting elements, they can be efficiently represented and studied in terms of complex networks [1,2,3]. In this representation, their structural and dynamical properties can be extracted and investigated. Typically, the structure of such networks is quantified in terms of several measurements [4], reflecting different properties of the respective topology (e.g. node degree, shortest paths, centralities) and geometry (e.g. arc length distances, angles, spatial density).
A great deal of the investigation into structure and function in complex systems has focused on trying to predict the dynamics from specific structural features [5,6,7]. Such an ability would provide the means for effectively controlling real-world systems. Despite the growing number of works devoted to this problem, knowledge about the relationship between structural and dynamical properties remains incipient for three main reasons: (a) dynamics is often summarized in terms of global statistics, which overlooks its intricacies; (b) the investigation often focuses on linear relationships such as correlations between structural and dynamical features; and (c) several effects, such as initial conditions, network topology, stochasticity or dynamical differences from node to node, are not selectively fixed or controlled.
In this article we propose a novel methodology capable of quantifying how much the dynamics at each node differs from the dynamics at the other nodes as a consequence of specific aspects (e.g. local anisotropies) of the network structure. This is accomplished by simulating the investigated dynamics for a large number of initial conditions and checking how much a given property of the dynamics at a node (e.g. the entropy of the time series at that node) deviates from the overall dynamics. The level at which a node i "feels" the structure differently from the other nodes is quantified in terms of a parameter $\alpha_i$. In this way, the proposed methodology addresses the three shortcomings mentioned above by: (i) being local, i.e. it is applied to each individual node; (ii) not imposing any specific kind of relationship between dynamical and structural features; and (iii) isolating each of the above-mentioned conditions that can affect the dynamics.
Several important findings have been obtained by using this methodology. Our results show that the nodes feel the structure rather distinctly in most of the considered situations. While the Erdős-Rényi model [8] does not show any dynamical differentiation, the geographic model of Waxman [9] presents fluctuations that naturally give rise to different dynamical groups through the density of connections. The Barabási-Albert model [10] shows a rather distinct behavior for the highly connected nodes of the network, which end up having dynamics very distinct from the rest of the network, a difference even more extreme than the topological one. When considering the real neuronal network of the nematode Caenorhabditis elegans, we find that there exists a group of neurons for which the local topology has a rather distinct influence on the spike rate of the signals.

Integrate-and-fire dynamics
Efforts to understand the intricacies of neuronal dynamics in the brain are reasonably old. Many classical studies tried to characterize the effect of the most diverse stimuli on the behavior of a single neuron through a range of mathematical models [11,12]. With the advent of graph theory, however, the area gained a new point of view, which considers that many behaviors of a large set of interconnected neurons cannot be explained by simple extrapolation of the dynamics of individual neurons. One of the first solid works in this area is due to Wilson and Cowan [13], who formalized theoretical tools to study an ensemble of excitatory and inhibitory neurons. Since their work, a range of new studies has appeared within the framework of network theory [14,15,16,17].
Here we take an information approach [18] to the dynamics, which is not concerned with the particular shape of the neuron signals, but with the times and intervals of neuronal spikes. The most traditional dynamics taking this approach is the integrate-and-fire model [19,20], which treats the neuron as an integrator with a hard threshold limit, $T_l$ ‡. Since we are not interested in studying the dynamics per se, but in using it as a tool to present our methodology, we choose a further simplified discrete integrate-and-fire dynamics given by

$$V_i(t+1) = V_i(t) + \sigma \sum_{j} A_{ij} \sum_{f} \delta(t - t_j^f),$$

where $V_i(t)$ is the membrane potential of neuron i at time t, $A_{ij}$ is the element of the adjacency matrix (equal to 1 when neuron j sends signals to neuron i and 0 otherwise), $t_j^f$ is the instant of the f-th spike of neuron j, $\delta(x) = 1$ when $x = 0$ and $\delta(x) = 0$ otherwise, and $\sigma$ is the coupling strength. In this simplified scheme the relevant dynamical parameter is $T_l/\sigma$, and so, without loss of generality, we set $\sigma = 1$. When $V_i$ reaches the threshold $T_l$, the neuron fires a unitary signal to all its neighbors and $V_i$ is reset to zero. The values of the membrane potential at $t = 0$ were drawn at random with uniform probability inside the range $[0, T_l]$.
The activity of a given neuron i over time is stored in the binary time series $s_i(t)$, where $s_i(t) = 1$ indicates that the neuron spikes at instant t.
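The simplified discrete integrate-and-fire rule described above (accumulate unit signals from firing neighbors, fire and reset at the threshold) can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the implementation used for the results: the function name, the `out_neighbors` list, the synchronous update order and the choice of integer initial potentials (so that some nodes can start at the threshold) are all assumptions.

```python
import random


def simulate_integrate_and_fire(out_neighbors, threshold, steps, initial=None, seed=0):
    """Return binary spike trains s[i][t] for a simplified discrete integrate-and-fire run.

    out_neighbors[i] lists the nodes that receive a unit signal (sigma = 1)
    when node i fires. All names here are hypothetical.
    """
    rng = random.Random(seed)
    n = len(out_neighbors)
    if initial is None:
        # Assumption: integer potentials drawn uniformly from {0, ..., threshold}.
        potential = [rng.randint(0, threshold) for _ in range(n)]
    else:
        potential = list(initial)
    spikes = [[0] * steps for _ in range(n)]
    for t in range(steps):
        # Nodes at or above the threshold fire at this step.
        firing = [i for i in range(n) if potential[i] >= threshold]
        for i in firing:
            spikes[i][t] = 1
            potential[i] = 0  # reset after firing
        # Assumption: signals are delivered after all resets of the same step.
        for i in firing:
            for j in out_neighbors[i]:
                potential[j] += 1
    return spikes


# Two mutually connected neurons with threshold 1: they fire alternately.
pair = [[1], [0]]
trains = simulate_integrate_and_fire(pair, threshold=1, steps=4, initial=[1, 0])
```

Passing `initial` explicitly makes a run reproducible; leaving it out samples a new initial condition from the seed, which is how different realizations of the same network could be generated.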

Characterizing the dynamics
After defining the dynamics to be studied, we need to determine which characteristics of the node signals we want to study. Here we present the measurements that will be used to illustrate the methodology.
Spike rate: Given the time series $s_i(t)$, we define its mean rate as

$$r_i = \frac{1}{T_{sim} - T_{est}} \sum_{t = T_{est}+1}^{T_{sim}} s_i(t),$$

where $T_{sim}$ is the total simulation (or experiment) time and $T_{est}$ is an empirical time, taken large enough that $r_i$ does not change significantly after it (we say that the dynamics has stabilized). This measurement corresponds to the average number of spikes during the considered interval, and it is widely used in neuroscience, since many neurons encode the stimulus amplitude through the rate of spikes [18].

‡ For a good review of a class of discrete neuronal dynamics called map-based models see [21].
Inter-spike entropy: Besides the rate, many neurons are also believed to encode information in the inter-spike intervals, that is, the time elapsed between two consecutive spikes. Defining $\Delta$ as the random variable associated with those time intervals, we can write the inter-spike entropy of a signal i as

$$E_i = -\sum_{\Delta} P(\Delta) \log P(\Delta),$$

where $P(\Delta)$ is the probability of finding an interval of size $\Delta$ in the signal of i. This measurement quantifies the complexity of the discrete signal.
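As an illustration of the two measurements, the following sketch computes the spike rate and the inter-spike entropy of a binary spike train. The function names are hypothetical, and the use of the natural logarithm is an assumption, since the text does not fix the base.

```python
import math
from collections import Counter


def spike_rate(series, t_est=0):
    """Mean number of spikes per step after discarding the first t_est steps."""
    window = series[t_est:]
    return sum(window) / len(window)


def interspike_entropy(series, t_est=0):
    """Shannon entropy (natural log, an assumption) of the inter-spike intervals."""
    times = [t for t, s in enumerate(series) if s == 1 and t >= t_est]
    intervals = [b - a for a, b in zip(times, times[1:])]
    if not intervals:
        return 0.0  # fewer than two spikes: no interval distribution to measure
    counts = Counter(intervals)
    total = len(intervals)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


regular = [1, 0, 1, 0, 1, 0, 1, 0]   # a single interval size
mixed = [1, 1, 0, 1, 1, 0, 1]        # intervals 1 and 2, equally frequent
rate_regular = spike_rate(regular)
entropy_regular = interspike_entropy(regular)
entropy_mixed = interspike_entropy(mixed)
```

A perfectly periodic train has a single interval size and therefore zero entropy, while a train alternating between two interval sizes with equal frequency has entropy log 2.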

Distance measurements
A last concept we need to introduce is the distance between sets of points. Suppose that we have a set of points in an m-dimensional space and that these points form two distinct groups (see Figure 1). An immediate way to quantify the separation between the groups is the Euclidean distance between the centers of mass of the groups, given by

$$d(r, s) = \sqrt{\sum_{i=1}^{m} \left(\bar{x}_{ri} - \bar{x}_{si}\right)^2},$$

where r and s are the indices of the groups and $\bar{x}_{ri}$ represents the mean (or center of mass) of the points in group r on the i-th dimension. The disadvantage of the Euclidean distance is that it does not take into account the dispersions of the groups between which the distance is being measured; for that purpose we use the Hotelling distance. Following Figure 1, if we have two random variables $X_r$ and $X_s$ with realized values $\{x_r\}$ and $\{x_s\}$, marked respectively in blue and red in the figure, the two cases shown in Figures 1(a) and (b) clearly have a more significant distance between their means than the case in Figure 1(c), although the Euclidean distance is the same. There are many ways to take the dispersions of the groups into account; here we use the Hotelling statistic [22], which considers the variance of each group in the direction defined by the line that passes through the two means. The distance h between two groups r and s is defined from the difference between $\bar{x}_r$ and $\bar{x}_s$, where $\bar{x}_r$ is the average position of group r, and from $\Sigma_m$, the estimate of the equivalent covariance matrix of the groups, which is obtained from $\Sigma_r$ and $\Sigma_s$, the covariance matrices of the groups. In Figure 1 we show the values of h for each case.
The distance we use is a hybrid version of the Euclidean and Hotelling metrics. This is required in order to avoid singularities for sets of nodes with zero variance. Since we are not interested in the absolute value of the distance, but in the comparison of values between groups, we can define a new hybrid distance, $d_h(r, s)$, given by equation 7, where $d(r, s)$ is the usual Euclidean distance between the centers of mass of groups r and s.
Finally, in cases where we have to calculate the distances through more than one variable, it is necessary to normalize them so as to give a fair comparison. In order to do so we calculate the standard score of each measurement x, given by

$$\hat{x} = \frac{x - \langle x \rangle}{\mathrm{std}(x)},$$

where $\langle x \rangle$ is the mean and $\mathrm{std}(x)$ is the standard deviation of x. Since throughout this work we always use the standardized version of the values, we simplify the notation by writing $\hat{x}$ simply as x.
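The contrast between the Euclidean distance and a dispersion-aware distance can be made concrete with a small sketch. The function `dispersion_scaled_distance` below is a hypothetical stand-in, not the Hotelling statistic or the hybrid distance of the text: it assumes that the separation of the means is divided by the square root of the average variance of the two groups along the line joining the means.

```python
import math


def mean_point(points):
    """Center of mass of a list of m-dimensional points."""
    m = len(points[0])
    return [sum(p[k] for p in points) / len(points) for k in range(m)]


def euclidean_distance(group_r, group_s):
    """Euclidean distance between the centers of mass of two groups."""
    cr, cs = mean_point(group_r), mean_point(group_s)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(cr, cs)))


def projected_variance(points, center, direction):
    """Sample variance of the points along a unit direction."""
    proj = [sum((p[k] - center[k]) * direction[k] for k in range(len(center)))
            for p in points]
    return sum(v * v for v in proj) / (len(points) - 1)


def dispersion_scaled_distance(group_r, group_s):
    """Mean separation divided by the pooled spread along the joining line (assumed form)."""
    cr, cs = mean_point(group_r), mean_point(group_s)
    d = euclidean_distance(group_r, group_s)
    if d == 0:
        return 0.0
    direction = [(a - b) / d for a, b in zip(cr, cs)]
    pooled = 0.5 * (projected_variance(group_r, cr, direction)
                    + projected_variance(group_s, cs, direction))
    # Zero variance makes the ratio singular, which motivates a hybrid distance.
    return math.inf if pooled == 0 else d / math.sqrt(pooled)


tight_r, tight_s = [[-1, 0], [0, 0], [1, 0]], [[3, 0], [4, 0], [5, 0]]
wide_r, wide_s = [[-2, 0], [0, 0], [2, 0]], [[2, 0], [4, 0], [6, 0]]
```

Both pairs of groups have the same Euclidean distance between their means, but the scaled distance is smaller for the wider groups, which mirrors the situation illustrated in Figure 1.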

Measuring the differentiation
In order to apply our methodology we begin with a network having N nodes, which is kept fixed through the entire process, and execute a dynamics on it $R_0$ times, each execution beginning with a different initial condition. The best approach would be to apply every possible initial condition, but this would be impossible to simulate, so we randomly sample the initial conditions that will be used. It is important to note that, depending on the dynamics, we can have a very restricted set of initial conditions that take the system to a particular state, and so this state will rarely be accessed by sampling. That is not a problem for our method, because we are analysing the dynamics for the set of initial conditions imposed, that is, we are studying the signals that are in fact observed.

Figure 2. Example of topological differentiation for two nodes (panels: Case 1 and Case 2; axes: f(1) and f(2)). Projecting the signal characteristics for a single realization of the dynamics (a) is not sufficient to discern the topological influence. We need to consider many distinct initial conditions in order to infer whether the topology differentiates (b) or not (c) the dynamics of the nodes.
In possession of the dynamic signals for the $R_0$ realizations, we need a mechanism to represent them in order to compare their behavior. In our case we use dynamical measurements that try to extract the most relevant information from the signals. Let F be one such measurement; after the many realizations we get a set of observed values $f_{i,r}$, where i is the node index and r the realization. These values can vary through four distinct mechanisms: (a) initial condition, (b) network topology, (c) particular dynamics and (d) stochasticity. Our objective here is to study only the relation between the initial condition and the topology, and so we use a deterministic dynamics with an identical equation for every node. With these restrictions the only variations we can observe in the values of F are: (a) fluctuations in the dynamical values of a given node, which, given that the topology is static, can only be caused by the variation of the initial condition; and (b) differences in the mean values of F for distinct nodes, which, because of the properties assumed, can only be caused by the topological differences between the nodes. The term topological difference needs to be used with care, because, except in very specific cases where the network is perfectly symmetric (e.g., a lattice with toroidal boundary), the topology of two given nodes is never identical, that is, we can always find a structural characterization that will have distinct values for them. Nevertheless, since the networks we use are not regular, every significant dynamical difference we observe must be caused by the topology. It is also important to note that the reverse is not true: if the dynamics of two nodes appear to be the same, their topologies are still distinct; what happened is that both nodes felt the topology in the same manner. An example of this last case is the diffusion dynamics on graphs [23], in which the equilibrium behavior depends only on the degree of the nodes, that is, although the nodes possess distinct general topology, the localized characteristic of the dynamics allows only the degree to differentiate the nodes. The two cases we may come across are shown in Figure 2.
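The two sources of variation just described, fluctuations of a node across initial conditions and differences between the means of distinct nodes, can be separated with a few lines of code. The sketch below is only illustrative: the function names, the toy values and the rule of thumb used in `means_separated` (gap between means larger than the sum of the standard deviations) are assumptions and not part of the methodology.

```python
import statistics


def summarize_realizations(values_by_node):
    """Map each node to (mean, sample standard deviation) of its values f_{i,r}."""
    return {
        node: (statistics.mean(values), statistics.stdev(values))
        for node, values in values_by_node.items()
    }


def means_separated(summary, node_a, node_b):
    """Hypothetical check: the mean gap exceeds the combined fluctuations."""
    mean_a, std_a = summary[node_a]
    mean_b, std_b = summary[node_b]
    return abs(mean_a - mean_b) > std_a + std_b


# Toy spike rates for two nodes over four realizations (invented values).
case_1 = {"u": [0.10, 0.12, 0.11, 0.09], "v": [0.30, 0.31, 0.29, 0.30]}
case_2 = {"u": [0.10, 0.30, 0.20, 0.25], "v": [0.15, 0.28, 0.22, 0.18]}

differentiated = means_separated(summarize_realizations(case_1), "u", "v")
not_differentiated = means_separated(summarize_realizations(case_2), "u", "v")
```

In the first toy case the node means are far apart relative to the fluctuations caused by the initial conditions; in the second the fluctuations dominate and the two nodes cannot be told apart.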
In order to quantify the difference between the values obtained for each node, we use the distance measurement defined by equation 7. Through this statistic we can identify whether the difference between the means of the dynamical values obtained is in fact significant, that is, we are quantifying the difference between the dynamics of the nodes normalized by the intrinsic fluctuations caused by the initial condition. The distance between every pair of nodes is then represented by the matrix $\Xi$, where the entry in row i and column j is the distance obtained between nodes i and j, that is,

$$\Xi_{ij} = d_h(i, j),$$

where $d_h(i, j)$ is defined by equation 7. Finally, we can define our mean dynamical differentiation measurement, $\alpha$, as the mean value of each row of this matrix,

$$\alpha_i = \frac{1}{N} \sum_{j=1}^{N} \Xi_{ij}.$$

The standard procedure now would be to calculate the statistical significance of the observed values of $\alpha$, but since we are concerned with the comparison between the distances, and not with the absolute values, this does not need to be performed. To carry out the comparison, we construct a histogram of the obtained values of $\alpha$. Bearing in mind that $\alpha$ is relative to some dynamical characteristic, we can have distinct histograms for each desired characterization.
What we search for are particular behaviors of the histograms; for example, it is expected that a single node with very distinct dynamics compared to the rest of the network will have a very large $\alpha$ value. It is important to observe that although we presented the methodology for a single measurement, nothing prevents us from calculating the distances using various dynamical measurements simultaneously. In Figure 3 we show an example application of the presented methodology.

Matrix $\Xi$ shown in Figure 3 (measurement space with axes f(1) and f(2)):
0.000 0.277 0.496 0.599 0.257 0.261
0.277 0.000 0.424 0.524 0.405 0.405
0.496 0.424 0.000 0.097 0.320 0.314
0.599 0.524 0.097 0.000 0.405 0.399
0.257 0.405 0.320 0.405 0.000 0.002
0.261 0.405 0.314 0.399 0.002 0.000
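As a worked example, the mean differentiation of each node can be computed from the 6 × 6 matrix of pairwise distances printed with Figure 3. The sketch assumes that the mean of each row is taken over all N entries, including the zero on the diagonal; dividing by N − 1 instead would rescale all values equally and leave the ordering of the nodes unchanged. The function name is hypothetical.

```python
XI = [
    [0.000, 0.277, 0.496, 0.599, 0.257, 0.261],
    [0.277, 0.000, 0.424, 0.524, 0.405, 0.405],
    [0.496, 0.424, 0.000, 0.097, 0.320, 0.314],
    [0.599, 0.524, 0.097, 0.000, 0.405, 0.399],
    [0.257, 0.405, 0.320, 0.405, 0.000, 0.002],
    [0.261, 0.405, 0.314, 0.399, 0.002, 0.000],
]


def mean_differentiation(xi):
    """Row means of the pairwise distance matrix (diagonal included, an assumption)."""
    n = len(xi)
    return [sum(row) / n for row in xi]


alpha = mean_differentiation(XI)
most_differentiated = max(range(len(alpha)), key=alpha.__getitem__)
least_differentiated = min(range(len(alpha)), key=alpha.__getitem__)
```

With zero-based indices, the first node has a mean distance of 0.315 to the others, and the second node is the one that differs most from the rest on average.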

Comparison between random network models
We compare the differentiation relative to the spike rate feature, which we call $\alpha_r$, for different network topologies, namely the Erdős-Rényi (ER) [8], Barabási-Albert (BA) [10] and Waxman geographic [9] models. In Figure 4(a) we show the result obtained for the geographic model with N = 1000 and average degree ⟨k⟩ = 10, with an integrate-and-fire dynamics with threshold T = 8 taking place on the network. To obtain statistical significance we use 100 different generated networks, and each network is subjected to $R_0$ = 1000 realizations of the dynamics with different initial conditions. We construct histograms of $\alpha_r$ for each generated network and show in Figure 4(a) the mean value and standard deviation of the set. We see that the frequency of nodes with small $\alpha_r$ has a large variation, which is caused by the intrinsic fluctuations in the dynamics of the many nodes with similar spike rate present in the network. It is reasonable to expect that this fluctuation would decay as $\alpha_r$ increases, but this is true only for intermediate $\alpha_r$, while at high values of $\alpha_r$ we observe a sudden increase in fluctuation. Additionally, the mean value stays almost constant for $\alpha_r$ in the range [0.9, 1.5]. This is caused by the high potential of the geographic model to display structural fluctuations, giving rise to regions with higher density when compared to the rest of the network. These regions significantly alter the spike rate of the nodes, and highly differentiated groups appear. In Figure 4(b) we apply the same procedure to calculate $\alpha_r$ in the geographic model, only changing the dynamic threshold to T = 10. It is clear that the groups are no longer distinguishable. This is so because the threshold is now so large that even the topological fluctuations cannot differentiate a significant number of nodes, when compared to the ones with small $\alpha_r$.
We apply the same procedure used for the geographic model to ER networks with N = 1000 and ⟨k⟩ = 10. In Figure 4(c) we show the information about the obtained histograms for T = 8, which makes clear that this model exhibits much smaller fluctuations. This is caused by the much smaller geodesic distances that the model exhibits, when compared to its geographic counterpart, which create a more compact network. We also show in Figure 4(d) the case T = 10 for the ER model, where we see the perfectly decaying behavior expected for a random Poissonian system.
The third investigated model is the BA model with N = 1000 and ⟨k⟩ = 6. Figures 4(e) and (f) show the log-scale histograms for, respectively, T = 8 and T = 10. In both cases we observe a significantly high peak for large $\alpha_r$, which we found to be related to the network hubs (nodes with very high degree). This result was expected, given our observations of large fluctuations on the geographic network, but it is not the main result for the BA model. The important result is that although the power-law degree distribution of the model decays continuously, the histogram of $\alpha_r$ shows small values for intermediate $\alpha_r$ and increases for large $\alpha_r$. This behavior can be interpreted as follows: the dynamical differentiation of a hub is, as expected, very large, but a node with almost equal degree can end up with a much smaller differentiation, having dynamics more similar to those of the low degree nodes. This result confirms the important role that hubs have in complex systems, not only in the sense of being central, but also in serving a different purpose in the network dynamics.

Network of the Caenorhabditis elegans
Although many interesting properties arise when studying random network models, it is on real networks that the dynamical differentiation analysis can show its real potential. To show this we now apply the methodology to the C. elegans neuronal network. In this network, each node represents a neuron and two nodes are connected if there exists some kind of directed communication between them (e.g. synapses, gap junctions, etc.). The data was compiled by Chen et al. [24,25] and obtained from [26]. The network has 279 nodes and average degree ⟨k⟩ = 22.4.
Although we motivate the method with a real network, it is important to note that our dynamics does not take into account many signal particularities that arise for real neurons [12]; we are therefore looking for a coarse-grained description of the neurons inside the network. We will show that, even with this simplified description, it is still possible to observe some interesting phenomena.
We begin by showing the histograms of $\alpha$, relative to the spike rate, $\alpha_r$, and the interspike entropy, $\alpha_e$, in Figure 5. An immediate result is that in Figure 5(a) there is a group with high differentiation relative to the spike rate, indicated with a red circle, which is somewhat similar to that observed for the BA random model. With this in mind we plot in Figure 6 the degree histogram of the network, indicating in red the nodes inside the observed group. We see that the nodes with high differentiation possess a high degree in the network, but there are some high degree nodes that do not show distinct dynamics. It is clear that the influence of the topology does not occur merely through the degree of the nodes. There is a particular relation between these highly differentiated nodes that makes their dynamics very peculiar when compared to the rest of the network. In fact, the nodes inside the red circle in Figure 5(a) are known as the interneurons of the ventral cord of C. elegans. They are well recognized for possessing a high number of synapses [27], given that they make the bridge between sensory and motor neurons without much restriction on the type of transmitted signals (some classes of interneurons are known for receiving only a specific type of signal). In Table 1 we present the traditional names of these neurons and some information about their spatial and topological distances. Position refers to the spatial location of each neuron relative to the axis that goes from the head (value 0) to the tail (value 1) of the nematode. We see that the majority of the described neurons are in the head (more specifically, in the nerve ring of the nematode [28,29]), with the exception of PVCL and PVCR, which are in the tail. D1 is the mean topological distance between these neurons, that is, given a node i we measure how many edges we need to travel in order to go to node j, and we take the mean of this distance over all j inside the differentiated group. D1 = 1 means that the neuron is a neighbor of, or receives a direct signal from, all the other neurons shown in the table. The feature D2 complements D1, as it shows the topological distance between the given neuron and all other neurons, excluding those present in the table. We see that in all cases the differentiated nodes are closer to each other than to the rest of the network, an effect partially caused by their high degree. That is, besides having a high degree, these nodes are well connected among themselves, giving rise to a high capacity for communication inside the group and rendering their dynamics very distinct in comparison to the rest of the network. The position of each indicated neuron inside the nematode is shown by the drawings in Figure 7. Finally, we observe that $\alpha_e$ (Figure 5(b)) does not show the same behavior as $\alpha_r$, meaning that the topology does not generate dynamical groups when considering the signal complexity of the integrate-and-fire dynamics.
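The features D1 and D2 described above are mean shortest-path lengths, which can be obtained with a breadth-first search. The sketch below makes a few assumptions that the text does not specify: edges are followed in their stored direction, unreachable nodes are ignored in the mean, and all function names and the toy graph are hypothetical.

```python
from collections import deque


def bfs_distances(adjacency, source):
    """Number of edges on the shortest path from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def mean_distance(adjacency, source, targets):
    """Mean distance from source to the reachable targets (source excluded)."""
    dist = bfs_distances(adjacency, source)
    found = [dist[t] for t in targets if t != source and t in dist]
    return sum(found) / len(found) if found else float("inf")


def d1_d2(adjacency, node, group):
    """D1: mean distance to the group; D2: mean distance to all remaining nodes."""
    group = set(group)
    others = set(adjacency) - group
    return mean_distance(adjacency, node, group), mean_distance(adjacency, node, others)


# Toy graph: a fully connected group {a, b, c} attached to the chain d - e.
toy = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"],
       "d": ["c", "e"], "e": ["d"]}
d1, d2 = d1_d2(toy, "a", ["a", "b", "c"])
```

In the toy graph, node a is a direct neighbor of every other member of its group (D1 = 1) but is on average 2.5 edges away from the remaining nodes, the same qualitative pattern reported for the differentiated neurons.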

Conclusions
In a dynamical system underlain by a completely regular topology (e.g. a toroidal lattice), every node behaves identically regarding its influence on the overall dynamics. It remains an important question to quantify how local heterogeneities in the topology, which can be understood as structural symmetry breaks, may influence the unfolding of the respective dynamics. Despite continuing interest in this area, only incipient results have been obtained, because several elements that can interfere with the overall dynamics (such as initial conditions, stochasticity, and parameter configurations) are not kept constant while inferring individual effects. The current article has addressed this problem by proposing a framework for quantifying to what extent the topology around each node contributes to differentiating the dynamics. Moreover, the method is devised in such a way as to not require the consideration of any specific topological or structural measurement. Though the method can be applied with respect to any of the potentially interfering elements, in the present work we restrict our attention to the effect of initial conditions.
We illustrated the potential of the reported methodology with respect to random model (ER, BA and geographic) and real-world (C. elegans) complex networks under the integrate-and-fire dynamics. Several interesting findings are reported, including the fact that the nodes in ER networks are not significantly differentiated regarding their respective time series. The geographic model exhibits clusters of nodes that are highly differentiated in comparison to the majority of network nodes, a consequence of the high statistical fluctuations present in the network construction. The result for the BA model showed that the topological particularities of the hubs are amplified in the dynamics taking place on the system.
Regarding the C. elegans network, we found that some nodes are highly differentiated by their spiking rate. While all these nodes have been found to correspond to well connected nodes, there are well connected nodes that are not in this group, indicating the presence of additional topological influences besides the node degree. We identified these highly active nodes as corresponding to interneurons of the ventral cord.
Several future developments are possible, including the consideration of other types of dynamics, other models of networks, as well as investigating the effect of stochasticity and varying parameters or dynamics at each node.

Figure 1. Example of Hotelling distances. The three panels present the same Euclidean distance between the mean points of the two groups, but in cases (a) and (b) the statistical distance is more significant than in (c). The respective Hotelling distances are shown in each panel.

Figure 3. Example application of the methodology. (a) We execute 12 runs of the dynamics with different initial conditions and (b) project the obtained signals on the measurement space, which in this case is 2-dimensional. (c) The matrix $\Xi_{ij}$ is obtained using equation 7, and (d) its mean is taken in order to obtain the $\alpha$ of each node.

Figure 4. Mean and standard deviation of the histograms of $\alpha$ for 100 generated networks. The networks were generated using (a) and (b) the geographic model, (c) and (d) the ER model, (e) and (f) the BA model. Plots on the left were obtained using T = 8, and the ones on the right using T = 10.

Figure 5. Histograms of the dynamical differentiation of the nodes with respect to (a) the spike rate and (b) the interspike entropy.

Figure 7. Neurons of C. elegans with high differentiation. Dark and light red indicate, respectively, the left and right versions of the neuron. Pictures obtained from [26].

Figure 6. Degree distribution of the C. elegans network. Red bars indicate the nodes present in Table 1.

Table 1. Calculated topological values for the interneurons of the ventral cord of C. elegans. See text for explanation about D1 and D2.