A method for multiscale community detection in brain networks

The identification of community structure in graphs continues to attract great interest in several fields. Network neuroscience is particularly concerned with this problem considering the key roles communities play in brain processes and functionality. Most methods used for community detection in brain graphs are based on the maximization of a highly parameter-dependent modularity function. In practice, the parametrization of this function often obscures the physical meaning and hierarchical organization of the partitions of network nodes. In this work, we present a new method able to detect communities at different scales in a natural, unrestricted way. First, to obtain an estimation of the information flow in the network we release random walkers to freely move over it. The activity of the walkers is separated into oscillatory modes by using empirical mode decomposition. After grouping nodes by their co-occurrence at each time scale, the so-called k-modes clustering returns the desired partitions. Our algorithm was first tested on benchmark graphs with favorable performance. We used the method on brain networks, including the anatomical connectivity of the macaque and human brains and a model for the interactions between nodes. We found a repertoire of community structures in the anatomical and functional networks, with a clear link existing between these two. The observed partitions range from the evident division in two hemispheres –in which all processes are managed globally– to specialized communities seemingly given by physical proximity and shared function. Our results stimulate the research of hierarchical community organization in terms of temporal scales of information flow. Highlights Oscillatory modes of networks’ signals carry information on architectural rules. Meaningful partitions of the brain network are found over different temporal scales. The multiscale organization of the brain responds to the function of its components.


Abstract
The identification of community structure in graphs continues to attract great interest in several fields. Network neuroscience is particularly concerned with this problem considering the key roles communities play in brain processes and functionality. Most methods used for community detection in brain graphs are based on the maximization of a highly parameter-dependent modularity function. In practice, the parametrization of this function often obscures the physical meaning and hierarchical organization of the partitions of network nodes. In this work, we present a new method able to detect communities at different scales in a natural, unrestricted way. First, to obtain an estimation of the information flow in the network we release random walkers to freely move over it. The activity of the walkers is separated into oscillatory modes by using empirical mode decomposition. After grouping nodes by their co-occurrence at each time scale, the so-called k-modes clustering returns the desired partitions. Our algorithm was first tested on benchmark graphs with favorable performance. We used the method on brain networks, including the anatomical connectivity of the macaque and human brains and a model for the interactions between nodes. We found a repertoire of community structures in the anatomical and functional networks, with a clear link existing between these two. The observed partitions range from the evident division in two hemispheres -in which all processes are managed globally-to specialized communities seemingly given by physical proximity and shared function. Our results stimulate the research of hierarchical community organization in terms of temporal scales of information flow.

Introduction
Network community detection constitutes a problem of current vital importance. Among all the nodes and interactions constituting a network, structures of subdivisions exist. In each of these communities (also referred as groups or clusters), nodes have a greater probability of being locally connected than to nodes in other groups (Fortunato & Hric, 2016;Garcia, Ashourvan, Muldoon, Vettel, & Bassett, 2018). One example with several applications in the literature (Girvan & Newman, 2002;Porter, Mucha, Newman, & Warmbrand, 2005;Traud, Kelsic, Mucha, & Porter, 2011) is the tight-knit of a person's friendships and the exchanges they have with other groups of friends. The identification of community structures provides insights into organizational principles, not only in terms of isolation of the clusters per se but also for the collective dynamical spreading of processes over the network (Fortunato & Hric, 2016).
In the brain, neural units connect to one another over different spatio-temporal scales in intriguing and fascinating ways (Breakspear, Sporns, Honey, & Ko, 2007;Moradi, Dousty, & Sotero, 2019;Sotero & Trujillo-Barreto, 2008;Valdes-Sosa et al., 2009). The modularity of a such system is believed to critically impact the phenomena of segregation (processes occurring in groups of heavily interconnected brain units) and integration (the combination of information exclusive to specialized brain regions) (Rubinov & Sporns, 2010;Sporns, 2013). Other advantages of a community structure relate to adaptability, robustness to failure and the reduction of wiring costs -see (Garcia et al., 2018) and (Betzel et al., 2017) and the references therein. Additionally, grouping exists across different levels (a hierarchy) for supporting rapid responses to changes (Garcia et al., 2018). As an illustration, consider the large community of neural conglomerates in one cerebral hemisphere. This can be broken into smaller communities according to the functional role of their members (Thomas Yeo et al., 2011). An important initial step for the study of brain structure, however, is the definition of the nodes and edges in the graph and the scale to be considered. This selection relies on the data available, which depends on the imaging modality used to record it. For example, anatomical associations can be examined through diffusion weighted magnetic resonance imaging data (DWMRI) and functional neuroimaging or electrophysiological methods, e.g., fMRI and electroencephalogram provide insights into the dynamic interactions between brain regions (Y. Iturria-Medina et al., 2007;Valdes-Sosa et al., 2009).
Regardless of the network data, the bulk of community studies in the brain use variants of Newman's modularity function (Newman, 2006) and its maximization through Louvain-like algorithms (Blondel, Guillaume, Lambiotte, & Lefebvre, 2008) for the detection of clusters of regions (Sporns, 2013). The partitions obtained through these methods maximize intra-community edge weights relative to a specific random network null model (Bassett et al., 2013;Garcia et al., 2018). Overall, these algorithms are problematic in that the output structure depends on the chosen null model and on a resolution parameter, , as well. Exploration of the resolution parameter space yields several structures which occasionally present hierarchy (Bassett, Khambhati, & Grafton, 2017). How could one set so that a meaningful set of communities -and not any partition-is revealed? In many instances, researchers exclusively report the partition obtained for = 1 (Fukushima et al., 2018). Nevertheless, it is known that high-modularity partitions can be found for = 1 in random, unstructured graphs, where no community structure should be detected (Fortunato & Hric, 2016). More recently, a useful heuristic has been introduced to retain the socalled graph's most salient partition (Bassett et al., 2013;Garcia et al., 2018). In brief, a grid search is performed on the resolution parameter to find the value that generates the set of partitions with the greatest similarity. However, modularity maximization tends to split large communities in smaller pieces, which is a consequence of the choice of the null model. This effect is not solved by multi-resolution approaches (Fortunato & Hric, 2016). These techniques have also been adapted to generate hierarchical output structures (Ashourvan, Telesford, Verstynen, Vettel, & Bassett, 2019;Jeub, Sporns, & Fortunato, 2018) though the limitations with regard to the choice of null models and resolution parameters persist.
Other algorithms exist with somewhat fewer applications in brain research (Fortunato & Hric, 2016;Gates, Henry, Steinley, & Fair, 2016). Given the connectivity characteristics of communities, the utilization of random walkers for their identification is fairly straight-forward.
Walkers tend to stay trapped in a cluster before transitioning to a different group (Fortunato & Hric, 2016). Walktrap (Pons & Latapy, 2005) and Infomap (Rosvall & Bergstrom, 2008) are examples of detection methods that employ random walk dynamics. The former is a costly, parameter-dependent method that exploits the probability of transition between two nodes in a certain number of steps as a measure of vertex similarity to group nodes. In the latter, a codeword is assigned to each vertex the walker encounters. Infomap considers networks with community structure to be analogous to geographical maps: unique codewords (street names) are only necessary to identify nodes (streets) in one specific community (city). Although Infomap has proven effective in artificial benchmark graphs and large datasets, it has performed more poorly in classical real networks traditionally utilized for testing algorithms (Gates et al., 2016;Hric, Darst, & Fortunato, 2014), e.g. the Zachary karate club (Girvan & Newman, 2002;Zachary, 1977).
Those networks, for which ground-truth partitions are known, resemble some commonly analyzed brain graphs in that they have relatively small size and present various types of adjacency matrices, e.g., sparse, like those obtained from DWMRI or dense, from fMRI (Gates et al., 2016).
The description in terms of dynamical flows, as utilized in Walktrap and Infomap despite the above-mentioned limitations, is one that appeals to neuroscientists. In the first place, structural features like node degrees, number of edges of the brain graph, etc., condition the dynamics of network processes (Fortunato & Hric, 2016). Secondly, the transmission of information in the brain is, obviously, a dynamical process (Sotero, Sanchez-Rodriguez, Dousty, Iturria-Medina, & Sanchez-Bornot, 2019), brain connectivity being adaptive and function-sensitive within the context of structural constraints (Friston, 2011). For these reasons, we believe that the analysis of community structure and the identification of hierarchical architectures in brain networks can benefit from taking into account the dynamical aspects of its information flow. Thus, in this paper, we present a novel approach to community detection specifically designed for brain graphs, although not limited to them.
We build on the methodology introduced by Sotero et al. in recent work .
These authors studied information flow in brain networks by using the fraction of walkers that a given one finds at each node as the variable describing the evolution of the walker over the network. This function of time, taken as the network's signal, can be decomposed into its constituent frequencies by using empirical mode decomposition (EMD) (N. E. Huang et al., 1996).
Each of these oscillatory modes then associates with the notion of a temporal scale. Here, we incorporate a final step for performing data partitioning through -modes clustering (Z. Huang, 1998). The arrangement of the nodes visits recorded throughout the walkers' flows at the different temporal scales allows the unveiling of a hierarchical organization. Intuitively, a walker would spend considerable times in large communities, which is seen in slow oscillatory modes.
Analogously, fast modes could reflect the motion over smaller clusters. Initially, we test the algorithm on benchmarks and real networks with known community structure such as Girvan-Newman (Girvan & Newman, 2002), Lancichinetti-Fortunato-Radicchi (Lancichinetti, Fortunato, & Radicchi, 2008) and the Zachary karate club (Zachary, 1977). We then proceed to extract communities existing in macaque and human anatomical connectivity matrices, as well as in-silico functional connectivity graphs built over the human brain anatomical network. Meaningful patterns of communities obtained here support the reliability of our method.

The network's signal
Let us imagine a network of nodes, possibly presenting community structure, in which a random walker is set free. The walker moves over the edges available to it. In general, the probability of transitioning from node towards node in the next time step is given by (Zhang, Shan, & Chen, 2013): where is the weight of the connection from area to area . The walker tends to visit the nodes in a community before a route takes it to an outsider, a member of a different community (Fortunato & Hric, 2016). This is because of the predominantly local connectivity pattern of communities. Now, suppose that ( ≫ ) walkers simultaneously move over the same network. Each time one walker appears in a node, it finds fellow walkers, while others visit different nodes. Let us compute, for each walker, the fraction of the total number of other walkers that it encounters at each time step. After time steps, there exist time series reflecting different realizations of the flow of information in the network . Those series incorporate information on the structure of the network (e.g., the number of walkers at a hub is expected to be persistently high), and the paths therein existing (i.e., the random walk itself). Finally, for generalization purposes -as the ratio of walkers would depend on the size of the network-we standardize such time series. Fig. 1a shows an exemplary signal corresponding to one of the walkers flowing over one of the networks considered in this paper. The horizontal axis is two-fold, showing both the temporal iteration (lower) and the indexes of the nodes the selected walker visits at each time step. Given the size of the graphs in this study, i.e., brain networks with ~10 2 , we fix = 1000.
The representation of the network we come up with can be decomposed to obtain oscillatory modes at different time scales. The temporal scales of random walks processes in complex networks depend on the network structure . In other words, the network structure can be seen through different dynamical levels ranging from slow time scales in which walkers practically travel through the entire network to faster scales consisting, for instance, of the abrupt transitions from one node to the next. Empirical mode decomposition (EMD) (N. E. Huang et al., 1996) solves the problem of finding a nearly orthogonal basis for any complicated, nonlinear and non-stationary process without the need for any predefined model. The components into which the signal is broken down representing temporal scales are conventionally called intrinsic mode functions (IMFs). Each of these satisfy that: 1) the number of zero-crossings and extrema of the function are either equal or differ by one, and 2) the mean of its upper and lower envelopes is zero.
Components are different as to conveyance of information (Sotero, 2016). For notation purposes, IMF1 denotes the fastest mode (highest frequency). The rest are named accordingly.
In our interpretation oriented to community detection, IMFs of the fraction of walkers-signal reflect hierarchical organization of the network. For example, fast temporal scales of the fraction of walkers could associate with partitions composed of small groups of nodes. In slower modes, the walker may get to visit all the nodes existing in larger communities. To give an example, as shown in Fig. 1b, a walker may quickly transition over nodes 74, 76, 65, 54 and 60 (seen with IMF2) or more slowly appear at those but also at others (IMF4). The five elements mentioned above may represent a community; those and the ones identified by the green color, may constitute a broader community.

Finding nodes that cluster together
The following step consists of exploiting the features of the IMFs and grouping nodes together. For each oscillatory mode and walker, we take chunks of data consisting of the network nodes seen between zero crossings. In our previous example, 74, 76, 65, 54 and 60 would be one of such data chunks for the fast IMF2 (Fig. 1b). Other sets of nodes will appear in different portions. One may think of our selection of the chunks in terms of the oscillations of a spring-mass system. There, points to the right/left of the zero reference cluster together (the spring is stretched/compressed). Each time the signal for the displacement of the mass passes through the equilibrium position it is also switching from one 'community' to the other. The zero crossingsanalogy bases on the interpretation and symmetry properties of the IMFs. In practice, nodes outside a certain true community , may occasionally pertain to a chunk of otherwise genuine members of , given the existence of edges to that community. Likewise, all nodes belonging to a community do not necessarily have to appear together between two contiguous zero crossings. The problem is how to identify authentic node clusters over the effects of noise with the information available from the IMFs. To address this issue, we turn to unsupervised learning, particularly to clustering. In clustering analysis, the goal is to group objects based on the information availablefeatures describing the data (Assent, 2012;Ronan, Qi, & Naegle, 2016;Steinbach, Ertöz, & Kumar, 2004). The more similar items in a group are and the more different to those in other groups, the better the clustering (Steinbach et al., 2004).
To capture the natural structure of the data, we proceed to cluster network nodes -or objects, in conventional clustering jargon-given their co-appearance between zero-crossings of the IMFs -the features. Features are binary vectors encoding the positions of the nodes appearing together in the between zero-crossings chunks. This yields a complete representation of the IMFs corresponding to each walker in terms of binary variables (Fig. 2a). To ensure a proper sampling of all nodes and their co-appearances, we select the features corresponding to a high number of walkers (200 out of the 1000 simulated). This is a random selection, in the same way that subsets of variables are sometimes chosen when clustering data with multiple independent signals (Jiliang Tang, Salem Alelyani, & Huan Liu, 2014;Ronan et al., 2016).
Here, we employ the so-called k-modes clustering algorithm (Z. Huang, 1998). K-modes is an extension of the popular k-means method (Macqueen, 1967) for categorical variables, binary features being a particular case of those. A simple matching dissimilarity is used as notion of distance in k-modes. Two objects, and are far from each other by a quantity that equals the number of mismatching features (of features), namely: The algorithm minimizes a cost function after a vector (the mode 1 ) for each of the k clusters has been selected and objects grouped around it such that their dissimilarity is minimal. Alike kmeans, Huang's k-modes yields locally optimal solutions depending on the starting conditions.
Thus, one necessary first step is running the algorithm several times to select the solution with the lowest overall cost. The number of initial conditions for the clustering algorithm is empirically set to 50 in this work, based on the consistency of the solutions obtained. With all these considerations k-modes is run (Fig. 2b). The implementation of k-modes we used is available from https://github.com/nicodv/kmodes.

Accepting/rejecting hierarchical partitions
Clustering algorithms generally find clusters even in absence of underlying structure, highlighting the necessity of validating solutions (Ronan et al., 2016). In k-modes, objects are allocated in k clusters exactly. However, unless the user has prior knowledge on the distribution of the data -which rarely occurs-k is a parameter to be determined. Several metrics are intended to elucidate the correct number of clusters in the data from running the algorithm over a range of k. These metrics use measures of separation, compactness, or both. Various studies, most famously the one by Milligan and Cooper (Milligan & Cooper, 1985), have looked at the performance of indexes for assessing the results over numerical data and Euclidean distances. To the best of our knowledge, such measures have not been transformed to account for categorical data (binary, in particular) and matching dissimilarity, as recommended by Huang in his seminal paper (Z. Huang, 1998). Therefore, in Appendix B, we briefly describe six measures with satisfactory performance for recovering true cluster structure, i.e.: the Calinski-Harabasz index (Anderson, 2001;Calinski 1 The terms employed here are mostly faithful to the ones used in each of the parts combined. This is why the word mode appears with two meanings: one refers to the oscillatory functions in which a signal can be decomposed (the intrinsic mode functions), the other represents 'the centroids' of the clusters of nodes obtained with binary features (as in k-modes clustering).
& Harabasz, 1974), the C-index (Milligan & Cooper, 1985), a modified Duda-Hart criterion (Duda, Hart, & Stork, 2001), silhouette width (Kaufman & Rousseeuw, 1990), one of the family of Dunn indexes (Bezdek & Pal, 1998;Dunn, 1973) and the Davies-Bouldin index (Davies & Bouldin, 1979;Dubes, 1987). Details on their utilization and customization to account for binary data, if applicable, are also included. Importantly, as the success ratio of an individual index in determining the true number of clusters is limited and may depend on the data (Dubes, 1987;Milligan & Cooper, 1985), here, we adopt the criterion of selecting the best number of clusters based on a majority rule (Charrad, Ghazzali, Boiteau, & Niknafs, 2014). Failure to establish a majority of the indexes indicating the same correct number of clusters may hint at the lack of a definitive community pattern. This is usually accompanied by at least one of the indexes rejecting the existence of community structure altogether (see Appendix C).  This way, a hierarchical organization is unveiled.
One last issue to consider is the stochasticity that is inherent to most community detection techniques (Bassett et al., 2013;Fortunato & Hric, 2016). In our method, this is expressed as sets of randomly chosen walkers which are considered to build clustering features. The application of the clustering algorithm over those subsets of features can yield slightly different partitions. To present unique partitions, we use consensus clustering (Strehl & Ghosh, 2003). One starts by building a consensus matrix, , that accounts for the co-occurrence of nodes in communities. Nonsignificant relationships between nodes are removed by thresholding the co-occurrence matrix.
Such threshold is set to the highest value of all co-occurrences in the association matrix resulting from random permutations of the original partitions (Bassett et al., 2013). Then, the algorithm is applied over until all the partitions are identical (the 'true-partition'). The results reported in this paper correspond to the consensus partition after applying k-modes clustering to 50 sets of independent features for each of the networks. Their similarity with known ground-truth communities is analyzed by using adjusted mutual information, AMI (N. Vinh, Epps, & Bailey, 2010) (see Appendix D).

Benchmarks
Artificially generated graphs and a real network with known group structure were used to assess the performance of our algorithm. We chose simple benchmark graphs with features alike the brain networks for which the application was intended, e.g. similar number of nodes.

Girvan-Newman benchmarks (GN)
These graphs are random with known community structure. A GN benchmark consists of 128 nodes and four communities, with 32 nodes each. The average expected degree of a node is 16 (Girvan & Newman, 2002). A fraction of those connections ( ) is made to vertices in other communities. As such fraction increases, algorithms usually struggle to pinpoint the underlying community structure. Here, we set = 0.1 (Lancichinetti & Fortunato, 2009) and applied our detection algorithm over both binary and weighted versions of the GN model. One limitation of GN is its inability to reproduce the scale-free property of real networks (heterogeneous node degree distributions, node degree and community sizes following a power-law distribution) (Lancichinetti et al., 2008).

Lancichinetti-Fortunato-Radicchi benchmarks (LFR)
A more realistic benchmark, LFR does account for the heterogeneous and skewed distribution of the degree and community size. Both these parameters are chosen from power-law distributions. Networks are built by joining stubs at random (Fortunato & Hric, 2016;Lancichinetti et al., 2008). We kept the value of the mixing parameter at 0.1. The average degree in a network of = 100 nodes was set to 13 and the upper extreme of the degree distribution to 27. Consequently, the randomly generated networks (binary and weighted) utilized here had 5 communities, with sizes [26,23,21,20,10].
The GN and LFR networks used in this work were generated by using code available from (https://sites.google.com/site/santofortunato/inthepress2). All the code parameters were set to their default values except for the ones above-mentioned.

Zachary's karate club
The karate club network collects the interactions of 34 individuals over three years (Zachary, 1977). A conflict over the price of the karate lessons escalated and provoked the fission of the group as the supporters of the club's instructor formed a new organization, separate from the original one that stayed with the president. Thus, Zachary's data encompasses one of the few examples of nearly-definitive ground-truth communities, the two resulting groups (Hric et al., 2014). Many of the detection algorithms existing in the literature are tested on their ability to recuperate Zachary's factions. Such task is usually performed over a binary connectivity matrix for the members of the club (Girvan & Newman, 2002;Hric et al., 2014;Newman, 2004). In this study, instead, we used the weighted version provided by Zachary, in which the strength of an edge is given by the number of external contexts where interactions between two individuals were observed (see Appendix A, Fig. S1, for the adjacency matrix that fixes some inconsistencies in Zachary's report). The weights were normalized to the [0,1] interval.

Brain networks
The community structure of several brain graphs was investigated. The anatomical connections of both the macaque and human brains as well as a model of functional interactions based on Kuramoto oscillators that were considered here are detailed in what follows.

Macaque visual and sensorimotor anatomical network
Cortico-cortical connections existing between large-scale areas of the macaque neocortex have been identified through anatomical tracing studies (Malcolm P . Among all the areas and pathways summarized in Young's paper, only those lying in the cortical visual and somatosensory-motor systems are considered here (see Fig. S2). This connectivity matrix, with 46 nodes, is only slightly different than the one utilized in the network structure study by Honey et al. (Honey, Kotter, Breakspear, & Sporns, 2007), where visual areas were labelled following . Several connections are reciprocal. However, in general, the network is directed and binary, with 1's in a row indicating the efferent projections reported for the given area -see  for more details on the cortical areas.

Human brain anatomical network
An average human brain anatomical network (Yasser Iturria-Medina, Sotero, Toussaint, & Evans, 2014) was also constructed and analyzed in this study. The original data is freely available All images were processed using a q-space diffeomorphic reconstruction method (Yeh & Tseng, 2011) to register the voxel coordinates into MNI space (Evans, Kamber, Collins, & MacDonald, 1994). Orientation distribution functions (ODFs) were reconstructed to a spatial resolution of 2 mm 3 . As a result of the processing across all 60 subjects, a final template image (CMU-60 DSI) was created by averaging the ODF maps. This template constitutes a detailed and unbiased representative map of the nervous fiber orientations in the young healthy brain.
Next, we estimated probabilistic axonal connectivity values between each brain voxel and the surface of each considered gray matter region (voxel-region connectivity) using a fully- supporting the existence of the hypothetical white matter connection, independently of the density/strength of this connection. A network backbone, containing the dominant connections in the regional connectivity map, was computed using a minimum-spanning-tree based algorithm (Rubinov & Sporns, 2010). It was the resulting minimum spanning tree the network that we used (Fig. 3a).

Human brain functional network
To construct a representation of functional interactions in the brain, simulations of the Kuramoto model (Kuramoto, 1975) were performed. The Kuramoto model is a classical dynamical system that describes the behavior of a set of coupled oscillators. For the sake of consistency and contrast, the anatomical parcellation described in the previous section was conserved, while the relative coupling between two nodes in the network of oscillators corresponded to the backbonemeasure between regions and . The evolution of the phase of the -th oscillator, , is given by (Daffertshofer & van Wijk, 2011): where is a global coupling strength and is the intrinsic frequency of node . In all our simulations, we drew the natural frequencies from a standard Gaussian distribution and the initial conditions from a uniform distribution in the interval [0, 2 ). A total of 250 sets of natural frequencies and initial conditions were used. System (3)  Intuitively, the collective behavior of the system depends on the parameter . Stronger interactions (high ) overcome the dispersion of the intrinsic frequencies yielding coherence in the network, whereas in the low− regime oscillators tend to remain asynchronous (Breakspear, Heitmann, & Daffertshofer, 2010;Daffertshofer & van Wijk, 2011). The degree of synchrony of the oscillators is quantified through the phase uniformity (K. V. Mardia, 1975): In our calculations, a grand-average phase uniformity value for each coupling strength, , was obtained by averaging ( ) in the considered time interval across all simulations with such ( Fig. S3). Similarly, a so-called -dependent functional connectivity matrix was also calculated.
To do so, we followed Cabral et al. (Cabral, Hugues, Sporns, & Deco, 2011) and assumed an electrophysiological measure of the brain activity, such as a the mean firing rate or excitatory postsynaptic potential over the brain region, to be given as ( ) = 0 sin( ( )). Functional connectivity for a pair of nodes was then defined as the Pearson correlation between their ( ) and ( ) signals, for each simulation (Bordier, Nicolini, & Bifone, 2017). The representative interaction matrix associated with was finally obtained after Fisher-transforming the pairwise correlation coefficients, averaging, performing one sample t-tests ( = 0.05), correcting by falsediscovery rate and applying the inverse transformation. Functional connectivity matrices for = 5 ( ≈ 0.12), = 30 ( ≈ 0.47) and = 150 ( ≈ 0.98) are shown in Fig. 3b, Fig. 3c and Fig.   3d, respectively.

Data and code availability statement
The datasets and codes analyzed during the current study are available from public repositories, which have been referenced throughout the paper. A specific set of codes containing a demonstration on how to concatenate the method pipeline is offered at https://www.soterolab.com/software. All calculations but the detection of clusters were performed in MATLAB R2018a (The MathWorks Inc., Natick, MA, USA). Python 2.7 (Python Software Foundation, https://www.python.org/) was used for an implementation of the k-modes algorithm. Table 1 shows the results of the application of our method in benchmark graphs (see

Methods, Benchmarks). The AMI value (see Appendix D), appearing in the last column, illustrates
the degree of similarity between the obtained partitions and the ground-truth community structure known for each of the graphs. For both instances of the GN model (binary and weighted), the right partition was found over a range of IMFs. In the case of the LFR benchmarks, our method unveiled the 5 communities planted at IMF6. Over slower IMFs than the ones reported in Table 1, coarser organization of the networks was in some cases observed, e.g. one of the ground-truth communities stood alone and the rest merged. The analysis of faster IMFs did not return any community structure (see Appendix B and Appendix C). Finally, in the case of the karate club, the two known fractions in which it split were nearly obtained over IMF3. For each network, the number of nodes and known communities existing are given. The IMFs over which our method finds the right number of communities appear in the fourth column. In the last column, the adjusted mutual information values quantifying the degree of similarity between the solutions returned by the algorithm and the ground-truth structures are shown.

Network
The results for the network of Zachary's karate club are further illustrated in Fig. 4. A schematic representations of the two-communities structure that was revealed appears in Fig. 4a.  Fig. 4c shows 50 of such partitions (one per row). In some of those, node 10 was assigned to the community we have called "1" (in blue).
Thus, a consensus matrix (Fig. 4c, center panel) basically consists of binary values for the cooccurrences of all nodes in communities but those including node 10. Re-running the algorithm yielded 50 identical partitions (Fig. 4c, right). This partition (Fig. 4a) corresponds to the division reported by Zachary through observations of the karate club except for one member (node 9). This result is expected, according to the original paper and many others in which the karate club has been analyzed (Girvan & Newman, 2002;Hric et al., 2014), as the data apparently supports node 9's membership to the wrong faction.
The other structure (Fig. 4d) was obtained at IMF2. This consists of three communities and suggests a pattern in which the two leaders (node 1, the instructor, and 34, the president) often interact with what presumably are their intimate friendship circles (nodes colored in blue and green, respectively) and the rest of the network conforms a different group. It is important to bear in mind that our analysis was performed over a weighted matrix accounting for several contexts in which the members of the club were seen interacting. Thus, the broader community found may represent a set of passive actors in the fission of the social network, some who "sit and wait" for the inputs coming from the rapidly exchanging groups of leaders and close followers. Therefore, the consideration of temporal scales -essential to our methodology-could be a key aspect to uncover new and interesting phenomena.

The macaque visual and sensorimotor network
After testing the reliability of our method in several networks for which the community structure is known, we proceeded to its application to brain graphs. The first network considered was that of binary connections between cortical structures of the macaque visual and somatosensory-motor systems (see Methods, Macaque visual and sensorimotor anatomical network). As such, a certain distribution of network nodes between those two functional systems was expected. Fig. 5 shows the hierarchical tree returned as consensus clustering for the macaque anatomical network. At the highest level (two-clusters partition), the communities found correspond with the documented distinction between visual and sensorimotor areas (Hilgetag, Burns, O'Neill, Scannell, & Young, 2000; (Hilgetag et al., 2000).
To further explore the performance of the algorithm herein introduced, we compared our results to the more conventional Louvain-like community detection methods -Newman-Girvan null model, implemented in the Brain Connectivity Toolbox (Rubinov & Sporns, 2010). Fig. S4 shows the results of 50 initial runs and the consensus partitions obtained in both cases. Both widely used criteria for the selection of the resolution parameter of the Louvain algorithm, i.e., = 1 or chosen as the value for which partitions are more similar, yielded the same result, which is a three-clusters structure. This organizational pattern echoed our last result (Fig. 5, Fig. S4b), with the exception of the ventral occipitotemporal (VOT) cortex which in some partitions appeared together with the ventral cortex though was eventually grouped with most dorsal areas.

The human brain networks
We have also applied our community detection algorithm to the network of cortical and subcortical neural conglomerates of the human brain (see Methods, Human brain anatomical network and Human brain functional network). Fig. 6 summarizes the results. Firstly, the anatomical connectivity matrix was considered. We obtained two organizational levels, which are depicted in Fig. 6a and 6b. The highest of the two (Fig. 6a) consists of two communities which are the left and right hemisphere of the brain. Running the clustering algorithm with the features of a different IMF yielded a structure of subdivisions of the two hemispheres (Fig. 6b). This fourcommunity organization is practically symmetrical with the exception of the postcentral gyrus, the pallidum and the thalamus proper, which switch communities from one hemisphere to the other (see Fig. 6b and Table S1). The two communities to the top of the brain representation in the panel are mainly part of the frontal lobe, the cingulate cortex and the basal ganglia. On the other hand, those shown toward the bottom generally correspond with parietal, occipital and temporal areas (Klein & Tourville, 2012;Lanciego, Luquin, & Obeso, 2012). An instance of the validation indexes supporting the existence of four communities in this data was given as a demonstrative example in Fig. 2c.
The three synthetic functional networks (Fig 3b-d) resulting from superimposing Kuramoto oscillators to the matrix of anatomical connections were explored lastly. In Fig. 6c, the only communities found in the = 5 case are shown. Two of those communities are a set of neural structures belonging to either the left (in blue) or right (in green) hemisphere. However, a third community (in maroon) consists of fifteen inter-hemispherical regions, all of which except for the right posterior cingulate appeared in a symmetrical manner in both hemispheres, including the totality of the occipital areas (see also Table S1). Fig. S5 shows the consensus clusters identified by using the two standard criteria for the resolution parameter in Louvain-modularity maximization. While three regions are grouped inter-hemispherically with those of their kind, in general, modularity maximization seems to fail at recognizing the functional relationships that are supposed to exist in this data, e.g., the mixed community pinpointed by our method. The tendency to split communities to obtain higher modularity values is also observed in Fig. S5 as anatomical communities are divided in a virtually arbitrary way (compare to Fig. 6b, for example). For = 30, our algorithm (and Louvain-maximization) returned the same two-hemispheres structure illustrated in Fig. 6a. Nevertheless, for = 150, no community structure was found over any of the IMFs and combinations of walkers considered (one cluster encompassed all nodes). This conclusion was reached by applying the criteria of Appendix C.

Discussion
The problem of identifying modular structures at different scales of a network has captured the attention of the neuroscience community in recent times. Notably, Jeub et al. (Jeub et al., 2018) and Ashourvan et al. (Ashourvan et al., 2019) have introduced variants for the sweep through the Newman-Girvan modularity's γ-space eventually yielding hierarchical architecture. These methods have been tested in brain networks with encouraging results. Inherent limitations exist however, as the algorithms build on multi-scale modularity functions. Consequently, the exposed structures depend on the selection of several parameters and a null model, as in regular Louvainlike community detection (Blondel et al., 2008). Specifically, authors tend to recommend the utilization of null models that suit the characteristics of the data perfectly (Betzel et al., 2017).
However, null models appear as abundant as detection algorithms in the literature oftentimes, making its selection a key step for the success or failure of the application of an algorithm (Sporns, 2013). Ideally, one would like to provide the user with minimum-input tools that can reveal the underlying structures of the data in natural ways.
In the recent past, much of the discussion as to the directions of neuroscience research was centered on avoiding univariate statistical comparisons and, instead, looking at the network interactions as a whole (Telesford, Simpson, Burdette, Hayasaka, & Laurienti, 2011). What is more, we believe that to better capture the complexities of a system like the brain, with multiple spatio-temporal scales and dynamic reconfigurations, the mere application of generic network science methods is not sufficient. Tools must be developed to account for the brain's unique characteristics. Thus, in this paper, we have searched for organizational hierarchies through the temporal scales of the network's random walker signals, without necessitating to fix any parameter model. By doing so, the characteristics of the information flow in the brain are also incorporated . The integration of the network's architecture with the dynamical interactions of the oscillatory modes in the brain is consequently suggested as an important consideration in clustering techniques.

The brain organizes according to function
In discussing the results of applying our methodology to the macaque visual and sensorimotor network, several insights can be gleaned. Firstly, the obvious and most straightforward precedent is the classification of areas according to the functional neural system they belong to, visual and sensorimotor .
Additionally, two anatomical pathways have been identified in the visual system (Mishkin, Ungerleider, & Macko, 1983), which are usually known as dorsal (originating in the occipital cortex and terminating in the parietal lobe) and ventral (from occipital to temporal). These anatomically constrained divisions constituted the rationale for expecting the separation between somatosensory-motor and visual areas (which in turn was further divided into dorsal and ventral) by our community detection algorithm, given the numerous connections existing between areas in a functional system and somehow less connections with outsiders (Honey, Thivierge, & Sporns, 2010). We highlight, however, that limbic structures like the entorhinal cortex (ER) and the perirhinal cortex (A35), usually considered together with the sensorimotor system , were clustered with most visual areas. Also, the ventral occipitotemporal area (VOT) appeared in an otherwise dorsal community. This result is analogous to the one described by Hilgetag et al. (Hilgetag et al., 2000) in a study on cluster organization of a similar, larger network using now obsolete techniques. Moreover, in that work, prototypical ventral and dorsal areas V4 and 7a were clustered with the opposite streams. Notwithstanding the slight differences, the subdivision of the visual community obtained here closely resembles the ventral and dorsal streams reported by Hilgetag et al. Several more recent studies utilizing modularity maximization methods have detected similar sets of communities, yet different macaque datasets have been used (Sporns & Betzel, 2016).
We believe that the minor discrepancies in the analysis of the primate cortical network commented in the above paragraph are due to two main issues. First of all, we should revisit the limitations that exist intrinsic to each community detection method. These, of course, affect the partitions returned given any connectivity matrix, revealing the necessity of continuing to develop tools and migrating to more comprehensive approaches that use as much valuable network information as possible. Another important matter to recall is that imperfections exist in data as well. For example, the matrix of connections we used (M P Young, 1993) encompasses reports from several studies, which sometimes even employed different anatomical parcellations. Also, this matrix accounts only for the existence or absence of reports of links between areas of the macaque cortex, without considering the strength of the connections. Whether datasets accurately reflect the particularities of the connections existing in the brain or not will remain a fundamental question in neuroscience.
The next application of our newly introduced community-finding method was to human brain networks. We considered a connectivity matrix in which each entry reflects the evidence of the existence of a white matter link between two brain regions (Y. Iturria-Medina et al., 2007), given a template of such connections in the young healthy brain (Dunovan et al., 2015). Dominant connections were retained through a minimum spanning tree algorithm. Although the minimum spanning tree trims connections and does not contain loops, it is believed to provide a correct representation of any denser brain network to which it is applied, retaining paramount topological characteristics like its small-worldness and scale-freeness (Tewarie, van Dellen, Hillebrand, & Stam, 2015).
We have found two partitions of the anatomical network, which seemingly follow physical proximity and functional specialization rules. In the case of the first partition, commissural fibers appear to act as those rare links to members of other communities for the two brain hemispheres were perfectly separated. Each hemisphere split into two communities over the other partition found. The four-communities structure was almost bilaterally symmetrical, as only the postcentral gyrus, pallidum and thalamus proper exchanged membership, i.e., they grouped with most frontal, cingulate and basal ganglia regions in the right hemisphere and with parietal-occipital-temporal areas in the left. Variations in local connections within hemispheres may be the reason why these regions behaved in such a way. In principle, the thalamus, as universal relay station, and the pallidum projecting to the thalamus (Lanciego et al., 2012) should have no constraints to belong to one or the other intra-hemisphere community found. The postcentral gyrus, although deemed part of the parietal lobe is located in the vicinity of the frontal lobe, possibly explaining its grouping with such neural structures. The association of brain regions to perform processes and functions could also be reflected in the anatomical network and, consequently, in the communities obtained.
For instance, having the basal forebrain clustered with frontal areas may be justified by the fact that its projections to the prefrontal cortex are paramount for attention, learning and memory, and decision-making (Tashakori-Sabzevar & Ward, 2018).
The study of the matrices for the interaction of oscillators over the anatomical frame yielded stimulating results. For one thing, the mechanism for the transitions between functional states relates to the tuning of the coupling parameter in the model. Three communities appeared in the low-coupling regime ( = 5), one of them presenting areas from both brain hemispheres in a close to symmetrical pattern. When the coupling strength was raised to = 30, the mixed community was destroyed and the only recognizable pattern was the one of two separate brain hemispheres, which was one already existing in the anatomical network. This is because the connectivity matrix of a set of Kuramoto oscillators overcomes the dispersion of natural frequencies for higher values of the coupling parameter (Breakspear et al., 2010). All in all, our results suggest additional evidence for the known link between structure and function in the brain (Honey et al., 2010).
Higher functional couplings amplify the anatomical subdivision of the network in two hemispheres.
Other characteristics of the clusters of the functional networks are also noteworthy. For example, all the occipital areas (Klein & Tourville, 2012), responsible for vision (Johns, 2014), appeared in the mixed community of the = 5 -regime. Previous studies of modular organization in functional graphs found robust grouping in the occipital cortex (Meunier, 2009). Among the other areas present in the intra-hemispherical cluster: the superior parietal lobe has abundant connections with the occipital lobe and participates in visuospatial perception (Johns, 2014); the precuneus has a major role in visuo-spatial imagery (Cavanna & Trimble, 2006) and the posterior cingulate cortex is considered a core node of the default mode network (DMN) and to be involved in many tasks (Yasser Iturria-Medina et al., 2014). The inter-hemispherical community obtained appears to be one extended circuit concerned with the function of vision. The rest of the nodes within each hemisphere may consequently process all the non-visual stimuli, possibly constituting an optimal configuration for speedy and accurate performance on cognitive tasks (Garcia et al., 2018). We believe that the detection of these communities supports the notion of functional integration in the brain, whereas evidence for segregation can also be found in a division that isolates units specialized in handling with visual stimuli.

On the strengths and limitations of the algorithm
To conclude this section we would like to highlight important features of our method. One interesting scenario is the one of functional interactions with = 150. The large synchronization seen there yields a close-to--regular graph (Fig. 3d) that goes together with the algorithm identifying a single group of nodes. In fact, one recommended practice for testing new community detection algorithms is checking that they do not return group structure in the absence of it (Hric et al., 2014). Of further value is the effective performance shown when searching for the community structure of the benchmark graphs, all of which presented different network characteristics. On this topic, one must also mention the diversity of the real networks whose organization in groups was explored. The karate club is a small, weighted and undirected network.
On the other hand, the macaque visual and sensorimotor network is binary and directed. The human brain networks, with larger dimension, had different levels of sparsity. Many algorithms, e.g. Infomap, are initially designed for a specific type of graph (Fortunato & Hric, 2016) and only later extended. However, our method seems to be primed to perform reliable community detection in moderate-size networks like the ones associated with the brain's large-scale activity.
It was not in our interest, though, to test its suitability for high-dimensional graphs. In that case, the demand for computing resources would grow. Firstly, the computational cost of k-modes scales linearly with the number of objects and many random initializations of the modes are required to find a reliable clustering solution (Nguyen, 2017). Secondly, the number of possible combinations of nodes appearing between zero-crossings of an IMF would increase as well, so the implementation of k-modes must be optimal to handle a large number of binary features. We bypassed some of these complications by using the University of Calgary's One solution for reducing dimensionality is the selection of relevant features in the data (Ronan et al., 2016). Because the random choice of features (i.e., the set of walks) could also generate solutions that differ from the existing structure if many unrepresentative features were combined, the identification of a relevant set is also desired for the stability of the method. Nevertheless, dimensionality reduction by feature selection is a complicated matter on its own (Steinbach et al., 2004), especially in unsupervised learning, and state-of-the-art techniques degrade the results in some situations (Ronan et al., 2016). In short, we do not recommend the use of the method presented in this paper on large networks until further steps towards optimization are taken.

Conclusion
In summary, we have introduced an approach for the detection of modular organization by considering the temporal scales of the information flow over the networks of interest. This new tool insinuates particularly useful for the analysis of large-scale brain graphs, for which: 1) the transmission of information is a process of paramount importance and 2) a desirable balance between accuracy and computational complexity of the community detection algorithm can be achieved given the current implementation state. We find several organizational patterns existing in the brain anatomical and functional networks -also in the social network that we study. These structures may coexist together, in a dynamical way that is given by the temporal scales of the

Appendix B. Cluster validation indexes
In what follows, is the number of objects to be clustered and is the number of such clusters. Let { 1 , 2 , ⋯ , } be a partition of the integers from 1 to such that if the th object, , belongs to the th cluster. The centroid (mode) of a subset of objects is the vector that minimizes the sum of the distances to all the objects in . Supposing the (mismatching similarity) distances between every and , are known, which also applies for the distances between objects and their clusters' centroids, then: -The Calinski-Harabasz index (Calinski & Harabasz, 1974), also known as pseudo-F ratio, is defined as: Eq. (A.1) SSW and SSA are the within-group sum of squares and the among-group sum of squares, respectively. Adapting (Anderson, 2001), these quantities are obtained from the matrix of distances between pair of objects: The among-group distances are large compared to the within-group distances in the case of high separateness and compactness. Thus, maximum values are taken to represent the correct number of clusters (Milligan & Cooper, 1985).
-The C-Index (Milligan & Cooper, 1985) is calculated as: is the sum of the within-cluster distances: Eq. (A.5) and ( ) is the sum of the smallest (largest) distances in the dataset. is the total number of pair of objects in the same cluster, = ∑ ( −1) 2 =1 (Charrad et al., 2014). C-Index is restricted to the interval (0,1) and its minimum value suggests the optimal number of clusters (Milligan & Cooper, 1985).
-The Duda-Hart (Duda et al., 2001) score is inspired in the fact that the sum of squarederrors corresponding to a partition decreases with . Thus, in conventional Euclidean-distance clustering problems, the optimal number of clusters is the smallest such that is smaller than certain critical value (Milligan & Cooper, 1985). In our case of binary distances, we limit ourselves to request that ratio to be minimal, indicating a possible correct number of clusters, and define ( ) as a "sum of mismatching similarity distances error": Eq. (A.6) -The Silhouette width (Kaufman & Rousseeuw, 1990) is calculated with the following expression: here, is the average distance from the th point to every other object in its cluster: = 1 −1 ∑ ∈{ \ } and is the minimum average distance from the th object to all objects of other clusters, minimized over the clusters, namely: = min ≠ { }; = 1 ∑ ∈ (Charrad et al., 2014). The index can take values in the interval [−1,1] with negative values indicating the clustering solution is not accurate, and understandably so, as the minimum average distance from many objects to other clusters would be bigger than the dissimilarity to objects of the clusters where they belong to. On the other hand, the maximum value is taken to represent the optimal number of clusters in the data (Charrad et al., 2014).
-The Dunn index (Dunn, 1973) is generally defined as: where Δ is the diameter of the th cluster and Ι( , ) is the intercluster distance between and . Out of the many variants available for computing both these quantities, we use the ones recommended by (Bezdek & Pal, 1998): is maximized when the clusters are compact (the diameter is small) and separate (the intercluster distance is large) (Milligan & Cooper, 1985).
-The Davies-Bouldin index (Davies & Bouldin, 1979) is also a function of the ratio of within-cluster dispersions and the between-clusters separation. When using mismatching dissimilarities, it can be calculated as: where Δ is defined as in Eq. (A.9) and is the distance between the centroids of clusters and , = ( , ) (Charrad et al., 2014). The smaller ( ), the better the partition (Dubes, 1987).       Partition with the highest similarity (in terms of adjusted mutual information), existing at = 3.0.
Colored nodes correspond to communities and their location, to average coordinates of the brain regions in MNI space.