Totally homogeneous networks

Abstract: In network science, the non-homogeneity of node degrees has been a central subject of study. Yet, with modern web technologies, traditional social communication topologies have evolved from node-centric structures into online cycle-based communities, urgently requiring new network theories and tools. Switching the focus from node degrees to network cycles could reveal many interesting properties from the perspective of totally homogeneous networks or sub-networks in a complex network, especially basic simplexes (cliques) such as links and triangles. Clearly, compared with node degrees, it is much more challenging to deal with network cycles. For studying the latter, a new clique vector-space framework is introduced in this paper, where the vector space with a basis consisting of links has a dimension equal to the number of links, that with a basis consisting of triangles has a dimension equal to the number of triangles and so on. These two vector spaces are related through a boundary operator, which, for example, maps a triangle in one space to the sum of its three boundary links in the other space. Under the new framework, some important concepts and methodologies from algebraic topology, such as characteristic number, homology group and Betti number, will play a part in network science, leading to foreseeable new research directions. As immediate applications, the paper illustrates some important characteristics affecting the collective behaviors of complex networks, some new cycle-dependent importance indexes of nodes and implications for network synchronization and brain-network analysis.


INTRODUCTION
Network science has gained popularity due to its great achievements over the past 20 years, where small-world networks [1] are built from nearest-neighbor regular networks through rewiring, presenting two significant characteristics of short average path length and large clustering coefficient, while scale-free networks [2] are modeled based on random networks [3], possessing a power-law node-degree distribution. The three fundamental concepts in network science (average path length, degree distribution and clustering coefficient) correspond to three basic structures: chain, star and cycle. The clustering coefficient is calculated based on triangles, but cycles come in many more forms, so there should be a large class of complex networks with prominent cyclic structures that go beyond the typical small-world and scale-free models. Indeed, one such class was recently discovered [4]: the totally homogeneous networks, also called '(k, g, l)-homogeneous networks' [4], where k is the node-degree variable, g is the girth variable (length of the smallest cycle of a node) and l is the path-sum variable (sum of all path lengths of a node to other nodes). A network in this class with a maximum g and a minimum l has optimal synchronizability; it is a particularly important type of regular network that optimizes network synchronizability [4], as well as controllability and robustness against attacks. Despite their usefulness in various applications (as further discussed below), it is technically challenging to find such networks, since they require all nodes to have the same degree, the same girth and the same path-sum. Yet small-sized totally homogeneous sub-networks exist ubiquitously in all complex networks, typically triangles and smallest k-cavities (to be further discussed below).
Notably, totally homogeneous networks have natural symmetries, signifying their importance in network theory and applications, and therefore should be more carefully investigated.
Figure 1. A 6-node fully connected network; an 8-node smallest 3-cavity network; a 10-node nearest-neighbor regular network; a 10-node sync-optimal network.
A network usually has fully connected sub-networks (i.e. complete sub-graphs) like triangles and tetrahedrons, called simplexes in topology or cliques in graph theory, which are special totally homogeneous (sub)networks. Here, a node is a 0-clique, a link is a 1-clique, a triangle is a 2-clique, a tetrahedron is a 3-clique and so on. Other than cliques and nearest-neighbor regular (sub)networks, it is important to find those non-fully connected and yet linearly independent cavities as well as other sync-optimal totally homogeneous (sub)networks, which are known to be very important in brain functional networks and big-data analysis, as will be further discussed below. Typical examples of totally homogeneous (sub)networks, namely a fully connected network, a smallest k-cavity network, a nearest-neighbor regular network and a sync-optimal network, are shown in Fig. 1.
A network can be represented by an adjacency matrix, in which element 1 means two corresponding nodes are connected by a link, while 0 means not. Therefore, the node degree is of the highest importance for a network and, as a result, the emergence of giant nodes (hubs) signifies the key role of stars in scale-free networks, which has been a main research subject in the past.
By examining cycles, it is easy to find that cliques, such as links and triangles, constitute the backbone of a network. Therefore, a new clique vector-space framework in the form of a sequence of vector spaces defined on the binary field is introduced. Let $C_0$ be the vector space with a basis consisting of all nodes, $C_1$ be the one with a basis of all links (edges), $C_2$ be the one with a basis of all triangles and so forth. Noting that the three links of a triangle in $C_2$ are links in $C_1$, a boundary operator is defined between these two vector spaces, so that the two vector spaces connected by the boundary operator can be described and analysed using a boundary matrix. It will be seen that boundary matrices have richer mathematical contents and are useful tools for advanced studies. For instance, one can introduce the concept of linear dependence for chain vectors and for cycle vectors, as well as equivalence between cycles, so that the resulting chain groups and cycle groups (including boundary groups and homology groups) can be investigated by using advanced mathematical tools from algebraic topology and group theory. It can be foreseen that such a clique vector-space framework will enrich the studies of network science and meanwhile bring more opportunities to the field.
Using tools from algebraic topology can help find invariants in networks. The most well known is perhaps the Euler characteristic number, which equals the alternating sum of the numbers of simplexes in the network, namely the number of nodes minus the number of links, then plus the number of triangles, then minus the number of tetrahedrons and so on, until there are no more to add or subtract [5]. Another important invariant is the Betti number, which is the total number of linearly independent cavities [5], defined as follows: the order-0 Betti number is the number of connected sub-networks, the order-1 Betti number is the number of linearly independent 1-cavities (which are non-triangular cycles) and so on, as will be further illustrated by examples below. The Euler-Poincaré formula connects these two indexes [5]: the alternating sum of the numbers of simplexes equals the alternating sum of the Betti numbers. Besides, the number of links in a spanning tree of a connected network equals the number of all nodes minus 1, which is the rank of the chain group in the network [6]. Moreover, the rank of the cycle group, namely the number of linearly independent triangular and non-triangular cycles, equals the number of links minus the number of nodes and then plus 1 [6].
Cycles have been a main research subject in graph theory [6], while algebraic topology is an important branch in mathematics [5]. In the current literature of network science, however, investigations involving both network cycles and algebraic topology are quite rare. Nevertheless, there is some significant progress: the finding of a criterion for network synchronizability (2002) [7]; the introduction of (sync-optimal) totally homogeneous networks (2013) [4]; the study of signals and noise in brain activities (2017) [8]; the discovery of cliques and cavities in brain functional networks (2018) [9]; the search for basic cycles in measuring the importance of a node and cycle index in spreading over WeChat and their relationships with hyper-networks (2019) [10].
This paper presents the mathematical description of a new framework of clique vector spaces, first introducing related concepts of various cycles, second defining a sequence of clique vector spaces associated with boundary operators and finally discussing the chain group, cycle group, boundary group and homology group. Then, it shows how to utilize boundary matrices to calculate linearly independent cycles and cavities of different orders, as well as their characteristic indexes, in a complex network. Furthermore, by examining the key factors that affect the collective behaviors of a network, it demonstrates the important role of totally homogeneous sub-networks in a complex network. Finally, it briefly discusses some applications of algebraic topology in network science, indicating a couple of future research topics.

RESULTS AND DISCUSSION

Description of the new mathematical framework
Investigating network cycles is much more difficult than examining node degrees, since cycles have many variants such as higher-order cycles, linearly (in)dependent cycles and redundant cycles that contain smaller cycles. A simple cycle is intuitively clear: it is a closed path starting from a node and returning to the same node after traversing some other nodes. The smallest cycle is the triangle, which is also a clique, called a 2-clique, and a tetrahedron is a 3-clique. The 3-clique is not a cycle in the usual sense, and is called a second-order cycle, or simply a 2-cycle. Similarly, a 4-clique is a 3-cycle, a 5-clique is a 4-cycle and so on. Yet, 0-cliques and 1-cliques are not cycles. The concept of linearly (in)dependent cycles or cavities will be introduced through the description of the new framework next.
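The new framework of clique vector spaces over the binary field and their associated boundary operators are defined as follows. Let $C_k$ be the vector space with a basis consisting of the $k$-cliques, with dimension $m_k$ equal to the number of $k$-cliques. The Euler characteristic number is $\chi = m_0 - m_1 + m_2 - \cdots$. All the vectors in $C_k$ are subsets of the set of $k$-cliques, with the empty set being the zero vector, denoted as $\emptyset$ or 0. In the binary field, there are only two elements, 0 and 1, with $1 + 1 = 0$. The addition between two vectors $c$ and $d$ is defined by set operations as
$$c + d = (c \cup d) \setminus (c \cap d).$$
Define a boundary operator $\partial_k : C_k \to C_{k-1}$ to connect two successive vector spaces in the following way: denote triangle $(1, 2, 3) \in C_2$ by $\sigma_{123} \in C_2$, which has boundaries $(1, 2)$, $(2, 3)$, $(3, 1)$, and define $\partial_2(\sigma_{123}) = \sigma_{12} + \sigma_{23} + \sigma_{31}$, where the '+' operation is performed in the binary field. For the two end nodes of the boundary $(1, 2)$, node 1 and node 2, one has $\partial_1(\sigma_{12}) = \sigma_1 + \sigma_2$. Again, by using additions in the binary field, one obtains
$$\partial_1 \partial_2(\sigma_{123}) = \partial_1(\sigma_{12} + \sigma_{23} + \sigma_{31}) = (\sigma_1 + \sigma_2) + (\sigma_2 + \sigma_3) + (\sigma_3 + \sigma_1) = 0.$$
Since $C_1$ is a vector space consisting of links, its elements are called 1-chains. If, for $l \in C_1$, the chain satisfies $\partial_1(l) = \emptyset$, then $l$ is called a 1-cycle; the 1-cycles form the subspace $\ker(\partial_1)$ of $C_1$, which contains the subspace $\mathrm{im}(\partial_2)$ of boundaries of 2-chains because $\partial_1 \partial_2 = 0$. The containment relationships of these subspaces are shown in Fig. 2.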
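The boundary operator over the binary field can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that a chain is stored as a set of cliques and each clique as a frozenset of node labels; the function name `boundary` is hypothetical and not taken from the paper. Addition modulo 2 then becomes the symmetric difference of sets.

```python
from itertools import combinations

def boundary(chain):
    """Boundary of a chain over the binary field.

    A chain is a set of cliques; each clique is a frozenset of node labels.
    Adding two chains mod 2 is the symmetric difference of the two sets.
    """
    result = set()
    for clique in chain:
        if len(clique) < 2:
            continue  # the boundary of a node (0-clique) is the zero chain
        for face in combinations(sorted(clique), len(clique) - 1):
            result ^= {frozenset(face)}  # 1 + 1 = 0: a face met twice cancels
    return result

triangle = {frozenset({1, 2, 3})}
links = boundary(triangle)   # the three links (1,2), (2,3), (3,1)
nodes = boundary(links)      # every node appears twice, so all of them cancel
print(sorted(sorted(link) for link in links), nodes)
```

Applying `boundary` twice to the triangle returns the empty set, which is the set-based counterpart of the boundary of a boundary being zero.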
To study linear dependence, boundary matrices are introduced. For instance, in $C_1$, one can define a node-link matrix $B_1$, in which an element is 1 if the corresponding node is in the corresponding link; otherwise, it is 0. Similarly, define a link-face matrix $B_2$ on $C_2$, in which an element is 1 if the corresponding link is on the corresponding face; otherwise, it is 0. Through elementary row (column) transformations in the binary field, one can calculate the rank $r_k$ of the boundary matrix $B_k$, which is the number of linearly independent vectors in $C_k$. Moreover, all $k$-chains in $C_k$ constitute an Abelian group, called a chain group, with the empty set being the zero element. The number of generating elements of a chain group is the rank $r_k$ of the boundary matrix $B_k$. Likewise, all the $k$-cycles of $\ker(\partial_k)$ and all $k$-boundaries of $\mathrm{im}(\partial_{k+1})$ each form an Abelian group, called the cycle group $Z_k$ and the boundary group $Y_k$, with ranks $m_k - r_k$ and $r_{k+1}$, respectively. Two $k$-cycles $c$ and $d$ are said to be equivalent, denoted as $c \sim d$, if $c + d$ is a boundary of a $(k+1)$-chain. All equivalent cycles constitute an equivalence class. By definition, if $b \in Y_k$, then $b \sim \emptyset$. Decomposing the cycle group $Z_k$ via the boundary group $Y_k$ yields a homology group $Z_k / Y_k$, with rank equal to the Betti number $\beta_k = m_k - r_k - r_{k+1}$. Elements of this homology group are cycle-equivalent classes, called $k$-cavities.
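The rank computation described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the algorithm of the Supplementary Data: cliques are found by brute force (suitable for small networks), each column of a boundary matrix is packed into an integer bit vector, and the rank over the binary field is obtained by XOR elimination. The function names and the convention $r_0 = 0$ are assumptions made here.

```python
from itertools import combinations

def cliques_by_order(nodes, links, max_order):
    """Return {k: list of k-cliques}; a k-clique is k+1 mutually linked nodes."""
    linked = {frozenset(link) for link in links}
    out = {0: [(v,) for v in sorted(nodes)]}
    for k in range(1, max_order + 1):
        out[k] = [c for c in combinations(sorted(nodes), k + 1)
                  if all(frozenset(p) in linked for p in combinations(c, 2))]
    return out

def rank_gf2(vectors):
    """Rank over the binary field; each vector is an int used as a bit vector."""
    basis = {}  # highest set bit -> reduced vector
    for vec in vectors:
        while vec:
            top = vec.bit_length() - 1
            if top not in basis:
                basis[top] = vec
                break
            vec ^= basis[top]  # addition mod 2 is XOR
    return len(basis)

def betti_numbers(nodes, links, max_order=2):
    """Betti numbers beta_0 .. beta_max_order from beta_k = m_k - r_k - r_{k+1}."""
    cl = cliques_by_order(nodes, links, max_order + 1)
    m = {k: len(cl[k]) for k in cl}
    r = {0: 0}  # assumed convention: the boundary of a node is zero
    for k in range(1, max_order + 2):
        index = {c: i for i, c in enumerate(cl[k - 1])}
        # One column of B_k per k-clique, marking its faces in C_{k-1}
        cols = [sum(1 << index[f] for f in combinations(c, k)) for c in cl[k]]
        r[k] = rank_gf2(cols)
    return [m[k] - r[k] - r[k + 1] for k in range(max_order + 1)]

# A 4-node ring: one connected component and one 1-cavity
square = [(0, 1), (1, 2), (2, 3), (3, 0)]
# An octahedron: all pairs of 6 nodes except the three antipodal pairs
octahedron = [p for p in combinations(range(6), 2)
              if p not in {(0, 1), (2, 3), (4, 5)}]
print(betti_numbers(range(4), square), betti_numbers(range(6), octahedron))
```

For the 4-node ring the sketch reports one 1-cavity, and for the octahedron one 2-cavity and no 1-cavity, since all of its short cycles are filled by triangles.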
Next, the above-introduced concepts and algorithm are illustrated by a simple network. See Supplementary Data, Section 1 for detailed computations.
The network in Fig. 3 has 14 nodes, 26 links, 13 triangles and 1 tetrahedron, so its Euler characteristic number is $\chi = 14 - 26 + 13 - 1 = 0$, which agrees with the alternating sum of its Betti numbers, $\beta_0 - \beta_1 + \beta_2 = 1 - 2 + 1 = 0$, where the Betti numbers are calculated in Supplementary Data, Section 1. Since $\beta_1 = 2$, the homology group $Z_1 / Y_1$ has rank 2 and order 4, with four cycle-equivalent classes: the zero class $\emptyset$ (which includes the 13 triangles), two 1-cavities and their sum. The key is to search for these cliques and cavities. For distinction below, the right-side node numbers are marked by apostrophes.
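As a consistency check, the ranks of the boundary matrices implied by these numbers can be worked out from $\beta_k = m_k - r_k - r_{k+1}$. The derivation below uses only the counts and Betti numbers quoted above, together with the assumed convention $r_0 = 0$; the actual matrices are in the Supplementary Data and are not reproduced here.

```latex
\begin{align*}
r_1 &= m_0 - r_0 - \beta_0 = 14 - 0 - 1 = 13, \\
r_2 &= m_1 - r_1 - \beta_1 = 26 - 13 - 2 = 11, \\
r_3 &= m_2 - r_2 - \beta_2 = 13 - 11 - 1 = 1, \\
\beta_3 &= m_3 - r_3 - r_4 = 1 - 1 - 0 = 0, \\
|Z_1 / Y_1| &= 2^{\beta_1} = 2^2 = 4.
\end{align*}
```

The value $r_1 = 13$ matches the number of links in a spanning tree of a connected 14-node network, and $r_3 = 1$ is consistent with the single tetrahedron.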

Searching for linearly independent cycles of the network
Joining other links to the spanning tree forms linearly independent cycles, with the total number equal to the number of links minus the number of nodes and then plus 1. In so doing, linearly dependent cliques and equivalent cavities with the same length in the network should also be included.
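The spanning-tree construction can be sketched as follows. The sketch assumes a connected, undirected network given as a node list and a link list, builds a breadth-first spanning tree and returns one cycle for every link outside the tree; the function name `fundamental_cycles` is hypothetical, and the cycles it returns form one possible linearly independent set, not necessarily the shortest ones.

```python
from collections import deque

def fundamental_cycles(nodes, links):
    """One cycle per non-tree link of a BFS spanning tree (connected network)."""
    adj = {v: [] for v in nodes}
    for a, b in links:
        adj[a].append(b)
        adj[b].append(a)
    root = next(iter(nodes))
    parent = {root: None}
    tree = set()
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in parent:
                parent[w] = u
                tree.add(frozenset((u, w)))
                queue.append(w)

    def path_to_root(v):
        path = [v]
        while parent[path[-1]] is not None:
            path.append(parent[path[-1]])
        return path

    cycles = []
    for a, b in links:
        if frozenset((a, b)) in tree:
            continue
        pa, pb = path_to_root(a), path_to_root(b)
        # Trim the shared part above the lowest common ancestor
        while len(pa) > 1 and len(pb) > 1 and pa[-2] == pb[-2]:
            pa.pop()
            pb.pop()
        # a -> ... -> common ancestor -> ... -> b, closed by the link (b, a)
        cycles.append(pa + pb[-2::-1])
    return cycles

nodes = [0, 1, 2, 3]
links = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
cycles = fundamental_cycles(nodes, links)
print(len(cycles), len(links) - len(nodes) + 1)
```

The number of cycles returned equals the number of links minus the number of nodes plus 1, in agreement with the rank of the cycle group stated above.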
Here, those cycles without underlines are linearly independent 1-cycles. Obviously, all 1-cycles and 2-cycles are totally homogeneous sub-networks.
At this point, it should be clear that the above analytic and computational methods can be extended to directed and weighted complex networks, and even to multi-layered networks.

Main factors affecting collective behaviors of a complex dynamical network
The seminal paper [1] on small-world networks studies the collective behaviors of a dynamical network. It points out that regular networks have relatively large clustering coefficients, but their average path lengths are generally quite long, and that random networks are the opposite. So, it concludes that neither is good for collective dynamical behaviors such as information spreading and multi-agent synchronization. Thereby, it recommends a small-world network model that has both advantages of large clustering coefficients and short average path lengths. Now, focusing on cyclic structures in small-world networks reveals some interesting phenomena that have not been observed or emphasized before.

Network synchronization: characteristic number is key
In the study of optimal synchronizability of complex networks, it was found [4] that the totally homogeneous network with equal node degree, long girth and short path-sum is the best. Here, the synchronizabilities of four typical networks shown in Fig. 4 are compared: regular network, small-world network, random network and sync-optimal totally homogeneous network, in which the small-world network is created through random rewiring [1] while the sync-optimal network is created through deterministic rewiring [4]. All these sample networks are connected with 20 nodes and 40 links without tetrahedrons.
Data about these four sample networks are summarized below. To compare their synchronizabilities, the eigenvalues of their Laplacian matrices are calculated [4,7]: the nearest-neighbor regular network has spectral gap (smallest nonzero eigenvalue) 0.4799 with eigen-ratio (smallest nonzero eigenvalue versus largest eigenvalue) 0.0769; the small-world network has spectral gap 0.5035 and eigen-ratio 0.0714; the random network has spectral gap 0.7947 and eigen-ratio 0.0812; the sync-optimal network has spectral gap 2.0000 and eigen-ratio 0.2982.
These results indicate that the network synchronizabilities of the regular network, small-world network, random network and sync-optimal network are increasing successively. In particular, they show that the key factor affecting the network synchronizability should be the Euler characteristic number: the smaller the characteristic number, the better the synchronizability for networks of the same size. Furthermore, the characteristic number depends on both 2-cliques and 1-cavities: with fewer 2-cliques but more 1-cavities, the characteristic number will be smaller, hence the network is easier to synchronize. These data are consistent with the previous observations. Note also that the clustering coefficient depends on the number of triangles, so a larger clustering coefficient means more 2-cliques are involved, and consequently the network synchronizability will become worse. Relative to a larger clustering coefficient, a shorter average path length is more important for better network synchronization.
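The two spectral indexes quoted above can be reproduced for the nearest-neighbor regular network with a short NumPy sketch. It assumes that this sample network is the 20-node ring in which every node is linked to its two nearest neighbors on each side (which gives the stated 40 links); the other three networks depend on specific rewirings that are not given in the text, so they are not reconstructed here. The function names are illustrative.

```python
import numpy as np

def ring_laplacian(n=20, k=2):
    """Laplacian of a ring where each node links to its k nearest neighbors per side."""
    A = np.zeros((n, n))
    for i in range(n):
        for d in range(1, k + 1):
            A[i, (i + d) % n] = A[(i + d) % n, i] = 1
    return np.diag(A.sum(axis=1)) - A

def sync_indexes(L):
    """Return (spectral gap, eigen-ratio) for the Laplacian of a connected network."""
    eig = np.sort(np.linalg.eigvalsh(L))  # eig[0] is the zero eigenvalue
    return eig[1], eig[1] / eig[-1]

gap, ratio = sync_indexes(ring_laplacian())
print(f"{gap:.4f} {ratio:.4f}")  # prints "0.4799 0.0769"
```

Under this assumption the sketch recovers the spectral gap 0.4799 and eigen-ratio 0.0769 reported for the regular network.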
Network spreading: totally homogeneous networks are better
Now, consider information or disease spreading on the four networks shown in Fig. 4, all based on the cyclic Susceptible-Infected-Recovered (SIR) model [10]. This cycle-based SIR model differs from the conventional SIR model [11] in that the nodes belonging to the same cycle can always transmit information or disease, even if they are not directly connected to each other.
The performed simulations involved successively selecting every node as the source and, with spreading probability 0.06 and recovering probability 1.00, propagating the information through nodes with S (susceptible), I (infected) and R (recovered) states until no more infected nodes remained in the network. Then, after 100 runs, the average number of recovered nodes was recorded. The final results were Network 1: 2.656 nodes, Network 2: 2.441 nodes, Network 3: 1.349 nodes and Network 4: 4.111 nodes (see Supplementary Data, Section 2 for more details).
These simulation results clearly show that the random network is the worst, the small-world network is not as good as the regular network and the totally homogeneous network is the best.
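The simulation protocol can be sketched as below. This is a conventional discrete-time SIR sketch in which a node's contacts are supplied as a dictionary of neighbor lists; for the cycle-based model of [10], that dictionary would instead list every node sharing a cycle with the given node, which is not constructed here. The function names, the random seed and the ring used as input are assumptions for illustration, so the printed average is not expected to match the figures reported above.

```python
import random

def sir_recovered(contacts, source, beta=0.06, gamma=1.0, rng=random):
    """Run one SIR outbreak from `source`; return the final number of recovered nodes."""
    infected, recovered = {source}, set()
    while infected:
        newly = set()
        for u in infected:
            for w in contacts[u]:
                if w not in infected and w not in recovered and rng.random() < beta:
                    newly.add(w)
        healed = {u for u in infected if rng.random() < gamma}
        recovered |= healed
        infected = (infected - healed) | newly
    return len(recovered)

def average_outbreak(contacts, runs=100, beta=0.06, seed=0):
    """Average outbreak size over every source node and `runs` repetitions each."""
    rng = random.Random(seed)
    total = sum(sir_recovered(contacts, s, beta, 1.0, rng)
                for s in contacts for _ in range(runs))
    return total / (runs * len(contacts))

# Assumed input: a 20-node ring with links to the two nearest neighbors per side
ring = {i: [(i + d) % 20 for d in (-2, -1, 1, 2)] for i in range(20)}
print(average_outbreak(ring))
```

With recovering probability 1.00, each infected node gets exactly one chance to infect each susceptible contact before it recovers, so the outbreak size always lies between 1 and the network size.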

Other promising applications based on cycles
Cycle-based importance indexes of nodes: cycle number and cycle ratio
There are many indexes for measuring the importance of a node in a network [12], but there do not seem to be any based on cycle structure.
Two new concepts of cycle number and cycle ratio have been recently introduced [10] for measuring the importance of nodes. A node may have several smallest cycles, called the smallest basic cycles of the network. All non-redundant cycles are called basic cycles. The cycle number of a node is defined as the total number of basic cycles that pass this node. The cycle ratio of a node i is then defined as the sum of the proportions of this node i appearing in the basic cycles of all those nodes that are contained in the basic cycles of this node i.
To evaluate how well the indexes measure node importance in a network, three existing indexes (degree, H-index and coreness) and the two new indexes (cycle number and cycle ratio) are compared [10] in terms of their effects on the connectivity of, and the spreading over, a network. For connectivity, all nodes are ranked according to their importance measured by an index and some nodes are removed, then the portion of nodes in the giant surviving sub-network is computed and finally their relationships are plotted for comparison. Nodes with the same index value are ranked randomly one after another. Simulation demonstrates that the intentional attack according to the cycle-ratio ranking is more effective. For spreading, the initial node is chosen according to the importance ranking, from high to low, as the source node. The SIR spreading process is then performed until no infected node remains. After 100 runs, the average number of recovered nodes is recorded and Kendall's tau correlation coefficient [13] is calculated. In evaluating the spreading performance, due to the cyclic structure, the spreading matrix is used instead of the adjacency matrix. Here, the spreading matrix is introduced from WeChat data, which means that two nodes in the same group can communicate, even if they do not know each other. Results show that infected nodes spread very well according to the cycle-number ranking on the cycle-based SIR model.
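The connectivity test can be sketched as follows, assuming the network is given as a dictionary of neighbor lists and the importance index as a dictionary of scores. The function names are hypothetical, and ties are broken by input order here as a simplification of the random tie-breaking described above.

```python
def giant_fraction(neighbors, removed):
    """Fraction of the original nodes lying in the largest surviving component."""
    alive = set(neighbors) - set(removed)
    best, seen = 0, set()
    for start in alive:
        if start in seen:
            continue
        stack, size = [start], 0
        seen.add(start)
        while stack:
            u = stack.pop()
            size += 1
            for w in neighbors[u]:
                if w in alive and w not in seen:
                    seen.add(w)
                    stack.append(w)
        best = max(best, size)
    return best / len(neighbors)

def attack_curve(neighbors, score):
    """Remove nodes from the highest score down; record the giant fraction each time."""
    order = sorted(neighbors, key=lambda v: score[v], reverse=True)
    return [giant_fraction(neighbors, order[:i]) for i in range(1, len(order) + 1)]

# Example: a 5-node star attacked by degree; removing the hub isolates every leaf
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
degree = {v: len(nbrs) for v, nbrs in star.items()}
print(attack_curve(star, degree))
```

Any other index, such as the cycle number or the cycle ratio, can be passed in place of `degree` to compare how quickly the giant sub-network breaks down.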

Relation of network and hyper-network: studying hyper-networks
In classical graph theory, one link can only connect two nodes. In reality, however, one link could be shared by multiple nodes. Such a link is called a hyperlink. A network consisting of nodes and hyperlinks is called a hyper-network. An ordinary network and a hyper-network are related [10]. By viewing a basic cycle of a node as a hyperlink, an ordinary network can be converted into a hyper-network, as shown in Fig. 5. The questions are whether the reverse can be performed and, if so, whether the reverse process preserves all information. Since ordinary networks are special cases of hyper-networks, it is clear that generally the answers are no. But this does not exclude particular situations. In fact, for a hyper-network, multiplying the incidence matrix by its transpose yields a cycle-number matrix. Then, dividing each row of the cycle-number matrix by the corresponding cycle number yields a cycle-ratio matrix, and summing each column of the latter gives the cycle ratio of each node. Thus, the new notion sheds some light on future research on hyper-networks.
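The matrix procedure just described can be sketched with NumPy. The sketch takes the list of basic cycles as given input (finding the basic cycles themselves is not attempted), builds the node-by-cycle incidence matrix and follows the three steps above; the function name `cycle_indexes` and the small example are assumptions for illustration.

```python
import numpy as np

def cycle_indexes(n_nodes, cycles):
    """Cycle number and cycle ratio of every node from a given list of basic cycles."""
    H = np.zeros((n_nodes, len(cycles)))       # node-by-cycle incidence matrix
    for j, cycle in enumerate(cycles):
        H[list(cycle), j] = 1
    C = H @ H.T                                # C[i, j]: cycles through both i and j
    number = np.diag(C).copy()                 # C[i, i]: cycle number of node i
    safe = np.where(number > 0, number, 1.0)   # nodes on no cycle have all-zero rows
    R = C / safe[:, None]                      # divide row j by the cycle number of j
    return number, R.sum(axis=0)               # column sums give the cycle ratios

# Example: two triangles sharing the link (1, 2)
number, ratio = cycle_indexes(4, [(0, 1, 2), (1, 2, 3)])
print(number, ratio)
```

In this example the two shared nodes have cycle number 2 and cycle ratio 4, while the two outer nodes have cycle number 1 and cycle ratio 2.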

Brain functional networks: important roles of cliques and cavities
It was pointed out [9] that, although the human brain looks sparsely connected, its clique structure is quite dense. It was found [9] that cliques play a very important role in cortical, visual and perceptive functions. However, from the conventional graph-theoretical viewpoint, one can only observe the connectivity among nodes and cannot discover deeper and higher-order structural characteristics of the brain, which requires more powerful mathematical tools such as algebraic topology. From an empirical research investigation, it was found [9] that cycles with longer girths are extremely important in the task of controlling the brain and that the cavity structure is even more critical in the spreading patterns of the brain. Surprisingly, it was found [9] that the universal cavity structure in the brain does not exist in the conventional benchmark null network model, indicating the need for more powerful topological graph theory [14] beyond the classical algebraic graph theory in the studies of the brain.
Computational topology coming to network science: looking for higher-order topological features
Persistent homology [15], an important subject in algebraic topology, can be used to improve computational accuracy in different spaces and to detect subtle details in a multi-scale space, recovering more essential features of a research object on the ground space. In contrast, conventional techniques such as signal sampling and noise analysis as well as parameter selection may yield some false results. In a study of the functional network formed by time-series data obtained by a weighted rank filtration technique, it was found [8] that cliques and cavities in a functional network have higher-order characteristics than the connectivity among nodes, which provides much more useful information, consistent with some existing studies [9]. By investigating the synchronization of Kuramoto oscillators using fMRI data, it was found [8] that persistent homology can clearly reveal some synchronous behaviors in the learning process of the brain that were not discovered by conventional signal sampling and noise analysis. Typically, persistent homology helps to distinguish strong and weak synchronization phenomena in communities of the brain network and helps to detect functional changes through the learning process of the brain. A recent report [16] shows that persistent homology can be used to assist in topological data analysis, to reveal local, mesoscale and global properties and features of the network, using weighted, noisy and non-uniformly sampled complex data, verified by electroencephalography (EEG) data analysis.

CONCLUSION
Using a sequence of clique vector spaces alongside boundary operators to describe complex networks has demonstrated well that cliques, simplexes and fully connected sub-networks are the backbones of various networks. This framework allows higher-level mathematical concepts and methods such as characteristic number, homology group and Betti number to play more significant roles in network-science studies. They provide useful tools for uncovering and analysing higher-order topological features and global structures of a complex network.
Four representative classes of totally homogeneous networks have been examined, especially some elegant properties of the smallest k-cavity sub-networks and sync-optimal networks, revealing that cycle homogeneity is as important as node heterogeneity for understanding complex networks. Network-synchronization criteria originate from physics, are then evolved via optimization to establish the notion of totally homogeneous networks and are finally connected to some invariants in algebraic topology, with significance demonstrated by brain research. This process highlights the interactions among network science, physics, biology and mathematics. When looking at an object from different angles, one finds different aspects of it.
The situation resembles what the famous Chinese poet Su Dongpo said in his well-known poem [17], 'From the side, a whole range; from the end, a single peak: far, near, high, low, no two parts alike. Why can't I tell the true shape of Lu-shan? Because I myself am in the mountain'. The new perspective of this paper hopefully will open up a new research direction in network-science studies in the near future.

SUPPLEMENTARY DATA
Supplementary data are available at NSR online.