Comparing complex networks: in defence of the simple

To improve our understanding of connected systems, different tools derived from statistics, signal processing, information theory and statistical physics have been developed in the last decade. Here, we focus on the graph comparison problem. Although different estimates exist to quantify how different two networks are, an appropriate metric has not been proposed. Within this framework, we compare the performance of two network distances (a topological descriptor and a kernel-based approach, as representative methods of the main classes considered) with the simple Euclidean metric. We measure the performance of each metric in terms of its efficiency in distinguishing two groups of networks and its computing time. We evaluate these methods on synthetic and real-world networks (brain connectomes and social networks), and we show that the Euclidean distance efficiently captures network differences in comparison with the other proposals. We conclude that the operational use of complicated methods can be justified only by showing that they outperform well-understood traditional statistics, such as Euclidean metrics.


Introduction
Despite the success of complex network modeling and analysis, some methodological challenges still have to be tackled to describe and compare different interconnected systems. Identifying and quantifying dissimilarities among networks is a challenging problem of practical importance in many fields of science. Given two graphs {G, G'}, we aim at finding an injective, real-valued function h that maps the pair (G, G') to a value quantifying their (dis)similarity. Such functions have been studied in several areas ranging from chemistry, protein structures and social networks up to neuroscience, among others [1][2][3][4]. Since such an h is not unique, different approaches have been proposed, including graph edit operations, distances based on divergences, spectral parameters, kernels, or different combinations of the previous [5][6][7][8][9][10][11].
Although several of these dissimilarity metrics have been developed in the framework of complex networks and can capture the connectivity structure at different levels (degrees, walks, paths, etc), the natural question arises as to whether a simple measure (e.g. the Euclidean distance) is able to quantify and distinguish two networks.
In this work, we consider three classes of the function h. The first class, which represents a large body of the literature, quantifies local changes via structural differences. These metrics may range from the simplest Euclidean distance [12][13][14] to more elaborate algorithms that assign costs to the different operations needed to map nodes/edges of G to their G' counterparts [5,15,16]. Another distance class considers topological descriptors that map each graph into a feature vector (e.g. degree distribution, node centrality, etc). These vectors are compared with any multivariate statistical distance or information-type metric to compute the graph dissimilarity [10,17,18]. We note that considering one type of feature may imply losing topological information carried by other parameters, and the price of a more complete characterisation may be paid with more runtime. The last class considered here includes kernel-based approaches that compare global substructures (i.e. walks, paths, etc). These methods capture global information about the networks (e.g. the graph Laplacian) in a metric space, where a defined inner product directly estimates their dissimilarity [19]. Kernel-based methods, however, often integrate over local neighborhoods, which renders these approaches less sensitive to small or local perturbations [7].
In our study we show that the use of a simple Euclidean metric may provide good performance to assess graph differences, when compared to other more complicated functions. We propose a framework for measuring the performance of functions h applied on undirected binary graphs of equal size. We define the performance of h in terms of 'discriminability' and 'runtime'. The former is the capability of h to discriminate two sets of networks associated with two different groups. The latter is simply the computing time.

Comparing network distances in synthetic and real networks
In what follows, we compare the performance of the standard Euclidean distance (D_f), the dissimilarity measure (D_d) defined in [10], and the graph diffusion kernel distance (D_k) [9], one from each of the classes mentioned above. As each class encompasses many metrics with a common core (e.g. Frobenius norm, information theory, kernel-based types), we chose one recently published distance from each class for the comparison. For these algorithms, we evaluate the discriminability and runtime on different synthetic and real-world networks. We show that the Euclidean distance substantially outperforms the other methods in capturing differences between networks of the same size.

Euclidean distance
Assuming that {A_1, A_2} are the adjacency matrix representations of graphs {G_1, G_2}, the Euclidean distance is defined by

$$D_f(G_1, G_2) = \left\| A_1 - A_2 \right\|_F,$$

where $\|\cdot\|_F$ denotes the Frobenius norm.
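As a minimal illustration, this distance amounts to a few lines of code; the sketch below (our own, not part of the original work) assumes the adjacency matrices are given as NumPy arrays of equal size.

```python
import numpy as np

def euclidean_distance(A1: np.ndarray, A2: np.ndarray) -> float:
    """Frobenius-norm (Euclidean) distance D_f = ||A1 - A2||_F between two
    adjacency matrices of the same size."""
    return float(np.linalg.norm(A1 - A2, ord='fro'))
```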

Network structural dissimilarity
This dissimilarity measure captures several topological descriptors of the compared graphs; its complete definition can be found in [10].

Kernel-based distance
A recently proposed distance is based on diffusion kernels [9]. This method estimates the differences between the diffusion patterns of two networks undergoing a continuous thermal diffusion over their nodes. A set of distances at different scales t can be obtained by means of the Laplacian exponential kernels $e^{-t\mathcal{L}_i}$. The kernel-based distance is obtained by [9]

$$D_k(G_1, G_2) = \max_t \left\| e^{-t\mathcal{L}_1} - e^{-t\mathcal{L}_2} \right\|_F,$$

where $\mathcal{L}_i$ denotes the graph Laplacian of network i.
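A possible implementation is sketched below; it assumes the combinatorial Laplacian L = D - A and a logarithmic grid of diffusion scales t, both of which are our own illustrative choices rather than settings prescribed by [9].

```python
import numpy as np
from scipy.linalg import expm

def diffusion_kernel_distance(A1: np.ndarray, A2: np.ndarray,
                              ts=np.logspace(-2, 2, 50)) -> float:
    """Largest Frobenius difference between the Laplacian exponential
    kernels exp(-t*L) of two graphs over a grid of diffusion scales t."""
    L1 = np.diag(A1.sum(axis=1)) - A1   # combinatorial graph Laplacians
    L2 = np.diag(A2.sum(axis=1)) - A2
    return max(float(np.linalg.norm(expm(-t * L1) - expm(-t * L2), 'fro'))
               for t in ts)
```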
To assess the ability of these functions to capture network differences, we consider a network A and a set of perturbed networks {A_p} generated by a random rewiring (with probability p) of the original network A. We evaluate each h by computing the differences between the perturbed versions {A_p} and the original configuration A.
For low values of p, networks are very similar, and network differences are expected to increase with p. The aim of this random rewiring is simply to produce a random perturbation similar to that used when studying network robustness [20]. We then evaluate the dissimilarity value after a given fraction of links is rewired while preserving the number of links and the connectedness.
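One way to implement such a perturbation is sketched below, assuming networkx graphs; the exact rewiring scheme of the original study may differ in its details, but this sketch preserves the number of links and the connectedness as described above.

```python
import random
import networkx as nx

def perturb(G: nx.Graph, p: float, max_tries: int = 100) -> nx.Graph:
    """Rewire a fraction p of the links at random, keeping the number of
    links fixed and the graph connected."""
    H = G.copy()
    n_rewire = int(round(p * H.number_of_edges()))
    for _ in range(n_rewire):
        for _ in range(max_tries):
            u, v = random.choice(list(H.edges()))
            x, y = random.sample(list(H.nodes()), 2)
            if H.has_edge(x, y):
                continue                    # candidate link already exists
            H.remove_edge(u, v)
            H.add_edge(x, y)
            if nx.is_connected(H):
                break                       # accept the rewiring
            H.remove_edge(x, y)             # otherwise undo and retry
            H.add_edge(u, v)
    return H
```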

Benchmark tests
We build binary Barabasi-Albert (BA), Strogatz-Watts (SW) [20] and Lancichinetti-Fortunato-Radicchi (LFR) [21] models with L links and N = 100 nodes. In the BA model, the mean degree is set to 4 and the exponent of the degree distribution is, by construction, 3. Figure 1 shows the averaged distance profiles as a function of the rewiring probability p, together with the 5th-95th percentiles. As expected, all the averaged profiles display monotonically increasing curves that reach a saturation around p = 10^{-1}. Results suggest that all the measures (including the Euclidean distance) are sensitive to small structural changes (10% of reshuffled links), and reflect well the structural perturbations. Beyond this threshold (p > 10^{-1}), however, none of the functions can distinguish between a graph A and its perturbed versions {A_p}. Results also show that, despite the non-trivial heterogeneous connectivity of the LFR model, the network-distance profiles are quite similar. Further, results clearly indicate that the Euclidean distance has lower variability than the other two distances.
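For illustration, a distance profile like those of figure 1 can be assembled along the following lines; this is only a sketch that reuses the Frobenius distance and the perturb() helper sketched above, with an arbitrary number of realizations and grid of p values (not the settings used for the figure).

```python
import numpy as np
import networkx as nx

# Sketch of a dissimilarity profile for a BA graph (N = 100, mean degree ~ 4),
# using the Frobenius distance and the perturb() helper sketched above.
ps = np.logspace(-3, 0, 20)                        # rewiring probabilities p
G = nx.barabasi_albert_graph(n=100, m=2)           # 2 links per new node
A = nx.to_numpy_array(G)
profile = [np.mean([np.linalg.norm(A - nx.to_numpy_array(perturb(G, p)), 'fro')
                    for _ in range(20)])           # average over realizations
           for p in ps]
```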

Assessment of performances
Our results suggest that the dissimilarity curve obtained by comparing a given network and its different perturbed versions captures relevant features of the original connectivity, which suggests it can be directly used to compare two networks. To assess the different metrics' performances we quantify the 'discriminability' and the 'runtime'. Discriminability assesses whether a given function h is sensitive at a certain perturbation p, and whether it is suitable to distinguish two different groups of networks at a given p. Discriminability is defined as the percentage of times a function h distinguishes the two groups of networks at a certain perturbation level. The more often h distinguishes two different datasets, the better its discriminability. In addition, runtime simply measures the execution time of h. The faster a given function h estimates the differences, the better the corresponding metric. For the sake of applicability we tested the performance of the different h's on real networks.

Real networks
In this work, we evaluate the metrics' performance on two datasets of different nature: functional brain connectomes and social networks. We use a recently published brain connectivity dataset [22], which includes functional connectivity matrices estimated from magnetoencephalographic (MEG) signals recorded from 23 Alzheimer's patients (P) and a set of control subjects (C) during an eyes-closed resting-state condition [23]. Alzheimer's disease is characterised by anatomical brain deteriorations, which are reflected in an abnormal brain connectivity. MEG activity was reconstructed on the cortical surface by using a source imaging technique [23]. Connectivity matrices were obtained from N = 148 regions of interest by means of the spectral coherence between activities in the 11-13 Hz band. We specifically focused on the functional connectivity in this frequency band, which is particularly activated during resting activity with closed eyes, and reflects the main functional connectivity changes accompanying the disease [24]. All the recording parameters and preprocessing details of the connectivity matrices are explained in [23].
Following the procedure of [25], we thresholded each connectivity matrix by recovering its minimum spanning tree and then filling the network up with the strongest links until reaching a mean degree of three. Our criterion assumes that the weighted links of the raw networks had been previously validated, either maintained or canceled [26]. This thresholding criterion ensures a trade-off between network efficiencies (both global and local) and wiring cost. In [25,26], theoretical and numerical results show that, for a large class of brain networks (including functional ones such as those used in our study), this balance is obtained when the connection density ρ follows a fractal scaling regardless of the network size, according to the power-law ρ = 3/N. The resulting connectivity networks are binary adjacency matrices with N = 148 nodes and L = 222 links. A direct comparison of the connectivity matrices A ∈ {P, C} of the two groups does not allow us to distinguish them. This result agrees with previous studies that found group differences related to very local changes in connectivity [23,24]. The authors of [23], for instance, found that only 3% and 4% of the nodes account for the connectivity differences between groups, when different frequency bands are combined in the analysis.
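A sketch of this thresholding step is given below; it assumes a weighted connectivity matrix W stored as a NumPy array and uses the maximum spanning tree of the weights as the backbone (equivalent to the minimum spanning tree of the corresponding distances), with the target mean degree as a parameter.

```python
import numpy as np
import networkx as nx

def threshold_connectome(W: np.ndarray, mean_degree: float = 3.0) -> np.ndarray:
    """Binarize a weighted connectivity matrix: keep a spanning backbone of
    the strongest links, then add the strongest remaining links until the
    requested mean degree (density rho ~ mean_degree / N) is reached."""
    N = W.shape[0]
    T = nx.maximum_spanning_tree(nx.from_numpy_array(W), weight='weight')
    A = (nx.to_numpy_array(T, nodelist=range(N)) > 0).astype(int)
    target_links = int(round(mean_degree * N / 2))
    i, j = np.triu_indices(N, k=1)
    for idx in np.argsort(W[i, j])[::-1]:          # strongest links first
        if A.sum() // 2 >= target_links:
            break
        A[i[idx], j[idx]] = A[j[idx], i[idx]] = 1
    return A
```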
The approach proposed to detect global network differences between those groups is based on the dissimilarity curve of each network. For this, each connectivity graph A is first perturbed by randomly choosing l links (l = 1, 2, ..., L) and reshuffling them such that the graph remains connected. We thus get a set of |{A_l}| = 222 perturbed networks. We then compute the network differences between all pairs (A, A_l). A function h discriminates the two groups at a certain level l if the group differences are statistically significant at that perturbation level. Discriminability is defined as the percentage of hits along all L perturbations, i.e. the number of times the null hypothesis H_0 of no difference between the two groups is rejected. To assess significant differences, we used a non-parametric permutation test with 500 permutations for each l, and we reject H_0 at p < 0.05 (corrected by a Bonferroni method).
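A minimal sketch of this discriminability estimate is given below, assuming the distance values at each perturbation level are arranged as (subjects x L) arrays for the two groups; the permutation test on the difference of group means is our simplified stand-in for the statistical test used in the study.

```python
import numpy as np

def permutation_pvalue(x, y, n_perm=500, seed=None):
    """Two-sample permutation test on the absolute difference of means."""
    rng = np.random.default_rng(seed)
    pooled = np.concatenate([x, y])
    observed = abs(np.mean(x) - np.mean(y))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(np.mean(pooled[:len(x)]) - np.mean(pooled[len(x):]))
        hits += diff >= observed
    return (hits + 1) / (n_perm + 1)

def discriminability(d_group1, d_group2, alpha=0.05):
    """Percentage of perturbation levels l at which the two groups of
    distances differ significantly (Bonferroni-corrected threshold)."""
    L = d_group1.shape[1]
    rejected = sum(permutation_pvalue(d_group1[:, l], d_group2[:, l]) < alpha / L
                   for l in range(L))
    return 100.0 * rejected / L
```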
The mean distance profiles ⟨d_l⟩ for each h are plotted in figure 2. As in the synthetic models, the profiles show a monotonically increasing behavior. At low rewiring percentages (below ≈11%) there are no significant differences at the group level: for small perturbation levels, the functions h cannot distinguish connectivity between groups. Something similar is observed when the link perturbation is above ≈70%. On the other hand, D_f appears as the metric with the highest discriminability, closely followed by D_k, while D_d shows the lowest. Results clearly suggest that the Euclidean distance better distinguishes the two groups of networks considered here.
We now move our attention to the comparison of social networks. We applied our approach to the analysis of connectivity differences between two social networks. Each connectivity matrix contains the friendship and socioemotional interactions among workers in a tailor shop in Zambia, during two periods of time (seven months apart), immediately before an unsuccessful (t_1) and a successful (t_2) strike, respectively [27]. Networks in each group consist of 39 actors forming a giant component. Both networks reflect the changing patterns of alliance among workers during extended negotiations for higher wages.
Each network was rewired under the same procedure explained above, retrieving |{A_l}| = 100 perturbed networks to compare with. We repeat this process for 20 independent realizations and then average the distance profiles for each h. Results displayed in figure 3 suggest that the Euclidean distance distinguishes the change of alliance patterns among workers between the two periods t_1 and t_2 better than the other two metrics.
We assessed the execution time for computing a distance profile for each subject (we used MATLAB R2017a running on macOS 10.12.6, with a 4 GHz Intel dual-core i7 processor and 32 GB of memory). Figure 4 shows the relative orders of magnitude, in seconds, that each metric takes to compute the network differences. For the analysis of brain connectomes, the average times obtained are t_f = 6.83×10^{-5}, t_d = 2.68×10^{-2} and t_k = 1.90×10^{-1} seconds for the Euclidean distance, the dissimilarity metric and the kernel-based method, respectively. The results clearly show the Euclidean distance as the fastest method in comparison with the other two: D_f is three (four) orders of magnitude faster than D_d (D_k). Similar relative orders of magnitude are obtained for the social networks.
Runtime ultimately determines which measure has the best performance when computing graph distances. While the discriminability of D_k is close to that of D_f, its runtime is four orders of magnitude slower than D_f, because D_k needs to search over several scales to find the highest difference. The runtime of D_d is three orders of magnitude slower than D_f, because D_d takes into account many topological properties under several tuning parameters.
To rule out the possibility that the differences in the number of connections of the networks could account for significant differences in the different distances, we have assessed the differences between surrogate graphs of the two groups, obtained by randomly rewiring the links of the original networks while keeping the same degree distribution. This procedure allows 'normalizing' for the potential influences of changes in the number of connections.
For brain connectomes, we estimate the distance between the aggregate (averaged) networks of each group of subjects/patients. For the analysis of social networks, instead, we used the original social interaction matrices. For both datasets we create a set of 100 surrogate networks as described above and compare, by means of a z-value, a given distance between the original networks with that obtained from the surrogate pairs. Table 1 reports the z-values for the three metrics. Interestingly, the low z-values obtained by D_d suggest that this distance mainly reflects differences in the degree distribution. In contrast, the Euclidean and kernel-based distances seem to capture structural differences beyond the degree distribution or density.
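The surrogate comparison can be sketched as follows, using degree-preserving double-edge swaps for the rewiring and any of the distance functions above as dist; the number of swaps and the use of networkx's double_edge_swap are illustrative choices on our part.

```python
import numpy as np
import networkx as nx

def surrogate_zscore(A1, A2, dist, n_surr=100):
    """z-value of the observed distance dist(A1, A2) against distances
    between degree-preserving surrogates of the two networks."""
    d_obs = dist(A1, A2)
    d_surr = []
    for _ in range(n_surr):
        G1, G2 = nx.from_numpy_array(A1), nx.from_numpy_array(A2)
        for G in (G1, G2):    # shuffle links, keep the degree sequence
            nx.double_edge_swap(G, nswap=5 * G.number_of_edges(),
                                max_tries=100 * G.number_of_edges())
        d_surr.append(dist(nx.to_numpy_array(G1), nx.to_numpy_array(G2)))
    return (d_obs - np.mean(d_surr)) / np.std(d_surr)
```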
In summary, the Euclidean distance emerges as the metric with the highest discriminability for distinguishing the groups of networks studied here, and with the fastest computation, which is important when one manages large datasets.

Concluding remarks
Finding an accurate graph distance is a difficult task, and many metrics have been described without a framework to properly benchmark such proposals. Here we advocate for the simple Euclidean distance as one offering a very good trade-off between discriminative and computational performance, in contrast to more elaborate algorithms. We propose a method to detect global network differences with high efficiency and fast computation time. Although we used a random rewiring, the analysis of other perturbations or network models deserves a statistically detailed study that is out of the scope of this rapid communication.
Our results suggest a non-trivial dependence between a network's structure and the network distances. Appropriate statistical control of the distances (e.g. via group comparisons or random null models) is therefore necessary to take these differences into account. We also propose a simple framework to assess any metric's performance in terms of discriminability and runtime. Results indicate that, for comparing binary networks of the same size, the Euclidean distance's discriminating capabilities outperform those of the graph dissimilarity and the diffusion kernel distance.
Our approach is founded on unweighted network models. Its natural application implies binarization after thresholding, a procedure widely adopted to mitigate the uncertainty carried by the weights estimated from neuroimaging data. Further work is needed to clarify how our approach can be extended to weighted networks, where the perturbation of links is less straightforward (simple rewiring, perturbation of weights, etc). Similarly, more elaborate network models (e.g. multi-layer, signed, spatial, or time-varying networks) might need more elaborate tools to account for the geometry or the interdependencies of interacting units, and to make their comparisons more robust.