Mutual information model for link prediction in heterogeneous complex networks

Recently, a number of meta-path based similarity indices like PathSim, HeteSim, and random walk have been proposed for link prediction in heterogeneous complex networks. However, these indices suffer from two major drawbacks. Firstly, they are primarily dependent on the connectivity degrees of node pairs without considering the further information provided by the given meta-path. Secondly, most of them are required to use a single and usually symmetric meta-path in advance. Hence, employing a set of different meta-paths is not straightforward. To tackle with these problems, we propose a mutual information model for link prediction in heterogeneous complex networks. The proposed model, called as Meta-path based Mutual Information Index (MMI), introduces meta-path based link entropy to estimate the link likelihood and could be carried on a set of available meta-paths. This estimation measures the amount of information through the paths instead of measuring the amount of connectivity between the node pairs. The experimental results on a Bibliography network show that the MMI obtains high prediction accuracy compared with other popular similarity indices.

networks become infeasible in heterogeneous ones. Note that, in a heterogeneous complex network, two objects might be connected via different paths while the semantic underneath them are not identical. In this regard, a meta-structure known as meta-path has been proposed in order to exploit nodes dependencies in heterogeneous complex networks. Meta-path is a sequence of node types that acts as a similarity search pattern between node pairs. Without restriction on either the structure or length of the meta-paths, the number of possible meta-paths is unbounded. However, generating informative meta-paths and selecting the best set of them are some of interesting issues in meta-path based similarity searches. Accordingly, the studies on meta-paths are roughly divided into two classes.
In the first class, the focus is on efficient discovery or selection of meta-paths, especially in large scale networks. Automatic discovery of meta-paths in large scale complex networks has been studied in ref. 10. The users are asked to provide some examples of node pairs that exhibit high proximity. Then, a greedy algorithm would be employed to generate the meta-paths that can appropriately explain the relationship between the example pairs. The experiments on real-world heterogeneous information networks, DBLP and Yago, show the effectiveness of the method in finding some important meta-paths. The difficulty of discovering reasonable meta-paths by human in large scale complex networks has been discussed in ref. 11. Here, the most interesting meta-paths are discovered based on conventional knowledge discovery principles from the hundreds-of-thousands of possible choices.
In the second class, assuming a meta-path is given, the objective is to explore new similarity measures or to develop efficient similarity searches. There are a number of meta-path based similarity measures proposed in the literature. Two basic measurements, named as Path Count (PC) and Random Walk (RW) 12 , are based on the number of path instances between the given node pairs 13 . The higher the PC and RW in value, the more the similarity can be obtained. HeteSim (HS) was proposed in ref. 14 as a new path-based relevance measure. It is a symmetric and self-maximum measure which has a much smaller computational complexity than SimRank 15 . Another similarity measure, called as PathSim (PS), is only applicable on symmetric meta-paths 16 . The basic idea is that two similar objects not only be strongly connected, but also share comparable visibility. In comparison with random walk based measures, PathSim is able to find more meaningful similarities.
In addition to these similarity measures, meta-path based topological features have been conducted in a number of works for similarity search. Collective classification in heterogeneous complex networks 17 , social link prediction in multiple partially aligned social networks 18 , co-author relationship prediction in heterogeneous bibliographic networks 13 , and a meta-path based prediction model based on a topic discriminative search space 4 are some of the interesting applications of similarity searches using meta-path based topological features.
The proposed meta-path based similarity indices suffer from two major drawbacks. Firstly, the similarity indices are strongly dependent on the amount of reachability between the nodes while they do not consider the information within meta-paths. Therefore, they tend to bias to highly visible or concentrated objects. Secondly, most of these indices are originally designed for a single and usually symmetric meta-path. In other words, even though a set of useful meta-paths might be available, benefiting all the meta-paths to enhance the quality of predictions is not straightforward.
Recently, information theory has been employed for link prediction problem in homogeneous complex networks [19][20][21] . The main contribution of these works is to measure the information provided by common topological features, such as common neighbors, instead of using them as simple topological features. In this paper, we propose a mutual information model to perform link prediction in heterogeneous complex networks. The proposed model, called as Meta-path based Mutual Information Index (MMI), provides an information theoretic framework in which multiple meta-paths with different semantics are mutually employed to improve the similarity exploitations. Here, the link likelihood of a node pair is formulated as a conditional self-information of the existence of that link when a set of meta-paths are available. We have evaluated the proposed approach using a Bibliographic network, DBLP. The results of prediction accuracy under Precision indicate the efficiency and validity of the proposed MMI method. It will be also shown that the efficiency would be kept even when the network sparsity or noisy connections are increased. The main contributions of this paper can be summarized as follows: • A new meta-path based similarity measure is proposed from information theory perspective for predicting future links in heterogeneous complex networks. • In the proposed approach, it is shown that how the contribution of meta-paths results in more accurate link prediction using their semantic information. • To the best of our knowledge, this is the first study to apply an information-theoretic model to link prediction problem in heterogeneous complex networks.

Results
At first, we introduce some basic definitions including heterogeneous complex network, network-schema, meta-path, and path-instance. Then, some discussion are given about the generation and selection of meta-paths in heterogeneous complex networks. Afterwards, the proposed model would be introduced.

Terminology Definitions. Definition 1.
A heterogeneous complex network is defined as a graph are defined for assigning a label (object/relation type) to a node or an edge, respectively. Note that, in homogeneous complex networks, t = r = 1. . For instance, a well-accepted network schema of DBLP database 22 has been shown in Fig. 1

. As it is indicated
} and is composed of seven relation types appeared above the edges.

Definition 3.
A meta-path  is defined on the network schema T G . More formally,  is a sequence of entity types which connects an object type of ′ T 1 to that of ′ T l through a composite relation. The value of l is called the meta-path length.

Definition 4.
Let  be a meta-path. We call p = (a 1 , … , a l ) a path instance of  between a 1 and a l where δ( Fig. 2, three meta-path examples on DBLP network schema of Fig. 1 are shown. These meta-paths can be employed to exploit similarities between any two authors. In other words, each meta-path could act as a similarity search pattern between two authors.

Meta-path generation and selection issues.
Referring to the network schema of DBLP, as shown in Fig. 1, assume that the aim is to find all possible meta-paths describing co-authorships dependencies. For even-length meta-paths, i.e. l = 2k, the possible number of meta-paths would be 4 k−1 . Similarly, for odd-length meta-paths, i.e. l = 2k + 1, an additional P would be added to an even-length meta-path which leads to the total number of 4 k−1 odd-length meta-paths. Thus, the number of possible meta-paths with maximum length of L is: In general, for a star network schema with K different object types, the total number of meta-paths is equal to To find the best set of informative meta-paths, an exhaustive search is required in a search space with size of θ(K L ) which is a NP-Complete problem 10,12 . Thus, as the network schema grows in size, generating all informative meta-paths as well as selecting the best set among them become a nontrivial issue. However, all the possible meta-paths might not necessarily provide the proper information or meaningful semantics. In addition, it has been found that a typical meta-path might lose its importance as its length increases 16 . Taking these facts into account, the negative impact of highly computational time meta-path generation and selection issues could be reduced by using a prior knowledge. It is important to mention that, using the most informative meta-path could not entirely explore semantic dependencies between node pairs. So, the contribution of selected meta-paths leads to more accurate link prediction. We will address the above mentioned issue in experimental results.
However, employing multiple meta-paths raises another challenging issue that is how they could be contributed with different semantics. For instance, assume DBLP network schema, as shown in Fig. 1, is given in order to investigate co-authorship dependencies between a pair of authors, namely (A i , A j ). Three different dependencies with three various semantics between A i and A j are demonstrated in Fig. 3. The first dependency is described by  The first meta-path seeks the similarity of two authors based on the citations between them. While the second and the third metapaths consider the published papers between two authors with the same keyword(s) and same publisher as the similarity search patterns, respectively. A, P, V, and T stand for Author, Paper, Venue, and Term, respectively. the first meta-path, 1  , where two object types, i.e. papers (P) and terms (T), declare the authors dependencies based on their publications with similar keywords. Roughly speaking, such dependency states that two authors are more likely to have a future collaboration if they publish more papers with similar keywords. Through the second meta-path,  2 , the authors dependencies are defined by their direct citations. The third dependency is defined by the third meta-path, 3  , which is an asymmetric meta-path including three object types. Accordingly, the published papers by the same venue and the published papers with the probable similar keywords describe another semantic dependency between two authors. Since each meta-path has its own number of object types with a specific length, integrating all the semantic information provided by each meta-path is not straightforward.

Meta-path based Mutual Information Model for Link Prediction.
In the literature of graph theory, many topological graph measures have been proposed to characterize the structural information through measuring the graph complexity. Furthermore, it has been well investigated that information-theoretic measures based on graph entropies provide positive structural information and meaningful interpretations 23 . To obtain such graph entropy measures, some graph invariants should be considered like the number of vertices, the vertex degree sequences, extended degree sequences, eigenvalues, and connectivity information 24 .
Accordingly, a number of studies has been done to analyze complex networks by introducing some graph entropy measures [24][25][26] . Moreover, graph entropy has been recently employed in the problem of link prediction in complex networks. In ref. 19, a mutual information similarity index was proposed where common neighbors is used to provide structural information to estimate the links likelihoods. In ref. 20 structural information, including common neighbors, were employed to facilitate the link prediction task. The extension of this work has been employed in a weighted complex network 21 . These entropy-based similarity indices have been evaluated over a number of complex networks and compared to common proximity measures. The results show that the proposed entropy-based indices improve the prediction accuracy with reasonable lower computational time complexity.
Inspiring from these studies, we have developed a new mutual information model for link prediction in heterogeneous complex networks. The proposed model estimates the link likelihoods via introducing the meta-path based link entropy following a given meta-path. Furthermore, it provides an information-theoretic framework such that multiple meta-paths are employed to facilitate link prediction by providing different semantic information about the target node pairs. Before introducing the proposed model, we recall two basic definitions from information theory 27 .
Definition 5. Let X be a random variable and x be an outcome of X with probability p(x). Then, the self-information of x quantifies the uncertainty of the outcome x and is defined as follows: Definition 6. Let X and Y be two random variables and x and y be their outcomes, respectively. The mutual information of X and Y measures the amount of reduction in uncertainty of the outcome x when the outcome y is known, or vice versa, and is defined as follows: where I(x|y) is the conditional self-information of outcome x when outcome y occures. Now, consider link prediction in a heterogeneous complex network. We start with the case when a single meta-path  = … T T T i k j is given. We define the likelihood score of a node pair (x, y) with δ(x) = T i and δ(y) = T j as: where L xy 1 denotes the event of (x, y) being connected and is the conditional self-information of L xy 1 when the meta-path  is given. Eq. (3) indicates the link likelihood of node pair (x, y) as the amount of uncertainty of the event L xy 1 based on the provided information by . According to Definitions 5 and 6, the smaller the value of | I L ( ) xy 1  , the higher the probability of event L xy 1 . Recalling Definition 6, the likelihood score can be rewritten as: 1  is the mutual information of the event L xy 1 and the given meta-path . We estimate I L ( ) xy 1 by a prior probability that the node pair (x, y) has a link.

Definition 7.
Let (x, y) be a node pair such that δ(x) = T i and δ(y) = T j . k x j denotes the j-typed neighbor size of x, i.e. the number of adjacent nodes with type T j to x. Similarly, M ij denotes the number of possible links between the object of type T i and that of type T j . According to the above definition, when δ(x) = T i and δ(y) = T j , then the prior probability of L xy 1 is calculated as:  . Then, assuming that there is not any correlation between the path links, p(p z ) can be estimated by: . Therefore, the mutual information I L p ( ; ) xy z 1 is estimated as: In Eq. (6), we have considered the case that only a single meta-path is given. As discussed in the second section of the Results, in real world heterogeneous complex networks, it is a difficult task to obtain a proper meta-path that could explain all the nodes dependencies. Now, we extend Eq. (6) to the case of multiple meta-paths. Let  be the set of selected meta-paths. Then, the likelihood score of the node pair (x, y) is calculated by: xy xy where  ω is the contribution weight of each meta-path such that  ω ∑ = 1, and S xy  is calculated using Eq. (6). The following example facilitates the understanding of the MMI. Example 1. Consider the network as shown in Fig. 4. Let us calculate the likelihood score of (A 2 , A 6 ) based on the meta-path = APVPA  for co-authorship prediction. The two path instances of  are p 1 = (A 2 P 1 V 2 P 5 A 6 ) and p 2 = (A 2 P 3 V 1 P 4 A 6 ). Using Eq. (5), we get = − = . I L ( ) log(2/21) 1 02 A A 1 2 6 . Similarly, we haveI L ( ) . Now, using Eqs (6) and (8), the final link likelihood score is obtained as . This indicates that the link likelihood existence of (A 2 , A 6 ) is about twice of (A 3 , A 5 ) which has a likelihood score of 2.61. The above calculations are summarized in Table 1.
Experimental Results. In order to evaluate the proposed approach, DBLP network data 28 has been used for two different experiments. Firstly, link prediction between similar object types is considered based on two link prediction tasks as co-authorship and citation prediction problems. In the first problem, the objective is to predict future scientific collaborations between the authors. While the second problem copes with measuring the documents similarity. For each of the mentioned problem, two meta-paths have been selected. Accordingly, { , } 1 2   for the first and { , } 3 4   for the second problem have been used in the experiments. They have high significance levels for predicting future co-authorships 13 and citations 4 relations, respectively, as shown in Table 2. In the second experiment, we consider recommending publishers to the authors as a link prediction task between two classes of objects. For this purpose, we have extracted a bipartite network between the authors and venues from the original DBLP network. Accordingly, we have selected the meta-path = AVAV 5  to define the dependencies for each author-venue pair. The performance of the MMI method has been compared with five LCP-based similarity indices 8 which have been found efficient in bi-partite networks.
For both experiments, four samples of DBLP network, namely S3, S5, S7, and S9, have been selected each of which contains the authors with 3, 5, 7, and 9 published papers, respectively. In the section of Methods,  Prediction between similar object types. The prediction accuracy under Precision has been reported for co-authorship and citation prediction problems in Tables 3 and 4, respectively. In both problems, the MMI overcomes all other indices by employing either the first or the second meta-paths in most cases with higher Precision rate. These results are clearly indicated for S3 and S5. When the network gets enriched by considering more publications for each author, i.e. using S7 and S9, the Precision rate of all indices becomes more competitive. The reason is that the MMI relies on the information of meta-path and its relevant path-instances through measuring the link entropy. Meanwhile, other indices do not consider the information provided by the available paths, rather they are dependent on the amount of reachability between two nodes. Therefore, it can be inferred that the MMI could better capture the similarities between the nodes with lack of efficient structural and reachability information.

Meta-path Description
The source author cites the target one.     Another interesting observation is that although the Precision rates of all meta-path based measures are close to that of SimRank in many cases through employing each of meta-paths, the MMI benefits both of them, i.e. 1  and 2  , by integrating their information. As a consequence, the MMI overcomes the other indices in term of the Precision when both meta-paths are employed. Accordingly, by integrating both meta-paths, the Precision rate of the MMI is increased about 20%, 9%, 11%, and 3% in S3, S5, S7, and S9, respectively, for co-authorship prediction problem. For the citation prediction problem, such improvements are about 4%, 6%, 10%, and 7%, respectively. The results indicate that, the Precision rates of the citation predictions are lower than those of co-authorships for all indices. However, the citation prediction problem has been known to be more challenging than the co-authorship predcition problem as the former one is a directed link prediction task with more noisy connections. Moreover, it is not cleared where the links between the papers could be determined in the network, in advance. This is in contrast with a co-authorship network where authors are more likely to form scientific communities with each other that would result in more accurate predictions. It should be noticed that SimRank has been considered as a homogeneous proximity measure since its accuracy does not depend on the employed meta-path.
The performance of the MMI, when multiple meta-paths are given, depends on the contribution weight of each meta-path, as represented in Eq. (9). Since we have selected two meta-paths for each problem in the first experiment, the above equation for co-authorship prediction problem can be rewritten as: To study the impact of α on the Precision of the MMI, it has been changed from 0 to 1. The the results are depicted in Fig. 5 for both problems. Referring to Table 3, the meta-path 2  is more informative than  1 . It can also be drawn from the Fig. 5 that the Precision rate is decreased when we simply give high contribution weight to the second meta-path, 2  . That is, the higher the Precision rate, the more the proper selection of α. In practice, the best value of α is obtained in α ≈ 0.3. The observation for citation prediction problem is similar except that the best value of α is found in α ≈ 0.4. That is the effectiveness of  3 and  4 in expressing citation relations are more close to each other compared to  1 and 2  in predicting co-authorship relations. To study the impact of choosing the number of top-L candidate links on the Precision rate, L has been changed from 10 to 100 and the average results of 100 independent trials have been found, as shown in Figs 6 and 7. The results demonstrate that the MMI still keeps its higher ability of retrieving latent links when L changes, specially using the more informative meta-path. Another interesting achievement is that the stability of the MMI in retrieving the latent links when L changes is improved by integrating both meta-paths. As depicted in Fig. 8, the MMI overcomes all other indices in term of Precision while it is robust to the changes of L.
Prediction between different object types. The Precision performance of the MMI in recommending relevant publishers to the authors is reported in Tables 5 and 6, based on top-100 and top-10% rated links, respectively, and compared with LCP-based methods. As the extracted network for the second experiment is a bi-partite network between the authors (A) and venues (V), the possible meta-paths are restricted to the form of (AV) l where 2l − 1 shows the length of the meta-path. Since the increase of l does not bring any additional meaning or more informative meta-paths, we have selected l = 2 in the experiment. Therefore, the only meta-path employed by the MMI is  = AVAV 5 . As shown in Table 5, the MMI obtains considerably better Precision rates compared to LCP-based methods when S3 and S5 are used. By considering top-10% rated links, as shown in Table 6, the Precision rate of the MMI is still better than the LCP-based methods or strongly competitive, based on S3 and S5, respectively. This observation confirms our discussion in the previous experiment on the ability of the MMI to desirably retrieve  latent links in the presence of sparse or noisy links. On the other hand, while the MMI keeps its high Precision rates on S7 and S9, the Precision rates of LCP-based methods are substantially increased. This could be viewed as the result of forming strong local communities in S7 and S9 since more relations between the authors and venues exist. The impact of choosing the number of top-L candidate links has also been studied and depicted in Figs 9 and 10. It is obvious that as the value of L changes, the MMI brings higher or competitive Precision rates even by employing a restricted and single meta-path. However, as demonstrated in the previous experiment, the Precision rate of the MMI would be improved if other informative meta-paths are available to be mutually contributed.

Discussion
There are two sources of structural information for link prediction in complex networks. For a candidate node pair, one can explore the neighborhood properties to extract structural data, such as common neighbors, which can be used to measure the link likelihood. On the other hand, similarity search could be performed through the existing paths. For instance, counting the number of available paths between the candidate pairs. However, when the network becomes heterogeneous, considering only the amount of reachability would not be sufficient to exploit nodes similarities. In other words, measuring the information provided by the network heterogeneity, through exploring different nodes or relation types, might lead to extract more detailed information to enhance the similarity search and make the predictions more accurate (Fig. 11).  We have proposed a new similarity measure based on mutual information model in heterogeneous complex networks. The proposed model introduces an information-theoretic framework in which link entropy is defined as a semantic measure for link prediction. This measure is reinforced by the number of path instances of an employed informative meta-path. Moreover, to obtain more accurate predictions, the mutual contribution of meta-paths are utilized with associating a contribution weight to each of them. To investigate the efficiency of the proposed model, different experiments were conducted and compared with three popular meta-path based link predictors, an efficient path-based homogeneous link predictor, and five LCP-based link predictors. We have selected four samples from DBLP network which are differed in the amount of reachability between the node pairs.
• Comparing with other classical indices, the proposed model can efficiently predict links through measuring the information along the paths rather relying on the number of path instances. Even when the number of node connectivities in the network become low, the model correctly predicts the links. This is confirmed by two network samples S3 or S5 in our studies. In other words, when the network is partially noisy, our model keeps robust against noisy connections. • The stability analysis of the proposed model has been addressed through changing the contribution weights of the meta-paths as well as choosing the number of top candidate links. The results indicate that integrating multiple meta-paths considerably improves the stability of the model. Also, the prediction performance improves substantially in term of Precision. It is concluded that employing multiple meta-paths with different semantic information captures more nodes dependencies. • In case of employing multiple meta-paths, a heuristic parameter tuning is required to be applied on the contribution weights of meta-paths. Grid search, random search, gradient-based search, and Bayesian optimization are among the most popular parameter optimization heuristics 29,30 . However, a naive solution could be used to obtain the contribution weights using a prior structural information. For example, one can count the total number of path instances of each meta-path, say c i , between all the observed links in the network. Then, the contribution weight of each meta-path i could be found as c i /∑ i c i . For instance, we obtained ω = .   • Although, meta-path based link predictors have a moderate performance in some cases using the less informative meta-path compared to SimRank, the Precision rate considerably improves when more informative meta-path is employed. This result indicates that a link predictor undesirably might lose some semantic information of links when a heterogeneous network is projected into an equivalent homogenous one. It can be concluded that the prediction performance would be improved by considering the heterogeneous information if a proper meta-path or a set of informative meta-paths could be found or contributed to enhance the similarity search. • We have investigated through experimental results that by reducing the heterogeneous network into a bi-partite one, the proposed model still keeps its efficiency and performability. Such reduction might be continued to a completely homogeneous network. More formally, suppose x and y be a node pair in a homogeneous complex network. Let P denotes a simple path between x and y with length l, ≤ P L be the set of all simple path with l ≤ L, and L is the maximum meta-path length. Now, the MMI could be specialized to act as a homogeneous link predictor by re-writing the Eq. (9) as follows: xy P P P P xy P L L Therefore, the amount of information through the simple paths can be measured in term of link entropy in homogeneous networks as well.

Methods
Problem Definition and Algorithm. Suppose a heterogeneous complex network is given as , as stated in Definition 1. Assuming δ(u) = T i for all source nodes u and δ(v) = T j for all target nodes v, E ij and U ij are denoted as the set of observed target links and its universal set, respectively. Denoting by U ij \E ij as the set of non-existent links, the aim of link prediction is to find a score function f(x, y) = s that assigns a similarity score s to a non-existing link (x, y).
Data. As mentioned before, DBLP network data has been employed 28 in our experiments which is available to download in ref. 31. This data set originally contains 2,092,356 papers with 8,024,869 citations, and 1,712,433 authors with 4,258,615 collaborations 31 . In Fig. 12, some detailed information of the network has been depicted for 15 years of data, from 2000 to 2014. As an example, we have selected four samples from this network as S3, S5, S7, and S9 in which there are 3, 5, 7, and 9 publications per author, from 2010 to 2014, respectively. The more the number of papers per each author, the better link reliability in the network. Similarly, the lesser the number of papers per each author may cause to some noisy information of some observed parts of the network.
In order to evaluate the prediction accuracy of link predictor, E ij is divided into a training set, E ij T , and a test set, E ij P such that . In each network sample, E ij T and E ij P include the observed links from 2010 to 2012 and from 2013 to 2014, respectively. In this regard, three years of data has been used for training phase and the last two years has been selected for the test. Table 7 shows some statistics of four network samples. Evaluation Metrics. To measure the prediction accuracy, two evaluation metrics for link prediction, AUC and Precision, are commonly used. The AUC (area under the receiver operating characteristic curve) is found by the probability that a randomly chosen missing link (i.e. positive pairs in test set) is given a higher score than a randomly chosen non-existent link 32 . However, this measure has been found as a deceptive evaluation measure 32,33 , epecially in imbalanced data problems. The major drawback of AUC evaluation is that it is required to define a negative set, which is composed of all the missing (non-observed) links in the network except for the removed links for the test. While it is not known in advance which of the missing links are truly negative, undetected, or will appear in future. Therefore, the negative set can not be properly determined in the link prediction task. Ignoring the above fact causes that the AUC evaluation gets biased towards a negative set in unbalanced datasets. Also, such evaluation gives more importance to the methods that overfit the network structure rather than offer a more general prediction ability. In this regard, the prediction performance of all methods in both experiments have been measured under the Precision which is a measure of correctness achieved in positive prediction.
To obtain Precision, a similarity score is calculated for each node pair. After sorting the scores, if there are L r links belonging to the test set among top-L candidate links, then Precision is obtained as ref. 32: The higher the Precision rate, the more the possibility in retrieving top-ranked latent links.
Benchmarks. For performance comparisons, nine popular link predictors have been selected from three classes of methods. PathSim 16 , HeteSim 14 , and RandomWalk 13 are among meta-path based methods, SimRank 15 is basically a homogeneous link predictor, and CAR, CJC, CAA, CRA, and CPA are five LCP-based link predictors 8 which are reviewed in the following.
PathSim. Given a symmetric meta-path , PathSim gives a similarity score to a node pair (x, y) with the same type as:   Table 7. Some statistics of the selected network samples.