Chained graphs and some applications

This paper introduces the notions of chained and semi-chained graphs. The chain of a graph, when it exists, refines the notion of bipartivity and conveys important structural information. The notion of a center vertex $v_c$ is also introduced. It is a vertex whose sum of $p$th powers of distances to all other vertices in the graph is minimal, where the distance between a pair of vertices $\{v_c,v\}$ is measured by the minimal number of edges that have to be traversed to go from $v_c$ to $v$. This concept extends the definition of closeness centrality. Applications in which the center node is important include information transmission and city planning. Algorithms for the identification of approximate central nodes are provided and computed examples are presented.

Mathematical properties and applications of bipartite graphs in the areas of algebra, combinatorics, chemistry, communication networks, and computer science are provided by Asratian et al. (1998).
The notion of multipartite graphs is required in the definition of chained graphs introduced in this paper. The chained structure characterizes multipartite networks such that edges can occur only between nodes belonging to "subsequent" partite sets $V_i$ and $V_{i+1}$, $i = 1, 2, \ldots, m-1$, and vice versa, as illustrated in Fig. 1. The definition of chained graphs can be relaxed to allow connections between nodes belonging to the same node subset, as will be explained below.
The above concepts will be described in detail in Sect. 2, where it will be shown that bipartite graphs are ℓ-chained for some ℓ ≥ 2 . This shows that the chained structure is a refinement of bipartivity, since it reveals additional structure of a bipartite graph. The chains provide insight into how the vertices are connected; this structure is not uncovered by bipartivity only.
We also will use chained graphs to identify "central nodes". These are nodes determined by their location in the chain structure, incorporating a different idea of centrality than other centrality measures, such as the degree or the subgraph centrality. A nice introduction to the latter measure is provided by Estrada and Higham (2010); see also Borgatti (2005), Estrada (2011a), and Estrada and Knight (2015) for an overview of other important quantities that describe global properties of a given graph, such as the importance of a particular node within the network, or the ease of traveling from one node to another. With the aim of determining a new centrality measure, called "position centrality", we will first examine the spanning trees associated with a given underlying graph (Bapat 2014;Bondy and Murty 1976). The position centrality of a node will be defined by taking into account the lengths of the paths from it to all the other vertices and it can be computed by using the chained structure determined by the tree rooted at the node. By using this measure, which depends upon a parameter p, one may identify a most "centrally located" node, referred to as a "center vertex", as a vertex with the smallest position centrality. There may be more than one center vertex. For p = 1 , a center vertex coincides with a vertex with the largest closeness centrality (Newman 2010).
Another application of interest to us is the detection of anti-communities, i.e., subsets $\{S_i\}_{i=1}^{p}$ of vertices of a graph G with no or few edges between vertices in each set $S_i$, but many connections between the node sets $S_i$ and $V \setminus S_i$, $i = 1, 2, \ldots, p$. Once a semi-chained structure has been identified in a graph, the presence of anti-communities can be determined by counting the edges among nodes belonging to the same set; see, e.g., the autobahn data set and Fig. 16 in Sect. 6. Community and anti-community detection in networks is an important problem with applications in various fields, including physics, computer science, and the natural and social sciences. Several methods have been developed to identify structures of this kind in networks; see, e.g., Chen et al. (2014) and Raghavan et al. (2007). In Fasino and Tudisco (2017) a spectral method was used to simultaneously detect communities and anti-communities, while in Concas et al. (2020) another approach to identifying anti-communities has been described. We will illustrate the benefit of using the chained structure for this purpose in Sect. 6.
This paper is organized as follows. Section 2 introduces notation that will be used in the remainder of the paper and discusses ℓ-chained bipartite graphs. Section 3 describes the structure of the adjacency matrices that are associated with ℓ-chained graphs. The relation between the chain structure and spanning trees is investigated in Sect. 4. Section 5 introduces the notion of position centrality and discusses some applications. Numerical illustrations of ℓ-chained graphs and the identification of approximations of central nodes are described in Sect. 6, and Sect. 7 contains concluding remarks.

Some definitions
This section introduces notation and definitions to be used in the sequel. Most of our definitions and terminology follow those in Estrada (2011a) and Newman (2010). The adjacency matrix $M = [m_{ij}]_{i,j=1}^{n} \in \mathbb{R}^{n \times n}$ associated with an unweighted undirected simple graph G with n vertices is symmetric and has the entry $m_{ij} = 1$ if there is an edge between the vertices $v_i$ and $v_j$; otherwise $m_{ij} = 0$. Bipartivity, and more generally multipartivity, are interesting structural properties of a graph that provide important information about the network being modeled. There are various characterizations of multipartite graphs (Estrada and Gómez-Gardeñes 2016; König 1916). They can be defined as follows.
Definition 1 A graph G is said to be ℓ-partite if the set of vertices V that make up the graph can be partitioned into ℓ disjoint non-empty subsets $V = V_1 \cup V_2 \cup \cdots \cup V_\ell$ such that every vertex in $V_i$, for any $1 \le i \le \ell$, is adjacent only to vertices in $V_j$ for some $j \neq i$, and the number of subsets, ℓ, is as small as possible. A graph is said to be bipartite when ℓ = 2, and multipartite when ℓ ≥ 3.
Equivalently, the vertices of an ℓ-partite graph can be colored with ℓ colors, so that the vertices at the endpoints of every edge have different colors, and ℓ is the minimal number of colors required (Jensen and Toft 1995).
The graph on the right-hand side of Fig. 2 is bipartite, and the graph in Fig. 3 is tripartite.
Usually, vertices in distinct subsets $V_i$ of an ℓ-partite graph model different entities. For instance, users of social bookmarking services, such as Delicious (http://www.delicious.com), put tags on web pages. Users, tags, and web pages can be represented by a tripartite network $V = V_1 \cup V_2 \cup V_3$, in which users define the vertex subset $V_1$, tags define the subset $V_2$, and web pages define the subset $V_3$. This example of tripartite graphs is discussed by Ikematsu et al. (2013).
There are various methods for partitioning the vertex set V of a bipartite graph G into unique disjoint non-empty subsets $V_1$ and $V_2$, such that every vertex in $V_1$ is adjacent to a vertex in $V_2$; see Bondy and Murty (1976) and Concas et al. (2020) for discussions of methods and further references. Assume for the moment that the n vertices in the set V are enumerated so that the first $n_1$ of them make up the vertex set $V_1$ and the remaining $n_2 = n - n_1$ vertices make up the vertex set $V_2$. Then the adjacency matrix for G is of the form
$$M = \begin{bmatrix} O & B \\ B^T & O \end{bmatrix}, \tag{2.1}$$
where $B \in \mathbb{R}^{n_1 \times n_2}$, O denotes a zero-matrix of suitable order, and the superscript T denotes transposition. A bipartite graph with partition sets $V_1$ and $V_2$ is said to be complete if every vertex of $V_1$ is adjacent to all vertices of $V_2$. For complete bipartite graphs, every entry of the submatrix B of the adjacency matrix (2.1) is one. The notion of a complete bipartite graph can be extended to multipartite graphs.
Definition 2 An ℓ-partite graph G = {V, E} with the vertex set V = V 1 ∪ V 2 ∪ · · · ∪ V ℓ partitioned into non-empty disjoint subsets V i is said to be complete if, for each 1 ≤ i ≤ ℓ , every vertex in the vertex subset V i is adjacent to every vertex in the set V\V i .
Complete ℓ-partite graphs are commonly denoted by $K_{n_1,n_2,\ldots,n_\ell}$, where $n_i$ is the cardinality of the node subset $V_i$. The adjacency matrix M for $K_{n_1,n_2,\ldots,n_\ell}$ is of order $n = \sum_{j=1}^{\ell} n_j$ with all entries $m_{ij}$ equal to one, except for the entries of ℓ disjoint diagonal blocks of zeros of orders $n_1, n_2, \ldots, n_\ell$. The following definitions introduce the notions of particular multipartite structures, which will be used in the remainder of the paper.
Definition 3 An undirected graph G = {V, E} is said to be $\ell_i$-chained with initial vertex $v_i$ if the set of vertices can be subdivided into $\ell_i$ disjoint non-empty subsets
$$V = V_1 \cup V_2 \cup \cdots \cup V_{\ell_i} \tag{2.2}$$
such that $v_i \in V_1$, and all vertices in the set $V_j$ are adjacent only to vertices in the sets $V_{j-1}$ or $V_{j+1}$ for $j = 2, 3, \ldots, \ell_i - 1$, where the chain length $\ell_i$ is the largest number of vertex subsets $V_j$ with this property. Moreover, the vertices in $V_1$ and $V_{\ell_i}$ are adjacent only to vertices in $V_2$ and $V_{\ell_i - 1}$, respectively. Vertex sets $V_j$ with consecutive indices are said to be adjacent.
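The block structure of the adjacency matrix of a complete ℓ-partite graph can be illustrated with a short Python sketch. The function name `complete_multipartite_adjacency` and the convention that vertices are numbered set by set are assumptions made for this illustration only.

```python
def complete_multipartite_adjacency(sizes):
    """Adjacency matrix (list of lists) of K_{n_1,...,n_l}.

    `sizes` holds the cardinalities n_1, ..., n_l; vertices are assumed
    to be enumerated set by set (an illustrative convention).
    """
    # part[v] is the index of the vertex subset that contains vertex v
    part = [k for k, size in enumerate(sizes) for _ in range(size)]
    n = len(part)
    # entry is 1 exactly when the two vertices lie in different subsets
    return [[1 if part[i] != part[j] else 0 for j in range(n)]
            for i in range(n)]


M = complete_multipartite_adjacency([1, 2, 2])
for row in M:
    print(row)
```

The printed matrix has order 1 + 2 + 2 = 5 and shows three zero diagonal blocks of orders 1, 2, and 2; all other entries equal one.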
In the Delicious bookmarking service application mentioned above, vertices in V 1 and V 3 are adjacent only to vertices in V 2 . Thus, this vertex partitioning shows that the graph is 3-chained.
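Whether a given ordered vertex partition is compatible with a chained structure can be checked edge by edge. The following Python sketch does this for a hypothetical toy user-tag-page network; the function name `classify_partition` and the data are illustrative assumptions. The check only verifies which subsets are joined by edges; it does not verify that the number of subsets is maximal, as Definition 3 requires.

```python
def classify_partition(edges, parts):
    """Classify an ordered list of vertex subsets V_1, ..., V_l.

    Returns 'chained' if every edge joins consecutive subsets,
    'semi-chained' if, in addition, edges inside a subset occur,
    and 'neither' if some edge skips over a subset.
    """
    index = {v: k for k, subset in enumerate(parts) for v in subset}
    gaps = {abs(index[u] - index[v]) for u, v in edges}
    if any(g > 1 for g in gaps):
        return "neither"
    return "semi-chained" if 0 in gaps else "chained"


# hypothetical toy network: users, tags, web pages
parts = [{"u1", "u2"}, {"t1"}, {"p1", "p2"}]
edges = [("u1", "t1"), ("u2", "t1"), ("t1", "p1"), ("t1", "p2")]

print(classify_partition(edges, parts))                   # chained
print(classify_partition(edges + [("u1", "u2")], parts))  # semi-chained
print(classify_partition(edges + [("u1", "p1")], parts))  # neither
```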

Definition 4
The graph G = {V, E} is said to be $\ell_i$-semi-chained with initial vertex $v_i$ if the set of vertices can be subdivided into $\ell_i$ disjoint non-empty subsets (2.2) such that $v_i \in V_1$, and all vertices in the set $V_j$ are adjacent only to vertices in the sets $V_{j-1}$, $V_j$, or $V_{j+1}$ for $j = 2, 3, \ldots, \ell_i - 1$, where the chain length $\ell_i$ is the largest number of vertex subsets $V_j$ with this property. Moreover, the vertices in $V_1$ and $V_{\ell_i}$ are adjacent only to vertices in $V_1 \cup V_2$ and $V_{\ell_i - 1} \cup V_{\ell_i}$, respectively.
Figure 2 displays two chained graphs with three vertices and different initial vertices. In the chained graph displayed in subfigure (a), each vertex set $V_i$, i = 1, 2, 3, contains one node, and the initial vertex is $v_1$. This gives the chain length $\ell_1 = 3$. The same chain length can be obtained if the initial node is chosen to be $v_3$. The chained graph in (b) has initial vertex $v_2$, with $V_1 = \{v_2\}$ and $V_2 = \{v_1, v_3\}$, which gives the chain length $\ell_2 = 2$. This example illustrates that the chain length depends on the initial vertex chosen.
Figure 3 displays a 1-semi-chained graph that is not chained in the sense of Definition 3. The semi-chained structure in this example is independent of the choice of the initial vertex.

Example 3
While chained structure is not so common for graphs, every non-trivial graph is semi-chained. Nevertheless, representing a graph in (semi-)chained form is useful, because this structure is closely linked to anti-communities, which are subsets of vertices such that there are only few edges between vertices in the same subset, but many edges between vertices in different subsets. Recent discussions on anti-communities and their detection can be found in Concas et al. (2020), Estrada and Gómez-Gardeñes (2016), Estrada and Knight (2015), and Fasino and Tudisco (2017). We will introduce a density measure for anti-communities, which is similar to the intra-cluster density that allows one to identify clusters or communities; see Estrada (2011b), Estrada and Knight (2015), and Fortunato (2010).

Definition 5
The anti-community score 0 ≤ ρ ≤ 1 of a vertex subset is the ratio between the number of edges connecting the nodes in the subset and the maximum admissible number of edges between them.
To highlight the role of the anti-community score, we will in the following consider ρ-anti-communities. The sets V i , i = 1, . . . , ℓ , in an ℓ-chained graph are 0-anti-communities, as they have no internal edges, while each set V i in a semi-chained graph is a ρ i -anti-community. When ρ i is small, V i may be considered an anti-community.
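The anti-community score can be computed directly from an edge list. The Python sketch below assumes a simple undirected graph whose edges are listed once, and it takes the maximum admissible number of edges among k vertices to be k(k − 1)/2; both are assumptions of this illustration, as is the convention that a single vertex has score 0. The edge list is hypothetical.

```python
def anti_community_score(edges, subset):
    """Ratio of internal edges to the maximal number k(k-1)/2 of edges
    among the k vertices of `subset` (assumed simple undirected graph)."""
    subset = set(subset)
    k = len(subset)
    if k < 2:
        return 0.0  # convention for this sketch: no internal edge possible
    internal = sum(1 for u, v in edges if u in subset and v in subset)
    return internal / (k * (k - 1) / 2)


# hypothetical graph with six vertices
edges = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 5), (4, 5), (5, 6)]
for subset in ({1}, {2, 3}, {4, 5, 6}):
    print(sorted(subset), anti_community_score(edges, subset))
```

A score close to 0 indicates that the subset behaves like an anti-community, while a score of 1 means that the subset induces a complete subgraph.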

Definition 6
The maximal chain length, ℓ, of a graph is defined as
$$\ell = \max_{v_i \in V} \ell_i,$$
where the maximum is over all the initial nodes $v_i$ in the vertex set V. When the maximal chain length is considered, the graph is said to be ℓ-chained.
Example 4 The graph G of Example 2 has maximal chain length ℓ = 3.
The following notion will be useful in the sequel. It is stronger than (standard) multipartivity, but weaker than complete multipartivity.
Definition 7 An ℓ-partite graph G = {V, E} with the vertex set partitioning $V = V_1 \cup V_2 \cup \cdots \cup V_\ell$ into non-empty disjoint subsets $V_i$ is said to be strongly ℓ-partite if, for every i, every vertex in the subset $V_i$ is adjacent to at least one vertex in every subset $V_j$, $j \neq i$.
The special case of strongly tripartite graphs is applied to community detection by Ikematsu et al. (2013), who refer to these graphs as 3-partite 3-uniform hypernetworks. We also define the notion of strongly ℓ-chained graphs.
Definition 8 An ℓ-chained graph G = {V, E} with the vertex set partitioning V = V 1 ∪ V 2 ∪ · · · ∪ V ℓ into non-empty disjoint subsets V i is said to be strongly ℓ-chained if, for every i, every vertex in the subset V i is adjacent to at least one vertex in the subsets V i−1 (for 1 < i ≤ ℓ ) and V i+1 (for 1 ≤ i < ℓ).
We are interested in strongly chained graphs, because their structure can be identified from the knowledge of the vertex and edge sets of a graph. We note that "standard" chained graphs G = {V, E} cannot be uniquely identified from the knowledge of V and E. Indeed, let the vertex v of a chained graph be connected only to vertices in the vertex set $V_i$ for some $1 < i < \ell$. Then v may belong to either of the vertex sets $V_{i-1}$ or $V_{i+1}$.
It is remarkable that an ℓ-chained graph is always bipartite, and vice versa. This property will help us study anti-communities.
Theorem 1 Let G be a bipartite graph. Then the graph is ℓ-chained for some ℓ ≥ 2 . The partitioning of the node set V into chained sets is not unique, but the maximal number of chained sets, ℓ max , is uniquely determined. Conversely, if a graph is ℓ-chained, then it is bipartite.
Proof Let the graph G = {V, E} be bipartite and let $V = V_1 \cup V_2$ be the associated partitioning. It follows that the graph is at least 2-chained. Conversely, let the graph G = {V, E} be ℓ-chained, i.e., there is a partitioning of the vertex set $V = V_1 \cup V_2 \cup \cdots \cup V_\ell$ that satisfies the properties of Definition 3. Then letting
$$\widetilde{V}_1 = \bigcup_{j\ \mathrm{odd}} V_j, \qquad \widetilde{V}_2 = \bigcup_{j\ \mathrm{even}} V_j$$
shows that the graph G is bipartite with associated vertex set partitioning $V = \widetilde{V}_1 \cup \widetilde{V}_2$. The uniqueness of $\ell_{\max}$ follows by recursive subdivision of the sets $V_1$ and $V_2$, and by a suitable choice of the initial set $V_1$ in (2.2).
The property of bipartite graphs shown by Theorem 1 will be further discussed in Sect. 3, where we consider the structure of adjacency matrices for ℓ-chained graphs for ℓ ≥ 3.
We remark that the ℓ-chained structure with ℓ > 2 gives a finer representation of a bipartite graph, as it provides information on hierarchical connections between nodes that is not contained in the basic notion of bipartivity.
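The converse direction of Theorem 1 amounts to merging the chain sets with odd index and those with even index. A minimal Python sketch of this step follows; the function name `bipartition_from_chain` and the path graph used as input are illustrative assumptions.

```python
def bipartition_from_chain(parts):
    """Merge the chain sets V_1, V_3, ... and V_2, V_4, ... into the two
    vertex sets of a bipartition."""
    odd = set().union(*parts[0::2])    # V_1, V_3, V_5, ...
    even = set().union(*parts[1::2])   # V_2, V_4, ...
    return odd, even


# hypothetical 5-chained graph: the path v1 - v2 - v3 - v4 - v5
parts = [{1}, {2}, {3}, {4}, {5}]
edges = [(1, 2), (2, 3), (3, 4), (4, 5)]

odd, even = bipartition_from_chain(parts)
print(sorted(odd), sorted(even))
# every edge joins the two merged sets, so the graph is bipartite
print(all((u in odd) != (v in odd) for u, v in edges))
```

Because the edges of a chained graph only join sets with consecutive indices, every edge has one endpoint in each merged set.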

Closed chained graphs
This subsection considers chained graphs that may be cyclic. Graphs of this kind are important, e.g., for their connection to n-cubes.
Definition 9 A graph G = {V, E} is said to be closed $\ell_i$-chained with initial vertex $v_i$ if the set of vertices can be subdivided into $\ell_i$ disjoint non-empty subsets $V_1, V_2, \ldots, V_{\ell_i}$ such that $v_i \in V_1$ and all vertices in the set $V_j$ are adjacent only to vertices in the sets $V_{j-1}$ and $V_{j+1}$, with the indices understood cyclically, i.e., $V_0 = V_{\ell_i}$ and $V_{\ell_i + 1} = V_1$, where the chain length, $\ell_i$, is the largest number of vertex subsets $V_j$ with this property. Closed $\ell_i$-semi-chained graphs can be defined analogously.
We remark that a closed ℓ-chained graph G = {V, E} is not ℓ-chained, but may be k-chained for some k < ℓ . The following example illustrates this.
Example 5 Consider the graph G = {V, E} in Fig. 4a and define the vertex subsets V i = {v i } for i = 1, 2, . . . , 6 . This graph is closed 6-chained with initial vertex v 1 .
The chain of vertex sets $\{v_1\}$, $\{v_2, v_6\}$, $\{v_3, v_5\}$, $\{v_4\}$ shows that G is a 4-chained graph with initial vertex $v_1$. The graph also is bipartite. The latter property is illustrated by Fig. 4b.
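Proof Let ℓ be even. Then we may partition the vertex set of the closed ℓ-chained graph G with vertices $v_1, v_2, \ldots, v_\ell$ as
$$V_1 = \{v_1\}, \qquad V_j = \{v_j, v_{\ell-j+2}\}, \quad j = 2, 3, \ldots, \ell/2, \qquad V_{\ell/2+1} = \{v_{\ell/2+1}\}.$$
This shows that G is (ℓ/2 + 1)-chained with initial vertex $v_1$.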
If, instead, ℓ is odd, then we define the vertex sets
$$V_1 = \{v_1\}, \qquad V_j = \{v_j, v_{\ell-j+2}\}, \quad j = 2, 3, \ldots, (\ell-1)/2.$$
The remaining vertices, $v_{(\ell+1)/2}$ and $v_{(\ell+3)/2}$, are adjacent and make up the vertex set $V_{(\ell+1)/2}$. This makes the graph G $((\ell+1)/2)$-semi-chained with initial vertex $v_1$.

Fig. 5 A closed 7-chained graph is tripartite.
This partitioning shows that the graph is closed 7-chained with initial vertex $v_1$. The graph also is 4-semi-chained. Moreover, the graph is tripartite.
The following result shows that it is no coincidence that the graphs in Figs. 4 and 5 are bipartite and tripartite, respectively.

Theorem 3 Consider a closed
Proof If ℓ is even, then the partition (2.3) produces a bipartite graph; see an example with ℓ = 6 in Fig. 4. If ℓ is odd, then the partitioning shows that the graph is tripartite. An example with ℓ = 7 is illustrated in Fig. 5.
Example 7 Consider the graph G = {V, E} displayed in Fig. 6a. The vertices are enumerated so that odd vertices are adjacent to even vertices, and vice versa. The graph G is bipartite with the partitioning $V = V_1 \cup V_2$, where the set $V_1$ contains all vertices with odd index, and $V_2$ contains all vertices with even index. Moreover, the graph is closed 4-chained with initial vertex $v_1$, as well as 3-chained with initial vertex $v_1$. The latter is seen from the chain structure $\{v_1\}$, $\{v_2, v_4\}$, $\{v_3\}$.
Example 8 Consider the 3-cube graph G = {V, E} displayed in Fig. 6b. The vertices $v_1, v_2, \ldots, v_8$ are enumerated so that odd vertices are adjacent to even vertices, and vice versa. Hence, the graph G is bipartite with $V = V_1 \cup V_2$, where the set $V_1$ contains all vertices with odd index, and the set $V_2$ contains all vertices with even index. The bipartite structure is illustrated in Fig. 7b.
To determine the closed chain structure with initial vertex $v_1$, we regard a partitioning of the vertex set into four subsets $V_1, V_2, V_3, V_4$ with $v_1 \in V_1$. This partitioning shows that G has a closed 4-chained structure with initial vertex $v_1$. We note that the graph is not strongly chained, as $v_8$ is not connected to $v_1$.
The chain structure of the 3-cube gives rise to a different partitioning of the node set V. Define the node subsets $V_1, V_2, V_3$, and $V_4$ as illustrated in Fig. 7a. The vertices in $V_{i+1}$ are adjacent to the vertices in $V_i$ for i = 1, 2, 3. Thus, the graph G is strongly 4-chained with initial vertex $v_1$.
The above observations can be extended to n-cubes.
Definition 10 A 0-cube is made of just one vertex. An n-cube is composed of $2^n$ vertices. It is obtained recursively by taking two (n − 1)-cubes, the first one with vertices $v_i$, $i = 1, 2, \ldots, 2^{n-1}$, and the second one with vertices $v_i$, $i = 2^{n-1} + 1, 2^{n-1} + 2, \ldots, 2^n$, and connecting the vertex $v_i$ in the first cube to the vertex with index $(i \bmod 2^{n-1}) + 2^{n-1} + 1$ in the second cube.
The graph of an n-cube with n > 3 is bipartite with $V = V_1 \cup V_2$, where the set $V_1$ contains all vertices with odd index, and the set $V_2$ contains all vertices with even index. To determine the chain structure with starting vertex $v_1$, we regard the vertex partitioning $V_1 = \{v_1\}$, $V_2 = \{$all vertices adjacent to $v_1\}$, $V_3 = \{$all vertices with odd index except $v_1\}$, and $V_4 = \{$all vertices not adjacent to $v_1$ with even index$\}$. Thus, the graph is strongly 4-chained with initial vertex $v_1$. Given the symmetry of an n-cube with respect to its nodes, changing the starting node does not modify the number and cardinality of the node sets. An n-cube with n > 3 is also closed 4-chained. This structure is not strongly chained. Moreover, it is not unique. It can be determined by considering the above chained node sets and moving nodes from $V_2$ to $V_4$, without making the set $V_2$ empty. This discussion leads to the following result.
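The recursive construction in Definition 10 can be written down in a few lines of Python. The sketch below assumes that the second (n − 1)-cube is a copy of the first one with all vertex indices shifted by 2^(n−1); this reading of the definition, as well as the function name `n_cube_edges`, is an assumption of the illustration.

```python
def n_cube_edges(n):
    """Edge set of the graph built recursively as in Definition 10,
    with vertices labelled 1, ..., 2**n."""
    if n == 0:
        return set()  # a single vertex and no edges
    half = 2 ** (n - 1)
    smaller = n_cube_edges(n - 1)
    edges = set(smaller)                                   # first copy
    edges |= {(u + half, v + half) for u, v in smaller}    # shifted copy
    # connect v_i to the vertex with index (i mod 2^(n-1)) + 2^(n-1) + 1
    edges |= {(i, (i % half) + half + 1) for i in range(1, half + 1)}
    return edges


edges = n_cube_edges(3)
print(len(edges))                               # number of edges
print(all((u + v) % 2 == 1 for u, v in edges))  # odd-even adjacency
print(sorted(v for u, v in edges if u == 1))    # neighbours of vertex 1
```

For n = 3 the construction yields 12 edges, every edge joins a vertex with odd index to a vertex with even index, and vertex 8 is not a neighbour of vertex 1, in agreement with the remark in Example 8.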

Adjacency matrices for chained graphs
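Consider an ℓ-chained graph G = {V, E} with initial vertex $v_1$ and vertex set partitioning $V = V_1 \cup V_2 \cup \cdots \cup V_\ell$. Let $n_i$ be the cardinality of the vertex set $V_i$ for $i = 1, 2, \ldots, \ell$. Define the matrix $A_i \in \mathbb{R}^{n_i \times n_{i+1}}$ that describes the connections between the vertices in the set $V_i$ and the vertices in the set $V_{i+1}$ for $i = 1, 2, \ldots, \ell - 1$. Hence, the entries of $A_i$ satisfy $[A_i]_{jk} = 1$ if there is an edge between vertex $v_j$ in $V_i$ and vertex $v_k$ in $V_{i+1}$, and $[A_i]_{jk} = 0$ otherwise. The adjacency matrix M associated with G is symmetric block tridiagonal with off-diagonal blocks $A_i$ and $A_i^T$, and has vanishing diagonal blocks,
$$M = \begin{bmatrix} O & A_1 & & & \\ A_1^T & O & A_2 & & \\ & A_2^T & O & \ddots & \\ & & \ddots & \ddots & A_{\ell-1} \\ & & & A_{\ell-1}^T & O \end{bmatrix}. \tag{3.1}$$
It is known from Theorem 1 that every ℓ-chained graph is bipartite. This also can be seen by applying a suitable permutation matrix P and its transpose to the adjacency matrix M from the left and right, respectively, to obtain
$$P M P^T = \begin{bmatrix} O & B \\ B^T & O \end{bmatrix}, \tag{3.2}$$
where the rows of B correspond to the vertices in the sets $V_1, V_3, \ldots, V_{2\lfloor (\ell+1)/2 \rfloor - 1}$ and the columns of B correspond to the vertices in the sets $V_2, V_4, \ldots, V_{2\lfloor \ell/2 \rfloor}$. Here ⌊α⌋ denotes the integer part of α ≥ 0.
Example 9 We illustrate the permutation (3.2) for ℓ = 5. In this case
$$P = \begin{bmatrix} I_{n_1} & & & & \\ & & I_{n_3} & & \\ & & & & I_{n_5} \\ & I_{n_2} & & & \\ & & & I_{n_4} & \end{bmatrix},$$
where $I_k$ is an identity matrix of order k, and
$$P M P^T = \begin{bmatrix} O & B \\ B^T & O \end{bmatrix}, \qquad B = \begin{bmatrix} A_1 & O \\ A_2^T & A_3 \\ O & A_4^T \end{bmatrix}. \tag{3.3}$$
This shows that the graph G associated with the adjacency matrix M is bipartite.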
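The effect of the permutation can be reproduced numerically. The Python sketch below reorders the vertices of a hypothetical 5-chained graph so that the sets with odd index come first and then extracts the off-diagonal block B; the function name `permuted_adjacency` and the example graph are assumptions made for this illustration.

```python
def permuted_adjacency(parts, edges):
    """Adjacency matrix with the vertices of V_1, V_3, ... listed first,
    followed by the vertices of V_2, V_4, ...; also returns the number of
    vertices in the odd-indexed sets."""
    order = [v for subset in parts[0::2] for v in sorted(subset)]
    n_odd = len(order)
    order += [v for subset in parts[1::2] for v in sorted(subset)]
    pos = {v: k for k, v in enumerate(order)}
    n = len(order)
    M = [[0] * n for _ in range(n)]
    for u, v in edges:
        M[pos[u]][pos[v]] = M[pos[v]][pos[u]] = 1
    return M, n_odd


# hypothetical 5-chained graph
parts = [{1}, {2, 3}, {4}, {5, 6}, {7}]
edges = [(1, 2), (1, 3), (2, 4), (3, 4), (4, 5), (4, 6), (5, 7), (6, 7)]

M, n_odd = permuted_adjacency(parts, edges)
B = [row[n_odd:] for row in M[:n_odd]]
for row in B:
    print(row)
```

The rows of B correspond to the vertices 1, 4, 7 and the columns to the vertices 2, 3, 5, 6. The zero entries in the first and last rows reflect the block bidiagonal pattern that a chained structure with ℓ = 5 induces in B.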
The submatrix B in (3.3) exhibits a particular pattern of zero entries. This suggests the possibility of identifying the strongly chained structure of a graph, whose vertices are in a random order, by first identifying its bipartite structure, e.g., by methods described in Concas et al. (2020) and Gleich (MatlabBGL: A Matlab Graph Library, https://www.cs.purdue.edu/homes/dgleich/packages/matlab_bgl/), and then reordering the vertices to obtain a suitable zero pattern in the submatrix B.
The considered permutation also illustrates that it is not possible to identify a 3-chained graph. Indeed, considering the adjacency and permutation matrices
$$M = \begin{bmatrix} O & A_1 & O \\ A_1^T & O & A_2 \\ O & A_2^T & O \end{bmatrix}, \qquad P = \begin{bmatrix} I_{n_1} & O & O \\ O & O & I_{n_3} \\ O & I_{n_2} & O \end{bmatrix},$$
one obtains
$$P M P^T = \begin{bmatrix} O & O & A_1 \\ O & O & A_2^T \\ A_1^T & A_2 & O \end{bmatrix}, \qquad B = \begin{bmatrix} A_1 \\ A_2^T \end{bmatrix}.$$
This shows that the matrix B does not have a zero pattern that would allow one to identify the node partitioning of a 3-chained graph.
If G is an ℓ-semi-chained graph, then the diagonal blocks of the matrices (3.1) and (3.2) may have some nonzero entries. If there are fewer nonvanishing entries in the diagonal blocks of the matrix (3.2) than in the off-diagonal blocks, then this indicates the existence of an anti-community.

Chained graphs and spanning trees
Many graphs G are not chained, but their spanning trees are. This section explores the possibility of using the chained structure of a spanning tree to gain insight into properties of the underlying graph. A spanning tree for G is a subgraph $T = \{V, E'\}$ that is a tree and contains all the vertices of G; see, e.g., Estrada (2011a) and Newman (2010) for further details. In general, $E' \subset E$; if $E' = E$, then G is a tree itself. A spanning tree T for G is not uniquely determined by G. In particular, T depends on the initial vertex, the so-called root, of the tree. A spanning tree for a graph G with n vertices can be computed in time proportional to n.
Each spanning tree has an ℓ-chained structure: let V 1 contain the root, v 1 , of the tree, V 2 the children of the root, and, in general, V i+1 the children of the vertices in V i for i = 1, 2, . . . , ℓ − 1 . The set V ℓ contains the leaves of the tree at the lowest level. This shows, in particular, that spanning trees are bipartite; cf. Theorem 1. We will use the chained structure of a spanning tree T for G to determine an approximated chained structure for G , also in situations when G is not chained.
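The chained structure of a rooted tree is obtained by collecting the vertices level by level. The Python sketch below uses a breadth-first traversal for this purpose; the function name `chain_from_root` and the small tree are hypothetical and serve only as an illustration.

```python
from collections import deque


def chain_from_root(adj, root):
    """Vertex sets V_1, V_2, ..., V_l obtained by grouping the vertices
    according to their level in a breadth-first traversal from `root`."""
    level = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in level:
                level[w] = level[u] + 1
                queue.append(w)
    parts = [set() for _ in range(max(level.values()) + 1)]
    for v, k in level.items():
        parts[k].add(v)
    return parts


# hypothetical tree with edges 1-2, 2-3, 2-5, 3-4
tree = {1: [2], 2: [1, 3, 5], 3: [2, 4], 4: [3], 5: [2]}
print(chain_from_root(tree, 1))  # four vertex sets
print(chain_from_root(tree, 2))  # three vertex sets
```

Changing the root changes both the vertex sets and their number, which is the dependence on the initial vertex discussed in the text.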

Definition 11
Let T be a spanning tree for the graph G . An ℓ-chained vertex set decomposition for T is said to be an ℓ-chained vertex set decomposition for G . We will refer to leaves of T as leaves of G.
Let D = E \ E ′ be the set of the edges in G that are not in T , and let C(T ) denote the graph obtained by adding the edges in D to the spanning tree T . The graph C(T ) coincides with G and inherits the chain structure of T .
If all the edges in D are compatible with the chain structure of the spanning tree T , that is, if for each edge e k ∈ D , there is an index 2 ≤ i ≤ ℓ − 1 such that e k connects a vertex in V i to a vertex in V i−1 or V i+1 , then the graph G = C(T ) is chained. If an edge in D connects two vertices that belong to the same node set V i , the graph is semichained. Finally, if an edge in D connects a vertex in V i to a vertex in V i+j , |j| ≥ 2 , then the graph C(T ) is not chained. This observation leads to the following result.
Theorem 5 A graph is ℓ-chained (semi-chained) if at least one of its spanning trees T generates a graph C(T ) whose edges are compatible with the (semi-)chain structure of T .
Let the vertex set decomposition $V = V_1 \cup V_2 \cup \cdots \cup V_\ell$ be determined by the chain structure of G = C(T). Recall that a graph is strongly chained if every vertex in $V_i$ is connected to at least one vertex in $V_{i+1}$ and to one vertex in $V_{i-1}$ for $i = 2, 3, \ldots, \ell - 1$. Moreover, every vertex in $V_1$ (resp. $V_\ell$) is required to be connected to one vertex in $V_2$ (resp. $V_{\ell-1}$). Whether a graph is strongly chained depends on the leaves of the graph.
Theorem 6 Let T be a spanning tree of the graph G = {V, E} , and let T determine the chain structure V = V 1 ∪ V 2 ∪ · · · ∪ V ℓ . If T has leaves connected only to vertices in V ℓ−1 , then G is strongly chained. This also holds if T has leaves connected to vertices in V 1 or in V 2 , but not if there are leaves connected to vertices in both V 1 and V 2 .
Proof The vertices in V ℓ are leaves. If, in addition to the leaves in V ℓ , there are leaves connected to the root, v 1 , but no other leaves, then the graph is strongly chained. This can be seen by moving the leaves connected to v 1 to a new vertex set V 0 that precedes V 1 . This shows that the graph is strongly (ℓ + 1)-chained with chain structure V = V 0 ∪ V 1 ∪ · · · ∪ V ℓ .
We turn to the situation when, in addition to the leaves in V ℓ , there are also leaves connected to the vertices in V 2 . The latter leaves can be moved to V 1 , which shows that the graph is strongly ℓ-chained.
We remark that it is easy to construct examples that illustrate that if T has a leaf connected to a vertex in V i for some 3 ≤ i ≤ ℓ − 2 , then C(T ) is not guaranteed to be strongly chained.
The above discussion leads to Algorithm 1 for determining if a graph with n vertices is (semi-)chained in $O(n^2)$ time steps. We note that the algorithm can easily be parallelized, as each iteration is independent of the others.
The following example illustrates that both the partitioning of the vertex set V of a graph G = {V, E} and the number of partitions, ℓ, depend on the choice of the root of the spanning tree T as well as on the spanning tree itself.
Example 10 The trees (4.1) and (4.2) are spanning trees for G. Regard first the partitioning of the tree (4.1). Starting with vertex $v_1$, we obtain the vertex subsets $V_1 = \{v_1\}$, $V_2 = \{v_2\}$, $V_3 = \{v_3, v_5\}$, and $V_4 = \{v_4\}$. Thus, ℓ = 4. If we instead start with vertex $v_2$, then we get the sets $V_1 = \{v_2\}$, $V_2 = \{v_1, v_3, v_5\}$, and $V_3 = \{v_4\}$, and ℓ = 3.
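The idea of testing every vertex as a root can be sketched in Python as follows. This is not a transcription of Algorithm 1; it is a simplified illustration that assumes a connected graph given as an adjacency dictionary and uses breadth-first levels as the chain sets. The names `bfs_levels` and `chain_report` are hypothetical.

```python
from collections import deque


def bfs_levels(adj, root):
    """Breadth-first level of every vertex reachable from `root`."""
    level = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in level:
                level[w] = level[u] + 1
                queue.append(w)
    return level


def chain_report(adj):
    """For every root, return the chain length and whether the level
    structure is chained or only semi-chained."""
    report = {}
    for root in adj:  # the iterations do not depend on each other
        level = bfs_levels(adj, root)
        if len(level) != len(adj):
            raise ValueError("the sketch assumes a connected graph")
        # an edge inside a level means the structure is only semi-chained
        same_level = any(level[u] == level[w] for u in adj for w in adj[u])
        kind = "semi-chained" if same_level else "chained"
        report[root] = (max(level.values()) + 1, kind)
    return report


path = {1: [2], 2: [1, 3], 3: [2]}
triangle = {1: [2, 3], 2: [1, 3], 3: [1, 2]}
print(chain_report(path))
print(chain_report(triangle))
```

For the path with three vertices, the end vertices give chain length 3 and the middle vertex gives chain length 2. For the triangle, every root gives a semi-chained structure with two sets. Each root is processed independently, so the loop over the roots could be distributed over several workers.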
We now turn to the spanning tree (4.2). Letting $V_1 = \{v_1\}$, we obtain $V_2 = \{v_2\}$, and
In what follows, we will need the notion of tree branches.

Definition 12
A branch for a tree T is a sequence of vertices starting at the tree root and ending at a leaf. The length of a branch is the number of vertices in the branch. A longest branch is a branch with maximal length.
A recursive procedure for determining all the longest branches of a tree is presented by Algorithm 2.
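A recursive computation of the longest branches can be sketched in Python as follows. This is an illustration in the spirit of Definition 12 and not a transcription of Algorithm 2; the representation of the rooted tree by a dictionary `children` and the function name `longest_branches` are assumptions.

```python
def longest_branches(children, root):
    """All longest branches (lists of vertices from the root to a leaf)
    of the rooted tree described by the dictionary `children`."""
    kids = children.get(root, [])
    if not kids:
        return [[root]]  # a leaf is a branch of length one
    best = []
    for child in kids:
        for branch in longest_branches(children, child):
            if not best or len(branch) > len(best[0]):
                best = [branch]          # strictly longer: start afresh
            elif len(branch) == len(best[0]):
                best.append(branch)      # tie: keep both branches
    return [[root] + branch for branch in best]


# hypothetical rooted trees
print(longest_branches({1: [2], 2: [3, 5], 3: [4]}, 1))
print(longest_branches({1: [2, 3], 2: [4], 3: [5]}, 1))
```

The first tree has the single longest branch 1, 2, 3, 4 of length four, counted in vertices as in Definition 12. The second tree has two longest branches of length three.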
It is natural to seek the root of a spanning tree with the deepest chain structure on a long branch of any of the spanning trees of the graph. A heuristic approach for doing this is described by Algorithm 3.

Example 11
We have already seen in Example 9 that the chain structure of an undirected simple unweighted graph G = {V, E} cannot be uniquely determined from the sets E and V . Here we provide another illustration using spanning trees. The pictures in Fig. 8 show the same graph. The graph in Fig. 8b is obtained by determining a spanning tree for the graph in Fig. 8a, starting from vertex v 9 , and then constructing C(T ) by adding the missing arcs. Both graphs are strongly 4-chained.
The adjacency matrices for the graphs of Fig. 8 are permutations of each other. The blocks in matrix (3.1) for the two graphs are and respectively. We notice that the graph G is in fact 5-chained. This can be seen, for example, by constructing the spanning tree starting from vertex v 2 .
Example 12 Consider the graph G = {V, E} displayed in Fig. 9. By removing the edges denoted by dashed lines between the vertices $v_2$ and $v_3$, $v_4$ and $v_5$, as well as between the vertices $v_5$ and $v_6$, we obtain a graph $G'$ whose adjacency matrix is of the form (3.1). It follows that the graph $G'$ is chained. The vertex set for $G'$ can be expressed as $V = \{v_1\} \cup \{v_2, v_3\} \cup \{v_4, v_5, v_6\}$.
In Example 12, the graph G is approximated by a 3-chained graph. We conclude that G is a 3-semi-chained graph with ρ-anti-communities $\{v_1\}$, $\{v_2, v_3\}$, and $\{v_4, v_5, v_6\}$.

Position centrality and some applications
There are many ways to measure the importance of a vertex in a graph; see, e.g., Estrada (2011a), Estrada and Higham (2010), and Newman (2010). These measures often are referred to as centrality measures. In this section, we are interested in determining a most "centrally located" vertex in a graph. We call such a vertex a center vertex. For this purpose, we introduce a new centrality measure, which belongs to the class of path-based centrality measures. This class includes closeness and betweenness centralities. In fact, determining the most centrally located nodes is an extension of closeness centrality.
Applications of the detection of a center vertex include:
- Information dissemination: we are interested in determining a vertex (the center vertex) such that information from it can travel to all other vertices in the least amount of time. Here we assume that the travel time is proportional to the number of edges that have to be traversed from a center vertex to the receiving vertices. In the context of social network theory, the importance of a node for spreading information is often associated with the betweenness centrality, which assumes that the communication in a network takes place through the shortest paths passing through this node. However, it has been shown that in some circumstances the best spreaders do not correspond to the most highly connected or central nodes. They are often located within the core of the network, identified by using k-shell decomposition analysis; see Kitsak et al. (2010) and the references therein.
- City planning: let the edges of a graph represent the streets of a town. It would be reasonable to locate a fire station, police station, bus terminal, or hospital at a center vertex of the graph.
Definition 13 Let T be a spanning tree of the graph G, starting at a vertex v, and let $V_1, V_2, \ldots, V_\ell$ be the ℓ-chained structure determined by the tree. The position centrality $P_p$ of v in the graph, where $p \in \mathbb{R}$, is defined by
$$P_p(v) = \sum_{i=2}^{\ell} (i-1)^p \, (\#V_i),$$
where $(\#V_i)$ denotes the cardinality of the set $V_i$. We refer to a vertex $v_c$ with the smallest position centrality as a p-center vertex.
When p = 1, the position centrality is the sum of the lengths of the paths from v to all the other vertices, so its minimization is equivalent to the maximization of the closeness centrality, which is inversely proportional to the sum $\sum_{j} d(v_i, v_j)$, where $d(v_i, v_j)$ is the distance between $v_i$ and $v_j$. For p = −1, position centrality is equivalent to harmonic mean distance; see Newman (2010, Eq. (7.30)). We emphasize that position centrality depends on the chained structure, which contains important information about the network being analyzed.
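A small Python sketch may help to make the definition concrete. It assumes that the position centrality of v is the sum of the p-th powers of the hop distances from v to all other vertices, which agrees with the description of the case p = 1 above, and that the spanning tree is a breadth-first tree. The function names and the path graph are hypothetical.

```python
from collections import deque


def position_centrality(adj, v, p):
    """Sum of the p-th powers of the hop distances from v to all other
    vertices (the vertices at distance i - 1 form the set V_i)."""
    dist = {v: 0}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return sum(d ** p for d in dist.values() if d > 0)


def p_centers(adj, p):
    """All vertices with the smallest position centrality."""
    scores = {v: position_centrality(adj, v, p) for v in adj}
    best = min(scores.values())
    return [v for v in adj if scores[v] == best]


# hypothetical path graph 1 - 2 - 3 - 4 - 5
path = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
print(position_centrality(path, 1, 1))  # 1 + 2 + 3 + 4 = 10
print(position_centrality(path, 3, 2))  # 1 + 1 + 4 + 4 = 10
print(p_centers(path, 1))               # [3]
```

For non-integer p the scores are floating-point numbers, so ties between center vertices should be detected with a tolerance rather than with exact equality as in this sketch.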
Using positive p values different from 1 may help select central nodes with different features. A value larger than 1 further penalizes the presence of a large number of long walks, and selects a relatively long ℓ-chained structure, generally with maximal chain length, with sets V k containing a small number of vertices. This feature has the interesting side effect of reducing the bandwidth of the adjacency matrix corresponding to the node ordering induced by the chain structure.
In contrast, p ∈ (0, 1) reduces the difference between the scores of long and short walks, leading to a shorter chain structure composed of large node sets.
Example 13 Consider the graph G displayed in Fig. 10. The position centrality of a vertex in the graph can be computed by using the chained graph starting from this vertex.
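To compute the position centrality of vertex v_2, we consider the spanning tree rooted at vertex v_2. We obtain a chained structure V_1 = {v_2}, V_2, ..., V_5 whose node sets have cardinalities 1, 1, 1, 3, and 4, respectively. Since the graph is unweighted, the distance from a vertex in V_i to a vertex in V_{i+1} is one. It follows that the 1-position centrality of vertex v_2 is
$$P_1(v_2) = 1\cdot 1 + 2\cdot 1 + 3\cdot 3 + 4\cdot 4 = 28,$$
while P_5(v_2) = 4828 and P_{1/5}(v_2) = 12.02.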
We turn to the position centrality of vertex v_4. Letting V_1 = {v_4}, we obtain V_2 = {v_1, v_3, v_5, v_8} and V_3 = {v_2, v_6, v_7, v_9, v_10}. We have
$$P_1(v_4) = 1\cdot 4 + 2\cdot 5 = 14.$$
Similarly, we can compute the position centrality for all the other vertices of the graph. Vertex v_4 has the smallest position centrality score for p = 1/5 and p = 1, while the center vertices for p = 5 are v_7, v_9, and v_10.
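As a worked check, and assuming that the position centrality has the form P_p(v) = Σ_{i≥2} (i − 1)(#V_i)^p (a reading that is consistent with the values reported for v_2), the scores of v_4 for the three values of p follow directly from #V_2 = 4 and #V_3 = 5:

```latex
\begin{align*}
P_1(v_4)     &= 1\cdot 4 + 2\cdot 5 = 14,\\
P_5(v_4)     &= 1\cdot 4^5 + 2\cdot 5^5 = 1024 + 6250 = 7274,\\
P_{1/5}(v_4) &= 4^{1/5} + 2\cdot 5^{1/5} \approx 1.3195 + 2.7595 \approx 4.08.
\end{align*}
```

Under this assumption the large node sets of v_4 are penalized heavily for p = 5, which agrees with the observation that v_4 is a center vertex for p = 1/5 and p = 1 but not for p = 5.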
In Example 13, the center vertices lie on one of the longest branches of the spanning tree. It is reasonable to assume that this is typical for many trees. Hence, to approximate the center vertex, instead of evaluating the position centrality for all the vertices, it is more efficient to compute the position centrality for the vertices on the longest branches only. This suggests the iterative procedure described by Algorithm 4. The same approach can also be used for determining the approximate top k p-center nodes, as described by Algorithm 5.
It may be attractive to identify a tree with the shortest longest branch and then determine a candidate for the central node on a longest branch. We outline this approach, but hasten to add that it is only a heuristic, because a center vertex is not guaranteed to lie on a longest branch.
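The longest-branch idea can be sketched in Python as follows. This is only an illustration under stated assumptions and not a transcription of Algorithm 4: it assumes breadth-first spanning trees, the score Σ (i − 1)(#V_i)^p, and a simple stopping rule (stop when no vertex on the current longest branch improves the score). The names `bfs_tree`, `score`, and `approx_center` are hypothetical.

```python
from collections import deque

def bfs_tree(adj, root):
    """Breadth-first search: distances and tree parents from `root`."""
    dist, parent = {root: 0}, {root: None}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w], parent[w] = dist[u] + 1, u
                queue.append(w)
    return dist, parent

def score(dist, p):
    """Assumed position centrality computed from BFS distances."""
    sizes = {}
    for d in dist.values():
        sizes[d] = sizes.get(d, 0) + 1
    return sum(d * n ** p for d, n in sizes.items())

def approx_center(adj, start, p=1, max_iter=100):
    """Heuristic: repeatedly move to the best vertex on a longest branch."""
    current = start
    dist, parent = bfs_tree(adj, current)
    best = score(dist, p)
    for _ in range(max_iter):
        # Walk back from a deepest vertex to the root: one longest branch.
        node = max(dist, key=dist.get)
        branch = []
        while node is not None:
            branch.append(node)
            node = parent[node]
        # Score only the vertices on that branch.
        candidates = {}
        for v in branch:
            d_v, par_v = bfs_tree(adj, v)
            candidates[v] = (score(d_v, p), d_v, par_v)
        nxt = min(candidates, key=lambda v: candidates[v][0])
        if candidates[nxt][0] >= best:
            break  # no improvement on this branch
        current = nxt
        best, dist, parent = candidates[nxt]
    return current, best
```

Each iteration costs one breadth-first search per vertex on the branch, instead of one per vertex of the graph. As noted above, the result is only an approximation, because a center vertex need not lie on a longest branch.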
The following example illustrates that the center vertex depends on the spanning tree.
Example 14 Consider the undirected and unweighted graph G displayed in Fig. 11. Two shortest-path trees rooted at vertex v 1 are shown in Fig. 12. The center vertices of the shortest-path tree in Fig. 12a are the vertices v 2 and v 3 . Let v 2 be the starting vertex.
Let, instead, v_3 be the initial vertex. Then we obtain the same value of the position centrality. Similarly, we find that the center vertices of the shortest-path tree in Fig. 12b are the vertices v_2 and v_4. The position centrality of both these vertices is 8, which is the smallest position centrality of all the vertices.

Numerical experiments
The algorithms discussed in the previous sections were implemented in the MATLAB programming language. The larger tests were executed on a Linux virtual machine running on a Cisco UCSB-B480-M5 server based on Intel Xeon Gold 6136 processors. The virtual machine is equipped with 32 cores and 128 GB of RAM. We first illustrate the use of the algorithms on a small graph, namely, the one described in Example 11 and illustrated in Fig. 8. The graphs in Fig. 13 display spanning trees starting at vertex v_2 and at vertex v_6 of the graph G. The dashed lines denote edges that must be added to the tree T to obtain the graph C(T): it is seen that the added edges are compatible with the chain structure of T, and the chain length is 5. Applying Algorithm 1 confirms that this length is maximal. Hence, G is 5-chained and all the sets V_i determined by T are 0-anti-communities.
By computing the 1-position centrality of all nodes in G, one finds that the corresponding central nodes are v_4, v_5, and v_8. The nodes with the smallest 5-position centrality are v_2 and v_3. Both of them are roots of a tree with maximal chain length; Fig. 13a illustrates this for v_2. We applied Algorithm 3 for approximating the chain structure length of the graph, and Algorithm 4 to approximate its center vertex for p = 1. To investigate the global performance of these methods, the algorithms were applied starting from each vertex of the network; the results are displayed in Fig. 14. It can be observed that the chain structure length was not detected for each starting vertex, but the computed approximations are accurate. In contrast, Algorithm 4 always determined one of the three correct center vertices.

Fig. 11 An undirected and unweighted graph G

We now analyze the structure of three medium-sized networks, derived from well-known data sets:
- autobahn (1168 nodes, 2486 edges) describes the German highway system network, where the vertices are locations and the edges are highways connecting them. It is available at Biological Networks Data Sets of Newcastle University, http://www.biological-networks.org/.
- yeast (2361 nodes, 13828 edges) represents the protein interaction network for yeast: the interacting proteins are connected by edges (Jeong et al. 2001; Sun et al. 2003). It is available at Batagelj and Mrvar (2006).
- geom (7343 nodes, 23796 edges) was extracted from the computational geometry database collaboration network geombib by B. Jones (version 2002). Nodes represent authors; the value of the entry (i, j) of the adjacency matrix is the number of papers coauthored by authors i and j. The data set is available at Batagelj and Mrvar (2006). We will use the associated unweighted network.
The autobahn network is connected, but the networks yeast and geom are not. For the latter two networks we therefore considered the largest connected component, which has 2224 and 3621 vertices, respectively. As expected, Algorithm 1 reveals all three networks (autobahn, yeast, and geom) to be semi-chained. The maximal chain length of a spanning tree for each of the three graphs is ℓ = 63, 12, and 15, respectively. The structures of a maximal chain length spanning tree for autobahn (starting at vertex 116) and for geom (starting at vertex 207) are displayed in Fig. 15a, b. The additional edges which define C(T), represented by dashed lines, are compatible with the semi-chain structure of each tree T. Figure 16 displays the adjacency matrix for the autobahn network after applying two particular orderings of the nodes, derived from the spanning tree displayed in Fig. 15a. By listing the vertices in the same order as they appear in the node sets V_i, i = 1, 2, ..., 63, we obtain the spy plot in Fig. 16a. In a spy plot, each nonzero entry of a matrix is represented as a dot, and the quantity "nz" on the x-axis denotes the number of nonzeros. The plot exhibits the form reported in (3.1), and shows that this ordering reduces the bandwidth of the adjacency matrix, especially in the presence of a long chain structure. By applying the vertex ordering proposed in Example 9, that is, by listing first the nodes in the sets V_i with an odd index i and then those with an even index, the adjacency matrix of G takes the sparsity structure shown in the spy plot in Fig. 16b. It coincides with the form displayed in Eq. (3.3), and shows that the graph is almost bipartite.
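The effect of the two vertex orderings on the adjacency matrix can be illustrated with a small Python sketch. It assumes that the node sets V_i are the layers of a breadth-first search, and it measures the bandwidth as the largest index difference between the end points of an edge. The function names and the example graph are hypothetical and are not taken from the experiments above.

```python
from collections import deque

def bfs_layers(adj, root):
    """Return the node sets V_1, V_2, ... of a BFS tree rooted at `root`."""
    dist = {root: 0}
    layers = [[root]]
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                if dist[w] == len(layers):
                    layers.append([])
                layers[dist[w]].append(w)
                queue.append(w)
    return layers

def bandwidth(adj, order):
    """Largest |position(u) - position(w)| over all edges {u, w}."""
    pos = {v: k for k, v in enumerate(order)}
    return max((abs(pos[u] - pos[w]) for u in adj for w in adj[u]), default=0)

def chain_order(layers):
    """List the vertices set by set: V_1, V_2, V_3, ..."""
    return [v for layer in layers for v in layer]

def odd_even_order(layers):
    """List V_1, V_3, ... first, then V_2, V_4, ..."""
    odd = [v for layer in layers[0::2] for v in layer]
    even = [v for layer in layers[1::2] for v in layer]
    return odd + even

if __name__ == "__main__":
    # A path whose labels are scrambled: 0 - 3 - 1 - 4 - 2 - 5.
    adj = {0: [3], 3: [0, 1], 1: [3, 4], 4: [1, 2], 2: [4, 5], 5: [2]}
    layers = bfs_layers(adj, 0)
    print(bandwidth(adj, sorted(adj)), bandwidth(adj, chain_order(layers)))
```

For a chained graph, the first ordering places every edge within or next to the diagonal blocks, while the second ordering moves all edges to the two off-diagonal blocks, which is the bipartite pattern.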
In view of the sparsity of the diagonal blocks, this spy plot signals the presence of anti-communities in the network. Indeed, by computing the anti-community score of the node sets V_i resulting from the application of Algorithm 1 with starting vertex 116, represented in Fig. 15a, we find that the autobahn network has 48 0-anti-communities (23 including just one vertex) and 15 anti-communities, with maximal score ρ = 0.07. The spy plots for the yeast and geom networks corresponding to the first ordering are reported in Fig. 17a, b, respectively. They clearly show that in both networks there are groups of vertices which do not interact, and that there are no anti-communities. Figures 18, 19, and 20 depict the results obtained by running Algorithm 3 to approximate the chain structure length, and Algorithm 4 to determine an approximation of the center vertex for p = 1. The algorithms are initialized using each node in the network as a starting vertex, in order to investigate their best and worst performances. In real applications, the algorithms should be initialized with a random starting vertex.
The graph in the left panel of each figure shows the maximal chain length determined for each starting vertex. We see that for the autobahn network about half of the tests determine the correct value 63, and the other runs obtain the close value 61. For the other two networks, Algorithm 3 is very accurate, missing the correct chain length by one unit in just a few cases.
The graphs (b) in Figs. 18, 19, and 20 report the relative errors in the approximations of the 1-position centrality by Algorithm 4 when compared to the exact result. The (exact) minimal position centrality was computed by Algorithm 1, which identified the following center vertices for the three test networks:
- v_c = 698, with P_1(v_c) = 13954, for autobahn;
- v_c = 518, with P_1(v_c) = 6914, for yeast;
- v_c = 20, with P_1(v_c) = 11736, for geom.
We see that the position centrality was accurately estimated in most cases. The relative error for autobahn exceeds 15% only for a small number of starting vertices, while it is always below 16% for yeast, and 8% for geom.
To illustrate the differences between the center vertex identified by the position centrality, as defined in Definition 13, and other centrality measures, we consider a real-world data set concerning air transport management. We compare the position centrality with the subgraph centrality and the eigenvector centrality. The subgraph centrality of node i is given by
$$[\exp(A)]_{ii} = e_i^T \exp(A)\, e_i,$$
where A is the adjacency matrix of the graph and e_i denotes the i-th column of the identity matrix. The eigenvector centrality was introduced by Bonacich as a measure of the influence a node has in a network (Bonacich 1987). The i-th entry of the principal eigenvector q_1 of A is known as the eigenvector centrality of node i. Typically, q_1 is normalized and, by the Perron-Frobenius theorem, it can be chosen so that all of its components are nonnegative.
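The two spectral measures used in this comparison can be sketched in pure Python. The sketch assumes a dense adjacency matrix stored as a list of lists, approximates the diagonal of exp(A) by a truncated Taylor series, and computes the principal eigenvector by power iteration on A + I; the shift is our own choice to avoid oscillation on bipartite graphs and does not change the eigenvectors. The function names are hypothetical, and the code is meant for small illustrative graphs only.

```python
import math

def mat_vec(A, x):
    """Dense matrix-vector product."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def eigenvector_centrality(A, iters=1000, tol=1e-12):
    """Power iteration on A + I, normalized to unit Euclidean norm."""
    n = len(A)
    q = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        y = [yi + qi for yi, qi in zip(mat_vec(A, q), q)]  # (A + I) q
        norm = math.sqrt(sum(v * v for v in y))
        y = [v / norm for v in y]
        if max(abs(a - b) for a, b in zip(y, q)) < tol:
            return y
        q = y
    return q

def subgraph_centrality(A, terms=30):
    """Diagonal of exp(A) from the series sum_k A^k / k!, truncated."""
    n = len(A)
    result = []
    for i in range(n):
        x = [1.0 if j == i else 0.0 for j in range(n)]  # e_i
        total = x[i]  # k = 0 term
        for k in range(1, terms):
            x = [v / k for v in mat_vec(A, x)]  # A^k e_i / k!
            total += x[i]
        result.append(total)
    return result

if __name__ == "__main__":
    # Star graph with center 0 and three leaves (illustrative only).
    star = [[0, 1, 1, 1], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]]
    print(eigenvector_centrality(star))
    print(subgraph_centrality(star))
```

For large sparse networks one would instead rely on sparse eigensolvers and on methods for evaluating matrix functions, which this sketch does not attempt.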
The node identified by both the subgraph centrality and the eigenvector centrality is New York City, one of the largest commercial centers of the US. The center vertex determined by the position centrality is located at Las Vegas. Indeed, given its position and connections, it is easy to travel from Las Vegas to any other town.
For completeness, we report in Fig. 21 the results obtained by running Algorithm 3 to approximate the chain structure length, and Algorithm 4 to determine an approximation of the center vertex. We see that in this case Algorithm 3 is not very accurate, but the network is too small for the experiment to be of significance. On the contrary, the center node is determined with high accuracy.
Finally, Table 1 reports the central nodes for the networks autobahn, yeast, geom, and airlines, according to various centrality indices. Position centrality, with p = 1, 5, and 1/5, is compared to the degree of a node, betweenness centrality (Newman 2010), PageRank (Page et al. 1999), subgraph centrality, and eigenvector centrality. We note that for the autobahn network P_{1/5}, the degree, the PageRank, and the subgraph centrality agree in the determination of the center node. For airlines most of the methods agree, with the exception of subgraph and eigenvector centrality, which identify the same node, and P_5. In most cases, different indices select different center vertices, illustrating that they take different features of the network into consideration. We emphasize that the position centrality associates with a center vertex a hierarchy of the nodes, namely, the chain structure, which carries additional information about the topology of the network.

Conclusion
The notions of chained and semi-chained graphs, as well as of center nodes, are introduced. Their properties and use to analyze networks are discussed, and algorithms for approximating both the chained structure of a graph and its center nodes are presented.