Network dismantling on factor graphs: break long loops and spare local structures

A new solution framework for network dismantling was recently developed, based on a two-scale bipartite factor-graph representation of the original graph in which local structures are abstracted as factor nodes. This technique advances extant dismantling algorithms; among them, the belief-propagation decimation (BPD) algorithm has an efficient counterpart on the factor graph (factor BPD, i.e., FBPD), building upon a mean-field spin-glass theory developed for the underlying long-loop feedback vertex set (FVS) problem. In this paper, I (1) demonstrate the advantages as well as the disadvantages of the new factor-graph approach and investigate the choice of factors, (2) show that the method can be supported by an alternative microscopic picture, and that the two distinct spin-glass theories yield equivalent outcomes, whose analytical results serve as lower bounds for the FVS size on random regular factor graphs, and (3) derive an extra mathematical lower bound from the result on random regular (original) graphs. The performance of graph and factor-graph algorithms is compared on various real networks. It is shown empirically and analytically that the factor-graph approach does not interfere with what could be achieved without this technique; the new approach performs well where traditional algorithms may perform poorly.

Figure 1. Factor graph transformation (adapted from figure 1 in [15]). A simple edge is a two-clique (red), a triangle is a three-clique (blue), and a tetrahedron is a four-clique (yellow).

[…] graph across the two types of nodes, on which message-passing based on the spin-glass microscopic picture becomes more efficient. Extending the previous analysis, in this paper I demonstrate the advantages and disadvantages of this new solution framework, substantiate it with an alternative spin-glass microscopic picture plus a relevant mathematical bound for the FVS size on random regular factor graphs, and discuss its scope of application and its complementary role to extant solutions.

Dismantling on factor graphs
We consider cliques (among which triangles, i.e., three-cliques, are the most common type) as graph factors and represent them with factor nodes, which transforms the original graph into a bipartite factor graph (figure 1). By this means, local factors disappear and the graph is simplified to a tree if it does not contain more complicated factors; optimal FVS solutions can then be obtained with the spin-glass approach [11], and the corresponding network dismantling results are expected (yet not guaranteed, see below) to improve through efficient message-passing. Denote the original graph by G = (N, E) and the factor graph by G' = (N', E'). We denote the sets of real nodes and factor nodes by R = {i, j, k, . . .} and I = {a, b, c, . . .}, so that |R| = N and |R| + |I| = N' (no real node is deleted in the factor graph). Note that during graph transformation all original edges E are broken, and any edge in the factor graph G' is always between a real node and a factor node, i.e., there is only one type of edge in the factor graph despite there being two types of nodes.

Factor nodes in figure 1 are shown as squares: a simple edge is a two-clique, a triangle is a three-clique, a tetrahedron is a four-clique, and so on (we consider cliques up to a certain order γ; here γ = 4). In most cases, as cliques of higher and higher order are considered as factors, the factor graph becomes more and more reduced in scale; yet there are exceptions due to the cutoff of clique order at the higher end. Consider an eight-clique with only one edge missing. If we identify cliques up to four-cliques, such a structure is represented as two four-cliques with 4 × 4 − 1 = 15 edges connecting the two components, resulting in 1 + 1 + 15 = 17 factor nodes and 4 + 4 + 15 × 2 = 38 (new type of) edges in total. Alternatively, if we identify cliques up to five-cliques, this structure is represented as a five-clique and three single nodes, which cannot form a triangle due to the missing edge. Each of the three nodes connects to the 5 nodes in the clique, and there are 2 edges connecting the three of them. Therefore, there are 1 + 3 × 5 + 2 = 18 factor nodes and 5 + 2 × (3 × 5 + 2) = 39 edges in total. In this case, elevating the clique order from γ = 4 to γ = 5 leads to more nodes and edges in the factor graph. This issue is resolved if the clique order is further elevated, e.g., γ ≥ 7 yields the optimal factor-graph representation of this example. Such insufficient clique-order cutoffs, however, do not appear in most graph instances tested in this study, with very few exceptions (see table 1).

Improvement of (decycling-based) dismantling performance on the factor graph versus on the original graph can be demonstrated on the sample graph in figure 2. The seven-node graph is connected through 2 triangles and 2 single edges, and its corresponding factor graph is a tree consisting of 11 nodes. When conducting dismantling on the original graph under the condition that the resulting largest connected component (LCC) has |LCC| ≤ 3, two nodes (C and E) need to be removed at the first step to render the graph decycled. Neither of the two nodes can be inserted back without violating the LCC size constraint; thus the final removal set has size 2. Dismantling the (already acyclic) factor graph calls directly for tree-breaking, where only one node (D) needs to be put into the removal set before the LCC size constraint is met.

This small example reveals the potential of the factor-graph approach in dismantling real networks, where local clusters consisting of triangles and higher-order cliques often prevail.

Figure 3. Downgraded dismantling performance with the factor-graph approach. For the task |LCC| ≤ 5, when conducting dismantling on the original graph, the optimal solution is the removal of 3 nodes (shown in light shade); on the corresponding factor graph, 5 nodes need to be removed (not shown). Not capturing incomplete cliques (such as the structure on nodes A to D) under the current scheme undermines the power of the new approach.
Nevertheless, the new method does not always improve dismantling performance and may occasionally lead to downgraded performance (figure 3). A critical failure lies in the inability to capture incomplete cliques (e.g., nodes A to D in figure 3) under the current scheme, where only cliques are identified as factors. An incomplete clique cannot be abstracted as a single factor and has to be represented by a lower-order clique plus additional edges. In future extensions, it is plausible to explore the incorporation of incomplete cliques as new factors and to evaluate the method's updated performance. Initial tests (see later) suggest that capturing more factor types does simplify the factor graph further, yet the eventual dismantling performance can be downgraded. This is because the set of re-insertable nodes among those removed at the decycling step shrinks drastically, as individual nodes are now associated with more complex factors; a possible re-insertion may often entail putting back a large number of nodes and is thus inhibited.
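The factor-node and edge counts quoted for the eight-clique example can be checked with a short script. This is only a sketch: the function name `factor_graph_size` and the node labels are illustrative, and the choice of which cliques become factors is passed in by hand, since the clique-identification procedure itself is not spelled out here. Every edge not covered by a chosen clique is kept as a two-clique factor.

```python
from itertools import combinations

def factor_graph_size(edges, cliques):
    """Count factor nodes and bipartite edges for a given set of clique factors.

    `edges` is the edge set of the original graph; `cliques` lists the
    higher-order cliques chosen as factors (an input of this sketch).
    Every edge not covered by a chosen clique becomes a two-clique factor.
    """
    edge_set = {frozenset(e) for e in edges}
    covered = set()
    for clique in cliques:
        for pair in combinations(clique, 2):
            pair = frozenset(pair)
            if pair not in edge_set:
                raise ValueError(f"{sorted(pair)} is not an edge of the graph")
            covered.add(pair)
    leftover = edge_set - covered            # edges kept as two-clique factors
    n_factors = len(cliques) + len(leftover)
    n_links = sum(len(c) for c in cliques) + 2 * len(leftover)
    return n_factors, n_links

# Eight-clique on nodes 0..7 with the single edge (5, 6) missing.
edges = [e for e in combinations(range(8), 2) if e != (5, 6)]

gamma4 = factor_graph_size(edges, [(0, 1, 2, 5), (3, 4, 6, 7)])  # two four-cliques
gamma5 = factor_graph_size(edges, [(0, 1, 2, 3, 4)])             # one five-clique
print(gamma4, gamma5)  # (17, 38) (18, 39)
```

With these hand-picked factors the script reproduces the 17 factor nodes and 38 edges for γ = 4, and the 18 factor nodes and 39 edges for γ = 5.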

Lower bound of FVS size on random regular factor graphs
It is expected that on random regular factor graphs, the new approach will lead to a smaller FVS size than that obtained with the normal graph approach. We derive lower bounds of the FVS size on random regular factor graphs. Start with random regular triangle graphs [18,19]. On these graphs each node has exactly K triangles, and each triangle in the graph is formed with a tuple of three nodes selected at random; the graph contains no other structure, not excluding simple edges. Thus in the factor graph, each factor node has exactly 3 neighbors and each real node has K neighbors. Suppose there are N real nodes; then the number of factor nodes added to the graph is KN/3. For random regular (non-triangle) graphs of order K, the lower bound of the FVS size (as a fraction of N) is known to be b(K) [20]. Now, for random regular triangle graphs of order K, the system is close to regular non-triangle graphs of order 2K (i.e., each node has K triangles and thus 2K neighbors). However, after b(2K)N nodes have been removed, the remaining graph is a tree structure in terms of local triangles, thanks to the breakdown of long loops. Since local triangles still exist, the lower bound of the FVS size $b_{\mathrm{tri}}(K)$ is slightly greater than b(2K), with additional nodes to be removed in order to break local structures:

$$b_{\mathrm{tri}}(K) = b(2K) + \Delta b.$$

For the lower bound, we want to find the minimum Δb such that, once these nodes are removed, the remaining structure is completely acyclic. After the removal of bN nodes, there are (1 − b)N nodes left, organized as a tree when neglecting local triangles.

The minimum number of nodes required to break all local triangles on such a quasi-tree is obtained when the structure is most densely organized, i.e., starting from a central node, every node has the same number of triangles organized layer-wise, similar to a Bethe lattice, with the mean-field triangle degree given by

[…]

Then, when all inner-layer nodes are removed, the structure becomes completely acyclic, giving the most efficient removal. Counting the nodes at each layer (up to a certain layer w),

[…]

In the limit of N → ∞, we arrive at a lower bound of the FVS size on random regular triangle graphs of order K:

[…]

where b(2K) is given by [20]. Following similar derivations, for random regular tetrahedron graphs, one arrives at the lower bound

[…]

which could be further extended to random regular factor graphs with higher-order cliques as factors.
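The construction of a random regular triangle graph, in factor-graph form, can be sketched with a stub-matching procedure. This is a minimal illustration under stated assumptions rather than the sampler of [18,19]: the function name and the rejection rule are hypothetical, and the sketch only rejects triangles that repeat a node, so it does not exclude two triangles that happen to share an edge.

```python
import random

def random_regular_triangle_factor_graph(n, k, seed=0, max_tries=10000):
    """Sample triangle factors so that every real node sits in exactly k of them.

    Returns a list of 3-tuples of real-node indices; each tuple is one
    factor node. Stub matching with rejection is an assumption of this sketch.
    """
    if (n * k) % 3 != 0:
        raise ValueError("n * k must be divisible by 3")
    rng = random.Random(seed)
    stubs = [i for i in range(n) for _ in range(k)]   # k stubs per real node
    for _ in range(max_tries):
        rng.shuffle(stubs)
        triples = [tuple(stubs[j:j + 3]) for j in range(0, len(stubs), 3)]
        # Reject samples in which a triangle contains the same node twice.
        if all(len(set(t)) == 3 for t in triples):
            return triples
    raise RuntimeError("no valid sample found; try a larger n")

n, k = 30, 3
factors = random_regular_triangle_factor_graph(n, k)
degree = [0] * n
for t in factors:
    for i in t:
        degree[i] += 1
```

By construction the sample has KN/3 factor nodes, each with 3 real neighbors, and every real node has K factor neighbors, matching the counts stated above.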

Belief-propagation on factor graphs: theory A
A spin-glass theory is constructed for the factor-graph model based on a microscopic thermodynamic picture [15]. Here we explain the theory in detail. Suppose each real node i has a direction state $s_i$, which is either zero ($s_i = 0$), indicating that i is deleted from the graph, or points to one of the neighboring factor nodes $a \in \partial i$ ($s_i = a$). A microscopic configuration of the entire system of N nodes (considering only the states of real nodes, not factor nodes) is denoted as $s \equiv (s_1, s_2, \ldots, s_N)$. Each factor node a then serves as a constraint on the configuration s of the system:

$$\chi_a = \prod_{i \in \partial a} \left(\delta^{0}_{s_i} + \delta^{a}_{s_i}\right) + \sum_{i \in \partial a} \left(1 - \delta^{0}_{s_i} - \delta^{a}_{s_i}\right) \prod_{j \in \partial a \setminus i} \left(\delta^{0}_{s_j} + \delta^{a}_{s_j}\right),$$

where δ is the Kronecker delta: $\delta^{y}_{x} = 1$ iff x = y, otherwise $\delta^{y}_{x} = 0$. Notice that $\chi_a = 1$ iff (i) every neighboring real node i of node a is in state $s_i = a$ or state $s_i = 0$, or (ii) exactly one neighboring node i is not in state $s_i = a$ or $s_i = 0$ while all other neighboring nodes $j \in \partial a$ are in state $s_j = a$ or $s_j = 0$. Now the partition function of the system is defined as

[…] (6)

where β is the inverse temperature controlling the mean number of zero-state nodes in the system. Any valid configuration s of the system therefore satisfies all constraints χ = 1 induced by the existing factor nodes (note the multiplication term at the end of equation (6)). When any constraint is not satisfied, so that χ = 0 somewhere, a certain factor node is not being properly respected; the corresponding system configuration is then not valid and is not counted in Z.
Essentially, due to the constraints χ from factor nodes, every configuration s having a positive contribution to Z(β) can be represented as a set made of trees (each of which is a factor subgraph with no loop) and cycle-trees (each of which is a factor subgraph with exactly one loop); as long as a configuration s does not conflict with a tree or quasi-tree, it is valid. For example, a valid (local) configuration for nodes F, G, K in figure 1 can be $s_{F/G/K} = 4/4/4$, or $s_{F/G/K} = 4/4/5$, $s_{F/G/K} = 2/4/4$, $s_{F/G/K} = 4/8/4$, and so on: as long as at least two of F, G, K are pointing to the factor node 4, the configuration is valid. We write the cavity probabilities of the state $s_i$ of each real node i and then formulate the self-consistent BP equations under this spin-glass theory. Denote by $p^{s_i}_{a \to i}$ the cavity probability of state $s_i$ of node i if i is constrained only by factor node a but not by any of the other connected factor nodes b, and by $q^{s_i}_{i \to a}$ the cavity probability of state $s_i$ of node i if i is constrained by all the connected factor nodes $b \in \partial i$ except for the factor node a. The self-consistent BP equations for $p^{s_i}_{a \to i}$ are

[…]

The normalization constant $z_{a \to i}$ ensures that $p^{0}_{a \to i} + p^{a}_{a \to i} + \sum_{b \in \partial i \setminus a} p^{b}_{a \to i} = 1$. Notice that $p^{0}_{a \to i} = p^{a}_{a \to i}$, and that $p^{b}_{a \to i}$ is the same for any $b \in \partial i \setminus a$; hence its value is $p^{b}_{a \to i} = (1 - 2 p^{0}_{a \to i}) / (|\partial i| - 1)$.

[…]

where $z_{i \to a}$ ensures $q^{0}_{i \to a} + q^{a}_{i \to a} + \sum_{b \in \partial i \setminus a} q^{b}_{i \to a} = 1$,

[…]

with $z_i$ such that $q^{0}_{i} + \sum_{a \in \partial i} q^{a}_{i} = 1$. Compared to the two types of free entropy contribution under the original BP scheme, under the factor-graph scheme there are three types: the contribution from real nodes $\phi_{i \in R}$, the contribution from factor nodes $\phi_{a \in I}$, and the contribution from edges $\phi_{ia}$ (noting the uniform edge type).
The total free entropy Φ at a certain temperature is the sum over the three terms:

[…]

At a given value of β we obtain ρ and the overall entropy density of the system, s = β(f − ρ). The minimum fraction $\rho_0$ of feedback vertices is thus determined by $s(\rho_0) = 0$. The above equations are simplified when we only consider the cavity messages $q^{s_i}_{i \to a}$, in which case the BP equations become

[…]

and further

[…]

with normalization $q^{0}_{i \to a} + q^{a}_{i \to a} + \sum_{b \in \partial i \setminus a} q^{b}_{i \to a} = 1$. The free entropy Φ(β) is now expressed as

[…]

where $\phi_{a \in I}$ of a factor node a is as in (10); $\phi_{i + \partial i}$ is the combined free energy contribution of real node i and all its neighboring factor nodes $a \in \partial i$, which is given by

[…]

with $z_{i + \partial i}$ expressed as

[…]

Now the marginal probability of node i being empty is

[…]
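The factor-node constraint χ of theory A, as described in words above, can be checked on small local configurations with a few lines of code. This is a sketch of the constraint only, not of the BP equations; the function name `chi` and the encoding of states as integers (0 for a deleted node, otherwise the label of the factor node pointed to) are illustrative assumptions.

```python
def chi(a, neighbor_states):
    """Return 1 if factor node `a` is satisfied, else 0.

    `neighbor_states` holds the state s_i of every real node i adjacent to a:
    0 means deleted, any other value is the label of the factor node that i
    points to. The constraint holds when at most one neighbor is in a state
    other than 0 or a.
    """
    outsiders = sum(1 for s in neighbor_states if s != 0 and s != a)
    return 1 if outsiders <= 1 else 0

# Local configurations of nodes F, G, K around factor node 4 (figure 1 labels).
valid = [chi(4, s) for s in [(4, 4, 4), (4, 4, 5), (2, 4, 4), (4, 8, 4)]]
invalid = chi(4, (5, 8, 4))   # two neighbors point away from factor node 4
print(valid, invalid)  # [1, 1, 1, 1] 0
```

The four configurations listed in the text as valid all give χ = 1, while a configuration with two nodes pointing elsewhere gives χ = 0.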

Belief-propagation on factor graphs: theory B
On factor graphs, the original spin-glass picture [4,11] may still apply, but calls for a modification. Upon the separation of real nodes and factor nodes, the belief-propagation (BP) equations are to be distinguished for the two types of nodes on the factor graph. Factor nodes should participate in message-passing, but they should never be selected into the FVS, as their removal does not lead to node removal in the original graph. As factor nodes cannot be unoccupied, one of the three cases of a real node's state is dismissed for factor nodes: the message $q^{0}_{i \to j}$ is now 0 if the sender i is a factor node (thus its neighbor j is a real node). Therefore, for a real node $i \in R$, the messages are (∂x denoting the neighbors of x):

[…]

where the normalization constant $z_{i \in R \to a \in I}$ ensures that $q^{0}_{i \to a} + q^{i}_{i \to a} + \sum_{c \in \partial i \setminus a} q^{c}_{i \to a} = 1$, and β is the inverse temperature controlling the mean number of zero-state nodes in the system. On the other hand, for a factor node $a \in I$, the messages are:

[…]

where $z_{a \in I \to i \in R}$ satisfies $p^{j}_{a \to i} + \sum_{l \in \partial a \setminus i} p^{l}_{a \to i} = 1$. The three types of free entropy contribution are now

[…]

from which we calculate the free entropy density f = Φ(β)/N. Looking at (18) and (19), one verifies that, for a real node i from a certain clique, messages from other nodes in the same clique are passed to i via the factor node (say a) placed at the center of the clique, without adding extra information along the way (since $p^{0}_{a \to i} = 0$), in which case the original message-passing scheme without the factor node is strictly preserved. This observation also implies that the current factor-node approach works exactly when the new nodes replace cliques, and may not strictly work for the representation of other factors in the graph (e.g., rectangles, triangles sharing edges, etc.) that are not complete cliques.
If a non-clique factor is represented by a factor node, it will send some extra messages to certain nodes in the factor through channels that do not exist on the original graph.

Results
The theories and algorithms are tested on synthetic graphs (random regular triangle/tetrahedron graphs) and then on a panel of real networks of various sorts. Random regular factor graphs are used to demonstrate the matching of theories and numerical outcomes, as well as the binding of the analytical lower bounds. Since the graph is regular, messages and cavity probabilities are the same on all edges and nodes, so the BP equations can be simplified and we arrive analytically at the replica-symmetric mean-field results for the optimal FVS size. The two spin-glass theories introduced above are labelled, in order, theory A and theory B. Numerical results are obtained with the original BPD algorithm and the new FBPD algorithm; under the same factor-graph framework, the corresponding version of CoreHD [5] is derived, i.e., FCoreHD, which executes on the factor graph in the same way as CoreHD except for the additional feature that only real nodes are to be removed. These four algorithms (BPD, FBPD, CoreHD, FCoreHD) are tested on a number of real networks, comparing the dismantling performance when clique factors up to clique order γ are considered, with γ = 3, 4, 5.

Synthetic graphs
Results for the FVS size of random regular triangle/tetrahedron graphs with different triangle/tetrahedron degree K are shown in figure 4, obtained from the two spin-glass theories, the FBPD algorithm, and the lower bound $b_{\mathrm{tri/qua}}(K)$. Unexpectedly, the two spin-glass theories yield identical results; the corresponding two versions of FBPD are further tested on various synthetic and real networks, and all results suggest that the two theories are essentially equivalent, despite being built on distinct thermodynamic pictures. As lower-bound equilibrium solutions, the spin-glass theories match algorithm outcomes well, but progressively underestimate the numerical results as K gets larger. The lower bound binds tightly on random regular triangle graphs (i.e., $b_{\mathrm{tri}}(K)$), while the binding loosens on random regular tetrahedron graphs (i.e., $b_{\mathrm{qua}}(K)$).

Real networks
The BPD/FBPD and CoreHD/FCoreHD algorithms are tested on a panel of real networks of various types, including a power grid (Grid) [21], the internet at the autonomous-system level (IntNet) [22], email networks (Email1, Email2) [23,24], a protein-protein interaction network (yeast) [25], and social networks on multiple platforms (LastFM, Twitch, FB [a/b; from two sources], Deezer, Brightkite [BK]) [26][27][28][29][30]. For each network instance, we consider the identification of cliques up to order γ = 3, 4, 5; correspondingly, the original graph is transformed into factor graphs at different clique cutoffs (table 1 left). As expected, with increasing γ the factor graph is in general more and more simplified (with some rare exceptions, as discussed in section 1), with smaller |N'| and |E'|. In many cases, γ = 4 already saturates clique identification, as the factor graph often changes only slightly when we move further to γ = 5.
We dismantle the network such that the LCC size of the remaining sub-networks is no more than 1% of the size of the original network. The dismantling set starts from the FVS (of size m), is extended by a few tree-breaking steps (adding t nodes) that push the subgraphs below the size threshold, and is then reduced by re-inserting as many nodes as possible (r nodes) into the remaining graph; the final size of the dismantling set is n = m + t − r. The number of tree-breaking steps t is largely the same across different methods, and is always small; results for n and r are shown in table 2. As mentioned earlier, a smaller FVS at network decycling does not always lead to a smaller removal set at network dismantling, because a simpler graph leads to less node re-insertion. This is confirmed in the results (table 2 right): such is the case for the factor graph compared with the original graph, as well as for the factor graph constructed with a large γ compared with a small γ. Indeed, the results suggest that although further increasing γ leads to a smaller FVS (table 1 right), the best dismantling result is nevertheless often obtained with an intermediate γ (table 2 left). This limitation notwithstanding, the improvement of FBPD over BPD and of FCoreHD over CoreHD in dismantling performance is clearly demonstrated in almost all tested instances.
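The bookkeeping n = m + t − r can be illustrated with a toy re-insertion step. The sketch below is not the re-insertion rule of BPD or CoreHD: it simply tries the removed nodes in a fixed sorted order (an assumption made for reproducibility) and puts a node back whenever the LCC stays within the threshold. The function names and the example graph are hypothetical.

```python
from collections import deque

def lcc_size(adj, removed):
    """Size of the largest connected component after deleting `removed`."""
    seen, best = set(removed), 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:                         # breadth-first search of one component
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        best = max(best, size)
    return best

def greedy_reinsert(adj, removed, threshold):
    """Put removed nodes back one at a time while the LCC stays within threshold."""
    removed = set(removed)
    for node in sorted(removed):             # fixed order: an assumption of this sketch
        trial = removed - {node}
        if lcc_size(adj, trial) <= threshold:
            removed = trial
    return removed

# Path graph 0-1-2-3-4-5-6 with nodes 1, 3, 5 removed; target |LCC| <= 3.
adj = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 6] for i in range(7)}
final = greedy_reinsert(adj, {1, 3, 5}, threshold=3)
print(sorted(final))  # [3]
```

Here m + t = 3 nodes are removed initially and r = 2 of them can be put back, leaving a dismantling set of size n = 1.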

Scale of graph transformation
As a final note, we analyze the sizes of the original graph G = (N, E) and the factor graph G' = (N', E'). Write N' = N + ΔN and E' = E − ΔE (considering the net reduction of edges), where E ≥ N − 1 and E' ≥ N' − 1, with equality when G or G' is a tree. Consider the identification of cliques up to order γ, setting aside the possible existence of incomplete factors (i.e., cliques with certain edges missing). G is transformed into G' with factor nodes inserted, where each factor node can have up to γ links. Since an edge in the factor graph always connects one real node and one factor node, we have N' − 1 ≤ E' ≤ γΔN, and

[…] (21)

The denser the original G is, the closer ΔN/N is to 1/(γ − 1), i.e., the fewer factor nodes are added to the graph. Consider a specific complete γ-clique $s_\gamma$ with $e_\gamma$ edges. After the transformation, there is

$$\Delta e_\gamma = e_\gamma - \gamma = \frac{\gamma(\gamma - 1)}{2} - \gamma = \frac{\gamma(\gamma - 3)}{2};$$

$\Delta e_\gamma > 0$ for γ > 3 and increases monotonically with γ. For incomplete cliques, $\Delta e_\gamma$ is smaller, and γ(γ − 3)/2 is the upper bound. The number of cliques identified is the number of factor nodes ΔN. Thus the largest possible reduction of edges ΔE is reached when all cliques are complete γ-cliques, which is also when the lower bound of (21) is reached. Therefore

$$\Delta E = \sum \{\Delta e_m\}_{|s_m| = \Delta N;\; m \in [2, \gamma]}$$ […] (22)

Equations (21) and (22) determine the scale of graph transformation. The upper/lower bound of (21) corresponds to the lower/upper bound of (22). The former case is when the original graph is the simplest (a tree or quasi-tree): in this case, the largest number of factor nodes is added to the original graph, and the edge reduction is the smallest (edges are even added to the graph). The resulting factor graph is thus the least favorable for the dismantling task. Yet this case is also where the factor-graph approach is least required, as dismantling can already be well resolved on the original tree-like graph via traditional methods (e.g., BPD).
The latter case is when the original graph is most clustered (i.e., full of γ-cliques), but now the number of additional factor nodes is the smallest, and edge reduction is the largest as well. Indeed, this is when we need the new factor-graph solution the most to conduct efficient network dismantling, while the factor graph is now also the most simplified in our favor. This reveals an inherent trade-off in the applicability of BPD vs FBPD (also CoreHD vs FCoreHD): the utility of factor-graph transformation (thus the utility of FBPD/FCoreHD) does not interfere with what we could achieve without this technique, while desirably providing a family of new solutions (through the manipulation of factors) which essentially span an algorithm spectrum.
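As a small worked check of the per-clique edge reduction, the following sketch compares the number of edges inside a complete γ-clique with the γ links of the factor node that replaces it, and confirms the closed form γ(γ − 3)/2 quoted above. The function name is illustrative.

```python
def clique_edge_reduction(gamma):
    """Net number of edges saved when a complete gamma-clique becomes one factor node."""
    original_edges = gamma * (gamma - 1) // 2   # edges inside the complete clique
    factor_edges = gamma                         # one link per member node
    return original_edges - factor_edges

reductions = {gamma: clique_edge_reduction(gamma) for gamma in range(2, 7)}
print(reductions)  # {2: -1, 3: 0, 4: 2, 5: 5, 6: 9}
```

A two-clique (a simple edge) costs one extra edge, a triangle is neutral, and the saving grows from γ = 4 onwards, which is consistent with the tree-like case adding edges and the densely clustered case removing the most.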

Discussions and concluding remarks
The re-insertion procedure adopted by many dismantling algorithms [3][4][5], regarded by many as being ad-hoc, is desirably subdued in scale under the new factor-graph approach (table 2), as a direct result of the preservation of local structures during graph transformation. Essentially, this solution framework allows for the consideration of miscellaneous local structures besides cliques (e.g., squares, triangles sharing edges etc), which are commonly seen in real-world networks especially of certain functional types (e.g., protein interaction networks, road networks). Although the incorporation of incomplete cliques would let the message-passing in the factor graph deviate from that in the original graph (section 4, theory B), this is certainly supported in the new message-passing scheme guided by the novel spin-glass picture (section 3, theory A). It will be interesting to investigate, for example, if network decycling and dismantling results could be further improved and if the re-insertion procedure could be further reduced, although the computational burden could be daunting during such elaboration.
In the experiments we tested such an extension. Instead of identifying small complete cliques as graph factors, we construct larger structures as factors. Specifically, starting from a triangle, we add all neighboring triangles to the structure until no more triangles can be added. This agglomerate is then viewed as a single graph factor. Evidently, such a method identifies structures that are much larger than in the base case, and the graph reduction is pushed to a much higher level where short-range structures of various sorts are largely overlooked (we also tested starting from a tetrahedron and adding neighboring tetrahedra, which leads to slightly smaller structures; the discussion below applies similarly). The results suggest, however, that this treatment does not lead to a smaller dismantling set. The FVS does indeed shrink, as the factor graph is now much more simplified; yet during re-insertion, since a removed node in the FVS now often entails large clusters (as factors are now huge), the likelihood of a successful re-insertion that does not violate the dismantling size constraint is much lowered. Therefore, with a smaller FVS but an even smaller number of re-insertions, the overall result is downgraded.
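The agglomeration of triangles into larger factors can be sketched as follows. This is one possible reading only: "neighboring" triangles are assumed here to be triangles that share an edge, and the function name, the brute-force triangle search, and the example graph are all illustrative choices rather than the procedure used in the experiments.

```python
from itertools import combinations

def triangle_agglomerates(edges):
    """Group triangles that share an edge and return the node set of each group."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Brute-force triangle listing; fine for small illustrative graphs.
    triangles = [t for t in combinations(sorted(adj), 3)
                 if t[1] in adj[t[0]] and t[2] in adj[t[0]] and t[2] in adj[t[1]]]

    parent = list(range(len(triangles)))     # union-find over triangle indices

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    owner = {}                               # edge -> first triangle containing it
    for idx, tri in enumerate(triangles):
        for pair in combinations(tri, 2):
            if pair in owner:
                parent[find(idx)] = find(owner[pair])   # merge edge-sharing triangles
            else:
                owner[pair] = idx

    groups = {}
    for idx, tri in enumerate(triangles):
        groups.setdefault(find(idx), set()).update(tri)
    return sorted(sorted(g) for g in groups.values())

# Two triangles sharing edge (1, 2), a bridge (3, 4), and a separate triangle.
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6)]
agglomerates = triangle_agglomerates(edges)
print(agglomerates)  # [[0, 1, 2, 3], [4, 5, 6]]
```

In this toy graph the two edge-sharing triangles merge into one four-node factor, while the bridge edge (3, 4) would remain a two-clique factor.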
The current spin-glass theories are not sufficiently binding when the graph is very densely clustered (as for random regular triangle/tetrahedron graphs with a large triangle/tetrahedron degree). In future work, higher-order effects beyond the replica-symmetric spin-glass theory might be further considered in the model. The binding of the mathematical lower bounds of the FVS size for random regular factor graphs can certainly be improved as well.