Attack robustness and stability of generalized k-cores

Earlier studies on network robustness have mainly focused on the integrity of functional components such as the giant connected component in a network. Generalized k-core (Gk-core) has been recently investigated as a core structure obtained via a k-leaf removal procedure extending the well-known leaf removal algorithm. Here, we study analytically and numerically the network robustness in terms of the numbers of nodes and edges in Gk-core against random attacks (RA), localized attacks (LA) and targeted attacks (TA), respectively. In addition, we introduce the concept of Gk-core stability to quantify the extent to which the Gk-core of a network contains the same nodes under independent multiple RA, LA and TA, respectively. The relationship between Gk-core robustness and stability has been studied under our developed percolation framework, which is of significance in better understanding and design of resilient networks.


Introduction
In network science, robustness refers to the ability of a network to survive random failures or intentional attacks. Much work has been carried out to explore the robustness of networked systems by revealing the size of their functional components through percolation theory [1,2]. In this context, the most studied functional components in networks include the giant connected component [3,4], the k-core [5][6][7] and the core [8][9][10]. A recent study [11] considered a k-leaf removal procedure for $k \ge 2$, which leads to the generalized k-core (or Gk-core) by progressively removing k-leaves, i.e. nodes with degree less than k, together with their nearest neighbors and all incident edges. The resulting subgraph is equivalent to the ordinary core in the case of k=2 [8].
The Gk-core naturally characterizes for example the robustness of networks suffering from virus infection deactivating weak nodes (i.e. k-leaves) and their nearest neighbors.
However, in most real situations, such as power grid blackouts, market crashes, and brain seizures, knowing only the size of a functional component is not sufficient to understand network robustness. The potential damage and recovery of the network crucially rely on the location of the functional components. Reference [12] showed a pronounced variation of giant component sizes corresponding to two correlated random realizations of percolation, suggesting that the role of individual nodes changes from one percolation realization to another. Stability was recently introduced in [13] as a novel measure to quantify the extent to which the giant connected component of a network consists of the same nodes under multiple independent edge percolation. Interestingly, robustness and stability were found to be consistent in single-layer networks but do not always imply each other in networks with interdependency links.
Here we extend the stability of giant components under independent edge percolation to the stability of the Gk-core under a range of attacks including random attacks (RA) [14][15][16][17], localized attacks (LA) [18][19][20], and targeted attacks (TA) [21][22][23]. Moreover, we investigate how each type of attack influences the network robustness in terms of the numbers of nodes and edges in the Gk-cores. It is found that the effect of an LA is exactly the same as that of an RA on Erdős-Rényi (ER) networks in terms of both robustness and stability of the Gk-core. Interestingly, the analogous equivalence recurs in exponential networks but between LA and TA (see table 1). Under our percolation framework, we observe a discontinuous percolation transition for the Gk-core with $k \ge 3$ and a continuous percolation transition for the G2-core in all attack scenarios. The relationship between robustness and stability is explored in three stylized network models as well as a real-world social network. We find excellent agreement between theoretical calculations and numerical simulations.
The rest of the work is organized as follows. The analytical frameworks for attack robustness and stability based on generating function formalism are established in section 2. Numerical studies for synthetic networks and an example of a large-scale real-life network are given in sections 3 and 4, respectively. The conclusion is drawn in section 5.

Theoretical results
In this section, we consider a random network model with an arbitrary degree distribution. Specifically, let P(q) be the probability that a randomly chosen node has degree q. Let n and l be the numbers of nodes and edges in the network, respectively. Following [3,24], the generating function for the degree distribution is defined by $G_0(x) = \sum_q P(q) x^q$, and $G_1(x) = G_0'(x)/G_0'(1)$ is the generating function for the excess degree distribution. Clearly, $2l/n = G_0'(1)$ is the average degree. We are interested in the numbers of nodes and edges as well as the stability of the Gk-core when a fraction 1−p of nodes are removed according to RA, LA, and TA.
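The definitions of $G_0$, $G_1$ and the average degree can be checked numerically. The following is a minimal sketch, not part of the original analysis; the function names, the truncation at degree 59, and the Poisson example with mean 4 are illustrative assumptions.

```python
import math

def G0(P, x):
    """Degree generating function G_0(x) = sum_q P(q) x^q."""
    return sum(p * x**q for q, p in enumerate(P))

def G0_prime(P, x):
    """First derivative G_0'(x) = sum_q q P(q) x^(q-1)."""
    return sum(q * p * x**(q - 1) for q, p in enumerate(P) if q > 0)

def G1(P, x):
    """Excess degree generating function G_1(x) = G_0'(x) / G_0'(1)."""
    return G0_prime(P, x) / G0_prime(P, 1.0)

# Illustrative input: Poisson degree distribution with mean lam,
# truncated at degree 59 (the neglected tail is tiny for lam = 4).
lam = 4.0
P = [math.exp(-lam) * lam**q / math.factorial(q) for q in range(60)]

# G_0'(1) is the average degree, i.e. 2l/n for a network with this P(q).
mean_degree = G0_prime(P, 1.0)
```

For a Poisson distribution the excess degree generating function coincides with $G_0$, which gives a quick sanity check of the helper functions.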
Under RA, we define the stability of the Gk-core as the fraction of nodes that belong to the Gk-core in every one of ℓ independent realizations of random attacks on the network. Namely,
$$S_k^{RA}(\ell, p) = \frac{1}{n} \left| \bigcap_{t=1}^{\ell} \text{Gk-core}_t \right|, \qquad (1)$$
where $\text{Gk-core}_t$ denotes the set of nodes in the Gk-core in the tth realization of RA (in which a fraction 1−p of nodes are randomly removed), and |·| denotes the size of a set. We will omit the superscript or the parameters ℓ, p in (1) (and in similar notations later) when no confusion arises. For LA and TA, the corresponding stability is defined similarly. Equation (1) extends the stability concept of the giant component [13] and characterizes the extent to which the Gk-core is stable regardless of the specific damage caused during an attack.
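Given the node sets of the Gk-cores obtained in ℓ independent attack realizations, the stability defined above is an intersection size divided by n. A minimal sketch follows; the function name `stability` and the toy node sets are illustrative assumptions.

```python
def stability(cores, n):
    """Fraction of the n network nodes that lie in every given node set."""
    if not cores:
        raise ValueError("need at least one realization")
    common = set(cores[0])
    for core in cores[1:]:
        common &= set(core)   # keep only nodes present in all realizations
    return len(common) / n

# Toy example with n = 10 nodes and three realizations.
cores = [{0, 1, 2, 3}, {1, 2, 3, 4}, {2, 3, 5}]
s3 = stability(cores, 10)       # only nodes 2 and 3 survive in all three
s1 = stability(cores[:1], 10)   # for a single realization: relative core size
```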
For a given p, if the expected size of the Gk-core under consideration is denoted by m, it is useful to compare $S_k(\ell, p)$ with the 'stability' of a random subset:
$$S(\ell) = \frac{1}{n} \left| \bigcap_{t=1}^{\ell} R_t \right|, \qquad (2)$$
where $R_t$ is the tth realization of a random set of size m sampled randomly from the network with replacement. The size $\left|\bigcap_{t=1}^{\ell} R_t\right|$ follows the binomial distribution $\mathrm{Bin}(n, (m/n)^{\ell})$ with expectation $n (m/n)^{\ell}$. Hence, the expected S(ℓ) decays exponentially in the form $(m/n)^{\ell}$ as a function of ℓ.
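The exponential decay of the random-subset baseline can be illustrated by direct sampling. This is a sketch under stated assumptions: each subset is drawn uniformly with exactly m distinct nodes, independently across realizations, and the sizes n = 20000 and m = 16000 are arbitrary illustrative values.

```python
import random

def random_subset_stability(n, m, ell, rng):
    """Fraction of nodes contained in all of ell independent random m-subsets."""
    common = set(range(n))
    for _ in range(ell):
        common &= set(rng.sample(range(n), m))
    return len(common) / n

rng = random.Random(0)
n, m = 20000, 16000
results = {}
for ell in (1, 2, 4):
    simulated = random_subset_stability(n, m, ell, rng)
    expected = (m / n) ** ell          # expected stability (m/n)^ell
    results[ell] = (simulated, expected)
```

Each entry of `results` pairs a simulated value with $(m/n)^{\ell}$; for large n the two should lie close together.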

Generalized k-core under random attacks
A random attack removing a fraction 1−p of nodes from the network can be considered as a two-stage process: first the nodes are removed while the edges connecting the remaining nodes to the removed nodes are kept, and then those edges are removed. In other words, only nodes are deleted in the first stage and only edges in the second stage. Since the network is randomly connected, the probability that a random edge leaves a removed node equals the ratio of the number of edges leaving the removed nodes in the first stage to the total number of edges leaving all nodes in the original network. Hence, the probability for an edge to leave a removed node can be calculated as
$$\frac{(1-p)\, n\, G_0'(1)}{n\, G_0'(1)} = 1 - p. \qquad (3)$$
The generating function of the degree distribution of the resulting network after the random attack, denoted by $\hat{G}_0(x)$, becomes
$$\hat{G}_0(x) = G_0(1 - p + p x) \qquad (4)$$
following [16], because deleting the edges leaving the removed nodes is equivalent to deleting a 1−p fraction of edges randomly in the second stage.
Given $k \ge 2$, the Gk-core is obtained by an iterative pruning procedure. At each time step, a randomly chosen k-leaf is removed together with its nearest neighbors and all their incident edges. The procedure continues until no k-leaves exist in the remaining network. The resulting subgraph is called the Gk-core. Following the approach of [11], the nodes can be split into three categories: if a node can become a leaf, it is called α-removable; if a node can become a neighbor of a leaf, it is called β-removable; if a node belongs to the Gk-core, it is called non-removable. Assume that, following a randomly selected edge, the node we arrive at is α-removable, β-removable, and non-removable with probability α, β, and 1−α−β, respectively. Invoking (4), these probabilities are given by equations (5) and (6), derived as in [11]. The relative size $n_k^{RA}(p)$ and the number of edges $l_k^{RA}(p)$ of the Gk-core follow in (7) and (8), where $(1-\alpha-\beta)^2$ is the probability that both end nodes of a random edge belong to the Gk-core, and $p^2 l$ corresponds to the number of edges in the network after the random attack.
Note that $l_k^{RA}(p)$ can also be derived by using the generating function (4).
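The k-leaf pruning procedure described above can be sketched on an explicit graph as follows. This is an illustrative implementation, not the code used for the simulations in this work; the function name `gk_core` and the adjacency-dictionary input are assumptions, and k-leaves are processed in stack order rather than in uniformly random order.

```python
def gk_core(adj, k):
    """Return the node set left after k-leaf removal.

    adj maps each node to the set of its neighbors (undirected graph).
    Nodes of degree < k are removed together with their nearest neighbors
    and all incident edges until no such node remains.
    """
    adj = {u: set(vs) for u, vs in adj.items()}    # work on a copy
    leaves = [u for u in adj if len(adj[u]) < k]   # current k-leaves
    while leaves:
        u = leaves.pop()
        if u not in adj:
            continue                               # removed earlier
        doomed = {u} | adj[u]                      # the k-leaf and its neighbors
        affected = set()
        for v in doomed:
            affected |= adj[v]
        affected -= doomed                         # survivors that lose edges
        for v in doomed:
            del adj[v]
        for w in affected:
            adj[w] -= doomed
            if len(adj[w]) < k:
                leaves.append(w)                   # w has become a k-leaf
    return set(adj)

# Example: complete graph on nodes 0..4 plus a pendant node 5 attached to node 0.
K5 = {u: {v for v in range(5) if v != u} for u in range(5)}
G = {u: set(vs) for u, vs in K5.items()}
G[0].add(5)
G[5] = {0}
core2 = gk_core(G, 2)   # node 5 is a 2-leaf; it is removed with its neighbor 0
```

In this example the pendant node takes its high-degree neighbor with it, which is the mechanism that distinguishes k-leaf removal from ordinary k-core pruning.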
Next, we study the stability $S_k^{RA}(\ell, p)$. Note that a node is in the Gk-core if it has at least k neighbors which are also in the Gk-core. By using (1), (4), and the repeated differentiation of the generating function [3], we obtain the expected stability in (9), where the first factor p is the occupation probability of RA, $\hat{P}(q)$ is the degree distribution after RA, $\binom{q}{s}$ is the binomial coefficient, and the term in the brackets is the probability that a randomly chosen node belongs to the Gk-core given that it has degree q. The last equality in (9) follows directly from the differentiation property as well as the binomial theorem. Note that a one-line calculation validates $S_k^{RA}(1, p) = n_k^{RA}(p)$ in the light of (7) and (9).

Generalized k-core under localized attacks
A localized attack is performed by randomly deleting a seed node in the network, then its nearest neighbors, then its second nearest neighbors, and so on until a fraction 1−p of nodes in the whole network are removed. The generating function of the degree distribution of the resulting network after the localized attack, denoted by $\hat{G}_0(x)$, is given in (10) following [18]. Using (10) and following the approach of [11], the probabilities α and β after Gk-core percolation are established by (11) and (12). The relative size $n_k^{LA}(p)$ in (13) and the number of edges $l_k^{LA}(p)$ in (14) can be computed by (10) similarly as in (8); the expression for $l_k^{LA}(p)$ involves the average degree of the network after the localized attack. To find the stability of the Gk-core under LA, we argue similarly as in the RA scenario. It follows from (1) and (10) that the expected stability $S_k^{LA}(\ell, p)$ is given by (15), where the term in the brackets is the probability that a randomly chosen node belongs to the Gk-core given that it has degree q. It can be directly checked that $S_k^{LA}(1, p) = n_k^{LA}(p)$ by (13) and (15).
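The shell-by-shell removal that defines a localized attack can be sketched as a breadth-first traversal from a random seed. The function name `localized_attack` is hypothetical, and two details are assumptions of this sketch: the last shell is removed only partially (in arbitrary order) once the target is reached, and a new random seed is drawn if the seed's component is exhausted early.

```python
import random
from collections import deque

def localized_attack(adj, p, rng):
    """Return the set of nodes removed by one localized attack.

    adj maps each node to the set of its neighbors; a fraction 1 - p of the
    nodes is removed, starting from a random seed and proceeding shell by shell.
    """
    nodes = list(adj)
    target = round((1 - p) * len(nodes))   # number of nodes to remove
    removed = set()
    queue = deque()
    while len(removed) < target:
        if not queue:
            seed = rng.choice(nodes)       # new seed (also after an exhausted component)
            if seed not in removed:
                removed.add(seed)
                queue.append(seed)
            continue
        u = queue.popleft()
        for v in adj[u]:
            if len(removed) >= target:
                break
            if v not in removed:
                removed.add(v)
                queue.append(v)
    return removed

# Example: on a ring of 10 nodes, removing 30% yields a seed and its two neighbors.
ring = {i: {(i - 1) % 10, (i + 1) % 10} for i in range(10)}
hole = localized_attack(ring, 0.7, random.Random(0))
```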

Generalized k-core under targeted attacks
A targeted attack removing a fraction 1−p of nodes from the network can be conveniently implemented by assigning a weight or probability to each node. For a node i with degree $q_i$, we set the canonical removal probability as
$$\frac{q_i^{\gamma}}{\sum_{j=1}^{n} q_j^{\gamma}}. \qquad (16)$$
When γ>0, a node with higher degree is more likely to be deleted; when γ<0, a node with lower degree is instead more likely to be deleted. The case $\gamma \to \infty$ corresponds to intentional deletion according to the fully sorted degree sequence. In particular, γ=0 is equivalent to RA, where each node is deleted with equal probability. Following [21], by introducing an auxiliary generating function, we obtain the generating function for the degree distribution of the resulting network after the targeted attack in (17), in which $\tilde{p}$ denotes the probability that an edge does not leave a removed node. Likewise, using (17) and following the approach of [11], the probabilities α and β after Gk-core percolation are established by (18) and (19). The relative size $n_k^{TA}(p)$ in (20) and the number of edges $l_k^{TA}(p)$ in (21) can be computed similarly; the expression for $l_k^{TA}(p)$ involves the average degree of the network after the targeted attack. Finally, we consider the stability $S_k^{TA}(\ell, p)$. Note that a node is in the Gk-core if it has at least k neighbors which are also in the Gk-core. By using (1), (17), and the differentiation property of the generating function [3], we obtain the expected stability in (22), where, as in the RA and LA cases, the term in the brackets is the probability that a randomly chosen node belongs to the Gk-core given that it has degree q. It follows from (20) and (22) that $S_k^{TA}(1, p) = n_k^{TA}(p)$, confirming the idea that stability at ℓ=1 is equivalent to the node fraction of the Gk-core. Moreover, it is easy to check that the case of RA is recovered for γ=0, since then $\tilde{p} = p$ and (17) reduces to (4).
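A degree-based targeted attack with removal probability proportional to $q_i^{\gamma}$ can be sketched as sequential weighted sampling without replacement. The function name `targeted_attack` is hypothetical, and the sketch assumes that weights are computed once from the original degrees and that all degrees are positive (a zero degree would need special handling for γ<0).

```python
import random

def targeted_attack(degrees, p, gamma, rng):
    """Return the set of nodes removed by one targeted attack.

    degrees maps node -> degree (assumed positive). A fraction 1 - p of the
    nodes is removed one at a time; each draw picks a remaining node with
    probability proportional to degree ** gamma.
    """
    nodes = list(degrees)
    weights = [degrees[u] ** gamma for u in nodes]
    target = round((1 - p) * len(nodes))
    removed = set()
    while len(removed) < target:
        i = rng.choices(range(len(nodes)), weights=weights)[0]
        removed.add(nodes[i])
        weights[i] = 0.0          # a removed node cannot be drawn again
    return removed

# Example: one hub of degree 5 and three nodes of degree 1.
degrees = {0: 5, 1: 1, 2: 1, 3: 1}
hit_hub = targeted_attack(degrees, 0.75, 50, random.Random(1))    # large gamma
uniform = targeted_attack(degrees, 0.5, 0, random.Random(1))      # gamma = 0
```

With γ=0 all weights equal one, so the draw is uniform and the procedure coincides with a random attack; for large γ the hub is removed first with overwhelming probability.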

Synthetic networks
We apply the above analytical framework for attack robustness and stability to three types of complex networks: homogeneous random networks following Poisson degree distributions and degenerate degree distributions, and quasi-heavy-tailed networks following exponential degree distributions. Numerical simulations are based on networks with $n = 10^7$ nodes. Here, we leave out scale-free networks since they only have a trivial Gk-core for all $k \ge 2$ [11]. In table 1, we summarize the robustness and stability of the Gk-core under RA, LA, and TA for these benchmark networks.

ER networks
For an ER network with average degree λ, the degree distribution follows $P(q) = \lambda^q e^{-\lambda}/q!$ for $q \ge 0$, and $G_0(x) = e^{\lambda(x-1)}$. In figure 1 we display the behavior of $n_k(p)$ and $l_k(p)$ as functions of the occupation fraction p for ER networks with λ=10 under RA, LA, and TA with γ=1. The agreement between simulations and results from the generating function formalism is good. Some interesting observations are as follows.
The effect of an RA is exactly the same as that of an LA in terms of both $n_k$ and $l_k$. In fact, for the Poisson degree distribution the LA expressions, see (14), reduce to those obtained under RA. This coincidence for ER networks has been observed for giant connected components in [18][19][20]25] and for cores in [26]. It can be intuitively explained as a competition between degree heterogeneity, where high-degree nodes are more likely to sit in the attack hole of LA, and localization, where only surface nodes of the hole are connected to the remaining nodes. These two competing forces of LA reach a balance in ER networks, giving rise to the same damage measured by Gk-cores as an RA does.
As the occupation probability p varies, a continuous phase transition is observed for k=2 for all three types of attacks, while there is first-order percolation transition behavior for $k \ge 3$ in all attack scenarios (see figure 1). This is similar to k-core percolation [26].
When comparing damage caused to the Gk-core under TA with that under RA or LA, we find that TA is more harmful as high-degree nodes tend to be deleted in early stages dismantling the Gk-cores. From both figures 1(a) and (b), we note that this influence however gets smaller when k increases. This phenomenon can be explained as k-leaves for larger k are more likely to connect to some high-degree nodes. Therefore, removing these k-leaves will lead to the deletion of high-degree nodes. This effect turns out to be comparable to the TA with γ=1 considered here in ER networks for k=4.
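For the RA case on ER networks, the post-attack degree distribution can be worked out explicitly. The short derivation below is a sketch that assumes the post-attack generating function takes the bond-thinning form $G_0(1 - p + px)$, i.e. each edge of a surviving node is retained independently with probability p.

```latex
\hat{G}_0(x) = G_0(1 - p + p x)
             = e^{\lambda\,(1 - p + p x - 1)}
             = e^{\lambda p\,(x - 1)} .
```

Under this assumption the surviving network is again Poissonian, with average degree λp in place of λ; for λ=10 and p=0.8, for instance, the average degree after the attack is 8.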
Next, we explore the relationship between Gk-core stability of ER networks under RA, LA, and TA. Several interesting observations can be derived from figure 2, where $S_k(p, \ell)$ is shown for Gk-cores with relative size 0.8 under all three types of attacks. Firstly, $S_2 > S_3 > S_4 > S = 0.8^{\ell}$ (see equation (2)) for any given ℓ under all attack scenarios. This means that the inner Gk-core with larger k tends to be less stable, which agrees with our intuition: the more nodes (k-leaves and their neighbors) are deleted, the more fluctuation is introduced. All $S_k$ seem to decay exponentially but at slower rates than in the random subset scenario. Secondly, we have $S_k^{RA} = S_k^{LA} > S_k^{TA}$ for all k and ℓ. The equivalence of $S_k^{RA}$ and $S_k^{LA}$ can be shown directly by using (9) and (15). Under TA, we are more likely to remove high-degree nodes, which are often 'anchor nodes' [13] in Gk-cores. Thus, $S_k^{TA}$ drops lower than $S_k^{RA}$ for any given k. We can conclude from figures 1 and 2 that attack robustness and stability of Gk-cores for ER networks are generally consistent: lower k and milder attacks result in more robust and stable Gk-cores; see table 1. The stability of the Gk-core has been shown in figure 2 (and below in figures 4 and 6) for the relative size of 0.8 as an example. It has qualitatively the same behavior for other relative sizes.

Random regular networks
For a random regular network in which every node has degree $q_0$, the degree distribution follows $P(q) = \delta_{q, q_0}$ for $q \ge 0$, and $G_0(x) = x^{q_0}$. Figures 3 and 4 correspond to the network with $q_0 = 8$. Note that the two strategies RA and TA coincide for any value of γ since all nodes have the same degree in the initial network.
As in ER networks, a continuous phase transition is observed for $k = 2$ in all attack scenarios, and there is first-order percolation transition behavior for $k \ge 3$ in all attack scenarios (see figure 3). Interestingly, in all cases RA seems to cause more damage to the network than LA does, reminiscent of the giant-component-based results observed in [18][19][20]. The localization effect takes over in LA and leads to a larger Gk-core for any given k.
When it comes to the Gk-core stability results under RA and LA shown in figure 4, we find that $S_k^{LA} > S_k^{RA}$ for all k and ℓ. This means that the Gk-core under LA is more stable than under RA, in line with the robustness results. Similar to ER networks, we observe a natural hierarchy $S_{k_1} > S_{k_2} > S$ for $k_1 < k_2$ under all attacks, indicating that the inner Gk-core with larger k is less stable. It is worth mentioning that G2-cores in random regular networks are more stable than in ER networks, as $S_2$ shown in figure 4 is almost level under both RA and LA. This is because random regular networks have a much narrower degree distribution than ER networks even after attacks, which produces fewer low-degree nodes such as 2-leaves and thereby stabilizes the G2-core.

Exponential networks
An exponential network has the degree distribution $P(q) = (1 - e^{-1/\sigma})\, e^{-q/\sigma}$ for $q \ge 0$, with average degree $1/(e^{1/\sigma} - 1)$, which is approximately equal to the parameter σ for large σ. Figure 5 shows the behavior of $n_k(p)$ and $l_k(p)$ as functions of the occupation fraction p for exponential networks with σ=80 under RA, LA, and TA with γ=1.
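The statement that the average degree is close to σ for large σ can be checked numerically. The sketch below assumes the normalized form $P(q) = (1 - e^{-1/\sigma}) e^{-q/\sigma}$ on $q \ge 0$; the function name and the truncation at degree 5000 are illustrative choices.

```python
import math

def exponential_degree_pmf(sigma, q_max):
    """P(q) = (1 - exp(-1/sigma)) * exp(-q/sigma) for q = 0..q_max (assumed form)."""
    norm = 1.0 - math.exp(-1.0 / sigma)
    return [norm * math.exp(-q / sigma) for q in range(q_max + 1)]

sigma = 80.0
P = exponential_degree_pmf(sigma, 5000)               # tail beyond 5000 is negligible
mean = sum(q * pq for q, pq in enumerate(P))          # numerical average degree
closed_form = 1.0 / (math.exp(1.0 / sigma) - 1.0)     # mean of the geometric law
```

For σ=80 both values are within one unit of σ, consistent with the approximation used in the text.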

A real-world network example
We compare the attack robustness and stability of a real-world network against RA, LA and TA by using an actor collaboration network constructed from IMDb in the year 2004 [29,30]. The network has 1092431 nodes representing movie actors, and 56263702 edges with two actors sharing an edge between them if they ever played in a movie together. The degree distribution is skewed as shown in figure 7. We present the relative size and normalized number of edges in Gk-cores under RA, LA, and TA in figure 8(a) and stability of Gk-cores in figure 8(b). More data are collected in tables 2 and 3.
From figure 8(a) and table 2 we observe that TA is the most harmful attack and RA is the mildest one among the three types of attacks on the actor collaboration network. For example, when 10% of nodes are removed from the network, $n_k^{TA} < n_k^{LA} < n_k^{RA}$ and $l_k^{TA} < l_k^{LA} < l_k^{RA}$ for both the G2-core and the G5-core. The stability displayed in figure 8(b) shows poor agreement between analytical and simulation results, which suggests that structural features such as degree correlations and clustering may have a non-negligible effect on Gk-core stability. Nevertheless, figure 8(b) and table 3 give a tangible understanding of how stable the Gk-cores are: the Gk-core under TA exhibits the lowest stability, while under RA it exhibits the highest. For example, more than 56% (i.e. 0.436/0.774) of the nodes that constitute the G2-core after an RA will remain in the G2-core after 10 independent repetitions of such an RA. This number decreases to about 42% (i.e. 0.297/0.702) for LA and further to about 24% (i.e. 0.145/0.613) for TA, supporting our theoretical results for networks with heterogeneous degree distributions.
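The retained fractions quoted above are ratios of the stability after 10 repetitions to the relative G2-core size after a single attack. The short computation below reproduces them from the quoted pairs; the dictionary layout is an illustrative choice.

```python
# (stability after 10 independent attacks, relative G2-core size), as quoted in the text
quoted = {"RA": (0.436, 0.774), "LA": (0.297, 0.702), "TA": (0.145, 0.613)}

# Fraction of G2-core nodes that stay in the G2-core across all 10 repetitions
retained = {attack: s10 / size for attack, (s10, size) in quoted.items()}
```

The resulting values are roughly 0.563, 0.423 and 0.237 for RA, LA and TA, matching the percentages stated in the text.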

Conclusion
In summary, we have studied the robustness and stability of Gk-cores of uncorrelated random networks with arbitrary degree distributions. We develop a theoretical framework to systematically gauge network robustness in terms of the relative size and normalized number of edges of the Gk-core under RA, LA, and TA. It is found that a continuous phase transition exists only for the G2-core under all three types of attacks, and discontinuous transitions are found for the Gk-core with $k \ge 3$ in all scenarios. We introduce the Gk-core stability and show how different types of attacks affect it. Similarities in the organizing principles underpinning attack robustness and stability are identified, but the two measures are by no means interchangeable, especially for heterogeneous networks. Methods presented in this work hold promise for the design and reinforcement of resilient networked systems.

Acknowledgments
The author would like to thank the reviewers for careful reading and valuable comments. This work was supported by NNSFC (11505127) and Northumbria University.