Maximizing Network Resilience against Malicious Attacks

The threat of malicious attack is one of the major security problems in complex networks. Resilience is the system-level, self-adjusting ability of a complex network to retain its basic functionality and recover rapidly from major disruptions. Despite numerous heuristic enhancement methods, a research gap remains in maximizing network resilience: current heuristic methods are designed to immunize vital nodes or to modify a network into a specific onion-like structure, and they cannot maximize resilience theoretically via network structure. Here we map complex networks onto a physical elastic system to introduce indices of network resilience, and we propose a unified theoretical framework and a general approach that addresses the resilience optimization problem by slightly modifying the network structure (i.e., by adding a set of structural edges). We demonstrate the high efficiency of this approach on three realistic networks and two artificial random networks. Case studies show that the proposed approach can maximize the resilience of complex networks while maintaining their topological functionality. The approach helps to unveil hitherto hidden functions of some inconspicuous components, which in turn can be used to guide the design of resilient systems, offer an effective and efficient way to mitigate malicious attacks, and furnish self-healing to reconstruct failed infrastructure systems.

where F is the external force (stress), σ is the elastic deformation and σ_c is the critical elastic deformation. For a linear physical elastic system, the external force (or the elastic deformation) is also a resilience metric, and it is equivalent to the elastic potential energy. The value of E_p in Equation (1) lies in the range [0, ∞).
In analogy with the physical elastic system, the proposed network resilience refers to the deformation of a network under an external force, which includes initial attacks, disruptions or perturbations. Let the fraction of removed nodes, q, represent the external force, and let the fraction of failed nodes, 1 − G(q), denote the elastic deformation under that force, where G(q) is the fraction of nodes in the largest (giant) connected component 3,12,29 . We also suppose that the size and shape of a complex network can be restored during elastic deformation if the external force is withdrawn. The elastic potential energy of a complex network, E_p, is then given by Equation (2) (see also Fig. S1a, and the detailed explanation in Supplementary Information Section S1). Because the network is a nonlinear discrete-time system, the integral in Equation (2) can only be evaluated numerically; Equations (3) and (4) give its numerical versions using the rectangular and trapezoid approximations, respectively, where N is the total number of nodes in the network, 1/N is the normalized minimum integration step, which corresponds to dq in Equation (2), and q, q_l and q_{l−1} are fractions of removed nodes with q_l − q_{l−1} = 1/N. The value of E_p in Equations (3) and (4) lies strictly in the range [1/N, 0.5], where the two limits correspond to a star network and a fully connected graph, respectively. This is because (1) a star network breaks down as soon as its vital node is removed, and (2) if a fully connected graph is attacked maliciously (or randomly), the fraction of its largest (giant) connected component equals 1 minus the fraction of attacked nodes, i.e., G(q) = 1 − q.
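The two numerical approximations described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the rectangular sum follows the robustness measure R of ref. 17 (the mean of G(q_l) over the N removal steps), and the trapezoid version is one plausible reading of the description that additionally assumes G(0) = 1 for the intact network.

```python
def energy_rectangular(G):
    """Rectangular-rule estimate of E_p: (1/N) * sum of G(q_l), l = 1..N.

    G[l - 1] is the giant-component fraction after removing l nodes.
    """
    N = len(G)
    return sum(G) / N


def energy_trapezoid(G, G0=1.0):
    """Trapezoid-rule estimate of E_p with step 1/N.

    G0 is the giant-component fraction before any removal; taking it
    to be 1.0 (a connected intact network) is an assumption.
    """
    N = len(G)
    values = [G0] + list(G)
    return sum((values[l - 1] + values[l]) / 2 for l in range(1, N + 1)) / N


N = 10
# Fully connected graph: G(q) = 1 - q at q = l/N.
G_complete = [1 - l / N for l in range(1, N + 1)]
# Star network under a targeted attack: the hub goes first, leaving
# isolated nodes (largest component of size 1) until nothing is left.
G_star = [1 / N] * (N - 1) + [0.0]

print(energy_rectangular(G_complete), energy_trapezoid(G_complete))
print(energy_rectangular(G_star))
```

For the fully connected graph the rectangular sum gives (N − 1)/(2N), which approaches the upper limit 0.5 as N grows, while the star network gives (N − 1)/N², close to the lower limit 1/N.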
Although the numerical integration error of Equation (4) is smaller than that of Equation (3) (see the detailed explanation in Supplementary Information Section S1), we use Equation (3) as the numerical version of Equation (2) in the simulations in the Results section, to allow comparison with the method in ref. 17 . Note that in ref. 17 , the right side of Equation (3) is defined only as a robustness measure R, without a mathematical derivation or physical interpretation. Beyond that, complex networks have other resilience indices, such as an elastic coefficient (also called the modulus of elasticity), the critical external force (critical threshold, q_c) and the elastic complementary energy (all defined in Supplementary Information Section S1), just as physical elastic systems do. The traditional measure of network resilience, the critical threshold (q_c), reflects only the critical external force, which makes it unsuitable for nonlinear systems. For a nonlinear network, the elastic potential (or complementary) energy better characterizes the elastic properties because it covers both the elastic coefficient and the critical threshold (see Fig. S1b, and the detailed explanation in Supplementary Information Section S1).
Theoretical framework. If a certain fraction (q) of vital nodes is intentionally removed from a network and the network breaks down into many finite (disconnected) components, i.e., q = q_c, the network undergoes a structural collapse and no giant connected component exists, i.e., G(q_c) = 0. Let the vector C = (C_1, …, C_k, …, C_K) represent the finite components, whose normalized sizes are s_1, …, s_k, …, s_K (s_1 > … > s_k > … > s_K), where k is the rank of a finite component ordered by size and K is the number of finite components in the collapsed network. Similar to the definition of the critical giant components, we define the "weak cores" (e.g., C_{1,c}, C_{2,c} in Fig. 1) as the critical finite components. A critical finite component is a special critical giant component caused by an attack, as a finite component is regarded as a subnet. If an edge is added between a "weak core" and the critical giant component (C_{c,c} in Fig. 1), the failure of the finite component can be avoided unless the critical giant component G(p_c) fails. Therefore, the weak cores can be used for maximizing network resilience. For simplicity, we first investigate the case of adding only one optimal edge, e_{ij}, to maximize the network resilience (elastic potential energy); an optimal set of edges is provided in the follow-up section. There are only four ways to add this edge e_{ij}: (i) in the same finite component, (ii) …, (iii) between two nodes from two different finite components C_a and C_b, where s_a and s_b are the sizes of C_a and C_b, respectively, and (iv) between a finite component C_a (i ∈ C_a) and the critical giant component C_{c,c} (j ∈ C_{c,c}). After adding an edge in any of these four ways, the increment of the elastic potential energy of the network follows from Equation (2) and is given by Equation (5), where G_I(q) and G_O(q) are the giant-component fractions of the modified network (with the edge e_{ij} added) and of the original network, respectively, and ΔE_p ∈ [0, 0.5). Note that in cases (iii) and (iv) the least important nodes in the "weak cores" (C_{i,c}) and in the critical giant component should be selected as the terminal nodes of edge e_{ij}, so that these nodes are unlikely to be targeted by the attack.
For cases (i) and (ii) (see Fig. 2a, where q_a and q_1 are fractions of removed nodes), the increment of elastic potential energy can be obtained from Equation (5). Suppose that two finite components C_a and C_b fail at q_a and q_b (q_a < q_b), respectively, where q_a and q_b are fractions of removed nodes, and that C_{a,c} and C_{b,c} are the corresponding "weak cores" of C_a and C_b. The increments of elastic potential energy in cases (iii) and (iv) are then obtained for each case (see Fig. 2b,c), with Equation (8) giving the increment in case (iv). Comparing the two cases (Fig. 2b), one can see that the increment of elastic potential energy in case (iv) is greater than that in case (iii), because q_c > q_b.
Algorithm. The above analysis shows that the optimal edge, e_{ij}, must be located between a "weak core" and the critical giant component, i.e., case (iv) (a strict theoretical proof is given in Supplementary Information Section S4). Moreover, Equation (8) shows that the increment of the elastic potential energy in case (iv) depends on two key factors: the size of the finite component and its position in the failure sequence (such as q_a in Fig. 2c and q_b in Fig. 2d). A larger finite component and an earlier failure result in a greater increment of elastic potential energy, as shown in Fig. 2c,d. By comparing the increments of elastic potential energy from Equation (8), a ranked set of edges e_{ij} can be obtained, where i ∈ C_k and j ∈ C_{c,c} (or C'_{c,c}, the critical giant component of the modified network). The first element in this set, e_{ij,1}, is an optimal edge which, if added to the network, improves the network resilience maximally. The sequential set of optimal edges is obtained by repeating the above procedure. On this basis, a highly scalable algorithm, PA, is proposed for maximizing resilience. The algorithm terminates when the number of added edges reaches a predefined limit. Fig. 3 shows the overall flowchart of the algorithm (a more detailed description of the PA algorithm is given in Supplementary Information Section S5). By adding the edges from the optimal set sequentially, the resilience of the network can be enhanced maximally.
Figure 2. (b) The increment of elastic potential energy by adding an edge between two nodes from two different finite components (case (iii)). (c, d) The increment of elastic potential energy by adding an edge between the "weak core" C_{a,c} (c) or C_{b,c} (d) and the critical giant component C_{c,c} (case (iv)).
The running time of the resilience-improvement algorithm scales with M, α and K, where M is the number of edges of the network, α (α ≪ N) is the number of pre-set optimal edges and K (K ≪ M) is the number of large finite components (a more detailed explanation is given in Supplementary Information Section S5). In general, the number of large finite components K in a collapsed network is small, because the size distribution of the finite components follows a power law in the tail 29 . This high scalability allows us to find the edges that optimally enhance network resilience in large-scale networks.
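To make the objective concrete, the following sketch evaluates the elastic potential energy under a high degree adaptive (HDA) attack and searches for the single added edge that increases it most. It is deliberately not the PA algorithm: it is an exhaustive search over all missing edges, practical only for very small graphs, and it uses the rectangular sum as the energy. The adjacency-dictionary representation, the function names, the tie-breaking rule and the toy graph are all assumptions made for illustration.

```python
from itertools import combinations


def components(adj):
    """Connected components (as sets of nodes) of an adjacency dict."""
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        comp, stack = {start}, [start]
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if v not in comp:
                    comp.add(v)
                    stack.append(v)
        seen |= comp
        comps.append(comp)
    return comps


def hda_curve(adj):
    """G(q) after each removal of the node with the highest current degree."""
    n = len(adj)
    g = {u: set(vs) for u, vs in adj.items()}  # work on a copy
    curve = []
    while g:
        # Ties are broken by the smallest node label (an arbitrary choice).
        target = max(sorted(g), key=lambda u: len(g[u]))
        for v in g.pop(target):
            g[v].discard(target)
        largest = max((len(c) for c in components(g)), default=0)
        curve.append(largest / n)
    return curve


def energy(adj):
    """Rectangular-sum elastic potential energy under the HDA attack."""
    curve = hda_curve(adj)
    return sum(curve) / len(curve)


def best_single_edge(adj):
    """Brute-force search for the missing edge with the largest energy gain."""
    base = energy(adj)
    best, best_gain = None, 0.0
    for i, j in combinations(sorted(adj), 2):
        if j in adj[i]:
            continue
        trial = {u: set(vs) for u, vs in adj.items()}
        trial[i].add(j)
        trial[j].add(i)
        gain = energy(trial) - base
        if gain > best_gain + 1e-12:
            best, best_gain = (i, j), gain
    return best, best_gain


# Hypothetical toy graph: hub 0 holds together two pairs and two leaves.
toy = {u: set() for u in range(7)}
for u, v in [(0, 1), (1, 2), (0, 3), (3, 4), (0, 5), (0, 6)]:
    toy[u].add(v)
    toy[v].add(u)

edge, gain = best_single_edge(toy)
print(edge, round(gain, 4))
```

In this toy graph, removing the hub splits the network into small pieces, so an edge that ties two of those pieces together raises the energy. The PA algorithm described in the text avoids the exhaustive search by restricting candidates to edges between "weak cores" and the critical giant component.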

Effectiveness.
We demonstrate the efficiency of our approach on the Zachary (Karate club) network 26 , the Gansu (GS) 27 and Henan (HN) power grids 28 , and artificial random networks, i.e., scale-free (SF) networks and Erdős-Rényi (ER) networks. Figure 1d-f demonstrates the effectiveness of the proposed algorithm in maximizing the resilience of a simple network (the Zachary network 26 ) against a malicious attack (high degree adaptive, HDA). The network resilience is increased by 30%, 63% and 72% by adding one, two and three edges, respectively. Figure 4a-c shows the structures of the SF, GS and HN networks optimized by the proposed method (the structure of the optimized ER network is shown in Fig. S2). For example, in Fig. 4c, before optimization, the finite components (green) C_1 and C_2 emerge if vital nodes such as the high-degree nodes (purple) v_1 and v_2 are maliciously removed from the original network; after optimization by adding optimal edges (red), the emergence of C_1 and C_2 is avoided under the same attacks. This case explains why the proposed method can greatly improve network resilience. As a practical example, networked microgrids can enhance power system resilience 5 .
In Fig. 5a-c, we show the mitigation of malicious attacks for the SF network and the GS and HN power grids, respectively (the ER network is shown in Fig. S3). The dashed lines correspond to the size of the giant component G(p) in each original network, and the coloured solid lines correspond to typical modified networks with different numbers of added edges (from 20 to 180, 2 to 32 and 1 to 16 for SF, GS and HN, respectively). The coloured areas give the increments of resilience (elastic potential energy) under malicious attacks. By adding only 4.5% of edges to the SF network and the GS and HN power grids under HDA attacks (Fig. 5d-f), the resilience of the three networks was increased by 44%, 187% and 740%, respectively.
We compare the proposed algorithm with the heuristic strategies ES 17 and EA 23 in Fig. 5d-f. The heuristic strategies (ES and EA) improve the network resilience greatly. Nevertheless, the improvement ratios achieved by our algorithm are optimal and exceed those of the heuristic strategies 17,23 for the same proportion of added (or swapped) edges. In the same three panels, we investigate the resilience improvement of our algorithm under two different malicious attacks, i.e., the widely used HDA 3 and the optimal collective influence (CI) 12 (see also Figs S4 and S5). Our algorithm performs very well under both attacks. Under the CI attack, the network resilience is improved by 36%, 223% and 762% (by adding or swapping 4.5% of edges) in the SF network and the GS and HN power grids, respectively. Furthermore, if the critical threshold is used as the resilience measure, our algorithm also outperforms the other strategies 17,23 (Fig. 5g-i).
For networks with a community structure 29 (such as the Zachary network and the GS and HN power grids), our algorithm yields a greater improvement in network resilience and critical threshold than for networks with no community structure (such as SF and ER networks), as shown in Figs 5 and S3, because networks with communities break into a few large finite components when they are attacked maliciously. In addition, greater improvements in network resilience and critical percolation threshold are obtained in the SF network (Fig. 5) than in the ER network (Fig. S3): as the top vital (hub) nodes of the SF network are removed sequentially, its strong heterogeneity generates a few large finite components, which accounts for the difference.
Figure S6 shows that the network resilience and the critical threshold of both the original and the improved ER networks increase with the average degree, following nearly the same rising trend in the two cases. From Fig. S7, one can observe that the improvements in network resilience and critical threshold remain nearly unchanged regardless of the network size.
Unchanged network functionality. The functionality of a network is commonly related to its topological features 17,29 . It is essential to keep a network's functionality unchanged when optimizing its resilience. We tested the effects of the topological changes on the functionality of the optimized networks, i.e., the SF network and the GS and HN power grids. The cumulative degree distribution, the shortest-path distance distribution and the betweenness distribution were used to measure functionality. As shown in Fig. 6, these measures hardly changed. Other topological characteristics, including the clustering coefficient and the network diameter, also remain unchanged (Table S2). Therefore, the networks optimized by our algorithm are not only more resilient against malicious attacks but also show little change in functionality compared with the original networks.
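One of the functionality checks above, the comparison of cumulative degree distributions before and after edge addition, can be sketched as follows. This is an illustrative helper rather than the procedure used for Fig. 6: the function names, the adjacency-dictionary representation and the use of the maximum pointwise gap as a summary number are assumptions.

```python
from collections import Counter


def cumulative_degree_distribution(adj):
    """Return P(degree >= k) for k = 0..k_max as a list indexed by k."""
    n = len(adj)
    counts = Counter(len(neigh) for neigh in adj.values())
    dist, remaining = [], n
    for k in range(max(counts) + 1):
        dist.append(remaining / n)
        remaining -= counts.get(k, 0)
    return dist


def max_distribution_gap(adj_a, adj_b):
    """Largest pointwise difference between two cumulative distributions."""
    a = cumulative_degree_distribution(adj_a)
    b = cumulative_degree_distribution(adj_b)
    length = max(len(a), len(b))
    a = a + [0.0] * (length - len(a))  # pad the shorter one with zeros
    b = b + [0.0] * (length - len(b))
    return max(abs(x - y) for x, y in zip(a, b))


# Hypothetical example: a 4-node path, and the same path closed into a cycle.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
cycle = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}

print(cumulative_degree_distribution(path))
print(max_distribution_gap(path, cycle))
```

On a four-node toy graph one added edge changes the distribution visibly; in a large network where only a few percent of edges are added, the gap would be expected to be much smaller, which is the behaviour reported for Fig. 6.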

Discussion and Conclusion
Intentional attacks and the corresponding defences are the two opposing sides of network security. To enhance network resilience against malicious attacks, we introduce network resilience indices by mapping a complex network onto a physical elastic system; we then propose a unified theoretical framework and a general approach (the PA algorithm) to solve the resilience optimization problem. As mentioned before, neither the ES nor the EA methods can adequately maintain the topological functionality of a network, and their performance in improving resilience cannot be guaranteed, because they are unable to optimize network resilience globally within a theoretical framework. In contrast, our algorithm can maximize network resilience by adding optimal edges between the "weak cores" and the critical giant component (Fig. 1), at minimal cost. This is because, after optimization by our method, the emergence of large finite components is effectively avoided under the same attacks (Figs 1 and 4). Moreover, the proposed indices of network resilience can characterize the elastic properties of nonlinear networks, unlike conventional metrics such as the critical threshold. Case studies show that our algorithm achieves better resilience improvement than competing approaches 17,23 .
As the proportion of added edges grows, the growth of network resilience slows down, especially for realistic networks, because the number of large finite components generated by malicious attacks becomes progressively smaller. Thus, in applying our method, it is necessary to balance the resilience improvement against the cost of modifying the network to find an optimal compromise.
The proposed theory is strictly valid and can be applied to any real network. Our solution to the optimal resilience problem is important because it can be used to enhance network resilience, guide the design of resilient technological systems, offer fast and effective ways to mitigate the collapse of networks under malicious attacks, and furnish a self-healing solution to reconstruct failed infrastructure systems.