Vulnerability of robust preferential attachment networks

Scale-free networks with small power law exponent are known to be robust, meaning that their qualitative topological structure cannot be altered by random removal of even a large proportion of nodes. By contrast, it has been argued in the science literature that such networks are highly vulnerable to a targeted attack, and removing a small number of key nodes in the network will dramatically change the topological structure. Here we analyse a class of preferential attachment networks in the robust regime and prove four main results supporting this claim: after removal of an arbitrarily small proportion ε > 0 of the oldest nodes, (1) the asymptotic degree distribution has exponential instead of power law tails; (2) the largest degree in the network drops from being of the order of a power of the network size n to being just logarithmic in n; (3) the typical distances in the network increase from order log log n to order log n; and (4) the network becomes vulnerable to random removal of nodes. Importantly, all our results explicitly quantify the dependence on the proportion ε of removed vertices. For example, we show that the critical proportion of nodes that have to be retained for survival of the giant component undergoes a steep increase as ε moves away from zero, and a comparison of this result with similar ones for other networks reveals the existence of two different universality classes of robust network models. The key technique in our proofs is a local approximation of the network by a branching random walk with two killing boundaries, and an understanding of the particle genealogies in this process, which enters into estimates for the spectral radius of an associated operator.


Motivation
The problem of resilience of networks to either random or targeted attack is crucial to many instances of real world networks, ranging from social networks (like collaboration networks) via technological networks (like electrical power grids) to communication networks (like the world-wide web). Of particular importance is whether the connectivity of a network relies on a small number of hubs and whether their loss will cause a large-scale breakdown. Albert, Albert and Nakarado [1] argue that "the power grid is robust to most perturbations, yet disturbances affecting key transmission substations greatly reduce its ability to function". Experiments of Albert, Jeong, and Barabási [2], Holme, Kim, Yoon and Han [24] and more recently of Mishkovski, Biey and Kocarev [29] find robustness under random attack but vulnerability to the removal of a small number of key nodes in several other networks. The latter paper includes a study of data related to the human brain, as well as street, collaboration and power grid networks. One should expect this qualitative behaviour across the range of real world networks and it should therefore also be present in the key mathematical models of large complex networks.
A well established feature of many real world networks is that in a suitable range of values k the proportion of nodes with degree k has a decay of order k^{−τ} for a power law exponent τ. The robustness of networks with small power law exponent under random attack has been observed heuristically by Callaway et al. [8] and Cohen et al. [11], but there seems to be controversy in these early papers about the extent of the vulnerability in the case of targeted attack, see the discussion in [17] and [12]. As Bollobás and Riordan [7, Section 10] point out, such heuristics, informative as they may be, are often quite far away from a mathematical proof that applies to a given model. In their seminal paper [7] they provide the first rigorous proof of robustness in the case of a specific preferential attachment model with power law exponent τ = 3, and later Dereich and Mörters [15] proved for a class of preferential attachment models with tunable power law exponent that networks are robust under random attack if the power law exponent satisfies τ ≤ 3, but not when τ > 3, thus revealing the precise location of the phase transition in the behaviour of preferential attachment networks. However, the question of vulnerability of robust networks when a small number of privileged nodes is removed has not been studied systematically in the mathematical literature so far.
It is the aim of the present paper to give evidence for the vulnerability of robust networks by providing rigorous proof that preferential attachment networks in the robust regime τ ≤ 3 undergo a radical change under a targeted attack, i.e. when an arbitrarily small proportion ε > 0 of the most influential nodes in the network is removed. Our main results, presented in Section 1.3, show how precisely this change affects the degree structure, the length of shortest paths and the connectivity in the network. The results take the form of limit theorems revealing explicitly the dependence of the relevant parameters on ε. Not only does this provide further insight into the topology of the network and the behaviour as ε tends to zero, it also allows a comparison to other network models, and thus exposes two classes of robust networks with rather different behaviour, see Section 1.5. Our mathematical analysis of the network uses several new ideas and combines probabilistic and combinatorial arguments with analytic techniques informed by new probabilistic insights. It is crucially based on the local approximation of preferential attachment networks by a branching random walk with a killing boundary recently found in [15]. In this approximation the removal of a proportion ε of old vertices corresponds to the introduction of a second killing boundary. On the one hand this adds an additional level of complexity to the process, as the mathematical understanding of critical phenomena in branching models on finite intervals is only just emerging, see for example [20]. On the other hand compactness of the type space for this branching process opens up new avenues that are exploited, for example, in the form of spectral estimates based on rather subtle information on the shape of principal eigenfunctions of an operator associated with the branching process.

Mathematical framework
The established mathematical model for a large network is a sequence (G_n : n ∈ N) of (random or deterministic) graphs G_n with vertex set V_n and an edge set E_n consisting of (directed or undirected) edges between the vertices. We assume that the size |V_n| of the vertex set is increasing to infinity in probability, so that results about the limiting behaviour in the sequence of graphs may be seen as predictions for the behaviour of large networks. In all cases of interest here the average number of edges per vertex converges in probability to a finite limit and the topology of a bounded neighbourhood of a typical vertex stabilizes. An important example for this is the proportion of vertices with a given degree in G_n, which in the relevant models converges and allows us to talk about the asymptotic degree distribution. The mathematical models of power law networks therefore have an asymptotic degree distribution with the probability of degree k decaying like k^{−τ}, as k → ∞, for some τ > 1. Our focus here is on the global properties emerging in network models with asymptotic power law degree distributions.
A crucial global feature of a network is its connectivity, and in particular the existence of a large connected component. To describe this, we denote by C_n a connected component in G_n with maximal number of nodes. The graph sequence (G_n : n ∈ N) has a giant component if there exists a constant ζ > 0 such that |C_n|/|V_n| converges to ζ in probability.

The preferential attachment paradigm was popularised by Barabási and Albert [3] and has received considerable attention in the scientific literature. The idea is that a sequence of graphs is constructed by successively adding vertices. Together with a new vertex, edges are introduced by connecting it to existing vertices at random with a probability depending on the degree of the existing node; the higher the degree the more likely the connection. Despite the relatively simple principle on which this model is based, it shows a good match of global features with real networks. For example, the asymptotic degree distributions follow a power law, and variations in the attachment probabilities allow for tuning of the power law exponent τ. If the power law exponent satisfies τ < 3, then the network is robust and ultrasmall.
The first mathematically rigorous study of resilience in preferential attachment networks was performed by Bollobás and Riordan [7] for the so-called LCD model. This model variant has the advantage of having an explicit static description, which makes it easier to analyse than models that have only a dynamic description. It also has a fixed power law exponent τ = 3; hence, Bollobás and Riordan [7] prove results only for this specific exponent. They show that in this case the network is robust and identify a critical proportion ε_c < 1 such that the removal of the ⌊εn⌋ oldest vertices leads to the destruction of the giant component if and only if ε ≥ ε_c. Note that this is not in line with the notion of vulnerability that we are interested in, as we only want to remove a small proportion of old vertices.
In the present paper, we consider the question of vulnerability in the following model variant, introduced in [14]. Let N_0 be the set of nonnegative integers and fix a function f : N_0 → (0, ∞), which we call the attachment rule. The most important case is if f is affine, i.e. f(k) = γk + β for parameters γ ∈ [0, 1) and β > 0, but non-linear functions are allowed. Given an attachment rule f, we define a growing sequence (G_n : n ∈ N) of random graphs by the following dynamics:

• Start with one vertex labelled 1 and no edges, i.e. G_1 is given by V_1 := {1}, E_1 := ∅;

• Given the graph G_n, we construct G_{n+1} from G_n by adding a new vertex labelled n + 1 and, for each m ≤ n independently, inserting the directed edge (n + 1, m) with probability

    (f(indegree of m at time n)/n) ∧ 1.    (1.1)

Formally we are dealing with a sequence of directed graphs, but all edges point from the younger to the older vertex. Hence, the directions can be recreated from the undirected, labelled graph. For all structural questions, particularly regarding connectivity and the length of shortest paths, we regard (G_n : n ∈ N) as an undirected network. Dereich and Mörters consider in [14,15] concave attachment rules f. Denoting the asymptotic slope of f by

    γ := lim_{k→∞} f(k)/k,    (1.2)

they show that for γ ∈ (0, 1) the sequence (G_n : n ∈ N) has an asymptotic degree distribution which follows a power law with exponent τ = (γ + 1)/γ = 1 + 1/γ.
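The dynamics above are simple to simulate directly. The following sketch (the function name and quadratic-time loop are illustrative choices of ours, suitable only for small n) grows a graph according to rule (1.1) with an affine attachment rule:

```python
import random

def grow_network(n, gamma, beta, seed=0):
    """Simulate the preferential attachment dynamics of rule (1.1)
    with affine attachment rule f(k) = gamma*k + beta.
    Returns the directed edges (young, old) and the indegree of each vertex."""
    rng = random.Random(seed)
    indeg = [0] * (n + 1)            # indeg[m] = current indegree of vertex m
    edges = []
    for new in range(2, n + 1):      # vertex `new` arrives at time new
        size = new - 1               # number of existing vertices
        for m in range(1, new):      # each old vertex is tried independently
            p = min((gamma * indeg[m] + beta) / size, 1.0)
            if rng.random() < p:
                edges.append((new, m))
                indeg[m] += 1
    return edges, indeg

edges, indeg = grow_network(500, 0.6, 0.5)
```

Old vertices accumulate indegree and thereby attract further edges; with γ ∈ [1/2, 1) the empirical indegree distribution approaches a power law with exponent 1 + 1/γ as n grows.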
For γ ≥ 1, i.e. τ ≤ 2, the mean of the asymptotic degree distribution is infinite and a radically different topology can be expected. Results on power law networks in this regime have been derived for example in [18,4]; we restrict ourselves to the finite mean case γ < 1. Write G_n(p) for the graph obtained from G_n by deleting every vertex independently with probability 1 − p, i.e. by vertex percolation with retention parameter p. In the case γ < 1/2, or equivalently τ > 3, there exists a critical percolation parameter p_c > 0 such that (G_n(p) : n ∈ N) has a giant component if and only if p > p_c.¹ If however γ ≥ 1/2, or equivalently τ ≤ 3, the sequence (G_n(p) : n ∈ N) has a giant component for all p ∈ (0, 1], i.e. (G_n : n ∈ N) is robust. This is the regime of interest in the present paper.

EJP 19 (2014), paper 57. Page 4/47. ejp.ejpecp.org

Statement of the main results
Our focus is on the case of an affine attachment rule f(k) = γk + β with β > 0 and γ ∈ [1/2, 1). Recall that for this choice the preferential attachment network is robust. For ε ∈ (0, 1) we write G^ε_n for the graph obtained from G_n by removing the ⌊εn⌋ oldest vertices together with all adjacent edges, and p_c(ε) for the critical retention parameter of vertex percolation on the damaged network (G^ε_n : n ∈ N). We use the symbol a(ε) ≍ b(ε) to indicate that there are constants 0 < c < C and some ε_0 > 0 such that cb(ε) ≤ a(ε) ≤ Cb(ε) for all 0 < ε < ε_0. Theorem 1.1 identifies the asymptotics of p_c(ε) as ε ↓ 0 when γ > 1/2, and shows that the removal of an arbitrarily small proportion of old nodes makes the network vulnerable to percolation, but does not destroy the giant component.
The steep increase of p_c(ε) as ε leaves zero shows that, even when a small proportion of old nodes has been removed from the network, the removal of further old nodes is much more destructive than the removal of a similar proportion of randomly chosen nodes.
As, for small ε, the critical value p_c(ε) is strictly decreasing in γ, this effect is stronger the closer γ is to 1/2. This result might be perceived as slightly counterintuitive, since the preferential attachment becomes stronger as γ increases and therefore we might expect older nodes to be more privileged and a targeted attack to be more effective than in the small γ regime. However, the effect of the stronger preferential attachment is more than compensated by the fact that networks with a small value of γ have a (stochastically) smaller number of edges and are therefore a priori more vulnerable.
Note also that p_c(ε) may be equal to 1 if ε is not sufficiently small, in which case (1.3) implies that the damaged network has no giant component. In the case γ = 1/2 the implied constants in (1.4) can be made explicit as c = 1/(γ + β) and C = 1/β, but we cannot show that they match asymptotically. However, we conjecture that they do, meaning that the ratio of the left- and right-hand sides in (1.4) converges to one.
To gain further insight into the topology of the damaged graph, we now look at the asymptotic indegree distribution and at the largest indegree in the network. Recall from [14] that outdegrees are asymptotically Poisson distributed and therefore indegrees are solely responsible for the power law behaviour as well as the dynamics of maximal degrees. From here onwards we additionally assume that β ≤ 1. Under this condition, f(n) < n + 1 for all n ∈ N_0 and the minimum in (1.1) is always attained by its first argument.

¹ The results of [15]
For a probability measure ν on the nonnegative integers, we write ν_{≥k} := ν({k, k + 1, . . .}) and ν_k := ν({k}). Let Z[m, n] be the indegree of vertex m in G_n at time n ≥ m. Since for m > εn the indegrees of m in G_n and G^ε_n agree, the empirical indegree distribution X^ε(n) of the damaged graph can be read off from the indegrees in G_n. We write M(G) for the maximal indegree in a directed graph G and, for s, t > 0, we let B(s, t) denote the beta function at (s, t). Before we make statements about the network after the targeted attack, we recall the situation in the undamaged network. In this case Dereich and Mörters [14] show that the empirical indegree distribution X^0(n) in G_n satisfies almost surely lim_{n→∞} X^0(n) = µ in total variation norm. The limit is a probability measure µ on the nonnegative integers, given explicitly in terms of the beta function, which satisfies lim_{k→∞} log µ_{≥k}/log k = −1/γ. Moreover, the maximal indegree satisfies, in probability, M(G_n) = n^{γ+o(1)}. Our theorem shows that in the damaged network the asymptotic degree distribution is no longer a power law but has exponential tails. The maximal degree grows only logarithmically, not polynomially.

Theorem 1.2. (Collapse of large degrees)
Let ε ∈ (0, 1). Almost surely, lim_{n→∞} X^ε(n) = µ^ε in total variation norm. The limit is a probability measure µ^ε on the nonnegative integers, given by an explicit formula, which satisfies lim_{k→∞} log µ^ε_{≥k}/k = log(1 − ε^γ). Moreover, the maximal indegree satisfies, in probability,

    M(G^ε_n) = (1 + o(1)) log n / log(1/(1 − ε^γ)).    (1.5)

It is worth mentioning that µ^0 = µ, so Theorem 1.2 remains valid for ε = 0. Moreover, the result holds also for γ ∈ (0, 1/2) by the same proof. Theorem 1.2 shows in particular that by removing a proportion ε of the oldest vertices we have removed all vertices with a degree bigger than a given constant multiple of log n. This justifies the comparison of our vulnerability results with empirical studies of real world networks such as [1], in which all nodes whose degree exceeds a given threshold are removed. Note also that as ε ↓ 0 the constant on the right-hand side in (1.5) is asymptotically equivalent to ε^{−γ}, and the growth of the maximal degree is the faster the larger γ.

Preferential attachment networks are ultrasmall for sufficiently small power law exponents. For our model, Mönch [31], see also [13,16], has shown that, denoting by d_G the graph distance in a graph G, for independent random vertices V_n, W_n chosen uniformly from C_n, we have

    d_{G_n}(V_n, W_n) ∼ 4 log log n / log(γ/(1 − γ)),

meaning that the ratio of the left- and right-hand sides converges to one in probability as n → ∞. Removing an arbitrarily small proportion of old vertices however leads to a massive increase in the typical distances, as our third main theorem reveals. We say that a sequence of events (E_n : n ∈ N) holds with high probability if P(E_n) → 1 as n → ∞.

Theorem 1.3. (Increase of typical distances) Let ε > 0 be sufficiently small so that (G^ε_n : n ∈ N) has a giant component, and let V_n, W_n be chosen independently and uniformly from C_n. Then there is a constant c = c(γ, ε) > 0 such that, for all δ > 0,

    d_{G^ε_n}(V_n, W_n) ≥ (1 − δ) c log n    with high probability.
Our proof gives the result for all values γ ∈ [0, 1) and ε > 0 with p_c(ε) < 1, but if γ < 1/2 even without removal of old vertices the typical distances in the network are known to be of order log n, so that this is not surprising. We believe that there is an upper bound matching the lower bound above, but the proof would be technical and the result much less interesting.
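The collapse of the maximal degree described in Theorem 1.2 can already be observed in small simulations. The sketch below (illustrative parameters of ours; `grow_indegrees` is a hypothetical helper re-implementing rule (1.1)) compares the maximal indegree before and after deleting the oldest εn vertices:

```python
import random

def grow_indegrees(n, gamma, beta, seed=0):
    # indegree evolution under rule (1.1); the edges themselves are not stored
    rng = random.Random(seed)
    indeg = [0] * (n + 1)
    for new in range(2, n + 1):
        size = new - 1
        for m in range(1, new):
            if rng.random() < min((gamma * indeg[m] + beta) / size, 1.0):
                indeg[m] += 1
    return indeg

n, gamma, beta, eps = 2000, 0.6, 0.5, 0.1
indeg = grow_indegrees(n, gamma, beta)
max_undamaged = max(indeg[1:])               # grows like a power of n
max_damaged = max(indeg[int(eps * n) + 1:])  # only logarithmic in n
```

For moderate n the gap is already visible, since the vertices with the largest indegrees are overwhelmingly among the oldest εn.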
In the next two sections we discuss some further ramifications of our main results.

Non-linear attachment rules
So far we have presented results for the case of affine attachment rules f , given by f (k) = γk + β. While the fine details of the network behaviour often depend on the exact model definition, we expect the principal scaling and macroscopic features to be independent of these details. To investigate this universality we now discuss to what extent Theorem 1.1 remains true when we look at more general non-linear attachment rules f . We consider two classes of attachment rules.
(1) A function f : N_0 → (0, ∞) of the form f(k) = γk + h(k), for a slope γ ∈ [0, 1) and a bounded function h, is called an L-class attachment rule.

(2) A concave function f : N_0 → (0, ∞) with γ := lim_{k→∞} f(k)/k ∈ [0, 1) is called a C-class attachment rule. Note that concavity of f implies that the limit above exists and that f is non-decreasing.
The asymptotic slope of the attachment rule determines the key features of the model. For example, Dereich and Mörters [14] show that for certain C-class attachment rules with γ > 0 the asymptotic degree distribution is a power law with exponent τ = 1 + 1/γ. The following theorem, stated for L-class attachment rules, shows that γ also determines the scaling of the critical percolation probability for the damaged network. If f is in the C-class, the statement remains true in the case γ > 1/2, and in the case γ = 1/2 if the limit is replaced by a lim sup_{ε↓0} and the equality by '≤'.
Theorem 1.4 implies that the damaged network (G^ε_n : n ∈ N) is not robust. But as lim_{ε↓0} p_c(ε) = 0 it is still 'asymptotically robust' for ε ↓ 0 in the sense that when less than order n old vertices are destroyed, the critical percolation parameter remains zero. We formulate this as a corollary. For two graphs G = (V, E) and G̃ = (Ṽ, Ẽ), we write G ≥ G̃ if there is a coupling such that V ⊇ Ṽ and E ⊇ Ẽ.

Corollary 1.5. Let f be an L-class or C-class attachment rule with γ ≥ 1/2 and let (m_n : n ∈ N) be a sequence of natural numbers with lim_{n→∞} m_n/n = 0. The network (G^{(m_n)}_n : n ∈ N), consisting of the graphs G_n damaged by removal of the oldest m_n vertices along with all adjacent edges, is robust.
Proof. Let p ∈ (0, 1). By Theorem 1.4, there exists ε > 0 such that p_c(ε) < p. Choose n_0 ∈ N such that m_n/n < ε for all n ≥ n_0. Then G^{(m_n)}_n ≥ G^ε_n for all n ≥ n_0, implying G^{(m_n)}_n(p) ≥ G^ε_n(p). Since the network (G^ε_n(p) : n ∈ N) has a giant component, so does the network (G^{(m_n)}_n(p) : n ∈ N).

Theorem 1.4 is derived from Theorem 1.1 using the monotonicity of the network in the attachment rule. Its appeal lies in the large class of functions to which it applies. The L-class attachment rules are all positive, bounded perturbations of linear functions. In Figure 1 we see several examples: on the left a concave function which is also in the C-class, then a convex function and a function which is convex in one and concave in another part of its domain. The latter examples are not monotone, and all three are asymptotically vanishing perturbations of an affine attachment rule. The example of an L-class attachment rule on the right shows that this may also fail.

Vulnerability of other network models
We would like to investigate to what extent our results are common to robust random network models rather than specific to preferential attachment networks. Again our focus is on Theorem 1.1 and we look at two types of networks, the configuration model and inhomogeneous random graphs. Both types have an explicit static description and are therefore much easier to analyse than the preferential attachment networks studied in our main theorems.

Configuration model
A targeted attack can be planned particularly well when the degree sequence of the network is known. A random graph model with a prescribed degree sequence is given by the configuration model: vertices 1, . . . , n are assigned degrees d_1, . . . , d_n, each vertex i receives d_i half-edges, and the half-edges are matched uniformly at random to form edges. Let n_k = |{i ≤ n : d_i = k}| be the number of vertices with degree k and assume that there exists an N-valued random variable D with 0 < ED < ∞ and P(D = 2) < 1, such that n_k/n → P(D = k) for all k ∈ N and (1/n) ∑_{k=1}^∞ k n_k → ED as n → ∞.

We observe the same basic phenomenon as in the corresponding preferential attachment models: while the undamaged network is robust, after removal of an arbitrarily small proportion of privileged nodes the network becomes vulnerable to random removal of vertices. However, when γ > 1/2, the increase of the critical percolation parameter p_c(ε) as ε leaves zero is less steep than in the corresponding preferential attachment model.
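For concreteness, the stub-matching construction can be sketched as follows (a standard construction, with an illustrative heavy-tailed degree sequence not taken from the paper):

```python
import random
from collections import Counter

def configuration_model(degrees, seed=0):
    """Uniform stub matching: vertex i receives degrees[i] half-edges and the
    half-edges are paired uniformly at random (loops and multi-edges allowed)."""
    rng = random.Random(seed)
    stubs = [i for i, d in enumerate(degrees) for _ in range(d)]
    if len(stubs) % 2:                   # total degree must be even;
        stubs.append(len(degrees) - 1)   # give the last vertex one extra stub
    rng.shuffle(stubs)
    return [(stubs[k], stubs[k + 1]) for k in range(0, len(stubs), 2)]

# illustrative heavy-tailed degree sequence, roughly decaying in the index
degrees = [max(1, int(30 / (i + 1) ** 0.5)) for i in range(500)]
edges = configuration_model(degrees)
```

A targeted attack on this model removes the vertices with the largest entries of the degree sequence; a random attack deletes uniformly chosen vertices.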
Note that our assumptions imply that 0 < ED < ∞. In the case ED = ∞, Bhamidi et al. [4] show a more extreme form of vulnerability, where the connected network can be disconnected with high probability by deleting a bounded number of vertices.

Inhomogeneous random graphs
Inhomogeneous random graphs are a generalization of the classical Erdős–Rényi random graph. Let κ : (0, 1] × (0, 1] → (0, ∞) be a symmetric kernel. The inhomogeneous random graph G^{(κ)}_n corresponding to kernel κ has the vertex set V_n = {1, . . . , n} and any pair of distinct vertices i and j is connected by an edge independently with probability

    P({i, j} present in G^{(κ)}_n) = (1/n) κ(i/n, j/n) ∧ 1.    (1.7)

Many features of this class of models are discussed by Bollobás, Janson and Riordan [6] and van der Hofstad [23]. The first inhomogeneous random graph model we consider is a version of the Chung–Lu model, see for example [9,10]. The relevant kernel κ^{(CL)} is of the form κ(x, y) = ψ(x)ψ(y), for some ψ; such kernels are called kernels of rank one, see [6]. Note that a similar factorisation occurs in the configuration model, since the probability that vertices i and j are directly connected is roughly proportional to d_i d_j; therefore the configuration model can be classified as a rank one model, too. The network corresponding to κ^{(CL)} has an asymptotic degree distribution which is a power law with exponent τ = 1 + 1/γ.
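Sampling from (1.7) is straightforward. The sketch below uses a rank-one kernel of Chung–Lu type with ψ(x) = x^{−γ}; the absence of a normalising constant is an illustrative simplification of ours, not the paper's parametrisation:

```python
import random

def inhomogeneous_graph(n, kappa, seed=0):
    """Sample the inhomogeneous random graph of (1.7): edge {i, j} is present
    independently with probability (kappa(i/n, j/n)/n) ∧ 1."""
    rng = random.Random(seed)
    edges = []
    for i in range(1, n + 1):
        for j in range(i + 1, n + 1):
            if rng.random() < min(kappa(i / n, j / n) / n, 1.0):
                edges.append((i, j))
    return edges

gamma = 0.6
kappa_cl = lambda x, y: x ** (-gamma) * y ** (-gamma)   # rank one: psi(x)psi(y)
edges = inhomogeneous_graph(300, kappa_cl)
```

Vertices with small index play the role of the privileged nodes: their kernel values, and hence their expected degrees, are largest.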
The second inhomogeneous random graph model we consider is chosen such that the edge probabilities agree (at least asymptotically) with those in a preferential attachment network; its asymptotic degree distribution is again a power law with exponent τ = 1 + 1/γ. Note that, unless γ = 1/2, this kernel is not of rank one, but strongly inhomogeneous. These two kernels allow us to demonstrate the difference between rank one models and preferential attachment models within one model class.
We denote by G^{(CL)}_n, resp. G^{(PA)}_n, the inhomogeneous random graphs with kernel κ^{(CL)}, resp. kernel κ^{(PA)}. If γ ≥ 1/2, both networks are robust but, in analogy to Theorem 1.1, become vulnerable after removal of a small proportion of the most privileged vertices. The fact that the Chung–Lu model is vulnerable to targeted attacks has also been remarked by van der Hofstad in Section 9.1 of [23].
Summarising, we note that vulnerability to a targeted attack is a universal feature of robust networks, holding not only for preferential attachment networks but also for configuration models and various classes of inhomogeneous random graphs. In the case 2 < τ < 3, studying the asymptotic behaviour of the critical percolation parameter p_c(ε) as a function of the proportion ε of removed vertices reveals two universality classes of networks, distinguished by the critical exponent measuring the polynomial rate of decay of p_c(ε) as ε ↓ 0. In terms of the power law exponent τ this critical exponent equals (3 − τ)/(τ − 1) in the case of the configuration model and the Chung–Lu model, but is only half this value in the case of preferential attachment networks and inhomogeneous random graphs with a strongly inhomogeneous kernel. The same classification of networks has emerged in a different context in [13], where it was noted that the typical distances in networks of the two classes differ by a factor of two. The key feature of the configuration model and the rank one inhomogeneous random graphs seems to be that the connection probability of two vertices factorises. By contrast, the connection probabilities in preferential attachment networks have a more complex structure, giving privileged nodes a stronger advantage.
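Substituting τ = 1 + 1/γ, the two critical exponents named above can be rewritten in terms of γ (a routine check, with no content beyond the statements of this section):

```latex
% configuration model / Chung-Lu class, tau in (2,3):
\frac{3-\tau}{\tau-1}
  \;=\; \frac{2-\frac{1}{\gamma}}{\frac{1}{\gamma}}
  \;=\; 2\gamma-1,
\qquad
% preferential attachment class: half of this value
\frac{3-\tau}{2(\tau-1)} \;=\; \gamma-\tfrac12 .
```

Both exponents vanish as γ ↓ 1/2, i.e. at the boundary of the robust regime, consistent with the logarithmic behaviour of p_c(ε) at γ = 1/2 discussed after Theorem 1.1.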

The local neighbourhood in the network
Dereich and Mörters [15] have shown that the (not too large) graph neighbourhood of a uniformly chosen vertex in G_n can be coupled to a branching random walk on the negative half-line. Although we cannot make direct use of this coupling result in our proofs, it is helpful to formulate our ideas in this framework. Therefore, we now explain heuristically that a suitable exploration of the local neighbourhood of a given vertex v_0 ∈ G_n reveals a graph that can be approximated by the genealogical tree of a two-type branching random walk with two killing boundaries. A complete definition of the branching process used in the current article is given in Section 2.1 and the coupling is proved rigorously in Section 4 below.
Firstly, we associate to every vertex in G_n a location on the negative half-line such that the youngest vertex (i.e., vertex n) is located at the origin and the distance between vertex j − 1 and vertex j is given by 1/j. In particular, the vertex labelled v is located at s_n(v) := −∑_{j=v}^{n−1} 1/j, the location of the oldest vertex scales like −log n, and vertices with label at most εn, which we remove when damaging the network, are asymptotically located to the left of log ε. The location of a vertex is determined by its age in the network, with old vertices being located further left than young vertices. As the graph size increases, the location of any fixed vertex moves to the left and the vertex locations (s_n(v) : v ∈ {1, . . . , n}) become dense everywhere on the negative half-line.
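The two asymptotic claims about the location map are easy to verify numerically (the script and names are illustrative):

```python
import math

def location(v, n):
    """s_n(v) = -(1/v + 1/(v+1) + ... + 1/(n-1)); the youngest vertex n sits at 0."""
    return -sum(1.0 / j for j in range(v, n))

n, eps = 10_000, 0.05
oldest = location(1, n)              # close to -log n (up to Euler's constant)
cutoff = location(int(eps * n), n)   # close to log eps for large n
```

The difference of two harmonic sums telescopes, which is why location(εn, n) ≈ log(εn) − log n = log ε, independently of n.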
We run an exploration from vertex v_0 ∈ G_n and successively create particles in the branching random walk that approximate the discovered vertices. We stop as soon as there is no longer a one-to-one correspondence between the nodes in the two processes. For example, this could happen if in the network a vertex is rediscovered and the explored subgraph is no longer a tree. A careful analysis, carried out in Section 4.1 below, shows that when the order in which vertices are explored is chosen in a suitable way, then we do not stop until either the whole component is discovered or at least c_n vertices have been found, where lim_{n→∞} c_n²/n = 0.

To start, we place a particle at the location of vertex v_0 and declare it to be the root of the branching random walk. Then we explore all direct neighbours of v_0 in G_n. The locations of the particles in the first generation of the branching random walk are chosen to approximate the locations of these direct neighbours. To this end, note that the probability that v_0 has a direct neighbour labelled u < v_0 can be approximated using the fact that the process (f(Z[u, n]) ∏_{j=u}^{n−1} (1 + γ/j)^{−1} : n ≥ u) is a martingale; see Lemma 3.8 below. Since the location of u can be written as s_n(v_0) plus the displacement s_n(u) − s_n(v_0), asymptotically we can approximate the displacements of the direct neighbours of v_0 on its left by the points of a Poisson point process Π. We emphasise that Π describes the displacements, not the particle locations. Hence, in the branching random walk, the relative positions of the offspring to the left of a particle with location λ are given by the points of Π that lie in [log ε − λ, 0].
In the next step, we motivate the point process that describes the relative positions of the offspring on the right in the branching random walk. Note that in the network every direct neighbour u of v_0 with u > v_0 increases the indegree of v_0 and therefore the probability that v_0 has further neighbours on its right. Suppose the i-th right neighbour of v_0 is born at time k. For given t > 0, the probability that the distance between the i-th and (i + 1)-st right neighbour of v_0 exceeds t can be written as a product over the times j = k, . . . , l, where l is the smallest integer with ∑_{j=k}^{l} 1/j > t. Plugging in the connection probabilities, one deduces that, asymptotically, the displacements of the right neighbours are the jump times of a Markov jump process Z. When the exploration is continued, the information gathered from the already explored neighbourhood leads to a size-biasing effect. Indeed, in the network the edges between a vertex v and its direct neighbours on the right u > v are not independent. If v was discovered as a direct neighbour of a vertex w on its right, i.e. w > v, then we already know that v has indegree at least one. Consequently, we expect v to have more direct neighbours on its right than without this information. Mathematically, this leads to a size-biasing effect and the displacements of particles on the right of v are given by the jump times in [0, −s_n(v)] of Z started in one instead of zero. In contrast, if v was discovered as a direct neighbour of a vertex w with smaller label, i.e. w < v, then we do not have that information and the displacements are the jump times in [0, −s_n(v)] of Z started in zero. Similarly, for the direct neighbours on the left of v, there is no size-biasing effect, as a consequence of the independence between the edges on the left. Of course, there are several further dependencies coming from the previously explored subgraph. However, we show in Section 4 that the error accrued by adjusting only for the immediate parent is asymptotically negligible when we discover not more than c_n vertices, where lim_{n→∞} c_n³/n = 0.
To be able to use different offspring distributions depending on the relative location of the parent, each vertex is equipped with a mark α in {ℓ, r} to indicate the relative location of the parent, where the non-numerical symbols ℓ and r stand for 'left' and 'right', respectively. The relative positions of the offspring can be generated as the points of Π on (−∞, 0] and the jump times of Z (with initial state depending on the mark) on [0, ∞). All offspring particles located to the left of log ε or to the right of 0 are immediately removed. In other words, the approximating tree is the genealogical tree of a two-type branching random walk with two killing boundaries.
An equivalent description is as a multitype branching process with type space Φ := [log ε, 0] × {ℓ, r}, where the first component indicates the location of a particle and the second indicates its mark. Whilst the branching random walk interpretation offers more intuition, the two killing boundaries make the mathematical analysis difficult. Hence, in our analysis we will use the interpretation of the process as a multitype branching process with the larger type space Φ.
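As a toy illustration of the two-sided killing (emphatically not the offspring law of the paper: here offspring numbers are Poisson and displacements uniform, whereas in the actual process they depend on the parent's location and mark), one can watch how the strip [log ε, 0] throttles a branching random walk:

```python
import math
import random

def poisson(lam, rng):
    # Knuth's multiplicative inversion sampler for Poisson(lam)
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def survival_estimate(eps, mean_offspring, gens=12, trials=30, seed=0):
    """Fraction of runs in which a toy branching random walk, killed outside
    [log eps, 0], is still alive after `gens` generations."""
    rng = random.Random(seed)
    left = math.log(eps)
    alive = 0
    for _ in range(trials):
        particles = [left / 2]                    # root mid-strip
        for _ in range(gens):
            nxt = []
            for x in particles:
                for _ in range(poisson(mean_offspring, rng)):
                    y = x + rng.uniform(left / 2, -left / 2)
                    if left <= y <= 0:            # two killing boundaries
                        nxt.append(y)
            particles = nxt[:500]                 # cap population for speed
            if not particles:
                break
        alive += bool(particles)
    return alive / trials
```

Shrinking ε widens the strip and increases the survival chance, mimicking the role of ε in Theorem 1.1; the offspring law here is only a stand-in for the Π/Z mechanism described above.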

Main ideas of the proofs
Understanding the local neighbourhood of vertices in the network is the key to many of its properties. As in [15] the survival probability of the approximating killed branching random walk is equal to the asymptotic relative size ζ of the largest component.
This result allows us to determine, for example, the critical parameter for percolation from knowledge of when the percolated branching process has a positive survival probability. To form the percolated branching process with retention probability p from the original process, every particle is kept with probability p and removed together with its line of descent with probability 1 − p, independently of all other particles.
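For a single-type caricature (a plain Galton–Watson process with Poisson(λ) offspring, not the multitype process of the paper), percolation with retention probability p thins the offspring law to Poisson(pλ), and the survival probability is the largest fixed point of q = 1 − e^{−pλq}:

```python
import math

def percolated_survival(lam, p, iters=200):
    """Survival probability of a Poisson(lam) Galton-Watson tree after each
    particle is retained independently with probability p.  Thinning gives
    Poisson(p*lam) offspring, and iterating q -> 1 - exp(-p*lam*q) from
    q = 1 converges to the largest fixed point."""
    q = 1.0
    for _ in range(iters):
        q = 1.0 - math.exp(-p * lam * q)
    return q
```

In this caricature the survival probability is positive exactly when pλ > 1, so the critical retention parameter is p_c = 1/λ; in the paper the analogous threshold is governed by the spectral radius of the operator associated with the multitype branching process.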
It is instructive to continue the comparison of the damaged and undamaged networks in the setup of this branching process. In [15], where the undamaged network is analysed, the branching random walk has only one killing boundary, on the right. It turns out that on the event of survival the leftmost particle drifts away from the killing boundary, so that it no longer feels the boundary. As a consequence, the unkilled process carries all information needed to determine whether the killed branching random walk survives with positive probability and, therefore, whether the network has a giant component. The two killing boundaries in the branching random walk describing the damaged network prevent us from using this analogy; every particle is exposed to the threat of absorption.
EJP 19 (2014), paper 57.
To survive indefinitely, a genealogical line of descent has to move within the (space-time) strip [log ε, 0] × N_0. To understand the optimal strategy for survival, observe that in the network with strong preferential attachment old vertices typically have a large degree and are therefore connected to many young vertices, while young vertices themselves have only a few connections. This means that in the branching random walk without killing, particles produce many offspring to the right, but only a few to the left. Hence, if a particle is located near the left killing boundary, it represents an old vertex in the graph and is very fertile, but its offspring are mostly located further to the right and are therefore less fertile. A particle near the right killing boundary, however, represents a young vertex and has itself a small number of offspring, which then, however, have a good chance of being fertile, since they are necessarily located further left in the strip. As a result, the optimal survival strategy for a particle is to have an ancestral line of particles whose locations alternate between positions near the left and the right killing boundary. This intuition is the basis for our proofs.
Continuing more formally, for the proof of Theorem 1.1 we show that positivity of the survival probability can be characterised in terms of the largest eigenvalue ρ_ε of an operator that describes the spatial distribution of offspring of a given particle. More precisely, the branching random walk survives percolation with retention parameter p if its growth rate pρ_ε exceeds the value one, so that p_c(ε) = 1/ρ_ε. Our intuition allows us to guess the form of the corresponding eigenfunction, which, relative to the particle density, has its mass concentrated in two bumps near the left and right killing boundaries. From this guess we obtain sufficiently accurate estimates for the largest eigenvalue, and therefore for the critical percolation parameter, as long as the preferential attachment effect is strong enough. This is the case if γ ≥ 1/2, allowing us to prove Theorem 1.1.
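The characterisation p_c(ε) = 1/ρ_ε can be explored numerically: discretise a mean-offspring kernel on [log ε, 0] and compute its largest eigenvalue by power iteration. The kernel below is a simplified one-type caricature built from the intensity densities βe^{(1−γ)t} (offspring to the left) and (γ+β)e^{γt} (offspring to the right); it ignores the marks and the size-biasing, so it only illustrates the method, not the paper's exact operator, and all parameter values are assumptions.

```python
import math

# One-type caricature of the mean-offspring operator on [log eps, 0]:
# a particle at x places offspring at y with intensity
#   beta*exp((1-gamma)*(y-x))       for y < x  (offspring to the left),
#   (gamma+beta)*exp(gamma*(y-x))   for y > x  (offspring to the right).
gamma, beta, eps = 0.6, 1.0, 0.1
N = 100
a = math.log(eps)
h = -a / N
grid = [a + (i + 0.5) * h for i in range(N)]

def kernel(x, y):
    t = y - x
    if t < 0:
        return beta * math.exp((1 - gamma) * t)
    return (gamma + beta) * math.exp(gamma * t)

K = [[kernel(x, y) * h for y in grid] for x in grid]

# power iteration for the largest eigenvalue rho and a positive eigenvector
v = [1.0] * N
rho = 1.0
for _ in range(300):
    w = [sum(K[i][j] * v[j] for j in range(N)) for i in range(N)]
    rho = max(w)                    # positive kernel, so iterates stay positive
    v = [c / rho for c in w]

p_c = min(1.0, 1.0 / rho)           # critical retention parameter 1/rho
```

In this caricature 1/ρ plays the role of the critical retention probability; the paper's actual operator acts on the two-type space Φ and keeps track of the marks.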
By contrast, for γ < 1/2 we know that the network is not robust, i.e. we have p_c(0) > 0. It would be of interest to understand the behaviour of p_c(ε) − p_c(0) as ε ↓ 0. Our methods can be applied to this case, but the resulting bounds are very rough. The reason is that in this regime the preferential attachment is much weaker, and the intuitive idea underlying our estimates gives a less accurate picture.
The idea for the proof of Theorem 1.3 is also based on the branching process comparison. To bound the probability that two typical vertices V and W are connected by a path of length at most h, we look at the expected number of such paths. This is given by the number of vertices at distance at most h − 1 from V, multiplied by the probability that such a vertex connects to W. By our branching process heuristics, the number of vertices at distance at most h − 1 from V can be approximated by the number of particles in the first h − 1 generations of the branching random walk, which is of order ρ_ε^h, where ρ_ε = 1/p_c(ε) as before. The probability of connecting any vertex with label at least εn to W is bounded from above by f(m)/(εn), where m is the maximal degree in the network. Since m = o(n) by Theorem 1.2, this implies that the probability of a connection between V and W is bounded from above by exp(h log(1/p_c(ε)) − log n + o(log n)) and therefore goes to zero if h ≤ (1 − δ) log n / log(1/p_c(ε)), δ > 0, which yields the result. Theorem 1.2 is relatively soft by comparison. The independence of the indegrees of distinct vertices allows us to study them separately, and we again use the continuous approximation to describe the expected empirical indegree evolution. The limit theorem for the empirical distribution itself follows from a standard concentration argument. The asymptotic result for the maximal degrees is only slightly more involved and is based on fairly standard extreme value arguments.

Overview
The outline of this article is as follows. We start with the main steps of the proofs in Section 2. The multitype branching process which locally approximates a connected component in the network is defined in Section 2.1 and its key properties are stated. The main part of the proof of Theorem 1.1 then follows in Section 2.2. The analysis of the multitype branching process is conducted in Section 2.3. We study an operator associated with this process in Section 2.3.1 and derive necessary and sufficient conditions for its survival in Section 2.3.2. Sections 3.1 and 3.2 are devoted to the study of the topology of the damaged graph. In Section 3.1 the typical and maximal degree of vertices is analysed. In Section 3.2 typical distances are studied. The couplings between the network and the approximating branching process that underlie our proofs are provided in Section 4. We then look at model variations in Section 5. The derivation of Theorem 1.4 from Theorem 1.1 is presented in Section 5.1. This is the only section which requires consideration of non-linear attachment rules. We finish in Section 5.2 by studying the question of vulnerability in other network models.

Connectivity and branching processes
In this section, we restrict our attention to linear attachment rules f(k) = γk + β, for γ ∈ [0, 1) and β > 0, and let ε be a fixed value in (0, 1). The goal of this section is to prove Theorem 1.1. To this end, we couple the local neighbourhood of a vertex in G_n to a multitype branching process. The branching process is introduced in Section 2.1 and Theorem 1.1 is deduced in Section 2.2. Properties of the branching process which are needed in the analysis are proved in Section 2.3. The proof of the coupling between network and branching process is deferred to Section 4.
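For intuition, the network itself is easy to simulate. The sketch below assumes Dereich–Mörters-type dynamics in which vertex t + 1 connects to each earlier vertex m independently with probability f(indegree of m)/t; the function names, network size and parameter values are made up for the example. Removing the oldest εn vertices then produces the damaged network studied here.

```python
import random

def pa_graph(n, f, rng):
    """Preferential attachment sketch: vertex t+1 connects to each earlier
    vertex m independently with probability min(f(indeg[m]) / t, 1)."""
    indeg = [0] * (n + 1)            # vertices are 1-based
    edges = []
    for t in range(1, n):            # vertex t+1 arrives at time t+1
        for m in range(1, t + 1):
            if rng.random() < min(f(indeg[m]) / t, 1.0):
                edges.append((t + 1, m))   # edge from young vertex to old vertex
                indeg[m] += 1
    return edges, indeg

gamma, beta, eps = 0.6, 1.0, 0.1
f = lambda k: gamma * k + beta
rng = random.Random(3)
n = 200
edges, indeg = pa_graph(n, f, rng)

# damaged network: remove the oldest eps*n vertices together with all incident edges
cut = int(eps * n)
damaged = [(v, m) for (v, m) in edges if v > cut and m > cut]
```

Since every edge points from a younger to an older vertex, deleting the oldest vertices removes exactly the edges with an endpoint of label at most εn.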

The approximating branching process
As explained in Section 1.6, the local neighbourhood of a vertex in G_n can be approximated by a multitype branching process with type space Φ = [log ε, 0] × {ℓ, r}. A typical element of Φ is denoted by φ = (λ, α). The intuitive picture is that λ encodes the spatial position of the particle, which we call its location. The second coordinate α indicates on which side of the particle its parent is located, and we refer to α as the mark. In view of (1.8), a particle of type (λ, α) ∈ Φ produces offspring to its left with displacements having the same distribution as those points of the Poisson point process Π on (−∞, 0] with intensity measure M(dt) = β e^{(1−γ)t} dt. Since these offspring have their parent on the right, they are of mark r. Recall from Section 1.6 that we denote by Z an increasing, integer-valued process, which jumps from i to i + 1 after an exponential waiting time with rate f(i), independently of the previous jumps. We write P for the distribution of Z started in zero and E for the corresponding expectation. By (Ẑ_t : t ≥ 0) we denote a version of the process started in Ẑ_0 = 1 under the measure P.
The distribution of the offspring to the right depends on the mark of the parent. As motivated in Section 1.6, when the particle is of type (λ, ℓ), then the displacements of the offspring follow the same distribution as the jump times of (Z_t : t ∈ [0, −λ]), but when the particle is of type (λ, r), then the displacements follow the same distribution as the jump times of (Ẑ_t : t ∈ [0, −λ]). All offspring on the right have their parent on the left, so their mark is ℓ. Observe that the chosen offspring distributions ensure that new particles again have a location in [log ε, 0]. We call the branching process thus constructed the idealized branching process (IBP). It can be interpreted as a labelled tree, where every node represents a particle and is connected to its children and (apart from the root) to its parent. We equip node x with label φ(x) = (λ(x), α(x)), where λ(x) denotes its location and α(x) its mark, and write |x| for the generation of x. To obtain a branching process approximation to G_n(p), we define the percolated IBP by associating to every offspring in the IBP an independent Bernoulli(p) random variable. If the random variable is zero, we delete the offspring together with its line of descent. If it equals one, the offspring is retained in the percolated IBP.
Let S be a random variable whose distribution corresponds to the location of a uniformly chosen vertex. Denote by ζ_ε(p) the survival probability of the tree which with probability p equals the percolated IBP started with one particle of mark ℓ and location S, and equals the empty tree otherwise. Let C_n(p) be a connected component in G_n(p) of maximal size.
The proof of Theorem 2.1 is postponed to Section 4. The theorem describes the asymptotic size of the largest component in the network in terms of the survival probability of the percolated IBP. To make use of this connection, we have to understand the branching process.
For any measurable, complex-valued, bounded function g on Φ, and φ ∈ Φ, let A_p g(φ) := E_{φ,p}[ Σ_{|x|=1} g(φ(x)) ], where the expectation E_{φ,p} refers to the percolated IBP starting with a single particle of type φ, percolated with retention parameter p. We write A = A_1 for the operator corresponding to the unpercolated branching process and E_φ := E_{φ,1}. Recall that all quantities associated with the IBP, and in particular A_p, depend on the fixed value of ε. We denote by C(Φ) the complex Banach space of continuous functions on Φ equipped with the supremum norm. The following proposition, which summarizes properties of A_p, is proved in Section 2.3.1.

Proposition 2.2.
For all ε ∈ (0, 1) and p ∈ (0, 1], the operator A_p : C(Φ) → C(Φ) is linear, strictly positive and compact with spectral radius ρ_ε(A_p) ∈ (0, ∞). The survival probability of the percolated IBP has the following property. Notice that the corollary implies that (G_n : n ∈ N) has no giant component when ρ_ε(A) ≤ 1. Moreover, the first statement of Theorem 1.1 follows from the corollary by taking p_c(ε) = ρ_ε(A)^{−1} ∧ 1.
To complete the proof of Theorem 1.1, it remains to estimate the spectral radius ρ_ε(A). This estimation is performed in Section 2.2 below using Gelfand's formula (see, e.g., Theorem 45.1 in [22]): for a linear and bounded operator A on a complex Banach space, the spectral radius is given by ρ(A) = lim_{n→∞} ‖A^n‖^{1/n}. By the definition of the Poisson point process Π in (2.1), the intensity measure of Π is M(dt) = β e^{(1−γ)t} dt. We denote by Π_ℓ the point process given by the jump times of (Z_t : t ≥ 0) and by Π_r the point process given by the jump times of (Ẑ_t : t ≥ 0). A simple computation (cf. Lemma 1.12 in [15]) shows that, with M_α(dt) := a_α e^{γt} dt, where a_ℓ = β and a_r = γ + β, the intensity measure of Π_α is given by M_α (2.4).
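The intensity computation can be sanity-checked by Monte Carlo: integrating the densities a_ℓ e^{γs} and a_r e^{γs} over [0, t] gives the expected number of jumps of Z and Ẑ, namely (β/γ)(e^{γt} − 1) and ((γ+β)/γ)(e^{γt} − 1). A short simulation with illustrative parameters (the values below are assumptions for the example) matches these predictions.

```python
import math, random

gamma, beta = 0.6, 1.0
f = lambda k: gamma * k + beta

def n_jumps(start, t_max, rng):
    """Number of jumps in [0, t_max] of the pure jump process with rate f(state)."""
    state, t, jumps = start, 0.0, 0
    while True:
        t += rng.expovariate(f(state))
        if t > t_max:
            return jumps
        state += 1
        jumps += 1

rng = random.Random(11)
t, trials = 1.0, 20000
mean_Z    = sum(n_jumps(0, t, rng) for _ in range(trials)) / trials  # jumps of Z
mean_Zhat = sum(n_jumps(1, t, rng) for _ in range(trials)) / trials  # jumps of Z-hat

# integrals of the claimed intensities a_l * e^{gamma s} and a_r * e^{gamma s} over [0, t]
pred_l = (beta / gamma) * (math.exp(gamma * t) - 1)
pred_r = ((gamma + beta) / gamma) * (math.exp(gamma * t) - 1)
```

The agreement reflects the fact that m(t) = E[Z_t] solves m'(t) = γ m(t) + β, so m'(t) = a_ℓ e^{γt}, and likewise for Ẑ with a_r = γ + β.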

Proof of Theorem 1.1
Subject to the considerations of the previous section, Theorem 1.1 follows from the following proposition.
Note that A and Ā map real-valued functions to real-valued functions and nonnegative functions to nonnegative functions, and they are monotone. In particular, by the monotonicity and linearity of A and Ā, the iterates of A can be compared with those of Ā. To complete the proof it suffices to show that ρ_ε(Ā) = log(1/ε), which we can achieve by 'guessing' the principal eigenfunction of Ā.
Indeed, the result follows from (2.6) together with an identity that we show by induction over n. Proof of lower bound in Proposition 2.5 (b). We analyse the ancestral lines of particles in the branching process at a fixed time n ≥ 2. Going back two steps in the ancestral line of every particle alive, we can divide the population at time n into four groups, depending on the relative positions of parent and child in the transitions from generation n − 2 to n − 1 and from n − 1 to n: (1) in both steps the child is to the left of its parent; (2) in the first step the child is to the left and in the second it is to the right of its parent; (3) first right, then left; (4) in both steps the child is to the right of its parent. The cases are depicted in Figure 3, and the decomposition gives rise to operators B_1, ..., B_4 acting on functions g on Φ at points (λ, α) ∈ Φ. Intuitively, going back the ancestral line of a typical particle in the population at a late time, for a few generations the ancestral particles may be in group (4), but this behaviour is not sustainable: after a few generations in this group the offspring particle will typically be near the right end of the interval and will therefore be pushed into the right killing boundary, so that it is likely to die out. Over a longer period the ancestral particles are much more likely to be in groups (2) and (3), as this behaviour is sustainable over long periods when the ancestral line is hopping more or less regularly between positions near the left and the right boundary of the interval [log ε, 0]. A similar pattern can also be observed when studying typical paths in the random graph model; see our discussion in Section 1.7. The aim is now to turn these heuristics into useful bounds on high iterates of the operator A.
It is useful to understand how the operators B_i act on the constant function 1 as well as on the functions g_1(λ, α) := e^{−γλ} and g_2(λ, α) := e^{−(1−γ)λ}. We can express the B_i in terms of M(dt) = β e^{(1−γ)t} dt and M_α(dt) = a_α e^{γt} dt with a_α ≤ γ + β, the intensity measures of the point processes Π and Π_α. From this we obtain, for (λ, α) ∈ Φ, estimates for the action of B_3; moreover, similarly elementary calculations for B_1, B_2 and B_4 yield analogous bounds with constants b_bg and b_sm, where bg stands for 'big' and sm for 'small'. Up to constants, the estimates for B_3 and B_4 preserve g_1 but change g_2 into g_1, whereas the estimates for B_1 and B_2 preserve g_2 and change g_1 into g_2. Hence, we split the sequence of indices into blocks containing only 1 or 2 and blocks containing only 3 or 4. We write m for the number of blocks, k_j for the length of block j and k̄_j for the index at which block j begins; the sum below is over all sequences of indices (i_1, ..., i_n) with i_{k̄_j}, ..., i_{k̄_{j+1}−1} ∈ {3, 4} for j odd and i_{k̄_j}, ..., i_{k̄_{j+1}−1} ∈ {1, 2} for j even. We insist that formally the first block contains the indices 3 or 4; the case that this does not hold is covered by k_1 = 0. Hence, in the first block, the operators B_3 and B_4 encounter g_1, which is preserved. To determine the constants, we only have to keep track of how often B_4 is used; we call this number l_1. The first operator belonging to a new block j causes a factor b_bg log(ε^{1−2γ}), and if the change is from a {1, 2} to a {3, 4} block, then an additional factor ε^{2γ−1} is obtained. For the subsequent steps within block j, we again have to track how often the operator causing the smaller constant b_sm, that is B_1 or B_4, is used. This number is called l_j. After applying all n operators, the function g_1(φ) 1_{odd}(m) + g_2(φ) 1_{even}(m) remains, and we bound it by ε^{−γ}.
This procedure yields a bound on the sum over all sequences i_1, ..., i_n ∈ {1, ..., 4} (2.8). Combining (2.7) and (2.8), we conclude a lower bound valid for all φ ∈ Φ, which together with (2.3) yields, for all ε ∈ (0, 1), the claimed lower bound. The insight gained in the proof of the lower bound enables us to 'guess' an approximating eigenfunction, which is the main ingredient in the proof of the upper bound.
Proof of upper bound in Proposition 2.5 (b). Let c_r := 1 and c_ℓ := β/(γ + β) and, for (λ, α) ∈ Φ, define the candidate eigenfunction with these constants. Recall that we write |x| for the generation of a particle x in the IBP and λ(x) for its location.

By monotonicity of A this implies
Taking the n-th root on both sides, an application of (2.3) yields the required bound for ρ_ε(A).

A multitype branching process
In this section, we analyse the IBP and its relation to the associated operator A. We begin by establishing properties of A in Section 2.3.1, and then use these properties to prove necessary and sufficient conditions for the multitype branching process to survive with positive probability in Section 2.3.2. Throughout, we use the notation introduced in Section 2.1 and write P_{φ,p} for the distribution of the percolated IBP with retention probability p started with one particle of type φ ∈ Φ, abbreviating P_φ := P_{φ,1}. Proof. If g ∈ C(Φ), g ≥ 0, g ≢ 0, then there exist log ε ≤ λ_1 < λ_2 ≤ 0 and α_0 ∈ {ℓ, r} such that g is strictly positive on [λ_1, λ_2] × {α_0}. Hence, it suffices to show that particles of this type are produced from any start type. By the definition of the process, any particle produces offspring in a given interval of positive length with, uniformly in the start type, strictly positive probability. The two steps allow the time needed to ensure that the relative position of the parent is as required. Proof. According to (2.4), we can write A as an integral operator for g ∈ C(Φ) and (λ, α) ∈ Φ. Thus A can be written as the sum of two operators, which are both compact by the Arzelà-Ascoli theorem.
We summarize some standard properties of compact, positive operators in the following proposition. (i) The spectral radius of A, ρ = ρ(A), is a strictly positive eigenvalue of A with one-dimensional eigenspace, generated by a strictly positive eigenvector ϕ. The eigenvalue ρ is also the spectral radius of the adjoint A*, and the corresponding eigenspace is generated by a strictly positive eigenvector ν_0. We rescale ϕ and ν_0 such that ‖ϕ‖ = 1 and ν_0(ϕ) = 1 to make the choice unique.
Proof. Statements (i) and (ii) are immediate from the Krein-Rutman theorem, see Theorem 3.1.3 (ii) in [32], and the general form of the spectrum of compact operators. Statement (iii) then follows from the spectral decomposition of a compact operator on a complex Banach space; see for example [22].

Proof of Theorem 2.3
We start with a moment estimate for the total number of offspring of a particle. In the sequel, we write |IBP_n| for the number of particles in generation n of the IBP. Lemma 2.9. We have sup_{φ∈Φ} E_φ[|IBP_1|^2] < ∞.
Proof. Let Π, Z and Ẑ be independent realisations of the Poisson point process and the pure jump processes defined in Section 2.1. Let φ = (λ, α) ∈ Φ. By the definition of the IBP, |IBP_1| can be expressed in distribution (we write d= for distributional equality) in terms of the points of Π and the jumps of Z or Ẑ. Since f is non-decreasing, Ẑ stochastically dominates Z. This implies a bound, uniform in φ ∈ Φ, in which the first term on the right is finite because Π is a Poisson point process with finite intensity measure, and the second summand was computed in Lemma 1.12 of [15] and found to be finite.
The next result is a classical fact about branching processes. We give a proof since we could not find a reference for the result in sufficient generality; see Theorem III.11.2 in [21] for a special case. Proof. We split the proof into two steps. First we show that δ := inf_{φ∈Φ} P_{φ,p}(|IBP_1| = 0) > 0; then we conclude the statement from this result. By the definition of the percolated IBP, the probability that a particle of type (λ, α) ∈ Φ produces no offspring admits a strictly positive lower bound. Since the lower bound is independent of (λ, α), the claim δ > 0 is proved. For the second step of the proof, we set p = 1 to simplify notation; the proof for general p is identical. Fix N ∈ N, set τ_0 := 0 and, for k ≥ 1, let τ_k := inf{n > τ_{k−1} : |IBP_n| ∈ [1, N]}, where inf ∅ := ∞. The strong Markov property yields a bound, for all φ ∈ Φ and k ∈ N, in which the supremum is over all counting measures ν on Φ with ν(Φ) ∈ [1, N]. Under P_ν, ν = Σ_{i=1}^{n} δ_{φ_i}, the branching process is started with n particles of types φ_1, ..., φ_n.
If all original ancestors have no offspring in the first generation, then the branching process suffers immediate extinction and τ_1 = ∞. Hence, for all such ν, this event has probability at least δ^N. We conclude, for all φ ∈ Φ, that P_φ(1 ≤ |IBP_n| ≤ N infinitely often) = lim_{k→∞} P_φ(τ_k < ∞) = 0. Proof of Theorem 2.3. Throughout the proof, we write ρ := ρ_ε(A_p) and ϕ for the corresponding strictly positive eigenfunction with ‖ϕ‖ = 1 from Proposition 2.8 (i). First suppose ρ ≤ 1. By Lemma 2.10, P_{φ,p}(lim_{n→∞} |IBP_n| ∈ {0, ∞}) = 1. By Proposition 2.8 (iii), the assumption ρ ≤ 1 implies that sup_{n∈N} E_{φ,p}[|IBP_n|] < ∞, and we conclude that lim_{n→∞} |IBP_n| = 0 P_{φ,p}-almost surely for all φ ∈ Φ and, therefore, ζ_ε(p) = 0. Now suppose that ρ > 1 and denote W_n = ρ^{−n} Σ_{|x|=n} ϕ(φ(x)) for n ∈ N. Then (W_n : n ∈ N) is under P_{φ,p} a nonnegative martingale with respect to the filtration generated by the branching process. Hence, W := lim_{n→∞} W_n exists almost surely. Given Lemma 2.9, Biggins and Kyprianou show in Theorem 1.1 of [5] that E_{φ,p}[W] = 1 and therefore P_{φ,p}(W > 0) > 0. This implies in particular that the branching process survives with positive probability, irrespective of the start type.
We now investigate continuity of the survival probability as a function of the attachment rule. For this purpose we emphasise dependence on f by adding it as an additional argument to several quantities. The result is used in the proof of Theorem 2.1 in Section 4 below.
Proof. Observe that there exists a natural coupling of the IBP(f) with the IBP(f − δ) such that every particle in the IBP(f − δ) is also present in the IBP(f); hence, ζ_ε(p, f − δ) is increasing as δ ↓ 0. We can therefore assume that ζ_ε(p, f) > 0, that is, ρ(f) := ρ_ε(A_p, f) > 1, and by the continuity of A_p in the attachment rule, there exists δ_0 > 0 such that ρ_ε(A_p, f − δ_0) > 1. In the proof of Theorem 2.3 we have seen that this implies that the IBP(f − δ_0) survives with positive probability, irrespective of the start type, and similarly to Lemma 2.6 we conclude inf_{φ∈Φ} P_{φ,p}(IBP(f − δ_0) survives) > 0. (2.9) Recall the definition of the martingale (W_n : n ∈ N) and its almost sure limit W from the proof of Theorem 2.3, which satisfies E_{φ,p}[W] = 1 and a recursive identity in which, conditionally on the first generation, (W(φ(x)) : |x| = 1) are independent copies of the random variable W under P_{φ(x),p}. In particular, φ ↦ P_{φ,p}(W = 0) is a fixed point of the operator Hg(φ) = E_{φ,p}[ Π_{|x|=1} g(φ(x)) ] on the set of [0, 1]-valued, measurable functions. As the only [0, 1]-valued fixed points of H are the constant function 1 and the extinction probability φ ↦ P_{φ,p}(IBP(f) dies out), we deduce that W > 0 almost surely on survival. Let c > 0 and N ∈ N, and work on the space of the coupling between IBP(f) and IBP(f − δ). Since the offspring distribution of an individual particle is continuous in δ, uniformly on the type space, the probability that IBP(f) and IBP(f − δ) agree until generation N tends to one as δ ↓ 0. On this event, when |IBP_N(f)| ≥ Cρ(f)^N for some C > 0, the probability that the IBP(f − δ) subsequently dies out is bounded by a quantity which, by (2.9), tends to zero as N → ∞ when δ ≤ δ_0. Hence, for all c > 0, the survival probabilities differ by at most the error terms Θ_1(c) and Θ_2(c, N) defined on this coupling. On the event {W > c} ∩ {W_n → W}, there is a finite stopping time N_0 such that W_n ≥ W/2 for all n ≥ N_0, and we deduce that ρ(f)^{−n} |IBP_n(f)| ≥ W_n / max ϕ ≥ c/(2 max ϕ). Since W_n converges to W almost surely, we conclude that lim_{N→∞} Θ_2(c, N) = 0.
Finally, Θ_1(c) tends to zero as c ↓ 0 because W is positive on the event of survival.

The topology of the damaged graph
We investigate the empirical indegree distribution and maximal indegree of the damaged network in Section 3.1, and typical distances in Section 3.2.

Degrees
The following lemma formalises basic facts about the indegrees Z[m, n]. Lemma 3.1 allows us to replace the independent random variables in these sequences by groups of independent and identically distributed random variables.
Dereich and Mörters observe in [14], see for example Corollary 4.3, that the indegrees in the network (G_n : n ∈ N) are closely related to the pure jump process (Z_t : t ≥ 0). Since the indegrees are not altered by the targeted attack, the same holds in the damaged network. We now explain this connection. Let ψ(k) := Σ_{j=1}^{k−1} 1/j for all k ∈ N, which we consider as a time change, mapping real time epochs k to an 'artificial time' ψ(k).
In particular, for f(k)/m ≤ 1/2, the equivalence (3.1) implies an approximation in which the random null sequence o(1) is bounded by a deterministic null sequence of order O((log n)^2/n).
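The time change ψ is just a shifted harmonic number, so differences of ψ behave like differences of logarithms, ψ(v) − ψ(n) ≈ log(v/n). A quick numerical check (a sketch, not taken from the paper):

```python
import math

def psi(k):
    """Time change psi(k) = sum_{j=1}^{k-1} 1/j, the (k-1)-st harmonic number."""
    return sum(1.0 / j for j in range(1, k))

# psi(k) - log k converges to the Euler-Mascheroni constant, so differences
# of psi approximate differences of log:
gap = psi(1000) - psi(100)        # close to log(1000/100) = log 10
```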
We proceed by estimating the distribution function of Σ_{i=0}^{k} T[i]. The following identity for the incomplete beta function will be of use.
Proof. Denote the left-hand side by θ(k, a, c). For x > 0, we have a pointwise identity; integrating both sides between 0 and a and dividing by a^c, we obtain the claim.
Dereich and Mörters (pp 1238-1239 in [14]) give a simple argument based on Chernoff's inequality to upgrade the convergence of the expected empirical degree distribution to convergence of the empirical degree distribution itself. The proof remains valid for the damaged network and is therefore omitted.
To establish the claimed tail behaviour of µ, we consider the large-k asymptotics of P(Z_t ≥ k + 1) in (3.4). By Stirling's formula, B(k + 1, β/γ)^{−1} ≍ k^{β/γ}, where we write a(k) ≍ b(k) if there exist constants 0 < c ≤ C < ∞ such that c a(k) ≤ b(k) ≤ C a(k) for all large k. In the first estimate we used that x^{β/γ−1} is bounded away from zero and infinity; in the second we employed Laplace's method (see for example Section 3.5 of [28]). In particular, µ has the stated tails. To complete the proof of Theorem 1.2, it remains to derive the asymptotic behaviour of the maximal indegree. The statement follows from the next two lemmas.
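The Stirling estimate B(k + 1, β/γ)^{−1} ≍ k^{β/γ} is easy to verify numerically; in fact 1/B(k + 1, c) ~ k^c/Γ(c). The snippet below uses illustrative values of β and γ (assumptions for the example only).

```python
import math

def beta_fn(a, b):
    """Euler Beta function computed via log-gamma for numerical stability."""
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))

gamma_, beta_ = 0.6, 1.0           # illustrative parameters
c = beta_ / gamma_                  # the tail exponent beta/gamma

# 1/B(k+1, c) ~ k^c / Gamma(c); the normalised ratios approach 1 as k grows.
ratios = [math.gamma(c) / (beta_fn(k + 1, c) * k ** c) for k in (100, 1000, 10000)]
```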

Distances
In this section, we study the typical distance between two uniformly chosen vertices in C_n and prove Theorem 1.3. We write ΔZ[m, n] = Z[m, n + 1] − Z[m, n] and, for m ≥ n, ΔZ[m, n] = 0. In the graph, the indegree of vertex m at time m is zero by definition, but we will also use the distribution of the process (Z[m, n] : n ≥ m) for different initial values. Formally, the evolution of Z[m, ·] with initial value k is obtained by using the attachment rule g(l) := f(k + l), and we denote its distribution by P_k, using E_k for the corresponding expectation; we abbreviate P := P_0, E := E_0. We further write n̂ := inf{n ∈ N : f(n)/n ≤ 1} ∨ 2. Note that γ < 1 implies n̂ ∈ N. We observe some facts about the indegree distribution. These are adaptations of results in [15].
Lemma 3.8 (Lemma 2.7 in [15]). For all k ∈ N_0 and m, n ∈ N with k ≤ m and n̂ ≤ m ≤ n, the stated bounds hold. The proof of the lemma is similar to the proof of Lemma 2.10 in [15] and we omit it. After these preliminary results, we now begin our analysis of typical distances in the network (G_n : n ∈ N). Recall that for questions of this type we consider G_n to be an undirected graph. For v, w ∈ V_n and h ∈ N_0, we define the relevant events; in the graph G_n, condition (3.9) can be written as ΔZ[v_{i−1}, v_i] ≤ θ. We further define, for v, w ∈ V_n, h ∈ N_0 and θ ∈ (0, ∞), the corresponding quantities; the dependence on n is suppressed in the notation, but it will always be clear from the context which graph is considered. We write IBP_ε(f) for the idealized branching process with type space [log ε, 0] × {ℓ, r} generated with attachment rule f if we want to emphasize f and ε. The proof of the following lemma is deferred to Section 4.3. Lemma 3.11. Let δ > 0 such that γ(1 + δ) < 1, let ε' ∈ (0, ε), and let (θ_n : n ∈ N) be a sequence of positive numbers with θ_n = o(n). Then the stated estimate holds for all sufficiently large n, v_0 ∈ V_n, and h ∈ N_0. We are now in the position to prove Theorem 1.3. The distance between two vertices v, w ∈ V_n in different components of G_n is defined to be infinite.
Proof of Theorem 1.3. Let v, w ∈ V_n, h ∈ N. With θ_n := (log n)^2, (1.5) yields an estimate in which the error bound is uniform in v, w and h. Markov's inequality yields a bound for every v, w ∈ V_n with v ≠ w and for every h ∈ N. To estimate the probability P(E_k | ∩_{i=1}^{k−1} E_i), we first note that the only edge in the self-avoiding path p on whose presence the event {v_{k−1}, v_k} ∈ E_n can depend is {v_{k−2}, v_{k−1}}.
The possible arrangements of these two edges are sketched in Figure 4. When v_{k−2} < v_{k−1} (cases A, B, C in Figure 4), then we, in addition, have knowledge of edges whose left vertex is v_{k−2}. However, these are always independent of {v_{k−1}, v_k}. If v_{k−1} < v_k (cases A, D, E in Figure 4), then event E_k requires that ΔZ[v_{k−1}, v_k] ≤ θ_n. Since edges with left vertex v_{k−1} depend only on edges whose left vertex is also v_{k−1}, the only dependent arrangements are those in cases A, B, C and F. Using Lemma 3.9 and (3.8), we can bound the probability in both cases by f(1)/(εn).
Combining this estimate with (3.11) and (3.12), we obtain a combined bound. We denote by ρ̄ the spectral radius of the operator A associated to IBP_ε((1 + δ̄)f) and by φ̄ the corresponding eigenfunction. Choose a constant C such that for all sufficiently small δ̄, C ≥ max_φ φ̄(φ) / min_φ φ̄(φ). This is possible since the eigenfunctions are continuous in δ̄ (this can be seen along the lines of Note 3 to Chapter II on pages 568-569 of [27]). Furthermore, by the continuity of the spectral radius with respect to the operator (see Chapter II.5 in [27]) and since ρ_ε(A) > 1 by assumption, ρ̄ > 1 for all small δ̄.
Hence, for all v, w ∈ V_n, v ≠ w, the bound applies. In particular, for δ > 0 and h_n := (1 − δ²) log n / log ρ̄, we showed that P(d_{G_n}(v, w) ≤ h_n) = o(1). For independent, uniformly chosen vertices V_n, W_n in C_n, we have V_n ≠ W_n with high probability. According to (3.10), this implies P(d_{G_n}(V_n, W_n) ≤ h_n) = o(1). Choosing δ̄ so small that log ρ̄ ≤ (1 + δ) log ρ_ε(A), it follows that, with high probability, d_{G_n}(V_n, W_n) ≥ (1 − δ) log n / log ρ_ε(A).

Approximation by a branching process
In this section, we compare the connected components in the network to the multitype branching process defined in Section 2.1. We begin by coupling the local neighbourhood of a uniformly chosen vertex to the IBP in Sections 4.1 and 4.2. This local consideration allows us to draw conclusions about the existence or nonexistence of the giant component from knowledge of the branching process; see Section 4.4. For the analysis of the typical distances in the network, knowing the local neighbourhood is insufficient. We show in Section 4.3 that a slightly larger IBP dominates the network globally in a suitable way.

Coupling the network to a tree
The proof of the coupling follows the lines of [15] for the undamaged network, but unfortunately we cannot use their results directly as the coupling in [15] makes extensive use of vertices which are removed in the damaged network. Note however that the removal of the old vertices significantly reduces the risk of cycles in the local neighbourhood of a vertex and, therefore, the coupling here will be successful for much longer than the coupling in [15].
In the first step, we couple the local neighbourhood of a vertex v 0 in G n to a labelled tree T n (v 0 ), thus ruling out cycles in that subgraph. In Section 4.2 we then study the asymptotics of the offspring distributions to arrive at the IBP.
Every vertex v in the labelled tree T_n(v_0) is equipped with a 'tag' in V_n and a 'mark' α ∈ V_n ∪ {ℓ}. The tag indicates which vertex in the network is approximated by v. We use the same notation for vertex and tag to emphasize the similarity between the tree and the network. The mark α carries information about the tag of the parent w of v in the tree. In the spirit of Section 1.6, v has mark α = ℓ if its parent has a smaller tag, i.e. w < v, and we say that the parent of v is on its left. In contrast, if w > v, we say that the parent is on its right. It turns out that here it is beneficial to record the exact tag of w instead of only the relative position, and we choose α = w. Hence, a typical label is of the form (v, α).
To construct the coupling, we run an exploration process on the connected component of v_0. The offspring distribution of a vertex v in the tree is chosen to be the same as the distribution of the direct neighbours of v in G_n when the only information used is the vertex w as whose direct neighbour v was found in the exploration. That vertex w determines the mark of v. The need for this information to identify the offspring distribution is the reason why vertices in T_n(v_0) are equipped with marks, whereas vertices in G_n(v_0) are not.
Note the similarity to the comparison between network and IBP sketched in Section 1.6.
Formally, for $v_0 \in V_n$, let $T_n(v_0)$ be the random tree with root $v_0$, carrying label $(v_0, \ell)$, constructed as follows: every vertex v independently produces offspring to the left, i.e. with tag $u \in \{\varepsilon n + 1, \ldots, v-1\}$, with probability $P(v \text{ has a descendant with tag } u) = P(Z[u, v-1] = 1)$.
All offspring on the left are of mark v. Moreover, independently, v produces descendants to its right (i.e. with tag at least $v+1$). Since the parent of these descendants is on their left, they are of mark $\ell$. The distribution of the cumulative sum of the sequence of relative positions of the right descendants depends on the mark of v. When v is of mark $\alpha = \ell$, the cumulative sum is distributed according to the law of $(Z[v,u] : v+1 \le u \le n)$. When v is of mark $\alpha = w \in V_n$, $w > v$, the cumulative sum follows the same law conditioned on $Z[v, w-1] = 1$. The percolated version $T_{n,p}(v_0)$ is obtained from $T_n(v_0)$ by deleting every particle together with its line of descent with probability $1-p$, independently for all particles. In particular, with probability $1-p$ the root $v_0$ is deleted and $T_{n,p}(v_0)$ is empty.
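The percolation step just described, deleting each particle together with its entire line of descent with probability $1-p$, is easy to simulate. The following minimal sketch (our own Python illustration; the dict-based tree representation and all identifiers are ours, not from the paper) keeps each vertex independently with probability p and discards the whole subtree of any deleted vertex:

```python
import random

def percolate(children, root, p, rng=None):
    """Percolate a tree: keep each vertex with probability p;
    deleting a vertex removes its entire line of descent."""
    rng = rng or random.Random(0)
    if rng.random() > p:      # the root itself may be deleted,
        return {}             # in which case the tree is empty
    kept = {root: []}
    stack = [root]
    while stack:
        v = stack.pop()
        for w in children.get(v, []):
            if rng.random() <= p:   # w survives percolation
                kept[v].append(w)
                kept[w] = []
                stack.append(w)
            # else: w and its whole subtree are discarded
    return kept
```

For $p = 1$ the tree is returned unchanged and for $p = 0$ the result is empty, matching the two boundary cases in the definition.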
We write $C_{n,p}(v_0)$ for the connected component in $G_n(p)$ containing vertex $v_0$.
Proposition 4.1. Suppose $(c_n : n \in \mathbb{N})$ is a sequence of positive integers that satisfies $\lim_{n\to\infty} c_n^2/n = 0$. Then there exists a coupling of a uniformly chosen vertex $V_n$ in $V_n$, the graph $G_n(p)$ and the tree $T_{n,p}(V_n)$ such that $|C_{n,p}(V_n)| \wedge c_n = |T_{n,p}(V_n)| \wedge c_n$ with high probability.
To prove Proposition 4.1, we define an exploration process which we then use to inductively collect information about the tree and the network on the same probability space. We show that the two discovered graphs agree until a stopping time, which is with high probability larger than c n . After that time, the undiscovered part of the tree and the network can be generated independently of each other. This result offers a detailed description of the local neighbourhood of a vertex, and is much stronger than the equal cardinality stated in Proposition 4.1.
We begin by specifying the exploration process that is used to explore the connected component of a vertex $v_0$ in a labelled graph G, like $C_{n,p}(v_0)$ or $T_{n,p}(v_0)$. We distinguish three categories of vertices: veiled vertices, which have not yet been discovered; active vertices, which have been discovered but whose neighbourhood has not yet been explored; and dead vertices, whose neighbourhood has been fully explored.
We couple the exploration processes of the network and the tree started with $v_0 \in V_n$ up to a stopping time T, such that up to time T both explored subgraphs (without the marks) coincide. In particular, the explored part of $C_{n,p}(v_0)$ is a tree and every tag has been used at most once by the active or dead vertices in $T_{n,p}(v_0)$. The event that at least one of these properties fails is called E. We also stop the exploration when either the number of dead and active vertices exceeds $c_n$ or when there are no active vertices left.
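Generically, such an exploration can be sketched as follows (our own schematic Python version, not from the paper): vertices move from veiled to active when discovered and from active to dead once their neighbourhood has been explored; the leftmost (smallest-tag) active vertex is always explored next, and the process stops when the number of active and dead vertices reaches a cap or no active vertex remains.

```python
def explore(adj, v0, cap):
    """Explore the component of v0 in a graph given as an adjacency
    dict, always expanding the leftmost (smallest) active vertex,
    until `cap` vertices are unveiled or no active vertex is left."""
    active, dead = [v0], set()
    while active and len(active) + len(dead) < cap:
        v = min(active)            # leftmost active vertex
        active.remove(v)
        dead.add(v)                # v's neighbourhood is explored now
        for w in adj.get(v, ()):
            if w not in dead and w not in active:
                active.append(w)   # w was veiled, becomes active
    return dead, set(active)
```

On a path graph, for instance, the exploration unveils the vertices in increasing order of tag until the cap is reached.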
In this case we say that the coupling is successful. If we have to stop as a consequence of E, we say that the coupling fails. In the sequel, we label some key constants by the lemma in which they first appear. In the proof of Lemma 4.2 we make use of the following result.
Lemma 4.3 ([15]). Let $(c_n : n \in \mathbb{N})$ satisfy $\lim_{n\to\infty} c_n/n = 0$. Then there exists a constant $C_{4.3} > 0$ such that, for all sufficiently large n, for all disjoint sets $I_0, I_1 \subseteq V_n$ with $|I_0| \le c_n$ and $|I_1| \le 1$, and for all $u, v \in V_n$, the required estimate holds.
With n so large that $n \ge \bar{n}$, Lemma 3.9 and (3.8) imply the required bound. Since $c_n/n$ tends to zero as $n \to \infty$, the right-hand side converges to one.
Proof of Lemma 4.2. We assume that n is so large that $n \ge \bar{n}$. To distinguish the exploration processes, we use the term 'descendant' for a child in the labelled tree and the term 'neighbour' in the context of $G_n(p)$. The $\sigma$-algebra generated by the exploration until the completion of step k is denoted $\mathcal{F}_k$.
Since the probability of removing $v_0$ is the same in $C_{n,p}(v_0)$ and $T_{n,p}(v_0)$, this event can be perfectly coupled. If $v_0$ is not removed, then we explore the immediate neighbours of $v_0$ in $G_n(p)$ and the children of the root $v_0$ in the tree. Again these families are identically distributed and can be perfectly coupled. Now suppose that we have successfully completed exploration step k and are about to start the next step from vertex v. At this stage every vertex in the tree can be uniquely referred to by its tag and the subgraphs coincide. Denoting by a and d the set of active and dead vertices, respectively, we have $a \neq \emptyset$ and $|a \cup d| < c_n$. We continue by exploring the left descendants and neighbours of v. Since we always explore the leftmost active vertex, we cannot encounter a dead or active neighbour in this step. However, in the tree $T_{n,p}(v_0)$ we may find a dead left descendant (i.e. an offspring whose tag agrees with the tag of a dead particle); we call this event Ia. On Ia, the vertices in the explored part of $T_{n,p}(v_0)$ are no longer uniquely identifiable by their tag and we stop. To bound P(Ia), we use subadditivity and the definition of $T_{n,p}(v_0)$, and omit the event that offspring of v are removed by percolation. Hence, $P(\mathrm{Ia}) = O(c_n/n)$. In the exploration to the left in the tree, we immediately check whether a found left descendant has a right descendant which is dead. We denote this event by Ib and stop the exploration as soon as it occurs, because in the network this event could not happen, since we always explore the leftmost active vertex. Therefore, the distribution of left neighbours agrees with the distribution of the left descendants conditioned on having no dead right descendants, and we can couple both explorations such that they agree in this case. To estimate the probability of the adverse event Ib, we use the definition of $T_{n,p}(v_0)$ and note that, by definition of the exploration process, there are at most $c_n$ dead vertices.
Therefore, Lemma 3.9 and (3.8) yield a bound which implies in particular that $P(\mathrm{Ib}) = O(c_n/n)$.
We turn to the exploration of right descendants, resp. neighbours. When vertex v is of mark $\alpha = w \in V_n$, then we already know that v has no right descendants, resp. neighbours, in d, since we checked this when v was discovered. We denote the event that a right descendant, resp. neighbour, is active by IIr and stop the exploration as soon as this event occurs, because then the tags in $T_{n,p}(v_0)$ are no longer unique, resp. we have found a cycle in $C_{n,p}(v_0)$. According to Lemma 4.3 and (3.8), $P(\mathrm{IIr}) = O(c_n/n)$. Conditional on the event that there are no active vertices among the right descendants, resp. neighbours, the offspring distributions in tree and network agree and can therefore be perfectly coupled. When the vertex v is of mark $\alpha = \ell$, then we have not yet gained any information about its right descendants. The event that there is a dead or active vertex among the right descendants is denoted by IIa. We stop when this event occurs and use (3.8) to estimate $P(\mathrm{IIa}) = O(c_n/n)$. In $C_{n,p}(v_0)$, we know that v has no dead right neighbours, as this would have stopped the exploration at the moment when v became active. The event that there are active vertices among the right neighbours is denoted by IIb, and we stop as soon as it occurs, since a cycle is created. Using again (3.8), we find $P(\mathrm{IIb}) = O(c_n/n)$. As in the case $\alpha = w$, the explorations can be perfectly coupled when the adverse events do not occur. We have shown that in every step the coupling fails with probability $O(c_n/n)$. As there are at most $c_n$ exploration steps until we end the coupling successfully, the probability of failure is $O(c_n^2/n) = o(1)$. In other words, the coupling succeeds with high probability.
Proof of Proposition 4.1. First, consider the statement for a fixed vertex v 0 . When the coupling is successful and ends because at least c n vertices were explored, then |C n,p (v 0 )| ≥ c n and |T n,p (v 0 )| ≥ c n . If the coupling is successful and ends because there are no active vertices left, then |C n,p (v 0 )| = |T n,p (v 0 )| since the subgraphs coincide.
Since the coupling is successful with high probability by Lemma 4.2, |C n,p (v 0 )| ∧ c n = |T n,p (v 0 )| ∧ c n with high probability. As Lemma 4.2 shows the success of the coupling uniformly in the start vertex, the randomization of the vertex v 0 to a uniformly chosen vertex V n ∈ V n is now straightforward.

Coupling the tree to the IBP
Coupling the neighbourhood of a vertex to a labelled tree provides a great simplification of the problem, since many dependencies are eliminated. However, the offspring distribution in the tree $T_{n,p}(V_n)$ is still complicated and depends on n. Since we are mainly interested in the asymptotic size of the giant component, we now couple the tree to the IBP, which does not depend on n and is much easier to analyse. We denote by $|\mathcal{X}(p)|$ the total progeny of the IBP. Recall the definition of S from (2.2).
Proposition 4.4. Let $p \in (0,1]$ and $(c_n : n \in \mathbb{N})$ be a sequence of positive integers with $\lim_{n\to\infty} c_n^3/n = 0$. Then there exists a coupling of a uniformly chosen vertex $V_n$ in $V_n$, the graph $G_n(p)$ and the percolated IBP started with a particle of mark $\ell$ and location S such that, with high probability, $|C_{n,p}(V_n)| \wedge c_n = |\mathcal{X}(p)| \wedge c_n$.
Proof of Proposition 4.4. Throughout the proof, suppose that n is so large that $n \ge \bar{n}$.
Instead of coupling the IBP directly to the network, we couple a projected version of the IBP to the tree $T_{n,p}(V_n)$. As long as the number of particles is preserved under the projection, this is sufficient by Proposition 4.1. To describe the projection, we use the map $\pi_n$ defined in (4.1), where $s_n(v) = -\sum_{j=v}^{n-1} \frac{1}{j}$. Since $s_n(\varepsilon n) < \log(\varepsilon n / n) \le \log \varepsilon$, every location in $[\log \varepsilon, 0]$ can be uniquely identified with a tag in $V_n$ by the map $\pi_n$. The projected IBP is again a labelled tree: the genealogical tree of the IBP with its marks is preserved, and the location of a particle x is replaced by the tag $\pi_n(\lambda(x))$. If $s_n(\varepsilon n + 1) < \log \varepsilon$, then no particles of the IBP are projected onto $\varepsilon n + 1$. Moreover, while for $v \ge \varepsilon n + 3$ an interval of length $1/(v-1)$ is projected onto v, for $\varepsilon n + 2$ only an interval of length at most $s_n(\varepsilon n + 2) - \log \varepsilon$ is used. This length is positive but may be smaller than $1/(\varepsilon n + 1)$. As a consequence, the projected IBP can have unusually few particles at $\varepsilon n + 1$ and $\varepsilon n + 2$, and we treat these two tags separately.
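Numerically, the projection behaves as follows: $s_n$ is strictly increasing in v with $s_n(n) = 0$, and a location $\lambda \in [\log\varepsilon, 0]$ is sent to a tag via the interval structure just described. The sketch below is our own illustration (the convention of taking the smallest tag v with $s_n(v) \ge \lambda$ is our reading of (4.1), which is not reproduced in this text):

```python
import math

def s(n, v):
    """s_n(v) = -sum_{j=v}^{n-1} 1/j; strictly increasing in v, s_n(n) = 0."""
    return -sum(1.0 / j for j in range(v, n))

def project(n, eps, lam):
    """pi_n: send a location lam in [log eps, 0] to the smallest tag v
    in {eps*n+1, ..., n} with s_n(v) >= lam (our reading of (4.1))."""
    for v in range(int(eps * n) + 1, n + 1):
        if s(n, v) >= lam:
            return v
    return n

# for these parameters: s_n(eps*n) < log(eps) <= s_n(eps*n + 1)
assert s(1000, 100) < math.log(0.1) <= s(1000, 101)
```

Consistent with the text, the interval projected onto a tag v has length $s_n(v) - s_n(v-1) = 1/(v-1)$.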
The exploration of the two trees follows the same procedure as the exploration described in Section 4.1 and we declare the coupling successful and stop as soon as either there are no active vertices left or the number of active and dead vertices exceeds c n .
Since both objects are trees, as long as the labels of the starting vertices agree, any failure of the coupling comes from a failure in the coupling of the offspring distributions. For simplicity, we consider only the case p = 1; the generalization to $p \in (0,1]$ is straightforward. We first show that the labels of the starting vertices can be coupled with high probability. To this end, note that the distribution of S is chosen such that $\exp(S)$ is uniformly distributed on $(\varepsilon, 1)$. Moreover, the probability that $V_n$ or $\pi_n(S)$ is in $\{\varepsilon n + 1, \varepsilon n + 2\}$ is of order O(1/n). Hence, $V_n$ and S can be coupled such that $V_n = \pi_n(S)$ with high probability. In the next step we study the offspring distributions of a particle x in the IBP with label $(\lambda, \alpha)$ and $\pi_n(\lambda) = v$. We start with the offspring to the left. Let $u \in \{\varepsilon n + 1, \ldots, v\}$. By definition of the IBP, particle x produces a Poissonian number of projected offspring with tag u. A vertex with tag v in $T_n(V_n)$ produces a Bernoulli distributed number of descendants with tag u, with success probability $P(Z[u, v-1] = 1)$ when $u < v$, and with success probability zero when $u = v$. It is proved in Lemma 6.3 of [15] that for $u \ge \varepsilon n + 3$ the Poisson distribution can be coupled to the Bernoulli distribution such that they disagree with a probability bounded by a constant multiple of $v^{\gamma-1} u^{-(\gamma+1)}$ for $u < v$, and of $1/v$ for $u = v$. For $u \in \{\varepsilon n + 1, \varepsilon n + 2\}$ a similar estimate shows that the probability can be bounded by a constant multiple of $1/(\varepsilon n)$. Since the numbers of descendants with tags in $\{\varepsilon n + 1, \ldots, v\}$ form an independent sequence of random variables, we can apply the coupling sequentially for each location and obtain a coupling of the $\pi_n$-projected left descendants in the IBP and the left descendants in $T_n(V_n)$. The failure probability of this coupling is of order $O(1/n)$; here and in the sequel $C, C', C''$ denote suitable positive constants whose values can change from line to line. We turn to the offspring on the right. Suppose that particle x in the IBP has mark $\alpha = \ell$.
The cumulative sum of $\pi_n$-projected right descendants of x follows the same distribution as $(Z_{s_n(u)-\lambda} : v \le u \le n)$. The cumulative sum of right descendants of v in $T_n(V_n)$ is distributed according to the law of $(Z[v, u] : v \le u \le n)$. The following lemma is taken from [15] and we omit its proof.
Lemma 4.5 (Lemma 6.2 in [15]). Fix a level $H \in \mathbb{N}$. We can couple the processes $(Z_{s_n(u)-\lambda} : v \le u \le n)$ and $(Z[v, u] : v \le u \le n)$ such that the probability that the coupled processes disagree before time $\sigma_H$ is controlled by a constant $C_{4.5} > 0$, where $\sigma_H$ is the first time one of the processes reaches or exceeds H.
In the coupling between the tree $T_n(V_n)$ and the projected IBP we consider at most $c_n$ right descendants. Hence, Lemma 4.5 implies that the distributions can be coupled such that the failure probability is bounded by $C c_n^2/n$ for some C > 0. When $\alpha = r$, the cumulative sum of $\pi_n$-projected right descendants of x follows the same distribution as $(\hat{Z}_{s_n(u)-\lambda} - 1 : v \le u \le n)$. The cumulative sum of right descendants of a vertex v with mark $w \in V_n$, $w > v$, in $T_n(V_n)$ is distributed according to the law of $(Z[v, u] : v \le u \le n)$ conditioned on $Z[v, w-1] = 1$. We can couple these two distributions; again the proof of the following lemma is, up to minor changes, given in [15] and therefore omitted.
Lemma 4.6 (Lemma 6.6 in [15]). Fix a level $H \in \mathbb{N}$. We can couple the processes such that the probability that they disagree before time $\sigma_H$ is controlled by a constant $C_{4.6} > 0$, where $\sigma_H$ is the first time one of the processes reaches or exceeds H.
As we explore at most $c_n$ vertices during the exploration, Lemma 4.6 implies that we can couple the offspring distributions to the right with a failure probability bounded by $C c_n^2/n$ for some constant C > 0. Since we explore at most $c_n$ vertices in total, the probability that the coupling fails can be bounded by a constant multiple of $c_n/n + c_n^3/n$, which converges to zero. Thus, the two explorations can be successfully coupled with high probability and, as in the proof of Proposition 4.1, the claim follows.

Dominating the network by a branching process
As in the coupling, we begin with a comparison to a tree: for $\theta \in \mathbb{N}$ and $v_0 \in V_n$, let $T^{\theta}_n(v_0)$ be the subtree of $T_n(v_0)$ in which every particle can have at most $\theta$ offspring to the right. That is, for a particle with tag v and mark $\alpha = \ell$, the cumulative sum of the offspring to the right is distributed according to the law of $(Z[v, u] \wedge \theta : v+1 \le u \le n)$; when v is of mark $\alpha = w \in V_n$, $w > v$, the cumulative sum follows the same law conditioned on $Z[v, w-1] = 1$.
Proof of Lemma 4.7. Let $p = (v_0, \ldots, v_h) \in S_h(v_0)$. Using the notation and set-up from the proof of Theorem 1.3, and the definition of the tree $T^{\theta}_n(v_0)$, one easily checks that in cases A, B, C, E and F of Figure 4 on page 31, $P(E_h \mid \bigcap_{i=1}^{h-1} E_i)$ agrees with the probability that in the tree $T^{\theta}_n(v_0)$ a vertex with tag $v_{h-1}$ gives birth to a particle of tag $v_h$, given that its parent has tag $v_{h-2}$. In case D of Figure 4, the tree $T^{\theta}_n(v_0)$ is allowed to have one more offspring on its right, because the edge $\{v_{h-2}, v_{h-1}\}$ is not accounted for.
In each of these cases the probability of the event in the network is therefore bounded from above by the probability of the corresponding event in the tree. Particles in generation h of $T^{\theta}_n(v_0)$ which have two ancestors with the same tag are not represented in the resulting sum on the right-hand side. Adding these, we obtain the result.
Proof of Lemma 3.11. By Lemma 4.7, it suffices to dominate $T^{\theta_n}_n(v_0)$ by the IBP with attachment rule $(1+\delta)f$ started in $s_n(v_0)$, or, as in the proof of Proposition 4.4, by the $\pi_n$-projected IBP defined by (4.1). Since both processes are trees starting with the same type of particle, it suffices to compare the offspring distributions. All particles in $T^{\theta_n}_n(v_0)$ have a tag $v > \varepsilon n$, but the projected IBP can also have offspring with smaller tags.
Hence, these offspring are ignored in the following, giving us a lower bound on the projected IBP. We assume that n is so large that $n \ge \bar{n}$ and $s_n(\varepsilon n + 1) \ge \log \varepsilon$. Let x be a particle in the IBP of type $(\lambda, \alpha)$ with $\pi_n(\lambda) = v$. We begin with the offspring to the left, i.e. with tag $u \in \{\varepsilon n + 1, \ldots, v\}$. A particle in $T^{\theta_n}_n(v_0)$ with tag v cannot produce particles at $u = v$; therefore, the IBP clearly dominates there. For $u < v$, using (3.8), the probability that a particle with tag u is a child of x is $P(Z[u, v-1] = 1)$. Writing $\tilde{f}(k) = (1+\delta) f(k) = \tilde{\gamma} k + \tilde{\beta}$ for $k \in \mathbb{N}_0$, the number of particles with tag u produced by x in the projected IBP follows a Poisson distribution whose parameter we bound from below using $\lambda \le s_n(v)$ and $e^y - 1 \ge y$. For $\ell > 0$ and $\eta \in [0,1]$, the Poisson distribution with parameter $\ell$ dominates the Bernoulli distribution with parameter $\eta$ if and only if $e^{-\ell} \le 1 - \eta$. Since $e^{-y} \le 1 - y + y^2/2$ for $y \ge 0$, it suffices to show that $\ell(1 - \ell/2) \ge \eta$.
This inequality holds for all large n and all $u \in V_n$, $u < v$, since $\eta$ is a null sequence.
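The domination criterion used in this step can be checked numerically. The sketch below (our own illustration, not from the paper) verifies on a grid that the sufficient condition $\ell(1-\ell/2) \ge \eta$ indeed implies $e^{-\ell} \le 1 - \eta$, i.e. stochastic domination of the Bernoulli distribution by the Poisson distribution:

```python
import math

def poisson_dominates_bernoulli(l, eta):
    """Poisson(l) stochastically dominates Bernoulli(eta) iff
    P(Poisson = 0) = e^{-l} <= 1 - eta = P(Bernoulli = 0)."""
    return math.exp(-l) <= 1 - eta

def sufficient(l, eta):
    """Sufficient condition derived from e^{-y} <= 1 - y + y^2/2, y >= 0."""
    return l * (1 - l / 2) >= eta

# the sufficient condition implies domination everywhere on the grid
for l in (x / 100 for x in range(1, 150)):
    for eta in (y / 100 for y in range(0, 100)):
        if sufficient(l, eta):
            assert poisson_dominates_bernoulli(l, eta)
```

The "if and only if" direction holds because a Bernoulli variable never exceeds 1, so only the masses at 0 need to be compared.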
We turn to the right descendants. The pure jump process corresponding to the attachment rule $\tilde{f}$ is denoted by $\tilde{Z}$, and we write $P_l$ for the distribution of $\tilde{Z}$ when started in l, that is, $P_l(\tilde{Z}_0 = l) = 1$. First suppose that $\alpha = \ell$. The cumulative sum of $\pi_n$-projected right descendants of x has the distribution of $(\tilde{Z}_{s_n(u)-\lambda} : v \le u \le n)$, where $\tilde{Z}_0 = 0$. The cumulative sum of right descendants of v in $T^{\theta_n}_n(v_0)$ is distributed according to the law of $(Z[v, u] \wedge \theta_n : v \le u \le n)$. We couple these distributions by defining $((Y^{(1)}_u, Y^{(2)}_u) : v \le u \le n)$ to be the time-inhomogeneous Markov chain which starts in $P_0(\tilde{Z}_{s_n(v)-\lambda} \in \cdot\,) \otimes \delta_0$, has the desired marginals and evolves from state (l, k) at time j according to a coupling of $\tilde{Z}_{1/j}$ and $Z[j, j+1]$ which guarantees that $\tilde{Z}_{1/j} \ge Z[j, j+1]$, until $Y^{(2)}$ reaches state $\theta_n$, where $Y^{(2)}$ is absorbed. To show that this coupling exists, it suffices to show that $e^{-\tilde{f}(l)/j} = P_l(\tilde{Z}_{1/j} = l) \le P_k(Z[j, j+1] = k) = 1 - f(k)/j$ for $j \in V_n$, $k \le \theta_n$, $k \le l$.
Now suppose that $\alpha = r$ and that the location of the parent of x is projected onto tag w. The cumulative sum of $\pi_n$-projected right descendants of x has the distribution of $(Y_{s_n(u)-\lambda} : v \le u \le n)$, where Y is a version of $\tilde{Z}$ under the measure $P_1$. The cumulative sum of right descendants of v in $T^{\theta_n}_n(v_0)$ is distributed according to the law of $(Z[v, u] \wedge \theta_n : v \le u \le n)$ conditioned on $Z[v, w-1] = 1$. We couple these distributions as in the case $\alpha = \ell$, but for times $j \le w - 2$ the Markov chain evolves from state (l, k) according to a coupling of $Y_{1/j}$ and $Z[j, j+1]$ conditioned on $Z[j, w-1] = 1$, which guarantees that $Y_{1/j} \ge Z[j, j+1]$, until either $j = w - 2$ or $Y^{(2)}$ reaches $\theta_n$ and is absorbed. To show that this coupling exists, it suffices to verify the analogous inequality (4.2). Since $\tilde{f}$ is non-decreasing, (4.2) follows when we show that $\ell(1 - \ell/2) \ge \eta$ with $\eta = f(k+1)/(j+\gamma)$ and $\ell = \tilde{f}(k+1)/j = \eta (1+\delta)(1 + \gamma/j)$. Since $k \le \theta_n = o(n)$ and $j \ge \varepsilon n$, $\eta$ is a null sequence and (4.2) is proved. In the transition from generation $j = w-1$ to $j = w$, $Y^{(2)}$ cannot change its state while $Y^{(1)}$ can increase. From generation $j = w$ onwards, the coupling explained in the case $\alpha = \ell$ is used. Thus, the Markov chain can be constructed such that $Y^{(1)}_j \ge Y^{(2)}_j$ for all j, and the domination is proved.
This convergence can be strengthened to convergence in probability: the empirical fraction $\frac{1}{n - \varepsilon n} \sum_{v = \varepsilon n + 1}^{n} \mathbb{1}\{|C_{n,p}(v)| \ge c_n\}$ converges to $\zeta_\varepsilon(p)$ in probability, as $n \to \infty$.
To estimate the probability $P(|C_{n,p}(v)| \ge c_n, |C_{n,p}(w)| \ge c_n)$, we run two successive explorations in the graph $G_n(p)$, the first starting from v and the second starting from w. For these explorations, we use the exploration process described below Proposition 4.1, but in every step only neighbours in the set of veiled vertices are explored. The first exploration is terminated as soon as either the number of dead and active vertices exceeds $c_n$ or there are no active vertices left. The second exploration additionally stops when a vertex is found which was already unveiled in the first exploration. We denote $\Theta_v := \{$the first exploration started in vertex v stops because $c_n$ vertices are found$\}$. Then, for any $v \in V_n$, $P(|C_{n,p}(v)| \ge c_n) = P(\Theta_v)$, and in the proof of Proposition 7.1 of [15] it was shown that there exists a constant C > 0, independent of v and n, such that $\sum_{w=\varepsilon n+1}^{n} P(|C_{n,p}(v)| \ge c_n, |C_{n,p}(w)| \ge c_n)$ satisfies the corresponding bound. Moreover, $\frac{1}{n - \varepsilon n}\sum_{v=\varepsilon n+1}^{n} \mathbb{1}\{|C_{n,p}(v)| \ge 2c_n\} \ge \kappa$ with high probability.
Then there exists a coupling of the networks $(G_n(p; f-\delta))_n$ and $(G_n(p; f))_n$ such that $G_n(p; f-\delta) \subseteq G_n(p; f)$ and, with high probability, all connected components in $G_n(p; f-\delta)$ with at least $2c_n$ vertices belong to one connected component in $G_n(p; f)$. Lemma 4.11 in the case $\varepsilon = 0$ and p = 1 is Proposition 4.1 in [15]. The proof is valid for $\varepsilon \in [0, 1)$, $p \in (0, 1]$ up to obvious changes and is therefore omitted. Moreover, for $\delta \in (0, f(0))$, another application of Lemma 4.9 implies that $M_{n,p}(2c_n, f-\delta)$ converges to $\zeta_\varepsilon(p, f-\delta)$ in probability. Hence, Lemma 2.11 and Lemma 4.11 imply that for all $\delta' > 0$, $|C_{n,p}| \ge (n - \varepsilon n)(\zeta_\varepsilon(p) - \delta')$ with high probability. This concludes the proof.

Variations and other models
We study the preferential attachment network with a non-linear attachment rule in Section 5.1, and inhomogeneous random graphs and the configuration model in Sections 5.2.2 and 5.2.1, respectively.

The configuration model
Let D denote the degree of a uniformly chosen vertex. Janson and Luczak [26] showed that if (1.6) holds and $P(D = 2) < 1$, then $(G^{(\mathrm{CM})}_n : n \in \mathbb{N})$ has a giant component if and only if $E[D(D-1)] > E[D]$. Janson [25] found a simple construction that allows one to obtain a corresponding result for the network after random or deterministic removal of vertices (or edges), where the retention probability of a vertex can depend on its degree. Let $\pi = (\pi_k)_{k \in \mathbb{N}}$ be a sequence of retention probabilities with $\pi_k P(D = k) > 0$ for some k. Every vertex i is removed with probability $1 - \pi_{d_i}$ and kept with probability $\pi_{d_i}$, independently of all other vertices. To construct $G^{(\mathrm{CM}), \varepsilon}_n(p)$, we remove the $\varepsilon n$ vertices with the largest degree from $G^{(\mathrm{CM})}_n$ and then run vertex percolation with retention probability p on the remaining graph. In general, this does not fit exactly into the setup of Janson. To emulate the behaviour, we denote by $n_j$ the number of vertices with degree j in the graph and let $K_n = \inf\{k \in \mathbb{N}_0 : \sum_{j=k+1}^{\infty} n_j \le \varepsilon n\}$. Then all vertices with degree larger than $K_n$ are deterministically removed in $G^{(\mathrm{CM}), \varepsilon}_n(p)$, i.e. $\pi_j = 0$ for $j \ge K_n + 1$. In addition, we deterministically remove $\varepsilon n - \sum_{j=K_n+1}^{\infty} n_j$ vertices of degree $K_n$, while all other vertices are subject to vertex percolation with retention probability p. In particular, $\pi_j = p$ for $j \le K_n - 1$.
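For illustration, the deterministic part of this removal scheme, computing $K_n$ and the number of degree-$K_n$ vertices that must be removed deterministically, can be sketched as follows (our own Python illustration; function and variable names are ours):

```python
from collections import Counter

def removal_plan(degrees, eps_n, p):
    """Emulate the removal scheme for G^{(CM),eps}_n(p): given the degree
    sequence, return K_n, the number of degree-K_n vertices removed
    deterministically, and the retention probabilities pi_j for j != K_n."""
    n_j = Counter(degrees)            # n_j = number of vertices of degree j

    def tail(k):                      # number of vertices with degree > k
        return sum(c for j, c in n_j.items() if j > k)

    K = 0
    while tail(K) > eps_n:            # K_n = inf{k : sum_{j>k} n_j <= eps_n}
        K += 1
    extra = eps_n - tail(K)           # degree-K_n vertices removed deterministically
    pi = {j: (0.0 if j > K else p) for j in n_j if j != K}
    return K, extra, pi
```

Degrees above $K_n$ get $\pi_j = 0$, degrees below get $\pi_j = p$, and the remaining degree-$K_n$ vertices (all but `extra` of them) are percolated with retention probability p.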

Inhomogeneous random graphs
The classical Erdős–Rényi random graph can be generalized by giving each vertex a weight and choosing the probability for an edge between two vertices as an increasing function of their weights. Suppose that $\kappa \colon (0, 1] \times (0, 1] \to (0, \infty)$ is a symmetric, continuous kernel, with the associated operator $g \mapsto \int_\varepsilon^1 \kappa(x, y) g(y)\, dy$ defined for all $x \in (\varepsilon, 1)$ and for all measurable functions g such that the integral is well-defined; $\|\cdot\|_{L^2(\varepsilon,1)}$ denotes the operator norm on the $L^2$-space with respect to the Lebesgue measure on $(\varepsilon, 1)$. The same result holds for a version of the Norros–Reittu model in which edges between different vertex pairs are independent and edge $\{i, j\}$ is present with probability $1 - e^{-\kappa(i/n, j/n)/n}$ for all $i, j \in \{1, \ldots, n\}$. Consequently, the estimates given in Theorem 1.7 hold for this model, too.
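The independent-edge version just described is straightforward to simulate: each pair $\{i, j\}$ is connected independently with probability $1 - e^{-\kappa(i/n, j/n)/n}$. A small sketch follows (our own illustration; the kernel at the end is chosen purely for demonstration and is not from the paper):

```python
import math
import random

def sample_graph(n, kappa, rng=None):
    """Norros-Reittu-type graph on {1, ..., n}: edge {i, j} is present
    independently with probability 1 - exp(-kappa(i/n, j/n)/n)."""
    rng = rng or random.Random(1)
    edges = set()
    for i in range(1, n + 1):
        for j in range(i + 1, n + 1):
            if rng.random() < 1.0 - math.exp(-kappa(i / n, j / n) / n):
                edges.add((i, j))
    return edges

# e.g. a product kernel reminiscent of the scale-free regime (illustrative only)
g = sample_graph(200, lambda x, y: 0.5 / (x * y) ** 0.6)
```

With a constant kernel this reduces to a sparse Erdős–Rényi-type graph, while kernels that blow up near zero give the older (smaller-index) vertices a higher connection probability.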