Random networks with sublinear preferential attachment: Degree evolutions

We define a dynamic model of random networks, where new vertices are connected to old ones with a probability proportional to a sublinear function of their degree. We first give a strong limit law for the empirical degree distribution, and then take a closer look at the temporal evolution of the degrees of individual vertices, which we describe in terms of large and moderate deviation principles. Using these results, we expose an interesting phase transition: in cases of strong preference of large degrees, eventually a single vertex emerges forever as the vertex of maximal degree, whereas in cases of weak preference, the vertex of maximal degree changes infinitely often. Loosely speaking, the transition between the two phases occurs when a new edge is attached to an existing vertex with probability proportional to the square root of its current degree.


Motivation
Dynamic random graph models, in which new vertices prefer to be attached to vertices with higher degree in the existing graph, have proved to be immensely popular in the scientific literature recently. The two main reasons for this popularity are, on the one hand, that these models can be easily defined and modified, and can therefore be calibrated to serve as models for social networks, collaboration and interaction graphs, or the web graph. On the other hand, if the attachment probability is approximately proportional to the degree of a vertex, the dynamics of the model can offer a credible explanation for the occurrence of power law degree distributions in large networks.
The philosophy behind these preferential attachment models is that growing networks are built by adding nodes successively. Whenever a new node is added, it is linked by edges to one or more existing nodes with a probability proportional to a function f of their degree. This function f, called the attachment rule or sometimes the weight function, determines the qualitative features of the dynamic network.
The heuristic characterisation does not amount to a full definition of the model, and some clarifications have to be made, but it is generally believed that none of these crucially influence the long time behaviour of the model.
It is easy to see that in the general framework there are three main regimes:
• the linear regime, where f(k) ≍ k;
• the superlinear regime, where f(k) ≫ k;
• the sublinear regime, where f(k) ≪ k.
The linear regime has received most attention, and a major case was introduced in the much-cited paper Barabási and Albert (1999). There is by now a substantial body of rigorous mathematical work on this case. In particular, it is shown in Bollobás et al. (2001) and Móri (2002) that the empirical degree distribution follows an asymptotic power law, and in Móri (2005) that the maximal degree of the network grows polynomially, with the same order as the degree of the first node.
In the superlinear regime the behaviour is more extreme. In Oliveira and Spencer (2005) it is shown that a dominant vertex emerges, which attracts a positive proportion of all future edges. Asymptotically, after n steps, this vertex has degree of order n, while the degrees of all other vertices are bounded. In the most extreme cases eventually all vertices attach to the dominant vertex.
In the linear and sublinear regimes Rudas et al. (2007) find almost sure convergence of the empirical degree distributions. In the linear regime the limiting distribution obeys a power law, whereas in the sublinear regime the limiting distributions are stretched exponential distributions. Apart from this, there has not been much research so far in the sublinear regime, which is the main concern of the present article, though we include the linear regime in most of our results.
Specifically, we discuss a preferential attachment model where new nodes connect to a random number of old nodes, which is in fact quite desirable from the modelling point of view. More precisely, the node added in the nth step is connected independently to any old one with probability f(k)/n, where k is the (in-)degree of the old node. We first determine the asymptotic degree distribution, see Theorem 1.1, and find a result which is in line with that of Rudas et al. (2007). The result implies in particular that, if f(k) = (k + 1)^α for 0 ≤ α < 1, then the asymptotic degree distribution (µ_k) satisfies log µ_k ∼ −k^{1−α}/(1−α), showing that power law behaviour is limited to the linear regime. Under the assumption that the strength of the attachment preference is sufficiently weak, we give very fine results about the probability that the degree of a fixed vertex follows a given increasing function, see Theorem 1.10 and Theorem 1.12. These large and moderate deviation results, besides being of independent interest, play an important role in the proof of our main result. This result describes an interesting dichotomy in the behaviour of the vertex of maximal degree, see Theorem 1.7:
• The strong preference case: If ∑_n 1/f(n)² < ∞, then there exists a single dominant vertex (called a persistent hub) which has maximal degree for all but finitely many times. However, only in the linear regime does the number of new vertices connecting to the dominant vertex grow polynomially in time.
• The weak preference case: If ∑_n 1/f(n)² = ∞, then there is almost surely no persistent hub. In particular, the index, or time of birth, of the current vertex of maximal degree is a function of time diverging to infinity in probability. In Theorem 1.15 we provide asymptotic results for the index and degree of this vertex, as time goes to infinity.
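This dichotomy can be illustrated (not proved, of course) by a small simulation: grow the network once with a strong preference rule (α > 1/2) and once with a weak one (α < 1/2), and record after each step the label of the current vertex of maximal indegree. In the first case the hub typically settles early; in the second it keeps changing for much longer. The function names, parameters, and tie-breaking convention below are our own choices for this sketch.

```python
import random

def hub_trace(n_steps, alpha, seed=1):
    """Grow the network with attachment rule f(k) = (k+1)**alpha and record,
    after each step, the label of the current maximal-indegree vertex
    (ties broken in favour of the oldest vertex)."""
    rng = random.Random(seed)
    indeg, hubs = [0], []
    for n in range(1, n_steps):
        for m in range(n):
            # each old vertex receives an edge independently with prob f(indeg)/n
            if rng.random() < (indeg[m] + 1) ** alpha / n:
                indeg[m] += 1
        indeg.append(0)
        hubs.append(max(range(n + 1), key=lambda m: (indeg[m], -m)))
    return hubs

def hub_changes(hubs):
    """Number of times the identity of the hub changes along the trace."""
    return sum(a != b for a, b in zip(hubs, hubs[1:]))

strong = hub_trace(400, alpha=0.8)  # sum 1/f(n)^2 < infinity: persistent hub expected
weak = hub_trace(400, alpha=0.2)    # sum 1/f(n)^2 = infinity: no persistent hub
```

Comparing `hub_changes(strong)` with `hub_changes(weak)` over several seeds gives an impression of the transition, though any finite run can only hint at the almost-sure statements of Theorem 1.7.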
A rigorous definition of the model is given in Section 1.2, and precise statements of all the principal results follow in Section 1.3. At the end of that section, we also give a short overview of the remaining parts of this paper.

Definition of the model
We now explain precisely how we define our preferential attachment model, given a monotonically increasing attachment rule f : {0, 1, 2, . . .} → (0, ∞) with f(n) ≤ n + 1 for all n ∈ Z₊ := {0, 1, . . .}. At time n = 1 the network consists of a single vertex (labeled 1) without edges, and for each n ∈ N the graph evolves in the time step n → n + 1 according to the following rule:
• add a new vertex (labeled n + 1), and
• insert, for each old vertex m, a directed edge n + 1 → m with probability f(indegree of m at time n)/n.
The new edges are inserted independently for each old vertex. Note that the assumptions imposed on f guarantee that in each evolution step the probability of adding an edge is at most 1. Formally we are dealing with a directed network, but by construction all edges point from the younger to the older vertex, so that the directions can trivially be recreated from the undirected (labeled) graph.
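As an illustration (not part of the formal definition), the growth rule above can be sketched in a few lines of code; the function name `grow_network` and the example rule f(k) = (k+1)^{1/2} are our own choices.

```python
import random

def grow_network(n_steps, f, seed=0):
    """Simulate the preferential attachment network.

    The vertex arriving at step n+1 connects to each old vertex m
    independently with probability f(indegree of m)/n.  Requires
    f(k) <= k + 1 so that all probabilities are at most 1.
    Returns the list of indegrees, indexed by vertex (0-based).
    """
    rng = random.Random(seed)
    indeg = [0]                      # time n = 1: a single vertex, no edges
    for n in range(1, n_steps):
        for m in range(n):           # edges to old vertices are independent
            if rng.random() < f(indeg[m]) / n:
                indeg[m] += 1
        indeg.append(0)              # the new vertex starts with indegree 0
    return indeg

# sublinear rule f(k) = (k+1)^(1/2), which satisfies f(k) <= k+1
indeg = grow_network(1000, lambda k: (k + 1) ** 0.5)
```

Note that when f(0) ≥ 1 the second vertex always connects to the first one, since the attachment probability at that step is f(0)/1.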
There is one notable change to the recipe given in Krapivsky and Redner (2001): we do not add one edge in every step but a random number of edges, a property which is actually desirable in most applications. Given the graph after attachment of the nth vertex, the expected number of edges added in the next step is (1/n) ∑_{m=1}^{n} f(indegree of m at time n).
This quantity converges almost surely, as n → ∞, to a deterministic limit λ, see Theorem 1.1. Moreover, the law of the number of edges added is asymptotically Poisson with parameter λ.
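The limit λ = ⟨µ, f⟩ can be evaluated numerically from the weights µ_k of Theorem 1.1; we write the product form of µ_k out explicitly below (our reading of the theorem, which should be checked against the statement there), and the helper name and example rule are ours.

```python
def mu_weights(f, kmax):
    """Weights mu_k = (1/(1+f(k))) * prod_{l<k} f(l)/(1+f(l)),
    truncated at kmax; for sublinear f the truncated tail is negligible."""
    weights, prod = [], 1.0
    for k in range(kmax):
        weights.append(prod / (1.0 + f(k)))
        prod *= f(k) / (1.0 + f(k))
    return weights

f = lambda k: (k + 1) ** 0.5                     # example sublinear rule
mu = mu_weights(f, 2000)
total = sum(mu)                                  # should be close to 1
lam = sum(m * f(k) for k, m in enumerate(mu))    # lambda = <mu, f>
```

For this rule the weights decay stretched-exponentially, so the truncation error after 2000 terms is far below floating-point resolution.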
Observe that the outdegree of every vertex remains unchanged after the step in which the vertex was created. Hence our principal interest when studying the asymptotic evolution of degree distributions is in the indegrees.

Presentation of the main results
We denote by Z[m, n], for m, n ∈ N, m ≤ n, the indegree of the mth vertex after the insertion of the nth vertex, and by X_k(n) the proportion of nodes with indegree k ∈ Z₊ at time n, that is X_k(n) = (1/n) #{1 ≤ m ≤ n : Z[m, n] = k}.

Theorem 1.1 (Asymptotic empirical degree distribution).
(i) Let µ_k := 1/(1 + f(k)) · ∏_{l=0}^{k−1} f(l)/(1 + f(l)) for k ∈ Z₊, which defines a sequence of probability weights. Then, almost surely, lim_{n→∞} X(n) = µ in total variation norm.
(ii) If f satisfies f(k) ≤ ηk + 1 for some η ∈ (0, 1), then the conditional distribution of the outdegree of the (n + 1)st incoming node (given the graph at time n) converges almost surely in the variation topology to the Poisson distribution with parameter λ := ⟨µ, f⟩.

Remark 1.2. The asymptotic degree distribution coincides with that in the random tree model introduced in Krapivsky and Redner (2001) and studied by Rudas et al. (2007), if f is chosen as an appropriate multiple of their weight function. This is strong evidence that these models show the same qualitative behaviour, and that our further results hold mutatis mutandis for preferential attachment models in which new vertices connect to a fixed number of old ones.

Example 1.3. Suppose f(k) ∼ γk^α, for 0 < α < 1 and γ > 0; then a straightforward analysis yields that log µ_k ∼ −k^{1−α}/(γ(1−α)). Hence the asymptotic degree distribution has stretched exponential tails.
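The stretched exponential asymptotics of Example 1.3 can be checked numerically by computing log µ_k directly as a sum of logarithms (avoiding floating-point underflow), using the product form of the weights from Theorem 1.1 as we read it; the ratio below should approach 1 as k grows. Function names and parameter values are ours.

```python
import math

def log_mu(f, k):
    """log mu_k for mu_k = (1/(1+f(k))) * prod_{l<k} f(l)/(1+f(l))."""
    s = -math.log1p(f(k))
    for l in range(k):
        s += math.log(f(l)) - math.log1p(f(l))   # log of f(l)/(1+f(l))
    return s

alpha, gamma = 0.5, 1.0
f = lambda k: gamma * (k + 1) ** alpha
k = 10_000
predicted = -(k ** (1 - alpha)) / (gamma * (1 - alpha))  # log mu_k asymptotics
ratio = log_mu(f, k) / predicted
```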
In order to analyse the network further, we rescale time as well as the way of counting the indegree. To the original time n ∈ N we associate the artificial time Ψ(n) := ∑_{l=1}^{n−1} 1/l, and to the original degree j ∈ Z₊ we associate the artificial degree Φ(j) := ∑_{l=0}^{j−1} 1/f(l).
An easy law of large numbers illustrates the role of these scalings.
Proposition 1.4 (Law of large numbers). For any fixed vertex labeled m ∈ N, we have that, almost surely, lim_{n→∞} Φ(Z[m, n])/Ψ(n) = 1.

Remark 1.5. Since Ψ(n) ∼ log n, we conclude that for any m ∈ N, almost surely, Φ(Z[m, n]) ∼ log n. In particular, we get for an attachment rule f with f(n) ∼ γn and γ ∈ (0, 1] that Φ(n) ∼ (1/γ) log n, which implies that log Z[m, n] ∼ γ log n, almost surely.
Furthermore, an attachment rule with f(n) ∼ γn^α for α < 1 and γ > 0 leads to Z[m, n] ∼ ((1−α)γ log n)^{1/(1−α)}, almost surely. We denote by T := {Ψ(n) : n ∈ N} the set of artificial times, and by S := {Φ(j) : j ∈ Z₊} the set of artificial degrees.
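With the artificial degree Φ(j) = ∑_{l<j} 1/f(l) (our reading of the scaling introduced above), the claim Φ(n) ∼ (1/γ) log n for an asymptotically linear rule can be verified numerically. The rule f(n) = γn + 1 below, which satisfies f(n) ∼ γn and f(0) > 0, is our example choice.

```python
import math

def Phi(f, j):
    """Artificial degree Phi(j) = sum_{l=0}^{j-1} 1/f(l)."""
    return sum(1.0 / f(l) for l in range(j))

gamma = 0.5
f = lambda n: gamma * n + 1      # f(n) ~ gamma * n, with f(0) = 1
n = 1_000_000
ratio = Phi(f, n) / (math.log(n) / gamma)   # should tend to 1 as n grows
```

The convergence is logarithmically slow, so even at n = 10^6 the ratio is only within a few percent of 1.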
Proposition 1.6 (Central limit theorem). In the case of weak preference, for all s ∈ T, is a standard Brownian motion and (ϕ*_t)_{t≥0} is the inverse of (ϕ_t)_{t≥0} given by

Our main result describes the behaviour of the vertex of maximal degree, and reveals an interesting dichotomy between weak and strong forms of preferential attachment.
Theorem 1.7 (Vertex of maximal degree). Suppose f is concave. Then we have the following dichotomy:
• The strong preference case: if ∑_n 1/f(n)² < ∞, then with probability one there exists a persistent hub, i.e. there is a single vertex which has maximal indegree for all but finitely many times.
• The weak preference case: if ∑_n 1/f(n)² = ∞, then with probability one there exists no persistent hub, and the time of birth, or index, of the current hub tends to infinity in probability.
Remark 1.8. Without the assumption of concavity of f, the assertion remains true in the weak preference regime. In the strong preference regime our results still imply that, almost surely, the number of vertices which at some time have maximal indegree is finite.

Remark 1.9. In the weak preference case the information about the order of the vertices is asymptotically lost: as a consequence of the proof of Theorem 1.7, we have for two nodes s < s′ in T a limiting statement reminiscent of propagation of chaos. Conversely, in the strong preference case, the information about the order is not lost completely.

Investigations so far were centred around typical vertices in the network. Large deviation principles, as provided below, are the main tool for analysing exceptional vertices in the random network. Throughout we use the large-deviation terminology of Dembo and Zeitouni (1998) and, from this point on, the focus is on the weak preference case.
Our aim is to determine the typical age and indegree evolution of the hub. For this purpose we assume that
• f is regularly varying with index 0 ≤ α < 1/2,
• for some η < 1, we have f(j) ≤ η(j + 1) for all j ∈ Z₊. (1)
We set f̃ := f ∘ Φ⁻¹, and recall from Lemma A.1 in the appendix that we can represent f̃ as f̃(u) = u^{α/(1−α)} l̃(u) for u > 0, where l̃ is a slowly varying function. We denote by I[0, ∞) the space of nondecreasing functions x : [0, ∞) → R with x(0) = 0, endowed with the topology of uniform convergence on compact subintervals of [0, ∞).
• with speed (κ^{1/(1−α)} l̃(κ)) and good rate function otherwise.
• and with speed (κ) and good rate function

Remark 1.11. The large deviation principle states, in particular, that the most likely deviation from the growth behaviour in the law of large numbers is having zero indegree for some (unusually long) time, with typical behaviour kicking in after that time.
More important for our purpose is a moderate deviation principle, which describes deviations on a finer scale. Similarly to before, we denote by L(0, ∞) the space of càdlàg functions x : (0, ∞) → R endowed with the topology of uniform convergence on compact subsets of (0, ∞), and always use the convention x_0 := lim inf_{t↓0} x_t.
Remark 1.13. If c = ∞ there is still a moderate deviation principle on the space of functions x : (0, ∞) → R with the topology of pointwise convergence. However, the rate function I, which has the same form as above with 1/∞ interpreted as zero, fails to be a good rate function.
Remark 1.14. Under assumption (1) the central limit theorem of Proposition 1.6 can be stated as a complement to the moderate deviation principle; see Section 2.1 for details.
Our final result describes weak limit laws for index and degree of the vertex of maximal degree. This result relies on the moderate deviation principle above.
Theorem 1.15 (Limit law for age and degree of the vertex of maximal degree). Suppose f is regularly varying with index α < 1/2. Define s*_t to be the index of the hub at time t, and Z^max_t = Z[s*_t, t] to be the corresponding maximal indegree. One has, in probability, Moreover, in probability on L(0, ∞),

Remark 1.16. In terms of the natural scaling, we get for the index m*_n of the hub and the maximal indegree Z^max_n at natural time n ∈ N that, in probability,

The remainder of this paper is devoted to the proofs of these results. Rather than proving the results in the order in which they are stated, we proceed by the techniques used. Section 2 is devoted to martingale techniques, which in particular prove the law of large numbers, Proposition 1.4, and the central limit theorem, Proposition 1.6. We also prove a property of the martingale limit which is crucial in the proof of Theorem 1.7. Section 3 uses Markov chain techniques and provides the proof of Theorem 1.1. In Section 4 we collect the large deviation techniques, proving Theorem 1.10 and Theorem 1.12. Section 5 combines the various techniques to prove our main result, Theorem 1.7, along with Theorem 1.15. An appendix collects the auxiliary statements from the theory of regular variation.

Martingale techniques
In this section we identify a martingale associated with the degree evolution of a vertex, and study its properties. This will be a vital tool in the further analysis of the network.

Martingale convergence
and otherwise it satisfies the following functional central limit theorem: denote by ϕ* : [0, ∞) → [0, ∞) the inverse of (ϕ_t); then the martingales converge in distribution to standard Brownian motion as κ tends to infinity. In any case the processes ((1/κ) Z[s, s + κt])_{t≥0} converge, as κ ↑ ∞, almost surely, in L[0, ∞) to the identity.

Proof. For t = Ψ(n) ∈ T we denote by ∆t the distance between t and its right neighbour in T, i.e.
One has Moreover, ∆t.
Observe that by Doob's martingale inequality and the uniform boundedness of f̃(·)⁻¹ one has where C = C(f(0)) is a constant depending only on f(0). Moreover, by Chebyshev's inequality, one has Letting i tend to infinity, we conclude that almost surely lim sup In particular, we obtain almost sure convergence of ((1/κ) Z[s, s + κt])_{t≥0} to the identity. As a consequence of (4), for any ε > 0, there exists a random, almost surely finite constant η = η(ω, ε) such that, for all t ≥ s, Thus condition (2) implies convergence of the martingale (M_t). We now assume that (ϕ_t)_{t≥0} converges to infinity. Since ε > 0 was arbitrary, the above estimate implies that To conclude the converse estimate, note that ∑_{t∈T} (∆t)² < ∞, so that we get with (4) and (5) that, for an appropriate finite random variable η, Therefore, The jumps of M^κ are uniformly bounded by a deterministic value that tends to zero as κ tends to ∞. By a functional central limit theorem for martingales (see, e.g., Theorem 3.11 in Jacod and Shiryaev (2003)), the central limit theorem follows once we establish that, for any t ≥ 0, which is an immediate consequence of (6).
Proof of Remark 1.14. We suppose that f is regularly varying with index α < 1/2. By the central limit theorem, the processes converge in distribution to the Wiener process (W_t) as κ tends to infinity. For each κ > 0 we consider the time change (τ^κ_t)_{t≥0} := (ϕ_{κt}/ϕ_κ). Using that ϕ is regularly varying with parameter (1−2α)/(1−α), we find uniform convergence on compacts: Therefore,

Absolute continuity of the law of M ∞
In the sequel, we consider, for fixed s ∈ T in the case of strong preference, the martingale (M_t)_{t≥s, t∈T} given by Φ(Z[s, t]) − (t − s). We denote by M_∞ the limit of the martingale.
Proposition 2.2. If f is concave, then the distribution of M ∞ is absolutely continuous with respect to Lebesgue measure.
Proof. For ease of notation, we denote Again we use the notation ∆t = 1 In the first step of the proof we derive an upper bound for With (7) we conclude that .
Due to the concavity of f , we get that

Consequently, h(t) ≤ ∏_{u∈[s,t)∩T} (1 − f̃(ς(u + c + 1)) ∆u), and using that log(1 + x) ≤ x we obtain We continue with estimating the sum Σ in the latter exponential: and that For Lebesgue-almost all arguments one has where c* is a positive constant not depending on t. Plugging this estimate into (7) we get
Fix now an interval I ⊂ R of finite length and note that .
Moreover, for any open and thus immediately also for any arbitrary interval I one has , can be written as monotone limit of the absolutely continuous measures µ c (c ∈ N), and it is thus also absolutely continuous.

The empirical indegree distribution
In this section we prove Theorem 1.1. For k ∈ Z₊ and n ∈ N let µ_k(n) = E[X_k(n)] and µ(n) = (µ_k(n))_{k∈Z₊}. We first show that (µ(n))_{n∈N} converges to µ = (µ_k)_{k∈Z₊} as n tends to infinity. We start by deriving a recursive representation for µ(n). For k ∈ Z₊, Thus the linearity and the tower property of conditional expectation give and, conceiving µ(n) as a row vector, we can rewrite the recursive equation as where I = (δ_{i,j})_{i,j∈N} denotes the unit matrix.

Next we show that µ is a probability distribution with µQ = 0. By induction, we get that for any k ∈ Z₊. Since ∑_{l=0}^∞ 1/f(l) ≥ ∑_{l=0}^∞ 1/(l + 1) = ∞, it follows that µ is a probability measure on Z₊. Moreover, it is straightforward to verify that hence µQ = 0.

Now we use the matrices P(n) := I + Q/(n + 1) to define an inhomogeneous Markov process. The entries of each row of P(n) sum to 1, but (as long as f is not bounded) each P(n) contains negative entries. Nonetheless one can use the P(n) as time-inhomogeneous Markov kernels as long as at the starting time m ∈ N the starting state l ∈ Z₊ satisfies l ≤ m − 1. We denote, for any admissible pair l, m, by (Y^{l,m}_n)_{n≥m} a Markov chain starting at time m in state l with transition kernels (P(n))_{n≥m}. Due to the recursive equation we now have

Next, fix k ∈ Z₊, let m > k be arbitrary, and denote by ν the restriction of µ to the set {m, m + 1, . . . }. Since µ is invariant under each P(n) we get Note that in the nth step of the Markov chain, the probability of jumping to state zero is 1/(n + 1) for all original states in {1, . . . , n − 1}, and greater than 1/(n + 1) for the original state 0. Thus one can couple the Markov chains (Y^{l,m}_n) and (Y^{0,1}_n) in such a way that and that once the processes meet at one site they stay together. Then As m → ∞ we thus get that lim_{n→∞} µ_k(n) = µ_k.
In the next step we show that the sequence of empirical indegree distributions (X(n))_{n∈N} converges almost surely to µ. Note that nX_k(n) is a sum of n independent Bernoulli random variables. Thus Chernoff's inequality (Chernoff (1981)) implies that for any t > 0 Borel–Cantelli then implies that almost surely lim inf_{n→∞} X_k(n) ≥ µ_k for all k ∈ Z₊. This establishes almost sure convergence of (X(n)) to µ.
We still need to show that the conditional law of the outdegree of a new node converges almost surely in the weak topology to a Poisson distribution. In the first step we will prove that, for η ∈ (0, 1) and the affine linear attachment rule f(k) = ηk + 1, one has almost sure convergence of Y_n. Given the past F_n of the network formation, each ∆Z[m, n] is independent Bernoulli distributed with success probability (ηZ[m, n] + 1)/n. Consequently, Now note that, due to Theorem 1.7 (which can be used here, as it will be proved independently of this section), there is a single node that has maximal indegree for all but finitely many times. Let m* denote the random node with this property. With Remark 1.5 we conclude that almost surely Since for sufficiently large n, equations (10) and (11) imply that Y· converges almost surely to a finite random variable.
where ∆M_{n+1} denotes a martingale difference. We shall denote by (M_n)_{n∈N} the corresponding martingale, that is M_n = ∑_{m=2}^n ∆M_m. Since Y· is convergent, the martingale (M_n) converges almost surely. Next, we represent (12) in terms of Ȳ_n = Y_n − y as the following inhomogeneous linear difference equation of first order: The corresponding starting value is Ȳ_1 = Y_1 − y = −y, and we can represent its solution as Note that h^m_n and ∆h^m_n tend to 0 as n tends to infinity, so that ∑_{k=m+1}^n ∆h^k_n = 1 − h^m_n tends to 1. With M_∞ := lim_{n→∞} M_n and ε_m = sup_{n≥m} |M_n − M_∞| we derive, for m ≤ n, Since lim_{m→∞} ε_m = 0 almost surely, we thus conclude with (13) that ∑_{m=2}^n ∆M_m h^m_n tends to 0. Consequently, lim_{n→∞} Y_n = y almost surely. Next, we show that also ⟨µ, id⟩ = y. Recall that µ is the unique invariant distribution satisfying µQ = 0 (see (9) for the definition of Q).

This implies that for any
One cannot split the sum into two sums since the individual sums are not summable. However, noticing that the individual term f(k)µ_k k ≈ k²µ_k tends to 0, we can rearrange the summands to obtain This implies that ⟨µ, id⟩ = y and that for any m ∈ N

Now we switch to general attachment rules. We denote by f an arbitrary attachment rule that is dominated by an affine attachment rule f_a. The corresponding degree evolutions will be denoted by (Z[m, n]) and (Z_a[m, n]), respectively. Moreover, we denote by µ and µ_a the limit distributions of the empirical indegree distributions. Since by assumption f ≤ f_a, one can couple both degree evolutions such that Z[m, n] ≤ Z_a[m, n] for all n ≥ m ≥ 0. Now Since m can be chosen arbitrarily large, we conclude that lim sup_{n→∞} ⟨X(n), f⟩ ≤ ⟨µ, f⟩. Since, conditional on F_n, ∑_{m=1}^n ∆Z[m, n] is a sum of independent Bernoulli variables with success probabilities tending uniformly to 0, we finally get that L(∑_{m=1}^n ∆Z[m, n] | F_n) converges in the weak topology to a Poisson distribution with parameter ⟨µ, f⟩.
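As a numerical sanity check on the invariance µQ = 0, one can build a truncated version of Q as we read it off from the recursion above (Q_{k,k} = −(1 + f(k)), Q_{k,k+1} = f(k), and an extra +1 in column 0 for every row, so that each row of P(n) = I + Q/(n+1) sums to 1) and verify that the product µQ is numerically negligible. The truncation level and the example rule are our choices.

```python
def f(k):
    return (k + 1) ** 0.5           # example sublinear attachment rule

K = 1500                             # truncation level (ours)

# invariant weights mu_k = (1/(1+f(k))) * prod_{l<k} f(l)/(1+f(l))
mu, prod = [], 1.0
for k in range(K):
    mu.append(prod / (1.0 + f(k)))
    prod *= f(k) / (1.0 + f(k))

# (mu Q)_j with Q_{k,k} = -(1+f(k)), Q_{k,k+1} = f(k), plus +1 in column 0
muQ = [0.0] * K
for k in range(K):
    muQ[k] += -(1.0 + f(k)) * mu[k]
    if k + 1 < K:
        muQ[k + 1] += f(k) * mu[k]
    muQ[0] += mu[k]                  # every row contributes +1 to column 0

residual = max(abs(x) for x in muQ)  # should be at floating-point level
```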

Large deviations
In this section we derive tools to analyse rare events in the random network. We provide large and moderate deviation principles for the temporal development of the indegree of a given vertex. This will allow us to describe the indegree evolution of the node with maximal indegree in the case of weak preferential attachment. The large and moderate deviation principles are based on an exponential approximation to the indegree evolution processes, which we first discuss.

Exponentially good approximation
In order to analyse the large deviations of the process Z[s, ·] (or Z[m, ·]) we use an approximating process. We first do this on the level of occupation measures. For s ∈ T and 0 ≤ u < v we define for all 0 ≤ u ≤ v.
The following lemma shows that T [u, v) is a good approximation to T s [u, v) in many cases.
where η 2 is a constant only depending on η 1 .
Proof. We fix t ∈ T with f̃(u)∆t ≤ η₁. Note that it suffices to find an appropriate coupling conditional on the event {τ = t}. Let U be a uniform random variable and let We compute log(1 − f̃(u)∆w) Next observe that, from a Taylor expansion, for a suitably large This proves the left inequality in (14). It remains to prove the right inequality. Note that and As a direct consequence of this lemma we obtain an exponential approximation.
Lemma 4.2. Suppose that, for some η < 1, we have f(j) ≤ η(j + 1) for all j ∈ Z₊. If ∑_{j=0}^∞ f(j)²/(j + 1)² < ∞, then for each s ∈ T one can couple T_s with T such that, for all λ ≥ 0, where K > 0 is a finite constant depending only on f.
Proof. Fix s ∈ T and denote by τ_u the first entry time of Z[s, ·] into the state u ∈ S. We couple the random variables T[u] and T_s[u] as in the previous lemma and let, for v ∈ S, Then (M_v)_{v∈S} is a martingale. Moreover, for each v = Φ(j) ∈ S one has τ_v ≥ Ψ(j + 1), so that ∆τ_v ≤ 1/(j + 1). Consequently, using the assumption of the lemma, one gets that Thus by Lemma 4.1 there exists a constant η′ < ∞ depending only on f(0) and η such that the increments of the martingale (M_v) are bounded by By assumption we have K := ∑_{v∈S} c²_v < ∞ and we conclude with Lemma A.4 that, for λ ≥ 0, We define (Z_t)_{t≥0} to be the S-valued process given by and start by observing its connection to the indegree evolution.

Proof. We only present the proof for the first large deviation principle of Theorem 1.10, since all other statements can be inferred analogously. We let U_δ(x) denote the open ball around x ∈ I[0, ∞) with radius δ > 0 in an arbitrarily fixed metric d generating the topology of uniform convergence on compacts, and, for fixed η > 0, we cover the compact set K = {x ∈ I[0, ∞) : I(x) ≤ η} with finitely many balls (U_δ(x))_{x∈I}, where I ⊂ K. Since every x ∈ I is continuous, we can find ε > 0 such that for every x ∈ I and increasing and right continuous τ For fixed s ∈ T we couple the occupation times (T_s[0, u))_{u∈S} and (T[0, u))_{u∈S} as in Lemma 4.2, and hence implicitly the evolutions (Z[s, t])_{t≥s} and (Z_t)_{t≥0}. Next, note that Z[s, s + ·] can be transformed into Z· by applying a time change τ with |τ(t) − t| and an application of Lemma 4.2 gives a uniform upper bound in s, namely lim sup Since η and δ > 0 were arbitrary, this proves the first statement.

The large deviation principles
By the exponential equivalence, Proposition 4.4, and (Dembo and Zeitouni, 1998, Theorem 4.2.13) it suffices to prove the large and moderate deviation principles in the framework of the exponentially equivalent processes (15) constructed in the previous section.
The first step in the proof of the first part of Theorem 1.10 is to show a large deviation principle for the occupation times of the underlying process. Throughout this section we denote a_κ := κ^{1/(1−α)} l̃(κ).
Note that inf Ī_κ/κ and sup Ī_κ/κ approach the values u and v, respectively. Hence we conclude with the dominated convergence theorem that one has as κ tends to infinity. Now the Gärtner–Ellis theorem implies the large deviation principle for the family (T[κu, κv))_{κ>0} for 0 < u < v. It remains to prove the large deviation principle for u = 0. Note that

Consequently, T[0, κε)/κ converges in probability to ε. Thus for t < v and for sufficiently small ε > 0, lim inf while the upper bound is obvious.
The next lemma is needed for the analysis of the rate function in Lemma 4.5. It involves the function ψ defined as ψ(t) = 1 − t + t log t for t ≥ 0.
Lemma 4.6. For fixed 0 < x₀ < x₁ there exists an increasing function η : We now extend the definition of Λ* continuously by setting, for any u ≥ 0 and t ≥ 0, For the proof of Lemma 4.6 we use the following fact, which can be verified easily.
As the next step in the proof of Theorem 1.10 we formulate a finite-dimensional large deviation principle, which can be derived from Lemma 4.5.
Lemma 4.8. Fix 0 = t₀ < t₁ < · · · < t_p. Then the vector ((1/κ) Z_{κt_j} : j ∈ {1, . . . , p}) satisfies a large deviation principle in {0 ≤ a₁ ≤ · · · ≤ a_p} ⊂ R^p with speed a_κ and rate function

Proof. First fix 0 = a₀ < a₁ < · · · < a_p. Observe that, whenever s_{j−1} < s_j with s₀ = 0, By Lemma 4.5, given ε > 0 and A > 0, we find δ > 0 such that, for κ large, Hence, for sufficiently small δ we get with the above estimates that lim inf_{κ→∞} (1/a_κ) log P(a_j + ε > (1/κ) Z_{κt_j} ≥ a_j for j ∈ {1, . . . , p}) Next, we prove the upper bound. Fix 0 = a₀ ≤ . . . ≤ a_p and 0 = b₀ ≤ . . . ≤ b_p with a_j < b_j, and observe that by the strong Markov property of (Z_t), Consequently, Using the continuity of (u, v) → Λ*_{u,v}(t) for fixed t, it is easy to verify continuity of each r_j in the parameters a_{j−1}, a_j, b_{j−1}, and b_j. Suppose now that (a_j) and (b_j) are taken from a predefined compact subset of R^p. Then we have for an appropriate function ϑ with lim_{δ↓0} ϑ(δ) = 0. Now the upper bound follows with an obvious exponential tightness argument.
We can now prove a large deviation principle in a weaker topology, by taking a projective limit and simplifying the resulting rate function with the help of Lemma 4.6.

Proof. Observe that the space of increasing functions equipped with the topology of pointwise convergence can be interpreted as the projective limit of the spaces {0 ≤ a₁ ≤ · · · ≤ a_p} with the canonical projections given by π(x) = (x(t₁), . . . , x(t_p)) for 0 < t₁ < . . . < t_p. By the Dawson–Gärtner theorem, we obtain a large deviation principle with good rate function J̃ Note that the value of the variational expression is nondecreasing if additional points are added to the partition. It is not hard to see that J̃(x) = ∞ if x fails to be absolutely continuous. Indeed, there exist δ > 0 and, for every n ∈ N, a partition δ ≤ s^n_1 < t^n_1 ≤ · · · ≤ s^n_n < t^n_n ≤ 1/δ such that ∑_{j=1}^n (t^n_j − s^n_j) → 0 but ∑_{j=1}^n (x(t^n_j) − x(s^n_j)) ≥ δ. Then, for any λ > 0, which can be made arbitrarily large by choice of λ.
From now on suppose that x is absolutely continuous. The remaining proof is based on the equation (17). Before we prove its validity we apply (17) to derive the assertions of the lemma. For the lower bound we choose a scheme 0 < t^n_1 < · · · < t^n_p, with p depending on n, such that t^n_p → ∞ and the mesh goes to zero. Define, for t^n_{j−1} ≤ t < t^n_j, Note that, by Lebesgue's theorem, x^n_j(t) → ẋ_t almost everywhere. Hence For the upper bound we use the convexity of ψ to obtain as required to complete the proof.

It remains to prove (17). We fix t′ and t″ with t′ < t″ and x(t′) > 0, and partitions t′ = t^n_0 < · · · < t^n_n = t″ with δ_n := sup_j (x(t^n_j) − x(t^n_{j−1})) converging to 0. Assume n is sufficiently large such that η_{δ_n} ≤ ½ (t′)^{α/(1−α)}, with η as in Lemma 4.6. Then, and (*) is uniformly bounded as long as J̃(x) is finite. On the other hand, finiteness of the right-hand side of (17) also implies uniform boundedness of (*). Hence either both expressions in (17) are infinite, or we conclude with Lemma 4.6 that, for an appropriate choice of t^n_j, This expression easily extends to formula (17).
Lemma 4.10. The level sets of J are compact in I[0, ∞).
If we additionally assume ε ≤ 2eδ^{1/2}, then we get (J(x)/ log δ^{−1/2})^{1−α} ≤ ε. Therefore, in general Hence the level sets are uniformly equicontinuous. As x₀ = 0 for all x ∈ I[0, ∞), this implies that the level sets are uniformly bounded on compact sets, which finishes the proof.
We now improve our large deviation principle to the topology of uniform convergence on compact sets, which is stronger than the topology of pointwise convergence. To this end we introduce, for every m ∈ N, a mapping f_m acting on functions x : [0, ∞) → R by

Lemma 4.11. For every δ > 0 and T > 0, we have

Proof. Note that P sup By Lemma 4.9 we have lim sup and, by Lemma 4.10, the right-hand side diverges to infinity, uniformly in j, as m ↑ ∞.
Proof of the first large deviation principle in Theorem 1.10. We apply (Dembo and Zeitouni, 1998, Theorem 4.2.23), which allows us to transfer the large deviation principle from the topological Hausdorff space of increasing functions with the topology of pointwise convergence to the metrizable space I[0, ∞) by means of the sequence f_m of continuous mappings approximating the identity. Two conditions need to be checked. On the one hand, using the equicontinuity of the sets {I(x) ≤ η} established in Lemma 4.10, we easily obtain lim sup for every η > 0, where d denotes a suitable metric on I[0, ∞). On the other hand, by Lemma 4.11, the processes (f_m((1/κ) Z_{κ·})) form a family of exponentially good approximations of ((1/κ) Z_{κ·}).
The proof of the second large deviation principle can be done from first principles.
Proof of the second large deviation principle in Theorem 1.10. For the lower bound observe that, for any $T > 0$ and $\varepsilon > 0$, the probability in question can be bounded below by a product of two probabilities; recall that the first probability on the right-hand side is $\exp\{-\kappa a f(0)\}$ and the second converges to one, by the law of large numbers. For the upper bound note first that, by the first large deviation principle, for any $\varepsilon > 0$, every closed set $A \subset \{J(x) > \varepsilon\}$ is negligible on the exponential scale considered here. Note further that, for any $\delta > 0$ and $T > 0$, there exists $\varepsilon > 0$ such that $J(x) \le \varepsilon$ implies $\sup_{0 \le t \le T} |x_t - y_t| < \delta$, where $y_t = (t-a)^+$ for some $a \in [0,T]$. Then, for $\theta < f(0)$, the remaining probability can be estimated, and the result follows because the sum on the right is bounded by a constant multiple of $\kappa\delta$.

The moderate deviation principle
Recall from the beginning of Section 4.2 that it is sufficient to prove Theorem 1.12 for the approximating process $Z$ defined in (15). We initially include the case $c = \infty$ in our considerations, and abbreviate $b_\kappa := a_\kappa \kappa^{2\alpha-1}$, so that we are looking for a moderate deviation principle with speed $a_\kappa b_\kappa$.
Lemma 4.12. Let $0 \le u < v$, suppose that $f$ and $a_\kappa$ are as in Theorem 1.12, and define the rescaled waiting times accordingly. Then the family satisfies a large deviation principle with speed $(a_\kappa b_\kappa)$ and the associated rate function.

Proof. Denote by $\Lambda_\kappa$ the logarithmic moment generating function of the rescaled variable. Now focus on the case $u > 0$. A Taylor approximation gives $\xi(w) = w + \frac{1}{2}(1+o(1))w^2$ as $w \downarrow 0$. By dominated convergence, together with (21), we arrive at the limiting cumulant generating function. Now the Gärtner-Ellis theorem implies that the family $((T[\kappa u, \kappa v) - \kappa(v-u))/a_\kappa)$ satisfies a large deviation principle with speed $(a_\kappa b_\kappa)$, having as rate function the Fenchel-Legendre transform $I_{[u,v)}$ of this limit.

Next, we look at the case $u = 0$. If $\theta \ge \frac{1}{c} f(0)$ then $\Lambda_\kappa(\theta) = \infty$ for all $\kappa > 0$, so assume the contrary. The same Taylor expansion as above now describes $\xi(w)$ as $w \uparrow \infty$. In particular, the integrand in (20) is regularly varying with index $-\frac{\alpha}{1-\alpha} > -1$, and we get from Karamata's theorem, see e.g. (Bingham et al., 1987, Theorem 1.5.11), the asymptotics of the integral, and consequently of $\Lambda_\kappa$. The Legendre transform of the right-hand side is the rate function $I_{[0,v)}$. Since $I_{[0,v)}$ is not strictly convex, the Gärtner-Ellis theorem does not yield the full large deviation principle; it remains to prove the lower bound for open sets $(t, \infty)$ meeting the linear part of $I_{[0,v)}$. Fix $\varepsilon \in (0, v)$ and note that, for sufficiently large $\kappa$, the event can be produced by an atypically large initial waiting time followed by typical behaviour, so that the large deviation principle for $((T[\kappa\varepsilon, \kappa v) - \kappa(v-\varepsilon))/a_\kappa)$ and the exponential distribution of the initial waiting time yield the required lower bound. The right-hand side of this bound converges to $-I_{[0,v)}(t)$ when letting $\varepsilon$ tend to zero. This establishes the full large deviation principle for $((T[0, \kappa v) - \kappa v)/a_\kappa)$.
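The form of Karamata's theorem used in the case $u = 0$ is the direct half of (Bingham et al., 1987, Theorem 1.5.11): for a slowly varying function $\ell$ and an exponent $\sigma > -1$,

```latex
\int_0^x t^{\sigma}\,\ell(t)\,dt \;\sim\; \frac{x^{\sigma+1}\,\ell(x)}{\sigma+1},
\qquad x \to \infty.
% Here it is applied with \sigma = -\frac{\alpha}{1-\alpha} > -1, so that the
% integral in (20) inherits the regular variation of its integrand.
```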
We continue the proof of Theorem 1.12 with a finite-dimensional moderate deviation principle, which can be derived from Lemma 4.12.
Lemma 4.13. Fix $0 = t_0 < t_1 < \cdots < t_p$. Then the vector $\big(\frac{1}{a_\kappa}(Z_{\kappa t_j} - \kappa t_j) : j \in \{1,\dots,p\}\big)$ satisfies a large deviation principle in $\mathbb{R}^p$ with speed $a_\kappa b_\kappa$ and the corresponding rate function.

Proof. We note that, for $-\infty \le a^{(j)} < b^{(j)} \le \infty$ (interpreting conditions on the right as void if they involve infinity),
$$P\big(a^{(j)} a_\kappa \le Z_{\kappa t_j} - \kappa t_j < b^{(j)} a_\kappa \text{ for all } j\big) = P\big(T[0, \kappa t_j + a_\kappa a^{(j)}) \le \kappa t_j,\; T[0, \kappa t_j + a_\kappa b^{(j)}) > \kappa t_j \text{ for all } j\big).$$
To continue from here we need to show that the random variables $T[0, \kappa t + a_\kappa b)$ and $T[0, \kappa t) + a_\kappa b$ are exponentially equivalent, in the sense of (23). Indeed, first let $b > 0$. As in Lemma 4.12, we obtain, for any $t \ge 0$ and $\theta \in \mathbb{R}$, an exponential moment bound, and Chebyshev's inequality gives the required estimate for any $A > 0$. A similar estimate applies to the complementary event, and the argument also extends to the case $b < 0$. From this (23) readily follows. Using Lemma 4.12 and independence, we obtain a large deviation principle for the vector of independent increments with the corresponding sum rate function. Using the contraction principle, we infer from this a large deviation principle for the vector of partial sums. Combining this with (23), and observing the signs, we obtain the required large deviation principle.
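Exponential equivalence is meant here in the usual sense of (Dembo and Zeitouni, 1998, Definition 4.2.10), at the speed $a_\kappa b_\kappa$ of our moderate deviation principle; schematically:

```latex
% The families (T[0, \kappa t + a_\kappa b)) and (T[0, \kappa t) + a_\kappa b)
% are exponentially equivalent if, for every \delta > 0,
\limsup_{\kappa \to \infty} \frac{1}{a_\kappa b_\kappa}
\log P\Big( \big| T[0, \kappa t + a_\kappa b) - T[0, \kappa t) - a_\kappa b \big|
            > \delta\, a_\kappa \Big) \;=\; -\infty.
% Exponentially equivalent families satisfy the same large deviation principle
% (Dembo and Zeitouni, 1998, Theorem 4.2.13).
```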
We may now take a projective limit and arrive at a large deviation principle in the space P(0, ∞) of functions x : (0, ∞) → R equipped with the topology of pointwise convergence.
Lemma 4.14. The family of functions $\big(\frac{1}{a_\kappa}(Z_{\kappa t} - \kappa t) : t > 0\big)_{\kappa > 0}$ satisfies a large deviation principle in the space $P(0,\infty)$, with speed $a_\kappa b_\kappa$ and a rate function identified in the proof.

Proof. Observe that the space of functions equipped with the topology of pointwise convergence can be interpreted as the projective limit of the spaces $\mathbb{R}^p$, with the canonical projections given by $\pi(x) = (x(t_1), \dots, x(t_p))$ for $0 < t_1 < \cdots < t_p$. By the Dawson-Gärtner theorem, we obtain a large deviation principle with a rate function $\tilde I$ given as a supremum over finite partitions. Note that the value of the variational expression is nondecreasing if additional points are added to the partition. We first fix $t_1 > 0$ and optimize the first summand independently.
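The projective limit step rests on the Dawson-Gärtner theorem, which in our setting reads schematically as follows:

```latex
% If, for every finite collection 0 < t_1 < \cdots < t_p, the projections
% \pi(Z^{(\kappa)}) = (Z^{(\kappa)}_{t_1}, \dots, Z^{(\kappa)}_{t_p}) satisfy a
% large deviation principle in R^p with rate function I_{t_1,\dots,t_p}, then
% (Z^{(\kappa)}) satisfies a large deviation principle in the projective limit
% P(0,\infty) with the good rate function
\tilde I(x) \;=\; \sup_{p \in \mathbb{N}} \;\sup_{0 < t_1 < \cdots < t_p}
I_{t_1,\dots,t_p}\big(x(t_1), \dots, x(t_p)\big).
```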

Hence we obtain an upper and, in brackets, a lower bound for the variational expression.
It is easy to see, using arguments analogous to those given in the last step of the proof of the first large deviation principle, that this is $+\infty$ if $x$ fails to be absolutely continuous, and that otherwise it admits an integral representation. In the latter case, if $x_0 > 0$, the last summand diverges to infinity; if $x_0 = 0$ and the limit of the integral is finite, an application of the Cauchy-Schwarz inequality gives the required identification and completes the proof.
Lemma 4.15. If c < ∞, the function I is a good rate function on L(0, ∞).
Proof. Recall that, by the Arzelà-Ascoli theorem, it suffices to show that for any $\eta > 0$ the level set $\{x : I(x) \le \eta\}$ is bounded and equicontinuous on every compact subset of $(0,\infty)$. Suppose that $I(x) \le \eta$ and $0 < s < t$. Then, using Cauchy-Schwarz in the second step, we obtain a modulus of continuity of order $\sqrt{t-s}$, which proves equicontinuity. The boundedness follows from this, together with the observation that $x_0$ is bounded in terms of $c$, $\eta$ and $f(0)$.
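The Cauchy-Schwarz step is the standard one for rate functions of integral form; schematically, if $I$ dominates a Dirichlet-type energy (a sketch under this assumption, with a constant $C$ depending only on the compact set):

```latex
|x_t - x_s| \;=\; \Big|\int_s^t \dot x_r\,dr\Big|
\;\le\; \sqrt{t-s}\,\Big(\int_s^t \dot x_r^2\,dr\Big)^{1/2}
\;\le\; C\,\sqrt{(t-s)\,\eta},
% uniformly over the level set \{x : I(x) \le \eta\}, which is precisely
% the claimed equicontinuity.
```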
To move our moderate deviation principle to the topology of uniform convergence on compact sets, recall the definition of the mappings $f_m$ from (19), and abbreviate $Z^{(\kappa)}_t := \frac{1}{a_\kappa}(Z_{\kappa t} - \kappa t)$, $t > 0$.

Lemma 4.16. $(f_m(Z^{(\kappa)}))_{m \in \mathbb{N}}$ are exponentially good approximations of $(Z^{(\kappa)})$ on $L(0,\infty)$.
Proof. We need to verify that, denoting by $\|\cdot\|$ the supremum norm on any compact subset of $(0,\infty)$, for every $\delta > 0$ the probability $P(\|f_m(Z^{(\kappa)}) - Z^{(\kappa)}\| > \delta)$ is superexponentially small on the scale $a_\kappa b_\kappa$, as $m \uparrow \infty$. The crucial step is to establish, for sufficiently large $\kappa$ and all $j \ge 2$, a suitable oscillation bound on each partition interval. Hence we get a bound on the $\limsup$, and the right-hand side can be made arbitrarily small by making $m = \frac{1}{t_j - t_{j-1}}$ large.
Proof of Theorem 1.12. We apply (Dembo and Zeitouni, 1998, Theorem 4.2.23) to transfer the large deviation principle from the topological Hausdorff space $P(0,\infty)$ to the metrizable space $L(0,\infty)$, using the sequence $f_m$ of continuous functions. Two conditions need to be checked: on the one hand, that $(f_m(Z^{(\kappa)}))_{m \in \mathbb{N}}$ are exponentially good approximations of $(Z^{(\kappa)})$, as verified in Lemma 4.16; on the other hand, that $\limsup_{m\to\infty} \sup_{x : I(x) \le \eta} d(f_m(x), x) = 0$ for every $\eta > 0$, where $d$ denotes a suitable metric on $L(0,\infty)$. The latter follows easily from the equicontinuity of the level set $\{x : I(x) \le \eta\}$ established in Lemma 4.15. Hence the proof is complete.

The vertex with maximal indegree
In this section we prove Theorem 1.7 and Theorem 1.15.
5.1 Strong and weak preference: Proof of Theorem 1.7

The key to the proof is Proposition 5.1, which shows that, in the strong preference case, the degree of a fixed vertex can only be surpassed by a finite number of future vertices. The actual formulation of the result also contains a useful technical statement for the weak preference case. Recall that $\varphi_t = \int_0^t \frac{1}{f(v)}\,dv$, and let $t(s) = \sup\{t \in S : 4\varphi_t \le s\}$, for $s \ge 0$.
Moreover, we let $\varphi_\infty = \lim_{t\to\infty} \varphi_t$, which is finite exactly in the strong preference case. In this case $t(s) = \infty$ eventually.
Proposition 5.1. For any fixed $\eta > 0$, almost surely only finitely many of the events $(A_s)_{s \in T}$ occur.
For the proof we identify a family of martingales and then apply the concentration inequality for martingales, Lemma A.3. For $s \in T$, let $(T^s_u)_{u \in S}$ be given by the suitably compensated entry times of $Z[s,\cdot]$. The following lemma is easy to verify.
Lemma 5.2. Let $(t_i)_{i \in \mathbb{Z}_+}$ be a strictly increasing sequence of nonnegative numbers with $t_0 = 0$ and $\lim_{i\to\infty} t_i = \infty$. Moreover, assume that $\lambda > 0$ is fixed such that $\lambda\,\Delta t_i := \lambda(t_i - t_{i-1}) \le 1$ for all $i \in \mathbb{N}$, and consider a discrete random variable $X$ with the corresponding discretized exponential distribution. Then the mean and variance of $X$ admit explicit bounds.

With this at hand, we can identify the martingale property of $(T^s_u)_{u \in S}$.
Lemma 5.3. For any $s \in S$, the process $(T^s_u)_{u \in S}$ is a martingale with respect to the natural filtration $(\mathcal{G}_u)$. Moreover, for two neighbours $u < u^+$ in $S$, the conditional variance of the increment is bounded explicitly.

Proof. Fix two neighbours $u < u^+$ in $S$ and observe that, given $\mathcal{G}_u$ (or given the entry time $T^s[0,u) + s$ into state $u$), the distribution of $T^s[u]$ is as in Lemma 5.2 with $\lambda = \bar f(u)$. Thus the lemma implies the martingale property, and the variance estimate of Lemma 5.2 yields the second assertion.
Proof of Proposition 5.1. We fix $\eta \ge 1/f(0)$ and $u_0 \in S$ with $\bar f(u_0) \ge 2$. We consider $P(A_s)$ for sufficiently large $s \in T$; more precisely, $s$ needs to be large enough that $t(s) \ge u_0$ and $s - \eta - u_0 \ge s/2$. We denote by $\sigma$ the first time $t \in T$ for which $Z[s,t] \ge t - \eta$, if such a time exists, and set $\sigma = \infty$ otherwise. We now look at realizations for which $\sigma \in [s, t(s))$ or, equivalently, $A_s$ occurs. We set $\nu = Z[s,\sigma]$. Since the jumps of $Z[s,\cdot]$ are bounded by $1/f(0)$, we conclude that $\nu$ is within $1/f(0)$ of $\sigma - \eta$. Conversely, $T^s[0,\nu) + s$ is the entry time into state $\nu$ and is thus equal to $\sigma$. By Lemma 5.3 the process $(T^s_u)_{u \in S}$ is a martingale and, for consecutive elements $u < u^+$ of $S$ that are larger than $u_0$, the conditional variances of the increments are controlled. Now we apply the concentration inequality, Lemma A.3, and obtain, writing $\lambda_s = s - \eta - u_0 - 2\varphi_{t(s)} \ge 0$, an exponential bound on $P(A_s)$. Since $\varphi_{t(s)} \le s/4$, we obtain $\limsup_{s\to\infty} \frac{1}{s} \log P(A_s) \le -\frac{6}{5}$. Denoting by $\iota(t) = \max\, [0,t] \cap T$, we finally get $\sum_{s \in T} P(A_s) \le \int_0^\infty e^s P(A_{\iota(s)})\,ds < \infty$, so that, by Borel-Cantelli, almost surely only finitely many of the events $(A_s)_{s \in T}$ occur.
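Lemma A.3 itself is not restated here; a concentration inequality of the required type for martingales with bounded jumps is Freedman's inequality, stated schematically below (the constants may differ from those of Lemma A.3):

```latex
% If (M_u) is a martingale with M_0 = 0, jumps bounded by b, and predictable
% quadratic variation at most \sigma^2, then for every \lambda > 0,
P\Big( \sup_u M_u \ge \lambda \Big)
\;\le\; \exp\Big( -\frac{\lambda^2}{2\,(\sigma^2 + b\lambda/3)} \Big).
% In the proof above, \sigma^2 is controlled by the variance estimates of
% Lemma 5.3, summed along S, which is where 2\varphi_{t(s)} enters.
```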
Proof of Theorem 1.7. We first consider the weak preference case and fix $s \in T$. Recall that $(Z[s,t] - (t-s))_{t \ge s}$ and $(Z[0,t] - t)_{t \ge 0}$ are independent and satisfy functional central limit theorems (see Theorem 1.6). Thus $(Z[s,t] - Z[0,t])_{t \ge s}$ also satisfies a functional central limit theorem, i.e. an appropriately scaled version converges weakly to the Wiener process. Since the Wiener process almost surely changes its sign at arbitrarily large times, we conclude that $Z[s,t]$ will be larger, respectively smaller, than $Z[0,t]$ for infinitely many time instances. Therefore, $s$ is not a persistent hub, almost surely. This proves the first assertion.
In the strong preference case recall that $\varphi_\infty < \infty$. For fixed $\eta > 0$, almost surely only finitely many of the events $(A_s)_{s \in T}$ occur, by Proposition 5.1. Recalling that $Z[0,t] - t$ has a finite limit, we thus get that almost surely only finitely many degree evolutions overtake that of the first node. It remains to show that the limit points of $(Z[s,t] - t)$, for varying $s \in T$, are almost surely distinct; but this is an immediate consequence of Proposition 2.2.
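The dichotomy of Theorem 1.7 can be observed in simulation. The following sketch is our own illustrative code, not part of the paper: it grows a network in which each new vertex sends one edge to an old vertex chosen with probability proportional to $f(\text{indegree}) = (1+\text{indegree})^\alpha$, and counts how often the identity of the maximal-indegree vertex changes along the way.

```python
import random

def simulate_hub_changes(alpha, n, seed=1):
    """Grow a preferential attachment network with attachment rule
    f(k) = (1 + k)**alpha and return (indegrees, number of hub changes).

    Vertex i enters at step i and attaches one edge to an existing vertex
    chosen with probability proportional to f of its current indegree."""
    rng = random.Random(seed)
    indeg = [0]          # indegree of each vertex; vertex 0 is the seed
    hub, changes = 0, 0  # current maximal-indegree vertex (ties: smallest index)
    for _ in range(1, n):
        weights = [(1 + d) ** alpha for d in indeg]
        target = rng.choices(range(len(indeg)), weights=weights)[0]
        indeg[target] += 1
        indeg.append(0)  # the newcomer starts with indegree 0
        new_hub = max(range(len(indeg)), key=lambda i: (indeg[i], -i))
        if new_hub != hub:
            changes += 1
            hub = new_hub
    return indeg, changes

if __name__ == "__main__":
    for alpha in (0.3, 0.9):
        indeg, changes = simulate_hub_changes(alpha, 3000)
        print(f"alpha={alpha}: max indegree {max(indeg)}, hub changes {changes}")
```

In runs of this kind one typically sees far fewer hub changes for $\alpha$ close to $1$ than for small $\alpha$, in line with the persistent-hub dichotomy; the simulation is only illustrative, since persistence is an asymptotic, almost-sure statement that no finite run can verify.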

5.2 The typical evolution of the hub: Proof of Theorem 1.15

From now on we assume that the attachment rule $f$ is regularly varying with index $\alpha < \frac{1}{2}$, and we fix a representation of $f$ and $\bar f$ in terms of slowly varying functions; this representation determines the scale $(a_\kappa)$. For this choice of $(a_\kappa)$ the moderate deviation principle, Theorem 1.12, leads to the speed $(a_\kappa)$; in other words, the magnitude of the deviation and the speed coincide. The proof of Theorem 1.15 is based on the following lemma and proceeds in two parts.
2nd Part: We now prove that (an appropriately scaled version of) the evolution of a hub typically lies in an open neighbourhood around z.
Let $U$ denote an open set in $L(0,\infty)$ containing $z$, and denote by $U^c$ its complement in $L(0,\infty)$. Furthermore, we set $A_\varepsilon = \{x \in L(0,\infty) : \max_{t \in [1/2, 1]} x_t \ge 2(u_{\max} - \varepsilon)\}$ for $\varepsilon \ge 0$. We start by showing that $z$ is the unique minimizer of $I$ on the set $A_0$. Indeed, applying the inverse Hölder inequality gives, for $x \in A_0$ with finite rate $I(x)$, a chain of three inequalities bounding $I(x)$ from below by $I(z)$; moreover, at least one of the three inequalities is strict when $x \neq z$. Recall that, by Lemma 4.15, $I$ has compact level sets. We first assume that some element of $U^c \cap A_0$ has finite rate. Since $U^c \cap A_0$ is closed, we conclude that $I$ attains its infimum on $U^c \cap A_0$. Therefore, $I(U^c \cap A_0) := \inf\{I(x) : x \in U^c \cap A_0\} > I(z) = u_{\max}$.
Moreover, using again the compactness of the level sets, we get $\lim_{\varepsilon \downarrow 0} I(U^c \cap A_\varepsilon) = I(U^c \cap A_0)$.
Therefore, there exists $\varepsilon > 0$ such that $I(U^c \cap A_\varepsilon) > I(z)$. Certainly, this is also true if $U^c$ contains no element of finite rate. From the moderate deviation principle, Theorem 1.12, together with the uniformity in $s$, see Proposition 4.4, we infer that $\limsup_{\kappa\to\infty} \frac{1}{a_\kappa} \max_{s \in T} \log P\big(Z^{(s,\kappa)} \in U^c \cap A_\varepsilon\big) \le -I(U^c \cap A_\varepsilon) < -I(z)$.