A moment-generating formula for Erdős-Rényi component sizes

We derive a simple formula characterizing the distribution of the size of the connected component of a fixed vertex in the Erdős-Rényi random graph. This formula allows us to give elementary proofs of some results of Federico, van der Hofstad, den Hollander and Hulshof, and of Janson and Luczak, about the susceptibility of the subcritical graph, as well as of the central limit theorem of Pittel for the size of the giant component of the supercritical graph.


Introduction
The Erdős-Rényi graph G_{n,p}, introduced in [8], is the random graph on n vertices in which each pair of vertices is connected with probability p, independently of each other. For an introduction to this fundamental mathematical model of large networks, see [6, 11, 14].
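Since the model is fully specified by n, p, and pairwise-independent edges, it is straightforward to simulate. The following sketch (plain Python; the helper names `sample_gnp` and `component_size` are ours, not from the paper) samples G_{n,p} and computes the size of the connected component of vertex 1 by breadth-first search.

```python
import random
from collections import deque

def sample_gnp(n, p, rng):
    """Sample the adjacency lists of G_{n,p} on the vertex set {1, ..., n}:
    each of the n(n-1)/2 vertex pairs is connected independently with prob. p."""
    adj = {v: [] for v in range(1, n + 1)}
    for u in range(1, n + 1):
        for v in range(u + 1, n + 1):
            if rng.random() < p:
                adj[u].append(v)
                adj[v].append(u)
    return adj

def component_size(adj, root=1):
    """|C|: the number of vertices in the connected component of `root` (BFS)."""
    seen = {root}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return len(seen)

adj = sample_gnp(100, 0.02, random.Random(0))
print(component_size(adj))
```

With p = 1 the component of vertex 1 is all of [n]; with p = 0 it is {1}.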
We denote by P_{n,p} the law of G_{n,p} and by E_{n,p} the corresponding expectation. We assume that the vertex set of G_{n,p} is [n] = {1, . . . , n} and we denote by C the connected component in G_{n,p} of the vertex indexed by 1. We denote by |C| the number of vertices of C.
For any n ∈ N, p ∈ [0, 1], j ∈ Z ∩ (−n, +∞), and k ∈ [n] we define the quantity g_{n,p}(j, k). Note that if j ≤ 0 then the r.h.s. of (1.2) is simply (n + j)/n. We prove Proposition 1.1 in Section 2.

Remark 1.2. (i) Define the n × n matrix M by M_{j,k} = g_{n,p}(j, k) for j ∈ Z ∩ (−n, 0] and k ∈ [n]. The matrix M is triangular with non-zero diagonal entries, hence it is invertible. Therefore, Proposition 1.1 uniquely characterizes the distribution of |C| under P_{n,p}.

(ii) A generalization of Proposition 1.1 appears in Proposition 1.6 of the recent preprint [12]; see also [12, Remark 1.7]. The random graph process studied in [12] can be informally defined as follows: starting from the empty graph on the vertex set [n], cliques are added at a rate that depends only on their size (the dynamical Erdős-Rényi graph is the special case in which only cliques of size two are added).

Proposition 1.1 allows us to give short and self-contained proofs of some delicate results about the sizes of connected components of the Erdős-Rényi graph in the subcritical (see Theorem 1.4) as well as the supercritical (see Theorem 1.6) case. First, in Remark 1.3 we give a short non-rigorous demonstration of how our formula is used.
When we study the phase transition of the Erdős-Rényi graph, it is natural to introduce a parameter t ∈ R_+ and to study G_{n,p} for

p = p(t, n) = 1 − e^{−t/n}. (1.3)

We fix this relation between p and t throughout the paper. For any n ∈ N, λ ∈ R, and k ∈ [n] we define f_{n,t}(λ, k) so that we have f_{n,t}(j/n, k) = g_{n,p}(j, k) if j ∈ Z ∩ (−n, +∞), and thus we obtain (1.6), where W is the Lambert W function. Now it is known that if p = 1 − e^{−t/n} and n → ∞ then |C| converges in distribution to the total number of offspring of a subcritical Galton-Watson branching process with POI(t) offspring distribution (see [4, Theorem 11.6.1]), i.e., the limit of |C| has the Borel distribution with parameter t (see [2, Section 2.2] or [13, Section 7]). The generating function G_t of the Borel distribution with parameter t is known to be characterized by the identity G_t(z) ≡ z e^{(G_t(z)−1)t} (see [3, Section 10.4]), which is in turn equivalent to G_t(z) = −W(−e^{−t} t z)/t. Therefore a more rigorous version of (1.6) can be used to show that the distribution of |C| converges weakly to the Borel distribution with parameter t as n → ∞.
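The identity G_t(z) ≡ z e^{(G_t(z)−1)t} is easy to check numerically. The sketch below (plain Python; it assumes the standard Borel(t) point masses e^{−tk}(tk)^{k−1}/k!, k ≥ 1, which are quoted from the references above rather than derived here) computes G_t(z) once by summing the series and once by fixed-point iteration of the identity, for a subcritical t.

```python
import math

def borel_pgf_series(t, z, terms=400):
    """G_t(z) = sum_{k>=1} z^k e^{-tk} (tk)^{k-1} / k!  (truncated series).

    Working with logarithms via lgamma keeps the terms from overflowing."""
    total = 0.0
    for k in range(1, terms + 1):
        log_term = (k * math.log(z) - t * k
                    + (k - 1) * math.log(t * k) - math.lgamma(k + 1))
        total += math.exp(log_term)
    return total

def borel_pgf_fixed_point(t, z, iterations=200):
    """Solve G = z * exp((G - 1) * t) by fixed-point iteration
    (a contraction near the relevant root when t < 1)."""
    g = 0.0
    for _ in range(iterations):
        g = z * math.exp((g - 1.0) * t)
    return g

t, z = 0.5, 0.8
print(borel_pgf_series(t, z), borel_pgf_fixed_point(t, z))
```

At z = 1 the series sums to 1, confirming that Borel(t) is a proper probability distribution for t < 1.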
Now we state our rigorous results. We will use the Bachmann-Landau big O notation: we write f (n, t) = O (g(n, t)) if there exists a universal constant C such that f (n, t) ≤ Cg(n, t) for any n ∈ N and any t in an explicitly specified domain. We write f (n) = O (g(n)) if there exists a constant C (that may depend on t) such that f (n) ≤ Cg(n) for any n ∈ N.
We will give a short and self-contained proof of some results of [9] and [13], stated as (1.7) and (1.8) of Theorem 1.4 below. We will prove Theorem 1.4 in Section 3.
Remark 1.5. E_{n,p}(|C|) is often called the susceptibility of the Erdős-Rényi graph.
(i) Equation (1.15) of [9, Theorem 1.2] states that if p = µ/(n − 1) and 0 < µ < 1 then (1.9) holds. Now (1.9) follows from (1.7) if we take into account that µ = (n − 1)(1 − e^{−t/n}). The proof of (1.9) in [9, Section 2] uses a coupling of the breadth-first exploration process of C and a process related to a branching random walk. Our proof of (1.7) is completely different, as it only uses Proposition 1.1.

(iii) Both statements of Theorem 1.4 give something meaningful in the whole subcritical regime outside the critical window: e.g., the first term on the r.h.s. of (1.7) is much bigger than the second one, which in turn is much bigger than the third one, as long as (1 − t)³ n ≫ 1.
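The leading order of the subcritical susceptibility can be checked by simulation: by the branching-process approximation quoted in Remark 1.3, E_{n,p}[|C|] is close to 1/(1 − t), the mean total progeny of a POI(t) Galton-Watson process. A Monte Carlo sketch (plain Python; the function name is ours, and the tolerance is deliberately loose):

```python
import math
import random
from collections import deque

def component_size_of_vertex_one(n, p, rng):
    """Return |C| for one sample of G_{n,p}: breadth-first exploration from
    vertex 1, revealing each candidate edge once as a Bernoulli(p) trial."""
    unexplored = set(range(2, n + 1))
    queue = deque([1])
    size = 1
    while queue:
        queue.popleft()  # process the next active vertex
        found = [v for v in unexplored if rng.random() < p]
        for v in found:
            unexplored.discard(v)
            queue.append(v)
        size += len(found)
    return size

rng = random.Random(2024)
n, t = 100, 0.5
p = 1.0 - math.exp(-t / n)  # the parametrization (1.3)
mean = sum(component_size_of_vertex_one(n, p, rng) for _ in range(2000)) / 2000
print(mean)  # close to 1/(1 - t) = 2
```

The finite-n corrections (the lower-order terms of (1.7)) pull the empirical mean slightly below 1/(1 − t).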
We also give a short and self-contained proof of the central limit theorem proved in [17] for the size of the giant connected component of G n,p (see also [5], [11] and [15] for alternative proofs). Our proof only uses Proposition 1.1, see Theorem 1.6 below. We begin with some notation.
Theorem 1.6. Let us denote by |C_max| the size of the largest connected component of G_{n,p}. For any t > 1 we have (1.12) and (1.13), where Φ(x) is the c.d.f. of the standard normal distribution.
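Theorem 1.6 centers |C_max| at θn. The notation paragraph above is compressed, so we spell out the standard assumption used in this sketch: for the parametrization (1.3) and t > 1, θ = θ(t) is the unique root in (0, 1) of θ = 1 − e^{−tθ}. A minimal solver:

```python
import math

def theta(t, iterations=200):
    """Asymptotic giant-component density for t > 1: the unique root in (0, 1)
    of theta = 1 - exp(-t * theta), computed by fixed-point iteration."""
    if t <= 1.0:
        return 0.0  # no giant component at or below the critical point
    th = 0.5  # any starting point in (0, 1] converges for t > 1
    for _ in range(iterations):
        th = 1.0 - math.exp(-t * th)
    return th

print(theta(2.0))  # ≈ 0.7968
```

The iteration converges because the map x ↦ 1 − e^{−tx} is increasing and concave with exactly one fixed point in (0, 1) when t > 1.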
We prove Theorem 1.6 in Section 4. Our proof is different from the earlier proofs, which use the joint CLT for tree components of various sizes [17], stochastic differential equations that arise in the context of epidemics [15], and exploration processes [5, 11].

Remark 1.7. We believe that Proposition 1.1 can also be used to give elementary alternative proofs of some results of [1] on the sizes of connected components in the critical Erdős-Rényi graph. In particular, let ρ^u_0 denote the sigma-finite excursion length measure of the "first" excursion of the Brownian motion with parabolic drift which encodes the block sizes of the standard multiplicative coalescent process at time u ∈ R (see [1, (64)]). We believe that if t := 1 + u n^{−1/3} and X_n := |C|/n^{2/3} then the formula (1.14) can be proved using the methods of this paper, as we now argue. If we fix some β ∈ R and plug λ := ⌊βn^{2/3}⌋/n into (1.5), then we obtain (after some calculation) the formula (1.15). Now one can use stochastic calculus to show (1.16). We conjecture that (1.14) can be derived from (1.15) and (1.16).
We discuss the origins of (1.2) in Remark 2.2(i) and an extension of (1.2) to the stochastic block model in Remark 2.2(ii).

Proof of Proposition 1.1
The proof of Proposition 1.1 will easily follow from the change of measure formula (2.1). An idea similar to (2.1) has already been used in the proof of [4, Theorem 11.6.1].
Proof. If k > M then both sides of (2.1) are zero. Thus w.l.o.g. we can assume k ≤ M ∧ N. Now we observe that if we prove (2.1) for some M ≤ N, then we also obtain (2.1) for M′ = N and N′ = M by rearranging the formula (2.1), thus we may assume w.l.o.g. that k ≤ M ≤ N. In order to prove (2.1) it is enough to show the following identity.

Proof of Proposition 1.1. For any n ∈ N, j ∈ Z ∩ (−n, +∞) and p ∈ [0, 1] we have the following chain of identities, where in (∗) we used (2.1) with N = n and M = n + j. The proof of (1.2) is complete.
, where X_1, X_2, . . . , X_n denote independent exponentially distributed random variables with X_k ∼ EXP(1 − k/n); then τ = min{ k : Y_1 + · · · + Y_k < 0 } has the same distribution as |C| under P_{n,p}, p = 1 − e^{−t/n}. We chose to include an elementary proof instead in order to keep the paper self-contained.
(ii) It is possible to extend Proposition 1.1 to the stochastic block model, as we now briefly explain. Consider a random graph in which each vertex has a label, where the set of labels is {1, . . . , ℓ}. Let n = (n_1, . . . , n_ℓ) and n = n_1 + · · · + n_ℓ. We uniformly choose a labelling of the vertex set [n] from the set of labellings in which the number of vertices with label j is n_j for each j = 1, . . . , ℓ. Given the labels, we add edges independently: a vertex with label i and a vertex with label j are connected with probability p_{i,j}. Let p = (p_{i,j})_{i,j=1}^{ℓ}. Denote by P_{n,p} the law of the resulting random graph G_{n,p} and by E_{n,p} the corresponding expectation. This random graph model is often called the stochastic block model, and it is also a special case of the inhomogeneous random graph model of [7].
Denote by K(n) the set of vectors k = (k_1, . . . , k_ℓ) for which 0 ≤ k_j ≤ n_j and k_1 + · · · + k_ℓ ≥ 1. Denote by J(n) the set of vectors J = (J_1, . . . , J_ℓ) for which −n_j ≤ J_j and −n < J_1 + · · · + J_ℓ. Denote by J_{≤0}(n) the subset of J(n) which consists of the vectors J = (J_1, . . . , J_ℓ) for which J_j ≤ 0 for every j = 1, . . . , ℓ. Let us define the corresponding analogue of g_{n,p}(j, k).

Let C denote the connected component of the vertex indexed by 1 in G_{n,p}. Denote by |C|_j the number of vertices with label j in C and let |C| = (|C|_1, . . . , |C|_ℓ). The generalization of the formula (1.2) to the stochastic block model is (2.7). In order to prove (2.7), one needs the following analogue of (2.1), valid for k ∈ K(N). Note that if J ∈ J_{≤0}(n) then the r.h.s. of (2.7) is simply ∏_{j=1}^{ℓ}(n_j + J_j) / ∏_{j=1}^{ℓ} n_j. Also note that the analogue of the property stated in Remark 1.2(i) holds: the system of equations (2.7) indexed by J ∈ J_{≤0}(n) uniquely characterizes the distribution of |C| under P_{n,p}.
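The typed component vector (|C|_1, . . . , |C|_ℓ) can be sampled directly from the model description above. The sketch below (plain Python; names are ours) assigns labels in deterministic blocks instead of a uniformly random labelling, which by exchangeability of the vertices gives the same law for |C|.

```python
import random
from collections import deque

def sample_sbm(sizes, p, rng):
    """Stochastic block model on vertices 1..n. `sizes[i]` vertices get label i
    (0-indexed); `p[i][j]` is the connection probability between labels i, j."""
    n = sum(sizes)
    label = {}
    v = 1
    for i, n_i in enumerate(sizes):
        for _ in range(n_i):
            label[v] = i
            v += 1
    adj = {u: [] for u in range(1, n + 1)}
    for u in range(1, n + 1):
        for w in range(u + 1, n + 1):
            if rng.random() < p[label[u]][label[w]]:
                adj[u].append(w)
                adj[w].append(u)
    return adj, label

def typed_component_counts(adj, label, num_labels, root=1):
    """(|C|_1, ..., |C|_l): per-label vertex counts of the component of `root`."""
    seen = {root}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    counts = [0] * num_labels
    for u in seen:
        counts[label[u]] += 1
    return tuple(counts)

adj, label = sample_sbm([3, 2], [[0.8, 0.2], [0.2, 0.8]], random.Random(0))
print(typed_component_counts(adj, label, 2))
```

With all p_{i,j} = 1 the component of vertex 1 is the whole graph, so the counts equal (n_1, . . . , n_ℓ).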

Proof of Theorem 1.4
The basic idea is to treat E_{n,p}[g_{n,p}(j, |C|)] as the generating function of |C|, cf. Remark 1.3. Thus, if we want to obtain information about the first and second moments of |C|, we have to "differentiate" with respect to the variable j twice. Since j can only take integer values, we consider the first-order discrete differences g_{n,p}(j, |C|) − g_{n,p}(0, |C|) for j = −1 and j = −2 in the proofs of Lemmas 3.2 and 3.4, and the second-order discrete difference (i.e., the difference of the first-order differences) in the proof of Lemma 3.5. The statement of Lemma 3.1 is equivalent to [13, Lemma 3.2] (which is proved there using differential equations); moreover, it also follows classically from the fact that |C| is stochastically dominated by the total progeny of a subcritical branching process if t < 1. Despite this, we chose to include a proof of Lemma 3.1 which only uses Proposition 1.1, in order to keep the paper self-contained.
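The "discrete differentiation" step can be illustrated on a toy generating function (purely illustrative: the h below is a stand-in, not the g_{n,p} of (1.2)). For h(j) = E[(1 + j/n)^X], a Taylor expansion shows that n(h(0) − h(−1)) = E[X] + O(1/n) and n²(h(0) − 2h(−1) + h(−2)) = E[X(X − 1)] + O(1/n), provided X has enough moments. With X ~ POI(λ), h is explicit:

```python
import math

def h(lam, n, j):
    """h(j) = E[(1 + j/n)^X] for X ~ Poisson(lam): the p.g.f. e^{lam(z-1)}
    evaluated at z = 1 + j/n."""
    return math.exp(lam * j / n)

n, lam = 10**4, 0.7
first = n * (h(lam, n, 0) - h(lam, n, -1))                          # E[X] + O(1/n)
second = n**2 * (h(lam, n, 0) - 2 * h(lam, n, -1) + h(lam, n, -2))  # E[X(X-1)] + O(1/n)
print(first, second)  # close to lam = 0.7 and lam**2 = 0.49
```

For the Poisson distribution E[X] = λ and E[X(X − 1)] = λ², which the two differences recover up to an O(1/n) error.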
Proof. Let k ∈ [n]. We begin with a calculation similar to (3.10), arriving at (3.13). From (3.13) and Proposition 1.1 we obtain (3.14). Subtracting one from both sides of (3.14), multiplying the result by n and applying (3.1) to the last three terms of (3.14), we obtain (3.12).
We will often use the shorthand P for P_{n,p(t,n)} and E for E_{n,p(t,n)}. If X is a random variable and A is an event, we denote E(X; A) := E(X 1_A). Before we prove Lemma 4.1, we use it to prove Theorem 1.6.
Proof of Theorem 1.6. Denote by |C_1|, |C_2|, . . . the non-increasing rearrangement of the sequence of component sizes of the graph G_{n,p}. Thus |C_1| = |C_max| and |C_2| is the size of the second largest component. Note that |C_1| = |C_2| is possible, but we will show that |C_2| < |C_1| with high probability. For any a ∈ R let us denote

k_{n,a} = ⌊θn + a · σ√n⌋. (4.2)

We will show that for a ≤ b ∈ R we have

lim_{n→∞} P_{n,p(t,n)}[ |C_1| ∈ [k_{n,a}, k_{n,b}], |C_2| < k_{n,a} ] = (1/θ) · lim_{n→∞} P_{n,p(t,n)}[ |C| ∈ [k_{n,a}, k_{n,b}] ]. (4.3)

Now by Lemma 4.1 the right-hand side of (4.3) is Φ(b) − Φ(a). This readily implies lim inf_{n→∞} P_{n,p(t,n)}[ |C_2| < k_{n,a} ] ≥ Φ(b) − Φ(a) for any a ≤ b ∈ R, which in turn implies lim_{n→∞} P_{n,p(t,n)}[ |C_2| < k_{n,a} ] = 1 for any a ∈ R. Combining this with Lemma 4.1 and (4.3), we obtain that Theorem 1.6 indeed holds.

In order to prove (4.3), we observe that if k ∈ [k_{n,a}, k_{n,b}], then a direct computation yields (4.4). Next we show that (4.5) holds. To this end we observe that the graph that remains after the removal of a connected component of size k ≥ k_{n,a} from G_{n,p(t,n)} is a subcritical Erdős-Rényi graph, since lim_{n→∞} (n − k_{n,a}) · p(t, n) = t(1 − θ) < 1. Note that the susceptibility of this subcritical graph remains bounded as n → ∞ by (3.1), hence (4.5) follows from (4.6).
We are now ready to prove (4.3).

Note that the choice of the exponents 1/4, 3/4 and 5/8 above is somewhat arbitrary. Also note that I_n and Ī_n are the important intervals, while J_n, K_n and K̄_n are insignificant, i.e., we will see that |C| ∈ I_n ∪ Ī_n with high probability. The only reason behind the distinction between J_n and K_n is that we will use different methods to show that J_n and K_n are insignificant.

Proof of Lemma 4.1. First note that lim_{n→∞} P_{n,p(t,n)}[ |C| ∈ I_n ∪ Ī_n ] = 1 follows from the α = 0 case of (4.11). Combining this with (4.10) we obtain lim_{n→∞} P_{n,p(t,n)}[ |C| ∈ Ī_n ] = θ. The r.h.s. of (4.13) is the moment generating function of N(0, θ/(1 − θ)), thus it classically follows from (4.13) that µ_n converges weakly to N(0, σ²) as n → ∞, where σ appears in (1.13). Together with (4.10) and (4.12) this implies Lemma 4.1, given Lemmas 4.2 and 4.3.
We will prove Lemma 4.2 in Section 4.1 and Lemma 4.3 in Section 4.2. The proofs will make extensive use of (1.5). Let us now introduce some notation that will be used throughout.
We will write f (n) = Ω (g(n)) if there exists a constant c > 0 (that may depend on t) such that f (n) ≥ cg(n) for any n ∈ N.

4.1 Proof of Lemma 4.2

Before we outline the strategy of the proof of Lemma 4.2 in the paragraph below (4.18), we need to introduce some notation. Let us abbreviate X = f_{n,t}(−θ, |C|) and X* = f_{n,t}((−θ)*_n, |C|), where we use the notation of (4.14).
Recalling the definition of the intervals I_n and J_n from (4.8), we have (4.18). We will estimate the three terms on the r.h.s. of (4.18). We will show that the first term approximates P(|C| ∈ I_n) as n → ∞, while the second and third terms vanish as n → ∞.

4.2 Proof of Lemma 4.3
Before we outline the strategy of the proof of Lemma 4.3 in the paragraph below (4.26), we need to introduce some notation. If we define α**_n := ⌊√n α⌋/√n, then (α/√n)*_n = α**_n/√n by (4.14), and |α**_n − α| ≤ 1/√n.

We will estimate the five terms on the r.h.s. of (4.26). We will show that the terms corresponding to I_n and Ī_n in (4.26) approximate the terms corresponding to I_n and Ī_n in (4.11) as n → ∞, while the terms corresponding to J_n, K_n and K̄_n in (4.26) vanish as n → ∞.
Proof of Lemma 4.3. Before we start estimating the five terms of (4.26), we note that if k ∈ I_n ∪ J_n ∪ K_n ∪ Ī_n, then we can use the Taylor expansion of ln(1 + x) to obtain, for any α ∈ R, a formula for f_{n,t}(α/√n, k). The remaining estimates can be deduced analogously to (4.22), using an elementary bound on the exponential function valid for large enough n.