
Central limit theorem for the principal eigenvalue and eigenvector of Chung–Lu random graphs

Pierfrancesco Dionigi, Diego Garlaschelli, Rajat Subhra Hazra, Frank den Hollander and Michel Mandjes

Published 22 February 2023 • © 2023 The Author(s). Published by IOP Publishing Ltd
Focus on Random Matrices and Complex Networks
Citation: Pierfrancesco Dionigi et al 2023 J. Phys. Complex. 4 015008. DOI: 10.1088/2632-072X/acb8f7


Abstract

A Chung–Lu random graph is an inhomogeneous Erdős–Rényi random graph in which vertices are assigned average degrees, and pairs of vertices are connected by an edge with a probability that is proportional to the product of their average degrees, independently for different edges. We derive a central limit theorem for the principal eigenvalue and the components of the principal eigenvector of the adjacency matrix of a Chung–Lu random graph. Our derivation requires certain assumptions on the average degrees that guarantee connectivity, sparsity and bounded inhomogeneity of the graph.


Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 license. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction, main results and discussion

1.1. Introduction

The spectral properties of adjacency matrices play an important role in various areas of network science. In the present paper we consider an inhomogeneous version of the Erdős–Rényi random graph called the Chung–Lu random graph and we derive a central limit theorem for the principal eigenvalue and eigenvector of its adjacency matrix.

1.1.1. Setting

Recall that the homogeneous Erdős–Rényi random graph has vertex set $[n] = \{1,\ldots,n\}$, and each edge is present with probability p and absent with probability $1-p$, independently for different edges, where $p \in (0,1)$ may depend on n (in what follows we often suppress the dependence on n from the notation; the reader is however warned that most quantities depend on n). The average degree is the same for every vertex and equals $(n-1)p$ when self-loops are not allowed, and np when self-loops are allowed (and are considered to contribute to the degrees of the vertices). In [15] the following generalisation of the Erdős–Rényi random graph is considered, called the Chung–Lu random graph, with the goal of accommodating general degrees. Given a sequence of degrees $\vec{d}_n = (d_i)_{i \in [n]}$, consider the random graph $\mathcal{G}_n(\vec{d}_n)$ in which to each pair $i,j$ of vertices an edge is assigned independently with probability $p_{ij} = d_id_j/m_1$, where $m_1 = \sum_{i = 1}^n d_i$ (for computational simplicity we allow self-loops). Here, the degrees act as vertex weights: vertices with low weights are likely to have fewer neighbours than vertices with high weights, which act as hubs (see [33, chapter 6] for a general introduction to generalised random graphs). If $d_\uparrow^2 \leqslant m_1$ with $d_\uparrow = \max_{i\in [n]} d_i$, then $p_{ij} \leqslant 1$ for all $i,j \in [n]$, and the sequence $\vec{d}_n$ is graphical. Note that in $\mathcal{G}_n(\vec{d}_n)$ the expected degree of vertex i is $d_i$. The classical Erdős–Rényi random graph (with self-loops) corresponds to $d_i = np$ for all $i \in [n]$.
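To fix ideas, here is a minimal sketch (ours, not part of the paper) of how $\mathcal{G}_n(\vec{d}_n)$ can be sampled exactly as defined above; the degree sequence in the example is hypothetical.

```python
import numpy as np

def sample_chung_lu(d, seed=None):
    """Adjacency matrix of G_n(d): edge {i,j} present with probability
    p_ij = d_i d_j / m_1, independently, self-loops allowed."""
    rng = np.random.default_rng(seed)
    d = np.asarray(d, dtype=float)
    p = np.outer(d, d) / d.sum()          # requires max(d)^2 <= m_1, so p_ij <= 1
    A = np.triu(rng.random(p.shape) < p).astype(int)   # independent draws, i <= j
    return A + A.T - np.diag(np.diag(A))               # symmetrize

d = np.tile([20, 40], 500)                # n = 1000, expected degrees 20 and 40
A = sample_chung_lu(d, seed=1)
print(A.sum(axis=0).mean(), d.mean())     # empirical vs prescribed mean degree
```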

1.1.2. Principal eigenvalue and eigenvector

The largest eigenvalue of the adjacency matrix A and its corresponding eigenvector, written as $(\lambda_1,v_1)$, contain important information about the random graph. Several community detection techniques depend on a proper understanding of these quantities [1, 25, 32], which in turn play an important role for various measures of network centrality [26, 27] and for the properties of dynamical processes (such as the spread of an epidemic) taking place on networks [12, 28]. For Erdős–Rényi random graphs, it was shown in [24] that with high probability (whp in the following) λ1 scales like

$\lambda_1 = [1+{{\mathrm{o}}}(1)]\,\max\!\big\{\sqrt{D_\infty},\, np\big\}, \qquad (1.1)$

where $D_\infty$ is the maximum degree. This result was partially extended to $\mathcal{G}_n(\vec{d}_n)$ in [16], and more recently to a class of inhomogeneous Erdős–Rényi random graphs in [5, 6]. For a related discussion on the behaviour of $(\lambda_1,v_1)$ in real-world networks, see [12, 28]. In the present paper we analyse the fluctuations of $(\lambda_1,v_1)$. We will be interested specifically in the case where λ1 is detached from the bulk, which for Erdős–Rényi random graphs occurs when $\lambda_1 \sim np$ whp, and for Chung–Lu random graphs when $\lambda_1 \sim m_2/m_1$, where $m_2 = \sum_{i\in [n]} d_i^2$. Note that the quotient $m_2/m_1$ arises from the fact that the average adjacency matrix is rank one and that its only non-zero eigenvalue is $m_2/m_1$. Such rank-one perturbations of a symmetric matrix with independent entries became prominent after the work in [4]. Later studies extended this work to finite-rank perturbations [3, 7, 10, 11, 20, 21]. Erdős–Rényi random graphs differ from the classical Wigner setting in that the rank-one perturbation lives on a scale different from $\sqrt{n}$. For Chung–Lu random graphs we assume that $m_2/m_1\to\infty$.
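To spell out why $m_2/m_1$ appears: writing $e = (d_1,\ldots,d_n)^t/\sqrt{m_1}$ (the vector defined in (1.3) below), we have $(\mathbb{E}[A])_{ij} = d_id_j/m_1 = (ee^t)_{ij}$, hence $\mathbb{E}[A]\,e = \langle e,e\rangle\, e = \big(\sum_{i\in[n]} d_i^2/m_1\big)\, e = (m_2/m_1)\, e$, while every vector orthogonal to e lies in the kernel of $\mathbb{E}[A]$. The spectrum of $\mathbb{E}[A]$ is therefore $\{m_2/m_1, 0, \ldots, 0\}$.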

In the setting of inhomogeneous Erdős–Rényi random graphs, finite-rank perturbations were studied in [13]. In that paper the connection probability between i and j is given by $p_{ij} = \varepsilon_n f(i/n, j/n)$, where $f\colon\,[0,1]^2\to [0,1]$ is almost everywhere continuous and of finite rank, $\varepsilon_n \in [0,1]$ and $n\varepsilon_n\gg (\log n)^{8}$. However, for a Chung–Lu random graph with a given degree sequence it is not always possible to construct an almost everywhere continuous function f independent of n such that $\varepsilon_n f(i/n, j/n) = d_i d_j/m_1$. In the present paper we extend the analysis in [13] to Chung–Lu random graphs by focussing on $(\lambda_1,v_1)$. For Erdős–Rényi random graphs it was shown in [18, 19] that λ1 satisfies a central limit theorem (CLT) and that v1 aligns with the unit vector. These papers extend the seminal work carried out in [22].

1.1.3. Chung–Lu random graphs

In the present paper, subject to mild assumptions on $\vec{d}_n$, we extend the CLT for λ1 from Erdős–Rényi random graphs to Chung–Lu random graphs, and derive a pointwise CLT for v1 as well. It was shown in [16] that if $m_2/m_1\gg \sqrt{d_\uparrow}\,(\log n)$, then $\lambda_1 \sim m_2/m_1$ whp, while if $\sqrt{d_\uparrow} \gg (m_2/m_1)( \log n)^2$, then $\lambda_1 \sim \sqrt{d_\uparrow}$ whp. In fact, examples show that a result similar to (1.1) does not hold in general, and that λ1 need not scale like $\max\{m_2/m_1, \sqrt{d_\uparrow}\}$. These facts clearly show that the behaviour of λ1 is controlled by subtle assumptions on the degree sequence. In what follows we stick to a bounded inhomogeneity regime where $m_2/m_1\asymp d_\uparrow$.

The behaviour of v1 is interesting and challenging, and is of major interest for applications. One of the crucial properties to look for in eigenvectors is the phenomenon of localization versus delocalization. An eigenvector is called localized when its mass concentrates on a small number of vertices, and delocalized when its mass is approximately uniformly distributed on the vertices. The complete delocalization picture for Erdős–Rényi random graphs was given in [19]. In fact, it was proved that v1 is close to the scaled unit vector in the $\ell_\infty$-norm. In the present paper we do not study localization versus delocalization for Chung–Lu random graphs in detail, but we do show that in a certain regime there is strong evidence for delocalization because v1 is close to the scaled unit vector. In [9, corollary 1.3] the authors found that the eigenvectors of a Wigner matrix with independent standard Gaussian entries are distributed according to the Haar measure on the orthogonal group, and that the coordinates have Gaussian fluctuations after appropriate scaling. Our work shows that such coordinate-wise fluctuations hold as well for the principal eigenvector of the non-centered Chung–Lu adjacency matrix, and that they are Gaussian after appropriate centering and scaling.

1.1.4. Outline

In section 1.2 we define the Chung–Lu random graph, state our assumption on the degree sequence, and formulate two main theorems: a CLT for the largest eigenvalue and a CLT for its associated eigenvector. In section 1.3 we discuss these theorems and place them in their proper context. Section 2 includes simulations for different graph sizes and degree sequences. Section 3 contains the proof of the CLT of the eigenvalue and section 4 studies the properties of the principal eigenvector.

1.2. Main results

1.2.1. Set-up

Let $\mathbb{G}_n$ be the set of simple graphs with n vertices. Let $\vec{d}_n = (d_i)_{i \in [n]}$ be a sequence of degrees, such that $d_i\in \mathbb{N}$ for all $i\in [n]$, and abbreviate

$m_k = \sum_{i \in [n]} d_i^{k} \ (k\in\mathbb{N}), \qquad d_\downarrow = \min_{i \in [n]} d_i, \qquad d_\uparrow = \max_{i \in [n]} d_i.$

Note that these numbers depend on n, but in the sequel we will suppress this dependence. For each pair of vertices $i,j$ (not necessarily distinct), we add an edge independently with probability

$p_{ij} = \frac{d_i d_j}{m_1}. \qquad (1.2)$

The resulting random graph, which we denote by $\mathcal{G}_n(\vec{d}_n)$, is referred to in the literature as the Chung–Lu random graph. In [15] it was assumed that $d_\uparrow^2 \leqslant m_1$ to ensure that $p_{ij} \leqslant 1$. In the present paper we need sharper restrictions.

Assumption 1.1. Throughout the paper we need two assumptions on $\vec{d}_n$ as $n\to\infty$:

  • (D1) Connectivity and sparsity: there exists a ξ > 2 such that $(\log n)^{2\xi} \ll d_\uparrow \ll \sqrt{n}$.
  • (D2) Bounded inhomogeneity: $d_\downarrow \asymp d_\uparrow$.

$\spadesuit$

The lower bound in assumption 1.1(D1) guarantees that the random graph is connected whp and that it is not too sparse. The upper bound is needed in order to have $d_\uparrow = {{\mathrm{o}}}(\sqrt{m_1})$, which implies that (1.2) is well defined. Assumption 1.1(D2) is a restriction on the inhomogeneity of the model and requires that the smallest and the largest degree are comparable.

Remark 1.2. The lower bound on $d_\uparrow$ in assumption 1.1(D1) can be seen as an adaptation to our setting of the main condition in [16, theorem 2.1] for the asymptotics of λ1. As mentioned in section 1.1, under the assumption

$\frac{m_2}{m_1} \gg \sqrt{d_\uparrow}\,\log n,$

[16] shows that $\lambda_1 = [1+{{\mathrm{o}}}(1)]\,m_2/m_1$ whp. It is easy to see that the lower bound in assumption 1.1(D1), together with assumption 1.1(D2), implies the above condition. $\spadesuit$

Remark 1.3. When $d_\uparrow\ll n^{1/6}$, [33, theorem 6.19] implies that our results also hold for the Generalized Random Graph (GRG) model with the same average degrees. This model is defined by choosing connection probabilities of the form

$p_{ij} = \frac{d_i d_j}{m_1 + d_i d_j},$

and arises in statistical physics as the canonical ensemble constrained on the expected degrees, which is also called the canonical configuration model. Note that in the above connection probability, di plays the role of a hidden variable, or a Lagrange multiplier controlling the expected degree of vertex i, but does not in general coincide with the expected degree itself. However, under the assumptions considered here, di does coincide with the expected degree asymptotically. The reader can find more about GRG and their use in [33, Chapter 6], and about their role in statistical physics in [31]. In the corresponding microcanonical ensemble the degrees are not only fixed in their expectation but they take a precise deterministic value, which corresponds to the microcanonical configuration model. The two ensembles were found to be nonequivalent in the limit as $n\to\infty$ [30]. This result was shown to imply a finite difference between the expected values of the largest eigenvalue λ1 in the two models [17] when the degree sequence was chosen to be constant ($d_i = d$ for all $i \in [n]$). In this latter case the canonical ensemble reduces to the Erdős–Rényi random graph with $p = d/n$, while the microcanonical ensemble reduces to the d-regular random graph model. Although ensemble nonequivalence is not our main focus here, we will briefly relate some of our results to this phenomenon. $\spadesuit$
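As a quick numerical illustration (ours, with a hypothetical homogeneous degree sequence): the GRG and Chung–Lu connection probabilities differ by a relative amount $d_id_j/(m_1+d_id_j)$, which vanishes when $d_\uparrow = {{\mathrm{o}}}(\sqrt{m_1})$.

```python
import numpy as np

# Relative gap between the Chung-Lu probability d_i d_j / m_1 and the GRG
# probability d_i d_j / (m_1 + d_i d_j), for a hypothetical homogeneous sequence.
n, deg = 10**5, 30.0
m1 = n * deg
p_cl = deg * deg / m1
p_grg = deg * deg / (m1 + deg * deg)
print(p_cl, p_grg, 1 - p_grg / p_cl)   # relative gap = d_i d_j/(m_1 + d_i d_j) ~ 3e-4
```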

1.2.2. Notation

Let A be the adjacency matrix of $\mathcal{G}_n(\vec{d}_n)$ and $\mathbb{E}[A]$ its expectation. The (i, j)th entry of $\mathbb{E}[A]$ equals pij in (1.2). The entries of $A-\mathbb{E}[A]$ are centered Bernoulli random variables with parameters pij , independent up to symmetry. Let $\lambda_1\geqslant \ldots\geqslant \lambda_n$ be the eigenvalues of A and let $v_1, \ldots, v_n$ be the corresponding eigenvectors. The vector e will be the n-dimensional column vector

$e = \frac{1}{\sqrt{m_1}}\,\big(d_1, \ldots, d_n\big)^{t}, \qquad (1.3)$

where t stands for transpose. It is easy to see that $\mathbb{E}[A] = ee^t$.

Definition 1.4. Following [19], we say that an event $\mathcal{E}$ holds with $(\xi,\nu)$-high probability (written $(\xi,\nu)$-hp) when there exist ξ > 2 and ν > 0 such that

$\mathbb{P}[\mathcal{E}^{c}] \leqslant \mathrm{e}^{-\nu(\log n)^{\xi}} \quad \text{for } n \text{ large enough}. \qquad (1.4)$

   $\spadesuit$

Note that this is different from the classical notion of whp, because it comes with a specific rate.

Remark 1.5. Our results hold for any ν > 0 as soon as ξ > 2 (think of ν = 1). The role of ν becomes important when we consider specific subsets $\mathcal{S}$ of the event space and split into $\mathcal{S}\cap\mathcal{E}$ and $\mathcal{S}\cap\mathcal{E}^c$ (see e.g. [19]).$\spadesuit$

We write $\stackrel{w}{\longrightarrow}$ to denote weak convergence as $n\to\infty$, and use the symbols ${{\mathrm{o}}},\mathrm{O}$ to denote asymptotic order for sequences of real numbers.

1.2.3. CLT for the principal eigenvalue

Our first theorem identifies two terms in the expectation of the largest eigenvalue, and shows that the largest eigenvalue follows a central limit theorem.

Theorem 1.6. Under assumption 1.1, the following hold:

  • (I)  $\mathbb{E}[\lambda_1] = \frac{m_2}{m_1}+\frac{m_1m_3}{m_2^{2}}+{{\mathrm{o}}}(1)$ as $n\to\infty$.
  • (II)  $\frac{\lambda_1-\mathbb{E}[\lambda_1]}{\sigma_1} \stackrel{w}{\longrightarrow} N(0,1)$ as $n\to\infty$,
    where $\sigma_1^{2} = \frac{2m_3^{2}}{m_1m_2^{2}}$.

1.2.4. CLT for the principal eigenvector

Our second theorem shows that the principal eigenvector is parallel to the normalised degree vector, and is close to this vector in $\ell^\infty$-norm. It also identifies the expected value of the components of the principal eigenvector, and shows that the components follow a central limit theorem.

Theorem 1.7. Let $\tilde{e} = e\sqrt{m_1/m_2}$ be the $\ell^2$-normalized degree vector. Let v1 be the eigenvector corresponding to λ1 and let $v_1(i)$ denote the ith coordinate of v1. Under assumption 1.1, the following hold:

  • (I)  
    $\langle v_1,\tilde{e} \rangle = 1+{{\mathrm{o}}}(1)$ as $n\to\infty$  with $(\xi,\nu)$-hp .
  • (II)  
    $\|v_1- \tilde{e}\|_{\infty} \leqslant \mathrm{O}\left(\frac{(\log n)^{\xi}}{\sqrt{n d_\uparrow}}\right)$ as $n\to\infty$  with $(\xi,\nu)$-hp .
  • (III)  
    $\mathbb{E}[v_1(i)] = \frac{d_i}{\sqrt{m_2}}+\mathrm{O}\left(\frac{(\log n)^{2\xi}}{\sqrt{m_2}}\right)$ as $n\to\infty$.

Moreover, if the lower bound in assumption 1.1(D1) is strengthened to $(\log n)^{4\xi} \ll d_\uparrow$, then for all $i \in [n]$,

  • (IV)  $\frac{v_1(i)-d_i/\sqrt{m_2}}{\sigma_2(i)} \stackrel{w}{\longrightarrow} N(0,1)$ as $n\to\infty$,
    where $\sigma_2(i)^{2} = \frac{m_1m_3\,d_i}{m_2^{3}}$.

1.3. Discussion

We place the theorems in their proper context.

  • (a)  
    Theorems 1.6 and 1.7 provide a CLT for $\lambda_1,v_1$. We note that $m_2/m_1$ is the leading order term in the expansion of λ1, while $m_1m_3/m_2^2$ is a correction term. We observe that theorem 1.6(I) does not follow from the results in [16], because the largest eigenvalue need not be uniformly integrable and also the second order expansion is not considered there. We also note that in theorem 1.6(II) the centering of the largest eigenvalue, $\mathbb{E}[\lambda_1]$, cannot be replaced by its asymptotic value as the error term is not compatible with the required variance.
  • (b)  
    The lower bound in assumption 1.1(D1) is needed to ensure that the random graph is connected, and is crucial because the largest eigenvalue is very sensitive to connectivity properties. Assumption 1.1(D2) is needed to control the inhomogeneity of the random graph. It plays a crucial role in deriving concentration bounds on the central moments $\langle e, (A-\mathbb{E}[A])^k e\rangle$, $k \in \mathbb{N}$, with the help of a result from [19]. Further refinements may come from different tools, such as the non-backtracking matrices used in [5, 6]. While assumption 1.1(D1) appears to be close to optimal, assumption 1.1(D2) is far from optimal. It would be interesting to allow for empirical degree distributions that converge to a limiting degree distribution with a power law tail.
  • (c)  
    As already noted, if the expected degrees are all equal to each other, i.e. $d_i = d$ for all $i \in [n]$, then the Chung–Lu random graph, or canonical configuration model, reduces to the homogeneous Erdős–Rényi random graph with $p = d/n$, while the corresponding microcanonical configuration model reduces to the homogeneous d-regular random graph model (here, all models allow for self-loops). This implies that, for the homogeneous Erdős–Rényi random graph with connection probability $p \gg (\log n)^{2\xi}/n$, ξ > 2, theorem 1.6(I) reduces to
    $\mathbb{E}[\lambda_1] = np+1+{{\mathrm{o}}}(1),$
    while theorem 1.6(II) reduces to
    $\frac{\lambda_1-\mathbb{E}[\lambda_1]}{\sqrt{2p}} \stackrel{w}{\longrightarrow} N(0,1).$
    Both these properties were derived in [18] for homogeneous Erdős–Rényi random graphs and also for rank-1 perturbations of Wigner matrices. In [17], the fact that $\mathbb{E}[\lambda_1]$ in the canonical ensemble differs by a finite amount from the corresponding expected value (here, d = np) in the microcanonical ensemble (d-regular random graph) was shown to be a signature of ensemble nonequivalence.
  • (d)  
    In case $d_i = d$ for all $i \in [n]$, theorem 1.7(IV) reduces to the following CLT, which was not covered by [18] and [17].

Corollary 1.8. For the Erdős–Rényi random graph with $(\log n)^{4\xi}/n \ll p \ll n^{-1/2}$ for some ξ > 2,

$n\sqrt{p}\left(v_1(i)-\frac{1}{\sqrt{n}}\right) \stackrel{w}{\longrightarrow} N(0,1) \qquad \text{for all } i \in [n].$

Note that, in the corresponding microcanonical ensemble (d-regular random graph), v1 coincides with the constant vector where $v_1(i) = 1/\sqrt{n}$ for all $i \in [n]$. Therefore in the canonical ensemble each coordinate $v_1(i)$ has Gaussian fluctuations around the corresponding deterministic value for the microcanonical ensemble. This behaviour is similar to the degrees having, in the canonical configuration model, either Gaussian (in the dense setting) or Poisson (in the sparse setting) fluctuations around the corresponding deterministic degrees for the microcanonical configuration model [23].

  • (e)  
    One way to satisfy assumption 1.1 is to specify functions $\omega,c_1,\ldots,c_n$, satisfying $(\log n)^{2\xi}\ll\omega(n)\ll \sqrt{n}$ and $c \leqslant c_1(n) \leqslant \ldots \leqslant c_n(n)\leqslant C$ with $0 \lt c\leqslant C \lt \infty$, such that
    $d_i = c_i(n)\,\omega(n), \qquad i \in [n].$
    The reason why we avoid such a description is that our setting is potentially broader. The concentration estimate in lemma 3.4 requires us to assume homogeneous degree sequences as above, while theorem 1.6(I) holds for much more general degree sequences. A further refinement of lemma 3.4 may be possible. The advantage of the above description is that it makes the scale $\omega(n)$ on which the degrees live explicit. However, most of the bounds in our proofs depend on some power of $d_\uparrow$, up to some multiplicative constant. This means that, in the bounded inhomogeneity setting, expressing the asymptotics through $\omega(n)$ or $d_\uparrow$ is equivalent. Bounds expressed through $\omega(n)$ would cease to be meaningful as soon as we manage to push beyond the bounded inhomogeneity setting of our model, while the skeleton of our proof would still hold.
  • (f)  
    In [14] the empirical spectral distribution of A was considered under the assumption that
    which is weaker than assumption 1.1. It was shown that if $\mu_n \stackrel{w}{\longrightarrow} \mu$ with $\mu_n = n^{-1} \sum_{i = 1}^n \delta_{d_i/d_\uparrow}$ and µ some probability distribution on $\mathbb{R}$, then
    with $\mu_{\mathrm{sc}}$ the Wigner semicircle law and $\boxtimes$ the free multiplicative convolution. Since $\mu\boxtimes \mu_{\mathrm{sc}}$ is compactly supported, this shows that the scaling for the largest eigenvalue and the spectral distribution are different.

2. Simulations

Theorems 1.6 and 1.7 show that, after proper scaling and under certain conditions of sparsity and homogeneity, the largest eigenvalue and the components of the largest eigenvector exhibit Gaussian behaviour in the limit as $n\to\infty$. A natural question is how these quantities behave for finite n. Indeed, real-world networks have sizes that range from $n = 10^2$ to $n = 10^9$. Another question is computational feasibility. Indeed, our CLTs require the degrees to lie between $(\log n)^{4}$ (respectively, $(\log n)^{8}$) and $\sqrt{n}$. For this window to be non-empty, n must be at least $10^{11}$ (respectively, $10^{29}$), which is unrealistic. Let us therefore see what simulations have to say (see footnote 6).
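To make the feasibility estimate concrete, a back-of-the-envelope computation (ours) locates where the window between $(\log n)^{p}$ and $\sqrt{n}$ first opens up, for p = 4 and p = 8:

```python
import numpy as np

# First n for which (log n)^p < sqrt(n), i.e. the degree window is non-empty;
# p = 4 corresponds to (log n)^{2 xi} with xi slightly above 2, p = 8 to (log n)^{4 xi}.
for p in (4, 8):
    x = np.linspace(10, 100, 1_000_000)        # x = log n (natural log), n beyond e^10
    i = np.argmax(x ** p < np.exp(x / 2))      # first x where sqrt(n) overtakes (log n)^p
    print(f"p = {p}: n ~ {np.exp(x[i]):.1e}")  # ~2e11 for p = 4, ~2e29 for p = 8
```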

2.1. Largest eigenvalue

In figure 1 we show histograms for the quantity

$\bar{\lambda}_1 = \frac{m_2\sqrt{m_1}}{m_3}\left(\lambda_1 - \frac{m_2}{m_1} - \frac{m_1 m_3}{m_2^{2}}\right),$

which should be close to normal with mean 0 and variance 2 (for $\mathbb{E}[\lambda_1]$ the correction term ${{\mathrm{o}}}(1)$ is neglected). The convergence is fast: already for n = 500 the Gaussian shape emerges and represents an excellent fit: the sample mean µ is close to 0 and the sample standard deviation σ is close to $\sqrt{2}$.
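The following sketch (ours) reproduces this experiment on a small scale; the degree sequence is a hypothetical example, and the normalisation of $\bar{\lambda}_1$ is the one displayed above.

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 500, 1000
d = np.repeat([10.0, 15.0, 20.0, 25.0, 30.0], n // 5)  # each value appears n/5 times
m1, m2, m3 = d.sum(), (d**2).sum(), (d**3).sum()
p = np.outer(d, d) / m1

obs = np.empty(reps)
for r in range(reps):
    A = np.triu(rng.random((n, n)) < p).astype(float)
    A = A + A.T - np.diag(np.diag(A))                  # symmetric, self-loops kept
    lam1 = np.linalg.eigvalsh(A)[-1]                   # largest eigenvalue
    obs[r] = (lam1 - m2/m1 - m1*m3/m2**2) * m2 * np.sqrt(m1) / m3
print(obs.mean(), obs.var())                           # expect ~0 and ~2
```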


Figure 1. Histograms of $\bar{\lambda}_1$ for different graph sizes n and degree sequences $\vec{d}$. The sample size for each regime is $10^4$. Each element specified in the degree sequence appears $\frac{n}{5}$ times. In red is plotted the Gaussian fit; µ is the sample mean (represented by a dashed line in the histogram), σ is the sample standard deviation. We expect µ ≈ 0 and $\sigma \approx \sqrt{2}$.


2.2. Largest eigenvector

In figure 2 we show histograms for the quantity

$\bar{v}_1(i) = \sqrt{\frac{m_2^{3}}{m_1 m_3 d_i}}\left(v_1(i) - \frac{d_i}{\sqrt{m_2}}\right),$

which should be close to normal with mean 0 and variance 1. The fit is again excellent.
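An analogous sketch (ours) for a single coordinate, with a hypothetical degree sequence and index chosen in the spirit of figure 2:

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps, i = 500, 1000, 399                  # coordinate with d_i = 20 (0-based index)
d = np.repeat([5.0, 10.0, 15.0, 20.0, 25.0], n // 5)
m1, m2, m3 = d.sum(), (d**2).sum(), (d**3).sum()
p = np.outer(d, d) / m1

obs = np.empty(reps)
for r in range(reps):
    A = np.triu(rng.random((n, n)) < p).astype(float)
    A = A + A.T - np.diag(np.diag(A))
    v1 = np.linalg.eigh(A)[1][:, -1]         # eigenvector of the largest eigenvalue
    v1 *= np.sign(v1.sum())                  # fix the sign (Perron direction)
    obs[r] = (v1[i] - d[i]/np.sqrt(m2)) * np.sqrt(m2**3 / (m1 * m3 * d[i]))
print(obs.mean(), obs.var())                 # expect ~0 and ~1
```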


Figure 2. Histograms of $\bar{v}_1(i)$ for different graph sizes n and degree sequences $\vec{d}$. For each of the images, i is chosen to be the last i such that di is equal to the 4th element of the corresponding degree sequence (e.g. for n = 500, $v_1(400)$ was analysed with $d_{400} = 20$). The sample size for each regime is $10^4$. Each element in the degree sequence appears $\frac{n}{5}$ times. In red is plotted the Gaussian fit; µ is the sample mean (represented by a dashed line in the histogram), σ is the sample standard deviation. We expect µ ≈ 0 and σ ≈ 1.


2.3. Degrees of order log n and $\sqrt{n}$

What happens when the degrees are of order $\log n$? As can be seen in figure 3, in that range the Gaussian approximation for the largest eigenvalue is visibly worse, especially for the centering. The same happens for the components of the largest eigenvector, as can be seen in figure 4, where the Gaussian shape is lost and two peaks appear.


Figure 3. Histograms of $\bar{\lambda}_1$ for different graph sizes n and degree sequences $\vec{d}$ of order $\log n$. The sample size for each regime is $10^4$. Each element specified in the degree sequence appears $\frac{n}{4}$ times. In red is plotted the Gaussian fit; µ is the sample mean (represented by a dashed line in the histogram), σ is the sample standard deviation. If theorem 1.6 were to hold in this regime, then we would expect µ ≈ 0 and $\sigma \approx \sqrt{2}$.


Figure 4. Histograms of $\bar{v}_1(i)$ for different graph sizes n and degree sequences $\vec{d}$ of order $\log n$. For each of the images, i has been chosen to be the last i such that di is equal to the 3rd element of the specified degree sequence (e.g. for n = 500, $v_1(375)$ was analysed with $d_{375} = 8$). The sample size for each regime is $10^4$. Each element specified in the degree sequence appears $\frac{n}{4}$ times. In red is plotted the Gaussian fit; µ is the sample mean (represented by a dashed line in the histogram), σ is the sample standard deviation. If theorem 1.7 were to hold in this regime, then we would expect µ ≈ 0 and σ ≈ 1.


3. Proof of theorem 1.6

In what follows we use the well-known method of writing the adjacency matrix as a rank-one perturbation of the centered matrix and analysing the largest eigenvalue through the resulting resolvent identity. This method was previously successfully employed in [19, 22, 29].

Given the adjacency matrix A of our graph G, we can write $A = H+\mathbb{E}[A]$ with $H = A-\mathbb{E}[A]$. Let v1 be the eigenvector associated with the eigenvalue λ1. Then

$\lambda_1 v_1 = A v_1 = H v_1 + \left\langle e, v_1\right\rangle e.$

Using that $\mathbb{E}[A] = ee^t$, we have $(\lambda_1 I - H) v_1 = \left\langle e, v_1\right\rangle e$, where I is the n×n identity matrix. It follows that if λ1 is not an eigenvalue of H, then the matrix $(\lambda_1 I - H)$ is invertible, and so

$v_1 = \left\langle e, v_1\right\rangle\,(\lambda_1 I - H)^{-1} e. \qquad (3.1)$

Eliminating the eigenvector v1 from the above equation, we get

$1 = \left\langle e, (\lambda_1 I - H)^{-1} e\right\rangle,$

where we use that $\left\langle e, v_1\right\rangle \neq 0$ (since λ1 is not an eigenvalue of H). Note that this can be expressed as

$\lambda_1 = \left\langle e, \Big(I - \frac{H}{\lambda_1}\Big)^{-1} e\right\rangle = \sum_{k = 0}^{\infty} \frac{\left\langle e, H^{k} e\right\rangle}{\lambda_1^{k}}, \qquad (3.2)$

where the validity of the series expansion will be an immediate consequence of lemma 3.2 below.

Section 3.1 derives bounds on the spectral norm of H. Section 3.2 analyses the expansion in (3.2) and proves the scaling of $\mathbb{E}[\lambda_1]$. Section 3.3 is devoted to the proof of the CLT for λ1, section 4 to the proof of the CLT for v1. In the expansion we distinguish three ranges: (a) $k = 0,1,2$; (b) $3 \leqslant k \leqslant L$; (c) $L \lt k \lt \infty$, where

$L = \lfloor \log n \rfloor.$

We will show that (a) controls the mean and the variance in both CLTs, while (b)–(c) are negligible error terms.
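Before turning to the proofs, here is a numerical sanity check (ours) of the identity (3.2) truncated at $L = \lfloor \log n\rfloor$, for a hypothetical degree sequence:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2000
d = np.repeat([25.0, 30.0, 35.0, 40.0], n // 4)
m1 = d.sum()
p = np.outer(d, d) / m1                      # E[A] = e e^t with e = d / sqrt(m1)
A = np.triu(rng.random((n, n)) < p).astype(float)
A = A + A.T - np.diag(np.diag(A))
H, e = A - p, d / np.sqrt(m1)
lam1 = np.linalg.eigvalsh(A)[-1]

L = int(np.log(n))                           # truncation level L = floor(log n)
partial, Hke = 0.0, e.copy()
for k in range(L + 1):
    partial += (e @ Hke) / lam1**k           # adds <e, H^k e> / lambda_1^k
    Hke = H @ Hke
print(lam1, partial)                         # the truncated series nearly recovers lambda_1
```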

3.1. The spectral norm

In order to study λ1, we need good bounds on the spectral norm of H. The spectral norm of matrices with inhomogeneous entries has been studied in a series of papers [2, 5, 6] for different density regimes.

An important role is played by $\lambda_1(\mathbb{E}[A])$. In recent literature this quantity has been shown to play a prominent role in the so-called BBP-transition [4]. Given our setting (1.2), it is easy to see that

$\lambda_1(\mathbb{E}[A]) = \langle e, e\rangle = \frac{m_2}{m_1}, \qquad (3.3)$

while all other eigenvalues of $\mathbb{E}[A]$ are zero.

Remark 3.1. Since $d_\downarrow\leqslant \frac{m_2}{m_1}\leqslant d_\uparrow$, assumption 1.1(D2) implies that

$\frac{m_2}{m_1} \asymp d_\uparrow. \qquad (3.4)$

   $\spadesuit$

We start with the following lemma, which ensures concentration of λ1 and is a direct consequence of the results in [6], whose conditions are matched by assumption 1.1. In particular, we use [6, theorem 3.2] to check that the boundaries of the bulk of the spectral distribution live on a scale smaller than the scale of λ1.

Lemma 3.2. Under assumption 1.1, with $(\xi,\nu)$-hp

$\left|\lambda_1 - \frac{m_2}{m_1}\right| \leqslant C\sqrt{d_\uparrow} \quad \text{for some constant } C \gt 0,$

and consequently

$\lambda_1 = [1+{{\mathrm{o}}}(1)]\,\frac{m_2}{m_1}.$

Proof. In the proof it is understood that all statements hold with $(\xi,\nu)$-hp in the sense of (1.4). Let $A = H+\mathbb{E}[A]$. Due to Weyl's inequality, we have that

$|\lambda_1(A) - \lambda_1(\mathbb{E}[A])| \leqslant \|H\|.$

From [6, theorem 3.2] we know that there is a universal constant C > 0 such that

where

with κ defined by

Thanks to assumption 1.1(D2), we have $\kappa = \mathrm{O}(1)$. By [6, remark 3.1] (which gives us that $q = \sqrt{d_\uparrow}$ for n large enough) and assumption 1.1, we get that

Equation (3.5)

Using [8, example 8.7] or [6, equation 2.4] (the Talagrand inequality), we know that there exists a universal constant c > 0 such that

For $t = \sqrt{\nu (\log n)^\xi}$,

Equation (3.6)

Thus, we have

Equation (3.7)

Using that $\lambda_1(\mathbb{E}[A]) = m_2/m_1$, we have that with $(\xi,\nu)$-hp the following bound holds:

Via assumption 1.1 and (3.4) the claim follows.

Remark 3.3. 

  • (a)  
    The proof of lemma 3.2 works well if we replace assumption 1.1(D2) by a milder condition. Indeed, the former is directly linked to the parameter κ that appears in the proof of lemma 3.2 and in the proof of [6, theorem 3.2], which contains a more general condition on the inhomogeneity of the degrees.
  • (b)  
    Note that a consequence of proof of lemma 3.2 is that with $(\xi,\nu)$-hp
    Equation (3.8)
    for some $C_0\in (0,1)$. This allows us to claim that with $(\xi,\nu)$-hp the inverse
    Equation (3.9)
    exists.

   $\spadesuit$

Lemma 3.4. Let $1\leqslant k\leqslant L$. Then, under assumption 1.1, with $(\xi,\nu)$-hp

i.e.

Lemma 3.4 is a generalization to the inhomogeneous setting of [19, lemma 6.5]. We skip the proof because it requires a straightforward modification of the arguments in [19].

Lemma 3.5. Under assumption 1.1, for $2\leqslant k\leqslant L$, there exists a constant C > 0 such that

Equation (3.10)

Proof. Let $\mathcal E$ be the high probability event defined by (3.6), i.e.

Due to assumption 1.1(D1) we can bound the right-hand side by $Cd_\uparrow$. Since $\|e\|_2^2 = m_2/m_1$, on this event we have

We next show that the expectation evaluated on the complementary event is negligible. Indeed, observe that

where in the last inequality we use that $d_\uparrow = {{\mathrm{o}}}(\sqrt{m_1})$. This, combined with the exponential decay of the event $\mathcal E^c$, gives

and so the claim follows.

3.2. Expansion for the principal eigenvalue

We denote the event in lemma 3.2 by $\mathcal E$, which has high probability. As noted in remark 3.3(b), $I-\frac{H}{\lambda_1}$ is invertible on $\mathcal E$. Hence, expanding on $\mathcal E$, we get

We split the sum into two parts:

$\lambda_1 = \sum_{k = 0}^{L} \frac{\left\langle e, H^{k} e\right\rangle}{\lambda_1^{k}} + \sum_{k = L+1}^{\infty} \frac{\left\langle e, H^{k} e\right\rangle}{\lambda_1^{k}}. \qquad (3.11)$

First we show that we may ignore the second sum. To that end we observe that, by assumption 1.1 (D1), on the event $\mathcal E$ we can estimate

Equation (3.12)

Because of (3.12) and the fact that $\mathbb{E}(\left\langle e,H e\right\rangle) = 0$, (3.11) reduces to

Next, we estimate the second sum in the above equation. Using lemma 3.2, we get

From lemma 3.5 we have

where the last estimate follows from assumption 1.1(D1). Hence, on $\mathcal E$,

Iterating the expression for λ1 on the right-hand side, we get

Expanding the second and third terms, we get

Here we use that $\left\langle e, e\right\rangle = m_2/m_1\to \infty$, and we ignore several other terms because they are small with $(\xi,\nu)$-hp , for example,

One more iteration gives

Proof of theorem 1.6 (I) Since the probability of $\mathcal E^c$ decays exponentially with n, taking the expectation of the above term and using that $\mathbb{E}[\left\langle e, H e\right\rangle] = 0$, we obtain

Note that

and so we can write

Equation (3.13)

3.3. CLT for the principal eigenvalue

Again consider the high probability event on which (3.9) holds. Recall that from the series decomposition in (3.11) we have

Equation (3.14)

Lemma 3.6. The equation

Equation (3.15)

has a solution x0 satisfying

Proof. Define the function $h\colon\, (0,\infty)\to\mathbb R$ by

Since $\mathbb{E}[\langle e, He\rangle] = 0$, we have

For x > 0,

This shows that

Hence, for any $0\lt\delta \lt1$,

So, for large enough n,

Similarly, for any $0\lt\delta\lt1$,

This shows that there is a solution for (3.15), which lies in the interval $[ \frac{m_2}{m_1}(1-\delta), \frac{m_2}{m_1}(1+\delta)]$.

Lemma 3.7. Let x0 be a solution for (3.15). Define

Then

Proof of theorem 1.6 (II) From the previous lemmas we have

Therefore

and

Hence

Equation (3.16)

Observe that

Let

where we use the symmetry of the expression in the last equality. We can apply Lyapunov's central limit theorem, because $\{h_{i,j}\colon\, i\leqslant j\}$ is an independent collection of random variables and Lyapunov's condition is satisfied, i.e.

where K is a constant that does not depend on n. Hence

Returning to the eigenvalue equation in (3.16) and dividing by σ1, we have

We next prove lemma 3.7, on which the proof of the central limit theorem relied.

Proof. Note that by (3.14) and (3.15) we can write

Equation (3.17)

where

Thanks to lemmas 3.2 and 3.4 and (3.12), we have

Note that $L_n = {{\mathrm{o}}}( \frac{m_3}{m_2 \sqrt{m_1}})$. Indeed, using $m_3\geqslant n d_\downarrow^3$ and assumption 1.1(D1), we get

Observe that (3.17) can be rearranged as

Hence, bringing the second term from the right to the left, we have

Using the bounds on λ1 and x0, we get

We can therefore write

where $L_n = {{\mathrm{o}}}_P( \frac{m_3}{m_2 \sqrt{m_1}})$. Finally, to pass to Rn , note that

Equation (3.18)

To bound Rn , it is enough to show that the first term on the right-hand side is with $(\xi,\nu)$-hp bounded by $\frac{m_3}{m_2 \sqrt{m_1}}$. Using lemma 3.4 (for k = 1) and (3.7), we have with $(\xi,\nu)$-hp

Equation (3.19)

Using again assumption 1.1(D1), $m_3\geqslant n d_\downarrow^3$, $m_1\leqslant n d_\uparrow$ and $m_2\leqslant n d_\uparrow^2$, we get that

This controls the right-hand side of (3.19), and hence $R_n = {{\mathrm{o}}}( \frac{m_3}{m_2\sqrt{m_1}})$ with $(\xi,\nu)$-hp .

We want to show that Rn is negligible both pointwise and in expectation. We already know that it is negligible pointwise with $(\xi,\nu)$-hp ; it remains to show that the same bound holds in expectation. Let $\mathcal{A}$ be the high probability event of lemmas 3.2 and 3.4, and write

where $\textbf{1}_{\mathcal{A}}$ is the indicator function of the event $\mathcal{A}$. Since all the bounds hold on the high probability event $\mathcal A$, it is immediate that

The remainder can be bounded via the Cauchy–Schwarz inequality, namely,

We see that if $\mathbb{E}[|R_n|^2] = {{\mathrm{o}}}(\mathrm{e}^{-\nu(\log n)^\xi})$, then we are done. Expanding, we see that

for some C > 0, where we use that

and the trivial bound $|\left\langle e, He\right\rangle|\leqslant n^{C_*}$ for some $C_*\lt C$. Hence we have $\left(\mathbb{E}[|R_n|^2]\mathbb{E}[\textbf{1}_{A^c}]\right)^{\frac{1}{2}} \leqslant \mathrm{e}^{-\nu(\log n)^{\xi}}$ and

4. Proof of theorem 1.7

In this section we study the properties of the principal eigenvector. Let v1 be the normalized principal eigenvector, i.e. $\|v_1\| = 1$, and let e be as defined in (1.3). Recall from (3.1) that

and after inversion (which is possible on the high probability event) we have

If K denotes the normalization factor, then we can rewrite the above equation with $(\xi,\nu)$-hp as the series

Equation (4.1)

Our first step is to determine the value of K in (4.1). We adapt the results from [19] to derive a component-wise central limit theorem in the inhomogeneous setting described by (1.2) under assumption 1.1. By the normalization of v1,

Equation (4.2)

where we use the symmetry of H.

The following lemma settles theorem 1.7(I).

Lemma 4.1. Under assumption 1.1, and with $\tilde{e} = e\sqrt{\frac{m_1}{m_2}}$, with $(\xi,\nu)$-hp

$\langle v_1, \tilde{e}\rangle = 1 + {{\mathrm{o}}}(1). \qquad (4.3)$

Proof. Recall that $L = \lfloor \log n\rfloor$. We rewrite (4.2) as

Equation (4.4)

We first show that the last two parts are negligible and then show that the main term of the first part is the term with k = 0, i.e. $\langle e,e\rangle = m_2/m_1$.

The last term in (4.4) is dealt with as follows. Using (3.8), we have with $(\xi,\nu)$-hp

with $c' = -\log(1-C_0)$, where we use that $\sum_{k = 0}^\infty (k+1)(1-c)^k = 1/c^2$ for $|1-c|\lt1$.

We tackle the second sum in (4.4) by using lemma 3.4. Indeed, with $(\xi,\nu)$-hp we have

where the constant varies in each step. By assumption 1.1(D1), the last term goes to zero.

As to the first term, note that by (3.5) for $k\geqslant3$ we have

The term with k = 1 is zero, while for k = 2 we have

for some constant c. After substituting these results into (4.4), we find

Equation (4.5)

and the proof follows by normalizing the vector e and using (4.1).

The following lemma is an immediate consequence of (4.1) and lemma 4.1.

Lemma 4.2. Under assumptions 1.1, with $(\xi,\nu)$-hp

Equation (4.6)

In order to estimate how the components of v1 concentrate, we need the following lemma.

Lemma 4.3. For $1 \leqslant k \leqslant L$, with $(\xi,\nu)$-hp

The proof of this lemma is a direct consequence of lemma 3.4 and is similar to that of [19, lemma 7.10]; we therefore skip it. An immediate corollary of the above estimate is the delocalized behaviour of the largest eigenvector stated in theorem 1.7(II).

Lemma 4.4. Let v1 be the normalized principal eigenvector, and $\tilde{e} = e\sqrt{\frac{m_1}{m_2}}$. Then with $(\xi,\nu)$-hp

Proof. Recall from (4.4) that

The last term is negligible with $(\xi,\nu)$-hp , because it is the tail sum of a geometrically decreasing sequence. For the sum over $1 \leqslant k \leqslant L$ we can use lemma 4.3 and the fact that $K/\lambda_1 = \sqrt{\frac{m_1}{m_2}}+ {{\mathrm{o}}}(1)$ with $(\xi,\nu)$-hp . So we have

The first term with $(\xi,\nu)$-hp is

and the error is uniform over all i. Indeed, with $(\xi,\nu)$-hp

Equation (4.7)

where we use assumption 1.1, remark 3.1 and (3.7). Since the detailed computations are similar to the previous arguments, we skip the details.

We next prove the central limit theorem for the components of the eigenvector stated in theorem 1.7(IV).

Theorem 4.5. Under assumption 1.1, with the extra assumption $d_\uparrow\gg (\log n)^{4\xi}$, for every $i \in [n]$,

$\sqrt{\frac{m_2^{3}}{m_1 m_3 d_i}}\left(v_1(i) - \frac{d_i}{\sqrt{m_2}}\right) \stackrel{w}{\longrightarrow} N(0,1).$

Proof. First we compute $\mathbb{E}[v_1(i)]$, and afterwards we show that the CLT holds componentwise.

We use the law of total expectation. Conditioning on the high probability event $\mathcal E$ in lemma 3.2, we can write the expectation of the normalized eigenvector v1 as

Because the components of a normalized n-dimensional vector are bounded, we know that

for some suitable constant $c_\nu\gt0$, dependent on ν and on the the bound on $v_1(i)$. On $\mathcal{E}$, we can expand v1 as

Using the notation $\mathbb{E}_{\mathcal E}$ for the conditional expectation on the event $\mathcal E$, we have

For the first term we have, using (4.5),

For the term corresponding to k = 1, we know that $\mathbb{E}[(He)(i)] = 0$ by construction on the whole space. However, under the event $\mathcal{E}$ we can show that its contribution is exponentially negligible. We have

Since $m_2/m_1\to\infty$, there exists a constant $\tilde C$ such that

We can therefore write

Here we use (3.7) to bound the difference $|\lambda_1-(m_2/m_1)|$. Next, write

where cν is a constant depending on ν, and we use that $|h_{ij}|\leqslant 1$ and $m_1 = \mathrm{O}\big(\mathrm{e}^{\frac{3}{2}\log n}\big)$. We can therefore conclude that

where $c^{\prime}_\nu\gt0$ is a suitable constant depending on ν, and possibly different from cν .

To bound the remaining expectation terms, we use lemma 4.3, which gives a bound on $(H^ke)(i)$ on the event $\mathcal{E}$. As before, we break up the sum into two contributions:

For the second term we have

Equation (4.8)

where we use (3.8) and $C_c = |\log (1-C_0)|$. The first term can be bounded via lemma 4.3, which gives

Equation (4.9)

Using the above bounds, taking expectations and using (4.5), we get

Thus, we have obtained that

which settles theorem 1.7(III).

We can write

where we replace the last terms of the expansion of v1 by the bounds derived above (note that these bounds are of the same order as the ones obtained for the same terms in expectation). The first term of the centered quantity $v_1(i)-d_i/\sqrt{m_2}$ is given by

This last error can be easily seen to be ${{\mathrm{o}}}\left(\frac{(\log n)^{2\xi}}{\sqrt{m_2}}\right)$. We can therefore write

We proceed to show that the first term on the right-hand side of the above equality gives a CLT when the expression is rescaled by an appropriate quantity, and the error term goes to zero. It turns out that

Multiplying by $\sqrt{\frac{m_2^3}{d_im_3m_1}}$, we have

The error term is

where the last inequality follows from the assumption that $d_\downarrow\gg(\log n)^{4\xi}$. We now apply Lindeberg's CLT to the term $\frac{\sum_j h_{ij}d_j}{s_n}$. The Lindeberg condition for the CLT reads

Equation (4.10)

Defining $\sigma^2_j(i) = \textbf{Var}(h_{ij}d_j)$, we note that

Let us finally examine the event

By definition, $|h_{ij}|\lt1$. If we show that

then for all ε > 0 there exists nε such that the event

has probability 1. Indeed,

for a suitable constant C. Thus, (4.10) holds.

Acknowledgment

P D, R S H, F d H and M M are supported by the Netherlands Organisation for Scientific Research (NWO) through the Gravitation-grant NETWORKS-024.002.003. D G is supported by the Dutch Econophysics Foundation (Stichting Econophysics, Leiden, The Netherlands) and by the European Union—Horizon 2020 Program under the scheme 'INFRAIA-01-2018-2019 - Integrating Activities for Advanced Communities', Grant Agreement N.871042, 'SoBigData++: European Integrated Infrastructure for Social Mining and Big Data Analytics'.

Data availability statement

No new data were created or analysed in this study.

Footnotes

  • 6. This work was performed using the compute resources from the Academic Leiden Interdisciplinary Cluster Environment (ALICE) provided by Leiden University.
