
Central limit theorem for the principal eigenvalue and eigenvector of Chung–Lu random graphs

Pierfrancesco Dionigi, Diego Garlaschelli, Rajat Subhra Hazra, Frank den Hollander and Michel Mandjes

Published 22 February 2023 • © 2023 The Author(s). Published by IOP Publishing Ltd
Focus on Random Matrices and Complex Networks
Citation: Pierfrancesco Dionigi et al 2023 J. Phys. Complex. 4 015008. DOI: 10.1088/2632-072X/acb8f7


Abstract

A Chung–Lu random graph is an inhomogeneous Erdős–Rényi random graph in which vertices are assigned average degrees, and pairs of vertices are connected by an edge with a probability that is proportional to the product of their average degrees, independently for different edges. We derive a central limit theorem for the principal eigenvalue and the components of the principal eigenvector of the adjacency matrix of a Chung–Lu random graph. Our derivation requires certain assumptions on the average degrees that guarantee connectivity, sparsity and bounded inhomogeneity of the graph.


Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 license. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction, main results and discussion

1.1. Introduction

The spectral properties of adjacency matrices play an important role in various areas of network science. In the present paper we consider an inhomogeneous version of the Erdős–Rényi random graph called the Chung–Lu random graph and we derive a central limit theorem for the principal eigenvalue and eigenvector of its adjacency matrix.

1.1.1. Setting

Recall that the homogeneous Erdős–Rényi random graph has vertex set $[n] = \{1,\ldots,n\}$, and each edge is present with probability p and absent with probability $1-p$, independently for different edges, where $p \in (0,1)$ may depend on n (in what follows we often suppress the dependence on n from the notation; the reader is however warned that most quantities depend on n). The average degree is the same for every vertex and equals $(n-1)p$ when self-loops are not allowed, and np when self-loops are allowed (and are considered to contribute to the degrees of the vertices). In [15] the following generalisation of the Erdős–Rényi random graph is considered, called the Chung–Lu random graph, with the goal of accommodating general degrees. Given a sequence of degrees $\vec{d}_n = (d_i)_{i \in [n]}$, consider the random graph $\mathcal{G}_n(\vec{d}_n)$ in which to each pair $i,j$ of vertices an edge is assigned independently with probability $p_{ij} = d_id_j/m_1$, where $m_1 = \sum_{i = 1}^n d_i$ (for computational simplicity we allow self-loops). Here, the degrees act as vertex weights: vertices with low weights are likely to have fewer neighbours than vertices with high weights, which act as hubs (see [33, chapter 6] for a general introduction to generalised random graphs). If $d_\uparrow^2 \leqslant m_1$ with $d_\uparrow = \max_{i\in [n]} d_i$, then $p_{ij} \leqslant 1$ for all $i,j \in [n]$, and the sequence $\vec{d}_n$ is graphical. Note that in $\mathcal{G}_n(\vec{d}_n)$ the expected degree of vertex i is $d_i$. The classical Erdős–Rényi random graph (with self-loops) corresponds to $d_i = np$ for all $i \in [n]$.
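To fix ideas, here is a minimal sketch (ours, not part of the paper) of how $\mathcal{G}_n(\vec{d}_n)$ can be sampled exactly as defined above; the degree sequence in the example is hypothetical.

```python
import numpy as np

def sample_chung_lu(d, seed=None):
    """Adjacency matrix of G_n(d): edge {i,j} present with probability
    p_ij = d_i d_j / m_1, independently, self-loops allowed."""
    rng = np.random.default_rng(seed)
    d = np.asarray(d, dtype=float)
    p = np.outer(d, d) / d.sum()          # requires max(d)^2 <= m_1, so p_ij <= 1
    A = np.triu(rng.random(p.shape) < p).astype(int)   # independent draws, i <= j
    return A + A.T - np.diag(np.diag(A))               # symmetrize

d = np.tile([20, 40], 500)                # n = 1000, expected degrees 20 and 40
A = sample_chung_lu(d, seed=1)
print(A.sum(axis=0).mean(), d.mean())     # empirical vs prescribed mean degree
```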

1.1.2. Principal eigenvalue and eigenvector

The largest eigenvalue of the adjacency matrix A and its corresponding eigenvector, written as $(\lambda_1,v_1)$, contain important information about the random graph. Several community detection techniques depend on a proper understanding of these quantities [1, 25, 32], which in turn play an important role for various measures of network centrality [26, 27] and for the properties of dynamical processes (such as the spread of an epidemic) taking place on networks [12, 28]. For Erdős–Rényi random graphs, it was shown in [24] that with high probability (whp in the following) λ1 scales like

$\lambda_1 = [1+{{\mathrm{o}}}(1)]\,\max\!\big\{\sqrt{D_\infty},\, np\big\}, \qquad (1.1)$

where $D_\infty$ is the maximum degree. This result was partially extended to $\mathcal{G}_n(\vec{d}_n)$ in [16], and more recently to a class of inhomogeneous Erdős–Rényi random graphs in [5, 6]. For a related discussion on the behaviour of $(\lambda_1,v_1)$ in real-world networks, see [12, 28]. In the present paper we analyse the fluctuations of $(\lambda_1,v_1)$. We will be interested specifically in the case where λ1 is detached from the bulk, which for Erdős–Rényi random graphs occurs when $\lambda_1 \sim np$ whp, and for Chung–Lu random graphs when $\lambda_1 \sim m_2/m_1$, where $m_2 = \sum_{i\in [n]} d_i^2$. Note that the quotient $m_2/m_1$ arises from the fact that the average adjacency matrix is rank one and that its only non-zero eigenvalue is $m_2/m_1$. Such rank-one perturbations of a symmetric matrix with independent entries became prominent after the work in [4]. Later studies extended this work to finite-rank perturbations [3, 7, 10, 11, 20, 21]. Erdős–Rényi random graphs differ from the classical Wigner setting in that the rank-one perturbation lives on a scale different from $\sqrt{n}$. For Chung–Lu random graphs we assume that $m_2/m_1\to\infty$.
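To spell out why $m_2/m_1$ appears: writing $e = (d_1,\ldots,d_n)^t/\sqrt{m_1}$ (the vector defined in (1.3) below), we have $(\mathbb{E}[A])_{ij} = d_id_j/m_1 = (ee^t)_{ij}$, hence $\mathbb{E}[A]\,e = \langle e,e\rangle\, e = \big(\sum_{i\in[n]} d_i^2/m_1\big)\, e = (m_2/m_1)\, e$, while every vector orthogonal to e lies in the kernel of $\mathbb{E}[A]$. The spectrum of $\mathbb{E}[A]$ is therefore $\{m_2/m_1, 0, \ldots, 0\}$.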

In the setting of inhomogeneous Erdős–Rényi random graphs, finite-rank perturbations were studied in [13]. In that paper the connection probability between i and j is given by $p_{ij} = \varepsilon_n f(i/n, j/n)$, where $f\colon\,[0,1]^2\to [0,1]$ is almost everywhere continuous and of finite rank, $\varepsilon_n \in [0,1]$ and $n\varepsilon_n\gg (\log n)^{8}$. However, for a Chung–Lu random graph with a given degree sequence it is not always possible to construct an almost everywhere continuous function f independent of n such that $\varepsilon_n f(i/n, j/n) = d_i d_j/m_1$. In the present paper we extend the analysis in [13] to Chung–Lu random graphs by focussing on $(\lambda_1,v_1)$. For Erdős–Rényi random graphs it was shown in [18, 19] that λ1 satisfies a central limit theorem (CLT) and that v1 aligns with the unit vector. These papers extend the seminal work carried out in [22].

1.1.3. Chung–Lu random graphs

In the present paper, subject to mild assumptions on $\vec{d}_n$, we extend the CLT for λ1 from Erdős–Rényi random graphs to Chung–Lu random graphs, and derive a pointwise CLT for v1 as well. It was shown in [16] that if $m_2/m_1\gg \sqrt{d_\uparrow}\,(\log n)$, then $\lambda_1 \sim m_2/m_1$ whp, while if $\sqrt{d_\uparrow} \gg (m_2/m_1)( \log n)^2$, then $\lambda_1 \sim \sqrt{d_\uparrow}$ whp. In fact, examples show that a result similar to (1.1) does not hold in general, and that λ1 need not scale like $\max\{m_2/m_1, \sqrt{d_\uparrow}\}$. These facts clearly show that the behaviour of λ1 is controlled by subtle assumptions on the degree sequence. In what follows we stick to a bounded inhomogeneity regime where $m_2/m_1\asymp d_\uparrow$.

The behaviour of v1 is interesting and challenging, and is of major interest for applications. One of the crucial properties to look for in eigenvectors is the phenomenon of localization versus delocalization. An eigenvector is called localized when its mass concentrates on a small number of vertices, and delocalized when its mass is approximately uniformly distributed on the vertices. The complete delocalization picture for Erdős–Rényi random graphs was given in [19]. In fact, it was proved that v1 is close to the scaled unit vector in the $\ell_\infty$-norm. In the present paper we do not study localization versus delocalization for Chung–Lu random graphs in detail, but we do show that in a certain regime there is strong evidence for delocalization because v1 is close to the scaled unit vector. In [9, corollary 1.3] the authors found that the eigenvectors of a Wigner matrix with independent standard Gaussian entries are distributed according to the Haar measure on the orthogonal group, and that the coordinates have Gaussian fluctuations after appropriate scaling. Our work shows that such coordinate-wise fluctuations hold as well for the principal eigenvector of the non-centered Chung–Lu adjacency matrix, and that they are Gaussian after appropriate centering and scaling.

1.1.4. Outline

In section 1.2 we define the Chung–Lu random graph, state our assumption on the degree sequence, and formulate two main theorems: a CLT for the largest eigenvalue and a CLT for its associated eigenvector. In section 1.3 we discuss these theorems and place them in their proper context. Section 2 includes simulations for different graph sizes and degree sequences. Section 3 contains the proof of the CLT of the eigenvalue and section 4 studies the properties of the principal eigenvector.

1.2. Main results

1.2.1. Set-up

Let $\mathbb{G}_n$ be the set of simple graphs with n vertices. Let $\vec{d}_n = (d_i)_{i \in [n]}$ be a sequence of degrees, such that $d_i\in \mathbb{N}$ for all $i\in [n]$, and abbreviate

$m_k = \sum_{i \in [n]} d_i^{k} \ (k\in\mathbb{N}), \qquad d_\downarrow = \min_{i \in [n]} d_i, \qquad d_\uparrow = \max_{i \in [n]} d_i.$

Note that these numbers depend on n, but in the sequel we will suppress this dependence. For each pair of vertices $i,j$ (not necessarily distinct), we add an edge independently with probability

$p_{ij} = \frac{d_i d_j}{m_1}. \qquad (1.2)$

The resulting random graph, which we denote by $\mathcal{G}_n(\vec{d}_n)$, is referred to in the literature as the Chung–Lu random graph. In [15] it was assumed that $d_\uparrow^2 \leqslant m_1$ to ensure that $p_{ij} \leqslant 1$. In the present paper we need sharper restrictions.

Assumption 1.1. Throughout the paper we need two assumptions on $\vec{d}_n$ as $n\to\infty$:

  • (D1) Connectivity and sparsity: there exists a ξ > 2 such that $(\log n)^{2\xi} \ll d_\uparrow \ll \sqrt{n}$.
  • (D2) Bounded inhomogeneity: $d_\downarrow \asymp d_\uparrow$.

$\spadesuit$

The lower bound in assumption 1.1(D1) guarantees that the random graph is connected whp and that it is not too sparse. The upper bound is needed in order to have $d_\uparrow = {{\mathrm{o}}}(\sqrt{m_1})$, which implies that (1.2) is well defined. Assumption 1.1(D2) is a restriction on the inhomogeneity of the model and requires that the smallest and the largest degree are comparable.

Remark 1.2. The lower bound on $d_\uparrow$ in assumption 1.1(D1) can be seen as an adaptation to our setting of the main condition in [16, theorem 2.1] for the asymptotics of λ1. As mentioned in section 1.1, under the assumption

$\frac{m_2}{m_1} \gg \sqrt{d_\uparrow}\,\log n,$

[16] shows that $\lambda_1 = [1+{{\mathrm{o}}}(1)]\,m_2/m_1$ whp. It is easy to see that the lower bound in assumption 1.1(D1), together with assumption 1.1(D2), implies the above condition. $\spadesuit$

Remark 1.3. When $d_\uparrow\ll n^{1/6}$, [33, theorem 6.19] implies that our results also hold for the Generalized Random Graph (GRG) model with the same average degrees. This model is defined by choosing connection probabilities of the form

$p_{ij} = \frac{d_i d_j}{m_1 + d_i d_j},$

and arises in statistical physics as the canonical ensemble constrained on the expected degrees, which is also called the canonical configuration model. Note that in the above connection probability, di plays the role of a hidden variable, or a Lagrange multiplier controlling the expected degree of vertex i, but does not in general coincide with the expected degree itself. However, under the assumptions considered here, di does coincide with the expected degree asymptotically. The reader can find more about GRG and their use in [33, Chapter 6], and about their role in statistical physics in [31]. In the corresponding microcanonical ensemble the degrees are not only fixed in their expectation but they take a precise deterministic value, which corresponds to the microcanonical configuration model. The two ensembles were found to be nonequivalent in the limit as $n\to\infty$ [30]. This result was shown to imply a finite difference between the expected values of the largest eigenvalue λ1 in the two models [17] when the degree sequence was chosen to be constant ($d_i = d$ for all $i \in [n]$). In this latter case the canonical ensemble reduces to the Erdős–Rényi random graph with $p = d/n$, while the microcanonical ensemble reduces to the d-regular random graph model. Although ensemble nonequivalence is not our main focus here, we will briefly relate some of our results to this phenomenon. $\spadesuit$
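As a quick numerical illustration (ours, with a hypothetical homogeneous degree sequence): the GRG and Chung–Lu connection probabilities differ by a relative amount $d_id_j/(m_1+d_id_j)$, which vanishes when $d_\uparrow = {{\mathrm{o}}}(\sqrt{m_1})$.

```python
import numpy as np

# Relative gap between the Chung-Lu probability d_i d_j / m_1 and the GRG
# probability d_i d_j / (m_1 + d_i d_j), for a hypothetical homogeneous sequence.
n, deg = 10**5, 30.0
m1 = n * deg
p_cl = deg * deg / m1
p_grg = deg * deg / (m1 + deg * deg)
print(p_cl, p_grg, 1 - p_grg / p_cl)   # relative gap = d_i d_j/(m_1 + d_i d_j) ~ 3e-4
```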

1.2.2. Notation

Let A be the adjacency matrix of $\mathcal{G}_n(\vec{d}_n)$ and $\mathbb{E}[A]$ its expectation. The (i, j)th entry of $\mathbb{E}[A]$ equals pij in (1.2). The entries of $A-\mathbb{E}[A]$ are centered Bernoulli random variables with parameters pij , independent up to symmetry. Let $\lambda_1\geqslant \ldots\geqslant \lambda_n$ be the eigenvalues of A and let $v_1, \ldots, v_n$ be the corresponding eigenvectors. The vector e will be the n-dimensional column vector

$e = \frac{1}{\sqrt{m_1}}\,\big(d_1, \ldots, d_n\big)^{t}, \qquad (1.3)$

where t stands for transpose. It is easy to see that $\mathbb{E}[A] = ee^t$.

Definition 1.4. Following [19], we say that an event $\mathcal{E}$ holds with $(\xi,\nu)$-high probability (written $(\xi,\nu)$-hp) when there exist ξ > 2 and ν > 0 such that

$\mathbb{P}[\mathcal{E}^{c}] \leqslant \mathrm{e}^{-\nu(\log n)^{\xi}} \quad \text{for } n \text{ large enough}. \qquad (1.4)$

   $\spadesuit$

Note that this is different from the classical notion of whp, because it comes with a specific rate.

Remark 1.5. Our results hold for any ν > 0 as soon as ξ > 2 (think of ν = 1). The role of ν becomes important when we consider specific subsets $\mathcal{S}$ of the event space and split into $\mathcal{S}\cap\mathcal{E}$ and $\mathcal{S}\cap\mathcal{E}^c$ (see e.g. [19]).$\spadesuit$

We write $\stackrel{w}{\longrightarrow}$ to denote weak convergence as $n\to\infty$, and use the symbols ${{\mathrm{o}}},\mathrm{O}$ to denote asymptotic order for sequences of real numbers.

1.2.3. CLT for the principal eigenvalue

Our first theorem identifies two terms in the expectation of the largest eigenvalue, and shows that the largest eigenvalue follows a central limit theorem.

Theorem 1.6. Under assumption 1.1, the following hold:

  • (I)  $\mathbb{E}[\lambda_1] = \frac{m_2}{m_1}+\frac{m_1m_3}{m_2^{2}}+{{\mathrm{o}}}(1)$ as $n\to\infty$.
  • (II)  $\frac{\lambda_1-\mathbb{E}[\lambda_1]}{\sigma_1} \stackrel{w}{\longrightarrow} N(0,1)$ as $n\to\infty$,
    where $\sigma_1^{2} = \frac{2m_3^{2}}{m_1m_2^{2}}$.

1.2.4. CLT for the principal eigenvector

Our second theorem shows that the principal eigenvector is parallel to the normalised degree vector, and is close to this vector in $\ell^\infty$-norm. It also identifies the expected value of the components of the principal eigenvector, and shows that the components follow a central limit theorem.

Theorem 1.7. Let $\tilde{e} = e\sqrt{m_1/m_2}$ be the $\ell^2$-normalized degree vector. Let v1 be the eigenvector corresponding to λ1 and let $v_1(i)$ denote the ith coordinate of v1. Under assumption 1.1, the following hold:

  • (I)  
    $\langle v_1,\tilde{e} \rangle = 1+{{\mathrm{o}}}(1)$ as $n\to\infty$  with $(\xi,\nu)$-hp .
  • (II)  
    $\|v_1- \tilde{e}\|_{\infty} \leqslant \mathrm{O}\left(\frac{(\log n)^{\xi}}{\sqrt{n d_\uparrow}}\right)$ as $n\to\infty$  with $(\xi,\nu)$-hp .
  • (III)  
    $\mathbb{E}[v_1(i)] = \frac{d_i}{\sqrt{m_2}}+\mathrm{O}\left(\frac{(\log n)^{2\xi}}{\sqrt{m_2}}\right)$ as $n\to\infty$.

Moreover, if the lower bound in assumption 1.1(D1) is strengthened to $(\log n)^{4\xi} \ll d_\uparrow$, then for all $i \in [n]$,

  • (IV)  $\frac{v_1(i)-d_i/\sqrt{m_2}}{\sigma_2(i)} \stackrel{w}{\longrightarrow} N(0,1)$ as $n\to\infty$,
    where $\sigma_2(i)^{2} = \frac{m_1m_3\,d_i}{m_2^{3}}$.

1.3. Discussion

We place the theorems in their proper context.

  • (a)  
    Theorems 1.6 and 1.7 provide a CLT for $\lambda_1,v_1$. We note that $m_2/m_1$ is the leading order term in the expansion of λ1, while $m_1m_3/m_2^2$ is a correction term. We observe that theorem 1.6(I) does not follow from the results in [16], because the largest eigenvalue need not be uniformly integrable and also the second order expansion is not considered there. We also note that in theorem 1.6(II) the centering of the largest eigenvalue, $\mathbb{E}[\lambda_1]$, cannot be replaced by its asymptotic value as the error term is not compatible with the required variance.
  • (b)  
    The lower bound in assumption 1.1(D1) is needed to ensure that the random graph is connected, and is crucial because the largest eigenvalue is very sensitive to connectivity properties. Assumption 1.1(D2) is needed to control the inhomogeneity of the random graph. It plays a crucial role in deriving concentration bounds on the central moments $\langle e, (A-\mathbb{E}[A])^k e\rangle$, $k \in \mathbb{N}$, with the help of a result from [19]. Further refinements may come from different tools, such as the non-backtracking matrices used in [5, 6]. While assumption 1.1(D1) appears to be close to optimal, assumption 1.1(D2) is far from optimal. It would be interesting to allow for empirical degree distributions that converge to a limiting degree distribution with a power law tail.
  • (c)  
    As already noted, if the expected degrees are all equal to each other, i.e. $d_i = d$ for all $i \in [n]$, then the Chung–Lu random graph, or canonical configuration model, reduces to the homogeneous Erdős–Rényi random graph with $p = d/n$, while the corresponding microcanonical configuration model reduces to the homogeneous d-regular random graph model (here, all models allow for self-loops). This implies that, for the homogeneous Erdős–Rényi random graph with connection probability $p \gg (\log n)^{2\xi}/n$, ξ > 2, theorem 1.6(I) reduces to
    $\mathbb{E}[\lambda_1] = np+1+{{\mathrm{o}}}(1),$
    while theorem 1.6(II) reduces to
    $\frac{\lambda_1-\mathbb{E}[\lambda_1]}{\sqrt{2p}} \stackrel{w}{\longrightarrow} N(0,1).$
    Both these properties were derived in [18] for homogeneous Erdős–Rényi random graphs and also for rank-1 perturbations of Wigner matrices. In [17], the fact that $\mathbb{E}[\lambda_1]$ in the canonical ensemble differs by a finite amount from the corresponding expected value (here, d = np) in the microcanonical ensemble (d-regular random graph) was shown to be a signature of ensemble nonequivalence.
  • (d)  
    In case $d_i = d$ for all $i \in [n]$, theorem 1.7(IV) reduces to the following CLT, which was not covered by [18] and [17].

Corollary 1.8. For the Erdős–Rényi random graph with $(\log n)^{4\xi}/n \ll p \ll n^{-1/2}$ for some ξ > 2,

$n\sqrt{p}\left(v_1(i)-\frac{1}{\sqrt{n}}\right) \stackrel{w}{\longrightarrow} N(0,1) \qquad \text{for all } i \in [n].$

Note that, in the corresponding microcanonical ensemble (d-regular random graph), v1 coincides with the constant vector where $v_1(i) = 1/\sqrt{n}$ for all $i \in [n]$. Therefore in the canonical ensemble each coordinate $v_1(i)$ has Gaussian fluctuations around the corresponding deterministic value for the microcanonical ensemble. This behaviour is similar to the degrees having, in the canonical configuration model, either Gaussian (in the dense setting) or Poisson (in the sparse setting) fluctuations around the corresponding deterministic degrees for the microcanonical configuration model [23].

  • (e)  
    One way to satisfy assumption 1.1 is to specify functions $\omega,c_1,\ldots,c_n$, satisfying $(\log n)^{2\xi}\ll\omega(n)\ll \sqrt{n}$ and $c \leqslant c_1(n) \leqslant \ldots \leqslant c_n(n)\leqslant C$ with $0 \lt c\leqslant C \lt \infty$, such that
    $d_i = c_i(n)\,\omega(n), \qquad i \in [n].$
    The reason why we avoid such a description is that our setting is potentially broader. The concentration estimate in lemma 3.4 requires us to assume homogeneous degree sequences as above, while theorem 1.6(I) holds for much more general degree sequences. A further refinement of lemma 3.4 may be possible. The advantage of the above description is that it makes the scale $\omega(n)$ on which the degrees live explicit. However, most of the bounds in our proofs depend on some power of $d_\uparrow$, up to some multiplicative constant. This means that, in the bounded inhomogeneity setting, expressing the asymptotics through $\omega(n)$ or $d_\uparrow$ is equivalent. Bounds expressed through $\omega(n)$ would cease to be meaningful as soon as we manage to push beyond the bounded inhomogeneity setting of our model, while the skeleton of our proof would still hold.
  • (f)  
    In [14] the empirical spectral distribution of A was considered under the assumption that
    which is weaker than assumption 1.1. It was shown that if $\mu_n \stackrel{w}{\longrightarrow} \mu$ with $\mu_n = n^{-1} \sum_{i = 1}^n \delta_{d_i/d_\uparrow}$ and µ some probability distribution on $\mathbb{R}$, then
    with $\mu_{\mathrm{sc}}$ the Wigner semicircle law and $\boxtimes$ the free multiplicative convolution. Since $\mu\boxtimes \mu_{\mathrm{sc}}$ is compactly supported, this shows that the scaling for the largest eigenvalue and the spectral distribution are different.

2. Simulations

Theorems 1.6 and 1.7 show that, after proper scaling and under certain conditions of sparsity and homogeneity, the largest eigenvalue and the components of the largest eigenvector exhibit Gaussian behaviour in the limit as $n\to\infty$. A natural question is how these quantities behave for finite n. Indeed, real-world networks have sizes that range from $n = 10^2$ to $n = 10^9$. Another question is computational feasibility. Indeed, our CLTs require the degrees to lie between $(\log n)^{4}$ (respectively, $(\log n)^{8}$) and $\sqrt{n}$. For this window to be non-empty, n must be at least $10^{11}$ (respectively, $10^{29}$), which is unrealistic. Let us therefore see what simulations have to say (see footnote 6).
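To make the feasibility estimate concrete, a back-of-the-envelope computation (ours) locates where the window between $(\log n)^{p}$ and $\sqrt{n}$ first opens up, for p = 4 and p = 8:

```python
import numpy as np

# First n for which (log n)^p < sqrt(n), i.e. the degree window is non-empty;
# p = 4 corresponds to (log n)^{2 xi} with xi slightly above 2, p = 8 to (log n)^{4 xi}.
for p in (4, 8):
    x = np.linspace(10, 100, 1_000_000)        # x = log n (natural log), n beyond e^10
    i = np.argmax(x ** p < np.exp(x / 2))      # first x where sqrt(n) overtakes (log n)^p
    print(f"p = {p}: n ~ {np.exp(x[i]):.1e}")  # ~2e11 for p = 4, ~2e29 for p = 8
```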

2.1. Largest eigenvalue

In figure 1 we show histograms for the quantity

$\bar{\lambda}_1 = \frac{m_2\sqrt{m_1}}{m_3}\left(\lambda_1 - \frac{m_2}{m_1} - \frac{m_1 m_3}{m_2^{2}}\right),$

which should be close to normal with mean 0 and variance 2 (for $\mathbb{E}[\lambda_1]$ the correction term ${{\mathrm{o}}}(1)$ is neglected). The convergence is fast: already for n = 500 the Gaussian shape emerges and represents an excellent fit: the sample mean µ is close to 0 and the sample standard deviation σ is close to $\sqrt{2}$.
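The following sketch (ours) reproduces this experiment on a small scale; the degree sequence is a hypothetical example, and the normalisation of $\bar{\lambda}_1$ is the one displayed above.

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 500, 1000
d = np.repeat([10.0, 15.0, 20.0, 25.0, 30.0], n // 5)  # each value appears n/5 times
m1, m2, m3 = d.sum(), (d**2).sum(), (d**3).sum()
p = np.outer(d, d) / m1

obs = np.empty(reps)
for r in range(reps):
    A = np.triu(rng.random((n, n)) < p).astype(float)
    A = A + A.T - np.diag(np.diag(A))                  # symmetric, self-loops kept
    lam1 = np.linalg.eigvalsh(A)[-1]                   # largest eigenvalue
    obs[r] = (lam1 - m2/m1 - m1*m3/m2**2) * m2 * np.sqrt(m1) / m3
print(obs.mean(), obs.var())                           # expect ~0 and ~2
```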


Figure 1. Histograms of $\bar{\lambda}_1$ for different graph sizes n and degree sequences $\vec{d}$. The sample size for each regime is $10^4$. Each element specified in the degree sequence appears $\frac{n}{5}$ times. In red is plotted the Gaussian fit; µ is the sample mean (represented by a dashed line in the histogram), σ is the sample standard deviation. We expect µ ≈ 0 and $\sigma \approx \sqrt{2}$.


2.2. Largest eigenvector

In figure 2 we show histograms for the quantity

$\bar{v}_1(i) = \sqrt{\frac{m_2^{3}}{m_1 m_3 d_i}}\left(v_1(i) - \frac{d_i}{\sqrt{m_2}}\right),$

which should be close to normal with mean 0 and variance 1. The fit is again excellent.
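An analogous sketch (ours) for a single coordinate, with a hypothetical degree sequence and index chosen in the spirit of figure 2:

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps, i = 500, 1000, 399                  # coordinate with d_i = 20 (0-based index)
d = np.repeat([5.0, 10.0, 15.0, 20.0, 25.0], n // 5)
m1, m2, m3 = d.sum(), (d**2).sum(), (d**3).sum()
p = np.outer(d, d) / m1

obs = np.empty(reps)
for r in range(reps):
    A = np.triu(rng.random((n, n)) < p).astype(float)
    A = A + A.T - np.diag(np.diag(A))
    v1 = np.linalg.eigh(A)[1][:, -1]         # eigenvector of the largest eigenvalue
    v1 *= np.sign(v1.sum())                  # fix the sign (Perron direction)
    obs[r] = (v1[i] - d[i]/np.sqrt(m2)) * np.sqrt(m2**3 / (m1 * m3 * d[i]))
print(obs.mean(), obs.var())                 # expect ~0 and ~1
```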


Figure 2. Histograms of $\bar{v}_1(i)$ for different graph sizes n and degree sequences $\vec{d}$. For each of the images, i is chosen to be the last i such that di is equal to the 4th element of the corresponding degree sequence (e.g. for n = 500, $v_1(400)$ was analysed with $d_{400} = 20$). The sample size for each regime is $10^4$. Each element in the degree sequence appears $\frac{n}{5}$ times. In red is plotted the Gaussian fit; µ is the sample mean (represented by a dashed line in the histogram), σ is the sample standard deviation. We expect µ ≈ 0 and σ ≈ 1.


2.3. Degrees of order log n and $\sqrt{n}$

What happens when the degrees are of order $\log n$? As can be seen in figure 3, in that range the Gaussian approximation for the largest eigenvalue is visibly worse, especially for the centering. The same happens for the components of the largest eigenvector, as can be seen in figure 4, where the Gaussian shape is lost and two peaks appear.


Figure 3. Histograms of $\bar{\lambda}_1$ for different graph sizes n and degree sequences $\vec{d}$ of order $\log n$. The sample size for each regime is $10^4$. Each element specified in the degree sequence appears $\frac{n}{4}$ times. In red is plotted the Gaussian fit; µ is the sample mean (represented by a dashed line in the histogram), σ is the sample standard deviation. If theorem 1.6 were to hold in this regime, then we would expect µ ≈ 0 and $\sigma \approx \sqrt{2}$.


Figure 4. Histograms of $\bar{v}_1(i)$ for different graph sizes n and degree sequences $\vec{d}$ of order $\log n$. For each of the images, i has been chosen to be the last i such that di is equal to the 3rd element of the specified degree sequence (e.g. for n = 500, $v_1(375)$ was analysed with $d_{375} = 8$). The sample size for each regime is $10^4$. Each element specified in the degree sequence appears $\frac{n}{4}$ times. In red is plotted the Gaussian fit; µ is the sample mean (represented by a dashed line in the histogram), σ is the sample standard deviation. If theorem 1.7 were to hold in this regime, then we would expect µ ≈ 0 and σ ≈ 1.


3. Proof of theorem 1.6

In what follows we use the well-known method of writing the adjacency matrix as a rank-one perturbation of the centered matrix and analysing the largest eigenvalue through the resulting resolvent identity. This method was previously successfully employed in [19, 22, 29].

Given the adjacency matrix A of our graph G, we can write $A = H+\mathbb{E}[A]$ with $H = A-\mathbb{E}[A]$. Let v1 be the eigenvector associated with the eigenvalue λ1. Then

$\lambda_1 v_1 = A v_1 = H v_1 + \left\langle e, v_1\right\rangle e.$

Using that $\mathbb{E}[A] = ee^t$, we have $(\lambda_1 I - H) v_1 = \left\langle e, v_1\right\rangle e$, where I is the n×n identity matrix. It follows that if λ1 is not an eigenvalue of H, then the matrix $(\lambda_1 I - H)$ is invertible, and so

$v_1 = \left\langle e, v_1\right\rangle\,(\lambda_1 I - H)^{-1} e. \qquad (3.1)$

Eliminating the eigenvector v1 from the above equation, we get

$1 = \left\langle e, (\lambda_1 I - H)^{-1} e\right\rangle,$

where we use that $\left\langle e, v_1\right\rangle \neq 0$ (since λ1 is not an eigenvalue of H). Note that this can be expressed as

$\lambda_1 = \left\langle e, \Big(I - \frac{H}{\lambda_1}\Big)^{-1} e\right\rangle = \sum_{k = 0}^{\infty} \frac{\left\langle e, H^{k} e\right\rangle}{\lambda_1^{k}}, \qquad (3.2)$

where the validity of the series expansion will be an immediate consequence of lemma 3.2 below.

Section 3.1 derives bounds on the spectral norm of H. Section 3.2 analyses the expansion in (3.2) and proves the scaling of $\mathbb{E}[\lambda_1]$. Section 3.3 is devoted to the proof of the CLT for λ1, section 4 to the proof of the CLT for v1. In the expansion we distinguish three ranges: (a) $k = 0,1,2$; (b) $3 \leqslant k \leqslant L$; (c) $L \lt k \lt \infty$, where

$L = \lfloor \log n \rfloor.$

We will show that (a) controls the mean and the variance in both CLTs, while (b)–(c) are negligible error terms.
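Before turning to the proofs, here is a numerical sanity check (ours) of the identity (3.2) truncated at $L = \lfloor \log n\rfloor$, for a hypothetical degree sequence:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2000
d = np.repeat([25.0, 30.0, 35.0, 40.0], n // 4)
m1 = d.sum()
p = np.outer(d, d) / m1                      # E[A] = e e^t with e = d / sqrt(m1)
A = np.triu(rng.random((n, n)) < p).astype(float)
A = A + A.T - np.diag(np.diag(A))
H, e = A - p, d / np.sqrt(m1)
lam1 = np.linalg.eigvalsh(A)[-1]

L = int(np.log(n))                           # truncation level L = floor(log n)
partial, Hke = 0.0, e.copy()
for k in range(L + 1):
    partial += (e @ Hke) / lam1**k           # adds <e, H^k e> / lambda_1^k
    Hke = H @ Hke
print(lam1, partial)                         # the truncated series nearly recovers lambda_1
```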

3.1. The spectral norm

In order to study λ1, we need good bounds on the spectral norm of H. The spectral norm of matrices with inhomogeneous entries has been studied in a series of papers [2, 5, 6] for different density regimes.

An important role is played by $\lambda_1(\mathbb{E}[A])$. In recent literature this quantity has been shown to play a prominent role in the so-called BBP-transition [4]. Given our setting (1.2), it is easy to see that

$\lambda_1(\mathbb{E}[A]) = \langle e, e\rangle = \frac{m_2}{m_1}, \qquad (3.3)$

while all other eigenvalues of $\mathbb{E}[A]$ are zero.

Remark 3.1. Since $d_\downarrow\leqslant \frac{m_2}{m_1}\leqslant d_\uparrow$, assumption 1.1(D2) implies that

$\frac{m_2}{m_1} \asymp d_\uparrow. \qquad (3.4)$

   $\spadesuit$

We start with the following lemma, which ensures concentration of λ1 and is a direct consequence of the results in [6], whose conditions are matched by assumption 1.1. In particular, we use [6, theorem 3.2] to check that the boundaries of the bulk of the spectral distribution live on a scale smaller than the scale of λ1.

Lemma 3.2. Under assumption 1.1, with $(\xi,\nu)$-hp

$\left|\lambda_1 - \frac{m_2}{m_1}\right| \leqslant C\sqrt{d_\uparrow} \quad \text{for some constant } C \gt 0,$

and consequently

$\lambda_1 = [1+{{\mathrm{o}}}(1)]\,\frac{m_2}{m_1}.$

Proof. In the proof it is understood that all statements hold with $(\xi,\nu)$-hp in the sense of (1.4). Let $A = H+\mathbb{E}[A]$. Due to Weyl's inequality, we have that

$|\lambda_1(A) - \lambda_1(\mathbb{E}[A])| \leqslant \|H\|.$

From [6, theorem 3.2] we know that there is a universal constant C > 0 such that

where

with κ defined by

Thanks to assumption 1.1(D2), we have $\kappa = \mathrm{O}(1)$. By [6, remark 3.1] (which gives us that $q = \sqrt{d_\uparrow}$ for n large enough) and assumption 1.1, we get that

Equation (3.5)

Using [8, example 8.7] or [6, equation 2.4] (the Talagrand inequality), we know that there exists a universal constant c > 0 such that

For $t = \sqrt{\nu (\log n)^\xi}$,

Equation (3.6)

Thus, we have

Equation (3.7)

Using that $\lambda_1(\mathbb{E}[A]) = m_2/m_1$, we have that with $(\xi,\nu)$-hp the following bound holds:

Via assumption 1.1 and (3.4) the claim follows.

Remark 3.3. 

  • (a)  
    The proof of lemma 3.2 works well if we replace assumption 1.1(D2) by a milder condition. Indeed, the former is directly linked to the parameter κ that appears in the proof of lemma 3.2 and in the proof of [6, theorem 3.2], which contains a more general condition on the inhomogeneity of the degrees.
  • (b)  
    Note that a consequence of proof of lemma 3.2 is that with $(\xi,\nu)$-hp
    Equation (3.8)
    for some $C_0\in (0,1)$. This allows us to claim that with $(\xi,\nu)$-hp the inverse
    Equation (3.9)
    exists.

   $\spadesuit$

Lemma 3.4. Let $1\leqslant k\leqslant L$. Then, under assumption 1.1, with $(\xi,\nu)$-hp

i.e.

Lemma 3.4 is a generalization to the inhomogeneous setting of [19, lemma 6.5]. We skip the proof because it requires a straightforward modification of the arguments in [19].

Lemma 3.5. Under assumption 1.1, for $2\leqslant k\leqslant L$, there exists a constant C > 0 such that

Equation (3.10)

Proof. Let $\mathcal E$ be the high probability event defined by (3.6), i.e.

Due to assumption 1.1(D1) we can bound the right-hand side by $Cd_\uparrow$. Since $\|e\|_2^2 = m_2/m_1$, on this event we have

We next show that the expectation evaluated on the complementary event is negligible. Indeed, observe that

where in the last inequality we use that $d_\uparrow = {{\mathrm{o}}}(\sqrt{m_1})$. This, combined with the exponential decay of the event $\mathcal E^c$, gives

and so the claim follows.

3.2. Expansion for the principal eigenvalue

We denote the event in lemma 3.2 by $\mathcal E$, which has high probability. As noted in remark 3.3(b), $I-\frac{H}{\lambda_1}$ is invertible on $\mathcal E$. Hence, expanding on $\mathcal E$, we get

We split the sum into two parts:

$\lambda_1 = \sum_{k = 0}^{L} \frac{\left\langle e, H^{k} e\right\rangle}{\lambda_1^{k}} + \sum_{k = L+1}^{\infty} \frac{\left\langle e, H^{k} e\right\rangle}{\lambda_1^{k}}. \qquad (3.11)$

First we show that we may ignore the second sum. To that end we observe that, by assumption 1.1 (D1), on the event $\mathcal E$ we can estimate

Equation (3.12)

Because of (3.12) and the fact that $\mathbb{E}(\left\langle e,H e\right\rangle) = 0$, (3.11) reduces to

Next, we estimate the second sum in the above equation. Using lemma 3.2, we get

From lemma 3.5 we have

where the last estimate follows from assumption 1.1(D1). Hence, on $\mathcal E$,

Iterating the expression for λ1 on the right-hand side, we get

Expanding the second and third terms, we get

Here we use that $\left\langle e, e\right\rangle = m_2/m_1\to \infty$, and we ignore several other terms because they are small with $(\xi,\nu)$-hp , for example,

One more iteration gives

Proof of theorem 1.6 (I) Since the probability of $\mathcal E^c$ decays exponentially with n, taking the expectation of the above term and using that $\mathbb{E}[\left\langle e, H e\right\rangle] = 0$, we obtain

Note that

and so we can write

Equation (3.13)

3.3. CLT for the principal eigenvalue

Again consider the high probability event on which (3.9) holds. Recall that from the series decomposition in (3.11) we have

Equation (3.14)

Lemma 3.6. The equation

Equation (3.15)

has a solution x0 satisfying

Proof. Define the function $h\colon\, (0,\infty)\to\mathbb R$ by

Since $\mathbb{E}[\langle e, He\rangle] = 0$, we have

For x > 0,

This shows that

Hence, for any $0\lt\delta \lt1$,

So, for large enough n,

Similarly, for any $0\lt\delta\lt1$,

This shows that there is a solution for (3.15), which lies in the interval $[ \frac{m_2}{m_1}(1-\delta), \frac{m_2}{m_1}(1+\delta)]$.

Lemma 3.7. Let x0 be a solution for (3.15). Define

Then

Proof of theorem 1.6 (II) From the previous lemmas we have

Therefore

and

Hence

Equation (3.16)

Observe that

Let

where we use the symmetry of the expression in the last equality. We can apply Lyapunov's central limit theorem, because $\{h_{i,j}\colon\, i\leqslant j\}$ is an independent collection of random variables and Lyapunov's condition is satisfied, i.e.

where K is a constant that does not depend on n. Hence

Returning to the eigenvalue equation in (3.16) and dividing by σ1, we have

We next prove lemma 3.7, on which the proof of the central limit theorem relied.

Proof. Note that by (3.14) and (3.15) we can write

Equation (3.17)

where

Thanks to lemmas 3.2 and 3.4 and (3.12), we have

Note that $L_n = {{\mathrm{o}}}( \frac{m_3}{m_2 \sqrt{m_1}})$. Indeed, using $m_3\geqslant n d_\downarrow^3$ and assumption 1.1(D1), we get

Observe that (3.17) can be rearranged as

Hence, bringing the second term from the right to the left, we have

Using the bounds on λ1 and x0, we get

We can therefore write

where $L_n = {{\mathrm{o}}}_P( \frac{m_3}{m_2 \sqrt{m_1}})$. Finally, to pass to Rn , note that

Equation (3.18)

To bound Rn , it is enough to show that the first term on the right-hand side is with $(\xi,\nu)$-hp bounded by $\frac{m_3}{m_2 \sqrt{m_1}}$. Using lemma 3.4 (for k = 1) and (3.7), we have with $(\xi,\nu)$-hp

Equation (3.19)

Using again assumption 1.1(D1), $m_3\geqslant n d_\downarrow^3$, $m_1\leqslant n d_\uparrow$ and $m_2\leqslant n d_\uparrow^2$, we get that

This controls the right-hand side of (3.19), and hence $R_n = {{\mathrm{o}}}( \frac{m_3}{m_2\sqrt{m_1}})$ with $(\xi,\nu)$-hp .

We want to show that Rn is negligible both pointwise and in expectation. We already know that it is negligible pointwise with $(\xi,\nu)$-hp ; it remains to show that the same bound holds in expectation. Let $\mathcal{A}$ be the high probability event of lemmas 3.2 and 3.4, and write

where $\textbf{1}_{\mathcal{A}}$ is the indicator function of the event $\mathcal{A}$. Since all the bounds hold on the high probability event $\mathcal A$, it is immediate that

The remainder can be bounded via the Cauchy–Schwarz inequality, namely,

We see that if $\mathbb{E}[|R_n|^2] = {{\mathrm{o}}}(\mathrm{e}^{-\nu(\log n)^\xi})$, then we are done. Expanding, we see that

for some C > 0, where we use that

and the trivial bound $|\left\langle e, He\right\rangle|\leqslant n^{C_*}$ for some $C_*\lt C$. Hence we have $\left(\mathbb{E}[|R_n|^2]\mathbb{E}[\textbf{1}_{A^c}]\right)^{\frac{1}{2}} \leqslant \mathrm{e}^{-\nu(\log n)^{\xi}}$ and

4. Proof of theorem 1.7

In this section we study the properties of the principal eigenvector. Let v1 be the normalized principal eigenvector, i.e. $\|v_1\| = 1$, and let e be as defined in (1.3). Recall from (3.1) that

and after inversion (which is possible on the high probability event) we have

If K denotes the normalization factor, then we can rewrite the above equation with $(\xi,\nu)$-hp as the series

Equation (4.1)

Our first step is to determine the value of K in (4.1). We adapt the results from [19] to derive a component-wise central limit theorem in the inhomogeneous setting described by (1.2) under assumption 1.1. By the normalization of v1,

Equation (4.2)

where we use the symmetry of H.

The following lemma settles theorem 1.7(I).

Lemma 4.1. Under assumption 1.1, and with $\tilde{e} = e\sqrt{\frac{m_1}{m_2}}$, with $(\xi,\nu)$-hp

$\langle v_1, \tilde{e}\rangle = 1 + {{\mathrm{o}}}(1). \qquad (4.3)$

Proof. Recall that $L = \lfloor \log n\rfloor$. We rewrite (4.2) as

Equation (4.4)

We first show that the last two parts are negligible and then show that the main term of the first part is the term with k = 0, i.e. $\langle e,e\rangle = m_2/m_1$.

The last term in (4.4) is dealt with as follows. Using (3.8), we have with $(\xi,\nu)$-hp

with $c' = -\log(1-C_0)$, where we use that $\sum_{k = 0}^\infty (k+1)(1-c)^k = 1/c^2$ for $|1-c|\lt1$.

We tackle the second sum in (4.4) by using lemma 3.4. Indeed, with $(\xi,\nu)$-hp we have

where the constant varies in each step. By assumption 1.1(D1), the last term goes to zero.

As to the first term, note that by (3.5) for $k\geqslant3$ we have

The term with k = 1 is zero, while for k = 2 we have

for some constant c. After substituting these results into (4.4), we find

Equation (4.5)

and the proof follows by normalizing the vector e and using (4.1).

The following lemma is an immediate consequence of (4.1) and lemma 4.1.

Lemma 4.2. Under assumptions 1.1, with $(\xi,\nu)$-hp

Equation (4.6)

In order to estimate how the components of v1 concentrate, we need the following lemma.

Lemma 4.3. For $1 \leqslant k \leqslant L$, with $(\xi,\nu)$-hp

The proof of this lemma is a direct consequence of lemma 3.4 and is similar to that of [19, lemma 7.10]; we therefore skip it. An immediate corollary of the above estimate is the delocalized behaviour of the largest eigenvector stated in theorem 1.7(II).

Lemma 4.4. Let v1 be the normalized principal eigenvector, and $\tilde{e} = e\sqrt{\frac{m_1}{m_2}}$. Then with $(\xi,\nu)$-hp

Proof. Recall from (4.4) that

The last term is negligible with $(\xi,\nu)$-hp , because it is the tail sum of a geometrically decreasing sequence. For the sum over $1 \leqslant k \leqslant L$ we can use lemma 4.3 and the fact that $K/\lambda_1 = \sqrt{\frac{m_1}{m_2}}+ {{\mathrm{o}}}(1)$ with $(\xi,\nu)$-hp . So we have

The first term with $(\xi,\nu)$-hp is

and the error is uniform over all i. Indeed, with $(\xi,\nu)$-hp

Equation (4.7)

where we use assumption 1.1, remark 3.1 and (3.7). Since the detailed computations are similar to the previous arguments, we skip the details.

We next prove the central limit theorem for the components of the eigenvector stated in theorem 1.7(IV).

Theorem 4.5. Under assumption 1.1, with the extra assumption $d_\uparrow\gg (\log n)^{4\xi}$, for every $i \in [n]$,

$\sqrt{\frac{m_2^{3}}{m_1 m_3 d_i}}\left(v_1(i) - \frac{d_i}{\sqrt{m_2}}\right) \stackrel{w}{\longrightarrow} N(0,1).$

Proof. First we compute $\mathbb{E}[v_1(i)]$, and afterwards we show that the CLT holds componentwise.

We use the law of total expectation. Conditioning on the high probability event $\mathcal E$ in lemma 3.2, we can write the expectation of the normalized eigenvector v1 as

Because the components of a normalized n-dimensional vector are bounded, we know that

for some suitable constant $c_\nu\gt0$, dependent on ν and on the the bound on $v_1(i)$. On $\mathcal{E}$, we can expand v1 as

Using the notation $\mathbb{E}_{\mathcal E}$ for the conditional expectation on the event $\mathcal E$, we have

For the first term we have, using (4.5),

For the term corresponding to k = 1, we know that $\mathbb{E}[(He)(i)] = 0$ by construction on the whole space. However, under the event $\mathcal{E}$ we can show that its contribution is exponentially negligible. We have

Since $m_2/m_1\to\infty$, there exists a constant $\tilde C$ such that

We can therefore write

Here we use (3.7) to bound the difference $|\lambda_1-(m_2/m_1)|$. Next, write

where cν is a constant depending on ν, and we use that $|h_{ij}|\leqslant 1$ and $m_1 = \mathrm{O}\big(\mathrm{e}^{\frac{3}{2}\log n}\big)$. We can therefore conclude that

where $c^{\prime}_\nu\gt0$ is a suitable constant depending on ν, and possibly different from cν .

To bound the remaining expectation terms, we use lemma 4.3, which gives a bound on $(H^ke)(i)$ on the event $\mathcal{E}$. As before, we break up the sum into two contributions:

For the second term we have

Equation (4.8)

where we use (3.8) and $C_c = |\log (1-C_0)|$. The first term can be bounded via lemma 4.3, which gives

Equation (4.9)

Using the above bounds, taking expectations and using (4.5), we get

Thus, we have obtained that

which settles theorem 1.7(III).

We can write

where we replace the last terms of the expansion of v1 by the bounds derived above (note that these bounds are of the same order as the ones obtained for the same terms in expectation). The first term of the centered quantity $v_1(i)-d_i/\sqrt{m_2}$ is given by

This last error can be easily seen to be ${{\mathrm{o}}}\left(\frac{(\log n)^{2\xi}}{\sqrt{m_2}}\right)$. We can therefore write

We proceed to show that the first term on the right-hand side of the above equality gives a CLT when the expression is rescaled by an appropriate quantity, and the error term goes to zero. It turns out that

Multiplying by $\sqrt{\frac{m_2^3}{d_im_3m_1}}$, we have

The error term is

where the last inequality follows from the assumption that $d_\downarrow\gg(\log n)^{4\xi}$. We now apply Lindeberg's CLT to the term $\frac{\sum_j h_{ij}d_j}{s_n}$. The Lindeberg condition for the CLT reads

Equation (4.10)

Defining $\sigma^2_j(i) = \textbf{Var}(h_{ij}d_j)$, we note that

Let us finally examine the event

By definition, $|h_{ij}|\lt1$. If we show that

then for all ε > 0 there exists nε such that the event

has probability 1. Indeed,

for a suitable constant C. Thus, (4.10) holds.

Acknowledgment

P D, R S H, F d H and M M are supported by the Netherlands Organisation for Scientific Research (NWO) through the Gravitation-grant NETWORKS-024.002.003. D G is supported by the Dutch Econophysics Foundation (Stichting Econophysics, Leiden, The Netherlands) and by the European Union—Horizon 2020 Program under the scheme 'INFRAIA-01-2018-2019 - Integrating Activities for Advanced Communities', Grant Agreement N.871042, 'SoBigData++: European Integrated Infrastructure for Social Mining and Big Data Analytics'.

Data availability statement

No new data were created or analysed in this study.

Footnotes

  • 6. This work was performed using the compute resources from the Academic Leiden Interdisciplinary Cluster Environment (ALICE) provided by Leiden University.
