1 Introduction

Grassmann integration with supersymmetric (SUSY) methods is ubiquitous in the physics literature on random quantum systems, see e.g. the basic monograph of Efetov [7]. This approach is especially effective for analyzing the Green's function in the middle of the bulk spectrum with spectral parameter close to the real axis, i.e. precisely in the regime where other methods often fail. The main algebraic strength lies in the fact that Gaussian integrals over Grassmann variables counterbalance the determinants obtained in the partition functions of complex Gaussian integrals. This greatly simplifies algebraic manipulations, as was demonstrated in several papers, see e.g. the proof of absolutely continuous spectrum on the Bethe lattice by Klein [17] or the bounds on the Lyapunov exponents for random walks in random environment at the critical energy by Wang [29]. However, in theoretical physics Grassmann integration is also commonly used as an analytic tool by performing saddle point analysis on the superspace coordinatized by complex and Grassmann variables. Since Grassmann variables lack the concept of size, the rigorous justification of this very appealing idea is notoriously difficult.

Initiated by Spencer (see [26] for a summary) and starting with the paper [4] by Disertori, Pinson and Spencer, only a handful of mathematical papers have succeeded in exploiting this powerful tool in an essentially analytic way. We still lack the mathematical framework of a full-fledged analysis on the superspace that would enable us to translate physics arguments into proofs directly, but a combination of refined algebraic identities from physics (such as the superbosonization formula) and careful analysis has yielded results that are currently inaccessible with more standard probabilistic methods. In this paper we present such results on random band matrices, surpassing a well-known limitation of the recently developed probabilistic techniques for proving local versions of the celebrated Wigner semicircle law. We start by introducing the physical motivation of our model.

The Hamiltonian of quantum systems on a graph with vertex set \(\varGamma \) is a self-adjoint matrix \(H= (h_{ab})_{a,b\in \varGamma }, H=H^*\). The matrix elements \(h_{ab}\) represent the quantum transition rates from vertex a to b. Disordered quantum systems have random matrix elements. We assume they are centered, \(\mathbb {E}h_{ab} =0\), and independent subject to the basic symmetry constraint \(h_{ab} = \bar{h}_{ba}\). The variance \(\sigma _{ab}^2: = \mathbb {E} |h_{ab}|^2\) represents the strength of the transition from a to b and we use a scaling where the norm \(\Vert H\Vert \) is typically of order 1. The simplest case is the mean field model, where \(h_{ab}\) are identically distributed; this is the standard Wigner matrix ensemble [31]. The other prominent example is the Anderson model [2] or random Schrödinger operator, \(H= \varDelta +V\), where the kinetic energy \(\varDelta \) is the (deterministic) graph Laplacian and the potential \(V= ( V_x)_{x\in \varGamma }\) is an on-site multiplication operator with random multipliers. If \(\varGamma \) is a discrete \(\mathsf {d}\)-dimensional torus, then only a few matrix elements \(h_{ab}\) are nonzero, and they connect nearest neighbor points in the torus, \(\text {dist}(a,b)\le 1\). This is in sharp contrast to the mean field character of the Wigner matrices.

Random band matrices naturally interpolate between the mean field Wigner matrices and the short range Anderson model. They are characterized by a parameter M, called the band width, such that the matrix elements \(h_{ab}\) for \(\text {dist}(a,b)\ge M\) are zero or negligible. If M is comparable with the diameter L of the system then we are in the mean field regime, while \(M\sim 1\) corresponds to the short range model.

The Anderson model exhibits a metal-insulator phase transition: at high disorder the system is in the localized (insulator) regime, while at small disorder it is in the delocalized (metallic) regime, at least in \(\mathsf {d}\ge 3\) dimensions and away from the spectral edges. The localized regime is characterized by exponentially decaying eigenfunctions and off diagonal decay of the Green’s function, while in the complementary regime the eigenfunctions are supported in the whole physical space. In terms of the localization length \(\ell \), the characteristic length scale of the decay, the localized regime corresponds to \(\ell \ll L\), while in the delocalized regime \(\ell \sim L\). Starting from the basic papers [1, 15], the localized regime is well understood, but the delocalized regime is still an open mathematical problem for the \(\mathsf {d}\)-dimensional torus.

Let \(N=L^{\mathsf {d}}\) be the number of vertices in the discrete torus. Since the eigenvectors of the mean field Wigner matrices are always delocalized [13, 14], while the short range models are localized, by varying the parameter M in the random band matrix one expects a (de)localization phase transition. Indeed, for \(\mathsf {d}=1\) it is conjectured (and supported by non-rigorous supersymmetric calculations [16]) that the system is delocalized for broad bands, \(M\gg N^{1/2}\), and localized for \(M\ll N^{1/2}\). The optimal power 1/2 has not yet been achieved from either side. Localization has been shown for \(M\ll N^{1/8}\) in [23], while delocalization in a certain weak sense for most eigenvectors was proven for \(M\gg N^{4/5}\) in [11]. Interestingly, for a special Gaussian model even the sine kernel behavior of the 2-point correlation function of the characteristic polynomials could be proven down to the optimal band width \(M\gg N^{1/2}\), see [19, 21]. Note that the sine kernel is consistent with delocalization but does not imply it. We remark that our discussion concerns the bulk of the spectrum; the transition at the spectral edge is much better understood. In [25] it was shown with the moment method that the edge spectrum follows the Tracy–Widom distribution, characteristic of mean field models, for \(M\gg N^{5/6}\), but it yields a different distribution for narrow bands, \(M\ll N^{5/6}\).

Delocalization is closely related to estimates on the diagonal elements of the resolvent \(G(z)=(H-z)^{-1}\) at spectral parameters with small imaginary part \(\eta =\mathsf {Im}z\). Indeed, if \(G_{ii}(E+i\eta )\) is bounded for all i and all \(E\in {\mathbb R }\), then each \(\ell ^2\)-normalized eigenvector \(\mathbf{{u}}\) of H is delocalized on scale \(\eta ^{-1}\) in the sense that \(\max _i |u_i|^2 \lesssim \eta \), i.e. \(\mathbf{{u}}\) is supported on at least \(\eta ^{-1}\) sites. In particular, if \(G_{ii}\) can be controlled down to the scale \(\eta \sim 1/N\), then the system is in the completely delocalized regime.
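For the reader's convenience, we recall the standard spectral decomposition behind this claim: if \(\lambda \) is an eigenvalue of H with \(\ell ^2\)-normalized eigenvector \(\mathbf{{u}}=(u_1,\ldots ,u_N)\), then isolating the term with \(\lambda _k=\lambda \) in the spectral representation of the resolvent gives

$$\begin{aligned} \mathsf {Im}\, G_{ii}(\lambda +i\eta )=\sum _{k}\frac{\eta \,|u_{ki}|^2}{(\lambda -\lambda _k)^2+\eta ^2}\ge \frac{|u_i|^2}{\eta }, \end{aligned}$$

where \(\lambda _k\) and \(\mathbf{{u}}_k=(u_{k1},\ldots ,u_{kN})\) denote the eigenvalues and normalized eigenvectors of H. Hence \(|G_{ii}(\lambda +i\eta )|\le C\) indeed forces \(\max _i|u_i|^2\le C\eta \).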

For band matrices with band width M, or even under the more general condition \( \sigma _{ab}^2\le M^{-1}\), the boundedness of \(G_{ii}\) was shown down to scale \(\eta \gg M^{-1}\) in [14] (see also [12]). If \(M\gg N^{1/2}\), it is expected that \(G_{ii}\) remains bounded even down to \(\eta \gg N^{-1}\), which is the typical eigenvalue spacing, the smallest relevant scale in the model. However, the standard approach [12, 14] via the self-consistent equations for the Green's function does not seem to work for \(\eta \le 1/M\); the fluctuation is hard to control. The more subtle approach using the self-consistent matrix equation in [11] could prove delocalization and the off-diagonal Green's function profile that are consistent with the conventional quantum diffusion picture, but it was valid only for relatively large \(\eta \), far from \(M^{-1}\). Moment methods, even with a delicate renormalization scheme [24], could not break the barrier \(\eta \sim M^{-1}\) either.

In this paper we attack the problem differently, with supersymmetric (SUSY) techniques. Our main result is that \(G_{ii}(z)\) is bounded and the local semicircle law holds for any \(\eta \gg N^{-1}\), i.e. down to the optimal scale, if the band width is not too small, \(M\gg N^{6/7}\), but under two technical assumptions. First, we consider a generalization of Wegner's n-orbital model [22, 30], namely, we assume that the band matrix has a block structure, i.e. it consists of \(M\times M\) blocks and the matrix elements within each block have the same distribution. This assumption is essential to reduce the number of integration variables in the supersymmetric representation, since, roughly speaking, each \(M\times M\) block will be represented by a single supermatrix with 16 complex or Grassmann variables. Second, we assume that the distribution of the matrix elements matches a Gaussian up to four moments in the spirit of [28]. Supersymmetry heavily uses Gaussian integrations; in fact, all mathematically rigorous works on random band matrices with the supersymmetric method assume that the matrix elements are Gaussian, see [4–6, 19–21, 26, 27]. The Green's function comparison method [14] allows one to compare Green's functions of two matrix ensembles provided that the distributions match up to four moments and provided that \(G_{ii}\) are bounded. This was an important motivation to reach the optimal scale \(\eta \gg N^{-1}\).

In the next subsections we introduce the model precisely and state our main results. Our supersymmetric analysis was inspired by [20], but our observable, \(G_{ab}\), requires a partly different formalism, in particular we use the singular version of the superbosonization formula [3]. Moreover, our analysis is considerably more involved since we consider relatively narrow bands. In Sect. 1.3, we explain our novelties compared with [20].

1.1 Matrix model

Let \(H_N=(h_{ab})\) be an \(N\times N\) random Hermitian matrix, in which the entries are independent (up to symmetry), centered, complex variables. In this paper, we are concerned with \(H_N\) possessing a block band structure. To define this structure explicitly, we set the additional parameters \(M\equiv M(N)\) and \(W\equiv W(N)\) satisfying

$$\begin{aligned} W=N/M. \end{aligned}$$

For simplicity, we assume that both M and W are integers. Let \(S=(\mathfrak {s}_{jk})\) be a \(W\times W\) symmetric matrix, which will be chosen as a weighted Laplacian of a connected graph on W vertices. Now, we decompose \(H_N\) into \(W\times W\) blocks of size \(M\times M\) and relabel

$$\begin{aligned} h_{jk,\alpha \beta }:=h_{ab},\qquad j,k=1,\ldots , W,\quad \alpha ,\beta =1,\ldots ,M, \end{aligned}$$

where \(j\equiv j(a)\) and \(k\equiv k(b)\) are the spatial indices that describe the location of the block containing \(h_{ab}\) and \(\alpha \equiv \alpha (a)\) and \(\beta \equiv \beta (b)\) are the orbital indices that describe the location of the entry in the block. More specifically, we have

$$\begin{aligned} j=\lceil a/M\rceil ,\quad k=\lceil b/M\rceil ,\quad \alpha =a-(j-1)M,\quad \beta =b-(k-1)M. \end{aligned}$$

We will call \((j(a),\alpha (a))\) (resp. \((k(b),\beta (b))\)) the spatial-orbital parametrization of a (resp. b). Moreover, we assume

$$\begin{aligned} \mathbb {E} h_{jk,\alpha \beta }h_{j'k',\alpha '\beta '}=\frac{1}{M}\delta _{jk'}\delta _{j'k}\delta _{\alpha \beta '}\delta _{\beta \alpha '}(\delta _{jk}+\mathfrak {s}_{jk}). \end{aligned}$$
(1.1)

This means that the variance profile of the random matrix \(\sqrt{M}H_N\) is given by

$$\begin{aligned} \widetilde{S}=(\tilde{\mathfrak {s}}_{jk}):=I+S, \end{aligned}$$
(1.2)

in which each entry represents the common variance of the entries in the corresponding block of \(\sqrt{M}H_N\).
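For concreteness, the following minimal numerical sketch illustrates the matrix model; it is not part of the proofs, and the choice \(S=a\varDelta \) on the 1d cycle as well as all parameter values are purely illustrative (cf. Example 1.1 below).

```python
import numpy as np

def cycle_laplacian(W, a):
    """Weighted Laplacian S = a*Delta of the cycle on W vertices."""
    S = np.zeros((W, W))
    for i in range(W):
        S[i, (i + 1) % W] = S[i, (i - 1) % W] = a
        S[i, i] = -2.0 * a
    return S

def sample_block_band(W, M, S, rng):
    """Sample a Gaussian Hermitian block band matrix with variance profile (1.1)."""
    N = W * M
    St = np.eye(W) + S                                  # widetilde{S} = I + S, cf. (1.2)
    sigma = np.sqrt(np.kron(St, np.ones((M, M))) / M)   # sigma_ab, cf. (1.6)
    A = (rng.standard_normal((N, N))
         + 1j * rng.standard_normal((N, N))) / np.sqrt(2)
    U = np.triu(sigma * A, 1)                           # independent upper triangle
    D = np.diag(sigma.diagonal() * rng.standard_normal(N))
    return U + U.conj().T + D                           # H = H^*

rng = np.random.default_rng(0)
W, M, a = 8, 64, 0.2                                    # N = 512; a < 1/(4d) for d = 1
H = sample_block_band(W, M, cycle_laplacian(W, a), rng)
print(np.histogram(np.linalg.eigvalsh(H), bins=10, range=(-2, 2))[0])
# the eigenvalue histogram is close to the semicircle density on [-2, 2]
```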

1.2 Assumptions and main results

In the sequel, for some matrix \(A=(a_{ij})\) and some index sets \(\mathsf {I}\) and \(\mathsf {J}\), we introduce the notation \(A^{(\mathsf {I}|\mathsf {J})}\) to denote the submatrix obtained by deleting the ith row and jth column of A for all \(i\in \mathsf {I}\) and \(j\in \mathsf {J}\). We will adopt the abbreviation

$$\begin{aligned} A^{(i|j)}:=A^{(\{i\}|\{j\})},\quad i\ne j,\qquad A^{(i)}:=A^{(\{i\}|\{i\})}. \end{aligned}$$
(1.3)

In addition, we use \(||A||_{\max }:=\max _{i,j}|a_{ij}|\) to denote the max norm of A. Throughout the paper, we need some assumptions on S.

Assumption 1.1

(On S) Let \(\mathcal {G}=(\mathcal {V},\mathcal {E})\) be a connected simple graph with \(\mathcal {V}=\{1,\ldots , W\}\). Assume that S is a \(W\times W\) symmetric matrix satisfying the following four conditions.

  1. (i)

    S is a weighted Laplacian on \(\mathcal {G}\), i.e. for \(i\ne j\), we have \(\mathfrak {s}_{ij}>0\) if \(\{i,j\}\in \mathcal {E}\) and \(\mathfrak {s}_{ij}=0\) if \(\{i,j\}\not \in \mathcal {E}\), and for the diagonal entries, we have

    $$\begin{aligned} \mathfrak {s}_{ii}=-\sum _{j:j\ne i}\mathfrak {s}_{ij},\quad \forall \; i=1,\ldots ,W. \end{aligned}$$
  2. (ii)

    \(\widetilde{S}\) defined in (1.2) is strictly diagonally dominant, i.e., there exists some constant \(c_0>0\) such that

    $$\begin{aligned} 1+2\mathfrak {s}_{ii}>c_0,\quad \forall \; i=1,\ldots , W. \end{aligned}$$
  3. (iii)

    For the discrete Green’s functions, we assume that there exist some positive constants C and \(\gamma \) such that

    $$\begin{aligned} \max _{i=1,\ldots , W}||(S^{(i)})^{-1}||_{\max }\le CW^{\gamma }. \end{aligned}$$
    (1.4)
  4. (iv)

    There exists a spanning tree \(\mathcal {G}_0=(\mathcal {V},\mathcal {E}_0)\subset \mathcal {G}\), on which the weights are bounded below, i.e. for some constant \(c>0\), we have

    $$\begin{aligned} \mathfrak {s}_{ij}\ge c, \quad \forall \; \{i,j\}\in \mathcal {E}_0. \end{aligned}$$

Remark 1.2

From Assumption 1.1 (ii), we easily see that

$$\begin{aligned} \widetilde{S}\ge c_0I. \end{aligned}$$
(1.5)

Later, in Lemma 7.4, we will see that \(||(S^{(i)})^{-1}||_{\max }\le CW^2\) always holds. Hence, we may assume \(\gamma \le 2\).

Example 1.1

Let \(\varDelta \) be the standard discrete Laplacian on the \(\mathsf {d}\)-dimensional torus \([1,\mathfrak {w}]^{\mathsf {d}}\cap \mathbb {Z}^{\mathsf {d}}\), with periodic boundary condition, where \(\mathfrak {w}=W^{1/\mathsf {d}}\). Here by standard we mean that the weights on the edges of the box are all 1. Now let \(S=a\varDelta \) for some positive constant \(a<1/(4\mathsf {d})\). It is then easy to check that Assumption 1.1 (i), (ii) and (iv) are satisfied. In addition, if \(\mathsf {d}=1\), it is well known that one can choose \(\gamma =1\) in Assumption 1.1 (iii). For \(\mathsf {d}\ge 3\), one can choose \(\gamma =0\). For \(\mathsf {d}=2\), one can choose \(\gamma =\varepsilon \) for an arbitrarily small constant \(\varepsilon \). We refer to [8] for more details.
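The following small numerical check (illustration only; the parameter values are hypothetical) confirms Assumption 1.1 (ii) and (iii) for \(S=a\varDelta \) in \(\mathsf {d}=1\): the quantity \(\max _{i}||(S^{(i)})^{-1}||_{\max }\) grows linearly in W, consistent with \(\gamma =1\) in (1.4).

```python
import numpy as np

def cycle_laplacian(W, a):
    """Weighted Laplacian S = a*Delta of the cycle on W vertices."""
    S = np.zeros((W, W))
    for i in range(W):
        S[i, (i + 1) % W] = S[i, (i - 1) % W] = a
        S[i, i] = -2.0 * a
    return S

a = 0.2                                    # a < 1/(4d) with d = 1
for W in (50, 100, 200, 400):
    S = cycle_laplacian(W, a)
    assert 1 + 2 * S.min() > 0             # (ii): 1 + 2*s_ii = 1 - 4a > 0
    norms = [np.abs(np.linalg.inv(np.delete(np.delete(S, i, 0), i, 1))).max()
             for i in range(W)]            # ||(S^{(i)})^{-1}||_max, cf. (1.3)-(1.4)
    print(W, max(norms) / W)               # ratio stabilizes as W grows => gamma = 1
```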

For simplicity, we also introduce the notation

$$\begin{aligned} \sigma ^2_{ab}:=\mathbb {E}|h_{ab}|^2,\qquad \mathcal {T}:=(\sigma ^2_{ab})=\frac{1}{M}\widetilde{S}\otimes (\mathbf {1}_M\mathbf {1}_M'),\quad a,b=1,\ldots ,N, \end{aligned}$$
(1.6)

where \(\mathbf {1}_M\) is the M-dimensional vector whose components are all 1 and \(\widetilde{S}\) is the variance matrix in (1.2). Since \(M^{-1}\mathbf {1}_M\mathbf {1}_M'\) is a rank-one orthogonal projection, it is elementary that

$$\begin{aligned} \text {Spec}(\mathcal {T})=\text {Spec}(\widetilde{S})\cup \{0\}\subset [0,1]. \end{aligned}$$
(1.7)

Our assumption on M depends on the constant \(\gamma \) in Assumption 1.1 (iii).

Assumption 1.3

(On M) We assume that there exists a (small) positive constant \(\varepsilon _1\) such that

$$\begin{aligned} M\ge W^{4+2\gamma +\varepsilon _1}. \end{aligned}$$
(1.8)

Remark 1.4

A direct consequence of (1.8) and \(N=MW\) is

$$\begin{aligned} M\ge N^{\frac{4+2\gamma +\varepsilon _1}{5+2\gamma +\varepsilon _1}}. \end{aligned}$$
(1.9)
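Indeed, (1.8) gives \(W\le M^{1/(4+2\gamma +\varepsilon _1)}\), so that

$$\begin{aligned} N=MW\le M^{\frac{5+2\gamma +\varepsilon _1}{4+2\gamma +\varepsilon _1}}, \end{aligned}$$

which is (1.9).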

In particular, when \(\gamma =1\), one has \(M\gg N^{6/7}\). Actually, through a more involved analysis, (1.8) [or (1.9)] can be further improved. At least for \(\gamma \le 1\), we expect that \(M\gg N^{4/5}\) is enough. However, we will not pursue this direction here.

Besides Assumption 1.1 on the variance profile of H, we need to impose some additional assumption on the distribution of its entries. To this end, we temporarily employ the notation \(H^g=(h^g_{ab})\) to represent a random block band matrix with Gaussian entries, satisfying (1.1), Assumptions 1.1 and 1.3.

Assumption 1.5

(On distribution) We assume that for each \(a,b\in \{1,\ldots , N\}\), the moments of the entry \(h_{ab}\) match those of \(h_{ab}^g\) up to the 4th order, i.e.

$$\begin{aligned} \mathbb {E}(\mathsf {Re}h_{ab})^k(\mathsf {Im}h_{ab})^\ell =\mathbb {E}(\mathsf {Re}h_{ab}^g)^k(\mathsf {Im}h_{ab}^g)^\ell ,\quad \forall \; k,\ell \in \mathbb {N}, \quad \text {s.t.}\quad k+\ell \le 4. \nonumber \\ \end{aligned}$$
(1.10)

In addition, we assume the distribution of \(h_{ab}\) possesses a subexponential tail, namely, there exist positive constants \(c_1\) and \(c_2\) such that for any \(\tilde{\gamma }>0\),

$$\begin{aligned} \mathbb {P}\Big (|h_{ab}|\ge \tilde{\gamma }^{c_1}(\mathbb {E}|h_{ab}|^2)^{\frac{1}{2}}\Big )\le c_2 e^{-\tilde{\gamma }} \end{aligned}$$
(1.11)

holds uniformly for all \(a,b=1,\ldots , N\).

The four moment condition (1.10) in the context of random matrices first appeared in [28].

To state our results, we will need the following notion for comparing two families of random variables, which was introduced in [9, 12].

Definition 1.6

(Stochastic domination) For some possibly N-dependent parameter set \(\mathsf {U}_N\), and two families of random variables \(\mathsf {X}=(\mathsf {X}_N(u): N\in \mathbb {N},u\in \mathsf {U}_{N})\) and \(\mathsf {Y}=(\mathsf {Y}_N(u): N\in \mathbb {N},u\in \mathsf {U}_N)\), we say that \(\mathsf {X}\) is stochastically dominated by \(\mathsf {Y}\), if for all \(\varepsilon '>0\) and \(D>0\) we have

$$\begin{aligned} \sup _{u\in \mathsf {U}_N}\mathbb {P}\Big (\mathsf {X}_N(u)\ge N^{\varepsilon '}\mathsf {Y}_N(u)\Big )\le N^{-D} \end{aligned}$$
(1.12)

for all sufficiently large \(N\ge N_0(\varepsilon ', D)\). In this case we write \(\mathsf {X}\prec \mathsf {Y}\).

The set \(\mathsf {U}_N\) is omitted from the notation \(\mathsf {X}\prec \mathsf {Y}\). Whenever we want to emphasize the role of \(\mathsf {U}_N\), we say that \(\mathsf {X}_N(u)\prec \mathsf {Y}_N(u)\) holds for all \(u\in \mathsf {U}_N\). For example, by (1.1) and Assumption 1.5, we have

$$\begin{aligned} |h_{ab}|\prec {1}/{\sqrt{M}},\quad \forall \; a,b=1,\ldots , N. \end{aligned}$$
(1.13)

Note that here \(\mathsf {U}_N=\{u=(a,b): a,b=1,\ldots ,N\}\). In some applications, we also use this notation for random variables without any parameter or with a fixed parameter, i.e. the set of parameters \(\mathsf {U}_N\) plays no role.
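For illustration, let us verify (1.13). By (1.1) and Assumption 1.1 (i) and (ii), we have \(\sigma _{ab}^2\le 2/M\) (entries with \(\sigma _{ab}=0\) vanish identically and are trivial). Choosing \(\tilde{\gamma }=(\log N)^2\) in (1.11), for any \(\varepsilon '>0, D>0\) and sufficiently large N,

$$\begin{aligned} \mathbb {P}\Big (|h_{ab}|\ge N^{\varepsilon '}M^{-\frac{1}{2}}\Big )\le \mathbb {P}\Big (|h_{ab}|\ge (\log N)^{2c_1}\sigma _{ab}\Big )\le c_2e^{-(\log N)^2}\le N^{-D}, \end{aligned}$$

uniformly in a,b, since \((\log N)^{2c_1}\sigma _{ab}\le N^{\varepsilon '}M^{-1/2}\) for large N.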

Note that \(\widetilde{S}\) is doubly stochastic: since the rows of the Laplacian S sum to zero, each row of \(\widetilde{S}=I+S\) sums to 1, and hence \(\sum _b\sigma ^2_{ab}=1\) for every a. It is known that the empirical eigenvalue distribution of \(H_N\) converges to the semicircle law, whose density function is given by

$$\begin{aligned} \varrho _{sc}(x):=\frac{1}{2\pi }\sqrt{4-x^2}\cdot \mathbf {1}(|x|\le 2), \end{aligned}$$

see [14] for instance. We denote the Green’s function of \(H_N\) by

$$\begin{aligned} G(z)\equiv G_N(z):=(H_N-z)^{-1},\quad z=E+\mathbf {i}\eta \in \mathbb {C}^+:=\{w\in \mathbb {C}: \mathsf {Im}w>0\} \end{aligned}$$

and its (ab) matrix element is \(G_{ab}(z)\). Throughout the paper, we will always use E and \(\eta \) to denote the real and imaginary part of z without further mention. In addition, for simplicity, we suppress the subscript N from the notation of the matrices here and there. The Stieltjes transform of \(\varrho _{sc}(x)\) is

$$\begin{aligned} m_{sc}(z)=\int _{-2}^2\frac{\varrho _{sc}(x)}{x-z}\mathrm{d}x=\frac{-z+\sqrt{z^2-4}}{2}, \end{aligned}$$

where we chose the branch of the square root with positive imaginary part for \(z\in \mathbb {C}^+\). Note that \(m_{sc}(z)\) is a solution to the following self-consistent equation

$$\begin{aligned} m_{sc}(z)=\frac{1}{-z-m_{sc}(z)}. \end{aligned}$$
(1.14)
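Indeed, with the above choice of the branch,

$$\begin{aligned} m_{sc}(z)\big (-z-m_{sc}(z)\big )=\frac{-z+\sqrt{z^2-4}}{2}\cdot \frac{-z-\sqrt{z^2-4}}{2}=\frac{z^2-(z^2-4)}{4}=1, \end{aligned}$$

which is equivalent to (1.14); moreover, \(m_{sc}(z)\in \mathbb {C}^+\) for \(z\in \mathbb {C}^+\), which singles out \(m_{sc}\) among the two solutions of the quadratic equation.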

The semicircle law also holds in a local sense, see Theorem 2.3 in [12]. For simplicity, we cite this result with a slight modification adjusted to our assumption.

Proposition 1.7

(Erdős, Knowles, Yau, Yin, [12]) Let H be a random block band matrix satisfying Assumptions 1.1, 1.3 and 1.5. Then, for any fixed small positive constants \(\kappa \) and \(\varepsilon \), we have

$$\begin{aligned}&\max _{a,b}|G_{ab}(z)-\delta _{ab}m_{sc}(z)|\prec (M\eta )^{-\frac{1}{2}},\quad \text {if}\quad E\in [-2+\kappa ,2-\kappa ]\nonumber \\&\quad \text {and}\quad M^{-1+\varepsilon }\le \eta \le 10. \end{aligned}$$
(1.15)

Remark 1.8

We remark that Theorem 2.3 in [12] was established under the more general assumptions \(\sum _k \sigma _{jk}^2=1\) and \(\sigma _{jk}^2\le C/M\). In particular, the block structure on the variance profile is not needed. In addition, Theorem 2.3 in [12] also covers the edges of the spectrum, which will not be discussed in this paper. We also refer to [14] for a previous result, see Theorem 2.1 therein.

Our aim in this paper is to extend the local semicircle law to the regime \(\eta \gg N^{-1}\) and to replace M by N in (1.15). More specifically, we will work in the following set, defined for an arbitrarily small constant \(\kappa >0\) and any sufficiently small positive constant \( \varepsilon _2:=\varepsilon _2(\varepsilon _1)\),

$$\begin{aligned} \mathbf {D}(N,\kappa , \varepsilon _2):=\left\{ z=E+\mathbf {i}\eta \in \mathbb {C}: |E|\le \sqrt{2}-\kappa ,N^{-1+\varepsilon _2}\le \eta \le M^{-1}N^{\varepsilon _2}\right\} . \nonumber \\ \end{aligned}$$
(1.16)

Throughout the paper, we will assume that \(\varepsilon _2\) is much smaller than \(\varepsilon _1\), see (1.8) for the latter. Specifically, there exists some large enough constant C such that \(\varepsilon _2\le \varepsilon _1/C\).

Theorem 1.9

(Local semicircle law) Suppose that H is a random block band matrix satisfying Assumptions 1.1, 1.3 and 1.5. Let \(\kappa \) be an arbitrarily small positive constant and \(\varepsilon _2\) be any sufficiently small positive constant. Then

$$\begin{aligned} \max _{a,b}|G_{ab}(z)-\delta _{ab}m_{sc}(z)|\prec (N\eta )^{-\frac{1}{2}} \end{aligned}$$
(1.17)

for all \(z\in \mathbf {D}(N,\kappa ,\varepsilon _2)\).

Remark 1.10

In fact, (1.17), together with the fact that \(G_{ab}(z)\) and \(m_{sc}(z)\) are Lipschitz functions of z with Lipschitz constant \(\eta ^{-2}\), implies the uniformity of the estimate in z in the following stronger sense

$$\begin{aligned} \max _{z\in \mathbf {D}(N,\kappa ,\varepsilon _2)}\max _{a,b}\Big [(N\eta )^{\frac{1}{2}}|G_{ab}(z)-\delta _{ab}m_{sc}(z)|\Big ]\prec 1. \end{aligned}$$
(1.18)
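This is a standard argument; we sketch it. Choose a lattice \(\widehat{\mathbf {D}}\subset \mathbf {D}(N,\kappa ,\varepsilon _2)\) with spacing \(N^{-3}\), so that \(|\widehat{\mathbf {D}}|\le CN^6\). By (1.17) and a union bound, the estimate holds simultaneously for all \(z'\in \widehat{\mathbf {D}}\). For an arbitrary \(z\in \mathbf {D}(N,\kappa ,\varepsilon _2)\) and its nearest lattice point \(z'\), the Lipschitz continuity yields

$$\begin{aligned} |G_{ab}(z)-G_{ab}(z')|\le \eta ^{-2}|z-z'|\le N^{-1}\le (N\eta )^{-\frac{1}{2}}, \end{aligned}$$

and similarly for \(m_{sc}\), whence (1.18) follows.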

Remark 1.11

The restriction \(|E|\le \sqrt{2}-\kappa \) in (1.16) is technical. We believe the result can be extended to the whole bulk regime of the spectrum, i.e., \(|E|\le 2-\kappa \). The upper bound of \(\eta \) in (1.16) is also technical. However, for \(\eta > M^{-1}N^{\varepsilon _2}\), one can control the Green’s function by (1.15) directly.

Let \(\lambda _1,\ldots ,\lambda _N\) be the eigenvalues of \(H_N\). We denote by \(\mathbf {u}_i:=(u_{i1},\ldots , u_{iN})\) the normalized eigenvector of \(H_N\) corresponding to \(\lambda _i\). From Theorem 1.9, we can also get the following delocalization property for the eigenvectors.

Theorem 1.12

(Complete delocalization) Let H be a random block band matrix satisfying Assumptions 1.1, 1.3 and 1.5. We have

$$\begin{aligned} \max _{i:|\lambda _i|\le \sqrt{2}-\kappa } ||\mathbf {u}_i||_\infty \prec N^{-\frac{1}{2}}. \end{aligned}$$
(1.19)

Remark 1.13

We remark that delocalization in a certain weak sense was proven in [11] for an even more general class of random band matrices if \(M\gg N^{4/5}\). However, Theorem 1.12 asserts delocalization for all eigenvectors in a very strong sense (supremum norm), while Proposition 7.1 of [11] stated that most eigenvectors are delocalized in the sense that their substantial support cannot be too small.

1.3 Outline of the proof strategy and novelties

In this section, we briefly outline the strategy for the proof of Theorem 1.9.

The first step, which is the main task of the whole proof, is to establish Theorem 1.15 below, namely, an a priori estimate on the Green's function in the Gaussian case. For technical reasons, we need the following slight modification of Assumption 1.3 to state the result.

Assumption 1.14

(On M) Let \(\varepsilon _1\) be the small positive constant in Assumption 1.3. We assume

$$\begin{aligned} N(\log N)^{-10}\ge M\ge W^{4+2\gamma +\varepsilon _1}. \end{aligned}$$
(1.20)

In the complementary regime \(M\ge N(\log N)^{-10}\), (1.17) follows from (1.15) directly.

Theorem 1.15

Assume that H is a Gaussian block band matrix, satisfying Assumptions 1.1 and 1.14. Let n be any fixed positive integer. Let \(\kappa \) be an arbitrarily small positive constant and \(\varepsilon _2\) be any sufficiently small positive constant. There is \(N_0=N_0(n)\), such that for all \(N\ge N_0\) and all \(z\in \mathbf {D}(N,\kappa ,\varepsilon _2)\), we have

$$\begin{aligned} \mathbb {E}|G_{ab}(z)|^{2n} \le N^{C_0}\Big (\delta _{ab}+\frac{1}{(N\eta )^n}\Big ),\quad \forall \; a,b=1,\ldots , N \end{aligned}$$
(1.21)

for some positive constant \(C_0\) independent of n and z.

Remark 1.16

Much more delicate analysis can show that the prefactor \(N^{C_0}\) can be improved to some n-dependent constant \(C_n\). We refer to Sect. 12 for further comment on this issue.

Using the definition of stochastic domination in Definition 1.6, a simple application of Markov's inequality shows that (1.21) implies

$$\begin{aligned} |G_{ab}(z)|\prec \delta _{ab}+(N\eta )^{-\frac{1}{2}}, \quad \forall \; a,b=1,\ldots , N. \end{aligned}$$
(1.22)
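Indeed, for any \(\varepsilon '>0\) and \(D>0\), Markov's inequality and (1.21) imply

$$\begin{aligned} \mathbb {P}\Big (|G_{ab}(z)|\ge N^{\varepsilon '}\big (\delta _{ab}+(N\eta )^{-\frac{1}{2}}\big )\Big )\le \frac{\mathbb {E}|G_{ab}(z)|^{2n}}{N^{2n\varepsilon '}\big (\delta _{ab}+(N\eta )^{-\frac{1}{2}}\big )^{2n}}\le N^{C_0-2n\varepsilon '}\le N^{-D}, \end{aligned}$$

where in the second step we used \(\big (\delta _{ab}+(N\eta )^{-\frac{1}{2}}\big )^{2n}\ge \delta _{ab}+(N\eta )^{-n}\), and in the last step we chose \(n\ge (C_0+D)/(2\varepsilon ')\).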

The proof of Theorem 1.15 is the main task of our paper. We will use the supersymmetry method. We partially rely on the arguments from Shcherbina's work [20] concerning universality of the local 2-point function, and we develop new techniques to treat our observable, high moments of the entries of G(z), in a more general setting. We will comment on the novelties later in this subsection.

The second step is to generalize Theorem 1.15 from the Gaussian case to more general distributions satisfying Assumption 1.5, via a Green's function comparison strategy initiated in [14], see Lemma 2.1 below.

The last step is to use Lemma 2.1 and its Corollary 2.2 to prove our main theorems. Using (1.22) above to bound the error term in the self-consistent equation for the Green’s function, we can prove Theorem 1.9 by a continuity argument in z, with the aid of the initial estimate for large \(\eta \) provided in Proposition 1.7. Theorem 1.12 will then easily follow from Theorem 1.9.

The second and the last steps are carried out in Sect. 2. The main body of this paper, Sects. 3–11, is devoted to the proof of Theorem 1.15.

One of the main novelties of this work is to combine the supersymmetry method and the Green's function comparison strategy to go beyond the Gaussian ensemble, which was so far the only random band matrix ensemble amenable to the supersymmetry method, as mentioned at the beginning. The comparison strategy requires an a priori control on the individual matrix elements of the Green's function with high probability (see (1.22)); this is one of our main motivations behind Theorem 1.15.

Although we consider a different observable than [20], many technical aspects of the supersymmetric analysis overlap with [20]. For the convenience of the reader, we now briefly introduce the strategy of [20], and highlight the main novelties of our work.

In [20], the author considers the 2-point correlation function of the trace of the resolvent of the Gaussian block band matrix H, with the variance profile \(\widetilde{S}=1+a\varDelta \), under the assumption \(M\sim N\) (note that we use M for the size of the blocks, which is denoted by W in [20]). The 2-point correlation function can be expressed in terms of a superintegral of a superfunction \(F(\{\breve{\mathcal {S}}_i\}_{i=1}^W)\) with a collection of \(4\times 4\) supermatrices \(\breve{\mathcal {S}}_i:=\mathcal {Z}^*_i\mathcal {Z}_i\). Here, for each i, \(\mathcal {Z}_i=(\varPsi _{1,i},\varPsi _{2,i},\varPhi _{1,i},\varPhi _{2,i})\) is an \(M\times 4\) matrix and \(\mathcal {Z}^*_i\) is its conjugate transpose, where \(\varPsi _{1,i}\) and \(\varPsi _{2,i}\) are Grassmann M-vectors whilst \(\varPhi _{1,i}\) and \(\varPhi _{2,i}\) are complex M-vectors. Then, by using the superbosonization formula in the nonsingular case (\(M\ge 4\)) from [18], one can transform the superintegral of \(F(\{\breve{\mathcal {S}}_i\}_{i=1}^W)\) to a superintegral of \(F(\{\mathcal {S}_i\}_{i=1}^W)\), where each \(\mathcal {S}_i\) is a supermatrix akin to \(\breve{\mathcal {S}}_i\), but only consists of 16 independent variables (either complex or Grassmann). We will call the integral representation of the observable after using the superbosonization formula the final integral representation. Schematically it has the form

$$\begin{aligned} \int \mathsf {g}(\mathcal {S}_c)e^{M\mathsf {f}_c(\mathcal {S}_c)+\mathsf {f}_g(\mathcal {S}_g, \mathcal {S}_c)}\mathrm{d}\mathcal {S}, \end{aligned}$$
(1.23)

for some functions \(\mathsf {g}(\cdot ), \mathsf {f}_c(\cdot )\) and \(\mathsf {f}_g(\cdot )\), where we used the abbreviation \(\mathcal {S}:=\{\mathcal {S}_i\}_{i=1}^W\), and \(\mathcal {S}_c\) and \(\mathcal {S}_g\) represent the collection of all complex variables and Grassmann variables in \(\mathcal {S}\), respectively. Here, \(\mathsf {g}(\mathcal {S}_c)\) and \(\mathsf {f}_c(\mathcal {S}_c)\) are some complex functions and \(\mathsf {f}_g(\mathcal {S}_g, \mathcal {S}_c)\) will mostly be regarded as a function of the Grassmann variables with the complex variables as its parameters. The number of variables (either complex or Grassmann) in the final integral representation then turns out to be of order W, which is much smaller than the original order N. In fact, in [20] it is assumed that \(W=O(1)\), although the author also mentions the possibility of dealing with the case \(W\sim N^{\varepsilon }\) for some small positive \(\varepsilon \), see the remark below Theorem 1 therein.

Performing a saddle point analysis for the complex measure \(\exp \{M\mathsf {f}_c(\mathcal {S}_c)\}\), one can restrict the integral to a small vicinity of some saddle point, say, \(\mathcal {S}_c=\mathcal {S}_{c0}\). It turns out that \(\mathsf {f}_c(\mathcal {S}_{c0})=0\) and \(\mathsf {f}_c(\mathcal {S}_c)\) decays quadratically away from \(\mathcal {S}_{c0}\). Consequently, by plugging in the saddle point \(\mathcal {S}_{c0}\), one can estimate \(\mathsf {g}(\mathcal {S}_c)\) by \(\mathsf {g}(\mathcal {S}_{c0})\) directly. However, \(\exp \{M\mathsf {f}_c(\mathcal {S}_c)\}\) and \(\exp \{\mathsf {f}_g(\mathcal {S}_g, \mathcal {S}_c)\}\) must be expanded around the saddle point. Roughly speaking, in some vicinity of \(\mathcal {S}_{c0}\), one finds that the expansions read

$$\begin{aligned} e^{M\mathsf {f}_c(\mathcal {S}_c)}=\exp \{-\mathbf {u}'\mathbb {A}\mathbf {u}+\mathsf {e}_c(\mathbf {u})\},\quad e^{\mathsf {f}_g(\mathcal {S}_g, \mathcal {S}_c)}=\exp \{-\varvec{\rho }'\mathbb {H}\varvec{\tau }\}\mathsf {p}(\varvec{\rho },\varvec{\tau },\mathbf {u}). \quad \quad \end{aligned}$$
(1.24)

Here \(\mathbf {u}\) is a real vector of dimension O(W), which is essentially a vectorization of \(\sqrt{M}(\mathcal {S}_c-\mathcal {S}_{c0})\); \(\mathsf {e}_c(\mathbf {u})=o(1)\) is some error term; \(\varvec{\rho }\) and \(\varvec{\tau }\) are two Grassmann vectors of dimension O(W). \({\mathbb {H}}\) is a complex matrix (cf. (9.26)), and \(\mathbb {A}\) is a complex matrix with positive-definite Hermitian part (the explicit form of \(\mathbb {A}\) can be read off from (8.30)). Moreover, \(\mathbb {A}\) is closely related to \(\mathbb {H}\) in the sense that the determinant of a certain minor of \(\mathbb {H}\) (after two rows and two columns are removed) is proportional to the square root of the determinant of \( \mathbb {A}\), up to trivial factors. In addition, \(\mathsf {p}(\varvec{\rho },\varvec{\tau },\mathbf {u})\) is the expansion of \(\exp \{\mathsf {f}_g(\mathcal {S}_g, \mathcal {S}_c)-\mathsf {f}_g(\mathcal {S}_g, \mathcal {S}_{c0})\}\), which possesses the form

$$\begin{aligned} \mathsf {p}(\varvec{\rho },\varvec{\tau },\mathbf {u})=\sum _{\ell =0}^{O(W)} M^{-\frac{\ell }{2}}\mathsf {p}_\ell (\varvec{\rho },\varvec{\tau },\mathbf {u}), \end{aligned}$$
(1.25)

where \(\mathsf {p}_\ell (\varvec{\rho },\varvec{\tau },\mathbf {u})\) is a polynomial of the components of \(\varvec{\rho }\) and \(\varvec{\tau }\) with degree \(2\ell \), regarding \(\mathbf {u}\) as fixed parameters. Now, keeping the leading order term of \(\mathsf {p}(\varvec{\rho },\varvec{\tau },\mathbf {u})\), and discarding the remainder terms, one can get the final estimate of the integral by taking the Gaussian integral over \(\mathbf {u}, \varvec{\rho }\) and \(\varvec{\tau }\). This completes the summary of [20].
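For orientation, we record schematically the Grassmann Gaussian integration formulas behind (1.24) and (1.25); the precise normalization and sign conventions will be fixed in Sect. 3. For a complex \(r\times r\) matrix \(\mathbb {H}\) and Grassmann vectors \(\varvec{\rho }=(\rho _1,\ldots ,\rho _r)\) and \(\varvec{\tau }=(\tau _1,\ldots ,\tau _r)\), one has, up to sign conventions,

$$\begin{aligned} \int e^{-\varvec{\rho }'\mathbb {H}\varvec{\tau }}\prod _{k=1}^r\mathrm{d}\rho _k\mathrm{d}\tau _k=\det \mathbb {H},\qquad \int \tau _i\rho _j\, e^{-\varvec{\rho }'\mathbb {H}\varvec{\tau }}\prod _{k=1}^r\mathrm{d}\rho _k\mathrm{d}\tau _k=\pm \det \mathbb {H}^{(j|i)}, \end{aligned}$$

so that integrating a polynomial in the Grassmann variables against \(\exp \{-\varvec{\rho }'\mathbb {H}\varvec{\tau }\}\) produces determinants of minors of \(\mathbb {H}\); this is the source of the factors \(\det \mathbb {H}^{(\mathsf {I}|\mathsf {J})}\) appearing in the discussion below.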

Similarly to [20], we also use the superbosonization formula to reduce the number of variables and perform the saddle point analysis on the resulting integral. However, owing to the following three main aspects, our analysis is significantly different from [20].

  • (Different observable) Our objective is to compute high moments of the single entry of the Green’s function. By using Wick’s formula (see Proposition 3.1), we express \(\mathbb {E}|G_{jk}|^{2n}\) in terms of a superintegral of some superfunction of the form

    $$\begin{aligned} \tilde{F}\left( \{\varPsi _{a,j},\varPsi ^*_{a,j},\varPhi _{a,j}, \varPhi ^*_{a,j}\}_{\begin{array}{c} a=1,2;\\ j=1,\ldots ,W \end{array}}\right) :=\left( \bar{\phi }_{1,q,\beta }\phi _{1,p,\alpha }\bar{\phi }_{2,p,\alpha }\phi _{2,q,\beta }\right) ^nF(\{\breve{\mathcal {S}}_i\}_{i=1}^W) \end{aligned}$$

    for some \(p,q\in \{1,\ldots , W\}\) and \(\alpha ,\beta \in \{1,\ldots ,M\}\), where \(\phi _{1,p,\alpha }\) is the \(\alpha \)th coordinate of \(\varPhi _{1,p}\), and the others are defined analogously. Unlike the case in [20], \(\tilde{F}\) is not a function of \(\{\breve{\mathcal {S}}_i\}_{i=1}^W\) only. Hence, using the superbosonization formula to change \(\breve{\mathcal {S}}_i\) to \(\mathcal {S}_i\) directly is not feasible in our case. In order to handle the factor \(\big (\bar{\phi }_{1,q,\beta }\phi _{1,p,\alpha }\bar{\phi }_{2,p,\alpha }\phi _{2,q,\beta }\big )^n\), the main idea is to split off certain rank-one supermatrices from \(\breve{\mathcal {S}}_p\) and \(\breve{\mathcal {S}}_q\) such that this factor can be expressed in terms of the entries of these rank-one supermatrices. Then we use the superbosonization formula not only in the nonsingular case from [18] but also in the singular case from [3] to change and reduce the variables, resulting in the final integral representation of \(\mathbb {E}|G_{jk}|^{2n}\). Though this final integral representation, very schematically, is still of the form (1.23), due to the decomposition of the supermatrices \(\breve{\mathcal {S}}_p\) and \(\breve{\mathcal {S}}_q\), it is considerably more complicated than its counterpart in [20]. In particular, the function \(\mathsf {g}(\mathcal {S}_c)\) differs from its counterpart in [20], and its estimate at the saddle point follows from a different argument.

  • (Small band width) In [20], the author considers the case where the band width M is comparable with N, i.e. the number of blocks W is finite. Though the derivation of the 2-point correlation function is highly nontrivial even with such a large band width, our objective, the local semicircle law and delocalization of the eigenvectors, can be proved in the case \(M\sim N\) in a similar manner as for the Wigner matrix (\(M=N\)), see [12, 14]. In this work, we deal with a much smaller band width in order to go beyond the results in [12, 14], see Assumption 1.3. Several main difficulties stemming from a narrow band width can be heuristically explained as follows.

At first, let us focus on the integral over the small vicinity of the saddle point, in which the exponential functions in the integrand in (1.23) approximately look like (1.24).

We regard the first term in (1.24) as a complex Gaussian measure, of dimension O(W). When \(W\sim 1\), one can discard the error term \(\mathsf {e}_c(\mathbf {u})\) directly and perform the Gaussian integral over \(\mathbf {u}\), due to the fact that \(\int \mathrm{d}\mathbf {u}\exp \{-\mathbf {u}'\mathsf {Re}(\mathbb {A})\mathbf {u}\}|\mathsf {e}_c(\mathbf {u})|=o(1)\). However, such an estimate is not allowed when \(W\sim N^{\varepsilon }\) (say), because the normalization of the measure \(\exp \{-\mathbf {u}'\mathsf {Re}(\mathbb {A})\mathbf {u}\}\) might be exponentially larger than that of \(\exp \{-\mathbf {u}'\mathbb {A}\mathbf {u}\}\). In order to handle this issue, in Sect. 8.2, we perform a second deformation of the contours of the variables in \(\mathbf {u}\), following the steepest descent paths exactly, whereby we can transform the complex Gaussian measure into a real one (cf. (8.45)); thus the error term of the integral can be controlled.

Now, we turn to the second term in (1.24). When \(W\sim 1\), there are only finitely many Grassmann variables. Hence, the complex coefficient of each term in the polynomial \(\mathsf {p}(\varvec{\rho }, \varvec{\tau },\mathbf {u})\), which is of order \(M^{-\ell /2}\) for some \(\ell \in \mathbb {N}\) (see (1.25)), actually controls the magnitude of the integral of this term against the Gaussian measure \(\exp \{-\varvec{\rho }'\mathbb {H}\varvec{\tau }\}\). Consequently, in the case \(W\sim 1\), it suffices to keep the leading order term (according to \(M^{-\ell /2}\)); one may discard the others trivially and compute the Gaussian integral over \(\varvec{\rho }\) and \(\varvec{\tau }\) explicitly. However, when \(W\sim N^{\varepsilon }\) (say), in light of Wick's formula (3.2) and the fact that the coefficients are of order \(M^{-\ell /2}\), the order of the integral of each term of \(\mathsf {p}(\varvec{\rho }, \varvec{\tau },\mathbf {u})\) against the Gaussian measure reads \(M^{-\ell /2}\det \mathbb {H}^{(\mathsf {I}|\mathsf {J})}\) for some index sets \(\mathsf {I}\) and \(\mathsf {J}\) and some \(\ell \in \mathbb {N}\). Since \(W\sim N^{\varepsilon }\), \(\det \mathbb {H}^{(\mathsf {I}|\mathsf {J})}\) is typically exponential in W. Hence, it is much more complicated to determine and compare the orders of the integrals of all \(e^{O(W)}\) terms. In Sect. 9.1, in particular using Assumption 1.1 (iii) and Lemma 9.4, we perform a unified estimate for the integrals of all the terms, rather than simply estimating them by \(M^{-\ell /2}\).

In addition, the analysis of the integral away from the vicinity of the saddle point in our work is also quite different from [20]. Actually, the integral over the complement of the vicinity can be trivially ignored in [20], since each factor in the integrand of (1.23) is of order 1; thus gaining any o(1) factor for the integrand outside the vicinity is enough for the estimate. However, in our case, either \(\exp \{M\mathsf {f}_c(\mathcal {S}_c)\}\) or \(\int \mathrm{d} \mathcal {S}_g \exp \{\mathsf {f}_g(\mathcal {S}_g, \mathcal {S}_c)\}\) is essentially exponential in W. This fact forces us to provide an a priori bound for \(\int \mathrm{d} \mathcal {S}_g \exp \{\mathsf {f}_g(\mathcal {S}_g, \mathcal {S}_c)\}\) in the full domain of \(\mathcal {S}_c\) rather than in the vicinity of the saddle point only. This step will be done in Sect. 6. In addition, in Sect. 7, an analysis of the tail behavior of the measure \(\exp \{M\mathsf {f}_c(\mathcal {S}_c)\}\) will also be performed, in order to control the integral away from the vicinity of the saddle point.

  • (General variance profile \(\widetilde{S}\)) In [20], the author considered the special case \(S=a\varDelta \) with \(a<1/(4\mathsf {d})\). We generalize the discussion to more general weighted Laplacians S satisfying Assumption 1.1, which, as a special case, includes the standard Laplacian \(\varDelta \) for any fixed dimension \(\mathsf {d}\).

1.4 Notation and organization

Throughout the paper, we will need some notation. First, we conventionally use U(r) to denote the unitary group of degree r; in addition, U(1, 1) denotes the group of \(2\times 2\) matrices Q obeying

$$\begin{aligned} Q^*\left( \begin{array}{cc} 1 &{}\quad 0\\ 0 &{}\quad -1 \end{array}\right) Q=\left( \begin{array}{cc} 1 &{}\quad 0\\ 0 &{}\quad -1 \end{array}\right) . \end{aligned}$$

Furthermore, we denote

$$\begin{aligned} \mathring{U}(r)=U(r)/U(1)^r,\quad \mathring{U}(1,1)=U(1,1)/U(1)^2. \end{aligned}$$
(1.26)

Recalling the real part E of z, we will frequently need the following two parameters

$$\begin{aligned} a_+=\frac{\mathbf {i}E+\sqrt{4-E^2}}{2},\quad a_-=\frac{\mathbf {i}E-\sqrt{4-E^2}}{2}. \end{aligned}$$
(1.27)

Correspondingly, we define the following four matrices

$$\begin{aligned}&D_{\pm }=\text {diag}(a_+,a_-), \quad D_{\mp }=\text {diag}(a_-,a_+),\quad D_{+}=\text {diag}(a_+,a_+),\nonumber \\&\quad D_{-}=\text {diag}(a_-,a_-). \end{aligned}$$
(1.28)

We remark here \(D_\pm \) does not mean “\(D_+\) or \(D_-\)”. In addition, we introduce the matrix

$$\begin{aligned} \mathfrak {I}=\bigg (\begin{array}{cc} 0 &{}\quad 1\\ 1&{}\quad 0 \end{array}\bigg ). \end{aligned}$$
(1.29)

For simplicity, we introduce the following notation for some domains used throughout the paper.

$$\begin{aligned}&\mathbb {I}:=[0,1], \quad \mathbb {L}:=[0,2\pi ),\quad \varSigma :\,\, \text {unit circle}, \quad \mathbb {R}_+:=[0,\infty ),\nonumber \\&\quad \mathbb {R}_-:=-\mathbb {R}_+,\quad \varGamma :=a_+\mathbb {R}_+. \end{aligned}$$
(1.30)

For some \(\ell \times \ell \) Hermitian matrix A, we use \(\lambda _1(A)\le \cdots \le \lambda _\ell (A)\) to represent its ordered eigenvalues. For some possibly N-dependent parameter set \(\mathsf {U}_N\), and two families of complex functions \(\{a_N(u): N\in \mathbb {N}, u\in \mathsf {U}_N\}\) and \(\{b_N(u): N\in \mathbb {N}, u\in \mathsf {U}_N\}\), if there exists a positive constant \(C>1\) such that \(C^{-1}|b_N(u)|\le |a_N(u)|\le C |b_N(u)|\) holds uniformly in N and u, we write \(a_N(u)\sim b_N(u)\). Conventionally, we use \(\{\mathbf {e}_i:i=1,\ldots , \ell \}\) to denote the standard basis of \(\mathbb {R}^{\ell }\), in which the dimension \(\ell \) has been suppressed for simplicity. For some real quantities a and b, we use \(a\wedge b\) and \(a\vee b\) to represent \(\min \{a,b\}\) and \(\max \{a,b\}\), respectively.

Throughout the paper, \(c, c', c_1, c_2, C, C', C_1, C_2\) represent some generic positive constants that are possibly n-dependent and may differ from line to line. In contrast, we use \(C_0\) to denote some generic positive constant independent of n.

The paper is organized as follows. In Sect. 2, we prove Theorems 1.9 and 1.12, given Theorem 1.15. The proof of Theorem 1.15 will be carried out in Sects. 3–11. More specifically, in Sect. 3, we use the supersymmetric formalism to represent \(\mathbb {E}|G_{ij}|^{2n}\) in terms of a superintegral, in which the integrand can be factorized into several functions; Sect. 4 is devoted to a preliminary analysis of these functions; Sects. 5–10 are responsible for different steps of the saddle point analysis, whose organization will be further clarified at the end of Sect. 5; Sect. 11 is devoted to the final proof of Theorem 1.15, by summing up the discussions in Sects. 3–10. In Sect. 12, we comment on how to remove the prefactor \(N^{C_0}\) in (1.21). At the end of the paper, we also collect some frequently used symbols in a table, for the convenience of the reader.

2 Proofs of Theorem 1.9 and Theorem 1.12

Assuming Theorem 1.15, we prove Theorems 1.9 and 1.12 in this section. First, (1.21) is generalized to matrices with general entry distributions satisfying the four moment matching condition, via the Green's function comparison strategy.

Lemma 2.1

Assume that H is a random block band matrix, satisfying Assumptions 1.1, 1.5 and 1.14. Let \(\kappa \) be an arbitrarily small positive constant and \(\varepsilon _2\) be any sufficiently small positive constant. There is \(N_0=N_0(n)\), such that for all \(N\ge N_0\) and all \(z\in \mathbf {D}(N,\kappa ,\varepsilon _2)\), we have

$$\begin{aligned} \mathbb {E}|G_{ab}(z)|^{2n} \le N^{C_0}\Big (\delta _{ab}+\frac{1}{(N\eta )^n}\Big ),\quad \forall \; a,b=1,\ldots , N \end{aligned}$$
(2.1)

for some positive constant \(C_0\) uniform in n and z.

By the definition of stochastic domination in Definition 1.6, we can get the following corollary immediately.

Corollary 2.2

Under the assumptions of Lemma 2.1, we have

$$\begin{aligned} |G_{ab}(z)|\prec \delta _{ab}+(N\eta )^{-\frac{1}{2}}, \quad \forall \; a,b=1,\ldots , N,\quad \forall \; z\in \mathbf {D}(N,\kappa ,\varepsilon _2). \end{aligned}$$
(2.2)

In the sequel, we first prove Lemma 2.1 from Theorem 1.15 via the Green's function comparison strategy. Then we prove Theorem 1.9, using Lemma 2.1. Finally, we show that Theorem 1.12 follows easily from Theorem 1.9.

2.1 Green’s function comparison: Proof of Lemma 2.1

To show (2.1), we use Lindeberg’s replacement strategy to compare the Green’s functions of the Gaussian case and the general case. That means, we will replace the entries of \(H^g\) by those of H one by one, and compare the Green’s functions step by step. Choose and fix a bijective ordering map

$$\begin{aligned} \varpi : \{(\imath ,\jmath ): 1\le \imath \le \jmath \le N\}\rightarrow \{1,\ldots , \varsigma (N)\},\quad \varsigma (N):={N(N+1)}/{2}. \end{aligned}$$
(2.3)

Then we use \(H_k\) to represent the \(N\times N\) random Hermitian matrix whose \((\imath ,\jmath )\)th entry is \(h_{\imath \jmath }\) if \(\varpi (\imath ,\jmath )\le k\), and is \(h^g_{\imath \jmath }\) otherwise. In particular, we have \(H_0=H^g\) and \(H_{\varsigma (N)}=H\). Correspondingly, we define the Green's functions by

$$\begin{aligned} G_k(z):=(H_k-z)^{-1},\quad k=0,1,\ldots , \varsigma (N). \end{aligned}$$
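Since \(H_0=H^g\) and \(H_{\varsigma (N)}=H\), the comparison of the general case with the Gaussian one reduces to the telescoping sum

$$\begin{aligned} \mathbb {E}|(G_{\varsigma (N)})_{\imath \jmath }(z)|^{2n}-\mathbb {E}|(G_{0})_{\imath \jmath }(z)|^{2n}=\sum _{k=1}^{\varsigma (N)}\Big (\mathbb {E}|(G_{k})_{\imath \jmath }(z)|^{2n}-\mathbb {E}|(G_{k-1})_{\imath \jmath }(z)|^{2n}\Big ), \end{aligned}$$

so it suffices to control each one-step difference; this is the content of Lemma 2.3 below.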

Fix k and denote

$$\begin{aligned} \varpi ^{-1}(k)=(a,b). \end{aligned}$$
(2.4)

Then, we write

$$\begin{aligned} H_{k-1}= & {} H_k^0+\mathsf {V}_{ab},\quad \mathsf {V}_{ab}:=\Big (1-\frac{\delta _{ab}}{2}\Big )\big (h_{ab}^g \mathbf {e}_a\mathbf {e}_b^*+h_{ba}^g\mathbf {e}_b\mathbf {e}_a^*\big ),\\ H_k= & {} H_k^0+\mathsf {W}_{ab}, \quad \mathsf {W}_{ab}:=\Big (1-\frac{\delta _{ab}}{2}\Big )\big (h_{ab} \mathbf {e}_a\mathbf {e}_b^*+h_{ba}\mathbf {e}_b\mathbf {e}_a^*\big ), \end{aligned}$$

where \(H_k^0\) is obtained by replacing \(h_{ab}\) and \(h_{ba}\) with 0 in \(H_k\) (or replacing \(h^g_{ab}\) and \(h^g_{ba}\) with 0 in \(H_{k-1}\)). In addition, we denote

$$\begin{aligned} G_k^0(z)=(H_k^0-z)^{-1}. \end{aligned}$$

Set \(\varepsilon _3\equiv \varepsilon _3(\gamma ,\varepsilon _1)\) to be a sufficiently small positive constant, satisfying (say)

$$\begin{aligned} \varepsilon _3\le \frac{1}{100}\cdot \frac{\varepsilon _1}{5+2\gamma +\varepsilon _1}, \end{aligned}$$
(2.5)

where \(\gamma \) is from Assumption 1.1 (iii) and \(\varepsilon _1\) is from (1.8). For simplicity, we introduce the following parameters for \(\ell =1,\ldots , \varsigma (N)\) and \(\imath ,\jmath =1,\ldots , N\),

$$\begin{aligned}&\widehat{\varTheta }_0:=N^{C_0},\nonumber \\&\quad \widehat{\varTheta }_{\ell ,\imath \jmath }:=\widehat{\varTheta }_0\left( 1+C\left( \frac{N^{\varepsilon _3}}{\sqrt{M}}\right) ^{5}\right) ^\ell \prod _{\varpi (a,b)\le \ell }\left( 1+C\delta _{\{\imath ,\jmath \}\{a,b\}}\left( \frac{N^{\varepsilon _3}\sqrt{N\eta }}{\sqrt{M}}\right) ^5\right) ,\nonumber \\ \end{aligned}$$
(2.6)

where C is a positive constant. Here we used the notation \(\delta _{\mathsf {I}\mathsf {J}}=1\) if two index sets \(\mathsf {I}\) and \(\mathsf {J}\) are the same and \(\delta _{\mathsf {I}\mathsf {J}}=0\) otherwise. It is easy to see that for \(\eta \le M^{-1}N^{\varepsilon _2}\), we have

$$\begin{aligned} \widehat{\varTheta }_{\ell ,\imath \jmath }\le 2 \widehat{\varTheta }_0,\qquad \forall \; \ell =1,\ldots , \varsigma (N),\quad \imath ,\jmath =1,\ldots , N, \end{aligned}$$
(2.7)

by using (1.9). Now, we compare \(G_{k-1}(z)\) and \(G_k(z)\). We will prove the following lemma.

Lemma 2.3

Suppose that the assumptions in Lemma 2.1 hold. Additionally, we assume that for some sufficiently small positive constant \(\varepsilon _3\) satisfying (2.5),

$$\begin{aligned}&|(G_\ell )_{\imath \jmath }(z)|\prec N^{\varepsilon _3},\quad |(G_\ell ^0)_{\imath \jmath }(z)|\prec N^{\varepsilon _3},\quad \forall \; \ell =1,\ldots , \varsigma (N), \quad \forall \; \imath ,\nonumber \\&\quad \jmath =1,\ldots , N,\quad \forall z\in \mathbf {D}(N,\kappa ,\varepsilon _2). \end{aligned}$$
(2.8)

Let \(n\in \mathbb {N}\) be any given integer. Then, if

$$\begin{aligned}&\mathbb {E}|(G_{k-1})_{\imath \jmath }(z)|^{2n}\le \widehat{\varTheta }_{k-1, \imath \jmath } \left( \delta _{\imath \jmath }+\frac{1}{(N\eta )^n}\right) ,\nonumber \\&\quad \forall \; \imath ,\jmath =1,\ldots , N,\quad \forall z\in \mathbf {D}(N,\kappa ,\varepsilon _2) \end{aligned}$$
(2.9)

we also have

$$\begin{aligned} \mathbb {E}|(G_{k})_{\imath \jmath }(z)|^{2n}\le \widehat{\varTheta }_{k,\imath \jmath }\left( \delta _{\imath \jmath }+\frac{1}{(N\eta )^n}\right) , \quad \forall \; \imath ,\jmath =1,\ldots , N,\quad \forall z\in \mathbf {D}(N,\kappa ,\varepsilon _2) \nonumber \\ \end{aligned}$$
(2.10)

for any \(k=1,\ldots , \varsigma (N)\).

Proof of Lemma 2.3

Fix k and omit the argument z from now on; all formulas are understood to hold for all \(z\in \mathbf {D}(N,\kappa ,\varepsilon _2) \). First, under the conditions (2.8) and (2.9), we show that

$$\begin{aligned} \mathbb {E}|(G_k^0)_{\imath \jmath }|^{2n}\le 3\widehat{\varTheta }_0\left( \delta _{\imath \jmath }+\frac{1}{(N\eta )^n}\right) , \quad \forall \; \imath ,\jmath =1,\ldots , N. \end{aligned}$$
(2.11)

To see this, recalling (2.4), we use the resolvent expansion

$$\begin{aligned} (G_k^0)_{\imath \jmath }=(G_{k-1})_{\imath \jmath }+(G_{k-1} \mathsf {V}_{ab}G_k^0)_{\imath \jmath }, \end{aligned}$$

which implies that for a sufficiently small \(\varepsilon '>0\) and a sufficiently large constant \(D>0\)

$$\begin{aligned} \mathbb {E}|(G_k^0)_{\imath \jmath }|^{2n}&\le \mathbb {E}\left( |(G_{k-1})_{\imath \jmath }|+\frac{N^{2\varepsilon _3+\varepsilon '}}{\sqrt{M}}\right) ^{2n}+\eta ^{-2n}N^{-D}\nonumber \\&=\sum _{\ell =0}^{2n} \left( {\begin{array}{c}2n\\ \ell \end{array}}\right) \bigg (\frac{N^{2\varepsilon _3+\varepsilon '}}{\sqrt{M}}\bigg )^{\ell }\mathbb {E} |(G_{k-1})_{\imath \jmath }|^{2n-\ell }+\eta ^{-2n}N^{-D} \end{aligned}$$
(2.12)

where the first step follows from (1.13), (2.8), Definition 1.6 and the trivial bound \(\eta ^{-1}\) for the Green's functions. Now, using (2.9), (2.7) and Hölder's inequality, we have

$$\begin{aligned} \mathbb {E} |(G_{k-1})_{\imath \jmath }|^{2n-\ell }\le 2\widehat{\varTheta }_0 \left( \delta _{\imath \jmath }+\frac{1}{(N\eta )^{n-\frac{\ell }{2}}}\right) ,\quad 0\le \ell \le 2n. \end{aligned}$$
(2.13)
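In detail, setting \(\theta :=(2n-\ell )/(2n)\in [0,1]\), Jensen's inequality together with (2.9) and (2.7) gives

$$\begin{aligned} \mathbb {E} |(G_{k-1})_{\imath \jmath }|^{2n-\ell }\le \Big (\mathbb {E} |(G_{k-1})_{\imath \jmath }|^{2n}\Big )^{\theta }\le \big (2\widehat{\varTheta }_0\big )^{\theta }\Big (\delta _{\imath \jmath }+\frac{1}{(N\eta )^{n}}\Big )^{\theta }\le 2\widehat{\varTheta }_0\Big (\delta _{\imath \jmath }+\frac{1}{(N\eta )^{n-\frac{\ell }{2}}}\Big ), \end{aligned}$$

where the last step uses \(2\widehat{\varTheta }_0\ge 1\) and \((\delta +x)^{\theta }\le \delta +x^{\theta }\) for \(\delta \in \{0,1\}\) and \(x\ge 0\).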

In addition, for sufficiently small \(\varepsilon '\), it is easy to check that there exists a constant \(c>0\) such that

$$\begin{aligned} \frac{N^{2\varepsilon _3+\varepsilon '}}{\sqrt{M}}\le N^{-c}\frac{1}{(N\eta )^{\frac{1}{2}}} \end{aligned}$$
(2.14)

for \(z\in \mathbf {D}(N,\kappa ,\varepsilon _2)\), in light of the fact that \(M\gg N^{\frac{4}{5}}\), cf. (1.9). Substituting (2.13) and (2.14) into (2.12) and choosing D to be sufficiently large, we can easily get the bound (2.11).

Now, recall (2.4) again and expand \(G_{k-1}(z)\) and \(G_k(z)\) around \(G_k^0(z)\), namely

$$\begin{aligned} G_{k-1}&= G_{k}^0+\sum _{\ell =1}^m(-1)^\ell (G_k^0 \mathsf {V}_{ab})^{\ell } G_k^0+(-1)^{m+1}(G_k^0 \mathsf {V}_{ab})^{m+1}G_{k-1},\nonumber \\ G_k&= G_k^0+\sum _{\ell =1}^m(-1)^\ell (G_k^0 \mathsf {W}_{ab})^{\ell } G_k^0+(-1)^{m+1}(G_k^0 \mathsf {W}_{ab})^{m+1}G_k. \end{aligned}$$
(2.15)

We always choose m to be sufficiently large, depending on \(\varepsilon _3\) but independent of N. Then, we can write

$$\begin{aligned} (G_{k-1})_{\imath \jmath }&=(G_k^0)_{\imath \jmath }+\sum _{\ell =1}^m \mathsf {R}_{\ell ,\imath \jmath }+\widetilde{\mathsf {R}}_{m+1,\imath \jmath },\quad \nonumber \\ (G_k)_{\imath \jmath }&=(G_k^0)_{\imath \jmath }+\sum _{\ell =1}^m\mathsf {S}_{\ell , \imath \jmath }+\widetilde{\mathsf {S}}_{m+1,\imath \jmath }, \end{aligned}$$
(2.16)

where

$$\begin{aligned}&\mathsf {R}_{\ell ,\imath \jmath }:=(-1)^{\ell }\Big ((G_k^0 \mathsf {V}_{ab})^{\ell } G_k^0\Big )_{\imath \jmath },\quad \mathsf {S}_{\ell ,\imath \jmath }:=(-1)^{\ell }\Big ((G_k^0 \mathsf {W}_{ab})^{\ell } G_k^0\Big )_{\imath \jmath },\quad \ell =1,\ldots ,m, \nonumber \\&\widetilde{\mathsf {R}}_{m+1,\imath \jmath }:=(-1)^{m+1}\Big ((G_k^0 \mathsf {V}_{ab})^{m+1}G_{k-1}\Big )_{\imath \jmath },\quad \widetilde{\mathsf {S}}_{m+1,\imath \jmath }:=(-1)^{m+1}\Big ((G_k^0 \mathsf {W}_{ab})^{m+1}G_k\Big )_{\imath \jmath }. \nonumber \\ \end{aligned}$$
(2.17)

First, by taking m sufficiently large, from (2.8) and (1.13), we have the trivial bound

$$\begin{aligned} |\widetilde{\mathsf {R}}_{m+1,\imath \jmath }|, |\widetilde{\mathsf {S}}_{m+1,\imath \jmath }|\prec M^{-{\frac{m+1}{2}}}N^{(m+2)\varepsilon _3}\ll \frac{1}{M^3\sqrt{N\eta }}. \end{aligned}$$
(2.18)

For \(\mathsf {R}_{\ell ,\imath \jmath }\) and \(\mathsf {S}_{\ell ,\imath \jmath }\), we split the discussion into the off-diagonal case and the diagonal case. In the case \(\imath \ne \jmath \), we keep the first and the last factors of the terms in the expansions of \(((G_k^0 \mathsf {V}_{ab})^{\ell } G_k^0)_{\imath \jmath }\) and \(((G_k^0 \mathsf {W}_{ab})^{\ell } G_k^0)_{\imath \jmath }\), namely, \((G_k^0)_{\imath \jmath '}\) and \((G_k^0)_{\imath '\jmath }\) for some \(\imath ',\jmath '=a\) or b, and bound the factors in between by using (1.13) and (2.8), resulting in the bound

$$\begin{aligned} |\mathsf {R}_{\ell ,\imath \jmath }|, |\mathsf {S}_{\ell ,\imath \jmath }|\prec M^{-\frac{\ell }{2}}N^{(\ell -1)\varepsilon _3}\sum _{\imath ',\jmath '=a,b}|(G_k^0)_{\imath \jmath '}(G_k^0)_{\imath '\jmath }|,\quad \ell =1,\ldots ,m.\quad \quad \quad \end{aligned}$$
(2.19)

For \(\imath =\jmath \), we only keep the first factor of the terms in the expansions of \(((G_k^0 \mathsf {V}_{ab})^{\ell } G_k^0)_{\imath \imath }\) and \(((G_k^0 \mathsf {W}_{ab})^{\ell } G_k^0)_{\imath \imath }\), and bound the others by using (1.13) and (2.8), resulting in the bound

$$\begin{aligned} |\mathsf {R}_{\ell ,\imath \imath }|, |\mathsf {S}_{\ell ,\imath \imath }|\prec M^{-\frac{\ell }{2}}N^{\ell \varepsilon _3}\Big (|(G_k^0)_{\imath a}|+|(G_k^0)_{\imath b}|\Big ),\quad \ell =1,\ldots ,m. \end{aligned}$$
(2.20)

Observe that, in case \(\imath \ne \jmath \), if \(\{\imath ,\jmath \}\ne \{a,b\}\), at least one of \((G_k^0)_{\imath \jmath '}\) and \((G_k^0)_{\imath '\jmath }\) is an off-diagonal entry of \(G_k^0\) for \(\imath ',\jmath '=a\) or b.

Now we compare the 2nth moments of \(|(G_{k-1})_{\imath \jmath }|\) and \(|(G_k)_{\imath \jmath }|\). First, we write

$$\begin{aligned} \mathbb {E}|(G_d)_{\imath \jmath }|^{2n}=\mathbb {E}((G_d)_{\imath \jmath })^n(\overline{(G_d)_{\imath \jmath }})^n,\quad d=k-1,k. \end{aligned}$$
(2.21)

By substituting the expansion (2.16) into (2.21), we can write

$$\begin{aligned} \mathbb {E}|(G_d)_{\imath \jmath }|^{2n}=\mathbf {A}(\imath ,\jmath )+\mathbf {R}_{d}(\imath ,\jmath ), \quad d=k-1,k, \end{aligned}$$
(2.22)

where \(\mathbf {A}(\imath ,\jmath )\) is the sum of the terms which depend only on \(H_k^0\) and the first four moments of \(h_{ab}\), and \(\mathbf {R}_d(\imath ,\jmath )\) is the sum of all the other terms. We claim that \(\mathbf {R}_d(\imath ,\jmath )\) satisfies the bound

$$\begin{aligned} |\mathbf {R}_d(\imath ,\jmath )|\le C\widehat{\varTheta }_0\left( \frac{N^{\varepsilon _3}}{\sqrt{M}}\right) ^5\left( \delta _{\imath \jmath }+\frac{\delta _{\{\imath ,\jmath \}\{a,b\}}}{(N\eta )^{n-\frac{5}{2}}}+\frac{1}{(N\eta )^n}\right) ,\quad d=k-1,k, \nonumber \\ \end{aligned}$$
(2.23)

for some positive constant C independent of n. Now, we verify (2.23). According to (2.11) and the fact that the sequence \(\mathsf {R}_{1,\imath \jmath },\ldots , \mathsf {R}_{m,\imath \jmath }, \widetilde{\mathsf {R}}_{m+1,\imath \jmath }\), as well as \(\mathsf {S}_{1,\imath \jmath },\ldots , \mathsf {S}_{m,\imath \jmath }, \widetilde{\mathsf {S}}_{m+1,\imath \jmath }\), decreases by a factor \(N^{\varepsilon _3}/\sqrt{M}\) in magnitude, it is not difficult to check that the leading order terms of \(\mathbf {R}_{k-1}(\imath ,\jmath )\) are of the form

$$\begin{aligned} \mathbb {E}\left( (G_k^0)_{\imath \jmath }\right) ^{p}\left( \overline{(G_k^0)_{\imath \jmath }}\right) ^{2n-p-\sum _{\ell =1}^5(q_\ell +q'_\ell )} \prod _{\ell =1}^{5}\mathsf {R}_{\ell ,\imath \jmath }^{q_{\ell }} \bar{\mathsf {R}}_{\ell ,\imath \jmath }^{q'_\ell }, \end{aligned}$$
(2.24)

with some \(p,q_\ell ,q'_\ell \in \mathbb {N}\) such that

$$\begin{aligned} \sum _{\ell =1}^5\ell (q_\ell +q'_\ell )=5,\quad 0\le p\le 2n-\sum _{\ell =1}^5(q_\ell +q'_\ell ), \end{aligned}$$
(2.25)

and the leading order terms of \(\mathbf {R}_{k}(\imath ,\jmath )\) possess the same form (with \(\mathsf {R}\) replaced by \(\mathsf {S}\)). Every other term has at least 6 factors of \(h_{ab}\), \(h_{ab}^g\) or their conjugates; hence its size is typically controlled by \(M^{-3}(N\eta )^{-n}\), i.e. it is subleading. Hence, it suffices to bound (2.24).
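For orientation, the constraint \(\sum _{\ell =1}^5\ell (q_\ell +q'_\ell )=5\) in (2.25) says that the indices \(\ell \), counted with multiplicities \(q_\ell +q'_\ell \), form a partition of 5; the possible patterns are

$$\begin{aligned} 5,\quad 4+1,\quad 3+2,\quad 3+1+1,\quad 2+2+1,\quad 2+1+1+1,\quad 1+1+1+1+1, \end{aligned}$$

so the total number \(\sum _{\ell =1}^5(q_\ell +q'_\ell )\) of \(\mathsf {R}\)-factors in (2.24) ranges from 1 to 5.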

Now, the five factors of \(h_{ab}\) or \(h_{ba}\) within the \(\mathsf {R}_{\ell ,\imath \jmath }\)’s in (2.24) are independent of the rest and can be estimated by \(M^{-5/2}\). For the remaining factors from \(G^0_k\), we use (2.11) to bound \(2n\) of them and use (2.8) to bound the rest. In the case that \(\imath \ne \jmath \) and \(\{\imath ,\jmath \}\ne \{a,b\}\), by the discussion above, we must have an off-diagonal entry of \(G_k^0\) in the product \((G_k^0)_{\imath \jmath '} (G_k^0)_{\imath ' \jmath }\) for any choice of \(\imath ',\jmath '=a\) or b. Then, in the bound for \(\mathsf {R}_{\ell ,\imath \jmath }\) in (2.19), for each \((G_k^0)_{\imath \jmath '} (G_k^0)_{\imath '\jmath }\), we keep the off-diagonal entry and bound the other factor by \(N^{\varepsilon _3}\), using assumption (2.8). Hence, by using (2.19) and (2.25), we see that for some \(\imath _r,\jmath _r\in \{\imath ,\jmath ,a,b\}\) with \(\imath _r\ne \jmath _r\), \(r=1,\ldots , \sum (q_\ell +q'_\ell )\), the following bound holds

$$\begin{aligned} (2.24)&\le \left( \frac{N^{\varepsilon _3}}{\sqrt{M}}\right) ^{5}\mathbb {E}\left( |(G_k^0)_{\imath \jmath }|^{2n-\sum _{\ell =1}^5(q_\ell +q'_\ell )}\prod _{r=1}^{\sum _{\ell =1}^5(q_\ell +q'_\ell )}|(G_k^0)_{\imath _r \jmath _r}|\right) \nonumber \\&\le 3\left( \frac{N^{\varepsilon _3}}{\sqrt{M}}\right) ^{5}\frac{\widehat{\varTheta }_0}{(N\eta )^{n}}, \end{aligned}$$
(2.26)

where the last step follows from (2.11) and Hölder’s inequality. In the case \(\imath \ne \jmath \) but \(\{\imath ,\jmath \}=\{a,b\}\), we keep one entry in the product \((G_k^0)_{\imath \jmath '} (G_k^0)_{\imath ' \jmath }\) and bound the other by \(N^{\varepsilon _3}\). We remark that in this case the entry we keep can be either diagonal or off-diagonal. Consequently, for some \(\imath _r,\jmath _r\in \{\imath ,\jmath ,a,b\}\), \(r=1,\ldots ,\sum (q_\ell +q'_\ell )\), we have the bound

$$\begin{aligned} (2.24)&\le \left( \frac{N^{\varepsilon _3}}{\sqrt{M}}\right) ^5\mathbb {E}\left( |(G_k^0)_{\imath \jmath }|^{2n-\sum _{\ell =1}^5(q_\ell +q'_\ell )}\prod _{r=1}^{\sum _{\ell =1}^5(q_\ell +q'_\ell )}|(G_k^0)_{\imath _r \jmath _r}|\right) \nonumber \\&\le 3\left( \frac{N^{\varepsilon _3}}{\sqrt{M}}\right) ^{5}\frac{\widehat{\varTheta }_0}{(N\eta )^{n-\frac{5}{2}}}, \end{aligned}$$
(2.27)

by using (2.11) and Hölder’s inequality again. Hence, we have shown (2.23) in the case \(\imath \ne \jmath \). For \(\imath =\jmath \), one shows analogously that

$$\begin{aligned} (2.24)\le 3\left( \frac{N^{\varepsilon _3}}{\sqrt{M}}\right) ^{5}\widehat{\varTheta }_0 \end{aligned}$$
(2.28)

by using (2.11), (2.20) and Hölder’s inequality. Hence, we verified (2.23). Consequently, by Assumption 1.5, (2.22) and (2.23) we have

$$\begin{aligned} \Big |\mathbb {E}|(G_{k-1})_{\imath \jmath }|^{2n}-\mathbb {E}|(G_k)_{\imath \jmath }|^{2n}\Big |\le C\widehat{\varTheta }_0\left( \frac{N^{\varepsilon _3}}{\sqrt{M}}\right) ^5\left( \delta _{\imath \jmath }+\frac{\delta _{\{\imath ,\jmath \}\{a,b\}}}{(N\eta )^{n-\frac{5}{2}}}+\frac{1}{(N\eta )^n}\right) , \end{aligned}$$

which, together with the assumption (2.9) for \(\mathbb {E}|(G_{k-1})_{\imath \jmath }|^{2n}\) and the definition of the \(\widehat{\varTheta }_{\ell ,\imath \jmath }\)’s in (2.6), yields (2.10). Hence, we completed the proof of Lemma 2.3. \(\square \)

To show (2.1), we also need the following lemma.

Lemma 2.4

Suppose that the assumptions in Lemma 2.1 hold. Fix the indices \(a,b\in \{1,\ldots, N\}\). Let \(H^0\) be the matrix obtained from H by replacing its (ab)th entry with 0. Then, if for some \(\eta _0\ge 1/N\) we have

$$\begin{aligned} |G_{\imath \imath }(z)|\prec 1,\quad |(G^0)_{\imath \imath }(z)|\prec 1 \quad \text {for} \quad \eta \ge \eta _0, \quad \forall \; \imath =1,\ldots ,N, \end{aligned}$$
(2.29)

then we also have

$$\begin{aligned} |G_{\imath \jmath }(z)|\prec \frac{\eta _0}{\eta },\quad |(G^0)_{\imath \jmath }(z)|\prec \frac{\eta _0}{\eta }, \quad \text {for}\quad \frac{1}{N}<\eta \le \eta _0, \quad \forall \; \imath ,\jmath =1,\ldots ,N. \end{aligned}$$

Proof of Lemma 2.4

The proof is almost the same as the discussion on pages 2311–2312 in [10]. For the convenience of the reader, we sketch it below. At first, according to the discussion below (4.28) in [10], for any \(\imath ,\jmath =1,\ldots , N\), we have

$$\begin{aligned} |G_{\imath \jmath }(E+\mathbf {i}\eta )|\le C\max _{\ell }\sum _{k\ge 0} \mathsf {Im}G_{\ell \ell }(E+\mathbf {i}2^k\eta ). \end{aligned}$$

Now, we set \(k_1:=\max \{k: 2^k\eta <\eta _0\}\) and \(k_2:=\max \{k: 2^k\eta <1\}\). According to our assumption, both \(k_1\) and \(k_2\) are of the order \(\log N\). Now, we have

$$\begin{aligned} \sum _{k\ge 0} \mathsf {Im}G_{\ell \ell }(E+\mathbf {i}2^k\eta )=&\sum _{k=0}^{k_1} \mathsf {Im}G_{\ell \ell }(E+\mathbf {i}2^k\eta )+\sum _{k=k_1}^{k_2} \mathsf {Im}G_{\ell \ell }(E+\mathbf {i}2^k\eta )\\&+\sum _{k=k_2+1}^{\infty } \mathsf {Im}G_{\ell \ell }(E+\mathbf {i}2^k\eta )\\&\prec \frac{\eta _0}{\eta }\sum _{k=0}^{k_1} \frac{1}{2^k}\mathsf {Im}G_{\ell \ell }(E+\mathbf {i}\eta _0)+(k_2-k_1)+1\prec \frac{\eta _0}{\eta } \end{aligned}$$

where in the second step, we used the fact that the function \(y\mapsto y\mathsf {Im}G_{\ell \ell } (E+\mathbf {i}y)\) is monotonically increasing, the condition (2.29) and the fact \(\eta \le \eta _0\). Hence, we conclude the proof of Lemma 2.4. \(\square \)
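For completeness, we note that the monotonicity used in the second step can be read off from the spectral decomposition (with \(\lambda _i\) and \(\mathbf {u}_i=(u_{i1},\ldots ,u_{iN})\) denoting the eigenvalues and normalized eigenvectors of H):

$$\begin{aligned} y\,\mathsf {Im}\, G_{\ell \ell }(E+\mathbf {i}y)=\sum _{i=1}^N\frac{|u_{i\ell }|^2y^2}{(\lambda _i-E)^2+y^2}, \end{aligned}$$

and each summand is nondecreasing in \(y>0\). In particular, for \(0\le k\le k_1\) we have \(\mathsf {Im}\,G_{\ell \ell }(E+\mathbf {i}2^k\eta )\le \frac{\eta _0}{2^k\eta }\,\mathsf {Im}\,G_{\ell \ell }(E+\mathbf {i}\eta _0)\), and the geometric sum \(\sum _{k=0}^{k_1}2^{-k}\le 2\) is absorbed into the relation \(\prec \).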

Now, with Theorem 1.15, Lemmas 2.3 and 2.4, we can prove Lemma 2.1.

Proof of Lemma 2.1

The proof relies on the following bootstrap argument, namely, we show that once

$$\begin{aligned} |(G_\ell )_{\imath \jmath }|\prec 1, \quad \forall \; \ell =1,\ldots ,\varsigma (N),\quad \forall \; \imath ,\jmath =1,\ldots , N \end{aligned}$$
(2.30)

holds for \(\eta \ge \eta _0\) with \(\eta _0\in [N^{-1+\varepsilon _2+\varepsilon _3},M^{-1}N^{\varepsilon _2}]\), it also holds for \(\eta \ge \eta _0N^{-\varepsilon _3}\) for any \(\varepsilon _3\) satisfying (2.5). Assuming (2.30) holds for \(\eta \ge \eta _0\), we see that

$$\begin{aligned} \max _{\imath ,\jmath }|(G^0_{\ell })_{\imath \jmath }|&=\max _{\imath ,\jmath }|(G_\ell )_{\imath \jmath }+((G_\ell )\mathsf {W}_{ab}G^0_\ell )_{\imath \jmath }| \nonumber \\&\prec \max _{\imath ,\jmath }|(G_\ell )_{\imath \jmath }|\left( 1+\frac{1}{\sqrt{M}}\max _{\imath ,\jmath }|(G^0_\ell )_{\imath \jmath }|\right) \\&\prec 1+\frac{1}{\sqrt{M}} \max _{\imath ,\jmath }|(G^0_\ell )_{\imath \jmath }|. \end{aligned}$$

Consequently, for \(\eta \ge \eta _0\), we also have

$$\begin{aligned} |(G^0_\ell )_{\imath \jmath }|\prec 1, \quad \forall \; \ell =1,\ldots ,\varsigma (N),\quad \forall \; \imath ,\jmath =1,\ldots , N. \end{aligned}$$
(2.31)

Therefore, (2.29) holds. Then, by Lemma 2.4, we see that (2.8) holds for \(\eta \ge \eta _0N^{-\varepsilon _3}\). Furthermore, by Lemma 2.3 and Theorem 1.15 for \(G_0\), i.e. the Gaussian case, one can get that for any given n,

$$\begin{aligned}&\mathbb {E}|(G_{\ell })_{\imath \jmath }|^{2n}\le \widehat{\varTheta }_{\ell ,\imath \jmath }\left( \delta _{\imath \jmath }+\frac{1}{(N\eta )^n}\right) \le 2\widehat{\varTheta }_0\left( \delta _{\imath \jmath }+\frac{1}{(N\eta )^n}\right) ,\nonumber \\&\quad \text {for}\quad M^{-1}N^{\varepsilon _2}\ge \eta \ge \eta _0N^{-\varepsilon _3},\nonumber \\&\quad \forall \; \ell =1,\ldots , \varsigma (N),\quad \forall \; \imath ,\jmath =1,\ldots ,N. \end{aligned}$$
(2.32)

Note that since (2.32) holds for any given n, we get (2.30) for \(M^{-1}N^{\varepsilon _2}\ge \eta \ge \eta _0N^{-\varepsilon _3}\).

Now we start from \(\eta _0=M^{-1}N^{\varepsilon _2}\). By Proposition 1.7 we see that (2.30) holds for all \(\eta \ge \eta _0\). Then we can use the bootstrap argument above finitely many times to show (2.30) holds for all \(\eta \ge N^{-1+\varepsilon _2}\). Consequently, we have (2.8) for all \(\eta \ge N^{-1+\varepsilon _2}\). Then, Lemma 2.1 follows from Lemma 2.3 and Theorem 1.15 immediately. \(\square \)

2.2 Proof of Theorem 1.9

Without loss of generality we can assume that \(M\le N(\log N)^{-10}\); otherwise, Proposition 1.7 implies (1.17) immediately. We only need to consider the diagonal entries \(G_{ii}\) below, since the bound for the off-diagonal entries of G(z) is implied by (2.1) directly. For simplicity, we introduce the notation

$$\begin{aligned} \varLambda _d\equiv \varLambda _d(z):=\max _{i}|G_{ii}(z)-m_{sc}(z)|. \end{aligned}$$

To bound \(\varLambda _d\), a key technical input is the estimate for the quantity

$$\begin{aligned} \zeta _i\equiv \zeta _i(z):=(G_{ii})^{-1}+z+\sum _{a}\sigma _{ai}^2G_{aa}, \end{aligned}$$
(2.33)

which is given in the following lemma.

Lemma 2.5

Suppose that H satisfies Assumptions 1.1, 1.5 and 1.14. We have

$$\begin{aligned} \max _{i=1,\ldots ,N}|\zeta _i(z)|\prec (N\eta )^{-\frac{1}{2}}, \end{aligned}$$
(2.34)

for all \(z\in \mathbf {D}(N,\kappa ,\varepsilon _2)\).

The proof of Lemma 2.5 will be postponed. Using Lemma 2.5, we see that, with high probability, (2.33) is a small perturbation of the self-consistent equation for \(m_{sc}\), i.e. (1.14), since \(\sum _{a}\sigma _{ai}^2=1\). To control \(\varLambda _d\), we use a continuity argument from [12].
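For later use, we record the elementary identities that follow from (1.14); writing the self-consistent equation in the form suggested by the averaged equation (2.41) below, i.e. \(m_{sc}(z)+\frac{1}{z+m_{sc}(z)}=0\), we get

$$\begin{aligned} m_{sc}(z)=-\frac{1}{z+m_{sc}(z)},\qquad m_{sc}^2(z)=\frac{1}{(z+m_{sc}(z))^{2}}, \end{aligned}$$

the second of which is the identity invoked before (2.47) below.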

We remind the reader here that, in the sequel, the parameter set of the stochastic dominance is always \(\mathbf {D}(N,\kappa ,\varepsilon _2)\), without further mention. We need to show that

$$\begin{aligned} \varLambda _d\prec (N\eta )^{-\frac{1}{2}}, \end{aligned}$$
(2.35)

and first we claim that it suffices to show that

$$\begin{aligned} \mathbf {1}\left( \varLambda _d\le N^{-\frac{\varepsilon _2}{4}}\right) \varLambda _d\prec (N\eta )^{-\frac{1}{2}}. \end{aligned}$$
(2.36)

Indeed, if (2.36) holds, then with high probability either \(\varLambda _d>N^{-\frac{\varepsilon _2}{4}}\) or \(\varLambda _d\prec (N\eta )^{-\frac{1}{2}}\le N^{-\frac{\varepsilon _2}{2}}\) for \(z\in \mathbf {D}(N,\kappa ,\varepsilon _2)\). That means there is a gap in the possible range of \(\varLambda _d\). Now, choosing \(\varepsilon \) in (1.15) sufficiently small, we can get, for \(\eta =M^{-1}N^{\varepsilon _2}\),

$$\begin{aligned} \varLambda _d\prec N^{-\frac{\varepsilon _2}{2}},\quad \forall \; E\in [-2+\kappa ,2-\kappa ],\quad \forall \; i=1,\ldots , N. \end{aligned}$$
(2.37)

By the fact that \(\varLambda _d\) is continuous in z, we see that with high probability \(\varLambda _d\) can only stay on one side of the gap, namely, (2.35) holds. The rigorous details of this argument involve considering a fine discrete grid of the z-parameter and using that G(z) is Lipschitz continuous (albeit with a large Lipschitz constant \(1/\eta \)). The details can be found in Section 5.3 of [12].

Hence, what remains is to verify (2.36). The proof of (2.36) is almost the same as that of Lemma 3.5 in [14]. For the convenience of the reader, we sketch it below without reproducing the details. We set

$$\begin{aligned} \bar{m}\equiv \bar{m}(z):=\frac{1}{N}\sum _{i=1}^N G_{ii}(z),\quad \mathsf {u}_i\equiv \mathsf {u}_i(z):=G_{ii}-\bar{m},\quad i=1,\ldots , N. \end{aligned}$$

We also denote . By the assumption \(\varLambda _d\le N^{-\frac{\varepsilon _2}{4}}\), we have

$$\begin{aligned} \mathsf {u}_i= O\left( N^{-\frac{\varepsilon _2}{4}}\right) . \end{aligned}$$
(2.38)

Now we rewrite (2.33) as

$$\begin{aligned} 0=G_{ii}+\frac{1}{z+\sum _{a}\sigma _{ai}^2G_{aa}-\zeta _i}=:G_{ii}+\frac{1}{z+\bar{m}(z)}+\widetilde{\zeta }_i. \end{aligned}$$
(2.39)
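Note that the first equality in (2.39) is simply a rearrangement of the definition (2.33):

$$\begin{aligned} \zeta _i=(G_{ii})^{-1}+z+\sum _{a}\sigma _{ai}^2G_{aa}\;\Longrightarrow \; G_{ii}=-\frac{1}{z+\sum _{a}\sigma _{ai}^2G_{aa}-\zeta _i}, \end{aligned}$$

while the second equality defines \(\widetilde{\zeta }_i\) as the difference \(\big (z+\sum _{a}\sigma _{ai}^2G_{aa}-\zeta _i\big )^{-1}-\big (z+\bar{m}(z)\big )^{-1}\).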

By using (2.34), Lemma 5.1 in [14], and the assumption \(\varLambda _d\le N^{-\frac{\varepsilon _2}{4}}\), we can show that

(2.40)

One can refer to the derivation of (5.14) in [14] for more details. Averaging (2.39) and (2.40) over i leads to

$$\begin{aligned} \bar{m}(z)+\frac{1}{z+\bar{m}(z)}=-\widetilde{\zeta }, \end{aligned}$$
(2.41)

where

(2.42)

Plugging (2.38) and (2.34) into (2.42) yields

$$\begin{aligned} |\widetilde{\zeta }|\prec N^{-\frac{\varepsilon _2}{2}}. \end{aligned}$$
(2.43)

Applying (2.43), the fact \(|\bar{m}(z)-m_{sc}(z)|\le \varLambda _d\le N^{-\frac{\varepsilon _2}{4}}\), and Lemma 5.2 in [14] to (2.41), we have

(2.44)

where in the first step we have used the fact that \(z\in \mathbf {D}(N,\kappa ,\varepsilon _2)\) is away from the edges of the semicircle law. Now, we combine (2.39), (2.40) and (2.41), resulting in

(2.45)

We just take the above identity as the definition of \(\mathsf {w}_i\). Analogously, we set . Then (2.42) and (2.45) imply

(2.46)

where the second step follows from the fact \(|z+m_{sc}(z)|\ge 1\) in \(\mathbf {D}(N,\kappa ,\varepsilon _2)\) (see (5.1) in [14] for instance), (2.43) and (2.44), and in the last step we used (2.44) again.

Now, using the fact \(m_{sc}^2(z)=(m_{sc}(z)+z)^{-2}\) (see (1.14)), we rewrite (2.45) in terms of the matrix \(\mathcal {T}\) introduced in (1.6) as . Consequently, we have

(2.47)

Then for \(z\in \mathbf {D}(N,\kappa ,\varepsilon _2)\), using (1.7) and Proposition A.2 (ii) in [12] (with \(\delta _-=1\) and \(\theta >c\)), we can get

$$\begin{aligned} \varGamma (z)=O(\log N). \end{aligned}$$
(2.48)

Plugging (2.48) and (2.46) into (2.47) yields

where the second step follows from (2.34). Then (2.38) further implies that , which together with (2.44) and (2.34) also implies

$$\begin{aligned} |\bar{m}(z)-m_{sc}(z)|\prec (N\eta )^{-\frac{1}{2}}. \end{aligned}$$

Hence

Therefore, we completed the proof of Theorem 1.9. \(\square \)

Proof of Lemma 2.5

At first, recalling the notation defined in (1.3), we denote the Green’s function of \(H^{(i)}\) as

$$\begin{aligned} G^{(i)}(z):=(H^{(i)}-z)^{-1}, \end{aligned}$$

with a little abuse of notation. For simplicity, we omit the variable z from the notation below. Next, we recall the elementary identity obtained from Schur’s complement, namely,

$$\begin{aligned} G_{ii}=\left( {h_{ii}-z-(\mathbf {h}_i^{\langle i\rangle })^* G^{(i)}\mathbf {h}_i^{\langle i\rangle }}\right) ^{-1}, \end{aligned}$$
(2.49)

where we used the notation \(\mathbf {h}_i^{\langle i\rangle }\) to denote the ith column of H, with the ith component deleted. Now, we use the identity for \(a,b\ne i\) (see Lemma 4.5 in [12] for instance),

$$\begin{aligned} G_{ab}^{(i)}=G_{ab}-G_{ai}G_{ib}(G_{ii})^{-1}=G_{ab}-G_{ai}G_{ib}\Big (h_{ii}-z-(\mathbf {h}_i^{\langle i\rangle })^* G^{(i)}\mathbf {h}_i^{\langle i\rangle }\Big ). \nonumber \\ \end{aligned}$$
(2.50)
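Both (2.49) and (2.50) are instances of the standard \(2\times 2\) block inversion formula. As a minimal reminder (in generic notation, after permuting the ith row and column to the first position),

$$\begin{aligned} \left( \begin{array}{cc} a &{} \mathbf {b}^* \\ \mathbf {c} &{} D \end{array}\right) ^{-1}_{11}=\big (a-\mathbf {b}^* D^{-1}\mathbf {c}\big )^{-1}, \end{aligned}$$

applied with \(a=h_{ii}-z\), \(\mathbf {b}=\mathbf {c}=\mathbf {h}_i^{\langle i\rangle }\) and \(D=H^{(i)}-z\); this yields (2.49).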

By using (1.11) and the large deviation estimate for the quadratic form (see Theorem C.1 of [12] for instance), we have

$$\begin{aligned} \left| (\mathbf {h}_i^{\langle i\rangle })^* G^{(i)}\mathbf {h}_i^{\langle i\rangle }-\sum _{a\ne i} \sigma _{ai}^2\cdot G_{aa}^{(i)}\right| \prec \sqrt{\frac{1}{M}\max _a|G_{aa}^{(i)}|^2+\max _{a\ne b} |G_{ab}^{(i)}|^2}, \end{aligned}$$
(2.51)

which implies that

$$\begin{aligned} \Big |(\mathbf {h}_i^{\langle i\rangle })^* G^{(i)}\mathbf {h}_i^{\langle i\rangle }\Big |\prec \max _{a\ne i}|G_{aa}^{(i)}|+\sqrt{\frac{1}{M}\max _{a\ne i}|G_{aa}^{(i)}|^2+\max _{a\ne b} |G_{ab}^{(i)}|^2}\le 3\max _{a,b\ne i}|G_{ab}^{(i)}|, \nonumber \\ \end{aligned}$$
(2.52)

where we have used the fact that \(\sum _{a}\sigma _{ai}^2=1\) in the first inequality above. Plugging (1.22) and (2.52) into (2.50) and using Corollary 2.2 we obtain

$$\begin{aligned} \max _{a,b\ne i}|G_{ab}^{(i)}|\prec 1+\frac{1}{N\eta }\Big (1+3\max _{a,b\ne i}|G_{ab}^{(i)}|\Big ), \end{aligned}$$

which implies

$$\begin{aligned} \max _{a,b\ne i}|G_{ab}^{(i)}|\prec 1,\quad \Big |(\mathbf {h}_i^{\langle i\rangle })^* G^{(i)}\mathbf {h}_i^{\langle i\rangle }\Big |\prec 1. \end{aligned}$$
(2.53)

In addition, (1.22), (2.50) and (2.53) lead to the fact that

$$\begin{aligned} \left| G_{ab}(z)-G_{ab}^{(i)}(z)\right| \prec \frac{1}{N\eta },\qquad \left| G_{ab}^{(i)}(z)\right| \prec \delta _{ab}+(N\eta )^{-\frac{1}{2}},\quad \forall \; a, b\ne i. \quad \quad \quad \end{aligned}$$
(2.54)

Now, using (2.33), (2.49), (2.51) and (2.54), we can see that

$$\begin{aligned} |\zeta _i|=\left| -h_{ii}+(\mathbf {h}_i^{\langle i\rangle })^* G^{(i)}\mathbf {h}_i^{\langle i\rangle }-\sum _{a} \sigma _{ai}^2 G_{aa}\right| \prec (N\eta )^{-\frac{1}{2}}. \end{aligned}$$
(2.55)

Therefore, we completed the proof of Lemma 2.5. \(\square \)

2.3 Proof of Theorem 1.12

With Theorem 1.9 at hand, the proof of Theorem 1.12 is routine. At first, according to the uniform bound (1.18), we have

$$\begin{aligned} \max _{a,b}\sup _{z\in \mathbf {D}(N,\kappa ,\varepsilon _2)}|G_{ab}(z)-\delta _{ab}m_{sc}(z)|\prec 1, \end{aligned}$$

which implies that

$$\begin{aligned} \max _{a}\sup _{z\in \mathbf {D}(N,\kappa ,\varepsilon _2)}|G_{aa}(z)|\prec 1 \end{aligned}$$
(2.56)

due to the fact that \(m_{sc}(z)\sim 1\). Recalling the normalized eigenvector \(\mathbf {u}_i=(u_{i1},\ldots , u_{iN})\) corresponding to \(\lambda _i\), and using the spectral decomposition, we have

$$\begin{aligned} \max _a\mathsf {Im}G_{aa}(z)=\max _a\sum _{i=1}^N\frac{|u_{ia}|^2\eta }{|\lambda _i-E|^2+\eta ^2}\ge \frac{||\mathbf {u}_i||_\infty ^2\eta }{|\lambda _i-E|^2+\eta ^2},\quad \forall \; i=1,\ldots ,N. \end{aligned}$$
(2.57)

For any \(|\lambda _i|\le \sqrt{2}-\kappa \), we set \(E=\lambda _i\) on the r.h.s. of (2.57) and use (2.56) to bound the l.h.s. Then we obtain \({||\mathbf {u}_{i}||^2_\infty }/{\eta }\prec 1\). Choosing \(\eta =N^{-1+\varepsilon _2}\) above and using the fact that \(\varepsilon _2\) can be arbitrarily small, we can get (1.19). Hence, we completed the proof of Theorem 1.12. \(\square \)

3 Supersymmetric formalism and integral representation for the Green’s function

In this section, we will represent \(\mathbb {E}|G_{ij}(z)|^{2n}\) for the Gaussian case by a superintegral. The final representation is stated in (3.31). We make the convention here that, for any real integration variable below, the region of integration is always \(\mathbb {R}\), unless specified otherwise.

3.1 Gaussian integrals and superbosonization formulas

Let \(\varvec{\phi }=(\phi _1,\ldots , \phi _k)'\) be a vector with complex components, and let \(\varvec{\psi }=(\psi _1,\ldots ,\psi _k)'\) be a vector with Grassmann components. In addition, let \(\varvec{\phi }^*\) and \(\varvec{\psi }^*\) be the conjugate transposes of \(\varvec{\phi }\) and \(\varvec{\psi }\), respectively. We recall the following well-known formulas for Gaussian integrals.

Proposition 3.1

(Gaussian integrals or Wick’s formulas)

  1. (i)

Let \(\mathrm {A}\) be a \(k\times k\) complex matrix with positive-definite Hermitian part, i.e. \(\mathsf {Re}\,\mathrm {A}>0\). Then for any \(\ell \in \mathbb {N}\), and \(i_1,\ldots , i_\ell , j_1,\ldots , j_\ell \in \{1,\ldots , k\}\), we have

    $$\begin{aligned} \int \prod _{a=1}^k\frac{\mathrm{d}\mathsf {Re}\phi _a \mathrm{d}\mathsf {Im}\phi _a}{\pi }\;\exp \{-\varvec{\phi }^*\mathrm {A}\varvec{\phi }\}\prod _{b=1}^\ell \bar{\phi }_{i_b}\phi _{j_b}=\frac{1}{\det \mathrm {A}} \; \sum _{\sigma \in \mathbb {P}(\ell )} \prod _{b=1}^\ell (\mathrm {A}^{-1})_{j_b,i_{\sigma (b)}}, \nonumber \\ \end{aligned}$$
    (3.1)

    where \(\mathbb {P}(\ell )\) is the permutation group of degree \(\ell \).

  2. (ii)

    Let \(\mathrm {B}\) be any \(k\times k\) matrix. Then for any \(\ell \in \{ 0,\ldots , k\}\), any \(\ell \) distinct integers \(i_1,\ldots , i_\ell \) and another \(\ell \) distinct integers \(j_1,\ldots , j_\ell \in \{1,\ldots , k\}\), we have

    $$\begin{aligned} \int \prod _{a=1}^k\mathrm{d}\bar{\psi }_a \mathrm{d}\psi _a\; \exp \{-\varvec{\psi }^*\mathrm {B}\varvec{\psi }\}\prod _{b=1}^\ell \bar{\psi }_{i_b}\psi _{j_b} =(-1)^{\ell +\sum _{\alpha =1}^\ell (i_\alpha +j_\alpha )}\det \mathrm {B}^{(\mathsf {I}|\mathsf {J})}, \nonumber \\ \end{aligned}$$
    (3.2)

    where \(\mathsf {I}=\{i_1,\ldots , i_\ell \}\), and \(\mathsf {J}=\{j_1,\ldots , j_\ell \}\).
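As a consistency check of (3.1) and (3.2) in the simplest case \(k=1\), one can compute directly (for (3.2) we use the sign convention \(\int \mathrm{d}\bar{\psi }\mathrm{d}\psi \; \psi \bar{\psi }=1\), under which the case \(\ell =0\) reads \(\int \mathrm{d}\bar{\psi }\mathrm{d}\psi \, e^{-B\bar{\psi }\psi }=B\)):

$$\begin{aligned} \int \frac{\mathrm{d}\mathsf {Re}\phi \, \mathrm{d}\mathsf {Im}\phi }{\pi }\, e^{-A|\phi |^2}\,\bar{\phi }\phi =\frac{1}{A^2},\qquad \int \mathrm{d}\bar{\psi }\mathrm{d}\psi \; e^{-B\bar{\psi }\psi }\,\bar{\psi }\psi =-1, \end{aligned}$$

in agreement with \(\frac{1}{\det \mathrm {A}}(\mathrm {A}^{-1})_{11}=A^{-2}\) in (3.1) and with the sign \((-1)^{1+i_1+j_1}=(-1)^{3}=-1\) in (3.2), the determinant of the empty matrix being 1. Note that the Grassmann integral produces a determinant rather than its inverse; this cancellation mechanism is used repeatedly below.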

Now, we introduce the superbosonization formula for superintegrals. Let \(\varvec{\chi }=(\chi _{ij})\) be an \(\ell \times r\) matrix with Grassmann entries, \(\mathbf {f}=(f_{ij})\) be an \(\ell \times r\) matrix with complex entries. In addition, we denote their conjugate transposes by \(\varvec{\chi }^*\) and \(\mathbf {f}^*\) respectively. Let F be a function of the entries of the matrix

$$\begin{aligned} \mathcal {S}(\mathbf {f},\mathbf {f}^*;\varvec{\chi },\varvec{\chi }^*):=\bigg (\begin{array}{ccc} \varvec{\chi }^*\varvec{\chi } &{} \varvec{\chi }^*\mathbf {f}\\ \mathbf {f}^*\varvec{\chi } &{}\mathbf {f}^*\mathbf {f} \end{array}\bigg ). \end{aligned}$$

Let \(\mathcal {A}(\varvec{\chi },\varvec{\chi }^*)\) be the Grassmann algebra generated by \(\chi _{ij}\)’s and \(\bar{\chi }_{ij}\)’s. Then we can regard F as a function defined on a complex vector space, taking values in \(\mathcal {A}(\varvec{\chi },\varvec{\chi }^*)\). Hence, we can and do view \(F(\mathcal {S}(\mathbf {f},\mathbf {f}^*;\varvec{\chi },\varvec{\chi }^*))\) as a polynomial in \(\chi _{ij}\)’s and \(\bar{\chi }_{ij}\)’s, in which the coefficients are functions of \(f_{ij}\)’s and \(\bar{f}_{ij}\)’s. Under this viewpoint, we state the assumption on F as follows.

Assumption 3.2

Suppose that \(F(\mathcal {S}(\mathbf {f},\mathbf {f}^*;\varvec{\chi },\varvec{\chi }^*))\) is a holomorphic function of the \(f_{ij}\)’s and \(\bar{f}_{ij}\)’s when they are regarded as independent variables, and that F is a Schwartz function of the \(\mathsf {Re}f_{ij}\)’s and \(\mathsf {Im}f_{ij}\)’s; by this we mean that all of the coefficients of \(F(\mathcal {S}(\mathbf {f},\mathbf {f}^*;\varvec{\chi },\varvec{\chi }^*))\), as functions of the \(f_{ij}\)’s and \(\bar{f}_{ij}\)’s, possess the above properties.

Proposition 3.3

(Superbosonization formula for the nonsingular case, [18]) Suppose that F satisfies Assumption 3.2. For \(\ell \ge r\), we have

$$\begin{aligned} \int F\bigg ( \begin{array}{ccc} \varvec{\chi }^*\varvec{\chi } &{} \varvec{\chi }^*\mathbf {f}\\ \mathbf {f}^*\varvec{\chi } &{}\mathbf {f}^*\mathbf {f} \end{array} \bigg )\mathrm{d}\mathbf {f}\mathrm{d}\varvec{\chi }= & {} (\mathbf {i}\pi )^{-r(r-1)} \int \mathrm{d}\hat{\mu }(\mathbf {x})\mathrm{d}\hat{\nu }(\mathbf {y})\mathrm{d}\varvec{\omega }\mathrm{d} \varvec{\xi } \;\nonumber \\&\times F\bigg ( \begin{array}{ccc} \mathbf {x} &{} \varvec{\omega }\\ \varvec{\xi } &{}\mathbf {y} \end{array} \bigg ) \frac{\det ^\ell \mathbf {y}}{\det ^\ell (\mathbf {x}-\varvec{\omega }\mathbf {y}^{-1}\varvec{\xi })}, \end{aligned}$$
(3.3)

where \(\mathbf {x}=(x_{ij})\) is a unitary matrix; \(\mathbf {y}=(y_{ij})\) is a positive-definite Hermitian matrix; \(\varvec{\omega }\) and \(\varvec{\xi }\) are two Grassmann matrices, and all of them are \(r\times r\). Here

$$\begin{aligned}&\mathrm{d}\mathbf {f}=\prod _{i,j}\frac{\mathrm{d}\mathsf {Re}f_{ij} \mathrm{d} \mathsf {Im}f_{ij}}{\pi },\quad \mathrm{d}\varvec{\chi }=\prod _{i,j} \mathrm{d}\bar{\chi }_{ij} \mathrm{d}\chi _{ij},\\&\mathrm{d}\hat{\nu }(\mathbf {y})=\mathbf {1}(\mathbf {y}>0) \prod _{i=1}^r \mathrm{d} y_{ii} \prod _{j>k} \mathrm{d}\mathsf {Re}y_{jk} \mathrm{d}\mathsf {Im}y_{jk}, \quad \mathrm{d}\varvec{\omega }\mathrm{d} \varvec{\xi }= \prod _{i,j=1}^r \mathrm{d}\omega _{ij}\mathrm{d} \xi _{ij}, \end{aligned}$$

and \(\mathrm{d}\hat{\mu }(\cdot )\) is defined by

$$\begin{aligned}&\mathrm{d}\hat{\mu }(\mathbf {x})=\frac{\pi ^{r(r-1)/2}}{\prod _{i=1}^r i!}\cdot \prod _{i=1}^r\frac{\mathrm{d} x_i }{2\pi \mathbf {i}} \cdot (\varDelta (x_1,\ldots , x_r))^2 \cdot \mathrm{d}\mu (V), \end{aligned}$$

under the parametrization induced by the eigendecomposition, namely,

$$\begin{aligned} \mathbf {x}=V^*\hat{\mathbf {x}}V,\quad \hat{\mathbf {x}}=\mathrm{diag}(x_1,\ldots ,x_r),\quad V\in \mathring{U}(r). \end{aligned}$$

Here \(\mathrm{d}\mu (V)\) is the Haar measure on \(\mathring{U}(r)\), and \(\varDelta (\cdot )\) is the Vandermonde determinant. In addition, the integral w.r.t. \(\mathbf {x}\) ranges over U(r), that w.r.t. \(\mathbf {y}\) ranges over the cone of positive-definite matrices.
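To orient the reader, we record what (3.3) reduces to in the simplest case \(r=1\): the prefactor \((\mathbf {i}\pi )^{-r(r-1)}\) and the Vandermonde factor become trivial, \(\mathbf {x}\) reduces to a single variable x on the unit circle \(\varSigma \) (with the Haar measure on \(\mathring{U}(1)\) normalized to total mass 1), and one gets

$$\begin{aligned} \int F\bigg ( \begin{array}{cc} \varvec{\chi }^*\varvec{\chi } &{} \varvec{\chi }^*\mathbf {f}\\ \mathbf {f}^*\varvec{\chi } &{}\mathbf {f}^*\mathbf {f} \end{array} \bigg )\mathrm{d}\mathbf {f}\mathrm{d}\varvec{\chi }=\oint _{\varSigma }\frac{\mathrm{d}x}{2\pi \mathbf {i}}\int _0^\infty \mathrm{d}y\int \mathrm{d}\omega \mathrm{d}\xi \; F\bigg ( \begin{array}{cc} x &{} \omega \\ \xi &{} y \end{array} \bigg )\frac{y^{\ell }}{(x-\omega y^{-1}\xi )^{\ell }}. \end{aligned}$$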

For the singular case, i.e. \(r>\ell \), we only state the formula for the case \(r=2\) and \(\ell =1\), which is enough for our purpose. We refer to formula (11) in [3] for the result in a more general setting.

Proposition 3.4

(Superbosonization formula for the singular case, [3]) Suppose that F satisfies Assumption 3.2. If \(r=2\) and \(\ell =1\), we have

$$\begin{aligned} \int F\bigg ( \begin{array}{ccc} \varvec{\chi }^*\varvec{\chi } &{} \varvec{\chi }^*\mathbf {f}\\ \mathbf {f}^*\varvec{\chi } &{}\mathbf {f}^*\mathbf {f} \end{array} \bigg )\mathrm{d}\mathbf {f}\mathrm{d}\varvec{\chi }= & {} \frac{-1}{\pi ^2}\int \mathrm{d}\mathbf {w} \mathrm{d}\hat{\mu }(\mathbf {x})\cdot \mathbf {1}(y\ge 0)\mathrm{d}y \;\nonumber \\&\cdot \, \mathrm{d}\varvec{\omega }\mathrm{d} \varvec{\xi } F\bigg ( \begin{array}{ccc} \mathbf {x} &{} \varvec{\omega }\mathbf {w}^*\\ \mathbf {w}\varvec{\xi } &{}y\mathbf {w}\mathbf {w}^* \end{array} \bigg )\frac{y(y-\varvec{\xi }\mathbf {x}^{-1}\varvec{\omega })^{2}}{\det ^2\mathbf {x}}, \end{aligned}$$
(3.4)

where y is a positive variable; \(\mathbf {x}\) is a 2-dimensional unitary matrix; \(\varvec{\omega }=(\omega _1,\omega _2)'\) and \(\varvec{\xi }=(\xi _1,\xi _2)\) are two vectors with Grassmann components. In addition, \(\mathbf {w}\) is a unit vector, which can be parameterized by

$$\begin{aligned} \mathbf {w}=\left( v,\sqrt{1-v^2}e^{\mathbf {i}\theta }\right) ',\quad v\in \mathbb {I}, \quad \theta \in \mathbb {L}. \end{aligned}$$

Moreover, the differentials are defined as

$$\begin{aligned} \mathrm{d}\mathbf {w}=v\mathrm{d}v \mathrm{d}\theta ,\qquad \mathrm{d}\varvec{\omega }\mathrm{d}\varvec{\xi }=\prod _{i=1,2} \mathrm{d}\omega _i\mathrm{d}\xi _i. \end{aligned}$$

In addition, the integral w.r.t. \(\mathbf {x}\) ranges over U(2).

3.2 Initial representation

For \(a=1,2\) and \(j=1,\ldots ,W\), we set

$$\begin{aligned} \varPhi _{a, j}= & {} \big (\phi _{a,j,1},\ldots ,\phi _{a,j,M}\big )',\quad \varPsi _{a, j}=\big (\psi _{a,j,1},\ldots ,\psi _{a,j,M}\big )'\\ \varPhi _a= & {} \big (\varPhi _{a,1}',\ldots ,\varPhi _{a,W}'\big )',\quad \varPsi _a=\big (\varPsi _{a,1}',\ldots ,\varPsi _{a,W}'\big )'. \end{aligned}$$

For each j and each a, \(\varPhi _{a,j}\) is a vector with complex components, and \(\varPsi _{a,j}\) is a vector with Grassmann components. In addition, we use \(\varPhi ^*_{a,j}\) and \(\varPsi ^*_{a,j}\) to represent the conjugate transposes of \(\varPhi _{a,j}\) and \(\varPsi _{a,j}\), respectively. Analogously, we adopt the notation \(\varPhi ^*_{a}\) and \(\varPsi ^*_{a}\) to represent the conjugate transposes of \(\varPhi _{a}\) and \(\varPsi _{a}\), respectively. We have the following integral representation for the moments of the Green’s function.

Lemma 3.5

For any \(p,q=1,\ldots , W\) and \(\alpha , \beta =1,\ldots , M\), we have

$$\begin{aligned} |G_{pq,\alpha \beta }(z)|^{2n}&=\frac{1}{(n!)^2}\int \mathrm{d} \varPhi \mathrm{d}\varPsi \; \big (\bar{\phi }_{1,q,\beta }\phi _{1,p,\alpha }\bar{\phi }_{2,p,\alpha }\phi _{2,q,\beta }\big )^n\nonumber \\&\quad \times \exp \Big \{\mathbf {i}\varPsi _1^*(z-H)\varPsi _1+\mathbf {i}\varPhi _1^*(z-H)\varPhi _1\nonumber \\&\quad -\,\mathbf {i}\varPsi _2^*(\bar{z}-H)\varPsi _2-\mathbf {i}\varPhi ^*_2(\bar{z}-H)\varPhi _2\Big \}, \end{aligned}$$
(3.5)

where

$$\begin{aligned} \mathrm{d}\varPhi =\prod _{a=1,2}\prod _{j=1}^W\prod _{\alpha '=1}^M\frac{\mathrm{d}\mathsf {Re}\phi _{a,j,\alpha '}\mathrm{d} \mathsf {Im}\phi _{a,j,\alpha '}}{\pi },\quad \mathrm{d}\varPsi =\prod _{a=1,2} \prod _{j=1}^W\prod _{\alpha '=1}^M\mathrm{d} \bar{\psi }_{a,j,\alpha '}\mathrm{d}\psi _{a,j,\alpha '}. \end{aligned}$$

Proof

By using Proposition 3.1 (i) with \(\ell =n\) and Proposition 3.1 (ii) with \(\ell =0\), we can get (3.5). \(\square \)
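In more detail (a sketch of the computation behind Lemma 3.5): applying (3.1) to the \(\varPhi _1\)-integral with \(\mathrm {A}=-\mathbf {i}(z-H)=\mathbf {i}(H-z)\), whose Hermitian part is \(\eta I>0\) and whose inverse is \(\mathrm {A}^{-1}=-\mathbf {i}G\), gives

$$\begin{aligned} \int \mathrm{d} \varPhi _1\, e^{\mathbf {i}\varPhi _1^*(z-H)\varPhi _1}\,\big (\bar{\phi }_{1,q,\beta }\phi _{1,p,\alpha }\big )^n=\frac{n!\,\big (-\mathbf {i}\, G_{pq,\alpha \beta }\big )^n}{\det \big (-\mathbf {i}(z-H)\big )}, \end{aligned}$$

while (3.2) with \(\ell =0\) shows that the \(\varPsi _1\)-integral produces exactly the factor \(\det (-\mathbf {i}(z-H))\). The \(\varPhi _2,\varPsi _2\)-integrals are treated in the same way with \(\bar{z}\) in place of z, yielding \(n!\,(\mathbf {i}\,\overline{G_{pq,\alpha \beta }})^n\) and the canceling determinant \(\det (\mathbf {i}(\bar{z}-H))\). Multiplying the four factors, the determinants cancel, \((-\mathbf {i})^n(\mathbf {i})^n=1\), and the two factors n! cancel the prefactor \(1/(n!)^2\), which is (3.5).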

3.3 Averaging over the Gaussian random matrix

Recall the variance profile \(\widetilde{S}\) in (1.2). Now, we take the expectation of the Green’s function, i.e. we average over the random matrix. By an elementary Gaussian integral computation, we get

$$\begin{aligned} \mathbb {E}|G_{pq,\alpha \beta }(z)|^{2n}&= \frac{1}{(n!)^2}\int \mathrm{d} \varPhi \mathrm{d}\varPsi \; \big (\bar{\phi }_{1,q,\beta }\phi _{1,p,\alpha }\bar{\phi }_{2,p,\alpha }\phi _{2,q,\beta }\big )^n\nonumber \\&\quad \times \exp \left\{ \mathbf {i}\sum _{j=1}^W (Tr \breve{X}_jJZ+Tr \breve{Y}_j J Z)\right\} \nonumber \\&\quad \times \exp \left\{ \frac{1}{2M}\sum _{j,k}\tilde{\mathfrak {s}}_{jk}Tr\breve{X}_jJ\breve{X}_k J-\frac{1}{2M}\sum _{j,k}\tilde{\mathfrak {s}}_{jk}Tr\breve{Y}_jJ\breve{Y}_k J\right\} \nonumber \\&\quad \times \exp \left\{ -\frac{1}{M}\sum _{j,k}\tilde{\mathfrak {s}}_{jk} Tr \breve{\varOmega }_jJ\breve{\varXi }_kJ\right\} , \end{aligned}$$
(3.6)

where \(J:=\text {diag}(1,-1)\) and \(Z:=\text {diag}(z,\bar{z})\), and for each \(j=1,\ldots , W\), the matrices \(\breve{X}_j, \breve{Y}_j, \breve{\varOmega }_j\) and \(\breve{\varXi }_j\) are \(2\times 2\) blocks of a supermatrix, namely,

$$\begin{aligned} \breve{\mathcal {S}}_j=\left( \begin{array}{c|c} \breve{X}_j &{} \breve{\varOmega }_j \\ \hline \breve{\varXi }_j &{} \breve{Y}_j \end{array} \right) := \left( \begin{array}{cc|cc} \varPsi _{1,j}^*\varPsi _{1,j} &{}\varPsi _{1,j}^*\varPsi _{2,j} &{} \varPsi _{1,j}^*\varPhi _{1,j} &{}\varPsi _{1,j}^*\varPhi _{2,j} \\ \varPsi _{2,j}^*\varPsi _{1,j} &{}\varPsi _{2,j}^*\varPsi _{2,j} &{} \varPsi _{2,j}^*\varPhi _{1,j} &{}\varPsi _{2,j}^*\varPhi _{2,j}\\ \hline \varPhi _{1,j}^*\varPsi _{1,j} &{}\varPhi _{1,j}^*\varPsi _{2,j} &{} \varPhi _{1,j}^*\varPhi _{1,j} &{}\varPhi _{1,j}^*\varPhi _{2,j} \\ \varPhi _{2,j}^*\varPsi _{1,j} &{}\varPhi _{2,j}^*\varPsi _{2,j} &{} \varPhi _{2,j}^*\varPhi _{1,j} &{}\varPhi _{2,j}^*\varPhi _{2,j}\end{array}\right) . \end{aligned}$$

Remark 3.6

The derivation of (3.6) from (3.5) is quite standard. We refer to the proof of (2.14) in [20] for more details and will not reproduce it here.

3.4 Decomposition of the supermatrices

From now on, we split the discussion into the following three cases

  • (Case 1): Entries in the off-diagonal blocks, i.e. \(p\ne q\),

  • (Case 2): Off-diagonal entries in the diagonal blocks, i.e. \(p=q, \quad \alpha \ne \beta \),

  • (Case 3): Diagonal entries, i.e. \(p=q, \quad \alpha =\beta \).

For each case, we will perform a decomposition for the supermatrix \(\breve{\mathcal {S}}_j\) (\(j=p\) or q). For a vector \(\mathbf {v}\) and some index set \(\mathsf {I}\), we use \(\mathbf {v}^{\langle \mathsf {I}\rangle }\) to denote the subvector obtained by deleting the ith component of \(\mathbf {v}\) for all \(i\in \mathsf {I}\). Then, we adopt the notation

$$\begin{aligned}&\breve{\mathcal {S}}_j^{\langle \mathsf {I}\rangle }=\left( \begin{array}{c|c} \breve{X}_j^{\langle \mathsf {I}\rangle } &{} \breve{\varOmega }_j^{\langle \mathsf {I}\rangle } \\ \hline \breve{\varXi }_j^{\langle \mathsf {I}\rangle } &{} \breve{Y}_j^{\langle \mathsf {I}\rangle } \end{array} \right) ,\qquad \breve{\mathcal {S}}_j^{[i]}=\left( \begin{array}{c|c} \breve{X}_j^{[i]} &{} \breve{\varOmega }_j^{[i]} \\ \hline \breve{\varXi }_j^{[i]} &{} \breve{Y}_j^{[i]} \end{array} \right) . \end{aligned}$$

Here, for \(\mathsf {A}=\breve{X}_j, \breve{Y}_j, \breve{\varOmega }_j\) or \(\breve{\varXi }_j\), the notation \(\mathsf {A}^{\langle \mathsf {I}\rangle }\) is defined by replacing \(\varPhi _{a,j}, \varPsi _{a,j}, \varPhi _{a,j}^*\) and \(\varPsi _{a,j}^*\) by \(\varPhi _{a,j}^{\langle \mathsf {I}\rangle }, \varPsi _{a,j}^{\langle \mathsf {I}\rangle }, (\varPhi _{a,j}^*)^{\langle \mathsf {I}\rangle }\) and \((\varPsi _{a,j}^*)^{\langle \mathsf {I}\rangle }\), respectively, for \(a=1,2\), in the definition of \(\mathsf {A}\). In addition, the notation \(\mathsf {A}^{[i]}\) is defined by replacing \(\varPhi _{a,j}, \varPsi _{a,j}, \varPhi _{a,j}^*\) and \(\varPsi _{a,j}^*\) by \(\phi _{a,j,i}, \psi _{a,j,i}, \bar{\phi }_{a,j,i}\) and \(\bar{\psi }_{a,j,i}\), respectively, for \(a=1,2\), in the definition of \(\mathsf {A}\). Moreover, for \(\mathsf {A}=\breve{\mathcal {S}}_j, \breve{X}_j, \breve{Y}_j, \breve{\varOmega }_j\) or \(\breve{\varXi }_j\), we will simply abbreviate \(\mathsf {A}^{\langle \{a,b\}\rangle }\) and \(\mathsf {A}^{\langle \{a\}\rangle }\) by \(\mathsf {A}^{\langle a,b\rangle }\) and \(\mathsf {A}^{\langle a\rangle }\), respectively. Note that \(\breve{\mathcal {S}}_j^{[i]}\) is of rank one.
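Concretely, these decompositions simply split each inner product defining the entries of \(\breve{\mathcal {S}}_j\) into the contribution of the selected components and the rest; for instance, for the lower-right block,

$$\begin{aligned} (\breve{Y}_j)_{ab}=\varPhi _{a,j}^*\varPhi _{b,j}=\bar{\phi }_{a,j,1}\phi _{b,j,1}+\big (\varPhi _{a,j}^{\langle 1\rangle }\big )^*\varPhi _{b,j}^{\langle 1\rangle }=\big (\breve{Y}_j^{[1]}\big )_{ab}+\big (\breve{Y}_j^{\langle 1\rangle }\big )_{ab},\qquad a,b=1,2, \end{aligned}$$

and analogously for the other three blocks; this is the content of the decompositions (3.7), (3.9) and (3.11) below.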

Recalling the spatial-orbital parametrization for the rows or columns of H, it is easy to see from the block structure that for any \(\alpha ,\alpha '\in \{1,\ldots , M\}\), exchanging the \((j,\alpha )\)th row with the \((j,\alpha ')\)th row and simultaneously exchanging the corresponding columns will not change the distribution of H.

For Case 1, due to the symmetry mentioned above, we can assume \(\alpha =\beta =1\). Then we extract two rank-one supermatrices from \(\breve{\mathcal {S}}_p\) and \(\breve{\mathcal {S}}_q\) such that the quantities \(\bar{\phi }_{2,p,1}\phi _{1,p,1}\) and \(\bar{\phi }_{1,q,1}\phi _{2,q,1}\) can be expressed in terms of the entries of these supermatrices. More specifically, we decompose the supermatrices

$$\begin{aligned} \breve{\mathcal {S}}_p=\breve{\mathcal {S}}_p^{\langle 1\rangle }+\breve{\mathcal {S}}_p^{[1]},\quad \breve{\mathcal {S}}_q=\breve{\mathcal {S}}_q^{\langle 1\rangle }+\breve{\mathcal {S}}_q^{[1]}. \end{aligned}$$
(3.7)

Consequently, we can write

$$\begin{aligned} \bar{\phi }_{1,q,1}\phi _{1,p,1}\bar{\phi }_{2,p,1}\phi _{2,q,1}=\left( \breve{Y}_q^{[1]}\right) _{12}\left( \breve{Y}_p^{[1]}\right) _{21}. \end{aligned}$$
(3.8)
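Indeed, (3.8) can be checked entrywise from the definition of \(\breve{Y}_k^{[1]}\), using that the complex variables commute:

$$\begin{aligned} \big (\breve{Y}_q^{[1]}\big )_{12}\big (\breve{Y}_p^{[1]}\big )_{21}=\bar{\phi }_{1,q,1}\phi _{2,q,1}\cdot \bar{\phi }_{2,p,1}\phi _{1,p,1}=\bar{\phi }_{1,q,1}\phi _{1,p,1}\bar{\phi }_{2,p,1}\phi _{2,q,1}. \end{aligned}$$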

For Case 2, due to symmetry, we can assume that \(\alpha =1, \beta =2\). Then we extract two rank-one supermatrices from \(\breve{\mathcal {S}}_p\), namely,

$$\begin{aligned} \breve{\mathcal {S}}_p=\breve{\mathcal {S}}_p^{\langle 1,2\rangle }+\breve{\mathcal {S}}_p^{[1]}+\breve{\mathcal {S}}_p^{[2]}. \end{aligned}$$
(3.9)

Consequently, we can write

$$\begin{aligned} \bar{\phi }_{1,p,2}\phi _{1,p,1}\bar{\phi }_{2,p,1}\phi _{2,p,2}=(\breve{Y}_p^{[2]})_{12}(\breve{Y}_p^{[1]})_{21}. \end{aligned}$$
(3.10)

Finally, for Case 3, due to symmetry, we can assume that \(\alpha =1\). Then we extract only one rank-one supermatrix from \(\breve{\mathcal {S}}_p\), namely,

$$\begin{aligned} \breve{\mathcal {S}}_p=\breve{\mathcal {S}}_p^{\langle 1\rangle }+\breve{\mathcal {S}}_p^{[1]}. \end{aligned}$$
(3.11)

Consequently, we can write

$$\begin{aligned} \bar{\phi }_{1,p,1}\phi _{1,p,1}\bar{\phi }_{2,p,1}\phi _{2,p,1}=\left( \breve{Y}_p^{[1]}\right) _{12}\left( \breve{Y}_p^{[1]}\right) _{21}=\left( \breve{Y}_p^{[1]}\right) _{11}\left( \breve{Y}_p^{[1]}\right) _{22}. \end{aligned}$$

Since the discussions for all three cases are similar, we will only present the details for Case 1. More specifically, in the remaining part of this section and in Sects. 4–10, we will only treat Case 1. In Sect. 11, we will sum up the discussions in the previous sections and explain how to adapt them to Case 2 and Case 3, resulting in a final proof of Theorem 1.15.

3.5 Variable reduction by superbosonization formulae

We will work with Case 1. Recall the decomposition (3.7). We use the superbosonization formulae to reduce the number of variables. We shall treat \(\breve{\mathcal {S}}_k (k\ne p,q)\) and \(\breve{\mathcal {S}}_j^{\langle 1\rangle } (j=p,q)\) on an equal footing and use the formula (3.3) with \(r=2,\ell =M\) for the former and \(r=2,\ell =M-1\) for the latter, while we separate the terms \(\breve{\mathcal {S}}_j^{[1]} (j=p,q)\) and use the formula (3.4). For simplicity, we introduce the notation

$$\begin{aligned} \widetilde{\mathcal {S}}_j=\bigg \{\begin{array}{cc} \breve{\mathcal {S}}_j,\qquad \text {if}\quad j\ne p, q,\\ \breve{\mathcal {S}}_j^{\langle 1\rangle },\quad \text {if}\quad j=p, q. \end{array} \end{aligned}$$
(3.12)

Accordingly, we will use \(\widetilde{X}_j, \widetilde{\varOmega }_j, \widetilde{\varXi }_j\) and \(\widetilde{Y}_j\) to denote the four blocks of \(\widetilde{\mathcal {S}}_j\). With this notation, we can rewrite (3.6) with \(\alpha =\beta =1\) as

$$\begin{aligned}&\mathbb {E}|G_{pq,11}(z)|^{2n}= \frac{1}{(n!)^2}\int \mathrm{d} \varPhi \mathrm{d}\varPsi \; \left( \bar{\phi }_{1,q,1}\phi _{1,p,1}\bar{\phi }_{2,p,1}\phi _{2,q,1}\right) ^n\nonumber \\&\quad \times \exp \left\{ \mathbf {i}\sum _{j=1}^W \Big (Tr \widetilde{X}_jJZ+ Tr \widetilde{Y}_j J Z\Big )\right\} \nonumber \\&\qquad \times \exp \left\{ \frac{1}{2M}\sum _{j,k}\tilde{\mathfrak {s}}_{jk}\Big (Tr\widetilde{X}_jJ\widetilde{X}_k J-Tr\widetilde{Y}_jJ\widetilde{Y}_k J \Big )\right\} \nonumber \\&\quad \quad \times \exp \left\{ -\frac{1}{M}\sum _{j,k}\tilde{\mathfrak {s}}_{jk} Tr \widetilde{\varOmega }_jJ\widetilde{\varXi }_kJ\right\} \nonumber \\&\qquad \times \prod _{k=p,q}\exp \left\{ \mathbf {i} Tr \breve{X}_k^{[1]}JZ+\mathbf {i} Tr \breve{Y}_k^{[1]}JZ\right\} \nonumber \\&\qquad \times \prod _{k=p,q}\exp \left\{ \frac{1}{M}\sum _{j=1}^W\tilde{\mathfrak {s}}_{jk}\Big ( Tr\widetilde{X}_jJ\breve{X}_k^{[1]}J-Tr\widetilde{Y}_jJ\breve{Y}_k^{[1]}J\Big )\right\} \nonumber \\&\qquad \times \prod _{k,\ell =p,q}\exp \left\{ \frac{\tilde{\mathfrak {s}}_{k\ell }}{2M}\Big ( Tr\breve{X}_k^{[1]}J\breve{X}_\ell ^{[1]}J-Tr\breve{Y}_k^{[1]}J\breve{Y}_\ell ^{[1]}J\Big )\right\} \nonumber \\&\qquad \times \prod _{k,\ell =p,q}\exp \left\{ -\frac{\tilde{\mathfrak {s}}_{k\ell }}{M}Tr\breve{\varOmega }_k^{[1]} J\breve{\varXi }_\ell ^{[1]} J\right\} \nonumber \\&\qquad \times \prod _{k=p,q}\exp \left\{ -\frac{1}{M}\sum _{j}\tilde{\mathfrak {s}}_{jk}\Big ( Tr\widetilde{\varOmega }_j J\breve{\varXi }_k^{[1]} J+Tr\breve{\varOmega }_k^{[1]} J\widetilde{\varXi }_j J\Big )\right\} , \end{aligned}$$
(3.13)

where the first factor \((\cdots )^n\) of the integrand is the observable and all other factors constitute a normalized measure, written in a somewhat complicated form according to the decomposition from Sect. 3.4.

Now, we apply the superbosonization formulae (3.3) and (3.4) to the supermatrices \(\breve{\mathcal {S}}_k (k\ne p,q), \breve{\mathcal {S}}_j^{\langle 1\rangle }\) and \(\breve{\mathcal {S}}_j^{[1]} (j=p,q)\) one by one, to change to the reduced variables as follows

$$\begin{aligned}&\breve{X}_k^{[1]}\rightarrow X_k^{[1]},\quad \breve{\varOmega }_k^{[1]}\rightarrow \varvec{\omega }_k^{[1]} \left( \mathbf {w}^{[1]}_k\right) ^*, \quad \breve{\varXi }_k^{[1]}\rightarrow \mathbf {w}_k^{[1]}\varvec{\xi }_k^{[1]},\nonumber \\&\quad \breve{Y}_k^{[1]}\rightarrow Y_k^{[1]}:=y_k^{[1]}\mathbf {w}^{[1]}_k\left( \mathbf {w}^{[1]}_k\right) ^*, \quad k=p,q,\nonumber \\&\quad \widetilde{X}_j\rightarrow X_j,\quad \widetilde{Y}_j\rightarrow Y_j,\quad \widetilde{\varOmega }_j\rightarrow \varOmega _j,\quad \widetilde{\varXi }_j\rightarrow \varXi _j,\quad j=1,\ldots , W. \end{aligned}$$
(3.14)

Here, for \(j=1,\ldots , W\), \(X_j\) is a \(2\times 2\) unitary matrix; \(Y_j\) is a \(2\times 2\) positive-definite matrix; \(\varOmega _j=(\omega _{j,\alpha \beta })\) and \(\varXi _j=(\xi _{j,\alpha \beta })\) are \(2\times 2\) Grassmann matrices. For \(k=p\) or q, \(X_k^{[1]}\) is a \(2\times 2\) unitary matrix; \(y_k^{[1]}\) is a positive variable; \(\varvec{\omega }_{k}^{[1]}=(\omega _{k,1}^{[1]}, \omega _{k,2}^{[1]})'\) is a column vector with Grassmann components; \(\varvec{\xi }_{k}^{[1]}=(\xi _{k,1}^{[1]},\xi _{k,2}^{[1]})\) is a row vector with Grassmann components. In addition, for \(k=p,q\),

$$\begin{aligned} \mathbf {w}^{[1]}_k=\Big ({v}^{[1]}_k, {u}^{[1]}_ke^{\mathbf {i}\sigma _k^{[1]}}\Big )',\quad {u}_k^{[1]}=\sqrt{1-\left( {v}_k^{[1]}\right) ^2},\quad {v}_k^{[1]}\in \mathbb {I},\quad \sigma _k^{[1]}\in \mathbb {L}. \quad \end{aligned}$$
(3.15)

Then, by using the superbosonization formulae, we arrive at the representation

$$\begin{aligned}&\mathbb {E}|G_{pq,11}(z)|^{2n}\nonumber \\&\quad =\frac{(-1)^{W}}{(n!)^2\pi ^{2W+4}}\int \mathrm{d}X^{[1]} \mathrm{d}\mathbf {y}^{[1]} \mathrm{d} \mathbf {w}^{[1]} \mathrm{d}\varvec{\omega }^{[1]} \mathrm{d}\varvec{\xi }^{[1]} \mathrm{d}X \mathrm{d}Y \mathrm{d}\varOmega \mathrm{d} \varXi \nonumber \\&\quad \quad \times \left( y_p^{[1]}y_q^{[1]}\left( \mathbf {w}^{[1]}_q\left( \mathbf {w}^{[1]}_q\right) ^*\right) _{12}\left( \mathbf {w}^{[1]}_p\left( \mathbf {w}^{[1]}_p\right) ^*\right) _{21}\right) ^n\nonumber \\&\quad \quad \times \exp \left\{ \mathbf {i}\sum _{j=1}^W \Big (Tr X_jJZ+Tr Y_j J Z\Big )\right\} \nonumber \\&\quad \quad \times \exp \left\{ \frac{1}{2M}\sum _{j,k}\tilde{\mathfrak {s}}_{jk}\Big (TrX_jJX_k J-TrY_jJY_k J\Big ) \right\} \nonumber \\&\quad \quad \times \exp \left\{ -\frac{1}{M}\sum _{j,k}\tilde{\mathfrak {s}}_{jk} Tr\varOmega _jJ\varXi _kJ\right\} \;\prod _{j}\frac{\det ^M Y_j}{\det ^M\big (X_j- \varOmega _j Y_j^{-1}\varXi _j\big )}\nonumber \\&\quad \quad \times \prod _{k=p,q}\frac{\det \big (X_k-\varOmega _kY_k^{-1}\varXi _k\big )}{\det Y_k} \prod _{k=p,q}\exp \left\{ \mathbf {i} Tr X_k^{[1]}JZ+\mathbf {i} Tr Y_k^{[1]}JZ \right\} \nonumber \\&\quad \quad \times \prod _{k=p,q}\exp \left\{ \frac{1}{M}\sum _{j=1}^W\tilde{\mathfrak {s}}_{jk} \Big (TrX_jJX_k^{[1]}J-TrY_jJY_k^{[1]}J\Big )\right\} \nonumber \\&\quad \quad \times \prod _{k,\ell =p,q}\exp \left\{ \frac{\tilde{\mathfrak {s}}_{k\ell } }{2M}\Big (TrX_k^{[1]}JX_\ell ^{[1]}J-TrY_k^{[1]}JY_\ell ^{[1]}J\Big )\right\} \nonumber \\&\quad \quad \times \prod _{k,\ell =p,q}\exp \left\{ -\frac{\tilde{\mathfrak {s}}_{k\ell }}{M}Tr\varvec{\omega }_k^{[1]}(\mathbf {w}_k^{[1]})^* J\mathbf {w}_\ell ^{[1]}\varvec{\xi }_\ell ^{[1]} J\right\} \nonumber \\&\quad \quad \times \prod _{k=p,q} \exp \left\{ -\frac{1}{M}\sum _{j=1}^W\tilde{\mathfrak {s}}_{jk} Tr \varOmega _j J\mathbf {w}_k^{[1]}\varvec{\xi }_k^{[1]} J\right\} \nonumber \\&\quad \quad \times \prod _{k=p,q}\exp \left\{ -\frac{1}{M}\sum _{j=1}^W\tilde{\mathfrak {s}}_{jk} Tr\varvec{\omega }_k^{[1]}(\mathbf {w}_k^{[1]})^* J\varXi _j J\right\} \nonumber \\&\quad \quad \times \prod _{k=p,q}\frac{y_k^{[1]}\Big (y_k^{[1]}-\varvec{\xi }_k^{[1]}(X_k^{[1]})^{-1}\varvec{\omega }_k^{[1]}\Big )^{2}}{\det ^2 (X_k^{[1]})}, \end{aligned}$$
(3.16)

where we used the notation \(\mathbf {y}^{[1]}:=(y_p^{[1]},y_q^{[1]}), \mathbf {w}^{[1]}:=(\mathbf {w}_p^{[1]}, \mathbf {w}_q^{[1]})\). The differentials are defined by

$$\begin{aligned} \mathrm{d}X^{[1]}&=\prod _{j=p,q} \mathrm{d} \hat{\mu }\left( X_j^{[1]}\right) ,\qquad \mathrm{d}\mathbf {y}^{[1]}=\prod _{j=p,q}\mathbf {1}\left( y_j^{[1]}>0\right) \mathrm{d}y_j^{[1]},\\ \mathrm{d}\mathbf {w}^{[1]}&=\prod _{j=p,q}\mathrm{d}\mathbf {w}^{[1]}_j=\prod _{j=p,q}{v}_j^{[1]}\mathrm{d} {v}_j^{[1]} \mathrm{d}\sigma _j^{[1]},\qquad \mathrm{d}\varvec{\omega }^{[1]} \mathrm{d}\varvec{\xi }^{[1]}=\prod _{\alpha =1,2} \prod _{j=p,q} \mathrm{d}\omega _{j,\alpha }^{[1]} \mathrm{d}\xi _{j,\alpha }^{[1]},\\ \mathrm{d}X&=\prod _{j=1}^W \mathrm{d} \hat{\mu }(X_j),\qquad \mathrm{d}Y=\prod _{j=1}^W \mathrm{d} \hat{\nu }(Y_j),\qquad \mathrm{d}\varOmega \mathrm{d}\varXi = \prod _{\alpha ,\beta =1,2}\prod _{j=1}^W \mathrm{d}\omega _{j,\alpha \beta }\mathrm{d}\xi _{j,\alpha \beta }. \end{aligned}$$

Now we change the variables as \(X_jJ\rightarrow X_j,Y_jJ \rightarrow B_j,\varOmega _jJ\rightarrow \varOmega _j, \varXi _jJ\rightarrow \varXi _j\) and perform the scaling \(X_j\rightarrow -MX_j, B_j\rightarrow MB_j, \varOmega _j\rightarrow \sqrt{M} \varOmega _j\) and \(\varXi _j\rightarrow \sqrt{M}\varXi _j\). Consequently, we can write

$$\begin{aligned} \mathbb {E}|G_{pq,11}(z)|^{2n}&= \frac{(-1)^{W}M^{4W}}{(n!)^2\pi ^{2W+4}} \int \mathrm{d}X^{[1]} \mathrm{d}\mathbf {y}^{[1]} \mathrm{d} \mathbf {w}^{[1]} \mathrm{d}\varvec{\omega }^{[1]} \mathrm{d}\varvec{\xi }^{[1]} \mathrm{d}X \mathrm{d}B \mathrm{d} \varOmega \mathrm{d} \varXi \nonumber \\&\quad \times \exp \Big \{-M\big (K(X)+L(B)\big )\Big \}\nonumber \\&\quad \times \mathcal {P}( \varOmega , \varXi , X, B)\cdot \mathcal {Q}\big ( \varOmega , \varXi , \varvec{\omega }^{[1]},\varvec{\xi }^{[1]}, X^{[1]}, \mathbf {y}^{[1]},\mathbf {w}^{[1]}\big ) \nonumber \\&\quad \times \mathcal {F}\big (X,B,X^{[1]}, \mathbf {y}^{[1]},\mathbf {w}^{[1]}\big ), \end{aligned}$$
(3.17)

where the functions in the integrand are defined as

$$\begin{aligned}&K(X):=-\frac{1}{2}\sum _{j,k}\tilde{\mathfrak {s}}_{jk}Tr X_jX_k+\mathbf {i}E\sum _j TrX_j +\sum _j \log \det X_j,\nonumber \\&L(B):=\frac{1}{2}\sum _{j,k}\tilde{\mathfrak {s}}_{jk}Tr B_jB_k-\mathbf {i}E\sum _j Tr B_j -\sum _j \log \det B_j, \nonumber \\&\mathcal {P}( \varOmega , \varXi , X, B)\nonumber \\&\quad :=\exp \left\{ -\sum _{j,k}\tilde{\mathfrak {s}}_{jk} Tr \varOmega _j\varXi _k\right\} \times \prod _j\frac{1}{\det ^M\big (1+\frac{1}{M}X_j^{-1} \varOmega _jB_j^{-1}\varXi _j\big )}\nonumber \\&\quad \quad \times \prod _{k=p,q}\frac{\det \big (X_k+\frac{1}{M}\varOmega _k B_k^{-1}\varXi _k\big )}{\det B_k}, \end{aligned}$$
(3.18)
$$\begin{aligned}&\mathcal {Q}( \varOmega , \varXi , \varvec{\omega }^{[1]},\varvec{\xi }^{[1]}, X^{[1]}, \mathbf {y}^{[1]},\mathbf {w}^{[1]}):=\prod _{k=p,q} \Big (1-(y_k^{[1]})^{-1}\varvec{\xi }_k^{[1]}(X_k^{[1]})^{-1}\varvec{\omega }_k^{[1]}\Big )^{2}\nonumber \\&\quad \quad \times \prod _{k=p,q}\exp \left\{ -\frac{1}{\sqrt{M}}\sum _{j}\tilde{\mathfrak {s}}_{jk}\Big ( Tr \varOmega _j \mathbf {w}_k^{[1]}\varvec{\xi }_k^{[1]}J +Tr\varvec{\omega }_k^{[1]}(\mathbf {w}_k^{[1]})^* J\varXi _j\Big )\right\} \nonumber \\&\quad \quad \times \prod _{k,\ell =p,q}\exp \left\{ -\frac{1}{M}\tilde{\mathfrak {s}}_{k\ell }Tr\varvec{\omega }_k^{[1]}(\mathbf {w}_k^{[1]})^* J\mathbf {w}_\ell ^{[1]}\varvec{\xi }_\ell ^{[1]}J \right\} \nonumber \\&\mathcal {F}(X,B,X^{[1]}, \mathbf {y}^{[1]},\mathbf {w}^{[1]}):=f(X,X^{[1]})\;g(B, \mathbf {y}^{[1]},\mathbf {w}^{[1]}) , \end{aligned}$$
(3.19)

with

$$\begin{aligned} f(X,X^{[1]})&:=\exp \left\{ M\eta \sum _{j=1}^W Tr X_jJ\right\} \; \prod _{k,\ell =p,q}\exp \left\{ \frac{\tilde{\mathfrak {s}}_{k\ell }}{2M} TrX_k^{[1]}JX_\ell ^{[1]}J\right\} \nonumber \\&\quad \times \prod _{k=p,q}\frac{1}{\det ^2\left( X_k^{[1]}\right) } \exp \left\{ \mathbf {i} Tr X_k^{[1]}JZ-\sum _j\tilde{\mathfrak {s}}_{jk} TrX_jX_k^{[1]}J\right\} ,\end{aligned}$$
(3.20)
$$\begin{aligned} g(B, \mathbf {y}^{[1]},\mathbf {w}^{[1]})&:=\exp \left\{ -M\eta \sum _{j=1}^W Tr B_jJ\right\} \prod _{k,\ell =p,q}\exp \left\{ -\frac{\tilde{\mathfrak {s}}_{k\ell }}{2M} TrY_k^{[1]}JY_\ell ^{[1]} J\right\} \nonumber \\&\quad \times \left( \left( \mathbf {w}^{[1]}_q\left( \mathbf {w}^{[1]}_q\right) ^*\right) _{12}\left( \mathbf {w}^{[1]}_p\left( \mathbf {w}^{[1]}_p\right) ^*\right) _{21}\right) ^n\nonumber \\&\quad \times \prod _{k=p,q}\left( y_k^{[1]}\right) ^{n+3}\exp \left\{ \mathbf {i} Tr Y_k^{[1]}JZ-\sum _j\tilde{\mathfrak {s}}_{jk} TrB_jY_k^{[1]}J\right\} . \end{aligned}$$
(3.21)

In (3.17), the regions of the \(X_j\)’s and \(X_k^{[1]}\)’s are all U(2), and those of the \(B_j\)’s are the set of matrices A satisfying \(AJ>0\). We remind the reader here that if we parametrize the unitary matrix \(X_j\) according to its eigendecomposition, the scaling \(X_j\rightarrow -MX_j\) is equivalent to changing the contour of the eigenvalues of \(X_j\) from the unit circle \(\varSigma \) to \(\frac{1}{M}\varSigma \), up to the orientation. Afterwards, we deform the contour back to \(\varSigma \) in (3.17). This is possible since the only singularity of the integrand in (3.17) in the variables of the eigenvalues of \(X_j\) is at 0, cf. the matrix \(X_j^{-1}\) in the factor \(\mathcal {P}( \varOmega , \varXi , X, B)\).

3.6 Parametrization for X, B

Similarly to the discussion in [20], we start with some preliminary parametrization. At first, we do the eigendecomposition

$$\begin{aligned} X_j=P_j^*\hat{X}_j P_j,\quad B_j=Q_j^{-1}\hat{B}_jQ_j,\qquad P_j\in \mathring{U}(2),\quad Q_j\in \mathring{U}(1,1), \end{aligned}$$
(3.22)

where

$$\begin{aligned} \hat{X}_j=\text {diag}(x_{j,1},x_{j,2}),\quad \hat{B}_j=\text {diag}(b_{j,1},-b_{j,2}),\quad x_{j,1},x_{j,2}\in \varSigma ,\quad b_{j,1}, b_{j,2}\in \mathbb {R}_+. \nonumber \\ \end{aligned}$$
(3.23)

Further, we introduce

$$\begin{aligned} V_j=P_jP_1^*\in \mathring{U}(2), \quad T_j=Q_jQ_1^{-1}\in \mathring{U}(1,1),\quad j=1,\ldots , W. \end{aligned}$$
(3.24)

In particular, we have \(V_1=T_1=I\). Now, we parametrize \(P_1, Q_1, V_j\) and \(T_j\) for all \(j=2,\ldots , W\) as follows

$$\begin{aligned}&P_1=\left( \begin{array}{ccc}u &{}ve^{\mathbf {i}\theta }\\ -ve^{-\mathbf {i}\theta } &{}u\end{array}\right) ,\quad V_j=\left( \begin{array}{ccc}u_j &{}v_je^{\mathbf {i}\theta _j}\\ -v_je^{-\mathbf {i}\theta _j} &{}u_j\end{array}\right) ,\nonumber \\&u=\sqrt{1-v^2},\quad u_j=\sqrt{1-v_j^2},\quad v, v_j\in \mathbb {I}, \quad \theta ,\theta _j\in \mathbb {L},\nonumber \\&Q_1=\left( \begin{array}{ccc}s &{}te^{\mathbf {i}\sigma }\\ te^{-\mathbf {i}\sigma } &{}s\end{array}\right) ,\quad T_j=\left( \begin{array}{ccc}s_j &{}t_je^{\mathbf {i}\sigma _j}\\ t_je^{-\mathbf {i}\sigma _j} &{}s_j\end{array}\right) ,\nonumber \\&s=\sqrt{1+t^2}, \quad s_j=\sqrt{1+t_j^2}, \quad t, t_j\in \mathbb {R}_+,\quad \sigma ,\sigma _j\in \mathbb {L}. \end{aligned}$$
(3.25)
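One checks directly that these parametrizations are compatible with the required constraints: using \(u^2+v^2=1\) and \(s^2-t^2=1\),

$$\begin{aligned} P_1^*P_1=\left( \begin{array}{cc}u^2+v^2 &{} 0\\ 0 &{} u^2+v^2\end{array}\right) =I,\qquad Q_1JQ_1=\left( \begin{array}{cc}s^2-t^2 &{} 0\\ 0 &{} t^2-s^2\end{array}\right) =J, \end{aligned}$$

so \(P_1\) is unitary and \(Q_1\) (which is Hermitian in this parametrization) preserves the indefinite form J, i.e. \(Q_1\in U(1,1)\); the same computation applies to the \(V_j\)’s and \(T_j\)’s.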

Under the parametrization above, we can express the corresponding differentials as follows.

$$\begin{aligned} \mathrm{d}X\mathrm{d}B= & {} \mathrm{d}\mu (P_1)\mathrm{d}\nu (Q_1)\cdot \prod _{j=2}^W \mathrm{d}\mu (V_j) \mathrm{d}\nu (T_j) \cdot \prod _{j=1}^W \mathrm{d}b_{j,1} \mathrm{d} b_{j,2} \cdot \frac{\mathrm{d} x_{j,1}}{2\pi \mathbf {i}} \frac{\mathrm{d}x_{j,2}}{2\pi \mathbf {i}}\nonumber \\&\times 2^W(\pi /2)^{2W} \prod _{j=1}^W(x_{j,1}-x_{j,2})^2 (b_{j,1}+b_{j,2})^2, \end{aligned}$$
(3.26)

where

$$\begin{aligned} \mathrm{d}\mu (P_1)= & {} 2v \mathrm{d} v \cdot \frac{\mathrm{d}\theta }{2\pi },\quad \mathrm{d} \mu (V_j)= 2v_j \mathrm{d}v_j \cdot \frac{\mathrm{d} \theta _j}{2\pi },\quad \mathrm{d}\nu (Q_1)=2t \mathrm{d}t \cdot \frac{\mathrm{d}\sigma }{2\pi }, \\ \mathrm{d} \nu (T_j)= & {} 2t_j \mathrm{d} t_j\cdot \frac{\mathrm{d}\sigma _j}{2\pi }. \end{aligned}$$

In addition, for simplicity, we perform the change of variables

$$\begin{aligned} \varOmega _j\rightarrow P_1^* \varOmega _j Q_1,\quad \varXi _j\rightarrow Q_1^{-1} \varXi _j P_1. \end{aligned}$$
(3.27)

Note that the Berezinian of such a change is 1. After this change, \(\mathcal {P}( \varOmega , \varXi , X, B)\) turns out to be independent of \(P_1\) and \(Q_1\).

To adapt to the new parametrization, we change the notation

$$\begin{aligned}&K(X)\rightarrow K(\hat{X},V),\quad L(B)\rightarrow L(\hat{B},T),\quad \mathcal {P}( \varOmega , \varXi , X, B)\rightarrow \mathcal {P}( \varOmega , \varXi , \hat{X}, \hat{B}, V, T),\nonumber \\&\mathcal {Q}( \varOmega , \varXi , \varvec{\omega }^{[1]},\varvec{\xi }^{[1]}, X^{[1]}, \mathbf {y}^{[1]},\mathbf {w}^{[1]})\rightarrow \mathcal {Q}( \varOmega , \varXi , \varvec{\omega }^{[1]},\varvec{\xi }^{[1]}, P_1, Q_1, X^{[1]}, \mathbf {y}^{[1]},\mathbf {w}^{[1]}),\nonumber \\&\mathcal {F}(X,B, X^{[1]}, \mathbf {y}^{[1]},\mathbf {w}^{[1]})\rightarrow \mathcal {F}(\hat{X},\hat{B}, V, T, P_1, Q_1, X^{[1]}, \mathbf {y}^{[1]},\mathbf {w}^{[1]}),\\&f(X, X^{[1]})\rightarrow f(P_1, V, \hat{X}, X^{[1]}),\quad g(B, \mathbf {y}^{[1]},\mathbf {w}^{[1]})\rightarrow g(Q_1, T,\hat{B},\mathbf {y}^{[1]},\mathbf {w}^{[1]}).\nonumber \end{aligned}$$
(3.28)

We recall here that K(X) does not depend on \(P_1\); likewise, L(B) does not depend on \(Q_1\). Moreover, according to the change (3.27), we have

$$\begin{aligned} \mathcal {P}( \varOmega , \varXi , \hat{X}, \hat{B}, V, T)&=\exp \left\{ -\sum _{j,k}\tilde{\mathfrak {s}}_{jk} Tr \varOmega _j\varXi _k\right\} \nonumber \\&\quad \times \prod _j\frac{1}{\det ^M\big (1+M^{-1}V_j^*\hat{X}_j^{-1}V_j \varOmega _jT_j^{-1}\hat{B}_j^{-1}T_j\varXi _j\big )}\nonumber \\&\quad \times \prod _{k=p,q}\frac{\det \big (V_k^*\hat{X}_kV_k+M^{-1} \varOmega _k T_k^{-1}\hat{B}_k^{-1}T_k\varXi _k\big )}{\det \hat{B}_k} \end{aligned}$$
(3.29)

and

$$\begin{aligned}&\mathcal {Q}\left( \varOmega , \varXi , \varvec{\omega }^{[1]},\varvec{\xi }^{[1]}, P_1, Q_1, X^{[1]}, \mathbf {y}^{[1]},\mathbf {w}^{[1]}\right) \nonumber \\&\quad =\prod _{k=p,q}\exp \left\{ -\frac{1}{\sqrt{M}}\sum _{j}\tilde{\mathfrak {s}}_{jk} \Big (TrP_1^* \varOmega _jQ_1 \mathbf {w}_k^{[1]}\varvec{\xi }_k^{[1]}J +Tr\varvec{\omega }_k^{[1]}(\mathbf {w}_k^{[1]})^* JQ_1^{-1}\varXi _j P_1\Big )\right\} \nonumber \\&\quad \quad \times \prod _{k,\ell =p,q}\exp \Big \{ -\frac{1}{M}\tilde{\mathfrak {s}}_{k\ell }Tr\varvec{\omega }_k^{[1]}(\mathbf {w}_k^{[1]})^* J(\mathbf {w}_\ell ^{[1]})\varvec{\xi }_\ell ^{[1]}J \Big \}\nonumber \\&\quad \quad \times \prod _{k=p,q}\left( 1-(y_k^{[1]})^{-1}\varvec{\xi }_k^{[1]}(X_k^{[1]})^{-1}\varvec{\omega }_k^{[1]}\right) ^{2}. \end{aligned}$$
(3.30)

Consequently, using (3.26), from (3.17) we can write

$$\begin{aligned} \mathbb {E}|G_{pq,11}(z)|^{2n}&= \frac{M^{4W}}{(n!)^2 8^W\pi ^{2W+4}} \int \prod _{j=2}^W \mathrm{d}\mu (V_j) \mathrm{d}\nu (T_j) \nonumber \\&\quad \times \int _{\mathbb {R}_+^{2W}} \prod _{j=1}^W \mathrm{d}b_{j,1} \mathrm{d} b_{j,2} \; \oint _ {\varSigma ^{2W}} \prod _{j=1}^W\mathrm{d}x_{j,1}\mathrm{d}x_{j,2} \nonumber \\&\quad \times \exp \left\{ -M\big (K(\hat{X},V)+L(\hat{B},T)\big )\right\} \nonumber \\&\quad \times \prod _{j=1}^W(x_{j,1}-x_{j,2})^2(b_{j,1}+b_{j,2})^2\cdot \mathsf {A}(\hat{X}, \hat{B}, V, T), \end{aligned}$$
(3.31)

where we introduced the notation

$$\begin{aligned} \mathsf {A}(\hat{X}, \hat{B}, V, T)&:=\int \mathrm{d}X^{[1]} \mathrm{d}\mathbf {y}^{[1]} \mathrm{d} \mathbf {w}^{[1]} \mathrm{d}\varvec{\omega }^{[1]} \mathrm{d}\varvec{\xi }^{[1]} \mathrm{d} \varOmega \mathrm{d} \varXi \mathrm{d}\mu (P_1)\mathrm{d}\nu (Q_1)\;\nonumber \\&\quad \mathcal {P}( \varOmega , \varXi , \hat{X}, \hat{B}, V, T) \mathcal {Q}( \varOmega , \varXi , \varvec{\omega }^{[1]},\varvec{\xi }^{[1]}, P_1, Q_1, X^{[1]}, \mathbf {y}^{[1]},\mathbf {w}^{[1]}) \nonumber \\&\quad \times \mathcal {F}(\hat{X},\hat{B}, V, T, P_1, Q_1, X^{[1]}, \mathbf {y}^{[1]},\mathbf {w}^{[1]}). \end{aligned}$$
(3.32)

In (3.31), the regions of the \(V_j\)’s are all \(\mathring{U}(2)\), and those of the \(T_j\)’s are all \(\mathring{U}(1,1)\). Observe that all Grassmann variables are inside the integrand of the integral \(\mathsf {A}(\hat{X}, \hat{B}, V, T)\). Hence, (3.31) separates the saddle point calculation in the \(\hat{X}, \hat{B}, V, T\)-variables from the observable \(\mathsf {A}(\hat{X}, \hat{B}, V, T)\).

To facilitate the discussions in the remaining part, we introduce some additional terms and notation here. Henceforth, we will employ the notation \((X^{[1]})^{-1}=\{(X^{[1]}_p)^{-1}, (X^{[1]}_q)^{-1}\}\) and \((\mathbf {y}^{[1]})^{-1}=\{(y_p^{[1]})^{-1}, (y_q^{[1]})^{-1}\}\) for the collection of inverse matrices and reciprocals, respectively. For a matrix or a vector A under discussion, we will use the term A-variables to refer to all the variables parametrizing it. For example, the \(\hat{X}_j\)-variables are \(x_{j,1}\) and \(x_{j,2}\), and the \(\hat{X}\)-variables refer to the collection of all \(\hat{X}_j\)-variables. Analogously, we can define the terms T-variables, \(\mathbf {y}^{[1]}\)-variables, \(\varOmega \)-variables and so on. We also use the term A-entries to refer to the non-zero entries of A. Note that the \(\hat{X}_j\)-variables are just the \(\hat{X}_j\)-entries. However, for \(T_j\), they are different, namely,

$$\begin{aligned} T_j\text {-variables}:\quad t_j, \sigma _j,\quad \text {vs.}\quad T_j\text {-entries}: \quad s_j, t_je^{\mathbf {i}\sigma _j}, t_je^{-\mathbf {i}\sigma _j}. \end{aligned}$$

Analogously, we will also use the term T-entries to refer to the collection of all \(T_j\)-entries. Then V-entries, \(\mathbf {w}^{[1]}\)-entries, etc. are defined in the same manner. It is easy to check that the \(Q_1^{-1}\)-entries are the same as the \(Q_1\)-entries, up to a sign; likewise, the \(T_j^{-1}\)-entries are the same as the \(T_j\)-entries, for all \(j=2,\ldots , W\).
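The last claim is immediate: since \(\det Q_1=s^2-t^2=1\), the \(2\times 2\) inversion formula gives

$$\begin{aligned} Q_1^{-1}=\left( \begin{array}{cc}s &{} -te^{\mathbf {i}\sigma }\\ -te^{-\mathbf {i}\sigma } &{} s\end{array}\right) , \end{aligned}$$

whose entries coincide with the \(Q_1\)-entries up to the sign of the off-diagonal ones; the same applies to \(T_j^{-1}\), since \(\det T_j=s_j^2-t_j^2=1\).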

Moreover, to simplify the notation, we make the convention here that we will frequently use a dot to represent all the arguments of a function. That means, for instance, we will write \(\mathcal {P}( \varOmega , \varXi , \hat{X}, \hat{B}, V, T)\) as \(\mathcal {P}(\cdot )\) if there is no confusion. Analogously, we will also use the abbreviation \(\mathcal {Q}(\cdot ), \mathcal {F}(\cdot ), \mathsf {A}(\cdot )\), and so on.

Let \(\mathbf {a}:=\{a_1,\ldots , a_\ell \}\) be a set of variables. We will adopt the notation

$$\begin{aligned} \mathfrak {Q}(\mathbf {a}; \kappa _1, \kappa _2, \kappa _3) \end{aligned}$$

to denote the class of all multivariate polynomials \(\mathfrak {p}(\mathbf {a})\) in the arguments \(a_1,\ldots , a_\ell \) such that the following three conditions are satisfied: (i) The total number of the monomials in \(\mathfrak {p}(\mathbf {a})\) is bounded by \(\kappa _1\); (ii) the coefficients of all monomials in \(\mathfrak {p}(\mathbf {a})\) are bounded by \(\kappa _2\) in magnitude; (iii) the power of each \(a_i\) in each monomial is bounded by \(\kappa _3\), for all \(i=1,\ldots , \ell \). For example, \(5b_{j,1}^{-1}+3b_{j,1}t_j^2+1\in \mathfrak {Q}\big (\{b_{j,1}^{-1}, b_{j,1}, t_j\}; 3, 5, 2\big )\). In addition, we define the subset of \(\mathfrak {Q}(\mathbf {a}; \kappa _1, \kappa _2, \kappa _3)\), namely,

$$\begin{aligned} \mathfrak {Q}_{\text {deg}}\big (\mathbf {a}; \kappa _1, \kappa _2, \kappa _3\big ) \end{aligned}$$

consisting of those polynomials in \(\mathfrak {Q}(\mathbf {a}; \kappa _1, \kappa _2, \kappa _3)\) such that the degree is bounded by \(\kappa _3\), i.e. the total degree of each monomial is bounded by \(\kappa _3\). For example \(5b_{j,1}^{-1}+3b_{j,1}t_j^2+1\in \mathfrak {Q}_{\text {deg}}\big (\{b_{j,1}^{-1}, b_{j,1}, t_j\}; 3, 5, 3\big )\).

4 Preliminary discussion on the integrand

In this section, we perform a preliminary analysis on the factors of the integrand in (3.17). Recall the matrix \(\mathfrak {I}\) defined in (1.29).

4.1 Factor \(\exp \{-M(K(\hat{X},V)+L(\hat{B},T))\}\)

Recall the parametrization of \(\hat{B}_j, \hat{X}_j, T_j\) and \(V_j\) in (3.23) and (3.25), as well as the matrices defined in (1.28). According to the discussion in [20], there are three types of saddle points of this function, namely,

  • Type I :      For each j,      \(\displaystyle (\hat{B}_j, T_j,\hat{X}_j)=(D_{\pm }, I, D_{\pm })\quad \text {or} \quad (D_{\pm }, I, D_{\mp })\),

    \(\theta _j\in \mathbb {L}\), \(v_j=0\) if \(\hat{X}_j=\hat{X}_1\), and \(v_j=1\) if \(\hat{X}_j\ne \hat{X}_1\).

  • Type II :     For each j,      \(\displaystyle (\hat{B}_j, T_j,\hat{X}_j)=(D_{\pm }, I, D_{+})\) and \(V_j\in \mathring{U}(2)\).

  • Type III :       For each j,      \(\displaystyle (\hat{B}_j, T_j,\hat{X}_j)=(D_{\pm }, I, D_{-})\) and \(V_j\in \mathring{U}(2)\).

(Actually, since \(\theta _j\) and \(v_j\) vary over continuous sets, it would be more appropriate to use the term saddle manifolds.) We will see that the main contribution to the integral (3.17) comes from small vicinities of the Type I saddle points. First, by the definition in (3.24), we have \(V_1=I\). If we regard the \(\theta _j\)’s in the parametrization of the \(V_j\)’s as fixed parameters, it is easy to see that there are in total \(2^W\) choices of Type I saddle points. Furthermore, the contributions from all the Type I saddle points are the same, since one can always apply the transformation \(V_j\rightarrow \mathfrak {I} V_j\) or \((\hat{X}_j, P_j)\rightarrow (\mathfrak {I}\hat{X}_j\mathfrak {I}, \mathfrak {I}P_j)\) for several j to map one saddle point to another. Hence, for the Type I saddle points, it suffices to consider

  • Type I’ :      For each j,      \(\displaystyle (\hat{B}_j, T_j,\hat{X}_j, V_j)=(D_{\pm }, I, D_{\pm }, I)\).

Therefore, the total contribution to the integral (3.17) from all Type I saddle points is \(2^W\) times that from the Type I’ saddle point.

Following the discussion in [20], we will show in Sect. 5 that both \(K(\hat{X},V)-K(D_{\pm }, I)\) and \(L(\hat{B},T)-L(D_{\pm }, I)\) have nonnegative real parts, bounded from below by positive semidefinite quadratic forms, which allows us to perform the saddle point analysis. In addition, it will be seen that in a vicinity of the Type I’ saddle point, \(\exp \{-M(K(\hat{X},V)+L(\hat{B},T))\}\) is approximately Gaussian.

4.2 Factor \(\mathcal {Q}( \varOmega , \varXi , \varvec{\omega }^{[1]},\varvec{\xi }^{[1]}, P_1, Q_1, X^{[1]}, \mathbf {y}^{[1]},\mathbf {w}^{[1]})\)

The function \(\mathcal {Q}(\cdot )\) contains both the \( \varOmega , \varXi \)-variables from \(\mathcal {P}(\cdot )\) and the \(P_1, Q_1, X^{[1]},\mathbf {y}^{[1]}, \mathbf {w}^{[1]}\)-variables from \(\mathcal {F}(\cdot )\). In addition, note that in the integrand in (3.17), \(\mathcal {Q}(\cdot )\) is the only factor containing the \(\varvec{\omega }^{[1]}\) and \(\varvec{\xi }^{[1]}\)-variables. Hence, we can compute the integral

$$\begin{aligned}&\mathsf {Q}\big ( \varOmega , \varXi , P_1, Q_1, X^{[1]}, \mathbf {y}^{[1]},\mathbf {w}^{[1]}\big )\nonumber \\&\quad :=\int d\varvec{\omega }^{[1]}d\varvec{\xi }^{[1]}\; \mathcal {Q}\big ( \varOmega , \varXi , \varvec{\omega }^{[1]},\varvec{\xi }^{[1]}, P_1, Q_1, X^{[1]}, \mathbf {y}^{[1]},\mathbf {w}^{[1]}\big ) \end{aligned}$$
(4.1)

The explicit formula for \(\mathsf {Q}( \cdot )\) is complicated but irrelevant for our purposes. From (3.30) and the definition of the Grassmann integral, it is not difficult to see that \(\mathsf {Q}( \cdot )\) is a polynomial in the \((X^{[1]})^{-1}, (\mathbf {y}^{[1]})^{-1}, \mathbf {w}^{[1]}, P_1, Q_1, \varOmega \) and \(\varXi \)-entries. In principle, for each monomial in the polynomial \(\mathsf {Q}(\cdot )\), we can combine the Grassmann variables with \(\mathcal {P}(\cdot )\) and perform the integral over the \(\varOmega \) and \(\varXi \)-variables, whilst we combine the complex variables with \(\mathcal {F}(\cdot )\) and perform the integral over the \(X^{[1]}, \mathbf {y}^{[1]}, \mathbf {w}^{[1]}, P_1\) and \(Q_1\)-variables. A more formal discussion of \(\mathsf {Q}(\cdot )\) will be given in Sect. 6.1. However, the terms from \(\mathsf {Q}(\cdot )\) turn out to be irrelevant for our proof. Therefore, whenever \(\mathsf {Q}(\cdot )\) is involved, we will typically adopt the following strategy: we first neglect \(\mathsf {Q}(\cdot )\) and discuss \(\mathcal {P}(\cdot )\) and \(\mathcal {F}(\cdot )\) separately; at the end, we comment on the minor modifications needed to take \(\mathsf {Q}(\cdot )\) into account.
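For orientation, we recall the basic rule of Grassmann integration in the simplest case (a minimal illustration; the overall sign depends on the ordering convention for \(\mathrm{d}\xi \,\mathrm{d}\omega \), which we do not fix here):

$$\begin{aligned} \int \mathrm{d}\xi \,\mathrm{d}\omega \; (a+b\,\omega \xi )=\pm b,\qquad a,b\in \mathbb {C}. \end{aligned}$$

Thus integrating out a pair of Grassmann variables simply extracts the coefficient of their product; this is the mechanism behind the polynomial structure of \(\mathsf {Q}(\cdot )\) and behind Lemma 6.3 below.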

4.3 Factor \(\mathcal {P}( \varOmega , \varXi , \hat{X}, \hat{B}, V, T)\)

We will mainly regard \(\mathcal {P}(\cdot )\) as a function of the \(\varOmega \) and \(\varXi \)-variables. As mentioned above, some \(\varOmega \) and \(\varXi \)-variables also appear in the irrelevant term \(\mathsf {Q}(\cdot )\). But we temporarily ignore them and proceed as if the integral over the \( \varOmega \) and \(\varXi \)-variables reads

$$\begin{aligned} \mathsf {P}(\hat{X}, \hat{B}, V, T):=\int d \varOmega d\varXi \; \mathcal {P}( \varOmega , \varXi , \hat{X}, \hat{B}, V, T). \end{aligned}$$
(4.2)

We shall estimate \(\mathsf {P}(\cdot )\) in three different regions: (1) the complement of the vicinities of the saddle points; (2) the vicinity of Type I saddle point; (3) the vicinities of Type II and III saddle points, which will be done in Sects. 6.2, 9.1 and 10.1, respectively. (Definition 5.5 gives the precise definition of the vicinities.) In each case we will decompose the function \(\mathcal {P}(\cdot )\) as a product of a Gaussian measure and a multivariate polynomial of Grassmann variables. Consequently, we can employ (3.2) to perform the integral of this polynomial against the Gaussian measure, whereby \(\mathsf {P}(\cdot )\) can be estimated.

4.4 Factor \(\mathcal {F}(\hat{X},\hat{B}, V, T, P_1, Q_1, X^{[1]}, \mathbf {y}^{[1]},\mathbf {w}^{[1]})\)

Observe that \(\mathcal {F}\) is the only term containing the energy scale \(\eta \). As in the previous discussion of \(\mathcal {P}(\cdot )\), here we also ignore the \(P_1, Q_1, X^{[1]},\mathbf {y}^{[1]}, \mathbf {w}^{[1]}\)-variables from the irrelevant term \(\mathsf {Q}(\cdot )\) temporarily, and investigate the integral

$$\begin{aligned} \mathsf {F}(\hat{X},\hat{B}, V, T)&=\int d X^{[1]} d\mathbf {y}^{[1]} \mathrm{d}\mathbf {w}^{[1]} d\mu (P_1) d\nu (Q_1)\;\nonumber \\&\quad \times \mathcal {F}(\hat{X},\hat{B}, V, T, P_1, Q_1, X^{[1]}, \mathbf {y}^{[1]},\mathbf {w}^{[1]})\nonumber \\&=\int d X^{[1]}d\mu (P_1)\; f(\hat{X}, V, P_1)\nonumber \\&\quad \times \int d\mathbf {y}^{[1]} \mathrm{d}\mathbf {w}^{[1]} d\nu (Q_1)\; g(\hat{B}, T, Q_1, \mathbf {y}^{[1]},\mathbf {w}^{[1]}). \end{aligned}$$
(4.3)

We shall also estimate \(\mathsf {F}(\cdot )\) in three different regions: (1) the complement of the vicinities of the saddle points; (2) the vicinity of the Type I saddle point; (3) the vicinities of the Type II and III saddle points, which will be done in Sects. 6.3, 9.2 and 10.2, respectively.

In particular, when we restrict the \(\hat{X}, \hat{B}, V\) and T-variables to the vicinity of the Type I saddle points, the above integral can be performed approximately, resulting in our main term, a factor of order \(1/(N\eta )^{n+2}\). This step will be done in Sect. 9. It is instructive to give a heuristic sketch of this calculation. First, we plug the Type I saddle points into (4.3). We will show that the integral of \(f(\cdot )\) approximately reads

$$\begin{aligned} e^{-(a_+-a_-)N\eta }\int d X^{[1]}d\mu (P_1)\; f(D_{\pm }, I, P_1)\sim \frac{1}{N\eta }, \end{aligned}$$

which is the easy part. Then, recalling the definition of \(g(\cdot )\) in (3.21) and the parameterization (3.15), we will show that the integral of \(g(\cdot )\) approximately reads

$$\begin{aligned}&e^{(a_+-a_-)N\eta }\int d\mathbf {y}^{[1]} \mathrm{d}\mathbf {w}^{[1]} d\nu (Q_1)\; g(D_{\pm }, I, Q_1, \mathbf {y}^{[1]},\mathbf {w}^{[1]})\nonumber \\&\quad \sim \int _0^\infty 2t d t\int _{\mathbb {L}^2} d\sigma _p^{[1]} d\sigma _q^{[1]} \; e^{\mathbf {i}n\sigma _p^{[1]}}e^{-\mathbf {i}n\sigma _q^{[1]}}\cdot e^{-cN\eta t^2+c_1e^{-\mathbf {i}\sigma _p^{[1]}}t+c_2e^{\mathbf {i}\sigma _q^{[1]}}t}\nonumber \\&\quad \sim \int _0^\infty 2t d t\cdot t^{2n}\cdot e^{-cN\eta t^2}\sim \frac{1}{(N\eta )^{n+1}}, \end{aligned}$$
(4.4)

where in the second step above we used the fact that

$$\begin{aligned} \int _{\mathbb {L}} d\sigma \cdot e^{\mathbf {i}n\sigma }\;e^{ce^{-\mathbf {i}\sigma }t}\sim t^n. \end{aligned}$$
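Both of the remaining steps in (4.4) are elementary. For the angular integral, expanding the exponential in a power series shows that only one Fourier mode survives (a sketch, under the assumption that \(\mathbb {L}\) parametrizes a full period of the angle \(\sigma \)); for the radial integral, the substitution \(u=t^2\) reduces it to a Gamma integral:

$$\begin{aligned} \int _{\mathbb {L}} d\sigma \cdot e^{\mathbf {i}n\sigma }\;e^{ce^{-\mathbf {i}\sigma }t}=\sum _{k=0}^{\infty }\frac{(ct)^k}{k!}\int _{\mathbb {L}} d\sigma \; e^{\mathbf {i}(n-k)\sigma }=2\pi \frac{(ct)^n}{n!}\sim t^n, \end{aligned}$$

$$\begin{aligned} \int _0^\infty 2t d t\cdot t^{2n} e^{-cN\eta t^2}=\int _0^\infty d u\; u^n e^{-cN\eta u}=\frac{n!}{(cN\eta )^{n+1}}\sim \frac{1}{(N\eta )^{n+1}}. \end{aligned}$$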

We notice that the factor \(e^{\mathbf {i}n\sigma _p^{[1]}}e^{-\mathbf {i}n\sigma _q^{[1]}}\) in (4.4) actually comes from the term \(\Big (\big (\mathbf {w}^{[1]}_q(\mathbf {w}^{[1]}_q)^*\big )_{12}\big (\mathbf {w}^{[1]}_p(\mathbf {w}^{[1]}_p)^*\big )_{21}\Big )^n\) in (3.21). This factor brings a strong oscillation to the integrand in the integral (4.4). In Case 2, an analogous factor will appear, resulting in the same estimate as (4.4). However, in Case 3, such an oscillating factor is absent, and the estimate for the counterpart of the integral in (4.4) is then of order \(1/N\eta \) instead of \(1/(N\eta )^{n+1}\). The detailed analysis will be presented in Sects. 10 and 11.

5 Saddle points and vicinities

In this section, we study the saddle points of \(K(\hat{X},V)\) and \( L(\hat{B},T)\) and deform the contours of the \(\hat{B}\)-variables to pass through the saddle points. Then we introduce and classify some small vicinities of these saddle points. The derivation of the saddle points of \(K(\hat{X},V)\) and \( L(\hat{B},T)\) in Sects. 5.1 and 5.2 below is essentially the same as the counterpart in [20], the only difference is that we are working under a more general setting on S. Hence, in Sects. 5.1 and 5.2, we just sketch the discussion, list the results, and make necessary modifications to adapt to our setting. In the sequel, we employ the notation

$$\begin{aligned} \mathbf {b}_a:= & {} (b_{1,a},\ldots , b_{W,a}),\quad \mathbf {x}_a:=(x_{1,a},\ldots , x_{W,a}), \qquad a=1,2,\nonumber \\ \mathbf {t}:= & {} (t_2,\ldots , t_W),\quad \mathbf {v}:=(v_2,\ldots , v_W),\nonumber \\&\varvec{\sigma }:=(\sigma _2,\ldots ,\sigma _W),\quad \varvec{\theta }:=(\theta _2,\ldots ,\theta _W). \end{aligned}$$
(5.1)

As mentioned above, later we also need to deform the contours, and discuss the integral over some vicinities of the saddle points, thus it is convenient to introduce a notation for the integral over specific domains. To this end, for \(a=1,2\), we use \(\mathbf {I}^b_{a}\) and \(\mathbf {I}^x_a\) to denote generic domains of \(\mathbf {b}_a\) and \(\mathbf {x}_a\) respectively. Analogously, we use \(\mathbf {I}^t\) and \(\mathbf {I}^v\) to represent generic domains of \(\mathbf {t}\) and \(\mathbf {v}\), respectively. These domains will be specified later. Now, for some collection of domains, we introduce the notation

$$\begin{aligned} \mathcal {I}\left( \mathbf {I}^{b}_1, \mathbf {I}^b_2, \mathbf {I}^x_1, \mathbf {I}^x_2, \mathbf {I}^t,\mathbf {I}^v\right)&:=\frac{M^{4W}}{(n!)^28^{W}\pi ^{2W+4}}\int _{\mathbb {L}^{2W-2}} \prod _{j=2}^W\frac{\mathrm{d}\theta _j}{2\pi }\prod _{j=2}^W \frac{\mathrm{d}\sigma _j}{2\pi } \nonumber \\&\quad \times \int _{\mathbf {I}^b_1} \prod _{j=1}^W \mathrm{d} b_{j,1} \int _{\mathbf {I}^b_2} \prod _{j=1}^W \mathrm{d} b_{j,2} \int _{\mathbf {I}^x_1} \prod _{j=1}^W \mathrm{d} x_{j,1}\nonumber \\&\quad \times \int _{\mathbf {I}^x_2} \prod _{j=1}^W \mathrm{d} x_{j,2} \int _{\mathbf {I}^t} \prod _{j=2}^W 2t_j \mathrm{d} t_j \int _{\mathbf {I}^v} \prod _{j=2}^W 2v_j \mathrm{d} v_j\nonumber \\&\quad \times \exp \left\{ -M\big (K(\hat{X},V)+L(\hat{B},T)\big )\right\} \nonumber \\&\quad \times \prod _{j=1}^W (x_{j,1}-x_{j,2})^2(b_{j,1}+b_{j,2})^2\cdot \mathsf {A}(\hat{X}, \hat{B}, V, T). \end{aligned}$$
(5.2)

For example, we can write (3.31) as

$$\begin{aligned} \mathbb {E}|G_{pq,11}(z)|^{2n}=\mathcal {I}(\mathbb {R}_+^W, \mathbb {R}_+^W, \varSigma ^W, \varSigma ^W, \mathbb {R}^{W-1}_+,\mathbb {I}^{W-1}), \end{aligned}$$
(5.3)

which is the integral over the full domain.

5.1 Saddle points of \(L(\hat{B},T)\)

We introduce the function

$$\begin{aligned} \Bbbk (a):=\frac{a^2}{2}-\mathbf {i}Ea-\log a, \quad a\in \mathbb {C}. \end{aligned}$$
(5.4)
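Before proceeding, it is instructive to locate the critical points of \(\Bbbk \). Since the quadratic coupling term in \(\ell (\cdot )\) below vanishes on constant vectors, the saddle point equation reduces to \(\Bbbk '(a)=0\). (This is a heuristic aside; we assume here that (1.28) defines \(a_{\pm }=(\mathbf {i}E\pm \sqrt{4-E^2})/2\) and \(D_{\pm }=\text {diag}(a_+,a_-)\).) Explicitly,

$$\begin{aligned} \Bbbk '(a)=a-\mathbf {i}E-\frac{1}{a}=0\quad \Longleftrightarrow \quad a^2-\mathbf {i}Ea-1=0\quad \Longleftrightarrow \quad a=a_{\pm }=\frac{\mathbf {i}E\pm \sqrt{4-E^2}}{2}. \end{aligned}$$

In particular, \(|a_{\pm }|^2=(E^2+(4-E^2))/4=1\) and \(\bar{a}_+=-a_-\), which explains why the contour \(\varGamma \) in (5.11) below passes through \(a_+\), while \(\bar{\varGamma }\) is its complex conjugate and passes through \(-a_-\).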

Recalling the definition of \(L(\cdot )\) in (3.18), the decomposition of \(B_j\)’s in (3.22) and the definition of \(T_j\)’s in (3.24), we can write

$$\begin{aligned} L(\hat{B},T)=:\ell (\mathbf {b}_1)+\ell (-\mathbf {b}_2)+\ell _S(\hat{B},T), \end{aligned}$$
(5.5)

where we used the notation introduced in (5.1), and the functions \(\ell (\cdot )\) and \(\ell _S(\cdot )\) are defined as

$$\begin{aligned} \ell (\mathbf {a}):= & {} -\frac{1}{4}\sum _{j,k}\mathfrak {s}_{jk}(a_j-a_k)^2+\sum _j\Bbbk (a_j),\quad \mathbf {a}=(a_1,\ldots , a_W)\in \mathbb {C}^W,\nonumber \\ \ell _S(\hat{B},T):= & {} \frac{1}{2}\sum _{j,k} \mathfrak {s}_{jk} |(T_kT_j^{-1})_{12}|^2(b_{j,1}+b_{j,2})(b_{k,1}+b_{k,2}). \end{aligned}$$
(5.6)

Following the discussion in [20] with slight modification (see Section 3 therein), we see that for \(|E|\le \sqrt{2}-\kappa \), the saddle point of \( L(\hat{B},T)\) is

$$\begin{aligned} (\hat{B}_j,T_j)=(D_{\pm }, I), \quad \forall \; j=1,\ldots ,W, \end{aligned}$$
(5.7)

where \(D_{\pm }\) is defined in (1.28). For simplicity, we will write (5.7) as \((\hat{B},T)=(D_{\pm },I)\) in the sequel. Observe that

$$\begin{aligned} L(D_{\pm }, I)=\ell (a_+)+\ell (a_-),\qquad \ell _S(D_{\pm },I)=0. \end{aligned}$$
(5.8)

We introduce the notation

$$\begin{aligned} \mathring{\ell }_{++}(\mathbf {a}):=\ell (\mathbf {a})-\ell (a_+),\quad \mathring{\ell }_{+-}(\mathbf {a}):=\ell (\mathbf {a})-\ell (a_-),\quad \mathring{\ell }_{--}(\mathbf {a}):=\ell (-\mathbf {a})-\ell (a_-), \nonumber \\ \end{aligned}$$
(5.9)

where \(\ell (a_+)\) represents the value of \(\ell (\mathbf {a})\) at the point \(\mathbf {a}=(a_+,\ldots , a_+)\), and \(\ell (a_-)\) is defined analogously. Correspondingly, we adopt the notation

$$\begin{aligned} \mathring{L}(\hat{B},T):=L(\hat{B},T)-L(D_{\pm }, I)=\mathring{\ell }_{++}(\mathbf {b}_1)+\mathring{\ell }_{--}(\mathbf {b}_2)+\ell _S(\hat{B},T), \quad \quad \end{aligned}$$
(5.10)

where the second step is implied by (5.5), (5.8) and (5.9). Now, for each \(j=1,\ldots ,W\), we deform the contours of \(b_{j,1}\) and \(b_{j,2}\) to

$$\begin{aligned} b_{j,1}\in \varGamma :=\{ra_+| r\in \mathbb {R}_+\},\quad b_{j,2}\in \bar{\varGamma }=\{-ra_-|r\in \mathbb {R}_+\} \end{aligned}$$
(5.11)

to pass through the saddle points of the \(\hat{B}\)-variables, based on the following lemma, which will be proved in Sect. 7.

Lemma 5.1

With the notation introduced in (5.2), we have

$$\begin{aligned} \mathcal {I}\big (\varGamma ^W, \bar{\varGamma }^W, \varSigma ^W, \varSigma ^W, \mathbb {R}_+^{W-1},\mathbb {I}^{W-1}\big )&=\mathcal {I}\big (\mathbb {R}_+^W, \mathbb {R}_+^W, \varSigma ^W, \varSigma ^W, \mathbb {R}_+^{W-1},\mathbb {I}^{W-1}\big )\nonumber \\&=\mathbb {E}|G_{pq,11}(z)|^{2n}. \end{aligned}$$

We introduce the notation

$$\begin{aligned} r_{j,1}=|b_{j,1}|,\quad r_{j,2}=|b_{j,2}|,\quad j=1,\ldots ,W. \end{aligned}$$
(5.12)

Along the new contours, we have the following lemma.

Lemma 5.2

Suppose that \(|E|\le \sqrt{2}-\kappa \). Let \(\mathbf {b}_{1}\in \varGamma ^W, \mathbf {b}_{2}\in \bar{\varGamma }^W\). We have

$$\begin{aligned} \mathsf {Re}\mathring{L}(\hat{B},T)\ge & {} c\sum _{a=1,2}\sum _{j=1}^W \big ((r_{j,a}-1)^2+(r_{j,a}-\log r_{j,a}-1)\big )+\mathsf {Re}\ell _S(\hat{B},T)\nonumber \\\ge & {} c\sum _{a=1,2}\sum _{j=1}^W (r_{j,a}-1)^2 \end{aligned}$$
(5.13)

for some positive constant c.

Proof

Since \(|E|\le \sqrt{2}-\kappa \), we have \(\mathsf {Re}(b_{j,1}+b_{j,2})(b_{k,1}+b_{k,2})\ge 0\) for \(\mathbf {b}_{1}\in \varGamma ^W\) and \(\mathbf {b}_{2}\in \bar{\varGamma }^W\), thus \(\mathsf {Re}\ell _S(\hat{B},T)\ge 0\), in light of the definition in (5.6). Consequently, according to (5.10), and since \(r-\log r-1\ge 0\) for all \(r>0\), it suffices to prove

$$\begin{aligned} \mathsf {Re}\mathring{\ell }_{++}(\mathbf {b}_1)+\mathsf {Re}\mathring{\ell }_{--}(\mathbf {b}_2)\ge c\sum _{a=1,2}\sum _{j=1}^W \Big ((r_{j,a}-1)^2+(r_{j,a}-\log r_{j,a}-1)\Big ) \nonumber \\ \end{aligned}$$
(5.14)

for some positive constant c. To see this, we observe the following identities obtained via elementary calculation,

$$\begin{aligned} \mathsf {Re}\mathring{\ell }_{++}(\mathbf {b}_1)&=\frac{E^2-2}{4}\Big (\frac{1}{2}\sum _{j,k} \mathfrak {s}_{jk}(r_{j,1}-r_{k,1})^2-\sum _j(r_{j,1}-1)^2\Big )\nonumber \\&\quad +\sum _{j=1}^W\big (r_{j,1}-\log r_{j,1}-1\big ) \end{aligned}$$
(5.15)

which together with \(|E|\le \sqrt{2}-\kappa \) and (1.5) implies (5.14) immediately. The same identity holds if we replace \(\mathring{\ell }_{++}(\mathbf {b}_1)\) and the \(r_{j,1}\)’s in (5.15) by \(\mathring{\ell }_{--}(\mathbf {b}_2)\) and the \(r_{j,2}\)’s, respectively. This completes the proof of Lemma 5.2. \(\square \)
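For a single site, i.e., dropping the quadratic coupling term in \(\ell (\cdot )\), the identity (5.15) can be verified directly. Assuming, as in the aside below (5.4), that \(a_+=(\mathbf {i}E+\sqrt{4-E^2})/2\), so that \(\mathsf {Re}\,a_+^2=(2-E^2)/2\) and \(\mathsf {Re}(\mathbf {i}Ea_+)=-E^2/2\), we find for \(b=ra_+\) with \(r\in \mathbb {R}_+\),

$$\begin{aligned} \mathsf {Re}\big (\Bbbk (ra_+)-\Bbbk (a_+)\big )&=\frac{2-E^2}{4}(r^2-1)+\frac{E^2}{2}(r-1)-\log r\\&=\frac{2-E^2}{4}(r-1)^2+(r-\log r-1), \end{aligned}$$

which exhibits exactly the two types of terms on the r.h.s. of (5.15).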

5.2 Saddle points of \(K(\hat{X},V)\)

Analogously, recalling the definition in (5.6), we can write

$$\begin{aligned} K(\hat{X},V)=: -\ell (\mathbf {x}_1)-\ell (\mathbf {x}_2)+ \ell _S(\hat{X},V), \end{aligned}$$
(5.16)

where \(\ell (\cdot )\) is defined in the first line of (5.6) and \(\ell _S(\hat{X},V)\) is defined as

$$\begin{aligned} \ell _S(\hat{X},V)=\frac{1}{2}\sum _{j,k} \mathfrak {s}_{jk}|(V_kV_j^*)_{12}|^2(x_{j,1}-x_{j,2})(x_{k,1}-x_{k,2}). \end{aligned}$$
(5.17)

Analogously to the notation \(L(D_{\pm },I)\), we will use \(K(D_{\pm },I)\) to represent the value of \(K(\hat{X},V)\) at \((\hat{X}_j,V_j)=(D_{\pm }, I)\) for all \(j=1,\ldots , W\). In addition, \(K(D_{+},I)\) and \(K(D_{-},I)\) are defined in the same manner. Observing that

$$\begin{aligned} \ell _S(D_{\pm },I)=\ell _S(D_{+},I)=\ell _S(D_{-},I)=0, \end{aligned}$$
(5.18)

we have

$$\begin{aligned} K(D_{\pm }, I)=-\ell (a_+)-\ell (a_-),\quad K(D_{+},I)=-2\ell (a_+),\quad K(D_{-},I)=-2\ell (a_-). \nonumber \\ \end{aligned}$$
(5.19)

Moreover, we employ the notation

$$\begin{aligned} \mathring{K}(\hat{X},V)=K(\hat{X},V)-K(D_{\pm }, I)=-\mathring{\ell }_{++}(\mathbf {x}_1)-\mathring{\ell }_{+-}(\mathbf {x}_2)+\ell _S(\hat{X},V). \quad \quad \quad \end{aligned}$$
(5.20)

We will need the following elementary observations, which are easy to check from (5.19) and (5.6):

$$\begin{aligned} K(D_{\pm }, I)+L(D_{\pm }, I)=0, \qquad \mathsf {Re}K(D_{+},I)=\mathsf {Re}K(D_{-},I)=\mathsf {Re}K(D_{\pm },I). \nonumber \\ \end{aligned}$$
(5.21)

In addition, we introduce the \(W\times W\) matrix

$$\begin{aligned} S^v=(\mathfrak {s}^v_{jk}),\quad \mathfrak {s}^v_{jk}:=\mathfrak {s}_{jk}|(V_kV_j^*)_{12}|^2, \end{aligned}$$
(5.22)

and the \(2W\times 2W\) matrices

$$\begin{aligned} \mathbb {S}=S\oplus S,\qquad \mathbb {S}^v:=\mathbb {S}+\bigg (\begin{array}{cccc} -S^v &{}S^v\\ S^v &{} -S^v\end{array}\bigg ), \end{aligned}$$
(5.23)

where \(\mathbb {S}^v\) depends on the V-variables according to (5.22). Here we regard the V-variables as fixed parameters. Due to the fact \(|(V_kV_j^*)_{12}|\in \mathbb {I}\), it is easy to see that \(\mathbb {S}^v\) is a weighted Laplacian of a graph with 2W vertices. In particular, \(\mathbb {S}^v\le 0\). By the definition (5.22), one can see that \(S^v_{ii}=0\) for all \(i=1,\ldots , W\). Consequently, we obtain

$$\begin{aligned} \sum _{k\ne j} \mathbb {S}^v_{jk}=\sum _{k\ne j} \mathbb {S}^v_{kj}=-\mathbb {S}^v_{jj}=\bigg \{\begin{array}{ll}-\mathfrak {s}_{jj},&{}\quad \text {if}\quad j=1,\ldots , W,\\ -\mathfrak {s}_{j-W,j-W},&{}\quad \text {if} \quad j=W+1,\ldots , 2W. \end{array} \end{aligned}$$

Similarly to (1.5), we get

$$\begin{aligned} I+\mathbb {S}^v\ge c_0I, \end{aligned}$$
(5.24)

where \(c_0\) is the constant in Assumption 1.1 (ii). Moreover, it is not difficult to see from the definitions in (5.17), (5.22) and (5.23) that

$$\begin{aligned} \frac{1}{4}\sum _{j,k} \mathfrak {s}_{jk} Tr( \hat{X}_j-\hat{X}_k)^2+ \ell _S(\hat{X},V)=-\frac{1}{2} \mathbf {x}'\mathbb {S}^v\mathbf {x}, \end{aligned}$$
(5.25)

where we used the notation \(\mathbf {x}:=(\mathbf {x}_1',\mathbf {x}_2')'\). Now let

$$\begin{aligned} \vartheta _j=\arg x_{j,1},\quad \vartheta _{W+j}=\arg x_{j,2},\quad \forall \; j=1,\ldots , W. \end{aligned}$$
(5.26)

Then, recalling the parametrization of \(V_j\)’s in (3.25), we have the following lemma.

Lemma 5.3

Assume that \(x_{j,1}, x_{j,2}\in \varSigma \) for all \(j=1,\ldots , W\). We have

$$\begin{aligned} \mathsf {Re}\mathring{K}(\hat{X},V)\ge \frac{1}{4}\sum _{j,k=1}^{2W}(\mathbb {S}^v)_{jk}(\cos \vartheta _j-\cos \vartheta _k)^2+c\sum _{j=1}^{2W}\Big (\sin \vartheta _j-\frac{E}{2}\Big )^2 \quad \quad \quad \end{aligned}$$
(5.27)

for some positive constant c. In addition, \(\mathsf {Re}\mathring{K}(\hat{X},V)\) attains its minimum 0 at the following three types of saddle points

  • Type I :   For each j,  \(\hat{X}_j=D_{\pm }\quad \text {or} \quad D_{\mp }\), \(\theta _j\in \mathbb {L}\), \(v_j=0\) if \(\hat{X}_j=\hat{X}_1\), and \(v_j=1\) if \(\hat{X}_j\ne \hat{X}_1\),

  • Type II :      For each j,      \(\hat{X}_j=D_{+},V_j\in \mathring{U}(2)\),

  • Type III :      For each j,      \(\hat{X}_j=D_{-},V_j\in \mathring{U}(2)\),

which are the restrictions of the three types of saddle points in Sect. 4.1 to the \(\hat{X}\) and V-variables.

Remark 5.4

The Type I saddle points of \((\hat{X},V)\) are exactly those points satisfying

$$\begin{aligned} V_j^*\hat{X}_jV_j=D_{\pm },\quad \forall \; j=1,\ldots , W,\quad \text {or}\quad V_j^*\hat{X}_jV_j=D_{\mp },\quad \forall \; j=1,\ldots , W. \end{aligned}$$

In Lemma 5.3, we wrote them in terms of \(\hat{X}_j, v_j\) and \(\theta _j\) in order to evoke the parameterization in (3.23) and (3.25).

Proof

By (5.16), (5.25), and the definitions of the functions \(\ell (\cdot )\) in (5.6) and \(\Bbbk (\cdot )\) in (5.4), we can write

$$\begin{aligned} \mathring{K}(\hat{X},V)=-\frac{1}{2} \mathbf {x}'\mathbb {S}^v\mathbf {x}-\sum _{j=1}^W\Big (\Bbbk (x_{j,1})+\Bbbk (x_{j,2})-\Bbbk (a_+)-\Bbbk (a_-)\Big ). \end{aligned}$$

By using (5.26) and the fact \(|x_{j,a}|=1\) for all \(j=1,\ldots , W\) and \(a=1,2\), we can obtain via elementary calculation

$$\begin{aligned}&\mathsf {Re}\mathring{K}(\hat{X},V)=\frac{1}{4}\sum _{j,k=1}^{2W}(\mathbb {S}^v)_{jk}\Big ((\cos \vartheta _j-\cos \vartheta _k)^2-(\sin \vartheta _j-\sin \vartheta _k)^2\Big )\nonumber \\&\quad +\sum _{j=1}^{2W}\Big (\sin \vartheta _j-\frac{E}{2}\Big )^2. \end{aligned}$$
(5.28)

In light of the fact \(\mathbb {S}^v\le 0\) and (5.24), we have

$$\begin{aligned} I+\frac{1}{2}\mathbb {S}^v\ge I+\mathbb {S}^v\ge c_0I. \end{aligned}$$
(5.29)

Applying (5.29) to the r.h.s. of (5.28) yields (5.27).
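In more detail (a sketch of this step): since the row sums of \(\mathbb {S}^v\) vanish, we have \(\sum _{j,k}A_{jk}(c_j-c_k)^2=-2\mathbf {c}'A\mathbf {c}\) for \(A=\mathbb {S}^v\) and any real vector \(\mathbf {c}\), and the differences \(\sin \vartheta _j-\sin \vartheta _k\) are unchanged by the shift \(\sin \vartheta _j\rightarrow \sin \vartheta _j-E/2\). Hence, writing \(\mathbf {s}:=\big (\sin \vartheta _1-\frac{E}{2},\ldots ,\sin \vartheta _{2W}-\frac{E}{2}\big )'\), the last two terms of (5.28) combine to

$$\begin{aligned} -\frac{1}{4}\sum _{j,k=1}^{2W}(\mathbb {S}^v)_{jk}(\sin \vartheta _j-\sin \vartheta _k)^2+\sum _{j=1}^{2W}\Big (\sin \vartheta _j-\frac{E}{2}\Big )^2=\mathbf {s}'\Big (I+\frac{1}{2}\mathbb {S}^v\Big )\mathbf {s}\ge c_0||\mathbf {s}||_2^2, \end{aligned}$$

which gives the second term in (5.27), while the first term of (5.28) is kept as it is.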

It remains to show that \(\mathsf {Re}\mathring{K}(\hat{X},V)\) attains its minimum 0 at the three types of points listed in Lemma 5.3. This can be verified in the same manner as the counterpart in [20]; we omit the details and refer to the proof of Lemma 2 in [20]. This completes the proof of Lemma 5.3. \(\square \)

5.3 Vicinities of the saddle points

Having studied the saddle points of \(L(\hat{B},T)\) and \(K(\hat{X},V)\), we now introduce some small vicinities of them. To this end, we introduce the quantity

$$\begin{aligned} \varTheta \equiv \varTheta (N,\varepsilon _0):=WN^{\varepsilon _0} \end{aligned}$$
(5.30)

for a small positive constant \(\varepsilon _0\), which will be chosen later. Let \(\mathbf {a}=(a_{1},\ldots , a_{W})\in \mathbb {C}^W\) be any complex vector. In the sequel, for any \(d\in \mathbb {C}\), we adopt the notation

$$\begin{aligned} \mathbf {a}+d&:=\left( a_{1}+d,\ldots ,a_{W}+d\right) ,\quad d\mathbf {a}:=\left( d a_{1},\ldots ,d a_{W}\right) ,\nonumber \\ \arg (\mathbf {a})&:=\big (\arg (a_1),\ldots , \arg (a_W)\big ). \end{aligned}$$
(5.31)

Now, we define the domains \(\varUpsilon ^b_+, \varUpsilon ^b_-, \varUpsilon ^x_+, \varUpsilon ^x_-\) and \(\varUpsilon _S\) as follows

$$\begin{aligned} \varUpsilon ^b_\pm \equiv \varUpsilon ^b_\pm (N,\varepsilon _0)&:=\left\{ \mathbf {a}\in \varGamma ^W: ||\mathbf {a}-a_\pm ||_2^2\le \frac{\varTheta }{M}\right\} , \nonumber \\ \varUpsilon ^x_\pm \equiv \varUpsilon ^x_\pm (N,\varepsilon _0)&:=\left\{ \mathbf {a}\in \varSigma ^W: ||\arg (a_\pm ^{-1}\mathbf {a})||_2^2\le \frac{\varTheta }{M}\right\} ,\nonumber \\ \varUpsilon _S\equiv \varUpsilon _S(N,\varepsilon _0)&:=\left\{ \mathbf {a}\in \mathbb {R}_+^{W-1}: -\mathbf {a}'S^{(1)}\mathbf {a}\le \frac{\varTheta }{M}\right\} , \end{aligned}$$
(5.32)

where the superscripts b and x indicate that these will be domains of the corresponding variables. In order to define the vicinities of the Type I saddle points properly, we introduce a permutation \(\epsilon _j\) of \(\{1,2\}\) for each triple \((x_{j,1}, x_{j,2}, v_j)\). Specifically, recalling that \(u_j=\sqrt{1-v_j^2}\) by (3.25), we define

$$\begin{aligned} v_{j,\epsilon _j}\equiv v_{j,\epsilon _j}(\epsilon _1):=v_j\mathbf {1}(\epsilon _j=\epsilon _1)+u_j\mathbf {1}(\epsilon _j\ne \epsilon _1). \end{aligned}$$

Writing \(\varvec{\epsilon }=(\epsilon _1,\ldots , \epsilon _W)\) and \(\varvec{\epsilon }(a)=(\epsilon _1(a),\ldots , \epsilon _W(a))\) for \(a=1,2\), we set

$$\begin{aligned} \mathbf {x}_{\varvec{\epsilon }(a)}=(x_{1,\epsilon _1(a)},\ldots , x_{W,\epsilon _W(a)}),\qquad a=1,2,\qquad \mathbf {v}_{\varvec{\epsilon }}=(v_{2,\epsilon _2},\ldots , v_{W,\epsilon _W}). \quad \quad \end{aligned}$$
(5.33)

With this notation, we now define the Type I, II, and III vicinities, parameterized by \((\mathbf {b}_1,\mathbf {b}_2,\mathbf {x}_1,\mathbf {x}_2,\mathbf {t},\mathbf {v})\), of the corresponding saddle point types. We also define a special case of the Type I vicinity, namely the Type I’ vicinity, corresponding to the Type I’ saddle point defined in Sect. 4.1.

Definition 5.5

  • Type I vicinity :   \(\big (\mathbf {b}_1,\mathbf {b}_2,\mathbf {x}_{\varvec{\epsilon }(1)},\mathbf {x}_{\varvec{\epsilon }(2)},\mathbf {t},\mathbf {v}_{\varvec{\epsilon }}\big )\in \varUpsilon ^b_+\times \varUpsilon ^b_-\times \varUpsilon ^x_+\times \varUpsilon ^x_-\times \varUpsilon _S\times \varUpsilon _S \) for some \(\varvec{\epsilon }\).

  • Type I’ vicinity :  \(\big (\mathbf {b}_1,\mathbf {b}_2,\mathbf {x}_1,\mathbf {x}_2,\mathbf {t},\mathbf {v}\big )\in \varUpsilon ^b_+\times \varUpsilon ^b_-\times \varUpsilon ^x_+\times \varUpsilon ^x_-\times \varUpsilon _S\times \varUpsilon _S \).

  • Type II vicinity : \(\big (\mathbf {b}_1,\mathbf {b}_2,\mathbf {x}_1,\mathbf {x}_2,\mathbf {t},\mathbf {v}\big )\in \varUpsilon ^b_+\times \varUpsilon ^b_-\times \varUpsilon ^x_+\times \varUpsilon ^x_+\times \varUpsilon _S\times \mathbb {I}^{W-1}\).

  • Type III vicinity :  \(\big (\mathbf {b}_1,\mathbf {b}_2,\mathbf {x}_1,\mathbf {x}_2,\mathbf {t},\mathbf {v}\big )\in \varUpsilon ^b_+\times \varUpsilon ^b_-\times \varUpsilon ^x_-\times \varUpsilon ^x_-\times \varUpsilon _S\times \mathbb {I}^{W-1}\).

In the following discussion, the parameter \(\varepsilon _0\) in \(\varTheta \) is allowed to be different from line to line. However, given \(\varepsilon _1\) in (1.8), we shall always choose \(\varepsilon _2\) in (1.16) and \(\varepsilon _0\) in (5.30) according to the rule

$$\begin{aligned} C\varepsilon _2\le \varepsilon _0\le \varepsilon _1/C \end{aligned}$$
(5.34)

for some sufficiently large \(C>0\). Consequently, by Assumption 1.14 we have

$$\begin{aligned} N(\log N)^{-10}\ge & {} M=M^{1-4\varepsilon _0}M^{4\varepsilon _0}\ge W^{(4+2\gamma +\varepsilon _1)(1-4\varepsilon _0)} M^{4\varepsilon _0}\nonumber \\\gg & {} W^{(4+2\gamma +4\varepsilon _0)} M^{4\varepsilon _0}=W^{2\gamma }\varTheta ^4. \end{aligned}$$
(5.35)

To prove Theorem 1.15, we split the task into three steps. The first step is to exclude the integral outside the vicinities. Specifically, we will show the following lemma.

Lemma 5.6

Under Assumptions 1.1 and 1.14, we have,

$$\begin{aligned}&\mathcal {I}\big (\varGamma ^{W}, \bar{\varGamma }^{W}, \varSigma ^W,\varSigma ^W, \mathbb {R}^{W-1}_+,\mathbb {I}^{W-1}\big )\nonumber \\&\quad =2^W\mathcal {I}\big (\varUpsilon ^b_+, \varUpsilon ^b_-, \varUpsilon ^x_+, \varUpsilon ^x_-, \varUpsilon _S,\varUpsilon _S\big ) + \mathcal {I}\big (\varUpsilon ^b_+, \varUpsilon ^b_-, \varUpsilon ^x_+, \varUpsilon ^x_+, \varUpsilon _S,\mathbb {I}^{W-1}\big )\nonumber \\&\quad \quad +\mathcal {I}\big (\varUpsilon ^b_+, \varUpsilon ^b_-, \varUpsilon ^x_-, \varUpsilon ^x_-, \varUpsilon _S,\mathbb {I}^{W-1}\big )+O( e^{-\varTheta }). \end{aligned}$$
(5.36)

Remark 5.7

The first three terms on the r.h.s. of (5.36) correspond to the integrals over vicinities of the Type I, II, and III saddle points, respectively. Note that for the first term, we have used the fact that the total contribution of the integral over the Type I vicinity is \(2^W\) times that over the Type I’ vicinity.

The second step is to estimate the integral over the Type I vicinity. We have the following lemma.

Lemma 5.8

Under Assumptions 1.1 and 1.14, there exists some positive constant \(C_0\) uniform in n and some positive number \(N_0=N_0(n)\) such that for all \(N\ge N_0\),

$$\begin{aligned} 2^W\big |\mathcal {I}(\varUpsilon ^b_+, \varUpsilon ^b_-, \varUpsilon ^x_+,\varUpsilon ^x_-, \varUpsilon _S, \varUpsilon _S)\big |\le \frac{N^{C_0}}{(N\eta )^{n}}. \end{aligned}$$
(5.37)

The last step is to show that the integrals over the Type II and III vicinities are also negligible.

Lemma 5.9

Under Assumptions 1.1 and 1.14, there exists some positive constant c such that,

$$\begin{aligned}&\mathcal {I}\big (\varUpsilon ^b_+, \varUpsilon ^b_-, \varUpsilon ^x_+, \varUpsilon ^x_+, \varUpsilon _S,\mathbb {I}^{W-1}\big )=O\left( e^{-c W}\right) ,\nonumber \\&\quad \mathcal {I}\big (\varUpsilon ^b_+, \varUpsilon ^b_-, \varUpsilon ^x_-, \varUpsilon ^x_-, \varUpsilon _S,\mathbb {I}^{W-1}\big )=O(e^{-c W}). \end{aligned}$$

Therefore, the remaining task is to prove Lemmas 5.1, 5.6, 5.8 and 5.9. For the convenience of the reader, we outline the organization of the subsequent part as follows.

First, the proofs of Lemmas 5.1 and 5.6 require a bound on the integrand, especially on the term \(\mathsf {A}(\cdot )\), which contains the integral over all the Grassmann variables. To this end, we perform a crude analysis of the function \(\mathsf {A}(\cdot )\) in Sect. 6 in advance, with which we are able to prove Lemmas 5.1 and 5.6 in Sect. 7. Then we can restrict ourselves to the integral over the vicinities, i.e., prove Lemmas 5.8 and 5.9. In Sect. 8, we will analyze the factor \(\exp \{-M(\mathring{K}(\hat{X},V)+\mathring{L}(\hat{B},T))\}\), which is approximately Gaussian in the Type I vicinity. The key step is to deform the contours of the \(\hat{X}\) and \(\hat{B}\)-variables once more, to the exact steepest descent paths, whereby we can control the error terms in the integrand and prove Lemma 5.8 in Sect. 9. In the Type II and III vicinities, we only deform the contours of the \(\hat{B}\)-variables again and bound \(\exp \{-M\mathring{K}(\hat{X},V)\}\) by its absolute value directly. This turns out to be enough for our proof of Lemma 5.9, which is given in Sect. 10.

6 Crude bound on \(\mathsf {A}(\hat{X}, \hat{B}, V, T)\)

In this section, we provide a bound on the function \(\mathsf {A}(\cdot )\) in terms of the \(\hat{B}, T\)-variables, which holds on all the domains under discussion in the sequel. Here, by crude bound we mean a bound of order \(\exp \{O(WN^{\varepsilon _2})\}\), which will be specified in Lemma 6.1 below. By the definition in (3.32), we see that \(\mathsf {A}(\cdot )\) is an integral of the product of \(\mathcal {Q}(\cdot ), \mathcal {P}(\cdot )\) and \(\mathcal {F}(\cdot )\). As mentioned in Sect. 4.2, a typical procedure we will adopt is to ignore \(\mathcal {Q}(\cdot )\) at first, then estimate the integrals of \(\mathcal {P}(\cdot )\) and \(\mathcal {F}(\cdot )\), which are denoted by \(\mathsf {P}(\cdot )\) and \(\mathsf {F}(\cdot )\), respectively [see (4.2) and (4.3)]; finally, we make the necessary comments on how to modify the bounding scheme to take \(\mathcal {Q}(\cdot )\) into account, whereby we can get the desired bound for \(\mathsf {A}(\cdot )\).

For the sake of simplicity, from now on, we will use the notation

$$\begin{aligned} \omega _{j,1}:= & {} \omega _{j,11},\quad \omega _{j,2}:=\omega _{j,12},\quad \omega _{j,3}:=\omega _{j,21},\quad \omega _{j,4}:=\omega _{j,22},\nonumber \\ \xi _{j,1}:= & {} \xi _{j,11},\quad \xi _{j,2}:=\xi _{j,21},\quad \xi _{j,3}:=\xi _{j,12},\quad \xi _{j,4}:=\xi _{j,22}. \end{aligned}$$
(6.1)

Moreover, we introduce the domains

$$\begin{aligned} \widehat{\varSigma }&:=\Big \{re^{\mathbf {i}\vartheta }: |r-1|\le \frac{1}{10}, \vartheta \in \mathbb {L}\Big \},\nonumber \\ \mathbb {K}\equiv \mathbb {K}(E)&:=\left\{ \begin{array}{ccc}\Big \{\omega \in \mathbb {C}:0\le \arg \omega \le \frac{\arg a_+}{2}+\frac{\pi }{8}\Big \},\quad \text {if}\quad E\ge 0,\\ \\ \Big \{\omega \in \mathbb {C}:\frac{\arg a_+}{2}-\frac{\pi }{8}\le \arg \omega \le 0\Big \},\quad \text {if}\quad E<0. \end{array} \right. \end{aligned}$$
(6.2)

By the assumption that \(|E|\le \sqrt{2}-\kappa \) in (1.16), it is easy to see that \(|\arg \omega |\le \pi /4-c\) for all \(\omega \in \mathbb {K}\cup \bar{\mathbb {K}}\), where c is some positive constant depending on \(\kappa \). Our aim is to show the following lemma.

Lemma 6.1

Suppose that \((\mathbf {b}_1,\mathbf {b}_2,\mathbf {x}_1,\mathbf {x}_2)\in \mathbb {K}^W\times \bar{\mathbb {K}}^W\times \widehat{\varSigma }^W\times \widehat{\varSigma }^W\). Under the assumption of Theorem 1.15, we have

$$\begin{aligned} |\mathsf {A}(\hat{X},\hat{B}, V, T)|\le e^{O(WN^{\varepsilon _2})}\prod _{j=1}^W \Big (r_{j,1}^{-1}+r_{j,2}^{-1}+t_j+1\Big )^C=: e^{O(WN^{\varepsilon _2})} \mathfrak {p}(\mathbf {r}^{-1},\mathbf {t}). \end{aligned}$$

Remark 6.2

Obviously, using the terminology introduced at the end of Sect. 3.6, we have

$$\begin{aligned} \mathfrak {p}(\mathbf {r}^{-1},\mathbf {t})\in \mathfrak {Q}\left( \left\{ r_{j,1}^{-1},r_{j,2}^{-1},t_j\right\} _{j=1}^{W}; \kappa _1,\kappa _2,\kappa _3\right) ,\quad \kappa _1=e^{O(W)},\quad \kappa _2, \kappa _3=O(1). \nonumber \\ \end{aligned}$$
(6.3)

6.1 Integral of \(\mathcal {Q}\)

In this subsection, we investigate the function

$$\begin{aligned}&\mathsf {Q}( \varOmega , \varXi , P_1, Q_1, X^{[1]}, \mathbf {y}^{[1]},\mathbf {w}^{[1]})\nonumber \\&\quad :=\int \mathrm{d}\varvec{\omega }^{[1]}\mathrm{d}\varvec{\xi }^{[1]}\; \mathcal {Q}( \varOmega , \varXi , \varvec{\omega }^{[1]},\varvec{\xi }^{[1]}, P_1, Q_1, X^{[1]}, \mathbf {y}^{[1]},\mathbf {w}^{[1]}). \end{aligned}$$
(6.4)

Recall \(\mathfrak {Q}_{\text {deg}}(\mathbf {a}; \kappa _1, \kappa _2, \kappa _3)\) defined at the end of Sect. 3.6, the parameterization in (3.15) and (3.25) and the notation introduced in (6.1). We have the following lemma.

Lemma 6.3

If we regard \(\sigma , {v}_p^{[1]}, {v}_q^{[1]}, P_1\) and \((X^{[1]})^{-1}\)-entries as fixed parameters, we have

$$\begin{aligned} \mathsf {Q}(\cdot )\in \mathfrak {Q}_{\text {deg}}(\mathfrak {S}; \kappa _1,\kappa _2,\kappa _3), \qquad \kappa _1=W^{O(1)},\quad \kappa _2, \kappa _3=O(1), \end{aligned}$$
(6.5)

where \(\mathfrak {S}\) is the set of variables defined by

$$\begin{aligned} \mathfrak {S}:=\left\{ t,s, (y^{[1]}_k)^{-1}, e^{\mathbf {i}\sigma _k^{[1]}}, e^{-\mathbf {i}\sigma _k^{[1]}}, \frac{\omega _{i,a}\xi _{j,b}}{M}\right\} _{\begin{array}{c} i,j=1,\ldots ,W;\\ k=p,q; a,b=1,\ldots , 4 \end{array}}. \end{aligned}$$

Proof

Note that \(\mathcal {Q}(\cdot )\) can be regarded as a function of the Grassmann variables in \(\varvec{\omega }^{[1]}\) and \(\varvec{\xi }^{[1]}\). Hence, by the definition in (3.30), it is a polynomial in these variables with bounded degree. In addition, we can always combine the factor \(1/\sqrt{M}\) with the \(\varOmega _j\) and \(\varXi _j\)-variables in the first factor on the r.h.s. of (3.30). Then it is not difficult to check that

$$\begin{aligned} \mathcal {Q}(\cdot )\in \mathfrak {Q}_{\text {deg}}(\mathfrak {S}';\kappa _1,\kappa _2,\kappa _3),\quad \kappa _1=W^{O(1)},\quad \kappa _2,\kappa _3=O(1), \end{aligned}$$
(6.6)

where

$$\begin{aligned} \mathfrak {S}':=\left\{ t,s, \left( y^{[1]}_k\right) ^{-1}, e^{\mathbf {i}\sigma _k^{[1]}}, e^{-\mathbf {i}\sigma _k^{[1]}},\frac{\omega _{j,r}\xi _{k,b}^{[1]}}{\sqrt{M}},\frac{\xi _{j,r}\omega _{k,b}^{[1]}}{\sqrt{M}},\omega _{k,a}^{[1]}\xi _{\ell ,b}^{[1]}\right\} _{\begin{array}{c} j=1,\ldots , W; k=p,q;\\ r=1,\ldots , 4; a, b=1,2 \end{array}}. \end{aligned}$$

By the definition in (6.4), \(\mathsf {Q}(\cdot )\) is the integral of \(\mathcal {Q}(\cdot )\) over the \(\varvec{\omega }^{[1]}\) and \(\varvec{\xi }^{[1]}\)-variables. Now, we regard all the variables in \(\mathfrak {S}'\) other than the \(\varvec{\omega }^{[1]}\) and \(\varvec{\xi }^{[1]}\)-variables as parameters. By the definition of the Grassmann integral, only the coefficient of the highest order term \(\prod _{k=p,q}\prod _{a=1,2}\omega ^{[1]}_{k,a}\xi ^{[1]}_{k,a}\) in \(\mathcal {Q}(\cdot )\) survives after integrating the \(\varvec{\omega }^{[1]}\) and \(\varvec{\xi }^{[1]}\)-variables out. Then (6.5) follows readily from (6.6), completing the proof.

\(\square \)

6.2 Integral of \(\mathcal {P}\)

In this subsection, we temporarily ignore the \(\varOmega \) and \(\varXi \)-variables from \(\mathsf {Q}(\cdot )\), and estimate \(\mathsf {P}(\hat{X}, \hat{B}, V, T)\) defined in (4.2). Recalling \(r_{j,1}\) and \(r_{j,2}\) defined in (5.12), we can formulate our estimate as follows.

Lemma 6.4

Suppose that the assumptions in Lemma 6.1 hold. We have

$$\begin{aligned} |\mathsf {P}(\hat{X}, \hat{B}, V, T)|\le e^{O(W)}\prod _{j=1}^W \left( r_{j,1}^{-1}+r_{j,2}^{-1}+t_j+1\right) ^{O(1)}. \end{aligned}$$
(6.7)

Proof

We start with one factor from \(\mathcal {P}(\cdot )\) (see (3.29)), namely

$$\begin{aligned} \varvec{\varpi }_j&:=\frac{1}{\det ^{M}\big (1+M^{-1}V_j^*\hat{X}_j^{-1}V_j \varOmega _jT_j^{-1}\hat{B}_j^{-1}T_j\varXi _j\big )}\nonumber \\&=\exp \Big \{ -M\log \det \big (1+M^{-1}V_j^*\hat{X}_j^{-1}V_j \varOmega _jT_j^{-1}\hat{B}_j^{-1}T_j\varXi _j\big )\Big \}\nonumber \\&= 1+\sum _{\ell =1}^4 \frac{1}{M^{\ell -1}} \mathfrak {p}_\ell (\hat{X}_j, \hat{B}_j, V_j,T_j, \varOmega _j, \varXi _j). \end{aligned}$$
(6.8)

Here \(\mathfrak {p}_\ell (\cdot )\) is a polynomial in the \(\hat{X}_j^{-1}, \hat{B}_j^{-1}, V_j, T_j, \varOmega _j\) and \(\varXi _j\)-entries with bounded degree and bounded coefficients. Moreover, regarded as a polynomial in the \(\varOmega _j\) and \(\varXi _j\)-entries, it is homogeneous of degree \(2\ell \): the total degree in the \(\varOmega _j\)-entries is \(\ell \), and thus that in the \(\varXi _j\)-entries is also \(\ell \). More specifically, we can write

$$\begin{aligned} \mathfrak {p}_\ell (\hat{X}_j, \hat{B}_j, V_j,T_j, \varOmega _j, \varXi _j)=\sum _{\begin{array}{c} \alpha _1,\ldots ,\alpha _\ell ,\\ ~~\beta _1,\ldots , \beta _\ell =1 \end{array}}^4 \mathfrak {p}_{\ell ,\varvec{\alpha },\varvec{\beta }}(\hat{X}_j, \hat{B}_j, V_j,T_j) \prod _{i=1}^{\ell } \omega _{j,\alpha _{i}}\xi _{j, \beta _{i}}, \end{aligned}$$

where we used the notation in (6.1) and denoted \(\varvec{\alpha }=(\alpha _{1},\ldots , \alpha _{\ell })\) and \(\varvec{\beta }=(\beta _{1},\ldots , \beta _{\ell })\). It is easy to verify that \(\varvec{\varpi }_j\) is of the form (6.8) by Taylor expansion w.r.t. the Grassmann variables. The expansion in (6.8) terminates at \(\ell =4\), owing to the fact that there are in total 8 Grassmann variables from \(\varOmega _j\) and \(\varXi _j\). In addition, it is also easy to check that \(\mathfrak {p}_{\ell ,\varvec{\alpha },\varvec{\beta }}(\cdot )\) is a polynomial in the \(\hat{X}_j^{-1}, \hat{B}_j^{-1}, V_j, T_j\)-entries with bounded degree and bounded coefficients, which implies that there exist two positive constants \(C_1\) and \(C_2\), such that

$$\begin{aligned} |\mathfrak {p}_{\ell ,\varvec{\alpha },\varvec{\beta }}(\cdot )|\le C_1\left( r_{j,1}^{-1}+r_{j,2}^{-1}+t_j+1\right) ^{C_2} \end{aligned}$$
(6.9)

uniformly in \(\ell , \varvec{\alpha }\) and \(\varvec{\beta }\). Here we used the fact that \(\hat{X}_{j}^{-1}\) and \(V_j\)-entries are all bounded and \(T_j\)-entries are bounded by \(1+t_j\).
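To see why the expansion terminates, abbreviate \(A_j:=M^{-1}V_j^*\hat{X}_j^{-1}V_j \varOmega _jT_j^{-1}\hat{B}_j^{-1}T_j\varXi _j\) (a heuristic sketch). Every entry of \(A_j\) is a quadratic, hence commuting and nilpotent, expression in the 8 Grassmann generators from \(\varOmega _j\) and \(\varXi _j\), so that

$$\begin{aligned} \log \det \big (1+A_j\big )=Tr\log \big (1+A_j\big )=\sum _{\ell =1}^{4}\frac{(-1)^{\ell -1}}{\ell }\,Tr A_j^\ell ,\qquad Tr A_j^\ell =0\quad \text {for}\quad \ell \ge 5, \end{aligned}$$

since any product of five or more entries of \(A_j\) contains a repeated Grassmann generator and vanishes. Expanding the exponential in (6.8) and collecting the terms by their Grassmann degree then yields the stated form.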

Now, we go back to the definition of \(\mathcal {P}(\cdot )\) in (3.29) and study the last factor. Similarly to the discussion above, it is easy to see that for \(k=p\) or q,

$$\begin{aligned} \hat{\varvec{\varpi }}_k&:=\frac{\det \big (V_k^*\hat{X}_kV_k+M^{-1} \varOmega _k T_k^{-1}\hat{B}_k^{-1}T_k\varXi _k\big )}{\det \hat{B}_k}=\hat{\mathfrak {p}}_0(\hat{X}_k, \hat{B}_k)\nonumber \\&\quad +\sum _{\ell =1}^4\sum _{\varvec{\alpha },\varvec{\beta }}\hat{\mathfrak {p}}_{\ell ,\varvec{\alpha },{\varvec{\beta }}}(\hat{X}_k, \hat{B}_k, V_k,T_k)\prod _{i=1}^\ell \omega _{k,\alpha _{k,i}}\xi _{k,\beta _{k,i}}, \end{aligned}$$
(6.10)

where \(\hat{\mathfrak {p}}_0(\cdot )=\det \hat{X}_k/\det \hat{B}_k\) and \(\hat{\mathfrak {p}}_{\ell ,\varvec{\alpha },{\varvec{\beta }}}(\cdot )\)’s are some polynomials of \(\hat{X}_k, \hat{B}_k^{-1}, V_k, T_k\)-entries with bounded degree and bounded coefficients. Similarly, we have

$$\begin{aligned} |\hat{\mathfrak {p}}_0(\cdot )|, |\hat{\mathfrak {p}}_{\ell ,\varvec{\alpha },\varvec{\beta }}(\cdot )|\le C_1(r_{k,1}^{-1}+r_{k,2}^{-1}+t_k+1)^{C_2} \end{aligned}$$
(6.11)

for some positive constants \(C_1\) and \(C_2\).

According to the definitions in (6.8) and (6.10), we can rewrite (3.29) as

$$\begin{aligned}&\mathcal {P}( \varOmega , \varXi , \hat{X}, \hat{B}, V, T)=\exp \left\{ -\sum _{j,k}\tilde{\mathfrak {s}}_{jk} Tr \varOmega _j\varXi _k\right\} \prod _{j=1}^W\varvec{\varpi }_j\prod _{k=p,q} \hat{\varvec{\varpi }}_k. \quad \quad \quad \end{aligned}$$
(6.12)

In light of the discussion above, \(\prod _{j=1}^W\varvec{\varpi }_j\prod _{k=p,q}\hat{\varvec{\varpi }}_k\) is a polynomial in the \(\hat{X}, \hat{X}^{-1}, \hat{B}^{-1}, V, T, \varOmega \) and \(\varXi \)-entries, in which each monomial is of the form

$$\begin{aligned} \mathfrak {q}_{\mathbf {\ell },{\varvec{\alpha }},{\varvec{\beta }}}(\hat{X}^{-1},\hat{B}^{-1}, V, T)\prod _{j=1}^W\prod _{i=1}^{\ell _j} \omega _{j,\alpha _{j,i}}\xi _{j,\beta _{j,i}}, \end{aligned}$$
(6.13)

where we used the notation

$$\begin{aligned} \mathbf {\ell }= & {} (\ell _1,\ldots , \ell _W),\quad {\varvec{\alpha }}\equiv {\varvec{\alpha }}(\mathbf {\ell })=(\varvec{\alpha }_1,\ldots , \varvec{\alpha }_W),\quad {\varvec{\beta }}\equiv {\varvec{\beta }}(\mathbf {\ell }) =(\varvec{\beta }_1,\ldots , \varvec{\beta }_W),\nonumber \\ \varvec{\alpha }_j= & {} (\alpha _{j,1},\ldots , \alpha _{j,\ell _j}),\quad \varvec{\beta }_j=(\beta _{j,1},\ldots , \beta _{j,\ell _j}), \end{aligned}$$
(6.14)

and \(\mathfrak {q}_{\mathbf {\ell },{\varvec{\alpha }},{\varvec{\beta }}}(\cdot )\) is a polynomial of \(\hat{X}, \hat{X}^{-1}, \hat{B}^{-1}, V\) and T-entries. Moreover, all the entries of \(\mathbf {\ell }, {\varvec{\alpha }}\) and \({\varvec{\beta }}\) are bounded by 4. By (6.9) and (6.11), we have

$$\begin{aligned} |\mathfrak {q}_{\mathbf {\ell },{\varvec{\alpha }},{\varvec{\beta }}}(\hat{X}^{-1},\hat{B}^{-1}, V, T)|\le e^{O(W)}\prod _{j=1}^W\left( r_{j,1}^{-1}+r_{j,2}^{-1}+t_j+1\right) ^{C}. \end{aligned}$$
(6.15)

In addition, it is easy to see that the number of summands of the form (6.13) in \(\prod _{j=1}^W\varvec{\varpi }_j\prod _{k=p,q}\hat{\varvec{\varpi }}_k\) is bounded by \(e^{O(W)}\).

Now, we define the vectors

$$\begin{aligned} \mathbf {\Omega }:=(\varvec{\omega }_1,\varvec{\omega }_2,\varvec{\omega }_3, \varvec{\omega }_4),\qquad \mathbf {\Xi }:=(\varvec{\xi }_1,\varvec{\xi }_2,\varvec{\xi }_3, \varvec{\xi }_4), \end{aligned}$$
(6.16)

where \(\varvec{\omega }_\alpha =(\omega _{1,\alpha },\ldots , \omega _{W,\alpha })\) and \(\varvec{\xi }_\alpha =(\xi _{1,\alpha },\ldots , \xi _{W,\alpha })\) for \(\alpha =1,2,3,4\). Here we used the notation (6.1). In addition, we introduce the matrix

$$\begin{aligned} \widetilde{\mathbb {H}}=\widetilde{S}\oplus \widetilde{S}\oplus \widetilde{S}\oplus \widetilde{S}. \end{aligned}$$

It is easy to check that \(\sum _{j,k}\tilde{\mathfrak {s}}_{jk} Tr \varOmega _j\varXi _k=\mathbf {\Omega }\widetilde{\mathbb {H}}\mathbf {\Xi }'\); indeed, the relabeling (6.1) is chosen precisely so that \(Tr \varOmega _j\varXi _k=\sum _{a=1}^4\omega _{j,a}\xi _{k,a}\). By using the Gaussian integral formula for the Grassmann variables (3.2), we see that for each \(\mathbf {\ell }, \varvec{\alpha }\) and \(\varvec{\beta }\), we have

$$\begin{aligned} \Big |\int \mathrm{d} \varOmega \mathrm{d}\varXi \cdot \exp \left\{ -\sum _{j,k}\tilde{\mathfrak {s}}_{jk} Tr \varOmega _j\varXi _k\right\} \cdot \prod _{j=1}^W\prod _{i=1}^{\ell _j} \omega _{j,\alpha _{j,i}}\xi _{j,\beta _{j,i}}\Big |\le |\det \widetilde{\mathbb {H}}^{(\mathsf {I}|\mathsf {J})}|, \nonumber \\ \end{aligned}$$
(6.17)

for some index sets \(\mathsf {I}\) and \(\mathsf {J}\) with \(|\mathsf {I}|=|\mathsf {J}|\). By Assumption 1.1 (i) and (ii), we see that the 2-norm of each row of \(\widetilde{S}\) is O(1). Consequently, by using Hadamard’s inequality, we have

$$\begin{aligned} |\det \widetilde{\mathbb {H}}^{(\mathsf {I}|\mathsf {J})}|=e^{O(W)}. \end{aligned}$$
(6.18)
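In more detail, Hadamard’s inequality bounds the modulus of a determinant by the product of the Euclidean norms of the rows; since \(\widetilde{\mathbb {H}}^{(\mathsf {I}|\mathsf {J})}\) has at most 4W rows, each of 2-norm O(1) (removing columns only decreases the row norms), (6.18) follows from

$$\begin{aligned} \big |\det \widetilde{\mathbb {H}}^{(\mathsf {I}|\mathsf {J})}\big |\le \prod _{i}\big \Vert \big (\widetilde{\mathbb {H}}^{(\mathsf {I}|\mathsf {J})}\big )_{i\cdot }\big \Vert _2\le C^{4W}=e^{O(W)}. \end{aligned}$$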

Therefore, (6.12)–(6.18) and the bound \(e^{O(W)}\) on the total number of summands of the form (6.13) in \(\prod _{j=1}^W\varvec{\varpi }_j\prod _{k=p,q}\hat{\varvec{\varpi }}_k\) imply (6.7). This completes the proof.

\(\square \)

6.3 Integral of \(\mathcal {F}\)

In this subsection, we also temporarily ignore the \(X^{[1]},\mathbf {y}^{[1]},\mathbf {w}^{[1]},P_1,Q_1\)-variables from \(\mathsf {Q}(\cdot )\), and estimate \(\mathsf {F}(\hat{X},\hat{B}, V, T)\) defined in (4.3). We have the following lemma.

Lemma 6.5

Suppose that the assumptions in Lemma 6.1 hold. We have

$$\begin{aligned} |\mathsf {F}(\hat{X},\hat{B}, V, T)| \le e^{O(WN^{\varepsilon _2})}\prod _{k=p,q}\big (r_{k,1}^{-1}+r_{k,2}^{-1}+t_k+1\big )^{O(1)}. \end{aligned}$$
(6.19)

Proof

Recalling the decomposition of \(\mathcal {F}(\cdot )\) in (3.19) together with the parameterization in (3.29), we will study the integrals

$$\begin{aligned} \mathbb {G}(\hat{B},T):= & {} \int \mathrm{d}\nu (Q_1) \mathrm{d}\mathbf {y}^{[1]} \mathrm{d}\mathbf {w}^{[1]}\; g(Q_1,T,\hat{B}, \mathbf {y}^{[1]}, \mathbf {w}^{[1]}),\end{aligned}$$
(6.20)
$$\begin{aligned} \mathbb {F}(\hat{X},V):= & {} \int \mathrm{d} \mu (P_1) \mathrm{d}X^{[1]} \; f(P_1, V, \hat{X}, X^{[1]}) \end{aligned}$$
(6.21)

separately. Recalling the convention at the end of Sect. 3, we use \(f(\cdot )\) and \(g(\cdot )\) to represent the integrands above. One can refer to (3.20) and (3.21) for the definition. From the assumption \(\eta \le M^{-1}N^{\varepsilon _2}\), we see

$$\begin{aligned} |\mathbb {F}(\hat{X},V)|\le e^{O(WN^{\varepsilon _2})}, \end{aligned}$$
(6.22)

since the \(P_1, V, \hat{X}, X^{[1]}\)-variables are all bounded and \(|\det X_p^{[1]}|, |\det X_q^{[1]}|\sim 1\) when \(\mathbf {x}_1,\mathbf {x}_2\in \widehat{\varSigma }^W\).

For \(\mathbb {G}(\hat{B},T)\), we use the facts

$$\begin{aligned} \mathsf {Re}(TrB_jY_k^{[1]}J)&\ge 0, \quad \mathsf {Re}(\mathbf {i} Tr Y_k^{[1]}JZ)=-\eta Tr Y_k^{[1]}\le 0,\nonumber \\ TrY_k^{[1]}JY_\ell ^{[1]} J&\ge 0,\quad k,\ell =p,q, \nonumber \\ \big |\big (\mathbf {w}^{[1]}_q(\mathbf {w}^{[1]}_q)^*\big )_{12}\big |&\le 1,\quad \big |\big (\mathbf {w}^{[1]}_p(\mathbf {w}^{[1]}_p)^*\big )_{21}\big |\le 1, \end{aligned}$$
(6.23)

to estimate several terms trivially, whereby we can get the bound

$$\begin{aligned} |g(\cdot )|&\le \exp \left\{ -M\eta \sum _{j=1}^W Tr \mathsf {Re}(B_j)J\right\} \prod _{k=p,q} \left( y_k^{[1]}\right) ^{n+3}\exp \Big \{-\tilde{\mathfrak {s}}_{kk} Tr \mathsf {Re}(B_k)Y_k^{[1]}J\Big \}. \nonumber \\ \end{aligned}$$
(6.24)

Here \(\mathsf {Re}(B_j)=Q_1^{-1}T_j^{-1} \mathsf {Re}(\hat{B}_j) T_jQ_1\). Hence, integrating \(y_p^{[1]}\) and \(y_q^{[1]}\) out yields

$$\begin{aligned} \int _{\mathbb {R}_+^2} \mathrm{d}y_p^{[1]} \mathrm{d} y_q^{[1]} |g(Q_1,T,\hat{B}, \mathbf {y}^{[1]}, \mathbf {w}^{[1]})|\le C\frac{\exp \Big \{ -M\eta \sum _{j=1}^W Tr \mathsf {Re}(B_j)J\Big \}}{ \prod _{k=p,q}\Big (\big (\mathbf {w}_k^{[1]}\big )^*J \mathsf {Re}(B_k) \mathbf {w}_k^{[1]}\Big )^{C_1}}, \nonumber \\ \end{aligned}$$
(6.25)

for some positive constants C and \(C_1\) depending on n, where we used the elementary facts that \(\tilde{\mathfrak {s}}_{kk}\ge c\) for some positive constant c and

$$\begin{aligned} Tr \mathsf {Re}(B_j)Y_k^{[1]}J=y_k^{[1]}\big (\mathbf {w}_k^{[1]}\big )^*J \mathsf {Re}(B_j) \mathbf {w}_k^{[1]},\quad k=p,q,\quad j=1,\ldots , W. \nonumber \\ \end{aligned}$$
(6.26)

Now, note that

$$\begin{aligned} (\mathbf {w}_k^{[1]})^*J \mathsf {Re}B_j \mathbf {w}_k^{[1]}\ge \lambda _1(J\mathsf {Re}B_j),\quad k=p,q,\quad j=1,\ldots , W. \end{aligned}$$
(6.27)

In addition, it is also easy to see \(\lambda _1(T_j)=s_j-t_j\) and \(\lambda _1(Q_1)=s-t\), according to the definitions in (3.25). Now, by the fact \(JA^{-1}=AJ\) for any \(A\in \mathring{U}(1,1)\), we have

$$\begin{aligned} J\mathsf {Re}B_j= Q_1 T_j \text {diag}\big ( \mathsf {Re}b_{j,1}, \mathsf {Re}b_{j,2}\big )T_jQ_1. \end{aligned}$$
(6.28)

Consequently, we can get

$$\begin{aligned} \lambda _1(J\mathsf {Re}B_j)\ge (s_j-t_j)^2(s-t)^2\text {min}\{\mathsf {Re}b_{j,1}, \mathsf {Re}b_{j,2}\}=\frac{\text {min}\{\mathsf {Re}b_{j,1}, \mathsf {Re}b_{j,2}\}}{(s_j+t_j)^2(s+t)^2}, \nonumber \\ \end{aligned}$$
(6.29)

by recalling the facts \(s^2-t^2=1\) and \(s_j^2-t_j^2=1\). Therefore, combining (6.25), (6.27) and (6.29), we have

$$\begin{aligned} \int _{\mathbb {R}_+^2} \mathrm{d}y_p^{[1]} \mathrm{d} y_q^{[1]} |g(\cdot )|&\le C(s+t)^{4C_1}\exp \left\{ -M\eta \sum _{j=1}^W Tr (\mathsf {Re}B_j)J\right\} \nonumber \\&\quad \times \prod _{k=p,q}\frac{(s_k+t_k)^{2C_1}}{ \big (\text {min}\{\mathsf {Re}b_{k,1}, \mathsf {Re}b_{k,2}\}\big )^{C_1}}. \end{aligned}$$
(6.30)

Now, what remains is to estimate the exponential function in (6.30). By elementary calculation from (6.28) we obtain

$$\begin{aligned} Tr (\mathsf {Re}B_j)J\ge \big (\mathsf {Re}b_{j,1}+\mathsf {Re}b_{j,2}\big )\big ((s_j^2+t_j^2)(s^2+t^2)-4sts_j t_j\big ). \end{aligned}$$

Observe that, using \(s^2=1+t^2\), \(s_j^2=1+t_j^2\) and \(4sts_jt_j\le (s^2+t^2)(s_j^2+t_j^2)\),

$$\begin{aligned} \left( s_j^2+t_j^2\right) (s^2+t^2)-4sts_j t_j&=\frac{\left( s_j^2+t_j^2\right) ^2(s^2+t^2)^2-16(sts_jt_j)^2}{\left( s_j^2+t_j^2\right) (s^2+t^2)+4sts_j t_j}\\&\ge \frac{4t^4+4t^2+4t_j^4+4t_j^2+1}{2\left( 1+2t^2_j\right) (1+2t^2)}\ge \frac{1+2t^2}{2(1+2t_j^2)}. \end{aligned}$$

This implies that

$$\begin{aligned} \exp \left\{ -M\eta \sum _{j=1}^W Tr (\mathsf {Re}B_j)J\right\} \le \exp \left\{ -cM\eta \sum _{j=1}^W \frac{ r_{j,1}+r_{j,2}}{1+2t_j^2}(1+2t^2)\right\} , \end{aligned}$$
(6.31)

for some positive constant c, where we have used the fact \(\mathsf {Re}b_{j,\alpha }\ge cr_{j,\alpha }\) for all \(j=1,\ldots , W\) and \(\alpha =1,2\), in light of the assumption \(|E|\le \sqrt{2}-\kappa \) and the definition of \(\mathbb {K}\) in (6.2). Plugging (6.31) into (6.30), estimating \((s+t)^2\le 2(1+2t^2)\), and integrating t out, we can crudely bound

$$\begin{aligned} \int _{\mathbb {R}_+^2} \mathrm{d}y_p^{[1]} \mathrm{d} y_q^{[1]} \int _{\mathbb {R}^+} 2t\mathrm{d}t\cdot |g(\cdot )|&\le C\left( \frac{1}{M\eta }\right) ^{C_2}\left( \sum _{j=1}^W \frac{ r_{j,1}+r_{j,2}}{1+2t_j^2}\right) ^{-C_2}\nonumber \\&\quad \times \prod _{k=p,q}\left( \frac{1+2t_k^2}{ \text {min}\{r_{k,1},r_{k,2}\}}\right) ^{C_1}. \end{aligned}$$
(6.32)

Now, we use the trivial bounds

$$\begin{aligned} \left( \sum _{j=1}^W \frac{ r_{j,1}+r_{j,2}}{1+2t_j^2}\right) ^{-C_2}\le \left( \frac{1+2t_p^2}{ r_{p,1}+r_{p,2}}\right) ^{C_2}\le \left( (1+2t_p^2)\left( r_{p,1}^{-1}+r_{p,2}^{-1}\right) \right) ^{C_2}, \nonumber \\ \end{aligned}$$
(6.33)

and

$$\begin{aligned} \frac{1+2t_k^2}{ \text {min}\{r_{k,1}, r_{k,2}\}}\le C(1+2t_k^2)\left( r_{k,1}^{-1}+r_{k,2}^{-1}\right) . \end{aligned}$$
(6.34)

Inserting (6.33) and (6.34) into (6.32) and integrating out the remaining variables yields

$$\begin{aligned} |\mathbb {G}(\hat{B},T)| \le C\left( \frac{1}{M\eta }\right) ^{C_1} \prod _{k=p,q}\left( r_{k,1}^{-1}+r_{k,2}^{-1}+t_k+1\right) ^{C_3}. \end{aligned}$$
(6.35)

Combining (6.22) and (6.35), we can get the bound (6.19). This completes the proof of Lemma 6.5. \(\square \)

6.4 Summing up: Proof of Lemma 6.1

In the discussions in Sects. 6.2 and 6.3, we ignored the irrelevant factor \(\mathsf {Q}(\cdot )\). However, it is easy to modify the discussion slightly to take this factor into account, whereby we can prove Lemma 6.1.

Proof

First, by the definition in (6.4), we can rewrite (3.32) as

$$\begin{aligned} \mathsf {A}(\cdot )=\int \mathrm{d}X^{[1]} \mathrm{d}\mathbf {y}^{[1]} \mathrm{d} \mathbf {w}^{[1]}\mathrm{d} \varOmega \mathrm{d} \varXi \mathrm{d}\mu (P_1) \mathrm{d}\nu (Q_1)\; \mathcal {P}( \cdot )\mathsf {Q}(\cdot ) \mathcal {F}(\cdot ). \end{aligned}$$

Now, by the conclusion \(\kappa _1=W^{O(1)}\) in Lemma 6.3, it suffices to consider one term in \(\mathsf {Q}(\cdot )\), which is a monomial of the form \(\mathfrak {p}(t,s, (y^{[1]}_p)^{-1},(y^{[1]}_q)^{-1})\mathfrak {q}(\varOmega , \varXi )\), regarding \(\sigma , {v}_p^{[1]}, {v}_q^{[1]}, P_1\)-variables, \(X^{[1]}\)-variables and \(\mathbf {w}^{[1]}\)-variables as bounded parameters. Here \(\mathfrak {p}(\cdot )\) is a monomial of \(t,s, (y^{[1]}_p)^{-1},(y^{[1]}_q)^{-1}\) and \(\mathfrak {q}(\cdot )\) is a monomial of \(\varOmega ,\varXi \)-variables, both with bounded coefficients and bounded degrees, according to the fact \(\kappa _2,\kappa _3=O(1)\) in Lemma 6.3. Now we define

$$\begin{aligned}&\mathsf {P}_{\mathfrak {q}}(\hat{X}, \hat{B}, V, T):=\int \mathrm{d} \varOmega \mathrm{d}\varXi \; \mathcal {P}( \cdot )\cdot \mathfrak {q}(\varOmega , \varXi ),\nonumber \\&\mathsf {F}_{\mathfrak {p}}(\hat{X}, \hat{B}, V, T):=\int \mathrm{d} X^{[1]} \mathrm{d}\mathbf {y}^{[1]} \mathrm{d}\mathbf {w}^{[1]} \mathrm{d}\mu (P_1) \mathrm{d}\nu (Q_1)\;\mathcal {F}(\cdot )\nonumber \\&\quad \cdot \mathfrak {p}\left( t,s, (y^{[1]}_p)^{-1},(y^{[1]}_q)^{-1}\right) . \end{aligned}$$
(6.36)

By repeating the discussions in Sects. 6.2 and 6.3 with slight modifications, we can easily see that (6.7) and (6.19) hold as well, if we replace \(\mathsf {P}(\cdot )\) and \(\mathsf {F}(\cdot )\) by \(\mathsf {P}_{\mathfrak {q}}(\cdot )\) and \(\mathsf {F}_{\mathfrak {p}}(\cdot )\), respectively. This completes the proof of Lemma 6.1. \(\square \)

7 Proofs of Lemmas 5.1 and 5.6

In this section, with the aid of Lemma 6.1, we prove Lemmas 5.1 and 5.6. A key problem is that the domain of the \(\mathbf {t}\)-variables is not compact. This forces us to analyze the exponential function

$$\begin{aligned} \mathbb {M}(\mathbf {t}):=\exp \Big \{-M\mathsf {Re}\big (\ell _S(\hat{B},T)\big )\Big \} \end{aligned}$$
(7.1)

carefully for any fixed \(\hat{B}\)-variables. Recall the definition of the sector \(\mathbb {K}\) in (6.2). For \(\mathbf {b}_1\in \mathbb {K}^W\) and \(\mathbf {b}_2\in \bar{\mathbb {K}}^W\), we have

$$\begin{aligned} \min _{j,k}\mathsf {Re}(b_{j,1}+b_{j,2})(b_{k,1}+b_{k,2})\ge c\min _{j,k}\sum _{a,b=1,2} r_{j,a}r_{k,b}\ge c\min _{j,a} r_{j,a}^2=:\mathfrak {A}(\hat{B}).\nonumber \\ \end{aligned}$$
(7.2)

From now on, we regard \(\mathbb {M}(\mathbf {t})\) as a measure on the \(\mathbf {t}\)-variables and study it in the following two regions separately:

$$\begin{aligned} (i):\mathbf {t}\in \mathbb {I}^{W-1},\qquad (ii):\mathbf {t}\in \mathbb {R}_+^{W-1}\setminus \mathbb {I}^{W-1}. \end{aligned}$$

Roughly speaking, when \(\mathbf {t}\in \mathbb {I}^{W-1}\), we will see that \(\mathbb {M}(\mathbf {t})\) can be bounded pointwise by a Gaussian measure. More specifically, we have the following lemma.

Lemma 7.1

With the notation above, we have

$$\begin{aligned} \mathbb {M}(\mathbf {t})\le \exp \left\{ - \frac{M}{12}\mathfrak {A}(\hat{B})\sum _{j,k} \mathfrak {s}_{jk}(t_k-t_j)^2\right\} , \quad \forall \; \mathbf {t}\in \mathbb {I}^{W-1}. \end{aligned}$$

Proof

Using the definition of \(\ell _S(\hat{B},T)\) in (5.6) and \(\mathfrak {A}(\hat{B})\) in (7.2) and the fact \(|(T_kT_j^{-1})_{12}|=|s_jt_ke^{\mathbf {i}\sigma _k}-s_kt_je^{\mathbf {i}\sigma _j}|\), we have

$$\begin{aligned} \mathsf {Re}\ell _S(\hat{B},T)\ge \frac{1}{2}\mathfrak {A}(\hat{B})\sum _{j,k} \mathfrak {s}_{jk}\left| s_jt_ke^{\mathbf {i}\sigma _k}-s_kt_je^{\mathbf {i}\sigma _j}\right| ^2. \end{aligned}$$
(7.3)

A simple estimate using \(s_j^2=1+t_j^2\) shows that

$$\begin{aligned} \left| s_jt_ke^{\mathbf {i}\sigma _k}-s_kt_je^{\mathbf {i}\sigma _j}\right| ^2\ge \frac{1}{4}(t_k-t_j)^2\left( \frac{1}{1+2t_j^2}+\frac{1}{1+2t_k^2}\right) \ge \frac{1}{6}(t_k-t_j)^2. \nonumber \\ \end{aligned}$$
(7.4)

Notice that the assumption \(\mathbf {t}\in \mathbb {I}^{W-1}\) was used only in the last inequality. By (7.3), (7.4) and the definition (7.1), Lemma 7.1 follows immediately. \(\square \)
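For the reader's convenience, here is one way to verify (7.4); this is a sketch using only \(s_j=\sqrt{1+t_j^2}\) and elementary algebra, writing \(u=t_j^2, v=t_k^2\). Since \(s_jt_k, s_kt_j\ge 0\),

$$\begin{aligned} \left| s_jt_ke^{\mathbf {i}\sigma _k}-s_kt_je^{\mathbf {i}\sigma _j}\right| \ge |s_jt_k-s_kt_j|=\frac{|t_k^2-t_j^2|}{s_jt_k+s_kt_j}=|t_k-t_j|\,\frac{t_j+t_k}{s_jt_k+s_kt_j}, \end{aligned}$$

and since \((s_jt_k+s_kt_j)^2\le 2(s_j^2t_k^2+s_k^2t_j^2)=2(u+v+2uv)\) while \((t_j+t_k)^2\ge u+v\), the first inequality in (7.4) reduces to

$$\begin{aligned} \frac{u+v}{2(u+v+2uv)}\ge \frac{1+u+v}{2(1+2u)(1+2v)}=\frac{1}{4}\left( \frac{1}{1+2u}+\frac{1}{1+2v}\right) , \end{aligned}$$

which follows by cross-multiplication, the difference of the two products being \(u^2+v^2+2uv(u+v)\ge 0\). The second inequality in (7.4) only uses \(1/(1+2t^2)\ge 1/3\) for \(t\in \mathbb {I}\), i.e. for \(t\le 1\).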

However, the behavior of \(\mathbb {M}(\mathbf {t})\) for \(\mathbf {t}\in \mathbb {R}_+^{W-1}{\setminus }\mathbb {I}^{W-1}\) is much more delicate. We will not attempt a pointwise control of \(\mathbb {M}(\mathbf {t})\) in this region. Instead, we will bound the integral of \(\mathfrak {q}(\mathbf {t})\) against \(\mathbb {M}(\mathbf {t})\) over this region, for any given monomial \(\mathfrak {q}(\cdot )\) of interest. More specifically, recalling the definition of \(\varTheta \) in (5.30) and the spanning tree \(\mathcal {G}_0=(\mathcal {V},\mathcal {E}_0)\) in Assumption 1.1, and additionally setting

$$\begin{aligned} \mathfrak {L}:=\frac{M}{4}\mathfrak {A}(\hat{B})\min _{\{i,j\}\in \mathcal {E}_0}\mathfrak {s}_{ij}, \end{aligned}$$
(7.5)

we have the following lemma.

Lemma 7.2

Let \(\mathfrak {q}(\mathbf {t})=\prod _{j=2}^Wt_j^{n_j}\) be a monomial in the \(\mathbf {t}\)-variables, with powers \(n_j=O(1)\) for all \(j=2,\ldots , W\). We have

$$\begin{aligned} \int _{\mathbb {R}_+^{W-1}\setminus \mathbb {I}^{W-1}} \prod _{j=2}^W\mathrm{d} t_j \; \mathbb {M}(\mathbf {t}) \mathfrak {q}(\mathbf {t})\le \Big (1+\mathfrak {L}^{-\frac{1}{2}}\Big )^{O(W^2)}\exp \left\{ -\varTheta ^2\mathfrak {A}(\hat{B})+O(W^2\log N)\right\} .\nonumber \\ \end{aligned}$$
(7.6)

Remark 7.3

Roughly speaking, Lemma 7.2 shows that the integral of the monomial \(\mathfrak {q}(\mathbf {t})\) against the measure \(\mathbb {M}(\mathbf {t})\) over the region \(\mathbb {R}_+^{W-1}\setminus \mathbb {I}^{W-1}\) is exponentially small, owing to the fact \(\varTheta ^2\gg W^2\log N\).

We postpone the proof of Lemma 7.2 to the end of this section. First, we prove Lemmas 5.1 and 5.6 with the aid of Lemmas 6.1, 7.1 and 7.2. Before commencing the formal proofs, we mention two basic facts, formulated as the following lemma.

Lemma 7.4

Under Assumption 1.1, we have the following two facts.

  • For the smallest eigenvalue of \(-S^{(1)}\), there exists some positive constant c such that

    $$\begin{aligned} \lambda _1(-S^{(1)})\ge {c}/{W^2}. \end{aligned}$$
    (7.7)
  • Let \(\varvec{\varrho }=(\varrho _2,\ldots , \varrho _{W})'\) be a real vector and \(\varrho _1=0\). If there is at least one \(\alpha \in \{2,\ldots ,W\}\) such that \(|\varrho _\alpha |\ge \varTheta /\sqrt{M}\), then we have

    $$\begin{aligned} \sum _{j,k} \mathfrak {s}_{jk} (\varrho _j-\varrho _k)^2\ge {\varTheta }/{M}. \end{aligned}$$
    (7.8)

Proof

Let \(\varvec{\varrho }=(\varrho _2,\ldots , \varrho _{W})'\) be a real vector and \(\varrho _1=0\). Now, we assume \(|\varrho _\alpha |=\max _{\beta =2,\ldots , W}|\varrho _\beta |\). Then

$$\begin{aligned} \frac{-\varvec{\varrho }'S^{(1)}\varvec{\varrho }}{||\varvec{\varrho }||_2^2}=\frac{\frac{1}{2}\sum _{j,k} \mathfrak {s}_{jk} (\varrho _j-\varrho _k)^2}{\sum _j \varrho _j^2}\ge \frac{c(\varrho _\alpha -\varrho _1)^2}{W^2\varrho _\alpha ^2}=\frac{c}{W^{2}}, \end{aligned}$$

where the second step follows from Assumption 1.1 (iv) and the Cauchy–Schwarz inequality. Analogously, we have \(\sum _{j,k} \mathfrak {s}_{jk} (\varrho _j-\varrho _k)^2\ge {c\varrho _\alpha ^2 }/{W}\), which implies (7.8) by the definition of \(\varTheta \) in (5.30). This completes the proof. \(\square \)
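To make the role of the Cauchy–Schwarz inequality explicit, here is a sketch of the second step; we assume, as in the application of Assumption 1.1 (iv) after (7.26) below, that \(\mathfrak {s}_{jk}\ge c_0\) for some constant \(c_0>0\) along the edges of the spanning tree \(\mathcal {G}_0\). Let \(1=j_0, j_1,\ldots , j_m=\alpha \) be the path in \(\mathcal {G}_0\) connecting 1 to \(\alpha \), so that \(m\le W\). Then

$$\begin{aligned} \varrho _\alpha ^2=\bigg (\sum _{i=1}^m(\varrho _{j_i}-\varrho _{j_{i-1}})\bigg )^2\le m\sum _{i=1}^m(\varrho _{j_i}-\varrho _{j_{i-1}})^2\le \frac{W}{c_0}\sum _{j,k}\mathfrak {s}_{jk}(\varrho _j-\varrho _k)^2, \end{aligned}$$

which yields \(\sum _{j,k}\mathfrak {s}_{jk}(\varrho _j-\varrho _k)^2\ge c\varrho _\alpha ^2/W\); combined with \(\sum _j\varrho _j^2\le W\varrho _\alpha ^2\), this gives the displayed bound with the constant \(c/W^2\).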

Recalling the notation defined in (5.2) and the facts \(|x_{j,a}|=1\) and \(|b_{j,a}|=r_{j,a}\) for all \(j=1,\ldots , W\) and \(a=1,2\), we have, for any sequence of domains,

$$\begin{aligned}&\left| \mathcal {I}(\mathbf {I}^{b}_1, \mathbf {I}^b_2, \mathbf {I}^x_1, \mathbf {I}^x_2, \mathbf {I}^t,\mathbf {I}^v)\right| \nonumber \\&\quad \le e^{O(W\log N)}\int _{\mathbf {I}^b_1} \prod _{j=1}^W \mathrm{d} b_{j,1} \int _{\mathbf {I}^b_2} \prod _{j=1}^W \mathrm{d} b_{j,2} \int _{\mathbf {I}^x_1} \prod _{j=1}^W \mathrm{d} x_{j,1} \int _{\mathbf {I}^x_2} \prod _{j=1}^W \mathrm{d} x_{j,2}\nonumber \\&\quad \times \int _{\mathbf {I}^t} \prod _{j=2}^W 2t_j \mathrm{d} t_j \int _{\mathbf {I}^v} \prod _{j=2}^W 2v_j \mathrm{d} v_j \times \exp \left\{ -M\big (\mathsf {Re}K(\hat{X},V)+\mathsf {Re}L(\hat{B},T)\big )\right\} \nonumber \\&\quad \cdot |\mathsf {A}(\hat{X}, \hat{B}, V, T)|\cdot \prod _{j=1}^W (r_{j,1}+r_{j,2})^2. \end{aligned}$$
(7.9)

In addition, according to Lemma 6.1, we have

$$\begin{aligned} |\mathsf {A}(\hat{X}, \hat{B}, V, T)|\cdot \prod _{j=2}^W 2t_j\cdot \prod _{j=2}^W 2v_j \cdot \prod _{j=1}^W (r_{j,1}+r_{j,2})^2\le e^{O(WN^{\varepsilon _2})}\tilde{\mathfrak {p}}(\mathbf {r},\mathbf {r}^{-1},\mathbf {t}), \nonumber \\ \end{aligned}$$
(7.10)

for some polynomial \(\tilde{\mathfrak {p}}(\mathbf {r},\mathbf {r}^{-1},\mathbf {t})\) with positive coefficients, and

$$\begin{aligned}&\tilde{\mathfrak {p}}(\mathbf {r},\mathbf {r}^{-1},\mathbf {t})\in \mathfrak {Q}\Big (\{r_{j,1}, r_{j,2}, r_{j,1}^{-1}, r_{j,2}^{-1}, t_j\}_{j=1}^W; \kappa _1,\kappa _2,\kappa _3\Big ),\quad \kappa _1=e^{O(W)}, \nonumber \\&\quad \kappa _2, \kappa _3=O(1). \end{aligned}$$
(7.11)

7.1 Proof of Lemma 5.1

At first, since the domains of the \(\mathbf {x}_1, \mathbf {x}_2\) and \(\mathbf {v}\)-variables, namely \(\varSigma ^{W}, \varSigma ^W\) and \(\mathbb {I}^{W-1}\), stay fixed throughout the whole proof, we simply write \(*\) for them, in order to simplify the notation.

Now, we introduce the following contours with the parameter \(\mathfrak {D}\in \mathbb {R}_+\),

$$\begin{aligned} \varGamma _\mathfrak {D}:= & {} \big \{ra_+|r\in [0,\mathfrak {D}]\big \}\subset \varGamma ,\quad \mathbb {R}_\mathfrak {D}:=[0,(\mathsf {Re}a_+) \mathfrak {D}]\subset \mathbb {R}_+, \\ \mathcal {L}_\mathfrak {D}:= & {} \big \{(\mathsf {Re}a_+)\mathfrak {D}+\mathbf {i}(\mathsf {Im}a_+)r|r\in [0,\mathfrak {D}]\big \}. \end{aligned}$$

In addition, we recall the sector \(\mathbb {K}\) defined in (6.2). Then, trivially, we have \(\mathbb {R}_+,\varGamma , \mathcal {L}_\mathfrak {D}\subset \mathbb {K}\) and \(\mathbb {R}_+, \bar{\varGamma }, \bar{\mathcal {L}}_\mathfrak {D}\subset \bar{\mathbb {K}}\) for all \(\mathfrak {D}\in \mathbb {R}_+\).

Observe that the integrand in (5.2) is an analytic function of the \(\hat{B}\)-variables. To see this, we can go back to the integral representation (3.17) and the definitions of L(B) and \(\mathcal {P}(\varOmega ,\varXi ,X,B)\) in (3.18). Since \(\exp \{M\log \det B_j\}=(\det B_j)^M\), the logarithmic terms in L(B) do not produce any singularity in the integrand in (3.17); moreover, the factor \((\det B_j)^M\) compensates the factors \(b_{j,a}^{-\ell }\) from \(\mathcal {P}(\varOmega ,\varXi ,X,B)\), since \(\ell \ll M\). Consequently, we have

$$\begin{aligned} \mathcal {I}\Big ((\varGamma _{\mathfrak {D}}\cup \mathcal {L}_\mathfrak {D})^W, (\bar{\varGamma }_{\mathfrak {D}}\cup \bar{\mathcal {L}}_\mathfrak {D})^W, *, *, \mathbb {R}_+^{W-1},*\Big )=\mathcal {I}\Big ((\mathbb {R}_\mathfrak {D})^W, (\mathbb {R}_\mathfrak {D})^W,*, *, \mathbb {R}_+^{W-1},*\Big ). \end{aligned}$$

Hence, to prove Lemma 5.1, it suffices to prove the following lemma.

Lemma 7.5

Suppose that \(|E|\le \sqrt{2}-\kappa \). As \(\mathfrak {D}\rightarrow \infty \), the following convergences hold:

$$\begin{aligned}&\mathrm{(i)}:\quad \mathcal {I}\Big ((\varGamma _{\mathfrak {D}}\cup \mathcal {L}_\mathfrak {D})^W, (\bar{\varGamma }_{\mathfrak {D}}\cup \bar{\mathcal {L}}_\mathfrak {D})^W, *, *, \mathbb {R}_+^{W-1},*\Big )\\&\qquad \qquad \,-\mathcal {I}\Big ((\varGamma _{\mathfrak {D}})^W, ({\bar{\varGamma }_{\mathfrak {D}}})^W, *, *, \mathbb {R}_+^{W-1},*\Big )\rightarrow 0,\\&\mathrm{(ii)}:\quad \mathcal {I}\Big ((\varGamma )^W, (\bar{\varGamma })^W, *, *, \mathbb {R}_+^{W-1},*\Big )-\mathcal {I}\Big ((\varGamma _{\mathfrak {D}})^W, ({\bar{\varGamma }_{\mathfrak {D}}})^W, *, *, \mathbb {R}_+^{W-1},*\Big )\rightarrow 0,\\&\mathrm{(iii)}:\quad \mathcal {I}\Big (\mathbb {R}_+^W, \mathbb {R}_+^W, *, *, \mathbb {R}_+^{W-1},*\Big )-\mathcal {I}\Big (\mathbb {R}_\mathfrak {D}^W, \mathbb {R}_\mathfrak {D}^W, *, *, \mathbb {R}_+^{W-1},*\Big )\rightarrow 0. \end{aligned}$$

Proof

For simplicity, we use the notation

$$\begin{aligned}&\mathbf {I}_\mathfrak {D}^{b,1}:=\big (\varGamma _{\mathfrak {D}}\cup \mathcal {L}_\mathfrak {D}\big )^W\times \big (\bar{\varGamma }_{\mathfrak {D}}\cup \bar{\mathcal {L}}_\mathfrak {D}\big )^W \setminus (\varGamma _{\mathfrak {D}})^W\times (\bar{\varGamma }_{\mathfrak {D}})^W,\\&\mathbf {I}_\mathfrak {D}^{b,2}:=\varGamma ^W\times \bar{\varGamma }^W\setminus (\varGamma _{\mathfrak {D}})^W\times ({\bar{\varGamma }_{\mathfrak {D}}})^W,\qquad \mathbf {I}_\mathfrak {D}^{b,3}:=\mathbb {R}_+^W \times \mathbb {R}_+^W\setminus \mathbb {R}_\mathfrak {D}^W\times \mathbb {R}_\mathfrak {D}^W. \end{aligned}$$

Now, recall the definition of the function \(\ell (\mathbf {a})\) in (5.6) and the representation of \(L(\hat{B},T)\) in (5.5). In light of the definition of \(\mathbb {M}(\mathbf {t})\) in (7.1), we have

$$\begin{aligned} \exp \big \{-M\mathsf {Re}L(\hat{B},T)\big \}= \exp \big \{-M\big (\mathsf {Re}\ell (\mathbf {b}_1)+\mathsf {Re}\ell (-\mathbf {b}_2)\big )\big \} \mathbb {M}(\mathbf {t}). \end{aligned}$$
(7.12)

By the assumption \(|E|\le \sqrt{2}-\kappa \), we see that \(\mathsf {Re}b_{j,a}b_{k,a}>0\) for all \(b_{j,a},b_{k,a}\in \mathbb {K}\cup \bar{\mathbb {K}}\). Consequently, when \(b_{j,1}\in \mathbb {K}\) and \(b_{j,2}\in \bar{\mathbb {K}}\) for all \(j=1,\ldots , W\), we have

$$\begin{aligned} \mathsf {Re}\ell (\mathbf {b}_1)+\mathsf {Re}\ell (-\mathbf {b}_2)&\ge \sum _{a=1,2}\sum _{j}\left( \frac{1}{2}(1+\mathfrak {s}_{jj}) \mathsf {Re}b_{j,a}^2+(-1)^{a+1}E\mathsf {Im}b_{j,a}- \log r_{j,a}\right) \nonumber \\&\ge c\sum _{j} \left( r_{j,1}^2+r_{j,2}^2\right) -\sum _{j} (\log r_{j,1}+\log r_{j,2}), \end{aligned}$$
(7.13)

where we used Assumption 1.1 (ii) and the fact that \((-1)^{a+1}E\mathsf {Im}b_{j,a}\ge 0\).

Now, when \((\mathbf {b}_1,\mathbf {b}_2)\in \mathbf {I}_\mathfrak {D}^{b,i}\) for \(i=1,2, 3\), we have \(\sum _{a=1,2}\sum _{j} r_{j,a}^2 \ge c \mathfrak {D}^2\) for some positive constant c, which implies the trivial fact

$$\begin{aligned} \sum _{j} \left( r_{j,1}^2+r_{j,2}^2\right) \ge \frac{1}{2}\sum _{j} \left( r_{j,1}^2+r_{j,2}^2\right) +\frac{c}{2} \mathfrak {D}^2. \end{aligned}$$
(7.14)

Consequently, we can get from (7.12), (7.13) and (7.14) that for some positive constant c,

$$\begin{aligned} \exp \big \{-M\mathsf {Re}L(\hat{B},T)\big \}\le e^{-cM\mathfrak {D}^2}\prod _{a=1,2}\prod _{j=1}^W e^{-cMr_{j,a}^2 }\;r_{j,a}^M\cdot \mathbb {M}(\mathbf {t}) \end{aligned}$$

holds in \(\mathbf {I}_\mathfrak {D}^{b,i}\) for \(i=1,2,3\). In addition, by the boundedness of the V and \(\hat{X}\)-variables, we get the trivial bound \(M\mathsf {Re}K(\hat{X},V)=O(N)\). Hence, from (7.9) and (7.10) we see that the quantities in Lemma 7.5 (i), (ii) and (iii) can be bounded by the following integral with \(i=1,2,3\), respectively,

$$\begin{aligned} e^{-cM\mathfrak {D}^2+O(N)} \int _{\mathbf {I}^{b,i}_\mathfrak {D}}\prod _{j=1}^W \mathrm{d} b_{j,1} \mathrm{d} b_{j,2} \int _{\mathbb {R}_+^{W-1}}\prod _{j=2}^W \mathrm{d} t_j \prod _{a=1,2}\prod _{j=1}^We^{-cMr_{j,a}^2 }\; r_{j,a}^M \mathbb {M}(\mathbf {t})\tilde{\mathfrak {p}}(\mathbf {r},\mathbf {r}^{-1},\mathbf {t}). \end{aligned}$$
(7.15)

According to the facts \(\kappa _1=e^{O(W)}\) and \(\kappa _2=O(1)\) in (7.11), it suffices to consider one monomial, say \(\tilde{\mathfrak {q}}(\mathbf {r},\mathbf {r}^{-1},\mathbf {t})\), instead of \(\tilde{\mathfrak {p}}(\mathbf {r},\mathbf {r}^{-1},\mathbf {t})\). Then, bounding the \(t_j\)'s trivially by 1 in the region \(\mathbf {t}\in \mathbb {I}^{W-1}\) and using Lemma 7.2 in the region \(\mathbf {t}\in \mathbb {R}_+^{W-1}\setminus \mathbb {I}^{W-1}\), we can first integrate the \(\mathbf {t}\)-variables out; then, performing the Gaussian integral over the \(\hat{B}\)-variables, it is not difficult to get the estimate

$$\begin{aligned} (7.15)\le e^{-cM\mathfrak {D}^2+O(N\log N)}\rightarrow 0,\quad \text {as}\quad \mathfrak {D}\rightarrow \infty , \end{aligned}$$

for \(i=1,2,3\). Here we remark that after integrating the \(\mathbf {t}\)-variables out, we get a singularity \((\min _{j,a} r_{j,a})^{-CW^2}\) from the factor \((1+\mathfrak {L}^{-\frac{1}{2}})^{O(W^2)}\) in (7.6), which is compensated by the factor \(\prod _{a}\prod _{j}r_{j,a}^M\), owing to the fact \(M\gg W^2\). This completes the proof. \(\square \)
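The Gaussian integral in the last step can be made explicit. As a minimal sketch for a single variable \(r=r_{j,a}\), with a generic constant c and an exponent \(\ell =O(1)\) coming from the monomial \(\tilde{\mathfrak {q}}(\cdot )\),

$$\begin{aligned} \int _0^\infty \mathrm{d} r\; r^{M+\ell }e^{-cMr^2}=\frac{1}{2}(cM)^{-\frac{M+\ell +1}{2}}\varGamma \Big (\frac{M+\ell +1}{2}\Big )=e^{O(M)}, \end{aligned}$$

by Stirling's formula. Taking the product over the 2W pairs (j, a) gives a factor \(e^{O(MW)}=e^{O(N)}\), assuming \(MW=O(N)\) consistently with the bound \(M\mathsf {Re}K(\hat{X},V)=O(N)\) used above; this is absorbed into \(e^{O(N\log N)}\), and the remaining factor \(e^{-cM\mathfrak {D}^2}\) then drives the convergence to 0.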

7.2 Proof of Lemma 5.6

Plugging the first identity of (5.21) and (7.10) into (7.9), we can write

$$\begin{aligned} \left| \mathcal {I}(\mathbf {I}^{b}_1, \mathbf {I}^b_2, \mathbf {I}^x_1, \mathbf {I}^x_2, \mathbf {I}^t,\mathbf {I}^v)\right|&\le e^{O(WN^{\varepsilon _2})}\int _{\mathbf {I}^b_1} \prod _{j=1}^W \mathrm{d} b_{j,1} \int _{\mathbf {I}^b_2} \prod _{j=1}^W \mathrm{d} b_{j,2}\nonumber \\&\quad \times \int _{\mathbf {I}^x_1} \prod _{j=1}^W \mathrm{d} x_{j,1} \int _{\mathbf {I}^x_2} \prod _{j=1}^W \mathrm{d} x_{j,2} \times \int _{\mathbf {I}^t} \prod _{j=2}^W \mathrm{d} t_j \int _{\mathbf {I}^v} \prod _{j=2}^W \mathrm{d} v_j \nonumber \\&\quad \times \exp \left\{ -M\left( \mathsf {Re}\mathring{K}(\hat{X},V)+\mathsf {Re}\mathring{L}(\hat{B},T)\right) \right\} \tilde{\mathfrak {p}}(\mathbf {r},\mathbf {r}^{-1},\mathbf {t}), \end{aligned}$$
(7.16)

where \(\tilde{\mathfrak {p}}(\mathbf {r},\mathbf {r}^{-1},\mathbf {t})\) is specified in (7.11).

Lemma 5.6 follows immediately from the following two lemmas.

Lemma 7.6

Under Assumptions 1.1 and 1.14, we have

$$\begin{aligned} \mathcal {I}\big (\varGamma ^{W}, \bar{\varGamma }^{W}, \varSigma ^W,\varSigma ^W, \mathbb {R}_+^{W-1}\setminus \mathbb {I}^{W-1},\mathbb {I}^{W-1}\big )\le e^{-\varTheta ^2}. \end{aligned}$$
(7.17)

Lemma 7.7

Under Assumptions 1.1 and 1.14, we have

$$\begin{aligned} \mathcal {I}\big (\varGamma ^{W}, \bar{\varGamma }^{W}, \varSigma ^W,\varSigma ^W, \mathbb {I}^{W-1},\mathbb {I}^{W-1}\big )&=2^W\mathcal {I}\big (\varUpsilon ^b_+, \varUpsilon ^b_-, \varUpsilon ^x_+, \varUpsilon ^x_-, \varUpsilon _S,\varUpsilon _S\big )\nonumber \\&\quad +\mathcal {I}\big (\varUpsilon ^b_+, \varUpsilon ^b_-, \varUpsilon ^x_+, \varUpsilon ^x_+, \varUpsilon _S,\mathbb {I}^{W-1}\big )\nonumber \\&\quad +\mathcal {I}\big (\varUpsilon ^b_+, \varUpsilon ^b_-, \varUpsilon ^x_-, \varUpsilon ^x_-, \varUpsilon _S,\mathbb {I}^{W-1}\big )+O( e^{-\varTheta }). \end{aligned}$$
(7.18)

In the sequel, we prove Lemmas 7.6 and 7.7.

Proof of Lemma 7.6

Recall (7.16) with the choice of the integration domains

$$\begin{aligned} \big (\mathbf {I}^{b}_1, \mathbf {I}^b_2, \mathbf {I}^x_1, \mathbf {I}^x_2, \mathbf {I}^t,\mathbf {I}^v\big )=\big (\varGamma ^{W}, \bar{\varGamma }^{W}, \varSigma ^W,\varSigma ^W, \mathbb {R}_+^{W-1}\setminus \mathbb {I}^{W-1},\mathbb {I}^{W-1}\big ). \end{aligned}$$

To simplify the integral on the r.h.s. of (7.16), we use the fact \(\mathsf {Re}\mathring{K}(\hat{X},V)\ge 0\) implied by (5.27), together with the fact that the \(\mathbf {x}\) and \(\mathbf {v}\)-variables are bounded by 1. Consequently, we can eliminate the integral over the \(\mathbf {x}\) and \(\mathbf {v}\)-variables from the r.h.s. of (7.16). Moreover, according to (7.11), it suffices to prove

$$\begin{aligned}&\int _{\varGamma ^W} \prod _{j=1}^W \mathrm{d} b_{j,1} \int _{\bar{\varGamma }^W} \prod _{j=1}^W \mathrm{d} b_{j,2} \int _{\mathbb {R}_+^{W-1}\setminus \mathbb {I}^{W-1}} \prod _{j=2}^W \mathrm{d} t_j \exp \left\{ -M\mathsf {Re}\mathring{L}(\hat{B},T)\right\} \cdot \tilde{\mathfrak {q}}(\mathbf {r},\mathbf {r}^{-1},\mathbf {t})\nonumber \\&\quad \le e^{-\varTheta ^2} \end{aligned}$$
(7.19)

instead, for some monomial \(\tilde{\mathfrak {q}}(\cdot )\) in \(\tilde{\mathfrak {p}}(\cdot )\). The proof of (7.19) is then just an application of the first inequality of (5.13), Lemma 7.2 and an elementary Gaussian integral. We omit the details here. \(\square \)

To prove Lemma 7.7, we split the exponential function into two parts. We use one part to control the integral, and the other will be estimated by its magnitude. More specifically, we shall prove the following two lemmas.

Lemma 7.8

Under Assumptions 1.1 and 1.14, we have

$$\begin{aligned}&\int _{\varGamma ^W} \prod _{j=1}^W \mathrm{d} b_{j,1} \int _{\bar{\varGamma }^W} \prod _{j=1}^W \mathrm{d} b_{j,2} \int _{\varSigma ^W} \prod _{j=1}^W \mathrm{d} x_{j,1} \int _{\varSigma ^W} \prod _{j=1}^W \mathrm{d} x_{j,2} \int _{\mathbb {I}^{W-1}} \prod _{j=2}^W \mathrm{d} t_j \int _{\mathbb {I}^{W-1}} \prod _{j=2}^W \mathrm{d} v_j\nonumber \\&\quad \times \exp \Big \{-\frac{1}{2}M\big (\mathsf {Re}\mathring{K}(\hat{X},V)+\mathsf {Re}\mathring{L}(\hat{B},T)\big )\Big \}\cdot \tilde{\mathfrak {p}}(\mathbf {r},\mathbf {r}^{-1},\mathbf {t})\le e^{O(W)}. \end{aligned}$$
(7.20)

Lemma 7.9

If \((\mathbf {b}_1,\mathbf {b}_2,\mathbf {x}_1,\mathbf {x}_2,\mathbf {t}, \mathbf {v})\in \varGamma ^W\times \bar{\varGamma }^W\times \varSigma ^W\times \varSigma ^W\times \mathbb {I}^{W-1}\times \mathbb {I}^{W-1}\), but not in any of the Types I, II, III vicinities in Definition 5.5, we have

$$\begin{aligned} \exp \Big \{-\frac{1}{2}M\big (\mathsf {Re}\mathring{K}(\hat{X},V)+\mathsf {Re}\mathring{L}(\hat{B},T)\big )\Big \}\le e^{-\varTheta }. \end{aligned}$$
(7.21)

With Lemmas 7.8 and 7.9, we can prove Lemma 7.7.

Proof of Lemma 7.7

For the sake of simplicity, in this proof we temporarily use \(\mathcal {I}_{\text {full}}\) to represent the l.h.s. of (7.18), i.e. the integral over the full domain, and \(\mathcal {I}_{I}, \mathcal {I}_{II}\) and \(\mathcal {I}_{III}\) to represent the first three terms on the r.h.s. of (7.18). Now, combining (7.16), (7.20) and (7.21), we see that

$$\begin{aligned} \left| \mathcal {I}_{\text {full}}-\mathcal {I}_{I}-\mathcal {I}_{II}-\mathcal {I}_{III}\right| \le e^{O(WN^{\varepsilon _2})}\cdot e^{-\varTheta }\cdot e^{O(W)}\le e^{-\varTheta }, \end{aligned}$$

in light of the definition of \(\varTheta \) in (5.30) and the assumption (5.34). This completes the proof of Lemma 7.7. \(\square \)

Proof of Lemma 7.8

At first, again, the polynomial \(\tilde{\mathfrak {p}}(\cdot )\) in the integrand can be replaced by some monomial \(\tilde{\mathfrak {q}}(\cdot )\), owing to the fact that \(\kappa _1=\exp \{O(W)\}\) in (7.11). The proof is then similar to that of Lemma 7.6, but much simpler, since the \(\mathbf {t}\)-variables are now bounded by 1. Consequently, we can eliminate the \(\hat{X}, \mathbf {t}\) and \(\mathbf {v}\)-variables from the integral directly and use the trivial bounds

$$\begin{aligned} \tilde{\mathfrak {q}}(\mathbf {r},\mathbf {r}^{-1},\mathbf {t})\le \prod _{a=1,2}\prod _{j=1}^W r_{j,a}^{\ell _{j,a}},\qquad \mathsf {Re}\mathring{L}(\hat{B},T)\ge & {} c\sum _{a=1,2}\sum _{j=1}^W (r_{j,a}-1)^2, \quad \quad \end{aligned}$$
(7.22)

where the latter is from (5.13). Then, an elementary Gaussian integral leads to the conclusion immediately. \(\square \)
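As a minimal sketch of this last step, for a single variable \(r=r_{j,a}\) with exponent \(\ell =\ell _{j,a}=O(1)\) as in (7.22) and a generic constant c,

$$\begin{aligned} \int _0^\infty \mathrm{d} r\; r^{\ell }e^{-cM(r-1)^2}\le 2^{\ell }\int _{0}^{2}\mathrm{d} r\; e^{-cM(r-1)^2}+\int _{2}^{\infty }\mathrm{d} r\; r^{\ell }e^{-cM(r-1)^2}=O(1), \end{aligned}$$

uniformly for large M, where the tail integral is exponentially small since \((r-1)^2\ge r-1\) for \(r\ge 2\). Taking the product over the 2W pairs (j, a) and recalling the coefficient bound \(\kappa _1=e^{O(W)}\) from (7.11) yields the bound \(e^{O(W)}\) claimed in (7.20).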

Proof of Lemma 7.9

According to (5.13) and (5.27), we see both \(M\mathsf {Re}\mathring{L}(\hat{B},T)\) and \(M\mathsf {Re}\mathring{K}(\hat{X},V)\) are nonnegative on the full domain. Hence, it suffices to show one of them is larger than \(\varTheta \) outside the Type I, II, III vicinities. For the former, using the first inequality of (5.13), (7.3) and (7.4), we have

$$\begin{aligned} \mathsf {Re}\mathring{L}(\hat{B},T)&\ge c\sum _{a=1,2}\sum _{j=1}^W (r_{j,a}-1)^2+\mathsf {Re}\ell _S(\hat{B},T)\nonumber \\&\ge c\Big (||\mathbf {b}_1-a_+||_2^2+||\mathbf {b}_2+a_-||_2^2\Big )\\&\quad + \frac{\mathfrak {A}(\hat{B})}{12}(-\mathbf {t}'S^{(1)}\mathbf {t}). \end{aligned}$$

Then it is easy to see that \( M\mathsf {Re}\mathring{L}(\hat{B},T)\ge \varTheta \) if \((\mathbf {b}_1,\mathbf {b}_2,\mathbf {t})\not \in \varUpsilon ^b_+\times \varUpsilon ^b_-\times \varUpsilon _S\), by the definitions in (5.32).

Now we turn to \(M\mathsf {Re}\mathring{K}(\hat{X},V)\). Recall the definition of the \(\vartheta _j\)'s in (5.26). If there is some \(j\in \{1,\ldots ,2W\}\) such that \((\sin \vartheta _j-E/2)^2> \varTheta /M\), we get \(M\mathsf {Re}\mathring{K}(\hat{X},V)>\varTheta \) immediately, by using (5.27). Hence, in the sequel, it suffices to consider the case \(\big (\sin \vartheta _j-{E}/{2}\big )^2\le {\varTheta }/{M}\) for all \({j=1,\ldots , 2W}\), which implies

$$\begin{aligned} |\arg (a_+^{-1}x_{j,a})|^2\wedge |\arg (a_-^{-1}x_{j,a})|^2\le \frac{\varTheta }{M}, \quad \forall \; j=1,\ldots , W; a=1,2. \quad \quad \end{aligned}$$
(7.23)

Now, recall the notation defined in (5.33). We claim that it suffices to focus on the following three subcases of (7.23), corresponding to the Type I, II and III saddle points.

  (i) There is a sequence of permutations of \(\{1,2\}\), \(\varvec{\epsilon }=(\epsilon _1,\ldots , \epsilon _W)\), such that \(||\arg (a_+^{-1}\mathbf {x}_{\varvec{\epsilon }(1)})||_\infty ^2\le {\varTheta }/{M}\) and \(||\arg (a_-^{-1}\mathbf {x}_{\varvec{\epsilon }(2)})||_\infty ^2\le {\varTheta }/{M}\).

  (ii) For \(a=1,2\), we have \(||\arg (a_+^{-1}\mathbf {x}_{a})||_\infty ^2\le {\varTheta }/{M}\).

  (iii) For \(a=1,2\), we have \(||\arg (a_-^{-1}\mathbf {x}_{a})||_\infty ^2\le {\varTheta }/{M}\).

For those \(\hat{X}\)-variables which satisfy (7.23) but do not belong to any of the cases (i), (ii) or (iii) listed above, one can actually get \(M\mathsf {Re}\mathring{K}(\hat{X},V)>cM\gg \varTheta \). We explain this estimate for the following case: there is some pair \(\{i,j\}\in \mathcal {E}\) such that

$$\begin{aligned} |\arg (a_+^{-1}x_{i,1})|^2, |\arg (a_-^{-1}x_{i,2})|^2\le \frac{\varTheta }{M},\quad |\arg (a_-^{-1}x_{j,1})|^2, |\arg (a_-^{-1}x_{j,2})|^2\le \frac{\varTheta }{M}, \nonumber \\ \end{aligned}$$
(7.24)

while all the other cases can be handled analogously. Using (5.27), we have

$$\begin{aligned} \mathsf {Re}\mathring{K}(\hat{X},V)&\ge \frac{1}{4}(\mathbb {S}^v)_{ij}(\cos \vartheta _i-\cos \vartheta _j)^2+\frac{1}{4}(\mathbb {S}^v)_{i+W,j}(\cos \vartheta _{i+W}-\cos \vartheta _j)^2\nonumber \\&\quad +\frac{1}{4}(\mathbb {S}^v)_{i,j+W}(\cos \vartheta _i-\cos \vartheta _{j+W})^2\nonumber \\&\quad +\frac{1}{4}(\mathbb {S}^v)_{i+W,j+W}(\cos \vartheta _{i+W}-\cos \vartheta _{j+W})^2\,. \end{aligned}$$
(7.25)

Now, by the assumption (7.24), we have

$$\begin{aligned} \cos \vartheta _i=\mathsf {Re}(a_+)+o(1),\quad \cos \vartheta _{i+W}, \cos \vartheta _j,\cos \vartheta _{j+W}=\mathsf {Re}(a_-)+o(1). \end{aligned}$$
(7.26)

Consequently, from (7.25) and the definition (5.23) we have

$$\begin{aligned} \mathsf {Re}\mathring{K}(\hat{X},V)\ge c\Big ((\mathbb {S}^v)_{ij}+(\mathbb {S}^v)_{i,j+W}\Big )= c\mathfrak {s}_{ij}>c', \end{aligned}$$

where the last step follows from Assumption 1.1 (iv). Therefore, we have \(M\mathsf {Re}\mathring{K}(\hat{X},V)>cM\).

Hence, we can focus on the cases (i), (ii) and (iii). Note that in case (i), we actually defined a vicinity of \((\mathbf {x}_{\varvec{\epsilon }(1)},\mathbf {x}_{\varvec{\epsilon }(2)})\) in terms of the \(\ell ^\infty \)-norm rather than the \(\ell ^2\)-norm of \(\mathbf {x}_{\varvec{\epsilon }(1)}\) and \(\mathbf {x}_{\varvec{\epsilon }(2)}\); thus this vicinity is larger than \(\varUpsilon _+^{x}\times \varUpsilon _-^x\). Without loss of generality, we assume that \(\epsilon _1=\mathrm {id}\), the identity. Then, by performing the transform \((\hat{X}_j,V_j)\rightarrow (\mathfrak {I}\hat{X}_j\mathfrak {I}, \mathfrak {I}V_j)\) for those j with \(\epsilon _j\ne \mathrm {id}\), we can transform \((\mathbf {x}_{\varvec{\epsilon }(1)}, \mathbf {x}_{\varvec{\epsilon }(2)}, \mathbf {v}_{\varvec{\epsilon }})\) to \((\mathbf {x}_1,\mathbf {x}_2,\mathbf {v})\) for every permutation sequence \(\varvec{\epsilon }\). Hence, it suffices to assume \(\epsilon _j=\mathrm {id}\) for all \(j=1,\ldots , W\) and consider the case

$$\begin{aligned} ||\arg (a_+^{-1}\mathbf {x}_1)||_\infty ^2<\frac{\varTheta }{M},\quad ||\arg (a_-^{-1}\mathbf {x}_2)||_\infty ^2<\frac{\varTheta }{M} \quad \text {but}\quad (\mathbf {x}_1,\mathbf {x}_2,\mathbf {v})\not \in \varUpsilon _+^x\times \varUpsilon _-^x\times \varUpsilon _S. \end{aligned}$$

Now, if \((\mathbf {x}_1,\mathbf {x}_2)\not \in \varUpsilon _+^x\times \varUpsilon _-^x\), we can use (5.27) again to see that

$$\begin{aligned} M\mathsf {Re}\mathring{K}(\hat{X},V)\ge cM\sum _{j=1}^{2W}\left( \sin \vartheta _j-\frac{E}{2}\right) ^2\ge cM\big (||\arg (a_+^{-1}\mathbf {x}_1)||_2^2+||\arg (a_-^{-1}\mathbf {x}_2)||_2^2\big )\ge \varTheta , \end{aligned}$$

where in the second step we used the facts \(||\arg (a_+^{-1}\mathbf {x}_1)||_\infty ^2<{\varTheta }/{M}\) and \(||\arg (a_-^{-1}\mathbf {x}_2)||_\infty ^2<{\varTheta }/{M}\), and in the last step we used the definition in (5.32). Now, if \((\mathbf {x}_1,\mathbf {x}_2)\in \varUpsilon _+^x\times \varUpsilon _-^x\) but \(\mathbf {v}\not \in \varUpsilon _S\), we go back to (5.20). It is easy to check that \(-\mathring{\ell }_{++}(\mathbf {x}_1)-\mathring{\ell }_{+-}(\mathbf {x}_2)\ge 0\) for \((\mathbf {x}_1,\mathbf {x}_2)\in \varUpsilon _+^x\times \varUpsilon _-^x\). Hence it suffices to estimate \(\ell _S(\hat{X},V)\). By the definition in (5.17) and the fact \(x_{j,1}-x_{j,2}=a_+-a_-+o(1)\) for all \(j=1,\ldots , W\), we have

$$\begin{aligned} \ell _S(\hat{X},V)\ge & {} c\sum _{j,k} \mathfrak {s}_{jk}|(V_kV_j^*)_{12}|^2=c\sum _{j,k} \mathfrak {s}_{jk}|u_jv_ke^{\mathbf {i}\theta _k}-u_kv_je^{\mathbf {i}\theta _j}|^2\nonumber \\\ge & {} c\sum _{j,k}\mathfrak {s}_{jk}(v_j-v_k)^2, \end{aligned}$$
(7.27)

where the last step follows from the same argument as (7.4). Consequently, \(M\ell _S(\hat{X},V)\ge \varTheta \) if \(\mathbf {v}\not \in \varUpsilon _S\). This finishes the discussion of case (i). For cases (ii) and (iii), the proofs are much easier, since it suffices to discuss the \(\hat{X}\)-variables; we omit the details here. This completes the proof of Lemma 7.9. \(\square \)

7.3 Proof of Lemma 7.2

Let \(\mathbb {I}^c=\mathbb {R}_+{\setminus } \mathbb {I}\). Now we consider the domain sequence \(\mathbf {\mathbb {J}}=(\mathbb {J}_2,\ldots , \mathbb {J}_{W})\in \{\mathbb {I}, \mathbb {I}^c\}^{W-1}\). We decompose the integral in Lemma 7.2 as follows

$$\begin{aligned} \int _{\mathbb {R}_+^{W-1}\setminus \mathbb {I}^{W-1}} \prod _{j=2}^W \mathrm{d} t_j\; \mathbb {M}(\mathbf {t})\mathfrak {q}(\mathbf {t})=\sum _{\begin{array}{c} \mathbf {\mathbb {J}}\in \{\mathbb {I},\mathbb {I}^c\}^{W-1}\\ \mathbf {\mathbb {J}}\ne \mathbb {I}^{W-1} \end{array}}\int _{\prod _{j=2}^W \mathbb {J}_j} \prod _{j=2}^W \mathrm{d} t_j\; \mathbb {M}(\mathbf {t})\mathfrak {q}(\mathbf {t}). \end{aligned}$$
(7.28)

Note that the total number of choices of such \(\mathbf {\mathbb {J}}\) in the sum above is \(2^{W-1}-1\). It suffices to consider one of these sequences \(\mathbf {\mathbb {J}}\in \{\mathbb {I}, \mathbb {I}^c\}^{W-1}\) in which there is at least one i such that \(\mathbb {J}_i=\mathbb {I}^c\).

Recall the spanning tree \(\mathcal {G}_0=(\mathcal {V},\mathcal {E}_0)\) in Assumption 1.1. The simplest case is that there exists a linear spanning tree (a path) \(\mathcal {G}_0\) with

$$\begin{aligned} \mathcal {E}_0=\{(i,i+1)\}_{i=1}^{W-1}\subset \mathcal {E}. \end{aligned}$$
(7.29)

We first present the proof in this simplest case.

Now, we only keep the edges in the path \(\mathcal {E}_0\), i.e. the terms with \(k=j-1\) in (7.3); we trivially discard the term \(1/(1+2t_{j}^2)\) from the sum \(1/(1+2t_{j-1}^2)+1/(1+2t_{j}^2)\) in the first inequality of (7.4); and finally we bound each \(M\mathfrak {A}(\hat{B})\mathfrak {s}_{j-1,j}/4\) from below by \(\mathfrak {L}\) defined in (7.5). That is, we use the bound

$$\begin{aligned} \mathbb {M}(\mathbf {t})\le \prod _{j=2}^{W}\exp \left\{ -\mathfrak {L}\frac{(t_{j}-t_{j-1})^2}{1+2t_{j-1}^2}\right\} :=\prod _{j=2}^{W}\breve{\mathbb {M}}_j(\mathbf {t}). \end{aligned}$$
(7.30)

Consequently, we have

$$\begin{aligned} \int _{\prod _{j=2}^W \mathbb {J}_j} \prod _{j=2}^W \mathrm{d} t_j\; {\mathbb {M}}(\mathbf {t})\mathfrak {q}(\mathbf {t}) \le \int _{ \prod _{j=2}^{W}\mathbb {J}_j} \prod _{j=2}^{W}\mathrm{d} t_j\; \prod _{j=2}^{W} t_j^{n_j} \breve{\mathbb {M}}_j(\mathbf {t}). \end{aligned}$$
(7.31)

Note that, as a function of \(\mathbf {t}\), the factor \(\breve{\mathbb {M}}_j(\mathbf {t})\) depends only on \(t_{j-1}\) and \(t_{j}\).

Having fixed \(\mathbf {\mathbb {J}}\), assume that k is the largest index such that \(\mathbb {J}_{k}=\mathbb {I}^c\), i.e. \(t_{k+1},\ldots , t_W\in \mathbb {I}\). Then, using the fact \(t_1=0\) and \(t_k\ge 1\), it is not difficult to check

$$\begin{aligned} \sum _{j=2}^{W} \frac{(t_{j}-t_{j-1})^2}{1+2t_{j-1}^2}\ge \sum _{j=2}^{k} \frac{(t_{j}-t_{j-1})^2}{1+2t_{j-1}^2}\ge \frac{1}{300k^2},\quad \text {if}\quad t_{k}\in \mathbb {I}^c \end{aligned}$$
(7.32)

by employing the elementary facts

$$\begin{aligned} \frac{ (t_{j}-t_{j-1})^2}{1+2t_{j-1}^2}\ge \left\{ \begin{array}{lll} \frac{1}{3}(t_{j}/t_{j-1}-1)^2,&{}\quad \text {if}\quad t_{j-1}\in \mathbb {I}^c\\ \frac{1}{3} (t_{j}-t_{j-1})^2,&{}\quad \text {if}\quad t_{j-1}\in \mathbb {I} \end{array}\right. \qquad j=2,\ldots , W. \quad \quad \end{aligned}$$
(7.33)
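A sketch of how (7.32) follows from (7.33): let \(m\le k\) be the smallest index with \(t_m\in \mathbb {I}^c\) (it exists since \(t_k\in \mathbb {I}^c\)), so that \(t_1=0, t_m\ge 1\) and \(t_1,\ldots , t_{m-1}\in \mathbb {I}\). Dropping the nonnegative terms with \(j>m\) and using the second line of (7.33) together with the Cauchy–Schwarz inequality,

$$\begin{aligned} \sum _{j=2}^{k}\frac{(t_{j}-t_{j-1})^2}{1+2t_{j-1}^2}\ge \frac{1}{3}\sum _{j=2}^{m}(t_{j}-t_{j-1})^2\ge \frac{1}{3(m-1)}\bigg (\sum _{j=2}^{m}(t_{j}-t_{j-1})\bigg )^2=\frac{t_m^2}{3(m-1)}\ge \frac{1}{3k}, \end{aligned}$$

which is even stronger than the bound \(1/(300k^2)\) stated in (7.32); the smaller constant there leaves room for variants of this argument.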

Now, we split \(\prod _{j=2}^W\breve{\mathbb {M}}_j(\mathbf {t})\) into two parts. We use one to control the integral, and the other will be estimated by (7.32). Specifically, substituting (7.32) into (7.31) we have

$$\begin{aligned} \int _{\prod _{j=2}^W \mathbb {J}_j} \prod _{j=2}^W \mathrm{d} t_j\; {\mathbb {M}}(\mathbf {t})\mathfrak {q}(\mathbf {t})\le e^{-\frac{\mathfrak {L}}{600k^2}} \int _{\mathbb {R}_+^{W-1}}\prod _{j=2}^{W}\mathrm{d} t_j\; \prod _{j=2}^{W} t_j^{n_j} \big (\breve{\mathbb {M}}_j(\mathbf {t})\big )^{\frac{1}{2}}. \end{aligned}$$
(7.34)

Therefore, what remains is to estimate the integral in (7.34), which can be done by elementary Gaussian integrals, step by step. More specifically, using (7.33) and the change of variables \(t_j/t_{j-1}-1\rightarrow t_j\) in the case \(t_{j-1}\in \mathbb {I}^c\) and \(t_j-t_{j-1}\rightarrow t_j\) in the case \(t_{j-1}\in \mathbb {I}\), it is elementary to see that for any \(\ell =O(W)\),

$$\begin{aligned} \int _{\mathbb {R}_+} dt_j\; t_j^\ell \big (\breve{\mathbb {M}}_j(\mathbf {t})\big )^{\frac{1}{2}}&\le \ell !! \Big (1+c\mathfrak {L}^{-\frac{1}{2}}\Big )^{O(\ell )}\; \left( t_{j-1}^{\ell +1}+1\right) \nonumber \\&\le e^{O(W\log N)}\Big (1+\mathfrak {L}^{-\frac{1}{2}}\Big )^{O(\ell )}\; \left( t_{j-1}^{\ell +1}+1\right) . \end{aligned}$$
(7.35)

Starting from \(j=W\) and using (7.35) to integrate in (7.34) successively, the exponents of the remaining \(t_j\)'s increase only linearly at each step (since \(n_j=O(1)\)), and thus we get

$$\begin{aligned} \int _{\prod _{j=2}^W \mathbb {J}_j} \prod _{j=2}^W \mathrm{d} t_j\; {\mathbb {M}}(\mathbf {t})\mathfrak {q}(\mathbf {t})\le e^{-\frac{\mathfrak {L}}{600W^2}}\cdot e^{O(W^2\log N)}\cdot \Big (1+\mathfrak {L}^{-\frac{1}{2}}\Big )^{O(W^2)} . \end{aligned}$$

Then (7.6) follows from the definition of \(\mathfrak {L}\) in (7.5) and (5.35). This proves (7.6) when the spanning tree is given by (7.29).

Now, we consider a more general spanning tree \(\mathcal {G}_0\) and regard 1 as its root. We start from the generalization of (7.30), namely,

$$\begin{aligned} \mathbb {M}(\mathbf {t})\le \prod _{\{i,j\}\in \mathcal {E}_0}\exp \left\{ -\mathfrak {L}\frac{(t_{j}-t_{i})^2}{1+2t_{i}^2}\right\} :=\prod _{\{i,j\}\in \mathcal {E}_0}\breve{\mathbb {M}}_{i,j}(\mathbf {t}). \end{aligned}$$
(7.36)

Here we make the convention that \(\text {dist}(1,i)=\text {dist}(1,j)-1\) for all \(\{i,j\}\in \mathcal {E}_0\), where \(\text {dist}(a,b)\) represents the graph distance between a and b in \(\mathcal {G}_0\). Now, if there is \(k'\) such that \(\mathbb {J}_{k'}=\mathbb {I}^c\), we can prove the following analogue of (7.32), namely,

$$\begin{aligned} \sum _{\{i,j\}\in \mathcal {E}_0}\frac{(t_{j}-t_{i})^2}{1+2t_{i}^2}\ge \frac{1}{300W^2},\quad \text {if}\quad t_{k'}\in \mathbb {I}^c. \end{aligned}$$

Consequently, we can get the analogue of (7.34) by replacing the \(\breve{\mathbb {M}}_j(\mathbf {t})\)'s by the \(\breve{\mathbb {M}}_{i,j}(\mathbf {t})\)'s. Finally, integrating the \(t_j\)'s out successively, from the leaves towards the root 1, yields the same conclusion, i.e. (7.6), for general \(\mathcal {G}_0\). This completes the proof of Lemma 7.2. \(\square \)

8 Gaussian measure in the vicinities

From now on, we can restrict ourselves to the Type I, II and III vicinities. As a preparation for the proofs of Lemmas 5.8 and 5.9, we will show in this section that the exponential function

$$\begin{aligned} \exp \left\{ -M\big (\mathring{K}(\hat{X},V)+\mathring{L}(\hat{B},T)\big )\right\} \end{aligned}$$
(8.1)

is approximately an (unnormalized) Gaussian measure.

8.1 Parametrization and initial approximation in the vicinities

We change the \(\mathbf {x}, \mathbf {b}, \mathbf {t}, \mathbf {v}\)-variables to a new set of variables, namely, \(\mathring{\mathbf {x}}, \mathring{\mathbf {b}}, \mathring{\mathbf {t}}\) and \(\mathring{\mathbf {v}}\). The precise definition of \(\mathring{x}\) differs in the different vicinities. To distinguish the parameterizations, we set \(\varkappa =\pm , +\), or \(-\), corresponding to the Type I, II or III vicinity, respectively. Recall \(D_\varkappa \) from (1.28). For each j and each \(\varkappa \), we then set

$$\begin{aligned}&\hat{X}_j=D_\varkappa \text {diag}\left( \exp \big \{\mathbf {i}\mathring{x}_{j,1}/\sqrt{M}\big \}, \exp \big \{\mathbf {i}\mathring{x}_{j,2}/\sqrt{M}\big \}\right) ,\quad \mathring{x}_{j,a}/\sqrt{M}\in [-\pi ,\pi ],\nonumber \\&\hat{B}_j=D_{\pm }+D_{\pm } \text {diag}\left( \mathring{b}_{j,1}/\sqrt{M}, \mathring{b}_{j,2}/\sqrt{M}\right) ,\qquad t_j=\mathring{t}_j/\sqrt{M}. \end{aligned}$$
(8.2)

If \(\varkappa =\pm \), we also need to parameterize \(v_j\) by

$$\begin{aligned} v_j=\mathring{v}_j/\sqrt{M}. \end{aligned}$$
(8.3)

Correspondingly, we set the vectors

$$\begin{aligned} \mathring{\mathbf {b}}_a&:=(\mathring{b}_{1,a},\ldots , \mathring{b}_{W,a}),\quad \mathring{\mathbf {x}}_a:=(\mathring{x}_{1,a},\ldots , \mathring{x}_{W,a}), \qquad a=1,2,\\ \mathring{\mathbf {t}}&:=(\mathring{t}_2,\ldots , \mathring{t}_W),\quad \mathring{\mathbf {v}}:=(\mathring{v}_2,\ldots , \mathring{v}_W). \end{aligned}$$

Accordingly, recalling the quantity \(\varTheta \) from (5.30), we introduce the domains

$$\begin{aligned} \mathring{\varUpsilon }&\equiv \mathring{\varUpsilon }(N, \varepsilon _0):=\{\mathbf {a}\in \mathbb {R}^W: ||\mathbf {a}||_2^2\le \varTheta \},\nonumber \\ \mathring{\varUpsilon }_S&\equiv \mathring{\varUpsilon }_S(N,\varepsilon _0):=\{\mathbf {a}\in \mathbb {R}_+^{W-1}: -\mathbf {a}'S^{(1)}\mathbf {a}\le \varTheta \}. \end{aligned}$$
(8.4)

We remind the reader that, as mentioned above, the small constant \(\varepsilon _0\) in \(\mathring{\varUpsilon }\) and \(\mathring{\varUpsilon }_S\) may differ from line to line, subject to (5.34). Now, by the definition of the Type I, II and III vicinities in Definition 5.5 and the parametrizations in (8.2) and (8.3), we can redefine the vicinities as follows.

Definition 8.1

We redefine the three types of vicinities as follows.

  • Type I’ vicinity: \(\big (\mathring{\mathbf {b}}_1,\mathring{\mathbf {b}}_2,\mathring{\mathbf {x}}_{1},\mathring{\mathbf {x}}_{2},\mathring{\mathbf {t}},\mathring{\mathbf {v}}\big )\in \mathring{\varUpsilon }\times \mathring{\varUpsilon }\times \mathring{\varUpsilon }\times \mathring{\varUpsilon } \times \mathring{\varUpsilon }_S\times \mathring{\varUpsilon }_S \), with \(\varkappa =\pm \).

  • Type II vicinity: \(\big (\mathring{\mathbf {b}}_1,\mathring{\mathbf {b}}_2,\mathring{\mathbf {x}}_{1},\mathring{\mathbf {x}}_{2},\mathring{\mathbf {t}},\mathbf {v}\big )\in \mathring{\varUpsilon }\times \mathring{\varUpsilon }\times \mathring{\varUpsilon }\times \mathring{\varUpsilon } \times \mathring{\varUpsilon }_S\times \mathbb {I}^{W-1}\), with \(\varkappa =+\).

  • Type III vicinity: \(\big (\mathring{\mathbf {b}}_1,\mathring{\mathbf {b}}_2,\mathring{\mathbf {x}}_{1},\mathring{\mathbf {x}}_{2},\mathring{\mathbf {t}},\mathbf {v}\big )\in \mathring{\varUpsilon }\times \mathring{\varUpsilon }\times \mathring{\varUpsilon }\times \mathring{\varUpsilon } \times \mathring{\varUpsilon }_S\times \mathbb {I}^{W-1}\), with \(\varkappa =-\).

From the argument leading to (7.8), we record the fact

$$\begin{aligned} \mathring{\mathbf {t}}\in \mathring{\varUpsilon }_S\Longrightarrow ||\mathring{\mathbf {t}}||_\infty =O(\varTheta ). \end{aligned}$$
(8.5)
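A sketch of (8.5): with the convention \(\mathring{t}_1=0\) and \(||\mathring{\mathbf {t}}||_\infty =|\mathring{t}_\alpha |\), the path argument from the proof of Lemma 7.4 gives

$$\begin{aligned} -\mathring{\mathbf {t}}'S^{(1)}\mathring{\mathbf {t}}=\frac{1}{2}\sum _{j,k}\mathfrak {s}_{jk}(\mathring{t}_j-\mathring{t}_k)^2\ge \frac{c}{W}||\mathring{\mathbf {t}}||_\infty ^2, \end{aligned}$$

so \(\mathring{\mathbf {t}}\in \mathring{\varUpsilon }_S\) forces \(||\mathring{\mathbf {t}}||_\infty \le \sqrt{W\varTheta /c}=O(\varTheta )\), the last step using \(\varTheta ^2\gg W^2\log N\) (see Remark 7.3), whence \(W\le \varTheta \).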

Now, we use the representation (5.2). Then, for the Type I vicinity, we change \(\mathbf {x},\mathbf {b},\mathbf {t},\mathbf {v}\)-variables to \(\mathring{\mathbf {x}},\mathring{\mathbf {b}},\mathring{\mathbf {t}},\mathring{\mathbf {v}}\)-variables according to (8.2) with \(\varkappa =\pm \), thus

$$\begin{aligned}&2^W\mathcal {I}\big (\varUpsilon ^b_+, \varUpsilon ^b_-, \varUpsilon ^x_+, \varUpsilon ^x_-, \varUpsilon _S,\varUpsilon _S\big )=\frac{M^{2}}{(n!)^24^W\pi ^{2W+4}}\int _{\mathbb {L}^{2W-2}} \prod _{j=2}^W\frac{\mathrm{d} \theta _j}{2\pi }\prod _{j=2}^W \frac{\mathrm{d} \sigma _j}{2\pi }\nonumber \\&\quad \times \int _{\mathring{\varUpsilon }} \prod _{j=1}^W \mathrm{d} \mathring{b}_{j,1} \int _{\mathring{\varUpsilon }} \prod _{j=1}^W \mathrm{d} \mathring{b}_{j,2} \int _{\mathring{\varUpsilon }} \prod _{j=1}^W \mathrm{d} \mathring{x}_{j,1} \int _{\mathring{\varUpsilon }} \prod _{j=1}^W \mathrm{d} \mathring{x}_{j,2} \int _{\mathring{\varUpsilon }_S} \prod _{j=2}^W 2\mathring{t}_j \mathrm{d} \mathring{t}_j\nonumber \\&\quad \times \int _{\mathring{\varUpsilon }_S} \prod _{j=2}^W 2\mathring{v}_j \mathrm{d} \mathring{v}_j\; \prod _{j=1}^{W} \exp \Big \{\mathbf {i}\frac{\mathring{x}_{j,1}+\mathring{x}_{j,2}}{\sqrt{M}}\Big \} \times \exp \left\{ -M\big (\mathring{K}(\hat{X},V)+\mathring{L}(\hat{B},T)\big )\right\} \nonumber \\&\quad \times \prod _{j=1}^W (x_{j,1}-x_{j,2})^2(b_{j,1}+b_{j,2})^2\cdot \mathsf {A}(\hat{X}, \hat{B}, V, T). \end{aligned}$$
(8.6)

For the Type II or III vicinities, i.e. \(\varkappa =+\) or \(-\), we change \(\mathbf {x},\mathbf {b},\mathbf {t}\)-variables to \(\mathring{\mathbf {x}},\mathring{\mathbf {b}},\mathring{\mathbf {t}}\)-variables. Consequently, we have

$$\begin{aligned}&\mathcal {I}\big (\varUpsilon ^b_+, \varUpsilon ^b_-, \varUpsilon ^x_\varkappa , \varUpsilon ^x_\varkappa , \varUpsilon _S,\mathbb {I}^{W-1}\big )=\frac{(-a_\varkappa ^2)^{W}}{(n!)^2}\cdot \frac{M^{W+1}}{8^W\pi ^{2W+4}}\cdot \int _{\mathbb {L}^{2W-2}} \prod _{j=2}^W\frac{\mathrm{d} \theta _j}{2\pi }\prod _{j=2}^W \frac{\mathrm{d} \sigma _j}{2\pi }\nonumber \\&\quad \times \int _{\mathring{\varUpsilon }} \prod _{j=1}^W \mathrm{d} \mathring{b}_{j,1} \int _{\mathring{\varUpsilon }} \prod _{j=1}^W \mathrm{d} \mathring{b}_{j,2} \int _{\mathring{\varUpsilon }} \prod _{j=1}^W \mathrm{d} \mathring{x}_{j,1} \int _{\mathring{\varUpsilon }} \prod _{j=1}^W \mathrm{d} \mathring{x}_{j,2} \int _{\mathring{\varUpsilon }_S} \prod _{j=2}^W 2\mathring{t}_j \mathrm{d} \mathring{t}_j \nonumber \\&\quad \times \int _{\mathbb {I}^{W-1}} \prod _{j=2}^W 2v_j \mathrm{d} v_j\; \prod _{j=1}^{W} \exp \left\{ \mathbf {i}\frac{\mathring{x}_{j,1}+\mathring{x}_{j,2}}{\sqrt{M}}\right\} \exp \Big \{-M\big (\mathring{K}(\hat{X},V)+\mathring{L}(\hat{B},T)\big )\Big \}\nonumber \\&\quad \times \prod _{j=1}^W (x_{j,1}-x_{j,2})^2(b_{j,1}+b_{j,2})^2\cdot \mathsf {A}(\hat{X}, \hat{B}, V, T). \end{aligned}$$
(8.7)

We will also need the following facts

$$\begin{aligned} \prod _{j=1}^W |(x_{j,1}-x_{j,2})^2(b_{j,1}+b_{j,2})^2|=e^{O(W)},\qquad |\mathsf {A}(\hat{X},\hat{B}, V, T)|\le e^{O(WN^{\varepsilon _2})} \nonumber \\ \end{aligned}$$
(8.8)

if \(\mathbf {x}_{1},\mathbf {x}_2\in \widehat{\varSigma }^W, b_{j,1}=a_++o(1), b_{j,2}=-a_-+o(1)\) and \(t_j=o(1)\), for all \(j=1,\ldots , W\), which always holds in these types of vicinities. The first estimate in (8.8) is trivial, and the second follows from Lemma 6.1.

Now, we approximate (8.1) in the vicinities. We introduce the matrices \(\mathcal {E}_+(\vartheta )=\text {diag}(e^{\mathbf {i}\vartheta }, e^{-\mathbf {i}\vartheta })\mathfrak {I}\) and \(\mathcal {E}_-(\vartheta )=\text {diag}( e^{\mathbf {i}\vartheta }, -e^{-\mathbf {i}\vartheta } )\mathfrak {I}\) for any \(\vartheta \in \mathbb {L}\). Then, with the parameterization above, expanding \(\hat{X}_j\) in (3.23) and \(T_j\) in (3.25) up to the second order, we can write

$$\begin{aligned}&T_j=I+\frac{\mathring{t}_j}{\sqrt{M}}\mathcal {E}_+(\sigma _j)+\frac{1}{M}R^t_j,\quad \hat{X}_j=D_\varkappa +\frac{\mathbf {i}}{\sqrt{M}}D_\varkappa \text {diag}(\mathring{x}_{j,1},\mathring{x}_{j,2})+\frac{1}{M}R^x_j,\nonumber \\&\quad \varkappa =\pm , +,-. \end{aligned}$$
(8.9)

For \(\varkappa =\pm \), we also expand \(V_j\) in (3.25) up to the second order, namely,

$$\begin{aligned} V_j=I+\frac{\mathring{v}_j}{\sqrt{M}}\mathcal {E}_-(\theta _j)+\frac{1}{M}R^v_j. \end{aligned}$$
(8.10)

We just take (8.9) and (8.10) as the definitions of \(R^x_j, R^t_j\) and \(R^v_j\). Note that \(R^x_j\) is actually \(\varkappa \)-dependent. However, this dependence is irrelevant for our analysis and is therefore suppressed in the notation. It is elementary that

$$\begin{aligned} ||R^x_j||_{\max }=O\left( \mathring{x}_{j,1}^2+\mathring{x}_{j,2}^2\right) ,\quad ||R^t_j||_{\max }=O\left( \mathring{t}_j^2\right) ,\quad ||R^v_j||_{\max }=O\left( \mathring{v}_j^2\right) . \nonumber \\ \end{aligned}$$
(8.11)
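To illustrate where (8.11) comes from, consider \(R^t_j\). Assuming, consistently with the identity \(|(T_kT_j^{-1})_{12}|=|s_jt_ke^{\mathbf {i}\sigma _k}-s_kt_je^{\mathbf {i}\sigma _j}|\) used in Sect. 7 and with \(\mathfrak {I}\) read as the \(2\times 2\) flip matrix, that \(T_j\) has the matrix form below with \(s_j=\sqrt{1+t_j^2}\) and \(t_j=\mathring{t}_j/\sqrt{M}\), we find

$$\begin{aligned} T_j=\begin{pmatrix} s_j &{} t_je^{\mathbf {i}\sigma _j}\\ t_je^{-\mathbf {i}\sigma _j} &{} s_j \end{pmatrix}=I+\frac{\mathring{t}_j}{\sqrt{M}}\,\mathcal {E}_+(\sigma _j)+\frac{1}{M}R^t_j,\qquad R^t_j=M(s_j-1)I=\frac{\mathring{t}_j^2}{s_j+1}\,I, \end{aligned}$$

so that indeed \(||R^t_j||_{\max }=O(\mathring{t}_j^2)\); the expansions of \(\hat{X}_j\) and \(V_j\) are analogous.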

Recall the facts (5.10) and (5.20). In addition, in light of (5.18)–(5.20), we can also represent \(\mathring{K}(\hat{X},V)\) in the following two alternative ways

$$\begin{aligned} \mathring{K}(\hat{X},V)&=\left( -\mathring{\ell }_{++}(\mathbf {x}_1)-\mathring{\ell }_{++}(\mathbf {x}_2)\right) +\ell _S(\hat{X},V)+\big (K(D_{+},I)-K(D_{\pm },I)\big ), \end{aligned}$$
(8.12)
$$\begin{aligned} \mathring{K}(\hat{X},V)&= \left( -\mathring{\ell }_{+-}(\mathbf {x}_1)-\mathring{\ell }_{+-}(\mathbf {x}_2)\right) +\ell _S(\hat{X},V)+\big (K(D_{-},I)-K(D_{\pm },I)\big ). \end{aligned}$$
(8.13)

We will use the representations of \(\mathring{K}(\hat{X},V)\) in (8.12) and (8.13) for the Type II and III vicinities, respectively. In addition, we introduce the matrices

$$\begin{aligned} \mathbb {A}_+:=(1+a_+^2)I+a_+^2S,\qquad \mathbb {A}_-:=(1+a_-^2)I+a_-^2S. \end{aligned}$$
(8.14)

Then, we have the following lemma.

Lemma 8.2

With the parametrization in (8.9), we have the following approximations.

  • Let \(\mathring{\mathbf {b}}_1, \mathring{\mathbf {b}}_2\in \mathbb {C}^{W}\) with \(||\mathring{\mathbf {b}}_1||_\infty , ||\mathring{\mathbf {b}}_2||_\infty =o(\sqrt{M})\). Then we have

    $$\begin{aligned}&M\left( \mathring{\ell }_{++}(\mathbf {b}_1)+\mathring{\ell }_{--}(\mathbf {b}_2)\right) =\frac{1}{2}\mathring{\mathbf {b}}_1'\mathbb {A}_+\mathring{\mathbf {b}}_1+\frac{1}{2}\mathring{\mathbf {b}}_2'\mathbb {A}_-\mathring{\mathbf {b}}_2+R^b,\nonumber \\&\quad \quad R^b=O\left( \frac{\sum _{a=1,2}||\mathring{\mathbf {b}}_{a}||_3^3}{\sqrt{M}}\right) . \end{aligned}$$
    (8.15)
  • Let \(\varkappa =\pm \) and \(\mathring{\mathbf {x}}_1,\mathring{\mathbf {x}}_2\in \mathbb {C}^{W}\) with \(||\mathring{\mathbf {x}}_1||_\infty ,||\mathring{\mathbf {x}}_2||_\infty =o(\sqrt{M})\). Then we have

    $$\begin{aligned}&M\left( -\mathring{\ell }_{++}(\mathbf {x}_1)-\mathring{\ell }_{+-}(\mathbf {x}_2)\right) =\frac{1}{2}\mathring{\mathbf {x}}_1'\mathbb {A}_+\mathring{\mathbf {x}}_1+\frac{1}{2}\mathring{\mathbf {x}}_2'\mathbb {A}_-\mathring{\mathbf {x}}_2+R^x_\pm ,\nonumber \\&\quad R^x_\pm =O\left( \frac{\sum _{a=1,2}||\mathring{\mathbf {x}}_{a}||_3^3}{\sqrt{M}}\right) . \end{aligned}$$
    (8.16)
  • In the Type II vicinity, we have

    $$\begin{aligned} M\left( -\mathring{\ell }_{++}(\mathbf {x}_1)-\mathring{\ell }_{++}(\mathbf {x}_2)\right) =\frac{1}{2}\mathring{\mathbf {x}}_1'\mathbb {A}_+\mathring{\mathbf {x}}_1+\frac{1}{2}\mathring{\mathbf {x}}_2'\mathbb {A}_+\mathring{\mathbf {x}}_2+R^x_+,\quad R^x_+=O\left( \frac{\varTheta ^{\frac{3}{2}}}{\sqrt{M}}\right) . \nonumber \\ \end{aligned}$$
    (8.17)
  • In the Type III vicinity, we have

    $$\begin{aligned} M\left( -\mathring{\ell }_{+-}(\mathbf {x}_1)-\mathring{\ell }_{+-}(\mathbf {x}_2)\right) =\frac{1}{2}\mathring{\mathbf {x}}_1'\mathbb {A}_-\mathring{\mathbf {x}}_1+\frac{1}{2}\mathring{\mathbf {x}}_2'\mathbb {A}_-\mathring{\mathbf {x}}_2+R^x_-,\quad R^x_-=O\left( \frac{\varTheta ^{\frac{3}{2}}}{\sqrt{M}}\right) . \nonumber \\ \end{aligned}$$
    (8.18)

Here \(R^b, R^x_\pm , R^x_+\) and \(R^x_-\) are remainder terms of the Taylor expansion of the function \(\ell (\mathbf {a})\) defined in (5.6).

Proof

This follows easily from the Taylor expansion of the function \(\ell (\mathbf {a})\). \(\square \)

Then, according to (5.10), (5.20), (8.12) and (8.13), what remains is to approximate \(M\ell _S(\hat{B},T)\) and \(M\ell _S(\hat{X},V)\) in the vicinities. Recalling the definition in (5.6) and the parameterization in (8.2), we can rewrite

$$\begin{aligned} M\ell _S(\hat{B},T)&=\frac{1}{2}\sum _{j,k} \mathfrak {s}_{jk}|s_j\mathring{t}_ke^{\mathbf {i}\sigma _k}-s_k\mathring{t}_je^{\mathbf {i}\sigma _j}|^2\prod _{\ell =j,k}\left( a_+-a_-+\frac{a_+\mathring{b}_{\ell ,1}-a_-\mathring{b}_{\ell ,2}}{\sqrt{M}}\right) \nonumber \\&=: \frac{(a_+-a_-)^2}{2}\sum _{j,k} \mathfrak {s}_{jk}|\mathring{t}_ke^{\mathbf {i}\sigma _k}-\mathring{t}_je^{\mathbf {i}\sigma _j}|^2+R^{t,b}. \end{aligned}$$
(8.19)

We take the above equation as the definition of \(R^{t,b}\). Now, we set

$$\begin{aligned} \tau _{j,1}:=\mathring{t}_j\cos \sigma _j,\qquad \tau _{j,2}:=\mathring{t}_j\sin \sigma _j,\quad \forall \; j=2,\ldots , W \end{aligned}$$
(8.20)

and change the variables and the measure as

$$\begin{aligned} (\mathring{t}_j,\sigma _j)\rightarrow (\tau _{j,1},\tau _{j,2}),\qquad 2\mathring{t}_j\mathrm{d} \mathring{t}_j\frac{\mathrm{d} \sigma _j}{2\pi }\rightarrow \frac{1}{\pi } \mathrm{d} \tau _{j,1}\mathrm{d} \tau _{j,2}. \end{aligned}$$
(8.21)

In the Type I’ vicinity, we can do the same thing for \(M\ell _S(\hat{X},V)\), namely,

$$\begin{aligned} M\ell _S(\hat{X},V)=:\frac{(a_+-a_-)^2}{2}\sum _{j,k} \mathfrak {s}_{jk}|\mathring{v}_ke^{\mathbf {i}\theta _k}-\mathring{v}_je^{\mathbf {i}\theta _j}|^2+R^{v,x}_\pm , \end{aligned}$$
(8.22)

where \(R^{v,x}_\pm \) is the remainder term. Then we set

$$\begin{aligned} \upsilon _{j,1}:=\mathring{v}_j\cos \theta _j,\qquad \upsilon _{j,2}:=\mathring{v}_j\sin \theta _j,\quad \forall \; j=2,\ldots , W \end{aligned}$$
(8.23)

and change the variables and measure as

$$\begin{aligned} (\mathring{v}_j,\theta _j)\rightarrow (\upsilon _{j,1},\upsilon _{j,2}),\qquad 2\mathring{v}_j\mathrm{d} \mathring{v}_j\frac{\mathrm{d} \theta _j}{2\pi }\rightarrow \frac{1}{\pi } \mathrm{d} \upsilon _{j,1}\mathrm{d} \upsilon _{j,2}. \end{aligned}$$
(8.24)

Now, we introduce the vectors

$$\begin{aligned} \varvec{\tau }_a=(\tau _{2,a},\ldots , \tau _{W,a}),\qquad \varvec{\upsilon }_a=(\upsilon _{2,a},\ldots , \upsilon _{W,a}),\qquad a=1,2. \end{aligned}$$

With this notation, we can rewrite (8.19) and (8.22) as

$$\begin{aligned} M\ell _S(\hat{B},T)&=-(a_+-a_-)^2\sum _{a=1,2}\varvec{\tau }'_aS^{(1)}\varvec{\tau }_a+R^{t,b},\nonumber \\ M\ell _S(\hat{X},V)&=-(a_+-a_-)^2\sum _{a=1,2}\varvec{\upsilon }'_aS^{(1)}\varvec{\upsilon }_a+R^{v,x}_\pm . \end{aligned}$$
(8.25)
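The leading term in the first line of (8.25) is obtained from (8.19) by the polar–Cartesian correspondence (8.20): since \(|\mathring{t}_ke^{\mathbf {i}\sigma _k}-\mathring{t}_je^{\mathbf {i}\sigma _j}|^2=\sum _{a=1,2}(\tau _{j,a}-\tau _{k,a})^2\), we have

$$\begin{aligned} \frac{(a_+-a_-)^2}{2}\sum _{j,k} \mathfrak {s}_{jk}\big |\mathring{t}_ke^{\mathbf {i}\sigma _k}-\mathring{t}_je^{\mathbf {i}\sigma _j}\big |^2=\frac{(a_+-a_-)^2}{2}\sum _{a=1,2}\sum _{j,k} \mathfrak {s}_{jk}(\tau _{j,a}-\tau _{k,a})^2=-(a_+-a_-)^2\sum _{a=1,2}\varvec{\tau }'_aS^{(1)}\varvec{\tau }_a, \end{aligned}$$

where the last step uses \(-\mathbf {a}'S^{(1)}\mathbf {a}=\frac{1}{2}\sum _{j,k}\mathfrak {s}_{jk}(a_j-a_k)^2\) (with \(a_1=0\)), as in the proof of Lemma 7.4; the \(\varvec{\upsilon }\)-identity is obtained in exactly the same way.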

According to (8.21) and (8.24), we can express (8.6) as an integral over the \(\mathring{\mathbf {b}}, \mathring{\mathbf {x}}, \varvec{\tau }\) and \(\varvec{\upsilon }\)-variables. However, we need to specify the domains of the \(\varvec{\tau }\) and \(\varvec{\upsilon }\)-variables in advance. Our aim is to restrict the integral to the domains

$$\begin{aligned} \varvec{\tau }_a\in \mathring{\varUpsilon }_S,\quad \varvec{\upsilon }_a\in \mathring{\varUpsilon }_S,\qquad a=1,2. \end{aligned}$$
(8.26)

To see that the truncation from the domain \((\mathring{\mathbf {t}},\mathring{\mathbf {v}},\varvec{\sigma },\varvec{\theta })\in \mathring{\varUpsilon }_S\times \mathring{\varUpsilon }_S\times \mathbb {L}^{W-1}\times \mathbb {L}^{W-1}\) to (8.26) is harmless in the integral (8.6), we need to bound \(R^{t,b}\) and \(R^{v,x}_\pm \) in terms of \(\varvec{\tau }'_aS^{(1)}\varvec{\tau }_a\) and \(\varvec{\upsilon }'_aS^{(1)}\varvec{\upsilon }_a\), respectively. By (8.5), we have \(s_j=1+O(\varTheta ^{2}/M)\) for all \(j=2,\ldots , W\), which also implies

$$\begin{aligned} |s_j\mathring{t}_k e^{\mathbf {i}\sigma _k}-s_k\mathring{t}_je^{\mathbf {i}\sigma _j}|^2&=\sum _{a=1,2}(\tau _{j,a}-\tau _{k,a})^2+O\left( \frac{\varTheta ^{3}}{M}\right) \sum _{a=1,2}|\tau _{j,a}-\tau _{k,a}|\nonumber \\&\quad +O\left( \frac{\varTheta ^{6}}{M^2}\right) . \end{aligned}$$
(8.27)

Then, by the definition of \(R^{t,b}\) in (8.19), it is not difficult to check that in the Type I’ vicinity, we have

$$\begin{aligned} |R^{t,b}|&\le O\left( \frac{\varTheta ^{\frac{1}{2}}}{\sqrt{M}}\right) \sum _{a=1,2} \big (-\varvec{\tau }_a'S^{(1)}\varvec{\tau }_a\big )\nonumber \\&\quad +O\left( \frac{\varTheta ^{\frac{7}{2}}}{M}\right) \sum _{a=1,2}\big (-\varvec{\tau }_a'S^{(1)}\varvec{\tau }_a\big )^{1/2}+O\left( \frac{\varTheta ^{7}}{M^2}\right) . \end{aligned}$$
(8.28)

The same estimate holds if we replace \(R^{t,b}\) by \(R^{v,x}_\pm \) and \(\varvec{\tau }_a\) by \(\varvec{\upsilon }_a\). Then it is obvious that if one of \(\varvec{\tau }_1, \varvec{\tau }_2, \varvec{\upsilon }_1\) and \(\varvec{\upsilon }_2\) is not in \(\mathring{\varUpsilon }_S\), we will get (7.21). Hence, using (8.8), we can discard the integral outside the vicinity, analogously to the proof of Lemma 7.7. Accordingly, in the sequel, we can and do assume (8.26). Now, plugging (8.26) into (8.28) in turn yields the bounds

$$\begin{aligned} |R^{t,b}|=O\left( \frac{\varTheta ^{\frac{3}{2}}}{\sqrt{M}}\vee \frac{\varTheta ^{4}}{M}\right) ,\quad |R^{v,x}_\pm |=O\left( \frac{\varTheta ^{\frac{3}{2}}}{\sqrt{M}}\vee \frac{\varTheta ^{4}}{M}\right) . \end{aligned}$$
(8.29)

By the discussion above, for the Type I vicinity, we can write (8.6) as

$$\begin{aligned}&2^W\mathcal {I}(\varUpsilon ^b_+, \varUpsilon ^b_-, \varUpsilon ^x_+, \varUpsilon ^x_-, \varUpsilon _S,\varUpsilon _S)=\frac{M^{2}}{(n!)^24^{W}\pi ^{4W+2}}\cdot \int _{\mathring{\varUpsilon }} \prod _{j=1}^W \mathrm{d} \mathring{b}_{j,1} \int _{\mathring{\varUpsilon }} \prod _{j=1}^W \mathrm{d} \mathring{b}_{j,2} \nonumber \\&\quad \times \int _{\mathring{\varUpsilon }} \prod _{j=1}^W \mathrm{d} \mathring{x}_{j,1} \int _{\mathring{\varUpsilon }} \prod _{j=1}^W \mathrm{d} \mathring{x}_{j,2} \int _{\mathring{\varUpsilon }_S} \prod _{j=2}^W \mathrm{d} \tau _{j,1}\int _{\mathring{\varUpsilon }_S} \prod _{j=2}^W \mathrm{d} \tau _{j,2} \int _{\mathring{\varUpsilon }_S} \prod _{j=2}^W \mathrm{d} \upsilon _{j,1}\int _{\mathring{\varUpsilon }_S} \prod _{j=2}^W \mathrm{d} \upsilon _{j,2} \nonumber \\&\quad \times \exp \Big \{-\frac{1}{2}\mathring{\mathbf {b}}_1'\mathbb {A}_+\mathring{\mathbf {b}}_1-\frac{1}{2}\mathring{\mathbf {b}}_2'\mathbb {A}_-\mathring{\mathbf {b}}_2-R^b\Big \}\exp \Big \{-\frac{1}{2}\mathring{\mathbf {x}}_1'\mathbb {A}_+\mathring{\mathbf {x}}_1-\frac{1}{2}\mathring{\mathbf {x}}_2'\mathbb {A}_-\mathring{\mathbf {x}}_2-R^x_\pm \Big \}\nonumber \\&\quad \times \exp \Big \{(a_+-a_-)^2\sum _{a=1,2}\varvec{\tau }'_aS^{(1)}\varvec{\tau }_a-R^{t,b}\Big \}\exp \Big \{(a_+-a_-)^2\sum _{a=1,2}\varvec{\upsilon }'_aS^{(1)}\varvec{\upsilon }_a-R^{v,x}_\pm \Big \}\nonumber \\&\quad \times \prod _{j=1}^{W}\exp \Big \{\mathbf {i}\frac{\mathring{x}_{j,1}+\mathring{x}_{j,2}}{\sqrt{M}}\Big \} \prod _{j=1}^W (x_{j,1}-x_{j,2})^2(b_{j,1}+b_{j,2})^2\cdot \mathsf {A}(\hat{X}, \hat{B}, V, T)\nonumber \\&\quad +O(e^{-\varTheta }), \end{aligned}$$
(8.30)

where the error term stems from the truncation of the vicinity \((\mathring{\mathbf {t}},\mathring{\mathbf {v}}, \varvec{\sigma },\varvec{\theta })\in \mathring{\varUpsilon }_S\times \mathring{\varUpsilon }_S\times \mathbb {L}^{W-1}\times \mathbb {L}^{W-1}\) to \((\varvec{\tau }_1, \varvec{\tau }_2,\varvec{\upsilon }_1, \varvec{\upsilon }_2)\in \mathring{\varUpsilon }_S\times \mathring{\varUpsilon }_S\times \mathring{\varUpsilon }_S\times \mathring{\varUpsilon }_S\).

Now, for the Type II and III vicinities, the discussion on \(\ell _S(\hat{B},T)\) is of course the same. For \(\ell _S(\hat{X},V)\), we make the following approximation. For the Type II vicinity, using the notation in (5.22), we can write

$$\begin{aligned} M\ell _S(\hat{X},V)&=\frac{-Ma_+^2}{2}\sum _{j,k} \mathfrak {s}_{jk}^v\prod _{\ell =j,k}\left( \frac{\mathring{x}_{\ell ,1}}{\sqrt{M}}-\frac{\mathring{x}_{\ell ,2}}{\sqrt{M}}+O\Big (\frac{\mathring{x}_{\ell ,1}^2+\mathring{x}_{\ell ,2}^2}{M}\Big )\right) \nonumber \\&=:-\frac{a_+^2}{2}\sum _{j,k} \mathfrak {s}_{jk}^v(\mathring{x}_{j,1}-\mathring{x}_{j,2})(\mathring{x}_{k,1}-\mathring{x}_{k,2})+R^{v,x}_{+}. \end{aligned}$$
(8.31)

It is easy to see that

$$\begin{aligned} R^{v,x}_{+}=O\left( \frac{||\mathring{\mathbf {x}}_1||_3^3+||\mathring{\mathbf {x}}_2||_3^3}{\sqrt{M}}\right) =O\left( \frac{\varTheta ^{\frac{3}{2}}}{\sqrt{M}}\right) ,\quad \text {for}\quad \mathring{\mathbf {x}}_1, \mathring{\mathbf {x}}_2\in \mathring{\varUpsilon }. \end{aligned}$$
(8.32)

Recall \(\mathbb {S}^v\) defined in (5.23). Set

$$\begin{aligned} \mathbb {A}^v_+:=(1+a_+^2)I_{2W}+a_+^2\mathbb {S}^v, \qquad \mathbb {A}^v_-:=(1+a_-^2)I_{2W}+a_-^2\mathbb {S}^v. \end{aligned}$$
(8.33)

Combining (5.16), (5.19), (8.17) and (8.31) we obtain

$$\begin{aligned}&M(K(\hat{X},V)-K(D_{+}, I))=\frac{1}{2}\mathring{\mathbf {x}}_1'\mathbb {A}_+\mathring{\mathbf {x}}_1+\frac{1}{2}\mathring{\mathbf {x}}_2'\mathbb {A}_+\mathring{\mathbf {x}}_2\\&\quad -\frac{a_+^2}{2}\sum _{j,k} \mathfrak {s}_{jk}^v(\mathring{x}_{j,1}-\mathring{x}_{j,2})(\mathring{x}_{k,1}-\mathring{x}_{k,2})+R^x_++R^{v,x}_{+}=\frac{1}{2}\mathring{\mathbf {x}}'\mathbb {A}^v_+\mathring{\mathbf {x}}+R^x_++R^{v,x}_{+}, \end{aligned}$$

where

$$\begin{aligned} \mathring{\mathbf {x}}:=(\mathring{\mathbf {x}}_1',\mathring{\mathbf {x}}_2')'. \end{aligned}$$

Analogously, for the Type III vicinity, we can write

$$\begin{aligned} M(K(\hat{X},V)-K(D_{-}, I))=\frac{1}{2}\mathring{\mathbf {x}}'\mathbb {A}^v_-\mathring{\mathbf {x}}+R^x_-+R^{v,x}_{-}, \end{aligned}$$
(8.34)

where

$$\begin{aligned} R^{v,x}_{-}=O\Big (\frac{\varTheta ^{\frac{3}{2}}}{\sqrt{M}}\Big ). \end{aligned}$$
(8.35)

Consequently, by (8.12) and (8.13) we can write (8.7) for \(\varkappa =+,-\) as

$$\begin{aligned}&\mathcal {I}(\varUpsilon ^b_+, \varUpsilon ^b_-, \varUpsilon ^x_{\varkappa }, \varUpsilon ^x_{\varkappa }, \varUpsilon _S,\mathbb {I}^{W-1})=\exp \Big \{M\big (K(D_{\pm },I)-K(D_{\varkappa },I)\big )\Big \}\nonumber \\&\quad \times \frac{(-a_\varkappa ^2)^{W}}{(n!)^2}\cdot \frac{M^{W+1}}{8^W\pi ^{3W+3}}\times \int _{\mathbb {L}^{W-1}} \prod _{j=2}^W\frac{\mathrm{d} \theta _j}{2\pi } \int _{\mathbb {I}^{W-1}} \prod _{j=2}^W 2v_j \mathrm{d} v_j\int _{\mathring{\varUpsilon }} \prod _{j=1}^W \mathrm{d} \mathring{b}_{j,1}\nonumber \\&\quad \times \int _{\mathring{\varUpsilon }} \prod _{j=1}^W \mathrm{d} \mathring{b}_{j,2} \int _{\mathring{\varUpsilon }} \prod _{j=1}^W \mathrm{d} \mathring{x}_{j,1} \int _{\mathring{\varUpsilon }} \prod _{j=1}^W \mathrm{d} \mathring{x}_{j,2} \times \int _{\mathring{\varUpsilon }_S} \prod _{j=2}^W \mathrm{d} \tau _{j,1}\int _{\mathring{\varUpsilon }_S} \prod _{j=2}^W \mathrm{d} \tau _{j,2} \nonumber \\&\quad \times \exp \Big \{-\frac{1}{2}\mathring{\mathbf {b}}_1'\mathbb {A}_+\mathring{\mathbf {b}}_1-\frac{1}{2}\mathring{\mathbf {b}}_2'\mathbb {A}_-\mathring{\mathbf {b}}_2-R^b\Big \}\times \exp \Big \{-\frac{1}{2}\mathring{\mathbf {x}}'\mathbb {A}_{\varkappa }^v\mathring{\mathbf {x}}-R^x_{\varkappa }-R^{v,x}_{{\varkappa }}\Big \}\nonumber \\&\quad \times \exp \Big \{(a_+-a_-)^2\sum _{a=1,2}\varvec{\tau }'_aS^{(1)}\varvec{\tau }_a-R^{t,b}\Big \}\times \prod _{j=1}^{W} \exp \Big \{\mathbf {i}\frac{\mathring{x}_{j,1}+\mathring{x}_{j,2}}{\sqrt{M}}\Big \}\nonumber \\&\quad \times \prod _{j=1}^W (x_{j,1}-x_{j,2})^2\times (b_{j,1}+b_{j,2})^2\mathsf {A}(\hat{X}, \hat{B}, V, T)+O(e^{-\varTheta }). \end{aligned}$$
(8.36)

8.2 Steepest descent paths in the vicinities

In order to estimate the integrals (8.30) and (8.36) properly, we need to control the various remainder terms in (8.30) and (8.36), in order to reduce these integrals to Gaussian ones. The final result is collected in Proposition 8.4 at the end of this section. As a preparation, we shall further deform the contours of the \(\mathring{\mathbf {b}}\)-variables and \(\mathring{\mathbf {x}}\)-variables to the steepest descent paths. We mainly provide the discussion for the \(\mathring{\mathbf {b}}\)-variables; that for the \(\mathring{\mathbf {x}}\)-variables is analogous.

For simplicity, in this section we assume \(0\le E\le \sqrt{2}-\kappa \); the case \(-\sqrt{2}+\kappa \le E\le 0\) can be treated similarly. We introduce the eigendecomposition of S as

$$\begin{aligned} S=\mathsf {U}\hat{S}\mathsf {U}'. \end{aligned}$$

Note that \(\mathsf {U}\) is an orthogonal matrix, so its entries are all real. Now, we perform the change of coordinates

$$\begin{aligned} \mathbf {c}_a=(c_{1,a},\ldots , c_{W,a})':=\mathsf {U}'\mathring{\mathbf {b}}_a,\quad a=1,2. \end{aligned}$$

Obviously, for the differentials, we have \(\prod _{j=1}^W \mathrm{d}\mathring{b}_{j,a}=\prod _{j=1}^W \mathrm{d} c_{j,a}\) for \(a=1,2\). In addition, for the domains, it is elementary to see

$$\begin{aligned} \mathring{\mathbf {b}}_a\in \mathring{\varUpsilon }\Longleftrightarrow \mathbf {c}_a\in \mathring{\varUpsilon },\quad a=1,2. \end{aligned}$$
(8.37)

Now, we introduce the notation

$$\begin{aligned} \gamma _j^+:={1}/{\sqrt{1+a_+^2+a_+^2\lambda _j(S)}},\qquad \gamma _j^-:={1}/{\sqrt{1+a_-^2+a_-^2\lambda _j(S)}}, \end{aligned}$$

and set the diagonal matrices

$$\begin{aligned} \mathbb {D}_+:=\text {diag}(\gamma _1^+,\ldots , \gamma _W^+),\qquad \mathbb {D}_-:=\text {diag}(\gamma _1^-,\ldots , \gamma _W^-). \end{aligned}$$

By the assumption \(0\le E\le \sqrt{2}-\kappa \) and (1.5), it is not difficult to check

$$\begin{aligned}&|\gamma _j^+|\sim 1,\quad |\gamma _j^-|\sim 1,\qquad \arg \gamma _j^+\in \left( -\frac{\pi }{8},0\right] ,\qquad \arg \gamma _j^-\in \left[ 0,\frac{\pi }{8}\right) ,\nonumber \\&\quad \forall \; j=1,\ldots , W. \end{aligned}$$
(8.38)

With the notation introduced above, we have

$$\begin{aligned} \mathring{\mathbf {b}}_1'\mathbb {A}_+\mathring{\mathbf {b}}_1=\mathbf {c}_{1}'\mathbb {D}_+^{-2}\mathbf {c}_1,\quad \mathring{\mathbf {b}}_2'\mathbb {A}_-\mathring{\mathbf {b}}_2=\mathbf {c}_{2}'\mathbb {D}_-^{-2}\mathbf {c}_2. \end{aligned}$$

To simplify the following discussion, we enlarge the domain of the \(\mathbf {c}\)-variables to

$$\begin{aligned} \mathbf {c}_a\in \varUpsilon _\infty \equiv \varUpsilon _\infty (\varepsilon ):=[-\varTheta ^{\frac{1}{2}}, \varTheta ^{\frac{1}{2}}]^{W},\quad a=1,2. \end{aligned}$$
(8.39)

Obviously, \(\mathring{\varUpsilon }\subset \varUpsilon _\infty \). It is easy to check that (7.21) also holds when \(\mathbf {c}_a\in \varUpsilon _\infty {\setminus }\mathring{\varUpsilon }\) for either \(a=1\) or 2, according to (8.37); thus such a modification of the domain only produces an error term of order \(O(\exp \{-\varTheta \})\) in the integral (8.30), by (8.8).

Now we do the scaling \(\mathbf {c}_1\rightarrow \mathbb {D}_+\mathbf {c}_{1}\) and \(\mathbf {c}_2\rightarrow \mathbb {D}_-\mathbf {c}_{2}\). Consequently, we have

$$\begin{aligned} \mathring{\mathbf {b}}_1=\mathsf {U}\mathbb {D}_+\mathbf {c}_1,\qquad \mathring{\mathbf {b}}_2=\mathsf {U}\mathbb {D}_-\mathbf {c}_2, \end{aligned}$$
(8.40)

thus

$$\begin{aligned} \mathring{\mathbf {b}}_1'\mathbb {A}_+\mathring{\mathbf {b}}_1=\sum c_{j,1}^2,\quad \mathring{\mathbf {b}}_2'\mathbb {A}_-\mathring{\mathbf {b}}_2=\sum c_{j,2}^2. \end{aligned}$$
(8.41)
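For the reader's convenience, (8.41) can be verified directly from (8.40) and the eigendecomposition \(S=\mathsf {U}\hat{S}\mathsf {U}'\): recalling from (8.14) that \(\mathbb {A}_\pm =(1+a_\pm ^2)I+a_\pm ^2S\), we have \(\mathsf {U}'\mathbb {A}_+\mathsf {U}=(1+a_+^2)I+a_+^2\hat{S}=\mathbb {D}_+^{-2}\), whence

$$\begin{aligned} \mathring{\mathbf {b}}_1'\mathbb {A}_+\mathring{\mathbf {b}}_1=(\mathsf {U}\mathbb {D}_+\mathbf {c}_1)'\mathbb {A}_+\mathsf {U}\mathbb {D}_+\mathbf {c}_1=\mathbf {c}_1'\mathbb {D}_+\mathbb {D}_+^{-2}\mathbb {D}_+\mathbf {c}_1=\sum _j c_{j,1}^2, \end{aligned}$$

and similarly for \(\mathring{\mathbf {b}}_2\), with \(\mathbb {D}_-\) in place of \(\mathbb {D}_+\).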

Accordingly, we should adjust the change of differentials as

$$\begin{aligned} \prod _{j=1}^W \mathrm{d} \mathring{b}_{j,1}\rightarrow \det \mathbb {D}_+\cdot \prod _{j=1}^W \mathrm{d} c_{j,1},\quad \prod _{j=1}^W \mathrm{d} \mathring{b}_{j,2}\rightarrow \det \mathbb {D}_-\cdot \prod _{j=1}^W \mathrm{d} c_{j,2}. \end{aligned}$$

In addition, the domain of \(\mathbf {c}_1\) should be changed from \(\varUpsilon _\infty \) to \(\prod _{j=1}^W \mathbb {J}_j^+\) and that of \(\mathbf {c}_2\) should be changed from \(\varUpsilon _\infty \) to \(\prod _{j=1}^W \mathbb {J}_j^-\), where

$$\begin{aligned} \mathbb {J}_j^+:=(\gamma _j^+)^{-1}[-\varTheta ^{\frac{1}{2}}, \varTheta ^{\frac{1}{2}}],\qquad \mathbb {J}_j^-:=(\gamma _j^-)^{-1}[-\varTheta ^{\frac{1}{2}}, \varTheta ^{\frac{1}{2}}]. \end{aligned}$$

Now, we consider the integrand in (8.30) as a function of \(\mathbf {c}\)-variables on the disks, namely,

$$\begin{aligned} c_{j,1}\in \mathbb {O}_j^+:=\big \{z\in \mathbb {C}: |z|\le \varTheta ^{\frac{1}{2}}|\gamma _j^+|^{-1}\big \}, c_{j,2}\in \mathbb {O}_j^-:=\big \{z\in \mathbb {C}: |z|\le \varTheta ^{\frac{1}{2}}|\gamma _j^-|^{-1}\big \}. \end{aligned}$$

For \(\mathbf {c}_1\in \prod _{j=1}^W\mathbb {O}_j^+\) and \(\mathbf {c}_2\in \prod _{j=1}^W\mathbb {O}_j^-\), by (8.38) and (8.40) we have

$$\begin{aligned} ||\mathring{\mathbf {b}}_1||_\infty , ||\mathring{\mathbf {b}}_2||_\infty \le O(\varTheta ). \end{aligned}$$
(8.42)

Here we used the elementary fact that \(||U\mathbf {a}||_\infty \le ||U\mathbf {a}||_2=||\mathbf {a}||_2\le \sqrt{W}||\mathbf {a}||_\infty \) for any \(\mathbf {a}\in \mathbb {C}^W\) and any unitary matrix U. Then, we deform the contour of \(c_{j,1}\) from \(\mathbb {J}_j^+\) to

$$\begin{aligned} (-\varSigma _{j}^+)\cup \mathbb {L}_j^+\cup \varSigma _j^+ \end{aligned}$$

for each \(j=1,\ldots , W\), where

$$\begin{aligned} \mathbb {L}_j^+:=\mathbb {R}\cap \mathbb {O}_j^+,\qquad \varSigma _j^+=\left\{ z\in \partial \mathbb {O}_j^+: 0\le \arg z\le -\arg \gamma _j^+\right\} . \end{aligned}$$

It is not difficult to see \(\mathsf {Re}c_{j,1}^2\ge \varTheta \) for \( c_{j,1}\in (-\varSigma _j^+)\cup \varSigma _j^+\), by (8.38). Consequently, by (8.41), we have

$$\begin{aligned} \Big |\exp \Big \{-\frac{1}{2}\mathring{\mathbf {b}}_1'\mathbb {A}_+\mathring{\mathbf {b}}_1\Big \}\Big |=\Big |\exp \left\{ -\frac{1}{2}\sum _{j=1}^W c_{j,1}^2\right\} \Big |\le O(e^{-\varTheta }). \end{aligned}$$
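To justify the lower bound \(\mathsf {Re}c_{j,1}^2\ge \varTheta \) used here (up to an immaterial constant factor, which does not affect the estimate \(O(e^{-\varTheta })\)), take \(z\in \varSigma _j^+\); the case \(z\in -\varSigma _j^+\) is identical since \(\mathsf {Re}(-z)^2=\mathsf {Re}z^2\). Then \(|z|=\varTheta ^{\frac{1}{2}}|\gamma _j^+|^{-1}\ge \varTheta ^{\frac{1}{2}}\), because \(|\gamma _j^+|\le 1\) (the singular values of \(\mathbb {A}_+\) are larger than 1, see the proof of Lemma 9.4 below), and \(0\le \arg z\le -\arg \gamma _j^+<\pi /8\) by (8.38). Hence

$$\begin{aligned} \mathsf {Re}z^2=|z|^2\cos (2\arg z)\ge \varTheta \cos \frac{\pi }{4}=\frac{\varTheta }{\sqrt{2}}. \end{aligned}$$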

Then, using (8.8), we can discard the integral over \(\varSigma _j^+\) and \(-\varSigma _j^+\), analogously to the discussion in Sect. 7. The same argument applies to \(\mathbf {c}_2\). Consequently, we can restrict the \(\mathbf {c}_1\) and \(\mathbf {c}_2\) integrals from \(\mathbf {c}_1\in \prod _{j=1}^W \mathbb {J}_j^+\) and \(\mathbf {c}_2\in \prod _{j=1}^W \mathbb {J}_j^-\) to the domains \(\mathbf {c}_1\in \prod _{j=1}^W\mathbb {L}_j^+\) and \(\mathbf {c}_2\in \prod _{j=1}^W \mathbb {L}_j^-\).

By (8.15), (8.42) and the fact \(||\mathbf {a}||_3^3\le ||\mathbf {a}||_\infty ||\mathbf {a}||_2^2\) for any vector \(\mathbf {a}\), we see that

$$\begin{aligned} |R^b|\le C\frac{||\mathring{\mathbf {b}}_1||_\infty +||\mathring{\mathbf {b}}_2||_\infty }{\sqrt{M}}\big (||\mathring{\mathbf {b}}_1||_2^2+||\mathring{\mathbf {b}}_2||_2^2\big )\le \frac{\varTheta }{\sqrt{M}}\big (||\mathbf {c}_1||_2^2+||\mathbf {c}_2||_2^2\big ) \quad \quad \end{aligned}$$
(8.43)

for some positive constant C, where in the last step we also used the fact that \(||\mathring{\mathbf {b}}_a||_2= O(||\mathbf {c}_a||_2)\) for \(a=1,2\), which is implied by (8.40) and (8.38). Consequently, we have

$$\begin{aligned}&\exp \left\{ -\frac{1}{2}||\mathbf {c}_1||_2^2-\frac{1}{2}||\mathbf {c}_2||_2^2-R^b\right\} \nonumber \\&\quad =\exp \left\{ -\left( \frac{1}{2}+o(1)\right) ||\mathbf {c}_1||_2^2-\left( \frac{1}{2}+o(1)\right) ||\mathbf {c}_2||_2^2\right\} . \end{aligned}$$

This allows us to go one step further and truncate \(\mathbf {c}_1\) and \(\mathbf {c}_2\) according to their 2-norms, namely

$$\begin{aligned} \mathbf {c}_1,\mathbf {c}_2\in \mathring{\varUpsilon }. \end{aligned}$$
(8.44)

Similarly to the discussion in the proof of Lemma 7.7, such a truncation only produces an error of order \(O(\exp \{-\varTheta \})\) in the integral, by (8.8).

Now, analogously to (8.40), we can change \(\mathring{\mathbf {x}}\)-variables to \(\mathbf {d}\)-variables, defined by

$$\begin{aligned} \mathbf {d}_1=(d_{1,1},\ldots , d_{W,1})':=\mathbb {D}_+^{-1}\mathsf {U}'\mathring{\mathbf {x}}_1,\qquad \mathbf {d}_2=(d_{1,2},\ldots , d_{W,2})':=\mathbb {D}_-^{-1}\mathsf {U}'\mathring{\mathbf {x}}_2. \end{aligned}$$

Thus accordingly, we change the differentials

$$\begin{aligned} \prod _{j=1}^W \mathrm{d} \mathring{x}_{j,1}\rightarrow \det \mathbb {D}_+\cdot \prod _{j=1}^W \mathrm{d} d_{j,1},\quad \prod _{j=1}^W \mathrm{d} \mathring{x}_{j,2}\rightarrow \det \mathbb {D}_-\cdot \prod _{j=1}^W \mathrm{d} d_{j,2}. \end{aligned}$$

In addition, as in (8.44), we truncate the domain to \(\mathbf {d}_1,\mathbf {d}_2\in \mathring{\varUpsilon }\). Finally, using the fact \(\det \mathbb {D}_+\mathbb {D}_-=1/\sqrt{\det \mathbb {A}_+\mathbb {A}_-}\) (applied once for the \(\mathbf {c}\)-variables and once for the \(\mathbf {d}\)-variables), from (8.30) we arrive at the representation

$$\begin{aligned}&2^W\mathcal {I}\big (\varUpsilon ^b_+, \varUpsilon ^b_-, \varUpsilon ^x_+, \varUpsilon ^x_-, \varUpsilon _S,\varUpsilon _S\big )\nonumber \\&\quad =\frac{M^{2}}{(n!)^24^{W}\pi ^{4W+2}}\cdot \frac{1}{\det \mathbb {A}_+\mathbb {A}_-}\cdot \int _{\mathring{\varUpsilon }} \prod _{j=1}^W \mathrm{d} c_{j,1} \int _{\mathring{\varUpsilon }} \prod _{j=1}^W \mathrm{d} c_{j,2}\int _{\mathring{\varUpsilon }} \prod _{j=1}^W \mathrm{d} d_{j,1} \int _{\mathring{\varUpsilon }} \prod _{j=1}^W \mathrm{d} d_{j,2} \nonumber \\&\qquad \times \int _{\mathring{\varUpsilon }_S} \prod _{j=2}^W \mathrm{d} \tau _{j,1}\int _{\mathring{\varUpsilon }_S} \prod _{j=2}^W \mathrm{d} \tau _{j,2} \int _{\mathring{\varUpsilon }_S} \prod _{j=2}^W \mathrm{d} \upsilon _{j,1}\int _{\mathring{\varUpsilon }_S} \prod _{j=2}^W \mathrm{d} \upsilon _{j,2} \; \prod _{j=1}^{W} \exp \left\{ \mathbf {i}\frac{\mathring{x}_{j,1}+\mathring{x}_{j,2}}{\sqrt{M}}\right\} \nonumber \\&\qquad \times \exp \left\{ -\frac{1}{2}||\mathbf {c}_1||_2^2-\frac{1}{2}||\mathbf {c}_2||_2^2-R^b\right\} \times \exp \Big \{-\frac{1}{2}||\mathbf {d}_1||_2^2-\frac{1}{2}||\mathbf {d}_2||_2^2-R^x_\pm \Big \}\nonumber \\&\qquad \times \exp \Big \{(a_+-a_-)^2\sum _{a=1,2}\varvec{\tau }'_aS^{(1)}\varvec{\tau }_a-R^{t,b}\Big \}\exp \Big \{(a_+-a_-)^2\sum _{a=1,2}\varvec{\upsilon }'_aS^{(1)}\varvec{\upsilon }_a-R^{v,x}_\pm \Big \}\nonumber \\&\qquad \times \prod _{j=1}^W (x_{j,1}-x_{j,2})^2(b_{j,1}+b_{j,2})^2\cdot \mathsf {A}(\hat{X}, \hat{B}, V, T)+O(e^{-\varTheta }), \end{aligned}$$
(8.45)

in which the \(\mathbf {x}\)- and \(\mathring{\mathbf {x}}\)-variables should be regarded as functions of the \(\mathbf {d}\)-variables; likewise, the \(\mathbf {b}\)- and \(\mathring{\mathbf {b}}\)-variables should be regarded as functions of the \(\mathbf {c}\)-variables.

Now, in the Type II and III vicinities, we perform the change of coordinates only for the \(\mathring{\mathbf {b}}\)-variables, which is enough for our purpose. Consequently, we have

$$\begin{aligned}&\mathcal {I}\big (\varUpsilon ^b_+, \varUpsilon ^b_-, \varUpsilon ^x_\varkappa , \varUpsilon ^x_\varkappa , \varUpsilon _S,\mathbb {I}^{W-1}\big )\nonumber \\&\quad =\exp \big \{M(K(D_{\pm },I)-K(D_{\varkappa },I))\big \}\times \frac{(-a_\varkappa ^2)^{W}}{(n!)^2}\cdot \frac{M^{W+1}}{8^W\pi ^{3W+3}} \times \frac{1}{\sqrt{\det \mathbb {A}_+\mathbb {A}_-}}\nonumber \\&\quad \cdot \int _{\mathbb {L}^{W-1}} \prod _{j=2}^W\frac{\mathrm{d} \theta _j}{2\pi } \int _{\mathbb {I}^{W-1}} \prod _{j=2}^W 2v_j \mathrm{d} v_j\int _{\mathring{\varUpsilon }} \prod _{j=1}^W \mathrm{d} c_{j,1} \int _{\mathring{\varUpsilon }} \prod _{j=1}^W \mathrm{d} c_{j,2} \int _{\mathring{\varUpsilon }} \prod _{j=1}^W \mathrm{d} \mathring{x}_{j,1}\nonumber \\&\quad \times \int _{\mathring{\varUpsilon }} \prod _{j=1}^W \mathrm{d} \mathring{x}_{j,2} \int _{\mathring{\varUpsilon }_S} \prod _{j=2}^W \mathrm{d} \tau _{j,1}\int _{\mathring{\varUpsilon }_S} \prod _{j=2}^W \mathrm{d} \tau _{j,2} \; \prod _{j=1}^{W} \exp \Big \{\mathbf {i}\frac{\mathring{x}_{j,1}+\mathring{x}_{j,2}}{\sqrt{M}}\Big \} \nonumber \\&\quad \times \exp \Big \{-\frac{1}{2}||\mathbf {c}_1||_2^2-\frac{1}{2}||\mathbf {c}_2||_2^2-R^b\Big \}\times \exp \Big \{(a_+-a_-)^2\sum _{a=1,2}\varvec{\tau }'_aS^{(1)}\varvec{\tau }_a-R^{t,b}\Big \}\nonumber \\&\quad \times \exp \Big \{-\frac{1}{2}\mathring{\mathbf {x}}'\mathbb {A}_\varkappa ^v\mathring{\mathbf {x}}-R^x_\varkappa -R^{v,x}_{\varkappa }\Big \}\times \prod _{j=1}^W (x_{j,1}-x_{j,2})^2(b_{j,1}+b_{j,2})^2\nonumber \\&\quad \cdot \mathsf {A}(\hat{X}, \hat{B}, V, T)+O(e^{-\varTheta }). \end{aligned}$$
(8.46)

By (8.38), it is then easy to see

$$\begin{aligned} \mathbf {c}_1,\mathbf {c}_2\in \mathring{\varUpsilon }\Longrightarrow \mathring{\mathbf {b}}_1, \mathring{\mathbf {b}}_2\in \mathring{\varUpsilon },\qquad \mathbf {d}_1,\mathbf {d}_2\in \mathring{\varUpsilon }\Longrightarrow \mathring{\mathbf {x}}_1, \mathring{\mathbf {x}}_2\in \mathring{\varUpsilon }. \end{aligned}$$
(8.47)

We keep the terminology “Type I’, II and III vicinities” for the slightly modified domains defined in terms of \(\mathbf {c}, \mathbf {d}, \varvec{\tau }\) and \(\varvec{\upsilon }\)-variables. More specifically, we redefine the vicinities as follows.

Definition 8.3

We slightly modify Definition 8.1 as follows.

  • Type I’ vicinity: \(\,\,\,\,\,\displaystyle \mathbf {c}_1,\mathbf {c}_2,\mathbf {d}_1,\mathbf {d}_2\in \mathring{\varUpsilon }, \,\,\,\,\, \varvec{\tau }_1,\varvec{\tau }_2,\varvec{\upsilon }_1,\varvec{\upsilon }_2\in \mathring{\varUpsilon }_S\).

  • Type II vicinity: \(\,\,\,\,\,\displaystyle \mathbf {c}_1,\mathbf {c}_2,\mathring{\mathbf {x}}_1,\mathring{\mathbf {x}}_2\in \mathring{\varUpsilon }, \,\,\,\,\, \varvec{\tau }_1,\varvec{\tau }_2\in \mathring{\varUpsilon }_S,\,\,\,\,\, V_j\in \mathring{U}(2)\) for all \(j=2,\ldots , W\),

    where the \(\mathring{\mathbf {x}}\)-variables are defined in (8.9) with \(\varkappa =+\).

  • Type III vicinity: \(\,\,\,\,\,\displaystyle \mathbf {c}_1,\mathbf {c}_2,\mathring{\mathbf {x}}_1,\mathring{\mathbf {x}}_2\in \mathring{\varUpsilon }, \,\,\,\,\, \varvec{\tau }_1,\varvec{\tau }_2\in \mathring{\varUpsilon }_S,\,\,\,\,\, V_j\in \mathring{U}(2)\) for all \(j=2,\ldots , W\),

    where the \(\mathring{\mathbf {x}}\)-variables are defined in (8.9) with \(\varkappa =-\).

Now, recall the remainder terms \(R^b, R^x_{\pm }, R^x_+\) and \(R^x_-\) in Lemma 8.2, \(R^{t,b}\) and \(R^{v,x}_\pm \) in (8.25), \(R_{+}^{v,x}\) in (8.31) and \(R_{-}^{v,x}\) in (8.34). In light of (8.47), the bounds on these remainder terms are the same as those obtained in Sect. 8.1. For the convenience of the reader, we collect them in the following proposition.

Proposition 8.4

Under Assumptions 1.1 and 1.14, the following estimates hold in the vicinities.

$$\begin{aligned}&(i):R^{t,b}=O\Big (\frac{\varTheta ^{4}}{M}\Big ), \quad R^{v,x}_\pm =O\Big (\frac{\varTheta ^{4}}{M}\Big ),\quad (ii): R_{+}^{v,x}=O\Big (\frac{\varTheta ^{\frac{3}{2}}}{\sqrt{M}}\Big ),\nonumber \\&\quad R_{-}^{v,x}=O\Big (\frac{\varTheta ^{\frac{3}{2}}}{\sqrt{M}}\Big ).\\&(iii): R^b=O\Big (\frac{\varTheta ^{2}}{\sqrt{M}}\Big ), \quad R^x_{\pm }=O\Big (\frac{\varTheta ^{2}}{\sqrt{M}}\Big ),\quad R_{+}^x=O\Big (\frac{\varTheta ^{\frac{3}{2}}}{\sqrt{M}}\Big ),\quad R_{-}^x=O\Big (\frac{\varTheta ^{\frac{3}{2}}}{\sqrt{M}}\Big )\,. \end{aligned}$$

Proof

Note that (i) can be obtained from (8.28), (ii) is implied by (8.32) and (8.35), and (iii) follows from Lemma 8.2. This completes the proof. \(\square \)

Analogously, in the vicinities, \(||\mathring{\mathbf {b}}_1||_{2}^2, ||\mathring{\mathbf {b}}_2||_{2}^2, ||\mathring{\mathbf {x}}_1||_2^2, ||\mathring{\mathbf {x}}_2||_2^2, ||\mathring{\mathbf {t}}||_\infty \) and \(||\mathring{\mathbf {v}}||_\infty \) are still bounded by \(\varTheta \).

9 Integral over the Type I vicinities

With (8.45), we estimate the integral over the Type I vicinity in this section. First, in the Type I’ vicinity, we have \(||\mathring{\mathbf {x}}_a||_\infty =O(\varTheta ^{\frac{1}{2}})\) and \(||\mathring{\mathbf {b}}_a||_\infty =O(\varTheta ^{\frac{1}{2}})\) for \(a=1,2\). Consequently, according to the parametrization in (8.2), we get

$$\begin{aligned} \prod _{j=1}^W (x_{j,1}-x_{j,2})^2(b_{j,1}+b_{j,2})^2=(a_+-a_-)^{4W}\left( 1+O\left( \frac{\varTheta ^{\frac{3}{2}}}{\sqrt{M}}\right) \right) . \end{aligned}$$
(9.1)

Hence, what remains is to estimate the function \(\mathsf {A}(\hat{X}, \hat{B}, V, T)\). We have the following lemma.

Lemma 9.1

Suppose that the assumptions in Theorem 1.15 hold. In the Type I’ vicinity, for any given positive integer n, there is \(N_0=N_0(n)\), such that for all \(N\ge N_0\) we have

$$\begin{aligned} |\mathsf {A}(\hat{X}, \hat{B}, V, T)|\le \frac{\varTheta ^2W^{C_0}}{M(N\eta )^{n+\ell }}\cdot |\det \mathbb {A}_+|^2\cdot \det (S^{(1)})^2 \end{aligned}$$

for some positive constant \(C_0\) and some integer \(\ell =O(1)\), both of which are independent of n.

With (8.45), (9.1) and Lemma 9.1, we can prove Lemma 5.8.

Proof of Lemma 5.8

Using (8.45), (9.1), Lemma 9.1, Proposition 8.4 with (5.35), the fact \(\det \mathbb {A}_+=\overline{\det \mathbb {A}_-}\) and the trivial estimate \(M\varTheta ^2W^{C_0}/{(N\eta )^{\ell }}\le N^{C_0}\) for sufficiently large constant \(C_0\), we have

$$\begin{aligned}&2^W|\mathcal {I}(\varUpsilon ^b_+, \varUpsilon ^b_-, \varUpsilon ^x_+, \varUpsilon ^x_-, \varUpsilon _S,\varUpsilon _S)|\nonumber \\&\quad \le \frac{N^{C_0}}{(N\eta )^{n}}\cdot \frac{1}{(2\pi ^2)^{2W}}\cdot \det (S^{(1)})^2\cdot (a_+-a_-)^{4W}\\&\qquad \times \int _{\mathring{\varUpsilon }} \prod _{j=1}^W \mathrm{d} c_{j,1} \int _{\mathring{\varUpsilon }} \prod _{j=1}^W \mathrm{d} c_{j,2}\int _{\mathring{\varUpsilon }} \prod _{j=1}^W \mathrm{d} d_{j,1} \int _{\mathring{\varUpsilon }} \prod _{j=1}^W \mathrm{d} d_{j,2}\int _{\mathring{\varUpsilon }_S} \prod _{j=2}^W \mathrm{d} \tau _{j,1}\int _{\mathring{\varUpsilon }_S} \prod _{j=2}^W \mathrm{d} \tau _{j,2} \\&\qquad \times \int _{\mathring{\varUpsilon }_S}\prod _{j=2}^W \mathrm{d} \upsilon _{j,1}\int _{\mathring{\varUpsilon }_S} \prod _{j=2}^W \mathrm{d} \upsilon _{j,2} \; \exp \left\{ -\frac{1}{2}\big (||\mathbf {c}_1||_2^2+||\mathbf {c}_2||_2^2+||\mathbf {d}_1||_2^2+||\mathbf {d}_2||_2^2\big )\right\} \\&\qquad \times \exp \left\{ (a_+-a_-)^2\left( \varvec{\tau }'_1S^{(1)}\varvec{\tau }_1+\varvec{\tau }'_2S^{(1)}\varvec{\tau }_2+\varvec{\upsilon }'_1S^{(1)}\varvec{\upsilon }_1+\varvec{\upsilon }'_2S^{(1)}\varvec{\upsilon }_2\right) \right\} +O(e^{-\varTheta }). \end{aligned}$$

Then, by an elementary Gaussian integral, we obtain (5.37). This completes the proof of Lemma 5.8. \(\square \)

The remaining part of this section is dedicated to the proof of Lemma 9.1. Recall the definitions of the functions \(\mathsf {A}(\cdot ), \mathsf {Q}(\cdot ), \mathsf {P}(\cdot )\) and \(\mathsf {F}(\cdot )\) in (3.32), (4.1), (4.2) and (4.3). Using the strategy of Sect. 6 again, we ignore the irrelevant factor \(\mathsf {Q}(\cdot )\) at the beginning. Hence, we first bound \(\mathsf {P}(\cdot )\) and \(\mathsf {F}(\cdot )\), and then modify the bounding procedure slightly to take \(\mathsf {Q}(\cdot )\) into account in the end, resulting in a proof of Lemma 9.1.

9.1 \(\mathsf {P}(\hat{X}, \hat{B}, V, T)\) in the Type I’ vicinity

Our aim, in this section, is to prove the following lemma.

Lemma 9.2

Suppose that the assumptions in Theorem 1.15 hold. In the Type I’ vicinity, we have

$$\begin{aligned} |\mathsf {P}(\hat{X},\hat{B}, V, T)|\le \frac{W^{2+\gamma }\varTheta ^2}{M}|\det \mathbb {A}_+|^2\det (S^{(1)})^2. \end{aligned}$$
(9.2)

Before commencing the formal proof, we introduce some more notation. First, we set

$$\begin{aligned} \mathring{\kappa }_j\equiv \mathring{\kappa }_j(\hat{X},\hat{B},V,T):=|\mathring{x}_{j,1}|+|\mathring{x}_{j,2}|+|\mathring{b}_{j,1}|+|\mathring{b}_{j,2}|+|\mathring{v}_j|+|\mathring{t}_j|= O(\varTheta ), \nonumber \\ \end{aligned}$$
(9.3)

where the bound holds in the Type I’ vicinity.

Recall (6.12), with \(\varvec{\varpi }_j\) defined in (6.8) and \(\hat{\varvec{\varpi }}_j\) in (6.10). Now, we write

$$\begin{aligned} \varvec{\varpi }_j&=\exp \Big \{ -M\log \det \big (1+M^{-1}V_j^*\hat{X}_j^{-1}V_j \varOmega _jT_j^{-1}\hat{B}_j^{-1}T_j\varXi _j\big )\Big \}\nonumber \\&=:\exp \Big \{-TrV_j^*\hat{X}_j^{-1}V_j \varOmega _jT_j^{-1}\hat{B}_j^{-1}T_j\varXi _j\Big \}\; \exp \left\{ \sum _{\ell =2}^4\frac{(-1)^{\ell -1}}{\ell M^{\ell -1}}\varDelta _{\ell ,j}\right\} , \end{aligned}$$
(9.4)

where

$$\begin{aligned} \varDelta _{\ell ,j}:= Tr \big (V_j^*\hat{X}_j^{-1}V_j \varOmega _jT_j^{-1}\hat{B}_j^{-1}T_j\varXi _j\big )^\ell . \end{aligned}$$
(9.5)
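Note that the expansion in (9.4) is exact rather than merely asymptotic: the entries of \(V_j^*\hat{X}_j^{-1}V_j \varOmega _jT_j^{-1}\hat{B}_j^{-1}T_j\varXi _j\) are linear in the four Grassmann generators \(\omega _{j,1},\ldots ,\omega _{j,4}\), so the matrix is nilpotent and \(\varDelta _{\ell ,j}=0\) for all \(\ell \ge 5\). Schematically, for an even element \(\nu \) of the Grassmann algebra with \(\nu ^5=0\),

$$\begin{aligned} \log (1+\nu )=\nu -\frac{\nu ^2}{2}+\frac{\nu ^3}{3}-\frac{\nu ^4}{4},\qquad e^{\nu }=\sum _{k=0}^4\frac{\nu ^k}{k!}, \end{aligned}$$

which underlies both (9.4) and the finite expansions (9.10) and (9.11) below.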

The second step of (9.4) follows from the Taylor expansion of the logarithm which, as noted above, terminates at \(\ell =4\). Now, we expand the first factor of (9.4) around the Type I’ saddle point, namely

$$\begin{aligned}&\exp \Big \{-TrV_j^*\hat{X}_j^{-1}V_j \varOmega _jT_j^{-1}\hat{B}_j^{-1}T_j\varXi _j\Big \}\nonumber \\&\quad =:\exp \Big \{-TrD_{\pm }^{-1}\varOmega _jD_{\pm }^{-1}\varXi _j\Big \} \exp \Big \{-\frac{1}{\sqrt{M}}\varDelta _j\Big \}. \end{aligned}$$
(9.6)

We take (9.6) as the definition of \(\varDelta _j\), which is of the form

$$\begin{aligned} \varDelta _j=\sum _{\alpha ,\beta =1}^4\mathring{\mathfrak {p}}_{j,\alpha ,\beta } \cdot \omega _{j,\alpha }\xi _{j,\beta } \end{aligned}$$

for some function \(\mathring{\mathfrak {p}}_{j,\alpha ,\beta }\) of \(\mathring{\mathbf {x}}, \mathring{\mathbf {b}}, \mathring{\mathbf {v}}\) and \(\mathring{\mathbf {t}}\)-variables, satisfying

$$\begin{aligned} \mathring{\mathfrak {p}}_{j,\alpha ,\beta }=O(\mathring{\kappa }_j),\qquad \forall \; \alpha ,\beta =1,\ldots ,4. \end{aligned}$$
(9.7)

One can check (9.7) easily by using (8.9)–(8.11). Analogously, we can also write

$$\begin{aligned} \varDelta _{\ell ,j}=\sum _{\begin{array}{c} \alpha _1,\ldots ,\alpha _\ell ,\\ ~~\beta _1,\ldots , \beta _\ell =1 \end{array}}^4\mathring{\mathfrak {p}}_{\ell ,j,\varvec{\alpha },\varvec{\beta }}\prod _{i=1}^\ell \omega _{j,\alpha _i}\xi _{j,\beta _i},\qquad \varvec{\alpha }:=(\alpha _1,\ldots ,\alpha _\ell ),\quad \varvec{\beta }:=(\beta _1,\ldots , \beta _\ell ), \nonumber \\ \end{aligned}$$
(9.8)

where

$$\begin{aligned} \mathring{\mathfrak {p}}_{\ell ,j,\varvec{\alpha },\varvec{\beta }}=O(1),\qquad \forall \; \ell =2,\ldots , 4; \alpha _1,\ldots ,\alpha _\ell , \beta _1,\ldots ,\beta _\ell =1,\ldots ,4. \quad \quad \end{aligned}$$
(9.9)

The bound on \(\mathring{\mathfrak {p}}_{\ell ,j,\varvec{\alpha },\varvec{\beta }}\) in (9.9) follows from the fact that all the \(V_j, \hat{X}_j^{-1}, T_j, T_j^{-1}\) and \(\hat{B}_j^{-1}\)-entries are bounded in the Type I’ vicinity, uniformly in j. Consequently, we can write for \(j\ne p,q\)

$$\begin{aligned} \exp \left\{ -\frac{1}{\sqrt{M}}\varDelta _j+\sum _{\ell =2}^4\frac{(-1)^{\ell -1}}{\ell M^{\ell -1}}\varDelta _{\ell ,j}\right\} =1+\sum _{\ell =1}^4 M^{-\frac{\ell }{2}}\sum _{\begin{array}{c} \alpha _1,\ldots ,\alpha _\ell ,\\ ~~\beta _1,\ldots , \beta _\ell =1 \end{array}}^4\mathring{\mathfrak {q}}_{\ell ,j,\varvec{\alpha },\varvec{\beta }}\prod _{i=1}^\ell \omega _{j,\alpha _i}\xi _{j,\beta _i}. \nonumber \\ \end{aligned}$$
(9.10)

In a similar manner, we can also write for \(k=p,q\),

$$\begin{aligned}&\exp \left\{ -\frac{1}{\sqrt{M}}\varDelta _k+\sum _{\ell =2}^4\frac{(-1)^{\ell -1}}{\ell M^{\ell -1}}\varDelta _{\ell ,k}\right\} \hat{\varvec{\varpi }}_k\nonumber \\&\quad =\hat{\mathfrak {p}}_0(\cdot )\Bigg (1+\sum _{\ell =1}^4 M^{-\frac{\ell }{2}}\sum _{\begin{array}{c} \alpha _1,\ldots ,\alpha _\ell ,\\ ~~\beta _1,\ldots , \beta _\ell =1 \end{array}}^4\mathring{\mathfrak {q}}_{\ell ,k,\varvec{\alpha },\varvec{\beta }}\prod _{i=1}^\ell \omega _{k,\alpha _i}\xi _{k,\beta _i}\Bigg ), \end{aligned}$$
(9.11)

where \(\hat{\mathfrak {p}}_0(\cdot )=\det \hat{X}_k/\det \hat{B}_k\), which is introduced in (6.10), and \(\mathring{\mathfrak {q}}_{\ell ,j,\varvec{\alpha },\varvec{\beta }}\) is some function of \(\hat{X}, \hat{B}, V\) and T-variables, satisfying the bound

$$\begin{aligned} \mathring{\mathfrak {q}}_{\ell ,j,\varvec{\alpha },\varvec{\beta }}=O((1+\mathring{\kappa }_j)^\ell ),\quad \forall \; \ell =1,\ldots , 4,\quad j=1,\ldots ,W. \end{aligned}$$
(9.12)

Obviously, we have \(\hat{\mathfrak {p}}_0(\cdot )=O(1)\) in the Type I’ vicinity.

Now, in order to distinguish \(\ell ,\varvec{\alpha } \) and \(\varvec{\beta }\) for different j, we index them as \(\ell _j, \varvec{\alpha }_j\) and \(\varvec{\beta }_j\), where

$$\begin{aligned} \varvec{\alpha }_j\equiv \varvec{\alpha }_j(\ell _j):=(\alpha _{j,1},\ldots , \alpha _{j,\ell _j}),\quad \varvec{\beta }_j\equiv \varvec{\beta }_j(\ell _j):=(\beta _{j,1},\ldots , \beta _{j,\ell _j}). \end{aligned}$$

In addition, we define

$$\begin{aligned} \mathbf {\ell }:=(\ell _1,\ldots , \ell _W),\quad {\varvec{\alpha }}\equiv {\varvec{\alpha }}(\mathbf {\ell }):=(\varvec{\alpha }_1,\ldots , \varvec{\alpha }_W),\quad {\varvec{\beta }}\equiv {\varvec{\beta }}(\mathbf {\ell }):=(\varvec{\beta }_1,\ldots , \varvec{\beta }_W). \end{aligned}$$

Let \(||\mathbf {\ell }||_1=\sum _{j=1}^W\ell _j\) be the 1-norm of \(\mathbf {\ell }\). Note that \({\varvec{\alpha }}\) and \({\varvec{\beta }}\) are \(||\mathbf {\ell }||_1\)-dimensional. With this notation, using (6.12), (9.4), (9.6), (9.10) and (9.11), we have the representation

$$\begin{aligned}&\mathcal {P}( \varOmega , \varXi , \hat{X}, \hat{B}, V, T)\nonumber \\&\quad =\hat{\mathfrak {p}}_0(\hat{X}_p,\hat{B}_p)\hat{\mathfrak {p}}_0(\hat{X}_q,\hat{B}_q)\times \exp \left\{ -\sum _{j,k}\tilde{\mathfrak {s}}_{jk} Tr \varOmega _j\varXi _k-\sum _{j=1}^W TrD_{\pm }^{-1}\varOmega _jD_{\pm }^{-1}\varXi _j\right\} \nonumber \\&\qquad \times \left( 1+\sum _{\mathbf {\ell }\in \llbracket 0,4\rrbracket ^W, ||\mathbf {\ell }||_1\ge 1} M^{-\frac{||\mathbf {\ell }||_1}{2}}\sum _{{\varvec{\alpha }},{\varvec{\beta }}\in \llbracket 1,4\rrbracket ^{||\mathbf {\ell }||_1}}\prod _{j=1}^W\mathring{\mathfrak {q}}_{\ell _j, j, \varvec{\alpha }_j,\varvec{\beta }_j}\cdot \prod _{j=1}^W\prod _{i=1}^{\ell _j}\omega _{j,\alpha _{j,i}}\xi _{j,\beta _{j,i}} \right) , \end{aligned}$$
(9.13)

where we made the convention

$$\begin{aligned} \mathring{\mathfrak {q}}_{0,j, \emptyset ,\emptyset }=1,\quad \prod _{i=1}^{0}\omega _{j,\alpha _{j,i}}\xi _{j,\beta _{j,i}}=1, \quad \forall \; j=1,\ldots , W. \end{aligned}$$
(9.14)

According to (9.12) and (9.14), we have

$$\begin{aligned} \prod _{j=1}^W |\mathring{\mathfrak {q}}_{\ell _j, j, \varvec{\alpha }_j,\varvec{\beta }_j}|\le e^{O(||\mathbf {\ell }||_1)} \prod _{j=1}^W (1+\mathring{\kappa }_j)^{\ell _j}. \end{aligned}$$
(9.15)

In addition, we can decompose the sum

$$\begin{aligned} \sum _{\mathbf {\ell }\in \llbracket 0,4\rrbracket ^W,||\mathbf {\ell }||_1\ge 1}=\sum _{\mathfrak {m}=1}^{4W} \sum _{\mathbf {\ell }\in \llbracket 0,4\rrbracket ^W,||\mathbf {\ell }||_1=\mathfrak {m}}. \end{aligned}$$
(9.16)

It is easy to see, by mapping each \(\mathbf {\ell }\) to the set \(\{(j,r):1\le r\le \ell _j\}\subset \{1,\ldots ,W\}\times \{1,\ldots ,4\}\), that

$$\begin{aligned} \sharp \{\mathbf {\ell }\in \llbracket 0,4\rrbracket ^W: ||\mathbf {\ell }||_1=\mathfrak {m}\}\le \left( {\begin{array}{c}4W\\ \mathfrak {m}\end{array}}\right) . \end{aligned}$$
(9.17)

Moreover, it is obvious that

$$\begin{aligned} \sharp \{{\varvec{\alpha }},{\varvec{\beta }}\in \llbracket 1,4\rrbracket ^{||\mathbf {\ell }||_1}\}=16^{||\mathbf {\ell }||_1}. \end{aligned}$$
(9.18)

Therefore, it suffices to investigate the integral

$$\begin{aligned} \mathfrak {P}_{\mathbf {\ell },{\varvec{\alpha }},{\varvec{\beta } }}:= & {} \int \mathrm{d} \varOmega \mathrm{d} \varXi \exp \left\{ -\sum _{j,k}\tilde{\mathfrak {s}}_{jk} Tr \varOmega _j\varXi _k-\sum _{j=1}^W TrD_{\pm }^{-1}\varOmega _jD_{\pm }^{-1}\varXi _j\right\} \nonumber \\&\times \prod _{j=1}^W\prod _{i=1}^{\ell _j}\omega _{j,\alpha _{j,i}}\xi _{j,\beta _{j,i}} \end{aligned}$$

for each combination \((\mathbf {\ell },{\varvec{\alpha }},{\varvec{\beta }})\), and then sum over \((\mathbf {\ell },{\varvec{\alpha }},{\varvec{\beta }})\) to obtain the estimate of \(\mathsf {P}(\hat{X},\hat{B},V,T)\). Specifically, we have the following lemma.

Lemma 9.3

With the notation above, we have

$$\begin{aligned} \mathfrak {P}_{\mathbf {\ell },{\varvec{\alpha }},{\varvec{\beta } }}=0,\quad \text {if} \quad ||\mathbf {\ell }||_1=0\quad \text {or}\quad 1. \end{aligned}$$
(9.19)

Moreover, we have

$$\begin{aligned} |\mathfrak {P}_{\mathbf {\ell },{\varvec{\alpha }},{\varvec{\beta } }}|\le |\det \mathbb {A}_+|^2\det (S^{(1)})^2 \big (||\mathbf {\ell }||_1-1\big )! (2W^\gamma )^{(||\mathbf {\ell }||_1-1)}, \quad \text {if}\quad ||\mathbf {\ell }||_1\ge 2.\quad \quad \quad \end{aligned}$$
(9.20)

We postpone the proof of Lemma 9.3 and prove Lemma 9.2 first.

Proof of Lemma 9.2

By (4.2), (9.13) and (9.19) and the fact that \(\hat{\mathfrak {p}}_0(\cdot )=O(1)\), we have

$$\begin{aligned} |\mathsf {P}(\hat{X}, \hat{B}, V, T)|\le C\sum _{\mathbf {\ell }\in \llbracket 0,4\rrbracket ^W, ||\mathbf {\ell }||_1\ge 2} M^{-\frac{||\mathbf {\ell }||_1}{2}}\sum _{{\varvec{\alpha }},{\varvec{\beta }}\in \llbracket 1,4\rrbracket ^{||\mathbf {\ell }||_1}} \prod _{j=1}^W |\mathring{\mathfrak {q}}_{\ell _j, j, \varvec{\alpha }_j,\varvec{\beta }_j}|\cdot |\mathfrak {P}_{\mathbf {\ell },{\varvec{\alpha }},{\varvec{\beta }}}|.\nonumber \\ \end{aligned}$$
(9.21)

Substituting the bounds (9.15), (9.18) and (9.20) into (9.21) yields

$$\begin{aligned} |\mathsf {P}(\hat{X}, \hat{B}, V, T)|&\le |\det \mathbb {A}_+|^2\det (S^{(1)})^2\nonumber \\&\quad \times \sum _{\mathbf {\ell }\in \llbracket 0,4\rrbracket ^W, ||\mathbf {\ell }||_1\ge 2} e^{O(||\mathbf {\ell }||_1)}\cdot M^{-\frac{||\mathbf {\ell }||_1}{2}}\nonumber \\&\quad \times \big (||\mathbf {\ell }||_1-1\big )! (2W^\gamma )^{(||\mathbf {\ell }||_1-1)}\cdot \prod _{j=1}^W (1+\mathring{\kappa }_j)^{\ell _j}. \end{aligned}$$
(9.22)

Now, from (9.3) we have \(\prod _{j=1}^W (1+\mathring{\kappa }_j)^{\ell _j}\le \varTheta ^{||\mathbf {\ell }||_1}\), which can absorb the irrelevant factor \(e^{O(||\mathbf {\ell }||_1)}\). Using (9.16), (9.17), we have

$$\begin{aligned}&\sum _{\mathbf {\ell }\in \llbracket 0,4\rrbracket ^W,||\mathbf {\ell }||_1\ge 2} e^{O(||\mathbf {\ell }||_1)}\cdot M^{-\frac{||\mathbf {\ell }||_1}{2}}\cdot \big (||\mathbf {\ell }||_1-1\big )! (2W^\gamma )^{(||\mathbf {\ell }||_1-1)}\cdot \prod _{j=1}^W (1+\mathring{\kappa }_j)^{\ell _j}\nonumber \\&\qquad \le \sum _{\mathfrak {m}=2}^{4W} C^{\mathfrak {m}}\left( {\begin{array}{c}4W\\ \mathfrak {m}\end{array}}\right) \cdot M^{-\frac{\mathfrak {m}}{2}} \cdot \varTheta ^{\mathfrak {m}} \cdot \mathfrak {m}! W^{(\mathfrak {m}-1)\gamma }\nonumber \\&\qquad \le \sum _{\mathfrak {m}=2}^{4W} C^{\mathfrak {m}} (4W)^{\mathfrak {m}} \cdot M^{-\frac{\mathfrak {m}}{2}} \cdot \varTheta ^{\mathfrak {m}} \cdot W^{(\mathfrak {m}-1)\gamma }=O\Big (\frac{W^{2+\gamma }\varTheta ^2}{M}\Big ), \end{aligned}$$
(9.23)

where the last step follows from (5.30) and (5.35), which guarantee that \(W^{1+\gamma }\varTheta /\sqrt{M}=o(1)\); hence the geometric-type series is dominated by its \(\mathfrak {m}=2\) term. Now, substituting (9.23) into (9.22), we complete the proof of Lemma 9.2. \(\square \)

Hence, what remains is to prove Lemma 9.3. We will need the following technical lemma whose proof is postponed.

Lemma 9.4

For any index sets \(\mathsf {I},\mathsf {J}\subset \{ 1,\ldots , W\}\) with \(|\mathsf {I}|=|\mathsf {J}|=\mathfrak {m}\ge 1\), we have the following bounds for the determinants of the submatrices of \(S, \mathbb {A}_+\) and \(\mathbb {A}_-\) defined in (8.14).

  • For \((\mathbb {A}_+)^{(\mathsf {I}|\mathsf {J})}\) and \((\mathbb {A}_-)^{(\mathsf {I}|\mathsf {J})}\), we have

    $$\begin{aligned} {|\det (\mathbb {A}_+)^{(\mathsf {I}|\mathsf {J})}|}/{|\det \mathbb {A}_+|}\le 1,\quad {|\det (\mathbb {A}_-)^{(\mathsf {I}|\mathsf {J})}|}/{|\det \mathbb {A}_-|}\le 1. \end{aligned}$$
    (9.24)
  • For \(S^{(\mathsf {I}|\mathsf {J})}\), we have

    $$\begin{aligned} {|\det S^{(\mathsf {I}|\mathsf {J})}|}/{|\det S^{(1)}|}\le (\mathfrak {m}-1)!(2W^\gamma )^{(\mathfrak {m}-1)}. \end{aligned}$$
    (9.25)

Proof of Lemma 9.3

Recall the definition in (6.16). Furthermore, we introduce the matrix

$$\begin{aligned} \mathbb {H}=(a_+^{-2}\mathbb {A}_+)\oplus S\oplus S\oplus (a_-^{-2}\mathbb {A}_-). \end{aligned}$$
(9.26)

Using the fact \(a_+a_-=-1\), we can write

$$\begin{aligned} -\sum _{j,k}\tilde{\mathfrak {s}}_{jk} Tr \varOmega _j\varXi _k-\sum _{j=1}^W TrD_{\pm }^{-1}\varOmega _jD_{\pm }^{-1}\varXi _j=-\mathbf {\Omega }\mathbb {H}\mathbf {\Xi }'. \end{aligned}$$

Now, by the Gaussian integral of the Grassmann variables in (3.2), we see that

$$\begin{aligned} |\mathfrak {P}_{\mathbf {\ell },{\varvec{\alpha }},{\varvec{\beta } }}|=|\det \mathbb {H}^{(\mathsf {I}|\mathsf {J})}| \end{aligned}$$
(9.27)

for some index sets \(\mathsf {I}, \mathsf {J}\subset \{1,\ldots , 4W\}\) determined by \({\varvec{\alpha }}\) and \({\varvec{\beta }}\) such that

$$\begin{aligned} |\mathsf {I}|=|\mathsf {J}|=||\mathbf {\ell }||_1. \end{aligned}$$

Here we mention that (9.27) may fail when at least two components of \(\varvec{\alpha }_j\) coincide for some j. But in this case \(\mathfrak {P}_{\mathbf {\ell },{\varvec{\alpha }},{\varvec{\beta } }}=0\), because \(\chi ^2=0\) for any Grassmann variable \(\chi \).

Now, obviously, there exist index sets \(\mathsf {I}_\alpha , \mathsf {J}_\alpha \subset \{1,\ldots , W\}\) for \(\alpha =1,\ldots , 4\) such that

$$\begin{aligned}&\mathbb {H}^{(\mathsf {I}|\mathsf {J})}=(a_+^{-2}\mathbb {A}_+)^{(\mathsf {I}_1|\mathsf {J}_1)}\oplus S^{(\mathsf {I}_2| \mathsf {J}_2)}\oplus S^{(\mathsf {I}_3|\mathsf {J}_3)}\oplus (a_-^{-2}\mathbb {A}_-)^{(\mathsf {I}_4|\mathsf {J}_4)},\nonumber \\&\quad \sum _\alpha |\mathsf {I}_\alpha |=\sum _\alpha |\mathsf {J}_\alpha |=||\mathbf {\ell }||_1. \end{aligned}$$

It suffices to consider the case \(|\mathsf {I}_\alpha |=|\mathsf {J}_\alpha |\) for all \(\alpha =1,2,3,4\); otherwise, \(\det \mathbb {H}^{(\mathsf {I}|\mathsf {J})}\) is obviously 0, in light of the block structure of \(\mathbb {H}\) in (9.26). Now, note that, since \(\det S=0\), we have

$$\begin{aligned} \det \mathbb {H}^{(\mathsf {I}|\mathsf {J})}=0,\qquad \text {if}\quad ||\mathbf {\ell }||_1=0, 1. \end{aligned}$$

For more general \(\mathbf {\ell }\), by Lemma 9.4 and the fact \(|a_\pm |=1\) (so that the scalar prefactors \(a_\pm ^{-2}\) in (9.26) do not affect the modulus of the determinant), we have

$$\begin{aligned} |\det \mathbb {H}^{(\mathsf {I}|\mathsf {J})}|\le |\det \mathbb {A}_+\mathbb {A}_-|\det (S^{(1)})^2\big (||\mathbf {\ell }||_1-1\big )! (2W^\gamma )^{(||\mathbf {\ell }||_1-1)}. \end{aligned}$$

Then, by the fact \(|\det \mathbb {A}_+\mathbb {A}_-|=|\det \mathbb {A}_+|^2\), we can conclude the proof of Lemma 9.3. \(\square \)

To prove Lemma 9.4, we will need the following lemma.

Lemma 9.5

For the weighted Laplacian S, we have

$$\begin{aligned} \det S^{(i|j)}=(-1)^{j-i}\det S^{(i)}, \quad \forall \; i,j=1,\ldots , W. \end{aligned}$$
(9.28)

Remark 9.6

A direct consequence of (9.28) is \(\det S^{(1)}=\cdots =\det S^{(W)}\).

Proof of Lemma 9.5

Without loss of generality, we assume \(j>i\) in the sequel. We introduce the matrices

$$\begin{aligned} P_{ij}:=I_{i-1}\oplus \bigg (\begin{array}{cc} 0 &{}I_{j-i-1}\\ 1 &{}0\end{array}\bigg )\oplus I_{W-j}, \qquad E_{j}:=I-2\mathbf {e}_j\mathbf {e}_j^*-\sum _{\ell \ne j} \mathbf {e}_\ell \mathbf {e}_j^*. \end{aligned}$$

It is not difficult to check

$$\begin{aligned} S^{(i|j)}=S^{(i)} P_{ij} E_j. \end{aligned}$$
(9.29)
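To see how (9.29) arises, note that, for any matrix A, right multiplication by \(E_j\) acts on columns as

$$\begin{aligned} AE_j\mathbf {e}_k=A\mathbf {e}_k\quad (k\ne j),\qquad AE_j\mathbf {e}_j=A\Big (-\mathbf {e}_j-\sum _{\ell \ne j}\mathbf {e}_\ell \Big )=-A\varvec{1}, \end{aligned}$$

where \(\varvec{1}\) is the all-ones vector; that is, the j-th column is replaced by minus the sum of all columns. Since the rows of the weighted Laplacian S sum to zero (which is also why \(\det S=0\)), the column deleted in passing from S to \(S^{(i|j)}\) equals minus the sum of the remaining columns, so it is restored by \(E_j\), while \(P_{ij}\) permutes the columns into the correct order. (We only sketch the mechanism here; the index bookkeeping is as in the definition of \(P_{ij}\).)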

Then, since \(\det P_{ij}=(-1)^{j-i-1}\) (the sign of a cyclic permutation of \(j-i\) columns) and \(\det E_j=-1\), we have \(\det P_{ij}E_j=(-1)^{j-i}\), which gives the conclusion. \(\square \)

Proof of Lemma 9.4

First, by the definition in (8.14), (1.5) and the fact \(\mathsf {Re}a_+^2=\mathsf {Re}a_-^2>0\), it is easy to see that the singular values of \(\mathbb {A}_+\) and \(\mathbb {A}_-\) are all larger than 1. With the aid of the rectangular matrix \((\mathbb {A}_+)^{(\mathsf {I}|\emptyset )}\) as an intermediate matrix, we can use the Cauchy interlacing property twice to see that the kth largest singular value of \((\mathbb {A}_+)^{(\mathsf {I}|\mathsf {J})}\) is always smaller than the kth largest singular value of \(\mathbb {A}_+\). Consequently, we have the first inequality of (9.24). In the same manner, we can get the second inequality of (9.24).

Now, we prove (9.25). First, we address the case \(\mathsf {I}\cap \mathsf {J}\ne \emptyset \). In light of Remark 9.6, without loss of generality, we assume that \(1\in \mathsf {I}\cap \mathsf {J}\). Then \(S^{(\mathsf {I}|\mathsf {J})}\) is a submatrix of \(S^{(1)}\). Therefore, we can find two permutation matrices P and Q such that

$$\begin{aligned} PS^{(1)}Q=\bigg (\begin{array}{cc} \mathrm {A} &{}\mathrm {B}\\ \mathrm {C} &{}\mathrm {D} \end{array}\bigg ), \end{aligned}$$

where \(\mathrm {D}=S^{(\mathsf {I}|\mathsf {J})}\). Now, by the Schur complement formula, we know that

$$\begin{aligned} {|\det S^{(\mathsf {I}|\mathsf {J})}|}/{|\det S^{(1)}|}=|\det (\mathrm {A}-\mathrm {B}\mathrm {D}^{-1}\mathrm {C})^{-1}|. \end{aligned}$$
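The identity above is the standard Schur complement factorization: writing

$$\begin{aligned} PS^{(1)}Q=\bigg (\begin{array}{cc} I &{}\mathrm {B}\mathrm {D}^{-1}\\ 0 &{}I \end{array}\bigg ) \bigg (\begin{array}{cc} \mathrm {A}-\mathrm {B}\mathrm {D}^{-1}\mathrm {C} &{}0\\ 0 &{}\mathrm {D} \end{array}\bigg ) \bigg (\begin{array}{cc} I &{}0\\ \mathrm {D}^{-1}\mathrm {C} &{}I \end{array}\bigg ), \end{aligned}$$

we get \(\det PS^{(1)}Q=\det (\mathrm {A}-\mathrm {B}\mathrm {D}^{-1}\mathrm {C})\det \mathrm {D}\); since \(|\det P|=|\det Q|=1\), this yields the displayed ratio.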

Moreover, \((\mathrm {A}-\mathrm {B}\mathrm {D}^{-1}\mathrm {C})^{-1}\) is the \((|\mathsf {I}|-1)\times (|\mathsf {I}|-1)\) upper-left corner of \((PS^{(1)}Q)^{-1}=Q^{-1}(S^{(1)})^{-1}P^{-1}\). That means \(\det S^{(\mathsf {I}|\mathsf {J})}/\det S^{(1)}\) is, up to a sign, the determinant of a submatrix of \((S^{(1)})^{-1}\) of dimension \(|\mathsf {I}|-1\). Then, by Assumption 1.1 (iii), we easily get

$$\begin{aligned} |\det S^{(\mathsf {I}|\mathsf {J})}|/|\det S^{(1)}|\le (|\mathsf {I}|-1)! W^{(|\mathsf {I}|-1)\gamma }. \end{aligned}$$

Now, for the case \(\mathsf {I}\cap \mathsf {J}=\emptyset \), we can fix one \(i\in \mathsf {I}\) and \(j\in \mathsf {J}\). Due to (9.28), it suffices to consider

$$\begin{aligned} {\det S^{(\mathsf {I}|\mathsf {J})}}/{\det S^{(i|j)}}. \end{aligned}$$
(9.30)

By a similar discussion, one can see that (9.30) is the determinant of a submatrix of \((S^{(i|j)})^{-1}\) of dimension \(|\mathsf {I}|-1\). Hence, it suffices to bound the entries of \((S^{(i|j)})^{-1}\). From (9.29) we have

$$\begin{aligned} (S^{(i|j)})^{-1}=E_j^{-1}P_{ij}^{-1} (S^{(i)})^{-1}. \end{aligned}$$
(9.31)

Then, it is elementary to see that the entries of \((S^{(i|j)})^{-1}\) are bounded by \(2W^\gamma \), in light of Assumption 1.1 (iii). Consequently, we have

$$\begin{aligned} {|\det S^{(\mathsf {I}|\mathsf {J})}|}/{|\det S^{(i|j)}|}\le (|\mathsf {I}|-1)! (2W^\gamma )^{|\mathsf {I}|-1}, \end{aligned}$$

which implies (9.25). This completes the proof of Lemma 9.4. \(\square \)

9.2 \(\mathsf {F}(\hat{X},\hat{B}, V, T)\) in the Type I’ vicinity

Neglecting the \(X^{[1]}, \mathbf {y}^{[1]}\) and \(\mathbf {w}^{[1]}\)-variables in \(\mathsf {Q}(\cdot )\) for now, we investigate the integral \(\mathsf {F}(\hat{X},\hat{B}, V, T)\) in the Type I’ vicinity in this section. We have the following lemma.

Lemma 9.7

Suppose that the assumptions in Theorem 1.15 hold. In the Type I’ vicinity, we have

$$\begin{aligned} \mathsf {F}(\hat{X},\hat{B}, V, T)= O\Big (\frac{1}{(N\eta )^{n+2}}\Big ). \end{aligned}$$
(9.32)

Recalling the functions \(\mathbb {G}(\hat{B},T)\) and \(\mathbb {F}(\hat{X},V)\) defined in (6.20) and (6.21), we further introduce

$$\begin{aligned} \mathring{\mathbb {G}}(\hat{B},T)=\exp \big \{(a_+-a_-)N\eta \big \} \mathbb {G}(\hat{B},T),\qquad \mathring{\mathbb {F}}(\hat{X},V)=\exp \big \{-(a_+-a_-)N\eta \big \}\mathbb {F}(\hat{X},V). \end{aligned}$$
(9.33)

Then, we have the decomposition

$$\begin{aligned} \mathsf {F}(\hat{X},\hat{B}, V,T)=\mathring{\mathbb {G}}(\hat{B},T) \mathring{\mathbb {F}}(\hat{X},V). \end{aligned}$$
(9.34)

Hence, we can estimate \(\mathring{\mathbb {F}}(\hat{X},V)\) and \(\mathring{\mathbb {G}}(\hat{B},T)\) separately in the sequel.

9.2.1 Estimate of \(\mathring{\mathbb {F}}(\hat{X},V)\)

Lemma 9.8

Suppose that the assumptions in Theorem 1.15 hold. In the Type I’ vicinity, we have

$$\begin{aligned} \mathring{\mathbb {F}}(\hat{X},V)=O\Big (\frac{1}{N\eta }\Big ). \end{aligned}$$
(9.35)

Proof

Using (8.9) and (8.10), we can write

$$\begin{aligned} X_j=P_1^*V_j^*\hat{X}_j V_j P_1=P_1^*D_{\pm } P_1+O\Big (\frac{\varTheta }{\sqrt{M}}\Big ), \end{aligned}$$
(9.36)

where the remainder term represents a \(2\times 2\) matrix whose max-norm is bounded by \(\varTheta /\sqrt{M}\). Using (9.36) and recalling \(N=MW\) yields

$$\begin{aligned} \exp \left\{ M\eta \sum _{j=1}^W Tr X_jJ\right\} =\exp \Big \{ N\eta Tr P_1^*D_{\pm } P_1J\Big \}\left( 1+O\Big (\frac{\varTheta N\eta }{\sqrt{M}}\Big )\right) . \quad \quad \end{aligned}$$
(9.37)

Substituting (9.37) into (3.20) and (6.21), we can write

$$\begin{aligned} \mathbb {F}(\hat{X},V)&=\int \mathrm{d} \mu (P_1) \mathrm{d} X^{[1]} \; \exp \Big \{ N\eta Tr P_1^*D_{\pm } P_1J\Big \}\prod _{k=p,q}\frac{1}{\det ^2(X_k^{[1]})} \\&\quad \times \prod _{k=p,q}\exp \Big \{ \mathbf {i} Tr X_k^{[1]}JZ-\sum _j\tilde{\mathfrak {s}}_{jk} TrX_jX_k^{[1]}J\Big \}\nonumber \\&\quad \times \prod _{k,\ell =p,q}\exp \left\{ \frac{\tilde{\mathfrak {s}}_{k\ell }}{2M} TrX_k^{[1]}JX_\ell ^{[1]}J\right\} \cdot \Big (1+O\Big (\frac{\varTheta N\eta }{\sqrt{M}}\Big )\Big ). \end{aligned}$$

Recalling the parametrization of \(P_1\) in (3.25), we have

$$\begin{aligned} Tr P_1^*D_{\pm } P_1J=(1-2v^2)(a_+-a_-). \end{aligned}$$

Consequently, we have

$$\begin{aligned} \mathring{\mathbb {F}}(\hat{X},V)&=\int \mathrm{d} X^{[1]}\int v\mathrm{d} v\int \frac{\mathrm{d} \theta }{\pi } \; \exp \Big \{ -2(a_+-a_-)N\eta v^2 \Big \}\prod _{k=p,q}\frac{1}{\det ^2(X_k^{[1]})}\\&\quad \times \prod _{k=p,q}\exp \left\{ \mathbf {i} Tr X_k^{[1]}JZ-\sum _j\tilde{\mathfrak {s}}_{jk} TrX_jX_k^{[1]}J\right\} (1+o(1)). \end{aligned}$$

By the fact that \(X^{[1]}\)-variables are all bounded and \(|\det X_k^{[1]}|=1\) for \(k=p,q\), it is easy to see that

$$\begin{aligned} |\mathring{\mathbb {F}}(\hat{X},V)|\le C\int _0^{1} v \mathrm{d} v\; \exp \Big \{ -2(a_+-a_-)N\eta v^2 \Big \}=O\Big (\frac{1}{N\eta }\Big ). \end{aligned}$$
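The last step can be made explicit: since \(\mathsf {Re}(a_+-a_-)=\sqrt{4-E^2}\ge \sqrt{2}\) in the bulk (cf. (9.61)), setting \(c:=2(a_+-a_-)N\eta \) we have

$$\begin{aligned} \int _0^{1} v \,\mathrm{d} v\; \exp \big \{ -cv^2 \big \}=\frac{1-e^{-c}}{2c}=O\Big (\frac{1}{N\eta }\Big ). \end{aligned}$$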

This completes the proof. \(\square \)

9.2.2 Estimate of \(\mathring{\mathbb {G}}(\hat{B},T)\)

Recall the definition of \(\mathring{\mathbb {G}}(\hat{B},T)\) from (9.33), (6.20) and (3.21). In this section, we will prove the following lemma.

Lemma 9.9

Suppose that the assumptions in Theorem 1.15 hold. In the Type I’ vicinity, we have

$$\begin{aligned} \mathring{\mathbb {G}}(\hat{B},T)=O\left( \frac{1}{(N\eta )^{n+1}}\right) . \end{aligned}$$
(9.38)

Note that \(y_p^{[1]}, y_q^{[1]}\) and t in the parametrization of \(Q_1\) (see (3.25)) are not bounded; we shall first truncate them at appropriate thresholds, whereby we can neglect some irrelevant terms in the integrand and simplify the integral. More specifically, we will impose the truncations

$$\begin{aligned} t\le (N\eta )^{-1/4},\qquad y_p^{[1]}, y_q^{[1]}\le (N\eta )^{\frac{1}{8}}. \end{aligned}$$
(9.39)

Accordingly, we set

$$\begin{aligned} \widehat{\mathbb {G}}(\hat{B},T)&:=e^{(a_+-a_-)N\eta }\int _{\mathbb {L}}\frac{\mathrm{d} \sigma }{2\pi }\int _{\mathbb {I}^2}{v}_p^{[1]}{v}_q^{[1]}\mathrm{d} {v}_p^{[1]}\mathrm{d} {v}_q^{[1]} \int _0^{(N\eta )^{\frac{1}{8}}} \mathrm{d} y_p^{[1]} \int _0^{(N\eta )^{\frac{1}{8}}} \mathrm{d} y_q^{[1]} \nonumber \\&\quad \times \int _{0}^{(N\eta )^{-\frac{1}{4}}} 2t\mathrm{d} t \int _{\mathbb {L}^2}\mathrm{d} \sigma _p^{[1]}\mathrm{d} \sigma _q^{[1]} \; g(Q_1,T,\hat{B}, \mathbf {y}^{[1]}, \mathbf {w}^{[1]}), \end{aligned}$$
(9.40)

where we have used the parametrization of \(\mathbf {w}^{[1]}\) in (3.15). We will prove the following lemma.

Lemma 9.10

Suppose that the assumptions in Theorem 1.15 hold. In the Type I’ vicinity, we have

$$\begin{aligned} \mathring{\mathbb {G}}(\hat{B},T)=\widehat{\mathbb {G}}(\hat{B},T)+O(e^{-N^{\varepsilon }}) \end{aligned}$$

for some positive constant \(\varepsilon \).

Proof

At first, by (6.26)–(6.29), we have for any j,

$$\begin{aligned} \mathsf {Re}Tr B_jY_k^{[1]}J \ge \frac{y_k^{[1]}}{(s+t)^2}\cdot \frac{\text {min}\{\mathsf {Re}b_{j,1}, \mathsf {Re}b_{j,2}\}}{(s_j+t_j)^2}\!\ge \! c\frac{y_k^{[1]}}{1+2t^2},\quad k=p,q, \quad \quad \end{aligned}$$
(9.41)

for some positive constant c, where the last step follows from the facts that \(\mathsf {Re}b_{j,1}=\mathsf {Re}a_++o(1)\), \(\mathsf {Re}b_{j,2}=\mathsf {Re}a_++o(1)\) and \(t_j=o(1)\) in the Type I’ vicinity. In addition, it is not difficult to get

$$\begin{aligned} Tr B_j J=\Big (a_+-a_-+O\Big (\frac{\varTheta }{\sqrt{M}}\Big )\Big )(1+2t^2),\quad \forall \; j=1,\ldots , W, \end{aligned}$$

which implies that

$$\begin{aligned} M\eta \sum _{j=1}^WTr B_j J=(a_+-a_-)N\eta +2\Big (a_+-a_-+O\Big (\frac{\varTheta }{\sqrt{M}}\Big )\Big )N\eta t^2+O\Big (\frac{\varTheta N\eta }{\sqrt{M}}\Big ). \nonumber \\ \end{aligned}$$
(9.42)

Note that the second and third factors in the definition of \(g(\cdot )\) in (3.21) can be bounded by 1, according to (6.23). Then, as a consequence of (9.41) and (9.42), we have

$$\begin{aligned} e^{(a_+-a_-)N\eta }| g(\cdot )| \le C(y_p^{[1]}y_q^{[1]})^{n+3}\exp \{-c'N\eta t^2 \}\exp \left\{ -c\frac{y_p^{[1]}+y_q^{[1]}}{1+2t^2}\right\} , \end{aligned}$$
(9.43)

for some positive constants C, c and \(c'\). By integrating \(y_p^{[1]}\) and \(y_q^{[1]}\) out first, we easily see that the first truncation in (9.39) only produces an error of order \(O(\exp \{-N^{\varepsilon }\})\) in the integral \(\mathring{\mathbb {G}}(\hat{B},T)\), for some positive constant \(\varepsilon =\varepsilon (\varepsilon _2)\), by the assumption \(\eta \ge N^{-1+\varepsilon _2}\) in (1.16). Then one can substitute the first bound in (9.39) into the last factor on the r.h.s. of (9.43), obtaining

$$\begin{aligned} \exp \left\{ -c\frac{y_p^{[1]}+y_q^{[1]}}{1+2t^2}\right\} \le \exp \left\{ -\frac{c}{2}(y_p^{[1]}+y_q^{[1]})\right\} . \end{aligned}$$

We can also perform the second truncation in (9.39) in the integral \(\mathring{\mathbb {G}}(\hat{B},T)\), up to an error of order \(O(\exp \{-N^{\varepsilon }\})\), for some positive constant \(\varepsilon \). This completes the proof of Lemma 9.10. \(\square \)

With the aid of Lemma 9.10, it suffices to work on \(\widehat{\mathbb {G}}(\hat{B},T)\) in the sequel. We have the following lemma.

Lemma 9.11

We have

$$\begin{aligned} \widehat{\mathbb {G}}(\hat{B},T)=O\Big (\frac{1}{(N\eta )^{n+1}}\Big ). \end{aligned}$$

Proof of Lemma 9.11

Recall the parametrization of \(\mathbf {w}^{[1]}_k\) in (3.15) again. To simplify the notation, we set

$$\begin{aligned} \mathfrak {w}_k^{[1]}={u}_k^{[1]}{v}_k^{[1]},\quad k=p,q. \end{aligned}$$

Similarly to (9.36), using \(t=o(1)\) from (9.39), we have the expansion

$$\begin{aligned} B_j=Q_1^{-1}T_j^{-1}\hat{B}_jT_jQ_1=Q_1^{-1}D_{\pm } Q_1+O\Big (\frac{\varTheta }{\sqrt{M}}\Big ). \end{aligned}$$

Consequently, we have

$$\begin{aligned} -M\eta \sum _{j=1}^WTrB_j J=-N\eta (a_+-a_-)(1+2t^2)+O\Big (\frac{\varTheta N\eta }{\sqrt{M}}\Big ). \end{aligned}$$
(9.44)

In addition, for \(k=p,q\), using the fact \(\sum _{j}\tilde{\mathfrak {s}}_{jk}=1\), we have

$$\begin{aligned}&\sum _j\tilde{\mathfrak {s}}_{jk} Tr B_jY_k^{[1]}J = y_k^{[1]}\Big ((a_+-a_-)t^2+\big (a_+({u}_k^{[1]})^2-a_-({v}_k^{[1]})^2\big )\Big )\nonumber \\&\quad +y_k^{[1]}\Big ((a_+-a_-)\big (e^{-\mathbf {i}(\sigma _k^{[1]}+\sigma )}+e^{\mathbf {i}(\sigma _k^{[1]}+\sigma )}\big )\mathfrak {w}_k^{[1]} s t\Big )+\frac{\varTheta }{\sqrt{M}}TrR_k Y_k^{[1]}, \end{aligned}$$
(9.45)

where \(R_k\) is a \(2\times 2\) matrix independent of \(Y_k^{[1]}\), satisfying \(||R_k||_{\max }=O(1)\).

Observe that the term in (9.44) is obviously independent of \(\mathbf {w}^{[1]}\)-variables. In addition, for \(k=p\) or q, we have

$$\begin{aligned} \mathbf {i}TrY_k^{[1]}JZ=\big (-\eta +\mathbf {i}E(1-2({v}_k^{[1]})^2)\big )y_k^{[1]}, \end{aligned}$$
(9.46)

and for \(k,\ell =p\) or q, we have

$$\begin{aligned} TrY_k^{[1]}J Y_{\ell }^{[1]} J=y_{k}^{[1]}y_{\ell }^{[1]}\Big ((\mathfrak {w}_{k}^{[1]}\mathfrak {w}_{\ell }^{[1]})^2+\mathfrak {w}_{k}^{[1]}\mathfrak {w}_{\ell }^{[1]}\left( e^{\mathbf {i}(\sigma _k^{[1]}-\sigma _{\ell }^{[1]})}-e^{\mathbf {i}(\sigma _{\ell }^{[1]}-\sigma _k^{[1]})}\right) \Big ). \nonumber \\ \end{aligned}$$
(9.47)

Moreover, we have

$$\begin{aligned} \Big (\left( \mathbf {w}^{[1]}_q(\mathbf {w}^{[1]}_q)^*\right) _{12}\left( \mathbf {w}^{[1]}_p(\mathbf {w}^{[1]}_p)^*\right) _{21}\Big )^n=\left( \mathfrak {w}_p^{[1]}\mathfrak {w}_q^{[1]}\right) ^ne^{\mathbf {i}n(\sigma _p^{[1]}-\sigma _q^{[1]})}. \end{aligned}$$
(9.48)

Substituting (9.44), (9.45) and (9.46)–(9.48) into the definition of \(g(\cdot )\) in (3.21) and reordering the factors properly, we can write the integrand in (9.40) as

$$\begin{aligned}&\exp \{(a_+-a_-)N\eta \} g(\cdot )=\exp \{{\mathbf {i}n(\sigma _p^{[1]}-\sigma _q^{[1]})}\}\nonumber \\&\qquad \times \exp \left\{ -(a_+-a_-)st\sum _{k=p,q}y_k^{[1]}\mathfrak {w}_k^{[1]}\left( e^{-\mathbf {i}(\sigma _k^{[1]}+\sigma )}+e^{\mathbf {i}(\sigma _k^{[1]}+\sigma )}\right) \right\} \nonumber \\&\qquad \times \exp \left\{ -\frac{\varTheta }{\sqrt{M}}\sum _{k=p,q} Tr R_k Y_k^{[1]}\right\} \nonumber \\&\quad \quad \times \exp \left\{ -\frac{1}{M} \tilde{\mathfrak {s}}_{pq}y_p^{[1]}y_q^{[1]}\mathfrak {w}_p^{[1]}\mathfrak {w}_q^{[1]}\left( e^{\mathbf {i}(\sigma _p^{[1]}-\sigma _q^{[1]})}-e^{\mathbf {i}(\sigma _q^{[1]}-\sigma _p^{[1]})}\right) \right\} \nonumber \\&\qquad \times \prod _{k=p,q}(y_k^{[1]})^{n+3}(\mathfrak {w}_k^{[1]})^n\cdot \prod _{k,\ell =p,q}\exp \left\{ -\frac{1}{2M} \tilde{\mathfrak {s}}_{k\ell }y_k^{[1]}y_\ell ^{[1]}\left( \mathfrak {w}_k^{[1]}\mathfrak {w}_\ell ^{[1]}\right) ^2\right\} \nonumber \\&\qquad \times \exp \left\{ -2N\eta (a_+-a_-)t^2\right\} \prod _{k=p,q}\exp \left\{ -y_k^{[1]}\left( \left( a_+({u}_k^{[1]})^2-a_-({v}_k^{[1]})^2\right) \right. \right. \nonumber \\&\quad \quad \left. \left. +(a_+-a_-)t^2+\eta -\mathbf {i}E\left( 1-2({v}_k^{[1]})^2\right) \right) \right\} \times \Big (1+O\Big (\frac{\varTheta N\eta }{\sqrt{M}}\Big )\Big ), \end{aligned}$$
(9.49)

where the last factor is independent of the \(\mathbf {w}^{[1]}\)-variables. Here, we have grouped the factors containing \(\sigma _p^{[1]}\) and \(\sigma _q^{[1]}\) together, namely, the first four lines on the r.h.s. of (9.49). For further discussion, we write for \(k=p,q\)

$$\begin{aligned} Tr R_k Y_k^{[1]}=y_k^{[1]}\big (\mathfrak {r}_{k}^+ e^{\mathbf {i}\sigma _k^{[1]}}+\mathfrak {r}_k^-e^{-\mathbf {i}\sigma _k^{[1]}}+\mathfrak {r}_k\big ), \end{aligned}$$
(9.50)

where \(\mathfrak {r}_{k}^+ , \mathfrak {r}_k^-\) and \(\mathfrak {r}_k\) are all polynomials of \({u}_k^{[1]}\) and \({v}_k^{[1]}\), with bounded degree and bounded coefficients, in light of \(||R_k||_{\max }=O(1)\), the definition of \(Y_k^{[1]}\) in (3.14) and the parametrization in (3.15).

Now, we start to estimate the integral (9.40) by using (9.49). We first deal with the integral over \(\sigma _p^{[1]}\) and \(\sigma _q^{[1]}\). These variables are collected in an integral of the form

$$\begin{aligned} \mathcal {I}_\sigma (\ell _1,\ell _2)&:=\int _{\mathbb {L}^2} \mathrm{d} \sigma _p^{[1]}\mathrm{d} \sigma _q^{[1]} \exp \big \{\mathbf {i}(n+\ell _1)\sigma _p^{[1]}\big \}\exp \big \{-\mathbf {i}(n+\ell _2)\sigma _q^{[1]}\big \}\nonumber \\&\quad \times \exp \left\{ -\frac{\varTheta }{\sqrt{M}}\sum _{k=p,q} Tr R_k Y_k^{[1]}\right\} \\&\quad \times \exp \left\{ -(a_+-a_-)st\sum _{k=p,q}y_k^{[1]}\mathfrak {w}_k^{[1]}\big (e^{-\mathbf {i}(\sigma _k^{[1]}+\sigma )}+e^{\mathbf {i}(\sigma _k^{[1]}+\sigma )}\big )\right\} \\&\quad \times \exp \left\{ -\frac{1}{M} \tilde{\mathfrak {s}}_{pq}y_p^{[1]}y_q^{[1]}\mathfrak {w}_p^{[1]}\mathfrak {w}_q^{[1]}\big (e^{\mathbf {i}(\sigma _p^{[1]}-\sigma _q^{[1]})}-e^{\mathbf {i}(\sigma _q^{[1]}-\sigma _p^{[1]})}\big )\right\} \end{aligned}$$

with integers \(\ell _1\) and \(\ell _2\) independent of n. Note that according to (9.49), it suffices to consider \(\mathcal {I}_\sigma (0,0)\) for the proof of (9.38). We study \(\mathcal {I}_\sigma (\ell _1,\ell _2)\) for general \(\ell _1\) and \(\ell _2\) here, which will be used later.

Now, we set

$$\begin{aligned}&c_{p,q}:=\tilde{\mathfrak {s}}_{pq}y_p^{[1]}y_q^{[1]}\mathfrak {w}_p^{[1]}\mathfrak {w}_q^{[1]},\nonumber \\&c_{k,1}:=-(a_+-a_-)st y_k^{[1]}\mathfrak {w}_k^{[1]}e^{-\mathbf {i}\sigma }-\frac{\varTheta }{\sqrt{M}} y_k^{[1]}\mathfrak {r}_k^-,\quad k=p,q,\nonumber \\&c_{k,2}:=-(a_+-a_-)st y_k^{[1]}\mathfrak {w}_k^{[1]}e^{\mathbf {i}\sigma }-\frac{\varTheta }{\sqrt{M}} y_k^{[1]}\mathfrak {r}_k^+,\quad k=p,q. \end{aligned}$$
(9.51)

In addition, we introduce

$$\begin{aligned} d_{p,q}:=y_p^{[1]}y_q^{[1]},\quad d_{k}:=\Big (t +\frac{\varTheta }{\sqrt{M}}\Big )y_k^{[1]},\quad k=p,q. \end{aligned}$$
(9.52)

Obviously, when (9.39) is satisfied, we have

$$\begin{aligned} c_{p,q}=O(d_{p,q}),\quad c_{k,1}=O(d_{k}),\quad c_{k,2}=O(d_{k}),\quad k=p,q. \end{aligned}$$
(9.53)

With the aid of the notation defined in (9.50) and (9.51), we can write

$$\begin{aligned} \mathcal {I}_\sigma (\ell _1,\ell _2)= & {} \exp \Big \{-\frac{\varTheta }{\sqrt{M}}(y_p^{[1]}\mathfrak {r}_p+y_q^{[1]}\mathfrak {r}_q)\Big \}\int _{\mathbb {L}^2} \mathrm{d} \sigma _p^{[1]}\mathrm{d} \sigma _q^{[1]} \nonumber \\&\times \exp \left\{ \mathbf {i}(n+\ell _1)\sigma _p^{[1]}\right\} \exp \left\{ -\mathbf {i}(n+\ell _2)\sigma _q^{[1]}\right\} \nonumber \\&\times \prod _{k=p,q}\exp \left\{ c_{k,1} e^{-\mathbf {i}\sigma _k^{[1]}}+c_{k,2} e^{\mathbf {i}\sigma _k^{[1]}}\right\} \nonumber \\&\times \exp \Big \{-\frac{c_{p,q}}{M}e^{\mathbf {i}(\sigma _p^{[1]}-\sigma _q^{[1]})}+\frac{c_{p,q}}{M}e^{\mathbf {i}(\sigma _q^{[1]}-\sigma _p^{[1]})}\Big \} . \end{aligned}$$
(9.54)

We have the following lemma.

Lemma 9.12

Under the truncation (9.39), we have

$$\begin{aligned} |\mathcal {I}_\sigma (\ell _1,\ell _2)|\le C\Big (\Big (\frac{d_{p,q}}{M}\Big )^{n+\ell _3}+d_{p}^{2(n+\ell _3)}+d_{q}^{2(n+\ell _3)}\Big ),\qquad \ell _3:=\frac{\ell _1+\ell _2}{2} \end{aligned}$$

for some positive constant C.

Proof

At first, by Taylor expansion, we have

$$\begin{aligned}&\exp \left\{ \mathbf {i}(n+\ell _1)\sigma _p^{[1]}\right\} \exp \left\{ -\mathbf {i}(n+\ell _2)\sigma _q^{[1]}\right\} \exp \left\{ -\frac{c_{p,q}}{M}e^{\mathbf {i}(\sigma _p^{[1]}-\sigma _q^{[1]})}+\frac{c_{p,q}}{M}e^{\mathbf {i}(\sigma _q^{[1]}-\sigma _p^{[1]})}\right\} \nonumber \\&\quad =\sum _{n_1,n_2=0}^\infty \frac{(-1)^{n_1}}{(n_1)!(n_2)!} \Big (\frac{c_{p,q}}{M}\Big )^{n_1+n_2} \exp \left\{ \mathbf {i}(n+\ell _1+n_1-n_2)\sigma _p^{[1]}\right\} \nonumber \\&\quad \quad \times \exp \left\{ -\mathbf {i}(n+\ell _2+n_1-n_2)\sigma _q^{[1]}\right\} . \end{aligned}$$
(9.55)

Now, for any \(m_1,m_2\in \mathbb {Z}\), we denote

$$\begin{aligned} \tilde{\mathcal {I}}_\sigma (m_1,m_2)&:=\int _{\mathbb {L}^2} \mathrm{d} \sigma _p^{[1]}\mathrm{d} \sigma _q^{[1]}\; \exp \{\mathbf {i}m_1\sigma _p^{[1]}\}\exp \{-\mathbf {i}m_2\sigma _q^{[1]}\}\nonumber \\&\quad \times \prod _{k=p,q}\exp \left\{ c_{k,1} e^{-\mathbf {i}\sigma _k^{[1]}}+c_{k,2} e^{\mathbf {i}\sigma _k^{[1]}}\right\} \nonumber \\&=4\pi ^2\sum _{n_3=0}^{\infty }\mathbf {1}(n_3+m_1\ge 0)\frac{(c_{p,1})^{n_3+m_1}(c_{p,2})^{n_3}}{n_3!(n_3+m_1)!}\nonumber \\&\quad \times \sum _{n_4=0}^{\infty }\mathbf {1}(n_4+m_2\ge 0)\frac{(c_{q,1})^{n_4}(c_{q,2})^{n_4+m_2}}{n_4!(n_4+m_2)!}. \end{aligned}$$
(9.56)
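The closed form in (9.56) follows by expanding both exponentials into power series and integrating term by term; for a single variable, with \(m\in \mathbb {Z}\),

$$\begin{aligned} \int _{\mathbb {L}} \mathrm{d} \sigma \; e^{\mathbf {i}m\sigma }\exp \big \{c_1 e^{-\mathbf {i}\sigma }+c_2e^{\mathbf {i}\sigma }\big \}=\sum _{a,b\ge 0}\frac{c_1^ac_2^b}{a!b!}\int _{\mathbb {L}}e^{\mathbf {i}(m-a+b)\sigma }\mathrm{d} \sigma =2\pi \sum _{b\ge 0,\, b+m\ge 0}\frac{c_1^{b+m}c_2^{b}}{b!(b+m)!}, \end{aligned}$$

since \(\int _{\mathbb {L}}e^{\mathbf {i}k\sigma }\mathrm{d} \sigma =2\pi \mathbf {1}(k=0)\). Applying this in \(\sigma _p^{[1]}\) with \((m,c_1,c_2)=(m_1,c_{p,1},c_{p,2})\) and in \(\sigma _q^{[1]}\) with \((m,c_1,c_2)=(-m_2,c_{q,1},c_{q,2})\) gives (9.56).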

Setting

$$\begin{aligned} m_1:=n+\ell _1+n_1-n_2,\qquad m_2:=n+\ell _2+n_1-n_2, \end{aligned}$$
(9.57)

and using (9.55), we can rewrite (9.54) as

$$\begin{aligned} \mathcal {I}_\sigma (\ell _1,\ell _2)= & {} \exp \Big \{-\frac{\varTheta }{\sqrt{M}}\big (y_p^{[1]}\mathfrak {r}_p+y_q^{[1]}\mathfrak {r}_q\big )\Big \}\sum _{n_1,n_2=0}^\infty \frac{(-1)^{n_1}}{(n_1)!(n_2)!}\nonumber \\&\Big (\frac{c_{p,q}}{M}\Big )^{n_1+n_2}\tilde{\mathcal {I}}_\sigma (m_1,m_2). \end{aligned}$$
(9.58)

For simplicity, we employ the notation

$$\begin{aligned} m_3:=m_3(\ell _1,n_1,n_2, n_3)=m_1+n_3,\quad m_4:=m_4(\ell _2, n_1,n_2, n_4)=m_2+n_4. \nonumber \\ \end{aligned}$$
(9.59)

Consequently, by (9.58) and (9.56) we obtain

$$\begin{aligned} |\mathcal {I}_\sigma (\ell _1,\ell _2)|&\le 4\pi ^2\left| \exp \left\{ -\frac{\varTheta }{\sqrt{M}}(y_p^{[1]}\mathfrak {r}_p+y_q^{[1]}\mathfrak {r}_q)\right\} \right| \sum _{n_1,n_2=0}^\infty \frac{1}{(n_1)!(n_2)!} \Big |\frac{c_{p,q}}{M}\Big |^{n_1+n_2}\nonumber \\&\quad \times \sum _{n_3=0}^{\infty }\mathbf {1}(m_3\ge 0)\frac{|c_{p,1}|^{m_3}|c_{p,2}|^{n_3}}{n_3!m_3!} \cdot \sum _{n_4=0}^{\infty }\mathbf {1}(m_4\ge 0)\frac{|c_{q,1}|^{n_4}|c_{q,2}|^{m_4}}{n_4!m_4!}\nonumber \\&\le C\max _{n_1,n_2,n_3,n_4\in \mathbb {N}} \Big |\Big (\frac{c_{p,q}}{M}\Big )^{n_1+n_2}(c_{p,1})^{m_3}(c_{p,2})^{n_3} (c_{q,1})^{n_4}(c_{q,2})^{m_4}\Big |\nonumber \\&\le C \max _{n_1,n_2,n_3,n_4\in \mathbb {N}}\Big |\Big (\frac{c_{p,q}}{M}\Big )^{n_1+n_2}(c_{p,1})^{m_3}(c_{q,2})^{m_4}\Big | \end{aligned}$$
(9.60)

for some positive constant C, where in the last step we used the fact \(|c_{k,1}|<1, |c_{k,2}|< 1\), which can be seen directly from (9.51)–(9.53), the truncations in (9.39) and the assumption \(\eta \le M^{-1}N^{\varepsilon _2}\). Analogously, we also have \(|{c_{p,q}}/M|<1\). According to the definitions (9.57) and (9.59), we have

$$\begin{aligned} 2(n_1+n_2)+m_3+m_4\ge 2n+\ell _1+\ell _2. \end{aligned}$$
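Indeed, by (9.57) and (9.59),

$$\begin{aligned} 2(n_1+n_2)+m_3+m_4=2(n_1+n_2)+m_1+m_2+n_3+n_4=2n+\ell _1+\ell _2+4n_1+n_3+n_4\ge 2n+\ell _1+\ell _2. \end{aligned}$$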

Hence, by using \(|c_{k,1}|<1, |c_{k,2}|< 1\) and \(|{c_{p,q}}/{M}|<1\), we have the trivial bound

$$\begin{aligned}&\max _{n_1,n_2,n_3,n_4\ge 0}\left| \left( \sqrt{\frac{|c_{p,q}|}{M}}\right) ^{2(n_1+n_2)}(c_{p,1})^{m_3}(c_{q,2})^{m_4}\right| \nonumber \\&\quad \le \left( \frac{|c_{p,q}|}{M}\right) ^{n+\ell _3}+|c_{p,1}|^{2(n+\ell _3)}+|c_{q,2}|^{2(n+\ell _3)}. \end{aligned}$$

This, together with the definition in (9.53), completes the proof. \(\square \)

Now, we return to the proof of Lemma 9.11. Applying (9.49) and Lemma 9.12 with \(\ell _1=\ell _2=0\) to (9.40), and integrating out the bounded variables \({v}_p^{[1]}, {v}_q^{[1]}\) and \(\sigma \), we obtain

$$\begin{aligned} |\widehat{\mathbb {G}}(\hat{B},T)|&\le C\int _0^{(N\eta )^{\frac{1}{8}}} \mathrm{d} y_p^{[1]} \int _0^{(N\eta )^{\frac{1}{8}}} \mathrm{d} y_q^{[1]} \int _{0}^{(N\eta )^{-\frac{1}{4}}} 2t\mathrm{d} t\cdot (d_{p,q})^{n+3} \\&\quad \times \Big (\Big (\frac{d_{p,q}}{M}\Big )^{n}+d_{p}^{2n}+d_{q}^{2n}\Big ) \exp \big \{-2N\eta (a_+-a_-)t^2\big \}\\&\quad \times \exp \left\{ -\frac{\sqrt{4-E^2}}{2}\sum _{k=p,q} y_k^{[1]}\right\} (1+o(1)) \end{aligned}$$

where the last two factors come from the following facts:

$$\begin{aligned}&\left| \exp \left\{ -\sum _{k=p,q} y_k^{[1]}\left( a_+\left( {u}_k^{[1]}\right) ^2-a_-\left( {v}_k^{[1]}\right) ^2\right) \right\} \right| = \exp \left\{ -\frac{\sqrt{4-E^2}}{2}\sum _{k=p,q} y_k^{[1]}\right\} , \nonumber \\&\left| \exp \left\{ -\sum _{k=p,q}y_k^{[1]}\left( (a_+-a_-)t^2+\eta -\mathbf {i}E\left( 1-2\left( {v}_k^{[1]}\right) ^2\right) \right) \right\} \; \left( 1+O\left( \frac{\varTheta N\eta }{\sqrt{M}}\right) \right) \right| \nonumber \\ {}&\quad =1+o(1). \end{aligned}$$
(9.61)
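The first identity in (9.61) is a real-part computation: recalling from (1.27) that \(a_+\) and \(a_-\) have a common imaginary part and real parts \(\pm \sqrt{4-E^2}/2\), and using \(({u}_k^{[1]})^2+({v}_k^{[1]})^2=1\), we have

$$\begin{aligned} \mathsf {Re}\left( a_+\left( {u}_k^{[1]}\right) ^2-a_-\left( {v}_k^{[1]}\right) ^2\right) =\frac{\sqrt{4-E^2}}{2}\left( \left( {u}_k^{[1]}\right) ^2+\left( {v}_k^{[1]}\right) ^2\right) =\frac{\sqrt{4-E^2}}{2}. \end{aligned}$$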

In (9.61) we used the fact that \(({u}_k^{[1]})^2+({v}_k^{[1]})^2=1\). Now, we integrate out \(y_p^{[1]}\) and \(y_q^{[1]}\). Consequently, by the definition in (9.52), we have

$$\begin{aligned} |\widehat{\mathbb {G}}(\hat{B},T)|\le & {} C \int _{0}^{(N\eta )^{-\frac{1}{4}}} 2t\mathrm{d} t \; \Big (\frac{1}{M^n}+\Big (\frac{\varTheta }{\sqrt{M}}\Big )^{2n}+t^{2n}\Big ) \exp \big \{-2N\eta (a_+-a_-)t^2\big \}\\= & {} O\Big (\frac{1}{(N\eta )^{n+1}}\Big ), \end{aligned}$$

where in the last step we have used the assumption \(\eta \le M^{-1}N^{\varepsilon _2}\) in (1.16), Assumption 1.14, the definition of \(\varTheta \) in (5.30) and the fact \(N=MW\). This completes the proof of Lemma 9.11. \(\square \)
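For orientation, the \(t^{2n}\) term above already gives the claimed order: extending the integration to the whole half-line and substituting \(\lambda :=2N\eta (a_+-a_-)\), we have

$$\begin{aligned} \int _0^{\infty } 2t^{2n+1}e^{-\lambda t^2}\,\mathrm{d} t=\frac{n!}{\lambda ^{n+1}}=O\Big (\frac{1}{(N\eta )^{n+1}}\Big ), \end{aligned}$$

since \(a_+-a_-=\sqrt{4-E^2}\) is of order 1 in the bulk; the terms \(M^{-n}\) and \((\varTheta /\sqrt{M})^{2n}\) are then compared with \((N\eta )^{-n}\) by the assumptions just cited.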

Finally, we can prove Lemma 9.9, and then Lemma 9.7.

Proof of Lemma 9.9

This is a direct consequence of Lemmas 9.10 and 9.11. \(\square \)

Proof of Lemma 9.7

This is a direct consequence of (9.34), Lemma 9.8 and Lemma 9.9. \(\square \)

9.3 Summing up: Proof of Lemma 9.1

In this section, we slightly modify the arguments of Sects. 9.1 and 9.2 to prove Lemma 9.1. The combination of Lemmas 9.2 and 9.7 would directly imply Lemma 9.1 if the \(\mathsf {Q}(\cdot )\) factor were not present in the definition of \(\mathsf {A}(\cdot )\); we now take \(\mathsf {Q}(\cdot )\) into account. The argument is similar to the corresponding discussion in Sect. 6.4.

Proof of Lemma 9.1

First, we observe that \(\kappa _1, \kappa _2\) and \(\kappa _3\) in (6.5) are independent of n. Then, by the fact that \(\kappa _1=W^{O(1)}\), it suffices to consider one monomial of the form

$$\begin{aligned} \mathfrak {p}_1\Big (t,s, (y^{[1]}_p)^{-1},(y^{[1]}_q)^{-1}\Big )\mathfrak {p}_2\Big (\Big \{e^{\mathbf {i}\sigma _k^{[1]}}, e^{-\mathbf {i}\sigma _k^{[1]}}\Big \}_{k=p,q}\Big )\mathfrak {q}\Big (\Big \{\frac{\omega _{i,a}\xi _{j,b}}{M}\Big \}_{\begin{array}{c} i,j=1,\ldots ,W\\ a,b=1,\ldots ,4 \end{array}}\Big ), \end{aligned}$$

where the degrees of \(\mathfrak {p}_1(\cdot ), \mathfrak {p}_2(\cdot )\) and \(\mathfrak {q}(\cdot )\) are all O(1) and independent of n, in light of the fact that \(\kappa _3=O(1)\) in (6.5). In particular, the orders of \((y^{[1]}_p)^{-1}\) and \((y^{[1]}_q)^{-1}\) are at most 2, as can be seen from the definition of \(\mathcal {Q}(\cdot )\) in (3.19).

Now, we reuse the notation \(\mathsf {P}_{\mathfrak {q}}(\hat{X}, \hat{B}, V, T)\) and \(\mathsf {F}_{\mathfrak {p}}(\hat{X}, \hat{B}, V, T)\) in (6.36), by redefining them as

$$\begin{aligned} \mathsf {P}_{\mathfrak {q}}(\hat{X}, \hat{B}, V, T):= & {} \int \mathrm{d} \varOmega \mathrm{d}\varXi \; \mathcal {P}(\cdot )\cdot \mathfrak {q}\Big (\Big \{\frac{\omega _{i,a}\xi _{j,b}}{M}\Big \}_{\begin{array}{c} i,j=1,\ldots ,W\\ a,b=1,\ldots ,4 \end{array}}\Big ),\\ \mathsf {F}_{\mathfrak {p}}(\hat{X}, \hat{B}, V, T):= & {} \int \mathrm{d} X^{[1]} \mathrm{d}\mathbf {y}^{[1]} \mathrm{d}\mathbf {w}^{[1]} \mathrm{d}\mu (P_1) \mathrm{d}\nu (Q_1)\; \mathcal {F}(\cdot )\\&\times \mathfrak {p}_1\Big (t,s, (y^{[1]}_p)^{-1},(y^{[1]}_q)^{-1}\Big )\mathfrak {p}_2\Big (\Big \{e^{\mathbf {i}\sigma _k^{[1]}}, e^{-\mathbf {i}\sigma _k^{[1]}}\Big \}_{k=p,q}\Big ). \end{aligned}$$

It is easy to check that \(\mathcal {P}(\cdot )\mathfrak {q}(\cdot )\) also has an expansion of the form in (9.13). Hence, the bound in (9.2) holds for \(\mathsf {P}_{\mathfrak {q}}(\cdot )\) as well. For \(\mathsf {F}_{\mathfrak {p}}(\cdot )\), the main modification is to use Lemma 9.12 with general \(\ell _1\) and \(\ell _2\) independent of n, owing to the function \(\mathfrak {p}_2(\cdot )\). In addition, by the truncations in (9.39), we can bound \(\mathfrak {p}_1(\cdot )\) by some constant C. Hence, it suffices to replace n by \(n+\ell _3\) in the proof of Lemma 9.11. Finally, we obtain

$$\begin{aligned} \mathsf {F}_{\mathfrak {p}}(\hat{X}, \hat{B}, V, T)=O\left( \frac{1}{(N\eta )^{n+\ell _3}}\right) , \end{aligned}$$

with some finite integer \(\ell _3\) independent of n. This completes the proof of Lemma 9.1. \(\square \)

10 Integral over the Type II and III vicinities

In this section, we prove Lemma 5.9. We only present the discussion for \(\mathcal {I}(\varUpsilon ^b_+, \varUpsilon ^b_-, \varUpsilon ^x_+, \varUpsilon ^x_+, \varUpsilon _S,\mathbb {I}^{W-1})\), i.e. the integral over the Type II vicinity; the discussion for \(\mathcal {I}(\varUpsilon ^b_+, \varUpsilon ^b_-, \varUpsilon ^x_-, \varUpsilon ^x_-, \varUpsilon _S,\mathbb {I}^{W-1})\) is analogous. We start from (8.46) and, as before, provide an estimate for the integrand. First, under the parameterization (8.2) with \(\varkappa =+\), we see that

$$\begin{aligned} \prod _{j=1}^W(x_{j,1}-x_{j,2})^2(b_{j,1}+b_{j,2})^2&=\frac{(-a_+^2)^W}{M^W}(a_+-a_-)^{2W}\left( 1+O\left( \frac{\varTheta ^{\frac{3}{2}}}{\sqrt{M}}\right) \right) \nonumber \\&\quad \times \prod _{j=1}^W\left( \mathring{x}_{j,1}-\mathring{x}_{j,2}+O\left( \frac{\varTheta }{\sqrt{M}}\right) \right) ^2. \end{aligned}$$
(10.1)

Then, what remains is to estimate \(\mathsf {A}(\hat{X}, \hat{B}, V, T)\). Our aim is to prove the following lemma.

Lemma 10.1

Suppose that the assumptions in Theorem 1.15 hold. In the Type II vicinity, we have

$$\begin{aligned} |\mathsf {A}(\hat{X}, \hat{B}, V, T)|\le e^{-cN\eta } |\det \mathbb {A}_+|^2\det (S^{(1)})^2 \end{aligned}$$
(10.2)

for some positive constant c.

With the aid of (10.1) and Lemma 10.1, we can prove Lemma 5.9.

Proof of Lemma 5.9

Recall (8.46). First, by the definition of \(\mathbb {A}_+^v\) in (8.33), (5.24) and the fact that \(\mathsf {Re}\,a_+^2>0\), we see that

$$\begin{aligned} \mathsf {Re}(\mathring{\mathbf {x}}'\mathbb {A}_+^v\mathring{\mathbf {x}})\ge ||\mathring{\mathbf {x}}||_2^2 \end{aligned}$$
(10.3)

for all \(\{V_j\}_{j=2}^W\in (\mathring{U}(2))^{W-1}\). Substituting (5.21), (10.1), (10.2), (10.3) and the estimates in Proposition 8.4 into (8.46) yields

$$\begin{aligned}&| \mathcal {I}(\varUpsilon ^b_+, \varUpsilon ^b_-, \varUpsilon ^x_+,\varUpsilon ^x_+, \varUpsilon _S,\mathbb {I}^{W-1})|\\&\quad \le e^{-cN\eta }\cdot \frac{(a_+-a_-)^{2W}}{8^W\pi ^{3W-1}}\cdot |\det S^{(1)}|^2\cdot |\det \mathbb {A}_+|\cdot \int _{\mathbb {L}^{W-1}} \prod _{j=2}^W\frac{\mathrm{d} \theta _j}{2\pi }\int _{\mathbb {I}^{W-1}} \prod _{j=2}^W 2v_j \mathrm{d} v_j\\&\qquad \times \int _{\mathbb {R}^{W-1}} \prod _{j=2}^W \mathrm{d} \tau _{j,1}\int _{\mathbb {R}^{W-1}} \prod _{j=2}^W \mathrm{d} \tau _{j,2}\int _{\mathbb {R}^W} \prod _{j=1}^W \mathrm{d} c_{j,1}\int _{\mathbb {R}^W} \prod _{j=1}^W \mathrm{d} c_{j,2} \int _{\mathbb {R}^W} \prod _{j=1}^W \mathrm{d} \mathring{x}_{j,1}\nonumber \\&\qquad \times \int _{\mathbb {R}^W} \prod _{j=1}^W \mathrm{d} \mathring{x}_{j,2} \exp \{(a_+-a_-)^2\varvec{\tau }_1'S^{(1)}\varvec{\tau }_1\}\exp \{(a_+-a_-)^2\varvec{\tau }_2'S^{(1)}\varvec{\tau }_2\}\nonumber \\&\quad \quad \times \exp \left\{ -\frac{1}{2}||\mathring{\mathbf {c}}_{1}||_2^2-\frac{1}{2}||\mathring{\mathbf {c}}_{2}||_2^2\right\} \exp \left\{ -\frac{1}{2}||\mathring{\mathbf {x}}_{1}||_2^2-\frac{1}{2}||\mathring{\mathbf {x}}_{2}||_2^2\right\} \\&\qquad \times \prod _{j=1}^W\Big (\mathring{x}_{j,1}-\mathring{x}_{j,2}+O\Big (\frac{\varTheta }{\sqrt{M}}\Big )\Big )^2, \end{aligned}$$

where we absorbed several factors into \(\exp \{-cN\eta \}\) and enlarged the integration domains to the full spaces. Then, using the trivial facts

$$\begin{aligned} \int _{\mathbb {L}^{W-1}} \prod _{j=2}^W\frac{\mathrm{d} \theta _j}{2\pi }\int _{\mathbb {I}^{W-1}} \prod _{j=2}^W 2v_j \mathrm{d} v_j=1 \end{aligned}$$

and performing the Gaussian integral over the remaining variables, we obtain

$$\begin{aligned} | \mathcal {I}(\varUpsilon ^b_+, \varUpsilon ^b_-, \varUpsilon ^x_+,\varUpsilon ^x_+, \varUpsilon _S,\mathbb {I}^{W-1})|\le C|\det S^{(1)}|\cdot |\det \mathbb {A}_+|\cdot \Big (1+O\Big (\frac{\varTheta }{\sqrt{M}}\Big )\Big )^{W}. \nonumber \\ \end{aligned}$$
(10.4)

Observe that

$$\begin{aligned} |\det \mathbb {A}_+|\le |1+a_+^2|^W\le 2^W. \end{aligned}$$
(10.5)
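Here the second inequality uses \(|a_+|=1\), which can be read off from the definition in (1.27); indeed,

$$\begin{aligned} |1+a_+^2|\le 1+|a_+|^2=2. \end{aligned}$$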

Moreover, by Assumption 1.1 (ii), we see that \(|\mathfrak {s}_{ii}|\le (1-c_0)/2\) for some small positive constant \(c_0\). Consequently, since \(S^{(1)}\) is negative definite, we have

$$\begin{aligned} |\det S^{(1)}|\le \prod _{i\ne 1} |\mathfrak {s}_{ii}| \le \Big (\frac{1-c_0}{2}\Big )^{W-1} \end{aligned}$$
(10.6)

by Hadamard’s inequality. Substituting (10.5) and (10.6) into (10.4), and noting that \(2^W((1-c_0)/2)^{W-1}=2(1-c_0)^{W-1}\) decays exponentially in W while \(\big (1+O\big (\varTheta /\sqrt{M}\big )\big )^{W}=e^{o(W)}\) since \(\varTheta /\sqrt{M}=o(1)\), yields

$$\begin{aligned} | \mathcal {I}(\varUpsilon ^b_+, \varUpsilon ^b_-, \varUpsilon ^x_+,\varUpsilon ^x_+, \varUpsilon _S,\mathbb {I}^{W-1})|=O(e^{-c W}) \end{aligned}$$
(10.7)

for some positive constant c. Hence, we have proved the first part of Lemma 5.9. The second part can be proved analogously. \(\square \)

In the sequel, we prove Lemma 10.1. As in Sect. 9, we first ignore the factor \(\mathsf {Q}(\cdot )\) in the discussion.

10.1 \(\mathsf {P}(\hat{X}, \hat{B}, V, T)\) in the Type II vicinity

Lemma 10.2

Suppose that the assumptions in Theorem 1.15 hold. In the Type II vicinity, we have

$$\begin{aligned} |\mathsf {P}(\hat{X},\hat{B}, V, T)|\le \frac{W^{2+\gamma }\varTheta ^2}{M}|\det \mathbb {A}_+|^2\det (S^{(1)})^2. \end{aligned}$$
(10.8)

Proof

We will follow the strategy in Sect. 9.1. We regard all V-variables as fixed parameters. Now, we define the function

$$\begin{aligned} \mathring{\iota }\equiv \mathring{\iota }_j(\hat{X},\hat{B},T):=|\mathring{x}_{j,1}|+|\mathring{x}_{j,2}|+|\mathring{b}_{j,1}|+|\mathring{b}_{j,2}|+|\mathring{t}_j|. \end{aligned}$$

Then, we recall the representation (9.4) and the definition of \(\varDelta _{\ell ,j}\) in (9.5). We still adopt the representation (9.8). It is easy to see that in the Type II vicinity, we also have the bound (9.9) for \(\mathring{\mathfrak {p}}_{\ell ,j,\varvec{\alpha },\varvec{\beta }}\). The main difference is the first factor of the r.h.s. of (9.4). We expand it around the saddle point as

$$\begin{aligned} \exp \Big \{-TrV_j^*\hat{X}_j^{-1}V_j \varOmega _jT_j^{-1}\hat{B}_j^{-1}T_j\varXi _j\Big \}=: & {} \exp \Big \{-TrD_{+}^{-1}\varOmega _jD_{\pm }^{-1}\varXi _j\Big \}\;\nonumber \\&\times \exp \Big \{-\frac{1}{\sqrt{M}}\widehat{\varDelta }_j\Big \}. \end{aligned}$$

We take the formula above as the definition of \(\widehat{\varDelta }_j\), which is of the form

$$\begin{aligned} \widehat{\varDelta }_j=\sum _{\alpha ,\beta =1}^4 \hat{p}_{j,\alpha ,\beta } \cdot \omega _{j,\alpha }\xi _{j,\beta }, \end{aligned}$$

where \(\hat{p}_{j,\alpha ,\beta }\) is a function of \(\hat{X}, \hat{B}, V\) and T-variables, satisfying

$$\begin{aligned} \hat{p}_{j,\alpha ,\beta }=O(\mathring{\iota }). \end{aligned}$$

Let \(\widehat{\mathbb {H}}=(a_+^{-2}\mathbb {A}_+)\oplus S\oplus (a_+^{-2}\mathbb {A}_+)\oplus S\). Recalling the notation in (6.16), we can write

$$\begin{aligned} -\sum _{j,k}\tilde{\mathfrak {s}}_{jk} Tr \varOmega _j\varXi _k-\sum _{j=1}^W TrD_{+}^{-1}\varOmega _jD_{\pm }^{-1}\varXi _j=-\mathbf {\Omega }\widehat{\mathbb {H}}\mathbf {\Xi }'. \end{aligned}$$

Now, replacing \(\varDelta _{1,j}\) by \(\widehat{\varDelta }_{j}\), \(\mathring{\kappa }_j\) by \(\mathring{\iota }_j\), and \(\mathbb {H}\) by \(\widehat{\mathbb {H}}\) in the proof of Lemma 9.2, we can carry out the proof of Lemma 10.2 in the same way. We leave the details to the reader. \(\square \)

10.2 \(\mathsf {F}(\hat{X}, \hat{B}, V, T)\) in the Type II vicinity

Lemma 10.3

Suppose that the assumptions in Theorem 1.15 hold. In the Type II vicinity, we have

$$\begin{aligned} \mathsf {F}(\hat{X}, \hat{B}, V, T)=O\left( \frac{\exp \{-(a_+-a_-)N\eta \}}{(N\eta )^{n+1}}\right) . \end{aligned}$$
(10.9)

Proof

Recall the decomposition (9.34). Note that Lemma 9.9 is still applicable. Hence, it suffices to estimate \(\mathring{\mathbb {F}}(\hat{X},V)\). In the Type II vicinity, it is easy to see that

$$\begin{aligned} \sum _{j=1}^W Tr X_jJ=O\Big (\frac{||\mathring{\mathbf {x}}_{1}||_1+||\mathring{\mathbf {x}}_{2}||_1}{\sqrt{M}}\Big )=O\Big (\frac{\varTheta }{\sqrt{M}}\Big ). \end{aligned}$$

Consequently, by the assumption on \(\eta \), we have

$$\begin{aligned} \exp \left\{ M\eta \sum _{j=1}^W Tr X_jJ\right\} =\exp \{O(\varTheta \sqrt{M}\eta )\}=1+o(1). \end{aligned}$$
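Here \(\varTheta \sqrt{M}\eta =o(1)\): since \(M\le N\), we have

$$\begin{aligned} \varTheta \sqrt{M}\eta =\frac{\varTheta M\eta }{\sqrt{M}}\le \frac{\varTheta N\eta }{\sqrt{M}}=o(1), \end{aligned}$$

where the last step was already used in (9.61).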

From (3.20) we can also see that all the other factors of \(f(P_1, V, \hat{X}, X^{[1]})\) are O(1). Hence, by the definition (9.33), we have \(\mathring{\mathbb {F}}(\hat{X},V)=O(\exp \{-(a_+-a_-)N\eta \})\), which together with Lemma 9.9 yields the conclusion. \(\square \)

10.3 Summing up: Proof of Lemma 10.1

Analogously, we can slightly modify the proofs of Lemmas 10.2 and 10.3 in order to take \(\mathsf {Q}(\cdot )\) into account. The proof can then be performed in the same manner as that of Lemma 9.1. We omit the details.

11 Proof of Theorem 1.15

The conclusion for Case 1 is a direct consequence of the discussions in Sects. 3.5–10. More precisely, by using Lemmas 5.1–5.6, 5.8 and 5.9, we obtain (1.21) immediately.

The proofs of Case 2 and Case 3 can be performed analogously, with the slight modifications stated below. In Case 2, we modify the discussions in Sects. 3.5–10 for Case 1 according to the decomposition of supermatrices in (3.9). First, in (3.12) and (3.13), for \(A=\breve{\mathcal {S}}, \breve{{X}}, \breve{{Y}}, \breve{{\varOmega }}\) or \(\breve{{\varXi }}\), we replace \(A_p^{\langle 1\rangle }\) and \(A_q^{\langle 1\rangle }\) by \(A_p^{\langle 1,2\rangle }\) and \(A_q\) respectively, and replace \(A_q^{[1]}\) by \(A_p^{[2]}\). In addition, in the last three lines of (3.13), we also replace \(\tilde{s}_{jq}\) by \(\tilde{s}_{jp}\), and replace \(\tilde{s}_{pq}\) and \(\tilde{s}_{qp}\) by \(\tilde{s}_{pp}\); in the first line, we replace \(\bar{\phi }_{1,q,1}\phi _{1,p,1}\bar{\phi }_{2,p,1}\phi _{2,q,1}\) by \(\bar{\phi }_{1,p,2}\phi _{1,p,1}\bar{\phi }_{2,p,1}\phi _{2,p,2}\). Then, in (3.14) and (3.15), for \(A=X, Y, \varOmega , \varXi , \varvec{\omega }, \varvec{\xi }, \mathbf {w}, y, \tilde{u}, \tilde{v}\) or \(\sigma \), we replace \(A_q^{[1]}\) by \(A_p^{[2]}\). With these modifications, it is easy to check that the proof in Sects. 3.5–10 applies to Case 2 as well. The main point is that we can still gain the factor \(1/(N\eta )^{n+1}\) from the integral of \(g(\cdot )\) defined in (3.21) (with \(y_q^{[1]}\) and \(\mathbf {w}_q^{[1]}\) replaced by \(y_p^{[2]}\) and \(\mathbf {w}_p^{[2]}\)). Heuristically, we can go back to (4.4) and replace \(\sigma _q^{[1]}\) by \(\sigma _p^{[2]}\) therein; it is then quite clear that the same estimate holds. Consequently, Lemmas 5.1–5.6, 5.8 and 5.9 still hold under the replacement of the variables described above. Hence, (1.21) holds in Case 2.

In Case 3, we can also mimic the discussion for Case 1 with slight modifications. We again start from (3.12) and (3.13). For \(A=\breve{\mathcal {S}}, \breve{{X}}, \breve{{Y}}, \breve{{\varOmega }}, \breve{{\varXi }}, \varvec{\omega }\) and \(\varvec{\xi }\), we replace \(A_q^{\langle 1\rangle }\) by \(A_q\), and replace \(A_q^{[1]}\) by 0. In addition, in the first line of (3.13), we replace \(\bar{\phi }_{1,q,1}\phi _{1,p,1}\bar{\phi }_{2,p,1}\phi _{2,q,1}\) by \(\bar{\phi }_{1,p,1}\phi _{1,p,1}\bar{\phi }_{2,p,1}\phi _{2,p,1}\). Consequently, after using the superbosonization formula, we get the factor \((y_p^{[1]}|(\mathbf {w}_p^{[1]}(\mathbf {w}_p^{[1]})^*)_{12}|)^{2n}\) instead of \((y_p^{[1]}y_q^{[1]}(\mathbf {w}^{[1]}_q(\mathbf {w}^{[1]}_q)^*)_{12}(\mathbf {w}^{[1]}_p(\mathbf {w}^{[1]}_p)^*)_{21})^n\) in (3.16). Then, for the superdeterminant terms

$$\begin{aligned} \prod _{k=p,q}\frac{\det (X_k-\varOmega _k(Y_k)^{-1}\varXi _k)}{\det Y_k},\quad \prod _{k=p,q}\frac{y_k^{[1]}\Big (y_k^{[1]}-\varvec{\xi }_k^{[1]}(X_k^{[1]})^{-1}\varvec{\omega }_k^{[1]}\Big )^{2}}{\det ^2 (X_k^{[1]})}, \end{aligned}$$

we shall only keep the factors with \(k=p\) and delete those with \(k=q\). Moreover, we shall also replace \(A_q^{[1]}\) by 0 for \(A=X, Y, \varOmega , \varXi , \varvec{\omega }, \varvec{\xi }, \mathbf {w}, y, \tilde{u}, \tilde{v}\) or \(\sigma \) in (3.16). In addition, \(\mathrm{d} A^{[1]}\) shall be redefined as the differential of the \(A_p^{[1]}\)-variables only, for \(A=X, \mathbf {y}, \mathbf {w}, \varvec{\omega }\) and \(\varvec{\xi }\). One can check step by step that such a modification does not require any essential change of our discussion for Case 1. In particular, note that our modification has nothing to do with the saddle point analysis on the Gaussian measure \(\exp \{-M(K(\hat{X},V)+L(\hat{B},T))\}\). Moreover, the term \(\mathcal {P}(\cdot )\) in (3.29) can be redefined by deleting the factor with \(k=q\) in the last term therein. Such a modification does not change our analysis of \(\mathcal {P}(\cdot )\). In addition, the irrelevant term \(\mathcal {Q}(\cdot )\) can also be redefined accordingly. Specifically, we shall delete the factor with \(k=q\) in the last term of (3.30) and replace \(A_q^{[1]}\) by 0 for \(A=\varOmega , \varXi , \varvec{\omega }, \varvec{\xi }, \mathbf {w}, y\). It is routine to check that Lemma 6.3 still holds under such a modification. Analogously, we can redefine the functions \(\mathcal {F}(\cdot ), f(\cdot )\) and \(g(\cdot )\) in (3.19)–(3.21). Now, the main difference between Case 3 and Cases 1 and 2 is that the factor \((y_p^{[1]}|(\mathbf {w}_p^{[1]}(\mathbf {w}_p^{[1]})^*)_{12}|)^{2n}\) no longer produces oscillation in the integral of \(g(\cdot )\). Heuristically, the counterpart of (4.4) in Case 3 reads

$$\begin{aligned}&e^{(a_+-a_-)N\eta }\int d\mathbf {y}^{[1]} \mathrm{d}\mathbf {w}^{[1]} d\nu (Q_1)\cdot g(\hat{B}, T, Q_1, \mathbf {y}^{[1]},\mathbf {w}^{[1]})\\&\quad \sim \int _0^\infty 2t d t\int _{\mathbb {L}} d\sigma _p^{[1]} \cdot e^{-cN\eta t^2+c_1e^{-\mathbf {i}\sigma _p^{[1]}}t}\sim \frac{1}{N\eta }. \end{aligned}$$
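Indeed, expanding \(e^{c_1e^{-\mathbf {i}\sigma _p^{[1]}}t}\) in powers of t and using \(\int _{\mathbb {L}}e^{-\mathbf {i}k\sigma }\,\mathrm{d} \sigma =2\pi \delta _{k,0}\), only the constant term survives the \(\sigma _p^{[1]}\)-integration, so that

$$\begin{aligned} \int _0^\infty 2t\, \mathrm{d} t\int _{\mathbb {L}} \mathrm{d}\sigma _p^{[1]} \; e^{-cN\eta t^2+c_1e^{-\mathbf {i}\sigma _p^{[1]}}t}=2\pi \int _0^\infty 2t\, e^{-cN\eta t^2}\,\mathrm{d} t=\frac{2\pi }{cN\eta }, \end{aligned}$$

in agreement with the \(1/(N\eta )\) behavior indicated above.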

Hence, (1.21) holds in Case 3 as well. This completes the proof of Theorem 1.15. \(\square \)

12 Comment on the prefactor \(N^{C_0}\) in (1.21)

In the proof of (1.21), we used \(N^{C_0}\) to replace \(M\varTheta ^2W^{C_0}/(N\eta )^{\ell }\) (see the proof of Lemma 5.8). However, the latter is also an artifact of our method; it can be improved to some n-dependent constant \(C_n\) via a more delicate analysis of \(\mathsf {A}(\cdot )\), i.e. the integral of \(\mathcal {P}(\cdot )\mathcal {Q}(\cdot )\mathcal {F}(\cdot )\). Such an improvement stems from the cancellations in the Gaussian integral. First, a finer analysis shows that the factor \(\mathcal {Q}(\cdot )\) can indeed be ignored, in the sense that it does not play any role in the estimate of the order of \(\mathbb {E}|G_{ij}(z)|^{2n}\). Hence, for simplicity, we focus on the product \(\mathsf {P}(\cdot )\mathsf {F}(\cdot )\) instead of \(\mathsf {A}(\cdot )\). Then, we go back to Lemmas 9.2 and 9.7. Recall the decomposition (9.34). A more careful analysis of \(\mathsf {F}(\cdot )\) leads to the following expansion, up to subleading order terms of the factors \(\mathring{\mathbb {G}}(\cdot )\) and \(\mathring{\mathbb {F}}(\cdot )\):

$$\begin{aligned} \mathsf {F}(\cdot )&=\mathring{\mathbb {G}}(\cdot )\mathring{\mathbb {F}}(\cdot )\sim \frac{1}{(N\eta )^{n+2}}\left( 1+\frac{M\eta }{\sqrt{M}}\sum _{j=1}^W \mathsf {l}_j(\mathring{x}_{j,1},\mathring{x}_{j,2},\mathring{v}_j)+\cdots \right) \nonumber \\&\quad \times \left( 1+\frac{M\eta }{\sqrt{M}}\sum _{j=1}^W \mathsf {l}'_j(\mathring{b}_{j,1},\mathring{b}_{j,2}, \mathring{t}_j)+\cdots \right) , \end{aligned}$$
(12.1)

where the \(\mathsf {l}_j(\cdot )\)’s and \(\mathsf {l}'_j(\cdot )\)’s are linear combinations of their arguments. Analogously, one can write down the leading order term of \(\mathsf {P}(\cdot )\) in terms of \(\mathring{\mathbf {x}}, \mathring{\mathbf {b}}, \mathring{\mathbf {t}}\) and \(\mathring{\mathbf {v}}\) explicitly. It can then be seen that the leading order term of \(\mathsf {P}(\cdot )\) is a linear combination of \(\mathring{x}_{j,1}\mathring{x}_{k,2}, \mathring{b}_{j,1}\mathring{b}_{k,2}, \mathring{x}_{j,\alpha }\mathring{b}_{k,\beta }, \upsilon _{j,\alpha }\tau _{k,\beta }\) for \(j,k=1,\ldots , W\) and \(\alpha ,\beta =1,2\), in which all the coefficients are of order 1/M. Observe that the Gaussian integral in (8.45) kills the linear terms. Consequently, in the expansion (12.1), the first term that survives the Gaussian integral is

$$\begin{aligned} \frac{1}{(N\eta )^{n+2}}\cdot \frac{M\eta }{\sqrt{M}}\sum _{j=1}^W \mathsf {l}_j(\mathring{x}_{j,1},\mathring{x}_{j,2},\mathring{v}_j)\cdot \frac{M\eta }{\sqrt{M}}\sum _{j=1}^W \mathsf {l}'_j(\mathring{b}_{j,1},\mathring{b}_{j,2}, \mathring{t}_j). \end{aligned}$$
(12.2)
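The mechanism here is that any monomial containing some coordinate with an odd power integrates to zero against the centered Gaussian measure in (8.45); for instance,

$$\begin{aligned} \int _{\mathbb {R}}\mathring{x}^{2k+1}\, e^{-\mathring{x}^2/2}\,\mathrm{d} \mathring{x}=0, \qquad k\in \mathbb {N}. \end{aligned}$$

This kills the product of the quadratic leading order term of \(\mathsf {P}(\cdot )\) with the constant and the purely linear terms of (12.1), so that (12.2) indeed gives the first surviving contribution.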

Replacing \(\mathsf {A}(\cdot )\) by the product of the leading order term of \(\mathsf {P}(\cdot )\) and (12.2) in the integral (8.45), and taking the Gaussian integral over the \(\mathbf {c}, \mathbf {d}, \varvec{\tau }\) and \(\varvec{\upsilon }\)-variables, yields the true order \(1/(N\eta )^n\), without additional N-dependent prefactors.

Table of symbols

For the convenience of the reader, in the following table we collect some frequently used symbols followed by the locations where they are defined.

\(a_+, a_-\): (1.27)
\(\hat{X}_j, \hat{B}_j\): (3.23)
\(\mathbb {S}, \mathbb {S}^{v}\): (5.23)
\(u, s, u_j, s_j\): (3.25)
\(P_1, Q_1\): (3.25)
\(\varTheta \): (5.30)
\(\tau _{j,1}, \tau _{j,2}\): (8.20)
\(V_j, T_j\): (3.24)
\(\mathbb {I}, \mathbb {L}, \varSigma , \varGamma \): (1.30)
\(\upsilon _{j,1}, \upsilon _{j,2}\): (8.23)
\(\mathbb {A}_+, \mathbb {A}_-\): (8.14)
\(\varUpsilon ^b_{\pm }, \varUpsilon ^x_{\pm }, \varUpsilon _S\): (5.32)
\(\Bbbk (a)\): (5.4)
\(\mathbb {A}_+^v, \mathbb {A}_-^v\): (8.33)
\(\mathring{\varUpsilon }, \mathring{\varUpsilon }_S\): (8.4)
\(D_{\pm }, D_{\mp }, D_+, D_-\): (1.28)
\(\mathbb {H}\): (9.26)
\(\varUpsilon _{\infty }\): (8.39)