Fluctuations of the Magnetization for Ising Models on Erd\H{o}s-R\'enyi Random Graphs -- the Regimes of Small p and the Critical Temperature

We continue our analysis of Ising models on the (directed) Erd\H{o}s-R\'enyi random graph. This graph is constructed on $N$ vertices, and every edge is present with probability $p$, independently. These models were introduced by Bovier and Gayrard [J. Stat. Phys., 1993] and analyzed by the authors in a previous note, in which we considered the case of $p=p(N)$ satisfying $p^3N^2\to +\infty$ and $\beta<1$. In the current note we prove a quenched Central Limit Theorem for the magnetization for $p$ satisfying $pN \to \infty$ in the high-temperature regime $\beta<1$. We also show a non-standard Central Limit Theorem for $p^4N^3 \to \infty$ at the critical temperature $\beta=1$. For $p^4N^3 \to 0$ we obtain a Gaussian limiting distribution for the magnetization. Finally, on the critical line $p^4N^3 \to c$ the limiting distribution for the magnetization contains a quadratic component as well as an $x^4$-term. Hence, at $\beta=1$ we observe a phase transition in $p$ for the fluctuations of the magnetization.

Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

Description of the model
In this paper we continue our investigation of Ising models on the Erdős-Rényi random graph. These models fall into the category of disordered ferromagnets, see e.g. [12] for a classic survey and [13] for the first mathematically rigorous results. The model we investigate in the present note was introduced by Bovier and Gayrard in [1]. In the same article the authors also prove a law of large numbers type result, which we will describe later.
The general 'architecture' of the model is that of a realization of a directed Erdős-Rényi graph G = G(N, p). This means that on the vertex set {1, . . . , N} the directed edge (i, j) is realized with probability p ∈ (0, 1]. Note that the case i = j is allowed, so the graph G may have loops. The indicator random variables ε_{i,j}, i, j ∈ {1, . . . , N}, which indicate whether an edge (i, j) is present or not, are assumed to be independent. Since the graph is directed, ε_{i,j} and ε_{j,i} may also differ. By definition, (ε_{i,j})_{i,j=1}^N are thus independent Bernoulli random variables with

P(ε_{i,j} = 1) = p = 1 − P(ε_{i,j} = 0).

One major difference of this note from our previous article [16] is that here we only assume that p = p(N) and N satisfy pN → ∞ as N → ∞. Note that this even allows for p smaller than log(N)/N, which is the threshold for G being connected asymptotically almost surely. This is to be contrasted with the situation in [16], where we had to assume that p^3 N^2 → ∞. Another important difference is that in this note we are also able to prove a (non-standard) central limit theorem at the critical inverse temperature β = 1.
Returning to the definition of our model, the Hamiltonian or energy function of the Ising model on G (i.e. a fixed realization of the Erdős-Rényi random graph) is a function H := H_N : {−1, +1}^N → R. This function is given by

H_N(σ) := −(1/(2pN)) Σ_{i,j=1}^N ε_{i,j} σ_i σ_j    (1.1)

for σ = (σ_1, . . . , σ_N) ∈ {−1, +1}^N. With such an energy function H we may associate a Gibbs measure on {−1, +1}^N. This is a random probability measure with respect to the randomness encoded by the (ε_{i,j})_{i,j=1}^N. It is given by

μ_β(σ) := e^{−βH_N(σ)}/Z_N(β),    (1.2)

where β ≥ 0 is called the inverse temperature. The normalizing constant is given by

Z_N(β) := Σ_{σ ∈ {−1,+1}^N} e^{−βH_N(σ)}.    (1.3)

The well-known Curie-Weiss model is the model with p ≡ 1 (of course, then H and μ_β are no longer random). It has been intensively studied in the past; see [10] for a survey. One of the first findings is that the Curie-Weiss model undergoes a phase transition at β = 1. This can be seen, among other things, when considering the magnetization per particle

m_N := m_N(σ) := (1/N) Σ_{i=1}^N σ_i.    (1.4)

In the standard Curie-Weiss model the law of m_N under the Gibbs measure converges to

(1/2)(δ_{m_+(β)} + δ_{m_−(β)}),    (1.5)

where δ_x is the Dirac measure in a point x, m_+(β) is the largest solution of

z = tanh(βz),    (1.6)

and m_−(β) = −m_+(β). Since for β ≤ 1 the equation (1.6) has only the solution z = 0, in this so-called high-temperature regime m_N converges to 0 in probability. For β > 1 the largest solution of (1.6) is strictly positive. Hence the magnetization m_N is asymptotically concentrated in two values, a positive one and a negative one. As shown by Bovier and Gayrard in [1], the same holds true for the dilute Curie-Weiss Ising model defined by (1.1) and (1.2) if pN → ∞.
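To make the phase transition in (1.6) concrete, the largest fixed point of z = tanh(βz) can be computed numerically. A minimal sketch (the function name m_plus is ours, not from the paper); the iteration started at z = 1 decreases monotonically to the largest fixed point:

```python
import math

def m_plus(beta, tol=1e-12, max_iter=100000):
    """Largest solution of z = tanh(beta*z), via fixed-point iteration
    started at z = 1 (monotonically decreasing to the largest fixed point)."""
    z = 1.0
    for _ in range(max_iter):
        z_new = math.tanh(beta * z)
        if abs(z_new - z) < tol:
            return z_new
        z = z_new
    return z

print(m_plus(0.8))   # high temperature: only the solution 0
print(m_plus(1.5))   # low temperature: strictly positive
```

For β ≤ 1 the iteration collapses to 0, matching the concentration of m_N at 0; for β > 1 it converges to the strictly positive m_+(β).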
For β ≤ 1 the magnetization m_N converges to 0 in probability under the Gibbs measure for almost all realizations of the random graph. For β > 1 the distribution of the magnetization again converges to (1/2)(δ_{m_+(β)} + δ_{m_−(β)}), where m_+(β) and m_−(β) are defined as above. Indeed, there is also a central limit theorem for the magnetization in the Curie-Weiss model (see, e.g., [2, 9-11]) when β < 1. In this case √N m_N converges in distribution to a normal random variable with mean 0 and variance 1/(1 − β). Moreover, at β = 1 there is no such standard central limit theorem and one has to scale in a different way. The result is that N^{1/4} m_N converges in distribution to a non-normal random variable with density proportional to exp(−x^4/12) with respect to the Lebesgue measure.
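In the p ≡ 1 case the law of |σ| = Σ_i σ_i under the Gibbs measure is explicit, so the finite-N variance of √N m_N can be computed exactly and compared with the CLT variance 1/(1 − β). A sketch (function name ours; we use the Curie-Weiss weights C(N, k) e^{β(2k−N)^2/(2N)}, which follow from (1.1) with p = 1, and log-weights to avoid overflow):

```python
import math

def var_sqrtN_mN(N, beta):
    """Exact variance of sqrt(N)*m_N under the Curie-Weiss (p = 1) Gibbs
    measure: |sigma| = 2k - N carries weight C(N,k)*exp(beta*(2k-N)^2/(2N))."""
    logw = [math.lgamma(N + 1) - math.lgamma(k + 1) - math.lgamma(N - k + 1)
            + beta * (2 * k - N) ** 2 / (2 * N) for k in range(N + 1)]
    top = max(logw)
    w = [math.exp(lw - top) for lw in logw]       # normalized to avoid overflow
    Z = sum(w)
    # E[(sqrt(N) m_N)^2] = E[|sigma|^2]/N; the mean is 0 by symmetry
    return sum(wi * (2 * k - N) ** 2 for k, wi in enumerate(w)) / (Z * N)

print(var_sqrtN_mN(2000, 0.5))   # close to 1/(1 - 0.5) = 2
```

At N = 2000 and β = 1/2 the result is already within a fraction of a percent of the limiting variance 2.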
Inspired by the first of these results, in [16] we showed the following: denote by L_N the following random element of the space of probability measures on R, denoted by M(R):

L_N := μ_β ∘ (√N m_N)^{−1},    (1.7)

i.e. the law of √N m_N under the Gibbs measure. (Note that L_N is random, because it depends on the random variables ε_{i,j}, i, j ∈ {1, . . . , N}.) Then, if p^3 N^2 → ∞, we showed that L_N converges in probability to the normal distribution with mean 0 and variance 1/(1 − β), denoted by N_{0,1/(1−β)}. Here, we consider L_N as a random variable with values in the space M(R) endowed with some metric generating the weak topology, and N_{0,1/(1−β)} as a deterministic element of the same space.
Note that the situation we analyzed in [16] is essentially different from the situation when the topology of the graph is locally tree-like, as is the case for sparse Erdős-Rényi graphs or in some other models of random graphs. The corresponding situation was analyzed, e.g., in [4, 5] as well as in [6-8, 14, 15].
Comparing our results in [16] with the fluctuation results for the Curie-Weiss model cited above immediately raises two questions: first, is the restriction N^2 p^3 → ∞ necessary for the above statement to hold? And second, can we say anything about the fluctuations of the magnetization at β = 1? For both these questions our techniques in [16] were insufficient. In a nutshell, the key idea there was to consider the following generalization of the partition function:

Z_N(β, g) := Σ_{σ ∈ {−1,+1}^N} g(√N m_N(σ)) e^{−βH_N(σ)}.    (1.8)

Here we took g ∈ C_b(R) to be a continuous, bounded function on R. Note that Z_N(β, g) is a generalization of the partition function, since for g ≡ 1 we obtain Z_N(β) = Z_N(β, 1), i.e. the partition function defined in (1.3). Moreover, we see that

E_{μ_β}[g(√N m_N)] = Z_N(β, g)/Z_N(β, 1),

where, for a fixed disorder (ε_{i,j})_{i,j=1}^N, E_{μ_β} denotes the expectation with respect to the Gibbs measure μ_β.
In [16] we were able to show that for g ≥ 0, g ≢ 0, β < 1, and N^2 p^3 → ∞, the variance of Z_N(β, g) is of lower order than (E Z_N(β, g))^2. Since the class of such functions g is convergence determining for weak convergence, this was enough to prove our main result. However, this approach essentially needs the conditions β < 1 and N^2 p^3 → ∞. The objective of the present note is to analyze the fluctuations of the magnetization when either N^2 p^3 does not tend to infinity, or β = 1.

Main results
The central results of this note concern the quantity L_N defined in (1.7) and two related quantities, which we will call L^1_N and L^3_N because they appear in the first and the third case of theorem 1.2, respectively. They are defined by

L^1_N := μ_β ∘ (N^{1/4} m_N)^{−1}   and   L^3_N := μ_β ∘ (m_N/(p√N))^{−1}.

We will analyze the convergence of L_N in the high-temperature regime β < 1 and the convergence of L^1_N and L^3_N at the critical temperature. To this end we will furnish the set M(R) of probability measures on R with the topology of weak convergence. Recall that this weak topology can be metrized by the Lévy metric

d(μ, ν) := inf{ε > 0 : μ((−∞, x − ε]) − ε ≤ ν((−∞, x]) ≤ μ((−∞, x + ε]) + ε for all x ∈ R}.

Endowed with this metric, M(R) becomes a complete, separable metric space.
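The Lévy metric can be approximated numerically on a grid, which is a useful sanity check when working with it; a grid-based sketch (function names ours), using the standard definition d(F, G) = inf{ε > 0 : F(x − ε) − ε ≤ G(x) ≤ F(x + ε) + ε for all x}:

```python
import math

def norm_cdf(x, mu=0.0, sigma=1.0):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def levy_distance(F, G, lo=-10.0, hi=10.0, n_x=2001, eps_step=0.001):
    """Grid approximation of the Levy metric between two CDFs F and G."""
    xs = [lo + (hi - lo) * i / (n_x - 1) for i in range(n_x)]
    eps = 0.0
    while eps <= 2.0:
        if all(F(x - eps) - eps <= G(x) + 1e-9 and
               G(x) <= F(x + eps) + eps + 1e-9 for x in xs):
            return eps
        eps += eps_step
    return 2.0

# distance between N(0,1) and N(0.5,1): positive, and below the
# Kolmogorov distance 2*Phi(0.25) - 1 ~ 0.197
print(levy_distance(norm_cdf, lambda x: norm_cdf(x, mu=0.5)))
```

As expected, the Lévy distance is bounded above by the Kolmogorov (sup) distance, which the example reproduces.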
Our first result is an extension of theorem 1.1 in [16] to the regime of smaller values of p such that pN → ∞.

Theorem 1.1. Assume that 0 < β < 1 and let p = p(N) be such that pN → ∞ as N → ∞. Then, L_N, considered as a random element of M(R), converges in probability to N_{0,1/(1−β)}. That is to say, for every ε > 0,

P(d(L_N, N_{0,1/(1−β)}) ≥ ε) → 0   as N → ∞.

As can be expected from the results in the Curie-Weiss model and from the fact that at β = 1 the variance of the normal distribution in theorem 1.1 explodes, in the critical case β = 1 one has to scale differently. Interestingly, besides this phase transition in β, we will also find that there is a phase transition in p.
Theorem 1.2.
(a) Assume that β = 1 and let p = p(N) be such that p^4 N^3 → ∞ as N → ∞. Then, L^1_N, considered as a random element of M(R), converges in probability to M. Here M ∈ M(R) is the probability measure with density proportional to exp(−x^4/12) with respect to Lebesgue measure. The convergence in probability means that for every ε > 0,

P(d(L^1_N, M) ≥ ε) → 0   as N → ∞.

(b) If β = 1 and p = p(N) is such that p^4 N^3 → c ∈ (0, ∞) as N → ∞, then L^1_N converges in probability to a probability measure whose density (with respect to Lebesgue measure) contains both a quadratic and an x^4-term in the exponent. As above, the convergence in probability means that for every ε > 0 the Lévy distance to the limit exceeds ε only with vanishing probability.

(c) If again β = 1 and p = p(N) is such that Np → ∞, but p^4 N^3 → 0 as N → ∞, then L^3_N, considered as a random element of M(R), converges in probability to N_{0,12}. Again, this is to say that for every ε > 0,

P(d(L^3_N, N_{0,12}) ≥ ε) → 0   as N → ∞,

where N_{0,12} ∈ M(R) denotes the centered normal distribution with variance 12.

Remark 1.3. Obviously, theorem 1.2 states that there is a phase transition at p = N^{−3/4} for the fluctuations of the magnetization in the dilute Curie-Weiss model at criticality. There are at least two obvious guesses where this phase transition may come from. One is that at p = p(N) = N^{−3/4} some asymptotically almost sure property of the random graph appears or disappears. However, even though there are many such properties, it is hard to imagine which property this might be, and why it would affect the critical fluctuations but not those at β < 1.
The other guess is the one that sounds more plausible: for the Curie-Weiss model on a random graph the fluctuations of the degree distribution may also influence the fluctuations of the magnetization. On the other hand, for p(N) → 0 the expectation and the variance of the degree of a vertex are both about pN. For the critical value for p in theorem 1.2, p = N^{−3/4}, this gives N^{1/4}. Since the variance of the magnetization at β = 1 and p much larger than N^{−3/4} is of the order N^{3/2}, it is difficult to see why fluctuations in the degrees of order N^{1/4} should play any particular role.
On a technical level, this phase transition occurs because of the p tanh(γ)|σ|^2-term in (4.3) and the resulting approximations of E Z̃^1_N(β, g) in proposition 4.3, in particular in proposition 4.3(b). This, in turn, is a result of the centering in the exponent in the definition of T(σ), see (3.1). As long as p is 'large', the term p tanh(γ)|σ|^2 is well approximated by pγ|σ|^2 = |σ|^2/(2N), and this term cancels with the leading order of the entropy (at β = 1). For p much larger than N^{−3/4} the third-order term in the expansion of p tanh(γ)|σ|^2 (in γ) is much smaller than the next (i.e. fourth-order) term of the entropy. However, for p of the order N^{−3/4} the two terms are of the same order, and for p even smaller the third-order term in the expansion of p tanh(γ)|σ|^2 becomes dominant.
However, we cannot find a good graph-theoretic interpretation for the appearance of this term.
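The competition just described can be made concrete: with γ = 1/(2Np) at β = 1, the ratio of the third-order term of p tanh(γ)k^2 to the fourth-order entropy term k^4/(12N^3), evaluated at the critical scale k^2 = N^{3/2}, equals 1/(2p^2 N^{3/2}) = 1/(2√(p^4N^3)). A sketch (the function name is ours):

```python
# At beta = 1, compare the cubic term of p*tanh(gamma)*k^2 (gamma = 1/(2Np))
# with the quartic entropy term k^4/(12 N^3) at the critical scale k^2 = N^1.5.
def term_ratio(N, p):
    k2 = N ** 1.5                                     # |sigma|^2 ~ N^(3/2)
    cubic = p * (1.0 / (2 * N * p)) ** 3 * k2 / 3.0   # third-order term of p*tanh(gamma)*k^2
    quartic = k2 ** 2 / (12.0 * N ** 3)               # fourth-order entropy term
    return cubic / quartic                            # equals 1/(2 p^2 N^(3/2))

N = 10 ** 6
for a in (0.5, 0.75, 0.9):                            # p = N^(-a)
    print(a, term_ratio(N, N ** (-a)))
```

The ratio is small when p^4N^3 → ∞ (a < 3/4), of order one at a = 3/4, and large when p^4N^3 → 0, exactly reproducing the phase transition of theorem 1.2.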

Technical preparation
In this section we will prepare for the proofs of the main theorems 1.1 and 1.2. The study of the following function is motivated by taking the expectation E, with respect to the randomness given by ε_{i,j}, i, j ∈ {1, . . . , N}, of the random variable Z_N(β, g) for some function g ∈ C_b(R). Using the independence of the (ε_{i,j})_{i,j} we will then have to take expectations of the form

E[e^{γ ε_{i,j} σ_i σ_j}] = 1 − p + p e^{γ σ_i σ_j},    (2.1)

where β ≤ 1 is some fixed value and we set γ := β/(2Np). Noting that γ becomes small when N becomes large, we are led to analyze the function log(1 − p + pe^z) for z small and 0 ≤ p ≤ 1.
More precisely, for an integer m and arbitrary complex variables z and p let us define the slightly more general functions

F_m(x, p, z) := log(1 − p + p e^{xz − m log cosh(z)}).    (2.2)

In this section we do not require that p is a probability and consider p as a complex-valued quantity. We will next compute the power series expansions of some linear combinations of F_m(x, p, z). Here, the x and m variables will be given, and the linear combinations will be expanded in the p and z variables around the origin (0, 0). For any fixed x and m, if |p| < 2 and |z| < z_0 with sufficiently small z_0 > 0, then |p(e^{xz − m log cosh(z)} − 1)| < 1.
Thus, for fixed x and m the function F_m(x, p, z) is an analytic function of two complex variables p and z on the domain

D := {(p, z) : |p| < 2, |z| < z_0}.

As such, it has a power series expansion which converges uniformly and absolutely on compact subsets of this domain. In particular, by absolute convergence, we can re-arrange and re-group the terms arbitrarily. For example, using e^z/cosh(z) = 1 + tanh(z), the three combinations appearing below simplify to

F_1(1, p, z) = log(1 + p tanh(z)),   F_2(2, p, z) = log(1 + 2p tanh(z) + p tanh^2(z)),

and finally

F_2(0, p, z) = log(1 − p tanh^2(z)).

The next lemma collects the expansions which we shall need below.
Lemma 2.1. We have the expansions (2.3)-(2.7); in particular, (2.4) reads

F_1(1, p, z) − F_1(1, p, −z) = 2p tanh(z) + Σ_{n=2}^∞ Q̃^{(1)}_n(z) p^n z^3,    (2.4)

with functions Q̃^{(1)}_n analytic on {|z| < z_0}.

Proof. The proofs of all formulae are similar. As an example, we prove (2.4). To this end, we first expand the analytic function F_1(1, p, z) − F_1(1, p, −z) in a Taylor series of the form

F_1(1, p, z) − F_1(1, p, −z) = Σ_{n=0}^∞ Q^{(1)}_n(z) p^n.

This expansion converges absolutely and uniformly on compact subsets of D, and the coefficients Q^{(1)}_n(z) are analytic functions on {|z| < z_0}. We need to compute the functions Q^{(1)}_n(z) for n = 0, 1. This is done by the formula

Q^{(1)}_n(z) = (1/n!) (∂^n/∂p^n)[F_1(1, p, z) − F_1(1, p, −z)] |_{p=0}.

Taking n = 0 and n = 1, we obtain that Q^{(1)}_0(z) = 0 and Q^{(1)}_1(z) = 2 tanh(z). That is, the expansion takes the form

F_1(1, p, z) − F_1(1, p, −z) = 2p tanh(z) + Σ_{n=2}^∞ Q^{(1)}_n(z) p^n.

To see that each Q^{(1)}_n(z), n ≥ 2, does not contain terms of the form c_0, c_1 z, c_2 z^2, we consider the function

G(p, z) := F_1(1, p, z) − F_1(1, p, −z) − 2p tanh(z).

First observe that G(p, 0) = 0. Moreover, differentiating with respect to z, one checks that the first and the second derivative both vanish at z = 0, thus proving that we have an expansion of the form

F_1(1, p, z) − F_1(1, p, −z) = 2p tanh(z) + Σ_{n=2}^∞ Q̃^{(1)}_n(z) p^n z^3,

where the coefficients Q̃^{(1)}_n are analytic functions on {|z| < z_0}. This proves (2.4).
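The closed form of F_1(1, p, z) and the cubic-in-z structure of the remainder in (2.4) can be checked numerically; a sketch assuming only the identity e^z/cosh(z) = 1 + tanh(z) (function names are ours):

```python
import math

def F1(p, z):
    # F_1(1, p, z) = log(1 - p + p * e^{z - log cosh(z)})
    return math.log(1.0 - p + p * math.exp(z - math.log(math.cosh(z))))

p = 0.3
# closed form: F_1(1, p, z) = log(1 + p*tanh(z)), since e^z/cosh(z) = 1 + tanh(z)
for z in (0.1, -0.7, 1.3):
    assert abs(F1(p, z) - math.log(1.0 + p * math.tanh(z))) < 1e-12

def remainder(p, z):
    # odd part minus the leading term 2*p*tanh(z); by (2.4) this is O(p^3 z^3)
    return F1(p, z) - F1(p, -z) - 2.0 * p * math.tanh(z)

for z in (0.2, 0.1, 0.05):
    print(z, remainder(p, z) / z ** 3)   # approaches 2*p^3/3 as z -> 0
```

The ratio remainder/z^3 stabilizes near 2p^3/3, confirming that the remainder carries both a z^3 factor and at least a p^3 factor.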

Proof of theorem 1.1
In this section we will prove theorem 1.1. The main difference between the considerations below and the proof of the CLT for √N m_N given in [16] is that here we replace the quantity Z_N(β, g) defined in (1.8) by a term which allows an asymptotic expansion also for smaller values of p. To this end, for σ ∈ {±1}^N we introduce the quantity T(σ) in (3.1), whose exponent is centered by a log cosh(γ)-term, and recall that for fixed β < 1 we defined γ := β/(2Np). Moreover, let

Z̃_N(β, g) := Σ_{σ ∈ {±1}^N} g(|σ|/√N) T(σ),   where |σ| := Σ_{i=1}^N σ_i.

We will study the behavior of Z̃_N(β, g). To this end, we introduce the following set of pairs of configurations:

S_N := {(σ, τ) ∈ {±1}^N × {±1}^N : |σ|^2 ≤ N(Np)^{1/5}, |τ|^2 ≤ N(Np)^{1/5}, |στ|^2 ≤ N(Np)^{1/5}}.

Here we set |στ| := Σ_{i=1}^N σ_i τ_i. The spin configurations in S_N will be called the typical spin configurations in the rest of this section; those in the complement of S_N will be called atypical. In a slight abuse of notation, we say that a configuration σ ∈ {±1}^N is typical and write σ ∈ S_N. By this we will mean that |σ|^2 ≤ N(Np)^{1/5}.
We start with the following computation.

Lemma 3.1. For all p = p(N) such that pN → ∞ and all σ ∈ S N we have
Proof. Similarly to the computation of (2.1), we compute E[T(σ)] as a product over the pairs (i, j). Defining the corresponding function f, observe that since σ_i ∈ {±1} for all i, we are interested in the behavior of f only at the two values +1 and −1. For these two values we can rewrite f in a linear form. This means that we write

f(x) = a_0 + a_1 x,   x ∈ {−1, +1}.

Here a_0 and a_1 depend on p and γ, of course, and are given by

Recalling (2.3), we obtain the corresponding expansion of a_0.
On the other hand, by (2.4),

a_1 = p tanh(γ) + O(p^2 γ^3),

with an o(1)-term that is uniform over all σ ∈ S_N. Indeed, for a typical configuration σ we have |σ|^2 ≤ N(Np)^{1/5}, so that the error terms vanish uniformly. This proves the claim.
Since eventually we want to compare V(Z̃_N(β, g)) to [E Z̃_N(β, g)]^2, in the next step we will compute E[T(σ)T(τ)] for typical σ and τ.
Proof. The idea of the proof is similar to the proof of lemma 3.1. Taking the expectations as in the proof of lemma 3.1 we obtain a product over the pairs (i, j), where the exponent function g is set up in terms of F_2 as defined in (2.2). Note that we are just interested in g(0) and g(±2), since both σ and τ are in {±1}^N and hence σ_iσ_j + τ_iτ_j ∈ {−2, 0, 2}. For these values, the function g can be represented in the form

g(σ_iσ_j + τ_iτ_j) = b_0 + b_1 σ_iσ_j + b_2 τ_iτ_j + b_{12} σ_iσ_jτ_iτ_j,

with coefficients b_0, b_1, b_2, b_{12} that again depend on p and γ. They can be computed by solving a system of four linear equations in four variables, and this system of equations has a unique solution. Using (2.5)-(2.7), we see that the b-coefficients admit expansions analogous to the one for a_1, and hence

E[T(σ)T(τ)] = (1 + o(1)) E[T(σ)] E[T(τ)]

for all typical (σ, τ) ∈ S_N, where the o(1)-terms are uniform. This proves the claim.

Combining lemmas 3.1 and 3.2 yields the following corollary.

Corollary 3.3. Uniformly over typical pairs (σ, τ) ∈ S_N, Cov(T(σ), T(τ)) = o(E[T(σ)] E[T(τ)]).

Proof. This follows directly from the definition of the covariance and lemmas 3.1 and 3.2.
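The representation step in this proof can be checked directly: writing u = σ_iσ_j and v = τ_iτ_j, any function of u + v on {−1, +1}^2 can be written with four coefficients; the four evaluation points give the four linear equations. A sketch (the solved formulas below are ours):

```python
import itertools

def coeffs(g):
    """Coefficients b0, b1, b2, b12 with
    g(u + v) = b0 + b1*u + b2*v + b12*u*v for u, v in {-1, +1};
    solved from the four evaluation points (u, v) in {-1,+1}^2."""
    b0 = (g(2) + 2 * g(0) + g(-2)) / 4.0
    b1 = (g(2) - g(-2)) / 4.0            # = b2, by symmetry in u and v
    b12 = (g(2) - 2 * g(0) + g(-2)) / 4.0
    return b0, b1, b1, b12

g = lambda x: 1.0 / (1.0 + x * x)        # any function of sigma_i sigma_j + tau_i tau_j
b0, b1, b2, b12 = coeffs(g)
for u, v in itertools.product((-1, 1), repeat=2):
    assert abs(g(u + v) - (b0 + b1 * u + b2 * v + b12 * u * v)) < 1e-12
print("representation holds on {-1,+1}^2")
```

The unique solvability claimed in the proof corresponds to the invertibility of this 4 x 4 system.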
Note that the claim would not be true without the log cosh(γ)-term in the definition of T(σ). The role of this term is to make the T(σ) 'asymptotically independent'. Corollary 3.3 already suggests that we have V(Z̃_N(β, g)) = o((E Z̃_N(β, g))^2) for all g ∈ C_b(R), g ≥ 0, g ≢ 0. However, to prove this we still need to control the contribution of the atypical configurations. This is done in the following propositions.
The first of them may be interesting in its own right; moreover, its proof is also a nice warm-up for the proof of proposition 3.5 thereafter, which is similar but technically more demanding.
Here ξ denotes a standard normally distributed random variable.
Proof. By an obvious decomposition of {±1}^N into typical and atypical σ's, we split E Z̃_N(β, g) into two sums. For σ ∈ S_N we can use the result of lemma 3.1. Letting ‖g‖_∞ := sup_{t∈R} |g(t)| < ∞ and using the same arguments as in the proof of lemma 3.1, we can bound E[T(σ)] with some sequence of constants C_{N,1} that does not depend on σ and stays bounded. Hence the atypical contribution is bounded by an expression involving some absolute constant C, which here and in the sequel may change from line to line. Since β < 1 and C_{N,1} stays bounded, there is δ > 0 such that the exponent is at most −δ|σ|^2/(2N), and the atypical sum is seen to be negligible by approximating it by an integral over a larger domain. Finally, we prove that

2^{−N} Σ_{σ} g(|σ|/√N) e^{β|σ|^2/(2N)} → E[g(ξ) e^{βξ^2/2}],

where ξ is a standard normal random variable. As in [16], proof of theorem 1.4, this follows from the central limit theorem of de Moivre-Laplace applied to |σ|/√N together with the uniform integrability of the sequence g(|σ|/√N) e^{β|σ|^2/(2N)} for β < 1 (see [10], proof of theorem 5.9.4). We are now ready to prove that our guess about the size of the variance of Z̃_N(β, g) was correct.
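The de Moivre-Laplace step can be illustrated numerically: the weighted binomial sum above converges to E[g(ξ)e^{βξ^2/2}]. A sketch for g = cos and β = 1/2 (an illustration with our own function names, not the paper's computation):

```python
import math

def gibbs_sum(N, beta, g):
    """2^{-N} sum_k C(N,k) g((2k-N)/sqrt(N)) e^{beta (2k-N)^2/(2N)};
    by de Moivre-Laplace this tends to E[g(xi) e^{beta xi^2/2}]."""
    s = 0.0
    for k in range(N + 1):
        x = (2 * k - N) / math.sqrt(N)
        logw = (math.lgamma(N + 1) - math.lgamma(k + 1)
                - math.lgamma(N - k + 1) - N * math.log(2.0)
                + beta * x * x / 2.0)
        s += math.exp(logw) * g(x)
    return s

def gaussian_limit(beta, g, h=1e-3, L=12.0):
    """Riemann-sum evaluation of E[g(xi) e^{beta xi^2/2}] for xi ~ N(0,1)."""
    n = int(L / h)
    total = sum(g(i * h) * math.exp((beta - 1.0) * (i * h) ** 2 / 2.0)
                for i in range(-n, n + 1)) * h
    return total / math.sqrt(2.0 * math.pi)

print(gibbs_sum(20000, 0.5, math.cos), gaussian_limit(0.5, math.cos))
```

For g = cos and β = 1/2 the limit is sqrt(2)·e^{-1}, and the finite-N sum agrees to several digits already at N = 20000.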
Proof. The idea of the proof is similar to the previous one. Again we decompose the term of interest into the contributions of the typical and the atypical pairs. We already saw in corollary 3.3 that the covariances are uniformly of lower order on the typical pairs. This, together with the fact that g is bounded, implies that the contribution of the typical pairs to the variance is o((E Z̃_N(β, g))^2). On the other hand, since g ≥ 0, the sum over the typical configurations is bounded by the sum over all configurations, and the square of the right-hand side is (E Z̃_N(β, g))^2.
From the proof of proposition 3.4 we see that for any σ, τ the corresponding bound holds, where we recall that the sequence of constants C_{N,1} does not depend on σ and τ and is bounded. Similarly, along the lines of the proof of lemma 3.2, we obtain analogous bounds, where also the sequences of constants C_{N,2} and C_{N,3} do not depend on σ and τ and stay bounded.
To treat the second sum on the right-hand side of (3.12), denote by V_N(k, l, m) the set of pairs (σ, τ) with the corresponding values of the three statistics |σ|, |τ|, |στ|, and by ν_N(k, l, m) = #V_N(k, l, m) the number of such pairs. Again, we want to apply a local limit theorem. To this end, note that if σ and τ are taken independently and uniformly at random from {±1}^N, then the statistics |σ|/√N, |τ|/√N, |στ|/√N have mean 0, and their covariance matrix is the 3 × 3 identity matrix, because the sums Σ_i σ_i, Σ_i τ_i, and Σ_i σ_iτ_i are pairwise uncorrelated with variance N each. The three-dimensional local central limit theorem [3] tells us that there is a universal constant C such that ν_N(k, l, m) is approximated accordingly. Combining this with (3.15), we see that for any (k, l, m) ∈ Z^3 the corresponding contribution is controlled, and since β < 1, we obtain the required exponential bound for some δ > 0 and N large enough.
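That the three normalized statistics are uncorrelated with unit variance, as used for the local limit theorem, can be spot-checked by simulation (a sketch; the helper name is ours):

```python
import random

def empirical_cov(N=100, trials=10000, seed=7):
    """Monte Carlo covariance of (|sigma|, |tau|, |sigma tau|)/sqrt(N) for
    sigma, tau independent, uniform on {-1,+1}^N; should be close to I_3,
    since the three sums are pairwise uncorrelated with variance N each."""
    random.seed(seed)
    samples = []
    for _ in range(trials):
        s = [random.choice((-1, 1)) for _ in range(N)]
        t = [random.choice((-1, 1)) for _ in range(N)]
        samples.append((sum(s) / N ** 0.5, sum(t) / N ** 0.5,
                        sum(a * b for a, b in zip(s, t)) / N ** 0.5))
    m = [sum(v[i] for v in samples) / trials for i in range(3)]
    return [[sum((v[i] - m[i]) * (v[j] - m[j]) for v in samples) / trials
             for j in range(3)] for i in range(3)]

C = empirical_cov()
for row in C:
    print([round(c, 2) for c in row])
```

The empirical covariance matrix is close to the identity, as the exact computation E[σ_iτ_i σ_jτ_j] = δ_{ij} predicts.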
In a very similar way we can bound the sum of the E[T(σ)]E[T(τ)]-terms over (σ, τ) ∈ V_N(k, l, m). Employing the same notation as in the previous step, we obtain an analogous bound for some δ > 0 and N sufficiently large, and can therefore conclude a bound on the full atypical contribution. Recalling that the function g is bounded, we arrive at a sum over the set

{(x, y, z) : |x| > (Np)^{1/10} or |y| > (Np)^{1/10} or |z| > (Np)^{1/10}}.

Arguing as in (3.10), this sum can again be bounded from above by the corresponding integral over a larger domain. Including the pre-factor N^{−3/2}, this yields a vanishing bound, because Np → ∞ as N → ∞. Combining these considerations with proposition 3.4, we obtain the claimed estimate. Finally, the assertion follows from a combination of the estimates for the typical configurations and the atypical configurations.
We are now ready to prove theorem 1.1.
Proof of theorem 1.1. Proposition 3.5 shows that, as long as pN → ∞,

Z̃_N(β, g) / E Z̃_N(β, g) → 1   in L^2,

for all non-negative g ∈ C_b(R), g ≢ 0.
By Chebyshev's inequality, this convergence holds in probability as well. Moreover, consider the quantity Z_N(β, g) defined in (1.8). Note that Z_N(β, g) and Z̃_N(β, g) differ by a deterministic factor only, so that

E_{μ_β}[g(√N m_N)] = Z_N(β, g)/Z_N(β, 1) = Z̃_N(β, g)/Z̃_N(β, 1).    (3.17)
Recall that for the convergence of the random probability measure L_N defined in (1.7) we need to consider its integral against all non-negative g ∈ C_b(R). However, this integral is exactly E_{μ_β}[g(√N m_N)], and from (3.17) we obtain that it equals Z̃_N(β, g)/Z̃_N(β, 1). It follows from (3.16) and proposition 3.4 that

Z̃_N(β, g)/Z̃_N(β, 1) → E[g(ξ) e^{βξ^2/2}] / E[e^{βξ^2/2}]

in probability, where ξ is a standard normally distributed random variable. However, the right-hand side of this equation is nothing but the integral ∫ g dN_{0,1/(1−β)}, which completes the proof.

Proof of theorem 1.2
The core idea in the proof of theorem 1.2 is similar to that of the proof in the previous section. Therefore, we will use most of the definitions introduced there. However, as can be seen from the result, the relevant spin configurations are no longer those where |σ|^2 is of order N (and the same for |τ|^2). We therefore introduce two new sets of configurations. Here h(x) is an increasing function going to infinity in such a way that h(Np)/(N^{3/2}p^2) → 0 and (Np)^{1/10}/h(Np) → ∞ as N → ∞. The regime R^1_N will be relevant whenever β = 1 and p^4N^3 does not converge to 0.
In the regime when p^4N^3 → 0 we define a second set. Here h′(x) is an increasing function going to infinity subject to analogous conditions. We are able to find h′ such that the third condition is fulfilled, because we can take h′(Np) = (Np)^{1/10}, which is in agreement with our above choices. R^1_N will be the set of the typical pairs of spin configurations for the proof of the first and the second part of the theorem, while R^2_N will play the same role in the proof of the third part of the theorem. Thus, whenever we consider the set R^1_N we will tacitly assume that p^4N^3 does not converge to 0, while, when we use R^2_N, we will think of p being such that p^4N^3 → 0. These sets will replace the set S_N from the previous section.
For the rest of this section we will keep β = 1 fixed. Our first step is an analogue of lemma 3.1. Note, however, that the error term in the subsequent lemma is uniform over all σ's rather than only over the typical ones.
Proof. The proof is almost verbatim the same as the proof of lemma 3.1. We follow that proof and write the expectation in product form. Since σ_i ∈ {±1} for all i, we can rewrite f in linear form as before. The only difference is that we expand the expression using lemma 2.1, in particular (2.4). This gives

a_1 |σ|^2 = p tanh(γ)|σ|^2 + O(p^2 γ^3 |σ|^2).
The following is the analogue of lemma 3.2. Again, the error term is uniform over all pairs (σ, τ).

Lemma 4.2. Let β = 1. For p = p(N) such that pN → ∞ and any pair (σ, τ) ∈ {±1}^N × {±1}^N we have
Proof. The proof follows the idea of the proof of lemma 3.2. We will apply similar changes as those that were made when going from the proof of lemma 3.1 to the proof of lemma 4.1.
Following the proof of lemma 3.2, we again write the expectation in product form and represent g in the linear form used there. We now expand the coefficients b_1 and b_2 using (2.5) and (2.6). Again, p^2 γ^3 |σ|^2 ≤ 1/(8Np) and p^2 γ^3 |στ|^2 ≤ 1/(8Np) go to 0 uniformly over all (σ, τ). This proves the desired statement.
We are now ready to compute expectations of g ∈ C_b(R) with respect to the Gibbs measure. Here the different regimes of p occur. We will adapt our definition of Z̃_N(1, g) to our requirements: for g ∈ C_b(R), g ≥ 0, g ≢ 0, we define the analogous quantities Z̃^1_N(1, g) and Z̃^3_N(1, g). Proof. We start by proving part (a). Similarly to the proof of proposition 3.4, we decompose the sum into the contributions of R^1_N and its complement. For σ ∈ R^1_N we can use the result of lemma 4.1, with an o(1)-term that is uniform over R^1_N. Note that we are in the regime where p^4N^3 → ∞ and that on R^1_N we have |σ|^2 ≤ N^{3/2}h(Np); thus the error terms vanish by the definition of γ, h, and R^1_N. We will first show that the sum over R^1_N is the dominant term, i.e. we claim that the complement contributes negligibly (4.10). Letting ‖g‖_∞ := sup_{t∈R} |g(t)| < ∞, we use lemma 4.1 to obtain an estimate for the complement. Relative to the overall scale 2^N N^{1/4}, the terms with k = ±N do not contribute to the above sum, because p tanh(γ)k^2 ≤ N/2 and e^{1/2} < 2. For the other terms, we will exploit the Stirling formula

log(n!) = n log n − n + (1/2) log(2π) + (1/2) log(n + 1/2) + O(1/n)   for n ≥ 1.

The term log(n + 1/2) differs from the usual representation with log n by O(1/n) and is included to treat the boundary cases k = ±n below. The Stirling formula yields the crude bound

C(N, (N + k)/2) ≤ 2^N e^{−N I(k/N) − λ_N(k)},

with a correction term λ_N(k) coming from the lower-order Stirling terms and

I(x) := ((1 + x)/2) log(1 + x) + ((1 − x)/2) log(1 − x).    (4.12)

We will apply the bound in the range k^2 ≤ N^2/4. Note that I is an even function and that the Taylor expansion of N I(k/N) is given by

N I(k/N) = Σ_{j≥1} d_{2j} k^{2j} / N^{2j−1}.    (4.13)

The d_{2j} are the Taylor coefficients of I; in particular d_2 = 1/2 and d_4 = 1/12. By using these facts, the inequality tanh(γ) ≤ γ, and truncating the Taylor expansion in (4.13) after the fourth-order term, we obtain the desired estimate. For the values of k such that k^2 ≥ N^2/4, again by the above expansion we obtain an exponentially small bound for some constant C > 0, because now −λ_N(k) is bounded from above by a constant.
Again using the Taylor expansion (4.13), truncating it after the fourth-order term, and using tanh(γ) ≤ γ, we obtain a bound that, by a very similar argument as in (3.10), can again be controlled by the corresponding integral. The integral ∫_R e^{−x^4/12} dx is finite, the domain of integration {x^2 ≥ h(Np)/2} converges to the empty set, and therefore, using for example dominated convergence, the integral goes to 0. Thus indeed the complement of R^1_N is negligible. To estimate the sum over σ ∈ R^1_N, we use the following exact asymptotics, also following from the Stirling formula:

C(N, (N + k)/2) = (1 + o(1)) 2^N √(2/(πN)) e^{−N I(k/N)},    (4.18)

where the o(1)-term is uniform in the specified range. We used that on R^1_N the λ_N(k)-term uniformly goes to 0. On the other hand, the sum over R^1_N can now be evaluated by the above identity for the binomial coefficient and the Taylor expansion of the function I.
Again, the k^2/(2N)-term cancels. Moreover, for k^2 ≤ N^{3/2}h(Np) only the first summand of Σ_{j≥2} d_{2j} k^{2j}/N^{2j−1} survives, and thus, since d_4 = 1/12, we obtain the e^{−k^4/(12N^3)}-behavior with an o(1)-term that is uniform over all k with k^2 ≤ N^{3/2}h(Np). Hence the claimed asymptotics follow. This proves (a).
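The cancellation of the k^2/(2N)-term against the entropy, leaving the −k^4/(12N^3)-behavior, can be seen numerically: using the leading approximation p tanh(γ)k^2 ≈ k^2/(2N) at β = 1, the log-weight of k ≈ x N^{3/4} relative to k = 0 tends to −x^4/12, the exponent of the critical density. A sketch (function name ours):

```python
import math

def critical_log_weight(N, x):
    """log of C(N,(N+k)/2) * e^{k^2/(2N)} relative to k = 0, at k ~ x N^{3/4}
    (k rounded to an even integer; N even). The quadratic parts cancel and
    the limit as N grows is -x^4/12."""
    k = 2 * round(x * N ** 0.75 / 2)
    logC = (math.lgamma(N + 1) - math.lgamma((N + k) // 2 + 1)
            - math.lgamma((N - k) // 2 + 1))
    logC0 = math.lgamma(N + 1) - 2 * math.lgamma(N // 2 + 1)
    return logC - logC0 + k * k / (2 * N)

N = 10 ** 8
for x in (0.5, 1.0, 1.5):
    print(x, critical_log_weight(N, x), -x ** 4 / 12)
```

At N = 10^8 the computed log-weights agree with −x^4/12 to about three decimal places, in line with the uniform o(1)-terms above.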
The proof of (b) is similar to the one of (a). Again we decompose the sum into the contributions of R^1_N and its complement, and by lemma 4.1 the complement is negligible; the proof of this follows almost verbatim the proof of the corresponding fact in (a).
On the other hand, again using identity (4.18) for the binomial coefficient and the Taylor approximation (4.13) of I, the sum over R^1_N can be evaluated up to a factor 1 + o(1).

This proves (b).
To show (c) we will use the set R^2_N and decompose the sum of interest as in (4.19). For the first summand on the right-hand side notice that the asymptotics hold with an o(1)-term that is uniform over R^2_N; indeed, expanding tanh(γ), one sees that the relevant error terms vanish uniformly. For the second sum on the right-hand side in (4.19) we apply a bound with some constant c > 0, by the same considerations as before, and split the sum on the right-hand side into two parts. Using the very same technique as in the first part of this proof we see that the second part is negligible. Turning to the first sum on the right-hand side of (4.20), we again bound λ_N(k) by a constant. Moreover, we expand tanh(γ) = Σ_{j=0}^∞ c_{2j+1} γ^{2j+1} and I as in (4.13). The quadratic terms of these expansions cancel and we are left with estimating the remaining terms. Here the c_j are the Taylor coefficients of tanh, while the d_j are the Taylor coefficients of I, as above. Now, because the Taylor series of tanh is alternating and γ ≤ 1, the truncation error is controlled by the first omitted term. On the other hand, for k with k^2 ≤ N(Np)^2 h′(Np) we can control the remaining terms by our assumptions on h′. The last step is again an argument in the spirit of (3.10): the sum is bounded by replacing it by the corresponding finite integral (over all of R).
This proves the desired asymptotics, again by (4.18) and the Taylor expansion of I given in (4.13). Note that the o(1)-term is uniform in k with k^2 ≤ N(Np)^2 h′(Np) by the assumption on h′. This shows the assertion.
Proof. The proof combines the ideas of the proofs of proposition 3.5 and lemma 4.3. The three cases (a)-(c) are similar. We start with a detailed proof of (a). Proof of (a). We will show that the sum over the typical pairs of configurations, that is over R^1_N, dominates. Indeed, by lemmas 4.1 and 4.2 and the fact that g is non-negative, we see that Cov(T(σ), T(τ)) = o(E[T(σ)] E[T(τ)]) uniformly over (σ, τ) ∈ R^1_N, so that, just as in the proof of proposition 3.5, the contribution of the typical pairs to the variance is negligible. Let us now turn to the non-typical pairs, that is to summands with (σ, τ) ∉ R^1_N. Here we cannot simply apply the local limit theorem as in section 3. Again, denote by V_N(k, l, m) the set of pairs with the corresponding values of the three statistics. We are thus left with analyzing spin configurations related to the complement of R^1_N. Dividing ν_N(k, l, m) by 2^{2N} turns it into a probability mass function, which can be written in terms of a conditional probability as follows. Here, P_unif is a probability distribution under which (σ, τ) is uniformly distributed on {±1}^N × {±1}^N. Using the hypergeometric distribution, we compute ν_N(k, l, m) in (4.24). Using either (4.11) together with e^{−λ_N(k)} ≤ √N, or alternatively the exponential Markov inequality, we see that the required bound holds for all |k| ≤ N and |l| ≤ N. Here again I is given by (4.12) and we used its Taylor expansion (4.13) (which consists of positive terms only) up to fourth order.
We will first treat the cases where either |k| ≥ N/3 or |l| ≥ N/3. Employing our well-known estimates m^2 ≤ N^2 and p tanh(γ) ≤ pγ = 1/(2N), the sum over (σ, τ) ∈ V_N(k, l, m) is exponentially negligible in this range. We now proceed to the case where |k| < N/3 and |l| < N/3. Recall that (4.11) states the Stirling-type bound (for n ∈ Z, M ∈ N and |n| ≤ M). Applying this to the binomial coefficients in (4.24) and bounding the log-correction in the exponent of the denominator by 0, we obtain a bound involving a convex combination with weights (N + k)/(2N) and (N − k)/(2N), respectively. Moreover, not only is I itself a convex function, but, considering its Taylor expansion (as in (4.13)) I(x) = Σ_{j≥1} d_{2j} x^{2j} with positive coefficients d_{2j}, we see that it is a positive linear combination of convex functions. Using that d_2 = 1/2, we obtain the desired estimate, where for the inequality we used a termwise bound for each j ≥ 2. Keeping in mind that we already eliminated the cases where k^2 or l^2 are larger than N^2/9, we only have to prove that the remaining contribution is negligible. In view of our last estimates this term can be bounded using e^{−λ_N(k)−λ_N(l)} ≤ C when k^2 ≤ N^2/9 and l^2 ≤ N^2/9. Our aim is to prove that the right-hand side is o(√N). We include the set of remaining triples (k, l, m) in the union of two sets R^1_{N,1} and R^1_{N,2}. First we consider the case when (k, l, m) ∈ R^1_{N,1} and |m| ≥ N/5. We claim that this contribution vanishes. Indeed, as in (4.14) we bound e^{−λ_{N+k}(m+l) − λ_{N−k}(l−m)} ≤ √((N + k)(N − k)), such that we only need to show the smallness of the remaining sum. For |m| ≥ N/5 we can estimate the relevant exponent from below by cN for some constant c > 0, and hence the claim follows, because the number of terms is at most N^3 and m^2 ≤ N^2; the right-hand side goes to 0 because tanh^2(γ) ≤ γ^2 and pN → ∞. Now we consider the case when (k, l, m) ∈ R^1_{N,1} and |m| ≤ N/5. In this case, we can estimate e^{−λ_{N+k}(m+l) − λ_{N−k}(l−m)} ≤ C for another appropriate constant C > 0. In order to treat the two terms involving m, note that pN(1 − p) tanh^2(γ) → 0. Thus, given ε > 0, we have 8pN(1 − p) tanh^2(γ) − 1 ≤ −1 + ε for N large enough.
Therefore, if ε > 0 is small enough and we write C_1 for the resulting constant in (4.27), applying these estimates we arrive at a bound on the sum over (k, l, m) ∈ R^1_{N,1}, where we used the fact that on R^1_{N,1} one has a lower bound with some C_3 > 0, and that for any k, l the corresponding factor is bounded by a constant C_4 > 0 that does not depend on k and l. But for N sufficiently large we can bound the sum on the right-hand side of (4.28) by an integral. This is again a step that follows the considerations in (3.10). On the other hand, turning to R^1_{N,2} and applying the same estimates as above, we obtain a similar bound for some constant C > 0 that may change from line to line. We have used the bound √(N/((N + k)(N − k))) ≤ C/√N and a monotonicity argument (assuming for concreteness that m > 0). Finally, we observed that the remaining sum can be bounded by a finite integral and is thus itself bounded. Hence we may bound the right-hand side of (4.30) by a constant times a vanishing term, and therefore V(Z̃^1_N(β, g)) = o((E Z̃^1_N(β, g))^2).
As remarked above, this settles the contribution of the typical pairs as well. Combining this with (4.23) completes the proof of (a). Proof of (b). The proof of (b) is basically the same as the one for (a). Indeed, again we write the variance as a sum over typical and non-typical pairs. As in (a), we can use lemmas 4.1 and 4.2 to see that uniformly in (σ, τ) ∈ R^1_N we have Cov(T(σ), T(τ)) = o(E[T(σ)] E[T(τ)]). As in the proof of proposition 3.5, we have

Σ_{(σ,τ) ∈ R^1_N} Cov(T(σ), T(τ)) = o((E Z̃^1_N(β, g))^2).    (4.32)

The contribution of the pairs (σ, τ) ∉ R^1_N can be bounded exactly as in (a). Proof of (c). The proof of (c) is not the same, but similar to the proof of (a). In view of proposition 4.3(c), it suffices to show that the variance of Z̃^3_N(β, g) is of lower order than (E Z̃^3_N(β, g))^2. We again split the variance (this time of Z̃^3_N(β, g)) into two parts and estimate the sum of the covariances Cov(T(σ), T(τ)) accordingly. Proceeding as in (a), we estimate the sum over the triples (k, l, m) ∈ R^2_N with large k^2. Now we consider the terms with (k, l, m) ∈ R^2_{N,1} such that additionally |m| ≥ N/5. The contribution of these terms can be bounded, for some constant C > 0, by 'integrating out' the k and l variables and (again in the spirit of (3.10)) by bounding the remaining sum by the integral on the right side over the set E^4_N. By definition of h′ we see that E^4_N ↓ ∅ as N → ∞, such that, by dominated convergence again, the right-hand side in (4.35) goes to 0.
In view of the third part of proposition 4.3, this shows that V(Z̃^3_N(β, g)) = o((E Z̃^3_N(β, g))^2).
Taking the estimates over R^2_{N,1} and R^2_{N,2} together, we see that the assertion follows.