Subgaussian concentration and rates of convergence in directed polymers

We consider directed random polymers in $(d+1)$ dimensions with nearly gamma i.i.d. disorder. We study the partition function $Z_{N,\omega}$ and establish exponential concentration of $\log Z_{N,\omega}$ about its mean on the subgaussian scale $\sqrt{N/\log N}$. This is used to show that $\mathbb{E}[\log Z_{N,\omega}]$ differs from $N$ times the free energy by an amount which is also subgaussian (i.e. $o(\sqrt{N})$), specifically $O\big(\sqrt{N/\log N}\,\log\log N\big)$.


Introduction.
We consider a symmetric simple random walk on $\mathbb{Z}^d$, $d \ge 1$. We denote the paths of the walk by $(x_n)_{n \ge 1}$ and its distribution (started from 0) by $P$. Let $(\omega_{n,x})_{n \in \mathbb{N}, x \in \mathbb{Z}^d}$ be a collection of i.i.d. mean-zero random variables with distribution $\nu$ and denote their joint distribution by $\mathbb{P}$. We think of $(\omega_{n,x})_{n \in \mathbb{N}, x \in \mathbb{Z}^d}$ as a random potential with the random walk moving inside this potential. This interaction gives rise to the directed polymer in a random environment and can be formalised by the introduction of the following Gibbs measure on paths of length $N$:
$$
\mu_{N,\omega}(x) = \frac{1}{Z_{N,\omega}} \exp\Big( \beta \sum_{n=1}^{N} \omega_{n,x_n} \Big) P(x),
$$
where $\beta > 0$ is the inverse temperature. The normalisation
$$
Z_{N,\omega} = E\Big[ \exp\Big( \beta \sum_{n=1}^{N} \omega_{n,x_n} \Big) \Big] \tag{1.1}
$$
is the partition function.
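As a concrete aside (not part of the argument; the function name and array layout are our own choices), in dimension $1+1$ the partition function can be computed exactly by a transfer-matrix recursion over the disorder field:

```python
import numpy as np

def partition_function(omega, beta):
    """Z_{N,omega} = E[exp(beta * sum_{n=1}^N omega_{n,x_n})] for the
    symmetric simple random walk on Z started at 0 (d = 1).
    Convention: omega[n, x + N] is the disorder at time n + 1, site x."""
    N, width = omega.shape
    f = np.zeros(width)
    f[N] = 1.0                        # all mass at the origin at time 0
    for n in range(N):
        g = np.zeros(width)
        g[1:] += 0.5 * f[:-1]         # step from site x - 1 into x
        g[:-1] += 0.5 * f[1:]         # step from site x + 1 into x
        f = g * np.exp(beta * omega[n])
    return f.sum()

rng = np.random.default_rng(0)
N, beta = 20, 0.5
omega = rng.standard_normal((N, 2 * N + 1))
logZ = np.log(partition_function(omega, beta))
```

At $\beta = 0$ the weights vanish and $Z_{N,\omega} = 1$, a convenient sanity check on the recursion.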
A central question for such polymers is how the fluctuations of the path are influenced by the presence of the disorder. Loosely speaking, consider the two exponents $\xi$ and $\chi$ given by
$$
\max_{n \le N} |x_n| \approx N^{\xi} \quad \text{under } \mu_{N,\omega}, \qquad \log Z_{N,\omega} - \mathbb{E} \log Z_{N,\omega} \approx N^{\chi}.
$$
It is believed that $\chi < 1/2$ for all $\beta > 0$ and all $d$ (see [18]). It is expected, and partially confirmed for some related models ([20], [9]), that the two exponents $\chi, \xi$ are related via
$$
\chi = 2\xi - 1.
$$
So there is reason for interest in the fluctuations of $\log Z_{N,\omega}$, and in particular in establishing that these fluctuations are subgaussian, that is, $o(N^{1/2})$, as compared to the gaussian scale $N^{1/2}$. It is the $o(\cdot)$ aspect that has not previously been proved: in [22] it is proved that in the point-to-point case (that is, with paths $(x_n)_{n \ge 1}$ restricted to end at a specific site at distance $N$ from the origin) the variance is $O(N)$ when the disorder has finite variance, and there is an exponential bound for $|\log Z_{N,\omega} - \mathbb{E} \log Z_{N,\omega}|$ on scale $N^{1/2}$ when the disorder has an exponential moment.
The zero-temperature case of the polymer model is effectively last passage percolation. More complete results exist in this case in dimension $1+1$, for specific distributions [15]. There, based on exact computations related to combinatorics and random matrix theory, not only the scaling exponent $\chi$ for the directed last passage time was obtained, but also its limiting distribution after centering and scaling. A first step towards an extension of this type of result in the case of directed polymers in dimension $1+1$ for particular disorder is made in [13]; see also [6] for a step towards asymptotics. The best known result for undirected point-to-point last passage percolation is in [8], stating that for $v \in \mathbb{Z}^d$, $d \ge 2$, one has $\mathrm{Var}\big( \max_{\gamma: 0 \to v} \sum_{x \in \gamma} \omega_x \big) \le C |v| / \log |v|$ when the disorder $\omega$ is Bernoulli. Some results on sublinear variance estimates for directed last passage percolation in $1+1$ dimensions with gaussian disorder were obtained in [10], but the type of estimates there does not extend to higher dimensions, or to directed polymers at positive temperature.
The assumption of gaussian disorder is also strongly used there. In [14] estimates of the variance of directed last passage percolation are obtained via a coupling method, which appears difficult to extend to the case of polymers. In [7] exponential concentration estimates on the scale $(|v|/\log |v|)^{1/2}$ were obtained for first passage percolation, for a large class of disorders.
The extension of these results to directed polymers is not straightforward. This can be seen, for example, from the fact that subgaussian fluctuations for a point-to-point directed polymer can naturally fail. Such failure occurs, for example, if one restricts the end point of a $(1+1)$-dimensional directed polymer to be $(N, N)$. Then (1.1) reduces to a sum of i.i.d. variables, whose fluctuations are therefore gaussian.
The first result of the present paper is to obtain exponential concentration estimates on the scale $(N/\log N)^{1/2}$. Specifically, for nearly gamma disorder distributions (see Definition 2.1, a modification of the definition in [7]) we prove the following; here and throughout the paper we use $K_i$ to denote constants which depend only on $\beta$ and $\nu$.

Theorem 1.1. Suppose the disorder distribution $\nu$ is nearly gamma with $\int e^{4\beta|\omega|}\, \nu(d\omega) < \infty$. Then there exist $K_0, K_1$ such that
$$
\mathbb{P}\Big( \big| \log Z_{N,\omega} - \mathbb{E} \log Z_{N,\omega} \big| \ge t \Big( \frac{N}{\log N} \Big)^{1/2} \Big) \le K_0 e^{-K_1 t}
$$
for all $N \ge 2$ and $t > 0$.
The nearly gamma condition ensures that ν has some exponential moment (see Lemma 2.2), so for small β the exponential moment hypothesis in Theorem 1.1 is redundant. The proof follows the rough outline of [7], and uses some results from there, which we summarize in Section 2.
We use Theorem 1.1, in combination with coarse graining techniques motivated by [5], to provide subgaussian estimates of the rate of convergence of $N^{-1} \mathbb{E} \log Z_{N,\omega}$ to the free energy. Here the free energy of the polymer (also called the pressure) is defined as
$$
p(\beta) = \lim_{N \to \infty} \frac{1}{N}\, \mathbb{E} \log Z_{N,\omega}. \tag{1.4}
$$
The existence of the free energy is obtained by standard subadditivity arguments and concentration results [11], which furthermore guarantee that
$$
p(\beta) = \lim_{N \to \infty} \frac{1}{N} \log Z_{N,\omega} \quad \mathbb{P}\text{-a.s.}
$$
Specifically, our second main result is as follows.
Theorem 1.2. Under the same assumptions as in Theorem 1.1, there exists $K_2$ such that for all $N \ge 3$,
$$
0 \le N p(\beta) - \mathbb{E} \log Z_{N,\omega} \le K_2 \Big( \frac{N}{\log N} \Big)^{1/2} \log \log N.
$$
Controlling the speed of convergence of the mean is useful when one considers deviations of $N^{-1} \log Z_{N,\omega}$ from its limit $p(\beta)$ instead of from its mean, analogously to [9].
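For intuition, here is a Monte Carlo sketch of our own (arbitrary parameters and seed; the $d = 1$ transfer-matrix recursion is included so the snippet is self-contained) estimating $N^{-1} \mathbb{E} \log Z_{N,\omega}$. Jensen's inequality gives the annealed upper bound $N^{-1} \mathbb{E} \log Z_{N,\omega} \le \log \mathbb{E} e^{\beta\omega} = \beta^2/2$ for standard gaussian disorder.

```python
import numpy as np

def partition_function(omega, beta):
    # d = 1 transfer-matrix recursion; omega[n, x + N] is the disorder
    # at time n + 1, site x, with the walk started at the origin
    N, width = omega.shape
    f = np.zeros(width)
    f[N] = 1.0
    for n in range(N):
        g = np.zeros(width)
        g[1:] += 0.5 * f[:-1]
        g[:-1] += 0.5 * f[1:]
        f = g * np.exp(beta * omega[n])
    return f.sum()

def estimate_free_energy(N, beta, samples, seed=0):
    """Monte Carlo estimate of N^{-1} E[log Z_{N,omega}] with i.i.d.
    standard gaussian disorder."""
    rng = np.random.default_rng(seed)
    vals = [np.log(partition_function(rng.standard_normal((N, 2 * N + 1)),
                                      beta)) / N
            for _ in range(samples)]
    return float(np.mean(vals))

est = estimate_free_energy(N=16, beta=0.5, samples=300)
```

With gaussian disorder the estimate should not exceed the annealed value $\beta^2/2 = 0.125$ by more than the Monte Carlo error.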
Regarding the organization of the paper, in Section 2 we review certain concentration inequalities and related results, mostly from [7], and give an extension of the definition from [7] of a nearly gamma distribution so as to allow non-positive variables. In Section 3 we provide the proof of Theorem 1.1. In Section 4 we provide the proof of Theorem 1.2. Finally, in Section 5 we provide the proof of a technical lemma used in Section 4.

Preliminary Results on Concentration and Nearly Gamma Distributions.
Let us first define the class of nearly gamma distributions. This class, introduced in [7], is quite wide; in particular it includes gamma and normal variables. The definition given in [7] required that the support not include negative values. Here we extend the definition to accommodate such values as well.
Definition 2.1. Let $\nu$ be a probability measure on $\mathbb{R}$, absolutely continuous with respect to Lebesgue measure, with density $h$ and cumulative distribution function $H$. Let also $\Phi$ be the cumulative distribution function of the standard normal. $\nu$ is said to be nearly gamma (with parameters $A, B$) if (i) The support $I$ of $\nu$ is an interval.
(ii) h(·) is continuous on I.
(iii) For every $y \in I$ we have
$$
\psi(y) := \frac{\Phi'\big( \Phi^{-1}(H(y)) \big)}{h(y)} \le A \sqrt{|y|} + B, \tag{2.1}
$$
where $A, B$ are nonnegative constants.
The motivation for this definition (see [7]) is that H −1 • Φ maps a gaussian variable to one with distribution ν, and ψ(y) is the derivative of this map, evaluated at the inverse image of y. With the bound on ψ in (iii), the log Sobolev inequality satisfied by a gaussian distribution with respect to the differentiation operator translates into a useful log Sobolev inequality satisfied by the distribution ν with respect to the operator ψ(y)d/dy.
It was established in [7] that a distribution is nearly gamma if (i), (ii) of Definition 2.1 are valid, and (iii) is replaced by the following density conditions: near each finite endpoint $\nu_{\pm}$ of the support, $h(x)/|x - \nu_{\pm}|^{\alpha_{\pm}}$ remains bounded away from zero and infinity for some $\alpha_{\pm} > -1$, while if $\nu_+ = +\infty$, a corresponding ratio involving $h(x)$ remains bounded away from zero and infinity as $x \to +\infty$. The analogous statement is valid if $\nu_- = -\infty$. The nearly gamma property ensures the existence of an exponential moment, as follows.
Lemma 2.2. Suppose the distribution $\nu$ is nearly gamma with parameters $A, B$. Then $\int e^{t|x|}\, \nu(dx) < \infty$ for all $t < 2/A^2$.
Proof. Let $T = H^{-1} \circ \Phi$, so that $T(\xi)$ has distribution $\nu$ for standard normal $\xi$; then (2.1) is equivalent to $T'(x) \le B + A |T(x)|^{1/2}$ for all $x \in \mathbb{R}$. Considering $T(x) \ge 0$ and $T(x) < 0$ separately, it follows readily from this that for each $\varepsilon > 0$ there is a constant $C_{\varepsilon}$ with
$$
|T(x)| \le C_{\varepsilon} + \frac{(A + \varepsilon)^2}{4}\, x^2 \quad \text{for all } x \in \mathbb{R},
$$
and the lemma follows by integrating against the gaussian density.
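To make Definition 2.1 concrete, here is a small numerical sketch (our own; the bisection inverse of $\Phi$ and the grid are ad hoc choices) computing $\psi$ for $\nu = \mathrm{Exp}(1)$, where $H(y) = 1 - e^{-y}$ and $h(y) = e^{-y}$, and checking a bound of the form $\psi(y) \le A\sqrt{y} + B$ on a grid:

```python
import math

def Phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def Phi_inv(p):
    """Inverse of Phi by bisection (adequate for this illustration)."""
    lo, hi = -10.0, 10.0
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if Phi(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def psi_exponential(y):
    """psi(y) = Phi'(Phi^{-1}(H(y))) / h(y) for nu = Exp(1)."""
    x = Phi_inv(1.0 - math.exp(-y))
    normal_density = math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)
    return normal_density / math.exp(-y)

# check psi(y) <= A*sqrt(y) + B on a grid; A = 2, B = 1 suffices here
A, B = 2.0, 1.0
ok = all(psi_exponential(0.01 * k) <= A * math.sqrt(0.01 * k) + B
         for k in range(1, 1000))
```

For large $y$ one expects $\psi(y) \approx \sqrt{2y}$ here, consistent with the gamma-type tail that gives the class its name.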
For $\omega \in \mathbb{R}^{\mathbb{Z}^{d+1}}$ and $(m,y) \in \mathbb{Z}^{d+1}$ we define $\hat{\omega}^{(m,y)} \in \mathbb{R}^{\mathbb{Z}^{d+1} \setminus \{(m,y)\}}$ by the relation $\omega = (\hat{\omega}^{(m,y)}, \omega_{m,y})$. In other words, $\hat{\omega}^{(m,y)}$ is $\omega$ with the coordinate $\omega_{m,y}$ removed. Given a function $F$ on $\mathbb{R}^{\mathbb{Z}^{d+1}}$ and a configuration $\omega$, the average sensitivity of $F$ to changes in the $(m,y)$ coordinate is given by the corresponding partial-averaging difference. We use the same notation (a mild abuse) when $F$ depends on only a subset of the coordinates.
We are now ready to state the theorem of Benaim and Rossignol [7], specialized to the operator $\psi(s)\, d/ds$ applied to functions of the form $e^{\theta F(\omega_{m,y}, \cdot)/2}$.
Then for every $t > 0$ we have the corresponding exponential concentration bound. Observe that if $K$ is of order $N$, then a bound on $\rho_N \sigma_N$ of order $N^{\alpha}$ with $\alpha < 1$ is sufficient to ensure that $l(K)$ is of order $N/\log N$. In particular it is sufficient to have $\sigma_N$ of order $N$ and $\rho_N$ of order $N^{-\tau}$ with $\tau > 0$, which is what we will use below.

Concentration for the Directed Polymer.
In this section we will establish the first main result of the paper, Theorem 1.1. We assume throughout that the distribution $\nu$ of the disorder is nearly gamma with parameters $A, B$. We denote $\mathbb{P} = \nu^{\otimes \mathbb{Z}^{d+1}}$, and we write $\mu(f)$ for the integral of a function $f$ with respect to a measure $\mu$.
Let $(n, x) \in \mathbb{N} \times \mathbb{Z}^d$. We denote the partition function of the directed polymer of length $N$ in the shifted environment $\omega_{n+\cdot,\, x+\cdot}$ by
$$
Z^{(n,x)}_{N,\omega} := Z_{N,\, \omega_{n+\cdot,\, x+\cdot}}. \tag{3.1}
$$
We write $\gamma_N = \{(i, x_i) : i = 0, \ldots, N\}$ for a generic or random polymer path in the set $\Gamma_N$ of paths of length $N$ from the origin. Let
$$
M_{N,\omega} = \max\{ |\omega_{n,x}| : 1 \le n \le N,\ |x|_{\infty} \le N \}, \tag{3.2}
$$
and let $M^{(n,x)}_{N,\omega}$ denote the same quantity for the shifted disorder, analogously to (3.1).
Proposition 3.1. There exists $\theta_0(\beta, \nu)$ such that for all $|\theta| < \theta_0$ and $|I| \le (2d)^N$, the function $F^I_{N,\omega}$ satisfies the following Poincaré type inequality: where $C_{AB}$ is a constant depending on the nearly gamma parameters $A, B$.
Proof. By the definition of nearly gamma we have that where the last equality is achieved by performing first the summation over $(m,y)$ and using that the range of the path consists of $N$ sites after the starting site. Regarding the second term on the right side of (3.3), we define $M^I_{N,\omega} = \max_{(n,x) \in I} M^{(n,x)}_{N,\omega}$ for a set $I \subset \mathbb{N} \times \mathbb{Z}^d$. We then have $-\beta M^I_{N,\omega} \le F^I_{N,\omega} \le \beta M^I_{N,\omega}$, so following similar steps as above we arrive at a bound with a constant $b$ to be specified. We would like to show that the second term on the right side of (3.5) is smaller than the first one. First, in the case that $\theta > 0$, since the disorder has mean zero, bounding yields $\int (1 + |\theta| \beta u N)\, e^{|\theta| \beta u N}\, \mathbb{P}(M^I_{N,\omega} > uN)\, du$ (3.8). Denoting by $J(\cdot)$ the large deviation rate function related to $|\omega|$, we have that (3.8) is bounded accordingly. Let $0 < L < \lim_{x \to \infty} J(x)/x$ (which exists since $J(x)/x$ is nondecreasing for $x > \mathbb{E}|\omega|$) and choose $b$ large enough so that $J(b)/b > L$. Then provided $|\theta|$ is small enough (depending on $\beta, \nu$) and $b$ is large enough (depending on $\nu$), (3.9) is bounded above by an expression where the last inequality uses (3.6) and (3.7). This, combined with (3.5) and (3.4), completes the proof.
The averaging over sets I used in the preceding proof is related to the auxiliary randomness used in the main proof in [8].
Define the point-to-point partition function and let µ N,ω,z be the corresponding Gibbs measure. With I fixed, we define We finally define It is clear that r n ≤ r + N + r − N and s n ≤ s + N + s − N . We make use of two choices of the set I of sites: let 0 < α < 1/2 and Proposition 3.2. For α < 1/2 and I = I α ± , there exists K 3 such that the following estimates hold true: Proof. We first consider r ± N and s ± N . Observe that The difference on the right side can be written as N,ω 1 x+x m−n =y ωm,y≥ωm,y e β ωm,y−ωm,y dP(ω m,y ). (3.12) To bound r + N , we have using (3.12): where in the equality we used the homogeneity of the environment and in the last inequality we used the fact that the directed path has at most one contact point with the set I α + and, therefore, (n,x)∈I α + 1 x+x m−n =y ≤ 1. Hence The estimate on s + N follows along the same lines. Specifically, we have using (3.12) that where in the equalities we used the fact that and in the last inequality we used the easily verified fact that µ (n,x) N,ω 1 x+x m−n =y and e −βωm,y are negatively correlated. It follows from (3.14) that We now need to show how these estimates extend to r − N , s − N . Using (3.10) and the second equality in (3.11), e β(ωm,y −ωm,y)1 x+x m−n=y 1ω m,y <ωm,y . (3.16) By Jensen's inequality this is bounded by From this we can proceed analogously to (3.13) and obtain Using (3.10), (3.15), (3.18) and the three equalities in (3.11), it follows that 19) where M N,ω is from (3.2). A similar computation to the one following (3.5) shows that for L, b as chosen after (3.9), with b sufficiently large (depending on ν), and then as in (3.13), . Further, analogously to (3.14) but with I α + replaced by a single point, we obtain Next, analogously to (3.16) and (3.17), , observe that by (3.18) and (3.25), similarly to (3.19), Proposition 3.2 shows that log[N/(r N s N log(N/r N s N ))] is of order log N . 
We can apply Proposition 3.1 and Theorem 2.3, the latter with $\rho_N = r_N$, $\sigma_N = s_N$, $F = F^{I_{\alpha}^{+}}_{N,\omega}$ and $K$ a multiple of $N$, to yield part (i) of the next proposition. Part (ii) follows similarly, using $\hat{r}_N(z)$ and $\hat{s}_N(z)$ in place of $r_N$ and $s_N$, and $F(\omega) = \log Z_{N,\omega}(z)$.
(ii) There exist $K_5$ and $N_1 = N_1(\beta, \nu)$ such that for all $N \ge N_1$, $t > 1$ and all $z \in \mathbb{Z}^d$ with $|z|_1 \le N$.
We can now prove the first main theorem.
Proof of Theorem 1.1. We start by obtaining an a.s. upper and lower bound on $\log Z_{N,\omega}$. Loosely, for the lower bound we consider a point $(\lfloor N^{\alpha} \rfloor, x) \in I_{\alpha}^{+}$ and we force the polymer started at $(0,0)$ to pass through that point; the energy accumulated by the first part of the polymer, i.e.
is then bounded below by the minimum energy that the polymer could accumulate during its first $\lfloor N^{\alpha} \rfloor$ steps. More precisely, we define $M^{n_1,n_2}_{N,\omega} := \max\{ |\omega_{n,x}| : n_1 \le n \le n_2,\ |x|_{\infty} \le N \}$, and then bound below by the minimum possible energy; we then get that In a related fashion we can obtain an upper bound on $\log Z_{N,\omega}$. In this case we start the polymer from a location $(-\lfloor N^{\alpha} \rfloor, x) \in I_{\alpha}^{-}$ and we force it to pass through $(0,0)$. Letting we then have, analogously to (3.29). For $N \ge N_0(\beta, \nu)$, Proposition 3.3(i) guarantees that the first term on the right side in (3.34) is bounded by $8 e^{-K_4 t/2}$. The second and the third terms are similar, so we consider only the second one. If $t > 1$, then for some $K_6$, for large $N$, Putting the estimates together we get from (3.34) that for some $K_7$, for all $N$ large (say $N \ge N_2(\beta, \nu) \ge N_0(\beta, \nu)$) and $t > 1$. For $t \le 1$, (3.35) is trivially true if we take $K_7$ small enough. This completes the proof for $N \ge N_2$.
For $2 \le N < N_2$ an essentially trivial proof suffices. Fix any (nonrandom) path $(y_n)_{n \le N}$ and let $T_N = \sum_{n=1}^{N} \omega_{n, y_n}$, so that $Z_{N,\omega} \ge (2d)^{-N} e^{\beta T_N}$. Let $K_8 = N_2 \log 2d + \max_{N < N_2} \mathbb{E} \log Z_{N,\omega}$, $K_9 = \min_{N < N_2} \mathbb{E} \log Z_{N,\omega}$ and $K_{10} = \max_{N < N_2} \mathbb{E} Z_{N,\omega}$. Then for some $K_{11}, K_{12}$, and by Markov's inequality, the required bounds follow. The theorem now follows for all $N \ge 2$.
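The path-restriction bound $Z_{N,\omega} \ge (2d)^{-N} e^{\beta T_N}$ used in this step is easy to check numerically; the $d = 1$ transfer-matrix routine below is our own illustration (arbitrary seed and path), not part of the proof.

```python
import numpy as np

def partition_function(omega, beta):
    # d = 1 transfer-matrix recursion; omega[n, x + N] is the disorder
    # at time n + 1, site x, with the walk started at the origin
    N, width = omega.shape
    f = np.zeros(width)
    f[N] = 1.0
    for n in range(N):
        g = np.zeros(width)
        g[1:] += 0.5 * f[:-1]
        g[:-1] += 0.5 * f[1:]
        f = g * np.exp(beta * omega[n])
    return f.sum()

rng = np.random.default_rng(1)
N, beta = 10, 0.7
omega = rng.standard_normal((N, 2 * N + 1))

# fixed zig-zag path y_1 = 1, y_2 = 0, y_3 = 1, ...: a valid walk path
T_N = sum(omega[n, ((n + 1) % 2) + N] for n in range(N))
lower_bound = 2.0 ** (-N) * np.exp(beta * T_N)  # (2d)^{-N} e^{beta T_N}, d = 1
Z = partition_function(omega, beta)
```

The inequality holds for every disorder realization, since $Z_{N,\omega}$ is a sum of $(2d)^{-N} e^{\beta(\text{path energy})}$ over all $2^N$ nearest-neighbour paths, of which the fixed path contributes one term.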

Subgaussian rates of convergence
In this section we prove Theorem 1.2. We start with the simple observation that $\mathbb{E} \log Z_{N,\omega}$ is superadditive:
$$
\mathbb{E} \log Z_{N+M,\omega} \ge \mathbb{E} \log Z_{N,\omega} + \mathbb{E} \log Z_{M,\omega}, \tag{4.1}
$$
which by standard superadditivity results implies that the limit in (1.4) exists, with
$$
p(\beta) = \sup_{N} \frac{1}{N}\, \mathbb{E} \log Z_{N,\omega}.
$$
Let $\mathbb{L}^{d+1}$ be the even sublattice of $\mathbb{Z}^{d+1}$:
$$
\mathbb{L}^{d+1} = \{ (n, x) \in \mathbb{Z} \times \mathbb{Z}^d : n + |x|_1 \text{ is even} \},
$$
and for $l < m$ and $(l,x), (m,y) \in \mathbb{L}^{d+1}$ define
$$
Z_{m-l,\omega}((l,x),(m,y)) = E_{l,x}\Big[ e^{\beta \sum_{n=l+1}^{m} \omega_{n,x_n}} ;\ x_m = y \Big].
$$
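The point-to-point partition function satisfies the exact Markov-type decomposition $Z_{n+m,\omega}((0,0),(n+m,0)) = \sum_{z} Z_{n,\omega}((0,0),(n,z))\, Z_{m,\omega}((n,z),(n+m,0))$, which underlies the superadditivity above; here is a small $d = 1$ check (our own construction, arbitrary parameters):

```python
import numpy as np

def p2p(omega, beta, i0, i1):
    """Point-to-point partition function for a 1d simple random walk:
    E_{x0}[exp(beta * sum_n omega at (n, x_n)) ; x_N = x1], where sites
    are tracked as array column indices (i = x + offset)."""
    N, width = omega.shape
    f = np.zeros(width)
    f[i0] = 1.0
    for n in range(N):
        g = np.zeros(width)
        g[1:] += 0.5 * f[:-1]
        g[:-1] += 0.5 * f[1:]
        f = g * np.exp(beta * omega[n])
    return f[i1]

rng = np.random.default_rng(2)
n, m, beta = 3, 3, 0.4
width = 2 * (n + m) + 1
c = n + m                                   # column index of the origin
omega = rng.standard_normal((n + m, width))

total = p2p(omega, beta, c, c)              # Z_{n+m}((0,0),(n+m,0))
split = sum(p2p(omega[:n], beta, c, z) * p2p(omega[n:], beta, z, c)
            for z in range(width))
```

Keeping a single term $z$ of the sum and taking logarithms gives the pointwise inequality $\log Z_{n+m} \ge \log Z_n(\cdot, z) + \log Z_m(z, \cdot)$ behind point-to-point superadditivity.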
Recall the notation (3.1) for a polymer in a shifted disorder.
The following lemma will be used throughout. Its proof follows the same lines as [5, Lemma 2.2(i)]; like that result, it is a consequence of Theorem 1.1 for part (i), and of Proposition 3.3(ii) for part (ii).
Lemma 4.1. Let $\nu$ be nearly gamma. There exists $K_{13}$ as follows. Let $n_{\max} \ge 1$ and let $0 \le s_1 < t_1 \le s_2 < t_2 < \cdots \le s_r < t_r$ with $t_j - s_j \le n_{\max}$ for all $j \le r$. For each $j \le r$ let $(s_j, y_j) \in H_{s_j}$ and $(t_j, z_j) \in H_{t_j}$, and let $\zeta_j = \log Z_{t_j - s_j,\, \omega}((s_j, y_j),(t_j, z_j))$, $\chi_j = \log Z^{(s_j, y_j)}_{t_j - s_j,\, \omega}$. Then for $a > 0$, we have the following. (i) Note that (iii) follows from (i), since $\zeta_j \le \chi_j$. We do not have a bound like (4.5), with factor $(\log n_{\max})^{1/2}$, for the lower tail of the $\zeta_j$'s, but for our purposes such a bound is only needed for the upper tail, as (4.4) suffices for lower tails.
We continue with a result which is like Theorem 1.2 but weaker (not subgaussian) and much simpler. For a specified block length $n$, and for $N = kn$, the simple skeleton of a path in $\Gamma_N$ is $\{(jn, x_{jn}) : 0 \le j \le k\}$. Let $\mathcal{C}_s$ denote the class of all possible simple skeletons of paths from $(0,0)$ to $(kn,0)$. For a skeleton $S$ (of any type, including simple and types to be introduced below), we write $\Gamma_N(S)$ for the set of all paths in $\Gamma_N$ which pass through all points of $S$. For a set $A$ of paths of length $N$ we set
$$
Z_{N,\omega}(A) = E\Big[ e^{\beta \sum_{n=1}^{N} \omega_{n,x_n}} ;\ \gamma_N \in A \Big],
$$
and we write $Z_{N,\omega}(S)$ for $Z_{N,\omega}(\Gamma_N(S))$.
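For the entropy counts used below, note that in $d = 1$ each block increment $x_{jn} - x_{(j-1)n}$ takes at most $n + 1$ values (size at most $n$, with the parity of $n$), so $|\mathcal{C}_s| \le (n+1)^k$; a tiny enumeration (our own sketch, for illustrative values of $n, k$) confirms this:

```python
from functools import lru_cache

def count_simple_skeletons(n, k):
    """d = 1: count simple skeletons {(jn, x_{jn}) : 0 <= j <= k} of
    walk paths from (0, 0) to (kn, 0).  A block increment dx satisfies
    |dx| <= n and has the same parity as n."""
    steps = [dx for dx in range(-n, n + 1) if (dx - n) % 2 == 0]

    @lru_cache(maxsize=None)
    def count_from(j, x):
        if j == k:
            return 1 if x == 0 else 0
        return sum(count_from(j + 1, x + dx) for dx in steps)

    return count_from(0, 0)
```

For example, with n = 2 and k = 2 the increments lie in {-2, 0, 2} and exactly the pairs (-2, 2), (0, 0), (2, -2) return to 0, so the count is 3, well below the bound (n+1)^k = 9.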
The proof of Theorem 1.2 follows the general outline of the preceding proof. But to obtain that (stronger) theorem, we need to sometimes use Lemma 4.1(i),(iii) in place of (ii), and use a coarse-graining approximation effectively to reduce the size of (4.6), so that we avoid the log n in the exponent on the right side of (4.10), and can effectively use log log n instead.
For $(n, x) \in \mathbb{L}^{d+1}$ let
$$
s(n,x) = n\, p(\beta) - \mathbb{E} \log Z_{n,\omega}((0,0),(n,x)),
$$
so $s(n,x) \ge 0$ by (4.2). $s(n,x)$ may be viewed as a measure of the inefficiency created when a path makes an increment of $(n,x)$. As in the proof of Lemma 4.2, we consider a polymer of length $N = kn$ for some block length $n$ to be specified and $k \ge 1$. In general we take $n$ sufficiently large, and then take $k$ large, depending on $n$; we tacitly take $n$ to be even, throughout. In addition to (4.1) we have the relation for all $x, y, z \in \mathbb{Z}^d$ and all $n, m \ge 1$, which implies that $s(\cdot, \cdot)$ is subadditive. Subadditivity of $s(\cdot, 0)$ follows from (4.1). For our designated block length $n$, for $x \in \mathbb{Z}^d$ with $(n, x) \in \mathbb{L}^{d+1}$, we say the transverse increment $x$ is inadequate if $s(n,x) > n^{1/2} \theta(n)$, and adequate otherwise. Note the dependence on $n$ is suppressed in this terminology. For general values of $m$, we say $(m,x)$ is efficient if $s(m,x) \le 4 n^{1/2} \rho(n)$, and inefficient otherwise; again there is a dependence on $n$. For $m = n$, efficiency is obviously a stronger condition than adequacy. In fact, to prove Theorem 1.2 it is sufficient to show that for large $n$, there exists $x$ for which $(n,x)$ is efficient.
Let $h_n = \max\{ |x|_{\infty} : x \text{ is adequate} \}$. (Note we have not established any monotonicity for $s(n, \cdot)$, so some sites $x$ with $|x|_{\infty} \le h_n$ may be inadequate.) We wish to coarse-grain on scale $u_n = 2 \lfloor h_n / 2\varphi(n) \rfloor$. A coarse-grained (or CG) point is a point of the form $(jn, x_{jn})$ with $j \ge 0$ and $x_{jn} \in u_n \mathbb{Z}^d$. A coarse-grained (or CG) skeleton is a simple skeleton $\{(jn, x_{jn}) : 0 \le j \le k\}$ consisting entirely of CG points. By a CG path we mean a path from $(0,0)$ to $(kn, 0)$ for which the simple skeleton is a CG skeleton.

Remark 4.3.
A rough strategy for the proof of Theorem 1.2 is as follows; what we actually do is based on this but requires certain modifications. It is enough to show that for some $K_{15}$, for large $n$, $s(n,x) \le K_{15} n^{1/2} \rho(n)$ for some $x$. Suppose to the contrary that $s(n,x) > K_{15} n^{1/2} \rho(n)$ for all $x$; this means that for every simple skeleton $S$ we have $\mathbb{E} \log Z_{kn,\omega}(S) \le kn\, p(\beta) - k K_{15} n^{1/2} \rho(n)$.
The first step is to use this and Lemma 4.1 to show that, if we take $n$ and then $k$ large, with high probability $\log Z_{kn,\omega}(\hat{S}) \le kn\, p(\beta) - \frac{1}{2} k K_{15} n^{1/2} \rho(n)$ for every CG skeleton $\hat{S}$; this makes use of the fact that the number of CG skeletons is much smaller than the number of simple skeletons. The next step is to show that with high probability, every simple skeleton $S$ can be approximated by a CG skeleton $\hat{S}$ without changing $\log Z_{kn,\omega}(S)$ too much, and therefore $\log Z_{kn,\omega}(S) \le kn\, p(\beta) - \frac{1}{4} k n^{1/2} K_{15} \rho(n)$ for every simple skeleton $S$.
Dividing by kn and letting k → ∞ gives a limit which contradicts (1.3); this shows efficient values x must exist.
We continue with the proof of Theorem 1.2.

Let
when $N$ is clear from the context we refer to points $x \in \hat{H}_N$ as accessible sites.

Lemma 4.4. (i) There exists $K_{16}$ such that for all $n \ge 2$, $s(n,0) \le K_{16} n^{1/2} \log n$.
(ii) There exists $K_{17}$ such that for $n$ large (depending on $\beta$) and even, if $|x|_1 \le K_{17} n^{1/2} \theta(n)$ then $x$ is adequate.
Proof. We first prove (i). It suffices to consider $n$ large. Let $m = n/2$. It follows from Proposition 3.3(ii) that It follows from (4.14), Theorem 1.1 and Lemma 4.2 that with probability at least $1/4$, for some accessible site $x$ we have Turning to (ii), let $J = 2 \lfloor K_{19} n^{1/2} \theta(n) \rfloor$, with $K_{19}$ to be specified. Analogously to (4.14) we have, using Proposition 3.3(ii), that for large $n$, Similarly, also for large $n$,
Observe that for a simple skeleton $S = \{(jn, x_{jn}) : j \le k\}$, we have a sum over blocks: The rough strategy outlined in Remark 4.3 involves approximating $Z_{N,\omega}(S)$ by $Z_{N,\omega}(\hat{S})$, where $\hat{S}$ is a CG skeleton which approximates the simple skeleton $S$; equivalently, we want to replace $x_{(j-1)n}, x_{jn}$ in (4.23) by CG points. This may be problematic for some values of $j$ and some paths in $\Gamma_N(S)$, however, for three reasons. First, if we do not restrict the possible increments to satisfy $|x_{jn} - x_{(j-1)n}|_{\infty} \le h_n$, there will be too many CG skeletons to sum over. Second, even when increments satisfy this inequality, there are difficulties if increments are inadequate. Third, paths which veer too far off course transversally within a block present problems in the approximation by a CG path. Our methods for dealing with these difficulties principally involve two things: we do the CG approximation only for "nice" blocks, and rather than just CG skeletons, we allow more general sums of the form $\sum_{j=1}^{l} \log Z_{\tau_j - \tau_{j-1},\, \omega}((\tau_{j-1}, y_j),(\tau_j, z_j))$, which need not have $y_j = z_{j-1}$. We turn now to the details. In approximating (4.23) we want in effect to change paths only within a distance $n_1 \le 6dn/\varphi(n)$ (to be specified) of each hyperplane $H_{jn}$. To this end, given a site $w = (jn \pm n_1, y_{jn \pm n_1}) \in H_{jn \pm n_1}$, let $z_{jn}$ be the site in $u_n \mathbb{Z}^d$ closest to $y_{jn \pm n_1}$ in $\ell^1$ norm (breaking ties by some arbitrary rule), and let $\pi_{jn}(w) = (jn, z_{jn})$, which may be viewed as the projection into $H_{jn}$ of the CG approximation to $w$ within the hyperplane $H_{jn \pm n_1}$. Given a path $\gamma = \{(i, x_i) : i \le kn\}$ from $(0,0)$ to $(kn,0)$, define points We say a sidestep occurs in block $j$ in $\gamma$ if either $\in E_{\mathrm{in}}$ and a sidestep occurs in block $j\}$, Blocks with indices in $E$ are called bad blocks, and $E$ is called the bad set.
Define the tuples define the CG-approximate skeleton of γ to be and define the CG-approximate bad (respectively good) skeleton of γ to be Note E in (γ), E side (γ), S bad CG (γ) and S good CG (γ) are all functions of S CG (γ). We refer to the bad set E also as the index set of S bad CG (γ). Let C CG (respectively C bad CG ) denote the class of all possible CG-approximate skeletons (respectively bad skeletons) of paths of length kn starting at (0, 0). For B ⊂ {1, . . . , k} let C CG (B) denote the class of all CG-approximate skeletons in C CG with bad set B, and analogously, let C bad CG (B) denote the class of all possible CG-approximate bad skeletons in C bad CG with index set B. Then for b ≤ k define C bad CG (b) = ∪ B:|B|=b C bad CG (B). The partition function corresponding to a CG-approximate skeleton S CG is (4.25) So that we may consider these two products separately, we denote the first asZ N,ω (S good CG ) and the second asZ N,ω (S bad CG ). For a CG-approximate skeleton in C CG (B), and for j / ∈ B, if e ′ j−1 = (n(j −1), w), d j−1 = (n(j − 1), x), f ′ j = (nj, y) and d j = (nj, z), we always have It follows readily that if T 1 , . . . , T j−1 are specified and j / ∈ B, then there are at most (4h n u −1 n + 3) 2d ≤ (5ϕ(n)) 2d possible values of T j ; if j ∈ B there are at most (2n) 4d . It follows that the number of CG-approximate skeletons satisfies Note that the factor ϕ(n) in place of n in (4.26) represents the entropy reduction resulting from the use of CG paths. Summing (4.26) over B we obtain (4.27) |C CG | ≤ 2 k (2n) 4dk .
For $B = \{j_1 < \cdots < j_{|B|}\} \subset \{1, \ldots, k\}$, setting $j_0 = 0$ we have We also use the non-coarse-grained analogs of the $T_j$, given by and define the augmented skeleton of $\gamma$ to be We write $\mathcal{C}_{\mathrm{aug}}$ for the class of all possible augmented skeletons of paths from $(0,0)$ to $(kn,0)$. Note that $E_{\mathrm{side}}(\gamma)$, $E_{\mathrm{in}}(\gamma)$ and $S_{\mathrm{CG}}(\gamma)$ are functions of $S_{\mathrm{aug}}(\gamma)$; we denote by $F$ the "coarse-graining map" such that We can write The following will be proved in the next section.
Lemma 4.5. For n sufficiently large, there exists an even integer n 1 ≤ 6dn/ϕ(n) such that for all p ∈ H n 1 we have This lemma is central to the following, which bounds the difference between partition functions for a skeleton and for its CG approximation. Lemma 4.6. There exists K 20 such that under the conditions of Theorem 1.1, for n sufficiently large, P log Z N,ω (S aug ) − logZ N,ω (F (S aug )) ≥ 80dkn 1/2 ρ(n) for some S aug ∈ C aug ≤ e −K 20 k(log n)(log log n) . (4.32) Fix S CG ∈ C CG and S aug ∈ F −1 (S CG ). We can write S aug as {V j , j ≤ k} with V j as in (4.30). Then using Lemma 4.2, By Lemma 4.5, the last sum is bounded by 40dkn 1/2 ρ(n). Hence letting T denote the first sum on the right side of (4.34), we have by (4.34) and Lemma 4.1(ii): Combining (4.33) and (4.35) with (4.27) and (4.31) we obtain that for large n, P log Z N,ω (S aug ) − logZ N,ω (F (S aug )) ≥ 80dkn 1/2 ρ(n) for some S aug ∈ C aug ≤ (2n) 9dk e −kK 21 (log n)(log log n) ≤ e −kK 21 (log n)(log log n)/2 . It is worth noting that in (4.36) we do not make use of the entropy reduction contained in (4.26). Nonetheless we are able to obtain a good bound because we apply Lemma 4.1(ii) with n max = n 1 instead of n max = n.
Let $b_{nk} = \lfloor k \log\log n / (\log n)^{3/2} \rfloor$. We deal separately with CG-approximate skeletons according to whether the number of bad blocks exceeds $b_{nk}$. The next lemma shows that bad blocks have a large cost, in the sense of reducing the mean of the log partition function; compare the $n^{1/2} \theta(n)$ factor in (4.37) to the $n^{1/2} \log n$ factor in (4.7).
Lemma 4.7. For n sufficiently large, for all 1 ≤ b ≤ k and S bad CG ∈ C bad CG (b), , let E in , E side be the corresponding sets of indices of bad blocks, and let {T j , j ∈ B} be as in (4.24). Then ≤ p(β)n − n 1/2 θ(n). For j ∈ E side , write e j−1 − d j−1 as (n 1 , x), so |x| ∞ > h n and therefore x is inadequate. If the sidestep occurs from (j − 1)n to (j − 1)n + n 1 , then by superadditivity and Lemma 4.4(i), E log Z n 1 ,ω (d j−1 , e j−1 ) = E log Z n 1 ,ω ((0, 0), (n 1 , x)) ≤ E log Z n,ω ((0, 0), (n, x)) − E log Z n−n 1 ,ω ((n 1 , x), (n, x)) and therefore It follows by additivity that for all CG skeletons S CG . Rather than considering deviations of logZ N,ω (S CG ) above its mean, it will be advantageous to consider deviations above the right side of (4.43). The next two lemmas show that it is unlikely for this deviation to be very large for any CG skeleton. We will use the fact that for each S CG ∈ C CG with bad set B, we have by Lemmas 4.2 and 4.7 Proof. From (4.26) we see that (4.46) |C − CG | ≤ 2 k (5ϕ(n)) 2dk (2n) 4db nk ≤ e 10dk log log n .
Combining this with Lemma 4.1(ii),(iii) (with n max = n) and (4.27), (4.29), we obtain P logZ N,ω (S CG ) − kE log Z n,ω ≥ 80dkn 1/2 ρ(n) for some S CG ∈ C − CG ≤ b nk b=0 P logZ N,ω (S bad CG ) − bE log Z n,ω ≥ 40dkn 1/2 ρ(n) for some S bad CG ∈ C bad CG (b)  Note that the event in the third line of (4.47) is well-defined because S good CG is a function of S CG . For each b ≤ b nk we have With (4.47) this shows that for k sufficiently large (depending on n), We continue with a similar but simpler result for C + CG . Lemma 4.9. Under the conditions of Theorem 1.2, for n sufficiently large and N = kn, P logZ N,ω (S CG ) − kE log Z n,ω ≥ 0 for some S CG ∈ C + CG ≤ e −K 13 k(log n)(log log n)/16 .
Lemma 5.1. Provided n is large, there exists a path from (0, 0) to (n, x * ) containing a short climbing segment which is clean.
Note that Lemma 5.1 is a purely deterministic statement, since the property of being clean does not involve the configuration ω.
Translating the segment obtained in Lemma 5.1 to begin at the origin, we obtain a path $\alpha^*$ from $(0,0)$ to some site $(m^*, y^*)$, with the following properties: $y^*_1 = u_n$, and $\alpha^*$ is clean.