An invariance principle for one-dimensional random walks among dynamical random conductances

We study variable-speed random walks on $\mathbb Z$ driven by a family of nearest-neighbor time-dependent random conductances $\{a_t(x,x+1)\colon x\in\mathbb Z, t\ge0\}$ whose law is assumed invariant and ergodic under space-time shifts. We prove a quenched invariance principle for the random walk under the minimal moment conditions on the environment; namely, assuming only that the conductances possess the first positive and negative moments. A novel ingredient is the representation of the parabolic coordinates and the corrector via a dual random walk which is considerably easier to analyze.


INTRODUCTION
The aim of this work is to describe the long-time behavior of a random walk among dynamical random conductances. This problem has enjoyed considerable attention in recent years; we will comment on the relevant literature as soon as the key concepts have been introduced. Throughout this paper we will focus only on one specific instance; namely, the nearest-neighbor random walks on Z. Our aim is to prove that this walk scales to a non-degenerate Brownian motion assuming only minimal moment conditions on the random environment.
© 2018 M. Biskup. Reproduction, by any means, of the entire article for non-commercial purposes is permitted without charge.
Let us introduce the problem in more precise terms. The aforementioned random "walk" is actually a continuous-time Markov chain on $\mathbb Z$ whose dynamics is best described by the (time-dependent) generator $L_t$ that acts on $f\colon\mathbb Z\to\mathbb R$ via $L_t f(x) := a_t(x,x+1)\bigl[f(x+1)-f(x)\bigr] + a_t(x,x-1)\bigl[f(x-1)-f(x)\bigr]$. (1.1) Here $\{a_t(x,x\pm1)\colon x\in\mathbb Z,\ t\ge0\}$ is a family of positive (and finite) numbers that are assumed to obey the symmetry condition $a_t(x,x+1)=a_t(x+1,x)$ for all $x\in\mathbb Z$ and $t\ge0$. (1.2) We will refer to $a_t(e)$, for $e=(x,x+1)$, as the conductance of edge $e$ at time $t$. We will assume that the conductances are defined for all real-valued $t$ and that they are random, meaning that each $a_t(e)$ is a function of some $\omega\in\Omega$ in a probability space $(\Omega,\mathcal F,\mathbb P)$. Writing $\mathcal B(\mathbb R)$ for the Borel $\sigma$-algebra on $\mathbb R$, we impose:

Assumption 1.1
For each edge $e$, the map $(t,\omega)\mapsto a_t(e)$ on $\mathbb R\times\Omega$ is positive, $\mathcal B(\mathbb R)\otimes\mathcal F$-measurable, and locally Lebesgue-integrable in $t$. Moreover, there is a family of space-time shifts $\tau_{t,x}\colon\Omega\to\Omega$, indexed by $t\in\mathbb R$ and $x\in\mathbb Z$, such that $a_t(x,x+1)\circ\tau_{s,y}=a_{t+s}(x+y,x+y+1)$ for all $t,s\in\mathbb R$ and $x,y\in\mathbb Z$. (1.3) The law $\mathbb P$ is invariant and ergodic with respect to $\{\tau_{t,x}\colon t\in\mathbb R,\ x\in\mathbb Z\}$.
A natural way to interpret the random-walk dynamics is via a Poisson-clock environment: Given a sample of $\{a_t(x,x+1)\colon x\in\mathbb Z,\ t\in\mathbb R\}$, each edge $e=(x,x+1)$ is endowed with an independent time-inhomogeneous Poisson point process with intensity measure $a_t(e)\,dt$. The above assumptions ensure that this process exists and that no two arrivals, to be called "rings," occur at the same time. The random-walk path is then a deterministic function of the Poisson environment: the walk stays at a vertex until an incident edge receives the next "ring," at which point it moves to the corresponding neighbor. See Fig. 1 below.
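The Poisson-clock description can be turned into a short simulation. The following is a minimal sketch, not the construction used in the proofs: it assumes that the conductances are given by a user-supplied function `a(t, x)` for the edge $(x,x+1)$ and that they are bounded above by a known constant `a_max` on the time window of interest, so that the inhomogeneous clocks can be sampled by thinning. The names `simulate_walk`, `example_conductance` and `a_max` are hypothetical, and the example environment is deterministic and purely illustrative.

```python
import math
import random


def simulate_walk(a, a_max, t_end, x0=0, rng=None):
    """Simulate the variable-speed walk on [0, t_end] by thinning.

    a(t, x)  -- conductance of the edge (x, x+1) at time t (assumed <= a_max)
    a_max    -- assumed uniform upper bound on the conductances
    Returns the list of (jump time, new position), starting with (0.0, x0).
    """
    rng = rng or random.Random()
    t, x = 0.0, x0
    path = [(0.0, x0)]
    while True:
        # Candidate rings for the two incident edges arrive at total rate 2*a_max.
        t += rng.expovariate(2.0 * a_max)
        if t > t_end:
            break
        go_left = rng.random() < 0.5
        edge = x - 1 if go_left else x  # index of the edge (edge, edge+1)
        # Accept the candidate with probability a_t(e)/a_max (thinning step).
        if rng.random() < a(t, edge) / a_max:
            x = x - 1 if go_left else x + 1
            path.append((t, x))
    return path


def example_conductance(t, x):
    """Illustrative (deterministic) conductance with values in [0.5, 1.5]."""
    return 1.0 + 0.5 * math.sin(t + x)
```

A call such as `simulate_walk(example_conductance, 1.5, 50.0)` returns a nearest-neighbor càdlàg path recorded at its jump times. Only the two edges incident to the current position need to be tracked, since rings on other edges do not affect the walk.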
Implementing the Poisson-clock representation rigorously requires showing that the minimal positive solution to the Kolmogorov Backward Equation is non-explosive; i.e., that the number of steps taken by the walk is finite a.s. in any finite time. This follows from the assumed local integrability, stationarity and the Ergodic Theorem. Indeed, for each $t>0$ there is a (possibly random) $M\in(0,\infty)$ and a positive density of edges $e$ (in both lattice directions) where the total jump rate $\int_0^t a_s(e)\,ds$ is bounded by $M$. Consequently, there is a positive density of edges that receive no "ring" in the time interval $[0,t]$. Up to time $t$, the walk is thus effectively confined to a finite set of vertices where the total number of available clock "rings" is finite a.s. as well.
Throughout the rest of the paper, we will use the following notation: (1) $X=\{X_t\colon t\ge0\}$ denotes a sample of the above random walk, (2) $P^x_a$ denotes the law of $X$ in a given configuration $a=\{a_t(x,x\pm1)\colon x\in\mathbb Z,\ t\in\mathbb R\}$ of the conductances subject to the initial condition $P^x_a(X_0=x)=1$, and (3) $\mathbb E$ denotes expectation with respect to $\mathbb P$.
Our main result is then:
Theorem 1.2 (Quenched invariance principle) Suppose that, on top of Assumption 1.1, the conductance law obeys the moment conditions $\mathbb E\,a_0(e)<\infty$ and $\mathbb E\,a_0(e)^{-1}<\infty$ (1.4) at some (and thus every) edge $e$. Then there is a constant $\sigma\in(0,\infty)$ such that for any $T>0$ and $\mathbb P$-a.e. sample $a=\{a_t(x,x+1)\colon x\in\mathbb Z,\ t\in\mathbb R\}$ of the conductances, the law of $X^{(n)}_t:=\frac{1}{\sqrt n}X_{nt}$, $0\le t\le T$, (1.5) induced by $P^0_a$ on the Skorohod space $D[0,T]$ of càdlàg paths converges, as $n\to\infty$, weakly to the law of the Brownian motion $\{B_t\colon t\ge0\}$ with $EB_t=0$ and $E(B_t^2)=\sigma^2t$.
Theorem 1.2 improves on earlier work by Deuschel and Slowik [10] where the validity of a quenched invariance principle for the corresponding random walk on $\mathbb Z$ was established under the following moment conditions: The algebraic restriction on $p$ and $q$ in (1.6) arises from the method of proof, which invokes elliptic regularity techniques to construct, and prove sublinearity of, the so-called corrector, a key object underlying many invariance principles proved so far in this setting. The corresponding problem on $\mathbb Z^d$ for $d\ge2$ has been treated in Andres, Chiarini, Deuschel and Slowik [2], albeit under a somewhat different functional relation between $p$ and $q$ (and $d$) than (1.6) might suggest (see [10, Remark 1.9]). Although our proof is based on corrector techniques as well, we are able to utilize the one-dimensional nature of the walk to work solely under the conditions (1.4), which are weaker than (1.6). Our approach is rooted in that for static environments, where a quenched invariance principle is known to hold under (1.4) in $d=1,2$ (Biskup [6]) while requiring $1/p+1/q<2/d$ in $d\ge3$ (Andres, Deuschel and Slowik [3]). The need for higher moments in higher dimension has a good reason: for every $p,q\ge1$ satisfying $1/p+1/q>2/(d-1)$, a static environment exists satisfying the moment conditions in (1.4) where the sublinearity of the corrector fails (Biskup and Kumagai [7]). Whether a quenched invariance principle itself holds just under (1.4) in all $d\ge1$ remains open.

REMARKS AND OUTLINE
We proceed with a couple of remarks. First, the reader may wonder whether the conditions (1.4) are in fact necessary for the result to hold. This is certainly not true for static environments where, thanks to an explicit form of the corrector (see, e.g., Biskup and Prescott [8, Introduction]) and the fact that we deal with the variable-speed random walk (see Barlow and Deuschel [4, Theorem 1.1] for changes in the constant-speed case), the first condition in (1.4) can be replaced by $a_0(e)<\infty$ a.s. In the absence of the second condition in (1.4) we actually get a trivial result:
Theorem 2.1 (Role of the lower moment condition) Let $\mathbb P$ be the law of static conductances $\{a(x,x+1)\colon x\in\mathbb Z\}$ that are stationary and ergodic with respect to shifts and obey $\mathbb P\bigl(a(0,1)<\infty\bigr)=1$ and $\mathbb E\,a(0,1)^{-1}=\infty$. (2.1) Then for each $\delta>0$,
In particular, under the diffusive scaling the random walk tends to a vanishing limiting process, at least in the sense of finite-dimensional distributions averaged over the environment.
As should be intuitively clear, the main role of the upper moment condition is to prevent blow-ups. Here it suffices to consider spatially homogeneous (dynamical) random environments:
Theorem 2.2 (Role of the upper moment condition) Given a stationary ergodic process $\{\eta_t\colon t\in\mathbb R\}$ on $(0,\infty)$ with law $\mathbb P$, define the dynamical conductances via $a_t(x,x+1):=\eta_t$ for all $x\in\mathbb Z$ and $t\in\mathbb R$. If $\mathbb E\eta_0=\infty$, then for any $t>0$ and for $\mathbb P$-a.e. sample of the conductances, the random variables $\{n^{-1/2}X_{nt}\colon n\ge1\}$ are not tight under $P^0_a$.
These examples show that our moment conditions (1.4) are not only sufficient, but also necessary for a quenched invariance principle with a non-trivial limit process to hold in all the environments satisfying Assumption 1.1.
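The mechanism behind Theorem 2.2 can be seen from a short second-moment computation. The sketch below is a heuristic, not the proof given in Section 9: it assumes the spatially homogeneous choice $a_t(x,x+1)=\eta_t$ for every $x$ and the usual variable-speed form of the generator, and it takes for granted that Dynkin's formula applies to $f(x)=x^2$.

```latex
% Assumption of this sketch: a_t(x,x+1) = \eta_t for every x, so that
% L_t f(x) = \eta_t [ f(x+1) + f(x-1) - 2 f(x) ].
\begin{align*}
  L_t\,x^2 &= \eta_t\bigl[(x+1)^2 + (x-1)^2 - 2x^2\bigr] = 2\eta_t, \\
  E^0_a\bigl(X_t^2\bigr) &= 2\int_0^t \eta_s\,\mathrm{d}s, \\
  \frac1n\,E^0_a\bigl(X_{nt}^2\bigr)
    &= 2t\cdot\frac{1}{nt}\int_0^{nt}\eta_s\,\mathrm{d}s
    \;\xrightarrow[n\to\infty]{}\; 2t\,\mathbb E\eta_0 = \infty
    \quad \mathbb P\text{-a.s.}
\end{align*}
```

Divergence of the rescaled second moment does not by itself rule out tightness. Under the same assumptions, however, the walk can be written as a rate-two continuous-time simple random walk run at the clock $A(t)=\int_0^t\eta_s\,\mathrm{d}s$, and $A(nt)/n\to\infty$ then suggests that the walk spreads faster than diffusively.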
Our second remark concerns the situation when we actually allow the conductances to vanish over sets of times of positive Lebesgue measure. This has been addressed by Biskup and Rodriguez [9], albeit only in $d\ge2$, by requiring sufficiently high (namely, $4d+\epsilon$) moments of the quantity
We believe that the arguments presented here can be extended to cover the $d=1$ case as well, although it is not clear what the minimal moment conditions on $T_e$ should be. Note that this setting includes some relevant examples; e.g., the random walk on dynamical bond percolation (see Fig. 1).
Our third remark concerns the dual random walk, which underlies the proofs in the rest of this paper. Leaving the introduction of this walk to Section 4, we just note that this walk has the same diffusion constant as the main walk of concern in this paper (see Remark 8.4 for details). It would be of interest to see if a closer relation between these processes (ideally, a path-wise coupling) could be established. Related to this is the fact that the current proof also relies quite heavily on the assumption that the jumps are only between nearest neighbors.
Our final remark concerns the fact that the random walk is of variable speed. Here we note that, unlike the case of static environments, in dynamical environments different ways to assign speed (i.e., to normalize the generator) cannot be related by a time change of the underlying process. At this point, all the existing studies of invariance principles in these cases (namely, the aforementioned references [2,9]) are restricted to the variable-speed case. It is thus of interest to see whether the present approach can be extended to include other versions, most notably discrete-time, as well.
The remainder of this note is organized as follows. In Section 3 we present the standard homogenization argument that gives the convergence in Theorem 1.2 subject to two technical claims: existence and sublinearity of the corrector. The main novel contribution of the paper is explained in Sections 4-5, where we introduce an auxiliary random walk that drives various computations in the rest of the argument. The proofs of the technical claims are relegated to Sections 6-8. Theorems 2.1-2.2 are proved in Section 9.

HOMOGENIZATION ARGUMENT
We are now ready to start discussing the proof of our main results. The argument for convergence builds on well-known techniques from homogenization theory (see Kumagai [13] and Biskup [6] for recent overviews) which we will explain next. It is the proof of the key technical ingredients, namely the existence and sublinearity of the corrector, that requires a model-specific, and quite non-standard, approach.
We will henceforth abbreviate $b_t(x):=a_t(x,x+1)$ (3.1) and note that (1.3) becomes $b_s(y)\circ\tau_{t,x}=b_{s+t}(y+x)$ for all $s,t\in\mathbb R$ and $x,y\in\mathbb Z$. (3.2) The first point to note is that the structure of the underlying Markov chain gives us the standard "point of view of the particle:"
Lemma 3.1 (Point of view of the particle) Suppose Assumption 1.1 holds. Given a sample $a:=\{b_t(x)\colon t\in\mathbb R,\ x\in\mathbb Z\}$ from $\mathbb P$, let $\{X_t\colon t\ge0\}$ be a sample from $P^0_a$. Then $t\mapsto\tau_{t,X_t}(a)$ is a Markov process on $\Omega$ with invariant distribution $\mathbb P$. Moreover, the process is ergodic in the sense that, for any $f\in L^1(\mathbb P)$, $\lim_{t\to\infty}\frac1t\int_0^t f\circ\tau_{s,X_s}(a)\,ds=\mathbb E f$ (3.3) for $\mathbb P$-a.e. $a\in\Omega$ and $P^0_a$-a.e. $\{X_t\colon t\ge0\}$.
Next we introduce the corrector method which relies on the concept of the parabolic coordinates. These can be thought of as a time-dependent random embedding of Z into R that turns the random walk into a martingale; see Fig. 2. Note that, in static environments, the corresponding object solves a Laplace equation for the generator of the Markov chain and can thus be called a harmonic coordinate. In dynamical environments, the Laplace equation is replaced by a parabolic problem; namely, the (reversed-time) heat equation.
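For static environments the harmonic coordinate is explicit, which makes for a quick numerical check. The sketch below assumes the standard variable-speed generator $Lf(x)=a(x,x+1)[f(x+1)-f(x)]+a(x-1,x)[f(x-1)-f(x)]$ and works on a finite window of edges; the empirical harmonic mean stands in for the constant $1/\mathbb E\,a(0,1)^{-1}$ that would normalize the gradients in infinite volume. The function names are hypothetical.

```python
import random


def harmonic_coordinate(cond):
    """Harmonic coordinate on a finite window of a static environment.

    cond[k] is the conductance of the edge (k, k+1) for k = 0, ..., n-1.
    Returns psi[0..n] with psi[0] = 0 and gradients proportional to 1/cond[k].
    """
    n = len(cond)
    # Empirical harmonic mean; plays the role of 1 / E[1/a] in infinite volume.
    c = n / sum(1.0 / a for a in cond)
    psi = [0.0]
    for a in cond:
        psi.append(psi[-1] + c / a)
    return psi


def generator_applied(cond, f, x):
    """Value of (L f)(x) at an interior vertex 1 <= x <= n-1."""
    return cond[x] * (f[x + 1] - f[x]) + cond[x - 1] * (f[x - 1] - f[x])
```

Each term in `generator_applied` equals plus or minus the same constant when `f` is the output of `harmonic_coordinate`, so the sum vanishes at every interior vertex. The gradients are positive, so the embedding preserves the order of the vertices, in analogy with the non-crossing property discussed for the dynamical case.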
Recall our notation $L_t$ for the time-dependent generator in (1.1). The existence and relevant properties of the parabolic coordinates are then the content of:
(2) For each $t,s\in\mathbb R$ and each $x,y\in\mathbb Z$, the cocycle condition holds
(3) $\psi(t,x)$ is a jointly measurable function of $t$ and the environment with and
(4) Finally, the spatial gradients of $\psi(t,\cdot)$ are a.s. positive,
Note that condition (3.8) ensures that under the embedding of $\mathbb Z$ using the parabolic coordinates, the vertices do not swap their order (or, in other words, their space-time trajectories never cross; see Fig. 2). Deferring the proof to later, we note:
Then $\{M_t,\mathcal F_t\colon t\ge0\}$ is an $L^2$-martingale with càdlàg paths and the variance process
Proof. The continuity of $t\mapsto\psi(t,x)$ along with the càdlàg property of $t\mapsto X_t$ ensure the càdlàg property of $t\mapsto M_t$. Recalling that $X$ has piecewise constant paths a.s., let $N(t)$ denote the number of jumps of $X$ in the time interval $[0,t]$. Integrating (3.4) yields Since, as $\epsilon\downarrow0$, (3.13) this shows that $\{M_t,\mathcal F_t\colon t\ge0\}$ is a local martingale. The compensator on the right-hand side of (3.12) is (Lebesgue) differentiable, and so the quadratic variation process $[M]$ (using Helland's [11] notation) of $M$ is carried entirely by its discontinuous part, (3.14) The variance process $\langle M\rangle$ is the compensator that makes $[M]-\langle M\rangle$ a martingale. (We use the cocycle conditions (3.5) to write $\langle M\rangle_t$ using the space-time shifts.) The condition (3.7) (and the fact that $\Theta\ge0$) ensures that $t\mapsto\Theta\circ\tau_{t,X_t}$ is locally integrable and, using an elementary localization argument, $M_t$ is thus square integrable for all $t\ge0$.
As noted before, $x\mapsto\psi(t,x)$ can be thought of as a time-dependent, random embedding of the lattice $\mathbb Z$ into $\mathbb R$ that makes the random walk a martingale. The deformation caused by the change of the embedding is the aforementioned corrector. A key issue to address now is how much the deformation affects the random walk at the diffusive space-time scales. For this we need: The proof of this theorem will be given in Sections 7-8. With the help of the above theorems, we can now give:
Proof of Theorem 1.2 from Theorems 3.2-3.4. The following argument is standard; we include it merely for completeness of the exposition. Consider the martingale $M$ from (3.9) and let $\langle M\rangle_t$ be its variance process. Lemma 3.1 ensures that, for $\mathbb P$-a.e. sample of the environment and $P^0_a$-a.e. path of the Markov chain, Next recall that $N(t)$ denotes the number of jumps of $X$ in the time interval $[0,t]$ and consider the truncated quadratic variation process (using again the notation of Helland [11, formula (4.6)]) By the cocycle conditions (3.5), the compensator of $\sigma^\epsilon[M]$ is given by where we set, for general $r>0$, The Dominated Convergence Theorem ensures that $\mathbb E\Theta_r\to0$ as $r\to\infty$. By Lemma 3.1 and the downward monotonicity of $r\mapsto\Theta_r$, for $\mathbb P$-a.e. sample $a$ of the conductances and $P^0_a$-a.e. sample of $X$ we thus get .7) and (3.8). In order to prove the corresponding statement for the paths of the Markov chain itself, it suffices to show By Theorem 3.4, for each $\epsilon>0$ there is a (random) $K$ with $\mathbb P(K<\infty)=1$ such that For $\epsilon<1$, the triangle inequality converts this to the pointwise estimate The weak convergence of $M^{(n)}$ to Brownian motion ensures that $\{\sup_{t\le T}|M^{(n)}_t|\colon n\ge1\}$ is tight. Taking $n\to\infty$ followed by $\epsilon\downarrow0$ then yields (3.23), as desired.

DUAL RANDOM WALK
The proof of our main result has so far been reduced to Theorems 3.2-3.4 whose proofs constitute the remainder of this paper. In prior work (namely, [10]) these were proved with the help of elliptic-regularity techniques that require the moment conditions (1.6). As we only wish to assume (1.4), we will proceed by methods that are tailored to the underlying one-dimensional, and nearest-neighbor, nature of the problem.
To explain the main idea, let us start with the existence of the parabolic coordinates. Suppose $\psi$ solves (3.4). Then, as is readily checked, (4.3) Our principal observation, and the reason for using the adjoint-operator notation, is which is the generator of the (variable-speed) simple symmetric random walk $Y$ with jump rate $2b_t(x)$ at $x$ at time $t$. The minus sign on the left-hand side of (4.2) directs us to run this random walk backwards relative to our current labeling of time; (4.2) is then recognized to be the Kolmogorov Forward Equation associated with $Y$.
Next we recall the requirement that the gradients of $\psi$ be stationary with respect to the space-time shifts. Hence we expect that for some measurable function $\varphi$ of the conductances only. Assuming $\varphi\in L^1(\mathbb P)$, equation (4.2) is then equivalent to the statement that the measure This suggests that we first extract a stationary distribution $Q$ of the environments using the usual averaging procedure and, assuming we can show $Q\ll\mathbb P$, define $\varphi$ as the Radon-Nikodym derivative $\frac{dQ}{d\mathbb P}$. We note that $\varphi$, once constructed, has to be non-negative and a simple argument based on stationarity even gives $\varphi>0$ $\mathbb P$-a.s. This implies equivalence of $Q$ with $\mathbb P$ (which we need to convert a.s. statements under $Q$ to those under $\mathbb P$) as well as the "trajectory non-crossing" condition (3.8). The fact that $\mathbb E\varphi=1$, which will also be shown as part of the construction, then gives sublinearity of the corrector in the spatial direction.
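In the static case the density $\varphi$ can be written down by hand, which gives a simple consistency check of the strategy. The sketch below is an illustration under assumptions not taken from the text: the conductances $b(x)$ are static and placed on a cycle of $n$ sites (periodic boundary), and the dual walk jumps from $x$ to each neighbor at rate $b(x)$, i.e., at total rate $2b(x)$. Under these assumptions $\varphi(x)=C/b(x)$ is stationary for the dual walk, matching the gradient of the static harmonic coordinate. The function names are hypothetical.

```python
import random


def static_density(b):
    """Candidate stationary density phi(x) = C / b(x), normalized to mean one."""
    n = len(b)
    c = n / sum(1.0 / v for v in b)
    return [c / v for v in b]


def stationary_residual(b, phi):
    """Net probability flux into each site of the cycle under the dual walk.

    The dual walk leaves site y at rate b[y] towards each of its two neighbors,
    so the flux into x is phi[x-1]*b[x-1] + phi[x+1]*b[x+1] - 2*phi[x]*b[x].
    """
    n = len(b)
    return [phi[(x - 1) % n] * b[(x - 1) % n]
            + phi[(x + 1) % n] * b[(x + 1) % n]
            - 2.0 * phi[x] * b[x]
            for x in range(n)]
```

The residual vanishes because $\varphi(x)b(x)$ is constant in $x$. The normalization to mean one mirrors the condition $\mathbb E\varphi=1$, and positivity of $\varphi$ is immediate in this static setting.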
In order to implement the above strategy, a number of technical hurdles have to be overcome. The first of these is the very existence of the random walk Y which requires care due to the dependence of the jump-rates on the (possibly highly irregular) field of the conductances. Then comes the construction of the invariant measure Q, and the Radon-Nikodym term ϕ, which will be performed in Section 5. The proof of Theorem 3.2 comes in Section 6.
Let us start with the construction of the dual random walk Y. Proceeding along the lines standard in the theory of continuous-time Markov chains (see, e.g., Liggett [14]), we will first define the transition function of Y as the minimal positive solution to the Kolmogorov Backward Equations and then, while proving non-explosivity, construct the actual chain as well. Throughout we will regard the conductance configuration as fixed and subject only to the explicitly stated (deterministic) requirements.
We start by defining a family of non-negative kernels $K_n(s,x;t,y)$ indexed by integers $n\ge0$ and depending on reals $-\infty<t\le s<\infty$ and vertices $x,y\in\mathbb Z$, inductively via the iteration scheme where we set $K_0(s,x;t,y):=0$. Notice that, compared to the usual notation for transition kernels, the evolution runs backwards in time.
Then for all $t<s$ and all $x,y\in\mathbb Z$, the (Lebesgue) integrals on the right-hand side of (4.7) converge and $n\mapsto K_n(s,x;t,y)$ is non-decreasing and takes values in $[0,1]$. In particular, Moreover, for any $t$ and $y$ fixed, $(s,x)\mapsto K(s,x;t,y)$ is a non-negative solution to the Kolmogorov Backward Equation where $L_s$ acts on the first spatial variable on the right-hand side and the $s$-derivative is in the Lebesgue sense. The kernel $K$ is sub-stochastic in the sense that, for all $s\ge t$ and all $x\in\mathbb Z$, Finally, $K$ transforms covariantly under the space-time shifts; namely, holds for all $s\ge t$, all $u\in\mathbb R$ and all $x,y,z\in\mathbb Z$.
Proof. As is readily checked by induction, we have $K_n\ge0$ with $n\mapsto K_n$ non-decreasing and, thanks to the integrability of $t\mapsto b_t(x)$, also $\sum_{y\in\mathbb Z}K_n(s,x;t,y)\le1$. The limit in (4.8) thus exists and obeys (4.10). Passing the limit inside the integral in (4.7) using the Monotone Convergence Theorem and some elementary differentiation proves that $K$ solves the integral version of (4.9). As is checked by induction from (4.7) and (3.2), equation (4.11) holds for $K_n$; the limit (4.8) then extends it to $K$ as well.
A standard question arising in the above context is whether equality holds in (4.10). As usual, this will be resolved by interpreting $K_n$ as the transition probability for a Markov chain restricted to make at most $n$ steps; equality in (4.10) is then equivalent to non-explosivity of this chain in finite time. We need the following ingredients: (1) $Z$ := the discrete-time simple symmetric random walk on $\mathbb Z$, and (2) $N$ := an independent rate-1 Poisson point process. Let $P^x$ denote the joint law of these objects such that $P^x(Z_0=x)=1$. Aiming to define the desired Markov chain as a suitable time-change of the constant-speed continuous-time simple random walk $t\mapsto Z_{N(t)}$, we first need to prove: is Borel-measurable and locally Lebesgue integrable on $(0,\infty)$ for all $x\in\mathbb Z$ and, in addition, that (as a function of $t$) Then for all $x\in\mathbb Z$ and $P^x$-a.e. realization of the processes $Z$ and $N$ as above, there is a unique Moreover, we have $P^x(A(t)<\infty)=1$ for each $t\ge0$ and each $x\in\mathbb Z$. In particular, is well defined for all $t\ge0$ $P^x$-a.s. and obeys
Proof. The starting point is to solve (4.13) for $A$. We will do this by constructing its inverse, to be denoted by $W$. Let $\tau_0:=0<\tau_1<\tau_2<\dots$ mark the arrival times of the Poisson process $N$. On $[\tau_n,\tau_{n+1})$ we have $N(\cdot)=n$ and so we may define $W$ inductively by setting $W(0)=W(\tau_0):=0$ and for all $n\ge0$. Here the second condition in (4.12) forces that $W(t)<\infty$ for all $t\ge0$ while the integrability and positivity of $t\mapsto b_{-t}(x)$ assumed in (4.12) ensure that $t\mapsto W(t)$ is uniquely defined, strictly increasing and continuous on $[0,\infty)$.
Next let $W(\infty):=\sup_{t\ge0}W(t)$ and define the inverse of $W$ by Set $t_n:=W(\tau_n)$ and note that $N(A(s))=n$ for $A(s)\in[\tau_n,\tau_{n+1})$, which is equivalent to $s\in[t_n,t_{n+1})$. Using this in (4.16) (and invoking the continuity of $A$) shows As $t_0=0$, this yields (4.13) for all $t<W(\infty)$ by elementary resummation. From (4.7) we now inductively check that, for all $t\ge0$, To get (4.15) we have to show that $Y$ is non-explosive, meaning $P^x\bigl(N(A(t))<\infty\bigr)=1$ for each $t\ge0$. By (4.17) this boils down to proving $W(\infty)=\infty$ $P^x$-a.s. Noting that $Z$ is recurrent, there is $P^x$-a.s. an infinite sequence $n_0=0<n_1<n_2<\dots$ enumerating the times with $Z_{n_k}=x$. Then $Z_{N(A(s))}=x$ for $s\in[t_{n_k},t_{n_k+1})$ and so, by (4.18), The sum on the right diverges P
As a consequence we now readily get: Under the assumptions of Lemma 4.2, for each $x\in\mathbb Z$ and each $s\in\mathbb R$, $(t,y)\mapsto K(s,x;-t,y)$ is a strong solution to the Kolmogorov Forward Equation for all $t\le s$ and all $y\in\mathbb Z$.
Proof. By a simple translation of the environment (which preserves the conditions of Lemma 4.2) it suffices to prove this for $s:=0$. In this case we have the representation (4.19). Decomposing according to the last step of the walk $Y$ we then get Taking $n\to\infty$ and using (4.8) along with the Monotone Convergence Theorem we get that $K$ satisfies the integral form of (4.21).
Note that the proof also yields which will come in handy later.
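The time-change construction of $Y$ used in this section can be mimicked numerically. The following is a rough sketch under stated assumptions rather than the construction itself: `rate(t, x)` is a hypothetical user-supplied positive function standing for the jump rate of $Y$ out of $x$ at its own time $t$ (which in the text would correspond to $2b_{-t}(x)$ after the time reversal), and the clock integral is approximated by a left-endpoint Riemann sum with step `dt`. The walk holds at its current position until the accumulated rate reaches the next gap $\tau_{n+1}-\tau_n$ of a rate-1 Poisson process, and then moves to a uniformly chosen neighbor.

```python
import random


def dual_walk_jump_times(rate, t_end, x0=0, dt=1e-3, rng=None):
    """Approximate jump times t_n = W(tau_n) and positions of the dual walk.

    rate(t, x) -- assumed positive jump rate out of x at (reversed) time t
    dt         -- step of the Riemann sum approximating the clock integral
    Returns a list of (t_n, Z_n) for the jumps found before time t_end.
    """
    rng = rng or random.Random()
    z, s = x0, 0.0
    jumps = []
    while True:
        # Gap tau_{n+1} - tau_n between consecutive arrivals of the Poisson process N.
        target = rng.expovariate(1.0)
        acc = 0.0
        # Advance time until the integrated rate at the current site reaches the gap.
        while acc < target:
            if s >= t_end:
                return jumps
            acc += rate(s, z) * dt
            s += dt
        # One step of the discrete-time simple symmetric random walk Z.
        z += rng.choice((-1, 1))
        jumps.append((s, z))
```

Because of the discretization, the sketch records at most one jump per step `dt`, so `dt` should be small compared with the inverse of the largest rate encountered. Non-explosion is automatic here, since only finitely many steps fit into a bounded time window; in the text it is a genuine property that has to be proved.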

INVARIANT MEASURE FOR DUAL RANDOM WALK
Moving along the strategy outlined at the beginning of Section 4, we will now construct an invariant distribution $Q$ for the Markov chain $t\mapsto\tau_{-t,Y_t}(a)$ on random environments, as the main step towards Theorem 3.2. Throughout we consider Assumption 1.1 and the moment conditions (1.4) as granted. We leave it to the reader to check that this ensures the condition (4.12) for a.e. sample of the conductances. A standard way to extract an invariant distribution is to average the indicator of an event $A$ over a finite stretch of the Markov chain path initiated from the a priori measure, and then take a weak subsequential limit. For such an averaged measure, Tonelli's Theorem, the shift-invariance of $\mathbb P$ and (4.11) yield Writing $\varphi_T$ for the expression following $1_A$ in (5.1) gives $Q_T(da):=\varphi_T(a)\,\mathbb P(da)$. Assuming we can prove tightness, every subsequential weak limit of the measures $Q_T$ as $T\to\infty$ will then be invariant for the induced chain $t\mapsto\tau_{-t,Y_t}(a)$. We will use the above derivation only as motivation; for our purposes, it will be more convenient to work with $T$ averaged over an exponential distribution. We thus define our approximate Radon-Nikodym term by $\varphi_\epsilon:=\epsilon\int_0^\infty dt\,e^{-\epsilon t}\sum_{y\in\mathbb Z}K(t,y;0,0)$. (5.2) A similar calculation as in (5.1) shows, with the help of (4.11), that and, in particular, $\varphi_\epsilon<\infty$ a.s. The main technical problem is to control the "mass" of $\varphi_\epsilon$ in the limit as $\epsilon\downarrow0$. This will be done via: and, in fact,
Before we embark on a formal proof, let us note that a similar kind of weighted-$L^2$ estimate appears in most corrector-based approaches to the random conductance model. Disregarding various convergence issues, it is a consequence of the following argument: Introduce the quantity Then $\varphi_\epsilon-1$ is the spatial gradient of $\chi_\epsilon$, Moreover, writing $\mathcal L$ for the operator $L_t$ lifted to the space of environments, and denoting by the local drift at the space-time origin, $\chi_\epsilon$ satisfies the "massive" corrector equation These two facts give $\chi_\epsilon$ the meaning of an approximate, stationary corrector. Multiplying (5.10) at $t=0$ and $x=0$ by $\chi_\epsilon$, taking expectation, using that along with the fact that $\mathbb E\bigl(\frac{\partial}{\partial t}\chi_\epsilon^2\circ\tau_{t,0}\bigr)=0$ thanks to stationarity of $\mathbb P$ produces the standard identity which, being a direct consequence of the PDE (5.10), can be thought of as a statement of elliptic regularity. From (5.12) we get (5.5) by dropping the first term on the left and applying the Cauchy-Schwarz inequality on the right-hand side. Of course, the main issue with this formal calculation is that, at this point, we have no a priori information on the integrability of (and even convergence of the integral defining) $\chi_\epsilon$. We will therefore need to introduce an additional truncation and work with averaging over space and time instead of the random environment.
Recall our notation $K_n$ for the kernels defined in (4.7). We start by introducing a truncated version of $\varphi_\epsilon$ via $\varphi_{\epsilon,n}:=\epsilon\int_0^\infty dt\,e^{-\epsilon t}\sum_{y\in\mathbb Z}K_n(t,y;0,0)$, $n\ge0$. (5.13) Since $n\mapsto K_n$ is (pointwise) non-decreasing and tending to $K$, we have $\varphi_{\epsilon,n}\le\varphi_\epsilon$ and so $\mathbb E\varphi_{\epsilon,n}\le1$, $n\ge0$, (5.14) with $\varphi_{\epsilon,n}\uparrow\varphi_\epsilon$ as $n\to\infty$ thanks to the Monotone Convergence Theorem. The key reason for introducing the truncated objects is that they are pointwise bounded: Since the random walk $Y$ makes only nearest-neighbor jumps and the kernel $K_n$ involves only trajectories with at most $n$ jumps, the sum in (5.13) is effectively reduced to $|y|\le n$. From $K_n\le1$ we then have $\varphi_{\epsilon,n}\le2n+1$. (5.15) Next we introduce the truncated version of (5.6), Here the integral converges absolutely since $t\mapsto\varphi_{\epsilon,n}\circ\tau_{t,0}$ is continuous, $t\mapsto b_t(x)$ is locally integrable and (5.15) thus gives By the first condition in (1.4) the integral has finite expectation under $\mathbb P$; Tonelli's Theorem then implies that the integral is finite $\mathbb P$-a.s. We now claim a finite-$n$ version of (5.7):
Lemma 5.2 For all $\epsilon>0$ and all $n\ge0$,
Proof. The shift-covariance of the $K_n$ kernel implies, for any $t>0$, that The Kolmogorov Forward Equation (4.23) then yields where the derivative on the left is in the Lebesgue sense and $\mathcal L^+$ is the operator $L^+_t$ lifted to the space of environments; The definition (5.16) now shows where we also used that $t\mapsto(\varphi_{\epsilon,n+1}-1)\circ\tau_{t,0}$ is bounded.
In light of (5.18), for the integrand in (5.16) we now get where $\mathcal L$ and $V$ are as in (5.8) and (5.9). Using this we readily check:

Lemma 5.3 (Corrector equation)
For each $x\in\mathbb Z$ and each $n\ge0$, $t\mapsto\chi_{\epsilon,n}\circ\tau_{t,x}$ is continuous and Lebesgue differentiable with
Proof. It suffices to prove the claim for $x=0$. Pick $t\in\mathbb R$. Invoking (5.23) in (5.16), an elementary change of variables yields The bound (5.17) then shows that $t\mapsto\chi_{\epsilon,n}\circ\tau_{t,0}$ has sublinear growth as well. Equipped with these observations, we are ready to give:
Proof of Proposition 5.1. The proof runs parallel to the argument leading up to (5.12) except that we average over space-time rather than the environment. We continue writing $\mathcal L\chi_{\epsilon,n}+V$ as it is concise, but the reader should replace this by the left-hand side of (5.23) whenever convenient. The starting point is to multiply (5.24) by $\chi_{\epsilon,n+1}\circ\tau_{t,0}$ and integrate over $t\in[0,r]$, for some $r>0$. Relabeling $n+1$ for $n$, this yields The integrals are finite $\mathbb P$-a.s. thanks to the fact that $t\mapsto\chi_{\epsilon,n}\circ\tau_{t,0}$ is continuous and $t\mapsto b_t(x)$ is locally integrable $\mathbb P$-a.s. Next we multiply both sides by $e^{-r/R}$ and integrate over $r\ge0$. The resulting integrals converge absolutely thanks to the $\mathbb P$-a.s. sublinear growth of $t\mapsto\chi_{\epsilon,n}\circ\tau_{t,0}$. Neglecting the contribution of the second term on the left of (5.27) and combining that of the first term with the corresponding term on the right-hand side then shows For $2R>1/\epsilon$ (to be assumed next) we can drop the first term. Summing the resulting inequality over its translates by $x\in\{0,\dots,R\}$, the identity (5.23) along with Lemma 5.2 and integration by parts show where $f_R$ is a "boundary term" given explicitly by Since we are aiming to control the right-hand side of (5.29) in $\mathbb P$-probability, it suffices to focus on the $R\to\infty$ behavior of $f_R$ alone. By (5.15) and (5.17), this quantity is bounded in absolute value by (2n In light of (5.26), the part of the integrand in the large parentheses grows sublinearly in $t$ a.s. Plugging that in, bounding $te^{-t/R}$ by a constant times $Re^{-t/(2R)}$ and noting that, by, say, the $L^1$-part of the Pointwise Ergodic Theorem, the integral of $t\mapsto e^{-t/(2R)}b_t(0)$ over all $t\ge0$ is at most of order $R$ in probability, we get that $h_R$ and thus also (5.29) are $o(R^2)$ in probability. From $\varphi_{\epsilon,n}\le\varphi_{\epsilon,n+1}$ we then get The Cauchy-Schwarz inequality bounds the square of the second term on the right by the left-hand side times By the Pointwise Spatial Ergodic Theorem (and our assumptions on $\mathbb P$), this quantity is asymptotic to $R^2\,\mathbb E(b_0(0))$ as $R\to\infty$ and so with $o(R^2)/R^2\to0$ in probability as $R\to\infty$. One more use of the Pointwise Spatial Ergodic Theorem on the left-hand side (which, thanks to the Monotone Convergence Theorem, applies to non-negative random variables even without any moment assumptions) then yields (5.35) The claim now follows from $\varphi_{\epsilon,n}\uparrow\varphi_\epsilon$ and the Monotone Convergence Theorem.

Remark 5.4
Once we have (5.35), the Cauchy-Schwarz inequality along with the first condition in (1.4) show that also $\chi_\epsilon\in L^2(\mathbb P)$. The argument leading up to (5.12) can then be applied, thus proving the identity (5.12) directly.
With the weighted-$L^2$ estimate in hand, we can move to the construction of the Radon-Nikodym term $\varphi$. Instead of working with invariant measures, we proceed by (equivalent) functional-analytic arguments. Consider the linear functional (5.36) and note that it is positive and normalized in the sense that Writing $L^0(\mathbb P)$ for the set of equivalence classes of measurable functions of the environment, the main outcome of the present section is now:
Theorem 5.5 For each $\epsilon>0$, the linear functional $\phi_\epsilon$ extends to a continuous linear functional on with norm bounded by $[\mathbb E(b_0(0))]^{1/2}$ regardless of $\epsilon>0$. In particular, weak sequential limits of $\phi_\epsilon$ as $\epsilon\downarrow0$ exist and take the form $f\mapsto\mathbb E(\varphi f)$ for some $\varphi\in L^0(\mathbb P)$ satisfying $\varphi\ge0$, $\mathbb E(\varphi)=1$ and $\mathbb E\bigl(b_0(0)\varphi^2\bigr)\le\mathbb E(b_0(0))$. (5.39) In addition, for each $t>0$ we have x; 0, 0), $\mathbb P$-a.s. (5.40) In particular, $\varphi$ admits a version such that and that, on a set of full $\mathbb P$-measure, $t\mapsto\varphi\circ\tau_{t,x}$ is continuous and weakly differentiable with where $\mathcal L^+$ is the operator in (5.21). The measure $Q$ defined from $\varphi$ via (4.6) is stationary and ergodic for the induced Markov chain $t\mapsto\tau_{-t,Y_t}(a)$.
Proof. Let $H^\star$ denote the space of continuous linear functionals on $H$. Pick $f\in L^\infty(\mathbb P)$ The Cauchy-Schwarz inequality along with (5.5) yield It follows that $\phi_\epsilon$ extends continuously to $H$ with the norm bounded by $[\mathbb E(b_0(0))]^{1/2}$. As bounded sequences in $H^\star$ are weakly compact, sequential limits of $\phi_\epsilon$ as $\epsilon\downarrow0$ exist and, by the Riesz lemma, take the form $f\mapsto\mathbb E\bigl(b_0(0)^{-1}hf\bigr)$ for some $h\in H$. Writing $\varphi:=b_0(0)^{-1}h$ we get the second inequality in (5.39); the equality in (5.39) and non-negativity of $\varphi$ follow from (5.37) and the fact that $1\in H$. Next we observe that, for any $t>0$, splitting the integral in (5.2) into an integral over $[0,t)$ and another over $[t,\infty)$, the Chapman-Kolmogorov equations for $K$ along with (4.11) yield The calculation (5.1) shows that the $L^1(\mathbb P)$-norm of the first term is $1-e^{-\epsilon t}$, which tends to zero as $\epsilon\downarrow0$. Integrating (5.44) against $f\in L^\infty(\mathbb P)$, moving the shift away from $\varphi_\epsilon$, taking $\epsilon\downarrow0$ along the sequence where $\phi_\epsilon$ converges and moving the shift back to $\varphi$ proves (5.40) with the null set possibly depending on $t$. Now define By (5.40) and Tonelli's Theorem, $\tilde\varphi=\varphi$ $\mathbb P$-a.s. and so $\tilde\varphi$ is a version of $\varphi$. As is checked with the help of (5.13) and a change of variables, $t\mapsto\tilde\varphi\circ\tau_{t,x}$ is continuous in $t\in\mathbb R$ on a set of full $\mathbb P$-measure. Plugging (5.40) for $\varphi$ on the right-hand side of (5.45) and invoking the Chapman-Kolmogorov equations for $K$ shows that (5.40) extends to $\tilde\varphi$ and so we can henceforth regard $\varphi$ to be this version. The differential equation ( due to strict positivity of $K(t,\cdot;0,\cdot)$ for $t>0$ and (5.40) again. The event on the right is invariant under space-time shifts and so, by ergodicity of $\mathbb P$, it has probability zero or one. The case of full measure is ruled out by $\mathbb E(\varphi)=1$. The invariance of $Q$ for the random walk $t\mapsto\tau_{-t,Y_t}(a)$ on $\Omega$ is a consequence of (5.40). To prove ergodicity, we adapt an argument of Andres [1, Proposition 2.1]. Let $A$ be a measurable set of environments such that for $Q$-a.e. $a\in A$ and each $t>0$ we have $\tau_{-t,Y_t}(a)\in A$ for $P^0_a$-a.e. sample of $Y$. This implies But $\varphi>0$ and, for $t>0$, also $K(0,0;-t,x)>0$ $\mathbb P$-a.s. and so we get s. for each $t>0$ and each $x\in\mathbb Z$. Swapping the roles of $A$ and $A^{\mathrm c}$ then gives $1_A=1_A\circ\tau_{-t,x}$ $\mathbb P$-a.s. for each $t>0$ and each $x\in\mathbb Z$. By shift-ergodicity of $\mathbb P$, we have $\mathbb P(A)\in\{0,1\}$. Since $Q$ is equivalent to $\mathbb P$, the same applies to $Q(A)$.

PARABOLIC COORDINATES
Having established the necessary facts pertaining to the dual random walk Y we now move to the construction of the parabolic coordinates. This proves the first of the two technical theorems underpinning the main convergence result. We then also prepare the ground for proving the second technical claim by developing an alternative representation for the corrector. Let ϕ be a quantity constructed in Theorem 5.5; we assume that ϕ is the version that satisfies (5.42) for all t and x on a set of full P-measure. Set where the integral converges absolutely P-a.s. by Tonelli's Theorem and the fact that b 0 (0)ϕ ∈ L 1 (P) as implied by b 0 (0)ϕ 2 ∈ L 1 (P) and the second condition in (1.4). The quantity χ(t, 0) will serve as the corrector in time t; compare with its precursor in (5.6).
Remembering that ϕ should correspond to the spatial gradients of the parabolic coordinate, the cocycle conditions (3.5) dictate that we define The quantities in (6.1), resp., (6.2) are defined analogously for negative t, resp., x, by swapping the limits of the integral/sum and changing the overall sign of the expression. With this definition in hand, we are ready to give: Proof of Theorem 3.2. A similar calculation to that in the proof of Lemma 5.2 shows, with the help of the PDE (5.42) obeyed by ϕ, that This readily implies and proves the cocycle condition (3.5). The PDE (3.4) obeyed by ψ is then a direct consequence of the definition (6.1). The identities (3.6-3.7) follow from (5.39) while (3.8) is a rewrite of (5.41).
Although the formula (6.1) serves well for the construction of the parabolic coordinate, it appears less amenable for the purposes of proving Theorem 3.4. There we will use a different representation which we will prove next: where each of the double sums converges to a finite number P-a.s.
For the proof of a.s. convergence we first show: Proof. Using the stationary distribution on environments, Q(da) := ϕ(a)P(da), the quantity in question is recognized as the left-hand side of Since t → Y t is a martingale with associated variance process As noted above, the last expectation is finite by (5.5).
Proof of Proposition 6.1. Fix t ≥ 0. A shift of the environment and a change of variables show that the expectation under P of the sum of the two terms in (6.5) equals the expectation in (6.6). Since ϕ > 0 P-a.s., the sums in (6.5) converge to a finite number P-a.s. Denoting, with some abuse of our earlier notation, by χ n (t) the quantity in (6.6) with the sums over x and y additionally restricted to values in [−n, n], we in particular have χ n (t) → χ(t, 0) as n → ∞ a.s. by the Dominated Convergence Theorem. We will now calculate the t-derivative of χ n (t). Fix y ∈ Z and let us temporarily abbreviate ϕ x := ϕ • τ t,x , b x := b t (x) and K x := K(t, x; 0, y). (6.10) Then (5.42) reads as while the Backward Kolmogorov Equation (4.9) becomes The product rule for the derivative then shows Using the standard telescoping argument, we have (6.14) Similarly we obtain We will now return to the full notation while still abbreviating (bϕ) t,x := b t (x)ϕ • τ t,x . Summing (6.14-6.15) over y in the respective range of values (still confined to [−n, n]) and then subtracting the sums in (6.15) from those in (6.14) yields Here the first term on the right of (6.16) arose by combining the contributions from the terms b 0 ϕ n K −1 in (6.14-6.15). Similarly, the second term combines the contributions from the term b −1 ϕ −1 K 0 . The remaining terms in (6.16) collect the contributions of the terms b −n−1 ϕ −n−1 K −n , b −n ϕ −n K −n−1 , b n ϕ n K n+1 and b n+1 ϕ n+1 K n , respectively. The first two terms on the right of (6.16) dominate the expression in the limit n → ∞. Indeed, integrating over a compact interval of t and taking expectation with respect to P, the remaining four terms on the right of (6.16) converge to zero in L 1 (P) as n → ∞. The term P x (|Y t | ≤ n) in turn increases to one as n → ∞ for both x = 0, −1. The Monotone Convergence Theorem gives
$$\text{r.h.s. of (6.5)} = \int_0^t ds\,\bigl[b_s(-1)\,\phi\circ\tau_{s,-1} - b_s(0)\,\phi\circ\tau_{s,0}\bigr], \quad \text{P-a.s.} \qquad (6.17)$$
The quantity on the right is χ(t, 0), as desired.

Remark 6.3
The reader may wonder at this point how we arrived at the above alternative expression for χ(t, 0) in the first place. This was done as follows. We know that the spatial gradients of the corrector are given by ϕ − 1. Setting where the sum converges because x → ϕ • τ 0,x has sublinear growth, we then get This indicates that χ • τ t,x is an approximate (stationary) corrector at space-time position (t, x). We should thus be able to approximate χ(t, 0) by the quantity Using (5.40) for the term ϕ • τ 0,x and invoking (4.11) along with the fact that y → K(t, x; 0, y) is a probability mass function, this is recast as Taking, at least formally, the limit ε ↓ 0 in (6.21), we then discover (6.5).
Looking at how the ranges of x and t in (3.16) scale with n, for the behavior of the corrector in time we need to actually prove a subdiffusive growth estimate: We remark that finding a representation of the corrector that makes subdiffusivity of the corrector in time transparent has been the primary driving force behind the approach developed in the present paper. Before we delve into its proof (which is deferred to the next section), let us show how it implies the desired theorem: Proof of Theorem 3.4 from Proposition 7.2. We follow arguments developed in Berger and Biskup [5]; see also [6,Lemma 4.12]. First we identify a "good grid" of space-time points where the corrector can be controlled by way of ergodic-theoretical and geometric arguments. The oscillation of the corrector over the "holes" left out by the grid is then controlled by methods of harmonic analysis. The proof is divided into three steps.
Let ρ K,ε be the density of K,ε-good points in Z; this quantity exists by Birkhoff's Ergodic Theorem and is generally random but, since its expectation equals the probability in (7.11), from the obvious monotonicity in K we have P-a.s. (7.12) Similarly, if θ K,ε is the density of {n ∈ N : (n, 0) is K,ε-good} in N (remember that t → χ(t, 0) is continuous so checking only integer times will be enough) we have θ K,ε → 1 as K → ∞, P-a.s. (7.13) It follows that, for each ε > 0 and P-a.e. environment a there is K = K(a) < ∞ such that We now fix this K and let G K,ε denote the set of (t, x) ∈ [0, ∞) × Z such that at least one of the following conditions holds: (1) t = 0 or x = 0 or both, (2) t is an integer and (t, 0) is K,ε-good, (3) (0, x) is K,ε-good. The set G K,ε is the aforementioned "good grid." Step 2 (Estimating χ on good grid): We now derive a pointwise estimate of the corrector on the good grid. Note that each (t, x) ∈ G K,ε can be connected to the origin by following a pair of horizontal and vertical lines that lie entirely in G K,ε ; which line comes first depends on which of the three conditions above applies at (t, x), and one or both lines are trivial when (1) is in force. Since these line segments meet at a K,ε-good point, the cocycle condition and the triangle inequality show It remains to control the corrector at points away from G K,ε . Note that the "holes" left out by G K,ε are rectangles bounded by horizontal and vertical lines in G K,ε . We will write ∂R for the points in G K,ε bounding rectangle R (which we think of as disjoint from G K,ε ). Next recall that the parabolic coordinates are defined so that ψ(t, X t ) is a martingale. Using the Optional Stopping Theorem (or the PDE for ψ directly), this implies a Maximum Principle: For any rectangle R as above and (7.17) where diam Z (R) is the diameter of the projection of R ∪ ∂R onto the spatial coordinate.
Since ∂R ⊂ G K,ε , the supremum on the right can be controlled via (7.15) provided we can control the diameter of any rectangle that intersects [0, n] × [− √ n, √ n]; this then takes care also of the second term on the right.
Step 3 (Away from good grid): Let {x k : k ∈ Z}, with x 0 := 0, be the increasing sequence enumerating K,ε-good points on the line t = 0; this sequence exists by the fact that ρ K,ε > 0 (note that the left and right densities of K,ε-good points are equal P-a.s.). The existence and positivity of the density of good points imply Similarly, letting {t k : k ≥ 0}, where t 0 := 0, enumerate the K,ε-good points with integer time coordinate and zero space coordinate, we have Since x k /k as well as t k /k tend to positive numbers as k → ∞, there is a (random) K < ∞ such that, for all k, It follows that, once n ≥ K + n, Combining this with (7.15) and (7.17) yields Dividing by √ n and taking n → ∞ followed by ε ↓ 0 then yields the claim.

SUBDIFFUSIVITY IN TIME
As a final point of the proof, it remains to prove the subdiffusive estimate for the corrector in time. It is here that we will benefit from the representation in Proposition 6.1. As it turns out, it suffices to focus on the limit of large negative times. The cocycle conditions give χ(−t, 0) = −χ(t, 0) • τ −t,0 for any t > 0, and so where Y is the dual random walk. We start by showing that the sums in (8.1) are dominated by x-values of order √ t: and lim M→∞ lim sup Proof. By symmetry it suffices to prove just (8.2). Recall the definition (4.13) of the time-change process A(t) that links Y to the discrete-time simple symmetric random walk (Z n ) n≥0 and an independent rate-1 Poisson process (N(t)) t≥0 . Pick p ∈ (0, 1/2) and note that the sum in (8.2) is bounded by the sum of the following terms and We will now estimate these two terms separately. Since dA(t) = 2b 0 (0) • τ −t,Y t dt, we can analyze the behavior of t → A(t) by following the evolution of the environment from the point of view of the random walk Y. To this end, define the maximal function A ⋆ := sup t>0 A(t)/t. The Markov inequality shows, for any q > 0, that Since b 0 (0) ∈ L 1 (Q) and Q is invariant for the environment observed from the walk Y, the Maximal Ergodic Theorem gives E Q E 0 ((A ⋆ ) r ) < ∞ for all r ∈ (0, 1) and so Now use the fact that if f ∈ L 1 (P) is non-negative, and $f^\star := \sup_{n\ge1}\frac1n\sum_{k=0}^{n-1} f\circ\tau_{0,k}$ is the associated maximal function under spatial shifts, then integration by parts yields with c(q) < ∞ whenever 2q > 1. Hence, if we assume q ∈ (1/2, 1 − p), applying this to the function f := ϕ E 0 ((A ⋆ ) This shows that $\frac{1}{\sqrt t} I_M(t)$ tends to zero as t → ∞ followed by M → ∞. The convergence occurs on { f ⋆ < ∞} which is a full-measure event because f ∈ L 1 (P) by (8.7).
Concerning the expression in (8.5), abbreviate $t(x) := x^{2(1-p)} t^{p}$ and note that Since N(t) is Poisson with parameter t, the first probability is at most $e^{-c\,t(x)}$, for some constant c > 0, by a standard large-deviation estimate. The Reflection Principle in turn bounds the second probability by $2e^{-c x^2/t(x)}$. Bounding the sum over x ≥ M √ t as the sum over n ≥ M − 1 and a sum over x ∈ [n √ t, (n + 1) √ t) and invoking integration by parts shows where ϕ ⋆ is the maximal function associated with spatial shifts of ϕ. The resulting sum tends to zero as M → ∞ uniformly in t ≥ 1.
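The maximal function under spatial shifts that enters the proof above is easy to evaluate on a finite window, which may help in seeing why its finiteness controls all partial sums at once. The following Python sketch is purely illustrative: the names `maximal_function` and `partial_sums_bounded` are hypothetical, the list `values` stands for the samples f • τ 0,k , k = 0, ..., N − 1, in one fixed environment, and the supremum over n ≥ 1 is truncated at n = N, so the result is only a lower bound on the true maximal function.

```python
def maximal_function(values):
    """Truncated maximal function: the largest average of the first n
    entries over 1 <= n <= len(values).

    values[k] plays the role of f(tau_{0,k} a) for a non-negative f.
    The true maximal function is a supremum over all n >= 1, so this
    finite-window quantity is only a lower bound on it.
    """
    if not values:
        raise ValueError("need at least one sample")
    best = float("-inf")
    running_sum = 0.0
    for n, value in enumerate(values, start=1):
        running_sum += value
        best = max(best, running_sum / n)
    return best


def partial_sums_bounded(values, f_star):
    """Check that the sum of the first n entries is at most n * f_star
    for every n, which is how a maximal function is used to dominate sums."""
    running_sum = 0.0
    for n, value in enumerate(values, start=1):
        running_sum += value
        if running_sum > n * f_star + 1e-12:
            return False
    return True
```

By construction, `partial_sums_bounded(values, maximal_function(values))` holds for any non-empty input, which is the finite-window analogue of bounding a sum over x by (number of terms) × (maximal function).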
In order to handle the remaining part of the sums in (8.1), we will prove: There is σ > 0 such that, for W := N (0, σ 2 ), P-a.e. environment and any M > 0, as well as Before we give the proof, note that from here we now quickly get: Proof of Proposition 7.2 from Lemma 8.2. Since the right-hand sides of (8.12-8.13) coincide, Lemma 8.1 gives
$$\lim_{t\to\infty} \frac{|\chi(-t,0)|}{\sqrt t} = 0, \quad \text{P-a.s.} \qquad (8.14)$$
so we just need to turn this into a statement about the limit of times tending to positive infinity. Let ε > 0 and set Then K < ∞ P-a.s. by (8.14) and so, by the Pointwise Ergodic Theorem, for P-a.e. environment there is a (random) R < ∞ such that the set Ξ R := {n ∈ N : K • τ n,0 ≤ R} has a positive (and well-defined) density in N. This implies that there is a (random) n 0 < ∞ such that Ξ R ∩ [n, 2n] ≠ ∅ for all n ≥ n 0 . Now assume that t ∈ [n/2, n] for some n ≥ n 0 and use the above observation to find t n ∈ Ξ R ∩ [n, 2n]. Then Since t n ≤ 2n ≤ 4t and K • τ t n ,0 ≤ R, the right-hand side is at most 2R + 2ε √ 4t. Dividing by √ t and taking t → ∞ followed by ε ↓ 0, we get the desired result.
It remains to prove Lemma 8.2. Here we will use:

Lemma 8.3
For P-a.e. realization of the environment, under P 0 we have where is the constant-speed continuous-time simple symmetric random walk which obeys the Functional CLT with unit limit variance. It thus suffices to show that the clock process converges to a deterministic linear function. This follows from
$$\frac{A(t)}{t} \;\xrightarrow[t\to\infty]{P^0}\; \sigma^2, \quad \text{P-a.s.} \qquad (8.19)$$
which is itself proved by the Birkhoff Pointwise Ergodic Theorem applied under the stationary and ergodic law Q and the fact that Q is equivalent to P.
Proof of Lemma 8.2. We will again focus only on (8.12) as (8.13) is obtained analogously. Let σ be the quantity in (8.18) and denote W := N (0, σ 2 ). Given ε > 0, the quenched CLT for Y in Lemma 8.3 ensures there is a P-a.s. finite random variable T 0 on the space of environments such that Denote T x := T 0 • τ 0,x and observe that P x (Y t < 0) = P 0 (Y t < −x) • τ 0,x . (8.21) Decomposing the sum in (8.12) according to whether {T x ≤ t} occurs or not, we get Dividing both sides by √ t, the Pointwise Ergodic Theorem along with the Monotone Convergence Theorem show that the right-hand side tends to zero as t → ∞ followed by ε ↓ 0. In light of the fact that χ(0, 1) = ϕ − 1, Lemma 7.1 gives In combination with (8.22), this now proves (8.12).

Remark 8.4
Although we have not quite managed to prove this, we believe that
$$E\bigl(b_0(0)\,\phi^2\bigr) = E\bigl(b_0(0)\,\phi\bigr). \qquad (8.25)$$
This is because stationarity of P under spatial shifts combined with some elementary calculus allows us to derive
$$\frac1t\, E\bigl(\chi(t,0)^2\bigr) \;\xrightarrow[t\to\infty]{}\; 2E\bigl(b_0(0)(\phi-1)\phi\bigr) \qquad (8.26)$$
and because we expect the convergence in Lemma 8.2 to hold in the L 2 (P)-sense as well.
(Alternatively, we expect E(χ 2 ) to vanish in the limit as ε ↓ 0.) If (8.25) indeed holds, then the limiting variance of the Brownian motion arising from the walk X is the same as the limiting variance of the Brownian motion arising from Y, a fact for which we have no intuitive explanation.

NECESSITY OF THE MOMENT CONDITIONS
In this final section, we will address the situations in which one of the moment conditions fails. We start with the lower moment condition in Theorem 2.1. Fix β > 0 and consider the following quantity R β (t) := 1 du e −βu P x a (X tu = y) . (9.1) The absence of the lower moment condition manifests itself as follows: The inner product on the right-hand side is monotone decreasing with respect to the standard partial order on individual conductances and so R β (t) is decreasing in P. Next observe that, whenever P is such that the moment conditions in (1.4) hold, and X thus obeys an annealed invariance principle, we have where σ 2 is the variance of the limiting Brownian motion. In this case σ 2 can in fact be explicitly computed to be
$$\sigma^2 = 2\bigl[E\bigl(a(0,1)^{-1}\bigr)\bigr]^{-1} \qquad (9.5)$$
thanks to the explicit representation of the corrector (see, e.g., Biskup and Prescott [8]).
We will now use these facts to derive the claim. Consider P as in the statement of Theorem 2.1 and let R β (t) be related to P as in (9.2). Given ε > 0, consider the conductance model with conductances a (ε) (x, y) := a(x, y) ∨ ε and let R (ε) β (t) be the corresponding quantity in (9.2). The monotonicity in the conductance law gives
$$R_\beta(t) \ge R^{(\varepsilon)}_\beta(t), \qquad \varepsilon > 0. \qquad (9.6)$$
Moreover, (9.4-9.5) apply to R (ε) β (t). It follows that, for any ε > 0, the limes inferior of R β (t) is bounded from below by the right-hand side of (9.4) with σ 2 replaced by
$$\sigma_\varepsilon^2 := \bigl[E\bigl(a^{(\varepsilon)}(0,1)^{-1}\bigr)\bigr]^{-1}. \qquad (9.7)$$
Replacing t by 4t/δ 2 now shows, via (9.11) and a routine change of variables, that the integral on the right-hand side tends to zero as t → ∞. The claim follows.
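To make the effect of the truncation a ∨ ε concrete, here is a small Python sketch that evaluates the formula (9.5) for static conductances. Everything beyond (9.5) is an assumption made for illustration: the function names are hypothetical, and the example law (conductances uniform on (0, 1), for which the negative first moment is infinite) is not taken from the text. For that law one computes E((a ∨ ε) −1 ) = 1 + log(1/ε), so the variance given by (9.5) for the truncated model tends to zero as ε ↓ 0.

```python
import math


def sigma2_discrete(values, probs):
    """Evaluate (9.5), sigma^2 = 2 / E(1/a), for a conductance law
    supported on finitely many positive values (illustrative helper)."""
    if len(values) != len(probs):
        raise ValueError("values and probs must have equal length")
    if abs(sum(probs) - 1.0) > 1e-12:
        raise ValueError("probabilities must sum to one")
    if any(v <= 0.0 for v in values):
        raise ValueError("conductances must be positive")
    mean_inverse = sum(p / v for v, p in zip(values, probs))
    return 2.0 / mean_inverse


def sigma2_truncated_uniform(eps):
    """Evaluate (9.5) for the truncated conductance max(a, eps), where a is
    uniform on (0, 1); this law is a hypothetical example with E(1/a) infinite.

    E(1/max(a, eps)) = P(a < eps)/eps + integral_eps^1 da/a = 1 + log(1/eps).
    """
    if not 0.0 < eps <= 1.0:
        raise ValueError("eps must lie in (0, 1]")
    return 2.0 / (1.0 + math.log(1.0 / eps))
```

For instance, conductances equal to 1 or 2 with probability 1/2 each give E(1/a) = 3/4 and hence σ 2 = 8/3, while `sigma2_truncated_uniform(eps)` decays like 2/log(1/ε), that is, only logarithmically in ε.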
Concerning the failure of the upper moment condition, we give: Proof of Theorem 2.2. Consider the spatially-homogeneous (dynamical) conductances derived from the process η t as in (2.3). Since the environment is homogeneous in space, the random walk X has the law of a time change of the simple symmetric random walk. Explicitly,
$$X_t \overset{\text{law}}{=} Z_{N(\tilde A(t))}, \qquad t \ge 0, \qquad (9.13)$$
where N is an independent rate-1 Poisson process, Z is the discrete-time simple symmetric random walk on Z and
$$\tilde A(t) := 2\int_0^t ds\, \eta_s. \qquad (9.14)$$
The claim follows from the Central Limit Theorem for the random walk t → Z N(t) and the fact that, under the assumption of ergodicity of t → η t and diverging expectation of η 0 , we have Ã(t)/t → ∞ as t → ∞ P-a.s.
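The time-change representation (9.13-9.14) is straightforward to simulate, and doing so shows that the variance of X t equals the clock Ã(t). The sketch below is a minimal illustration under stated assumptions, not the construction used in the paper: the function names are hypothetical, the process η is replaced by a deterministic function `eta` supplied by the caller (the random process from (2.3) is not reproduced here), and the integral in (9.14) is approximated by the trapezoidal rule.

```python
import random


def clock(eta, t, n_grid=1000):
    """Trapezoidal approximation of (9.14): 2 * integral_0^t eta(s) ds."""
    h = t / n_grid
    total = 0.5 * (eta(0.0) + eta(t))
    for k in range(1, n_grid):
        total += eta(k * h)
    return 2.0 * h * total


def sample_walk(a, rng):
    """One sample of Z_{N(a)}: run a rate-1 Poisson clock up to time a and
    take an independent fair +/-1 step at each ring."""
    position = 0
    elapsed = rng.expovariate(1.0)
    while elapsed <= a:
        position += rng.choice((-1, 1))
        elapsed += rng.expovariate(1.0)
    return position


def empirical_variance(eta, t, n_samples=20000, seed=0):
    """Monte Carlo estimate of Var(X_t) under the representation (9.13)."""
    rng = random.Random(seed)
    a = clock(eta, t)
    samples = [sample_walk(a, rng) for _ in range(n_samples)]
    mean = sum(samples) / n_samples
    return sum((x - mean) ** 2 for x in samples) / n_samples
```

With the constant choice `eta = lambda s: 1.0` the clock is Ã(t) = 2t and the estimate returned by `empirical_variance` should be close to 2t; a function `eta` whose running average diverges makes Ã(t)/t, and with it the variance per unit time, grow without bound, which is the mechanism behind Theorem 2.2.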

ACKNOWLEDGMENTS
This project has been supported in part by the NSF award DMS-1712632 and GAČR project P201/16-15238S. I am grateful to Pierre-François Rodriguez for valuable contributions in earlier attempts to solve this problem and, later, for keen observations on the strategy that ultimately succeeded. This paper is dedicated to Jean-Dominique Deuschel on the occasion of his 60th birthday. I wish to thank Jean-Dominique for his friendship and also for challenging my mathematical ability every time we meet. This note is a response to one of these challenges.