Localization for controlled random walks and martingales

We consider controlled random walks that are martingales with uniformly bounded increments and nontrivial jump probabilities and show that such walks can be constructed so that P(S_n^u=0) decays at polynomial rate n^{-\alpha} where \alpha>0 can be arbitrarily small. We also show, by means of a general delocalization lemma for martingales, which is of independent interest, that slower than polynomial decay is not possible.


Introduction and statement of results
Consider a discrete time martingale {M_i}_{i≥0} adapted to a filtration F_i whose increments are uniformly bounded by 1, i.e. |M_{i+1} − M_i| ≤ 1, and such that P(|M_{i+1} − M_i| = 1 | F_i) > c > 0. It is folklore that in many respects, such a martingale should be well approximated by Brownian motion. In particular, one would expect that P(|M_n| ≤ 1) should be of order n^{−1/2}. Our goal in this paper is to point out that this naive expectation is completely wrong. We will frame this in the language of controlled processes below, but a corollary of our main result, Theorem 2 below, is the following.
If {M_i}_{i≥0} is a discrete time martingale (with respect to a filtration F_i) satisfying E((M_{i+1} − M_i)^2 | F_i) ∈ [δ, 1] and |M_{i+1} − M_i| ≤ n^β a.s., then
The heart of the proof of Theorem 1 uses a sequence of entrance times to a space-time region, which may be of independent interest (see Figure 2.1 for a graphical depiction).
Our interest in these questions arose while one of us was working on [7]. Charlie Smart then kindly pointed out [9] that the continuous time results in [2] and [3] concerning the viscosity solution of certain optimal control problems could be adapted to the discrete time setting (using [4]) in order to show an integrated version of Corollary 1, namely that for any fixed β, γ > 0 a martingale {M_i} as in the lemma could be constructed so that for all δ small,
(Note that γ can be taken arbitrarily large, for β fixed. The estimate (2) is in contrast with the linear-in-δ behavior one might naively expect from diffusive scaling.) This then raises the question of whether a local version of this result could be obtained, and our goal in this short note is to answer that in the affirmative.
We phrase some of our results in the language of controlled random walks. Fix a parameter q ∈ [0, 1). Consider a controlled simple random walk {S^u_i}_{i≥0}, defined as follows. Let S_0 = 0 and let F_i = σ(S_0, S_1, ..., S_i) denote the sigma-field generated by the process up to time i. A q-admissible control is a collection of random variables {u_i}_{i≥0} satisfying the following conditions:
Let U_q denote the set of all q-admissible controls. For u ∈ U_q, the controlled simple random walk {S^u_i}_{i≥0} is determined by the equation
Of course, {S^u_i}_{i≥0} is an F_i-martingale. For q = 0, we recover the standard simple random walk. We prove the following.
Theorem 2 For any q ∈ (0, 1), there exist σ_+(q), σ_−(q) ∈ (0, 1/2) and c, C ∈ (0, ∞) such that for any n and
Work related to ours (in the context of the control of diffusion processes) has appeared in [8]; more recently, the results in [1] are related to the lower bound in Theorem 2.
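To experiment with the definitions above, the following Python sketch computes the exact law of S^u_n for controls that depend only on the current position and time. It assumes, consistently with the description of u_i = q as a lazy walk and of q = 0 as the simple random walk, that u_i is the conditional probability of staying put and that the remaining mass is split evenly between the steps +1 and −1. The names `controlled_walk_law` and `control(x, i)` are illustrative and do not come from the paper.

```python
import numpy as np


def controlled_walk_law(n, control, q):
    """Exact law of S_n for a controlled walk started at 0.

    ASSUMPTION: given the past, the walk stays put with probability u_i
    and moves +1 or -1 with probability (1 - u_i) / 2 each, 0 <= u_i <= q.
    `control(x, i)` is a hypothetical callback returning u_i when S_i = x.
    Returns an array of length 2n + 1; entry x + n is P(S_n = x).
    """
    xs = np.arange(-n, n + 1)
    law = np.zeros(2 * n + 1)
    law[n] = 1.0                      # S_0 = 0
    for i in range(n):
        u = np.array([control(x, i) for x in xs], dtype=float)
        if np.any(u < 0.0) or np.any(u > q):
            raise ValueError("control is not q-admissible")
        move = law * (1.0 - u) / 2.0  # mass sent to each neighbour
        new = law * u                 # mass that stays put
        new[1:] += move[:-1]          # steps +1
        new[:-1] += move[1:]          # steps -1
        law = new
    return law


if __name__ == "__main__":
    n, q = 200, 0.5
    lazy = controlled_walk_law(n, lambda x, i: q, q)
    simple = controlled_walk_law(n, lambda x, i: 0.0, q)
    print("P(S_n = 0), constant control u = q:", lazy[n])
    print("P(S_n = 0), simple random walk    :", simple[n])
```

Only Markov controls are covered; a general q-admissible control may depend on the whole history, which this sketch does not attempt to represent.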

Proofs
Theorem 1 (which immediately implies the upper bound in Theorem 2) is obtained by observing that any martingale has to overcome a logarithmic number of barriers in order to reach the target region, and that each such barrier can be overcome only with conditional probability (given the history) bounded away from 1. The lower bound in Theorem 2, on the other hand, will be obtained by exhibiting an explicit control.

Upper bound - Proof of Theorem 1
Throughout this subsection, δ ∈ (0, 1] is a fixed constant, and {M_i}_{i≥0} denotes a martingale adapted to a filtration F_i, satisfying the condition
We begin with an elementary lemma.
Lemma 1 Assume that M_0 = 0, that (6) holds, and that for some h ≥ 1, Let τ = min{i : |M_i| ≥ h}. Then,
Proof of Lemma 1. By (6), the process where the bound on the increments of {M_i} was used in the last inequality. It follows that E(τ ∧ ℓ) ≤ 4h^2/δ, and therefore, where (7) was used in the second inequality. On the other hand, using again that the increments of {M_i} are bounded by h, which implies that P(M_τ ≤ −h) ≤ 2/3. Combining this with (9) yields the lemma.
We have the following corollary.
Lemma 2 Let H, L > 0 and let K be a positive integer so that H^2 ≤ δKL/24. Assume (6), M_0 = 0, and that Let τ_H = min{i : M_i ≥ H}. Then,
Proof of Lemma 2. Set ℓ = L/K, h = H/K, and iterate Lemma 1 K times.
Combining Lemma 1 and Lemma 2, one obtains the following.
Lemma 3 Let L > 0 be a positive integer. Set H = 3√L and let K be a positive integer so that Kδ ≥ 216. Assume (6), M_0 = z, (10) and Then,
Proof of Lemma 3. It is enough to consider z ≥ 0. Let τ_H = min{i : M_i ≥ z + H}. Note that the condition on K ensures that H^2 = 9L ≤ δKL/24. By Lemma 2, On the other hand, by Doob's inequality and (12), on the event τ_H ≤ L, Combining the last two displays completes the proof.
We can now begin to construct the barriers alluded to above. Fix n > 0 and set V_{m,n} = [m, n] ∩ Z, R_n = [−n, n]. Write V_n = V_{0,n} and B_{j,n} = V_{(1−2^{−j})n, n}. Define the following nested subsets of R × V_n:
Let τ_0 = 0 and for i ≥ 1 set τ_i = min{t > τ_{i−1} : (M_t, t) ∈ D_i}. A direct consequence of Lemma 3 is the following.
Adjusting C if necessary, we may and will assume that N_0 > 1. Note that {|M_n| ≤ n^β} ⊂ {τ_{N_0} ≤ n} and therefore, by Lemma 4, This yields the theorem.

Proof of Theorem 2
The upper bound in (4) is a consequence of Theorem 1. We thus need only consider the lower bound in (4) and the claim (5). First note that the simple control u_i = q already achieves the lower bound with exponent σ_−(q) = 1/2. Thus, what we need to show is that for any q > 0 there is a (polynomially) better control, and that as q → 1 we can achieve an exponent close to 0. Toward this end, we use two very simple controls that are not approximations of the optimal control. See Section 3 for further comments on this point.
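The remark that the constant control u_i = q gives exponent 1/2 can be checked numerically. The sketch below is only an illustration, under the assumption that u_i is the probability of staying put and the steps +1 and −1 share the remaining mass equally. It convolves the one-step law n times and rescales P(S_n = 0) by √n. Under that assumption the step variance is 1 − q, so the local central limit theorem suggests a limit of 1/√(2π(1 − q)). The function name `lazy_return_probability` is hypothetical.

```python
import numpy as np


def lazy_return_probability(n, q):
    """P(S_n = 0) for the constant control u_i = q, started at 0.

    ASSUMPTION: each step is 0 with probability q and +1 or -1 with
    probability (1 - q) / 2 each, independently of the past.
    """
    step = np.array([(1.0 - q) / 2.0, q, (1.0 - q) / 2.0])
    law = np.array([1.0])
    for _ in range(n):
        law = np.convolve(law, step)  # law of the sum of one more step
    return law[n]                     # the centre entry is the site 0


if __name__ == "__main__":
    q = 0.5
    target = 1.0 / np.sqrt(2.0 * np.pi * (1.0 - q))
    for n in (100, 400, 1600):
        scaled = np.sqrt(n) * lazy_return_probability(n, q)
        print(n, scaled, "heuristic limit:", target)
```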
We begin with the following a priori estimate.
Proof of Lemma 5: The control we take is slow inside [−βK, βK] and fast outside, i.e. we take u_i = q for |S^u_i| ≤ βK and u_i = 0 for |S^u_i| > βK. We claim that given any q > 0, using this control with α > 0 and β > 0 small enough and K > K_0 with K_0 large enough will satisfy the conclusion of the lemma with some ε > 0.
Using reversibility we get
Now, the probability that a simple random walk will get to a distance of more than K/2 in αK^2 steps tends to 0 as α tends to 0, uniformly in K. Obviously, this applies also to our controlled random walk (which is sometimes lazy), hence by choosing α small enough we can guarantee that for any K > 0 and any y ∈ [−K/2, K/2] we have P_y(S^u_{αK^2} ∈ [−K, K]) > 1 − q. Having fixed α, we now claim that Indeed, by [10, Corollary 14.5], there exists a constant C(q) so that for any two states x and y. (The bound in [10] is valid for any random walk on an infinite graph with bounded degree and conductances bounded above and below, see [10, p. 40]. Note that while the bound is stated for discrete time chains, it can also be transferred without much effort to the continuous time setting; see e.g. [6, Theorem 2.14 and Proposition 3.13].) Plugging in t = αK^2, we get which tends to 0 when β → 0 and K → ∞ in the order prescribed in (14). Thus, by choosing β small enough and K_0 large enough we can have uniformly for all K > K_0 and we are done.
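The control used in this proof is easy to explore numerically. The following sketch assumes, as an interpretation of the model, that u_i is the probability of staying put. It propagates the exact law of the walk under the rule "u = q when |x| ≤ βK, u = 0 otherwise" and reports the probability of being in [−K, K] at time αK^2. The function names and the parameter values in the usage lines are illustrative choices, not values taken from the paper.

```python
import numpy as np


def law_after(t, start, hold, radius):
    """Law after t steps on the sites -radius..radius.

    hold[j] is the (assumed) holding probability at site j - radius.
    radius must be at least |start| + t so that no mass is lost.
    """
    law = np.zeros(2 * radius + 1)
    law[start + radius] = 1.0
    for _ in range(t):
        move = law * (1.0 - hold) / 2.0
        new = law * hold
        new[1:] += move[:-1]   # steps +1
        new[:-1] += move[1:]   # steps -1
        law = new
    return law


def slow_inside_fast_outside(K, q, alpha, beta, start=0):
    """P_start(|S_t| <= K) at t = int(alpha * K^2) for the control
    u = q on [-beta*K, beta*K] and u = 0 elsewhere."""
    t = int(alpha * K * K)
    radius = abs(start) + t
    xs = np.arange(-radius, radius + 1)
    hold = np.where(np.abs(xs) <= beta * K, q, 0.0)
    law = law_after(t, start, hold, radius)
    return law[np.abs(xs) <= K].sum()


if __name__ == "__main__":
    for alpha in (0.05, 0.25, 1.0):
        print(alpha, slow_inside_fast_outside(K=40, q=0.5, alpha=alpha,
                                              beta=0.25, start=20))
```

Starting points y with |y| ≤ K/2 correspond to the range considered in the proof; the printed values are not a substitute for the estimates there.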
In preparation for the proof of (5), we provide an auxiliary estimate.
Lemma 6 For any ε > 0 there exist A and q < 1 such that for any K there is a q-admissible control with the property that for any x ∈ [−2K, 2K] we have
Proof of Lemma 6: Let A be so that for a simple random walk on Z we have for any K, where τ_{2K} is the first hitting time of 2K.
Having chosen A, let q < 1 be large enough that for a q-lazy random walk (that is, a random walk with control u_i = q) we have for any K, where τ_{−K,K} is the first time of hitting either K or −K.
We now define the control to be fast until the walk hits 0 and slow afterwards, i.e. we take u_i = 0 for i < τ_0 := min{n : S^u_n = 0} and u_i = q for i ≥ τ_0. If the starting location S^u_0 is in [−2K, 2K], then by (15) with probability at least 1 − ε/2 we hit 0 before time AK^2. If that happens, then by (16) with probability at least 1 − ε/2, the walk stays inside [−K, K] until time AK^2.
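The "fast until 0, slow afterwards" control can be simulated exactly by tracking two pieces of mass: the part of the law that has not yet visited 0 (moving as a simple random walk) and the part that has (moving as a q-lazy walk). The sketch below does this under the assumption that u_i is the holding probability, and returns the probability that the walk started at x lies in [−K, K] at time AK^2. The name `fast_then_slow` and the parameter values in the usage lines are illustrative only.

```python
import numpy as np


def fast_then_slow(K, q, A, start):
    """P_start(|S_T| <= K) at T = int(A * K^2) for the control
    u_i = 0 before the first visit to 0 and u_i = q from then on.

    ASSUMPTION: u_i is the probability of staying put at step i.
    """
    T = int(A * K * K)
    radius = abs(start) + T
    xs = np.arange(-radius, radius + 1)
    zero = radius                           # index of the site 0
    fresh = np.zeros(2 * radius + 1)        # mass that has not hit 0 yet
    slow = np.zeros(2 * radius + 1)         # mass that has already hit 0
    if start == 0:
        slow[zero] = 1.0
    else:
        fresh[start + radius] = 1.0
    for _ in range(T):
        move = fresh / 2.0                  # fast phase: u = 0
        new_fresh = np.zeros_like(fresh)
        new_fresh[1:] += move[:-1]
        new_fresh[:-1] += move[1:]
        move = slow * (1.0 - q) / 2.0       # slow phase: u = q
        new_slow = slow * q
        new_slow[1:] += move[:-1]
        new_slow[:-1] += move[1:]
        new_slow[zero] += new_fresh[zero]   # first visit to 0: switch phase
        new_fresh[zero] = 0.0
        fresh, slow = new_fresh, new_slow
    law = fresh + slow
    return law[np.abs(xs) <= K].sum()


if __name__ == "__main__":
    K, A = 10, 8
    for q in (0.5, 0.9, 0.99):
        print(q, fast_then_slow(K, q, A, start=2 * K))
```

The printed probabilities can be compared across q for fixed A; the lemma itself concerns the regime where A is chosen first and q is then taken close to 1.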
We can now complete the proof of Theorem 2.
Proof of (5): Fix ε > 0 and choose q and A according to Lemma 6. Let L = ⌊log_4(n/A)⌋ and let T_ℓ = T − A ∑_{k=0}^{ℓ} 4^k, for ℓ = 0, ..., L. For time 0 to T_L, we have P_0(S^u_{T_L} ∈ [−2^L, 2^L]) > c for some fixed c > 0, regardless of the control.
For any ℓ = L, ..., 1, from time T_ℓ to T_{ℓ−1} we use the strategy provided by Lemma 6 for K = 2^ℓ. Then with probability at least c(1 − ε)^L ≈ n^{log_4(1−ε)} we have S_T = 0. This yields the required lower bound.
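The exponent in the last estimate comes from the identity (1 − ε)^{log_4 n} = n^{log_4(1−ε)}, so that L ≈ log_4 n successful stages cost a factor of order n^{−σ} with σ = −log_4(1 − ε). A small Python check of this arithmetic follows; the function names are illustrative, and A and c are set to 1 purely for simplicity.

```python
import math


def success_exponent(eps):
    """sigma = -log_4(1 - eps), so that (1 - eps)^(log_4 n) = n^(-sigma)."""
    return -math.log(1.0 - eps) / math.log(4.0)


def product_bound(n, eps):
    """(1 - eps)^L with L = floor(log_4 n), taking A = 1 and c = 1
    (simplifying assumptions made only for this illustration)."""
    L = 0
    while 4 ** (L + 1) <= n:   # integer arithmetic avoids rounding issues
        L += 1
    return (1.0 - eps) ** L


if __name__ == "__main__":
    n = 4 ** 8
    for eps in (0.5, 0.1, 0.01):
        sigma = success_exponent(eps)
        print(eps, sigma, product_bound(n, eps), n ** (-sigma))
```

As ε decreases to 0 the exponent σ tends to 0, which is the sense in which the exponent can be made arbitrarily small.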

Concluding remark
Motivated by the structure of the optimal control in the continuous time-and-space analogue of our control problem, see [2], one could attempt to improve on the lower bound in (4) by using a bang-bang control of the type u_i = q if (S^u_i, i) ∈ D ⊂ Z × [0, n] and u_i = 0 otherwise, where D is a domain whose boundary is determined by an appropriate (roughly parabolic) curve. The analysis of that control is somewhat tedious, and proceeding in that direction we have only been able to show the lower bound in (4) with σ_−(q) < 1/2 when q is sufficiently large. It would be interesting to check whether an analysis of the dynamic programming equation associated with the control problem, in line with its continuous time analogue in [2, 3], could yield that estimate, and more ambitiously, show the equality of σ_−(q) and σ_+(q) in (4).
One could also consider the dual problem of minimizing the probability of hitting 0 at time n, that is, in the setup of Theorem 2, of evaluating
inf_{u∈U_q} P(S^u_n = 0). (17)
One can adapt the proof of the lower bound in Theorem 2 (replacing in the sub-optimal control "fast" by "slow") to obtain a polynomial upper bound in (17) that has exponent larger than 1/2. Similarly (using the invariance principle for martingales), one shows that there is α = α(q) such that the controlled walk with |S^u_0| < 2K satisfies |S^u_{αK^2}| ≤ K with positive probability (depending only on q and independent of K), and from this a polynomial lower bound in (17) follows. We omit further details.