Stability of overshoots of zero mean random walks

We prove that for a random walk on the real line whose increments have zero mean and are either integer-valued or spread out (i.e. the distributions of the steps of the walk are eventually non-singular), the Markov chain of overshoots above a fixed level converges in total variation to its stationary distribution. We find the explicit form of this distribution heuristically and then prove its invariance using a time-reversal argument. If, in addition, the increments of the walk are in the domain of attraction of a non-one-sided $\alpha$-stable law with index $\alpha \in (1,2)$ (resp. have finite variance), we establish geometric (resp. uniform) ergodicity for the Markov chain of overshoots.


Introduction
Let $S = (S_n)_{n \ge 0}$ with $S_n = S_0 + X_1 + \ldots + X_n$ be a one-dimensional random walk with independent identically distributed (i.i.d.) increments $X_1, X_2, \ldots$ and a starting point $S_0$ that is a random variable independent of the increments. Assume that $\mathbb{E}X_1 = 0$ and $\mathbb{E}|X_1| \in (0, \infty)$ (1), implying $\limsup_{n \to \infty} S_n = -\liminf_{n \to \infty} S_n = \infty$ a.s. Define the up-crossing times of zero $T_0 := 0$, $T_n := \inf\{k > T_{n-1} : S_{k-1} < 0, S_k \ge 0\}$, $n \in \mathbb{N}$; (2) and let $O_n := S_{T_n}$, $U_n := S_{T_n - 1}$, $n \in \mathbb{N}$; (3) be the corresponding overshoots and undershoots; put $O_0 = U_0 := S_0$. The choice of zero is arbitrary and can be replaced by any fixed level. The sequence of overshoots $O = (O_n)_{n \ge 0}$ is a Markov chain. The sequence of undershoots $U = (U_n)_{n \ge 0}$ also forms a Markov chain. Both statements can be checked easily, although the latter one is less intuitive. We are mostly interested in the chain of overshoots, but our techniques also yield results for the chain of undershoots.
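To make the definitions of the up-crossing times, overshoots and undershoots concrete, the following minimal Python sketch extracts them from a finite path. The function name `upcrossings` and the sample path are illustrative and not part of the paper; the function returns only the terms with index $n \ge 1$, so the convention $O_0 = U_0 := S_0$ is left to the caller.

```python
from typing import List, Sequence, Tuple


def upcrossings(path: Sequence[float]) -> Tuple[List[int], List[float], List[float]]:
    """Return (T_n, O_n, U_n) for n >= 1 along a finite path S_0, ..., S_m.

    An up-crossing of zero happens at time k when S_{k-1} < 0 <= S_k.
    The overshoot is S_k and the undershoot is S_{k-1}.
    """
    times: List[int] = []
    overshoots: List[float] = []
    undershoots: List[float] = []
    for k in range(1, len(path)):
        if path[k - 1] < 0 <= path[k]:
            times.append(k)
            overshoots.append(path[k])
            undershoots.append(path[k - 1])
    return times, overshoots, undershoots


# Illustrative path (hypothetical data): S_0 = 1, S_1 = -1, ...
path = [1, -1, -2, 0, 2, -1, 3, 1, -2, -1]
times, overshoots, undershoots = upcrossings(path)
print(times, overshoots, undershoots)
```

In this sample path the level zero is up-crossed at times 3 and 6. The step from $S_3 = 0$ to $S_4 = 2$ is not an up-crossing because the inequality $S_{k-1} < 0$ is strict.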
Under assumption (1), consider the law $\pi_+(dx) := \frac{2}{\mathbb{E}|X_1|} \mathbb{1}_{[0,\infty)}(x)\, \mathbb{P}(X_1 > x)\, \lambda(dx)$, $x \in \mathcal{Z}$, where $\mathcal{Z}$ is the state space of the walk $S$, defined as the minimal closed (in the topological sense) subgroup of $(\mathbb{R}, +)$ containing the topological support of the distribution of $X_1$, and $\lambda$ is the Haar measure on $(\mathcal{Z}, +)$ normalized such that $\lambda([0, x) \cap \mathcal{Z}) = x$ for positive $x \in \mathcal{Z}$.
We will prove that the distribution $\pi_+$ is invariant for the Markov chain of overshoots $O$ (Theorem 1). Our proof is based on a time reversal of the path of $S$ between the up-crossings of the level zero. Since this proof gives no insight into the form of $\pi_+$, in Section 2.2 we present a heuristic argument which we used to find this invariant distribution. The invariance of $\pi_+$ is also established in our companion paper [13, Theorem 4] in a more general setting using entirely different methods based on infinite ergodic theory; the proof presented here precedes the one in [13]. By [13, Theorem 5], the assumption in (1) implies that the law $\pi_+$ is the unique (up to a multiplicative constant) locally finite Borel invariant measure of the chain of overshoots $O$ on $\mathcal{Z}$. Moreover, we will see in Section 2.1 that assumption (1) is the weakest possible ensuring that $O$ has an invariant distribution (i.e. probability measure).
The main goal of this paper is to study convergence of the Markov chain of overshoots O to its unique invariant law π + . Our aim is to identify the conditions on the law of the increments of S under which the total variation distance between the law of O n and π + converges to zero as n → ∞ (Theorem 2) and study its rate of decay (Theorem 3). Since the chain O is in general neither weak Feller ([13, Remark 8]) nor ψ-irreducible (see Section 5), the total variation convergence requires additional smoothness assumptions on the distribution of increments of S. In particular, Theorem 2 holds if the distribution of X 1 is either arithmetic or spread out, which means respectively that either X 1 is supported on dZ for some d > 0 or the distribution of S k is non-singular for some k ≥ 1. The geometric rate of convergence in Theorem 3 is established under a further assumption that the law of X 1 is in the domain of attraction of a non-one-sided α-stable law with index α ∈ (1, 2). For increments with finite variance we get a stronger version with the geometric rate of convergence uniformly in the starting point of O. Section 5 concludes the paper by offering a conjecture about the weak convergence of the Markov chain O to π + without additional assumptions on the law of X 1 other than (1).
Our interest in the Markov chains of overshoots of random walks stems from their close connection to the local time of the random walk at level zero (see Perkins [15]) and the fact that they appear in the study of the asymptotics of the probability that the integrated random walk (S 1 + . . . + S k ) 1≤k≤n stays positive (see Vysotsky [23,24]). A detailed discussion with applications and further connections to a special class of Markov chains, called oscillating random walks, is available in [13, Section 1.2].
2. Stationary distribution of overshoots

2.1. Setting. Consider the random walk $S = (S_n)_{n \ge 0}$ from Section 1, and define its version $S' = (S'_n)_{n \ge 0}$ with $S'_n := S_n - S_0$, which always starts at zero. We assume that $S$, as well as all the other random elements considered in this paper, is defined on a generic measurable space equipped with a variety of measures: a probability measure $\mathbb{P}$; the family of probability measures $\{\mathbb{P}_x\}_{x \in \mathbb{R}}$ given by $\mathbb{P}_x(S \in \cdot) = \mathbb{P}(x + S' \in \cdot)$ (satisfying $\mathbb{P}_x(S_0 = x) = 1$); and the measures of the form $\mathbb{P}_\mu(\cdot) := \int_{\mathbb{R}} \mathbb{P}_x(\cdot)\, \mu(dx)$, where $\mu$ is a Borel measure on $\mathbb{R}$. We do not necessarily assume that $\mu$ is a probability measure, but we prefer to (ab)use the probabilistic notation $\mathbb{P}_\mu$ and the terms "law", "expectation", "random variable", etc., by which we actually mean the corresponding notions of general measure theory. Under the measure $\mathbb{P}_\mu$, the starting point $S_0$ of the random walk $S$ follows the "law" $\mu$. Denote by $\mathbb{E}$ and $\mathbb{E}_x$ the respective expectations under $\mathbb{P}$ and $\mathbb{P}_x$. All the measures on topological spaces considered in the paper are Borel, that is, defined on the corresponding Borel σ-algebras.
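Recall that the state space $\mathcal{Z}$ of the random walk $S$ was defined as the minimal closed subgroup of $(\mathbb{R}, +)$ containing the support of the distribution of $X_1$. Let us give a different representation for $\mathcal{Z}$, assuming throughout that $X_1$ is not degenerate. For any $h \in [0, \infty)$, let $\mathcal{Z}_h$ be the real line $\mathbb{R}$ if $h = 0$ and the integer lattice $\mathbb{Z}$ multiplied by $h$ if $h > 0$: $\mathcal{Z}_0 := \mathbb{R}$ and $\mathcal{Z}_h := h\mathbb{Z}$ for $h > 0$. We equip $\mathcal{Z}_h$ with the discrete (resp. Euclidean) topology if $h > 0$ (resp. $h = 0$). Note that any non-trivial closed (in the topological sense) subgroup of $(\mathbb{R}, +)$ is of the form $(\mathcal{Z}_h, +)$ for some $h \in [0, \infty)$. Define the span of the distribution of increments of $S$ by $d := \sup\{h \in [0, \infty) : \mathbb{P}(X_1 \in \mathcal{Z}_h) = 1\}$ and note that $d \in [0, \infty)$ and $\mathcal{Z} = \mathcal{Z}_d$. We always assume that the random walk starts in $\mathcal{Z}_d$, hence $\mathbb{P}(S_0, S_1, \ldots \in \mathcal{Z}_d) = 1$. The distribution of increments of $S$ is called arithmetic (with span $d$) if $d > 0$ and is called non-arithmetic if $d = 0$. We shall often use $d > 0$ and $d = 0$ as synonyms for arithmetic and non-arithmetic, respectively. Define the measure $\lambda_d$ on $\mathcal{Z}_d$ as follows: for any $B \in \mathcal{B}(\mathcal{Z}_d)$, put $\lambda_d(B) := \lambda_0(B)$ if $d = 0$ and $\lambda_d(B) := d \cdot \#B$ if $d > 0$, where $\lambda_0$ denotes the Lebesgue measure on $\mathbb{R}$ and $\#$ denotes the number of elements in a set. Then $\lambda_d$ is the normalized Haar measure on the additive group $\mathcal{Z}_d = \mathcal{Z}$, as defined in the Introduction. Put $\mathcal{Z}_d^+ := \mathcal{Z}_d \cap [0, \infty)$ and $\mathcal{Z}_d^- := \mathcal{Z}_d \cap (-\infty, 0)$. Define the measures $\lambda_d^{\pm}(dx) := \mathbb{1}_{\mathcal{Z}_d^{\pm}}(x)\, \lambda_d(dx)$ and $\pi_+(dx) := c_1 \mathbb{P}(X_1 > x)\, \lambda_d^+(dx)$, $\pi_-(dx) := c_1 \mathbb{P}(X_1 \le x)\, \lambda_d^-(dx)$, (5) where $c_1 := 1$ if $\mathbb{E}|X_1| = \infty$ and $c_1 := 2/\mathbb{E}|X_1|$ otherwise. This extends the definition of $\pi_+$ given in the Introduction under assumption (1).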
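For an integer-valued increment with finite support, the span and the normalized Haar measure can be computed directly. The sketch below assumes that the span of such a distribution equals the greatest common divisor of its support points (the generator of the smallest subgroup of the integers containing the support) and that $\lambda_d$ puts weight $d$ on every point of the lattice, in line with the normalization $\lambda([0, x) \cap \mathcal{Z}) = x$ from the Introduction. The helper names are hypothetical.

```python
from functools import reduce
from math import gcd
from typing import Iterable


def span_of_integer_steps(support: Iterable[int]) -> int:
    """Span d of an integer-valued step: the gcd of the support points.

    Assumption of this sketch: the support is finite and contains a
    non-zero point, so that the result is a positive integer.
    """
    return reduce(gcd, (abs(s) for s in support), 0)


def lattice_measure(points: Iterable[int], d: int) -> int:
    """lambda_d(B) = d * #B for a finite subset B of the lattice d * Z."""
    return d * len(set(points))


d = span_of_integer_steps([-2, 4, 6])   # steps supported on {-2, 4, 6}
print(d)                                # the walk lives on the lattice 2 * Z
print(lattice_measure([0, 2, 4], d))    # lambda_d([0, 6) ∩ 2Z)
```

With steps supported on $\{-2, 4, 6\}$ the span is 2, and the set $[0, 6) \cap 2\mathbb{Z} = \{0, 2, 4\}$ receives measure 6, as the normalization requires.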
The classic trichotomy states that the (non-degenerate) random walk $S$ either drifts to $\infty$, drifts to $-\infty$, or oscillates. By definition, the latter possibility means that $\limsup_{n \to \infty} S_n = \infty$ a.s. and $\liminf_{n \to \infty} S_n = -\infty$ a.s.; see Feller [6, Section XII.2]. By the above trichotomy, oscillation is the weakest possible assumption under which the Markov chains of overshoots and undershoots of a level can be considered. It is known that oscillation of $S$ is equivalent to divergence of both series $\sum_{n=1}^{\infty} n^{-1} \mathbb{P}_0(S_n > 0)$ and $\sum_{n=1}^{\infty} n^{-1} \mathbb{P}_0(S_n < 0)$; cf. Asmussen [1, Theorems VIII.2.3 and 2.4]. Clearly, oscillation is necessary and sufficient for $S$ to cross the zero level infinitely often a.s. In particular, oscillation holds when the random walk $S$ is topologically recurrent on $\mathcal{Z}_d$, which means that $\mathbb{P}_0(S_n \in G \text{ i.o.}) = 1$ for every open neighbourhood $G \subset \mathcal{Z}_d$ of 0. This is because such random walks satisfy $\mathbb{P}_0(S_n \in G \text{ i.o.}) = 1$ for every non-empty open set $G \subset \mathcal{Z}_d$; see Guivarc'h et al. [7, Theorem 24]. The converse is not true since there are symmetric non-degenerate random walks (which always oscillate) that are not recurrent. This is readily seen from the following criterion by Spitzer [20, Section 8] and Ornstein [14]: topological recurrence of $S$ is equivalent to divergence of the integral $\int_{-a}^{a} \operatorname{Re}\big((1 - \mathbb{E}e^{itX_1})^{-1}\big)\, dt$ for all $a > 0$. In particular, this integral diverges under condition (1), so every random walk satisfying (1) is recurrent and hence oscillates. Note in passing that the distribution of increments of a topologically recurrent random walk may have arbitrarily heavy tails; see Shepp [19].
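Similarly to (2) and (3), define the down-crossing times of the level zero as $T_0^{\downarrow} := 0$, $T_n^{\downarrow} := \inf\{k > T_{n-1}^{\downarrow} : S_{k-1} \ge 0, S_k < 0\}$, $n \in \mathbb{N}$. The corresponding overshoots and undershoots at the down-crossings are $O_n^{\downarrow} := S_{T_n^{\downarrow}}$, $U_n^{\downarrow} := S_{T_n^{\downarrow} - 1}$, $n \in \mathbb{N}$, (6) and $O_0^{\downarrow} = U_0^{\downarrow} := S_0$. The random sequences in (3) and (6) are defined on the event that all crossing times are finite. Since $S$ oscillates, this event occurs almost surely under $\mathbb{P}$ and under $\mathbb{P}_\mu$ with an arbitrary measure $\mu$ on $\mathcal{Z}_d$.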
The Markov chains of overshoots at up-crossings O = (O n ) n≥0 and at down-crossings Note that there is asymmetry at zero. Namely, since −Z + d = Z − d , the downcrossing times T ↓ n (resp. positions O ↓ n and U ↓ n ) need not be equal to the up-crossing times T n (resp. positions −O n and −U n ) for the dual random walk (−S n ) n≥0 . Our consideration mostly concerns O, which for brevity will be called the chain of overshoots if there is no risk of confusion with O ↓ . Theorem 1. Let S be any random walk that oscillates. Then the measure π + is invariant for the Markov chains O and (−U n − d) n≥0 of overshoots and shifted sign-changed undershoots at up-crossings of the zero level, i.e. P π + (O n ∈ ·) = π + and P π + (−U n − d ∈ ·) = π + for all n ∈ N. Similarly, π − is an invariant measure for the chains O ↓ and (−U ↓ n − d) n≥0 . Remark 1. We will show in Section 2.3.1 below that the laws of overshoots and undershoots of the zero level at consecutive down-and up-crossings are related as follows: We will prove these results using an argument based on a time reversal of the path of S between the up-crossings of the level zero. Since this proof gives no insight about the form of π + , we will also present a heuristic argument which we used to find this invariant distribution. After these results were obtained, we found an entirely different proof of Theorem 1, which is based on the methods of infinite ergodic theory and applies in a more general setting; see our companion paper [13,Theorem 4].
The assumption that the random walk S oscillates is the weakest possible to consider the Markov chains of overshoots and undershoots. By [13,Theorem 5], the chains of overshoots and undershoots of such random walks possess no other locally finite invariant Borel measures (up to a multiplicative constant), including the ones singular with respect to π + and π − . Therefore, the probabilistic question of convergence of these chains to stationarity can be posed only if the measures π + and π − in Theorem 1 have total mass one. This need not be the case in general since every non-degenerate symmetric random walk oscillates. However, by (5), both measures π + and π − have finite mass if and only if E|X 1 | ∈ (0, ∞), in which case the oscillation assumption forces EX 1 = 0 and the equalities π + (Z d ) = π − (Z d ) = 1 follow.
Thus, condition (1) is the weakest assumption under which convergence to stationarity of the chains of overshoots and undershoots can be stated.
Probability measures of the same form as π + and π − appear as limit distributions for the following stochastic processes closely related to random walks. Assume that (1) holds. First, π + is the unique stationary distribution of the reflected random walk driven by an i.i.d. sequence with the common non-arithmetic distribution P(X 1 ∈ ·|X 1 > 0); see Feller [6,Section VI.11] and Knight [11]. Second, 1 2 π + + 1 2 π − is an invariant distribution of a Markov chain, which belongs to a special type of the so-called oscillating random walks, whose increments are distributed as P(X 1 ∈ ·|X 1 < 0) for all starting points in Z + d and as P(X 1 ∈ ·|X 1 > 0) for all starting points in Z − d ; see Borovkov [4] and cf. Vysotsky [25]. Third, π + is known as the limit distribution, as well as the stationary distribution, for the non-negative residual lifetime in a renewal process with inter-arrival times distributed according to P(X 1 ∈ ·|X 1 > 0); see Asmussen [1,Section V.3.3] The r.h.s.'s of (7) is referred to as the distribution of the overshoot of the walk S under an "infinitely remote" level at −∞. This distribution equals π − defined for H − 1 instead of X 1 . Similarly, π + corresponds to the non-strict overshoot of S above an "infinitely remote" level at ∞, which is distributed as the strict overshoot at this level decreased by d.
Notice that the invariant distributions $\pi_+$ and $\pi_-$ in Theorem 1 are defined only in terms of the tails of the distribution of increments of the random walk $S$. On the other hand, it is natural to expect that $\pi_+$ and $\pi_-$ are closely related to the distributions of the overshoots above infinitely remote levels, and in fact, we clarify this in [13, Proposition 1] by giving different representations of $\pi_+$ and $\pi_-$. Finally, we note that since the limit distributions of overshoots above infinite levels exist only for zero-mean random walks with finite variance, prior to finding $\pi_+$ and $\pi_-$ and proving Theorem 1 it was not clear why the chain of overshoots should have a stationary distribution if the walk has infinite variance.

2.2. Derivation of $\pi_+$. Let us present a simple probabilistic argument that we used to guess the shape of $\pi_+$. Assume that $\mathbb{E}X_1 = 0$, the variance of increments $\sigma^2 = \mathbb{E}X_1^2$ is finite and positive, and the random walk $S$ is integer-valued and aperiodic, i.e. the distribution of $X_1 - a$ is arithmetic with span 1 for every $a \in \mathbb{Z}$. In this case $\mathcal{Z}_d^+ = \mathbb{N}_0$, where $\mathbb{N}_0 := \mathbb{N} \cup \{0\}$. Consider the number of up-crossings of the zero level by time $n$: Assume that the chain $O$ has an ergodic stationary distribution $\mu$. Then by the ergodic theorem, for any $x, y \in \{z \in \mathbb{N}_0 : \mathbb{P}(X_1 > z) > 0\}$, On the other hand, By the local central limit theorem, there exists a constant $c > 0$ such that for every integer Hence from (8) and the dominated convergence theorem, we obtain Thus, $\mu = \pi_+$ in the special case considered above. Therefore it is plausible that the distribution $\pi_+$ is stationary for the chain of overshoots $O$ for general random walks, but of course this needs to be proved directly.
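In the integer-valued setting of this heuristic, the candidate stationary law can be tabulated exactly. The sketch below assumes that $\pi_+$ puts mass proportional to $\mathbb{P}(X_1 > x)$ on each $x \in \mathbb{N}_0$, which is what the identity $\mathbb{P}(X_1 - d \ge z)\lambda_d^+(dz) = \pi_+(dz)$ of Section 2.3 gives for $d = 1$, and it normalizes the result to total mass one. The function name `pi_plus` and the two step distributions are illustrative choices, not taken from the paper.

```python
from fractions import Fraction
from typing import Dict


def pi_plus(step_pmf: Dict[int, Fraction]) -> Dict[int, Fraction]:
    """Tail law on {0, 1, ...}: mass at x proportional to P(X_1 > x).

    Assumptions of this sketch: integer-valued steps with finite support,
    zero mean and at least one positive value.
    """
    max_step = max(step_pmf)
    # P(X_1 > x) vanishes for x >= max_step, so only x < max_step carry mass.
    tails = {
        x: sum(p for s, p in step_pmf.items() if s > x)
        for x in range(max_step)
    }
    total = sum(tails.values())  # equals E[max(X_1, 0)] for integer steps
    return {x: t / total for x, t in tails.items()}


# Symmetric steps on {-2, -1, 1, 2}, each with probability 1/4.
symmetric = {s: Fraction(1, 4) for s in (-2, -1, 1, 2)}
# Zero-mean steps: -1 with probability 2/3 and +2 with probability 1/3.
skewed = {-1: Fraction(2, 3), 2: Fraction(1, 3)}

print(pi_plus(symmetric))
print(pi_plus(skewed))
```

For the symmetric example the tail probabilities are $\mathbb{P}(X_1 > 0) = 1/2$ and $\mathbb{P}(X_1 > 1) = 1/4$, giving masses $2/3$ and $1/3$ at 0 and 1. For the second example both tails equal $1/3$, so the law is uniform on $\{0, 1\}$.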

2.3. Proof of Theorem 1. The main result of the section, Proposition 2 below, reveals a distributional symmetry hidden in the trajectory of an arbitrary oscillating random walk, which is key for the proof of Theorem 1.
Define new Markov transition kernels P and Q on Z d as follows: with the convention that Q(x, dy) := δ 0 (dy) in the case when P(X 1 − d ≥ x) = 0; the choice of the delta measure is arbitrary and will not be relevant for what follows. The kernel P is defined in terms of the sign-changed first undershoot U 1 , given in (3) above, which is shifted by d to ensure that −U 1 − d may take value zero in the arithmetic case. The kernel Q corresponds to up-crossings of the zero level by the walk S. Clearly, for every x ∈ Z d , the transition probabilities P (x, dy) and Q(x, dy) are supported on Z + d . The transition kernels of the Markov chains of overshoots (O n ) n≥0 and shifted signchanged undershoots (−U n − d) n≥0 equal P Q and QP , respectively. More precisely, for any probability measure µ on Z d and any n ∈ N, Here for any transition kernel T on Z d , by µT we denoted the measure on Z d given by µT (dy) := Z d T (z, dy)µ(dz), and put T 0 (x, dy) = δ x (dy).
In the arithmetic case, we clearly have the equality λ d (dx)P( we will also prove this identity for d = 0. Combined with the equality of measures P(X 1 − d ≥ z)λ + d (dz) = π + (dz) on Z d , this implies that the transition kernel Q is reversible with respect to π + . Put differently, the detailed balance condition . Surprisingly, the kernel P shares the same property. Put together, we have the following statement, which we will prove in full below in Section 2.3.1.

Proposition 1. For any random walk $S$ that oscillates, the kernels $P$ and $Q$ are reversible with respect to $\pi_+$.
A direct corollary of this proposition is the invariance of the measure $\pi_+$ for the Markov chains $(O_n)_{n \ge 0}$ and $(-U_n - d)_{n \ge 0}$ asserted by Theorem 1. A similar argument yields the invariance of $\pi_-$ for the chains $O^{\downarrow}$ and $(-U_n^{\downarrow} - d)_{n \ge 0}$ (see (13) from Section 2.3.1 below and a kernel decomposition for these chains analogous to (10)). Thus Theorem 1 follows from Proposition 1, which in turn is a direct corollary of Proposition 2 (see Section 2.3.1).
2.3.1. The time reversal argument. We now present a result concerning the entire trajectory of the random walk between up-crossings of the level zero. Our proof is based on a generalisation of the argument from Vysotsky [24, Lemma 1]. It may be regarded as an illustration of the conclusion of Remark 5 in [13, Section 5.2] on general state-space Markov chains.
Proposition 2. For any random walk S that oscillates, for any m ∈ N we have The choice of the value 0 in the random sequences in (11) and (12) is arbitrary and could be substituted by any constant. However, we stress that the equalities in (11) and (12) cease to hold if this constant value is substituted by the remaining part of the path of S. Note that (11) can be stated more elegantly as Remark 2. Similarly, we have and (14) We first prove two simple corollaries of Proposition 2.
Proof of Proposition 1. Reversibility of the $P$-kernel follows immediately from (11) with $m = 1$ since $U_1 = S_{T_1 - 1}$.
As explained above, reversibility of the $Q$-kernel follows from the equalities of measures The latter equality is trivial. The former one is equivalent to as follows from substituting $y$ by $y - d$ using the invariance of $\lambda_d$ under shifts in $\mathcal{Z}_d$ and substituting $x$ by $-x$ using the central symmetry of $\lambda_d$; cf. (23) below for the meaning of (15). It suffices to check the equality of measures (15) only for rectangular sets with Borel sides $A, B \subset \mathcal{Z}_d$. By Fubini's theorem and the mentioned shift invariance of $\lambda_d$, where the last equality follows from the first two. This is exactly (15).
Recall that Remark 1 asserts that Proof of Remark 1. Fix m = 1. By (11), the random variables We now prove the main statement of the section.
Proof of Proposition 2. Consider equality (11) in the case m = 1. Pick an arbitrary k ∈ N and define the time-reversal mapping R k : Introduce the random vector K := (S 0 , . . . , S k ) and note that (11) follows if we establish the equality of measures on (Z d ) k+1 : PutZ the set of sequences of length k + 1 that start fromZ + d , down-cross the level zero exactly once, and in the non-arithmetic case have no zeroes.
Note that R k is an invertible mapping on R k+1 , and it is an involution. Further, The second equality is trivial in the arithmetic case. In the non-arithmetic case, it is due to the fact that K has density with respect to the Lebesgue measure on R k+1 , which in turn holds true since in this case the measure π + has density with respect to the Lebesgue measure on R.
By (17), if suffices to check equality (16) on rectangles of the form B 0 × B × B k with Borel sides B 0 ⊂Z + d , B k ⊂ Z − d and B ⊂ C k . Using the definition of π + and the fact that X k+1 is independent with K under P x 0 for every x 0 ∈ Z d , we obtain where Then we use equality (18) where in the last equality we used the change of variables formula, the fact that R k is an involution, and the equality ( Let us simplify the integrand under the last integral in (19). We have The well-known duality principle for random walks states that the random vectors (S 1 , . . . , S k ) and (S k − S k−1 , . . . , S k − S 1 , S k ) have the same law under P 0 . By a conditional version of this distributional identity, for every x 0 ∈ Z d and P x 0 (S k ∈ ·)-a.e. x k ∈ Z d , By the definition of f B , this gives Thus, using in the non-arithmetic case the fact that a distribution function can have at most countably many jumps, we get Hence by (18), (19), and (21) combined with (17), equality (16) will follow once we show the following equality of measures on Z + d × Z − d : By translation invariance of λ d under shifts in Z d , and thus the claim (22) reduces to which means that the random walk −S is dual to S with respect to λ d . To prove this property, note that by the shift invariance of λ d under shifts in Z d , the equality (23) of measures on . This is exactly (15) with X 1 replaced by S ′ k .
Thus, (11) is proved for m = 1. The general case m ∈ N follows analogously, with the only difference that the set C ′ k shall account for 2m − 1 crossings of the level zero. Consider now (12). We need to prove that the law of (S 0 , S 1 , . . . , S T ↓ m −1 ) under P π + equals the law of (− Similarly to the proof of (11), by the duality principle for random walks this reduces to the equality Use the definitions of π + , π − , and R 1 to write this as This equality holds by (22) and the fact that P(

Convergence to the stationary distribution
For the rest of the paper we assume (1) and investigate convergence in total variation of the law of O n to the probability distribution π + as n → ∞.
In the non-arithmetic case convergence in the total variation norm requires additional assumptions on the law of the increments of $S$. We say that the distribution of the increment $X_1$ is spread out if $\mathbb{P}_0(S_k \in \cdot)$ is non-singular with respect to the Lebesgue measure for some $k \ge 1$. It is clear that this assumption is necessary for the total variation convergence to $\pi_+$ of the law of the chain $O_n$ started from a point. In fact, if this assumption is violated in the non-arithmetic case, then $\|\mathbb{P}_x(O_n \in \cdot) - \pi_+(\cdot)\|_{\mathrm{TV}} = 1$ for every $x \in \mathbb{R}$ and $n \ge 1$, since $\pi_+$ has a density while the law of $O_n = S_{T_n}$ is then singular. In this section we will show that the spread out assumption is actually sufficient for the total variation convergence. Let us mention that spread out distributions often arise in the context of renewal theory; see Asmussen [1, Section VII].
Theorem 2. Assume (1) and that the distribution of $X_1$ is either arithmetic or spread out. Then $\lim_{n \to \infty} \|\mathbb{P}_x(O_n \in \cdot) - \pi_+(\cdot)\|_{\mathrm{TV}} = 0$ for all $x \in \mathcal{Z}_d$.
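In the arithmetic case the distance in Theorem 2 is a finite or countable sum and can be evaluated exactly. The sketch below uses the convention that the total variation distance between two probability mass functions is half of their $\ell^1$ distance, which is consistent with the relation between the $f$-norm with $f \equiv 1$ and the total variation norm used in the section on the rate of convergence. The function name and the example laws are hypothetical.

```python
from fractions import Fraction
from typing import Dict


def tv_distance(p: Dict[int, Fraction], q: Dict[int, Fraction]) -> Fraction:
    """Total variation distance sup_A |p(A) - q(A)| between two pmfs.

    For probability mass functions this equals half of the l1 distance.
    """
    support = set(p) | set(q)
    return sum(abs(p.get(x, 0) - q.get(x, 0)) for x in support) / 2


# Law of O_0 when the chain starts at x = 0 (a point mass at zero).
start_at_zero = {0: Fraction(1)}
# Hypothetical target law on {0, 1}, e.g. a tail law with masses 2/3 and 1/3.
target = {0: Fraction(2, 3), 1: Fraction(1, 3)}

print(tv_distance(start_at_zero, target))
# Laws with disjoint supports are at the maximal distance one.
print(tv_distance({5: Fraction(1)}, target))
```

The second call illustrates the obstruction discussed before the theorem: two mutually singular laws are always at total variation distance one, no matter how close their supports are.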
A standard application of the dominated convergence theorem yields another proof of the fact (given in full generality by [13,Theorem 5]) that, under the assumptions of Theorem 2, π + is the unique stationary distribution of the chain (O n ) n≥0 in the class of all probability laws on Z d , including the ones singular with respect to π + .
The convergence in Theorem 2 may fail for every starting point x ∈ Z 0 in the case of general non-arithmetic distributions of increments, e.g. for discrete non-arithmetic distributions, but π + remains the unique stationary distribution of O by [13,Theorems 4 and 5]. Therefore one may argue that the total variation metric is too fine for the study of convergence of the chain of overshoots for general zero mean random walks. It is feasible that the convergence holds in other metrics under less restrictive assumptions than those in Theorem 2 but we did not succeed in proving results of such type; see the discussion in Section 5 below.
It is well known that under the spread out assumption on the increments of a random walk, a successful coupling of the walks started at arbitrary distinct points $x, y \in \mathcal{Z}_0$ can be defined, implying in particular $\lim_{n \to \infty} \|\mathbb{P}_x(S_n \in \cdot) - \mathbb{P}_y(S_n \in \cdot)\|_{\mathrm{TV}} = 0$; see e.g. Theorem 6.1 of Chapter 3 in Thorisson [22]. However, this coupling yields only a shift-coupling [22, Section 3.1] of the chains of overshoots started at $x$ and $y$. Thus only the Cesàro total variation convergence [22, Section 3.2] of $O$ can be deduced from these results, which is weaker than the convergence stated in Theorem 2. Our proof of Theorem 2 rests on the crucial property of the Markov chain $(O_n)_{n \ge 0}$ stated below in Proposition 3, implying that a successful coupling of the chains of overshoots started at any distinct levels can be constructed for any span $d \in [0, \infty)$. We do not exhibit the coupling construction in this paper but instead apply Theorem 4 in Roberts and Rosenthal [17], which is established using this coupling.
For any measure $\mu$ on $\mathcal{Z}_d$, denote by $\mu^a$ and $\mu^s$ its absolutely continuous and singular components with respect to $\lambda_d$, respectively. We will slightly abuse this notation for distributions of random variables and write, say, $\mathbb{P}^a_x(O_1 \in \cdot)$ instead of $(\mathbb{P}_x(O_1 \in \cdot))^a$. We reserve the term "density" to mean the density with respect to the Lebesgue measure $\lambda_0$ without referring to the measure. The set $\mathcal{X}_+ := [0, M_+) \cap \mathcal{Z}_d$, where $M_+ := \sup(\operatorname{supp}(X_1))$, is the actual state space of the Markov chain of overshoots: for any $x \in \mathcal{Z}_d$ and $n \in \mathbb{N}$ we have $\mathbb{P}_x(O_n \in \mathcal{X}_+) = 1$. Moreover, the equality $\pi_+(\mathcal{X}_+) = 1$ holds true.

Proposition 3. Assume (1) and that the distribution of $X_1$ is either arithmetic or spread out. Then the measures $\mathbb{P}^a_x(O_1 \in \cdot)$ and $\pi_+(\cdot)$ are equivalent for any $x \in \mathcal{Z}_d$. Put differently, for any $x \in \mathcal{Z}_d$ there exists a version of the density $\frac{d}{d\lambda_d} \mathbb{P}^a_x(O_1 \in dy)$ that is strictly positive for all $y \in \mathcal{X}_+$.
Proof of Theorem 2. Proposition 3 implies that with positive probability, the chain of overshoots visits in a single step any Borel set $A \subseteq \mathcal{Z}_d$ satisfying $\pi_+(A) > 0$. This means that the Markov chain $(O_n)_{n \ge 0}$ is $\pi_+$-irreducible and aperiodic in the sense of Meyn and Tweedie [12, Sections 4.2 and 5.4]. By Theorem 1 above, $(O_n)_{n \ge 0}$ has a stationary distribution $\pi_+$. Then Theorem 4 in Roberts and Rosenthal [17], which applies to ψ-irreducible aperiodic Markov chains with a stationary distribution on a general state space with a countably generated σ-algebra, implies the total variation convergence in Theorem 2 for $\pi_+$-a.e. $x \in \mathcal{Z}_d$.
Since $\mathbb{P}_x(O_1 \in \mathcal{X}_+) = 1$ for every $x \in \mathcal{Z}_d$, we will conclude the proof of Theorem 2 if we show that the non-convergence set $N := \{x \in \mathcal{X}_+ : \limsup_{n \to \infty} \|\mathbb{P}_x(O_n \in \cdot) - \pi_+(\cdot)\|_{\mathrm{TV}} > 0\}$ is empty. In the arithmetic case ($d > 0$) this is clear by the fact that every point of $\mathcal{X}_+$ has positive $\pi_+$-measure and $\pi_+(N) = 0$. In the non-arithmetic case ($d = 0$) first note that since the Borel σ-algebra on $\mathcal{X}_+$ is countably generated, the function $x \mapsto \|\mathbb{P}_x(O_n \in \cdot) - \pi_+(\cdot)\|_{\mathrm{TV}}$ is measurable for every $n \in \mathbb{N}$ by Roberts and Rosenthal [16, Appendix], making the set $N$ measurable. Thus the claim will follow by a standard application of the strong Markov property and the dominated convergence theorem if we show that the chain $(O_n)_{n \ge 0}$ hits the convergence set $\mathcal{X}_+ \setminus N$ with probability one when started in $N$. Put differently, we need to prove that $\mathbb{P}_x(O_n \in N, \forall n \in \mathbb{N}) = 0$ for every $x \in N$.
Since π + (N) = λ 0 (N) = 0 we have P x (S m ∈ N) = P s x (S m ∈ N) for all m ∈ N. Hence, where in the second inequality we used the identity O n = S Tn and the fact that T n ≥ 2n for x ≥ 0, cf. (2) and (3). By the definition of spread out distributions, we have P s x (S k ∈ R) = P s (S ′ k ∈ R) < 1 for some k ≥ 1. Then, using that the convolution of an absolutely continuous measure with any other measure is absolutely continuous, we get for any integer m ≥ 1, where ⌊c⌋ denotes the largest non-negative integer smaller or equal to a c ≥ 0. Hence the sequence P s x (S m ∈ R), which equals P s (S ′ m ∈ R), decays exponentially fast to zero as m → ∞, and it follows that the last bound in (24) is zero.
Proof of Proposition 3. Pick any $x \in \mathcal{Z}_d$ and denote by $y$ an arbitrary element of $\mathcal{X}_+$. Consider two cases.
Arithmetic distributions. We need to prove that P x (O 1 = y) > 0. Since y < M + , there exists a z ∈ Z + d such that z > y and P(X 1 = z) > 0. Further, according to the definition of Z d , there exists an integer k ≥ 1 such that P x (S k = y − z) > 0; see, e.g., Spitzer [20, Propositions 2.1 and 2.5]. Then and it remains to show that the first factor in the r.h.s. is positive. Denote by Sym(k) the symmetric group on the set {1, . . . , k}. For any permutation σ ∈ Sym(k), define a new random walk S(σ) = (S n (σ)) n≥0 by S n (σ) := S 0 +X σ(1) +. . .+X σ(n) for 1 ≤ n ≤ k and S n (σ) := S n for n ≥ k. Denote by T 1 (σ) the first up-crossing time of the level zero by S(σ) (cf. (2)), and let ξ be the number of negative terms among X 1 , . . . , X k .
Spread out distributions. We say that measures µ and ν on Z 0 = R satisfy for any Borel set B ⊂ I. Note that µ(du) ≥ cλ 0 (du) on I implies µ a (du) ≥ cλ 0 (du) on I. In this case there exists a version of the density of µ a which is bounded from below on I by the positive constant c.

Rate of convergence to the stationary distribution
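In this section we present results on the rate of convergence in Theorem 2. We will use the following norm: for any function $f : \mathcal{Z}_d \to [1, \infty)$ and any signed measure $\mu$ on $\mathcal{Z}_d$, put $\|\mu\|_f := \sup_{g : |g| \le f} \big|\int_{\mathcal{Z}_d} g(x)\, \mu(dx)\big|$. In particular, for $f \equiv 1$ the following relationship with the total variation norm holds: $\|\mu\|_f = 2 \|\mu\|_{\mathrm{TV}}$. Clearly, convergence in any $f$-norm is stronger than the total variation convergence. We will only need the $V_\gamma$-norms, where $V_\gamma(x) := 1 + x^\gamma$ with $\gamma \ge 0$.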
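For a signed measure on a lattice, the $f$-norm reduces to a weighted sum, which makes the relation with the total variation norm easy to check numerically. The sketch below assumes the standard definition of the $f$-norm as the supremum of $|\int g\, d\mu|$ over measurable $g$ with $|g| \le f$; on a countable set this supremum is attained at $g = f \cdot \operatorname{sign}(\mu)$ and equals $\sum_x f(x)|\mu(\{x\})|$. The function names and the example measure are hypothetical.

```python
from fractions import Fraction
from typing import Callable, Dict


def f_norm(mu: Dict[int, Fraction], f: Callable[[int], int]) -> Fraction:
    """f-norm of a signed measure on a countable set: sum of f(x) * |mu({x})|."""
    return sum(f(x) * abs(mass) for x, mass in mu.items())


def v_gamma(gamma: int) -> Callable[[int], int]:
    """The weight V_gamma(x) = 1 + x**gamma from the text (integer gamma here)."""
    return lambda x: 1 + x ** gamma


# Difference of two probability laws on {0, 1}: a point mass at 0 minus
# a hypothetical law with masses 2/3 and 1/3.
mu = {0: Fraction(1, 3), 1: Fraction(-1, 3)}

norm_one = f_norm(mu, lambda x: 1)   # f identically one
norm_v1 = f_norm(mu, v_gamma(1))     # the V_1-norm
print(norm_one, norm_v1)
```

Here the two laws are at total variation distance $1/3$, and the norm with $f \equiv 1$ equals $2/3$, twice that distance. The $V_1$-norm is larger because it weights the mass at the point 1 by $V_1(1) = 2$.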
Further, define the set of bivariate parameters For a random variable X, we write X ∈ D(α, β) for a pair (α, β) ∈ I if the distribution of X belongs to the domain of attraction of a strictly stable law with the characteristic function Theorem 3. Assume (1) and that the distribution of X 1 is either arithmetic or spread out. In addition, assume either EX 2 1 < ∞ with γ ∈ {0, 1} or X 1 ∈ D with γ ∈ (0, min{αp, α(1 − p)}). Then there exist constants r ∈ (0, 1) and c 1 > 0 such that Our proof of Theorem 3 rests on two statements. The first can be viewed as a uniform version of Proposition 3 stated in a slightly different form to avoid measurability issues.

Proposition 4.
Under the assumptions of Theorem 3, for any K > 0 in the case X 1 ∈ D and for K = ∞ in the case EX 2 1 < ∞, there exists a measurable function g K : X + → (0, ∞) such that for all x ∈ Z + d ∩ [0, K) and Borel sets B ⊆ X + . (34) Remark 3. Note that (34) implies P x (O 1 ∈ B) ≥ B g K (y)λ d (dy) > 0 for any Borel set B with λ d (B) > 0. In particular, every compact set C ⊂ Z + d with non-empty interior in Z + d (or the whole set Z + d in the finite variance case) is small with respect to the measure g diam(C) (y)½ X + (y)λ d (dy); see Meyn and Tweedie [12,Section 5.2] for the definition of small sets. The proposition also yields that the Markov chain (O n ) n≥0 is strongly aperiodic and satisfies the minorization condition, cf. Sections 5.4 and 5.1 in [12], respectively.
Remark 4. Our proof, based on Stone's local limit theorem, actually implies that the inequality in (34) with finite K is also valid for asymptotically stable distributions of increments with 1 < α < 2, |β| = 1 and with α = 2. Moreover, it is plausible that (34) holds under assumptions of Proposition 3, i.e. without any assumptions on the tail behaviour of X 1 beyond (1).
Second, we need the following geometric drift condition. We will prove it using results of renewal theory.
Proof of Proposition 4. The case X 1 ∈ D(α, β). In the arithmetic case the set Z + d ∩[0, K) has a finite number of elements and the claim follows from Proposition 3. For spread out distributions we have d = 0, implying Z + 0 ∩ [0, K) = [0, K), and it is clearly sufficient to prove that there exist a measurable function g K : X + → (0, ∞) and for every x, a version p(x, y) of the density of P a x (O 1 ∈ dy) such that inf p(x, y) ≥ g K (y) > 0 for all y ∈ X + .
We will do this by refining the argument in the proof of Proposition 3. Pick y ∈ X + and consider the estimate in (32). Note that ε 1 does not depend on x and y while ε 3 depends only on y through the choice of z > y. By decomposing X + into a pair-wise disjoint collection of countably many bounded half-open intervals and choosing the same z for all y in each of the intervals makes y → ε 3 (y) = P(X 1 ∈ [z, z + h/2]) a measurable function of y. Therefore it suffices to check that ε ′ 2 can be bounded away from zero and k ′ can be bounded from above, both uniformly in x ∈ [0, K) and y in each of the intervals in the partition of X + . These claims will follow once we establish a refined version of (29): for any compact interval I in R and h > 0, there exists an integer m ≥ 1 such that Possibly the easiest way to prove (37) is to apply Stone's local limit theorem which holds for non-lattice asymptotically stable distributions [21,Corollary 1]: if the sequence (b n ) n≥1 tending to infinity is such that S n /b n converges weakly to a strictly stable law with the characteristic function χ α,β given above, then P x (S n ∈ [u, u + h)) = P 0 (S n ∈ [u − x, u − x + h)) = (hp α,β (0) + o(1))b −1 n uniformly in x ∈ [0, K) and u ∈ I as n → ∞, where p α,β , the density of the stable law defined by χ α,β , is strictly positive and continuous at 0 for (α, β) ∈ I. Hence the inequality in (37) holds for all n sufficiently large.
The case $EX_1^2 < \infty$. Note that in this case the above proof implies (34) for any finite $K > 0$. In order to construct $g_\infty : X_+ \to (0, \infty)$, let $T_{(-\infty,L)} := \min\{n \ge 0 : S_n < L\}$ be the moment of the first entrance of the walk $(S_n)_{n \ge 0}$ to the half-line $(-\infty, L)$, where $L := d + 1 > 0$. For any Borel set $B$ in $X_+$ we have where $g_L$ is the lower bound in (36) that corresponds to the interval $(0, L)$ and the equality $P_x(S_{T_{(-\infty,L)}} \in (0, L)) = P_{x-L}(O_1^{\downarrow} \in (-L, 0))$ holds by the definition of $O_1^{\downarrow}$ in (6). By (7), under the assumption $EX_1^2 < \infty$, $P_x(O_1^{\downarrow} \in \cdot)$ converges weakly as $x \to \infty$ to a distribution which assigns positive mass to $(-L, 0)$. Hence there exist constants $c_0, K_0 > 0$ such that In particular, this implies (7) under the assumption $EX_1^2 < \infty$. Clearly, there is a similar representation for the overshoot $O_1$ at the first up-crossing: x} is the non-negative residual lifetime at time $x > 0$ for the ascending ladder height process $(H_k^+)_{k \ge 1}$ of the random walk $S'$. The increments of this process are i.i.d. and have the same common distribution as $H_1^+$, the first strictly positive value of $S'$.
Thus the inequality in (35) holds for any $\rho \in (\rho_0, 1)$ since $E_x O_1^\gamma$ is locally bounded by (42), (43), and the fact that for all $R$ sufficiently large and any $K > 0$, where we used (41) for the first inequality and (39) for the second one as we did in (43).

The case $EX_1^2 < \infty$. The case $\gamma = 0$ is trivial so take $\gamma = 1$. It is well known that the ladder heights of random walks with finite variance of increments are integrable; see Feller [6, Sections XVIII.4 and 5]. Moreover, we have the following versions of (40) and (41): see Gut [8, Theorem 3.10.2]. The rest of the proof is exactly as in the first case: by (42), the value of the l.h.s. of (44) is now zero and $E_x O_1$ is locally bounded.

Conclusion
By [13, Theorem 5], the probability law $\pi_+$ is the unique stationary distribution for the chain of overshoots $O$ of any random walk satisfying (1). By Theorem 2, the laws of $O_n$ converge to $\pi_+$ in the total variation distance for random walks with either arithmetic or spread out distributions of increments. Our intuition coming from renewal theory suggests that the following hypothesis is plausible.
Below we discuss the difficulties of proving convergence of $O_n$ in other metrics on probability distributions under the minimal assumptions in (1). Let us start with two observations. First, the total variation norm is clearly inappropriate since it requires the spread out assumption, as explained in the beginning of Section 3. Moreover, in the non-arithmetic case that is not spread out, the chain of overshoots is not $\psi$-irreducible and thus not Harris recurrent, placing it outside the scope of the well-established classical convergence theory (see Meyn and Tweedie [12]). In fact, the spread out assumption on the distribution of $X_1$ is equivalent to $\psi$-irreducibility of $O$ (and $S$, of course). To see this, recall that any $\psi$-irreducible Markov chain on $\mathbb{R}$ has a finite period $p$ by Theorems 5.2.2 and 5.4.4 in Meyn and Tweedie [12]. Then, by Theorem 4 in Roberts and Rosenthal [17], which we used in the proof of Theorem 2, the $\psi$-irreducibility of $O$ implies that the aperiodic chain $(O_{pn})_{n \ge 0}$ converges to $\pi_+$ in the total variation distance. But this can only be true when the distribution of $X_1$ is spread out.
Second, recall from Section 2.3 that stationarity of $\pi_+$ for the chain $O$ can be established by factorizing the transition kernel of $O$ into the Markov kernels $P$ and $Q$, defined in (9), both having $\pi_+$ as their stationary distribution (see (10)). Unfortunately, this representation appears to be of very limited use for studying questions of convergence. In fact, the following example shows that the chain generated by $Q$ may have an invariant distribution other than $\pi_+$, hence it may fail to converge to $\pi_+$ starting from an arbitrary point.

Example 1. Let $X_1$ satisfy $P(X_1 = a \mid X_1 > 0) = 1$ for some $a > d$. Then for any $x \in (0, a)$ we have $Q(x, dy) = \delta_{a-x}(dy)$ and hence $\frac{1}{2}\delta_x + \frac{1}{2}\delta_{a-x}$ is a stationary distribution of $Q$. An analogous phenomenon occurs for any non-arithmetic distribution of $X_1$ whose restriction to $Z_0^+$ is atomic with finitely many atoms.

The next candidate is convergence in $L^2(\pi_+)$. First of all, here we can work only with initial distributions (of $O_0 = S_0$) that are absolutely continuous with respect to $\pi_+$. Given that the transition operator of the chain of overshoots $O$ is the product of two reversible transition operators (see Section 2.3 above), it is tempting to apply the methods of the theory of self-adjoint operators. We would need to show that either $P$ or $Q$ has a spectral gap. A plausible way to prove this is to check that the operator is compact, with $1$ being an eigenvalue of multiplicity one, and that $-1$ is not an eigenvalue.
The operator $Q$ appears to be more amenable to analysis, but it seems that $Q$ may be non-compact for a general distribution of increments. In addition, Example 1 above shows that $1$ can be a multiple eigenvalue of $Q$, since the $Q$-chain can in general have more than one stationary distribution on $Z_d^+$. We are not aware of any works that establish compactness of Markov transition operators on an appropriate functional space without the assumption of absolute continuity of the transition probabilities (which in this paper corresponds to the spread out case).
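Example 1 is simple enough to check mechanically. The sketch below takes the formula $Q(x, dy) = \delta_{a-x}(dy)$ from the example as given and uses exact rational arithmetic. The helper names `q_step`, `push_forward` and `two_point_law` are illustrative only and do not appear elsewhere in the paper.

```python
from fractions import Fraction


def q_step(x, a):
    """Deterministic move of the Q-chain in Example 1: Q(x, .) = delta_{a-x}."""
    if not 0 < x < a:
        raise ValueError("Example 1 only describes Q(x, .) for x in (0, a)")
    return a - x


def push_forward(law, a):
    """Image of a finitely supported law (dict: point -> mass) under one Q-step."""
    out = {}
    for point, mass in law.items():
        y = q_step(point, a)
        out[y] = out.get(y, 0) + mass
    return out


def two_point_law(x, a):
    """The law (1/2) delta_x + (1/2) delta_{a-x}; a single atom if x = a/2."""
    law = {}
    for point in (x, a - x):
        law[point] = law.get(point, 0) + Fraction(1, 2)
    return law


if __name__ == "__main__":
    a = Fraction(1)  # hypothetical value of the unique positive atom a
    for x in (Fraction(1, 3), Fraction(1, 4)):
        law = two_point_law(x, a)
        # Invariance: one Q-step maps the two-point law to itself.
        assert push_forward(law, a) == law
        print(dict(law))
```

Different starting points $x$ produce different invariant laws, which is the multiplicity of the eigenvalue $1$ referred to above. A single point mass $\delta_x$ is mapped to $\delta_{a-x}$, so it is invariant only when $x = a/2$.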
Regarding the weak convergence of Markov chains, the only technique we are aware of is based on the so-called $\varepsilon$-coupling for continuous-time Markov chains; see Thorisson [22, Section 5.6]. This does not seem to be applicable in the non-arithmetic case: even though, for any distinct real values $x_1$ and $x_2$, the walks $x_1 + S'$ and $x_2 + S'$ enjoy a version of $\varepsilon$-coupling (see Thorisson [22, Theorem 2.7.1]), the level zero will be crossed at different times by the two walks, making it hardly possible to deduce that the corresponding chains of overshoots are eventually only a small distance away from each other.
Our last candidates are Wasserstein-type metrics with a carefully chosen distance on $Z_0^+ = [0, \infty)$. Here there is a promising approach, introduced by Hairer and Mattingly [9, 10], which works under a significantly relaxed version of the restrictive $\psi$-irreducibility assumption and allows one to prove convergence of Markov chains whose transition probabilities can even be mutually singular. Our problem with non-arithmetic distributions that are not spread out appears to be in this category, but we were unable to apply these ideas in our context.
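For contrast with the open non-arithmetic case, the arithmetic case covered by Theorem 2 is easy to explore numerically. The following is a rough Monte Carlo sketch and not part of the paper's argument. The step law, uniform on $\{-2, -1, 1, 2\}$, is a hypothetical choice, and the comparison law is taken to be proportional to $P(X_1 > x)$ on the integers $x \ge 0$, which should be treated as an assumption of this sketch about the form of $\pi_+$. The function names are illustrative.

```python
import random
from collections import Counter

# Hypothetical zero-mean integer-valued step law (uniform on these values).
STEPS = (-2, -1, 1, 2)


def overshoot_frequencies(n_steps, seed=0, start=0):
    """Empirical law of overshoots at up-crossings of zero along one trajectory.

    An up-crossing at time k means S_{k-1} < 0 <= S_k; the overshoot is S_k.
    Returns a dict overshoot -> relative frequency (empty if no up-crossing).
    """
    rng = random.Random(seed)
    counts = Counter()
    s = start
    for _ in range(n_steps):
        prev = s
        s += rng.choice(STEPS)
        if prev < 0 <= s:
            counts[s] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {k: v / total for k, v in sorted(counts.items())}


def candidate_stationary_law():
    """Normalized weights P(X_1 > x) for x = 0, 1 (assumed form of pi_+)."""
    tail = {x: sum(1 for step in STEPS if step > x) / len(STEPS) for x in (0, 1)}
    norm = sum(tail.values())
    return {x: w / norm for x, w in tail.items()}


if __name__ == "__main__":
    print("empirical:", overshoot_frequencies(200_000, seed=0))
    print("candidate:", candidate_stationary_law())
```

For this step law the candidate weights are $2/3$ and $1/3$ on the overshoots $0$ and $1$. The number of up-crossings in $n$ steps is itself random and typically of order $\sqrt{n}$, so the empirical frequencies from a single run are noisy and should be read as a sanity check only.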