Moments of recurrence times for Markov chains

We consider moments of the return times (or first hitting times) of a discrete time, discrete space Markov chain. It is classical that the finiteness of the first moment of the return time of one state implies the finiteness of the first moment of the first return time of any other state. We extend this statement to moments with respect to a function $f$, where $f$ satisfies a certain, best possible condition. This generalizes results of K. L. Chung (1954), who considered the functions $f(n) = n^p$ and wondered "[...] what property of the power $n^p$ lies behind this theorem [...]" (see Chung (1967), p. 70). We show that exactly the functions that do not increase exponentially -- neither globally nor locally -- fulfill the above statement.


Introduction
A classical result, see e.g. [Kol36], states that the following holds for any recurrent, irreducible Markov chain on a countable state space: if for some state $i$ the first moment of the recurrence time is finite, then the same holds for any other state. A first generalization of this result appeared in [HR53]. If we denote by $T_{ij}$ the first time that the Markov chain visits state $j$ when started in $i$, then the result can be stated as follows: $E f(T_{ii}) < \infty$ for some state $i$ implies that $E f(T_{jj}) < \infty$ for any other state $j$, where $f(x) = x^n$ for some integer $n$. The authors state the result as a lemma and refer for the proof to an (unpublished) note by K. L. Chung and R. N. Snow. A more general result can be found in [Chu54, Theorem 1], where $f$ is allowed to be of the form $x^p$, for any real $p > 0$. After stating the theorem, the author comments that the concept of generalized moments defined in terms of a general function $f$ was suggested to him by J. L. Doob, but Chung only mentions that his results can also be shown for functions $f$ satisfying $f(x+y) \le f(x) + A f(y)$, for some constant $A$. Further related research considers recursive formulas for second moments in terms of first moments [Chu54, Sect. 2], factorial moments [Lam60], and also, more recently, explicit formulas for higher polynomial moments [Sze08].

These considerations naturally lead to the question for which functions it is true that a finite generalized moment of the return time for one state $i$ implies that the corresponding moment of the return time is also finite for any other state of the Markov chain. In this note, we characterize this class of functions. In the following, we only consider irreducible, recurrent discrete time Markov chains with a countable state space. To formulate our results, we introduce the following notation.
The candidate functions $f$ are taken from the set
\[
\mathcal{F} := \big\{ f : \mathbb{N} \to (0,\infty) \;\big|\; f \text{ non-decreasing} \big\}.
\]
Then our objective is to classify the collection $\mathcal{K}$ of all $f \in \mathcal{F}$ such that for each irreducible recurrent discrete time Markov chain with a finite or countably infinite state space $E$, the following holds:
\[
E f(T_{ii}) < \infty \ \text{ for some } i \in E \quad \Longrightarrow \quad E f(T_{jj}) < \infty \ \text{ for all } j \in E.
\]
Following Chung [Chu54], we additionally introduce the class $\mathcal{C}$, by stating that $f \in \mathcal{C}$ if for any Markov chain the following holds: if there exist two states $i$ and $j$ such that $E f(T_{ij}) < \infty$ and $E f(T_{ji}) < \infty$, then $E f(T_{kl}) < \infty$ for all states $k$ and $l$. As the states $i$ and $j$ do not have to be distinct, it follows that $\mathcal{C}$ is contained in $\mathcal{K}$. The classical result due to Kolmogorov implies that the identity function belongs to $\mathcal{K}$, and [Chu54] shows that any $f(x) = x^p$, for $p > 0$, belongs to $\mathcal{C}$. Our main result states that the two classes $\mathcal{K}$ and $\mathcal{C}$ are in fact the same; and we also give a characterization for a function to be in this class.

Theorem 1.1. Let $f \in \mathcal{F}$. Then the following are equivalent:

(a) $f \in \mathcal{C}$;

(b) $f \in \mathcal{K}$;

(c) the following two conditions are satisfied:

(i) there is a constant $C > 0$ such that $f(x+y) \le C\, f(x)\, f(y)$ for all $x, y \in \mathbb{N}$;

(ii) $\limsup_{n \to \infty} \frac{1}{n} \log f(n) = 0$.

Condition (c) has an easy interpretation: it ensures that the function $f$ does not grow exponentially fast -- neither globally nor locally. In fact, one can construct functions outside $\mathcal{K}$ that globally increase as slowly as one wishes, but locally have parts of exponential increase (cf. Example 3.3).

This note is structured as follows. In Section 2, we prove the implication (c) ⇒ (a) of our main theorem. In Section 3, we prove the implication (b) ⇒ (c) by showing that any function that violates either condition (i) or (ii) in (c) is not in the class $\mathcal{K}$. Together with the earlier observation that $\mathcal{C} \subseteq \mathcal{K}$, so that (a) ⇒ (b), this shows the equivalence of the three statements.
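To illustrate condition (c), the following worked example (ours, not part of the original text) checks directly that the power functions $f(x) = x^p$ treated in [Chu54] satisfy both conditions:

```latex
% (i) submultiplicativity: for integers x, y >= 1 we have x + y <= 2xy, hence
\[
  f(x+y) = (x+y)^p \le (2xy)^p = 2^p\, x^p y^p = C\, f(x)\, f(y),
  \qquad C := 2^p .
\]
% (ii) subexponential growth:
\[
  \limsup_{n\to\infty} \frac{1}{n}\,\log f(n)
  = \lim_{n\to\infty} \frac{p \log n}{n} = 0 .
\]
```

In contrast, $f(n) = e^{\delta n}$ with $\delta > 0$ violates (ii); cf. the final lemma of Section 3.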

Proof of (c) ⇒ (a)
In this section, we prove that (c) ⇒ (a) in our main theorem. We start with the following lemma, which collects some preliminary facts. For this purpose, it is convenient to introduce the following notation: for states $i$ and $j$ of a Markov chain, we denote by $U_{ij}$ the return time from $i$ to $i$ conditioned on not crossing $j$ (if there is such a path with positive probability). Further, $V_{ij}$ denotes the first hitting time of state $j$ when started from $i$, conditioned on not returning to $i$ before hitting $j$.

Lemma 2.1. Let $f \in \mathcal{F}$ and let $i$ and $j$ be two states of a Markov chain such that $E f(T_{ii}) < \infty$. Then:

(i) $E f(U_{ij}) < \infty$ and $E f(U_{ji}) < \infty$, whenever these random variables are defined;

(ii) $E f(V_{ij}) < \infty$ and $E f(V_{ji}) < \infty$.

Proof. To see the first part of (i), note that if the probability $p$ of going from $i$ to $i$ without crossing $j$ is positive, we have
\[
E f(T_{ii}) \ge p\, E f(U_{ij}),
\]
since, conditioned on avoiding $j$, the return time $T_{ii}$ has the law of $U_{ij}$. This yields $E f(U_{ij}) \le p^{-1} E f(T_{ii}) < \infty$. The remaining parts are shown similarly: each of the random variables $U_{ji}$, $V_{ij}$ and $V_{ji}$ describes, with positive probability, a sub-path of a return path from $i$ to $i$.
We stress that, in the second part of Lemma 2.1, one cannot prove in the same way that $E f(T_{ij}) < \infty$, since a typical path from $i$ to $i$ does not necessarily contain a path from $i$ to $j$. However, as the next lemma shows, this can be proven if we assume that condition (c) holds. This lemma is also the main part of the argument for the proof of (c) ⇒ (a) in our main theorem.
Lemma 2.2. Let $f \in \mathcal{F}$ and assume that (c) holds.

(i) If for two states $i$ and $j$ of a Markov chain we have $E f(T_{ij}) < \infty$ and $E f(T_{ji}) < \infty$, then $E f(T_{ii}) < \infty$.

(ii) If for two states $i$ and $j$ of a Markov chain we have $E f(T_{ii}) < \infty$, then $E f(T_{ij}) < \infty$.
Proof. First we show (i). Clearly, $T_{ii}$ is stochastically dominated by $T_{ij} + T_{ji}$, where $T_{ij}$ and $T_{ji}$ are independent. Using (c) and the monotonicity of $f$, we get
\[
E f(T_{ii}) \le E f(T_{ij} + T_{ji}) \le C\, E f(T_{ij})\, E f(T_{ji}) < \infty.
\]
Now we turn our attention to (ii). For the purpose of this proof, define $f(0) := f(1)$. The crucial observation (also cf. (3.3) below) is that
\[
T_{ij} = \sum_{r=1}^{M} U^{(r)} + V, \tag{2.1}
\]
where the random variables $U^{(r)}$ are i.i.d. copies of the random variable $U_{ij}$ as defined before Lemma 2.1 and $V$ is a copy of the random variable $V_{ij}$ as defined before Lemma 2.1, and all variables are independent. Further, $M$ (independent of the $U$'s and $V$) is a geometric random variable with mean $1/\pi - 1$, where $\pi > 0$ is the probability of first hitting $j$ before $i$ when started from $i$. It may be that $\pi = 1$ -- which is the case if and only if there is no path from $i$ to $i$ without crossing $j$ -- in which case $T_{ij} = V$ and we are already done. Excluding this case, we derive from (c) and (2.1) that
\[
E f(T_{ij}) \le C\, E f(V) \sum_{m=0}^{\infty} \pi (1-\pi)^m\, E f\Big( \sum_{r=1}^{m} U^{(r)} \Big),
\]
where $E f(V) < \infty$ by Lemma 2.1(ii); this series is finite provided that
\[
\limsup_{m \to \infty} \frac{1}{m} \log E f\Big( \sum_{r=1}^{m} U^{(r)} \Big) \le 0, \tag{2.2}
\]
where we know that $E f(U^{(1)}) < \infty$ from Lemma 2.1(i).
To show (2.2), fix a large constant $A \in \mathbb{N}$ and estimate, using (c), as follows:
\[
f\Big( \sum_{r=1}^{m} U^{(r)} \Big) \le f\Big( mA + \sum_{r=1}^{m} U^{(r)} \mathbf{1}_{\{U^{(r)} > A\}} \Big) \le C\, f(mA) \prod_{r=1}^{m} \Big( C f(U^{(r)})\, \mathbf{1}_{\{U^{(r)} > A\}} + \mathbf{1}_{\{U^{(r)} \le A\}} \Big).
\]
Hence, by independence,
\[
\frac{1}{m} \log E f\Big( \sum_{r=1}^{m} U^{(r)} \Big) \le \frac{\log C}{m} + \frac{1}{m} \log f(mA) + \log\Big( 1 + C\, E\big[ f(U^{(1)})\, \mathbf{1}_{\{U^{(1)} > A\}} \big] \Big).
\]
Letting $m \to \infty$ and using part (ii) of (c) (so that $\frac{1}{m}\log f(mA) = A \cdot \frac{\log f(mA)}{mA} \to 0$), we get
\[
\limsup_{m \to \infty} \frac{1}{m} \log E f\Big( \sum_{r=1}^{m} U^{(r)} \Big) \le \log\Big( 1 + C\, E\big[ f(U^{(1)})\, \mathbf{1}_{\{U^{(1)} > A\}} \big] \Big).
\]
Letting now $A \to \infty$ and using dominated convergence, since $E f(U^{(1)}) < \infty$, we have $E[f(U^{(1)})\, \mathbf{1}_{\{U^{(1)} > A\}}] \to 0$, which proves (2.2). Therefore, since $(1-\pi)^m$ decays exponentially in $m$ while, by (2.2), $E f(\sum_{r=1}^m U^{(r)})$ grows subexponentially, we conclude that $E f(T_{ij}) < \infty$.
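As a quick numerical sanity check (ours, on an arbitrary toy chain; not part of the argument), the stochastic domination of $T_{ii}$ by $T_{ij} + T_{ji}$ used in the proof of part (i) can be tested by Monte Carlo simulation:

```python
# Monte Carlo sanity check (illustrative; the 3-state chain and f(n) = n**2 are
# arbitrary choices, not from the text) of the first step in the proof of
# Lemma 2.2(i): T_ii is stochastically dominated by T_ij + T_ji with
# independent summands, hence E f(T_ii) <= E f(T_ij + T_ji) for non-decreasing f.
import random

P = [[0.2, 0.5, 0.3],
     [0.4, 0.2, 0.4],
     [0.5, 0.5, 0.0]]  # an arbitrary irreducible transition matrix

def hitting_time(start, target, rng):
    """First time (>= 1) the chain started at `start` hits `target`."""
    state, steps = start, 0
    while True:
        state = rng.choices(range(3), weights=P[state])[0]
        steps += 1
        if state == target:
            return steps

rng = random.Random(0)
f = lambda n: n ** 2          # any non-decreasing f works here
N = 20000
lhs = sum(f(hitting_time(0, 0, rng)) for _ in range(N)) / N      # ~ E f(T_00)
rhs = sum(f(hitting_time(0, 1, rng) + hitting_time(1, 0, rng))
          for _ in range(N)) / N                                 # ~ E f(T_01 + T_10)
print(lhs, rhs)
```

Coupling along a single trajectory shows the domination pathwise: the first return to 0 happens no later than the first return to 0 after first visiting 1.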

Proof of (b) ⇒ (c)
In this section, we show the implication (b) ⇒ (c) in Theorem 1.1 by showing that the conditions of submultiplicativity and subexponential growth rate are in fact necessary. In Lemma 3.1, we give abstract conditions on $f$ which imply that $f \notin \mathcal{K}$; we exploit these in Lemma 3.2 to show that if $f$ violates the submultiplicativity condition (i) of (c) in the main theorem, then $f \notin \mathcal{K}$. Finally, in Lemma 3.4, we show that any function growing exponentially fast, i.e. any function that does not satisfy condition (ii), does not belong to $\mathcal{K}$.
Lemma 3.1. If for a given $f \in \mathcal{F}$ one can construct two independent random variables $U_1$ and $U_2$ taking values in $\mathbb{N}$ with infinite support such that $E f(U_i) < \infty$ for $i = 1, 2$, but $E f(U_1 + U_2) = \infty$, then $f \notin \mathcal{K}$.
Proof. Given the two random variables, we construct a Markov chain with two special states 0 and 1 with the property that $E f(T_{11}) < \infty$, whereas $E f(T_{00})$ is infinite, which shows that $f \notin \mathcal{K}$. The construction of the Markov chain is in some regards similar to [YK39], where the authors construct a Markov chain with $T_{00}$ having any prescribed distribution.

Denote by $\{x_1, x_2, \dots\}$ and $\{y_1, y_2, \dots\}$ the supports of $U_1$ and $U_2$, respectively. Formally, we can write the state space $E$ of our Markov chain as
\[
E = \{0, 1\} \cup \big\{ (L, n, m) : n \in \mathbb{N},\ 1 \le m < x_n \big\} \cup \big\{ (R, n, m) : n \in \mathbb{N},\ 1 \le m < y_n \big\}.
\]
The state 0 is connected only to 1, and if in state 0, the chain always moves to 1 next, i.e. $p_{01} = 1$. If in state 1, the chain has three possibilities. The first one is that the chain moves to 0 with probability $p$, for some parameter $p \in (0, 1)$. Then, conditionally on not going to 0, with equal probability it either moves "left" (i.e. to a state $(L, n, 1)$) or "right" (i.e. to a state $(R, n, 1)$). Conditionally on the next move going to the "left", we want $T_{11}$ to have distribution $U_1$; therefore we set
\[
p_{1,(L,n,1)} = \frac{1-p}{2}\, P(U_1 = x_n), \qquad p_{(L,n,m-1),(L,n,m)} = 1, \quad n \in \mathbb{N},\ 1 < m \le x_n,
\]
where we identify $(L, n, x_n)$ with 1. Similarly, conditionally on the next move going "right", we would like $T_{11}$ to have the distribution $U_2$, and set
\[
p_{1,(R,n,1)} = \frac{1-p}{2}\, P(U_2 = y_n), \qquad p_{(R,n,m-1),(R,n,m)} = 1, \quad n \in \mathbb{N},\ 1 < m \le y_n,
\]
where we again identify $(R, n, y_n)$ with 1.

Now, we can calculate the generalized moments of $T_{00}$ and $T_{11}$. Firstly, we find for $T_{11}$, by conditioning on the three different possibilities,
\[
E f(T_{11}) = p\, f(2) + \frac{1-p}{2}\, E f(U_1) + \frac{1-p}{2}\, E f(U_2),
\]
which is finite by our assumptions on $U_1$ and $U_2$. However, for $T_{00}$ we obtain
\[
T_{00} = 2 + \sum_{i=1}^{M} U^{(i)}_{10}, \tag{3.3}
\]
where $M$ is a geometric random variable with parameter $p$ and the $U^{(i)}_{10}$ are independent random variables that have the same distribution as $T_{11}$ conditioned on not going to 0 in the first step. (Recall (2.1).) In particular, we obtain a lower bound by considering the following strategy: first the Markov chain jumps from 0 to 1, then it takes a tour to the "left", after that it takes a tour to the "right", before it finally returns to 0. Thus, we obtain, using that $f$ is non-decreasing,
\[
E f(T_{00}) \ge \frac{p\,(1-p)^2}{4}\, E f(2 + U_1 + U_2) \ge \frac{p\,(1-p)^2}{4}\, E f(U_1 + U_2),
\]
where the latter is infinite by our assumptions on $U_1$ and $U_2$; thus, as claimed, $E f(T_{00})$ is infinite.
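The mechanics of this construction can be sketched in code. The simulation below (ours, purely illustrative) uses finite-support stand-ins for $U_1$ and $U_2$; the interesting case of the lemma of course requires infinite support, but the tour structure is the same:

```python
# Illustrative simulation (not from the paper) of the chain built in the proof
# of Lemma 3.1, with finite-support stand-ins for U_1 and U_2.  A tour through
# the "left" ("right") corridor of index n returns to state 1 after exactly
# x_n (y_n) steps, so conditionally on not moving to 0, T_11 ~ U_1 or U_2.
import random

p = 0.3
U1 = {2: 0.5, 5: 0.5}   # stand-in for P(U_1 = x_n); illustrative finite support
U2 = {3: 0.5, 7: 0.5}   # stand-in for P(U_2 = y_n)

def sample(dist, rng):
    return rng.choices(list(dist), weights=dist.values())[0]

def T11(rng):
    """One realization of the return time of state 1."""
    if rng.random() < p:                       # move to 0, then back (p_01 = 1)
        return 2
    dist = U1 if rng.random() < 0.5 else U2    # "left" or "right" corridor
    return sample(dist, rng)                   # corridor returns after x_n steps

def T00(rng):
    """One realization of the return time of state 0: 0 -> 1, tours, 1 -> 0."""
    steps = 1                                  # the forced step 0 -> 1
    while rng.random() >= p:                   # take a tour instead of returning
        dist = U1 if rng.random() < 0.5 else U2
        steps += sample(dist, rng)
    return steps + 1                           # final step 1 -> 0

rng = random.Random(1)
N = 50000
m11 = sum(T11(rng) for _ in range(N)) / N
# exact first moment: p*2 + (1-p)/2 * E[U_1] + (1-p)/2 * E[U_2]
exact = p * 2 + (1 - p) / 2 * 3.5 + (1 - p) / 2 * 5.0
print(m11, exact)
```

This matches the conditioning identity for $E f(T_{11})$ above with $f = \mathrm{id}$.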
The next lemma uses the construction in Lemma 3.1 to show that any function not satisfying the submultiplicativity condition (i) in (c) is not in $\mathcal{K}$.
Lemma 3.2. Suppose that $f \in \mathcal{F}$ is such that for any $C > 0$ there exist $x_C$ and $y_C$ such that
\[
f(x_C + y_C) > C\, f(x_C)\, f(y_C).
\]
Then $f \notin \mathcal{K}$.

Proof. By Lemma 3.1, it suffices to construct two independent random variables $U_1$ and $U_2$ such that $E f(U_i) < \infty$ for $i = 1, 2$ and $E f(U_1 + U_2) = \infty$.
By our assumption on $f$, we can find increasing sequences $(x_k)_{k \ge 1}$ and $(y_k)_{k \ge 1}$ such that
\[
f(x_k + y_k) \ge k^5\, f(x_k)\, f(y_k) \qquad \text{for all } k \ge 1.
\]
Indeed, to see the existence of such sequences, assume that for some $k$ and all $x \ge x_k + 1$ and $y \ge y_k + 1$ the following inequality holds:
\[
f(x + y) \le (k+1)^5\, f(x)\, f(y).
\]
A short calculation, using the monotonicity of $f$ and the fact that $f \ge f(1) > 0$, implies that then there is a constant $C > 0$ with $f(x+y) \le C f(x) f(y)$ for all $x, y \in \mathbb{N}$, in contradiction to the assumption of the lemma. Then, for definiteness, let $U_1$ be a random variable taking value $x_k$ with probability $p_k := c_1 f(x_k)^{-1} k^{-2}$ (with a suitable normalizing constant $c_1$) and, similarly, let $U_2$ take value $y_k$ with probability $q_k := c_2 f(y_k)^{-1} k^{-2}$ (with a suitable normalizing constant $c_2$). In particular, we find that $E f(U_i) = c_i \sum_k k^{-2}$ is finite for $i = 1, 2$. However,
\[
E f(U_1 + U_2) \ge \sum_{k=1}^{\infty} p_k\, q_k\, f(x_k + y_k) \ge c_1 c_2 \sum_{k=1}^{\infty} k^{-4} \cdot k^5 = c_1 c_2 \sum_{k=1}^{\infty} k,
\]
which is infinite.
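The divergence at the end of the proof is elementary arithmetic, which a small numeric check (ours; the normalization factor $k^5$ is one convenient choice) makes concrete:

```python
# E f(U_i) is proportional to sum_k k^{-2} (convergent), while the diagonal
# lower bound for E f(U_1 + U_2) is proportional to sum_k k^{-4} * k^5 = sum_k k
# (divergent).  The exponent 5 is an illustrative normalization choice.
def partial_sum(N, exponent):
    return sum(k ** exponent for k in range(1, N + 1))

print(partial_sum(10 ** 4, -2))   # stays bounded (below pi**2 / 6)
print(partial_sum(10 ** 4, 1))    # grows without bound
```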
The following example exhibits a "typical" function that can be chosen to satisfy (ii) of (c), while not obeying (i) of (c).
Example 3.3. We now construct a function $f$ for which we can choose the parameters in such a way that the condition in Lemma 3.2 is satisfied and, at the same time,
\[
\limsup_{n \to \infty} \frac{1}{n} \log f(n) = 0.
\]
We first describe the function $g = \log f$. Take two sequences $(s_i)_{i \ge 1}$ and $(u_i)_{i \ge 1}$ such that $u_i \le s_{i+1} - s_i$ and $u_i \to \infty$ as $i \to \infty$. Setting $g(0) = 0$ and $s_0 = u_0 = 0$, we define $g$ to be constant on the interval $(s_i + u_i, s_{i+1})$ for $i \ge 0$, while on the interval $(s_i, s_i + u_i)$, for $i \ge 1$, the function $g$ grows linearly with slope 1. Then, by adjusting the parameters $u_i, s_i$ (in such a way that $\lim_{n\to\infty} g(n)/n = 0$), one can make sure that the condition in Lemma 3.2 is fulfilled for $f = e^g$, while at the same time, by making the differences $s_i - s_{i-1}$ large enough, one can let $f$ grow as slowly as desired.
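A concrete instance of this construction (our parameter choice, not from the text) can be computed directly. With $s_i = i^3$ and $u_i = i$, the constraint $u_i \le s_{i+1} - s_i$ holds, the ratio $f(s_i + u_i) / (f(s_i) f(u_i)) = e^{u_i - g(u_i)}$ blows up along the ramps, yet $g(n)/n$ stays small:

```python
# Illustrative instance of the function in Example 3.3: g = log f is flat
# except for ramps of slope 1 and length u_i starting at s_i.  We choose
# s_i = i**3 and u_i = i (our choice), satisfying u_i <= s_{i+1} - s_i.
K = 1000
s = [i ** 3 for i in range(1, K + 1)]
u = [i for i in range(1, K + 1)]

def g(n):
    """g(n) = total (possibly partial) ramp length accumulated below n."""
    total = 0
    for si, ui in zip(s, u):
        if n <= si:
            break
        total += min(n - si, ui)
    return total

i = K - 1                             # inspect the last complete ramp
x, y = s[i], u[i]                     # g rises by u_i between x and x + y
violation = g(x + y) - g(x) - g(y)    # = log( f(x+y) / (f(x) f(y)) ), large
growth = g(x + y) / (x + y)           # ~ (1/n) log f(n) at n = x + y, small
print(violation, growth)
```

So $f = e^g$ violates the submultiplicativity condition (i) arbitrarily badly while its global exponential growth rate is negligible.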
The following lemma shows that the condition on the subexponential growth rate of $f$ -- condition (ii) of (c) -- is really necessary.

Lemma 3.4. Suppose that $f \in \mathcal{F}$ is such that $\limsup_{n \to \infty} \frac{1}{n} \log f(n) > 0$. Then $f \notin \mathcal{K}$.

Proof. By our assumption on $f$, we can find an increasing sequence $(x_i)_{i \ge 1}$ and $\delta > 0$ such that $f(x_i) \ge e^{\delta x_i}$. Then, consider the Markov chain with two states 0 and 1 and transition probabilities $p_{01} = p$ and $p_{10} = 1$, for some parameter $p \in (0, 1)$ (so that $p_{00} = 1 - p$). Now, $E f(T_{00})$ is finite, since $T_{00} \le 2$, while for any $i \ge 1$,
\[
E f(T_{11}) \ge f(x_i)\, P(T_{11} = x_i) = f(x_i)\, p\, (1-p)^{x_i - 2} \ge \frac{p}{(1-p)^2}\, \big( e^{\delta} (1-p) \big)^{x_i},
\]
which tends to infinity as $i \to \infty$, provided that $p$ is sufficiently small (namely such that $e^{\delta}(1-p) > 1$). Hence $E f(T_{11}) = \infty$ and $f \notin \mathcal{K}$.
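The two-state chain above is simple enough to compute exactly. The following check (ours, with illustrative parameters $\delta = 0.5$, $p = 0.1$) confirms that $E f(T_{00})$ is finite while the series for $E f(T_{11})$ diverges:

```python
# Exact computation (illustrative parameters) for the two-state chain in the
# final lemma: p_01 = p, p_00 = 1 - p, p_10 = 1.  Here T_00 <= 2, so
# E f(T_00) is finite for every f, while P(T_11 = n) = p (1-p)**(n-2) for
# n >= 2, so for f(n) = exp(delta * n) the series for E f(T_11) diverges
# whenever exp(delta) * (1 - p) > 1, i.e. for p small enough.
import math

delta = 0.5
p = 0.1                                   # small enough: e**delta * (1-p) > 1
f = lambda n: math.exp(delta * n)

Ef_T00 = (1 - p) * f(1) + p * f(2)        # T_00 = 1 or 2; always finite

def partial_Ef_T11(N):
    """Partial sum of E f(T_11) = sum_{n >= 2} f(n) * p * (1-p)**(n-2)."""
    return sum(f(n) * p * (1 - p) ** (n - 2) for n in range(2, N + 1))

print(Ef_T00, partial_Ef_T11(50), partial_Ef_T11(100))
```

The partial sums grow geometrically with the truncation level, reflecting the divergence used in the proof.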