Exponential return times in a zero-entropy process

We construct a zero-entropy, weakly mixing, finite-valued process with the exponential limit law for return (resp. hitting) times. This limit law is obtained at almost every point, taking the limit along the full sequence of cylinders around the point. Until now, the exponential limit law for return (resp. hitting) times, taking the limit along the full sequence of cylinders, has been obtained only for positive-entropy processes satisfying some strong mixing conditions of Rosenblatt type.


Introduction
In the last two decades, asymptotic laws for the return and hitting time statistics of stationary processes have been intensively studied. They have been investigated mainly in the context of strong mixing properties of the process, and the results are of two kinds. First, under some strong mixing conditions, the limit distribution of return (resp. hitting) times to shrinking cylinders is exponential. See for instance [Aba01, AV08, Pit91, GS97, GK09].
In these cases, the authors take the limit at almost every point, along the full sequence of cylinders around the point. The strong mixing conditions for processes imply positive entropy.
On the other hand, there are several classes of zero-entropy processes which do not satisfy these strong mixing conditions and possess other limit distributions for return (resp. hitting) times. These include some low-complexity shifts such as Sturmian shifts, linearly recurrent shifts and substitutive shifts, where the limit distributions for hitting times were proved to be piecewise linear. Chaumoitre and Kupsa [CK06] proved that in the class of processes derived from rank-one systems, one can actually obtain any possible limit law for return and hitting times satisfying the very weak and natural condition described in [Lac02] and [KL05]. In particular, the exponential law can be obtained as the limit law for return and hitting times of a rank-one process. Let us recall that rank-one processes have zero entropy.
However, most of the results from the previous paragraph are much weaker than results on the exponential limit law, since the limit laws are attained by taking the limit along particular subsequences of cylinders. Although the limit law is again attained at almost every point (i.e. the limit law is still "global"), we pick a particular sequence of cylinders around the point. This allows different limit laws to coexist in one process. Indeed, an example of a rank-one system where all possible laws are realised as limit laws for hitting times along suitably chosen subsequences of cylinders is provided in [CK06]. There are only a couple of examples of zero-entropy processes where the limit law is attained along the full sequence of cylinders. These are the process derived from the Fibonacci shift and processes derived from two-column rank-one constructions ([CK05]). In these cases the limit law for the hitting time is piecewise linear. Taking the limit along suitably chosen subsequences of cylinders also allows one to obtain a non-exponential limit distribution for positive-entropy processes, see [DL].
The question we answer in this paper is whether there exists a zero-entropy process for which the exponential law is the limit law for return and hitting times attained by taking the limit along the full sequence of cylinders. Our answer is positive, and we thereby demonstrate that this behavior of return and hitting times implies neither strong mixing conditions nor positive entropy.
Our paper is structured as follows. Section 2 contains our main theorem and the definitions and notation needed to state it. In Section 3, we define the zero-entropy stationary process. Sections 4, 5 and 6 are devoted to a careful analysis of the process and of the non-stationary measure used in its construction. These sections provide all the technical steps needed to prove that the process possesses the exponential limit distribution, see Corollary 7. Theorem 1 follows immediately from this corollary. The appendix contains some basic analysis of the non-stationary measure, concerning the notion of the dependency structure.

Acknowledgments
We would like to thank Tomasz Downarowicz. It was his idea to construct this kind of process to obtain exponential behavior in the class of zero-entropy processes. He encouraged us, and discussions with him helped us to overcome many technicalities.
We are also grateful to El Houcein El Abdalaoui for sharing his knowledge on rank-one systems.
The first author's research was supported by MNiSW Grant N N201 394537. The second author's research was supported by GA ASCR under the grant KJB100750901.

Preliminaries and the main theorem
Let A be a finite set, called an alphabet. Let A^N be the space of all sequences x = x_0 x_1 …, x_i ∈ A, equipped with the σ-field B generated by the cylinder sets [u] = {x ∈ A^N : x_0 … x_{|u|−1} = u}, where u ranges over the finite words over A. On this measurable space we consider the classical shift mapping T : A^N → A^N, (T x)_i = x_{i+1}. A quadruple (A^N, B, T, µ), where µ is a T-invariant probability measure, is called a finite-valued stationary process; we will often write X for A^N. This way of defining a process is characteristic of ergodic theory. The standard notion of a process, as a sequence of random variables, is obtained by considering the sequence of coordinate projections x ↦ x_n, n ∈ N. For a measurable set B ∈ B of positive measure and a point x ∈ X, we define the hitting time of x in B as τ_B(x) = min{k ≥ 1 : T^k(x) ∈ B}, where T^k is the k-th iterate of the transformation T. The function τ_B can be considered as a random variable on the probability space (X, B, µ), or as a random variable on the conditional space (B, B|B, µ_B), where B|B is the restriction of the σ-field B to the set B and µ_B is the measure defined on B|B by the formula µ_B = µ/µ(B). The former variable is called the hitting time to B, whereas the latter is called the return time to B. The Poincaré recurrence theorem states that the return time to B is almost surely finite. Since the process is assumed to be ergodic, the Kac lemma ensures that the expectation of the return time to B is equal to the reciprocal of the measure µ(B).
For any x ∈ A^N, one can consider the sequence of suitably rescaled return or hitting times µ([x^n]) τ_{[x^n]}, n ∈ N, where x^n denotes the prefix of x of length n. The question is whether these sequences converge in distribution. It can be rephrased in terms of weak convergence of the corresponding distribution functions F_{x,n}(t) = µ({y : µ([x^n]) τ_{[x^n]}(y) ≤ t}) and F̃_{x,n}(t) = µ_{[x^n]}({y : µ([x^n]) τ_{[x^n]}(y) ≤ t}). We say that the distribution functions F_{x,n} (resp. F̃_{x,n}), n ∈ N, weakly converge to a distribution function F if F_{x,n}(t) (resp. F̃_{x,n}(t)), n ∈ N, converges to F(t) at every point t of continuity of F. Our main result is the following:
Theorem 1. There exists a finite-valued weakly mixing zero-entropy process (A^N, B, T, µ) such that for almost every point x ∈ A^N, the sequence of rescaled hitting times µ([x^n]) τ_{[x^n]}, n ∈ N, converges in distribution to the exponential law with parameter 1, i.e. lim_{n→∞} F_{x,n}(t) = 1 − e^{−t} for every t > 0.
An immediate consequence of the integral equation introduced in [HLV05] is that the convergence in the theorem also holds for the distribution functions of return times F̃_{x,n}.
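As a numerical illustration of the statement of Theorem 1 (for a classical positive-entropy example, not for the process constructed in this paper), one can sample rescaled hitting times of cylinders in an i.i.d. fair-coin process and compare them with the exponential law. All names and parameters below (the function `hitting_time`, the chosen word, the sample sizes) are our own choices for this sketch:

```python
import random

def hitting_time(seq, word):
    """tau_[word](seq) = min{k >= 1 : seq[k:k+|word|] = word}, or None if absent."""
    n = len(word)
    for k in range(1, len(seq) - n + 1):
        if seq[k:k + n] == word:
            return k
    return None

# Monte Carlo check for an i.i.d. fair-coin process (a positive-entropy case
# where the exponential law is classical): the rescaled hitting times
# mu([x^n]) * tau_{[x^n]} should be approximately Exp(1)-distributed.
random.seed(1)
n = 6
mu = 2.0 ** (-n)                  # measure of every n-cylinder
word = [1, 1, 0, 1, 0, 0]         # a word with no self-overlap
samples = []
for _ in range(2000):
    seq = [random.randint(0, 1) for _ in range(1500)]
    t = hitting_time(seq, word)
    if t is not None:
        samples.append(mu * t)

mean = sum(samples) / len(samples)
frac = sum(1 for s in samples if s <= 1.0) / len(samples)
# mean should be near 1, frac near 1 - exp(-1)
```

With these parameters the empirical mean is close to 1 and the empirical frequency of {µ([x^n]) τ ≤ 1} is close to 1 − e^{−1} ≈ 0.632, in line with the exponential limit law.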
In the rest of the section we introduce some necessary notation. For n ∈ N, a word (or block) of length n over the alphabet A is any finite sequence u = u_0 … u_{n−1} of elements of A. The set of all words of length n is denoted by A^n; the length of u is denoted by |u|. The length of an infinite sequence v ∈ A^N is formally defined to be +∞. The set of all words of all lengths is denoted by A^*, i.e. A^* = ⋃_{n∈N} A^n. The concatenation of two words u, v ∈ A^*, denoted simply by uv, is a word from A^{|u|+|v|}. The language of u, denoted by L(u), is the subset of A^* consisting of the words u[m, n) = u_m … u_{n−1}, where m ≤ n ≤ |u|. These words are called subwords of u. The subword u[0, n) will be denoted more shortly by u^n. For a set S ⊆ A^* ∪ A^N, the language of S is defined as the union of the sets L(u), u ∈ S.
For a word u ∈ A^*, the cylinder [u] is the set of all x ∈ A^N with x^{|u|} = u. This definition is an analog of the definition of the cylinder [x^n] in the previous section. We will also deal with some measurable partitions of A^N. For I ⊆ N, we define the partition P(I) as follows: two elements x, y ∈ A^N belong to the same set of P(I) if and only if x_i = y_i for every i ∈ I. For m, n ∈ N, we denote P(m, n) = P([m, n)) and P(n) = P([0, n)), where [m, n) and [0, n) are left-closed right-open intervals of integers. For every n ∈ N, the partition P(n) consists of the cylinders [u], u ∈ A^n. From now on, we reserve the symbol A for the three-element alphabet {0, 1, 2}.
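The notions of subword and language translate directly into code; a minimal sketch (the function name `language` is ours, not the paper's notation, and strings stand for words):

```python
def language(u):
    """L(u): all nonempty subwords u[m:n] with 0 <= m < n <= |u|."""
    return {u[m:n] for m in range(len(u)) for n in range(m + 1, len(u) + 1)}

# The prefix u^n is simply u[:n].
u = "0120"
assert language(u) >= {"0", "1", "2", "01", "12", "20", "0120"}
assert u[:2] == "01"            # u^2, i.e. the subword u[0, 2)
```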

Definition of the process
In this section we will define a stationary process by averaging a non-stationary one. Both processes will have the alphabet A = {0, 1, 2}.

This sequence looks as follows
The proof of the following simple fact, which states some properties of the sequence ω, is left to the reader.
The non-overlapping property will ensure that the return times of cylinders in the process introduced below are not too small. Let (b_n)_{n=1}^∞ be a sequence of powers of 2, increasing fast enough to satisfy the following condition. This condition is not needed until Section 6, where one can find the last steps of the proof of our main result, see Lemma 11. We denote by ′ the permutation on A = {0, 1, 2} which swaps 1 and 2. This permutation is called the negation. We extend the mapping to words over A and to sequences from A^N. This extension is defined letter by letter, i.e. it commutes with concatenation and with T. We remark that the symbol 0 has a special role in this mapping: it is a fixed point.
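The negation and its letterwise extension can be sketched as follows (a toy transcription, with tuples standing for words over A):

```python
# The negation ' on A = {0, 1, 2}: swaps 1 and 2, fixes 0, extended letterwise.
NEG = {0: 0, 1: 2, 2: 1}

def negate(word):
    """Letterwise negation; it commutes with concatenation and with the shift."""
    return tuple(NEG[a] for a in word)

assert negate((0, 1, 2, 0)) == (0, 2, 1, 0)
assert negate(negate((1, 0, 2))) == (1, 0, 2)   # the negation is an involution
```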
In addition, we inductively define a sequence of numbers (a_k)_{k=1}^∞ and a sequence of families of words A_k ⊆ A^{a_k}, k ≥ 0. Put a_0 = 1 and A_0 = {1, 2}. For k ≥ 0, a_{k+1} = 2 b_{k+1} a_k + 2 b_{k+1} − 1, and A_{k+1} consists of all words of the following form. We define a measure ν on B by defining its values on the generators of B: for k ∈ N and u ∈ A^{a_k}, ν([u]) = 1/#A_k if u ∈ A_k, and ν([u]) = 0 otherwise. Since every word from A_k has the same number of prolongations in A_{k+1}, the function ν is additive on cylinders and the definition is consistent. The support of the measure (we consider the standard topology on A^N, generated by all cylinders [u], u ∈ A^*) equals the set {x ∈ A^N : x^{a_k} ∈ A_k for every k ∈ N}. The language of the support is denoted by L, i.e. L = L(supp ν) = L(⋃_{k∈N} A_k).
Lemma 1. For k ∈ N, the negation is a permutation of A_k. For l ≥ 1, every word u ∈ A_{k+l} can be written in the following form. In particular, the support of ν is invariant under the negation, and for every x ∈ supp ν and k ∈ N, the corresponding decomposition holds.
Proof. The negation is injective. By the inductive construction of the families A_k, k ∈ N, they are invariant under the negation. Hence the negation is a permutation of A_k. This implies that the second part of the lemma holds for l = 1 and every k ∈ N. Now assume the second part of the lemma holds for some l ≥ 1 and every k ∈ N. Take u ∈ A_{k+l+1}. By the inductive assumption, u decomposes over A_{k+l}. Using the inductive assumption again, there exists a sequence of words u(j) ∈ A_k such that for every i ≤ p(l, l + 1) and 1 ≤ j ≤ p(k, l) the following equality holds. Since p(k, l) is a power of 2, (3.1), (3.2) and the above yield s_{(i−1)p(k,l)+j} = ω_{(i−1)p(k,l)+j}. The equality p(k, l + 1) = p(k, l) p(l, l + 1) concludes the proof.
The lemma tells us at which positions we can expect occurrences of words from A_k. This can be reformulated in terms of the coefficients introduced below. By the definition of a_k and by equations (3.1)-(3.3), we get the following arithmetic properties of these coefficients. These properties will be used often throughout the text.
The following fact can be easily proved by looking at the construction of the process.

Extension of a rank-one system
To better understand the measure defined in the preceding section, we consider the continuous projection π : A^N → {0, 1}^N that works letter by letter and is defined on symbols by π(0) = 0 and π(1) = π(2) = 1. By Lemma 1, for every k ≥ 1, the π-image of a word u ∈ A_k does not depend on u. Thus, all words from A_k have the same π-image w(k) = 1 0^{ω_1} 1 0^{ω_2} … 1 0^{ω_{p(0,k)}}. Denote w(0) = 1. By the inductive definition of the families A_k, k ∈ N, we get a recursive expression for w(k + 1). Since w(k + 1) is a prolongation of w(k), the limit w = lim_{k→∞} w(k) exists. The point w is generic for a rank-one (non-atomic) measure on {0, 1}^N (see the symbolic definition of a rank-one system in [Fer97]). This measure is positive on every cylinder [u], u ∈ L(w). By standard Chacon arguments, see [Cha69], one can show that the rank-one system is weakly mixing and 1/2-rigid. Since ν(π^{−1}{w}) = 1, we can reformulate the facts about the rank-one measure in the following way.
Fact 3. Let B ∈ B be π-measurable. Then for every n ∈ N, T^n ν(B) ∈ {0, 1}. Moreover, the Cesàro averages (1/n) Σ_{i=0}^{n−1} T^i ν weakly converge to a probability measure on the σ-field of all π-measurable sets. The limit measure is invariant and weakly mixing with respect to T.
For every u ∈ L, the limit measure of the set π^{−1}π([u]) is positive.
We will show that the Cesàro averages weakly converge on the whole σ-field B. The following lemma plays an important role. Its proof is given in the technical Section 4.
Proof. Let u ∈ L. Then for every n ∈ N, T^n ν([u]) = θ(u) · T^n ν(π^{−1}π([u])), where θ(u) is the number from Lemma 2. Since π^{−1}(π([u])) is an open-closed π-measurable set, the Cesàro averages on the right-hand side converge. Thus, the left-hand side also converges. If u ∉ L, then T^i ν([u]) = 0 for every i ∈ N. Thus, the Cesàro averages converge trivially.
The limit measure from Lemma 3 will be denoted by µ. Fact 3 and Lemma 2 imply that supp µ = supp ν.
Proposition 2. The system (A^N, B, µ, T) is weakly mixing.
The proof of this proposition is presented at the end of the Appendix.
Lemma 4. The entropy of the system (A^N, B, T, µ) equals zero.
Proof. It suffices to estimate from above the number of words from the language L of length n, for an infinite sequence of natural numbers n. Every word v ∈ L of length a_k has to be a subword of u 0^i ũ for some u, ũ ∈ A_k and i ≤ a_k. Hence, the number of words from L of length a_k is bounded by 2 a_k^2 · (#A_k)^2, and the entropy of the system can be estimated from this bound.
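The counting argument yields the entropy bound explicitly; a sketch of the final estimate (the growth condition log #A_k = o(a_k) is our assumption, consistent with the intended smallness of the families A_k relative to the word lengths):

```latex
h(\mu) \;\le\; \liminf_{k\to\infty} \frac{1}{a_k}\,\log\#\{v \in L : |v| = a_k\}
\;\le\; \liminf_{k\to\infty} \frac{\log 2 + 2\log a_k + 2\log \#A_k}{a_k} \;=\; 0,
```

since (log a_k)/a_k → 0 and, under the stated assumption, (log #A_k)/a_k → 0 as well.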

Language analysis
For this section, let u be a word from L which is not a block of zeros. Then there exist natural numbers q, m_0, m_1, …, m_q and symbols u(1), …, u(q) ∈ {1, 2} such that u = 0^{m_0} u(1) 0^{m_1} … u(q) 0^{m_q}. The numbers and the symbols are unique, and q ≥ 1. For the rest of the section, we define ω_0 to be 0, and we denote by Ξ(u) the set of candidate positions of π(u) in w, defined as follows. Since π(u) = 0^{m_0} 1 0^{m_1} 1 0^{m_2} … 1 0^{m_q} and w = 0^{ω_0} 1 0^{ω_1} 1 0^{ω_2} …, we get the following lemma.
Lemma 5. The image π(u) appears in the sequence w only at positions from the set Ξ(u).
Lemma 6. If m_0 = 0, then there exist g_0, g ∈ N such that g is a power of 2, g > g_0 and Ω(u) = {g_0 + n g : n ∈ N}.
Proof. By the definition of the sequence ω, the sets Ω_j, j = 1, 2, …, q, are described as follows. Since m_0 = 0, the condition ω_i ≥ m_0 holds for every i ∈ N, and Ω(u) is equal to the intersection of the sets Ω_j, 1 ≤ j ≤ q. Each of these sets has constant gaps between consecutive elements, which are bigger than the value of its smallest element. Their intersection is either empty (which does not happen for u ∈ L) or has the same property. The gaps in the intersection are also constant and equal to the least common multiple of the gaps of the individual sets. Since the gaps in every set Ω_j, 1 ≤ j ≤ q, are powers of 2, the gap g of the intersection is equal to the maximum of the gaps 2^{m_q} and 2^{m_j + 1}, j = 1, …, q − 1. Hence, there exists g_0 < g such that Ω(u) = {g_0 + n g : n ∈ N}.
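The gap argument in the proof rests on an elementary fact about arithmetic progressions with power-of-2 gaps: their least common multiple is simply the maximum. A small sketch (the progressions below are arbitrary toy data, not the sets Ω_j of the paper):

```python
# Arithmetic progressions whose gaps are powers of 2 intersect (when nonempty)
# in a progression whose gap is the maximum of the individual gaps (= their lcm).
def progression(start, gap, limit):
    return set(range(start, limit, gap))

A = progression(3, 4, 200)    # gap 2^2
B = progression(7, 8, 200)    # gap 2^3
C = sorted(A & B)             # intersection: 7, 15, 23, ...
gaps = {b - a for a, b in zip(C, C[1:])}
assert gaps == {8}            # lcm(4, 8) = max(4, 8) = 8
```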
The number g from the previous lemma will be called the order of u.
There are three cases: u is a block of zeros; u begins with a non-zero letter; or u contains a non-zero letter, but not at the very beginning.
Suppose that u is a block of zeros; then [u] is π-measurable and the lemma holds with θ(u) = 1.
Let u begin with 0 but not be a block of zeros. Then u = 0^{m_0} u(1) 0^{m_1} … u(q) 0^{m_q} for some natural numbers q, m_0, m_1, …, m_q, where q ≥ 1 and u(i) ∈ {1, 2} for i ≤ q. Denote v = u(1) 0^{m_1} … u(q) 0^{m_q}. The set [0^{m_0}] is π-measurable and has T^n ν-measure 0 or 1 for every n ∈ N; thus T^n ν([u]) = T^n ν([0^{m_0}]) · T^{n+m_0} ν([v]). We have proved in the paragraph above that T^n ν([v]) has the required form. Finally, we prove that the constant θ(u) is positive. Since u ∈ L, there exist n ∈ N and x ∈ supp ν such that x[n, n + |u|) = u. This implies that T^n ν([u]) > 0; adding the fact that θ(u) equals T^n ν([u]) finishes the proof.

Closeness of return and hitting times
Fix u ∈ L which does not begin with zero. Let g be the order of u. For t ∈ N we define the sets V(1, t) as follows. The aim of this section is to prove that the measures µ_{[u]}(V(1, t)) and µ(V(1, t)) are close.
The following lemma is a corollary of the fact that for every measurable set A ⊆ X, the total variation distance between µ and µ_{X\A} is equal to µ(A).
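This total variation fact can be checked directly on a finite toy space (the probability vector and the event below are arbitrary choices for illustration):

```python
# For a probability vector mu and an event A, the total variation distance
# between mu and mu conditioned on the complement of A equals mu(A).
from fractions import Fraction as F

mu = [F(1, 2), F(1, 4), F(1, 8), F(1, 8)]      # a toy probability vector
A = {1, 3}                                      # an arbitrary event
muA = sum(mu[i] for i in A)
cond = [F(0) if i in A else mu[i] / (1 - muA) for i in range(len(mu))]
tv = F(1, 2) * sum(abs(mu[i] - cond[i]) for i in range(len(mu)))
assert tv == muA
```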
Let I = ⋃_{n∈N} I_n. We get the estimate as follows. Fix n ∈ N. By Corollary 3, ξ(n) belongs to I_n. Suppose j ∈ I_n. The symmetric difference of the intervals [j + 1, j + t] and [ξ(n) + 1, ξ(n) + t] consists of two intervals J_1 and J_2 (possibly empty) whose lengths satisfy the bound below. Since |I_n| < z_0(g) and the gaps in Ξ(u) are at least z_0(g) (Corollary 3), each of the sets Ξ(u) ∩ J_1 and Ξ(u) ∩ J_2 consists of at most one element. Hence the desired estimate for V(1, t) follows.
The set J ∩ K_1 \ K′_2 can be written as follows. In particular, the third condition holds. It remains to prove the fourth condition. Let i ∈ N be such that i g′ = b_{k+1}. If i ≥ l, then J ∩ K_1 ∩ K′_2 is empty and the fourth condition holds. Assume i ≤ l − 1. Then the description above applies. Therefore, the first part of the fourth condition is true. Now assume that the second part does not hold, i.e.

Exponential limit distributions for return and hitting times
Denote by X′′ the set introduced below. For x ∈ X′′, the words x^n, n ≥ 1, belong to L and begin with a non-zero letter. We denote the order of the word x^n by g(x, n) (for the definition, see Lemma 6). The maximal integer k such that p(0, k) ≤ g(x, n) is denoted by k(x, n). Since |x^n| ≤ z_0(g(x, n)), we get that g(x, n) and k(x, n) tend to infinity as n increases to infinity.
Proof. Let k ∈ N. By the construction of the sequence w, w[0, a_{k+m}) = w(k) 0^{ω_1} w(k) 0^{ω_2} … w(k) 0^{ω_{p(k,k+m)}} for every m ∈ N. The number of occurrences of w(k) in the word above is p(k, k + m). Thus, by ergodicity, the frequency estimate below holds. Let x ∈ X′′ and n ∈ N. We denote k = k(x, n), u = x^n, and let ξ(i) relate to the block u. By Corollary 3 and equations (3.6), ξ(0) + |u| ≤ z_0(g(x, n)) ≤ z_0(p(0, k + 1)) = a_{k+1}. Thus the bound below holds. By equation (3.4), the right-hand side tends to infinity as n goes to infinity, and so does the left-hand side.
Lemma 12. Let x ∈ X′. For every k ∈ N, there exists u ∈ A_k which is a subword of x.
Lemma 13. For all n ∈ N and each u ∈ A_n, µ_{π^{−1}π([u])}([u]) = 1/#A_n. For all blocks u, v ∈ L(supp µ) such that v is a subword of u, the following holds.
Proof. Fix n ∈ N and u ∈ A_n. By Lemma 2 and by the fact that u ∈ A_n, we get θ(u) = ν([u]) = 1/#A_n. Now, let u be an arbitrary block from the language L(supp µ) and let v be its non-empty subword. Take the cylinders defined by the words u and v.
Lemma 15. Let x ∈ X′. There exist N, n_0 ∈ N such that x^N is a block of zeros and for every n ≥ n_0, the sets [x^{n+N}] and [x[N, n + N)] are equal modulo µ.
The word v appears in w exactly at the positions from the set Ξ(v) = {z_0(g n) : n ∈ N}. We get that j + N = n g for some n ≥ 1. But ω_{ng} ≥ n, and the inequality (6.1) is proved. Now let n ≥ n_0. Similarly, one can prove that the following equalities hold modulo T^j ν, for j ∈ N.
Corollary 7. For x ∈ X′ and t > 0, F̃_{x,n}(t) and F_{x,n}(t) tend to 1 − e^{−t} as n tends to infinity.
Proof. Let x ∈ X′ and let N, n_0 ∈ N be the numbers from the previous lemma. Denote y = T^N x. Then y ∈ X′′ and, for every n ≥ n_0, [x^{N+n}] equals [y^n] modulo µ. Thus, for every n ≥ n_0, the functions F_{x,N+n} and F_{y,n} coincide. By Corollary 6, for every t > 0, F_{y,n}(t) tends to 1 − e^{−t}, and so does the sequence F_{x,n}(t).

Appendix. Dependency structure on coordinates
The mutual dependences among the partitions T^{−j} P, j ∈ N, with respect to the measure ν, can be described through a non-reflexive, symmetric and transitive relation on the set of coordinates N, which is an equivalence relation on Q = {j ∈ N : w_j = 1} = {z_0(i) : i ∈ N}.
First, we define the direct dependency relation D, which is symmetric. We say that coordinates i, j ∈ N are directly dependent ((i, j) ∈ D) if i, j ∈ Q and there exist k, m ∈ N such that i, j ∈ [z_k(m), z_k(m) + a_k) and |i − j| = (a_k − 1)/2. The dependency relation D̄ on N is the transitive envelope of D. We remark that this relation is a subset of Q^2 and it is an equivalence on Q. For I ⊆ N, we denote D̄(I) = ⋃_{i∈I} D̄(i).
Proof. The first part of the lemma follows from the definitions. Since the measure ν is invariant under the negation, ν(T^{−i}[1]) = ν(T^{−i}[2]) = 1/2 for every i ∈ Q. Let (i, j) ∈ D. In order to prove that T^{−i}[a] equals T^{−j}[a′] modulo ν, a = 1, 2, we prove that for every x ∈ supp ν, x′_i = x_j. Let k, m ∈ N be such that i, j ∈ [z_k(m), z_k(m) + a_k) and |i − j| = (a_k − 1)/2. By Fact 2, this holds. It also implies that T^{−i} P equals T^{−j} P modulo ν. We consider the relation on N defined as follows: (i, j) ∈ N^2 is in the relation if T^{−i} P equals T^{−j} P modulo ν. Since this relation is transitive and contains D, it contains D̄ too. Thus, for every (i, j) ∈ D̄, T^{−i} P equals T^{−j} P modulo ν. This implies that P(D̄(i)) equals T^{−i} P (mod ν).
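The passage from the direct dependency relation to its transitive envelope is exactly the computation of connected components, which a union-find structure performs; a sketch on toy pairs (the pairs below are arbitrary, not derived from the paper's sequence ω):

```python
# The transitive envelope of a symmetric relation partitions the coordinates
# into dependency classes; union-find computes these classes incrementally.
class UnionFind:
    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, x, y):
        self.parent[self.find(x)] = self.find(y)

uf = UnionFind()
for i, j in [(1, 4), (4, 9), (2, 7)]:    # toy "directly dependent" pairs
    uf.union(i, j)
assert uf.find(1) == uf.find(9)           # (1, 9) lies in the transitive envelope
assert uf.find(2) != uf.find(1)           # 2 and 1 are in different classes
```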
We denote the following sets, for k ∈ N. We define the mapping φ : Γ → Q as follows. The mapping is well defined, which follows from two facts: the first is z_i(0) = 0 for every i ∈ N, and the second is the equality that can be deduced from equations (3.5) and (3.6). By the same arguments, the mapping φ is a bijection. Moreover, for every k ∈ N, φ maps Γ(k) bijectively onto Q ∩ [0, a_k). The following lemma is an easy observation.
This lemma has the following corollary.
Combining this corollary with Lemma 17 gives the following technical lemmas.
Lemma 22. The partitions P(E), where E runs over all equivalence classes of D̄, are mutually ν-independent.
Proof. Let k ∈ N. The elements of the partition P(a_k) are the cylinders given by the words from A_k. By the definition of ν, all elements of the partition P(a_k) carrying positive measure have the same measure. Thus, we can easily calculate the entropy of the partition. On the other hand, the sets D̄(j), j ∈ φ(Γ′(k)), cover Q ∩ [0, a_k). Hence, the join of the partitions P(D̄(j)), j ∈ φ(Γ′(k)), is finer than P(a_k). In addition, by Lemma 16, H(P(D̄(j)), ν) = 1 for every j ∈ φ(Γ′(k)). Hence, the chain of inequalities below holds. Since the first term of the inequality equals the last one, all the terms are equal. In particular, the partitions P(D̄(j)), j ∈ φ(Γ′(k)), are mutually ν-independent, and this is true for every k ∈ N. Since D̄(j), j ∈ ⋃_{k∈N} φ(Γ′(k)), are all the classes of the equivalence D̄, the lemma holds.
Lemma 23. For every n, m ∈ N, the measure T^{z_0(m 2^n)} ν is equal to ν on P(z_0(2^n)).
A pair (i, j) belongs to E if and only if P[i, i + |u|) and P[j, j + |v|) are ν-dependent. In particular, the inclusion below holds for (i, j) ∈ E. In addition, we denote E_{i,n} = {j ≤ n : (i, j) ∈ E} and E′_{j,m} = {i ≤ m : (i, j) ∈ E}.
By Lemma 21, the upper Banach density of D̄(i) is bounded by 2^k / a_k for every k ∈ N. Hence D̄(i) has Banach density equal to zero. Consequently, the sequence #E_{i,n}/n converges to 0 uniformly in i as n goes to infinity. Therefore, there is n_1 ≥ n_0 such that for every n ≥ n_1 and every i ∈ N, the density #E_{i,n}/n is less than ε².
Let n ≥ n_1. For every m ∈ N, the estimate below holds. From now on, we will use the notation a = b + o(ε) to mean |a − b| ≤ ε. Take m ∈ N such that the condition below holds for every j ≤ n. If j ∈ [1, n] \ M_m and i ∈ [1, m] \ E′_{j,m}, then the corresponding events are ν-independent. Hence, for j ∈ [1, n] \ M_m, the expression above is bounded by 4ε + #M_m/n < 5ε.