Young Towers for Product Systems

We show that the direct product of maps with Young towers admits a Young tower whose return times decay at a rate which is bounded above by the slowest of the rates of decay of the return times of the component maps. An application of this result, together with other results in the literature, yields various statistical properties for the direct product of various classes of systems, including Lorenz-like maps, multimodal maps, piecewise $ C^2 $ interval maps with critical points and singularities, H\'enon maps and partially hyperbolic systems.


INTRODUCTION AND STATEMENT OF RESULTS
Let $M = M_1 \times \dots \times M_p$ denote the product of a finite number of Riemannian manifolds and let $f = f_1 \times \dots \times f_p : M \to M$ be a product map on $M$, where each $f_i : M_i \to M_i$, $i = 1, \dots, p$, is a map whose properties will be specified in the various cases below. In all cases, each $f_i$ will admit, either by assumption or by construction, an ergodic invariant probability measure $\mu_i$ which is absolutely continuous with respect to the Riemannian volume on $M_i$ or is an SRB measure, i.e. has absolutely continuous conditional measures on unstable manifolds [48]. We then let $\mu = \mu_1 \times \dots \times \mu_p$ denote the product measure, which is also an ergodic invariant probability measure for $f$ and either is absolutely continuous with respect to the Riemannian volume on $M$ or satisfies the SRB property.
Our purpose is to study the statistical properties of the system $(f, \mu)$. As we will explain in detail below, this problem is generally quite difficult, as the geometric and dynamical properties of $(f, \mu)$ can be extremely complicated and even knowledge about statistical properties for the individual components $(f_i, \mu_i)$ does not always lead to information about the statistical properties for $(f, \mu)$, at least not by any elementary argument. We will show that the problem can be effectively solved if the individual components admit some geometrical structures. In the next two subsections we give some general standard definitions about product systems and about statistical properties. We then give the precise statement of our results.
1.1. Product systems. We start with some relatively elementary observations about product systems, which show that the dynamics of a product does not reduce in a trivial way to the "product of the dynamics". We discuss here the case of the product of two maps; the case of an arbitrary finite number of maps generalizes in an obvious way. Given two sets $X, Y$ and two maps $f : X \to X$ and $g : Y \to Y$ we define the product $X \times Y = \{(x, y) : x \in X, y \in Y\}$ and the corresponding product map $f \times g : X \times Y \to X \times Y$ by $(f \times g)(x, y) = (f(x), g(y))$. It is tempting to think that the dynamical behaviour of a product map $f \times g$ can be trivially described in terms of the dynamics of the two original maps $f, g$, but this is in fact not at all the case, not even in the simplest setting in which $f = g$. For example, while it is clear that the product $(p, q)$ of periodic points $p$ and $q$ is itself a periodic point, the set-theoretic product of the periodic orbits $O^+(p) = \{p_1, \dots, p_m\}$ and $O^+(q) = \{q_1, \dots, q_n\}$ may consist of several periodic orbits$^1$, which are in some sense new periodic orbits which do not exist in either of the original systems.
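The splitting of a product of periodic orbits can be checked by direct enumeration. The following is a minimal sketch, assuming the two periodic orbits are modelled by index arithmetic modulo $m$ and $n$; the function name `product_orbits` is hypothetical and chosen only for this illustration. It counts the orbits of the product map and their common period, which are $\gcd(m, n)$ and $mn/\gcd(m, n)$ respectively.

```python
from math import gcd


def product_orbits(m, n):
    """Orbits of the product of an m-cycle and an n-cycle.

    Periodic points are labelled by indices, so the product map acts on
    pairs as (i, j) -> (i + 1 mod m, j + 1 mod n).
    """
    seen = set()
    orbits = []
    for i in range(m):
        for j in range(n):
            point = (i, j)
            orbit = []
            # Follow the orbit until it closes up (the map is a bijection).
            while point not in seen:
                seen.add(point)
                orbit.append(point)
                point = ((point[0] + 1) % m, (point[1] + 1) % n)
            if orbit:
                orbits.append(orbit)
    return orbits


if __name__ == "__main__":
    for m, n in [(2, 3), (2, 4), (4, 6)]:
        orbits = product_orbits(m, n)
        print(f"m={m}, n={n}: {len(orbits)} orbit(s) of period "
              f"{len(orbits[0])}, gcd={gcd(m, n)}")
```

For $m = 2$ and $n = 4$ the eight points of the product split into two orbits of period 4, one through $(p_1, q_1)$ and one through $(p_2, q_1)$.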
If $X, Y$ are metric spaces and $f : X \to X$ and $g : Y \to Y$ are Borel measurable maps, we let $\mathcal{M}_f, \mathcal{M}_g$ denote the spaces of all invariant Borel probability measures on $X$ and $Y$ for $f$ and $g$ respectively, and similarly denote by $\mathcal{M}_{f \times g}$ the space of all invariant Borel probability measures for $f \times g$. Then it is well known and easy to see that $\mathcal{M}_{f \times g}$ contains all product measures $\mu \times \nu$ where $\mu \in \mathcal{M}_f$, $\nu \in \mathcal{M}_g$ and $\mu \times \nu$ is the unique Borel probability measure on $X \times Y$ such that $(\mu \times \nu)(A \times B) = \mu(A)\nu(B)$ for every $A \in \mathcal{A}$, $B \in \mathcal{B}$, where $\mathcal{A}$ and $\mathcal{B}$ denote the Borel $\sigma$-algebras of $X$ and $Y$, see e.g. [11]. However $\mathcal{M}_{f \times g}$ generally contains many measures which are not product measures. Simple examples are measures supported on the "new" periodic orbits mentioned above$^2$ or, in the simple setting where we consider the product $f \times f$ of a map with itself, it is easy to see that the "diagonal" $\{(x, x) : x \in X\} \subset X \times X$ is invariant and therefore supports lots of invariant measures which are not product measures$^3$.

Finally, we remark that if $X, Y$ are Riemannian manifolds then the dynamical properties of maps $f : X \to X$, $g : Y \to Y$ are often deduced from certain geometrical structures such as the existence of particular types of foliations or, if the maps have discontinuities, from geometrical properties of the discontinuity sets. The product $X \times Y$ can also be endowed with a natural Riemannian product structure, but in general the geometrical structures associated to the product dynamics $f \times g : X \times Y \to X \times Y$ may be quite different and may not allow the required analyses to be carried out, making it difficult to establish properties of the product analogous to those which may be known for the single maps.

$^1$ In fact, if $\gcd\{m, n\} = 1$, then $(p, q)$ is a periodic point of period $mn$. On the other hand, if $\gcd\{m, n\} = k > 1$, then $(p, q)$ is a periodic point with period $mn/k$, but $O^+(p) \times O^+(q)$ is a union of $k$ periodic orbits. As a simple example, consider the case $m = 2$ and $n = 4$. In this case, the product of the orbits splits into two orbits for the product map: $O^+(p) \times O^+(q) = O^+(p_1, q_1) \cup O^+(p_2, q_1)$.
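The existence of invariant measures that are not product measures can be verified by hand in a toy model. The sketch below is an illustration under assumptions of our own choosing: the phase space is the three-point set $\{0, 1, 2\}$, the map is a cyclic permutation, and the helper names `pushforward` and `marginal` are hypothetical. It uses the fact that if a measure on $X \times X$ is a product measure, then it must be the product of its own marginals.

```python
from collections import defaultdict
from fractions import Fraction


def pushforward(measure, transform):
    """Image of a finitely supported measure (dict point -> mass) under a map."""
    image = defaultdict(Fraction)
    for point, mass in measure.items():
        image[transform(point)] += mass
    return dict(image)


def marginal(measure, coordinate):
    """Marginal of a measure on pairs onto the given coordinate (0 or 1)."""
    result = defaultdict(Fraction)
    for point, mass in measure.items():
        result[point[coordinate]] += mass
    return dict(result)


N = 3  # hypothetical toy phase space X = {0, 1, 2}


def f(x):
    return (x + 1) % N  # a cyclic permutation of X


def f_times_f(point):
    return (f(point[0]), f(point[1]))


# Uniform measure on the diagonal {(x, x)}.
diagonal = {(x, x): Fraction(1, N) for x in range(N)}

# Product of the two marginals of the diagonal measure.
mu, nu = marginal(diagonal, 0), marginal(diagonal, 1)
product = {(x, y): mu[x] * nu[y] for x in mu for y in nu}

if __name__ == "__main__":
    print("diagonal invariant:", pushforward(diagonal, f_times_f) == diagonal)
    print("product invariant: ", pushforward(product, f_times_f) == product)
    print("diagonal is a product measure:", diagonal == product)
```

Both measures are invariant under $f \times f$, but the diagonal measure charges only three of the nine points and so differs from the product of its marginals.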
1.2. Decay of correlations for product measures. The purpose of this paper is to investigate the statistical properties of product systems. It turns out that for the decay of correlations there is a relatively elementary calculation which expresses the correlation function of the product in terms of the correlation functions of the components. We recall the precise definitions.

Definition 1.1. Let $\mathcal{B}_1, \mathcal{B}_2$ be Banach spaces of measurable observables defined on $M$. We denote the correlation of non-zero observables $\varphi \in \mathcal{B}_1$ and $\psi \in \mathcal{B}_2$ with respect to $\mu$ by
\[ \mathrm{Cor}_\mu(\varphi, \psi \circ f^n) := \frac{1}{\|\varphi\|_{\mathcal{B}_1} \|\psi\|_{\mathcal{B}_2}} \left| \int \varphi \, (\psi \circ f^n) \, d\mu - \int \varphi \, d\mu \int \psi \, d\mu \right|. \]
We say that $(f, \mu)$ has decay of correlations at rate $\{\gamma_n\}$ with respect to $\mu$ for observables in $\mathcal{B}_1$ against observables in $\mathcal{B}_2$ if there exists a constant $C > 0$ such that for any non-zero $\varphi \in \mathcal{B}_1$, $\psi \in \mathcal{B}_2$ the inequality
\[ \mathrm{Cor}_\mu(\varphi, \psi \circ f^n) \le C \gamma_n \]
holds for all $n$.

Rates of decay of correlations have been extensively studied and are well known for many classes of systems and various families of observables; indeed the literature is much too vast to give complete citations. We just mention that the first results on rates of decay of correlations go back at least to [12,26,42,43] in the 70's and since then results have been obtained in [4,8,12,13,14,17,21,22,25,24,28,29,33,38,40,46,47,49,50] amongst others.

$^2$ As above let $p$ and $q$ be periodic points of period $m$ and $n$ respectively for $f$ and $g$. Then the measures $\delta_{O^+(p)} = \frac{1}{m} \sum_{i=1}^{m} \delta_{p_i}$ and $\delta_{O^+(q)} = \frac{1}{n} \sum_{j=1}^{n} \delta_{q_j}$ are preserved by $f$ and $g$ respectively. In the case $\gcd\{m, n\} = k > 1$, the measure defined as $\delta_{O^+(p,q)} = \frac{k}{mn} \sum_{(x,y) \in O^+(p,q)} \delta_{(x,y)}$ is preserved by $f \times g$, but it is not the product of the measures $\delta_{O^+(p)}$ and $\delta_{O^+(q)}$.

$^3$ Thanks to M. Blank for this interesting observation.
The following result states that the rate of decay of correlations for a product system is bounded simply by the slowest of the rates for the individual components. We are grateful to Carlangelo Liverani for showing us a simple proof of the following statement.

Theorem 1.2. Let $f_i : M_i \to M_i$, $i = 1, \dots, p$, be a family of maps defined on compact metric spaces $M_i$, preserving Borel probability measures $\mu_i$, and let $f := f_1 \times \dots \times f_p : M \to M$ be the direct product on $M := M_1 \times \dots \times M_p$, and $\mu := \mu_1 \times \dots \times \mu_p$ be the product measure. We let $\mathcal{A}_i, \mathcal{B}_i$ denote Banach spaces of functions on $M_i$ and $\mathcal{A}, \mathcal{B}$ denote Banach spaces of functions on $M$. We assume these families satisfy the following properties: There exists $C > 0$ such that for any $\varphi \in \mathcal{A}$, $\psi \in \mathcal{B}$ and any

Suppose that $\mathrm{Cor}_{\mu_i}(\varphi_i, \psi_i \circ f_i^n) \le C \gamma_n$ for all non-zero $\varphi_i \in \mathcal{A}_i$ and $\psi_i \in \mathcal{B}_i$ and for all $i = 1, \dots, p$. Then there exists a constant $\bar{C} > 0$ such that for all non-zero $\varphi \in \mathcal{A}$, $\psi \in \mathcal{B}$, and for all $n \ge 0$ we have $\mathrm{Cor}_\mu(\varphi, \psi \circ f^n) \le \bar{C} \gamma_n$.

Our main application will be in the case where $\mathcal{A}_i$ and $\mathcal{A}$ are either all spaces of Hölder continuous functions or all spaces of essentially bounded functions, and the same for $\mathcal{B}_i$ and $\mathcal{B}$. It is easy to verify that in these cases conditions (1)-(3) hold.
1.3. Young Towers. Unfortunately it does not seem possible to find such completely general relationships between the product system and its components in the cases of other kinds of statistical properties such as large deviations, almost sure invariance principles, local limit theorems. Moreover these kinds of statistical properties are generally deduced from differentiable and geometrical properties of a dynamical system which are not necessarily, or at least not trivially, preserved for the product system. In particular, several recent and deep results show that a lot of information can be obtained from the existence of a non-trivial geometric structure which we call a Young Tower (we will give the precise definitions below), see for example the pioneering paper [49] and subsequent papers [5,16,21,34,35,36,50] where it is shown that many statistical properties can be obtained simply from information on the return times of Young towers. It is possible to define the direct product of Young Towers but this direct product is not a Young Tower and does not provide any information which can be used to deduce dynamical and statistical information. The main purpose of this paper is to construct a Young Tower for the product system f : M → M under the assumption that each individual component admits such a tower. Moreover we obtain some estimates for the decay of the return times of the tower for the product in terms of the rates of decay of the return times for the individual towers.
As an immediate application of our result we deduce below non-trivial statistical properties for many systems which have a direct product structure and for which no direct strategy exists to obtain the same conclusions. We mention that the existence of a Young Tower is itself a non-trivial geometrical structure. Nevertheless its existence has been shown in a large number of classes of systems, see e.g. [3,5,10,13,16,18,19,21,23,27,34,39,49,50], for most of which information about the decay of return times also exists. Thus our results can be applied to deduce the existence of a tower and information about the decay of return times for direct products of any finite number of such systems.
We mention also that it was recently proved in [1] that Young Towers essentially always exist if a map has an absolutely continuous invariant probability measure with positive Lyapunov exponents. Thus our construction of Young Towers for direct products is in principle applicable in full generality. It is also shown in [2] that in some cases the rate of decay of the return time follows from the statistical properties of the system, thus showing a fairly intrinsic two-way connection between statistical properties and the geometric structure of Young Towers.
We now give the precise formal definition of a Young Tower following [5], which generalizes the definition of [49] (see Remark 2.5). We give two separate definitions for diffeomorphisms and endomorphisms. Let $f : M \to M$ be a diffeomorphism of a Riemannian manifold $M$. If $\gamma \subset M$ is a submanifold, then $m_\gamma$ denotes the restriction of the Riemannian volume to $\gamma$. Assume that $f$ satisfies the following conditions.
(A1) There exists $\Lambda \subset M$ with hyperbolic product structure, i.e. there are families of stable and unstable manifolds $\Gamma^s = \{\gamma^s\}$ and $\Gamma^u = \{\gamma^u\}$ such that $\Lambda = (\cup \gamma^s) \cap (\cup \gamma^u)$; $\dim \gamma^s + \dim \gamma^u = \dim M$; each $\gamma^s$ meets each $\gamma^u$ at a unique point; stable and unstable manifolds are transversal with angles bounded away from 0; $m_{\gamma^u}(\gamma^u \cap \Lambda) > 0$ for any $\gamma^u$.

Let $\Gamma^s$ and $\Gamma^u$ be the defining families of $\Lambda$. A subset $\Lambda_0 \subset \Lambda$ is called an $s$-subset if $\Lambda_0$ also has a hyperbolic structure and its defining families can be chosen as $\Gamma^u$ and $\Gamma^s_0 \subset \Gamma^s$. Similarly, we define $u$-subsets. For $x \in \Lambda$ let $\gamma^\sigma(x)$ denote the element of $\Gamma^\sigma$ containing $x$, where $\sigma = u, s$.

(A3) There exist constants $C \ge 1$ and $\beta \in (0, 1)$ such that
(1) $\mathrm{dist}(f^n(x), f^n(y)) \le C\beta^n$ for all $y \in \gamma^s(x)$ and $n \ge 0$;
(2) $\mathrm{dist}(f^n(x), f^n(y)) \le C\beta^{s(x,y)}$ for all $x, y \in \gamma^u$ and for any $0 \le n < R(x)$.

(A4) Regularity of the stable foliation: given $\gamma, \gamma' \in \Gamma^u$ define $\Theta$ : Then (a) $\Theta$ is absolutely continuous and (b) Let $u(x)$ denote the density in item (a). We assume there exist $C > 0$ and $\beta < 1$ such that

The geometric structure described in (A1) and (A2) allows us to define the corresponding Young Tower. More precisely, we let

Lebesgue measure $m$ on $T$ is defined as follows. Let $\mathcal{A}$ be the Borel $\sigma$-algebra on $\Lambda \subseteq M$, and let $m_\Lambda$ denote the restriction of Lebesgue measure to $\Lambda$. For any $\ell \ge 0$ and

Notice that it is not strictly necessary for $f$ to be a diffeomorphism. Such a structure can also be defined, for example, in the presence of some discontinuities as long as the stable and unstable manifolds exist and satisfy the required properties; this has been done for example for some classes of billiards.

Now we are ready to state the main result of the present paper. For simplicity we state it for the direct product of two maps; the general case follows. Given maps $f_i : M_i \to M_i$, $i = 1, 2$, with the towers $F_i : T_i \to T_i$, reference measures $m_i$ on the bases $\Lambda_i$ and return time functions $R_i$ respectively.
i , m u denote the conditional measure on the unstable leaves.
admits a towerF :T with the base Λ 1 ×Λ 2 . Moreover the return time function T : In Section 6 we will give the proof of Theorem 1.3, taking advantage of Theorem 1.4 below.

1.4. Gibbs-Markov-Young Towers. Now we give the formal definition of a Young Tower for non-invertible maps. To distinguish this case from the previous one, this structure is sometimes referred to as a Gibbs-Markov-Young (GMY) structure or GMY-tower. Let $f : M \to M$ be a $C^{1+}$ local diffeomorphism (outside some critical/singular set) of a Riemannian manifold $M$. We say $f$ admits a GMY-structure if there exists a ball $\Delta_0$, its mod 0 partition $\mathcal{P} = \{\Delta_{0,i}\}$ and a return time function $R : \Delta_0 \to \mathbb{N}$ that is constant on the partition elements, i.e. $R|_{\Delta_{0,i}} = R_i$, such that

In particular the separation time $s(x, y)$ is almost everywhere finite.

(G3) Bounded distortion: there exists $D > 0$ such that for any points

Our main theorem in this setting is the following. Let $F : (\Delta, m)$, $F' : (\Delta', m')$ be two expanding towers given by Gibbs-Markov induced maps. Consider the direct product $F \times F' : (\Delta \times \Delta', \bar{m})$ of the towers $F$ and $F'$, where $\bar{m} = m \times m'$. We denote by $\bar{m}_0$ the restriction of $\bar{m}$ onto $\bar{\Delta}_0$. We will show that $F \times F'$ admits a tower $\hat{F}$ with the base $\bar{\Delta}_0 = \Delta_0 \times \Delta_0'$. Letting $M_n = \max\{m\{R > n\}, m'\{R' > n\}\}$ we have the following

with the base $\bar{\Delta}_0$ and return time function

Remark 1.5. We remark here that Holland [23] also proves a general result on maps with very slow decay of correlations if the tail of the return time decays very slowly, e.g. at speeds $(\log \circ \dots \circ \log n)^{-1}$ as $n \to \infty$. Our construction and estimates do not allow us to obtain results for the decay of the tail of the product in these cases. It is not clear to us if this is just a technical issue in the proof or if the result might not generalize to these cases. Another remark is that if we consider finitely many towers with tails decaying at polynomial rates, then each time we apply Theorem 1.4 we lose one exponent. This loss seems to be due to the technique that we use.
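The bookkeeping behind the last remark can be made explicit. The following sketch is only an illustration and rests on an assumption read off from Remark 1.5 rather than on a restatement of Theorem 1.4: combining two towers whose tails decay like $n^{-a}$ and $n^{-b}$ is taken to give a product tower whose tail is bounded by $n^{-(\min(a,b) - 1)}$. The function name `product_tail_exponent` is hypothetical, and any restrictions on the admissible exponents are ignored.

```python
from functools import reduce


def product_tail_exponent(alphas):
    """Polynomial tail exponent after forming the product one tower at a time.

    Assumption (read off from Remark 1.5, not a restatement of Theorem 1.4):
    combining towers whose tails decay like n**(-a) and n**(-b) gives a
    product tower whose tail decays like n**(-(min(a, b) - 1)).
    """
    def combine(current, alpha):
        # M_n is the slower of the two tails; one exponent is then lost.
        return min(current, alpha) - 1

    return reduce(combine, alphas[1:], alphas[0])


if __name__ == "__main__":
    print(product_tail_exponent([5, 5, 5]))
    print(product_tail_exponent([10, 10, 3]))
    print(product_tail_exponent([3, 10, 10]))
```

Under this assumption, three towers with exponent 5 yield the exponent 3, and the resulting bound can depend on the order in which the towers are combined.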
1.5. Applications. As mentioned above, there are many classes of explicit systems for which Young Towers have been shown to exist with estimates on the decay of the return times. Our main Theorem can be applied to direct products of such systems to obtain a Young Tower and estimates for the decay of the return times, and thus to deduce several non-trivial statistical properties. For completeness we mention here some of these dynamical systems, referring to the original papers for the precise definitions.
(1) Uniformly Hyperbolic maps. The statistical properties of uniformly expanding and uniformly hyperbolic systems are classical and well known [6,12]. It was shown in [50] that there exists a Young Tower with exponential tail$^4$.
compact interval and $f_i$ is uniformly expanding and has a single singularity with dense preimages satisfying $|f'(x)| \approx |x|^{-\beta}$ for some $\beta \in (1/2, 1)$. It was shown in [18] that there exists a Young Tower with exponential tail, see [18] for the precise technical conditions.
(3) Multimodal maps. Young Towers for a large class of interval maps were constructed in [13,41]. There, the decay rate of the return times was obtained in terms of the growth rate of the derivative along the critical orbits.
(4) Maps with critical points and singularities. A more abstract and more general class of maps, which partially generalizes both Lorenz-like and multimodal maps, includes maps with both critical points and singularities and also covers maps in higher dimensions, see [3,19].
(5) Planar periodic Lorentz gas. This is the class of examples introduced by Sinai [44]. The Lorentz gas is a billiard flow which admits Young Towers with exponential tails [16,50].
(6) Hénon maps are the maps $H_{a,b}$ : . For certain choices of parameters $(a, b)$ they admit a Young Tower with exponential tail [9,10].

$^4$ We remark that the direct product of uniformly hyperbolic systems is still uniformly hyperbolic and therefore can be studied directly. This is however essentially the only case in which this can be done. Even the direct product of a uniformly hyperbolic system with one of the other systems described below no longer satisfies the conditions which would allow it to be studied without using the additional information that it admits a Young Tower.

DECAY OF CORRELATIONS FOR PRODUCT MEASURES
Here we give the proof of Theorem 1.2. Notice that it is sufficient to prove the theorem for $p = 2$; the general case can be obtained by successive applications of the argument. Let $\varphi \in \mathcal{A}$ be such that $\int \varphi \, d\mu = 0$ and $\psi \in \mathcal{B}$. Moreover, let $\bar{\varphi}(x_1) = \int \varphi(x_1, y) \, d\mu_2(y)$. If we fix the first coordinate, by Fubini's theorem we have Since $\mu_2$ is an $f_2$-invariant probability measure, we can write the first term of the right hand side as The latter inequality follows since the left hand side gives the correlations of the second component with respect to $\mu_2$. From the third assumption we obtain $I \le C \|\varphi\|_{\mathcal{A}} \|\psi\|_{\mathcal{B}}$. Again by the invariance of $\mu_2$ we can write the second summand of equation (2.1) as Note that $\int \bar{\varphi}(x_1) \, d\mu_1 = 0$ by the choice of $\varphi$. Then the above expression can be written as The expression under the integral with respect to $\mu_2$ is exactly the correlation with respect to $\mu_1$. Again using the third property of the Banach spaces $\mathcal{A}$ and $\mathcal{B}$ we have This completes the proof.
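For observables of product form the splitting behind this argument can be written out in one line. The following worked identity is only an illustration under extra assumptions that are not part of the theorem: $p = 2$, $\varphi(x_1, x_2) = \varphi_1(x_1)\varphi_2(x_2)$ and $\psi(x_1, x_2) = \psi_1(x_1)\psi_2(x_2)$ with all four factors bounded; the shorthand $a_i(n)$, $b_i$ is introduced here for convenience. Set
\[ a_i(n) := \int \varphi_i \, (\psi_i \circ f_i^n) \, d\mu_i, \qquad b_i := \int \varphi_i \, d\mu_i \int \psi_i \, d\mu_i, \qquad i = 1, 2. \]
By Fubini's theorem,
\[ \int \varphi \, (\psi \circ f^n) \, d\mu - \int \varphi \, d\mu \int \psi \, d\mu = a_1(n) a_2(n) - b_1 b_2 = \bigl(a_1(n) - b_1\bigr) a_2(n) + b_1 \bigl(a_2(n) - b_2\bigr). \]
Since $|a_2(n)| \le \|\varphi_2\|_\infty \|\psi_2\|_\infty$ and $|b_1| \le \|\varphi_1\|_\infty \|\psi_1\|_\infty$, we get
\[ \left| \int \varphi \, (\psi \circ f^n) \, d\mu - \int \varphi \, d\mu \int \psi \, d\mu \right| \le |a_1(n) - b_1| \, \|\varphi_2\|_\infty \|\psi_2\|_\infty + \|\varphi_1\|_\infty \|\psi_1\|_\infty \, |a_2(n) - b_2|, \]
and each factor $|a_i(n) - b_i|$ is the (unnormalized) correlation of the $i$-th component, so the product inherits the slower of the two rates in this special case.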

A TOWER FOR THE PRODUCT
3.1. Basic ideas and notation. Let $F : (\Delta, m)$ and $F' : (\Delta', m')$ be two towers with bases $\Delta_0$ and $\Delta_0'$.
The first return timeR (x ) for x ∈ ∆ 0 is defined analogously.
In this section we will show that the direct product $\bar{F}$ of the towers $F$ and $F'$ admits a tower $\hat{F}$ with the base $\bar{\Delta}_0$, i.e. we will define a return time map $T$ and a partition $\hat{\mathcal{P}}$ of $\bar{\Delta}_0$ compatible with $T$ in the sense that the quadruple $(\hat{F}, \hat{\mathcal{P}}, T, \hat{s})$ satisfies all the conditions of Definition 1.4.
Let us begin with the following analysis. Fix $n$, and focus on the elements of the partition
\[ \eta_n := \bigvee_{j=0}^{n-1} F^{-j}\eta. \tag{3.1} \]
For $x \in \Delta$ let $x_j = F^j(x)$ and consider the partition element $\eta_n(x) = \bigcap_{j=0}^{n-1} F^{-j}(\eta(x_j))$. From the definition of tower, for $x \notin \Delta_0$ we have $F^{-1}(\eta(x)) = \eta(F^{-1}(x))$, which shows that the element $\eta(x)$ gets refined only when $F^{-j}(x) \in \Delta_0$ for some $j$, $j = 0, \dots, n-1$. Let us see the mechanism of this refinement in examples.
For the case $n = 2$ the partition (3.1) has the form $F^{-1}\eta \vee \eta$. In this case only the elements on the top levels get refined, in such a way that the new elements are mapped by $F$ bijectively onto $\Delta_{0,i} \subset \Delta_0$ for some $i$; all the other elements remain the same (see Figure 1). In particular, for $x \in \Delta$ with $F^2(x) \in \Delta_0$ we have $F^2(\eta_2(x)) = \Delta_0$. This is because $\eta_2(x) = \eta(x)$ and $F(\eta(x))$ is a top level element of $\eta$. (For simplicity, the pictures are drawn for the case in which the partition of the base contains only two elements, with return times 3 and 5.)

FIGURE 1. $\eta$ and $\eta_2$
For $n = 3$ all the elements on the top levels and on the levels immediately below them get refined as follows: the top level elements of the new partition are mapped onto some $\Delta_{0,i}$ by $F$; the elements belonging to the level just below the top level are mapped onto some $\Delta_{0,i}$ by $F^2$, and the other elements remain the same (see Figure 2).
When $n = \min_i \{R_i\}$, the refinement reaches $\Delta_0$. After this moment, the second refinements on the top levels start. In Figure 2 the bold elements of $\eta_5$ are the elements of $\eta$ which have been refined twice. In general, for each $n$ the refinement is pulled back from the top levels onto the lower levels for exactly $n - 1$ levels. If $n$ is sufficiently large, some of the partition elements might get refined several times. The next lemma shows the relation between the return times and the mixing property of the system.

Lemma 3.2. The invariant measure $\mu$ of the tower is mixing if and only if $\gcd\{R_i\} = 1$.

Proof. The fact that $\gcd\{R_i\} = 1$ implies that $\mu$ is mixing was proved in [50] (Theorem 1).
Here we prove that if $\mu$ is mixing then $\gcd\{R_i\} = 1$. Assume by contradiction that $\mu$ is mixing and $\gcd\{R_i\} = k > 1$. By Remark 3.1, for $A \subset \Delta_{0,i}$ the relation $F^{-n}(A) \cap \Delta_0 \ne \emptyset$ holds only when $n$ is divisible by $k$, which contradicts the mixing property of $\mu$.
The lemma which we have just proved shows that the condition $\gcd\{R_i\} = 1$ is equivalent to the definition of aperiodicity for towers (see [7]).
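The role of the condition $\gcd\{R_i\} = 1$ can be seen by listing the possible return times to the base. The sketch below assumes a simplified model in which every element of the base partition is mapped onto the whole base at its return time, so that a point of $\Delta_0$ can be back in $\Delta_0$ at time $n$ exactly when $n$ is a finite sum of values $R_i$; the function name `return_times` is hypothetical.

```python
from functools import reduce
from math import gcd


def return_times(return_time_values, horizon):
    """Times n <= horizon that are finite sums of the given return times."""
    reachable = [False] * (horizon + 1)
    reachable[0] = True  # the empty sum
    for n in range(1, horizon + 1):
        reachable[n] = any(
            r <= n and reachable[n - r] for r in return_time_values
        )
    return [n for n in range(1, horizon + 1) if reachable[n]]


if __name__ == "__main__":
    for values in ([3, 5], [4, 6]):
        k = reduce(gcd, values)
        print(values, "gcd =", k, "->", return_times(values, 20))
```

With the return times 3 and 5 used in the pictures, every $n \ge 8$ is a return time, whereas with return times 4 and 6 only even times occur, which is incompatible with mixing.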
The following almost obvious lemma will be useful in the construction below.
Proof. The proof is by induction on $n$. For $n = 1$ there is no refinement, $\eta_1 = \eta$. Therefore, the conclusion follows from the definition of tower.
Since the first return time is constant on the elements of η, the second assertion follows.

3.2. Partition of $\bar{\Delta}_0$. We start by defining the return time to the base. Theorem 1 in [50] asserts that a GMY-tower admits an invariant probability measure with positive density with respect to the reference measure. Therefore there exists $n_0 > 0$ such that for any $n \ge n_0$ the inequalities $m(F^{-n}(\Delta_0) \cap \Delta_0) > c > 0$ and $m'((F')^{-n}(\Delta_0') \cap \Delta_0') > c > 0$ hold. We choose such $n_0$ and introduce a sequence $\{\tau_i\}$ of positive integers: for $\bar{x} = (x, x') \in \bar{\Delta}$ let $\tau_0 = 0$ and Notice that the functions $\tau_i$ are piecewise constant on $\bar{\Delta}_0$.
Proof. Recall that for the towers $F$ and $F'$ there are mixing measures $\mu$ and $\mu'$ which are equivalent to the reference measures $m_0$ and $m_0'$ respectively. Since $\mu \times \mu'$ is mixing, it is ergodic, and by Birkhoff's ergodic theorem, $\mu \times \mu'$-almost everywhere Taking the subsequence $\{\tau_i(x, x')\}_{i \ge 2}$ finishes the proof. Now, we define a partition $\hat{\mathcal{P}}$ of $\bar{\Delta}_0$. This construction uses the sequence $\tau_i$.

Auxiliary partitions.
Denote by $\pi, \pi'$ the projections of $\bar{\Delta}$ onto the coordinates and by $\eta, \eta'$ the partitions of $\Delta, \Delta'$ into the sets $\Delta_{\ell,j}$, $\Delta'_{\ell,j}$ respectively. We define an increasing sequence of partitions of $\bar{\Delta}_0$ as follows. For each $\bar{x} = (x, x') \in \bar{\Delta}_0$, let The projection to the first coordinate $\pi\xi_1$ is a partition of $\Delta_0$ of the form (3.1). From the definition of $\tau_1$, for any $x \in \Delta_0$ we have $F^{\tau_1(x)}(x) \in \Delta_0$. The second assertion of Lemma 3.3 implies that the function $\tau_1$ is constant on every element $\Omega(x) \in \pi\xi_1$, i.e. $\tau_1(x) = \tau_1|_{\Omega(x)} = \mathrm{const}$. Moreover, from the first assertion of Lemma 3.3, for each element $\Omega$ of $\pi\xi_1$ we have $F^{\tau_1|_\Omega}(\Omega) = \Delta_0$ bijectively. Define $\xi_2$ by partitioning each element of $\xi_1$ in the second coordinate: By construction, $\tau_2$ is constant on each element of $\xi_2$ and $(F')^{\tau_2}$ maps the second coordinate of each element bijectively onto $\Delta_0'$. (As above this is easily seen by applying Lemma 3.3 in the second coordinate.) In addition, since $\xi_2$ is a refinement of $\xi_1$, $\tau_1$ is also constant on each element of $\xi_2$. Now we define the partition $\hat{\xi}_2$ by refining $\xi_2$ in the first coordinate as follows: Since $\tau_2(x) \ge \tau_1(x) + n_0$, the partition $\pi\hat{\xi}_2$ is finer than the partition $\pi\xi_1$. Hence, the functions $\tau_1$ and $\tau_2$ are constant on the elements of $\hat{\xi}_2$.
Figure 3 gives illustrative examples of elements of the partitions $\xi_1$, $\xi_2$ and $\hat{\xi}_2$, where the elements of $\xi_1$ are vertical strips. The elements of $\xi_2$ are rectangles, each of them formed by subdividing an element of $\xi_1$ in the second coordinate. By refining the elements of $\xi_2$ in the first coordinate, we define the elements of $\hat{\xi}_2$ (for example the rectangle with bold boundaries).
The proof follows easily from Lemma 3.3 applied to each coordinate with $n = \tau_2$.
The subset of the partition $\hat{\mathcal{P}}$ constructed in the first step is By definition, for any $\Gamma \in \hat{\mathcal{P}}_2$ we have $T|_\Gamma = \tau_2|_\Gamma$. If we denote by $P_2$ the set of points where $T = \tau_2$, then $\hat{\mathcal{P}}_2$ is a partition of this set. Notice that even on $P_2$ the return time $T$ is not necessarily bounded.
Define $\hat{\xi}_{2i+1}$ to be the refinement of $\xi_{2i+1}$ in the second coordinate As a simple application of Lemma 3.3 in the coordinates we get the following Therefore, the subset of the partition $\hat{\mathcal{P}}$ constructed in the $(k-1)$-th step is When $k$ is even, we define the partition $\xi_{k+1}$ by partitioning $\bar{\Delta}_0$ in the second coordinate as for $\xi_2$. This finishes the general step of the construction.
Partition of $\bar{\Delta}_0$. Define $\hat{\mathcal{P}}$ Since $T$ is almost everywhere finite, $\eta$ and $\eta'$ generate trivial partitions, and $\tau_i > n_0 i$, we end up with a mod 0 partition of $\bar{\Delta}_0$.
So, we have now constructed a partition $\hat{\mathcal{P}}$ of $\bar{\Delta}_0$ with the following properties. (1) For any $\Gamma \in \hat{\mathcal{P}}$ the return time $T$ is constant on $\Gamma$ and $T|_\Gamma = \tau_i|_\Gamma$ for some $i$. (2) $\bar{F}^T(\Gamma) = \bar{\Delta}_0$ for any $\Gamma \in \hat{\mathcal{P}}$. It remains to show aperiodicity and bounded distortion. The next two subsections are devoted to showing these properties.
Let $D_F, \beta_F, D_{F'}, \beta_{F'}$ be the distortion constants as in (1.3) for $F$ and $F'$ respectively. Lemma 3.7. There exist $\bar{D} > 0$ and $\bar{\beta} \in (0, 1)$ such that for any $\Gamma \in \hat{\mathcal{P}}$ and $\bar{x}, \bar{y} \in \Gamma$ the following inequality holds Proof. By the properties of the Jacobian and of the absolute value we have We start by estimating the first summand. Notice that the simultaneous return time $T$ can be written as a sum of return times to $\Delta_0$. Without loss of generality, assume $T = R_1 + \dots + R_k$ for some $k$. For any $x \in \Delta_0$ let . Hence, for points $x, y \in \pi\Gamma$ the sequences $x_j$ and $y_j$ belong to the same element of $\eta$ for all $j = 0, \dots, k - 1$, which implies The second summand is estimated similarly and we get This finishes the construction of the tower over $\bar{\Delta}_0$.

COMBINATORIAL ESTIMATES FOR THE PRODUCT TOWER
In the previous section we showed that the map $\bar{F}$ admits a tower. In order to obtain statistical properties, we need to study the asymptotics of the return time. From now on we use the following notation: The main result of this section is the following. Proposition 4.1. There exist constants $\varepsilon_0, K_0 > 0$ such that for any $i \ge 2$ The proof of the proposition will be given below in Lemmas 4.6 and 4.7. Now, we prove some technical lemmas. Let For any $A \in \eta^0_n$ and $x, y \in A$ the following inequality holds where $D$ is as in (1.3).
Proof. The collection $\eta^0_n$ is a partition of $F^{-n}\Delta_0$. For any $x \in \Delta_0$ each $A \in \eta^0_n$ contains a single element of $\{F^{-n}x\}$. For a point $x \in A$ let $j(x)$ be the number of visits of its orbit to $\Delta_0$. Since the images of $A$ before time $n$ remain in an element of $\eta$, all the points in $A$ have the same combinatorics up to time $n$. This implies that $j(x)$ is constant on $A$. Therefore $JF^n(x) = (JF^R)^j(\bar{x})$, for the projection $\bar{x}$ of $x$ into $\Delta_0$ (i.e. if $x = (z, \ell)$ then $\bar{x} = (z, 0)$). Thus for any $x, y \in \Delta_0$ from (1.3) we obtain Hence, for any $y \in A$ we have

Proof. Let $\nu_n = F^n_* m$. From Corollary 4.3 for any $x \in \Delta_0$ This proves the case $x \in \Delta_0$. For $x \in \Delta_\ell$ with $\ell \ge n$ we have $F^{-n}(x) = y \in \Delta_{\ell - n}$. Since $JF \equiv 1$ for any $y \in \Delta \setminus \Delta_0$ we have $\frac{d\nu_n}{dm}(x) = \frac{1}{JF^n(y)} = 1$.
Let $x \in \Delta_\ell$, $\ell < n$. Then for any $y \in F^{-n}x$ the equality $F^{n-\ell}y = F^{-\ell}x \in \Delta_0$ holds. Hence, $JF(F^j y) = 1$ for all $j = n - \ell, \dots, n - 1$. Therefore by the chain rule we obtain $JF^n(y) = JF^{n-\ell}(y)$. Now, we can reduce the problem to the first case by writing This finishes the proof.
Proof. By the assumption, $F^n : A \to \Delta_0$ is invertible. So for any $x \in \Delta_0$ there is a unique $x_0 \in A$ such that $F^n(x_0) = x$ and $\frac{d\nu}{dm}(x) = \frac{1}{JF^n(x_0)}$. Let $\varphi = \frac{d\nu}{dm}$; then for $x, y \in \Delta_0$, using Lemma 4.2, we obtain The next lemma is devoted to the first item of Proposition 4.1.
Proof. By the construction the set {T > τ i−1 } is a union of elements of partition ξ i . Thus From the above equality we see that it is sufficient to prove for any Γ ∈ ξ i the inequalitym Assume for a moment i is even and let Ω = π(Γ), Ω = π (Γ). Then From the equality F τ i−1 (Ω) = ∆ 0 we have For odd i 's we can just change F to F and do all the calculations, that gives the estimatem .
Taking ε 0 = c (1+D) max{ 1 m 0 (∆ 0 ) , 1 m 0 (∆ 0 ) } we get the assertion. The second assertion of the proposition 4.1 is proved in the following Lemma 4.7. There exists K 0 > 0 such that for any i ≥ 0, Γ ∈ ξ i and n ≥ 0.m Proof. Assume for a moment i is even, let as above Ω = π(Γ), Ω = π (Γ). Moreover let µ be the mixing acip for the tower F : (∆, m) . Since, τ i is constant on the elements of ξ i we have the following equality From the definition of τ i we havē Let k = τ i − τ i−1 + n 0 . By the properties of the derivative The second factor is bounded by M 0 from the lemma 4.4. Let us estimate the first one. Note that, dµ . From the lemma 4.5 for any x, y ∈ ∆ 0 we have where D F as in (1.3). Substituting this to the expression For i odd the argument will be the same and we obtain the followinḡ Letting and using the relation m 0 {R > n} = i≥n m 0 {R > i} we get the assertion.

PROOF OF THEOREM 1.4.
Now we are ready to prove the main technical theorem. We start by giving the lower bound. By definition $T > \max\{R, R'\}$ and Now our aim is to estimate the measure $\bar{m}_0\{T > n\}$ from above. Letting $\tau_0 = 0$, we first split the latter measure as follows For technical reasons, as we will see below, we need to split the sum into two summands with respect to $i$, depending on which rate we want to consider.
The estimates in the two cases are different from each other. From now on, we let $C$ denote a constant that only depends on $F$ and the reference measure $m$ and First we prove the following The proof of Proposition 5.1 will be broken into several lemmas. Noting the fact Proof. The case $i = 1$: The case $i = 2$: Using the equality $m\{R > n\} = \sum_{j \ge n} m_0\{R > j\}$ we obtain the estimate of the lemma.
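The equality $m\{R > n\} = \sum_{j \ge n} m_0\{R > j\}$ used in this proof can be checked on a finite tower. The sketch below is an illustration under assumptions of our own: the base consists of two elements of mass $1/2$ with return times 3 and 5 (the example used in the pictures), each level of the tower carries the mass of the base element below it, and the return time is extended to a point at level $\ell$ above an element with return time $r$ as $r - \ell$. All function names are hypothetical.

```python
from fractions import Fraction

# Hypothetical base: two partition elements of mass 1/2, return times 3 and 5.
BASE = [(Fraction(1, 2), 3), (Fraction(1, 2), 5)]


def base_tail(base, n):
    """m_0{R > n}: mass of the base elements whose return time exceeds n."""
    return sum((mass for mass, r in base if r > n), Fraction(0))


def tower_tail(base, n):
    """m{R > n} on the tower, summing over all levels of all base elements."""
    total = Fraction(0)
    for mass, r in base:
        for level in range(r):   # levels 0, ..., r - 1 above this element
            if r - level > n:    # steps still needed to reach the base
                total += mass
    return total


def summed_base_tail(base, n):
    """Sum over j >= n of m_0{R > j} (the terms vanish for j >= max R)."""
    largest = max(r for _, r in base)
    return sum((base_tail(base, j) for j in range(n, largest)), Fraction(0))


if __name__ == "__main__":
    for n in range(7):
        print(n, tower_tail(BASE, n), summed_base_tail(BASE, n))
```

The two columns agree for every $n$. The same identity shows that a polynomial tail on the base, once summed over $j \ge n$, costs one power of $n$ on the tower.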
Below we give similar estimates for the case $i \ge 3$, and we will see that we can obtain better estimates in this case.
Proof. Since $T > n \ge \tau_{i-1}$ we have Moreover, writing $\tau_i = \sum_{j=1}^{i} (\tau_j - \tau_{j-1})$ (recall that $\tau_0 = 0$) and using the fact that $\tau_i > n$, we see that there is at least one $j \in [1, i]$ such that $\tau_j - \tau_{j-1} > n/i$, since otherwise the sum would be at most $n$. This gives the lemma. Now we give estimates for each summand of Lemma 5.4. Lemma 5.5. For $i \ge 3$ and $j \ge 1$ We separate the expression in the lemma into factors as follows We now estimate each of these terms separately. Start with the second term. By the second item of Proposition 4.1 we have For the first term, note that By the construction of the partition $\xi_{j-1}$ and the return time $T$, the set $\{T > \tau_{j-2}\}$ can be written as a union of elements of $\xi_{j-1}$.
Hence by the second item of the proposition 4.1 . It remains to give an upper bound for the term Y 3 . Start with the following Sublemma 5.6. For any k ≥ j Proof. Since τ j and τ j−1 are constant on the elements of ξ j , if τ j (x) − τ j−1 (x) > n i for some pointx, then it is true on the entire element ξ j (x). By construction, for k ≥ j the partition ξ k is finer than ξ j and the set {T > τ k−1 } can be written as a union of elements of ξ k . Hence the set {T > τ k−1 ; τ j − τ j−1 > n i } can be written as a union of elements of ξ k . Using the first item of the proposition 4.1 in each partition element gives From the sublemma we get directly Substituting the estimates (5.4), (5.5), (5.6) into the equation (5.3) we get the conclusion.
For the case $i \ge 3$ and $j < 3$ the proof is the same, but without the term $Y_2$.
Since the series $\sum_{i=1}^{\infty} (1 - \varepsilon_0)^{i-2} i^\alpha$ is convergent and the second term in the above sum is exponential in $n$, we get the desired estimate.
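The convergence of the series used here follows from the ratio test. As a short check, assuming $0 < \varepsilon_0 < 1$ and writing $c_i := (1 - \varepsilon_0)^{i-2} i^\alpha$ (a shorthand introduced only for this computation), we have
\[ \frac{c_{i+1}}{c_i} = (1 - \varepsilon_0) \left(1 + \frac{1}{i}\right)^{\alpha} \longrightarrow 1 - \varepsilon_0 < 1 \quad \text{as } i \to \infty, \]
so $\sum_{i \ge 1} c_i < \infty$ for every fixed $\alpha$: the geometric factor dominates the polynomial one.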

Super polynomial cases.
In this subsection the proof of the items b) and c) will be given. Let Remark 5.7. It is known fact (see for example [45]) that the cardinality cardA(i) of the set A(i) is bounded above by n+i−n 0 Let δ be a sufficiently small number, which will be specified later. We now prove the following proposition and the proof of the theorem will directly follow from the proposition.
The proof of the proposition will be given in several lemmas below. We split $\bar{m}_0\{T > n\}$ as follows: As in the polynomial case, we start by estimating the second summand of equation (5.7). Lemma 5.9. For any $n$ Proof. The proof of this lemma is analogous to the proof of Lemma 5.2.
To estimate the first summand of (5.7), we first fix $i$ and prove the following Indeed, from the definition of $\tau_j$ and the equality we obtain that a vector $(k_1, \dots,$ Using the above observation and notation we can write (5.10) $\bar{m}\{T > n;\ \tau_{i-1} \le n < \tau_i\} \le \bar{m}\{\tau_{i-1} \le n < \tau_i\} \le$ Sublemma 5.11. For any $k = (k_1, \dots, k_{i-1}) \in A(i)$ the following holds where $K_0$ is as in Proposition 4.1.
Proof. Start with the following equality For any $j$ the set $\{P(k, j)\}$ is a union of elements of $\xi_j$. Therefore, we can apply the second item of Proposition 4.1 to each element of the right hand side and obtain for each By the same reason Substituting these inequalities into equation (5.11) we get the assertion of the sublemma.
Applying Sublemma 5.11 to each summand of (5.10) we obtain the lemma.
Proof of Proposition 5.8. By the definition ofM(i, n) and remark 5.7 from lemma 5.10 we obtain Taking sum over i finishes the proof.
Proof of the theorem 1.4 for the exponential case. For the exponential case we choose θ = 1 in the proposition 5.8 and we have the following inequality.
6. PROOF OF THEOREM 1.3
Let $F_i : T_i \to T_i$ be the two towers corresponding to the maps $f_i : M_i \to M_i$, $i = 1, 2$, as in the statement of the Theorem. Then, from conditions (A1)-(A7) we can obtain GMY Towers by considering the system obtained by the equivalence relation on $\Lambda_i$, $i = 1, 2$, defined as $x \sim y$ if and only if $y \in \gamma^s(x)$. Then on $\Delta^i_0 = \Lambda_i / \sim$ we have the partition $\mathcal{P}_i = \{\Delta_{0,j}\} := \{\Lambda_{0,j} / \sim\}$ and the return time function $R_i : \Delta_{0,j} \to \mathbb{Z}^+$, and the quadruples $(F_i, R_i, \mathcal{P}_i, s_i)$, $i = 1, 2$, satisfy conditions (G1)-(G5). Moreover there is a natural projection $\bar{\pi}_i : T \to \Delta$ that sends each stable manifold to a point. We can then define the direct product of these two "quotient" GMY Towers and, applying Theorem 1.4, construct a new GMY Tower for this product. Thus, on $\Delta^1_0 \times \Delta^2_0$ we have a partition $\hat{\mathcal{P}}$ and a return time $T : \Delta_0 \to \mathbb{N}$ such that for any $A \in \hat{\mathcal{P}}$ we have $(F_1 \times F_2)^T(A) = \Delta^1_0 \times \Delta^2_0$. On the other hand we know that each $A \in \hat{\mathcal{P}}$ is of the form ) is a $u$-subset of $\Lambda_1 \times \Lambda_2$ because at return times we have $f_i^T = F_i^T$, $i = 1, 2$. All that is left is to check the properties (A1)-(A7). Those that refer to the combinatorial structure follow immediately from the discussion above; the others follow immediately from the corresponding properties of the quotient (proved in Theorem 1.4). The only new property to check here is the second item in (A3). This follows easily by noticing that from the definition of $T$ we have $s_T(x, y) \le \min\{s_{R_1}(x_1, y_1), s_{R_2}(x_2, y_2)\}$, where $s_T, s_{R_1}, s_{R_2}$ denote the separation times with respect to the return times $T, R_1, R_2$ and $x = (x_1, x_2)$, $y = (y_1, y_2)$, and therefore, by the definition of the product metric, we have
\[ \mathrm{dist}_M\bigl((f^T)^n(x), (f^T)^n(y)\bigr) = \max_{i=1,2} \bigl\{ \mathrm{dist}_i\bigl((f_i^T)^n(x_i), (f_i^T)^n(y_i)\bigr) \bigr\} \le C\beta^{\min\{s_{R_1}(x_1, y_1),\, s_{R_2}(x_2, y_2)\}} \le C\beta^{s_T(x, y)}. \]
This completes the proof.