Fast Arnold Diffusion in systems with three time scales

We consider the problem of Arnold Diffusion for nearly integrable 
partially isochronous Hamiltonian systems with three time scales. 
By means of a careful shadowing analysis, based on a variational technique, 
we prove that, along special directions, Arnold diffusion takes 
place with fast (polynomial) speed, even though the "splitting determinant" 
is exponentially small.

1. Introduction. In a previous paper [6] (see also [7]) we introduced, in the context of nearly integrable Hamiltonian systems, a functional analysis approach to the "splitting of separatrices" and to the "shadowing problem". We applied our method to the problem of Arnold Diffusion, i.e. topological instability of action variables, for nearly integrable partially isochronous systems. The aim of this paper is to improve the shadowing theorem of [6] and to apply this new theorem to the system with three time scales (1.1) below, in order to prove that along special directions Arnold diffusion takes place with "very fast speed", namely a speed polynomial in $\varepsilon$. To that effect, we use the results on the splitting provided in [6].
Hamiltonian systems with three time scales have been introduced in [11] as a description of the D'Alembert problem in Celestial Mechanics. Later on, systems with three time scales have been reconsidered, for example, in [16], [17], [22], [10], [6], [19].
When $\mu = 0$ the energy $\omega_{\varepsilon,i} I_i$ of each oscillator is a constant of the motion. The problem of Arnold diffusion in this context is whether, for $\mu \neq 0$, there exist motions whose net effect is to transfer $O(1)$-energy from one oscillator to others in a certain time $T_d$ called the diffusion time. It will be required that $\omega_\varepsilon$ satisfies some diophantine condition.
The existence of Arnold diffusion is usually proved following the mechanism proposed in [3]. For $\mu = 0$ the Hamiltonian $H_\mu$ admits a continuous family of $n$-dimensional partially hyperbolic invariant tori $\mathcal{T}_{I_0} = \{(\varphi, I, q, p) \in \mathbb{T}^n \times \mathbb{R}^n \times \mathbb{T}^1 \times \mathbb{R}^1 \mid I = I_0,\ q = 0,\ p = 0\}$ possessing stable and unstable manifolds $W^s(\mathcal{T}_{I_0}) = W^u(\mathcal{T}_{I_0}) = \{(\varphi, I, q, p) \in \mathbb{T}^n \times \mathbb{R}^n \times \mathbb{T}^1 \times \mathbb{R}^1 \mid I = I_0,\ p^2/2 + (\cos q - 1) = 0\}$, called "whiskers" by Arnold. For $\mu$ small enough the perturbed stable and unstable manifolds $W^s_\mu(\mathcal{T}^\mu_{I_0})$ and $W^u_\mu(\mathcal{T}^\mu_{I_0})$ may split and intersect transversally, giving rise to a chain of tori connected by heteroclinic orbits. By a shadowing type argument one can then prove the existence of an orbit such that the action variables $I$ undergo an $O(1)$-variation in a certain time $T_d$ called the diffusion time. In order to prove the existence of diffusion orbits following the previous mechanism one encounters two different problems: 1) Splitting of the whiskers; 2) Shadowing problem.
In [6], the splitting of the stable and unstable manifolds is related to the variations of the "homoclinic function" $G_\mu : \mathbb{T}^n \to \mathbb{R}$ (defined in (2.5)), which is the difference between the generating functions of the stable and unstable manifolds at section $\{q = \pi\}$. $\nabla G_\mu(A)$ provides a measure of the distance between the stable and unstable manifolds, so that a critical point $A$ of $G_\mu$ gives rise to a homoclinic intersection. Usually $\det D^2 G_\mu(A)$ is called the "splitting determinant". The use of the "homoclinic function" $G_\mu$ for measuring the splitting has two advantages. Firstly, it is very well suited to deal with the shadowing problem by means of variational techniques because $G_\mu$ is nothing but the difference of the values of the Lagrangian action functional associated to the quasi-periodically forced pendulum (2.2) at two solutions, lying respectively on the stable and unstable manifolds $W^{s,u}_\mu(\mathcal{T}_{I_0})$, see (2.4). Secondly, it may shed light on a "non uniform" splitting which would not be given by the splitting determinant, when the variations of $G_\mu$ in different directions are of different orders. In this regard we quote paper [20] where, for a general nearly integrable Hamiltonian system, detailed estimates for the "eigenvalues and the eigenspaces of the splitting matrix", rather than for the determinant, are given.
For the system with three time scales associated to Hamiltonian $H_\mu$, "non uniform" splitting is suggested by the behaviour of the first order expansion of $G_\mu$ in $\mu$, called the Poincaré-Melnikov approximation. In fact the first order term, which is given by the Poincaré-Melnikov primitive defined in (2.8), has exponentially small oscillations in the fast angle $A_1$, and polynomially small ones in the slow angles $A_2$. Naively this gives the hint that the splitting might be exponentially small in the direction $I_1$ and just polynomially small in the directions $I_2$.
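As a rough numerical illustration of why a fast angle produces exponentially small oscillations, one can look at a single Fourier mode. The sketch below assumes, purely for illustration, that a mode $\cos(\omega t + A)$ of the perturbation is integrated along the unperturbed separatrix against the weight $1 - \cos q_0(t) = 2/\cosh^2 t$; this weight, the function names and the sample frequencies are assumptions and do not reproduce definition (2.8). Under that assumption the amplitude is $\int_{\mathbb{R}} 2\cos(\omega t)/\cosh^2 t \, dt = 2\pi\omega/\sinh(\pi\omega/2)$, which stays of order one for small $\omega$ (before multiplication by $\mu$) and decays like $e^{-\pi\omega/2}$ for large $\omega$.

```python
import math


def melnikov_amplitude_closed_form(omega):
    """Closed form of the integral of 2*cos(omega*t)/cosh(t)**2 over the real line."""
    if omega == 0.0:
        return 4.0  # limit of 2*pi*omega/sinh(pi*omega/2) as omega -> 0
    return 2.0 * math.pi * omega / math.sinh(0.5 * math.pi * omega)


def melnikov_amplitude_quadrature(omega, half_width=30.0, step=1e-3):
    """Trapezoidal approximation of the same integral on [-half_width, half_width]."""
    n = int(round(2.0 * half_width / step))
    total = 0.0
    for i in range(n + 1):
        t = -half_width + i * step
        weight = 0.5 if i in (0, n) else 1.0  # trapezoid end-point weights
        total += weight * 2.0 * math.cos(omega * t) / math.cosh(t) ** 2
    return total * step


if __name__ == "__main__":
    # Hypothetical sample frequencies: slow ones give O(1) amplitudes,
    # fast ones give exponentially small amplitudes.
    for omega in (0.1, 1.0, 5.0, 20.0):
        print(omega, melnikov_amplitude_closed_form(omega))
```

In a three time scale setting the fast frequency grows as $\varepsilon \to 0$ while the slow ones shrink, so in this toy computation only the fast mode is exponentially damped.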
However, in general, for $\mu = O(\varepsilon^p)$ and $\varepsilon \to 0$ the homoclinic function $G_\mu$ is not well approximated by the Poincaré-Melnikov primitive. In [16]-[19] the asymptotic validity of Melnikov's integrals for computing the exponentially small "splitting determinant" is proved to hold thanks to cancellation techniques.
In [6] the naive Poincaré-Melnikov approximation for Hamiltonian $H_\mu$ has been rigorously justified, in a different way, for $\mu \varepsilon^{-3/2}$ sufficiently small. We define another "splitting function" $\tilde G_\mu$, see (2.7), whose critical points also give rise to homoclinic intersections. $\tilde G_\mu$ is well approximated, for $\mu = O(\varepsilon^p)$ and $\varepsilon \to 0$, by the Poincaré-Melnikov primitive and has exponentially small oscillations in $A_1$, see theorem 2.2. The crucial observation is that $G_\mu$ and $\tilde G_\mu$ are the same function up to a diffeomorphism $\psi_\mu$ of the torus close to identity, see theorem 2.1.
After the works [8], [9], [21], [12], [10], [6], [13], [14] and references therein, it is a well established fact that the diffusion time is estimated by a polynomial inverse power of the splitting. For instance, using the estimate on the size of the splitting of [16] and [19] an exponentially long diffusion time has been obtained in [10], namely for some $b > 0$ (see also theorem 5.2 of [6]).
However, the properties of $G_\mu$ (oscillations of different orders of magnitude according to the direction) suggest that Arnold diffusion can take place with different speeds along different directions. Since for larger splitting one would expect a faster speed of diffusion, one could guess the existence of diffusion orbits that drift along the "fast" directions $I_2 \in \mathbb{R}^{n-1}$, where the splitting is just polynomially small w.r.t. $1/\varepsilon$, in a polynomially long diffusion time $T_d = O(1/\varepsilon^q)$; see also the discussion in chapter 2 of [20]. The aim of this paper is to prove that this is indeed the case and to provide explicit and careful estimates on the diffusion time. In order to prove this phenomenon (see theorem 4.1 for the general case and theorem 4.2 for an application) we refine the shadowing theorem 2.3 of [6] for dealing with the present "non-uniform" splitting. Note that, because of the preservation of the energy along the orbits, Arnold diffusion can take place in the direction $I_2$ for $n \geq 3$ only.
In order to justify heuristically our result we recall how the diffusion time $T_d$ is estimated in [6], once it is verified that stable and unstable manifolds split. $T_d$ is, roughly, estimated by the product of the number of heteroclinic transitions $k$ (= number of tori forming the transition chain = heteroclinic jump/splitting) and of the time $T_s$ required for a single transition, namely $T_d = k T_s$. The time for a single transition $T_s$ is bounded by the maximum between the "ergodization time" $T_e$ of the torus $\mathbb{T}^n$ run by the linear flow $\omega_\varepsilon t$, and the time needed to "shadow" homoclinic orbits for the corresponding quasi-periodically forced pendulum (2.2).
We are able to move in polynomial time w.r.t. $1/\varepsilon$ along the fast $I_2$ directions for the following three reasons. (i) As in [6], since the homoclinic orbit decays exponentially fast to 0, the time needed to "shadow" homoclinic orbits for the quasi-periodically forced pendulum (2.2) is only polynomial. (ii) Since the splitting is polynomially small in the directions $I_2$, we can choose just a polynomially large number of tori forming the transition chain, $k = O(1/\varepsilon^p)$, to get an $O(1)$-drift of $I_2$. (iii) Finally, the most difficult task is getting a polynomial estimate for the "ergodization time" $T_e$, defined as the time needed for the flow $\{\omega_\varepsilon t\}$ to make an $\alpha$-net of the torus, with $\alpha$ appropriately small. By a result of [4] this time satisfies $T_e = O(1/\alpha^\tau)$. Let us explain how this estimate enters into play. In order to apply our "gluing" variational technique, the projection of our shadowing orbit on the torus $\mathbb{T}^n$, namely $\{\omega_\varepsilon t + A_0\}$, must approach, at each transition, sufficiently close to the homoclinic critical point $A$ of $G_\mu$. The crucial improvement of the shadowing theorem 4.1 allows the shadowing orbit to approach $A$ only up to a polynomially small distance $\alpha = O(\varepsilon^p)$, $p > 0$ (and not an exponentially small one, as would be required when applying the shadowing theorem of [6]). By the aforementioned estimate on the ergodization time, $T_e = O(1/\alpha^\tau)$, the minimum time after which the homoclinic trajectory can "jump" to another torus is only polynomially long w.r.t. $1/\varepsilon$. Actually this also allows us to improve the exponential estimate on the diffusion time required to move in the $I_1$ direction, see remark 4.3. Theorems 4.1 and 4.2 are the first steps to prove the existence of this phenomenon also for more general systems (with non isochronous terms and more general perturbations). We quote [20], where the splitting problem is studied in a quite general framework.
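The heuristic count $T_d = k T_s$, with $T_s$ the maximum of the ergodization time and the shadowing time, can be turned into a small back-of-the-envelope calculator. The sketch below is only an illustration of the scaling argument: the function names, the constant `c_erg`, the shadowing time and all sample values are hypothetical and are not taken from the theorems.

```python
import math


def ergodization_time(alpha, tau, c_erg=1.0):
    """Model T_e = c_erg / alpha**tau (time for the linear flow to form an alpha-net)."""
    return c_erg / alpha ** tau


def diffusion_time(jump, splitting, alpha, tau, shadow_time, c_erg=1.0):
    """Heuristic T_d = k * T_s with k = jump / splitting and T_s = max(T_e, shadow_time)."""
    k = math.ceil(jump / splitting)  # number of heteroclinic transitions
    t_single = max(ergodization_time(alpha, tau, c_erg), shadow_time)
    return k * t_single


if __name__ == "__main__":
    eps, p, tau = 1e-4, 2, 3           # hypothetical sample values
    alpha_poly = eps ** p              # polynomially small approach distance
    alpha_exp = math.exp(-1.0 / math.sqrt(eps))  # exponentially small one
    for alpha in (alpha_poly, alpha_exp):
        print(alpha, diffusion_time(1.0, eps ** p, alpha, tau, shadow_time=10.0))
```

With a polynomially small $\alpha$ both factors of $T_d$ are polynomial in $1/\varepsilon$, whereas an exponentially small $\alpha$ makes the ergodization term, and hence $T_d$, exponentially large.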
The paper is organized as follows: in section 2 we recall some preliminary results taken from [6]. In section 3 we introduce the general "splitting condition" which will be used in section 4 to prove the shadowing theorems.

2. Preliminaries. In this section we recall the results of [6] that will be used in the sequel. We refer to [6] for complete details and for the description of the general functional analysis approach based on a Lyapunov-Schmidt reduction. With respect to the notations of [6] we remark that we have changed the sign before the perturbation $f$ in Hamiltonian $H_\mu$.
The equations of motion corresponding to Hamiltonian $H_\mu$ are denoted by (2.1). The angles $\varphi$ evolve as $\varphi(t) = \omega_\varepsilon t + A$; therefore equations (2.1) can be reduced to the quasi-periodically forced pendulum equation (2.2), corresponding to the Lagrangian (2.3). For each solution $q(t)$ of (2.2) one recovers the dynamics of the actions $I(t)$ by quadratures in (2.1).
For $\mu = 0$ equation (2.2) possesses the one parameter family of homoclinic solutions to 0, mod $2\pi$, $q_\theta(t) = 4 \arctan(\exp(t - \theta))$, $\theta \in \mathbb{R}$. Using the Implicit Function Theorem one can prove (lemma 2.1 of [6]) that there exist, near the unperturbed homoclinic solutions $q_\theta(t)$, for $0 < \mu < \mu_0$ small enough independently of $\omega_\varepsilon$, "pseudo-homoclinic solutions" $q^\mu_{A,\theta}(t)$ of equation (2.2). These are true solutions of (2.2) in each interval $(-\infty, \theta)$ and $(\theta, +\infty)$; at time $t = \theta$ such pseudo-solutions are glued with continuity at value $q^\mu_{A,\theta}(\theta) = \pi$ and for $t \to \pm\infty$ are asymptotic to the equilibrium 0 mod $2\pi$. We can then define the function $F_\mu : \mathbb{T}^n \times \mathbb{R} \to \mathbb{R}$ as the action functional of Lagrangian (2.3) evaluated on the "1-bump pseudo-homoclinic solutions" $q^\mu_{A,\theta}(t)$, see (2.4), and the "homoclinic function" $G_\mu : \mathbb{T}^n \to \mathbb{R}$, see (2.5). There holds
Remark 2.1. The homoclinic function $G_\mu$ is the difference between the generating functions $S^\pm_{\mu,I_0}(A, q)$ of the stable and the unstable manifolds $W^{s,u}_\mu(\mathcal{T}_{I_0})$ (which in this case are exact Lagrangian manifolds) at the fixed section $\{q = \pi\}$. A critical point of $G_\mu$ gives rise to a homoclinic orbit to torus $\mathcal{T}_{I_0}$, see lemma 2.3 of [6].
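The unperturbed homoclinic family $q_\theta(t) = 4\arctan(\exp(t - \theta))$ can be checked numerically. The sketch below assumes that for $\mu = 0$ the pendulum equation is $\ddot q = \sin q$, as follows from the unperturbed energy $p^2/2 + (\cos q - 1)$ appearing in the description of the whiskers; the function names and the finite-difference step are illustrative choices. It verifies that the orbit crosses the section $q = \pi$ at $t = \theta$, has zero energy, and satisfies the equation up to discretization error.

```python
import math


def q_hom(t, theta=0.0):
    """Unperturbed homoclinic solution q_theta(t) = 4*arctan(exp(t - theta))."""
    return 4.0 * math.atan(math.exp(t - theta))


def q_hom_dot(t, theta=0.0):
    """Its time derivative, 2/cosh(t - theta)."""
    return 2.0 / math.cosh(t - theta)


def energy(t, theta=0.0):
    """Unperturbed pendulum energy p**2/2 + (cos q - 1) along the orbit (should vanish)."""
    p = q_hom_dot(t, theta)
    return 0.5 * p * p + (math.cos(q_hom(t, theta)) - 1.0)


def ode_residual(t, theta=0.0, h=1e-3):
    """Residual of q'' = sin q, with q'' approximated by a central second difference."""
    q_minus, q_mid, q_plus = (q_hom(t - h, theta), q_hom(t, theta), q_hom(t + h, theta))
    q_ddot = (q_plus - 2.0 * q_mid + q_minus) / (h * h)
    return q_ddot - math.sin(q_mid)


if __name__ == "__main__":
    for t in (-3.0, -1.0, 0.0, 1.0, 3.0):
        print(t, q_hom(t), energy(t), ode_residual(t))
```

The value $q_\theta(\theta) = \pi$ is the gluing condition used for the pseudo-homoclinic solutions, and the $2/\cosh$ profile of the velocity reflects the exponential decay to the equilibrium.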
In order to justify the dominance of the Poincaré-Melnikov function when $\mu = O(\varepsilon^p)$ one would need to extend analytically the function $F_\mu(A, \theta)$ for complex values of the variables. Since the condition $q^\mu_{A,\theta}(\mathrm{Re}\,\theta) = \pi$, appearing naturally when trying to extend the definition of $q^\mu_{A,\theta}$ to $\theta \in \mathbb{C}$, breaks analyticity, the function $F_\mu(A, \theta)$ cannot be easily analytically extended in a sufficiently wide complex strip. To overcome this problem, in [6] the Lagrangian action functional is evaluated on different "1-bump pseudo-homoclinic solutions" $Q^\mu_{A,\theta}$. Define $\psi_0 : \mathbb{R} \to \mathbb{R}$ by $\psi_0(t) = \cosh^2(t)/(1 + \cosh t)^3$ and set $\psi_\theta(t) = \psi_0(t - \theta)$. Two important properties of the function $\psi_0(t)$ are that $\int_{\mathbb{R}} \psi_0(t)\, \dot q_0(t)\, dt \neq 0$ and that it can be extended to a holomorphic function on $\mathbb{R} + i(-\pi, \pi)$ (while the homoclinic solution $q_0(t)$ can be extended to a holomorphic function only up to $\mathbb{R} + i(-\pi/2, \pi/2)$). By the Contraction Mapping Theorem there exist (lemma 4.1 of [6]) near $q_\theta$, for $\mu$ small enough, pseudo-homoclinic solutions $Q^\mu_{A,\theta}(t)$ and a constant $\alpha^\mu_{A,\theta}$. We define the function $\tilde F_\mu : \mathbb{T}^n \times \mathbb{R} \to \mathbb{R}$ as the action functional of Lagrangian (2.3) evaluated on the "1-bump pseudo-homoclinic solutions" $Q^\mu_{A,\theta}(t)$, and the function $\tilde G_\mu : \mathbb{T}^n \to \mathbb{R}$, see (2.7).
Remark 2.2. Critical points of $\tilde G_\mu$ also give rise to homoclinic solutions to torus $\mathcal{T}_{I_0}$, see lemma 4.2 of [6]. By theorem 2.1 below, from a geometrical point of view the introduction of the "homoclinic function" $\tilde G_\mu$ may be interpreted simply as measuring the splitting with a non constant Poincaré section, see the introduction of [6].
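The pairing of $\psi_0$ with the velocity of the unperturbed homoclinic can be evaluated numerically. The sketch below assumes that the relevant integrand is $\psi_0(t)\,\dot q_0(t)$ with $\dot q_0(t) = 2/\cosh t$ (the derivative of $q_0(t) = 4\arctan e^t$); the function names, the truncation of the real line and the step size are illustrative choices. Since both factors are positive the integral is strictly positive, and the identity $1 + \cosh t = 2\cosh^2(t/2)$ gives the value $4/5$ by a direct computation.

```python
import math


def psi0(t):
    """psi_0(t) = cosh(t)**2 / (1 + cosh(t))**3."""
    c = math.cosh(t)
    return c * c / (1.0 + c) ** 3


def q0_dot(t):
    """Velocity of the unperturbed homoclinic q_0(t) = 4*arctan(exp(t))."""
    return 2.0 / math.cosh(t)


def overlap_integral(half_width=40.0, step=1e-2):
    """Trapezoidal approximation of the integral of psi_0 * q0_dot over the real line."""
    n = int(round(2.0 * half_width / step))
    total = 0.0
    for i in range(n + 1):
        t = -half_width + i * step
        weight = 0.5 if i in (0, n) else 1.0  # trapezoid end-point weights
        total += weight * psi0(t) * q0_dot(t)
    return total * step


if __name__ == "__main__":
    print(overlap_integral())
```

The product $\psi_0 \dot q_0 = 2\cosh t/(1 + \cosh t)^3$ has its nearest complex singularities at $t = \pm i\pi$, which is consistent with the wider strip of analyticity mentioned for $\psi_0$.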
The crucial point is now to observe that the homoclinic functions $G_\mu$ and $\tilde G_\mu$ are the same up to a change of variables close to the identity, as stated by the following theorem (see theorem 4.1 of [6]).
Theorem 2.1. For $\mu$ small enough (independently of $\omega_\varepsilon$) there exists a Lipschitz homeomorphism (a real analytic diffeomorphism if $f$ is analytic) $\psi_\mu$ of the torus, close to the identity, such that $G_\mu$ and $\tilde G_\mu$ coincide up to the change of variables $\psi_\mu$.
Develop in Fourier series w.r.t. the first variable the homoclinic function and the Poincaré-Melnikov primitive defined in (2.8). Assume that the perturbation $f$ is analytic w.r.t. $(\varphi_2, \ldots, \varphi_n)$. More precisely assume that there exist
The following theorem about the splitting of stable and unstable manifolds in systems with three time scales holds (see theorem 5.1 of [6]).
Theorem 2.2. For $\mu \|f\| \varepsilon^{-3/2}$ small there holds
In order to prove our shadowing theorem we also need to recall the definition of the k-bump pseudo-homoclinic solutions $q^L_{A,\theta}(t)$ for the quasi-periodically forced pendulum (2.2). Such pseudo-solutions turn $k$ times along the separatrices and are asymptotic to the equilibrium 0, mod $2\pi$, for $t \to \pm\infty$. More precisely, in lemma 2.4 of [6] it is proved that for all $k \in \mathbb{N}$, for all $\theta_1 < \ldots < \theta_k$ with $\min_i(\theta_{i+1} - \theta_i) > L$, with $L$ sufficiently large, independent of $\omega_\varepsilon$ and $\mu$, there exists a unique pseudo-homoclinic solution $q^L_{A,\theta}(t) : \mathbb{R} \to \mathbb{R}$ which is a true solution of (2.2) in each interval between consecutive times $\theta_i$. Such pseudo-homoclinic orbits are found via the Contraction Mapping Theorem, as small perturbations of a chain of "1-bump homoclinic solutions" $q^\mu_{A,\theta_i}$.
Then we consider the Lagrangian action functional evaluated on these pseudo-homoclinic orbits $q^L_{A,\theta}$, which defines a function $F^k_\mu(A, \theta)$ depending on $n + k$ variables. Setting $e_k = (1, \ldots, 1) \in \mathbb{R}^k$, an invariance property, inherited from the autonomy of $H_\mu$, holds, see (2.9). By lemma 2.1, in order to get heteroclinic solutions connecting $\mathcal{T}_{I_0}$ to $\mathcal{T}_{I_0'}$, we need to find critical points of $F^k_\mu(A, \theta)$. When $\min_i(\theta_{i+1} - \theta_i) \to +\infty$ the "k-bump homoclinic function" $F^k_\mu(A, \theta)$ turns out to be well approximated simply by the sum of the functions $F_\mu(A, \theta_i)$ according to the following lemma. We set $\theta_0 = -\infty$ and $\theta_{k+1} = +\infty$. Note that the estimate given in what follows is independent of $\varepsilon$.

Lemma 2.2. There exist positive constants such that $F^k_\mu(A, \theta)$ is approximated by the sum of the functions $F_\mu(A, \theta_i)$, with the error estimate (2.12).
3. The splitting condition. We now give a general "splitting condition" on the homoclinic function $G_\mu$ well suited to describe the non-uniform splitting of stable and unstable manifolds which takes place in systems with three time scales. Roughly, the "splitting condition" 3.1 below states that $G_\mu$ possesses a maximum and provides explicit estimates of the non-uniform splitting. It will be used, in the next section, to prove the shadowing theorem 4.1. As a paradigmatic example, we will verify, in lemma 3.2, that, when the perturbation is $f(\varphi) = \sum_{j=1}^n \cos \varphi_j$, the "splitting condition" is satisfied, see also remark 3.1.
We have
where $k_\mu(a_1, \ldots, a_n)$ :
Assume that $\tilde G_\mu$ satisfies condition 3.1 with maps $l_1$, $l_2$. For all $x = (a_2, \ldots, a_n)$
is a homeomorphism from the interval $(\tilde l_1(x), \tilde l_2(x))$ to the interval $(l_1(x), l_2(x))$, where
There results that, for all $x = (a_2, \ldots, a_n)$
Therefore $G_\mu$ satisfies the splitting condition 3.1, with maps $l_j$ replaced by $\tilde l_j$, and the same positive parameters. Since
We now give a paradigmatic example where the former "splitting condition" is satisfied. Assume that the perturbation $f$ is given by $f(\varphi_1, \ldots, \varphi_n) = \sum_{j=1}^n \cos \varphi_j$.
In the next lemma we show that the corresponding homoclinic function $\tilde G_\mu$ satisfies the "splitting condition" 3.1 and hence, by lemma 3.1, $G_\mu$ satisfies the "splitting condition" 3.1 as well.
From now on, notation $K_i$ will be used for positive universal constants, whereas notation $c_i(\delta)$ will be used for positive constants depending only on $\delta$. Notation $u = O(v)$ will mean that there exists a universal constant $K$ such that $|u| \leq K|v|$.
Our first aim is to prove expression (3.14) below. It easily results that, if
By theorem 2.2 we have
and, by (3.2), up to a constant which we shall omit,
In this proof we shall use the abbreviations
4. The shadowing theorem. In this section we shall prove, under the "splitting condition" 3.1, our general shadowing theorem.
Then, for all $I_0, I_0' \in \mathbb{R}^n$ such that $(I_0' - I_0) \in \mathrm{Span}\{\Omega_3, \ldots, \Omega_n\}$, there exists a heteroclinic trajectory from $\mathcal{T}_{I_0}$ to $\mathcal{T}_{I_0'}$ which connects an $\eta$-neighbourhood of torus $\mathcal{T}_{I_0}$ to an $\eta$-neighbourhood of torus $\mathcal{T}_{I_0'}$ in the "diffusion time" $T_d$ given in (4.1).
Remark 4.1. The diophantine condition on the frequency vector $\omega_\varepsilon$ restricts the values of $\varepsilon$ and $\beta$ that we consider. In any case, if for instance $\beta$ is $(\gamma, n - 2)$-diophantine then for $\tau \geq n - 1$ there exist $c_0 > 0$ and a sequence $\varepsilon_j \to 0$ such that $\omega_\varepsilon$ is $(\gamma_\varepsilon, \tau)$-diophantine with $\gamma_\varepsilon = c_0 \varepsilon^a$, see for example [16].
Remark 4.2. The meaning of (4.1) is the following: the diffusion time $T_d$ is estimated by the product of the number of heteroclinic transitions $k$ = (heteroclinic jump / splitting) = $O(\rho |I_0' - I_0| / \delta_3)$, and of the time $T_s$ required for a single transition, that is $T_d \approx k \cdot T_s$. The time for a single transition $T_s$ is bounded by the maximum between the "ergodization time" $(1/\gamma_\varepsilon \sigma^\tau)$, i.e. the time needed for the flow $\omega t$ to make a $\sigma$-net of the torus, and the time $\max\{|\ln \delta_1|, |\ln \delta_2|, \Delta/|\omega_\varepsilon|\}$ needed to "shadow" homoclinic orbits for the forced pendulum equation. We use here that these homoclinic orbits are exponentially asymptotic to the equilibrium.
We could also prove the existence of connecting orbits for all $I_0' - I_0 \in \mathrm{Span}\{\Omega_2, \ldots, \Omega_n\}$. In this case the number $k$ of heteroclinic transitions would depend also on $\delta_2$, see remark 4.
By lemma 2.1 and the exponential decay of the solutions of (2.2) asymptotic to the equilibrium, in order to prove the theorem, it is sufficient to find a critical point of the k-bump heteroclinic function
We introduce suitable coordinates $(a_1, a_2, a_3, s_1, \ldots, s_k) \in \mathbb{R}^3 \times (\min l_1, \max l_2)^k$ defined by (4.4), where $\eta_i$ are constants to be chosen later. Let $H^k_\mu(a, s) = F^k_\mu(A, \theta)$ be the "k-bump homoclinic function" and $\mathcal{H}^k_\mu(a, s) = \mathcal{F}^k_\mu(A, \theta)$ be the "k-bump heteroclinic function" expressed in the new variables $(a, s)$.
The function $\mathcal{H}^k_\mu$ does not depend on $a_1$, since, by the invariance property (2.9) (we recall that $\Omega$
up to an additive constant. In the sequel of the proof we shall use the abbreviation $\mathcal{H}^k_\mu = \mathcal{H}^k_\mu(0, a_2, a_3, s)$.
We now choose the constants $(\eta_1, \ldots, \eta_k) \in \mathbb{R}^k$. Note that, since $\omega_\varepsilon$ is $(\gamma_\varepsilon, \tau)$-diophantine, $\Omega_1$ satisfies the diophantine condition
Hence, by the results of [4], there exists $C > 0$ such that the "ergodization time" $T_e$ of the torus $\mathbb{T}^3$ run by the linear flow $\Omega$
In particular there exists a constant $C_2$ and there exist $\eta_i$ such that
In order to prove the theorem we just need to prove the existence of a critical point of $\mathcal{H}^k_\mu$ in $\mathbb{R}^2 \times (\min l_1, \max l_2)^k$. The upper bound of the diffusion time given in (4.1) will then be a consequence of (4.7) and (4.2). Indeed, by (4.4) and (4.7) we get that
By (4.9) there exists $C > 0$ such that the time $\theta_{i+1} - \theta_i$ "spent for a single transition" is bounded by
where
after the change of variables (4.4). We recall that $H_\mu$ is defined in condition 3.1.
The left hand side inequality in (4.7) implies that
hence, by (2.12),
We will maximize
attains its maximum over $\overline{U}$ at some point $(\bar a, \bar s) = (\bar a_2, \bar a_3, \bar s)$. It is enough to prove that $(\bar a, \bar s) \in U$.
• We first prove that for all $i$, $\bar s_i \in (l_1(\bar a_2 + y_i, \bar a_3 + z_i), l_2(\bar a_2 + y_i, \bar a_3 + z_i))$. Since $(\bar a, \bar s)$ is a maximum point of $\mathcal{H}^k_\mu$ in $\overline{U}$, for any $t \in [l_1(\bar a_2 + y_i, \bar a_3 + z_i), l_2(\bar a_2 + y_i, \bar a_3 + z_i)]$, replacing $\bar s_i$ with $t$ does not increase $\mathcal{H}^k_\mu$. Since such a substitution alters at most three terms among $S_1, \ldots, S_k$ in (4.11), we obtain, using (4.12), that for any $i$, for any
and, by condition 3.1-(i), this implies that $\bar s_i \in (l_1(\bar a + \chi_i), l_2(\bar a + \chi_i))$, where we have set $\chi_i = (y_i, z_i)$.