Exponential stability of fast driven systems, with an application to celestial mechanics

We construct a normal form suited to {\it fast driven systems}. By this we mean systems with actions ${\rm I}$, angles $\psi$, and one fast coordinate $y$, moving under the action of a vector--field $N$ which depends only on ${\rm I}$ and $y$ and has vanishing ${\rm I}$--components. In the absence of the coordinate $y$, such systems have been extensively investigated, and it is known that, after a small perturbing term is switched on, the normalised actions ${\rm I}$ have variations that are exponentially small compared to the size of the perturbation. We obtain the same result as in the classical situation, with the additional benefit that no trapping argument is needed, as no small denominator arises. We use the result to prove that, in the three--body problem, the level sets of a certain function called {\it Euler integral} have exponentially small variations over a short time, close to collisions.


Description of the results
We consider an $(n+1+m)$-dimensional vector-field $N$ which, expressed in local coordinates $(I, y, \psi) \in P = I \times Y \times \mathbb{T}^m$ (where the sets $I \subset \mathbb{R}^n$, $Y \subset \mathbb{R}$ are open and connected, and $\mathbb{T} = \mathbb{R}/(2\pi\mathbb{Z})$ is the standard torus), has the form

$$N(I, y) = v(I, y)\,\partial_y + \omega(I, y)\,\partial_\psi. \qquad (1)$$

The motion equations of $N$ can be integrated in cascade:

$$I(t) = I_0, \qquad y(t) = \eta(I_0, t), \qquad \psi(t) = \psi_0 + \int_0^t \omega(I_0, \eta(I_0, t'))\,dt', \qquad (2)$$

with $\eta(I_0, \cdot)$ being the general solution of the one-dimensional equation $\dot y(t) = v(I_0, y)$. This formula shows that along the solutions of $N$ the coordinates $I$ ("actions") remain constant, while the motion of the coordinates $\psi$ ("angles") is coupled with the motion of the "driving" coordinate $y$. We assume that $v$ is suitably far from vanishing (for the problem considered in the paper, $|v|$ has a positive lower bound). Note that, without further assumptions on the function $v$ (for example, that it is "small" or has a stationary point), nothing prevents the coordinate $y$ from moving fast. For this reason, and with a slight abuse since fast motion need not actually occur, we refer to the solutions in (2) as a fast driven system. The main risk with such systems is that the solution $q(t) = (I(t), y(t), \psi(t))$ of $N$ in (2) leaves the domain $P$ in finite time. It is then convenient to define the exit time from $P$ under $N$ or, more generally, the exit time from a given $W \subseteq P$ under $N$, denoted $t^{N,W}_{\rm ex}$, as the (possibly infinite) first time at which $q(t)$ leaves $W$. Let us now replace the vector-field $N(I, y)$ with a new vector-field of the form

$$X(I, y, \psi) = N(I, y) + P(I, y, \psi), \qquad (3)$$

where the "perturbation" $P = P_1(I, y, \psi)\,\partial_I + P_2(I, y, \psi)\,\partial_y + P_3(I, y, \psi)\,\partial_\psi$ is, in some sense, "small" (see the next section for precise statements). Let $t^{X,W}_{\rm ex}$ be the exit time from $W$ under $X$, and let $\varepsilon$ be a uniform upper bound for the absolute value of $P_1$ on $W$. Then one has a linear-in-time a-priori bound for the variations of $I$:

$$|I(t) - I(0)| \le \varepsilon\,|t| \qquad \text{for all } |t| \le t^{X,W}_{\rm ex}. \qquad (4)$$

We are interested in improving the bound (4).
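The cascade structure (2) and the a-priori bound (4) can be illustrated numerically. The sketch below is only a toy example: the functions `v`, `omega`, `P1` and the value `EPS` are hypothetical choices (not taken from the paper), with $n = m = 1$, $|v| \ge 1$, $|P_1| \le$ `EPS`, and $P_2 = P_3 = 0$ for simplicity. It integrates the unperturbed and the perturbed fields with a classical Runge-Kutta scheme and compares the drift of the action with $\varepsilon t$.

```python
import math

EPS = 1e-3  # hypothetical uniform bound for |P1|

def v(I, y):
    # hypothetical driving velocity, bounded away from zero (v >= 1)
    return 2.0 + math.sin(y)

def omega(I, y):
    # hypothetical frequency depending on the action and the driving coordinate
    return 1.0 + I * math.cos(y)

def P1(I, y, psi):
    # hypothetical action component of the perturbation, |P1| <= EPS
    return EPS * math.cos(psi + y)

def field(q, perturbed):
    # components of N (perturbed=False) or of X = N + P (perturbed=True)
    I, y, psi = q
    dI = P1(I, y, psi) if perturbed else 0.0
    return (dI, v(I, y), omega(I, y))

def rk4_step(q, h, perturbed):
    def shifted(k, c):
        return tuple(qi + c * ki for qi, ki in zip(q, k))
    k1 = field(q, perturbed)
    k2 = field(shifted(k1, 0.5 * h), perturbed)
    k3 = field(shifted(k2, 0.5 * h), perturbed)
    k4 = field(shifted(k3, h), perturbed)
    return tuple(qi + h * (a + 2 * b + 2 * c + d) / 6.0
                 for qi, a, b, c, d in zip(q, k1, k2, k3, k4))

def flow(q0, T, steps, perturbed):
    q, h = q0, T / steps
    for _ in range(steps):
        q = rk4_step(q, h, perturbed)
    return q

q0 = (0.3, 0.0, 0.0)   # initial (I, y, psi)
T = 5.0
I_free = flow(q0, T, 2000, perturbed=False)[0]  # action under N
I_pert = flow(q0, T, 2000, perturbed=True)[0]   # action under X = N + P
print(I_free - q0[0], abs(I_pert - q0[0]), EPS * T)
```

In this toy model the action is exactly conserved under $N$, while under $X$ its drift stays below the linear bound $\varepsilon t$; the results of the paper concern replacing $\varepsilon$ in that bound by an exponentially small quantity.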
To readers familiar with Kolmogorov-Arnold-Moser (KAM) or Nekhoroshev theories, this kind of problem is well known: see [3,38,44,22], or [9,20,29,25,47] for applications to realistic models. These theories were originally formulated for Hamiltonian vector-fields (and later extended to more general ODEs), hence, in particular, with $n = m$ and the coordinate $y$ absent. In those cases the unperturbed motions of the coordinates $(I, \psi)$ are

$$I(t) = I_0, \qquad \psi(t) = \psi_0 + \omega(I_0)\,t \qquad (5)$$

and the properties of the motions after the perturbing term is switched on depend on the arithmetic properties of the frequency vector $\omega(I_0)$. Under suitable non-commensurability assumptions on $\omega(I_0)$ (referred to as "Diophantine conditions"), KAM theory ensures the possibility of continuing the unperturbed motions (5) for all times. Conversely, if $\omega(I)$ satisfies, on an open set, an analytic property known as "steepness" (which is satisfied, e.g., if $\omega$ does not vanish and moreover is the gradient of a convex function), Nekhoroshev theory allows one to infer, for all orbits, a bound as in (4), with $e^{-C/\varepsilon^a}$ replacing $\varepsilon$ and $t^{X,W}_{\rm ex} = e^{C/\varepsilon^b}$, with suitable $a, b, C > 0$. It is to be remarked that in the Nekhoroshev regime the exponential scale of $t^{X,W}_{\rm ex}$ is an intrinsic consequence of steepness, which is responsible for a process known as "capture in resonance". In the case considered in the paper such a phenomenon does not seem to exist, and hence the exit time $t^{X,W}_{\rm ex}$ has no reason to be long. Nevertheless, motivated by an application to celestial mechanics described below, we are interested in replacing $\varepsilon$ in (4) with a smaller number. We shall prove the following result (note that steepness conditions are not needed here).
Theorem A Let $X = N + P$ be real-analytic, where $N$ is as in (1), with $v \not\equiv 0$. Under suitable "smallness" assumptions involving $\omega$, $\partial\omega$, $\partial v$ and $P$, the bound in (4) holds with $e^{-C/\varepsilon^a}$ replacing $\varepsilon$, with suitable $a, C > 0$.
A quantitative statement of Theorem A is given in Theorem 2.1 below. In addition, in view of our application, we also discuss a version for the case when analyticity in $\psi$ fails; this is Theorem 2.2.
In order to simplify the analysis a little, we introduce a main assumption. The Hamiltonian $H_{3b}$ in (6) includes the Keplerian term (7). We assume that this term is "leading" in the Hamiltonian. By averaging theory, this assumption allows us to replace (at the cost of a small error) $H_{3b}$ by its $\ell$-average (8)-(9), where $\ell$ is the mean anomaly associated to (7) (footnote 1), with

$$U := \frac{1}{2\pi}\int_0^{2\pi}\frac{d\ell}{\|x' - x(\ell)\|}$$

being the "simply (footnote 2) averaged Newtonian potential". We recall that the mean anomaly $\ell$ is defined as the area spanned by $x$ on the Keplerian ellipse generated by (7) relative to the perihelion $P$ of the ellipse, in $2\pi$ units. From now on we focus on the motions of the averaged Hamiltonian (9), bypassing any quantitative statement concerning the averaging procedure, as this would lead far beyond the purposes of the paper (footnote 3). Neglecting the first term in (8), which is an inessential additive constant for $H$, and reabsorbing the constant $\delta$ with a time change, we are led to look at the Hamiltonian $H$ in (9). We denote as $E$ the Keplerian ellipse generated by the Hamiltonian (7), for negative values of the energy. Without loss of generality, assume that $E$ is not a circle and (footnote 4) $\Lambda = 1$. Remark that, as the mean anomaly $\ell$ is averaged out, we lose any information concerning the position of $x$ on $E$, so we shall only need two couples of coordinates for determining the shape of $E$ and the vectors $y'$, $x'$. These are:
• the "Delaunay couple" $(G, g)$, where $G$ is the Euclidean length of $x \times y$ and $g$ detects the perihelion. We remark that $g$ is measured with respect to $x'$ (instead of with respect to a fixed direction), since the SO(2) reduction uses a rotating frame which moves with $x'$ (compare the formulae in (66) below);
• the "radial-polar couple" $(R, r)$, where $r := \|x'\|$ and $R := \frac{y' \cdot x'}{\|x'\|}$.
Using the coordinates above, the Hamiltonian in (9) becomes (10), where $C = \|x \times y + x' \times y'\|$ is the total angular momentum of the system, and we have assumed The Hamiltonian (10) now has 2 degrees of freedom. As the energy is conserved, its motions evolve on the 3-dimensional manifolds $M_c = \{H = c\}$. On each such manifold the evolution is associated to a 3-dimensional vector-field $X_c$, given by the velocity field of some triple of coordinates on $M_c$. As an example, one can take the triple $(r, G, g)$, even though a more convenient choice will be made below. To describe the motions we are looking for, we need to recall a remarkable property of the function $U$, pointed out in [40]. First of all, one has to note that $U$ is integrable, as it is a function of $(r, G, g)$ only. But the main point is that there exists a function $F$ of two arguments such that

$$U(r, G, g) = F(E(r, G, g), r) \qquad (11)$$

where

$$E(r, G, g) = G^2 + r\sqrt{1 - G^2}\,\cos g. \qquad (12)$$

[Footnote 1] Remark that $y(\ell)$ has vanishing $\ell$-average, so that the last term in (6) does not survive.

[Footnote 2] Here, "simply" is used as opposed to the more familiar "doubly" averaged Newtonian potential, most often encountered in the literature; e.g. [27,16,39,13,12].

[Footnote 3] As we consider a region in phase space where $x'$ is very close to the instantaneous Keplerian orbit of $x$, quantifying the values of the mass parameters and the distance which allow for the averaging procedure is a delicate (even though crucial) question, which, by its nature, demands careful use of regularisations. Due to the non-trivial underlying analysis, we choose to limit ourselves to pointing out that the renormalizable integrability of the Newtonian potential has a nontrivial dynamical impact on the simply averaged three-body problem, which explains the existence of the motions discussed here, which would not be justified otherwise.

[Footnote 4] We can do this as the Hamiltonian $H_{3b}$ rescales by a factor $\beta^{-2}$ as $(y', y) \to \beta^{-1}(y', y)$ and $(x', x) \to \beta^{2}(x', x)$.
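The statement that $U$ depends on $(G, g)$ only through the Euler integral can be checked numerically. The sketch below is an illustration under stated assumptions, not the construction used in the paper: the ellipse has unit semi-major axis (consistent with $\Lambda = 1$), the averaged body is parametrised by the eccentric anomaly $\xi$, and the point $x'$, at distance $r$ from the focus, is taken in the direction of true anomaly $\pi - g$ (the convention under which $E = r$ corresponds to a collision). The helper names are hypothetical. It evaluates $U$ at two different points of the same level set of $E(r, \cdot, \cdot)$ and compares the values.

```python
import math

def euler_integral(r, G, g):
    # E(r, G, g) = G^2 + r * sqrt(1 - G^2) * cos(g)
    return G * G + r * math.sqrt(1.0 - G * G) * math.cos(g)

def averaged_potential(r, G, g, n=2000):
    # U = (1/2pi) * integral over the mean anomaly of 1/|x' - x|,
    # rewritten over the eccentric anomaly xi (d ell = (1 - e cos xi) d xi)
    # and approximated by the periodic trapezoidal rule.
    e = math.sqrt(1.0 - G * G)
    total = 0.0
    for k in range(n):
        xi = 2.0 * math.pi * k / n
        rho = 1.0 - e * math.cos(xi)  # distance of x from the focus
        # rho * cos(nu + g), with nu the true anomaly of x
        proj = (math.cos(xi) - e) * math.cos(g) - G * math.sin(xi) * math.sin(g)
        dist2 = r * r + 2.0 * r * proj + rho * rho  # |x' - x|^2
        total += rho / math.sqrt(dist2)
    return total / n

def G_on_level(r, g, level, lo, hi, iters=200):
    # bisection for G in [lo, hi] with E(r, G, g) = level,
    # assuming E - level is negative at lo and positive at hi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if euler_integral(r, mid, g) < level:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

r = 0.3
G1, g1 = 0.8, 0.3
level = euler_integral(r, G1, g1)
g2 = 1.0
G2 = G_on_level(r, g2, level, 0.5, 0.99)  # another point on the same E-level
U1 = averaged_potential(r, G1, g1)
U2 = averaged_potential(r, G2, g2)
print(level, U1, U2)
```

The two values of $U$ agree up to quadrature error, as expected from (11). The parameters are chosen away from the level $E = r$, where the integrand becomes singular.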
The function $E$ is referred to as the Euler integral, and we express (11) by saying that $U$ is renormalizably integrable via the Euler integral. This circumstance implies that the level sets of $E$, namely the curves

$$E(r, G, g) = \text{const} \qquad (13)$$

are also level sets of $U$. On the other hand, the phase portrait of (13) keeping $r$ fixed is completely explicit and has been studied in [41]. We recall it now. Let us fix (by periodicity in $g$) the strip $[-\pi, \pi] \times [-1, 1]$. For $0 < r < 1$ or $1 < r < 2$ it includes two minima $(\pm\pi, 0)$ on the $g$-axis, two symmetric maxima on the $G$-axis and one saddle point at $(0, 0)$. When $r > 2$ the saddle point disappears and $(0, 0)$ becomes a maximum. The phase portrait includes two separatrices when $0 < r < 1$ or $1 < r < 2$, and one separatrix if $r > 2$. These are the level sets

$$S_0(r) = \{E(r, G, g) = r\}, \qquad S_1(r) = \{E(r, G, g) = 1\},$$

with $S_0(r)$ being the separatrix through the saddle and $S_1(r)$ the level set through circular orbits. Rotational motions in between $S_0(r)$ and $S_1(r)$ exist only for $0 < r < 1$. The minima and the maxima are surrounded by librational motions, and different motions (librations about different equilibria, or rotations) are separated by $S_0(r)$ and $S_1(r)$. All of this is represented in Figure 1. In Figure 2 the same level sets are drawn in the 3-dimensional space $(r, G, g)$. The spatial visualisation turns out to be useful for the purposes of the paper, as the coordinate $r$, which stays fixed under $E$, is instead moving under $H$, due to its dependence on $R$; see (10). We denote as $S_0$ the union of all the $S_0(r)$ with $0 \le r \le 2$. It is to be noted that, while $E$ is perfectly defined along $S_0$, $U$ is not. Indeed, as

$$S_0(r) = \{(G, g) : G^2 + r\sqrt{1 - G^2}\,\cos g = r\}, \qquad (14)$$

we have (footnote 5) $U(r, G, g) = \infty$ for $(G, g) \in S_0(r)$, for all $0 \le r \le 2$. The natural question now arises whether any of the $E$-levels in Figure 2 is an "approximate" invariant manifold for the Hamiltonian $H$ in (10). In [42] and [14] a positive answer has been given for the case $r > 2$, corresponding to panels (c).
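The description of the equilibria can be cross-checked directly from the formula $E(r, G, g) = G^2 + r\sqrt{1 - G^2}\cos g$. The following sketch (helper names are hypothetical) estimates the second derivatives of $E$ at $(g, G) = (0, 0)$ by central differences, where one expects $\partial_g^2 E = -r$ and $\partial_G^2 E = 2 - r$, so that the origin is a saddle for $r < 2$ and a maximum for $r > 2$. It also evaluates $E$ at the points $(g, G) = (0, \pm\sqrt{1 - r^2/4})$, where $\partial_G E = 0$ on the $G$-axis and $E = 1 + r^2/4$, and at $(\pm\pi, 0)$, where $E = -r$.

```python
import math

def euler_integral(r, G, g):
    # E(r, G, g) = G^2 + r * sqrt(1 - G^2) * cos(g)
    return G * G + r * math.sqrt(1.0 - G * G) * math.cos(g)

def second_derivatives_at_origin(r, h=1e-4):
    # central second differences of E at (g, G) = (0, 0);
    # the mixed derivative vanishes there because E is even in g
    E0 = euler_integral(r, 0.0, 0.0)
    E_gg = (euler_integral(r, 0.0, h) - 2.0 * E0 + euler_integral(r, 0.0, -h)) / h**2
    E_GG = (euler_integral(r, h, 0.0) - 2.0 * E0 + euler_integral(r, -h, 0.0)) / h**2
    return E_gg, E_GG

def classify_origin(r):
    E_gg, E_GG = second_derivatives_at_origin(r)
    if E_gg < 0 and E_GG < 0:
        return "maximum"
    if E_gg * E_GG < 0:
        return "saddle"
    return "other"

def maximum_on_G_axis(r):
    # stationary point on the G-axis (g = 0), available for 0 < r < 2
    G_max = math.sqrt(1.0 - r * r / 4.0)
    return G_max, euler_integral(r, G_max, 0.0)

for r in (0.5, 1.5, 2.5):
    print(r, classify_origin(r), euler_integral(r, 0.0, math.pi))
```

For $r = 0.5$ and $r = 1.5$ the origin is classified as a saddle, with separatrix level $E(r, 0, 0) = r$, while for $r = 2.5$ it is a maximum, in agreement with the phase portrait described above.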
In this paper, we want to focus on motions close to $S_0$ with $r$ in a left neighbourhood of 2 (panels (b)). This portion of phase space is denoted as $C$. By the discussion above, motions in $C$ are to be understood as "quasi-collisional". To state our result, we denote as $r_s(A)$ the value of $r$ such that the area encircled by $S_0(r_s(A))$ is $A$. Then the set $\{\exists A : r = r_s(A)\}$ corresponds to $S_0$. We prove:

[Footnote 5] Rewriting (14) as $r = \frac{G^2}{1 - \sqrt{1 - G^2}\,\cos g}$ tells us that $(G, g) \in S_0(r)$ if and only if $x'$ occupies in the ellipse $E$ the position with true anomaly $\nu = \pi - g$.

Theorem B Inside the region $C$ there exists an open set $W$ such that, along any motion with initial datum in $W$ and for all $t$ with $|t| \le t^{X,W}_{\rm ex}$, the ratio between the absolute variation of the Euler integral $E$ from time 0 to time $t$ and the a-priori bound $\varepsilon t$ (where $\varepsilon := \|P_1\|_\infty$, with $P_1$ being the action component of the vector-field) does not exceed $C e^{-L^3/C}$, provided that the initial value of $r$ is $e^{-L}$ away from $r_s(A)$, with $L > 0$ sufficiently large.
The proof of Theorem B, fully given in the next section, relies on a careful choice of coordinates (A, y, ψ) on M c , where y is diffeomorphic to r, while (A, ψ) are the action-angle coordinates of E(r, ·, ·), such that the associated vector-field has the form in (3) with n = m = 1. The diffeomorphism r → y allows X c to keep its regularity upon S 0 .
Before turning to the proofs, we recall how the theme of collisions in N-body problems (with N ≥ 3) has been treated so far. As the literature in the field is vast, we by no means claim completeness. In the late 1890s H. Poincaré [43] conjectured the existence of special solutions in a model of the three-body problem usually referred to as the planar, circular, restricted three-body problem (PCRTBP). According to Poincaré's conjecture, when one of the primaries has a small mass µ, the orbit of an infinitesimal body approaching a close encounter with the small primary consists of two Keplerian arcs glued together so as to form a cusp. He named these second species solutions, and their existence was later proved in [4,5,6,7,8,30,26]. In the early 1900s, J. Chazy classified all the possible final motions of the three-body problem, including the possibility of collisions [10]. The study was reconsidered in [1,2]. After the advent of KAM theory, the existence of almost-collisional quasi-periodic orbits was proven [11,15,48]. The papers [45,46,17,18,31,32,33,34] deal with the rare occurrence of collisions or the existence of chaos in the proximity of collisions. In [21] it is proved that for the PCRTBP there exists an open set in phase space of fixed measure where the set of initial points which lead to collision is $O(\mu^\alpha)$-dense, for some 0 < α < 1. In [28] it is proved that, after collision regularisation, the PCRTBP is integrable in a neighbourhood of collisions. In [23,24] the result has recently been extended to the spatial version, often denoted SCRTBP.

A Normal Form Theorem for fast driven systems
In the next Sections 2.1-2.4 we state and prove a Normal Form Theorem (nft) for real-analytic systems. For the purpose of the paper, in Section 2.5 we generalise the result, allowing the dependence on the angular coordinate ψ to be just C * ( * ∈ N), rather than holomorphic. In all cases, we limit to the case n = m = 1. Generalisations to n, m ≥ 1 are straightforward.

Weighted norms
Let us consider a 3-dimensional vector-field (I, y, ψ) ∈ P r,σ,s : where I ⊂ R, Y ⊂ R are open and connected; T = R/(2πZ), which has the form (3). As usual, if A ⊂ R and r,s > 0, the symbols A r , T s denote the complex r, s-neighbourhoods of A,T: with B r (x) being the complex ball centred at x with radius r. We assume each X i to be holomorphic in P r,σ,s , meaning the it has a finite weighted norm defined below. If this holds, we simply write X ∈ O 3 r,σ,s .
For functions f : (I, y, ψ) ∈ I r × Y σ × T s → C, we write f ∈ O r,σ,s if f is holomorphic in P r,σ,s . We let where f = k∈Z f k (I, y)e ikψ is the Fourier series associated to f relatively to the ψ-coordinate. For ψ-independent functions or vector-fields we simply write · r,σ . For vector-fields X : (I, y, ψ) σ,s for i = 1, 2, 3. We define the weighted norms The wighted norm affords the following properties.

The Normal Form Theorem
We now state the main result of this section. Observe that the nature of the system does not give rise to any non-resonance condition or ultraviolet cut-off. We name Normal Form Theorem the following Theorem 2.1 (nft) Let u = (r, σ, s); X = N + P ∈ O 3 u and let w = (ρ, τ , t) ∈ R 3 + . Put and 6 assume that for some p ∈ N, s 2 ∈ R + , the following inequalities are satisfied: and Then, with there exists a real-analytic change of coordinates Φ such that X := Φ X ∈ O 3 u and X = N + P , with Remark 2.1 (Proof of Theorem A) Theorem 2.1 immediately implies Theorem A, with C = min{2 −7 Q −2 e −2s2 2 log 2 , t/diamY σ }, a = 2, provided that := 2 ( P w u ) 2 is of "order one" with respect to . The mentioned "smallness assumptions" correspond to conditions (18)- (21) and

The Step Lemma
We denote as the formal Lie series associated to Y , where denotes Lie brackets of two vector-fields, with being the Lie operator.
and that P is so small that Let ρ * , τ * , t * be defined via and assume Then there exists Y ∈ O 3 u * +w * such that X + := e L Y X ∈ O 3 u * and X + = N + P + , with In the next section, we shall use Lemma 2.1 in the following "simplified" form. (25) and (27) are replaced with with u + := (r − 4ρ, σ − 4τ e s2 , s − 5t) .
To prove Lemma 2.1, we look for a change of coordinates which conjugates the vector-field $X = N + P$ to a new vector-field $X_+ = N_+ + P_+$, where $P_+$ depends on the coordinates $I$ at higher orders. The procedure we follow is reminiscent of classical techniques of normal form theory, where one chooses the transformation so that $X_+ = e^{L_Y} X$, with the operator $e^{L_Y}$ being defined as in (23). As in the classical case, $Y$ will be chosen as the solution of a certain "homological equation" which allows one to eliminate the first order terms of $P$ depending on $\psi$. However, as stated in Lemma 2.1, differently from the classical situation, one can take $N = N_+$, which is another way of saying that it is possible to choose $Y$ so as to solve the homological equation (34) regardless of whether $P$ has vanishing average or not; in other words, the resonant terms of the perturbing term are also killed. Note also that no "ultraviolet cut-off" is used. Equation (34) is precisely what is discussed in Lemma 2.3 and Proposition 2.1 below. Fix $y_0 \in Y$ and $v, \omega : I \times Y \to \mathbb{R}$, with $v \not\equiv 0$. We define, formally, the operators $F_{v,\omega}$ and $G_{v,\omega}$ in (35), acting on functions $g$. Observe that, when they exist, $F_{v,\omega}$ and $G_{v,\omega}$ send zero-average functions to zero-average functions. The existence of $F_{v,\omega}$, $G_{v,\omega}$ is established by the following lemma. The proof of Lemma 2.3 is obvious from the definitions (35).
Proof We expand Y j and Z j along the Fourier basis where (J Z ) ij = ∂ j Z i are the Jacobian matrices, we rewrite (37) as Regarding (39) as equations for Y j,k , we find the solutions multiplying by e ikψ and summing over k ∈ Z we find Then, by Lemma 2.3, Multiplying the inequalities above by ρ −1 * , τ −1 * , t −1 * respectively and taking the sum, we find (38), with We recognise that, under conditions (27), ρ * , τ * , t * in (26) solve the equations above.
Taking the u 0 − u + w-weighted norms, the thesis follows.
Hence, de-homogenizating, Eliminating the common factor k k+1 and iterating k times from i = k, by Stirling, we get Then the Lie series e L Y defines an operator Proof of Lemma 2. 1 We look for Y such that X + := e L Y X has the desired properties.
By Proposition 2.2, the Lie series e L Y defines an operator The bounds on P + are obtained as follows. Using the homological equation, one finds The bound is even more straightforward.

Proof of the Normal Form Theorem
The proof of nft is obtained -following [44] -via iterate applications of the Step Lemma. At the base step, we let 7 X = X 0 := N + P 0 , w = w 0 := (ρ, τ, t) , u = u 0 := (r, σ, s) Conditions (29)-(32) are implied by the assumptions (20)- (22). We then conjugate X 0 to Then we have We assume, inductively, that, for some 1 ≤ j ≤ p, we have where with The case j = 1 trivially reduces to the identity P 1 w0 u1 = P 1 w0 u1 . We aim to apply Lemma 2.2 with u = u j as in (45) and Conditions (29), (30) and (31) are easily seen to be implied by (20), (19), (18) and the first condition in (22) combined with the inequality pη 2 < 1, implied by the choice of p. We check condition (32). By homogeneity, we see that condition (32) is met: Then the Iterative Lemma can be applied and we get Using homogeneity again to the extreme sides of this inequality and combining it with (44), (43) and (22), we get so we can take X = X p+1 , P = P p+1 , u = u p+1 .

A generalisation when the dependence on ψ is smooth
u, * , with u = (r, σ), the class of vector-fields (I, y, ψ) : In this section we generalise Theorem 2.1 to the case that X ∈ C 3 u, * . We use techniques going back to J. Nash and J. Moser [37,35,36]. First of all, we need a different definition of norms 8 and, especially, smoothing operators.

Generalised weighted norms We let
Clearly, the class O 3 r,σ,s defined in Section 2.1 is a proper subset of C 3 u, * Observe that the norms (46) still verify monotonicity and homogeneity in (16) and (17).

Smoothing
We call smoothing a family of operators As an example, as suggested in [3], one can take which, with the definitions (46)-(47), verifies the inequalities above with δ = 2. We name Generalised Normal Form Theorem (gnft) the following + and assume that for some s 1 , s 2 ∈ R + , the following inequalities are satisfied. Put 8 The series in (15) is in general diverging when f ∈ C u, * .
then assume: and Then, with there exists a real-analytic change of coordinates Φ such that X := Φ X ∈ C 3 u , * and X = N + P , with The result generalising Lemma 2.1 is Lemma 2.6 Let X = N + P ∈ C 3 u, * , with u = (r, σ), N as in (36), , K ∈ N. Assume (24) and that P is so small that Let ρ * , τ * be defined via Then there exists Y ∈ T K C 3 u * +ŵ * , * such that X + := e L Y X ∈ C 3 u * , * and X + = N + P + , with The simplified form of Lemma 2.6, corresponding to Lemma 2.2, is Lemma 2.7 (Generalised Step Lemma) Assume (24) and replace (53) and (54) with with Proof The inequalities in (56) guarantee Then (59) is implied by (55), monotonicity and homogeneity and the inequality in (58).
Let now F v,ω and G v,ω be as in (35). First of all, observe that F v,ω , G v,ω take T K C u, * to itself. Moreover, generalising Lemma 2.3,
Proof The proof follows that of Lemma 2.5, invoking Lemma 2.9 in place of Lemma 2.4 and hence replacing the $w$'s appearing as upper indices with $w_K$.
Then the Lie series e L Y defines an operator Proof of Lemma 2.6 All the remarks before Lemma 2.3 continue holding also in this case, except for the fact that, differently from Lemma 2.1 here we need a "ultraviolet cut-off" of the perturbing term. Namely, we split We choose Y so that the homological equation is satisfied. By Proposition 2.3, this equation has a solution Y ∈ T K C 3 u, * verifying with w * = (ρ * , τ * , t * ) as in (61). As t * = t = 1 c0 K 1+δ , We let w * ,K := w * ,ŵ * := (ρ * , τ * ) with (ρ * , τ * ) as in (54). By Proposition 2.4, the Lie series e L Y defines an operator u * , * . The bounds on P + are obtained as follows. The terms e L Y 2 N w * ,K u * and e L Y 1 P w * ,K u * are treated quite similarly as (41) and (42): The moreover, here we have the term R K P , which is obviously bounded as We are finally ready for the Proof of Theorem 2.2 Analogously as in the proof of nft, we proceed by iterate applications of the Generalised Step Lemma. At the base step, we let X = X 0 := N + P 0 , w 0 := w 0,K := ρ, τ, 1 c 0 K 1+δ , u 0 := (r, σ) with X 0 = N + P 0 ∈ C 3 u0, * . We let Conditions (56)-(58) are implied by the assumptions (48)-(52). We then conjugate X 0 to X 1 = N + P 1 ∈ C 3 u1, * , where Then we have u0, , the proof finishes here. So, we assume the opposite inequality, which gives We assume, inductively, that, for some 1 ≤ j ≤ p, we have where with The case j = 1 is trivially true because it is the identity P 1 w0 u1 = P 1 w0 u1 . We aim to apply Lemma 2.7 with u = u j as in (65) and Conditions (56) and (57) correspond to (50)-(51), while (58) is implied by (52). We check condition (58). By homogeneity, we see that condition (32) is met: Then the Iterative Lemma can be applied and we get X j+1 = N + P j+1 ∈ C 3 uj+1, * , with Using homogeneity again to the extreme sides of this inequality and combining it with (64), (63) and (52), we get After p iterations, so we can take X = X p+1 , P = P p+1 , u = u p+1 .

Symplectic tools
In this section we describe various sets of canonical coordinates that are needed for our application. We remark that, during the proof of Theorem B, we shall not use any of these sets completely, but rather a "mix" of action-angle and regularising coordinates, described below.

Starting coordinates
We begin with the coordinates where: as usual, the "skew-product"); • after fixing a set of values of (y, x) where the Kepler Hamiltonian (7) takes negative values, E denotes the elliptic orbit with initial values (y 0 , x 0 ) in such set; • a is the semi-major axis of E; • P, with P = 1, the direction of the perihelion of E, assuming E is not a circle; • is the mean anomaly of x on E, defined, mod 2π, as the area of the elliptic sector spanned from P to x, normalized to 2π; • α w (u, v) is the oriented angle from u to v relatively to the positive orientation established by w, if u, v and w ∈ R 3 \ {0}, with u, v ⊥ w.
The canonical 9 character of the coordinates (66) has been discussed, in a more general setting, in [40]. The shifts π 2 and π in (66) serve only to be consistent with the spatial coordinates of [40].

Energy-time coordinates
We now describe the "energy-time" change of coordinates which integrates the function E(r, G, g) in (12), where E ("energy") denotes the generic level-set of E, while τ is its conjugated ("time") coordinate. The domain of the coordinates (67) is The extremal values of E are taken to be the minimum and the maximum of the function E for 0 ≤ r < 2. The values r and 1 have been excluded because they correspond, in the (g, G)-plane, to the curves S 0 (r) and S 1 (r) in Figure 1, where periodic motions do not exist.
The functions G(E, r, ·), g(E, r, ·) and ρ(E, r, ·) appearing in (67) are, respectively, 2τ p periodic, 2τ p periodic, 2τ p quasi-periodic, meaning that they satisfy P er : with τ p = τ p (E, r) the period, defined below. Note that one can find a unique splitting such that ρ(E, r, ·) is 2τ p -periodic. It is obtained taking The transformation (67) turns to satisfy also the following "half-parity" symmetry: In addition, when −r < E < r, one has the following "quarter-parity" The change (67) will be constructed using, as generating function, a solution of the Hamilton-Jacobi equation We choose the solution where we denote as the real roots of Note that the equation in (76) has always a positive real root all r, E as in (68), so α + (E, r) is positive. S + et generates the following equations The equations for g and r are immediate. We check the equation for τ . Letting, for short, σ(E, r) := α + (E, r), we have having let g + (E, r) := cos −1 E−σ(E,r) 2 r √ 1−σ(E,r) 2 and used, by (75), Observe that (g + , σ) are the coordinates of the point where E reaches its maximum on each level set ( Figure 1). The equation for R is analogous. Equations (77) define the segment of the transformation (67) with 0 ≤ τ ≤ τ p , where is the half-period, with The transformation is prolonged to −τ p < τ < 0 choosing the solution of (74). It can be checked that this choice provides the symmetry relation described in (72). Considering next the functions S ± k = S ± et + 2kΣ(E, r), where Σ solves 10 one obtains the extension of the transformation to τ ∈ R verifying (69).
Observe that quarter period symmetry (67), holding in the case −r < E < r, is an immediate consequence of the definitions (77). The coordinates (R, E, r, τ ) are referred to as energy-time coordinates.
The regularity of the functions G(E, r, τ ), ρ(E, r, τ ), B(E, r) and τ p (E, r), which are relevant for the paper, is studied in detail in Section 4. Their holomorphy is not discussed.

Action-angle coordinates
We look at the transformation with B(E, r) as in (71), τ p (E, r) as in (79) and A(E, r) the "action function", defined as with α + (E, r) and β(E, r) being defined in (75), (80). Geometrically, A(E, r) represents the area of the region encircled by the level curves of E in Figure 1 in the former case, the area of its complement in the second case, divided by 2π. The canonical character of the transformation (81) is recognised looking at the generating function S aa (R, E, r * , ϕ * ) = ϕ * A(E, r * ) + Rr * (82) and using the following relations (compare the formulae in (77) and (79)) which allow us to rewrite (81) as the transformation generated by (82): The coordinates (R * , A * , r * , ϕ * ) are referred to as action-angle coordinates.
Remark 3.1 We conclude this section by observing a non-negligible advantage of action-angle coordinates over energy-time coordinates, besides the obvious one of dealing with a constant period. It is the law that relates R to R * , which is (see (67), (70) and (81)) where ρ is as in (70). Here ρ * (A * , r * , ϕ * ) is a periodic function because so is the function ρ. This benefit is evident when comparing with the corresponding formula in energy-time coordinates: which would include the uncomfortable linear term B(E, r)τ . Incidentally, such a term would unnecessarily complicate the computations we are going to present in Section 6.

Regularising coordinates
In this section we define the the regularising coordinates. First of all we rewrite S 0 (r) in (14) in terms of (A * , ϕ * ): with A s (r * ) being the limiting value of A(E, r * ) when E = r * : We observe that the function A s (r * ) is continuous in [0, 2] (in particular, A s (1 − ) = A s (1 + )), with A s (0) = 0 , A s (2) = 1 and increases smoothly between those two values, as it results from the analysis of its derivative. Indeed, letting, for short, σ 0 (r * ) := r * (2 − r * ) and proceeding analogously as (78), we get We denote as A * → r s (A * ) the inverse function and we define two different changes of coordinates The transformations (88) are canonical, being generated by The coordinates (Y k , A k , y k , ϕ k ) with k = ±1 are called regularising coordinates.
Proof of Proposition 4.2 The function T 0 (κ) in (94) is studied in detail in Appendix A. Combining Lemma A.1 and Proposition A.1 and taking the κ-primitive of such relations, one obtains Proposition 4.2.

The function F(E, r)
In this section we study the function F(E, r) in (11). Specifically, we aim to prove the following Proposition 5.1 F(E, r) is well defined and smooth for all (E, r) with 0 ≤ r < 2 and −r ≤ E < 1 + r 2 4 , E = r. Moreover, there exists a number C > 0 and a neighbourhood O of 0 ∈ R such that, for all 0 ≤ r < 2 and all −r ≤ E < 1 + r 2 4 such that E − r ∈ O, To prove Proposition 5.1 we need an analytic representation of the function F, which we proceed to provide. In terms of the coordinates (66), the function U in (10) is given by (recall we have fixed Λ = 1) where ξ is the eccentric anomaly. By [40], U remains constant along the level curves, at r fixed, of the function E(r, ·, ·) in (12). Therefore, the function F(E, r) which realises (11) is nothing else than the value that U(r, ·, ·) takes at a chosen fixed point(G 0 (E, r), g 0 (E, r)) of the level set E in Figure 1. For the purposes 11 of the paper, we choose such point to be the point where the E-level curve attains its maximum. It follows from the discussion in Section 3.2 that the coordinates of such point are where α + (E, r) is as in (75). Replacing (108) into (107), we obtain F(E, r) = 1 2π To study the regularity of F, it turns to be useful to rewrite the integral (109) as twice the integral on the half period [0, π] and next to make two subsequent changes of variable. The first time, with z = s(E, r) cos x. It gives the following formula, which will be used below.

Proof of Theorem B
In this section we state and prove a more precise statement of Theorem B, which is Theorem 6.1 below.
The framework is as follows:
• fix an energy level c; where t' is the new time and t the old one. The new time t' is soon renamed t;
• look at the ODE for the triple q k = (A k , y k , ψ) where A k , y k are as in (88), while ψ = ϕ * , with ϕ * as in (81) in P k , where
• the projection of P + in the plane (g, G) in Figure 1 is an inner region of S 0 (r) and r varies in an ε-left neighbourhood of 2;
• the projection of P − in the plane (g, G) in Figure 1 is an outer region of S 0 (r) and r varies in an ε-left neighbourhood of 2;
• the boundary of P k includes S 0 if L + = ∞; it has a positive distance from it if L + < +∞.
We shall prove Theorem 6.1 There exist a graph G k ⊂ P k (ε − , ε + , L − , L + , ξ) and a number L > 1 such that for any L − > L there exist ε − , ε + , L + , ξ, an open neighbourhood W k ⊃ G k such that along any orbit q k (t) such that q k (0) ∈ W k , where t ex is the first t such that q(t) / ∈ W k and is an upper bound for P 1 W k (with P 1 being the first component of P ).
Proof For definiteness, from now on we discuss the case $k = +1$ (outer orbits). The case $k = -1$ (inner orbits) is similar. We omit the subscript "+1" everywhere. As the proof is long and technical, we divide it into steps. We shall take

Step 1. The vector-field $X$
As $\psi$ is one of the action-angle coordinates, while $A$, $y$ are two among the regularising coordinates, we need the expressions of the Hamiltonian (10) written in terms of those two sets. The Hamiltonian (10) written in action-angle coordinates is with $\varphi_{\rm aa}$ as in (81), while $G(E, r, \tau)$, $F(E, r)$ are as in (67), (11), respectively, and $\rho_*$ is as in (85). The Hamiltonian (10) written in regularising coordinates is we find that the evolution for the triple $q = (A, y, \psi)$ during the time $t$ is governed by the vector-field hence,

The application of nft relies on the smallness of the perturbing term $P$. In the case in point, the largest term of $P$ sits in the component $P_2$, and is precisely $\rho_{*,3}$. This function is not uniformly small. For this reason, we need to look at its zeroes and localise around them. The localisation (described in detail below) carries the holomorphic perturbation $P$ to a localised perturbation, which is smaller but no longer holomorphic. We shall apply gnft to the vector-field obtained by replacing $P$ with its localised version.
Step 3. Localisation about non-trivial zeroes of $\rho_{*,3}$
The following lemma gives some insight into the term $\rho_{*,3}$ appearing in (119). It will be proved in Appendix B.
Step 4. Bounds
The following uniform bounds follow rather directly from the definitions. Their proof is deferred to Appendix B, so as not to interrupt the flow.
Here $C$ is a number not depending on $L_-$, $L_+$, $\xi$, $\varepsilon_-$, $\varepsilon_+$, $c$, |C|, $\beta$, $\alpha$, and the norms are meant as in Section 2.5, in the domain (124). Note that the validity of (126) is subject to a condition which will be verified below.
Step 5. Application of gnft and conclusion
Fix $s_1, s_2 > 0$. Define so that the inequalities (49) are satisfied. With these choices, as a consequence of the bounds in (125)-(126), one has

We now discuss inequalities (49)-(52) and (127). We choose $s_i$, $L_\pm$, $\varepsilon_\pm$ and $K$ to be the following functions of $L$ and $\xi$, with $0 < \xi < 1 < L$: with $0 < c_1 < 1 < C_1$ and $0 < c_- < c_+ < 1$ suitably fixed, so as to have $K > 0$. A more stringent relation between $\xi$ and $L$ will be specified below. We take In view of (128), it is immediate to check that there exist suitable numbers $0 < c_1 < 1 < C_1$ depending only on $c$, $c_+$, $c_-$ and $\alpha$ such that inequalities (49)-(51) and (127) are satisfied and

An application of gnft conjugates the localised vector-field to a new vector-field $X_* = N + P_*$, with the first component of the vector $P_*$ being bounded as Using (122), (123), the fact that $P$ vanishes outside $V_\bullet$, the chain rule and the holomorphy of $P(A, y, \cdot)$, we have where $P_{(V_\bullet)_s}(A, y, \psi)$ denotes the restriction of $P(A, y, \cdot)$ to $(V_\bullet)_s$, while $s$ is the analyticity radius of $P(A, y, \cdot)$. We take $s$ so small that Then we have where we have used the inequality which will be discussed below. On the other hand, techniques analogous to the ones used to obtain (126) provide with := $C L^3 e^{-4L}$ and $0 < c < 1$. So, which is what we wanted to prove.

It remains to discuss (129). By Stirling's formula and provided that > $2\delta$, (129) is implied by These inequalities are satisfied by choosing , * and $\xi$ to be related to $L$ in such a way that = $\max\{[c_2 L^3] + 1,\ [2\delta] + 1\}$, $\frac{1}{2\pi}$

A The elliptic integrals $T_0(\kappa)$ and $j_\beta(\kappa)$
The functions $T_0(\kappa)$ in (94) and $j_\beta(\kappa)$ in (113) are complete elliptic integrals. We use this appendix to store some useful material concerning such functions. First of all, in the definition of $T_0(\kappa)$, we change the integration variable, letting $\xi \to \frac{1}{\xi}$, so as to rewrite with $G_0(\kappa)$ as in (94).
Next, we look at the complex-valued function which is easily related to $T_0(\kappa)$ and $j_0(\kappa)$:

Proof We have only to prove that $T_0(\kappa) = j_0(\kappa)$ when $0 < \kappa < 1$, as the other relations are immediate from (130) and (131). We write We deform the integration path of the first integral on the right-hand side, moving the real path $\xi \in [0, +\infty)$ to the purely imaginary line $z = iy$, with $y \in [0, +\infty)$, so that Combining this with the observation that, for $0 < \kappa < 1$, $T_0(\kappa)$ and $j_0(\kappa)$ are real while the two latter integrals in (133) are purely imaginary, we have $T_0(\kappa) = j_0(\kappa)$, as claimed.
Remark A.1 It follows from the proof of Lemma A.1 (compare (133)-(134)) that, in the sense of complex integrals, This identity can also be checked directly, using suitable changes of coordinates combined with cuts of the complex plane, in order to make the square roots single-valued in a neighbourhood of the real axis.
The advantage of looking at $g(\kappa)$ instead of $T_0(\kappa)$ is that the integration path in (131) is $\kappa$-independent, and this turns out to be useful when taking $\kappa$-derivatives. The main result of this section in this respect is the following

Proposition A.1
• Let $\kappa \in \mathbb{R} \setminus \{0, 1\}$ and let $g(\kappa)$ be as in (131). There exist two positive real numbers $R^*$, $S^*$ and two complex numbers
• Let $\beta \ge 0$, $0 < \kappa < 1$, and let $j_\beta(\kappa)$ be as in (113). There exist two positive numbers $R^*_\beta$ and $S^*_\beta \in \mathbb{R}$ and two real functions $R_\beta(\kappa)$, $S_\beta(\kappa)$ satisfying such that

Proof We prove the first statement. We distinguish two cases.

Case 1: $\kappa < 0$ or $\kappa > 1$. The integral takes real values when $\kappa < 0$ and purely imaginary ones when $\kappa > 1$: The function under the integral is bounded above by $\frac{1}{\min\{1, \sqrt{|\kappa|}\}\sqrt{\xi^4 - 1}}$ when $\kappa < 0$, and by $\frac{1}{\xi^2 - 1}$ when $\kappa > 1$. Both such bounds are integrable. Then it is possible to differentiate under the integral sign, and we obtain We change variable, letting $1 - \kappa\xi^2 = \eta$ when $\kappa < 0$ and $\kappa\xi^2 - 1 = \eta$ when $\kappa > 1$, and rewrite

Case 2: $0 < \kappa < 1$. We split $g(\kappa)$ into its real and imaginary parts. Using (132) and (135), we obtain Notice that also in this case the functions under the integrals may be bounded by integrable functions: $\frac{1}{\sqrt{\kappa}\,(y^2 + 1)}$ for the former; Then, letting $1 + \kappa y^2 = \eta$ in the first respective integrals, and $1 - \kappa\xi^2 = \eta$ in the second ones, for all $0 < \kappa < 1$.

The proof for $j_\beta(\kappa)$ is completely analogous to Case 2 above (with the difference that there is no imaginary part in that case). One finds expressions for $R_\beta(\kappa)$ and $S_\beta(\kappa)$ which verify (136).
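Case 1 of the proof above combines two standard tools, recalled here in generic form as a sketch. The symbols $f$, $h$, $J$ and $K$ are placeholders that do not appear in the paper, and the substitution rules are worked out under the assumption $\xi > 0$.

```latex
% Differentiation under the integral sign (dominated convergence).
% Assumptions (placeholders): K an open interval, f(\xi,\kappa) integrable in
% \xi \in J for every \kappa \in K and differentiable in \kappa, with the bound
% below holding for all \xi \in J and all \kappa \in K.
\[
  |\partial_\kappa f(\xi,\kappa)| \le h(\xi), \quad \int_J h(\xi)\,d\xi < \infty
  \quad\Longrightarrow\quad
  \frac{d}{d\kappa}\int_J f(\xi,\kappa)\,d\xi
    = \int_J \partial_\kappa f(\xi,\kappa)\,d\xi .
\]
% The two changes of variable of Case 1, written out for \xi > 0.
\begin{align*}
  \kappa < 0,\ \eta = 1 - \kappa\xi^2 &:\quad
    \xi = \sqrt{\frac{\eta - 1}{|\kappa|}}, \qquad
    d\xi = \frac{d\eta}{2\sqrt{|\kappa|\,(\eta - 1)}}, \\
  \kappa > 1,\ \eta = \kappa\xi^2 - 1 &:\quad
    \xi = \sqrt{\frac{\eta + 1}{\kappa}}, \qquad
    d\xi = \frac{d\eta}{2\sqrt{\kappa\,(\eta + 1)}} .
\end{align*}
```

The criterion asks for a bound on $\partial_\kappa f$ that is uniform in $\kappa$ on $K$, so in practice it is applied on compact subintervals of the range of $\kappa$ under consideration.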

B Technicalities
In this section of the appendix we prove the bounds in (125), (126) and Lemma 6.1.