On the mass-critical generalized KdV equation

We consider the mass-critical generalized Korteweg--de Vries equation $$(\partial_t + \partial_{xxx})u=\pm \partial_x(u^5)$$ for real-valued functions $u(t,x)$. We prove that if the global well-posedness and scattering conjecture for this equation failed, then, conditional on a positive answer to the global well-posedness and scattering conjecture for the mass-critical nonlinear Schr\"odinger equation $(-i\partial_t + \partial_{xx})u=\pm (|u|^4u)$, there exists a minimal-mass blowup solution to the mass-critical generalized KdV equation which is almost periodic modulo the symmetries of the equation. Moreover, we can guarantee that this minimal-mass blowup solution is either a self-similar solution, a soliton-like solution, or a double high-to-low frequency cascade solution.


Introduction
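We consider the initial-value problem for the mass-critical generalized Korteweg--de Vries (gKdV) equation $$(\partial_t + \partial_{xxx})u = \mu\,\partial_x(u^5), \qquad u(0,x) = u_0(x) \in L^2_x(\mathbb{R}), \tag{1.1}$$ where $\mu = \pm 1$ and the solution $u$ is a real-valued function of spacetime $(t,x) \in \mathbb{R}\times\mathbb{R}$. When $\mu = 1$ the equation is called defocusing, while the case $\mu = -1$ is known as focusing.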
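The name mass-critical refers to the fact that the scaling symmetry $$u(t,x) \mapsto u_\lambda(t,x) := \lambda^{1/2}u(\lambda^3 t, \lambda x) \tag{1.2}$$ leaves both the equation and the mass invariant. The mass of a solution is defined by $$M(u) := \int_{\mathbb{R}} |u(t,x)|^2\,dx \tag{1.3}$$ and is conserved under the flow. A solution $u$ to (1.1) on a time interval $I \ni 0$ obeys the Duhamel formula $$u(t) = e^{-t\partial_x^3}u_0 + \mu\int_0^t e^{-(t-s)\partial_x^3}\,\partial_x\bigl(u^5(s)\bigr)\,ds \tag{1.4}$$ for all $t \in I$. We refer to the interval $I$ as the lifespan of $u$. We say that $u$ is a maximal-lifespan solution if the solution cannot be extended to any strictly larger interval. We say that $u$ is a global solution if $I = \mathbb{R}$.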
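As a quick check of the term mass-critical, the following computation is a sketch under the assumption that the scaling is the standard one for the quintic gKdV, $u_\lambda(t,x) = \lambda^{1/2}u(\lambda^3 t, \lambda x)$. It verifies that the rescaled function solves the same equation and carries the same mass.

```latex
% Assumption: u solves (\partial_t + \partial_{xxx}) u = \mu \partial_x(u^5)
% and u_\lambda(t,x) := \lambda^{1/2} u(\lambda^3 t, \lambda x), \lambda > 0.
\begin{align*}
(\partial_t + \partial_{xxx}) u_\lambda(t,x)
  &= \lambda^{1/2}\lambda^{3}\,\bigl[(\partial_t + \partial_{xxx})u\bigr](\lambda^3 t,\lambda x) \\
  &= \lambda^{7/2}\,\mu\,\bigl[\partial_x(u^5)\bigr](\lambda^3 t,\lambda x)
   = \mu\,\partial_x\bigl(u_\lambda^5\bigr)(t,x), \\
\int_{\mathbb{R}} |u_\lambda(t,x)|^2\,dx
  &= \lambda \int_{\mathbb{R}} |u(\lambda^3 t,\lambda x)|^2\,dx
   = \int_{\mathbb{R}} |u(\lambda^3 t,y)|^2\,dy .
\end{align*}
```

The exponent $7/2$ appears on both sides because $\partial_x(u_\lambda^5)$ picks up $\lambda^{5/2}$ from the fifth power and one further factor of $\lambda$ from the derivative.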
2000 Mathematics Subject Classification. 35Q55.

Throughout this paper we will use the following notation: $$S_I(u) := \int_{\mathbb{R}}\Bigl(\int_I |u(t,x)|^{10}\,dt\Bigr)^{1/2}dx = \|u\|_{L^5_x L^{10}_t(I\times\mathbb{R})}^5.$$ In view of Theorem 1.3 below, we will also refer to $S_I(u)$ as the scattering size of $u$ on the interval $I$.

Associated to the notion of solution is a corresponding notion of blowup, which we now define. We say that a solution $u$ to (1.1) blows up forward in time if there exists a time $t_0 \in I$ such that $S_{[t_0,\sup I)}(u) = \infty$, and that $u$ blows up backward in time if there exists a time $t_0 \in I$ such that $S_{(\inf I, t_0]}(u) = \infty$. As we will see in Theorem 1.3, this precisely corresponds to the impossibility of continuing the solution (in the case of blowup in finite time) or failure to scatter (in the case of blowup in infinite time).

The local well-posedness theory for (1.1) with finite-mass initial data was developed by Kenig, Ponce, and Vega, [16]. They constructed local-in-time solutions for arbitrary initial data in $L^2_x$; however, as is the case with critical equations, the interval of time for which existence was proved depends on the profile of the initial data rather than on its norm. Moreover, they constructed global-in-time solutions for small initial data in $L^2_x$ and showed that these solutions scatter, that is, they are well approximated by solutions to the free Airy equation $(\partial_t + \partial_{xxx})u = 0$ asymptotically in the future and in the past. We summarize these results in the following theorem.
Theorem 1.3 (Local well-posedness, [16]). Given $u_0 \in L^2_x(\mathbb{R})$ and $t_0 \in \mathbb{R}$, there exists a unique maximal-lifespan solution $u$ to (1.1) with $u(t_0) = u_0$. We will write $I$ for the maximal lifespan. This solution also has the following properties:
• (Local existence) $I$ is an open neighbourhood of $t_0$.
• (Blowup criterion) If $\sup I$ is finite, then $u$ blows up forward in time; if $\inf I$ is finite, then $u$ blows up backward in time.
• (Scattering) If $\sup I = +\infty$ and $u$ does not blow up forward in time, then $u$ scatters forward in time, that is, there exists a unique $u_+ \in L^2_x(\mathbb{R})$ such that $$\lim_{t\to+\infty}\bigl\|u(t) - e^{-t\partial_x^3}u_+\bigr\|_{L^2_x(\mathbb{R})} = 0. \tag{1.5}$$ Conversely, given $u_+ \in L^2_x(\mathbb{R})$ there is a unique solution to (1.1) in a neighbourhood of infinity so that (1.5) holds.
• (Small data global existence) If $M(u_0)$ is sufficiently small, then $u$ is a global solution which does not blow up either forward or backward in time. Indeed, in this case $$S_{\mathbb{R}}(u) \lesssim M(u_0)^{5/2}.$$

Global well-posedness for large finite-mass initial data is an open question. In the case of more regular initial data, for example, $u_0 \in H^s_x(\mathbb{R})$ with $s \geq 1$, one may access higher regularity conservation laws to answer the global well-posedness question. One such conserved quantity is the energy, $$E(u(t)) := \int_{\mathbb{R}} \tfrac12 |\partial_x u(t,x)|^2 + \tfrac{\mu}{6}|u(t,x)|^6\,dx. \tag{1.6}$$ Invoking the conservation of energy, in the defocusing case one may iterate the local well-posedness theory to obtain a global solution for initial data $u_0 \in H^s_x(\mathbb{R})$ with $s \geq 1$, [16]. In the focusing case, the same argument combined with the sharp Gagliardo-Nirenberg inequality, [37], yields global well-posedness for finite-energy initial data with mass less than that of the ground state soliton, which we will discuss in a moment. In neither case does the argument yield information about the long-time behaviour of the solution.
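The conservation of mass and energy invoked above can be checked by a short formal computation. The sketch below assumes a smooth, rapidly decaying solution, so that all boundary terms vanish, and introduces the auxiliary symbol $w$ (not used elsewhere in the text) purely for bookkeeping.

```latex
% Formal computation for smooth, decaying solutions of (1.1), written as
% u_t = \partial_x w with w := -u_{xx} + \mu u^5 (w is an auxiliary symbol).
\begin{align*}
\frac{d}{dt}\int_{\mathbb{R}} u^2\,dx
  &= 2\int_{\mathbb{R}} u\,\partial_x w\,dx
   = 2\int_{\mathbb{R}} u_x u_{xx}\,dx - 2\mu\int_{\mathbb{R}} u_x u^5\,dx \\
  &= \int_{\mathbb{R}} \partial_x\bigl(u_x^2\bigr)\,dx
     - \tfrac{\mu}{3}\int_{\mathbb{R}} \partial_x\bigl(u^6\bigr)\,dx = 0,\\
\frac{d}{dt}E(u(t))
  &= \int_{\mathbb{R}} u_x u_{xt} + \mu u^5 u_t \,dx
   = \int_{\mathbb{R}} \bigl(-u_{xx} + \mu u^5\bigr)u_t\,dx
   = \int_{\mathbb{R}} w\,\partial_x w\,dx = 0 .
\end{align*}
```

For merely $L^2_x$ solutions the energy need not be finite, which is one reason the energy-based argument does not reach the scaling-critical regularity.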
The ground state is the unique positive radial solution to the elliptic equation $$\partial_x^2 Q + Q^5 = Q$$ and is given by the explicit formula $$Q(x) = \frac{3^{1/4}}{\cosh^{1/2}(2x)}. \tag{1.7}$$ The ground state plays an important role in the study of the focusing case ($\mu = -1$) of (1.1), as it gives rise to soliton solutions. More precisely, $$u(t,x) := Q(x-t)$$ is a solution to (1.1). Furthermore, it is known that when $M(u_0) > M(Q)$, solutions can blow up in finite time, [23], even for $H^1_x$ initial data.

There has been some work dedicated to lowering the regularity of the initial data for which one has global well-posedness. In [11], Fonseca, Linares, and Ponce established global well-posedness for solutions of the focusing mass-critical gKdV for initial data in $H^s(\mathbb{R})$ with $s > 3/4$ and mass less than that of the ground state solution. Recently, Farah, [10], used the I-method of Colliander, Keel, Staffilani, Takaoka, and Tao, [7], to further lower the regularity of the initial data to $s > 3/5$. In view of the fact that the mass is both scaling-critical and conserved by the flow, it is natural to endeavour to prove well-posedness for initial data in $L^2_x$, that is, when $s = 0$.

Another interesting open question is related to the asymptotic behavior of global solutions to (1.1). Intuitively, if we knew that $u(t)$ decayed to zero (in some sense) as $t \to \pm\infty$, then the nonlinearity $\partial_x(u^5(t))$ should decay even faster and so the nonlinear effects should become negligible for large times. As a result, it is widely expected that the nonlinear solution scatters to a linear solution, at least in the defocusing case; in the focusing case, the same behavior is expected for initial data with mass less than that of the ground state. More precisely, it is expected that there exist $u_\pm \in L^2_x(\mathbb{R})$ such that $$\bigl\|u(t) - e^{-t\partial_x^3}u_\pm\bigr\|_{L^2_x(\mathbb{R})} \to 0 \quad\text{as } t \to \pm\infty.$$

For critical problems it is natural to encapsulate both the well-posedness and scattering questions in the form of global spacetime bounds; the precise formulation is Conjecture 1.4 below.
Indeed, the existence of a scaling symmetry implies that there is no reference scale for time or space and hence, one should regard 'good' (i.e., profile-independent) local well-posedness and scattering as two facets of the same question. In addition to addressing global well-posedness and scattering, spacetime bounds imply a strong form of stability for the equation; see Theorem 3.1.

Conjecture 1.4 (Spacetime bounds for the mass-critical gKdV).
The defocusing mass-critical gKdV is globally well-posed for arbitrary initial data $u_0 \in L^2_x(\mathbb{R})$. In the focusing case, the same conclusion holds for initial data $u_0 \in L^2_x(\mathbb{R})$ with $M(u_0) < M(Q)$. Furthermore, in both cases, the global solution satisfies the following spacetime bounds: $$\|u\|_{L^5_x L^{10}_t(\mathbb{R}\times\mathbb{R})} \leq C(M(u_0)). \tag{1.9}$$

Conjecture 1.4 has been compared in the literature to the analogous conjecture for the mass-critical nonlinear Schrödinger equation in one space dimension. This is $$(-i\partial_t + \partial_{xx})v = \tfrac{5}{24}\,\mu\,|v|^4 v, \qquad v(0,x) = v_0(x) \in L^2_x(\mathbb{R}), \tag{1.10}$$ where $\mu = \pm 1$ and the solution $v$ is a complex-valued function of spacetime $\mathbb{R}\times\mathbb{R}$. Just as for the mass-critical gKdV, the case $\mu = 1$ is called defocusing, while the case $\mu = -1$ is known as focusing. The numerical constant $\tfrac{5}{24}$ can be changed to any other positive value by rescaling $v$. However, as will be discussed below, this specific value is convenient for exhibiting a close connection between (1.1) and (1.10). More precisely, it was observed in [6,31] that for highly oscillatory initial data, solutions of gKdV mimic those of NLS.
Note also that (1.10) is time-reversed relative to most work on this equation; positive frequencies move to the left.
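The remark about time reversal can be made concrete with plane waves. The following sketch assumes the sign conventions $(\partial_t + \partial_{xxx})u = 0$ for Airy and $(-i\partial_t + \partial_{xx})v = 0$ for Schrödinger, and uses a stationary-phase heuristic (not a rigorous statement) to locate a wave packet of frequency $\xi$ or $\eta$.

```latex
% Plane waves e^{i x \xi + i t \omega}; sign conventions as assumed in the lead-in.
\begin{align*}
\text{Airy: } (\partial_t + \partial_{xxx})\,e^{ix\xi + it\omega} = 0
  &\iff i\omega - i\xi^3 = 0 \iff \omega = \xi^3,\\
\text{Schr\"odinger: } (-i\partial_t + \partial_{xx})\,e^{ix\eta + it\omega} = 0
  &\iff \omega - \eta^2 = 0 \iff \omega = \eta^2 .
\end{align*}
% Stationary phase in the frequency variable locates a wave packet:
\begin{align*}
\partial_\xi\bigl(x\xi + t\xi^3\bigr) = 0 &\iff x = -3\xi^2 t,\\
\partial_\eta\bigl(x\eta + t\eta^2\bigr) = 0 &\iff x = -2\eta t .
\end{align*}
```

With these conventions a Schrödinger packet at frequency $\eta > 0$ sits near $x = -2\eta t$, so it moves to the left as $t$ increases, while every Airy packet moves to the left regardless of the sign of $\xi$.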
Mass and energy as defined by (1.3) and (1.6) are also conserved quantities for (1.10). Moreover, (1.10) enjoys a scaling symmetry $$v(t,x) \mapsto v_\lambda(t,x) := \lambda^{1/2}v(\lambda^2 t, \lambda x),$$ which leaves the mass invariant. In the focusing case, $$v(t,x) := e^{-it}\bigl(\tfrac{24}{5}\bigr)^{1/4}Q(x)$$ is a solution to (1.10), where $Q$ is as defined in (1.7). Note that $M(v) = 2\sqrt{6/5}\,M(Q)$.

The local theory for (1.10) was developed by Cazenave and Weissler, [4,5], who constructed local-in-time solutions for arbitrary initial data in $L^2_x$ (with the time of existence depending on the profile of the initial data) and global-in-time solutions for small initial data in $L^2_x$. For finite-energy initial data, the usual iterative argument yields global existence in the defocusing case. In the focusing case, global existence also follows from the same argument combined with the sharp Gagliardo-Nirenberg inequality for finite-energy initial data with $M(v_0) < 2\sqrt{6/5}\,M(Q)$; see [37]. For global existence results for less regular data, but still above the critical regularity, see [9,35].
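The mass threshold $2\sqrt{6/5}\,M(Q)$ can be traced back to the explicit ground state. The computation below is a sketch under two assumptions: that the elliptic equation satisfied by $Q$ is $Q'' + Q^5 = Q$, and that the focusing NLS carries the constant $c = 5/24$ in front of the nonlinearity. The symbols $R$ and $c$ are introduced here only for illustration.

```latex
% Q(x) = 3^{1/4}\cosh^{-1/2}(2x), so that Q^4 = 3\,\mathrm{sech}^2(2x).
\begin{align*}
Q'(x) &= -\tanh(2x)\,Q(x),\\
Q''(x) &= \tanh^2(2x)\,Q - 2\,\mathrm{sech}^2(2x)\,Q
        = Q - 3\,\mathrm{sech}^2(2x)\,Q = Q - Q^5,\\
M(Q) &= \sqrt{3}\int_{\mathbb{R}}\mathrm{sech}(2x)\,dx = \tfrac{\sqrt{3}}{2}\,\pi .
\end{align*}
% Ansatz (assumption): v(t,x) = e^{-it}R(x) with R := c^{-1/4}Q, c = 5/24, \mu = -1.
\begin{align*}
(-i\partial_t + \partial_{xx})v &= e^{-it}\bigl(-R + R''\bigr)
   = -e^{-it}c^{-1/4}Q^5 = -c\,|v|^4 v,\\
M(v) &= c^{-1/2}M(Q) = \sqrt{24/5}\,M(Q) = 2\sqrt{6/5}\,M(Q).
\end{align*}
```

The same identity $Q'' - Q = -Q^5$ shows that $u(t,x) = Q(x-t)$ satisfies $(\partial_t + \partial_{xxx})u = -\partial_x(u^5)$, that is, the focusing gKdV.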
The natural global well-posedness and scattering conjecture for (1.10) is the following; it is still open.

Conjecture 1.5 (Spacetime bounds for the mass-critical NLS). The defocusing mass-critical NLS is globally well-posed for arbitrary initial data $v_0 \in L^2_x(\mathbb{R})$. In the focusing case, the same conclusion holds for initial data $v_0 \in L^2_x(\mathbb{R})$ with $M(v_0) < 2\sqrt{6/5}\,M(Q)$. Furthermore, in both cases, the global solution satisfies the following spacetime bounds: $$\|v\|_{L^6_{t,x}(\mathbb{R}\times\mathbb{R})} \leq C(M(v_0)).$$

Recently, Tao [31] used the fact that solutions to (1.10) can be used to build solutions to (1.1) in order to show that Conjecture 1.4 implies Conjecture 1.5. More precisely, he showed

Theorem 1.6 (Conjecture 1.4 almost implies Conjecture 1.5, [31]). Fix $\mu = \pm 1$ and assume that Conjecture 1.4 holds for initial data $u_0 \in L^2_x(\mathbb{R})$ with $M(u_0) < m_0$, for some $m_0 > 0$. Then Conjecture 1.5 holds for initial data $v_0 \in L^2_x(\mathbb{R})$ with $M(v_0) < 2m_0$.

Note that in the defocusing case, Theorem 1.6 shows that the full version of Conjecture 1.4 implies the full version of Conjecture 1.5. In the focusing case, the result is somewhat inefficient as it only proves that the full version of Conjecture 1.4 implies Conjecture 1.5 in the small mass case $M(v_0) < 2M(Q)$, missing the desired hypothesis by a factor of $\sqrt{6/5}$.
As this theorem shows, any attack on Conjecture 1.4 must also address Conjecture 1.5, at least in some way. The approach we adopt here is to prove a form of converse to Theorem 1.6, namely Theorem 4.1, which we contend (cf. Remark 1.14) precisely isolates the role of NLS as an obstruction to proving Conjecture 1.4.
The principal thrust of this paper, however, is to provide what we believe to be an important first step toward verifying Conjecture 1.4. Our reason for such optimism stems from recent progress on other dispersive equations at critical regularity: NLW, wave maps, and more specifically, NLS.
The recent progress on the mass- and energy-critical NLS can be found in [2,8,12,14,19,20,22,25,29,33,34,36]. Here we apply the techniques developed to attack these problems to Conjecture 1.4. More precisely, using concentration-compactness techniques, we will show that if Conjecture 1.4 were to fail (but Conjecture 1.5 holds true), then there exists a minimal-mass blowup solution to (1.1). Moreover, this minimal-mass counterexample to Conjecture 1.4 has good compactness properties.
To state our results, we need the following definition.
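Definition 1.8 (Almost periodicity modulo symmetries). Fix $\mu = \pm 1$. A solution $u$ to (1.1) with lifespan $I$ is said to be almost periodic modulo symmetries if there exist functions $N : I \to \mathbb{R}^+$, $x : I \to \mathbb{R}$, and $C : \mathbb{R}^+ \to \mathbb{R}^+$ such that for all $t \in I$ and $\eta > 0$, $$\int_{|x - x(t)| \geq C(\eta)/N(t)} |u(t,x)|^2\,dx \leq \eta \quad\text{and}\quad \int_{|\xi| \geq C(\eta)N(t)} |\hat u(t,\xi)|^2\,d\xi \leq \eta.$$ We refer to the function $N$ as the frequency scale function for the solution $u$, to $x$ as the spatial center function, and to $C$ as the compactness modulus function.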
Remark 1.9. The parameter N (t) measures the frequency scale of the solution at time t, while 1/N (t) measures the spatial scale. It is possible to multiply N (t) by any function of t that is bounded both above and below, provided that we also modify the compactness modulus function C accordingly.
Remark 1.10. It follows from the Ascoli-Arzela Theorem that a family of functions is precompact in L 2 x (R) if and only if it is norm-bounded and there exists a compactness modulus function C so that   Thus, in order to prove Conjecture 1.4 (assuming that Conjecture 1.5 holds) it suffices to preclude the existence of minimal-mass blowup solutions. Employing a combinatorial argument in the spirit of [19,Theorem 1.16], one can prove that no matter how small the class of minimal-mass blowup solutions to (1.1) is, one can always find at least one of three specific enemies to Conjecture 1.4. More precisely, in Section 6, we adapt the argument given in [19] to obtain Remark 1.14. In none of the three scenarios just described is there any known connection to solutions of NLS nor any other simpler equation. This is our justification for the claim made earlier that Theorem 4.1 precisely isolates the role of NLS in Conjecture 1.4.
Using the analogue of Theorem 1.13 developed in the context of the mass-critical NLS (see [19]), it is possible to recast the role of Conjecture 1.5 in Theorems 1.11 and 1.13 as follows: Suppose Conjecture 1.4 fails; then there exists a minimal-mass blowup solution to either (1.1) or (1.10). Moreover, in the former case this solution can be taken to have the structure of one of the three scenarios listed above. In the latter case the three scenarios are very similar (cf. [19]); there is an additional Galilei symmetry and the self-similar solution has N (t) = t −1/2 .
Let us now outline the content of the remainder of the paper with a few remarks on what novelties appear in the analysis.
In Section 2 we recall the linear estimates that are essential for our arguments. In particular, we recall the linear profile decomposition for the Airy equation developed in [27]. Note that the defect of compactness arising from highly oscillatory data (cf. the parameters ξ n in Lemma 2.4) is not associated with a symmetry of our equation; by comparison, in the mass-critical NLS context, it is associated to the Galilei boost symmetry. This is the primary source of difficulty/novelty in our analysis; it is also the regime in which the connection to Conjecture 1.5 arises. An early manifestation of this nuisance appears when proving decoupling of the nonlinear profiles; see Lemma 2.6.
In Section 3 we develop a stability theory for gKdV, which controls the effect of both small perturbations to the initial data and the addition of weak forcing terms.
In Section 4 we discuss the behaviour of highly oscillatory solutions to gKdV. More precisely, we show how Conjecture 1.5 implies the existence of spacetime bounds for such solutions to gKdV. This is Theorem 4.1 and is our converse to Theorem 1.6. The relation between the proofs of these theorems will be elaborated upon there.
Section 5 is devoted to the proof of Theorem 1.11. Again, the principal differences when compared to NLS arise in the case of highly oscillatory profiles. In particular, we rely upon Lemma 2.6 and Theorem 4.1.
The proof of Theorem 1.13 appears in Section 6 and is closely modeled on the analogous reduction for NLS proved in [19].
Acknowledgements. We would like to thank Terry Tao for useful comments. The first author was supported by NSF grant DMS-0701085. The last author was supported by NSF grant DMS-0901166.

Notation and useful lemmas
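2.1. Some notation. We write $X \lesssim Y$ or $Y \gtrsim X$ to indicate that $X \leq CY$ for some constant $C > 0$. We use $O(Y)$ to denote any quantity $X$ such that $|X| \lesssim Y$. We use the notation $X \sim Y$ to mean $X \lesssim Y \lesssim X$. If $C$ depends upon some parameters, we will indicate this with subscripts; for example, $X \lesssim_u Y$ denotes the assertion that $X \leq C_u Y$ for some $C_u$ depending on $u$.

For any spacetime slab $I \times \mathbb{R}$, we use $L^q_t L^r_x(I\times\mathbb{R})$ and $L^r_x L^q_t(I\times\mathbb{R})$, respectively, to denote the Banach spaces of functions $u : I\times\mathbb{R} \to \mathbb{C}$ whose norms are $$\|u\|_{L^q_t L^r_x(I\times\mathbb{R})} := \Bigl(\int_I \Bigl(\int_{\mathbb{R}} |u(t,x)|^r\,dx\Bigr)^{q/r}dt\Bigr)^{1/q} < \infty$$ and $$\|u\|_{L^r_x L^q_t(I\times\mathbb{R})} := \Bigl(\int_{\mathbb{R}} \Bigl(\int_I |u(t,x)|^q\,dt\Bigr)^{r/q}dx\Bigr)^{1/r} < \infty,$$ with the usual modifications when $q$ or $r$ is equal to infinity. When $q = r$ we abbreviate $L^q_t L^q_x$ and $L^q_x L^q_t$ as $L^q_{t,x}$.

We define the Fourier transform on $\mathbb{R}$ by $$\hat f(\xi) := (2\pi)^{-1/2}\int_{\mathbb{R}} e^{-ix\xi} f(x)\,dx.$$ For $s \in \mathbb{R}$, we define the fractional differentiation/integral operator $$\widehat{|\partial_x|^s f}(\xi) := |\xi|^s \hat f(\xi),$$ which in turn defines the homogeneous Sobolev norms $$\|f\|_{\dot H^s_x(\mathbb{R})} := \bigl\||\partial_x|^s f\bigr\|_{L^2_x(\mathbb{R})}.$$

Linear estimates. We start by recalling the usual Kato smoothing, Strichartz, and maximal function estimates associated to the Airy propagator.

Lemma 2.1 ([13,15,16]). Let $I$ be a compact time interval and let $u : I\times\mathbb{R} \to \mathbb{R}$ be a solution to the forced Airy equation. Then we have the Kato smoothing, maximal function, and Strichartz estimates for any $t_0 \in I$.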

A linear profile decomposition.
In this subsection we record the linear profile decomposition statement from [27], which will lead to the reduction in Theorem 1.11. For a linear profile decomposition for the Schrödinger propagator, see [1,3,17,18,21,24,26]. We first recall the (non-compact) symmetries of the equation (1.1) which fix the initial surface t = 0. Definition 2.2 (Symmetry group). For any position x 0 ∈ R and scaling parameter λ > 0, we define a unitary transformation g x0,λ : Note that if u is a solution to (1.1), then T g u is a solution to (1.1) with initial data gu 0 .

Remark 2.3.
It is easy to verify that G is a group and that the map g → T g is a homomorphism. Moreover, u → T g u maps solutions to (1.1) to solutions with the same Strichartz size as u, that is, and We are now ready to record the linear profile decomposition for the Airy propagator.
Lemma 2.4 (Airy linear profile decomposition, [27]). Let {u n } n≥1 be a sequence of realvalued functions bounded in L 2 x (R). Then, after passing to a subsequence if necessary, there exist (possibly complex) functions {φ j } j≥1 ⊂ L 2 x (R), group elements g j n ∈ G, frequency parameters ξ j n ∈ [0, ∞), and times t j n ∈ R such that for all J ≥ 1 we have the decomposition where the parameters ξ j n satisfy the following property: for any 1 ≤ j ≤ J either ξ j n = 0 for all n ≥ 1, or ξ j n λ j n → ∞ as n → ∞. Here, w J n ∈ L 2 x (R) is real-valued and its linear evolution has asymptotically vanishing symmetric Strichartz norm, that is, Moreover, the following orthogonality conditions are satisfied: Remark 2.5. By analytic interpolation together with Lemma 2.1 and (2.1), we obtain that the linear evolution of the error term w J n also vanishes asymptotically in the Strichartz space t,x (R×R) = 0. Our next lemma shows that divergence of parameters in the sense of (2.2) gives decoupling of nonlinear profiles. Note that when ξ n λ n → ∞, the structure of the nonlinear profile is dictated by Theorem 4.1.
Lemma 2.6 (Decoupling for the nonlinear profiles). Let ψ j and ψ k be functions in C ∞ c (R× R). Given sequences of parameters that diverge in the sense of (2.2), we have when ξ j n λ j n → ∞ and ξ k n ≡ 0. Lastly, when ξ j n λ j n → ∞ and ξ k n λ k n → ∞. Proof. By moving the scaling symmetry onto one of the profiles (i.e., by changing variables in each of the space and time integrals) one can quickly obtain convergence to zero unless λ j n ∼ λ k n . In the case of a rapidly moving profile, one should note that for any ϑ ∈ C ∞ c (R×R), ϑ(3a n t, x + 3a 2 n t) L 5/2 x L 5 t 1 independent of the growth of a n . This follows from the fact that With λ j n ∼ λ k n , equation (2.5) follows very quickly; one merely writes down the rather lengthy formula and utilizes the fact that ξ j n λ j n → ∞. In the case of (2.4), one then sees that divergence of the spatial or temporal center parameters, in the sense of (2.2), eventually separates the supports of the two profiles. Further details can be found in a number of prior publications, including [18,21].
We now turn our attention to (2.6). The general scheme mimics that for (2.4); however, everything becomes extremely messy without one small trick. For this reason, we work through a few of the details.
Bounding ψ j and ψ k by (multiples of) the characteristic function of a suitably large square, we see that and similarly for R k n . The next step is to apply the Cauchy-Schwarz inequality to the spatial integral in (2.8). Before doing so, let us gather some information that will allow us to bound what results. First, by changing variables according to where the denominator originates from the Jacobian factor. On the other hand, just as in (2.7). Thus, combining (2.8), (2.9), and (2.10) with the Cauchy-Schwarz inequality and the fact that we may assume λ j n ∼ λ k n yields This shows convergence to zero unless √ λ j n λ k n ξ j n − ξ k n 1 (2.12) and is the origin of the second term in (2.2).
It is now not difficult to deal with the remaining two terms in (2.2); however, it is useful to observe that (2.12) and λ j n ξ j n → ∞ imply ξ j n ∼ ξ k n . Indeed, the ratio converges to one.
Proof. When G = 0, this can be proved by a simple T T * argument; however, the result seems to appear for the first time in [15], which considers a much more general setup. We note that G can be inserted a posteriori by a simple application of Minkowski's inequality.

Stability theory
An important part of the local well-posedness theory is the stability theory. By stability, we mean the following property: Given an approximate solution $\tilde u$ to (1.1) in the sense that $$(\partial_t + \partial_{xxx})\tilde u = \mu\,\partial_x(\tilde u^5) + e, \qquad \tilde u(0,x) = \tilde u_0(x) \tag{3.1}$$ with $e$ small in a suitable sense and the initial data $\tilde u_0$ close to $u_0$, there exists a genuine solution $u$ to (1.1) which stays very close to $\tilde u$ in critical spacetime norms. The question of continuous dependence of the solution upon the initial data corresponds to the case $e = 0$. Although stability is a local question, it has played an important role in all existing treatments of the global well-posedness problem for the nonlinear Schrödinger equation at critical regularity. It has also proved useful in the treatment of local and global questions for more exotic nonlinearities [32,38]. As in previous work, the stability result is an essential tool for extracting a minimal-mass blowup solution.
Theorem 3.1 (Long-time stability for the mass-critical gKdV). Let I be a time interval containing zero and letũ be a solution to (3.1) on I × R for some function e. Assume that x L 10 t (I×R) ≤ L for some positive constants M and L. Let u 0 be such that . Remark 3.2. Theorem 3.1 implies the existence and uniqueness of maximal-lifespan solutions to (1.1). It also proves that the solutions depend uniformly continuously on the initial data (on bounded sets) in spacetime norms which are critical with respect to scaling.
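To illustrate what critical with respect to scaling means for the norms used here, the following sketch checks the exponents under the assumed gKdV scaling $u_\lambda(t,x) = \lambda^{1/2}u(\lambda^3 t,\lambda x)$, with all norms taken over $\mathbb{R}\times\mathbb{R}$ for simplicity. The substitutions $s = \lambda^3 t$ and $y = \lambda x$ are the only tools used.

```latex
% Assumption: u_\lambda(t,x) := \lambda^{1/2} u(\lambda^3 t, \lambda x); norms over R x R.
\begin{align*}
\|u_\lambda\|_{L^5_xL^{10}_t}^5
 = \int_{\mathbb{R}}\Bigl(\int_{\mathbb{R}} |u_\lambda(t,x)|^{10}\,dt\Bigr)^{1/2}dx
 &= \int_{\mathbb{R}}\Bigl(\lambda^{5-3}\int_{\mathbb{R}}|u(s,\lambda x)|^{10}\,ds\Bigr)^{1/2}dx
 = \lambda\cdot\lambda^{-1}\,\|u\|_{L^5_xL^{10}_t}^5,\\
\bigl\||\partial_x|^{1/6}u_\lambda\bigr\|_{L^6_{t,x}}^6
 &= \lambda^{6(1/2+1/6)}\,\lambda^{-3-1}\,\bigl\||\partial_x|^{1/6}u\bigr\|_{L^6_{t,x}}^6
 = \bigl\||\partial_x|^{1/6}u\bigr\|_{L^6_{t,x}}^6,\\
\|u_\lambda\|_{L^\infty_tL^2_x} &= \|u\|_{L^\infty_tL^2_x}.
\end{align*}
```

All three norms are therefore unchanged by the scaling, which is why bounds in them cannot depend on any reference scale in time or space.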
The proof of a stability result is by now standard; we follow the exposition in [33]. One first obtains a short-time stability result which can be iterated to obtain a long-time stability result, as long as the number of iterations depends only on the mass and the Strichartz norm.
for some small $0 < \varepsilon < \varepsilon_0 = \varepsilon_0(M, M')$. Then there exists a solution $u$ to (1.1) on $I\times\mathbb{R}$ with initial data $u_0$ at time $t = 0$ satisfying the bounds (3.5) through (3.8).
Proof. By the local well-posedness theory, it suffices to prove (3.5) through (3.8) as a priori estimates, that is, we may assume that the solution u already exists. Also, we may assume, without loss of generality, that 0 = inf I. Let w := u −ũ. Then w satisfies the following initial-value problem For t ∈ I, we write On the other hand, Hölder's inequality yields where all spacetime norms are on [0, t] × R. Thus, we obtain , from which a continuity argument yields A(t) ε for all t ∈ I, provided ε 0 is chosen sufficiently small. This proves (3.5). Conclusion (3.6) follows from (3.5) and (3.9).
This completes the proof of the lemma.
We are now ready to complete the proof of Theorem 3.1.
Proof of Theorem 3.1. We will derive Theorem 3.1 from Lemma 3.3 by an iterative procedure. First, we assume, without loss of generality, that 0 = inf I. Now let ε 0 = ε 0 (M, 2M ′ ) be as in Lemma 3.3. Note that we have to replace M ′ by the slightly larger 2M ′ as the difference u(t) −ũ(t) in L 2 x may possibly grow in time. Divide I into N many intervals I j = [t j , t j+1 ] such that on each time interval I j we have (3.10) We will first show that N depends only on ε 0 and L, and hence only on M, M ′ , L. Indeed, for 0 ≤ j < N − 1, let f j (x) := ũ(x) L 10 t (Ij ) .
Summing the bounds in (3.11) over all subintervals I j completes the proof of the theorem.

Embedding NLS inside gKdV
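The purpose of this section is to prove the following

Theorem 4.1 (Oscillatory profiles). Assume that Conjecture 1.5 holds. Let $\phi \in L^2_x$ be a complex-valued function; in the focusing case, assume also that $M(\phi) < 2\sqrt{6/5}\,M(Q)$. Let $\{\xi_n\}_{n\geq 1} \subset (0,\infty)$ with $\xi_n \to \infty$ and let $\{t_n\}_{n\geq 1} \subset \mathbb{R}$ be such that $3\xi_n t_n$ converges to some $T_0 \in [-\infty,\infty]$. Then for $n$ sufficiently large there exists a global solution $u_n$ to (1.1) with initial data at time $t = t_n$ given by $$u_n(t_n, x) = e^{-t_n\partial_x^3}\operatorname{Re}\bigl(e^{ix\xi_n}\phi(x)\bigr). \tag{4.1}$$ Moreover, the solution obeys global spacetime bounds, and for every $\varepsilon > 0$ there exist $n_\varepsilon \in \mathbb{N}$ and $\psi_\varepsilon \in C^\infty_c(\mathbb{R}\times\mathbb{R})$ so that $$\Bigl\|u_n(t,x) - \operatorname{Re}\bigl[e^{ix\xi_n + it\xi_n^3}\,\psi_\varepsilon(3\xi_n t,\, x + 3\xi_n^2 t)\bigr]\Bigr\|_{L^5_x L^{10}_t(\mathbb{R}\times\mathbb{R})} \leq \varepsilon$$ for all $n \geq n_\varepsilon$.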
As noted in the introduction, this is a form of converse to Theorem 1.6. Let us briefly sketch the argument behind Theorem 1.6 as given in [31]: To prove Conjecture 1.5, one merely needs to prove a priori spacetime bounds for Schwartz solutions to NLS on a compact time interval. As in [6], Tao exploits the fact that such solutions can be used to build approximate solutions to gKdV of comparable size. Conjecture 1.4 controls the size of all solutions to gKdV and so also of these particular solutions. Thus Conjecture 1.5 follows.
We have glossed over two subtleties in the argument. First, the difference in scaling between NLS and gKdV means that they share no common critical spacetime norm. For this reason, the Schwartz nature of the solution and the compactness of the time interval play essential roles in Tao's argument. To prove Theorems 1.11 and 1.13, we must contend with non-Schwartz solutions and work globally in time -as extremal objects, minimalmass blowup solutions are not susceptible to a priori analysis. Overcoming these difficulties represents the principal novelty of this section.
The second subtlety stems from the necessity to use X s,b -type estimates to control the discrepancy between the NLS and gKdV evolutions. In this aspect, we borrow directly from [31]; see Lemma 4.4 below.
The remainder of this section is devoted to the proof of Theorem 4.1. Combining this with standard persistence of regularity arguments (cf. Lemma 3.10 in [32]) and the frequency localization of the initial data, we deduce that for any s ≥ 0. By the perturbation theory for the mass-critical NLS, as worked out in [32], we also have where v is the solution to (1.10) with This solution also exists, is global, and scatters by Conjecture 1.5. In particular, there exist (Note that if T 0 = ±∞, then we can identify one scattering state, namely, v ± = φ.) Using this and (4.6), we deduce that Next we use v n to build an approximate solution to gKdV, namely, Here T is a large n-independent parameter that will be chosen in due course. Our first task is to show that this is indeed almost a solution to gKdV.

We begin with the simpler large-time regime. While the cubic dispersion relation of Airy can be well approximated by a suitable quadratic polynomial (and hence Schrödinger) in a bounded frequency regime (note the frequency localization and shift in (4.8)), the minute differences are magnified over long time scales. Thus, one cannot maintain the approximation by NLS over large time intervals. The key observation to deal with this is that a positive-frequency solution which is well-dispersed (i.e., resembles a scattered wave) for NLS is also well-dispersed for gKdV. This is captured by the following lemma. x [e ixξn e −iT ∂ 2 x φ] L 6 t,x ([0,∞)×R) = 0. Proof. By the Strichartz inequality, it suffices to prove the claim when φ is a Schwartz function with compact Fourier support.
A computation reveals that and χ suppφ denotes the characteristic function of the Fourier support of φ.
Invoking the Van der Corput estimate [28, Corollary, p.334] and taking n sufficiently large, we obtain g t L ∞ x φ ξ 1/6 n (T + ξ n t) 1/2 . Thus, for n large, On the other hand, a direct computation shows x for n sufficiently large.
Interpolating between the two bounds, we get Finally, integrating with respect to time and letting T → ∞ we derive the claim.
The smallness of the linear evolution provided by Lemma 4.2 carries over easily to the nonlinear evolution: Lemma 4.3 (Good approximation to gKdV -large times). Forũ n as defined above, we have . We will only consider the contribution from t > T 3ξn to the right-hand side; negative values of t can be handled identically. By analytic interpolation together with Lemma 2.1, t,x (t>0) . Invoking (4.7) and Lemma 4.2, we derive the claim.
We now turn to showing that $\tilde u_n$ is a good approximate solution in the middle interval $|t| \leq \frac{T}{3\xi_n}$. Here we have $$(\partial_t + \partial_{xxx})\tilde u_n = \mu\,\partial_x(\tilde u_n^5) + E_n, \tag{4.9}$$ where $E_n := E^1_n + E^2_n + E^3_n$ and the errors $E^j_n$ for $1 \leq j \leq 3$ are given by
$$E^1_n := \xi_n \sum_{k=3,5} C_{1,k}\operatorname{Re}\Bigl[e^{ik\xi_n x + ik\xi_n^3 t}\bigl(|v_n|^4 v_n\bigr)(3\xi_n t,\, x + 3\xi_n^2 t)\Bigr],$$
$$E^2_n := \sum_{k=1,3,5} C_{2,k}\operatorname{Re}\Bigl[e^{ik\xi_n x + ik\xi_n^3 t}\bigl(|v_n|^4 v_n\bigr)_x(3\xi_n t,\, x + 3\xi_n^2 t)\Bigr],$$
$$E^3_n := C_3\operatorname{Re}\Bigl[e^{i\xi_n x + i\xi_n^3 t}(v_n)_{xxx}(3\xi_n t,\, x + 3\xi_n^2 t)\Bigr],$$
with absolute constants $C_{1,3}, C_{1,5}, C_{2,1}, C_{2,3}, C_{2,5}, C_3$ of inconsequential value. Note that the constant $5/24$ in front of the nonlinearity in equation (1.10) was chosen so as to cancel the 'resonant' term $k = 1$ in $E^1_n$. Using (4.5) and making the necessary change of variables yields the bounds (4.10) on $E^2_n$ and $E^3_n$. Unlike these two terms, $E^1_n$ does not converge to zero in this norm. Indeed, the simple arguments above show merely $\|E^1_n\|_{L^1_t L^2_x} \lesssim 1$. Following [31], the expedient way to deal with this error term is to alter our approximate solution $\tilde u_n$ on this middle interval; ultimately we will see that the modification is negligible in all the important norms.
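The role of the constant $5/24$ can be seen from a short formal computation. The sketch below assumes the simplified ansatz $w(t,x) = e^{ix\xi + it\xi^3}v(3\xi t, x + 3\xi^2 t)$ with $\tilde u = \operatorname{Re} w$; it ignores the frequency localization and the time cutoffs used in the actual construction, and the symbol $w$ is introduced only for this illustration.

```latex
% Assumption: w(t,x) := e^{ix\xi + it\xi^3} v(3\xi t, x + 3\xi^2 t), \tilde u := \operatorname{Re} w.
% All derivatives of v below are evaluated at (3\xi t, x + 3\xi^2 t).
\begin{align*}
\partial_t w &= e^{ix\xi+it\xi^3}\bigl(i\xi^3 v + 3\xi\,v_t + 3\xi^2 v_x\bigr),\\
\partial_{xxx} w &= e^{ix\xi+it\xi^3}(i\xi+\partial_x)^3 v
  = e^{ix\xi+it\xi^3}\bigl(-i\xi^3 v - 3\xi^2 v_x + 3i\xi\,v_{xx} + v_{xxx}\bigr),\\
(\partial_t+\partial_{xxx})w &= 3i\xi\,e^{ix\xi+it\xi^3}\bigl(-i v_t + v_{xx}\bigr)
   + e^{ix\xi+it\xi^3}v_{xxx}.
\end{align*}
% Nonlinearity: (\operatorname{Re} w)^5 = 2^{-5}(w+\bar w)^5. The terms oscillating like
% e^{\pm i(x\xi + t\xi^3)} are 2^{-5}\binom{5}{2}\bigl(|w|^4 w + |w|^4\bar w\bigr)
% = \tfrac58\operatorname{Re}(|w|^4 w).
\begin{align*}
\mu\,\partial_x\bigl[\tfrac58\operatorname{Re}(|w|^4w)\bigr]
 &= \operatorname{Re}\Bigl[\tfrac58\,\mu\,i\xi\,e^{ix\xi+it\xi^3}|v|^4v\Bigr]
    + (\text{terms without the factor }\xi),\\
3i\xi\bigl(-iv_t+v_{xx}\bigr) = \tfrac58\,\mu\,i\xi\,|v|^4v
 &\iff (-i\partial_t+\partial_{xx})v = \tfrac{5}{24}\,\mu\,|v|^4v .
\end{align*}
```

Matching the terms of order $\xi$ at the resonant frequency thus forces exactly the NLS with constant $5/24$; the leftover $v_{xxx}$ term and the terms at frequencies $3\xi$ and $5\xi$ are of the type collected in the errors $E^3_n$, $E^2_n$, and $E^1_n$.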
Lemma 4.4 (Error-Control, [31]). Let E n be as defined above and let e n be the solution to the forced Airy equation t,x (|t|≤ T 3ξn ) + e n L 5 x L 10 t (|t|≤ T 3ξn ) = 0. The proof of Lemma 4.4 uses the compactness of the time interval in an essential way. Indeed, we already see the importance of this in (4.10). As noted earlier, it is unavoidable since the norms in which v n must be estimated are not scale-invariant.
By the Strichartz inequality (Lemma 2.1), the bounds given in (4.10) suffice to control the contributions from E 2 n and E 3 n . Using linearity, one may therefore focus one's attention on E 1 n . To handle this term one uses instead the oscillatory behaviour of the terms e i3ξnx+i3ξ 3 n t and e i5ξnx+i5ξ 3 n t . Indeed, the frequencies (ω, ξ) = (3ξ 3 n , 3ξ n ) and (ω, ξ) = (5ξ 3 n , 5ξ n ) are far from the cubic ω = ξ 3 ; this fact together with X s,b -type arguments are used to yield the claim in this case. For details, see [6, Lemma 6.1] or [31, Lemma 3.1].

the Strichartz inequality, Lemma 4.4, and (4.5) yield
t,x (|t|≤ T 3ξn ) + e n L 5 Note that changing variables in the time integral is responsible for the appearance of ξ n in the denominator on the penultimate line.
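The power of $\xi_n$ produced by the change of variables can be made explicit. The sketch below assumes a generic function $V(t,x) := v(3\xi t, x + 3\xi^2 t)$ with $\xi > 0$; the symbol $V$ is introduced only for illustration.

```latex
% Assumption: V(t,x) := v(3\xi t, x + 3\xi^2 t) with \xi > 0.
% Substitute s = 3\xi t and then y = x + \xi s (note 3\xi^2 t = \xi s);
% the translation in x does not change any L^r_x norm at fixed time.
\begin{align*}
\|V\|_{L^q_{t,x}(\{|t|\le T/(3\xi)\}\times\mathbb{R})}
  &= (3\xi)^{-1/q}\,\|v\|_{L^q_{t,x}([-T,T]\times\mathbb{R})},\\
\|V\|_{L^1_tL^2_x(\{|t|\le T/(3\xi)\}\times\mathbb{R})}
  &= (3\xi)^{-1}\,\|v\|_{L^1_tL^2_x([-T,T]\times\mathbb{R})}.
\end{align*}
```

In particular, a term bounded in $L^1_t L^2_x$ for $v$ gains a full factor of $\xi^{-1}$ after the boost, whereas a term carrying an explicit prefactor $\xi$ gains nothing from this substitution alone.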
This allows us to prove thatũ n − e n is an approximate solution to gKdV on the middle time interval. Lemma 4.6 (Good approximation to gKdV -the middle interval). Letũ n and e n be as defined above. Thenũ n − e n approximately solves the gKdV equation (1.1) in the sense that  Recall that u n (t n ) is defined in (4.1).
Proof. We break the proof into two cases depending on whether or not $T_0$ is finite. Consider first the case $|T_0| < \infty$. Note that in this case we must necessarily have $t_n \to 0$ as $n \to \infty$. Requiring $T > |T_0|$ and $n$ sufficiently large, and using the definition of $\tilde u_n$ and Plancherel, we are led to estimate $$\Bigl\|e^{-t_n\partial_x^3}\bigl(e^{ix\xi_n}\phi(x)\bigr) - e^{ix\xi_n + it_n\xi_n^3}\,v_n(3\xi_n t_n,\, x + 3\xi_n^2 t_n)\Bigr\|_{L^2_x}.$$ Now recall that $3\xi_n t_n \to T_0$ and, by construction, $v(T_0) = e^{-iT_0\partial_x^2}\phi$; these combined with (4.6) yield the claim in the case $|T_0| < \infty$.
Next, we consider the case T 0 = ∞; the case T 0 = −∞ can be handled identically. Using the unitarity of e −t∂ 3 x and the calculation above, we obtain Recalling the construction of v in this case together with the fact that ξ n → ∞ by hypothesis, the dominated convergence theorem combined with (4.6) yield (4.11).
We are now in a position to apply the stability result Theorem 3.1. We begin with the case $|T_0| < \infty$, which implies that $t_n$ lies in the interval $|t| \leq \frac{T}{3\xi_n}$ for $T$ and $n$ large enough. In this case, we use $\tilde u_n - e_n$ as our approximate solution on the time interval $|t| \leq \frac{T}{3\xi_n}$. By Lemma 4.6, for $n$ sufficiently large this is an approximate solution to gKdV, while by Lemmas 4.4 and 4.7, we have asymptotic (in $n$) agreement of the initial data. Thus we obtain a solution $u_n$ to gKdV on the interval $|t| \leq \frac{T}{3\xi_n}$ which matches the initial data stated in the theorem and obeys $$\lim_{n\to\infty}\Bigl[\|u_n - \tilde u_n\|_{L^\infty_t L^2_x(|t|\leq \frac{T}{3\xi_n})} + \|u_n - \tilde u_n\|_{L^5_x L^{10}_t(|t|\leq \frac{T}{3\xi_n})} + \bigl\||\partial_x|^{1/6}(u_n - \tilde u_n)\bigr\|_{L^6_{t,x}(|t|\leq \frac{T}{3\xi_n})}\Bigr] = 0.$$ Note that we used Lemma 4.4 to remove $e_n$ from the formula above. To extend the solution $u_n$ to the whole real line, we use the formula above together with Lemma 4.3 and Theorem 3.1; moreover, $$\lim_{n\to\infty}\Bigl[\|u_n - \tilde u_n\|_{L^\infty_t L^2_x(\mathbb{R}\times\mathbb{R})} + \|u_n - \tilde u_n\|_{L^5_x L^{10}_t(\mathbb{R}\times\mathbb{R})} + \bigl\||\partial_x|^{1/6}(u_n - \tilde u_n)\bigr\|_{L^6_{t,x}(\mathbb{R}\times\mathbb{R})}\Bigr] = 0. \tag{4.12}$$ The argument in the case $T_0 = \pm\infty$ is very similar. One simply treats the three time intervals in a different order. We still obtain a global solution $u_n$ to gKdV which satisfies (4.12).
We are left with the task of constructing the compactly supported approximation to our solution. The asymmetry in the space/time exponents in L 5 x L 10 t , combined with the boost in (4.8), prevents us from using a simple density argument.
Given ε > 0, let T > 0 and n be sufficiently large so that u n L 5 x L 10 t (|t|> T 3ξn ) ≤ ε. This is possible by virtue of (4.12) and the proof of Lemma 4.3. This allows us to discount the region |t| > T 3ξn from further consideration. In light of the L 6 t,x bounds on v, we may In particular, by (4.6), for n sufficiently large depending on ε, v n − ψ ε L 6 t,x ([−T,T ]×R) ≤ 2ε. (4.13) By the triangle inequality, + v n (3ξ n t, x + 3ξ 2 n t) − ψ ε (3ξ n t, x + 3ξ 2 n t) L 5 x L 10 t (|t|≤ T 3ξn ) . The former difference converges to zero by (4.12); the latter we estimate using (4.13) as follows: v n (3ξ n t, x + 3ξ 2 n t) − ψ ε (3ξ n t, x + 3ξ 2 n t) L 5 to obtain the last inequality, we used the fact that ψ ε (t, x+ξ n t) L ∞ t has support of diameter O(ξ n ) and Lemma 2.7. When using Lemma 2.7, the boost is accounted for by using the Galilei symmetry of the Schrödinger equation and (4.5).
This completes the proof of Theorem 4.1 and with it, the section.

Reduction to almost periodic solutions
In this section we prove Theorem 1.11, which we will derive as a consequence of the following key proposition, asserting a certain compactness (modulo symmetries) in sequences of almost blowup solutions with mass converging to the critical mass from below. Then the sequence u n (t n ) has a subsequence which converges in L 2 x modulo the symmetries described in Definition 2.2.
Proof. Using the time-translation symmetry of (1.1), we may set t n = 0 for all n ≥ 1. Thus, Applying Lemma 2.4 to the sequence u n (0) (which is bounded in L 2 x (R)) and passing to a subsequence if necessary, we obtain the decomposition Comparing this with (5.3), we see that we must have φ j = 0 for j ≥ 2, that is, there is only one linear profile and it carries all the mass. Consequently, the linear profile decomposition simplifies to By applying the symmetry operation T g −1 n to u n , which does not affect the hypotheses of Proposition 5.1, we may take all g n to be the identity. Thus, (5.4) reduces to u n (0) = e −tn∂ 3 x [Re(e ixξn φ)] + w n , (5.5) for some sequence {t n } n≥1 ⊂ R, some {ξ n } n≥1 ⊂ [0, ∞) such that either ξ n ≡ 0 or ξ n → ∞, and some φ, w n ∈ L 2 x with M (w n ) → 0 (and hence S R (e −t∂ 3 x w n ) → 0) as n → ∞. Case I a). We first consider the case when ξ n → ∞ as n → ∞. By passing to a subsequence if necessary, we may assume that the sequence {3t n ξ n } n≥1 converges to some A computation reveals that Thus, invoking the Riemann-Lebesgue lemma together with the fact that, by assumption, for some sequence {t n } n≥1 ⊂ R and some φ, w n ∈ L 2 x with M (w n ) → 0. By passing to a subsequence if necessary, we may assume that the sequence {t n } n≥1 converges to some T 0 ∈ [−∞, +∞]. If T 0 ∈ (−∞, ∞), then by replacing φ by e −T0∂ 3 x φ, we may assume that T 0 = 0; moreover, absorbing the error e −tn∂ 3 x [Re(φ)] − Re(φ) into the error term w n , we may reduce to t n ≡ 0. To review, we may assume that either t n ≡ 0 or t n → ±∞. We treat these two scenarios separately.
Case I b1). Assume t n ≡ 0. Then (5.6) becomes u n (0) = Re(φ) + w n with M [Re(φ)] = M c and M (w n ) → 0 as n → ∞. This immediately implies that u n (0) converges to Re(φ) in L 2 x , thus yielding the claim of Proposition 5.1. Case I b2). Assume t n → ±∞ as n → ∞. We only present the argument for t n → ∞; the case t n → −∞ can be treated symmetrically.
Applying the stability result Theorem 3.1 (using 0 as the approximate solution and u n (0) as the initial data), we conclude lim n→∞ S ≥0 (u n ) = 0, which contradicts (5.1).
Case II. We now turn to the case where u n contains multiple profiles, namely, when We will eventually show that this leads to a contradiction. Reordering the indices in the decomposition (5.2) if necessary, we may assume that there exists 1 ≤ J 0 ≤ J such that for each 1 ≤ j ≤ J 0 we have ξ j n ≡ 0, while for J 0 < j ≤ J we have ξ j n λ j n → ∞ as n → ∞. Note that both the reordering and J 0 depend upon J. For 1 ≤ j ≤ J 0 we make the following reductions: First, refining the subsequence once for each j and using a diagonal argument, we may assume that for each j, the sequence {t j n } n≥1 converges to some T j ∈ [−∞, ∞]. If T j ∈ (−∞, ∞), then by replacing φ j by e −Tj∂ 3 x φ j , we may assume that T j = 0; moreover, absorbing the difference e −t j n ∂ 3 x [Re(φ j )] − Re(φ j ) into the error term w J n , we may assume that t j n ≡ 0. Thus, either t j n ≡ 0 or t j n → ±∞. Continuing with the case 1 ≤ j ≤ J 0 , we define the nonlinear profiles v j as follows: then v j is the maximal-lifespan solution to (1.1) that scatters forward in time to e −t∂ 3 x Re(φ j ). • If t j n → −∞, then v j is the maximal-lifespan solution to (1.1) that scatters backward in time to e −t∂ 3 x Re(φ j ). By (5.7), each v j has mass less than M c ; as a consequence it is global and S R (v j ) < ∞. Moreover, combining this with the small data theory (see Theorem 1.3) gives Mc M [Re(φ j )]. Next, for each 1 ≤ j ≤ J 0 and n ≥ 1, we introduce v j n : R × R → R defined by v j n (t) := T g j n v j (· + t j n ) (t). Each v j n is a global solution to (1.1) with initial data v j n (0) = g j n v j (t j n ). Furthermore, Now consider J 0 < j ≤ J. In this case we make the following reduction: refining the subsequence once for every j and using a diagonal argument, we may assume that for each j, the sequence {3t j n ξ j n λ j n } n≥1 converges to some Thus, by (5.7) and the Riemann-Lebesgue lemma, M (φ j ) < 2M c . This places us in the setting of Theorem 4.1. 
Hence, for n sufficiently large there exists a global solution ṽ j n to gKdV with initial data ṽ Next, we define the nonlinear profiles v j n : R × R → R by v j n (t) := T g j n ṽ j n (· + t j n ) (t). Invariance of the scattering norm under symmetries shows that (5.9) can be recast as for n sufficiently large. By Lemma 2.6, we have decoupling of the nonlinear profiles defined above. More precisely, due to the orthogonality conditions in Lemma 2.4, Note that the three cases discussed in Lemma 2.6 cover the possible relations between j, k, and J 0 . This decoupling property will allow us to show that u n may be well approximated by a sum of the v j n . To this end, we define an approximate solution Next we will show that u J n is indeed a good approximation to u n for n, J sufficiently large.

Lemma 5.2 (Asymptotic agreement with initial data). For any J ≥ 1 we have Proof. This follows directly from which is a consequence of the way the v j n were constructed.
Next we show that u J n has finite scattering size for n, J sufficiently large. Indeed, by The last step before invoking the stability result Theorem 3.1 is to check that u J n almost solves the equation.
Thus it suffices to show that which, by the triangle inequality, reduces to proving We first consider (5.15). This follows from the pointwise inequality together with Hölder's inequality, Remark 2.5, and (5.14).
We now turn to (5.16). We observe the following pointwise inequality: By Hölder's inequality combined with (5.8), (5.10), and (5.11), we see that this vanishes asymptotically as n → ∞ in L 1 x L 2 t . This proves (5.16) and completes the proof of the lemma.
We are now in a position to apply the stability result Theorem 3.1. Indeed, using (5.14) together with Lemmas 5.2 and 5.3, we deduce that for J and n sufficiently large, u J n is an approximate solution to (1.1) satisfying the hypotheses of Theorem 3.1. Thus, for n sufficiently large, we obtain S R (u n ) ≲ Mc 1, which contradicts (5.1). Thus, Case II cannot occur and we have finished the proof of Proposition 5.1.
With the Palais-Smale condition modulo symmetries in place, we are now ready to prove Theorem 1.11.
Proof of Theorem 1.11. As discussed in the introduction, failure of Conjecture 1.4 implies the existence of a critical mass M c and a sequence u n : R × R → R of solutions with M (u n ) ↗ M c and lim n→∞ S R (u n ) = +∞. Choose t n ∈ R so that S ≥tn (u n ) = S ≤tn (u n ). Then, lim n→∞ S ≥tn (u n ) = lim n→∞ S ≤tn (u n ) = ∞. (5.17) Using the time-translation symmetry of (1.1), we may take all t n = 0. Applying Proposition 5.1, and passing to a subsequence if necessary, we can locate u 0 ∈ L 2 x (R) such that u n (0) converges in L 2 x (R) modulo symmetries to u 0 ; thus, there exist group elements g n ∈ G such that g n u n (0) converges strongly in L 2 x (R) to u 0 . Applying the symmetry operation T g −1 n to the solution u n we may take all g n to be the identity, and thus u n (0) converges strongly in L 2 x (R) to u 0 . In particular this implies M (u 0 ) ≤ M c . Let u : I × R → R be the maximal-lifespan solution with initial data u(0) = u 0 as given by Theorem 1.3. We claim that u blows up both forward and backward in time. It remains to show that our solution u is almost periodic modulo symmetries. Consider an arbitrary sequence of times t ′ n ∈ I. Now, since u blows up both forward and backward in time, we have S ≥t ′ n (u) = S ≤t ′ n (u) = ∞. Applying Proposition 5.1 once again we see that u(t ′ n ) has a subsequence which converges modulo symmetries. Thus, the orbit {u(t) : t ∈ I} is precompact in L 2 x (R) modulo symmetries.

Three enemies
In this section we outline the proof of Theorem 1.13. The argument closely follows [19, §4], which may be consulted for further details.
Let v : J × R → R denote a minimal-mass blowup solution whose existence (under the hypotheses of Theorem 1.13) is guaranteed by Theorem 1.11. We denote the symmetry parameters of v by N v (t) and x v (t). We will construct our solution u by taking a subsequential limit of various normalizations of v: This solution is almost periodic modulo symmetries and has symmetry parameters Note that by the definition of almost periodicity, any sequence of t n ∈ J admits a subsequence so that v [tn] (0) converges in L 2 x . Furthermore, if u 0 denotes this limit and u : I × R → R denotes the maximal-lifespan solution with u(0) = u 0 , then u is almost periodic modulo symmetries with the same compactness modulus function as v. Lastly, Theorem 3.1 shows that v [tn] → u in critical spacetime norms (along the subsequence) uniformly on any compact subset of I.
Our first goal is to find a soliton-like solution from among the normalizations of v if this is at all possible. To this end, for any T ≥ 0, we define the quantity which measures the least possible oscillation that one can find in N v (t) on time intervals of normalized duration T .
Case 1: lim T →∞ osc(T ) < ∞. Under this hypothesis, we will be able to extract a soliton-like solution.
Then a few computations reveal that any subsequential limit u of v [tn] fulfils the requirements to be classed as a soliton-like solution in the sense of Theorem 1.13. In particular, u is global because an almost periodic (modulo symmetries) solution cannot blow up in finite time without its frequency scale function converging to infinity.
When $\operatorname{osc}(T)$ is unbounded, we must seek a solution belonging to one of the remaining two scenarios. To aid in distinguishing between them, we consider the quantity
$$a(t_0) := \frac{\inf_{t\in J:\, t\le t_0} N_v(t) + \inf_{t\in J:\, t\ge t_0} N_v(t)}{N_v(t_0)}$$
associated to each $t_0 \in J$. This measures the extent to which $N_v(t)$ decays to zero on both sides of $t_0$. Clearly, this quantity takes values in the interval $[0, 2]$. First we treat the case where $a(t_0)$ can be arbitrarily small. As we will see, this will lead to a double cascade.
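The claimed range of $a(t_0)$ can be verified in one line. The sketch below uses only that $N_v$ is positive and that $t_0$ belongs to both index sets; the shorthand $A_\pm(t_0)$ is introduced here for illustration and does not appear in the original argument.

```latex
% Hypothetical shorthand: A_-(t_0), A_+(t_0) are the two infima in a(t_0).
\begin{align*}
A_-(t_0) &:= \inf_{t\in J:\, t\le t_0} N_v(t), &
A_+(t_0) &:= \inf_{t\in J:\, t\ge t_0} N_v(t), \\
0 \le A_\pm(t_0) &\le N_v(t_0)
  && \text{(since $t = t_0$ is admissible in both infima),} \\
0 \le a(t_0) &= \frac{A_-(t_0) + A_+(t_0)}{N_v(t_0)} \le 2 .
\end{align*}
```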
Case 2: lim T →∞ osc(T ) = ∞ and inf t0∈J a(t 0 ) = 0. From the behavior of a(t 0 ) we may choose sequences t − n < t n < t + n from J so that a(t n ) → 0, N v (t − n )/N v (t n ) → 0, and N v (t + n )/N v (t n ) → 0. Next we choose times t ′ n ∈ (t − n , t + n ) so that In particular, N v (t ′ n ) ≥ (1/2) N v (t n ), which allows us to deduce that Now consider the normalizations v [t ′ n ] and let s ± n := (t ± n − t ′ n )N v (t ′ n ) 3 . From (6.3) and (6.4) we see that and N v [t ′ n ] (s ± n ) → 0 as n → ∞. Passing to a subsequence if necessary, we obtain that v [t ′ n ] converge locally uniformly to a maximal-lifespan solution u of mass M (v) defined on an open interval I containing 0, which is almost periodic modulo symmetries. Now s ± n must converge to the endpoints of the interval I, which implies that N u (t) is bounded above on I and thus, u is global. Rescaling u slightly, we may ensure that N u (t) ≤ 1 for all t ∈ R.
From the fact that osc(T ) → ∞, we see that N v (t) must show significant oscillation in neighborhoods of t ′ n . Transferring this information to u and using the upper bound on N u (t), we may conclude that lim inf t→−∞ N u (t) = lim inf t→∞ N u (t) = 0. Thus we obtain a double high-to-low frequency cascade in the sense of Theorem 1.13.
Finally, we treat the case when a(t) is strictly positive; we will construct a self-similar solution.
Case 3: lim T →∞ osc(T ) = ∞ and inf t0∈J a(t 0 ) = 2ε > 0. Let us call a t 0 ∈ J future-focusing if N v (t) ≥ εN v (t 0 ) for all t ≥ t 0 ; we call t 0 past-focusing if N v (t) ≥ εN v (t 0 ) for all t ≤ t 0 . Note that by hypothesis, every t 0 ∈ J is future-focusing, past-focusing, or possibly both.
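The last observation follows from a pigeonhole step, sketched here. The symbols $A_\pm(t_0)$ are illustrative shorthand for the two infima in the definition of $a(t_0)$ and are not notation from the original.

```latex
% Hypothetical shorthand: A_-(t_0) = inf_{t <= t_0} N_v(t),  A_+(t_0) = inf_{t >= t_0} N_v(t).
\begin{align*}
2\varepsilon \le a(t_0) = \frac{A_-(t_0) + A_+(t_0)}{N_v(t_0)}
  \quad&\Longrightarrow\quad
  \max\{A_-(t_0),\, A_+(t_0)\} \ge \varepsilon N_v(t_0), \\
A_+(t_0) \ge \varepsilon N_v(t_0)
  \quad&\Longleftrightarrow\quad
  N_v(t) \ge \varepsilon N_v(t_0) \ \text{for all } t \ge t_0
  \quad\text{(future-focusing),} \\
A_-(t_0) \ge \varepsilon N_v(t_0)
  \quad&\Longleftrightarrow\quad
  N_v(t) \ge \varepsilon N_v(t_0) \ \text{for all } t \le t_0
  \quad\text{(past-focusing).}
\end{align*}
```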
Next we argue that either all sufficiently late times are future-focusing or all sufficiently early times are past-focusing. If this were not the case, one would be able to find arbitrarily long time intervals beginning with a future-focusing time and ending with a past-focusing time. The existence of such intervals would contradict the divergence of osc(T ). We restrict our attention to the case where all t ≥ t 0 are future-focusing; the case when all sufficiently early times are past-focusing can be treated symmetrically.
Choose $T$ so that $\operatorname{osc}(T) > 2\varepsilon^{-1}$. We will now recursively construct an increasing sequence of times $\{t_n\}_{n=0}^{\infty}$ so that
$$0 < t_{n+1} - t_n \le 2\varepsilon^{-3} T N_v(t_n)^{-3} \quad\text{and}\quad N_v(t_{n+1}) \ge 2 N_v(t_n). \tag{6.5}$$
Given $t_n$, set $t_n' := t_n + \varepsilon^{-3} T N_v(t_n)^{-3}$. Then
$$J_n := \bigl[t_n' - T N_v(t_n')^{-3},\; t_n' + T N_v(t_n')^{-3}\bigr] \subseteq \bigl[t_n,\; t_n + 2\varepsilon^{-3} T N_v(t_n)^{-3}\bigr].$$
As $t_n$ is future-focusing, this allows us to conclude that $N_v(t) \ge \varepsilon N_v(t_n)$ on $J_n$, but then by the way $T$ is chosen, we may find $t_{n+1} \in J_n$ so that $N_v(t_{n+1}) \ge 2N_v(t_n)$.
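The containment of $J_n$ in the longer interval can be checked as follows. This is a sketch that uses only the future-focusing property of $t_n$ and the definition of $t_n'$; the abbreviation $\delta_n$ is introduced here for readability and is not part of the original.

```latex
% Hypothetical abbreviation: \delta_n := \varepsilon^{-3} T N_v(t_n)^{-3}, so t_n' = t_n + \delta_n.
\begin{align*}
t_n' \ge t_n \ \text{and $t_n$ future-focusing}
  \quad&\Longrightarrow\quad N_v(t_n') \ge \varepsilon N_v(t_n)
  \quad\Longrightarrow\quad T N_v(t_n')^{-3} \le \varepsilon^{-3} T N_v(t_n)^{-3} = \delta_n, \\
t_n' - T N_v(t_n')^{-3} &\ge (t_n + \delta_n) - \delta_n = t_n, \\
t_n' + T N_v(t_n')^{-3} &\le (t_n + \delta_n) + \delta_n = t_n + 2\varepsilon^{-3} T N_v(t_n)^{-3}.
\end{align*}
```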
Having obtained a sequence of times obeying (6.5), we may conclude that $t_n$ converges to a limit and $N_v(t_n)$ tends to infinity. Hence $\sup J$ is finite and $\lim_{n\to\infty} t_n = \sup J$. Moreover, elementary manipulations using (6.5) and the local constancy property also yield $\sup J - t \sim_v N_v(t)^{-3}$ for all $t_0 \le t < \sup J$.
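The convergence of the times $t_n$ is a geometric-series estimate. A brief sketch, using only the two inequalities in (6.5):

```latex
\begin{align*}
N_v(t_n) &\ge 2^{n} N_v(t_0)
  && \text{(iterate the second inequality in (6.5)),} \\
t_{n+1} - t_n &\le 2\varepsilon^{-3} T N_v(t_n)^{-3}
  \le 2\varepsilon^{-3} T N_v(t_0)^{-3}\, 8^{-n}
  && \text{(first inequality in (6.5)),} \\
\sum_{n=0}^{\infty} (t_{n+1} - t_n)
  &\le 2\varepsilon^{-3} T N_v(t_0)^{-3} \sum_{n=0}^{\infty} 8^{-n}
  = \tfrac{16}{7}\,\varepsilon^{-3} T N_v(t_0)^{-3} < \infty .
\end{align*}
```

Starting the same sum at an index $m$ instead of $0$ gives $\lim_{n\to\infty} t_n - t_m \le \tfrac{16}{7}\varepsilon^{-3} T N_v(t_m)^{-3}$, which is the upper-bound half of the comparison between $\sup J - t$ and $N_v(t)^{-3}$ along the sequence $t_m$.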
Enlarging the compactness modulus function by a bounded amount, we may redefine $N_v(t) = (\sup J - t)^{-1/3}$ for all $t_0 \le t < \sup J$.
Now consider the normalizations $v^{[t_n]}$. After passing to a subsequence if necessary, $v^{[t_n]}$ converge locally uniformly to a maximal-lifespan solution $u$ of mass $M(v)$ defined on an open interval $I$ containing $(-\infty, 1)$, which is almost periodic modulo symmetries. Moreover, the frequency scale function of $u$ obeys $N_u(s) \sim_v (1 - s)^{-1/3}$ for all $s \in (-\infty, 1)$.
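The profile $(1-s)^{-1/3}$ can be traced through the normalization. The sketch below assumes that the frequency scale function of a normalization is $N_{v^{[t_n]}}(s) = N_v\bigl(t_n + s N_v(t_n)^{-3}\bigr)/N_v(t_n)$; this is the usual convention for a time rescaling by $N_v(t_n)^{-3}$, but it is an assumption here, since the defining formula is not reproduced in this section.

```latex
% Assumption: N_{v^{[t_n]}}(s) = N_v(t_n + s N_v(t_n)^{-3}) / N_v(t_n).
% Input from the text: N_v(t) = (\sup J - t)^{-1/3} for t_0 <= t < \sup J.
\begin{align*}
N_v(t_n)^{-3} &= \sup J - t_n, \\
\sup J - \bigl(t_n + s N_v(t_n)^{-3}\bigr) &= (1 - s)(\sup J - t_n), \\
N_{v^{[t_n]}}(s)
  &= \frac{\bigl[(1-s)(\sup J - t_n)\bigr]^{-1/3}}{(\sup J - t_n)^{-1/3}}
   = (1-s)^{-1/3},
  \qquad -\frac{t_n - t_0}{\sup J - t_n} \le s < 1 .
\end{align*}
```

Since $t_n \to \sup J$, the lower limit of the admissible range of $s$ tends to $-\infty$, which matches the interval $(-\infty, 1)$ appearing in the limit.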
Rescaling u and applying a time translation (by −1) followed by a space/time reversal, we obtain our sought-after self-similar solution.
This completes the proof of Theorem 1.13.