1 Introduction

In this paper, we consider the scattering problem for the mass-critical nonlinear Klein–Gordon equations (NLKG) on \(\mathbb {R}^d\):

$$\begin{aligned} \left\{ \begin{array}{rl} - \partial _t^2 u + \Delta u - u &{}= \mu |u|^\frac{4}{d} u ,\\ u(0,x) &{}=u_0(x),\\ \partial _t u(0,x) &{}=u_1(x), \end{array}\right. \end{aligned}$$
(1.1)

where \(u: \mathbb {R}\times \mathbb {R}^d \rightarrow \mathbb {R}\), \(d\ge 3\), in both the defocusing case (\(\mu =1\)) and the focusing case (\(\mu =-1\)). The NLKG equation is a fundamental model in mathematical physics and has been studied extensively; see, for example, [33, 36, 38] and the references therein. Considerable recent effort has been devoted to the scattering problem.

An important class of nonlinearities is the power-type nonlinearity \(\mu |u|^{p-1} u\), where \(p > 1\). Although the NLKG equation with a power-type nonlinearity does not have a scaling structure, the massless case, that is, the corresponding wave equation, is invariant under the scaling \(u(t,x) \mapsto \lambda ^{ \frac{2}{p-1}}u (\lambda t, \lambda x)\). This scaling leaves the \(\dot{H}^{s_c}_x\)-norm invariant, where \(s_c = \frac{d}{2}-\frac{2}{p-1}\). Since blow-up is associated with small spatial scales, that is, with \(\lambda \rightarrow \infty \), and the mass term shrinks to 0 under this scaling, it is natural to view \(s_c\) as the critical regularity of the NLKG equation. In general, there are two critical indices for p: the mass-critical index \(p = 1 + \frac{4}{d}\) and, when \(d \ge 3\), the energy-critical index \(p = 1 + \frac{4}{d-2}\). These two indices correspond to \(s_c=0\) and \(s_c=1\), respectively. There are many studies of the global dynamics: for the defocusing inter-critical cases \(1+\frac{4}{d}<p<1+\frac{4}{d-2}\) [7, 8, 29], the defocusing energy-critical cases [28], and the focusing inter-critical and energy-critical cases [9, 10, 12, 20, 32, 33, 34]. For the mass-critical cases, energy scattering was studied by R. Killip, B. Stovall, and M. Visan [16] in two dimensions and recently by M. Ikeda, T. Inui, and M. Okamoto [11] in one dimension. Both works used the concentration-compactness/rigidity method developed by Kenig–Merle [14, 15].
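
As a quick check of the critical exponent, the following routine computation shows how the \(\dot{H}^s_x\)-norm transforms under the scaling above at a fixed time. The notation \(u_\lambda \) for the rescaled profile is introduced here only for this sketch.

$$\begin{aligned} u_\lambda (x) := \lambda ^{\frac{2}{p-1}} u(\lambda x) \quad &\Longrightarrow \quad \widehat{u_\lambda }(\xi ) = \lambda ^{\frac{2}{p-1} - d}\, \hat{u}\left( \lambda ^{-1} \xi \right) ,\\ \Vert u_\lambda \Vert _{\dot{H}^s_x}^2 &= \lambda ^{\frac{4}{p-1} - 2d} \int _{\mathbb {R}^d} |\xi |^{2s} \left| \hat{u}\left( \lambda ^{-1}\xi \right) \right| ^2 \,\textrm{d}\xi = \lambda ^{2\left( s - \frac{d}{2} + \frac{2}{p-1}\right) } \Vert u\Vert _{\dot{H}^s_x}^2. \end{aligned}$$

The exponent vanishes exactly when \(s = s_c\). For \(p = 1+\frac{4}{d}\) we have \(\frac{2}{p-1} = \frac{d}{2}\), so \(s_c = 0\); for \(p = 1 + \frac{4}{d-2}\) we have \(\frac{2}{p-1} = \frac{d-2}{2}\), so \(s_c = 1\).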

The purpose of this paper is to study the mass-critical NLKG equation and prove energy scattering in three and higher dimensions. The mass-critical NLKG equation (1.1) conserves the energy

$$\begin{aligned} E\left( u,\partial _t u\right) := \int _{\mathbb {R}^d} \frac{1}{2} |\partial _t u(t,x)|^2 + \frac{1}{2} |\nabla u(t,x)|^2 + \frac{1}{2} |u(t,x)|^2 + \frac{\mu d}{2(d+2)} |u(t,x)|^\frac{2(d+2)}{d}\, \textrm{d}x, \end{aligned}$$

and the momentum

$$\begin{aligned} P\left( u, \partial _t u \right) := \int _{\mathbb {R}^d} \partial _t u \cdot \nabla u \,\textrm{d}x. \end{aligned}$$

Thus a natural phase space for NLKG is the energy space \(H^1\times L^2\).
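
For orientation, we sketch the formal computation behind the conservation of energy. It assumes that u is smooth and decays fast enough at spatial infinity to justify the integration by parts; the general case would follow by a standard approximation argument, which we do not reproduce.

$$\begin{aligned} \frac{\textrm{d}}{\textrm{d}t} E\left( u,\partial _t u\right) &= \int _{\mathbb {R}^d} \partial _t u\, \partial _t^2 u + \nabla u \cdot \nabla \partial _t u + u\, \partial _t u + \mu |u|^{\frac{4}{d}} u\, \partial _t u \,\textrm{d}x\\ &= \int _{\mathbb {R}^d} \partial _t u \left( \partial _t^2 u - \Delta u + u + \mu |u|^{\frac{4}{d}} u \right) \textrm{d}x = 0. \end{aligned}$$

The last equality uses (1.1). The factor \(\frac{d}{2(d+2)}\) in the energy is exactly what cancels the exponent \(\frac{2(d+2)}{d}\) produced by differentiating the potential term.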

In the defocusing case, the conserved energy immediately gives us the global existence of solutions. On the other hand, in the focusing case, there is a global-existence/blow-up dichotomy. The ground state solution, a static solution \(u(t,x)=Q(x)\) to the NLKG equation, plays a crucial role in the dichotomy. Here, \(Q(x)\in H^1\) is a positive radial solution to the nonlinear elliptic equation

$$\begin{aligned} \Delta Q - Q = - Q^{1 + \frac{4}{d}}. \end{aligned}$$
(1.2)

Global well-posedness versus blow-up for solutions with \(E(u,u_t)< E(Q, 0)\) was essentially established in [35], where the size of \(\Vert u_0\Vert _{L^2}\) relative to \(\Vert Q\Vert _{L^2}\) distinguishes the two behaviours. More precisely, one has global well-posedness when \(\Vert u_0\Vert _{L^2} < \Vert Q\Vert _{L^2}\) and blow-up when \(\Vert u_0\Vert _{L^2}>\Vert Q\Vert _{L^2}\). However, in both the defocusing and focusing cases, proving scattering requires more effort because of the mass-criticality. The main result of this paper establishes scattering for the global solutions.

Theorem 1.1

Assume \((u_0,u_1) \in H_x^1(\mathbb {R}^d) \times L_x^2(\mathbb {R}^d)\), \(d\ge 3\). We have

  1. (i)

    if \(\mu =1\), then the global solution to (1.1) scatters in the energy space in both time directions, that is, there exist solutions \(u_\pm \in C_t^0 H_x^1 \cap C^1_t L_x^2 \) of the linear Klein–Gordon equation such that

    $$\begin{aligned} \left\| u(t) - u_\pm (t) \right\| _{H_x^1} + \left\| \partial _t u(t) - \partial _t u_\pm (t) \right\| _{L_x^2} \rightarrow 0,\quad \text { as } t \rightarrow \pm \infty . \end{aligned}$$
  2. (ii)

    if \(\mu =-1\) and \(E(u_0,u_1) < E(Q, 0)\), then the solution u to (1.1) exists globally and scatters in the energy space when \(\Vert u_0\Vert _{L^2} < \Vert Q\Vert _{L^2}\), and it blows up in finite time when \(\Vert u_0\Vert _{L^2}>\Vert Q\Vert _{L^2}\). Moreover, the case \(\Vert u_0 \Vert _{L^2 } = \Vert Q\Vert _{L^2}\) is impossible.

To prove the scattering part of the above theorem, we use the “Kenig–Merle roadmap” as in [14,15,16]. Our main technical development lies in the linear and nonlinear profile decomposition in higher dimensions, which is a key tool for proving the existence of a non-scattering solution with minimal energy, the so-called minimal energy blow-up solution.

First, due to the mass-criticality, we need to establish the linear profile decomposition associated with the linear Klein–Gordon equation in higher dimensions at the \(L^2\)-critical level. More precisely, we need to characterise the defect of compactness of the Strichartz estimate

$$\begin{aligned} \left\| e^{-it\langle \nabla \rangle }f\right\| _{L_{t,x}^{2+\frac{4}{d}}(\mathbb {R}\times \mathbb {R}^d)} \lesssim \Vert f\Vert _{H^1}. \end{aligned}$$

Since the right-hand side can be replaced by the \(H^{1/2}\)-norm (see Remark 2.4 below), for data bounded in \(H^1\) we can handle the high frequencies easily thanks to the extra regularity. The low frequencies are much more complicated. This can be seen from the fact that the low-frequency limit of the Klein–Gordon equation is the Schrödinger equation. In fact, for any \(\varphi \in H^1\), we have

$$\begin{aligned} e^{ it\lambda ^2}e^{- it\lambda ^2\langle \lambda ^{-1}\nabla \rangle } \varphi \rightarrow e^{i\frac{t}{2}\Delta } \varphi ~\text { in } H^1,\quad \text {as}~ \lambda \rightarrow \infty , \end{aligned}$$

by the asymptotic expansion

$$\begin{aligned} \lambda ^2 \left( \langle \lambda ^{-1} \xi \rangle -1 \right) = \tfrac{1}{2} |\xi |^2 + O\left( \lambda ^{-2} |\xi |^4 \right) ,\quad \text {as}~ \lambda \rightarrow \infty . \end{aligned}$$
(1.3)
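
The expansion (1.3) can be made quantitative. The following elementary identity is one way to see it; the auxiliary variable \(s := \lambda ^{-2}|\xi |^2 \ge 0\) is introduced here only for this sketch.

$$\begin{aligned} \sqrt{1+s} - 1 = \frac{s}{1 + \sqrt{1+s}} \quad &\Longrightarrow \quad \frac{s}{2} - \left( \sqrt{1+s} - 1 \right) = \frac{s \left( \sqrt{1+s} - 1\right) }{2\left( 1 + \sqrt{1+s}\right) } = \frac{s^2}{2 \left( 1 + \sqrt{1+s} \right) ^2},\\ \lambda ^2 \left( \langle \lambda ^{-1}\xi \rangle - 1\right) &= \frac{1}{2} |\xi |^2 - \frac{ \lambda ^{-2} |\xi |^4}{2 \left( 1 + \langle \lambda ^{-1} \xi \rangle \right) ^2}. \end{aligned}$$

Since \(1 + \langle \lambda ^{-1}\xi \rangle \ge 2\), the remainder is bounded in absolute value by \(\frac{1}{8}\lambda ^{-2}|\xi |^4\) for all \(\xi \) and \(\lambda \), which is consistent with the error term in (1.3).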

Thus we have to take more symmetries into account at low frequencies. In this example, the Fourier transform concentrates at the origin. The Lorentz boost is also involved when the Fourier transform concentrates at another point. We will rely on some refined Strichartz estimates, which are derived from bilinear Strichartz estimates. We slightly simplify the argument in [16] (see Remark 2.7).

Second, we use the nonlinear profile decomposition to construct the minimal mass non-scattering solutions. For this step, the mass-critical NLS

$$\begin{aligned} i\partial _t w+ \frac{1}{2} \Delta w = \mu C_d |w|^{\frac{4}{d}}w \end{aligned}$$
(1.4)

serves as the approximate equation that determines the behaviour of the large-scale nonlinear profiles of the NLKG equation, where

$$\begin{aligned} C_d := \tfrac{\Gamma \left( \frac{2}{d} + \frac{3}{2} \right) }{\sqrt{\pi } \Gamma \left( \frac{2}{d} + 2\right) }. \end{aligned}$$

The connection between (1.1) and (1.4) in the scattering problem has been studied previously. K. Nakanishi [31] proved that scattering for the NLKG equation implies scattering for the corresponding NLS equation. Conversely, R. Killip, B. Stovall, and M. Visan [16] used the scattering results for the mass-critical NLS equation to show scattering for the NLKG equation in two dimensions. This was extended to the one-dimensional case by M. Ikeda, T. Inui, and M. Okamoto in [11]. Unlike in the one- and two-dimensional cases, a new difficulty arises in higher dimensions because the power of the nonlinear term is fractional. The limit NLS equation is not as obvious as in the one- and two-dimensional cases. The difficulty can be summarized as specifying the constant \(C_d\). We overcome this difficulty using the technique developed by the third author and his collaborators [23,24,25,26]. In these works, they introduce an expansion of the homogeneous nonlinearity to pick out the resonant term from the non-algebraic nonlinear term. Using these ideas, we derive the limit NLS equation and then use it to construct the minimal energy blow-up solutions. Note that a similar technique was developed in [21, 27, 30, 31].
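
To make the constant concrete, we record an equivalent integral form of \(C_d\) and a few values. The last equality below is the classical Beta-integral formula. Reading the left-hand side as the coefficient of the resonant mode \(e^{i\theta }\) in the Fourier series of \(|\cos \theta |^{\frac{4}{d}}\cos \theta \) is our heuristic interpretation of the expansion mentioned above, not a statement taken from the cited works.

$$\begin{aligned} \frac{1}{2\pi } \int _0^{2\pi } |\cos \theta |^{\frac{4}{d}} \cos \theta \, e^{-i\theta }\,\textrm{d}\theta = \frac{1}{2\pi } \int _0^{2\pi } |\cos \theta |^{2 + \frac{4}{d}} \,\textrm{d}\theta = \frac{\Gamma \left( \frac{2}{d} + \frac{3}{2}\right) }{\sqrt{\pi }\, \Gamma \left( \frac{2}{d} + 2 \right) } = C_d. \end{aligned}$$

The first equality holds because the imaginary part of the integrand is odd in \(\theta \). For instance, \(C_1 = \frac{5}{16}\) and \(C_2 = \frac{3}{8}\), which are the coefficients of \(e^{i\theta }\) in \(\cos ^5\theta \) and \(\cos ^3\theta \), while \(C_4 = \frac{4}{3\pi }\) is no longer rational, in line with the non-algebraic nature of the nonlinearity when \(d \ge 3\).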

Finally, in the rigidity part, we exclude the existence of the critical element by a virial-type monotonicity argument. This part follows an argument in [9, 16]. We include a proof to keep the paper self-contained.

2 Preliminary

2.1 Definition and Notations

We use C to denote some universal constant which may change from line to line. For \(X, Y \in \mathbb {R}\), \(X\lesssim Y\) means that there exists a constant C such that \(X\le CY\), similarly for \(X\gtrsim Y\). We use \(\lesssim _{A,\epsilon }\) and \(\gtrsim _{A,\epsilon }\) to indicate that the implicit constant depends on \(A,\epsilon \). \(X\sim Y\) means \(X\lesssim Y\) and \(X\gtrsim Y\). For \(a\in \mathbb {R}\), \(a+\) (resp. \(a-\)) denotes \(a+\varepsilon \) (resp. \(a-\varepsilon \)) for any sufficiently small \(\varepsilon >0\), and \(\langle a\rangle =(1+|a|^2)^{\frac{1}{2}}\).

For a function \(f\in L_{loc}^1(\mathbb {R}^d)\), we use \(\widehat{f}\) or \(\mathcal {F}(f)\) to denote the spatial Fourier transform of f: \(\mathcal {F}(f) (\xi ) = \hat{f}(\xi ) =(2 \pi )^{-\frac{d}{2} } \int _{\mathbb {R}^d}e^{- i x \xi } f(x)\,\textrm{d}x\). Let \(\varphi \in C_0^\infty (\mathbb {R})\) be a real-valued, non-negative, even, and radially-decreasing function such that

$$\begin{aligned} \varphi (\xi ) = \left\{ \begin{array}{ll} 1,&{}\quad |\xi | \le 1, \\ 0,&{}\quad |\xi | \ge \frac{5}{4}, \end{array}\right. \end{aligned}$$

and define \(\chi (\xi )=\varphi (\xi )-\varphi (2\xi )\). For a dyadic number \(N\in 2^{\mathbb {Z}_+}\), we define the Littlewood–Paley projectors: \(\widehat{P_1f}(\xi ):=\varphi (\xi )\widehat{f}(\xi )\) and for \(N\ge 2\),

$$\begin{aligned} \widehat{P_Nf}(\xi ):=\chi \left( \frac{\xi }{N} \right) \widehat{f}(\xi ), \quad \widehat{P^{\pm }_Nf}(\xi ):=\chi \left( \frac{\xi }{N} \right) 1_{\pm \xi \ge 0}\cdot \widehat{f}(\xi ). \end{aligned}$$

For \(\varOmega \subseteq \mathbb {R}^d\), we also define the Littlewood–Paley projector \(P_\varOmega =\mathcal {F}^{-1}1_\varOmega (\xi )\mathcal {F}\). We define the Fourier multiplier \(m(\nabla )=\mathcal {F}^{-1}m(\xi )\mathcal {F}\). In particular, \(\langle \nabla \rangle \) (resp. \(D^s\)) is the Fourier multiplier with symbol \(\langle \xi \rangle =(1+|\xi |^2)^{\frac{1}{2}}\) (resp. \(|\xi |^s\)).

We use \(L^p\) to denote the Lebesgue space with a norm \(\Vert \cdot \Vert _{p}:=\Vert \cdot \Vert _{L^p}\) and \(L^p_tL_x^q\) to denote the mixed norm Lebesgue space with \(\Vert f\Vert _{L^p_tL_x^q}=\big \Vert \Vert f(t,\cdot )\Vert _{L_x^q}\big \Vert _{L_t^p}\). \(H^s\) (and \(\dot{H}^s\)) denotes the standard (homogeneous) Sobolev space.

It is convenient to rewrite (1.1) as a first-order equation. Let \(v = u + i \langle \nabla \rangle ^{-1} \partial _t u\); then the equation for v is

$$\begin{aligned} \left\{ \begin{array}{l} i \partial _t v - \langle \nabla \rangle v = \mu \langle \nabla \rangle ^{-1}\left( |\Re v|^{\frac{4}{d}} \Re v \right) ,\\ v(0,x) = v_0(x) \in H^1(\mathbb {R}^d). \end{array}\right. \end{aligned}$$
(2.1)
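
The reduction can be checked directly. The verification below uses only \(\langle \nabla \rangle ^2 = 1 - \Delta \) and the fact that u is real-valued, so that \(u = \Re v\).

$$\begin{aligned} i \partial _t v - \langle \nabla \rangle v &= \left( i \partial _t u - \langle \nabla \rangle ^{-1} \partial _t^2 u \right) - \left( \langle \nabla \rangle u + i \partial _t u \right) \\ &= \langle \nabla \rangle ^{-1} \left( - \partial _t^2 u - \langle \nabla \rangle ^2 u \right) = \langle \nabla \rangle ^{-1} \left( - \partial _t^2 u + \Delta u - u \right) = \mu \langle \nabla \rangle ^{-1} \left( |\Re v|^{\frac{4}{d}} \Re v\right) . \end{aligned}$$

Conversely, since \(\langle \nabla \rangle \) maps real-valued functions to real-valued functions, the pair \((u, \partial _t u)\) is recovered from v through \(u = \Re v\) and \(\partial _t u = \Im \left( \langle \nabla \rangle v\right) \).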

We will use these two equivalent forms interchangeably. We use u to denote a solution of (1.1) and v the corresponding solution of (2.1). The scattering norms and energies are defined to be

$$\begin{aligned} S_I(u)= & {} S_I(v) = \int _{I}\int _{\mathbb {R}^d} |\Re v(t,x)|^{\frac{2(d+2)}{d}}\,\textrm{d}x \textrm{d}t,\\ E(u(t))= & {} E(v(t)) = \int _{\mathbb {R}^d} \frac{1}{2} |\langle \nabla _x \rangle v(t,x)|^2 + \mu \frac{d }{2(d+2)} |\Re v(t,x)|^\frac{2(d+2)}{d} \,\textrm{d}x. \end{aligned}$$
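
The two expressions for the energy agree. A short check, using that \(\langle \nabla \rangle v = \langle \nabla \rangle u + i \partial _t u\) with both \(\langle \nabla \rangle u\) and \(\partial _t u\) real-valued, followed by Plancherel's identity:

$$\begin{aligned} \int _{\mathbb {R}^d} |\langle \nabla \rangle v|^2 \,\textrm{d}x = \int _{\mathbb {R}^d} |\langle \nabla \rangle u|^2 + |\partial _t u|^2 \,\textrm{d}x = \int _{\mathbb {R}^d} |\nabla u|^2 + |u|^2 + |\partial _t u|^2 \,\textrm{d}x. \end{aligned}$$

Together with \(\Re v = u\), this shows that \(E(v(t))\) coincides with \(E(u, \partial _t u)\) as defined in the introduction.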

2.2 Well-Posedness Theory

In this subsection, we collect some Strichartz estimates and well-posedness results that will be used in this paper. First, we recall the dispersive estimate for the Klein–Gordon propagator (see e.g. [1, 7, 8, 22]).

Lemma 2.1

For any dyadic number \(N\ge 1\), we have

$$\begin{aligned} \left\| {e^{-it\langle {\nabla }\rangle }P_N f}\right\| _{L_x^\infty (\mathbb {R}^d)}\lesssim |t|^{-\frac{d}{2}} N^{\frac{d+2}{2}}\Vert f\Vert _{L_x^1}, \end{aligned}$$

and

$$\begin{aligned} \left\| e^{-it\langle \nabla \rangle }P_N f\right\| _{L_x^\infty (\mathbb {R}^d)}\lesssim |t|^{-\frac{d-1}{2}} N^{\frac{d+1}{2}}\Vert f\Vert _{L_x^1}. \end{aligned}$$

By the above dispersive estimate, we can obtain the Strichartz estimate. The Strichartz estimate for the Klein–Gordon equation has been studied in many works; see [1, 7,8,9, 22] and the references therein. In general, the Klein–Gordon propagator behaves like the wave propagator at high frequencies and like the Schrödinger propagator at low frequencies.

Definition 2.2

We say that a pair \((q,r)\) is wave-admissible if

$$\begin{aligned} 2 \le q, r \le \infty , \quad \frac{1}{q} \le \frac{d-1}{2} \left( \frac{1}{2} - \frac{1}{r} \right) , \quad (q,r,d)\ne (2,\infty , 3); \end{aligned}$$

and Schrödinger-admissible if

$$\begin{aligned} 2 \le q, r \le \infty , \quad \frac{1}{q} \le \frac{d}{2} \left( \frac{1}{2} - \frac{1}{r} \right) , \quad (q,r,d)\ne (2,\infty , 2). \end{aligned}$$

If equality holds, then we say that \((q,r)\) is sharp wave-admissible (or sharp Schrödinger-admissible).

Lemma 2.3

(Homogeneous Strichartz estimate) Assume \((q,r)\) is Schrödinger-admissible. We have

$$\begin{aligned} \left\| e^{-it\langle \nabla \rangle }P_N f\right\| _{L_t^qL_x^r(\mathbb {R}\times \mathbb {R}^d)}\lesssim N^{\beta (q,r)}\Vert f\Vert _{L_x^2}, \end{aligned}$$

where

$$\begin{aligned} \beta (q,r)= \left\{ \begin{array}{ll} \frac{d+2}{2} \left( \frac{1}{2}-\frac{1}{r} \right) &{} \quad \text {if } (q,r) \text { is sharp Schr}\ddot{o}\text {dinger-admissible}, \\ \frac{d}{2}-\frac{d}{r}-\frac{1}{q}&{} \quad \text {if } (q,r) \text { is wave-admissible}. \end{array}\right. \end{aligned}$$

By interpolation, we can obtain the Strichartz estimate for \((q,r)\) between wave-admissible and sharp Schrödinger-admissible.

Remark 2.4

In particular, \(\left( 2+\frac{4}{d-1}, 2+\frac{4}{d-1}\right) \) is sharp wave-admissible and \(\left( 2\!+\!\frac{4}{d}, 2\!+\!\frac{4}{d} \right) \) is sharp Schrödinger-admissible. We have for \(d\ge 2\),

$$\begin{aligned} \left\| e^{-it\langle \nabla \rangle }f\right\| _{L_{t,x}^{2+\frac{4}{d-1}}(\mathbb {R}\times \mathbb {R}^d)} + \left\| e^{-it\langle \nabla \rangle }f\right\| _{L_{t,x}^{2+\frac{4}{d}}(\mathbb {R}\times \mathbb {R}^d)} \lesssim \Vert f\Vert _{H^{ \frac{1}{2}}(\mathbb {R}^d)}. \end{aligned}$$
(2.2)
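
As an arithmetic check of the two diagonal pairs and of the \(H^{\frac{1}{2}}\) regularity in (2.2), we evaluate the conditions of Definition 2.2 and the exponent \(\beta (q,r)\) of Lemma 2.3:

$$\begin{aligned} q = r = 2 + \tfrac{4}{d} = \tfrac{2(d+2)}{d}: &\quad \tfrac{d}{2} \left( \tfrac{1}{2} - \tfrac{1}{r}\right) = \tfrac{d}{2} \cdot \tfrac{1}{d+2} = \tfrac{1}{q}, \quad \beta (q,r) = \tfrac{d+2}{2} \cdot \tfrac{1}{d+2} = \tfrac{1}{2},\\ q = r = 2 + \tfrac{4}{d-1} = \tfrac{2(d+1)}{d-1}: &\quad \tfrac{d-1}{2} \left( \tfrac{1}{2} - \tfrac{1}{r} \right) = \tfrac{d-1}{2} \cdot \tfrac{1}{d+1} = \tfrac{1}{q}, \quad \beta (q,r) = \tfrac{d}{2} - \tfrac{d+1}{q} = \tfrac{d}{2} - \tfrac{d-1}{2} = \tfrac{1}{2}. \end{aligned}$$

In both cases the frequency-localized estimate loses exactly \(N^{\frac{1}{2}}\). Passing from the dyadic pieces \(P_N f\) to (2.2) is typically done with a Littlewood–Paley square-function argument, which we do not reproduce here.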

By duality, we have

Lemma 2.5

(Inhomogeneous Strichartz estimate) Assume v and G satisfy the following equations on the time interval \(I\subseteq \mathbb {R}\),

$$\begin{aligned} i \partial _t v - \langle \nabla \rangle v = \langle \nabla \rangle ^{-1} G. \end{aligned}$$

Then

$$\begin{aligned} \left\| \langle \nabla \rangle ^{ 1 + \frac{d+2}{2} \left( \frac{1}{r} - \frac{1}{2} \right) } v \right\| _{L_t^q L_x^r(I\times \mathbb {R}^d)} \lesssim \left\| \langle \nabla \rangle v(t_0) \right\| _{L^2(\mathbb {R}^d)} + \left\| \langle \nabla \rangle ^{\frac{d+2}{2} \left( \frac{1}{2} - \frac{1}{\tilde{r}} \right) } G \right\| _{L_t^{\tilde{q}' } L_x^{\tilde{r}'}(I \times \mathbb {R}^d)} \end{aligned}$$

for each \(t_0 \in I\) and any sharp Schrödinger-admissible pairs \((q,r)\) and \((\tilde{q}, \tilde{r})\).

For the low-frequency component, the Klein–Gordon propagator behaves like the Schrödinger propagator. By applying the bilinear restriction estimate of [37] as in [16] (see also [18]), we obtain the following refined Strichartz estimate, which is the same as that for the Schrödinger equation.

Lemma 2.6

(Refined Strichartz) For any \(f \in L_x^2(\mathbb {R}^d)\) with \(\textrm{supp}\,\hat{f} \subseteq \{|\xi | \le 2^d\}\), we have

$$\begin{aligned} & {} \left\| e^{-it \langle \nabla \rangle } f \right\| _{L_{t,x}^\frac{2(d+2)}{d}(\mathbb {R}\times \mathbb {R}^d)} \nonumber \\ {} & {} \quad \lesssim \Vert f\Vert _{L_x^2}^\frac{d+1}{d+2} \left( \underset{\mathcal {C}}{\sup } | \mathcal {C} |^{-\frac{d+1}{2(d^2 + 3d + 1)}} \left\| e^{-it\langle \nabla \rangle } P_{\mathcal {C}} f \right\| _{L_{t,x}^\frac{2 \left( d^2 + 3d + 1 \right) }{d^2} (\mathbb {R}\times \mathbb {R}^d)} \right) ^\frac{1}{d+2}, \end{aligned}$$
(2.3)

where the supremum is taken over all dyadic cubes \(\mathcal {C}\) with side length no more than \(2^{d+1}\), and \(P_{\mathcal {C}} f\) is the Fourier restriction of f to \( \mathcal {C}\). As a consequence, by interpolation, we obtain

$$\begin{aligned} \left\| e^{-it \langle \nabla \rangle } f \right\| ^{\frac{d^2+2d+1}{d^2+3d+1}}_{L_{t,x}^\frac{2(d+2)}{d}(\mathbb {R}\times \mathbb {R}^d)} \lesssim \Vert f\Vert _{L_x^2}^\frac{d+1}{d+2} \left( \underset{\mathcal {C}}{\sup } | \mathcal {C} |^{-\frac{d+1}{2(d^2 + 3d + 1)}} \left\| e^{-it\langle \nabla \rangle } P_{\mathcal {C}} f \right\| _{L_{t,x}^\infty }^{\frac{d+1}{d^2+3d+1}} \right) ^\frac{1}{d+2}. \end{aligned}$$
(2.4)

Remark 2.7

In the two dimensional case [16], the combination of the cube decomposition (2.3) and a tube-type decomposition is used to obtain the inverse Strichartz estimate. It will turn out that the decomposition (2.4) is sufficient for this purpose.

By the Strichartz estimate and Picard’s iteration, we can establish the well-posedness theory for (2.1).

Proposition 2.8

(Local well-posedness in \(H^1\)) For any \(v_0 \in H_x^1(\mathbb {R}^d)\), there exists a unique maximal-lifespan solution \(v: I \times \mathbb {R}^d \rightarrow \mathbb {C}\) to (2.1) with \(v(0) = v_0\) satisfying \(S_J(v)<\infty \) for any \(J\Subset I\). Moreover, we have

  1. (1)

    \(|I|\ge C(\Vert v_0\Vert _{H^1})\). If \(I=\mathbb {R}\) and \(S_{\mathbb {R} }(v) < \infty \), then v scatters in \(H^1\).

  2. (2)

    If \(\Vert v_0\Vert _{H_x^1}\) is small enough, then \(I=\mathbb {R}\) and \(S_{\mathbb {R}}(v) \lesssim 1\).

  3. (3)

    If \(J\subseteq I\) and \(S_J(v) < L\), then for any \(0 \le s <1 + \frac{4}{d}\), we have

    $$\begin{aligned} \left\| \langle \nabla \rangle ^{s + \frac{d+2}{2} \left( \frac{1}{r} - \frac{1}{2} \right) } v\right\| _{L_t^q L_x^r(J\times \mathbb {R}^d)} \lesssim \left\| \langle \nabla \rangle ^{s } v_0 \right\| _{L_x^2}, \end{aligned}$$
    (2.5)

    where \((q,r)\) is sharp Schrödinger-admissible.

In the defocusing case, we can extend the local well-posedness to global well-posedness by the energy conservation. For the focusing case, we have global well-posedness under the restriction \(E(u_0,u_1)< E(Q,0)\) and \(\Vert u_0\Vert _2 < \Vert Q\Vert _2\) (see the next subsection). To prove the scattering, we need the following stability theorem which can be proved by the Strichartz estimates. This theorem is used in the proof of Theorem 3.5 (the approximation of the large scale profile) and Theorem 3.1 (the existence of the critical element).

Proposition 2.9

(Stability theorem) Assume \(\tilde{v}\) solves

$$\begin{aligned} i\tilde{v}_t - \langle \nabla \rangle \tilde{v} = \mu \langle \nabla \rangle ^{-1} \left( |\Re \tilde{v}|^\frac{4}{d} \Re \tilde{v}\right) + e_1 + e_2 +e_3 \end{aligned}$$

on the time interval \(I\subseteq \mathbb {R}\) with error terms \(e_1\), \(e_2\) and \(e_3\), and satisfies \(\Vert \langle \nabla \rangle ^\frac{1}{2} \tilde{v}\Vert _{L_t^\infty L_x^2} \le M\) and \(\Vert \Re \tilde{v}\Vert _{L_{t,x}^\frac{2(d+2)}{d}(I \times \mathbb {R}^d)} \le L\) for some constants \(M,L > 0\). Assume further that \(\Vert \langle \nabla \rangle ^\frac{1}{2} (v_0 - \tilde{v}(t_0))\Vert _{L^2} \le M'\) for some \(t_0 \in I\) and some constant \(M'> 0\). Then there exists \(\epsilon =\epsilon (M,M',L)>0\) with the following property: if

$$\begin{aligned} & {} \left\| e^{-i(t-t_0) \langle \nabla \rangle } (v_0 - \tilde{v}(t_0)) \right\| _{L_{t,x}^\frac{2(d+2)}{d}(I \times \mathbb {R}^d)} + \Vert e_1\Vert _{L_t^1 H_x^\frac{1}{2}}\\ {} & {} + \Vert \langle \nabla \rangle e_2\Vert _{L_{t,x}^\frac{2(d+2)}{d+4}(I\times \mathbb {R}^d)} + \left\| \int _{t_0}^t e^{-i(t-s) \langle \nabla \rangle } e_3(s) \,\textrm{d}s \right\| _{L_{t,x}^\frac{2(d+2)}{d} \cap L_t^\infty H_x^\frac{1}{2}(I \times \mathbb {R}^d) } \le \epsilon , \end{aligned}$$

then there exists a solution v to (2.1) on the time interval I with \(v(t_0) = v_0\), and v satisfies

$$\begin{aligned} \Vert v-\tilde{v}\Vert _{L_{t,x}^\frac{2(d+2)}{d}(I\times \mathbb {R}^d)} \le \epsilon C(M,M',L)~\text { and }~ \Vert v- \tilde{v}\Vert _{L_t^\infty H_x^\frac{1}{2}(I\times \mathbb {R}^d)} \le M' C(M,M',L). \end{aligned}$$

2.3 Variational Estimate

In this subsection, we collect some variational estimates which are needed when studying the focusing NLKG equation. The variational estimates can be found in [9] or can be proved using similar arguments (see also [13, 41]). For \((\alpha , \beta ) \in \mathbb {R}^2\), let

$$\begin{aligned} m_{\alpha ,\beta }:= \inf \left\{ E(\varphi , 0): \varphi \in H^1(\mathbb {R}^d)\setminus \{0\}, \mathcal {K}_{\alpha ,\beta } (\varphi ) = 0\right\} , \end{aligned}$$

where

$$\begin{aligned} \mathcal {K}_{\alpha , \beta } (\varphi )= & {} \frac{\partial }{\partial \lambda }\Big |_{\lambda =0} E\left( e^{\alpha \lambda }\varphi \left( e^{\beta \lambda }x \right) ,0 \right) \\ = & {} \int _{\mathbb {R}^d} \frac{2 \alpha -(d-2) \beta }{2} |\nabla \varphi |^2 + \frac{2 \alpha - d \beta }{2} |\varphi |^2 - \left( \alpha -\frac{d^2 \beta }{ 2(d+2)} \right) |\varphi |^{\frac{2(d+2)}{d} } \,\textrm{d}x. \end{aligned}$$

Let

$$\begin{aligned} \mathcal {K}_{\alpha , \beta }^+= & {} \left\{ (u_0,u_1) \in H^1\times L^2: E(u_0,u_1)< m_{\alpha , \beta }, \mathcal {K}_{\alpha , \beta }(u_0) \ge 0\right\} ,\\ \mathcal {K}_{\alpha , \beta }^-= & {} \left\{ (u_0,u_1) \in H^1\times L^2: E(u_0,u_1)< m_{\alpha , \beta }, \mathcal {K}_{\alpha , \beta }(u_0) < 0\right\} . \end{aligned}$$

In particular, we will use

$$\begin{aligned} \mathcal {K}_{0}(\varphi ) : = \mathcal {K}_{1,0}(\varphi )\quad \text { and }\quad \mathcal {K}_1(\varphi ) : = \mathcal {K}_{d, 2}(\varphi ). \end{aligned}$$

\(\mathcal {K}_{0}(\varphi )\) is convenient for the blow-up analysis, while \(\mathcal {K}_1(\varphi )\) is convenient for the scattering analysis. As sign functionals, they play the same role.
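
For concreteness, substituting \((\alpha ,\beta ) = (1,0)\) and \((\alpha ,\beta ) = (d,2)\) into the formula for \(\mathcal {K}_{\alpha ,\beta }\) gives the two functionals explicitly:

$$\begin{aligned} \mathcal {K}_0(\varphi )&= \int _{\mathbb {R}^d} |\nabla \varphi |^2 + |\varphi |^2 - |\varphi |^{\frac{2(d+2)}{d}} \,\textrm{d}x,\\ \mathcal {K}_1(\varphi )&= \int _{\mathbb {R}^d} 2 |\nabla \varphi |^2 - \frac{2d}{d+2} |\varphi |^{\frac{2(d+2)}{d}} \,\textrm{d}x. \end{aligned}$$

The \(|\varphi |^2\) term is absent from \(\mathcal {K}_1\) because the scaling \(\varphi \mapsto e^{d\lambda }\varphi (e^{2\lambda }x)\) preserves the \(L^2\)-norm; the coefficient of the potential term is \(d - \frac{d^2}{d+2} = \frac{2d}{d+2}\).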

Lemma 2.10

We have \(m_{1,0} =m_{d,2}= E(Q,0)> 0\), where \(Q \in H^1\) is the ground state of (1.2). Moreover, \(\mathcal {K}_{1, 0}^\pm =\mathcal {K}_{d, 2}^\pm \).

Remark 2.11

Recall that Q is the unique (up to symmetries) maximizer of the following sharp Gagliardo–Nirenberg inequality:

$$\begin{aligned} \Vert f\Vert _{L_x^\frac{2(d+2)}{d}}^\frac{2(d+2)}{d} \le \frac{d+2}{d} \left( \frac{\Vert f\Vert _{L_x^2}}{\Vert Q\Vert _{L_x^2}}\right) ^\frac{4}{d} \Vert \nabla f\Vert _{L_x^2}^2. \end{aligned}$$
(2.6)

As a result, we have \(\mathcal {K}_{1, 0}^- = \mathcal {K}_{d, 2}^- = \{E(u_0,u_1) < E(Q,0): \Vert u_0\Vert _{L^2}>\Vert Q\Vert _{L^2}\}\).
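
One half of this description follows in one line from (2.6). The sketch below evaluates \(\mathcal {K}_{d,2}\) from its definition and then applies the sharp Gagliardo–Nirenberg inequality:

$$\begin{aligned} \mathcal {K}_{d,2}(\varphi ) = 2 \Vert \nabla \varphi \Vert _{L_x^2}^2 - \frac{2d}{d+2} \Vert \varphi \Vert _{L_x^\frac{2(d+2)}{d}}^\frac{2(d+2)}{d} \ge 2 \Vert \nabla \varphi \Vert _{L_x^2}^2 \left( 1 - \left( \frac{\Vert \varphi \Vert _{L_x^2}}{\Vert Q\Vert _{L_x^2}}\right) ^\frac{4}{d} \right) . \end{aligned}$$

Hence \(\Vert u_0\Vert _{L^2} \le \Vert Q\Vert _{L^2}\) forces \(\mathcal {K}_{d,2}(u_0) \ge 0\), so that \(\mathcal {K}_{d,2}(u_0) < 0\) implies \(\Vert u_0\Vert _{L^2} > \Vert Q\Vert _{L^2}\). The reverse inclusion relies on the energy constraint \(E(u_0,u_1) < E(Q,0)\) and is not reproduced in this sketch.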

Proposition 2.12

Let \(u: I \times \mathbb {R}^d \rightarrow \mathbb {R}\) be a solution with maximal lifespan \(I= (-T_*, T^*)\) to (1.1) with \((u(0), u_t(0)) \in H_x^1 \times L_x^2\), \(\mu =-1\), and \(E(u(0), u_t(0)) < E(Q, 0)\).

  • If \(\Vert u_0\Vert _2 < \Vert Q\Vert _2\), then \(I=\mathbb {R}\) and for any \(t \in \mathbb {R}\),

    $$\begin{aligned} E(u(t), \partial _t u(t))\le & {} \frac{1}{2} \int _{\mathbb {R}^d} |\nabla u|^2 + |u|^2 + |\partial _t u |^2\,\textrm{d}x \le \left( 1 + \frac{d}{2}\right) E(u, \partial _t u),\\ \mathcal {K}_0(u(t) )\ge & {} c \min \left( E(Q,0) - E(u(0),u_t(0)), \Vert u(t) \Vert _{H^1_x}^2\right) , \nonumber \\ \mathcal {K}_1(u(t))\ge & {} c \min \left( E(Q,0) - E(u(0), u_t(0)), \Vert \nabla u(t)\Vert _{L_x^2}^2 \right) . \end{aligned}$$
    (2.7)
  • If \(\Vert u_0\Vert _2> \Vert Q\Vert _2\), then \(\max (T_*,T^*)<\infty \). Moreover, we have for any \(t\in I\),

    $$\begin{aligned} \mathcal {K}_0(u(t)) \le - 2 \left( E(Q,0) - E(u(0), u_t(0)) \right) < 0 \end{aligned}$$

    and

    $$\begin{aligned} \mathcal {K}_1(u(t)) \le - 2 \left( E(Q,0) - E(u(0), u_t(0) ) \right) < 0. \end{aligned}$$

Remark 2.13

The blow-up part in the above proposition can be proven by showing the strict concavity of \(\Vert u(t)\Vert _{L_x^2}^{-\frac{2}{d}}\). We omit the details of the proof and refer to [9, 16, 33, 35] for similar arguments.

3 Proof of the Main Theorem

In this section, we prove the main theorem, assuming two crucial ingredients which will be proved in the remaining sections. The proof follows closely Kenig–Merle’s road map and the ideas in [16]. Let

$$\begin{aligned} \Lambda (E) = \sup \Vert u\Vert _{L_{t,x}^{ \frac{2(d+2)}{d}}(\mathbb {R} \times \mathbb {R}^d )}, \end{aligned}$$

where the supremum is taken over all solutions \(u \in C_t^0 H_{x }^{ 1}\) of (1.1) obeying \( {E}(u,\partial _t u) \le E\) (and an extra assumption \(\Vert u_0\Vert _{L^2}<\Vert Q\Vert _{L^2}\) when \(\mu = -1\)).

Let \(E_{c} = \sup \{ E: \Lambda (E) < \infty \}\). We have \(E_c > 0\) by the small data scattering result in Proposition 2.8. To prove Theorem 1.1, we only need to show \(E_{c} = \infty \) (when \(\mu = 1\)) and \(E_c = E(Q,0) \) (when \(\mu = - 1\)). We will prove this by contradiction.

3.1 Existence of Critical Element

The main result of this subsection is

Theorem 3.1

(Existence of a critical element) Assume \(E_{c} < \infty \) (when \(\mu = 1\)) and \(E_c < E(Q, 0)\) (when \(\mu = - 1\)). There exists a global solution \(u_c\) to (1.1) with \(E(u_c, \partial _t u_c) = E_c\) (and also \(\Vert u_c(0)\Vert _{L^2}<\Vert Q\Vert _{L^2}\) when \(\mu =-1\)) and \(\Vert u_c\Vert _{L_{t,x}^\frac{2(d+2)}{d} (\mathbb {R} \times \mathbb {R}^d )} = \infty \). Furthermore, there exists \(x: \mathbb {R} \rightarrow \mathbb {R}^d\) such that

$$\begin{aligned} \left\{ (u_c, \partial _t u_c)(t, x+ x(t)): t\in \mathbb {R} \right\} \end{aligned}$$
(3.1)

is pre-compact in \(H^1 \times L^2\).

As a direct consequence of the pre-compactness of the critical element, we have

Corollary 3.2

For any \(\eta >0\), there is \(C: \mathbb {R}^+ \rightarrow \mathbb {R}^+\) such that

$$\begin{aligned} & {} \int _{|x- x(t)| \ge C(\eta )} |\nabla u_c(t,x)|^2 + |u_c(t,x)|^2+ |\partial _t u_c(t,x)|^2 + |u_c(t,x)|^\frac{2(d+2)}{d}\, \textrm{d}x\nonumber \\ {} & {} \quad + \int _{| \xi | \le \frac{1}{ C(\eta )}} \left| \widehat{u_c}(t, \xi )\right| ^2 \,\textrm{d} \xi < \eta , \quad \forall \, t \in \mathbb {R}. \end{aligned}$$
(3.2)

The proof of the above theorem relies on two ingredients. The first one is the profile decomposition associated to the linear Klein–Gordon equation with data in \(H^1\) at the \(L^2\)-critical level. More precisely, due to the mass-criticality, we need to understand the defect of compactness of the following Strichartz estimate

$$\begin{aligned} \left\| e^{-it\langle \nabla \rangle }f\right\| _{L_{t,x}^{2+\frac{4}{d}}(\mathbb {R}\times \mathbb {R}^d)} \lesssim \Vert f\Vert _{H^1}. \end{aligned}$$

There are several non-compact groups of symmetry in the above inequality. The first one is the spatial translation:

$$\begin{aligned} f\rightarrow \left( T_y f \right) (\cdot ) : = f(\cdot -y), \quad y\in \mathbb {R}^d. \end{aligned}$$

The second one is phase modulation:

$$\begin{aligned} f\rightarrow e^{i\theta \langle {\nabla }\rangle }f, \quad \theta \in \mathbb {R}. \end{aligned}$$

The third one is dilation in one direction (not a group):

$$\begin{aligned} f\rightarrow D_\lambda f:=\lambda ^{-\frac{d}{2}}f\left( \frac{\cdot }{\lambda }\right) , \quad \lambda \in [1,\infty ). \end{aligned}$$

For \(f\in H^1\), we see \(\Vert D_\lambda f\Vert _{L^2}=\Vert f\Vert _{L^2}\) and \(\Vert D_\lambda f\Vert _{\dot{H}^1}\rightarrow 0\) as \(\lambda \rightarrow \infty \). We introduce a Schrödinger dilation

$$\begin{aligned} u(t,x)\rightarrow \widetilde{D}_\lambda u:=\lambda ^{- \frac{d}{2}}u \left( \frac{t}{\lambda ^2}, \frac{x}{\lambda }\right) . \end{aligned}$$

We have \(\Vert \widetilde{D}_\lambda u\Vert _{L_t^qL_x^r}=\Vert u\Vert _{L_t^q L_x^r}\) when \((q,r)\) is sharp Schrödinger-admissible. Moreover,

$$\begin{aligned} \widetilde{D}_{\lambda ^{-1}} \left[ e^{-it \langle {\nabla }\rangle } D_\lambda f \right] = e^{-i\lambda ^2 t \langle {\lambda ^{-1}\nabla }\rangle } f. \end{aligned}$$
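
This intertwining identity can be verified on the Fourier side. The sketch below uses \(\widetilde{D}_{\lambda ^{-1}} u(t,x) = \lambda ^{\frac{d}{2}} u(\lambda ^2 t, \lambda x)\), \(\widehat{D_\lambda f}(\xi ) = \lambda ^{\frac{d}{2}} \hat{f}(\lambda \xi )\), and the substitution \(\eta = \lambda \xi \):

$$\begin{aligned} \widetilde{D}_{\lambda ^{-1}} \left[ e^{-it\langle \nabla \rangle } D_\lambda f\right] (t,x) &= \lambda ^{\frac{d}{2}} (2\pi )^{-\frac{d}{2}} \int _{\mathbb {R}^d} e^{i \lambda x \cdot \xi } e^{-i \lambda ^2 t \langle \xi \rangle } \lambda ^{\frac{d}{2}} \hat{f}(\lambda \xi ) \,\textrm{d}\xi \\ &= (2\pi )^{-\frac{d}{2}} \int _{\mathbb {R}^d} e^{i x\cdot \eta } e^{-i\lambda ^2 t \langle \lambda ^{-1} \eta \rangle } \hat{f}(\eta )\,\textrm{d}\eta = \left( e^{-i\lambda ^2 t \langle \lambda ^{-1}\nabla \rangle } f\right) (x). \end{aligned}$$

The powers of \(\lambda \) cancel because \(\textrm{d}\xi = \lambda ^{-d}\,\textrm{d}\eta \). Combined with the expansion (1.3), this explains why the Schrödinger flow emerges as \(\lambda \rightarrow \infty \) once the phase \(e^{it\lambda ^2}\) is removed.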

The last one is the Lorentz transformation. We use the version as in [16]. For any \(\nu \in \mathbb {R}^d\), we define the Lorentz boost of the space-time:

$$\begin{aligned} \left( \tilde{t}, \tilde{x}\right) = L_\nu (t,x) : = \left( \langle \nu \rangle t - \nu \cdot x, x^\perp + \langle \nu \rangle x^\parallel - \nu t\right) , \end{aligned}$$

where \(x^\perp = x- \frac{( x \cdot \nu ) \nu }{|\nu |^2}\) and \(x^\parallel = \frac{ (x\cdot \nu ) \nu }{ |\nu |^2}\). An easy computation yields

$$\begin{aligned} (t,x)=L_\nu ^{-1}\left( \tilde{t}, \tilde{x} \right) = L_{-\nu } \left( \tilde{t}, \tilde{x} \right) = \left( \langle \nu \rangle \tilde{t} + \nu \cdot \tilde{x}, {\tilde{x}}^\perp + \langle \nu \rangle {\tilde{x}}^\parallel + \nu \tilde{t}\right) \end{aligned}$$

and that if \(u(t,x) = e^{-i t \langle \xi \rangle + i x \cdot \xi }\), then

$$\begin{aligned} u \circ L_\nu ^{-1}\left( \tilde{t}, \tilde{x}\right) = e^{-i \tilde{t}\left\langle \tilde{\xi } \right\rangle + i \tilde{x} \cdot \tilde{\xi } }, \end{aligned}$$

where

$$\begin{aligned} \tilde{\xi } = { {l}}_\nu (\xi ) : = \xi ^\perp + \langle \nu \rangle \xi ^\parallel - \nu \langle \xi \rangle . \end{aligned}$$
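
For the reader's convenience, we sketch the computation behind the last two displays. It uses only \(\nu \cdot x^\perp = 0\), \(\nu \cdot x^\parallel = \nu \cdot x\), \(|\nu |^2 |\xi ^\parallel |^2 = (\nu \cdot \xi )^2\), and \(\langle \nu \rangle ^2 = 1 + |\nu |^2\).

$$\begin{aligned} -\langle \xi \rangle t + \xi \cdot x &= - \langle \xi \rangle \left( \langle \nu \rangle \tilde{t} + \nu \cdot \tilde{x}\right) + \xi \cdot \left( \tilde{x}^\perp + \langle \nu \rangle \tilde{x}^\parallel + \nu \tilde{t} \right) \\ &= - \tilde{t} \left( \langle \nu \rangle \langle \xi \rangle - \nu \cdot \xi \right) + \tilde{x} \cdot \left( \xi ^\perp + \langle \nu \rangle \xi ^\parallel - \nu \langle \xi \rangle \right) ,\\ 1 + |\tilde{\xi }|^2 &= 1 + |\xi ^\perp |^2 + \langle \nu \rangle ^2 |\xi ^\parallel |^2 - 2 \langle \nu \rangle \langle \xi \rangle \, \nu \cdot \xi + |\nu |^2 \langle \xi \rangle ^2 = \left( \langle \nu \rangle \langle \xi \rangle - \nu \cdot \xi \right) ^2. \end{aligned}$$

Since \(\langle \nu \rangle \langle \xi \rangle > |\nu | |\xi | \ge \nu \cdot \xi \), taking square roots gives \(\langle \tilde{\xi }\rangle = \langle \nu \rangle \langle \xi \rangle - \nu \cdot \xi \), so the coefficient of \(\tilde{t}\) in the phase is indeed \(-\langle \tilde{\xi }\rangle \).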

The action of the Lorentz boosts on the function f is defined to be

$$\begin{aligned} {\L } _\nu f(x) := \left( e^{-i \cdot \langle \nabla \rangle } f \right) \circ L_\nu (0,x). \end{aligned}$$

Then

$$\begin{aligned} \left( e^{-it\langle \nabla \rangle } {\L } _\nu ^{-1} f\right) (x) : = \left( e^{-i \cdot \langle \nabla \rangle } f\right) \circ L_\nu ^{-1}(t,x), \end{aligned}$$
(3.3)

namely u is a solution of the linear Klein–Gordon equation if and only if \(u\circ L_\nu ^{-1}\) is a solution of the linear Klein–Gordon equation with the following transformation on the initial data

$$\begin{aligned} f\mapsto \left( {\L } _\nu f\right) (x) : = \left( e^{-i \cdot \langle \nabla \rangle } f\right) \circ L_\nu \left( 0,x\right) . \end{aligned}$$

Moreover, direct computation yields (see [16])

$$\begin{aligned} \widehat{ {\L } _\nu ^{-1} f} \left( \tilde{\xi }\right) = \langle \xi \rangle \langle \tilde{\xi }\rangle ^{-1} \hat{f}(\xi ),\quad d\xi =\langle \xi \rangle \langle \tilde{\xi }\rangle ^{-1} d\tilde{\xi } \end{aligned}$$

and

$$\begin{aligned} {\L } _\nu ^{-1} T_y e^{-i\tau \langle \nabla \rangle } = T_{\tilde{y}} e^{-i\tilde{\tau } \langle \nabla \rangle } {\L } _\nu ^{-1},\quad \text { where } \left( \tilde{\tau }, \tilde{y}\right) = L_\nu (\tau , y). \end{aligned}$$
(3.4)

Furthermore, we have for any \(s\in \mathbb {R}\),

$$\begin{aligned} \left\langle {\L } _\nu ^{-1} f , g \right\rangle _{H^s} = \left\langle f, m_s^\nu (\nabla ) {\L } _\nu g\right\rangle _{H^s},\quad \text { with } m_s^\nu (\xi ) := \left( \frac{\langle {\xi }\rangle }{\langle \tilde{\xi }\rangle } \right) ^{1- 2s } \end{aligned}$$
(3.5)

and \(\Vert m_s^\nu \Vert _{L_\xi ^\infty } + \Vert (m_s^\nu )^{-1}\Vert _{L_\xi ^\infty } \lesssim \langle \nu \rangle ^{|2s - 1|}\).
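
The bound on \(m_s^\nu \) can be seen as follows. The sketch uses the identity \(\langle \tilde{\xi }\rangle = \langle \nu \rangle \langle \xi \rangle - \nu \cdot \xi \), which follows from the definition of \(l_\nu \) by a direct computation, together with \(|\nu \cdot \xi | \le |\nu | \langle \xi \rangle \) and \(\left( \langle \nu \rangle - |\nu |\right) \left( \langle \nu \rangle + |\nu |\right) = 1\).

$$\begin{aligned} \left( \langle \nu \rangle - |\nu |\right) \langle \xi \rangle \le \langle \tilde{\xi }\rangle \le \left( \langle \nu \rangle + |\nu |\right) \langle \xi \rangle \quad \Longrightarrow \quad \frac{1}{2\langle \nu \rangle } \le \frac{1}{\langle \nu \rangle + |\nu |} \le \frac{\langle \xi \rangle }{\langle \tilde{\xi }\rangle } \le \langle \nu \rangle + |\nu | \le 2 \langle \nu \rangle . \end{aligned}$$

Raising this to the power \(\pm (1-2s)\) bounds both \(m_s^\nu \) and \((m_s^\nu )^{-1}\) by \(\left( 2\langle \nu \rangle \right) ^{|2s-1|}\), with an implicit constant depending only on s.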

We can see that the Lorentz transformation is unitary in \(H^{\frac{1}{2}}\), so assuming uniform boundedness in \(H^1\) will require boundedness of \(|\nu |\). However, due to the mass-criticality, there is an \(L^2\)-dilation. Thus the Lorentz transformation should also be taken into account for the defect of compactness. More precisely, for \(\nu _n\rightarrow \nu \) and \(\lambda _n\rightarrow \infty \), as \(n \rightarrow \infty \), we have for \(f\in H^1\),

$$\begin{aligned} D_{\lambda _n} {\L } _{\nu _n} f- {\L } _{\nu _n } D_{\lambda _n} f \end{aligned}$$

is not small in \(L^2\) in general. By a direct calculation, we have

Lemma 3.3

For any \(f \in L^2\setminus \{0\}\) and \(\Lambda > 0\),

$$\begin{aligned} \mathcal {K} : = \left\{ D_\lambda ^{-1} {\L } _\nu ^{-1} m_0^\nu (\nabla )^{-1} e^{i\nu x} D_\lambda f: |\nu |\le \Lambda ~\text {and}~ \Lambda ^{-1} \le \lambda < \infty \right\} \end{aligned}$$

is a precompact subset of \(L^2\), and \(0 \notin \overline{\mathcal {K}}\). Furthermore, if \(\hat{f}= \chi _{[-1,1]^d}\), we see for any \(R > 0\),

$$\begin{aligned} \textrm{supp}\,\hat{g} \subseteq \left\{ |\xi |\lesssim \langle \Lambda \rangle \right\} ,~~\Vert g\Vert _{L^2_x} \gtrsim \langle \Lambda \rangle ^{-1},~\text { and } \int _{|x| \sim R} |g(x)|^2 \,\textrm{d}x \lesssim \frac{\langle \Lambda \rangle }{\langle R\rangle }, \end{aligned}$$

uniformly for any \(g \in \mathcal {K}\).

This result is the higher-dimensional extension of Lemma 2.8 in [16]. Since the proof follows a similar argument, we omit it here. We remark that \(\mathcal {K}\) is not compact, and the elements in \(\bar{\mathcal {K}} \setminus \mathcal {K}\) are characterized as follows: \(h \in \bar{\mathcal {K}} \setminus \mathcal {K} \) if and only if \(\hat{h}(\tilde{\xi }) = \hat{f}(\tilde{\xi }^\perp + \langle \nu \rangle \tilde{\xi }^\parallel )\).

With all the symmetries above, we have

Theorem 3.4

(Profile decomposition) Assume \(\{v_n\}_{n\ge 1} \) is a bounded sequence in \(H_x^1(\mathbb {R}^d)\). Then, up to a subsequence, there exist \(J_0 \in [1, \infty ]\) and, for each integer \(1\le j<J_0\), a non-zero function \(\phi ^j \in L_x^2(\mathbb {R}^d)\) and a sequence \(\{(\lambda _n^j,t_n^j, x_n^j, \nu _n^j)\} \subseteq [ 1, \infty )\times \mathbb {R} \times \mathbb {R}^d \times \mathbb {R}^d\) with the following properties:

  • either \(\lambda _n^j \rightarrow \infty \) as \(n\rightarrow \infty \) or \(\lambda _n^j \equiv 1\);

  • either \(\frac{t_n^j}{(\lambda _n^j)^2} \rightarrow \pm \infty \) as \(n\rightarrow \infty \) or \(t_n^j \equiv 0\);

  • \(\phi ^j \in H^1\) if \(\lambda _n^j \equiv 1\);

  • \(\nu _n^j \rightarrow \nu ^j\) as \(n\rightarrow \infty \) for some \(\nu ^j\in \mathbb {R}^d\), and \(\nu _n^j \equiv 0\) if \(\lambda _n^j \equiv 1\).

Let \(P_n^j\) be the projector defined by

$$\begin{aligned} P_n^j \phi ^j : = \left\{ \begin{array}{ll} \phi ^j &{}\quad \text { if } \lambda _n^j\equiv 1,\\ P_{\le (\lambda _n^j)^\theta }\phi ^j &{}\quad \text { if } \lambda _n^j \rightarrow \infty , \end{array}\right. \end{aligned}$$
(3.6)

where \(0 < \theta \ll 1\). For any \( 1\le J <J_0 \), we have the decomposition

$$\begin{aligned} v_n = \sum _{j=1}^J T_{x_n^j} e^{it_n^j \langle \nabla \rangle } {\L } _{\nu _n^j} D_{\lambda _n^j} P_n^j \phi ^j + w_n^J, \end{aligned}$$
(3.7)

with the decoupling properties

$$\begin{aligned} \Vert v_n\Vert _{L_x^2}^2 - \sum _{j=1}^J \left\| T_{x_n^j} e^{it_n^j\langle \nabla \rangle } {\L } _{\nu _n^j} D_{\lambda _n^j} P_n^j \phi ^j \right\| _{L_x^2}^2 - \Vert w_n^J\Vert _{L_x^2}^2\rightarrow & {} 0,\end{aligned}$$
(3.8)
$$\begin{aligned} \Vert v_n\Vert _{H_x^1}^2 - \sum _{j=1}^J \left\| T_{x_n^j} e^{it_n^j\langle \nabla \rangle } {\L } _{\nu _n^j} D_{\lambda _n^j} P_n^j \phi ^j \right\| _{H_x^1}^2 - \left\| w_n^J \right\| _{H_x^1}^2\rightarrow & {} 0,\end{aligned}$$
(3.9)
$$\begin{aligned} E(v_n) - \sum _{j=1}^J E\left( T_{x_n^j} e^{it_n^j \langle \nabla \rangle } {\L } _{\nu _n^j} D_{\lambda _n^j} P_n^j \phi ^j\right) - E(w_n^J)\rightarrow & {} 0\quad \text { as } n\rightarrow \infty , \end{aligned}$$
(3.10)

and

$$\begin{aligned} \limsup _{n\rightarrow \infty } \left\| e^{- it\langle \nabla \rangle } w_n^J \right\| _{L_{t,x}^\frac{2(d+2)}{d}(\mathbb {R}\times \mathbb {R}^d)} \rightarrow 0, \text { as } J \rightarrow J_0. \end{aligned}$$
(3.11)

Moreover, for any \(j\ne j'\), the orthogonality relation

$$\begin{aligned} \frac{\lambda _n^j}{\lambda _n^{j'}} + \frac{\lambda _n^{j'}}{\lambda _n^j} + \lambda _n^j \big |\nu _n^j - \nu _n^{j'} \big | + \frac{ \big |s_n^{jj'} \big |}{ \big (\lambda _n^{j'} \big )^2} + \frac{ \big |y_n^{jj'} \big |}{\lambda _n^{j'}} \rightarrow \infty \quad \text { as } n \rightarrow \infty , \end{aligned}$$
(3.12)

holds, where \(\left( - s_n^{jj'}, y_n^{jj'}\right) : = L_{\nu _n^{j'}}\left( t_n^{j'} - t_n^j, x_n^{j'} - x_n^j\right) \).

The second ingredient in the proof of Theorem 3.1 is the following NLS approximation. We will use it to construct the nonlinear profile.

Theorem 3.5

(Approximation of the low frequency profile) Assume \(\nu _n \rightarrow \nu \in \mathbb {R}^d\), \(\lambda _n \rightarrow \infty \), and either \(t_n \equiv 0\) or \(\frac{t_n}{\lambda _n^2} \rightarrow \pm \infty \), as \(n\rightarrow \infty \). Let \(\phi \in L_x^2(\mathbb {R}^d)\), and also assume

$$\begin{aligned} \Vert \phi \Vert _{L^2} < (2C_d)^{-\frac{d}{4}} \Vert Q\Vert _{L^2}\quad \text { if }~\mu = - 1, \end{aligned}$$

where \(C_d\) is a constant which is given by (5.2) below. Let

$$\begin{aligned} \phi _n : = T_{x_n} e^{it_n \langle \nabla \rangle } {\L } _{\nu _n} D_{\lambda _n} P_{\le \lambda _n^\theta } \phi , \end{aligned}$$

where \(\theta \) is some sufficiently small positive number. Then, for n large enough, there exists a global solution \(v_n\) of (2.1) with \(v_n(0) = \phi _n\) satisfying

$$\begin{aligned} S_{\mathbb {R}}(v_n) \lesssim _{\Vert \phi \Vert _{L^2} } 1. \end{aligned}$$

Moreover, \(\forall \, \epsilon > 0\), there exist \(N_\epsilon > 0\) and \(\psi _\epsilon \in C_c^\infty (\mathbb {R} \times \mathbb {R}^d)\) so that for each \(n > N_\epsilon \), we have

$$\begin{aligned} \left\| \Re \left( v_n \circ L_{\nu _n}^{-1}(t + \tilde{t}_n, x + \tilde{x}_n) - {\lambda _n^{- \frac{d}{2} } }{e^{-it}} \psi _\epsilon \left( \frac{t}{\lambda _n^2}, \frac{x}{\lambda _n}\right) \right) \right\| _{L_{t,x}^\frac{2(d+2)}{d}(\mathbb {R} \times \mathbb {R}^d)} < \epsilon , \end{aligned}$$
(3.13)

where \((\tilde{t}_n, \tilde{x}_n ) : = L_{\nu _n}(t_n, x_n)\).

Assuming the above two theorems hold, we can combine Proposition 2.12, Theorem 3.4, Theorem 3.5, and Proposition 2.9 and argue as in [9, 16] to obtain the following result.

Proposition 3.6

(P.S. condition modulo translations) Let \(u_n\) be a sequence of global solutions to (1.1), which satisfy

$$\begin{aligned} & {} \lim _{n\rightarrow \infty } S_{(- \infty , 0]} (u_n) = \lim _{n\rightarrow \infty } S_{[ 0, \infty )} (u_n) = \infty ,\nonumber \\ {} & {} \Vert u_n(0)\Vert _{L^2} < \Vert Q\Vert _{L^2},\quad \text {when}~ \mu = -1, \end{aligned}$$
(3.14)

and also

$$\begin{aligned} E(u_n) \nearrow E_c \quad ~\text {as}~ n\rightarrow \infty . \end{aligned}$$

Then \((u_n(0), \partial _t u_n(0))\) converges in \(H^1\times L^2\) modulo translations up to a subsequence.

The proposition can be shown in the same spirit as in [16]. Let us give a brief outline of the proof to see how the tools developed so far are used.

Proof (Outline of the proof)

Let

$$\begin{aligned} v_n : = u_n + i \langle \nabla \rangle ^{-1} \partial _t u_n, \end{aligned}$$

and we will show that \(v_n(0)\) converges in \(H^1\) modulo translations after passing to a subsequence. When \(\mu = -1\), by Proposition 2.12 and (3.14), \(v_n\) satisfies

$$\begin{aligned} \Vert v_n(0)\Vert _{L^2}^2 \le 2 E_c < \Vert Q\Vert _{L^2}^2. \end{aligned}$$

Thus, for both defocusing and focusing cases, we get

$$\begin{aligned} \Vert v_n(0)\Vert _{H^1}^2 \lesssim E(v_n) \le E_c. \end{aligned}$$

We can then apply Theorem 3.4 to the sequence \(v_n(0)\), and have for any \(J \in [1, J_0) \cap \mathbb {N}\),

$$\begin{aligned} v_n(0) = \sum _{j =1}^J \phi _n^j + w_n^J, \end{aligned}$$

with

$$\begin{aligned} \phi _n^j = T_{x_n^j} e^{it_n^j \langle \nabla \rangle } {\L } _{\nu _n^j} D_{\lambda _n^j} P_n^j \phi ^j. \end{aligned}$$

For any \( 1 \le j \le J_0\), we can make sure that \(\Vert \phi _n^j\Vert _{L^2}\) and \(E(\phi _n^j)\) converge after passing to a subsequence. By (3.10), we also have

$$\begin{aligned} E_c = \lim _{n\rightarrow \infty } E(v_n) = \lim _{n\rightarrow \infty } \left( \sum _{j = 1}^J E(\phi _n^j) + E(w_n^J) \right) . \end{aligned}$$
(3.15)

In the sequel, let us restrict ourselves to the case \(J_0=1\). The preclusion of the case \(J_0\ge 2\) is standard, see for instance [2, 16]. In this case, the identity

$$\begin{aligned} \lim _{n \rightarrow \infty } E(\phi _n^1) = E_c \end{aligned}$$
(3.16)

follows also by a standard argument. By (3.15) and (2.7), we have

$$\begin{aligned} v_n - \phi _n^1 = w_n^1 \rightarrow 0~\text { in }~H_x^1, \quad \text { as } n \rightarrow \infty . \end{aligned}$$
(3.17)

We now divide the analysis into the following three cases.

  1. Case 1.

    \(\lambda _n^1 = 1\) and \(t_n^1 = 0\);

  2. Case 2.

    \(\lambda _n^1 = 1\) and \(t_n^1 \rightarrow \pm \infty \);

  3. Case 3.

    \(\lambda _n^1 \rightarrow \infty \) as \(n\rightarrow \infty \).

In the first case, we have the desired conclusion. The second case is precluded by a standard argument, see for example [2, 16]. We omit the details.

Let us show that the third case can also be precluded. We will apply Theorem 3.5, but when \(\mu = -1\), we need to verify the following result first:

Lemma 3.7

When \(\mu = - 1\), if \(\lim _{n\rightarrow \infty } \lambda _n^1 = \infty \), we have \(\Vert \phi ^1\Vert _{L^2}< \Vert Q\Vert _{L^2}\).

Proof

Using (3.16) and (3.4), we obtain

$$\begin{aligned} \langle \nu _\infty ^1 \rangle \Vert \phi ^1\Vert _{L^2}^2= & {} \lim _{n\rightarrow \infty } \int _{\mathbb {R}^d} \left\langle {(\lambda _n^1)^{-1} \xi } \right\rangle \left\langle l_{- \nu _n^1} \left( {\left( \lambda _n^1 \right) ^{-1} \xi } \right) \right\rangle \left| \left( \mathcal {F}\left( P_{\le \left( \lambda _n^1 \right) ^\theta } \phi ^1 \right) \right) (\xi ) \right| ^2 \,\textrm{d}\xi \\ = & {} \lim _{n\rightarrow \infty } 2 E (\phi _n^1) \le 2 E_c. \end{aligned}$$

Since \(\langle \nu _\infty ^1 \rangle \ge 1\), this together with \(2 E_c < 2 E(Q) = \Vert Q\Vert _{L^2}^2\) implies the result.\(\square \)

By Theorem 3.5, \(v_n^1\) with \(v_n^1(0) = \phi _n^1\) is a global solution to (2.1) and \(S_{\mathbb {R}} \left( v_n^1 \right) \lesssim _{E_c} 1\) for n large enough. Let us recall that the mass assumption of Theorem 3.5 in the focusing case is

$$\begin{aligned} \Vert \phi ^1\Vert _{L^2}^2 < (2 C_d)^{-\frac{d}{2}} \Vert Q\Vert _{L^2}^2, \end{aligned}$$

which is fulfilled because \(C_d < \frac{1}{2}\). Using (3.17) and Proposition 2.9, we conclude that \(S_{\mathbb {R}}(v_n) < \infty \). This is a contradiction, which completes the proof of Proposition 3.6.
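
The verification of the mass assumption used above can be written out in one line. This is only a restatement of Lemma 3.7 combined with \(C_d < \frac{1}{2}\), under the implicit assumption that \(C_d > 0\): since \(0< 2C_d < 1\) and the exponent \(-\frac{d}{2}\) is negative, we have \((2C_d)^{-\frac{d}{2}} > 1\), and hence

$$\begin{aligned} \Vert \phi ^1\Vert _{L^2}^2 < \Vert Q\Vert _{L^2}^2 < (2 C_d)^{-\frac{d}{2}} \Vert Q\Vert _{L^2}^2. \end{aligned}$$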

Once we get the (P.S.) condition modulo translation in Proposition 3.6, we can extract a special solution of NLKG, which is the critical element in Theorem 3.1.

3.2 Rigidity

By a virial-type argument, we can exclude the critical element, which concludes the proof of Theorem 1.1. This is the content of the following theorem; see also [9, 16].

Theorem 3.8

(Nonexistence of the critical element) The critical element \(u_c\) in Theorem 3.1 does not exist.

Before giving the proof of this theorem, we first collect some properties of the critical element. In the defocusing case directly, and in the focusing case by invoking (2.7), we have

$$\begin{aligned} \Vert u_c\Vert _{L_t^\infty H_x^1}^2 + \Vert \partial _t u_c \Vert _{L_t^\infty L_x^2}^2 \le 4 E(u_c ). \end{aligned}$$
(3.18)

By the Lorentz invariance of the NLKG and the minimality of \(u_c\) as a blow-up solution, we have

$$\begin{aligned} P(u_c , \partial _t u_c ) = 0. \end{aligned}$$
(3.19)

This leads to the control of x(t) in (3.1).

Lemma 3.9

(Controlling x(t)) The spatial center function x(t) of \(u_c\) satisfies

$$\begin{aligned} \left| \frac{x(t)}{t} \right| \rightarrow 0\quad \text {as } |t| \rightarrow \infty . \end{aligned}$$
(3.20)

Proof

By the spatial translation invariance, we may assume that \(x(0) = 0\). Suppose that (3.20) fails. Then there exist \(\delta > 0\) and a sequence \(t_n \rightarrow \pm \infty \) such that

$$\begin{aligned} |x(t_n) | > \delta |t_n|. \end{aligned}$$

We may assume that \(t_n \rightarrow \infty \). Let \(\eta \ll E(u_c)\) be a sufficiently small positive constant and define

$$\begin{aligned} R_n : = C(\eta ) + |x(t_n)|, \end{aligned}$$

where \(C(\eta )\) is given by Corollary 3.2. Let \(\psi :\mathbb {R}_+ \rightarrow [0, 1]\) be a smooth cut-off function with

$$\begin{aligned} \psi (r) = \left\{ \begin{array}{ll} 1,&{}\quad 0 \le r < 1, \\ 0,&{}\quad r > 2 \end{array}\right. \end{aligned}$$
(3.21)

and define

$$\begin{aligned} X_{R_n}(t) : = \int _{\mathbb {R}^d} x \psi \left( \frac{|x|}{R_n} \right) e_{u_c} (t,x) \,\textrm{d}x, \end{aligned}$$

where

$$\begin{aligned} e_{u_c} : = \frac{1}{2} |u_c |^2 + \frac{1}{2} |\nabla u_c |^2 + \frac{1}{2} |\partial _t u_c |^2 + \mu \frac{d}{2(d+2)} |u_c |^\frac{2(d+2)}{d}. \end{aligned}$$

By the triangle inequality, (3.18) and (3.2), we have

$$\begin{aligned} |X_{R_n}(0)| \le \int _{|x| \le C(\eta )} |x| |e_{u_c} (0,x)| \,\textrm{d}x + \int _{C(\eta ) \le |x| \le 2 R_n} |x| |e_{u_c} (0,x)| \,\textrm{d}x \lesssim C(\eta ) E(u_c ) + \eta R_n. \end{aligned}$$

By the triangle inequality and (3.2), we also have

$$\begin{aligned} |X_{R_n}(t_n)|\ge & {} |x(t_n)| E(u_c ) - \int _{|x- x(t_n)| \le C(\eta )} | x - x(t_n)| \psi \left( \frac{|x|}{R_n} \right) |e_{u_c} (t_n)| \,\textrm{d}x \\ {} & {} - \int _{|x- x(t_n)| \ge C(\eta )} | x- x(t_n)| \psi \left( \frac{ |x|}{R_n} \right) |e_{u_c} (t_n)| \,\textrm{d}x\\ {} & {} - |x(t_n)| \int _{\mathbb {R}^d} \left( 1- \psi \left( \frac{ |x|}{R_n} \right) \right) |e_{u_c} (t_n)| \,\textrm{d}x \\ \ge & {} |x(t_n)| \left( E(u_c) - 4 \eta \right) - C(\eta ) \left( 2 E(u_c) + 2 \eta \right) . \end{aligned}$$

Thus, we get

$$\begin{aligned} |X_{R_n}(t_n) - X_{R_n}(0)| \gtrsim _{E(u_c )} |x(t_n)| - C(\eta ). \end{aligned}$$
(3.22)

By a direct calculation relying on (3.19), we have

$$\begin{aligned} X_{R_n}'(t) = \int _{\mathbb {R}^d } \left( 1 - \psi \left( \frac{|x|}{R_n} \right) \right) \partial _t u_c \nabla u_c \,\textrm{d}x - \int _{\mathbb {R}^d} \frac{x}{|x|R_n} \psi '\left( \frac{|x|}{R_n} \right) \partial _t u_c \, x \cdot \nabla u_c \,\textrm{d}x. \end{aligned}$$

This together with (3.2) yields

$$\begin{aligned} |X_{R_n}'(t)| \lesssim \eta . \end{aligned}$$
(3.23)
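
The formula for \(X_{R_n}'(t)\) above can be checked as follows. This is a sketch under the assumption that the momentum in (3.19) is \(P(u, \partial _t u) = \int _{\mathbb {R}^d} \partial _t u \nabla u \,\textrm{d}x\), possibly up to a sign convention that does not affect the argument. Using (1.1), the energy density satisfies the local conservation law

$$\begin{aligned} \partial _t e_{u_c} = u_c \partial _t u_c + \nabla u_c \cdot \nabla \partial _t u_c + \partial _t u_c \left( \partial _t^2 u_c + \mu |u_c|^{\frac{4}{d}} u_c \right) = \nabla \cdot \left( \partial _t u_c \nabla u_c \right) . \end{aligned}$$

Integrating by parts against \(x \psi \left( \frac{|x|}{R_n}\right) \) then gives

$$\begin{aligned} X_{R_n}'(t) = - \int _{\mathbb {R}^d} \psi \left( \frac{|x|}{R_n} \right) \partial _t u_c \nabla u_c \,\textrm{d}x - \int _{\mathbb {R}^d} \frac{x}{|x|R_n} \psi '\left( \frac{|x|}{R_n} \right) \partial _t u_c \, x \cdot \nabla u_c \,\textrm{d}x, \end{aligned}$$

and adding \(\int _{\mathbb {R}^d} \partial _t u_c \nabla u_c \,\textrm{d}x = 0\) to the first term produces the factor \(1 - \psi \).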

We can now derive an estimate by (3.22) and (3.23), that is

$$\begin{aligned} \eta t_n \gtrsim |X_{R_n}(t_n) - X_{R_n}(0) | \gtrsim _{E(u_c )} |x(t_n)| - C(\eta ) \gtrsim _{E(u_c )} \delta t_n - C(\eta ), \end{aligned}$$

Taking \(\eta \) small enough depending on \(\delta \) and \(E(u_c )\), and then n sufficiently large, we arrive at a contradiction. Therefore, we get (3.20).\(\square \)
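
To make the last step of this proof explicit, one may name the implied constants; the symbols \(C_0\) and \(c_1\) below are labels introduced only for this illustration and are assumed to be independent of n. Writing the chain of inequalities as \(C_0 \eta t_n \ge c_1 \left( \delta t_n - C(\eta ) \right) \) and choosing \(\eta \le \frac{c_1 \delta }{2 C_0}\), we get

$$\begin{aligned} \frac{c_1 \delta }{2} t_n \le \left( c_1 \delta - C_0 \eta \right) t_n \le c_1 C(\eta ), \quad \text { that is, } \quad t_n \le \frac{2 C(\eta )}{\delta }, \end{aligned}$$

which fails for n large because \(t_n \rightarrow \infty \) while \(\eta \) is fixed.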

We now turn to the proof of Theorem 3.8. Let \(\eta _1 \) and \(\eta _2 \) be small positive constants to be determined later. By Lemma 3.9, there exists \(T_0 = T_0(\eta _1)>0\) such that

$$\begin{aligned} |x(t)| \le \eta _1 t \quad \text { for any } t \ge T_0. \end{aligned}$$
(3.24)

By Plancherel’s identity and (3.2), we have

$$\begin{aligned} \int _{\mathbb {R}^d } |u_c (t,x)|^2 \,\textrm{d}x\le \eta _2 + C(\eta _2)^2 \int _{\mathbb {R}^d} |\nabla u_c (t, x)|^2 \,\textrm{d}x. \end{aligned}$$
(3.25)

With \(\psi \) defined as in (3.21) and \(0< \epsilon< 1 < R\) to be specified later, we define

$$\begin{aligned} Z_R(t) = - \int _{\mathbb {R}^d} \psi \left( \frac{|x|}{R} \right) \partial _t u_c (t,x) x \cdot \nabla u_c (t,x) \,\textrm{d}x - (1 - \epsilon ) \int _{\mathbb {R}^d} \partial _t u_c (t,x) u_c (t,x) \,\textrm{d}x. \end{aligned}$$

By the Cauchy–Schwarz inequality and (3.18), we have

$$\begin{aligned} |Z_R(t)| \lesssim R E(u_c ) \lesssim _{u_c} R. \end{aligned}$$
(3.26)

On the other hand, by direct calculation, we have

$$\begin{aligned} Z_R'(t)= & {} \epsilon \left( \Vert u_c (t) \Vert _{H^1}^2 + \Vert \partial _t u_c \Vert _{L^2}^2 \right) + (1 - 2\epsilon ) \int _{\mathbb {R}^d} |\nabla u_c (t)|^2 + \mu \frac{d}{ d+2} |u_c (t)|^\frac{2(d+2)}{d} \,\textrm{d}x\\ {} & {} - 2\epsilon \int _{\mathbb {R}^d} |u_c (t)|^2 \,\textrm{d}x\\ {} & {} - \int _{\mathbb {R}^d} \left( 1 - \psi \left( \frac{|x|}{R}\right) \right) \left( |\partial _t u_c (t)|^2 - |u_c (t)|^2 - \mu \frac{d}{d+2} |u_c (t)|^\frac{2(d+2)}{d} \right) \,\textrm{d}x\\ {} & {} + \int _{\mathbb {R}^d} \frac{|x|}{2R} \psi '\left( \frac{|x|}{R}\right) \left( |\partial _t u_c (t)|^2 \!-\! |\nabla u_c (t)|^2 \!-\! |u_c (t)|^2 \!-\! \mu \frac{d}{d+2} |u_c (t)|^\frac{2(d+2)}{d} \right) \,\textrm{d}x \\ {} & {} + \int _{\mathbb {R}^d} \frac{1}{ |x| R} \psi '\left( \frac{|x|}{R} \right) (x \cdot \nabla u_c (t))^2 \, \textrm{d}x. \end{aligned}$$

Then, by (2.6), (3.2), and (3.25), we have for any \(T_0 \le t \le T_1\),

$$\begin{aligned} Z_R'(t)\ge & {} \epsilon \left( \Vert u_c (t)\Vert _{H^1}^2 + \Vert \partial _t u_c (t)\Vert _{L^2}^2 \right) - 2 \epsilon \eta _2 - 10 \eta _1 \\ {} & {} + \left( (1 - 2 \epsilon ) \left( 1 + \mu \frac{M(u_c (t))}{M(Q)} \right) - 2 \epsilon C(\eta _2)^2 \right) \int _{\mathbb {R}^d} |\nabla u_c (t)|^2 \,\textrm{d}x, \end{aligned}$$

where

$$\begin{aligned} R = C(\eta _1) + \sup _{t \in [T_0, T_1]} |x(t)|. \end{aligned}$$
(3.27)

Taking \(\eta _2\) sufficiently small depending on \(u_c\), and \(\epsilon \) small enough depending on \(C(\eta _2)\), and finally \(\eta _1\) sufficiently small depending on \(\epsilon \) and \(u_c \), we get

$$\begin{aligned} Z_R'(t) \gtrsim _{u_c} 1 \quad \forall \,T_0\le t \le T_1. \end{aligned}$$
(3.28)

By (3.26), (3.28), (3.27), and (3.24), we obtain

$$\begin{aligned} T_1 - T_0 \lesssim _{u_c} C(\eta _1) + \eta _1 T_1\quad \forall \,T_1 > T_0. \end{aligned}$$

Taking \(\eta _1\) sufficiently small depending on \(u_c\) and \(T_1\) large enough, we get a contradiction. Thus, we have proven Theorem 3.8.
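
Spelled out with named constants, the last step reads as follows. Here c and C are hypothetical labels for the implied constants in (3.28) and (3.26), both depending only on \(u_c\). Integrating (3.28) over \([T_0, T_1]\) and using (3.26), (3.27), and (3.24),

$$\begin{aligned} c \left( T_1 - T_0 \right) \le \int _{T_0}^{T_1} Z_R'(t) \,\textrm{d}t = Z_R(T_1) - Z_R(T_0) \le 2 C R \le 2 C \left( C(\eta _1) + \eta _1 T_1 \right) . \end{aligned}$$

Hence \(\left( c - 2 C \eta _1 \right) T_1 \le c T_0 + 2 C\, C(\eta _1)\). If \(\eta _1 \le \frac{c}{4C}\), the left-hand side is at least \(\frac{c}{2} T_1\), which is impossible for \(T_1\) large since \(T_0\) and \(C(\eta _1)\) do not depend on \(T_1\).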

4 Profile Decomposition: Proof of Theorem 3.4

In this section, we prove Theorem 3.4. The main tools are the refined linear and bilinear Strichartz estimates. Let us introduce the set of weak limits modulo symmetries.

Definition 4.1

For a bounded sequence \(\textbf{v}=\{v_n\}_n \subseteq H^1\), we let \(\mathcal {V}(\textbf{v})\) be the set of all functions \(\phi \in L^2\) such that there exist a number \(\Lambda >0\) and sequences \(\{\lambda _n\}_n \subseteq [\Lambda ^{-1},\infty )\), \(\{\xi _n\}_n \subseteq \Lambda [-1,1)^d\), \(\{t_n\}_n \subseteq \mathbb {R}\), and \(\{x_n\}_n \subseteq \mathbb {R}^d\) such that

$$\begin{aligned} D_{\lambda _n}^{-1} {\L } _{\xi _n}^{-1} e^{- i t_n \langle \nabla \rangle } T_{x_n}^{-1} v_n \rightharpoonup \phi \quad \text {weakly in } L^2 \end{aligned}$$

along a subsequence. Furthermore, we let \(\eta (\textbf{v}):= \sup _{\phi \in \mathcal {V}(\textbf{v})} \Vert \phi \Vert _{L^2}\).

For a sequence \(\textbf{v}\) bounded in \(H^1\), the case \(\eta ({\textbf {v}})=0\) corresponds to the vanishing scenario. We give a lower bound on \(\eta ({\textbf {v}})\), which is called the inverse Strichartz estimate. As mentioned in Remark 2.7, the cube decomposition (2.4) is sufficient for this purpose.

Lemma 4.2

(Inverse Strichartz estimate) Let \(\textbf{w}=\{w_n\}_n\) be a bounded sequence in \(H^1\). For any \(M>0\) and \(\varepsilon >0\), there exists \(\alpha =\alpha (M,\varepsilon )>0\) such that if

$$\begin{aligned} \Vert w_n\Vert _{H^1} \le M \end{aligned}$$

and

$$\begin{aligned} \limsup _{n\rightarrow \infty } \left\| e^{-it \langle \nabla \rangle } w_n \right\| _{L_{t,x}^\frac{2(d+2)}{d}} \ge \varepsilon , \end{aligned}$$

then

$$\begin{aligned} \eta (\textbf{w}) \ge \alpha . \end{aligned}$$

Proof

Since \(\Vert P_{>N} w_n\Vert _{H^{ \frac{1}{2}}} \le MN^{- \frac{1}{2}}\), we see from the Strichartz estimate (2.2) that there exists \(N_0=N_0(M,\varepsilon ) \in 2^\mathbb {Z}\) such that

$$\begin{aligned} \limsup _{n\rightarrow \infty } \left\| e^{-it \langle \nabla \rangle } P_{\le N_0} w_n \right\| _{L_{t,x}^\frac{2(d+2)}{d}} \ge \frac{\varepsilon }{2}. \end{aligned}$$
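
The choice of \(N_0\) can be sketched as follows, assuming that (2.2) takes the form \(\left\| e^{-it \langle \nabla \rangle } f \right\| _{L_{t,x}^{2(d+2)/d}} \le C_S \Vert f\Vert _{H^{1/2}}\), where the constant \(C_S\) is a label introduced here. By the triangle inequality,

$$\begin{aligned} \left\| e^{-it \langle \nabla \rangle } P_{\le N} w_n \right\| _{L_{t,x}^\frac{2(d+2)}{d}} \ge \left\| e^{-it \langle \nabla \rangle } w_n \right\| _{L_{t,x}^\frac{2(d+2)}{d}} - C_S M N^{-\frac{1}{2}}, \end{aligned}$$

so any dyadic \(N_0\) with \(C_S M N_0^{-\frac{1}{2}} \le \frac{\varepsilon }{2}\), for instance any \(N_0 \ge \left( 2 C_S M / \varepsilon \right) ^2\), gives the displayed bound after taking the limit superior.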

By (2.4), there exists a dyadic cube \(\mathcal {C}_n= \xi _n + \lambda _n^{-1} [-1,1)^d\) with \(|\xi _n| \lesssim _{N_0} 1\) and \(\lambda _n \gtrsim _{N_0} 1\) such that

$$\begin{aligned} \lambda _n^{ \frac{d}{2} } \left\| P_{\mathcal {C}_n} e^{-i t \langle \nabla \rangle } w_n \right\| _{L^\infty _{t,x}} \gtrsim _{M,\varepsilon } 1. \end{aligned}$$

Thus, one can choose \((t_n, x_n) \in \mathbb {R}\times \mathbb {R}^d\) so that

$$\begin{aligned} \left| \left( P_{\mathcal {C}_n} e^{i t_n \langle \nabla \rangle } w_n \right) (-x_n) \right| \gtrsim _{M,\varepsilon } \lambda _n^{-\frac{d}{2}}. \end{aligned}$$
(4.1)

With the parameters \(\xi _n\), \(\lambda _n\), \(x_n\), \(t_n\) given above, we define

$$\begin{aligned} W_n := D_{\lambda _n}^{-1} {\L } _{\xi _n}^{-1} e^{i t_n \langle \nabla \rangle } T_{x_n}^{-1} w_n. \end{aligned}$$

Since \(\xi _n\) is uniformly bounded, \(W_n\) is a bounded sequence in \(L_x^2\). Hence, after passing to a subsequence, there is a weak limit \(\phi \in \mathcal {V}(\textbf{w})\). Note that \(\eta (\textbf{w}) \ge \Vert \phi \Vert _{L^2}\) by definition of \(\eta \).

It suffices to show that there exists \(\beta =\beta (M,\varepsilon )>0\) such that \(\Vert \phi \Vert _{L^2}\ge \beta \). To this end, we introduce

$$\begin{aligned} h_n := D_{\lambda _n}^{-1} {\L } _{\xi _n}^{-1} m_0^{\xi _n}(\nabla )^{-1} \hat{T}_{\xi _n} D_{\lambda _n} \mathcal {F}^{-1} \textbf{1}_{[-1,1)^d}, \end{aligned}$$

where \(\hat{T}_\xi = \mathcal {F} ^{-1} T_\xi \mathcal {F} = e^{i x\cdot \xi }\) is a multiplication operator. In light of Lemma 3.3, we see that \(h_n\) converges to a function \(h \in L^2\) strongly in \(L^2\) along a subsequence. Furthermore, one has \(\Vert h\Vert _{L^2} \lesssim _{N_0(M,\varepsilon )} 1\). We remark that

$$\begin{aligned} \lambda _n^{\frac{d}{2}}\left( P_{\mathcal {C}_n} e^{i t_n \langle \nabla \rangle } w_n \right) (-x_n)= & {} (2\pi )^{-\frac{d}{2}} \lambda _n^{\frac{d}{2}} \int _{\mathbb {R}^d} \textbf{1}_{\xi _n + \lambda _n^{-1}[-1,1)^d}(\xi ) e^{- ix_n \cdot \xi }\mathcal {F} \!\left( e^{i t_n \langle \nabla \rangle }w_n \right) \!(\xi ) d \xi \\ = & {} (2\pi )^{-\frac{d}{2}} \int _{\mathbb {R}^d} \textbf{1}_{[-1,1)^d}(z) \left( D_{\lambda _n} T_{\xi _n}^{-1} \mathcal {F} \left( T_{x_n}^{-1} e^{i t_n \langle \nabla \rangle } w_n \right) \right) (z) d z \\ = & {} (2\pi )^{-\frac{d}{2}} \left\langle D_{\lambda _n} T_{\xi _n}^{-1} \mathcal {F} \left( T_{x_n}^{-1}e^{i t_n \langle \nabla \rangle } w_n \right) , \textbf{1}_{[-1,1)^d} \right\rangle _{L^2}, \end{aligned}$$

where we have applied the change of variable \(z=\lambda _n(\xi - \xi _n)\) to obtain the second line. Using the identity \(D_\lambda ^{-1} = \mathcal {F}^{-1} D_\lambda \mathcal {F}\), the unitarity of \(\mathcal {F}^{-1}\), \(D_\lambda \), and \(\hat{T}_\xi \) in \(L^2\), and (3.5), one sees that the right-hand side equals

$$\begin{aligned} & {} (2\pi )^{-\frac{d}{2}} \left\langle T_{x_n}^{-1}e^{i t_n \langle \nabla \rangle }w_n, \hat{T}_{\xi _n} D_{\lambda _n} \mathcal {F}^{-1} \textbf{1}_{[-1,1)^d} \right\rangle _{L^2}\\ {} & {} = (2\pi )^{-\frac{d}{2}} \left\langle {\L } _{\xi _n} D_{\lambda _n} W_n , \hat{T}_{\xi _n} D_{\lambda _n} \mathcal {F}^{-1} \textbf{1}_{[-1,1)^d} \right\rangle _{L^2} = (2\pi )^{-\frac{d}{2}} \left\langle W_n , h_n \right\rangle _{L^2}. \end{aligned}$$

Thus, by means of (4.1), one has

$$\begin{aligned} \Vert \phi \Vert _{L^2} \gtrsim _{M,\varepsilon } \left| \langle \phi , h \rangle _{L^2} \right|= & {} \lim _{n\rightarrow \infty } \left| \langle W_n, h_n\rangle _{L^2} \right| \\ \ge & {} (2\pi )^{\frac{d}{2}}\liminf _{n\rightarrow \infty } \lambda _n^{\frac{d}{2}} \left| \left( P_{\mathcal {C}_n} e^{i t_n \langle \nabla \rangle } w_n \right) (-x_n)\right| \gtrsim _{M,\varepsilon } 1. \end{aligned}$$

Hence, the claim is proven.\(\square \)

We next give two more characterizations of the orthogonality of the parameters.

Lemma 4.3

Let \(\{(\lambda _n,t_n,x_n, \nu _n)\}_{n\ge 1}\) and \(\{(\tilde{\lambda }_n,\tilde{t}_n,\tilde{x}_n, \tilde{\nu }_n)\}_{n\ge 1}\) be two sequences in \(\mathbb {R}_+ \times \mathbb {R} \times \mathbb {R}^d \times \mathbb {R}^d\) satisfying the normalization rule in Theorem 3.4. Then, the following three statements are equivalent:

  1. (1)

    (3.12) holds;

  2. (2)

    For any \(\phi \in L^2\),

    $$\begin{aligned} D_{{\tilde{\lambda }}_n}^{-1} {\L } _{{\tilde{\nu }}_n}^{-1} e^{-i \tilde{t}_n \langle \nabla \rangle } T_{\tilde{x}_n}^{-1} \left( T_{x_n} e^{it_n \langle \nabla \rangle } {\L } _{\nu _n} D_{\lambda _n} P_n \phi \right) \rightharpoonup 0 \quad \text { in }L^2,\quad \text { as }n\rightarrow \infty , \end{aligned}$$

    where \(P_n\) is the projector defined as in (3.6) with \(\lambda _n\);

  3. (3)

    For any subsequence \(\{n_k\}_k\) of \(\{n\}_n\), there exists a sequence of functions \(\{u_k\}_k\) bounded in \(H^1\) such that, along a subsequence \(\{k_\ell \}_\ell \),

    $$\begin{aligned} D_{{\tilde{\lambda }}_{n_{k_\ell }}}^{-1} {\L } _{{\tilde{\nu }}_{n_{k_\ell }}}^{-1} e^{-i \tilde{t}_{n_{k_\ell }} \langle \nabla \rangle } T_{\tilde{x}_{n_{k_\ell }}}^{-1} u_{k_\ell } \rightharpoonup 0, \quad D_{{\lambda }_{n_{k_\ell }}}^{-1} {\L } _{{\nu }_{n_{k_\ell }}}^{-1} e^{-i {t}_{n_{k_\ell }} \langle \nabla \rangle } T_{{x}_{n_{k_\ell }}}^{-1} u_{k_\ell } \rightharpoonup u_\infty \ne 0, \end{aligned}$$

    weakly in \(L^2\) as \(\ell \rightarrow \infty \).

Proof

(1) \(\Rightarrow \) (2). We omit the subscript n in this part since it plays no essential role. Pick two functions \(\phi ,\psi \in L^2\). The goal is to show that for any \(\varepsilon >0\), there exists \(K = K(\varepsilon ) \ge 1\) such that

$$\begin{aligned} I := \left\langle D_{{\tilde{\lambda }}}^{-1} {\L } _{{\tilde{\nu }}}^{-1} e^{-i \tilde{t} \langle \nabla \rangle } T_{\tilde{x}}^{-1} \left( T_{x} e^{it \langle \nabla \rangle } {\L } _{\nu } D_{\lambda } P_{\lambda } \phi \right) , \psi \right\rangle _{L^2} \end{aligned}$$

obeys the bound \(|I| \le \varepsilon \) as long as

$$\begin{aligned} \frac{\lambda }{\tilde{\lambda }} + \frac{\tilde{\lambda }}{\lambda } + \lambda |\nu - \tilde{\nu } | + \frac{|s_\Delta |}{\lambda ^2} + \frac{|y_\Delta |}{\lambda } \ge K, \end{aligned}$$

where \(P_\lambda \) is a suitable frequency cutoff, \((-s_\Delta ,y_\Delta )=L_{\tilde{\nu }} (\tilde{t}-t,\tilde{x}-x)\), and \(\nu , \tilde{\nu } \in B(0,\Lambda )\) for some \(\Lambda \). By a density argument, we may suppose without loss of generality that \(\phi , \psi \in \mathcal {S}\) and \(\textrm{supp}\,\mathcal {F} \phi , \textrm{supp}\,\mathcal {F} \psi \subseteq \overline{ B(0,R_0)}\) for some \(R_0 \gg 1\). Furthermore, after this modification, we may replace \(P_\lambda \) by the identity.

Let us begin with the proof when \(\frac{\lambda }{\tilde{\lambda }} + \frac{\tilde{\lambda }}{\lambda }\) is large. Note that if the Fourier support of a function f is a subset of \(B(c, R)\), then those of \(D_\lambda f\) and \({\L } _\nu f\) are included in \(B \left( \frac{c}{\lambda }, \frac{R}{\lambda }\right) \) and \(B(l_{\nu }(c), 2\langle \nu \rangle R)\), respectively. Hence

$$\begin{aligned} \textrm{supp}\,\mathcal {F}\left( D_{{\tilde{\lambda }}}^{-1} {\L } _{{\tilde{\nu }}}^{-1} e^{-i \tilde{t} \langle \nabla \rangle }T_{\tilde{x}}^{-1} \left( T_{x} e^{it \langle \nabla \rangle } {\L } _{\nu } D_{\lambda } \phi \right) \right) \subseteq B \left( \tilde{\lambda } l_{-\tilde{\nu }} \left( l_{\nu } (0) \right) , 4 \langle \tilde{\nu } \rangle \langle \nu \rangle R_0 \tfrac{\tilde{\lambda }}{\lambda } \right) . \end{aligned}$$

By Bernstein’s inequality and boundedness of \(\nu _n\) and \(\tilde{\nu }_n\),

$$\begin{aligned} |I|\le & {} \left\| D_{{\tilde{\lambda }}}^{-1} {\L } _{{\tilde{\nu }}}^{-1} e^{-i \tilde{t} \langle \nabla \rangle } T_{\tilde{x}}^{-1} \left( T_{x} e^{it \langle \nabla \rangle } {\L } _{\nu } D_{\lambda } \phi \right) \right\| _{L_x^{\frac{2(d+2)}{d}} } \Vert \psi \Vert _{L_x^{\frac{2(d+2)}{d+4}}} \\ \lesssim _\psi & {} \left( 4 \langle \tilde{\nu } \rangle \langle \nu \rangle R_0 \tfrac{\tilde{\lambda }}{\lambda } \right) ^{\frac{d}{d+2}} \left\| D_{{\tilde{\lambda }}}^{-1} {\L } _{{\tilde{\nu }}}^{-1} e^{-i \tilde{t} \langle \nabla \rangle } T_{\tilde{x}}^{-1} \left( T_{x} e^{it \langle \nabla \rangle } {\L } _{\nu } D_{\lambda } \phi \right) \right\| _{L_x^2}\\ \lesssim _{\Lambda ,R_0} & {} \left( \tfrac{\tilde{\lambda }}{\lambda } \right) ^{\frac{d}{d+2}} \Vert \phi \Vert _{L^2} \rightarrow 0 \end{aligned}$$

as \(\frac{\lambda }{\tilde{\lambda }}\rightarrow \infty \).
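
The exponents in the Hölder and Bernstein steps above can be verified directly. As a sketch: for a function g whose Fourier support lies in a ball of radius \(\rho \), Bernstein's inequality gives \(\Vert g\Vert _{L_x^q} \lesssim \rho ^{d\left( \frac{1}{2} - \frac{1}{q}\right) } \Vert g\Vert _{L_x^2}\) for \(q \ge 2\). With \(q = \frac{2(d+2)}{d}\) and its dual exponent \(q' = \frac{2(d+2)}{d+4}\), we have

$$\begin{aligned} \frac{1}{q} + \frac{1}{q'} = \frac{d}{2(d+2)} + \frac{d+4}{2(d+2)} = 1, \qquad d\left( \frac{1}{2} - \frac{d}{2(d+2)} \right) = \frac{d}{d+2}. \end{aligned}$$

Taking \(\rho = 4 \langle \tilde{\nu } \rangle \langle \nu \rangle R_0 \tfrac{\tilde{\lambda }}{\lambda }\), the radius of the Fourier support found above, gives the factor appearing in the second line of the estimate.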

Hence, there exists \(K_1 = K_1(\varepsilon )\) such that we obtain the desired smallness if \(\frac{\lambda }{\tilde{\lambda }} \ge K_1\). On the other hand, in the case when \(\frac{\tilde{\lambda }}{\lambda }\) is large, we use the identity

$$\begin{aligned} I= \left\langle \phi , D_{{\lambda }}^{-1} {\L } _{{\nu }}^{-1} \left( m_0^{\nu }(\nabla )^{-1} m_0^{\tilde{\nu }}(\nabla ) \right) e^{-i {t} \langle \nabla \rangle } T_{{x}}^{-1} \left( T_{\tilde{x}} e^{i\tilde{t} \langle \nabla \rangle } {\L } _{\tilde{\nu }} D_{\tilde{\lambda }} \psi \right) \right\rangle _{L^2}. \end{aligned}$$

Since \(m_0^{\nu }(\nabla )^{-1} m_0^{\tilde{\nu }}(\nabla )\) is a multiplication by a bounded factor in the Fourier side, we obtain

$$\begin{aligned} & {} \left\| D_{{\lambda }}^{-1} {\L } _{{\nu }}^{-1} \left( m_0^{\nu }(\nabla )^{-1} m_0^{\tilde{\nu }}(\nabla ) \right) e^{-i {t} \langle \nabla \rangle } T_{{x}}^{-1} \left( T_{\tilde{x}} e^{i\tilde{t} \langle \nabla \rangle } {\L } _{\tilde{\nu }} D_{\tilde{\lambda }} \psi \right) \right\| _{L_x^{\frac{2(d+2)}{d}}}\\ {} & {} \lesssim \left( \tfrac{\lambda }{\tilde{\lambda }} \right) ^{\frac{d}{d+2}} \Vert \psi \Vert _{L_x^2} \rightarrow 0 \end{aligned}$$

as \(\frac{\tilde{\lambda }}{\lambda }\rightarrow \infty \), just as in the previous case. Thus, replacing \(K_1\) with a larger one if necessary, we obtain the desired smallness if \(\frac{\tilde{\lambda }}{\lambda }\ge K_1\).

We suppose \(\frac{\lambda }{\tilde{\lambda }} + \frac{\tilde{\lambda }}{\lambda }\le 2K_1\) in what follows. Let us next consider the case \(\lambda = \tilde{\lambda }= 1\). In this case, we have \(\nu = \tilde{\nu } = 0\) by the normalization condition. Furthermore, \((-s_\Delta ,y_\Delta )= (\tilde{t}-t,\tilde{x}-x)\). One has

$$\begin{aligned} I= \left\langle T_{x-\tilde{x}} e^{i \left( t- \tilde{t} \right) \langle \nabla \rangle } \phi , \psi \right\rangle _{L^2}. \end{aligned}$$

By a standard argument, one sees that there exists \(K_2 = K_2(\varepsilon )\) such that the modulus of the right-hand side is smaller than \(\varepsilon \) if \(|t-\tilde{t}| + |x-\tilde{x}|\ge K_2\).

Let us move to the case \(\lambda \rightarrow \infty \). Note that \(\tilde{\lambda }\ge \frac{\lambda }{2K_1} \rightarrow \infty \) by our assumption. We next consider the case when \(\lambda |\nu -\tilde{\nu }|\) is sufficiently large. A computation shows

$$\begin{aligned} \tilde{\lambda } | l_{-\tilde{\nu }} (l_{\nu } (0))|= & {} \tilde{\lambda } \sqrt{(\langle \nu \rangle \langle \tilde{\nu }\rangle - \nu \cdot \tilde{\nu } + 1) (\langle \nu \rangle \langle \tilde{\nu } \rangle - \nu \cdot \tilde{\nu } - 1)}\\ \ge & {} \sqrt{\frac{\min (\langle \nu \rangle , \langle \tilde{\nu } \rangle )}{\max (\langle \nu \rangle , \langle \tilde{\nu } \rangle )}} \frac{\lambda |\nu -\tilde{\nu }|}{2K_1} . \end{aligned}$$
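
The first line of this computation can be cross-checked against the mass-shell relation \(|\eta |^2 = \langle \eta \rangle ^2 - 1\). Writing \(a := \langle \nu \rangle \langle \tilde{\nu }\rangle - \nu \cdot \tilde{\nu }\), a label used only in this remark, the product under the square root is \((a+1)(a-1) = a^2 - 1\). The square root is real because \(a \ge 1\), which follows from

$$\begin{aligned} \langle \nu \rangle ^2 \langle \tilde{\nu }\rangle ^2 - \left( 1 + |\nu | |\tilde{\nu }| \right) ^2 = \left( |\nu | - |\tilde{\nu }| \right) ^2 \ge 0 \quad \text { and } \quad \nu \cdot \tilde{\nu } \le |\nu | |\tilde{\nu }|. \end{aligned}$$

Equality \(a = 1\) holds exactly when \(\nu = \tilde{\nu }\), consistent with the lower bound in terms of \(|\nu - \tilde{\nu }|\).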

Hence, there exists \(K_3\) such that if \(\lambda |\nu -\tilde{\nu }| \ge K_3\), then \(I=0\) because the Fourier supports are disjoint. We therefore suppose that \(\lambda |\nu -\tilde{\nu }| \le K_3\). One sees from (3.4) that

$$\begin{aligned} {\L } _{{\tilde{\nu }}}^{-1} e^{-i \tilde{t} \langle \nabla \rangle } T_{\tilde{x}}^{-1} T_{x} e^{it \langle \nabla \rangle } = {\L } _{{\tilde{\nu }}}^{-1} T_{x-\tilde{x}} e^{i(t-\tilde{t}) \langle \nabla \rangle } = T_{-y_\Delta } e^{i s_\Delta \langle \nabla \rangle } {\L } _{{\tilde{\nu }}}^{-1}. \end{aligned}$$

Hence,

$$\begin{aligned} I = \left\langle T_{-{y_\Delta }/{\tilde{\lambda }}} e^{i \left( s_\Delta /\tilde{\lambda }^2 \right) \langle \nabla \rangle } D_{\tilde{\lambda }}^{-1} {\L } _{{\tilde{\nu }}}^{-1} {\L } _\nu D_\lambda \phi , \psi \right\rangle _{L^2}. \end{aligned}$$

One verifies that \(D_{\tilde{\lambda }}^{-1}{\L } _{{\tilde{\nu }}}^{-1} {\L } _\nu D_\lambda \phi \) takes values in a precompact set in \(L^2\). Hence, there exists \(K_4\) such that if \(|s_\Delta |/\tilde{\lambda }^2 + |y_\Delta |/\tilde{\lambda }\ge K_4\), then \(|I| \le \varepsilon \).

Combining the above cases, we obtain the existence of the desired K.

(2) \(\Rightarrow \) (3). Pick \(u_\infty \in H^1\), \(u_\infty \ne 0\), and set \(u_k := T_{x_{n_k}} e^{it_{n_k} \langle \nabla \rangle } {\L } _{\nu _{n_k}} D_{\lambda _{n_k}} P_{n_k} u_\infty \).

(3) \(\Rightarrow \) (1). We argue by contradiction. Suppose that (1) fails. Then, there exists a subsequence \(\{n_k\}_{k\ge 1}\) such that

$$\begin{aligned} \nu _{n_k} \rightarrow \nu _\infty \in \mathbb {R}^d\quad \text { and } \quad \tilde{\nu }_{n_k} \rightarrow \tilde{\nu }_\infty \end{aligned}$$

and

$$\begin{aligned} & {} \frac{\lambda _{n_k}}{\tilde{\lambda }_{n_k}} \rightarrow \lambda _*\in (0,\infty ),\quad \tilde{\lambda }_{n_k} |\nu _{n_k} - \tilde{\nu }_{n_k}| \rightarrow \nu _*\in [0,\infty ), \\ {} & {} \frac{s_{\Delta ,n_k}}{\lambda _{n_k}^2} \rightarrow s_*\in \mathbb {R} , \quad \frac{y_{\Delta ,n_k}}{\lambda _{n_k} } \rightarrow y_{*}\in \mathbb {R}^d,\quad \text { as } k\rightarrow \infty , \end{aligned}$$

where \((-s_{\Delta ,n},y_{\Delta ,n})=L_{\tilde{\nu }_n} (\tilde{t}_n-t_n,\tilde{x}_n-x_n )\). Along this sequence, the operator

$$\begin{aligned} S_n:= D_{{\tilde{\lambda }}_n}^{-1} {\L } _{{\tilde{\nu }}_n}^{-1} e^{-i \tilde{t}_n \langle \nabla \rangle } T_{\tilde{x}_n}^{-1} T_{x_n} e^{it_n \langle \nabla \rangle } {\L } _{\nu _n} D_{\lambda _n} = T_{-y_{\Delta ,n}/{\tilde{\lambda }}_n} e^{i \left( s_{\Delta ,n}/{\tilde{\lambda }}_n^2 \right) \langle \nabla \rangle } {\L } _{{\tilde{\nu }_n}}^{-1} \end{aligned}$$

converges to a bounded operator, say \(S_\infty \), in the strong operator sense.

Now, we suppose that a bounded sequence \(\{u_k\}_{k\ge 1} \subseteq H^1\) satisfies

$$\begin{aligned} D_{{\tilde{\lambda }}_{n_{k_\ell }}}^{-1} {\L } _{{\tilde{\nu }}_{n_{k_\ell }}}^{-1} e^{-i \tilde{t}_{n_{k_\ell }} \langle \nabla \rangle } T_{\tilde{x}_{n_{k_\ell }}}^{-1} u_{k_\ell } \rightharpoonup 0, \quad D_{{\lambda }_{n_{k_\ell }}}^{-1} {\L } _{{\nu }_{n_{k_\ell }}}^{-1} e^{-i {t}_{n_{k_\ell }} \langle \nabla \rangle } T_{{x}_{n_{k_\ell }}}^{-1} u_{k_\ell } \rightharpoonup u_\infty ~\text { in } L^2~\text { as } \ell \rightarrow \infty . \end{aligned}$$

Since

$$\begin{aligned} D_{{\tilde{\lambda }}_{n_{k_\ell }}}^{-1} {\L } _{{\tilde{\nu }}_{n_{k_\ell }}}^{-1} e^{-i \tilde{t}_{n_{k_\ell }} \langle \nabla \rangle } T_{\tilde{x}_{n_{k_\ell }}}^{-1} u_{k_\ell } = S_{n_{k_\ell }} D_{{\lambda }_{n_{k_\ell }}}^{-1} {\L } _{{\nu }_{n_{k_\ell }}}^{-1} e^{-i {t}_{n_{k_\ell }} \langle \nabla \rangle } T_{{x}_{n_{k_\ell }}}^{-1}u_{k_\ell }, \end{aligned}$$

we see from the uniqueness of the weak limit that \(S_\infty ^*u_\infty =0\), where \(S_\infty ^*\) is the adjoint operator of \(S_\infty \). This implies that \(u_\infty \) must be zero. Hence (3) fails. This completes the proof.\(\square \)

Now, we are ready to prove Theorem 3.4.

Proof of Theorem 3.4

Assume \( {\textbf {w}}^0 = \{v_n\}_{n\ge 1}\) is a bounded sequence in \(H_x^1(\mathbb {R}^d)\) satisfying \(\Vert v_n\Vert _{H^1}\le A\). We divide the proof into five steps. In the first three steps, we construct profiles, parameters, and remainders by induction. The fourth step is devoted to the mutual orthogonality of the parameters. In the last step we establish the smallness of the remainders. We remark that we freely extract a subsequence of n, denoted again by n.

Step 1. Construction of the first profile and the first remainder. If \(\eta ( {\textbf {w}}^0)=0\), then one has the desired property with the choice \(J_0=1\). Hence, we suppose \(\eta ({\textbf {w}}^0)>0\) in the sequel.

By the definition of \(\eta \), there exists \(\tilde{\phi }^1 \in \mathcal {V} ({\textbf {w}}^0)\) such that \(\Vert \tilde{\phi }^1\Vert _{L^2} \ge \frac{1}{2} \eta ({\textbf {w}}^0)>0\). By the definition of \(\mathcal {V} ({\textbf {w}}^0)\), there exists \( (\tilde{\lambda }_n^1,\tilde{\xi }_n^1,\tilde{t}_n^1, \tilde{x}_n^1)\in \mathbb {R}_+ \times \mathbb {R}^d \times \mathbb {R} \times \mathbb {R}^d\) such that

$$\begin{aligned} D_{\tilde{\lambda }_n^1}^{-1} {\L } _{\tilde{\xi }_n^1}^{-1} e^{-i \tilde{t}_n^1 \langle \nabla \rangle } T_{\tilde{x}_n^1}^{-1} v_n \rightharpoonup \tilde{\phi }^1 \quad \text {weakly in } L^2 \end{aligned}$$
(4.2)

along a subsequence. Furthermore, \(\tilde{\lambda }_n^1\) and \( |\tilde{\xi }_n^1|\) are bounded by a positive constant from below and above, respectively. Further extracting a subsequence if necessary, we have

$$\begin{aligned} \tilde{\lambda }_n^1 \!\rightarrow \! \tilde{\lambda }_\infty ^1 \!\in \! (0,\infty ],~ \tilde{\xi }_n^1 \!\rightarrow \! \tilde{\xi }_\infty ^1 \!\in \! \mathbb {R}^d,~ \left( \tilde{\lambda }_n^1 \right) ^{-2}\tilde{t}_n^1\! \rightarrow \! \tilde{\tau }_\infty ^1 \in [-\infty ,\infty ],~ e^{-i \tilde{t}_n^1/\langle \tilde{\xi }_n^1\rangle }\! \rightarrow \! e^{i\theta ^1} \in \mathbb {C}. \end{aligned}$$

Now we modify the profile and parameters so that the parameter satisfies the desired property, i.e. either \(\lim _{n\rightarrow \infty }\lambda _n^1= \infty \) or \(\lambda _n^1\equiv 1\), etc.

  • If \(\tilde{\lambda }^1_\infty < \infty \) and \(\tilde{\tau }_\infty ^1\in \mathbb {R}\), then we take

    $$\begin{aligned} \phi ^1:=e^{i \tilde{\tau }_\infty ^1(\tilde{\lambda }_\infty ^1)^2 \langle \nabla \rangle } {\L } _{\tilde{\xi }_\infty ^1} D_{\tilde{\lambda }_\infty ^1} \tilde{\phi }^1, \quad \lambda _n^1:=1, \quad \nu _n^1:=0, \quad t_n^1:=0, \quad x_n^1:= \tilde{x}_n^1. \end{aligned}$$

    Note that \(\tilde{t}_n^1 \rightarrow \tilde{\tau }_\infty ^1 (\tilde{\lambda }_\infty ^1)^2 \in \mathbb {R}\) as \(n\rightarrow \infty \).

  • If \(\tilde{\lambda }^1_\infty < \infty \) and \(\tilde{\tau }_\infty ^1 = \pm \infty \), then we take

    $$\begin{aligned} \phi ^1:= {\L } _{\tilde{\xi }_\infty ^1} D_{\tilde{\lambda }_\infty ^1} \tilde{\phi }^1, \quad \lambda _n^1:=1, \quad \nu _n^1:=0, \quad t_n^1:=\tilde{t}_n^1, \quad x_n^1:= \tilde{x}_n^1. \end{aligned}$$
  • If \(\tilde{\lambda }^1_\infty =\infty \) and \(\tilde{\tau }_\infty ^1\in \mathbb {R}\), then we take

    $$\begin{aligned} \phi ^1:=e^{-i\theta ^1} e^{-i \frac{\tilde{\tau }_\infty ^1 \Delta }{2 \left\langle \tilde{\xi }_\infty ^1\right\rangle }} \tilde{\phi }^1, \quad \lambda _n^1:=\tilde{\lambda }_n^1, \quad \nu _n^1:= \tilde{\xi }_n^1, \quad t_n^1:=0, \quad x_n^1:= \tilde{x}_n^1+ \frac{\tilde{\xi }_n^1}{ \left\langle \tilde{\xi }_n^1 \right\rangle } \tilde{t}_n^1. \end{aligned}$$
  • If \(\tilde{\lambda }^1_\infty =\infty \) and \(\tilde{\tau }_\infty ^1= \pm \infty \), then we simply take

    $$\begin{aligned} \phi ^1:=\tilde{\phi }^1, \quad \lambda _n^1:=\tilde{\lambda }_n^1, \quad \nu _n^1:=\tilde{\xi }_n^1, \quad t_n^1:=\tilde{t}_n^1, \quad x_n^1:= \tilde{x}_n^1. \end{aligned}$$

Note that one has \(\lim _{n\rightarrow \infty } t_n^1= \tilde{\tau }_\infty ^1 \in \{\pm \infty \}\) in the second and the fourth cases. Set \(\lambda _\infty ^1:= \lim _{n\rightarrow \infty } \lambda _n^1 \in \{1,\infty \}\). It follows from (3.5) that \(\Vert \phi ^1\Vert _{L^2} \gtrsim \langle \tilde{\xi }_\infty ^1\rangle ^{-1} \Vert \tilde{\phi }^1\Vert _{L^2}\). In particular, \(\phi ^1 \ne 0\). Let us now prove that

$$\begin{aligned} W_{n}^{0,1}:= D_{\lambda _n^1}^{-1} {\L } _{\nu _n^1}^{-1} e^{-i {t}_n^1 \langle \nabla \rangle } T_{x_n^1}^{-1} v_n \rightharpoonup {\phi }^1 \quad \text {weakly in} ~L^2 \end{aligned}$$
(4.3)

along the same subsequence. In the last case, this is nothing but (4.2). Furthermore, one easily verifies (4.3) in the first two cases by the convergence of the parameters. Let us consider the third case. By (3.4), we have

$$\begin{aligned} {\L } _{\tilde{\xi }_n^1}^{-1} e^{-i \tilde{t}_n^1 \langle \nabla \rangle } T_{\frac{\tilde{\xi }_n^1}{ \left\langle \tilde{\xi }_n^1 \right\rangle } \tilde{t}_n^1} = e^{-i \frac{\tilde{t}_n^1}{ \left\langle \tilde{\xi }_n^1 \right\rangle } \langle \nabla \rangle } {\L } _{\tilde{\xi }_n^1}^{-1}, \end{aligned}$$

from which we obtain

$$\begin{aligned} D_{\tilde{\lambda }_n^1}^{-1} {\L } _{\tilde{\xi }_n^1}^{-1} e^{-i \tilde{t}_n^1 \langle \nabla \rangle } T_{\tilde{x}_n^1}^{-1} v_n = e^{-i \frac{\tilde{t}_n^1}{ \left\langle \tilde{\xi }_n^1 \right\rangle } \left\langle \left( \lambda _n^1 \right) ^{-1} \nabla \right\rangle } \left( D_{{\lambda }_n^1}^{-1} {\L } _{\nu _n^1}^{-1} e^{-i {t}_n^1 \langle \nabla \rangle } T_{x_n^1}^{-1} v_n \right) . \end{aligned}$$

Then, one can extract a subsequence so that the strong operator convergence

$$\begin{aligned} e^{-i \frac{\tilde{t}_n^1}{ \left\langle \tilde{\xi }_n^1 \right\rangle } \left\langle \left( \lambda _n^1 \right) ^{-1} \nabla \right\rangle } = e^{-i \frac{\tilde{t}_n^1}{ \left\langle \tilde{\xi }_n^1 \right\rangle } \left( \left\langle \left( \lambda _n^1 \right) ^{-1} \nabla \right\rangle -1 \right) } e^{-i \frac{\tilde{t}_n^1}{ \left\langle \tilde{\xi }_n^1 \right\rangle }} \rightarrow e^{-i \frac{\tilde{\tau }_\infty ^1 }{2 \left\langle \tilde{\xi }_\infty ^1 \right\rangle }\Delta } e^{i \theta ^1} \end{aligned}$$

holds as \(n\rightarrow \infty \) as an operator from \(L^2\) into itself. Since the limit operator is unitary in \(L^2\), we obtain

$$\begin{aligned} D_{{\lambda }_n^1}^{-1} {\L } _{\nu _n^1}^{-1} e^{-i {t}_n^1 \langle \nabla \rangle } T_{x_n^1}^{-1} v_n \rightharpoonup \left( e^{-i \frac{\tilde{\tau }_\infty ^1 }{2 \left\langle \tilde{\xi }_\infty ^1 \right\rangle } \Delta } e^{i \theta ^1}\right) ^{-1} \tilde{\phi }^1 = \phi ^1 \end{aligned}$$

as desired. We obtain (4.3).
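The strong operator convergence used in the third case can be checked at the level of Fourier multipliers. As a sketch, writing \(\zeta \) for the frequency variable (a notation introduced here only for illustration), one has for each fixed \(\zeta \in \mathbb {R}^d\)

$$\begin{aligned} \frac{\tilde{t}_n^1}{ \left\langle \tilde{\xi }_n^1 \right\rangle } \left( \left\langle \left( \lambda _n^1 \right) ^{-1} \zeta \right\rangle - 1 \right) = \frac{\tilde{t}_n^1}{ \left( \lambda _n^1 \right) ^2} \cdot \frac{|\zeta |^2}{ \left\langle \tilde{\xi }_n^1 \right\rangle \left( 1 + \left\langle \left( \lambda _n^1 \right) ^{-1} \zeta \right\rangle \right) } \rightarrow \frac{\tilde{\tau }_\infty ^1 |\zeta |^2}{2 \left\langle \tilde{\xi }_\infty ^1 \right\rangle }\quad \text { as } n\rightarrow \infty , \end{aligned}$$

where we used \(\langle a \rangle - 1 = |a|^2/(1+\langle a \rangle )\), \(\lambda _n^1 = \tilde{\lambda }_n^1 \rightarrow \infty \), and \((\tilde{\lambda }_n^1)^{-2} \tilde{t}_n^1 \rightarrow \tilde{\tau }_\infty ^1 \in \mathbb {R}\). Since the corresponding multipliers all have modulus one, dominated convergence on the Fourier side upgrades this pointwise convergence of the phases to strong convergence of the operators on \(L^2\).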

Furthermore, if \(\lambda _n^1\equiv 1\), that is \(\tilde{\lambda }_\infty ^1<\infty \), one sees from the boundedness of \(\{\nu _n^1\}_n\) that the sequence \(\{D_{\lambda _n^1}^{-1} {\L } _{\nu _n^1}^{-1} e^{-i {t}_n^1 \langle \nabla \rangle }T_{x_n^1}^{-1} v_n\}_{n\ge 1}\) is bounded in \(H^1\). Hence, one has \(\phi ^1 \in H^1\) from (4.3). Furthermore, the same weak convergence as in (4.3) holds in the weak \(H^1\) sense.

Now, we define the remainder term \(\textbf{w}^1 = \{w_n^1\}_{n\ge 1}\) by

$$\begin{aligned} w_n^1 := w_n^0 - T_{x_n^1} e^{i {t}_n^1 \langle \nabla \rangle } {\L } _{{\nu }_n^1} D_{{\lambda }_n^1}P_n^1 \phi ^1. \end{aligned}$$

Then, the decomposition for \(J=1\) immediately follows. Moreover, by virtue of the presence of \(P_n^1\), \(\{w_n^1\}\) is bounded in \(H^1\). Furthermore, as \(P_n^1\) converges to the identity in the strong operator sense, we have

$$\begin{aligned} W_{n}^{1,1} := D_{{\lambda }_n^1}^{-1} {\L } _{{\nu }_n^1}^{-1} e^{-i {t}_n^1 \langle \nabla \rangle } T_{x_n^1}^{-1} w_n^1 \rightharpoonup 0\quad \text { in } L^2\quad \text { as } n \rightarrow \infty . \end{aligned}$$
(4.4)

Note that if \(\lambda _n^1\equiv 1\) then the convergence holds weakly in \(H^1\) as in (4.3).

Step 2. Proof of the decoupling identities. Let us claim

$$\begin{aligned} \left\| w_n^0\right\| _{L^2}^2= & {} \left\| w_n^{1}\right\| _{L^2}^2+\left\| T_{x^1_n} e^{i t^1_n \langle \nabla \rangle } {\L } _{\nu ^1_n} D_{\lambda ^1_n} P_{n}^1\phi ^1\right\| _{L^2}^2+o(1),\end{aligned}$$
(4.5a)
$$\begin{aligned} \left\| w_n^0\right\| _{\dot{H}^1}^2= & {} \left\| w_n^{1}\right\| _{\dot{H}^1}^2+\left\| T_{x^1_n} e^{i t^1_n \langle \nabla \rangle } {\L } _{\nu ^1_n} D_{\lambda ^1_n} P_{n}^1 \phi ^1\right\| _{\dot{H}^1}^2+o(1)\quad \text { as } n \rightarrow \infty . \end{aligned}$$
(4.5b)

Note that \(w_n^0=v_n\). For \(k=0,1\), we have

$$\begin{aligned} \left\langle \nabla ^k w_n^0, \nabla ^k w_n^0 \right\rangle _{L^2}= & {} \left\| w_n^1\right\| _{\dot{H}^k}^2 + \left\| T_{x_n^1} e^{i {t}_n^1 \langle \nabla \rangle } {\L } _{{\nu }_n^1} D_{\lambda _n^1}P_n^1 \phi ^1 \right\| _{\dot{H}^k}^2 \\ {} & {} + 2 \Re \left\langle \nabla ^k w_n^1, \nabla ^k T_{x_n^1} e^{i {t}_n^1 \langle \nabla \rangle } {\L } _{{\nu }_n^1} D_{{\lambda }_n^1}P_n^1 \phi ^1 \right\rangle _{L^2}. \end{aligned}$$

Hence, it suffices to show that the last term of the right-hand side tends to zero as \(n\rightarrow \infty \). If \(\lambda _n^1\equiv 1\), then it is a direct consequence of \(\nu _n^1 \equiv 0\) and the fact that the convergence (4.3) holds weakly in \(H^1\). In the case \(\lambda _n^1\rightarrow \infty \) as \(n\rightarrow \infty \), one has

$$\begin{aligned} \left\langle w_n^1, T_{x_n^1} e^{i {t}_n^1 \langle \nabla \rangle } {\L } _{\nu _n^1} D_{\lambda _n^1}P_n^1 \phi ^1 \right\rangle _{{H}_x^k} = \left\langle W_n^{1,1} , D_{\lambda _n^1}^{-1} \langle \nabla \rangle ^{2k} m_k^{\nu _n^1}(\nabla )^{-1} D_{\lambda _n^1} P_{\le (\lambda _n^1)^\theta } \phi ^1 \right\rangle _{L^2}. \end{aligned}$$

Since

$$\begin{aligned} D_{\lambda _n^1}^{-1} \langle \nabla \rangle ^{2k} m_k^{\nu _n^1}(\nabla )^{-1} D_{\lambda _n^1} P_{\le \left( \lambda _n^1 \right) ^\theta } \rightarrow \langle \nu _\infty ^1 \rangle ^{1-2k} \end{aligned}$$

in the strong operator sense in \(\mathcal {L}(L^2)\), we see from (4.4) that

$$\begin{aligned} \Re \left\langle \nabla ^k w_n^1, \nabla ^k T_{x_n^1} e^{i {t}_n^1 \langle \nabla \rangle } {\L } _{\nu _n^1} D_{\lambda _n^1}P_n^1 \phi ^1 \right\rangle _{L^2} \rightarrow 0\quad \text { as } n\rightarrow \infty . \end{aligned}$$

Therefore, we obtain (4.5a, b). It also yields

$$\begin{aligned} \limsup _{n\rightarrow \infty } \Vert w_n^1\Vert _{H^1} \le \limsup _{n\rightarrow \infty } \Vert w_n^0\Vert _{H^1} \le A. \end{aligned}$$

Furthermore, mimicking the proof of the claim one also has

$$\begin{aligned} \Vert w_n^0\Vert _{H^{ \frac{1}{2}}}^2 = \Vert w_n^{1}\Vert _{H^{\frac{1}{2}}}^2+\left\| T_{x^1_n} e^{i t^1_n \langle \nabla \rangle } {\L } _{\nu ^1_n} D_{\lambda ^1_n} P_{n}^1 \phi ^1\right\| _{H^{ \frac{1}{2}}}^2 + o(1)\quad \text { as } n\rightarrow \infty . \end{aligned}$$
(4.6)

We now turn to the energy decoupling for \(J=1\). By (4.5a, b), it is enough to prove

$$\begin{aligned} \Vert \Re w_n^0\Vert _{L_x^\frac{2(d+2)}{d}}^\frac{2(d+2)}{d} - \Vert \Re w_n^1\Vert _{L_x^\frac{2(d+2)}{d}}^\frac{2(d+2)}{d} - \left\| \Re \left( T_{x^1_n} e^{i t^1_n \langle \nabla \rangle } {\L } _{\nu ^1_n} D_{\lambda ^1_n} P_{n}^1 \phi ^1 \right) \right\| _{L_x^\frac{2(d+2)}{d}}^\frac{2(d+2)}{d} \rightarrow 0~ \text { as } n\rightarrow \infty . \end{aligned}$$
(4.7)

When \(\lambda _n^1 \equiv 1\), \(\nu _n^1 \equiv 0\), \(t_n^1 \equiv 0\), we see from (4.3) and (4.4) that

$$\begin{aligned} \Vert \Re w_n^0\Vert _{L_x^\frac{2(d+2)}{d}} = \left\| \Re T_{x_n^1}^{-1}w_n^0 \right\| _{L_x^\frac{2(d+2)}{d}} \rightarrow \Vert \Re \phi ^1\Vert _{L_x^\frac{2(d+2)}{d}} \end{aligned}$$

and

$$\begin{aligned} \Vert \Re w_n^1\Vert _{L_x^\frac{2(d+2)}{d}} = \left\| \Re T_{x_n^1}^{-1}w_n^1 \right\| _{L_x^\frac{2(d+2)}{d}} \rightarrow 0\quad \text { as } n\rightarrow \infty . \end{aligned}$$

Together with

$$\begin{aligned} \left\| \Re \left( T_{x^1_n} \phi ^1 \right) \right\| _{L_x^\frac{2(d+2)}{d}} = \Vert \Re \phi ^1\Vert _{L_x^\frac{2(d+2)}{d}}, \end{aligned}$$

we have (4.7).

When \(\lambda _n^1 \equiv 1\), \(\nu _n^1 \equiv 0\), \(t_n^1 \rightarrow t_\infty ^1 \in \{\pm \infty \}\), we see from the dispersive estimate that

$$\begin{aligned} \left\| \Re \left( T_{x^1_n} e^{i t^1_n \langle \nabla \rangle } \phi ^1 \right) \right\| _{L_x^\frac{2(d+2)}{d}} \le \left\| e^{i t^1_n \langle \nabla \rangle } \phi ^1 \right\| _{L_x^\frac{2(d+2)}{d}} \rightarrow 0\quad \text { as } n\rightarrow \infty . \end{aligned}$$

Hence, together with the embedding \(H^1 \hookrightarrow L^{\frac{2(d+2)}{d}}\) and the uniform boundedness of \(\{w_n^0\}_{n\ge 1}\) and \(\{w_n^1\}_{n \ge 1}\), one has

$$\begin{aligned} & {} \left| \Vert \Re w_n^0\Vert _{L_x^\frac{2(d+2)}{d}}^\frac{2(d+2)}{d} - \Vert \Re w_n^1\Vert _{L_x^\frac{2(d+2)}{d}}^\frac{2(d+2)}{d} - \left\| \Re \left( T_{x^1_n} e^{i t^1_n \langle \nabla \rangle } \phi ^1 \right) \right\| _{L_x^\frac{2(d+2)}{d}}^\frac{2(d+2)}{d}\right| \\ {} & {} \lesssim _d \left( \Vert w_n^0\Vert _{L_x^\frac{2(d+2)}{d}} + \Vert w_n^1\Vert _{L_x^\frac{2(d+2)}{d}} \right) ^{\frac{d+4}{d}} \left\| \Re \left( T_{x^1_n} e^{i t^1_n \langle \nabla \rangle } \phi ^1 \right) \right\| _{L_x^\frac{2(d+2)}{d}} \rightarrow 0\quad \text { as } n\rightarrow \infty . \end{aligned}$$
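The bound in the last display rests on an elementary inequality, which we sketch for convenience. The shorthand here is introduced only for this remark: \(p = \frac{2(d+2)}{d}\), so that \(p-1 = \frac{d+4}{d}\), and \(g_n := T_{x^1_n} e^{i t^1_n \langle \nabla \rangle } \phi ^1\), assuming as in the display that \(w_n^0 = w_n^1 + g_n\) in the present case. With \(a = \Vert \Re w_n^0\Vert _{L_x^p}\), \(b = \Vert \Re w_n^1\Vert _{L_x^p}\), and \(c = \Vert \Re g_n\Vert _{L_x^p}\), the triangle inequality gives \(|a-b| \le c \le a+b\), and the mean value theorem applied to \(s \mapsto s^p\) yields

$$\begin{aligned} \left| a^p - b^p - c^p \right| \le \left| a^p - b^p \right| + c^{p-1} c \le p \max (a,b)^{p-1} |a-b| + (a+b)^{p-1} c \le (p+1) (a+b)^{\frac{d+4}{d}} c. \end{aligned}$$

Since \(a + b \le \Vert w_n^0\Vert _{L_x^p} + \Vert w_n^1\Vert _{L_x^p}\), this is the stated estimate with an implicit constant depending only on d.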

When \(\lambda _\infty ^1 = \infty \), by Bernstein’s inequality, (3.5), \(\sup _\xi |\partial _{\xi _j} l_{\nu _n^1}(\xi )| \lesssim \langle \nu _n^1\rangle \), and the boundedness of \(\nu _n^1\), we have

$$\begin{aligned} & {} \left\| e^{i t^1_n \langle \nabla \rangle } {\L } _{\nu ^1_n} D_{\lambda ^1_n} P_{n}^1 \phi ^1 \right\| _{L_x^\frac{2(d+2)}{d}}\\ {} & {} \lesssim \left( \textrm{diam}\left( \textrm{supp}\,\mathcal {F}\left( e^{i t^1_n \langle \nabla \rangle } {\L } _{\nu ^1_n} D_{\lambda ^1_n} P_{n}^1 \phi ^1 \right) \right) \right) ^{\frac{d}{d+2}} \left\| e^{i t^1_n \langle \nabla \rangle } {\L } _{\nu ^1_n} D_{\lambda ^1_n} P_{n}^1 \phi ^1 \right\| _{L_x^2}\\ {} & {} = \left( \textrm{diam}\left( \textrm{supp}\,\mathcal {F}\left( {\L } _{\nu ^1_n} D_{\lambda ^1_n} P_{n}^1 \phi ^1 \right) \right) \right) ^\frac{d}{d+2} \left\| {\L } _{\nu _n^1} D_{\lambda _n^1} P_n^1\phi ^1 \right\| _{L_x^2} \\ {} & {} \lesssim \left( \left\langle \nu _n^1 \right\rangle \textrm{diam}\left( \textrm{supp}\left( \mathcal {F} \left( D_{\lambda _n^1} P_n^1 \phi ^1\right) \right) \right) \right) ^\frac{d}{d+2} \langle \nu _n^1 \rangle \left\| \phi ^1 \right\| _{L_x^2}\\ {} & {} \lesssim _{\sup |\nu _n^1|} \left( \lambda _n^1 \right) ^{\frac{d(\theta - 1)}{d+2}} \left\| \phi ^1 \right\| _{L_x^2} \rightarrow 0 \end{aligned}$$

as \(n\rightarrow \infty \). Then, we obtain (4.7) as in the previous case.

Step 3. Construction of profiles and remainders by induction. Let us construct the other profiles and the remainders by induction.

Suppose that \(\eta (\textbf{w}^k)>0\) for some \(k\ge 1\). By the definition of \(\eta \), there exists \(\tilde{\phi }^{k+1} \in \mathcal {V} (\textbf{w}^k)\) such that \(\Vert \tilde{\phi }^{k+1}\Vert _{L^2} \ge \frac{1}{2} \eta (\textbf{w}^k)>0\). By the definition of \(\mathcal {V} (\textbf{w}^k)\), there exists \((\tilde{\lambda }_n^{k+1},\tilde{\xi }_n^{k+1},\tilde{t}_n^{k+1}, \tilde{x}_n^{k+1})\in \mathbb {R}_+ \times \mathbb {R}^d \times \mathbb {R} \times \mathbb {R}^d\) such that

$$\begin{aligned} D_{\tilde{\lambda }_n^{k+1}}^{-1} {\L } _{\tilde{\xi }_n^{k+1}}^{-1} e^{-i \tilde{t}_n^{k+1} \langle \nabla \rangle } T_{\tilde{x}_n^{k+1}}^{-1} w_n^k \rightharpoonup \tilde{\phi }^{k+1}\quad \text { in } L^2 \end{aligned}$$

along a subsequence. Furthermore, \(\tilde{\lambda }_n^{k+1}\) and \(|\tilde{\xi }_n^{k+1}|\) are bounded by a positive constant from below and above, respectively. Mimicking the argument in Step 1, one obtains \(\phi ^{k+1} \in L^2\), \(\phi ^{k+1}\ne 0\), and the parameter \((\lambda _n^{k+1},\nu _n^{k+1},t_n^{k+1},x_n^{k+1})\) satisfying the property of the theorem such that

$$\begin{aligned} W_{n}^{k,{k+1}}:= D_{\lambda _n^{k+1}}^{-1} {\L } _{\nu _n^{k+1}}^{-1} e^{-i {t}_n^{k+1} \langle \nabla \rangle } T_{x_n^{k+1}}^{-1} w_n^k \rightharpoonup {\phi }^{k+1}\quad \text { in } L^2 \end{aligned}$$
(4.8)

along a subsequence. If \(\lambda _n^{k+1} \equiv 1\), then \(\phi ^{k+1} \in H^1\) and the weak convergence (4.8) holds weakly in \(H^1\). By using the parameter, we define the remainder term \(\textbf{w}^{k+1} = \{w_n^{k+1}\}_{n\ge 1}\) by

$$\begin{aligned} w_n^{k+1} := w_n^k - T_{x_n^{k+1}} e^{i {t}_n^{k+1} \langle \nabla \rangle } {\L } _{{\nu }_n^{k+1}} D_{{\lambda }_n^{k+1}}P_n^{k+1} \phi ^{k+1}. \end{aligned}$$
(4.9)

This is a bounded sequence in \(H^1\):

$$\begin{aligned} \limsup _{n\rightarrow \infty } \Vert w_n^{k+1}\Vert _{H^1} \le \limsup _{n\rightarrow \infty } \Vert w_n^{k}\Vert _{H^1}. \end{aligned}$$

Furthermore,

$$\begin{aligned} W_{n}^{{k+1},{k+1}}:= D_{\lambda _n^{k+1}}^{-1} {\L } _{\nu _n^{k+1}}^{-1} e^{-i {t}_n^{k+1} \langle \nabla \rangle } T_{x_n^{k+1}}^{-1} w_n^{k+1} \rightharpoonup 0\quad \text { in } L^2. \end{aligned}$$

It holds weakly in \(H^1\) if \(\lambda _n^{k+1}\equiv 1\). Arguing as in Step 2, we have

$$\begin{aligned} \Vert w_n^k\Vert _{L^2}^2= & {} \Vert w_n^{k+1}\Vert _{L^2}^2+\left\| T_{x^{k+1}_n} e^{i t^{k+1}_n \langle \nabla \rangle } {\L } _{\nu ^{k+1}_n} D_{\lambda ^{k+1}_n} P_{n}^{k+1}\phi ^{k+1} \right\| _{L^2}^2+o(1),\end{aligned}$$
(4.10a)
$$\begin{aligned} \Vert w_n^k\Vert _{\dot{H}^1}^2= & {} \Vert w_n^{k+1}\Vert _{\dot{H}^1}^2+\left\| T_{x^{k+1}_n} e^{i t^{k+1}_n \langle \nabla \rangle } {\L } _{\nu ^{k+1}_n} D_{\lambda ^{k+1}_n} P_{n}^{k+1} \phi ^{k+1}\right\| _{\dot{H}^1}^2+o(1), \end{aligned}$$
(4.10b)
$$\begin{aligned} & {} \left\| \Re w_n^k\right\| _{L_x^\frac{2(d+2)}{d}}^\frac{2(d+2)}{d} - \left\| \Re w_n^{k+1} \right\| _{L_x^\frac{2(d+2)}{d}}^\frac{2(d+2)}{d}\nonumber \\ {} & {} - \left\| \Re \left( T_{x^{k+1}_n} e^{i t^{k+1}_n \langle \nabla \rangle } {\L } _{\nu ^{k+1}_n} D_{\lambda ^{k+1}_n} P_{n}^{k+1} \phi ^{k+1} \right) \right\| _{L_x^\frac{2(d+2)}{d}}^\frac{2(d+2)}{d} = o(1), \end{aligned}$$
(4.11)

and

$$\begin{aligned} \left\| w_n^k\right\| _{H^{ \frac{1}{2}}}^2=\left\| w_n^{k+1}\right\| _{H^{\frac{1}{2}}}^2+\left\| T_{x^{k+1}_n} e^{i t^{k+1}_n \langle \nabla \rangle } {\L } _{\nu ^{k+1}_n} D_{\lambda ^{k+1}_n} P_{n}^{k+1} \phi ^{k+1}\right\| _{H^{ \frac{1}{2}}}^2 + o(1) \end{aligned}$$
(4.12)

as \(n\rightarrow \infty \).

We repeat the above procedure so long as \(\eta (\textbf{w}^{k})>0\). If \(\eta (\textbf{w}^{k_0})=0\) holds for some \(k_0\ge 1\), we define \(J_0=k_0+1\ge 2\). Otherwise, let \(J_0=\infty \). Applying (4.9) repeatedly and recalling \(w_n^0=v_n\), we obtain the desired decomposition (3.7) for all \(J \in [1,J_0-1]\). Similarly, by (4.5a, b), (4.10a, b), (4.7), and (4.11), we have (3.8), (3.9) and (3.10) for all \(J \in [1,J_0-1]\).

Step 4. Orthogonality of the parameters. Let us establish the mutual orthogonality of the parameters. We see from (4.3) and (4.8) that \(W_n^{k,k}\rightharpoonup 0\) in \(L^2\) for \(1\le k \le J_0-1\). Now we show by induction on \(a \in \mathbb {Z}_{>0}\) that

$$\begin{aligned} W_{k,k+a}:= D_{\lambda _n^{k+a}}^{-1} {\L } _{{\nu }_n^{k+a}}^{-1} e^{-i {t}_n^{k+a} \langle \nabla \rangle } T_{x_n^{k+a}}^{-1} w_n^k \rightharpoonup \phi ^{k+a} \ne 0 \end{aligned}$$
(4.13)

holds weakly in \(L^2\) for \(a\ge 1\) and \(0\le k \le J_0-1-a\). If we obtain (4.13), then by means of “\((3)\Rightarrow (1)\)” of Lemma 4.3, one obtains the desired orthogonality of the parameters.

Let us prove (4.13). For simplicity, we consider the case \(J_0=\infty \). The base case \(a=1\) follows from (4.3) and (4.8). Pick \(a_0 \ge 1\) and suppose that (4.13) holds as long as \(1 \le a \le a_0\). Then, by (4.9), we have

$$\begin{aligned} W_{k,k+a_0+1}= & {} W_{k+1,k+a_0+1} \\ {} & {} + D_{{\lambda }_n^{k+a_0+1}}^{-1} {\L } _{{\nu }_n^{k+a_0+1}}^{-1} e^{-i {t}_n^{k+a_0+1} \langle \nabla \rangle } T_{x_n^{k+a_0+1}}^{-1} \left( T_{x_n^{k+1}} e^{ i {t}_n^{k+1} \langle \nabla \rangle } {\L } _{{\nu }_n^{k+1}} D_{{\lambda }_n^{k+1}}P_n^{k+1} \phi ^{k+1} \right) . \end{aligned}$$

By the induction hypothesis together with (2) of Lemma 4.3, one sees that

$$\begin{aligned} W_{k+1,k+a_0+1} \rightharpoonup \phi ^{k+a_0+1} \end{aligned}$$

and

$$\begin{aligned} D_{\lambda _n^{k+a_0+1}}^{-1} {\L } _{\nu _n^{k+a_0+1}}^{-1} e^{-i {t}_n^{k+a_0+1} \langle \nabla \rangle } T_{x_n^{k+a_0+1}}^{-1} \left( T_{x_n^{k+1}} e^{i {t}_n^{k+1} \langle \nabla \rangle } {\L } _{\nu _n^{k+1}} D_{\lambda _n^{k+1}}P_n^{k+1} \phi ^{k+1} \right) \rightharpoonup 0 \end{aligned}$$

weakly in \(L^2\) as \(n\rightarrow \infty \). Since k is arbitrary, we have (4.13) for \(a=a_0+1\). Thus, by induction we have (4.13) for all \(a\ge 1\).

Step 5. Smallness of the remainder term. To complete the proof, we show (3.11). For this purpose, we first prove

$$\begin{aligned} \lim _{J\rightarrow J_0-1} \eta \left( \textbf{w}^J \right) =0. \end{aligned}$$
(4.14)

If \(J_0\) is finite, then this is true by the definition of \(J_0\). Suppose \(J_0=\infty \). Combining (4.6) and (4.12), we have for \(J\ge 1\),

$$\begin{aligned} \Vert v_n\Vert _{H^{ \frac{1}{2}}}^2 = \sum _{j=1}^J \left\| T_{x^j_n} e^{i t^j_n \langle \nabla \rangle } {\L } _{\nu ^j_n} D_{\lambda ^j_n} P_{n}^j \phi ^j\right\| _{H^{ \frac{1}{2}}}^2 + \left\| w_n^J \right\| _{H^{ \frac{1}{2}}}^2 + o(1)\quad \text { as } n\rightarrow \infty . \end{aligned}$$
(4.15)

Let us claim

$$\begin{aligned} \left\| T_{x^j_n} e^{i t^j_n \langle \nabla \rangle } {\L } _{\nu ^j_n} D_{\lambda ^j_n} P_{n}^j \phi ^j\right\| _{H^{ \frac{1}{2}}} \ge \Vert \tilde{\phi }^j\Vert _{L^2}. \end{aligned}$$

We prove it for \(j=1\). Recalling the definition of the parameters and using the fact that \({\L } _{\nu }\) is unitary in \(H^{\frac{1}{2}}\), we see that

$$\begin{aligned} \left\| T_{x^1_n} e^{i t^1_n \langle \nabla \rangle } {\L } _{\nu ^1_n} D_{\lambda ^1_n} P_{n}^1 \phi ^1\right\| _{H^{ \frac{1}{2}}} = \left\| D_{\lambda ^1_n} P_{n}^1 \phi ^1\right\| _{H^{ \frac{1}{2}}}. \end{aligned}$$

If \(\lambda _n^1 \equiv 1\), then

$$\begin{aligned} \left\| D_{\lambda ^1_n} P_{n}^1 \phi ^1\right\| _{H^{ \frac{1}{2}}} = \left\| \phi ^1\right\| _{H^{ \frac{1}{2}}} = \left\| D_{\lambda _\infty ^1} \tilde{\phi }^1\right\| _{H^{ \frac{1}{2}}} \ge \left\| D_{\lambda _\infty ^1} \tilde{\phi }^1\right\| _{L^2} = \left\| \tilde{\phi }^1\right\| _{L^2}. \end{aligned}$$

If \(\lambda _n^1 \rightarrow \infty \) as \(n\rightarrow \infty \), then

$$\begin{aligned} \left\| D_{\lambda ^1_n} P_{n}^1 \phi ^1\right\| _{H^{ \frac{1}{2}}} \ge \left\| D_{\lambda ^1_n} P_{n}^1 \phi ^1\right\| _{L^2} = \left\| \tilde{\phi }^1\right\| _{L^2}. \end{aligned}$$

Hence, the claim follows. Thus, plugging the inequality of the claim into (4.15), taking the supremum in n, and letting \(J\rightarrow \infty \), one obtains

$$\begin{aligned} \sum _{j=1}^\infty \left\| \tilde{\phi }^j \right\| _{L^2}^2 \le \sup _{n}\Vert v_n\Vert _{H^{ \frac{1}{2}}}^2 \le A^2 <\infty . \end{aligned}$$

This shows \(\Vert \tilde{\phi }^j\Vert _{L^2}\rightarrow 0\) as \(j\rightarrow \infty \). Hence,

$$\begin{aligned} \eta ({\textbf {w}}^J) \le 2 \Vert \tilde{\phi }^{J+1}\Vert _{L^2} \rightarrow 0\quad \text { as } J \rightarrow \infty . \end{aligned}$$
(4.16)

This is (4.14).

If \(J_0\) is finite, then (3.11) follows from (4.14), thanks to Lemma 4.2. Let us consider the case \(J_0=\infty \). Suppose that (3.11) fails. Then, there exist \(\varepsilon _0>0\) and a sequence \(\{J_k\}_{k\ge 1}\) with \(\lim _{k\rightarrow \infty }J_k = \infty \) such that

$$\begin{aligned} \limsup _{n\rightarrow \infty } \left\| e^{-it \langle \nabla \rangle } w_n^{J_k} \right\| _{L_{t,x}^\frac{2(d+2)}{d}} \ge \varepsilon _0 \end{aligned}$$

for all \(k\ge 1\). Together with the bound

$$\begin{aligned} \limsup _{n\rightarrow \infty } \Vert w_n^{J_k}\Vert _{H^1} \le \limsup _{n\rightarrow \infty } \Vert v_n\Vert _{H^1} \le A, \end{aligned}$$

we see from Lemma 4.2 that there exists \(\alpha =\alpha (M,\varepsilon _0)>0\) such that \(\inf _{k} \eta (\textbf{w}^{J_k}) \ge \alpha \). However, this contradicts (4.16). Thus, we obtain (3.11).\(\square \)

5 Low-Frequency Nonlinear Profile: Proof of Theorem 3.5

In this section, we prove Theorem 3.5. We study the large-scale profile and approximate it by the solution of the mass-critical nonlinear Schrödinger equation. Throughout this section, we write \(f(z) = |z|^\frac{4}{d} z\). Before presenting the main result of this section, we first review the global well-posedness and scattering result for the mass-critical nonlinear Schrödinger equation

$$\begin{aligned} i\partial _t w+ \frac{1}{2} \Delta w = \mu C_d f(w), \end{aligned}$$
(5.1)

where \(\mu = \pm 1\) and the constant \(C_d\) is the well-known Wallis integral

$$\begin{aligned} C_d : = \frac{1}{2^{2+ \frac{4}{d}}\pi } \int _0^{2\pi } f(1 + e^{i\theta }) \,\textrm{d}\theta = \frac{\Gamma \left( \frac{2}{d} + \frac{3}{2} \right) }{\sqrt{\pi } \Gamma \left( \frac{2}{d} + 2\right) } < \frac{1}{2}. \end{aligned}$$
(5.2)
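As a sanity check on (5.2), the following minimal sketch compares the integral on the left with the Gamma-function expression on the right. It is an illustration only and not part of the argument; the function names and the midpoint-rule discretization are choices made here, and only the Python standard library is used.

```python
import cmath
import math


def f(z: complex, d: int) -> complex:
    """Mass-critical nonlinearity f(z) = |z|^(4/d) z."""
    return abs(z) ** (4.0 / d) * z


def c_d_quadrature(d: int, n: int = 20000) -> float:
    """Left side of (5.2), approximated by the midpoint rule on [0, 2*pi]."""
    h = 2.0 * math.pi / n
    total = sum(f(1.0 + cmath.exp(1j * (k + 0.5) * h), d) for k in range(n))
    integral = total * h
    # The imaginary part cancels by the symmetry theta -> 2*pi - theta,
    # so only the real part contributes.
    return integral.real / (2.0 ** (2.0 + 4.0 / d) * math.pi)


def c_d_gamma(d: int) -> float:
    """Right side of (5.2), written with the Gamma function."""
    x = 2.0 / d
    return math.gamma(x + 1.5) / (math.sqrt(math.pi) * math.gamma(x + 2.0))


if __name__ == "__main__":
    for d in range(1, 7):
        print(d, c_d_quadrature(d), c_d_gamma(d))
```

Both expressions can be evaluated for any positive integer d; the quadrature is only approximate, so agreement should be read up to discretization error.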

In particular, we see \(C_1 = \frac{5}{16}\), and \(C_2 = \frac{3}{8}\). For the reader’s convenience, we give the computation of (5.2) in Appendix A.1. When \(\mu = -1\), the ground state solution associated to (5.1) is

$$\begin{aligned} w_Q(t,x) : = e^{it } \left( \frac{1}{C_d}\right) ^\frac{d}{4} Q\left( \sqrt{2}x\right) , \end{aligned}$$

with

$$\begin{aligned} \Vert w_Q\Vert _{L_x^2} = (2C_d)^{-\frac{d}{4}} \Vert Q\Vert _{L_x^2}, \end{aligned}$$

where Q is the ground state of (1.2). For the mass-critical nonlinear Schrödinger equation, we have the following result:

Theorem 5.1

(Global well-posedness and scattering of the mass-critical NLS, [3,4,5,6, 17, 19, 39, 40]) Let \(w_0 \in L_x^2(\mathbb {R}^d)\), and when \(\mu = -1\) assume in addition that \(\Vert w_0\Vert _{L_x^2} < (2C_d)^{-\frac{d}{4}} \Vert Q\Vert _{L_x^2}\). Then there exists a unique global solution w to (5.1) with \(w(0) = w_0\), and

$$\begin{aligned} \Vert w\Vert _{L_{t,x}^\frac{2(d+2)}{d}(\mathbb {R} \times \mathbb {R}^d)} \le C\left( \Vert w_0\Vert _{L_x^2}\right) , \end{aligned}$$

for some continuous function C. Moreover, w scatters in \(L^2\).
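The mass threshold in Theorem 5.1 is the mass of the ground state solution \(w_Q\). As a brief check of the norm stated for \(w_Q\), the change of variables \(y = \sqrt{2} x\) gives

$$\begin{aligned} \Vert w_Q(t)\Vert _{L_x^2}^2 = C_d^{-\frac{d}{2}} \int _{\mathbb {R}^d} \left| Q\left( \sqrt{2} x \right) \right| ^2 \,\textrm{d}x = C_d^{-\frac{d}{2}} 2^{-\frac{d}{2}} \int _{\mathbb {R}^d} |Q(y)|^2 \,\textrm{d}y = (2C_d)^{-\frac{d}{2}} \Vert Q\Vert _{L_x^2}^2, \end{aligned}$$

and taking square roots recovers \(\Vert w_Q\Vert _{L_x^2} = (2C_d)^{-\frac{d}{4}} \Vert Q\Vert _{L_x^2}\).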

We now turn to the proof of Theorem 3.5.

Proof. By (3.4), we have

$$\begin{aligned} \phi _n = {\L } _{\nu _n} T_{\tilde{x}_n} e^{i\tilde{t}_n \langle \nabla \rangle } D_{\lambda _n} P_{\le \lambda _n^\theta } \phi . \end{aligned}$$

We will take \(x_n = \frac{ \nu _n t_n}{ \langle \nu _n \rangle }\) by the spatial translation invariance, and this leads to \(\tilde{x}_n = 0\) and \(\tilde{t}_n = \frac{t_n}{ \langle \nu _n \rangle }\).

Case I. \(\nu _n = 0\). To show (3.13), we only need to show

$$\begin{aligned} \left\| v_n(t + t_n, x) - {e^{-it}}{\lambda _n^{- \frac{d}{2}}} \psi _\epsilon \left( {\lambda _n^{-2} }{t}, {\lambda _n^{-1} }{x} \right) \right\| _{L_{t,x}^\frac{2(d+2)}{d}(\mathbb {R}\times \mathbb {R}^d)} < \epsilon . \end{aligned}$$
(5.3)

Before giving the approximate solutions to (2.1), we first define the solutions to (5.1) that will serve as building blocks.

When \(t_n = 0\), let \(w_n\) be the solution to (5.1) with \(w_n(0) = P_{\le \lambda _n^\theta } \phi \), and correspondingly, we let \(w_\infty \) be the solution to (5.1) with \(w_\infty (0) = \phi \).

In the case when \(\frac{t_n}{\lambda _n^2} \rightarrow \infty \) (respectively \(\frac{t_n}{\lambda _n^2} \rightarrow -\infty \)), we denote by \(w_n\) the solution to (5.1) that scatters backward (respectively forward) in time to \(e^{it \frac{\Delta }{2}} P_{\le \lambda _n^\theta } \phi \). Similarly, we define \(w_\infty \) to be the solution to (5.1) that scatters backward (respectively forward) in time to \(e^{it \frac{\Delta }{2}} \phi \). By Theorem 5.1, we have

$$\begin{aligned} S_{\mathbb {R}}(w_n) + S_{\mathbb {R}}(w_\infty ) \lesssim _{\Vert \phi \Vert _{L^2}} 1. \end{aligned}$$

We also have the following space-time boundedness of the sequence \(w_n\) by direct computation, which will be useful later in this section.

Lemma 5.2

(Boundedness of the Strichartz norms) The solutions \(w_n\) satisfy

$$\begin{aligned} \left\| |\nabla |^s w_n \right\| _{L_t^\infty L_x^2 \cap L_{t,x}^\frac{2(d+2)}{d}} \lesssim _{\Vert \phi \Vert _{L^2}} \lambda _n^{s \theta }, \end{aligned}$$
(5.4)

for any \( 0 \le s < 1+ \frac{4}{d}\) and

$$\begin{aligned} \left\| \langle \nabla \rangle ^s \partial _t w_n \right\| _{L_{t,x}^\frac{2(d+2)}{d}} \lesssim _{\Vert \phi \Vert _{L^2}} \lambda _n^{(2+s) \theta } \end{aligned}$$
(5.5)

for any \(0 \le s < \frac{4}{d}\). Moreover, we also have the approximation

$$\begin{aligned} \Vert w_n - w_\infty \Vert _{L_t^\infty L_x^2 \cap L_{t,x}^\frac{2(d+2)}{d}} + \left\| D_{\lambda _n} (w_n - P_{\le \lambda _n^\theta } w_\infty ) \right\| _{L_t^\infty H_x^\frac{1}{2}} \rightarrow 0\quad \text {as } n\rightarrow \infty . \end{aligned}$$
(5.6)

We can now construct the following approximate solutions to (2.1):

$$\begin{aligned} \tilde{v}_n(t) : = \left\{ \begin{array}{ll} e^{-it } D_{\lambda _n} \left( P_{\le \lambda _n^{2\theta } }w_n \right) \left( \frac{t}{ \lambda _n^2} \right) &{}\quad \text { if } |t| \le \lambda _n^2 T ,\\ e^{-i \left( t- \lambda _n^2 T \right) \langle \nabla \rangle } \tilde{v}_n \left( \lambda _n^2 T \right) &{}\quad \text { if } t > \lambda _n^2 T ,\\ e^{-i \left( t + \lambda _n^2 T \right) \langle \nabla \rangle } \tilde{v}_n \left( - \lambda _n^2 T\right) &{}\quad \text { if } t < - \lambda _n^2 T, \end{array}\right. \end{aligned}$$

where T is a sufficiently large positive number to be specified later. We will show that this sequence approximately solves (2.1), and then invoke Proposition 2.9 to deduce that the resulting solutions \(v_n\) obey (3.13). By the Strichartz estimate and Lemma 5.2, we have

$$\begin{aligned} \Vert \tilde{v}_n \Vert _{L_t^\infty H_x^\frac{1}{2} \cap L_{t,x}^\frac{2(d+2)}{d}}{} & {} \lesssim \Vert D_{\lambda _n} w_n \Vert _{L_t^\infty H_x^\frac{1}{2}} + \left\| D_{\lambda _n} w_n\left( \frac{t}{ \lambda _n^2} \right) \right\| _{L_{t,x}^\frac{2(d+2)}{d}}\\ {} & {} \lesssim _{\Vert \phi \Vert _{L_x^2}} 1 + \lambda _n^{- \frac{1}{2} } \left\| |\nabla |^\frac{1}{2} w_n \right\| _{L_t^\infty L_x^2} \lesssim _{ \Vert \phi \Vert _{L_x^2} } 1 + \lambda _n^{- \frac{1 - \theta }{2} } \lesssim _{ \Vert \phi \Vert _{L_x^2} } 1. \end{aligned}$$
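The powers of \(\lambda _n\) in the last display come from a scaling computation. The following is only a sketch, under the assumption (not restated in this section) that \(D_\lambda \) denotes the \(L_x^2\)-preserving dilation \(D_\lambda g(x) := \lambda ^{-\frac{d}{2}} g(\lambda ^{-1} x)\), as the form of (5.3) suggests. For a space-time function g and \(p = \frac{2(d+2)}{d}\), a change of variables gives

$$\begin{aligned} \left\| D_{\lambda } g\left( \frac{t}{\lambda ^2} \right) \right\| _{L_{t,x}^p} = \lambda ^{-\frac{d}{2} + \frac{d+2}{p}} \Vert g \Vert _{L_{t,x}^p} = \Vert g \Vert _{L_{t,x}^p}, \qquad \left\| |\nabla |^\frac{1}{2} D_\lambda g \right\| _{L_x^2} = \lambda ^{-\frac{1}{2}} \left\| |\nabla |^\frac{1}{2} g \right\| _{L_x^2}. \end{aligned}$$

With \(g = w_n\) and (5.4) at \(s = \frac{1}{2}\), the second identity yields the factor \(\lambda _n^{-\frac{1}{2}} \lambda _n^{\frac{\theta }{2}} = \lambda _n^{-\frac{1-\theta }{2}}\) appearing above.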

By the definition of \(\phi _n\) and (5.6), we obtain

Lemma 5.3

(Approximation of the initial data)

$$\begin{aligned} \limsup _{n \rightarrow \infty } \Vert \tilde{v}_n(- t_n) - \phi _n\Vert _{H_x^\frac{1}{2}} \rightarrow 0\quad \text { as } {T \rightarrow \infty }. \end{aligned}$$

Arguing as in [16], we see that \(\tilde{v}_n\) are approximate solutions to (2.1) on the large time intervals: the nonlinear solutions \(w_n\) are approximated by solutions of the free Schrödinger equation, and the free first-order Klein–Gordon propagator is asymptotically small in the Strichartz space. We refer to [16] for a similar argument.

Proposition 5.4

(Asymptotic smallness on the large time intervals)

$$\begin{aligned} & {} \limsup _{n \rightarrow \infty } \left( \left\| e^{-i \left( t- \lambda _n^2 T \right) \langle \nabla \rangle } \tilde{v}_n \left( \lambda _n^2 T \right) \right\| _{L_{t,x}^\frac{2(d+2)}{d} \left( \left( \lambda _n^2 T, \infty \right) \times \mathbb {R}^d \right) }\right. \\ {} & {} \qquad \qquad \left. + \left\| e^{-i \left( t + \lambda _n^2 T \right) \langle \nabla \rangle } \tilde{v}_n \left( - \lambda _n^2 T \right) \right\| _{L_{t,x}^\frac{2(d+2)}{d} \left( \left( - \infty , - \lambda _n^2 T \right) \times \mathbb {R}^d \right) } \right) \rightarrow 0\quad \text { as } T\rightarrow \infty . \end{aligned}$$

We now turn to the middle time interval. On the middle time interval, we see \(\tilde{v}_n\) satisfies

$$\begin{aligned} \left( - i \partial _t + \langle \nabla \rangle \right) \tilde{v}_n + \mu \langle \nabla \rangle ^{-1} f\left( \Re \tilde{v}_n \right) = e_{1,n} + (e_{2,1,n} +e_{2,2,n}+e_{2,3,n})+e_{3,n}, \end{aligned}$$

where

$$\begin{aligned} e_{1,n}&: =&e^{-it} \lambda _n^{- \frac{d}{2}}\left( P_{\le \lambda _n^{2\theta } } \left( \langle \lambda _n^{-1} \nabla \rangle - 1 + \frac{1}{2 \lambda _n^2} \Delta \right) w_n\right) \left( \frac{t}{ \lambda _n^2}, \frac{x}{ \lambda _n}\right) ,\\ e_{2,1,n}&: =&\mu \left( \langle \nabla \rangle ^{-1} - 1\right) \left( e^{-it} {C_d} P_{\le \lambda _n^{2\theta -1}} \left( f\left( w_n \left( \frac{t}{\lambda _n^2}, \frac{x}{\lambda _n} \right) \right) \right) \lambda _n^{ -\frac{d}{2} - 2} \right) ,\\ e_{2,2,n}&: =&- \mu {C_d} \lambda _n^{ -\frac{d}{2} - 2} e^{-it} \langle \nabla \rangle ^{-1} (P_{\le \lambda _n^{2\theta -1} }-1) \left( f\left( w_n \left( \frac{t}{\lambda _n^2}, \frac{x}{\lambda _n} \right) \right) \right) , \\ e_{2,3,n}&: =&- \mu {C_d } \lambda _n^{ -\frac{d}{2} - 2} e^{-it} \langle \nabla \rangle ^{-1} \left( f \left( w_n \left( \frac{t}{\lambda _n^2}, \frac{x}{\lambda _n} \right) \right) - f\left( \left( P_{\le \lambda _n^{2\theta }} w_n \right) \left( \frac{t}{\lambda _n^2}, \frac{x}{\lambda _n} \right) \right) \right) , \\ e_{3,n}&: =&\mu \lambda _n^{ -\frac{d}{2} - 2} \langle \nabla \rangle ^{-1} \left( f\left( \Re \left( e^{-it} \left( P_{\le \lambda _n^{2\theta }} w_n \right) \left( \frac{t}{\lambda _n^2} , \frac{x}{\lambda _n} \right) \right) \right) \right. \\ {} & {} \left. - e^{-it} {C_d }f\left( \left( P_{\le \lambda _n^{2\theta }} w_n \right) \left( \frac{t}{\lambda _n^2}, \frac{x}{\lambda _n} \right) \right) \right) . \end{aligned}$$

Remark 5.5 The above decomposition is slightly different from the one used in the previous result [16]. The point is that we have the factor \(\langle \nabla \rangle ^{-1}\) in the term \(e_{3,n}\), which is crucial in high dimensions.

By Plancherel’s identity, (1.3), Hölder’s inequality and (5.4), we have

$$\begin{aligned} & {} \Vert e_{1,n}\Vert _{L_t^1 H_x^\frac{1}{2} \left( \left[ - \lambda _n^2 T, \lambda _n^2 T \right] \times \mathbb {R}^d \right) }\nonumber \\ {} & {} = \lambda _n^2 \left\| \langle \lambda _n^{-1} \xi \rangle ^\frac{1}{2} \left( \langle \lambda _n^{-1} \xi \rangle - 1 - \frac{ |\xi |^2}{2 \lambda _n^2} \right) \widehat{P_{\le \lambda _n^{2\theta }}w_n}(t,\xi ) \right\| _{L_t^1 L_\xi ^2( [-T,T]\times \mathbb {R}^d)}\\ {} & {} \lesssim \lambda _n^2 \left\| \langle \lambda _n^{-1} \xi \rangle ^\frac{1}{2} \frac{ |\xi |^4}{ \lambda _n^4} \widehat{P_{\le \lambda _n^{2\theta }}w_n}(t,\xi ) \right\| _{L_t^1 L_\xi ^2([-T,T] \times \mathbb {R}^d)} \nonumber \\ {} & {} \lesssim T \lambda _n^{-2+8\theta } \left\| w_n \right\| _{L_t^\infty L_x^2} \rightarrow 0, \text { as } n \rightarrow \infty . \end{aligned}$$
(5.7)
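The second inequality in (5.7) rests on an elementary bound for the symbol, recorded here as a sketch (presumably this is the content of the expansion cited as (1.3), which is not reproduced in this section). For \(x \ge 0\),

$$\begin{aligned} 0 \le 1 + \frac{x}{2} - \sqrt{1+x} = \frac{x^2/4}{1 + \frac{x}{2} + \sqrt{1+x}} \le \frac{x^2}{8}, \quad \text {so with } x = \frac{|\xi |^2}{\lambda _n^2}: \quad \left| \langle \lambda _n^{-1} \xi \rangle - 1 - \frac{|\xi |^2}{2\lambda _n^2} \right| \le \frac{|\xi |^4}{8 \lambda _n^4}. \end{aligned}$$

On the frequency support \(|\xi | \lesssim \lambda _n^{2\theta }\) of \(P_{\le \lambda _n^{2\theta }} w_n\), the right-hand side is \(O(\lambda _n^{-4+8\theta })\) and \(\langle \lambda _n^{-1} \xi \rangle ^{\frac{1}{2}} \lesssim 1\) (assuming \(2\theta \le 1\) and \(\lambda _n \ge 1\), which we take from context). Together with the prefactor \(\lambda _n^2\) and the time integral over \([-T,T]\), this gives the bound \(T \lambda _n^{-2+8\theta }\).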

By the Mikhlin multiplier theorem, we obtain

$$\begin{aligned} & {} \Vert \langle \nabla \rangle e_{2,1,n}\Vert _{L_{t,x}^\frac{2(d+2)}{d+4} \left( \left[ - \lambda _n^2 T, \lambda _n^2 T \right] \times \mathbb {R}^d \right) }\nonumber \\ {} & {} \lesssim \lambda _n^{ -\frac{d}{2} - 2} \left\| \nabla \left( f\left( w_n \left( \frac{t}{\lambda _n^2}, \frac{x}{\lambda _n} \right) \right) \right) \right\| _{L_{t,x}^\frac{2(d+2)}{d+4} \left( \left[ - \lambda _n^2 T, \lambda _n^2 T \right] \times \mathbb {R}^d \right) }\\ {} & {} \lesssim \lambda _n^{-1} \Vert w_n\Vert _{L_{t,x}^\frac{2(d+2)}{d} ([-T, T] \times \mathbb {R}^d )}^\frac{4}{d} \Vert \nabla w_n\Vert _{L_{t,x}^\frac{2(d+2)}{d} ([-T, T] \times \mathbb {R}^d )}\nonumber \\ {} & {} \lesssim _{\Vert \phi \Vert _{L_x^2} } \lambda _n^{-1 + \theta } \rightarrow 0\quad \text { as } n\rightarrow \infty . \end{aligned}$$
(5.8)
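The steps in (5.8) can be unpacked as follows; this is a sketch, assuming \(f(u) = |u|^{\frac{4}{d}} u\) as in (1.1). For the first inequality, note that \(\langle \nabla \rangle (\langle \nabla \rangle ^{-1} - 1) = 1 - \langle \nabla \rangle \) has symbol \(-\frac{|\xi |^2}{1 + \langle \xi \rangle } = \sum _{j} \frac{i\xi _j}{1+\langle \xi \rangle } \cdot i \xi _j\), and each \(\frac{i\xi _j}{1+\langle \xi \rangle }\) satisfies the Mikhlin condition. For the second, with g a generic space-time function and \(q = \frac{2(d+2)}{d+4}\), a change of variables gives

$$\begin{aligned} \lambda ^{-\frac{d}{2} - 2} \left\| \nabla _x \left( g\left( \frac{t}{\lambda ^2}, \frac{x}{\lambda } \right) \right) \right\| _{L_{t,x}^q ([-\lambda ^2 T, \lambda ^2 T] \times \mathbb {R}^d)} = \lambda ^{-\frac{d}{2} - 2} \cdot \lambda ^{-1} \cdot \lambda ^{\frac{d+2}{q}} \Vert \nabla g \Vert _{L_{t,x}^q([-T,T] \times \mathbb {R}^d)} = \lambda ^{-1} \Vert \nabla g \Vert _{L_{t,x}^q([-T,T] \times \mathbb {R}^d)}, \end{aligned}$$

since \(\frac{d+2}{q} = \frac{d+4}{2}\). Taking \(g = f(w_n)\) and using \(|\nabla f(w_n)| \lesssim |w_n|^{\frac{4}{d}} |\nabla w_n|\), Hölder's inequality applies because \(\frac{4}{d} \cdot \frac{d}{2(d+2)} + \frac{d}{2(d+2)} = \frac{d+4}{2(d+2)} = \frac{1}{q}\), and (5.4) with \(s = 1\) supplies the factor \(\lambda _n^{\theta }\).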

Similarly, by the Bernstein inequality, one has

$$\begin{aligned} & {} \Vert \langle \nabla \rangle e_{2,2,n}\Vert _{L_{t,x}^\frac{2(d+2)}{d+4} \left( \left[ - \lambda _n^2 T, \lambda _n^2 T \right] \times \mathbb {R}^d \right) }\nonumber \\ {} & {} \lesssim \lambda _n^{ -\frac{d}{2} - 1-2\theta } \left\| \nabla \left( f\left( w_n \left( \frac{t}{\lambda _n^2}, \frac{x}{\lambda _n} \right) \right) \right) \right\| _{L_{t,x}^\frac{2(d+2)}{d+4} \left( \left[ - \lambda _n^2 T, \lambda _n^2 T \right] \times \mathbb {R}^d \right) }\\ {} & {} \lesssim _{\Vert \phi \Vert _{L_x^2} } \lambda _n^{-\theta } \rightarrow 0\quad \text { as } n\rightarrow \infty \end{aligned}$$
(5.9)

and

$$\begin{aligned} & {} \Vert \langle \nabla \rangle e_{2,3,n}\Vert _{L_{t,x}^\frac{2(d+2)}{d+4} \left( \left[ - \lambda _n^2 T, \lambda _n^2 T \right] \times \mathbb {R}^d \right) }\nonumber \\ {} & {} \lesssim \Vert w_n\Vert _{L_{t,x}^\frac{2(d+2)}{d}([-T, T]\times \mathbb {R}^d )}^\frac{4}{d} \Vert P_{> \lambda _n^{2\theta }} w_n\Vert _{L_{t,x}^\frac{2(d+2)}{d}([-T, T]\times \mathbb {R}^d )}\\ {} & {} \lesssim \lambda _n^{-2\theta } \Vert w_n\Vert _{L_{t,x}^\frac{2(d+2)}{d}([-T, T] \times \mathbb {R}^d )}^\frac{4}{d} \Vert \nabla w_n\Vert _{L_{t,x}^\frac{2(d+2)}{d}([-T, T] \times \mathbb {R}^d )} \nonumber \\ {} & {} \lesssim _{ \Vert \phi \Vert _{L_x^2} } \lambda _n^{-\theta } \rightarrow 0\quad \text { as } n\rightarrow \infty . \end{aligned}$$
(5.10)
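The gains in (5.9) and (5.10) both come from the high-frequency Bernstein inequality \(\Vert P_{> N} g \Vert _{L_x^q} \lesssim N^{-1} \Vert \nabla g \Vert _{L_x^q}\), \(1< q < \infty \). As a sketch of the bookkeeping: in (5.9) the cut-off \(N = \lambda _n^{2\theta - 1}\) contributes \(\lambda _n^{1 - 2\theta }\), rescaling the gradient term to the interval \([-T,T]\) contributes \(\lambda _n^{-1 + \frac{d+4}{2}}\), and (5.4) with \(s=1\) contributes \(\lambda _n^{\theta }\); in (5.10) the cut-off \(N = \lambda _n^{2\theta }\) acts directly on \(w_n\). Hence

$$\begin{aligned} \lambda _n^{-\frac{d}{2} - 1 - 2\theta } \cdot \lambda _n^{-1 + \frac{d+4}{2}} \cdot \lambda _n^{\theta } = \lambda _n^{-\theta }, \qquad \lambda _n^{-2\theta } \cdot \lambda _n^{\theta } = \lambda _n^{-\theta }. \end{aligned}$$

The first inequality in (5.10) additionally uses the pointwise bound \(|f(a) - f(b)| \lesssim \left( |a|^{\frac{4}{d}} + |b|^{\frac{4}{d}} \right) |a - b|\), valid for \(f(u) = |u|^{\frac{4}{d}} u\).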

We now turn to \(e_{3,n}\), and show

$$\begin{aligned} \left\| \int _0^t e^{-i(t-s) \langle \nabla \rangle } e_{3,n}(s) \,\textrm{d}s \right\| _{L_t^\infty H_x^\frac{1}{2} \cap L_{t,x}^\frac{2(d+2)}{d} \left( \left[ -\lambda _n^2 T, \lambda _n^2 T \right] \times \mathbb {R}^d \right) } \lesssim _T \lambda _n^{-1+8\theta } \rightarrow 0 \text { as } n \rightarrow \infty . \end{aligned}$$
(5.11)

For simplicity, we denote \(P_{\le \lambda _n^{2\theta }} w_n\) by \(w_n\) in what follows. This will not cause any difference because we do not use the equation for \(w_n\) to show (5.11). We would point out that we do not need the upper bounds on the regularity parameter s in the bounds (5.4) and (5.5) any more as long as \(\theta \) is replaced by \(2\theta \). We have the Fourier series expansion

$$\begin{aligned} |\Re u|^\frac{4}{d} \Re u = \sum _{k \in \mathbb {Z}} g_{2k-1} |u|^{\frac{4}{d} +2 - 2k} u^{2k-1}, \end{aligned}$$
(5.12)

where \(g_1 = C_d\) and

$$\begin{aligned} g_{2k-1} := \frac{1}{2\pi } \int _{-\pi }^\pi |\cos \theta |^{\frac{4}{d}} \cos \theta \cos ((2k-1) \theta ) d\theta . \end{aligned}$$

By [23, Proposition A.1], we have

$$\begin{aligned} g_{2k-1} = \frac{ (-1)^{k-1} \Gamma \left( \frac{3}{2} + \frac{2}{d} \right) \Gamma \left( k-1-\frac{2}{d} \right) }{ \sqrt{\pi } \Gamma \left( -\frac{2}{d} \right) \Gamma \left( k + 1+ \frac{2}{d} \right) } = O \left( |k|^{-\frac{4}{d} - 2} \right) \quad \text { as } |k|\rightarrow \infty . \end{aligned}$$
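The expansion (5.12) can be checked in polar coordinates. As a sketch, write \(u = |u| e^{i\omega }\) (we use \(\omega \) for the angle to avoid a clash with the exponent \(\theta \) of \(\lambda _n\)) and set \(h(\omega ) := |\cos \omega |^{\frac{4}{d}} \cos \omega \). Then

$$\begin{aligned} |\Re u|^{\frac{4}{d}} \Re u = |u|^{\frac{4}{d} + 1} h(\omega ) = |u|^{\frac{4}{d}+1} \sum _{m \in \mathbb {Z}} \hat{h}(m) e^{i m \omega }, \qquad |u|^{\frac{4}{d} + 1} e^{i(2k-1)\omega } = |u|^{\frac{4}{d} + 2 - 2k} u^{2k-1}. \end{aligned}$$

Since h is even, \(\hat{h}(m) = \frac{1}{2\pi } \int _{-\pi }^{\pi } h(\omega ) \cos (m\omega ) \,\textrm{d}\omega \), and since \(h(\omega + \pi ) = - h(\omega )\), the coefficients with even m vanish; the remaining ones are exactly \(g_{2k-1}\). The stated decay rate is consistent with the standard asymptotics \(\frac{\Gamma (k+a)}{\Gamma (k+b)} \sim k^{a-b}\) as \(k \rightarrow \infty \), applied with \(a = -1 - \frac{2}{d}\) and \(b = 1 + \frac{2}{d}\); the case \(k \rightarrow -\infty \) follows from the symmetry \(\hat{h}(-m) = \hat{h}(m)\).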

The expansion (5.12) yields another formula for the error term

$$\begin{aligned} e_{3,n} = \sum _{k\in \mathbb {Z},\, k\ne 1} e_{3,k,n}, \end{aligned}$$

where

$$\begin{aligned} e_{3,k,n} \!=\! \mu g_{2k-1} \lambda _n^{-\frac{d}{2}-2} e^{-i (2k-1)t} \langle \nabla \rangle ^{-1} \left( \left| w_n\left( {\lambda _n^{-2}}{t}, {\lambda _n^{-1}} {x} \right) \right| ^{\frac{4}{d} + 2-2k} w_n\left( {\lambda _n^{-2}}{t},{\lambda _n^{-1}} {x} \right) ^{2k-1} \right) . \end{aligned}$$

Let us introduce \(f_{k,n}\) defined by

$$\begin{aligned} f_{k,n}(t) = -i \int _0^t e^{-i(t-s) \langle \nabla \rangle } e_{3,k,n}(s) \,\textrm{d}s. \end{aligned}$$

Note that what we want to estimate is the \({L_t^\infty H_x^\frac{1}{2} \cap L_{t,x}^\frac{2(d+2)}{d} ([-\lambda _n^2 T, \lambda _n^2 T] \times \mathbb {R}^d)}\) norm of \(f_{n} := \sum _{k\ne 1} f_{k,n}\). A computation shows that

$$\begin{aligned} \left( -i \partial _t + \langle \nabla \rangle \right) f_{k,n} = - e_{3,k,n} \end{aligned}$$

and

$$\begin{aligned} {} & {} (-i \partial _t + \langle \nabla \rangle ) e_{3,k,n} \\ {} & {} = -2(k-1)e_{3,k,n} -i \mu g_{2k-1} \lambda _n^{-\frac{d}{2} - 4} e^{-i(2k-1)t}\\ {} & {} \quad \times \left( \langle \lambda _n^{-1} \nabla \rangle ^{-1} \partial _t \left( |w_n|^{\frac{4}{d} + 2-2k} w_n^{2k-1} \right) \right) \left( \frac{t}{\lambda _n^2}, \frac{x}{\lambda _n}\right) \\ {} & {} \quad + \mu g_{2k-1} \lambda _n^{- \frac{d}{2} - 2} e^{-i(2k-1)t} \left( \!\!\langle \lambda _n^{-1} \nabla \rangle ^{-1}\!\!\left( \!\!\langle \lambda _n^{-1} \nabla \rangle - 1\right) \!\!\left( |w_n|^{\frac{4}{d} + 2 - 2k } w_n^{2k-1} \right) \!\!\right) \!\!\left( \frac{t}{\lambda _n^2}, \frac{x}{\lambda _n}\right) . \end{aligned}$$

Combining these two identities, one obtains

$$\begin{aligned} {} & {} (-i \partial _t + \langle \nabla \rangle ) \left( f_{k,n}- \frac{1}{2(k-1)} e_{3,k,n}\right) \\ {} & {} = \frac{i \mu g_{2k-1}}{2(k-1)} \lambda _n^{-\frac{d}{2} - 4} e^{-i(2k-1)t} \left( \langle \lambda _n^{-1} \nabla \rangle ^{-1} \partial _t \left( |w_n|^{\frac{4}{d} + 2-2k} w_n^{2k-1} \right) \right) \left( \frac{t}{\lambda _n^2}, \frac{x}{\lambda _n}\right) \\ {} & {} \quad - \frac{\mu g_{2k-1}}{2(k-1)} \lambda _n^{- \frac{d}{2} - 2} e^{-i(2k-1)t} \!\left( \langle \lambda _n^{-1} \nabla \rangle ^{-1} \!\left( \langle \lambda _n^{-1} \nabla \rangle \!-\! 1\right) \left( |w_n|^{\frac{4}{d} + 2 - 2k } w_n^{2k-1} \right) \right) \!\left( \frac{t}{\lambda _n^2}, \frac{x}{\lambda _n}\right) . \end{aligned}$$
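To make the role of these identities explicit, here is a sketch of the resulting Duhamel representation. Write \(R_{k,n}\) for the right-hand side of the last display (a name introduced here only for convenience). Since \(f_{k,n}(0) = 0\), solving the equation for \(f_{k,n} - \frac{1}{2(k-1)} e_{3,k,n}\) gives, for \(k \ne 1\),

$$\begin{aligned} f_{k,n}(t) = \frac{1}{2(k-1)} e_{3,k,n}(t) - \frac{1}{2(k-1)} e^{-it \langle \nabla \rangle } e_{3,k,n}(0) + i \int _0^t e^{-i(t-s) \langle \nabla \rangle } R_{k,n}(s) \,\textrm{d}s. \end{aligned}$$

In other words, the oscillation \(e^{-2i(k-1)t}\) relative to the free flow trades the time integral of \(e_{3,k,n}\) over an interval of length \(\sim \lambda _n^2 T\) for two boundary terms and a forcing term \(R_{k,n}\) that carries additional negative powers of \(\lambda _n\).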

By the Strichartz estimate, one has the desired estimate

$$\begin{aligned} \Vert f_{n} \Vert _{L_t^\infty H^\frac{1}{2}_x \cap L_{t,x}^\frac{2(d+2)}{d} \left( \left[ - \lambda _n^2 T, \lambda _n^2 T \right] \times \mathbb {R}^d \right) } \lesssim _T \lambda _n^{-1+6\theta }, \end{aligned}$$

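which implies (5.11) because \(\lambda _n^{-1+6\theta } \le \lambda _n^{-1+8\theta }\) whenever \(\lambda _n \ge 1\). This bound follows from the following four estimates: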

$$\begin{aligned} \Vert e_{3,k,n} \Vert _{L^\infty _t H^\frac{1}{2}_x \left( \left[ -\lambda _n^2 T, \lambda _n^2 T \right] \times \mathbb {R}^d \right) }\lesssim & {} |g_{2k-1}| \lambda _n^{-2} \left\| w_n \right\| _{L_t^\infty L_x^{2 \left( 1+\frac{4}{d} \right) }([-T,T]\times \mathbb {R}^d)}^{1+\frac{4}{d}} \\ \lesssim & {} \langle k \rangle ^{- \frac{4}{d} - 2} \lambda _n^{-2} \Vert w_n\Vert _{L_t^\infty H_x^{\frac{2d}{d+4}}([-T,T]\times \mathbb {R}^d)}^{1 + \frac{4}{d}} \\ \lesssim & {} \langle k \rangle ^{- \frac{4}{d} - 2} \lambda _n^{-2+4\theta }, \end{aligned}$$
$$\begin{aligned} \Vert e_{3,k,n} \Vert _{L_{t,x}^\frac{2(d+2)}{d} ( [-\lambda _n^2 T, \lambda _n^2 T ] \times \mathbb {R}^d )}\lesssim & {} |g_{2k-1} | \lambda _n^{-2} \left\| w_n\right\| _{L_{t,x}^{\frac{2(d+2)(d+4)}{d^2}} ([-T,T]\times \mathbb {R}^d)}^{1+\frac{4}{d}} \\ \lesssim & {} \langle k \rangle ^{-\frac{4}{d} -2 } \lambda _n^{-2} T^{\frac{d}{2(d+2)}} \Vert w_n\Vert _{L^\infty _t H_x^{\frac{d(3d+4)}{(d+2)(d+4)}}([-T,T]\times \mathbb {R}^d)}^{1+\frac{4}{d} }\\ &\lesssim _T&\langle k \rangle ^{- \frac{4}{d} - 2} \lambda _n^{-2+6\theta }, \end{aligned}$$
$$\begin{aligned} {} & {} \left\| \frac{i \mu g_{2k-1} }{2(k-1)} \lambda _n^{-\frac{d}{2} - 4} e^{-i (2k-1)t} \langle \nabla \rangle ^{-1}\!\!\left( \partial _t \left( |w_n|^{\frac{4}{d} + 2 - 2k} w_n^{2k-1} \right) \!\right) \!\!\left( \frac{\cdot }{\lambda _n^2}, \frac{\cdot }{\lambda _n} \right) \right\| _{L^1_t L^2_x( [-\lambda _n^2 T, \lambda _n^2 T ] \times \mathbb {R}^d )}\\ {} & {} \quad \lesssim |g_{2k-1} | \lambda _n^{-2} \left\| |w_n|^\frac{4}{d} |\partial _t w_n| \right\| _{L^1_t L^2_x([-T,T]\times \mathbb {R}^d)} \\ {} & {} \quad \lesssim \langle k \rangle ^{-\frac{4}{d} - 2} \lambda _n^{-2} \Vert w_n\Vert _{L^\infty _t H^{\frac{2d}{d+4}}_x([-T,T]\times \mathbb {R}^d)}^{ \frac{4}{d} } \Vert \partial _t w_n\Vert _{L^\infty _t H^{\frac{2d}{d+4}}_x([-T,T]\times \mathbb {R}^d)}\\ {} & {} \quad \lesssim \langle k \rangle ^{-\frac{4}{d} - 2} \lambda _n^{-2+6\theta }, \end{aligned}$$

and

$$\begin{aligned} {} & {} \left\| \frac{\mu g_{2k-1}}{2(k-1)} \lambda _n^{-\frac{d}{2} - 2} \!e^{-i (2k-1)t}\! \!\left( \!\!\left( \!\langle \lambda _n^{-1} \nabla \rangle \!-\! 1\right) \!\!\left( \!|w_n|^{\frac{4}{d} + 2-2k} \!w_n^{2k-1}\right) \!\!\right) \!\! \left( \frac{\cdot }{\lambda _n^2}, \frac{\cdot }{\lambda _n}\right) \!\right\| _{L^1_t L^2_x( [-\lambda _n^2 T, \lambda _n^2 T ] \times \mathbb {R}^d )} \\ {} & {} \quad \lesssim \frac{ |g_{2k-1} |}{ |k-1|} \left\| \lambda _n^{-1} \nabla \left( |w_n|^{\frac{4}{d} + 2-2k} w_n^{2k - 1}\right) \right\| _{L^1_t L^2_x([-T,T]\times \mathbb {R}^d)}\\ {} & {} \quad \lesssim \langle k \rangle ^{-\frac{4}{d} - 2} \lambda _n^{-1} \Vert w_n\Vert _{L^\infty _t H^{\frac{2d}{d+4}}_x([-T,T]\times \mathbb {R}^d)}^{ \frac{4}{d}} \Vert \nabla w_n\Vert _{L^\infty _t H^{\frac{2d}{d+4}}_x([-T,T]\times \mathbb {R}^d)} \lesssim \langle k \rangle ^{-\frac{4}{d} - 2} \lambda _n^{-1+6\theta }. \end{aligned}$$
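The Sobolev exponents in these four estimates come from the embedding \(H_x^{s}(\mathbb {R}^d) \hookrightarrow L_x^{p}(\mathbb {R}^d)\) with \(s = \frac{d}{2} - \frac{d}{p}\), \(2 \le p < \infty \). A quick check of the two cases used:

$$\begin{aligned} p = 2\left( 1 + \frac{4}{d} \right) : \ s = \frac{d}{2} - \frac{d^2}{2(d+4)} = \frac{2d}{d+4}, \qquad p = \frac{2(d+2)(d+4)}{d^2}: \ s = \frac{d}{2} - \frac{d^3}{2(d+2)(d+4)} = \frac{d(3d+4)}{(d+2)(d+4)}. \end{aligned}$$

With \(\Vert w_n \Vert _{L_t^\infty H_x^s} \lesssim \lambda _n^{2\theta s}\) (that is, (5.4) with \(\theta \) replaced by \(2\theta \), valid for \(\lambda _n \ge 1\)), raising to the power \(1 + \frac{4}{d} = \frac{d+4}{d}\) gives \(\lambda _n^{4\theta }\) in the first case and \(\lambda _n^{2\theta \frac{3d+4}{d+2}} \le \lambda _n^{6\theta }\) in the second.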

In the above, we have used the elementary estimate \( \left| \frac{d}{dz} \left( |z|^{\frac{4}{d} + 2 - 2k} z^{2k-1} \right) \right| \lesssim \langle k\rangle |z|^{\frac{4}{d}}\) to obtain the third and fourth estimates. Since \(\frac{4}{d} + 2 > 1\), the decay \(\langle k \rangle ^{-\frac{4}{d} - 2}\) is summable in k. Therefore, (5.11) follows. Collecting the above estimates, we obtain

Proposition 5.6

For any \(\epsilon > 0\), there exist sufficiently large positive constants T and N such that, for any \(n \ge N\), \(\tilde{v}_n\) satisfies

$$\begin{aligned} \left( -i \partial _t + \langle \nabla \rangle \right) {\tilde{v}}_n = - \mu \langle \nabla \rangle ^{-1} f\left( \Re {\tilde{v}}_n \right) + \tilde{e}_{1,n} + \tilde{e}_{2,n} + \tilde{e}_{3,n}, \end{aligned}$$

with the error terms \(\tilde{e}_{1,n}\), \( \tilde{e}_{2,n}\), \( \tilde{e}_{3,n}\) small in the sense that

$$\begin{aligned} {} & {} \left\| \tilde{e}_{1,n} \right\| _{L_t^1 H_x^\frac{1}{2}(\mathbb {R} \times \mathbb {R}^d) } + \left\| \langle \nabla \rangle \tilde{e}_{2,n} \right\| _{L_{t,x}^\frac{2(d+2)}{d+4}(\mathbb {R} \times \mathbb {R}^d)}\\ {} & {} + \left\| \int _0^t e^{-i(t-s) \langle \nabla \rangle } \tilde{e}_{3,n}(s) \,\textrm{d}s \right\| _{L_t^\infty H_x^\frac{1}{2} \cap L_{t,x}^\frac{2(d+2)}{d}(\mathbb {R} \times \mathbb {R}^d)} \le \epsilon . \end{aligned}$$

Proof

On the interval \([- \lambda _n^2 T, \lambda _n^2 T]\), we can take

$$\begin{aligned} \tilde{e}_{1,n} = e_{1,n}, \quad \tilde{e}_{2,n} = e_{2,1,n}+ e_{2,2,n} + e_{2,3,n}, \quad \tilde{e}_{3,n} = e_{3,n}. \end{aligned}$$

By (5.7), (5.8), (5.9) and (5.10), we have

$$\begin{aligned} \Vert \tilde{e}_{1,n}\Vert _{L_t^1 H_x^\frac{1}{2}([- \lambda _n^2 T, \lambda _n^2 T] \times \mathbb {R}^d)} + \Vert \langle \nabla \rangle \tilde{e}_{2,n}\Vert _{L_{t,x}^\frac{2(d+2)}{d+4} ([- \lambda _n^2 T, \lambda _n^2 T] \times \mathbb {R}^d)} \lesssim _T \lambda _n^{- 2 + 8 \theta } + \lambda _n^{-1+ \theta }+ \lambda _n^{- \theta }. \end{aligned}$$

Together with (5.11), for any \(T > 0\), we can take N large enough such that, for each \(n \ge N\),

$$\begin{aligned} {} & {} \Vert \tilde{e}_{1,n}\Vert _{L_t^1 H_x^\frac{1}{2}([- \lambda _n^2 T, \lambda _n^2 T] \times \mathbb {R}^d)} + \Vert \langle \nabla \rangle \tilde{e}_{2,n} \Vert _{L_{t,x}^\frac{2(d+2)}{d+4} ([ - \lambda _n^2 T, \lambda _n^2 T ]\times \mathbb {R}^d)} \\ {} & {} \quad + \left\| \int _0^t e^{-i(t-s) \langle \nabla \rangle } \tilde{e}_{3,n}(s) \,\textrm{d}s \right\| _{L_t^\infty H_x^\frac{1}{2} \cap L_{t,x}^\frac{2(d+2)}{d} ([- \lambda _n^2 T, \lambda _n^2 T] \times \mathbb {R}^d)} \le \frac{\epsilon }{2}. \end{aligned}$$

We now turn to the time intervals \((-\infty , - \lambda _n^2 T) \cup (\lambda _n^2 T, \infty )\). In this case, we choose \(\tilde{e}_{1,n} = \tilde{e}_{2,n} = 0\) and \(\tilde{e}_{3,n} = \mu \langle \nabla \rangle ^{-1} f(\Re {\tilde{v}}_n)\). By Proposition 5.4, (5.11) and the Strichartz estimate, for T and n sufficiently large, one has

$$\begin{aligned} \left\| \int _0^t e^{-i(t-s) \langle \nabla \rangle } \tilde{e}_{3,n}(s) \,\textrm{d}s \right\| _{L_t^\infty H_x^\frac{1}{2} \cap L_{t,x}^\frac{2(d+2)}{d} (|t| \ge \lambda _n^2 T)} \lesssim \Vert \tilde{v}_n\Vert _{L_{t,x}^\frac{2(d+2)}{d} (|t| \ge T\lambda _n^2)}^{\frac{4}{d} + 1} \le \frac{\epsilon }{2}. \end{aligned}$$

This completes the proof of the proposition.\(\square \)

By Lemma 5.3 and Propositions 5.6 and 2.9, we can obtain a solution \(v_n\) to (2.1) with \(v_n(0) = \phi _n\), for n large enough. Moreover,

$$\begin{aligned} \Vert v_n(t) - \tilde{v}_n(t-t_n)\Vert _{L_t^\infty H_x^\frac{1}{2} \cap L_{t,x}^\frac{2(d+2)}{d}} \rightarrow 0\quad \text { as } n\rightarrow \infty . \end{aligned}$$
(5.13)

We now turn to the proof of (5.3). By density, we can take \(\psi _\epsilon \in C_c^\infty (\mathbb {R} \times \mathbb {R}^d)\) such that

$$\begin{aligned} \left\| e^{-it} D_{\lambda _n} \left( \psi _\epsilon \left( \lambda _n^{-2} t \right) - w_\infty (\lambda _n^{-2} t) \right) \right\| _{L_{t,x}^\frac{2(d+2)}{d}} = \Vert \psi _\epsilon - w_\infty \Vert _{L_{t,x}^\frac{2(d+2)}{d}} < \frac{\epsilon }{2}. \end{aligned}$$
(5.14)

By the definition of \(\tilde{v}_n\), the triangle inequality, Proposition 5.4, (5.6), and the dominated convergence theorem, we have, for T sufficiently large and n large enough,

$$\begin{aligned} {} & {} \left\| \tilde{v}_n(t) - e^{-it} D_{\lambda _n} w_\infty \left( \lambda _n^{-2} t \right) \right\| _{L_{t,x}^\frac{2(d+2)}{d}}\\ {} & {} \lesssim \left\| \tilde{v}_n \right\| _{L_{t,x}^\frac{2(d+2)}{d} \left( \left\{ |t|> T \lambda _n^2 \right\} \times \mathbb {R}^d \right) } + \Vert w_n - w_\infty \Vert _{L_{t,x}^\frac{2(d+2)}{d}} + \Vert w_\infty \Vert _{L_{t,x}^\frac{2(d+2)}{d}(\{|t|>T\} \times \mathbb {R}^d)} < \frac{\epsilon }{2}. \end{aligned}$$

Combining this with (5.13) and (5.14), we get (3.13) when \(\nu _n = 0\).

Case II. \(\nu _n \rightarrow \nu \in \mathbb {R}^d\) as \(n\rightarrow \infty \).

By the proof in Case I, there is a global solution \(v_n^0\) to (2.1) with

$$\begin{aligned} v_n^0(0) = T_{\tilde{x}_n} e^{i\tilde{t}_n \langle \nabla \rangle } D_{\lambda _n} P_{\le \lambda _n^\theta } \phi , \end{aligned}$$

for n large enough. Moreover, \(S_{\mathbb {R}}(v_n^0) \lesssim _{ \Vert \phi \Vert _{L_x^2} } 1\) and for any \(\epsilon > 0\), there exist \(\psi _\epsilon ^0 \in C_c^\infty (\mathbb {R} \times \mathbb {R}^d)\) and \(N_\epsilon ^0\) so that

$$\begin{aligned} \left\| \Re \left( v_n^0\left( t + \tilde{t}_n, x + \tilde{x}_n\right) - {\lambda _n^{- \frac{d}{2}} }{e^{-it}} \psi _\epsilon ^0\left( \lambda _n^{-2}{t}, \lambda _n^{-1}{x} \right) \right) \right\| _{L_{t,x}^\frac{2(d+2)}{d}} < \epsilon , \end{aligned}$$
(5.15)

when \(n \ge N_\epsilon ^0\). Before continuing, we recall the following results, which extend those in [16] to higher dimensions. Arguing as in [16], by the finite speed of propagation, we have

Lemma 5.7

For any \((u_0,u_1) \in H_x^1 \times L_x^2\), there exist a sufficiently small constant \(\epsilon > 0\) and a local solution u to (1.1), defined in \(\varOmega = \{(t,x)\in \mathbb {R} \times \mathbb {R}^d: |t| - \epsilon |x| < \epsilon \}\), with \((u(0), \partial _t u(0)) = (u_0,u_1)\). In addition, the solution u satisfies

$$\begin{aligned} \sup _{|t| < \epsilon R} \int _{|x| > R} \left( |\partial _t u(t,x)|^2 + |\nabla u(t,x)|^2 + |u(t,x)|^2\right) \,\textrm{d}x \rightarrow 0\quad \text { as } {R\rightarrow \infty }. \end{aligned}$$
(5.16)

Lemma 5.8

Given \((u(0), \partial _t u(0)) \in H^1\times L^2\) and \(\frac{|\nu |}{\langle \nu \rangle } < \epsilon \) for some \(\epsilon > 0\), the function \(u \circ L_\nu \) is a solution to (1.1) on \((- \epsilon , \epsilon ) \times \mathbb {R}^d\), and \((u \circ L_\nu (0,x), (u \circ L_\nu )_t(0,x))\in H^1 \times L^2\) depends continuously on \(\nu \).

We now return to the proof in the case \(\nu _n \rightarrow \nu \in \mathbb {R}^d\) as \(n\rightarrow \infty \). We have the following extension of Proposition 6.11 in [16] to higher dimensions. Although the proof is a slight modification of that of Proposition 6.11 in [16], we present it for completeness.

Proposition 5.9

(Matching initial data) For n large enough, the global solution

$$\begin{aligned} v_n^1:= \left( 1 + i \langle \nabla \rangle ^{-1} \partial _t \right) \Re \left( v_n^0 \circ L_{\nu _n}\right) \end{aligned}$$

of (2.1) satisfies \(\sup _n S_{\mathbb {R}}(v_n^1) \lesssim _{ \Vert \phi \Vert _{L_x^2}} 1\) and

$$\begin{aligned} \left\| v_n^1(0) - \phi _n \right\| _{H_x^1} \rightarrow 0\quad \text { as } n\rightarrow \infty . \end{aligned}$$
(5.17)

Proof

We have the decomposition

$$\begin{aligned} \Re v_n^0= u_n^{0, l } + \tilde{u}_n^0, \end{aligned}$$

where \(u_n^{0, l }\) is the solution of the free Klein–Gordon equation with

$$\begin{aligned} \left( \left( 1 +i \langle \nabla \rangle ^{-1} \partial _t \right) u_n^{0,l }\right) (0) = v_n^0(0) = L_{\nu _n}^{-1} \phi _n. \end{aligned}$$

By (3.3), we have

$$\begin{aligned} \left( \left( 1 + i \langle \nabla \rangle ^{-1} \partial _t \right) \left( u_n^{0, l} \circ L_{\nu _n} \right) \right) (0) = \phi _n, \end{aligned}$$

and hence \(\Vert v_n^1(0) - \phi _n\Vert _{H_x^1} = \Vert \tilde{u}_n^0 \circ L_{\nu _n}(0, \cdot )\Vert _{H_x^1}\).

By direct calculation, we see \(\tilde{u}_n^0\) obeys

$$\begin{aligned} \left\{ \begin{array}{l} \partial _t^2 \tilde{u}_n^0 - \Delta \tilde{u}_n^0 + \tilde{u}_n^0 = - \mu |\Re v_n^0|^\frac{4}{d} \Re v_n^0,\\ \tilde{u}_n^0(0,x) = \partial _t \tilde{u}_n^0(0,x) = 0. \end{array}\right. \end{aligned}$$

On the space-time domain \(\varOmega = \{(t,x)\in \mathbb {R} \times \mathbb {R}^d: |t| - \epsilon |x| < \epsilon \}\), by Lemma 5.7 and the Strichartz estimate, we have

$$\begin{aligned} \left\| \tilde{u}_n^0 \right\| _{L_t^q L_x^r(\varOmega )} + \left\| \nabla _{t,x} \tilde{u}_n^0 \right\| _{L_t^\infty L_x^2(\varOmega )} < \infty \quad \text {for any sharp Schrödinger admissible pair } (q,r). \end{aligned}$$

Since \(\Re v_n^0 \) satisfies (5.16), and the analogous estimate for \(u_n^{0,l }\) follows from finite speed of propagation and energy conservation, this yields

$$\begin{aligned} \sup _{|t| \le \epsilon R} \int _{|x| > R}\left| \partial _t \tilde{u}_n^0(t,x) \right| ^2 + \left| \nabla \tilde{u}_n^0(t,x) \right| ^2 + \left| \tilde{u}_n^0(t,x) \right| ^2\,\textrm{d}x\rightarrow 0\quad \text { as } R\rightarrow \infty . \end{aligned}$$
(5.18)

Let \(\mathcal {T}\) be the stress-energy tensor of \(\tilde{u}_n^0\), whose components are

$$\begin{aligned} \mathcal {T}^{00}= & {} \frac{1}{2} \left| \partial _t \tilde{u}_n^0 \right| ^2 + \frac{1}{2} \left| \nabla \tilde{u}_n^0 \right| ^2 + \frac{1}{2} \left| \tilde{u}_n^0 \right| ^2, \quad \mathcal {T}^{0j} = \mathcal {T}^{j0} = - \partial _t \tilde{u}_n^0 \partial _j \tilde{u}_n^0,\\ \text { and }\quad \mathcal {T}^{jk}= & {} \partial _j \tilde{u}_n^0 \partial _k \tilde{u}_n^0 - \delta _{jk} \left( \mathcal {T}^{00} - \left| \partial _t \tilde{u}_n^0 \right| ^2 \right) , \end{aligned}$$

where \(j, k \in \{1, \cdots , d\}\). Define the vector \(\textbf{p}_n\) with components

$$\begin{aligned} {p}_n^\alpha = \langle \nu _n \rangle \mathcal {T}^{0\alpha } + \nu _{n,1} \mathcal {T}^{1 \alpha } + \nu _{n,2} \mathcal {T}^{2 \alpha } + \cdots + \nu _{n,d} \mathcal {T}^{d \alpha } , \quad \alpha \in \{0,1,2, \dots , d\}. \end{aligned}$$

By direct computation, we have

$$\begin{aligned} \nabla _{t,x} \cdot \textbf{p}_n = - \mu \left| \Re v_n^0 \right| ^\frac{4}{d} \Re v_n^0 \left( \langle \nu _n \rangle \partial _t \tilde{u}_n^0 - \nu _n \cdot \nabla _x \tilde{u}_n^0 \right) , \end{aligned}$$
(5.19)

and by Gauss’ formula,

$$\begin{aligned} \int _{L_{\nu _n} (t, \mathbb {R}^d)} \textbf{p}_n \cdot \textrm{d}\textbf{S}= & {} \int _{\mathbb {R}^d} \left( \langle \nu _n \rangle p_n^0 + \sum _{j = 1}^d \nu _{n,j } p_n^j\right) \circ L_{\nu _n} (t,x) \,\textrm{d}x\\ = & {} \frac{1}{2} \int _{\mathbb {R}^d} \left| \partial _t \left( \tilde{u}_n^0 \circ L_{\nu _n} \right) \right| ^2 + \left| \nabla \left( \tilde{u}_n^0 \circ L_{\nu _n} \right) \right| ^2 + \left| \tilde{u}_n^0 \circ L_{\nu _n} \right| ^2 \,\textrm{d}x, \end{aligned}$$

where \(\textrm{d} \textbf{S}\) is the surface measure times the unit normal vector.
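The divergence identity (5.19) can be verified from the components of \(\mathcal {T}\). As a sketch, write \(F := - \mu \left| \Re v_n^0 \right| ^\frac{4}{d} \Re v_n^0\) for the forcing in the equation satisfied by \(\tilde{u}_n^0\) (the symbol F is introduced here only for brevity). A direct differentiation gives, for each \(k \in \{1, \dots , d\}\),

$$\begin{aligned} \partial _t \mathcal {T}^{00} + \sum _{j=1}^d \partial _j \mathcal {T}^{0j} = \partial _t \tilde{u}_n^0 \left( \partial _t^2 \tilde{u}_n^0 - \Delta \tilde{u}_n^0 + \tilde{u}_n^0 \right) = F \, \partial _t \tilde{u}_n^0, \qquad \partial _t \mathcal {T}^{k0} + \sum _{j=1}^d \partial _j \mathcal {T}^{kj} = - \partial _k \tilde{u}_n^0 \left( \partial _t^2 \tilde{u}_n^0 - \Delta \tilde{u}_n^0 + \tilde{u}_n^0 \right) = - F \, \partial _k \tilde{u}_n^0. \end{aligned}$$

Multiplying the first identity by \(\langle \nu _n \rangle \) and the second by \(\nu _{n,k}\), and summing over k, gives (5.19).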

We now consider the estimate of the nonlinearity in

$$\begin{aligned} \varOmega _n = \left\{ (t,x): \left( \langle \nu _n \rangle t + \nu _n \cdot x \right) t < 0\right\} . \end{aligned}$$

For any \((t,x) \in \mathbb {R} \times \mathbb {R}^d\), denote

$$\begin{aligned} \psi _R(t,x) = \psi \left( \frac{|t|+ |x|}{R}\right) , \end{aligned}$$

where \(\psi \) is the cut-off function defined in (3.21). Applying the divergence theorem together with (5.18), (5.19), and Lemma 5.7, we have

$$\begin{aligned} \frac{1}{2} \left\| \tilde{u}_n^0\! \circ \! L_{\nu _n}(0, \!\cdot ) \right\| _{H_x^1}^2\le & {} \lim _{R\rightarrow \infty } \frac{1}{2} \int _{\mathbb {R}^d}\! \left( \left| \partial _t (\tilde{u}_n^0 \!\circ \! L_{\nu _n} )\right| ^2 \!+\! |\nabla \left( \tilde{u}_n^0 \!\circ \! L_{\nu _n} \right) |^2 \!+\! \left| \tilde{u}_n^0 \!\circ \! L_{\nu _n} \right| ^2 \right) \psi _R \,\textrm{d}x \nonumber \\ \le & {} \limsup _{R\rightarrow \infty } \int \int _{\varOmega _{t,\nu _n }} \left| \psi _R \nabla _{t,x} \cdot \textbf{p}_n \right| + \left| \textbf{p}_n \cdot \nabla _{s,y} \psi _R\right| \,\textrm{d}y \textrm{d}s\nonumber \\ \le & {} \int \int _{\varOmega _{t,\nu _n }} \left| \nabla _{t,x} \cdot \textbf{p}_n \right| + \limsup _{R\rightarrow \infty } \frac{1}{R} \int _{-\epsilon R}^{\epsilon R} \int _{|x|\sim R} \left| \langle \nabla _{t,x} \rangle \tilde{u}_n^0 \right| ^2 \,\textrm{d}x \textrm{d}t \nonumber \\ = & {} \int \int _{\varOmega _n} \left| \nabla _{t,x} \cdot \textbf{p}_n \right| \,\textrm{d}x \textrm{d}t\nonumber \\ \lesssim & {} \left\| \Re v_n^0 \right\| _{L_{t,x}^\frac{2(d+2)}{d}(\varOmega _n)}^\frac{d+4}{d} \left\| \nabla _{t,x} \tilde{u}_n^0 \right\| _{L_{t,x}^\frac{2(d+2)}{d}(\mathbb {R}\times \mathbb {R}^d)}, \end{aligned}$$
(5.20)

where \(\varOmega _{t, \nu _n} := \{(s,y): (\langle \nu _n \rangle ^{-1} (t- \nu _n \cdot y) - s) s > 0\}\).
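The last inequality in (5.20) is Hölder's inequality applied to (5.19). As a sketch of the exponent count, using that \(\nu _n\) is bounded because it converges to \(\nu \):

$$\begin{aligned} \left| \nabla _{t,x} \cdot \textbf{p}_n \right| \le \left( \langle \nu _n \rangle + |\nu _n| \right) \left| \Re v_n^0 \right| ^{\frac{d+4}{d}} \left| \nabla _{t,x} \tilde{u}_n^0 \right| , \qquad \frac{d+4}{d} \cdot \frac{d}{2(d+2)} + \frac{d}{2(d+2)} = 1. \end{aligned}$$

Thus the factor \(\left| \Re v_n^0 \right| ^{\frac{d+4}{d}}\) is placed in \(L_{t,x}^{\frac{2(d+2)}{d+4}}(\varOmega _n)\) and the factor \(\nabla _{t,x} \tilde{u}_n^0\) in \(L_{t,x}^{\frac{2(d+2)}{d}}\), with an implicit constant depending on \(\sup _n \langle \nu _n \rangle \).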

We now estimate the right-hand side of (5.20). For any \(\psi \in C_c^\infty \), we have

$$\begin{aligned} \int _{\varOmega _n} \left| \lambda _n^{- \frac{d}{2}} \psi \left( \frac{t- \tilde{t}_n}{ \lambda _n^2}, \frac{x - \tilde{x}_n}{ \lambda _n} \right) \right| ^\frac{2(d+2)}{d} \,\textrm{d}x \textrm{d}t \lesssim \lambda _n^{-1}\Vert \psi \Vert _{L_{t,x}^\infty } \rightarrow 0\quad \text { as } n\rightarrow \infty . \end{aligned}$$

This, together with (5.15) and the triangle inequality, yields

$$\begin{aligned} \Vert \Re v_n^0\Vert _{L_{t,x}^\frac{2(d+2)}{d} (\varOmega _n)} \rightarrow 0\quad \text { as } n\rightarrow \infty . \end{aligned}$$
(5.21)

By the triangle inequality, (2.5), \(S_{\mathbb {R}}(v_n^0) \lesssim _{ \Vert \phi \Vert _{L_x^2} } 1\), and Strichartz, we get

$$\begin{aligned} \left\| \nabla _{t,x} \tilde{u}_n^0 \right\| _{L_{t,x}^\frac{2(d+2)}{d}}\le & {} \left\| \nabla _{t,x} \Re v_n^0 \right\| _{L_{t,x}^\frac{2(d+2)}{d}} + \left\| \nabla _{t,x} u_n^{0, l } \right\| _{L_{t,x}^\frac{2(d+2)}{d}}\\ &\lesssim _{\Vert \phi \Vert _{L_x^2}}&\left\| \langle \nabla \rangle ^\frac{3}{2} D_{\lambda _n} P_{\le \lambda _n^\theta } \phi \right\| _{L_x^2} + \left\| v_n^0(0) \right\| _{H_x^\frac{3}{2}} \lesssim _{ \Vert \phi \Vert _{L_x^2} } 1. \end{aligned}$$
(5.22)

By (5.20), (5.22), and (5.21), we can finish the proof of (5.17).\(\square \)

Since \(v_n^0\) is a solution of (2.1), \(\Re (v_n^0\circ L_{\nu _n})\) solves (1.1) by Lemma 5.8. In general, \(v_n^0 \circ L_{\nu _n}\) is not a solution of (2.1); instead,

$$\begin{aligned} v_n^1:= \left( 1 + i \langle \nabla \rangle ^{-1} \partial _t \right) \Re \left( v_n^0 \circ L_{\nu _n}\right) \end{aligned}$$

solves (2.1) with \(S_{\mathbb {R}}(v_n^1) = S_{\mathbb {R}}(v_n^0)\), and it coincides with \(v_n^0 \circ L_{\nu _n}\) only when \(\nu _n = 0\). Thus it is necessary to pass through real solutions here. By Proposition 5.9, the difference between \(v_n^1(0)\) and \(\phi _n\) is small. Hence, by Proposition 2.9, there exists a global solution \(v_n\) to (2.1) with \(v_n(0) = \phi _n\) and \(S_{\mathbb {R}}(v_n) \lesssim _{\Vert \phi \Vert _{L_x^2} } 1\) for n large enough. Moreover,

$$\begin{aligned} \left\| \Re \left( v_n - v_n^1 \right) \right\| _{L_{t,x}^\frac{2(d+2)}{d}} \rightarrow 0\quad \text { as } n\rightarrow \infty . \end{aligned}$$

This together with \(\Re v_n^0 = \Re (v_n^1 \circ L_{\nu _n}^{-1})\) and (5.15) shows (3.13).