Optimal time decay of the non cut-off Boltzmann equation in the whole space

In this paper we study the large-time behavior of perturbative classical solutions to the hard and soft potential Boltzmann equation without the angular cut-off assumption in the whole space $\mathbb{R}^n_x$ with $n \geq 3$. We use the existence theory of global in time nearby Maxwellian solutions from \cite{gsNonCutA,gsNonCut0}. It has been a longstanding open problem to determine the large time decay rates for the soft potential Boltzmann equation in the whole space, with or without the angular cut-off assumption \cite{MR677262,MR2847536}. For perturbative initial data, we prove that solutions converge to the global Maxwellian with the optimal large-time decay rate of $O(t^{-\frac{n}{2}+\frac{n}{2r}})$ in the $L^2_v(L^r_x)$-norm for any $2\leq r\leq \infty$.


Introduction and main results
In recent work the Boltzmann equation has been shown for the first time to have global in time classical perturbative solutions for physically realistic collision kernels in the case of the torus ($\mathbb{T}^n_x$) [11,12] with $n \ge 2$ and in the whole space ($\mathbb{R}^3_x$) case [1]. These results were able to successfully remove the widespread (non-physical) "Grad angular cut-off" assumption in the context of perturbations. The solutions on the torus exhibit exponential time decay $O(e^{-\lambda t})$ to equilibrium for the hard potentials and rapid polynomial decay $O(t^{-k})$ for any $k > 0$ in the case of the soft potentials. This large time behavior is predicted by the celebrated Boltzmann H-theorem. In the whole space, the presence of dispersion restricts the H-theorem to low order polynomial rates. Even with angular cut-off [26], it has remained a longstanding open problem to determine the time decay rates for the soft potential kernels in $\mathbb{R}^n_x$. In this work, we prove optimal large-time decay rates for the full range of hard and soft potential collision kernels without angular cut-off.
We will study solutions to the Boltzmann equation, which is given by (1.1) $\partial_t F + v\cdot\nabla_x F = Q(F,F)$. Here the unknown is $F = F(t,x,v) \ge 0$. For each time $t \ge 0$, $F(t,\cdot,\cdot)$ represents the density of particles in phase space. The spatial coordinates we consider are $x \in \mathbb{R}^n_x$, and the velocities are $v \in \mathbb{R}^n_v$ with $n \ge 2$. The Boltzmann collision operator, $Q$, is a bilinear operator which acts only on the velocity variables, $v$, as $Q(G,F)(v) = \int_{\mathbb{R}^n} dv_* \int_{\mathbb{S}^{n-1}} d\sigma\, B(v-v_*,\sigma)\,[G'_* F' - G_* F]$. Here we are using the standard shorthand $F = F(v)$, $G_* = G(v_*)$, $F' = F(v')$, $G'_* = G(v'_*)$. In this expression, $v$, $v_*$ and $v'$, $v'_*$ are the velocities of a pair of particles before and after collision. They are connected through the formulas $v' = \frac{v+v_*}{2} + \frac{|v-v_*|}{2}\sigma$, $v'_* = \frac{v+v_*}{2} - \frac{|v-v_*|}{2}\sigma$, with $\sigma \in \mathbb{S}^{n-1}$. The Boltzmann collision kernel, $B(v-v_*,\sigma)$, depends upon the relative velocity $|v-v_*|$ and on the deviation angle $\theta$ through $\cos\theta = (v-v_*)\cdot\sigma/|v-v_*|$. Without restriction we suppose that $B(v-v_*,\sigma)$ is supported on $\cos\theta \ge 0$, as in [10,11].
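As a quick consistency check on the collision geometry, the following sketch assumes the standard $\sigma$-representation of the post-collisional velocities (the explicit formulas are stated as an assumption in the comments) and verifies that momentum and energy are conserved in each collision.

```latex
% Assumed sigma-representation, with \sigma \in \mathbb{S}^{n-1}:
\[
v' = \frac{v+v_*}{2} + \frac{|v-v_*|}{2}\,\sigma,
\qquad
v'_* = \frac{v+v_*}{2} - \frac{|v-v_*|}{2}\,\sigma .
\]
% Momentum: the \sigma terms cancel.
\[ v' + v'_* = v + v_* . \]
% Energy: use |a+b|^2 + |a-b|^2 = 2|a|^2 + 2|b|^2 with
% a = (v+v_*)/2 and b = |v-v_*|\sigma/2, recalling |\sigma| = 1.
\[
|v'|^2 + |v'_*|^2
= \frac{|v+v_*|^2}{2} + \frac{|v-v_*|^2}{2}
= |v|^2 + |v_*|^2 .
\]
```

These two identities are what make the null space of the linearized operator $(n+2)$-dimensional: one dimension for mass, $n$ for momentum, and one for energy.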
These will be called "soft potentials" throughout this paper. These collision kernels are physically motivated since they can be derived from a spherical intermolecular repulsive potential such as $\phi(r) = r^{-(p-1)}$ with $p \in (2,+\infty)$, as was shown by Maxwell in 1866. In the physical dimension ($n = 3$), $B$ satisfies the conditions above with $\gamma = (p-5)/(p-1)$ and $s = 1/(p-1)$; see [27]. A large amount of previous work requires the Grad angular cut-off assumption, which usually means either $b(\cos\theta) \in L^\infty(\mathbb{S}^{n-1})$ or $b(\cos\theta) \in L^1(\mathbb{S}^{n-1})$. However, neither of these assumptions is satisfied for angular factors such as (1.2).
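The relation between the inverse power law exponent $p$ and the kernel parameters can be made explicit. The short computation below uses only the formulas for $\gamma$ and $s$ quoted above in dimension $n=3$; the classification by the sign of $\gamma+2s$ is the one this paper uses when it speaks of hard and soft potentials.

```latex
% Inverse power law potential \phi(r) = r^{-(p-1)}, p \in (2,\infty), n = 3:
\[
\gamma = \frac{p-5}{p-1}, \qquad s = \frac{1}{p-1},
\qquad
\gamma + 2s = \frac{p-3}{p-1}.
\]
% Ranges as p runs over (2,\infty):
\[
\gamma \in (-3, 1), \qquad s \in (0,1), \qquad \gamma + 2s \in (-1, 1).
\]
% Sign of \gamma + 2s:
\[
\gamma + 2s \ge 0 \iff p \ge 3,
\qquad
\gamma + 2s < 0 \iff 2 < p < 3 .
\]
% Example: p = 5 gives \gamma = 0 and s = 1/4 (Maxwell molecules).
```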
We will study the linearization of (1.1) around the Maxwellian equilibrium states where without loss of generality the Maxwellian is given by $\mu(v) \overset{\mathrm{def}}{=} (2\pi)^{-n/2} e^{-|v|^2/2}$.
We linearize the Boltzmann equation (1.1) around (1.5). This yields an equation for the perturbation, $f(t,x,v)$, that is given by (1.6) where the linearized Boltzmann operator, $L$, is defined as and the bilinear operator, $\Gamma$, is then The $(n+2)$-dimensional null space of $L$ is well known [10]: Now, for fixed $(t,x)$, we define the orthogonal projection from $L^2_v$ to $N(L)$ as where the functions $a^f$, $b^f \overset{\mathrm{def}}{=} [b_1,\cdots,b_n]$ and $c^f$ will depend on $f(t,x,v)$. Our main interest is in the large-time behavior of global classical solutions of the Cauchy problem for the Boltzmann equation (1.1) which are perturbations of the Maxwellian equilibrium states (1.5) for the long-range collision kernels (1.2), (1.3) and (1.4). This behavior is controlled by the celebrated Boltzmann H-theorem.
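To make the projection concrete, the sketch below assumes that the null space is spanned by $\sqrt{\mu}$, $v_i\sqrt{\mu}$ and $(|v|^2-n)\sqrt{\mu}$, which is a standard choice for the normalized Maxwellian $\mu$; this particular basis is an assumption here, not a quotation. Under that assumption the coefficients of $Pf$ follow from Gaussian moment identities.

```latex
% Assumed form of the projection (standard basis, an assumption here):
\[
Pf = \Big( a^f + \sum_{i=1}^n b^f_i v_i + c^f (|v|^2 - n) \Big)\sqrt{\mu}.
\]
% Gaussian moments of \mu(v) = (2\pi)^{-n/2} e^{-|v|^2/2};
% odd moments such as \int v_i \mu and \int v_i (|v|^2-n) \mu vanish.
\[
\int \mu\,dv = 1, \quad \int v_i v_j \mu\,dv = \delta_{ij}, \quad
\int (|v|^2 - n)\mu\,dv = 0, \quad \int (|v|^2-n)^2 \mu\,dv = 2n .
\]
% The basis is therefore orthogonal in L^2_v, which gives
\[
a^f = \int f \sqrt{\mu}\,dv, \qquad
b^f_i = \int v_i f \sqrt{\mu}\,dv, \qquad
c^f = \frac{1}{2n}\int (|v|^2 - n) f \sqrt{\mu}\,dv .
\]
```

The count $1 + n + 1 = n+2$ matches the dimension of the null space stated in the text.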
We define the Boltzmann H-functional by Then the Boltzmann H-theorem predicts that, for solutions of the Boltzmann equation, the entropy is increasing over time; this corresponds to the formal statement This is a part of the second law of thermodynamics.
Here $D(f,f)$ is the well-known "entropy production functional" which does not operate on $(t,x)$. This shows that the H-theorem is degenerate in $(t,x)$ because characterizations of $D(f,f)$ will be local in those variables, and so the transport terms in (1.1) become important. Of course this is an extremely formal calculation, even more so in the whole space where the H-functional is infinite at any non-zero global Maxwellian. In recent work [11,12], Gressman and the author have introduced into the Boltzmann theory the following sharp weighted geometric fractional Sobolev norm: Generally, $\mathbf{1}_A$ is the standard indicator function of the set $A$. Now this space includes the weighted $L^2_\ell$ space, for $\ell \in \mathbb{R}$, with norm given by The weight is $\langle v\rangle \overset{\mathrm{def}}{=} \sqrt{1+|v|^2}$. The fractional differentiation effects are measured using the anisotropic metric $d(v,v')$ on the "lifted" paraboloid (in $\mathbb{R}^{n+1}$) as This metric encodes the nonlocal anisotropic changes in the power of the weight. The linearized collision operator $L$ is non-negative and it is furthermore locally coercive in the sense that there is a constant $\lambda > 0$ such that [11, Theorem 8.1]: This may be interpreted loosely as a linearized statement of the H-theorem. Note that the norm $N^{s,\gamma}$ provides a sharp characterization of the linearized collision operator [11, (2.13)]; in earlier work [18] the sharp gain of velocity weight in $L^2$ was established for the non-derivative part of (1.10). $N^{s,\gamma}$ also controls sharply the nonlinear collision operator, and its entropy production estimates [13] for $D(f,f)$. However all of these coercive estimates are degenerate in the $(t,x)$ variables.
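The geometry behind the anisotropic metric can be illustrated as follows. This is a sketch assuming the lift $v \mapsto (v, \tfrac12|v|^2)$ onto the paraboloid in $\mathbb{R}^{n+1}$; the normalization $\tfrac12$ is our assumption, and with it the metric is simply the Euclidean distance between lifted points.

```latex
% Assumed lift of velocities onto the paraboloid in \mathbb{R}^{n+1}:
\[
v \mapsto \big(v, \tfrac12 |v|^2\big), \qquad
d(v,v') = \sqrt{\,|v-v'|^2 + \tfrac14\big(|v|^2 - |v'|^2\big)^2\,}.
\]
% Since |v|^2 - |v'|^2 = (v - v')\cdot(v + v'), Cauchy-Schwarz gives
\[
|v - v'| \;\le\; d(v,v') \;\le\; |v-v'|\,\sqrt{1 + \tfrac14 |v+v'|^2}.
\]
% The lower bound is attained when |v| = |v'| (motion along a sphere),
% the upper bound when v - v' is parallel to v + v' (radial motion).
```

In other words, at large velocities radial displacements are penalized much more heavily than angular ones, which is one way to read the phrase "nonlocal anisotropic changes in the power of the weight".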
In a bounded domain such as the torus (T n x ), the H-theorem is prominent and then rapid convergence can be established (as in [11,12]). However in the whole space (R n x ) the dispersive effects dominate the H-theorem, and so the transport terms in (1.6) restrict the convergence to low order polynomial rates. This point of view illustrates the additional difficulty involved in proving decay rates in the whole space. We state these time decay rates in our main theorems of Section 1.2. Prior to that we introduce the notation.
Then we denote $H^m_0 = H^m$. For a Banach space $X$, we let $\|\cdot\|_X$ denote the corresponding norm over $X(\mathbb{R}^n_x \times \mathbb{R}^n_v)$, and $|\cdot|_X$ analogously denotes the norm only over $X(\mathbb{R}^n_v)$. For example we use the notation . Furthermore $X(B)$ denotes the Banach space $X$ over the domain $B \subset \mathbb{R}^n$. In particular, we will use the notation $B_R$ to denote the $n$ dimensional ball of radius $R > 0$ centered at the origin. Sometimes further $L^2_x$ and $L^2_v$ are used to denote $L^2(\mathbb{R}^n_x)$ and $L^2(\mathbb{R}^n_v)$ respectively. There should be no confusion between $L^2_x$, $L^2_v$ and $L^2_\ell$, etc., since $x$ and $v$ are never used to denote a weight. For an integrable function $g : \mathbb{R}^n \to \mathbb{R}$, its Fourier transform is defined by For two complex vectors $a, b \in \mathbb{C}^n$, $(a \mid b) = a \cdot \bar{b}$ denotes the dot product over the complex field, where $\bar{b}$ is the ordinary complex conjugate of $b$.
We use $\langle\cdot,\cdot\rangle$ to denote the inner product over the Hilbert space $L^2_v$, i.e.
Analogously $(\cdot,\cdot)$ denotes the inner product over $L^2(\mathbb{R}^n_x \times \mathbb{R}^n_v)$. For $r \ge 1$, we define the mixed Lebesgue space $Z_r = L^2_v(L^r_x)$ We introduce the norms $\|\cdot\|_{\dot{H}^m}$ and $\|\cdot\|_{H^m}$ with $m \ge 0$ given by Here $\dot{H}^m_x = \dot{H}^m(\mathbb{R}^n_x)$ is the standard homogeneous $L^2_x$ based Sobolev space: We also define the unified weight function as follows We then consider the weighted anisotropic derivative space as in (1.10): The length of $\alpha$ is $|\alpha| = \alpha_1 + \cdots + \alpha_n$ and the length of $\beta$ is $|\beta| = \beta_1 + \cdots + \beta_n$. Fix $\ell \ge 0$. Given a solution, $f(t,x,v)$, to the Boltzmann equation (1.6), we define an instant energy functional to be a continuous function, $\mathcal{E}_{K,\ell}(t)$, which satisfies (1.14) Also define the high-order instant energy functional $\mathcal{E}^h_{K,\ell}(t)$ as We furthermore define the dissipation rate $\mathcal{D}_{K,\ell}(t)$ as For brevity, when $\ell = 0$ we write $\mathcal{E}_K(t) = \mathcal{E}_{K,0}(t)$, $\mathcal{E}^h_K(t) = \mathcal{E}^h_{K,0}(t)$ and $\mathcal{D}_K(t) = \mathcal{D}_{K,0}(t)$. We suppose once and for all that $K$ is an integer satisfying $K \ge 2K^*_n$, where $K^*_n \overset{\mathrm{def}}{=} \lfloor \frac{n}{2} + 1 \rfloor$ is the smallest integer which is strictly greater than $\frac{n}{2}$. Throughout this paper we let $C$ denote some positive (generally large) inessential constant and $\lambda$ denotes some positive (generally small) inessential constant, where both $C$ and $\lambda$ may change values from line to line. Furthermore $A \lesssim B$ means $A \le CB$, and $A \gtrsim B$ means $B \lesssim A$. In addition, $A \approx B$ means $A \lesssim B$ and $B \lesssim A$.
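For orientation, the regularity index $K^*_n = \lfloor \frac n2 + 1\rfloor$ and the resulting minimal number of derivatives $2K^*_n$ can be tabulated in low dimensions. The values below follow directly from the definition; the remark on the Sobolev embedding is the standard reason such an index appears.

```latex
\[
\begin{array}{c|cccc}
n & 2 & 3 & 4 & 5 \\ \hline
n/2 & 1 & 1.5 & 2 & 2.5 \\
K^*_n = \lfloor n/2 + 1 \rfloor & 2 & 2 & 3 & 3 \\
2K^*_n & 4 & 4 & 6 & 6
\end{array}
\]
% K^*_n > n/2 is exactly the condition under which the Sobolev embedding
% H^{K^*_n}(\mathbb{R}^n_x) \subset L^\infty(\mathbb{R}^n_x) holds.
```

So in the physical dimension $n = 3$ the standing assumption $K \ge 2K^*_n$ reads $K \ge 4$.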

1.2. Main results. In this subsection we state our main optimal time decay results for the Boltzmann equation (1.6). We begin with the following existence result, and Lyapunov inequalities, based on the theory from [11].
is sufficiently small, then the Cauchy problem to the Boltzmann equation (1.6) admits a unique global solution $f(t,x,v)$ satisfying the Lyapunov inequality (1.17) $\frac{d}{dt}\mathcal{E}_{K,\ell}(t) + \lambda \mathcal{D}_{K,\ell}(t) \le 0$. Here $\lambda > 0$ may depend on $\ell$. In addition there is $\mathcal{E}^h_{K,\ell}(t)$ such that The nonlinear energy estimates in Theorem 1.1 together with time-decay estimates on the linearized system in Theorem 2.1 lead us to the time-decay rates of the instant energy functionals $\mathcal{E}_{K,\ell}(t)$, $\mathcal{E}^h_K(t)$ and the $Z_r$ norms if we make additional integrability assumptions on the initial data. Precisely, for $\ell \ge 0$, set $\epsilon_{K,\ell}$ to be . Our main optimal time decay result is as follows: $\mathcal{E}_{K,\ell}(t) \lesssim \epsilon_{K,\ell+p(n)} (1+t)^{-\frac{n}{2}}$. Suppose further that $\epsilon_{K,\ell(n)+p(n)}$ is sufficiently small, where the weight factor $\ell(n)$ is defined, for any small $\varepsilon > 0$, as follows (1.21) $\ell(n) \overset{\mathrm{def}}{=} 2(\gamma+2s)$ for the hard potentials (1.3), and $\ell(n) \overset{\mathrm{def}}{=} \frac{n}{2} + K^*_n + 1 + \varepsilon$ for the soft potentials (1.4). Then for any $2 \le r \le \infty$, we have the following estimate, whose time decay rate in the $Z_r$ norm is $(1+t)^{-\frac{n}{2}+\frac{n}{2r}}$ and which holds uniformly over $t \ge 0$. Furthermore, we have faster time decay rates for higher derivatives and special components of the solution as follows.
If both $\epsilon_{K,q(n)}$ and $\epsilon_{K,\ell(n)+p(n)}$ are sufficiently small, where $\ell(n)$ is defined in (1.21), then for any $2 \le r \le \infty$, we have the following time decay estimate: This will hold uniformly for any $t \ge 0$. Again for the soft potentials, (1.4), we use any small $\varepsilon > 0$. But for the hard potentials, (1.3), we can choose $\varepsilon = 0$.
These time decay rates for the $L^2$-norms in (1.20) and (1.22) are optimal in the sense that they are the same as those for the linearized system, as in Theorem 2.1, which is studied using Fourier analysis. These rates also coincide with those in the case of the Boltzmann equation [25] for hard-sphere particles, and further they are the same in $L^r_x$ as those for solutions to the heat equation. Corollary 1.3 shows that higher derivatives and the microscopic part of the solution decay faster.
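The comparison with the heat equation can be checked by hand. The following computation is a standard worked example (not taken from this paper) for $\partial_t u = \Delta u$ on $\mathbb{R}^n$ with integrable data $u_0$, using the explicit heat kernel and Young's inequality.

```latex
% Heat kernel and its L^r norm, 1 \le r < \infty:
\[
G_t(x) = (4\pi t)^{-\frac n2} e^{-\frac{|x|^2}{4t}}, \qquad
\|G_t\|_{L^r_x}^r
= (4\pi t)^{-\frac{nr}{2}} \int_{\mathbb{R}^n} e^{-\frac{r|x|^2}{4t}}\,dx
= (4\pi t)^{-\frac{nr}{2}} \Big(\frac{4\pi t}{r}\Big)^{\frac n2}.
\]
% Hence
\[
\|G_t\|_{L^r_x} = r^{-\frac{n}{2r}} (4\pi t)^{-\frac n2 + \frac{n}{2r}},
\]
% and the same formula holds for r = \infty with the limit value (4\pi t)^{-n/2}.
% Young's inequality for u(t) = G_t * u_0 then gives
\[
\|u(t)\|_{L^r_x} \le \|G_t\|_{L^r_x}\,\|u_0\|_{L^1_x}
\lesssim t^{-\frac n2 + \frac{n}{2r}}\,\|u_0\|_{L^1_x},
\qquad 1 \le r \le \infty .
\]
% Special cases: r = 2 gives t^{-n/4}; r = \infty gives t^{-n/2}.
```

The exponent $-\frac n2 + \frac{n}{2r}$ is the same one that appears in the decay rate announced in the abstract.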
1.3. Historical discussion, new methods, and future directions. Our main theorems and their corollary show that the Cauchy problem for the non cut-off Boltzmann equation (1.6) is hypocoercive for perturbations; this holds in the sense of the description given by Villani [28].
We would like to point out that there have been extensive investigations on the rate of convergence to Maxwellian equilibrium for the nonlinear Boltzmann equation or related kinetic equations. We only have space to mention a few. In the context of perturbations, it was Ukai [25] who in 1974 proved the first decay result. There, spectral analysis was used to obtain the exponential rates for the Boltzmann equation with hard potentials on the torus. Further time decay results on the torus were obtained in [2,4,7,20–22] and the references therein.
In particular we have studies of the decay rates for the Vlasov-Poisson-Boltzmann [7] and Vlasov-Maxwell-Boltzmann [21] system using the existence theory from [15]. We also mention the interaction functional approach from Duan [6] which removes the time derivatives. We further have decay results with an angular cut-off for the moderate soft potentials in [2] and for the full range of soft potentials in [21,22]. Now there are also rapid decay results for the relativistic Boltzmann equation [20]. Of course these methods apply rigorously in the perturbative regime.
With the entropy production method Desvillettes-Villani [4] obtained the first almost exponential rate of convergence for solutions to the Boltzmann equation on the torus with cut-off soft potentials for initial data without a size restriction assuming additional global in time uniform high regularity and moment bounds and good lower bounds for the density at high velocities. Then Strain-Guo [22] provided a very simple proof of the main decay results in [4] for the unconditional perturbative regime, assuming only the high moment bounds without extra regularity.
To study the optimal convergence rates in the whole space has proven to be harder than the case of the torus because of the additional dispersive effects of the transport term in (1.6). The early results are well documented in Glassey [10].
In particular we mention the Kawashima [17] method of thirteen moments, also the optimal linear decay analysis of Duan, Ukai, Yang, Zhao [9] for the hard sphere case. Also [5] studies in particular the Boltzmann equation with confining forces. Further recently we have seen proofs of the optimal time decay for the one-species Vlasov-Poisson-Boltzmann system [7] in R 3 x , and the two-species Vlasov-Maxwell-Boltzmann system [8] using the existence theory from [19]. In the spirit of the Kawashima's work [17], for the linearized time-decay analysis, instead of using the compensation function as in [17], a key idea in [5,7,8] is to design several interactive functionals in order to exploit the dissipation which is present in the degenerate parts of the solution. For further references and discussions of these and other related results we refer to the commentary in [7,8].
We point out that the methods above do not apply to the Boltzmann equation, with or without angular cut-off, in the whole space for a soft potential. In that context we only have the result of Ukai-Asano [26] from 1982. For the cut-off moderately soft potentials, so that roughly $b(\cos\theta) \in L^\infty(\mathbb{S}^{n-1})$ instead of (1.2) and $-1 < \gamma \le 0$ instead of (1.3) and (1.4), they obtain in the whole space Above $\langle\cdot\rangle$ denotes $\langle v\rangle$ in the norm as usual. Here $a \overset{\mathrm{def}}{=} \min\left\{\frac{n}{2}\left(\frac{1}{p} - \frac{1}{2}\right), 1\right\}$ for $p \in [1,2)$. Their initial data uses that the following quantity is sufficiently small This convergence rate of $a = \frac{n}{4}$ when $p = 1$ is optimal, in comparison to ours, in dimensions $n = 2, 3, 4$ but the rate of $a = 1$ is not optimal for $n \ge 5$. Their methods involve the spectral analysis of the semi-group, which as far as we are aware has not been extended below the range $\gamma > -1$.
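The dimension count in the comparison above is elementary arithmetic on the exponent $a$ at $p = 1$, set against the rate $\frac n4$ that this paper obtains in $L^2$ (the case $r = 2$ of the rate in the abstract).

```latex
\[
a\big|_{p=1} = \min\Big\{\frac n2\Big(1 - \frac12\Big),\, 1\Big\}
= \min\Big\{\frac n4,\, 1\Big\}.
\]
\[
\begin{array}{c|ccccc}
n & 2 & 3 & 4 & 5 & 6 \\ \hline
n/4 & 1/2 & 3/4 & 1 & 5/4 & 3/2 \\
a & 1/2 & 3/4 & 1 & 1 & 1
\end{array}
\]
% a = n/4 exactly when n \le 4; for n \ge 5 the cap a = 1 falls short of n/4.
```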
At the same time, we would like to mention that after the main results in this article were complete, a related paper by Alexandre, Morimoto, Ukai, Xu, and Yang appeared in [1]. For the non cut-off soft potentials γ + 2s < 0 in the range max{−3, −2s − 3/2} < γ ≤ −2s, they give the following convergence rate in [1]: This non-optimal rate is proven on the basis of the pure energy method and a time differential inequality. Note that in the existence theorem they use K ≥ 6 derivatives and ℓ ≥ K + 1 weights in n = 3 dimensions. They also prove some optimal convergence rates for the hard potential case when γ + 2s > 0. We point out that our optimal decay results will apply to their solutions [1]. As far as we know, other than ours, these are the only two results obtaining time decay rates in the whole space for the soft potential Boltzmann equation with or without the angular cut-off assumption.
To obtain the results from this paper in Theorem 1.1, Theorem 1.2, and Corollary 1.3, we build upon previous work developed in collaborations of the author with Renjun Duan [7,8], Phillip T. Gressman [11–13], and Yan Guo [21,22]. The ideas in those works were to estimate the dispersion in the whole space using the interactive functionals and the Fourier analysis [7,8], to prove sharp estimates for the Boltzmann collision operator without angular cut-off [11–13], and also to prove rapid decay on the torus for the full range of cut-off soft potentials using interpolation [21] and splitting methods [22].
The present work incorporates a fusion of each of these different ideas, but it also requires several new methodologies. In particular previously the Fourier analysis techniques for studying the dispersion were designed around the hard-potential cases where the dissipation is as strong as the instant energy. For the soft potentials this is just simply false.
To fix this difficulty, we need to prove new weighted instant time-frequency Lyapunov inequalities in $(t,k)$ on the Fourier transform side after integrating in $v$. We then must show that a functional satisfying (2.17), i.e. $E_\ell(t,k) \approx |w^\ell \hat{f}(t,k)|^2_{L^2}$, is preserved by the linear flow for any $\ell \in \mathbb{R}$. Once we have this preservation, we can use interpolation with a family of higher weight functions to obtain the linear decay. Proving the preservation is not so obvious, however. In fact we find an appropriate functional $E_\ell(t,k)$ which must be defined in two different ways for $|k| \le 1$ and alternatively for $|k| > 1$. Fortunately different inequalities are available in each case that allow us to exploit the changing behavior of the solution in these two separate regimes.
Unfortunately, this interpolation method fails by necessity at the nonlinear level in the whole space because the dissipation (1.16) does not (and can not) contain the macroscopic components (1.9). To prove the nonlinear decay we use the time-velocity splitting. Note that this splitting was designed to deduce exponential decay $O(e^{-\lambda t^p})$ for $p \in (0,1)$ in the case of a cut-off soft potential [22] with an exponential velocity weight on the initial data. This is a completely different purpose from the one for which we use the splitting herein. For this paper, the novelty is that the error terms can be controlled by extra polynomial velocity weights for the soft potentials (1.4), and the linear decay allows us to deduce the convergence rates in Theorem 1.2 and Corollary 1.3 using the Lyapunov inequalities from Theorem 1.1.
We also gain back a large "weight loss" nonlinearly since the macroscopic part $Pf$ decays exponentially in the velocity variables and the linear decay works for any $\ell \in \mathbb{R}$. This gain is further aided by the fact that we prove new estimates for the $L^2_v$ norm of $\Gamma(f,f)$ from (1.7) which allow negative weights on the initial data. To obtain the optimal time decay rates in $L^r_x$ with $2 \le r \le \infty$ as in (1.22) we use an optimized Sobolev inequality (3.43). Here we recall [8].
Furthermore, we expect that the methods developed in this paper can be useful in several other physical contexts. We believe that our new approaches developed in this paper are generally applicable for proving time decay rates for soft-potential kinetic equations with perturbative initial data. In particular, including numerous additional efforts, with Zhu we can prove the optimal time decay for the relativistic Boltzmann equation [23] in the whole space. Moreover, in particular, it may be interesting to check if these optimal decay results could also be carried out for the solutions to the Landau equation obtained by Guo [14] in the whole space. We additionally believe that these methods can apply to several other various kinetic equations where the soft potentials are present.
Lastly we point out that these results are constructive in the sense that it is possible to track all of the constants, although we make no effort to do so. Remark 1.1. Note that for simplicity we have used $K \ge 2K^*_n$ derivatives in the above existence results. However for the hard potentials (1.3), or more generally under (1.4) combined with $\gamma + 2s > -\frac{n}{2}$, these decay results will apply under less stringent regularity assumptions, see [11].
It is worth mentioning that all of our main results above hold in $n \ge 3$ dimensions. Furthermore, Theorem 1.1 holds under $n \ge 2$. When $n = 2$, it can be seen from the proofs in Section 3 that logarithms in the time decay rate show up as in the estimate [20, Proposition 4.5]. Thus easy modifications of our proofs would allow the same decay results as above when $n = 2$; however, in each case above we would lose a small epsilon in the decay rate as a result of the temporal log. This could be upgraded to optimal decay by proving that, for given data $f_0$ which satisfies $Pf_0 = 0$, the linear decay in Theorem 2.1 is actually faster by $\frac{1}{2}$. Such a property is well known for the Boltzmann equation [25,26] with an angular cut-off and hard or moderately soft potentials. We are willing to conjecture that this property holds true even without angular cut-off, but proofs of such properties usually use the spectral analysis of the linearized collision operator. We avoid a complicated spectral study by using the Fourier transform, and with such methods we are unaware of any proof of this property. Note further that all of the linear decay estimates in Section 2 hold true for any $n \ge 2$.
1.4. Organization of the paper. The rest of this paper is organized as follows. In Section 2, we will study the time decay of solutions to the linear Boltzmann equation (2.1). This is decomposed into several steps which are outlined at the beginning of Section 2 below. Then in Section 3 we start out by proving the energy Lyapunov inequalities from Theorem 1.1, and we finish by proving the nonlinear time decay rates from Theorem 1.2 and Corollary 1.3.

Linear time decay
In this section we study the time-decay properties of solutions to the Cauchy problem for the linearized non cut-off Boltzmann equation (1.6). We state our main results in the first subsection. Then in Section 2.2 we derive several velocity weighted pointwise time-frequency Lyapunov inequalities. A key point here is that we can include weights and prove pointwise instantaneous bounds for solutions to the linearized equation. Finally in Section 2.3 the temporal decay rates of the solution and its derivatives in $L^2_x$ are proven as in Theorem 2.1 and Corollary 2.2.
2.1. Time decay properties of solutions to the linearized equation. We consider the linearized Boltzmann equation with a microscopic source $g = g(t,x,v)$: For the nonlinear system (1.6), the non-homogeneous source term is given by In this case $g = \{I - P\}g$. Solutions of (2.1) formally take the following form Here $A(t)$ is the linear solution operator for the Cauchy problem corresponding to (2.1) with $g = 0$. The main result of this section is stated as follows.
The solution of the linearized homogeneous system satisfies for the hard potentials (1.3) and any $t \ge 0$. Here $\sigma_{r,m}$ is given by For the soft potentials (1.4) with $j > 2\sigma_{r,m}$ we further obtain for any $t \ge 0$ that We remark that the $\dot{H}^m$ norm above is a convenient tool, which could be replaced by any $\partial^\alpha_x$ with $|\alpha| = m$. On the basis of the previous theorem, we have the following corollary which allows faster linear time decay away from the null space (1.9).
The rest of this section is devoted to the proof of Theorem 2.1 and Corollary 2.2. First, in the next Section 2.2 we prove several velocity weighted Lyapunov inequalities for the linear evolution (2.1) which are pointwise in (t, k). Then, in Section 2.3, we will use these inequalities to prove the time decay.

2.2. Weighted time-frequency Lyapunov inequalities. In this subsection, we shall construct the desired weighted time-frequency Lyapunov functional as in Theorem 2.3. In the proof we have to take great care to estimate the microscopic and macroscopic parts for $|k| \le 1$ and $|k| > 1$ respectively, each in different ways, in order to capture the delicate individual behavior of each separate piece.

2.2.1. Estimate on the microscopic dissipation. The first step in our construction of the time-frequency Lyapunov functional is to estimate the microscopic dissipation on the basis of the coercivity property (1.11) of $L$.
Consider (2.1); taking the Fourier transform in $x$ yields Then we multiply equation (2.7) with $\hat{f}(t,k,v)$ and integrate over $v$ to achieve From the coercivity estimate (1.11) and (1.9) one has that $\mathrm{Re}\,\langle \hat{g}, \hat{f}\rangle$. This is the first main estimate which we will use in the following. Notice that in (2.8) we use the inclusion $L^2_{\gamma+2s} \supset N^{s,\gamma}$ from (1.10). This lower bound will also be implicitly used several times below since the $L^2_{\gamma+2s}(\mathbb{R}^n_v)$ norm already captures the control that we will need in order to prove the linear decay.
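The mechanism of this first estimate can be sketched as follows. We assume, as a hypothesis on the lost display, that the Fourier-transformed equation has the usual form shown in the first line, with $\langle f, g\rangle = \int f \bar{g}\,dv$; the Fourier sign convention and the exact form of the coercivity bound are assumptions, not quotations.

```latex
% Assumed form of the Fourier-transformed linear equation:
\[
\partial_t \hat f + i (v\cdot k)\hat f + L\hat f = \hat g .
\]
% Pair with \hat f in L^2_v and take real parts. The transport term drops out
% because its pairing is purely imaginary:
\[
\mathrm{Re}\,\big\langle i (v\cdot k)\hat f, \hat f\big\rangle
= \mathrm{Re}\Big( i \int_{\mathbb{R}^n} (v\cdot k)\,|\hat f|^2\,dv \Big) = 0 .
\]
% This leaves the energy identity
\[
\frac12 \frac{d}{dt} |\hat f(t,k)|_{L^2_v}^2
+ \mathrm{Re}\,\big\langle L\hat f, \hat f\big\rangle
= \mathrm{Re}\,\big\langle \hat g, \hat f\big\rangle ,
\]
% and a coercivity bound of the assumed form
% <L h, h> \ge \lambda |\{I-P\} h|^2_{N^{s,\gamma}} controls only the
% microscopic part \{I-P\}\hat f.
```

This is why the macroscopic part $P\hat f$ must be recovered separately, which is the role of the later macroscopic dissipation estimate.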

2.2.2. Microscopic weighted time-frequency inequality. In this section we prove the following instantaneous Lyapunov inequality with a velocity weight $\ell \in \mathbb{R}$: We split the solution $f$ to equation (2.1) into $f = Pf + \{I - P\}f$, take the Fourier transform as in (2.7), and then apply $\{I - P\}$ to the resulting equation: We will estimate each of the three terms in (2.10). As a result of the rapid decay in the coefficients of (1.9) we obtain which holds for any small $\eta > 0$ and any large $j > 0$. For the second estimate, we invoke [11, Lemma 2.6] to achieve the following coercive bound Note that following the procedure as above when $\ell = 0$, using (1.11), yields This holds when $g = 0$; it will be useful in the proof of Corollary 2.2. We furthermore remark, following the same procedure as above, that we get In other words, if we multiply (2.7) by $w^{2\ell}\hat{f}(t,k)$, integrate in $v$ and use the same estimates as in the last case, it follows that we obtain (2.12).

2.2.3. Estimate on the macroscopic dissipation. In this section, we recall some arguments from [5] which are used to estimate the macroscopic dissipation, in the spirit of [17]. Now the form (1.9) of the orthogonal projection $P$ implies the identities We will also use the following high-order moment functions $\Theta(f) = (\Theta_{ij}(f))_{n\times n}$ and $\Lambda(f) = (\Lambda_1(f), \cdots, \Lambda_n(f))$ which are given by (2.14) Now as in [5] these high-order moment functions satisfy some mixed hyperbolic-parabolic equations. In the case when $Pg = 0$, the equations for these moment functions can be used to prove the following lemma from [5, Lemma 4.1]: with two properly chosen constants $0 < \kappa_2 \ll \kappa_1 \ll 1$ such that holds for any $t \ge 0$, and $k \in \mathbb{R}^n$. Now the $\{e_m\}$ are the smooth exponentially decaying velocity basis vectors which are contained in (1.9) and (2.14), and $e$ is a linear combination of the $\{e_m\}$ whose precise form is unimportant herein.

2.3. Proof of time decay of linear solutions.
In this subsection we prove Theorem 2.1 based on Theorem 2.3. For the soft-potentials (1.4), the time-frequency dissipation is weaker than E ℓ (t, k) so we close the estimates using interpolation.
In (A.1) from Appendix A, using the Hölder and Hausdorff-Young inequalities, we showed that the integration over $|k| \le 1$ is bounded for $1 \le r \le 2$ as follows Here we recall (2.5). For the integration over $|k| \ge 1$: Collecting the above estimates as well as (2.22) gives (2.4). It remains to prove (2.6) for the soft potentials (1.4). By (2.18) we have that $E_\ell(t,k) \le E_\ell(0,k)$ for any $\ell \in \mathbb{R}$. On the other hand notice that (2.18) is insufficient because $\gamma + 2s < 0$, and in this case $E_\ell(t,k)$ is not controlled by the dissipation norm $|w^\ell \hat{f}(t,k)|^2_{L^2_{\gamma+2s}}$. To resolve this difficulty we interpolate with a family of norms.
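The low-frequency integral behind the polynomial rate can be illustrated in the simplest case $r = 1$. The sketch below assumes a pointwise bound of heat type for $|k| \le 1$, namely $E(t,k) \le e^{-\lambda |k|^2 t} E(0,k)$ (the form one expects for the hard potentials), and suppresses the velocity variable; both simplifications are ours.

```latex
% Case r = 1: Hausdorff-Young reduces to |\hat f_0(k)| \le \|f_0\|_{L^1_x}.
\[
\int_{|k|\le 1} |k|^{2m} e^{-\lambda |k|^2 t}\,|\hat f_0(k)|^2\,dk
\le \|f_0\|_{L^1_x}^2 \int_{|k|\le 1} |k|^{2m} e^{-\lambda |k|^2 t}\,dk .
\]
% For t \ge 1 substitute k = \xi/\sqrt{t}, so that dk = t^{-n/2}\,d\xi:
\[
\int_{|k|\le 1} |k|^{2m} e^{-\lambda |k|^2 t}\,dk
\le t^{-\frac n2 - m} \int_{\mathbb{R}^n} |\xi|^{2m} e^{-\lambda|\xi|^2}\,d\xi
\lesssim (1+t)^{-\frac n2 - m}.
\]
% For 0 \le t \le 1 the left side is bounded by the volume of the unit ball,
% so the bound (1+t)^{-n/2-m} holds for all t \ge 0.
```

Each spatial derivative thus contributes an extra factor $(1+t)^{-1/2}$ to the decay of the $L^2_x$ norm, consistent with the faster decay of higher derivatives described in the introduction.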
In particular, for j > 0, using (2.17) and (1.13) we have We therefore conclude that . Now we can rewrite (2.18), for any k ∈ R n , as To prove (2.6), one can bound E ℓ (t, k) as follows Integrating this over time, we obtain For any ℓ ∈ R and j > 0, uniformly in k ∈ R n , we have shown that We also just used the estimate E ℓ (0, k) E ℓ+j (0, k).
As before, we integrate over k and split into |k| ≤ 1 and |k| > 1 to achieve Alternatively, when |k| ≤ 1 we choose j to satisfy j > 2σ r,m and obtain Zr . For 1 ≤ r ≤ 2, this last inequality again uses (A.1) in Appendix A.
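The way an interpolated differential inequality produces polynomial decay can be isolated in an elementary ODE lemma. This is a generic sketch: the symbols $y$, $M$, $a$ and $q$ are placeholders of ours (think of $y = E_\ell(t,k)$, $M \approx E_{\ell+j}(0,k)$ and $a$ a frequency-dependent factor), and the exact exponents used in the paper may differ.

```latex
% Hypotheses: y(t) \ge 0, y(0) \le M, and for some a, \lambda, q > 0
\[
y'(t) + \lambda\, a\, M^{-\frac1q}\, y(t)^{1+\frac1q} \le 0 .
\]
% Differentiate y^{-1/q}:
\[
\frac{d}{dt}\, y^{-\frac1q}
= -\frac1q\, y^{-\frac1q - 1}\, y'
\;\ge\; \frac{\lambda a}{q}\, M^{-\frac1q}.
\]
% Integrate from 0 to t and use y(0)^{-1/q} \ge M^{-1/q}:
\[
y(t)^{-\frac1q} \ge M^{-\frac1q}\Big(1 + \frac{\lambda a}{q}\, t\Big)
\quad\Longrightarrow\quad
y(t) \le M \Big(1 + \frac{\lambda a}{q}\, t\Big)^{-q}.
\]
```

The price of the argument is visible in the constant $M$: the decay of the lower-weight quantity is paid for by a higher-weight norm of the initial data, and a larger $q$ (more extra weight) buys a faster rate.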
Next we give our proof of Corollary 2.2, using different methods.
Proof of Corollary 2.2. To prove time decay, now and in subsequent sections without loss of generality we can suppose that t ≥ 1. We will use a time-velocity splitting on the energy inequality from (2.11). Recalling (1.13), we define the sets Here p ′ ≥ 0 will be chosen later. We initially restrict to the case of the soft potentials (1.4). For the dissipation term in the energy inequality (2.11) we have We plug (2.24) into (2.11) to achieve The integrated form of this inequality is Here A ≥ 0, and we take p > 0, or 0 < p ′ < 1, so that the integral is finite.
where we have used 1 w This completes our estimates for the time decay of the Cauchy problem for the linearized non cut-off Boltzmann equation (2.1). In the next section we explain how to prove these decay rates in the nonlinear regime, for (1.6).
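The time-velocity splitting used in the proof of Corollary 2.2 can be sketched in a simplified form. Here we take the weight to be $\langle v\rangle$ and the splitting set to be $E(t) = \{\langle v\rangle \le t^{p'}\}$; these specific choices, and the generic function $h$ and exponent $j \ge 0$, are assumptions made for illustration.

```latex
% Soft potentials: \gamma + 2s < 0. On E(t) = \{ \langle v \rangle \le t^{p'} \}:
\[
\langle v\rangle^{\gamma+2s} \ge t^{p'(\gamma+2s)}
\quad\Longrightarrow\quad
|h|^2_{L^2_{\gamma+2s}} \ge t^{p'(\gamma+2s)}\, |\mathbf{1}_{E(t)} h|^2_{L^2}
= t^{p'(\gamma+2s)} \Big( |h|^2_{L^2} - |\mathbf{1}_{E(t)^c} h|^2_{L^2} \Big).
\]
% On the complement, \langle v \rangle > t^{p'}, so for any j \ge 0:
\[
|\mathbf{1}_{E(t)^c} h|^2_{L^2}
\le t^{-2jp'}\, |\langle v\rangle^{j} h|^2_{L^2}.
\]
% An inequality y' + \lambda |h|^2_{L^2_{\gamma+2s}} \le 0 with y \approx |h|^2_{L^2}
% therefore becomes, with p = -p'(\gamma+2s) > 0,
\[
y' + \lambda\, t^{-p}\, y \;\lesssim\; t^{-p - 2jp'}\, |\langle v\rangle^{j} h|^2_{L^2}.
\]
```

The left side now has a genuine (time-degenerate) damping term, while the error on the right is small for large times provided the higher-weight norm stays bounded, which is where the extra polynomial velocity weights on the data enter.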

Nonlinear time decay
This section is devoted to the proof of Theorem 1.2 and Corollary 1.3. But first we prove the energy estimates (1.17) and (1.18) from Theorem 1.1. Note that once these energy estimates are established, the rest of the statements in Theorem 1.1 can be shown by using the methods elaborated in [11].
3.1. Velocity-weighted energy estimates. In this subsection, we will prove the velocity-weighted energy estimates in (1.17) for solutions to (1.6). The first step is to prove the following unweighted estimate for the norms in (1.14) and (1.16): L 2 , and D 0 K (t) contains no velocity derivatives: Following the proof of [11,Theorem 8.4], we find for suitable solutions to (1.6) that there are constants δ > 0 and C 2 > 0 such that Note that in the whole space we can not use the Poincaré inequality to obtain the term Pf in the lower bound. Otherwise the rest of the arguments in the proof of [11,Theorem 8.4] generalize from T n x to R n x . Furthermore I(t) is a suitable functional defined precisely in [11, (8.25)]. The key property of I(t) is that it can be absorbed into the energy, for a small κ > 0, as follows Then with (3.3) and (3.4) we obtain (3.1) with the first upper bound. The proof of this inequality follows exactly the proof in [11, (8.26)] when ℓ = |β| = 0.
To obtain the second upper bound in (3.1), we need to work a bit harder because $\mathcal{D}_{K,\ell}(t)$ in (1.16) does not contain $Pf$. To overcome this, we claim that where $\ell \ge 0$ and $|\alpha| + |\beta| \le K$ with $K \ge 2K^*_n$. This clearly implies the second upper bound inequality in (3.1). The first step in our proof of (3.5) is to use the well-known expansion where the $\psi_i(t,x)$ are the elements from (2.13) and the $\phi_i(v)$ are the smooth rapidly decaying velocity basis vectors in (1.8). Thus from [11, Proposition 6.1]: Here $|[a,b,c]|$ is just the Euclidean square norm of the coefficients from (2.13). Now take the supremum of either $|\partial^{\alpha_1}[a,b,c]|$ or $|w^{\ell-|\beta|}\partial^{\alpha-\alpha_1}_{\beta_1}\{I-P\}f|_{L^2_{\gamma+2s}}$, whichever contains the smallest total number of derivatives, and use the embedding $H^{K^*_n}_x \subset L^\infty_x$ for this term and Cauchy-Schwarz for the others to obtain (3.5).
The last case to consider is $\Gamma(Pf, Pf)$. From [11, Proposition 6.1] again This follows from the rapid decay of the basis vectors in (1.8). Now if $|\alpha| > 0$ then we again use $H^{K^*_n}_x \subset L^\infty_x$ and Cauchy-Schwarz to get (3.5).
However, when $\alpha = 0$, we combine the $L^{2^*}(\mathbb{R}^n_x)$ gradient Sobolev inequality, for $n \ge 3$ and $2^* = \frac{2n}{n-2}$. We can fortunately choose $m + 1 = K^*_n$. Now take the $L^\infty_x$ norm of $|[a,b,c]|$, use (3.7) and Cauchy-Schwarz to get (3.5) when $\alpha = 0$ and $n \ge 3$. Alternatively, when $\alpha = 0$ and $n = 2$ we use Cauchy-Schwarz combined with the following inequality . We have then shown (3.5) in all the different cases.
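The exponent $2^*$ in the gradient Sobolev inequality, and the reason $n = 2$ needs separate treatment, can be recovered from a scaling argument. This is the standard computation, recorded here only as a reminder.

```latex
% Gradient Sobolev inequality on \mathbb{R}^n, n \ge 3:
\[
\|u\|_{L^{q}(\mathbb{R}^n)} \lesssim \|\nabla u\|_{L^2(\mathbb{R}^n)} .
\]
% Apply it to u_\lambda(x) = u(\lambda x), \lambda > 0:
\[
\|u_\lambda\|_{L^q} = \lambda^{-\frac nq}\|u\|_{L^q},
\qquad
\|\nabla u_\lambda\|_{L^2} = \lambda^{1-\frac n2}\|\nabla u\|_{L^2}.
\]
% The inequality can hold for all \lambda only if the powers agree:
\[
-\frac nq = 1 - \frac n2
\quad\Longleftrightarrow\quad
q = 2^* = \frac{2n}{n-2}.
\]
% Examples: n = 3 gives 2^* = 6; n = 4 gives 2^* = 4.
% For n = 2 the formula degenerates (no finite q), so a different
% inequality is needed in that dimension.
```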
In the rest of this subsection we will prove weighted energy estimates with spatial derivatives for the microscopic part, $\{I - P\}f$, in Step 1. Then in Step 2 we will prove velocity weighted estimates with both space and velocity derivatives, also for $\{I - P\}f$. A suitable linear combination of these and (3.1) will establish (1.17).
Step 1. We split the solution f to equation (1.6) into f = Pf + {I − P}f and take {I − P} of the resulting equation to obtain For |α| ≤ K − 1 we take ∂ α of (3.8), multiply the result by w 2ℓ ∂ α {I − P}f for ℓ ≥ 0, and then integrate in x, v to achieve: We will estimate each of the three terms in (3.9). Now from (3.5) we have |Γ 1 | E K,ℓ (t)D K,ℓ (t). Then with Cauchy-Schwarz, we have |Γ 2 | D 0 K (t) from (3.2). Furthermore, from [11, Lemma 2.6], we see that We collect these estimates into (3.9) to achieve the final estimate of 1 2 Similarly, for |α| = K we take ∂ α of (1.6), multiply the result by w 2ℓ ∂ α f for ℓ ≥ 0, and then integrate in x, v and sum over |α| = K to achieve: These are the main energy inequalities in the first step in our proof of (1.17).
Step 2. Fix |α| + |β| ≤ K with |β| ≥ 1. We apply ∂ α β to (3.8) to obtain where the functionals I 1 and I 2 are denoted by , and for C β β1 ≥ 0 we have We multiply (3.12) by w 2ℓ−2|β| ∂ α β {I − P}f and integrate over x, v to get Analogous to the estimates in Step 1, 2). Furthermore from [11, Lemma 2.6] for a small η > 0 Lastly considering Ĩ 2 as in [11, (8.29)] for any small η ′ > 0 we see that We add together each of these estimates, use a simple induction, and sum to obtain . This is the third and final estimate which we need to prove (1.17).

3.2. High-order energy estimates. In this subsection, we will prove the second Lyapunov inequality in Theorem 1.1. Our goal is to construct a high-order instant energy functional E h K,ℓ (t) satisfying (1.18) if E K,ℓ (0) is sufficiently small (which we assume throughout this subsection). Due to (1.17), E K,ℓ (t) is also sufficiently small uniformly in time. Recall also the definition (1.16) of D K,ℓ (t).
Step 1. From the system (1.6) with (3.5) we obtain (3.14) In particular we differentiate (1.6) with ∂ α , then multiply the result by ∂ α f , integrate over x, v and sum over 1 ≤ |α| ≤ K to obtain (3.14) using also (1.11) and (3.5) with ℓ = 0. Multiply equation (3.8) by {I − P}f to additionally get and we will furthermore use (3.13) with ℓ = 0. Recall also (3.3). Following the proof of [11,Theorem 8.4] we can define Ĩ(t). The terms I α a (t), I α b (t), and I α c (t) are defined in the proof of [11,Theorem 8.4]. Note that I(t) from (3.3), as defined in [11, (8.25)], is as above except that the sum is instead over |α| ≤ K − 1. It is established in the proof of [11,Theorem 8.4], similar to just below [11, (8.25)], that Strictly speaking, to prove (3.16) one has to replace the use of [11,Lemma 8.7] in the proof of [11,Theorem 8.4] with the following estimate for K ≥ 2K * n : Here e k are the exponentially decaying velocity basis vectors defined in [11, (8.10)]. This is not a problem. We use (6.12) from [11, Proposition 6.1] to see that which holds for any m ≥ 0. Then we apply (3.7) to the term in the upper bound above with fewer derivatives to obtain (3.17). It is furthermore a key point that for any functional E h K (t) satisfying (1.15), from the proof of [11,Theorem 8.4] it can be seen directly that we have the upper bound for Ĩ(t) of We will collect each of these last few estimates to prove (1.18). Now, we are ready to construct E h K,ℓ (t). In fact, let us define for suitable constants 0 < κ 5 ≪ κ 4 ≪ κ 3 ≪ κ 2 ≪ κ 1 ≪ 1 to be chosen sufficiently small, where Ĩ(t) satisfies (3.16) and (3.18). Due to (3.18), notice that (1.15) holds true and so E h K,ℓ (t) is a well-defined high-order instant energy functional.
By choosing 0 < κ 5 ≪ κ 4 ≪ κ 3 ≪ κ 2 ≪ κ 1 ≪ 1 further small enough, the sum of (3.14), (3.15)×κ 1 , (3.16)×κ 3 , (3.13)×κ 2 , (3.10)×κ 4 , and (3.11)×κ 5 yields Here we have additionally added $\|\nabla_x Pf\|^2$ to both sides of the inequality, while also recalling that $\|w^{\ell} \nabla_x Pf\|^2 \lesssim \|\nabla_x Pf\|^2$. We thus establish the desired estimate (1.18) since E K,ℓ (t) is small enough.
3.3. The L 2 (R n v ) nonlinear estimate. In this subsection, we will prove the following velocity weighted L 2 (R n v ) based norm estimates for (1.7).
In the proposition above, we note that $(\gamma+2s)^+ = \max\{\gamma+2s, 0\}$. Previous estimates of this type can be found in, for example, [1,3] and the references therein. We cannot use these because they do not hold under (1.4) in particular, and they do not allow negative decaying velocity weights at infinity. Our estimate above is not optimal in terms of the order of differentiation, i instead of 2s from (1.2). It is also not optimal in terms of the order of the velocity weights, in particular because of the term $(\gamma+2s)^+$ above. However, our ability to include negative decaying velocity weights allows us to obtain better dependence on the initial data in Theorem 1.2 than would otherwise be possible (without the negative weights above, when b + = 0 and b − > 0, we would have to "pay more" with additional weights for the decay in Theorem 1.2). These estimates also have the advantage that they can be proven quickly using the machinery from [11].
We will use Proposition 3.1 in two distinct cases. Firstly, for (1.14), we have (3.22) w This holds when b + = 0, b − = ℓ and b ′ = ℓ/2 where ℓ > 0 is sufficiently large. The estimate (3.22) then follows from the embedding H K * n x ⊂ L ∞ x and K ≥ 2K * n . This works for either the hard or the soft potential estimates in Proposition 3.1.
The second case is when b = b + ≥ 0 so that b − = b ′ = 0 and we have . This now uses the L p −L q Sobolev embeddings as in [11,Remark of (6.9)]. In (3.23) for the hard potentials (1.3) we need ℓ ′ = b + 2(γ + 2s) and for the soft potentials (1.4) we use ℓ ′ = b + 1 as a result of the scaling in the weight (1.13).
We denote $\psi_j \overset{\mathrm{def}}{=} \mathcal{F}^{-1}\varphi_j$ and define the notation "$g_j$" as follows We thus have the standard expansion $g = \sum_{j=0}^{\infty} g_j$. Furthermore, for any ℓ ∈ R and, say, m ∈ (0, 2) we have the estimate with ∇ the Euclidean gradient in R n . Above and below we will be using the notation This will be our main tool to control the Littlewood-Paley sums below.
To proceed with our estimates, we expand the trilinear form as As usual, all the sums can be rearranged because of the rapid convergence.
We first give the proof of (3.20) using several estimates from [11]. From [11,Proposition 3.4], using Γ * (g, h, f ) = Γ * (g, w q h, w −q f ), with q ∈ R it follows that Here δ > 0 is a small number, and m ≥ 0 can be large. (Note that in [11,Proposition 3.4], using [11,Proposition 3.5], it is legitimate to choose ε = s ∈ (0, 1).) To estimate the second term in (3.28) we use T k * (g, h, f ) = T k * (g, w q h, w −q f ) again. Then from [11,Proposition 3.2] we have the estimate . Unfortunately the estimate for T k + cannot use the same trick as easily. However, following the proof of [11,Proposition 3.3] we see that for any q + , q − , q ′ ≥ 0 with q = q + − q − and q − ≥ q ′ we have that . This can be derived quickly from the proof of [11,Proposition 3.3] as follows. When γ + 2s < 0 as in (1.4), in [11, (3.14)] we replace w 2ℓ (v ′ ) with w q (v ′ )w −q (v ′ ). Then in the top factor of [11, (3.15)] w 2ℓ (v ′ ) is replaced by w 2q (v ′ ), in the bottom factor of [11, (3.15)] w 2ℓ (v ′ ) is replaced by w −2q (v ′ ), and the rest of the proof is exactly the same under (1.4). When alternatively γ + 2s ≥ 0, in [11, (3.16)] we can replace w 4ℓ (v ′ ) with w 2q (v ′ )w −2q (v ′ ) and we replace w 2(ℓ+ℓ ′ ) (v ′ ) in both places with w 2(q+q ′ ) (v ′ ). We put w −2q (v ′ ) with |f ′ | 2 in [11, (3.16)] and the rest of the proof does not change. This T k + estimate is the larger of the two above.
Then for the second term in (3.28) we obtain the upper bound of The last inequality used (3.27) as well as 2sj = ij + (2s − i)j with (1.2) and (2s − i) < 0.
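The role of the splitting of the exponent $2sj$ can be illustrated by a geometric series. In the sketch below, $\kappa = |i - 2s| > 0$ is our own shorthand for the gap between the two orders of differentiation, and $a_j \ge 0$ stands for whatever quantity is controlled uniformly in $j$; both are placeholders rather than notation from the text, and the actual argument may combine this with Cauchy-Schwarz in $j$ and (3.27).

```latex
\sum_{j=0}^{\infty} 2^{-\kappa j} \, a_j
  \le \Big( \sup_{j \ge 0} a_j \Big) \sum_{j=0}^{\infty} 2^{-\kappa j}
  = \frac{1}{1 - 2^{-\kappa}} \, \sup_{j \ge 0} a_j .
```

The point is only that a strictly negative leftover exponent makes the Littlewood-Paley sum converge, with a constant that deteriorates as $\kappa \to 0$.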
To estimate (3.29) we have to exploit the cancellations. Following the proof of [11,Proposition 3.7] for q ∈ R we obtain the estimate (for m ≥ 0 large) The proof of this estimate with the isotropic Euclidean derivative ∇ as above follows directly the proof of [11,Proposition 3.7], except that we use symmetry; a longer explanation is given in [11, (3.41)]. We further have Therefore in the proof of [11,Proposition 3.7] we replace the use of [11, (3.25)] with (3.30). The only other difference is that we replace the weight w 2ℓ (v ′ ) with w q (v ′ )w −q (v ′ ). The factor w q (v ′ ) follows the function h j and the factor w −q (v ′ ) goes with the function f . Otherwise the proof is exactly the same as [11,Proposition 3.7]. Now we estimate (3.29) with the upper bound of . Of course, these inequalities used (3.27).
To prove (3.20) it remains to choose the weights. First suppose that (1.3) holds so that γ + 2s ≥ 0. In this case from (1.13) we have that $w(v) = \langle v \rangle$. We choose Alternatively, when γ + 2s < 0 but $\gamma + 2s > -\frac{n}{2}$ with (1.4), since in this case under (1.13) we have $w = \langle v \rangle^{-\gamma-2s}$, we can choose Then, in either case, collecting these upper bounds we have shown that The estimate in (3.20) then follows by duality.
where we defined λ = λ ′ p with p = −p ′ + 1 > 0. Use the factor e −λt p to obtain We suppose A ≥ 0 and p > 0, equivalently 0 < p ′ < 1, so that the integral is finite. In this case, since E high K,ℓ (s) is restricted to E c (s) we have that  .3) where g is given by (2.2). We conclude that We recall the norms from (1.12). We now apply (2.4) with m = 0, r = 1 and ℓ = −b ≤ 0 to be determined to I 0 (t) and I 1 (t), respectively, to obtain Under (1.3) we can take j = 0 and for (1.4) we take any j > n 2 . Define (3.36) For I 1 (t), from (3.22) and the definition (3.36) of E ∞ K,ℓ (t), it holds that Here we have chosen b > 0 sufficiently large and used K ≥ 2K * n in (3.22). We have also used the decay estimates for the time integrals as in [20,Proposition 4.5].
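For orientation, a standard estimate for time integrals of this convolution type is recorded below. This is a sketch of the general fact only: the exponents $a, b \ge 0$ and the constant $C_{a,b}$ are our own placeholders, and the precise statement used from [20, Proposition 4.5] may differ.

```latex
\int_0^t (1+t-s)^{-a} \, (1+s)^{-b} \, ds
  \le C_{a,b} \, (1+t)^{-\min\{a,b\}},
\qquad \text{provided } \max\{a,b\} > 1 .
```

One way to see it is to split the integral at $s = t/2$: on $[0, t/2]$ the first factor is bounded by a multiple of $(1+t)^{-a}$, on $[t/2, t]$ the second factor is bounded by a multiple of $(1+t)^{-b}$, and the condition $\max\{a,b\} > 1$ absorbs the remaining integral (including the logarithm that appears when the smaller exponent equals one).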
Collecting the estimates on I 1 (t) and I 2 (t) above, with (1.9), implies . Then we obtain (3.33) with p = 1, except that in the upper bound A = 0. As in the previous case, we plug in (3.37) and use a bootstrap argument with ε K,ℓ sufficiently small to obtain (1.20). Now we plug this into (1.18) to conclude that Following the exact procedure used to obtain (3.33) we achieve Here j is defined as in Theorem 2.1. We then use (3.22) to see that for b > 0 sufficiently large one has [E K (s)] 2 . Now using the notation in Theorem 1.2, we choose p(n) ≤ q(n) so that ε K,p(n) ≤ ε K,q(n) is sufficiently small as in (1.19). Then we can obtain (3.39) with p = 1 and A = 0. As in the previous analysis, we plug in (3.41) with ε K,0 sufficiently small and q(n) = 0 to obtain (1.23).
3.6. Time decay rates of solutions in L r x . In this final subsection, we will prove both (1.22) from Theorem 1.2 and (1.24) from Corollary 1.3. Notice already that the time decay rates stated in (1.22) and (1.24) are both true when r = 2 due to the assumptions of Corollary 1.3, Theorem 1.2 and the definitions (1.14) and (1.15) of E K,ℓ (t) and E h K,ℓ (t) respectively. It then remains to verify the following claim: Once this is done (1.22) and (1.24) follow from interpolation. We begin with Lemma 3.1. Using (2.3), we have the following estimates uniformly in t ≥ 0: $\|A(t)f_0\|_{Z_\infty} \lesssim (1+t)^{-\frac{n}{2}} \|w^{j(n)} f_0\|_{H^{K^*_n} \cap Z_1}$. Above the weight power is given by any $j(n) > \frac{n}{2} + K^*_n$ for the soft potentials (1.4) and j(n) = 0 for the hard potentials (1.3). Furthermore, it holds that $\|\{I-P\}A(t)f_0\|_{Z_\infty} \lesssim (1+t)^{-\frac{n+1}{2}+\varepsilon} \|w^{j(n)'} f_0\|_{H^{K^*_n+1} \cap Z_1}$. In this case the weight power is given by j(n)′ = 0 (and ε = 0) for the hard potentials (1.3). For the soft potentials (1.4), with any small ε > 0, we take j(n)′ = j(n)′(ε) > 0 sufficiently large. We therefore observe that the microscopic part can have faster linear L ∞ x decay. Now we see that the nonlinear decay rate in (1.24) for {I − P}f (t) is not optimal in Z ∞ , at least in comparison to the linear decay in Z ∞ from Lemma 3.1 above.
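The interpolation step mentioned in this subsection can be made explicit as follows. This is a sketch under two assumptions of ours: that $Z_r$ denotes the $L^2_v(L^r_x)$ norm, and that the endpoint bounds decay like $(1+t)^{-n/4}$ for $r = 2$ and $(1+t)^{-n/2}$ for $r = \infty$, which are the endpoint values of the rate stated in the abstract. Constants and the dependence on the initial data are suppressed.

```latex
% Interpolation between the endpoints, for 2 \le r \le \infty:
\| f(t) \|_{Z_r}
  \le \| f(t) \|_{Z_2}^{2/r} \, \| f(t) \|_{Z_\infty}^{1-2/r}
  \lesssim (1+t)^{-\frac{n}{4}\cdot\frac{2}{r}} \,
           (1+t)^{-\frac{n}{2}\left(1-\frac{2}{r}\right)}
  = (1+t)^{-\frac{n}{2}+\frac{n}{2r}} .
```

The first inequality follows from the usual $L^2_x$ to $L^\infty_x$ interpolation at each fixed $v$, followed by Hölder's inequality in $v$ with exponents $\frac{r}{2}$ and $\frac{r}{r-2}$. The same computation with a different $Z_\infty$ rate gives the corresponding interpolated rate for the microscopic part.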
Collecting the estimates in this paragraph grants (3.42). Q.E.D.
This establishes (A.1) for r = 1. Now to prove (A.1) when 1 < r < 2 we use the following two inequalities. The Hölder inequality states, with $\frac{1}{p} + \frac{1}{q} = 1$, that $\int_{\mathbb{R}^n} |g(x) h(x)| \, dx \le \|g\|_{L^p} \|h\|_{L^q}$. The Hausdorff-Young inequality can be expressed as $\left( \int_{\mathbb{R}^n} |\hat{g}(k)|^q \, dk \right)^{1/q} \le C(p) \left( \int_{\mathbb{R}^n} |g(x)|^p \, dx \right)^{1/p}$, which holds for any 1 ≤ p ≤ 2 and $\frac{1}{p} + \frac{1}{q} = 1$, that is, q = p/(p − 1). Now to get back to (A.1), when 1 < r < 2, we apply the Hölder inequality as This completes the proof of (A.1).