Convergence rate for the method of moments with linear closure relations

We study linear closure relations for the method of moments applied to simple kinetic equations. The equations are linear collisional models (velocity jump processes) which are well suited to this type of approximation. In this simplified, one-dimensional setting, we are able to prove stability estimates for the method (with a kinetic interpretation by a BGK model). Moreover we obtain convergence rates which automatically increase with the smoothness of the initial data.


Quick presentation of the method of moments for kinetic equations
We study here an unusual but very simple choice of closure for the method of moments, for which we can completely characterize the stability and convergence rates of the approximation. As far as we know this very simple situation was never considered before. Before presenting this choice though, let us very briefly give the main ideas behind the method of moments. Moments methods were introduced in [10] in the context of the Boltzmann equation. This well known equation is posed on the density f(t, x, v) of particles in the phase space and reads
∂_t f + v · ∇_x f = Q(f),   (1.1)
where Q is a nonlinear operator (in the velocity variable v) which expresses how the velocity of a particle may change when it has a random collision. Solving numerically an equation like (1.1) is in general very costly. The structure of the right-hand side Q is nonlocal in v and moreover the equation is posed in phase space, which means that one has to work in dimension 2d + 1 if x, v ∈ R^d (7 for instance if x ∈ R^3).
The method of moments is one possible answer to this problem and it consists in solving equations on the polynomial moments of f. Let m(v) be a polynomial in v and denote ⟨m, f⟩ = ∫ m(v) f(t, x, v) dv; then
∂_t ⟨m, f⟩ + ∇_x · ⟨v m, f⟩ = ⟨m, Q(f)⟩.   (1.2)
Now instead of solving one equation in dimension 2d + 1, one has to solve several equations but in dimension d + 1. Moreover in general one can expect that ⟨m, Q(f)⟩ is not too complicated to compute. However the system given by (1.2) is not closed, as vm is always one degree higher than m. Therefore no matter how many moments m_i one chooses, it is never possible to express all the ⟨v m_j, f⟩ in terms of the ⟨m_i, f⟩. This is the closure problem and it means that (1.1) is never equivalent to (1.2) for any finite number of moments.
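The closure problem can be seen numerically: two densities can share all moments up to order N and still differ at order N + 1, so the truncated system cannot determine the missing flux. The sketch below (an illustration, not taken from the paper; the densities are arbitrary choices) checks this with Gauss-Legendre quadrature on I = [−1, 1]:

```python
# Illustration of the closure problem: adding the Legendre polynomial
# L_{N+1}, orthogonal to all polynomials of degree <= N on [-1, 1],
# leaves mu_0, ..., mu_N unchanged but modifies mu_{N+1}.
import numpy as np
from numpy.polynomial.legendre import leggauss, Legendre

N = 4
v, w = leggauss(60)                      # Gauss-Legendre nodes/weights on [-1, 1]

f1 = np.exp(-3.0 * v**2)                 # an arbitrary smooth density
f2 = f1 + Legendre.basis(N + 1)(v)       # same moments up to order N

def moment(f, i):
    """i-th moment mu_i = integral of v^i f(v) dv over [-1, 1]."""
    return np.sum(w * v**i * f)

same = [abs(moment(f1, i) - moment(f2, i)) for i in range(N + 1)]
gap = abs(moment(f1, N + 1) - moment(f2, N + 1))
```

Knowing µ_0, ..., µ_N therefore never determines µ_{N+1}; this is exactly why a closure relation must be supplied.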
Instead one typically chooses a closure equation, i.e. a relation between ⟨v m_i, f⟩ and the ⟨m_j, f⟩ for those i where v m_i cannot be expressed in terms of the m_j.
The first big difficulty for this type of method is how to choose the closure in order to ensure that the corresponding moments' system has good properties and gives a good approximation of Eq. (1.1). This problem was of course recognized early on, see for instance [3], as well as the role of entropy, see [15] among many others.
One of the first systematic ways of finding a closure was introduced in [12] and [13]. It is still not easy to actually compute the relation which means that it is often computed numerically instead (see [18], [6], [7]). Different closures can of course be used (see [21] for example).
Theoretically even checking that the corresponding method leads to a hyperbolic system is not easy (we refer for instance to [4], [5]). Proving convergence rates seems to be out of reach for the time being, although in practice the method seems to provide a good approximation (see [11] for a numerical study).
Let us also mention that the method of moments has also been used for theoretical purposes (as in [9]) and not only for numerical computations.
We conclude this very brief overview by referring to [2], [17] or [19] for more on numerical simulations for kinetic equations in this context.

Linear closure relations
The guiding question in this article is whether it can make sense to consider a linear closure relation. This is certainly delicate in the nonlinear case of the Boltzmann equation (1.1). Instead we choose a simplified one-dimensional setting where it is possible to fully analyze the method.
Instead of (1.1), we consider the linear model
∂_t f + v ∂_x f = Q f − λ f,   (1.4)
with λ > 0 and where the operator Q corresponds to a velocity jump process. While much simplified with respect to (1.1), it is not uninteresting in itself, with applications to physics (see [8], [23]) or biology (see [14] for example). The equation has to be supplemented with some initial data, which for simplicity we assume to be compactly supported in velocity:
f(t = 0, x, v) = f^0(x, v),  supp_v f^0 ⊂ I.   (1.5)
In general Q could even be assumed to depend on t and x. Here we make the additional approximation
Q f(t, x, v) = q(v) Σ_{j=0}^{d} α_j ∫_I w^j f(t, x, w) dw,   (1.6)
with q smooth and compactly supported in some interval I, d ∈ N and (α_j)_{0≤j≤d} ∈ R^{d+1}. With this special form, one of course expects to be in a very favorable situation for the method of moments. Hence this should be seen as a simple toy model where the method can easily be tested. Denoting the moments of the solution f by
µ_i^f(t, x) = ∫_I v^i f(t, x, v) dv,   (1.7)
Eq. (1.4) simply becomes
∂_t f + v ∂_x f = L f,  with L f = q(v) Σ_{j=0}^{d} α_j µ_j^f − λ f.   (1.8)
As we work in dimension 1, the structure of the hierarchy of equations on the moments is also very simple:
∂_t µ_i^f + ∂_x µ_{i+1}^f = µ_i^q Σ_{j=0}^{d} α_j µ_j^f − λ µ_i^f,  0 ≤ i ≤ N,   (1.9)
where we truncate at order N and we define the moments of q by
µ_i^q = ∫_I v^i q(v) dv.   (1.10)
In order to close the system, it would be necessary to be able to express µ_{N+1}^f in terms of the µ_i^f for i ≤ N. The linear closure relation that we study here consists in assuming that µ_{N+1}^f is a linear combination of the lower moments. That means that instead of (1.8) or (1.9), we solve
∂_t µ_i + ∂_x µ_{i+1} = µ_i^q Σ_{j=0}^{d} α_j µ_j − λ µ_i,  0 ≤ i < N,
∂_t µ_N + ∂_x ( Σ_{i=0}^{N} a_i µ_i ) = µ_N^q Σ_{j=0}^{d} α_j µ_j − λ µ_N,   (1.11)
with (a_0, . . . , a_N) given real coefficients that one should choose in the "best" possible way. One can rewrite (1.11) in matrix form as
∂_t µ + A ∂_x µ = B µ,   (1.12)
where µ = (µ_0, . . . , µ_N)^T, B collects the right-hand sides of (1.11), and A is the companion-type matrix
A =
( 0    1    0   · · ·  0 )
( 0    0    1   · · ·  0 )
( . . .                  )
( 0    0    0   · · ·  1 )
( a_0  a_1  a_2 · · ·  a_N ).   (1.13)

Basic properties of the method and the main result
Given the simple form of (1.11) or (1.12), some properties are obvious. By linearity, the system is hyperbolic if the characteristic polynomial
χ_A(X) = X^{N+1} − Σ_{i=0}^{N} a_i X^i
has N + 1 real roots, denoted by λ_0, . . . , λ_N. This is enough to guarantee the well-posedness of the numerical system (1.12) but not necessarily good stability properties. Norms of the numerical approximation could for instance grow quickly as N increases. If the initial data is compactly supported in x then so is the solution, and the support propagates with speed max_k |λ_k|.
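As a quick illustration of the hyperbolicity criterion, the following sketch (with hypothetical closure coefficients, not from the paper) builds a matrix of the form (1.13) and checks that its eigenvalues, i.e. the roots of χ_A, are real:

```python
# Hyperbolicity check for (1.12): A is the companion-type matrix of (1.13);
# the system is hyperbolic iff X^{N+1} - sum_i a_i X^i has only real roots.
# The coefficients below give the hypothetical char. polynomial X^3 - X.
import numpy as np

def companion(a):
    """Matrix of the form (1.13): shift structure plus closure row a."""
    n = len(a)                                    # n = N + 1
    A = np.zeros((n, n))
    A[np.arange(n - 1), np.arange(1, n)] = 1.0    # super-diagonal
    A[-1, :] = a                                  # closure coefficients
    return A

def is_hyperbolic(a, tol=1e-10):
    lam = np.linalg.eigvals(companion(a))
    return bool(np.max(np.abs(lam.imag)) < tol), np.sort(lam.real)

ok, lam = is_hyperbolic([0.0, 1.0, 0.0])          # X^3 - X = X(X - 1)(X + 1)
```

The wave speeds of the system are then the sorted eigenvalues, here −1, 0, 1, and the support of a compactly supported solution spreads at speed max_k |λ_k|.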
On the other hand, the major drawbacks of such a method are also pretty clear. For instance, positivity of the even moments will likely not be preserved. Even the positivity of the macroscopic density µ_0^f has no reason to be propagated.
However a careful analysis reveals that it is in fact possible to choose the coefficients a_i appropriately in order to have not only stability but also very fast convergence.
For simplicity, assume that I = [−1, 1] (just rescale and translate otherwise) and choose the Tchebychev polynomial for χ_A, namely the eigenvalues
λ_k = cos( (2k + 1) π / (2N + 2) ),  k = 0, . . . , N,   (1.16)
or equivalently
χ_A(X) = 2^{−N} T_{N+1}(X),   (1.17)
with T_{N+1} the Tchebychev polynomial of degree N + 1. Then it is possible to show:

Theorem 1.1 Assume (1.5) and choose the closure (1.16)-(1.17). Then the system (1.12) is stable. In addition if f^0 ∈ H^k(R^2), d = 0, denoting by f the corresponding solution to (1.4) and defining its moments by (1.7), one has
sup_{[0,T]} ‖µ_0(t) − µ_0^f(t)‖_{L^2(R)} ≤ C N^{3/4 − k} ‖f^0‖_{H^k(R^2)},
where C ≥ 0 depends on T, λ, q and k.
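The Tchebychev choice can be checked numerically: the sketch below (an illustration, not the paper's code) rescales T_{N+1} to a monic polynomial and verifies that its roots are exactly the points λ_k = cos((2k + 1)π/(2N + 2)):

```python
# On I = [-1, 1], take chi_A = 2^{-N} T_{N+1} (monic Tchebychev polynomial);
# its roots are the Tchebychev points lambda_k = cos((2k+1) pi / (2N+2)).
import numpy as np
from numpy.polynomial import chebyshev as Cheb

N = 6
# Chebyshev-series coefficients of T_{N+1}, converted to the monomial
# basis and rescaled to be monic (leading coefficient of T_{N+1} is 2^N)
c = Cheb.cheb2poly([0.0] * (N + 1) + [1.0]) / 2.0**N
roots = np.sort(np.roots(c[::-1]).real)          # np.roots wants high->low order

k = np.arange(N + 1)
cheb_pts = np.sort(np.cos((2 * k + 1) * np.pi / (2 * N + 2)))
```

In particular all the λ_k are real and lie in I, so the closure gives a hyperbolic system with wave speeds bounded by 1.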

Remark 1.2
The convergence result is given for the first moment and d = 0. Higher moments would lose small powers of N, so the same proof would give the corresponding estimate (1.19). We do not know whether those results are optimal and in particular whether the numerical value 3/4 in (1.19) is optimal (though actual numerical simulations do suggest it is).
Hence in this setting, the conclusion of this analysis is that the linear method of moments should be seen in the same light as spectral methods (see [16] for instance). It is stable, converges automatically at order k − 3/4 if the initial data is in H^k, but does not propagate any additional property (positivity being probably the most important). The next section is a more detailed presentation of the stability and convergence analysis in the general case (without necessarily choosing the Tchebychev points). The corresponding technical justifications and calculations are presented in two separate sections. Theorem 1.1 is proved in section 5. The last section is an appendix and recalls some well known results.

Stability and convergence results
We present here in more detail the kind of stability and convergence results that can be proved for the method (1.11) without assuming any particular choice of the eigenvalues (like (1.16)). We give here the main ideas of the approach and leave the technical proofs to further sections.

Eigenvectors for System (1.12)

As they enter some of the estimates, we start with a short parenthesis about the eigenvectors of the matrix A. Define P ∈ M_{N+1}(R), the matrix of the eigenvectors; as A is a companion-type matrix, its eigenvectors are the Vandermonde columns
P = ( λ_j^i )_{0 ≤ i, j ≤ N}.   (2.1)
Its inverse P^{−1} can be computed as easily; with the normalizing convention that p̄_{i,N} = 1, an easy computation shows that p̄_{i,j} is a homogeneous polynomial of degree N − j in the eigenvalues (λ_0, . . . , λ_N).
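For a companion-type matrix such as A in (1.13), the eigenvector attached to λ_k is the Vandermonde column (1, λ_k, . . . , λ_k^N). The sketch below (with hypothetical eigenvalues chosen real and distinct for illustration) builds A from prescribed roots and checks A P = P diag(λ_0, . . . , λ_N):

```python
# Eigenvectors of the companion-type matrix A of (1.13): column k of the
# Vandermonde matrix P, namely (1, lam_k, ..., lam_k^N), is an eigenvector.
import numpy as np

lam = np.array([-0.9, -0.3, 0.2, 0.8])  # hypothetical real distinct eigenvalues
n = len(lam)

monic = np.poly(lam)                    # coefficients of prod (X - lam_k), high->low
a = -monic[1:][::-1]                    # closure row so that chi_A = prod (X - lam_k)

A = np.zeros((n, n))
A[np.arange(n - 1), np.arange(1, n)] = 1.0
A[-1, :] = a

P = np.vander(lam, n, increasing=True).T        # column k = (1, lam_k, ..., lam_k^{n-1})
resid = np.max(np.abs(A @ P - P @ np.diag(lam)))
```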

Stability estimate and kinetic interpretation of the method
Let us first state our main stability estimate.
Theorem 2.1 Assume (1.5) and that {λ_0, . . . , λ_N} ⊂ I. Moreover assume that there exists a function ρ_N(v), positive on I, such that the quadrature relation
∫_I R(v) / ρ_N(v) dv = 1/(N + 1) Σ_{k=0}^{N} R(λ_k)  for every R ∈ R_{2N+1}[X]   (2.5)
holds. Then, the hyperbolic system (1.12) is stable and the estimate (2.6) holds.

This result does not assume any particular distribution of the eigenvalues, but of course it is by no means guaranteed in general that one can find ρ_N satisfying (2.5). Notice that the corresponding relation is really a quadrature formula for computing integrals on I, which we ask to be exact for polynomials of degree 2N + 1. We do not prove this result directly on the system (1.12). Instead we show a corresponding result on a linear BGK problem (see [1] for the simplification of collision kernels into BGK models).
For any ε > 0 and N ≥ d, consider the equation
∂_t f_N^ε + v ∂_x f_N^ε = L f_N^ε + (1/ε) ( M^{(N)} f_N^ε − f_N^ε ),   (2.10)
where the linear operator L is defined by (1.8) and M^{(N)} is a Maxwellian operator of the form (2.12) below. This problem is a kinetic approximation of the macroscopic problem (1.11): formally, when ε → 0, one recovers (1.11) from (2.10). Indeed, multiplying (2.10) by v^i and integrating over v ∈ I, we obtain the relation (2.11), which is our closure relation. The interest of (2.10) is to make some computations more transparent and easier to follow, and in addition to work at the kinetic level, which is more natural given the original equation. If we can prove stability estimates for (2.10) that are uniform in ε then, simply by passing to the limit, we will obtain estimates for (1.12)-(1.11).
The most obvious choice for the Maxwellian is simply
M^{(N)} f(v) = Σ_{i=0}^{N} m_i(v) µ_i^f,   (2.12)
for functions m_i satisfying the moment conditions (2.13); these conditions ensure that (2.11) holds.
There are obviously many ways to choose the m_i s.t. (2.13) is satisfied. What we are looking for is a choice compatible with an inner product Φ_N such that the map M^{(N)} is an orthogonal projection for Φ_N. Formally this implies the identity (2.14) for any f. In addition this inner product should have good symmetry properties, in the sense of (2.15). The simplest way to ensure this is to look for a weight ρ_N such that Φ_N is the inner product of L^2(I, ρ_N(v) dv). In that case the entropy estimate (2.16) formally follows.

This is the strategy that we implement: find appropriate conditions on ρ_N and the m_i to obtain the correct structure, and then simply bound the corresponding quantities. For the first part of this strategy we actually prove

Theorem 2.2 Assume (2.5) and set E_N = L^2(I, ρ_N(v) dv). Then the map φ_N(f, g) = ∫_I f g ρ_N dv is an inner product on E_N, the Maxwellian M^{(N)} : E_N → E_N is an orthogonal projection for φ_N, and any solution f_N^ε of the problem (2.10) formally satisfies the estimate (2.16).

Error estimate
We now turn to the convergence of the method of moments to the solution. For simplicity we state the results here for the case d = 0 in (1.6), namely
Q f(t, x, v) = q(v) ∫_I f(t, x, w) dw.
We also assume that I ⊂ [−1, 1], still for simplicity. The convergence results of course require a stability estimate. However in themselves, they do not use the specific form of the closure. Therefore here we do not assume any specific closure relation. Instead we assume that we have a well defined method of moments, i.e., some way of computing moments µ_i which satisfy the truncated hierarchy (2.17). Moreover we assume that the corresponding method has good stability estimates in the sense of (2.18), where the constants C, γ may depend on T, q, λ but not on N or f^0. Note that Th. 2.1 can indeed be expected to imply such a result. The exponent γ will depend on the choice of the coefficients in our method. Of course Th. 2.1 controls only the L^2 norm of the µ_i. However as the method (1.11) is linear, the ∂_x^k µ_i are also a solution to the same system and an estimate like (2.18) can be derived. For a more detailed analysis of how to obtain (2.18) from Th. 2.1, we refer to Section 5 where it is performed when the λ_k are the Tchebychev points.
For any method that satisfies (2.18), one has the following convergence result.

Theorem 2.3 Assume (1.5) and f^0 ∈ H^k(R^2). Then, for all T ≥ 0 and for all N ≥ 1, we have an error estimate with a constant C ≥ 0 depending on T, λ, q and k but not on N.

This error estimate is a sort of interpolation between the stability bounds and the following result for C^∞ solutions to (2.17).

Proposition 2.4 Assume that the initial data is smooth. Then, for all T ≥ 0 and for all N ≥ 1, we have the corresponding estimate with a constant C ≥ 0 depending on T, λ and q.

Proof of Theorems 2.1 and 2.2
To simplify the presentation, we omit here the subscript N in E_N, φ_N, ρ_N, and the superscript (N) in M^{(N)}.

Elementary space decomposition
The difficulty is to combine the fact that M has to be an orthogonal projection for Φ with the symmetry property on Φ. We take here a slightly more general approach by not assuming directly that Φ satisfies (2.15). Consider a Maxwellian which has the form (2.12) and satisfies (2.13). Denote by V the range of M and by K its kernel, so that E = K ⊕ V. We start with a more detailed explanation of the sufficient conditions needed to obtain Th. 2.2.

For any solution f_N^ε = f_N^ε(t, x, v) of the problem (2.10), inequality (2.16) formally holds as soon as ∫_R φ(f, v ∂_x f) dx = 0. Therefore the symmetry condition (3.1) on φ is enough to conclude. If f is not smooth enough to follow the previous steps, one simply regularizes it by convolution in x. As the equation is linear and x is only a parameter in M and L, the regularized function solves the same equation. Therefore it satisfies (2.16) and, letting the regularizing parameter vanish, one recovers the same inequality for f. Now, take an inner product φ on E such that K = V^{⊥,φ}, i.e. K and V are orthogonal for this inner product. The symmetry condition (3.1) is then equivalent to the two conditions (3.2) and (3.3). We study each of those conditions in the following subsections.

Study of the condition (3.2)
Proposition 3.2 The condition (3.2) is equivalent to A^T Q = QA, where A is the matrix defined by (1.13) and Q is the symmetric positive definite matrix defined by (3.5).

Proof: The condition (3.2) can be rewritten in terms of the matrix Q of (3.5). Moreover, (2.13) gives a recursion on the m_i, with the convention m_{−1} = 0, and hence a corresponding relation on Q, with the convention Q_{−1,j} = 0. Therefore, (3.2) amounts to the matrix A^T Q being symmetric, i.e. A^T Q = QA, since Q is symmetric.

This result suggests a particular way of defining the inner product on V:
Q = (P^T)^{−1} P^{−1},   (3.6)
where P is defined by (2.1). This choice implies

Corollary 3.3 The matrix Q defined by (3.6) is symmetric positive definite and satisfies A^T Q = QA.

Proof: Denoting by D = diag(λ_0, . . . , λ_N), we have A = P D P^{−1}, thus
A^T Q = (P^{−1})^T D P^{−1} = Q A,
so Q satisfies A^T Q = QA. Moreover, it is easy to check that Q is symmetric, definite and positive.
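The algebra behind this choice can be checked directly: with P the Vandermonde eigenvector matrix of a companion-type A, the matrix Q = (P^T)^{−1} P^{−1} is symmetric positive definite and commutes with A in the sense A^T Q = QA. A small numerical sketch, again with hypothetical eigenvalues:

```python
# Check of the structure suggested by (3.6): Q = (P^T)^{-1} P^{-1} is
# symmetric positive definite and satisfies A^T Q = Q A, since
# A = P D P^{-1} with D = diag(lam_0, ..., lam_N).
import numpy as np

lam = np.array([-0.8, -0.2, 0.4, 0.9])  # hypothetical real distinct eigenvalues
n = len(lam)

monic = np.poly(lam)
a = -monic[1:][::-1]                    # closure row of the companion matrix
A = np.zeros((n, n))
A[np.arange(n - 1), np.arange(1, n)] = 1.0
A[-1, :] = a

Pinv = np.linalg.inv(np.vander(lam, n, increasing=True).T)
Q = Pinv.T @ Pinv                       # (P^T)^{-1} P^{-1}

commut = np.max(np.abs(A.T @ Q - Q @ A))  # should vanish
eigQ = np.linalg.eigvalsh(Q)              # should all be positive
```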
In the rest of the proof, we always choose Φ according to (3.6).

Study of the condition (3.3)
Now, we assume (3.6) to be satisfied and analyze (3.3).
Proposition 3.4 Assume (3.6), and consider the polynomials (T_i)_{0≤i≤N} defined by the explicit formula (3.7). Then, the condition (3.3) is equivalent to the two conditions (3.8) and (3.9).

The proof of this proposition is split into several lemmas. First we find two equivalent conditions to (3.3).

Lemma 3.5 The condition (3.3) is equivalent to the two conditions (3.10) and (3.11).

We then have to study conditions (3.10) and (3.11).

Proof of Lemma 3.5: The condition (3.3) can be rewritten using the fact that ∫_I w^n f dw = 0 for all n ≤ N whenever f ∈ K. Thus, the condition (3.3) amounts to a family of relations for any 0 ≤ i ≤ N and f ∈ K, from which we deduce (3.11) and, for all i ∈ {1, . . . , N}, the relation (3.10). Conversely, the conditions (3.10)-(3.11) imply (3.3) just by following the previous steps in reverse order.

We start with condition (3.10).

Lemma 3.6 Consider the polynomial D(X) = ∏_{k=0}^{N} (X − λ_k). The condition (3.10) is equivalent to (3.8), where (T_i)_{0≤i≤N} are the polynomials given by the recursive formula (3.12), with T_N(X) = D(X).

Proof: The recursive formula on the m_i translates exactly into the recursion (3.12) with T_N(X) = D(X). Conversely, if the functions (m_i)_{0≤i≤N} satisfy these relations, then (3.10) obviously holds for almost every v ∈ I (except at the roots of the polynomial D).

We can now prove

Lemma 3.8 The explicit formula (3.7) holds for the polynomials (T_i)_{0≤i≤N}.

Proof: First, according to the definition of D in Lemma 3.6, we have D(λ_k) = 0 for all k ∈ {0, . . . , N}. Then, the recursive formula (3.12) easily gives the values of the T_i at the points λ_k. Since each polynomial T_i has degree equal to N, it is enough to check the equality (3.7) on the set {λ_0, . . . , λ_N}, which allows us to conclude.

Remark 3.9
The formula (3.7) shows that the polynomials (T_i)_{0≤i≤N} form a basis of R_N[X], since the matrix Q is invertible.
We now have everything needed to prove Prop. 3.4.

Proof of Prop. 3.4: By Lemma 3.5, condition (3.3) is equivalent to (3.10)-(3.11). By Lemma 3.6, condition (3.10) is equivalent to (3.8), provided that the T_i are defined by (3.12). Lemma 3.8 shows that the recursive formula (3.12) actually gives the explicit formula (3.7). Finally, by Lemma 3.10, condition (3.11) is equivalent to (3.9), thus concluding the proof.
Proof of Prop. 3.11: This choice is compatible with (3.11), since (3.11) can be rewritten in exactly this form. Moreover, (3.4) is obviously satisfied with this choice.

Choice of φ on the subspace V
For the moment Φ is defined as a weighted L 2 type inner product on K by Prop. 3.11 and on V by Corollary 3.3.
We wish to define Φ as a weighted inner product on the whole K ⊕ V. The following lemma shows that, provided ρ satisfies the right relations, Φ as defined by Prop. 3.11 and Corollary 3.3 is automatically of the right form.

Proof of Theorem 2.2: Synthesis
We summarize here all the definitions and check rigorously that they are compatible.
So, assume there exists a function ρ = ρ(v), positive on I and satisfying the moment conditions (3.18). Note that this in particular implies that 1/ρ is integrable on I.
The space E = L^2(I, ρ(v) dv) is a Hilbert space for the inner product
φ(f, g) = ∫_I f(v) g(v) ρ(v) dv.
We define the map M : E → E by the formula (2.12) and set Q = (P^T)^{−1} P^{−1} (the matrix P is defined by (2.1)).
• First, the map M is well defined: for all f ∈ E and all i ∈ {0, . . . , N}, the moments µ_i^f are finite by the Cauchy-Schwarz inequality, using the integrability of 1/ρ, and thus Mf makes sense; moreover Mf ∈ E for f ∈ E.
• It is obvious that M is linear. We check that M is a projection: for f ∈ E, the moments of Mf coincide with those of f for 0 ≤ j ≤ N, and we deduce that M(Mf) = Mf.
• M is an orthogonal projection (for the inner product φ) because it is a self-adjoint projector, as the matrix Q is symmetric.
• The Maxwellian M satisfies the moment conditions (2.13): the first set of conditions has already been established, and the second one results from (3.18), together with the formula (3.19).

We can then control the moments of f_N^ε: the assumption (2.14) yields a bound on each moment in terms of φ(f_N^ε, f_N^ε). Combining the resulting inequalities and using the Cauchy-Schwarz inequality, we obtain the desired estimate.
Combining Th. 2.2 and Lemma 3.13, we can obtain stability estimates for (2.10) uniform in ε:

Proposition 3.14 Let f_N^ε be a solution of (2.10) with initial data satisfying (1.5). Then, the estimate (3.20) holds for any t > 0.

Proof: It is straightforward using the Gronwall lemma since, according to (2.16) and (3.19), the time derivative of the natural norm is controlled.

We are now ready to conclude the proof of Th. 2.1: we show that we can pass to the limit ε → 0 in the BGK model (2.10) to obtain a stability estimate on the hyperbolic system (1.12).
We fix N ≥ d. Using (3.20), we can see that the family (f_N^ε)_{ε>0} is bounded uniformly in ε. Therefore, up to an extraction, it converges weakly to some f_N. Hence, passing to the limit in (2.11), we show that the function f_N is a kinetic interpretation of the system (1.12), in the sense that (µ_i = ∫_I v^i f_N dv)_{0≤i≤N} satisfies (1.12). As this system is linear and hyperbolic, it has a unique solution for a given initial data. Any solution of (1.12) can consequently be obtained as the moments of a limit f_N of f_N^ε. Moreover, f_N also satisfies the bound (3.20), and thus we obtain the estimate (2.6).

Proof of Prop. 4.1: If f is a solution of (1.4) with d = 0 then, using the Stokes formula, we have, formally, a differential inequality on ‖f(t)‖_{L^2}. Integrating in time, we obtain the bound (4.1), where C = ‖q‖_{L^2(I)} − λ.
Proof of Prop. 4.2: First note that for any k, ∂_x^k f is also a solution to (1.4) with ∂_x^k f^0 as initial data. Then, using Prop. 4.1 and the Hölder inequality, we easily obtain a differential inequality, and a simple Gronwall lemma gives the result. We may finally conclude from Props. 4.1-4.2 the corresponding bound with C = ‖q‖_{L^2(I)} − λ.
Proof of Corollary 4.3: Notice the elementary bound on the moments in terms of f, and conclude by using Prop. 4.2.

Proof of Prop. 2.4: Error estimate in the smooth case
Since the functions (µ_i) are smooth in the space variable, we have by (2.17) a first differential inequality. Of course the same computation can be performed on the ∂_x^k µ_i, obtaining, for all k ∈ N and all i ∈ {0, . . . , N}, the inequalities (4.2).
This sequence of inequalities lets us control ‖µ_0(t)‖_{L^2(R)} in terms of the "last derivatives", namely (‖∂_x^N µ_i‖_{L^2(R)})_{0≤i≤N}. To do that, we introduce suitable coefficients γ_i, easily bounded by ‖q‖_{L^2}. Hence the inequalities (4.2) imply the assumptions of the following lemma.

Lemma Let (H_k(t))_{k≥1} be a sequence of nonnegative C^1 functions such that each H_k' is controlled by H_k and H_{k+1}, where C > 0 is a numerical constant independent of k. Then the H_k can be estimated recursively.

Proof: First, the assumption may be rewritten as a differential inequality for e^{−Ct} H_k; thus, integrating in t, we control H_k by the time integral of H_{k+1}. A simple recursion allows us to conclude.
End of the proof of Prop. 2.4: Applying the previous lemma, we obtain the desired control and thus, for all t ∈ [0, T], the announced bound, where the constant C depends on T, λ and q. This concludes the proof of Prop. 2.4.

Proof of Th. 2.3: Error estimate in the general case
The general idea of the proof is to regularize the initial data. Then we have to bound three terms. First, the error between the exact solution and the truncated hierarchy for the regularized initial data; this term is controlled by Prop. 2.4. The next term is the difference between the solution for the non-regularized initial data and the solution for the regularized one, both solutions to the exact equation (1.4); this is bounded using the estimates for (1.4). The final term is the difference between the solution for the non-regularized initial data and the solution for the regularized one, but now both solutions to the truncated hierarchy (2.17); we bound this term thanks to assumption (2.18).

• Step 1: Regularization of the initial data. We fix ε > 0 and choose f_ε^0 ∈ H^k(R^2), with supp_v f_ε^0 ⊂ I, satisfying the approximation bounds (4.5) (see the appendix for more details). Let f_ε = f_ε(t, x, v) be the unique solution to Eq. (1.4) with f_ε(t = 0) = f_ε^0. We define the moments µ_i^{f_ε} as usual; of course, those moments still satisfy the hierarchy. We denote by µ_i^ε the solution to the truncated hierarchy (2.17) for the initial data (4.6).

• Step 2: Error estimates in terms of ε. First note that µ_i^ε − µ_i^{f_ε} satisfies the assumptions of Prop. 2.4. On the one hand we have the bound from Corollary 4.3; on the other hand, applying (2.18) to µ_i^ε and 0 gives a second bound. So applying Prop. 2.4, we obtain, for γ' = max(γ, 1/2), an estimate in terms of ε and N, by (4.5).
We can use the stability estimate (2.18) to control the last term, (4.8), again by (4.5).
We of course choose the "best" value of ε, the one which minimizes the error. When N > k and γ ≥ 1/2, this gives an explicit choice of ε in terms of N; if γ < 1/2 one takes instead a slightly different value, with the same asymptotic behaviour.
In both cases, we obtain the announced rate, which concludes the proof of Th. 2.3.

Proof of Theorem 1.1

In fact, the function 1/ρ_N is a normalization of the Tchebychev weight. As the λ_k are the usual points of integration, we have

Proposition 5.1 For all N ≥ 1 and for all R ∈ R_{2N+1}[X]:
∫_{−1}^{1} R(v) / ( π √(1 − v^2) ) dv = 1/(N + 1) Σ_{k=0}^{N} R(λ_k).

We may therefore apply Th. 2.1 to this choice of λ_k and this choice of ρ_N; the remaining computations, giving the constants in (2.18), are similar.
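Proposition 5.1 is the classical Gauss-Tchebychev quadrature rule, exact for polynomials of degree up to 2N + 1. The sketch below tests it on a random polynomial, comparing against a fine discretization of ∫_0^π R(cos t) dt, which equals ∫_{−1}^{1} R(v)/√(1 − v²) dv after the substitution v = cos t:

```python
# Gauss-Tchebychev quadrature: for deg R <= 2N + 1,
#   (1/pi) int_{-1}^{1} R(v)/sqrt(1 - v^2) dv = 1/(N+1) sum_k R(lambda_k).
import numpy as np

N = 5
k = np.arange(N + 1)
lam = np.cos((2 * k + 1) * np.pi / (2 * N + 2))   # Tchebychev points

rng = np.random.default_rng(0)
R = rng.normal(size=2 * N + 2)                    # random coefficients, degree 2N+1

quad = np.polyval(R, lam).sum() / (N + 1)

# reference: substitute v = cos t and use a fine trapezoid rule on [0, pi]
t = np.linspace(0.0, np.pi, 200001)
y = np.polyval(R, np.cos(t))
ref = np.sum((y[1:] + y[:-1]) / 2.0) * (t[1] - t[0]) / np.pi
err = abs(quad - ref)
```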

Appendix
The natural way to regularize is by convolution. However, to obtain a high order approximation, it is necessary to choose the mollifier correctly. In the L^2 framework though, things are quite simple by truncating in Fourier.

Proposition 6.1 Let k be a positive integer, f ∈ H^k(R^d) and ε > 0. There exists f_ε ∈ ∩_m H^m(R^d) such that
‖f − f_ε‖_{L^2} ≤ ε^k ‖f‖_{H^k},   (6.1)
and, for all m ≥ k,
‖f_ε‖_{H^m} ≤ C ε^{k−m} ‖f‖_{H^k}.   (6.2)

Proof: Define f_ε by keeping only the low frequencies, f̂_ε(ξ) = f̂(ξ) 1_{|ξ| ≤ 1/ε}. First, we have
‖f − f_ε‖_{L^2}^2 = ∫_{|ξ| > 1/ε} |f̂(ξ)|^2 dξ ≤ ε^{2k} ∫ |ξ|^{2k} |f̂(ξ)|^2 dξ ≤ ε^{2k} ‖f‖_{H^k}^2,
which proves (6.1). On the other hand, for all m ≥ k, (1 + |ξ|^2)^m ≤ (1 + ε^{−2})^{m−k} (1 + |ξ|^2)^k on the support of f̂_ε; thus f_ε ∈ H^m(R^d) and the estimate (6.2) holds.
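The regularization of Proposition 6.1 can be mimicked on a periodic grid with the FFT: truncating the Fourier series at frequency 1/ε leaves an error governed by the smoothness of f. A discrete sketch (the test function and cutoffs are arbitrary choices, for illustration only):

```python
# Discrete analogue of the appendix regularization: keep the Fourier modes
# |xi| <= 1/eps. The test function |sin x|^3 has finite smoothness, so the
# truncation error shrinks at a fixed algebraic rate as eps decreases.
import numpy as np

n = 1024
x = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
f = np.abs(np.sin(x)) ** 3

fh = np.fft.fft(f)
xi = np.fft.fftfreq(n, d=1.0 / n)               # integer frequencies

def trunc_error(eps):
    """Discrete L^2 error of the low-pass projection keeping |xi| <= 1/eps."""
    g = np.fft.ifft(np.where(np.abs(xi) <= 1.0 / eps, fh, 0.0)).real
    return np.sqrt(np.mean((f - g) ** 2))

err_fine = trunc_error(0.05)                     # cutoff frequency 20
err_coarse = trunc_error(0.25)                   # cutoff frequency 4
```

The finer truncation keeps more modes and is strictly more accurate, mirroring the ε^k gain of (6.1).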