A moment approach for entropy solutions of parameter-dependent hyperbolic conservation laws

We propose a numerical method to solve parameter-dependent hyperbolic partial differential equations (PDEs) with a moment approach, based on previous work by Marx et al. (2020). This approach relies on a very weak notion of solution of nonlinear equations, namely parametric entropy measure-valued (MV) solutions, which satisfy linear equations in the space of Borel measures. The infinite-dimensional linear problem is approximated by a hierarchy of convex, finite-dimensional, semidefinite programming problems, called Lasserre's hierarchy. This gives us a sequence of approximations of the moments of the occupation measure associated with the parametric entropy MV solution, which is proved to converge. Finally, several post-treatments can be performed from this approximate moment sequence. In particular, the graph of the solution can be reconstructed from an optimization of the Christoffel-Darboux kernel associated with the approximate measure, a powerful approximation tool able to capture a large class of irregular functions. Also, for uncertainty quantification problems, several quantities of interest can be estimated, sometimes directly, such as the expectation of smooth functionals of the solutions. The performance of our approach is evaluated through numerical experiments on the inviscid Burgers equation with parametrised initial conditions or parametrised flux function.


Introduction
Non-linear hyperbolic conservation laws model numerous physical phenomena in fluid mechanics, traffic flow or non-linear acoustics [1,2]. The numerical computation of such equations is often a challenge since their solutions may present discontinuities, even if the initial data are smooth. Numerous numerical methods exist to approximate them, amongst which we may cite finite volume or finite difference schemes [3] or the front-tracking method [4]. In this paper, we are interested in solving parameter-dependent hyperbolic conservation laws, which are considered for various tasks in data assimilation [5], uncertainty quantification [6][7][8][9][10], sensitivity analysis [11], or error analysis [12]. The parameters in our context appear in the initial data and in the flux function and are associated with a probability measure. The computation of approximate solutions for many instances of the parameters is usually prohibitive and requires reduced order models.
Model order reduction methods aim at providing an approximation of the solution u(z, ξ), depending on physical variables z and parameters ξ, that can be efficiently evaluated. They rely either on an explicit approximation of the solution map ξ → u(•, ξ) or on an approximation of the solution manifold {u(•, ξ) : ξ ∈ Ξ} by some dimension reduction method. The main challenge for models driven by conservation laws is that the solution maps and solution manifolds are highly nonlinear (in particular due to the presence of discontinuities), which requires the introduction of nonlinear approximation or dimension reduction methods. Several model reduction methods based on compositions have been proposed, including methods based on parameter-dependent changes of variables [13,14] and deep learning methods using neural networks [15]. These methods usually require high computational resources and large amounts of training data for the approximation of highly nonlinear solution maps.
Here, we follow a different approach and propose a new surrogate modelling method. It is an extension of [16] to parameter-dependent or random conservation laws. Whereas it is classical to seek entropy weak solutions to hyperbolic conservation laws [1,17], we are rather interested in so-called entropy measure-valued (MV) solutions, an even weaker notion of solution, introduced by DiPerna in [18,19]. To an MV solution corresponds an occupation measure, whose marginal is the MV solution. Even though this notion of solution is very weak, there is a correspondence with entropy weak solutions: the measure concentrated on the graph of the entropy weak solution is an MV solution. It is worth noting that the formulation in the setting of MV solutions leads to a linear problem, so that some efficient tools from convex analysis can be applied.
We start with a theoretical framework for parameter-dependent conservation laws similar to the one of [20,21]. However, in our case, we introduce a weak-parametric formulation of the problem, where the classical entropy weak formulation is also integrated with respect to the parameter. The purpose of this formulation is to obtain an equivalent definition of parameter-dependent entropy MV solutions using the moments of the associated occupation measure with respect to all the variables, including the parameters. Under the assumption that the flux function is polynomial and that the initial data can be described by semi-algebraic functions, the entropy formulation becomes a set of linear constraints on the moments of the occupation measure and we can follow the procedure initiated in [16]. Indeed, this allows us to consider the problem as a generalized moment problem (GMP), an infinite-dimensional optimization problem over sequences of moments of measures, where both the cost and the constraints are linear with respect to the moments of the measures. Powerful results from real algebraic geometry allow us to reformulate the constraint that a sequence is a moment sequence into tractable semi-definite constraints. This problem is then solved using Lasserre's (moment sum-of-squares) hierarchy [22], which consists in solving a sequence of convex semi-definite programs of increasing size to approximate the moments of the occupation measure. Note that the use of Lasserre's hierarchy for solving PDEs has also recently been considered in [23], although with a different approach where the considered measure is defined on an infinite-dimensional function space and assumed to be concentrated on the solution of the PDE.
Obtaining an approximation of the moments can be costly, but once this offline computation is performed, efficient online post-treatments are possible. First, we can naturally obtain expectations of variables of interest that are functions of the moments of the solution. Also, the graph of the entropy weak solution (for any parameter value) can be recovered using a localization property of the Christoffel-Darboux kernel of the approximate occupation measure, following the methodology proposed in [24]. This powerful approximation method makes it possible to capture discontinuities in the solutions efficiently. Using the moment completion technique from [25], one can also gain access to other quantities of interest, such as statistical moments of point-wise evaluations of the solution.

Outline
This paper is organized as follows. We first introduce some notations and the problem considered. Section 2 introduces different notions of solutions for parametrised scalar conservation laws and examines some links between these notions. Section 3 introduces the moment-SOS hierarchy and indicates how to perform several post-treatments such as retrieving the graph of the solution or estimating statistical moments of the solution. Finally, Section 4 presents some numerical experiments.

Notations
For $X \subset \mathbb{R}^n$, with $n \in \mathbb{N}$, let $C(X)$, $C_0(X)$ and $C^1_c(X)$ denote the spaces of functions on $X$ that are continuous, continuous and vanishing at infinity, and continuously differentiable with compact support, respectively. The sets of signed Borel measures and positive Borel measures are denoted $\mathcal{M}(X)$ and $\mathcal{M}(X)^+$, respectively. The set of probability measures on $X$ is denoted by $\mathcal{P}(X)$. The measure $\lambda_X \in \mathcal{M}(X)^+$ denotes the Lebesgue measure on $X$, and for $B \subset X$ a Borel set, $|B|$ denotes its Lebesgue measure. Given a vector $w = (w_1, \ldots, w_n)$, we denote by $\mathbb{R}[w]$ the ring of real multivariate polynomials in the variables $w_1, \ldots, w_n$, and for a multi-index $\alpha = (\alpha_1, \ldots, \alpha_n)$, $w^\alpha := w_1^{\alpha_1} \cdots w_n^{\alpha_n}$. Given a positive Borel measure $\mu$, we denote by $\operatorname{supp}(\mu)$ its support, defined as the smallest closed set whose complement has $\mu$-measure zero.
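As a quick illustration of the multi-index notation $w^\alpha$ (a toy Python snippet, not part of the method; the helper name is ours):

```python
from math import prod

def monomial(w, alpha):
    """Evaluate w^alpha = w_1^{a_1} * ... * w_n^{a_n}."""
    return prod(wi ** ai for wi, ai in zip(w, alpha))

# For w = (2, 3) and alpha = (1, 2): 2^1 * 3^2 = 18.
value = monomial((2, 3), (1, 2))
```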

Definition of the problem
We consider parameter-dependent scalar hyperbolic conservation laws formulated as the Cauchy problem
$$\partial_t u(t,x,\xi) + \nabla_x \cdot f(u(t,x,\xi),\xi) = 0, \qquad (t,x) \in \mathbb{R}_+ \times \mathbb{R}^n, \tag{1a}$$
$$u(0,x,\xi) = u_0(x,\xi), \qquad x \in \mathbb{R}^n, \tag{1b}$$
where $t \in \mathbb{R}_+$ is the time variable, $x \in \mathbb{R}^n$ is the space variable, and $\xi$ is a parameter in a parameter set $\Xi \subset \mathbb{R}^p$, $p \in \mathbb{N}$. The data are the flux $f : \mathbb{R} \times \Xi \to \mathbb{R}^n$ and the initial condition $u_0 : \mathbb{R}^n \times \Xi \to \mathbb{R}$.
The parameter set $\Xi$ is equipped with a probability measure $\rho$. We assume that $\Xi$ is compact and that the initial condition satisfies $u_0 \in L^\infty(\Xi; L^\infty(\mathbb{R}^n))$. Moreover, we shall make the following assumption on $f$.

Assumption 1.1. For all $K \subset \mathbb{R}$ compact, there exists $C_K \in \mathbb{R}$ such that for all $u \in K$ and $\rho$-a.e. $\xi$, $\|f(u,\xi)\| \le C_K$.

This assumption is satisfied for a polynomial function $f$, which will be assumed in the next section. Thus, for the sake of simplicity, we restrict our analysis to this setting. A more general setting can be found in [21].
Note that the initial data $u_0$ and the flux $f$ may depend on distinct parameters, but for the sake of clarity, and without loss of generality, we indicate a dependence on the same set of parameters $\xi$.

Notions of parameter-dependent solution

2.1 Parametric entropy solution
We start by introducing the notion of parametric entropy weak solution, which is defined point-wise in the parameter domain. This may be considered as a strong-parametric solution, that is, the straightforward notion of solution when a parameter is considered; see, e.g., [20] or [21].

Definition 2.1 (Entropy pairs). Let $\eta$ be a locally Lipschitz and convex function from $\mathbb{R}$ to $\mathbb{R}$. Let $q : \mathbb{R} \times \Xi \to \mathbb{R}^n$ be such that $\partial_u q(u, \xi) = \eta'(u)\, \partial_u f(u, \xi)$ for $\rho$-almost all $\xi$ and almost all $u \in \mathbb{R}$. Then $(\eta, q)$ is called an entropy pair associated with the conservation law (1a).
We may notice that for an entropy pair (η, q), for ρ-almost all ξ, q(•, ξ) is a locally Lipschitz function.
We now introduce three specific families of entropy pairs, each of them serving a particular theoretical or numerical purpose.

Definition 2.2 ($C^1$ family of entropy pairs). The $C^1$ family of entropy pairs, denoted $E_C$, is defined as the set of entropy pairs $(\eta, q)$ such that $\eta \in C^1(\mathbb{R})$ and, for $\rho$-almost all $\xi$, $q(\cdot, \xi) \in C^1(\mathbb{R})$.
Note that under Assumption 1.1, if $(\eta, q)$ is an entropy pair with $\eta \in C^1$, then $(\eta, q) \in E_C$. The $C^1$ family of entropy pairs is related to the (opposite of the) thermodynamic entropy and to the second law of thermodynamics for fluid dynamics models; the conservation law (1a), for a fixed $\xi$, can be seen as a simplification of such models.

Definition 2.3 (Kruzhkov family of entropy pairs). The Kruzhkov family of entropy pairs is defined by $E_K := \{(\eta_v, q_v) : v \in \mathbb{R}\}$, where, for all $v \in \mathbb{R}$, all $u \in \mathbb{R}$ and $\rho$-almost all $\xi$,
$$\eta_v(u) := |u - v|, \qquad q_v(u, \xi) := \operatorname{sign}(u - v)\,\big(f(u, \xi) - f(v, \xi)\big).$$
Compared to $E_C$, the family $E_K$ has the advantage of being explicitly described and of carrying strong results coming from Kruzhkov's fundamental paper [17], allowing one to obtain theoretical results such as uniqueness and stability.

Definition 2.4 (Polynomial family of entropy pairs). The polynomial family of entropy pairs $E_P$ is defined as the set of entropy pairs $(\eta, q)$ such that $\eta$ is a polynomial function. If $f$ is a polynomial function, then, for $\rho$-almost all $\xi$, $q(\cdot, \xi)$ is also a polynomial function.
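For instance, for the Burgers flux $f(u) = u^2/2$ used in the experiments of Section 4 and the polynomial entropy $\eta(u) = u^2$, the entropy flux is obtained by integrating $q' = \eta' f'$, giving $q(u) = \tfrac{2}{3} u^3$. A minimal sketch with hand-rolled coefficient-list polynomials (the helper names are ours, not the paper's):

```python
def deriv(p):
    """Derivative of a polynomial given by coefficients [c0, c1, ...] (c_k u^k)."""
    return [k * p[k] for k in range(1, len(p))]

def mul(p, q):
    """Product of two coefficient-list polynomials."""
    r = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, c in enumerate(q):
            r[i + j] += a * c
    return r

def antideriv(p):
    """Antiderivative with value 0 at 0."""
    return [0.0] + [c / (k + 1) for k, c in enumerate(p)]

f = [0.0, 0.0, 0.5]    # Burgers flux f(u) = u^2 / 2
eta = [0.0, 0.0, 1.0]  # polynomial entropy eta(u) = u^2

# Entropy flux from q'(u) = eta'(u) f'(u); here q(u) = 2 u^3 / 3.
q = antideriv(mul(deriv(eta), deriv(f)))
```

The same routine applies to any entropy pair in $E_P$ once the flux is polynomial.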
In the case of a uniformly convex flux function, a single polynomial entropy can be sufficient to select the relevant solution (see e.g. [26,27]). Our motivation here is different: in numerical experiments, we shall use subsets of the polynomial family of entropy pairs, since the moment-SOS (Lasserre's) hierarchy, presented later in this paper, relies on a polynomial setting. Although it is possible to implement our numerical method with $E_K$ as in [16], it is easier to do so with subsets of $E_P$ when possible.

Definition 2.5 (Parametric entropy solution). Consider a family of entropy pairs $E$. A function $u$, such that $\xi \mapsto u(\cdot, \cdot, \xi)$ is Bochner measurable, is a parametric entropy solution for $E$ if, for all $(\eta, q) \in E$, all non-negative test functions $\psi \in C^1_c(\mathbb{R}_+ \times \mathbb{R}^n)$ and $\rho$-almost all $\xi$, it satisfies
$$\int_{\mathbb{R}_+} \int_{\mathbb{R}^n} \big( \eta(u(t,x,\xi))\, \partial_t \psi(t,x) + q(u(t,x,\xi), \xi) \cdot \nabla_x \psi(t,x) \big)\, dx\, dt + \int_{\mathbb{R}^n} \eta(u_0(x,\xi))\, \psi(0,x)\, dx \ge 0. \tag{2}$$

From [28, Theorem 5.2], for $\rho$-almost all $\xi$, there exists a unique entropy solution $u(\cdot, \cdot, \xi)$ of the problem with fixed parameter $\xi$; we refer to [20, Theorem 3.3], which provides measurability properties of $u$ under additional assumptions on $u_0$.

Remark 2.8. The entropy solution satisfies the maximum principle: for $\rho$-almost all $\xi$ and almost all $(t, x)$,
$$\operatorname*{ess\,inf}_{\mathbb{R}^n \times \Xi} u_0 \;\le\; u(t,x,\xi) \;\le\; \operatorname*{ess\,sup}_{\mathbb{R}^n \times \Xi} u_0, \tag{3}$$
so that $u$ is essentially bounded.

Remark 2.9. We might hope that imposing $u_0 \in L^\infty(\Xi; L^\infty(\mathbb{R}^n))$ is sufficient for $\xi \mapsto u(\cdot, \cdot, \xi)$ to be Bochner measurable, but this has not been proved yet.

Weak-parametric entropy solutions
The next notion of solution is weaker. While the parametric entropy solution adopts a pointwise point of view in the parameter domain, the following notion of solution is deduced by integration over the parameter domain.

Definition 2.10 (Weak-parametric entropy solution). Consider a family of entropy pairs $E$. A function $u$, such that $\xi \mapsto u(\cdot, \cdot, \xi)$ is Bochner measurable, is called a weak-parametric entropy solution for $E$ if, for all $(\eta, q) \in E$ and all non-negative test functions $\phi \in C(\Xi; C^1_c(\mathbb{R}_+ \times \mathbb{R}^n))$, it satisfies
$$\int_\Xi \int_{\mathbb{R}_+} \int_{\mathbb{R}^n} \big( \eta(u)\, \partial_t \phi + q(u, \xi) \cdot \nabla_x \phi \big)\, dx\, dt\, d\rho(\xi) + \int_\Xi \int_{\mathbb{R}^n} \eta(u_0(x,\xi))\, \phi(0,x,\xi)\, dx\, d\rho(\xi) \ge 0. \tag{4}$$

This is at first glance a weaker notion of solution, but we shall see that, under certain assumptions, the notions of parametric entropy solution and weak-parametric entropy solution coincide.

Theorem 2.11. Assume that $u_0 \in L^\infty(\Xi; L^\infty(\mathbb{R}^n))$ and $f$ satisfies Assumption 1.1. A function $u$, such that $\xi \mapsto u(\cdot, \cdot, \xi)$ is Bochner measurable, is a parametric entropy solution for $E_K$ if and only if it is a weak-parametric entropy solution for $E_K$.
Proof. Let $u$ be a parametric entropy solution for $E_K$, and let $v \in \mathbb{R}$ and $\phi \in C(\Xi; C^1_c(\mathbb{R}_+ \times \mathbb{R}^n))$. For $\rho$-almost all $\xi$, $u$ satisfies (2). Let us integrate equation (2) over $\Xi$. First, let us consider the terms where $\eta_v$ appears. From Remark 2.8, $u$ is essentially bounded. Since $\eta_v$ is continuous, $\eta_v \circ u$ is also essentially bounded. Since $\phi$ and its derivatives are continuous in $\xi$ and $\Xi$ is a compact set, the terms where $\eta_v$ appears are integrable in $\xi$. Recalling the definition of $q_v$ in Definition 2.3, we have that, for all $y \in \mathbb{R}$ and $\rho$-almost all $\xi$, $\|q_v(y, \xi)\| \le \|f(y, \xi)\| + \|f(v, \xi)\|$. From Assumption 1.1, and since $u$ is essentially bounded, there exist $C, C_v \in \mathbb{R}$ such that $\|q_v(u, \xi)\| \le C + C_v$, $\rho$-almost everywhere. Thus, $\xi \mapsto \int_{\mathbb{R}_+} \int_{\mathbb{R}^n} \nabla_x \phi \cdot q_v(u, \xi)\, dx\, dt$ is integrable, and integrating over $\Xi$ yields equation (4).

Measure-valued solutions
Following DiPerna [18], the previous notions of solution are extended to the weaker setting of measure-valued solutions thanks to the notion of Young measure.

Definition 2.12 (Young measure). A Young measure on a Euclidean space $X$ is a map $\mu : x \in X \mapsto \mu_x \in \mathcal{P}(\mathbb{R})$ such that, for all $g \in C_0(\mathbb{R})$, the function $x \mapsto \langle \mu_x, g \rangle$ is measurable.

From this, we can seek an even weaker notion of solution, namely a Young measure $\mu_{(t,x,\xi)}$ which satisfies the following Cauchy problem:
$$\partial_t \langle \mu_{(t,x,\xi)}, \mathrm{id} \rangle + \nabla_x \cdot \langle \mu_{(t,x,\xi)}, f(\cdot, \xi) \rangle = 0, \tag{5a}$$
$$\mu_{(0,x,\xi)} = \sigma_{0,(x,\xi)}, \tag{5b}$$
where $\langle \cdot, \cdot \rangle$ denotes the integration of a (vector-valued) function $g \in C(\mathbb{R}; \mathbb{R}^k)$ against a measure $\mu \in \mathcal{M}(\mathbb{R})$, defined by
$$\langle \mu, g \rangle := \int_{\mathbb{R}} g(y)\, d\mu(y),$$
while $\sigma_0 = \delta_{u_0}$. Equation (5a) has to be understood in a weak entropy sense, as explained in the following.

Definition 2.13 (Parametric entropy measure-valued (MV) solution). Consider a family of entropy pairs $E$. Let $\sigma_0$ be a Young measure on $\mathbb{R}^n \times \Xi$, and let $f$ satisfy Assumption 1.1. A Young measure $\mu$ is a parametric entropy MV solution to (5) for $E$ if, for all $(\eta, q) \in E$ and all non-negative test functions $\phi \in C(\Xi; C^1_c(\mathbb{R}_+ \times \mathbb{R}^n))$, it satisfies
$$\int_\Xi \int_{\mathbb{R}_+} \int_{\mathbb{R}^n} \big( \langle \mu_{(t,x,\xi)}, \eta \rangle\, \partial_t \phi + \langle \mu_{(t,x,\xi)}, q(\cdot, \xi) \rangle \cdot \nabla_x \phi \big)\, dx\, dt\, d\rho(\xi) + \int_\Xi \int_{\mathbb{R}^n} \langle \sigma_{0,(x,\xi)}, \eta \rangle\, \phi(0,x,\xi)\, dx\, d\rho(\xi) \ge 0. \tag{6}$$

With the injection $u \mapsto \delta_u$, we notice that, under the condition $\sigma_0 = \delta_{u_0}$, weak-parametric entropy solutions are parametric entropy MV solutions, but without further assumptions, parametric entropy MV solutions are not necessarily weak-parametric entropy solutions. However, the following result shows that the parametric entropy MV solution can be concentrated on the graph of the weak-parametric entropy solution.
Theorem 2.14. Let $u_0 \in L^\infty(\mathbb{R}^n \times \Xi)$ and $f$ satisfy Assumption 1.1. Let $u$ be the unique weak-parametric entropy solution for $E_K$ and $\mu$ be a parametric entropy MV solution for $E_K$ with $\sigma_0 = \delta_{u_0}$. Then, for almost all $(t, x)$ and $\rho$-almost all $\xi$, $\mu_{(t,x,\xi)} = \delta_{u(t,x,\xi)}$.

Proof. First, we may note that, from the same arguments as those presented in the second part of the proof of Theorem 2.11, a parametric entropy MV solution $\mu$ for $E_K$ satisfies the following inequality: for all $v \in \mathbb{R}$, all non-negative test functions $\psi \in C^1_c(\mathbb{R}_+ \times \mathbb{R}^n)$ and $\rho$-almost all $\xi$,
$$\int_{\mathbb{R}_+} \int_{\mathbb{R}^n} \big( \langle \mu_{(t,x,\xi)}, \eta_v \rangle\, \partial_t \psi + \langle \mu_{(t,x,\xi)}, q_v(\cdot, \xi) \rangle \cdot \nabla_x \psi \big)\, dx\, dt + \int_{\mathbb{R}^n} \eta_v(u_0(x,\xi))\, \psi(0,x)\, dx \ge 0.$$
Then, for $\rho$-almost all $\xi$, $\mu(\cdot, \cdot, \xi)$ is an entropy MV solution of the initial problem with a fixed parameter $\xi$, that is, a parameter-independent problem studied in [16]. Then, from [16, Theorem 1] and [18], we have that, $\rho$-almost everywhere, if $\sigma_{0,(\cdot,\xi)} = \delta_{u_0(\cdot,\xi)}$, then $\mu_{(t,x,\xi)} = \delta_{u(t,x,\xi)}$ for almost all $(t, x)$, with $u(\cdot, \cdot, \xi)$ the weak-parametric entropy solution.

Restrictions to compact hypercubes
In order to extend the strategy developed in [16], it is mandatory to work on compact sets. Whereas introducing compact domains in time and in the parameter set is trivial, the restriction to bounded space domains has to be done carefully. To simplify the setting and to avoid the problem of introducing boundary conditions to conservation laws (see for instance [19,30,31]), we assume that the solution has no interaction with the boundary, i.e. the solution is known on the boundary of the spatial domain at any time. Let
$$\mathbf{T} := [0, T], \qquad \mathbf{X} := \prod_{i=1}^n [L_i, R_i], \qquad \Xi$$
be the respective domains of the time $t$, space variable $x$ and parameter $\xi$, for fixed (but arbitrary) constants $T$, $(L_i)_{i=1}^n$ and $(R_i)_{i=1}^n$. The absence of interaction with the boundary is translated as follows: the initial data $u_0$ that we consider are the restrictions to $\mathbf{X} \times \Xi$ of initial data defined on $\mathbb{R}^n \times \Xi$ such that, considering the associated weak-parametric entropy solution $u$, there exists $\epsilon > 0$ such that, in $\partial\mathbf{X}_\epsilon := (\partial\mathbf{X} + B(0, \epsilon)) \cap \mathbf{X}$, for all $t \in \mathbf{T}$ and $\rho$-almost all $\xi$, $u(t, \cdot, \xi) = u_0(\cdot, \xi)$, i.e. the weak-parametric entropy solution is stationary on $\partial\mathbf{X}_\epsilon$. This framework is the one we shall use in the following.
From (3), we can consider that $u$ takes values in the compact set $\mathbf{U} := [\underline{u}, \overline{u}]$, where the bounds are $\underline{u} := \operatorname*{ess\,inf}_{\mathbf{X},\Xi} u_0$ and $\overline{u} := \operatorname*{ess\,sup}_{\mathbf{X},\Xi} u_0$. This leads us to reformulate the problem on the restricted domain.

Proposition 2.15 (Parametric entropy measure-valued solution on compact hypercubes). Consider a family of entropy pairs $E$. Let $\mu : (t, x, \xi) \in \mathbb{R}_+ \times \mathbb{R}^n \times \Xi \mapsto \mu_{(t,x,\xi)} \in \mathcal{M}^+(\mathbf{U})$ be a parametric entropy measure-valued solution for $E$. Then, for all $(\eta, q) \in E$ and all non-negative test functions $\phi \in C(\Xi; C^1(\mathbf{T} \times \mathbf{X}))$, it satisfies the entropy inequality (10), obtained from (6) by restricting the integrals to $\mathbf{T} \times \mathbf{X}$ and adding the corresponding boundary terms, where $\sigma_0$ and $\sigma_T$ are Young measures supported on $\mathbf{X} \times \Xi$ accounting for the initial and final times, and where the space boundary measure $\gamma$ decomposes such that, for each $1 \le i \le n$, $\gamma_{L,i}$ and $\gamma_{R,i}$ are boundary measures supported on $\mathbf{T} \times \Gamma_{L,i} \times \Xi$ and $\mathbf{T} \times \Gamma_{R,i} \times \Xi$ respectively, with $\Gamma_{L,i} = \{x \in \partial\mathbf{X} : x_i = L_i\}$ and $\Gamma_{R,i} = \{x \in \partial\mathbf{X} : x_i = R_i\}$.

Lemma 2.16. Consider a family of entropy pairs $E$ such that either $(\mathrm{id}, f) \in E$ or $E = E_K$. Let $\mu$ be a parametric entropy MV solution for $E$. Then, for all test functions $\phi \in C(\Xi; C^1(\mathbf{T} \times \mathbf{X}))$ (not necessarily non-negative), the inequality (10) written for the pair $(\mathrm{id}, f)$ holds as an equality, referred to as (11).

Proof. The proof of this lemma is discussed in [32], and the case of the Kruzhkov entropies is recovered thanks to the boundedness of $\mathbf{U}$.

Remark 2.17. Our numerical method shall not use all inequalities (10). Therefore, imposing (11) as an additional constraint may be beneficial in practice; see further discussion in Section 3.
Remark 2.18 (Imposing constraints on the boundary). To ensure concentration of $\mu_{(t,x,\xi)}$, in addition to the condition $\sigma_0 = \delta_{u_0(\cdot)}$, one may impose conditions on the boundary measures $(\gamma_{L,i})_{i=1}^n$ and $(\gamma_{R,i})_{i=1}^n$. The choice of boundary condition makes it possible to ensure the absence of interaction with the boundary. We shall make the assumption that the trace of $u_0$ on $\Gamma_{B,i}$, denoted $\gamma_{B,i}(u_0)$, exists for all $B \in \{L, R\}$ and all $1 \le i \le n$, and we notice that this trace does not depend on $\xi \in \Xi$. We then want to impose that $\gamma_{B,i}(x) = \delta_{\gamma_{B,i}(u_0)(x)}$ for almost all $x$, all $1 \le i \le n$ and $B \in \{L, R\}$.

Remark 2.19 (General Dirichlet boundary conditions). The first entropy weak formulation with Dirichlet boundary conditions was proposed by [33], but it requires the existence of strong traces, so that it cannot be generalized to MV solutions. Later, Otto introduced boundary entropy-flux pairs in his PhD thesis; a short presentation is given in [34], while an extended analysis is available in [19]. This notion allowed Vovelle in [35] to show that Otto's framework can be reformulated via Kruzhkov semi-entropies and to extend the well-posedness theory of conservation laws on bounded domains to entropy process solutions, a concept similar to MV solutions (see also [36] for more results). Lastly, in order to express such a formulation as moment constraints, it suffices to follow the approach of doubling the number of measures in [16], recalled in Appendix B for the case of Kruzhkov entropies.
We now introduce the occupation measure $\nu$, defined by $d\nu(t, x, \xi, y) := d\mu_{(t,x,\xi)}(y)\, dt\, dx\, d\rho(\xi)$, which allows us to rewrite the constraints (11) and (10). Equation (11) can be put in the form of a linear equation (16) on $\nu$ and the time and space boundary measures, holding for all $\phi \in C(\Xi; C^1(\mathbf{T} \times \mathbf{X}))$, and equation (10) can be written as a family of linear inequalities (17) on the same measures, for entropy pairs $(\eta, q)$ in a family $E$ and non-negative test functions $\phi \in C(\Xi; C^1(\mathbf{T} \times \mathbf{X}))$.

Remark 2.20 (On systems of conservation laws). When studying systems of conservation laws, the main difficulty is the lack of a uniqueness theory. In particular, even if entropy inequalities are available, there is no hope of identifying entropy measure-valued solutions with entropy weak solutions. Nevertheless, our method could be applied as such to systems of conservation laws.
3 Moment-SOS method for measure-valued solutions on compact sets

In the previous section, we introduced parametric measure-valued (MV) solutions for scalar hyperbolic equations, which are defined by equations (16)-(17). The aim of this section is to express these equations as constraints on the moments of the occupation measure and to explain how to approximate these moments based on the moment-SOS (Lasserre's) hierarchy [22]. For that, we require the assumption that $f : \mathbb{R} \times \Xi \to \mathbb{R}^n$ is a polynomial function. We will see in the next section how to extract from these moments some information on the solution $u$ of the initial problem.

From weak formulations to moment constraints
The following lemma, derived from [16, Lemma 1], relies on density arguments, together with the fact that we are working with compact sets.

Lemma 3.1. Let $\{\phi_\alpha\}_{\alpha \in \mathbb{N}^{n+p+1}}$ be a polynomial basis on $\mathbf{T} \times \mathbf{X} \times \Xi$. Then equation (16) is equivalent to the countable family of constraints (18) obtained by taking $\phi = \phi_\alpha$ in (16), for all $\alpha \in \mathbb{N}^{n+p+1}$, where $F$ is defined in (16).
For $f$ a polynomial function, (18) provides constraints on linear combinations of moments of the measures $\nu$. In the case of a family of polynomial entropy pairs $E \subseteq E_P$, we can also express (17) as constraints on the moments of the measures $\nu$.

Lemma 3.2. Assume that $\{\phi_\alpha\}_{\alpha \in F}$, for $F \subset \mathbb{N}^{n+p+1}$, is a countable family of polynomials on $\mathbf{T} \times \mathbf{X} \times \Xi$ such that any non-negative polynomial can be decomposed on this family with positive coefficients. Then equation (17) is equivalent to the family of constraints (19) obtained by taking $\phi = \phi_\alpha$ in (17), for all $\alpha \in F$ and all $(\eta, q) \in E$.

Remark 3.3. Since, as stated in Lemma 2.16, equation (17) implies equation (16), equation (19) implies equation (18) with an appropriate family of entropy pairs. It thus may seem redundant to enforce both but, in the approximation method, the family of polynomials in Lemma 3.2 will be reduced, so that this implication is no longer guaranteed, and imposing (18) as additional constraints may be beneficial. An illustration is provided in Appendix C for the numerical example studied in Section 4.1.
The case $E = E_K$ ensures concentration of the measure, as seen in Theorem 2.14, but we are faced with two issues: first, we must take into account an uncountable family of functions parametrised by $v \in \mathbf{U}$; second, the absolute value function $v \mapsto |v|$ is not a polynomial. To deal with the uncountable family of functions, we introduce $v$ as a new variable. To treat the absolute value, we double the number of measures.
More precisely, we introduce as new unknowns Borel measures $\vartheta^+$ and $\vartheta^-$, whose supports satisfy $\operatorname{supp}(\vartheta^+) \subseteq \{(t,x,\xi,y,v) \in \mathbf{T} \times \mathbf{X} \times \Xi \times \mathbf{U} \times \mathbf{U} : y \ge v\}$ and $\operatorname{supp}(\vartheta^-) \subseteq \{(t,x,\xi,y,v) : y \le v\}$, and we impose the condition that $\nu \otimes \lambda_{\mathbf{U}} = \vartheta^+ + \vartheta^-$, which can be expressed as constraints between moments of $\nu$, $\vartheta^+$ and $\vartheta^-$. Similarly, we introduce time boundary measures $\vartheta_0^+$, $\vartheta_0^-$, $\vartheta_T^+$ and $\vartheta_T^-$, space boundary measures $(\vartheta_{L,i}^\pm)_{i=1}^n$ and $(\vartheta_{R,i}^\pm)_{i=1}^n$, and the corresponding constraints with the measures $\nu$. All those definitions are plainly written in Appendix B. We shall once again introduce a collection of measures $\vartheta$. From [16, Lemma 2], equation (17) is equivalent to a condition (20) on these measures, holding for all non-negative test functions $\phi \in C^1(\mathbf{T} \times \mathbf{X}) \otimes C(\Xi)$ and all non-negative test functions $\theta \in C(\mathbf{U})$.

Lemma 3.4. Assume that $\{\phi_\alpha\}_{\alpha \in F}$ is a countable family of polynomials on $\mathbf{T} \times \mathbf{X} \times \Xi \times \mathbf{U}$ such that any non-negative polynomial can be decomposed on this family with positive coefficients. Then (20) is equivalent to the family of constraints (21) obtained by taking the test functions equal to $\phi_\alpha$ in (20), for all $\alpha \in F$.
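The effect of doubling the measures can be checked on a toy discrete measure: splitting the mass according to the sign of $y - v$ replaces the non-polynomial integrand $|y - v|$ by the polynomial integrands $\pm(y - v)$. A sketch under these assumptions (the atoms are illustrative, not from the paper):

```python
# Toy discrete measure standing in for nu (x) lambda_U: atoms (y_k, v_k) with weights w_k.
atoms = [(0.2, 0.5, 0.25), (0.9, 0.5, 0.25), (0.7, 0.1, 0.5)]

# Split into theta_plus (supported where y >= v) and theta_minus (where y <= v).
theta_plus = [(y, v, w) for (y, v, w) in atoms if y >= v]
theta_minus = [(y, v, w) for (y, v, w) in atoms if y < v]

# The Kruzhkov integrand |y - v| against the original measure...
lhs = sum(w * abs(y - v) for (y, v, w) in atoms)
# ...equals the polynomial integrands +/-(y - v) against the split measures.
rhs = sum(w * (y - v) for (y, v, w) in theta_plus) \
    + sum(w * (v - y) for (y, v, w) in theta_minus)
```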
Proof. The proof relies on density arguments.
A particular family $\{\phi_\alpha\}_{\alpha \in \mathbb{N}^{2(n+p+2)}}$ satisfying the assumption that any non-negative polynomial can be decomposed on this family with positive coefficients is given, after rescaling $\mathbf{T} \times \mathbf{X} \times \Xi \times \mathbf{U}$ to the unit hypercube, by
$$\phi_\alpha(w) := \prod_{i=1}^{n+p+2} w_i^{\alpha_i} (1 - w_i)^{\alpha_{n+p+2+i}}, \qquad \alpha \in \mathbb{N}^{2(n+p+2)}.$$
The proof follows the one of Lemma 3 in [16] and uses Handelman's Positivstellensatz [16].
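On the unit interval, this family reduces to $\{w^a (1-w)^b\}$. For example, the polynomial $p(w) = w^2 - w + 1$, which is positive on $[0, 1]$, decomposes with the positive coefficients $(1, 1, 1)$ as $p = (1-w)^2 + w(1-w) + w^2$. A quick numerical check (our own illustration, not from the paper):

```python
def p(w):
    return w * w - w + 1.0

def handelman_decomposition(w):
    # Positive combination of members of the family {w^a (1-w)^b}.
    return (1.0 - w) ** 2 + w * (1.0 - w) + w ** 2

# The two expressions agree on a grid of [0, 1].
max_gap = max(abs(p(k / 100.0) - handelman_decomposition(k / 100.0))
              for k in range(101))
```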

Generalized Moment Problem
Roughly speaking, the Generalized Moment Problem (GMP) is an infinite-dimensional linear optimization problem on finitely many Borel measures $\nu_i \in \mathcal{M}(K_i)^+$, with $K_i \subseteq \mathbb{R}^{n_i}$, $i = 1, \ldots, N$ and $n_i \in \mathbb{N}$. That is, one is interested in finding measures whose moments satisfy (possibly countably many) linear constraints and which minimize some criterion. In full generality, the GMP is intractable, but if all the $K_i$ are basic semi-algebraic sets and the integrands are polynomials, then one may provide an efficient numerical scheme to approximate as closely as desired any finite number of moments of optimal solutions of the GMP. It consists of solving a hierarchy of semi-definite programs of increasing size. Convergence of this numerical scheme is guaranteed by invoking powerful results from real algebraic geometry, essentially positivity certificates, and is further developed for many classical cases in [38,39].
Let $h_i \in \mathbb{R}[w_i]$ and $h_{i,k} \in \mathbb{R}[w_i]$ be polynomials in the vector of indeterminates $w_i \in \mathbb{R}^{n_i}$, and let $b_k$ be real numbers, for finitely many $i = 1, \ldots, N$ and countably many $k = 1, 2, \ldots$. The GMP is the following optimization problem over measures:
$$\rho^* = \inf_{\nu_1, \ldots, \nu_N} \sum_{i=1}^N \int_{K_i} h_i \, d\nu_i \quad \text{s.t.} \quad \sum_{i=1}^N \int_{K_i} h_{i,k} \, d\nu_i = b_k, \quad k = 1, 2, \ldots, \tag{22}$$
where the infimum is over $\nu_i \in \mathcal{M}(K_i)^+$, $i = 1, \ldots, N$.

From measures to moments and their approximation
Instead of optimizing over the measures in problem (22), we optimize over their moments. For simplicity and clarity of exposition, we describe the approach in the case of a single unknown measure $\nu$, but it easily extends to the case of several measures. Let us consider the simplified GMP
$$\rho^* = \inf_{\nu \in \mathcal{M}(K)^+} \int_K h \, d\nu \quad \text{s.t.} \quad \int_K h_k \, d\nu = b_k, \quad k = 1, 2, \ldots, \tag{23}$$
where the moments of a measure $\nu \in \mathcal{M}(K)^+$ are defined by
$$z_\alpha := \int_K w^\alpha \, d\nu(w), \qquad \alpha \in \mathbb{N}^n. \tag{24}$$
Conversely, given a sequence $z = (z_\alpha)_{\alpha \in \mathbb{N}^n}$, if (24) holds for some $\nu \in \mathcal{M}(K)^+$, we say that the sequence has the representing measure $\nu$. Recall that measures on compact sets are uniquely characterized by their moments (see [22, p. 52]).

Remark 3.5. The use of canonical moments, i.e. moments associated with monomials, may be critical from a numerical point of view; other polynomial bases with more favorable numerical properties could be considered. However, in the numerical experiments, we restrict ourselves to domains included in the unit hypercube, which moderates numerical instabilities.
Next, we define a pseudo-integration with respect to an arbitrary sequence $z \in \mathbb{R}^{\mathbb{N}^n}$ by
$$\ell_z(p) := \sum_{\alpha} p_\alpha z_\alpha \quad \text{for } p = \sum_{\alpha} p_\alpha w^\alpha \in \mathbb{R}[w], \tag{25}$$
and $\ell_z$ is called the Riesz functional.

Theorem 3.6 (Riesz-Haviland [22, Theorem 3.1]). Let $K \subseteq \mathbb{R}^n$ be closed. A real sequence $z \in \mathbb{R}^{\mathbb{N}^n}$ is the moment sequence of some measure $\nu \in \mathcal{M}(K)^+$, i.e. $z$ satisfies (24), if and only if $\ell_z(p) \ge 0$ for all $p \in \mathbb{R}[w]$ non-negative on $K$.
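The Riesz functional acts on a polynomial simply by replacing each monomial $w^\alpha$ with the moment $z_\alpha$. A one-variable sketch, using the moments $z_k = 1/(k+1)$ of the Lebesgue measure on $K = [0, 1]$ (an illustrative choice of ours, not from the paper):

```python
# Moments of the Lebesgue measure on K = [0, 1]: z_k = int_0^1 w^k dw = 1/(k+1).
z = [1.0 / (k + 1) for k in range(10)]

def riesz(p, z):
    """Pseudo-integration ell_z(p) = sum_k p_k z_k for p given by coefficients [p_0, p_1, ...]."""
    return sum(pk * zk for pk, zk in zip(p, z))

# ell_z((w - 1/2)^2) = 1/4 - z_1 + z_2 = 1/4 - 1/2 + 1/3 = 1/12 >= 0,
# consistent with Riesz-Haviland since (w - 1/2)^2 is non-negative on K.
val = riesz([0.25, -1.0, 1.0], z)
```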
Assuming that $K$ is closed, this result allows us to reformulate the GMP (23) as a linear problem on moment sequences, namely
$$\rho^* = \inf_{z \in \mathbb{R}^{\mathbb{N}^n}} \ell_z(h) \quad \text{s.t.} \quad \ell_z(h_k) = b_k, \ k = 1, 2, \ldots, \quad \text{and} \quad \ell_z(p) \ge 0 \text{ for all } p \in \mathbb{R}[w] \text{ non-negative on } K. \tag{26}$$
Theorem 3.6 guarantees the equivalence between formulations (26) and (23). However, the latter reformulation is still numerically intractable.
From non-negative polynomials to sums of squares

Characterizing the non-negativity of polynomials is an important issue in real algebraic geometry. Let $K$ be a basic semi-algebraic set, i.e.
$$K = \{w \in \mathbb{R}^n : g_1(w) \ge 0, \ldots, g_m(w) \ge 0\} \tag{27}$$
for some polynomials $g_1, \ldots, g_m \in \mathbb{R}[w]$, and assume that $K$ is compact. In addition, assume that one of the polynomials, say the first one, is $g_1(w) := N - \sum_{i=1}^n w_i^2$ for some $N$ sufficiently large. For notational convenience, we let $g_0(w) := 1$.
We say that a polynomial $s \in \mathbb{R}[w]$ is a sum of squares (SOS) if there are finitely many polynomials $q_1, \ldots, q_r$ such that $s(w) = \sum_{j=1}^r q_j(w)^2$ for all $w$.

Theorem 3.7 (Putinar's Positivstellensatz). If $p > 0$ on the basic semi-algebraic compact set $K$ defined by (27) with $g_1(w) := N - \sum_{i=1}^n w_i^2$, then $p = \sum_{j=0}^m s_j g_j$ for some SOS polynomials $s_j \in \mathbb{R}[w]$, $j = 0, 1, \ldots, m$.
By a density argument, checking the non-negativity of $\ell_z$ on polynomials non-negative on $K$ can be replaced by checking non-negativity only on polynomials that are strictly positive on $K$, and hence on those that have an SOS representation as in Theorem 3.7.
For a given integer $d$, denote by $\Sigma[w]_d \subset \mathbb{R}[w]$ the set of SOS polynomials of degree at most $2d$, let $d_j := \lceil \deg(g_j)/2 \rceil$ for $j = 0, \ldots, m$, and define the cone $Q_d(g) \subset \mathbb{R}[w]$, for $g = (g_0, \ldots, g_m)$, by
$$Q_d(g) := \Big\{ \sum_{j=0}^m s_j g_j : s_j \in \Sigma[w]_{d - d_j} \Big\}, \tag{28}$$
and observe that $Q_d(g)$ consists of polynomials which are non-negative on $K$. Let $b_d(w) := (w^\alpha)_{|\alpha| \le d} \in \mathbb{R}[w]^{n_d}$ be the vector of monomials of degree at most $d$; we recall that $n_d$ denotes the binomial coefficient $\binom{n+d}{n}$. For $j = 0, \ldots, m$, let $M_{d - d_j}(g_j z)$ denote the real symmetric matrix, linear in $z$, corresponding to the entrywise application of $\ell_z$ to the matrix with polynomial entries $g_j\, b_{d-d_j}(w)\, b_{d-d_j}(w)^T$. For $j = 0$ and $g_0 = 1$, the matrix $M_d(z) = \ell_z(b_d b_d^T)$ (where $\ell_z$ is applied entrywise) is called the moment matrix; for any other value of $j$, it is called a localizing matrix. It turns out that, for all $j = 0, 1, \ldots, m$, $\ell_z(g_j q^2) \ge 0$ for all $q \in \mathbb{R}[w]_{d - d_j}$ if and only if $M_{d - d_j}(g_j z) \succeq 0$; these are convex linear matrix inequalities in $z$, where $\succeq$ denotes the positive semi-definite (or Loewner) order.
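For the one-dimensional example $K = [0, 1] = \{w : g_1(w) = w(1-w) \ge 0\}$ with the moments $z_k = 1/(k+1)$ of the Lebesgue measure (our illustrative choice), the moment and localizing matrices can be assembled directly, and both are positive semi-definite as the theory requires. A sketch assuming NumPy is available:

```python
import numpy as np

d = 2
z = [1.0 / (k + 1) for k in range(2 * d + 1)]  # moments of Lebesgue on [0, 1]

# Moment matrix M_d(z): entries ell_z(w^{i+j}) = z_{i+j}, monomials up to degree d.
M = np.array([[z[i + j] for j in range(d + 1)] for i in range(d + 1)])

# Localizing matrix for g(w) = w(1 - w): entries ell_z(w(1-w) w^{i+j}) = z_{i+j+1} - z_{i+j+2},
# of size d - d_1 + 1 with d_1 = ceil(deg(g)/2) = 1.
L = np.array([[z[i + j + 1] - z[i + j + 2] for j in range(d)] for i in range(d)])

# Both matrices are PSD since z has a representing measure supported on [0, 1].
m_eigs = np.linalg.eigvalsh(M)
l_eigs = np.linalg.eigvalsh(L)
```

In the hierarchy below, these positive semi-definiteness conditions are exactly the constraints imposed on the unknown truncated moment sequence.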

Moment-SOS hierarchy
The following finite-dimensional semi-definite programming (SDP) problems are relaxations of the moment problem (26):
$$\rho_d^* = \inf_{z \in \mathbb{R}^{n_{2d}}} \ell_z(h) \quad \text{s.t.} \quad \ell_z(h_k) = b_k, \ k = 1, \ldots, k_d, \qquad M_{d - d_j}(g_j z) \succeq 0, \ j = 0, 1, \ldots, m, \tag{29}$$
where $z$ is a truncated sequence of moments up to degree $2d$ and only the constraints involving moments of degree at most $2d$ are enforced; the relaxations are parametrized by the relaxation degree $d \ge \max_{j=0,\ldots,m} d_j$.

Theorem 3.8 (Convergence of the moment-SOS hierarchy, [38, Theorem 7]). Suppose that $K$ is a basic semi-algebraic compact set. Further assume that there exists $C > 0$, independent of $d$, such that for any $d \in \mathbb{N}$, if $z^d \in \mathbb{R}^{n_{2d}}$ is a solution of (29), then $z^d_0 \le C$. Finally, assume that there exists a unique solution $\nu^* \in \mathcal{M}(K)^+$ to problem (23). Then there exists a sequence of solutions $(z^d)_d$ of (29) whose entries converge to the corresponding moments of $\nu^*$ as $d \to \infty$. In particular, one has $\rho_d^* \to \rho^*$ as $d \to \infty$.

Application to our problem
Entropy MV solution as a GMP

In the scalar hyperbolic case, the measures $\nu_i$ under consideration are from the collection $\nu$, or $\nu$ and $\vartheta$ when considering Kruzhkov's entropies. The sets $K_i$ all correspond to $K = \mathbf{T} \times \mathbf{X} \times \Xi \times \mathbf{U}$. The polynomials $h_{i,j}$ are given in (18) (conservation law), in (19) when considering polynomial entropy pairs or in (21) (together with the compatibility conditions between $\nu$ and $\vartheta$, see (47) and similar equations) when considering Kruzhkov entropy pairs (entropy inequalities), and in (38)-(41) (marginal constraints). For the sake of readability, we shall only consider the case of polynomial entropies and a formulation only on the measures $\nu$.
We may also define an objective functional, with integrands $h$, $h_0$, $h_T$, $(h_{L,i})_{i=1}^n$ and $(h_{R,i})_{i=1}^n$ associated with the occupation and boundary measures. If the initial measure is concentrated on the graph of the initial condition and if, in addition, one imposes suitable boundary measures as exposed in Remark 2.18, then the choice of the objective functional is not crucial to recover the entropy MV solution of the scalar hyperbolic PDE. Indeed, as a consequence of Theorem 2.14, the corresponding Young measure is concentrated: there is nothing to be optimized. However, our aim is to approximate the GMP by a finite-dimensional optimization problem in order to solve it numerically, and then the choice of the objective functional will impact the convergence of the corresponding relaxations. From experimental observations, two objective functionals seem to produce interesting results: maximizing the opposite of the entropy constraints, and minimizing the trace of the moment matrix. The latter seems to be a good heuristic: minimizing the nuclear norm of a matrix leads to reducing its rank (see [40]), which tends to favor measures with localized support. However, to date there is still no proof that a particular objective functional is effective in general.
Finally, one is able to define a GMP, referred to as (32): minimize the objective functional subject to the moment constraints (18)-(19) and the marginal constraints, where the infimum is taken over measures $\nu \in \mathcal{M}(K)^+$ and $\nu_T \in \mathcal{M}(K_T)^+$. Theorem 3.8 extends to the case of multiple measures, as discussed in [22] and shown in [38]. Note that the compact sets $\mathbf{T}$, $\mathbf{X}$, $\Xi$ and $\mathbf{U}$ as defined before can be expressed as basic semi-algebraic compact sets, so that $K$ is also a basic semi-algebraic compact set. Moreover, the constraint for $\alpha = 0$ (see equation (38) in Appendix A) yields the relaxed linear constraint $z_0 = \ell_z(1) = \int_{\mathbf{T} \times \mathbf{X} \times \Xi} dt\, dx\, d\rho(\xi) \le |\mathbf{T}||\mathbf{X}|$. Finally, supposing that $\sigma_0$ is concentrated on the graph of the weak-parametric entropy solution, (32) admits a unique solution. Hence the hypotheses of Theorem 3.8 are satisfied.
Then, optimal solutions of the moment-SOS hierarchy (29) (adapted to the present context) converge to optimal solutions of (32) as d goes to infinity. In particular, one may extract the MV solution of (11), provided that σ_0, γ_{L,i} and γ_{R,i} are concentrated for 1 ≤ i ≤ n, as already discussed in Remark 2.18.

Post-processing quantities of interest
We have seen in the previous section how to obtain approximate sequences z_d of moments of the measure ν on K, such that dν(t, x, ξ, y) = dµ_{t,x,ξ}(y) dt dx dρ(ξ), where µ is the measure-valued solution supported on the graph of the solution. In this section, we present how to construct an approximation of the function u thanks to the Christoffel-Darboux function and its ability to estimate the support of a measure (see [41] for further details). We also show how to obtain approximations of statistical moments of variables of interest that are functions of the solution, possibly using a moment completion technique and the Christoffel-Darboux function.

Approximation of the graph of the solution
We consider that we have obtained an approximation z_d of the moments of order up to 2d of the measure ν, which is supported on the graph of the function u(t, x, ξ). In order to approximate the function from its moments, we rely on an approximate Christoffel-Darboux function associated with the measure (which has to be carefully defined) and which tends to take high values on the support of the measure. Thus, finding the minimizers of the approximate inverse Christoffel-Darboux function for given (t, x, ξ) ∈ T × X × Ξ gives an approximation of u(t, x, ξ). For w = (t, x, ξ, y) ∈ K, we let b_d(w) be a basis of monomials of degree up to d and M_d(z_d) be the corresponding moment matrix, that is, the Gram matrix of the basis b_d(w) for the measure ν_d corresponding to z_d. When M_d(z_d) is invertible, the inverse Christoffel-Darboux function is defined by

q_{ν_d}(w) = b_d(w)^T M_d(z_d)^{-1} b_d(w) = Σ_i p_i(w)²,

where the (λ_i, v_i) are eigenpairs of M_d(z_d) and the polynomials p_i(w) = λ_i^{-1/2} b_d(w)^T v_i form an orthonormal basis of the space of polynomials of degree at most d in L²_{ν_d}(K). In the case where M_d(z_d) is singular, a regularization is introduced by considering the function

q_{ν_d+βν_0}(w) = b_d(w)^T (M_d(z_d) + βI)^{-1} b_d(w),

which turns out to be the inverse Christoffel-Darboux function of the measure ν_d + βν_0, where ν_0 is the measure on K for which the monomials form an orthonormal family (see [24]). Exploiting the fact that q_{ν_d+βν_0} tends to take low values on the graph of u, an approximation of u is defined by

u_d(t, x, ξ) ∈ argmin_{y∈U} q_{ν_d+βν_0}(t, x, ξ, y).

Further information can be found in [24]. Remark 3.9. In this paper, we consider polynomial moments of the occupation measure. This allows us to exploit the localization property of the associated Christoffel functions and to estimate the graph of the solution a posteriori. The use of moments associated with other functions, such as piecewise polynomials, could probably be considered as well (see [42] for the use of piecewise polynomial moments and a short review on the use of other types of moments).
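As an illustration of this reconstruction step, the following self-contained sketch (toy code, not the paper's solver; a step function plays the role of u and the moments are computed by quadrature rather than by the hierarchy) builds the moment matrix of a measure supported on a graph, regularizes it, and recovers the function by minimizing the regularized inverse Christoffel-Darboux function over a grid of U:

```python
import numpy as np

d = 4                                   # polynomial degree of the basis b_d
beta = 1e-7                             # regularization parameter, as in the text

u = lambda x: np.where(x < 0.5, 1.0, 0.0)   # assumed step "solution" on X = [0, 1]

# Monomial basis b_d(x, y) of total degree <= d.
pairs = [(i, j) for i in range(d + 1) for j in range(d + 1 - i)]
b = lambda x, y: np.array([x**i * y**j for (i, j) in pairs])

# Moment matrix M_d = \int b b^T dnu for dnu(x, y) = delta_{u(x)}(dy) dx,
# approximated by an equal-weight quadrature on the graph of u.
xs = np.linspace(0.0, 1.0, 2001)
B = np.stack([b(x, u(x)) for x in xs])       # (npts, nbasis)
M = B.T @ B / len(xs)

# Regularized inverse Christoffel-Darboux function q(w) = b(w)^T (M + beta I)^{-1} b(w).
Minv = np.linalg.inv(M + beta * np.eye(M.shape[0]))
q = lambda x, y: b(x, y) @ Minv @ b(x, y)

# Approximate u(x) by the minimizer of q(x, .) over a grid of U = [0, 1].
ys = np.linspace(0.0, 1.0, 201)
u_approx = lambda x: ys[np.argmin([q(x, y) for y in ys])]
```

Here M is singular, since the polynomial y(y − 1) vanishes on the whole support; the regularization β makes the inverse well defined, exactly as described above.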

Statistical moments of variables of interest
Considering ξ as a random parameter, one may be interested in computing the expectation of some variable of interest Q(ξ) = F(u(·, ·, ξ); ξ), where F(·, ξ) is a real-valued function taking time-space functions as input. In some particular situations, it is possible to obtain an estimation of this quantity directly from the moments z_d. In particular, this is the case when F(v; ξ) = ∫_{T×X} g(t, x, v(t, x)) dt dx for a polynomial g, since the expectation E[Q] = ∫_K g(t, x, y) dν(t, x, ξ, y) is then a linear functional of the moments of ν. We may also be interested in obtaining statistical moments of the solution u(t, x, ξ) at different points (t, x), which is not a variable of interest of the above form. Of course, these quantities can be estimated from point-wise evaluations of u based on the technique presented in the previous section. However, an alternative approach is possible to estimate the statistical moments for all (t, x) ∈ T × X from the approximate moments z_d of the measure ν. We know that the measure ν can be disintegrated into its marginal λ_T ⊗ λ_X and its conditional measure dν(ξ, y | t, x), such that dν(t, x, ξ, y) = dν(ξ, y | t, x) dt dx, and we consider the statistical moments f_k(t, x) := ∫ y^k dν(ξ, y | t, x), k ∈ N.
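As a toy illustration of the first situation (hedged: the step-shock solution below is only an assumed stand-in for u, and the occupation-measure moment is approximated here by quadrature instead of by the hierarchy), for g(t, x, y) = y² the expectation E[Q] reduces to the single moment ∫_K y² dν:

```python
import numpy as np

# Assumed stand-in for the solution u(t, x, xi) (a moving step shock).
u = lambda t, x, xi: np.where(x < (xi - 1.0) / 4.0 + t / 2.0, 1.0, 0.0)

# Grids over T = [0, 1/2], X = [-1/2, 1/2], Xi = [0, 1].
ts = np.linspace(0.0, 0.5, 51)
xs = np.linspace(-0.5, 0.5, 101)
xis = np.linspace(0.0, 1.0, 101)
T, X, XI = np.meshgrid(ts, xs, xis, indexing="ij")

# E[Q] = \int_K y^2 dnu(t, x, xi, y), i.e. the moment z_{(0,0,0,2)} of the
# occupation measure, approximated by an equal-weight quadrature.
vol = 0.5 * 1.0 * 1.0                                  # |T| |X| |Xi|
z_0002 = np.mean(u(T, X, XI) ** 2) * vol
print(z_0002)
```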
We assume that f_k(t, x) takes values in a compact interval F, which can be easily obtained in terms of U and k. We then let {g_j}_{j=1}^m, m ∈ N, be polynomials that describe the semi-algebraic compact set T × X × F. Letting z be the sequence of moments of ν and ω the sequence of moments of the measure δ_{f_k(t,x)}(dy) dt dx, we may notice that, for all α = (α_1, α_2),

ω_{α,1} = ∫_{T×X} t^{α_1} x^{α_2} f_k(t, x) dt dx = ∫_K t^{α_1} x^{α_2} y^k dν(t, x, ξ, y),

which is available from z whenever |α| ≤ 2d − k. Our goal is then to approximate the support of the measure δ_{f_k(t,x)}(dy) dt dx from its moments ω in order to recover the graph of f_k(t, x). We are faced with the issue that the information we have on the moments is incomplete: we only have the moments ω_{α,0} for α ∈ N^{n+1}_{2d} and ω_{α,1} for α ∈ N^{n+1}_{2d−k}. Following [25], we introduce a finite-dimensional semidefinite programming (SDP) problem to recover the graph of f_k(t, x) from this incomplete moment information: minimize Tr(M_d(ω)) over completions of ω, subject to the positivity constraints M_d(ω) ⪰ 0 and M_{d−d_j}(g_j ω) ⪰ 0 for 1 ≤ j ≤ m, with the known moments ω_{α,0} and ω_{α,1} fixed, where Tr(M) denotes the trace of a matrix M. We recall that M_d(ω) denotes the moment matrix of ω. From this, we can compute the corresponding Christoffel-Darboux approximation of f_k, following the approach of the previous section; see [24,25].
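To make the objects in this SDP concrete, the following sketch (assuming a toy uniform measure; it builds the moment-matrix object only, not the completion problem itself, which requires an SDP solver) assembles M_d(ω) from a 2-D moment sequence and checks that it is positive semidefinite, its trace being the quantity minimized above:

```python
import numpy as np
from itertools import product

# Moments of the uniform measure on [0,1]^2: omega_{(a,b)} = 1/((a+1)(b+1)).
d = 2
alphas = [(a, b) for a, b in product(range(2 * d + 1), repeat=2) if a + b <= 2 * d]
omega = {ab: 1.0 / ((ab[0] + 1) * (ab[1] + 1)) for ab in alphas}

# Moment matrix M_d(omega): M[i, j] = omega_{alpha_i + alpha_j} for a monomial
# basis indexed by multi-indices of total degree <= d.
basis = [(a, b) for a, b in product(range(d + 1), repeat=2) if a + b <= d]
M = np.array([[omega[(a1 + a2, b1 + b2)] for (a2, b2) in basis]
              for (a1, b1) in basis])

eigs = np.linalg.eigvalsh(M)     # PSD, since omega comes from an actual measure
print(eigs.min(), np.trace(M))   # Tr(M_d(omega)) is the SDP objective above
```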

Numerical examples
For numerical illustration, we consider Burgers-type equations with parametrised initial condition or parametrised flux. The choice of entropy pairs is important to ensure uniqueness of the solution. Implementing Kruzhkov's entropy pairs is possible (as seen in Section 3.1), but computationally heavy since it requires a reformulation with measures in higher dimension. It is known that the entropy η(y) = y² provides sufficient constraints to ensure uniqueness of the entropy solution for the Burgers equation [26]. Hence, instead of using Kruzhkov's pairs, we here rely on the family of polynomial entropies η_l(y) = y^{2l}, l ∈ N, and the corresponding polynomial entropy fluxes q_l. As objective function, we choose the trace of the moment matrix (see the discussion in Section 3.4).
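For the Burgers flux f(y) = y²/2, the entropy fluxes q_l can be made explicit: the compatibility condition q_l′ = η_l′ f′ gives q_l(y) = 2l/(2l+1) y^{2l+1} (normalizing q_l(0) = 0). A quick numerical sanity check of this identity by central finite differences:

```python
import numpy as np

f_prime = lambda y: y                                  # f(y) = y^2 / 2
eta_prime = lambda y, l: 2 * l * y ** (2 * l - 1)      # eta_l(y) = y^(2l)
q = lambda y, l: 2 * l / (2 * l + 1) * y ** (2 * l + 1)

ys = np.linspace(-1.0, 1.0, 5)
h = 1e-6
for l in (1, 2, 3):
    # Central difference approximation of q_l' compared against eta_l' f'.
    q_prime_fd = (q(ys + h, l) - q(ys - h, l)) / (2 * h)
    assert np.allclose(q_prime_fd, eta_prime(ys, l) * f_prime(ys), atol=1e-6)
print("q_l' = eta_l' f' verified for l = 1, 2, 3")
```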
Numerical experiments are performed with the Matlab interface Gloptipoly3 [43].
In order to approximate the graph of the solution u, we use the method described in Section 3.5.1. Numerically, the optimization of the Christoffel function is achieved through a discretization of T, X, Ξ and U and the computation of the Christoffel function at each point of the grid. We set the value of the regularization parameter β to 10⁻⁷. Remark 4.1 (On the invertibility of the moment matrix). Imposing that the moment matrix is positive semi-definite does not guarantee its invertibility, hence the regularization. However, the regularized matrix may be poorly conditioned for high values of the relaxation order d, but we do not directly manipulate the ill-conditioned matrix. In order to obtain the Christoffel-Darboux kernel, we compute a spectral decomposition of the non-regularized matrix and use the inverse of the shifted eigenvalues. This seems to mitigate numerical instabilities, although a further stability analysis would be required.
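The eigenvalue-shift trick of Remark 4.1 can be sketched as follows (with a toy rank-deficient matrix standing in for the moment matrix): the shifted spectral inverse V diag(1/(λ_i + β)) Vᵀ coincides with (M + βI)⁻¹, while only the non-regularized matrix is factorized:

```python
import numpy as np

rng = np.random.default_rng(0)
beta = 1e-7

A = rng.standard_normal((8, 3))
M = A @ A.T                              # rank-3 PSD stand-in for the moment matrix

lam, V = np.linalg.eigh(M)               # spectral decomposition of the singular matrix
Minv_shift = (V / (lam + beta)) @ V.T    # V diag(1/(lambda_i + beta)) V^T

# Reference: direct inversion of the regularized matrix.
Minv_direct = np.linalg.inv(M + beta * np.eye(8))
rel = np.linalg.norm(Minv_shift - Minv_direct) / np.linalg.norm(Minv_direct)
print(rel)                               # the two formulas agree up to rounding
```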
We shall in the following denote by u_d the Christoffel-Darboux approximation of the solution using approximate moments from degree d of the hierarchy, and by u the exact solution of our Riemann problem. Remark 4.2 (On computational complexity). In our numerical experiments, we used an interior-point algorithm to solve SDP problems, whose complexity is analyzed in [44]. Let q_d be the number of unknown moments of a measure. Our SDP problem has semidefinite constraints on matrices M_{d−d_j} ∈ R^{r_j×r_j}, 0 ≤ j ≤ m, with r_j ≤ q_d/2. Letting r = Σ_{j=0}^m r_j, the complexity of the interior-point algorithm is O(√r (q_d² Σ_{j=0}^m r_j + q_d Σ_{j=0}^m r_j³)). Thus, the complexity is O(q_d^{9/2}). Remark 4.3 (Curse of dimension). For a relaxation degree d, the number of unknown moments of a measure is q_d ≤ C(d + n + p + 2, d) = O(d^{n+p+2}), from which we deduce that the complexity of the interior-point algorithm is O(d^{(9/2)(n+p+2)}). To circumvent the curse of dimension and address problems of high dimension n + p + 2, we should exploit low-dimensional structures in the set of moments, such as sparsity [45] or low-rankness. This will be addressed in future works. Remark 4.4. Mesh-based methods can be used to approximate solutions in terms of the physical variables and the parameters, also facing the curse of dimension. Due to the presence of discontinuities in terms of the physical variables and the parameters (not necessarily aligned with the coordinates), adaptive mesh refinement methods are required, leading to prohibitive computational costs in high dimension. Our method is mesh-free and seems to capture discontinuities well. Further comparisons need to be performed in future works. Remark 4.5 (Obtaining the moment constraints). We here consider problems with polynomial or piecewise polynomial data, which allowed us to compute exactly the moments of the given measures σ_0, γ_{L,i} and γ_{R,i}, 1 ≤ i ≤ n. For more complicated data and high-dimensional problems, numerical integration methods could be required, such as Monte Carlo methods.
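To give a feel for these counts, here is a hedged reading of Remarks 4.2-4.3 with n = p = 1 (one space variable, one parameter): the bound C(d + n + p + 2, d) on the number of unknown moments, and the resulting O(q_d^{9/2}) growth of the interior-point cost estimate:

```python
from math import comb

# Bound on the number of unknown moments of one measure on the
# (n + p + 2)-dimensional set K at relaxation degree d.
def q(d, n, p):
    return comb(d + n + p + 2, d)

n, p = 1, 1
for d in (2, 5, 8):
    # q_d and the O(q_d^(9/2)) complexity estimate from Remark 4.2.
    print(d, q(d, n, p), round(q(d, n, p) ** 4.5))
```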

Riemann problem for the Burgers equation with parametrised initial condition
As a first example, we consider the classical one-dimensional Riemann problem (see e.g. [46]) for a Burgers equation with a parameter-independent flux, where we parametrise the initial position of the shock, taking

u_0(x, ξ) = 1 if x < (ξ − 1)/4, and u_0(x, ξ) = 0 otherwise,

with a parameter ξ taking values in Ξ = [0, 1]. We know that the solution takes values in U = [0, 1]. The time-space window on which we consider the solution is T = [0, 1/2] and X = [−1/2, 1/2]. The unique solution is the shock

u(t, x, ξ) = 1 if x < (ξ − 1)/4 + t/2, and u(t, x, ξ) = 0 otherwise,

travelling at the Rankine-Hugoniot speed 1/2. Equipping Ξ with the Lebesgue measure on [0, 1], this yields the statistical moments

f_k(t, x) = min(max(2t − 4x, 0), 1), for all k ≥ 1,

for all (t, x) ∈ T × X. We may notice that in this simple case, f_k is independent of k for k ≥ 1, since u only takes the values 0 and 1.
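The exact solution and its statistical moments can be evaluated directly (a sketch consistent with the shock locations x_0(0) = −1/4 and x_0(1) = 0 quoted in the error section; left state 1 and right state 0 are assumed):

```python
import numpy as np

# Exact shock solution: initial shock at x0(xi) = (xi - 1)/4, travelling at
# Rankine-Hugoniot speed (u_L + u_R)/2 = 1/2 with u_L = 1, u_R = 0.
def u_exact(t, x, xi):
    return np.where(x < (xi - 1.0) / 4.0 + t / 2.0, 1.0, 0.0)

# With xi ~ Uniform([0, 1]), f_k(t, x) = P(xi > 4x + 1 - 2t) clipped to [0, 1],
# independent of k >= 1 since u only takes the values 0 and 1.
def f_k(t, x):
    return float(np.clip(2.0 * t - 4.0 * x, 0.0, 1.0))
```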
We recall that at relaxation degree d, the number of unknown moments is 2q_d ≤ 2 C(d + n + p + 2, d), where C(·, ·) denotes the binomial coefficient. Indeed, each measure is expected to yield q_d moments, and we optimize over the two measures ν and ν_T. However, ν_T has fewer unknown moments, since it is supported on K_T and some of its moments are dependent.
Retrieving the graph of the solution

Figure 1 shows the graphs of the approximate solution u_d(t, x, 0) for (t, x) ∈ T × X (so that the shock is initially located at x = −1/4), for hierarchy degrees d = 2, 5, 8. We see that our approximation is almost indistinguishable from the exact solution for d = 5, and indistinguishable from it for d = 8. We observe the same results as in [16], where discontinuities are very well resolved as early as d = 5.

Error estimation
We choose to compute two different errors of our approximate solution.
l¹(T × X × Ξ) error. We randomly pick 25 values in Ξ, and consider 25 equidistant values in T and in X. We denote the test sets Ξ_e, T_e and X_e respectively. We study the evolution of the relative l¹ error with respect to the degree d of the hierarchy, namely

e_g(d) := ∥u − u_d∥_{l¹(T_e×X_e×Ξ_e)} / ∥u∥_{l¹(T_e×X_e×Ξ_e)}.
The results are presented in Table 1 and in Figure 3. We observe a fast convergence of the error for small values of d. The convergence is not monotone and rather slow for high values of d. It may thus seem interesting to investigate where the errors of our approximation are concentrated. For illustrative purposes, we plot the distributed error ε(t, x) = |u_5(t, x, 0.2) − u(t, x, 0.2)| for (t, x) ∈ T_e × X_e in Figure 4. We observe that the errors are mostly concentrated around the shock, and that they worsen close to the time boundaries.

l¹(T × X) error for different parameter values. We consider four different values of the parameter, ξ ∈ Ξ_e := (0, 0.2, 0.6, 1) (which correspond to a shock initially located at x = −0.25, x = −0.2, x = −0.1 and x = 0, respectively), and 100 equidistant points in T and in X, denoting the test sets T_e and X_e respectively. We then study, for each ξ_e ∈ Ξ_e, the evolution of the relative l¹(T_e × X_e) error with respect to the degree d of the hierarchy. We observe the same behaviour of the errors as in the previous paragraph.

Conservation condition
We here check the conservation condition, i.e. how far c_d(t, ξ) := ∫_X (u_d(t, x, ξ) − u(t, x, ξ)) dx is from 0. We plot in Figure 5 the function t ↦ c_d(t, 0.2) for 11 equidistant points in T and for d = 2, 5, 8. In order to approximate the integral, we compute the pointwise error at 1001 equidistant points in X and divide the sum by the number of points. The conservation condition is rather well satisfied and it tends to improve as d rises.
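The quadrature described above can be sketched as follows (with a hypothetical perturbed u_d, only to exercise the check; the true u_d comes from the hierarchy):

```python
import numpy as np

# Assumed exact shock solution and a toy approximation u_d = u + small bump.
u = lambda t, x, xi: np.where(x < (xi - 1.0) / 4.0 + t / 2.0, 1.0, 0.0)
u_d = lambda t, x, xi: u(t, x, xi) + 1e-3 * np.sin(2 * np.pi * x)

def c_d(t, xi, npts=1001):
    # Equal-weight sum over npts equidistant points of X = [-1/2, 1/2] (|X| = 1),
    # as described in the text.
    xs = np.linspace(-0.5, 0.5, npts)
    return float(np.mean(u_d(t, xs, xi) - u(t, xs, xi)))

print(c_d(0.25, 0.2))
```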
Retrieving statistical moments of the solution

Let T_e denote 100 equidistant points in T and X_e 100 equidistant points in X. We want to approximate the expectation f_1(t, x) of the solution for all (t, x) ∈ T_e × X_e, following the method described in Section 3.5.2. Denoting by f_{1,d} the approximate expected value of the solution at relaxation degree d of the hierarchy, we compute the relative l¹(T_e × X_e) error ∥f_1 − f_{1,d}∥_{l¹(T_e×X_e)} / ∥f_1∥_{l¹(T_e×X_e)} for d = 2, . . ., 8. The results are presented in Table 4.
We note here the same phenomenon as for the errors presented above: the approximation rapidly improves as d rises until d = 5; the convergence is then rather slow and not monotone.
Riemann problem for the Burgers equation with parametrised flux

The solution is known to take values in U = [0, 1]. The time-space window on which we consider the solution is T = [0, 1/2] and X = [−1/2, 1/2]. Figure 7 shows the graphs of the approximate solution u_d(1/4, x, ξ) for x ∈ X and ξ = 0, 1 (so that the speed of the shock is 1/4 and 1/2, respectively), for hierarchy degrees d = 2, 5, 8, superposed with the exact solution. Once more, we see that our approximation is very close to the exact solution for d = 5, and indistinguishable from it for d = 8. We note the same phenomenon as for the errors presented above: the approximation improves as d rises until d = 5; the convergence is then not monotone and rather slow.

Conservation condition
We once more study the conservation condition by plotting in Figure 9 the function t ↦ c_d(t, ξ) = ∫_X (u_d(t, x, ξ) − u(t, x, ξ)) dx for ξ = 0.2, at 11 equidistant points in T and for d = 2, 5, 8. The same remark as for the previous experiment holds: the conservation condition is rather well satisfied and it tends to improve as d increases.

First, to ensure that the marginal of ν with respect to t, x and ξ is the tensor product of the Lebesgue measure on T × X and ρ, it suffices to impose that

∫_K t^{α_1} x^{α_2} ξ^{α_3} dν(t, x, ξ, y) = ∫_{T×X×Ξ} t^{α_1} x^{α_2} ξ^{α_3} dt dx dρ(ξ),

for all α ∈ N^{n+p+1}. In a similar manner, we impose the marginals of the time boundary measures to be products of a Dirac measure, a Lebesgue measure and ρ as follows: for all α ∈ N^{n+p+1},

∫_K t^{α_1} x^{α_2} ξ^{α_3} dν_0(t, x, ξ, y) = 0^{α_1} ∫_{X×Ξ} x^{α_2} ξ^{α_3} dx dρ(ξ),
∫_K t^{α_1} x^{α_2} ξ^{α_3} dν_T(t, x, ξ, y) = T^{α_1} ∫_{X×Ξ} x^{α_2} ξ^{α_3} dx dρ(ξ).
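The right-hand sides of these marginal constraints factorize into 1-D integrals and can be computed in closed form; a small sanity check (with ρ taken as the Lebesgue measure on Ξ = [0, 1], as in the experiments, and a Monte Carlo cross-check that is only a toy verification, not the GMP code):

```python
import numpy as np

# Closed form of \int_{T x X x Xi} t^a x^b xi^c dt dx drho(xi) for
# T = [0, 1/2], X = [-1/2, 1/2], rho = Lebesgue on Xi = [0, 1].
def marginal_moment(a, b, c, T=(0.0, 0.5), X=(-0.5, 0.5), Xi=(0.0, 1.0)):
    mono = lambda s, lo, hi: (hi**(s + 1) - lo**(s + 1)) / (s + 1)  # int_lo^hi w^s dw
    return mono(a, *T) * mono(b, *X) * mono(c, *Xi)

# Monte Carlo cross-check of the closed form.
rng = np.random.default_rng(1)
N = 200000
t = rng.uniform(0.0, 0.5, N)
x = rng.uniform(-0.5, 0.5, N)
xi = rng.uniform(0.0, 1.0, N)
mc = np.mean(t**2 * x**2 * xi) * 0.5   # rescale by the volume |T| |X| |Xi| = 1/2
print(marginal_moment(2, 2, 1), mc)
```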

C Not imposing (18) for the Burgers equation
For illustrative purposes, we solved the GMP and reconstructed the approximate solution without imposing (18) for the Riemann problem where the initial position of the shock is parametrised. At relaxation degree d = 5 and for ξ = 0.6, which corresponds to a shock initially located at x = −1/10, this yields Figure 10. We observe that removing (18) from the constraints highly degrades the approximation.

Fig. 10: Graph of the approximate solution u_d(t, x, 0.6) for d = 5 without imposing (18)

Let N^n_d := {α ∈ N^n : |α| ≤ d}, where |α| := α_1 + · · · + α_n, and let n_d := C(n + d, d). A vector p := (p_α)_{α∈N^n_d} ∈ R^{n_d} is the coefficient vector (in the monomial basis) of a polynomial p ∈ R[w] of degree d = deg(p), expressed as p = Σ_{α∈N^n_d} p_α w^α. Integrating p with respect to a measure ν involves only finitely many moments: ∫ p dν = Σ_{α∈N^n_d} p_α z_α, where z_α := ∫ w^α dν.

Fig. 3: Evolution of the error e_g with relaxation degree d

Fig. 8: Evolution of the error e_g with relaxation degree d

Table 1: Number of unknowns q_d and error e_g(d) for d = 2, . . ., 8