Semidefinite Approximations of Invariant Measures for Polynomial Systems

We consider the problem of approximating numerically the moments and the supports of measures which are invariant with respect to the dynamics of continuous- and discrete-time polynomial systems, under semialgebraic set constraints. First, we address the problem of approximating the density and hence the support of an invariant measure which is absolutely continuous with respect to the Lebesgue measure. Then, we focus on the approximation of the support of an invariant measure which is singular with respect to the Lebesgue measure. Each problem is handled through an appropriate reformulation into a linear optimization problem over measures, solved in practice with two hierarchies of finite-dimensional semidefinite moment-sum-of-square relaxations, also called Lasserre hierarchies. Under specific assumptions, the first Lasserre hierarchy allows to approximate the moments of an absolutely continuous invariant measure as close as desired and to extract a sequence of polynomials converging weakly to the density of this measure. The second Lasserre hierarchy allows to approximate as close as desired in the Hausdorff metric the support of a singular invariant measure with the level sets of the Christoffel polynomials associated to the moment matrices of this measure. We also present some application examples together with numerical results for several dynamical systems admitting either absolutely continuous or singular invariant measures.


Introduction
Given a polynomial system described by a discrete-time (difference) or continuous-time (differential) equation under general semialgebraic constraints, we propose numerical methods to approximate the moments and the supports of the measures which are invariant under the sytem dynamics. The characterization of invariant measures allows to determine important features of long term dynamical behaviors [21].
One classical way to approximate such features is to perform numerical integration of the equation satisfied by the system state after choosing some initial conditions. However, the resulting trajectory could exhibit some chaotic behaviors or great sensitivity with respect to the initial conditions. Numerical computation of invariant sets and measures of dynamical systems have previously been studied using domain subdivision techniques, where the density of an invariant measure is recovered as the solution of fixed point equations of the discretized Perron-Frobenius operator [8,5]. The underlying method, integrated in the software package GAIO [7], consists of covering the invariant set by boxes and then approximating the dynamical behaviour by a Markov chain based on transition probabilities between elements of this covering. More recently, in [9] the authors have developed a multilevel subdivision scheme that can handle uncertain ordinary differential equations as well.
By contrast with most of the existing work in the literature, our method does not rely neither on time nor on space discretization. In our approach, the invariant measures are modeled with finitely many moments, leading to approximate recovery of densities in the absolutely continuous case or supports in the singular case. Our contribution is in the research trend aiming at characterizing the behavior of dynamical nonlinear systems through linear programs (LP), whose unkown are measures supported on the system constraints. This methodology was introduced in the static case by Lasserre [22] and consists of reformulating a polynomial optimization problem as an infinite-dimensional LP over probability measures. To handle practically such an LP, one can then rely on a hierarchy of semidefinite programming (SDP) problems, called moment-sum-of-square or Lasserre hierarchy, see [23] for a global view.
In the context of polynomial dynamical systems, extensions to this hierarchy have been studied, for example, to obtain converging approximations for regions of attraction [12], maximum controlled invariants [18] and reachable sets [31].
In our case, we first focus on the characterization of densities of absolutely continuous invariant measures with respect to some reference measure (for instance the Lebesgue measure). For this first problem, our method is inspired by previous contributions looking for moment conditions ensuring that the underlying unknown measure is absolutely continuous [24] with a bounded density in a Lebesgue space, as a follow-up of the volume approxilation results of [13]. When the density function is assumed to be square-integrable, one can rely on [16] to build a sequence of polynomial approximations converging to this density in the L 2 -norm.
We focus later on the characterization of supports of singular invariant measures. For this second problem, we rely on previous works [13,25,27] aiming at extracting as much information as possible on the support of a measure from the knowledge of its moments. The numerical scheme proposed in [13] allows to approximate as close as desired the moments of a measure uniformly supported on a given semialgebraic set. The framework from [25] uses similar techniques to compute effectively the Lebesgue decomposition of a given measure, while [27] relies on the Christoffel function associated to the moment matrix of this measure. When the measure is uniform or when the support of the measure satisfies certain conditions, the sequence of level sets of the Christoffel function converges to the measure support with respect to the Hausdorff distance.
Previous work by the third author [11] shows how to use the Lasserre hierarchy to characterize invariant measures for one-dimensional discrete polynomial dynamical systems. We extend significantly this work in the sense that we now characterize invariant measures on more general multidimensional semialgebraic sets, in both discrete and continuous settings, and we establish convergence guarantees under certain assumptions. In the concurrent work [19], the authors are also using the Lasserre hiearchy for approximately computing extremal measures, i.e. invariant measures optimal w.r.t. a convex criterion. They have weaker convergence guarantees than ours, but the problem is formulated in a more general set-up including physical measures, ergodic measures or atomic measures.
Our contribution is twofold: • A first Lasserre hierarchy allowing to approximate the moments of an invariant measure which is absolutely continuous with respect to the Lebesgue measure. For an invariant measure with either square integrable or essentially bounded density, one has convergence guarantees of the hierarchy and one can compute asymptotically the exact moment sequence of this measure. At each step of the hierarchy, one can recover an approximate polynomial density from the solution of the SDP problem. The resulting sequence of polynomial approximations converges weakly to the exact density when the degree of the polynomial goes to infinity.
• A second Lasserre hierarchy allowing to approximate as close as desired the support of a singular invariant measure. At each step of this hierarchy, one can recover an approximation of the support with the superlevel set of the Christoffel polynomial associated to the moment matrix of the singular measure. Under certain assumptions, the maximal distance between the exact support and the points among this superlevel set converges to zero when the size of the moment matrix goes to infinity, i.e. we can ensure convergence in the Hausdorff metric.
In both cases, our results apply for both discrete-time and continuous-time polynomial systems. Each problem is handled through an adequate reformulation into an LP problem over probability measures. We show how to solve in practice this infinite-dimensional problem with moment relaxations. In both cases, this boils down to solving a Lasserre hierarchy of finite-dimensional SDP problems of increasing size.
In Section 2, we describe preliminary materials about discrete/continuous-time polynomial systems, invariant measures as well as polynomial sums of squares and moment matrices. The first problem of density approximation for absolutely continuous invariant measures is handled in Section 3 with our first hierarchy. The second problem of support approximation for singular invariant measures is investigated in Section 4 with our second hierarchy. Finally, we illustrate the method with several numerical experiments in Section 5.

Discrete-Time and Continuous-Time Polynomial Systems
Given r, n ∈ N, let R[x] (resp. R 2r [x]) stand for the vector space of real-valued n-variate polynomials (resp. of degree at most 2r) in the variable be the vector space of complex-valued n-variate polynomials. We are interested in the polynomial system defined by • a polynomial transition map The degree of f is given by d f := max{deg f 1 , . . . , deg f n }; • a set of constraints assumed to be compact basic semialgebraic: defined by given polynomials g 1 , . . . , g m ∈ R[x].
We develop our approach in parallel for discrete-time and continuous-time systems. With f being a polynomial transition as in (1) and X being a set of semialgebraic state constraints as in (2), we consider either the discrete-time system: or the continuous-time system:

Invariant Measures
Given a compact set A ⊂ R n , we denote by M(A) the vector space of finite signed Borel measures supported on A, namely real-valued functions on the Borel sigma algebra B(A). The support of a measure µ ∈ M(A) is defined as the closure of the set of all points x such that µ(B) = 0 for an open neighborhood B of x. We note C(A) (resp. C 1 (A)) the Banach space of continuous (resp. continuously differentiable) functions on A equipped with the sup-norm. Let C(A) be the topological dual of C(A) (equipped with the supnorm), i.e. the set of continuous linear functionals of C(A). By a Riesz representation theorem, C(A) is isomorphically identified with M(A) equipped with the total variation norm denoted by · TV . Let C + (A) (resp. M + (A)) be the cone of non-negative elements of C(A) (resp. M(A)). A probability measure on A is an element µ ∈ M + (A) such that µ(A) = 1. The topology in C + (A) is the strong topology of uniform convergence in contrast with the weak-star topology that can be defined in M + (A).

The restriction of the Lebesgue measure on a subset
The moments of the Lebesgue measure on A are denoted by where we use the multinomial notation x β := x β 1 1 x β 2 2 . . . x βn n . The sequence of Lebesgue moments on A is denoted by . When A = X, we define z := z X by omitting the superscript notation. A sequence y := (y β ) β∈N n ∈ R N n is said to have a representing measure on X if there exists µ ∈ M(X) such that y β = x β µ(dx) for all β ∈ N n .
The so-called pushforward measure or image measure of a given µ ∈ M + (X) under f is defined as follows: for every set A ∈ B(X). Let us define the linear operator L disc f : C(X) → C(X) by: where the derivatives of measures are understood in the sense of distributions, that is, through their action on test functions of C 1 (X). In the sequel, we use the more concise notation L f to refer to L disc f (resp. L cont f ) in the context of discrete-time (resp. continuoustime) systems. Definition 2.1. (Invariant measure) We say that a measure µ is invariant w.r.t. f when L f (µ) = 0 and refer to such a measure as an invariant measure. We also omit the reference to the map f when it is obvious from the context and write L(µ) = 0.
When considering discrete-time systems as in (3), a measure µ is called invariant w.r.t. f when it satisfies L disc f (µ) = 0. When considering continuous-time systems as in (4), a measure is called invariant w.r.t. f when it satisfies L cont f (µ) = 0.
It was proved in [20] that a continuous map of a compact metric space into itself has at least one invariant probability measure. A probability measure µ is ergodic w.r.t. f if for all A ∈ B(X) with f −1 (A) = A, one has either µ(A) = 0 or µ(A) = 1. The set of invariant probability measures is a convex set and the extreme points of this set consist of the so-called ergodic measures. For more material on dynamical systems and invariant measures, we refer the interested reader to [21].

Sums of Squares and Moment Matrices
With X a basic compact semialgebraic set as in (2), we set r j := (deg g j )/2 , j = 1, . . . , m. For the ease of further notation, we set g 0 (x) := 1. Let Σ[x] stand for the cone of polynomials that can be expressed as sums of squares (SOS) of some polynomials, and let us note Σ r [x] the cone of SOS polynomials of degree at most 2r, namely For each r ≥ r min := max{1, r 1 , . . . , r m }, let Q r (X) be the r-truncated quadratic module generated by g 0 , . . . , g m : . Let P(X) := R[x] ∩ C + (X) denote the cone of polynomials that are nonnegative on X. To guarantee the convergence behavior of the relaxations presented in the sequel, we need to ensure that polynomials which are in the interior of P(X) lie in Q r (X) for some r ∈ N. The existence of such SOS-based representations is guaranteed by Putinar's Positivstellensaz (see e.g. [13, Theorem 2.2]), when the following algebraic compactness condition holds: There exists a large enough integer N such that one of the polynomials describing the set X is equal to N − x 2 2 .
In addition, the semialgebraic set X should fulfill the following condition: This is the case e.g. if X is a ball. From now on we assume that X is a compact basic semialgebraic set as in (2) and that it satisfies both Assumptions 2.2 and 2.3.
For all r ∈ N, we set N n r := {β ∈ N n : n j=1 β j ≤ r}, whose cardinality is n+r r . Then a polynomial p ∈ R r [x] is written as follows: and p is identified with its vector of coefficients p = (p β ) β∈N n r in the canonical basis (x β ) β∈N n r . Given a real sequence y = (y β ) β∈N n ∈ R N n , let us define the linear functional y : R[x] → R by y (p) := β p β y β , for every polynomial p. For each j = 0, 1, . . . , m, we associate to y a localizing matrix, that is a real symmetric matrix M r (g j y) with rows and columns indexed by N n r−r j and the following entrywise definition: When j = 0 the localizing matrix is called the moment matrix M r (y) := M r−r 0 (g 0 y).
For a given invariant measure µ ∈ M(X), one has L(µ) = 0. It follows from the Stone-Weierstrass Theorem that monomials are dense in continuous functions on the compact set X. The equation L(µ) = 0 is then equivalent to in the context of a discrete-time system (3) and in the context of a continuous-time system (4).
Hence, we introduce the linear functionals I disc for every polynomial p and where grad p := ( ∂p ∂x i ) i=1,...,n . In the sequel, we use the more concise notation I y to refer to I disc y (resp. I cont y ) in the context of discrete-time (resp. continuous-time) systems.

Absolutely Continuous Invariant Measures
For p = 1, 2, . . ., let L p (X) (resp. L p + (X)) be the space of (resp. nonnegative) Lebesgue integrable functions f on X, i..e. such that f p : ) be the space of (resp. nonnegative) Lebesgue integrable functions f on X which are essentially bounded on X, i.e. such that f ∞ := ess sup x∈X |f (x)| < ∞. Two integers p and q are said to be conjugate if 1/p + 1/q = 1, and by Riesz's representation theorem (see e.g. [28, Theorem 2.14]), the dual space of L q (X) for 1 ≤ q < ∞ (i.e. the set of continuous linear functionals on L q (X)) is isometrically isomorphic to L p (X).
For µ ∈ M(X), if µ λ then there exists a measurable function h on X such that dµ = h dλ and the function h is called the density of µ. If h ∈ L p (X), by a slight abuse of notation, we write µ ∈ L p (X) and µ p := h p . If in addition µ is invariant w.r.t. f then we say that f has an invariant density in L p (X).
In Section 3.1, we state some conditions fulfilled by the moments of an absolutely continuous measure with a density in L p (X). In the case of invariant measures, we rely on these conditions to provide an infinite-dimensional LP characterization in Section 3.2. In Section 3.3, we show how to approximate the solution of this LP by using a hierarchy of finite-dimensional SDP relaxations. Section 3.4 is dedicated to the approximation of the invariant density. In Section 3.5, we extend the approach to piecewise-polynomial systems.

Invariant Densities in Lebesgue Spaces
Theorem 3.1. Let p and q be conjugate with 1 ≤ q < ∞. Consider a sequence y ⊂ R. The following statements are equivalent: and for all g ∈ P(X), it holds y (g) ≥ 0.
(i) ⇐= (ii). Since y (g) ≥ 0 for all g ∈ P(X), we rely again on the Riesz-Haviland Theorem to show that y has a representing measure µ ∈ M + (X). It remains to prove that µ ∈ L p + (X there exists a linear functional˜ y in L q (X) such that y (x α ) = y α for all α ∈ N n , its operator norm · being bounded by a nonnegative real constant γ. Moreover, the restriction of˜ y to R[x] is y hence y ≤ ˜ y ≤ γ. Since the dual space of L q + (X) is isometrically isomorphic to L p + (X), there exists µ ∈ L p + (X) such that ˜ y = µ p ≤ γ and satisfying˜ y (g) = y (g) = X gdµ, for all g ∈ R[x]. Theorem 3.1 provides necessary and sufficient conditions satisfied by the moments of an absolutely continuous Borel measure with a density in L p + (X). We now state further characterizations when p = q = 2 in Theorem 3.2 and when p = ∞ and q = 1 in Theorem 3.3. For a given sequence y = (y α ) α and r ∈ N, the notation y r stands for the truncated sequence (y α ) |α|≤2r . The notation 0 means positive semidefinite.
Theorem 3.2. Consider a sequence y ⊂ R. The following statements are equivalent: Proof. (i) =⇒ (ii). By using the first necessary condition of Theorem 3.1, there exists Let r ∈ N and choose an arbitrary We obtain (8) by using a Schur complement. We prove that (9) holds in a similar way by using the second necessary condition of Theorem 3.1.
(i) ⇐= (ii). Since Assumption 2.2 holds, one can apply Putinar's Positivstellensatz [13, Theorem 2.2] to prove that (9) implies that y has a representing measure µ ∈ M + (X). As above, we show that (8) implies that | y (g)| ≤ γ z (g 2 ) 1/2 for all g ∈ R r [x]. We conclude the proof by using the sufficient condition of Theorem 3.1. (i) y has a representing measure µ ∈ L ∞ + (X) with µ ∞ ≤ γ for some γ ≥ 0; (ii) there exists γ ≥ 0 such that for all r ∈ N: Proof. (i) =⇒ (ii). By using the first necessary condition of Theorem 3.1, there exists The remaining inequalities are proved as in Theorem 3.2.
In Section 3.3, we will restrict to the case where p = 2 or p = ∞ while relying on the characterizations stated in the two previous theorems.

Infinite-dimensional Conic Formulation
Let us consider the following infinite-dimensional conic program: Theorem 3.4. Problem (12) admits an optimal solution. If the optimal value ρ ac is positive, then the optimal solution is a nonzero invariant measure.
Proof. First, let us prove that the feasible set of problem (12) is nonempty and weak-star compact. Nonemptiness follows from the fact the zero measure is admissible. Now, let us consider an admissible sequence (µ n ) n∈N . One has µ n TV = µ n p ≤ 1 < ∞, which shows that the feasible set of problem (12) is bounded for the weak-star topology. Now, let us assume that the sequence converges weakly-star to µ. For all A ∈ B(X), one has 0 = L(µ n )(A) → L(µ)(A) as n tends to infinity. In addition, one has µ n p ≤ 1 which yields µ p ≤ 1 as n tends to infinity. Thus µ is feasible for problem (12), which proves that the feasible set of problem (12) is closed in the metric induced by the weak-star topology. This proves that this feasible set is weak-star compact. Problem (12) has a linear cost function and a weak-star compact feasible set, which implies the existence of an optimal solution. The proof of the second statement is straighforward.
Assumption 3.5. There exists a unique invariant probability measure µ ac ∈ L p (X) for some p ≥ 1.
Note that Assumption 3.5 is equivalent to supposing that there exists a unique ergodic probability measure.
Proof. If Assumption 3.5 holds, then the nonzero invariant measure µ ac / µ ac p is feasible for problem (12), which proves that ρ ac ≥ X µ ac / µ ac p = 1/ µ ac p > 0. Finally, let µ ac be an optimal solution of problem (12), yielding the optimal value ρ ac = X µ ac . Then, the measure (ρ ac ) −1 µ ac is an invariant probability measure, ensured to be unique from Assumption 3.5, which concludes the proof.
The choice of maximizing the mass of the invariant measure in problem (12) is motivated by the following reasons: • If we consider to solve only the feasibility constraints associated to problem (12), one could end up with a solution being the zero measure, even under Assumption 3.5.
• Enforcing the feasibility constraints by adding the condition for µ to be a probability measure (i.e. X µ = µ 1 = 1) would not provide any guarantee to obtain a feasible solution as the inequality constraints µ p ≤ 1 may not be fulfilled since µ 1 ≤ vol X µ p ≤ µ p when µ ∈ L p (X) for some p ≥ 1.

A Hierarchy of SDP Relaxations
Let and from now on, let p = 2 or p = ∞ and r ∈ N be fixed, with r ≥ r min . We build the following hierarchy of finite-dimensional semidefinite programming (SDP) relaxations for problem (12): Lemma 3.7. Problem (13) has a compact feasible set and an optimal solution y r .
Proof. First, let us note that the zero sequence is feasible for SDP (13). Each diagonal element of M r (y) has to be nonnegative, thus y α ≥ 0 for α ∈ N n 2r . Let τ := max α∈N n 2r |z α |. For all α ∈ N n 2r , from the constraints C p r (y) 0, we consider the two following cases.
• For p = 2, one has g T yy T g ≤ g T M r (z)g, for each g ∈ R r [x] with vector of coefficients g. In particular for g(x) = x α , one obtains y 2 α ≤ z 2α ≤ τ . • For p = ∞, each diagonal element of M r (z)−M r (y) has to be nonnegative, yielding 0 ≤ y α ≤ z α ≤ τ .
Therefore, this implies that the feasible set of problem (13) is nonempty and compact, ensuring the existence of an optimal solution y r . Lemma 3.8. Let Assumption 3.5 hold and let µ ac be the unique optimal solution of problem (12). For every r ≥ r min , let y r be an arbitrary optimal solution of problem (13) and by completing with zeros, consider y r as an element of R[x] . Then the sequence (y r ) r≥r min ⊂ R[x] converges pointwise to y ∈ R[x] , that is, for any fixed α ∈ N n : lim r→+∞ y r α = y α .
Moreover, y has representing measure µ ac . In addition, one has: Proof. As in the proof of Theorem 3.4 in [25], we show that (y r ) r≥r min converges to (y ) since there exists a subsequence of integers (r k ) with r k ≥ r min such that: For an arbitrary integer r ≥ r min , it follows from (16) that 0 M r (y ). As a consequence of [23, Proposition 3.5], y has a representing measure µ ∈ M + (X). Using the fact that y r k is feasible for SDP (13) together with (16), one has I y (x α ) = 0, for all α ∈ N n 2r , C p r (y) 0 and M r−r j (g j y) 0, for all j = 0, 1, . . . , m. Thus this determinate representing measure satisfies L(µ) = 0 and has a density in L p (X) by using Theorem 3.2 for the case p = 2 and Theorem 3.3 for the case p = ∞. This proves that µ is feasible for problem (12) and ensures that ρ ac ≥ X µ.
Since problem (13) is a relaxation of problem (12), one has ρ ac ≤ ρ r k ac for all k ∈ N. Hence, one has ρ ac ≤ lim k→∞ ρ r k ac = lim k→∞ y r k 0 = X µ. This shows that µ is an optimal solution of problem (12), which is unique from Theorem 3.6. The accumulation point of y r is unique as it is the moment sequence of µ ac , yielding (14) and (15), the desired results.
Remark 1. Note that without the uniqueness hypothesis made in Assumption 3.5, we are not able to guarantee the pointwise convergence of the sequence of optimal solutions (y r ) r≥r min to y .
Remark 2. One could consider the dual of SDP (13), which is an optimization problem over polynomial sums of squares (SOS). One way to prove the non-existence of invariant densities in L p (X) for p ∈ {2, ∞} is to use the output of this dual program, yielding SOS certificates of infeasibility.

Approximations of Invariant Densities
Recall that p = 2 or ∞. Given a solution y r of the SDP (13) at finite order r ≥ r min , let h r ∈ R 2r [x] be the polynomial with vector of coefficients h r given by: where the moment matrix M r (z) is positive definite hence inversible for all r ∈ N. Note that the degree of the extracted invariant density depends on the SDP relaxation order r, and higher relaxation orders lead to higher degree approximations.
Lemma 3.9. Let Assumption 3.5 hold. For every r ≥ r min , let y r be an optimal solution of SDP (13), let h r be the corresponding polynomial obtained as in (17) and let µ ac be the unique optimal solution of problem (12) with density h ac . Then, the following convergence holds: lim Proof. By definition (17), one has: for all α ∈ N n 2r . As a consequence of (14) from Lemma 3.8, this yields: as r tends to infinity. Thus, for all g ∈ R[x], X g(x) h r (x)dλ → X g(x) h ac (x)dλ, yielding the desired result.

Extension to Piecewise Polynomial Systems
Now we explain how to extend the current methodology to piecewise polynomial systems. The idea, inspired from [1], consists in using the piecewise structure of the dynamics and the state-space partition to decompose the invariant measure into a sum of local invariant measures supported on each partition cell while being invariant w.r.t. the local dynamics.
Let us consider a set of cell indices I and a union of semialgebraic cells X = i∈I X i ⊂ R n partitioning the state-space. For each i ∈ I, the state-space cell X i is assumed to be a compact basic semialgebraic set: defined by given polynomials g i,1 , . . . , g i,m i ∈ R[x], m i ∈ N and fulfilling Assumption 2.2 as well as Assumption 2.3. We set r i,j := deg g i,j /2 , for all j = 0, 1, . . . , m i and i ∈ I. Then, one considers either the following discrete-time piecewise polynomial system: or the following continuous-time piecewise polynomial system: For p ≥ 1, this leads to the following infinite-dimensional conic program: Given µ i ∈ M + (X i ) let us define µ := i∈I µ i ∈ M + (X) as well as the linear mapping L : M + (X) → M + (X) by L(µ) := i∈I L f i (µ i ). Since µ p p = i∈I µ i p p , one can rewrite problem (21) as problem (12).
Next, we associate to problem (21) the hierarchy of SDP relaxations indexed by r ≥ max i∈I {max 0≤j≤m i {r i,j }}: As for Lemma 3.8 in Section 3.3, one proves that the sequence of optimal values of SDP (22) converges to the optimal value of problem (21). The extraction of approximate invariant densities can be performed in a way similar to the procedure described in Section 3.4.

Singular Invariant Measures
In the sequel, we focus on computing the support of singular measures for either discretetime or continuous-time polynomial systems. Our approach is inspired from the framework presented in [25], yielding a numerical scheme to obtain the Lebesgue decomposition of a measure µ w.r.t. λ, for instance when λ is the Lebesgue measure. By contrast with [25] where all moments of µ and λ are a priori known, we only know the moments of the Lebesgue measure λ in our case but we impose an additional constraint on µ to be an invariant probability measure.

Infinite-Dimensional LP Formulation
We start by considering the infinite-dimensional linear optimization problem: Assumption 4.1. There exists a unique invariant probability measure µ ∈ M + (X).
Proof. We first prove that the feasible set of LP (23) is nonempty and weak-star compact. Let us denote by µ the unique invariant probability measure for f . Then, nonemptiness follows from the fact that (µ , 0, λ X , µ ) is feasible for LP (23). Now, let us consider the sequences of measures ((µ n ) n , (ν n ) n , (ν n ) n , (ψ n ) n ) such that for all n ∈ N, (µ n , ν n ,ν n , ψ n ) is feasible for LP (23). One has µ n TV = 1 = ν n TV + ψ n TV < ∞ and (ν) n ≤ vol X < ∞ (as X is bounded). This shows that the feasible set of LP (23) is bounded for the weakstar topology. Now, let us assume that the sequences of measures respectively converge weakly-star to µ, ν,ν and ψ. For all A ∈ B(X), one has 0 = L(µ n )(A) → L(µ)(A) and we easily see that (µ, ν,ν, ψ) is feasible for LP (23), which proves that the feasible set is closed in the metric including the weak-star topology. LP (23) has a linear cost function and a weak-star compact feasible set, thus admits an optimal solution.
Now we explain the rationale behind LP (23). When there is no absolutely continuous invariant probability measure supported on X, then LP (23) has an optimal solution (µ , 0, λ X , µ ) with µ being the unique singular invariant probability measure. In this case, the value of LP (23) is ρ sing = 0. Note that in the general case where Assumption 4.1 does not hold, there may be several invariant probability measures. In this case, LP (23) still admits an optimal solution and the optimal value is the maximal mass of the νcomponent among all invariant probability measures.
By contrast with problem (12) from Section 3, we enforce the feasibility constraints by adding the condition for µ to be a probability measure. The reason is that if we remove this condition, the value ρ sing = 0 could still be obtained with another optimal solution (0, 0, λ X , 0), and we could not retrieve the unique invariant probability measure µ .

A Hierarchy of Semidefinite Programs
For every r ≥ r min , we consider the following optimization problem: Problem (24) is a finite-dimensional SDP relaxation of LP (23), implying that ρ r sing ≥ ρ sing for every r ≥ r min . Proof. First, let us denote by µ the invariant probability measure with associated moment sequence u. Since z is the moment sequence associated with λ X , it follows that SDP (24) has the trivial feasible solution (u, 0, z, u).
Then, Assumption 2.2 implies that the semidefinite constraint M r−1 (g X u) 0 holds. Thus, the first diagonal element of M r−1 (g X u) is nonnegative, and since u (1) = u 0 = 1, it follows that u (x 2r i ) ≤ N r , i = 1, . . . , n. We deduce from [26,Lemma 4.3,p. 111] that |u α | is bounded for all α ∈ N n 2r . In addition, from the moment constraints of SDP (24), one has for every i = 1, . . . , n: which similarly yields that |v α |, |v α | and |y α | are all bounded for all α ∈ N n 2r . Therefore, we conclude that the feasible set of SDP (24) is compact and that there exists an optimal solution (u , v ,v , y ).
Theorem 4.4. Let Assumption 4.1 hold. For every r ≥ r min , let (u r , v r ,v r , y r ) be an arbitrary optimal solution of SDP (24) and by completing with zeros, consider u r , v r ,v r , y r as elements of 4 , that is, for any fixed α ∈ N n : Moreover, with (µ , ν 1 , λ X − ν 1 , µ − ν 1 ) being the unique optimal solution of LP (23), u is the moment sequence of the unique invariant probability measure µ , v and y are the respective moment sequences of In addition, one has: lim r→∞ ρ r sing = ρ sing .
Proof. As in the proof of Theorem 3.4 in [25], we show that (u r , v r ,v r , y r ) r≥r min converges pointwise to (u , v ,v , y ) since there exists a subsequence of integers (r k ) with r k ≥ r min such that: By fixing an arbitrary integer r ≥ r min , it follows from (26) that 0 M r (u ), 0 M r (v ), 0 M r (v ) and 0 M r (y ). Therefore, by using [23, Proposition 3.5], u , v ,v and y are the respective moment sequences of determinate representing measures µ, ν,ν and ψ, supported on X. Using the fact that (u r k , v r k ,v r k , y r k ) is feasible for SDP (24) together with (26), these determinate representing measures must satisfy L(µ) = 0, X µ = 1, ν + ψ = µ and ν +ν = λ X . This proves that (µ, ν,ν, ψ) is feasible for LP (23), thus ρ sing ≥ X ν. In addition, one has ρ sing ≤ lim k→∞ ρ r k sing = lim k→∞ v r k 0 = X ν. This shows that (µ, ν,ν, ψ) is an optimal solution of LP (23), which is unique from Theorem 4.2. All accumulation points of (u r , v r ,v r , y r ) are unique as they are the respective moment sequences of µ , ν 1 ,ν = λ X − ν 1 , µ − ν 1 and yields (25), the desired result.
The meaning of Theorem 4.4 is similar to the one of Theorem 3.4 in [25]. By noting (ν , ψ ) the Lebesgue decomposition of the unique invariant probability measure µ , we have the two following cases: 1. If ν ∈ L ∞ + (X) with ν ∞ ≤ 1 then we can obtain all the moment sequences associated to ν and ψ by computing v r and y r through solving SDP (24) as r → ∞. In [25], the sup-norm must be less than an arbitrary fixed γ > 0 while in the present study we select γ = 1 as we consider an invariant probability measure µ . In particular, when there is no invariant measure which is absolutely continuous w.r.t. λ, one has ν = ν 1 = 0, ψ = µ and we obtain in the limit the moment sequence y of the singular measure µ .

Support Approximations for Singular Invariant Measures
Definition 4.5. (Christoffel polynomial) Let µ ∈ M + (X) be such that its moments are all finite and that for all r ∈ N, the moment matrix M r (u) is positive definite. With v r (x) denoting the vector of monomials of degree less or equal than r, sorted by graded lexicographic order, the Christoffel polynomial is the function p µ,r : X → R such that The following assumption is similar to [27, Assumption 3.6 (b)]. It provides the existence of a sequence of thresholds (α r ) r∈N for the Christoffel function associated to a given measure µ in order to approximate the support S of this measure. Here, we do not assume as in [27, Assumption 3.6 (a)] that the closure of the interior of S is equal to S. Assumption 4.6. Given a measure µ ∈ M + (X) with support S ⊆ X, S has nonempty interior and there exist three sequences (δ r ) r∈N , (α r ) r∈N , (d r ) r∈N such that: • (δ r ) r∈N is a decreasing sequence of positive numbers converging to 0.
• For every r ∈ N, d r is the smallest integer such that: • For every r ∈ N, α r is defined as follows: where diam S denotes the diameter of the set S and ω n := 2π is the surface of the n-dimensional unit sphere in R n+1 .
Remark 3. Regarding Assumption 4.6, as mentioned in [27,Remark 3.7], d r is well defined for all r ∈ N and the sequence (d r ) r∈N is nondecreasing. Since diam S ≤ diam X ≤ 1 and vol S ≤ vol X ≤ 1, replacing diam S and vol S by 1 in (27) yields a result similar to Theorem 4.7. For a given sequence (δ r ) r∈N , one can compute recursively d r as well as the threshold α r for the Christoffel polynomial. Theorem 4.7. Let Assumption 4.1 hold and let S ⊆ X be the support of the invariant probability measure µ . Suppose that there exist sequences (δ r ) r∈N , (α r ) r∈N and (d r ) r∈N such that µ , S, (δ r ) r∈N , (α r ) r∈N and (d r ) r∈N fulfill Assumption 4.6. For every r ∈ N, let: Then lim r→∞ sup x∈S r dist (x, S) = 0.
Proof. The proof is similar to the first part of the proof of Theorem 3.8 in [27], which relies on Assumption 4.6 and [27, Lemma 6.6]. We only need to show that the Christoffel polynomial associated to the invariant measure µ is affine invariant. Given an invertible affine mapping g : x → Ax + b, with A ∈ R n×n and b ∈ R n , we prove that the pushforward measureμ := g # µ has all its moments finite, that the associated moment matrix is positive definite and that p µ ,dr (x) = pμ ,dr (Ax + b) for all x ∈ X. Since X is compact,μ has all its moments finite. Since g is invertible, there exists an invertible matrix G such that v r (g(x)) = Gv r (x) for all x ∈ X. Denoting byũ the sequence of moments associated toμ , one has by definition of the pushforward measure: This proves that the moment matrix M r (ũ) is positive definite.
In addition, one has for all x ∈ X, pμ ,r (g(x)) :

Remark 4.
In the case when the invariant measure is discrete singular, its moment matrix may not be invertible and we can not approximate the support of this measure with the level sets of the Christoffel polynomial. In this case, one way to recover the support of the measure is to rely on the numerical linear algebra algorithm proposed in [14] for detecting global optimality and extracting solutions of moment problems. This algorithm is implemented in the GloptiPoly [15] software and has been already used in previous work [11] to recover finite cycles in the context of discrete-time systems.

Numerical Experiments
Here, we present experimental benchmarks that illustrate our method.
In Section 5.1, we compute the optimal solution y r of the primal SDP program (13) for a given positive integer r and p = 2 or ∞, as well as the approximate polynomial density h r p defined in (17). In Section 5.2, we compute the optimal solution u r of the primal SDP program (24) for a given positive integer r as well as S r , the sublevel set of the Christoffel polynomial defined in (28). In practice, the computation of α r in (28) relies on the following iterative procedure: we select d r = r, δ r = 1 and increment the value of δ r until the inequality (27) from Assumption 4.6 is satisfied.
SDPs (13) and (24) are both modeled through GloptiPoly [15] via Yalmip toolbox [30] available within Matlab and interfaced with the SDP solver Mosek [3]. Performance results are obtained with an Intel Core i7-5600U CPU (2.60 GHz) with 16Gb of RAM running on Debian 8. For each problem, we apply a preprocessing step which consists in scaling data (dynamics, general state constraints) so that the constraint sets become unit boxes. Note that our theoretical framework (including convergence of the SDP relaxations) only works after assuming that there exists a unique invariant probability measure (Assumption 3.5 and Assumption 4.1). Even though this may not hold for some of the considered systems, numerical experiments show that satisfying results can be obtained when approximating invariant densities and supports of singular measures.

Square Integrable Invariant density
First, let us consider the one-dimensional discrete-time polynomial system defined by Since F is invertible, then F −1 (t) = t 4/3 is distributed according to h and hence the following dynamical system with x being constrained in the interval X := [0, 1], has the invariant measure with density h ∈ L 2 (X).
Now, let us take w ∈ (0, 1). In this case one has In order to cast the dynamical system from (29) as a piecewise polynomial system, we introduce two additional (so-called lifting) variables y, z and consider the system defined as follows: where X 1 and X 2 are defined by Note that the collection {X 1 , X 2 } is a partition of

Piecewise Systems
Next, we consider three discrete-time piecewise systems coming respectively from Example 3, Example 4 and Example 5 in [17]. These three systems are known to have unique invariant densities [17, Section 4], thus Assumption 3.5 is fulfilled here.
As in Section 5.1.1, we introduce an additional lifting variable to handle either the division or the cube root operator. We compute the approximate density h r ∞ from (17) by solving SDP (22). For the system from (33), the second order SDP relaxation already provides a very accurate approximation of the exact density h (x) = 12(x − 1 2 ) 2 ), as shown in Figure 2(c).

Rotational Flow Map
We consider the rotational flow system, i.e. the two-dimensional continuous-time system defined byẋ with general state constraints within the unit disk X := {x ∈ R 2 : x 2 ≤ 1}. The restriction of the Lebesgue measure to X is invariant for the rotational flow map f (x) := (x 2 , −x 1 ). Indeed, denoting by n(x) the outer normal vector to the unit circle ∂X at x, the Green-Ostrogradski formula yields the following, for all v ∈ C 1 (X): which shows the invariance of λ X .

Hénon Map
The Hénon map is a famous example of two-dimensional discrete-time systems that exhibit a chaotic behavior. The system is defined as follows:

Van der Pol Oscillator
The Van der Pol oscillator is an example of an oscillating system with nonlinear damping [35]. The dynamics are given by the following second-order ordinary differential equation:ẍ By setting x 2 =ẋ 1 , one can reformulate this one-dimensional system into a twodimensional continuous-time system: When a > 0, there exists a limit cycle for the system. Here, we consider a = 0.  Figure 4 shows set approximations S r obtained from (28) of the support of the measure invariant w.r.t. the Van der Pol map. As for Example 5.2.1, we also represent the "true" limit cycle after performing a numerical integration of the Van der Pol system from t 0 = 0 to T = 20 with random sampled initial conditions within the disk of radius 0.1 and center [1, −1]. This numerical approximation is done with the ode45 procedure available inside Matlab. Once again, the plots exhibit a quite fast convergence behavior of the approximations S r of the invariant measure support to the limit cycle when r increases.

Arneodo-Coullet System
Finally, we investigate the Arneodo-Coullet system [4], representing the dynamics of a forced oscillator. This oscillator can be described by the following three-dimensional time-continuous system:ẋ conditions within the disk of radius 0.1 and center [1, 1, 1]. On Figure 5, we observe that the approximation S 4 obtained from (28) provides a reasonably good estimate of the chaotic attractor. The results obtained for higher relaxation orders were not satisfying because of the ill-conditioning of the moment matrix.

Conclusion and Perspectives
We can summarize the contributions of this article as follows: • We propose two methods to characterize invariant measures for discrete-time and continuous-time systems with polynomial dynamics and semialgebraic state constraints. Our approach can also be extended to piecewise-polynomial systems; • The first method allows to approximate as close as desired the moments of an absolutely continuous invariant measure, under the assumption that this density is square integrable. The second method allows to approximate as close as desired the support of a singular invariant measure by using sublevel sets of the Christoffel polynomial constructed from the moment matrix; • Each method relies on solving a hierarchy of finite-dimensional semidefinite programs. While the convergence of the hierarchy is guaranteed in theory, each program can be solved in practice, thanks to public-available solvers.
Our two methods for recovering invariant densities or singular invariant measure supports both rely on an extension of Lasserre's hierarchy of semidefinite relaxations, initially introduced in the context of polynomial optimization. Numerical experiments show that our two methods already yield fairly good approximations of the moments of invariant measures at modest relaxation orders. One further research direction would be to investigate scaling both methods to large-size systems, when either sparsity or symmetry occurs.
Regarding the approximation quality of the invariant densities, our first method could produce more satisfactory results by using Chebyshev polynomials or rational function bases as an alternative to the monomials base. Regarding the approximation quality of the supports of singular invariant measures, our second method would practically yield better estimates by handling the issue related to the ill-conditioning of the moment matrix.
Next, we shall devote research efforts to extend the uniform convergence properties of the Christoffel polynomial studied in [27] to certain classes of singular measures. Another track of investigation would be to develop a similar hierarchy of semidefinite programs to study the support of atomic discrete measures corresponding to finite cycles. A first attempt made in [11] consists in setting the objective function of these programs as particular linear moment combinations. However, the question raised about choosing the adequate objective function to recover a given finite cycle still remains open.