Empirical Measures and Vlasov Hierarchies

The present note reviews some aspects of the mean field limit for Vlasov type equations with Lipschitz continuous interaction kernel. We discuss in particular the connection between the approach involving the N-particle empirical measure and the formulation based on the BBGKY hierarchy. This leads to a more direct proof of the quantitative estimates on the propagation of chaos obtained on a more general class of interacting systems in [S.Mischler, C. Mouhot, B. Wennberg, arXiv:1101.4727]. Our main result is a stability estimate on the BBGKY hierarchy uniform in the number of particles, which implies a stability estimate in the sense of the Monge-Kantorovich distance with exponent 1 on the infinite mean field hierarchy. This last result amplifies Spohn's uniqueness theorem [H. Spohn, Math. Meth. Appl. Sci. 3 (1981), 445-455].


Introduction
Mean field evolution PDEs are an important class of models in nonequilibrium statistical mechanics. Perhaps the main example in this class of models is the Vlasov equation. It takes the form
$$(\partial_t + \xi\cdot\nabla_x) f(t,x,\xi) + E[f](t,x)\cdot\nabla_\xi f(t,x,\xi) = 0, \qquad x,\xi\in\mathbf{R}^3, \tag{1}$$
where f ≡ f(t, x, ξ) ≥ 0 is the (unknown) distribution function of a particle system (in other words, the number density at time t of particles located at the position x with velocity ξ), while
$$E[f](t,x) := -\iint_{\mathbf{R}^3\times\mathbf{R}^3} \nabla V(x-y)\, f(t,y,\xi)\,d\xi\,dy.$$
The Vlasov-Poisson system is a fundamental model in plasma physics [11]; it is obtained by specializing (1) to the case where V is the Coulomb potential
$$V(z) = \frac{1}{4\pi|z|}.$$
In that case, f is the distribution function of a system of identical charged point particles, and E[f] is the self-consistent electrostatic field created by those particles. Another important example is the vorticity formulation of the Euler equation for incompressible fluids in space dimension d = 2, which takes the form
$$\partial_t\omega(t,x) + \operatorname{div}_x(\omega u)(t,x) = 0, \qquad x\in\mathbf{R}^2. \tag{2}$$
Here ω ≡ ω(t, x) ∈ R is the unknown vorticity field, while
$$u(t,x) := \int_{\mathbf{R}^2} K(x-y)\,\omega(t,y)\,dy$$
is the velocity field. The integral kernel K is given by the formula
$$K(z) := \frac{1}{2\pi}\,\frac{z^\perp}{|z|^2}, \qquad z^\perp := (-z_2, z_1). \tag{3}$$
While the vorticity field ω is in general of indefinite sign and cannot be viewed as a density of particles, applying the methods of statistical mechanics to a gas of vortices provides valuable information on the dynamics of incompressible, inviscid fluids in space dimension 2 [17].
A fundamental question in nonequilibrium statistical mechanics is to derive mean field PDEs such as (1) or (2) rigorously from the dynamics of finite particle systems in some appropriate limit. This remains an open problem at the time of this writing, at least for interactions such as the Coulomb potential V or the Biot-Savart kernel K, which are both singular at the origin. The case of interactions with a singularity at the origin weaker than either the Coulomb potential or the Biot-Savart kernel has been recently considered in [7,8]. In the case of regularized interactions, the corresponding limits were established rigorously some time ago [16,3,4].
There are two different ways of handling this problem. One can prove that the empirical measure of a system of N identical, interacting particles converges to the solution to the mean field PDE as N → ∞: this is the approach used in [16,3,4]. Alternatively, one can try to use BBGKY hierarchies and establish the propagation of chaos in the limit N → ∞: see [18].
In this short note, we explain how both approaches are related in the case of smooth interaction kernels (see Theorem 3.1), and obtain a stability estimate on the BBGKY hierarchy that is uniform in the number of particles (see Theorem 4.1). As a consequence, we prove the continuous dependence on initial data of statistical solutions of the mean field PDE, with an estimate in some appropriate Monge-Kantorovich distance, thereby amplifying Spohn's uniqueness theorem for solutions of the infinite Vlasov hierarchy in [18]. (The notion of statistical solution of the mean field equation will be recalled at the end of section 5.)
Seiji Ukai contributed several famous results in the theory of PDEs. For instance, his note [19] is the first global existence and uniqueness theorem on the Cauchy problem for the Boltzmann equation. In addition to his impressive work on the Boltzmann equation, Seiji Ukai is also at the origin of the regularity theory for the Vlasov-Poisson system [21]. More recently, Seiji Ukai also gave a striking interpretation of the derivation of the Boltzmann equation from the N-body problem in classical mechanics in terms of the Nirenberg-Ovsyannikov abstract variant of the Cauchy-Kovalevskaya theorem [20]. The present paper discusses different aspects of the analogous question for Vlasov type equations, and is dedicated to Seiji Ukai's memory, in recognition of his considerable influence on the field of kinetic models.

Vlasov Equations and Mean-Field Limit
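Let $K : \mathbf{R}^d\times\mathbf{R}^d\to\mathbf{R}^d$ be a continuous map satisfying the antisymmetry condition
$$K(z,z') + K(z',z) = 0, \qquad z,z'\in\mathbf{R}^d, \tag{4}$$
and the Lipschitz condition, for some constant L > 0,
$$\sup_{z'\in\mathbf{R}^d}|K(z_1,z') - K(z_2,z')| \le L|z_1-z_2|, \qquad z_1,z_2\in\mathbf{R}^d. \tag{5}$$
These two conditions imply in particular that
$$|K(z,z')| \le L|z-z'| \le L(|z|+|z'|), \qquad z,z'\in\mathbf{R}^d, \tag{6}$$
since K(z', z') = 0 by (4).
Consider a system of N identical particles, the state of the k-th particle at time t being defined by $z_k(t)\in\mathbf{R}^d$. Assume that the evolution of this system of particles is governed by the N-body ODE system
$$\dot z_k(t) = \frac1N\sum_{l=1}^N K(z_k(t),z_l(t)), \qquad k=1,\dots,N. \tag{7}$$
We have assumed a mean field scaling: the interaction between two particles in the system is of order 1/N, so that the collective action of the particle system on each particle is of order unity.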
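To make the mean field scaling concrete, here is a minimal sketch in dimension d = 1, assuming the N-body system has the usual mean field form in which particle k moves with the average of K(z_k, z_l) over all particles l. The kernel tanh(z' - z) and the names `kernel`, `mean_field_rhs`, `euler_step` and `simulate` are illustrative choices, not taken from the text; the kernel is antisymmetric and 1-Lipschitz, so it fits the assumptions on K. The explicit Euler scheme is used only for simplicity.

```python
import math

def kernel(z, zp):
    """Hypothetical kernel K(z, z') = tanh(z' - z): antisymmetric and 1-Lipschitz."""
    return math.tanh(zp - z)

def mean_field_rhs(zs):
    """Velocity of each particle: (1/N) * sum over l of K(z_k, z_l)."""
    n = len(zs)
    return [sum(kernel(zk, zl) for zl in zs) / n for zk in zs]

def euler_step(zs, dt):
    """One explicit Euler step of the N-body system (illustration only)."""
    rhs = mean_field_rhs(zs)
    return [z + dt * v for z, v in zip(zs, rhs)]

def simulate(zs, dt, steps):
    """Apply `steps` Euler steps of size `dt` to the configuration `zs`."""
    for _ in range(steps):
        zs = euler_step(zs, dt)
    return zs
```

Because K is antisymmetric, the velocities sum to zero, so the barycenter of the particle system is conserved; this is an easy sanity check on any implementation of the right hand side.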
Together with the N-body ODE system (7), we consider the mean field Vlasov equation
$$\partial_t f(t,z) + \operatorname{div}_z\big(f(t,z)\,Kf(t,z)\big) = 0, \qquad f\big|_{t=0} = f^{in}, \tag{8}$$
where $f^{in}\in L^1(\mathbf{R}^d;(1+|z|)dz)$. In equation (8), we abuse the notation Kf(t, z) to designate (Kf(t, ·))(z), where
$$K\phi(z) := \int_{\mathbf{R}^d} K(z,z')\,\phi(z')\,dz'. \tag{9}$$
The vorticity formulation of the incompressible Euler equation (2) in space dimension 2 is an obvious example of mean field equation (8), with integral kernel given by (3).
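The Vlasov equation (1) can also be put in this form. In that case, d = 6 and the N-body ODE system (7) is the system of Hamilton's equations
$$\dot x_k = \frac{\partial H_N}{\partial\xi_k}, \qquad \dot\xi_k = -\frac{\partial H_N}{\partial x_k}, \qquad k=1,\dots,N,$$
where the Hamiltonian $H_N$ is defined on $(\mathbf{R}^6)^N$ by the formula
$$H_N(x_1,\xi_1,\dots,x_N,\xi_N) := \sum_{k=1}^N \tfrac12|\xi_k|^2 + \frac{1}{2N}\sum_{k,l=1}^N V(x_k-x_l). \tag{10}$$
Denoting $z_k = (x_k,\xi_k)\in\mathbf{R}^3\times\mathbf{R}^3\simeq\mathbf{R}^6$ the k-th pair of conjugate variables, the interaction kernel K in this case is given by
$$K(x,\xi,x',\xi') := \big(\xi-\xi',\,-\nabla V(x-x')\big) \tag{11}$$
and satisfies the assumptions (4) and (5) if and only if
$$V\in C^1(\mathbf{R}^3), \quad \nabla V\in\operatorname{Lip}(\mathbf{R}^3;\mathbf{R}^3) \quad\text{and}\quad \nabla V(z)+\nabla V(-z) = 0 \text{ for all } z\in\mathbf{R}^3. \tag{12}$$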
Since the total mass and momentum are invariant under the dynamics defined by (1), the mean field PDE (8) coincides with (1) for distribution functions f such that
$$\iint f(t,x,\xi)\,dx\,d\xi = 1 \quad\text{and}\quad \iint \xi f(t,x,\xi)\,dx\,d\xi = 0.$$
Notice that neither the vorticity formulation of the incompressible Euler equation in space dimension 2 nor the Vlasov-Poisson system satisfies the Lipschitz condition (5) on its interaction kernel K, because of the singularity of the Biot-Savart kernel or of the Coulomb potential at the origin. However, both these equations do satisfy the antisymmetry condition (4) on their interaction kernels.
Condition (6) implies that the differential system (7) has a global solution defined for all t ∈ R, denoted by
$$T^N_t Z^{in}_N := (z_1(t),\dots,z_N(t)), \qquad Z^{in}_N := (z_1(0),\dots,z_N(0)),$$
and, for each t ∈ R, we define
$$\mu_N(t) := \mu_{T^N_t Z^{in}_N}, \qquad\text{where}\quad \mu_{Z_N} := \frac1N\sum_{k=1}^N\delta_{z_k} \quad\text{for } Z_N = (z_1,\dots,z_N). \tag{14}$$
The map $t\mapsto\mu_N(t)$ belongs to $C(\mathbf{R}_+, w-P_1(\mathbf{R}^d))$, where $P_r(\mathbf{R}^d)$ designates, for each r > 0, the set of Borel probability measures µ on $\mathbf{R}^d$ such that
$$\int_{\mathbf{R}^d}|z|^r\,\mu(dz) < \infty.$$
A remarkable property of the N-body ODE (7) is that the time-dependent empirical measure $\mu_N$ is a weak solution to the Vlasov equation (8), where the operator K in (9) is extended to $P_1(\mathbf{R}^d)$ by the formula
$$K\mu(z) := \int_{\mathbf{R}^d} K(z,z')\,\mu(dz').$$
We first recall the following important result (proved in the case (11)).
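Proposition 1 (Dobrushin [4]). For all $\mu^{in}\in P_1(\mathbf{R}^d)$, the Cauchy problem
$$\partial_t\mu + \operatorname{div}_z(\mu\,K\mu) = 0, \qquad \mu\big|_{t=0} = \mu^{in} \tag{17}$$
has a unique weak solution $\mu\in C(\mathbf{R}; w-P_1(\mathbf{R}^d))$. Moreover
(a) For all $Z^{in}_N\in(\mathbf{R}^d)^N$, the unique solution to (17) in $C(\mathbf{R}; w-P_1(\mathbf{R}^d))$ with initial data $\mu_{Z^{in}_N}$ is the time dependent empirical measure $\mu_N(t)$ defined by (14).
(b) Let µ and ν be the solutions of (17) in $C(\mathbf{R}; w-P_1(\mathbf{R}^d))$ with initial data respectively $\mu^{in}$ and $\nu^{in}$. Then the Monge-Kantorovich distance $\operatorname{dist}_{MK,1}$ between µ(t) and ν(t) satisfies the inequality
$$\operatorname{dist}_{MK,1}(\mu(t),\nu(t)) \le e^{2L|t|}\operatorname{dist}_{MK,1}(\mu^{in},\nu^{in}), \qquad t\in\mathbf{R}. \tag{18}$$
(c) If $\mu^{in}$ is absolutely continuous with respect to the Lebesgue measure $\mathcal{L}^d$ of $\mathbf{R}^d$, then the solution µ(t) of (17) is also absolutely continuous with respect to $\mathcal{L}^d$ for all t ∈ R. Thus µ(t) is of the form $\mu(t) = f(t,\cdot)\mathcal{L}^d$ for all t ∈ R, where f is the solution to (8).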
We recall that, for all r > 0, the Monge-Kantorovich distance dist MK,r is defined on P r (R d ) by the formula In the particular case r = 1, where Lip(φ) := sup (See Theorems 1.14 and 7.3 (i) in [22].) The interested reader is referred to the original article [4] for a proof of Proposition 1. We just sketch below the argument for obtaining the upper bound on moments of weak solutions of the Vlasov equation (17). For each r ≥ 1, one has (Here we have used the elementary inequality which is a consequence of the convexity of z → e z .) By Gronwall's lemma, A consequence of Dobrushin's result is that the Vlasov equation (8) governs the mean field limit of the N -particle ODE system (7), a result already established in [3] 1 -see also the earlier reference [16].
as N → ∞. Then the sequence of solutions T N t Z(N ) of the N -particle ODE system (7) satisfies Since the distance dist MK,1 metricizes the topology of weak convergence on P 1 (R d ) (see Theorem 7.12 in [22]), the two conditions on Z(N ) imply that By statement (b) in the Proposition above, where f is the solution to (8), which implies the conclusion.
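For intuition about the distance dist MK,1 between two empirical measures, the following minimal sketch covers only the special case d = 1 with two clouds of the same number of equally weighted points. In that case the monotone coupling is optimal, so the distance is the mean gap between the sorted samples. The function name `w1_empirical_1d` is hypothetical, and the shortcut does not extend to d > 1, where a genuine optimal transport problem has to be solved.

```python
def w1_empirical_1d(xs, ys):
    """Monge-Kantorovich distance (exponent 1) between two empirical measures
    on the real line with the same number of equally weighted atoms.

    In dimension 1 the optimal coupling pairs the sorted samples."""
    if len(xs) != len(ys):
        raise ValueError("both clouds must have the same number of points")
    if not xs:
        raise ValueError("clouds must be non-empty")
    xs_sorted = sorted(xs)
    ys_sorted = sorted(ys)
    total = sum(abs(x - y) for x, y in zip(xs_sorted, ys_sorted))
    return total / len(xs)
```

For example, translating every point of a cloud by the same amount h gives a distance of exactly |h|, and relabeling the particles does not change the distance, in agreement with the fact that the empirical measure is a symmetric function of the particle positions.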

Vlasov Hierarchy and the Mean-Field Limit
Another approach to the problem of deriving the Vlasov equation (8) as the mean field limit of the N -particle ODE system (7) involves the Vlasov hierarchy.
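Let $P^{in}_N\in P_1((\mathbf{R}^d)^N)$ be a probability measure on the N-particle phase-space. All particles being identical, assume that $P^{in}_N$ is symmetric, meaning that, for each permutation $\sigma\in S_N$, one has (see footnote 2)
$$S_\sigma\# P^{in}_N = P^{in}_N, \tag{22}$$
where
$$S_\sigma(z_1,\dots,z_N) := (z_{\sigma(1)},\dots,z_{\sigma(N)}). \tag{23}$$
Consider the time dependent N-particle probability distribution $P_N(t)$ defined by the formula
$$P_N(t) := T^N_t\# P^{in}_N. \tag{24}$$
It is the unique solution in $C(\mathbf{R}; w-P_1((\mathbf{R}^d)^N))$ of the N-body Liouville equation
$$\partial_t P_N + \frac1N\sum_{k,l=1}^N\operatorname{div}_{z_k}\big(K(z_k,z_l)\,P_N\big) = 0, \qquad P_N\big|_{t=0} = P^{in}_N. \tag{25}$$
Observe that the first order differential operator on $(\mathbf{R}^d)^N$ appearing in (25) commutes with the permutations $S_\sigma$. Therefore, the symmetry (22) is propagated by the N-body dynamics, so that
$$S_\sigma\# P_N(t) = P_N(t) \qquad\text{for all } \sigma\in S_N \text{ and all } t\in\mathbf{R}. \tag{26}$$
Footnote 1. The argument in [3] involves a distance very similar to $\operatorname{dist}_{MK,1}$, defined by the Kantorovich-Rubinstein formula (19), where the supremum is taken on the set of all bounded Lipschitz continuous functions φ such that $\|\phi\|_{L^\infty} + \operatorname{Lip}(\phi)\le 1$: see formulas (2.8-9) in [3]. However the proof of the mean field limit (Theorem 3.1 in [3]) includes a weak convergence argument on the initial data that is not quantitative and differs from Dobrushin's.
Footnote 2. If T : X → Y is a measurable map and m is a measure on X, T#m designates the push-forward of m under T, that is a measure on Y, defined as follows: for each positive measurable function f on Y,
$$\int_Y f(y)\,T\#m(dy) = \int_X f(T(x))\,m(dx).$$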
For each m = 1, . . . , N − 1, define the m-body marginal of P N (t) as The sequence of m-body marginals of P N satisfies the BBGKY hierarchy of equations (29) Since the equation for m = N in this hierarchy coincides with the N -body Liouville equation (25), the BBGKY hierarchy (29) is exactly equivalent to (25).
for the product topology, each factor being endowed with the weak-* topology of the dual of L 1 (R; C b ((R d ) m )). Each limit point of that family as N → ∞ is a solution (P m ) m≥1 of the (infinite) Vlasov hierarchy 3 More precisely, this is the sequence indexed by N ≥ 1 of the elements (P N:m ) m≥1 of the product space X (as explained in formula (27), for m > N ≥ 1, one has P N:m = 0).
(See [2] for a proof of this result in the genuine Vlasov case (11).) We just sketch below the formal argument analogous to (21) implying tightness of the sequence (P N :m (t)) N ≥m for t and m fixed, as N → ∞. For each N, r ≥ 1, one has (The last inequality above uses again (20), while the last equality uses the symmetry property (26).) By Gronwall's inequality, for each t ∈ R and N ≥ 1, one has (31) In particular, for each t ∈ R, m ∈ N * and N ≥ m, one has If f is a solution to the Vlasov equation (8), the sequence ((f L d ) ⊗m ) m≥1 is a solution to the infinite Vlasov hierarchy (30). Thus, if one knows that each limit point (P m ) m≥1 of the family ((P N :m ) m≥1 ) N ≥1 belongs to a functional space where the infinite Vlasov hierarchy (30) has only one solution, one concludes that for each m ≥ 1, where f is the solution to the Vlasov equation (8).
Uniqueness of the solution to the infinite Vlasov hierarchy has been established in the genuine Vlasov case (11), first by Narnhofer-Sewell [15] in the case where the potential V is analytic with Fourier transform $\hat V\in C_c(\mathbf{R}^d)$, and later by Spohn [18] in the more general case where $\hat V$ is a Radon measure on $\mathbf{R}^d$ such that Notice however that Spohn's uniqueness theorem uses Dobrushin's Proposition 1 (or equivalently the mean field limit obtained in both [3,16]). Therefore the approach based on the BBGKY hierarchy is not really an alternative to the one based on the empirical measure.
These two approaches of the mean field limit involve objects of a different nature. Indeed, the time dependent empirical measures considered in the first approach are measures defined on the single-particle phase space, whereas the second approach based on the BBGKY hierarchy involves the sequence of m-particle phase spaces for all m ≥ 1.

Empirical Measures and Chaotic Sequences
The two approaches of the mean field limit sketched in sections 1 and 2 involve probability measures defined on very different phase spaces, and therefore may seem a priori unrelated. Indeed, in section 1, the Vlasov equation (8) is the equation governing the weak limit of the sequence of empirical measures $\mu_{T^N_t Z^{in}_N}$ viewed as probability measures on the 1-particle phase space $\mathbf{R}^d$. In section 2, the object of interest is $P_N$, which is a symmetric probability measure on the N-particle phase space $(\mathbf{R}^d)^N$, and the goal of the mean field limit is to describe the asymptotic behavior of $P_N$ as N → ∞ through the sequence of its marginals $P_{N:m}$. The m-th marginal $P_{N:m}$ of $P_N$ is itself a symmetric probability measure on the m-particle phase space $(\mathbf{R}^d)^m$.
Perhaps the key to understanding how these different approaches to the mean field limit are related is the following observation: the empirical measure
$$\mu_{z_1,\dots,z_N} := \frac1N\sum_{k=1}^N\delta_{z_k}$$
is a symmetric function of the N variables $(z_1,\dots,z_N)\in(\mathbf{R}^d)^N$ with values in the set of probability measures in the variable $z\in\mathbf{R}^d$. With this observation in mind, it becomes natural to consider expressions of the form
$$\int_{(\mathbf{R}^d)^N}\mu^{\otimes m}_{z_1,\dots,z_N}\,P_N(dz_1\dots dz_N),$$
where the measure-valued symmetric function $\mu^{\otimes m}_{z_1,\dots,z_N}$ of the N-tuple $(z_1,\dots,z_N)$ is averaged under the symmetric probability measure $P_N$ defined on the N-particle phase space $(\mathbf{R}^d)^N$. This expression defines a probability measure on the m-particle phase space $(\mathbf{R}^d)^m$, which is related to the m-th marginal of $P_N$ by a combinatorial argument that is the key to statement (a) in the theorem below.
More precisely, statement (a) uses this combinatorial argument together with the N -particle dynamics and relates the evolution of the m-particle marginal P N :m in the BBGKY hierarchy to tensor powers of the empirical measure, which is a measure valued solution of the mean field equation (8). In other words, statement (a) in the theorem below really bridges the two approaches to the mean field limit.
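One plausible reading of the combinatorial step is the following counting fact, sketched here under that assumption. The m-fold tensor power of an empirical measure on N points is an average over all maps from {1, ..., m} to {1, ..., N}. The injective maps, which are the ones that see m distinct particles, make up a fraction N!/((N-m)! N^m) of them, and the remaining fraction is at most m(m-1)/(2N). The helper names `injective_fraction`, `defect_bound` and `check` are hypothetical.

```python
from math import perm

def injective_fraction(m, n):
    """Fraction of maps {1..m} -> {1..n} that are injective: n! / ((n-m)! * n**m)."""
    return perm(n, m) / n**m

def defect_bound(m, n):
    """Elementary upper bound m(m-1)/(2n) on the fraction of non-injective maps."""
    return m * (m - 1) / (2 * n)

def check(m, n):
    """Verify 1 - n!/((n-m)! n^m) <= m(m-1)/(2n), up to rounding."""
    return 1.0 - injective_fraction(m, n) <= defect_bound(m, n) + 1e-12
```

The bound follows from writing the injective fraction as the product of the factors (1 - j/N) for j = 0, ..., m-1 and using the inequality that a product of numbers 1 - a_j with a_j in [0, 1] is at least 1 minus the sum of the a_j. For fixed m the defect vanishes like 1/N.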
Combining statement (a) with Dobrushin's inequality and a quantitative variant of the law of large numbers for the initial 1-particle distribution f in L d , we arrive at a chaoticity estimate, measuring the distance from P N : satisfy the symmetry condition (22), and let t → P N (t) be defined by (24). Then (a) From tensorized empirical measures to marginals For all t ∈ R and all N ∈ N * (32) 2N .
(b) Dobrushin's estimate for marginals For all t ∈ R, all N ∈ N * and all m = 1, . . . , N , and for all bounded and Lipschitz continuous φ m defined on (R d ) m , one has (c) Chaoticity estimate Assume that P in N = (f in L d ) ⊗N and that for all t ∈ R and all N ≥ m ≥ 1. In particular, for m = 1, one has dist MK,1 (P N : where F (m, N ) is the set of maps from {1, . . . , m} to {1, . . . , N }. Thus Next split the summation defining Φ m as On the other hand, the formula defines a positive Radon measure satisfying 2N following from Theorem 58 in [6].
Obviously π λ ∈ Π(λ ⊗ µ, λ ⊗ ν) so that The first inequality in the lemma follows from minimizing the right hand side above for π running through Π(µ, ν). The second inequality in the lemma is established by a similar argument.
Corollary 2. For all µ, ν ∈ P 1 (R d ) and all m ≥ 1, one has Proof. By the triangle inequality The general term in the summation above satisfies where the first inequality follows from the first inequality in the lemma and the second from the second inequality in the lemma. The first and last terms on the right hand side of (37) are estimated by applying respectively the first and the second inequalities in the lemma.

Proof of statements (b) and (c). Let
The first inequality above follows from statement (a) in the Theorem and the second inequality from the estimate on R N,m (t). The third inequality above follows from (19) and the fourth from the corollary above, while the fifth follows from (18). Since P N :m (t) is a probability measure on (R d ) m , the bound (36) and the chain of inequalities above imply statement (b) in the Theorem. Now for statement (c). First where the first inequality results from the ordering of Monge-Kantorovich distances (see formula (7.3), section §7.1.2 in [22]), while the second follows from the Cauchy-Schwarz inequality. By Theorem 1.1 of [10], (34). Together with (b), this implies the first inequality in (c). The second estimate in (c) is a consequence of the first in the case m = 1 and of (19).
The convergence rate in Theorem 3.1 (c) obviously depends on the quantitative chaoticity estimate in [10]. More information on analogous chaoticity estimates can be found in Lemma 4.2 of [13].

Weak stability of the BBGKY hierarchy
In this section, we establish a stability property of the BBGKY hierarchy in the weak topology of probability measures. This stability property is uniform as the particle number tends to infinity.
Our stability estimate uses the following variant of Monge-Kantorovich distance. Let P ∈ P 1 ((R d ) M ) and Q ∈ P 1 ((R d ) N ) satisfy the symmetry condition (22).
is the quotient of (R d ) M under the action of S M (resp. of (R d ) N under the action of S N ) defined by S σ as in (23). Consider Proof. First we establish statement (a). Let X M ∈ (R d ) M and Y N ∈ (R d ) N ; then t → µ T M t XM and t → µ T N t YN are two weak solutions of the Vlasov equation (8). By Dobrushin's estimate ; averaging both sides of this inequality with respect to ρ in gives Since this is true for all ρ in ∈ Π(P in M , Q in N ), minimizing the right hand side of the inequality above in ρ in establishes the inequality in (a).
As for statement (b), pick φ m to be a bounded and Lipschitz continuous function on (R d ) m . By formula (32) where R M,m (t) is the Radon measure defined in (35) and the S N,m (t) the analogue of R N,m (t) with Q in N replacing P in M in formula (35). Thus, by (33) and (36), one has and, by the same token On the other hand, by Corollary 2 and Dobrushin's inequality (18), Minimizing the integral on the right hand side of this inequality as ρ in runs through Π(P in M , Q in N ) leads to the inequality stated in (b).

Continuous dependence on the initial data of statistical solutions of the Vlasov mean field PDE
In this section, we identify each element P N of P 1 ((R d ) N /S N ) with the element of P(P 1 (R d )) defined as the push-forward of P N under the map Since R d endowed with the Euclidean distance is a complete metric space, P 1 (R d ) endowed with the Monge-Kantorovich distance dist MK,1 is a Polish space (see Proposition 7.1.5 in [1]). Define Obviously, any element P N of P 1 ((R d ) N /S N ) is identified with an element of Then the distance Dist MK,1 introduced in the previous section is extended to P 1 (P 1 (R d )) as follows: Let E be the set of probability measures P in ∈ P 1 (P 1 (R d )) such that The next result is a consequence of Theorem 4.1.
Theorem 5.1. There exists a unique 1-parameter group T ∞ t defined on E and satisfying the following properties: (a) For each P in ∈ E and each sequence P in is continuous for the distance Dist MK,1 . (c) For each P in ∈ E, the family P m defined for all m ≥ 1 by the formula is a solution to the mean field, infinite hierarchy (30) in the sense of distributions.
(d) For each P in , Q in ∈ E, one has The generator of the 1-parameter group T ∞ t can be computed explicitly; see Lemma 2.11 of [13] for a detailed description of this computation and of the pertaining functional framework.
The definition of the 1-parameter group T ∞ t in statement (a) above requires constructing at least one sequence P in N converging to P in for the distance Dist MK,1 . This can be done as explained in the following lemma.
Lemma 5.2. For each Q ∈ E and each N ≥ 1, set
$$Q_N := \int_{P_1(\mathbf{R}^d)} f^{\otimes N}\,Q(df).$$
Then the sequence $\operatorname{Dist}_{MK,1}(Q_N, Q)$ converges to 0 as N → ∞.
For each $f\in P_{d+5}(\mathbf{R}^d)$, applying either Theorem 1.1 of [10] or the strong law of large numbers shows that the sequence $f^{\otimes N}$ (or equivalently its push-forward by the map $(\mathbf{R}^d)^N\ni Z_N\mapsto\mu_{Z_N}\in P_1(\mathbf{R}^d)$) converges to $\delta_f$ for the distance $\operatorname{Dist}_{MK,1}$ as N → ∞. Since any probability measure Q ∈ E can be represented as a barycenter of $\delta_f$ by the (tautological) formula
$$Q = \int_{P_1(\mathbf{R}^d)}\delta_f\,Q(df),$$
it is natural to expect that Q is the limit for N → ∞ of the corresponding barycenters $Q_N$ of $f^{\otimes N}$ defined in Lemma 5.2. The missing details are explained in the proof below.
Proof. Observe that ρ(dZ N , df ) := f ⊗N (dZ N )Q(df ) is a coupling of Q N and Q. Indeed, for each φ ∈ C b ((R d ) N ) and each Φ continuous on P 1 (R d ) for the distance dist MK,1 , one has By Theorem 1.1 of [10], for all f ∈ P d+5 (R d ), i.e. Q-a.e. in P 1 (R d ) in view of (38). On the other hand (In the first inequality above, we have used the fact that for all λ, µ, ν ∈ P 1 (R d ), which is an obvious consequence of (19).) Since Q ∈ E, one has By dominated convergence as N → ∞, and this concludes the proof.
Proof of Theorem 5.1. Pick a sequence P in N of elements of P 1 ((R d ) N /S N ) for each N ≥ 1 such that Dist MK,1 (P in N , P in ) → 0 as N → ∞. (The construction in Lemma 5.2 provides one example of sequence P in N whenever P in ∈ E.) For each t ∈ R, we define P N (t) := T N t #P in N ; we recall that P N (t) ∈ P 1 ((R d ) N /S N ) by (26). Since P in N converges to P in for the distance Dist MK,1 , it is in particular a Cauchy sequence for that distance. Thus P N (t) is a Cauchy sequence for the distance Dist MK,1 for each t ∈ R, by Theorem 4.1 (a). Since P 1 (R d ) endowed with the distance dist MK,1 is a complete space, the set P 1 (P 1 (R d )) is a complete space for the distance Dist MK,1 (by Proposition 7.1.5 in [1]). Therefore, there exists a unique map R ∋ t → P (t) ∈ P 1 (P 1 (R d )) such that Dist MK,1 (P N (t), P (t)) → 0 as N → ∞ for each t ∈ R. Besides, the estimate in Theorem 4.1 (a) shows that this convergence is uniform in t ∈ [−T, T ] for each T > 0. Since P N is a continuous map on R with values in P 1 (P 1 (R d )) for the distance Dist MK,1 , we conclude that P is also continuous on R with values in P 1 (P 1 (R d )) for the distance Dist MK,1 . This proves (a) and (b).
Let P in , Q in ∈ P 1 (P 1 (R d )), and let P in N and Q in N be sequences of elements of P 1 ((R d ) N /S N ) converging respectively to P in and Q in in P 1 (P 1 (R d )) for the distance Dist MK,1 as N → ∞. As explained in the discussion above, the sequences P N (t) := T N t #P in N and Q N (t) := T N t #Q in N converge to P (t) := T ∞ t P in and Q(t) = T ∞ t Q in ∈ P 1 (P 1 (R d )) uniformly in t ∈ [−T, T ] for all T > 0 as N → ∞, and passing to the limit as N → ∞ in the inequality of Theorem 4.1 implies that This proves (d). In particular P (t) = Q(t) for all t ∈ R if P in = Q in . In other words, the function t → P (t) is uniquely determined by the initial condition P in . Since Dist MK,1 (P N (t), P (t)) → 0 as N → ∞, one has On the other hand for all N, R ≥ 1 and all t ∈ R by (31). Applying Fatou's lemma shows that P (t) ∈ E for all t ∈ R whenever P in ∈ E. Since for all t, s ∈ R and all N ≥ 1 , we conclude that T ∞ t defines a 1-parameter group on E.
To conclude this section, let us compare Theorem 5.1 and Spohn's theorem [18]. Theorem 5.1 implies that weak solutions of the infinite mean field hierarchy obtained as limits of the BBGKY hierarchy are uniquely determined by their initial condition.
Spohn's theorem [18] states that any weak solution to the infinite mean field hierarchy that is weakly differentiable in the time variable t is uniquely determined by its initial condition. To be more precise, a time-dependent sequence is differentiable on R for all m ≥ 1 and all φ m ∈ C 1 c ((R d ) m ). Such a sequence is said to be a weak solution to the infinite mean field hierarchy if for all t ∈ R, all m ≥ 1 and all φ m ∈ C 1 c ((R d ) m ). In other words, let (P m ) m≥1 be a weak solution to the infinite mean field hierarchy that is weakly differentiable in t and satisfies P m (0) ∈ P 1 ((R d ) m /S m ) for each m ≥ 1. By the Hewitt-Savage theorem [9], let P in be the unique element of P(P 1 (R d )) such that Finally, let V t be the 1-parameter group defined on P 1 (R d ) by the relation where µ is the solution to the Cauchy problem (17). Then one has for all t ∈ R and all m ≥ 1. (Indeed, the sequence ((V t f ) ⊗m ) m≥1 is a solution to the infinite mean field hierarchy for each f ∈ P 1 (R d ). Hence, by linearity, the sequence on the right hand side of the formula above defines a weak solution to the infinite mean field hierarchy that is weakly differentiable in time and coincides with (P m (t)) m≥1 for t = 0. By Spohn's uniqueness theorem, these sequences must coincide for all t ∈ R.) In the language of Theorem 5.1, Spohn's uniqueness theorem implies that for all t ∈ R and all P in ∈ E .
Notice that Spohn's uniqueness theorem is not a consequence of our proof of Theorem 5.1 (d). Since the stability inequalities in that theorem are obtained by passing to the large N limit in the corresponding inequalities for the BBGKY hierarchy in Theorem 4.1, they apply only to solutions of the infinite hierarchy that are obtained as limits for N → ∞ of solutions of the N -particle BBGKY hierarchy. Of course, with the additional information contained in Spohn's uniqueness theorem, the stability estimate in Theorem 5.1 (d) holds for all weak solutions of the infinite mean field hierarchy that are weakly differentiable in time, and whose initial condition is defined by (39) with P in ∈ E.
Let us explain how the discussion in this section is related to the notion of "statistical solution" of the mean field PDE (8). This notion is very clearly explained by Spohn (see [18] on p. 448, especially formulas (1.16)-(1.19)). We briefly recall Spohn's point of view for the reader's convenience. Let $v\in C^1(\mathbf{R}^n;\mathbf{R}^n)$ satisfy |v(x)| = O(|x|) as |x| → ∞; by the Cauchy-Lipschitz theorem, the vector field v generates a global flow $X_t$ on $\mathbf{R}^n$. In other words, for each $x\in\mathbf{R}^n$, the solution of the Cauchy problem
$$\dot y(t) = v(y(t)), \qquad y(0) = x \tag{42}$$
is the trajectory $t\mapsto X_t(x)$ going through x at time t = 0. Consider instead of a single initial point x a cloud of initial data distributed under $p^{in}\in P(\mathbf{R}^n)$. It is transported by the flow into a cloud of points which, at time t, are distributed under $X_t\# p^{in}$. It is therefore natural to think of the map $t\mapsto X_t\# p^{in}$ as a "statistical solution" of the ODE (42). If one replaces $\mathbf{R}^n$ with $P_1(\mathbf{R}^d)$ and the ordinary differential equation (42) with the Vlasov equation (17), it is equally natural to think of the map $t\mapsto V_t\# P^{in}$ as the statistical solution of the Vlasov equation (17) starting from $P^{in}\in E$ at time t = 0. Then the equality (41) implies that our Theorem 5.1 establishes the Lipschitz continuous dependence on the initial data of statistical solutions of the Vlasov equation in the distance $\operatorname{Dist}_{MK,1}$.
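The finite-dimensional picture of a statistical solution can be sketched in a few lines, under simplifying assumptions that are not in the text: dimension n = 1, the linear vector field v(x) = -x (whose flow is known in closed form), and an initial law given by finitely many weighted points. The names `flow`, `push_forward` and `mean` are hypothetical.

```python
import math

def flow(t, x):
    """Flow X_t of the illustrative vector field v(x) = -x, namely X_t(x) = x * exp(-t)."""
    return x * math.exp(-t)

def push_forward(t, cloud):
    """Push a discrete law forward under X_t.

    `cloud` is a list of (point, weight) pairs with weights summing to 1;
    the push-forward moves each point along the flow and keeps its weight."""
    return [(flow(t, x), w) for x, w in cloud]

def mean(cloud):
    """First moment of a discrete law given as (point, weight) pairs."""
    return sum(x * w for x, w in cloud)
```

In this toy case the statistical solution is the map sending t to `push_forward(t, cloud)`. The group property of the flow carries over to the push-forward, which mirrors the 1-parameter group structure of the statistical solutions of the Vlasov equation discussed above.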

Appendix: Spohn's uniqueness theorem
For the sake of completeness, we recall Spohn's uniqueness theorem [18], and briefly sketch its proof.
Theorem 6.1. Let $\mathbf{R}\ni t\mapsto P(t)\in E$ be such that the sequence $t\mapsto(P_m(t))_{m\ge1}$ defined by
$$P_m(t) := \int_{P_1(\mathbf{R}^d)} f^{\otimes m}\,P(t,df) \tag{43}$$
is a weak solution to the infinite mean field hierarchy (30) that is weakly differentiable in t ∈ R. Then $P(t) = V_t\# P(0)$ for all t ∈ R, where $V_t$ is the 1-parameter group such that $t\mapsto V_t\mu^{in}$ is the solution to the Cauchy problem (17).
Returning to the analogy between the ODE (42) and the mean field PDE (8) recalled at the end of section 5, weakly differentiable weak solutions of the infinite mean field hierarchy are analogous to weak solutions of the transport equation
$$\partial_t p + \operatorname{div}_x(p\,v) = 0, \qquad p\big|_{t=0} = p^{in}. \tag{44}$$
In this analogy, Spohn's uniqueness theorem is analogous to the method of characteristics, from which we conclude that the unique weak solution of (44) is given by the formula $p(t) = X_t\# p^{in}$. Thus Spohn's theorem can be rephrased as follows: any weakly differentiable weak solution $(P_m)_{m\ge1}$ of the infinite mean field hierarchy is represented by the Hewitt-Savage theorem (i.e. formula (43)) in terms of a unique E-valued map $t\mapsto P(t)$ that is a statistical solution of the mean field PDE. The argument below is essentially similar to the original proof on pp. 449-453 in [18], although slightly simpler in places. For m = 0, we use the notation $M_0(f) := 1$ for all $f\in P_1(\mathbf{R}^d)$.
We also need the notation Thus, the m-th equation in the infinite mean field hierarchy (30) takes the form where A * n [ζ] designates the (formal) adjoint of A n [ζ]. For each n ≥ m ≥ 1 and each φ m ∈ C 1 c ((R d ) m ), consider the function (In the expression φ m • T n s , the function φ m is viewed as function of n ≥ m variables.) Observe first that On the other hand, since P m is a weak solution to the m-th equation in the infinite mean field hierarchy (30), one has Splitting and observing that (48) for all s, t ∈ R as n → ∞. Besides Φ n L ∞ (R 2 ) ≤ φ m L ∞ ((R d ) m ) for all n ≥ 1 so that the convergence(50) holds in the sense of distributions on R 2 .
On the other hand Since P (t) ∈ E, by the same argument as in the proof of Lemma 5.2, one has By the Stone-Weierstrass theorem, the algebra R ⊕ span{M m [φ m ] s.t. m ≥ 1 and φ m ∈ C 1 c ((R d ) m )} is dense in C(P 1 (R d )), so that (52) implies the conclusion of Spohn's theorem.
Footnote 5. By considering $\Phi_n$ instead of $\Phi$, we have avoided computing explicitly $\partial_t\Phi$ as in [18]. The identity (48) shows that the first term in the decomposition (47) exactly cancels with the expression under the bracket on the right hand side of (46). Thus we do not need to pass to the large n limit in these expressions, so that the discussion in formulas (2.11-17) and (2.27) of [18] becomes unnecessary. The analysis in formulas (2.17-24) in [18] is roughly equivalent to our proof that $(\partial_t-\partial_s)\Phi_n\to0$. Our proof of (51) is essentially equivalent to the analysis in formulas (2.18-22) of [18].
It remains to establish the estimate (51) above. By the chain rule where $a_{lk}(s, Z_n)$ is the entry of the Jacobian matrix of $T^n_s$ at $Z_n$ on the l-th row and the k-th column. Differentiating each side of the l-th equation in (7) with respect