Existence of probability measure valued jump-diffusions in generalized Wasserstein spaces

We study existence of probability measure valued jump-diffusions described by martingale problems. We develop a simple device that allows us to embed Wasserstein spaces and other similar spaces of probability measures into locally compact spaces where classical existence theory for martingale problems can be applied. The method allows for general dynamics including drift, diffusion, and possibly infinite-activity jumps. We also develop tools for verifying the required conditions on the generator, including the positive maximum principle and certain continuity and growth conditions. To illustrate the abstract results, we consider large particle systems with mean-field interaction and common noise.


Introduction
In this paper we study existence of probability measure valued jump-diffusions, whose dynamics is specified by means of a martingale problem. Processes taking values in spaces of probability measures play an important role in a number of applied contexts. This includes population genetics (see Etheridge (2011) for an overview), stochastic partial differential equations (see e.g. Florchinger and Le Gland (1992) and Kurtz and Xiong (1999) among many others), statistical physics (see Huang (1987) for an overview), optimal transport (see Villani (2008) for an overview), and mathematical finance, in particular stochastic optimal control, McKean-Vlasov equations, and mean field games (see e.g. Carmona and Delarue (2017) and the references given there) and stochastic portfolio theory (see e.g. Fernholz (2002), Fernholz and Karatzas (2009), and Cuchiero (2019)).
The mathematical theory of probability measure valued processes has a long history going back to Watanabe (1968), Dawson (1977, 1978), and Fleming and Viot (1979). We refrain from a full literature review, and only mention the remarkable collection of St. Flour lecture notes of Sznitman (1991), Dawson (1993), and Perkins (2002), as well as the work of Ethier and Kurtz (1987, 1993, 2005). Much of the classical literature on measure valued processes works with the weak topology on the space M 1 (R d ) of all probability measures on R d (or on some other relevant underlying space). There are, however, other interesting topologies that one can place on spaces of probability measures, which are more appropriate in certain situations. Prominent examples are the topologies induced by Wasserstein metrics on the spaces P p (R d ) of probability measures with finite p-th moments. A basic reason for considering such stronger topologies is to ensure that the coefficients which specify the dynamics of the system are continuous functions of the current state.
The price to pay is that the classical existence theory for martingale problems becomes more difficult to apply. As a result, most proofs of existence of measure valued processes proceed instead via interacting particle systems and a passage to the large-population limit (see for instance the approach presented by Dawson and Vaillancourt (1995)). In this paper we prove existence for the limiting system directly, without passing through particle systems.
A key difficulty in using the martingale problem is related to the fact that (say) the Wasserstein space P p (R d ) is not straightforward to compactify. To illustrate this, consider first M 1 (R d ) with the topology of weak convergence. This space fails to be locally compact, and hence does not admit a standard one-point compactification. However, the space M 1 ((R d ) ∆ ) of probability measures on the one-point compactification of R d is compact, and thus fits naturally with classical machinery. This simple procedure does not work for P p (R d ).
In this paper we develop a simple device for embedding P p (R d ), and other similar spaces, into locally compact spaces where the classical existence theory of martingale problems can be applied. This allows us to establish existence of solutions for martingale problems in spaces of this kind. The operators for which the martingale problem is solved can be very general, including drift, diffusion, and jumps, which may be of infinite activity and even non-summable.
We start in Section 2 by reviewing some facts about martingale problems. The core of the paper is Section 3, where we state and prove our main abstract result, Theorem 3.4. There we consider a linear operator L on a carefully chosen domain of test functions. A key assumption on L is, as one would expect, that it satisfy the positive maximum principle. Since L acts on functions of probability measures, it may not be obvious how to verify the positive maximum principle in practice. To remedy this, we develop necessary conditions for optimality, see Theorem 5.1, that can be used to verify the positive maximum principle for operators of Lévy type, introduced in Section 4. This extends results in . Furthermore, in addition to the positive maximum principle, we impose certain continuity and growth conditions on L. In Section 6 we develop tools to aid the verification of these conditions. Finally, in Sections 7 and 8, we discuss some applications that illustrate the scope of the abstract theory. These applications are primarily related to large particle systems with mean-field interaction, where the particles are subject to common noise. In such systems, the limiting empirical distribution of the particles evolves as a probability measure valued stochastic process, whose dynamics can often be described in terms of a martingale problem of the type considered here.
The following notation is used throughout the paper. For a locally compact Polish space E, we let M + (E) denote the Polish space of positive measures on E, and M 1 (E) the subspace of probability measures. We also write M (E) = M + (E) − M + (E) for the space of signed measures on E of bounded variation. These spaces are sometimes considered with the topology of weak convergence (defined using bounded continuous functions and denoted µ n ⇒ µ) or vague convergence (defined using continuous functions vanishing at infinity). We remark that if E is compact, then M 1 (E) is compact and M + (E) is locally compact. However, if E is noncompact, M 1 (E) is not even locally compact. See for instance Remark 13.14(iii) and Corollary 13.30 in Klenke (2013) for more details.
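The distinction between weak and vague convergence on a noncompact E can be seen already for the Dirac sequence δ n on E = R: tested against functions vanishing at infinity the integrals tend to zero, while the total mass stays equal to one. A minimal numerical sketch (the test functions are chosen for illustration only):

```python
import math

# On the noncompact space E = R, the Dirac measures mu_n = delta_n converge
# vaguely (tested against functions vanishing at infinity) to the zero measure,
# but not weakly: against the bounded continuous function phi = 1 the
# integrals stay equal to 1.

def integrate_dirac(phi, x):
    """Integral of phi against the Dirac measure delta_x."""
    return phi(x)

phi_c0 = lambda x: math.exp(-x ** 2)  # a function vanishing at infinity
phi_bdd = lambda x: 1.0               # bounded continuous, not vanishing at infinity

vague_vals = [integrate_dirac(phi_c0, n) for n in range(1, 30)]
weak_vals = [integrate_dirac(phi_bdd, n) for n in range(1, 30)]

print(vague_vals[-1])  # ~ 0: the mass escapes to infinity
print(weak_vals[-1])   # = 1.0: the total mass does not vanish
```

This is why M 1 (E) fails to be compact, or even locally compact, for noncompact E: mass can escape to infinity without a limit measure existing.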
Martingale problems and the positive maximum principle

Let X be a Polish space, D ⊆ C(X ) a linear subspace, and consider a linear operator
(2.1) L : D → C(X ).
In this paper, X will be a subset of M (E) for some closed subset E ⊆ R d , or of M (E ∆ ) where E ∆ is the one-point compactification of E. The topology on X will however not always be the subspace topology (i.e. the topology of weak convergence). Moreover, the functions in D will usually be defined on a larger subset of M (E) than X , in which case the condition D ⊆ C(X ) just means that f | X ∈ C(X ) for every f ∈ D.
Definition 2.1. An X -valued RCLL process X, defined on some filtered probability space, is called a solution to the martingale problem for (L, D, X ) with initial condition µ ∈ X if X 0 = µ and
f (X t ) − ∫_0^t Lf (X s ) ds, t ≥ 0,
is a local martingale for every f ∈ D.
It is convenient to allow solutions to the martingale problem to leave the state space. If X is locally compact, this is formalized via a one-point compactification of X . A similar procedure works more generally. Fix a cemetery state † ∉ X . Define X † = X ∪ {†}, let D † consist of all functions f : X † → R with f (†) = 0 and f | X ∈ D, and define L † f by L † f = Lf on X and L † f (†) = 0. Assume that the given Polish topology on X can be extended to a Polish topology on X † in such a way that both D † and L † (D † ) are contained in C(X † ). For example, this is the case if X is locally compact, X † is the one-point compactification, and both D and L(D) are contained in C 0 (X ). If X is not locally compact, then the one-point compactification is not available, and other constructions must be used. This situation arises, for instance, when X is a Wasserstein space of probability measures.
Definition 2.2. A solution X to the martingale problem for (L † , D † , X † ) with initial condition µ ∈ X is called a possibly killed solution to the martingale problem for (L, D, X ) with initial condition µ.

For definiteness, we now suppose that X is a subset of M (E). We also suppose that for f ∈ D, both f and Lf are defined on all of M (E). The following classical definition is useful because it can very often be checked in practice.
Definition 2.3. The operator L is said to satisfy the positive maximum principle at µ ∈ X if Lf (µ) ≤ 0 holds for every f ∈ D such that f (µ) = max X f ≥ 0. If this holds for all µ ∈ X , then L is said to satisfy the positive maximum principle on X .
The positive maximum principle directly implies that Lf | X only depends on f | X and not on the values f takes outside X . Thus, if L satisfies the positive maximum principle on X , it can be regarded as an operator sending functions on X to functions on X , consistent with (2.1). The positive maximum principle is linked to existence of solutions to the martingale problem. The following classical result deals with the locally compact case. The nontrivial part is the forward implication, whose proof can be found, e.g., in (Ethier and Kurtz, 2005, Theorem 4.5.4).
Theorem 2.4. Assume X is locally compact, D ⊆ C 0 (X ) is dense, and L(D) ⊆ C 0 (X ). Then L satisfies the positive maximum principle on X if and only if there exists a possibly killed solution to the martingale problem for (L, D, X ) for every initial condition µ ∈ X .
One is often interested in solutions that are not killed. A general condition for this is that there exist functions f n ∈ D such that f n → 1 and (Lf n ) − → 0 in the bounded pointwise sense, where (Lf n ) − denotes the negative part of Lf n . This follows from a slight modification of (Ethier and Kurtz, 2005, Theorem 4.3.8 and Remark 4.5.5).
Since M 1 (E) is compact whenever E ⊂ R d is compact, we obtain the following result as a direct application of Theorem 2.4.
Corollary 2.5. Assume E ⊆ R d is compact, and set X = M 1 (E). Suppose D ⊆ C(X ) is a dense subset containing the constant function 1, and L(D) ⊆ C(X ). Then L satisfies the positive maximum principle on X if and only if there exists a possibly killed solution to the martingale problem for (L, D, X ) for every initial condition µ ∈ X . If additionally L1 = 0, then every such solution X satisfies X t ∈ M 1 (E) for all t ≥ 0, and is thus a solution to the martingale problem for (L, D, X ).

Main result
Fix a continuous function w : R d → (0, ∞) satisfying
(3.1) lim |x|→∞ w(x) = ∞,
and fix a closed subset E ⊆ R d . Define P w = {µ ∈ M 1 (E) : ⟨w, µ⟩ < ∞}, the set of probability measures on E with finite w-moment, topologized by the following notion of convergence: µ n → µ if and only if µ n ⇒ µ and ⟨w, µ n ⟩ → ⟨w, µ⟩. This turns P w into a Polish space. A possible choice of metric is
(3.2) d w (µ 1 , µ 2 ) = d(µ 1 , µ 2 ) + d(wµ 1 , wµ 2 ),
where d( · , · ) is the Prokhorov metric on M + (E), and the measures wµ i are given by (wµ i )(dx) = w(x)µ i (dx). The Prokhorov metric is discussed in detail in Section 3.1 of Ethier and Kurtz (2005). See also the discussion after Example A.42 in Föllmer and Schied (2004).
Example 3.1. If w(x) = |x| p outside some ball around the origin, then P w is the set of probability measures on E with finite p-th moment, and the metric (3.2) generates the same topology as the Wasserstein p-distance W p .
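The difference between the weak topology and the Wasserstein topology of Example 3.1 can be illustrated numerically: the measures µ n = (1 − 1/n)δ 0 + (1/n)δ n converge weakly to δ 0 , but their first moments equal 1 for every n, so they do not converge in P w for w as in Example 3.1 with p = 1. A minimal pure-Python sketch, computing W 1 directly from the CDFs of discrete measures on R:

```python
def w1_discrete(atoms_u, weights_u, atoms_v, weights_v):
    """Wasserstein-1 distance between two discrete measures on R,
    computed as the integral of |F_u - F_v| between consecutive atoms."""
    points = sorted(set(atoms_u) | set(atoms_v))
    cdf = lambda atoms, weights, x: sum(m for a, m in zip(atoms, weights) if a <= x)
    total = 0.0
    for left, right in zip(points[:-1], points[1:]):
        total += abs(cdf(atoms_u, weights_u, left)
                     - cdf(atoms_v, weights_v, left)) * (right - left)
    return total

for n in [10, 100, 1000]:
    # mu_n = (1 - 1/n) delta_0 + (1/n) delta_n converges weakly to delta_0,
    # yet a small mass 1/n sits at the faraway point n.
    d = w1_discrete([0.0, float(n)], [1 - 1 / n, 1 / n], [0.0], [1.0])
    moment = (1 - 1 / n) * 0.0 + (1 / n) * n  # first moment of mu_n
    print(n, d, moment)  # W_1 distance and moment both stay near 1
```

This is exactly the phenomenon behind the stronger topology: weak convergence alone does not control the moments that the coefficients of the dynamics may depend on.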
We will use the class D w of test functions defined in (3.3). With the convention exp(−⟨w, µ⟩) = 0 if ⟨w, |µ|⟩ = ∞, functions in D w can be evaluated at any µ ∈ M (E). We will obtain possibly killed solutions to the martingale problem for (L, D w , P w ), where L is an operator satisfying suitable assumptions. In order to do so, fix a cemetery state † and define P † w = P w ∪ {†}. The topology is extended to P † w by declaring that a sequence of measures µ n ∈ P w converges to † if ⟨w, µ n ⟩ → ∞. Thus lim µ→† f (µ) = 0 for any f ∈ D w , so that D † w as defined in Section 2 is indeed contained in C(P † w ). If one assumes that lim µ→† Lf (µ) = 0 for every f ∈ D w , which we shall, it follows that L † (D † w ) ⊆ C(P † w ) as well, where L † is defined as in Section 2. This allows us to speak about possibly killed solutions to the martingale problem.
If E is compact, then P w = M 1 (E) is also compact, and Corollary 2.5 yields a satisfactory existence theory for the martingale problem. From now on we consider the opposite situation, and assume that E is not compact.
In this case P w is not even locally compact, and the classical results are not directly applicable. Instead, we will embed P w into a space that is locally compact, where Theorem 2.4 can be applied. To describe this embedding, let
(3.4) T (µ)(dx) = w(x)µ(dx),
regarded as a measure on the one-point compactification E ∆ , which defines a topological embedding of P w into M + (E ∆ ). Observe that ⟨1, T (µ)⟩ = ⟨w, µ⟩ for all µ ∈ P w . Let X denote the weak closure of T (P w ); this will serve as state space for an auxiliary martingale problem. Since X is a closed subset of the locally compact Polish space M + (E ∆ ), it is itself locally compact Polish. This places us in the framework of Theorem 2.4. Note that we have the explicit description
(3.5) X = {ν ∈ M + (E ∆ ) : ⟨w −1 , ν⟩ = 1},
using the convention w −1 (∆) = lim |x|→∞ w −1 (x) = 0. In particular, a measure ν ∈ X lies in T (P w ) if and only if it does not charge ∆. Using T , any martingale problem with state space P w and operator f ↦ Lf can be regarded as a martingale problem with state space T (P w ) and operator f̃ ↦ L(f̃ ∘ T ) ∘ T −1 . Our strategy is to extend this to a martingale problem with state space X and then show that the solution does not charge ∆ and thus actually lies in T (P w ). This gives a solution to the original martingale problem. These steps depend in a somewhat delicate way on the particular choice (3.3) of test functions. In particular, the function f ∘ T −1 obtained by pushing forward a function f ∈ D w using T needs to be extendable to a function in C 0 (X ). This is captured by the following definition.
Definition 3.2. A function f : P w → R is said to be of C 0 type if f ∘ T −1 : T (P w ) → R admits an extension belonging to C 0 (X ).

It is clear that sums and products of functions of C 0 type are again of C 0 type; these functions thus form an algebra. Since ν ↦ ⟨ϕ, T −1 (ν)⟩ = ⟨ϕw −1 , ν⟩ is continuous on X for any ϕ ∈ C ∞ c (R d ), the product of µ ↦ ⟨ϕ, µ⟩ and a function of C 0 type is again of C 0 type. Also, µ ↦ e −⟨w,µ⟩ is certainly of C 0 type. We deduce in particular that every f ∈ D w is of C 0 type.

Example 3.3. Suppose E = R and let f (µ) = ⟨ϕ, µ⟩ e −⟨w,µ⟩ for some ϕ ∈ C(R). When is f of C 0 type? Set µ n = w(n) −1 δ n + (1 − w(n) −1 )δ 1 ∈ P w . Then µ n does not converge to any element of P w , but ν n = T (µ n ) = δ n + w(1)(1 − w(n) −1 )δ 1 converges to δ ∆ + w(1)δ 1 in M + (R ∆ ). On the other hand, lim n f ∘ T −1 (ν n ) = (lim n ϕ(n)/w(n) + ϕ(1)) e −1−w(1) only exists if ϕ(n)/w(n) has a finite limit. By considering similar sequences µ n , one sees that f is of C 0 type if and only if ϕ(x)/w(x) has a finite limit as x → ∆.
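The sequence from Example 3.3 can also be checked numerically, assuming the embedding acts as T (µ)(dx) = w(x)µ(dx), consistently with the identity ⟨w −1 , ν⟩ = 1 on T (P w ). A minimal sketch with discrete measures stored as (atom, weight) pairs and the hypothetical weight w(x) = 1 + x²:

```python
import math

w = lambda x: 1.0 + x ** 2  # hypothetical weight with w(x) -> infinity

def T(mu):
    """Embed a discrete mu = [(atom, weight), ...] as the measure w * mu."""
    return [(x, w(x) * m) for x, m in mu]

def integrate(phi, nu):
    """Integral of phi against a discrete measure nu."""
    return sum(phi(x) * m for x, m in nu)

for n in [10, 100, 1000]:
    mu_n = [(n, 1 / w(n)), (1, 1 - 1 / w(n))]  # the measures from Example 3.3
    nu_n = T(mu_n)
    # <w^{-1}, T(mu)> = <1, mu> = 1: the embedded measures satisfy the
    # constraint defining the closure X.
    mass = integrate(lambda x: 1 / w(x), nu_n)
    # The atom at n carries T-mass w(n) * w(n)^{-1} = 1, which escapes to
    # Delta as n grows, producing the limit delta_Delta + w(1) delta_1.
    print(n, mass, nu_n[0][1])
```

The escaping unit of T-mass is precisely what the coordinate ν({∆}) records in the compactified state space.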
The following is the main result of this paper. To state it, for any constant c ≥ 1 we define the compact subset X c = {ν ∈ X : ⟨1, ν⟩ ≤ c}. The meaning of the conditions of the theorem, and examples of how they can be verified, are discussed in later sections.
Theorem 3.4. Consider a linear operator L : D w → C(P w ), and assume the following conditions are satisfied:
(i) L satisfies the positive maximum principle on P w ,
(ii) Lf is of C 0 type for every f ∈ D w ,
(iii) for every constant c ≥ 1, there exist a function f̃ : X → R and pairs (f̃ m , g̃ m ) in the bp-closure of the restricted graph (3.6) such that the (f̃ m , g̃ m + ) are uniformly bounded in m, and
(a) f̃ m → f̃ pointwise, f̃ ≥ 0, and f̃ (ν) = 0 for ν ∈ X c if and only if ν({∆}) = 0,
(b) lim sup m→∞ g̃ m + ≤ c′ f̃ | X c pointwise for some constant c′.
Then there exists a possibly killed solution to the martingale problem for (L, D w , P w ) for every initial condition µ ∈ P w . Furthermore, assume that
(iv) there exist functions f n ∈ D w such that (f n , (Lf n ) − ) → (1, 0) in the bounded pointwise sense.
Then every possibly killed solution X to the martingale problem for (L, D w , P w ) satisfies X t ∈ P w for all t ≥ 0, and is thus a solution to the martingale problem for (L, D w , P w ).
Remark 3.5. While our focus is the case where (3.1) holds, one could also take w ≡ 1. In this case P w = M 1 (E) is the set of all probability measures on E with the topology of weak convergence. A slight modification of our main result holds also for this case. Specifically, letting D w denote the algebra generated by ϕ, µ with ϕ ∈ R + C ∞ c (R d ), Theorem 3.4 remains true as stated. Note that condition (iii) only needs to be verified for c = 1. The proof remains unchanged, apart from slightly different arguments in Lemma 3.6(i)-(ii) below.
The rest of this section is devoted to the proof of Theorem 3.4, so we now assume that its conditions are satisfied. As discussed above, the proof uses the embedding T in (3.4) to transform the original martingale problem into an auxiliary martingale problem on the state space X in (3.5). The domain D of test functions for the auxiliary martingale problem is given in (3.7). The elements of D can be evaluated at any ν ∈ M (E ∆ ), with the conventions ϕ(∆) = 0 and 1(∆) = 1. Note also that D ⊂ C 0 (X ), and that its elements f̃ satisfy f̃ ∘ T ∈ D w . Due to Theorem 3.4(ii), we can then define a linear operator L̃ : D → C 0 (X ) by the formula
(3.8) L̃ f̃ = (L(f̃ ∘ T )) ∘ T −1 ,
where the right-hand side is understood as its continuous extension to X .

Lemma 3.6. We have the following properties.
(ii): If ν * itself lies in T (P w ), simply take ν n = ν * for all n. We thus assume that this is not the case, which means that ν * ({∆}) > 0. The measures ν n are defined in (3.9); since ⟨w −1 , ν n ⟩ = 1, they lie in T (P w ). Fix any f̃ ∈ D and observe that it admits a representation in terms of some polynomial p on R m and some ϕ 1 , . . . , ϕ m ∈ C ∞ c (R d ). For all sufficiently large n, x n lies outside the supports of all the ϕ i , and for all such n we compare f̃ (ν n ) with f̃ (ν * ). Therefore, if t n is chosen so that (3.10) holds, it follows that f̃ (ν n ) = f̃ (ν * ). To see that this is possible, let α(t) denote the right-hand side of (3.10), with t n replaced by t. Then t ↦ α(t) is continuous and strictly increasing with lim t→∞ α(t) = ∞ and lim t→−∞ α(t) = −∞. Therefore α has a continuous inverse α −1 (s) which satisfies lim s→∞ α −1 (s) = ∞. We now define t n accordingly. Since lim |x|→∞ w(x) = ∞, we have t n → ∞, and since (3.10) holds, we have t n −1 w(x n ) → λ. It is then clear from (3.9) that ν n ⇒ ν * . Therefore, after discarding the finitely many ν n for which x n lies in the support of some ϕ i , the measures ν n satisfy the desired properties.
Lemma 3.7. The operator L̃ satisfies the positive maximum principle on X .
Proof. Let f̃ ∈ D and ν * ∈ X be such that f̃ (ν * ) = max X f̃ ≥ 0. By Lemma 3.6(ii), there exist measures ν n ∈ T (P w ) with ν n ⇒ ν * and f̃ (ν n ) = f̃ (ν * ) for all n ∈ N. In particular, we have f̃ (ν n ) = max T (P w ) f̃ ≥ 0 for all n. Thus, the function f = f̃ ∘ T ∈ D w attains a nonnegative maximum over P w at the point µ n = T −1 (ν n ). Since L satisfies the positive maximum principle on P w , we get L̃ f̃ (ν n ) = Lf (µ n ) ≤ 0. Sending n to infinity and using that L̃ f̃ is continuous on X yields L̃ f̃ (ν * ) ≤ 0. This shows that L̃ satisfies the positive maximum principle on X , as claimed.
Proof of Theorem 3.4. We have established that X is locally compact, that D ⊆ C 0 (X ) is dense, and that L̃(D) ⊆ C 0 (X ). Since L̃ satisfies the positive maximum principle on X , Theorem 2.4 yields a possibly killed solution to the martingale problem for (L̃, D, X ) for any initial condition ν ∈ X . The state space X † for the possibly killed solution is the one-point compactification of X , and L̃ † and D † are as in Section 2.
Fix µ ∈ P w and let Y be a solution with initial condition ν 0 = T (µ). We may suppose that † is an absorbing state. Since the right-hand side is a local martingale, it follows that X is a possibly killed solution to the martingale problem for (L, D w , P w ) with initial condition µ.
We must still argue that Y takes values in T (P w ) ∪ {†}. Fix any c ≥ max{1, ⟨1, ν 0 ⟩}, and define the stopping time τ = inf{t ≥ 0 : ⟨1, Y t ⟩ > c}, with the convention ⟨1, †⟩ = ∞. Since Y is a possibly killed solution to the martingale problem, an application of the optional stopping theorem yields (3.11) for every t ≥ 0 and every (f̃ , g̃) in the restricted graph (3.6). Since f̃ (†) = 0, the indicator on the left-hand side of (3.11) is redundant. By dominated convergence, (3.11) remains true for all (f̃ , g̃) in the bp-closure of the restricted graph (3.6). Now the indicator is needed, since these functions are not defined at †. Let now f̃ and (f̃ m , g̃ m ) be as given in Theorem 3.4(iii). By dominated convergence, (3.11), and the conditions in Theorem 3.4(iii), we obtain a Gronwall-type bound. By the Gronwall inequality (see Theorem 5.1 in the Appendixes of Ethier and Kurtz (2005)), we conclude that the expectation of f̃ (Y t∧τ ) on the event {Y t∧τ ≠ †} vanishes, so that Y does not charge ∆ before time τ . Since the constant c was arbitrary, and since † is an absorbing state, it follows that Y takes values in T (P w ) ∪ {†}, as desired.

Finally, suppose condition (iv) in Theorem 3.4 is in force, and consider the functions f n given there. Let X be a possibly killed solution to the martingale problem for (L, D w , P w ) with initial condition µ ∈ P w . We then get the required convergence for every fixed t. As a result, X t ∈ P w for all t ≥ 0, as claimed.

Lévy type operators
Operators L : D w → C(P w ) that satisfy the positive maximum principle are integro-differential operators of Lévy type, which we now introduce. Such operators involve derivatives of functions f of measure arguments. We define
∂ x f (µ) = lim t→0+ (f (µ + tδ x ) − f (µ))/t
for any x ∈ R d and any µ ∈ M (R d ) for which the limit exists, and we write ∂f (µ) for the map x ↦ ∂ x f (µ); higher order derivatives ∂ k f (µ) are defined by iterating this procedure. Define also the function space C ∞ w of smooth functions ϕ growing at most like w, so that ⟨ϕ, µ⟩ is well-defined whenever ϕ and w are µ-integrable. One also has the product rule ∂(f g) = f ∂g + g∂f . In particular, every test function f ∈ D w is infinitely differentiable and, for each k, the k-th derivative ∂ k x 1 ,...,x k f (µ) is jointly continuous in (x 1 , . . . , x k , µ) ∈ (R d ) k × P w , and the map ∂ k f (µ) lies in (C ∞ w ) ⊗k for every fixed µ.
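The derivative ∂ x f (µ), understood as the directional derivative of f at µ in the direction δ x , can be sanity-checked by finite differences. A minimal sketch for f (µ) = ⟨ϕ, µ⟩e −⟨w,µ⟩ , whose derivative is ∂f (µ) = (ϕ − ⟨ϕ, µ⟩w)e −⟨w,µ⟩ by the product rule; the choices of ϕ and w below are hypothetical:

```python
import math

phi = lambda x: math.exp(-x ** 2)  # hypothetical smooth test function
w = lambda x: 1.0 + x ** 2         # hypothetical weight function

def f(mu):
    """f(mu) = <phi, mu> * exp(-<w, mu>) for discrete mu = [(atom, weight), ...]."""
    pair = lambda g: sum(g(x) * m for x, m in mu)
    return pair(phi) * math.exp(-pair(w))

def d_f(mu, x):
    """Closed form (phi(x) - <phi, mu> w(x)) exp(-<w, mu>), via the product rule."""
    pair = lambda g: sum(g(y) * m for y, m in mu)
    return (phi(x) - pair(phi) * w(x)) * math.exp(-pair(w))

mu = [(0.0, 0.5), (1.0, 0.5)]
x, t = 0.3, 1e-6
fd = (f(mu + [(x, t)]) - f(mu)) / t  # finite difference along mu + t * delta_x
print(fd, d_f(mu, x))                # the two values agree up to O(t)
```

The same finite-difference check applies to the higher derivatives ∂ k f (µ) by iterating the perturbation µ ↦ µ + tδ x .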
We say that L is of Lévy type if it acts on test functions f ∈ D w by the formula (4.1), where, for each µ ∈ P w , conditions on the killing rate κ µ , the drift B µ , the diffusion Q µ , and the jump measure N (µ, dν) are imposed to ensure that the right-hand side is well-defined. Here χ, a smooth function supported on [−2, 2] and equal to one on [−1, 1], acts as a truncation function for the large jumps.
If the martingale problem for (L, D w , P w ) has a possibly killed solution X for any initial condition µ, then these objects govern the killing, drift, diffusion, and jump behavior of X, as one would expect.

Verifying the positive maximum principle
The positive maximum principle is convenient because it can often be verified directly in practice. The key tool for doing so when the operator is of Lévy type is the following set of optimality conditions for functions in D w .
Theorem 5.1. Fix f ∈ D w and µ ∈ P w such that f (µ) = max Pw f .
(i) ∂ y f (µ) = sup x∈E ∂ x f (µ) for all y ∈ supp(µ).
(ii) ⟨∂ 2 f (µ), ν ⊗ ν⟩ ≤ 0 for all ν ∈ P w − P w such that ⟨1, ν⟩ = 0 and supp(|ν|) ⊆ supp(µ). In particular, ⟨∂ 2 f (µ), (δ x − δ y ) ⊗ (δ x − δ y )⟩ ≤ 0 for all x, y ∈ supp(µ).
(iii) Let τ : R d → R d×d be C 1 , and suppose it can be chosen as diffusion matrix for an E-valued diffusion process, in the sense of condition (5.1). Then the second order condition (5.2) holds.
(iv) If E = R d , then (iii) remains true if τ is merely continuous rather than C 1 . Note that in this case (5.1) is vacuous and A τ (ϕ) = Tr(τ τ ⊤ ∇ 2 ϕ).
Proof. Before we begin, we need a technical result. Fix f ∈ D w and ϕ 1 , . . . , ϕ n ∈ C ∞ c (R d ) such that f (µ) = Φ(⟨ϕ 1 , µ⟩, . . . , ⟨ϕ n , µ⟩, ⟨w, µ⟩) for some Φ ∈ C ∞ (R n+1 ). Applying the classical Taylor approximation theorem to Φ then yields (5.3).

(i) and (ii): The proof is a slight modification of the proof of Theorem 3.1 in . Pick any x ∈ supp(µ) and y ∈ E, and let A n be the ball of radius 1/n centered at x, intersected with supp(µ). Note that ∂f (µ) ∈ C ∞ w , and that setting µ n := µ( · ∩ A n )/µ(A n ) we get that µ + t(δ y − µ n ) ∈ P w for all t ∈ (0, µ(A n )) and that (µ n ) n∈N converges to δ x in P w . Following the proof of Theorem 3.1 in  then yields the result.
(iii): Assume first that τ is compactly supported. We follow the idea of the proof of Proposition 4.1 in Abi Jaber et al. (2019). Fix x ∈ R d and j ∈ {1, . . . , d}, and let z x denote the solution of the ODE ż(t) = τ j (z(t)), z(0) = x, where τ j is the j-th column of τ . By Proposition 2.5 in Da Prato and Frankowska (2004) we know that z x (t) ∈ E for all x ∈ E and t ≥ 0. Observe that for all ϕ ∈ C ∞ w , and with ψ = ϕ ⊗ ϕ, we have the corresponding expansion. Define µ t := F t * µ, the pushforward of µ under F t , where F t (y) := z y (t) for y ∈ E. Clearly µ t ∈ M 1 (E), and since τ j is compactly supported, µ t ∈ P w . Moreover, the dominated convergence theorem yields lim t→0 (1/t)⟨ϕ, µ t − µ⟩ = ⟨τ j ⊤ ∇ϕ, µ⟩ for all ϕ ∈ C ∞ w . We thus get that (5.3) holds true for µ := µ and ν t := µ t − µ. Since µ maximizes f over P w , we have f (µ t ) − f (µ) ≤ 0, and hence an inequality from (5.3). Recall that by Theorem 5.1(i) we know that ∂ y f (µ) = sup E ∂f (µ) for all y ∈ supp(µ), and hence τ (y) ⊤ ∇(∂f (µ))(y) = 0 by (5.1). As a result, dividing the resulting expression by t 2 , letting t tend to 0, applying the dominated convergence theorem, and summing over 1 ≤ j ≤ d yields (5.2). A further application of dominated convergence allows us to remove the assumption that τ is compactly supported.
(iv): The proof follows that of (iii), now using F t (x) := x + tτ j (x) for all x ∈ R d .
Condition (iii) is in fact an application of a more general condition, which we report here for the case w ≡ 1. See Theorem 3.4 and Remark 5.7(i) in  for more details and a proof. An analogous result can be proved for w as in (3.1).
Lemma 5.2. Fix w ≡ 1, f ∈ D w , and µ ∈ P w such that f (µ) = max Pw f . Let A be the generator of a strongly continuous group of positive isometries of R + C 0 (E), and assume the domain of A and the domain of A 2 both contain C ∞ w . Then

Verifying the technical conditions
We now turn to the technical conditions (ii)-(iv) in Theorem 3.4. At the end of the section, we follow up on Remark 3.5 and consider the case w ≡ 1. Recall the embedding T : P w → M + (E ∆ ) defined in (3.4).
We now start to work towards concrete ways of checking these assumptions.
Lemma 6.1. Suppose L is of Lévy type (4.1). Assume that the functions are of C 0 type for every ϕ ∈ C ∞ w . Then so is Lf for every f ∈ D w , that is, condition (ii) in Theorem 3.4 is satisfied.
Proof. It suffices to check that each term of Lf in (4.1) is of C 0 type separately. Recall that µ ↦ e −⟨w,µ⟩ is of C 0 type, as are all the elements of D w .
Killing: The result for µ ↦ κ µ f (µ) follows by noting that f (µ) = ⟨ϕ, µ⟩ e −⟨w,µ⟩ g(µ) for some g constant or in D w , some ϕ ∈ C ∞ c (R d ), and all µ ∈ P w .
Drift: Observe that for
(6.1) f (µ) = ⟨ϕ, µ⟩ e −⟨w,µ⟩
we have ∂f (µ) = (ϕ − ⟨ϕ, µ⟩w) e −⟨w,µ⟩ . One then sees that the given conditions ensure that µ ↦ ⟨B µ (∂f (µ)), µ⟩ is of C 0 type. By the product rule and linearity in f , this result extends to each f ∈ D w .
Jumps: Finally, consider g(µ) := p(f (µ)) for some polynomial p : R → R and some f as in (6.1). Since p is a polynomial and ⟨∂g(µ), χ(ν − µ)⟩ = p′(f (µ)) ⟨∂f (µ), χ(ν − µ)⟩, an application of the classical Taylor approximation theorem to p yields the required representation, where k denotes the degree of p. Thus the given conditions ensure that the last term of Lg in (4.1) is of C 0 type. By polarization, this is also true for all f ∈ D w .
We focus now on condition (iii) in Theorem 3.4, assuming that (i)-(ii) are satisfied. This ensures that the operator L : D → C 0 (X ) in (3.7)-(3.8) is well-defined.
To verify these conditions it is useful to first extend L to a larger class of functions than D w , and then search for appropriate sequences {f m } m∈N in this larger class. The precise notion of extension is given in the following definition.
For any compact subset K ⊂ X we define the restricted graph
gph K (L̃) = {(f̃ , (L̃ f̃ )| K ) : f̃ ∈ D}.

Definition 6.2. We say that L̃ can be extended to a function f̃ : X → R if there is another function g : X → R such that (f̃ , g| K ) lies in the bp-closure of gph K (L̃) for every compact subset K ⊂ X . We say that L can be extended to a function f : P w → R if L̃ can be extended to a function f̃ : X → R that satisfies f = f̃ ∘ T .
If f = f̃ ∘ T , and if (f̃ , g| K ) lies in the bp-closure of gph K (L̃) for every compact subset K ⊂ X , we write Lf for g ∘ T . It sometimes happens that the expression given in (4.1) is well-defined for f and coincides with g ∘ T ; those cases will be particularly important for our purposes.

Lemma 6.3. Suppose L is of Lévy type (4.1) and satisfies conditions (i)-(ii) of Theorem 3.4. Assume L can be extended to all functions f in the algebra generated by D w and e −⟨w,µ⟩ , and that Lf is given by (4.1). Assume also that there exist [0, 1]-valued functions ψ m ∈ C ∞ c (R d ) with the following properties:
• ψ m → 1 pointwise.
• The functions h m : P w → R given by h m (µ) = ⟨B µ ((1 − ψ m )w), µ⟩ + , which are of C 0 type, satisfy lim sup m→∞ h m ∘ T −1 ≤ c′ ⟨1 {∆} , · ⟩ in the bounded pointwise sense on every compact subset of X , for some constant c′.
Then condition (iii) in Theorem 3.4 is satisfied.
Proof. We now define f̃ m = f m ∘ T −1 , g̃ m = (Lf m ∘ T −1 )| X c , and f̃ (ν) = p c (⟨1, ν⟩)ν({∆}), and prove that these functions satisfy the properties in Theorem 3.4(iii). Since L can be extended to f m , the pair (f̃ m , g̃ m ) lies in the bp-closure of gph X c (L̃). Moreover, the f̃ m are uniformly bounded in m. One checks that g̃ m + ≤ h m ∘ T −1 | X c , and our hypotheses imply that g̃ m + is uniformly bounded in m and that lim sup m→∞ g̃ m + ≤ c′ f̃ | X c pointwise. The dominated convergence theorem implies that f̃ m → f̃ pointwise. Finally, it follows by inspection that f̃ ≥ 0, and that f̃ (ν) = 0 for ν ∈ X c if and only if ν({∆}) = 0.
The last condition to analyze is condition (iv).
Lemma 6.4. Suppose L is of Lévy type (4.1) and satisfies conditions (i)-(ii) of Theorem 3.4. Assume that for every function f in the algebra generated by D w and e −⟨w,µ⟩ , the pair (f, Lf ), with Lf given by (4.1), lies in the bp-closure of the graph {(h, Lh) : h ∈ D w }. Assume also that κ µ = 0 for all µ ∈ P w , and that a linear growth condition holds for all µ ∈ P w and some constant c. Then condition (iv) in Theorem 3.4 is satisfied.
Proof. We showed in the proof of Lemma 6.3 that L can be extended to all maps f : P w → R of the form (6.2), and that Lf is given by (4.1). In fact, under our current assumptions, the argument shows that (f, Lf ) lies in the bp-closure (even the uniform closure) of the graph {(h, Lh) : h ∈ D w }. In particular, these facts apply to the maps f n (µ) = q(⟨w, µ⟩/n), where q ∈ C ∞ c (R + ) is nonincreasing and satisfies 1 [0,1] ≤ q ≤ 1 [0,2] . It is clear that f n → 1 in the bounded pointwise sense. We must argue that (Lf n ) − → 0 in the same sense. A direct computation gives an explicit expression for Lf n . Using the properties of q and, in the last step, the assumed linear growth condition, we obtain a bound valid for some constant c′. We deduce that (Lf n ) − is uniformly bounded in n, and it is clear from the first line that (Lf n ) − → 0 pointwise.

Remark 6.5. We now comment on the case w ≡ 1, and consider the setting of Remark 3.5. Lemma 6.1 would then be replaced by the requirement that Lf can be extended to a function in C(M 1 (E ∆ )) for all f ∈ D w . Concrete conditions when L is of Lévy type are that the maps µ ↦ κ µ , µ ↦ B µ (ϕ), and µ ↦ Q µ (ϕ ⊗ ϕ) are continuous from P w to R, R + C 0 (R d ), and (R + C 0 (R d )) ⊗ C 0 (R d ), respectively, and that µ ↦ ∫ ⟨ϕ, ν − µ⟩ ℓ N (µ, dν) can be extended to a function in C(M 1 (E ∆ )) for every ℓ ≥ 2 and ϕ ∈ C ∞ w . Next, Lemma 6.3 holds without the assumption that L can be extended to all functions in the algebra generated by D w and e −⟨w,µ⟩ . In the proof, one simply takes f m (µ) = ⟨1 − ψ m , µ⟩.
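A smooth nonincreasing cutoff q with 1 [0,1] ≤ q ≤ 1 [0,2] , as used in the proof of Lemma 6.4, can be built explicitly from the standard smooth transition function; a minimal sketch:

```python
import math

def s(x):
    """Smooth transition seed: identically 0 for x <= 0, positive for x > 0."""
    return math.exp(-1.0 / x) if x > 0 else 0.0

def q(x):
    """Smooth nonincreasing cutoff: q = 1 on [0, 1] and q = 0 on [2, infinity).
    The denominator never vanishes, since 2 - x and x - 1 cannot both be <= 0."""
    return s(2.0 - x) / (s(2.0 - x) + s(x - 1.0))

print(q(0.5), q(1.0), q(1.5), q(2.0), q(3.0))  # 1.0 1.0 0.5 0.0 0.0
```

The functions f n (µ) = q(⟨w, µ⟩/n) then increase to 1 as n → ∞, which is the shape condition (iv) of Theorem 3.4 asks for.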
In the case of condition (iv), the result for w ≡ 1 is quite different from Lemma 6.4. The reason is that, in contrast to the cases where condition (3.1) is in force, for w ≡ 1 the cemetery state † is an isolated point. Thus a solution to the martingale problem can reach † only by means of a jump. This has the consequence that (iv) can essentially be derived from (iii). We now report a precise formulation of this statement.
Proof. Choose (f̃ m , g̃ m ) m∈N as in condition (iii) for c = 1. Define f m = 1 − f̃ m ∘ T and g m = −g̃ m ∘ T . Since κ µ = 0, (f m , g m ) lies in the bp-closure of the graph of L. Moreover, (f m , g m − ) → (1, 0) in the bounded pointwise sense.

Applications of the main result
Take E = R d and assume that w(x) = |x| p for |x| > 2, where p ∈ (0, ∞). Consider a linear operator L : D w → C(P w ) of Lévy type (4.1) with κ = 0, N = 0, and B and Q given by for some constants c, γ ≥ 0. Then conditions (i)-(iii) of Theorem 3.4 are satisfied, and thus there exists a possibly killed solution X to the martingale problem for (L, D w , P w ) for every initial condition µ ∈ P w . If one can take γ = 0 in (7.2), then condition (iv) of Theorem 3.4 holds, and thus X t ∈ P w for all t ≥ 0.
(ii): We verify the conditions of Lemma 6.1. That is, in the current notation, we must check that the functions

ν ↦ ⟨B_ν(ϕ)w^{−1}, ν⟩ e^{−⟨1,ν⟩} and ν ↦ ⟨Σ_ν(ϕ)w^{−1}, ν²⟩ e^{−⟨1,ν⟩} (7.5)

are C_0 functions on X for every ϕ ∈ C_w^∞. To see that the function involving B_ν is continuous, we split it into two terms. If ν_n ⇒ ν, then the first term tends to zero since B_ν(ϕ)w^{−1} ∈ R + C_0(R^d). The second term tends to zero due to (7.3). A similar calculation shows that the function in (7.5) involving Σ_ν is continuous as well.
It remains to show that the functions in (7.5) vanish at infinity. To see this, note the bound (7.6), valid for all ϕ ∈ C_w^∞. Since the suprema in (7.6) grow at most polynomially in ⟨1, ν⟩ due to (7.4), the functions in (7.5) vanish at infinity.
(iii): We verify the conditions of Lemma 6.3. We have already shown that L satisfies conditions (i)-(ii) of Theorem 3.4. Let us show that L can be extended to all functions f in the algebra generated by D_w and e^{−⟨w,µ⟩}, and that Lf is given by (4.1).
Let now h_m be as in Lemma 6.3. The first inequality in (7.6), condition (7.4), and the reasoning in (7.7) imply that (h_m|_K)_{m∈N} is a bounded sequence in C(K) for every compact set K. Moreover, the limit of h_m is well defined by (7.3); write the right-hand side as c′1_{{∆}} for a constant c′. By the dominated convergence theorem we can conclude that h_m ∘ T^{−1} → ⟨c′1_{{∆}}, · ⟩ in the bounded pointwise sense on every compact subset of X. Thus the conditions of Lemma 6.3 hold, as required.
(iv): We verify the conditions of Lemma 6.4 under the additional assumption that one can take γ = 0 in (7.2). We already know that L satisfies conditions (i)-(ii) of Theorem 3.4 and that κ_µ = 0 for all µ ∈ P_w. We now show that for every function f in the algebra generated by D_w and e^{−⟨w,µ⟩}, the pair (f, Lf) with Lf given by (4.1) lies in the bp-closure of the graph {(h, Lh) : h ∈ D_w}. To do this, we just need to follow the first part of the proof of (iii). Indeed, setting f_n = f̃_n ∘ T we get that (f_n, Lf_n) ∈ {(h, Lh) : h ∈ D_w} converges bounded pointwise to (f, Lf) with Lf as given in (4.1).
The last condition of Lemma 6.4 to be verified is the linear growth condition, which follows by a direct computation from (7.4) with ϕ = w and γ = 0. The claim follows.
Remark 7.2. As will be explored further in Section 8, the linear operator L introduced at the beginning of the section coincides with the generator of the conditional distribution X_t = P(Z_t ∈ · | F^0_t) of a solution of a McKean-Vlasov equation with common noise, where F^0_t := σ(W^0_s, s ≤ t). The same result provided by Theorem 7.1 can be obtained when the common noise is replaced by a common jump mechanism. For example, consider a Poisson random measure P^0(dt, dy) with compensator F(dy)dt for some probability measure F supported on R, and let (X, Z) satisfy the corresponding McKean-Vlasov equation, where F^0_t := σ(P^0([0, s], dy), s ≤ t). Here ℓ_µ(x, y) describes the sizes of the common jumps, which we assume are confined to a cube [0, c]^d for some c > 0. The generator of the probability measure valued process X is then the linear operator of Lévy type (4.1) given in terms of the post-jump measures γ(µ, y) := ( · + ℓ_µ( · , y))_*µ ∈ P_w.
Note that since F is a probability measure, we are free to choose χ ≡ 0 as truncation function. Suppose now that b and σ satisfy the conditions of Theorem 7.1 with γ = 0. Assume also that ℓ_µ(x, y) = ℓ̃_{wµ}(x, y) for some continuous map (ν, x, y) ↦ ℓ̃_ν(x, y). Then there exists a solution X to the martingale problem for (L, D_w, P_w) for every initial condition µ ∈ P_w. This result can be proved following the proof of Theorem 7.1.
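The common-jump mechanism of this remark can be visualized with a simple N-particle Euler scheme in which all particles jump simultaneously at the atoms of the common Poisson clock. The sketch below is illustrative only: the coefficients b and σ, the jump map ℓ_µ(x, y) = yx, the intensity, and the mark distribution F are hypothetical choices, and the empirical measure of the particles stands in for X.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical coefficients (not taken from the paper): a mean-field drift
# toward the empirical mean, a constant diffusion, and a common jump of
# size l_mu(z, y) = y * z with small marks y.
def b(z, mean):
    return -(z - mean)

def sigma(z):
    return 0.5 * np.ones_like(z)

def jump(z, y):
    return y * z  # common jump applied to ALL particles at once

N, T, dt = 1000, 1.0, 1e-3
lam = 2.0                               # intensity of the common Poisson clock
Z = rng.normal(size=N)                  # initial particle positions
for _ in range(int(T / dt)):
    mean = Z.mean()                     # empirical proxy for the mean-field term
    dW = rng.normal(scale=np.sqrt(dt), size=N)   # idiosyncratic noises W^i
    Z = Z + b(Z, mean) * dt + sigma(Z) * dW
    if rng.random() < lam * dt:         # a common jump time hits every particle
        y = rng.uniform(0.0, 0.1)       # jump mark y ~ F
        Z = Z + jump(Z, y)

print(Z.mean(), Z.std())
```

The empirical measure of (Z^1, …, Z^N) then approximates the measure-valued process X driven by the common jumps.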
The next corollary follows directly from Theorem 7.1.
Corollary 7.3. Suppose that b, σ, and τ do not depend on µ and satisfy the growth condition (7.2) with γ = 0. Then there exists a solution X to the martingale problem for (L, D_w, P_w) for every initial condition µ ∈ P_w.
Theorem 7.4. We consider now a different linear operator L : D_w → C(P_w) of Lévy type (4.1) with κ = 0, N = 0, and B and Q given in terms of coefficient maps that are continuous as functions from X to R + C_0(R^d), and with α_µ(x, y) continuous as a function from X to R + C_0(R^{2d}). Assume also a growth condition analogous to (7.2) for some constants c, γ ≥ 0. Then conditions (i)-(iii) of Theorem 3.4 are satisfied, and thus there exists a possibly killed solution X to the martingale problem for (L, D_w, P_w) for every initial condition µ ∈ P_w. If one can take γ = 0, then condition (iv) of Theorem 3.4 holds, and thus X_t ∈ P_w for all t ≥ 0.
Proof. The proof follows the proof of Theorem 7.1, applying Theorem 5.1(ii) to verify the positive maximum principle instead of Theorem 5.1(i) and (iv).
As for standard SDEs with linearly growing coefficients, the linear growth properties (implicit in Lemma 6.4) imply that all moments of ⟨w, X_t⟩ are finite.
Proposition 7.5. Let E, w and L satisfy the conditions of Theorem 7.1 for γ = 0 and let X be a solution to the martingale problem for (L, D w , P w ) for some initial condition µ ∈ P w . Then the following conditions hold.
(i) E[⟨w, X_t⟩^k] ≤ ⟨w, µ⟩^k e^{C_k t} for all k ∈ N, where C_k := k(k + 1)C for some C > 0.
(ii) X solves the martingale problem for (L, E_w, P_w), where E_w denotes the algebra generated by {µ ↦ ⟨ϕ, µ⟩ : ϕ ∈ C_w^∞} and Lf is given by (4.1).

Proof. To simplify the notation we just prove the case d = 1. Before starting, observe that by the dominated convergence theorem the process f(X) − f(µ) − ∫_0^· Lf(X_s) ds is a bounded martingale for each map f such that

the pair (f, Lf) with Lf given by (4.1) lies in the bp-closure of the graph {(h, Lh) : h ∈ D_w}. (7.8)

We have already shown, in the proof of condition (iv) within the proof of Theorem 7.1, that (7.8) is satisfied by every function f in the algebra generated by D_w and e^{−⟨w,µ⟩}. Proceeding as in the proof of Lemma 6.3 we can extend this result to all maps f : P_w → R of the form f(µ) := p(f_1(µ), …, f_n(µ)) for p ∈ C²(R^n) with p(0) = 0 and f_1, …, f_n elements of that algebra.
(i): Fix now c > ⟨w, µ⟩ and set f_c(µ) := p_c(⟨w, µ⟩)⟨w, µ⟩^k, where p_c is a smooth cutoff equal to 1 on [0, c]. Note that f_c satisfies (7.8). Setting T_c := inf{t ≥ 0 : ⟨w, X_t⟩ ≥ c} we obtain, due to (7.4) for γ = 0, a bound to which the Gronwall inequality applies, giving E[f_c(X_{t∧T_c})] ≤ ⟨w, µ⟩^k e^{C_k t}; Fatou's lemma then yields (i).

(ii): Fix now g ∈ E_w and set g_c(µ) := p_c(⟨w, µ⟩)g(µ). Observe that each g_c satisfies condition (7.8), |g_c(µ)| ≤ C⟨w, µ⟩^k, and |Lg_c(µ)| ≤ C⟨w, µ⟩^k for some k big enough and some constant C not depending on c and µ. Since by (i) we get E[⟨w, X_t⟩^k + ∫_0^t ⟨w, X_s⟩^k ds] ≤ ⟨w, µ⟩^k (e^{C_k t} + 2 + C_k^{−1}(e^{C_k t} − 1)) < ∞ and g_c → g pointwise, the claim follows by the dominated convergence theorem.
This completes the proof.
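The localization and Gronwall step behind part (i) can be sketched as follows; the intermediate inequality is a reconstruction under the growth bound (7.4) with γ = 0, not a formula reproduced from the source:

```latex
% Sketch of the localization/Gronwall argument for (i).
% With u(t) := E[f_c(X_{t \wedge T_c})], the martingale property and (7.4) give
\[
  u(t) \;\le\; \langle w, \mu\rangle^k + C_k \int_0^t u(s)\,\mathrm{d}s,
  \qquad\text{hence}\qquad
  u(t) \;\le\; \langle w, \mu\rangle^k \, e^{C_k t},
\]
% and Fatou's lemma as c -> infinity removes the localization T_c.
```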

McKean-Vlasov equations with common noise
We continue to consider the setting of Section 7: E = R^d, w(x) = |x|^p for |x| > 2 and some p ∈ (0, ∞), and L : D_w → C(P_w) is a linear operator of Lévy type (4.1) with κ = 0, N = 0, and B and Q given in terms of coefficients b, σ, and τ as in Section 7.

Definition 8.1. A weak solution of the McKean-Vlasov equation specified by (b, σ, τ) is a tuple (X, Z, W, W^0), defined on some filtered probability space, where X and Z are adapted with values in P_w and R^d, W and W^0 are independent d-dimensional Brownian motions, and such that

dZ_t = b_{X_t}(Z_t) dt + σ_{X_t}(Z_t) dW_t + τ_{X_t}(Z_t) dW^0_t, X_t = P(Z_t ∈ · | G_t),

for some filtration G = (G_t)_{t≥0} to which W^0 is adapted, and of which W is independent.
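A standard way to make Definition 8.1 concrete is an N-particle approximation in which all particles share one path of the common noise W^0 and the conditional law X_t is replaced by the empirical measure. The sketch below uses hypothetical linear coefficients (a mean-field drift toward the empirical mean and constant σ and τ); none of these choices come from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical coefficients (illustration only):
# dZ_t = b(Z_t, mu_t) dt + sigma dW_t + tau dW0_t, where mu_t is the
# conditional law of Z_t given W0. The N particles share one path of W0.
def b(z, mean):
    return -(z - mean)  # mean-field drift toward the empirical mean

sigma, tau = 0.3, 0.2

N, T, dt = 2000, 1.0, 1e-3
Z = rng.normal(size=N)
for _ in range(int(T / dt)):
    mean = Z.mean()                              # empirical proxy for the conditional mean
    dW = rng.normal(scale=np.sqrt(dt), size=N)   # idiosyncratic noises W^i
    dW0 = rng.normal(scale=np.sqrt(dt))          # ONE common increment of W0
    Z = Z + b(Z, mean) * dt + sigma * dW + tau * dW0

# The empirical measure of (Z^1, ..., Z^N) approximates X_t = P(Z_t in . | F^0_t).
print(Z.mean(), Z.var())
```

Note that the common increment dW0 shifts all particles together, so it moves the (conditional) law itself, while the idiosyncratic increments average out in the empirical measure as N grows.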
Our aim in this section is to give an existence result for the McKean-Vlasov equation specified by (b, σ, τ) by solving a martingale problem satisfied by the solution (X, Z, W, W^0). The state space for this martingale problem is the product space P_w × R^d × R^d × R^d. We will show below that the solution satisfies the dynamics (8.1) and (8.2). The corresponding generator H has domain D(H) consisting of all algebraic combinations of functions f(µ), ϕ(z), ψ(x), θ(x_0) with f ∈ D_w and ϕ, ψ, θ ∈ C_c^∞(R^d). In view of (8.1) and (8.2), H acts on functions of the form h(µ, z, x, x_0) = f(µ)ϕ(z)ψ(x)θ(x_0) by an explicit but somewhat cumbersome expression.

Theorem 8.2. Fix z ∈ R^d and assume b, σ, τ satisfy the conditions of Theorem 7.1 for γ = 0.
Then the martingale problem for (H, D(H), P_w × R^d × R^d × R^d) has a solution (X, Z, W, W^0), where W and W^0 are independent d-dimensional Brownian motions. Moreover, the linear equation (8.3) is satisfied for Y = X. If one has the compatibility conditions that W is independent of the filtration G = (G_t)_{t≥0} generated by (X, W^0), and for all s ≤ t, F_s and G_t are conditionally independent given G_s, then (8.3) is satisfied for Y_t = P(Z_t ∈ · | G_t) as well. In particular, if in addition uniqueness holds for (8.3), then (X, Z, W, W^0) is a weak solution of the McKean-Vlasov equation specified by (b, σ, τ).
The compatibility conditions on the filtrations F and G are rather implicit. However, similar conditions are known to be required elsewhere in the literature; see for instance page 114 in Kurtz and Xiong (1999), and the conditions of Theorem 2 in Kailath et al. (1978). See also the remark at the beginning of page 142 in Kailath et al. (1978). Let us also mention (without proof) that whenever (X, W, W^0) solves the corresponding martingale problem, one can construct a process W̃ such that (X, W̃, W^0) solves the same martingale problem and W̃ is independent of the filtration G = (G_t)_{t≥0} generated by (X, W^0).

Remark 8.3. Let f(ν) := p(⟨ϕ_1, ν⟩, …, ⟨ϕ_n, ν⟩) for some nonnegative map p : R^n → R satisfying p(0) = 0, some ϕ_1, …, ϕ_n ∈ C_c^∞(R^d), and all ν ∈ P_w. Note that setting ⟨ϕ_i, ν − ν̃⟩ := ⟨ϕ_i, ν⟩ − ⟨ϕ_i, ν̃⟩ we can naturally extend f to P_w − P_w. Consider now two solutions Y and Ỹ of (8.3). An application of Itô's formula yields an expression involving

L_µ f(ν) = ⟨B_µ(∂f(ν)), ν⟩ + (1/2)⟨Q_µ(∂²f(ν)), ν²⟩.

If f additionally satisfies

|L_µ f(ν)| ≤ Cf(ν) for all ν ∈ P_w − P_w and µ ∈ P_w, (8.4)

an application of the Gronwall inequality yields E[f(Y_t − Ỹ_t)] = 0, and thus that Y_t − Ỹ_t ∈ {f = 0}. If this condition holds for sufficiently many f, we would be able to conclude that Y_t = Ỹ_t almost surely and that uniqueness holds for (8.3). We illustrate a situation where this is the case in the following example.

Let d = 1, z ∈ [0, 1], and assume that Y_t([0, 1]) = Ỹ_t([0, 1]) = 1 for each t ≥ 0. Assume that the maps x ↦ b_µ(x) and x ↦ τ_µ(x) are polynomials of degree at most 1 and the map x ↦ σ_µ(x)² is a polynomial of degree at most 2. This in particular implies that B_µ and Q_µ are polynomial operators in the sense of Filipović and Larsson (2017), meaning that they map any polynomial to a polynomial of the same or lower degree. Fix then H_0, …, H_m ∈ C_c^∞(R) such that H_i(x) = x^i for each x ∈ [0, 1], and set p_m(ν) = Σ_{i=0}^m ⟨H_i, ν⟩².
Note that for each ν ∈ P_w − P_w such that supp(ν) ⊆ [0, 1] we have L_µ p_m(ν) = Σ_{i,j=0}^m α^µ_{ij} ⟨H_i, ν⟩⟨H_j, ν⟩ for some α^µ_{ij} ∈ R. This implies that if sup_{µ∈P_w} |α^µ_{ij}| < ∞, then condition (8.4) is satisfied, and ⟨H_i, Y_t⟩ = ⟨H_i, Ỹ_t⟩ for each i ∈ {1, …, m}. Since m was arbitrary, the same conclusion holds for each i ∈ N. Since two measures on [0, 1] have the same moments if and only if they are the same, it follows that Y_t = Ỹ_t almost surely and uniqueness holds for (8.3).
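The moment-determinacy fact invoked at the end of the remark (measures on [0, 1] are determined by their moments) can be illustrated numerically. The two discrete measures below are hypothetical examples: they share total mass and mean but are already distinguished by their second moment.

```python
import numpy as np

# Two distinct discrete probability measures supported on [0, 1].
atoms1, w1 = np.array([0.2, 0.8]), np.array([0.5, 0.5])
atoms2, w2 = np.array([0.3, 0.7]), np.array([0.5, 0.5])

def moment(atoms, weights, i):
    """i-th moment <x^i, nu> of a discrete measure with given atoms/weights."""
    return float(np.dot(weights, atoms ** i))

# Both measures have the same total mass (i = 0) and the same mean (i = 1)...
assert np.isclose(moment(atoms1, w1, 0), moment(atoms2, w2, 0))
assert np.isclose(moment(atoms1, w1, 1), moment(atoms2, w2, 1))
# ...but their second moments differ (0.34 vs 0.29), so the measures differ.
assert not np.isclose(moment(atoms1, w1, 2), moment(atoms2, w2, 2))
```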
The rest of the section is devoted to the proof of Theorem 8.2. We start with a corollary of Theorem 7.1.
Corollary 8.4. Assume that b, σ, and τ satisfy the conditions of Theorem 7.1 for γ = 0. Then there exists a solution (X, Z, W, W 0 ) to the martingale problem for (H, D(H), P w ×R d ×R d ×R d ) for every initial condition (µ, z, x, x 0 ) ∈ P w × R d × R d × R d .
Proof. We first observe that H satisfies the positive maximum principle on P_w × R^d × R^d × R^d. This can be proven by the classical optimality conditions on R^d × R^d × R^d, Theorem 5.1(i), and a slight modification of the argument in the proof of Theorem 5.1(iii).
Observe then that the concepts introduced in Section 3 can be generalized by setting T̃(µ, z, x, x_0) := (T(µ), z, x, x_0) and by adapting the notion of maps of C_0 type accordingly. Since L satisfies the conditions of Theorem 7.1, we know that Hh is of C_0 type for every h ∈ D(H). Following the proof of Theorem 3.4 we can conclude that there exists a possibly killed solution of the martingale problem for (H̃, D(H)). Using that, by Theorem 7.1, the operator L satisfies conditions (iii)-(iv) of Theorem 3.4, we can conclude the proof by following the proof of Theorem 3.4.
The fact that a solution of the martingale problem is also a weak solution of the corresponding SDE is due, in the classical case, to Stroock and Varadhan (1972).
Lemma 8.5. Assume that the conditions of Corollary 8.4 are satisfied and let (X, Z, W, W 0 ) be a solution to the martingale problem for (H, D(H), P w × R d × R d × R d ) with initial condition (δ z , z, 0, 0). Then W, W 0 are independent Brownian motions, and (8.1) and (8.2) hold for each ϕ ∈ C ∞ c (R d ).
Proof. For h(µ, z, x, x_0) = ψ(x, x_0) we have Hh(µ, z, x, x_0) = (1/2)∆ψ(x, x_0), where ∆ denotes the Laplacian. Thus W and W^0 are independent Brownian motions. To prove (8.2), we must show that the process ⟨ϕ, X_t⟩ − ∫_0^t ⟨B_{X_s}(ϕ), X_s⟩ ds − ∫_0^t ⟨τ_{X_s}∇ϕ, X_s⟩^⊤ dW^0_s, which is known to be a martingale due to Proposition 7.5, is constant. This is done by verifying that its quadratic variation is zero; we omit the details. The proof of (8.1) is similar.
Lemma 8.6. Assume that the conditions of Corollary 8.4 are satisfied and let (X, Z, W, W 0 ) be a solution to the martingale problem for (H, D(H), P w × R d × R d × R d ) with initial condition (δ z , z, 0, 0). If one has the compatibility conditions that W is independent of the filtration G = (G t ) t≥0 generated by (X, W 0 ), and for all s ≤ t, F s and G t are conditionally independent given G s , then the conditional law process Y t := P(Z t ∈ · | G t ) satisfies (8.3).

A A Fubini type result
The result presented in this section is based on Theorem 2 in Kailath et al. (1978) and its proof. Let (Ω, F , F = (F t ) t≥0 , P) be a filtered probability space endowed with two d-dimensional Brownian motions W and W 0 . Consider then a second filtration G = (G t ) t≥0 to which W 0 is adapted, and of which W is independent.