Probability measure-valued polynomial diffusions

We introduce a class of probability measure-valued diffusions, coined polynomial, of which the well-known Fleming--Viot process is a particular example. The defining property of finite dimensional polynomial processes considered by Cuchiero et al. (2012) and Filipovic and Larsson (2016) is transferred to this infinite dimensional setting. This leads to a representation of conditional marginal moments via a finite dimensional linear PDE, whose spatial dimension corresponds to the degree of the moment. As a result, the tractability of finite dimensional polynomial processes is preserved in this setting. We also obtain a representation of the corresponding extended generators, and prove well-posedness of the associated martingale problems. In particular, uniqueness is obtained from the duality relationship with the PDEs mentioned above.


Introduction
In this paper we develop probability measure-valued versions of a class of processes known as polynomial diffusions, which, due to their inherent tractability, have broad applications in population genetics, interacting particle systems, and finance; see e.g. [14,33,19]. The result is a class of stochastic processes that model randomly evolving probability measures, including examples such as Fleming-Viot processes [23,16], as well as conditional laws of jump-diffusions on (subsets of) $\mathbb{R}^d$.
Finite dimensional polynomial diffusions form a rich class that includes Kimura diffusions [25], Wishart correlation matrices [1], and affine processes [13], to name just a few subclasses. See e.g. [8,21,22,9] for further details and examples. This suggests transferring their defining property and tractability features to infinite dimensional processes. Such processes also appear as limits of empirically well-suited finite-dimensional polynomial models, whose limiting behavior is of key interest in population dynamics, but also in other areas, such as capital distribution curve modeling; see e.g. [31]. In a nutshell, our contribution can be summarized as follows:
• it is the first time that polynomial processes in infinite dimension are considered;
• important examples of polynomial probability measure-valued diffusions, e.g. the Fleming-Viot and conditional law processes mentioned above, have not previously been considered from this angle. In fact, analogous definitions and methods apply to more general measure-valued diffusions such as super-Brownian motion and the Dawson-Watanabe superprocess;
• a remarkably large class of processes is characterized via the polynomial property; in this paper, we find necessary and sufficient conditions for measure-valued diffusions to be polynomial and to take values in the space of probability measures;
[...]
Let $Z$ be a process with values in the unit simplex $\Delta^d = \{z \in [0,1]^d : z_1 + \cdots + z_d = 1\}$, representing the capitalization weights of $d$ stocks. For tractability, it is natural to select $Z$ to be a polynomial diffusion on $\Delta^d$ as in [7]. To compute basic moment statistics of the capitalization weights, one uses a moment formula similar to (1.1). For homogeneous polynomials $q(z_1, \dots, z_d)$, it takes the form
$$\mathbb{E}[q(Z_t)] = \sum_{|\alpha| = k} u_t(\alpha)\, z^\alpha,$$
where $Z_0 = z \in \Delta^d$, $z^\alpha = z_1^{\alpha_1} \cdots z_d^{\alpha_d}$, and the sum extends over all multi-indices $\alpha = (\alpha_1, \dots, \alpha_d)$ with $|\alpha| = \alpha_1 + \cdots + \alpha_d = k := \deg(q)$.
There are $N := \binom{k+d-1}{k}$ such multi-indices, and the $\mathbb{R}^N$-valued function $u_t = (u_t(\alpha) : |\alpha| = k)$ solves the linear ODE
$$\frac{\partial u}{\partial t} = L_k u, \tag{1.3}$$
whose initial condition is the coefficient vector of $q$, and where $L_k$ here is an $N \times N$ matrix derived from the generator of $Z$. For small or moderate dimensions $d$ and degrees $k$, solving (1.3) is feasible. However, $d$ is typically on the order of $10^3$, which renders (1.3) computationally taxing even for small $k$, since the ODE dimension is $N \sim d^k$. Now, consider instead a linear factor model $Z = (Z^1, \dots, Z^d)$ for the capitalization weights. This means that $Z^i_t = \int_E g_i(x)\, X_t(dx)$ for some nonnegative functions $g_1, \dots, g_d$ that sum to one, and a probability measure-valued polynomial diffusion $X$ with, say, $E = [0, 1]$. In this case, $\mathbb{E}[q(Z_t)] = \mathbb{E}[p(X_t)]$ for some measure polynomial $p(\nu)$ of degree $k = \deg(q)$. This expectation can be computed using the moment formula (1.1), which amounts to solving the PDE (1.2) up to time $t$. Discretizing the space domain $E^k$ using $n$ points in each dimension yields a complexity of order $n^k$. This can be made orders of magnitude smaller than the complexity $d^k$ of solving (1.3). Importantly, $n$ is a parameter that is chosen based on accuracy requirements, while $d$ is an input to the problem. This illustrates how probability measure-valued polynomial diffusions can enhance tractability in high-dimensional models. On top of this, as projections of an infinite-dimensional process, these linear factor models constitute a much richer class than polynomial models on subsets of $\Delta^d$.
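As a rough numerical illustration of this comparison (not part of the original argument), the following Python sketch evaluates the ODE dimension $N = \binom{k+d-1}{k}$ and the grid size $n^k$. The values $d = 1000$, $k = 3$, $n = 50$ and the helper names are illustrative assumptions.

```python
from math import comb


def ode_dimension(d: int, k: int) -> int:
    """Number of multi-indices alpha in N_0^d with |alpha| = k."""
    return comb(k + d - 1, k)


def pde_grid_size(n: int, k: int) -> int:
    """Number of grid points when E^k is discretized with n points per axis."""
    return n ** k


# Illustrative (assumed) sizes: d stocks, moment degree k, n grid points per axis.
d, k, n = 1000, 3, 50
N = ode_dimension(d, k)
M = pde_grid_size(n, k)
print(f"ODE dimension N = {N}, PDE grid size n^k = {M}, ratio = {N / M:.1f}")
```

With these assumed values the ODE system has about $1.7 \times 10^8$ unknowns, against $1.25 \times 10^5$ grid points for the PDE.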
The remainder of the paper is organized as follows. After reviewing some basic notation and definitions in the following subsection, we turn to polynomials of measure arguments in Section 2, and prove optimality conditions for such polynomials in Section 3. In Section 4 we define polynomial operators and study their form in the diffusion case. Section 5 contains the moment formula as well as our main results on well-posedness of the martingale problem. Applications and examples are treated in Section 6. Some proofs and supplementary material are gathered in appendices.

Notation and basic definitions
Throughout this paper, E is a locally compact Polish space endowed with its Borel σ-algebra. The following notation is used.
• $C(E)$, $C_b(E)$, $C_0(E)$, $C_c(E)$ have the usual meaning of continuous (and bounded, and vanishing at infinity, and compactly supported) real functions on $E$. The topology on the latter three is that of uniform convergence, and $\|\cdot\|$ denotes the supremum norm.
• If $E$ is noncompact, then $E^\Delta = E \cup \{\Delta\}$ is the one-point compactification, itself a compact Polish space. If $E$ is compact we write $E^\Delta = E$, which removes the need to consider the compact and noncompact cases separately. We also define $C_\Delta(E^k) := \{f|_{E^k} : f \in C((E^\Delta)^k)\}$, a closed subspace of $C_b(E^k)$. The spaces $C_\Delta(E)$ and $C(E^\Delta)$ can be identified, and we occasionally regard elements of the former as elements of the latter, and vice versa. When $E$ is compact, we have $C(E) = C_b(E) = C_0(E) = C_c(E) = C_\Delta(E)$ and we then simply write $C(E)$. Note that the constant function 1 lies in $C_\Delta(E)$, but of course not in $C_0(E)$ when $E$ is noncompact. This is one reason the spaces $C_\Delta(E^k)$ are useful; other reasons are discussed in Remarks 2.6 and 4.6.
• $\widehat{C}_\Delta(E^k)$ is the closed subspace of $C_\Delta(E^k)$ consisting of symmetric functions $f$, i.e., $f(x_1, \dots, x_k) = f(x_{\sigma(1)}, \dots, x_{\sigma(k)})$ for all $\sigma \in \Sigma_k$, the permutation group on $k$ elements. $\widehat{C}_0(E^k)$ and $\widehat{C}(E^k)$ are defined similarly. For any $g \in \widehat{C}_\Delta(E^k)$, $h \in \widehat{C}_\Delta(E^\ell)$ we denote by $g \otimes h \in \widehat{C}_\Delta(E^{k+\ell})$ the symmetric tensor product, given by
$$(g \otimes h)(x_1, \dots, x_{k+\ell}) = \frac{1}{(k+\ell)!} \sum_{\sigma \in \Sigma_{k+\ell}} g(x_{\sigma(1)}, \dots, x_{\sigma(k)})\, h(x_{\sigma(k+1)}, \dots, x_{\sigma(k+\ell)}). \tag{1.4}$$
For a linear subspace $D \subseteq C_\Delta(E)$ we set $D \otimes D := \mathrm{span}\{g \otimes g : g \in D\}$. We emphasize that only symmetric tensor products are used in this paper.
Two key notions are the positive maximum principle and conservativity for certain linear operators. In general, for a Polish space $\mathcal{X}$ and a subset $S \subseteq \mathcal{X}$, these notions are defined as follows. An operator $A$ : [...] such that $\lim_{n\to\infty} f_n = 1$ on $E$ and $\lim_{n\to\infty} (Af_n)^- = 0$ on $E^\Delta$, both in the bounded pointwise sense; cf. Chapter 4.2 in [17]. For us, $S$ will be $E$, $E^\Delta$, $M_1(E)$, or $M_1(E^\Delta)$.
It is well-known that the positive maximum principle, combined with conservativity, is essentially equivalent to the existence of S-valued solutions to the martingale problem for A; see for instance Theorem 4.5.4 of [17]. We use this extensively, and review the relevant results in Section D. Here an important issue is that while M 1 (E) is compact when E is compact, M 1 (E) is not even locally compact when E is noncompact.

Polynomials of measure arguments
In this section we develop basic properties of polynomials of measure arguments. The notation and results introduced here play a central role throughout this paper.
Throughout this section, E is a locally compact Polish space.

Monomials and polynomials
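A monomial on $M(E)$ is an expression of the form
$$\langle g, \nu^k \rangle := \int_{E^k} g(x_1, \dots, x_k)\, \nu(dx_1) \cdots \nu(dx_k)$$
for some $k \in \mathbb{N}_0$, where $g \in C_\Delta(E^k)$ is referred to as the coefficient of the monomial; see e.g. [11, Chapter 2]. We identify $C_\Delta(E^0)$ with $\mathbb{R}$, so that for $k = 0$ we have $\langle g, \nu^0 \rangle = g \in \mathbb{R}$.
It is clear that the map $\nu \mapsto \langle g, \nu^k \rangle$ is homogeneous of degree $k$, and that $g \mapsto \langle g, \nu^k \rangle$ is linear. Furthermore, one has the identity $\langle g, \nu^k \rangle \langle h, \nu^\ell \rangle = \langle g \otimes h, \nu^{k+\ell} \rangle$, where the symmetric tensor product $g \otimes h$ is defined in (1.4).
A polynomial on $M(E)$ is now defined as a (finite) linear combination of monomials,
$$p(\nu) = \sum_{k=0}^m \langle g_k, \nu^k \rangle, \tag{2.1}$$
with coefficients $g_k \in C_\Delta(E^k)$. The degree of the polynomial $p(\nu)$, denoted by $\deg(p)$, is the largest $k$ such that $g_k$ is not the zero function, and $-\infty$ if $p$ is the zero polynomial.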
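Example 2.1. Let $E = \{1, \dots, d\}$ be a finite set. Every $\nu \in M(E)$ is then of the form $\nu = z_1 \delta_1 + \cdots + z_d \delta_d$ for some $z \in \mathbb{R}^d$, where $\delta_i$ is the Dirac mass concentrated at $\{i\}$. Monomials take the form
$$\langle g, \nu^k \rangle = \sum_{(i_1, \dots, i_k) \in E^k} g(i_1, \dots, i_k)\, z_{i_1} \cdots z_{i_k},$$
where the summation ranges over $E^k = \{1, \dots, d\}^k$. Therefore, as $g(\cdot)$ ranges over all symmetric functions on $E^k$, we recover all homogeneous polynomials of total degree $k$ in the $d$ variables $z_1, \dots, z_d$. In particular, in view of Corollary 2.5 later, this relation provides a one-to-one correspondence between polynomials on the unit simplex $\Delta^d$ and polynomials on $M_1(E)$.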
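For a finite underlying space, monomials of measure arguments reduce to tensor contractions, and the product identity for monomials can be checked numerically. The sketch below is only an illustration under that assumption: the helper names (`symmetrize`, `sym_tensor`, `monomial`) and the random test data are hypothetical and not part of the paper.

```python
import itertools
import math

import numpy as np


def symmetrize(t):
    """Average a k-way array over all permutations of its axes."""
    perms = list(itertools.permutations(range(t.ndim)))
    return sum(np.transpose(t, p) for p in perms) / len(perms)


def sym_tensor(g, h):
    """Symmetric tensor product of two coefficient arrays on a finite set E."""
    return symmetrize(np.multiply.outer(g, h))


def monomial(g, z):
    """<g, nu^k> for nu = sum_i z_i delta_i: contract every axis of g with z."""
    out = np.asarray(g, dtype=float)
    for _ in range(out.ndim):
        out = np.tensordot(out, z, axes=([0], [0]))
    return float(out)


rng = np.random.default_rng(0)
d = 3
g = symmetrize(rng.normal(size=(d, d)))  # symmetric coefficient of degree 2
h = rng.normal(size=d)                   # coefficient of degree 1
z = rng.normal(size=d)                   # weights of a signed measure on E

lhs = monomial(g, z) * monomial(h, z)    # <g, nu^2> <h, nu>
rhs = monomial(sym_tensor(g, h), z)      # <g (x) h, nu^3>
print(lhs, rhs)
```

Symmetrizing the outer product does not change its contraction with $z \otimes z \otimes z$, which is why the two printed values agree up to rounding.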
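The following function space will play an important role. We let $P$ denote the set of all polynomials on $M(E)$, regarded as real-valued maps on $M(E)$; it is an algebra with pointwise addition and multiplication.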

Continuity and smoothness of polynomials
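Just like ordinary polynomials, the elements of $P$ are smooth. This is made precise in Lemma 2.3 below. In its statement, we use a directional derivative of functions on $M(E)$ that is well-known since the work of [23]. A function $f : M(E) \to \mathbb{R}$ is called differentiable at $\nu$ in direction $\delta_x$ for $x \in E$ if
$$\partial_x f(\nu) := \lim_{\varepsilon \to 0} \frac{f(\nu + \varepsilon \delta_x) - f(\nu)}{\varepsilon}$$
exists. We write $\partial f(\nu)$ for the map $x \mapsto \partial_x f(\nu)$, and use the notation $\partial^k_{x_1 \cdots x_k} f(\nu) := \partial_{x_1} \cdots \partial_{x_k} f(\nu)$ for iterated derivatives. We write $\partial^k f(\nu)$ for the corresponding map from $E^k$ to $\mathbb{R}$. Observe that for $p \in P$ of the form $p(\nu) = \langle g, \nu \rangle$ we get $\partial_x p(\nu) = \lim_{\varepsilon \to 0} \big( \int g(y)\, \varepsilon \delta_x(dy) \big) \varepsilon^{-1} = g(x)$ for each $x \in E$.
The following lemma asserts basic properties of polynomials, in particular that polynomials on $M(E)$ can be uniquely extended to polynomials on $M(E^\Delta)$, which will often be the object of interest for our purposes.
Lemma 2.3. (i) [...]
(ii) Let $p \in P$ be a monomial of the form $p(\nu) = \langle g, \nu^k \rangle$. Then, for every $x \in E$ and $\nu \in M(E)$, $\partial_x p(\nu) = k \langle g(\cdot, x), \nu^{k-1} \rangle$. If $k = 0$, the right-hand side should be read as zero.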
(iii) For each p ∈ P and x ∈ E the map ∂ x p : ν → ∂ x p(ν) lies in P .
(v) The identity holds for all p, q ∈ P , x ∈ E, ν ∈ M (E).
(vi) The Taylor representation holds for all p ∈ P and ν, µ ∈ M (E), where k denotes the degree of p.
is continuous as well. Note then that by linearity in (2.1) it is enough to prove the result for $p(\nu) = \langle g, \nu^k \rangle$ and $g \in C_\Delta(E^k)$. Choose $h \in C_\Delta(E)^{\otimes k}$ such that $\|g - h\| \le \varepsilon$ and let $\nu_n \in M(E)$ form a convergent sequence with limit $\nu \in M(E)$. Observe that, by the Banach-Steinhaus theorem, $\sup_n |\nu_n|(E) < \infty$. Then [...] for some $C \ge 0$. Since $\varepsilon$ is arbitrary, this proves sequential continuity of $p$ on $M(E)$. (Footnote 1: It can be shown that sequential continuity cannot be strengthened to continuity.)
In particular we get continuity on M + (E) since this is a Polish space. The last part follows from the observation that every function in C ∆ (E) can be uniquely extended to a function in C(E ∆ ).
(ii): Using the symmetry of g, a direct calculation yields The expression for ∂ x p(ν) follows.
For the remaining part of the proof it suffices to consider monomials p(ν) = g, ν k for g ∈ C ∆ (E k ) due to the linearity in (2.1).
Continuity of x → ∂ x p(ν) follows from the dominated convergence theorem and the fact that E is Polish, and thus a sequential space.
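The derivative formula for monomials can be sanity-checked on a finite space $E = \{0, \dots, d-1\}$, where $\nu + \varepsilon \delta_x$ corresponds to $z + \varepsilon e_x$. The following sketch assumes the standard form $\partial_x \langle g, \nu^k \rangle = k \langle g(\cdot, x), \nu^{k-1} \rangle$ for symmetric $g$ and compares it with a central finite difference for $k = 3$; all names and the random data are hypothetical.

```python
import itertools

import numpy as np

rng = np.random.default_rng(1)
d, k = 4, 3

# Random symmetric coefficient g on E^3 (average over the 3! = 6 axis permutations).
raw = rng.normal(size=(d,) * k)
g = sum(np.transpose(raw, perm) for perm in itertools.permutations(range(k))) / 6.0


def p(z):
    """Monomial <g, nu^3> for nu = sum_i z_i delta_i."""
    return float(np.einsum("ijl,i,j,l->", g, z, z, z))


def dp_formula(z):
    """Vector (k <g(., x), nu^{k-1}>)_x, the assumed closed form of the derivative."""
    return k * np.einsum("ijx,i,j->x", g, z, z)


def dp_numeric(z, eps=1e-6):
    """Central finite-difference approximation of the derivative in direction delta_x."""
    grad = np.empty(d)
    for x in range(d):
        e = np.zeros(d)
        e[x] = 1.0
        grad[x] = (p(z + eps * e) - p(z - eps * e)) / (2 * eps)
    return grad


z = rng.dirichlet(np.ones(d))  # a probability vector, i.e. an element of M_1(E)
max_err = float(np.max(np.abs(dp_formula(z) - dp_numeric(z))))
print(max_err)
```

The printed discrepancy should be at the level of finite-difference error only.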
The following property turns out to be particularly useful in the context of the moment formula. In the finite-dimensional setting, the result states that every polynomial on the unit simplex has a homogeneous representative.
Corollary 2.5. Every polynomial on $M(E)$ has a unique homogeneous representative on $M_1(E)$. That is, for every $p \in P$ with $\deg(p) = m$ there is a unique $g \in C_\Delta(E^m)$ such that $p(\nu) = \langle g, \nu^m \rangle$ for all $\nu \in M_1(E)$.
Proof. Corollary 2.4 yields a unique set of coefficients $g_0, \dots, g_m$ with $g_k \in C_\Delta(E^k)$ and $p(\nu) = \sum_{k=0}^m \langle g_k, \nu^k \rangle$. The result now follows by setting $g := \sum_{k=0}^m g_k \otimes 1^{\otimes(m-k)}$.
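To make the construction in the proof concrete, the sketch below builds the homogeneous representative of a degree-two polynomial on a finite space $E$ with $d$ points, where $p$ becomes $p(z) = g_0 + g_1 \cdot z + z^\top G_2 z$. The setup (finite $E$, random coefficients, function names) is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4
g0 = 0.7                                  # degree-0 coefficient
g1 = rng.normal(size=d)                   # degree-1 coefficient
G2 = rng.normal(size=(d, d))
G2 = 0.5 * (G2 + G2.T)                    # symmetric degree-2 coefficient
ones = np.ones(d)


def p(z):
    """Inhomogeneous polynomial g0 + <g1, nu> + <G2, nu^2> with nu = sum_i z_i delta_i."""
    return g0 + g1 @ z + z @ G2 @ z


# g := g0 (1 (x) 1) + g1 (x) 1 + G2, using the symmetric tensor product.
G = g0 * np.outer(ones, ones) + 0.5 * (np.outer(g1, ones) + np.outer(ones, g1)) + G2


def p_hom(z):
    """Homogeneous degree-2 monomial <G, nu^2>."""
    return z @ G @ z


z = rng.dirichlet(np.ones(d))             # a point of the unit simplex
print(p(z), p_hom(z))                     # agree on probability measures
print(p(2 * z), p_hom(2 * z))             # generally differ off the simplex
```

The two expressions coincide only when the total mass equals one, which is exactly why the representative is tied to $M_1(E)$.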

Remark 2.6.
If we chose to work with coefficients in $C_0(E^k)$ instead of $C_\Delta(E^k)$ we would obtain the same class of polynomials on $M_1(E)$. This is because every $g \in C_\Delta(E^k)$ equals $\sum_{i=0}^k g_i \otimes 1^{\otimes(k-i)}$ for some $g_i \in C_0(E^i)$, and therefore $\langle g, \nu^k \rangle = \sum_{i=0}^k \langle g_i, \nu^i \rangle$ for all $\nu \in M_1(E)$. Indeed, the $g_i$ are given iteratively by $g_0 := g(\Delta, \dots, \Delta)$ and $g_i$ : [...] However, not every such polynomial admits a homogeneous representative on $M_1(E)$ in the sense of Corollary 2.5, unless $E$ is compact. An example is $1 + \langle g, \nu \rangle$ with $g \in C_0(E)$ nonzero. The existence of homogeneous representatives leads to significant notational simplifications when $E$ is not compact (see Remark 4.6 for more details). This is the main reason for working with the spaces $C_\Delta(E^k)$.

Polynomials with regular coefficients
For a polynomial $p$, the derivative map $x \mapsto \partial_x p(\nu)$ is only as regular as the coefficients of $p$. This leads us to consider subspaces of polynomials with more regular coefficients. Let $D \subseteq C_\Delta(E)$ be a dense linear subspace containing the constant function 1 and define
$$P^D := \mathrm{span}\{1, \langle g, \nu \rangle^k : k \ge 1,\ g \in D\}. \tag{2.2}$$
Thus $P^D$ is the subalgebra of $P$ consisting of all (finite) linear combinations of the constant polynomial and "rank-one" monomials $\langle g \otimes \cdots \otimes g, \nu^k \rangle = \langle g, \nu \rangle^k$ with $g \in D$.
Equivalently, $P^D$ consists of all polynomials $p(\nu) = \varphi(\langle g_1, \nu \rangle, \dots, \langle g_k, \nu \rangle)$ with $k \in \mathbb{N}$, $g_1, \dots, g_k \in D$, and $\varphi$ a polynomial on $\mathbb{R}^k$.
Lemma 2.7. For any $p \in P^D$ and $\nu \in M(E)$, we have $\partial^k p(\nu) \in D^{\otimes k}$. Moreover $P^D$ is dense in $C(M_1(E^\Delta))$. Here the elements of $P^D$ are viewed as functions on $M_1(E^\Delta)$ by first extending them to $M(E^\Delta)$ using Lemma 2.3(i) and then restricting them to $M_1(E^\Delta)$.
Proof. For $p(\nu) := \varphi(\langle g, \nu \rangle)$ where $\varphi$ is a polynomial we have $\partial^k p(\nu) = \varphi^{(k)}(\langle g, \nu \rangle)\, g^{\otimes k} \in D^{\otimes k}$. Thus the first part of the result holds for all such $p$, and by linearity for all $p \in P^D$. For the second part, continuity of polynomials follows by Lemma 2.3(i). Stone-Weierstrass and the fact that $D$ is densely contained in $C(E^\Delta)$ yield the density.

Optimality conditions
We now develop optimality conditions for polynomials of measure arguments, which are instrumental when working with the positive maximum principle on $M_1(E^\Delta)$. Our first result, Theorem 3.1, extends the classical first and second order Karush-Kuhn-Tucker conditions for functions on the finite-dimensional simplex (see e.g. [4]). It is derived by perturbing an optimizer $\nu^* \in M_1(E^\Delta)$ by shifting small amounts of mass to arbitrary points in $E^\Delta$. Our second result, Theorem 3.4, is obtained by deforming the optimizer $\nu^*$ using a group of isometries of $C_\Delta(E)$. The resulting condition is genuinely infinite-dimensional; see Lemma 3.6. We will use the operator $\Psi$, which maps any function $g : E \times E \to \mathbb{R}$ to the function $\Psi(g) : E \times E \to \mathbb{R}$ given by
$$\Psi(g)(x, y) = \tfrac{1}{2}\big(g(x, x) + g(y, y) - 2 g(x, y)\big). \tag{3.1}$$
Note that we use Lemma 2.3(i) to extend polynomials from $M_1(E)$ to $M_1(E^\Delta)$.
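On rank-one arguments the operator $\Psi$ has a simple closed form: for $g \otimes g$ one gets $\Psi(g \otimes g)(x, y) = \tfrac{1}{2}(g(x) - g(y))^2 \ge 0$, directly from the definition. The sketch below checks this on a finite set, where functions on $E \times E$ are matrices; the function name `psi` and the test data are illustrative assumptions.

```python
import numpy as np


def psi(G):
    """Psi(g)(x, y) = (g(x, x) + g(y, y) - 2 g(x, y)) / 2 for g given as a matrix."""
    diag = np.diag(G)
    return 0.5 * (diag[:, None] + diag[None, :] - 2.0 * G)


rng = np.random.default_rng(0)
g = rng.normal(size=5)                       # a function on E = {0, ..., 4}

P = psi(np.outer(g, g))                      # Psi applied to g (x) g
expected = 0.5 * (g[:, None] - g[None, :]) ** 2

print(np.max(np.abs(P - expected)))          # agreement up to rounding
print(np.diag(P))                            # Psi vanishes on the diagonal
```

Nonnegativity of $\Psi(g \otimes g)$ and its vanishing on the diagonal are the two features that make it compatible with the positive maximum principle arguments used later.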

Remark 3.2. Note the similarity between Theorem 3.1 and the classical Karush-Kuhn-Tucker conditions [...]
Then the first and second order (necessary) Karush-Kuhn-Tucker conditions on $\Delta^d$ hold: [...]
where $e_i$ is the $i$-th canonical unit vector. Indeed, otherwise $z \pm \varepsilon(e_i - e_j)$ would lie in $\Delta^d$ and give a higher function value for small $\varepsilon > 0$. More explicitly, we must have [...]
For the remainder of this section, $D \subseteq C_\Delta(E)$ is a linear subspace, and $P^D$ is defined by (2.2). Our next optimality condition is more subtle, in that it becomes trivial in the finite-dimensional case; see Lemma 3.6. The basic observation is that a group of [...] The value of a polynomial at its maximizer $\nu^*$ cannot be less than its value at $\nu^* - \mu + \mu_t$, for any $t$, and this leads to an optimality condition in terms of the group generator $A$. For example, if $E = \mathbb{R}$, the generator could be $Ag = \tau g'$ for some $\tau \in C^1_\Delta(\mathbb{R})$. The isometries would then be $T_t g$ : [...] The corresponding flow of measures would consist of the pushforwards of $\mu$ with respect to $\varphi(t, \cdot)$. For more details see Lemma 6.1. The [...] for a given linear operator $A : D \to C_\Delta(E)$.
Proof. Let {T t } t∈R be the group generated by A. For any µ ∈ M + (E ∆ ), the group induces a flow of measures µ t ∈ M (E ∆ ) via the formula g, µ t = T t g, µ for g ∈ C ∆ (E). The positivity and isometry property of T t implies that µ t is nonnegative and has constant total mass µ t (E ∆ ) = µ(E ∆ ). Therefore, assuming henceforth that µ ≤ ν * , it follows that We claim that both A and −A satisfy the positive maximum principle on E ∆ . Indeed, for f ∈ D and x ∈ E ∆ with f (x) = max E ∆ f ≥ 0, the positivity and isometry property give for all x ∈ supp(ν * ) due to Theorem 3.1, it follows that A(∂p(ν * ))(x) = 0 for all such x. As a result, using that supp(µ) ⊆ supp(ν * ) and that the domain of A contains A(D), we get Furthermore, using that for all g ∈ D, we deduce that for all g ∈ D ⊗ D. Inserting (3.6) and (3.7) into (3.4), dividing by t 2 , and sending t to This completes the proof.

Remark 3.5.
We claim that for $A$ as in Theorem 3.4, the operator $A^2$ satisfies the positive maximum principle [...] Then, as in (3.5) and with the same notation, we [...]
The following lemma illustrates the purely infinite-dimensional nature of the condition provided in Theorem 3.4.
Lemma 3.6. Let $A$ be the generator of a strongly continuous group of positive isometries of $C_\Delta(E)$. If the domain of $A$ is all of $C_\Delta(E)$, then $A = 0$. This is in particular the case if $A$ is bounded or $E$ consists of finitely many points.
Proof. Both $A$ and $-A$ satisfy the positive maximum principle on $E$, and $A1 = 0$. Therefore Lemma C.2 implies that $A$ and $-A$ are both of the form (C.1) with $B = \pm A$. As a result, [...] for all $x \in E$ and $g \in C(E^\Delta)$. This implies that $1_{\{x\}^c}(\xi)\, \nu_A(x, d\xi)$ and $1_{\{x\}^c}(\xi)\, \nu_{-A}(x, d\xi)$ are zero for all $x \in E$ and hence that $A = 0$. Since each linear operator on a finite-dimensional vector space is bounded, and the domain of a bounded operator on $C_\Delta(E)$ can be extended to all of $C_\Delta(E)$, the second part follows.

Polynomial operators
Let E be a locally compact Polish space. We now define polynomial operators, which constitute a class of possibly unbounded linear operators acting on polynomials. They are not defined on all of P in general, but only on the subspace P D for some dense subspace D ⊆ C ∆ (E); see (2.2). An analog of this notion has appeared previously in connection with finite-dimensional polynomial processes; see e.g. [8,21,9].
Given a linear operator L : P D → P , its associated carré-du-champ operator is the symmetric bilinear map Γ : The carré-du-champ operator gives information about the quadratic variation of the martingales appearing in the martingale problem for the operator L. It also gives information about path continuity of solutions to such martingale problems. We return to this issue in Lemma 5.2, which roughly speaking states that path continuity holds precisely when the carré-du-champ operator Γ is a derivation, which is defined as follows.
For a finite-dimensional diffusion it is known that its generator is polynomial if and only if the drift and diffusion coefficients are polynomial of first and second degree, respectively; see [8] and [21]. The following result is the generalization of this fact to the probability measure-valued setting. The proof is given in Section A.
In this case, B and Q are uniquely determined by L.
An analogue of Theorem 4.3 holds for $L$ being $S$-polynomial, where $S$ is an arbitrary subset of $M(E)$; see Theorem A.1.
[...] This process takes values in $M_1(\mathbb{R})$, and its generator $L$ acts on polynomials $p \in P^D$ by
$$Lp(\nu) = \langle B(\partial p(\nu)), \nu \rangle + \tfrac{1}{2} \langle \Psi(\partial^2 p(\nu)), \nu^2 \rangle,$$
where $Bg := \tfrac{1}{2} \sigma^2 g''$ for some $\sigma \in \mathbb{R}$. This is an $M_1(\mathbb{R})$-polynomial operator of the form (4.2), where $Q = \Psi$ as defined in (3.1). For more details, see Chapter 10.4 of [17].
Corollary 2.5 states that any polynomial on $M_1(E)$ has a unique homogeneous representative. Therefore, an operator $L$ satisfying (4.2) actually maps any monomial $\langle g, \nu^k \rangle$ to a unique monomial $\langle h, \nu^k \rangle$ on $M_1(E)$. This induces an operator $L_k$ acting on the corresponding coefficients by $L_k g := h$. The operators $L_1, L_2, \dots$ are the key objects needed to compute conditional moments of polynomial diffusions corresponding to $L$. That is, $L_k$ is characterized by $Lp(\nu) = \langle L_k g, \nu^k \rangle$, $\nu \in M_1(E)$, for every $p(\nu) = \langle g, \nu^k \rangle$ with $g \in D^{\otimes k}$.
Because of (4.2), the $k$-th dual operator $L_k$ can be written [...] where the tensor notation $B_1 \otimes \cdots \otimes B_N$ is used to denote the linear operator from $D^{\otimes k}$ [...] and the $k$-th dual operator would thus consist of a $(k+1)$-tuple of operators $L_k^k, \dots, L_k^0$. In the context of the moment formula, as stated in Theorem 5.3 below, the PIDE of (5.2) would then translate to a system of $(k+1)$ PIDEs. If one is interested in studying jump-diffusions taking values in other subspaces of $M(E)$, such as $M_+(E)$, a homogeneous representative can no longer be found and one has to deal with systems of PIDEs to compute the moments.

Existence and uniqueness of polynomial diffusions on M 1 (E)
Let E be a locally compact Polish space, D a dense linear subspace of C ∆ (E) containing the constant function 1, and L : P D → P a linear operator. In this section we study existence and uniqueness of M 1 (E)-valued polynomial diffusions, and derive the moment formula.
An $M_1(E)$-valued process $X$ with càdlàg paths defined on some filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, \mathbb{P})$ is called a solution to the martingale problem for $L$ with initial condition $\nu \in M_1(E)$ if $X_0 = \nu$ $\mathbb{P}$-a.s. and
$$N^p_t := p(X_t) - p(\nu) - \int_0^t Lp(X_s)\, ds \tag{5.1}$$
defines a local martingale for every $p \in P^D$.
The following lemma relates path continuity of solutions to the martingale problem to the carré-du-champ operator being a derivation. This explains why we consider derivations in Theorem 4.3.

Lemma 5.2.
If the carré-du-champ operator Γ associated to L is an M 1 (E)-derivation, then any solution to the martingale problem for L has continuous paths. Conversely, if for every initial condition ν ∈ M 1 (E) there is a solution to the martingale problem for L with continuous paths, then the carré-du-champ operator Γ associated to L is an M 1 (E)-derivation.

Proof. Let X be a solution to the martingale problem for L. By Proposition 2 in [2], the real-valued process p(X) is continuous for every p ∈ P D , in particular for every linear monomial p(ν) = h, ν with h ∈ D. Since D is dense in C ∆ (E), we can conclude that X is continuous with respect to the topology of weak convergence on M 1 (E).
Conversely, if X is a solution to the martingale problem for L with continuous paths, then, by Lemma 2.3(i), the map t → p(X t ) is continuous for all p ∈ P D . The result now follows by Proposition 1 in [2].

Moment formula and uniqueness in law
Polynomial diffusions are of interest in applications because they generally satisfy a moment formula, which allows moments of the process to be computed tractably. If $E$ is a finite set, the moment formula always holds, but technical conditions, in particular on the dual operators, are needed in the general case. For details regarding operators and semigroups, we refer e.g. to [17].
Theorem 5.3. Suppose $L$ satisfies (4.2) and fix $k \in \mathbb{N}$. Assume that the $k$-th dual operator $L_k$ is closable, and let $g$ be in the domain of its closure $\overline{L}_k$. Suppose that there is a solution $u : \mathbb{R}_+ \times E^k \to \mathbb{R}$ of
$$\frac{\partial u}{\partial t}(t, x) = \overline{L}_k u(t, \cdot)(x), \qquad u(0, x) = g(x), \tag{5.2}$$
and suppose that $\sup_{t \in [0,T]} \|\overline{L}_k u(t, \cdot)\| < \infty$ for all $T \in \mathbb{R}_+$. In particular, $u(t, \cdot)$ is assumed to be in the domain of $\overline{L}_k$ for all $t \ge 0$. Then for any continuous solution $X$ to the martingale problem for $L$, one has the moment formula
$$\mathbb{E}[\langle g, X_T^k \rangle \mid \mathcal{F}_t] = \langle u(T - t, \cdot), X_t^k \rangle, \qquad 0 \le t \le T. \tag{5.3}$$
Proof. We follow the proof of Theorem 4.4.11 in [17] and extend it to obtain also the formula for the conditional moments. Fix $T \in \mathbb{R}_+$, $t \in [0, T]$, and $A \in \mathcal{F}_t$. Define [...] for all $(s_1,$ [...] Equation (5.2) and the fundamental theorem of calculus then yield [...] Fix then $s_1 \in [0, T - t]$. Since $u(t, \cdot)$ is in the domain of $\overline{L}_k$ for all $t \in \mathbb{R}_+$, (5.1) yields [...]
In order to avoid confusion, for the rest of the section we denote by $u_g$ the solution of (5.2) with initial condition $u_g(0, \cdot) = g$.
In most of the cases of interest (see Remark 5.7(iii) below) the operator $L_k$ satisfies the positive maximum principle on $E^k$, for each $k \in \mathbb{N}$. If this is the case, the existence of a solution $u_g$ of (5.2) satisfying the conditions of Theorem 5.3 for sufficiently many $g$ is essentially equivalent to the fact that $L_k$ generates a strongly continuous positive contraction semigroup on $C_\Delta(E^k)$, or in other words that it is the generator of a Feller process on $E^k$. We state this precisely in the following remark.
Remark 5.4. Let L satisfy (4.2) and let X denote a solution to the corresponding martingale problem with initial condition X 0 = ν ∈ M 1 (E). Assume that the corresponding k-th dual operator L k satisfies the positive maximum principle on (E ∆ ) k (which in particular implies that L k is closable), for each k ∈ N.
Let $D_0$ be a dense subset of the domain of $\overline{L}_k$ and suppose that the conditions of Theorem 5.3 hold true for all $g \in D_0$. By Proposition 1.3.4 of [17], if we additionally have that $t \mapsto \overline{L}_k u_g(t, \cdot)$ is continuous, then $\overline{L}_k$ is the generator of a strongly continuous contraction semigroup $\{Y^k_t\}_{t \ge 0}$ on $C_\Delta(E^k)$ and $Y^k_t g = u_g(t, \cdot)$. In this case the moment formula reads as [...] Conversely, if $\overline{L}_k$ is the generator of a strongly continuous contraction semigroup $\{Y^k_t\}_{t \ge 0}$ on $C_\Delta(E^k)$, then for all $g$ in the domain of $\overline{L}_k$ the map $u_g(t, x) := Y^k_t g(x)$ satisfies the conditions of Theorem 5.3. By the Hille-Yosida theorem, this is for instance the case if the range of $\lambda - L_k$ is dense in $C_\Delta(E^k)$ for some $\lambda > 0$. In this case, Corollary 4.2.8 in [17] yields a solution $Z^{(k)}$ (without loss of generality defined on the same probability space as $X$) to the martingale problem for $L_k$ with values in $(E^\Delta)^k$ and satisfying [...] This gives an alternative interpretation to (5.3), namely that the PIDE in (5.2) is the Feynman-Kac PIDE associated to the $k$-dimensional Markov process $Z^{(k)}$. Note that in the case of a finite state space $E$, (5.2) reduces to an ODE and $L_k$ is automatically the generator of a $k$-dimensional Markov chain.
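For a finite state space the remark above can be illustrated with a few lines of Python. The sketch treats only the first moment ($k = 1$) and assumes that the first dual operator acts on coefficients as the rate matrix $B$ of a Markov chain on $E$, so that $u_g(t, \cdot) = e^{tB} g$ and $\mathbb{E}[\langle g, X_T \rangle] = \langle e^{TB} g, \nu \rangle$. The two-state rates, the initial law, and the helper `expm_taylor` are hypothetical choices made for the example.

```python
import numpy as np


def expm_taylor(A, terms=30):
    """Matrix exponential via scaling-and-squaring with a truncated Taylor series."""
    A = np.asarray(A, dtype=float)
    norm = np.linalg.norm(A, ord=np.inf)
    s = max(0, int(np.ceil(np.log2(norm))) + 1) if norm > 0 else 0
    A_scaled = A / 2**s
    term = np.eye(A.shape[0])
    result = np.eye(A.shape[0])
    for j in range(1, terms + 1):
        term = term @ A_scaled / j
        result = result + term
    for _ in range(s):
        result = result @ result
    return result


# Assumed two-state example: jump rates a (state 0 -> 1) and b (state 1 -> 0).
a, b = 1.5, 0.5
B = np.array([[-a, a], [b, -b]])
g = np.array([1.0, 0.0])       # coefficient of the linear monomial <g, nu>
nu = np.array([0.3, 0.7])      # initial probability measure on E = {0, 1}
T = 2.0

P_T = expm_taylor(T * B)       # semigroup Y_T = exp(T B)
u_T = P_T @ g                  # solution u_g(T, .) of the linear ODE u' = B u
moment = float(nu @ u_T)       # <u_g(T, .), nu>

# Closed form for a two-state chain with stationary law (b, a) / (a + b).
pi = np.array([b, a]) / (a + b)
closed_form = float(pi @ g + np.exp(-(a + b) * T) * (nu @ g - pi @ g))
print(moment, closed_form)
```

Higher moments would require the operator $L_k$ on functions of $k$ variables, which is not attempted here.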
As in the finite-dimensional case, the moment formula gives well-posedness of the martingale problem.
Corollary 5.5. Suppose L satisfies (4.2), and let X be a continuous solution to the martingale problem for L with initial condition ν ∈ M 1 (E). If the moment formula (5.3) holds for all g ∈ D ⊗k and k ∈ N, then the law of X is uniquely determined by L and ν.
Proof. By the moment formula (5.3) we have $\mathbb{E}[\langle g, X_T^k \rangle] = \langle u_g(T, \cdot), \nu^k \rangle$ for all $k \in \mathbb{N}$ and $g \in D^{\otimes k}$. Since $g \mapsto u_g$ is determined by $L$, Lemma 2.7 yields that the one-dimensional distributions of $X$ are uniquely determined by $L$ and $\nu$. The conclusion follows by Theorem 4.4.2 in [17].

Existence and well-posedness
Our first main result of this section gives abstract sufficient conditions for existence of solutions to the martingale problem. Applications of this result are discussed in Section 6. Recall that E is throughout a locally compact Polish space.
[...] where $\alpha : E^2 \to \mathbb{R}$ is a nonnegative symmetric function and, for $i = 1, \dots, n$, $A_i$ is the generator of a strongly continuous group of positive isometries of $C_\Delta(E)$, and the domain of $A_i$ contains both $D$ and $A_i(D)$, [...] (iii) $B - \frac{1}{2} \sum_{i=1}^n A_i^2$ satisfies the positive maximum principle on $E^\Delta$. Then $L$ is $M_1(E)$-polynomial and its martingale problem has a solution with continuous paths for every initial condition $\nu \in M_1(E)$. If in addition the moment formula (5.3) holds for all $g \in D^{\otimes k}$ and $k \in \mathbb{N}$, then the martingale problem for $L$ is well-posed.
Note that (4.2) imposes the implicit condition on $\alpha$ that $\alpha \Psi(g)$ must lie in $C_\Delta(E^2)$ for every $g \in D \otimes D$. If $D = C_\Delta(E)$, then $\alpha$ is necessarily bounded, as is seen from Theorem 5.9 below. However, this does not hold for general $D \subseteq C_\Delta(E)$, as one can see by considering $E = \mathbb{R}$, $D \subseteq C^1_\Delta(\mathbb{R})$, and $\alpha(x, y) = |x - y|^{-1} \mathbf{1}_{\{x \ne y\}}$.
Proof. [...]
Therefore, since $B - \frac{1}{2} \sum_{i=1}^n A_i^2$ satisfies the positive maximum principle and $\alpha$ is nonnegative, we get [...]
(ii) Let us also remark that the $k$-th dual operator $G_k$ of $\langle G(\partial p(\nu)), \nu \rangle$ satisfies the positive maximum principle on $(E^\Delta)^k$ if it holds for $G$ on $E^\Delta$. Indeed, if $x^* \in (E^\Delta)^k$ is a maximum of $g$, then $x^*_i$ is a maximum of $g(\dots, x^*_{i-1}, \cdot, x^*_{i+1}, \dots)$. Hence $G_k$ given by [...] where we use the same notation as in (4.5), clearly satisfies the positive maximum principle on $(E^\Delta)^k$.
(iii) Consider the setting and the assumptions of Theorem 5.6 and define [...] Note that by (4.4) we have $L_k = G_k + C_k + T_k$. We claim that $G_k$, $C_k$, $T_k$, and hence $L_k$, satisfy the positive maximum principle on $(E^\Delta)^k$.
By item (iii) in Theorem 5.6, $B - \frac{1}{2} \sum_{i=1}^n A_i^2$ satisfies the positive maximum principle on $E^\Delta$, whence by (ii) it holds also for $G_k$ on $(E^\Delta)^k$. The form of $\Psi$ and the nonnegativity of $\alpha$ guarantee that this holds also for $C_k$. Finally, since $T_k = \sum_{i=1}^n$ [...] $(\dots, x_{j-1}, \cdot, x_{j+1}, \dots)(x_j)$, Remark 3.5 yields the positive maximum principle on $(E^\Delta)^k$ also for $T_k$ and thus for $L_k$.
The following result gives a useful condition for uniqueness when all the operators A i are zero. Due to Lemma 3.6 this happens, for instance, if D = C ∆ (E) and in particular if E consists of finitely many points. An example where uniqueness holds when those operators are not all zero is given in Example 6.7.
Lemma 5.8. Consider the setting and assumptions of Theorem 5.6, and assume that $A_i = 0$ for all $i$. Assume additionally that $\alpha$ is bounded and that $B$ is closable with closure generating a strongly continuous contraction semigroup on $C_\Delta(E)$. Then the moment formula (5.3) holds for all $g \in C_\Delta(E^k)$ and $k \in \mathbb{N}$.
Since B satisfies the positive maximum principle on E ∆ due to Theorem 5.6(iii), the Hille-Yosida theorem guarantees that the conditions of the lemma are satisfied whenever λ − B has dense range in C ∆ (E) for some λ > 0.
Proof. Let $\{Y^1_t\}_{t \ge 0}$ be the semigroup corresponding to $B$. Fix any $k \in \mathbb{N}$ and let $B_k$ and $Q_k$ be as in (4.5). It is straightforward to check that $B_k$ is the restriction to $D^{\otimes k}$ of the generator of the strongly continuous contraction [...] Moreover, one has the estimate [...]
[...] $Bg = \int (g(\xi) - g(\cdot))\, \nu_B(\cdot, d\xi)$ and $Qg = \alpha \Psi(g)$, (5.5) where $\nu_B$ is a nonnegative, finite kernel from $E$ to $E$, and $\alpha : (E^\Delta)^2 \to \mathbb{R}$ is nonnegative, symmetric, bounded, and continuous on $(E^\Delta)^2 \setminus \{x = y\}$. In this case, for each $k \in \mathbb{N}$ the $k$-th dual operator $L_k$ satisfies the hypothesis of Theorem 5.3, and the moment formula (5.3) holds for all $g \in C_\Delta(E^k)$. Moreover, $B$ and $Q$, and hence each $L_k$, are bounded operators.
As in Theorem 5.6, condition (4.2) imposes implicit conditions on the different parameters. This is the case for the measure $\nu_B$, which in particular needs to satisfy $\int (g(\xi) - g(\cdot))\, \nu_B(\cdot, d\xi) \in C_\Delta(E)$ for all $g \in C_\Delta(E)$. This condition is clearly satisfied if the map from $E$ to $M_+(E)$ given by $x \mapsto \nu_B(x, \cdot)$ is continuous. However, the converse fails, as one can see by considering the following kernel [...] for some continuous $\varphi : E \to E$ such that $\varphi \ne \mathrm{id}$.
Proof. Since by Theorem 5.9 each $L_k$ is bounded, the operator $L$ can be uniquely extended to $P^{C_\Delta(E)}$. The result then follows by the same theorem.
The last main result of this section characterizes probability measure-valued polynomial martingales. An $M_1(E)$-valued process $X$ is called a martingale if $\langle g, X \rangle$ is a martingale for every $g \in C_\Delta(E)$. Note that, unlike Theorem 5.6, the conditions are both necessary and sufficient, regardless of the choice of domain $D$. [...]
Proof. To prove the forward implication, first note that Lemma 5.2 and Theorem 4.3 imply that $L$ satisfies (4.2). To see that $B = 0$, pick any $g \in D$ and $x \in E$, and let $X$ be a solution to the martingale problem with initial condition $\delta_x$. Since $\langle g, X \rangle$ is a martingale, we have $\langle Bg, X \rangle = 0$ and hence $Bg(x) = \langle Bg, X_0 \rangle = 0$. The form of $Q$ will follow from Lemma C.3. To verify its hypotheses, fix $g \in D$ and $\nu \in M_1(E)$, and define $p \in P^D$ by $p(\mu) := -(\langle g, \nu \rangle - \langle g, \mu \rangle)^2$. Then $\partial^2 p(\nu) = -2 g \otimes g$, $p \le 0$, and $p(\nu) = 0$, so the positive maximum principle yields $-\langle Q(g \otimes g), \nu^2 \rangle = Lp(\nu) \le 0$.
To prove the reverse implication, observe that existence of solutions to the martingale problem, along with path continuity, follows from Corollary 5.10, as does well-posedness if in addition $\alpha$ is bounded. Since $B = 0$, it is clear that $\langle g, X\rangle$ is a martingale for every $g \in D$ and every solution $X$ to the martingale problem. Since $D$ is dense in $C_\Delta(E)$, this implies that $X$ is a martingale.

Finite underlying space
Let $E = \{1, \dots, d\}$. Then $C_\Delta(E) = C(E)$ is finite-dimensional, so any dense linear subspace must equal the whole space. We therefore take $D = C(E)$. In this setting, any $M_1(E)$-valued process $X$ is of the form $X_t = \sum_{i=1}^d Z^i_t \delta_i$ for some $\Delta^d$-valued process $Z = (Z^1, \dots, Z^d)$. When $X$ is a polynomial diffusion, Theorem 5.9 describes its generator $L$ in terms of a kernel $\nu_B$ from $E$ to $E$ and a nonnegative symmetric function $\alpha : E^2 \to \mathbb{R}$.
As we now show, the process Z then also solves a martingale problem whose generator can be written down explicitly.
In view of Example 2.1, any polynomial $f$ on $\Delta^d$ can be represented as $f(z) = p(z_1\delta_1 + \cdots + z_d\delta_d)$ for some $p \in P^D$. We may then define an operator $A$ acting on such polynomials $f$ by the formula $Af(z) := Lp(z_1\delta_1 + \cdots + z_d\delta_d)$.
Since $f(Z) = p(X)$ and $Af(Z) = Lp(X)$, it is clear that $Z$ is a solution to the martingale problem for $A$ with polynomials $f$ as test functions. Conversely, if a solution $Z$ to this martingale problem is given, a solution to the martingale problem for $L$ is obtained by setting $X_t := \sum_{i=1}^d Z^i_t \delta_i$. Next, a computation shows that $A$ has the form and $a_{kk}(z) = -\sum_{\ell \ne k} a_{k\ell}(z)$. Here well-posedness was obtained in [21], which we thus recover as a special case. In particular, $Z$ is a polynomial diffusion on $\Delta^d$ in the sense of [21, Definition 2.1]. Furthermore, Theorem 5.9 yields the moment formula for $X$, which reduces to the corresponding formula for $Z$ given in [21, Theorem 3.1].
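The correspondence $Af(z) = Lp(z_1\delta_1 + \cdots + z_d\delta_d)$ can be checked numerically for quadratic monomials. The following is a minimal sketch, not the paper's formula for $A$: it assumes the normalization $\Psi(g \otimes g)(x,y) = \tfrac12 (g(x) - g(y))^2$, the drift $b_i(z) = \sum_j z_j \nu_B(j,\{i\}) - z_i\, \nu_B(i,E)$, and the off-diagonal coefficients $a_{k\ell}(z) = -\alpha(k,\ell) z_k z_\ell$ for $k \ne \ell$. All function names are illustrative, and states are indexed from 0.

```python
# Sketch for finite E = {0, ..., d-1}. The normalization of Psi and the
# coefficients b, a below are assumptions made for illustration only.

def pairing(g, z):
    """<g, nu> for nu = sum_i z[i] * delta_i."""
    return sum(gi * zi for gi, zi in zip(g, z))

def apply_B(g, nu_B):
    """(Bg)(j) = sum_i (g[i] - g[j]) * nu_B[j][i], the jump operator in (5.5)."""
    d = len(g)
    return [sum((g[i] - g[j]) * nu_B[j][i] for i in range(d)) for j in range(d)]

def L_quadratic(g, z, nu_B, alpha):
    """L p(nu) for p(nu) = <g, nu>^2 via (4.2) with Q = alpha * Psi.

    Uses d p(nu) = 2 <g, nu> g and d^2 p(nu) = 2 g (x) g, and assumes
    Psi(g (x) g)(x, y) = (g(x) - g(y))^2 / 2.
    """
    d = len(g)
    drift = 2.0 * pairing(g, z) * pairing(apply_B(g, nu_B), z)
    diffusion = sum(alpha[i][j] * 0.5 * (g[i] - g[j]) ** 2 * z[i] * z[j]
                    for i in range(d) for j in range(d))
    return drift + diffusion

def A_quadratic(g, z, nu_B, alpha):
    """A f(z) for f(z) = (g . z)^2 with the assumed simplex coefficients."""
    d = len(g)
    # Assumed drift: inflow of mass into state i minus outflow from state i.
    b = [sum(z[j] * nu_B[j][i] for j in range(d)) - z[i] * sum(nu_B[i])
         for i in range(d)]
    # Assumed diffusion matrix: off-diagonal entries first, then the diagonal
    # from a_kk = -sum_{l != k} a_kl, as stated in the text.
    a = [[-alpha[k][l] * z[k] * z[l] if k != l else 0.0 for l in range(d)]
         for k in range(d)]
    for k in range(d):
        a[k][k] = -sum(a[k][l] for l in range(d) if l != k)
    gz = pairing(g, z)
    first_order = sum(b[i] * 2.0 * gz * g[i] for i in range(d))
    second_order = 0.5 * sum(a[k][l] * 2.0 * g[k] * g[l]
                             for k in range(d) for l in range(d))
    return first_order + second_order
```

Under these assumptions the two functions agree for any weights $z \in \Delta^d$, any nonnegative kernel matrix `nu_B`, and any symmetric nonnegative matrix `alpha`, which is the finite-dimensional content of $Af(Z) = Lp(X)$ at degree two.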

Underlying space $E \subseteq \mathbb{R}^d$
Let E ⊆ R d be a closed subset and set Our goal is to analyze Theorem 5.6 in this setting. If E is not all of R d , the dynamics of the spatial motion is restricted. Intuitively, its diffusion component must be tangential to the boundary of E. This is encoded as follows.  Note that A i is well-defined by (6.3) in the sense that A i g only depends on g through its values on E. This is a direct consequence of the definition (6.2) of Σ d (E).
Proof. By Proposition 2.5 in [10], for each i = 1, . . . , d, there exists a map (t, x)), t ∈ R, defines a strongly continuous group of positive isometries of C ∆ (E) with generator A i . It is clear that the domain of A i contains D, and it also contains A i (D) since the components of τ i lie in C 1 ∆ (R d ).
(ii) Q is given by where τ ∈ Σ d (E) and α : E 2 → R is a nonnegative symmetric function,
Proof. This follows directly from Lemma 6.1, up to the fact that in (iii) we need to verify that the positive maximum principle holds on E ∆ , not just on E. Since D ⊆ C c (E), this follows from Remark 5.7(i).
The rest of the section is devoted to the case $d = 1$ and $E = \mathbb{R}$. In view of Lemma C.1, the operator $B$ should satisfy the positive maximum principle on $E = \mathbb{R}$. It is well known, see e.g. [6] or [24], that under this condition $B$ is a Lévy type operator, i.e.
$Bg = b g' + \tfrac{1}{2} a g'' + \int \big(g(\,\cdot + \xi) - g - \chi(\xi) g'\big)\, F(\,\cdot\,, d\xi), \quad g \in D,$ (6.4)
for some continuous functions $a$, $b$ with $a \ge 0$, a truncation function $\chi$, and a kernel $F(\,\cdot\,, d\xi)$ from $\mathbb{R}$ to $\mathbb{R}$ such that $\int |\xi|^2 \wedge 1 \, F(\,\cdot\,, d\xi) < \infty$. Every operator of this form satisfies $B1 = 0$ and the positive maximum principle on $\mathbb{R}$. The following result expresses Theorem 6.2 in this setting.
Corollary 6.3. Let $L : P^D \to P$ be a linear operator satisfying (4.2), where $B$ is given by (6.4) with $a := \sigma^2 + \tau^2$ for some continuous functions $\sigma$ and $\tau$, and $Q$ is given by where $\alpha \in C_\Delta(\mathbb{R}^2)$ is nonnegative and $\tau \in C^1_\Delta(\mathbb{R})$. Assume also that $B$ is $\mathbb{R}$-conservative.
The coefficient $\alpha$ quantifies the diffusive exchange of mass between different points in the support of $X_t(dx)$. This is perhaps most clearly seen when $E = \{1, \dots, d\}$; see Section 6.1. The role of $\tau$ is different, as it governs random fluctuations of the support of $X_t(dx)$. The following example illustrates this point.

Example 6.4.
Consider an operator $L$ of the form given in Corollary 6.3 with $\alpha = 0$, $Bg = \tfrac{1}{2} g''$, and $\tau = 1$ (hence $\sigma = 0$). The resulting operator $Q$ is given by $Q(g \otimes g) = g' \otimes g'$. A solution to the martingale problem for $L$ is given by $X = \delta_W$, where $W$ is a Brownian motion. Indeed, applying Itô's formula to $\langle g, X_t\rangle^k = g(W_t)^k$ for any $g \in D$ and $k \in \mathbb{N}_0$ establishes that (5.1) is a martingale for any $p \in P^D$.
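The Itô computation behind this example can be written out as follows. This is a sketch for $k \ge 2$ (the cases $k = 0, 1$ are analogous, with the vanishing terms dropped). It assumes the derivative convention $\partial p(\nu) = k \langle g, \nu\rangle^{k-1} g$ and $\partial^2 p(\nu) = k(k-1) \langle g, \nu\rangle^{k-2}\, g \otimes g$ for $p(\nu) = \langle g, \nu\rangle^k$, together with the form $Lp(\nu) = \langle B(\partial p(\nu)), \nu\rangle + \tfrac12 \langle Q(\partial^2 p(\nu)), \nu^2\rangle$ of (4.2).

```latex
\begin{align*}
d\, g(W_t)^k
  &= k\, g(W_t)^{k-1} g'(W_t)\, dW_t
   + \tfrac12 \Big( k\, g(W_t)^{k-1} g''(W_t)
   + k(k-1)\, g(W_t)^{k-2} g'(W_t)^2 \Big)\, dt, \\
Lp(\nu)
  &= \tfrac{k}{2}\, \langle g, \nu\rangle^{k-1} \langle g'', \nu\rangle
   + \tfrac{k(k-1)}{2}\, \langle g, \nu\rangle^{k-2} \langle g' \otimes g', \nu^2\rangle, \\
Lp(\delta_w)
  &= \tfrac{k}{2}\, g(w)^{k-1} g''(w)
   + \tfrac{k(k-1)}{2}\, g(w)^{k-2} g'(w)^2 .
\end{align*}
```

The drift in the first line equals $Lp(\delta_{W_t})$, so $p(X_t) - \int_0^t Lp(X_s)\, ds$ reduces to the stochastic integral against $W$, which is a martingale when $g$ and $g'$ are bounded.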
In this example, as well as in Corollary 6.3, a nonzero $\tau$ in the specification of $Q$ is coupled with a corresponding diffusive component in the specification (6.4) of $B$. The following result shows that this is a general phenomenon.
Proposition 6.5. Let $L : P^D \to P$ be a linear operator satisfying (4.2) with $B$ given by (6.4). Suppose that $L$ satisfies the positive maximum principle on $\mathbb{R}$. If $a = 0$, then $Q = \alpha\Psi$ for some nonnegative symmetric function $\alpha : \mathbb{R}^2 \to \mathbb{R}$.
The next lemma constitutes the main tool to prove Proposition 6.5, but it also has other consequences. In particular, it implies that $Q(g \otimes g)(x, y)$ depends on $g$ only through $g(x)$, $g(y)$, $g'(x)$, and $g'(y)$, provided that $L$ satisfies the positive maximum principle on $M_1(E)$. This illustrates that the form of $Q$ as given in Theorem 6.2 is very general.
Lemma 6.6. Let $L : P^D \to P$ be a linear operator satisfying (4.2) with $B$ given by (6.4). Suppose that $L$ satisfies the positive maximum principle on $\mathbb{R}$. Then, for all $\lambda \in [0, 1]$, $g \in D$, and $x, y \in \mathbb{R}$ such that $g(x) = g(y)$, we have that
Proof. Fix $g \in D$ such that $g(x) = g(y)$. Since, by Lemma C.1, $B1 = 0$ and $Q(g \otimes 1) = 0$, it is enough to consider the case $g(x) = g(y) = 1$. The result will follow from Lemma C.4. Indeed, if we let $(p_n)_{n \in \mathbb{N}}$ and $(f_n)_{n \in \mathbb{N}}$ be the sequences described there, by the positive maximum principle of $L$ on $\mathbb{R}$ we get $0 \ge Lp_n(\nu_\lambda) = \langle Bf_n, \nu_\lambda\rangle + \tfrac{1}{2} \langle Q(g \otimes g), \nu_\lambda^2\rangle$, and letting $n$ go to $\infty$ we can conclude the proof.
To verify the hypotheses of Lemma C.4, observe that Lemma C.1 yields
Fix some $g \in D$ and $x, y \in \mathbb{R}$ such that $g(z) = g'(z) = 0$ for $z \in \{x, y\}$, and suppose that $\|g\| = 1$. Let $F_n : [0, 1] \to \mathbb{R}$ be the function defined in Lemma B.1. Consider then the sequence of polynomials given by where, for some compactly supported function $\rho \in C^\infty_\Delta(\mathbb{R})$ such that $\rho = 1$ on some neighborhood of $x$ and $y$ and $\rho(\mathbb{R}) \subseteq [0, 1]$, Observe that the conditions on $g$ guarantee that for $C$ big enough $|g| \le H$ and thus $|\langle g, \nu\rangle| \le \langle H, \nu\rangle$ for all $\nu \in M_1(\mathbb{R})$. For $\operatorname{supp}(\rho)$ small enough we also have that $H \le 1$. Lemma B.1 then yields $\langle g, \nu\rangle^2 F_n(\langle H, \nu\rangle) \le \tfrac{1}{n} \langle H, \nu\rangle$ for all $\nu \in M_1(\mathbb{R})$, and therefore $p_n \le 0$ on $M_1(\mathbb{R})$. This automatically implies that $p_n$ has a maximum at $\nu_\lambda$ for all $\lambda \in [0, 1]$. Proceeding as in the proof of Theorem 5.9 we then obtain that $\langle Q(g \otimes g), \nu_\lambda^2\rangle = 0$ for any $g \in D$ such that $g(x) = g(y) = 1$ and $g'(x) = g'(y) = 0$. Choosing $\lambda = 0, 1, 1/2$ we get the result.
The following example gives a simple condition for well-posedness. We let $G_k$, $C_k$, and $T_k$ be as in Remark 5.7(iii).
Example 6.7. Consider the setting of Corollary 6.3. Suppose that $\sigma^2$ is bounded away from zero, let the jump kernel $F(\,\cdot\,, d\xi)$ in (6.4) be zero, and assume that the parameters $b$ and $\sigma^2$ are Lipschitz continuous and bounded. Then, by Theorem 8.1.6 of [17], $B$ is $\mathbb{R}$-conservative and the closure of $G_k + T_k$ generates a strongly continuous semigroup on $C_\Delta(E^k)$ for each $k \in \mathbb{N}$. Since $C_k$ is bounded, $L_k$ generates a strongly continuous contraction semigroup on $C_\Delta(E^k)$ as well (see e.g. Theorem 1.7.1 in [17] for more details). Since Remark 5.7(iii) shows that $L_k$ satisfies the positive maximum principle, Remark 5.4 and Theorem 5.3 yield the moment formula for all $g \in D^{\otimes k}$. Well-posedness thus follows from Theorem 5.6.

Conditional laws of jump-diffusions are polynomial
In this section we deal with particle systems driven by some idiosyncratic noise (Brownian motion and jumps) and one common Brownian motion. We show that for essentially all such jump diffusions the conditional law with respect to the common Brownian motion is polynomial.
Moreover, let (Z i ) i∈N be a weak solution of the system where W 0 is a Brownian motion and (W 1 , p 1 ), (W 2 , p 2 ), . . . is a sequence of couples of Brownian motions and random measures with compensator F ( · , dξ). We assume that each couple is independent of the other couples and of W 0 . Note that the generator of each Z i is given by B as defined in (6.4).
Assume now that Z 1 , Z 2 , . . . are exchangeable and set By De Finetti's theorem (see e.g. Theorem 4.1 in [27] or, for a general overview, also Section 12.3 in [26]) we get that (Z i In the following proposition we now show that X is polynomial by proving that it solves the martingale problem for the polynomial operator L specified above. Proposition 6.8. Let X be given by (6.6). Then X solves the martingale problem for L with initial condition δ x .

Proof. Let g ∈ D ⊗k and set Z := (Z 1 , . . . , Z k ). Then we get that -martingale and hence setting p(ν) := g, ν k we can compute using (6.7) proving that X is a solution to the martingale problem for L.

A Proof of Theorem 4.3 and a generalization
We first prove Theorem 4.3. Assume first that $L$ is of the stated form. Then for monomials $p(\nu) = \langle g, \nu\rangle^k$ with $g \in D$, $k \in \mathbb{N}$ and $\nu \in M_1(E)$ one has which is a polynomial in $\nu$ of degree at most $k$. Moreover, $L1 = 0$. By linearity, this shows that $L$ is $M_1(E)$-polynomial. Next, a direct calculation yields $\Gamma(p, q)(\nu) = \langle Q(\partial p(\nu) \otimes \partial q(\nu)), \nu^2\rangle$ for all $\nu \in M_1(E)$, which is easily seen to be an $M_1(E)$-derivation due to the product rule given in Lem-
Conversely, assume $L$ is $M_1(E)$-polynomial and $\Gamma$ is an $M_1(E)$-derivation. Consider arbitrary first degree monomials $q(\nu) = \langle g, \nu\rangle$ and $r(\nu) = \langle h, \nu\rangle$, $g, h \in D$. The $M_1(E)$-polynomial property and Corollary 2.5 yield for some map $B : D \to C_\Delta(E)$ that is easily seen to be linear due to the linearity of $L$. Furthermore, the $M_1(E)$-polynomial property, definition (4.1) of $\Gamma$, and Corollary 2.5 imply that where $Q$ inherits symmetry and linearity from $\Gamma$ and takes values in $C_\Delta(E^2)$. Thus, by taking linear combinations, we can and do extend them to operators on $D \otimes D$.
We now make more substantial use of the fact that $\Gamma$ is an $M_1(E)$-derivation in order to extend (4.2) to higher degree monomials. We proceed by induction on $k$, and assume $Lp$ is of the form (4.2) for all $p = q^l$, $l \le k$. So far we have proved this for $k = 2$. The definition (4.1) of $\Gamma$ and the fact that it is an $M_1(E)$-derivation give the identity on $M_1(E)$
$L(q^{k+1}) = 2q\,L(q^k) - q^2 L(q^{k-1}) + q^{k-1}\, \Gamma(q, q)$ for $k \ge 2$.
Due to the induction assumption, the right-hand side can be computed explicitly using (4.2). The result is which is equal to $\langle B(\partial p(\nu)), \nu\rangle + \tfrac{1}{2} \langle Q(\partial^2 p(\nu)), \nu^2\rangle$ with $p = q^{k+1}$, for all $\nu \in M_1(E)$. This concludes the induction step. It follows by induction that (4.2) holds for all monomials $\langle g, \nu\rangle^k$, and by linearity for all $p \in P^D$. Finally, the uniqueness assertion is immediate from the way $B$ and $Q$ were obtained above. This completes the proof of Theorem 4.3.
We now state a generalization of Theorem 4.3, where $M_1(E)$ is replaced by a general state space. We let $E$ be a locally compact Polish space, $D \subseteq C_\Delta(E)$ be a dense linear subspace, and fix $S \subseteq M(E)$.
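The recursion used in the induction step can be derived in a few lines. The sketch below assumes that definition (4.1) reads $\Gamma(p, q) = L(pq) - p\,Lq - q\,Lp$ (this is consistent with the formula for $\Gamma$ obtained from (4.2) above) and uses the derivation property in the form $\Gamma(q^k, q) = k\, q^{k-1}\, \Gamma(q, q)$.

```latex
\begin{align*}
L(q^{k+1}) &= q\, L(q^k) + q^k\, Lq + \Gamma(q^k, q)
            = q\, L(q^k) + q^k\, Lq + k\, q^{k-1}\, \Gamma(q, q), \\
L(q^{k})   &= q\, L(q^{k-1}) + q^{k-1}\, Lq + (k-1)\, q^{k-2}\, \Gamma(q, q), \\
q^k\, Lq   &= q\, L(q^k) - q^2 L(q^{k-1}) - (k-1)\, q^{k-1}\, \Gamma(q, q), \\
L(q^{k+1}) &= 2q\, L(q^k) - q^2 L(q^{k-1}) + q^{k-1}\, \Gamma(q, q).
\end{align*}
```

The third line is the second multiplied by $q$ and rearranged; substituting it into the first line gives the fourth.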

B Proof of Theorem 5.9
Assume $L$ satisfies (4.2) with $B$ and $Q$ as in (5.5), where $\nu_B$ is a nonnegative, finite kernel from $E$ to $E$, and $\alpha : (E^\Delta)^2 \to \mathbb{R}$ is nonnegative, symmetric, bounded, and continuous on $(E^\Delta)^2 \setminus \{x = y\}$. Clearly $Q$ is bounded with operator norm $2\|\alpha\|$. Identifying $C_\Delta(E)$ and $C(E^\Delta)$, we infer from Lemma C.2 that $B$ is bounded, satisfies $B1 = 0$ as well as the positive maximum principle on $E^\Delta$, and that $\{e^{tB}\}_{t \ge 0}$ is a strongly continuous contraction semigroup. By considering any sequence of functions $g_n \in C_0(E)$ with $0 \le g_n(x) \uparrow 1$ for all $x \in E$, and using that $\nu_B(x, \{\Delta\}) = 0$ for all $x \in E$, one sees that $B$ is $E$-conservative. Theorem 5.6 then yields that $L$ is $M_1(E)$-polynomial and its martingale problem has a solution with continuous paths for every initial condition $\nu \in M_1(E)$.
We now prove the opposite implication. Assume $L$ is $M_1(E)$-polynomial, its martingale problem is well-posed, and all solutions have continuous paths. Theorem 4.3 and Lemma 5.2 imply that $L$ satisfies (4.2), and then also the positive maximum principle on $M_1(E)$ due to Lemma D.1.
By Lemma C.1, $B$ satisfies the positive maximum principle on $E$, and Lemma C.2 thus shows that $B$ has the form in (5.5) for some nonnegative, finite kernel $\nu_B$ from $E^\Delta$ to $E^\Delta$. Additionally, $B$ is bounded, satisfies the positive maximum principle on $E^\Delta$, and is the generator of the strongly continuous contraction semigroup $\{e^{tB}\}_{t \ge 0}$. We must prove that $\nu_B(x, \{\Delta\}) = 0$ for all $x \in E$; this will allow us to view $\nu_B$ as a kernel from $E$ to $E$. Assume for contradiction that there is some $x \in E$ with $\nu_B(x, \{\Delta\}) > 0$. Let $Z$ be the Markov process associated to the semigroup $\{e^{tB}\}_{t \ge 0}$. Then, by approximating $1_{\{\cdot \in \Delta\}}$ by a sequence of bounded continuous functions $g_n$ and applying relation (5.4), we find for all $t \ge 0$. This contradicts the fact that $X_t$ is $M_1(E)$-valued and proves that $B$ is of the stated form.
The form of $Q$ will follow from Lemma C.3. To verify its hypotheses, note that by Lemma C.1 $\langle Q(g \otimes g), \nu^2\rangle \ge 0$. Next, fix some $g \in D$ and $\nu \in M_1(E)$ such that $g = 0$ on the support of $\nu$, and suppose that $\|g\| = 1$. For each $n \in \mathbb{N}$, define the polynomial where $F_n$ is as in Lemma B.1. Since $D = C_\Delta(E)$, we have $p_n \in P^D$. Moreover, since $F_n(z)\, z\, n \le 1$ for all $z \in [0, 1]$, we get and therefore $p_n \le 0$ on $M_1(E)$. Since $g = 0$ on the support of $\nu$, $p_n(\nu) = 0$. Applying the positive maximum principle and using the form (4.2) of $L$, as well as $\langle g, \nu\rangle = \langle |g|, \nu\rangle = 0$ and $F_n(0) = 1$, we obtain $0 \ge Lp_n(\nu) = -\tfrac{1}{n} \langle B(|g|), \nu\rangle + \langle Q(g \otimes g), \nu^2\rangle$ for all $n$, whence $\langle Q(g \otimes g), \nu^2\rangle \le 0$. By scaling, this actually holds for any $g \in D$ and $\nu \in M_1(E)$ such that $g = 0$ on the support of $\nu$. If $g$ equals some other constant $c \in \mathbb{R}$ on the support of $\nu$, we still get $\langle Q(g \otimes g), \nu^2\rangle = \langle Q((g - c) \otimes (g - c)), \nu^2\rangle \le 0$ using that $Q(g \otimes 1) = 0$ by Lemma C.1. Thus Lemma C.3(ii) holds, and we conclude that $Q = \alpha\Psi$ for some nonnegative symmetric function $\alpha : E^2 \to \mathbb{R}$. It remains to use that $\alpha\Psi(g) \in C_\Delta(E^2)$ to show that this function can be extended to a bounded continuous function on $(E^\Delta)^2 \setminus \{x = y\}$. Continuity is clear. To prove boundedness, suppose for contradiction that there is a sequence of pairs $(x_n, y_n) \in (E^\Delta)^2 \setminus \{x = y\}$ such that $\alpha(x_n, y_n) \to \infty$ as $n \to \infty$. Since we can assume without loss of generality that $\alpha(x_i, y_i) > 0$, $x_i \ne x_j$, $x_i \ne y_j$, and $y_i \ne y_j$ for all $i \ne j$, we can construct $g \in C_\Delta(E)$ such that $(g(x_n) - g(y_n))^4 = \alpha(x_n, y_n)^{-1}$.

C Auxiliary lemmas
Let E be a locally compact Polish space.
Lemma C.1. Let $D \subseteq C_\Delta(E)$ be a dense linear subspace containing the constant function 1, and let $L : P^D \to P$ be a linear operator satisfying (4.2) and the positive maximum principle on $M_1(E)$. Then $B$ satisfies the positive maximum principle on $E$, $B1 = 0$, $\langle Q(g \otimes g), \nu^2\rangle \ge 0$, and $Q(g \otimes 1) = 0$ for all $g \in D$ and $\nu \in M_1(E)$.
for all $x \in E$ and $g \in C(E^\Delta)$. In this case, $B$ is bounded and satisfies the positive maximum principle on $E^\Delta$, and $\{e^{tB}\}_{t \ge 0}$ is a strongly continuous contraction semigroup. Moreover, there is some nonnegative (finite) measure $\nu_B(\Delta, \cdot\,)$ such that (C.1) holds also for $x = \Delta$.
Proof. Assume there is a nonnegative, finite kernel $\nu_B$ from $E$ to $E^\Delta$ such that (C.1) holds for all $x \in E$ and $g \in C(E^\Delta)$. Then clearly $B1 = 0$. Suppose $g \in C(E^\Delta)$, $x \in E$, and $g(x) = \max_E g \ge 0$. Then $g(x) = \max_{E^\Delta} g$, so that $g(\xi) - g(x) \le 0$ for all $\xi \in E^\Delta$ and hence $Bg(x) \le 0$. Thus $B$ satisfies the positive maximum principle on $E$, which proves sufficiency.
To prove necessity, assume $B1 = 0$ and $B$ satisfies the positive maximum principle on $E$. By Lemmas 4.2.1 and 1.2.11 in [17], the restriction $B|_{C_0(E)}$ is dissipative, hence closable, and even closed since it is globally defined on $C_0(E)$. By the closed graph theorem $B|_{C_0(E)}$ is bounded, and then so is $B$ since $B1 = 0$. Pick any $g \in C(E^\Delta)$ with $g(\Delta) = \max_{E^\Delta} g \ge 0$. Then $g - g(\Delta) \le 0$, so there exist functions $h_n \in C_c(E)$ with $h_n \le 0$ and $h_n \to g - g(\Delta)$ uniformly. Then $Bh_n \to B(g - g(\Delta)) = Bg$ uniformly as well. Taking $x_n$ such that $h_n(x_n) = 0$ and $x_n \to \Delta$, we obtain $Bg(\Delta) = \lim_{n \to \infty} Bh_n(x_n) \le 0$. We have thus proved that $B$ is bounded and satisfies the positive maximum principle on $E^\Delta$. As a result, Lemma 4.2.1 and Theorem 1.7.1 in [17] yield that $\{e^{tB}\}_{t \ge 0}$ is a strongly continuous contraction semigroup.
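On a finite state space the objects in this lemma are matrices, which makes the three properties easy to check directly. The following is a minimal sketch under that assumption: the helper names are illustrative, and the matrix exponential is a plain truncated Taylor series that is adequate only for small matrices with moderate entries.

```python
# Finite-state illustration of a bounded jump operator
# (Bg)(x) = sum_y (g(y) - g(x)) * nu_B[x][y] and its semigroup exp(tB).

def jump_generator(nu_B):
    """Matrix of B: off-diagonal entries nu_B[x][y], rows summing to zero."""
    n = len(nu_B)
    B = [[float(nu_B[x][y]) if x != y else 0.0 for y in range(n)]
         for x in range(n)]
    for x in range(n):
        B[x][x] = -sum(B[x][y] for y in range(n) if y != x)
    return B

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(B, t, terms=60):
    """Truncated Taylor series for exp(t * B)."""
    n = len(B)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for m in range(1, terms):
        step = [[t * B[i][j] / m for j in range(n)] for i in range(n)]
        term = mat_mul(term, step)  # term is now (tB)^m / m!
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

def apply(M, g):
    """Apply the matrix M to the function g, given as a list of values."""
    return [sum(M[x][y] * g[y] for y in range(len(g))) for x in range(len(M))]
```

With nonnegative `nu_B`, `apply(jump_generator(nu_B), [1.0] * n)` vanishes (the analogue of $B1 = 0$), and `mat_exp(B, t)` has nonnegative entries with unit row sums for $t \ge 0$, so it does not increase the supremum norm of $g$, which is the contraction property in this finite setting.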
It remains to exhibit a kernel $\nu_B$ from $E^\Delta$ to $E^\Delta$ such that (C.1) holds for all $x \in E^\Delta$ and $g \in C(E^\Delta)$. To this end, fix $x \in E^\Delta$ and define $h \in C(E^\Delta)$ by $h(y) := d(x, y)$, where $d(\,\cdot\,, \cdot\,)$ is a compatible metric for the Polish space $E^\Delta$. Since $B$ satisfies the positive maximum principle on $E^\Delta$, the map $g \mapsto B(gh)(x)$ is a positive linear functional. By the Riesz-Markov representation theorem, there is a measure $\mu(x, \cdot\,) \in M_+(E^\Delta)$ such that $B(gh)(x) = \int_{E^\Delta} g(\xi)\, \mu(x, d\xi)$ for all $g \in C(E^\Delta)$. We define $\nu_B(x, d\xi) := 1_{E^\Delta \setminus \{x\}}(\xi)\, \frac{1}{h(\xi)}\, \mu(x, d\xi)$, which is permissible since $h(y) > 0$ for all $y \ne x$. For every $g \in C_c(E^\Delta \setminus \{x\})$ we have $g/h \in C(E^\Delta)$, and therefore $Bg(x) = B((g/h)h)(x) = \int_{E^\Delta} g(\xi)\, \nu_B(x, d\xi)$. Since $B$ is bounded, the identity $Bg(x) = \int_{E^\Delta} g(\xi)\, \nu_B(x, d\xi)$ extends by continuity to all $g \in C(E^\Delta)$ with $g(x) = 0$. Thus, using also that $B1 = 0$, $Bg(x) = B(g - g(x))(x) = \int_{E^\Delta} (g(\xi) - g(x))\, \nu_B(x, d\xi)$ for all $g \in C(E^\Delta)$. Repeating this for every $x \in E^\Delta$ yields that $\nu_B$ satisfies (C.1) for all $x \in E^\Delta$ and $g \in C(E^\Delta)$. To see that $\nu_B(x, E^\Delta) < \infty$, note that $\int_{E^\Delta} g(\xi)\, \nu_B(x, d\xi) \le \|B\|$ whenever $g \in C(E^\Delta)$ satisfies $0 \le g \le 1$ and $g(x) = 0$. Measurability of $\nu_B(\,\cdot\,, A)$ for every Borel set $A \subseteq E^\Delta$ follows from a monotone class argument, so that $\nu_B$ is indeed a kernel from $E^\Delta$ to $E^\Delta$.
(i) $Q(g \otimes g)(x, y) \ge 0$ for all $g \in D$, $x, y \in E$, with equality if $g(x) = g(y)$.
(ii) $\langle Q(g \otimes g), \nu^2\rangle \ge 0$ for all $g \in D$ and $\nu \in M_1(E)$, with equality if $g$ is constant on the support of $\nu$.
If either condition is satisfied, then $Q$ is of the form $Q = \alpha\Psi$ for some nonnegative symmetric function $\alpha : E^2 \to \mathbb{R}$.
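Proof. It is clear that (i) implies (ii). For the converse, first note that for any $x \in E$ and $g \in D$, trivially $g$ is constant on the support of $\delta_x$. Thus $Q(g \otimes g)(x, x) = \langle Q(g \otimes g), \delta_x^2\rangle = 0$. Taking $\nu = \tfrac{1}{2}(\delta_x + \delta_y)$ for any $x, y \in E$, and using that $Q(g \otimes g)$ is a symmetric function vanishing on the diagonal, then yields $Q(g \otimes g)(x, y) = 2 \langle Q(g \otimes g), \nu^2\rangle \ge 0$, with equality if $g(x) = g(y)$ since $g$ is then constant on the support of $\nu$. This proves that (ii) implies (i).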
It remains to obtain the stated form of $Q$ under the assumption that (i) holds. If $E$ is a singleton then $Q = 0$, so we may assume that $E$ contains at least two points. Fix $x, y \in E$ with $x \ne y$. Due to (i), the map $(g, h) \mapsto Q(g \otimes h)(x, y)$ is bilinear and positive semidefinite, and therefore satisfies the Cauchy-Schwarz inequality $|Q(g \otimes h)(x, y)|^2 \le Q(g \otimes g)(x, y)\, Q(h \otimes h)(x, y)$.
Along with (i) this implies that $Q(g \otimes h)(x, y)$ depends on $g$ and $h$ only through their values at $x$ and $y$. Moreover, since $D$ is dense in $C_\Delta(E)$, for every $a \in \mathbb{R}^2$ there exists $g \in D$ such that $a = (g(x), g(y))$. Thus there is a unique map $T : \mathbb{R}^2 \times \mathbb{R}^2 \to \mathbb{R}$ such that $Q(g \otimes h)(x, y) = T(a, b)$ where $a = (g(x), g(y))$, $b = (h(x), h(y))$.
Lemma D.1. If there exists a solution $X$ to the martingale problem for $L$ for each initial condition in $M_1(E)$, then $L$ satisfies the positive maximum principle on $M_1(E)$.
The proof of Lemma D.1 is standard and we thus omit it. See for instance the proof of Lemma 2.3 in [21].
The next lemma is an adaptation of a classical result from [17]. For the application of this result it is crucial that $L$ is an operator on the space of bounded continuous functions on a locally compact, separable, metrizable space. Since this is not the case for $M_1(E)$ if $E$ is noncompact, we work on $M_1(E^\Delta)$, which is a compact Polish space with respect to the topology of weak convergence. The result of [17] can then be applied and we just have to check that if the initial condition of an $M_1(E^\Delta)$-valued solution $X$ assigns mass 1 to $E$, then $X_t(E) = 1$ almost surely for each $t \ge 0$, so that the solution actually takes values in $M_1(E)$. For the second part, recall that by definition of $E$-conservativity there exist functions $g_n \in D \cap C_0(E)$ such that $\lim_{n \to \infty} g_n = 1$ and $\lim_{n \to \infty} (Bg_n)^- = 0$ bounded pointwise on $E$ and $E^\Delta$, respectively. By the dominated convergence theorem, (5.1), and Fatou's lemma we can compute $\mathbb{E}[X_t(E)] = \lim_{n \to \infty} \mathbb{E}[\langle g_n, X_t\rangle] = \lim_{n \to \infty} \big( \langle g_n, \nu\rangle + \mathbb{E}\big[\int_0^t \langle Bg_n, X_s\rangle\, ds\big] \big) \ge \nu(E) = 1$. Since $X_t(E) \le 1$, this gives $X_t(E) = 1$ almost surely.
Finally, note that a càdlàg process $X$ on $M_1(E^\Delta)$ such that $X_t(E) = 1$ almost surely is càdlàg also with respect to the topology of weak convergence on $M_1(E)$.