Martingale Inequalities and Deterministic Counterparts

We study martingale inequalities from an analytic point of view and show that a general martingale inequality can be reduced to a pair of deterministic inequalities in a small number of variables. More precisely, the optimal bound in the martingale inequality is determined by a fixed point of a simple nonlinear operator involving a concave envelope. Our results yield an explanation for certain inequalities that arise in mathematical finance in the context of robust hedging.


Introduction
Martingale inequalities are abundant in many areas of probability theory and analysis; see e.g. Burkholder's survey [11] for an extensive list of literature. We study general inequalities for discrete-time martingales from a bird's eye view and relate them to certain deterministic inequalities. Indeed, we shall see that every martingale inequality can be obtained as a consequence of two deterministic ones, and in fact that martingale inequalities are not very probabilistic in nature.
A simple example of a martingale inequality is Doob's maximal quadratic inequality, stating that the running maximum $M^*_T := \sup_{0 \le t \le T} |M_t|$ of any martingale $M$ satisfies $\|M^*_T\|_2 \le 2 \|M_T\|_2$, where $\|\cdot\|_2$ is the $L^2$-norm. We may cast this in the form $E[f(M_T, M^*_T)] \le 0$ for a suitable function $f$; namely, $f(x, y) = y^2 - 4|x|^2$. The general form of a martingale inequality that we shall consider is
$$E[f(Z_T)] \le a, \qquad (1.1)$$
where $a$ is a constant and $Z = (Z_t)_{t \in \mathbb{N}}$ is a suitable state process defined as a function of $M$; in the preceding example, $Z = (M, M^*)$. More precisely, let $X$ be a vector space (in which our martingales are taking values) and let $Z$ be a set, to be used as the state space. Then the state process $(Z_t)$, which takes values in this set, is determined by a function $\phi : Z \times X \to Z$ via $Z_{t+1} = \phi(Z_t, M_{t+1} - M_t)$ and some initial value $z_0$. Again in the example, $\phi(x, y, d) = (x + d, y \vee |x + d|)$ updates $M$ by adding the next increment and increases the running maximum if necessary. Given $f : Z \to \mathbb{R}$ and $\phi$, we may ask if there exists a finite constant $a$ such that (1.1) holds for all $T \in \mathbb{N}$ and all martingales with prescribed initial value, and what the optimal (minimal) value for $a$ is. A possible answer runs as follows. Consider the operator $A$ which acts on functions $g : Z \to \mathbb{R}$ by pre-composing with $\phi$ and taking the concave envelope at the origin in the variable corresponding to the martingale increment:
$$Ag(z) = g(\phi(z, \cdot))^\sharp(0), \quad z \in Z.$$
If $u$ is a fixed point of $A$ dominating $f$; that is, $Au = u$ and $u \ge f$, then $a = u(z_0)$ is an admissible constant in (1.1). Under the natural condition $\phi(z, 0) = z$, a simple monotonicity argument shows that $A$ has a minimal fixed point $u$ dominating $f$. This fixed point can be obtained from $f$ by iterating $A$ and passing to the limit, $u = A^\infty f := \lim_{n \to \infty} A^n f$, and we shall see that $a = u(z_0)$ is the optimal constant in (1.1). In this sense, we may say that a martingale inequality can be reduced to the two deterministic inequalities $u \ge Au$ and $u \ge f$.
(Here $u \ge Au$ is actually equivalent to $Au = u$.) In fact, we may note that $u$ defines a stronger martingale inequality altogether. Namely, as $u \ge f$, the inequality
$$E[u(Z_T)] \le u(z_0)$$
is stronger than the original inequality $E[f(Z_T)] \le u(z_0)$ with optimal constant, and we remark that the inequality $u \ge f$ is strict in most cases of interest. Returning to our example, we can check that the minimal fixed point is given by
$$u(x, y) = \begin{cases} y^2 - 4|x|^2 & \text{if } |x| < y/2, \\ 2y^2 - 4|x|y & \text{if } |x| \ge y/2, \end{cases}$$
and so the optimal constant corresponding to the initial value $z_0 = (x_0, |x_0|)$ is $a = -2|x_0|^2$, while the knowledge of $u$ actually yields a further strengthening of Doob's maximal inequality (Corollary 4.3).
There are of course very relevant martingale inequalities which hold only for some specific class of martingales; for instance, nonnegative martingales, martingales with increments bounded by one, etc. Many such inequalities can be fitted within our framework by choosing $Z$ appropriately and assigning the value $-\infty$ to the function $f$ on a suitable subset (see also Section 4.2).
All this has little to do with probability or measure theory; in fact, it seems that the latter is only needed to define the expectations. In order to clearly separate this aspect (and also to spare the reader some measurable selection arguments), we shall develop the theory for simple martingales (i.e., martingales taking finitely many values), so that all expectations are actually finite sums. In most cases of interest, $f$ and $\phi$ (and then also the fixed point $u$) have some continuity properties and the passage to general martingales can be done a posteriori by approximation. However, we also provide an alternative argument which is more in the spirit of this paper and applies even to functions that are merely measurable, under the slight restriction that $X$ be finite-dimensional. Namely, we devise a martingale version of Tchakaloff's theorem, stating that given a measurable (integrable) function $g : (\mathbb{R}^n)^T \to \mathbb{R}$ and an $n$-dimensional martingale $M$, we can find a simple martingale $N$ such that $E[g(N_1, \dots, N_T)] = E[g(M_1, \dots, M_T)]$, and moreover the (finite) support of the law of $N$ lies in the support of the law of $M$. Note that we have here an actual equality; no approximation is necessary.
The theory outlined in this paper can be seen as a general formulation of a strategy of proof that was used in several works of D. L. Burkholder for martingale inequalities where Z consists of X, its running maximum and its square function. Namely, he used a class of functions u, corresponding roughly to what we call fixed points, to find admissible or sharp constants in various martingale inequalities, and in fact it seems that he was aware of some of the structure presented here; see in particular Theorem 2.1 in [13] but also [10,11,12], among others. His works contain a condition (called "the rather mysterious condition (2.2)" in [13]) related specifically to the running maximum and which seemingly prevents a general formulation of the ideas; this condition has no analogue in the present paper.
A different stream of literature about martingale inequalities has emerged in mathematical finance, starting with Hobson [17]. In this context, the process $X$ takes values in $\mathbb{R}^n$ and represents the discounted prices of $n$ tradable securities, while $f(Z_T)$ is seen as an option maturing at the fixed time horizon $T$. The problem is to find a minimal constant $a$ and a predictable process $H$ (i.e., $H_t$ is a function of $X_0, \dots, X_{t-1}$) such that
$$f(Z_T) \le a + \sum_{t=0}^{T-1} \langle H_t, X_{t+1} - X_t \rangle, \qquad (1.2)$$
where the inner product $\langle H_t, X_{t+1} - X_t \rangle$ is interpreted as the gain or loss that occurs as the price $X_t$ changes to $X_{t+1}$ while $H_t$ units of the security are held. Thus, if $a$ is charged as the price of the option, the trading strategy $H$ makes it possible to hedge the risk of $f(Z_T)$ in a robust (model-free) way. We observe that by taking expectations on both sides, (1.2) implies the martingale inequality $E[f(Z_T)] \le a$. Along these lines, "pathwise" proofs for several martingale inequalities have been obtained. For these and related results in robust finance, see [1,2,4,5,7,14,16,21] among others; more references can be found in the surveys by Hobson [18] and Obłój [20]. In particular, a result of Bouchard and Nutz [6] implies that any martingale inequality in finite discrete time can be related to an inequality of the type (1.2). However, the machinery used there (to deal with a more general case) only yields a non-constructive existence result for $H$ and little insight into the nature of the inequality. We shall see that, in essence, $H$ is determined quite explicitly as the derivative of $u(\phi(z, \cdot))^\sharp$.
The remainder of this article is organized as follows. In Section 2 we consider martingale inequalities with a fixed time horizon $T$ and relate the optimal constant to certain concave envelopes by dynamic programming. Section 3 focuses on martingale inequalities that do not depend explicitly on the time horizon $T$; this further condition of time-homogeneity leads to the fixed point considerations mentioned above. The connection to mathematical finance is also discussed here. In Section 4, we illustrate the theory by two simple examples, Doob's maximal $L^p$-inequality and Burkholder's inequality for differentially subordinate martingales. Section 5 concludes with the martingale version of Tchakaloff's theorem.

Martingale Inequalities and Concave Envelopes
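It will be convenient to work with functions taking values in the extended real line $\overline{\mathbb{R}} = [-\infty, \infty]$. A fixed convention for the expression $\infty - \infty$ is used throughout; in particular, in the definitions of concavity and integrals. Let $X$ be a real vector space. Given a function $g : X \to \overline{\mathbb{R}}$, we define its concave envelope $g^\sharp : X \to \overline{\mathbb{R}}$ as the smallest concave function dominating $g$, or
$$g^\sharp(x) = \inf\{\psi(x) \mid \psi : X \to \overline{\mathbb{R}} \text{ is concave and } \psi \ge g\}, \quad x \in X.$$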
We shall need to take consecutive envelopes over several variables. Given an integer $t \ge 0$ and $g : X^{t+2} \to (-\infty, \infty]$, we first introduce the function
$$g^{\sharp_t}(x_0, \dots, x_t) := g(x_0, \dots, x_t, \cdot)^\sharp(x_t);$$
in other words, we pass to the concave envelope in the ultimate variable and evaluate the resulting function at the penultimate variable. For an integer $T \ge 0$, we can then define the composition
$$\sharp(T) := \sharp_0 \circ \cdots \circ \sharp_{T-1},$$
which maps functions of $T + 1$ variables into functions of one variable. Our first aim is to identify, for a fixed time horizon $T$, the optimal constant for a martingale inequality defined by $f : X^{T+1} \to \overline{\mathbb{R}}$ in terms of the consecutive envelope $f^{\sharp(T)}$. Given [...] will be more convenient in the sequel. We emphasize that the integral under $\mu \in M_T(x_0)$ is a finite sum and therefore does not require any measurability conditions, and moreover that, according to [...] Or, to state the same in different words: proving that an inequality [...]
As a first step towards the proof, we consider the case $T = 1$. Noting that $M(x)$ is simply the set of all probability measures on $X$ having finite support and barycenter $x \in X$, the following identity is essentially classical (see Kemperman [19]); we state the details for the sake of completeness.
Lemma 2.2. Let $g : X \to \overline{\mathbb{R}}$. Then $\sup_{\mu \in M(x)} \mu[g] = g^\sharp(x)$ for all $x \in X$.
Proof. Let $x \in X$ and $\mu \in M(x)$; then $\mu$ is a convex combination of Dirac [...] To see the converse inequality, let $x_1, x_2 \in X$ and $\lambda \in (0, 1)$. Using the fact that $\lambda\mu^\varepsilon$ [...] showing that [...]
The extension to the case of a general horizon $T$ can be understood as a dynamic programming argument where the martingale laws play the role of the controls in a stochastic control problem. Given $g : X^{t+2} \to \overline{\mathbb{R}}$, we therefore introduce the (value) function
$$E_t g(x_0, \dots, x_t) := \sup_{\mu \in M(x_t)} \mu[g(x_0, \dots, x_t, \cdot)]$$
as well as the composition [...] which maps functions of $T + 1$ variables into functions of $t + 1$ variables. Then [...]
Proof. We first suppose that $f$ is bounded from above. To see the inequality [...] We may see $\mu_t$ as a stochastic kernel on $X^{t+1}$ equipped with the discrete $\sigma$-field. Recalling that we are only using measures with finite support, we may form the product measure $\mu^\varepsilon := (\mu_0 \otimes \cdots \otimes \mu_{T-1})(x_0)$, which is an element of $M_T(x_0)$ by Fubini's theorem. We then have [...] As $\varepsilon > 0$ was arbitrary, this yields the claimed inequality. To see the converse inequality "$\ge$", fix $x_0 \in X$ and note that any $\mu \in M_T(x_0)$ can be decomposed into the product $\mu = \mu_0 \otimes \mu_1 \otimes \cdots \otimes \mu_{T-1}$ of a measure $\mu_0 \in M(x_0)$ and kernels $\mu_t$ on $X^t$ such that $\mu_t(x_1, \dots, x_t) \in M(x_t)$ for all $x_1, \dots, x_t \in X$. By the definition of the operators $E_t$, we then have [...] and the claim follows as $\mu \in M_T(x_0)$ was arbitrary.
Finally, for the case of a general function f , we observe that both sides of (2.3) are continuous along increasing sequences (f n ) of R-valued functions having the property that {f n = −∞} = {f n+1 = −∞}, n ≥ 1. Thus, we may apply the above to f ∧ n and pass to the limit as n → ∞.
Proof of Proposition 2.1. Since Lemma 2.2 shows that ♯ t = E t , Proposition 2.1 is a direct consequence of Lemma 2.3.
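The identity of Lemma 2.2, $\sup_{\mu \in M(x)} \mu[g] = g^\sharp(x)$, can be checked numerically in a toy setting. The following Python sketch is not part of the argument above; it assumes that $X = \mathbb{R}$ is replaced by a finite grid, and the test function as well as the helper names `concave_envelope` and `sup_over_two_point_laws` are hypothetical choices made for illustration. The envelope is computed as an upper convex hull and compared with a brute-force supremum over laws carried by at most two grid points with barycenter $x$ (in dimension one, such laws suffice).

```python
import numpy as np

def concave_envelope(xs, vals):
    """Smallest concave function dominating vals on the increasing grid xs."""
    hull = []  # indices of the vertices of the upper convex hull
    for i in range(len(xs)):
        while len(hull) >= 2:
            a, b = hull[-2], hull[-1]
            # Drop b if it lies on or below the chord from a to i.
            if (vals[b] - vals[a]) * (xs[i] - xs[a]) <= (vals[i] - vals[a]) * (xs[b] - xs[a]):
                hull.pop()
            else:
                break
        hull.append(i)
    return np.interp(xs, xs[hull], vals[hull])

def sup_over_two_point_laws(xs, vals, k):
    """Supremum of mu[g] over laws on at most two grid points with barycenter xs[k]."""
    best = vals[k]  # Dirac mass at xs[k]
    for i in range(k):
        for j in range(k + 1, len(xs)):
            lam = (xs[j] - xs[k]) / (xs[j] - xs[i])  # weight placed on xs[i]
            best = max(best, lam * vals[i] + (1 - lam) * vals[j])
    return best

xs = np.linspace(-2.0, 2.0, 41)
g = np.cos(3.0 * xs) - 0.3 * xs**2   # an arbitrary non-concave test function
envelope = concave_envelope(xs, g)
brute = np.array([sup_over_two_point_laws(xs, g, k) for k in range(len(xs))])
print(np.max(np.abs(envelope - brute)))  # expected to be at the level of rounding error
```

The hull-based routine is only a one-dimensional convenience; in higher dimensions the envelope on a finite point set would require a linear program or a convex hull computation instead.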

Time-Homogeneous Martingale Inequalities
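Let $x_0 \in X$, set $X_0 = x_0$ and let $(X_t)_{t = 1, 2, \dots}$ be the coordinate-mapping process on $X \times X \times \cdots$. Moreover, let $Z$ be a nonempty set and fix a function $\phi : Z \times X \to Z$ and an initial value $z_0 \in Z$; as in the Introduction, the state process is given by $Z_0 = z_0$ and $Z_{t+1} = \phi(Z_t, X_{t+1} - X_t)$. We write $\overline{\mathbb{R}}^Z$ for the set of all functions $Z \to \overline{\mathbb{R}}$, equipped with the pointwise partial order and convergence, and define the operator $A : \overline{\mathbb{R}}^Z \to \overline{\mathbb{R}}^Z$ by
$$Ag(z) := g(\phi(z, \cdot))^\sharp(0), \quad z \in Z.$$
Moreover, we write $A^T$ for the $T$-fold composition $A \circ \cdots \circ A$. Using this notation, Proposition 2.1 can be rephrased as follows.
Lemma 3.1. Let $f : Z \to \overline{\mathbb{R}}$. Then
$$\sup_{\mu \in M_T(x_0)} \mu[f(Z_T)] = A^T f(z_0).$$
This lemma may look less general than Proposition 2.1, which allows for a general dependence on the path of $X$, but let us mention that with the choice $Z = \mathbb{N} \times X^{\mathbb{N}}$ we can arrange things so that $Z_t = (t, X_0, X_1, \dots, X_t, 0, 0, \dots)$.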
From now on, we focus on martingale inequalities which hold for any time horizon $T$. The structural condition
$$\phi(z, 0) = z, \quad z \in Z \qquad (3.1)$$
seems to be natural in that setting and we make this a standing assumption.
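The operator $A$ then has the following monotonicity properties.
Lemma 3.2. Let $g, h \in \overline{\mathbb{R}}^Z$. Then $Ag \ge g$, and $g \le h$ implies $Ag \le Ah$.
Proof. In view of (3.1), we have $Ag(z) = g(\phi(z, \cdot))^\sharp(0) \ge g(\phi(z, 0)) = g(z)$. The second property follows from the monotonicity of $\sharp$.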
[...] is the optimal horizon-independent constant for the martingale inequality determined by $f$, $\phi$ and $z_0$. In fact, Lemma 3.1 naturally extends to
$$\sup_{\mu \in M_\infty(x_0)} \mu[f(Z_\infty)] = A^\infty f(z_0)$$
if we denote by $M_\infty(x_0)$ the set of all laws of $X$-valued simple martingales $(M_t)_{t \in \mathbb{N}}$ satisfying $M_0 = x_0$. Note that any such martingale is eventually constant, so that $Z_\infty := \lim_n Z_n$ is well-defined $\mu$-a.s. for all $\mu \in M_\infty(x_0)$.
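In particular, the limit $A^\infty f(z) := \lim_{n \to \infty} A^n f(z) \in \overline{\mathbb{R}}$ exists for all $z \in Z$.
Next, let us observe that if $(g_n)_{n \ge 1} \subseteq \overline{\mathbb{R}}^Z$ is a nondecreasing sequence, then
$$\lim_n g_n^\sharp = (\lim_n g_n)^\sharp.$$
Indeed, both limits are increasing and thus well-defined, and the monotonicity of $\sharp$ immediately implies that $\lim_n g_n^\sharp \le (\lim_n g_n)^\sharp$. Conversely, $\lim_n g_n^\sharp$ is concave as the pointwise limit of concave functions and dominates $\lim_n g_n$, so that $\lim_n g_n^\sharp \ge (\lim_n g_n)^\sharp$. Using this continuity property of $\sharp$ and the fact that $A^n f$ is nondecreasing in $n$, we see that
$$A(A^\infty f)(z) = \big(\lim_n A^n f(\phi(z, \cdot))\big)^\sharp(0) = \lim_n \big(A^n f(\phi(z, \cdot))\big)^\sharp(0) = \lim_n A^{n+1} f(z) = A^\infty f(z)$$
for all $z \in Z$; that is, $A^\infty f$ is a fixed point. If $g \in \overline{\mathbb{R}}^Z$ is another fixed point of $A$ such that $g \ge f$, then the monotonicity of $A$ from Lemma 3.2 yields that $g = A^n g \ge A^n f$ for all $n \ge 0$, and hence $g \ge A^\infty f$ by passing to the limit.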
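The iteration $A^n f \uparrow A^\infty f$ can be carried out explicitly on a small grid. The Python sketch below is an illustration under strong assumptions, not the construction used in the text: it takes Doob's quadratic example $f(x, y) = y^2 - 4x^2$ with $\phi(x, y, d) = (x + d, y \vee |x + d|)$ from the Introduction and confines the martingales to integer values in $[-N, N]$, where `N`, `apply_A` and `u_exact` are hypothetical names. Because the class of martingales is smaller, the resulting fixed point is only bounded above by the closed-form $u$ quoted in the Introduction; it need not coincide with it.

```python
import numpy as np

N = 6                                    # grid half-width (illustrative choice)
xs = np.arange(-N, N + 1, dtype=float)   # admissible martingale values

def concave_envelope(xs, vals):
    """Smallest concave function dominating vals on the increasing grid xs."""
    hull = []
    for i in range(len(xs)):
        while len(hull) >= 2:
            a, b = hull[-2], hull[-1]
            if (vals[b] - vals[a]) * (xs[i] - xs[a]) <= (vals[i] - vals[a]) * (xs[b] - xs[a]):
                hull.pop()
            else:
                break
        hull.append(i)
    return np.interp(xs, xs[hull], vals[hull])

def f(x, y):
    return float(y**2 - 4 * x**2)

def apply_A(g):
    """One application of A; g maps states (x, y) with |x| <= y <= N to values."""
    out = {}
    for (x, y) in g:
        # h(r) = g(phi((x, y), r - x)): move to r and update the running maximum.
        h = np.array([g[(r, max(y, abs(r)))] for r in range(-N, N + 1)])
        # Envelope in the increment variable, evaluated at increment 0 (i.e. at r = x).
        out[(x, y)] = float(concave_envelope(xs, h)[x + N])
    return out

def u_exact(x, y):
    """Closed form from the Introduction (martingales not confined to the grid)."""
    return y**2 - 4 * x**2 if abs(x) < y / 2 else 2 * y**2 - 4 * abs(x) * y

u = {(x, y): f(x, y) for y in range(N + 1) for x in range(-y, y + 1)}
for n in range(1, 200):
    new = apply_A(u)
    change = max(abs(new[z] - u[z]) for z in u)
    u = new
    if change < 1e-9:   # numerically a fixed point: A u = u
        break
print(n, u[(0, 0)], u[(1, 1)], u_exact(1, 1))
```

Storing the states in a dictionary keeps the code close to the definition $Ag(z) = g(\phi(z, \cdot))^\sharp(0)$; a vectorized array layout would be faster but obscures the correspondence.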
Remark 3.5. Let $u : Z \to \overline{\mathbb{R}}$ be any function such that $f \le u$ and $Au \le u$ (hence $Au = u$; cf. Lemma 3.2). Then $A^\infty f(z_0) \le u(z_0)$; that is, to prove that the martingale inequality holds with right-hand side $a$, it suffices to exhibit a fixed point $u$ of $A$ which dominates $f$ and satisfies $u(z_0) \le a$. As mentioned in the Introduction, this corresponds to a general formulation of the strategy of proof that has been used by Burkholder for several specific martingale inequalities. For the above conclusion, it is not necessary to establish that $u$ is the minimal fixed point; however, this property guarantees that $u(z_0)$ is the optimal right-hand side.
To find an explicit formula for A ∞ f (or any other fixed point), it is often useful to study properties of f that are preserved by A. We give a simple example to illustrate this point (see also Section 4.1).
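Remark 3.6. Suppose that $Z$ is a cone and that $\phi$ is positively homogeneous of degree one. If $f \in \overline{\mathbb{R}}^Z$ is positively homogeneous of degree $p > 0$, then so are $Af$ and $A^\infty f$. Indeed, let $\lambda > 0$; then
$$Af(\lambda z) = \inf\{\psi(0) : \psi \ge f(\phi(\lambda z, \cdot))\} = \inf\{\psi(0) : \psi(\lambda\,\cdot) \ge \lambda^p f(\phi(z, \cdot))\} = \lambda^p Af(z),$$
where the infima are taken over all concave functions $\psi : X \to \overline{\mathbb{R}}$. The homogeneity of $A^\infty f$ follows.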
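Next, we would like to explain a connection to certain inequalities which have arisen in mathematical finance; from our abstract point of view, we shall see that the latter are simply manifestations of the concavity that is imposed by $A$. For the purpose of the subsequent discussion, we assume that we are given a dual pair $X, X'$ with a separating pairing $\langle \cdot, \cdot \rangle$. Given a concave function $h : X \to \overline{\mathbb{R}}$ and $d_0 \in X$, the supergradient $\partial h(d_0)$ is the set of all $\xi \in X'$ such that $h(d) \le h(d_0) + \langle \xi, d - d_0 \rangle$ for all $d \in X$, and $h$ is called superdifferentiable at $d_0$ if this set is nonempty.
Lemma 3.7. [...]
(i) For all $z \in Z$ there exists $\xi(z) \in X'$ such that
$$g(\phi(z, d)) \le g(z) + \langle \xi(z), d \rangle, \quad d \in X. \qquad (3.2)$$
(ii) $Ag = g$.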
We mention that Lemma 3.7 can serve as a tool to verify that g is a fixed point: in examples, it is sometimes easier to verify a relation like (3.2) which does not involve the concave envelope (e.g. [5]).
Remark 3.8. In the context of mathematical finance, the $\mathbb{R}^n$-valued process $X$ represents the discounted prices of $n$ tradable securities, while $f(Z_T)$ is seen as an option maturing at time $T$. Inequality (3.2) with $g = u = A^\infty f$ expresses that the trading strategy $H_t := \xi(Z_{t-1})$ yields a superhedge for the seller of the option if $u(z_0)$ is charged as its price:
$$u(z_0) + \sum_{t=1}^{T} \langle H_t, X_t - X_{t-1} \rangle \ge f(Z_T), \qquad (3.3)$$
where the left-hand side is the balance obtained from the amount $u(z_0)$ and the gains/losses from trading according to $H$. A similar observation applies if the time horizon $T$ is seen as fixed (which is more natural in finance); namely, $H_t \in \partial[A^{T-t} f(\phi(Z_{t-1}, \cdot))^\sharp](0)$ yields a process such that
$$A^T f(z_0) + \sum_{t=1}^{T} \langle H_t, X_t - X_{t-1} \rangle \ge f(Z_T). \qquad (3.4)$$
In particular, this gives a simple and constructive proof for the result of [6] mentioned in the Introduction (note that an element of the supergradient can be chosen simply by taking a directional derivative). By its definition, $A^T f(z_0)$ is the minimal constant allowing for an inequality of the form (3.4) to hold almost surely under all martingale laws and hence in all viable models, so that $A^T f(z_0)$ is called the robust (or model-independent) superhedging price.
To enlarge a bit further on the financial aspect, suppose that $Z \subseteq X \times Y$ for some set $Y$ and that $\phi(x, y, d) = \varphi(x + d, y)$ for some function $\varphi : X \times Y \to Z$, where we now write $(x, y)$ instead of $z$ (see also Section 4.1 below). If $u = A^\infty f$, then $u(\cdot, y)$ is concave because $u(x, y)$ is the concave envelope of $u(\varphi(\cdot, y))$ evaluated at $x$, and moreover $\partial_x u(x, y) = \partial u(\phi(x, y, \cdot))^\sharp(0)$.
In other words, the hedging strategy is given by ξ(x, y) = ∂ x u(x, y), which corresponds to the option's Delta in the language of finance.
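The superhedging interpretation can be tested path by path. The following Python sketch is an illustration only: it assumes the scalar case $X = \mathbb{R}$, uses the closed-form $u$ for Doob's quadratic inequality quoted in the Introduction, and takes $\xi(x, y) = \partial_x u(x, y)$ as the position; the function names and the simulated Gaussian paths are hypothetical choices. Since the inequality is pathwise, the paths do not need to be drawn from a martingale law.

```python
import numpy as np

def f(x, y):
    return y**2 - 4.0 * x**2

def u(x, y):
    """Minimal fixed point for Doob's quadratic inequality, as quoted in the Introduction."""
    return y**2 - 4.0 * x**2 if abs(x) < y / 2 else 2.0 * y**2 - 4.0 * abs(x) * y

def xi(x, y):
    """Derivative of u in x, used as the hedging position (the option's Delta)."""
    return -8.0 * x if abs(x) < y / 2 else -4.0 * y * np.sign(x)

def phi(x, y, d):
    """State update: add the increment and raise the running maximum if necessary."""
    return x + d, max(y, abs(x + d))

def hedging_gap(path):
    """u(z_0) plus trading gains minus f(Z_T) along one path."""
    x, y = path[0], abs(path[0])      # initial state z_0 = (x_0, |x_0|)
    balance = u(x, y)
    for nxt in path[1:]:
        d = nxt - x
        balance += xi(x, y) * d       # position is fixed before the increment is revealed
        x, y = phi(x, y, d)
    return balance - f(x, y)

rng = np.random.default_rng(0)
paths = np.cumsum(rng.normal(size=(1000, 30)), axis=1)
gaps = np.array([hedging_gap(p) for p in paths])
print(gaps.min())   # expected to be nonnegative up to rounding error
```

A negative gap beyond rounding error would indicate that the candidate $u$ is not a fixed point of $A$ or that $\xi$ is not a supergradient, so the same loop can serve as a quick sanity check for other candidate pairs $(u, \xi)$.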
Certain classical martingale inequalities hold also for submartingales. This can be related to the above as follows (the submartingale property is understood componentwise in the multivariate case).
Remark 3.9. Let $X = \mathbb{R}^n$ and $g : X \to \overline{\mathbb{R}}$; then by Lemma 2.2, we have $\sup_{\mu \in M(x)} \mu[g] = g^\sharp(x)$. Now let $M_*(x)$ be the set of all probability measures on $X$ having finite support and barycenter $x_* \ge x$. If the function $g$ is (componentwise) nonincreasing, we also have
$$\sup_{\mu_* \in M_*(x)} \mu_*[g] = g^\sharp(x).$$
Indeed, for each $\mu_* \in M_*(x)$ there is $\mu \in M(x)$ such that $\mu_*[g] \le \mu[g]$. As a consequence, the martingale inequality corresponding to $f$ and $\phi$ extends to submartingales under the condition that [...] Some martingale inequalities extend only to, e.g., nonnegative submartingales. Such a case can be covered by choosing a suitable state space $Z$, as in Section 4.2 below.
We conclude this section with a brief remark about measurability questions (which we have avoided wherever possible).
Remark 3.10. Suppose that X = R n and Z is, say, a Polish space, and that φ is Borel-measurable. If f is Borel-measurable, one can check that Af and A ∞ f are upper-semianalytic and in particular universally measurable; however, it can happen that Af is not Borel-measurable. As a consequence, the hedging strategy in Remark 3.8 can also be chosen to be universally measurable.

Doob's Maximal Inequality
The aim of this subsection is to illustrate the above abstract theory by a ramification of Doob's maximal L p -inequality; in this case, all quantities of interest can be computed explicitly. In what follows, X is a vector space with norm | · |.
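Then the minimal fixed point of $A$ dominating $f$ is given by
$$u(x, y) = \begin{cases} f(x, y) & \text{if } |x| < \frac{p-1}{p}\, y, \\ \tilde u(x, y) & \text{if } |x| \ge \frac{p-1}{p}\, y, \end{cases} \qquad (4.1)$$
where $\tilde u(x, y) := p\,y^p - \frac{p^2}{p-1}\, |x|\, y^{p-1}$, $(x, y) \in Z$.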
Remark 4.2. The proof below also shows that the constant $\big(\frac{p}{p-1}\big)^p$ in the definition of $f$ is optimal. Namely, if [...]
Setting $|M|^*_T = \max_{0 \le t \le T} |M_t|$ and applying the results of the previous section, we immediately deduce the following ramification of Doob's maximal $L^p$-inequality.
[...] and the right-hand side is optimal. In particular, for the case $y = |x|$, we have
$$E\big[(|M|^*_T)^p\big] - \Big(\frac{p}{p-1}\Big)^p E\big[|M_T|^p\big] \le -\frac{p}{p-1}\, |x|^p$$
and thus
$$\big\| |M|^*_T \big\|_p \le \frac{p}{p-1}\, \|M_T\|_p. \qquad (4.3)$$
We mention that the function $\tilde u$ also appears in a proof of (4.3) in [11]. The optimality of the constant was not studied there; incidentally, we see that $\tilde u(x, y)$ actually yields the optimal constant for initial conditions with $y = |x|$. The function $\tilde u$ can also be extracted (with some additional work) from Cox [15], who considers the finite-horizon version of Doob's inequality in the case $X = \mathbb{R}$.
[...] namely, we have $u(0, 0) = 0$ and $u(x, y) = y^p \varrho(|x|/y)$ for all $(x, y) \in Z$ with $y > 0$. On the other hand, we know that $u$ is a fixed point of $A$, so that $x \mapsto u(x, y)$ is concave. In particular, using $u \ge f$ and the scaling property, we see that $u(x, y) = \infty$ at one point $(x, y)$ if and only if $u \equiv \infty$ on $Z$. For the time being, let us suppose that we are in the case where $u$ is finite. Under this condition, it follows from (4.5) and the scaling property, or also directly from (4.4), that $x \mapsto u(x, y)$ is continuous. Thus, $\varrho$ is a continuous concave function on $[0, 1]$, and it follows from (4.5) that its (left) tangent $t$ at the boundary point $r = 1$ satisfies
$$t(r) \ge r^p \varrho(1), \quad r \in [1, \infty); \qquad (4.6)$$
note that $r^p \varrho(1) = u(x_r, |x_r|)$ if $x_r \in X$ is any point with $|x_r| = r$ (we may assume that $X \ne \{0\}$). For later use, we remark that the converse is also true: a continuous concave function $\bar\varrho$ on $[0, 1]$ satisfying the analogue of (4.6) determines a fixed point $\bar u$ of $A$. Let us establish that
$$\varrho(0) \ge 1 \quad \text{and} \quad \varrho(1) < 0. \qquad (4.7)$$
Indeed, $\varrho(0) = u(0, 1) \ge f(0, 1) = 1$. Moreover, if $\varrho(1)$ were nonnegative, then $p > 1$ and (4.6) would imply that the tangent $t$ has nonnegative slope, thus $\varrho(1) = t(1) \ge t(0) \ge \varrho(0) \ge 1$. But then (4.6) states that the affine function $t(r)$ dominates $r^p$ on $[1, \infty)$, which is impossible. As a result, $r \mapsto r^p \varrho(1)$ is concave and we see that the tangent condition (4.6) can be stated equivalently in differential terms.
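Namely, if $\varrho'(1)$ denotes the slope of $t$, (4.6) is equivalent to
$$\varrho'(1) \ge p\, \varrho(1). \qquad (4.8)$$
In view of (4.7), the tangent $t$ has a unique zero $r_1$ in $[0, 1]$. Using the Intercept Theorem, (4.8) implies that $1 - r_1 = \varrho(1)/\varrho'(1) \ge 1/p$ and hence
$$r_1 \le 1 - 1/p. \qquad (4.9)$$
Next, we construct another fixed point of $A$ for comparison. Let $\bar t$ be the (uniquely determined) affine function which is parallel to $t$ and touches the graph of $r \mapsto f(x_r, 1)$. We denote by $(r_2, f(x_{r_2}, 1))$ the coordinates of this touching point. Set
$$\bar\varrho(r) := \begin{cases} f(x_r, 1) & \text{if } 0 \le r \le r_2, \\ \bar t(r) & \text{if } r_2 < r \le 1. \end{cases} \qquad (4.10)$$
By definition, $\bar\varrho$ is a continuous concave function satisfying (4.8). As remarked above, this implies that $\bar\varrho$ defines a fixed point $\bar u$ of $A$ via $\bar u(0, 0) := 0$ and $\bar u(x, y) := y^p \bar\varrho(|x|/y)$ for $(x, y) \in Z$ with $y > 0$.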
The fact that $f \le u$ and the construction of $\bar u$ imply that $\bar u \le u$. On the other hand, we have $\bar u \ge f$ and $u$ is the minimal fixed point of $A$ above $f$, so $u \le \bar u$. As a result, $\bar u = u$, $\bar\varrho = \varrho$ and $\bar t = t$. In particular, this establishes that $\varrho$ is of the specific form (4.10); it remains to determine the tangent $t$ explicitly.
Consider $r_0 := (1/c)^{1/p}$, the zero of $r \mapsto 1 - c r^p = f(x_r, 1)$. By concavity, we must have $r_0 \le r_1$; recall that $r_1$ is the zero of the tangent. In view of (4.9), we conclude that
$$r_0 \le r_1 \le 1 - 1/p; \qquad (4.11)$$
hence, our assumption that $u$ is finite is contradicted whenever $c < \big(\frac{p}{p-1}\big)^p$. Suppose that $c = \big(\frac{p}{p-1}\big)^p$. Then $r_0 = 1 - 1/p$ and so (4.11) implies that $r_1 = r_0$ and hence that
$$t(r) = p - \frac{p^2}{p-1}\, r.$$
In view of (4.10), this corresponds to the claimed formula (4.1). Since we have seen that this form of $u$ defines a fixed point dominating $f$, we are necessarily in the case where $A^\infty f$ is finite; moreover, as $f$ is decreasing with respect to $c$, $A^\infty f$ is then also finite for all $c \ge \big(\frac{p}{p-1}\big)^p$.

Differentially Subordinate Martingales
The main purpose of this subsection is to illustrate how one can accommodate a martingale inequality which holds only for a specific class of martingales. To this end, we shall treat an inequality for differentially subordinate martingales, first derived by Burkholder for real-valued processes in [9] and extended to the Hilbert-valued case in [10]. A martingale $N$ is differentially subordinate to a martingale $M$ if $|N_t - N_{t-1}| \le |M_t - M_{t-1}|$ for all $t \ge 1$. In other words, this says that the increments of the bivariate martingale $(M, N)$ take values in the cone $\{(d_1, d_2) : |d_2| \le |d_1|\}$, and this is the condition defining the class of (bivariate) martingales for which the inequality will hold. Let $H$ be a Hilbert space. In what follows, our basic vector space is $X := H \times H$ and our state space is $Z = X \cup \{\Delta\}$; the additional point $\Delta$ will be used as a cemetery state for paths that violate the subordination condition.
Then the minimal fixed point of $A$ dominating $f$ is given by [...] and by the same identity with $\tilde u$ and $f$ interchanged if $2 \le p < \infty$.
Proof of Proposition 4.4. All relevant properties are contained in [10]; we merely translate them into our setup. Indeed, we have $f(\Delta) = u(\Delta)$ by definition, and it is checked below Equation (1.10) in [10] that $f(z) \le \tilde u(z)$ for $z \in X$. Hence, $f \le u$. Moreover, according to Remark 1.2 in [10], $u$ is the smallest function which dominates $f$ on $X$ and has the property that $r \mapsto u(z + rd)$ is concave for all $z \in X$ and all $d = (d_1, d_2) \in X$ such that $|d_2| \le |d_1|$. Using our notation and recalling that $u(\phi(z, \cdot)) = -\infty$ outside the set $\{|d_2| \le |d_1|\}$, it follows that $u$ is the smallest function dominating $f$ on $Z$ such that $u(\phi(z, \cdot))$ is concave on $X$. The latter property implies that $Au(z) = u(\phi(z, \cdot))^\sharp(0) = u(\phi(z, 0)) = u(z)$, so $u$ is a fixed point of $A$. Conversely, if $g : Z \to \overline{\mathbb{R}}$ is any fixed point of $A$, then $g(\phi(z, d)) = g(z + d + \cdot)^\sharp(0)$ and hence $g(\phi(z, \cdot))$ is concave. As a result, $u$ is the smallest fixed point of $A$ dominating $f$.

Tchakaloff 's Theorem for Martingales
In the preceding sections, we have restricted our attention to simple martingales and we still have to argue that this entails no essential loss of generality. On the one hand, let us mention again that for nice functions f and φ, the extension from simple to general martingales can be done by direct approximation arguments; see, e.g., the proofs of Lemma 2.2 in [13] or Theorem 2.2 in [8]. On the other hand, we have developed the theory without regularity conditions and so we would like to see that the extension can be achieved under the natural requirement necessary to define the expectations; namely, the measurability alone. This will be achieved by a martingale version of Tchakaloff's theorem.
Following Bayer and Teichmann [3], a general version of Tchakaloff's classical theorem [22] about the existence of cubature formulas can be stated as follows: given an integrable function $f$ on a probability space $(\Omega, \mathcal{F}, \mu)$, there exists a probability measure $\nu$ with finite support such that $\nu[f] = \mu[f]$, and moreover that support can be chosen to lie in the support of $\mu$. The function $f$ may be multivariate, which allows one to incorporate a finite number of linear constraints on $\nu$; for instance, that $\nu$ should have the same first moment as $\mu$. Our aim is to provide a version of the theorem where $\mu$ and $\nu$ are martingale laws. This extension is not immediate because the martingale property corresponds to an infinite number of constraints.
Theorem 5.1. Let $k, n, T \in \mathbb{N}$ and $X = \mathbb{R}^n$. Let $x_0 \in X$ and let $\mu$ be the law of an $X$-valued martingale $M_0, \dots, M_T$ with $M_0 = x_0$, and let $A \subseteq X^{T+1}$ be a ($\mu$-measurable) set such that $\mu(A) = 1$. Moreover, let $f : X^{T+1} \to \mathbb{R}^k$ be a $\mu$-measurable function such that $\mu[|f|] < \infty$. There exists a martingale law $\nu$, still starting at $x_0$, such that
$$\#\operatorname{supp} \nu \le (n + k + 1)^T, \quad \operatorname{supp} \nu \subseteq A \quad \text{and} \quad \nu[f] = \mu[f].$$
Proof. By changing $f$ on a $\mu$-nullset and replacing $A$ with a smaller set of full $\mu$-measure, we may assume that $f$ and $A$ are Borel. The case $T = 1$ is now a consequence of Tchakaloff's theorem in the form of [3, Corollary 2] applied to the function $\phi : X \to \mathbb{R}^{n+k+1}$ given by $\phi(x) = (f(x), x, 1)$. Hence, we assume that the theorem holds for some $T \in \mathbb{N}$ and show how to pass to $T + 1$. So let $\mu$ be a martingale law on $X^{T+1}$ and let $A \subseteq X^{T+1}$ satisfy $\mu(A) = 1$. Let $\mu_0$ be the marginal of $\mu$ on $X^T$, given by $\mu_0(B) := \mu(B \times X)$ for $B \in \mathcal{B}(X^T)$, and let $\mu_1$ be a Borel-measurable stochastic kernel from $X^T$ to $(X, \mathcal{B}(X))$ such that
$$\mu = \mu_0 \otimes \mu_1. \qquad (5.1)$$
It is easy to see that $\mu_0$ is a martingale law on $X^T$ and that $\mu_0(A_0) = 1$ if $A_0$ is the (universally measurable) canonical projection of $A$ onto $X^T$. On the other hand, it follows from (5.1) that there exists $N \in \mathcal{B}(X^T)$ with $\mu_0(N) = 0$ such that for all $x \in X^T \setminus N$, we have $\int |f(x, x')|\, \mu_1(x; dx') < \infty$ and $\mu_1(x)$ is a martingale law on $X$ satisfying $\mu_1(x; A_x) = 1$, where $A_x \in \mathcal{B}(X)$ is the section $A_x = \{x' \in X : (x, x') \in A\}$. By the induction hypothesis, there exists a martingale law $\nu_0$ on $X^T$ such that
$$\#\operatorname{supp} \nu_0 \le (n + k + 1)^T, \quad \operatorname{supp} \nu_0 \subseteq A_0 \setminus N \qquad (5.2)$$
and $\nu_0[g] = \mu_0[g]$, where $g$ is defined as follows. Fix $x \in X^T \setminus N$. By applying the case $T = 1$ to the function $f(x, \cdot)$ and the measure $\mu_1(x)$, we obtain a martingale law $\nu_1(x)$ on $X$ such that $\#\operatorname{supp} \nu_1(x) \le n + k + 1$, $\operatorname{supp} \nu_1(x) \subseteq A_x$ and
$$\int f(x, x')\, \nu_1(x; dx') = \int f(x, x')\, \mu_1(x; dx') \equiv g(x).$$
We may see $x \mapsto \nu_1(x)$ as a kernel (measurable for the discrete $\sigma$-field) and define $\nu = \nu_0 \otimes \nu_1$; this product is well defined as a consequence of (5.2). By construction, we have $\#\operatorname{supp} \nu \le (n + k + 1)^{T+1}$. Moreover, it follows from Fubini's theorem that
$$\nu[f] = \nu_0[g] = \mu_0[g] = \mu[f],$$
and similarly that $\nu$ is a martingale law satisfying $\nu(A) = 1$.
The preceding theorem entails that even for merely measurable functions f , simple martingales are sufficient to establish martingale inequalities.
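For a law that is already finitely supported, the one-period step ($T = 1$) of Theorem 5.1 can be imitated by a Carathéodory-type support reduction. The Python sketch below is a stand-in for illustration, not the argument of [3, Corollary 2]: it assumes $n = k = 1$, a discrete starting law `mu` on 40 atoms, and a test function $f(x) = \cos(3x)$; the name `reduce_support` is hypothetical. The routine repeatedly moves mass along a null direction of the constraint map $x \mapsto (f(x), x, 1)$ until at most $n + k + 1 = 3$ atoms remain, so that the total mass, the barycenter and the integral of $f$ are preserved up to rounding.

```python
import numpy as np

def reduce_support(points, weights, f):
    """Weights of a law with at most 3 atoms and the same mass, mean and integral of f."""
    w = np.asarray(weights, dtype=float).copy()
    # Rows encode the map x -> (f(x), x, 1), here with n = k = 1.
    C = np.vstack([f(points), points, np.ones_like(points)])
    while True:
        idx = np.flatnonzero(w > 0)
        if idx.size <= C.shape[0]:
            return w
        # The last right-singular vector v satisfies C[:, idx] @ v = 0 (up to rounding).
        v = np.linalg.svd(C[:, idx])[2][-1]
        pos = v > 1e-14                  # nonempty because the entries of v sum to zero
        ratios = w[idx][pos] / v[pos]
        j = np.argmin(ratios)
        # Largest step along -v that keeps all weights nonnegative.
        w[idx] = np.maximum(w[idx] - ratios[j] * v, 0.0)
        w[idx[pos][j]] = 0.0             # the minimizing atom is removed exactly

rng = np.random.default_rng(1)
points = np.sort(rng.uniform(-1.0, 1.0, size=40))
mu = rng.dirichlet(np.ones(40))          # a discrete law with 40 atoms
f = lambda x: np.cos(3.0 * x)
nu = reduce_support(points, mu, f)
print(np.count_nonzero(nu), nu @ points - mu @ points, nu @ f(points) - mu @ f(points))
```

The reduced law is carried by a subset of the original atoms, which mirrors the requirement $\operatorname{supp} \nu \subseteq A$; handling a general (non-discrete) $\mu$ is exactly what the cited form of Tchakaloff's theorem provides and is not attempted here.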