Functional inequalities for marked point processes

In recent years, a number of functional inequalities have been derived for Poisson random measures, with a wide range of applications. In this paper, we prove that such inequalities can be extended to the setting of marked temporal point processes, under mild assumptions on their Papangelou conditional intensity. First, we derive a Poincaré inequality. Second, we prove two transportation cost inequalities. The first one refers to functionals of marked point processes with a Papangelou conditional intensity and is new even in the setting of Poisson random measures. The second one refers to the law of marked temporal point processes with a Papangelou conditional intensity, and extends a related inequality which is known to hold on a general Poisson space. Finally, we provide a variational representation of the Laplace transform of functionals of marked point processes with a Papangelou conditional intensity. The proofs make use of an extension of the Clark-Ocone formula to marked temporal point processes. Our results are shown to apply to classes of renewal, nonlinear Hawkes and Cox point processes.


Introduction
Point processes with a Papangelou conditional intensity ([6], [9], [18], [20], [21]) constitute an important class of point process models, which generalizes the Poisson process. Roughly speaking, this conditional intensity, denoted by π_x(ω), has the following intuitive meaning: for a suitable state space S and reference measure σ on S, π_x(ω) σ(dx), x ∈ S, is the conditional probability of having a particle in the infinitesimal region dx when the configuration ω is given outside dx.
In this paper we provide several functional inequalities for marked temporal point processes having a Papangelou conditional intensity, with times in R + and marks in a complete separable metric space E.
Our main achievements are (i) a Poincaré inequality for square-integrable functionals of marked point processes with a Papangelou conditional intensity (Theorem 3.1); (ii) transportation cost inequalities for the law of functionals of marked point processes with a Papangelou conditional intensity (Theorem 3.7) and for the law of the marked point process itself (Theorem 3.13); (iii) a variational representation of the Laplace transform of functionals, bounded from above, of marked point processes with a Papangelou conditional intensity (Theorem 3.18).
The Poincaré inequality and variational representations of the Laplace transform for functionals on the Poisson space have attracted a lot of interest (see [16], [22], [27] for the Poincaré inequality and [4], [29] for variational representations of the Laplace transform). Our results in this direction show that such functional relations hold in a non-Poissonian setting. We emphasize that the Poincaré inequality proved in this paper concerns one-dimensional marked point processes and it holds under different conditions than the Poincaré inequality for Gibbs point processes provided in [13] (we refer the reader to Remark 3.5 for a more detailed discussion). On the one hand, to the best of our knowledge, the transportation cost inequality for functionals of marked point processes with a Papangelou conditional intensity is new even in the Poisson setting. On the other hand, the transportation cost inequality for the law of the marked point process itself generalizes a related relation proved in [19].
A key ingredient in the proofs is a new Clark-Ocone formula for square-integrable functionals of marked point processes with a Papangelou conditional intensity (Theorem 3.19), which generalizes the corresponding formula in [8] in two directions. First, we allow for point processes with values on an unbounded time interval and marks in a complete separable metric space. Second, we prove the square integrability of the Clark-Ocone integrand, which is crucial in the proofs of the functional inequalities mentioned above since it enables the application of the isometry formula provided by Proposition 2.5.
Deviation bounds, which are a classical application of transportation cost inequalities, are presented in Remarks 3.7 and 3.8. The variational representation of the Laplace transform for functionals of marked point processes with a Papangelou conditional intensity can also be useful to derive large deviation principles for those functionals, along the lines of [4] and more generally relying on the theory developed in [7]. Further applications of our results to various classes of non-Poissonian point processes such as renewal, nonlinear Hawkes and Cox are presented in Corollaries 3.2, 3.4, 3.6, 3.10, 3.11, 3.12, 3.14, 3.15, 3.16 and in Remark 3.9.
The proof of the Poincaré inequality is based on the evaluation of the variance of the functional by a combination of the Clark-Ocone formula and the isometry formula for marked point processes with a stochastic intensity. The proof of the transportation cost inequality exploits its characterization via exponential moments proved in [10]. Such exponential moments are controlled by a stochastic convex inequality for functionals of marked point processes with a Papangelou conditional intensity (Proposition 4.2), which is based on the Clark-Ocone formula and generalizes the corresponding result in [12]. The proof of the variational representation of the Laplace transform uses a localization argument to deal with the unbounded case, which is out of the reach of the techniques in [29], and relies on the Clark-Ocone formula in order to take into account the non-Poissonian dynamics of the point process. The proof of the Clark-Ocone formula is based on the representation theorem for square-integrable martingales, on the use of an integration by parts formula for functionals of point processes with a Papangelou conditional intensity, and on an isometry formula for point processes with stochastic intensity, which allows us to identify the integrand appearing in the representation theorem.
We note once again that, in contrast to the corresponding result in [8], the Clark-Ocone formula of Theorem 3.19 guarantees the square integrability of the integrand (3.36). This integrability property is crucial to the proofs of our main results due to a pervasive use of the isometry formula provided by Proposition 2.5.
The paper is organized as follows. In Section 2 we give some preliminaries on point processes including the notions of Papangelou conditional intensity, classical stochastic intensity and an important relation between them. In Section 3 we describe the main results of the paper and give their proofs in Section 4. We also include an appendix, where we prove some technical lemmas and propositions.

Preliminaries on point processes
Let E be a complete separable metric space and ℰ the corresponding Borel σ-field. Let Ω be the set of all integer-valued measures ω on (R + × E, B(R + ) ⊗ ℰ), where B(R + ) is the Borel σ-field on R + , such that ω({0} × E) = 0, ω({t} × E) ≤ 1 for any t ∈ R + , and ω(K × E) < ∞ for any compact set K ⊂ R + (in Remark 2.1, we shall explain in what sense our results apply under a more general definition of Ω). We define N (ω) := ω, ω ∈ Ω, and for a Borel set B ∈ B(R + ) we shall consider the σ-field F B := σ{N (A × L) : A ∈ B(B), L ∈ ℰ}, where B(B) denotes the restriction of B(R + ) to B. For ease of notation we set F t := F [0,t] and F t − := F [0,t) ≡ ⋁ 0≤s<t F s for t ∈ R + . We set F ∞ := ⋁ t∈R + F t , let P be a probability measure on (Ω, F ∞ ) and consider the canonical probability space (Ω, F ∞ , P). The elements of (Ω, F ∞ , P) are known in the literature as simple and locally finite marked point processes on R + with marks in E. Throughout the paper we denote by E and Var the expectation and the variance operators with respect to P, respectively.
By analogy with the unmarked setting, a mapping X : R + × Ω × E → R which is measurable with respect to the σ-fields (B(R + ) ⊗ F ∞ ⊗ ℰ, B(R)), where B(R) is the Borel σ-field on R, is called a stochastic process. Let G := {G t } t∈R + be a filtration such that F ∞ ⊃ G t ⊇ F t , t ∈ R + . The G-predictable σ-field on R + × Ω, denoted by P(G), is the σ-field generated by the sets (a, b] × A with A ∈ G a , a, b ∈ R + . Throughout the paper, for any ω ∈ Ω and (t, x) ∈ R + × E, we define ω − ε (t,x) := ω if (t, x) ∉ Supp(ω), where ε (t,x) is the Dirac measure at (t, x) and Supp(ω) denotes the support of ω. A stochastic process X : R + × Ω × E → R is said to be G-predictable if it is measurable with respect to the σ-fields (P(G) ⊗ ℰ, B(R)). We say that X is predictable if it is measurable with respect to the σ-fields (P(F) ⊗ ℰ, B(R)), where F := {F t } t∈R + . For ease of notation, we set X (t,x) (ω) := X(t, ω, x) and for later purposes, we mention that if X is predictable, then for fixed t ∈ R + and x ∈ E the random variable X (t,x) is measurable with respect to F t − (and therefore with respect to F t ), and so X (t,x) (ω) = X (t,x) (ω − ε (t,x) ). This claim follows by an obvious modification of the proof of Lemma A3.3.I p. 425 in [5], see also Proposition 3.3 in [15].
We shall consider two different notions of conditional intensity for marked point processes: the Papangelou conditional intensity and the classical stochastic intensity.

Marked point processes with a Papangelou conditional intensity
Let ν denote a locally finite measure on (E, ℰ). A non-negative stochastic process π : R + × Ω × E → R + is said to be a Papangelou conditional intensity of N with respect to dtν(dx) if, for any non-negative stochastic process X : R + × Ω × E → R + ,
$$\mathrm{E}\left[\int_{\mathbb{R}_+\times E} X_{(t,x)}(N-\varepsilon_{(t,x)})\,N(dt\times dx)\right]=\mathrm{E}\left[\int_{\mathbb{R}_+\times E} X_{(t,x)}(N)\,\pi_{(t,x)}(N)\,dt\,\nu(dx)\right]. \qquad (2.1)$$
Intuitively, π (t,x) (ω) dtν(dx) is the probability that the point process has a point in the infinitesimal region dtdx given that it agrees with the configuration ω outside of dtdx. Point processes with a Papangelou conditional intensity are fully characterized in [20] (see Section 3) and [21] (see Theorem 2').
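As a sanity check on this definition, the Poisson case can be worked out explicitly. The sketch below assumes that N is a Poisson process with a deterministic, locally integrable intensity function σ(t, x), so that π ≡ σ, and it writes the defining relation in its standard Georgii-Nguyen-Zessin form. The bounded Borel set A ⊂ R + × E is a hypothetical choice made only for illustration.

```latex
% Assumption: N is Poisson with deterministic intensity \sigma(t,x)
% with respect to dt\,\nu(dx), hence \pi_{(t,x)}(\omega) = \sigma(t,x).
% The defining identity then reduces to the Mecke formula
% (Tonelli's theorem is used on the right-hand side, X being non-negative):
\begin{align*}
\mathrm{E}\Big[\int_{\mathbb{R}_+\times E} X_{(t,x)}(N-\varepsilon_{(t,x)})\,N(dt\times dx)\Big]
  = \int_{\mathbb{R}_+\times E} \mathrm{E}\big[X_{(t,x)}(N)\big]\,\sigma(t,x)\,dt\,\nu(dx).
\end{align*}
% Illustration: the deterministic choice X_{(t,x)} \equiv \mathbf{1}_A(t,x)
% recovers the mean measure of the Poisson process.
\begin{align*}
\mathrm{E}[N(A)] = \int_A \sigma(t,x)\,dt\,\nu(dx).
\end{align*}
```

In a non-Poissonian model the factor π (t,x) (N) is random and cannot be pulled out of the expectation, which is the source of the additional terms handled in the main results.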
Remark 2.1. Some texts (e.g. [5, Definition 6.4.I]) use a more general definition of a marked point process which allows N to have atoms ε (t,x) , ε (t,y) for the same t ∈ R + and x ≠ y. We consider a more restrictive set Ω in Section 2 since all our main results hold for marked point processes with a Papangelou conditional intensity. In that case, there exists a version of the more general marked point process which takes its values in the set Ω, see Lemma 2.1 below.
Lemma 2.1. Assume that the marked point process N is defined on Ω ′ instead of Ω and define the quantities from Section 2 in an analogous manner. In particular, if P is a probability measure on (Ω ′ , F ∞ ), where F ∞ is appropriately defined and N has a Papangelou conditional intensity, then N takes its values in Ω P-almost surely, i.e. P(Ω) = 1.
We postpone the proof of this lemma to Section 5.1 in the appendix in order to improve the flow of the article.
Throughout this paper we shall often consider locally stable point processes, i.e. point processes N with a Papangelou conditional intensity π such that for a function β : The local stability is known to be satisfied by a wide range of point processes (see e.g. [18]).
In [17], the author provides a condition guaranteeing the existence of the predictable projection of a bounded stochastic process. In the following proposition, we specialize this result to our setting while relaxing the boundedness assumption.
Proposition 2.2. Assume that N has a Papangelou conditional intensity π, i.e. (2.1) holds. Let X : R + × Ω × E → R be a stochastic process which is such that either (i) X ≥ 0 or (ii) for dtν(dx)-almost all (t, x) ∈ R + × E, X (t,x) ∈ L 1 (Ω, F ∞ , P). Then there exists a predictable stochastic process p(X) : R + × Ω × E → R called predictable projection, which is such that for dtν(dx)-almost all (t, x) ∈ R + × E, we have The proof is rather technical and therefore postponed to Section 5.2 in the appendix. Throughout this paper we will use the discrete Malliavin derivative of F : Ω → R, defined as Under suitable integrability conditions on X and π, we shall consider the stochastic integral x) ), and so this integral can be rewritten as We conclude this paragraph with the following integration by parts formula, see Corollary 3.1 in [25].
Assume that N has a Papangelou conditional intensity π. Then, for any predictable stochastic process X : R + × Ω × E → R and any random variable G : Ω → R such that
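To make the discrete Malliavin derivative used in this subsection concrete, here is a short worked example. It assumes the usual add-one cost form of the operator, D (t,x) F (ω) := F (ω + ε (t,x) ) − F (ω), and the bounded Borel set A ⊂ R + × E is a hypothetical choice.

```latex
% Assumed definition (add-one cost operator):
%   D_{(t,x)}F(\omega) := F(\omega+\varepsilon_{(t,x)}) - F(\omega).
% Example 1: the counting functional F = N(A).
\begin{align*}
D_{(t,x)}N(A) = (N+\varepsilon_{(t,x)})(A) - N(A) = \mathbf{1}_A(t,x).
\end{align*}
% Example 2: the quadratic functional F = N(A)^2, using \mathbf{1}_A^2 = \mathbf{1}_A.
\begin{align*}
D_{(t,x)}N(A)^2
  = \big(N(A)+\mathbf{1}_A(t,x)\big)^2 - N(A)^2
  = \big(2N(A)+1\big)\,\mathbf{1}_A(t,x).
\end{align*}
```

The first example shows that linear functionals have a deterministic derivative, while the second shows that D is a finite difference and does not obey the chain rule of a classical derivative.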

Marked point processes with a classical stochastic intensity
A non-negative and G-predictable stochastic process λ : surely is said to be a G-stochastic intensity of N (see [1], [5] and [14]) if for any non-negative and G-predictable stochastic process X : Additionally, we call λ a (classical) stochastic intensity if G := F. Roughly speaking, the quantity λ (t,x) (ω) dtν(dx) is the probability that the point process has a point in the infinitesimal region dtdx given that it agrees with the configuration ω on [0, t) × E.
Hereafter, we assume that N has a G-stochastic intensity λ and, given two G-predictable stochastic processes X, Y : For p ∈ [1, +∞), we denote by P p (λ) the family of equivalence classes (with respect to the equivalence relation ∼) formed by G-predictable stochastic processes X such that and, for ease of notation, we set P 1,2 (λ) := P 1 (λ) ∩ P 2 (λ). Note that · Pp(λ) is a norm on P p (λ). For any X ∈ P 1 (λ) we define the stochastic integral which is well defined P-almost everywhere as the difference of two finite terms by (2.4). The next proposition provides a fundamental isometry formula for marked point processes with a G-stochastic intensity.
Proposition 2.5. Assume that N has a G-stochastic intensity λ and that E N ([0, t] × E) < ∞, for all t ∈ R + . The operator δ with domain P 1 (λ) can be uniquely extended to an operator with domain P 1 (λ) ∪ P 2 (λ), which we still denote by δ. Additionally, for any X, Y ∈ P 2 (λ) we have Although the proof of this proposition follows a standard Cauchy sequence argument along the lines of Section II.2 in [11], we have not found this precise result in the literature. For this reason, we provide the proof in Section 5.3 of the appendix.
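The isometry can be illustrated in the simplest possible case. The sketch below assumes that δ(X) is the compensated integral of X against N (dt × dx) − λ (t,x) dtν(dx) and that the isometry takes its usual form E[δ(X)δ(Y )] = E[∫ X Y λ dtν(dx)]; neither display is transcribed from the text above. The constant intensity c > 0, the horizon t and the set L with ν(L) < ∞ are hypothetical.

```latex
% Assumed forms (not transcribed from the paper):
%   \delta(X) = \int X_{(s,x)}\,\big(N(ds\times dx) - \lambda_{(s,x)}\,ds\,\nu(dx)\big),
%   \mathrm{E}[\delta(X)\delta(Y)] = \mathrm{E}\big[\int X_{(s,x)}Y_{(s,x)}\lambda_{(s,x)}\,ds\,\nu(dx)\big].
% Example: N Poisson with \lambda \equiv c > 0 and the deterministic
% integrand X = \mathbf{1}_{[0,t]\times L}, with \nu(L) < \infty.
\begin{align*}
\delta(X) &= N([0,t]\times L) - c\,t\,\nu(L),\\
\mathrm{E}\big[\delta(X)^2\big] &= \mathrm{Var}\big(N([0,t]\times L)\big) = c\,t\,\nu(L)
  = \mathrm{E}\Big[\int_0^t\!\!\int_L 1^2\,c\,\nu(dx)\,ds\Big].
\end{align*}
```

Both sides equal the Poisson variance c t ν(L), which is what the isometry predicts for an indicator integrand.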

A relation between the Papangelou conditional intensity and the classical stochastic intensity
In the next lemma we show that the Papangelou conditional intensity of a marked point process N determines its stochastic intensity.
Proof. Let X be a non-negative and predictable stochastic process. As recalled at the beginning of Section 2, we have X (t,x) (ω) = X (t,x) (ω − ε (t,x) ) and X (t,x) is F t − -measurable, for any t ∈ R + and x ∈ E. So by Fubini's theorem, standard properties of the conditional expectation and Proposition 2.2-(i),
As already mentioned, the proofs of our main results are based on a Clark-Ocone formula for marked point processes, which, in turn, exploits the martingale representation theorem. For this reason, we shall consider the P-completed filtration F̄ ≡ {F̄ t } t∈R + , defined by F̄ t := F t ∨ N , where N is the family of P-null events of F ∞ (see [1] p. 309).
The next lemma, which we prove for the sake of completeness, guarantees that the notion of stochastic intensity is equivalent to that of F̄-stochastic intensity, where F̄ is the P-completed filtration introduced above.
Lemma 2.7. Let λ : R + × Ω × E → R + be a predictable stochastic process. Then N has a stochastic intensity λ if and only if N has an F̄-stochastic intensity λ.
Proof. We start by noting that any predictable stochastic process is F̄-predictable, and in particular λ is both predictable and F̄-predictable. The necessity follows directly from this fact. For the sufficiency, taking in (2.4) the process defined by X (t,x) : We shall check later on that, for any non-negative or integrable random variable Y and t ∈ R + we have As a consequence of (2.6), for any a, b ∈ R + and L ∈ ℰ, we have

Main results
In this section we state our main achievements, i.e. a Poincaré inequality for square-integrable functionals of marked point processes with a Papangelou conditional intensity, a transportation cost inequality for functionals of marked point processes with a Papangelou conditional intensity, a transportation cost inequality for the law of a marked point process with a Papangelou conditional intensity, and a variational representation formula for the Laplace transform of (bounded from above) functionals of marked point processes with a Papangelou conditional intensity. All these functional relations are obtained by applying a Clark-Ocone formula for square-integrable functionals of space-time point processes with a Papangelou conditional intensity, which generalizes in various directions the corresponding formula in [8].
Hereafter, we work under the convention 0/0 := 0. Moreover, in order to be more precise, from now on we shall write P G p (λ) in place of P p (λ), p ≥ 1, to stress that the stochastic processes therein are predictable with respect to some specific filtration G.

Poincaré inequality
The following Poincaré inequality holds for functionals of marked point processes with a Papangelou conditional intensity. 1) and that N has a Papangelou conditional intensity π such that
Remark 3.2. If N is a Poisson process with locally integrable intensity function σ(t, x) then π ≡ σ and all the assumptions of Theorem 3.1 hold. In particular, γ = 0 and we recover the well-known Poincaré inequality for Poisson functionals (see e.g. [15], [16] and [27]).
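For orientation, the classical Poisson inequality referred to in this remark can be written out and checked on a linear functional. The display below is the well-known form from the Poisson literature, not a transcription of (3.4). Here D denotes the add-one cost operator D (t,x) F (ω) = F (ω + ε (t,x) ) − F (ω), and the Borel set A with finite mean measure is hypothetical.

```latex
% Classical Poincare inequality for a Poisson process with intensity
% \sigma(t,x)\,dt\,\nu(dx) and a square-integrable functional F:
\begin{align*}
\mathrm{Var}(F) \le \mathrm{E}\Big[\int_{\mathbb{R}_+\times E}
   |D_{(t,x)}F|^2\,\sigma(t,x)\,dt\,\nu(dx)\Big].
\end{align*}
% Check on F = N(A) with \int_A \sigma\,dt\,d\nu < \infty: here D_{(t,x)}F = \mathbf{1}_A(t,x), so
\begin{align*}
\mathrm{Var}\big(N(A)\big) = \int_A \sigma(t,x)\,dt\,\nu(dx)
  = \mathrm{E}\Big[\int_{\mathbb{R}_+\times E} \mathbf{1}_A(t,x)^2\,\sigma(t,x)\,dt\,\nu(dx)\Big].
\end{align*}
```

Equality holds for counting functionals, so the constant 1 in the Poisson inequality cannot be improved.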
Remark 3.3. If N is locally stable in the sense of (2.2), then assumptions (3.1) and (3.2) hold. Indeed, it is clear that (2.2) implies (3.2). As for (3.1), note that under assumption (2.2) we have where the first equality follows by the iterated Georgii-Nguyen-Zessin equation, see e.g. Proposition 15.5.II in [6]. We also remark that (3.2) is automatically verified if (3.1) holds and the marked point process is attractive, in the sense that for dsν(dx)-almost all (s, x) ∈ R + × E and all ω, η ∈ Ω such that Supp(ω) ⊂ Supp(η) (see e.g. equation (3.7) in [18] for more explanations on this notion). Indeed, for all t ∈ R + we have where we have applied (2.1) twice, and used the fact that by hypothesis E[N ([0, t] × E)²] < ∞.
Remark 3.4. Let N be a point process on R + ×E with a Papangelou conditional intensity π such that (2.2) holds with dominating function β, for some measurable non-negative function α : R + ×E → R, Consequently, all assumptions of Theorem 3.1 are satisfied. In particular, we note that which, combined with (3.6), guarantees (3.3).
Remark 3.5. As mentioned in the introduction, a Poincaré inequality for Gibbs point processes was proved in [13] (see Corollaries 5.1 and 5.2 therein). Basically, Corollary 5.1 in [13] (of which Corollary 5.2 is a small improvement) states and proves the following. If N is a grand canonical Gibbs point process on R d with activity parameter z > 0 and non-negative pair potential φ such that then, for any square-integrable functional F of the point process, we have where D x is the usual add one-cost operator, is the Papangelou conditional intensity of N and E denotes the relative energy. On the one hand, the Poincaré inequality provided by Theorem 3.1 can be applied e.g. to square-integrable functionals of renewal, non-linear Hawkes and Cox point processes, and these processes do not belong to the class of Gibbs point measures for which the inequality in [13] applies. On the other hand, we were not able to apply our Poincaré inequality to the Gibbs measures considered in [13].
The next three corollaries, whose proofs are given in Section 4, provide classes of non-Poissonian point processes which satisfy the Poincaré inequality (3.4).
and a spacing density f such that f is continuous on [0, +∞) and f > 0 on (a, C) for some a ∈ [0, T ] and C ∈ (T, +∞]. Assume further that for any z ∈ [0, T ], as well as where F is the tail function of f . Then, for any square integrable functional G of N , where the Papangelou conditional intensity π of N is given by Next, we give some illustrating examples of renewal point processes which satisfy the assumptions of Corollary 3.2. Second, consider the Weibull spacing density function The corresponding tail function is With the aim to check condition (3.8), we remark that again by Lipschitz continuity, we have Hence for fixed T > 0 and β 0 satisfying the inequality γ * (β 0 ) < 1 (in particular, note that it suffices to take β 0 close to one) the corresponding renewal point process satisfies the Poincaré inequality (3.9). Next, consider the generalized Pareto spacing density function Hence for fixed T, λ > 0 and ξ 0 satisfying the inequality γ * (ξ 0 ) < 1 (in particular, note that it suffices to take ξ 0 close to zero) the corresponding renewal point process satisfies the Poincaré inequality (3.9).

Corollary 3.4 (Nonlinear Hawkes processes).
Assume that N is a nonlinear Hawkes process on [0, T ], T < ∞, with parameters (h, φ) such that h is non-negative and integrable on [0, T ], and φ is Lipschitz continuous and non-increasing with φ(0) > 0. Additionally, assume where φ Lip denotes the Lipschitz constant of φ. Then, for any square-integrable functional G of N , where the Papangelou conditional intensity π of N is given by In particular, we have dtP(dω)-almost everywhere.
Next, we give an example of a nonlinear Hawkes process that satisfies the assumptions of Corollary 3.4.
Example 3.5. Assume that N is a nonlinear Hawkes process with parameters φ(x) := α min(max(K − x, 0), 1) and h := 1 [0,z] (the indicator function of [0, z]) for some α, K, z > 0, with K an integer. This is a notable example of a nonlinear Hawkes process since N ([0, t]), t ∈ [0, T ], is the total number of customers who have entered, in the time interval [0, t], the Erlang loss system (or M/D/K/0 queue) with arrival rate α, deterministic service time z and number of servers equal to K, see [2] for details. An easy computation shows that γ defined in Corollary 3.4 is equal to Therefore, for fixed α, T > 0, and z 0 satisfying the inequality γ(z 0 ) < 1 (in particular, note that it suffices to take z 0 close to zero), the corresponding nonlinear Hawkes process satisfies the Poincaré inequality (3.10).
We conclude this subsection by providing a class of Cox processes which satisfy the Poincaré inequality.
Then, for any square-integrable functional G of N we have where the Papangelou conditional intensity π of N is given for fixed ω ∈ Ω by .
In particular, we have

Transportation cost inequalities
Let χ be a Polish space equipped with its Borel σ-field B(χ) and let d be a lower semi-continuous metric on χ (which does not necessarily generate the topology on χ). Letting σ 1 and σ 2 denote two probability measures on (χ, B(χ)), we define the transportation cost where the infimum is taken over all the probability measures ρ on χ × χ with first marginal σ 1 and second marginal σ 2 . We denote by M 1 (χ, d) the set of all probability measures σ on (χ, B(χ)) such that d(x 0 , x) σ(dx) < ∞, for some x 0 ∈ χ and we remark that for σ 1 , σ 2 ∈ M 1 (χ, d), The relative entropy of σ 1 with respect to σ 2 is defined by if σ 1 is absolutely continuous with respect to σ 2 and H(σ 1 | σ 2 ) := +∞ otherwise (the reader is referred to [26] for more insight into the theory of optimal transportation).
In the following, we denote by the monotone conjugate of a measurable function g :
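Since the objects of this subsection are used repeatedly, it may help to recall their standard forms. In the sketch below the symbol T d for the transportation cost is an assumed notation, the monotone conjugate is taken to be the usual one-sided Legendre transform of a function on R + , and the quadratic function g with constant c > 0 is a hypothetical example.

```latex
% Standard definitions (the notation T_d is an assumption):
\begin{align*}
T_d(\sigma_1,\sigma_2) &:= \inf_{\rho}\iint_{\chi\times\chi} d(x,y)\,\rho(dx,dy),\\
H(\sigma_1\mid\sigma_2) &:= \int_{\chi}\log\frac{d\sigma_1}{d\sigma_2}\,d\sigma_1
   \quad\text{if } \sigma_1\ll\sigma_2,\\
g^{\odot}(r) &:= \sup_{\theta\ge 0}\,\{r\theta-g(\theta)\},\qquad r\ge 0 .
\end{align*}
% Worked example: g(\theta) = c\,\theta^2/2 with c > 0.
% The map \theta \mapsto r\theta - c\theta^2/2 is maximised at \theta = r/c \ge 0, hence
\begin{align*}
g^{\odot}(r) = r\cdot\frac{r}{c}-\frac{c}{2}\Big(\frac{r}{c}\Big)^{2} = \frac{r^{2}}{2c},
\qquad r\ge 0 .
\end{align*}
```

A quadratic exponential moment bound therefore corresponds to a Gaussian-type rate r²/(2c), which is the simplest instance of the correspondence used in the sequel.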

A transportation cost inequality for functionals of N
In this subsection we take χ := R, suppose that there exists a norm on χ, say ‖ · ‖ d , such that d(x, y) = ‖x − y‖ d , x, y ∈ R, and denote by L(X) the law of a real-valued random variable X. The following theorem holds.
Theorem 3.7. Suppose that N satisfies (2.2) for a measurable function β and let G be an integrable random variable such that for dtdPdν-almost all (t, ω, x) and some deterministic functions g 1 , g 2 such that where h := g 1 + g 2 . Then and If additionally we assume that there exists is always well-defined under the convention 0/0 := 0.
Remark 3.7 (Deviation inequality). From the point of view of the applications, it is important to remark that a transportation cost inequality is often equivalent to a deviation bound. More precisely, in the context of Theorem 3.7, one has that the transportation cost inequality (3.17) is equivalent to the deviation bound , for any n ≥ 1 and r ∈ R + (3.21) for any measurable function f : R → R which is Lipschitz continuous (with respect to d) with Lipschitz constant less than or equal to 1, i.e. such that where {G n } n≥1 is a sequence of independent random variables with the same law as G (see [10] and Theorem 1.1(c) in [19]).
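The passage from an exponential moment bound to a deviation bound of this type is a Chernoff argument, sketched below. It assumes a bound of the form E[exp(θ(f (G) − E f (G)))] ≤ exp(Λ(θ)) for all θ ≥ 0, of the kind obtained in the proof of Theorem 3.7, and writes Λ ⊙ (r) := sup θ≥0 (θr − Λ(θ)). The exact statement (3.21) may differ in form from the final line.

```latex
% Assumption: \mathrm{E}\big[e^{\theta(f(G)-\mathrm{E}f(G))}\big] \le e^{\Lambda(\theta)}
% for all \theta \ge 0, with G_1,\dots,G_n independent copies of G.
% Step 1: Markov's inequality applied to the exponential, then independence.
\begin{align*}
P\Big(\frac1n\sum_{k=1}^{n} f(G_k)-\mathrm{E}f(G)\ge r\Big)
 &\le e^{-\theta n r}\prod_{k=1}^{n}
    \mathrm{E}\big[e^{\theta(f(G_k)-\mathrm{E}f(G))}\big]
  \le \exp\big(-n\,(\theta r-\Lambda(\theta))\big).
\end{align*}
% Step 2: optimise over \theta \ge 0.
\begin{align*}
P\Big(\frac1n\sum_{k=1}^{n} f(G_k)-\mathrm{E}f(G)\ge r\Big)
 \le \exp\big(-n\,\Lambda^{\odot}(r)\big),
 \qquad \Lambda^{\odot}(r)=\sup_{\theta\ge 0}\{\theta r-\Lambda(\theta)\}.
\end{align*}
```

This gives only the direction from the exponential moment bound to the deviation bound; the converse direction, and hence the equivalence with the transportation cost inequality, relies on the results of [10] and [19].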
To the best of our knowledge, the transportation cost inequality provided by Theorem 3.7 is new even in the Poisson case, which we state in a separate corollary.
Corollary 3.8 (Poisson processes). Suppose that N is a Poisson process on R + × E with mean measure β(t, x) dtν(dx) and let G be an integrable random variable such that for dtdPdν-almost all (t, ω, x) and some deterministic function h which satisfies (3.16). Then the transportation cost inequality (3.17) holds. If additionally we assume that there exists M > 0 such that h(t, x) ≤ M for dtdν-almost all (t, x), then (3.20) holds, and provides a more explicit bound on the corresponding deviation inequality (3.21).
In the following proposition, we specialize Theorem 3.7 to first order integrals. The subsequent corollaries concern applications to renewal, nonlinear Hawkes and Cox point processes.
for some measurable deterministic function g, and let g 1 , g 2 be deterministic functions such that |g(s, y)|β(s, y) dsν(dy) for (t, x) ∈ R + × E. Assuming that h := g 1 + g 2 satisfies the corresponding assumption (3.16), the transportation cost inequality (3.17) holds. If additionally we assume that there exists M > 0 such that h(t, x) ≤ M for dtdν-almost all (t, x), then (3.20) holds, and provides a more explicit bound on the corresponding deviation inequality (3.21).
In the next corollaries, we provide classes of point processes which satisfy the assumptions of Proposition 3.9. We point out that we do not aim to optimize the assumptions, favoring instead clarity and conciseness. The proofs are quite elementary, and provided in Section 4.4 for the readers' convenience.
After straightforward adjustments due to the unmarked setting, the transportation cost inequality (3.17) holds with For any M > 0 such that ‖g 1 + g 2 ‖ ∞ ≤ M , we have that (3.20) holds, which yields a more explicit bound on the corresponding deviation inequality (3.21).

Corollary 3.11 (Nonlinear Hawkes processes).
Assume that N is a nonlinear Hawkes process is Lipschitz continuous and non-increasing with φ(0) > 0, and let G be defined by (3.23). After straightforward adjustments due to the unmarked setting, the transportation cost inequality (3.17) For any M > 0 such that ‖g 1 + g 2 ‖ ∞ ≤ M , we have that (3.20) holds, which yields a more explicit bound on the corresponding deviation inequality (3.21).

Corollary 3.12 (Cox processes). Assume ν(E) < ∞ and let N be a Cox process on
then the transportation cost inequality (3.17) holds with holds, which yields a more explicit bound on the corresponding deviation inequality (3.21).

A transportation cost inequality for the law of N
In this subsection we take χ := Ω and equip this set with the vague convergence topology, that is, the coarsest topology such that the map is a continuous function with compact support. It is well-known that this topology makes Ω a Polish space (see e.g. [5]). In this subsection, we let ϕ : R + × E → R + be a continuous function and define the following metric on Ω: where, for κ ∈ Ω, |κ| := κ + + κ − , and κ + and κ − denote respectively the positive and the negative parts of κ in the Hahn-Jordan decomposition. It is known that d ϕ is a lower semi-continuous metric on Ω (see Lemma 2.2 in [19]). The following theorem holds.
Theorem 3.13. Assume that N satisfies (2.2) for a measurable function β, that there exists a deterministic measurable function ψ which verifies (3.22) with ϕ in place of g and ψ in place of g 2 and that

(3.28)
where h ϕ := ϕ + ψ. Then In particular, when N is a Poisson process with mean measure β(t, x) dtν(dx) we recover the sharp transportation cost inequality in Remark 2.7 of [19].
Remark 3.8 (Deviation inequality). Here again, it turns out (see [10] and Theorem 1.1(c) in [19]) that, in the context of Theorem 3.13, the transportation cost inequality (3.29) is equivalent to the deviation bound , for any n ≥ 1 and r ∈ R + (3.30) for any measurable function F : Ω → R which is Lipschitz continuous (with respect to d ϕ ) with Lipschitz constant less than or equal to 1, i.e. such that where β is defined by (3.26).
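A small example may clarify how the metric of this subsection measures the discrepancy between two configurations. It assumes that d ϕ has the total-variation form d ϕ (ω, ω ′ ) = ∫ ϕ d|ω − ω ′ | used in [19]; the points a, b, c ∈ R + × E are hypothetical and pairwise distinct.

```latex
% Assumed form of the metric (following [19]):
%   d_\varphi(\omega,\omega') = \int_{\mathbb{R}_+\times E}\varphi\,d|\omega-\omega'| .
% Example: \omega = \varepsilon_a+\varepsilon_b and \omega' = \varepsilon_b+\varepsilon_c,
% with a, b, c pairwise distinct points of \mathbb{R}_+\times E.
\begin{align*}
\omega-\omega' &= \varepsilon_a-\varepsilon_c, &
|\omega-\omega'| &= \varepsilon_a+\varepsilon_c, &
d_\varphi(\omega,\omega') &= \varphi(a)+\varphi(c).
\end{align*}
% With \varphi \equiv 1 the distance counts the points in the symmetric
% difference of the two configurations (here: 2).
```

The common point b does not contribute, so only the points that must be added or removed to pass from ω to ω ′ are charged, each with weight ϕ.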

Variational representation of the Laplace transform
The following Theorem 3.18 generalizes to functionals of marked point processes with Papangelou conditional intensity the variational representation formula for the Laplace transform of Poisson functionals given by Theorem 4.4 of [29]. Note also that, in contrast to [29], here the marked point process is defined on the whole half-line.
Hereafter, we suppose that N satisfies (3.1) and (3.2), and denote by H the subset of P F 2 (p(π)) formed by the real-valued processes φ ∈ and for any T > 0, (t, The following lemma ensures that {E t (φ)} t∈R + is a square-integrable F-martingale. Its proof is postponed to Section 4 (see Subsection 4.6).
Lemma 3.17. Assume that N satisfies (3.1) and (3.2). Then, for any φ ∈ H, the stochastic Under the assumptions of Lemma 3.17, for φ ∈ H, we define a new probability measure P φ on The following variational representation of the Laplace transform holds.

(3.35)
and let G be a random variable on (Ω, F ∞ , P) which is upper bounded. Then where E φ denotes the expectation under P φ and Additionally, if G ∈ L ∞ (Ω, F ∞ , P), then the infimum is uniquely attained at Here, we limit ourselves to note that ϕ (F ) is well-defined and belongs to P F 2 (p(π)), and refer the reader to Theorem 3.19 and Remark 3.11 for details.

Clark-Ocone formula
As already mentioned, the proofs of Theorems 3.1, 3.7, 3.13 and 3.18 are based on a Clark-Ocone formula for marked point processes with Papangelou conditional intensity which generalizes the one derived in [8] (see Remark 3.10).
Remark 3.10. The Clark-Ocone formula (3.37) generalizes the corresponding formula in [8] to point processes on R + with marks in E and guarantees that the integrand ϕ (G) is square integrable with respect to p(π) (t,x) (ω)dtP(dω)ν(dx). This integrability property is crucial in our proofs since it allows one to apply the isometry formula of Proposition 2.5. We also remark that the proof of formula (3.37) provided in this paper is shorter than the proof of the corresponding Clark-Ocone formula in [8].

Proof of Theorem 3.1
By Lemma 2.6, Theorem 3.19 and Proposition 2.5, we have Note that Thus, for any 0 < q < 1 − γ, by (4.1) and the convexity inequality we have By the Cauchy-Schwarz inequality By the Cauchy-Schwarz inequality So, by assumption (3.3) and the inequality which follows from Jensen's inequality, we have

Proofs of Corollaries 3.2, 3.4 and 3.6
Proof of Corollary 3.2. By Corollary 2.9 in [8] N has Papangelou conditional intensity {π t } t∈[0,T ] defined in the statement. By Proposition 2.10 in [8] N is locally stable, and so by Remark 3.3 the corresponding assumptions (3.1) and (3.2) are satisfied. In order to verify the corresponding assumption (3.3), we compute the quantity We recall that the points of N are denoted by T 0 := 0 < T 1 < T 2 < · · · < T N ([0,T ]) . For any t ∈ [0, T ], we have Hence by (4.6), Lemma 2.6 and Lemma 1 in [2], and so, by (3.8), the corresponding assumption (3.3) holds and the claim follows by Theorem 3.1.
Proof of Corollary 3.4. By Lemma 2.7 in [8], N has a Papangelou conditional intensity π t defined by (3.11). The assumptions on the parameters (h, φ) guarantee λ t ≤ φ(0) and Consequently, the inequalities in (3.12) hold and by Remark 3.3 the corresponding conditions (3.1) and (3.2) are satisfied. A straightforward computation gives and therefore the corresponding assumption (3.3) holds. The claim follows by Theorem 3.1.

Proof of Theorem 3.7
The proof of Theorem 3.7 is based on two preliminary propositions. The first one consists in a result from [10] (see Theorem 3 therein, as well as Theorem 1.1 in [19]). The second one, whose proof is given at the end of this subsection, provides a stochastic convex inequality for functionals of marked point processes and generalizes Theorem 4.1-(ii) in [12]. We recall that in general, χ denotes a Polish space equipped with its Borel σ-field B(χ) and d denotes a lower semi-continuous metric on χ which does not necessarily generate the topology on χ. Proposition 4.2. Let the notation of Theorem 3.19 prevail. Assume that N satisfies (2.2) with a dominating function β, G ∈ L 2 (Ω, F ∞ , P) and ϕ (G) (t,x) ≤ h(t, x), dtP(dω)ν(dx)-almost everywhere, for some deterministic function h such that Then, letting E ′ denote the expectation corresponding to a probability measure P ′ on (Ω, F ∞ ) under which N is a Poisson process on R + × E with intensity function β, for all twice continuously differentiable convex functions φ : R → R such that φ ′ is convex.
Proof of Theorem 3.7. We take χ = R and let f : R → R be bounded and Lipschitz continuous with Lipschitz constant less than or equal to 1 (i.e. sup x≠y |f (x) − f (y)|/d(x, y) ≤ 1). We note that for dtν(dx)P(dω)-almost every (t, x, ω), and so So, for any θ ≥ 0, by Proposition 4.2 with φ(x) := e θx and f (G) in place of G, we have where Λ is defined by (3.19). A straightforward computation shows that Λ is non-negative, non-decreasing, left-continuous and convex, with Λ(0) = 0. Thus, by Proposition 1 in [10], Λ ⊙⊙ = Λ. The transportation cost inequality (3.17) then follows by Proposition 4.1. Now, assuming that h is bounded by M > 0, we have and as in the proof of Theorem 2.6 in [19], we get the inequality (3.20).
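As a quick check that the choice φ(x) := e θx in the proof above is admissible in Proposition 4.2, the following worked computation (a sketch, written for θ ≥ 0 as in the proof) verifies the requirements on φ: twice continuous differentiability, convexity, and convexity of φ ′.

```latex
\begin{align*}
\phi(x) &= e^{\theta x}, \qquad \theta \ge 0,\\
\phi'(x) &= \theta e^{\theta x}, \qquad
\phi''(x) = \theta^{2} e^{\theta x} \ge 0
  \quad\Longrightarrow\quad \phi \text{ is convex},\\
\phi'''(x) &= \theta^{3} e^{\theta x} \ge 0
  \quad\Longrightarrow\quad \phi' \text{ is convex}.
\end{align*}
```

The sign of the third derivative is where θ ≥ 0 is used: for θ < 0 the derivative φ ′ would be concave rather than convex.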
Proof of Proposition 4.2. This proof is inspired by that of Lemma 3.2 of [28]. Throughout this proof all the random quantities are defined on the product probability space (Ω 2 , F ∞ ⊗ F ∞ , P ⊗ P ′ ), and we let N ′ (ω, ω ′ ) = ω ′ and N (ω, ω ′ ) = ω. With an abuse of notation, we set p(π) (t,x) (ω, ω ′ ) := p(π) (t,x) (ω), and we denote by E the expectation with respect to P ⊗ P ′ . Let {M t } t∈R + and {M * t } t∈R + be the stochastic processes defined, respectively, by and Let {H t } t≥0 and {H * t } t≥0 be, respectively, the forward and backward filtrations defined by By Corollary C4 p. 235 in [1] and standard properties of the conditional expectation, we have that {M t } t≥0 is an H * -adapted H-martingale and {M * t } t≥0 is an H-adapted H * -backward martingale. Letting ε (t,u) and ε t denote, respectively, the Dirac measure at (t, u) ∈ R + × R and at t ∈ R + , we define the jump measures of {M t } t∈R + and {M * t } t∈R + respectively by where ∆M t := M t − M t − . For any fixed t ∈ R + , denote by ν t (dτ ) the (random) image measure on (R, B(R)) of p(π) (t,x) ν(dx) by the mapping E ∋ x → ϕ (G) (t,x) ∈ R, i.e. for any bounded and measurable f : and similarly let ν * t be the measure on (R, B(R)) defined by It turns out that ν t (dτ ) dt is the dual H-predictable projection of µ and ν * t (dτ ) dt is the dual H * -predictable projection of µ * . Indeed, focusing e.g. on µ, again by Corollary C4 p. 235 in [1] and standard properties of the conditional expectation, for any t, ∆t ≥ 0 and A ∈ B(R), Consequently, conditions (3.1), (3.2) and (3.3) of [12] are verified. We also note that condition (3.4) of [12] is trivially satisfied with H ≡ H * ≡ 0. For any t ∈ R + , we define the following (random) measures on (R, B(R)): ν̃ t (dτ ) := |τ | 2 ν t (dτ ) and ν̃ * t (dτ ) := |τ | 2 ν * t (dτ ).
For any u ∈ R, we have Furthermore, for any u ∈ R, Therefore, for any u ∈ R and any (ω, ω ′ ) ∈ Ω 2 , ν * t ([u, ∞)) < ∞ dt-almost everywhere, and so by Theorem 3.3 in [12] we have , for all t ∈ R + and any function φ as in the statement. By Theorem 3.19 and Proposition 2.4 we have Thus, there exists a sequence {t n } n≥1 such that M tn + M * tn → G − EG, P-almost surely, and therefore by Fatou's lemma which is exactly (4.9).

Proofs of Proposition 3.9 and Corollaries 3.10, 3.11, 3.12
Proof of Proposition 3.9. We shall apply Theorem 3.7. For any (t, x) ∈ R + × E, we have and by the Cauchy-Schwarz inequality .
Proof of Corollary 3.10. By the proof of Proposition 2.10 in [8] we have π t ≤ β, where β is given by (3.26). Additionally, by (3.24), (4.7) and the form of the stochastic intensity for renewal processes, The claim follows by Proposition 3.9.
Proof of Corollary 3.11. By the first inequality in (3.12) we can take as dominating function of π t Additionally, by the first inequality in (4.8), The claim follows by Proposition 3.9.
Proof of Corollary 3.12. We already noticed in the proof of Corollary 3.6 that β is a dominating function of the Papangelou conditional intensity. The claim easily follows by Proposition 3.9 noticing that

Proof of Theorem 3.13
Since the proof is conceptually similar to that of Theorem 3.7, we only emphasize the main differences. We take χ = Ω, d := d ϕ and let F : (Ω, d ϕ ) → R be bounded and Lipschitz continuous with Lipschitz constant less than or equal to one. We have for dtν(dx)P(dω)-almost every (t, x, ω), Proceeding similarly to the series of inequalities (4.10), we obtain ϕ(s, y)β(s, y) dsν(dy) .
The remainder of the proof is similar to that of the final part of Theorem 3.7.

Proof of Theorem 3.18
The proof of Theorem 3.18 is based on two preliminary propositions, which extend to our setting Propositions 4.2 and 4.3 in [29], respectively.
Proposition 4.3. Let the assumptions and notation of Theorem 3.19 prevail and let G be a random variable on (Ω, F ∞ , P) which satisfies 0 < c 0 ≤ G ≤ c 1 almost surely, for some constants c 0 , c 1 > 0. Setting we have φ (G) ∈ H. Additionally, for any t ∈ R + we have
Proposition 4.4. Let the assumptions and notation of Theorem 3.19 prevail and, for φ ∈ H, let P φ be defined by (3.34). For any predictable stochastic process ψ such that we have that, under P φ , is a square-integrable F-martingale with null mean.
These propositions are proved at the end of this subsection, and we start proving Lemma 3.17.
Proof of Theorem 3.18. We divide the proof into three steps. In the first step we identify dP φ /dP; in the second step we prove the claim when G ∈ L ∞ (Ω, F ∞ , P); in the third step we prove the variational representation in the more general case of functionals G which are bounded from above.
Step 1. Letting M > 0 denote a constant such that |φ (s,x) | ≤ M dsν(dx)dP-almost everywhere, by (4.18) we have Let T > 0 be arbitrarily fixed. Again by (4.18) we have that s → E E s (φ) 2 is non-decreasing and continuous on [0, T ] and by (3.35) is integrable on [0, T ]. Therefore, by Grönwall's lemma ds ≤ e M 2 K , for any t ∈ R + and so sup t∈R + E E t (φ) 2 ≤ e M 2 K . By this latter relation, (4.14), Proposition 2.4-(ii) and (3.35), we have and thus E t (φ) converges in L 2 (Ω, F ∞ , P) to a random variable X. Letting E ∞ (φ) denote the P-almost sure limit of E t (φ) as t → ∞, we necessarily have X = E ∞ (φ) almost surely, and so By the martingale property it follows that E t (φ) = E E n (φ) F t for any t ∈ R + and any integer n > t. By the This convergence holds almost surely for a suitable subsequence {n ′ } and passing to the limit as n ′ → ∞ in the equality By this relation we finally deduce dP φ /dP = E ∞ (φ).
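For reference, the integral form of Grönwall's lemma that an estimate of this kind typically relies on is recalled below. The symbols u, a and b are generic placeholders and not notation from the paper; identifying u(t) with the second moment of E t (φ) and taking a = 1 is an assumption suggested by the bound e M 2 K , not something stated explicitly in the text.

```latex
% Integral form of Gr\"onwall's lemma (generic statement).
% u, a, b are placeholder symbols, not notation from the paper.
Let $u \colon [0,T] \to [0,\infty)$ be continuous, $a \ge 0$, and
$b \colon [0,T] \to [0,\infty)$ integrable. If
\[
  u(t) \le a + \int_0^t b(s)\, u(s)\, ds, \qquad t \in [0,T],
\]
then
\[
  u(t) \le a \exp\!\Big( \int_0^t b(s)\, ds \Big), \qquad t \in [0,T].
\]
% In particular, if $a = 1$ and $\int_0^T b(s)\, ds \le M^2 K$ for every $T$,
% then $\sup_{t \ge 0} u(t) \le e^{M^2 K}$.
```

The point of the uniform bound is that it does not depend on T, which is what allows the supremum over all t ∈ R + in the proof.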
Step 2. Using the elementary inequality (4.20) where the finiteness of the L 2 -norms follows by (3.32) and (4.19). Let φ ∈ H and T > 0 be arbitrarily fixed, and set ψ (s,x) : . Therefore, by (3.32) we have (4.13). Note that where we have used that the martingale provided by Proposition 4.4 has null mean. By Jensen's inequality and this relation and so − log E e −G ≤ inf φ∈H E φ G + L(φ) . Therefore, by Proposition 4.3 , ∀t ∈ R + and letting t go to infinity we deduce where the latter equality follows by (4.21). Combining (4.24) and (4.23) we deduce and the infimum is attained at φ (F ) . It remains to show that the infimum is uniquely attained at φ (F ) , i.e. if φ ∈ H is a stochastic process at which the infimum is attained then necessarily x, ω). So let φ ∈ H be a process at which the infimum is attained. Then Jensen's inequality (4.22) holds as an equality and therefore we have Therefore E ∞ (φ) = E ∞ (φ (F ) ) P φ -almost surely. Since the probability measures P φ and P are equivalent it follows that E ∞ (φ) = E ∞ (φ (F ) ) P-almost surely. Consequently, for any t ∈ R + we have E t (φ) = E t (φ (F ) ) P-almost everywhere, and so by (4.14) we find Taking the expectation of the square of this quantity, by Proposition 2.5 we have Since E s − (φ) > 0 dsdP-almost surely, we get φ = φ (F ) in P 2 (p(π)), and the proof is complete.
Step 3. For any integer n ≥ 1, define G (n) := max{G, −n}. Since G is bounded from above we have G (n) ∈ L ∞ (Ω, F ∞ , P). Therefore, by Step 2 we have In addition, since the sequence {e −G (n) } n≥1 is non-decreasing we get lim n→∞ E e −G (n) = E e −G by the monotone convergence theorem, hence taking the limit as n → ∞ we obtain For the reverse inequality we note that, for any ψ ∈ H, where the latter equality follows by the monotone convergence theorem since the sequence {G (n) } n≥1 is non-increasing in n and each G (n) is (for n large enough) bounded above by the same constant. Taking the infimum on H in (4.25) yields the reverse inequality and the proof is complete.
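The monotonicity claims used in Step 3 follow directly from the definition of the truncation. The short computation below spells them out; the constant c is a placeholder for an upper bound of G and is not notation from the paper.

```latex
\begin{align*}
G^{(n)} = \max\{G,-n\}
  &\;\Longrightarrow\; -G^{(n)} = \min\{-G,\,n\}
  \;\Longrightarrow\; e^{-G^{(n)}} = \min\{e^{-G},\,e^{n}\},\\
e^{-G^{(n)}} &\le e^{-G^{(n+1)}} \le e^{-G},
  \qquad e^{-G^{(n)}} \uparrow e^{-G} \ \text{as } n \to \infty,\\
G^{(n)} &\ge G^{(n+1)} \ge G,
  \qquad G^{(n)} \downarrow G \ \text{as } n \to \infty,\\
G \le c &\;\Longrightarrow\; -n \le G^{(n)} \le \max\{c,-n\},
  \quad\text{so } G^{(n)} \text{ is bounded}.
\end{align*}
```

The first monotone limit is the one passed through the expectation of e −G (n) , and the second is the one used for the reverse inequality.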
Proof of Proposition 4.3. Since N has stochastic intensity p(π), the stochastic process is an F-martingale, and by Theorem 3.19 we have Letting φ be defined by (4.11) and suppressing the dependence of φ on G for ease of notation we have, since 0 < c 0 ≤ G ≤ c 1 , and φ is predictable and bounded with sup (t,x)∈R + ×E |φ (t,x) | ≤ (c 1 /c 0 ) + 1, P-almost surely. Set (4.27) where the finiteness of the latter quantity follows noticing that |φ (t,x) | ≤ |ϕ (G) (t,x) |/c 0 , t ∈ R + , x ∈ E, and ϕ (G) ∈ P 2 (p(π)). So {X t } t∈R + is a square-integrable F-martingale. By the definition of φ, the relation p(G) t = E[G | F t − ] P-almost surely and (4.26), we have (4.28) Note that and so X is of finite variation (and càdlàg). Therefore, its quadratic variation process is given by (4.29) and thus the path-by-path continuous part of [X, X] is equal to zero. By (4.28) and Theorem 37 p. 84 in [23], we have We note that Substituting this expression into (4.30) we deduce In particular, for any T > 0, we have (t, Finally, we prove that (3.32) holds (so that φ ∈ H). First, set and note that (by the same computation as in (4.29)) its quadratic variation process is By the Burkholder-Davis-Gundy and Doob inequalities, there exists a positive constant C > 0 such that for all t ≥ 0 we have By (4.26) the right-most term of this relation is equal to C(4/3)^4 E[|E[G | F t ] − E[G]|^4], which is in turn less than or equal to a positive constant, say C ′ > 0, which is independent of t. Hence for any t ∈ R + we have where C ′′ := (2C ′ )/c_0^4 > 0 and we have applied Proposition 2.4 and (4.27). Taking the limit as t goes to infinity in the above relations finally yields (3.32).
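The quadratic-variation step in the proof above rests on a standard fact about càdlàg processes of finite variation, recalled here for convenience. The statement is generic; the normalization X 0 = 0 is an assumption made to keep the formula simple.

```latex
% Standard fact (generic statement; X_0 = 0 is assumed for simplicity).
Let $X$ be a c\`adl\`ag adapted process of finite variation on compacts
with $X_0 = 0$. Then
\[
  [X,X]_t = \sum_{0 < s \le t} (\Delta X_s)^2,
  \qquad \Delta X_s := X_s - X_{s^-},
\]
and consequently the path-by-path continuous part of $[X,X]$ vanishes:
\[
  [X,X]^c_t = [X,X]_t - \sum_{0 < s \le t} (\Delta X_s)^2 = 0 .
\]
```

This is why only the jump terms survive when the change of variables formula is applied to X.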
Proof of Proposition 4.4. Using the "angle bracket" notation, see p. 53 in [11] and pp. 122-123 in [23], we have where we put Z t := δ(1 [0,t] ψ) and used (4.14), i.e. E t (φ) − 1 = δ(1 [0,t] E(φ)φ). Since {Z t } t∈R + is an F-martingale under P, with Z 0 = 0, and E t (φ) is the Radon-Nikodym derivative of P φ with respect to P on F t , by the Meyer-Girsanov theorem, see Theorem 36 p. 133 in [23], we have is a local F-martingale under P φ . In the following, E φ denotes the expectation with respect to P φ . In order to prove that the process defined in (4.31) is an F-martingale under P φ , it suffices to prove that the stochastic process {(1 + φ (t,x) )p(π) (t,x) } (t,x)∈R + ×E is a stochastic intensity of N under P φ i.e., for any non-negative and predictable stochastic process X : and the integrability condition Indeed, the martingale property follows by Corollary C4 p. 235 in [1]. In order to prove that the process defined in (4.31) is square integrable under P φ , it suffices to prove The square integrability then follows by Proposition 2.4-(ii). We start by proving (4.32). Setting L ∈ E, and reasoning exactly as in the first part of the proof we have that the process defined by (4.31) with X in place of ψ is a local F-martingale under P φ , as X ∈ P 2 (p(π)) since E N ((a, b] × L) < ∞. Therefore there exists a sequence of F-stopping times {T n } n≥0 increasing to infinity such that for any T ∈ R + and n ≥ 0. Letting n and T go to infinity in the above equation, by the monotone convergence theorem we obtain (4.32) for simple predictable stochastic processes. The result follows for a general non-negative predictable stochastic process by a standard application of the monotone class theorem, see e.g. [1], Theorem T1 p. 260. Finally we prove (4.33) and (4.34).
Letting M > 0 denote a constant such that |φ (t,x) | ≤ M dtν(dx)dP-almost everywhere, for any T > 0, since the probability measures P and P φ are equivalent we also have |φ (t,x) | ≤ M dtν(dx)dP φ -almost everywhere, for any T > 0. Therefore, which concludes the proof.

Proof of Theorem 3.19
The proof of the Clark-Ocone formula relies on the following propositions, which will be shown at the end of this subsection. The first one provides an F-predictable representation formula for square-integrable functionals of a marked point process with a stochastic intensity. The second one gives a formula which allows us to transform the expectation of the product between a square-integrable functional of a marked point process with a stochastic intensity and an integral with respect to the compensated marked point process into the expectation of an integral with respect to the measure dtν(dx).
Proposition 4.5. Assume that N has an F-stochastic intensity λ. Then, for any G ∈ L 2 (Ω, F ∞ , P), there exists u (G) ∈ P F 2 (λ) such that, for all t ∈ R + , In particular, We remark that the integrand u (G) ∈ P F 2 (λ) is not made explicit, in contrast with the Clark-Ocone formula.
Proposition 4.6. Assume that N has an F-stochastic intensity λ and that E N ([0, t] × E) < ∞, for all t ∈ R + . Let G ∈ L 2 (Ω, F ∞ , P) and u ∈ P F 1,2 (λ). Then we have
Proof of Theorem 3.19. We divide the proof into two steps. In the first step we derive a predictable representation of G; in the second step we identify the integrand of the predictable representation.

Proof of Proposition 2.2
The proof uses the following lemma which guarantees the existence of a predictable version of a bounded stochastic process X.
Proof of Proposition 2.2. If X is assumed to be non-negative, we set X (n) := min(X, n), n ≥ 0. By Lemma 5.1, for any n and (t, x), p(X (n) ) (t,x) = E[X (n) (t,x) | F t − ], P-almost surely. By the monotone convergence theorem, lim n→∞ p(X (n) ) (t,x) = E[X (t,x) | F t − ], P-almost surely.
Since p(X (n) ) is predictable for each n its limit exists and is predictable, completing the proof under the assumption that X is non-negative.
If one assumes that, for dtν(dx)-almost all (t, x) ∈ R + × E, X (t,x) ∈ L 1 (Ω, F ∞ , P), then we write X = X + − X − , where X + := max(X, 0) and X − := − min(X, 0). Applying the first part of the proposition to X + and X − we have that there exist two predictable stochastic processes p(X + ) and p(X − ) such that p(X + ) (t,x) = E[X + (t,x) | F t − ] and p(X − ) (t,x) = E[X − (t,x) | F t − ], P-almost surely.
By taking the expectation of these two equalities, one has, for dtν(dx)-almost all (t, x), p(X + ) (t,x) < ∞ and p(X − ) (t,x) < ∞, and therefore E[X (t,x) | F t − ] = p(X + ) (t,x) − p(X − ) (t,x) , P-almost surely. (5.1) The claim follows by noticing that the right-hand side of (5.1) is a predictable stochastic process.
Proof of Lemma 5.1. The idea is to apply the existence part of Theorem 3.3 in [17] and the last displayed formula on p. 368 again in [17]. Following the notation of [17], we consider the locally compact second countable Hausdorff space X := R + ×E and the DC-semiring S := {(a, b]×B : a, b ∈ R + , a < b, B ∈ E, B relatively compact}. Moreover, we consider the system Γ := {Γ (t,x) , Γ S : (t, x) ∈ R + × E, S ∈ S} defined by Γ (t,x) := [0, t) × E and Γ (a,b]×B := [0, a] × E. It is readily checked that it satisfies conditions (2.3) and (2.4) in [17]. In this setting, for (t, x) ∈ R + × E, a, b ∈ R + : a < b, B ∈ E a relatively compact set, the σ-fields F((t, x)) and F((a, b] × B) defined on p. 364 of [17] are given by where N | A denotes the restriction of the random measure N to A ∈ B(R + ) ⊗ E. One may easily check that the predictable σ-algebra P in [17] coincides with the σ-field P(F) ⊗ E, and that the point process N satisfies condition (2.5) on p. 364 of [17]. We note that if condition Σ(Λ) on p. 367 of [17] holds, then the claim of the lemma follows by the existence part of Theorem 3.3 in [17] and the last displayed formula on p. 368 again in [17]. In our setting, condition Σ(Λ) on p. 367 of [17] reads Σ(Λ) : P(N ((a, b] × E) = 0 | F a ) > 0 P-a.s., a, b ∈ R + : a < b.

Proof of Proposition 2.5
The proof uses the following lemma which, under a mild integrability condition on N , guarantees that P 1,2 (λ) is dense in P 2 (λ). Although its proof is quite standard, we include it for the sake of completeness.