Markov chain approximations for transition densities of Lévy processes

We consider the convergence of a continuous-time Markov chain approximation X^h, h>0, to an R^d-valued Lévy process X. The state space of X^h is an equidistant lattice and its Q-matrix is chosen to approximate the generator of X. In dimension one (d=1), and then under a general sufficient condition for the existence of transition densities of X, we establish sharp convergence rates of the normalised probability mass function of X^h to the probability density function of X. In higher dimensions (d>1), rates of convergence are obtained under a technical condition, which is satisfied when the diffusion matrix is non-degenerate.


Introduction
Discretization schemes for stochastic processes are relevant both theoretically, as they shed light on the nature of the underlying stochasticity, and practically, since they lend themselves well to numerical methods. Lévy processes, in particular, constitute a rich and fundamental class with applications in diverse areas such as mathematical finance, risk management, insurance, queuing, storage and population genetics (see e.g. [22]).
1.1. Short statement of problem and results. In the present paper, we study the rate of convergence of a weak approximation of an R^d-valued (d ∈ N) Lévy process X by a continuous-time Markov chain (CTMC). Our main aim is to understand the rates of convergence of transition densities. These cannot be viewed as expectations of (sufficiently well-behaved, e.g. bounded continuous) real-valued functions against the marginals of the processes, and hence are in general hard to study.
Since the results are easier to describe in dimension one (d = 1), we focus first on this setting.
Specifically, our main result in this case, Theorem 2.4, establishes the precise convergence rate of the normalised probability mass function of the approximating Markov chain to the transition density of the Lévy process for the two proposed discretisation schemes, one in the case where X has a non-trivial diffusion component and one when it does not. More precisely, in both cases we approximate X by a CTMC X^h with state space Z_h := hZ and Q-matrix defined as a natural discretised version of the generator of X. This makes the CTMC X^h into a continuous-time random walk, which is skip-free (i.e. simple) if X is without jumps (i.e. Brownian motion with drift). The quantity

κ(h) := ∫_{[−1,1]\[−h,h]} |x| dλ(x),

where λ is the Lévy measure of X, is related to the activity of the small jumps of X and plays a crucial role in the rate of convergence. We assume that either the diffusion component of X is present (σ^2 > 0) or the jump activity of X is sufficient (Orey's condition [29, p. 190, Proposition 28.3], see also Assumption 2.3 below) to ensure that X admits continuous transition densities p_{t,T}(x, y) (from x at time t to y at time T > t), which are our main object of study.
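Let P^h_{t,T}(x, y) := P(X^h_T = y | X^h_t = x) denote the corresponding transition probabilities of X^h and let

∆_{T−t}(h) := sup_{x,y ∈ Z_h} |p_{t,T}(x, y) − (1/h) P^h_{t,T}(x, y)|.

The following table summarizes our result (for functions f ≥ 0 and g > 0, we shall write f = O(g) (resp. f = o(g), f ∼ g) for lim sup_{h↓0} f(h)/g(h) < ∞ (resp. lim_{h↓0} f(h)/g(h) = 0, lim_{h↓0} f(h)/g(h) ∈ (0, ∞)); if g converges to 0, then we will say f decays no slower than (resp. faster than, at the same rate as) g):

Lévy measure           σ^2 > 0                        σ^2 = 0
λ(R) = 0               ∆_{T−t}(h) = O(h^2)            ×
0 < λ(R) < ∞           ∆_{T−t}(h) = O(h)              ×
κ(0) < ∞ = λ(R)        ∆_{T−t}(h) = O(h)              ∆_{T−t}(h) = O(h)
κ(0) = ∞               ∆_{T−t}(h) = O(hκ(h/2))        ∆_{T−t}(h) = O(hκ(h/2))

We also prove that the rates stated here are sharp in the sense that there exist Lévy processes for which convergence is no better than stated.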
Note that the rate of convergence depends on the Lévy measure λ, it being best when λ = 0 (quadratic when σ^2 > 0), and linear otherwise, unless the pure jump part of X has infinite variation, in which case it depends on the quantity κ. This is due to the nature of the discretisation of the Brownian motion with drift (which gives a quadratic order of convergence, when σ^2 > 0), and then of the Lévy measure, which is aggregated over intervals of length h around each of the lattice points; see also (v) of Remark 3.11. In the infinite activity case, κ(h) = o(1/h); indeed, κ is bounded if in addition κ(0) < ∞. However, the convergence of hκ(h/2) to zero, as h ↓ 0, can be arbitrarily slow.
Finally, if X is a compound Poisson process (i.e. λ(R) ∈ (0, ∞)) without a diffusion component, but possibly with a drift, there is always an atom present in the law of X at a fixed time, which is why the finite Lévy measure case is studied only when σ^2 > 0.
The proof of Theorem 2.4 is in two steps: we first establish the convergence rate of the characteristic exponent of X^h_t to that of X_t (Subsection 3.2). In the second step we apply this to the study of the convergence of transition densities (Section 4) via their spectral representations (established in Subsection 3.1). Note that in general the rates of convergence of the characteristic functions do not carry over directly to the distribution functions. We are able to follow through the above programme by exploiting the special structure of the infinitely divisible distributions in what amounts to a detailed comparison of the transition kernels p_{t,T}(x, y) and P^h_{t,T}(x, y).
By way of example, note that if λ([−1, 1]\[−h, h]) ∼ 1/h^{1+α} for some α ∈ (0, 1), then κ(h) ∼ h^{−α} and the convergence of the normalized probability mass function to the transition density is by Theorem 2.4 of order h^{1−α}, since κ(0) = ∞ and Orey's condition is satisfied. In particular, in the case of the CGMY [5] (tempered stable) or β-stable [29, p. 80] processes with stability parameter β ∈ (1, 2), we have α = β − 1 and hence convergence of order h^{2−β}. More generally, if β := inf{p > 0 : ∫_{[−1,1]} |x|^p dλ(x) < ∞} is the Blumenthal-Getoor index, and β ≥ 1, then for any p > β, ζ(δ) = O(δ^{2−p}). Conversely, if for some p ≥ 1, ζ(δ) = O(δ^{2−p}), then β ≤ p.
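The stable-like example can be checked numerically. The following is a minimal sketch and not part of the paper's method: it assumes that κ(h) is the integral of |x| over [−1, 1]\[−h, h] against λ, and it assumes the hypothetical Lévy density c|x|^{−1−β} on [−1, 1] with β ∈ (1, 2), for which the integral has a closed form. The function names are illustrative only.

```python
import math

def kappa(h, beta, c=1.0):
    """kappa(h): integral of |x| over h < |x| <= 1 against c*|x|**(-1-beta) dx.

    Assumed definition of kappa and assumed stable-like density; beta != 1.
    Closed form: 2c * (h**(1-beta) - 1) / (beta - 1).
    """
    return 2.0 * c * (h ** (1.0 - beta) - 1.0) / (beta - 1.0)

def rate(h, beta):
    """Candidate leading-order rate h * kappa(h/2)."""
    return h * kappa(h / 2.0, beta)

def loglog_slope(beta, n=40):
    """Empirical exponent of rate(h) between h = 2**-n and h = 2**-(n+1)."""
    h1, h2 = 2.0 ** -n, 2.0 ** -(n + 1)
    return math.log(rate(h1, beta) / rate(h2, beta)) / math.log(h1 / h2)

if __name__ == "__main__":
    for beta in (1.2, 1.5, 1.8):
        # The empirical exponent should approach 2 - beta as h decreases.
        print(beta, loglog_slope(beta), 2.0 - beta)
```

Under these assumptions the empirical exponent of hκ(h/2) approaches 2 − β, in line with the order h^{2−β} quoted above; the approach is slower the closer β is to 1.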
This gives the overall picture in dimension one. In dimensions higher than one (d > 1), and then under a straightforward extension of the discretization described above, essentially the same rates of convergence are obtained as in the univariate case; this time under a technical condition (cf. Assumption 2.5), which is satisfied when the diffusion matrix is non-degenerate. Our main result in this case is Theorem 2.7.
1.2. Literature overview. In general, there has been a plethora of publications devoted to the subject of discretization schemes for stochastic processes, see e.g. [19], and with regard to the pricing of financial derivatives [15] and the references therein. In particular, there exists a wealth of literature concerning approximations of Lévy processes in one form or another and a brief overview of simulation techniques is given by [28].
In continuous time, for example, [18] approximates by replacing the small jumps part with a diffusion, and discusses also rates of convergence for E[g ∘ X_T], where g is real-valued and satisfies certain integrability conditions, T is a fixed time and X the process under approximation; [9] approximates by a combination of Brownian motion and sums of compound Poisson processes with two-sided exponential densities. In discrete time, Markov chains have been used to approximate the much larger class of Feller processes and [4] proves convergence in law of such an approximation in the Skorokhod space of càdlàg paths, but does not discuss rates of convergence; [32] has a finite state-space path approximation and applies this to option pricing together with a discussion of the rates of convergence for the prices. With respect to Lévy process driven SDEs, [21] (resp. [34]) approximates solutions Y thereto using a combination of a compound Poisson process and a high order scheme for the Brownian component (resp. discrete-time Markov chains and an operator approach); rates of convergence are then discussed for expectations of sufficiently regular real-valued functions against the marginals of the solutions.
We remark that approximation/simulation of Lévy processes in dimensions higher than one is in general more difficult than in the univariate case, see, e.g. the discussion on this in [6] (which has a Gaussian approximation and establishes convergence in the Skorokhod space [6,p. 197,Theorem 2.2]). Observe also that in terms of pricing theory, the probability density function of a process can be viewed as the Arrow-Debreu state price, i.e. the current value of an option whose payoff equals the Dirac delta function. The singular nature of this payoff makes it hard, particularly in the presence of jumps, to study the convergence of the prices under the discretised process to its continuous counterpart.
Indeed, Theorem 2.7 can be viewed as a generalisation of such convergence results for the well-known discretisation of the multi-dimensional Black-Scholes model (see e.g. [24] for the case of Brownian motion with drift in dimension one). In addition, existing literature, as specific to approximations of densities of Lévy processes (or generalizations thereof), includes [12] (polynomial expansion for a bounded variation driftless pure-jump process) and [13] (density expansions for multivariate affine jump-diffusion processes). [20,33] study upper estimates for the densities. On the other hand, [2] has a result similar in spirit to ours, but for solutions to SDEs: for the case of the Euler approximation scheme, the authors there also study the rate of convergence of the transition densities.
Further, from the point of view of partial integro-differential equations (PIDEs), the density In particular, we mention the finite-difference method, which is in some sense the counterpart of the present article in the numerical analysis literature, discretising both in space and time, whereas we do so only in space. In general, this literature often restricts to finite activity processes, and either avoids a rigorous analysis of (the rates of) convergence, or, when it does, it does so for initial conditions h = u(0, ·), which exclude the singular δ-distribution. For example, [8, p. 1616, Assumption 6.1] requires h continuous, piecewise C ∞ with bounded derivatives of all orders; compare also Propositions 5.1 and 5.4 concerning convergence of expectations in our setting. Moreover, unlike in our case where the discretisation is made outright, the approximation in [8] is sequential, as is typical of the literature: beyond the restriction to a bounded domain (with boundary conditions), there is a truncation of the integral term in L, and then a reduction to the finite activity case, at which point our results are in agreement with what one would expect from the linear order of convergence of [8, p. 1616, Theorem 6.7].
The rest of the paper is organised as follows. Section 2 introduces the setting by specifying the Markov generator of X^h and precisely states the main results. Then Section 3 provides integral expressions for the transition kernels by applying spectral theory to the generator of the approximating chain and studies the convergence of the characteristic exponents. In Section 4 this allows us to establish convergence rates for the transition densities. While Sections 3 and 4 restrict this analysis to the univariate case, explicit comments are made in both, on how to extend the results to the multivariate setting (this extension being, for the most part, direct and trivial). Finally, Sec- Note that X is then a Markov process with We refer to [3,29] for the general background on Lévy processes.
2.1.1. Univariate case. In the case when d = 1, we introduce two schemes. Referred to as discretization scheme 1 (resp. 2), and given by (2.2) (resp. (2.4)) below, they differ in the discretization of the first derivative, as follows.
Under discretisation scheme 1, for s ∈ Z_h and f : Z_h → R vanishing at infinity: where the following notation has been introduced: • for s ∈ Z_h: • for s ∈ Z_h\{0}: c^h_s := λ(A^h_s); • and finally: Note that Q^h has nonnegative off-diagonal entries for all h for which: and in that case Q^h is a genuine Q-matrix. Moreover, due to spatial homogeneity, its entries are then also uniformly bounded in absolute value.
Further, when σ^2 > 0, it will be shown that (2.3) always holds, at least for all sufficiently small h (see Proposition 3.9). However, in general, (2.3) may fail. It is for this reason that we introduce scheme 2, under which the condition on the nonnegativity of off-diagonal entries of Q^h holds vacuously.
To wit, we use in discretization scheme 2 the one-sided, rather than the two-sided discretisation of the first derivative, so that (2.2) reads: Importantly, while scheme 2 is always well-defined, scheme 1 is not; and yet the two-sided discretization of the first derivative exhibits better convergence properties than the one-sided one (cf. Proposition 3.10). We therefore retain the treatment of both these schemes in the sequel.
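The difference between the two schemes is easiest to see when the Lévy measure vanishes. The sketch below is an illustration under stated assumptions, not a transcription of (2.2) and (2.4): it takes λ = 0, replaces the second derivative by the symmetric second difference, and replaces the first derivative either by the central difference (scheme 1) or by a one-sided difference taken in the direction of the drift (our assumption for scheme 2). The helper names are hypothetical.

```python
def jump_rates(sigma2, mu, h, scheme):
    """Rates of the jumps +h and -h for Brownian motion with drift (lambda = 0).

    Assumed discretisation: symmetric second difference for the diffusion
    part; central difference (scheme 1) or one-sided difference in the
    direction of the drift (scheme 2) for the first derivative.
    """
    diffusion = sigma2 / (2.0 * h * h)
    if scheme == 1:
        return diffusion + mu / (2.0 * h), diffusion - mu / (2.0 * h)
    if scheme == 2:
        return diffusion + max(mu, 0.0) / h, diffusion + max(-mu, 0.0) / h
    raise ValueError("scheme must be 1 or 2")

def is_q_matrix(sigma2, mu, h, scheme):
    """True if both off-diagonal rates are nonnegative."""
    up, down = jump_rates(sigma2, mu, h, scheme)
    return up >= 0.0 and down >= 0.0

if __name__ == "__main__":
    # Scheme 1 needs sigma2 >= h*|mu| in this special case; scheme 2 never fails.
    for h in (0.1, 0.005):
        print(h, is_q_matrix(0.01, 1.0, h, 1), is_q_matrix(0.01, 1.0, h, 2))
```

In this special case the central difference yields a genuine Q-matrix only once h is small relative to σ^2/|µ|, whereas the one-sided difference always does, at the price of a first-order rather than second-order consistency error in the drift term.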
For ease of reference we also summarize here the following notation which will be used from dλ(x).

2.1.2. Multivariate case. For the sake of simplicity we introduce only one discretisation scheme in this general setting. If necessary, and to avoid confusion, we shall refer to it as the multivariate scheme. We choose V = 0 or V = 1, according as λ(R^d) is finite or infinite. L^h is then given by: and we agree ∑_∅ := 0). Here the following notation has been introduced: : c^h_s := λ(A^h_s); • and finally for j ∈ {1, . . . , d}: and Notice that when d = 1, this scheme reduces to scheme 1 or scheme 2, according as σ^2 > 0 or σ^2 = 0. Indeed, statements pertaining to the multivariate scheme will always be understood to include also the univariate case d = 1.
Remark 2.1. The complete analogue of c^h_0 from the univariate case would be the matrix . . , d}. However, as h varies, so could c^h_0, and thus no diagonalization of c^h_0 + Σ is possible (in general) simultaneously in all (small enough) positive h. Thus, retaining c^h_0 in its totality, we should have to discretize mixed second partial derivatives, which would introduce (further) nonpositive entries in the corresponding Q-matrix Q^h of X^h. It is not clear whether these would necessarily be counter-balanced in a way that would ensure nonnegative off-diagonal entries. Retaining the diagonal terms of c^h_0, however, is of no material consequence in this respect.
It is verified just as in the univariate case, component by component, that there is some h_⋆ ∈ (0, +∞] such that for all h ∈ (0, h_⋆), L^h is indeed the infinitesimal generator of some CTMC (i.e. the off-diagonal entries of Q^h are nonnegative). Q^h is then a regular (as spatially homogeneous) Q-matrix, and X^h is a compound Poisson process, whose Lévy measure we denote λ^h.
2.2. Summary of results. We have, of course: Remark 2.2 (Convergence in distribution). X h converges to X weakly in finite-dimensional distributions (hence w.r.t. the Skorokhod topology on the space of càdlàg paths [17, p. 415, 3.9 Corollary]) as h ↓ 0.
Next, in order to formulate the rates of convergence, recall that P^h_{t,T}(x, y) (resp. p_{t,T}(x, y)) denote the transition probabilities (resp. continuous transition densities, when they exist) of X^h (resp. X) from x at time t to y at time T, {x, y} ⊂ Z_h^d, 0 ≤ t < T. Further, for 0 ≤ t < T define:

D^h_{t,T}(x, y) := |p_{t,T}(x, y) − (1/h^d) P^h_{t,T}(x, y)| and ∆_{T−t}(h) := sup_{{x,y} ⊂ Z_h^d} D^h_{t,T}(x, y). (2.5)

We now summarize the results first in the univariate, and then in the multivariate setting (Remark 2.2 holding true of both).

2.2.1. Univariate case. The assumption alluded to in the introduction is the following (we state it explicitly when it is being used):

Assumption 2.3. Either σ^2 > 0 or Orey's condition holds: there exists ε ∈ (0, 2) such that lim inf_{r↓0} r^{−(2−ε)} ∫_{[−r,r]} u^2 dλ(u) > 0.

The usage of the two schemes and the specification of V is as summarized in Table 1. In short, we use scheme 1 or scheme 2, according as σ^2 > 0 or σ^2 = 0, and we use V = 0 or V = 1, according as λ(R) < ∞ or λ(R) = ∞. By contrast to Assumption 2.3, we maintain Table 1 as being in effect throughout this subsubsection.

Table 1. Usage of the two schemes and of V depending on the nature of σ^2 and λ.

Lévy measure     σ^2 > 0             σ^2 = 0
λ(R) < ∞         scheme 1, V = 0     scheme 2, V = 0
λ(R) = ∞         scheme 1, V = 1     scheme 2, V = 1

Note that the right-hand side is defined even if P(X^h_t = x) = 0 and we let the left-hand side take this value when this is so.
The main result can now be stated.
Theorem 2.4 (Convergence of transition kernels). Under Assumption 2.3, whenever s > 0, the convergence of ∆_s(h) is as summarized in the following table. In general convergence is no better than stipulated.
More exhaustive statements, of which this theorem is a summary, are to be found in Propositions 4.5 and 4.6, and will be proved in Section 4.

2.2.2. Multivariate case. The relevant technical condition here is: Again we shall state it explicitly when it is being used.
Remark 2.6. It is shown, just as in the univariate case, that Assumption 2.5 holds if l = d, i.e. if Σ is non-degenerate. Moreover, then we may take P = 0, C = (1/2)(2/π)^2 ∧_{j=1}^d σ_j^2, ε = 2 and h_0 = h_⋆. It would be natural to expect that the same could be verified for the multivariate analogue of Orey's condition, which we suggest as being: the unit sphere (resp. closed ball of radius r centered at the origin)). Specifically, it is easy to see that (2.9) of Assumption 2.5 still holds.
However, we are unable to show the validity of (2.8).
Under Assumption 2.5, Fourier inversion yields the integral representation of the continuous transition densities for X (for 0 ≤ t < T, {x, y} ⊂ R^d): On the other hand, L^2([−π/h, π/h]^d) Hilbert space techniques yield for the normalized transition Finally, we state the result with the help of the following notation: Note that by the dominated convergence theorem, (ζ + χ)(δ) → 0 as δ ↓ 0 (this is seen as in the univariate case, cf. Lemma 3.8).
The proof of Theorem 2.7 is an easy extension of the arguments behind Theorem 2.4, and we comment on this immediately following the proof of Proposition 4.2.

Transition kernels and convergence of characteristic exponents
In the interest of space, simplicity of notation and ease of exposition, the analysis in this and in Section 4 is restricted to dimension d = 1. Proofs in the multivariate setting are, for the most part, a direct and trivial extension of those in the univariate case. However, when this is not so, necessary and explicit comments will be provided in the sequel, as appropriate.
3.1. Integral representations. First we note the following result (its proof is essentially by the standard inversion theorem, see also [29, p. 190
Second, to obtain (2.7) we apply some classical theory of Hilbert spaces, see e.g. [10].

Definition 3.3. For a bounded linear operator
We now diagonalize L h , which allows us to establish (2.7). The straightforward proof is left to the reader.
The following introduces a number of bounded linear operators As λ is finite outside any neighborhood of 0, L^h|_{l^2(Z_h)} (as in (2.2), resp. (2.4)) is a bounded linear mapping. We denote this restriction by L^h also. Its diagonalization is then given by where, under scheme 1, and under scheme 2, , but we can and will view Ψ^h as defined for all real p by the formulae above). Under either scheme, Ψ^h is bounded and continuous as the final sum converges absolutely and uniformly.
Proposition 3.5. For scheme 1 under (2.3) and always for scheme 2, for every 0 ≤ t < T, y ∈ Z_h and P_{X^h_t}-a.s. in x ∈ Z_h, (2.7) holds, i.e.: In what follows we study the convergence of (2.6) to (2.7) as h ↓ 0. These expressions are particularly suited to such an analysis, not least of all because the spatial and temporal components are factorized.
One also checks that for every t ≥ 0 and p ∈ R: Hence X^h are compound Poisson processes [29, p. 18, Definition 4.2].
In the multivariate scheme, by considering the Hilbert space L^2([−π/h, π/h]^d) instead, X^h is again seen to be compound Poisson with characteristic exponent given by (for p ∈ R^d): In the sequel, we shall let λ^h denote the Lévy measure of X^h.
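To make the diagonalization concrete, consider the simplest univariate instance. The following worked computation is a sketch under the assumption that λ = 0 and that the generator is discretised by the symmetric second difference and the central first difference; the symbol e_p for the exponential test function is introduced here for illustration only.

```latex
% Assumption: \lambda = 0, central differences (scheme 1 in its simplest form).
\[
  L^h f(s) = \frac{\sigma^2}{2}\,\frac{f(s+h) - 2f(s) + f(s-h)}{h^2}
           + \mu\,\frac{f(s+h) - f(s-h)}{2h}, \qquad s \in \mathbb{Z}_h .
\]
% Applying L^h to e_p(s) := e^{ips} gives L^h e_p = \Psi^h(p)\, e_p with
\[
  \Psi^h(p) = \sigma^2\,\frac{\cos(hp) - 1}{h^2} + i\mu\,\frac{\sin(hp)}{h},
  \qquad p \in [-\pi/h, \pi/h].
\]
% Taylor expansion at fixed p, with \Psi(p) = -\sigma^2 p^2/2 + i\mu p:
\[
  \Psi^h(p) - \Psi(p) = \frac{\sigma^2 h^2 p^4}{24} - \frac{i\mu h^2 p^3}{6} + O(h^4).
\]
```

In this special case the discrepancy between the two exponents is of order h^2 for each fixed p, which is consistent with the quadratic rate reported when the Lévy measure is zero and σ^2 > 0.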

3.2. Convergence of characteristic exponents. We introduce for p ∈ R: and, under scheme 1: Thus: Next, three elementary but key lemmas. The first concerns some elementary trigonometric inequalities as well as the Lipschitz difference for the remainder of the exponential series f_l; these estimates will be used again and again in what follows. The second is only used in the estimates pertaining to the multivariate scheme. Finally, the third lemma establishes key convergence properties relating to λ. Proof. The first set of inequalities may be proved by comparison of derivatives. Then, (1) follows and finally (3) from the decomposition of |e^{ix} − ix + x^2/2 − e^{iy} + iy − y^2/2|^2 into the following terms: The latter inequalities are again seen to be true by comparing derivatives.
Proof. This is an elementary consequence of the complex Mean Value Theorem [11, p. 859, Theorem 2.2] and the Cauchy-Schwarz inequality.
Lemma 3.8. For any Lévy measure λ on R, one has for the two functions (given for 1 ≥ δ > 0): The "finite first absolute moment" case is similar. Proof. If V = 0 this is immediate. If V = 1, then (via a triangle inequality): as h ↓ 0 by Lemma 3.8. Eventually the expression is smaller than σ^2 > 0 and the claim follows.
(iii) The above entails, in particular, convergence of Ψ^h(p) to Ψ(p) as h ↓ 0 pointwise in p ∈ R. (v) It is seen from Table 2 that the order of convergence goes from quadratic (at least when σ^2 > 0) to linear, to sublinear, according as the Lévy measure is zero, λ(R) > 0 & κ(0) < ∞, or κ becomes more and more singular at the origin. Let us attempt to offer some intuition in this respect. First, the quadratic order of convergence is due to the convergence properties of the discrete second and symmetric first derivative. Further, as soon as the Lévy measure is non-zero, the latter is aggregated over the intervals (A^h_s)_{s∈Z_h\{0}} of length h, which (at least in the worst case scenario) commit respective errors of order λ Hence, the more singular the κ, the bigger the overall error. Figure 1 depicts this progressive worsening of the convergence rate for the case of α-stable Lévy processes.

Rates of convergence for transition kernels
Finally let us incorporate the estimates of Proposition 3.10 into an estimate of D^h_{t,T}(x, y) (recall the notation in (2.5)). Assumption 2.3 and Table 1 are understood as being in effect throughout this section from this point onwards. Recall that |Ψ^h − Ψ| ≤ σ^2 |f^h| + µ|g^h| + |l^h| and that the approximation is considered for h ∈ (0, h_⋆) (cf. Proposition 3.9).
First, the following observation, which is a consequence of the h-uniform growth of −ℜΨ^h(p) as |p| → ∞, will be crucial to our endeavour (compare Remark 3.1). Proof. Assume first σ^2 > 0, so that we are working under scheme 1. It is then clear from (3.1) that: On the other hand, if σ^2 = 0, we work under scheme 2 and necessarily V = 1. In that case it follows from (3.2) for h ≤ 2 and p ∈ [−π/h, π/h]\{0}, that: u^2 dλ(u).

Then for any s > 0, ∆_s(h) = O(f(h)).
Before proceeding to the proof of this proposition, we note explicitly the following elementary, but key lemma: Proof. This follows from the inequality |e^z − 1| ≤ |z| for z ≤ 0, whose validity may be seen by direct estimation.
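For completeness, one way to carry out the direct estimation is sketched below. We read the hypothesis as ℜz ≤ 0 for complex z, which is our assumption (it contains the real case z ≤ 0 and is the form needed when z is a difference of characteristic exponents).

```latex
% Sketch, assuming the hypothesis is \Re z \le 0 for z \in \mathbb{C}.
\[
  e^{z} - 1 = z \int_0^1 e^{uz}\,du ,
  \qquad
  |e^{uz}| = e^{u\,\Re z} \le 1 \quad (u \in [0,1],\ \Re z \le 0),
\]
\[
  \text{hence}\qquad |e^{z} - 1| \le |z| \int_0^1 |e^{uz}|\,du \le |z| .
\]
```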
Proof. (Of Proposition 4.2.) From (2.6) and (2.7) we have for the quantity ∆_s(h) from (2.5): Then the first term decays faster than any power law in h by (2)  To this end we assume given a function K with the properties that: exp{Ψ^h(p)s} dp decays faster than the leading order term in the estimate of D^h_{t,T}(x, y) (for which see, e.g., Table 2); (C) sup_{[−K(h),K(h)]} |Ψ^h − Ψ| ≤ 1 for all small enough h.
(suitable choices of K will be identified later, cf. Table 3). We now comment on the reasons behind these choices.
Next, we divide the integration regions in (2.6) and (2.7) into five parts (cf. property (F)): . Then we separate (via a triangle inequality) the integrals in the difference D^h_{t,T}(x, y) accordingly and use the triangle inequality in the second and fourth region, thus (with s := T − t > 0): exp{Ψ^h(p)s} + |exp{Ψ(p)s}| dp + Finally, we gather the terms with |exp{Ψ(p)s}| in the integrand and use |e^z − 1| ≤ e^{|z|} − 1 (z ∈ C) to estimate the integral over [−K(h), K(h)], so as to arrive at: Now, the rate of decay of A(h) can be controlled by choosing K(h) converging to +∞ fast enough, viz. property (E). On the other hand, in order to control the second term on the right-hand side of the inequality in (4.1), we choose K(h) converging to +∞ slowly enough so as to guarantee (C).
Further, due to (C), for all sufficiently small h, everywhere on [−K(h), K(h)]: Manifestly the second term will always decay strictly faster than the first (so long as they are not 0). Moreover, since exp{−Cs|p|^ε} dp integrates every polynomial in |p| (cf. the findings of Proposition 3.10) absolutely, it will therefore be sufficient in the sequel to estimate (cf. (4.1)): On the other hand, for the purposes of establishing sharpness of the rates for the quantity D^h_{t,T}(x, y), we make the following: In particular, it follows from our above discussion, that it will be sufficient to consider (we shall i.e. in Remark 4.4 this is A, and the difference to D_{t,T}(0, 0) represents B. Moreover, we can further replace Ψ^h(p) − Ψ(p) in the integrand of (4.3) by any expression whose difference to Ψ^h(p) − Ψ(p) decays, upon integration, faster than the leading order term. For the latter reductions we (shall) refer to the proof of Proposition 3.10.
We have now brought the general discussion as far as we could. The rest of the analysis must invariably deal with each of the particular instances separately and we do so in the following two propositions. Notation-wise we let DCT stand for Lebesgue dominated convergence theorem. (1) if λ(R) = 0: Moreover, with σ^2 s = 1 and µ = 0 we have lim sup_{h↓0} D^h_{t,T}(0, 0)/h^2 ≥ 1/(8√(2π)), proving that in general the convergence rate is no better than quadratic.
Proof. Estimates of ∆ s (h) follow at once from (4.2) and Proposition 3.10, simply by integration.
By DCT it is sufficient to observe that: To see the latter, note that the second integral is immediate and equal to: e^{−(2π/3)^2/2}. As for the first one, make the change of variables u = 3p/2. Thus we need to establish that: Next note that −u^2/4 + cos(u) is decreasing on [0, π] and the integrand in A is positive. It follows that: Using integration by parts, it is now clear that this expression is algebraic over the rationals in e, √3 and the values of the exponential function at rational multiples of π^2. Since this explicit expression can be estimated from below by a positive quantity, one can check that , where x_n = (3/2)(1/3^n) and w_n = 1/x_n (n ∈ N), Orey's condition holds with ε = 1 and one has lim sup_{h↓0} D^h_{t,T}(0, 0)/ζ(h/2) > 0.
One can in fact check that the integrand is strictly positive, as Lemma 4.7 shows, and thus the proof is complete.
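The quadratic sharpness constant 1/(8√(2π)) stated above for σ^2 s = 1, µ = 0 and λ = 0 can be observed numerically from the spectral representation. The following is a minimal sketch under assumptions: it takes λ = 0 with central differences, so that Ψ^h(p) = σ^2(cos(hp) − 1)/h^2 + iµ sin(hp)/h, and it evaluates the inversion integral over [−π/h, π/h] by a plain rectangle rule (the integrand being periodic in p for lattice points y). The function names and the number of nodes are illustrative choices.

```python
import cmath
import math

def normalised_pmf(y, h, t=1.0, sigma2=1.0, mu=0.0, n=2048):
    """(1/h) * P(X^h_t = y | X^h_0 = 0) via Fourier inversion on [-pi/h, pi/h].

    Assumes lambda = 0 and central differences, i.e.
    Psi_h(p) = sigma2*(cos(h p) - 1)/h**2 + i*mu*sin(h p)/h.
    """
    dp = 2.0 * math.pi / (h * n)
    total = 0.0
    for k in range(n):
        p = -math.pi / h + k * dp
        psi = sigma2 * (math.cos(h * p) - 1.0) / h ** 2 + 1j * mu * math.sin(h * p) / h
        total += cmath.exp(psi * t - 1j * p * y).real
    return total * dp / (2.0 * math.pi)

def gaussian_density(y, t=1.0, sigma2=1.0, mu=0.0):
    """Transition density of Brownian motion with drift started at 0."""
    var = sigma2 * t
    return math.exp(-(y - mu * t) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def scaled_error_at_origin(h):
    """D^h(0, 0) / h**2 for sigma2 * t = 1 and mu = 0."""
    return abs(normalised_pmf(0.0, h) - gaussian_density(0.0)) / h ** 2

if __name__ == "__main__":
    limit = 1.0 / (8.0 * math.sqrt(2.0 * math.pi))
    for h in (0.4, 0.2, 0.1, 0.05):
        print(h, scaled_error_at_origin(h), limit)
```

Under these assumptions the scaled error D^h(0, 0)/h^2 settles near 1/(8√(2π)) as h decreases, which agrees with the expansion Ψ^h(p) − Ψ(p) ≈ σ^2 h^2 p^4/24 integrated against the Gaussian weight.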
(2) The rate of convergence of the expectations is thus got by combining the above proposition with the findings of Theorems 2.4 and 2.7.  In order to be able to relax condition (ii) of Proposition 5.1, we first establish the following Proposition 5.3, which concerns finiteness of moments of X t .
(1) In (5.8) there is a balance of two terms, viz. the choice of the function K.
Thus, the slower (resp. faster) that K increases to +∞ at 0+, the better the convergence of the first (resp. second) term, provided f ∉ L^1(R) (resp. |f|/|g| is ultimately converging to 0, rather than it just being nonincreasing). In particular, when so, then the second term can be made to decay arbitrarily fast, whereas the first term will always have a convergence which is strictly worse than h ∨ ∆_t(h). But this convergence can be made arbitrarily close to h ∨ ∆_t(h) by choosing K increasing all the slower (this since f is locally bounded). In general the choice of K would be guided by balancing the rate of decay of the two terms.
(2) Since, in the interest of relative generality, (further properties of) f and λ are not specified, thus also g cannot be made explicit. Confronted with a specific f and Lévy process X, we should like to choose g approaching infinity (at ±∞) as fast as possible, while still ensuring (a) Let first |f| be bounded by (x → A|x|^n) for some A ∈ (0, ∞) and n ∈ N, and assume that for some m ∈ (n, ∞), the function g = (x → |x|^m ∨ 1) satisfies E[g ∘ X_t] < ∞ (so that (i) holds). Suppose furthermore condition (ii) is satisfied as well (as it is for, e.g., f = (x → x^n)). It is then clear that the first term of (5.8) will behave as Proof. (Of Proposition 5.4.) This is a simple matter of estimation; for all sufficiently small h > 0: .
Thanks  Then the basis for the numerical evaluations is the observation that for a (finite state space) Markov chain Y with generator matrix Q, the probability P_y(Y_t = z) (resp. the expectation  First, with a view to the localization/truncation error, we shall find use of the following: Proposition 5.6. Let g : [0, ∞) → [0, ∞) be nondecreasing, continuous and submultiplicative, with lim_{+∞} g = +∞. Let t > 0 and denote by: We begin with transition densities. To shorten notation, fix the time t = 1 and allow p := p_1(0, ·) and p^h := (1/h) P̂^h_1(0, ·) (P̂^h being the analogue of P^h for the process X̂^h). Note that to evaluate the latter, it is sufficient to compute (eQ Example 5.10. Consider first Brownian motion with drift, σ^2 = 1, µ = 1, λ = 0 (scheme 1, V = 0).
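The setting of Example 5.10 lends itself to a short numerical illustration of computing a row of the matrix exponential of a finite generator. The sketch below is not the paper's algorithm: it assumes λ = 0 with central differences, truncates the lattice to [−8, 8] and simply suppresses jumps that would leave it (an ad hoc boundary treatment), and it evaluates the row of exp(tQ) started at 0 by uniformisation rather than by a library matrix exponential. All function names and default parameters are illustrative.

```python
import math

def transition_row(h, t=1.0, sigma2=1.0, mu=1.0, half_width=8.0):
    """Row of exp(t*Q) started at 0 on the lattice {-m,...,m}*h (uniformisation).

    Assumptions: lambda = 0, central differences, and jumps leaving the
    truncated lattice are suppressed.
    """
    m = int(round(half_width / h))
    n = 2 * m + 1
    up = sigma2 / (2.0 * h * h) + mu / (2.0 * h)
    down = sigma2 / (2.0 * h * h) - mu / (2.0 * h)
    if up < 0.0 or down < 0.0:
        raise ValueError("h too large: off-diagonal rates must be nonnegative")
    q = up + down                      # uniformisation rate
    terms = int(q * t + 10.0 * math.sqrt(q * t) + 50.0)
    vec = [0.0] * n
    vec[m] = 1.0                       # start at the lattice point 0
    result = [math.exp(-q * t) * v for v in vec]
    for k in range(1, terms + 1):
        new = [0.0] * n
        for j, v in enumerate(vec):
            if v == 0.0:
                continue
            stay = 1.0
            if j + 1 < n:
                new[j + 1] += v * up / q
                stay -= up / q
            if j > 0:
                new[j - 1] += v * down / q
                stay -= down / q
            new[j] += v * stay
        vec = new
        # Poisson(q t) weight of k jumps, computed in log space to avoid underflow.
        weight = math.exp(-q * t + k * math.log(q * t) - math.lgamma(k + 1))
        for j in range(n):
            result[j] += weight * vec[j]
    return [(j - m) * h for j in range(n)], result

def sup_error(h, t=1.0, sigma2=1.0, mu=1.0):
    """Sup over the truncated lattice of |density - normalised pmf|."""
    grid, probs = transition_row(h, t, sigma2, mu)
    var = sigma2 * t
    return max(
        abs(p / h - math.exp(-(y - mu * t) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var))
        for y, p in zip(grid, probs)
    )

if __name__ == "__main__":
    for h in (0.4, 0.2, 0.1):
        print(h, sup_error(h))
```

Under these assumptions, halving h should reduce the observed error by a factor of roughly four, in keeping with the quadratic rate for λ = 0 and σ^2 > 0; the truncation width is chosen large enough that the boundary treatment is immaterial for these parameters.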
Example 5.13. Suppose that, under the pricing measure, the stock price process S = (S_t)_{t≥0} is given by S_t = S_0 e^{rt + X_t}, t ≥ 0, where S_0 is the initial price, r is the interest rate, and X is a tempered stable process with Lévy measure given by:   We choose the same value for the parameters as [27], namely S_0 = 100, r = 4%, α = 1/2, c = 1/2,  Table 4 summarizes this convergence on the decreasing sequence h_n := 1/2^n, n ≥ 1.
In conclusion, the above numerical experiments serve to indicate that our method behaves robustly when the Blumenthal-Getoor index of the Lévy measure is not too close to 2 (in particular, if the pure-jump part has finite variation). It does less well if this is not the case, since then the discretisation parameter h must be chosen small, which is expensive in terms of numerics (viz. the size of Q̂^h).