Randomized Milstein algorithm for approximation of solutions of jump-diffusion SDEs

We investigate the error of the randomized Milstein algorithm for solving scalar jump-diffusion stochastic differential equations. We provide a complete error analysis under substantially weaker assumptions than those known in the literature. In case the jump commutativity condition is satisfied, we prove optimality of the randomized Milstein algorithm by establishing matching lower bounds. Moreover, we give some insight into the multidimensional case by investigating the optimal convergence rate for the approximation of jump-diffusion type Lévy areas. Finally, we report numerical experiments that support our theoretical findings.

Due to their numerous applications in mathematical finance, control theory, and the modelling of energy markets, cf. [15,20,21,23], jump-diffusion SDEs continue to attract scientific interest. Only in very special cases are exact solutions available. It is therefore important to develop efficient (or even, in some sense, optimal) numerical algorithms.
In this paper, we define the randomized Milstein algorithm, a Milstein-type scheme that randomizes the drift coefficient in time. We prove L^p-error bounds and optimality of the scheme applied to SDE (1). We provide appropriate upper and lower error estimates in the multidimensional case, extending the findings from [2]. In the case that only a finite number of evaluations of W and N are allowed, this enables us to address the problem of the optimal approximation of jump-diffusion type Lévy areas. As these Lévy areas naturally appear in approximation schemes for jump-diffusion SDEs, our result implies lower error bounds. The analysis of the lower bounds and optimality is carried out in the Information-Based Complexity (IBC) framework, see [24]. This setting is widely used for investigating optimal algorithms for the approximation of solutions of SDEs, see, for example, [5,6,10,11,12,13,17,18,19,22].
The approximation of solutions of SDEs by randomized algorithms is studied, for example, in [11,12,17,18,22], where the authors consider the randomized Euler-Maruyama scheme for SDEs in the jump-free case and provide error bounds and optimality results. Our main contributions are the following:
• We investigate lower error bounds in the worst-case setting in the scalar case (Theorem 4.1) and the multidimensional case (Theorem 4.3). This allows us to establish optimality of the randomized Milstein algorithm in the scalar case with p = 2.
• We show that numerical experiments match our theoretical results.
The paper is organized as follows. Section 2 states the assumptions under which we perform the error analysis for the randomized Milstein algorithm. Section 3 is devoted to the error analysis of the randomized Milstein process. Lower bounds and optimality analysis in the IBC framework are given in Section 4. In Section 5 we show the results of the numerical experiments. Finally, some auxiliary results used in the proofs can be found in the Appendix.
We impose the following assumptions on the coefficient functions.
(ii) There exists a constant K1 > 0 such that for all t, s ∈ [0, T], y, z ∈ R, and all f ∈ {µ, σ, ρ} it holds that |f(t, y)

(iii) There exists a constant K2 > 0 such that for all y ∈ R, t, s ∈ [0, T],

(iv) There exists a constant K3 > 0 such that for all t ∈ [0, T], y, z ∈ R, and all f ∈ {σ, ρ} it holds that

(v) For the initial value X0 we assume that it is an F0-measurable random variable and that

By the Lipschitz assumption (2) we obtain that for all (t, y), and by (i) and (2),

Furthermore, we know that for f ∈ {µ, σ, ρ} and all t ∈ [0, T] the first partial derivative f_y(t, ·) is absolutely continuous, since it is Lipschitz continuous. Hence, for all t ∈ [0, T] the second partial derivative f_yy(t, ·) exists almost everywhere on R. For all t ∈ [0, T] denote by S_f(t) the set of Lebesgue measure 0 on which the second partial derivative f_yy(t, ·) does not exist. Then for f ∈ {µ, σ, ρ}, t ∈ [0, T], and all y ∈ R \ S_f(t) it holds that

On S_f(t) we define f_yy(t, ·) ≡ 0. At this point we would like to emphasize that the choice of the values of f_yy(t, ·) on S_f(t) does not influence the proof of the main result, since by using local time theory we see that the relevant bounds we compute do not depend on these values. Moreover, for f ∈ {σ, ρ} there exists a constant K5 ∈ (0, ∞) such that for all (t, y) ∈ [0, T] × R,

Under Assumption 2.1 the existence and uniqueness of a strong solution to the SDE (1) is well known, see, for example, [16, p. 255, Theorem 6], and for all s, t ∈ [0, T], see [22, Lemma 1]. The estimate (7) can be improved if ρ ≡ 0. Moreover, under Assumption 2.1 (i), (ii), for f ∈ {µ, σ, ρ} and fixed t ∈ [0, T] it is possible to apply the Meyer-Itô formula [16, p. 221, Theorem 71] to the function R ∋ y ↦ f(t, y) ∈ R and to the solution process.
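The displayed inequalities in (ii)-(iv) are not recoverable from this copy. For orientation only, conditions of this kind typically combine Lipschitz continuity in space with Hölder continuity in time; the following is a hedged illustration, not the authors' exact assumptions (the exponent ϱ and the growth factors are our placeholders):

```latex
% Illustrative only -- the exponent \varrho and the growth factor (1+|y|)
% are placeholder assumptions, not the paper's exact conditions.
|f(t,y)-f(s,y)| \le K\,(1+|y|)\,|t-s|^{\varrho}, \qquad
|f(t,y)-f(t,z)| \le K\,|y-z|,
\qquad f\in\{\mu,\sigma,\rho\},\ t,s\in[0,T],\ y,z\in\mathbb{R}.
```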
This gives the following parametric version of the Meyer-Itô formula: For all

where

Lemma A.1 states basic estimates for the functions above. For approximating the solution of SDE (1) we use the randomized Milstein algorithm. For n ∈ N we set δ = T/n and let t_i = iδ for i ∈ {0, . . ., n}. Moreover, we use the notation

and

and the σ-fields σ(I_{s,t}(Y, Z)) and F_s are independent, cf. [7, Fact B.28 (ii)]. Let {ξ_i}_{i=0}^{n−1} be independent random variables on the probability space (Ω, F, P) such that the σ-fields σ(ξ_0, ξ_1, . . ., ξ_{n−1}) and F_∞ are independent, with ξ_i being uniformly distributed on [t_i, t_{i+1}].
Then the randomized Milstein algorithm X^(δ) is defined recursively through

In order to analyse the error of the randomized Milstein algorithm we use the so-called time-continuous Milstein approximation X^(δ)_c, called the randomized Milstein process. It is defined as follows for t ∈ (t_i, t_{i+1}], i ∈ {0, . . ., n − 1}.

This implies that for all i ∈ {0, . . ., n}, X^(δ)(t_i) = X^(δ)_c(t_i). Now, analogously to [13], we extend the filtration (F_t)_{t≥0} in the following way: we take F̃^n_t = σ(F_t ∪ G_n), where G_n = σ(ξ_0, . . ., ξ_{n−1}). Since G_n and F_∞ are independent, W and N are still Wiener and Poisson processes with respect to (F̃^n_t)_{t≥0}, respectively. Since in the paper we integrate
• (F̃^n_t)_{t≥0}-progressively measurable processes with respect to the continuous (F̃^n_t)_{t≥0}-semimartingales (t)_{t∈[0,T]} and (W(t))_{t∈[0,T]},
• (F̃^n_t)_{t≥0}-adapted càglàd processes with respect to the càdlàg (F̃^n_t)_{t≥0}-semimartingale (N(t))_{t∈[0,T]},
the (stochastic) integrals are well defined, see, for example, [16]. Moreover, the randomized Milstein process is (F̃^n_t)_{t≥0}-progressively measurable, since it is càdlàg and adapted. Note that the randomized Milstein process is not an implementable algorithm, since it uses all values of W and N and these are not accessible. However, we will use it as an auxiliary scheme in our proof that the randomized Milstein algorithm has convergence orders δ
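Under the jump commutativity condition of Section 4 the scheme uses only increments of W and N on the grid, so a single path can be simulated directly. A minimal Python sketch follows; it is hedged: the drift is evaluated at a uniformly drawn point ξ_i in [t_i, t_{i+1}] (the randomization), but only the standard diffusion correction term is included, since the authors' exact correction terms are not fully recoverable from this copy.

```python
import numpy as np

def randomized_milstein(mu, sigma, sigma_y, rho, x0, T, n, lam, rng):
    """One path of a randomized Milstein-type scheme for the scalar SDE
    dX = mu(t, X) dt + sigma(t, X) dW + rho(t, X-) dN,
    where N is a Poisson process with intensity lam.

    Hedged sketch: only the diffusion correction term is included; the
    authors' scheme contains further jump correction terms omitted here.
    """
    delta = T / n
    x = x0
    for i in range(n):
        t_i = i * delta
        xi = rng.uniform(t_i, t_i + delta)       # randomized drift node in [t_i, t_{i+1}]
        dW = rng.normal(0.0, np.sqrt(delta))     # Wiener increment
        dN = rng.poisson(lam * delta)            # Poisson increment
        x = (x
             + mu(xi, x) * delta                 # drift evaluated at the random node
             + sigma(t_i, x) * dW                # diffusion term
             + rho(t_i, x) * dN                  # jump term
             + 0.5 * sigma(t_i, x) * sigma_y(t_i, x) * (dW**2 - delta))  # Milstein correction
    return x
```

For instance, with linear coefficients µ(t, x) = −x, σ(t, x) = 0.2x, ρ(t, x) = 0.1x this produces a single terminal value X^(δ)(T).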

Error analysis for the randomized Milstein process
Let for all i ∈ {1, . . ., n},

The processes X and X_c can be written for all t ∈ [0, T] as

Lemma 3.1. Under Assumption 2.1 there exists a constant K8 ∈ (0, ∞) such that for all n ∈ N it holds that

Proof. By induction and the fact that

Moreover, by (13) and (11) we obtain that for all n ∈ N there exists a constant

Now, we denote for all t ∈ [0, T],

where

By Lemma A.6 we have for all

From (3) we get that there exist constants c2, c3 ∈ (0, ∞) such that (16)

By (3), (5), and Lemma A.6 we obtain that there exist constants c4, c5 ∈ (0, ∞) such that for all (k, f

Here we used that the processes involved differ only at finitely many points. Combining (15), (16), and (17) we obtain that there exist constants c6, c7, c8 ∈ (0, ∞) such that

Hence, p is monotone and therefore Borel measurable. Moreover, by (14) it is bounded. Hence, applying Gronwall's lemma proves the claim.
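The proof closes by invoking Gronwall's lemma; for the reader's convenience, the integral form used here reads:

```latex
\text{If } p:[0,T]\to[0,\infty) \text{ is bounded and Borel measurable, }
a,b \ge 0, \text{ and}
\quad p(t) \le a + b \int_0^t p(s)\,ds \quad \text{for all } t\in[0,T],
\quad \text{then} \quad p(t) \le a\,e^{bt} \quad \text{for all } t\in[0,T].
```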
Next we prove the convergence rate of the randomized Milstein algorithm.
Remark 3.4. In the jump-free case (ρ = 0) we get from the proof of Theorem 3.2 that there exists K10 ∈ (0, ∞) such that for all n ∈ N,

Hence our result implies the same upper error bound for the randomized Milstein process as in [13, Proposition 1], but under slightly weaker assumptions on µ and σ. Moreover, for ϱ2 = min{1/2 + ϱ1, 1} we recover the upper error bound from [8], which is obtained therein for a two-stage randomized Milstein scheme.

Lower bounds and optimality
In this section we first consider the scalar case under the JCC, and then we study the multidimensional case. In both cases we set p = 2 and assume availability only of standard information, given by the values of W and N at a finite number of points. We investigate lower bounds and optimality in the IBC framework, cf. [24].

Scalar case and optimality of the randomized Milstein algorithm
We investigate lower bounds and the optimality of the randomized Milstein algorithm in the case where p = 2 and the JCC is satisfied, that is,

see [15]. It follows that under condition (46) the randomized Milstein algorithm uses only standard discrete information about W, N, i.e., the values W(t_1), . . ., W(t_n), N(t_1), . . ., N(t_n), since, by (9) and (10), in that case it has the form

Hence, if the JCC holds, the scheme is implementable.
The total number of evaluations of µ, σ, ρ, W, and N is denoted by l. Any algorithm A that uses the information N(µ, σ, ρ, X_0, W, N) and computes an approximation to X(T) is of the form

where ϕ : R^{3k_1+k_2+k_3+1} → R is a Borel measurable function. For a fixed n ∈ N we denote by Φ_n the class of all algorithms (47) with total number of evaluations l ≤ n.
For (µ, σ, ρ, X_0) ∈ F(ϱ1, ϱ2, ϱ3, p, K) we define the error of A ∈ Φ_n as

The worst-case error of A in a subclass G of F(ϱ1, ϱ2, ϱ3, p, K) is defined by

The aim is to find sharp bounds for e^(p)_n(F(ϱ1, ϱ2, ϱ3, p, K), W, N), i.e., lower and upper error bounds that match up to constants.
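The error displays themselves are missing from this copy; in the IBC framework they presumably take the standard form below (a hedged reconstruction, with A denoting an algorithm of the form (47)):

```latex
e^{(p)}(A;\mu,\sigma,\rho,X_0)
  = \big(\mathbb{E}\,|X(T)-A(\mu,\sigma,\rho,X_0,W,N)|^p\big)^{1/p},
\qquad
e^{(p)}_n(\mathcal{G},W,N)
  = \inf_{A\in\Phi_n}\ \sup_{(\mu,\sigma,\rho,X_0)\in\mathcal{G}}
    e^{(p)}(A;\mu,\sigma,\rho,X_0).
```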
The randomized Milstein algorithm can be written as

where X^(δ)(T) is defined in (10), and we have that

as n → +∞.
Proof. The upper bound for e^(2)_n(F(ϱ1, ϱ2, ϱ3, 2, K), W, N) follows from Theorem 3.2 and the fact that

We now turn to the lower bound. Let A be any algorithm from Φ_n that uses at most n evaluations of (µ, σ, ρ), W, and N. We consider the following subclasses of F(ϱ1, ϱ2, ϱ3, 2, K):

where by [14, Section 2.2.9, Proposition 2] we obtain that

Theorem 4.3. For the trapezoidal method (52) based on the equidistant mesh t_i = iT/n, i ∈ {0, 1, . . ., n}, it holds that

and therefore it is the optimal method among all methods of the form (50).
Proof. From the projection property of the conditional expectation we get for any algorithm (50) that

Therefore, we also have

Hence, we need to compute

For all i ∈ {0, . . ., n − 1} we define

From the definition of the Itô integral we get for all i ∈ {0, . . ., n − 1} that

where

with s^i_j = t_i + j(t_{i+1} − t_i)/m for all j ∈ {0, . . ., m}. Further, we denote ∆W^i_j = W(s^i_{j+1}) − W(s^i_j) for all i ∈ {1, . . ., n − 1} and j ∈ {1, . . ., m − 1}. Then

Since by Proposition A.5 the processes N and W are conditionally independent given the σ-algebra σ(N_n(N, W)), we have that

Moreover, by [6, Lemma 8] and [19, Lemma 3.1], we have for all s ∈ [t_i, t_{i+1}] and

Plugging (57), (58), and (59) into (56), we obtain

Since the above expression is a (pathwise) Riemann sum for a stochastic process with continuous sample paths, it holds for all i ∈ {0, . . ., m − 1} almost surely that

Moreover, from (55) and by Jensen's inequality for the conditional expectation we obtain

Hence, by (61) and (60),

Convergence in L^2(Ω) as well as almost sure convergence imply convergence in probability. Moreover, by the uniqueness of the limit in probability we get that for all i ∈ {0, . . ., m − 1},

Moreover, we have

Hence, by plugging (63) into (62) we get for all i ∈ {0, . . ., m − 1} that

Combining (64) and (54) we see that E[J(N, W) | N_n(N, W)] = A^T_n(N, W) corresponds to the trapezoidal method. Now we investigate its error in order to determine the minimal possible error among all methods of the form (50).
To do so, let us consider the step process given for all t ∈ [0, T] by

The process (Ñ_n(t))_{t∈[0,T]} is not adapted to the initial filtration (F_t)_{t≥0}. However, it is adapted to F̃_t := σ(F^W_t ∪ F^N_∞), t ≥ 0. Moreover, (W(t))_{t≥0} is still a one-dimensional Wiener process with respect to the filtration (F̃_t)_{t≥0}, since F^N_∞ and F^W_∞ are independent. Hence

is a well-defined Itô integral of the (F̃_t)_{t≥0}-simple process (Ñ_n(t))_{t∈[0,T]}, and therefore

Since (N(t) − Ñ_n(t))_{t∈[0,T]} is a (F̃_t)_{t≥0}-progressively measurable process, by the Itô isometry and Jensen's inequality we get

Hence, by (53) we conclude

Moreover, for the trapezoidal method A^T_n(N, W) based on the equidistant mesh t_i = iT/n, i ∈ {0, 1, . . ., n}, we get by (65) that

and hence

From (66) and (68) we arrive at
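The trapezoidal method A^T_n(N, W) admits a direct implementation from the grid values of N and W. A minimal sketch (the function name is ours), consistent with the conditional-expectation derivation above: each increment of W is weighted by the average of N at the two endpoints of the subinterval.

```python
import numpy as np

def trapezoidal_levy_area(N_vals, W_vals):
    """Trapezoidal approximation of the jump-diffusion type Levy area
    J = int_0^T N(t) dW(t) from the grid values N(t_i), W(t_i),
    i = 0, ..., n (equidistant mesh assumed).

    Each Wiener increment is weighted by the endpoint average of N.
    """
    N_vals = np.asarray(N_vals, dtype=float)
    W_vals = np.asarray(W_vals, dtype=float)
    dW = np.diff(W_vals)                          # increments W(t_{i+1}) - W(t_i)
    midN = 0.5 * (N_vals[:-1] + N_vals[1:])       # endpoint averages of N
    return float(np.sum(midN * dW))
```

For a constant path N ≡ c the method returns exactly c · (W(T) − W(0)), as the exact integral does.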

Numerical experiments
We implement the randomized Milstein algorithm for the SDE that has already been considered in the jump-free case in [13]. The jump coefficient is chosen so that the JCC is satisfied. The verification of Assumption 2.1 is straightforward; for the JCC (46), see Remark 5.2. In our simulations we set λ = 100, M = 100, ϱ1 = 0.1, and ϱ2 = 0.6.
We estimate the L^p-error similarly to [20, p. 14] by

Here X^(k)(T) is the approximation of X(T) with step size δ^(k), where δ^(k) = 2^{−k} for k ∈ N. The mean is taken over 2^16 sample paths.
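The estimator just described can be sketched as follows. The helper name and the use of a finer-step reference approximation on the same paths are our assumptions; the exact estimator of [20, p. 14] may differ in detail.

```python
import numpy as np

def lp_error_estimate(X_coarse, X_ref, p):
    """Monte Carlo estimate of the L^p error at the terminal time T.

    X_coarse, X_ref: arrays of terminal values computed on the same
    Brownian/Poisson sample paths with step size delta^(k) and with a
    finer reference step, respectively. Returns the empirical
    (E |X_coarse - X_ref|^p)^(1/p).
    """
    X_coarse = np.asarray(X_coarse, dtype=float)
    X_ref = np.asarray(X_ref, dtype=float)
    return float(np.mean(np.abs(X_coarse - X_ref) ** p) ** (1.0 / p))
```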
Remark 5.1. The interesting part of the implementation is how the values ξ_i are computed. For the randomization we first simulate independent uniformly distributed random variables ξ_i on the corresponding intervals of the finest discretization grid. Iteratively, we compute the values for the discretization grid with doubled step size as follows: one time interval in the coarser grid consists of two time intervals of equal length in the finer grid. For these two intervals we have simulated two values ξ_i. Now we simulate an independent Bernoulli(0.5) random variable that determines which of the two values ξ_i we take. This choice is then uniformly distributed on the interval of the coarser grid and hence consistent with the randomized Milstein algorithm.
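The grid-coarsening of the randomization points described in Remark 5.1 can be sketched as follows (the function name is ours):

```python
import numpy as np

def coarsen_xi(xi_fine, rng):
    """Given one uniform randomization point per interval of the fine grid,
    build consistent points for the grid with doubled step size: for each
    coarse interval pick one of its two fine-grid points with probability
    1/2 each. A mixture of two uniforms on adjacent halves of an interval,
    chosen by a fair coin, is uniform on the whole interval.
    """
    xi_fine = np.asarray(xi_fine, dtype=float)
    pairs = xi_fine.reshape(-1, 2)                    # two fine points per coarse interval
    choice = rng.integers(0, 2, size=pairs.shape[0])  # Bernoulli(0.5) selector
    return pairs[np.arange(pairs.shape[0]), choice]
```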
For p ∈ [2, ∞) we obtain by Theorem 3.2 the theoretical convergence rate. For p = 1 we take as theoretical convergence rate the same rate as for p = 2, because the L^1-error can be estimated by the L^2-error using the Cauchy-Schwarz inequality. In Figure 1 we plot log_2(error(k)) over log_2(δ^(k)) for p ∈ {1, 2, 3, 4} together with the corresponding theoretical convergence orders.
Figure 1: Error estimates and theoretical convergence order for p ∈ {1, 2, 3, 4}

We see that the observed convergence order decreases with increasing p. Further, we notice that for p = 1 the convergence of the simulation is faster than the theoretical convergence rate. This is reasonable because we took the rate of the L^2-error. For p = 2 we observe that the simulation confirms the theoretical results; the slope of the simulation matches the convergence rate, which we proved to be optimal. Also for p = 3 and p = 4 the simulations confirm the theoretical results, since the simulation converges at least as fast as the theoretically obtained upper bound; we have not proven any lower bound for these values of p.
Next, we regress the slope of the simulated log_2(error(k)) on the corresponding log_2(δ^(k)) for all p ∈ {1, . . ., 8} and compare it to the theoretical upper bounds on the convergence rates we have proven, see Figure 2. We observe that for the simulations the convergence order depends on p, which confirms this theoretical finding as well.
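The slope estimation for Figure 2 amounts to a least-squares fit in log-log coordinates; a minimal sketch:

```python
import numpy as np

def empirical_order(deltas, errors):
    """Estimate the empirical convergence order as the slope of
    log2(error) against log2(delta) via least-squares regression."""
    slope, _intercept = np.polyfit(np.log2(deltas), np.log2(errors), 1)
    return float(slope)
```

For data lying exactly on a line error = C · δ^r, the fit recovers the order r.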
Then the JCC (46) is satisfied for the pair (σ, ρ). This provides a new class of functions (σ, ρ) satisfying the JCC which, in contrast to the class considered in [15], may be nonlinear.

A Appendix
The proof of the following lemma is straightforward and will be omitted.

Figure 2: Slopes of the simulation (estimated by linear regression) in comparison to theoretical convergence rates
Consider any class of coefficients of multidimensional SDEs for which (49) is a subproblem. Then, by Theorem 4.3, in the worst-case setting with respect to the coefficients the error cannot be smaller than Ω(n^{−1/2}). Therefore, no matter whether the JCC (46) is satisfied or not, we can apply the Euler-Maruyama (or randomized Euler-Maruyama) scheme in order to achieve the optimal error bound O(n^{−1/2}) if the error is measured in the L^2(Ω)-norm, see, for example, [11], [12].

Remark 4.5. In Theorems 4.1 and 4.3 we have considered only the L^2-error. Matching upper and lower bounds that depend on p remain an open problem. Our numerical experiments in Section 5 suggest that for jump-diffusion SDEs the error indeed depends on p.
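As a point of comparison, a minimal sketch of the (non-randomized) Euler-Maruyama scheme for a scalar jump-diffusion SDE, which attains the O(n^{−1/2}) rate discussed above; the coefficient signatures are our convention, and the scalar form is for illustration only.

```python
import numpy as np

def euler_maruyama_jump(mu, sigma, rho, x0, T, n, lam, rng):
    """One path of the Euler-Maruyama scheme for the scalar SDE
    dX = mu(t, X) dt + sigma(t, X) dW + rho(t, X-) dN,
    where N is a Poisson process with intensity lam."""
    delta = T / n
    x, t = x0, 0.0
    for _ in range(n):
        dW = rng.normal(0.0, np.sqrt(delta))   # Wiener increment
        dN = rng.poisson(lam * delta)          # Poisson increment
        x = x + mu(t, x) * delta + sigma(t, x) * dW + rho(t, x) * dN
        t += delta
    return x
```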