The truncated Milstein method for stochastic differential equations

Inspired by the truncated Euler-Maruyama method developed in Mao (J. Comput. Appl. Math. 2015), we propose the truncated Milstein method in this paper. The strong convergence rate is proved to be close to 1 for a class of highly non-linear stochastic differential equations. Numerical examples are given to illustrate the theoretical results.


Introduction
Stochastic differential equations (SDEs), as a powerful tool for modelling uncertainties, have been broadly applied in many areas [1,2,3]. However, apart from linear SDEs, explicit solutions to most non-linear SDEs can hardly be found. Therefore, numerical approximations become essential in applications of SDE models.

This paper is organized as follows. Notations, assumptions and the truncated Milstein method will be introduced in Section 2. The proofs of the main results will be presented in Section 3. An example together with some ideas on further research will be presented in Section 4.

Mathematical Preliminaries
Throughout this paper, unless otherwise specified, let (Ω, F, P) be a complete probability space with a filtration {F_t}_{t≥0} satisfying the usual conditions (that is, it is right continuous and increasing while F_0 contains all P-null sets). Let E denote the expectation corresponding to P.
If A is a vector or matrix, its transpose is denoted by A^T. Let B(t) = (B^1(t), B^2(t), ..., B^m(t))^T be an m-dimensional Brownian motion defined on the space. If A is a matrix, let |A| = √(trace(A^T A)) be its trace norm. If x ∈ R^d, then |x| is the Euclidean norm. For two real numbers a and b, set a ∨ b = max(a, b) and a ∧ b = min(a, b). If G is a set, its indicator function is denoted by I_G, namely I_G(x) = 1 if x ∈ G and 0 otherwise.
For j_1, j_2 = 1, ..., m, define

For the truncated Milstein method, we need both f and g to have continuous second-order derivatives. In addition, the following assumptions are imposed.
for all x, y ∈ R d .
holds for all p ≥ 1, where α 1 is a positive constant dependent on p.
Moreover, Assumption 2.2 guarantees the boundedness of the moments of the underlying solution [3], namely, there exists a positive constant K, dependent on t and p, such that where α 2 is a positive constant.

The Classical Milstein Method
Define a uniform mesh T_N: 0 = t_0 < t_1 < ··· < t_N = T with t_k = k∆, where ∆ = T/N for N ∈ N. The classical Milstein method [32] is

When the diffusion coefficient g satisfies the commutativity condition L^{j_1} g_{l,j_2} = L^{j_2} g_{l,j_1} for j_1, j_2 = 1, ..., m and l = 1, ..., d, the classical Milstein method is simplified into

where the property I_{j_1,j_2} + I_{j_2,j_1} = ∆B^{j_1}_k ∆B^{j_2}_k for j_1 ≠ j_2 is used.
In this paper, we only consider the case of a commutative diffusion coefficient. For the non-commutative case, the truncated Milstein method may still be applicable, but more complicated notation and new techniques would be involved. To keep the paper at a reasonable length, we will report the more general case in future work.
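For a scalar SDE dx = f(x) dt + g(x) dB with a single Brownian motion, the commutativity condition holds trivially and the simplified scheme reduces to the familiar update x + f(x)∆ + g(x)∆B + ½ g(x)g'(x)((∆B)² − ∆). The following is a minimal sketch of that scalar case only. The function name `milstein_path`, its parameters and the linear test equation in the demo are illustrative assumptions and do not come from the paper.

```python
import numpy as np

def milstein_path(f, g, dg, x0, T, n_steps, rng):
    """Simulate one path of the scalar SDE dx = f(x) dt + g(x) dB with the
    classical Milstein scheme on a uniform mesh of n_steps intervals.

    dg is the derivative g'(x). All names here are hypothetical.
    """
    dt = T / n_steps
    x = np.empty(n_steps + 1)
    x[0] = x0
    for k in range(n_steps):
        dB = rng.normal(0.0, np.sqrt(dt))  # Brownian increment ~ N(0, dt)
        x[k + 1] = (x[k]
                    + f(x[k]) * dt
                    + g(x[k]) * dB
                    + 0.5 * g(x[k]) * dg(x[k]) * (dB**2 - dt))
    return x

if __name__ == "__main__":
    # Illustrative linear test equation (an assumption, not the paper's example).
    rng = np.random.default_rng(0)
    path = milstein_path(lambda x: 0.1 * x, lambda x: 0.2 * x, lambda x: 0.2,
                         x0=1.0, T=1.0, n_steps=1024, rng=rng)
    print(path[-1])
```

With globally Lipschitz coefficients such as those in the demo, this scheme is the classical one; the truncation introduced next targets coefficients that grow super-linearly.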

The Truncated Milstein Method
For j = 1, ..., m and l = 1, ..., d, define the derivative of the vector g_j(x) with respect to x^l by

To define the truncated Milstein method, we first choose a strictly increasing continuous

for any u ≥ 2, j = 1, ..., m and l = 1, ..., d.
Denote the inverse function of µ by µ^{-1}. We see that µ^{-1} is a strictly increasing continuous function from [µ(0), +∞) to R_+. We also choose a number ∆* ∈ (0, 1] and a strictly decreasing

For a given step size ∆ ∈ (0, 1) and any x ∈ R^d, define the truncated functions by

f̃(x) = f((|x| ∧ µ^{-1}(h(∆))) x/|x|), g̃_j(x) = g_j((|x| ∧ µ^{-1}(h(∆))) x/|x|), G̃^l_j(x) = G^l_j((|x| ∧ µ^{-1}(h(∆))) x/|x|), j = 1, ..., m, l = 1, ..., d.

That is to say, all the truncated functions f̃, g̃ and G̃^l_j are bounded although f, g and G^l_j may not be. The next lemma illustrates that those truncated functions preserve (2.3) for all ∆ ∈ (0, ∆*].
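The truncation idea can be sketched for a scalar SDE as follows. The sketch assumes that the truncation is the radial projection x ↦ (|x| ∧ µ^{-1}(h(∆))) x/|x| used in the truncated Euler-Maruyama method of Mao (2015), and that one step of the scheme applies the commutative Milstein update to the truncated coefficients. The names `truncate` and `truncated_milstein_step`, the coefficients and the particular `mu_inv` and `h` in the demo are hypothetical, and the demo does not check the conditions the paper imposes on µ and h.

```python
import numpy as np

def truncate(x, radius):
    """Scalar radial projection: (|x| ∧ radius) * x / |x|, i.e. clip to [-radius, radius]."""
    return float(np.clip(x, -radius, radius))

def truncated_milstein_step(f, g, dg, y, dt, dB, radius):
    """One step of a scalar truncated Milstein update (a sketch).

    The coefficients f, g and g' are evaluated at the truncated state, so they
    stay bounded however large |y| is; radius stands for mu^{-1}(h(dt)).
    """
    z = truncate(y, radius)
    return y + f(z) * dt + g(z) * dB + 0.5 * g(z) * dg(z) * (dB**2 - dt)

if __name__ == "__main__":
    # Hypothetical super-linear coefficients, chosen only for illustration.
    f = lambda x: x - x**5
    g = lambda x: x**2
    dg = lambda x: 2.0 * x
    # Illustrative choices (assumptions): on |x| <= u with u >= 1, the functions
    # |f|, |g| and |g'| are bounded by mu(u) = 2 u^5, whose inverse is (v/2)^(1/5).
    mu_inv = lambda v: (v / 2.0) ** 0.2
    h = lambda dt: 2.0 * dt ** (-0.1)
    dt = 2.0 ** -10
    rng = np.random.default_rng(0)
    y = 1.0
    for _ in range(1024):
        y = truncated_milstein_step(f, g, dg, y, dt, rng.normal(0.0, np.sqrt(dt)),
                                    mu_inv(h(dt)))
    print(y)
```

Note that the truncation radius depends only on the step size, so it is computed once per ∆ and reused for every state and every step.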

Lemma 2.3
Assume that (2.3) holds. Then, for all ∆ ∈ (0, ∆*] and any x ∈ R^d,

The proof of this lemma is the same as that of Lemma 2.4 in [27], so we omit it here. We should point out that h(∆*) ≥ µ(2) was required in [27], but we observe that the proof of Lemma 2.4 in [27] still works if h(∆*) ≥ µ(1), and that is why in this paper we

The truncated Milstein method is defined by

To simplify the notation, we set

The continuous version of the truncated Milstein method is defined by

Boundedness of the Moments
It is obvious from (2.12) that for any

However, it is not so clear that for any

This is what we are going to prove in this subsection. Firstly, we show that Y(t) and Ȳ(t) are close to each other.

Lemma 2.4
For any ∆ ∈ (0, ∆*], any t ≥ 0 and any p ≥ 1,

where c is a positive constant independent of ∆. Consequently, for any t ≥ 0

Proof. Fix the step size ∆ ∈ (0, ∆*] arbitrarily. For any t ≥ 0, there exists a unique integer

where c is a positive constant independent of ∆ that may change from line to line. Then by the elementary inequality, the Hölder inequality and Theorem 7.1 in [3] (Page 39), we have

Applying (2.12) and the fact that

where K is a positive constant dependent on T but independent of ∆.
Proof. It follows from (2.15) that (2.16)

By the Itô formula, we have

L^{j_1} g̃_{j_2}(Ȳ(s)) ∆B^{j_2}(s) dB^{j_1}(s) = 0 are used. We rewrite the inequality as

By (2.12) and (2.13), we see

where K is a positive constant independent of ∆ that may change from line to line; its exact value plays no role in our analysis.
Applying the Young inequality that

we obtain

As the sum of the right-hand-side terms in the above inequality is an increasing function of t,

By the Gronwall inequality, we obtain

where K is a positive constant independent of ∆. Therefore, the proof is complete.

Main Results
If a function φ : R^d → R^d is twice differentiable, then the following Taylor formula

For any x, h_1, h_2 ∈ R^d, the derivatives have the following expressions

Here,
The proof is similar to that of (2.4).

(3.10)
where C is a positive constant independent of ∆.
Proof. We first give an estimate on E|R_1(f)|^p. Applying Lemmas 2.4 and 2.5, we can find a constant C such that

where the polynomial growth condition (2.6) on f''(x), the Hölder inequality and the Jensen inequality have been used. To estimate E|R_1(f)|^p, we derive from (3.5) that for t ∈ [t_k, t_{k+1}),

where the Kronecker delta δ_{j_1,j_2} is a piecewise function of the variables j_1 and j_2. Noting that t − t_k ≤ ∆ and using the Hölder inequality and the Burkholder-Davis-Gundy inequality, we have

(3.13)

Using Lemma 3.1, (2.12) and the Hölder inequality, we can show that for 0 ≤ t ≤ T, 1 ≤

(3.14)

Now, substituting (3.11), (3.13) and (3.14) into (3.12) and making use of the independence of Y(t) and ∆B^{j_1}(t), ∆B^{j_2}(t), we obtain

as required. Similarly, we can show

The proof is complete.
According to (2.9) and (2.10), we have that

Therefore, it follows from (3.15) and (3.7) that

where

and

Applying Assumption 2.2 to J_1, we obtain

Inserting the expression (3.4) into (3.18) gives

By the Young inequality and the Hölder inequality, we get

where

Following a very similar approach used for (3.35) in [25], we can show

Then, we have

Applying the Young inequality to (3.19) gives

holds for some positive constant K independent of R.
The proof of this lemma is similar to that of (2.4). Briefly speaking, replacing t by τ_R ∧ T in

which implies the assertion.
holds for some positive constant K independent of R and ∆.
The proof is similar to that of Lemma 3.5.
We now present our main theorem.
holds for all sufficiently small ∆ ∈ (0, ∆*], then for any fixed T = N∆ > 0 and sufficiently

holds, where K is a positive constant independent of ∆.

Proof. We separate the left hand side of (3.26) into two parts

Let us first consider the second term on the right hand side. Fix any p ∈ [1, +∞). Using the Young inequality that for any δ > 0, we can have

(3.28)

Applying (2.4) and Lemma 2.5, we see

where C is a positive constant independent of R and ∆. By Lemmas 3.5 and 3.6, we also have

Substituting (3.29) and (3.30) into (3.28) yields

Choosing δ = ∆^{2p} (h(∆))^{4p} and R = (∆^p (h(∆))^{2p})^{−1/(q−p)}, we have

for any ∆ ∈ (0, ∆*). Applying Theorem 3.4 to the first term on the right hand side of (3.27) completes the proof.
Let us close this section by the following remark.
Remark 3.8 In this paper, our conditions are imposed for every p ≥ 1 as we wish to show the strong L^{2p}-convergence rate for every p ≥ 1. However, our theory can also be applied to the case of some p ≥ 1. For example, assume that the conditions in Theorem 3.7 hold for some p ≥ 1 and (3.25) is replaced by the requirement that, for the given p, there exists a q ∈ (p, +∞) such that

holds for all sufficiently small ∆ ∈ (0, ∆*]. Then our proof above shows clearly that for all sufficiently small ∆ ∈ (0, ∆*] and for any fixed T = N∆ > 0,

An Example and Further Discussion
After the theoretical discussion on the truncated Milstein method, it is time to explain how to apply the method. One may note from Section 2 that the choices of functions µ(u) and h(∆) are essential in order to use the method. The forms of these two functions are highly related to the structures of the drift and diffusion coefficients f and g of the SDE (2.1). We shall illustrate the theory as well as how to choose µ(u) and h(∆) by the following example.

Hence

That is to say, Assumption 2.2 is fulfilled.
It is clear to see

to hold for each p ≥ 1. That is to say, we require

In fact, for any given p ≥ 1 and any small ε > 0, we can always choose a sufficiently large q to make the inequality above hold. Therefore, by Theorem 3.7 we can conclude

That is, the strong L^{2p}-convergence rate is close to 2p (or the L^1-convergence rate is close to 1).
In the computer simulations, we choose ε = 0.1 and regard the numerical solution with the step size 2^{-16} as the true solution. In Figure 1, we plot the strong errors (i.e., in L^1) of the truncated Milstein method with step sizes 2^{-13}, 2^{-12}, 2^{-11} and 2^{-10}, respectively.

[Figure 1: log-log plot of the error against the step size.]

It is interesting to observe from Figure 1 that the strong convergence rate is quite close to one, although we choose ε = 0.1 and the theoretical result (4.1) only shows the rate of 0.9.
This observation indicates that our theoretical result is somewhat conservative.
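An observed rate of this kind is usually read off as the slope of a least-squares line through the points (log ∆, log error). The sketch below shows that fit, together with one common way to compare a coarse-step solution with a fine-step reference on the same Brownian path, namely summing consecutive fine increments. Both helper names are hypothetical, and this coupling is an assumption about how such experiments are typically set up rather than a description of the simulations reported here.

```python
import numpy as np

def estimate_rate(step_sizes, errors):
    """Slope of the least-squares line through (log step size, log error)."""
    slope, _intercept = np.polyfit(np.log(step_sizes), np.log(errors), 1)
    return slope

def coarsen_increments(dB_fine, factor):
    """Sum consecutive blocks of `factor` fine Brownian increments, giving the
    increments of the same path on a mesh that is `factor` times coarser.

    len(dB_fine) must be divisible by factor.
    """
    return np.asarray(dB_fine).reshape(-1, factor).sum(axis=1)
```

For instance, increments generated for step size 2^{-16} can be coarsened with `factor = 8` to drive a simulation with step size 2^{-13}, and the mean absolute differences at the final time over many paths can then be passed to `estimate_rate`.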
We also observe from Theorem 3.7 that the strong convergence rate is highly dependent on the choices of the functions, µ(·) and h(·). Although we have demonstrated in the example above how to choose them, the example itself has already indicated that those choices may not be optimal.
Once the step size is chosen, the truncation barrier is fixed for all states and for the whole time interval. To be more efficient, it may be worthwhile to design a truncation barrier that depends on the current state, which may in turn lead to a numerical method with variable step size. We have been working on this new method and will report it later.