Recovering the pathwise Itô solution from averaged Stratonovich solutions

We recover the pathwise Itô solution (the solution to a rough differential equation driven by the Itô signature) by concatenating averaged Stratonovich solutions on small intervals and by letting the mesh of the partition in the approximations tend to zero. More specifically, on a fixed small interval, we consider two Stratonovich solutions: one is driven by the original process and the other is driven by the original process plus a selected independent noise. Then by taking the expectation with respect to the selected noise, we can recover the increment of the bracket process and so recover the leading order approximation of the Itô solution up to a small error. By concatenating averaged increments and by letting the mesh tend to zero, the error tends to zero and we recover the Itô solution.


Introduction
Itô calculus [11,12] can be seen as a transformation between semi-martingales (i.e. the map which sends the driving process to the solution of a stochastic differential equation) and is widely used in various mathematical models. It is well known that the classical Itô calculus is not stable under pointwise approximations. Indeed, the Wong-Zakai theorem ([23,22]; see also [5]) shows that, when controlled ordinary differential equations are driven by piecewise-linear approximations to Brownian motion, their solutions converge uniformly in probability to the Stratonovich solution as the mesh of the partition in the approximations tends to zero. In contrast to the Stratonovich solution, the Itô solution is not stable with respect to perturbations of the driving process even when the perturbations are very natural.
There has been long-standing interest in developing a pathwise Itô calculus [1,13,4,21], but these attempts have their limitations. For example, the null set depends on the integrand function, or the integral is only defined for closed one-forms (but closed one-forms are rare in high dimensional spaces), or the convergence is in probability (so not truly pathwise). The theory of rough paths [15,16,18,8,7] is close in spirit to Föllmer's approach [4], but it is a far more systematic methodology that can deal with closed and non-closed one-forms, and applies but is not restricted to semi-martingales.
Unlike the Stratonovich integral, the Itô integral cannot be approximated by a sequence of classical integrals. The pathwise Itô solution is generally defined as the Stratonovich solution to a modified equation with an additional drift term, see e.g. Lyons and Qian [17], Lejay and Victoir [14], Friz and Victoir [7], Hairer and Kelly [9]. Rather than defining the pathwise Itô solution as the Stratonovich solution to a modified differential equation, we would like to demonstrate that the Itô solution is almost a Stratonovich solution, in the sense that the Itô solution can be expressed as the limit (as the mesh of the partition tends to zero) of concatenated averaged Stratonovich solutions. More specifically, we need two Stratonovich solutions on a small time interval: one is the Stratonovich solution driven by the Stratonovich signature of the original process, and the other is the Stratonovich solution driven by the joint Stratonovich signature of the original process plus a selected independent noise. Then by taking the expectation of the second Stratonovich solution with respect to the selected noise, we recover the bracket process, and by working with a chosen functional of these two Stratonovich solutions, we get the leading order approximation of the increment of the pathwise Itô solution with a small error.
By letting the mesh of the partition tend to zero, the error tends to zero and we recover the pathwise Itô solution. We would like to recover the Itô solution from averaged Stratonovich solutions mainly because the Stratonovich solution fits more naturally into the rough paths framework than the Itô solution. The averaging effect is also related to the reverse situation where any player in a market interacts with a random sub-sample from the stream and the actual effect on the market is the volume weighted average. Based on our result the random interactions will generate an Itô type correction to the equation for the aggregate behavior.
To convey the idea more explicitly, we illustrate it with a simple example. Suppose $B$ is a one-dimensional Brownian motion and $f : \mathbb{R} \to \mathbb{R}$ is sufficiently regular. We want to recover the solution $y$ to the Itô stochastic differential equation
$$dy_u = f(y_u)\,dB_u.$$
Suppose $W$ is another one-dimensional Brownian motion which is independent from $B$. We define a family of Stratonovich solutions $y^{1,s,t}$ and $y^{2,s,t}$, indexed by the time intervals $\{[s,t]\}_{s<t}$, to be the Stratonovich solutions on $[s,t]$ to the stochastic differential equations (with $y_s$ denoting the value of $y$ at time $s$)
$$dy^{1,s,t}_u = f(y^{1,s,t}_u) \circ dB_u, \quad y^{1,s,t}_s = y_s,$$
$$dy^{2,s,t}_u = f(y^{2,s,t}_u) \circ d(B_u + W_u), \quad y^{2,s,t}_s = y_s.$$
We would like to identify a function $F : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ such that, by concatenating $F(y^{1,s,t}_t - y^{1,s,t}_s,\ y^{2,s,t}_t - y^{2,s,t}_s)$ on small intervals and by letting the mesh of the partition tend to zero, we recover $y$ in the limit. In the real construction, the initial values of $y^{1,s,t}$ and $y^{2,s,t}$ are not $y_s$ but the value obtained from the last step of concatenation. Here we use $y_s$ to give an intuitive explanation.
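For a small time interval $[s,t]$, by using Euler's approximation, we have
$$y_t - y_s \approx f(y_s)(B_t - B_s) + \tfrac{1}{2}(f'f)(y_s)\big((B_t - B_s)^2 - (t - s)\big), \tag{1.1}$$
$$y^{1,s,t}_t - y^{1,s,t}_s \approx f(y_s)(B_t - B_s) + \tfrac{1}{2}(f'f)(y_s)(B_t - B_s)^2,$$
$$y^{2,s,t}_t - y^{2,s,t}_s \approx f(y_s)(B_t - B_s + W_t - W_s) + \tfrac{1}{2}(f'f)(y_s)(B_t - B_s + W_t - W_s)^2.$$
Since $W$ is independent from $B$, if we take the expectation of $y^{2,s,t}_t - y^{2,s,t}_s$ w.r.t. $W$, then the expectation simulates the required continuous martingale correction $t - s$ in (1.1) and we get
$$E_W\big(y^{2,s,t}_t - y^{2,s,t}_s\big) \approx f(y_s)(B_t - B_s) + \tfrac{1}{2}(f'f)(y_s)\big((B_t - B_s)^2 + (t - s)\big).$$
Hence, we may take $F(x, y) := 2x - E_W(y)$, $\forall x, y \in \mathbb{R}$ (since $B$ and $W$ are independent, $W$ is fixed once and for all for almost every sample path of $B$). Then it can be proved that the concatenation of these increments along $D$ converges to the increment of $y$ as $|D| \to 0$, where $D = \{t_k\}_{k=0}^n$ is a finite partition of $[s,t]$ with $s = t_0 < t_1 < \cdots < t_n = t$ and $|D| := \max_k |t_{k+1} - t_k|$ is the mesh of $D$. By taking the expectation with respect to the selected independent noise $W$ and by working with a chosen functional of the Stratonovich solutions on a small interval, we obtain the leading order approximation of the increment of the Itô solution $y$, and recover $y$ as the limit of discrete concatenations when the mesh tends to zero. More generally, we can replace $B$ with a $d$-dimensional continuous martingale (or even a Gaussian process, provided the joint signature of the Gaussian process and the selected noise is well defined), and we have to estimate $\int y\,dy$ as well because the pathwise regularity of a continuous martingale is just above the threshold of having finite 2-variation a.s. The idea, however, is similar and is captured in this example.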
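As a minimal numerical sketch of this example (an illustration under stated assumptions, not part of the original argument), take $f(y) = y$, for which everything is explicit: the Stratonovich solution on $[s,t]$ started from $y_s$ is $y_s \exp(B_t - B_s)$, the expectation over $W$ of the second Stratonovich solution is $y_s \exp(B_t - B_s + (t-s)/2)$, and the Itô solution is $y_0 \exp(B_t - t/2)$. The function name, the uniform partition, the step count and the seed are hypothetical choices made for the illustration.

```python
import math
import random


def averaged_stratonovich_gbm(T=1.0, n=1000, y0=1.0, seed=0):
    """Concatenate F(x, y) = 2x - E_W(y) for f(y) = y on a uniform partition.

    Returns the concatenated value at time T and the exact Ito solution
    y0 * exp(B_T - T/2) along the same simulated Brownian path.
    """
    rng = random.Random(seed)
    h = T / n
    y, b = y0, 0.0
    for _ in range(n):
        db = rng.gauss(0.0, math.sqrt(h))
        # Increment of the Stratonovich solution driven by B, started at y.
        inc1 = y * (math.exp(db) - 1.0)
        # Expectation over W of the increment of the Stratonovich solution
        # driven by B + W, using E[exp(W_t - W_s)] = exp((t - s) / 2).
        inc2_mean = y * (math.exp(db + 0.5 * h) - 1.0)
        # Start the next step from the concatenated value, not from the
        # (unknown) Ito solution.
        y += 2.0 * inc1 - inc2_mean
        b += db
    ito = y0 * math.exp(b - 0.5 * T)
    return y, ito


y_fine, ito_fine = averaged_stratonovich_gbm(n=1000, seed=0)
y_coarse, ito_coarse = averaged_stratonovich_gbm(n=10, seed=1)
print(abs(y_fine / ito_fine - 1.0), abs(y_coarse / ito_coarse - 1.0))
```

For this particular $f$, each step multiplies the current value by $\exp(\Delta B)(2 - \exp(h/2))$, so the relative error against the Itô solution is of order $Th/4$ for every sample path and vanishes as the mesh $h$ tends to zero.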
Based on Lejay and Victoir [14], any $p$-rough path, $p \in [2, 3)$, can be interpreted as the product of a weak geometric $p$-rough path and another continuous path with finite $2^{-1}p$-variation. We will use this equivalence and define the solution to a rough differential equation driven by a $p$-rough path, $p \in [2, 3)$, as the solution to a rough differential equation driven by a $(p, 2^{-1}p)$-rough path as in Friz and Victoir [7].
where $\mathrm{Anti}(\cdot)$ denotes the projection of $(\mathbb{R}^d)^{\otimes 2}$ to $\mathrm{span}\{e_i \otimes e_j - e_j \otimes e_i \mid i, j = 1, \dots, d\}$ and $\mathrm{Sym}(\cdot)$ denotes the projection of $(\mathbb{R}^d)^{\otimes 2}$ to $\mathrm{span}\{e_i \otimes e_j + e_j \otimes e_i \mid i, j = 1, \dots, d\}$.
Then $\gamma^A$ is a weak geometric $p$-rough path (see footnote 2; a normal driving path in rough paths theory) and $\gamma^S$ is a continuous path with finite $2^{-1}p$-variation. The cross integrals between $\pi_1(\gamma^A)$ (which is equal to $\pi_1(\gamma)$) and $\gamma^S$ are well-defined as Young integrals [24] because $p^{-1} + 2p^{-1} = 3p^{-1} > 1$, see [14] for details.
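The two projections are the symmetric and antisymmetric parts of a 2-tensor. The sketch below uses hypothetical helper names and a piecewise-linear path chosen only for illustration. It computes the first two signature levels of that path with Chen's identity and checks that the symmetric part of the second level equals $\tfrac{1}{2}x \otimes x$, where $x$ is the total increment. This is the level-two constraint satisfied by a weak geometric rough path, so for such a path all the remaining information sits in the antisymmetric part.

```python
def outer(u, v):
    """Tensor product u (x) v of two vectors, stored as a nested list."""
    return [[ui * vj for vj in v] for ui in u]


def sym(a):
    """Projection onto span{e_i (x) e_j + e_j (x) e_i}."""
    d = len(a)
    return [[0.5 * (a[i][j] + a[j][i]) for j in range(d)] for i in range(d)]


def anti(a):
    """Projection onto span{e_i (x) e_j - e_j (x) e_i}."""
    d = len(a)
    return [[0.5 * (a[i][j] - a[j][i]) for j in range(d)] for i in range(d)]


def level2_signature(points):
    """First two signature levels of the piecewise-linear path through points."""
    d = len(points[0])
    x1 = [0.0] * d
    x2 = [[0.0] * d for _ in range(d)]
    for p, q in zip(points, points[1:]):
        dx = [qi - pi for pi, qi in zip(p, q)]
        for i in range(d):
            for j in range(d):
                # Chen's identity: cross term with the path so far, plus the
                # second level of a straight segment, which is dx (x) dx / 2.
                x2[i][j] += x1[i] * dx[j] + 0.5 * dx[i] * dx[j]
        x1 = [a + b for a, b in zip(x1, dx)]
    return x1, x2


x1, x2 = level2_signature([(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 2.0)])
half_square = [[0.5 * v for v in row] for row in outer(x1, x1)]
print(sym(x2), half_square, anti(x2))
```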
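Denote by $L(\mathbb{R}^d, \mathbb{R}^e)$ the set of linear mappings from $\mathbb{R}^d$ to $\mathbb{R}^e$.
Definition 2.3. $f : \mathbb{R}^e \to L(\mathbb{R}^d, \mathbb{R}^e)$ is said to be $\mathrm{Lip}(\beta)$ for $\beta > 1$, if $f$ is $\lfloor\beta\rfloor$-times Fréchet differentiable (where $\lfloor\beta\rfloor$ denotes the largest integer which is strictly less than $\beta$) and
$$\|f\|_{\mathrm{Lip}(\beta)} := \max_{k = 0, \dots, \lfloor\beta\rfloor} \|D^k f\|_\infty \vee \|D^{\lfloor\beta\rfloor} f\|_{(\beta - \lfloor\beta\rfloor)\text{-Höl}} < \infty,$$
where $\|\cdot\|_\infty$ denotes the uniform norm and $\|\cdot\|_{(\beta - \lfloor\beta\rfloor)\text{-Höl}}$ denotes the $(\beta - \lfloor\beta\rfloor)$-Hölder norm.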
The following definition is based on Definition 12.2 in [7].
Footnote 2. A weak geometric $p$-rough path is a continuous path of finite $p$-variation taking values in the step-$[p]$ nilpotent Lie group.
ECP 21 (2016), paper 7.
Theorem 2.6 (Existence and Uniqueness). There exists a solution to (2.3) when $f$ is $\mathrm{Lip}(\beta)$ for $\beta > p - 1$, and the solution is unique when $\beta > p$.
The modification we made in (2.4) will not affect this existence and uniqueness result. Indeed, based on Theorem 12.6 [7], when $f$ is $\mathrm{Lip}(\beta)$ for $\beta > p - 1$, $\{y^{1,m}\}_m$ are uniformly bounded in $p$-variation. When $y^{1,m}$ converge uniformly as $m \to \infty$ to $\pi_1(Y)$, by interpolating between the $p$-variation norm and the uniform norm, we have that $y^{1,m}$ converge to $\pi_1(Y)$ in $p'$-variation for any $p' > p$ as $m \to \infty$. Similarly, by interpolating between the $2^{-1}p$-variation norm and the uniform norm, we have that $x^{S,m}$ converge to $\gamma^S$ in $2^{-1}p'$-variation for any $p' > p$ as $m \to \infty$. We choose $p' \in (p, 3)$ so that $(p')^{-1} + 2(p')^{-1} > 1$. Then, by using the Young integral (Theorem 1.16 [16]), the integrals $\int f(y^{1,m})^{\otimes 2}\,dx^{S,m}$ converge to $\int f(\pi_1(Y))^{\otimes 2}\,d\gamma^S$ as $m \to \infty$ (so the existence result holds with the additional term $\int f(y^{1,m})^{\otimes 2}\,dx^{S,m}$ in (2.4)). When $\beta > p$, based on Theorem 12.10 in [7], the solution in the sense of Definition 12.2 in [7] is unique, so the path $\int_0^\cdot f(\pi_1(Y))^{\otimes 2}\,d\gamma^S$ is unique, and we have the uniqueness of the solution to (2.3) in the sense of Definition 2.5.

Recovering the pathwise Itô solution
As mentioned in the introduction, we would like to recover the pathwise Itô solution by taking the average of Stratonovich solutions. The idea is simple, but the concrete formulation needs some care. Here we try to give a sensible explanation of our formulation.
The difference between them is in the definition of the iterated integral of $Z$, which is defined as a Stratonovich resp. Itô integral. Both $S_2(Z)$ and $I_2(Z)$ are almost surely $p$-rough paths for any $p \in (2, 3)$ (Theorem 14.9 [7]). Usually, the pathwise Stratonovich resp. Itô solution is the solution to a rough differential equation driven by the Stratonovich resp. Itô signature.
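In one dimension the difference between the two iterated integrals can be checked on a discretised path. The following is a minimal sketch under assumptions of our own: the function name is hypothetical, and the left-point and trapezoidal sums stand in for the Itô and Stratonovich integrals of $Z$ against itself. On any partition the trapezoidal sum equals $\tfrac{1}{2}Z_T^2$ and exceeds the left-point sum by exactly half the discrete bracket $\sum (\Delta Z)^2$.

```python
import math
import random


def iterated_integrals(increments):
    """Left-point (Ito) and trapezoidal (Stratonovich) sums of Z against dZ.

    Also returns the terminal value of Z and the discrete bracket, i.e. the
    sum of squared increments.
    """
    z = 0.0
    ito = strat = bracket = 0.0
    for dz in increments:
        ito += z * dz                  # integrand evaluated at the left endpoint
        strat += (z + 0.5 * dz) * dz   # integrand averaged over the two endpoints
        bracket += dz * dz
        z += dz
    return z, ito, strat, bracket


rng = random.Random(0)
n, T = 1000, 1.0
increments = [rng.gauss(0.0, math.sqrt(T / n)) for _ in range(n)]
z_T, ito, strat, bracket = iterated_integrals(increments)
print(strat - ito, 0.5 * bracket)
```

The identity holds for every sequence of increments, not only Brownian ones; the Brownian sample is used here only because the discrete bracket then approximates $T$.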
Since $\gamma$ is fixed, the condition (3.2) is satisfied e.g. when the cross integrals between $\pi_1(\gamma)$ and $M$ (i.e. the process $R$) are defined as the $L^1$ limit of piecewise linear approximations.
Suppose $Z$ is a $d$-dimensional square integrable martingale such that its bracket process $\langle Z \rangle$ has the expression $\int \psi_u^T \psi_u\,du$ for some matrix-valued process $\psi$, and $B$ is a $d$-dimensional Brownian motion independent from $Z$. We let $\gamma = S_2(Z)$ and define $M$ to be the Itô integral $\int \psi_u\,dB_u$. In this case, the process $R$ could be defined as in (3.3) (and there are other possible choices). The Stratonovich integrals in (3.3) are well-defined because the $2d$-dimensional process $(Z, M)$ is a continuous martingale w.r.t. the filtration generated by $Z$ and $B$ (Proposition 14.9 [7]). Then condition (3.2) is satisfied for this particular choice of $R$ for almost every $\gamma$ because the Stratonovich integrals in (3.3) can be expressed as the $L^1$ limit of piecewise linear approximations and $Z$ and $B$ are independent. For this selection of $R$, $\gamma^{(M,R)}$ is almost surely a $p$-rough path for any $p \in (2, 3)$ for almost every $\gamma$ because $\gamma^{(M,R)} = S_2(Z + M)$ and $S_2(Z + M)$ is almost surely a $p$-rough path for any $p \in (2, 3)$ for almost every sample path of $Z$ (Theorem 14.12 [7]). We did not require that $\gamma$ is a geometric rough path, so we could also let $\gamma = I_2(Z)$.
As a specific example when $\gamma$ is not a sample path of a martingale, suppose $B$ is a $d$-dimensional Brownian motion and $(X, B)$ is a $2d$-dimensional continuous Gaussian process with independent components. When the covariance function of $(X, B)$ has finite $\rho$-variation for some $\rho \in [1, \tfrac{3}{2})$, the process $(X, B)$ can be lifted to a $p$-rough process for any $p \in (2\rho, 3)$, and the lifted rough process is the $L^1$-limit of the signatures of the piecewise linear approximations (Theorem 15.33 [7]). Then we could let $\gamma$ be a sample path of the rough process above $X$ (e.g. fractional Brownian motion with Hurst parameter $H > 3^{-1}$) and let $M$ be the Brownian motion $B$.
Then condition (3.2) holds because the integral between $\pi_1(\gamma)$ and $M$ is the $L^1$ limit of the piecewise linear approximations, and $\gamma^{(M,R)}$ is almost surely a $p$-rough path for some $p \in (2, 3)$ based on Theorem 15.33 [7].
As mentioned before, we have two Stratonovich solutions on a small interval: one is driven by the signature of the original process and the other is driven by the joint signature of the original process plus a noise. Here the rough path $\gamma$ is (a sample path of) the signature of the original process, and $\gamma^{(M,R)}$ is the joint signature of the original process plus a noise. Suppose $f : \mathbb{R}^e \to L(\mathbb{R}^d, \mathbb{R}^e)$ is $\mathrm{Lip}(\beta)$ for $\beta > p$ and let $I_2(\gamma, M)$ denote the $p$-rough path for some $p \in [2, 3)$:
$$I_2(\gamma, M)_{s,t} := \big(1,\ \pi_1(\gamma_{s,t}),\ \pi_2(\gamma_{s,t}) - 2^{-1}\langle M \rangle_{s,t}\big).$$
Then by concatenating $F(y^{1,s,t}_{s,t}, y^{2,s,t}_{s,t})$ on small intervals and by letting the mesh of the partition tend to zero, we can recover $y$ in the limit. Yet in the real construction, the initial values in (3.5) and (3.6) are actually not $y_s$ (which is the pathwise Itô solution we would like to recover) but the value obtained from the last step of discrete concatenations of $\{F(y^{1,s,t}_{s,t}, y^{2,s,t}_{s,t})\}_{[s,t]}$. Here we use $y_s$ for illustration purposes, but discrete concatenations will create an error which propagates and the analysis will need some care.
Here $y^{1,s,t}$ and $y^{2,s,t}$ are what we call the Stratonovich solutions (on the small time interval $[s, t]$), and $y$ is called the Itô solution (on the large time interval $[0, T]$). They are not necessarily the usual pathwise Stratonovich resp. Itô solution (e.g. $\gamma$ could be a Gaussian rough path as in the example given above), and the convergence holds as long as the conditions of Theorem 3.3 below are satisfied. To recover the usual pathwise Itô solution (the RDE solution driven by the Itô signature of a continuous martingale), suppose $Z$ is a square integrable continuous martingale such that its bracket process $\langle Z \rangle$ has the expression $\int \psi_u^T \psi_u\,du$ for a matrix-valued process $\psi$, and $B$ is a Brownian motion independent from $Z$. We let $\gamma = S_2(Z)$, $M_\cdot = \int_0^\cdot \psi_s\,dB_s$, and define $\gamma^{(M,R)} := S_2(Z + M)$. In this case, $y^1$ and $y^2$ are pathwise Stratonovich solutions driven by the Stratonovich signatures $S_2(Z)$ and $S_2(Z + M)$ respectively, and $I_2(\gamma, M)$ coincides with $I_2(Z)$ (the Itô signature of $Z$), so $y$ is the pathwise Itô solution driven by the Itô signature $I_2(Z)$.
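The role of the selected noise $M = \int \psi\,dB$ is that it has the same bracket as $Z$ while being uncorrelated with it. A rough one-dimensional simulation of this, under assumptions that are ours and not the paper's: $\psi_u = 1 + u$ is a hypothetical deterministic choice, $Z$ is taken to be $\int \psi\,dB'$ for a second independent Brownian motion $B'$, and the brackets are approximated by sums of products of increments on a uniform grid.

```python
import math
import random


def simulate_brackets(T=1.0, n=20000, seed=0):
    """Discrete brackets of Z = int psi dB' and M = int psi dB on [0, T]."""
    rng = random.Random(seed)
    h = T / n
    qv_z = qv_m = cov_zm = 0.0
    for k in range(n):
        psi = 1.0 + k * h                          # hypothetical psi_u = 1 + u
        dz = psi * rng.gauss(0.0, math.sqrt(h))    # increment of Z
        dm = psi * rng.gauss(0.0, math.sqrt(h))    # increment of M, independent noise
        qv_z += dz * dz
        qv_m += dm * dm
        cov_zm += dz * dm
    return qv_z, qv_m, cov_zm


qv_z, qv_m, cov_zm = simulate_brackets()
exact = ((1.0 + 1.0) ** 3 - 1.0) / 3.0   # int_0^1 (1 + u)^2 du = 7/3
print(qv_z, qv_m, cov_zm, exact)
```

Both discrete brackets should be close to $7/3$ and the discrete cross bracket close to $0$, up to a sampling error that shrinks like $n^{-1/2}$.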
In the following we try to give a sketch of our idea which helps to motivate and clarify our arguments and is also useful for picking apart the proof of the main result. Based on Theorem 12.6 in [7], for the first level we have
$$\pi_1(y^1_{s,t}) \approx f(\pi_1(y_s))\,\pi_1(\gamma_{s,t}) + (Df)(f)(\pi_1(y_s))\,\pi_2(\gamma_{s,t}).$$
(The "$\approx$" indicates that two values are close up to a small error in pathwise sense, and the error will be made explicit in the proof.) Based on Definition 3.1, we have
$$E\,\pi_1(y^2_{s,t}) \approx f(\pi_1(y_s))\,\pi_1(\gamma_{s,t}) + (Df)(f)(\pi_1(y_s))\big(\pi_2(\gamma_{s,t}) + 2^{-1}\langle M \rangle_{s,t}\big),$$
$$\pi_1(y_{s,t}) \approx f(\pi_1(y_s))\,\pi_1(\gamma_{s,t}) + (Df)(f)(\pi_1(y_s))\big(\pi_2(\gamma_{s,t}) - 2^{-1}\langle M \rangle_{s,t}\big).$$
Then
$$\pi_1(y_{s,t}) \approx 2\pi_1(y^1_{s,t}) - E\,\pi_1(y^2_{s,t}). \tag{3.10}$$
Since we work with $p \in (2, 3)$, we have to consider the second level approximation as well. By following similar arguments as for the first level (again based on Theorem 12.6 in [7], but here we add in an extra term as in Definition 2.5), we have
$$\pi_2(y^1_{s,t}) \approx f(\pi_1(y_s)) \otimes f(\pi_1(y_s))\,\pi_2(\gamma_{s,t}),$$
$$E\,\pi_2(y^2_{s,t}) \approx f(\pi_1(y_s)) \otimes f(\pi_1(y_s))\big(\pi_2(\gamma_{s,t}) + 2^{-1}\langle M \rangle_{s,t}\big),$$
$$\pi_2(y_{s,t}) \approx f(\pi_1(y_s)) \otimes f(\pi_1(y_s))\big(\pi_2(\gamma_{s,t}) - 2^{-1}\langle M \rangle_{s,t}\big).$$
Then
$$\pi_2(y_{s,t}) \approx 2\pi_2(y^1_{s,t}) - E\,\pi_2(y^2_{s,t}). \tag{3.11}$$
Combining (3.10) and (3.11), we have that the linear expression holds:
$$y_{s,t} \approx 2y^1_{s,t} - E\,y^2_{s,t}. \tag{3.12}$$
There are other possible expressions of $y_{s,t}$ in terms of $y^1_{s,t}$ and $y^2_{s,t}$. For example, one may use the expression (3.13), which constitutes another approximation that is equivalent to (3.12) at leading order. Then it can be proved that, by concatenating the increments either in the form of (3.12) or in the form of (3.13) and by letting the mesh of the partition tend to zero, one will recover $y$ (the solution to (3.4)) in the limit, and the analysis in both cases is similar. There is some freedom to choose the expression of $y_{s,t}$ in terms of $y^1_{s,t}$ and $y^2_{s,t}$, and the convergence will hold as long as the error is small. We will work with small increments in the form of (3.13).
In the definition (3.14) of $y^D$, $y^{1,j}$ and $y^{2,j}$ denote the solutions to the rough differential equations on $[t_j, t_{j+1}]$. It is worth noting that (since $\gamma$ is fixed) $y^D$ is deterministic for each $D$.

Proofs
Our constants may implicitly depend on dimensions ($d$ and $e$). We specify the dependence on other constants (e.g. $C_p$), but the exact value of constants may change from line to line.

Results from rough paths theory
The Theorem below follows from Theorem 14.12 in [7] and Doob's maximal inequality.
solution to the rough differential equation
The Theorem below follows from Theorem 12.10 in [7].

Proofs of Theorem 3.3 and Corollary 3.6
Before proceeding to details of the proof of Theorem 3.3, we first give a sketch of the proof which may help to make the idea clearer. When $f$ is $\mathrm{Lip}(\beta)$ for $\beta > p$, for $\eta \in T^{(2)}(\mathbb{R}^e)$, we denote by $\pi_f(s, \eta)$ the unique solution to the RDE driven by $I_2(\gamma, M)$ along $f$ that starts at time $s$ from the initial value $\eta$, and we let $y^D$ be defined as in (3.14). Since $y^D$ by definition is piecewise constant and $Y$ is continuous, to prove the uniform convergence of $y^D$ to $Y$ as $|D| \to 0$ it is sufficient to prove that $y^D$ converge to $Y$ uniformly on $\{t_j\}_{j=0}^n$.
For each finite partition $D = \{t_j\}_{j=0}^n$ of $[0, T]$, we generate a sequence of RDE solutions driven by the same rough path $I_2(\gamma, M)$ along the same vector field $f$ but with different starting times $t_j$ and different initial values $y^D_{t_j}$, $j = 0, 1, \dots, n$. Then $Y$ resp. $y^D$ is the first resp. last solution in the sequence, and if we want to compare $y^D_{t_j}$ with $Y_{t_j}$ then we compare adjacent solutions in the sequence. Since the solution is unique, the difference between two adjacent solutions can be expressed as in (4.11). Based on Theorem 4.3, the difference between the two increments in the first term in (4.11) can be relegated to the difference between their first level initial values. Then combined with the expression of the second term in (4.11), we would need two elements in our proof: (1) an estimate of the one-step error $y^D_{t_i, t_{i+1}} - \pi_f(t_i, y^D_{t_i})_{t_i, t_{i+1}}$ for all $i$, (2) the uniform boundedness of $y^D$ in $D$, so that based on (4.9) and (4.11) the difference between $Y$ and $y^D$ can be bounded by a term comparable to the sum over $i$ of the one-step errors.
The estimate of the one-step errors mainly follows from Theorem 4.2. The uniform boundedness of $y^D$ in $D$ can be proved by mathematical induction. The reason that we can employ induction is that, based on (4.11), only the first $(k - 1)$ levels of $y^D$ contribute to the $k$th level difference in (4.10) because the 0th level of any solution is identically 1. Hence, by using the uniform boundedness of the first $(k - 1)$ levels of $y^D$, we can prove the $k$th level convergence of $y^D$ to $Y$ as $|D| \to 0$, which implies the uniform boundedness of the $k$th level of $y^D$ in $D$.
As a result, π k π f t i+1 , y D
Proof of Corollary 3.6. $(Z, M)$ is a $2d$-dimensional continuous martingale w.r.t. the filtration generated by $Z$ and $B$, so it can be enhanced by its Stratonovich integrals to a process whose sample paths are almost surely $p$-rough paths for any $p \in (2, 3)$. Suppose $Z$ is in $L^{4+\epsilon}$ for some $\epsilon > 0$. Based on Theorem 4.1, we get a chain of inequalities (with $p := 2 + 2^{-1}$). The second inequality holds because $M$ is defined to be the Itô integral $\int \psi\,dB$ for the matrix-valued process $\psi$ satisfying $\int \psi^T \psi\,du = \langle Z \rangle$ and the $d$-dimensional Brownian motion $B$ is independent from $Z$, so we have $\langle Z, M \rangle_T = 0$ a.s. and $\langle M \rangle_T = \langle Z \rangle_T$ a.s.