A support and density theorem for Markovian rough paths

We establish two results concerning a class of geometric rough paths $\mathbf{X}$ which arise as Markov processes associated to uniformly subelliptic Dirichlet forms. The first is a support theorem for $\mathbf{X}$ in $\alpha$-H\"older rough path topology for all $\alpha \in (0,1/2)$, which answers in the positive a conjecture of Friz-Victoir (2010). The second is a H\"ormander-type theorem for the existence of a density of a rough differential equation driven by $\mathbf{X}$, the proof of which is based on analysis of (non-symmetric) Dirichlet forms on manifolds.


Introduction
Consider a symmetric Dirichlet form
$$\mathcal{E}(f,g) = \int_{\mathbb{R}^d} \sum_{i,j=1}^d a^{i,j}\, \partial_i f\, \partial_j g \, d\lambda \tag{1.1}$$
on $L^2(\mathbb{R}^d, \lambda)$, where $\lambda$ is the Lebesgue measure and $a$ is a measurable, uniformly elliptic function taking values in the space of symmetric $d \times d$ matrices (we make our set-up precise in Section 1.1). It is well known that there exists a symmetric Markov process $X$ in $\mathbb{R}^d$ associated with $\mathcal{E}$; see [FOT11] for a general construction of $X$ and [Str88] for fundamental analytic properties of $\mathcal{E}$.
We are interested in differential equations of the form
$$dY_t = V(Y_t)\,dX_t, \quad Y_0 = y_0 \in \mathbb{R}^e, \tag{1.2}$$
driven by $X$ along vector fields $V = (V_1, \dots, V_d)$ on $\mathbb{R}^e$. When $a$ is taken sufficiently smooth, the process $X$ can be realised as a semimartingale for which the classical framework of Itô gives meaning to the equation (1.2). However, for irregular functions $a$, this is no longer the case, and (1.2) falls outside the scope of Itô calculus.
One of the applications of Lyons' theory of rough paths [Lyo98] has been to give meaning to differential equations driven by processes outside the range of semimartingales. One viewpoint of rough paths theory is that it factors the problem of solving equations of the type (1.2) into two steps: first enhancing $X$ to a rough path by appropriately defining its iterated integrals (which is typically done through stochastic means), and then solving (1.2) deterministically.
Probabilistic methods to enhance the Markov process $X$ to a rough path and the study of its fundamental properties appear in [LS99,BHL02,Lej06,Lej08], where primarily the forward-backward martingale decomposition is used to show existence of the stochastic area. A somewhat different approach, which we follow here, is taken in [FV08] where the authors define $X$ directly as a diffusion on the free nilpotent Lie group $G^N(\mathbb{R}^d)$ (in particular the iterated integrals are given directly in the construction). One can show that in the situation mentioned at the start, the two methods give rise to equivalent definitions of rough paths. The latter construction in fact yields further flexibility in that the evolution of $X$ can depend in a non-trivial way on its higher levels (its iterated integrals). Note that this is a feature shared with Lévy rough paths studied in [FS17,Che18]. Markovian rough paths have also recently been investigated in [CO17,CL16] in connection with the accumulated local $p$-variation functional and the moment problem for expected signatures.
The goal of this paper is to contribute two results to the study of Markovian rough paths in the sense of [FV08]. Our first contribution (Theorem 2.11) answers in the positive a conjecture about the support of X in α-Hölder rough path topology. Such a support theorem appeared in [FV08] for α ∈ (0, 1/6), and was improved to α ∈ (0, 1/4) in [FV10] where it was conjectured to hold for α ∈ (0, 1/2) in analogy to enhanced Brownian motion. Comparing our situation to the case of Gaussian rough paths, where such support theorems are known with sharp Hölder exponents (see e.g., [FV10,Sec. 15.8], and [FGGR16] for recent improvements), the difficulty of course lies in the lack of a Gaussian structure, in particular the absence of a Cameron-Martin space.
Our solution to this problem relies almost entirely on elementary techniques. Indeed, we first show that any stochastic process (taking values in a Polish space) admits explicit lower bounds on the probability of keeping a small $\alpha$-Hölder norm, provided that it satisfies lower and upper bounds on certain transition probabilities comparable to Brownian motion. This is made precise by conditions (1) and (2) and Theorem 2.5. We then verify these conditions for the translated rough path $T_h(X)$ (which is in general non-Markov, see Remark 2.8) for any $h \in W^{1,2}([0,T], \mathbb{R}^d)$ using heat kernel estimates of $X$ (we also note that, just like for enhanced Brownian motion, all relevant constants depend on $h$ only through $\|h\|_{W^{1,2}}$).
As usual, in combination with the continuity of the Itô-Lyons map from rough paths theory, an immediate consequence of improving the Hölder exponent in the support theorem for $X$ is a stronger Stroock-Varadhan support theorem (in $\alpha$-Hölder topology) for the solution $Y$ to the rough differential equation (RDE) (1.2), together with lower regularity assumptions on the driving vector fields $V$ ($\mathrm{Lip}^2$ instead of $\mathrm{Lip}^4$).
Our second contribution (Theorem 3.4 and its Corollary 3.8) may be seen as a non-Gaussian Hörmander-type theorem, and provides sufficient conditions on the driving vector fields $V = (V_1, \dots, V_d)$ under which the solution to the RDE (1.2) admits a density with respect to the Lebesgue measure on $\mathbb{R}^e$. Once again, while this result is reminiscent of density theorems for RDEs driven by Gaussian rough paths (e.g., [BH07,CF10,CHLT15]), the primary difference in our setting is that methods from Malliavin calculus are no longer available due to the lack of a Gaussian structure.
We replace the use of Malliavin calculus by direct analysis of (non-symmetric) Dirichlet forms on manifolds. Indeed, we identify conditions under which the couple $(X, Y)$ admits a density on its natural state-space, and conclude by projecting to $Y$. We note however that our current result gives no quantitative information about the density beyond its existence (not even for the couple $(X, Y)$), and we strongly suspect that the method can be improved to yield further information (particularly $L^p$ bounds and regularity results in the spirit of the De Giorgi-Nash-Moser theorem).
1.1. Notation. Throughout the paper, we adopt the convention that the domain of a path $x : [0,T] \to E$, for $T > 0$ and a set $E$, is extended to all of $[0, \infty)$ by setting $x_t = x_T$ for all $t > T$. For a metric space $(E, d)$, $r \ge 0$, and $x \in E$, we denote the ball $B(x, r) = \{y \in E \mid d(x,y) \le r\}$.
We let $G = G^N(\mathbb{R}^d)$ denote the step-$N$ free nilpotent Lie group over $\mathbb{R}^d$ for some $N \ge 2$, and let $U_1, \dots, U_d$ be a set of generators for its Lie algebra $\mathfrak{g} = \mathfrak{g}^N(\mathbb{R}^d)$, which we identify with the space of left-invariant vector fields on $G$. We equip $\mathbb{R}^d$ with the inner product for which $U_1, \dots, U_d$ form an orthonormal basis upon canonically identifying $\mathbb{R}^d$ with a subspace of $\mathfrak{g}$.
We equip $G$ with the corresponding Carnot-Carathéodory metric $d$. Let $1_G$ denote the identity element of $G$ and let $\lambda$ denote the Haar measure on $G$ normalised so that $\lambda(B(1_G, 1)) = 1$.
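As a concrete instance of this set-up, the following sketch writes out the generators in the smallest non-trivial case $N = 2$, $d = 2$, where $G$ is the Heisenberg group. The global coordinates $(x^1, x^2, x^3)$ on $G \cong \mathbb{R}^3$, with $x^3$ playing the role of the Lévy area, are a choice made here for illustration and are not fixed by the text.

```latex
% Illustration only: G = G^2(R^2) in coordinates (x^1, x^2, x^3),
% with the (assumed) group law
%   x * y = (x^1 + y^1, x^2 + y^2, x^3 + y^3 + (x^1 y^2 - x^2 y^1)/2).
\begin{align*}
U_1 &= \partial_1 - \tfrac{1}{2} x^2\, \partial_3, &
U_2 &= \partial_2 + \tfrac{1}{2} x^1\, \partial_3, \\
[U_1, U_2] &= \partial_3, &
[U_i, [U_1, U_2]] &= 0 \quad (i = 1, 2).
\end{align*}
% U_1 and U_2 are left-invariant; together with their bracket they span
% the tangent space at every point, and all brackets of length 3 vanish.
```

In these coordinates the Haar measure is a constant multiple of the Lebesgue measure on $\mathbb{R}^3$.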
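For $\Lambda > 0$, let $\Xi(\Lambda) = \Xi^{N,d}(\Lambda)$ denote the set of measurable functions $a$ on $G$ which take values in the space of symmetric $d \times d$ matrices and which are sub-elliptic in the following sense:
$$\Lambda^{-1} |\xi|^2 \le \langle \xi, a(x)\xi \rangle \le \Lambda |\xi|^2 \quad \text{for all } \xi \in \mathbb{R}^d,\ x \in G.$$
For $a \in \Xi(\Lambda)$, we define the associated Dirichlet form $\mathcal{E} = \mathcal{E}^a$ on $L^2(G, \lambda)$ by
$$\mathcal{E}^a(f,g) = \int_G \sum_{i,j=1}^d a^{i,j}\, (U_i f)(U_j g) \, d\lambda \tag{1.3}$$
for all $f, g \in C^\infty_c(G)$.
We let $X = X^{a,x}$ denote the Markov diffusion on $G$ associated to $\mathcal{E}$ with starting point $X_0 = x \in G$. We recall that the sample paths of $X$ are a.s. geometric $\alpha$-Hölder rough paths for all $\alpha \in (0, 1/2)$, and when $a(x)$ depends only on the level-1 projection $\pi_1(x) \in \mathbb{R}^d$ of $x \in G$, $X$ serves as the natural rough path lift of the Markov diffusion associated to the Dirichlet form (1.1) on $L^2(\mathbb{R}^d)$ discussed earlier.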
For further details, we refer to [FV10].
Remark 1.1. Throughout the paper we assume the symmetric Dirichlet form (1.3) is defined on the Hilbert space $L^2(G, \lambda)$ so that $X$ is symmetric with respect to $\lambda$. As pointed out in [CO17], it is natural to also consider $\mathcal{E}$ defined over $L^2(G, \mu)$ for a measure $\mu(dx) = v(x)\lambda(dx)$, $v \ge 0$. While for simplicity we only work with $\mathcal{E}$ defined on $L^2(G, \lambda)$, we note that appropriate assumptions on $v$ and a Girsanov transform (see, e.g., [Fit97]) can be used to relate the results of this paper to this more general setting.

2.2. Positive probability of small Hölder norm. Suppose now $(E, d)$ is a Polish space. In this section, we give conditions under which an $E$-valued process has an explicit positive probability of keeping a small Hölder norm. We fix $\alpha \in (0, 1/2)$, a terminal time $T > 0$, and an $E$-valued stochastic process $X$ adapted to a filtration $(\mathcal{F}_t)_{t \in [0,T]}$. Consider the following conditions:
(1) There exists $C_1 > 0$ such that for every $c, \varepsilon > 0$, and every Hölder stopping time $\tau$ of $X$, a.s.
Roughly speaking, the first condition states that the probability of large fluctuations of $X$ over small time intervals should have the same Gaussian tails as those of a Brownian motion, while the second condition bounds from below the probability that $X_{s+\varepsilon}$ is in a ball of radius $\sim \varepsilon^{1/2}$ given that $X_s$ was in the same ball.
Proof. Let $\tau_n = \tau_n^{\varepsilon,\gamma,s}$ be defined as in Definition 2.1 with $\tau_0 = s$. Note that (1) implies that for all $c, \gamma > 0$, $t > s$ and $\varepsilon \in (0, t-s]$,

In particular, choosing $c = 2\gamma\varepsilon^{\alpha}$ yields that for all $\gamma > 0$, $t > s$, and $\varepsilon \in (0, t-s]$,

The conclusion now follows from Lemma 2.3 and the observation that for every

(which can be seen, for example, by the integral test and the asymptotic behaviour of the incomplete gamma function $\Gamma(p, K)$).
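To see what the choice $c = 2\gamma\varepsilon^{\alpha}$ buys in the simplest case, the following sketch takes $X$ to be a standard one-dimensional Brownian motion $B$. This is an assumption made purely for illustration; the constants are those of Brownian motion and not those of condition (1).

```latex
% Illustration only: B is a standard one-dimensional Brownian motion.
\begin{align*}
\mathbb{P}\Big[\sup_{t \in [s, s+\varepsilon]} |B_t - B_s| > c\Big]
  &\le 2\,\mathbb{P}\big[|B_{s+\varepsilon} - B_s| > c\big]
  % reflection principle applied to B and to -B
  \le 2 e^{-c^2/(2\varepsilon)},
  % Gaussian tail bound P[N(0,1) > u] <= exp(-u^2/2)/2 for u >= 0
  \\
\mathbb{P}\Big[\sup_{t \in [s, s+\varepsilon]} |B_t - B_s| > 2\gamma\varepsilon^{\alpha}\Big]
  &\le 2 \exp\big(-2\gamma^2 \varepsilon^{2\alpha - 1}\big).
\end{align*}
```

Since $\alpha < 1/2$, the exponent $2\alpha - 1$ is negative, so the right-hand side decays faster than any power of $\varepsilon$ as $\varepsilon \to 0$.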
2.3. Support theorem for Markovian rough paths. We now turn to the support theorem for Markovian rough paths in α-Hölder topology, which we state in Theorem 2.11 at the end of this section.
Proposition 2.7. Let $h \in W^{1,2}$. There exists a constant $C_{2.7} > 0$, depending only on $\Lambda$, $\|h\|_{W^{1,2}}$, $\alpha$, and $T$, such that for all $a \in \Xi(\Lambda)$, $x \in G$, and $\gamma > 0$

For the proof, let us fix $h \in W^{1,2}$ and a filtration $(\mathcal{F}_t)_{t \in [0,T]}$ to which $X$ (and thus $T_h(X)$) is adapted (e.g., the natural filtration generated by $X$).
Remark 2.8. If $a(x)$ depends only on the first level $\pi_1(x)$ for all $x \in G$, then $T_h(X)$ is a (non-symmetric, time-inhomogeneous) Markov process. In general, however, $T_h(X)$ is non-Markov. The reason is that, for any fixed $t \in (0, T]$, the sigma-algebra $\sigma(X_t)$ is not necessarily contained in $\sigma(T_h(X)_t)$, i.e., information on whether $T_h(X)_t \in A$ for Borel subsets $A \subset G$ does not yield full information about $X_t$, which is necessary to determine the evolution of $X$, and thus of $T_h(X)$.
where $C_F$ depends only on $\Lambda$ and $p$. We now prove two lemmas which demonstrate that the process $T_h(X)$ satisfies conditions (1) and (2).
Lemma 2.9. There exists a constant $C > 0$, depending only on $\Lambda$, such that for all $c, \varepsilon > 0$ satisfying

it holds that for every stopping time $\tau$, a.s.
Proof. Suppose $c, \varepsilon > 0$ satisfy (2.4).

Lemma 2.10. For all $C \ge C_0(\Lambda, \|h\|_{W^{1,2}}) > 0$, there exists $c = c(C, \Lambda, \|h\|_{W^{1,2}}) > 0$ such that for all $x \in G$, $s \in [0, T]$, and $\varepsilon \in (0, T-s]$, a.s.

Proof. We use the shorthand notation $Y = T_h(X)$. For every $x, y \in G$, consider a geodesic $\gamma^{y,x} : [0,1] \to G$ with $\gamma^{y,x}_0 = y$ and $\gamma^{y,x}_1 = x$ parametrised at constant speed. Let $z(y,x) := \gamma^{y,x}_{1/2}$ denote its midpoint. For any $x \in G$, observe that if $Y_s \in B(x, r)$, then evidently $d(z(Y_s, x), x) \le r/2$. Moreover, since $G$ is a homogeneous group and due to our normalisation of $\lambda$, it holds that $\lambda(B(x, r)) = r^Q$ for all $r \ge 0$ and $x \in G$, where $Q \ge 1$ is the homogeneous dimension of $G$. Recall also the lower bound on the heat kernel [FV10, Thm. 16.11]

where $C_l > 0$ depends only on $\Lambda$. It follows that there exists $C_1 > 0$, depending only on $\Lambda$, such that, for any $r, \varepsilon > 0$ and $y \in B(1_G, r/2)$,

Finally, by standard rough paths estimates (using that $T_h(X)_{s,t}$ is equal to $X_{s,t}$ plus a combination of cross-integrals of $X$ and $h$) and (2.3), for any $2 < p < N+1$,

It follows that if $C$ and $R$ furthermore satisfy

We now observe that due to the factor $R^{(N-1)/N}$ in (2.5) above, there exists $C_0 > 0$, depending only on $\|h\|_{W^{1,2}}$ and $\Lambda$, such that for every $C \ge C_0$, we can find $R > 0$ for which (2.5) and (2.6) are satisfied.
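The identity $\lambda(B(x, r)) = r^Q$ used in the proof can be unpacked as follows. This is a sketch under standard conventions that are assumed here rather than stated in the text: $\mathfrak{g} = \mathfrak{g}_1 \oplus \dots \oplus \mathfrak{g}_N$ denotes the grading of the Lie algebra and $\delta_r$ the dilation by $r > 0$ on $G$.

```latex
% Assumed notation (not from the text): g_k is the k-th layer of the grading
% of the Lie algebra, and delta_r is the dilation by r > 0 on G.
\begin{align*}
Q &= \sum_{k=1}^{N} k \,\dim \mathfrak{g}_k,
  & \lambda(\delta_r A) &= r^{Q}\, \lambda(A) \quad \text{for Borel } A \subseteq G, \\
B(x, r) &= x\, \delta_r B(1_G, 1),
  & \lambda(B(x, r)) &= r^{Q}\, \lambda(B(1_G, 1)) = r^{Q}.
\end{align*}
% The last line uses left-invariance of lambda and the normalisation lambda(B(1_G,1)) = 1.
% Example: for N = 2, dim g_1 = d and dim g_2 = d(d-1)/2, hence Q = d + d(d-1) = d^2.
```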
Proof of Proposition 2.7. By Theorem 2.5, it suffices to check that $T_h(X)$ satisfies conditions (1) and (2) with constants $C_1$, $c_2$, $C_2$ depending only on $\Lambda$ and $\|h\|_{W^{1,2}}$. This follows directly from Lemmas 2.9 and 2.10.
Theorem 2.11. Let $\gamma, R > 0$. It holds that

where $d_{\alpha\text{-Höl};[0,T]}$ denotes the (homogeneous) $\alpha$-Hölder metric and $S_N(h)$ is the level-$N$ lift of $h$. In particular, the support of $X^{a,x}$ in $\alpha$-Hölder topology is precisely the closure of $\{x S_N(h) \mid h \in W^{1,2}\}$.

Proof. By uniform continuity of the map $(x, h) \mapsto T_h(x)$ on bounded sets [FV10, Cor. 9.35], and the fact that

The bound (2.7) then follows from Proposition 2.7. As a consequence, we see that the support of $X^{a,x}$ contains the closure of $\{x S_N(h) \mid h \in W^{1,2}\}$. The reverse inclusion follows from the fact that $X^{a,x}$ is a.s. a geometric $\alpha$-Hölder rough path, and is therefore the limit in the $d_{\alpha\text{-Höl};[0,T]}$ metric of lifts of smooth paths.
Remark 2.12. The main difference with the approach taken in [FV08, Thm. 50] and [FV10, Thm. 16.33] to prove a bound of the form $\mathbb{P}[d_{\alpha\text{-Höl}}(X, S_N(h)) < \gamma] > 0$ (with $\alpha \in [0, 1/6)$ and $\alpha \in [0, 1/4)$ respectively) is that we do not rely on a support theorem in the uniform topology. As a consequence, our analysis is more delicate but does not lose any power at each step, which allows us to push to the sharp Hölder exponent range $\alpha \in [0, 1/2)$. Note also that [FV10, Thm. 16.39] and [FV08, Cor. 46] give this bound for $h \equiv 0$ with the sharp range $\alpha \in [0, 1/2)$. The proof therein relies crucially on lower and upper bounds on the probability that $X$ stays in small balls, namely $\mathbb{P}^{a,x}[\|X\|_{0;[0,t]} < \gamma] \asymp e^{-\lambda(\gamma) t \gamma^{-2}}$ with $0 < \lambda_{\min} \le \lambda(\gamma) \le \lambda_{\max} < \infty$, which yields a version of Lemma 2.6 for the untranslated process $X^{a,x}$ conditioned to stay in a small ball around $x$. This argument is rather sensitive to the fact that for each fixed $\gamma > 0$ the same quantity $\lambda(\gamma)$ appears in the lower and upper bounds; this is not true for the translated process $T_h(X)$, which is the reason for our different strategy.

Density theorem
Let $\mu$ be a smooth measure on $O$ and define the bilinear map

where $W_j^* = -W_j - \operatorname{div}_\mu W_j$ is the formal adjoint of $W_j$ with respect to $\mu$. In the following lemma, the $L^p$ norm $\|\cdot\|_p$ for $p \in [1, \infty]$ is assumed to be on $L^p(U, \mu)$. For background concerning (non-symmetric, semi-)Dirichlet forms, we refer to [Osh13].
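The formula for $W_j^*$ is an integration by parts. A short derivation is sketched below, assuming the usual convention that $\operatorname{div}_\mu W$ is characterised by $\int_O W\varphi \, d\mu = -\int_O \varphi \operatorname{div}_\mu W \, d\mu$ for all $\varphi \in C^\infty_c(O)$ (the text does not spell out this convention).

```latex
% Assumption: div_mu W is defined by  \int W(phi) dmu = -\int phi div_mu(W) dmu
% for all compactly supported smooth phi. Take f, g in C_c^infty(O).
\begin{align*}
\int_O (W_j f)\, g \, d\mu
  &= \int_O W_j(fg) \, d\mu - \int_O f\, (W_j g) \, d\mu
  % Leibniz rule: W_j(fg) = (W_j f) g + f (W_j g)
  \\
  &= -\int_O f g \operatorname{div}_\mu W_j \, d\mu - \int_O f\, (W_j g) \, d\mu
  % definition of div_mu applied to phi = fg
  \\
  &= \int_O f \,\big(-W_j g - g \operatorname{div}_\mu W_j\big) \, d\mu
   = \int_O f \, (W_j^* g) \, d\mu.
\end{align*}
```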
Lemma 3.1. The bilinear form $\mathcal{E}$ is closable in $L^2(U, \mu)$, lower bounded, and satisfies the sector condition. Denote by $P_t$ the associated (strongly continuous) semi-group on $L^2(U, \mu)$. Suppose further that $P_t$ is sub-Markov (so that the closed extension of $\mathcal{E}$ is a lower-bounded semi-Dirichlet form) and maps $C_b(U)$ into itself. Then there exist $\nu > 2$ and $b > 0$ such that for every $x \in U$ and $t > 0$ there exists

The proof of Lemma 3.1 is based on the sub-Riemannian Sobolev inequality combined with a classical argument of Nash [Nas58]. We believe this result should be standard, but as we were unable to find a sufficiently similar form in the literature, we prefer to give a proof in Appendix A (see [SCS91,Stu95] for closely related results in the case that $\mathcal{E}$ is symmetric or positive semi-definite).
Note also that in the sequel, namely in the proof of Theorem 3.4, we will only require the fact from Lemma 3.1 that the kernel $p_t$ exists. The bound on $\|p_t(x, \cdot)\|_2$ is merely a free consequence of the proof of its existence.
3.2. Density for RDEs. We now specialise to the setting of Markovian rough paths. Recall Notation 1.1 and consider the RDE
$$dY_t = V(Y_t)\,dX_t \tag{3.2}$$
for smooth vector fields $V = (V_1, \dots, V_d)$ on $\mathbb{R}^e$. We suppose also that $V$ are $\mathrm{Lip}^2$ so that (3.2) admits a unique solution. We fix also the starting point

For the reader's convenience, we recall the Nagano-Sussmann orbit theorem (see, e.g., [AS04, Chpt. 5]).
A particularly useful consequence of the orbit theorem is the following.
Corollary 3.3. Let notation be as in Theorem 3.2. It holds that $\operatorname{Lie}_z W = T_z O$ for all $z \in O$ if and only if $\dim \operatorname{Lie}_z W$ is constant in $z \in O$.

Proof. The fact that $\operatorname{Lie}_z W \subseteq T_z O$ and the "only if" implication are obvious. For the "if" implication, suppose $\dim \operatorname{Lie}_z W$ is constant in $z \in O$. Then $\operatorname{Lie} W$ defines a distribution on $O$ (a subbundle of the tangent bundle), so the Frobenius theorem implies that $\operatorname{Lie} W$ arises from a regular foliation of $O$. However, each leaf of this foliation is itself an orbit of $W$. Therefore the foliation contains only one leaf, namely $O$, which concludes the proof.
Consider the manifold $G \times \mathbb{R}^e$. We canonically identify the tangent space $T_{(x,y)}(G \times \mathbb{R}^e)$ with $T_x G \oplus T_y \mathbb{R}^e$ and define smooth vector fields on $G \times \mathbb{R}^e$ by $W_i = U_i + V_i$. Let $z_0 = (x_0, y_0) \in G \times \mathbb{R}^e$ and denote by $O = O_{z_0}$ the orbit of $z_0$ under the collection $W = (W_1, \dots, W_d)$.
Denote by $Z_t = (X_t, Y_t)$ the couple, which is a Markov process on $G \times \mathbb{R}^e$. One can readily show that a.s. $Z^{z_0}_t \in O$ for all $t > 0$ (e.g., by approximating each sample path of $X$ in $p$-variation for some $p > 2$ by piecewise geodesic paths).

The proof of Theorem 3.4 will be given at the end of this section. We first state several remarks and a consequence of the theorem.
Remark 3.5. Note that from Notation 1.1 we always consider $G = G^N(\mathbb{R}^d)$ with $N \ge 2$. However, in the special case that $a(x)$ depends only on the first level $\pi_1(x)$ for all $x \in G^N(\mathbb{R}^d)$, the identical statement in Theorem 3.4 holds for the process $Z_t = (\pi_1(X_t), Y_t) \in \mathbb{R}^d \times \mathbb{R}^e$ (the conditions change by substituting $G$ by $\mathbb{R}^d$ everywhere). The reason for this is that Lemma 3.13 below can be readily adjusted to give analogous infinitesimal behaviour of the process $Z_t$ (now taking values in $O \subseteq \mathbb{R}^d \times \mathbb{R}^e$), after which the proof of the theorem carries through without change.
For a statement of the density of $Y_t$ itself, let $O' \subseteq \mathbb{R}^e$ denote the orbit of $y_0 \in \mathbb{R}^e$ under $V$.
Lemma 3.6. Suppose $Z^{z_0}_t$ admits a density with respect to a smooth measure on $O$. Then $Y_t$ admits a density with respect to any smooth measure on $O'$.
Proof. By the description of the tangent space $T_z O$ in Theorem 3.2, it holds that the projection $p_2 : O \to O'$, $(x, y) \mapsto y$, is a (surjective) submersion (in fact a smooth fibre bundle) from $O$ to $O'$. The conclusion follows from the fact that pre-images of null-sets under submersions are null-sets for smooth measures.
Moreover, the condition in Theorem 3.4 may be restated in terms of just the driving vector fields $V = (V_1, \dots, V_d)$ as follows.

Proof. Since the vector fields $U_1, \dots, U_d$ are freely step-$N$ nilpotent and generate the tangent space of $G$, observe that

Combining Theorem 3.4 with Lemmas 3.6 and 3.7, we obtain the following corollary.
Corollary 3.8. Suppose condition (3.3) holds. Then for all $t > 0$, the RDE solution $Y_t$ admits a density with respect to any smooth measure on $O'$.
Remark 3.9. Note that $O' = \mathbb{R}^e$ whenever $V$ satisfies Hörmander's condition on $\mathbb{R}^e$, in which case every smooth measure is equivalent to the Lebesgue measure.
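For a concrete instance of the situation in Remark 3.9, consider the following hypothetical pair of vector fields on $\mathbb{R}^2$ (so $d = e = 2$; they are chosen here for illustration and are not taken from the text). They satisfy Hörmander's condition everywhere, although $V_1, V_2$ alone fail to span the tangent space on the lines $\{y^1 \in \pi\mathbb{Z}\}$. The sketch only checks Hörmander's condition for $V$; it does not verify condition (3.3).

```latex
% Hypothetical example on R^2 with coordinates y = (y^1, y^2).
\begin{align*}
V_1 &= \partial_{y^1}, \qquad V_2 = \sin(y^1)\, \partial_{y^2}, \\
[V_1, V_2] &= \cos(y^1)\, \partial_{y^2}, \\
\operatorname{span}\{V_1, V_2, [V_1, V_2]\}(y) &= \mathbb{R}^2
  \quad \text{for all } y \in \mathbb{R}^2.
\end{align*}
% The span is full because sin and cos never vanish simultaneously.
% Both fields are smooth and bounded with bounded derivatives of all orders.
```

In this example $O' = \mathbb{R}^2$ for every starting point $y_0$.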
Remark 3.10. Following Remark 3.5, in the case that $a(x)$ depends only on the first level $\pi_1(x)$, we are able to take $N = 1$ in (3.3) when applying Corollary 3.8.
Remark 3.11. Note that while (3.3) (for any $N \ge 0$) implies that $V$ satisfies Hörmander's condition on $O'$, the reverse implication is clearly not true. In particular, we do not know if it is sufficient for $V$ to only satisfy Hörmander's condition on $O'$ in order for $Y_t$ to admit a density on $O'$. The difficulty of course is that unless (3.3) is satisfied, the couple $(X_t, Y_t)$ will in general not admit a density in $O$, whereby our method of proof breaks down.
For the proof of Theorem 3.4, we first recall for the reader's convenience the infinitesimal behaviour of the coordinate projections of $X^a$. As before, let $\lambda$ denote the Haar measure on $G$.
Proof. This is [FV08,Lem. 27 is almost everywhere finite and gives precisely the transition kernel of the Markov process Z t in O with respect to µ.
Remark 3.14. The pre-compact subsets $U_n$ were considered in the proof only to obtain existence of $p^n_t$ from Lemma 3.1 for each $n \ge 1$. We could have avoided considering such a compact exhaustion by formulating Lemma 3.1 without a pre-compactness assumption on $U$ (however, at least without extra assumptions, the proof of such a formulation itself would seem to require a compact exhaustion).
Since $W$ satisfies Hörmander's condition on $O$, recall that for every $x \in O$ there exist constants $\nu_x > 2$, $C_x > 0$, and a neighbourhood $U_x$ of $x$ with $\mu(U_x) < \infty$ such that for all $f \in C^\infty_c(U_x)$ (see, e.g., [Stu95, p. 296])
$\int_{U_x} |f|^{2\nu_x/(\nu_x - 2)} \, d\mu$
Since $U$ is pre-compact, it is routine to patch together such inequalities using a partition of unity and apply interpolation to arrive at the following Sobolev inequality.