Large deviations for fractional volatility models with non-Gaussian volatility driver

We study stochastic volatility models in which the volatility process is a function of a continuous fractional stochastic process, which is an integral transform of the solution of an SDE satisfying the Yamada-Watanabe condition. We establish a small-noise large deviation principle for the log-price, and, for a special case of our setup, obtain logarithmic call price asymptotics for large strikes.


Introduction
Recently, there has been a surge of interest in using stochastic Volterra equations for financial modelling, with asymptotic approximations being a popular subject of research; see the introductions of [13,12] for many references. While small-noise large deviations for such equations are well studied for Lipschitz coefficients [17,18,20,21], results for processes that involve non-Lipschitz functions in their dynamics are scarce. In the papers [9] and [11], concrete models with finite-dimensional parameter spaces are considered, whereas [5,10,13,12,14] study models where volatility is a function of a Gaussian process. In the present paper, we assume that the volatility process is a function of
$$\hat V_t = \int_0^t K(t,s)\,U(V_s)\,ds, \qquad 0 \le t \le T, \tag{1.1}$$
where U is a continuous non-negative function, assumptions on the kernel K will be specified below, and V solves a one-dimensional SDE satisfying the Yamada-Watanabe condition. A (semi-)explicit generating function, as is available in the rough resp. fractional Heston models considered in [9,11], is not required. Also, our process $\hat V$ is clearly non-Gaussian in general, which sets our results apart from the related papers with Gaussian drivers mentioned above. While our setup allows a lot of freedom in choosing the diffusion V and the other ingredients, we note that truly rough models are not covered, because (1.1) is a Lebesgue integral and not an integral w.r.t. Brownian motion. However, the models that we are considering may be rough at t = 0 (see Remark 4.2).

The stock price is given by
$$dS_t = S_t\,\sigma(\hat V_t)\,(\bar\rho\,dW_t + \rho\,dB_t), \qquad S_0 = 1.$$
Here, B, W are independent standard Brownian motions, $\rho \in (-1, 1)$ and $\bar\rho = \sqrt{1-\rho^2}$. The extension to arbitrary $S_0 > 0$ is straightforward. We now specify the conditions under which our main results, Theorems 1.6 and 1.7 below, are valid. Assumptions 1.1, 1.3 and 1.4 are in force throughout the paper. We note that the model defined in Section 2 of [4] is a special case of our model, but the aim of that paper is quite different from ours.

Then, K is a Volterra kernel in the sense of [13] resp. [12]. Of course, these conditions have been used earlier; e.g., (b) and (c) are part of the definition of a Volterra type Gaussian process in [15,16]. It is a standard fact that the associated integral operator is compact from $L^2[0,T]$ into $C[0,T]$; see e.g. Lemma 2 of [13] for a proof. A standard example of a kernel satisfying Assumption 1.1 is the fractional kernel $\Gamma(H+\tfrac12)^{-1}(t-s)^{H-1/2}$, $0 \le s \le t$, with Hurst parameter $H \in (0,1)$. We note that Γ denotes the gamma function here, whereas later we will use the letter Γ for the solution map of the ODE (1.17) below.

Definition 1.2. Let ω be an increasing modulus of continuity on $[0,\infty)$, that is, $\omega : \mathbb{R}_+ \to \mathbb{R}_+$ is an increasing function such that $\omega(0) = 0$ and $\lim_{s\to 0}\omega(s) = 0$. A function h defined on $\mathbb{R}$ is called locally ω-continuous if for every δ > 0 there exists a number L(δ) > 0 such that for all $x, y \in [-\delta,\delta]$,
$$|h(x)-h(y)| \le L(\delta)\,\omega(|x-y|). \tag{1.7}$$
is continuous, and σ is a positive function on $\mathbb{R}_+$ that is locally ω-continuous for some modulus of continuity ω as in Definition 1.2.
The process V is assumed to solve the SDE
$$dV_t = \bar b(V_t)\,dt + \bar\sigma(V_t)\,dB_t, \qquad V_0 = v_0,$$
and Here, the sub-linear growth at ∞ is understood in the sense that for every $x_0$ there exists a µ such that for all $x > x_0$ we have (R2) The drift coefficient $\bar b : \mathbb{R} \to \mathbb{R}$ is locally Lipschitz continuous, has sub-linear growth at ∞, and $\bar b(0) > 0$.
Next, introducing a small-noise parameter ε > 0, we define the scaled version $V^\varepsilon$ of the process V by (1.10) and the scaled stock price by Here, we write $\hat V^\varepsilon$ for the process
$$\hat V^\varepsilon_t = \int_0^t K(t,s)\,U(V^\varepsilon_s)\,ds. \tag{1.12}$$
The scaled log-price process $X^\varepsilon = \log S^\varepsilon$, which is the process of interest for our large deviations analysis, is now given by (1.13) and the integral representation is as follows: (1.14)

Definition 1.5. In addition to K from (1.6), we define the integral operators Clearly, we have $\check g = \hat v$, where v solves the ODE (1.17). Moreover, $\hat f = K(U \circ f)$ and $\check g = K(U \circ \Gamma(g))$, where Γ maps g to the solution of (1.17).

By Assumption 1.1 the integral operators of Definition 1.5 are well-defined. In fact, for our kernel K, we get that K : This can be easily seen using the fact that U is continuous and the input functions are continuous on a bounded interval and hence bounded themselves.
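To make the transform $\hat f = K(U \circ f)$ concrete, the following is a minimal numerical sketch for the fractional kernel $\Gamma(H+\tfrac12)^{-1}(t-s)^{H-1/2}$. It is not taken from the paper: the function names `volterra_transform` and `hat_transform`, the grid, and the quadrature rule (the integrand $U(f)$ is frozen at the left end of each cell and the kernel is integrated exactly over the cell) are all illustrative assumptions.

```python
import math

import numpy as np


def volterra_transform(values, times, hurst):
    """Approximate (K u)(t_i) = int_0^{t_i} K(t_i, s) u(s) ds on a grid.

    Assumption: K is the fractional kernel (t - s)^(H - 1/2) / Gamma(H + 1/2),
    and u is treated as piecewise constant (left endpoint of each cell).
    """
    alpha = hurst + 0.5  # the kernel exponent is alpha - 1 = H - 1/2
    values = np.asarray(values, dtype=float)
    times = np.asarray(times, dtype=float)
    out = np.zeros(len(times))
    for i in range(1, len(times)):
        t = times[i]
        left, right = times[:i], times[1:i + 1]
        # Exact integral of the kernel over each cell [left, right].
        weights = ((t - left) ** alpha - (t - right) ** alpha) / math.gamma(alpha + 1.0)
        out[i] = weights @ values[:i]
    return out


def hat_transform(path, times, hurst, u=lambda x: x):
    """Hypothetical helper: f -> K(U o f) for a path sampled on `times`."""
    return volterra_transform(u(np.asarray(path, dtype=float)), times, hurst)


times = np.linspace(0.0, 1.0, 101)
constant_path = np.ones_like(times)
hat_path = hat_transform(constant_path, times, hurst=0.3)
```

For a constant input the quadrature is exact, so `hat_path[-1]` can be compared with the closed form $t^{H+1/2}/\Gamma(H+\tfrac32)$ at $t = 1$. For a simulated path of V the left-endpoint rule is only a first-order approximation.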
We can now state our main results.
Theorem 1.6. The family $X^\varepsilon_T$ satisfies the small-noise large deviation principle (LDP) with speed $\varepsilon^{-1}$ and good rate function $I_T$ given by for all $x \in \mathbb{R}$, wherever this expression is finite. The validity of the LDP means that for every Borel subset A of $\mathbb{R}$, the following estimate holds, where $A^\circ$ and $\bar A$ denote the interior resp. the closure of A:
$$-\inf_{x\in A^\circ} I_T(x) \le \liminf_{\varepsilon\searrow 0} \varepsilon\log P(X^\varepsilon_T\in A) \le \limsup_{\varepsilon\searrow 0} \varepsilon\log P(X^\varepsilon_T\in A) \le -\inf_{x\in\bar A} I_T(x).$$

Theorem 1.7. The family of processes $X^\varepsilon$ satisfies the sample path LDP with speed $\varepsilon^{-1}$ and good rate function Q given by

The structure of this paper is as follows. In Section 2, we recall small-noise large deviations for SDEs satisfying the Yamada-Watanabe condition. In Section 3, we prove the main results, i.e. the small-noise LDP for the log-price. In Section 4, we specialize our model to obtain a convenient scaling property, and obtain large-strike asymptotics for call prices from our small-noise LDP. As mentioned above, Assumptions 1.1, 1.3 and 1.4 are supposed to be satisfied throughout the rest of the paper.
2 LDPs for the driving processes

Sample path LDP for the diffusion
We apply a result of [6], which is based on a representation formula for functionals of Brownian motion obtained in [3], to obtain an LDP for $(\sqrt{\varepsilon}B, V^\varepsilon)$. While the Yamada-Watanabe condition from Assumption 1.4 covers virtually all one-dimensional diffusions that have been suggested in financial modelling, we note that Assumption 1.4 could still be weakened, if desired, e.g. by inspecting the proof of Theorem 4.3 in [3]. If assumptions (H1)-(H6) of [6] hold, then the family of processes $(\sqrt{\varepsilon}B, V^\varepsilon)$, which satisfy admits an LDP due to Theorem 1 in [6]. For $V^\varepsilon$, (H1)-(H6) have been checked in [6, pp. 1143-1144]. For $(\sqrt{\varepsilon}B, V^\varepsilon)$, the proofs are similar. The assumptions (H1)-(H3) are clearly satisfied.
Let us check condition (H4), namely unique solvability of the control equation (7) in [6]. Here, it is where $f \in L^2[0,T]$ is the control function. We also have $\varphi_1, \varphi_2 \in C[0,T]$. It follows that the unique solution of (2.2) is given by where the function $\varphi_2$ is the unique solution of the equation that exists by [6, Proposition 1]. This establishes condition (H4) in our setting. Note that the ODE (2.3) above is formulated for $f \in L^2[0,T]$ to match the notation of [6]. Alternatively, it can be written with $g \in H_0^1$ and $\dot g$ in place of f; see (1.17). Condition (H5) for the second component of $\Gamma_{v_0}$ was checked in [6, p. 1144]. For the first component, (H5) is true by the following simple fact.
where B r is the closed ball of radius r > 0 in L 2 [0, T ] endowed with the weak topology.
Proof. If $f_n \in B_r$ converges weakly to f, then the convergence is uniform on compact subsets The tightness assumption (H6) can be established as in [6]. The verification, which is based on the sub-linear growth of $\bar b$ and $\bar\sigma$ and the uniform moment estimate in Lemma A.2 of [6], is found on pp. 1137-1138 of [6]. See also Section 4.2 of [6]. Now, Theorem 1 of [6] implies the following assertion.
maps f to the solution of (2.2).
Note that Theorem 1 of [6] actually gives a Laplace principle. But since the rate function is a good rate function (which is shown in [6]), we also get an LDP with the same rate function. See Theorems 1.2.1 and 1.2.3 of [8]. The condition and henceφ Therefore, the following statement holds: if the integral is finite, and I(ϕ 1 , ϕ 2 ) = ∞ in all the remaining cases.

Sample path LDP for
In this subsection we lift the sample path LDP in Theorem 2.2 to one for the family of processes we get when applying the "hat" operator defined in (1.12) to V ε .
Proof. For f ∈ C[0, T ] and all t 1 , t 2 ∈ [0, T ], The number r in the exponent of the last term comes from an estimate for the modulus of continuity of the kernel given by (1.5). Here we used the local boundedness of the continuous function U , and also (1.4). Now, it is clear that the functionf is continuous on [0, T ]. It remains to prove the continuity of the mapping f →f on Moreover, It follows from Assumption 1.1 and (2.8) that there exists a constant C 1 for which 9) and the previous expression converges to zero by the uniform continuity of U on [−C 0 , C 0 ]. This completes the proof.
Proof. We know that ( The necessary condition under which we have Since B and W are independent, the following result is an immediate consequence of Theorem 2.5 and Schilder's theorem. for y ∈ R and ψ 1 ∈ H 1 0 [0, T ], if all the expressions are finite, andÎ(y, ψ 1 , ψ 2 ) = ∞ otherwise.
(ii) The family of processes $(\sqrt{\varepsilon}W, \sqrt{\varepsilon}B, \hat V^\varepsilon)$ satisfies an LDP with speed $\varepsilon^{-1}$ and rate function

3 Proof of the LDP for the log-price

3.1 Proof of Theorem 1.6 (one-dimensional LDP)
It is clear that the one-dimensional LDP in Theorem 1.6 is a special case of the sample path LDP in Theorem 1.7. For the reader's convenience, though, we first prove Theorem 1.6, and then refer to some parts of this proof in the proof of Theorem 1.7 below. We build on some ideas of [13]. To match the notation there, we note that $\varepsilon^H \hat B$ from [13] corresponds to our process $\hat V^\varepsilon$ as defined in (1.12). In the original proof of [13] the author first supposes T = 1. Here, for convenience, we immediately allow a general T > 0. By the following lemma, it suffices to prove an LDP for the driftless process

The families $(X^\varepsilon_T)_{\varepsilon>0}$ and $(\hat X^\varepsilon_T)_{\varepsilon>0}$ are exponentially equivalent, i.e. for every δ > 0, the following equality holds:

Proof. By the same reasoning as in Section 5 of [13], there is a strictly increasing continuous Replacing $\sqrt{\varepsilon}B$ in [13] by $\hat V^\varepsilon$, we get the estimate where J is the rate function of $\sup_{0\le t\le T} |V^\varepsilon_t|$, and $A = (\eta^{-1}(\tfrac{2\delta}{\varepsilon T}), \infty)$. Since J is a good rate function, we know that $J((x,\infty)) \nearrow \infty$ as $x \nearrow \infty$, so we get (3.2).
Analogously to [13], we define the functional Φ on the space where t k := kT m for k ∈ {0, . . . , m}. The following approximation property is the key to applying the extended contraction principle (see (4.2.24) in [7]).
Proof. The proof is similar to that of Lemma 21 in [13]. We just need to change the range of the integrals and suprema to [0, T] instead of [0, 1]. Hence, the grid points for $h_m$ are $t_k := Tk/m$ for $k \in \{0, \dots, m\}$, as in (3.5). We use a different integral operator than [13], and so we have to show that the set (v(s))ḟ (s) ds as follows: Here, µ comes from the sub-linear growth condition for the coefficient functions of the diffusion equation for V in Assumption 1.4. Since the continuous function U is bounded on the interval is a bounded subset of C[0, T]. The compact operator K, as defined in (1.6), maps the set in (3.7) to a precompact set in C[0, T]. So we can conclude that $E_\beta$ is precompact. After that, the proof continues as in [13]. There is a k such that $t \in [t_k, t_{k+1})$. Denote by Ξ(t) the left end of the previous interval. Explicitly, we put
$$\Xi(t) = \frac{T}{m}\left[\frac{mt}{T}\right],$$
where [a] stands for the integer part of the number $a \in \mathbb{R}$. For T = 1, this reduces to $\Xi(t) = [mt]/m$.
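As a small illustration of the grid used in the discretization, the left grid point Ξ(t) can be computed as follows. This is only a sketch: the general-T expression $(T/m)[mt/T]$ is inferred from the grid $t_k = kT/m$ and from the stated special case $[mt]/m$ for T = 1, and the function name `left_grid_point` is hypothetical.

```python
import math


def left_grid_point(t, horizon, m):
    """Return the largest grid point t_k = k * horizon / m with t_k <= t.

    Assumes 0 <= t <= horizon and m >= 1. For t == horizon the last grid
    point t_m = horizon is returned.
    """
    k = min(math.floor(m * t / horizon), m)
    return horizon * k / m


# Example: T = 2, m = 4 gives the grid 0, 0.5, 1.0, 1.5, 2.0.
xi = left_grid_point(1.2, horizon=2.0, m=4)
```

Floating-point rounding can move a value of t that lies exactly on a grid point to the neighbouring cell, so a production version would probably work with the integer index k directly.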
We will next prove that Φ m ( √ εW T , √ εB,V ε ) is an exponentially good approximation as We start with an auxiliary result.
Proof. This corresponds to Lemma 23 in [13], but we need to adjust some estimates in the proof, since we do not have Gaussianity in our setting. As in [13] we use P sup Then, for |s − t| ≤ T /m, we have where M is the modulus of continuity of the kernel function in Assumption 1.1. We know that V ε satisfies an LDP, by Theorem 2.2. Using this, we can estimate P sup r for ε small enough. Here, J is the good rate function corresponding to sup s∈[0,T ] |U (V ε s )|, which satisfies an LDP, as seen from applying the contraction principle to the continuous mapping f → sup s∈[0,T ] |U (f (s))|. From this, we can write lim sup εց0 ε log P sup Since J has compact level sets, the term on the right-hand side explodes for m ր ∞.
Next, we show that the discretization functionals Φ m yield an exponentially good approximation.
Proof. This lemma corresponds to Lemma 22 in [13]. As in the proof of that lemma, it suffices to show . We have to redefine ξ (m) η in order to take a general T > 0 into account: Note that we use the convention inf ∅ = ∞ here. The equations (55)-(65) in [13] remain the same, except that we replace ε HB byV ε and use our redefined versions of σ (m) and ξ (m) η . Thus, formula (65) in [13] can be applied. The estimates (66) and (67) have to be replaced by (3.14) Using Lemma 3.4, we can handle the second term, and so it remains to find an appropriate estimate for the first term. Here we need to adapt the reasoning in [13] because of the lack of Gaussianity. By the LDP forV ε and the contraction principle applied to the mapping for ε > 0 small enough, where I sup is the rate function of sup t∈[0,T ] |V ε t |. Note that q(η) ր ∞ for η ց 0. So, we get lim sup ηց0 lim sup εց0 ε log P sup Using (3.9) and (3.16), we get (73) and (74) of [13]. Finally, we can complete the proof as in [13].
Let us continue the proof of Theorem 1.6. Lemma 3.2 states that condition (4.2.24) in [7] is satisfied. Furthermore, due to Lemma 3.5, we know that $\Phi_m(\sqrt{\varepsilon}W_T, \sqrt{\varepsilon}B, \hat V^\varepsilon)$ is an exponentially good approximation of $\Phi(\sqrt{\varepsilon}W_T, \sqrt{\varepsilon}B, \hat V^\varepsilon)$ as $m \nearrow \infty$. Hence, we can use the extended contraction principle (Theorem 4.2.23 in [7]), and get that $\hat X^\varepsilon_T$ satisfies an LDP with good rate function I and speed $\varepsilon^{-1}$. We know from Lemma 3.1 that $\hat X^\varepsilon_T$ and $X^\varepsilon_T$ are exponentially equivalent, and so we finally arrive at Theorem 1.6.
According to the extended contraction principle, we have The rate function $\hat I$ is only finite for Note that Γ is the one-dimensional solution map that takes f to the solution of the ODE $\dot v = \bar b(v) + \bar\sigma(v)\dot f$, $v(0) = v_0$. Recall that the function Φ can be written as Hence, if x = Φ(y, f, g), then Inserting this into the rate function obtained through the contraction principle, we get (3.17)
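The solution map Γ of the control ODE $\dot v = \bar b(v) + \bar\sigma(v)\dot f$, $v(0) = v_0$, can be approximated numerically when the rate function is evaluated by optimization. The sketch below uses an explicit Euler scheme, which is our choice for illustration and not a method prescribed by the paper; the name `solution_map` and the test coefficients are assumptions.

```python
import numpy as np


def solution_map(f_dot, times, v0, drift, diffusion):
    """Explicit Euler approximation of v' = drift(v) + diffusion(v) * f', v(0) = v0.

    `f_dot[i]` is the value of the control derivative on [times[i], times[i+1]].
    """
    v = np.empty(len(times))
    v[0] = v0
    for i in range(len(times) - 1):
        dt = times[i + 1] - times[i]
        v[i + 1] = v[i] + (drift(v[i]) + diffusion(v[i]) * f_dot[i]) * dt
    return v


times = np.linspace(0.0, 1.0, 2001)
f_dot = np.ones(len(times) - 1)

# Case Gamma = id up to the initial value: zero drift, unit diffusion.
v_id = solution_map(2.0 * f_dot, times, 0.5, lambda v: 0.0, lambda v: 1.0)

# Drift-less square-root coefficient: v' = sqrt(v) * f', exact solution (1 + t/2)^2.
v_cir = solution_map(f_dot, times, 1.0, lambda v: 0.0, lambda v: np.sqrt(max(v, 0.0)))
```

With zero drift and unit diffusion the scheme reproduces $v = v_0 + f$ exactly. For the square-root coefficient and $\dot f \equiv 1$ it converges to $(\sqrt{v_0} + t/2)^2$ at first order in the step size.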

Proof of Theorem 1.7 (a sample path LDP)
We adapt the arguments on pp. 8-11 in [12]. As in the preceding section, our starting point is that we already have an LDP for ( √ εW, √ εB,V ε ), see Corollary 2.6. We redefine the functions Φ and Φ m so that they map C In addition, for all the remaining triples (l, f, g), we set Φ(l, f, g)(t) = 0 for all t ∈ [0, T ]. By the following lemma, we can remove the drift term.
Lemma 3.6. The families of processes $X^\varepsilon$ and $\hat X^\varepsilon$ are exponentially equivalent, i.e. for every δ > 0, the following equality holds: Here, $\hat X^\varepsilon$ is defined in (3.1).
Proof. By taking into account the proof of Lemma 3.1, we see that just one additional estimate is needed, namely Then we directly get which is exactly the same expression as in the proof of (3.2).

(3.20)
It is not hard to see that for every m ≥ 1, the mapping Φ m is continuous.
Proof. Lemma 3.7 can be obtained from the proofs of Lemma 3.2, Lemma 21 in [13] and Lemma 2.13 in [12]. The only difference here is that the supremum is taken over two functions from $D_\eta = \{w \in H_0^1[0,T] : \int_0^T \dot w^2\,ds \le \eta\}$. By the uniform bound in the proof of Lemma 21 of [13], this is actually irrelevant.
Proof. In the proof of Lemma 3.5, the estimate (3.13) was stated in a stronger form than needed. We can directly use this to show (2.13) of [12]. We can also get (2.14) of [12] this way. The ingredients of (55)-(65) in [13] do in fact depend on the Brownian motion B via the process $\hat V^\varepsilon$. However, the reasoning for the estimate P sup in [13] stays the same if we replace the driving Brownian motion B by W. The rest of the proof from here on is essentially the same as in the proof of Theorem 2.9 in [12].
The rate functionÎ is only finite for where ψ 1 = f and ψ 2 = K(U • Γ(f )) for some f ∈ H 1 0 [0, T ]. Recall that the function Φ is given by Finally, we get the rate function as follows: (3.24)

Large strike asymptotics
Under suitable scaling assumptions, large strike asymptotics of call prices are a natural consequence of our small-noise LDP. To achieve a convenient scaling w.r.t. space, we assume in this section that $\sigma(x) = \sigma_0(1 + x^\beta)$ for some $\sigma_0 > 0$ and $\beta \in (0, \tfrac12)$. Furthermore, V is a drift-less CIR process, i.e. $\bar\sigma(x) = \sqrt{x}$ and $\bar b \equiv 0$, and we take U = id. We are thus dealing with a fractional Heston-type model, where some degree of generality is preserved, as K may be an arbitrary kernel satisfying Assumption 1.1.
We note that small time asymptotics of this model are not within the scope of our approach, because the standard transfer involving Brownian scaling leads (for the fractional kernel) to a small time regime where log-moneyness increases as maturity shrinks, which is of little practical interest. Therefore, we consider large-strike approximations instead. The drift-less log-price is and it is easy to see that the tail of the Gaussian term $\sigma_0(\bar\rho W_T + \rho B_T)$ is negligible, as is the passage from the log-price $X_T$ to $\hat X_T$. It is clear from our assumptions that $\varepsilon V \overset{d}{=} V^\varepsilon$, and thus $\varepsilon \hat V \overset{d}{=} \hat V^\varepsilon$, for any ε > 0. Therefore, Then, Theorem 1.6 implies, for any c > 0, that Writing $k = \varepsilon^{-(\beta+1/2)}$ and $\gamma = (\beta + \tfrac12)^{-1} \in (1, 2)$, we obtain for c = 1, and replacing k by ck we see that the rate function satisfies the scaling property This easily implies that the rate function is given by For the digital call price, (4.1) then yields
$$P(S_T \ge K) = \exp\bigl(-I_T(1)(\log K)^\gamma (1 + o(1))\bigr), \qquad K \nearrow \infty; \tag{4.3}$$
no confusion between the strike K and the kernel K(·, ·) should arise. Note that the choice of the latter affects the value of $I_T(1)$ in (4.3). Since $\gamma \in (1, 2)$, this shows that the stock price $S_T$ has finite moments of all orders p > 0. Then Theorem 1.1 in [2] shows that call prices have the same logarithmic large-strike asymptotics as digital calls, which establishes the following result.

[19]. In particular, since H + 1 − δ > 1 for small δ, the paths of $\hat V$ are $C^1$ on (0, T). By modifying the model, using $U(x) = |x - V_0|^\kappa$ with $\kappa \in (0, 1]$ instead of U = id, the paths of $\hat V$ become less smooth, namely $(\tfrac12\kappa + H + \tfrac12 - \delta)$-Hölder continuous. In addition, if $\sigma(x) = \sigma_0(1 + x^\beta)$, then the volatility paths $t \mapsto \sigma_0(1 + (\hat V_t)^\beta)$ are $(\tfrac12\kappa\beta + (H + \tfrac12)\beta - \delta)$-Hölder continuous on [0, T], for any small enough δ > 0. While this Hölder exponent can be smaller than $\tfrac12$, the volatility process is not rough, because σ(·) is smooth away from zero, and so "roughness" occurs only at time zero.
Note that in truly rough models, the volatility process is constructed using stochastic integrals $\int_0^t K(t,s)\,dW_s$ or related processes, which is not the case in our setup.
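The leading-order behaviour in (4.3) is easy to evaluate once a value of $I_T(1)$ is available. The following sketch only illustrates the formula: the function names are hypothetical, and `rate_at_one` is a placeholder input, since $I_T(1)$ depends on the kernel and has to be obtained separately from the variational problem. The $(1 + o(1))$ factor in (4.3) is ignored.

```python
import math


def tail_exponent(beta):
    """gamma = 1 / (beta + 1/2), which lies in (1, 2) for beta in (0, 1/2)."""
    if not 0.0 < beta < 0.5:
        raise ValueError("beta must lie in (0, 1/2)")
    return 1.0 / (beta + 0.5)


def log_digital_call_leading_order(strike, beta, rate_at_one):
    """Leading-order term of log P(S_T >= K) from (4.3): -I_T(1) * (log K)^gamma.

    `rate_at_one` stands for I_T(1) and is an assumed input, not computed here.
    """
    if strike <= 1.0:
        raise ValueError("the approximation is meant for large strikes K > 1")
    return -rate_at_one * math.log(strike) ** tail_exponent(beta)


gamma = tail_exponent(0.25)
approx = log_digital_call_leading_order(math.e, beta=0.25, rate_at_one=2.0)
```

Because the approximation is logarithmic, it describes the decay rate of the digital call price for large K and says nothing about the prefactor.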

Second order Taylor expansion of the rate function
In order to compute the rate function, a certain variational problem needs to be solved numerically. It might be preferable to use the Taylor expansion of the rate function instead, if it can be computed in closed form. The model from the preceding section is a case in point: By the scaling property (4.2), we may evaluate the rate function at a small c > 0 of our choice. For the special case where $V = B^2$ and $U(x) = x$ or, alternatively, $V = B$, $U(x) = x^2$, $\bar\sigma \equiv 1$ and $\bar b \equiv 0$, i.e. Γ ≡ id, we now discuss how to expand the rate function, building on [1].
Proposition 5.1. Let U ≡ id and V ≡ B 2 . Furthermore, assume that σ is smooth (at least locally around 0). Suppose that the rate function I is also smooth locally around 0. Then, its Taylor expansion is Remark 5.2. Formula (5.1) gives the second order Taylor expansion. However, the ideas in the proof of Proposition 5.1 can be used for higher orders. Clearly, the computations for the expansions get much more cumbersome in the latter case.

Proof of Proposition 5.1
The proof is very similar to the one of Theorem 3.1 in [1]. In the following, we will outline at which points adjustments are needed. Note that for the special case we are treating we have $U(x) = x^2$ and Γ ≡ id. To simplify computations in the proof, we use T = 1. In Proposition 5.1 of [1], there is a representation of the rate function that coincides with ours, except that different integral transforms are used. For our special case, we have Recall that $Kf = \int_0^\cdot K(\cdot, s) f(s)\,ds$. In [1] the authors use the same integral transform as used in [13,12], i.e. $K\dot f$. We have to adjust this to our case of $K(f^2)$. Here, $I_x$ denotes the functional that needs to be minimized to get the value of the rate function at x.
First, we need to get a representation for the minimizing configuration $f^x$ of the functional $I_x$. This is done as in Proposition 5.2 in [1]. The corresponding expansions of the ingredients of the rate function for our setting for δ > 0 are
$$\tilde F(f + \delta g) \approx \tilde F(f) + 2\delta \langle (\sigma^2)'(K(f^2)), K(fg) \rangle, \tag{5.7}$$
$$\tilde G(f + \delta g) \approx \tilde G(f) + \delta\bigl( \langle \sigma(K(f^2)), \dot g \rangle + 2 \langle \sigma'(K(f^2)), \dot f\, K(fg) \rangle \bigr). \tag{5.8}$$
Note that "≈" is defined in [1] as If $f = f^x$ is a minimizer then $\delta \mapsto I_x(f + \delta g)$ has a minimum at δ = 0 for all g. Using (5.6), (5.7) and (5.8) we expand , K(f g) We have $f^x_0 = 0$, for any x. We now test with $\dot g = \mathbf{1}_{[0,t]}$ for a fixed $t \in [0, 1]$ and obtain

Let us recall the ansatz in [1]. The authors of [1] choose for fixed x the optimizing function $f^x$ for $I_x$, i.e. $f^x = \operatorname{argmin}_{f \in H_0^1} I_x(f)$. Therefore, the first order condition is $I_x'(f^x) = 0$. The authors of [1] use the implicit function theorem to show that the minimizing configuration $f^x$ is a smooth function in x (locally around x = 0). As $I_x$ is a smooth function, too, this implies the smoothness of $x \mapsto I_x(f^x) = I(x)$, at least in a neighborhood of 0. Note that for (26) and Lemma 5.3 in [1], the embedding $K : H_0^1 \to C$ works, because we have already established that $K(U \circ f)$ is continuous (see Lemma 2.4).
In order to apply the implicit function theorem, the authors of [1] show that the ingredients of the rate function are Fréchet differentiable by computing their Gateaux derivative. This is more complicated in our case, because of the different integral transform we use. Therefore we assume that the rate function is locally smooth around 0 in Proposition 5.1, and, consequently, that Lemma 5.6 in [1] holds. After establishing that the implicit function theorem can be used, we can proceed as in [1] up to Theorem 5.12 there.
Next, we will imitate the computations in Theorem 5.12 of [1] in order to get the expansion of the minimizing configuration in our setting. In fact, if we just want to obtain the second order expansion of the rate function in our setting for Brownian motion squared, it suffices to find the first order expansion of $f^x$. Assuming the ansatz we get We use the previous formulas in (5.12) to obtain Comparing the coefficients, we get the same result as the authors of [1] for the first order expansion, i.e. Note that the first order expansion of the minimizing configuration $f^x$ is exactly the same as in [1]. The reason is that the expansions of the ingredients of (5.12) are relevant here, and these expansions coincide. For the second order expansion of the rate function, we need second order expansions of its ingredients. These are given in the following formulas, where $\mathrm{id}^2$ denotes the quadratic function $s \mapsto s^2$: 1 2Ẽ (f x ) = 1 2 Finally, we get the Taylor expansion of the rate function by taking into account the reasoning above. We insert the expansion and the expansions above into Eq. (5.12) for the minimizing configuration. Then, we get and hence the following expansion holds: . (5.20)