1 Introduction

In financial econometrics, diffusion processes are the workhorses used to model the temporal behavior of asset prices, mainly because of their flexibility. On the one hand, being defined in continuous time, they cover all relevant time scales. On the other hand, it is rather straightforward to incorporate many stylized facts of empirically observed returns. Such stylized facts encompass stochastic volatility as well as price and volatility jumps (see, e.g., Merton 1976; Heston 1993; Bates 1996; Duffie et al. 2000, inter alia).

Whereas most of those earlier papers focus on the behavior of asset prices, there is also a theoretical and empirical interest in the statistical behavior of asset returns. From the theoretical point of view, research in this area started with Rubinstein (1973), who derived a CAPM extension to higher moments in his thesis. Next, Kraus and Litzenberger (1976) empirically demonstrated that skewness matters. The theoretical model by Harvey and Siddique (2002) indicates that skewness affects the pricing kernel of agents. Conrad et al. (2013) show that option-implied higher moments have predictive power for future stock returns. Amaya et al. (2016) and Shen et al. (2016) find that realized skewness and realized kurtosis, respectively, predict future returns.

Without being exhaustive, this list of research suggests that a thorough understanding not only of the prices but also of the distribution of asset returns is important, particularly the computation of their moments. This paper focuses on two prominent models from the research literature and contributes to the literature in various ways. We first derive conditional and unconditional higher (co-)moments for the model proposed by Cox et al. (1985) (CIR hereafter), which can be represented as a squared Bessel process (see Göing-Jaeschke et al. 2003). The CIR process is of general interest since it can be used to model the (continuous) part of (log-)prices' stochastic volatility (see Heston 1993 or Duffie et al. 2000, inter alia) as well as short-term interest rates. Even though the conditional and unconditional distributions of the CIR process are known, much of the existing emphasis has been on the mean and variance only. We present a historical overview of how those moments were obtained in the earlier literature.

Our next contribution is to provide general formulas for all unconditional and conditional moments, as well as autocomoments, and to reduce those formulas to a simple and tractable format. This allows us to obtain an alternative construction of the conditional and unconditional densities of the CIR process. The strategy for this construction is to start directly from the stochastic differential equation (SDE) of the process instead of the partial differential equation (PDE) associated with its transition density. Our starting point is, thus, given by Dana and Jeanblanc-Picqué (1994) and Shreve (2004), who indicate how to transform the SDE to allow a direct computation of the moments, as opposed to differentiating a characteristic function obtained from the PDE. From the moments, we then identify the distribution. Dufresne (2001) derives formulas for conditional moments of the CIR process by obtaining the dynamics of powers of the process and eventually a recursive system of functions with two lags, which he solves. The derivation of comoments is not addressed in his work. More recently, Rujivan (2016) obtained closed-form solutions of moments in the context of the extended CIR process. He achieves this by guessing the representation of the moment solution to the PDE associated with the process. Restricting the time-varying parameters of the extended CIR process to constants, he derives the moments as in Dufresne (2001). Neither does he address comoments. An additional publication contained an erroneous expression of the moments, which shows that obtaining those moments is not a trivial affair.

Our next contribution is the derivation of moments of log-returns generated by the model of Bates (1996). Bates' model is a more general version of the famous model by Heston (1993), since it not only allows the price to have stochastic volatility (modeled by the CIR process) but also allows for jumps in prices. Our analysis therefore nests findings by Das and Sundaram (1999) and Zhang et al. (2017), who derived higher conditional and unconditional (log-)return moments only for the Heston model. For completeness, it should be mentioned that Das and Sundaram (1999), among many others, postulate that log-returns have a constant drift. This is problematic: if one starts from a geometric Brownian motion for the price process and obtains log-returns via Ito's lemma, a variance term appears in the drift. As discussed at length in simulation experiments by Zhang et al. (2017), ignoring this term leads to biased expressions of skewness. For realistic parameters, the bias is found to be as large as 10% for skewness at a 3-month horizon, which is noticeable. We also indicate how the jump part requires compensation so that the mean of the log-return is not contaminated by the jump part.

We also contribute by providing the community with R, MATLAB, and Mathematica code for all deduced (co-)moments. This should be useful for research involving moments of relatively rich data generating processes and for different time horizons. This paper is therefore relevant for empirically oriented research, where the implementation of theoretical formulas needs to be validated. It could also be important for the estimation of parameters in a method-of-moments approach. We leave such empirical investigations to later research.

The remainder of this paper is organized as follows. Section 2 introduces the data generating process. In Sect. 3, we present conditional and unconditional (co-)moments of the latent spot variance, which is the solution of the CIR process. On the basis of these formulas, we derive the distributional properties of the spot variance. The conditional and unconditional characteristic functions of log-returns are deduced in Sect. 4. In Sect. 5, we present general formulas for conditional and unconditional moments of log-returns and give explicit formulas for the first four unconditional moments. In Sect. 6, we validate our analytic formulas; additionally, we reveal the numerical instabilities of finite difference approximations of the moments generated from the characteristic function. We also show that the skewness and kurtosis obtained from simulations may be misleading. Section 7 concludes.

2 Main data generating process

In this section, we present Bates' model, which will be the basis of our further discussion. We denote by \(S_t\) the price of an asset and by \(v_t\) the spot variance of the asset. The asset price is modeled as a geometric Brownian motion with jumps:

$$\begin{aligned} \mathrm {d}S_t&= \left( \mu - {\bar{k}}\right) S_t\, \mathrm {d}t + \sqrt{v_t} S_t\,\mathrm {d}W^x_t + S_tJ_t\mathrm {d}N_t , \end{aligned}$$
(1)
$$\begin{aligned} \mathrm {d}v_t&= \kappa (\theta - v_t)\mathrm {d}t + \sigma \sqrt{v_t} \mathrm {d}W^v_t, \end{aligned}$$
(2)
$$\begin{aligned} {\mathbb {E}}[\mathrm {d}W^x_t\mathrm {d}W^v_t]&= \rho \, \mathrm {d}t. \end{aligned}$$
(3)

Equation (1) defines the dynamics of the price and equation (2) defines the spot variance process. Both \(\mathrm {d}W^x_t\) and \(\mathrm {d}W^v_t\) are increments of Brownian motions. As equation (3) indicates, we allow both increments to be correlated with constant correlation \(\rho \in (-1,1)\). The parameter \(\mu \in {\mathbb {R}}\) controls the drift of the price process. We assume that the starting values of the price process, \(S_0\), and of the spot variance process, \(v_0\), are given. If \(v_0\) is stochastic, we assume that it has finite moments. Jumps are modeled as a marked, compensated, homogeneous Poisson process with constant intensity \(\lambda \in {\mathbb {R}}^+.\) The marks \(J_t\), defining the jump sizes, are such that \(1+J_t\) is log-normal. Formally, we assume that \(\log (1+J_t) \sim {\text {N}}\left( \log (1+\mu _J)-\sigma _J^2,\, 2\sigma _J^2\right)\), where the constant \(\mu _J\in {\mathbb {R}}\) is the mean jump size and \(\sigma _J^2\in {\mathbb {R}}^+\) parametrizes the variance of the jumps of the price process. We denote the associated counting process by \(N_t\).

The evolution of the latent stochastic variance \(v_t\) is modeled by the CIR process (2), where \(\kappa\) represents the speed of mean reversion, \(\theta\) is the long-run mean, and \(\sigma\) is the volatility of variance. We know from Revuz and Yor (1999) that a (strong) solution of this process exists for every parameter vector \(\left( \kappa ,\theta ,\sigma \right) \in ~]0,\infty [^3\) as long as \(v_0>0.\) In addition, as shown by Feller (1951) and Pitman and Yor (1982), if \(\gamma :=2\kappa \theta /\sigma ^2\ge 1\), then the process will not reach 0; for \(0<\gamma <1\), the origin is instantaneously reflecting; and for \(\theta =0\), the origin is absorbing and will be reached in finite time. The requirement that \(\gamma \ge 1\) is usually called the Feller condition. We will assume that the parameters \(\left( \kappa ,\theta ,\sigma \right) \in ~]0,\infty [^3\) are constant. We do not impose the Feller condition.

Following Bates (1996), the constant \({\bar{k}}\) may be set equal to \(\lambda \mu _J\), in which case the jumps are compensated in the sense that \({\mathbb {E}}\left[ \mathrm {d}S_t|S_t\right] =\mu S_t \mathrm {d}t\) and no parameter concerning the jump part appears in the expectation. In many empirical applications, it is however necessary to compensate the jump process in such a way that returns, defined as log-price differentials, have an uncontaminated instantaneous mean, where the jump part of the process does not intervene in the expected return. To see how this can be achieved, introduce \(X_t\equiv \log S_t\). Applying Ito's lemma to (1) while using the above defined distribution for \(J_t\), we obtain the dynamics of log-prices:

$$\begin{aligned} \mathrm {d}X_t = \left[ \mu - \frac{1}{2} v_t - \bar{k}\right] \mathrm {d}t + \sqrt{v_t} \,\mathrm {d}W^x_t + \log (1+J_t)\,\mathrm {d}N_t. \end{aligned}$$
(4)

We notice that the right hand side of (4) consists of the sum of an arithmetic Brownian motion and a jump part. Integrating the jump part between time 0 and t yields:

$$\begin{aligned} Y_t = \sum \limits _{i = 1}^{N_t} \log \left( 1+J_i\right) - \bar{k}t. \end{aligned}$$
(5)

It is now immediate to see that setting the compensator term \({\bar{k}}=\lambda \left\{ \log \left( 1+ \mu _J\right) - \sigma _J^2\right\}\) yields \({\mathbb {E}}\left[ {\bar{k}}t-\sum \nolimits _{i = 1}^{N_t} \log \left( 1+J_i\right) \right] =0,\) that is, \({\mathbb {E}}\left[ Y_t\right] =0.\) Thus, the mean \(\mu\) alone describes the average log-return.
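As a quick sanity check of this compensation, the following R sketch (with purely illustrative parameter values of our own choosing) simulates the compensated jump part \(Y_t\) of (5) and verifies that its sample mean is close to zero.

```r
# Simulate Y_t = sum_{i=1}^{N_t} log(1+J_i) - k_bar * t and check that its
# mean is approximately zero; all parameter values are illustrative only.
set.seed(42)
lambda <- 0.5; mu_J <- -0.05; sigma_J2 <- 0.02; t <- 1
k_bar  <- lambda * (log(1 + mu_J) - sigma_J2)
Y <- replicate(1e5, {
  n <- rpois(1, lambda * t)                      # number of jumps on [0, t]
  # log(1 + J_i) ~ N(log(1 + mu_J) - sigma_J^2, 2 sigma_J^2), as above
  sum(rnorm(n, mean = log(1 + mu_J) - sigma_J2, sd = sqrt(2 * sigma_J2))) -
    k_bar * t
})
mean(Y)  # approximately 0
```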

The system defined by equations (4), (2), and (3), together with the definition of \(\bar{k}\) from above, will be at the center of our study. We have a DGP that allows the log-price to jump and to have a latent stochastic variance. Without the jump part, we obtain the system of data generating processes investigated by Heston (1993).

Finally, the log-price X at time t, that is, the solution of the process under study, is obtained as

$$\begin{aligned} X_t&= X_0+\left[ \mu -\lambda \left\{ \log (1+\mu _J)-\sigma _J^2\right\} \right] t-\frac{1}{2}\int \limits _{0}^tv_s\,\mathrm{d}s\nonumber \\& \quad+\int \limits _{0}^t\sqrt{v_s}\mathrm{d}W^x_s+\sum _{i=1}^{N_t}\log (1+J_i). \end{aligned}$$
(6)

3 Construction and properties of \(v_t\)

As mentioned in the Introduction, we first provide a short historical review of the discovery of the process (2) describing \(v_t\), and then we present an alternative computation of all conditional and unconditional moments of this process. Finally, we characterize the transition probability from its moments.

3.1 The Feller/Cox-Ingersoll-Ross process

The first to present the process dynamics as a stochastic differential equation were Cox et al. (1985). They credit Feller (1951) with the discovery of this process, mention that the transition probability is a noncentral chi-squared and the unconditional distribution a gamma density, and also provide the first two moments. Feller (1951) provides a solution to a partial differential equation (PDE), which he interprets as a Fokker-Planck equation. This solution need not be a density, and therefore, he neither mentioned that the solution can be a noncentral chi-squared distribution nor provided statistical moments of the density. Also, his discussion is not in terms of a stochastic differential equation (SDE) such as (2).

The contribution of Cox et al. (1985) is therefore a major achievement. Their approach to studying the SDE appears to be along the lines of what is presently known as the Feynman-Kac approach. They associated with the SDE a PDE which the transition probability must satisfy. Using Laplace or Fourier transforms and guessing a solution yields a system of ordinary differential equations which, once solved, provides a representation involving the Bessel function. Since the noncentral chi-squared distribution had already been introduced by Fisher (1928), they could then compare the expressions of the densities and recognize the nature of the solution. The first and second moments which they provide were in all likelihood obtained by differentiation of the Laplace or Fourier transform.

Related to this, Johnson et al. (1970) provide elements as to how the noncentral chi-squared distribution was originally obtained from a statistical problem by Fisher (1928). Johnson et al. (1970) also mention an oral conversation with D. W. Boyd who provided them with a general formula to obtain all the moments of a noncentral chi-squared distribution.

A different road to derive the moments is to notice that the CIR process is nothing but a time-changed squared Bessel process, or (squared) radial Ornstein-Uhlenbeck process. Elements on how to perform this change of time may be found in Göing-Jaeschke et al. (2003) or, at textbook level, in Jeanblanc et al. (2009). The moments of the squared Bessel process may be found in Atlan and Leblanc (2006) as series expansions.

Lastly, Dana and Jeanblanc-Picqué (1994) as well as Shreve (2004) appear to be the first to provide a computation of the first and second moment (variance) of the CIR process by directly solving the SDE. This computation involves a 'trick' consisting of Ito-differentiating an expression of the type \(\mathrm {e}^{\kappa t}\, v_t\) and recognizing a significant simplification of the equation, as we show below. In this section, we use the same 'trick' to extend their computations to all moments, encompassing skewness, kurtosis, etc. The general expressions we obtain for all moments are, after re-parametrization, equal to those of Boyd. We also deduce the conditional and unconditional higher-order autocomoments of \(v_t\), since those moments may prove useful for applied econometric research.

Our approach is similar to that of Dufresne (2001), who also starts from the SDE to derive moments and then recovers the density. Our computations are however more direct than his. Interestingly, he indicates how, by using Revuz and Yor (1999), one can obtain the density of the transition function via the properties of Bessel processes. We simply use the moments and identify them with those of the noncentral chi-squared distribution.

A recent contribution by Rujivan (2016) considers the extended CIR process, that is, a process where the parameters \(\kappa , \theta , \sigma\) are time-varying. He proceeds by conjecturing a recursive expression for the conditional moment which, after being injected into the PDE, yields a system of ordinary differential equations. Solving the system then yields a closed-form expression for the conditional moments. Restricting the parameters \(\kappa , \theta , \sigma\) to constants, he shows that his moments are equivalent to those of Dufresne (2001). He also provides the moments of the unconditional (steady-state) distribution, which match those of the gamma distribution.

3.2 Conditional and unconditional moments of \(v_t\)

Proposition 1

The p-th conditional moment of the spot variance \(v_t\) defined in (2) is, \(\forall t > 0\), given by:

$$\begin{aligned} \begin{aligned} {\mathbb {E}}\left[ v_t^p | v_0 \right]&= \mathrm {e}^{-p \kappa t}\left( v_0^p + c_{p+1}\nu _p \right) ,~\forall ~p\in {\mathbb {N}}, \end{aligned} \end{aligned}$$
(7)

where

$$\begin{aligned} c_p&= \left( p-1\right) \left\{ \kappa \theta + \frac{1}{2}\left( p-2\right) \sigma ^2\right\} ,&\nu _p&= a_pb_t + \sum _{i = 0}^{p-1}m_{i,p} I_{i+1},&a_p&= \frac{v_0^{p-1}}{\kappa }, \end{aligned}$$

and

$$\begin{aligned} b_t&= \mathrm {e}^{\kappa t} -1,&I_k&= \frac{b_t^k}{k!\kappa ^{k-1}},&m_{k,p}&= {\left\{ \begin{array}{ll} 0, &{} \text {if }k = 0, \\ a_{p-k}\prod \limits _{j = p-k+1}^{p}c_j, &{} \text {if }k \ge 1. \end{array}\right. }\\ \end{aligned}$$

Proof

First, define \(X_t^p = f(v_t, t) = \mathrm {e}^{p\kappa t}v_t^p,~\forall ~p \in {\mathbb {N}}\). Denote by \(f_v\) and \(f_{vv}\) the first- and second-order partial derivatives of f with respect to v, and by \(f_t\) the partial derivative with respect to t. It follows that \(f_v = p\mathrm {e}^{p \kappa t}v_t^{p-1}\), \(f_{vv} = p(p-1)\mathrm {e}^{p \kappa t}v_t^{p-2},\) and \(f_t = p \kappa \mathrm {e}^{p \kappa t}v_t^p\). From Ito's lemma, we know that \(\mathrm {d}X_t^p = f_v\mathrm {d}v_t + f_t\mathrm {d}t + \frac{1}{2}f_{vv}(\mathrm {d}v_t)^2\); thus, combining all three terms we get:

$$\begin{aligned} \mathrm {d}X_t^p = p\mathrm {e}^{p\kappa t}v_t^{p-1}\left\{ \kappa \theta + \frac{1}{2}\left( p-1\right) \sigma ^2\right\} \mathrm {d}t + p\mathrm {e}^{p\kappa t}v_t^{p-\frac{1}{2}}\sigma \mathrm {d}W_t. \end{aligned}$$

Integration yields:

$$\begin{aligned} X^p_t = v_0^p + p\left\{ \kappa \theta + \frac{1}{2}\left( p-1\right) \sigma ^2\right\} \int _{0}^{t}\mathrm {e}^{p\kappa u}v_u^{p-1} \mathrm {d}u + p\sigma \int _{0}^{t}\mathrm {e}^{p\kappa u} v_u^{p-\frac{1}{2}}\mathrm {d}W_u. \end{aligned}$$

From \(X_t^p = \mathrm {e}^{p\kappa t}v_t^p\), we easily obtain that:

$$\begin{aligned} v_t^p &= \mathrm {e}^{-p \kappa t} \left[ v_0^p + p\left\{ \kappa \theta + \frac{1}{2}\left( p-1\right) \sigma ^2\right\} \int _{0}^{t}\mathrm {e}^{p\kappa u}v_u^{p-1} \mathrm {d}u \nonumber \right. \\& \quad \left. + p\sigma \int _{0}^{t}\mathrm {e}^{p\kappa u} v_u^{p-\frac{1}{2}}\mathrm {d}W_u\right] . \end{aligned}$$
(8)

Introduce \(c_{p+1} = p\left\{ \kappa \theta + \frac{1}{2}\left( p-1\right) \sigma ^2\right\} .\) Focusing on finite moments, assume that \(\int _0^t{\mathbb {E}}[v_u^{2p-1}]\, \mathrm {d}u<\infty\); then the last term is a martingale and its conditional expectation is zero. The conditional expected value of \(v_t^p\) is then given by:

$$\begin{aligned} \begin{aligned} {\mathbb {E}}\left[ v_t^p | v_0 \right]&= \mathrm {e}^{-p \kappa t}\left( v_0^p + c_{p+1}\int _{0}^{t}e^{p\kappa u}{\mathbb {E}}\left[ v_u^{p-1} \big | v_0 \right] \mathrm {d}u \right) \end{aligned}. \end{aligned}$$

To get a closed-form expression, we have to evaluate \(\nu _p:=\int _{0}^t\mathrm {e}^{p\kappa u}{\mathbb {E}}\left[ v_u^{p-1}\big | v_0\right] \mathrm {d}u\). Clearly, this expression depends on t; to keep notation simple, we refrain from adding an index t. It should be noticed that this recursive behavior under the integral was neglected in Jafari and Abbasian (2017), which led to a flawed formula there. By plugging in \({\mathbb {E}}\left[ v_t^{p-1} | v_0\right]\) from above and defining \(a_p = v_0^{p-1}\kappa ^{-1}\) and \(b_t = \mathrm {e}^{\kappa t}-1\) we get:

$$\begin{aligned} \nu _p&= \frac{v_0^{p-1}}{\kappa }\left( \mathrm {e}^{\kappa t}-1\right) + c_p\int _{0}^t \mathrm {e}^{\kappa u}\int _{0}^{u}\mathrm {e}^{(p-1)\kappa r}{\mathbb {E}}\left[ v_r^{p-2}\big | v_0\right] \mathrm {d}r\, \mathrm {d}u = a_pb_t + c_p\int _{0}^t \mathrm {e}^{\kappa u}\nu _{p-1} \mathrm {d}u. \end{aligned}$$

Expanding the resulting recursive function with the use of \(I_k\) leads to:

$$\begin{aligned} \nu _p&= a_pb_t + c_p\left\{ a_{p-1} I_2 + c_{p-1} \int _{0}^t \mathrm {e}^{\kappa u} \int _{0}^{u} \mathrm {e}^{\kappa r}\nu _{p-2}\mathrm {d}r\,\mathrm {d}u\right\} \\&= a_pb_t + c_p\left[ a_{p-1} I_2 + c_{p-1}\left\{ a_{p-2}I_3 +c_{p-2}\int _{0}^t \mathrm {e}^{\kappa u} \int _{0}^{u}\mathrm {e}^{\kappa r} \int _{0}^{r} \mathrm {e}^{\kappa s}\nu _{p-3}\mathrm {d}s~\mathrm {d}r\,\mathrm {d}u \right\} \right] \\&\vdots \\&= a_pb_t + c_p\left( a_{p-1}I_2 + a_{p-2}c_{p-1}I_3 + a_{p-3}c_{p-1}c_{p-2}I_4 + \dots + a_1c_{p-1}c_{p-2}\cdot \cdots \cdot c_2 I_p \right) . \end{aligned}$$

In the last step, we use the fact that the recursion ends when \(\nu _1\) is reached in the repeated integral. This results in a final additional term \(a_1I_p\), which is multiplied by all \(p-2\) c-terms from the previous recursion steps. With a final expansion and using \(m_{k,p}\), we can write

$$\begin{aligned} \nu _p = a_pb_t + \sum _{i = 0}^{p-1}m_{i,p}I_{i+1}. \end{aligned}$$

\(\square\)

For empirical research, moments up to order 4 are often required. In particular, the third and fourth moments allow the computation of skewness and kurtosis. For the reader's convenience, we explicitly derive those moments and also recall the well-known first- and second-order moments.

$$\begin{aligned} {\mathbb {E}}\left[ v_t|v_0\right]&= v_0 \mathrm {e}^{-\kappa t} +\theta \left( 1-\mathrm {e}^{-\kappa t}\right) ,\\ {\mathbb {E}}\left[ v_t^2|v_0\right]&= v_0^2 \mathrm {e}^{-2\kappa t}+\left( 1+\gamma ^{-1}\right) \left\{ \theta ^2\left( 1-\mathrm {e}^{-\kappa t}\right) ^2 + 2v_0 \theta \left( \mathrm {e}^{-\kappa t}- \mathrm {e}^{-2\kappa t}\right) \right\} , \\ {\mathbb {E}}\left[ v_t^3|v_0\right]&= v_0^3\mathrm {e}^{-3\kappa t} + \left( 1 + 3\gamma ^{-1} + 2\gamma ^{-2}\right) \left\{ \theta ^3\left( 1-\mathrm {e}^{-\kappa t}\right) ^3 \right. \\&\quad \left. + 3v_0\theta ^2\left( \mathrm {e}^{-\kappa t}-2\mathrm {e}^{-2\kappa t} + \mathrm {e}^{-3\kappa t}\right) \right\} \\&\quad + 3 v_0^2\theta \left( 1+2\gamma ^{-1}\right) \left( \mathrm {e}^{-2 \kappa t}-\mathrm {e}^{-3\kappa t}\right) , \\ {\mathbb {E}}\left[ v_t^4|v_0\right]&= v_0^4\mathrm {e}^{-4\kappa t} +6v_0^2\theta ^2\left( 1 + 5\gamma ^{-1} + 6\gamma ^{-2}\right) \left( \mathrm {e}^{-2\kappa t}- 2\mathrm {e}^{-3\kappa t}+\mathrm {e}^{-4\kappa t}\right) \\&\quad +\left( 1 + 6\gamma ^{-1} + 11 \gamma ^{-2} + 6\gamma ^{-3}\right) \left\{ \theta ^4\left( 1-\mathrm {e}^{-\kappa t}\right) ^4 \right. \\&\quad \left. + 4v_0 \theta ^3 \left( \mathrm {e}^{-\kappa t}-3\mathrm {e}^{-2\kappa t}+3\mathrm {e}^{-3\kappa t}- \mathrm {e}^{-4\kappa t}\right) \right\} \\&\quad +4v_0^3\theta \left( 1 + 3\gamma ^{-1}\right) \left( \mathrm {e}^{-3\kappa t}-\mathrm {e}^{-4\kappa t}\right) . \end{aligned}$$
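For readers who wish to experiment, the following R sketch transcribes Proposition 1 directly (parameter values are illustrative) and checks it against the explicit first moment above.

```r
# Conditional moment E[v_t^p | v_0] of the CIR process per Proposition 1;
# p must be a positive integer.
cond_moment_v <- function(p, v0, t, kappa, theta, sigma) {
  c_ <- function(j) (j - 1) * (kappa * theta + 0.5 * (j - 2) * sigma^2)  # c_j
  a_ <- function(j) v0^(j - 1) / kappa                                   # a_j
  b  <- exp(kappa * t) - 1                                               # b_t
  I  <- function(k) b^k / (factorial(k) * kappa^(k - 1))                 # I_k
  m  <- function(k) if (k == 0) 0 else a_(p - k) * prod(sapply((p - k + 1):p, c_))
  nu <- a_(p) * b + sum(sapply(0:(p - 1), function(i) m(i) * I(i + 1)))  # nu_p
  exp(-p * kappa * t) * (v0^p + c_(p + 1) * nu)
}
# Check against the explicit first moment:
kappa <- 2; theta <- 0.04; sigma <- 0.3; v0 <- 0.05; t <- 0.5
cond_moment_v(1, v0, t, kappa, theta, sigma)
v0 * exp(-kappa * t) + theta * (1 - exp(-kappa * t))  # same value
```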

We now give the conditional density of \(v_t\). To do so, introduce \({\tilde{\chi }}^2_{\eta }\left( \xi \right) ,\) the noncentral chi-squared distribution with \(\eta\) degrees of freedom and noncentrality parameter \(\xi\), defined through the density \(f\left( x; \eta , \xi \right) = \mathrm {e}^{-\left( \xi + x\right) /2}\frac{1}{2}\left( \frac{x}{\xi }\right) ^{\left( \eta -2\right) /4}I_{\left( \eta -2\right) /2}\left( \sqrt{\xi x}\right)\), \(\forall x>0\), where \(I_\alpha \left( y\right)\) is the modified Bessel function of the first kind of order \(\alpha\).

Corollary 1

Assume the spot variance \(v_t\) follows (2) and that all conditional moments exist. Then the conditional distribution of \(v_t\) is given by

$$\begin{aligned} hv_t|v_0 \sim {\tilde{\chi }}^2_{2\gamma }\left( hv_0\mathrm {e}^{-\kappa t}\right) , \end{aligned}$$
(9)

where \(h=4\kappa \left\{ \sigma ^2\left( 1-\mathrm {e}^{-\kappa t}\right) \right\} ^{-1}.\)

Proof

There are two things that we need to demonstrate: first, that the conditional moments of (2) coincide with those of the noncentral chi-squared distribution and, second, that the noncentral chi-squared distribution is moment determinate. First, we note that with some tedious algebra (or with the help of a symbolic programming language) formula (7) can be written as

$$\begin{aligned} {\mathbb {E}}\left[ v_t^p| v_0\right]&= \left( \theta \gamma ^{-1}\right) ^p\frac{\Gamma \left( \gamma +p\right) }{\Gamma \left( \gamma \right) }\left( 1-\mathrm {e}^{-\kappa t}\right) ^p {}_1F_1\left( -p;\gamma ;-\frac{2 \kappa v_0}{\left( \mathrm {e}^{\kappa t}-1\right) \sigma ^2}\right) , \end{aligned}$$

where \(\Gamma \left( x \right) =\int _{0}^{\infty }t^{x-1}\mathrm {e}^{-t}\mathrm {d}t\) is the Gamma function and \(_1F_1(a; b; z) = \sum _{k = 0}^{\infty }\frac{\Gamma \left( a+k\right) \Gamma \left( b\right) }{\Gamma \left( a\right) \Gamma \left( b+k\right) }\frac{z^k}{k !}\) is Kummer's confluent hypergeometric function (see Slater 1966). Now, recall from, for example, Johnson et al. (1970) that the moments of a noncentral chi-squared distribution with \(\eta\) degrees of freedom and noncentrality parameter \(\xi\) can be expressed as:

$$\begin{aligned} \mu _p = 2^p \Gamma \left( \frac{\eta }{2}+p\right) \sum _{j = 0}^{p}\left( {\begin{array}{c}p\\ j\end{array}}\right) \frac{\left( \xi /2\right) ^j}{\Gamma \left( j + \frac{\eta }{2}\right) } =2^p \frac{\Gamma \left( \frac{\eta }{2}+p\right) }{\Gamma \left( \frac{\eta }{2}\right) } {}_1F_1\left( -p;\frac{\eta }{2}; -\frac{\xi }{2} \right) , \end{aligned}$$

where in the second step we use the algorithm presented in Petkovšek et al. (1997) to reveal the identity:

$$\begin{aligned} \sum _{j = 0}^{p}\left( {\begin{array}{c}p\\ j\end{array}}\right) \frac{\left( \xi /2\right) ^j}{\Gamma \left( j + \frac{\eta }{2}\right) } = \frac{1}{\Gamma \left( \frac{\eta }{2}\right) }{}_1F_1\left( -p;\frac{\eta }{2}; -\frac{\xi }{2} \right) . \end{aligned}$$

Plugging in \(\eta = 2\gamma\) and \(\xi = hv_0\mathrm {e}^{-\kappa t}\) gives:

$$\begin{aligned} 2^p \frac{\Gamma \left( \gamma +p\right) }{\Gamma \left( \gamma \right) } {}_1F_1\left( -p;\gamma ; -\frac{2\kappa v_0}{\left( \mathrm {e}^{\kappa t}-1\right) \sigma ^2} \right) = {\mathbb {E}}\left[ v_t^p|v_0\right] h^p = {\mathbb {E}}\left[ \left( hv_t\right) ^p|v_0\right] . \end{aligned}$$

Second, the noncentral chi-squared is moment determinate. To see this, notice that the characteristic function of the noncentral chi-square \({\tilde{\chi }}^2_{\eta }\left( \xi \right)\) is defined as:

$$\begin{aligned} \phi _{{\tilde{\chi }}^2_{\eta }(\xi )}(u)=\frac{\exp \left( -\frac{i\xi u}{1-2iu}\right) }{(1-2iu)^{\eta /2}}. \end{aligned}$$

We notice that the characteristic function of the noncentral chi-squared is the product of two analytic functions, \((1-2iu)^{-\eta /2}\) and \(\exp \left( -\frac{i\xi u}{1-2iu}\right)\), which are defined over the complex plane except for the singularity \(u=-i/2\); see Marsden and Hoffman (1999). For the point 0, the radius of convergence is thus 1/2. Thus, there exists a unique power series expansion around 0, and the noncentral chi-squared is moment determinate, as indicated by Cramér (1946, p. 176).

We conclude that every moment of \(hv_t\) coincides with the same moment of the noncentral chi-squared distribution and since the noncentral chi-squared is moment determinate (9) holds. \(\square\)
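Corollary 1 can also be checked by simulation. The following R sketch (reusing cond_moment_v from the sketch after Proposition 1; parameter values are again illustrative) compares a simulated third moment of \(hv_t\) with its analytic counterpart.

```r
# h * v_t | v_0 is noncentral chi-squared with df = 2 * gamma and
# noncentrality h * v0 * exp(-kappa * t), per Corollary 1.
set.seed(1)
kappa <- 2; theta <- 0.04; sigma <- 0.3; v0 <- 0.05; t <- 0.5
gam <- 2 * kappa * theta / sigma^2
h   <- 4 * kappa / (sigma^2 * (1 - exp(-kappa * t)))
x   <- rchisq(1e6, df = 2 * gam, ncp = h * v0 * exp(-kappa * t))
mean(x^3)                                           # simulated third moment
cond_moment_v(3, v0, t, kappa, theta, sigma) * h^3  # analytic counterpart
```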

Define the p-th unconditional moment as \({\mathbb {E}}\left[ v_\infty ^p\right] = \lim \limits _{t \rightarrow \infty }{\mathbb {E}}\left[ v_t^p | v_0 \right]\).

Proposition 2

The p-th unconditional moment of \(v_t\) is determined by:

$$\begin{aligned} {\mathbb {E}}\left[ v_\infty ^p\right] = {\left\{ \begin{array}{ll} c_{p+1}a_p, &{}\text {if } p = 1, \\ \frac{c_{p+1}m_{p-1, p}}{p!\kappa ^{p-1}}, &{} \text {if }p \ge 2. \end{array}\right. } \end{aligned}$$
(10)

Proof

Considering (7), we immediately see that the limit of the first term is given by \(\lim \limits _{t \rightarrow \infty }\mathrm {e}^{-p\kappa t}v_0^p = 0\). On the other hand, since \(\mathrm {e}^{-p\kappa t}b_t^k \rightarrow 0\) for \(k < p\) while \(\mathrm {e}^{-p\kappa t}b_t^p \rightarrow 1\), only the highest-order term of \(\nu _p\) survives, and the limit of the second term is:

$$\begin{aligned} \lim \limits _{t \rightarrow \infty }\mathrm {e}^{-p\kappa t}c_{p+1}\nu _p = {\left\{ \begin{array}{ll} c_{p+1}a_p, &{} \text {if }p = 1, \\ \frac{c_{p+1}m_{p-1, p}}{p!\kappa ^{p-1}}, &{} \text {if }p \ge 2. \end{array}\right. } \end{aligned}$$

\(\square\)

Explicitly, the first four unconditional moments are given by:

$$\begin{aligned} {\mathbb {E}}\left[ v_\infty \right]&=\theta ,&{\mathbb {E}}\left[ v_\infty ^2\right]&= \theta ^2\left( 1 + \gamma ^{-1}\right) , \\ {\mathbb {E}}\left[ v_\infty ^3\right]&= \theta ^3\left( 1 + 3\gamma ^{-1} + 2\gamma ^{-2}\right) ,&{\mathbb {E}}\left[ v_\infty ^4\right]&= \theta ^4\left( 1 + 6\gamma ^{-1} + 11 \gamma ^{-2} + 6\gamma ^{-3}\right) . \end{aligned}$$
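In R, Proposition 2 condenses to a one-liner, since both cases of (10) reduce to \(\prod _{j=2}^{p+1}c_j/(p!\,\kappa ^p)\) (a short computation from the definitions in Proposition 1). The sketch below (illustrative parameters) checks the result against the Gamma-moment closed form appearing in the proof of Corollary 2 below.

```r
# Unconditional moment E[v_infinity^p] per Proposition 2.
uncond_moment_v <- function(p, kappa, theta, sigma) {
  c_ <- function(j) (j - 1) * (kappa * theta + 0.5 * (j - 2) * sigma^2)
  prod(sapply(2:(p + 1), c_)) / (factorial(p) * kappa^p)
}
kappa <- 2; theta <- 0.04; sigma <- 0.3       # illustrative values
gam <- 2 * kappa * theta / sigma^2
uncond_moment_v(3, kappa, theta, sigma)
(theta / gam)^3 * gamma(gam + 3) / gamma(gam) # same value
```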

For completeness, we also recall that, as initially shown by Cox et al. (1985), the unconditional (steady-state) distribution is a gamma distribution.

Corollary 2

Assume the spot variance \(v_t\) follows (2). The unconditional or steady-state distribution of \(v_t\) is then given by:

$$\begin{aligned} v_t\sim \Gamma \left( \gamma , \theta \gamma ^{-1} \right) , \end{aligned}$$
(11)

where \(\Gamma \left( \alpha , \beta \right)\) is the Gamma distribution with shape parameter \(\alpha\) and scale parameter \(\beta\) defined through density \(f(x; \alpha , \beta )=\frac{x^{\alpha -1}}{\Gamma \left( \alpha \right) \beta ^\alpha }\mathrm {e}^{-\frac{x}{\beta }}.\)

Proof

Again, after some tedious algebra or with the help of a symbolic programming language, (10) can be written as:

$$\begin{aligned} {\mathbb {E}}\left[ v_\infty ^p\right] = \left( \theta \gamma ^{-1}\right) ^p\frac{\Gamma \left( \gamma +p\right) }{\Gamma \left( \gamma \right) }. \end{aligned}$$

It is straightforward to see that this expression gives the p-th moment of a Gamma distribution with shape \(\gamma\) and scale \(\theta \gamma ^{-1}\). Now, we simply apply the same argument as in Corollary 1. Since every moment of \(v_t\) is a moment of the Gamma distribution, (11) holds. \(\square\)

Remark 1

For the reader's convenience, we would like to mention that, unlike the current paper, Cox et al. (1985) used shape and rate parameters in the definition of the Gamma distribution. The rate parameter is simply the inverse of the scale parameter.

Remark 2

By using the expression of \({\mathbb {E}}\left[ v_\infty ^p\right]\) appearing in the proof of Corollary 2, we can state that:

$$\begin{aligned} {\mathbb {E}}\left[ v_t^p| v_0\right] = {\mathbb {E}}\left[ v_\infty ^p\right] \left( 1-\mathrm {e}^{-\kappa t}\right) ^p{}_1F_1\left( -p;\gamma ;-\frac{2 \kappa v_0}{\left( \mathrm {e}^{\kappa t}-1\right) \sigma ^2}\right) . \end{aligned}$$

That is, the p-th conditional moment of \(v_t\) can be written as a function of the respective unconditional moment.
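Since \(-p\) is a negative integer, the hypergeometric series truncates after \(p+1\) terms, so Remark 2 can be evaluated in R without any special-function package (a sketch with illustrative parameters):

```r
# Conditional moment via Remark 2; 1F1(-p; gam; z) is a finite sum here.
cond_moment_v_1F1 <- function(p, v0, t, kappa, theta, sigma) {
  gam  <- 2 * kappa * theta / sigma^2
  z    <- -2 * kappa * v0 / ((exp(kappa * t) - 1) * sigma^2)
  poch <- function(a, k) if (k == 0) 1 else prod(a + 0:(k - 1))  # Pochhammer
  F11  <- sum(sapply(0:p, function(k)
    poch(-p, k) / poch(gam, k) * z^k / factorial(k)))
  (theta / gam)^p * gamma(gam + p) / gamma(gam) * (1 - exp(-kappa * t))^p * F11
}
cond_moment_v_1F1(2, v0 = 0.05, t = 0.5, kappa = 2, theta = 0.04, sigma = 0.3)
```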

3.3 Conditional and unconditional comoments of \(v_t\)

Now, let us consider the comoments of \(v_t\).

Proposition 3

The conditional autocomoment of order \(\left( p,q\right)\) of \(v_t\) is given by:

$$\begin{aligned} {\mathbb {E}}\left[ v_0^pv_t^q \big | v_0\right] = \mathrm {e}^{-q\kappa t}\left\{ v_0^{p+q} + c_{q+1}\left( a_{p+q}b_t + v_0^p\sum _{i = 0}^{q-1}m_{i,q}I_{i+1}\right) \right\} . \end{aligned}$$
(12)

Proof

First, we note that \({\mathbb {E}}\left[ v_0^pv_t^q\big |v_0\right] = v_0^p{\mathbb {E}}\left[ v_t^q\big |v_0\right]\). Plugging in (7) gives:

$$\begin{aligned} v_0^p{\mathbb {E}}\left[ v_t^q\big |v_0\right]&= v_0^p\left\{ \mathrm {e}^{-q\kappa t}\left( v_0^q+c_{q+1}\nu _q\right) \right\} . \end{aligned}$$

Using the definitions of \(\nu _q\), \(a_{p+q}\), and \(m_{i,q}\) from Proposition 1, we immediately deduce (12). \(\square\)

Some explicit examples are given by:

$$\begin{aligned} {\mathbb {E}}\left[ v_0 v_t|v_0\right]&= v_0^2 \mathrm {e}^{-\kappa t} +v_0\theta (1-\mathrm {e}^{-\kappa t}),\\ {\mathbb {E}}\left[ v_0 v_t^2|v_0\right]&= v_0^3 \mathrm {e}^{-2\kappa t}+\left( 1+\gamma ^{-1}\right) \left\{ v_0\theta ^2(1-\mathrm {e}^{-\kappa t})^2 + 2v_0^2\theta (\mathrm {e}^{-\kappa t}- \mathrm {e}^{-2\kappa t}) \right\} , \\ {\mathbb {E}}\left[ v_0 v_t^3|v_0\right]&= v_0^4\mathrm {e}^{-3\kappa t} + 3 v_0^3\theta \left( 1+2\gamma ^{-1}\right) (\mathrm {e}^{-2 \kappa t}-\mathrm {e}^{-3\kappa t})\\&\quad +\left( 1 + 3\gamma ^{-1} + 2\gamma ^{-2}\right) \left\{ v_0\theta ^3(1-\mathrm {e}^{-\kappa t})^3 \right. \\&\quad \left. + 3v_0^2\theta ^2 \left( \mathrm {e}^{-\kappa t}-2\mathrm {e}^{-2\kappa t} + \mathrm {e}^{-3\kappa t}\right) \right\} , \\ {\mathbb {E}}\left[ v_0^2v_t^2|v_0\right]&= v_0^4\mathrm {e}^{-2\kappa t} + \left( 1+\gamma ^{-1} \right) \left\{ 2v_0^3\theta \left( \mathrm {e}^{-\kappa t} - \mathrm {e}^{-2\kappa t}\right) + v_0^2\theta ^2\left( 1-\mathrm {e}^{-\kappa t}\right) ^2 \right\} . \end{aligned}$$

Remark 3

Using the insight of Corollary 1, we could write (12) as:

$$\begin{aligned} {\mathbb {E}}\left[ v_0^pv_t^q|v_0\right] = \left( \theta \gamma ^{-1}\right) ^q\frac{\Gamma \left( \gamma +q\right) }{\Gamma \left( \gamma \right) }v_0^p\left( 1-\mathrm {e}^{-\kappa t}\right) ^q {}_1F_1\left( -q;\gamma ;-\frac{2 \kappa v_0}{\left( \mathrm {e}^{\kappa t}-1\right) \sigma ^2}\right) . \end{aligned}$$

Proposition 4

The unconditional autocomoment of order \(\left( p, q\right)\) of \(v_t\) is given by:

$$\begin{aligned} {\mathbb {E}}\left[ v_0^p v_t^q\right] = \mathrm {e}^{-q \kappa t}\left( {\mathbb {E}}\left[ v_0^{p+q} \right] + c_{q+1}{\mathbb {E}}\left[ v_0^p \nu _q\right] \right) , \end{aligned}$$
(13)

where

$$\begin{aligned} {\mathbb {E}}\left[ v_0^p \nu _q \right] = {\mathbb {E}}\left[ v_0^{p+q-1}\right] \frac{b_t}{\kappa } + \sum _{i = 0}^{q-1}{\mathbb {E}}\left[ v_0^pm_{i,q}\right] I_{i+1} \end{aligned}$$

with

$$\begin{aligned} {\mathbb {E}}\left[ v_0^pm_{k,q}\right] = {\left\{ \begin{array}{ll} 0, &{} \text {if }k = 0, \\ \frac{1}{\kappa }{\mathbb {E}}\left[ v_0^{p+q-k-1}\right] \prod \limits _{j = q-k+1}^{q}c_{j}, &{} \text {if }k \ge 1. \end{array}\right. } \end{aligned}$$

Proof

First, plug \(v_t^q\) from (8) into \({\mathbb {E}}\left[ v_0^p v_t^q\right]\) and use the definition of \(c_p\):

$$\begin{aligned} \begin{aligned} {\mathbb {E}}\left[ v_0^p v_t^q\right]&= {\mathbb {E}}\left[ v_0^p\left\{ \mathrm {e}^{-q \kappa t} \left[ v_0^q + q\left\{ \kappa \theta + \frac{1}{2}\left( q-1\right) \sigma ^2\right\} \nu _q \right] \right\} \right] \\&= {\mathbb {E}}\left[ v_0^{p+q} \right] \mathrm {e}^{-q \kappa t} + \mathrm {e}^{-q \kappa t}c_{q+1}{\mathbb {E}}\left[ v_0^p \nu _q\right] . \end{aligned} \end{aligned}$$

Next, evaluate \({\mathbb {E}}\left[ v_0^p \nu _q \right]\):

$$\begin{aligned} \begin{aligned} {\mathbb {E}}\left[ v_0^p \nu _q \right]&= {\mathbb {E}}\left[ v_0^p \left\{ a_qb_t + \sum _{i = 0}^{q-1}m_{i,q} I_{i+1}\right\} \right] = {\mathbb {E}}\left[ v_0^{p+q-1}\right] \frac{b_t}{\kappa } + \sum _{i = 0}^{q-1}{\mathbb {E}}\left[ v_0^pm_{i,q}\right] I_{i+1} \end{aligned}, \end{aligned}$$

where we used \(a_p\) and the linearity of the expectation operator. Finally, we have to evaluate \({\mathbb {E}}\left[ v_0^pm_{k,q}\right]\) where we use \(a_p\) and \(m_{k,p}\) to get:

$$\begin{aligned} {\mathbb {E}}\left[ v_0^p m_{k,q}\right] = {\left\{ \begin{array}{ll} 0, &{} \text {if } k = 0, \\ \frac{1}{\kappa }{\mathbb {E}}\left[ v_0^{p+q-k-1}\right] \prod \limits _{j = q-k+1}^{q}c_{j},&{} k \ge 1. \end{array}\right. } \end{aligned}$$

\(\square\)

Some examples of unconditional comoments are as follows:

$$\begin{aligned} {\mathbb {E}}\left[ v_0v_t\right]&= \theta ^2\left( 1+\gamma ^{-1}\mathrm {e}^{-\kappa t}\right) , \\ {\mathbb {E}}\left[ v_0v_t^2\right]&= \theta ^3\left( 1+\gamma ^{-1}\right) \left( 1+2\gamma ^{-1}\mathrm {e}^{-\kappa t}\right) ,\\ {\mathbb {E}}\left[ v_0v_t^3\right]&= \theta ^4\left( 1+3\gamma ^{-1}+2\gamma ^{-2}\right) \left( 1+3\gamma ^{-1}\mathrm {e}^{-\kappa t}\right) , \\ {\mathbb {E}}\left[ v_0^2v_t^2\right]&= \theta ^4\left\{ \left( 1+2\gamma ^{-1} + \gamma ^{-2}\right) \left( 1+ 4\gamma ^{-1}\mathrm {e}^{-\kappa t}\right) + 2\gamma ^{-2}\left( 1+\gamma ^{-1}\right) \mathrm {e}^{-2\kappa t}\right\} . \end{aligned}$$

Corollary 3

Equation (13) can be written as:

$$\begin{aligned} {\mathbb {E}}\left[ v_0^pv_t^q\right] = \left( \theta \gamma ^{-1}\right) ^{p+q}\frac{\Gamma \left( \gamma + p +q\right) }{\Gamma \left( \gamma \right) }\mathrm {e}^{-q\kappa t}\, {}_2F_1\left( -q, 1-q-\gamma ; 1-p-q-\gamma ; 1- \mathrm {e}^{\kappa t}\right) , \end{aligned}$$

where \(_2F_1\left( a, b ; c; z\right) = \sum _{k = 0}^{\infty } \frac{\Gamma \left( a+k\right) \Gamma \left( b+k\right) \Gamma \left( c\right) }{\Gamma \left( a\right) \Gamma \left( b\right) \Gamma \left( c+k\right) }\frac{z^k}{k !}\) is the Gaussian hypergeometric function.

Corollary 4

The unconditional comoments of \(v_t\) possess the commutative property. Namely:

$$\begin{aligned} {\mathbb {E}}\left[ v_0^pv_t^q\right] = {\mathbb {E}}\left[ v_0^qv_t^p\right] . \end{aligned}$$
(14)

Proof

Considering Corollary 3, we see that the first two factors of the expression are symmetric in p and q. Therefore, we leave them out of our further consideration. Making use of Euler's transformation \({}_2F_1\left( a, b ; c; z\right) =\left( 1-z\right) ^{c-a-b}{}_2F_1\left( c-a, c-b; c; z\right)\) and noting that \({}_2F_1\left( a, b ; c; z\right) = {}_2F_1\left( b, a ; c; z\right)\), we can write the two remaining factors as:

$$\begin{aligned} \mathrm {e}^{-p\kappa t}{}_2F_1\left( -p, 1-p-\gamma ; 1-p-q-\gamma ; 1- \mathrm {e}^{\kappa t}\right) , \end{aligned}$$

which completes the proof. \(\square\)

Remark 4

Using Corollary 3, we can write that:

$$\begin{aligned} {\mathbb {E}}\left[ v_0^pv_t^q\right] = {\mathbb {E}}\left[ v_\infty ^{p+q}\right] \mathrm {e}^{-q\kappa t} {}_2F_1\left( -q, 1-q-\gamma ; 1-p-q-\gamma ; 1- \mathrm {e}^{\kappa t}\right) , \end{aligned}$$

That is, the unconditional comoment of order \(\left( p,q\right)\) is a function of the unconditional moment of order \(p+q\).
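As in Remark 2, \(-q\) being a negative integer makes the Gaussian hypergeometric series finite, so Corollary 3 and the commutativity of Corollary 4 can be checked numerically with a few lines of R (illustrative parameters):

```r
# Unconditional autocomoment E[v_0^p v_t^q] via Corollary 3; 2F1 truncates.
uncond_comoment_v <- function(p, q, t, kappa, theta, sigma) {
  gam  <- 2 * kappa * theta / sigma^2
  z    <- 1 - exp(kappa * t)
  poch <- function(a, k) if (k == 0) 1 else prod(a + 0:(k - 1))
  F21  <- sum(sapply(0:q, function(k)
    poch(-q, k) * poch(1 - q - gam, k) / poch(1 - p - q - gam, k) *
      z^k / factorial(k)))
  (theta / gam)^(p + q) * gamma(gam + p + q) / gamma(gam) *
    exp(-q * kappa * t) * F21
}
uncond_comoment_v(1, 2, t = 0.5, kappa = 2, theta = 0.04, sigma = 0.3)
uncond_comoment_v(2, 1, t = 0.5, kappa = 2, theta = 0.04, sigma = 0.3)  # equal
```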

We conclude this section by noticing that, by extending the computations found in Dana and Jeanblanc-Picqué (1994) and Shreve (2004), we are able to demonstrate in an alternative manner, using the SDE route, that the transition density of the CIR process is indeed a noncentral chi-squared distribution. Once this has been recognized, the unconditional distribution is easy to obtain as the asymptotic density. It is also relatively easy to deduce the conditional and unconditional autocomoments.

4 Conditional and unconditional characteristic function of log-returns

Our objective is to obtain analytic expressions for higher moments of the stochastic volatility jump diffusion (SVJD) process defined in Sect. 2. In this section, we derive the conditional and unconditional characteristic function of log-returns \(r_{t} = X_t - X_0.\) In the next section, by differentiation of the characteristic function, we generate the conditional and unconditional moments of \(r_{t}\).

It should be mentioned that, concerning Heston's process, that is, the process under consideration without price jumps, an interesting paper by Zhang et al. (2017) derives expressions of higher moments with a novel approach starting from the SDE instead of the characteristic function. It is not clear how to incorporate jumps into their approach, and for this reason we decided to work with the characteristic function instead.

Proposition 5

Assuming log-prices \(X_t\) follow (2)-(3)-(4), the conditional characteristic function of the log-return \(r_{t}\) is given by:

$$\begin{aligned} \phi _{r_{t}}(u|v_0) = \exp \{A(u, \mu , t) + B(\theta , \kappa , \rho , \sigma , u, t) + C(\kappa , \rho , \sigma , u, t, v_0) + D(\lambda , \mu _J, u, \sigma _J, t)\} \end{aligned}$$
(15)

where

$$\begin{aligned} A\left( u, \mu , t \right)&= \mathrm {i}u\mu t, \\ B\left( \theta , \kappa , \rho , \sigma , u ,t\right)&= \frac{\theta \kappa }{\sigma ^2}\left\{ (\kappa - \rho \sigma u \mathrm {i}-d)t-2\log \left( \frac{1- g \mathrm {e}^{-dt}}{1-g}\right) \right\} , \\ C\left( \kappa , \rho , \sigma , u, t, v_0\right)&= v_0\frac{\kappa - \rho \sigma \mathrm {i}u-d}{\sigma ^{2}}\left( \frac{1-\mathrm {e}^{-dt}}{1-g\mathrm {e}^{-dt}}\right) , \\ D\left( \lambda , \mu _J, u, \sigma _J, t\right)&= -\lambda \left\{ \log (1 + \mu _J) - \sigma _J^2\right\} \mathrm {i}ut \\&\quad + \lambda t \left[ (1+\mu _J)^{\mathrm {i}u}\exp \left\{ \sigma _J^2\mathrm {i}u(\mathrm {i}u-1)\right\} - 1 \right] , \end{aligned}$$

with \(d = \sqrt{(\rho \sigma u \mathrm {i}-\kappa )^2 + \sigma ^2(\mathrm {i}u+u^2)}\), \(g = \frac{\kappa - \rho \sigma u \mathrm {i} - d}{\kappa - \rho \sigma u \mathrm {i} + d}\), and \(\mathrm {i}\) the imaginary unit, \(\mathrm {i}^2 = -1\).

Remark 5

Terms similar to the A, B, and C terms above have also been obtained in Heston (1993). The original function g as defined in Heston (1993) may be numerically unstable. For this reason, we use the version of Schoutens et al. (2003) and Albrecher et al. (2007). In this version, the minus and plus signs in front of the term d in the numerator and denominator of g are flipped, which corresponds to a numerically stable complex root.

Proof

First, given that the jump intensity, \(\lambda\), and the jumps, \(J_t\), are independent of the Brownian increments, we recognize that the proposed process can be decomposed into a continuous and a discontinuous part. Since these two parts are independent, the characteristic function of each part can be computed separately. Multiplying both functions together, we deduce the characteristic function of the proposed process. Since the continuous part corresponds to the model of Heston (1993), the conditional characteristic function is well known and given by:

$$\begin{aligned} \phi _{r_{t}}^{Heston}(u|v_0) =&\exp \{A(u, \mu , t) + B(\theta , \kappa , \rho , \sigma , u, t) + C(\kappa , \rho , \sigma , u, t, v_0)\} \end{aligned}$$

with functions A, B and C as given above. The characteristic function of the discontinuous part, that is the compensated compound Poisson process \(Y_t\) defined in (5), is given by:

$$\begin{aligned} \phi _{Y_t}\left( u\right) = {\mathbb {E}}\left[ \mathrm {e}^{\mathrm {i}uY_t}\right] = \exp \left\{ \lambda t\left( \phi _{\mathrm {log}(1+J_t)}(u) -1 - \mathrm {i}u{\mathbb {E}}\left[ \mathrm {log}\left( 1+J_t\right) \right] \right) \right\} , \end{aligned}$$

where

$$\begin{aligned} \phi _{\mathrm {log}(1+J_t)}(u)&= \exp \left\{ \mathrm {i}u\left( \mathrm {log}\left( 1 + \mu _J\right) - \sigma _J^2\right) - \sigma _J^2 u^2\right\} \\&= \left( 1+\mu _J\right) ^{\mathrm {i}u}\mathrm {e}^{-\mathrm {i}u\sigma _J^2-\sigma _J^2u^2} = \left( 1+\mu _J\right) ^{\mathrm {i}u}\mathrm {e}^{\sigma _J^2(\mathrm {i}u)^2 - \mathrm {i}u\sigma _J^2}, \end{aligned}$$

is the characteristic function of the normally distributed jump size \(\log \left( 1+J_t\right)\), with corresponding expected jump size:

$$\begin{aligned} {\mathbb {E}}\left[ \mathrm {log}(1+J_t)\right] = \mathrm {log}\left( 1+\mu _J\right) -\sigma _J^2. \end{aligned}$$

Inserting both expressions into \(\phi _{Y_t}\left( u\right)\) gives:

$$\begin{aligned} \phi _{Y_t}\left( u\right)&= \exp \left[ \lambda t \left\{ \left( 1+\mu _J\right) ^{\mathrm {i}u}\mathrm {e}^{\sigma _J^2(\mathrm {i}u)^2 - \mathrm {i}u\sigma _J^2 } - 1- \mathrm {i}u \left( \mathrm {log}(1 + \mu _J) - \sigma _J^2\right) \right\} \right] \\&= \exp \left[ -\lambda \left( \mathrm {log}(1 + \mu _J) - \sigma _J^2\right) \mathrm {i}ut + \lambda t \left\{ (1+\mu _J)^{\mathrm {i}u}\mathrm {e}^{\sigma _J^2\mathrm {i}u(\mathrm {i}u-1)} - 1 \right\} \right] . \end{aligned}$$

Multiplying \(\phi _{r_{t}}^{Heston}(u|v_0)\) with \(\phi _{Y_t}(u)\) gives (15). \(\square\)
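A direct R transcription of the conditional characteristic function (15) may look as follows (a sketch; the function and argument names are ours):

```r
# Conditional characteristic function of log-returns, equation (15).
phi_bates <- function(u, t, v0, mu, kappa, theta, sigma, rho,
                      lambda, mu_J, sigma_J2) {
  d <- sqrt((rho * sigma * u * 1i - kappa)^2 + sigma^2 * (1i * u + u^2))
  g <- (kappa - rho * sigma * u * 1i - d) / (kappa - rho * sigma * u * 1i + d)
  A <- 1i * u * mu * t
  B <- theta * kappa / sigma^2 *
    ((kappa - rho * sigma * u * 1i - d) * t -
       2 * log((1 - g * exp(-d * t)) / (1 - g)))
  C <- v0 * (kappa - rho * sigma * 1i * u - d) / sigma^2 *
    (1 - exp(-d * t)) / (1 - g * exp(-d * t))
  D <- -lambda * (log(1 + mu_J) - sigma_J2) * 1i * u * t +
    lambda * t * ((1 + mu_J)^(1i * u) * exp(sigma_J2 * 1i * u * (1i * u - 1)) - 1)
  exp(A + B + C + D)
}
phi_bates(0, t = 1, v0 = 0.04, mu = 0.05, kappa = 2, theta = 0.04, sigma = 0.3,
          rho = -0.7, lambda = 0.5, mu_J = -0.05, sigma_J2 = 0.02)  # 1 at u = 0
```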

Proposition 6

Assuming log-prices \(X_t\) follow (2)-(3)-(4), the unconditional characteristic function of the log-return \(r_{t}\) is given by:

$$\begin{aligned} \begin{aligned} \phi _{r_{t}}(u)&= \exp \{A(u, \mu , t) + B(\theta , \kappa , \rho , \sigma , u, t) + {\mathcal {C}}(\kappa , \theta , \rho , \sigma , u, t) + D(\lambda , \mu _J, u, \sigma _J, t)\}, \end{aligned} \end{aligned}$$
(16)

where the functions A, B,  and D are defined as above and the function \({\mathcal {C}}\) is defined as:

$$\begin{aligned} {\mathcal {C}}(\kappa , \theta , \rho , \sigma , u, t) = \log \left[ \left( \frac{2\kappa }{\sigma ^2}\right) ^{\gamma } \left\{ \frac{2\kappa }{\sigma ^2} - \frac{\kappa - \rho \sigma \mathrm {i}u-d}{\sigma ^{2}}\left( \frac{1-\mathrm {e}^{-dt}}{1-g\mathrm {e}^{-dt}}\right) \right\} ^{-\gamma }\right] . \end{aligned}$$

Proof

First, we define the unconditional characteristic function of \(r_{t}\) as:

$$\begin{aligned} \phi _{r_{t}}(u) = {\mathbb {E}}\left[ \phi _{r_{t}}(u|v_0)\right] =\int _{0}^{\infty } \phi _{r_{t}}(u|v_0) f(v_0) \mathrm {d}v_0, \end{aligned}$$

where \(f(v_0)\) denotes the steady-state density function of the latent variance v defined in (2). From Corollary 2, we recognize the Gamma distribution with shape parameter \(\gamma\) and scale parameter \(\theta \gamma ^{-1}\). Next, notice that only the function C in \(\phi _{r_{t}}(u|v_0)\) depends on \(v_0\). Therefore, we only have to evaluate:

$$\begin{aligned} \int _{0}^{\infty }\exp \{C(\kappa , \rho , \sigma , u, t, v_0)\} f(v_0) \mathrm {d}v_0, \end{aligned}$$

which gives:

$$\begin{aligned} \left( \frac{2\kappa }{\sigma ^2}\right) ^{\gamma } \left( \frac{2\kappa }{\sigma ^2} - \frac{\kappa - \rho \sigma \mathrm {i}u-d}{\sigma ^{2}}\left[ \frac{1-\mathrm {e}^{-dt}}{1-g\mathrm {e}^{-dt}}\right] \right) ^{-\gamma }. \end{aligned}$$

Defining the logarithm of this term as \({\mathcal {C}}(\kappa , \theta , \rho , \sigma , u, t)\) and substituting for \(C(\kappa , \rho , \sigma , u, t, v_0)\) in \(\phi _{r_{t}}(u|v_0)\) gives (16). \(\square\)
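In code, only the C term changes. Writing \({\mathcal {C}} = -\gamma \log \left\{ 1-\sigma ^2 C/(2\kappa v_0)\right\}\), an algebraically equivalent form of the expression above, the unconditional characteristic function (16) can be sketched in R as:

```r
# Unconditional characteristic function of log-returns, equation (16):
# the C term is replaced by its average over the Gamma steady state of v_0.
phi_bates_uncond <- function(u, t, mu, kappa, theta, sigma, rho,
                             lambda, mu_J, sigma_J2) {
  gam <- 2 * kappa * theta / sigma^2
  d <- sqrt((rho * sigma * u * 1i - kappa)^2 + sigma^2 * (1i * u + u^2))
  g <- (kappa - rho * sigma * u * 1i - d) / (kappa - rho * sigma * u * 1i + d)
  psi <- (kappa - rho * sigma * 1i * u - d) / sigma^2 *
    (1 - exp(-d * t)) / (1 - g * exp(-d * t))      # so that C = v0 * psi
  calC <- -gam * log(1 - sigma^2 * psi / (2 * kappa))
  A <- 1i * u * mu * t
  B <- theta * kappa / sigma^2 *
    ((kappa - rho * sigma * u * 1i - d) * t -
       2 * log((1 - g * exp(-d * t)) / (1 - g)))
  D <- -lambda * (log(1 + mu_J) - sigma_J2) * 1i * u * t +
    lambda * t * ((1 + mu_J)^(1i * u) * exp(sigma_J2 * 1i * u * (1i * u - 1)) - 1)
  exp(A + B + calC + D)
}
```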

5 Conditional and unconditional moments of log-returns

Now, the k-th unconditional noncentral moment of log-returns \(r_{t}\) can be obtained as:

$$\begin{aligned} \mu _k = {\mathbb {E}}[(r_{t})^k] = \mathrm {i}^{-k} \frac{\mathrm {d}^k \phi _{r_{t}}(u)}{\mathrm {d} u^k}\Biggr |_{u=0}. \end{aligned}$$
(17)

To compute the conditional moments, that is, the moments dependent on the starting value of the latent variance \(v_0\), one simply replaces \(\phi _{r_{t}}(u)\) by \(\phi _{r_{t}}(u|v_0)\) in (17). In the following, we will focus on the unconditional moments.

Let \(\zeta _1 ={\mathbb {E}}\left[ \left( r_{t} - \mu _1\right) ^2\right] = \mu _2 - \mu _1^2\) be the unconditional variance of \(r_{t}\). Further, let \(\zeta _2\) and \(\zeta _3\) be defined as the unconditional skewness and kurtosis of \(r_{t}\). Just like the variance, we can write these two quantities as functions of unconditional moments:

$$\begin{aligned} \zeta _2&= {\mathbb {E}}\left[ \left( \frac{r_{t} - \mu _1}{\zeta _1^{1/2}}\right) ^3\right] = \frac{\mu _3 - 3\mu _1\mu _2+2\mu _1^3}{(\mu _2 - \mu _1^2)^{3/2}} ,\\ \zeta _3&= {\mathbb {E}}\left[ \left( \frac{r_{t} - \mu _1}{\zeta _1^{1/2}}\right) ^4\right] = \frac{\mu _4 - 4\mu _3\mu _1+6\mu _2\mu _1^2-3\mu _1^4}{(\mu _2 - \mu _1^2)^{2}}. \end{aligned}$$
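These relationships translate into a small R helper (a sketch) mapping the first four noncentral moments into variance, skewness, and kurtosis:

```r
# Variance, skewness, and kurtosis from noncentral moments mu = c(mu1, ..., mu4).
std_moments <- function(mu) {
  zeta1 <- mu[2] - mu[1]^2
  zeta2 <- (mu[3] - 3 * mu[1] * mu[2] + 2 * mu[1]^3) / zeta1^(3/2)
  zeta3 <- (mu[4] - 4 * mu[3] * mu[1] + 6 * mu[2] * mu[1]^2 - 3 * mu[1]^4) / zeta1^2
  c(variance = zeta1, skewness = zeta2, kurtosis = zeta3)
}
```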

By utilizing these relationships and a symbolic programming language, we are able to calculate closed-form representations of all moments as well as the variance, skewness, and kurtosis for \(r_{t}\):

$$\begin{aligned} \mu _1&= \left( \mu - \frac{\theta }{2}\right) t,\\ \mu _2&=\frac{1}{4 \kappa ^2}\left\{ \mathrm {e}^{-\kappa t}\theta \sigma \kappa ^{-1}\left( \sigma -4\kappa \rho \right) -\sigma ^2\theta \kappa ^{-1}+\sigma \theta \left( 4\rho +\sigma t\right) \right\} \\&\quad - \kappa ^{-1}\rho \sigma t\theta + t\theta \\&\quad + \frac{t}{4}\left\{ t\left( \theta -2\mu \right) ^2 +2\lambda \sigma _J^2\left( 4 + 2\sigma _J^2\right) \right\} \\&\quad +\lambda t\log (1+\mu _J)\left\{ \log (1+\mu _J)-2\sigma _J^2\right\} , \\ \mu _3&= \frac{\lambda t}{4} \log (\mu _J+1) \left\{ 3 \sqrt{2} \sigma _J \left( 2 t \theta +\sqrt{2}\sigma _J+4-4 \mu t\right) +4 \log ^2(\mu _J+1)\right. \\&\quad -\left. 6 \log (\mu _J+1) \left( t\theta + \sqrt{2}\sigma _J-2 \mu t\right) \right\} \\&\quad - \frac{3 \sigma \theta }{8 \kappa ^5\mathrm {e}^{\kappa t}}\left\{ 4\kappa ^3 \rho \left( 2 \mu t+2 \rho \sigma t-t \theta -2\right) \right. \\&\quad +\left. \kappa ^2 \sigma \left( 16 \rho ^2-6 \rho \sigma t+t\theta +4-2 \mu t\right) +\kappa \sigma ^2 \left( \sigma t-12 \rho \right) +2 \sigma ^3 \right\} - \frac{t\lambda \sqrt{2}\sigma _J^3}{4}\\&\quad +\frac{t}{8} \left[ 6 \lambda \sigma _J^2 \left( 2 \mu t-t \theta -4\right) +12 \sqrt{2} \lambda t \sigma _J \left( 2 \mu -\theta \right) \right. \\& \quad+ t \left. \left( 2 \mu -\theta \right) \left\{ t \left( \theta -2 \mu \right) ^2+12 \theta \right\} \right] \\&\quad +\frac{3\rho \sigma t \theta }{2\kappa } \left( t \theta +2-2 \mu t\right) -\frac{3\sigma \theta }{8\kappa ^2} \left\{ \rho \left( 4 t \theta +8-8 \mu t\right) +\sigma t \left( t \theta +4-2 \mu t\right) +8 \rho ^2 \sigma t\right\} \\&\quad +\frac{3\sigma ^2 \theta }{8\kappa ^3}\left( 16 \rho ^2+6 \rho \sigma t+t \theta +4-2 \mu t\right) -\frac{3\sigma ^3 \theta }{8\kappa ^4} \left( 12 \rho +\sigma t\right) +\frac{3\sigma ^4 \theta }{4\kappa ^5}, \\ \zeta _1&= t\theta -\frac{\sigma \theta \{\mathrm {e}^{\kappa t} (\kappa t-1)+1\} (4 \kappa \rho -\sigma )}{4 \kappa ^3\mathrm {e}^{\kappa t}} \\&\quad + \lambda t\left[ \log ^2(\mu _J+1)+2\sigma _J^2\left\{ 1-\log (\mu _J+1)\right\} +\sigma _J^4\right] , \\ \zeta _2&= \frac{\zeta _1^{-3/2}}{8} \left[ 2 \lambda t \log (\mu _J+1) \left\{ 4 \log ^2(\mu _J+1)+3 \sqrt{2} \sigma _J \left( \sqrt{2}\sigma _J+4\right) -6 \sigma _J \log (\mu _J+1)\right\} \right. \\&\quad -2 \sqrt{2}\lambda t \sigma _J^3- \frac{3 \sigma \theta }{\kappa ^5\mathrm {e}^{\kappa t}} \left( 2 \kappa \rho -\sigma \right) \left\{ 4 \kappa ^2 \left( \rho \sigma t-1 \right) +\kappa \sigma \left( 8 \rho -\sigma t\right) -2 \sigma ^2\right\} -24 \lambda t \sigma _J^2\\&\quad + \left. \frac{3 \sigma \theta }{\kappa ^5} \left( 2 \kappa \rho -\sigma \right) \left\{ 4 \kappa ^3 t-4 \kappa ^2 \left( \rho \sigma t+1\right) -\kappa \sigma \left( 8 \rho +\sigma t\right) -2 \sigma ^2 \right\} \right] . \end{aligned}$$

The expressions for \(\mu _4\) and \(\zeta _3\) are too long to be presented. Nonetheless, they are part of the programs we implement in R, MATLAB, and Mathematica. The codes can be downloaded from the following GitHub repository: https://github.com/mrockinger/HigherMoments.

One may question the necessity of working with such long formulas, let alone their validity. To validate our approach, we turn to a numerical investigation that evaluates the moments for different parameters and compares the analytic moments with those obtained by competing methods.

6 Relevance of analytic formulas

Having obtained analytic expressions for the various moments, we may compare them with simulated moments and with moments obtained as finite differences. Below, we present results for unconditional moments, restricting the model to Heston's model, i.e., omitting the jump part. In unreported simulations, we also validated the moments for the full process of interest. Our emphasis is on revealing the deterioration of simulation- and finite-difference-based numerical approximations of higher moments when the exact formulas are not used. For the simulations, we used a simple Euler scheme and the Quadratic Exponential (QE) scheme by Andersen (2008). Since the Euler scheme did not perform as well as Andersen's, we retained only the latter.

Concerning finite differences, we computed the moments of the process by approximating the derivatives in (17). The finite difference approximations \(m_k\) below are used to approximate the noncentral moments \(\mu _k\):

$$\begin{aligned} m_1= & {} \frac{1}{i}\frac{\phi (u+h)-\phi (u-h)}{2h}, \end{aligned}$$
(18)
$$\begin{aligned} m_2= & {} -\frac{\phi (u+h)-2\phi (u)+\phi (u-h)}{h^2}, \end{aligned}$$
(19)
$$\begin{aligned} m_3= & {} -\frac{1}{i}\frac{\phi (u+2h)-2\phi (u+h)+2\phi (u-h)-\phi (u-2h)}{2h^3}, \end{aligned}$$
(20)
$$\begin{aligned} m_4= & {} \frac{\phi (u\!+\!3h)-2\phi (u\!+\!2h)-\phi (u\!+\!h)+4\phi (u)-\phi (u\!-\!h)-2\phi (u\!-\!2h)+\phi (u\!-\!3h)}{4h^4}. \end{aligned}$$
(21)

In theory, as h becomes infinitely small, \(m_k\) converges to the theoretical moment \(\mu _k.\) In practice, the choice of h depends on the relative error due to rounding in floating point arithmetic. On a 64-bit machine, this error is of the order of 2.2\(\times 10^{-16}.\) Because of the fourth power of h in the denominator of the fourth moment, the achievable relative error is of the order \((2.2\times 10^{-16})^{1/4}\approx 1.21\times 10^{-4},\) which is a relatively large value. We assess the relative stability of the approximations of variance, skewness, and kurtosis by letting h take the values 0.1, 0.01, and 0.001.
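The following R sketch implements (18)-(21) around \(u=0\), reusing the conditional characteristic function phi_bates from the sketch in Sect. 4 with \(\lambda =0\) (the Heston case) and illustrative parameters; varying h reproduces the kind of instability discussed below.

```r
# Finite difference approximations (18)-(21) of the noncentral moments.
fd_moments <- function(phi, h) {
  m1 <- (phi(h) - phi(-h)) / (2 * h) / 1i
  m2 <- -(phi(h) - 2 * phi(0) + phi(-h)) / h^2
  m3 <- -(phi(2 * h) - 2 * phi(h) + 2 * phi(-h) - phi(-2 * h)) / (2 * h^3) / 1i
  m4 <- (phi(3 * h) - 2 * phi(2 * h) - phi(h) + 4 * phi(0) -
           phi(-h) - 2 * phi(-2 * h) + phi(-3 * h)) / (4 * h^4)
  Re(c(m1 = m1, m2 = m2, m3 = m3, m4 = m4))
}
phi <- function(u) phi_bates(u, t = 1, v0 = 0.04, mu = 0.05, kappa = 2,
                             theta = 0.04, sigma = 0.3, rho = -0.7,
                             lambda = 0, mu_J = 0, sigma_J2 = 0)
sapply(c(0.1, 0.01, 0.001), function(h) fd_moments(phi, h))  # one column per h
```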

Table 1 We restrict the model to the continuous trajectory, i.e. Heston’s model omitting the jump part

In Table 1, we present the results of our investigations. In columns 1-3, we retained the same parameters as in Andersen (2008). In the last column, we used the parameters obtained by Eraker et al. (2003) for the stochastic volatility model (without jumps). The row called Feller presents the value taken by the ratio \(Feller=2\kappa \theta /\sigma ^2.\) We recall that if this ratio is larger than 1, then what is known as the Feller condition is satisfied, and the variance process will remain away from 0. As the value taken by this ratio for realistic parameters is usually smaller than 1, the variance process can hit 0. In such a case, higher moments may be difficult to approximate either by simulation or by finite differences. The statistics with a superscript Th are the theoretical moments obtained above. Statistics with a superscript Sim are obtained by using Andersen's (2008) Quadratic Exponential algorithm with \(\gamma _1=\gamma _2\). We averaged the moments of 10'000 runs, each with 10'000 observations. Statistics with a \(\Delta\) are the finite differences obtained for the various values of h indicated above.

Starting with the mean and variance, we notice a perfect fit for the finite difference approximation. We also notice a very good fit for the moments obtained from the QE scheme. Turning to the skewness and kurtosis estimates, we notice first a significant deterioration if one computes higher moments with simulated data. This deterioration is strongest when the Feller condition is most strongly violated. We also notice that for relatively large values of h, for instance 0.1 and 0.01, the finite difference moments are equal to the theoretical moments (which, incidentally, validates our theoretical formulas). As one decreases the step size to 0.001, one notices a degeneration of the skewness estimate, which moves into the complex domain. Had one taken \(h=0.0001\), skewness would have exploded, taking unrealistic values in the thousands. It is for kurtosis that the finite difference scheme deteriorates most obviously. Already for \(h=0.01\), the kurtosis obtained from finite differences may differ by about 1% from the theoretical values. For \(h=0.001\), kurtosis explodes, even taking negative values.

Possibly, by using a more advanced difference scheme, one could obtain better approximations of skewness and kurtosis. This is, however, not the purpose of our work, since we propose using the exact analytic formulas, which we have validated numerically.

7 Conclusion

The objective of this work was to obtain conditional and unconditional moments of returns defined as differentials of log-prices. These returns are allowed to be contaminated by Gaussian jumps occurring with a frequency determined by a homogeneous Poisson process. Specifically, we adapt Bates (1996) by compensating jumps in such a way that the jump part does not affect the conditional expectation of returns. Such a model may prove useful for time-series investigations in continuous time where it is required that returns computed as log-price differentials have an explicit parameter for the mean and where the jump part of the process has been compensated.

As in Heston (1993) and Bates (1996), we assume that the spot variance follows a CIR process. We discuss the historical developments that led CIR to recognize that the transition probability of their stochastic differential equation is a noncentral chi-squared distribution and that the unconditional solution is a gamma distribution. Essentially, they obtained the Laplace transform that the transition probability must satisfy. Recognizing the expression of a noncentral chi-squared distribution, initially discovered by Fisher (1928), they used the first and second moments from that literature.

We were able to derive those results following an alternative route: Building on the insights of Dana and Jeanblanc-Picqué (1994) and Shreve (2004), we solve the stochastic differential equation directly and obtain simple yet general formulas for all conditional and unconditional (co-)moments of the spot volatility. By recognizing that these moments are identical to those of a noncentral chi-squared distribution, we derive the conditional and unconditional distributions of the solution to the CIR process in a novel way.

If spot or near-spot volatility observations are available or can at least be estimated, these moments could help estimate the parameters of the process, since it is known that the estimation of the CIR process is delicate; see Johnson et al. (1970).

We then provide explicit formulas for the conditional and unconditional characteristic function of log-returns in our model. We show how to use these functions to deduce the conditional and unconditional moments of log-returns. Finally, we give explicit expressions for the first four unconditional moments of log-returns. These formulas could be used to investigate the impact of changes in a single parameter on higher moments or to validate novel simulation approaches for the general class of semi-continuous diffusion processes.

All moment functions, for the solution of the CIR process as well as for log-returns, are implemented in R, MATLAB, and Mathematica and can be downloaded from the linked GitHub repository.