1 Introduction

Volatility is a degree of uncertainty of an underlying risky asset and the volatility management and trading are considered to be a crucial task for private investors, institutions, and hedge funds. Volatility derivatives are a class of financial derivatives where the payoff is a function of some measure of the volatility. Variance swap is a prominent example of this type of derivatives. It is one of forward contracts for realized variance whose payoff is given by the spread between realized and implied variances. Variance swap users trade volatility levels directly and thus they may have an advantage over vanilla option traders for hedging purposes.

In order to derive a fair strike (delivery) price of variance swap, the Heston volatility model (Heston 1993) has been generally used as the classical one. Its volatility process is driven by the Cox–Ingersoll–Ross (CIR) process (Cox, Ingersoll & Ross 1985) which has a mean reversion property. For example, Swishchuk (2004) studied the pricing of volatility and variance swaps using a probabilistic approach. Broadie and Jain (2008) found the closed form prices of variance swap and volatility swap under the Heston model together with the impact of discrete sampling. Zhu and Lian (2011) derived a closed form solution for a discretely sampled variance swap by manipulating an additional variable introduced by Little and Pant (2001). Zheng and Kwok (2014b) used saddlepoint approximation methods for pricing variance derivatives. Also, Zheng and Kwok (2014a) derived the prices of generalized and exotic variance swaps under the Heston model with jumps. Kim and Kim (2020) obtained affine approximations for the fair strike prices of the generalized variance swaps using the projection techniques of Grzelak and Oosterlee (2011). However, the downside of the Heston model is that it often results in an unsuitable value compared with market prices, particularly when it is applied to the derivatives whose time-to-maturity is short. Also, it is difficult to capture the movement of volatility when the fluctuations of the volatility are relatively large.

So, there have been studies showing that multi-factor stochastic volatility models can explain the market more accurately than the Heston model. Gatheral (2008) proposed the double mean reverting model for consistent modeling of SPX and VIX options. Fouque, Papanicolaou, Sircar and Solna (2003, 2011) developed an asymptotic two-scale factor stochastic volatility model for pricing financial derivatives. Christoffersen, Heston and Jacobs (2009) proposed a two-factor Heston model, named the double Heston model, with fast and slow factors which are all mean reverting and showed that the model can explain the dynamics of market more flexibly than the classical Heston model. Using this model, Gauthier and Possamai (2011) improved the European option pricing formula given by Christoffersen et al. Moreover, Zhang and Feng (2019) and Fallah and Mehrdoust (2019) studied the pricing of American options under the double Heston model. However, it costs too much time for calibration under the double Heston model because the number of the model parameters is twice that of the Heston model.

The contribution of this article is as follows. First, we use a rescaling approach for the original double Heston model to reduce the number of the model parameters. Second, we obtain a closed form analytic solution formula for variance swaps explicitly under the rescaled double Heston model. Third, we show that the pricing error between the rescaled model and the original model is minimal while the computation time with the rescaled model is much reduced compared to the original model. So, the rescaling approach proposed in this article is verified as a possible efficient method for computing the prices of derivatives such as variance swaps.

This paper is organized as follows. We first introduce a reduced version of the double Heston stochastic volatility model and formulate a pricing problem for variance swaps in terms of partial differential equations (PDEs) in Section 2. We use the Fourier transform and Green function methods to solve the PDEs and obtain a closed form solution formula for variance swaps in Section 3. In Section 4, we check the validity of the formula using Monte Carlo simulation technique. Also, we investigate the impacts of the parameters of fast and slow volatility factors on the fair strike prices of variance swaps. We estimate the model parameters from VIX term structures provided by Chicago Board of Options Exchange (CBOE) and compare the Heston model, the double Heston model and the rescaled double Heston model and show the merits of the rescaled model in terms of computation time and accuracy. Finally, Sect 5 concludes.

2 Model and Problem Formulation

Given an underlying asset price process \(X_t\), the (original) double Heston model devised by Christoffersen et al. (2009) is given by

$$\begin{aligned} dX_t&=rX_t dt + \sqrt{Y_t}X_t dW^1_t + \sqrt{Z_t}X_t dW^2_t, \\ dY_t&=c_1(\theta _1 - Y_t)dt + \sigma _y \sqrt{Y_t} dW_t^y, \\ dZ_t&=c_2(\theta _2 - Z_t)dt + \sigma _z \sqrt{Z_t} dW_t^z, \\ dW^1_t dW_t^y&=\rho _{xy}dt, \quad dW^2_t dW_t^z=\rho _{xz}dt, \end{aligned}$$
(1)

where \(c_i\)’s are mean-reversion rate, \(\theta _i\)’s are long-run mean variances and \(\sigma _y,\sigma _z\) are vol-of-vol, which are all positive constants. It is assumed that there is a zero correlation between \(Y_t\) and \(Z_t\). The market prices of volatility risk have been suppressed here. One volatility process moves as a fast mean reverting factor while another one is slowly mean reverting. The double Heston model may capture the movement of real market well but the drawback is that it has too many parameters for calibration. To calibrate parameters more efficiently, we propose in this paper the following rescaled version of the double Heston model.

$$\begin{aligned} dX_t&=rX_t dt + \sqrt{Y_t}X_t dW^1_t + \sqrt{Z_t}X_t dW^2_t, \\ dY_t&=\frac{1}{\alpha }(\theta _1 - Y_t)dt + \sqrt{\frac{\theta _1}{\alpha }}\sqrt{Y_t} dW_t^y, \\ dZ_t&=\beta (\theta _2 - Z_t)dt + \sqrt{\theta _2 \beta }\sqrt{Z_t} dW_t^z, \\ dW^1_t dW_t^y&=\rho _{xy}dt, \quad dW^2_t dW_t^z=\rho _{xz}dt, \end{aligned}$$
(2)

where \(\alpha \), \(\beta \), \(\theta _1\) and \(\theta _2\) are all positive. Intuitively, we have in mind that \(\alpha ^{-1}\) plays a role as a fast-scale factor whereas \(\beta \) represents a slow-scale but mathematically, we do not assume that \(\alpha \) and \(\beta \) are small here. The above formulation satisfies the Feller condition, which is a sufficient condition for the volatility to be strictly positive. Note that the number of model parameters is reduced from 8 in (1) to 6 in (2):

$$\begin{aligned} c_1, c_2, \theta _1, \theta _2, \sigma _y, \sigma _z, \rho _{xy}, \rho _{xz} \;\;\; \Longrightarrow \;\;\; \alpha , \beta , \theta _1, \theta _2, \rho _{xy}, \rho _{xz} \end{aligned}$$

As the discretely sampled realized variance of a variance swap over [0, T] is defined by

$$\begin{aligned} \sigma ^2_{\text{RV}} = 100^{2}\times \frac{1}{T}\sum _{j=1}^N v\left( X_{t_{j-1}},X_{t_{j}}\right) , \end{aligned}$$
(3)

where \(\left[ 0,T\right] \) is divided into N periods of \([t_{j-1},t_{j}]\), \(t_N=T\) and \(j=1,\ldots ,N\), and

$$\begin{aligned} v\left( X_{t_{j-1}},X_{t_{j}}\right)&=\left( \log \frac{X_{t_j}}{X_{t_{j-1}}}\right) ^{2}, \end{aligned}$$

the fair strike, \(K_{\text{var}}\), of the variance swap is given by the strike K such that the expected value of the payoff \(\sigma ^2_{\text{RV}}-K\) is equal to 0 at time \(t=t_0\). Therefore, we have

$$\begin{aligned} K_{\text{var}}\left( t_{0},X_{t_{0}},Y_{t_{0}},Z_{t_{0}}\right)&=\mathbb{E}^{0,*}\left[ \sigma _{\text{RV}}^{2}\right] =100^{2}\times \frac{1}{T}\sum _{j=1}^N \mathbb{E}^{0,*}\left[ v\left( X_{t_{j-1}},X_{t_{j}}\right) \right] \\&=100^{2}\times \frac{1}{T}\sum _{j=1}^N \mathbb{E}^{0,*}\left[ \mathbb{E}^{t_{j-1},*}\left[ v\left( X_{t_{j-1}},X_{t_{j}}\right) \right] \right] . \end{aligned}$$
(4)

Here, the tower property of the conditional expectation has been used. Note that the above realized variance is defined in terms of logarithmic return instead of simple return. It has an advantage against the simple return in that the multi-period log return is a sum of the one-period log returns, while the multi-period simple return is a product of the one-period simple returns, leading to computational problems for values close to zero. That is why the log return is preferred in financial industry.

The payoff function in (4) depends on two unknown random variables \(X_{t_{j-1}}\) and \(X_{t_j}\). To deal with this difficult problem, we use the idea proposed by Little and Pant (2001). Namely, considering the time interval \([0,t_j]\), we use a new independent variable \(I_t\) defined by

$$\begin{aligned} I_t=\int _0^t\delta \left( {t_{j-1}-s}\right) X_s ds, \end{aligned}$$

where the \(\delta (\cdot )\) denotes the generalized Dirac delta function. Note that \(I_t=0\) on the interval \([0,t_{j-1})\) and \(I_t=X_{t_{j-1}}\) on \([t_{j-1},t_{j}]\) and so \(I_t\) experiences a jump in value across time \(t_{j-1}\).

If \(U^j(t,x,y,z;I)\) is the solution of the PDE

$$\begin{aligned}&\Big (\frac{\partial }{\partial t}+rx\frac{\partial }{\partial x}+\frac{1}{\alpha }(\theta _1-y)\frac{\partial }{\partial y}+\beta (\theta _2 -z)\frac{\partial }{\partial z}+\frac{1}{2}(y+z)x^2\frac{\partial ^2}{\partial x^2}+\frac{1}{2\alpha }\theta _1 y\frac{\partial ^2}{\partial y^2} \\&\quad +\frac{1}{2}\theta _2 z\frac{\partial ^2}{\partial z^2}+\sqrt{\frac{\theta _1}{\alpha }}\rho _{xy} xy\frac{\partial ^2}{\partial x\partial y}+\sqrt{\theta _2 \beta }\rho _{xz}xz\frac{\partial ^2}{\partial x\partial z} +{\delta (t_{j-1}-t)\frac{\partial }{\partial I}}\Big )U^j \\&=0,\quad 0\le t<t_{j}, \end{aligned}$$
(5)

with terminal condition

$$\begin{aligned} U^{j}(t_j,X_{t_j},Y_{t_j},Z_{t_j};I_{t_j})|=v\left( I_{t_{j}},X_{t_{j}}\right) , \end{aligned}$$

then by the well-known Feynman–Kac theorem (cf. Oksendal 2000), we have

$$\begin{aligned} \mathbb{E}^{t_{j-1},*}\left[ v\left( X_{t_{j-1}},X_{t_{j}}\right) \right] =U^j\left( t_{j-1},X_{t_{j-1}},Y_{t_{j-1}},Z_{t_{j-1}};X_{t_{j-1}}\right) . \end{aligned}$$
(6)

From the property of Dirac delta function, one can get rid of the independent variable \(I_t\) in the PDE (5) except at the time \(t_{j-1}\). However, the terminal condition still depends on \(I_t\) which has a jump at \(t=t_{j-1}\). So, we need a continuity condition at \(t=t_{j-1}\) from the no-arbitrage pricing theory (cf. Wilmott 2013). The condition (a jump condition) is expressed as

$$\begin{aligned} \lim _{t\ \uparrow \ t_{j-1}}^{ }U^j(t,x,y,z;I)=\lim _{t\ \downarrow \ t_{j-1}}^{ }U^j(t,x,y,z;I). \end{aligned}$$

Therefore, we divide the time domain \([0,t_{j}]\) into two subintervals \([0,t_{j-1}]\) and \([t_{j-1},t_{j}]\) as the variable \(I_t\) could be regarded as constant on each interval and solve the PDE system by two steps, the first step in \([t_{j-1},t_{j}]\) and the second step in \([0,t_{j-1}]\). The solution obtained in the first step will provide the terminal condition for the PDE problem of the second step through the jump condition. So, on the time interval \([t_{j-1},t_{j}]\), \(U_j(t,x,y,z;I)\) satisfies

$$\begin{aligned}&{\Big (\frac{\partial }{\partial t}+rx\frac{\partial }{\partial x}+\frac{1}{\alpha }(\theta _1-y)\frac{\partial }{\partial y}+\beta (}{\theta _2 -z)\frac{\partial }{\partial z}+\frac{1}{2}(y+z)x^2\frac{\partial ^2}{\partial x^2}+\frac{1}{2\alpha }\theta _1 y\frac{\partial ^2}{\partial y^2}} \\&\quad {+\frac{1}{2}\theta _2 z\frac{\partial ^2}{\partial z^2}+\sqrt{\frac{\theta _1}{\alpha }}\rho _{xy} xy\frac{\partial ^2}{\partial x\partial y}+}{\sqrt{\theta _2 \beta }\rho _{xz}xz\frac{\partial ^2}{\partial x\partial z}\Big )U^j=0} \end{aligned}$$
(7)

with the terminal condition as stated above. On the time interval \([t_0=0,t_{j-1}]\), we use notation \(W^j(t,x,y,z;I)\) instead of \(U^j(t,x,y,z;I)\) for convenience. It satisfies

$$\begin{aligned}&\Big (\frac{\partial }{\partial t}+rx\frac{\partial }{\partial x}+\frac{1}{\alpha }(\theta _1-y)\frac{\partial }{\partial y}+\beta (\theta _2 -z)\frac{\partial }{\partial z}+\frac{1}{2}(y+z)x^2\frac{\partial ^2}{\partial x^2}+\frac{1}{2\alpha }\theta _1 y\frac{\partial ^2}{\partial y^2} \\&\quad +\frac{1}{2}\theta _2 z\frac{\partial ^2}{\partial z^2}+\sqrt{\frac{\theta _1}{\alpha }}\rho _{xy} xy\frac{\partial ^2}{\partial x\partial y}+\sqrt{\theta _2 \beta }\rho _{xz}xz\frac{\partial ^2}{\partial x\partial z} \Big )W^j=0 \end{aligned}$$
(8)

with terminal condition \(W^{j}(t,x,y,z)|_{t=t_{j-1}}=U^{j}(t_{j-1},x,y,z;x)\).

From the above two step process of solving the PDE problems, we have

$$\begin{aligned} \mathbb{E}^{0,*}\left[ U^{j}\left( t_{j-1},X_{t_{j-1}},Y_{t_{j-1}},Z_{t_{j-1}};X_{t_{j-1}}\right) \right] =W^j(t_0,X_{t_0},Y_{t_0},Z_{t_0}) \end{aligned}$$
(9)

and thus putting (4), (6) and (9) together leads to

$$\begin{aligned} K_{\text{var}}(t_0,X_{t_0},Y_{t_0},Z_{t_0})=100^2\times \frac{1}{T}\sum _{j=1}^{N}W^j(t_0,X_{t_0},Y_{t_0},Z_{t_0}). \end{aligned}$$
(10)

3 Pricing Variance Swaps

In this section, we analytically solve the PDE system (7) for the inner expectation \(U^j\) and the PDE system (8) for the outer expectation \(W^j\) using the Fourier transform and Green function methods and obtain a closed form formula for the fair strike value of the variance swap. In fact, it will be expressed as elementary functions without any involvement of integral.

Proposition 1

Under the dynamics (2) of the underlying asset price, the solution of (7), \(j=1,2,\ldots ,N\), is given by

$$\begin{aligned} U^j(\tau ,x,y,z;I)&=B_{11}^{2}y^2+B_{21}^{2}z^2+2B_{11}B_{21}yz \\&\quad +\left( 2B_{11}\log \frac{x}{I}+2A_{1}B_{11}+2B_{11}r\tau -B_{12}\right) y \\&\quad +\left( 2B_{21}\log \frac{x}{I}+2A_{1}B_{21}+2B_{21}r\tau -B_{22}\right) z, \\&\quad -A_{2}+(r\tau +A_1+\log \frac{x}{I})^2, \end{aligned}$$
(11)

where \(\tau =t_{j}-t\) and \(q=r\tau +\log x\) and \(A_{1}\), \(A_{2}\), \(B_{11}\), \(B_{12}\), \(B_{21}\) and \(B_{22}\) are given by

$$\begin{aligned} A_{1}(\tau )&=\alpha _{1}\left( \tau ;\alpha ^{-1},\theta _1,\sqrt{\alpha ^{-1}\theta _1}\right) +\alpha _{1}\left( \tau ;\beta ,\theta _2,\sqrt{\beta \theta _2}\right) , \\ A_{2}(\tau )&=\alpha _{2}\left( \tau ;\alpha ^{-1},\theta _1,\sqrt{\alpha ^{-1}\theta _1},\rho _{xy}\right) +\alpha _{2}\left( \tau ;\beta ,\theta _2,\sqrt{\beta \theta _2},\rho _{xz}\right) , \\ B_{11}(\tau )&=\beta _{1}\left( \tau ;\alpha ^{-1},\theta _1,\sqrt{\alpha ^{-1}\theta _1}\right) , \quad B_{12}(\tau )=\beta _{2}\left( \tau ;\alpha ^{-1},\theta _1,\sqrt{\alpha ^{-1}\theta _1},\rho _{xy}\right) , \\ B_{21}(\tau )&=\beta _{1}\left( \tau ;\beta ,\theta _2,\sqrt{\beta \theta _2}\right) , \quad B_{22}(\tau )=\beta _{2}\left( \tau ;\beta ,\theta _2,\sqrt{\beta \theta _2},\rho _{xz}\right) , \end{aligned}$$

respectively. Here, the functions \(\alpha _{1}\), \(\alpha _{2}\), \(\beta _{1}\) and \(\beta _{2}\) are explicitly given by

$$\begin{aligned}&\alpha _{1}(\tau ;c,\theta ,\sigma )=\frac{\theta }{2}\left( \frac{e^{\tau c}-1}{c e^{\tau c}}-\tau \right) , \quad \beta _{1}(\tau ;c,\theta ,\sigma )=\frac{1-e^{\tau c}}{2c e^{\tau c}}, \\&\alpha _{2}(\tau ;c,\theta ,\sigma ,\rho )=-\frac{\theta }{2}\Bigg [e^{-2\tau c}\left( \frac{\sigma ^2}{4c^2}\right) +e^{-\tau c}\left( \frac{2}{c}+\frac{\sigma ^2}{c^3}-\frac{4\rho \sigma }{c^2}\right) -\frac{2}{c} \\&\quad -\frac{5\sigma ^2}{4c^3}+\frac{4\rho \sigma }{c^2}+\tau \left( 2+\frac{\sigma ^2}{2c^2}-\frac{2\rho \sigma }{c}+2e^{-\tau c}\left( \frac{\sigma ^2}{2c^2}-\frac{\rho \sigma }{c}\right) \right) \Bigg ], \\&\beta _{2}(\tau ;c,\theta ,\sigma ,\rho )=\frac{\sigma ^2}{4c^3}e^{-2\tau c}+\frac{1}{c}\left( 1-\frac{\rho \sigma }{c}\right) e^{-\tau c}-\frac{1}{c}-\frac{\sigma ^2}{4c^3}+\frac{1}{c^2}\rho \sigma \\&\quad +\frac{\tau }{c}\left( \frac{\sigma ^2}{2c}-\rho \sigma \right) e^{-\tau c}, \end{aligned}$$

respectively.

Proof

Using the Fourier transform \(\hat{\phi }\) defined by

$$\begin{aligned} \hat{\phi }(\tau ,w,y,z;I)=\int _\mathbb{R}e^{-iwq}\phi (\tau ,q,y,z;I)dq, \end{aligned}$$

Equation (7) is transformed into

$$\begin{aligned} \hat{\mathcal{L}}\hat{U}^j(\tau ,w,y,z;I)&=0, \qquad 0<\tau \le t_{j}-t_{j-1}, \\ \hat{U}^j(0,w,y,z;I)&={\hat{v}(w;I)}, \end{aligned}$$

where the operator \(\hat{\mathcal{L}}\) and the function \(\hat{v}\) are given by

$$\begin{aligned} \hat{\mathcal{L}}&=\Big (-\frac{\partial }{\partial \tau }+\frac{1}{\alpha }(\theta _1-y)\frac{\partial }{\partial y}+\beta (\theta _2-z)\frac{\partial }{\partial z}+\frac{1}{2}(y+z)(-iw-w^2) \\&\quad +\rho _{xy}\frac{1}{\sqrt{\alpha }}yiw\frac{\partial }{\partial y} +\rho _{xz}\sqrt{\beta }ziw\frac{\partial }{\partial z}+\frac{1}{2\alpha }y\frac{\partial ^2}{\partial y^2}+\frac{1}{2}\beta z \frac{\partial ^2}{\partial z^2}\Big ), \\ {\hat{v}(w;I)}&=\int _\mathbb{R}e^{-iwq}{v(e^q;I)}dq, \end{aligned}$$

respectively. Here, \(v(e^q;I)\) denotes the payoff function.

Let \(\hat{G}(\tau ,w,y,z;I)\) be the solution of the PDE problem

$$\begin{aligned} \hat{\mathcal{L}}\hat{G}(\tau ,w,y,z;I)&=0, \qquad 0<\tau \le t_{j}-t_{j-1}, \\ \hat{G}(0,w,y,z;I)&={\delta }_0(q). \end{aligned}$$
(12)

Then it is the Fourier transform of the Green function of Eq. (7). Since the coefficients of \(\hat{\mathcal{L}}\) are linear in y and z, following Heston’s procedure in Heston (1993) for finding a (affine) solution, we suppose that

$$\begin{aligned} \hat{G}(\tau ,w,y,z;I)&=e^{A(\tau ,w)+yB_1(\tau ,w)+zB_2(\tau ,w)}. \end{aligned}$$
(13)

Putting (13) into (12), we can find the following ordinary differential equation (ODE) problems for A, \(B_1\) and \(B_2\).

$$\begin{aligned} \frac{\partial A(\tau ,w)}{\partial \tau }&=\frac{1}{\alpha }\theta _1 B_1+\beta \theta _2{B_2}, \\ A(0,w)&=0, \\ \frac{\partial B_1(\tau ,w)}{\partial \tau }&=-\frac{1}{2}(iw+w^2)-\left( \frac{1}{\alpha }-iw\rho _{xy}\frac{1}{\sqrt{\alpha }}\right) B_1+\frac{1}{2\alpha } B_1^2, \\ B_1(0,w)&=0, \\ \frac{\partial B_2(\tau ,w)}{\partial \tau }&=-\frac{1}{2}(iw+w^2)-\left( \beta -iw\rho _{xz}\sqrt{\beta }\right) B_2+\frac{1}{2}\beta B_2^2, \\ B_2(0,w)&=0. \end{aligned}$$

Solutions of these Riccati type nonlinear differential equations can be explicitly obtained as

$$\begin{aligned} A(\tau ,w)&=\frac{\theta _1}{\alpha }\left( (1-\sqrt{\alpha }\rho _{xy}iw+a_1(w))\tau -2\log \left( \frac{1-b_1(w)e^{\tau a_1(w)}}{1-b_1(w)}\right) \right) \\&\quad +{\theta _2}\left( (\beta -\sqrt{\beta }\rho _{xz}iw+a_2(w))\tau -2\log \left( \frac{1-b_2(w)e^{\tau a_2(w)}}{1-b_2(w)}\right) \right) , \\ B_1(\tau ,w)&=(\alpha -\sqrt{\alpha }\rho _{xy}iw+\alpha ^2 a_1(w))\cdot \frac{1-e^{\tau a_1(w)}}{1-b_1(w)e^{\tau a_1(w)}}, \\ B_2(\tau ,w)&=\frac{\beta -\sqrt{\beta }\rho _{xz}iw+a_2(w)}{\beta ^2}\cdot \frac{1-e^{\tau a_2(w)}}{1-b_2(w)e^{\tau a_2(w)}}, \\ a_1(w)&:=\alpha ^{-1}\sqrt{(1-\sqrt{\alpha }\rho _{xy}iw)^2+(w^2+iw)}, \quad b_1(w):=\frac{1-\sqrt{\alpha }\rho _{xy}iw+a_1(w)}{1-\sqrt{\alpha }\rho _{xy}iw-a_1(w)}, \\ a_2(w)&:=\sqrt{(\beta -\sqrt{\beta }\rho _{xz}iw)^2+ \beta ^2(w^2+iw)}, \quad b_2(w):=\frac{\beta -\sqrt{\beta }\rho _{xz}iw+a_2(w)}{\beta -\sqrt{\beta }\rho _{xz}iw-a_2(w)}. \end{aligned}$$
(14)

Note that the solutions are expressed as a combination of the single-factor Heston type solutions.

By the convolution theorem, which states that the Fourier transform of a convolution of two functions is the product of their Fourier transforms, one can deduce

$$\begin{aligned} U^j(\tau ,q,y,z;I)&=\frac{1}{2\pi }\int _{\mathbb{R}}e^{iwq}{\hat{v}(w;I)}\hat{G}(\tau ,w,y,z;I)dw. \end{aligned}$$
(15)

Here, the payoff v(xI) is given by

$$\begin{aligned} {v(e^q;I)}=\left( \log \frac{x}{I}\right) ^2 \end{aligned}$$

and thus the Fourier transform of it can be found to be

$$\begin{aligned} {\hat{v}(w;I)}=2\pi \left( -{\delta }_{0}''(w)-2i\left( \log I\right) {\delta }_{0}'(w)+\left( \log I \right) ^{2}{\delta }_{0}(w) \right) , \end{aligned}$$

where \({\delta }_{0}'\) and \({\delta }_{0}''\) are the first and second derivatives of the generalized Dirac delta function, respectively, defined in the distributional sense and satisfy the property

$$\begin{aligned} {\delta }_{0}''(w)=\frac{-1}{2\pi }\int _{\mathbb{R}}q^{2}e^{-iwq}dq, \quad {\delta }_{0}'(w)=\frac{-i}{2\pi }\int _{\mathbb{R}}q e^{-iwq}dq. \end{aligned}$$

Then from (13) and (15) we obtain

$$\begin{aligned} U^j(\tau ,q,y,z;I)&=\frac{1}{2\pi }\int _{\mathbb{R}}e^{iwq}\cdot {\hat{v}(w;I)}\hat{G}(\tau ,w,y,z;I)dw \\&=\frac{1}{2\pi }\int _{\mathbb{R}}e^{iwq}\cdot {\hat{v}(w;I)}e^{A(\tau ,w)+yB_1(\tau ,w)+zB_2(\tau ,w)}dw \\&=\frac{\partial ^{2} k_1}{\partial w^{2}}\Bigg |_{w=0}\cdot (-1)+\frac{\partial k_1}{\partial w}\Bigg |_{w=0}\cdot \left( 2i \log I \right) +k_1(\tau ,0,q) \cdot \left( \log I \right) ^{2}, \end{aligned}$$
(16)

where \(k_1(\tau ,w;q)\) is given by

$$\begin{aligned} k_1(\tau ,w;q)=\exp (iwq+A(\tau ,w)+yB_1(\tau ,w)+zB_2(\tau ,w)) \end{aligned}$$

in terms of A, \(B_1\) and \(B_2\) in (14).

Now, if we define \(A_1\), \(A_2\), \(B_{k1}\) and \(B_{k2}\), \(k=1,2\), as

$$\begin{aligned} A_1(\tau )= & {} \frac{1}{i}\cdot \frac{\partial A}{\partial w}\Bigg |_{w=0}, \quad A_2(\tau )=\frac{\partial ^{2} A}{\partial w^2}\Bigg |_{w=0}, \\ B_{k1}(\tau )= & {} \frac{1}{i}\cdot \frac{\partial B_k}{\partial w}\Bigg |_{w=0}, \quad B_{k2}(\tau )=\frac{\partial ^{2} B_k}{\partial w^2}\Bigg |_{w=0}, \end{aligned}$$

respectively, then one can find that \(\frac{\partial A}{\partial w}|_{w=0}\) and \(\frac{\partial B}{\partial w}|_{w=0}\) are pure imaginary numbers while \(\frac{\partial ^{2} A}{\partial w^2}|_{w=0}\) and \(\frac{\partial ^{2} B}{\partial w^2}|_{w=0}\) are real numbers and (16) leads to

$$\begin{aligned} U^j(\tau ,x,y,z;I)&=(-1)\cdot \left( A_2+yB_{12}+zB_{22}+(qi+iA_1+iyB_{11}+izB_{21})^2\right) \\&\quad +(2i\log I)\cdot \left( iq+iA_1+iyB_{11}+izB_{21}\right) +(\log I)^2, \end{aligned}$$

where \(A_1\), \(A_2\), \(B_{k1}\) and \(B_{k2}\), \(k=1,2\) are given as in the statement of Proposition 1. Further direct calculation of this yields the desired form in Proposition 1. \(\hfill\square \)

Next, we solve the PDE system (8) for the outer expectation \(W^j\).

Proposition 2

Under the dynamics (2) of the underlying asset price, the solution of (8), \(j=1,2,\ldots ,N\), is given by

$$\begin{aligned} W^j(t,y,z)&=B_{11}^2(\varDelta t)f_2(t,y)+B_{21}^2(\varDelta t)g_2(t,z) \\&\quad +2B_{11}(\varDelta t)B_{21}(\varDelta t)f_1(t,y)g_1(t,z) \\&\quad +\left\{ 2A_{1}(\varDelta t)B_{11}(\varDelta t)+2B_{11}(\varDelta t)r\varDelta t-B_{12}(\varDelta t)\right\} f_1(t,y) \\&\quad +\left\{ 2A_{1}(\varDelta t)B_{21}(\varDelta t)+2B_{21}(\varDelta t)r\varDelta t-B_{22}(\varDelta t)\right\} g_1(t,z) \\&\quad -A_{2}(\varDelta t)+(r\varDelta t+A_1(\varDelta t))^2, \end{aligned}$$
(17)

where \(\varDelta t =t_{j}-t_{j-1}\) and \(A_{1}\), \(A_{2}\), \(B_{11}\), \(B_{12}\), \(B_{21}\) and \(B_{22}\) are given as in Proposition 1and \(f_1\), \(f_2\), \(g_1\) and \(g_2\) are given by

$$\begin{aligned} f_1(t,y)&=ye^{-(t_{j-1}-t)/\alpha }+\theta _1(1-e^{-(t_{j-1}-t)/\alpha }), \\ f_2(t,y)&=(f_1(t,y))^2+y\theta _1(e^{-(t_{j-1}-t)/\alpha }-e^{-2(t_{j-1}-t)/\alpha }) +\theta _1^2(1-e^{-(t_{j-1}-t)/\alpha })^2, \\ g_1(t,z)&=ze^{-\beta (t_{j-1}-t)}+\theta _2(1-e^{-\beta (t_{j-1}-t)}), \\ g_2(t,z)&=(g_1(t,z))^2+z\theta _2(e^{-\beta (t_{j-1}-t)}-e^{-2\beta (t_{j-1}-t)}) +\theta _2^2(1-e^{-\beta (t_{j-1}-t)})^2, \end{aligned}$$

respectively.

Proof

From the jump condition (2) and Proposition 1 with \(t=t_{j-1}\) (so, \(\tau =t_j-t=\varDelta t\)), the terminal condition of \(W^j\) is

$$\begin{aligned} W^j(t_{j-1},x,y,z)&=U^j(t_{j-1},x,y,z;x) \\&=B_{11}^2(\varDelta t)y^2+B_{21}^2(\varDelta t)z^2+2B_{11}(\varDelta t)B_{21}(\varDelta t) yz \\&\quad +\left\{ 2A_{1}(\varDelta t)B_{11}(\varDelta t)+2B_{11}(\varDelta t)r\varDelta t-B_{12}(\varDelta t)\right\} y \\&\quad +\left\{ 2A_{1}(\varDelta t)B_{21}(\varDelta t)+2B_{21}(\varDelta t)r\varDelta t-B_{22}(\varDelta t)\right\} z \\&\quad -A_{2}(\varDelta t)+(r\varDelta t+A_1(\varDelta t))^2, \end{aligned}$$
(18)

which is a polynomial in y and z and independent of x. By the Feynman–Kac theorem, \(W^j\) is also independent of x on the interval \([0,t_{j-1})\) and (8) becomes

$$\begin{aligned} \Big (\frac{\partial }{\partial t}+\frac{1}{\alpha }(\theta _1-y)\frac{\partial }{\partial y}+\beta (\theta _2 -z)\frac{\partial }{\partial z}+\frac{1}{2\alpha }\theta _1 y\frac{\partial ^2}{\partial y^2} +\frac{1}{2}\theta _2 z\frac{\partial ^2}{\partial z^2}\Big )W^j=0. \end{aligned}$$

Then by the Feynman–Kac theorem, \(W^j\) becomes

$$\begin{aligned} W^j(t,x,y,z)= E^{0,*}[W^j(t_{j-1},x,y,z)|Y_t=y,Z_t=z]. \end{aligned}$$
(19)

Since \(Y_t\) and \(Z_t\) are both CIR processes, we have

$$\begin{aligned} E^{0,*}[Y_{t_{j-1}}|Y_t=y]&=ye^{-(t_{j-1}-t)/\alpha }+\theta _1(1-e^{-(t_{j-1}-t)/\alpha }):=f_1(t,y), \\ E^{0,*}[(Y_{t_{j-1}})^2|Y_t=y]&=(f_1(y))^2+y\theta _1(e^{-(t_{j-1}-t)/\alpha }-e^{-2(t_{j-1}-t)/\alpha }) \\&\quad +\theta _1^2(1-e^{-(t_{j-1}-t)/\alpha })^2:=f_2(t,y), \\ E^{0,*}[Z_{t_{j-1}}|Z_t=z]&=ze^{-\beta (t_{j-1}-t)}+\theta _2(1-e^{-\beta (t_{j-1}-t)}):=g_1(t,z), \\ E^{0,*}[(Z_{t_{j-1}})^2|Z_t=z]&=(g_1(z))^2+z\theta _2(e^{-\beta (t_{j-1}-t)}-e^{-2\beta (t_{j-1}-t)}) \\&\quad +\theta _2^2(1-e^{-\beta (t_{j-1}-t)})^2:=g_2(t,z). \end{aligned}$$
(20)

Substituting (18) into (19) and exploiting (20), the desired result in Proposition 2 follows by linear property of conditional expectation and the independence of two processes \(Y_t\) and \(Z_t\). \(\hfill\square \)

Using the result in Proposition 2, we finally obtain the fair strike price of a variance swap as follows.

Theorem 1

Under the rescaled double Heston model (2), the fair strike of a variance swap at \(t=0\) defined in terms of the realized variance (3) is given by

$$\begin{aligned} K_{\text{var}}(y,z)=100^2\times \frac{1}{T}\sum _{j=1}^{N}W^j(0,y,z), \end{aligned}$$
(21)

where \(W^j(t,y,z)\) is calculated as in Proposition 2.

Proof

From the result (10) and Proposition 2, we obtain the theorem. \(\hfill\square \)

We note that the fair strike price of the variance swap does not depend on the spot price x. It depends only on the variance y of the underlying asset return. That is because the variance swap considered in this work is a vanilla variance swap. However, if it is an exotic variance swap such as gamma swap which has a different payoff structure, then it would depend on the spot price as studied by Kim and Kim (2020).

4 Numerical Results

In this section, we first check the validity of our theoretical formula in Theorem 1 using Monte Carlo (MC) simulation and then compare the rescaled double Heston model (2) (in brief, the rDH model) to the Heston model and the original double Heston model (1) (in brief, the DH model) based on the result of calibration to VIX market data. The theoretical formulas for the fair strike price under the Heston and DH models are given together with calibration results in “Appendix”.

4.1 MC Simulation

The analytic formula in Theorem is already given in a closed form but we confirm the validity of it by comparing it with a MC simulation result. For MC simulation, we use the Milstein method Mil’shtejn (1975) to reduce discretization errors since path dependent derivatives like variance swap may be vulnerable to errors created by discretizing the Brownian motions. Sample paths generated by the Milstein method under the rDH model are given by

$$\begin{aligned} X_{t_{j+1}}&=X_{t_{j}}+rX_{t_{j}}\varDelta t + \sqrt{Y_{t_{j}}}X_{t_{j}}\varDelta W_1(t_{j})+\sqrt{Z_{t_{j}}}X_{t_{j}}\varDelta W_2(t_{j}) \\&\quad +\frac{1}{2}X_{t_{j}}Y_{t_{j}}\left( (\varDelta W_1(t_{j}))^2-\varDelta t\right) +\frac{1}{2}X_{t_{j}}Z_{t_{j}}\left( (\varDelta W_2(t_{j}))^2-\varDelta t\right) ,\\ Y_{t_{j+1}}&=Y_{t_{j}}+\frac{1}{\alpha }(\theta _1-Y_{t_{j}})\varDelta t + \sqrt{\frac{\theta _1}{\alpha }}\sqrt{Y_{t_{j}}}\left( \rho _{xy}\varDelta W_1(t_{j})+\sqrt{1-\rho _{xy}^2}\varDelta W_3(t_{j})\right) \\&\quad +\frac{1}{4}\sqrt{\frac{\theta _1}{\alpha }}\left( (\rho _{xy}\varDelta W_1(t_{j})+\sqrt{1-\rho _{xy}^2}\varDelta W_3(t_{j}))^2-\varDelta t\right) ,\\ Z_{t_{j+1}}&=Z_{t_{j}}+\beta (\theta _2-Z_{t_{j}})\varDelta t + \sqrt{\theta _2\beta }\sqrt{Z_{t_{j}}}\left( \rho _{xz}\varDelta W_2(t_{j})+\sqrt{1-\rho _{xz}^2}\varDelta W_4(t_{j})\right) \\&\quad +\frac{1}{4}\sqrt{\theta _2\beta }\left( (\rho _{xz}\varDelta W_2(t_{j})+\sqrt{1-\rho _{xz}^2}\varDelta W_4(t_{j}))^2-\varDelta t\right) , \end{aligned}$$

where \(\varDelta t=T/N\), \(t_{j}=j\varDelta t\), \(\varDelta W_{i}(t_j)=W_{i}(t_{j})-W_{i}(t_{j-1})\), \(j=1,2,\ldots ,N\), \(i=1,2,3,4\) and the Brownian motions \(W_{i}\) are mutually independent. 2,000,000 number of sample paths are utilized for pricing the variance swap via MC simulation. The parameters are given by \(r=0.02\), \(\alpha =0.4\), \(\beta =0.1\), \(\theta _1=0.03\), \(\theta _2=0.5\), \(\rho _{xy}=-0.8\), \(\rho _{xz}=-0.7\), \(X_0=100\), \(Y_0=0.02\) and \(Z_0=0.04\).

The MC simulation result is given in Table 1. One can see that the formula in Theorem 1 is guaranteed to be derived correctly and the relative error is more reduced when sampling frequency is increased.

Table 1 Fair strike prices \(K_{\text{var}}\) obtained by MC simulation and the analytic formula in Theorem 1 and their relative errors

4.2 Sensitivity Analysis

In this section, we investigate the sensitivity of the fair strike value to the parameters of the fast and slow variance factors.

Fig. 1
figure 1

Impact of the long-run mean levels \(\theta _1\) (the fast variance factor) and \(\theta _2\) (the slow variance factor)

Figure 1a and b represent the impact of the long-run mean levels \(\theta _1\) and \(\theta _2\) of the fast and slow variances on the fair strike price under the rDH model, respectively. We set \(\theta _2=0.2\) in Fig. 1a and \(\theta _1=0.2\) in Fig. 1b. The other parameter values in Fig. 1 are fixed as \(r=0.02\), \(\delta =0.3\), \(\epsilon =0.1\), \(\rho _1=-0.8\), \(\rho _2=-0.9\), \(y(0)=0.15\) and \(z(0)=0.12\), where y(0) and z(0) are the initial values of the fast and slow variances, respectively. The figures indicate that the fair strike value increases as \(\theta _1\) or \(\theta _2\) increases as it should be. In Fig. 1a, the fair strike price decreases as tenor becomes shorter when \(\theta _1\) is bigger than or equals to \(y(0)=0.15\) while it decreases and then increases as tenor gets shorter when \(\theta _1\) is smaller than y(0). Meanwhile, in Fig. 1b, the fair strike price decreases or increases and then decreases as tenor becomes shorter depending on whether \(\theta _2 > z(0)\) or not, respectively. Comparing Fig. 1a with b, it can be viewed that the fair strike of the variance swap is more affected by the fast variance factor than the slow variance factor and the slow factor effect increases gradually as time-to-maturity gets longer.

Figure 2a and b show the influence of the mean reversion rates \(1/\alpha \) and \(\beta \) on the fair strike price under the rDH model. We set the parameter values as \(\theta _1=0.1\), \(\theta _2=0.2\), \(y(0)=0.15\) and \(z(0)=0.12\) in Fig. 2. In this case, we note that \(\theta _1 - y(0) < 0\) and \(\theta _2 - z(0) > 0\) hold. One can notice from Fig. 2a that the lower \(\alpha \) (higher mean reversion rate) is, the lower the fair strike price is. It could be explained that the higher mean reversion rate makes the fast variance process \(Y_t\) move to the lower mean level \(\theta _1\) than the initial level y(0) rapidly. On the contrary, as seen in Fig. 2b, the higher mean reversion rate (higher \(\beta \)) results in the higher strike price because the slow variance process \(Z_t\) moves to the higher mean level \(\theta _2\) than the initial level z(0) fast.

Fig. 2
figure 2

Impact of the mean-reversion rates \(1/\alpha \) (the fast variance factor) and \(\beta \) (the slow variance factor)

4.3 Calibration

There is no market data of variance swap available as it is traded in the over-the-counter market. However, the CBOE VIX term structure could be an alternative to calibrate the parameters. The definition of VIX is given by

$$\begin{aligned} \text{VIX}_t(\tau )=100\times \sqrt{\frac{2}{\tau }\mathbb{E}^t\left[ \int _{t}^{t+\tau }\frac{dX_s}{X_s}-d(\log (X_s))\right] }. \end{aligned}$$

Under the rDH model, it is reduced by the Ito formula to

$$\begin{aligned} \left( \text{VIX}_t (\tau )\right) ^2=100^2\times \frac{1}{\tau }\int _{t}^{t+\tau }\mathbb{E}^t[Y_s+Z_s] ds. \end{aligned}$$

Thus we could use the VIX term structure data as an instrument for calibration, regarding the VIX as a square root of variance. One can find a detailed study of VIX term structure, for example, in Luo and Zhang (2012).

In this paper, we quote weekly VIX term structure data from 2015 to 2019 and from 03/01/2020 to 30/10/2020 in the CBOE website (https://www.cboe.com), making a distinction between before and after the COVID-19 pandemic. For calibration purpose, the data from 2015 to 2019 is divided into five periods of 1 year. We perform a nonlinear regression analysis to estimate the parameters by taking the following steps. First, we estimate y(0) and z(0) for each day with other parameters fixed (Step1). Second, we estimate \(\alpha \), \(\beta \), \(\theta _1\), \(\theta _2\), \(\rho _{xy}\) and \(\rho _{xz}\) with the fixed y(0) and z(0) obtained in Step 1 (Step2). We repeat Step 1 and Step 2 with updated parameters until the mean-square-error (MSE) is minimized, i.e., the procedure stops when

$$\begin{aligned} |\text{MSE}_k-\text{MSE}_{k+1}|<1\times 10^{-5} \end{aligned}$$

and

$$\begin{aligned} \text{MSE}_{k+1}/\text{MSE}_{k}<99.9\%, \end{aligned}$$

where \(\text{MSE}_{k}\) is defined by

$$\begin{aligned} \text{MSE}_{k}=\frac{1}{M}\sum _{i,j=1}^{M}\left( \sqrt{K(t_i)(T_j-t_i)}-\text{VIX}(t_i,T_j)\right) ^2. \end{aligned}$$

Here, K(t) is the theoretical price of the variance swap, \(\text{VIX}(t_i,T_j)\) is the market data of VIX term structure at \(t_i\) with time-to-maturity \(T_j\), M is the total number of data, k is the number of times that the calibration steps are repeated. The calibrations for the three models are implemented based upon MATLAB 2018c and executed on Intel(R) Core(TM) i5-6300HQ.

Table 2 shows the estimated parameters of the rDH model for the periods of 2015, 2016, 2017, 2018, 2019 and 2020. The estimated parameters of the Heston and DH models are quoted in Table 4 in “Appendix”. To show more credibility of the excellence of the rDH model, we also perform a calibration with a bigger sample size of VIX term structure covering the whole period from 01/01/2015 to 30/10/2020. As one can see from Table 2, it is reasonable to say that \(1/\alpha \) could work as a fast mean reversion rate while \(\beta \) as a slow mean reversion rate.

Table 2 Parameters of the rDH model calibrated to VIX term structures

The MSEs and the elapsed time of calibration are quoted in Table 3. In this table, for example, ‘3 m 20 s’ means 3 min and 20 s. One can notice that the MSEs of calibration of the rDH and DH models are much smaller than that of the Heston model. Also, as shown in Fig. 3, in which ‘Difference’ refers to the absolute error (%) between the computed square root of the variance swap value and the VIX data, the rDH model captures the market volatility much better than the Heston model. It is interesting to see in Table 3 that there is not much difference of MSE between the rDH model and the DH model but the calibration time cost of the rDH model is much smaller than that of the DH model. It suggests that the role of vol-of-vol is very limited for model calibration or pricing of variance swaps under the double Heston framework. One noticeable fact is that the MSEs have far increased in the pandemic situation of COVID-19 around the world in 2020 regardless of the model of choice. Even the DH model has a difficulty of catching the market movement if the market is in extremely volatile situation.

Fig. 3
figure 3

Daily errors between the square root of the variance swap value and the VIX data for the Heston and rDH models

Table 3 Calibration MSEs (and time cost) of the Heston, rDH and DH models to VIX term structures

Figure 4 presents the fitting results of the Heston, rDH and DH models to the VIX term structures in 2019 and 2020. As illustrated in Fig. 4a and b, in 2019 when the market was relatively stable, the VIX term structure was usually concave and upward and both the rDH and DH models captured the market quite well while the Heston model still followed the market trend but there was a significant gap. However, after the World Health Organization (WHO) declared COVID-19 pandemic on March 11, the situation has been changed drastically. As demonstrated in Fig. 4c–d, the VIX market itself was so irregular and unstable that even the multi-factor stochastic volatility models such as the rDH and DH models could not capture the market behavior.

Fig. 4
figure 4

Fit of the Heston, rDH and DH models to VIX term structures in 2019 and 2020

In order to investigate how much the slow variance factor has effects on the variance swap value with the calibrated parameters in Table 2, we first decompose the fair strike price \(K_{var}\) given by (21) into three terms as follows.

$$\begin{aligned} K_{var} = K_y + K_z + R_{yz}, \end{aligned}$$

where \(K_y\) is the term that depends only on the fast variance factor and \(K_z\) is the term that depends only on the slow variance factor, and \(R_{yz}\) is the remainder term that depends on both the fast and slow factors. Figure 5 illustrates the graphs of \(K_{var}\), \(K_y\), \(K_z\) and \(R_{yz}\) under the rDH model with the parameters in Table 2 with \(y(0)=0.0083\) and \(z(0)=0.062\) for the year 2020, where \(\theta _2\) is particularly very small as it is given by \(1.08\times 10^{-8}\). The figure shows that the fast and slow terms are both influential but the remainder term is negligible (average at \(1.95\times 10^{-2}\)). Even if the long-run mean level of the slowly mean reverting variance is very small, the slow term still can have a sufficient impact on the price formation of variance swaps.

Fig. 5
figure 5

Impact of the fast and slowly mean reverting variance factors on the variance swap price at 16/10/2020

5 Conclusion

This paper has proposed a computationally more efficient rescaled version of the double Heston model and applied to the pricing problem of variance swaps. We obtain a closed form analytic formula of the fair strike price and investigate the impacts of the fast and slowly mean reverting variance factors on the price. The pricing formula contains neither integral nor terms required to be numerically calculated, making pricing or calibration much easier. We exploit VIX term structure data for calibration and divide the time period into before and after the COVID-19 pandemic started. We find that volatility after the start of the pandemic in 2020 is so irregular that any model cannot cope with the market VIX term structure. However, we verify that the proposed rescaled double Heston model and the original double Heston model are equally superior to the Heston model in terms of fitting to the market data in a stable market condition, while the proposed model is much more efficient than the original double Heston model in view of calibration time cost. Generally speaking, our study in this article supports why the market practitioners had better view the volatility factors from the two different time scale point of view (fast and slow factors).