Quadratic Variance Swap Models

We introduce a novel class of term structure models for variance swaps. The multivariate state process is characterized by a quadratic diffusion function. The variance swap curve is quadratic in the state variable and available in closed form, greatly facilitating empirical analysis. Various goodness-of-fit tests show that quadratic models fit variance swaps on the S&P 500 remarkably well, and outperform affine models. We solve a dynamic optimal portfolio problem in variance swaps, index option, stock index and bond. An empirical analysis uncovers robust features of the optimal investment strategy.
JEL Classification: C51, G13


Introduction
A variance swap pays the difference between the realized variance of some underlying asset and the fixed variance swap rate. Variance swaps are actively traded at different maturities. This induces a term structure of variance swap rates, which reflects market expectations about future variance and provides important information for managing variance risk. Fig. 1 shows variance swap rates on the Standard and Poor's 500 Index (S&P 500). The term structure takes a variety of shapes and exhibits rich dynamics. During low volatility periods, such as 2005-2006, the term structure is upward sloping. During financial crises, such as fall 2008, the short end spikes up, and the term structure becomes downward sloping. Having a model that captures such term structure movements appears to be crucial to consistently price variance swaps across different maturities or to optimally invest in such contracts. Surprisingly, the term structure of variance swap rates has received little attention in the literature.
We provide a novel class of flexible and tractable variance swap term structure models. The multivariate state variable driving the stochastic variance follows a quadratic diffusion process characterized by linear drift and quadratic diffusion functions. Variance swap rates are quadratic in the state variable. The variance swap curve is available in closed form in terms of a linear ordinary differential equation, which greatly facilitates empirical applications. Higher order polynomial specifications are possible.
We perform an exhaustive specification analysis of the univariate quadratic model and of a parsimonious bivariate extension. Model identification is provided in terms of canonical representations. We also study univariate polynomial specifications of higher order. We fit these models to the daily quadratic variation from tick-by-tick S&P 500 futures data and the term structure of variance swap rates on the S&P 500 shown in Fig. 1. Several statistical tests show that the bivariate quadratic model captures the term structure dynamics remarkably well. The quadratic state process is able to generate sudden large movements in the variance swap rates, and the quadratic variance swap model can produce a rich variety of term structure shapes, as observed empirically. Nested affine and other specifications are soundly rejected. Our quadratic model also outperforms a standard two-factor affine jump-diffusion model, which is typically used in the literature. We reach this conclusion using various likelihood-based tests (e.g., Giacomini and White, 2006), information theoretic criteria (i.e., Akaike and Bayesian Information Criteria), and Diebold-Mariano tests derived from variance swap pricing errors.
We find that the bivariate quadratic model produces better forecasts of variance swap rates than the univariate quadratic and polynomial models, as well as the martingale model. The latter uses today's variance swap rates as a prediction of future variance swap rates. Given the high persistence of variance swap rates, 1 the martingale model is a challenging benchmark. When we regress future variance swap rates on model-based predictions of variance swap rates, we find that the bivariate quadratic model has an intercept and a slope not statistically different from zero and one, respectively, and thus produces accurate forecasts. The bivariate model outperforms the martingale model, which in turn dominates the univariate quadratic and polynomial models. From an economic perspective, this suggests that the bivariate quadratic model captures well the ex ante risk premiums embedded in variance swaps.
At least two features contribute to the popularity of variance swaps. First, hedging a variance swap is easier than hedging other volatility derivatives. In the absence of asset price jumps, the payoff of a variance swap can be replicated by dynamic trading in the underlying asset and a static position in a continuum of vanilla options with different strike prices and the same underlying and maturity date. In practice, of course, continuous trading is infeasible, and vanilla options exist only for a limited number of strike prices and may not exist at all for a given maturity date. 2 [...] predetermined time horizon. Suppose an investor holds a broadly diversified portfolio and is concerned about volatility risk over the next month. Buying a variance swap on the S&P 500, with one-month maturity, would provide a direct hedge against volatility risk. In contrast, taking positions on options and futures on the VIX index 3 would not provide an equally direct hedge. 4
To assess the economic relevance of variance swaps, we study a dynamic optimal portfolio problem in variance swaps, index option, stock index, and risk-free bond. We use a quadratic jump-diffusion model for the price process of the stock index. We solve for the optimal strategy of a power utility investor who maximizes the expected utility from terminal wealth. The variance swaps are on-the-run and rolled over at pre-specified arbitrary points in time. The optimal strategy, composed of the familiar myopic and intertemporal hedging terms (Merton, 1971), is derived in quasi-closed form. A Taylor series expansion of the intertemporal hedging term involves conditional moments of the state variables, which are available in closed form. We implement the optimal portfolio using three-month and two-year variance swaps, an out-of-the-money put option, and the S&P 500. To include the put option in the investment universe, we develop a novel pricing formula for European options.
The transition density of the stock price process is approximated using an Edgeworth expansion, relying on closed form expressions for joint conditional moments of the stock price and state variables.
We empirically find that the optimal portfolio weights in the variance swaps follow a short-long strategy, with a short position in the two-year variance swap (to earn the negative variance risk premium), and a long position in the three-month variance swap (to hedge volatility increases). 5 This result is consistent with the empirical finding that long-term variance swaps carry more variance risk premium and react less to volatility increases than short-term variance swaps, e.g., Egloff, Leippold, and Wu (2010) and Aït-Sahalia, Karaman, and Mancini (2014). We also find that optimal weights in variance swaps exhibit strong periodic patterns, which depend on the maturity and roll-over date of the contracts. Remarkably, when the stock price does not jump and the investor cannot trade index options, the optimal strategy in variance swaps remains largely the same. This suggests that the short-long strategy in variance swaps is a robust feature of the optimal investment strategy.
The optimal investment in the put option is in line with the numerical calibration in   Table 1). The portfolio weight is very small (less than 1% of the total wealth in most cases), increases in the jump size and/or jump intensity, and can change sign: it is positive when the jump risk premium is small (providing hedging against index jumps), while it is negative when the jump risk premium is large (to earn the jump risk premium).
We consider two relative risk aversion levels, five and one. The first is an average value in survey data. 6 The second corresponds to logarithmic utility. Optimal portfolio weights for both levels share the patterns described above. However, the respective wealth trajectories are largely different. The more risk averse investor takes on smaller positions than the log-investor, in absolute value. This results in a smooth and steady growth of his wealth over time, which is largely unaffected by market declines. In contrast, the wealth trajectory of the log-investor is volatile and fluctuates even more than the S&P 500. This suggests that variance swaps can be used either to achieve stable wealth growth or to seek additional risk premiums, depending on the risk profile of the investor. Rebalancing the portfolio less frequently than daily, such as monthly or yearly, leads to similar results.
To summarize our findings on the optimal portfolio, the short-long strategy in variance swaps appears to be a robust feature of the optimal investment strategy. Variance swaps serve the purpose of providing exposure to volatility risk premiums and hedging volatility risk. The presence of price jumps does not significantly change the optimal investment in variance swaps, provided that the investor can optimally trade index options to hedge price jumps or earn the jump risk premium.
Our paper is related to various strands of the literature. A fast growing literature studies the variance risk premium and its impact on asset prices, e.g., Jiang and Tian (2005), Carr and Wu (2009), Bollerslev, Tauchen, and Zhou (2009), Todorov (2010), Bollerslev and Todorov (2011), Drechsler and Yaron (2011), Mueller, Vedolin, and Yen (2011), Martin (2013), and Bekaert and Hoerova (2014). This line of research focuses almost exclusively on a single maturity. As mentioned above, the term structure of variance swap rates has remained unexplored until recently, e.g., Amengual (2009), Egloff, Leippold, and Wu (2010), and Aït-Sahalia, Karaman, and Mancini (2014). 7 Part of the reason could be that variance swap data became available only recently. We contribute to this line of research by proposing a novel quadratic term structure model, assessing its empirical performance, and studying dynamic optimal portfolios in this setting.
There is an extensive literature on term structure models for interest rates. This literature mainly focuses on affine term structure models, where the zero-coupon yield curve is affine in the state variable which follows an affine diffusion process. 8 The loadings in turn are given in terms of a nonlinear ordinary differential equation. 9 Quadratic and higher order polynomial specifications of the yield curve are very limited, see Filipović (2002) and Chen, Filipović, and Poor (2004). These limitations do not exist for the variance swap curve. This allows us to define the class of generic quadratic variance swap models, where the $\mathbb{Q}$-spot variance is a quadratic, or higher order polynomial, function of the state variable which follows a quadratic diffusion process. The resulting variance swap curve is quadratic, or higher order polynomial, in the state variable, and the loadings are given in terms of a linear ordinary differential equation.
Several studies investigate dynamic optimal portfolios with stochastic investment opportunity set. 10 Most of them consider optimal investment in stock and bond only, without any derivative, often in a one-factor stochastic volatility setting, and when the price process is continuous. The focus is usually on theoretical aspects of optimal portfolios, and thus empirical analyses are not provided. In addition to stock and bond, in an affine setting,  extend the investment opportunity set to options, and Egloff, Leippold, and Wu (2010) to variance swaps. We study, both from theoretical and empirical perspectives, dynamic optimal portfolios including variance swaps and index option in our quadratic setting with stock index jumps.
The structure of the paper is as follows. Section 2 presents variance swaps. Section 3 introduces quadratic variance swap models. Section 4 discusses model estimates. Section 5 studies optimal portfolios in variance swaps, index option, stock index, and risk-free bond. Section 6 investigates the empirical performance of optimal portfolios. Section 7 concludes. Technical derivations and proofs are collected in the online appendix. 11

Variance swaps
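Let $S_t$ denote the price process of a stock index modeled on some filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, \mathbb{Q})$, where $\mathbb{Q}$ is a risk neutral measure. We let $S_t$ be a semimartingale of the form
$$\frac{dS_t}{S_{t-}} = r_t\,dt + \sigma_t\,dB_t + \int_{\mathbb{R}} \xi\,\big(\chi(dt,d\xi) - \nu_t^{\mathbb{Q}}(d\xi)\,dt\big), \tag{1}$$
where $r_t$ is the risk-free rate, $B_t$ is a $\mathbb{Q}$-standard Brownian motion, and $\chi(dt,d\xi)$ denotes the random measure associated to the jumps of $S_t$. Its $\mathbb{Q}$-compensator $\nu_t^{\mathbb{Q}}(d\xi)\,dt$ is such that the last term in (1) is the increment of a $\mathbb{Q}$-pure jump martingale. The diffusive component of the price volatility is $\sigma_t$.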
Let $t = t_0 < t_1 < \cdots < t_n = T$ denote the trading days over a given time horizon $[t, T]$. The annualized realized variance is the annualized sum of squared log-returns over the given time horizon,
$$RV(t,T) = \frac{252}{n} \sum_{i=1}^{n} \left( \log \frac{S_{t_i}}{S_{t_{i-1}}} \right)^2.$$
It is known that, as $\sup_{i=1,\dots,n} (t_i - t_{i-1}) \to 0$, the realized variance converges in probability to the quadratic variation of the log-price,
$$RV(t,T) \to \frac{1}{T-t} \left( \int_t^T \sigma_s^2\,ds + \int_t^T \int_{\mathbb{R}} \big(\log(1+\xi)\big)^2 \chi(ds,d\xi) \right).$$
This approximation is commonly adopted in practice (e.g., Egloff, Leippold, and Wu, 2010) and quite accurate at a daily sampling frequency (e.g., Broadie and Jain, 2008; Jarrow, Kchia, Larsson, and Protter, 2013), as is the case in our data set. 12
A variance swap initiated at $t$ with maturity $T$, or term $T - t$, pays the difference between the annualized realized variance $RV(t,T)$ and the variance swap rate $VS(t,T)$ fixed at $t$. 13 By convention, the variance swap rate is such that the variance swap contract has zero value at inception. No arbitrage implies that
$$VS(t,T) = \frac{1}{T-t}\, E^{\mathbb{Q}}\!\left[ \int_t^T v_s^{\mathbb{Q}}\,ds \,\Big|\, \mathcal{F}_t \right], \tag{4}$$
where $E^{\mathbb{Q}}$ denotes expectation under the risk neutral measure $\mathbb{Q}$, and
$$v_t^{\mathbb{Q}} = \sigma_t^2 + \int_{\mathbb{R}} \big(\log(1+\xi)\big)^2\, \nu_t^{\mathbb{Q}}(d\xi)$$
is the $\mathbb{Q}$-spot variance process. 14 The jump compensator $\nu_t^{\mathbb{P}}(d\xi)\,dt$ of the index price process under the objective probability measure $\mathbb{P}$ differs from the $\mathbb{Q}$-jump compensator $\nu_t^{\mathbb{Q}}(d\xi)\,dt$ in general, [...]
8 Affine diffusion processes are nested in our class of quadratic diffusion processes.
9 See, e.g., Duffie and Kan (1996), Dai and Singleton (2000), Duffie, Pan, and Singleton (2000), Duffie, Filipović, and Schachermayer (2003), and Collin-Dufresne, Goldstein, and Jones (2008). Dai and Singleton (2003) and Duarte (2004) discuss some limitations of affine term structure models. Various extensions of affine models have been suggested by, e.g., Constantinides (1992), Goldstein (2000), Leippold and Wu (2002), Collin-Dufresne and Goldstein (2002), Ahn, Dittmar, and Gallant (2002), Kimmel (2004), and Collin-Dufresne, Goldstein, and Jones (2009).
10 Liu, Longstaff, and Pan (2003) use a one-factor price jump setting to study optimal investment in stock and bond only. Chacko and Viceira (2005) use a one-factor affine diffusive setting to study optimal investment in stock and bond only. Liu (2007) studies optimal investment in multiple stocks and bond in a similar affine diffusive setting. Aït-Sahalia, Cacho-Diaz, and Hurd (2009) study optimal investment in multiple stocks (and bond) featuring price jumps with constant return volatilities. Detemple and Rindisbacher (2010) study optimal investment in multiple stocks and bond in a diffusive setting with constant return volatility. Kim and Omberg (1996), Brennan and Xia (2002), and Sangvinatsos and Wachter (2005) provide related studies.
11 http://jfe.rochester.edu/appendix.htm
12 Market microstructure noise, while generally a concern in high frequency inference, is largely a non-issue at the level of daily returns.
13 As the difference is in variance units, the payoff is converted in dollar units via a suitable notional amount.
14 We assume that the risk-free rate and the $\mathbb{Q}$-spot variance are independent processes, which is certainly a tenuous assumption.
To consistently price variance swaps and capture the term structure of volatility risk, it is crucial to design models for the entire variance swap curve $T \mapsto VS(t,T)$. In view of (4), this boils down to modeling the $\mathbb{Q}$-spot variance process $v_t^{\mathbb{Q}}$. These models should be analytically tractable and yet flexible enough to reproduce the empirical features of variance swap rates. Any positive semimartingale whose $\mathbb{Q}$-spot variance process coincides with $v_t^{\mathbb{Q}}$ is then a consistent price process in the sense that $VS(t,T)$ is the corresponding variance swap rate.
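The realized variance and the variance swap payoff described in this section can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the annualization factor of 252 trading days per year is an assumption.

```python
import math


def annualized_realized_variance(prices, trading_days_per_year=252):
    """Annualized sum of squared daily log-returns.

    `prices` holds the index level on the trading days t_0, ..., t_n.
    The factor trading_days_per_year / n is an assumed annualization rule.
    """
    if len(prices) < 2:
        raise ValueError("need at least two prices")
    n = len(prices) - 1
    sum_sq = sum(
        math.log(prices[i] / prices[i - 1]) ** 2 for i in range(1, n + 1)
    )
    return trading_days_per_year / n * sum_sq


def variance_swap_payoff(realized_variance, swap_rate, notional=1.0):
    """Payoff to the long side: notional times (realized variance - swap rate)."""
    return notional * (realized_variance - swap_rate)
```

For example, `variance_swap_payoff(annualized_realized_variance(prices), 0.04, notional=1e6)` would give the dollar payoff of a long position struck at a variance swap rate of 0.04 (20% in volatility units).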
It is instructive to draw an analogy between the term structure of variance swaps and interest rates. The variance swap curve reflects market expectations about future changes in $\mathbb{Q}$-spot variance, see (4). The financial variable in interest rate models corresponding to the $\mathbb{Q}$-spot variance $v_t^{\mathbb{Q}}$ is the risk-free short rate $r_t$. Market expectations about future changes in short rates are expressed in terms of the zero-coupon yield curve with short end given by $y(t,t) = r_t$. Clearly, the yield curve is a nonlinear function of the short rate process. In contrast, the variance swap curve is a linear function of the $\mathbb{Q}$-spot variance process. This linear relation gives greater flexibility for the specification of analytically tractable term structure models for variance swaps than for interest rates. Indeed, most common factor models for the term structure of interest rates are affine term structure models. The short rate is specified as an affine function of the state variable which follows an affine diffusion process. The resulting yield curve is affine in the state variable, and the loadings are given as solutions to a nonlinear ordinary differential equation, e.g., Duffie and Kan (1996) and Dai and Singleton (2000). Specifying the short rate as a quadratic function of the state variable is possible. But it generically requires that the state variable follows a Gaussian process, e.g., Ahn, Dittmar, and Gallant (2002), Chen, Filipović, and Poor (2004), and Liu (2007). 15 Moreover, there exists no consistent polynomial specification of the yield curve beyond second order, Filipović (2002). These limitations do not exist for variance swap term structure models, and this flexibility is exploited here.

Quadratic variance swap models
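Let $X_t$ be a diffusion process in some state space $\mathcal{X} \subset \mathbb{R}^m$, solving the stochastic differential equation (SDE)
$$dX_t = \mu(X_t)\,dt + \Sigma(X_t)\,dW_t,$$
where $W_t$ is a standard $d$-dimensional Brownian motion under the risk neutral measure $\mathbb{Q}$, and $\mu(x)$ and $\Sigma(x)$ are $\mathbb{R}^m$- and $\mathbb{R}^{m \times d}$-valued functions on $\mathcal{X}$, for some integers $m, d \ge 1$. The process $X_t$ has the following quadratic structure:
Definition 3.1. The diffusion $X_t$ is called quadratic if its drift and diffusion functions are linear and quadratic in the state variable,
$$\mu(x) = b + \beta x, \tag{9}$$
$$\Sigma(x)\Sigma(x)^\top = a + \sum_{k=1}^{m} \alpha^{k} x_k + \sum_{k,l=1}^{m} A^{kl} x_k x_l, \tag{10}$$
for some parameters $b \in \mathbb{R}^m$, $\beta \in \mathbb{R}^{m \times m}$, and $a, \alpha^{k}, A^{kl} \in \mathbb{S}^m$ with $A^{kl} = A^{lk}$, where $\mathbb{S}^m$ denotes the set of symmetric $m \times m$-matrices, and $\top$ denotes transpose.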
An $m$-factor quadratic variance swap model is obtained by specifying the $\mathbb{Q}$-spot variance as a quadratic function of the state variable,
$$v_t^{\mathbb{Q}} = g(X_t),$$
with $g(x) = \phi + \psi^\top x + x^\top \pi x$, for some parameters $\phi \in \mathbb{R}$, $\psi \in \mathbb{R}^m$, and $\pi \in \mathbb{S}^m$. The following theorem justifies the terminology of quadratic variance swap model.
The online appendix shows that, under mild technical conditions, the converse to Theorem 3.2 also holds true: a quadratic term structure implies that the $\mathbb{Q}$-spot variance function and the state diffusion process $X_t$ are necessarily quadratic. This result implies that our quadratic model framework is exhaustive in the sense that we do not miss any other diffusion specification which is consistent with a quadratic term structure.
We also specify an $\mathbb{R}^d$-valued process for the market price of risk, $\Lambda$, such that $dW_t^{\mathbb{P}} = dW_t - \Lambda_t\,dt$ is a $\mathbb{P}$-Brownian motion, and $\Sigma(X_t)\Lambda_t = \Upsilon_0 + \Upsilon_1 X_t$ holds for some parameters $\Upsilon_0 \in \mathbb{R}^m$ and $\Upsilon_1 \in \mathbb{R}^{m \times m}$. This implies that the $\mathbb{P}$-dynamics of $X_t$ are of the form
$$dX_t = \big(b + \Upsilon_0 + (\beta + \Upsilon_1) X_t\big)\,dt + \Sigma(X_t)\,dW_t^{\mathbb{P}}.$$
Thus, the process $X_t$ follows a quadratic diffusion under $\mathbb{P}$ as well. The properties of $X_t$ derived from the quadratic structure hold under $\mathbb{Q}$ as well as under $\mathbb{P}$. It follows by inspection that an affine transformation of the state, $X_t \mapsto c + \gamma X_t$, preserves the quadratic property (9) and (10) of $X_t$ and the quadratic term structure (12). From an econometric viewpoint, this implies that the above general model is not identifiable. This calls for a canonical representation. A full specification analysis of general multi-factor quadratic models is beyond the scope of this paper. 16 In the following sections, we first provide an exhaustive specification analysis for the univariate quadratic model. We then study a bivariate extension and univariate polynomial specifications of higher order. Model identification is asserted in terms of canonical representations.

Univariate quadratic model
In this section, let $m = d = 1$ and consider a univariate quadratic diffusion
$$dX_t = (b + \beta X_t)\,dt + \sqrt{a + \alpha X_t + A X_t^2}\,dW_t \tag{15}$$
on some interval $\mathcal{X}$ in $\mathbb{R}$ and for some real parameters $b$, $\beta$, $a$, $\alpha$, and $A \ge 0$. The linear ordinary differential equations (13) simplify, and the explicit expressions are given in the online appendix. The invariance of quadratic processes with respect to affine transformations allows us to distinguish exactly three equivalence classes of quadratic processes on unbounded intervals with a canonical representation each. In other words, any univariate quadratic process (on unbounded intervals and possibly after an affine transformation) necessarily falls in one of the three equivalence classes. The three canonical representations are identifiable, and thus can be estimated using variance swap data.
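To make the univariate term structure concrete, the sketch below integrates a system of loading equations numerically. It rests on stated assumptions rather than on the authors' explicit expressions (which are in the online appendix): the state follows $dX_t = (b + \beta X_t)\,dt + \sqrt{a + \alpha X_t + A X_t^2}\,dW_t$, the spot variance is $g(x) = \phi + \psi x + \pi x^2$, and $VS(t,T) = G(T-t, X_t)/(T-t)$ with $G(\tau,x) = \Phi(\tau) + \Psi(\tau)x + \Pi(\tau)x^2$. Matching coefficients in $\partial_\tau G = g + \mathcal{L}G$, where $\mathcal{L}$ is the generator of $X_t$, gives $\Phi' = \phi + b\Psi + a\Pi$, $\Psi' = \psi + \beta\Psi + (2b+\alpha)\Pi$, and $\Pi' = \pi + (2\beta + A)\Pi$, all starting at zero. The function names and the Runge-Kutta integrator are illustrative choices.

```python
def quadratic_loadings(tau, b, beta, a, alpha, A, phi, psi, pi, steps=400):
    """Integrate the loading ODEs from 0 to tau with classical RK4.

    Returns (Phi, Psi, Pi) such that G(tau, x) = Phi + Psi*x + Pi*x**2.
    """
    def rhs(y):
        Phi, Psi, Pi = y
        return (
            phi + b * Psi + a * Pi,
            psi + beta * Psi + (2.0 * b + alpha) * Pi,
            pi + (2.0 * beta + A) * Pi,
        )

    def shifted(y, k, h):
        return tuple(yi + h * ki for yi, ki in zip(y, k))

    y = (0.0, 0.0, 0.0)  # initial condition: all loadings vanish at tau = 0
    h = tau / steps
    for _ in range(steps):
        k1 = rhs(y)
        k2 = rhs(shifted(y, k1, 0.5 * h))
        k3 = rhs(shifted(y, k2, 0.5 * h))
        k4 = rhs(shifted(y, k3, h))
        y = tuple(
            yi + h / 6.0 * (c1 + 2.0 * c2 + 2.0 * c3 + c4)
            for yi, c1, c2, c3, c4 in zip(y, k1, k2, k3, k4)
        )
    return y


def variance_swap_rate(tau, x, **params):
    """Variance swap rate for term tau > 0 and current state x."""
    Phi, Psi, Pi = quadratic_loadings(tau, **params)
    return (Phi + Psi * x + Pi * x * x) / tau
```

Because the system is linear and triangular, closed-form solutions exist as well; the numerical integrator is used here only to keep the sketch short.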
Theorem 3.3. Denote the discriminant of the diffusion function of $X_t$ by $D = \alpha^2 - 4Aa$. The quadratic process $X_t$ falls in one of the following three equivalence classes:
Class 1: Either $A > 0$ and $D < 0$, or $A = \alpha = 0$ and $a > 0$. The canonical representation is specified by [...] Note that for $A = 0$ we obtain a Gaussian process.
Class 3: Either $A > 0$ and $D > 0$, or $A = 0$ and $\alpha \ne 0$. The canonical representation is specified by $\mathcal{X} = [0, +\infty)$, [...] The boundary point 0 is not attainable if and only if $b \ge 1/2$, in which case we can choose $\mathcal{X} = (0, +\infty)$. Note that for $A = 0$ we obtain an affine process.
Remark 3.4. For $A < 0$ and $D > 0$, the state space $\mathcal{X}$ becomes bounded. The canonical representation for this equivalence class is the Jacobi process on $\mathcal{X} = [0, 1]$. We do not consider this case, as here we focus on state processes on unbounded state spaces.
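The case distinction in Theorem 3.3 and Remark 3.4 can be sketched as a small classifier on the diffusion coefficients $(a, \alpha, A)$. The conditions for Classes 1 and 3 and for the bounded case are taken from the text; the condition used for Class 2 ($A > 0$ and $D = 0$) is an assumption obtained by elimination. The function name, the returned labels, and the tolerance are hypothetical.

```python
def classify_quadratic_diffusion(a, alpha, A, tol=1e-12):
    """Classify the diffusion function a + alpha*x + A*x**2 by its discriminant."""
    D = alpha * alpha - 4.0 * A * a

    if A > tol:
        if D < -tol:
            return "Class 1"
        if D > tol:
            return "Class 3"
        return "Class 2"  # assumed by elimination: A > 0 and D = 0

    if abs(A) <= tol:
        if abs(alpha) > tol:
            return "Class 3"  # affine diffusion function
        if a > tol:
            return "Class 1"  # constant diffusion function (Gaussian case)
        raise ValueError("degenerate diffusion function")

    # A < 0: bounded state space, not considered in the paper
    if D > tol:
        return "bounded (Remark 3.4)"
    raise ValueError("diffusion function is never positive")
```

For instance, a square-root-type diffusion function $x$ (with $a = 0$, $\alpha = 1$, $A = 0$) is assigned to Class 3, and a constant diffusion function to Class 1.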

Bivariate quadratic model
In this section, we consider a bivariate extension of the above univariate quadratic model. Higher dimensional extensions are conceptually straightforward, but these models would be difficult to estimate because of the large number of parameters. Our empirical analysis below shows that a bivariate model provides a good fit to variance swaps and quadratic variation; thus, higher dimensional extensions do not appear to be practically relevant.
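Let $m = 2$ and consider a bivariate quadratic diffusion $X_t = (X_{1t}, X_{2t})^\top$ of the form
$$dX_{1t} = (b_1 + \beta_{11} X_{1t} + \beta_{12} X_{2t})\,dt + \sqrt{a_1 + \alpha_1 X_{1t} + A_1 X_{1t}^2}\,dW_{1t},$$
$$dX_{2t} = (b_2 + \beta_{22} X_{2t})\,dt + \sqrt{a_2 + \alpha_2 X_{2t} + A_2 X_{2t}^2}\,dW_{2t},$$
with $\beta_{12} \ge 0$ and $X_{2t} \ge 0$. The components $X_{1t}$ and $X_{2t}$ are instantaneously uncorrelated and only interact via the drift term. The $\mathbb{Q}$-spot variance function is assumed to depend on $X_{1t}$ only,
$$g(x) = \phi + \psi x_1 + \pi x_1^2,$$
where $x = (x_1, x_2)$, for some real parameters $\phi$, $\psi$, and $\pi$.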
Hence, $X_{1t}$ drives the $\mathbb{Q}$-spot variance, while $X_{2t}$ determines the stochastic mean reversion level, $-(b_1 + \beta_{12} X_{2t})/\beta_{11}$, of $X_{1t}$. The linear ordinary differential equations (13) simplify, and the explicit expressions are given in the online appendix. The admissible specifications for $X_{2t}$ are either Class 2 or 3 with the corresponding canonical representations given by Theorem 3.3. The diffusion function of $X_{1t}$ can be of any Class 1-3 with the corresponding canonical representations from Theorem 3.3. Imposing $b_1 = 0$ when the diffusion function of $X_{1t}$ is in Class 1 or 2, and $b_1 = 0$ or $1/2$ when it is in Class 3, ensures that the bivariate quadratic model is identified. This is proved in the online appendix. The univariate quadratic model is nested in the bivariate model, setting $X_{2t}$ to a positive constant value.
To keep the model parsimonious, a risk premium is attached only to the first Brownian motion, $W_{1t}$. The market price of risk process is then [...] The parameter $\lambda_0$ can take any real value if the diffusion function of $X_{1t}$ is in Class 1, $\lambda_0 \ge 0$ if the diffusion function of $X_{1t}$ is in Class 2 or in Class 3 along with $b_1 = 1/2$, and $\lambda_0 = 0$ otherwise. It follows from Cheridito, Filipović, and Kimmel (2007) that the change of measure $\mathbb{P} \sim \mathbb{Q}$ is well-defined under these conditions.
16 This would require finding necessary and sufficient conditions on the model parameters and the state space $\mathcal{X}$ such that the multivariate quadratic diffusion $X_t$ is well-defined in $\mathcal{X}$. The matrix-valued quadratic form on the right hand side of (10) needs to be positive semi-definite for all $x \in \mathcal{X}$. Moreover, it has to vanish in the direction orthogonal to the boundary at all boundary points, for the state space to be invariant under the dynamics of $X_t$. Hence, the state space $\mathcal{X}$ is specified by the zeros of quadratic forms on $\mathbb{R}^m$. The zero level sets of quadratic forms on $\mathbb{R}^m$ are complex geometric objects, and the canonical classification of quadratic diffusions would at least require an exhaustive classification of such zero level sets. Filipović and Larsson (2014) provide a related study.

Univariate polynomial model
An important property of quadratic diffusion processes is that their conditional $n$th moments are available in closed form as polynomials of degree $n$ in the state variables. This is in fact the reason why in Theorem 3.2 we obtain the closed form quadratic expression for $G(T-t, X_t)$. Indeed, $\partial G(T-t, X_t)/\partial T$ is simply the $\mathcal{F}_t$-conditional moment of the quadratic polynomial $g(X_T)$ in $X_T$. This polynomial preserving property of $X_t$ suggests a natural extension of the quadratic variance swap models, namely, higher order polynomial variance swap models. Here we discuss the univariate case. The multivariate case is a straightforward but notationally cumbersome extension.
As in Section 3.1, we consider the univariate quadratic diffusion process (15). The following theorem formalizes the polynomial preserving property of X t .
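Theorem 3.5. The $(N+1)$ row vector of the first $N$ $\mathcal{F}_t$-conditional moments of $X_{t+\tau}$ with $\tau \ge 0$ is given by
$$E\big[(1, X_{t+\tau}, \dots, X_{t+\tau}^N) \mid \mathcal{F}_t\big] = (1, X_t, \dots, X_t^N)\, e^{B\tau},$$
where $B$ is an upper triangular $(N+1) \times (N+1)$ matrix defined in the online appendix, and $e^{B\tau}$ denotes the matrix exponential of $B\tau$.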
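The polynomial preserving property can be illustrated with a short, dependency-free sketch. The matrix $B$ used below is not copied from the online appendix; it is an assumed construction obtained by applying the generator of $dX_t = (b + \beta X_t)\,dt + \sqrt{a + \alpha X_t + A X_t^2}\,dW_t$ to the monomial $x^n$, which yields an upper triangular matrix with three nonzero entries per column. The matrix exponential is computed with a basic scaling-and-squaring Taylor scheme, chosen for brevity rather than numerical robustness.

```python
def moment_matrix(N, b, beta, a, alpha, A):
    """Assumed upper triangular B: column n holds the coefficients of L x**n."""
    B = [[0.0] * (N + 1) for _ in range(N + 1)]
    for n in range(1, N + 1):
        B[n][n] = n * beta + 0.5 * n * (n - 1) * A
        B[n - 1][n] = n * b + 0.5 * n * (n - 1) * alpha
        if n >= 2:
            B[n - 2][n] = 0.5 * n * (n - 1) * a
    return B


def mat_mul(X, Y):
    size = len(X)
    return [
        [sum(X[i][k] * Y[k][j] for k in range(size)) for j in range(size)]
        for i in range(size)
    ]


def mat_exp(M, squarings=10, terms=20):
    """Matrix exponential via Taylor series on M / 2**squarings, then squaring."""
    size = len(M)
    scale = 2.0 ** squarings
    S = [[v / scale for v in row] for row in M]
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[v / k for v in row] for row in mat_mul(term, S)]
        result = [
            [result[i][j] + term[i][j] for j in range(size)] for i in range(size)
        ]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def conditional_moments(x, tau, N, b, beta, a, alpha, A):
    """Return [E[X**0], E[X**1], ..., E[X**N]] at horizon tau given X_t = x."""
    B = moment_matrix(N, b, beta, a, alpha, A)
    E = mat_exp([[v * tau for v in row] for row in B])
    powers = [x ** i for i in range(N + 1)]
    return [sum(powers[i] * E[i][j] for i in range(N + 1)) for j in range(N + 1)]
```

In the Gaussian special case ($\alpha = A = 0$) the first moment returned by `conditional_moments` reduces to the familiar expression $x e^{\beta\tau} + (b/\beta)(e^{\beta\tau} - 1)$.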
A polynomial variance swap model is then obtained by specifying the $\mathbb{Q}$-spot variance as a polynomial function
$$v_t^{\mathbb{Q}} = p(X_t) = \sum_{i=0}^{N} p_i X_t^i.$$
The following corollary is an immediate consequence of Theorem 3.5.
Corollary 3.6. Under the above assumptions, the polynomial variance swap model admits a polynomial term structure. That is, the variance swap rates are polynomial of degree $N$ in $X_t$:
$$VS(t,T) = \frac{1}{T-t} \sum_{i=0}^{N} P_i(T-t)\, X_t^i,$$
where the functions $P_i : [0, +\infty) \to \mathbb{R}$ satisfy the linear ordinary differential equations
$$\frac{dP(\tau)}{d\tau} = B\,P(\tau) + p, \qquad P(0) = 0, \tag{24}$$
where $P(\tau) = (P_0(\tau), P_1(\tau), \dots, P_N(\tau))^\top$ and $p = (p_0, p_1, \dots, p_N)^\top$. System (24) is equivalent to (13) for $N = 2$, with loadings $\Phi(\tau) = P_0(\tau)$, $\Psi(\tau) = P_1(\tau)$, and $\Pi(\tau) = P_2(\tau)$.
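One way to see why the loadings satisfy a linear system is the following sketch. It assumes, as in the quadratic case, that the variance swap rate is $VS(t,T) = G(T-t, X_t)/(T-t)$ with $\partial G(\tau, x)/\partial \tau$ equal to the conditional expectation of $p(X_{t+\tau})$, and that $X_t$ is time-homogeneous. The row vector $m(x)$ is notation introduced here for illustration only.

```latex
\begin{align*}
m(x) &= (1, x, \dots, x^N), \qquad p(x) = m(x)\, p, \\
\frac{\partial G(\tau, x)}{\partial \tau}
  &= E\big[\, p(X_{t+\tau}) \mid X_t = x \,\big] = m(x)\, e^{B\tau} p, \\
G(\tau, x) &= m(x) \int_0^\tau e^{Bs} p \, ds = m(x)\, P(\tau), \\
\frac{dP(\tau)}{d\tau} &= e^{B\tau} p
  = p + B \int_0^\tau e^{Bs} p \, ds
  = B\, P(\tau) + p, \qquad P(0) = 0.
\end{align*}
```

The second line uses Theorem 3.5, and the last line uses the identity $e^{B\tau} = I + B \int_0^\tau e^{Bs}\,ds$.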

Model estimation
In this section, we fit the variance swap models in Sections 3.1-3.3 to variance swap rates on the S&P 500 and its quadratic variation computed from tick-by-tick S&P 500 futures prices. An advantage of this estimation approach is that model estimates are not impaired by potential misspecifications of the index dynamics, which allows for a thorough comparison of the variance swap models.

Data set
Our data set includes daily over-the-counter quotes of variance swap rates on the S&P 500, with fixed terms at two, three, and six months, and one and two years. 17 It ranges from January 4, 1996 to June 7, 2010, and includes 3,626 observations for each term. Standard statistical tests do not detect any day-of-the-week effect, so we use all available daily data. An interesting feature of this data set is that terms, rather than maturities, are fixed. This facilitates the comparison of the term structure over time, without using any interpolation method to recover variance swap rates for a specific term. Our data set also includes daily quadratic variation computed from tick-by-tick S&P 500 futures prices using the two-scale estimator of Zhang, Mykland, and Aït-Sahalia (2005).
Fig. 1 shows the term structure of variance swap rates over time and suggests that variance swap rates are mean-reverting, volatile, with spikes and clustering during the major financial crises over the last 15 years, and historically high values during the financial crisis in fall 2008. While most term structures are upward sloping (48% of our sample), they can also be ∪-shaped (23% of our sample) and rarely downward sloping or ∩-shaped. 18 The bottom and peak of the ∪- and ∩-shaped parts of the term structures can be anywhere at the three- or six-month or one-year terms. The slope of the term structure, measured as the difference between the two-year and two-month variance swap rates, shows a strong negative relation to the contemporaneous level of volatility. Thus, in high volatility periods, the short end of the term structure (variance swap rates with two- or three-month terms) rises more than the long end, producing downward sloping term structures.
Table 1 provides summary statistics of our data set. We split the sample in two parts. The first part ranges from January 4, 1996 to April 2, 2007, includes 2,832 daily observations (about 3/4 of the whole sample), and will be used for in-sample analysis and model estimation. The second part ranges from April 3, 2007 to June 7, 2010, includes 794 daily observations, and will be used for out-of-sample analysis, including model validation. The out-of-sample analysis appears to be particularly interesting as the sample period covers the recent financial crisis, a period of unprecedented market turmoil, which was not experienced in the in-sample period.
17 We thank Mika Kastenholz from Credit Suisse for providing us with the variance swap data.
18 On some occasions, the term structure is ∼-shaped, but the difference between, e.g., the two- and three-month variance swap rates is virtually zero and this term structure is nearly ∪-shaped.
For the sake of interpretability, we follow market practice and report variance swap rates in volatility percentage units, i.e., $\sqrt{VS(t,T)} \times 100$. Various empirical regularities emerge from Table 1. The mean level of variance swap rates is slightly but strictly increasing with term. The standard deviation, skewness, and kurtosis of variance swap rates are decreasing with term. Unreported first order autocorrelations of variance swap rates range from 0.984 to 0.995, are slightly increasing with the term, and imply a mean half-life of shocks between 43 and 138 days. 19 This confirms that mean reversion is present in the time series and suggests that long-term variance swap rates are more persistent than short-term rates. Comparing in- and out-of-sample statistics reveals a significant increase in level and volatility of variance swap rates, mainly due to the market turmoil in fall 2008.
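The half-life figures quoted above can be reproduced with a short calculation. The sketch assumes that the half-life is computed from a first order autoregression with daily autocorrelation $\rho$, i.e., as $\log(0.5)/\log(\rho)$; this formula and the function names are assumptions, since the text does not spell out the computation. The second helper shows the conversion to volatility percentage units.

```python
import math


def half_life_in_days(rho):
    """Half-life of a shock under an assumed AR(1) with daily autocorrelation rho."""
    if not 0.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between 0 and 1")
    return math.log(0.5) / math.log(rho)


def to_volatility_percent(variance_swap_rate):
    """Convert a variance swap rate to volatility percentage units."""
    return 100.0 * math.sqrt(variance_swap_rate)


short_term = half_life_in_days(0.984)  # about 43 days
long_term = half_life_in_days(0.995)   # about 138 days
```

Under this assumption, the autocorrelations 0.984 and 0.995 map to half-lives of roughly 43 and 138 days, in line with the range reported in the text.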
A Principal Component Analysis (PCA) shows that the first principal component explains about 95.3% of the total variance of variance swap rates and can be interpreted as a level factor, while the second principal component explains an additional 3.8% and can be interpreted as a slope factor. 20 This finding is somewhat expected because PCA of several other term structures, such as bond yields, produces qualitatively similar results. Less expected is that two factors explain nearly all the variance of variance swap rates, i.e., 99.1%. Repeating the PCA for various subsamples produces little variation in the first two factors and explained total variance.
Table 1 also shows summary statistics of variance swap floating legs, i.e., the realized variance of daily S&P 500 returns for various terms. All statistics of realized variances share qualitatively the same features as those of the variance swap rates. The main difference is that, especially during the in-sample period, realized variances tend to be lower, more volatile, more positively skewed, and more leptokurtic than variance swap rates. This difference highlights the profitability and riskiness of shorting variance swaps, earning the large negative variance risk premiums embedded in such contracts. The ex post variance risk premium is defined as the average realized variance minus the variance swap rate, which is simply the average payoff of a long position in the respective variance swap. The corresponding summary statistics are reported in the last panel of Table 1. In the in-sample period, ex post variance risk premiums are negative and, except for the longest maturity, increasing in absolute value with the term. Notably, ex post Sharpe ratios from shorting variance swaps also increase with their term, ranging from 0.60 (= 1.67/2.80) for two-month variance swaps to 0.85 (= 2.15/2.54) for one-year variance swaps. This suggests that it is more profitable on average to sell long-term than short-term variance swaps. In the out-of-sample period, the opposite holds as short-term variance swap rates increase proportionally more than long-term variance swap rates, making it more profitable, ex post, to buy long-term variance swaps.
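The Sharpe ratios quoted above can be reproduced from the two numbers in each parenthesis. A minimal sketch, assuming these numbers are the mean and the standard deviation of the ex post payoff from shorting the swap (Table 1 is not reproduced here, so this reading is an assumption):

```python
def sharpe_ratio(mean_payoff: float, payoff_std: float) -> float:
    """Average payoff divided by its standard deviation."""
    if payoff_std <= 0.0:
        raise ValueError("standard deviation must be positive")
    return mean_payoff / payoff_std

# (mean, standard deviation) pairs as quoted in the text; the labels are illustrative
quoted = {"two-month": (1.67, 2.80), "one-year": (2.15, 2.54)}
sharpe = {term: sharpe_ratio(m, s) for term, (m, s) in quoted.items()}

for term, value in sharpe.items():
    print(f"{term}: {value:.2f}")
```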
To summarize, the term structure of variance swap rates exhibits rich dynamics, challenging any term structure model. Whether our quadratic models are flexible enough to fit variance swap rates is an empirical question that we address in the following sections.

Model estimates
The state process $X_t$ driving the term structure is not observed, and variance swap rates and quadratic variation are nonlinear functions of $X_t$. To extract the latent state we use the nonlinear unscented Kalman filter, which Christoffersen, Dorion, Jacobs, and Karoui (2012) find to have good finite sample properties in the context of estimating dynamic term structure models. We then estimate the model parameters using maximum likelihood.
The measurement equation entails a six-dimensional observation vector. The first five components are the variance swap rates with terms equal to two, three, and six months, and one and two years. The last component is the logarithm of the daily quadratic variation computed from tick-by-tick S&P 500 futures prices, entering the measurement equation as an affine function of $\log(v^{\mathbb{Q}}_t)$. We therefore use information both from the variance swap and S&P 500 futures markets to estimate the models. The online appendix provides details on the estimation method.
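To illustrate why an unscented filter suits measurements that are quadratic in the state, the sketch below applies a scalar unscented transform to a quadratic function and compares it with the exact Gaussian mean. This is not the authors' estimation code: the coefficients, the state moments, and the choice kappa = 2 are hypothetical, and only the mean propagation step of the filter is shown.

```python
import math

def unscented_mean(f, mean, var, kappa=2.0):
    """Approximate E[f(X)] for scalar X ~ N(mean, var) using three sigma points."""
    spread = math.sqrt((1.0 + kappa) * var)
    points = (mean, mean + spread, mean - spread)
    weights = (kappa / (1.0 + kappa),
               0.5 / (1.0 + kappa),
               0.5 / (1.0 + kappa))
    return sum(w * f(x) for w, x in zip(weights, points))

# Hypothetical coefficients of a quadratic measurement function
phi, psi, pi_ = 0.01, 0.05, 0.20

def quadratic(x):
    return phi + psi * x + pi_ * x * x

m, P = 0.3, 0.04  # hypothetical filtered mean and variance of the state
ut_mean = unscented_mean(quadratic, m, P)
exact_mean = phi + psi * m + pi_ * (m * m + P)  # uses E[X^2] = m^2 + P
naive_mean = quadratic(m)                       # plug-in value that ignores P

print(ut_mean, exact_mean, naive_mean)
```

For a quadratic function the sigma-point average matches the exact mean, whereas plugging in the state mean misses the term proportional to the state variance.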
It is known that univariate affine models cannot capture the empirical features of variance swap rates, e.g., Egloff, Leippold, and Wu (2010) and Aït-Sahalia, Karaman, and Mancini (2014). These models, for example, can only produce upward or downward sloping term structures, and all variance swap rates have the same persistence. Such model-based features of variance swap rates are in sharp contrast with the empirical features summarized in Table 1. In principle our univariate quadratic model in Section 3.1 could capture the empirical features of variance swap rates. Intuitively, the quadratic features of the $\mathbb{Q}$-spot variance $v^{\mathbb{Q}}_t$ and of the state process diffusion function relax the constraints imposed by an affine specification.
We begin model estimations by fitting each of the three canonical representations of the univariate quadratic model in Section 3.1 to variance swap rates and quadratic variation. We find that the largest log-likelihood of the univariate quadratic model is achieved when the state process $X_t$ is in Class 3 (Theorem 3.3). This finding is confirmed by Akaike and Bayesian Information Criteria (AIC and BIC). 21 Table 2 reports the corresponding parameters, which are estimated rather precisely.
Table 2. Model estimates. Entries are parameter estimates (Est.) for the univariate quadratic, univariate polynomial, and bivariate quadratic models, and corresponding standard errors (S.E.). Identifiable, thus restricted, versions of the following model are estimated: $\mathbb{Q}$-dynamics of the state process $dX_{1t} = (b \ldots$ for the $\mathbb{Q}$-Brownian motion $W_{1t}$. An empty entry means that the parameter is set to zero to achieve model identification. Models are estimated using maximum likelihood with unscented Kalman filter. The measurement equation is six-dimensional. The first five components are variance swap rates with variance of measurement error $\sigma^2_{VS}$. The sixth component is the logarithm of the daily quadratic variation, $\log(QV_t)$, with expectation $c_0 + c_1 \log(v^{\mathbb{Q}}_t)$, and conditionally normal measurement error $\epsilon_t$ with mean $\rho_\epsilon \epsilon_{t-1}$ and variance $c_2 + c_3\, QV_{t-1}$. The online appendix provides details on the estimation approach. AIC and BIC are Akaike and Bayesian Information Criteria, respectively. Sample data are daily variance swap rates on the S&P 500, with terms of two, three, six, 12, 24 months, and daily quadratic variation computed from tick-by-tick S&P 500 futures prices, from January 4, 1996 to April 2, 2007, a total of 2,832 observations for each series.
We further investigate this model by considering four parametric restrictions that induce four alternative model specifications. Each specification is tested via a likelihood ratio (LR) test. Specification 1 imposes that $X_t$ has an affine dynamic by setting the quadratic coefficient $A = 0$ in (15). Specification 2 constrains the $\mathbb{Q}$-spot variance function, $v^{\mathbb{Q}}_t = \phi + \psi X_t + \pi X_t^2$, to be linear in $X_t$ by setting $\pi = 0$. The corresponding LR tests strongly reject both restrictions, suggesting that the quadratic features of $X_t$ and $v^{\mathbb{Q}}_t$ play an important role in fitting variance swap rates and quadratic variation. Specification 3 restricts the functional form of the $\mathbb{Q}$-spot variance by imposing the $\mathbb{Q}$-spot variance function to have exactly one root, i.e., $\psi^2 = 4\phi\pi$. This guarantees the nonnegativity of the $\mathbb{Q}$-spot variance for any realization of $X_t$. Specification 4 further restricts Specification 3 by testing whether the root is at $X_t = 0$, i.e., $\phi = \psi = 0$. The corresponding LR tests strongly reject both restrictions, confirming that a flexible quadratic link between $v^{\mathbb{Q}}_t$ and $X_t$ is statistically important to fit variance swap rates. To summarize, these statistical tests suggest that the full flexibility of the univariate quadratic model is necessary to fit variance swap rates and quadratic variation.
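The single-root restriction in Specification 3 can be checked numerically: when the discriminant of the quadratic vanishes and the leading coefficient is positive, the spot variance touches zero at exactly one point and is positive elsewhere. A minimal sketch with hypothetical coefficient values (chosen to be exactly representable in floating point); the helper names are illustrative:

```python
def spot_variance(x, phi, psi, pi_):
    """Quadratic spot variance phi + psi * x + pi * x^2."""
    return phi + psi * x + pi_ * x * x

def is_nonnegative_everywhere(phi, psi, pi_):
    """True if phi + psi * x + pi * x^2 >= 0 for every real x."""
    if pi_ > 0:
        return psi * psi <= 4.0 * phi * pi_
    return pi_ == 0 and psi == 0 and phi >= 0

# Specification 3 (hypothetical values): psi^2 = 4 * phi * pi, single root at -psi / (2 * pi)
phi3, psi3, pi3 = 0.25, 0.5, 0.25
root3 = -psi3 / (2.0 * pi3)

# Specification 4 (hypothetical values): phi = psi = 0, single root at zero
phi4, psi4, pi4 = 0.0, 0.0, 0.25

# A quadratic with positive discriminant can turn negative between its two roots
phi_u, psi_u, pi_u = 0.0, 0.5, 0.25

print(root3, spot_variance(root3, phi3, psi3, pi3))
print(is_nonnegative_everywhere(phi_u, psi_u, pi_u))
```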
We now investigate whether enriching the functional form of the $\mathbb{Q}$-spot variance can improve the fit to the data. We estimate the univariate polynomial variance swap model in Section 3.3 when the state process $X_t$ follows a quadratic diffusion and the degree of the polynomial is $N = 5$. The choice $N = 5$ ensures that the univariate polynomial model has the same number of parameters as the bivariate quadratic model, estimated next. Table 2 reports the parameter estimates. 22 The additional parameters, $p_3$, $p_4$, $p_5$, allow only for a modest increase in the log-likelihood and a modest decrease of the AIC and BIC. Moreover, the economic magnitude of such parameters appears to be rather small. Thus, the polynomial form of the $\mathbb{Q}$-spot variance helps only marginally to improve the fitting of variance swap rates and quadratic variation.
We now turn to the bivariate extension of the quadratic model in Section 3.2. We estimate all the identifiable equivalence class combinations of $X_{1t}$ and $X_{2t}$, and find that the best fit, in terms of likelihood, AIC and BIC, is obtained when $X_{1t}$ is in Class 1 and $X_{2t}$ is in Class 3. Table 2 reports the parameter estimates, as well as AIC and BIC. All the parameters are estimated precisely, as can be seen from the small standard errors.
The log-likelihood of the bivariate model is significantly larger than the log-likelihood of univariate models and the values of the BIC and AIC are significantly lower. The LR statistic of the bivariate model versus the univariate quadratic model is 28,757. The Vuong (1989) statistic of the bivariate model versus the univariate polynomial model is 48.5. Both statistics are highly significant and strongly reject the null hypothesis that the bivariate quadratic model is equivalent to any of the other two univariate models. 23 Following Giacomini and White (2006), we also compare the bivariate model and the univariate models using scoring-type rules. The test statistic is the log-likelihood under the bivariate model minus the log-likelihood under the univariate quadratic or polynomial model. If the two models are equivalent, the test statistic has zero mean, which can be tested using a simple t-test. 24 The t-statistics are 10.1 and 9.8, respectively, and are both highly significant. These tests further support that the bivariate quadratic model fits variance swap rates and quadratic variation significantly better than the univariate models.
Fig. 2 shows the filtered trajectories of the state process $X_t$ in the bivariate model and suggests a natural interpretation of its components. $X_{1t}$ is more volatile and mimics the time series trajectories of short-term variance swap rates, mainly capturing sudden movements in those rates. $X_{2t}$ is more persistent and mainly captures long-term movements in variance swap rates.
Fig. 2. Time series evolution of state process. In the bivariate quadratic model in Section 3.2, $X_{1t}$ is in Class 1 and $X_{2t}$ is in Class 3; Theorem 3.3. The model is fitted to daily variance swap rates on the S&P 500, from January 4, 1996 to April 2, 2007, and terms of two, three, six, 12, 24 months, and daily quadratic variation computed from tick-by-tick S&P 500 futures data. Table 2 reports model estimates. The vertical line is April 3, 2007, i.e., beginning of the out-of-sample period.
21 When the state process $X_t$ is in Class 1, 2, and 3, AIC are -93,127, -94,163, and -94,264, and BIC are -93,066, -94,102, and -94,180, respectively. Both criteria achieve their minimum value when $X_t$ is in Class 3.
22 The relation between model parameters in Section 3.3 and those in Table 2 is straightforward, namely, $p_0 = \phi$, $p_1 = \psi$, and $p_2 = \pi$.
23 The asymptotic distribution of the test statistics under the null hypotheses is the chi-square with five degrees of freedom and standard normal, respectively. Recall that the bivariate quadratic model nests the univariate quadratic model. Setting $b_2 = \beta_{22} = a_2 = \alpha_2 = A_2 = 0$ in the bivariate model, i.e., imposing five parameter restrictions, implies that $X_{2t}$ is constant and can be normalized to one for identification purposes.
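To see how extreme an LR statistic of 28,757 is under a chi-square distribution with five degrees of freedom, one can evaluate the upper-tail probability directly. The sketch below uses only the standard library and a recursion for the regularized upper incomplete gamma function that holds for integer degrees of freedom. The helper name `chi2_sf` is illustrative, and the 5% critical value 11.0705 is the standard tabulated figure rather than a number from the paper.

```python
import math

def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(chi2_df > x) for integer df >= 1."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0.0:
        return 1.0
    y = 0.5 * x
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)             # Q(1, y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))  # Q(1/2, y)
    # Recursion Q(a + 1, y) = Q(a, y) + y^a e^{-y} / Gamma(a + 1)
    while a < 0.5 * df - 1e-9:
        q += math.exp(a * math.log(y) - y - math.lgamma(a + 1.0))
        a += 1.0
    return q

lr_statistic = 28757.0   # LR statistic quoted in the text
restrictions = 5         # number of parameter restrictions quoted in the text
p_value = chi2_sf(lr_statistic, restrictions)
print(p_value)                 # numerically indistinguishable from zero
print(chi2_sf(11.0705, 5))     # close to 0.05 at the tabulated 5% critical value
```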

Two-factor affine jump-diffusion model
We now introduce a two-factor affine jump-diffusion (AJD) model that provides a challenging benchmark to assess the accuracy of our quadratic models in subsequent goodness-of-fit tests. The $\mathbb{Q}$-dynamics (1) of the index are specified as in (25), where the diffusive spot variance and its long run mean follow
$$dX_{1t} = \beta_1 (X_{1t} - X_{2t})\,dt + \sqrt{\alpha_1 X_{1t}}\,dW_{1t} + Z_{1t}\,dN_t,$$
$$dX_{2t} = (b_2 + \beta_2 X_{2t})\,dt + \sqrt{\alpha_2 X_{2t}}\,dW_{2t},$$
for some standard Brownian motions $B_t$ and $(W_{1t}, W_{2t})$.
The first factor $X_{1t}$ is the diffusive component of the $\mathbb{Q}$-spot variance and follows a two-factor mean-reverting process in which the second factor $X_{2t}$ controls its stochastic long run mean and $\beta_1 < 0$ the speed of mean reversion. The second factor $X_{2t}$ follows its own stochastic mean-reverting process and mean-reverts to $-b_2/\beta_2 > 0$, with speed of mean reversion $\beta_2 < 0$. The relative index jump size $\xi_t > -1$ can be any integrable random variable.
Only the second moment of the log-jump size, $\mu_{2S}$, … Duffie, Pan, and Singleton (2000). 25 The intensity of the counting jump process $N_t$ is stochastic and given by $\nu_t = \nu_0 + \nu_1 X_{1t}$, where $\nu_0$ and $\nu_1$ are nonnegative constants. 26 As in our bivariate quadratic model (Section 3.2), we use the market price of risk specification in (21) and attach the risk premium $\lambda_1 \sqrt{X_{1t}}/\sqrt{\alpha_1}$ to the $\mathbb{Q}$-Brownian motion $W_{1t}$. Thus, the Girsanov-transformed $\mathbb{P}$-Brownian motion is $dW^{\mathbb{P}}_{1t} = dW_{1t} - \lambda_1 \sqrt{X_{1t}}/\sqrt{\alpha_1}\,dt$. As variance jumps are the main feature of AJD models to generate volatility of volatility, we also allow for a variance jump risk premium. Specifically, under $\mathbb{P}$ the variance jump size $Z_{1t}$ is exponentially distributed with parameter $\mu^{\mathbb{P}}_1$. The logarithm of the $\mathbb{P}$-spot variance is an affine function of $\log(v^{\mathbb{Q}}_t)$, where the $\mathbb{Q}$-spot variance $v^{\mathbb{Q}}_t = X_{1t} + \mu_{2S}(\nu_0 + \nu_1 X_{1t})$, which is an affine function of $X_{1t}$.
In our bivariate quadratic model, the factors have a quadratic, rather than affine, diffusion, and the $\mathbb{Q}$-spot variance $v^{\mathbb{Q}}_t$ is a quadratic function of $X_{1t}$, and does not exhibit jumps; see (20). The quadratic features of the factors and $\mathbb{Q}$-spot variance generate more volatility of volatility relative to diffusive affine specifications. Which modeling approach is more suitable for fitting variance swap rates and quadratic variation is an empirical question that we address below.
Table 3. Two-factor affine jump-diffusion model. Entries are parameter estimates (Est.) for the two-factor affine jump-diffusion model (Section 4.3), and corresponding standard errors (S.E.). The $\mathbb{Q}$-dynamics of the stock index are given in (25), where $r_t$ is the risk-free rate, the diffusive spot variance $X_{1t}$ evolves as $dX_{1t} = \beta_1 (X_{1t} - X_{2t})\,dt + \sqrt{\alpha_1 X_{1t}}\,dW_{1t} + Z_{1t}\,dN_t$, its stochastic long run mean is controlled by $X_{2t}$, which evolves as $dX_{2t} = (b_2 + \beta_2 X_{2t})\,dt + \sqrt{\alpha_2 X_{2t}}\,dW_{2t}$, for some standard Brownian motions $B_t$ and $(W_{1t}, W_{2t})$. The second moment of the log-price jump size is $\mu_{2S}$. The variance jump size $Z_{1t}$ is exponentially distributed with parameter $\mu^{\mathbb{Q}}_1$. Jump sizes $\xi_t$ and $Z_{1t}$ are independent from Brownian motions and jump times. Jumps in returns and variance occur contemporaneously and are triggered by $dN_t$. The intensity of $N_t$ is $\nu_t = \nu_0 + \nu_1 X_{1t}$, where $\nu_0$ and $\nu_1$ are nonnegative constants. The $\mathbb{Q}$-spot variance is $v^{\mathbb{Q}}_t = X_{1t} + \mu_{2S}(\nu_0 + \nu_1 X_{1t})$, which is an affine function of $X_{1t}$. The risk premium $\lambda_1$ is attached to the $\mathbb{Q}$-Brownian motion $W_{1t}$ and gives the $\mathbb{P}$-Brownian motion $dW^{\mathbb{P}}_{1t} = dW_{1t} - \lambda_1 \sqrt{X_{1t}}/\sqrt{\alpha_1}\,dt$. Under the objective measure $\mathbb{P}$, the variance jump size $Z_{1t}$ is exponentially distributed with parameter $\mu^{\mathbb{P}}_1$. The model is estimated using particle filter. The measurement equation is six-dimensional. The first five components are variance swap rates with variance of measurement error $\sigma^2_{VS}$. The sixth component is the logarithm of the daily quadratic variation, $\log(QV_t)$, with expectation $c_0 + c_1 \log(v^{\mathbb{Q}}_t)$, and conditionally normal measurement error $\epsilon_t$ with mean $\rho_\epsilon \epsilon_{t-1}$ and variance $c_2 + c_3\, QV_{t-1}$. AIC and BIC are Akaike and Bayesian Information Criteria, respectively. Sample data are the same as in Table 2.
24 … Vuong's tests. Given the autocorrelation and heteroskedasticity in the log-likelihood differences, robust standard errors are computed using the Newey and West (1987) variance estimator with the number of lags optimally chosen according to Andrews (1991).
25 Eraker, Johannes, and Polson (2003) fit models with contemporaneous and independent jumps in returns and variance to S&P 500 data. They find that the two models perform similarly, but the model with contemporaneous jumps is estimated more precisely. Eraker (2004), Broadie, Chernov, and Johannes (2007), Chernov, Gallant, Ghysels, and Tauchen (2003), and Todorov (2010) provide further evidence for contemporaneous jumps in returns and variance.
26 Rewriting the dynamics of $X_{1t}$ in terms of the compensated jump component, i.e., $Z_{1t}\,dN_t - \mu^{\mathbb{Q}}_1(\nu_0 + \nu_1 X_{1t})\,dt$, shows that the speed of mean reversion is $(-\beta_1 - \mu^{\mathbb{Q}}_1 \nu_1)$, and the stochastic long run mean is $(\mu^{\mathbb{Q}}_1 \nu_0 - \beta_1 X_{2t})/(-\beta_1 - \mu^{\mathbb{Q}}_1 \nu_1)$.
Model (25) is fitted to variance swap rates and quadratic variation, as all other models in Section 4.2. Given the presence of jumps in the spot variance, the model is estimated using the particle filter method in Bardgett, Gourier, and Leippold (2013). Table 3 reports the parameter estimates. The diffusive variance $X_{1t}$ is more volatile and faster mean-reverting than the second factor $X_{2t}$ that controls its long run mean. Jumps are estimated to be rare events, as one jump occurs on average once every 3.8 years ($= 1/(\nu_0 + \nu_1 E^{\mathbb{P}}[X_{1t}])$). 27 These findings are broadly consistent with estimates of similar models reported in the literature. Importantly, in terms of likelihood, AIC and BIC, our bivariate quadratic model outperforms the two-factor affine jump-diffusion model, which in turn outperforms the univariate models.
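The compensation argument in footnote 26 can be verified with a few lines of arithmetic. The sketch below adds the jump compensator to the drift of the first factor and rewrites the result as speed times (long run mean minus current level). The long run mean expression in the code is derived from that algebra, under the assumption that the exponential parameter equals the mean jump size, and all parameter values are hypothetical.

```python
def compensated_drift(x1, x2, beta1, mu_q, nu0, nu1):
    """Drift of X1 after adding the jump compensator mu_q * (nu0 + nu1 * x1)."""
    return beta1 * (x1 - x2) + mu_q * (nu0 + nu1 * x1)

def mean_reversion_form(x1, x2, beta1, mu_q, nu0, nu1):
    """Same drift written as speed * (long_run_mean - x1)."""
    speed = -beta1 - mu_q * nu1
    long_run_mean = (mu_q * nu0 - beta1 * x2) / speed
    return speed, long_run_mean, speed * (long_run_mean - x1)

# Hypothetical parameter values, chosen only so that the speed is positive
params = dict(beta1=-5.0, mu_q=0.02, nu0=0.1, nu1=10.0)
x1, x2 = 0.03, 0.04

speed, theta, drift = mean_reversion_form(x1, x2, **params)
direct = compensated_drift(x1, x2, **params)
print(speed, theta, drift, direct)
```

With these inputs the speed is 4.8 and both ways of writing the drift give the same value, which is the content of the footnote.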

Goodness-of-fit tests
To corroborate the above likelihood-based analysis, we now analyze the variance swap pricing errors for the various models and run various goodness-of-fit tests. Table 4 summarizes the pricing errors, which are defined as model-based minus actual variance swap rates, both in volatility units. Consistent with the likelihood-based analysis, the bivariate quadratic model nearly always significantly outperforms the other models in terms of bias and root mean square error (RMSE). For example, in the out-of-sample period, the RMSE of the bivariate quadratic model for the two-month variance swap rates is 60% lower than the RMSE of the univariate model. The comparison among the bivariate quadratic, univariate polynomial, and two-factor AJD models is particularly interesting, as the three models have the same number of parameters. In most cases, the RMSE of the bivariate model is less than half the RMSE of the polynomial model. In the in-sample period, the bivariate quadratic model largely outperforms the two-factor AJD model. In the out-of-sample period, the bivariate quadratic model tends to outperform the two-factor AJD model, which proves to be a challenging benchmark and dominates univariate models for most variance swap terms. Fig. 3 shows actual and model-based trajectories under the bivariate quadratic model of the two-month and two-year variance swap rates, which are respectively the most and least volatile rates. The good performance of the model is evident throughout the in-sample and out-of-sample periods. A small lack of fit of the highest values of the two-year variance swap rates in the out-of-sample period is noticeable and occurs during the market turmoil of fall 2008.
Table 4. Variance swap pricing errors. The pricing error is defined as the model-based minus observed variance swap rate, both in volatility percentage units, i.e., $(\sqrt{G(\tau, X_t)/\tau} - \sqrt{VS(t, t+\tau)}) \times 100$. Entries are mean (Bias) and root mean square error (RMSE) of pricing errors for variance swap rates under the univariate quadratic, univariate polynomial, two-factor affine jump-diffusion (AJD), and bivariate quadratic models. DMu (respectively, DMp and DMa) is the Diebold-Mariano test statistic of the univariate quadratic (respectively, polynomial and AJD) model versus the bivariate quadratic model, Section 4.4. Under the null hypothesis that the univariate quadratic (respectively, polynomial and AJD) model and the bivariate quadratic model have pricing errors of equal magnitude, the DM test statistic is a standard normal. A positive value means that the bivariate quadratic model outperforms the univariate model. Term $\tau$ is in months. Panel A shows pricing error statistics for the in-sample period, used to estimate the models, which is from January 4, 1996 to April 2, 2007, a total of 2,832 observations for each series. Panel B shows pricing error statistics for the out-of-sample period, which is from April 3, 2007 to June 7, 2010, a total of 794 observations for each series.
To assess the statistical differences of the model pricing errors, we run various Diebold-Mariano (DM) tests. 28 For each model and each term, the time-$t$ loss function is given by the absolute pricing error, $L(e_t) = |e_t|$, where $e_t = \sqrt{G(\tau, X_t)/\tau} - \sqrt{VS(t, t+\tau)}$. 29 Denote the time-$t$ loss differential between the univariate and bivariate quadratic models by $d_t = L(e^{u}_t) - L(e^{b}_t)$, where the superscripts $u$ and $b$ refer to the univariate and bivariate model, respectively. The DM statistic is the sample mean of $d_t$ divided by its standard error. 30 Table 4 reports the results. DM tests strongly confirm that the bivariate model largely outperforms the univariate quadratic and polynomial models. 31 In the in-sample period the bivariate quadratic model consistently outperforms the two-factor AJD model. In the out-of-sample period the bivariate model outperforms the two-factor AJD model in three out of five terms, and has statistically the same performance in the other two terms. As a robustness check, we also run DM tests using pricing errors in variance units, rather than volatility units, i.e., $e_t = G(\tau, X_t)/\tau - VS(t, t+\tau)$, and using quadratic loss functions, rather than absolute loss functions. These additional DM tests largely confirm the results in Table 4.
Fig. 3. Actual and model-based variance swap rates. Model-based variance swap rates are from the bivariate quadratic model in Section 3.2, with $X_{1t}$ in Class 1 and $X_{2t}$ in Class 3; Theorem 3.3. The model is fitted to daily variance swap rates on the S&P 500 with terms of two, three, six, 12, 24 months, and daily quadratic variation from tick-by-tick S&P 500 futures prices, from January 4, 1996 to April 2, 2007. Table 2 reports model estimates. Variance swap rates are in volatility percentage units, i.e., $\sqrt{VS(t,T)} \times 100$. Upper graph: variance swap rates with two-month term (shortest term in our sample). Lower graph: variance swap rates with two-year term (longest term in our sample). The vertical line is April 3, 2007, i.e., beginning of the out-of-sample period.
28 We follow standard practice and use Diebold-Mariano tests to draw conclusions about models, rather than about model forecasts; see Diebold (2012) for a discussion of this point.
29 The time-$t$ pricing error considered here uses the time-$t$ filtered value of $X_t$, not its prediction as in the log-likelihood function, which makes the DM tests complementary to the likelihood-based analysis in the previous section.
30 The standard error is computed using the Newey and West (1987) autocorrelation and heteroskedasticity consistent variance estimator with the number of lags optimally chosen according to Andrews (1991).
Finally, we run predictive regressions for each model and each term. We regress the actual future variance swap rate $VS(t, t+\tau)$ on a constant and the $d$-day ahead, model-based prediction, $E^{\mathbb{P}}[G(\tau, X_t)/\tau \mid \mathcal{F}_{t-d}]$, obtained at time $t-d$, i.e., $VS(t, t+\tau) = \gamma_0 + \gamma_1 E^{\mathbb{P}}[G(\tau, X_t)/\tau \mid \mathcal{F}_{t-d}] + \mathrm{error}_t$. If the model captures well the variance swap term structure dynamics, then it should provide unbiased, $\gamma_0 = 0$, and efficient, $\gamma_1 = 1$, forecasts of future variance swap rates. As an additional benchmark in the context of predictive regressions, we consider the martingale model that uses the actual variance swap rate at time $t-d$ as a predictor of the future variance swap rate. The martingale model is a challenging benchmark because of the strong persistence of variance swap rates; first order autocorrelations of variance swap rates range from 0.984 to 0.995, Section 4.1. We consider two forecasting horizons, $d = 1$ day and $d = 10$ days. Table 5 reports the regression results.
32 Notably, for both forecasting horizons and nearly all terms, the bivariate quadratic model provides unbiased and efficient variance swap rate forecasts, as can be seen from the high p-values of the null hypotheses $H_0: \gamma_0 = 0$ and $H_0: \gamma_1 = 1$. The univariate quadratic and polynomial models provide biased and inefficient forecasts in most cases, as can be seen from the low p-values. The martingale model provides relatively accurate forecasts at the one-day horizon for the most persistent, long-term variance swap rates, but its forecasting accuracy largely deteriorates when moving to the ten-day horizon. The two-factor AJD proves again to be a challenging model, passing most tests. However, only the bivariate quadratic model passes all tests at the 10% significance level. To summarize, predictive regressions also strongly confirm that the bivariate quadratic model captures well the variance swap term structure dynamics.
Table 5. Variance swap predictive regressions. For each model and term, entries report time series regressions of future actual variance swap rates on a constant and a $d$-day ahead model-based prediction, i.e., $VS(t, t+\tau) = \gamma_0 + \gamma_1 E^{\mathbb{P}}[G(\tau, X_t)/\tau \mid \mathcal{F}_{t-d}] + \mathrm{error}_t$, where $d$ is either one day (Panel A) or ten days (Panel B), and $E^{\mathbb{P}}[G(\tau, X_t)/\tau \mid \mathcal{F}_{t-d}]$ is the time $t-d$ model-based, conditional prediction of the $\tau$-variance swap rate observed at time $t$. Variance swap rates are in variance percentage units, i.e., $VS(t, t+\tau) \times 100$. For each term $\tau$, the first row reports estimates of $\gamma_0$ and $\gamma_1$, the second row reports the p-value of the null hypotheses $H_0: \gamma_0 = 0$ and $H_0: \gamma_1 = 1$, respectively. If model-based variance swap rate predictions are unbiased, then $\gamma_0 = 0$. If model-based variance swap rate predictions are efficient, then $\gamma_1 = 1$. Robust standard errors are computed using Newey and West (1987) covariance matrix estimator with the number of lags optimally chosen according to Andrews (1991). The martingale model is a benchmark model in which the future actual $VS(t, t+\tau)$ is predicted using the past actual $VS(t-d, t-d+\tau)$. Term $\tau$ is in months. The sample period is from January 4, 1996 to June 7, 2010, a total of 3,626 observations for each series. Columns: Martingale, Univ. quad., Univ. poly., AJD, Biv. quad.
32 Also in these regressions, robust standard errors are computed using the Newey and West (1987) covariance matrix estimator with the number of lags optimally chosen according to Andrews (1991).
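The Diebold-Mariano comparison used in this section can be sketched in a few lines: form the loss differential under absolute loss, then divide its sample mean by a Newey-West standard error. This is a simplified illustration, not the authors' code. The lag length is fixed by hand instead of being chosen as in Andrews (1991), and the pricing errors are simulated, hypothetical data.

```python
import math
import random

def newey_west_variance(d, lags):
    """Bartlett-kernel long-run variance estimate of the series d."""
    n = len(d)
    mean = sum(d) / n
    dev = [v - mean for v in d]
    lrv = sum(v * v for v in dev) / n
    for k in range(1, lags + 1):
        gamma_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / n
        lrv += 2.0 * (1.0 - k / (lags + 1.0)) * gamma_k
    return lrv

def diebold_mariano(errors_a, errors_b, lags):
    """DM statistic under absolute loss; positive values favour model b."""
    d = [abs(a) - abs(b) for a, b in zip(errors_a, errors_b)]
    n = len(d)
    return (sum(d) / n) / math.sqrt(newey_west_variance(d, lags) / n)

# Simulated pricing errors (hypothetical): model b has smaller errors than model a
rng = random.Random(0)
errors_univariate = [rng.gauss(0.0, 1.0) for _ in range(500)]
errors_bivariate = [rng.gauss(0.0, 0.5) for _ in range(500)]

dm_stat = diebold_mariano(errors_univariate, errors_bivariate, lags=5)
print(f"DM statistic: {dm_stat:.2f}")
```

Under the null of equal accuracy the statistic is compared with standard normal critical values, as described in the Table 4 caption. Swapping the two error series flips its sign.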

Optimal portfolios: theoretical setup
In this section, we study dynamic optimal investment in variance swaps, index option, stock index, and risk-free bond. Because the stock index can jump and variance swaps are only sensitive to the quadratic variation, variance swaps cannot be used to hedge index jumps. Options on the stock index, such as out-of-the-money put options, are typically used to hedge this jump risk. We therefore allow the investor to dynamically trade variance swaps, index option, stock index, and risk-free bond. As becomes clear later, all these securities are necessary to span the risk in the economy and achieve market completeness. While our primary interest is in variance swap investment, studying optimal portfolios in all these securities allows us to have a comprehensive view on the optimal portfolios in Egloff, Leippold, and Wu (2010) and Liu and Pan (2003). 33
We now formalize and solve the dynamic optimal portfolio problem. As at the beginning of Section 3, we consider a diffusion process $X_t$ in some state space $\mathcal{X} \subset \mathbb{R}^m$, solving the SDE (8), where $W_t$ is a standard $d$-dimensional $\mathbb{Q}$-Brownian motion. The $\mathbb{Q}$-spot variance, $v^{\mathbb{Q}}_t$, and variance swap rates, $VS(t, T)$, are given as functions of the state variable, $X_t$, by (11) and (12), respectively.

Investing in variance swaps
We first compute the return of an investment in variance swaps. Fix a term $\tau > 0$, and consider a $\tau$-variance swap issued at some inception date $t^*$. Denote its maturity $T^* = t^* + \tau$. The nominal spot value $\Gamma_t$ at date $t \in [t^*, T^*]$ of a one-dollar notional long position in this variance swap is given by … where the risk-free rate $r$ is constant for simplicity. In stochastic differential form, we obtain $d\Gamma_t = \Gamma_t r\,dt + dM_t$ with the $\mathbb{Q}$-martingale increment excess return … where $\nabla_x$ denotes the gradient. Now fix a date $t \in [t^*, T^*)$, and consider an investor with positive wealth $V_t$ who takes a position in this variance swap with relative notional exposure of $n_t$. The cost of entering such a position is $n_t V_t \Gamma_t$. The remainder of the wealth, $V_t - n_t V_t \Gamma_t$, is invested in the risk-free bond, making the investment self-financing. At a later instant $t + dt$, the wealth has grown to …
Consider now $\tau$-variance swaps that are issued at a sequence of inception dates $0 = t^*_0 < t^*_1 < \cdots$, with $t^*_{k+1} - t^*_k \le \tau$, for example, three-month variance swaps issued every month. At any date $t \in [t^*_k, t^*_{k+1})$ the investor takes a position in the respective on-the-run $\tau$-variance swap with maturity $T^*(t) = t^*_k + \tau$. In the limit case where a new $\tau$-variance swap is issued at any date $t$, we obtain a "sliding" variance swap investment, and we set $T^*(t) = t + \tau$. Iterating the above reasoning shows that the resulting wealth process $V_t$ evolves according to … where the excess return on the right hand side is a $\mathbb{Q}$-martingale increment.
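The self-financing bookkeeping described above can be checked numerically over one short period. The sketch below values the swap position and the bond position separately and compares the total with the compact form V (1 + r dt + n dM), which follows from the stated dynamics of the spot value. All inputs are hypothetical, and the bond is grown to first order in dt, which is an assumption of this sketch.

```python
def wealth_after_step(V, n, Gamma, r, dt, dM):
    """Wealth at t + dt for a swap position financed with the risk-free bond."""
    dGamma = Gamma * r * dt + dM                        # dGamma = Gamma r dt + dM
    swap_value = n * V * (Gamma + dGamma)               # swap position marked to market
    bond_value = (V - n * V * Gamma) * (1.0 + r * dt)   # remaining wealth, first order in dt
    return swap_value + bond_value

# Hypothetical inputs: wealth, relative notional, spot value, rate, step, martingale shock
V, n, Gamma, r, dt, dM = 100.0, 0.5, 0.02, 0.03, 1.0 / 252.0, -0.001

direct = wealth_after_step(V, n, Gamma, r, dt, dM)
compact = V * (1.0 + r * dt + n * dM)
print(direct, compact)
```

The spot value cancels out of the total, so the excess return over the risk-free rate is the notional exposure times the martingale increment.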

Optimal portfolio problem
We now consider an investment universe consisting of stock index $S$, risk-free bond, index option $O$, and $n$ on-the-run variance swaps with different terms $\tau_1 < \cdots < \tau_n$ and respective issuance dates encoded by $n$ maturity functions $T^*_1(t), \ldots, T^*_n(t)$, as defined above. The $\mathbb{Q}$-dynamics (1) of the index are specified as in (31), where $\sigma(X_t)^2$ is the diffusive component of the $\mathbb{Q}$-spot variance, …, and $R: \mathcal{X} \to \mathbb{R}^d$ is some function with constant norm $\|R\| \equiv 1$, modeling the correlation between stock returns and diffusive variance changes. 34 The last term in (31) is a $\mathbb{Q}$-compensated jump component. The random arrival of jump events is induced by the counting process $N_t$, which has a stochastic intensity $\nu^{\mathbb{Q}}(X_t)$. Following Liu and Pan (2003), we adopt a deterministic relative index jump size $\xi > -1$. That is, conditional upon a jump arrival, the stock index jumps from $S_{t-}$ to $S_{t-}(1 + \xi)$. This specification of a deterministic jump size simplifies our analysis in the sense that only one index option is needed to complete the market with respect to the jump component. This formulation, though simple, is capable of capturing the sudden and high-impact nature of index jumps that cannot be produced by diffusions. More generally, one could introduce random jump size with multiple values and use multiple options to complete the market. The index option is left unspecified at this stage, but to fix ideas one can think of it as an out-of-the-money put option on the stock index, as will be the case in our empirical analysis of optimal portfolios.
Let $O_t = O(S_t, X_t)$ be the time-$t$ value of the option. The $\mathbb{Q}$-dynamics of $O_t$ are given in (32), where $\partial_s O_t$ and $\nabla_x O_t$ measure the sensitivity of the option price to infinitesimal changes in the stock index and state variables, respectively, and $\Delta O_t$ measures the change in the option price when the underlying stock index jumps, $\Delta O_t = O((1 + \xi) S_{t-}, X_t) - O(S_{t-}, X_t)$. When the option has nonzero sensitivities $\partial_s O_t$, $\nabla_x O_t$, and $\Delta O_t$, it provides exposure to the fundamental sources of risk, $W_t$ and $N_t$, and access to their risk premiums. Let $w_t$ denote the fraction of wealth invested in the stock index, $\phi_t$ the fraction of wealth invested in the option, and $n_t = (n_{1t}, \ldots, n_{nt})^\top$ the vector of relative notional exposures to each on-the-run $\tau_i$-variance swap, $i = 1, \ldots, n$. To make the investment self-financing, the fraction of wealth invested in the risk-free bond is given by $1 - n_t^\top \Gamma_t - w_t - \phi_t$, where $\Gamma_t$ is the vector of the variance swap spot values. Combining (30)-(32), the resulting wealth process $V_t$ has $\mathbb{Q}$-dynamics … where $\theta^W_t$ and $\theta^N_t$ are defined, for given portfolio weights $n_t$, $w_t$, and $\phi_t$, by … the $(d+1) \times (n+2)$ matrix $G_t$ is defined by … (36) and $D_t$ is the $m \times n$ matrix whose $i$th column is given by $(e^{-r(T^*_i(t) - t)}/\tau_i)\,\nabla_x G(T^*_i(t) - t, X_t)$. Effectively, by taking positions $n_t$, $w_t$, and $\phi_t$ on the risky assets, the investor invests $\theta^W_t$ in the diffusive shocks $W_t$, and $\theta^N_t$ in the jump risk $N_t$, controlling the portfolio exposure to the fundamental risks.
33 Egloff, Leippold, and Wu (2010) study optimal investment in variance swaps, stock index, and bond in a two-factor affine diffusion setting. Liu and Pan (2003) study optimal investment in index option, stock index, and bond in a one-factor affine jump-diffusion setting. We study optimal investment in variance swaps, index option, stock index, and bond in a multi-factor quadratic jump-diffusion setting.
34 The index price dynamics in (31) are equivalent to …
We now formulate the optimal portfolio problem. We consider an investor who has a fixed finite time horizon $T$, maximizes his expected utility from terminal wealth, and has a power utility function with constant relative risk aversion $\eta$. That is, the investment objective is $\max_{\{n_t, w_t, \phi_t,\ 0 \le t \le T\}} E^{\mathbb{P}}[V_T^{1-\eta}/(1-\eta)]$ (37) for some given initial wealth $V_0$. The objective probability measure $\mathbb{P}$ is related to the risk neutral measure $\mathbb{Q}$ via the pricing kernel $\pi_t$ in (38), where $\nu^{\mathbb{P}}(X_t)$ is the jump intensity of $N_t$ under $\mathbb{P}$, and $dW^{\mathbb{P}}_t = dW_t - \Lambda(X_t)\,dt$ is a $\mathbb{P}$-Brownian motion. The pricing kernel $\pi_t$ sets the risk premiums in the economy, with $\Lambda(X_t)$ and $\nu^{\mathbb{Q}}(X_t)/\nu^{\mathbb{P}}(X_t)$ controlling the premium for diffusive and jump risks, respectively. As usual in the optimal portfolio literature, we exogenously specify the risk premiums in (38), and our analysis of optimal portfolios is of a partial equilibrium nature. That is, the investor solving (37) takes the risk premiums as given. As pointed out by Liu and Pan (2003, p. 403), "this is very much the spirit of the asset allocation problem: a small investor takes prices (both risks and returns) as given and finds for himself the optimal trading strategy." Chacko and Viceira (2005), Liu (2007), Aït-Sahalia, Cacho-Diaz, and Hurd (2009), Detemple and Rindisbacher (2010), among others, provide studies of optimal portfolios in partial equilibrium settings.
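As a small illustration of the preference specification, the sketch below evaluates a power utility function and confirms numerically that its relative risk aversion, minus wealth times the second derivative over the first derivative, equals the parameter eta. The normalization by 1 - eta is the textbook convention and is assumed here. The value eta = 5 matches the risk aversion quoted in the caption of Fig. 4, and the wealth levels are hypothetical.

```python
import math

def power_utility(wealth: float, eta: float) -> float:
    """CRRA utility: wealth^(1 - eta) / (1 - eta), or log(wealth) when eta = 1."""
    if wealth <= 0.0:
        raise ValueError("wealth must be positive")
    if eta == 1.0:
        return math.log(wealth)
    return wealth ** (1.0 - eta) / (1.0 - eta)

def relative_risk_aversion(wealth: float, eta: float, h: float = 1e-4) -> float:
    """Numerical -V u''(V) / u'(V) using central finite differences."""
    up = power_utility(wealth + h, eta)
    mid = power_utility(wealth, eta)
    down = power_utility(wealth - h, eta)
    first = (up - down) / (2.0 * h)
    second = (up - 2.0 * mid + down) / (h * h)
    return -wealth * second / first

eta = 5.0
for wealth in (1.0, 2.0):
    print(wealth, relative_risk_aversion(wealth, eta))
```

The measured risk aversion does not depend on the wealth level, which is what "constant relative risk aversion" means.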
Choosing the number $n$ of on-the-run variance swaps available in the market according to the number $d$ of driving Brownian motions makes it possible to achieve market completeness. Market completeness in turn allows us to solve the optimal portfolio problem analytically.
Assumption 5.1. The market is complete with respect to the stock index, the index option, and the $n$ on-the-run $\tau_i$-variance swaps. Specifically, we assume that $n = m = d - 1$, and that the $(d+1) \times (d+1)$ matrix $G_t$ is invertible $dt \otimes d\mathbb{Q}$-a.s.

From (36) we see that $G_t$ is invertible $dt \otimes d\mathbb{Q}$-a.s. if and only if the $d \times d$ matrix $\left(\Sigma(X_t)^\top, \sigma(X_t) R(X_t)\right)$ and the $(d-1) \times (d-1)$ matrix $D_t$ are invertible $dt \otimes d\mathbb{Q}$-a.s. and condition (39) holds. 35 The matrix $D_t$ is invertible $dt \otimes d\mathbb{Q}$-a.s. only if the maturity date functions $T^*_i(t)$ are mutually different for all $t$. This means that none of the $n = d - 1$ on-the-run $\tau_i$-variance swaps is redundant. Condition (39) states that the option price has to exhibit different sensitivities with respect to large and small index price changes. For convex option prices, $\Delta O_t > \partial_s O_{t-}\, \xi S_{t-}$, and (39) holds. Because of market completeness, the control variables $n_t$, $w_t$, $\phi_t$ in the optimal portfolio problem (37) can be replaced by $\theta^W_t$, $\theta^N_t$. The solution of (37) then consists of two logical steps. First, find the optimal exposures $\theta^{W*}_t$ and $\theta^{N*}_t$ to the fundamental risk factors $W_t$ and $N_t$ to support [...]

35 Here we use the fact that $\Delta O_t / (\xi S_t) = \Delta O_t / (\xi S_{t-})$ and $\partial_s O_t = \partial_s O_{t-}$ $dt \otimes d\mathbb{Q}$-a.s.

Fig. 4. Optimal portfolio. Wealth is optimally invested in three-month and two-year variance swaps, index put option, stock index, and risk-free bond. Variance swaps are rolled over monthly and yearly, respectively. Optimal portfolio is rebalanced daily. Risk aversion is $\eta = 5$. $n_{1t}$ is the optimal fraction of wealth invested in the three-month variance swap, and $n_{2t}$ in the two-year variance swap (upper graph); $w_t$ in the stock index (middle graph); $\phi_t$ in the index put option (lower graph).

[...] which satisfies the Hamilton-Jacobi-Bellman (HJB) [...] and the function $h(\tau, x)$ is defined in the online appendix. The optimal exposure $\theta^{W*}_t$ to the diffusive risk is composed of the familiar myopic and intertemporal hedging terms, as discussed in Merton (1971).
The myopic demand, coming from $\Lambda(X_t)/\eta$, would be the mean-variance optimal investment over the next instant not accounting for future investments, or assuming a constant investment opportunity set. The intertemporal hedging demand, coming from $\Sigma(X_t)^\top \nabla_x h(T - t, X_t)$, arises due to the need to hedge against fluctuations in the investment opportunities. These fluctuations are induced, inter alia, by the stochastic diffusive component of the volatility of the stock index. We discuss the computation of $\nabla_x h(T - t, X_t)$ in the online appendix. The optimal exposure $\theta^{N*}_t$ to the jumps only consists of a myopic term.

Fig. 5. Optimal portfolio for log-investor. Wealth is optimally invested in three-month and two-year variance swaps, index put option, stock index, and risk-free bond. Variance swaps are rolled over monthly and yearly, respectively. Optimal portfolio is rebalanced daily. Risk aversion is $\eta = 1$. $n_{1t}$ is the optimal fraction of wealth invested in the three-month variance swap, and $n_{2t}$ in the two-year variance swap (upper graph); $w_t$ in the stock index (middle graph); $\phi_t$ in the index put option (lower graph). The vertical line is April 3, 2007, i.e., beginning of the out-of-sample period.

Please cite this article as: Filipović, D., et al., Quadratic variance swap models. Journal of Financial Economics (2015), http://dx.doi.org/10.1016/j.jfineco.2015.08.015
The following corollary shows that variance swaps can be used to span diffusive volatility risk. The optimal investments in the stock index and index option are thus only seeking the diffusive and jump risk premiums.

Fig. 7. Wealth process for log-investor. Wealth is optimally invested in three-month and two-year variance swaps, index put option, stock index, and risk-free bond. Variance swaps are rolled over monthly and yearly, respectively. Optimal portfolio is rebalanced daily. Proxy portfolio is rebalanced less frequently: three-month variance swap, index put option, and stock index positions are rebalanced monthly, two-year variance swap position is rebalanced yearly. Risk aversion is $\eta = 1$. S&P 500 is normalized to 100. The vertical line is April 3, 2007, i.e., beginning of the out-of-sample period.
investor takes a long position in the stock index, earning the equity risk premium, and a long position in the put option, hedging the jump risk. The optimal long position in the put option $\phi^*_t$ is increasing in the absolute relative index jump size, $-\xi$, and is decreasing in the relative option price change upon a jump, $\Delta O_t / O_{t-}$. If the latter is small, a large fraction of wealth needs to be allocated to the put option to hedge the jump risk.
If the jump risk is priced, $\nu^Q(X_t)/\nu^P(X_t) > 1$, then $\theta^{N*}_t \xi < 0$, and the optimal wealth process exhibits negative jumps. The investor optimally takes on jump risk to earn its risk premium. Because $\theta^{N*}_t \xi > -1$, the optimal wealth level is always one jump away from being negative. Suppose again that $\xi < 0$, and the option is a put, $\Delta O_t > 0$. The optimal investment in the put option is [...]. As can be seen in (42), $\theta^{N*}_t$ is increasing in the jump risk premium $\nu^Q(X_t)/\nu^P(X_t)$ and in the risk tolerance, $1/\eta$. If $w^*_t > 0$ and the jump risk premium and/or risk tolerance are low, then $w^*_t - \theta^{N*}_t > 0$, and the investor still takes a long position in the put option to hedge index jump risk. If instead the jump risk premium and/or risk tolerance are high, then $w^*_t - \theta^{N*}_t < 0$, and the investor optimally takes a short position in the put option to earn the jump risk premium.
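The comparative statics above can be illustrated numerically. The sketch below assumes the textbook first-order condition for a power-utility investor facing a deterministic relative jump size, $1 + \theta^N \xi = (\nu^Q/\nu^P)^{-1/\eta}$. This closed form is an assumption made for illustration and is not taken from (42); the function name `jump_exposure` is hypothetical.

```python
def jump_exposure(premium_ratio, eta, xi):
    """Hypothetical jump exposure theta^N under the ASSUMED condition
    1 + theta * xi = premium_ratio ** (-1 / eta),
    where premium_ratio stands for nu^Q / nu^P."""
    if premium_ratio <= 0 or eta <= 0 or xi == 0:
        raise ValueError("premium_ratio and eta must be positive, xi nonzero")
    return (premium_ratio ** (-1.0 / eta) - 1.0) / xi


xi = -0.25            # index jump size used in the empirical analysis
premium_ratio = 1.4   # nu^Q / nu^P = 0.7 / 0.5

for eta in (5.0, 1.0):
    theta = jump_exposure(premium_ratio, eta, xi)
    # theta * xi is the implied relative wealth jump; it lies in (-1, 0)
    print(f"eta={eta}: theta^N={theta:.4f}, theta^N*xi={theta * xi:.4f}")
```

Under this assumed condition, the exposure vanishes when the jump risk is not priced (ratio equal to one) and grows with both the jump risk premium and the risk tolerance $1/\eta$, in line with the discussion above.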

Bivariate quadratic model specification
We now return to the bivariate quadratic variance swap model of Section 3.2. Our empirical analysis in Section 4 shows that the best fit is attained when $X_{1t}$ is in Class 1 and $X_{2t}$ is in Class 3. We focus on this specification in the following. The dimension of the Brownian motion $W_t$ is $d = 3$, and the $2 \times 3$ dispersion matrix $\Sigma(x)$ takes the form [...]. We specify the $\mathbb{P}$-jump intensity as $\nu^P(x) = \nu^P \sigma(x)^2$, where $\nu^P$ is a positive constant. This specification allows for more jumps to occur during more volatile periods, as shown in the empirical literature. Similarly, the $\mathbb{Q}$-jump intensity is set equal to $\nu^Q(x) = \nu^Q \sigma(x)^2$. The diffusive spot variance is thus proportional to the $\mathbb{Q}$-spot variance function, [...]. In our empirical analysis of optimal strategies, we set the jump intensities $\nu^P = 0.5$, $\nu^Q = 0.7$, and the index jump size $\xi = -25\%$, similarly to [...]. These parameters imply that one large index jump occurs on average once every 50 years. In Section 6.2 we experiment with other jump configurations, namely, smaller and more frequent index jumps. Our conclusions on the empirical features of optimal strategies remain largely unchanged.
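The jump frequency quoted above can be checked with a short calculation. The sketch below treats the diffusive variance as constant at its average level (a simplification) and backs out the average variance implied by $\nu^P = 0.5$ and one jump every 50 years; the function names are hypothetical.

```python
import math


def mean_years_between_jumps(nu, avg_variance):
    """Expected waiting time in years when the jump intensity is
    nu * sigma^2 and sigma^2 is held at avg_variance (simplification)."""
    return 1.0 / (nu * avg_variance)


def implied_avg_variance(nu, years_between_jumps):
    """Average diffusive variance consistent with a given jump frequency."""
    return 1.0 / (nu * years_between_jumps)


nu_p, nu_q = 0.5, 0.7
avg_var = implied_avg_variance(nu_p, 50.0)
print("implied average variance:", avg_var)
print("implied average volatility:", math.sqrt(avg_var))
print("years between jumps under P:", mean_years_between_jumps(nu_p, avg_var))
print("years between jumps under Q:", mean_years_between_jumps(nu_q, avg_var))
```

With these inputs the implied average diffusive volatility is about 20% per year, and jumps arrive more often under $\mathbb{Q}$ than under $\mathbb{P}$ at the same variance level, which is the source of the jump risk premium.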
To account for the widely documented correlation between index returns and diffusive variance changes, e.g., Broadie, Chernov, and Johannes (2007) and Aït-Sahalia and [...], the correlation vector function is chosen to be of the form $R(x) = \left(R_1(x), 0, \sqrt{1 - R_1(x)^2}\right)^\top$. The correlation between index returns and diffusive variance changes is then given by [...]. Consistently with (21), we specify the market price of diffusive risk function as [...], where $\Lambda_3(x)$ is implicitly defined, up to its sign, by (49). The sign of $R_3(x)\Lambda_3(x)$ has a direct impact on the equity risk premium, which is given by (50). Based on our estimates, $R_3(X_t)\Lambda_3(X_t)$ is larger in absolute value than $R_1(X_t)\Lambda_1(X_t)$. Since $R_3(x)$ is positive, a negative $\Lambda_3(x)$ would lead to a negative equity risk premium, which would be economically odd, so we take the positive square root in (49). Clearly, $\|\Lambda(x)\|^2$ needs to be specified so that the argument in the square root in (49) is nonnegative for all $x \in \mathcal{X}$. We specify it as proportional to the $\mathbb{Q}$-spot variance,
$$\|\Lambda(x)\|^2 = \kappa\, g(x), \qquad (51)$$
with $\kappa \ge \kappa^* = \max_{x \in \mathcal{X}} \Lambda_1(x)^2 / g(x)$. Since $\Lambda_1(x)$ is uniformly bounded in $x$, it follows that the $\mathbb{Q}$-spot variance $g(x)$ and the equity risk premium (50) are increasing functions in $x_1$, for $x_1$ large enough. This means that the equity risk premium increases in bad times, i.e., when variance increases and the stock index falls due to the leverage effect. Such a countercyclical equity risk premium is certainly a desirable feature of our model and motivates the chosen specification (51) of $\|\Lambda(x)\|^2$. We set $\kappa$ in (51) to achieve a sample average of the equity risk premium (50) equal to 6%. 36

The stock index (31) exhibits quadratic stochastic variance and jumps, which is outside the standard affine setting. To study the empirical features of the optimal trading strategy we develop a novel pricing formula for European options.
The transition density of the index $S_t$ is approximated using an Edgeworth expansion, relying on closed form expressions for joint conditional moments of $S_t$ and state variables $X_t$. In the online appendix, we derive the option pricing formula and discuss the computation of the option price sensitivities $\partial_s O_t$, $\nabla_x O_t$, and $\Delta O_t$ in (33).

36 The equity risk premium is notoriously difficult to estimate. Merton (1980) even argues that a positive risk premium should be explicitly modeled, and various studies have followed this approach, e.g., Jackwerth (2000) and Barone-Adesi, Engle, and Mancini (2008).

Optimal portfolios in the bivariate quadratic model
We assume that an index put option and $n = 2$ variance swaps are available for investment. The latter are specified by their maturity date functions $T^*_1(t)$ and $T^*_2(t)$. We allow for various roll-over strategies. In all cases the maturity date functions differ, $T^*_1(t) \neq T^*_2(t)$, for all $t$, which is important in view of Assumption 5.1. It is a tedious but routine exercise to check that all assumptions underpinning Theorem 5.2 are satisfied. We sketch the arguments in the online appendix.
The optimal fractions of wealth invested in the stock index and index option are given by (52), which is recovered by setting $v = (0, 0, 1)^\top$ in the proof of Corollary 5.3 in the online appendix.
For a put option, the ratio satisfies $0 \le \partial_s O_t\, \xi S_{t-} / \Delta O_t < 1$, due to monotonicity and convexity. Thus, the denominators are positive. If the jump risk premium is small, then $\theta^{N*}_t$ is small, and the numerators are also positive. In that case, the investor optimally takes long positions in the stock index and put option, earning the equity risk premium and hedging the jump risk, respectively. If the jump risk premium is large, then $\theta^{N*}_t$ is large, and the investor can optimally short the put option, earning the jump risk premium, and hedging the short put with a short position in the stock index.
The intertemporal hedging demand is fully borne by the optimal investment in variance swaps. Plugging (52) in (35) shows that the optimal vector of relative notional exposures to the respective on-the-run variance swaps is given as the solution $n^*_t = n_t$ of the linear equation (53). We provide a closed form approximation of $\nabla_x h(T - t, X_t)$ in the online appendix.
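With $n = m = 2$, the linear equation for the variance swap notionals is a two-by-two system. The sketch below solves a generic system $D\,n = b$ by Cramer's rule; the form $D\,n = b$, the matrix entries, and the right-hand sides are all hypothetical placeholders rather than model output. The second example shows how a nearly singular matrix turns a small change in the right-hand side into a large change in the solution.

```python
def solve_2x2(d, b):
    """Solve d @ n = b for a 2x2 matrix d (tuple of rows) by Cramer's rule."""
    (a11, a12), (a21, a22) = d
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("matrix is singular")
    n1 = (b[0] * a22 - a12 * b[1]) / det
    n2 = (a11 * b[1] - a21 * b[0]) / det
    return n1, n2


# Well-conditioned hypothetical system
print(solve_2x2(((2.0, 1.0), (1.0, 3.0)), (3.0, 5.0)))

# Nearly singular hypothetical system: columns are almost collinear
near_singular = ((1.0, 1.0), (1.0, 1.001))
print(solve_2x2(near_singular, (1.0, 1.0)))
print(solve_2x2(near_singular, (1.0, 1.01)))  # 1% change in b, large change in n
```

Cramer's rule is used only to keep the example dependency-free; for larger systems a standard linear solver would be the natural choice.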

Optimal portfolios: empirical findings
We now perform an empirical analysis of optimal portfolios using the above bivariate quadratic model. The investment universe consists of the stock index, risk-free bond, out-of-the-money put option with strike price $0.95\,S_t$, and three-month and two-year variance swaps, rolled over monthly and yearly, respectively. The initial wealth is normalized to 100. The risk-free rate is set to 2%. The investment horizon is $T = 14.4$ years, which is the time span of our sample. The risk aversion is set to $\eta = 5$, which is an average value in survey data. 37 We also consider the risk aversion $\eta = 1$, which corresponds to logarithmic utility. An investor with $\eta = 1$ is significantly less risk averse than an investor with $\eta = 5$.
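To make the difference between the two risk aversion levels concrete, the following sketch computes the certainty equivalent of a simple gamble under power utility, with the logarithmic case at $\eta = 1$. The two-outcome gamble around the initial wealth of 100 is a hypothetical example, not data from the paper, and the function names are illustrative.

```python
import math


def crra_utility(wealth, eta):
    """Power utility with relative risk aversion eta; log utility at eta = 1."""
    if wealth <= 0:
        raise ValueError("wealth must be positive")
    if eta == 1.0:
        return math.log(wealth)
    return wealth ** (1.0 - eta) / (1.0 - eta)


def certainty_equivalent(outcomes, probs, eta):
    """Sure wealth level giving the same expected utility as the gamble."""
    expected_u = sum(p * crra_utility(v, eta) for v, p in zip(outcomes, probs))
    if eta == 1.0:
        return math.exp(expected_u)
    return ((1.0 - eta) * expected_u) ** (1.0 / (1.0 - eta))


# Hypothetical gamble: terminal wealth 80 or 130 with equal probability
outcomes, probs = [80.0, 130.0], [0.5, 0.5]
for eta in (1.0, 5.0):
    print(f"eta={eta}: certainty equivalent = "
          f"{certainty_equivalent(outcomes, probs, eta):.2f}")
```

The investor with $\eta = 5$ assigns the gamble a lower certainty equivalent than the log-investor, and both value it below its expected payoff of 105.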
Optimal portfolios are rebalanced daily. That is, each day optimal portfolio weights are adjusted according to (52) and (53). We also consider proxy portfolios with lower rebalancing frequencies. Section 6.2 discusses several robustness checks that largely confirm our results.

Optimal and proxy portfolios
Figs. 4 and 5 display the optimal portfolio weights in on-the-run three-month and two-year variance swaps, stock index, and put option, for $\eta = 5$ and $\eta = 1$, respectively. The optimal weights in variance swaps induce a short-long strategy, with a short position in the (long-term) two-year variance swap, and a long position in the (short-term) three-month variance swap. As the negative variance risk premium for two-year variance swaps is larger in absolute value than the risk premium for three-month variance swaps (Section 4.1), going short in two-year variance swaps allows the investor to reap a larger risk premium. These short positions are partially hedged via long positions in three-month variance swaps, limiting portfolio losses when volatility increases. The three-month variance swap is more sensitive to volatility increases than the two-year variance swap, and it is thus a more effective hedging instrument.
The optimal weights in variance swaps exhibit significant periodic patterns, with increasing portfolio weights in absolute value when their maturities are approaching. Intuitively, close to maturity, most realized variance has been accumulated, inducing little volatility in spot value and thus reducing the risk premium carried by the variance swap. To keep an optimal level of portfolio risk exposure and earn risk premiums, the optimal weights in variance swaps need to increase in absolute value.
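The periodic pattern can be illustrated with a stylized calculation. The sketch below assumes a flat expected variance level, so that the sensitivity of a unit-notional on-the-run swap to that level is $e^{-r(T-t)}(T-t)/\tau$; this flat-level proxy stands in for the model's state-dependent sensitivity and is an assumption, as are the function names and the target exposure of $-1$.

```python
import math


def swap_sensitivity(time_to_maturity, term, rate=0.02):
    """Sensitivity of a unit-notional on-the-run variance swap's spot value
    to a shift in a flat expected variance level (stylized assumption):
    exp(-r * (T - t)) * (T - t) / tau."""
    if time_to_maturity <= 0 or term <= 0:
        raise ValueError("time_to_maturity and term must be positive")
    return math.exp(-rate * time_to_maturity) * time_to_maturity / term


def notional_for_exposure(target_exposure, time_to_maturity, term, rate=0.02):
    """Notional needed to keep a fixed exposure to the variance level."""
    return target_exposure / swap_sensitivity(time_to_maturity, term, rate)


# Two-year swap rolled over yearly: remaining maturity runs from 2 years to 1
for remaining in (2.0, 1.5, 1.0):
    n = notional_for_exposure(-1.0, remaining, term=2.0)
    print(f"remaining={remaining:.1f}y: notional={n:.3f}")
```

Under this stylized assumption, holding the exposure fixed requires the notional to grow in absolute value as the remaining maturity shrinks, and it resets when the position is rolled into a newly issued swap.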
The optimal weight in the stock index (52) is positive, which is consistent with the positive equity risk premium to be earned. In contrast to the weights in variance swaps, the stock index weight does not exhibit any periodic pattern. The optimal weights in the stock index and the three-month variance swap are significantly larger for $\eta = 1$ than for $\eta = 5$.
The log-investor substantially increases the wealth allocation to the stock index. When the stock index falls and volatility increases, the large positions in three-month variance swaps effectively prevent large drops of the portfolio value.
The optimal weight in the out-of-the-money put option is positive, very small, and around 0.2%, for the investor with $\eta = 5$. In their calibration exercise, [...] report similar portfolio weights for out-of-the-money put options. As the jump risk premium is small ($\nu^Q/\nu^P = 1.4$) and the index jump size is large ($\xi = -25\%$), the investor optimally uses the put option to hedge index jumps, rather than to earn the jump risk premium. During low volatility periods, index jumps are less likely to occur, and the investor optimally reduces the put option weight essentially to zero. As noted above, the log-investor takes larger positions in the stock index than the more risk averse investor with $\eta = 5$. The log-investor's portfolio is therefore significantly exposed to index jumps, and the optimal portfolio weight in the put option increases to around 2%, during relatively volatile periods.

37 Meyer and Meyer (2005) [...]
Some oscillations in portfolio weights are observed during the low volatility period 2005-2006. Because volatility reaches historically low values, variance swap rates are also low. This renders the matrix $D_t$ in (36) close to singular. However, low volatility also implies small changes in variance swap values. This in turn annihilates the impact of oscillating portfolio weights on the wealth process, resulting in non-oscillating wealth trajectories, as shown below in Figs. 6 and 7.

Fig. 6 shows the wealth trajectory of the optimal portfolio for an investor with risk aversion $\eta = 5$. The wealth trajectory exhibits low volatility and steady growth. Thus, optimally investing in variance swaps, put option, and stock index allows for a smooth wealth growth, which is far less sensitive to market falls than investing in the stock index only. The corresponding Sharpe ratio is 1.45%, which is larger than the Sharpe ratio of 1.20% of the S&P 500. The S&P 500 yields a higher terminal wealth than the optimal portfolio. This can occur because the optimal portfolio is not designed to maximize terminal wealth. Compared to the stock index, the optimal portfolio can exhibit lower returns on some occasions but it always has lower volatility. Optimally including variance swaps and put options in the portfolio of a risk averse investor brings more utility than investing in the stock index only because the risk averse investor dislikes large wealth fluctuations.

Fig. 7 shows the wealth trajectory of the optimal portfolios for an investor with risk aversion $\eta = 1$. The log-optimal wealth process has a Sharpe ratio of 1.56%, and exhibits significantly larger fluctuations than the S&P 500. This is in sharp contrast with the optimal wealth trajectory of the more risk averse investor with $\eta = 5$. It appears that variance swaps can be used either to seek additional risk premiums or achieve stable wealth growth, depending on the risk profile of the investor.
In separate work, we consider a special case of the current optimal portfolio problem. We study optimal investment in variance swaps, stock index, and bond, when the price process of the stock index is continuous and the investor has no access to index options, in the bivariate quadratic setting (Section 5.3). Remarkably, power utility investors follow very similar optimal trading strategies, taking short-long positions in variance swaps, and long positions in the stock index. Even though the settings are different in terms of index dynamics and investment universe, there is a striking similarity of the optimal weights in variance swaps. This suggests that short-long positions in variance swaps are a robust feature of the optimal trading strategy. Furthermore, optimal wealth trajectories for $\eta = 5$ and $\eta = 1$ share very similar patterns with the wealth trajectories in Figs. 6 and 7. This lends further empirical support to our finding that variance swaps can be used either to seek additional risk premiums or achieve stable wealth growth.
We now study the performance of proxy portfolios when the number of contracts in the portfolio is rebalanced at lower frequencies than daily. Specifically, the stock index, put option, and three-month variance swap positions are rebalanced monthly, and the two-year variance swap position is rebalanced yearly. Between rebalancing dates, positions are kept constant. At rebalancing dates $t^*_{ik}$, $i = 1, 2$, variance swap investments are rolled over to newly issued three-month and two-year variance swaps, respectively, according to the portfolio weights $n_{i t^*_{ik}}$ given as an exponentially weighted average of past optimal portfolio weights, [...] where $\omega_{it} = e^{-(t^*_{ik} - t)}$. 38 These portfolio weights attempt to capture the periodic pattern of the optimal weights over the lifetime of the variance swaps. The reason for assessing the performance of proxy portfolios is that low rebalancing frequencies reduce transaction costs when implementing the portfolio strategy in practice. Interestingly, Figs. 6 and 7 show that the wealth trajectories of the proxy portfolios are similar to the ones of the optimal portfolios. Although this is only an in-sample result, it suggests that our optimal portfolio strategies have potential to be implemented in practice.
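The exponential weighting can be sketched as follows. The normalization by the sum of the weights and the choice of averaging window are assumptions, since the defining equation is not reproduced here; times are measured in years, and the function name and the sample weights are hypothetical.

```python
import math


def proxy_weight(times, optimal_weights, rebalance_time):
    """Exponentially weighted average of past optimal weights, using
    omega_t = exp(-(rebalance_time - t)). Normalizing by the sum of the
    omegas is an ASSUMPTION made for this illustration."""
    if not times or len(times) != len(optimal_weights):
        raise ValueError("times and optimal_weights must be non-empty and aligned")
    if any(t > rebalance_time for t in times):
        raise ValueError("only past observations may be used")
    omegas = [math.exp(-(rebalance_time - t)) for t in times]
    weighted_sum = sum(w * n for w, n in zip(omegas, optimal_weights))
    return weighted_sum / sum(omegas)


# Hypothetical past optimal weights in the two-year swap over one year
times = [0.0, 0.25, 0.5, 0.75, 1.0]
past_weights = [-0.10, -0.12, -0.15, -0.19, -0.25]
print(proxy_weight(times, past_weights, rebalance_time=1.0))
```

Because recent observations receive larger weights, the proxy weight is tilted toward the most recent optimal weights relative to a simple average.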
The results above differ from those in Egloff, Leippold, and Wu (2010) in a number of ways. In their diffusive affine setting, the optimal weight in the stock index is constant over time and the optimal weights in variance swaps are state-independent. In our quadratic setting, optimal portfolio weights depend on state variables and exhibit the rich dynamics discussed above. Thus, the two optimal strategies are fundamentally different. Furthermore, they assume that at any time the investor can trade newly issued variance swaps at zero spot value ("sliding" variance swap investment). This is a special case of our framework in which we take into account investments in on-the-run variance swaps. This allows us to uncover periodic patterns in the optimal variance swap weights. Moreover, their empirical implementation of optimal portfolios is static, while we implement dynamic strategies.
They use a risk aversion of $\eta = 200$ while we use $\eta = 5$ and $\eta = 1$. In our setting, the stock index can jump and the investor can trade put options to hedge jump risk. This is not the case in their setting. Finally, the market price of risk specifications differ between the two studies. This implies that optimal portfolio weights are significantly different and actually mirror each other. 39

Robustness checks
We performed several robustness checks that largely confirm our empirical analysis of optimal portfolios.
Optimal portfolios above are based on three-month and two-year variance swaps. Optimal portfolios based on variance swaps with other term combinations (such as three-month and one-year; six-month and one-year; six-month and two-year) have similar performance. The same holds true when using different roll-over periods (such as daily, half term, or term of the variance swaps). In the analysis above we use 95% out-of-the-money put options. We also used at-the-money put options and the resulting wealth processes were very similar. For example, when the risk aversion is $\eta = 5$, the optimal wealth process always grows steadily over time and is significantly smoother than the trajectory of the stock index. Indeed, since we are in a complete market setup, in theory, the choice of variance swap terms, roll-over periods, and index derivatives has no impact on the optimal wealth trajectory.

38 We set $n_{i0} = n^*_{i0}$ for the initial holding period.

39 As mentioned above, our optimal trading strategy is to go short in long-term variance swaps (to earn the variance risk premium), long in short-term variance swaps (to hedge volatility increases), and long in the stock index (to earn the equity risk premium). Egloff, Leippold, and Wu (2010) find opposite trading directions in their optimal trading strategy.
We experimented with other values of index jump size and intensity, and jump risk premium. When index jumps are smaller or carry more risk premium, the optimal investment in the put option switches from long to short, as theory predicts. For example, when the index jump size is $\xi = -10\%$, one index jump occurs on average once every ten years, the jump risk premium is $\nu^Q/\nu^P = 1.2$, and the risk aversion is $\eta = 5$, the optimal investment $\phi_t$ in the out-of-the-money put option is negative, around $-1\%$, and somewhat mirrors the $\phi_t$ in Fig. 4. Optimal investments in variance swaps and stock index largely share the same patterns as in Fig. 4. The optimal wealth trajectory is smooth and very similar to the wealth trajectory in Fig. 6.
We also considered other investment horizons, such as five and ten years. The pattern of optimal portfolio weights is only marginally affected by the choice of the investment horizon.
Besides the risk aversion levels of $\eta = 5$ and $\eta = 1$, we also experimented with higher values, such as $\eta = 30$. The optimal portfolio weights in the risky assets follow the same pattern. The weights are smaller in absolute value, which is consistent with the investor being more risk averse.
The above empirical analysis is based on a sample average equity risk premium of 6%. We redid the analysis for a sample average equity risk premium set to 4% by changing the parameter κ in (51) accordingly. This simply leads to smaller portfolio weights in the stock index, as theory predicts.
Finally, we discuss the impact of transaction costs on wealth trajectories. We analyzed actual bid-ask spreads of variance swap rates from a large broker-dealer. Bid-ask spreads relative to variance swap rates tend to be smaller in pre-crisis than crisis periods, and to decrease with term. The average relative bid-ask spreads for two-month and one-year variance swaps are 2.3% and 1.2%, respectively. We used these average bid-ask spreads to assess the impact of transaction costs on the proxy portfolios in Section 6.1, which are rebalanced at lower frequency than daily. As newly issued variance swaps have zero value at inception, bid-ask spreads are paid when liquidating existing variance swap positions. We find that such bid-ask spreads have only a minor impact on wealth trajectories. These results are not reported but are available from the authors upon request. Optimal portfolios also include put options, stock index, and risk-free bond. Optimal portfolio weights in put options are very small. Bid-ask spreads for liquidly traded stock index and risk-free bonds are also very small. 40 Thus, these transaction costs have practically no impact on wealth trajectories.

Conclusion
We introduce a novel class of quadratic term structure models for variance swaps. The multivariate state variable driving the stochastic variance follows a quadratic diffusion process. The variance swap curve is quadratic in the state variable and available in closed form, greatly facilitating empirical applications. Various goodness-of-fit tests show that quadratic models fit variance swap rates remarkably well and significantly outperform affine specifications. The quadratic features of the stochastic variance and of the state process diffusion function appear to generate enough volatility of volatility to fit the empirical dynamics of variance swap rates and quadratic variation.
We study dynamic optimal portfolios in variance swaps, put option, stock index, and risk-free bond, when the stock index can jump. Optimal portfolio weights are available in terms of a Taylor series expansion involving conditional moments of the state variables, which in turn are available in closed form. The empirical analysis of optimal portfolios reveals that optimal portfolio weights in variance swaps induce a short-long strategy, with a short position in long-term variance swaps (to earn the negative variance risk premium), and a long position in short-term variance swaps (to hedge volatility increases). This short-long strategy in variance swaps is a robust feature of the optimal trading strategy. Portfolio weights exhibit strong periodic patterns, which depend on the roll-over period and maturity of the variance swaps. The optimal investment in variance swaps can be used either to achieve stable wealth growth or to seek additional risk premium, depending on the risk profile of the investor. Depending on the index jump risk premium, the optimal investment in put options is used either to hedge index jumps or to earn the jump risk premium. Optimal portfolio weights in put options appear to be very small relative to portfolio weights in variance swaps.
Future research can take various directions. Variance swaps on different underlying assets, such as commodities, exchange and interest rates, are actively traded over-the-counter. Quadratic models can easily be applied to these contracts. The recently listed S&P 500 Variance Futures at the CBOE (see Footnote 4) will provide additional data for further studies on variance swaps. Studying derivatives on variance swaps in our quadratic setup can also be an interesting direction for future research.

Appendix A. Supplementary data
Supplementary data associated with this paper can be found in the online version at http://dx.doi.org/10.1016/j.jfineco.2015.08.015.