DCC- and DECO-HEAVY: Multivariate GARCH models based on realized variances and correlations

This paper introduces the scalar DCC-HEAVY and DECO-HEAVY models for conditional variances and correlations of daily returns based on measures of realized variances and correlations built from intraday data. Formulas for multi-step forecasts of conditional variances and correlations are provided. Asymmetric versions of the models are developed. An empirical study shows that in terms of forecasts the scalar HEAVY models outperform the scalar BEKK-HEAVY model based on realized covariances and the scalar BEKK, DCC, and DECO multivariate GARCH models based exclusively on daily data. © 2022 The Authors. Published by Elsevier B.V. on behalf of International Institute of Forecasters. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).


Introduction
The covariance matrix of daily asset returns is used in several applications of financial management. Therefore, modeling the temporal dependence in the elements of the covariance matrix, often with the ultimate aim of forecasting, is a key area of financial econometrics. The most widespread model class for this purpose is that of multivariate generalized autoregressive conditional heteroskedasticity (MGARCH) models, wherein the conditional covariance matrix of daily returns is specified as a deterministic function of past daily returns. A survey of MGARCH models is provided by Bauwens, Laurent, and Rombouts (2006).
The increasing availability of intraday data has led to the development of so-called High-frEquency-bAsed VolatilitY (HEAVY) models, initially in the univariate case (Engle & Gallo, 2006; Shephard & Sheppard, 2010). In the multivariate case, such models have been introduced in the stochastic volatility framework by Jin and Maheu (2013), and in the MGARCH one by Noureldin, Shephard, and Sheppard (2012). In the latter framework, the main difference is that the conditional covariance matrix of daily returns is specified in the HEAVY case as a function of lagged realized covariances, instead of lagged outer products of daily returns in the MGARCH case. Thus, multi-step forecasts of daily conditional covariance matrices by HEAVY models require as input forecasts of realized covariances, which are obtained from a dynamic model for the realized covariance matrix. HEAVY models are based on more accurate measurements of covariances than GARCH models, and they improve forecasts of the conditional covariance matrix of daily returns, as illustrated by Noureldin et al. (2012). A different class of models that relate the daily return volatility to a realized volatility measure is the realized GARCH model of Hansen, Huang, and Shek (2012); this type of model has been extended to the multivariate setup by Gorgi, Hansen, Janus, and Koopman (2019) and Hansen, Lunde, and Voev (2014).
A well-known difficulty with MGARCH and HEAVY models is that the number of parameters they require tends to be large, being at least a quadratic function of the dimension of the return vector. Therefore, when the dimension is more than a handful of assets, a scalar parameterization is adopted. For example, Noureldin et al. (2012), for 10 assets, and Opschoor, Janus, Lucas, and Van Dijk (2018), for 30 assets, adopt the scalar BEKK parameterization of Engle and Kroner (1995). This parameterization, coupled with a targeting procedure, which is a way to estimate the constant parameter matrices of the model, facilitates the estimation of the remaining scalar parameters enormously, since their number is independent of the dimension. The scalar BEKK parameterization implies that the conditional variances and covariances have the same persistence and the same sensitivity to the past realized covariances, which may be unrealistic for a large number of assets. A factor HEAVY model, proposed by Sheppard and Xu (2019), avoids this drawback of the scalar BEKK while keeping the number of parameters manageable for estimation.
The first contribution of this research is the extension of the HEAVY class by the DCC specification of Engle (2002) and the DECO specification of Engle and Kelly (2012). Scalar DCC-HEAVY and DECO-HEAVY models are developed. As was done for the BEKK-HEAVY, these models use well-known formulations of the corresponding MGARCH models, modifying them by specifying the dynamics of the daily conditional correlation matrix as a function of past realized correlations. Likewise, the model we propose for the realized correlation matrix is of DCC type, instead of BEKK type. In both parts of the model, there is a decoupling of the parameters of the variance dynamics (daily or realized) from those of the model for the corresponding correlation matrix. The decoupling has been shown to be advantageous in several respects for MGARCH models: the scalar DCC is a more flexible model than the scalar BEKK and usually provides a better empirical fit, even after taking into account its heavier parameterization; each part of the model can be estimated by QML in two steps (one for the variance equations, and one for the correlation), which makes it practical for handling a dimension of more than a handful of assets; and it usually improves the forecast quality. The same advantages occur in the scalar HEAVY model context, and this is confirmed empirically.
The proposed DCC-HEAVY formulation is different from the model of Braione (2016), where the dynamics of the daily correlation matrix are driven by the outer product of the past degarched returns (which is not a correlation matrix), while the model for the realized covariances and variances is of BEKK type. An advantage of the new specifications is that they allow us to forecast the correlations directly several steps ahead, avoiding an approximation due to the fact that otherwise, correlations are obtained by normalizing quasi-correlations.
Stationarity conditions and formulas for multi-step forecasts are derived from a vector multiplicative error representation of the DCC-HEAVY model, where the dynamics are driven by both lagged realized measures and outer products of past degarched returns, thus encompassing both DCC-HEAVY and DCC-GARCH. Furthermore, asymmetric impact and HAR-type terms (see Corsi (2009)) are added to the DCC- and DECO-HEAVY models.
The second contribution of this research is a detailed empirical comparison of the BEKK-, DCC-, and DECO-HEAVY and GARCH models. All models are applied to the stocks in the Dow Jones Industrial Average (DJIA) index. As in Shephard and Sheppard (2010), the effects of the lagged squared returns are insignificant when lagged realized variances are included in the conditional variance equations. Likewise, the effect of the lagged outer product of degarched returns is insignificant when the lagged realized correlation matrix is included in the conditional correlation equation. Moreover, when applying the model confidence set approach based on statistical and economic loss functions, the empirical results show that the DCC- and DECO-HEAVY models provide better out-of-sample forecasts than the DCC-GARCH, DECO-GARCH, BEKK-GARCH, and BEKK-HEAVY models. Including asymmetric and HAR terms further improves the DCC- and DECO-HEAVY model forecasts.
The remainder of the paper is organized as follows. Section 2 introduces the DCC- and DECO-HEAVY models. Section 3 provides the multiplicative error representation, the multi-step forecast formulas, and model extensions. Section 4 presents the estimation procedure. Section 5 provides the empirical results. Section 6 concludes. A supplementary appendix includes additional theoretical and empirical results, in particular for a dataset used by Noureldin et al. (2012).

DCC-HEAVY and DECO-HEAVY models
In the first subsection, the multivariate HEAVY framework of Noureldin et al. (2012) is recalled, in particular the scalar BEKK-HEAVY model. The scalar DCC-HEAVY and DECO-HEAVY models are defined in the next subsections.
Denote by r_{t,j} the k × 1 vector of the jth intraday returns of day t, for j = 1, ..., m, and by r_t the corresponding vector of daily returns. Assuming, for instance, six and a half hours of trading per day and five-minute returns, m is equal to 78. The outer product of daily returns is the k × k matrix r_t r′_t. The simplest realized covariance measure for the k assets on day t is the k × k matrix defined as

RC_t = Σ_{j=1}^m r_{t,j} r′_{t,j}.  (1)

Assuming that m > k, RC_t is positive definite (provided that no asset is a linear combination of the other assets).
Denote by v_t the k × 1 realized variance vector of day t, consisting of the diagonal elements of RC_t, and by RL_t the realized correlation matrix of day t, defined as

RL_t = diag(RC_t)^{−1/2} RC_t diag(RC_t)^{−1/2},  (2)

where diag(RC_t) is the diagonal matrix obtained by setting the off-diagonal elements of RC_t equal to zero, and the exponent −1/2 transforms each diagonal element into the inverse of its square root. Thus, the off-diagonal elements of RL_t are the realized correlation coefficients for the asset pairs, and its diagonal elements are equal to unity.
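As a small illustration, the construction of the realized covariance and realized correlation matrices can be sketched in a few lines. The intraday returns below are simulated placeholders, not data from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
k, m = 5, 78                       # 5 assets, 78 five-minute returns per day

# Hypothetical intraday return vectors r_{t,j}, j = 1, ..., m
intraday = rng.normal(scale=0.001, size=(m, k))

# Realized covariance: RC_t = sum_j r_{t,j} r_{t,j}'
RC = intraday.T @ intraday

# Realized correlation: RL_t = diag(RC_t)^{-1/2} RC_t diag(RC_t)^{-1/2}
d_inv = np.diag(1.0 / np.sqrt(np.diag(RC)))
RL = d_inv @ RC @ d_inv
```

Since m > k and the simulated returns are generic, RC is positive definite, and RL has unit diagonal by construction.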
A multivariate HEAVY model specifies a dynamic process for the conditional covariance matrix H_t of the daily returns and another one for the conditional mean M_t of the realized covariance matrix of day t:

H_t = Var(r_t | F_{t−1}),  (3)
M_t = E(RC_t | F_{t−1}),  (4)

where F_{t−1} is the information set generated by the daily and intra-daily observations, and E(r_t | F_{t−1}) = 0 is assumed for simplicity (otherwise, r_t denotes the demeaned return vector). The link between both expectations comes from the dependence of H_t on past values of functions of RC_t.
For the specification of the dynamics of H_t and M_t, Noureldin et al. (2012) adopt the BEKK-type model to ensure that the conditional covariance matrix is positive semidefinite. The scalar version of the model, with the ''targeting'' parameterization of the constant terms, is

H_t = (1 − β_h) H̄ − α_h M̄ + α_h RC_{t−1} + β_h H_{t−1},  (5)
M_t = (1 − α_m − β_m) M̄ + α_m RC_{t−1} + β_m M_{t−1},  (6)

where the k × k matrices H̄ = E(r_t r′_t) and M̄ = E(RC_t) are assumed to be positive definite (PD), and targeting means that these matrices are replaced by their empirical counterparts. For (6), sufficient restrictions for the positivity of M_t are α_m ≥ 0, β_m ≥ 0, and α_m + β_m < 1. A GARCH version of the model is obtained by replacing RC_{t−1} by r_{t−1} r′_{t−1} and M̄ by H̄, thus using only the daily information.
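A minimal numerical sketch of the scalar BEKK-HEAVY recursions under targeting follows. The parameter values and the simulated realized covariances are purely illustrative, and the targeted matrices H̄ and M̄ are set equal here only to keep the constant term positive definite in a transparent way:

```python
import numpy as np

rng = np.random.default_rng(1)
k, T = 4, 200
alpha_h, beta_h = 0.3, 0.6         # illustrative scalar parameters
alpha_m, beta_m = 0.35, 0.6

# Hypothetical positive-definite realized covariance matrices RC_t
RCs = np.empty((T, k, k))
for t in range(T):
    g = rng.normal(size=(k, 2 * k))
    RCs[t] = g @ g.T / (2 * k)

M_bar = RCs.mean(axis=0)           # targeting: empirical mean of RC_t
H_bar = M_bar.copy()               # assumption: target of E(r_t r_t') set equal to M_bar

H, M = H_bar.copy(), M_bar.copy()
for t in range(1, T):
    # H_t = (1 - beta_h) H_bar - alpha_h M_bar + alpha_h RC_{t-1} + beta_h H_{t-1}
    H = (1 - beta_h) * H_bar - alpha_h * M_bar + alpha_h * RCs[t - 1] + beta_h * H
    # M_t = (1 - alpha_m - beta_m) M_bar + alpha_m RC_{t-1} + beta_m M_{t-1}
    M = (1 - alpha_m - beta_m) * M_bar + alpha_m * RCs[t - 1] + beta_m * M
```

With H̄ = M̄ and α_h + β_h < 1, the constant term of the H recursion is PD, so both paths remain symmetric and PD along the simulation.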

DCC-HEAVY model
An alternative to the BEKK specification for H_t and M_t is the DCC model of Engle (2002). Any covariance matrix can be written as the product DRD, where D is the corresponding diagonal matrix of standard deviations, and R is the correlation matrix.

DCC-HEAVY specification of H t
Let us denote by h_t the k × 1 vector of conditional variances (that is, the diagonal elements of H_t), by h_t^{1/2} the vector of conditional standard deviations (obtained by taking the square root of each entry of h_t), and by R_t the corresponding conditional correlation matrix. Then, the conditional covariance matrix H_t can be written as

H_t = Diag(h_t^{1/2}) R_t Diag(h_t^{1/2}),  (7)

where, for x being a k × 1 vector, Diag(x) is the k × k diagonal matrix with the entries of x as diagonal elements. Assumption (3) for the HEAVY-BEKK is replaced by

h_t = E(r_t ⊙ r_t | F_{t−1}),  (8)
R_t = E(u_t u′_t | F_{t−1}),  (9)

where ⊙ denotes the element-by-element product and u_t = r_t ⊙ h_t^{−1/2} is the vector of degarched returns. Thus, instead of specifying altogether the dynamics of the conditional variances and covariances of the returns, as for instance in (5), the DCC-HEAVY model specifies the dynamics of the conditional variances of the returns and of the conditional correlation matrix (R_t) of the degarched returns (i.e., the observed returns divided by their conditional standard deviations). Notice indeed that E(u_t u′_t | F_{t−1}) is a matrix with unit diagonal elements and off-diagonal elements that are the conditional correlation coefficients, that is, the conditional covariances divided by the corresponding conditional standard deviations. This is the same setting as in the DCC-GARCH model of Engle (2002); since a covariance is the product of two standard deviations and a correlation, the expectation of the covariance is not the corresponding function of the expected standard deviations and expected correlation.
The dynamics of the conditional variance vector are specified as

h_t = ω_h + A_h v_{t−1} + B_h h_{t−1},  (10)

where v_{t−1} is defined below (1), ω_h is a k × 1 positive vector, and A_h and B_h are k × k matrices, such that each entry of h_t is positive. To ease the restrictions necessary for this and to avoid parameter proliferation, A_h and B_h are restricted to be diagonal matrices with positive entries on the diagonal, and the elements on the diagonal of B_h are restricted to be smaller than unity. The diagonality restrictions imply that each conditional variance depends on its own lag and the corresponding previous realized variance, as in the HEAVY-r model of Shephard and Sheppard (2010). More generally, A_h can be non-diagonal to allow spillover effects. If B_h is restricted to be diagonal, the model of the k variances can be estimated in k separate parts (see Section 4). The DCC-GARCH model of Engle (2002) for the conditional variances is similar to (10), with r²_{t−1} (the squared elements of r_{t−1}) replacing v_{t−1}.
The conditional correlation matrix is specified through a scalar dynamic equation:

R_t = Ω_r + α_r RL_{t−1} + β_r R_{t−1},  (11)
Ω_r = (1 − β_r) R̄ − α_r P̄,  (12)

where α_r ≥ 0, β_r ≥ 0, β_r = 0 if α_r = 0, β_r < 1, R̄ is the k × k unconditional correlation matrix of u_t, and P̄ is the k × k unconditional expectation of RL_t. The elements of R̄ and P̄ can be set to their empirical counterparts to simplify the estimation, so that only two parameters (α_r and β_r) remain to be estimated. By substituting (12) in (11), R_t is equal to R̄ + α_r (RL_{t−1} − P̄) + β_r (R_{t−1} − R̄), and by taking the unconditional expectation on both sides, E(R_t) = R̄ if E(RL_t) = P̄. The specification of R_t is similar in spirit to the specification of H_t in (5), taking into account that E(R_t) is not equal to E(RL_t), just as E(H_t) is not equal to E(RC_t).
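To illustrate the correlation recursion, the sketch below iterates a scalar update of the form R_t = (1 − β_r) R̄ − α_r P̄ + α_r RL_{t−1} + β_r R_{t−1}; the targeted matrices and the realized correlations are random placeholders, not estimates from the paper:

```python
import numpy as np

rng = np.random.default_rng(2)
k = 4
alpha_r, beta_r = 0.2, 0.7         # illustrative values satisfying the stated constraints

def random_corr(k, rng):
    """Hypothetical correlation matrix built from a random covariance."""
    g = rng.normal(size=(k, 2 * k))
    c = g @ g.T
    d = np.diag(1.0 / np.sqrt(np.diag(c)))
    return d @ c @ d

R_bar = random_corr(k, rng)        # targeted unconditional correlation of u_t
P_bar = random_corr(k, rng)        # targeted unconditional mean of RL_t

R = R_bar.copy()
for _ in range(100):
    RL = random_corr(k, rng)       # stand-in for the day's realized correlation
    R = (1 - beta_r) * R_bar - alpha_r * P_bar + alpha_r * RL + beta_r * R
```

Because R̄, P̄, and RL_{t−1} all have unit diagonal, the diagonal of R_t stays exactly equal to one at every step: (1 − β_r) − α_r + α_r + β_r = 1. Positive definiteness, by contrast, is not guaranteed by this algebra alone, which is why it is checked numerically during estimation.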
Since R̄, P̄, and RL_{t−1} have unit diagonal elements, and assuming that the initial matrix R_0 is a correlation matrix, it is obvious that R_t has unit diagonal elements; but to be a well-defined correlation matrix, it must be positive definite (PD). This is not necessarily the case for the set of values of (α_r, β_r) stated above. The issue is illustrated in Section B of the supplemental appendix. For estimation, we proceed (as for the BEKK-HEAVY model) by checking that R_t (rather than R̄) is positive definite over the sample period during the numerical maximization of the log-likelihood function. As for the BEKK-HEAVY model, no issue of a lack of PDness occurred for the datasets we used.
An advantage of the proposed specification of R_t, based on the realized correlation matrix, is that there is no need to transform the covariance matrix of the degarched returns into a correlation matrix, as in the DCC-GARCH model of Engle (2002), which is specified by

R_t = diag(Q_t)^{−1/2} Q_t diag(Q_t)^{−1/2},  (13)
Q_t = (1 − α_q − β_q) Q̄ + α_q u_{t−1} u′_{t−1} + β_q Q_{t−1},  (14)

where α_q ≥ 0, β_q ≥ 0, β_q = 0 if α_q = 0, α_q + β_q < 1, and Q̄ is a k × k PD matrix. Q_t is actually the conditional covariance matrix of u_t, with non-unit diagonal elements, and by (13), it is transformed into a correlation matrix. However, this parameterization raises two issues, one about estimation, the other about forecasting. First, because E(u_t u′_t) is not equal to E(Q_t), Q̄ is not consistently estimated by Σ_t u_t u′_t / T. Thus, using this average to estimate Q̄, as proposed by Engle (2002), so that only α_q and β_q remain to be estimated by QML, introduces an asymptotic bias in their estimator. See Aielli (2013) for details and an alternative formulation of (14) that avoids this problem.
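The quasi-correlation mechanics of (13)-(14) can be made concrete in a few lines; the numbers below are illustrative placeholders. Note that one update of Q_t already destroys the unit diagonal, which is exactly why the normalization (13) is needed:

```python
import numpy as np

rng = np.random.default_rng(3)
k = 4
alpha_q, beta_q = 0.05, 0.9

Q_bar = np.eye(k)                  # illustrative unconditional matrix
u = rng.normal(size=k)             # hypothetical degarched return vector

# Eq. (14): Q_t = (1 - alpha_q - beta_q) Q_bar + alpha_q u u' + beta_q Q_{t-1}
Q = (1 - alpha_q - beta_q) * Q_bar + alpha_q * np.outer(u, u) + beta_q * Q_bar

# Q_t is only a quasi-correlation: its diagonal is 0.95 + 0.05 * u_i**2, not 1.
# Eq. (13): R_t = diag(Q_t)^{-1/2} Q_t diag(Q_t)^{-1/2}
d_inv = np.diag(1.0 / np.sqrt(np.diag(Q)))
R = d_inv @ Q @ d_inv
```

The DCC-HEAVY specification avoids this normalization step altogether, since RL_{t−1} and R_{t−1} are correlation matrices already.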
Second, at date t, the (s + 1)-step-ahead forecast of R_{t+s+1} requires E_t(u_{t+s} u′_{t+s}) = E_t(R_{t+s}), which is not available in closed form due to the nonlinear relation (13) between R_{t+s} and Q_{t+s}. By assuming E_t(R_{t+s}) ≈ E_t(Q_{t+s}), Engle and Sheppard (2001) obtain a closed-form forecast recurrence relation that starts with E_t(R_{t+1}) = R_{t+1}. The correlation forecasts are thus approximate and biased.
The DCC-HEAVY model differs from the DCC-GARCH model in three ways: 1) the dynamics of the conditional variances h_t are driven by the lagged realized variances v_{t−1}; 2) the conditional correlation matrix R_t is modeled directly rather than parameterized in a sandwich form as in (13); and 3) the dynamics of the conditional correlation matrix R_t are driven by the lagged realized correlation matrix. The last two features allow us to obtain exact closed forms for s-step-ahead correlation forecasts, as explained in Section 3.

DCC-HEAVY specification of M t
The specification of the DCC-HEAVY model requires defining the dynamics of

M_t = Diag(m_t^{1/2}) P_t Diag(m_t^{1/2}),  (16)

where m_t is the vector containing the main diagonal of M_t, that is, the conditional means of the realized variances, and P_t is the corresponding conditional mean of the realized correlation matrix.
The conditional expectation of the realized variances is specified as

m_t = ω_m + A_m v_{t−1} + B_m m_{t−1},  (17)

where ω_m is a positive k × 1 vector, and A_m and B_m are k × k matrices that are restricted to be diagonal with positive entries, as discussed after (10).
The dynamic process for the conditional expectation of the realized correlation matrix is defined in the following way:

P_t = (1 − α_p − β_p) P̄ + α_p RL_{t−1} + β_p P_{t−1},  (18)

where α_p ≥ 0, β_p ≥ 0, β_p = 0 if α_p = 0, α_p + β_p < 1, and P̄ is a correlation matrix that is the unconditional mean of RL_t. The elements of P̄ can be set to their empirical counterparts to render the estimation simpler. E(RL_t) is not equal to the correlation matrix computed from E(RC_t), due to the nonlinearity of the transformation from covariance to correlation. However, Bauwens, Storti, and Violante (2012) show that if RC_t is computed from a large enough number of high-frequency returns, P̄ should be almost equal to E(RL_t).

DECO-HEAVY model
The DECO-HEAVY model differs from the DCC-HEAVY in the specification of the conditional correlation matrix corresponding to H t and of the conditional mean of the realized correlation matrix corresponding to M t .
The specification of the conditional correlation matrix corresponding to H_t, denoted by R^E_t, is based on the assumption that all the conditional correlations are equal to the same time-varying correlation ρ_t ∈ (−1/(k − 1), 1), chosen to be the average of the correlation coefficients of R_t = (r_{t,ij}), defined by (11)-(12):

R^E_t = (1 − ρ_t) I_k + ρ_t J_k,  (19)
ρ_t = (2/(k(k − 1))) Σ_{i>j} r_{t,ij},  (20)

where I_k is the k × k identity matrix and J_k is a k × k matrix of ones. The DECO-GARCH model of Engle and Kelly (2012) uses as the dynamic equicorrelation coefficient the average of the DCC-GARCH correlations defined by (13), together with the modification of (14) proposed by Aielli (2013).
Likewise, the conditional mean of the realized correlation matrix corresponding to M_t, denoted by P^E_t, is specified as

P^E_t = (1 − φ_t) I_k + φ_t J_k,  (21)
φ_t = (2/(k(k − 1))) Σ_{i>j} p_{t,ij},  (22)

where p_{t,ij} are the elements of P_t defined by (18). The main advantage of DECO with respect to DCC, especially when k is very large, is the availability of analytical expressions for the inverse and determinant of equicorrelated matrices, which are used in the computation of the likelihood function for estimation, and of economic loss functions for forecast evaluations. For more details, see Engle and Kelly (2012).
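The analytical inverse and determinant of an equicorrelated matrix are standard identities (see Engle & Kelly, 2012) and can be verified numerically; the dimension and equicorrelation value below are illustrative:

```python
import numpy as np

k, rho = 29, 0.35                  # illustrative dimension and equicorrelation
I, J = np.eye(k), np.ones((k, k))
R_eq = (1 - rho) * I + rho * J     # equicorrelated matrix

# Closed forms used in the DECO likelihood:
# det = (1 - rho)^(k-1) * (1 + (k-1) rho)
det_closed = (1 - rho) ** (k - 1) * (1 + (k - 1) * rho)
# inverse = (1/(1 - rho)) * (I - rho / (1 + (k-1) rho) * J)
inv_closed = (I - rho / (1 + (k - 1) * rho) * J) / (1 - rho)
```

These closed forms replace an O(k^3) inversion and determinant by O(k^2) arithmetic in each likelihood evaluation, which is the computational advantage emphasized above.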

Representation, forecasting, and extensions
In this section, the DCC-HEAVY and the closely related DCC-GARCH and DCCX-GARCHX models are represented as multiplicative error models (MEMs), from which stationarity conditions and closed-form formulas for multi-step forecasts follow directly. An extension of the DCC-HEAVY model is proposed by adding asymmetric impact and HAR terms.

Multiplicative error representation
An MEM for a positive variable x_t specifies it as the product of a positive conditional mean and a positive error that follows some distribution with expectation equal to one. If x_t is a squared centered return, the conditional mean is the conditional variance. This can be extended to the elements of a vector. In this subsection, the focus is on the conditional expectation formulation, not on the distribution.
For the conditional and realized variance equations, define the 2k × 1 vectors x_t = (r_t ⊙ r_t, v_t)′ and μ_t = E(x_t | F_{t−1}) = (h_t, m_t)′. The conditional expectation formulation of the conditional and realized variance Eqs. (10) and (17) is

μ_t = ω + A x_{t−1} + B μ_{t−1},  (23)

where ω = (ω_h, ω_m)′, A = [0 A_h; 0 A_m], B = [B_h 0; 0 B_m] (the notation [X Y; Z W] denoting the 2 × 2 block matrix with rows separated by semicolons), and the 0 symbol stands for a k × k matrix of zeros. If A = [A_h 0; 0 A_m], the top part becomes the DCC-GARCH model, and the realized variance model is kept in the bottom part. If A = [A_h^{(1)} A_h^{(2)}; 0 A_m], the model (referred to as DCC-GARCHX) includes in the top part both the lagged squared return and the lagged realized variance, encompassing the two previous models.
For the conditional and realized correlation matrices, define the 2k × k stacked matrices Y_t = (u_t u′_t, RL′_t)′ and Φ_t = E(Y_t | F_{t−1}) = (R′_t, P′_t)′. The conditional expectation formulation of the conditional and realized correlation matrix Eqs. (11) and (18) is

Φ_t = Ω + (α ⊗ I_k) Y_{t−1} + (β ⊗ I_k) Φ_{t−1},  (27)

where Ω = (Ω′_r, (1 − α_p − β_p) P̄′)′, α = [0 α_r; 0 α_p], β = [β_r 0; 0 β_p] (rows separated by semicolons), and the 0 symbol is scalar in this case. If α = [α_q 0; 0 α_p] and the constant term is adapted, then the top part becomes the ''quasi-correlation'' Eq. (14) of the DCC-GARCH model, the realized correlation equation being kept in the bottom part. If both u_{t−1} u′_{t−1} and RL_{t−1} are included in the top part, the resulting model is referred to as DCCX-GARCH.
Processes such as those defined by (23) and (27) can be written as VARMA(1,1) processes by defining appropriate error terms. From this representation, it follows that the covariance stationarity condition is that the largest eigenvalue of A + B is smaller than unity for (23), and likewise for the largest eigenvalue of α + β for (27). For the matrices α and β defined after (27), the previous condition is equivalent to β_r < 1 and α_p + β_p < 1, as written after (11) and (18). The unconditional first moment is then obtained by applying standard results for VARMA representations.
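The eigenvalue-based stationarity check for the correlation part can be sketched as follows, with illustrative parameter values; since α + β is upper triangular here, its eigenvalues are exactly β_r and α_p + β_p:

```python
import numpy as np

alpha_r, beta_r = 0.2, 0.7         # illustrative values
alpha_p, beta_p = 0.4, 0.5

# 2x2 coefficient matrices of the correlation MEM (27)
alpha = np.array([[0.0, alpha_r], [0.0, alpha_p]])
beta = np.array([[beta_r, 0.0], [0.0, beta_p]])

eigs = np.linalg.eigvals(alpha + beta)
stationary = np.max(np.abs(eigs)) < 1.0
```

The same check applies to A + B for the variance part (23), with 2k × 2k block matrices instead of 2 × 2 scalar ones.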
In particular, E(μ_t) = (I_{2k} − A − B)^{−1} ω provides the unconditional expectations of the variances of the DCC-HEAVY model. For the correlations, E(Φ_t) = ((I_2 − α − β)^{−1} ⊗ I_k) Ω = (R̄′, P̄′)′, the last equality resulting from (12).

Multiple-step-ahead forecasting
Forecasts of the conditional covariance matrices of daily returns are used in several financial applications. The s-step-ahead forecast of H_{t+s}, computed at date t, is defined in the case of DCC-type models as

H_{t+s|t} = Diag(E_t(h_{t+s})^{1/2}) E_t(R_{t+s}) Diag(E_t(h_{t+s})^{1/2}),  (28)

where E_t(·) is a short notation for E(·|F_t). Notice that, as in the DCC model, H_{t+s|t} is not equal to E_t(H_{t+s}), due to the nonlinearity of the transformation of covariances into correlations and of the square root function.
To obtain E_t(h_{t+s}) and E_t(R_{t+s}), the conditional expectation expressions of the previous subsection are useful to compute E_t(μ_{t+s}) and E_t(Φ_{t+s}), denoted by μ_{t+s|t} and Φ_{t+s|t}, respectively.
Starting from (23) leads to E_t(μ_{t+s}). In moving more than one step ahead, x_{t+s−1} is not known and needs to be substituted with its conditional expectation μ_{t+s−1|t}. Hence

μ_{t+s|t} = ω + (A + B) μ_{t+s−1|t},  (29)

which can be solved recursively, giving the closed-form forecast

μ_{t+s|t} = Σ_{i=0}^{s−2} (A + B)^i ω + (A + B)^{s−1} μ_{t+1|t},  (30)

where μ_{t+1|t} = ω + A x_t + B μ_t. Proceeding in the same way for E_t(Φ_{t+s}) from (27) gives the closed-form forecast

Φ_{t+s|t} = Σ_{i=0}^{s−2} ((α + β)^i ⊗ I_k) Ω + ((α + β)^{s−1} ⊗ I_k) Φ_{t+1|t}.  (31)

For instance, the DCC-HEAVY model s-step-ahead forecasts μ_{t+s|t} and Φ_{t+s|t} are derived from (30) and (31) by setting ω, A, B, Ω, α, and β to the matrices defined after (23) and (27). The s-step-ahead forecast E_t[h_{t+s}] of the conditional variance vector corresponds to the first k elements of μ_{t+s|t}, and the s-step forecast E_t[R_{t+s}] of the conditional correlation matrix corresponds to the k × k upper block of Φ_{t+s|t}.
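The equivalence between the recursive and the closed-form forecast can be checked numerically. The sketch below uses a generic stable linear recursion of the MEM type with arbitrary diagonal coefficient matrices and a placeholder one-step forecast:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 4                              # dimension of the stacked vector (2k with k = 2)
omega = rng.uniform(0.01, 0.1, size=n)
A = np.diag(rng.uniform(0.1, 0.2, size=n))
B = np.diag(rng.uniform(0.5, 0.7, size=n))

mu_1 = rng.uniform(0.5, 1.5, size=n)    # hypothetical one-step forecast mu_{t+1|t}

def forecast_recursive(s):
    """Iterate mu_{t+s|t} = omega + (A + B) mu_{t+s-1|t} from mu_{t+1|t}."""
    mu = mu_1.copy()
    for _ in range(s - 1):
        mu = omega + (A + B) @ mu
    return mu

def forecast_closed(s):
    """Closed form: sum_{i=0}^{s-2} (A+B)^i omega + (A+B)^{s-1} mu_{t+1|t}."""
    AB = A + B
    acc = sum(np.linalg.matrix_power(AB, i) @ omega for i in range(s - 1))
    return acc + np.linalg.matrix_power(AB, s - 1) @ mu_1
```

For s = 1 both reduce to μ_{t+1|t}, and for any horizon the two computations coincide; the closed form is what makes long-horizon forecasting cheap.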

Model extensions
It has long been known that stock markets react differently to positive and negative news. The term asymmetric effect is now commonly used for any volatility model, univariate or multivariate, in which the (co)variances respond asymmetrically to positive and negative shocks. The DCC-HEAVY model can be extended by incorporating the asymmetric effect into the variance and correlation equations. In the variance equation, the asymmetric effect implies that volatility tends to increase more following negative return shocks than equally sized positive shocks. In the correlation equation, the asymmetric effect implies that the correlation between stock returns tends to increase when the market turns down. The extended model is called the ADCC-HEAVY model.
Let d_t denote the k × 1 vector whose ith element d_{it} is equal to 1 if r_{it} < 0 and to 0 otherwise, and define D_t = d_t d′_t − Diag(d_t ⊙ d_t) (a matrix with diagonal elements equal to 0). The conditional variance and correlation equations of the ADCC-HEAVY model are then

h_t = ω_h + A_h v_{t−1} + Γ_h (d_{t−1} ⊙ v_{t−1}) + B_h h_{t−1},  (32)
R_t = Ω_r + α_r RL_{t−1} + γ_r (D_{t−1} ⊙ RL_{t−1}) + β_r R_{t−1},  (33)
Ω_r = (1 − β_r) R̄ − α_r P̄ − γ_r (D̄ ⊙ P̄),  (34)

where Γ_h is a k × k diagonal matrix, and D̄ is the sample mean of D_t. Eq. (32) extends (10); asymmetric effects correspond to positive values on the diagonal of Γ_h.
Eqs. (33)-(34) extend (11)-(12). The asymmetric effect corresponds to a positive value of γ_r: the impact of the lagged realized correlation between assets i and j on their current conditional correlation is equal to α_r + γ_r only if both r_{i,t−1} and r_{j,t−1} are negative. Otherwise, the impact is reduced to α_r if γ_r is positive. Notice that the diagonal elements of D_{t−1} ⊙ RL_{t−1} and D̄ ⊙ P̄ are equal to zero, so that the diagonal elements of R_t remain equal to one. As in the simpler model where γ_r = 0, in estimation, (α_r, β_r, γ_r) are constrained to values such that R_t is PD for all t. In the empirical applications, this did not create any difficulty.
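The construction of the asymmetry indicators can be sketched as follows; the lagged return values are hypothetical:

```python
import numpy as np

r_prev = np.array([-0.012, 0.004, -0.007, 0.009])   # hypothetical lagged returns
d = (r_prev < 0).astype(float)     # d_{i,t-1} = 1 if r_{i,t-1} < 0, else 0
D = np.outer(d, d)                 # 1 where BOTH returns are negative
np.fill_diagonal(D, 0.0)           # D_{t-1}: diagonal set to zero
```

Element-by-element multiplication of D with the lagged realized correlation matrix then activates the extra γ_r loading only for asset pairs whose lagged returns were both negative, leaving the diagonal of the correlation update untouched.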
The same asymmetric effects can be included in the realized variance and correlation equations. Furthermore, the heterogeneous autoregressive (HAR) model of Corsi (2009) has emerged as a simple and powerful way to include the long-memory feature of realized volatilities. The model was extended to the multivariate setting by Chiriac and Voev (2011) and Oh and Patton (2016). Adding HAR terms to the realized variance and correlation Eqs. (17) and (18), respectively, results in richer dynamic equations:

m_t = ω_m + A_m v_{t−1} + Γ_m (d_{t−1} ⊙ v_{t−1}) + A^w_m v^w_{t−1} + A^{mo}_m v^{mo}_{t−1} + B_m m_{t−1},  (35)
P_t = Ω_p + α_p RL_{t−1} + γ_p (D_{t−1} ⊙ RL_{t−1}) + α^w_p RL^w_{t−1} + α^{mo}_p RL^{mo}_{t−1} + β_p P_{t−1},  (36)

where v^w_{t−1} and RL^w_{t−1} (resp. v^{mo}_{t−1} and RL^{mo}_{t−1}) denote weekly (resp. monthly) averages of the past realized variances and correlations, Γ_m, A^w_m, and A^{mo}_m are diagonal matrices with positive diagonal elements, and Ω_p is the constant matrix adapted so that P_t has unit diagonal elements. Since P_t must be a well-defined correlation matrix, in estimation, the parameter space must be constrained to ensure that this matrix is PD for all t. HAR terms are not added to the conditional covariance and correlation equations, since these effects are insignificant in the empirical application.
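The HAR regressors are simple backward averages of the realized measures. A minimal sketch, assuming the usual 5-day (weekly) and 22-day (monthly) windows and a simulated realized variance series:

```python
import numpy as np

rng = np.random.default_rng(5)
T, k = 60, 3
v = rng.uniform(0.5, 2.0, size=(T, k))   # hypothetical realized variance series

t = T - 1                                 # "today"
v_week = v[t - 5:t].mean(axis=0)          # average over the previous 5 trading days
v_month = v[t - 22:t].mean(axis=0)        # average over the previous 22 trading days
```

The same averaging applies element-wise to the realized correlation matrices to build RL^w_{t−1} and RL^{mo}_{t−1}; averages of correlation matrices keep a unit diagonal, so the HAR terms do not disturb the diagonal of P_t.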
Other ways to define the asymmetric effect have been used. For example, one can define d_{it} = 1 when the stock volatility RC_{ii,t} is ''high'' (above some threshold). In the correlation equation, that implies that the correlation between two stock returns increases more when both stocks are highly volatile than when neither or only one is highly volatile. Another asymmetric effect is to let all correlations increase if the market volatility (for instance, measured by the VIX) increases; see Bauwens and Otranto (2016). Section C of the supplemental appendix provides the formulas of multiple-step forecasts of the extended models.

Estimation
The DCC- and DECO-HEAVY models are parameterized by a finite-dimensional parameter vector θ = (θ′_H, θ′_M)′, where the p_H × 1 vector θ_H is the parameter vector of the HEAVY model for H_t, and the p_M × 1 vector θ_M is the parameter vector of the HEAVY model for M_t. The vectors θ_H and θ_M can be estimated separately, as they are variation-free in the sense of Engle, Hendry, and Richard (1983). Moreover, each of these estimations can be split into two steps, as explained below.

Estimation of θ H
To get a quasi-likelihood function, we add to assumptions (8)-(9) the hypothesis that the distribution of the innovation of the return vector is multivariate Gaussian:

r_t | F_{t−1} ~ N_k(0, H_t).  (37)

Neglecting irrelevant constants, the quasi-log-likelihood function for T observations, given initial values, is

QL_H(θ_H) = −(1/2) Σ_{t=1}^T (Σ_{i=1}^k log h_{it} + r′_t Diag(h_t)^{−1} r_t)
            −(1/2) Σ_{t=1}^T (log |R_t| + u′_t R_t^{−1} u_t − u′_t u_t).  (38)
This function can be maximized numerically in a single step, but for large k, the large dimension of the parameter space makes this difficult. A two-step estimation can be based on a partition of θ_H into θ_H1, the parameters of the variance Eq. (10), and θ_H2, the parameters of the correlation Eq. (11). The two-step procedure was proposed by Engle (2002) for the DCC-GARCH model.
The first step consists in estimating θ_H1 by maximizing the quasi-log-likelihood obtained by replacing R_t in the second line of (38) by the identity matrix, so that the objective function does not depend on θ_H2:

QL_{H1}(θ_{H1}) = −(1/2) Σ_{t=1}^T Σ_{i=1}^k (log h_{it} + r²_{it}/h_{it}).  (39)

The matrix A_h of (10) can be non-diagonal to allow spillover effects, but the matrix B_h must be diagonal to allow a separate estimation of the k equations. In the case of GARCH models with the same structure, Francq and Zakoian (2016) provide the asymptotic properties of the separate (equation-by-equation) estimation of the variance equations. In our empirical analysis, both A_h and B_h are restricted to be diagonal.
The second step maximizes (38) with respect to θ_H2, fixing θ_H1 at the value θ̂_H1 obtained at the first step. The second-step objective function can be written as

QL_{H2}(θ_{H2}) = −(1/2) Σ_{t=1}^T (log |R_t| + û′_t R_t^{−1} û_t − û′_t û_t),  (40)

where û_t = r_t ⊙ ĥ_t^{−1/2}, and ĥ_t means that h_t defined in (10) is evaluated at θ_H1 = θ̂_H1.
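The two objective functions can be sketched as follows; the data are simulated placeholders, and the correlation path is held fixed at the identity only to make the check transparent (the u′_t u_t term is constant given the first step, but keeping it makes the identity-correlation case evaluate exactly to zero):

```python
import numpy as np

def ql_step1(r, h):
    """First-step objective: variance part of the quasi-log-likelihood (R_t = I_k)."""
    return -0.5 * np.sum(np.log(h) + r ** 2 / h)

def ql_step2(u, R_list):
    """Second-step objective: correlation part, at fixed degarched returns u_t."""
    total = 0.0
    for u_t, R_t in zip(u, R_list):
        _, logdet = np.linalg.slogdet(R_t)
        total += logdet + u_t @ np.linalg.solve(R_t, u_t) - u_t @ u_t
    return -0.5 * total

rng = np.random.default_rng(6)
T, k = 50, 3
h = rng.uniform(0.5, 2.0, size=(T, k))    # placeholder fitted conditional variances
r = rng.normal(size=(T, k)) * np.sqrt(h)  # hypothetical daily returns
u = r / np.sqrt(h)                        # degarched returns
R_list = [np.eye(k)] * T                  # placeholder correlation path (identity)
```

In an actual estimation, `ql_step1` is maximized over the variance parameters via the recursion for h_t, and `ql_step2` over (α_r, β_r) via the recursion for R_t, with R̄ and P̄ targeted.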
The two-step estimator is presumably a consistent but inefficient estimator of the parameter θ_H under conditions (such as asymptotic identification, strict stationarity, and ergodicity of the time-series processes) similar to those stated in Engle and Sheppard (2001) for the DCC-GARCH model. The two-step estimation method is numerically tractable for large k if R̄ and P̄ are targeted, i.e., estimated by their empirical counterparts, as proposed below (12). The impact of the targeting is a loss of efficiency. Noureldin et al. (2012) discuss this issue for the BEKK-HEAVY model, which is estimated in a single step. Further work to derive rigorously the asymptotic distribution of the single- and two-step estimators of the DCC-HEAVY model, with and without targeting, is needed but beyond the scope of this paper. In Section 4.3, we provide the results of a simulation study of the properties of the two-step estimator.
We assume that, conditionally on the past information set, RC_t follows a central Wishart distribution of dimension k, and we denote this assumption by

RC_t | F_{t−1} ~ W_k(ν, M_t/ν),  (41)

where ν is the degrees-of-freedom parameter restricted by ν > k − 1. The chosen parameterization implies that E(RC_t | F_{t−1}) = M_t. Using the expression of a Wishart density function, and of M_t in (16), the quasi-log-likelihood function for a sample of T observations, given initial conditions, is

QL_M(θ_M) = −(ν/2) Σ_{t=1}^T [2 log |D_{m,t}| + log |P_t| + tr(P_t^{−1} D_{m,t}^{−1} RC_t D_{m,t}^{−1})],  (42)

where θ_M is the vector of the parameters that appear in (17) and (18), and D_{m,t} stands for Diag(m_t^{1/2}). In the above expression, terms that depend on ν but do not depend on θ_M are not included. The parameter ν is considered a nuisance parameter that can be neglected to estimate θ_M; practically, it can be set to unity without loss of information, since the score for θ_M is proportional to the value of this parameter.
As shown by Bauwens et al. (2012), the Wishart assumption provides a quasi-likelihood function, which can serve as an objective function to get a single-step estimator. They also show that the DCC-HEAVY model part for the realized covariance matrix can be estimated in two steps. The parameter vector θ_M is split into θ_M1 for the parameters of the realized variance model and θ_M2 for the parameters of the realized correlation model. We denote by QL_M1 the quasi-log-likelihood where P_t in (42) is replaced by the identity matrix and ν is set to unity:

QL_{M1}(θ_{M1}) = −(1/2) Σ_{t=1}^T Σ_{i=1}^k [log m_{it} + RC_{ii,t}/m_{it}].  (43)

This estimation is split into k separate estimations when the matrices A_m and B_m of (17) are restricted to be diagonal.
We denote by QL_M2 the quasi-log-likelihood where θ_M1 is fixed at the value θ̂_M1 obtained at the first step:

QL_{M2}(θ_{M2}) = −(1/2) Σ_{t=1}^T [log |P_t| + tr(P_t^{−1} D̂_{m,t}^{−1} RC_t D̂_{m,t}^{−1})],  (44)

where D̂_{m,t} means that D_{m,t} is evaluated at θ_M1 = θ̂_M1. The parameter vector θ_M2 includes α_p, β_p, and the elements of P̄. The latter can be targeted by the unconditional mean of the realized correlations, as discussed after (18), in which case the second-step maximization is done with respect to two parameters and is therefore feasible for large k.

Properties of the QML estimator by simulation
Shephard and Sheppard (2010) provide the formula of the asymptotic Gaussian distribution and its covariance matrix for the QML estimator of the univariate HEAVY model, with and without targeting. Noureldin et al. (2012) provide the same type of result for the BEKK-HEAVY model. The asymptotic theory of Shephard and Sheppard (2010) is relevant for the estimators of the parameters of the variance equations of the DCC-HEAVY model, since these equations are univariate models. A complete asymptotic distribution theory for the single-step and two-step QML estimators of θ_H and θ_M of the DCC-HEAVY model, with or without targeting, is beyond the scope of this paper. Some results of a simulation study are reported below, focusing on the bias, the root mean squared error, and the normality of the sampling distribution of the two-step estimator in the case of targeting, as defined in the previous subsections and used in Section 5. The parameters of the process from which the data are generated are taken from the empirical results reported in the next section. The data-generating process is defined by (16)-(18) for the M_t part: once m_t is computed as in (17) and P_t as in (18), M_t is computed as defined by (16). Next, a realized covariance matrix RC_t is simulated according to (41), where ν is set to 50. The corresponding diagonal v_t and correlation matrix RL_t are the ''observed'' (simulated) data for date t and are also used as inputs to compute m_{t+1}, P_{t+1}, h_{t+1}, and R_{t+1}. For the H_t part, r_t is simulated as N_k(0, H_t), where H_t is computed as defined by (7); for this, h_t is computed as in (10) and R_t as in (11)-(12). The dimension k is equal to 29, as in the empirical application of Section 5.
A sample of T observations is generated and used for estimation; we set T = 2000 and 4000 to illustrate the impact of increasing the sample size. The larger sample size of 4000 is close to the sample size of the application in Section 5, and the smaller is close to the sample size of the application reported in Section F of the supplemental appendix. For each T, the data simulation and the parameter estimation are repeated 1000 times, delivering 1000 estimates of each parameter.
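The simulation of the realized covariance measure can be sketched as follows. This is a minimal illustration, assuming that (41) draws RC_t as a normalized Wishart with ν = 50 degrees of freedom and conditional mean M_t; the exact parameterization of (41) is not reproduced in this section, so treat the code as an assumption rather than the paper's implementation.

```python
import numpy as np

def simulate_rc(M_t, nu=50, rng=None):
    """Simulate a realized covariance matrix with conditional mean M_t.

    Assumption: (41) corresponds to a normalized Wishart draw,
    RC_t = X'X / nu with the rows of X i.i.d. N_k(0, M_t),
    so that E[RC_t | M_t] = M_t.
    """
    rng = np.random.default_rng(rng)
    L = np.linalg.cholesky(M_t)
    X = rng.standard_normal((nu, M_t.shape[0])) @ L.T  # nu draws from N_k(0, M_t)
    return X.T @ X / nu

def decompose_rc(RC):
    """Split RC_t into realized variances v_t and the realized correlation matrix RL_t."""
    v = np.diag(RC).copy()
    d = 1.0 / np.sqrt(v)
    return v, RC * np.outer(d, d)

# k = 2 toy example (k = 29 in the paper's simulation)
M = np.array([[1.0, 0.3],
              [0.3, 2.0]])
RC = simulate_rc(M, nu=50, rng=12345)
v, RL = decompose_rc(RC)
```

The diagonal v_t and the matrix RL_t produced by `decompose_rc` are the "observed" inputs that feed the recursions for m_{t+1} and P_{t+1}.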
Table 1 provides a synthetic view of the simulation results. The relative bias (RB) and root mean squared error (RMSE) are defined in (45) for the estimator φ̂ of a parameter φ (an element of θ) with true value φ_0, estimated by φ̂_s for the sth simulated dataset. The RMSE values indicated in the table must be divided by 100 to obtain the actual values. Some comments about the results follow.
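As a concrete illustration of these summary statistics (the exact definitions are given in (45), which is not reproduced here, so the formulas below are an assumption consistent with the paper's use of relative *median* biases in percent):

```python
import numpy as np

def relative_bias(estimates, true_value):
    """Relative median bias in percent: 100 * (median - true) / true.
    Sketch only; the paper's exact definition is in its Eq. (45)."""
    return 100.0 * (np.median(estimates) - true_value) / true_value

def rmse(estimates, true_value):
    """Root mean squared error over the S simulated estimates."""
    estimates = np.asarray(estimates)
    return float(np.sqrt(np.mean((estimates - true_value) ** 2)))

# toy example: five simulated estimates of a parameter with true value 0.10
draws = [0.095, 0.101, 0.099, 0.103, 0.098]
rb = relative_bias(draws, 0.10)  # median 0.099, so about -1 percent
err = rmse(draws, 0.10)
```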
The RMSE values and the relative biases decrease in absolute value (with two minor exceptions for the RMSE) as T increases, which indicates the likely consistency of the estimator. Each RMSE for T = 4000 is approximately equal to (actually, a bit smaller than) the corresponding one for T = 2000 divided by the square root of 4000/2000, which indicates that convergence occurs at the rate √T. The Jarque-Bera normality tests of the sampling distributions of the correlation parameter estimators indicate that normality is not rejected (except for α_r and β_r at T = 2000). In brief, the results indicate that consistency and asymptotic normality are likely properties of the QML estimator; notice that the QML estimator is actually an ML estimator, since the estimated model is correctly specified in the simulation study.
For the estimators of the parameters of the variance equations, the relative median biases and RMSEs are generally small in absolute value; for T = 4000, the largest median bias is −1.73%. The results for the estimator of each equation are in Tables D1 and D2 of Section D of the supplemental appendix. For the correlation parameter estimators, the largest bias is −1.53% at T = 4000, and the other biases are smaller than 1%. The RMSEs are small and of the same order of magnitude as the standard errors of the estimates of Section 5: for example, for α̂_h, the RMSE is equal to 0.0029 in Table 1 for T = 4000, and the standard error is equal to 0.0065 (based on Table 3) for T = 4318.

Data description
High-frequency data for 29 stocks belonging to the Dow Jones Industrial Average (DJIA) index are used; the 30th stock was dropped, since it was not permanently in the index during the sample period. The sample period is 03/01/2001 to 16/04/2018. The daily realized covariance matrices are computed as explained at the beginning of Section 2, using five-minute returns. The synchronization of the intraday prices of the 29 stocks was done using five-minute intervals: the price closest (from the left) to the respective sampling point was taken, and the first and last 15 minutes of the trading day (9:30-16:00) were excluded, so that m = 72 in (1).
Descriptive statistics are provided in Table E1 of the supplemental appendix. For each stock, it reports the time-series averages and standard deviations of its squared returns and realized variances, and the means and standard deviations of the time-series averages of its realized covariances and correlations with the other 28 stocks. Each average realized variance does not account for the overnight variation and is therefore a fraction (in most cases 50 to 60 percent) of the corresponding average squared return.
Fig. 1 shows a representative example of time-series plots of the realized variances of two stocks (JPM and XOM) and the corresponding realized covariances and correlations.
The focus of the empirical application is a forecasting comparison of the conditional covariance and correlation matrices, and the conditional variances, of the 29 stocks using a set of models. Before reporting the results of the comparisons, estimation results are reported for the DCC and DECO models. Table 2 presents summary statistics (median, minimum, and maximum) of the first-step parameter estimates (except the constant terms) of the 29 variance equations of the GARCH, GARCHX, HEAVY-h, and HEAVY-m models. The estimates for each stock, and the associated robust t-statistics, are given in Table E2 of the supplemental appendix. The estimates of the B_h parameters in the HEAVY-h model are smaller than in the GARCH model (median of 0.481 versus 0.916), while the estimates of the A_h parameters are much larger (medians of 0.781 versus 0.072). The influence of these differences is visible in the bottom panel of Fig. 2, which shows (in logs) the time series of the average (over the 29 stocks) of the corresponding fitted conditional variances. The GARCH path is smoother than the HEAVY-h path, and the latter fluctuates locally more strongly than the former, responding faster to recent changes in volatility. Similar differences occur for each stock. The GARCH parameter estimates are much more homogeneous across the different stocks than the HEAVY-h estimates. The HEAVY-m parameter estimates are more similar to the HEAVY-h than to the GARCH estimates, though they are much more homogeneous than for HEAVY-h.

Estimation results for the full period
In the nesting GARCHX model, the coefficient (in A_h) of the lagged squared return is set to zero for eight stocks out of 29, because a non-negativity constraint is imposed and is binding; for these stocks, the GARCHX estimates are identical to those of HEAVY-h. For the 21 other stocks, the estimate is positive (between 0.002 and 0.068), with t-statistics below 1.5 in 12 cases (out of 21) and larger than 2 in four cases. Even though the t-statistic has a nonstandard distribution, since the null hypothesis of zero is on the boundary of the parameter space, these results suggest that for almost all stocks the estimate is not significant. On the contrary, the estimate of the lagged realized variance coefficient (in A_mh) is positive (between 0.103 and 1.748, with 0.698 as the median value), and the associated t-statistics are usually large enough to suggest significance (only three are below 2.5). The log-likelihood gain of GARCHX over HEAVY-h is equal to 5 for the 21 additional parameters, and hence appears to be minor. On the contrary, the gains of GARCHX and HEAVY-h over GARCH are substantial (41 and 36, respectively). Notice that GARCH and HEAVY-h are not nested, but they have the same number of parameters, so choosing between them using their log-likelihood values is equivalent to a choice based on model choice criteria. In brief, these results suggest that the conditional variance dynamics are better captured by the lagged realized variance than by the lagged squared return, confirming the findings of Shephard and Sheppard (2010).
Table 2 also reports the smallest and largest eigenvalues of the matrix A + B (see Section 3.1) corresponding to each model (GARCH, GARCHX, and HEAVY). The largest eigenvalue is smaller than 1, and hence the covariance stationarity condition is satisfied for the variance equations of each model. For the HEAVY and GARCHX models, the largest eigenvalue comes from the M_t part of the model. Table 3 presents the second-step parameter estimates of the correlation models: DCC-GARCH, DCCX-GARCH, DCC-HEAVY-R (Eqs. (11)-(12)), DCC-HEAVY-P (Eq. (18)), and the corresponding DECO versions. For the DCC models, these estimates are broadly in line with those of the variance equations. The estimate of β_r in the DCC-HEAVY-R model is smaller than that of β_q in the DCC-GARCH model (0.869 versus 0.988), and the estimate of α_r is larger than that of α_q (0.068 versus 0.003), implying less smooth and more reactive fitted correlations. The paths of the average fitted correlations and covariances of the three models are shown in Fig. 2. The paths of the DECO-HEAVY-R model are much less smooth and more reactive to recent information than those of DCC-HEAVY-R, due to a smaller estimate of β_r (0.552 versus 0.869) and a larger one of α_r (0.447 versus 0.068). The paths for DCC-GARCH are smoother and less reactive than in the HEAVY models. The described path differences are stronger for each pair of stocks, since averaging reduces the variability.
In the nesting DCCX- and DECOX-GARCH models, the coefficient estimates of the lagged realized correlation (0.068 and 0.447) are of the same magnitude as in the DCC- and DECO-HEAVY-R models (0.061 and 0.369), with large t-statistics (9.40 for DCCX and 5.09 for DECOX). The coefficient estimate of the lagged return cross-product is very close to zero (with a t-statistic of 10.59) in DCCX, while it is equal to 0.029 (with a t-statistic of 1.94) in DECOX. The maximized second-step log-likelihood values of DCCX-GARCH (−5115) and DCC-HEAVY (−5119) are slightly different, but both are much larger than the value of DCC-GARCH (−5143). For the DECO models, DECO-HEAVY (−5307) and DECOX-GARCH (−5306) are very close, but DECO-GARCH (−5319) is lower. Thus, lagged realized correlations can be considered more important drivers of the conditional correlations than return cross-products.
The maximized log-likelihood values and their decomposition into the variance and correlation parts are reported in Table 3. The decompositions suggest that DCC-HEAVY and DECO-HEAVY dominate the DCC-GARCH model in both the variance and correlation parts. However, most of the in-sample gain comes from the variance part. The overall improvement is substantial. DCC-HEAVY improves slightly more than DECO-HEAVY. Notice that these models have the same number of parameters, so comparisons using log-likelihood values are equivalent to comparisons using model choice criteria.

Forecasting comparisons
A comparison of models can be made by evaluating the in-sample and out-of-sample forecasting performances
of the models using the model confidence set (MCS) of Hansen, Lunde, and Nason (2011). An MCS identifies the set of models having the best forecasting performance at a chosen confidence level, based on a loss function. Six models are compared: DCC-GARCH, DCC-HEAVY, DECO-GARCH, DECO-HEAVY, BEKK-GARCH, and BEKK-HEAVY. Out-of-sample s-step-ahead forecasts of the 29-dimensional covariance and correlation matrices are computed for s = 1, 5, and 22; for horizons 5 and 22, they are iterated forecasts. For DECO models, the correlations are computed from DECO itself, not from the underlying DCC.

Loss functions
Statistical and economic loss functions are adopted along the lines proposed by Becker, Clements, Doolan, and Hurn (2015). For the covariance matrix forecasts, two statistical loss functions are used, which compare the covariance matrix forecasts with the actual (unobserved) covariance matrix Σ_{t+s}. The first one (QLIK), given in (46), is based on the negative of the Wishart log-density function, where H^a_{t+s|t} denotes the s-step forecast using model a conditional on time-t information. The second loss function (FN), defined in (47), is based on the square of the Frobenius norm of the difference between the forecast and benchmark matrices (see, e.g., Golosnoy et al. (2012)). Since Σ_{t+s} is unobservable, the observed realized covariance matrix RC_{t+s} is used as a proxy for it. These statistical loss functions provide a consistent ranking of volatility models in the sense of Patton (2011) and Patton and Sheppard (2009), as they are robust to noise in the proxy; see also Laurent, Rombouts, and Violante (2013).
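To make the two statistical losses concrete, the following sketch implements a matrix QLIK loss (equivalent, up to constants that do not involve the forecast and hence do not affect rankings, to the negative Wishart log-density of (46)) and the squared Frobenius norm loss of (47); the realized covariance matrix serves as the proxy for Σ_{t+s}. The exact constants in (46) are not reproduced here, so this is an assumption about its form.

```python
import numpy as np

def qlik_loss(H_fc, RC):
    """QLIK-type loss: log|H| + tr(H^{-1} RC).
    Assumed equivalent, up to forecast-free constants, to the negative
    Wishart log-density used in the paper's (46)."""
    _, logdet = np.linalg.slogdet(H_fc)
    return float(logdet + np.trace(np.linalg.solve(H_fc, RC)))

def fn_loss(H_fc, RC):
    """Squared Frobenius norm of the forecast error matrix, as in (47)."""
    return float(np.sum((H_fc - RC) ** 2))

# toy example: a 2x2 covariance forecast evaluated against a realized proxy
H = np.array([[1.0, 0.2], [0.2, 1.0]])
RC = np.array([[1.1, 0.3], [0.3, 0.9]])
ql = qlik_loss(H, RC)
fn = fn_loss(H, RC)
```

A useful property of both losses is that they are minimized when the forecast equals the proxy, which underlies the consistent-ranking result cited above.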
For the correlation matrix forecasts, the QLIK and FN losses are computed from the same formulas as for covariances, that is, (46) and (47). The only difference is that forecasted correlations and realized correlations replace forecasted covariances and realized covariances, respectively.
Once a time series of T_{h,s} (covariance or correlation) forecasts is obtained for a model, the corresponding losses and their time-series average, defined by (48), are computed. This is performed for each model and each forecast horizon, so that models can be ranked by the MCS procedure and an MCS at a chosen confidence level can be identified.
For the variance forecasts of the different models, we use the univariate MSE and QLIK loss functions, where v_{i,t+s} is the observed realized variance of stock i at date t + s, and h^a_{i,t+s|t} is the corresponding s-step forecast of model a based on information available at date t. Once the time-series average of each loss function has been computed for each stock, the mean across stocks is taken and used in the MCS procedure. The economic loss functions are relevant for the covariance matrix forecasts. They are based on forecasted portfolio performances. The same economic loss functions as in Engle and Kelly (2012) are used: the global minimum variance portfolio (GMV) and the minimum variance portfolio (MV); see also Engle and Colacito (2006). These loss functions are based on the variances of the forecasts of portfolio returns; a superior model produces optimal portfolios with lower forecast variance. Given a covariance matrix forecast H^a_{t+s|t}, the GMV portfolio weight vector ŵ^a_{t+s} is computed as the minimizer of the portfolio variance (w^a_{t+s})′ H^a_{t+s|t} w^a_{t+s} subject to the constraint that the weights add to unity. Once this is done for each forecast date, the GMV loss function, defined in (52), is the average of the portfolio variances over the forecast period. The MV portfolio is obtained by minimizing the portfolio variance subject to the additional constraint that the expected portfolio return be larger than a chosen value. Following Engle and Kelly (2012), this value is fixed at q = 10% and the expected portfolio return (µ) at the mean of the data. The MV loss is defined like (52), but with the optimal weight vectors corresponding to the MV minimizations. The optimal GMV and MV weights are analytically known functions of H^a_{t+s|t}, and of µ and q for MV (see, e.g., Engle & Kelly, 2012).
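The GMV minimization has the well-known closed form w = H⁻¹ι / (ι′H⁻¹ι), where ι is a vector of ones. The following sketch implements this standard result (it is not code from the paper):

```python
import numpy as np

def gmv_weights(H):
    """Closed-form GMV weights: minimize w'Hw subject to sum(w) = 1,
    giving w = H^{-1} 1 / (1' H^{-1} 1)."""
    ones = np.ones(H.shape[0])
    x = np.linalg.solve(H, ones)
    return x / (ones @ x)

def portfolio_variance(w, H):
    """Ex-ante portfolio variance w'Hw given the covariance forecast H."""
    return float(w @ H @ w)

# toy 2-asset covariance forecast
H = np.array([[0.04, 0.006],
              [0.006, 0.09]])
w = gmv_weights(H)          # sums to one, tilted toward the low-variance asset
var_gmv = portfolio_variance(w, H)
```

Averaging `portfolio_variance` over the forecast dates yields the GMV loss of (52); the MV variant adds the expected-return constraint and is likewise available in closed form.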

Results
To compute out-of-sample forecasts, each model is re-estimated every fifth observation based on rolling sample windows of 3000 observations, resulting in a total of T_h = 1318 out-of-sample forecasts for s = 1, 1314 for s = 5, and 1297 for s = 22. Table 4 reports the out-of-sample forecast losses, defined by (48), and the economic loss functions for the different models. The boldface values identify the models that belong to the 90% model confidence set (MCS90, hereafter) for each loss function when the comparison is limited to the six symmetric models mentioned in the first paragraph of this subsection. The comparisons including the four asymmetric models in the table are presented in Section 5.4, and the MCS ranks in the table pertain to the comparisons of all models. At forecast horizons 1 and 5, DCC-HEAVY belongs to the MCS90 for all loss functions, DECO-HEAVY belongs to it for a subset of loss functions, and all the other models are out of each MCS90. At horizon 22, no HEAVY model is in the MCS90 of FN and QLIK; the MCS90 includes DCC-GARCH and DECO-GARCH for FN, and DCC-GARCH and BEKK-GARCH for QLIK. For the GMV and MV loss functions, the MCS90 consists of DCC-HEAVY and DECO-HEAVY.
Table 5 reports the MCS90 sets for correlation and variance forecasts separately, excluding the BEKK models for these comparisons, and using only the statistical loss functions, since the economic loss functions use the covariance matrix.
For correlations, DCC-HEAVY is in the MCS90 of both loss functions at the three horizons.DECO-HEAVY is also in the MCS90 of FN at the three horizons, and DECO-GARCH is in the QLIK MCS90 set at horizon 22.
For variances, the loss values defined by (49) are reported in the ''Variance'' part of Table 5. Notice that the DECO models are irrelevant here, being identical to DCC in the first step of estimation. The results reveal that DCC-HEAVY is alone in the MCS90 for both MSE and QLIK at horizons 1 and 5. At horizon 22, only DCC-GARCH is in both MCS90 sets. Shephard and Sheppard (2010) similarly report that the performance of HEAVY relative to GARCH deteriorates as the forecast horizon increases.
To trace where the forecast gains occur, Table 6 reports the ratios between the losses of the DCC-HEAVY and DCC-GARCH models, and likewise for the DECO models. The DCC/DECO-HEAVY models outperform the DCC/DECO-GARCH models in the covariance, correlation, and variance forecast losses at forecast horizons 1 and 5 (with a single exception for DECO). The improvements are larger for horizon 1 than for horizon 5, and smaller for DECO than for DCC (except in one case). They can be substantial: e.g., DCC-HEAVY reduces the covariance and variance QLIK losses by at least 10% and up to 25%. At horizon 22, the GARCH models have the smallest losses, except for correlations, where the differences are less than 2.5%. The loss improvements of GARCH with respect to HEAVY are between 7% and 17% when they occur.
In brief, the forecast comparisons clearly favor the DCC-HEAVY model at the short forecast horizons for all loss functions, and to a lesser extent the DECO-HEAVY model, at the expense of the BEKK-HEAVY and the three GARCH models. At the longest forecast horizon, the results depend on the loss function; moreover, DCC-HEAVY performs clearly worse than DCC-GARCH for variance and covariance forecasts at this horizon.
The in-sample forecasting results are reported in the supplemental appendix (Tables E3-E4).They do not differ much from the out-of-sample results.

Asymmetric and HAR terms in DCC/DECO-HEAVY
The out-of-sample forecast performance of the DCC-HEAVY model deteriorates with respect to DCC-GARCH as the forecast horizon increases, switching from better to worse at some horizon between 5 and 22. DCC-HEAVY makes use of forecasts of realized variances and correlations. If the realized variance or correlation equations are incorrectly specified, the forecast error propagates to the conditional covariance and correlation forecasts, and these errors grow as the forecast horizon lengthens. To improve the forecast performance of DCC-HEAVY in multiple-step-ahead forecasts, the ADCC-HEAVY model defined by (32)-(34)-(35)-(36) is worth trying. The HAR terms are useful to capture the long-memory feature of realized variances and correlations.

Estimation results
The parameter estimates of the asymmetric variance equations are reported in the supplemental appendix (Tables E5 and E6). For the conditional variances, the coefficient estimates of the asymmetric term (the diagonal elements of Γ_h in (32)) are positive for 22 stocks, with t-statistics above 2 for eight of them. For the realized variances, the coefficient estimates of the asymmetric term (the diagonal elements of Γ_m in (35)) are all positive, with t-statistics larger than 2.5 for 28 stocks. The coefficient estimates of the weekly HAR term are positive (with a single, insignificant exception), but only 14 of them have t-statistics above 2, indicating a moderate impact of this term. On the contrary, the monthly HAR term is positive for all stocks and appears to be strongly significant (with one exception).
The estimates of the asymmetric correlation equations are reported in Table 7. For the conditional correlations, the estimate of γ_r is positive in the ADCC (0.025) and highly significant (t-statistic of 6.51). This means that the impact of each lagged realized correlation on the next conditional correlation is stronger when both lagged returns of the corresponding assets are negative (being equal to 0.062 + 0.025) than otherwise (being then equal to 0.062). Nevertheless, the additional statistically significant impact of 0.025 is not very important for the next-period correlation, since it increases it by 2.5% of the previous-period correlation (if both lagged returns are negative): this means an increase of 0.0125 if the previous correlation is 0.5. For the ADECO model, the impact is 2% and statistically insignificant.
For the realized correlations, the estimate of γ_p is positive and significant in both models (HAR-ADCC and HAR-ADECO), which means that the impact of each lagged realized correlation on the next conditional mean of the realized correlation is stronger when both lagged returns of the corresponding assets are negative (being equal to 0.063 + 0.010 in the HAR-ADCC) than otherwise (being then equal to 0.063). The same remark as above applies about the limited size of the expected correlation change this implies between two consecutive days.
Concerning the HAR terms, the weekly term impact is negative (−0.033) and significant in HAR-ADCC, and positive (0.075) and insignificant in HAR-ADECO. The monthly term impact is negative in HAR-ADCC but positive in HAR-ADECO, and significant in both models. In any case, the effective daily changes in expected correlations that these terms imply are small in HAR-ADCC and moderate in HAR-ADECO.

Forecast comparisons
For covariance forecasts, Table 4 reports the loss function values for the four asymmetric models (ADCC-HEAVY, ADECO-HEAVY, ADCC-GARCH, and ADECO-GARCH) in addition to the six symmetric models. The MCS90 for each loss function and horizon is identified by the underlined values; a bold underlined value is thus in the MCS90 for the 10 models and in that of the first six models, while a bold value that is not underlined is in the MCS90 of the first six models but is removed from the MCS90 when the 10 models are compared.
The changes in the MCS90 due to the inclusion of the four asymmetric models in the comparisons are in favor of the HEAVY models: the asymmetric HEAVY models are added to the MCS90 of some loss functions and horizons, but no asymmetric GARCH model is added for any loss function and horizon. For example, at horizons 5 and 22, ADCC-HEAVY is in the MCS90 sets for all loss functions; only the ADECO-HEAVY model also belongs to these sets, in the case of the FN loss. The asymmetric HEAVY models attenuate or even reverse the worse performance of their symmetric counterparts relative to DCC-GARCH at the three horizons, as can be seen by comparing the covariance loss ratios reported in Tables 8 and 9 with the corresponding values in Table 6.
For correlation forecasting, Table 5 also includes the asymmetric models in the comparisons. The resulting changes in the MCS90 are mixed: at horizon 1, only the asymmetric versions of the HEAVY models that were in the initial MCS90 are added; at horizon 5, ADECO-HEAVY is added for both losses and ADCC-HEAVY for the QLIK loss; at horizon 22, ADCC-HEAVY is added for the FN loss, being alone in the MCS90, while for the QLIK loss, ADCC-HEAVY, ADECO-GARCH, and ADECO-HEAVY are added, DCC-HEAVY is removed, and DECO-GARCH is kept. The correlation loss ratios reported in Tables 8 and 9 differ only slightly from those of Table 6, indicating that the asymmetric versions of the models do not much improve on the symmetric versions.
For variance forecasting (Table 5), the main change in the MCS90 due to the inclusion of the asymmetric models is that ADCC-HEAVY is added for both loss functions and all horizons, and it is even the only model in the sets at horizons 5 and 22. This model ranked first in all comparisons. Clearly, ADCC-HEAVY improves variance forecasting, especially at horizons larger than 1. The loss ratios show that this improvement is important at horizon 22; for example, for the MSE loss, it goes from −16.6% in Table 6 to +14.7% in Table 8 and +22.7% in Table 9.

Conclusions
Multivariate volatility models that specify the dynamics of the daily conditional covariance matrix as a function of realized covariances have emerged in the literature since 2012. They are a valuable alternative to multivariate GARCH models, wherein the dynamics depend on lagged squared returns and their cross-products, because realized variances and covariances are more precise measures of daily volatility.
Perhaps surprisingly, with the partial exception of Braione (2016), no dynamic conditional correlation formulation of a HEAVY model had been proposed in the literature, BEKK-type formulations being used in the papers of Noureldin et al. (2012) and Opschoor et al. (2018). Our contribution fills this gap by developing DCC-type HEAVY models. Such models have the advantage, with respect to BEKK models, of separating the specification of the conditional variances from the specification of the conditional correlations. The same advantage occurs in the specifications of the expected realized variances and correlations. As with DCC-GARCH models, this results in more flexible models, in the sense that the dynamics of the variances differ between assets and from the dynamics of the correlations. By sticking to scalar models for the correlations, the models remain parsimonious in parameters.
An empirical application to 29 assets illustrated the value of the flexibility of the DCC and DECO versions of the HEAVY models. These models, including extensions with asymmetric effects, have superior forecasting performance with respect to the BEKK formulations. As always in this type of empirical exercise, this finding is contingent on the dataset used and cannot be claimed to be valid in general. A robust conclusion is that the HEAVY models dominate their GARCH counterparts in terms of forecasting performance. An additional application to the data of Noureldin et al. (2012) is reported in Section F of the supplemental appendix and broadly confirms the empirical conclusions.
This research did not develop the asymptotic theory of the QML estimator of the parameters of the DCC- and DECO-HEAVY models. A small-scale simulation study illustrated that the usual properties of consistency and asymptotic normality are likely to hold.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Fig. 2 .
Fig. 2. The figure shows the pairwise averages of the log-covariances and correlations estimated in the second step with DCC-GARCH and DCC/DECO-HEAVY-R, over the sample period 03/01/2001-16/04/2018. The bottom graph shows the average of the log-variances of the 29 stocks, estimated in the first step of DCC-GARCH and DCC-HEAVY.

Table 1
Simulation results for the two-step QMLE of the DCC-HEAVY parameters. Med: median; RB: relative bias (in percentages); RMSE: root mean squared error; see the definitions in (45). For the variance parameter vectors (A_h, B_h, A_m, B_m), the values are the minimum, median, and maximum of the RB and RMSE across the 29 parameters in each vector. N. test: p-value of the Jarque-Bera normality test of the sampling distribution. The reported values are obtained from 1000 simulated samples of the data-generating process defined in Section 4.3.

Table 2
Summary of parameter estimates of the variance equations for the 29 stocks. (Med, Min, Max) are the summary statistics of the estimates; λ_max and λ_min are, respectively, the largest and smallest eigenvalues of the corresponding A + B matrix; see (23)-(26) for their definition for the GARCH, GARCHX, and HEAVY models. All estimates are provided in Table E2 of the supplementary appendix. For A_h of GARCHX, the statistics are for the 21 non-zero estimates (8 estimates are equal to 0).

Table 3
Parameter estimates (and robust t-statistics) of the correlation equations. The HEAVY Gains are the differences in log-likelihood between the DCC/DECO-HEAVY and the DCC/DECO-GARCH models.

Table 4
MCS for loss functions of out-of-sample covariance forecasting. Values of loss functions in bold identify models in the 90% level MCS when the comparison is limited to the first six models. Underlined values identify models in the 90% level MCS when the comparison is done for all models. A value in bold but not underlined is thus in the MCS of the first six models, but is excluded when considering all models. The MCS rankings are for the global comparison.


Table 5
MCS for loss functions of out-of-sample correlation and variance forecasting. Values of loss functions in bold identify models in the 90% level MCS when the comparison is limited to the first six models. Underlined values identify models in the 90% level MCS when the comparison is done for all models. A value in bold but not underlined is thus in the MCS of the first six models, but is excluded when considering all models. The MCS rankings are for the global comparison.

Table 6
Loss ratios between DCC-HEAVY and DCC-GARCH: out-of-sample forecasting.

Table 7
Parameter estimates (and robust t-statistics) of the asymmetric correlation equations.

Table 8
Loss ratios between ADCC-HEAVY and DCC-GARCH: out-of-sample forecasting. For the Variance panel, the results of DCC and DECO are identical.

Table 9
Loss ratios between ADCC-HEAVY and ADCC-GARCH: out-of-sample forecasting. For the Variance panel, the results of DCC and DECO are identical.