Infinite-order, long-memory heterogeneous autoregressive models

https://doi.org/10.1016/j.csda.2013.08.009

Abstract

We develop an infinite-order extension of the HAR-RV model, denoted by HAR(∞). We show that the autocorrelation function of the model is algebraically decreasing and thus the model is a long-memory model if and only if the HAR coefficients decrease exponentially. For a finite sample, a prediction is made using coefficients estimated by ordinary least squares (OLS) fitting for a finite-order model, HAR(p), say. We show that the OLS estimator (OLSE) is consistent and asymptotically normal. The approximate one-step-ahead prediction mean-square error is derived. Analysis shows that the prediction error is mainly due to estimation of the HAR(p) coefficients rather than to errors made in approximating HAR(∞) by HAR(p). This result provides a theoretical justification for wide use of the HAR(3) model in predicting long-memory realized volatility. The theoretical result is confirmed by a finite-sample Monte Carlo experiment for a real data set.

Introduction

The volatility of financial data is an important component of financial markets in both practical applications and theoretical studies. The long-memory property of volatility, meaning that historical volatility has a persistent impact on future volatility, is important in investment decision-making. Thus, the time series model for financial data should reflect the long-memory property. While the FIGARCH and ARFIMA models can be used for long-memory empirical analysis, such fractional integration models lack a clear economic interpretation.

To describe the long memory of volatility, Corsi (2004, 2009) proposed an additive cascade model containing volatility components defined for different time periods, called the heterogeneous autoregressive model of realized volatility (HAR-RV), that has three heterogeneous volatility components. Inspired by the HARCH model of Müller et al. (1997) and Dacorogna et al. (1998), the HAR-RV (called HAR hereafter) model is consistent with the heterogeneous market hypothesis and with the asymmetric propagation of volatility between long- and short-time horizons, and has different volatility components generated by the actions of different types of market participants. Although the HAR model is not formally a long-memory model, long-memory behavior results from the sum of volatility components constructed for different time horizons. Further long-memory aspects of realized volatility have been discussed by Raggi and Bordignon (2012) in terms of the Markov switching approach.

The HAR model has been successfully used in forecasting realized volatility. Ghysels et al. (2006) and Forsberg and Ghysels (2007) compared HAR with the MIDAS model. Andersen et al. (2007) used the HAR model for prediction of the volatility of stock prices, foreign exchange rates, and bond prices. Corsi et al. (2008) showed that non-Gaussianity and time-varying volatility of reduced-form RV models, such as ARFIMA and HAR, might be partly attributable to time variations in the volatility of the RV estimator. McAleer and Medeiros (2008) proposed a model called heterogeneous autoregression with multiple-regime smooth transition as an extension of the HAR model. This model contains long memory and nonlinearity, and incorporates sign and size asymmetries. Motivated by the use of HAR models in practice, Craioveanu and Hillebrand (2009) provided a critical review of the advantages of HAR models for daily RV. Hillebrand and Medeiros (2010) considered log-linear and neural network HAR models of realized volatility. They also applied bagging, a data mining technique, to realized volatility. Tang and Chi (2010) addressed test methods and models for long memory and found that the HAR model showed better predictive ability than the ARFIMA-RV model. Other uses of HAR models include risk management with VaR measures (Clements et al., 2008), risk-return tradeoff (Bollerslev et al., 2009), serial correlation (Bianco et al., 2009), implied volatility (Busch et al., 2011), and realized volatility errors (Asai et al., 2012).

In the HAR model of Corsi (2009), a hierarchical model is considered with three volatility components corresponding to time horizons of 1 day (1d), 1 week (1w), and 1 month (1m). The HAR(3)-RV time series representation of the proposed cascade model can be written in the form
$$RV_{t+1}^{(d)} = c + \beta^{(d)} RV_t^{(d)} + \beta^{(w)} RV_t^{(w)} + \beta^{(m)} RV_t^{(m)} + \omega_{t+1},$$
where $RV_t^{(d)}$ is the realized variance on day $t$, and $RV_t^{(w)}$ and $RV_t^{(m)}$ are moving averages given by
$$RV_t^{(w)} = \frac{1}{5}\left(RV_t^{(d)} + RV_{t-1}^{(d)} + \cdots + RV_{t-4}^{(d)}\right), \qquad RV_t^{(m)} = \frac{1}{22}\left(RV_t^{(d)} + RV_{t-1}^{(d)} + \cdots + RV_{t-21}^{(d)}\right).$$
Therefore, the HAR(3) model can be expressed as an AR(22) model. By adding lags of $RV_t^{(d)}$ up to lag 21, the model captures the long-memory properties of RV in a parsimonious way, but is theoretically a short-memory model.
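The AR(22) representation can be made concrete by spreading each HAR(3) slope over the lags its moving average covers. The following is a minimal sketch, not taken from the paper: the function name `har3_to_ar22` and the slope values are hypothetical and chosen only for illustration.

```python
import numpy as np

def har3_to_ar22(beta_d, beta_w, beta_m):
    """Return the 22 AR coefficients implied by the three HAR(3) slopes.

    Entry k-1 multiplies the daily RV observed k days before the
    forecast target, for k = 1, ..., 22.
    """
    phi = np.full(22, beta_m / 22.0)   # monthly average covers lags 1..22
    phi[:5] += beta_w / 5.0            # weekly average covers lags 1..5
    phi[0] += beta_d                   # daily component covers lag 1 only
    return phi

# Hypothetical slope values, used only to illustrate the mapping.
phi = har3_to_ar22(beta_d=0.4, beta_w=0.3, beta_m=0.2)
print(phi[:6])
print(phi.sum())   # equals beta_d + beta_w + beta_m, since each average has total weight 1
```

The implied AR coefficients take only three distinct values, which is the sense in which HAR(3) is a restricted, parsimonious AR(22).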

It is important to study the prediction performance of the HAR(3) model of Corsi (2009) when the data-generating process has long memory. For this purpose, we extend the HAR(3) model to long memory and develop an asymptotic estimation theory. As noted by Corsi (2009), the HAR(3) model has short memory because it is an AR(22) model. To obtain the long-memory property, an infinite-order model, HAR(∞), is proposed. We present stationarity conditions for the HAR(∞) model and give necessary and sufficient conditions for the long-memory property. The essential part of these conditions is that the HAR(∞) coefficients decay exponentially.

Poskitt (2007) and Baillie and Kapetanios (2009) suggested the use of high-order autoregressions to approximate long-memory processes. Poskitt (2007) considered long AR approximations for general fractionally integrated processes. Poskitt (2007) established convergence rates for AR estimates and gave a CLT for the coefficient estimates. Baillie and Kapetanios (2009) dealt with practical investigation of a time series with long-memory characteristics using a semi-parametric estimation of the long-memory parameter.

One approach to prediction for the HAR(∞) process is to estimate a HAR(p) model of order p that increases with the sample size. The finite-order approach was used by Ing and Wei (2003) for an infinite-order autoregressive process and by Kuersteiner (2005) for infinite-order vector autoregressions, but their analyses were not for long-memory processes. We establish consistency and limiting normality of the OLSE of the fitted HAR(p) model.
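As an illustration of HAR(p) fitting by OLS, the sketch below builds the averaged regressors for a chosen set of horizons and solves the least-squares problem with NumPy. It is a hedged example under stated assumptions rather than the authors' procedure: the helper names `har_design` and `fit_har`, the horizons (1, 5, 22), and the simulated data with its coefficient values are all hypothetical.

```python
import numpy as np

def har_design(y, horizons):
    """Build the HAR regressor matrix and the aligned response vector.

    The row for time t is [1, Y_{t,h_1}, ..., Y_{t,h_p}], where Y_{t,h}
    is the mean of the h observations immediately preceding y[t].
    """
    y = np.asarray(y, dtype=float)
    h_max = max(horizons)
    rows = []
    for t in range(h_max, len(y)):
        rows.append([1.0] + [y[t - h:t].mean() for h in horizons])
    return np.array(rows), y[h_max:]

def fit_har(y, horizons):
    """OLS estimate of (beta_0, beta_1, ..., beta_p)."""
    X, target = har_design(y, horizons)
    beta_hat, *_ = np.linalg.lstsq(X, target, rcond=None)
    return beta_hat

# Hypothetical data: a HAR(3) recursion with made-up coefficients.
rng = np.random.default_rng(0)
horizons = (1, 5, 22)
true_beta = np.array([0.1, 0.4, 0.3, 0.2])
y = np.zeros(3000)
for t in range(22, len(y)):
    x = np.array([1.0] + [y[t - h:t].mean() for h in horizons])
    y[t] = x @ true_beta + rng.normal(scale=0.1)

beta_hat = fit_har(y, horizons)
print(beta_hat)   # estimates of (beta_0, beta_1, beta_2, beta_3)
```

Increasing p with the sample size, as described above, amounts to passing a longer `horizons` tuple as more data become available.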

The remainder of the paper is organized as follows. In Section 2, we describe the long-memory HAR(∞) model and discuss its properties. In Section 3, asymptotic estimation theory is developed for HAR(p) fitting as an approximation for the HAR(∞) model. In Section 4, the large-sample mean-squared error for HAR(p) prediction is derived. In Section 5, a Monte Carlo experiment is conducted. Realized volatilities for log returns of the Korean stock price index (KOSPI) and the Korean Won–US Dollar exchange rate are analyzed in terms of HAR estimation and HAR prediction in Section 6. Section 7 contains conclusions. Proofs are given in Section 8.

Section snippets

HAR(∞) model and long-memory properties

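We extend the HAR(3) model of Corsi (2009) to an infinite-order model given by
$$Y_t = \beta_0 + \beta_1 Y_{t,h_1} + \beta_2 Y_{t,h_2} + \cdots + \epsilon_t, \qquad (1)$$
where $\{\beta_j : j = 0, 1, 2, \ldots\}$ is a sequence of real numbers tending to 0, $\{h_j : j = 1, 2, \ldots\}$ is a given sequence of positive integers increasing to $\infty$, $Y_{t,h_j} = \frac{1}{h_j}\left(Y_{t-1} + Y_{t-2} + \cdots + Y_{t-h_j}\right)$, and $\{\epsilon_t\}$ is a sequence of i.i.d. random variables with mean zero and variance $E[\epsilon_t^2] = \sigma^2$. We denote this as the HAR(∞) model.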
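To get a feel for the model, one can simulate a truncated version of the recursion and inspect its sample autocorrelations. The sketch below is illustrative only: the geometric choices β_j = c·λ^j and h_j = ν^j, the truncation point J, the Gaussian errors, and the function names are assumptions made for this example, not specifications taken from the paper.

```python
import numpy as np

def simulate_har(beta0, betas, horizons, n, burn=2000, sigma=1.0, seed=0):
    """Simulate n observations from a finite (truncated) HAR recursion."""
    rng = np.random.default_rng(seed)
    h_max = max(horizons)
    total = h_max + burn + n
    y = np.zeros(total)
    eps = rng.normal(scale=sigma, size=total)   # Gaussian errors are an assumption
    for t in range(h_max, total):
        level = beta0
        for b, h in zip(betas, horizons):
            level += b * y[t - h:t].mean()      # Y_{t,h}: mean of the previous h values
        y[t] = level + eps[t]
    return y[-n:]                               # drop start-up and burn-in values

def sample_acf(y, max_lag):
    """Sample autocorrelations at lags 1..max_lag."""
    y = np.asarray(y, dtype=float) - np.mean(y)
    denom = y @ y
    return np.array([(y[k:] @ y[:-k]) / denom for k in range(1, max_lag + 1)])

# Hypothetical specification: beta_j = c * lam**j, h_j = nu**j, truncated at J terms.
c, lam, nu, J = 0.5, 0.6, 2, 8
betas = [c * lam**j for j in range(1, J + 1)]
horizons = [nu**j for j in range(1, J + 1)]

y = simulate_har(0.0, betas, horizons, n=5000)
print(sample_acf(y, 5))
```

With these values the slopes sum to less than one, so the truncated recursion is stable. Any finite truncation is still a finite-order autoregression, so the simulation can only approximate the infinite-order behavior.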

We discuss basic probabilistic properties of $Y_t$ in (1). Note that $Y_t$ is an AR(∞) process: $Y_t = \beta_0 + \sum_{j=1}^{\infty} \beta_j$

Asymptotic theory for the least-squares estimator

We derive asymptotic results such as moment bounds for the sample autocovariance matrix and its inverse (Theorem 3.2), and consistency and asymptotic normality of the OLSE (Theorem 3.3, Theorem 3.4). The OLSE involves the inverse of the sample autocovariance matrix, so a finite moment bound must be established for this inverse as its dimension increases. The moment bounds for a fitted finite order were studied by Ing and Wei (2003) for

Asymptotic prediction theory

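In this section, we investigate prediction error. The 1-step-ahead prediction based on a HAR(p) fitting is given by
$$\hat{Y}_{n+1}(p) = X_{n+1}(p)'\hat{\beta}(p) = X_{n+1}(p)'\hat{R}(p)^{-1}\frac{1}{n}\sum_{t=1}^{n} X_t(p)\left[X_t(p)'\beta(p) + a_t(p)\right] = X_{n+1}(p)'\beta(p) + f_{n+1}(p),$$
where
$$f_{n+1}(p) = X_{n+1}(p)'\hat{R}(p)^{-1}\frac{1}{n}\sum_{t=1}^{n} X_t(p)\,a_t(p).$$
Thus, the prediction error is
$$\hat{Y}_{n+1}(p) - Y_{n+1} = f_{n+1}(p) - a_{n+1}(p) = f_{n+1}(p) - \eta_{n+1}(p) - \epsilon_{n+1},$$
where $\eta_{n+1}(p) = \sum_{j=p+1}^{\infty} \beta_j Y_{n+1,h_j}$ as in (5).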
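The decomposition of the fitted prediction into a true-coefficient part plus an estimation term can be checked numerically. The sketch below is a minimal illustration under assumptions: the regressors are generic Gaussian draws rather than HAR averages, the coefficient values are made up, and the matrix multiplying the coefficients is taken to be the sample second-moment matrix of the regressors, as is usual for OLS.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 200, 3

# Hypothetical regressors and coefficients; `a` plays the role of the
# combined error term (approximation error plus innovation).
X = rng.normal(size=(n, p))
beta = np.array([0.4, 0.3, 0.2])
a = rng.normal(scale=0.5, size=n)
y = X @ beta + a

# OLS written as (sample second-moment matrix)^{-1} times (1/n) sum of X_t * y_t.
R_hat = X.T @ X / n
beta_hat = np.linalg.solve(R_hat, X.T @ y / n)

x_new = rng.normal(size=p)                            # stands in for the next regressor vector
f_new = x_new @ np.linalg.solve(R_hat, X.T @ a / n)   # estimation-error term

lhs = x_new @ beta_hat           # prediction from the fitted model
rhs = x_new @ beta + f_new       # true-coefficient prediction plus estimation term
print(np.isclose(lhs, rhs))
```

The two sides agree up to floating-point error because substituting the regression equation into the OLS formula splits the estimator into the true coefficients plus a term that depends only on the errors.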

Note that the prediction error consists of the three terms $\epsilon_{n+1}$, $\eta_{n+1}(p)$, and $f_{n+1}(p)$. The first term is the future

A Monte Carlo study

This section compares predictions based on HAR(3) fitting with those based on higher-order fittings HAR(4) and HAR(5) when data are generated from two long-memory models, HAR(8) and fractional integration (FI(d)). The aim of this Monte Carlo experiment is to determine whether higher-order fittings provide meaningful prediction improvement over the commonly used HAR(3) for data with genuinely long memory.

We show that HAR(3) fitting is sufficient and that higher-order HAR fittings do not provide

Analysis of real data sets

Two real data sets were analyzed using the theory developed: KOSPI returns and the Korean Won–US Dollar exchange rate for the period 2 January 2009–28 December 2012. Fig. 4 shows 5-min RV for log returns. RV for a given working day is the square root of the sum of squares of 5-min log returns for working hours (9:00–15:00) on the day. RVs are the sum of squares of the log returns observed for times {09:00, 09:05, …, 15:00}, where the log return for the start time of 9:00 is the off-time log return

Conclusion

We extended the HAR(3) model to an infinite-order model, denoted by HAR(∞), with genuine long memory. The basic properties of the HAR(∞) model were established. A key condition for long memory in the stationary HAR(∞) model is that the HAR(∞) coefficients decrease exponentially.

Asymptotic mean squared errors for prediction revealed that prediction errors, apart from future errors, mainly arise from estimating the unknown coefficients of the fitted HAR(p) model rather than approximating the

Proofs

Proof of Theorem 2.1

Owing to (A1)(a), the polynomial $B(z) = 1/A(z) = \sum_{k=0}^{\infty} \xi_k z^k$ is bounded away from zero for $|z| \le 1$. From $A(z)B(z) = 1$, $Y_t$ has a moving-average representation. The absolute summability of $\xi_i$ follows from that of $\phi_i$ in Remark 2 by Wiener's theorem (Zygmund, 1968, p. 245).

Proof of Theorem 2.2

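For (i), we see that in the characteristic polynomial $A(z) = 1 - \sum_{i=1}^{\infty} \phi_i z^i$ in (2), $\phi_1 = \cdots = \phi_{h_1} = \alpha_1$ and $\phi_{h_{j-1}+1} = \cdots = \phi_{h_j} = \alpha_j$ for $j = 2, 3, \ldots$. Let $k$ be given. We find $j$ such that $h_{j-1} < k \le h_j$. Then kcνj under assumption (A2), and ϕk=αj=l=jβlhlc(λ/ν)j1λ/ν=c(λ/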

Acknowledgments

We are very grateful for the valuable comments of Professor Wayne A. Fuller and two anonymous referees that improved the paper considerably. We thank Ms Soyoung Park for providing data analysis. This work was supported by the National Research Foundation of Korea (NRF-2012-2046157) through the National Research Foundation of Korea (NRF) funded by the Ministry of Education Science and Technology.

References (29)

  • P. Ango Nze et al. Weak dependence: models and applications to econometrics. Econometric Theory (2004)
  • M. Asai et al. Modelling and forecasting noisy realized volatility. Computational Statistics & Data Analysis (2012)
  • Baillie, R., Kapetanios, G., 2009. Semi parametric estimation of long memory: comparisons and some attractive...
  • K.N. Berk. Consistent autoregressive spectral estimates. The Annals of Statistics (1974)