Infinite-order, long-memory heterogeneous autoregressive models

doi:10.1016/j.csda.2013.08.009

Computational Statistics & Data Analysis

Volume 76, August 2014, Pages 339-358

https://doi.org/10.1016/j.csda.2013.08.009

Abstract

We develop an infinite-order extension of the HAR-RV model, denoted by HAR $(\infty)$. We show that the autocorrelation function of the model is algebraically decreasing and thus the model is a long-memory model if and only if the HAR coefficients decrease exponentially. For a finite sample, a prediction is made using coefficients estimated by ordinary least squares (OLS) fitting for a finite-order model, HAR $(p)$, say. We show that the OLS estimator (OLSE) is consistent and asymptotically normal. The approximate one-step-ahead prediction mean-square error is derived. Analysis shows that the prediction error is mainly due to estimation of the HAR $(p)$ coefficients rather than to errors made in approximating HAR $(\infty)$ by HAR $(p)$. This result provides a theoretical justification for wide use of the HAR(3) model in predicting long-memory realized volatility. The theoretical result is confirmed by a finite-sample Monte Carlo experiment for a real data set.

Introduction

The volatility of financial data is an important component of financial markets in both practical applications and theoretical studies. The long-memory property of volatility, meaning that historical volatility has a persistent impact on future volatility, is important in investment decision-making. Thus, the time series model for financial data should reflect the long-memory property. While the FIGARCH and ARFIMA models can be used for long-memory empirical analysis, such fractional integration models lack a clear economic interpretation.

To describe the long memory of volatility, Corsi (2004, 2009) proposed an additive cascade model containing volatility components defined for different time periods, called the heterogeneous autoregressive model of realized volatility (HAR-RV), that has three heterogeneous volatility components. Inspired by the HARCH model of Müller et al. (1997) and Dacorogna et al. (1998), the HAR-RV (called HAR hereafter) model is consistent with the heterogeneous market hypothesis and with the asymmetric propagation of volatility between long- and short-time horizons, and has different volatility components generated by the actions of different types of market participants. Although the HAR model is not formally a long-memory model, long-memory behavior results from the sum of volatility components constructed for different time horizons. Further long-memory aspects of realized volatility have been discussed by Raggi and Bordignon (2012) in terms of the Markov switching approach.

The HAR model has been successfully used in forecasting realized volatility. Ghysels et al. (2006) and Forsberg and Ghysels (2007) compared HAR with the MIDAS model. Andersen et al. (2007) used the HAR model for prediction of the volatility of stock prices, foreign exchange rates, and bond prices. Corsi et al. (2008) showed that non-Gaussianity and time-varying volatility of reduced-form RV models, such as ARFIMA and HAR, might be partly attributable to time variations in the volatility of the RV estimator. McAleer and Medeiros (2008) proposed a model called heterogeneous autoregression with multiple-regime smooth transition as an extension of the HAR model. This model contains long memory and nonlinearity, and incorporates sign and size asymmetries. Motivated by the use of HAR models in practice, Craioveanu and Hillebrand (2009) provided a critical review of the advantages of HAR models for daily RV. Hillebrand and Medeiros (2010) considered log–linear and neural network HAR models of realized volatility. They also applied bagging, a data mining technique, to realized volatility. Tang and Chi (2010) addressed test methods and models for long memory and found that the HAR model showed better predictive ability than the ARFIMA-RV model. Other uses of HAR models include risk management with VaR measures (Clements et al., 2008), risk-return tradeoff (Bollerslev et al., 2009), serial correlation (Bianco et al., 2009), implied volatility (Busch et al., 2011), and realized volatility errors (Asai et al., 2012).

In the HAR model of Corsi (2009), a hierarchical model is considered with three volatility components corresponding to time horizons of 1 day $(1d)$, 1 week $(1w)$, and 1 month $(1m)$. The HAR(3)-RV time series representation of the proposed cascade model can be written in the form $RV_{t+1}^{(d)} = c + β^{(d)} RV_{t}^{(d)} + β^{(w)} RV_{t}^{(w)} + β^{(m)} RV_{t}^{(m)} + ω_{t+1},$ where $RV_{t}^{(d)}$ is the realized variance on day $t$, and $RV_{t}^{(w)}$ and $RV_{t}^{(m)}$ are moving averages given by $RV_{t}^{(w)} = \frac{1}{5} (RV_{t}^{(d)} + RV_{t-1}^{(d)} + \dots + RV_{t-4}^{(d)}),$ $RV_{t}^{(m)} = \frac{1}{22} (RV_{t}^{(d)} + RV_{t-1}^{(d)} + \dots + RV_{t-21}^{(d)}).$ Therefore, the HAR(3) model can be expressed as an AR(22) model. By adding lags of $RV_{t}^{(d)}$ up to lag 21, the model captures the long-memory properties of RV in a parsimonious way, but is theoretically a short-memory model.
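As an illustration of the HAR(3) representation, the following minimal Python sketch builds the daily, weekly, and monthly regressors, fits the four coefficients by OLS, and maps them to the implied AR(22) coefficients. The function names (`har3_design`, `fit_har3`, `har3_to_ar22`) and the use of NumPy are assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def har3_design(rv):
    """Build HAR(3) regressors and targets from a daily RV series.

    Row for day t holds [1, RV_t^(d), RV_t^(w), RV_t^(m)]; the target is RV_{t+1}^(d).
    """
    rv = np.asarray(rv, dtype=float)
    n = rv.size
    if n < 23:
        raise ValueError("need at least 23 observations (22 lags plus one target)")
    rows, targets = [], []
    for t in range(21, n - 1):
        daily = rv[t]
        weekly = rv[t - 4 : t + 1].mean()    # average of RV_t, ..., RV_{t-4}
        monthly = rv[t - 21 : t + 1].mean()  # average of RV_t, ..., RV_{t-21}
        rows.append([1.0, daily, weekly, monthly])
        targets.append(rv[t + 1])
    return np.array(rows), np.array(targets)


def fit_har3(rv):
    """OLS estimates of (c, beta_d, beta_w, beta_m)."""
    X, y = har3_design(rv)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def har3_to_ar22(beta_d, beta_w, beta_m):
    """AR(22) coefficients implied by the three HAR(3) slopes."""
    phi = np.full(22, beta_m / 22.0)  # the monthly average touches lags 1..22
    phi[:5] += beta_w / 5.0           # the weekly average touches lags 1..5
    phi[0] += beta_d                  # the daily term touches lag 1 only
    return phi
```

The AR(22) coefficients take only three distinct values, which is the sense in which HAR(3) is a parsimonious restriction of an AR(22) model.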

It is important to study the prediction performance of the HAR(3) model of Corsi (2009) when the data-generating process has long memory. For this purpose, we extend the HAR(3) model to long memory and develop an asymptotic estimation theory. As noted by Corsi (2009), the HAR(3) model has short memory because it is an AR(22) model. To obtain the long-memory property, an infinite-order model, HAR $(\infty)$ , is proposed. We present stationarity conditions for the HAR $(\infty)$ model and give necessary and sufficient conditions for the long-memory property. The essential part of these conditions is that the HAR $(\infty)$ coefficients decay exponentially.

Poskitt (2007) and Baillie and Kapetanios (2009) suggested the use of high-order autoregressions to approximate long-memory processes. Poskitt (2007) considered long AR approximations for general fractionally integrated processes. Poskitt (2007) established convergence rates for AR estimates and gave a CLT for the coefficient estimates. Baillie and Kapetanios (2009) dealt with practical investigation of a time series with long-memory characteristics using a semi-parametric estimation of the long-memory parameter.

One approach for predictions of the HAR $(\infty)$ process is to estimate a HAR $(p)$ model of order $p$ that increases with the sample size. The finite-order approach was used by Ing and Wei (2003) for an infinite-order autoregressive process and by Kuersteiner (2005) for infinite-order vector autoregressions, but their analyses were not for long-memory processes. We establish consistency and limiting normality of the OLSE of the fitted HAR $(p)$ model.

The remainder of the paper is organized as follows. In Section 2, we describe the long-memory HAR $(\infty)$ model and discuss its properties. In Section 3, asymptotic estimation theory is developed for HAR $(p)$ fitting as an approximation for the HAR $(\infty)$ model. In Section 4, the large-sample mean-squared error for HAR $(p)$ prediction is derived. In Section 5, a Monte Carlo experiment is conducted. Realized volatilities for log returns of the Korean stock price index (KOSPI) and the Korean Won–US Dollar exchange rate are analyzed in terms of HAR estimation and HAR prediction in Section 6. Section 7 contains conclusions. Proofs are given in Section 8.

Section snippets

HAR $(\infty)$ model and long-memory properties

We extend the HAR(3) model of Corsi (2009) to an infinite-order model given by $Y_{t} = β_{0} + β_{1} Y_{t, h_{1}} + β_{2} Y_{t, h_{2}} + \dots + ϵ_{t}, \qquad (1)$ where $\{β_{j} : j = 0, 1, 2, \dots\}$ is a sequence of real numbers tending to $0$, $\{h_{j} : j = 1, 2, \dots\}$ is a given sequence of positive integers increasing to $\infty$, $Y_{t, h_{j}} = \frac{1}{h_{j}} (Y_{t-1} + Y_{t-2} + \dots + Y_{t-h_{j}}),$ and $\{ϵ_{t}\}$ is a sequence of i.i.d. random variables with mean zero and variance $E[ϵ_{t}^{2}] = σ^{2}$. We denote this as the HAR $(\infty)$ model.
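To make the structure of this model concrete, the sketch below works with a version truncated to finitely many horizons. It collects the weight on each lag $Y_{t-k}$, which by the definition of $Y_{t, h_{j}}$ is the sum of $β_{j} / h_{j}$ over all $j$ with $h_{j} \geq k$, and then simulates the resulting autoregression. The truncation, the zero start-up values with a burn-in, the Gaussian errors, and the names `har_to_ar` and `simulate_truncated_har` are assumptions of this example, not part of the paper.

```python
import numpy as np


def har_to_ar(betas, horizons):
    """AR coefficients phi_1..phi_H implied by HAR slopes beta_j on horizons h_j.

    Each term beta_j * Y_{t,h_j} puts weight beta_j / h_j on lags 1..h_j.
    """
    betas = np.asarray(betas, dtype=float)
    horizons = np.asarray(horizons, dtype=int)
    phi = np.zeros(horizons.max())
    for b, h in zip(betas, horizons):
        phi[:h] += b / h
    return phi


def simulate_truncated_har(beta0, betas, horizons, n, burn=500, sigma=1.0, seed=0):
    """Simulate n observations from a HAR model truncated to len(betas) terms.

    Assumptions for illustration: Gaussian errors, zero start-up values,
    and a burn-in period that is discarded.
    """
    phi = har_to_ar(betas, horizons)
    H = phi.size
    rng = np.random.default_rng(seed)
    eps = sigma * rng.standard_normal(H + burn + n)
    y = np.zeros(H + burn + n)
    for t in range(H, y.size):
        past = y[t - H : t][::-1]  # (Y_{t-1}, Y_{t-2}, ..., Y_{t-H})
        y[t] = beta0 + phi @ past + eps[t]
    return y[-n:]
```

For example, `har_to_ar([0.3, 0.2], [1, 5])` puts weight 0.3 + 0.2/5 on lag 1 and 0.2/5 on each of lags 2 to 5.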

We discuss basic probabilistic properties of $Y_{t}$ in (1). Note that $Y_{t}$ is an AR $(\infty)$ process: $Y_{t} = β_{0} + \sum_{j = 1}^{\infty} β_{j}$

Asymptotic theory for the least-squares estimator

We derive asymptotic results such as moment bounds for the sample autocovariance matrix and its inverse (Theorem 3.2), and consistency and asymptotic normality of the OLSE (Theorem 3.3, Theorem 3.4). The OLSE involves the inverse of the sample autocovariance matrix, so finite moment bounds for this inverse must be established as its dimension increases. The moment bounds for a fitted finite order were studied by Ing and Wei (2003) for

Asymptotic prediction theory

In this section, we investigate prediction error. The 1-step-ahead prediction based on a HAR $(p)$ fitting is given by $\hat{Y}_{n+1}(p) = X_{n+1}(p)' \hat{β}(p) = X_{n+1}(p)' \hat{R}(p)^{-1} \frac{1}{n} \sum_{t=1}^{n} X_{t}(p) [X_{t}(p)' β(p) + a_{t}(p)] = X_{n+1}(p)' β(p) + f_{n+1}(p),$ where $f_{n+1}(p) = X_{n+1}(p)' \hat{R}(p)^{-1} \frac{1}{n} \sum_{t=1}^{n} X_{t}(p) a_{t}(p).$ Thus, the prediction error is $\hat{Y}_{n+1}(p) - Y_{n+1} = f_{n+1}(p) - a_{n+1}(p) = f_{n+1}(p) - η_{n+1}(p) - ϵ_{n+1},$ where $η_{n+1}(p) = \sum_{j=p+1}^{\infty} β_{j} Y_{n+1, h_{j}}$ as in (5).
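The decomposition of the predictor into a term with the true coefficients plus an estimation term can be checked numerically. The sketch below computes the OLS one-step-ahead prediction and the estimation term for any chosen coefficient vector, with residuals defined as the difference between the observations and the fitted regression at that vector. It assumes that the regressor vector consists of a constant followed by the horizon averages; the paper's exact definition is not shown in this excerpt, and all function names are hypothetical.

```python
import numpy as np


def har_design(y, horizons):
    """Rows X_t = [1, Y_{t,h_1}, ..., Y_{t,h_p}] and targets Y_t (assumed layout)."""
    y = np.asarray(y, dtype=float)
    H = max(horizons)
    X = np.array(
        [[1.0] + [y[t - h : t].mean() for h in horizons] for t in range(H, y.size)]
    )
    return X, y[H:]


def next_regressor(y, horizons):
    """Regressor vector X_{n+1} built from the most recent observations."""
    y = np.asarray(y, dtype=float)
    return np.array([1.0] + [y[-h:].mean() for h in horizons])


def ols_predict(y, horizons):
    """OLS coefficients via R_hat^{-1} (1/n) sum X_t Y_t, and the 1-step prediction."""
    X, target = har_design(y, horizons)
    n = target.size
    R_hat = X.T @ X / n
    beta_hat = np.linalg.solve(R_hat, X.T @ target / n)
    return beta_hat, next_regressor(y, horizons) @ beta_hat


def estimation_term(y, horizons, beta):
    """f_{n+1} = X_{n+1}' R_hat^{-1} (1/n) sum X_t a_t, with a_t = Y_t - X_t' beta."""
    X, target = har_design(y, horizons)
    n = target.size
    a = target - X @ beta
    R_hat = X.T @ X / n
    return next_regressor(y, horizons) @ np.linalg.solve(R_hat, X.T @ a / n)
```

For any coefficient vector `beta`, the prediction returned by `ols_predict` equals `next_regressor(y, horizons) @ beta + estimation_term(y, horizons, beta)` up to rounding, which is the algebraic identity used in the text.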

Note that the prediction error consists of the three terms $ϵ_{n + 1}$ , $η_{n + 1} (p)$ , and $f_{n + 1} (p)$ . The first term is the future

A Monte Carlo study

This section compares predictions based on HAR(3) fitting with those based on higher-order fittings HAR(4) and HAR(5) when data are generated from two long-memory models, HAR(8) and fractional integration (FI $(d)$ ). The aim of this Monte Carlo experiment is to determine whether higher-order fittings provide meaningful prediction improvement over the commonly used HAR(3) for data with a really long memory.
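For readers who wish to reproduce an experiment of this kind, one common way to generate an FI $(d)$ series is through a truncated moving-average expansion of $(1 - B)^{-d} ϵ_{t}$, whose weights satisfy $ψ_{0} = 1$ and $ψ_{k} = ψ_{k-1} (k - 1 + d) / k$. The sketch below follows that approach. The truncation length `m`, the Gaussian innovations, and the function names are assumptions of this example; the excerpt does not state how the paper generated its FI $(d)$ data.

```python
import numpy as np


def fi_ma_weights(d, m):
    """First m+1 moving-average weights of (1 - B)^(-d)."""
    psi = np.empty(m + 1)
    psi[0] = 1.0
    for k in range(1, m + 1):
        psi[k] = psi[k - 1] * (k - 1 + d) / k
    return psi


def simulate_fi(d, n, m=1000, seed=0):
    """Simulate n observations of an FI(d) process with the MA expansion cut at lag m.

    Assumptions for illustration: standard normal innovations and a fixed truncation.
    """
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n + m)
    psi = fi_ma_weights(d, m)
    # "valid" keeps only outputs that use all m+1 weights: x_t = sum_k psi_k * eps_{t-k}
    return np.convolve(eps, psi, mode="valid")
```

A series produced this way can be passed to any HAR fitting routine to compare prediction errors across fitted orders.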

We show that HAR(3) fitting is sufficient and that higher-order HAR fittings do not provide

Analysis of real data sets

Two real data sets were analyzed using the theory developed: KOSPI returns and the Korean Won–US Dollar exchange rate for the period 2 January 2009–28 December 2012. Fig. 4 shows 5-min RV for log returns. RV for a given working day is the square root of the sum of squares of 5-min log returns for working hours (9:00–15:00) on the day. RVs are the sum of squares of the log returns observed for times $\{09{:}00, 09{:}05, \dots, 15{:}00\}$, where the log return for the start time of 9:00 is the off-time log return

Conclusion

We extended the HAR(3) model to an infinite-order model, denoted by HAR $(\infty)$, with genuine long memory. The basic properties of the HAR $(\infty)$ model were established. A key condition for long memory of the stationary HAR $(\infty)$ model is that the HAR $(\infty)$ coefficients decrease exponentially.

Asymptotic mean squared errors for prediction revealed that prediction errors, apart from future errors, mainly arise from estimating the unknown coefficients of the fitted HAR $(p)$ model rather than approximating the

Proofs

Proof of Theorem 2.1

Owing to (A1)(a), the polynomial $B (z) = 1 / A (z) = \sum_{k = 0}^{\infty} ξ_{k} z^{k}$ is bounded away from zero for $| z | \leq 1$ . From $A (z) B (z) = 1$ , $Y_{t}$ has a moving-average representation. The absolute summability of $ξ_{i}$ follows from that of $ϕ_{i}$ in Remark 2 by Wiener’s theorem (Zygmund, 1968, p. 245). □

Proof of Theorem 2.2

For (i), we see that in the characteristic polynomial $A (z) = 1 - \sum_{i = 1}^{\infty} ϕ_{i} z^{i}$ in (2), $ϕ_{1} = \dots = ϕ_{h_{1}} = α_{1}, ϕ_{h_{j - 1} + 1} = \dots = ϕ_{h_{j}} = α_{j}$ for $j = 2, 3, \dots$ . Let $k$ be given. We find $j$ such that $h_{j - 1} < k \leq h_{j}$ . Then $k \sim c ν^{j}$ under assumption (A2), and $ϕ_{k} = α_{j} = \sum_{l = j}^{\infty} \frac{β_{l}}{h_{l}} \sim \frac{c {(λ / ν)}^{j}}{1 - λ / ν} = c (λ /$

Acknowledgments

We are very grateful for the valuable comments of Professor Wayne A. Fuller and two anonymous referees that improved the paper considerably. We thank Ms Soyoung Park for providing data analysis. This work was supported by the National Research Foundation of Korea (NRF-2012-2046157) through the National Research Foundation of Korea (NRF) funded by the Ministry of Education Science and Technology.

References (29)

M. Clements et al. (2008). Quantile forecasts of daily exchange rate returns from forecasts of realized volatility. Journal of Empirical Finance.

P. Doukhan et al. (1999). A new weak dependence condition and applications to moment inequalities. Stochastic Processes and their Applications.

E. Ghysels et al. (2006). Predicting volatility: getting the most out of return data sampled at different frequencies. Journal of Econometrics.

E. Hwang et al. (2012). Strong consistency of the stationary bootstrap under $ψ$-weak dependence. Statistics & Probability Letters.

C.K. Ing et al. (2003). On same-realization prediction in an infinite-order autoregressive process. Journal of Multivariate Analysis.

M. McAleer et al. (2008). A multiple-regime smooth transition heterogeneous autoregressive model for long memory and asymmetries. Journal of Econometrics.

U. Müller et al. (1997). Volatilities of different time resolutions – analyzing the dynamics of market components. Journal of Empirical Finance.

T.D. Pham et al. (1985). Some mixing properties of time series models. Stochastic Processes and their Applications.

D. Raggi et al. (2012). Long memory and nonlinearities in realized volatility: a Markov switching approach. Computational Statistics & Data Analysis.

T. Andersen et al. (2007). Roughing it up: including jump components in the measurement, modeling and forecasting of return volatility. Review of Economics and Statistics.

P. Ango Nze et al. (2004). Weak dependence: models and applications to econometrics. Econometric Theory.

M. Asai et al. (2012). Modelling and forecasting noisy realized volatility. Computational Statistics & Data Analysis.

Baillie, R., Kapetanios, G., 2009. Semi parametric estimation of long memory: comparisons and some attractive...

K.N. Berk (1974). Consistent autoregressive spectral estimates. The Annals of Statistics.

