The estimation of misspecified long memory models

doi:10.1016/j.jeconom.2013.08.023

Journal of Econometrics

Volume 178, Part 2, January 2014, Pages 225-230

https://doi.org/10.1016/j.jeconom.2013.08.023

Abstract

We consider time series that, possibly after integer differencing or integrating or other detrending, are covariance stationary with spectral density that is regularly varying near zero frequency, and unspecified elsewhere. This semiparametric framework includes series with short, long and negative memory. We consider the consistency of the popular log-periodogram memory estimate that, conventionally but wrongly, assumes the spectral density obeys a pure power law. The local-to-zero misspecification leads to increased bias, such that the usual central limit theorem may only hold for bandwidths entailing considerable imprecision. The order of the bias is calculated for several slowly-varying factors, and some discussion of mean squared error and bandwidth choice is included.

Introduction

The spectral density at low frequencies determines the long-run behavior of stationary time series. Let the covariance stationary and invertible process $z_{t}$, $t = 0, \pm 1, \dots$, have a spectral density function $f(λ)$, $λ \in (-π, π]$, defined by $\mathrm{cov}(z_{t}, z_{t+j}) = \int_{-π}^{π} f(λ) \cos(jλ)\, dλ, \quad j = 0, \pm 1, \dots. \qquad (1)$ In practice, a finite realization, $z_{1}, \dots, z_{n}$, may be the outcome of integer differencing or integrating or deterministic detrending of a nonstationary or non-invertible series. With $a \sim b$ meaning that $a/b \to 1$, we assume that $f(λ)$ is regularly-varying at zero frequency, that is $f(λ) \sim L\left(\frac{1}{λ}\right) λ^{-2d}, \quad \text{as } λ \to 0^{+}, \qquad (2)$ where $0 \leq |d| < 1/2$ and, for positive argument $x$, the function $L(x)$ is slowly-varying (in Karamata’s sense), being positive and measurable on some neighborhood $[X, \infty)$, with $L(cx)/L(x) \to 1, \quad \text{as } x \to \infty, \quad \text{all } c > 0. \qquad (3)$

Detailed discussions of slowly-varying functions, and their applications in probability theory, are contained in Seneta (1976) and Bingham et al. (1987). A basic property is that, as $x \to \infty$, $L(x)$ can diverge, or converge to zero, or converge to a positive constant, or oscillate, and for any $a > 0$,

$x^{a} L(x) \to \infty, \quad x^{-a} L(x) \to 0, \quad \text{as } x \to \infty. \qquad (4)$ Therefore in (2) the power law $λ^{-2d}$ dominates the slowly varying factor $L(1/λ)$ so that, for any $L$, as $λ \to 0^{+}$, $f(λ)$ still diverges for $0 < d < 1/2$, and still $f(0) = 0$ for $-1/2 < d < 0$, while when $d = 0$, $f(λ)$ diverges when $L(x) \to \infty$ as $x \to \infty$ and $f(0) = 0$ when $L(x) \to 0$ as $x \to \infty$.

The simplest example of such an $L$ is $L(x) \equiv C > 0. \qquad (5)$ Others include (see Bingham et al. (1987, p. 16)) $L(x) = C \log_{k} x, \quad k \geq 1, \qquad (6)$ where $\log_{1} x = \log x$ and $\log_{k} x = \log_{k-1} \log x$, $k \geq 2$, as well as powers and rational functions of the $\log_{k} x$, $k \geq 1$ (e.g. $L(x) = 1/\log x$), and $L(x) = C \exp\left\{\prod_{j=1}^{k} (\log_{j} x)^{a_{j}}\right\}, \quad 0 < a_{j} < 1, \quad j = 1, \dots, k \geq 1, \qquad (7)$ $L(x) = C \exp\{\log x / \log_{2} x\}. \qquad (8)$
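As a rough numerical illustration of the two properties just described, namely that $L(cx)/L(x) \to 1$ and that any power of $x$ eventually dominates $L(x)$, the following Python sketch evaluates three of the examples above at increasingly large arguments. The function names and the choices $C = 1$, $c = 10$ and $a = 0.1$ are illustrative assumptions, not part of the original exposition.

```python
import math


def L_log(x):
    """L(x) = log x: the iterated-logarithm example with k = 1 and C = 1."""
    return math.log(x)


def L_loglog(x):
    """L(x) = log log x: the iterated-logarithm example with k = 2 and C = 1."""
    return math.log(math.log(x))


def L_inv_log(x):
    """L(x) = 1 / log x: a slowly varying function that tends to zero."""
    return 1.0 / math.log(x)


def scaling_ratio(L, x, c=10.0):
    """Return L(c x) / L(x); slow variation means this tends to 1 as x grows."""
    return L(c * x) / L(x)


def power_times(L, x, a):
    """Return x**a * L(x); for a > 0 this diverges, for a < 0 it vanishes."""
    return x ** a * L(x)


if __name__ == "__main__":
    for x in (1e3, 1e30, 1e200):
        ratios = [scaling_ratio(L, x) for L in (L_log, L_loglog, L_inv_log)]
        print(x, ratios, power_times(L_log, x, -0.1), power_times(L_inv_log, x, 0.1))
```

The ratios approach one only at a logarithmic pace, which gives some feel for why a slowly varying factor can be hard to distinguish from a constant in samples of realistic size.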

Let $\mathcal{A}_{j,k}$ denote the $σ$-field of events generated by $z_{t}$, $j \leq t \leq k$, and define $α_{j} = \sup_{A \in \mathcal{A}_{-\infty,t},\, B \in \mathcal{A}_{t+j,\infty}} |P(AB) - P(A)P(B)|$ for $j > 0$. Then if $α_{j} \to 0$ as $j \to \infty$, $z_{t}$ is said to be $α$-mixing. Suppose for the purposes of this paragraph that $z_{t}$ is Gaussian, in which case the coefficient of complete regularity decays at the same rate as $α_{j}$; see Ibragimov and Rozanov (1978, pp. 111, 113). Thus from Ibragimov and Rozanov (1978, p. 178), $z_{t}$ satisfying (2) cannot be $α$-mixing when $d > 0$ (because not every positive power of $f(λ)$ is integrable). The usual examples of Gaussian $α$-mixing processes have bounded spectral density, e.g. a stationary and invertible autoregressive moving average (ARMA), and thus satisfy (2) with $d = 0$ and constant $L$, (5). However, $α$-mixing does not rule out all unbounded $f(λ)$. From Ibragimov and Rozanov (1978, pp. 179, 180), $f(λ) = C^{*} \exp\left\{\sum_{j=1}^{\infty} \frac{\cos jλ}{j(\log j + 1)}\right\} \qquad (9)$ for some $C^{*} > 0$ implies $z_{t}$ is $α$-mixing. The spectral density in (9) satisfies $f(λ) \sim C \log(1/λ), \quad \text{as } λ \to 0^{+}, \qquad (10)$ which corresponds to combining (6) for $k = 1$ with (2) for $d = 0$. Incidentally, under (9) $α_{j}$ decays very slowly, like $1/\log j$ (and thus does not satisfy conditions for central limit theory for statistics such as the sample mean of $z_{t}$, $1 \leq t \leq n$). From Ibragimov and Rozanov (1978, p. 180), a process with spectral density the reciprocal of the right side of (9) (which converges to zero like $(\log(1/λ))^{-1}$ as $λ \to 0^{+}$) is also $α$-mixing.

Under additional conditions to (2) (see Yong, 1974) the autocovariance sequence satisfies $\mathrm{cov}(z_{t}, z_{t+j}) \sim \frac{L(j)\, π}{\cos(dπ)\, Γ(2d)}\, j^{2d-1}, \quad \text{as } j \to \infty. \qquad (11)$ The probability literature covers the asymptotic behavior of various simple statistics under (11), in particular linear and quadratic forms (see e.g. Taqqu (1975), Dobrushin and Major (1979), Fox and Taqqu (1985, 1987)). However, the frequency domain form (2) perhaps provides greater intuitive appeal. Early empirical support for the notion of a divergent spectral density at zero frequency was noted by Granger (1966). He reported nonparametric spectral density estimates for a number of economic time series, and while these are inevitably finite at zero frequency, they are strongly peaked there, and his Fig. 1 is suggestive of a spectral singularity at zero frequency. Of course such an outcome could also be consistent with nonstationarity (such as a unit root), and he did not present formulae such as (2), but clearly (2) with $d > 0$ and any $L$, or even with $d = 0$ and diverging $L$, is consistent with his “typical spectral shape”. The leading methods of semiparametric estimation of the memory parameter $d$ have also been frequency-domain. However, they have mainly focused on the simple power law form, with (5) assumed in (2), that is $f(λ) \sim C λ^{-2d}, \quad \text{as } λ \to 0^{+}. \qquad (12)$ The leading fractional parametric models (which specify $f(λ)$ parametrically for all $λ$), namely $f(λ) \propto |1 - e^{iλ}|^{-2d}$ (Adenstedt (1974)) and its extension to fractionally-integrated ARMA (FARIMA) spectra, are covered by (12). In (12) the knife-edge case $d = 0$ describes short memory, when a FARIMA reduces to an ARMA, while the cases $0 < d < 1/2$ and $-1/2 < d < 0$ respectively describe long memory and antipersistence.
However, methods of estimating such parametric models are inconsistent when $f(λ)$ is misspecified; in particular, high-frequency misspecification produces asymptotic bias even in estimates of the low-frequency parameter $d$. This drawback is overcome (at cost of slower convergence, and of requiring choice of a smoothing number) by semiparametric methods, based on (12), in particular log-periodogram and local Whittle estimates of $d$ and $C$; see e.g. Geweke and Porter-Hudak (1983), Kuensch (1987), Robinson (1995a, 1995b), where the latter two references established that both estimates are asymptotically normal for all $d \in (-1/2, 1/2)$, and with an asymptotic variance that is constant with respect to $d$. Thus, standard large-sample inference using these estimates is very simple to implement. Extensions to estimates based on nonstationary processes have been developed by Velasco (1999a, 1999b) and subsequent authors.
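To make concrete the sense in which the fractional spectrum $|1 - e^{iλ}|^{-2d}$ is covered by the pure power law, the sketch below compares the two numerically. It is only an illustration: the normalization $σ^{2}/(2π)$ (so that $C = σ^{2}/(2π)$) and the function names are assumptions introduced here, since the text states the spectrum only up to proportionality.

```python
import cmath
import math


def fractional_spectrum(lam, d, sigma2=1.0):
    """(sigma2 / 2 pi) * |1 - exp(i lam)|**(-2 d); the scale factor is an assumed normalization."""
    return sigma2 / (2.0 * math.pi) * abs(1.0 - cmath.exp(1j * lam)) ** (-2.0 * d)


def power_law(lam, d, C):
    """Pure power law C * lam**(-2 d), valid as an approximation only near lam = 0."""
    return C * lam ** (-2.0 * d)


def ratio_to_power_law(lam, d, sigma2=1.0):
    """Ratio of the fractional spectrum to its local power-law approximation."""
    C = sigma2 / (2.0 * math.pi)
    return fractional_spectrum(lam, d, sigma2) / power_law(lam, d, C)


if __name__ == "__main__":
    for lam in (1.0, 0.1, 0.001):
        print(lam, [ratio_to_power_law(lam, d) for d in (-0.3, 0.0, 0.3)])
```

The ratio tends to one as $λ \to 0^{+}$ for each $d$, while at frequencies away from zero the two functions differ, which is the local nature of the approximation that the semiparametric methods exploit.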

In principle, one could specify a particular $L$ in (2) up to an unknown scale factor as in (6), (7), (8), for example, and accordingly modify the estimates, and we would expect to achieve good statistical properties if $L$ is correctly chosen. One could also imagine specifying $L$ up to finitely many unknown parameters, e.g. $L(x) = C (\log x)^{θ}$ for unknown $θ$, and extend the semiparametric methods to estimate $d$, $C$ and the additional parameter vector. However in either case the prospect of correct specification of $L$ seems far-fetched, and of greater practical interest is the robustness of existing estimates to unknown, nonparametric, $L$.

Robinson (1994a) investigated asymptotic properties of the averaged periodogram statistic, and its functionals of interest, including a semiparametric estimate of $d$ , under (2) with unknown $L$ . Define the discrete Fourier transform

$w(λ) = (2πn)^{-1/2} \sum_{t=1}^{n} z_{t} e^{itλ}, \quad λ \in (-π, π], \qquad (13)$ and the periodogram $I(λ) = |w(λ)|^{2}, \quad λ \in (-π, π]. \qquad (14)$ The averaged periodogram is defined as $\hat{F}(λ) = \sum_{j=1}^{[nλ/2π]} I(λ_{j}), \quad 0 < λ \leq π, \qquad (15)$ where $[\cdot]$ here denotes the integer part and $λ_{j} = 2πj/n$. For a user-chosen integer $m \in [1, n/2)$ satisfying $\frac{1}{m} + \frac{m}{n} \to 0, \quad \text{as } n \to \infty, \qquad (16)$ Robinson (1994a) showed that $\hat{F}(λ_{m})/F(λ_{m}) \to_{p} 1, \quad \text{as } n \to \infty, \qquad (17)$ where $F(λ) = \int_{0}^{λ} f(h)\, dh. \qquad (18)$ For this purpose (2) was assumed but (like a good deal of the long memory literature) under the restriction $0 < d < 1/2$ (though there seems no reason why a similar result should not hold also for $-1/2 < d \leq 0$), as well as regularity conditions. Further, Robinson (1994a) proposed the following averaged periodogram estimate of $d$: $\tilde{d}_{q} = \frac{1}{2} - \frac{\log\{\hat{F}(qλ_{m})/\hat{F}(λ_{m})\}}{2 \log q}, \qquad (19)$ where $q$ is chosen in the interval $(0, 1)$. He showed that under the same conditions as imposed for (17), $\tilde{d}_{q} \to_{p} d, \quad \text{as } n \to \infty. \qquad (20)$ Under somewhat stronger conditions he obtained a rate of convergence in (20), $O_{p}(n^{-η})$, for some $η > 0$. The property (20), like (17), holds for any slowly varying $L$, which is unknown to the practitioner. Intuitively both properties might be anticipated due to (3) and the ratio forms on the left hand side of (17) and in $\tilde{d}_{q}$. Robinson (1994b) discussed mean squared error and optimal choice of $m$ in this setting. The present paper addresses the above issues with respect to the log-periodogram estimate, which, like $\tilde{d}_{q}$ but unlike the local Whittle estimate, is defined in closed form, so relatively easily yields information on rates of convergence.
Soulier (2010) established a lower bound for the rate of convergence of estimates of $d$ in (2), and proved it to be optimal, illustrating his results with the log periodogram estimate. Giraitis et al. (1997) had considered similar issues with respect to (12), but Soulier (2010) found that the presence of an unanticipated $L$ can produce much slower rates, and that, unlike under (12), the log periodogram estimate is no less efficient than the local Whittle estimate (cf. Robinson, 1995a, 1995b), where the asymptotic distributional results derived in the latter references may only hold alongside bandwidth choices that yield unacceptable imprecision.
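The definitions of the discrete Fourier transform, periodogram, averaged periodogram and the estimate $\tilde{d}_{q}$ can be sketched in a few lines of Python. This is a minimal, unoptimized illustration rather than a reference implementation: the function names and the default $q = 0.5$ are assumptions made here, and the averaged periodogram is taken in the form written above (any constant scale factor cancels in the ratio defining $\tilde{d}_{q}$).

```python
import cmath
import math
import random


def periodogram(z, j):
    """I(lambda_j) = |w(lambda_j)|^2 at the Fourier frequency lambda_j = 2 pi j / n."""
    n = len(z)
    lam = 2.0 * math.pi * j / n
    w = sum(z_t * cmath.exp(1j * t * lam) for t, z_t in enumerate(z, start=1))
    w /= math.sqrt(2.0 * math.pi * n)
    return abs(w) ** 2


def averaged_periodogram(z, k):
    """Sum of I(lambda_j) over j = 1, ..., k, i.e. F-hat evaluated at lambda_k."""
    return sum(periodogram(z, j) for j in range(1, k + 1))


def averaged_periodogram_estimate(z, m, q=0.5):
    """d-tilde_q = 1/2 - log{F-hat(q lambda_m) / F-hat(lambda_m)} / (2 log q)."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie in (0, 1)")
    k = int(q * m)  # integer part [n q lambda_m / (2 pi)] = [q m]
    if k < 1:
        raise ValueError("q * m must be at least 1")
    ratio = averaged_periodogram(z, k) / averaged_periodogram(z, m)
    return 0.5 - math.log(ratio) / (2.0 * math.log(q))


if __name__ == "__main__":
    rng = random.Random(0)  # simulated Gaussian white noise, for which d = 0
    z = [rng.gauss(0.0, 1.0) for _ in range(512)]
    print(averaged_periodogram_estimate(z, m=64, q=0.5))
```

A series consisting of a single unit impulse has a perfectly flat periodogram, so with $q = 1/2$ and even $m$ the ratio of averaged periodograms is exactly one half and the estimate is zero, which offers a simple deterministic check of the formula.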

The following section considers the consistency of the log-periodogram estimate. Section 3 evaluates the order of magnitude of the bias in several slowly varying examples, with some discussion of mean squared error and bandwidth choice. Section 4 provides some concluding remarks.

Section snippets

Consistency of log periodogram estimate

We employ the version of the log-periodogram estimate proposed by Robinson (1995a) (which is slightly simpler than Geweke and Porter-Hudak’s (1983)). For $m$ as described in the previous section, define $ν_{j} = \log j - \frac{1}{m} \sum_{k=1}^{m} \log k, \quad 1 \leq j \leq m,$ and introduce the additional notation $v_{j}^{(ℓ)} = \sum_{k=1}^{j} ν_{k}^{ℓ}, \quad 1 \leq j \leq m,$ for integer $ℓ$. The log-periodogram estimate we consider is $\hat{d} = -\frac{1}{2} \sum_{j=1}^{m} ν_{j} \log I(λ_{j}) \Big/ v_{m}^{(2)}.$ Define also $U_{j} = \log\{I(λ_{j})/f(λ_{j})\}.$
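The closed form of $\hat{d}$ translates directly into code. The sketch below is a plain-Python illustration under stated assumptions: the function names are hypothetical, the periodogram is computed by direct summation rather than a fast Fourier transform, and the estimate is split into a helper acting on periodogram ordinates so that the regression step can be checked in isolation.

```python
import cmath
import math


def periodogram(z, j):
    """I(lambda_j) = |w(lambda_j)|^2 at the Fourier frequency lambda_j = 2 pi j / n."""
    n = len(z)
    lam = 2.0 * math.pi * j / n
    w = sum(z_t * cmath.exp(1j * t * lam) for t, z_t in enumerate(z, start=1))
    w /= math.sqrt(2.0 * math.pi * n)
    return abs(w) ** 2


def log_periodogram_from_ordinates(ordinates):
    """d-hat = -(1/2) sum_j nu_j log I(lambda_j) / v_m^(2), given I(lambda_1..m)."""
    m = len(ordinates)
    if m < 2:
        raise ValueError("at least two ordinates are needed")
    mean_log = sum(math.log(k) for k in range(1, m + 1)) / m
    nu = [math.log(j) - mean_log for j in range(1, m + 1)]  # nu_j, centred log j
    v_m2 = sum(v * v for v in nu)                           # v_m^(2)
    numerator = sum(v * math.log(I_j) for v, I_j in zip(nu, ordinates))
    return -0.5 * numerator / v_m2


def log_periodogram_estimate(z, m):
    """d-hat computed from the first m periodogram ordinates of the series z."""
    return log_periodogram_from_ordinates([periodogram(z, j) for j in range(1, m + 1)])
```

If the ordinates followed an exact power law $C j^{-2d}$, the helper would return $d$ exactly, because $\sum_{j} ν_{j} = 0$ removes the constant and $\sum_{j} ν_{j} \log j = v_{m}^{(2)}$; for example, `log_periodogram_from_ordinates([3.0 * j ** -0.6 for j in range(1, 21)])` recovers $d = 0.3$ up to rounding error.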

We introduce two assumptions.

Assumption 1

As $n \to \infty$, $\frac{1}{m} \sum_{j=1}^{m} ν_{j} U_{j} = o_{p}(1).$

The unprimitive Assumption 1 can hold

Examples and rates

The paragraph following Assumption 2 argues that the assumption does not much strengthen the slow variation property of $L$ , but it is nevertheless desirable to check it in several cases, and this will desirably indicate rates of convergence. Throughout the derivations it is understood that $x$ is chosen arbitrarily large and $δ \in (0, 1]$ .
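A rough feel for the size of the bias can be had numerically. Writing $\log I(λ_{j}) = U_{j} + \log f(λ_{j})$ and treating (2) as an exact equality at the Fourier frequencies (an assumption made purely for illustration), the definition of $\hat{d}$ gives $\hat{d} - d = -\frac{1}{2} \sum_{j=1}^{m} ν_{j} \log L(1/λ_{j}) \big/ v_{m}^{(2)} - \frac{1}{2} \sum_{j=1}^{m} ν_{j} U_{j} \big/ v_{m}^{(2)}$, since $\sum_{j} ν_{j} = 0$ and $\sum_{j} ν_{j} \log j = v_{m}^{(2)}$. The Python sketch below evaluates only the first, deterministic, term; the function names are hypothetical and the choice $L(x) = \log x$ is just one convenient example.

```python
import math


def regressors(m):
    """nu_j = log j - (1/m) sum_k log k, for j = 1, ..., m."""
    mean_log = sum(math.log(k) for k in range(1, m + 1)) / m
    return [math.log(j) - mean_log for j in range(1, m + 1)]


def deterministic_bias(L, n, m):
    """-(1/2) sum_j nu_j log L(1/lambda_j) / v_m^(2), with lambda_j = 2 pi j / n.

    Assumes f(lambda_j) = L(1/lambda_j) * lambda_j**(-2 d) exactly and ignores
    the stochastic term involving U_j. L must be positive at every 1/lambda_j
    (for L = log this requires lambda_m < 1, i.e. m < n / (2 pi)).
    """
    nu = regressors(m)
    v_m2 = sum(v * v for v in nu)
    log_L = [math.log(L(n / (2.0 * math.pi * j))) for j in range(1, m + 1)]
    return -0.5 * sum(v * l for v, l in zip(nu, log_L)) / v_m2


if __name__ == "__main__":
    for n in (10**4, 10**6, 10**8):
        print(n, deterministic_bias(math.log, n, m=100))
```

With $L(x) = \log x$ and $m$ held fixed, the computed term is positive and shrinks only slowly as $n$ grows, whereas for constant $L$ it vanishes up to rounding error.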

1. $L(x) = C(1 + D x^{-β})$, $0 < β \leq 2$, $D \neq 0$.
This is actually a case of (12), and was assumed in the central limit theorem for $\hat{d}$ of Robinson (1995a), because some refinement of (12) is

Final comments

We have considered the consistency of the semiparametric log-periodogram regression memory estimate in the presence of an unanticipated slowly-varying factor in the spectral density, under a general condition on the function, and verified this condition and calculated convergence rates in several examples. As implied by the results of Soulier (2010), these convergence rates are mostly slow, to the extent that unless the bandwidth $m$ grows extremely slowly the bias will be too large to allow the

Acknowledgements

I thank two referees for comments which have led to presentational improvements, and Phillipe Soulier for alerting me to the reference Soulier (2010). This research was supported by a Cátedra de Excelencia at Universidad Carlos III de Madrid, Spanish Nacional de I+d+I Grant SEJ2007-62908/ECON, and ESRC Grant ES/J007242/1.

References (22)

C. Velasco. Non-stationary log-periodogram regression. Journal of Econometrics (1999).

R. Adenstedt. On large-sample estimation for the mean of a stationary random sequence. Annals of Statistics (1974).

N.H. Bingham et al. Regular Variation (1987).

R.L. Dobrushin et al. Non-central limit theorems for nonlinear functionals of Gaussian fields. Zeitschrift fur Wahrscheinlichkeitstheorie und verwandte Gebiete (1979).

R. Fox et al. Non-central limit theorems for quadratic forms in random variables having long-range dependence. Annals of Probability (1985).

R. Fox et al. Central limit theorems for quadratic forms in random variables having long-range dependence. Probability Theory and Related Fields (1987).

J. Geweke et al. The estimation and application of long memory time series models. Journal of Time Series Analysis (1983).

L. Giraitis et al. Rate optimal semiparametric estimation of the memory parameter of the Gaussian time series with long-range dependence. Journal of Time Series Analysis (1997).

C.W.J. Granger. The typical spectral shape of an economic variable. Econometrica (1966).

C.M. Hurvich et al. The mean squared error of Geweke and Porter-Hudak’s estimator of the memory parameter in a long-memory time series. Journal of Time Series Analysis (1998).

I.A. Ibragimov et al. Gaussian Random Processes (1978).

