Testing for adequacy of seasonal adjustment in the frequency domain

https://doi.org/10.1016/j.jspi.2020.06.012

Highlights

  • Concept of spectral peak is rigorously developed.

  • A theoretical testing framework for visual significance is provided.

  • Method provides visual significance testing with uncertainty quantification.

  • Simulations indicate minimal sample sizes needed for given spectral peak widths.

Abstract

Peaks in the spectral density estimates of seasonally adjusted data are indicative of an inadequate adjustment. Spectral peaks are currently assessed in the X-13ARIMA-SEATS program via the visual significance (VS) approach; this paper provides a rigorous statistical foundation for VS by defining measures of uncertainty for spectral peak measures, allowing for formal hypothesis testing, using the framework of fixed-bandwidth fraction asymptotics for taper-based spectral density estimates. The simulation results show that the test has good size and power properties for a variety of peak features.

Introduction

Quarterly or monthly economic time series typically exhibit seasonality, most often described via a nonstationary stochastic process with unit roots corresponding to the known seasonal frequencies (Bell and Hillmer, 1983); also see the discussion in Chapter 3 of Hylleberg (1986). Adequate estimation and removal of seasonality should correspond to the absence of spectral peaks at these same seasonal frequencies in the adjusted series. If there is residual seasonality present, the adjustment is inadequate, and hence testing for residual seasonality is a problem of widespread importance; millions of time series are seasonally adjusted each month at statistical agencies around the world, many of which utilize the software program X-13ARIMA-SEATS of the U.S. Census Bureau (U.S. Census Bureau, 2015). Testing for residual seasonality involves data arising from time series that typically have trend nonstationarity, but no longer have seasonal nonstationarity, and therefore applications of the tests can be formulated for processes that are stationary after differencing. This is the case considered in this paper.

One method of testing for the presence of residual seasonality is based on detection of spectral peaks (Findley, 2005). The detection of seasonality in raw, or unadjusted, data is a different problem that historically has used different tools; we provide a brief discussion in order to frame the problem of testing for residual seasonality.

One approach to the detection of seasonality (in raw data) is to postulate as a null hypothesis the existence of a sinusoid – corresponding to a deterministic seasonal component – at the frequency of interest, and test whether spectral density estimates warrant such a hypothesis. The early literature on spectral peak testing (see Priestley (1981)) focused on this approach, and the stable-seasonality test of Lytras et al. (2007) does as well. Any stochastic seasonal component, conceived as a nonstationary process, can include such a stable sinusoidal component without loss of generality — this is analogous to the fact that a random walk with drift can be decomposed into a linear term (its mean) plus the purely stochastic mean zero portion. Tests for stable sinusoids focus on the deterministic part of seasonality, but are not designed to address the stochastic portion. However, it is important to do so: removal of the deterministic portion alone (say, via regression) does not entail the removal of the whole stochastic seasonal, and such an approach fails to accomplish the goal of seasonal adjustment — for most economic series, the seasonality is too evolutive to be adequately captured by fixed periodic functions (Findley et al., 2017).

In summary, seasonal adjustment will typically remove deterministic seasonality as well as any nonstationary stochastic facets of seasonality. However, residual seasonality is deemed to be present if seasonal peaks exist in the spectral density of the trend-differenced adjusted process. Tests for residual seasonality can therefore be based upon a stationary time series process, and all the earlier work has recognized this. Early work on assessing the effect of seasonal adjustment appeared in Nerlove (1964) and Grether and Nerlove (1970). Pierce (1976, 1979) looked at the adequacy of seasonal adjustment by examining the magnitude of the autocorrelations at seasonal lags of the adjusted series. This approach is generalized to the Qs statistic, adopted by TRAMO-SEATS (Maravall, 2012), which is a variant of the Box–Ljung–Pierce test applied to seasonal lag autocorrelations.
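The idea of a portmanteau test restricted to seasonal lags can be illustrated with a minimal sketch. The function below computes a generic Ljung–Box-type statistic using only autocorrelations at multiples of the seasonal period; the function name, the choice of two seasonal lags, and the simulated series are assumptions for illustration, and the exact Qs definition used in TRAMO-SEATS may differ from this form.

```python
import numpy as np

def seasonal_ljung_box(x, period, n_seasonal_lags=2):
    """Ljung-Box-type statistic using only lags period, 2*period, ...

    Illustrative only: not claimed to match the Qs statistic exactly.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    xc = x - x.mean()
    denom = np.dot(xc, xc)
    stat = 0.0
    for k in range(1, n_seasonal_lags + 1):
        lag = k * period
        # Sample autocorrelation at the seasonal lag.
        r = np.dot(xc[:-lag], xc[lag:]) / denom
        stat += r**2 / (n - lag)
    return n * (n + 2) * stat

# Hypothetical monthly example: white noise with and without a seasonal cycle.
rng = np.random.default_rng(1)
t = np.arange(240)
noise = rng.standard_normal(240)
seasonal = noise + 2.0 * np.cos(2.0 * np.pi * t / 12)

lb_noise = seasonal_ljung_box(noise, period=12)
lb_seasonal = seasonal_ljung_box(seasonal, period=12)
```

In this toy setting the statistic for the series with a seasonal cycle is far larger than for pure noise, which is the behavior a seasonal-lag autocorrelation diagnostic relies on.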

Whereas a deterministic sinusoid corresponds to a jump in the spectral distribution function – and will appear in a spectral density estimate as a tall slender peak – stochastic seasonality (i.e., the residual seasonality remaining after seasonal adjustment) instead corresponds to a broader peak in the spectral density estimate (e.g., computed using an autoregressive estimator), one that nonetheless approaches an infinite height as sample size increases. It is therefore important to consider the width of a spectral peak in its assessment. Since the procedure of seasonal adjustment, viewed in the frequency domain, amounts to multiplication of a function with a peak by another function with a trough, whether the peak is transformed into a trough depends on the widths of these functions (see McElroy and Roy (2017)). For this and other reasons, Soukup and Findley (1999) considered a measure of the peak that involved the distance between ordinates of the log spectrum when examined on a grid of frequencies of mesh size π/60.

The actual measure proposed in Soukup and Findley (1999) – which has now become somewhat of a standard by virtue of its incorporation into the X-12-ARIMA software used at most international statistical agencies – is computed by comparing the spectral peak ordinate to both nearest neighbor ordinates, with respect to the chosen frequency grid; when both ordinate differences exceed a threshold (selected based upon empirical criteria), the spectral peak is declared to be “visually significant.” So far no distribution theory has been proposed for this statistic, making it difficult to rigorously determine Type I and Type II error rates. This paper puts the concept of visual significance upon a rigorous statistical footing by describing the exercise of finding a visually significant peak as a proper hypothesis testing procedure.
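The nearest-neighbor comparison described above can be sketched in a few lines. This is a minimal illustration, not the X-12-ARIMA implementation: the function name, the threshold value of 0.5, and the toy log spectrum are all assumptions chosen for the example, and the empirically selected threshold used in practice is not reproduced here.

```python
import numpy as np

def is_visually_significant(log_spec, k, threshold):
    """True when ordinate k exceeds BOTH grid neighbors by more than threshold."""
    left_gap = log_spec[k] - log_spec[k - 1]
    right_gap = log_spec[k] - log_spec[k + 1]
    return bool(min(left_gap, right_gap) > threshold)

# Frequency grid of mesh size pi/60 on [0, pi]; index 10 is the frequency pi/6.
grid = np.arange(61) * np.pi / 60

# Toy log spectrum (assumption): flat background plus a narrow bump at pi/6.
toy_log_spec = 1.5 * np.exp(-((grid - np.pi / 6) / 0.05) ** 2)

peak_found = is_visually_significant(toy_log_spec, k=10, threshold=0.5)
flat_found = is_visually_significant(np.zeros(61), k=10, threshold=0.5)
```

Taking the minimum of the two gaps encodes the requirement that both ordinate differences exceed the threshold, so a steep slope on one side alone does not count as a peak.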

In order to develop a statistical theory for spectral peaks, one first requires a theory for spectral density estimation. Spectral density estimates generally fall into two classes: model-based (e.g., the autoregressive spectral estimator, or other estimators derived from a fitted model) and nonparametric (e.g., based on smoothing the periodogram). We focus on the latter class, based upon tapering the sample autocovariances with a positive definite taper, such as the Bartlett or Daniell kernels (Priestley, 1981). Asymptotic theory for such estimates goes back to Parzen (1957), and that literature adopts the perspective that the taper bandwidth is negligible relative to sample size. More recent literature, as in Hashimzade and Vogelsang (2008), adopts the perspective of so-called fixed-b asymptotics, where the ratio of bandwidth length to sample size is assumed to be a fixed fraction b ∈ (0,1). Because the fixed-b asymptotic framework has several advantages – including a superior approximation of the sampling distribution (McElroy and Politis, 2014; Sun, 2014) – we pursue spectral peak detection with this perspective in mind. Some of the results require extensions of previous literature, such as Hashimzade and Vogelsang (2008) and McElroy and Politis (2014): we allow the frequencies of interest to depend on sample size, and they can be more general than Fourier frequencies.
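As a concrete illustration of a taper-based estimate with a fixed bandwidth fraction, the sketch below weights the sample autocovariances with the Bartlett taper and sets the bandwidth length to M = b·n. The function names, the value b = 0.1, and the simulated white-noise input are assumptions for the example; the sketch shows the general form of such estimators rather than the specific estimator or tapers studied in the paper.

```python
import numpy as np

def sample_autocov(x, max_lag):
    """Sample autocovariances at lags 0..max_lag (divisor n)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    xc = x - x.mean()
    return np.array([np.dot(xc[: n - h], xc[h:]) / n for h in range(max_lag + 1)])

def tapered_spectrum(x, freqs, b):
    """Bartlett-tapered spectral density estimate with bandwidth M = floor(b * n)."""
    n = len(x)
    M = max(1, int(np.floor(b * n)))
    lags = np.arange(min(M, n - 1) + 1)
    gamma = sample_autocov(x, lags[-1])
    # Bartlett taper: weight 1 - h/M for h < M, zero at and beyond M.
    weights = np.clip(1.0 - lags / M, 0.0, None)
    freqs = np.atleast_1d(np.asarray(freqs, dtype=float))
    # f(w) = (1 / 2pi) * [g(0) + 2 * sum_{h>=1} weight_h * g(h) * cos(h w)]
    cos_terms = np.cos(np.outer(freqs, lags[1:]))
    return (gamma[0] + 2.0 * cos_terms @ (weights[1:] * gamma[1:])) / (2.0 * np.pi)

# Hypothetical usage: white noise has a flat spectral density 1 / (2 pi).
rng = np.random.default_rng(0)
x = rng.standard_normal(2000)
freqs = np.linspace(0.0, np.pi, 200)
est = tapered_spectrum(x, freqs, b=0.1)
```

Because b is held fixed rather than shrinking with n, the estimate at any single frequency remains noticeably variable even in large samples, which is why the fixed-b framework works with a non-degenerate limiting distribution instead of a consistency result.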

Section snippets

Measuring spectral peaks

The idea of a peak in a graph is surprisingly subtle to capture through mathematical formulas. McElroy and Holan (2009) set forth a measure based upon measuring the second derivative of the spectral density — referred to as the Spectral Convexity (SC) test. The approach of Soukup and Findley (1999) compares the log autoregressive spectrum at a frequency of interest θ (in units of radians), e.g., a seasonal or trading day frequency, to two “nearest neighbors” on the left and right, some distance

Spectral estimation and asymptotic critical values

We consider tapered autocovariance function (acf) estimators defined as linear weighted combinations of the first M sample autocovariances, with the weights determined by a specified taper. The quantity M is the bandwidth length. Depending on how fast the bandwidth length grows relative to sample size, three different convergences are obtained. These are the cases of fixed bandwidth length, small bandwidth fraction, and fixed bandwidth fraction (fixed-b), respectively. Following recent work on

AR(2) peak

We start evaluating the peak detection performance of the proposed VS test using an AR(2) process $\{X_t\}$ satisfying $(1 - 2\rho\cos(\theta)B + \rho^2 B^2) X_t = \epsilon_t$ with noise variance $\sigma_\epsilon^2 = 1$. The parameterization in (12) puts a single peak at the frequency $\theta = \pi/6$. The values of $\rho$ that make the VS value at $\omega = \pi/6$ equal to $\{0, 0.05, 0.10, 0.15, 0.20, 0.30, 0.40, 0.50\}$, for a width of $\pi/30$, are $\rho \in \{0.000, 0.861, 0.915, 0.941, 0.959, 0.980, 0.9918, 0.9976\}$, respectively. Fig. 1 shows the spectrum for the different VS values.
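The shape of this AR(2) spectrum can be explored numerically. The sketch below assumes the autoregressive polynomial is 1 − 2ρcos(θ)B + ρ²B², so that its roots have angle ±θ and modulus 1/ρ; the function name and the grid resolution are illustrative choices, and the code does not attempt to reproduce the paper's VS values.

```python
import numpy as np

def ar2_spectrum(freqs, rho, theta, sigma2=1.0):
    """Spectral density of (1 - 2 rho cos(theta) B + rho^2 B^2) X_t = eps_t."""
    z = np.exp(-1j * np.asarray(freqs, dtype=float))
    phi = 1.0 - 2.0 * rho * np.cos(theta) * z + rho**2 * z**2
    return sigma2 / (2.0 * np.pi * np.abs(phi) ** 2)

theta = np.pi / 6
freqs = np.linspace(0.0, np.pi, 6001)

# Locate the spectral maximum for a few of the rho values quoted in the text.
for rho in (0.861, 0.941, 0.980):
    spec = ar2_spectrum(freqs, rho, theta)
    peak = freqs[np.argmax(spec)]
    print(f"rho={rho:.3f}  peak near {peak:.4f} rad  (pi/6 = {theta:.4f})")
```

Under this assumed parameterization, ρ = 0 gives a flat (white noise) spectrum, and as ρ approaches 1 the peak becomes taller and narrower while its location moves closer to θ.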

Table 1 shows the power (based on a

Data analysis

The X-13ARIMA-SEATS software allows users to apply VS to the raw (original) data, to the RegARIMA model residuals, the seasonally adjusted (SA) series, or the estimated irregular. However, the raw data typically has nonstationary features, so that stationary tapered spectral estimators are not appropriate, and the methods of this paper should not be applied. More precisely, the tools of this paper properly apply to stationary processes, and cannot be applied to nonstationary processes without a

Discussion

Because seasonal adjustment is an enormous activity for statistical offices, the determination of adjustment inadequacy is extremely important. A host of criteria have been proposed over the decades (summarized in Hylleberg (1986)), and recent work has focused on placing seasonal adjustment diagnostics on a rigorous statistical footing. Following the work of McElroy and Holan (2009), this paper examines the assessment of spectral peaks and incorporates a quantification of Type I error into the

CRediT authorship contribution statement

Tucker McElroy: Conceptualization, Methodology, Formal analysis, Investigation, Resources, Visualization, Supervision, Project administration, Funding acquisition. Anindya Roy: Conceptualization, Methodology, Software, Validation, Formal analysis, Data curation, Visualization.

Acknowledgments

The authors thank Xiaofeng Shao for stimulating discussions about this problem, and the referees for helpful comments.

Disclaimer

This report is released to inform interested parties of research and to encourage discussion. The views expressed on statistical issues are those of the authors and not necessarily those of the U.S. Census Bureau.

References (22)

  • Lytras D.P. et al.

    Determining seasonality: a comparison of diagnostics from X-13-ARIMA
