A nonparametric plug-in rule for selecting optimal block lengths for block bootstrap methods

doi:10.1016/j.stamet.2006.08.002

Statistical Methodology

Volume 4, Issue 3, July 2007, Pages 292-321

https://doi.org/10.1016/j.stamet.2006.08.002

Abstract

In this paper, we consider the problem of empirical choice of optimal block sizes for block bootstrap estimation of population parameters. We suggest a nonparametric plug-in principle that can be used for estimating ‘mean squared error’-optimal smoothing parameters in general curve estimation problems, and establish its validity for estimating optimal block sizes in various block bootstrap estimation problems. A key feature of the proposed plug-in rule is that it can be applied without explicit analytical expressions for the constants that appear in the leading terms of the optimal block lengths. Furthermore, we also discuss the computational efficacy of the method and explore its finite sample properties through a simulation study.

Introduction

In recent years, several block bootstrap methods have been proposed in the literature in the context of weakly dependent time series data. A key advantage of the block bootstrap methods is that unlike classical inference methods for time series data, they provide consistent estimators of various population quantities without requiring stringent parametric model assumptions. However, the accuracy of block bootstrap estimators critically depends on the block size that must be supplied by the user. The orders of magnitude of the optimal block sizes are known in some inference problems (see [16], [12], [19], [2]). However, the leading terms of these optimal block sizes depend on various population characteristics in an intricate manner, making it difficult to estimate these parameters in practice. In a seminal paper, Hall, Horowitz and Jing [12] (hereafter referred to as HHJ) describe an empirical device for data-based selection of the optimal block sizes for bootstrap variance estimation and bootstrap distribution function estimation. The key step there is to define a data-based version of the mean-squared-error (MSE) function by using the subsampling method of Politis and Romano [26] and of Hall and Jing [15], which is then minimized and rescaled to produce an estimator of the optimal block size (cf. Section 6.3 below). For the important special case of bootstrap variance estimation, Bühlmann and Künsch [2] (hereafter referred to as BK) describe a plug-in method for estimating the optimal block size based on spectral density estimation methodology. For variance estimation by the circular and the stationary block bootstrap methods, Politis and White [29] (hereafter referred to as PW) recently describe a set of plug-in estimators of the optimal block sizes for the case of the sample mean only.

In this paper, we describe an alternative approach to the empirical choice of the optimal block size for block bootstrap estimation of various population quantities, which may be thought of as a generalized ‘plug-in’ rule. Unlike traditional ‘plug-in’ rules, the proposed method employs nonparametric resampling methods to estimate the relevant constants in the leading term of the optimal block size and, hence, does not require knowledge or derivation of explicit analytical expressions for the constants. We establish consistency of the proposed method for optimal block sizes in bootstrap bias and variance estimation problems as well as in bootstrap distribution function problems, as considered in HHJ. In addition, we show that the proposed method produces consistent estimators of the optimal block size for bootstrap quantile estimation. This last result is particularly important from a practical point of view, as bootstrap quantiles are frequently used for constructing confidence intervals whose coverage probabilities crucially depend on the choice of the block size.

The proposed plug-in rule is based on the Jackknife-After-Bootstrap (JAB) method of Efron [7] and Lahiri [20]. The choice of the JAB method is partly prompted by computational efficacy of the proposed plug-in method. As shown in these papers, the JAB method produces an estimate of the variance of a block bootstrap estimator by suitably regrouping the bootstrap replicates (generated for the Monte Carlo computation) of the given block bootstrap estimator itself. As a result, the JAB method does not require any iterated resampling and enjoys remarkable computational efficacy. For estimation of the bias part, here we propose a new bias estimator that is also computationally efficacious and does not involve iterated resampling. Furthermore, we show that the proposed bias estimator is consistent for all four population characteristics (viz., bias, variance, distribution function, and quantiles) of a large class of estimators that can be expressed as smooth functions of sample means. As a consequence, the proposed ‘nonparametric plug-in’ method (hereafter referred to as the NPPI method) is computationally attractive and it provides consistent estimators of the optimal block length in different bootstrap estimation problems without explicit analytical considerations.
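The regrouping idea behind the JAB method can be illustrated with a minimal sketch. It assumes the statistic is the sample mean and the bootstrap estimator is the scaled variance. The function names `mbb_means` and `jab_variance`, the pseudo-value weighting, and the skipping of groups with too few replicates are illustrative assumptions, since the formulas are not given in this excerpt. The point it shows is that a single set of bootstrap replicates is reused for every block-deleted estimate, so no second level of resampling is needed.

```python
import numpy as np

def mbb_means(x, block_len, n_boot, rng):
    """Draw n_boot moving-block resamples; return block start indices and resample means."""
    n_blocks = x.size - block_len + 1                 # N overlapping blocks
    b = x.size // block_len                           # blocks per resample (assumed convention)
    starts = rng.integers(0, n_blocks, size=(n_boot, b))
    idx = starts[:, :, None] + np.arange(block_len)   # shape (n_boot, b, block_len)
    means = x[idx].reshape(n_boot, -1).mean(axis=1)
    return starts, means

def jab_variance(x, block_len, m, n_boot=2000, seed=0):
    """Sketch of a jackknife-after-bootstrap variance estimate for the
    block bootstrap estimator of n * Var(sample mean)."""
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    n_blocks = x.size - block_len + 1                 # N
    n1 = (x.size // block_len) * block_len            # resample size
    starts, means = mbb_means(x, block_len, n_boot, rng)
    phi_hat = n1 * means.var()                        # bootstrap estimator from all replicates

    pseudo = []
    for i in range(n_blocks - m + 1):                 # delete blocks i, ..., i + m - 1
        uses_deleted = ((starts >= i) & (starts < i + m)).any(axis=1)
        kept = means[~uses_deleted]                   # replicates built without deleted blocks
        if kept.size < 2:                             # assumption: skip groups with too few replicates
            continue
        phi_i = n1 * kept.var()                       # block-deleted bootstrap estimate
        pseudo.append((n_blocks * phi_hat - (n_blocks - m) * phi_i) / m)
    if not pseudo:
        raise ValueError("no usable replicate groups; increase n_boot or reduce m")
    pseudo = np.asarray(pseudo)
    var_jab = m / (n_blocks - m) * np.mean((pseudo - phi_hat) ** 2)
    return phi_hat, var_jab
```

The number of replicates that survive each deletion shrinks as `m` or the number of blocks per resample grows, so in practice `n_boot` has to be large enough that every group retains a reasonable number of replicates.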

Another important aspect of the method is that being a plug-in rule, it is computationally less demanding than methods based on minimizations of criteria functions, which involve computation of an estimated MSE function for several different values of the block size. Thus, the method presented here combines the computational simplicity of a ‘plug-in’ approach with the generality of a criterion-based method, such as the HHJ method. The NPPI approach can also be used effectively in other problems involving smoothing parameter selection, such as data-based selection of the MSE-optimal bandwidths for density and regression function estimations. Apart from a brief description of the NPPI method for smoothing parameter selection in a general curve estimation problem (cf. Section 2), we do not pursue such generalizations in this paper.

The rest of the paper is organized as follows. We conclude Section 1 with a brief literature review. In Section 2, we present a general description of the NPPI principle and state a specific version of the method for selecting the optimal block sizes in various bootstrap estimation problems. The key ingredients of the proposed method are resampling method-based estimation of the variance and the bias of a block bootstrap estimator, which are presented in Section 3. Here, we briefly describe the JAB method that yields a nonparametric estimator of the variance of a block bootstrap estimator, and also introduce a new bias estimator for bootstrap estimators using an asymptotic representation. We establish consistency of the proposed plug-in rule in Section 4. Some important issues involving practical implementation of the proposed NPPI method are discussed in Section 5. Finite sample properties of the method and a comparison of its performance with the existing block size selection methods are given in Section 6. Section 7 contains some concluding remarks. Proofs of all results are given in the Appendix.

Block bootstrap and block jackknife methods for dependent data have been put forward by several authors, notably by Hall [10], Carlstein [3], Künsch [16], Liu and Singh [22], Politis and Romano [25], [27], Carlstein et al. [4], and Paparoditis and Politis [24]. Properties of block-bootstrap methods have been studied by Lahiri [17], [18], [19], Davison and Hall [5], Bühlmann [1], Naik-Nimbalkar and Rajarshi [23], Hall, Horowitz and Jing [12], Hall, Lahiri and Polzehl [13], and Götze and Künsch [9], among others. The problem of data-based choice of the optimal block size has been considered by Hall, Horowitz and Jing [12], Bühlmann and Künsch [2], and Politis and White [29], among others. In his 1992 seminal paper, Efron formulated the JAB method for independent data and established its computational efficacy. The JAB method for dependent data was formulated by Lahiri [20]. For a book-length treatment of the resampling methodology for time series data, see Lahiri [30].


The ‘nonparametric plug-in’ principle

Let $\hat{φ}_{n} \equiv \hat{φ}_{n}(ℓ)$ be a block bootstrap estimator of a population parameter $φ_{n}$ based on blocks of size $ℓ$. For example, $φ_{n} \equiv n \, \mathrm{Var}(\hat{θ}_{n})$ could be the scaled variance of a given estimator $\hat{θ}_{n}$ and $\hat{φ}_{n}(ℓ) = n \, \mathrm{Var}_{*}(θ_{n}^{*})$ is the corresponding block bootstrap estimator based on blocks of size $ℓ$, where $\mathrm{Var}_{*}$ denotes the conditional variance given the data. It is known (cf. HHJ) that for many $φ_{n}$’s, the variance of the bootstrap estimator $\hat{φ}_{n}(ℓ)$ is an increasing function of the block length $ℓ$ while its bias is a
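A short worked derivation may help show why a plug-in rule needs only two constants. It assumes, purely for illustration, leading-order expansions of the form below; the constants $C_{1}$, $C_{2}$ and the exponent $r$ are hypothetical symbols that are not defined in this excerpt.

$$
\mathrm{Var}\big(\hat{φ}_{n}(ℓ)\big) \approx C_{1} \frac{ℓ^{r}}{n}, \qquad
\mathrm{Bias}\big(\hat{φ}_{n}(ℓ)\big) \approx \frac{C_{2}}{ℓ}
\quad \Longrightarrow \quad
\mathrm{MSE}(ℓ) \approx \frac{C_{2}^{2}}{ℓ^{2}} + C_{1} \frac{ℓ^{r}}{n}.
$$

Setting the derivative with respect to $ℓ$ to zero gives

$$
-\frac{2 C_{2}^{2}}{ℓ^{3}} + \frac{r C_{1} ℓ^{r-1}}{n} = 0
\quad \Longrightarrow \quad
ℓ^{0} = \left( \frac{2 C_{2}^{2}}{r C_{1}} \right)^{1/(r+2)} n^{1/(r+2)}.
$$

Under these assumed expansions with $r = 1$, the optimal block length is of order $n^{1/3}$. Estimates of the variance and the bias at a single pilot block length $ℓ$ would then yield $\hat{C}_{1} = n ℓ^{-r} \, \widehat{\mathrm{Var}}$ and $\hat{C}_{2} = ℓ \, \widehat{\mathrm{Bias}}$, which can be substituted into the expression for $ℓ^{0}$ without any analytical formula for either constant.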

The moving block bootstrap method

We begin with a brief description of the Moving Block Bootstrap (MBB) method of Künsch [16] and Liu and Singh [22]. Let $\{X_{1}, \dots, X_{n}\} \equiv \mathcal{X}_{n}$ be a finite segment of a stationary time series $\{X_{i}\}_{i \in \mathbb{Z}}$ and let $T_{n} = t_{n}(\mathcal{X}_{n}; θ)$ be a random variable of interest, where $θ$ is a population parameter. For example, we may have $T_{n} = \sqrt{n} (\bar{X}_{n} - μ)$, where $\bar{X}_{n} = n^{-1} \sum_{i = 1}^{n} X_{i}$ denotes the sample mean and $μ = E X_{1}$ denotes the population mean. Next let $ℓ$ denote the block size and with $N = n - ℓ + 1$, let $B_{i} = (X_{i}, \dots, X_{i + ℓ - 1})^{\prime}, i = 1, \dots, N$ denote the
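The block construction described here can be sketched in a few lines of Python. This is a minimal illustration for the sample mean only. The function names, the choice of `n // block_len` blocks per resample, and the scaling by the resample size are assumptions made for the example, not details taken from the text.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def mbb_resample(x, block_len, rng):
    """One MBB resample: concatenate blocks drawn with replacement
    from the N = n - block_len + 1 overlapping blocks."""
    blocks = sliding_window_view(x, block_len)    # row i holds (X_i, ..., X_{i + block_len - 1})
    n_blocks = blocks.shape[0]                    # N = n - block_len + 1
    b = x.size // block_len                       # blocks per resample (assumed convention)
    starts = rng.integers(0, n_blocks, size=b)    # block indices, sampled uniformly
    return blocks[starts].ravel()

def mbb_scaled_variance(x, block_len, n_boot=1000, seed=0):
    """Monte Carlo approximation to the MBB estimator of n * Var(sample mean)."""
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    boot_means = np.array(
        [mbb_resample(x, block_len, rng).mean() for _ in range(n_boot)]
    )
    n1 = (x.size // block_len) * block_len        # resample size
    return n1 * boot_means.var()                  # conditional variance, scaled
```

Calling `mbb_scaled_variance(x, block_len)` for several values of `block_len` on the same series shows how strongly the estimate depends on the block size, which is the dependence the block length selection rule is meant to manage.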

Theoretical results

For clarity of exposition, we describe the basic theoretical framework in Section 4.1, and state the results on consistency of the NPPI method for bootstrap bias and variance estimations in Section 4.2, and for bootstrap distribution function and quantile estimations in Section 4.3, respectively.

Implementation of the NPPI method in practice

Theorems 1 and 2 show that the NPPI estimator of the optimal block length $ℓ_{k}^{0}$ is consistent in all four cases, $k = 1, 2, 3, 4$, for a wide range of choices of the smoothing parameters $ℓ$ and $m$. Note that the choice of $ℓ$ determines the accuracy of the bias estimator proposed in Section 3.3 and the choice of $m$ determines the accuracy of the JAB variance estimator defined in Section 3.2. We now discuss choices of $ℓ$ and $m$ that yield a reasonable estimator $\hat{ℓ}_{k}^{0}$ of the optimal block size $ℓ_{k}^{0}$ for each $k =$

Simulation study

In this section, we report the results of a simulation study on finite sample performance of the NPPI method. We considered four different time series models, given by $X_{t} = (ϵ_{t} + ϵ_{t - 1}) / \sqrt{2}$, $ϵ_{t} \overset{\text{iid}}{\sim} χ^{2}(1) - 1$; $X_{t} = 0.3 X_{t - 1} + ϵ_{t}$, $ϵ_{t} \overset{\text{iid}}{\sim} χ^{2}(1) - 1$; $X_{t} = -0.1 X_{t - 1} + ϵ_{t}$, $ϵ_{t} \overset{\text{iid}}{\sim} χ^{2}(1) - 1$; and $X_{t} = Y_{t}^{2} + Y_{t} 1(Y_{t} < 0)$, where $χ^{2}(1) - 1$ denotes the Chi-squared distribution on $\mathbb{R}$ with one degree of freedom centered at its mean, $1(S)$ denotes the indicator function of a statement $S$, taking the value 1 if $S$ is true and the value 0 if $S$ is false, and
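The first three models can be simulated with a short sketch such as the following. The function names, the seed, the series length of 500, and the burn-in period for the autoregressive models are assumptions made for illustration. The fourth model is omitted because the definition of $Y_{t}$ is not available in this excerpt.

```python
import numpy as np

def centered_chisq1(size, rng):
    """Innovations eps_t ~ chi-squared(1) - 1 (mean zero, variance 2)."""
    return rng.chisquare(df=1, size=size) - 1.0

def simulate_ma1(n, rng):
    """X_t = (eps_t + eps_{t-1}) / sqrt(2)."""
    eps = centered_chisq1(n + 1, rng)
    return (eps[1:] + eps[:-1]) / np.sqrt(2.0)

def simulate_ar1(n, coef, rng, burn_in=200):
    """X_t = coef * X_{t-1} + eps_t; the burn-in (an assumption) is
    discarded so the retained segment is approximately stationary."""
    eps = centered_chisq1(n + burn_in, rng)
    x = np.empty(n + burn_in)
    x[0] = eps[0]
    for t in range(1, n + burn_in):
        x[t] = coef * x[t - 1] + eps[t]
    return x[burn_in:]

rng = np.random.default_rng(2007)
series = {
    "MA(1)": simulate_ma1(500, rng),
    "AR(1), 0.3": simulate_ar1(500, 0.3, rng),
    "AR(1), -0.1": simulate_ar1(500, -0.1, rng),
}
```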

Concluding remarks

In this paper, we propose a general method for estimating the optimal block sizes for block bootstrap estimation of various population parameters. We establish consistency of the method in different bootstrap estimation problems, including bootstrap variance estimation and bootstrap quantile estimation. An important feature of the proposed method is that it can be applied in a wide range of situations without the knowledge of exact analytical expressions for the population parameters that

References (30)

P. Bühlmann et al., Block length selection in the bootstrap for time series, Comput. Statist. Data Anal. (1999)

P. Hall, Resampling a coverage pattern, Stochastic Process. Appl. (1985)

S.N. Lahiri, Second order optimality of stationary bootstrap, Statist. Probab. Lett. (1991)

S.N. Lahiri, On Edgeworth expansion and moving block bootstrap for studentized M-estimators in multiple linear regression models, J. Multivariate Anal. (1996)

P. Bühlmann, Blockwise bootstrapped empirical process for stationary sequences, Ann. Statist. (1994)

E. Carlstein, The use of subseries methods for estimating the variance of a general statistic from a stationary time series, Ann. Statist. (1986)

E. Carlstein et al., Matched-block bootstrap for dependent data, Bernoulli (1998)

A.C. Davison et al., On studentizing and blocking methods for implementing the bootstrap with dependent data, Aust. J. Statist. (1993)

B. Efron, Bootstrap methods: Another look at the jackknife, Ann. Statist. (1979)

B. Efron, Jackknife-after-bootstrap standard errors and influence functions (with discussion), J. Roy. Statist. Soc. Ser. B (1992)

F. Götze et al., Asymptotic expansions for sums of weakly dependent random vectors, Z. Wahrsch. Verw. Gebiete (1983)

F. Götze et al., Second-order correctness of the blockwise bootstrap for stationary observations, Ann. Statist. (1996)

P. Hall, The Bootstrap and Edgeworth Expansion (1992)

P. Hall et al., On blocking rules for the bootstrap with dependent data, Biometrika (1995)

P. Hall et al., On bandwidth choice in nonparametric regression with both short- and long-range dependent errors, Ann. Statist. (1995)

^☆: Research partially supported by NSF grants no. DMS 0072571 and 0306574.
