A nonparametric plug-in rule for selecting optimal block lengths for block bootstrap methods

https://doi.org/10.1016/j.stamet.2006.08.002

Abstract

In this paper, we consider the problem of empirical choice of optimal block sizes for block bootstrap estimation of population parameters. We suggest a nonparametric plug-in principle that can be used for estimating ‘mean squared error’-optimal smoothing parameters in general curve estimation problems, and establish its validity for estimating optimal block sizes in various block bootstrap estimation problems. A key feature of the proposed plug-in rule is that it can be applied without explicit analytical expressions for the constants that appear in the leading terms of the optimal block lengths. Furthermore, we also discuss the computational efficacy of the method and explore its finite sample properties through a simulation study.

Introduction

In recent years, several block bootstrap methods have been proposed in the literature in the context of weakly dependent time series data. A key advantage of the block bootstrap methods is that unlike classical inference methods for time series data, they provide consistent estimators of various population quantities without requiring stringent parametric model assumptions. However, the accuracy of block bootstrap estimators critically depends on the block size that must be supplied by the user. The orders of magnitude of the optimal block sizes are known in some inference problems (see [16], [12], [19], [2]). However, the leading terms of these optimal block sizes depend on various population characteristics in an intricate manner, making it difficult to estimate these parameters in practice. In a seminal paper, Hall, Horowitz and Jing [12] (hereafter referred to as HHJ) describe an empirical device for data-based selection of the optimal block sizes for bootstrap variance estimation and bootstrap distribution function estimation. The key step there is to define a data-based version of the mean-squared-error (MSE) function by using the subsampling method of Politis and Romano [26] and of Hall and Jing [15], which is then minimized and rescaled to produce an estimator of the optimal block size (cf. Section 6.3 below). For the important special case of bootstrap variance estimation, Bühlmann and Künsch [2] (hereafter referred to as BK) describe a plug-in method for estimating the optimal block size based on spectral density estimation methodology. For variance estimation by the circular and the stationary block bootstrap methods, Politis and White [29] (hereafter referred to as PW) recently describe a set of plug-in estimators of the optimal block sizes for the case of the sample mean only.

In this paper, we describe an alternative approach to empirical choice of the optimal block size for block bootstrap estimation of various population quantities, which may be thought of as a generalized ‘plug-in’ rule. Unlike traditional ‘plug-in’ rules, the proposed method employs nonparametric resampling methods to estimate the relevant constants in the leading term of the optimal block size and, hence, does not require knowledge or derivation of explicit analytical expressions for the constants. We establish consistency of the proposed method for optimal block sizes in bootstrap bias and variance estimation problems as well as in bootstrap distribution function estimation problems, as considered in HHJ. In addition, we show that the proposed method produces consistent estimators of the optimal block size for bootstrap quantile estimation. This last result is particularly important from a practical point of view, as bootstrap quantiles are frequently used for constructing confidence intervals whose coverage probabilities crucially depend on the choice of the block size.
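To make the role of the constants in the leading term concrete, the following is a sketch of how an MSE-optimal block length arises. It assumes (this excerpt does not state the expansions, but they are standard in the block bootstrap literature, cf. HHJ) that the variance and the bias of a block bootstrap estimator with block length \ell have leading terms of the form shown. The symbols C_1, C_2, r, \widehat{\mathrm{VAR}} and \widehat{\mathrm{BIAS}} are notation introduced here for illustration only.

```latex
% Assumed leading-order expansions for a block bootstrap estimator
% \hat\varphi_n(\ell) with block length \ell and sample size n:
\mathrm{Var}\bigl(\hat\varphi_n(\ell)\bigr) \approx C_1\, n^{-1}\, \ell^{\,r},
\qquad
\mathrm{Bias}\bigl(\hat\varphi_n(\ell)\bigr) \approx C_2\, \ell^{-1}.

% Mean squared error = squared bias + variance:
\mathrm{MSE}(\ell) \approx C_2^{2}\, \ell^{-2} + C_1\, n^{-1}\, \ell^{\,r}.

% First-order condition in \ell and its solution:
\frac{d}{d\ell}\,\mathrm{MSE}(\ell)
  = -2\, C_2^{2}\, \ell^{-3} + r\, C_1\, n^{-1}\, \ell^{\,r-1} = 0
\;\Longrightarrow\;
\ell^{0} = \Bigl(\frac{2\, C_2^{2}}{r\, C_1}\Bigr)^{1/(r+2)} n^{1/(r+2)}.

% A plug-in rule inverts the two expansions at a pilot block length \ell,
% using estimates of the variance and bias of \hat\varphi_n(\ell):
\hat C_1 = n\, \ell^{-r}\, \widehat{\mathrm{VAR}},
\qquad
\hat C_2 = \ell\, \widehat{\mathrm{BIAS}},
\qquad
\hat\ell^{0} = \Bigl(\frac{2\, \hat C_2^{2}}{r\, \hat C_1}\Bigr)^{1/(r+2)} n^{1/(r+2)}.
```

Under these assumed expansions, r = 1 gives a block length of order n^{1/3}, which matches the known order for bootstrap bias and variance estimation. The point of a nonparametric plug-in rule is that \widehat{\mathrm{VAR}} and \widehat{\mathrm{BIAS}} come from resampling, so C_1 and C_2 never need to be written out analytically.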

The proposed plug-in rule is based on the Jackknife-After-Bootstrap (JAB) method of Efron [7] and Lahiri [20]. The choice of the JAB method is partly prompted by computational efficacy of the proposed plug-in method. As shown in these papers, the JAB method produces an estimate of the variance of a block bootstrap estimator by suitably regrouping the bootstrap replicates (generated for the Monte Carlo computation) of the given block bootstrap estimator itself. As a result, the JAB method does not require any iterated resampling and enjoys remarkable computational efficacy. For estimation of the bias part, here we propose a new bias estimator that is also computationally efficacious and does not involve iterated resampling. Furthermore, we show that the proposed bias estimator is consistent for all four population characteristics (viz., bias, variance, distribution function, and quantiles) of a large class of estimators that can be expressed as smooth functions of sample means. As a consequence, the proposed ‘nonparametric plug-in’ method (hereafter referred to as the NPPI method) is computationally attractive and it provides consistent estimators of the optimal block length in different bootstrap estimation problems without explicit analytical considerations.
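The regrouping idea can be sketched for the simple case of moving block bootstrap variance estimation for the sample mean. This is a minimal illustration and not the authors' implementation: the function name `jab_variance`, the Monte Carlo size `n_boot`, the choice of `n // block_len` resampled blocks, and in particular the pseudo-value and scaling formulas (written here as they are usually given in the JAB literature) are assumptions that should be checked against Lahiri [20] before any serious use.

```python
import numpy as np

def jab_variance(x, block_len, m, n_boot=2000, seed=0):
    """Sketch: block bootstrap estimate of n*Var(mean) and a JAB estimate
    of that estimator's variance, reusing one set of bootstrap replicates."""
    x = np.asarray(x, dtype=float)
    n = x.size
    N = n - block_len + 1                  # number of overlapping blocks
    if not 1 <= m < N:
        raise ValueError("m must satisfy 1 <= m < N")
    b = n // block_len                     # blocks per replicate (assumption)
    n1 = b * block_len                     # resample size
    block_means = np.lib.stride_tricks.sliding_window_view(x, block_len).mean(axis=1)

    rng = np.random.default_rng(seed)
    idx = rng.integers(0, N, size=(n_boot, b))     # resampled block indices
    boot_means = block_means[idx].mean(axis=1)     # one mean per replicate
    phi_hat = n1 * boot_means.var()                # bootstrap estimator

    M = N - m + 1                          # number of deleted-block groups
    pseudo = []
    for i in range(M):
        # Regroup: keep replicates that never drew blocks i, ..., i+m-1.
        keep = ~((idx >= i) & (idx < i + m)).any(axis=1)
        if keep.sum() < 2:
            continue                       # too few replicates for this group
        phi_i = n1 * boot_means[keep].var()
        # Jackknife pseudo-value (assumed form).
        pseudo.append((N * phi_hat - (N - m) * phi_i) / m)
    if not pseudo:
        raise ValueError("no usable replicates; increase n_boot or reduce m")
    pseudo = np.asarray(pseudo)
    var_jab = (m / (N - m)) * np.mean((pseudo - phi_hat) ** 2)
    return phi_hat, var_jab
```

No second level of resampling appears anywhere: each deleted-block estimate is computed from a subset of the replicates already generated for `phi_hat`, which is the source of the computational savings described above.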

Another important aspect of the method is that being a plug-in rule, it is computationally less demanding than methods based on minimizations of criteria functions, which involve computation of an estimated MSE function for several different values of the block size. Thus, the method presented here combines the computational simplicity of a ‘plug-in’ approach with the generality of a criterion-based method, such as the HHJ method. The NPPI approach can also be used effectively in other problems involving smoothing parameter selection, such as data-based selection of the MSE-optimal bandwidths for density and regression function estimations. Apart from a brief description of the NPPI method for smoothing parameter selection in a general curve estimation problem (cf. Section 2), we do not pursue such generalizations in this paper.

The rest of the paper is organized as follows. We conclude Section 1 with a brief literature review. In Section 2, we present a general description of the NPPI principle and state a specific version of the method for selecting the optimal block sizes in various bootstrap estimation problems. The key ingredients of the proposed method are resampling method-based estimation of the variance and the bias of a block bootstrap estimator, which are presented in Section 3. Here, we briefly describe the JAB method that yields a nonparametric estimator of the variance of a block bootstrap estimator, and also introduce a new bias estimator for bootstrap estimators using an asymptotic representation. We establish consistency of the proposed plug-in rule in Section 4. Some important issues involving practical implementation of the proposed NPPI method are discussed in Section 5. Finite sample properties of the method and a comparison of its performance with the existing block size selection methods are given in Section 6. Section 7 contains some concluding remarks. Proofs of all results are given in the Appendix.

Block bootstrap and block jackknife methods for dependent data have been put forward by several authors, notably by Hall [10], Carlstein [3], Künsch [16], Liu and Singh [22], Politis and Romano [25], [27], Carlstein et al. [4], and Paparoditis and Politis [24]. Properties of block-bootstrap methods have been studied by Lahiri [17], [18], [19], Davison and Hall [5], Bühlmann [1], Naik-Nimbalkar and Rajarshi [23], Hall, Horowitz and Jing [12], Hall, Lahiri and Polzehl [13], and Götze and Künsch [9], among others. The problem of data-based choice of the optimal block size has been considered by Hall, Horowitz and Jing [12], Bühlmann and Künsch [2], and Politis and White [29], among others. In his 1992 seminal paper, Efron formulated the JAB method for independent data and established its computational efficacy. The JAB method for dependent data was formulated by Lahiri [20]. For a book-length treatment of the resampling methodology for time series data, see Lahiri [30].

The ‘nonparametric plug-in’ principle

Let $\hat\varphi_n \equiv \hat\varphi_n(\ell)$ be a block bootstrap estimator of a population parameter $\varphi_n$ based on blocks of size $\ell$. For example, $\varphi_n \equiv n\,\mathrm{Var}(\hat\theta_n)$ could be the scaled variance of a given estimator $\hat\theta_n$ and $\hat\varphi_n(\ell) = n\,\mathrm{Var}_*(\theta_n^*)$ is the corresponding block bootstrap estimator based on blocks of size $\ell$, where $\mathrm{Var}_*$ denotes the conditional variance given the data. It is known (cf. HHJ) that for many $\varphi_n$’s, the variance of the bootstrap estimator $\hat\varphi_n(\ell)$ is an increasing function of the block length $\ell$ while its bias is a

The moving block bootstrap method

We begin with a brief description of the Moving Block Bootstrap (MBB) method of Künsch [16] and Liu and Singh [22]. Let $\{X_1,\ldots,X_n\} \equiv \mathcal{X}_n$ be a finite segment of a stationary time series $\{X_i\}_{i\in\mathbb{Z}}$ and let $T_n = t_n(\mathcal{X}_n;\theta)$ be a random variable of interest, where $\theta$ is a population parameter. For example, we may have $T_n = \sqrt{n}(\bar{X}_n - \mu)$, where $\bar{X}_n = n^{-1}\sum_{i=1}^{n} X_i$ denotes the sample mean and $\mu = EX_1$ denotes the population mean. Next let $\ell$ denote the block size and with $N = n - \ell + 1$, let $B_i = (X_i,\ldots,X_{i+\ell-1})$, $i = 1,\ldots,N$ denote the
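The MBB construction described above can be sketched in a few lines for the sample-mean example. This is an illustrative sketch under stated assumptions rather than the paper's procedure: the function name `mbb_scaled_variance`, the Monte Carlo size `n_boot`, and the choice of drawing `n // block_len` blocks per replicate are introduced here and do not come from the text.

```python
import numpy as np

def mbb_scaled_variance(x, block_len, n_boot=500, seed=0):
    """Monte Carlo MBB estimate of the scaled variance n * Var(sample mean)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    if not 1 <= block_len <= n:
        raise ValueError("block_len must be between 1 and n")
    n_blocks = n - block_len + 1           # N = n - l + 1 overlapping blocks
    # Row i holds the block B_i = (X_i, ..., X_{i+l-1}).
    blocks = np.lib.stride_tricks.sliding_window_view(x, block_len)
    b = n // block_len                     # blocks per replicate (assumption)
    rng = np.random.default_rng(seed)
    # Draw b block indices with replacement for each bootstrap replicate.
    idx = rng.integers(0, n_blocks, size=(n_boot, b))
    # All blocks have equal length, so a replicate's mean is the
    # average of the means of its selected blocks.
    boot_means = blocks.mean(axis=1)[idx].mean(axis=1)
    # Scale by the resample size b * l.
    return b * block_len * boot_means.var()
```

With `block_len=1` the scheme reduces to the ordinary bootstrap for independent data, so the result should be close to the sample variance of `x`; larger block lengths retain more of the serial dependence within each block.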

Theoretical results

For clarity of exposition, we describe the basic theoretical framework in Section 4.1, and state the results on consistency of the NPPI method for bootstrap bias and variance estimations in Section 4.2, and for bootstrap distribution function and quantile estimations in Section 4.3, respectively.

Implementation of the NPPI method in practice

Theorems 1 and 2 show that the NPPI estimator of the optimal block length $\ell_k^0$ is consistent in all four cases, $k=1,2,3,4$, for a wide range of choices of the smoothing parameters $\ell$ and $m$. Note that the choice of $\ell$ determines the accuracy of the bias estimator proposed in Section 3.3 and the choice of $m$ determines the accuracy of the JAB variance estimator, defined in Section 3.2. We now discuss choices of $\ell$ and $m$ that yield a reasonable estimator $\hat\ell_k^0$ of the optimal block size $\ell_k^0$ for each $k=$

Simulation study

In this section, we report the results of a simulation study on finite sample performance of the NPPI method. We considered four different time series models, given by $X_t = (\epsilon_t + \epsilon_{t-1})/2$, $\epsilon_t \sim_{\mathrm{iid}} \chi^2(1) - 1$; $X_t = 0.3X_{t-1} + \epsilon_t$, $\epsilon_t \sim_{\mathrm{iid}} \chi^2(1) - 1$; $X_t = 0.1X_{t-1} + \epsilon_t$, $\epsilon_t \sim_{\mathrm{iid}} \chi^2(1) - 1$; $X_t = Y_t^2 + Y_t\,1(Y_t < 0)$, where $(\chi^2(1) - 1)$ denotes the Chi-squared distribution on $\mathbb{R}$ with one degree of freedom centered at its mean, $1(S)$ denotes the indicator function of a statement $S$, taking the value 1 if $S$ is true and the value 0 if $S$ is false, and

Concluding remarks

In this paper, we propose a general method for estimating the optimal block sizes for block bootstrap estimation of various population parameters. We establish consistency of the method in different bootstrap estimation problems, including bootstrap variance estimation and bootstrap quantile estimation. An important feature of the proposed method is that it can be applied in a wide range of situations without the knowledge of exact analytical expressions for the population parameters that

References (30)

  • F. Götze et al., Asymptotic expansions for sums of weakly dependent random vectors, Z. Wahrsch. Verw. Gebiete (1983)
  • F. Götze et al., Second-order correctness of the blockwise bootstrap for stationary observations, Ann. Statist. (1996)
  • P. Hall, The Bootstrap and Edgeworth Expansion (1992)
  • P. Hall et al., On blocking rules for the bootstrap with dependent data, Biometrika (1995)
  • P. Hall et al., On bandwidth choice in nonparametric regression with both short- and long-range dependent errors, Ann. Statist. (1995)

Research partially supported by NSF grants no. DMS 0072571 and 0306574.