Abstract
The construction of frequentist confidence intervals that include some element of data-based model selection or model averaging is an active area of research. Assessments of the performance of such intervals, in terms of coverage and expected length, have yielded few positive results. Efron (2014) proposed a confidence interval centred on a bootstrap smoothed estimator, with width proportional to an estimator of Efron’s delta method approximation to the standard deviation of this estimator. Recently, Kabaila and Wijethunga assessed the performance of this confidence interval using a testbed consisting of two nested linear regression models, with error variance assumed known. This interval was shown to have far better coverage properties than the corresponding post-model-selection confidence interval. However, its expected length properties were not as good as had been hoped for. For this testbed, we ask the following question: does there exist a formula for the data-based width of a confidence interval centred on the bootstrap smoothed estimator such that the interval has good performance in terms of both coverage and expected length? Using a decision-theoretic performance bound, we answer this question in the negative.
References
Bickel P, Doksum K (1977) Mathematical statistics. Basic ideas and selected topics. Holden-Day, Oakland
Blyth C (1951) On minimax statistical decision procedures and their admissibility. Ann Math Stat 22:22–42
Breiman L (1996) Bagging predictors. Mach Learn 24:123–140
Buckland S, Burnham K, Augustin N (1997) Model selection: an integral part of inference. Biometrics 53:603–618
Efron B (2014) Estimation and accuracy after model selection. J Am Stat Assoc 109:991–1007
Efron B, Hastie T (2016) Computer age statistical inference: algorithms, evidence, and data science. Cambridge University Press, New York
Fletcher D (2018) Model averaging. Springer, Berlin
Fletcher D, Turek D (2011) Model-averaged profile likelihood intervals. J Agric Biol Environ Stat 17:38–51
Giri K (2008) Confidence intervals in regression utilizing prior information. PhD thesis, Department of Mathematics and Statistics, La Trobe University
Hall P (1992) The bootstrap and Edgeworth expansion. Springer, New York
Hjort N (2014) Comment on ‘estimation and accuracy after model selection’ by Bradley Efron. J Am Stat Assoc 109:1017–1020
Hjort N, Claeskens G (2003) Frequentist model average estimators. J Am Stat Assoc 98:879–899
Hodges J, Lehmann E (1952) The use of previous experience in reaching statistical decisions. Ann Math Stat 23:396–407
Johnstone I (2019) Gaussian estimation: sequence and wavelet models. Book Draft, version of September 16, 2019. https://imjohnstone.su.domains/
Kabaila P (2009) The coverage properties of confidence regions after model selection. Int Stat Rev 77:405–414
Kabaila P (2018) On the minimum coverage probability of model averaged tail area confidence intervals. Can J Stat 46:279–297
Kabaila P, Giri K (2009) Confidence intervals in regression utilizing prior information. J Stat Plan Inference 139:3419–3429
Kabaila P, Giri K (2013) Further properties of frequentist confidence intervals in regression that utilize uncertain prior information. Aust N Z J Stat 55:259–270
Kabaila P, Kong Y (2016) Lower bounds on integrated risk, subject to inequality constraints. Aust N Z J Stat 58:293–315
Kabaila P, Tuck J (2008) Confidence intervals utilizing prior information in the Behrens–Fisher problem. Aust N Z J Stat 50:309–328
Kabaila P, Wijethunga C (2019) Confidence intervals centred on bootstrap smoothed estimators. Aust N Z J Stat 61:19–38
Kabaila P, Wijethunga C (2019) On confidence intervals centred on bootstrap smoothed estimators. Stat 8:e233
Kabaila P, Welsh A, Abeysekera W (2016) Model-averaged confidence intervals. Scand J Stat 43:35–48
Kabaila P, Welsh A, Mainzer R (2017) The performance of model averaged tail area confidence intervals. Commun Stat Theory Methods 46:10718–10732
Kabaila P, Welsh A, Wijethunga C (2020) Finite sample properties of confidence intervals centered on a model averaged estimator. J Stat Plan Inference 207:10–26
Kempthorne P (1983) Minimax-Bayes compromise estimators. In: Proceedings of the business and economic statistics section. American Statistical Association, Washington, pp 563–573
Kempthorne P (1987) Numerical specification of discrete least favorable prior distributions. SIAM J Sci Stat Comput 8:171–184
Kempthorne P (1988) Controlling risks under different loss functions: the compromise decision problem. Ann Stat 16:1594–1608
Leeb H, Pötscher B (2005) Model selection and inference: facts and fiction. Econom Theory 21:21–59
Mainzer R, Kabaila P (2019) ciuupi: An R package for computing confidence intervals that utilize uncertain prior information. R Journal 11:323–336
Turek D, Fletcher D (2012) Model-averaged Wald confidence intervals. Comput Stat Data Anal 56:2809–2815
Wald A (1950) Statistical decision functions. Wiley, New York
Wijethunga C (2019) Confidence intervals constructed by model averaging and bootstrap smoothing. PhD thesis, Department of Mathematics and Statistics, La Trobe University
Appendix
We will express all quantities of interest in terms of the random vector \((\widehat{\theta }, \widehat{\gamma })\), which has the following bivariate normal distribution:
A.1 Proof of Theorem 1
The following proof is based, in part, on the derivations described in Section 4.3 of Giri (2008). The coverage probability of the confidence interval \(\text {CI}(s)\) is
where \(G = (\widehat{\theta } - \theta )/\big (\sigma \,{v_\theta ^{1/2}} \big )\). It follows from (6) that
For given \(\rho \) and function s, the coverage probability of \(\text {CI}(s)\) is a function of \(\gamma \). We denote this coverage probability by \(c(\gamma ; s, \rho )\).
Since b and s are odd and even functions, respectively,
where \(G^{\prime } = -G\) and \(\widehat{\gamma }^{\prime } = - \widehat{\gamma }\). It follows from (7) that
Hence \(c(\gamma ; s, \rho ) = c(-\gamma ; s, \rho )\).
Let \(\bar{k}(x) = k(x)\) for \(|x| < c \, \); otherwise \(\bar{k}(x) = 0\). Since \(b(x) = \rho \, \bar{k}(x)\),
where \(G^{\prime } = -G\). It follows from (7) that
Hence \(c(\gamma ; s, \rho ) = c(\gamma ; s, -\rho )\).
It follows from (7) that the probability distribution of G, conditional on \(\widehat{\gamma }=h\), is \(N\big (\rho (h-\gamma ), 1-\rho ^2 \big )\). Note that
where \(\widetilde{G} \sim N\big (\rho (h-\gamma ), 1-\rho ^2 \big )\). Thus
The usual \(1-\alpha \) confidence interval based on the full model \(\mathcal{M}_2\) has coverage probability \(1-\alpha \). Thus
Therefore \(1-\alpha \) is equal to
It follows from this equality and (8) that
Change the variable of integration to \(y=-h\) in the second integral. The result \(c(\gamma ; s, \rho ) = 1 - \alpha - R_1(s, \gamma )\) now follows from the fact that both s and \(\phi \) are even functions. \(\square \)
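The conditioning step used in the proof above is an instance of the standard formula for a bivariate normal distribution. As a sketch, assume (consistently with the conditional distribution stated in the proof) that \((G, \widehat{\gamma })\) is bivariate normal with means \((0, \gamma )\), unit variances and correlation \(\rho \); the bounds \(l\) and \(u\) below are generic placeholders introduced only for illustration.

```latex
\begin{align*}
E\big(G \mid \widehat{\gamma} = h\big)
  &= E(G) + \frac{\operatorname{Cov}(G, \widehat{\gamma})}{\operatorname{Var}(\widehat{\gamma})}
     \big(h - E(\widehat{\gamma})\big)
   = \rho\,(h - \gamma), \\
\operatorname{Var}\big(G \mid \widehat{\gamma} = h\big)
  &= \operatorname{Var}(G) - \frac{\operatorname{Cov}(G, \widehat{\gamma})^2}{\operatorname{Var}(\widehat{\gamma})}
   = 1 - \rho^2, \\
P\big(l \le \widetilde{G} \le u\big)
  &= \Phi\!\left(\frac{u - \rho\,(h - \gamma)}{\sqrt{1 - \rho^2}}\right)
   - \Phi\!\left(\frac{l - \rho\,(h - \gamma)}{\sqrt{1 - \rho^2}}\right),
  \qquad \widetilde{G} \sim N\big(\rho\,(h - \gamma),\, 1 - \rho^2\big),
\end{align*}
```

where \(\Phi \) denotes the \(N(0,1)\) distribution function. Any conditional probability of an interval event for G given \(\widehat{\gamma } = h\) can be evaluated in this way.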
A.2 Proof of Theorem 2
The following proof is based, in part, on the derivations described in Section 4.3 of Giri (2008). Note that
since \(s(x)=z_{1-\alpha /2}\) for all \(|x| \ge c\). Obviously,
It follows from this equality and (9) that
Change the variable of integration to \(y=-h\) in the first integral on the right-hand side. The fact that both s and \(\phi \) are even functions implies that (1) is true. \(\square \)
A.3 Computation of the function \(\varvec{s_{\gamma , \nu }}\) for given \(\varvec{(\gamma ,\nu )}\)
Throughout this section we suppose that \((\varvec{\gamma },\varvec{\nu })\) is given. We describe the computation of \(s_{\gamma , \nu }\), a value of \(s \in \mathcal{D}\) that minimizes \(\widetilde{g}(s, \varvec{\gamma }, \varvec{\nu })\). Straightforward manipulations show that
where \(q(x; h, \varvec{\gamma }, \varvec{\nu })\) is defined to be
Recall that the functions \(\ell \) and \(\ell ^{\dag }\) are defined in the statement of Theorem 1. It follows from (10) that a function \(s_{\varvec{\gamma }, \varvec{\nu }}\), defined as a minimizer of \(\widetilde{g}(s, \varvec{\gamma }, \varvec{\nu })\) over \(s \in \mathcal{D}\), may be found as follows. We set \(s_{\varvec{\gamma }, \varvec{\nu }}(h)\), for any \(h \in [0, c]\), to be a minimizer over \(x\in [0, \infty )\) of \(q(x; h, \varvec{\gamma }, \varvec{\nu })\).
Now \(q(x; h, \varvec{\gamma }, \varvec{\nu })\) is a continuous function of \(x \in [0, \infty )\) for all \(h \in [0,c]\) and every given \((\varvec{\gamma }, \varvec{\nu })\). An examination of some examples of this function of \(x \in [0, \infty )\) shows that it may have several local minima, including the possibility of a local minimum at \(x = 0\). Consequently, the value of \(x \in [0, \infty )\) that minimizes \(q(x; h, \varvec{\gamma }, \varvec{\nu })\) may change discontinuously as h increases. In other words, the function \(s_{\varvec{\gamma }, \varvec{\nu }}(h)\) may have discontinuities. Figure 5 of the Supplementary Information provides some illustrations of functions \(q(x; h, \varvec{\gamma }, \varvec{\nu })\) of \(x\in [0, \infty )\) that have two local minima. Figure 4 of the Supplementary Information provides an illustration of a function \(s_{\varvec{\gamma }, \varvec{\nu }}(h)\) with discontinuities.
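The way a global minimizer can jump as h varies may be seen in a small numerical sketch. The function `q_toy` below is a hypothetical double-well function chosen purely for illustration; it is not the function \(q(x; h, \varvec{\gamma }, \varvec{\nu })\) of this appendix, and the grid search is only a crude stand-in for the procedure described later.

```python
def q_toy(x, h):
    """Hypothetical double-well stand-in for q(x; h, gamma, nu); NOT the paper's q.

    It has one well at x = 0 and one near x = 2; the linear term tilts
    the balance between the two wells as h changes.
    """
    return x**2 * (x - 2.0)**2 + (h - 0.5) * x


def grid_argmin(h, upper=4.0, step=0.01):
    """Crude global minimizer of q_toy(., h) over a grid on [0, upper]."""
    n = int(round(upper / step))
    grid = [i * step for i in range(n + 1)]
    return min(grid, key=lambda x: q_toy(x, h))


if __name__ == "__main__":
    # The minimizer stays close to 2 for h < 0.5 and drops to 0 for h > 0.5.
    for h in (0.40, 0.49, 0.51, 0.60):
        print(h, round(grid_argmin(h), 2))
```

Although `q_toy` is continuous in both arguments, its minimizer over x is a discontinuous function of h, which is the behaviour described above for \(s_{\varvec{\gamma }, \varvec{\nu }}(h)\).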
To evaluate the lower bound (3), we need to evaluate
Although the function \(s_{\varvec{\gamma }, \varvec{\nu }}(h)\) may have discontinuities, the integrand of the integral on the right-hand side of (11) is a continuous function of \(h \in [0, c]\). An illustration of this, when the function \(s_{\varvec{\gamma }, \varvec{\nu }}(h)\) has discontinuities, is provided by Figure 3 of the Supplementary Information.
To carry out the computation of the function \(s_{\varvec{\gamma }, \varvec{\nu }}\) accurately and effectively, we use the properties of \(dq(x; h, \varvec{\gamma }, \varvec{\nu })/dx\), considered as a function of x, described in Appendix A.4. Suppose that \(h \in [0, c]\) is given. Theorem 3 of Appendix A.4 leads to the procedure described at the end of this appendix for finding an interval \(\big [0, \widetilde{x}\big ]\) that must contain a value of \(x \ge 0\) that minimizes \(q(x; h, \varvec{\gamma }, \varvec{\nu })\).
We use the following two-step procedure to find the value of \(x \in [0, \widetilde{x}]\) that minimizes \(q(x; h, \varvec{\gamma }, \varvec{\nu })\). In Step 1 we find all of the local minimizers; in Step 2 we compare the corresponding local minima to find the global minimum.
Step 1: By considering \(dq(x; h, \varvec{\gamma }, \varvec{\nu })/dx\), find all the local minimizers of \(q(x; h, \varvec{\gamma }, \varvec{\nu })\) in the interval \([0, \widetilde{x}]\). Define w to be the smallest integer that is greater than or equal to \(10 \, \widetilde{x}\). We evaluate \(dq(x; h, \varvec{\gamma }, \varvec{\nu })/dx\) on the evenly spaced grid \(x_1=0, x_2=0.1, x_3=0.2, \dots , x_w\) of values of x. To find the values of \(x \in [0, x_w]\) that are local minimizers of \(q(x; h, \varvec{\gamma }, \varvec{\nu })\), we need to consider the following two cases.
Case 1: \(x = 0\)
The point \(x=0\) is a local minimizer of \(q(x; h, \varvec{\gamma }, \varvec{\nu })\) if either (i) \(dq(0; h, \varvec{\gamma }, \varvec{\nu })/dx > 0\), or (ii) \(dq(0; h, \varvec{\gamma }, \varvec{\nu })/dx = 0\) and \(dq(x_2; h, \varvec{\gamma }, \varvec{\nu })/dx > 0\); otherwise \(x=0\) is not a local minimizer.
Case 2: \(0< x < x_w\)
If \(dq(x_i; h, \varvec{\gamma }, \varvec{\nu })/dx < 0\) and \(dq(x_{i+1}; h, \varvec{\gamma }, \varvec{\nu })/dx > 0\), then \(dq(x; h, \varvec{\gamma }, \varvec{\nu })/dx\) has a zero in the interval \([x_i, x_{i+1}]\) that is a local minimizer of \(q(x; h, \varvec{\gamma }, \varvec{\nu })\). We find this zero using the R function uniroot, to which we provide the interval \([x_i, x_{i+1}]\). Also, if \(dq(x_i; h, \varvec{\gamma }, \varvec{\nu })/dx = 0\) and \(dq(x_{i-1}; h, \varvec{\gamma }, \varvec{\nu })/dx < 0\) and \(dq(x_{i+1}; h, \varvec{\gamma }, \varvec{\nu })/dx > 0\) then \(x_i\) is a zero of \(dq(x; h, \varvec{\gamma }, \varvec{\nu })/dx\) that is a local minimizer of \(q(x; h, \varvec{\gamma }, \varvec{\nu })\).
Step 2: Evaluate \(q(x; h, \varvec{\gamma }, \varvec{\nu })\) at the local minimizers of \(q(x; h, \varvec{\gamma }, \varvec{\nu })\) found in Step 1. The global minimum of \(q(x; h, \varvec{\gamma }, \varvec{\nu })\) is simply the minimum of all of the local minima.
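The two-step procedure above can be sketched in code as follows. This is a minimal illustration under stated assumptions, not the implementation used for the computations reported here: it is written in Python rather than R, a plain bisection routine stands in for uniroot, the grid is extended to the first grid point at or beyond \(\widetilde{x}\), and the functions `q_toy` and `dq_toy` are hypothetical stand-ins for \(q\) and \(dq/dx\).

```python
import math


def bisect_root(f, lo, hi, tol=1e-10):
    """Root of f in [lo, hi], assuming f(lo) < 0 < f(hi).

    Plain bisection, used here as a stand-in for R's uniroot.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def local_minimizers(dq, x_tilde, step=0.1):
    """Step 1: local minimizers of q on [0, x_tilde], found from the sign of dq."""
    n = max(1, math.ceil(x_tilde / step))
    grid = [i * step for i in range(n + 1)]  # assumption: grid reaches x_tilde
    d = [dq(x) for x in grid]
    mins = []
    # Case 1: x = 0
    if d[0] > 0 or (d[0] == 0 and d[1] > 0):
        mins.append(0.0)
    # Case 2: sign change of dq from negative to positive between grid points
    for i in range(n):
        if d[i] < 0 and d[i + 1] > 0:
            mins.append(bisect_root(dq, grid[i], grid[i + 1]))
    # Case 2 (continued): dq is exactly zero at an interior grid point
    for i in range(1, n):
        if d[i] == 0 and d[i - 1] < 0 and d[i + 1] > 0:
            mins.append(grid[i])
    return mins


def global_minimizer(q, dq, x_tilde, step=0.1):
    """Step 2: compare q at the local minimizers and return the best one."""
    return min(local_minimizers(dq, x_tilde, step), key=q)


# Hypothetical test function with two local minima on [0, 3]; NOT the paper's q.
def q_toy(x):
    return x**2 * (x - 2.0)**2 - 0.1 * x


def dq_toy(x):
    return 4.0 * x * (x - 1.0) * (x - 2.0) - 0.1


if __name__ == "__main__":
    print(local_minimizers(dq_toy, 3.0))
    print(global_minimizer(q_toy, dq_toy, 3.0))
```

For this toy function the sketch finds one local minimizer close to 0 and another just above 2, and Step 2 selects the latter. Searching for sign changes of the derivative, rather than minimizing q directly, is what allows every local minimum on the grid to be detected before the comparison in Step 2.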
A.4 Properties of \(\varvec{dq(x; h, \varvec{\gamma }, \varvec{\nu })/dx}\) considered as a function of \(\varvec{x}\)
It is straightforward to show that
and
It follows that
where \(t_1(h, \varvec{\gamma }, \varvec{\nu })\) is defined to be
and \(t_2(x; h, \varvec{\gamma }, \varvec{\nu })\) is defined to be
with
Suppose that \(h \in [0,c]\) and \((\varvec{\gamma }, \varvec{\nu })\) are given. Then \(t_1(h, \varvec{\gamma }, \varvec{\nu })\) is a fixed positive number. Observe that \(t_2(x; h, \varvec{\gamma }, \varvec{\nu })\) is a function of \(x \in [0, \infty )\) that can only take positive values and \(d\ell (h, \gamma ; x)/dx\) approaches 0 as \(x \rightarrow \infty \). We will use the following theorem to find \(\widetilde{x} < \infty \), such that \(dq(x; h, \varvec{\gamma }, \varvec{\nu })/dx > 0\) for all \(x \ge \widetilde{x}\). This implies that a value of x that minimizes \(q(x; h, \varvec{\gamma }, \varvec{\nu })\) cannot belong to the interval \([\widetilde{x}, \infty )\).
Theorem 3
Let \(\mu (h, \rho , \gamma ) = b(h) - \rho (h-\gamma )\). Then \(t_2(x; h, \varvec{\gamma }, \varvec{\nu })\) is a decreasing function of \(x \in \big [x^*, \infty \big )\), where \(x^*\) is equal to
Proof
We first prove that, for every \(h \in \mathbb {R}\), \(d \ell (h, \gamma ; x)/dx\) is a decreasing function of \(x \in \big [|\mu (h, \rho , \gamma )|, \infty )\). Observe that, for all \(x \ge 0\),
This is a decreasing function of \(x \in \big [|\mu (h, \rho , \gamma )|, \infty )\). Consequently, \(d \ell (h, \gamma ; x)/dx\) and \(d \ell (-h, \gamma ; x)/dx\) are decreasing functions of \(x \in \big [|\mu (h, \rho , \gamma )|, \infty )\) and \(x \in \big [|\mu (-h, \rho , \gamma )|, \infty )\), respectively.
Thus \(d \ell (h, \gamma _1(j); x)/dx\) is a decreasing function of \(x \in \big [|\mu (h, \rho , \gamma _1(j))|, \infty \big )\), for \(j = 1, \dots , m_1\). Therefore
is a decreasing function of \(x \in \Big [\underset{j = 1, \dots , m_1}{\textrm{max}} |\mu (h, \rho , \gamma _1(j)) |, \infty \Big )\). Similarly, \(d \ell (-h, \gamma _1(j); x)/dx\) is a decreasing function of \(x \in \big [|\mu (-h, \rho , \gamma _1(j))|, \infty \big )\), for \(j = 1, \dots , m_1\). Therefore
is a decreasing function of \(x \in \Big [\underset{j = 1, \dots , m_1}{\textrm{max}} |\mu (-h, \rho , \gamma _1(j)) |, \infty \Big )\). Therefore \(t_2(x; h, \varvec{\gamma }, \varvec{\nu })\) is a decreasing function of \(x \in \big [x^*, \infty \big )\). \(\square \)
We use Theorem 3 to find \(\widetilde{x} < \infty \), such that \(dq(x; h, \varvec{\gamma }, \varvec{\nu })/dx > 0\) for all \(x \ge \widetilde{x}\) as follows. First evaluate \(x^*\) and then \(dq(x^*; h, \varvec{\gamma }, \varvec{\nu })/dx\). If \(dq(x^*; h, \varvec{\gamma }, \varvec{\nu })/dx > 0\) then set \(\widetilde{x} = x^*\) and stop; otherwise use the \(\textsf{R}\) function \(\textsf{uniroot}\) to find the solution for \(x \in [x^*, \infty )\) of \(dq(x; h, \varvec{\gamma }, \varvec{\nu })/dx = 0\) and then set \(\widetilde{x}\) equal to this solution.
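The search for \(\widetilde{x}\) described above can be sketched as follows. This is an illustration only, under the assumptions that the derivative is increasing on \([x^*, \infty )\) and is eventually positive (as the text above indicates); bisection with a doubling bracket stands in for uniroot, and `dq_toy`, with its constant 0.05 and Gaussian-shaped decreasing term, is a hypothetical stand-in for \(dq(x; h, \varvec{\gamma }, \varvec{\nu })/dx\).

```python
import math


def find_x_tilde(dq, x_star, tol=1e-10):
    """Return x_tilde >= x_star with dq(x) > 0 for all x >= x_tilde.

    Assumes dq is increasing on [x_star, infinity) and eventually positive.
    """
    if dq(x_star) > 0:
        return x_star
    # Expand a bracket [lo, hi] until dq(hi) > 0, doubling the width each time.
    lo, width = x_star, 1.0
    hi = lo + width
    while dq(hi) <= 0:
        lo = hi
        width *= 2.0
        hi = lo + width
    # Bisection (stand-in for R's uniroot) on the bracket.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if dq(mid) > 0:
            hi = mid
        else:
            lo = mid
    # Return the upper end of the bracket, where dq is strictly positive.
    return hi


# Hypothetical derivative: a positive constant minus a positive term that
# decreases to 0 as x grows; NOT the paper's dq/dx.
def dq_toy(x):
    return 0.05 - math.exp(-0.5 * x * x)


if __name__ == "__main__":
    print(find_x_tilde(dq_toy, 0.5))  # root of dq_toy to the right of 0.5
    print(find_x_tilde(dq_toy, 3.0))  # dq_toy(3.0) > 0, so x_star is returned
```

Returning the upper end of the final bracket, rather than its midpoint, ensures that the derivative is strictly positive at the returned value, up to the chosen tolerance from the exact root.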
About this article
Cite this article
Kabaila, P., Wijethunga, C. Confidence intervals centred on bootstrap smoothed estimators: an impossibility result. Stat Papers (2023). https://doi.org/10.1007/s00362-023-01454-9
DOI: https://doi.org/10.1007/s00362-023-01454-9