The minimum coverage probability of confidence intervals in regression after a preliminary F test

https://doi.org/10.1016/j.jspi.2011.11.002

Abstract

Consider a linear regression model with regression parameter $\beta = (\beta_1, \dots, \beta_p)$ and independent normal errors. Suppose the parameter of interest is $\theta = a^T \beta$, where $a$ is specified. Define the $s$-dimensional parameter vector $\tau = C^T \beta - t$, where $C$ and $t$ are specified. Suppose that we carry out a preliminary F test of the null hypothesis $H_0: \tau = 0$ against the alternative hypothesis $H_1: \tau \neq 0$. It is common statistical practice to then construct a confidence interval for $\theta$ with nominal coverage $1 - \alpha$, using the same data, based on the assumption that the selected model had been given to us a priori (as the true model). We call this the naive $1 - \alpha$ confidence interval for $\theta$. This assumption is false and it may lead to this confidence interval having minimum coverage probability far below $1 - \alpha$, making it completely inadequate. We provide a new elegant method for computing the minimum coverage probability of this naive confidence interval, that works well irrespective of how large $s$ is. A very important practical application of this method is to the analysis of covariance. In this context, $\tau$ can be defined so that $H_0$ expresses the hypothesis of “parallelism”. Applied statisticians commonly recommend carrying out a preliminary F test of this hypothesis. We illustrate the application of our method with a real-life analysis of covariance data set and a preliminary F test for “parallelism”. We show that the naive 0.95 confidence interval has minimum coverage probability 0.0846, showing that it is completely inadequate.

Introduction

Consider the linear regression model $Y = X\beta + \varepsilon$, where $Y$ is a random $n$-vector of responses, $X$ is a known $n \times p$ matrix with linearly independent columns, $\beta$ is an unknown parameter $p$-vector and $\varepsilon \sim N(0, \sigma^2 I_n)$, where $\sigma^2$ is an unknown positive parameter. Suppose that the parameter of interest is $\theta = a^T \beta$, where $a$ is a given $p$-vector ($a \neq 0$). We seek a $1 - \alpha$ confidence interval for $\theta$.

Let the $s$-dimensional parameter vector $\tau$ be defined as $C^T \beta - t$, where $C$ is a specified $p \times s$ matrix ($s < p$) with linearly independent columns and $t$ is a specified $s$-vector. Suppose that $a$ does not belong to the linear subspace spanned by the columns of $C$. Also suppose that we carry out a preliminary F test of the null hypothesis $H_0: \tau = 0$ against the alternative hypothesis $H_1: \tau \neq 0$. It is then common statistical practice to construct a confidence interval for $\theta$ with nominal coverage $1 - \alpha$, using the same data, based on the assumption that the selected model had been given to us a priori (as the true model). We call this the naive $1 - \alpha$ confidence interval for $\theta$. In Section 2, we provide a convenient description of this confidence interval. The assumption that the selected model was given a priori is false, and it can lead to the naive $1 - \alpha$ confidence interval having minimum coverage probability far below $1 - \alpha$, making it completely inadequate. Our aim is to compute this minimum coverage probability. For $s = 1$, the preliminary F test is equivalent to a t test. The case of a single preliminary t test has been dealt with by Kabaila and Giri (2009b, Theorem 3). So, in the present paper, we restrict attention to the case $s > 1$.
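The two-stage procedure described above can be checked by simulation. The following is a minimal Monte Carlo sketch, not the method of the paper (which derives an exact formula). It assumes the standard textbook forms of the F statistic and of the restricted least squares interval used when $H_0$ is accepted. The function name `naive_ci_coverage`, the toy design and the parameter values are hypothetical. The critical values are passed in explicitly; those in the example are the usual table values of the 0.975 t quantile with 20 and 22 degrees of freedom and the 0.95 F quantile with (2, 20) degrees of freedom (they could equally be computed with `scipy.stats.t.ppf` and `scipy.stats.f.ppf`).

```python
import numpy as np

def naive_ci_coverage(X, a, C, t, beta, sigma, t_m, t_ms, f_crit,
                      reps=4000, seed=0):
    """Monte Carlo estimate of the coverage of the naive interval for a'beta."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    s = C.shape[1]
    m = n - p
    XtX_inv = np.linalg.inv(X.T @ X)
    v11 = a @ XtX_inv @ a
    v21 = C.T @ XtX_inv @ a
    V22_inv = np.linalg.inv(C.T @ XtX_inv @ C)
    theta = a @ beta
    hits = 0
    for _ in range(reps):
        y = X @ beta + sigma * rng.standard_normal(n)
        beta_hat = XtX_inv @ (X.T @ y)
        resid = y - X @ beta_hat
        rss = resid @ resid                      # R(beta_hat)
        tau_hat = C.T @ beta_hat - t
        q = tau_hat @ V22_inv @ tau_hat          # extra sum of squares under H0
        f_stat = (q / s) / (rss / m)
        if f_stat > f_crit:
            # H0 rejected: usual interval from the full model
            est = a @ beta_hat
            half = t_m * np.sqrt(rss / m * v11)
        else:
            # H0 accepted: interval from the model restricted by C'beta = t
            est = a @ beta_hat - v21 @ V22_inv @ tau_hat
            var_factor = v11 - v21 @ V22_inv @ v21
            half = t_ms * np.sqrt((rss + q) / (m + s) * var_factor)
        hits += int(abs(est - theta) <= half)
    return hits / reps

# Hypothetical toy design: n = 24, p = 4, s = 2, so m = 20 and m + s = 22.
design_rng = np.random.default_rng(1)
X = np.column_stack([np.ones(24), design_rng.standard_normal((24, 3))])
a = np.array([0.0, 1.0, 1.0, 0.0])
C = np.array([[0.0, 0.0], [0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
t = np.zeros(2)
crit = dict(t_m=2.086, t_ms=2.074, f_crit=3.49)

# tau far from 0: the F test essentially always rejects
cov_far = naive_ci_coverage(X, a, C, t, np.array([1.0, 2.0, 50.0, 50.0]), 1.0, **crit)
# tau = 0: H0 is true
cov_h0 = naive_ci_coverage(X, a, C, t, np.array([1.0, 2.0, 0.0, 0.0]), 1.0, **crit)
```

Evaluating the function over a grid of $\beta$ values with intermediate $\tau$ and taking the smallest estimate gives only a crude approximation to the minimum coverage probability; the simulation error and the dimension of the search are exactly what an exact formula avoids.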

Straightforward application of the methodology of Farchione (2009, Ph.D. thesis, Section 5.7) leads to an expression for the coverage probability of the naive $1 - \alpha$ confidence interval, for a given value of an $s$-dimensional parameter vector, that is a multiple integral of dimension $s + 1$. Finding the minimum coverage probability using this formula becomes increasingly cumbersome as $s$ increases, because of the need to (a) evaluate multiple integrals of dimension $s + 1$ and (b) search for the minimum over a space of dimension $s$.

In Section 3, by a careful consideration of the geometry of the situation, we derive a new elegant and computationally convenient formula for the coverage probability of this confidence interval for given parameter values. For s=2 this formula is a sum of a triple and a double integral and for all s>2 this formula is a sum of a quadruple and a double integral. This formula also shows that the coverage probability is a function of a two-dimensional parameter vector, irrespective of how large s is. This makes it easy to compute the minimum coverage probability of the naive confidence interval, irrespective of how large s is. Another important aspect of this formula is that it can be used to delineate general categories of a, C and X for which the naive confidence interval has poor coverage properties.

A very important practical application of this formula is to the analysis of covariance. In this context, τ can be defined so that H0 expresses the null hypothesis of “parallelism”. In the applied statistics literature on the analysis of covariance it is commonly recommended that a preliminary F test of the null hypothesis of “parallelism” be carried out. See, for example, Kuehl (2002, p. 563), Milliken and Johnson (2002, pp. 14–17) and Freund et al. (2006, pp. 363–368). For an analysis of covariance, we can choose a so that the parameter θ is the difference in expected responses for two specified treatments, for the same specified values of the covariates.
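As an illustration of how $C$ and $a$ might be chosen in this setting, here is a minimal sketch. It assumes a separate-slopes parameterisation with $k$ treatments, one uncentred covariate and $\beta = (\mu_1, \dots, \mu_k, \text{slope}_1, \dots, \text{slope}_k)$, so that $p = 2k$, $s = k - 1$ and $t = 0$. This parameterisation, the function name `parallelism_setup` and the numerical values are assumptions made for illustration and need not match the model used in Section 4.

```python
import numpy as np

def parallelism_setup(k, x0, i=0, j=1):
    """Build C (for H0: all slopes equal) and a (treatment i minus treatment j at x0).

    Assumed parameter order: (mu_1..mu_k, slope_1..slope_k); indices are 0-based.
    """
    p, s = 2 * k, k - 1
    C = np.zeros((p, s))
    for col in range(s):
        C[k, col] = 1.0             # slope of the first treatment
        C[k + col + 1, col] = -1.0  # minus slope of treatment col + 2
    a = np.zeros(p)
    a[i], a[j] = 1.0, -1.0          # difference of intercepts
    a[k + i], a[k + j] = x0, -x0    # difference of slopes times the covariate value
    return C, a

C, a = parallelism_setup(k=4, x0=2.0)

# Hypothetical parameter vector with equal slopes, i.e. "parallelism" holds
beta_parallel = np.array([1.0, 2.0, 3.0, 4.0, 0.5, 0.5, 0.5, 0.5])
tau = C.T @ beta_parallel     # slope_1 - slope_c for c = 2..4
theta = a @ beta_parallel     # (mu_1 + slope_1 * x0) - (mu_2 + slope_2 * x0)
```

With this choice $a$ has non-zero entries in the intercept positions, where every column of $C$ is zero, so $a$ lies outside the column space of $C$, as the set-up requires.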

In Section 4, we illustrate the application of the results of the paper with a real-life analysis of covariance data set and a preliminary F test for “parallelism”. We define θ to be (expected response to treatment 1)−(expected response to treatment 2), evaluated at the same specified value of the covariate. We show that the naive 0.95 confidence interval for θ has minimum coverage probability 0.0846, for this specified value of the covariate. This shows that this confidence interval is completely inadequate, for this specified value of the covariate.

Description of the naive confidence interval

In this section we provide a convenient description of the naive $1 - \alpha$ confidence interval constructed after the preliminary F test. Let $\hat{\beta}$ denote the least squares estimator of $\beta$. Define $R(\beta) = (Y - X\beta)^T (Y - X\beta)$. Let $m = n - p$. Define $\hat{\Sigma}^2 = R(\hat{\beta})/m = (Y - X\hat{\beta})^T (Y - X\hat{\beta})/m$. Also, define $\hat{\Theta} = a^T \hat{\beta}$ and $\hat{\tau} = C^T \hat{\beta} - t$. We suppose that the columns of the matrix $C$ are linearly independent. We also suppose that $a$ does not belong to the linear subspace spanned by the columns of $C$. Now define the $(s+1) \times (s+1)$ matrix

$$V = \begin{bmatrix} v_{11} & v_{21}^T \\ v_{21} & V_{22} \end{bmatrix}$$ …

The coverage probability of the naive confidence interval

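Define the $s$-vector $\mathbf{b} = v_{11}^{-1/2} V_{22}^{-1/2} v_{21}$ and $W = \hat{\Sigma}/\sigma$. Let $f_W$ denote the probability density function of $W$. Define $b = \sqrt{b_1^2 + \dots + b_s^2}$, where $b_1, \dots, b_s$ are the components of $\mathbf{b}$. Thus

$$b^2 = v_{11}^{-1} v_{21}^T V_{22}^{-1} v_{21} = \frac{a^T (X^T X)^{-1} C \, (C^T (X^T X)^{-1} C)^{-1} C^T (X^T X)^{-1} a}{a^T (X^T X)^{-1} a}.$$

Since $\operatorname{Var}(\Theta) = \sigma^2 (v_{11} - v_{21}^T V_{22}^{-1} v_{21}) \ge 0$, $b \in [0, 1]$. The assumption that the vector $a$ does not belong to the linear subspace spanned by the columns of $C$ implies that $v_{11} - v_{21}^T V_{22}^{-1} v_{21} > 0$, that is, $b < 1$. So we may assume that $b \in [0, 1)$. Now define

$$i(x, w; b) = P\bigl(-t(m)\, w + x \le Z \le t(m)\, w + x\bigr),$$

where $Z \sim N(0, 1 - b^2)$, and

$$j(x, y, w; b) = P\Bigl(x - t(m+s) \sqrt{\frac{m w^2 + y}{m+s}\,(1 - b^2)} \le Z \le x + t(\,\dots$$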
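The closed-form expression for $b^2$ depends only on $a$, $C$ and $X$, so it can be evaluated before any responses are observed. The following is a minimal sketch of that computation, reading $v_{11} = a^T (X^T X)^{-1} a$, $v_{21} = C^T (X^T X)^{-1} a$ and $V_{22} = C^T (X^T X)^{-1} C$ off the numerator and denominator of that expression. The function name `b_scalar` and the toy inputs are hypothetical.

```python
import numpy as np

def b_scalar(X, a, C):
    """Return b = sqrt(v21' V22^{-1} v21 / v11) for design X, vector a and matrix C."""
    XtX_inv = np.linalg.inv(X.T @ X)
    v11 = a @ XtX_inv @ a
    v21 = C.T @ XtX_inv @ a
    V22 = C.T @ XtX_inv @ C
    b_sq = v21 @ np.linalg.solve(V22, v21) / v11
    return float(np.sqrt(b_sq))

# Hypothetical toy inputs with X'X equal to the identity (n = 5, p = 3, s = 2)
X = np.eye(5)[:, :3]
a = np.array([1.0, 1.0, 0.0])
C = np.array([[0.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0]])

b = b_scalar(X, a, C)   # here v11 = 2, v21 = (1, 0), V22 = I, so b**2 = 1/2
```

Using `np.linalg.solve` for the inner inverse avoids forming $V_{22}^{-1}$ explicitly; for an ill-conditioned $X$ a QR-based computation would be the more careful choice.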

Application to a real-life data set

In this section we consider the real-life analysis of covariance data set due to Chin et al. (1994) and analysed by Yandell (1997, Chapter 17), who makes this data available at the website http://www.stat.wisc.edu/~yandell/pda/. This data is listed in Table 1. It consists of the observed response (weight gain) for a given treatment and value of the covariate (feed intake). There are four possible treatments, numbered 1–4.

We use the following linear regression model for this data: $Y_{ij} = \mu_i + \tilde{\beta}_i (x_{ij}$ …

Discussion

Discussion 5.1

The poor coverage properties of naive confidence intervals found in this paper are presaged by the poor coverage properties of naive confidence intervals found in the context of a preliminary best subset variable selection by minimizing an AIC-type criterion; see, e.g., Kabaila (2005), Kabaila and Leeb (2006) and Kabaila and Giri (2009a, 2009b) (cf. Kabaila, 2009). Apart from the form of preliminary model selection used, minimum AIC versus an F test, these papers differ from the

Acknowledgments

The authors are grateful to an anonymous reviewer for some comments and suggestions that helped to improve the paper.
