Optimized recentered confidence spheres for the multivariate normal mean

Casella and Hwang, 1983, JASA, introduced a broad class of recentered confidence spheres for the mean $\boldsymbol{\theta}$ of a multivariate normal distribution with covariance matrix $\sigma^2 \boldsymbol{I}$, for $\sigma^2$ known. Both the center and radius functions of these confidence spheres are flexible functions of the data. For the particular case of confidence spheres centered on the positive-part James-Stein estimator and with radius determined by empirical Bayes considerations, they show numerically that these confidence spheres have the desired minimum coverage probability $1-\alpha$ and dominate the usual confidence sphere in terms of scaled volume. We shift the focus from the scaled volume to the scaled expected volume of the recentered confidence sphere. Since both the coverage probability and the scaled expected volume are functions of the Euclidean norm of $\boldsymbol{\theta}$, it is feasible to optimize the performance of the recentered confidence sphere by numerically computing both the center and radius functions so as to optimize some clearly specified criterion. We suppose that we have uncertain prior information that $\boldsymbol{\theta}= \boldsymbol{0}$. This motivates us to determine the center and radius functions of the confidence sphere by numerical minimization of the scaled expected volume of the confidence sphere at $\boldsymbol{\theta}= \boldsymbol{0}$, subject to the constraints that (a) the coverage probability never falls below $1-\alpha$ and (b) the radius never exceeds the radius of the standard $1-\alpha$ confidence sphere. Our results show that, by focusing on this clearly specified criterion, significant gains in performance (in terms of this criterion) can be achieved. We also present analogous results for the much more difficult case that $\sigma^2$ is unknown.


Introduction
Suppose that X = (X 1 , ..., X p ) ∼ N (θ, σ 2 I) where θ = (θ 1 , ..., θ p ) and I denotes the p×p identity matrix (p ≥ 3). We consider the problem of finding a confidence set for θ, based on X, with prescribed confidence coefficient 1−α that satisfies the following two conditions. The first condition is that this confidence set has volume that never exceeds the volume of the standard 1 − α confidence sphere centred at X. The second condition is that this confidence set has an expected volume that is small by comparison with that of the standard 1 − α confidence sphere, when θ = 0. Casella and Hwang (1987) argue cogently that the confidence set for θ should be tailored to the uncertain prior information available about θ. We consider confidence sets that are tailored to the uncertain prior information that θ = 0. The second condition is motivated by a desire to utilize this uncertain prior information. This is a very difficult problem to solve and, in common with much of the existing literature on it, we first consider the case that σ 2 is known. Later, we consider the more difficult case that σ 2 is unknown.
To provide specific proposals for recentered confidence spheres, Hwang (1983, 1987) and Samworth (2005) center their confidence spheres at the positive-part James-Stein estimator of θ. The radii of these confidence spheres are functions of X that are determined by empirical Bayes considerations (Casella and Hwang, 1983), Taylor series or the bootstrap (Samworth, 2005). These papers compare confidence sets for θ using the coverage probability and the scaled volume (or its p'th root).
In the present paper, for the case that σ 2 is known, we consider confidence spheres with center a(T )X and radius b(T ), where T = X / √ p, and the functions a and b satisfy the following conditions.
Condition A a : [0, ∞) → (0, ∞) is a continuous nondecreasing function that satisfies a(x) = a + (x) for all x ≥ k, where a + (T )X is the positive-part James-Stein estimator and k is a (sufficiently large) specified positive number.
This choice of center and radius has some intuitive appeal, since T = X / √ p may be viewed as a test statistic for testing the null hypothesis that θ = 0 against the alternative hypothesis that θ = 0. The scaled expected volume is defined to be the ratio (expected volume of the recentered confidence sphere) / (volume of I). The functions a and b are determined by numerically minimizing the scaled expected volume of the recentered confidence sphere at θ = 0, subject to the constraint that the coverage probability never falls below 1 − α. This numerical constrained minimization is made feasible by the fact that both the coverage probability and the scaled expected volume of the recentered confidence sphere are functions of γ = θ , for given functions a and b. For computational feasibility, we assume that the values of a(x) and b(x) for any x ∈ [0, k] are found by piecewise cubic Hermite polynomial interpolation.
For σ 2 known, our contributions are the following: • We shift the focus from scaled volume (or its p'th root) to scaled expected volume. A goal of seeking to minimize (in some sense) the volume of the recentered confidence sphere for the most probable values of X when θ = 0, subject to the coverage constraint, is problematic (Casella and Hwang, 1986). By contrast, as we show in the present paper, minimization of the scaled expected volume of the recentered confidence sphere (RCS) at θ = 0, subject to the coverage constraint, leads to an RCS with excellent performance.
• We remove the restriction that the RCS is centered solely at the positivepart James-Stein estimator and explore the consequences of removing this restriction.
• The new RCS compares favourably with the RCS described in Section 4 of Casella and Hwang (1983), which is centered on the positive-part James-Stein estimator and has radius determined by empirical Bayes considerations, in terms of both the minimum coverage probability and the scaled expected volume at θ = 0. For 1 − α = 0.95, this is shown in Figure 1 for p = 3 and in Table 1 for p = 3, 4, . . . , 13, 20, 25.
• We derive a new computationally convenient formula for the coverage probability of an RCS, of the type described in Section 3 of Casella and Hwang (1983), which is applicable for any p (even or odd), whereas the computationallyconvenient formula, derived by Casella and Hwang (1983), is applicable only for p odd.
We now consider the more difficult case that σ 2 is unknown. Suppose that we have additional data that provides the estimator S 2 for σ 2 , where mS 2 /σ 2 ∼ χ 2 m and S 2 and X are independent. The standard 1 − α confidence set isĨ = θ : θ − X ≤d S , where the positive numberd satisfies P G ≤d 2 /p = 1 − α for G ∼ F p,m . In the related context that there is uncertain prior information that θ 1 = θ 2 = · · · = θ p , Casella and Hwang (1987) put forward an RCS with center at an analogue of the positive-part James-Stein estimator (which is defined for σ 2 known) and radius that is an analogue of the radius based on empirical Bayes considerations for σ 2 known.
For σ 2 unknown, the new RCS is an analogue of the new RCS for σ 2 known. For σ 2 unknown, we consider confidence spheres with centerã(T )X and radiusb(T ), whereT = X /( √ p S) and the functionsã andb satisfy the following conditions.
of an estimator of θ due James and Stein (1961, pp. 365-366). This estimator belongs to a class of estimators described by Baranchik (1970) and is an analogue for σ 2 unknown of the positive-part James-Stein estimator.
This choice of center and radius has some intuitive appeal, sinceT = X /( √ p S) may be viewed as a test statistic for testing the null hypothesis that θ = 0 against the alternative hypothesis that θ = 0. The scaled expected volume is defined to be the ratio (expected volume of the RCS) / (expected volume ofĨ). The functionsã andb are determined by numerically minimizing the scaled expected volume of the RCS at θ = 0, subject to the constraint that the coverage probability never falls below 1−α. This numerical constrained minimization is made feasible by the fact that both the coverage probability and the scaled expected volume of the RCS are functions of γ = θ /σ, for given functionsã andb. For computational feasibility, we assume that the values ofã(x) andb(x) for any x ∈ [0, k] are found by piecewise cubic Hermite polynomial interpolation. Also, for computational feasibility, we assume that p is odd.
For σ 2 unknown, our contributions are the following: • We remove the restriction that the RCS is centered solely at the analogue of the positive-part James-Stein estimator and explore the consequences of removing this restriction.
• The new RCS compares favourably with an RCS that is centered on an ana-logue of the positive-part James-Stein estimator, in terms of both the minimum coverage probability and the scaled expected volume at θ = 0.
2. Results for σ 2 known. Comparison of the performances of the new RCS and the RCS of Casella and Hwang (1983, Section 4).
In this section, we suppose that σ 2 is known. Without loss of generality, we assume that σ 2 = 1. Casella and Hwang (1983, Section 3) define a class of RCS's that can be expressed in the form Casella and Hwang (1983, Section 3) derive a computationally convenient formula for the coverage probability of J(a, b) that is applicable for p odd. Our notation for an RCS is slightly different from that used by Casella and Hwang (1983), who express an RCS in terms of X . This makes no essential difference. Let γ = θ . In Appendix A, we show that the coverage probability of J(a, b) is, for given functions a and b, an even function of γ and we derive a new computationally convenient formula for this coverage probability that is applicable for any p (even or odd). Details of the numerical evaluation of this coverage probability, using these computationally convenient formulas, are also presented in Appendix A. The numerical results for coverage probabilities that are presented in this section were found using this new computationally convenient formula.
Define a + : [0, ∞) → (0, ∞) by the requirement that a + (T )X is the positive-part James-Stein estimator. This implies that The specific proposal for an RCS that is given in Section 4 of Casella and Hwang (1983) is J(a + , b * ), where b * is determined by empirical Bayes considerations and is given by We define the scaled expected volume of J(a, b) to be the ratio since the volume of a sphere in R p with radius r is 2 r p π p/2 / p Γ(p/2) . In Appendix A, we show that this is an even function of γ = θ , for given function b. We also derive a new computationally convenient formula for this scaled expected volume.
To find the new RCS, we require that the functions a and b satisfy Conditions A and B, respectively, In addition, for computational feasibility, we specify the following parametric forms for these functions.
(a) Suppose that x 1 , . . . , x q 1 satisfy 0 = x 1 < x 2 < · · · < x q 1 = k. The function a is fully specified by the vector a(x 1 ), . . . , a(x q 1 ) as follows. The value of a(x) for any given x ∈ [0, k] is found by piecewise cubic Hermite polynomial interpolation for these given function values. We call x 1 , . . . , x q 1 the knots of this piecewise cubic Hermite polynomial.
(b) Suppose that y 1 , . . . , y q 2 satisfy 0 = y 1 < y 2 < · · · < y q 2 = k. The function b is fully specified by the vector b(y 1 ), . . . , b(y q 2 ) as follows. The value of b(y) for any given y ∈ [0, k] is found by piecewise cubic Hermite polynomial interpolation for these given function values. We call y 1 , . . . , y q 2 the knots of this piecewise cubic Hermite polynomial.
For judiciously-chosen values of the knots, we compute the functions a and b, which take these parametric forms, are nondecreasing and are such that (a) the scaled expected volume evaluated at θ = 0 (i.e. at γ = 0) is minimized and (b) the coverage probability of J(a, b) never falls below 1 − α. Piecewise cubic Hermite interpolation is described by Fritsch and Carlson (1980) and it is implemented in the pchip function in MATLAB. All of the computations presented in the present paper were performed using programs written in MATLAB using the Statistics and Optimization toolboxes.
This coverage constraint is implemented in the computations as follows. For any reasonable choice of the functions a and b, the coverage probability of J(a, b) converges to 1 − α as γ → ∞. The constraints implemented in the computations are that the coverage probability of J(a, b) is greater than or equal to 1 − α for every γ in a judiciously-chosen finite set of values. That a given finite set of values of γ is adequate to the task is judged by checking numerically, at the completion of computations, that the coverage probability constraint is satisfied for all γ ≥ 0.
For 1 − α = 0.95, we compare the coverage probability and scaled expected volume of the new RCS with J(a + , b * ), the RCS of Casella and Hwang (1983, Section 4). We have chosen k = 10. We chose the knots of a and b that allow these functions to provide good approximations to a + and b * , respectively. In this way, we sought to ensure that J(a, b) could perform at least as well as J(a + , b * ) in terms of minimizing the scaled expected volume at γ = 0, subject to the coverage constraint. Note that a + (x) = 0, for 0 ≤ x ≤ 1 − (2/p). Thus for the function a, we place the first two knots at 0 and 1 − (2/p). The next three knots of a are at 1 − (2/p) + (τ /10), 1 − (2/p) + (2τ /10) and 1 − (2/p) + (4τ /10), where τ = (k/2) − 1 − (2/p). The remaining knots of a are at k/2, 3k/4 and k.
Thus for the function b, we place the first two knots at 0 and d/ √ p. The next two knots are at d/ The remaining knots at k/2, 3k/4 and k. The new RCS was computed for each p ∈ {3, 4, . . . , 13, 20, 25}.
The coverage constraint was implemented in the computations by requiring that the coverage probability of J(a, b) is greater than or equal to 1 − α for all γ ∈ {0, 1, 2, . . . , 64, 65}. This was shown to be adequate to the task by checking numerically, at the completion of the computation of the new RCS, that the coverage probability constraint is satisfied for all γ ≥ 0. Figure 1 shows that, for p = 3, the coverage probability of the new RCS is no less than 0.95 for all γ, while the coverage probability of J(a + , b * ), the RCS of Casella and Hwang (1983, Section 4), is slightly below 0.95 for some values of γ. This figure also shows that, for p = 3, the scaled expected volume of the new RCS is substantially less than the scaled expected volume of J(a + , b * ), at θ = 0. Table 1 presents the comparison of the minimum coverage probability and the scaled expected volume at θ = 0 of the new RCS and J(a + , b * ) for p ∈ {3, 4, . . . , 13, 20, 25}.
According to this table, the new RCS always achieves a coverage probability greater than or equal to 0.95, while J(a + , b * ), the RCS of Casella and Hwang (1983, Section 4), does not achieve this for p ≤ 6. Also, for every value of p considered, J(a, b) achieves a substantially lower scaled expected volume at θ = 0 than J(a + , b * ). In summary, our new RCS compares favourably with that of Casella and Hwang (1983, Section 4), in terms of both the minimum coverage probability and the scaled expected volume at θ = 0.
Remark 2.1: When we initially considered the construction of new RCS's for θ, we set a(x) = 1 for all x ≥ k. This seemed a very reasonable choice that leads to J(a, b) coinciding with the standard 1−α confidence set I when X ≥ k. Surprisingly, the computation of the nondecreasing functions a and b such that the scaled expected volume at γ = 0 was minimized, subject to the coverage probability of J(a, b) never falling below 1 − α, always resulted in a J(a, b) that was, within computational accuracy, equal to I. A careful investigation revealed that the explanation for this phenomenon is that for all J(a, b)'s, other than those very close to I, there was a small dip (over a narrow interval of values of γ) in the coverage probability below 1 − α. As k is increased, this dip becomes less pronounced, but appears to never disappear entirely. In other words, it did not seem possible for J(a, b) to satisfy the coverage constraint unless it was, within computational accuracy, equal to I. We found the following solution to this problem. If, instead of setting a(x) = 1 for all x ≥ k, we set a(x) = a + (x) for all x ≥ k, then this phenomenon does not occur.

Remark 2.2:
We have chosen the functions a and b to be a piecewise cubic Hermite interpolating polynomial in the interval [0, k]. Other choices of parametric forms for this function are also possible. For example, one could choose this function to be a quadratic spline in this interval. Our reason for choosing piecewise cubic Hermite interpolation is that this leads to interpolating function with fewer undesirable oscillations between the knots than, say, natural cubic spline interpolation.
Remark 2.3: Casella and Hwang (1983, Section 3) argue that it is desirable that the set S θ , described in their Theorem 3.1, is an interval. During the computation of the new RCS, it was found that at every stage (including the final stage) this set was an interval.
3. Results for σ 2 unknown. Comparison of the performances of the new RCS and the RCS centered on an analogue of the positive-part James-Stein estimator In this section, we consider the more difficult case that σ 2 is unknown. Suppose that we have additional data that provides the estimator S 2 for σ 2 , where mS 2 /σ 2 ∼ χ 2 m and S 2 and X are independent. The standard 1 − α confidence set isĨ = θ : θ − X ≤d S , where the positive numberd satisfies P G ≤d 2 /p = 1 − α for G ∼ F p,m . Define the class of RCS's that can be expressed in the form where the functionsã andb satisfy ConditionsÃ andB, respectively, andT = X /( √ p S).
The coverage probability ofJ(ã,b) is where W = S/σ, ϑ = θ/σ and Y = X/σ. Obviously, Y ∼ N(ϑ, I) and W has the same distribution as Q/m, where Q ∼ χ 2 m . Since Y and W are independent, this coverage probability is equal to where f W denotes the probability density function of W . Let γ = ϑ = θ/σ . It follows from Theorem 1 (presented in Appendix A) that, for any given w > 0 and functionsã andb, is an even function of γ. It follows from (2) that the coverage probability ofJ(ã,b) is also an even function of γ. We evaluate (3) using the computationally convenient formula of Casella and Hwang (1983, Section 3), which is applicable for p odd. The method used for the numerical evaluation of (2) is described in Appendix B. We define the scaled expected volume ofJ(ã,b) to be the ratio In Appendix B, we show that this is an even function of γ = ϑ = θ/σ , for given functionb. We also derive a new computationally convenient formula for this scaled expected volume. Defineã Note thatã + X /( √ pS) X is the positive-part version of an estimator of θ due James and Stein (1961, pp. 365-366). This estimator belongs to a class of estimators described by Baranchik (1970) and is an analogue, for σ 2 unknown, of the positivepart James-Stein estimator. We compare two different new RCS's. The first of these RCS's is centered oñ In other words, this RCS has the formJ(ã + ,b). For computational feasibility, we specify the following parametric form for the functionb.
Suppose that y 1 , . . . , y q 2 satisfy 0 = y 1 < y 2 < · · · < y q 2 = k. The functioñ b is fully specified by the vectorb(y 1 ), . . . ,b(y q 2 ) as follows. The value of b(y) for any given y ∈ [0, k] is found by piecewise cubic Hermite polynomial interpolation for these given function values. We call y 1 , . . . , y q 2 the knots of this piecewise cubic Hermite polynomial.
For judiciously-chosen values of these knots, we compute the functionb, which takes this parametric form, is nondecreasing and is such that (a) the scaled expected volume evaluated at θ = 0 (i.e. at γ = 0) is minimized and (b) the coverage probability ofJ(ã + ,b) never falls below 1 − α.
The second of the RCS's has the formJ(ã,b). For computational feasibility, we additionally specify the following parametric form for the functionã.
Suppose that x 1 , . . . , x q 1 satisfy 0 = x 1 < x 2 < · · · < x q 1 = k. The functioñ a is fully specified by the vectorã(x 1 ), . . . ,ã(x q 1 ) as follows. The value of a(x) for any given x ∈ [0, k] is found by piecewise cubic Hermite polynomial interpolation for these given function values. We call x 1 , . . . , x q 1 the knots of this piecewise cubic Hermite polynomial.
For judiciously-chosen values of the knots, we compute the functionsã andb, which take these parametric forms, are nondecreasing and are such that (a) the scaled expected volume evaluated at θ = 0 (i.e. at γ = 0) is minimized and (b) the coverage probability ofJ(ã,b) never falls below 1 − α.
For 1 − α = 0.95, we compare the coverage probability and scaled expected volume of these two RCS's for odd values of p. We have chosen k = 10. We chose the knots ofã that allow this function to provide a good approximation toã + . In this way, we sought to ensure thatJ(ã,b) could perform at least as well asJ(ã + ,b) in terms of minimizing the scaled expected volume at γ = 0, subject to the coverage constraint. Note thatã + (x) = 0 for 0 ≤ x ≤ 2(p − 2)m/p(m + 2). Thus for the functionã we place the first two knots at 0 and 2(p − 2)m/p(m + 2). The next three knots are at equally spaced positions between 2(p − 2)m/p(m + 2) and k/2.
The last two knots are at k/2 and k. For both RCS's we place the knots of the functionb at 0, 2, 4, 6, 8, and 10.
The coverage constraint was implemented in the computations by requiring that the coverage probability of these RCS's is greater than or equal to 1 − α for all γ ∈ {0, 1, 2, . . . , 64, 65}. This was shown to be adequate to the task by checking numerically, at the completion of the computation of these RCS's, that the coverage probability constraint is satisfied for all γ ≥ 0.
We compare the two new RCS's for all combinations of p ∈ {3, 5, 7, 9, 25} and m ∈ {3, 10, 30}. Figure 2 compares these RCS's in detail for p = 3 and m = 3. Table 2 presents the comparison of the scaled expected volumes at θ = 0 of the two new RCS's for all the combinations of m ∈ {3, 10, 30} and p ∈ {3, 5, 7, 9, 25}. This table shows that, for every combination of m and p considered, the RCS of the formJ(ã,b) achieves a significantly lower scaled expected volume at θ = 0 than the RCS of the formJ(ã + ,b). For these new RCS's, the decrease in the scaled expected volume at θ = 0 is higher when m is smaller, for given p. Note that the coverage probability results of these RCS's are not presented, since both of these new RCS's achieve a minimum coverage probability greater than or equal to 0.95.
In summary, both the new 1 − α RCS's compare favorably with the standard 1 − α confidence set for θ. Also, the new 1 − α RCS of the formJ (ã,b) compares favorably with the new 1−α RCS of the formJ(ã + ,b) in terms of the scaled expected volume at θ = 0.

Conclusion
In a number of respects, a recentered confidence sphere for the multivariate normal mean θ has relatively simple properties. This sphere is specified by two nondecreasing real-valued functions defined on the positive real line. Both the coverage probability and the scaled expected volume of this sphere are even functions of the scalar parameter γ = θ/σ . Finally, there are computationally convenient expressions for both the coverage probability and the scaled expected volume of a recentered confidence sphere. These properties permit our construction, by numerical constrained optimization, of a recentered confidence sphere that is close to optimal in terms of minimizing the scaled expected volume at θ = 0, subject to the coverage constraint. This kind of direct approach to confidence set construction is also possible in other problems where the type of confidence set under consideration has relatively simple properties.
Appendix A: Results for σ 2 known In this appendix, we derive computationally-convenient formulas for the coverage probability and the scaled expected volume of the RCS J(a, b), when σ 2 is known. We assume, without loss of generality, that σ 2 = 1. Suppose that p ≥ 3. Let A computationally convenient formula for the coverage probability of J (a, b) In this section we show that the coverage probability of J(a, b) is an even function of γ, for given functions a and b, and we derive a computationally convenient formula for this coverage probability. We first present the proofs and derivations and then state the results.
The coverage probability of J(a, b) is Let Z = X − θ, so that Z ∼ N (0, I). We write Z = RU where R and U are independent, R 2 ∼ χ 2 p and U is a random p-vector which is distributed uniformly on the surface of a unit sphere in R p . Then, Let L = (θ/ θ ) ⊤ (Z/ Z ). Note that L is a random variable which has a distribution that does not depend on θ. Let f L denote the probability density function of L. Let B(x, y) = Γ(x)Γ(y)/Γ(x + y) denote the beta function. For p ≥ 3, Now, (5) can be written as follows.
Note that this formula is valid for all γ ≥ 0 if, for example, we set L = 1 for θ = 0. Thus For θ = 0, This is a function of γ, by (6) and (7) and the fact that (R, L) has a distribution that does not depend on θ. Using the fact that (R, L) and (R, −L) have the same distribution, it may be shown that this coverage probability is an even function of γ. We have therefore proved the following result.
Theorem 1. The coverage probability of J(a, b) is an even function of γ.
We now derive the new computationally convenient formula for the coverage probability of J(a, b). By the law of total probability, this coverage probability is equal to By using the law of total probability in this way, we simplify the computer programming required for the evaluation of the coverage probability. Let c(γ; a, b) = P ( a(T )X − θ ≤ b(T ), T < k) c * (γ; a + ) = P a + (T )X − θ ≤ d c + (γ; a + ) = P a + (T )X − θ ≤ d, T < k Thus, the coverage probability of J(a, b) is equal to c(γ; a, b) + c * (γ; a + ) − c + (γ; a + ). We now derive computationally convenient approximations for c(γ; a, b), c * (γ; a + ) and c + (γ; a + ).
Derivation of the computationally convenient approximation for c(γ; a, b) Observe that Let t = (r 2 + 2γrℓ + γ 2 )/p. We define functions g and h as follows.
Let I be defined as follows.
where A is an arbitrary statement. By definition, c(γ; a, b) is equal to To compute this multiple integral, our next step is to truncate the outer integral. We approximate this multiple integral by where, for a specified small positive number δ, l r = 0 for p ≤ 10 F −1 p (δ/2) for p > 10. and and F p denotes the χ 2 p distribution function. The reason for not truncating the integral at the lower endpoint for p ≤ 10 is that there is little to be gained in this case. The following lemma provides an upper bound on the error of approximation.
Proof. Let e = (9) − (10). Obviously, e ≥ 0 and Since I can only take the values 0 and 1, we have that Obviously, (10) is equal to We now use the fact that g(r, ℓ, γ) is a particularly simple function of r and ℓ for given γ, to simplify (13). Note that for given values of ℓ and γ, there is a non-empty interval of values of r such that g(r, ℓ, γ) < 0 if and only if r 2 + 2γℓr + γ 2 − pk 2 = 0 has distinct real solutions for r. This condition is equivalent to where s = γ 2 − pk 2 /γ. This implies that (13) is equal to where A computationally convenient formula for v(ℓ, γ; a, b) is obtained as follows. Suppose the set of values (if it exists) of r ∈ [l r , u r ] such that g(r, ℓ, γ) ≤ 0 and h(r, ℓ, γ; a, b) ≤ 0, for given ℓ, γ and functions a and b, is expressed in the form of a union of disjoint intervals as follows.

Remark:
The computer program used to find the disjoint intervals l i (ℓ, γ; a, b), u i (ℓ, γ; a, b) , carries out an extensive grid search, followed by the application of the MATLAB zero-finding function fzero. This programming is designed to account for the possibility of quite a large number of such intervals. However, careful investigations suggest that h(r, ℓ, γ; a, b), considered as a function of r, is smooth, leading typically to a single interval i.e. K(ℓ, γ; a, b) = 1. A similar remark applies to the computation of the disjoint intervals involved in computations of the computationally convenient approximations for c * (γ; a + ) and c + (γ; a + ).
h + (r, ℓ, γ; a + ) = (a + (t)) 2 r 2 + 2a where, as in the previous section, t = (r 2 + 2γrℓ + γ 2 )/p. By definition, c * (γ; a + ) is equal to To compute this multiple integral, our next step is to truncate the outer integral. We approximate this multiple integral by where l r and u r are given by (11) and (12), respectively. The following lemma provides an upper bound on the error of approximation.
The proof of this lemma is omitted, because it is very similar to the proof of Lemma 1. Obviously, (16) is equal to This is equal to where v * (ℓ, γ; a + ) = u r l r I h + (r, ℓ, γ; a + ) ≤ 0 f R (r) dr.
Similarly to the previous derivation, a computationally convenient formula for v * (ℓ, γ; a + ) is obtained as follows. Suppose the set of values (if it exists) of r ∈ [l r , u r ] such that h + (r, ℓ, γ; a + ) ≤ 0, for given ℓ, γ and functions a and b is expressed in the form of a union of disjoint intervals as follows.
v * (ℓ, γ; a + ) = where, as in the previous section, F p denotes the χ 2 p distribution function. Derivation of a computationally convenient approximation for c + (γ; a + ) Note that c + (γ; a + ) is obtained by simply replacing the function a by the function a + and the function b by the constant d in the expression for c(γ; a, b). Thus, by replacing the function a by the function a + and the function b by the constant d in the expression for the computationally convenient approximation c approx (γ; a, b) for c(γ; a, b), we obtain the computationally convenient approximation c + approx (γ; a + ) for c + (γ; a + ).
The following theorem provides a computationally convenient expression for the coverage probability of J(a, b) for p ≥ 3.
Comparison of the two computationally convenient formulas for the coverage probability of J (a, b) Note that there is a typographical error in this formula as stated on page 691 of Casella and Hwang (1983). The (n + 1)! on the second line of (3.10) should be replaced by (n + i)!. The main advantage of computationally convenient formula stated in Theorem 2 is that it is applicable for any p ≥ 3, whereas the computationally convenient formula stated by Casella and Hwang (1983, Section 3) is applicable only for odd values of p ≥ 3. We found that the coverage probability computed using the formula of Casella and Hwang (1983, Section 3) was inaccurate when γ = θ is very close to zero but not equal to zero. There is no such problem with the formula for the coverage probability stated in Theorem 2. In terms of computational speed, for small values of p, both of these computationally convenient formulas perform equally well. However, for large values of p, the coverage probability can be computed faster using the formula of Casella and Hwang (1983, Section 3).
Numerical evaluation of the coverage probability of J (a, b) using the new computationally convenient formula The computer programs for the computation of the coverage probability using (18) were checked for correctness in two ways, for some particular examples. Firstly, the coverage probabilities computed using the computationally convenient formula of Casella and Hwang (1983) for p odd and the computationally convenient formula (18) were compared. Secondly, coverage probabilities computed using these formulas were compared with the results of coverage probabilities computed using Monte Carlo simulations.
A computationally convenient formula for the scaled expected volume of J (a, b) The following theorem provide a computationally convenient-formula for the scaled expected volume of the recentered confidence sphere J(a, b).
Theorem 3. For given function b, the scaled expected volume of J(a, b) is an even function of γ = ||θ||.
(a) Let f v; p, γ 2 denote the noncentral χ 2 probability density function with p degrees of freedom and noncentrality parameter γ 2 , evaluated at v. The scaled expected volume of J(a, b) is Numerical evaluation of the coverage probability ofJ (ã,b) using (2) The new RCS is found by numerically solving the constrained optimization problem described in Section 3. This type of computation has been carried out in other related problems by Farchione and Kabaila (2012) and Giri (2009, 2013). The main lesson from these related computations is that the coverage probability needs to be computed with great accuracy. Let Therefore, the formula (2) for the coverage probability ofJ (ã,b) is ∞ 0 ψ w, γ;ã,b f W (w) dw.
We transform the variable of integration in the second integral from w to x = F W (w), where F W denotes the cumulative distribution function of W . Therefore, the coverage probability ofJ(ã,b) is where ψ F −1 W (x), γ;ã,b evaluated at x = 1 is defined to be the limit as x ↑ 1 of this function. This limit is 1. The integrands in both of these integrals are smooth. These integrals were computed using Simpson's rule with an appropriately chosen fixed number of evaluations of the integrand. To help ensure accurate computation of the integrands for both integrals, progressive numerical quadrature, using Simpson's rule was used and a doubling of equal-length segments at each stage of the progression is used. Progressive numerical quadrature is described, for example, in Section 6.1 of Davis and Rabinowitz (1984). The main stopping criterion is that |Q 2s − Q s |/Q 2s ≤ 10 −8 , where Q s denotes the computed quadrature using s segments. The computer programs for the computation of the coverage probability were checked for correctness by comparing computed values, for some particular examples, with the results of coverage probabilities computed using Monte Carlo simulations.
A computationally convenient formula for the scaled expected volume of J (ã,b) The following theorem provides a computationally convenient formula for the scaled expected volume of the RCSJ(ã,b).
The second term in this expression is equal to Thus the scaled expected volume ofJ(ã,b) is is equal to (21).
The proof of this lemma is omitted for the sake of brevity.