Median Confidence Regions in a Nonparametric Model

The problem of constructing confidence regions (CRs) for the median in the nonparametric measurement error model (NMEM) is considered. This problem arises in many settings, including inference about the median lifetime of a complex system in engineering, reliability, biomedical, and public health settings. Current methods of constructing CRs are discussed, including the T-statistic based CR and the Wilcoxon signed-rank statistic based CR, arguably the two default methods in applied work when a confidence interval about the center of a distribution is desired. Optimal equivariant CRs are developed, with focus on subclasses of the class of all distributions. Applications to a real car mileage efficiency data set and to Proschan's air-conditioning data set are demonstrated. Simulation studies comparing the performances of the different CR methods were undertaken. Results of these studies indicate that the sign-statistic based CR and the optimal CR focused on symmetric distributions satisfy the confidence level requirement, though they tend to have larger contents; two of the bootstrap-based CR procedures and one of the developed adaptive CRs tend to be a tad more liberal but with smaller contents. A critical recommendation is that, under the NMEM, neither the T-statistic based nor the Wilcoxon signed-rank statistic based confidence region should be used, since they have degraded confidence levels and/or inflated contents.


Introduction and Motivation
Given a univariate distribution function G, the two most common measures of central tendency are the mean µ = ∫ x G(dx), provided it exists (i.e., ∫ |x| G(dx) < ∞), and the median ∆ = inf{x ∈ ℝ : G(x) ≥ 1/2}. The mean need not always exist, whereas the median always exists. Under a symmetric distribution, when the mean exists, the mean and the median coincide. This paper is concerned with making statistical inferences about the median ∆ of a distribution. A popular model leading to the problem of making inference about the median of a distribution is the so-called measurement error model. In this model ∆ represents an unknown quantity of interest, and when one measures its value, the observed value x is a realization of the random variable X = ∆ + ε, where ε represents a measurement error with a continuous distribution F(·) whose median equals zero. As such, the distribution of X is G(x) = F(x − ∆). Typically, F(·) is assumed to be a zero-mean normal distribution, but this assumption is not tenable in many situations. For instance, in dealing with event times in biomedical, reliability, engineering, economic, and social settings, the error distribution need not even be symmetric. This is also the case when dealing with economic indicators such as per capita income, retirement savings, etc. As such, a general model is to simply assume that the error distribution F belongs to the class of all continuous distributions with median equal to zero. This class will be denoted by F_{c,0}. Another situation where this problem arises is when dealing with a complex engineering system, such as the motherboard of a laptop computer or a technologically-advanced car (e.g., a Tesla Model S sedan). Such a system will be composed of many different components configured according to some structure function, with the components having different failure-time distributions and some of them possibly acting dependently on each other.
Of main interest for such a system will be its time-to-failure (also called lifetime) denoted by X. Because of the complexity of the system, it may not be feasible to analyze the distribution of X by taking into account each of the failure time distributions of the components and the system's structure function which represents the configuration of the components to form the system. Thus a simplified and practically feasible viewpoint is to assume that the system's life distribution is some continuous distribution G. One may then be interested in the median ∆ of this distribution G.
Thus, in these situations, the observable random variable X is assumed to have distribution F(x − ∆) with F(·) ∈ F_{c,0} and ∆ ∈ ℝ the median of X. This will be referred to as the one-population nonparametric measurement error model, abbreviated NMEM; it is the simplest among the measurement error models. The goal is to infer about the parameter of interest ∆, with F(·) acting as an infinite-dimensional nuisance parameter. In this paper we shall be interested in the construction of a confidence region (CR) for ∆ based on a random sample of observations of X. This is a classic problem: the construction of a confidence interval for the median was discussed as early as [14]. More generally, quantiles other than the median may be of interest, and the methods developed here could be adapted to inference about general quantiles.
Arguably, confidence regions for a parameter are preferable to point estimates since they simultaneously address how close one is to the truth (measured through the content of the region) and how sure one is of that closeness (measured by the confidence coefficient). For more discussion of the desirability of confidence regions see, for instance, the introduction in [3] and chapter 5 in [2]. Of course, one could typically accompany a point estimate (PE) by an estimate of its standard error (ESE), but then the user still needs to deduce closeness and sureness from the PE and the ESE, usually a non-trivial matter if done properly.
We introduce some notation and definitions. Let X_1, X_2, ..., X_n be independent and identically distributed (IID) random variables (a random sample) from F(x − ∆), where ∆ ∈ ℝ and F ∈ F_{c,0}. The mathematical problem is to construct a confidence region (CR) for the parameter θ(F, ∆) = ∆. To preview the type of CR developed in this paper, one of the regions constructed later takes the form (apart from a randomized inclusion of boundary indices)

Γ(X_(), U) = ∪ {[X_(k), X_(k+1)) : k ∈ {0, 1, ..., n}, b(k; n, 1/2) > c* l̂(k)},

where b(k; n, 1/2) = C(n,k) 2^{−n} and l̂(k) is an appropriate estimator of l(k; F) = E{X_(k+1)} − E{X_(k)}. The randomizer U is a uniform random variable, while c* is the infimum over all c ∈ ℝ satisfying P{b(B; n, 1/2) > c l̂(B)} ≤ 1 − α, where B is a binomial random variable with parameters n and 1/2. A specific form of l̂(k) that leads to a reasonable CR is

l̂(k) = C(n,k) ∫ F̂(w)^k [1 − F̂(w)]^{n−k} dw,

where F̂(w) = (1/n) Σ_{i=1}^n I{X_i − ∆̂ ≤ w} is the empirical distribution function of the X_i − ∆̂, i = 1, 2, ..., n, with ∆̂ the sample median. This specific CR will be developed in section 5. Prior to the development of the specific CRs, in section 2 we utilize invariance ideas to derive the general form of the almost-optimal equivariant CR for ∆ under the NMEM, but still under the assumption that F is known. Then, we address the question of how to deal with the fact that F is not actually known, leading to the region estimator above. Two other region estimators, which are focused toward the class of symmetric distributions and the class of negative exponential distributions but are still valid under the general NMEM, will be developed in sections 3 and 4, respectively. In the simulation studies, the procedure focused on symmetric distributions actually performed quite robustly under varied distributions (even non-symmetric ones) in terms of coverage probability, and it had mean content superior to the procedure based on the sign statistic. Prior to studying the performance of these new region estimators, we briefly describe existing ('off-the-shelf') region estimators for the median in section 6. We then demonstrate these different region estimators by applying them to two data sets in section 7.
Section 8 will present the results of simulation studies comparing the performances of these region estimators under different underlying distributions by examining their mean contents and their achieved confidence levels. These comparisons are also of major importance since they indicate which CR procedures should be preferred and which CRs should not be used under the NMEM. Section 9 will provide some concluding remarks.

Invariant Models and Equivariant CRs
We first review the notions of invariant statistical models and equivariant CRs (see, for instance, [9]). We do this review in a more general framework than the concrete NMEM which is the focus of this paper. We note that sufficiency and invariance were major ideas utilized by Peter Hooper in several of his papers dealing with confidence sets and prediction sets, cf., [7,6,8].
Let X be an observable random element taking values in a sample space X. The class of probability models governing X is P, which consists of probability measures P on the measurable space (X, F), with F a suitable σ-field of subsets of X. Let τ : P → T be a functional, with τ(P) being the parameter of interest. A 100(1 − α)% confidence region for τ(P) is a set-valued mapping Γ : X → σ(T), where σ(T) is a class of subsets of T, such that P{τ(P) ∈ Γ(X)} ≥ 1 − α for every P ∈ P. Let G = {g : X → X} be a family of transformations on X that forms a group under an operation · and with identity element 1_G ≡ 1. Let Ḡ = {ḡ : P → P} be a group of transformations on P such that there exists a homomorphism h̄ : G → Ḡ, and let 1̄ ≡ 1_Ḡ = h̄(1_G) be the identity element in Ḡ. The statistical model is said to be (G, Ḡ)-invariant if, whenever X ~ P ∈ P and g ∈ G, then gX ~ ḡP ∈ P, where ḡ = h̄(g). In addition, let G̃ = {g̃ : T → T} be a group of transformations on T such that there exists a homomorphism h̃ : G → G̃. The parametric functional τ(P) is said to be (Ḡ, G̃)-equivariant if τ(ḡP) = g̃τ(P) for all g ∈ G, P ∈ P. Employing a decision-theoretic framework, define a loss function on T × σ(T) given by the 0−1 loss function L(τ, C) = I{τ ∉ C}. We shall say that the loss function is G̃-invariant if L(g̃τ, g̃C) = L(τ, C) for every g ∈ G, τ ∈ T, and C ∈ σ(T). Given a confidence region Γ(X), its risk function is R(P, Γ) = E_P{L(τ(P), Γ(X))} = P{τ(P) ∉ Γ(X)}. As such, the condition for a 100(1 − α)% confidence region Γ(X) is equivalent to having R(P, Γ) ≤ α for every P ∈ P. When a (G, Ḡ)-invariant statistical model is coupled with a G̃-invariant loss function, we say that the statistical problem of constructing a confidence region Γ(·) is (G, Ḡ, G̃)-invariant. A confidence region Γ(X) is then said to be (G, G̃)-equivariant if for every g ∈ G and x ∈ X we have Γ(gx) = g̃Γ(x). The Principle of Invariance then dictates that we should only utilize (G, G̃)-equivariant confidence regions.
For an invariant confidence region problem, if Γ(·) is equivariant, then we have, for every g ∈ G, P_{ḡP}{τ(ḡP) ∈ Γ(X)} = P_P{τ(P) ∈ Γ(X)}. Furthermore, if the group Ḡ is transitive over P, meaning that for any given P_0 ∈ P we have {ḡP_0 : ḡ ∈ Ḡ} = P, then it suffices to consider an arbitrary element P_0 ∈ P to determine P{τ(P) ∈ Γ(X)} for all P ∈ P, since this probability equals its value under the arbitrary P_0.
Recall that we also need to measure the quality of a confidence region via its content C(P, Γ) = E_P[ν(Γ(X))], where ν(·) is a measure on (T, σ(T)), e.g., Lebesgue measure. We seek confidence regions with small C(P, Γ). Observe that for an equivariant Γ(·) in an invariant statistical model, we have for every g ∈ G that C(ḡP, Γ) = E_P[ν(g̃Γ(X))]. If ν satisfies ν(g̃B) = ξ(g)ν(B) for all B ∈ σ(T) and g ∈ G and for some ξ : G → ℝ_+, then there is the possibility of finding a Γ(·) that satisfies the required confidence level and minimizes the content. We shall call this condition quasi-invariance of ν with respect to (G, G̃). However, if (G, G̃)-quasi-invariance of ν does not hold, then a uniformly best confidence region may not exist. But a uniformly best confidence region on a subfamily P_0 ⊂ P may still exist among the class of 100(1 − α)% confidence regions over P. In [7,6] quasi-invariance of the measure ν was imposed, but in some settings this may be unnatural, such as in the NMEM under consideration in the current paper.

Towards Optimal CRs for the Median
Consider now the problem of constructing a CR for the median ∆ under the NMEM: X_i = ∆ + ε_i, i = 1, ..., n, with the ε_i IID from F(·) ∈ F_{c,0} and ∆ ∈ ℝ. Prior to invoking invariance, we first reduce via the Sufficiency Principle. Thus, we may assume that the observable random vector is X_() = (X_(1), X_(2), ..., X_(n)), the vector of order statistics, which is a complete sufficient statistic. The appropriate sample space is X = {x_() ∈ ℝⁿ : x_(1) ≤ x_(2) ≤ ... ≤ x_(n)}. A word on our notation: even though we have reduced to X_(), in the sequel, when we write P_F or E_F, this means that the common distribution of the original X_i's is F. For measuring the content of a region for ∆ we use Lebesgue measure ν on ℝ.
The first invariance reduction is obtained through location invariance. The problem is invariant with respect to translations, with the groups of transformations being, for every c ∈ ℝ: x_() → x_() + c; (F, ∆) → (F, ∆ + c); and θ → θ + c.
Observe that for a location-equivariant Γ(·), we have for every c ∈ ℝ that P_{F,∆}{∆ ∈ Γ(X_())} = P_{F,∆}{∆ + c ∈ Γ(X_() + c)} = P_{F,0}{0 ∈ Γ(X_())}, the last equality obtained by taking c = −∆. The problem has thus been reduced to considering X_() to be the order statistics from F(·), and we seek a location-equivariant Γ(x_()) such that, for every F ∈ F_{c,0}, P_F{0 ∈ Γ(X_())} ≥ 1 − α. In addition, we seek to minimize E_F ν[Γ(X_())] over all F ∈ F_{c,0}. Note that Lebesgue measure ν on ℝ is location-invariant, that is, ν(B) = ν(B + c) for every Borel set B and c ∈ ℝ.
We remark at this stage that if we knew the distribution F(·), then we could determine the optimal CR for ∆ under this known distribution, and no further invariance reduction would be needed. To demonstrate, suppose that F is the normal distribution with mean zero and variance σ², which could be taken to be σ² = 1, so F(·) = Φ(·), with Φ(·) the standard normal distribution function (we also let φ(·) denote the standard normal density function). We then seek a location-equivariant Γ*(·) satisfying the coverage condition under Φ. Writing f(x_()) for the joint density of the order statistics under Φ, and h(x_()) for the density arising in the expression for the expected content after the obvious change-of-variables, the problem is to find a location-equivariant Γ*(x_()) that minimizes ∫_X I{0 ∈ Γ*(x_())} h(x_()) dx_() subject to the condition that ∫_X I{0 ∈ Γ*(x_())} f(x_()) dx_() ≥ 1 − α. The solution to this constrained minimization problem (see the optimization result in Theorem 2) is the well-known z-confidence interval for the normal mean given by Γ*(x_()) = [x̄ ± z_{α/2}(1/√n)]. However, since F is known only to belong to F_{c,0}, a further invariance reduction is needed. This is achieved through strictly increasing continuous transformations with 0 as a fixed point. Let M denote the collection of functions m(·) that are strictly increasing and continuous on ℝ with m(0) = 0. The groups of transformations are given by x_() → (m(x_(1)), m(x_(2)), ..., m(x_(n))); F → F ∘ m^{−1}; and θ → m(θ).
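As a numerical sanity check, the coverage of this z-interval in the known-F normal case can be verified by Monte Carlo simulation. The following is a hedged sketch in Python (the paper's own computations were done in R); the sample size, true median, and replication count are illustrative choices, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, alpha, reps = 20, 0.05, 20000
z = 1.959963984540054          # z_{alpha/2} = Phi^{-1}(1 - alpha/2)
delta = 3.0                    # illustrative true median (= mean under normality)

# X_i = delta + eps_i with eps_i ~ N(0, 1), so F = Phi is fully known
x = delta + rng.standard_normal((reps, n))
xbar = x.mean(axis=1)
# Gamma*(x_()) = [xbar - z/sqrt(n), xbar + z/sqrt(n)]; check if it covers delta
coverage = np.mean(np.abs(xbar - delta) <= z / np.sqrt(n))
```

With 20,000 replications the Monte Carlo coverage should sit within about a percentage point of the nominal 95%.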
The group of transformations M, acting on distributions via F → F ∘ m^{−1}, is transitive over F_{c,0}. Thus, we may simply pick an arbitrary F_0 ∈ F_{c,0}, which could be taken to be a uniform distribution with median zero. We emphasize again that in the second equation we could not drop the term m^{−1} nor factor it out from inside the ν(·) measure; this prevents us from obtaining a uniformly (over F_{c,0}) best confidence region for ∆. Next, we obtain a representation of Γ(x_()) by choosing a specific member of M that depends on x_(). For a given x_(), let m(x_())(·) ∈ M be such that m(x_())(x_(j)) = j for j = 1, 2, ..., n, defined so that it is strictly increasing and continuous over all w ∈ ℝ.

Lemma 1 With m(x_())(·) defined as above, a location-equivariant (LE) and M-equivariant (ME) Γ(·) satisfies Γ(x_()) = m(x_())^{−1}(Γ_0), where Γ_0 = Γ(1, 2, ..., n) is some region in ℝ. Thus, the LE and ME Γ(·)'s are determined by Γ_0's, which are subsets of ℝ. In fact, given a Γ_0 ⊂ ℝ, the region m(x_())^{−1}(Γ_0) can be written explicitly as a union of intervals determined by the order statistics; this is the representation (6).

Proof: We utilize the location-equivariance (LE) and M-equivariance (ME) of Γ(·): Γ(x_()) = Γ(m(x_())^{−1}(1, 2, ..., n)) = m(x_())^{−1}(Γ(1, 2, ..., n)) = m(x_())^{−1}(Γ_0), where Γ_0 = Γ(1, 2, ..., n). To establish (6), given a Γ_0 ⊂ ℝ, observe that m(x_())^{−1} carries the integer points {1, 2, ..., n} to the order statistics (x_(1), ..., x_(n)), from which the interval representation follows.

Optimal CRs
Next, we tackle the problem of choosing an 'optimal' (properly defined) region Γ_0, which then determines Γ(x_()) via the representation in Lemma 1. Recall that the goal is to find Γ(·) satisfying the coverage condition while minimizing the expected content. Under the NMEM, B = Σ_{i=1}^n I{X_i ≤ ∆} has a binomial distribution with parameters (n, 1/2), denoted by B(·; n, 1/2), and with associated probability mass function b(k; n, 1/2) = C(n,k) 2^{−n} I{k ∈ {0, 1, ..., n}}. We have that P_F{∆ ∈ Γ(X_())} = Σ_{k ∈ Γ_0} b(k; n, 1/2) and E_F ν[Γ(X_())] = Σ_{k ∈ Γ_0} l(k; F), with the first equality obtained using the portion of the proof of Lemma 1 and where we define l(k; F) = E_F{X_(k+1)} − E_F{X_(k)}. Assume first that we know the values of {l(k) ≡ l(k; F), k = 0, 1, ..., n}. We now allow for randomized confidence regions in order to achieve optimality; that is, we allow Γ_0 and Γ to depend on a randomizer U, a standard uniform random variable independent of the X_i's. We remark that in [7,6,8] randomized procedures were also allowed to enable achieving optimality, similarly to the Neyman-Pearson theory of most powerful tests (cf. [9]). Define the right-continuous non-decreasing [0, 1]-valued function, for t ∈ ℝ, t → Σ_{k=0}^n I{b(k; n, 1/2) > t l(k; F)} b(k; n, 1/2).
where the expectation is with respect to the randomizer U. Summing over k = 0, 1, ..., n and integrating over u ∈ [0, 1], and then using that c ≥ 0 together with the confidence level condition, the desired inequality follows, which completes the proof of the theorem.
Remark: We note that this proof is similar to that of the Neyman-Pearson Lemma, except for the fact that {l(k; F) : k = 0, 1, ..., n} is not a probability distribution. The optimal Γ_0, possibly using a randomizer U ~ U[0, 1], is therefore Γ*_0 = {k : b(k; n, 1/2) > c* l(k; F)} ∪ {k : b(k; n, 1/2) = c* l(k; F), U ≤ γ}, and the associated optimal confidence region for ∆, possibly randomized, is Γ*(X_(), U) = ∪_{k ∈ Γ*_0} [X_(k), X_(k+1)). Note that if we drop the term in brackets containing {U ≤ γ} in (8), we obtain a conservative confidence region for ∆, in the sense that its coverage probability is at least 1 − α but possibly larger. If one is willing to use a randomized procedure (cf. [13]) associated with the binomial distribution, then we may still obtain an exact confidence region.

Implementation Aspects
The optimal confidence region requires knowledge of the l(k; F)'s, or at the very least the ordering of the k ∈ {0, 1, ..., n} for inclusion in Γ*_0, which is determined by the magnitudes of the ratios r(k; F) ≡ b(k; n, 1/2)/l(k; F), k = 0, 1, ..., n. In general, l(k; F) will depend on the unknown true distribution F, hence the values of the l(k; F)'s or the r(k; F)'s will not be known. In this case, we will not be able to determine Γ*(·). We describe two approaches to circumvent this problem.
(i) Restrict F to belong to a subclass of the family of continuous distributions, say F̃_{c,0} ⊂ F_{c,0}, and determine the l(k; F)'s or the r(k; F)'s for this class. This is tantamount to satisfying the confidence level condition over the whole of F_{c,0}, while focusing only on the subclass F̃_{c,0} for minimizing the expected Lebesgue measure of the confidence region.
(ii) Utilize the observed data to estimate l(k; F) by l̂(k), then use these l̂(k)'s in the expression for Γ*(X_(), U). There are several ways to accomplish this, which are discussed below. However, it should be pointed out that when estimates are plugged in, data double-dipping ensues and the achieved confidence level may no longer be at least 1 − α.
The next two sections will deal with these two approaches towards developing the region estimators.

For General Symmetric Distributions
We assumed the uniform family of distributions in the preceding subsection. A question arises whether we obtain the same CR if F belongs to the subfamily F^sym_{c,0} of F_{c,0}, with the additional condition being symmetry of the distribution. For instance, the normal, Cauchy, logistic, and double exponential distributions, as well as symmetric mixtures of distributions, all belong to this subclass. It turns out that Γ*_10 is also optimal for this larger subclass, as a consequence of Theorem 3. The CR based on the Wilcoxon signed-rank statistic is an appropriate CR for this symmetric class of continuous distributions; however, that CR is not a legitimate competitor of Γ*_10(X_(), U) since it does not satisfy the confidence level requirement for non-symmetric distributions that are still under the NMEM.
Theorem 3 Suppose F ∈ F^sym_{c,0}. (i) If n is even, then r(k; F) is maximized at k = n/2 and it decreases from the maximum as |k − n/2| increases. (ii) If n is odd, then r(k; F) is maximized at k = (n − 1)/2 and k = (n + 1)/2 and it decreases as k moves farther from n/2.
The results stated in Theorem 3 appear intuitive when the distribution F ∈ F^sym_{c,0} is unimodal. What is surprising is that the results also hold when F is bimodal or multimodal. For instance, if we take a symmetric mixture of a N(−µ, σ) and a N(µ, σ), even when µ is quite large relative to σ, the results still hold true. We present a mathematical proof of Theorem 3. To do so we first establish an identity analogous to that given by Pearson ([5]).
Lemma 4 Let X_(1) < ... < X_(n) be the order statistics of a random sample of size n from a continuous distribution F. For k = 0, 1, ..., n, and provided that the expectations are well-defined (possibly taking the value ∞), l(k; F) = E{X_(k+1)} − E{X_(k)} = C(n,k) ∫ F(y)^k [1 − F(y)]^{n−k} dy.

Proof: Recall that for a positive-valued continuous random variable W, we have E{W} = ∫_0^∞ P{W > y} dy. For k ∈ {1, 2, ..., n}, we also recall that P{X_(k) ≤ y} = Σ_{j=k}^n C(n,j) F(y)^j [1 − F(y)]^{n−j}. Using these expressions, we immediately find the stated identity. We now prove Theorem 3.
Proof: Using the representation in Lemma 4, we have for k = 0, 1, ..., n that r(k; F) = b(k; n, 1/2)/l(k; F) = 2^{−n}/Q(k; F), where Q(k; F) = ∫ F(x)^k [1 − F(x)]^{n−k} dx. We prove the results by showing that Q(k; F) is symmetric about n/2 and that it first decreases then increases in k, with the minimum occurring at values of k that depend on whether n is odd or even, so that r(k; F) is maximized there. Making the transformation u = F(x) in the expression for Q(k; F) and denoting by f the density function of F, we have Q(k; F) = ∫_0^1 u^k (1 − u)^{n−k} [f(F^{−1}(u))]^{−1} du. In the last integral, letting w = 1 − u and using the symmetry of F, we find that Q(k; F) = Q(n − k; F). For u ∈ (0, 1/2], the auxiliary comparison function u → b(u; α) arising in the monotonicity argument is decreasing when α > 0, constant at 0 when α = 0, and increasing when α < 0. Consequently, Q(k; F) first decreases and then increases in k, as claimed. This completes the proof of the theorem.
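Theorem 3 can also be checked numerically. The sketch below (in Python; the paper's computations are in R) evaluates l(k; Φ) for the standard normal via the integral identity of Lemma 4 with a simple Riemann sum, then inspects the symmetry and unimodality of r(k; F) for n = 10; the grid limits and resolution are arbitrary choices of ours, and only interior k (where the integral is finite) are considered.

```python
import numpy as np
from math import comb, erf

n = 10
ys = np.linspace(-12.0, 12.0, 200001)             # quadrature grid (illustrative)
dy = ys[1] - ys[0]
F = 0.5 * (1.0 + np.array([erf(y / 2**0.5) for y in ys]))   # Phi(y)

# l(k; F) = C(n,k) * int F^k (1 - F)^(n-k) dy (Lemma 4), interior k only
l = [comb(n, k) * float(np.sum(F**k * (1.0 - F) ** (n - k))) * dy
     for k in range(1, n)]
b = [comb(n, k) / 2.0**n for k in range(1, n)]
r = [bk / lk for bk, lk in zip(b, l)]             # r(k; F) for k = 1, ..., n-1
```

For this even n the computed r(k; F) is symmetric about k = 5 and increases monotonically toward its maximum there, in agreement with Theorem 3(i).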
For Negative Exponential Distributions

It follows that under this negative exponential distribution model, l(k; F) = 1/[λ(n − k)] for k = 0, 1, ..., n − 1. The optimal Γ_0 will therefore be of the form Γ*_0 = {k ∈ {0, 1, ..., n − 1} : b(k; n, 1/2)(n − k) > c}, with c chosen to be the smallest value such that P{B ∈ Γ*_0} ≤ 1 − α. Note that this subset Γ*_0 will never include n, but it could include 0. We shall denote by Γ*_11(X_(), U) the resulting randomized CR for ∆ in this exponentially-focused case.
We illustrate the resulting CR for concrete values of n. We start with n = 10, an even sample size. Using R [12] we obtain, for each k ∈ {0, 1, 2, ..., n}, the values of r(k; F) (up to proportionality) and P{B = k} given in Table 1. Observe that k-values are included into Γ*_0 according to the value of r(k). Thus we start by first including the k-values 4 and 5; then 3 and 6; etc. Observe the asymmetry in the process of including the k-values into Γ*_0, with a bias in favor of the lower k-values. The intuition behind this is that since the exponential distribution is highly right-skewed, the expected lengths between successive order statistics increase as k increases, as formally indicated by the expression l(k; F) = 1/[λ(n − k)]. Thus, to shorten the interval, there is a preference for the lower order statistics. For a 95% confidence region, from the table we find that Γ*_0 = {2, 3, 4, 5, 6, 7}, which yields P{B ∈ Γ*_0} = .9346. The randomization probability then becomes γ = (.95 − .9346)/.0537 = .2867.

Table 1: Values of (k, r(k), P(B = k)) for k = 0, 1, 2, ..., n for the case n = 10 when F is assumed to be a negative exponential distribution.

The 95% (randomized) confidence region for the median will then be Γ*_11(X, U) = [X_(2), X_(8))I{U > .2867} + [X_(1), X_(9))I{U ≤ .2867}.
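The construction just described can be mirrored in code. The following Python sketch (the paper's computations used R; optimal_gamma0 is a hypothetical helper name of ours) reproduces the n = 10 exponential calculation: it orders the k-values by r(k), accumulates coverage, and randomizes on the first group of tied k-values that overshoots 1 − α. Tie-grouping by exact float equality is safe here because all the ratios are exact binary fractions; the paper's reported γ = .2867 comes from rounded probabilities, while the unrounded value is about .2873.

```python
from math import comb

def optimal_gamma0(n, alpha, l):
    """Greedy construction of the optimal index set Gamma*_0: include k-values
    in decreasing order of r(k) = b(k; n, 1/2)/l(k), grouping tied ratios, and
    randomize (with probability gamma) on the group that overshoots 1 - alpha."""
    b = [comb(n, k) / 2.0**n for k in range(n + 1)]
    r = {k: b[k] / l[k] for k in range(n)}        # k = n is never included
    chosen, cum = [], 0.0
    for level in sorted(set(r.values()), reverse=True):
        group = [k for k in r if r[k] == level]
        p = sum(b[k] for k in group)
        if cum + p <= 1.0 - alpha:
            chosen += group
            cum += p
        else:
            gamma = (1.0 - alpha - cum) / p
            return sorted(chosen), sorted(group), gamma, cum
    return sorted(chosen), [], 0.0, cum

# Exponential(rate lam): l(k; F) = 1/(lam*(n - k)); lam cancels in the ordering
n, lam = 10, 1.0
l = [1.0 / (lam * (n - k)) for k in range(n)]
core, boundary, gamma, cov = optimal_gamma0(n, 0.05, l)
# core ~ [2, 3, 4, 5, 6, 7], boundary ~ [1, 8], cov ~ 0.9346, gamma ~ 0.287
```

The returned core set and boundary pair match the paper's Γ*_0 = {2, ..., 7} and the randomized inclusion of {1, 8}.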

Data-Adaptive Methods
Next we demonstrate approach (ii). In this situation we do not know the l(k; F)'s, so we instead estimate these quantities using the observed data. Recall that l(k; F) = E_F[X_(k+1) − X_(k)], so that without any knowledge about the underlying F we will not have a closed-form expression for these l(k; F)'s. However, given the sample data, we can estimate l(k; F) unbiasedly by the observed spacing l̂(k) = X_(k+1) − X_(k), which is the method-of-moments estimator. This estimator, though, may be unstable or inefficient.
As such, the form of the 'optimal' Γ_0 will be Γ*_0 = {k : b(k; n, 1/2) > c l̂(k)}, together with the randomized inclusion of the boundary k-values. We shall denote by Γ*_12(X_(), U) the resulting randomized CR. The constant c will be chosen to be the smallest value such that P{B ∈ Γ*_0 | X} ≤ 1 − α. This value of c will depend on the sample data X, since the order of entry of the k-values into Γ*_0 depends on X, so that the randomization probability γ will also depend on X. Because of this dependence of the c and γ values on the sample data X, it is possible that the achieved confidence coefficient of the resulting confidence region will no longer be 100(1 − α)%. In the simulation studies in section 8 we will indeed see a degradation in the achieved confidence level for this CR.
A possibly better adaptive approach is to utilize the representation of l(k; F) in Lemma 4 and to replace the unknown F(·) in that expression by an estimator based on the X_i − ∆̂, i = 1, 2, ..., n, where ∆̂ is the sample median of the X_i's. We expect this to lead to a better procedure since the resulting estimators of the l(k; F)'s will be more stable than the method-of-moments estimators discussed above. In our implementation of this idea, we use as our estimator of F the empirical distribution function F̂ of the X_i − ∆̂, i = 1, 2, ..., n, given by F̂(t) = (1/n) Σ_{i=1}^n I{X_i − ∆̂ ≤ t}. The resulting estimator of l(k; F) is then l̂(k) = C(n,k) ∫ F̂(w)^k [1 − F̂(w)]^{n−k} dw. Observe that we could also just have used the EDF of the X_i's instead of the (X_i − ∆̂)'s, since the ∆̂ cancels out. As such, to order the k ∈ {0, 1, ..., n} in terms of entry into Γ*_0, we use the ratios r̂(k) = b(k; n, 1/2)/l̂(k). The resulting randomized CR for ∆ will be denoted by Γ*_13(X_(), U). These adaptive procedures are totally nonparametric in the sense that no knowledge of the underlying distribution F is required. If we do have knowledge of the family of distributions to which F belongs, then we may be able to provide better estimates of the l(k; F)'s, as in the preceding subsection. It should also be noted, however, that these adaptive procedures may no longer satisfy the confidence level requirement, owing to the data-dependent plug-in step. Section 8, which presents results of simulation studies, provides some insights regarding the empirical properties of these procedures.
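Both estimators of l(k; F) are straightforward to compute. Below is a hedged Python sketch (the function names are ours, not the paper's): l_hat_mom returns the observed spacing, and l_hat_edf evaluates the Lemma 4 integral with F replaced by the EDF, which reduces to a finite sum over the spacings because the EDF is a step function. The sketch considers interior k (1 ≤ k ≤ n − 1), where the plug-in integral is finite.

```python
from math import comb

def l_hat_mom(x, k):
    """Method-of-moments estimate of l(k; F) = E[X_(k+1) - X_(k)]:
    the observed spacing between consecutive order statistics (1 <= k <= n-1)."""
    xs = sorted(x)
    return xs[k] - xs[k - 1]           # X_(k+1) - X_(k), 1-indexed order stats

def l_hat_edf(x, k):
    """EDF plug-in estimate via l(k; F) = C(n,k) * int F^k (1-F)^(n-k) dw:
    the EDF equals i/n on [x_(i), x_(i+1)), so the integral is a finite sum."""
    xs = sorted(x)
    n = len(xs)
    total = 0.0
    for i in range(1, n):
        u = i / n
        total += (u**k) * ((1.0 - u) ** (n - k)) * (xs[i] - xs[i - 1])
    return comb(n, k) * total
```

For example, for the sample (1, 2, 4, 8) and k = 2, l_hat_mom gives the raw spacing 4 − 2 = 2, while l_hat_edf smooths across all three spacings and gives 6 × (0.25²·0.75²·1 + 0.5²·0.5²·2 + 0.75²·0.25²·4) ≈ 1.805, illustrating the stabilizing effect of the EDF-based estimator.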

Brief Review of Existing 'Off-the-Shelf' Median CRs
There are several existing methods for constructing frequentist-based 100(1 − α)% CRs for ∆ under the NMEM. We describe some of these methods prior to our illustrations and simulation studies.
For a sample realization x = (x_1, ..., x_n), we define the usual sample statistics; X̄, S², ∆̂, and M will then represent the random versions of these sample statistics. As mentioned earlier, φ(z) and Φ(z) will be the standard normal density and distribution functions, respectively, and Φ^{−1}(·) the standard normal quantile function. We will use the conventional notation z_β = Φ^{−1}(1 − β) for the (1 − β)th quantile of Φ(·). T(·; k) and T^{−1}(·; k) will denote the distribution and quantile functions, respectively, of a Student's T random variable with k degrees of freedom, and its (1 − β)th quantile will be denoted by t_{k;β}.
Arguably, the most commonly-used CR for the center of a distribution, which is ∆ for symmetric distributions, is the T-based CR given by Γ_1(X) = [X̄ ± t_{n−1;α/2} S/√n]. However, this CR is actually not valid under the NMEM since it does not satisfy the condition inf_{F ∈ F_{c,0}} P_F{∆ ∈ Γ_1(X)} ≥ 1 − α. We still included this CR since it is typically the first choice of practitioners when constructing a confidence interval for µ or ∆, and we would like to examine and compare its performance with the other CRs under the NMEM. The nonparametric analog of the T-based CR is the CR constructed from the Wilcoxon signed-rank statistic W+ ([13]). The Walsh averages associated with the sample X_i's are W_ij = (X_i + X_j)/2 for i ≤ j. Let W_(1) ≤ W_(2) ≤ ... ≤ W_(n(n+1)/2) be the order statistics of these Walsh averages. With W+(·) denoting the null distribution (that is, under ∆ = 0 and F ∈ F^sym_{c,0}) of W+, let k_1 = sup{w : W+(w) ≤ α/2} and k_2 = inf{w : W+(w) ≥ 1 − α/2}. The Wilcoxon signed-rank based nonparametric CR is then Γ_2(X) = [W_(k_1+1), W_(k_2+1)]. This CR is valid for F ∈ F^sym_{c,0}, but not for all F ∈ F_{c,0}. In contrast, the nonparametric CR derived from the sign statistic is valid under F_{c,0} (see [14]). As before, let B(·) be the binomial distribution function with parameters n and 1/2, and let k_1 = sup{w : B(w) ≤ α/2} and k_2 = inf{w : B(w) ≥ 1 − α/2}.

Then, this sign statistic-based CR is Γ_3(X) = [X_(k_1+1), X_(k_2+1)], whose exact coverage probability is B(k_2) − B(k_1) ≥ 1 − α.
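Both the sign-statistic CR and the Wilcoxon signed-rank CR are short computations over order statistics. The following Python sketch is ours (the paper works in R; the function names are hypothetical), and it assumes n is large enough that the quantile indices k_1 and k_2 exist; the exact null distribution of W+ is built by the standard subset-sum recursion.

```python
from math import comb

def sign_ci(x, alpha=0.05):
    """Distribution-free CI for the median from the sign statistic:
    [X_(k1+1), X_(k2+1)], where B is the Binomial(n, 1/2) CDF,
    k1 = sup{w : B(w) <= alpha/2} and k2 = inf{w : B(w) >= 1 - alpha/2}."""
    xs = sorted(x)
    n = len(xs)
    cdf, s = [], 0.0
    for k in range(n + 1):
        s += comb(n, k) / 2.0**n
        cdf.append(s)
    k1 = max(k for k in range(n + 1) if cdf[k] <= alpha / 2)
    k2 = min(k for k in range(n + 1) if cdf[k] >= 1 - alpha / 2)
    return xs[k1], xs[k2]              # X_(k1+1) and X_(k2+1), 1-indexed

def wilcoxon_signed_rank_ci(x, alpha=0.05):
    """CI for the median from the Wilcoxon signed-rank statistic W+:
    [W_(k1+1), W_(k2+1)] over the ordered Walsh averages (X_i + X_j)/2, i <= j."""
    n = len(x)
    # exact null pmf of W+ = sum of a uniformly random subset of {1, ..., n}
    counts = [1]
    for i in range(1, n + 1):
        new = counts + [0] * i
        for j, c in enumerate(counts):
            new[j + i] += c
        counts = new
    total = 2**n
    cdf, s = [], 0
    for c in counts:
        s += c
        cdf.append(s / total)
    k1 = max(w for w in range(len(cdf)) if cdf[w] <= alpha / 2)
    k2 = min(w for w in range(len(cdf)) if cdf[w] >= 1 - alpha / 2)
    walsh = sorted((x[i] + x[j]) / 2 for i in range(n) for j in range(i, n))
    return walsh[k1], walsh[k2]        # W_(k1+1) and W_(k2+1), 1-indexed
```

For the sample 1, 2, ..., 10 at α = .05, the sign interval is [X_(2), X_(9)] = [2, 9], with exact coverage B(8) − B(1) ≈ .9785.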
Another CR for ∆ is developed from the asymptotic normality of the sample median M(X). For X_1, X_2, ..., X_n IID from a distribution F(·) with density f(·), this asymptotic distribution (cf. [13]) is √n(M − ∆) →_d N(0, 1/[4f(∆)²]). If f̂(∆; X) is an estimator of f(∆), then an asymptotic confidence interval for ∆ is Γ_4(X) = [M ± z_{α/2}/(2 f̂(∆; X)√n)]. Kernel-based estimators of the density f are available in R via the density function ([12]), and coupled with the approx function, f(∆) can then be estimated; this is how we implemented this CR in the illustrations and in the simulation studies. Instead of relying on asymptotic approximations, one may resort to bootstrapping approaches. Let X*_k = (X*_{k1}, ..., X*_{kn}) be the kth bootstrap sample out of BREPS bootstrap samples, and denote its sample median by M*_k. The basic bootstrap CR obtains the (α/2)th and (1 − α/2)th quantiles of {M*_k − M, k = 1, 2, ..., BREPS}, denoted by κ*_{α/2} and κ*_{1−α/2}, respectively, and constructs the CR (cf. [4]) via Γ_5(X) = [M − κ*_{1−α/2}, M − κ*_{α/2}]. The next bootstrapped CR is derived using the studentized median as pivot, with its standard error estimated by S*_boot, the standard deviation of {M*_k, k = 1, 2, ..., BREPS}. The CR is constructed via Γ_6(X) = M(X) ± t_{n−1;α/2}(S*_boot).
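The basic bootstrap CR Γ_5 can be sketched as follows (Python with numpy; BREPS, the seed, and the helper name are illustrative choices of ours, not values from the paper):

```python
import numpy as np

def basic_bootstrap_median_ci(x, alpha=0.05, breps=2000, seed=1):
    """Basic bootstrap CR: [M - kappa*_{1-alpha/2}, M - kappa*_{alpha/2}],
    where the kappa* are quantiles of the bootstrap deviations M*_k - M."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    m = np.median(x)
    mstar = np.median(rng.choice(x, size=(breps, x.size), replace=True), axis=1)
    dev = mstar - m
    return m - np.quantile(dev, 1 - alpha / 2), m - np.quantile(dev, alpha / 2)
```

Note the reflection: the upper quantile of the deviations is subtracted to form the lower endpoint, which is what distinguishes the basic bootstrap from the percentile bootstrap.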
The bootstrap percentile CR also uses the bootstrap distribution of M(X). Denoting by B*(·) the bootstrap distribution function of {M*_k} and by B*^{−1}(·) the bootstrap quantile function, the percentile CR is Γ_7(X) = [B*^{−1}(α/2), B*^{−1}(1 − α/2)]. This percentile bootstrap CR is usually bettered by so-called bias-corrected CRs. See, for instance, chapter 5 of [2] and chapter 11 of [4]. The first improvement is provided by the bias-corrected (BC) CR, which is Γ_8(X) = [B*^{−1}(Φ(2z_0 − z_{α/2})), B*^{−1}(Φ(2z_0 + z_{α/2}))], where p_0 = B*(M(X)) and z_0 = Φ^{−1}(p_0). On the other hand, the BCa method (bias-corrected and accelerated) CR, Γ_9(X), takes a similar form with percentile levels further adjusted by an acceleration coefficient a, which can be estimated from the jackknifed sample medians M̂_(i), i = 1, 2, ..., n, via â = Σ_i(M̂_(·) − M̂_(i))³ / {6[Σ_i(M̂_(·) − M̂_(i))²]^{3/2}}, with M̂_(·) the average of the M̂_(i)'s. Together with these existing CRs for ∆, we add the CRs developed in earlier sections (we drop the superscript '*' in the notation): Γ_10, the CR optimized for symmetric distributions [see equation (10)]; Γ_11, the CR optimized for exponential distributions (see section 4); Γ_12, the adaptive CR using the crude estimator of l(k; F); and Γ_13, the adaptive CR using the empirical distribution function in the estimator of l(k; F) (see section 5). We summarize the different CR procedures examined in the illustrations and the simulation studies in Table 2. We note that among these thirteen CRs, those that are location-equivariant and equivariant to monotone transformations are Γ_3, Γ_10, Γ_11, Γ_12, and Γ_13.
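The BC interval is likewise a short computation. The sketch below is a hedged Python version (statistics.NormalDist supplies Φ and Φ^{−1}; the helper name, BREPS, and seed are illustrative choices of ours):

```python
import numpy as np
from statistics import NormalDist

def bc_bootstrap_median_ci(x, alpha=0.05, breps=4000, seed=2):
    """Bias-corrected (BC) percentile bootstrap CR for the median: the
    percentile levels are Phi(2*z0 - z_{alpha/2}) and Phi(2*z0 + z_{alpha/2}),
    with z0 = Phi^{-1}(B*(M(X)))."""
    nd = NormalDist()
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    m = np.median(x)
    mstar = np.sort(np.median(
        rng.choice(x, size=(breps, x.size), replace=True), axis=1))
    # B*(M(X)), clipped away from 0 and 1 so that Phi^{-1} is finite
    p0 = float(np.clip(np.mean(mstar < m), 1.0 / breps, 1.0 - 1.0 / breps))
    z0 = nd.inv_cdf(p0)
    zlow = nd.inv_cdf(alpha / 2)                 # = -z_{alpha/2}
    a1 = nd.cdf(2.0 * z0 + zlow)                 # lower percentile level
    a2 = nd.cdf(2.0 * z0 - zlow)                 # upper percentile level
    return np.quantile(mstar, a1), np.quantile(mstar, a2)
```

When the bootstrap distribution is unbiased for the median (p_0 = 1/2, so z_0 = 0), the BC interval reduces to the plain percentile interval Γ_7.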

Illustration using Real Data Sets
For our first illustration we use a real data set gathered by the first author about the mileage efficiency of his car during a period when he was commuting between Ann Arbor, Michigan and Bowling Green, Ohio. Efficiency is measured in terms of the miles traveled per gallon of gasoline. In the data set there were n = 205 observations, with each observation recorded at a gas fill-up. The histogram of these observations is shown in Figure 1. The full data set, which contained other relevant variables, was also used to demonstrate global validation procedures for linear models in [10].
For this sample data the sample mean is x̄ = 29.291, the sample median is m = m(x) = ∆̂ = 29.462, the first and third sample quartiles are 28.275 and 30.575, respectively, and the extreme values are 18.374 and 33.472. From the histogram we notice a pronounced left-skewness in the distribution. The sample standard deviation is 1.887 and the inter-quartile range is 2.299. For our illustration, we seek a confidence region for the population median ∆, where the population could be thought of as the hypothetical set of all values of AveMilesGal if the car had been observed forever.

Table 2: The thirteen confidence region (CR) procedures examined in the illustrations and the simulation studies.

Γ 1: T-Statistic Based
Γ 2: Wilcoxon Signed-Rank Based
Γ 3: Sign Statistic Based
Γ 4: Asymptotic Distribution of Median Based
Γ 5: Basic Median Bootstrap
Γ 6: Median with Bootstrapped SE
Γ 7: Percentile Bootstrapped
Γ 8: Bias-Corrected (BC) Bootstrapped
Γ 9: Bias-Corrected Accelerated (BCa) Bootstrapped
Γ 10: Optimally Focused for Symmetric
Γ 11: Optimally Focused for Exponential
Γ 12: Adaptive with l(k) Unbiasedly Estimated
Γ 13: Adaptive with l(k) Estimated using EDF

[Figure 2: The 95% confidence regions for the median produced by the thirteen methods for the AveMilesGal data set. The horizontal line depicts the sample median.]

We mention that one interesting aspect of this data set is that the population from which this sample is obtained is a mixture of subpopulations corresponding to summer highway driving, winter highway driving, summer city driving, winter city driving, and combinations of highway and city driving. Figure 2 depicts the 95% confidence regions produced by the thirteen different methods for the AveMilesGal data set. Except for CR Γ 12, all of them produced intervals. Method Γ 13 may also produce a region that is not an interval, while CRs Γ 1 to Γ 11 will always produce confidence intervals by construction. In addition, observe that only the CRs produced by Γ 4 and Γ 6 turned out to be symmetric about the sample median.
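For a concrete sense of the sign-statistic construction, here is a minimal sketch of the classical distribution-free order-statistic interval; the sample below is hypothetical, standing in for data like AveMilesGal, and the code is an illustration rather than the paper's implementation of Γ 3.

```python
import math
import random

def sign_ci(xs, alpha=0.05):
    """Distribution-free CI for the median from order statistics.

    Classical sign-statistic construction: the interval (X_(k), X_(n+1-k)),
    with k the largest integer satisfying P(Bin(n, 1/2) <= k - 1) <= alpha/2,
    has coverage 1 - 2 P(Bin(n, 1/2) <= k - 1) >= 1 - alpha for any
    continuous distribution (hence its conservativeness).
    """
    n = len(xs)
    s = sorted(xs)
    cdf, k = 0.0, 1  # k = 1 falls back to (min, max) when n is very small
    for j in range(n + 1):
        cdf += math.comb(n, j) * 0.5 ** n
        if cdf <= alpha / 2:
            k = j + 1
        else:
            break
    return s[k - 1], s[n - k]

# Hypothetical sample of mileage-efficiency-like measurements.
rng = random.Random(7)
xs = [29.5 + rng.gauss(0.0, 2.0) for _ in range(30)]
lo, hi = sign_ci(xs)
```

For n = 25 and α = 0.05, for example, k = 8 because P(Bin(25, 1/2) ≤ 7) ≈ 0.0216 ≤ 0.025 while P(Bin(25, 1/2) ≤ 8) ≈ 0.0539 > 0.025, giving the interval (X_(8), X_(18)) with coverage about 95.7%.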
Our other illustration uses Proschan's famous Boeing air-conditioning data set [11]. Proschan also demonstrated that this data set did not come from an exponential distribution, but rather from a mixture of exponential distributions, inducing a decreasing failure rate property. Our goal in this illustration is to provide a confidence region for the median inter-failure time given this data set. Figure 4 presents the comparative plots of the CRs for the thirteen methods. Notice that the T-based CR, as well as the Wilcoxon signed-rank based CR, appears rather different from the other eleven CRs; in fact, both excluded the sample median. Admittedly, we are somewhat unfair in the manner in which we applied these two procedures, since it would have been more appropriate to first transform the data to approximately symmetrize the distribution, apply the procedures, and then transform back. Also notice that the Γ 12 CR is composed of one big interval together with several smaller intervals, which could be attributed to the instability of the method-of-moments estimates of the l(k; F)'s. Interestingly, this CR has the smallest content, but we will see in the simulation studies in section 8 that its achieved coverage probability of the median tends to be lower than the nominal coverage probability.

Comparison of Methods via Simulations
In this section we present the results of simulation studies comparing the thirteen CR methods listed in Table 2 in terms of their contents and coverage probabilities under the NMEM. Different error distributions (normal, Cauchy, uniform, logistic, gamma, Weibull, and a mixture of normals) and varied sample sizes (n ∈ {10, 15, 20, 25, 30, 40, 50}) were utilized in the simulations. The computer programs were coded in R [12]. For each combination of error distribution and sample size, 10000 simulation replications were run, with 2000 bootstrap replications for the bootstrap-based methods. Figures 5 and 6 present the results of this simulation study in the form of plots of the standardized mean contents (the simulated mean contents multiplied by the square root of the sample size) and the percentage coverage of the true median under the different distributions and the varied sample sizes. As we progressed through the different distributions, we eliminated from the plots the methods that were not competitive, either with respect to their achieved coverage probability or their mean content; the eliminated methods were still included in the runs but omitted from the subsequent plots, in order to make the plots leaner and to enable clearer comparisons among the competitive methods. Upon final examination of the resulting plots we also eliminated method Γ 11 since it is unstable with respect to content.

[Figure 4: The 95% confidence regions for the median based on the thirteen methods for Proschan's Boeing air-conditioning data set. The horizontal line represents the sample median.]

The first observation from these simulations concerns the performance of Γ 1 and Γ 2.
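The simulation design just described can be miniaturized in code. The sketch below uses hypothetical settings (Exp(1) errors, whose median is log 2, greatly reduced replication counts, and only the percentile-bootstrap construction in the style of Γ 7); it is not the paper's R program, but it shows how achieved coverage and √n-standardized mean content are estimated.

```python
import math
import random
from statistics import median

def percentile_boot_ci(xs, alpha=0.05, n_boot=200, rng=random):
    """Percentile bootstrap CI for the median (a Gamma 7-style sketch)."""
    boots = sorted(median(rng.choices(xs, k=len(xs))) for _ in range(n_boot))
    return (boots[int(alpha / 2 * n_boot)],
            boots[min(int((1 - alpha / 2) * n_boot), n_boot - 1)])

def simulate(n=20, reps=300, seed=1):
    """Monte Carlo estimate of the coverage of the true median and of the
    sqrt(n)-standardized mean content under Exp(1) measurement errors."""
    rng = random.Random(seed)
    true_med = math.log(2)  # median of the Exp(1) distribution
    hits, total = 0, 0.0
    for _ in range(reps):
        xs = [rng.expovariate(1.0) for _ in range(n)]
        lo, hi = percentile_boot_ci(xs, rng=rng)
        hits += lo <= true_med <= hi
        total += hi - lo
    return hits / reps, math.sqrt(n) * total / reps

coverage, std_content = simulate()
```

In runs of this sketch the estimated coverage for small n tends to come in below the nominal 95%, consistent with the liberality of the percentile bootstrap observed in the full simulation study.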
Both perform very well under the normal distribution, with Γ 1 having smaller content as expected, but Γ 1 performs extremely poorly with respect to content under the Cauchy distribution, and neither performs well in terms of coverage for non-symmetric distributions; their poor performance under non-symmetric distributions is, however, also expected. The next observation is the consistent validity of the sign-statistic based procedure Γ 3: its coverage probability is always at least the specified level and in fact it tends to be conservative. At the same time, it has the highest mean content among the competitive methods. Γ 10, the procedure focused on symmetric distributions and actually a randomized version of Γ 3, is preferable to Γ 3 since it has smaller mean content while still satisfying the coverage requirement. The basic bootstrap CR, Γ 5, surprisingly performed poorly with respect to its achieved coverage probability. The CR based on the studentized sample median using a bootstrapped standard deviation as an estimate of the median's standard error, Γ 6, did not perform well for small sample sizes in terms of coverage probability. Γ 8 and Γ 9 have almost the same performance, as is to be expected from their derivations, while Γ 11 tended to have unstable mean contents. Γ 12 did not perform acceptably due to its degraded coverage probability. Thus, based on these simulation studies, the most competitive among the thirteen CR procedures are Γ 3, Γ 7, Γ 9, Γ 10, and Γ 13.
To focus comparisons among these five procedures, we ran more extensive simulations with 20000 replications and 5000 bootstrap replications for sample sizes n ∈ {10, 15, 20, 25, 30, 40, 50, 75, 100}. The results of these simulations are contained in Figures 7 and 8. Examining the plots in Figures 7 and 8, we see that the sign-based procedure, Γ 3, and the procedure focused on symmetric distributions, Γ 10, consistently satisfy the confidence level requirement for all the distributions considered, including the skewed gamma and Weibull distributions, with Γ 3 tending to be more conservative. Γ 10, which as pointed out earlier is a randomized version of Γ 3, has a confidence level very close to the pre-specified level and hence also has smaller content than Γ 3. The PCT-bootstrap, Γ 7, and the BCa-bootstrap, Γ 9, tend to be liberal, as does the adaptive Γ 13, though the latter could sometimes be conservative, such as when the distribution is Cauchy, or more liberal than Γ 7 and Γ 9, such as for the mixture of normal distributions. Content-wise, Γ 7, Γ 9, and Γ 13 have almost the same performance, with Γ 9 appearing to have a slightly smaller content. Owing to their lower achieved confidence, these three CRs also tended to have smaller contents than Γ 3 and Γ 10, with Γ 3 possessing the largest content and being the most conservative. We note here the impact of the plug-in procedure for Γ 13. Since we did not know l(k; F), we plugged in an estimator of this function which utilized the empirical distribution function and the same sample data, so there is data double-dipping. The impact of this plug-in and double-dipping procedure is to make the procedure somewhat liberal. In contrast, the Γ 10 CR did not need this plug-in procedure, and we observe that its achieved level is very close to the pre-specified confidence level.

Concluding Remarks
In this paper we revisit the problem of constructing confidence regions (CRs) for the median under the NMEM. The problem is also relevant in the study of complex engineering systems, where it is difficult to determine the exact functional form of the system's lifetime distribution, so one is forced to utilize a nonparametric model for the system's lifetime distribution. In addition, the problem arises in economic settings where, owing to the high right-skewness of the income distribution, the median rather than the mean is of interest. Several existing nonparametric approaches to constructing CRs for the median were reviewed. Among these 'off-the-shelf' methods are the T-based method, which utilizes the sample mean, and the method based on the Wilcoxon signed-rank statistic; these two are perhaps the default methods for many users whose goal is to infer about the 'center' of a population distribution. Also included are computationally-intensive CR methods such as the bootstrap percentile and BCa approaches ([4]). When assessing the adequacy of a CR, there is a need to examine its expected content (in one-dimensional settings, content is Lebesgue measure) in addition to the requirement that it satisfy a specified nominal level of coverage of the true parameter value. Thus we examined the problem of obtaining optimal CRs, those with smallest expected content. Under the NMEM, invariance with respect to location shifts and monotone transformations was invoked to reduce the problem to constructing equivariant CRs. Within the class of equivariant CRs, we obtained the CRs minimizing the expected content over subclasses of the class of all continuous distribution functions: a best CR when focusing on the subclass of symmetric distributions and a best CR for the subclass of exponential distributions. We also developed fully nonparametric data-adaptive CRs under the NMEM.
These adaptive CRs are the ones labeled Γ 12 and Γ 13. Based on simulation studies comparing the different CRs, we found that under the NMEM both the T-based and the Wilcoxon signed-rank statistic based CRs should not be utilized. The sign-statistic based procedure, Γ 3, and the optimal procedure focused on symmetric distributions, Γ 10, both fulfill the confidence level requirement, with Γ 3 being somewhat conservative and hence tending to have larger content. Among the other CR methods, the bootstrap procedures Γ 7 (PCT) and Γ 9 (BCa) and the adaptive procedure Γ 13 were the most competitive across all scenarios, but these three tended to be a tad more liberal than either Γ 3 or Γ 10, and hence also possessed smaller contents. If one insists that the desired confidence level be achieved, then Γ 10 appears to be the best choice; if one is opposed to using randomized CRs, then Γ 3 should be chosen. However, we feel that abhorrence of randomized procedures is neither mathematically justifiable nor defensible, since the use of randomized procedures allows for better methods, and such strategies are in fact a bulwark of statistical decision and game theory. On the other hand, if one can tolerate a small degradation in the achieved confidence level, then Γ 9 and Γ 13 appear to be reasonable choices among these different CR procedures.
Finally, we mention that the approach of developing CR procedures using invariance considerations is extendable to other, more complex settings, such as two-sample and K-sample settings, situations with censored and/or truncated data, and cases where the parameter of interest is not necessarily the median. In addition, of particular and major interest to us in leading to this work is the potential that invariance arguments may enable the construction of optimal multiple confidence regions, extending results in multiple testing settings. We hope to explore these extensions and new possibilities in future work.

[Figures 5 and 6: Simulation results for the thirteen CR methods for sample sizes n ∈ {10, 15, 20, 25, 30, 40, 50} under normal, Cauchy, uniform, logistic, gamma(2,1), Weibull(.5,1), and mixture-of-normals error distributions. The plots pertaining to the lengths (contents) utilize the standardized mean length, which is the mean length multiplied by √n. The descriptions of the 13 CR methods are in Table 2.]

[Figures 7 and 8: Corresponding results for the five competitive CR methods. The plots pertaining to the lengths (contents) utilize the standardized mean length, which is the mean length multiplied by √n. The descriptions of these five CR methods are in Table 2.]