The nonparametric bootstrap for the current status model

It has been proved that direct bootstrapping of the nonparametric maximum likelihood estimator (MLE) of the distribution function in the current status model leads to inconsistent confidence intervals. We show that bootstrapping of functionals of the MLE can however be used to produce valid intervals. To this end, we prove that the bootstrapped MLE converges at the right rate in the $L_p$-distance. We also discuss applications of this result to the current status regression model.


Introduction
In the current status model, the variable of interest is a survival variable $X$ with distribution function $F_0$. However, instead of observing the exact survival time $X$, a censoring variable $T \sim G$ is observed together with the indicator $\Delta = 1_{\{X \le T\}}$. Such data arise naturally in clinical trials when a patient can only be checked at one measurement due to destructive testing. A lot of research has been published on the behavior of the maximum likelihood estimator (MLE) $F_n$ of the distribution function $F_0$. After scaling by the constant $\kappa = \{4F_0(t)(1 - F_0(t))f_0(t)/g(t)\}^{1/3}$, the limiting distribution of $n^{1/3}(F_n(t) - F_0(t))$ is given by Chernoff's distribution
$$C = \operatorname*{argmin}_{t \in \mathbb{R}} \{W(t) + t^2\},$$
where $W$ is a two-sided Brownian motion with $W(0) = 0$ (see [19]). Other estimators with similar asymptotic properties are Chernoff's estimator of the mode ([6]), the Grenander estimator ([10]) of a nonincreasing density, Manski's maximum score estimator ([27]) and Rousseeuw's least median of squares estimator ([29]). A general framework for cube-root $n$ asymptotics is given in [25].
In this paper we investigate the behavior of Efron's nonparametric bootstrap method ([9]) for constructing confidence intervals for smooth functionals of the MLE. It is known that the nonparametric bootstrap is inconsistent for generating the limit distribution of the MLE. The authors of [2] prove that, conditional on the data, the rescaled bootstrap error $n^{1/3}\{\hat F_n(t) - F_n(t)\}$ has a limit that differs from Chernoff's distribution and involves two independent two-sided Brownian motions $W$ and $\hat W$ originating at zero; here $\hat F_n$ is the bootstrap MLE. A similar result is obtained in [26] and in [31] for the Grenander estimator. The maximum score estimator of [27] is another example of a cube-root $n$ statistic; its asymptotic distribution is derived in [25], and inconsistency of the nonparametric bootstrap for this estimator is shown in [2].
Constructing asymptotic confidence intervals for the distribution function in the current status model based on Chernoff's distribution and the normalizing constant $\kappa$ is complicated by the need to compute the critical values of $C$ and to estimate the density $f_0$ consistently. Since this turns out to be a rather difficult task, several alternative bootstrap methods have been proposed based on resampling from a smooth estimate. [32] consider a smooth kernel estimate $\tilde F$ of $F_0$ and resample the $\Delta_i$ from a Bernoulli distribution with probability $\tilde F(T_i)$, while keeping the censoring variables $T_i$ fixed, and center the values of the bootstrap samples by subtracting the smooth estimate of the distribution function. [26] and [31] propose similar smooth resampling schemes for the Grenander estimator, and a model-based smoothed bootstrap procedure for making inference on the maximum score estimator is developed in [28]. All methods result in consistent estimation of the (suitably standardized) distribution $C$ conditional on the original data.
A drawback of this approach is that it relies on smoothness conditions on $F_0$ which allow faster than cube-root $n$ estimation of $F_0$. This raises the question whether one should really use confidence intervals based on the MLE instead of on a faster converging estimate. The latter procedure is followed in [14], where the authors construct confidence intervals around the smoothed maximum likelihood estimator (SMLE) of $F_0$ in the current status model. The SMLE is a kernel estimate based on the MLE with an asymptotic normal distribution instead of Chernoff's limiting distribution ([16]). The bootstrap method proposed in [14] is, however, still based on the smooth bootstrap procedure described in [32] and not on Efron's nonparametric bootstrap. We show in this paper that the construction of confidence intervals around the SMLE based on the nonparametric bootstrap can also be proved to be valid: one does not resample from a smooth estimate of $F_0$, but just resamples with replacement from the pairs $(T_i, \Delta_i)$ in the original sample. This method has already been used without proof in [17] and in [18], and the present manuscript intends to supply the missing proofs. An important difference with the smooth bootstrap in [14] is that in the nonparametric bootstrap the estimates in the bootstrap samples are centered by the SMLE of the original sample, whereas this does not work for the resampling proposed in [14]; in the latter case one needs to center the estimates in the bootstrap samples by a kernel convolution of the SMLE in the original sample. It is not clear which method is better, and the most striking fact is the similarity of the results of the two methods in our simulations. An advantage of the purely nonparametric bootstrap discussed in the present paper might be its conceptual simplicity: the bootstrap samples are centered by the SMLE itself, so that no convolution of the SMLE is needed.
An advantage of the smooth bootstrap discussed in [14] might be the fact that only the indicators $\Delta_i$ are resampled, so that in this sense one stays closest to the sample distribution of the observation times $T_i$, which stay fixed in this procedure.
Although it is argued in [8] that the naive bootstrap will not work for their goodness-of-fit test for monotone functions, based on the Grenander estimator, no theoretical justification for this conjecture is given. Other examples where a smooth bootstrap procedure is used are the likelihood ratio type two-sample test for current status data proposed by [11] and the test for equality of functions under monotonicity constraints proposed by [7]. In both cases asymptotic normality of the test statistic is established.
The paper is organized as follows: In Section 2 we introduce the current status model and review some interesting properties of the MLE. The validity of the nonparametric bootstrap is discussed in Section 3. In Section 4 we provide two examples to illustrate the applicability of our result. In the first example we construct pointwise confidence intervals based on the smoothed MLE in the current status model. The second example deals with doing inferences for a finite dimensional regression parameter in the current status linear regression model.
For both examples, the theoretical and finite sample behavior of the nonparametric bootstrap is discussed. Section 5 presents some concluding remarks. The proofs of our results are given in Section 6.

The current status model and the MLE
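The observations are $Z_i = (T_i, \Delta_i)$, $i = 1, \ldots, n$, with $\Delta_i = 1_{\{X_i \le T_i\}}$. The $X_i$ are interpreted as (nonnegative) survival times with distribution function $F_0$. Instead of observing $X_i$, a censoring variable $T_i \sim G$ (with density $g$), independent of $X_i$, is observed. One could say that in the current status model, each observation $Z_i$ represents the current status of item $i$ at time $T_i$. The density of $Z_i$ with respect to the product of Lebesgue measure and counting measure on $[0, R] \times \{0, 1\}$ is given by
$$p_{F_0}(t, \delta) = F_0(t)^{\delta}\{1 - F_0(t)\}^{1-\delta}\, g(t).$$
The maximum likelihood estimator $F_n$ is defined as the maximizer of the log likelihood given by (up to a constant not depending on $F$)
$$\ell_n(F) = \sum_{i=1}^{n} \bigl[\Delta_i \log F(T_i) + (1 - \Delta_i) \log\{1 - F(T_i)\}\bigr].$$
[19] show that the MLE can be characterized as the left-continuous slope of the greatest convex minorant of a cumulative sum diagram consisting of the points $(0, 0)$ and
$$\Bigl(i, \sum_{j=1}^{i} \Delta_{(j)}\Bigr), \qquad i = 1, \ldots, n,$$
where we let $T_{(j)}$ denote the $j$th order statistic of the $T_i$ and $\Delta_{(j)}$ the $\Delta_i$ corresponding to it (assuming no ties are present in the data). An important property of the MLE is the so-called switch relation, see [17] p. 69. Let $G_n$ be the empirical distribution function of $T_1, \ldots, T_n$ and define the process $V_n$ by
$$V_n(t) = n^{-1} \sum_{i=1}^{n} \Delta_i 1_{\{T_i \le t\}}$$
and the process (in $a$) $U_n$ by
$$U_n(a) = \operatorname*{argmin}_{t \in \mathbb{R}} \{V_n(t) - a\, G_n(t)\}.$$
Then, taking $a = F_0(t)$, the switch relation links the event $\{F_n(t) \ge a\}$ to the position of $U_n(a)$ relative to $t$; see also Figure 1.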
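The characterization of the MLE as the slope of the greatest convex minorant of the cusum diagram is equivalent to an isotonic regression of the ordered indicators $\Delta_{(j)}$, which can be computed with the pool-adjacent-violators algorithm. The following is a minimal sketch of that computation, not the authors' implementation; the function names `pava` and `current_status_mle` are hypothetical.

```python
import numpy as np

def pava(y, w):
    """Weighted nondecreasing least-squares fit (pool adjacent violators)."""
    blocks = []  # each entry: [block mean, block weight, number of points]
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        # Pool the last two blocks while they violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            blocks.append([(m1 * w1 + m2 * w2) / (w1 + w2), w1 + w2, c1 + c2])
    return np.concatenate([np.full(c, m) for m, _, c in blocks])

def current_status_mle(t, delta):
    """Return the sorted observation times and the MLE F_n evaluated at them."""
    order = np.argsort(t)
    return t[order], pava(delta[order], np.ones(len(t)))

# Toy data: after sorting, Delta_(j) = 0, 1, 0, 1 at T_(j) = 1, 2, 3, 4.
t_sorted, f_n = current_status_mle(np.array([3.0, 1.0, 4.0, 2.0]),
                                   np.array([0, 0, 1, 1]))
```

In the toy example the violating pair $\Delta_{(2)} = 1$, $\Delta_{(3)} = 0$ is pooled to the common value $1/2$, which is the slope of the convex minorant over that stretch of the cusum diagram.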

Bootstrapping the MLE
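In this section we establish properties of the bootstrap MLE $\hat F_n$ based on the nonparametric bootstrap proposed by [9]. Our main concern is to show that, conditional on the data $Z_1, \ldots, Z_n$, the bootstrap MLE attains the rate $n^{-1/3}$ in $L_p$-distance; this is the result referred to as (3.1). Denote the empirical probability measure of $Z_1, \ldots, Z_n$ by $P_n$. The bootstrap empirical measure is
$$\hat P_n = n^{-1} \sum_{i=1}^{n} M_{ni}\, 1_{Z_i},$$
where $1_{Z_i}$ denotes the point mass at $Z_i = (T_i, \Delta_i)$ and $M_n = (M_{n1}, \ldots, M_{nn})$ is a vector of multinomial weights with parameters $n$ and $(1/n, \ldots, 1/n)$, independent of $Z_1, \ldots, Z_n$. The bootstrap MLE $\hat F_n$ is computed using the weighted cumulative sum diagram formed by the point $(0, 0)$ and
$$\Bigl(\sum_{j=1}^{i} M_{n(j)}, \sum_{j=1}^{i} M_{n(j)} \Delta_{(j)}\Bigr), \qquad i = 1, \ldots, n,$$
where $M_{n(j)}$ is the multinomial weight corresponding to $T_{(j)}$. The bootstrap MLE $\hat F_n$ is then calculated from the left-continuous slope of the greatest convex minorant of this cusum diagram.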
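The weighted cusum construction can be sketched in a few lines: draw multinomial weights, drop the observations with weight zero (they add no new point to the diagram) and run a weighted isotonic regression. This is an illustrative sketch under these assumptions; `pava` and `bootstrap_mle` are hypothetical names, and the simulated Uniform(0,2) data only serve as an example.

```python
import numpy as np

def pava(y, w):
    """Weighted nondecreasing least-squares fit (pool adjacent violators)."""
    blocks = []  # each entry: [block mean, block weight, number of points]
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            blocks.append([(m1 * w1 + m2 * w2) / (w1 + w2), w1 + w2, c1 + c2])
    return np.concatenate([np.full(c, m) for m, _, c in blocks])

def bootstrap_mle(t, delta, rng):
    """One nonparametric bootstrap draw of the MLE via multinomial weights."""
    n = len(t)
    order = np.argsort(t)
    t_sorted, d_sorted = t[order], delta[order]
    # M_n(j): number of times the pair (T_(j), Delta_(j)) is resampled.
    m = rng.multinomial(n, np.full(n, 1.0 / n))
    keep = m > 0  # zero-weight points do not change the cusum diagram
    return t_sorted[keep], pava(d_sorted[keep], m[keep]), m

rng = np.random.default_rng(1)
t = rng.uniform(0, 2, 50)
x = rng.uniform(0, 2, 50)
delta = (x <= t).astype(int)
t_boot, f_boot, m = bootstrap_mle(t, delta, rng)
```

Working with weights instead of physically duplicated observations avoids ties in the resampled observation times.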
To complete notation, we suppose that the vectors $((Z_1, \ldots, Z_n), M_n)$, $n = 1, 2, \ldots$, are defined on the product space $(([0, R] \times \{0, 1\})^{\infty} \times \mathbb{Z}_+^{\infty}, \mathcal{B}, P_{ZM})$, where $\mathbb{Z}_+$ is the set of nonnegative integers and $\mathcal{B}$ is the collection of Borel sets generated by the finite dimensional projections. We say that a real-valued function $\Gamma_n$ defined on the joint probability space is of order $o_{P_M}(1)$ in probability if for all $\varepsilon, \eta > 0$:
$$P^*\bigl\{P_{M|Z}\bigl(|\Gamma_n| > \varepsilon\bigr) > \eta\bigr\} \to 0, \qquad n \to \infty,$$
where $P^*$ denotes outer probability and $P_{M|Z}$ is the conditional probability measure w.r.t. the weights, given the sample $Z_1, \ldots, Z_n$.
To establish (3.1), we need the following result, which is a bootstrap version of Lemma 11.5 in [17].
Also suppose that the observation distribution $G$ has a continuous derivative $g$ that stays away from zero and infinity on $[0, R]$. Define the process $\hat U_n$ as the bootstrap analogue of $U_n$, with the processes $V_n$ and $G_n$ replaced by their bootstrap counterparts $\hat V_n$ and $\hat G_n$. Then there are positive constants $K_1$ and $K_2$ such that, for all $a \in (0, 1)$ and for all large $n$, a tail bound holds for $\hat U_n(a)$, in which $\{A\}$ denotes the indicator $1_A$ of the event $A$.
Lemma 3.1 implies that the probability of the corresponding event, for all $x \in [0, R]$ and $a = F_0(t)$, tends to 1 as $n \to \infty$. The proof of Lemma 3.1 is given in Section 6. The proof uses empirical process theory and results on tail probabilities for $\|\sqrt{n}(\hat P_n - P_n)\|_{\mathcal{F}}$ for classes $\mathcal{F}$ with finite entropy integrals. Similar results are proved using martingale theory in Section 11.2 of [17] for the original sample and in [14] for a smooth bootstrap empirical process. Writing $\{F_n(t) - F_0(t)\}_+$ for the positive part of $\{F_n(t) - F_0(t)\}$, it follows from Lemma 3.1 and the bootstrapped switch relation that there exists a positive constant $K > 0$ for which a corresponding bound holds.

P. Groeneboom and K. Hendrickx
In particular, there exist constants $K_1 > 0$ and $K_2 > 0$ for which the corresponding bounds hold. In the next section we show how (3.1) can be used to justify the bootstrap validity for drawing inferences in models which can be estimated using smooth functionals of the MLE. The proofs for deriving the asymptotic behavior of these functionals are in general based on applications of the Cauchy-Schwarz inequality and on showing asymptotic equicontinuity. Both steps involve calculating an $L_2$-distance which can often be reduced to the $L_2$-distance between the MLE and the true underlying distribution function. Our main result given in (3.1) is therefore important to show that the asymptotic properties of the estimates obtained in the original sample are still valid in the bootstrap sample, conditionally on the data. The asymptotic behavior of the functionals does not depend on the distribution of the MLE, which is, as shown in Theorem 5 of [2], not the same in the original sample and the bootstrap sample (conditionally on the data). We note, however, that the variances of the corresponding asymptotic distributions still have the same order $n^{-2/3}$, just like our squared $L_p$-distances in (3.1).

Applications
In this Section we illustrate the applicability of our bootstrap results. In our first example we consider the current status model described in Section 2 and estimate F 0 by the SMLE. In the second example we consider estimating a finite dimensional regression parameter for the current status model, where in addition to observing the vector (T, Δ), also a covariate vector X is observed.

The Smoothed Maximum Likelihood Estimator (SMLE)
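We estimate $F_0$ by the SMLE $\tilde F_{nh}$, obtained by first computing the MLE $F_n$ and then smoothing it with a kernel, i.e.,
$$\tilde F_{nh}(t) = \int \mathbb{K}\Bigl(\frac{t - x}{h}\Bigr)\, dF_n(x), \qquad (4.1)$$
where $\mathbb{K}(x) = \int_{-\infty}^{x} K(u)\, du$ is an integrated kernel and $h$ is a chosen bandwidth. Here $dF_n$ represents the jumps of the discrete distribution function $F_n$ and $K$ is one of the usual symmetric twice differentiable kernels with compact support used in density estimation. In our computer experiments, we used the triweight kernel
$$K(u) = \frac{35}{32}(1 - u^2)^3\, 1_{[-1,1]}(u).$$
For a constant $c > 0$ and $h = cn^{-1/5}$, the SMLE has been proved to converge at rate $n^{-2/5}$ with asymptotic limit distribution
$$n^{2/5}\{\tilde F_{nh}(t) - F_0(t)\} \to N\bigl(\beta(t), \sigma^2(t)\bigr), \quad \beta(t) = \frac{c^2 f_0'(t)}{2} \int u^2 K(u)\, du, \quad \sigma^2(t) = \frac{F_0(t)\{1 - F_0(t)\}}{c\, g(t)} \int K(u)^2\, du \qquad (4.2)$$
(see [16]). The SMLE is often used in the smooth bootstrap procedures described in Section 1 (see also the numerical example below). Let $\tilde F^*_{nh}(t)$ be the bootstrapped SMLE based on replacing $F_n$ in (4.1) by the bootstrapped MLE $\hat F_n$; then we have the following result, given the data $(T_1, \Delta_1), \ldots, (T_n, \Delta_n)$:
$$n^{2/5}\{\tilde F^*_{nh}(t) - \tilde F_{nh}(t)\} \to N\bigl(0, \sigma^2(t)\bigr), \qquad (4.3)$$
in probability. Note that, in contrast to the smooth bootstrap method described in [14], we do not need to estimate the convolution SMLE (see (4.7) below).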
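Because the MLE is a step function, the SMLE reduces to a finite sum of integrated-kernel values weighted by the jumps of $F_n$. The sketch below evaluates this sum for the triweight kernel, whose integral has a closed form; it omits any boundary correction, and the function names and toy jump locations are hypothetical.

```python
import numpy as np

def triweight_cdf(x):
    """Integrated triweight kernel: integral of (35/32)(1 - u^2)^3 over [-1, x]."""
    x = np.clip(x, -1.0, 1.0)
    return 0.5 + (35.0 / 32.0) * (x - x**3 + 0.6 * x**5 - x**7 / 7.0)

def smle(t_eval, jump_points, jump_sizes, h):
    """Evaluate sum_j IK((t - tau_j) / h) * p_j (no boundary correction)."""
    t_eval = np.asarray(t_eval, dtype=float)
    u = (t_eval[:, None] - jump_points[None, :]) / h
    return triweight_cdf(u) @ jump_sizes

# Toy MLE with values 0, 0.5, 0.5, 1 at the observation times 1, 2, 3, 4.
times = np.array([1.0, 2.0, 3.0, 4.0])
f_n = np.array([0.0, 0.5, 0.5, 1.0])
jumps = np.diff(np.concatenate([[0.0], f_n]))  # jumps of F_n: 0, 0.5, 0, 0.5
values = smle([2.0, 3.0, 4.0], times, jumps, h=0.5)
```

At a jump location the integrated kernel contributes half of the jump, so the toy SMLE equals 0.25, 0.5 and 0.75 at $t = 2, 3, 4$.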
To prove the asymptotic normality result for the nonparametric bootstrap given in (4.3), we prove (in Section 6) Lemma 4.1, which gives the representation (4.4) of the bootstrapped SMLE in terms of the bootstrap empirical measure $\hat P_n$. If the toy estimator is defined by (4.4) with $\hat P_n$ replaced by $P_n$ (see (4.5)), we have by Lemma 4.1 that, in probability, the bootstrapped SMLE minus the SMLE can be replaced by the difference of the two representations, which converges, conditional on the data $(T_1, \Delta_1), \ldots, (T_n, \Delta_n)$, to the same asymptotic limit as the corresponding expression in the original sample, in probability (see e.g. [21] for more details about the use of the bootstrap for kernel estimators). Finally, applying the central limit theorem to the expression above proves the asymptotic normality result for the bootstrapped SMLE given in (4.3). The proof of Lemma 4.1 is a generalization of the proof for the representation of the SMLE $\tilde F_{nh}(t)$ as the "toy-estimator" defined in (4.5). The latter proof is outlined in Section 11.3 of [17] and uses the result of Theorem 11.3 given in Section 11.2, which is the analogue of our Lemma 3.1 in the original sample. In our experiments we used the boundary correction method of [30], see also p. 328 in [17]. It is straightforward to show that the nonparametric bootstrap method remains valid under this boundary correction. Moreover, one should also take into account the bias defined in (4.2) when constructing confidence intervals around the SMLE. The bias issue is discussed in more detail via a simulation study in Section 4.1.1.
In the remainder of this section, we show the applicability of the bootstrap result (4.3) by constructing pointwise confidence intervals (CIs) around the SMLE. We consider two different simulation models and a real data example to illustrate the performance of these CIs.
In the first simulation study we compare our nonparametric bootstrap CIs with (a) the smooth bootstrap CIs proposed in [14], (b) the likelihood ratio intervals around the MLE F n proposed in [4], (c) the smooth bootstrap MLE-based intervals proposed in [32] and (d) Wald-type CIs, derived from the asymptotic normality of the SMLE.
In a second simulation study, we discuss the difficulties with the construction of pointwise CIs around the SMLE that are not necessarily specific to the bootstrap procedure but that have to be taken into account in order to obtain good CIs around the SMLE under current status data. We first describe a bandwidth selection procedure for choosing the bandwidth of the SMLE and we next discuss the effect of the bias on the performance of the CIs. The algorithms to produce the proposed CIs around the SMLE can be found in the R package curstatCI.

Simulation study 1: comparing CIs for the distribution function under current status data
To illustrate the performance of the nonparametric bootstrap procedure for constructing pointwise CIs of the distribution function, we consider a first simulation study based on $N = 5{,}000$ simulation runs from a model where both $X$ and $T$ have a Uniform(0,2) distribution. In this model the bias $\beta(t)$ defined in (4.2) is zero. The Studentized nonparametric bootstrap $1 - \alpha$ intervals are given by
$$\Bigl[\tilde F_{nh}(t) - Q^*_{1-\alpha/2}(t)\sqrt{S_{nh}(t)},\; \tilde F_{nh}(t) - Q^*_{\alpha/2}(t)\sqrt{S_{nh}(t)}\Bigr], \qquad (4.6)$$
where $Q^*_{\alpha}(t)$ is the $\alpha$th quantile of the $B$ bootstrap values $W^*_{nh}(t) = \{\tilde F^*_{nh}(t) - \tilde F_{nh}(t)\}/\sqrt{S^*_{nh}(t)}$ and where $S_{nh}(t)$ resp. $S^*_{nh}(t)$ are estimates of the variance $\sigma^2(t)$ defined in (4.2) (apart from the factor $cg(t)$, which drops out in the Studentized bootstrap procedure), computed from the original sample resp. the bootstrap sample. In Figure 2(a) we compare the proportion of times that $F_0(t)$ is not in the 95% bootstrap CIs for $t = 0.02, 0.04, \ldots, 2$ with the corresponding proportions obtained with (a) the smooth bootstrap procedure proposed in [14], (b) the likelihood ratio intervals around the MLE $F_n$ proposed in [4] and (c) the smooth bootstrap MLE-based intervals proposed in [32]. For samples of size $n = 1{,}000$, $B = 1{,}000$ bootstrap samples were generated for all bootstrap methods, and the triweight kernel is used for calculation of the SMLE with $h = 2n^{-1/5}$, where the constant $c = 2$ corresponds to the length of the support of the observation variable $T$. For the smooth bootstrap procedures (a) and (c), first a bootstrap sample $(T_1, \Delta^*_1), \ldots, (T_n, \Delta^*_n)$ is obtained by keeping the $T_i$ in the original sample fixed and by resampling the $\Delta^*_i$ from a Bernoulli distribution with probability $\tilde F_{nh}(T_i)$; then the bootstrap MLE $\hat F_n$ and SMLE $\tilde F^*_{nh}$ are estimated based on this bootstrap sample. The smooth bootstrap $1 - \alpha$ intervals around the SMLE proposed in [14] are then constructed via (4.6), except that the SMLE $\tilde F_{nh}(t)$ in the definition of $W^*_{nh}(t)$ is replaced by the convolution SMLE (4.7) and that the variance estimate in the bootstrap sample is adapted accordingly. The convolution SMLE corresponds to the extra level of smoothing introduced by the smooth bootstrap procedure and is hence not required for the nonparametric bootstrap.
The smooth bootstrap CIs of [32] around the MLE are based on the bootstrap differences $\hat F_n(t) - \tilde F_{nh}(t)$, where again the extra level of smoothing is introduced (since one subtracts $\tilde F_{nh}$ and not $F_n$) to justify the smooth bootstrap procedure.
The performance of the two SMLE-based CIs is comparable. The bootstrap intervals based on the classical bootstrap procedure, however, avoid calculation of the convolution SMLE defined in (4.7). The CIs in (b) and (c) have similar coverage proportions in the middle of the interval $[0, 2]$ but behave worse near the boundaries of the interval than the SMLE-based intervals. Figure 3(a) shows the average length of both bootstrap intervals around the SMLE in comparison with the average length of the likelihood ratio CIs of [4] and the smooth MLE-based CIs of [32]. The latter intervals are constructed around the MLE $F_n$ instead of the SMLE $\tilde F_{nh}$. The length of the MLE-based intervals is larger than the length of the SMLE-based intervals due to the fact that the MLE converges at the slower rate $n^{-1/3}$.
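The Studentized nonparametric bootstrap intervals around the SMLE can be sketched end to end as follows. This is an illustration only: the variance estimate `var` (a kernel-weighted sum of squared residuals divided by $n^2$) is an assumed form, no boundary correction is applied, and all function names are hypothetical; the R package curstatCI should be consulted for the authors' implementation.

```python
import numpy as np

def pava(y, w):
    """Weighted nondecreasing least-squares fit (pool adjacent violators)."""
    blocks = []
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            blocks.append([(m1 * w1 + m2 * w2) / (w1 + w2), w1 + w2, c1 + c2])
    return np.concatenate([np.full(c, m) for m, _, c in blocks])

def triweight(u):
    return np.where(np.abs(u) <= 1, (35.0 / 32.0) * (1 - u**2) ** 3, 0.0)

def triweight_cdf(x):
    x = np.clip(x, -1.0, 1.0)
    return 0.5 + (35.0 / 32.0) * (x - x**3 + 0.6 * x**5 - x**7 / 7.0)

def smle_and_var(t_eval, t, delta, w, h):
    """SMLE and an ASSUMED variance estimate; w holds resampling multiplicities."""
    keep = w > 0
    t, delta, w = t[keep], delta[keep], w[keep]
    order = np.argsort(t)
    t, delta, w = t[order], delta[order], w[order]
    f = pava(delta, w)                          # (bootstrap) MLE at the kept T_i
    jumps = np.diff(np.concatenate([[0.0], f]))
    u = (t_eval[:, None] - t[None, :]) / h
    est = triweight_cdf(u) @ jumps
    var = ((triweight(u) / h) ** 2 @ (w * (delta - f) ** 2)) / w.sum() ** 2
    return est, var

def bootstrap_ci(t_eval, t, delta, h, B=200, alpha=0.05, seed=0):
    rng = np.random.default_rng(seed)
    n = len(t)
    est, var = smle_and_var(t_eval, t, delta, np.ones(n), h)
    w_star = np.empty((B, len(t_eval)))
    for b in range(B):
        m = rng.multinomial(n, np.full(n, 1.0 / n)).astype(float)
        est_b, var_b = smle_and_var(t_eval, t, delta, m, h)
        # Center by the SMLE of the original sample, then Studentize.
        w_star[b] = (est_b - est) / np.sqrt(np.maximum(var_b, 1e-12))
    q_lo, q_hi = np.quantile(w_star, [alpha / 2, 1 - alpha / 2], axis=0)
    s = np.sqrt(var)
    return est, est - q_hi * s, est - q_lo * s

rng = np.random.default_rng(42)
n = 300
t_obs, x_obs = rng.uniform(0, 2, n), rng.uniform(0, 2, n)
delta_obs = (x_obs <= t_obs).astype(int)
grid = np.array([0.5, 1.0, 1.5])
est, lower, upper = bootstrap_ci(grid, t_obs, delta_obs, h=2 * n ** -0.2)
```

The centering uses the SMLE of the original sample itself, which is the feature that distinguishes this scheme from the smooth bootstrap with its convolution SMLE.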
Instead of constructing the Studentized bootstrap intervals, where the quantiles of the limiting distribution of the SMLE are derived from the bootstrap distribution, one can alternatively consider Wald-type confidence intervals using the quantiles of the normal distribution and an estimate of the asymptotic variance.

Figure 3. Average length of (a) the bootstrap CIs around the SMLE, including the smooth bootstrap CIs of [14], the likelihood ratio CIs of [4] (red, dashed) and the smooth MLE-based CIs of [32] (green, dotted); and (b) Wald-type CIs using the first estimate $\hat\sigma_{1,nh}$ (red, dashed), the second estimate $\hat\sigma_{2,nh}$ (blue, dashed-dotted) and the third estimate $\hat\sigma_{3,nh}$ (green, dotted). $n = 1{,}000$, $N = 5{,}000$, $B = 1{,}000$ and $h = 2n^{-1/5}$.
We compare three different estimates $\hat\sigma_{nh}(t)$ for $\sigma(t)$ defined in (4.2) and construct CIs given by
$$\tilde F_{nh}(t) \pm n^{-2/5}\, \hat\sigma_{nh}(t)\, z_{1-\alpha/2}, \qquad (4.8)$$
where $z_{\alpha}$ is the $\alpha$th quantile of the standard normal distribution. In this simulation study $\beta(t)$ defined in (4.2) is zero. The effect of $\beta(t)$ on the behavior of the intervals will be discussed in the second simulation study below. A first estimate $\hat\sigma_{1,nh}(t)$ is given by (4.9), where $g_{nh}$ is a classical kernel estimate for the density $g$ of the observation time $T \sim U(0, 2)$, using again the triweight kernel with bandwidth $h = 2n^{-1/5}$. A second estimate for $\sigma(t)$ is inspired by the fact that the SMLE is asymptotically equivalent to the toy-estimator defined in (4.5), whose sample variance is given in (4.10). This suggests taking the second estimate $n^{-2/5}\hat\sigma_{2,nh}(t)$ equal to the square root of (4.10), where $F_0$ is replaced by the MLE $F_n$ and $g$ is replaced by the kernel density estimate $g_{nh}$.
Contrary to the bootstrap procedure for constructing the CIs defined in (4.6), both estimates $\hat\sigma_{1,nh}(t)$ and $\hat\sigma_{2,nh}(t)$ require estimating the density $g$. A bootstrap based estimate $\hat\sigma_{3,nh}(t)$ for the variance, which avoids estimating $g$, is finally given in (4.11). Figure 4 compares the coverage proportions of the bootstrap CIs in (4.6) with those of the Wald-type CIs in (4.8) using the three different variance estimates described above. Pointwise confidence bands for the variance estimates are illustrated in Figure 5. The curves show the average variance estimate and the 5% and 95% empirical quantiles of the variance estimates at points $t = 0.02, 0.04, \ldots, 2$. The best results for the Wald-type CIs are obtained with the second variance estimate $\hat\sigma^2_{2,nh}(t)$, but the coverage proportions and average lengths (shown in Figure 3(b)) are inferior to the results obtained with the bootstrap CIs in (4.6). Estimating the density $g$ in $\hat\sigma_{1,nh}(t)$ and $\hat\sigma_{2,nh}(t)$ requires an additional bandwidth selection, whereas the estimate $\hat\sigma_{3,nh}(t)$ is straightforward to obtain and does not depend on an estimate of $g$. The variance of the first estimate $\hat\sigma^2_{1,nh}(t)$ is larger than the variance of the second and third variance estimates $\hat\sigma^2_{2,nh}(t)$ and $\hat\sigma^2_{3,nh}(t)$, especially near the boundaries of the support.
Although we have proven validity of the nonparametric bootstrap for constructing pointwise CIs around the SMLE, the performance of the CIs is often influenced by several other aspects that are not specifically due to the nonparametric bootstrap algorithm. In what follows we describe some of these issues further and analyze the problems that can arise in the construction of the CIs. In a second simulation study we investigate the bias effect. Estimation of the bias defined in (4.2) is known to be a rather difficult task since it requires estimating the derivative $f_0'$ of the density $f_0$ under current status data. Sufficiently accurate estimates of the bias are hard to obtain by direct estimation of $f_0'$. Besides estimating the derivative directly we therefore also explore the effect of the bandwidth choice on the performance of the pointwise CIs. We first describe a bandwidth selection procedure.

Bandwidth selection
In the previous simulation study, we considered taking the bandwidth equal to $h = 2n^{-1/5}$, where the factor 2 is based on the size of the support $[0, 2]$ of the density $f_0$. This choice gave satisfactory results on the performance of the CIs discussed above. A bad choice of the bandwidth can, however, seriously affect the performance of the SMLE. It is therefore advisable to use an approach that selects the bandwidth with respect to some optimization criterion. We apply the method proposed in [20], which uses bootstrap subsamples of smaller size from the original sample to estimate the pointwise mean squared error (MSE) of the SMLE. The method works as follows: to obtain an approximation to the optimal bandwidth minimizing the pointwise MSE, we generate $B$ bootstrap subsamples of size $m = o(n)$ from the original sample using the subsampling principle and take $c_{t,opt}$ as the minimizer of
$$\mathrm{MSE}(c) = B^{-1} \sum_{b=1}^{B} \bigl\{\tilde F^{b}_{m, cm^{-1/5}}(t) - \tilde F_{n, c_0 n^{-1/5}}(t)\bigr\}^2, \qquad (4.12)$$
where $\tilde F^{b}_{m, cm^{-1/5}}$ is the SMLE in the $b$th subsample with bandwidth $cm^{-1/5}$ and $\tilde F_{n, c_0 n^{-1/5}}$ is the SMLE in the original sample of size $n$ using an initial bandwidth $c_0 n^{-1/5}$ for some constant $c_0$. The bandwidth used for estimating the SMLE is next given by $h = c_{t,opt}\, n^{-1/5}$. In the simulation study below we show the results for $m = 50$ when generating subsamples from a sample of size $n = 1{,}000$. Other subsample sizes $m = 30, 100$ were considered as well, which resulted in similar optimal bandwidth choices. We used subsamples of size $m = 100$ resp. $m = 250$ when we generated data sets of size $n = 5{,}000$ resp. $n = 10{,}000$ from the model.
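The subsampling criterion can be sketched as a grid search over the bandwidth constant $c$ at a fixed point $t$. The sketch below assumes subsamples drawn without replacement, a user-supplied grid `c_grid` and a pilot constant `c0`; these choices and all function names are hypothetical, and no boundary correction is applied.

```python
import numpy as np

def pava(y, w):
    """Weighted nondecreasing least-squares fit (pool adjacent violators)."""
    blocks = []
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            blocks.append([(m1 * w1 + m2 * w2) / (w1 + w2), w1 + w2, c1 + c2])
    return np.concatenate([np.full(c, m) for m, _, c in blocks])

def triweight_cdf(x):
    x = np.clip(x, -1.0, 1.0)
    return 0.5 + (35.0 / 32.0) * (x - x**3 + 0.6 * x**5 - x**7 / 7.0)

def smle_at(t0, t, delta, h):
    """SMLE at the single point t0 (no boundary correction)."""
    order = np.argsort(t)
    f = pava(delta[order], np.ones(len(t)))
    jumps = np.diff(np.concatenate([[0.0], f]))
    return float(triweight_cdf((t0 - t[order]) / h) @ jumps)

def select_bandwidth_constant(t0, t, delta, c_grid, m=50, B=100, c0=1.0, seed=0):
    """Grid search for the constant c minimizing the subsampling MSE at t0."""
    rng = np.random.default_rng(seed)
    n = len(t)
    pilot = smle_at(t0, t, delta, c0 * n ** -0.2)   # SMLE in the full sample
    mse = np.zeros(len(c_grid))
    for _ in range(B):
        idx = rng.choice(n, size=m, replace=False)  # assumed: without replacement
        for k, c in enumerate(c_grid):
            mse[k] += (smle_at(t0, t[idx], delta[idx], c * m ** -0.2) - pilot) ** 2
    return c_grid[int(np.argmin(mse))], mse / B

rng = np.random.default_rng(7)
n = 300
t_obs, x_obs = rng.uniform(0, 2, n), rng.uniform(0, 2, n)
delta_obs = (x_obs <= t_obs).astype(int)
c_grid = np.array([0.5, 1.0, 1.5, 2.0, 2.5])
c_opt, mse = select_bandwidth_constant(1.0, t_obs, delta_obs, c_grid)
h_opt = c_opt * n ** -0.2   # bandwidth used for the SMLE in the full sample
```

The selected constant is rescaled with the full sample size $n$, not with the subsample size $m$, when the final bandwidth is formed.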

Simulation study 2: correcting the asymptotic bias
To investigate the effect of the bias on the construction of the pointwise CIs in (4.6), we consider a second simulation study where the event times are generated from a truncated exponential distribution on $[0, 2]$ and the censoring times are uniformly distributed on $[0, 2]$. The density of the event times is given by $f_0(t) = \exp(-t)/(1 - \exp(-2))\, 1_{[0,2]}(t)$ and therefore the bias $\beta(t)$ defined in (4.2) will influence the performance of the CIs. Figure 6 compares the proportion of times that $F_0(t)$ is not in the 95% bootstrap CIs for $t = 0.02, 0.04, \ldots, 2$ with the corresponding proportions in the bias corrected CIs given by
$$\Bigl[\tilde F_{nh}(t) - \beta(t)n^{-2/5} - Q^*_{1-\alpha/2}(t)\sqrt{S_{nh}(t)},\; \tilde F_{nh}(t) - \beta(t)n^{-2/5} - Q^*_{\alpha/2}(t)\sqrt{S_{nh}(t)}\Bigr],$$
where $Q^*_{1-\alpha/2}(t)$ and $S_{nh}(t)$ are defined above and where $\beta(t)$ is the true bias of the SMLE at timepoint $t$ defined in (4.2). The bandwidth of the SMLE is selected by the procedure described in Section 4.1.2. The coverage proportions of the uncorrected CIs are clearly smaller than the nominal 95% level at the left endpoint of the interval $[0, 2]$, the region where $\beta(t)$ is largest, and correcting for the bias effect is needed to obtain good CIs. Figure 6 suggests that the coverage proportions of the intervals will be satisfactory if the bias can be estimated sufficiently accurately. Estimation of the bias requires estimating the derivative $f_0'$ of the density, which is a rather difficult task with current status data. A kernel based estimate of $f_0'$ using the MLE $F_n$ is given by
$$\tilde f'_{n\tilde h}(t) = \tilde h^{-2} \int K'\Bigl(\frac{t - x}{\tilde h}\Bigr)\, dF_n(x),$$
where the bandwidth $\tilde h \sim n^{-1/9}$. In our experiments, we take the bandwidth of the estimate $\tilde f'_{n\tilde h}(t)$ equal to $\tilde h = \tilde c_{t,opt}\, n^{-1/9}$, where $\tilde c_{t,opt}$ is selected by the same bootstrap-MSE approach discussed in Section 4.1.2, but with the SMLE replaced by this derivative estimate. To obtain good estimates of $f_0'$ near the boundaries of the support, we consider the boundary correction method explained in Section 9.2 of [17]. A direct estimator of the actual bias is then obtained by first replacing $f_0'(t)$ in (4.2) by the estimate $\tilde f'_{n\tilde h}(t)$ and next multiplying by $n^{-2/5}$, i.e.
$$\tfrac{1}{2}\, c_{t,opt}^2\, \tilde f'_{n\tilde h}(t) \int u^2 K(u)\, du \cdot n^{-2/5},$$
which is of the order of the actual bias that has to be taken into account when constructing the CIs.
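For the triweight kernel the kernel constants entering the bias and variance of the SMLE are available in closed form. The short computation below assumes the usual expressions $\beta(t) = \tfrac12 c^2 f_0'(t)\int u^2K(u)\,du$ and $\sigma^2(t) = F_0(t)\{1-F_0(t)\}\int K(u)^2\,du/\{c\,g(t)\}$ for (4.2); the symbol $\hat b(t)$ for the direct bias estimate is introduced here only for illustration.

```latex
\begin{align*}
\int_{-1}^{1} u^2 K(u)\,du
  &= \frac{35}{32}\cdot 2\Bigl(\frac13 - \frac35 + \frac37 - \frac19\Bigr)
   = \frac{35}{32}\cdot\frac{32}{315} = \frac19, \\
\int_{-1}^{1} K(u)^2\,du
  &= \Bigl(\frac{35}{32}\Bigr)^{2}\cdot 2\cdot\frac{1024}{3003}
   = \frac{350}{429}, \\
\hat b(t)
  &= \frac12\,c^2\,\tilde f'_{n\tilde h}(t)\cdot\frac19\cdot n^{-2/5}
   = \frac{h^2\,\tilde f'_{n\tilde h}(t)}{18}, \qquad h = c\,n^{-1/5}.
\end{align*}
```

Under these assumptions the direct bias correction amounts to subtracting $h^2 \tilde f'_{n\tilde h}(t)/18$ from the SMLE, so a bandwidth of, say, $h = 0.5$ scales the derivative estimate by roughly $0.014$.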
Similarly to the estimate of the pointwise MSE defined in (4.12), we can also construct a bootstrap method for estimating the bias by using the subsampling principle described in [20]. Our estimate $\widehat{\mathrm{Bias}}(t)$ of the actual bias $\beta(t)n^{-2/5}$ is computed from these subsamples. Figure 7 compares the average true bias effect $\beta(t)n^{-2/5}$ and the average bias estimates obtained by either the direct estimation approach or the bootstrap based bias estimate for sample sizes $n = 1{,}000$, $5{,}000$ and $10{,}000$. Note that, since the bandwidth constant $c_{t,opt}$ used for estimating the SMLE is different in each simulation run, the true bias (depending on $c_{t,opt}$, see (4.2)) in each run is also different, and therefore the average true bias is shown in Figure 7. The actual size of the bias decreases with increasing sample size. The proportion of times that $F_0(t)$ is not in the 95% bootstrap CIs, shown in Figure 8, decreases if one corrects for the bias by one of the discussed bias estimates. The results for the direct bias estimate using the estimate $\tilde f'_{n\tilde h}$ are slightly better than the results for the bootstrap estimate of $\beta(t)n^{-2/5}$. The coverage proportions are, however, still anti-conservative for points at the left end of the support. We also considered constructing the bias corrected CIs in the uniform model used in Section 4.1.1, where the actual bias is zero (results not shown). The results of the uncorrected CIs in (4.6) were slightly better, and estimating the bias in this model has a somewhat negative effect on the coverage proportions of the pointwise CIs around the SMLE.
Similarly to the methods proposed in [14], we next investigate how the choice of the bandwidth can affect the coverage proportions and average length of our CIs. To this end, we consider the concept of undersmoothing proposed by [22] and take $c_{t,opt}\, n^{-1/4}$ as the bandwidth used in constructing the CIs defined in (4.6). The coverage proportions of the CIs for the exponential model, shown in Figure 9, illustrate that the performance of the CIs around the SMLE improves by undersmoothing. We also observed that with the smaller bandwidth choice $h = (1/3)c_{t,opt}\, n^{-1/5}$ the coverage proportions improve even further and give satisfactory results at the left end point of the support. This illustrates that a smaller bandwidth choice can indeed correct for the bias in the CIs.
The results of the CIs in (4.6) in the uniform model with a bandwidth $h = c_{t,opt}\, n^{-1/4}$ or $h = (1/3)c_{t,opt}\, n^{-1/5}$ are in line with the results obtained with a bandwidth $h = c_{t,opt}\, n^{-1/5}$ and similar to the results shown in Figure 4. This shows that undersmoothing in a model without bias has no negative effect on the coverage proportions of our CIs.
By undersmoothing, the length of our SMLE-based CIs increases, but the average length of the CIs remains considerably smaller than the average length of the CIs around the MLE proposed by [4] and [32] (see Table 1).

Table 1. Average length of the SMLE-based CIs and of the MLE-based CIs of [4] and [32] at timepoints $t = 0.5, 1, 1.5$.

Rubella data
We also applied the bootstrap procedures to the Rubella data set described by [24]. The data set contains 230 observations on the prevalence of rubella in Austrian males. For the smooth bootstrap, CIs were calculated in [14] using the bandwidth $h = c_{t,opt}\, n^{-1/4}$. Figure 10 shows the CIs obtained with the nonparametric bootstrap and illustrates the applicability of our method in a real data example. For comparison, we also show the CIs obtained by the methods of [4] and [32]. The latter CIs were obtained by the Rcpp scripts in [13]. The nonparametric bootstrap SMLE-based CIs, including the data-driven bandwidth procedure, can be generated with the R package curstatCI.

The current status linear regression model
In the current status linear regression model we are interested in the estimation of the regression parameter $\beta_0$ based on observations $(T_1, X_1, \Delta_1), \ldots, (T_n, X_n, \Delta_n)$ with $\Delta_i = 1_{\{Y_i \le T_i\}}$, where $Y_i = \beta_0' X_i + \varepsilon_i$ with i.i.d. random error terms $\varepsilon_i$, independent of $(T_i, X_i)$, with unknown distribution function $F_0$.

Figure 10. Rubella data: (a) CIs obtained with the nonparametric bootstrap, (b) MLE (red, solid) and CI obtained by the method of Banerjee and Wellner [4], (c) MLE (red, solid) and CI obtained by the method of Sen and Xu [32] with $B = 1{,}000$ 'smooth' bootstrap samples from the SMLE with bandwidth $h = 80n^{-1/5}$.
In [15] a simple score estimator $\beta_n$ was introduced, depending on the MLE $F_{n,\beta}$ for fixed $\beta$ and defined as the zero-crossing of a score function that is truncated by means of some fixed truncation parameter $\varepsilon \in (0, 1/2)$. It is proved in [15] that $\sqrt{n}(\beta_n - \beta_0)$ is asymptotically normal with mean zero and a variance determined by the quantities $V$ and $W$, which are defined through truncated expectations of functions $w(T, X, \Delta)$ for some deterministic function $w$, where $P$ denotes the probability measure of $(T, X, \Delta)$.
A bootstrap version $\hat\beta_n$ based on a bootstrap sample from $P_n$ is then defined as the zero-crossing of the corresponding bootstrap score function, in which $\hat F_{n,\beta}$ is the MLE in the bootstrap sample. A straightforward extension of the results given in Section 3 yields, as $n$ tends to infinity, the analogue (4.18) of (3.1) for $\hat F_{n,\beta}$. The validity of the bootstrap method follows from the fact that, in probability, conditionally on the data $(T_1, X_1, \Delta_1), \ldots, (T_n, X_n, \Delta_n)$, the expansion (4.19) holds, where the dominant term in its right-hand side is normally distributed with mean zero and variance $W$ conditional on $(T_1, X_1, \Delta_1), \ldots, (T_n, X_n, \Delta_n)$.

Remark 4.2.
The nonparametric bootstrap is also valid for the second estimator of $\beta_0$ proposed in [15], based on a different score function involving the MLE $F_{n,\beta}$ and the derivative of the SMLE $\tilde F_{nh,\beta}$ (constructed by the procedure described in Section 4.1).
To provide more insight into the finite sample behavior of the classical bootstrap estimators, we show in Tables 2 and 3 the results of two simulation studies for a one-dimensional regression model $Y = \beta_0 X + \varepsilon$. In the first simulation setting we take $\beta_0 = 0.5$ and consider Uniform(0,2) distributions for the variables $T$ and $X$; for the distribution of the random error $\varepsilon$ we take $f_0(e) = 384(e - 3/8)(5/8 - e)\, 1_{[3/8, 5/8]}(e)$. A picture of the density and distribution function of the random error in model 1 is shown in Figure 11. The first model is also analyzed in [15]. In the second simulation model $T$, $X$ and $\varepsilon$ are independently sampled from a standard normal distribution and $\beta_0 = 1$. A similar model was considered in [1].
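Data from the first simulation model are easy to generate, because the error density $384(e - 3/8)(5/8 - e)$ on $[3/8, 5/8]$ is that of $3/8 + B/4$ with $B \sim \mathrm{Beta}(2, 2)$. The sketch below uses this identity; the function name `simulate_model1` is hypothetical, and the indicator is taken to be the usual current status indicator $\Delta = 1_{\{Y \le T\}}$, which is an assumption about the observation scheme.

```python
import numpy as np

def simulate_model1(n, beta0=0.5, seed=0):
    """Draw (T, X, Delta) from model 1; eps is returned only for checking."""
    rng = np.random.default_rng(seed)
    t = rng.uniform(0, 2, n)
    x = rng.uniform(0, 2, n)
    # Beta(2,2) has density 6 b (1 - b); rescaling to [3/8, 5/8] gives
    # 24 * 16 * (e - 3/8) * (5/8 - e) = 384 (e - 3/8)(5/8 - e).
    eps = 3.0 / 8.0 + rng.beta(2, 2, n) / 4.0
    y = beta0 * x + eps
    delta = (y <= t).astype(int)   # assumed current status indicator
    return t, x, delta, eps

t, x, delta, eps = simulate_model1(10_000)
```

Only $(T, X, \Delta)$ would be available to the estimators; the unobserved errors are returned here so that the simulated error distribution can be inspected.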
With these simulations we want to point out that it is not necessary to use smoothing techniques for inference in the current status linear regression model. We compare the simple score estimator (SSE) described above with Han's maximum rank correlation estimator ([23], MRCE) and with the efficient score estimator (ESE) proposed in [15]. The asymptotic behavior of the MRCE for the current status model, also obtained without any smoothing techniques, is established in [1], where the author also proposes consistent kernel-based estimates of the asymptotic variance of the MRCE. We use these variance estimates to construct estimates for V, W and the almost (as determined by the truncation parameter ε) efficient variance of the SSE. For more details about the variance estimation we refer to [1]. A summary of N = 1,000 simulation runs from models 1 and 2 for different sample sizes n is given in Tables 2 and 3. Note that the differences between the limiting variances for the different estimation methods are tiny and that the effect of the truncation parameter on the asymptotic behavior of the score estimators is small. Tables 2 and 3 show that n times the variance tends to converge to the asymptotic variance for all estimators. The ESE performs worse for small sample sizes, and the results suggest using the SSE for point estimation of the regression parameter β 0 .
We constructed Wald-type CIs, similar to the intervals proposed in [1], using the asymptotic normal limiting distribution of the estimators, and compared the coverage proportion and average length of these intervals with bootstrap CIs based on the nonparametric bootstrap described in this paper, using B = 1,000 samples from the original data. For the MRCE, the validity of the classical bootstrap is proved in [33]. The Wald-type CIs remain anti-conservative for the ESE in model 2.
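The two kinds of intervals compared above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used for the tables: the text does not say which type of bootstrap interval was computed, so the sketch uses percentile intervals; `estimator` is a hypothetical callable mapping a data array with rows (T, X, Δ) to a scalar estimate, and `asym_var` stands for whatever estimate of the asymptotic variance is available.

```python
import numpy as np

def bootstrap_percentile_ci(data, estimator, B=1000, level=0.95, rng=None):
    """Percentile CI from B nonparametric bootstrap samples (rows drawn with replacement)."""
    rng = np.random.default_rng(rng)
    data = np.asarray(data)
    n = data.shape[0]
    stats = np.empty(B)
    for b in range(B):
        idx = rng.integers(0, n, size=n)   # resample row indices with replacement
        stats[b] = estimator(data[idx])    # recompute the estimate on the bootstrap sample
    alpha = 1.0 - level
    lower, upper = np.quantile(stats, [alpha / 2.0, 1.0 - alpha / 2.0])
    return lower, upper

def wald_ci(estimate, asym_var, n, z=1.959964):
    """Wald-type CI: estimate +/- z * sqrt(asym_var / n); z defaults to the 97.5% normal quantile."""
    half_width = z * np.sqrt(asym_var / n)
    return estimate - half_width, estimate + half_width

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    toy = rng.normal(size=(100, 3))
    # Toy estimator (assumption for illustration only): mean of the first column.
    print(bootstrap_percentile_ci(toy, lambda d: d[:, 0].mean(), B=500, rng=1))
    print(wald_ci(toy[:, 0].mean(), toy[:, 0].var(ddof=1), n=100))
```

Resampling whole rows keeps each triple (T, X, Δ) intact, which corresponds to drawing from the empirical measure of the original data.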
We observed (results not shown) that, in both models, the bias in estimating the efficient variance of the ESE remains larger than the bias of the asymptotic variance estimates for the SSE and the MRCE. Tables 2 and 3 show that the coverage proportion of the classical bootstrap CIs converges to the nominal 95% level and that the average length of the CIs obtained by resampling from the original data is smaller than the corresponding length of the Wald-type CIs. We also investigated the behavior of Studentized bootstrap CIs (results not shown) based on the variance estimate used in the construction of the Wald-type CIs, but no improvement was observed for the behavior of the bootstrap intervals.
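The summary quantities reported for the simulation runs (n times the variance of the estimates, coverage proportion and average interval length) can be computed along the following lines. The function names are hypothetical and the sketch assumes that the estimates and interval endpoints of all runs have been collected in arrays.

```python
import numpy as np

def scaled_variance(estimates, n):
    """n times the sample variance of the estimates over the simulation runs."""
    return n * np.var(np.asarray(estimates, dtype=float), ddof=1)

def coverage_and_length(lower, upper, true_value):
    """Proportion of intervals [lower, upper] containing true_value, and their average length."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    covered = (lower <= true_value) & (true_value <= upper)
    return covered.mean(), (upper - lower).mean()

if __name__ == "__main__":
    # Three toy intervals for a true value of 0.5; the second one misses it.
    print(coverage_and_length([0.0, 1.0, 0.4], [1.0, 2.0, 0.6], 0.5))
```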
Our results do not indicate that smoothing techniques lead to better performance and therefore suggest that smoothing should not be the primary concern in inference for the current status linear regression model. Note that the Wald-type CIs are constructed using kernel smoothing for the variance estimate and that the only results obtained without any smoothing are the bootstrap CIs for the SSE and the MRCE. It is noteworthy that the SSE tends to perform better than the MRCE, even though the MRCE does not depend on a nuisance parameter that cannot be estimated at the √n-rate. Based on these results, we recommend the use of the SSE in combination with the nonparametric bootstrap procedure for inference in the current status linear regression model.

Discussion
In this paper we studied the behavior of the nonparametric bootstrap in current status models. Asymptotic results show that, given the data, the $L_2$-distance between the bootstrap MLE and the underlying distribution function $F_0$ is of order $n^{-1/3}$. This result is noteworthy given the fact that the nonparametric bootstrap is inconsistent for generating the distribution of the MLE. Despite this negative result, we show that it is still possible to use the MLE for inference on certain functionals in the current status model. We illustrated the usefulness of this result by constructing pointwise confidence intervals around the SMLE and proved the validity of interval estimation in the current status linear regression model.
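For readers who want to experiment with the bootstrap MLE, the following sketch computes the MLE in the current status model and its counterpart in a bootstrap sample. It relies on the standard characterization of the MLE as the isotonic regression of the indicators Δ on the ordered observation times, computed here with a weighted pool-adjacent-violators algorithm. All function names are hypothetical, and extending the MLE between observation times as a right-continuous step function is a convention chosen for the sketch. Tied observation times, which always occur in bootstrap samples, are merged before pooling.

```python
import numpy as np

def current_status_mle(t, delta):
    """Return (distinct observation times, MLE values) via weighted pool-adjacent-violators."""
    t = np.asarray(t, dtype=float)
    delta = np.asarray(delta, dtype=float)
    uniq, inv = np.unique(t, return_inverse=True)
    w = np.bincount(inv).astype(float)        # multiplicity of each distinct time
    y = np.bincount(inv, weights=delta) / w   # mean of Delta at each distinct time
    vals, wts, sizes = [], [], []              # stack of pooled blocks
    for yi, wi in zip(y, w):
        vals.append(yi); wts.append(wi); sizes.append(1)
        # Merge the last two blocks while they violate monotonicity.
        while len(vals) > 1 and vals[-2] > vals[-1]:
            tot = wts[-2] + wts[-1]
            merged = (vals[-2] * wts[-2] + vals[-1] * wts[-1]) / tot
            size = sizes[-2] + sizes[-1]
            vals[-2:] = [merged]; wts[-2:] = [tot]; sizes[-2:] = [size]
    return uniq, np.repeat(vals, sizes)

def evaluate_step(points, values, s):
    """Right-continuous step function: value at the largest point <= s, and 0 before the first point."""
    values = np.asarray(values, dtype=float)
    idx = np.searchsorted(points, s, side="right") - 1
    return np.where(idx >= 0, values[np.clip(idx, 0, None)], 0.0)

def bootstrap_mle(t, delta, rng=None):
    """MLE computed from one bootstrap sample of the pairs (T_i, Delta_i), drawn with replacement."""
    rng = np.random.default_rng(rng)
    t = np.asarray(t); delta = np.asarray(delta)
    idx = rng.integers(0, len(t), size=len(t))
    return current_status_mle(t[idx], delta[idx])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = rng.uniform(0.0, 2.0, size=500)
    x = rng.exponential(1.0, size=500)         # toy survival times, for illustration only
    delta = (x <= t).astype(int)
    pts, vals = bootstrap_mle(t, delta, rng=1)
    print(evaluate_step(pts, vals, np.array([0.5, 1.0, 1.5])))
```

Repeating `bootstrap_mle` many times and comparing the resulting step functions with the true distribution function on a grid gives a rough numerical impression of the distance discussed above; it is not a substitute for the asymptotic result.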
The result is applicable to several other nonparametric estimators in the class of estimators with cube-root n convergence. Because of its connection with the MLE, applications of the nonparametric bootstrap involving the Grenander estimator, such as the smoothed Grenander estimator used in [7] or the goodness-of-fit tests described in [8], are worthy of study in further research.
Extensions to semiparametric models, where one considers bootstrapping a finite-dimensional parameter, are also possible. An example is the score estimator for the semiparametric monotone single index model proposed by [3], which is similar to the current status linear regression estimator discussed in Section 4.2. A general bootstrap consistency result for semiparametric M-estimators is derived in [5]. However, if computations are in the first instance based on nonparametric maximum likelihood estimators or least squares estimators of the infinite-dimensional parameter, with the finite-dimensional parameter temporarily fixed, local smooth functional theory is needed, where the remainder terms involving the cube-root-n M-estimator of the nuisance parameter are shown to be negligible by an application of a result of the type (3.1). The treatment of the remainder terms in this local smooth functional theory is a highly non-trivial matter. In [5], on the other hand, this negligibility is assumed to hold by condition SB3.
Furthermore, the results in [5] hold for a class of exchangeable bootstrap weights of which the multinomial weights considered in this paper are a special case. Although we did not investigate this in the present paper, extensions of our nonparametric bootstrap results to the more general bootstrap resampling schemes seem possible as well.
Another interesting extension of this research is the construction of confidence bands for the distribution function instead of the currently proposed pointwise confidence intervals. Note that our main result (3.2) does not imply a bound on $\sup_{t\in[0,R]} n^{1/3}|F_n(t) - F_0(t)|$. Such a bound, which no doubt would contain logarithmic factors, would be needed for confidence bands instead of our pointwise confidence intervals. The idea is that the process $t \mapsto n^{1/3}\{F_n(t) - F_0(t)\}$ will fall apart into asymptotically independent pieces, and that we therefore expect Gumbel-type distributions to enter, via the maximum of independent random variables. The theory for this still has to be developed, however. What struck us in the present simulation studies is how well the global behavior of our pointwise confidence intervals still held up, which indicates that the extra logarithmic factors do not have a very large impact.
Results similar to those presented in the current paper will probably follow for the more challenging interval censoring type II models, where the local limit theory for the MLE has not yet been settled. It is reasonable to believe that the nonparametric bootstrap also allows for inference with the maximum smoothed likelihood estimator studied in [12].
Proof. We only prove (6.1), since the proof of (6.2) is similar. Let $\mathcal{F}_t$ be the (Vapnik-Červonenkis) class of functions To prove (6.1), we use that an exponential tail bound can be derived from a bounded Orlicz norm $\|\cdot\|_{P,\psi}$, i.e., when taking $\psi_1(x) = \exp(x) - 1$, for x ≥ 0, we get, for x > 0 the inequality Using the second statement of Theorem 2.14.5 in [34], with p = 1, we get the following inequality: where $\|\cdot\|^*_{\mathcal{F}_t}$ denotes the so-called measurable majorant of $\|\cdot\|_{\mathcal{F}_t}$ (see [34]). (Note that we use temporarily the "*" notation which is used for bootstrap variables in the rest of the paper.) Furthermore, we have by the rightmost inequality of Theorem 2.14.1 of [34] that √ n P n − P n * Ft Pn,1 and where the supremum is over all discrete probability measures Q with F t Q, dG n (u), (6.5) . We next evaluate the second term on the right-hand side of (6.4). We have: in probability (since a term defined only on the probability space (X , A, P ) of order $O_p(1)$ is also of order $O_{P_M}(1)$ in probability). So we obtain, for j ≥ K in probability, conditioning on (T 1 , Δ 1 ), (T 2 , Δ 2 ), . . . using the inequality on Orlicz norms on p. 96 or 239 of [34]: for some c 2 > 0. This proves the statement.
Proof. As in the proof of Lemma 6.1, we consider the Vapnik-Červonenkis collection of functions: We have, using Theorem 2.14.1 of [34]: for some K > 0. Since, The result now easily follows, see, e.g., [25], p. 201.
As a consequence of Lemma 6.1 and Lemma 6.2 we get the following result. Lemma 6.3. LetV n andV n be defined bŷ where the processĜ n is defined in (3.3), and letD n =V n −V n . Then there exist constants K 1 , K 2 > 0 such that, for each j ≥ 1, j ∈ N, in probability.
Proof. We again only prove (6.1), since the proof of (6.2) is similar. First note:
We now prove Lemma 3.1.

Proof of Lemma 4.1
We introduce notations K h and K h to denote the scaled versions of K and K respectively: Proof. Define the function Denote the points of jump of the MLEF n byτ 1 , . . . ,τ m and define the piecewise constant functionψ t,h with only jumps atτ 1 , . . . ,τ m bȳ By the convex minorant interpretation ofF n , we have ψ t,h (u)(δ −F n (u))dP n (u, δ) = 0, (see the discussion of the SMLE in [17], p. 332). We can writẽ We first evaluate A I and show that this term is $o_{P_M}(n^{-2/5})$ in probability. We have: An argument similar to that of Lemma A.7 in [16] shows that and hence, in probability. Similarly to the proof of Lemma A.7 in [16], we can also show that (6.14) in probability, such that, We now study the term A II . Using the same inequality for ψ t,h −ψ t,h as used in the second display after (11.49) on p. 333 of [17], we get for some constant C > 0 that: for all u such that f 0 is positive and continuous in a neighborhood around u. We decompose the term A II as follows, For the first term on the right-hand side of the above display we write, where we use (6.15) in the last inequality. The first term in the display above is $o_{P_M}(n^{-2/5})$ in probability by (6.14) and (6.15). Since in probability, we have by Markov's inequality and Fubini's theorem that, As in the treatment of the term A I above, arguments similar to those of Lemma A.7 in [16] give: in probability.

The current status linear regression model: bootstrap validity
In this section we give a road map for the proof of the bootstrap validity in the current status linear regression model. We assume that the assumptions stated in Theorem 4.1 of [15] hold. Since the proof is very similar to the proof of Theorem 4.1 in [15], we leave the details to the interested reader. Consider the bootstrap score function ψ (ε) n (β) = x{δ −F n,β (t − β x)} dP n (t, x, δ), (6.19) for some fixed truncation parameter ε ∈ (0, 1/2). The main idea is to show that + o P M (n −1/2 + (β n − β 0 )), (6.20) in probability, where E ε denotes the unconditional expectation. As in [15] we can work with the definitionψ (ε) n (β n ) = 0, for the score estimatorβ n . Since by the proof of Theorem 4.1 in [15], we get that, The validity of the bootstrap then follows by the arguments given in Section 4.2. Very important in the proof of (6.20) is the conditional bootstrapped L 2 -result, (6.21) in probability, where F β is defined in (4.18). Letφβ n,F n,βn be a (random) piecewise constant version of φβ n , where and where, for a piecewise constant distribution function F with finitely many jumps at τ 1 < τ 2 < . . . , the functionφ β,F is defined in the following way. · δ −F n,βn (t −β n x) dP n (t, x, δ) It is shown in the proof of Theorem 4.1 in [15] that II b = o p (n −1/2 + (β n − β 0 )), and therefore II b = o P M (n −1/2 + (β n − β 0 )) in probability.
Using arguments similar to those in the proof of Theorem 4.1 in [15], we can also show that II a = o P M (n −1/2 ) in probability.