Asymptotic representation of presmoothed Kaplan–Meier integrals with covariates in a semiparametric censorship model
Introduction
In Survival Analysis and other fields, the variable of interest is a lifetime $Y$ which is observed under right-censoring. Therefore, rather than $Y$ one observes $(Z, \delta)$, where $Z = \min(Y, C)$ is the recorded (possibly censored) lifetime, $\delta = 1_{\{Y \le C\}}$ is a censoring indicator, and $C$ is the potential censoring time. Often, a $p$-dimensional vector of covariates $X$ is attached to each individual. Estimation of the expectation $S = E[\varphi(X, Y)]$ for a general transformation $\varphi$ is of interest; in particular, this allows for the estimation of regression coefficients in e.g. linear censored regression. Given a random sample $(X_i, Z_i, \delta_i)$, $1 \le i \le n$, Stute (1993) proposed the following estimator of $S$:
$$S_n = \sum_{i=1}^{n} W_{in}\, \varphi(X_{[i:n]}, Z_{i:n}),$$
where $Z_{i:n}$ is the $i$th ordered $Z$-datum, $X_{[i:n]}$ is the $i$th concomitant, and $W_{in}$ is the jump of the Kaplan–Meier estimator of the distribution function of $Y$ at $Z_{i:n}$. Under the two following identifiability assumptions:
- (i) $Y$ and $C$ are independent;
- (ii) $\delta$ and $X$ are independent conditionally on $Y$;
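The Kaplan–Meier weights behind Stute's estimator can be sketched in a few lines. This is a minimal illustration, assuming no ties among the observed times and the usual product form of the Kaplan–Meier jump, $W_{in} = \frac{\delta_{[i:n]}}{n-i+1}\prod_{j<i}\bigl(1 - \frac{\delta_{[j:n]}}{n-j+1}\bigr)$. The function names `kaplan_meier_weights` and `km_integral` are hypothetical and chosen only for this example.

```python
def kaplan_meier_weights(z, delta):
    """Return (order, weights): indices sorting z, and the Kaplan-Meier
    jumps at the ordered z-values. Assumes no ties among the z-values."""
    n = len(z)
    order = sorted(range(n), key=lambda i: z[i])
    weights = []
    surv = 1.0  # running product over earlier ranks j of (1 - delta_[j] / (n - j + 1))
    for rank, idx in enumerate(order, start=1):
        at_risk = n - rank + 1
        weights.append(surv * delta[idx] / at_risk)
        surv *= 1.0 - delta[idx] / at_risk
    return order, weights


def km_integral(x, z, delta, phi):
    """Weighted sum of phi over (concomitant covariate, ordered time) pairs."""
    order, weights = kaplan_meier_weights(z, delta)
    return sum(w * phi(x[i], z[i]) for i, w in zip(order, weights))
```

Without censoring every weight equals 1/n, so the estimator reduces to a sample mean. When the largest observation is censored, the weights add up to less than one.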
In some instances, information on the conditional probability of censoring is available. Introduce the function $m(x, z) = P(\delta = 1 \mid X = x, Z = z)$. Without covariates, Dikta (1998) proposed a semiparametric censorship model in which the function $m$ belongs to a certain parametric family. He introduced an estimator for the distribution function of $Y$ alternative to the Kaplan–Meier estimator, and he proved the asymptotic superiority of the new estimator in the sense of having a smaller asymptotic variance (Dikta et al., 2005). Recently, Dikta (2014) proved that the semiparametric estimator is asymptotically efficient. In the setting with covariates, de Uña-Álvarez and Rodríguez-Campos (2004) proved the strong consistency of
$$\sum_{i=1}^{n} W_{in}(m_n)\, \varphi(X_{[i:n]}, Z_{i:n}),$$
where
$$W_{in}(m_n) = \frac{m_n(X_{[i:n]}, Z_{i:n})}{n-i+1} \prod_{j=1}^{i-1} \left[ 1 - \frac{m_n(X_{[j:n]}, Z_{j:n})}{n-j+1} \right]$$
and where $m_n$ stands for a uniformly consistent estimator of $m$. These are ‘presmoothed’ Kaplan–Meier weights, in the sense that some preliminary smoothing of the probability of uncensoring is performed before the computation of the product-type weights.
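The presmoothing idea can be illustrated by replacing each censoring indicator in the Kaplan–Meier product with a fitted probability of uncensoring. The sketch below assumes the product form $W_{in}(m_n) = \frac{m_{[i]}}{n-i+1}\prod_{j<i}\bigl(1 - \frac{m_{[j]}}{n-j+1}\bigr)$, where $m_{[i]}$ is the fitted value attached to the $i$th ordered observation, and assumes no ties. The names `presmoothed_weights` and `presmoothed_integral` are hypothetical.

```python
def presmoothed_weights(z, m_values):
    """Return (order, weights) for presmoothed Kaplan-Meier weights.

    m_values[i] is the fitted probability that observation i is uncensored;
    passing the raw 0/1 indicators recovers the ordinary Kaplan-Meier jumps.
    """
    n = len(z)
    order = sorted(range(n), key=lambda i: z[i])
    weights = []
    surv = 1.0  # running product over earlier ranks j of (1 - m_[j] / (n - j + 1))
    for rank, idx in enumerate(order, start=1):
        at_risk = n - rank + 1
        weights.append(surv * m_values[idx] / at_risk)
        surv *= 1.0 - m_values[idx] / at_risk
    return order, weights


def presmoothed_integral(x, z, m_values, phi):
    """Weighted sum of phi using the presmoothed weights."""
    order, weights = presmoothed_weights(z, m_values)
    return sum(w * phi(x[i], z[i]) for i, w in zip(order, weights))
```

Unlike the Kaplan–Meier weights, these weights are positive at every observation as long as the fitted probabilities are positive, so censored observations also contribute to the sum.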
Under Dikta (1998)’s semiparametric model, $m(x, z) = m(x, z; \theta_0)$ with $\theta_0 \in \Theta$, and therefore the function $m$ is estimated by some parametric fit $m_n(x, z) = m(x, z; \theta_n)$. In such a case, and ignoring covariates for a moment, $W_{in}(m_n)$ is just the jump of Dikta’s semiparametric estimator of the distribution function of $Y$ at $Z_{i:n}$. Interestingly, application of these ‘presmoothed’ Kaplan–Meier weights in the presence of covariates allows for a variance reduction, as in the marginal setting. This was illustrated through simulations in the context of censored linear regression (de Uña-Álvarez and Rodríguez-Campos, 2004). More recent applications in which similar features are seen include estimation of a conditional distribution function (Iglesias-Pérez and de Uña-Álvarez, 2010), or estimation of multivariate distribution functions and transition probabilities in multi-state models (de Uña-Álvarez and Amorim, 2011, Amorim et al., 2011, Moreira et al., 2013). Still, asymptotic properties of the presmoothed Kaplan–Meier integral (for a general function $\varphi$) beyond consistency are unknown. Dikta et al. (2005) established an i.i.d. representation for such integrals in the absence of covariates; in this paper, we extend their results to the more general setting in which covariables are present. As a consequence, we obtain a CLT for the estimator; as an important application, we derive the asymptotic normality of regression estimators based on the semiparametric censorship model. We also prove that the asymptotic variance of the Kaplan–Meier-based estimator is larger than that of the presmoothed estimator. As in our previous papers, we assume a continuous cdf, and we use this continuity in the proofs. The case of a discrete distribution is different; typically, one has (besides assumptions (i) and (ii) above) the additional assumption of no common jumps of the censoring and the survival distribution (Stute, 1993).
The rest of the paper is organized as follows. In Section 2 we introduce the needed assumptions and the main results. In Section 3 we include a simulation study to investigate the finite sample performance of the proposed estimator. Proofs of the main results are given in Section 4. Some auxiliary results are collected and proved in Appendix A.
Locally efficient estimation in semiparametric censorship models has also been considered from the viewpoint of coarsening at random; see e.g. Robins and Rotnitzky (1992) or Robins and Finkelstein (2000). In these papers, the approach is based on inverse probability weighted augmented estimation, which allows for the construction of doubly robust estimators. This approach depends on a model for the coarsening mechanism (i.e. the conditional distribution of the observed data given the full data) as well as a model for the cumulative distribution function of the full data. The method we follow here is different in that we only model the conditional expectation of the binary indicator $\delta$ given $(X, Z)$. Although the consistency of our approach depends on the assumed model, it may provide accurate estimators even under slight misspecifications (see Section 3). Note besides that, since both $\delta$ and $(X, Z)$ are observable, a reasonable model for $m$ can be postulated by using binary regression techniques (Cox and Snell, 1989, Dikta et al., 2006).
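As an illustration of the binary regression step, the sketch below fits a logistic model for the probability of uncensoring given a scalar covariate and the observed time, using Newton–Raphson on the Bernoulli log-likelihood. The logistic link, the linear predictor, the simulated data, and all names (`fit_logistic`, `m_hat`, `true_theta`) are assumptions made for this example only; the source does not prescribe a particular parametric family.

```python
import math
import random


def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    k = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, k):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, k + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * k
    for r in range(k - 1, -1, -1):
        s = aug[r][k] - sum(aug[r][c] * x[c] for c in range(r + 1, k))
        x[r] = s / aug[r][r]
    return x


def fit_logistic(features, delta, n_iter=25):
    """Maximise sum_i [d_i log m_i + (1 - d_i) log(1 - m_i)] by Newton-Raphson."""
    rows = [[1.0] + list(f) for f in features]  # prepend an intercept column
    k = len(rows[0])
    theta = [0.0] * k
    for _ in range(n_iter):
        score = [0.0] * k
        info = [[0.0] * k for _ in range(k)]
        for row, d in zip(rows, delta):
            p = sigmoid(sum(t * r for t, r in zip(theta, row)))
            for a in range(k):
                score[a] += (d - p) * row[a]
                for b in range(k):
                    info[a][b] += p * (1.0 - p) * row[a] * row[b]
        step = solve_linear(info, score)
        theta = [t + s for t, s in zip(theta, step)]
    return theta


def m_hat(theta, x, z):
    """Fitted probability of uncensoring at covariate x and observed time z."""
    return sigmoid(theta[0] + theta[1] * x + theta[2] * z)


# Hypothetical data: indicators drawn from a logistic model in (x, z).
random.seed(0)
n = 400
true_theta = [0.5, 1.0, -0.8]
xs = [random.gauss(0.0, 1.0) for _ in range(n)]
zs = [random.expovariate(1.0) for _ in range(n)]
deltas = [1 if random.random() < m_hat(true_theta, x, z) else 0
          for x, z in zip(xs, zs)]

theta_hat = fit_logistic(list(zip(xs, zs)), deltas)
fitted = [m_hat(theta_hat, x, z) for x, z in zip(xs, zs)]
```

The fitted values could then serve as the presmoothed probabilities in the product-type weights. Plain Newton–Raphson is used here for transparency; it has no safeguards against separable data.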
Main results
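Under a semiparametric censorship model, we have $m \in \mathcal{M}$ with $\mathcal{M} = \{ m(\cdot, \cdot; \theta) : \theta \in \Theta \}$; i.e., $m(x, z) = m(x, z; \theta_0)$ for some $\theta_0 \in \Theta$. Following Dikta (1998), the parameter $\theta_0$ is estimated by the conditional MLE $\theta_n$, that is, by the maximizer of
$$L_n(\theta) = \prod_{i=1}^{n} m(X_i, Z_i; \theta)^{\delta_i} \left[ 1 - m(X_i, Z_i; \theta) \right]^{1 - \delta_i}.$$
By repeating the arguments of the proof of Theorem 2.3 in Dikta (1998) (adapted to covariates), the asymptotic normality of $\theta_n$ may be established (see our Lemma 1 in Section 4). Throughout the paper, for a given function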
Simulation study
In this section we include a simulation study to investigate the finite sample performance of the semiparametric estimator. In particular, we consider the indicator function $\varphi(x, y) = 1_{\{x \le x_0,\, y \le y_0\}}$ for a given point $(x_0, y_0)$, so the estimator reduces to an empirical bivariate distribution function evaluated at that point. For comparison purposes, we include the results pertaining to the Kaplan–Meier-based estimator (Stute, 1993).
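With the indicator choice of the transformation, the estimator is simply the total weight of the observations falling in a lower-left quadrant. A minimal sketch, assuming a scalar covariate and weights already aligned with the observations (the name `weighted_bivariate_cdf` is hypothetical):

```python
def weighted_bivariate_cdf(x, z, weights, x0, z0):
    """Sum the weights of the pairs (x_i, z_i) with x_i <= x0 and z_i <= z0.

    weights[i] is the weight attached to observation i, for instance a
    Kaplan-Meier or presmoothed weight; equal weights 1/n give the
    ordinary empirical bivariate distribution function.
    """
    return sum(w for xi, zi, w in zip(x, z, weights) if xi <= x0 and zi <= z0)
```

For example, with four observations and equal weights of 0.25, a quadrant containing two of the points returns 0.5.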
The simulation steps are as follows. We simulate a Gaussian
Proofs
To prove Theorem 1 we follow the steps in the proof of Theorem 2.1 in Dikta et al. (2005). For this, ten lemmas will be established. The first one gives the uniform convergence with rate for the presmoothing function . Lemma 2 gives a suitable representation of . A Taylor expansion then leads to another representation with several terms, which are analyzed in detail in Lemmas 3 to 10. In Lemma 3, Lemma 4, Lemma 5,
Acknowledgment
Work supported by the Grant MTM2011-23204 (FEDER support included) of the Spanish Ministerio de Ciencia e Innovación, and by the Grant MTM2014-55966-P of the Spanish Ministerio de Economía y Competitividad.
References (20)
- Amorim et al. (2011). Presmoothing the transition probabilities in the illness-death model. Statist. Probab. Lett.
- Dikta (1998). On semiparametric random censorship models. J. Statist. Plann. Inference.
- Dikta (2014). Asymptotically efficient estimation under semi-parametric random censorship models. J. Multivariate Anal.
- Dikta et al. (2005). The central limit theorem under semiparametric random censorship models. J. Statist. Plann. Inference.
- Stute (1993). Consistent estimation under random censorship when covariables are present. J. Multivariate Anal.
- Billingsley (1968). Convergence of Probability Measures.
- Cox and Snell (1989). Analysis of Binary Data.
- de Uña-Álvarez and Amorim (2011). A semiparametric estimator of the bivariate distribution function for censored gap times. Biom. J.
- de Uña-Álvarez and Rodríguez-Campos (2004). Strong consistency of presmoothed Kaplan–Meier integrals when covariables are present. Statistics.
- (2001). Weak representation of the cumulative hazard function under semiparametric random censorship models. Statistics.