Semiparametric regression analysis of doubly censored failure time data from cohort studies


Abstract

Doubly censored failure time data occur when the failure time of interest represents the elapsed time between two events, an initial event and a subsequent event, and the observations on both events may suffer censoring. A well-known example of such data is given by the acquired immune deficiency syndrome (AIDS) cohort study in which the two events are HIV infection and AIDS diagnosis, and several inference methods have been developed in the literature for their regression analysis. However, all of them only apply to limited situations or focus on a single model. In this paper, we propose a marginal likelihood approach based on a general class of semiparametric transformation models, which can be applied to much more general situations. For the implementation, we develop a two-step procedure that makes use of both the multiple imputation technique and a novel EM algorithm. The asymptotic properties of the resulting estimators are established using modern empirical process theory, and the simulation study conducted suggests that the method works well in practical situations. An application is also provided.


References

  • Cai T, Cheng S (2004) Semiparametric regression analysis for doubly censored data. Biometrika 91:277–290

  • De Gruttola V, Lagakos SW (1989) Analysis of doubly-censored survival data, with application to AIDS. Biometrics 45:1–11

  • Goggins WB, Finkelstein DM, Zaslavsky AW (1999) Applying the Cox proportional hazards model for analysis of latency data with interval censoring. Stat Med 18:2737–2747

  • Gomez G, Lagakos SW (1994) Estimation of the infection time and latency distribution of AIDS with doubly censored data. Biometrics 50:204–212

  • Huang J (1999) Asymptotic properties of nonparametric estimation based on partly interval-censored data. Statistica Sinica 9:501–519

  • Kim Y (2006) Regression analysis of doubly censored failure time data with frailty. Biometrics 62:458–464

  • Kim MY, De Gruttola V, Lagakos SW (1993) Analyzing doubly censored data with covariates, with application to AIDS. Biometrics 49:13–22

  • Kosorok MR (2008) Introduction to empirical processes and semiparametric inference. Springer, New York

  • Kosorok MR, Lee BL, Fine JP (2004) Robust inference for univariate proportional hazards frailty regression models. Ann Stat 32:1448–1491

  • Li Z, Owzar K (2016) Fitting Cox models with doubly censored data using spline-based sieve marginal likelihood. Scand J Stat 43:476–486

  • Li S, Hu T, Wang P, Sun J (2018) A class of semiparametric transformation models for doubly censored failure time data. Scand J Stat. https://doi.org/10.1111/sjos.12319

  • Lin DY, Ying Z (1994) Semiparametric analysis of the additive risk model. Biometrika 81:61–71

  • McMahan CS, Wang L, Tebbs JM (2013) Regression analysis for current status data using the EM algorithm. Stat Med 32:4452–4466

  • Pan W (2001) A multiple imputation approach to regression analysis for doubly censored data with application to AIDS studies. Biometrics 57:1245–1250

  • Rubin DB (1987) Multiple imputation for nonresponse in surveys. Wiley, New York

  • Rudin W (1973) Functional analysis. McGraw-Hill, New York

  • Sun J (1995) Empirical estimation of a distribution function with truncated and doubly interval-censored data and its application to AIDS studies. Biometrics 51:1096–1104

  • Sun J (2006) The statistical analysis of interval-censored failure time data. Springer, New York

  • Sun J, Liao Q, Pagano M (1999) Regression analysis of doubly censored failure time data with applications to AIDS studies. Biometrics 55:909–914

  • Sun L, Kim Y, Sun J (2004) Regression analysis of doubly censored failure time data using the additive hazards model. Biometrics 60:637–643

  • van der Vaart AW, Wellner JA (1996) Weak convergence and empirical processes. Springer, New York

  • Wang Y (2008) Dimension-reduced nonparametric maximum likelihood computation for interval-censored data. Comput Stat Data Anal 52:2388–2402

  • Wang L, McMahan CS, Hudgens MG, Qureshi ZP (2016) A flexible, computationally efficient method for fitting the proportional hazards model to interval-censored data. Biometrics 72:222–231

  • Wen CC, Chen YH (2011) Nonparametric maximum likelihood analysis of clustered current status data with the gamma-frailty Cox model. Comput Stat Data Anal 55:1053–1060

  • Zeng D, Lin DY (2006) Efficient estimation of semiparametric transformation models for counting processes. Biometrika 93:627–640

  • Zeng D, Cai J, Shen Y (2006) Semiparametric additive risks model for interval-censored data. Statistica Sinica 16:287–302

  • Zeng D, Mao L, Lin DY (2016) Maximum likelihood estimation for semiparametric transformation models with interval-censored data. Biometrika 103:253–271


Acknowledgements

The authors wish to thank the Editor, the Associate Editor and two reviewers for their helpful comments and suggestions. This work was supported by the National Natural Science Foundation of China (NSFC) (Grant Nos. 11871173 and 11471086).

Author information

Correspondence to Xia Cui.


Appendices

Appendix A: Expressions of the conditional expectations in the proposed EM algorithm

In this appendix, we provide explicit expressions for the conditional expectations involved in the proposed EM algorithm. For notational simplicity, we suppress the conditioning arguments in the expressions below.

$$\begin{aligned} \mathrm{E} ( Z_{ik})= & {} \delta _{o,i} \, \left\{ I(c_{k} = T^*_{i}) + \lambda _{k} \exp (X^T_{i}\varvec{\beta }) \mathrm{E} ( \mu _i)\, I(c_{k}> T^*_{i}) \right\} \\&+\,\delta _{c,i} I(R_{i}< \infty ) \left\{ \lambda _{k} \exp (X^T_{i}\varvec{\beta }) \psi _i \, I(L_{i} \le c_{k} \le R_{i}) \right. \\&\quad \left. +\,\lambda _{k} \exp (X^T_{i}\varvec{\beta }) \mathrm{E} ( \mu _i ) \, I(c_{k}> R_{i}) \right\} \\&+\, \delta _{c,i} I(R_{i} = \infty ) \, \lambda _{k} \exp (X^T_{i} \varvec{\beta }) \mathrm{E} ( \mu _i ) \, I(c_{k} > L_{i}),\\&\mathrm{E} ( \mu _i ) = \delta _{o,i} \,\frac{\int _{\mu _{i}} \mu _i^2 \exp (-\mu _{i} U_{i})\,\phi (\mu _{i}|r)\,\mathrm {d}\mu _{i}}{ e^{-G (U_{i})}\,G^\prime (U_{i})} + \delta _{c,i} I(R_{i} < \infty ) \\&\frac{ \exp (-G( V_{i}))\,G^\prime (V_{i}) - \exp (-G( W_{i}))\,G^\prime (W_{i})}{ \exp (-G (V_{i})) - \exp (-G (W_{i}))}\\&+\, \delta _{c,i} I(R_{i} = \infty ) \, G^\prime (V_{i}) , \end{aligned}$$

where \(V_{i} = \sum _{c_{k} < L_{i}} \, \lambda _{k} \exp (X^T_{i}\varvec{\beta }) \), \(W_{i} = \sum _{c_{k} \le R_{i}} \, \lambda _{k} \exp (X^T_{i}\varvec{\beta }) \), \(U_{i} = \sum _{c_{k} \le T^*_{i}} \lambda _{k} \exp (X^T_{i}\varvec{\beta }) \) and

$$\begin{aligned} \psi _i = \frac{ \int _{\mu _i} \,\mu _{i} (\exp (-\mu _{i} V_{i})- \exp (-\mu _{i} W_{i})) \, \{1 - \exp (-\mu _{i} (W_{i}-V_{i}) )\}^{-1}\, \phi (\mu _{i}|r)\,\mathrm {d}\mu _{i}}{\exp (-G (V_{i})) - \exp (-G (W_{i}))}. \end{aligned}$$

In particular, if \(\phi (\mu _i|r)\) is the gamma density function with known parameter r (so that \(\mu _i\) has mean one and variance r), then one has

$$\begin{aligned} G^\prime (x) = \frac{\int _{\mu _i} \mu _{i} \exp (-x\,\mu _i)\,\phi (\mu _i|r)\,\mathrm{d} \mu _i}{\exp ( -G (x) )} =\frac{(r \, x + 1)^{-r^{-1}-1}}{\exp (-G(x))}, \end{aligned}$$

and

$$\begin{aligned} \int _{\mu _{i}} \mu _i^2 \exp (-\mu _{i} x)\,\phi (\mu _{i}|r)\,\mathrm {d}\mu _{i} = (1+r)(r x + 1)^{-r^{-1} - 2}. \end{aligned}$$
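These closed forms are convenient for checking an implementation. The following is a minimal numerical sanity check, a sketch assuming \(\phi (\mu |r)\) is the gamma density with shape \(1/r\) and scale r (mean one and variance r), the parameterization consistent with the displayed formulas; all names are illustrative.

```python
# Sketch: numerical check of the gamma-frailty closed forms above.
# Assumption (ours, consistent with the displayed formulas): phi(mu|r)
# is the Gamma density with shape 1/r and scale r, so that
# exp(-G(x)) = E[exp(-x mu)] = (1 + r x)^(-1/r).
import numpy as np
from scipy import integrate
from scipy.stats import gamma

r, x = 0.5, 1.3
dist = gamma(a=1.0 / r, scale=r)  # frailty density phi(mu|r)

# E[mu exp(-x mu)] should equal (r x + 1)^(-1/r - 1), i.e. G'(x) exp(-G(x)).
lhs, _ = integrate.quad(lambda m: m * np.exp(-x * m) * dist.pdf(m), 0, np.inf)
print(lhs, (r * x + 1.0) ** (-1.0 / r - 1.0))

# E[mu^2 exp(-x mu)] should equal (1 + r)(r x + 1)^(-1/r - 2).
lhs2, _ = integrate.quad(lambda m: m**2 * np.exp(-x * m) * dist.pdf(m), 0, np.inf)
print(lhs2, (1.0 + r) * (r * x + 1.0) ** (-1.0 / r - 2.0))
```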

Furthermore, we propose to employ the Gauss–Laguerre quadrature technique to evaluate the following quantity, which has no closed form:

$$\begin{aligned} \int _{\mu _i} \,\mu _{i} (\exp (-\mu _{i} V_{i})- \exp (-\mu _{i} W_{i})) \, \{1 - \exp (-\mu _{i} (W_{i}-V_{i}))\}^{-1}\, \phi (\mu _{i}|r)\,\mathrm {d}\mu _{i}. \end{aligned}$$
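A sketch of this quadrature, under the same gamma-frailty assumption as above: substituting \(\mu = r t\) turns the integral into \(\int _0^{\infty } e^{-t} g(t)\,\mathrm{d}t\), which is exactly the form that Gauss–Laguerre nodes and weights approximate. Function names and the node count below are illustrative.

```python
# Sketch: Gauss-Laguerre approximation of the integral above, assuming
# (our assumption) the Gamma(shape 1/r, scale r) frailty density.
import numpy as np
from scipy.special import gamma as gamma_fn

def integrand_factor(mu, V, W, r):
    # mu * (e^{-mu V} - e^{-mu W}) / (1 - e^{-mu (W - V)}) * phi(mu|r),
    # with the exp(-t) kernel removed after substituting mu = r * t.
    a = 1.0 / r
    phi_no_kernel = mu ** (a - 1.0) / (gamma_fn(a) * r ** a)  # phi(mu|r) * e^{mu/r}
    ratio = (np.exp(-mu * V) - np.exp(-mu * W)) / (1.0 - np.exp(-mu * (W - V)))
    return mu * ratio * phi_no_kernel

def gauss_laguerre_integral(V, W, r, n_nodes=40):
    # laggauss gives nodes/weights for integrals of the form
    # int_0^inf exp(-t) g(t) dt; the factor r comes from dmu = r dt.
    t, w = np.polynomial.laguerre.laggauss(n_nodes)
    return r * np.sum(w * integrand_factor(r * t, V, W, r))

print(gauss_laguerre_integral(V=0.4, W=1.2, r=0.5))
```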

Appendix B: Proof that \(Q\left( \theta ,\theta ^{(l)}|S\right) \) is a concave function

In each iteration of the EM algorithm, \(\theta ^{(l+1)}\) can be obtained through the following two-step procedure. First, the estimator of each \(\lambda _k\) can be obtained by solving the estimating equation \(\partial {Q\left( \theta ,\theta ^{(l)}|S\right) } / \partial {\lambda _{k}} = 0\), which yields the closed-form expression

$$\begin{aligned} \lambda ^{(l+1)}_k = \frac{\sum _{i=1}^{n} \mathrm{E}(Z_{ik})}{\sum _{i=1}^{n} \mathrm{E}(\mu _{i})\,\exp ( X_i^T \varvec{\beta }) }, \quad \, k = 1,\ldots , K_n, \end{aligned}$$

where

$$\begin{aligned} Q\left( \theta ,\theta ^{(l)}|S\right) = \sum _{i=1}^n \,\sum _{k=1}^{K_n}\, \left\{ \mathrm{E} (Z_{ik}) \, X_i^T \varvec{\beta } + \log (\lambda _{k})\mathrm{E} (Z_{ik}) -\lambda _{k} \, \exp ( X_i^T \varvec{\beta }) \,\mathrm{E}(\mu _{i} ) \right\} , \end{aligned}$$

and \(\theta ^{(l)}\) denotes the estimator of \(\theta \) at the lth iteration. Let \(\Lambda = (\lambda _1, \ldots , \lambda _{K_n})\). For each value of \(\varvec{\beta }\), \(\Lambda ^{(l+1)} \) is the unique maximizer because the matrix of second-order partial derivatives \(\partial ^2 Q\left( \theta ,\theta ^{(l)}|S\right) /\partial \Lambda ^2\) is negative definite: it is a diagonal matrix whose kth diagonal component is

$$\begin{aligned} \frac{\partial ^2 Q\left( \theta ,\theta ^{(l)}|S\right) }{\partial \lambda _k^2} = -\frac{1}{\lambda _{k}^2} \, \sum _{i=1}^n \, \mathrm{E} (Z_{ik}), \end{aligned}$$

which is strictly negative since \(\mathrm{E} (Z_{ik}) > 0\).
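For illustration, a minimal sketch of this closed-form update; the array names and shapes are our own conventions, not the authors' code:

```python
# Sketch of the closed-form M-step update for the lambda_k's.
# EZ is an (n, K_n) array of E(Z_ik), Emu an (n,) array of E(mu_i),
# X an (n, q) covariate matrix and beta the current (q,) estimate.
import numpy as np

def update_lambda(EZ, Emu, X, beta):
    denom = np.sum(Emu * np.exp(X @ beta))  # sum_i E(mu_i) exp(X_i^T beta)
    return EZ.sum(axis=0) / denom           # lambda_k for k = 1, ..., K_n
```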

In the second step, after inserting the estimators above into \(Q\left( \theta ,\theta ^{(l)}|S\right) \), the objective function to be maximized for updating \(\varvec{\beta }\) becomes (up to an additive term free of \(\varvec{\beta }\))

$$\begin{aligned} H\left( \varvec{\beta },\theta ^{(l)}|S\right) = \sum _{i=1}^n \,\sum _{k=1}^{K_n}\, \mathrm{E} (Z_{ik}) \left\{ X_i^T\varvec{\beta } - \log \left[ \sum _{j=1}^n \mathrm{E}(\mu _{j}) \exp ( X_j^T\varvec{\beta })\right] \right\} . \end{aligned}$$

One can easily derive the second-order partial derivative of \(H\left( \varvec{\beta },\theta ^{(l)}|S\right) \) with respect to \(\varvec{\beta }\), given as follows

$$\begin{aligned} \frac{\partial ^2 H(\varvec{\beta }, \theta ^{(l)})}{\partial \varvec{\beta } \partial \varvec{\beta }^T}&= -\left\{ \sum _{i=1}^n \,\sum _{k=1}^{K_n}\, \mathrm{E} (Z_{ik}) \right\} \\&\quad \left\{ \frac{\sum _{i=1}^{n} \mathrm{E}(\mu _{i})\,\exp ( X_i^T \varvec{\beta })X_i X_i^T\sum _{i=1}^{n} \mathrm{E}(\mu _{i})\,\exp ( X_i^T \varvec{\beta }) }{\{\sum _{i=1}^{n} \mathrm{E}(\mu _{i})\,\exp ( X_i^T \varvec{\beta })\}^2 }\right. \\&\phantom {=\;\;}\left. - \frac{\left\{ \sum _{i=1}^{n} \mathrm{E}(\mu _{i})\,\exp ( X_i^T \varvec{\beta })X_i \right\} \left\{ \sum _{i=1}^{n} \mathrm{E}(\mu _{i})\,\exp ( X_i^T \varvec{\beta })X_i^T\right\} }{\{\sum _{i=1}^{n} \mathrm{E}(\mu _{i})\,\exp ( X_i^T \varvec{\beta })\}^2 } \right\} , \end{aligned}$$

which can be re-expressed as

$$\begin{aligned} \frac{\partial ^2 H\left( \varvec{\beta },\theta ^{(l)}|S\right) }{\partial \varvec{\beta } \partial \varvec{\beta }^T}&= -\left\{ \sum _{i=1}^n \,\sum _{k=1}^{K_n}\, \mathrm{E} (Z_{ik}) \right\} \\&\quad \left\{ \frac{\sum _{i=1}^{n} \sum _{j=1}^{n} \mathrm{E}(\mu _{j})\,\exp ( X_j^T \varvec{\beta }) \mathrm{E}(\mu _{i})\,\exp ( X_i^T \varvec{\beta })X_i X_i^T }{\{\sum _{i=1}^{n} \mathrm{E}(\mu _{i})\,\exp ( X_i^T \varvec{\beta })\}^2 }\right. \\&\phantom {=\;\;}\left. -\, \frac{\left\{ \sum _{i=1}^{n} \sum _{j=1}^{n} \mathrm{E}(\mu _{j})\,\exp ( X_j^T \varvec{\beta }) \mathrm{E}(\mu _{i})\,\exp ( X_i^T\varvec{\beta }) X_i X_j^T \right\} }{\{\sum _{i=1}^{n} \mathrm{E}(\mu _{i})\,\exp ( X_i^T \varvec{\beta })\}^2 } \right\} . \end{aligned}$$

Since interchanging the indices i and j does not affect \(\frac{\partial ^2 H\left( \varvec{\beta },\theta ^{(l)}|S\right) }{\partial \varvec{\beta } \partial \varvec{\beta }^T}\), we have

$$\begin{aligned}&\sum _{i=1}^{n} \sum _{j=1}^{n} \mathrm{E}(\mu _{j})\,\exp ( X_j^T \varvec{\beta }) \mathrm{E}(\mu _{i})\,\exp ( X_i^T \varvec{\beta })X_i X_i^T\\&\quad = \frac{1}{2} \sum _{i=1}^{n} \sum _{j=1}^{n} \mathrm{E}(\mu _{j})\,\exp ( X_j^T \varvec{\beta }) \mathrm{E}(\mu _{i})\,\exp ( X_i^T \varvec{\beta })X_i X_i^T\\&\quad \quad +\, \frac{1}{2} \sum _{j=1}^{n} \sum _{i=1}^{n} \mathrm{E}(\mu _{i})\,\exp ( X_i^T \varvec{\beta }) \mathrm{E}(\mu _{j})\,\exp ( X_j^T \varvec{\beta })X_j X_j^T\\&\quad = \frac{1}{2} \sum _{i=1}^{n} \sum _{j=1}^{n} \mathrm{E}(\mu _{j})\,\exp ( X_j^T \varvec{\beta }) \mathrm{E}(\mu _{i})\,\exp ( X_i^T \varvec{\beta })(X_i X_i^T + X_j X_j^T ). \end{aligned}$$

By a similar argument, we obtain

$$\begin{aligned}&\left\{ \sum _{i=1}^{n} \sum _{j=1}^{n} \mathrm{E}(\mu _{j})\,\exp ( X_j^T \varvec{\beta }) \mathrm{E}(\mu _{i})\,\exp ( X_i^T\varvec{\beta }) X_i X_j^T \right\} \\&\quad = \frac{1}{2} \left\{ \sum _{i=1}^{n} \sum _{j=1}^{n} \mathrm{E}(\mu _{j})\,\exp ( X_j^T\varvec{\beta }) \mathrm{E}(\mu _{i})\,\exp ( X_i^T \varvec{\beta }) (X_i X_j^T+ X_j X_i^T) \right\} . \end{aligned}$$

Let \(\psi _i = \mathrm{E}(\mu _{i})\,\exp ( X_i^T\varvec{\beta })\) and \(\psi _j = \mathrm{E}(\mu _{j})\,\exp ( X_j^T \varvec{\beta })\) (reusing the symbol \(\psi \) from Appendix A with a different meaning). Then \(\frac{\partial ^2 H\left( \varvec{\beta },\theta ^{(l)}|S\right) }{\partial \varvec{\beta } \partial \varvec{\beta }^T}\) can be expressed in the following equivalent form

$$\begin{aligned} \frac{\partial ^2 H\left( \varvec{\beta },\theta ^{(l)}|S\right) }{\partial \varvec{\beta } \partial \varvec{\beta }^T}&= -\frac{1}{2}\left\{ \sum _{i=1}^n \,\sum _{k=1}^{K_n}\, \mathrm{E} (Z_{ik}) \right\} \left\{ \frac{\sum _{i=1}^{n} \sum _{j=1}^{n} \psi _j \psi _i(X_i X_i^T+ X_j X_j^T) }{\{\sum _{i=1}^{n} \psi _i\}^2 }\right. \\&\phantom {=\;\;}\left. -\, \frac{\sum _{i=1}^{n} \sum _{j=1}^{n} \psi _j \psi _i (X_i X_j^T+X_j X_i^T) }{\{\sum _{i=1}^{n} \psi _i\}^2 } \right\} \\&= -\,\frac{1}{2}\left\{ \sum _{i=1}^n \,\sum _{k=1}^{K_n}\, \mathrm{E} (Z_{ik}) \right\} \left\{ \frac{\sum _{i=1}^{n} \sum _{j=1}^{n} \psi _j \psi _i (X_i X_i^T+ X_j X_j^T-X_i X_j^T-X_j X_i^T) }{\{\sum _{i=1}^{n} \psi _i\}^2 }\right\} \\&= -\,\frac{1}{2}\left\{ \sum _{i=1}^n \,\sum _{k=1}^{K_n}\, \mathrm{E} (Z_{ik}) \right\} \left\{ \frac{\sum _{i=1}^{n} \sum _{j=1}^{n} \psi _j \psi _i (X_i-X_j) (X_i-X_j)^T }{\{\sum _{i=1}^{n} \psi _i\}^2 }\right\} . \end{aligned}$$

In the following, we consider the quadratic form \(\phi ^T \frac{\partial ^2 H\left( \varvec{\beta },\theta ^{(l)}|S\right) }{\partial \varvec{\beta } \partial \varvec{\beta }^T} \phi \) for any \(\phi \in R^q\), which is given by

$$\begin{aligned}&-\frac{1}{2}\left\{ \sum _{i=1}^n \,\sum _{k=1}^{K_n}\, \mathrm{E} (Z_{ik}) \right\} \left\{ \frac{\sum _{i=1}^{n} \sum _{j=1}^{n} \psi _j \psi _i \phi ^T(X_i-X_j) (X_i-X_j)^T\phi }{\{\sum _{i=1}^{n} \psi _i\}^2 }\right\} \\&\quad = -\,\frac{1}{2}\left\{ \sum _{i=1}^n \,\sum _{k=1}^{K_n}\, \mathrm{E} (Z_{ik}) \right\} \left\{ \frac{\sum _{i=1}^{n} \sum _{j=1}^{n}\psi _j \psi _i \{\phi ^T(X_i-X_j)\}^2 }{\{\sum _{i=1}^{n} \psi _i\}^2 }\right\} . \end{aligned}$$

Since \(\mathrm{E} (Z_{ik})\), \(\mathrm{E}(\mu _{j})\) and \(\exp ( X_j^T \varvec{\beta })\) are positive for all i, k and j, \(\phi ^T \frac{\partial ^2 H\left( \varvec{\beta },\theta ^{(l)}|S\right) }{\partial \varvec{\beta } \partial \varvec{\beta }^T} \phi \) is non-positive. Furthermore, for nonzero \(\phi \), \(\phi ^T \frac{\partial ^2 H\left( \varvec{\beta },\theta ^{(l)}|S\right) }{\partial \varvec{\beta } \partial \varvec{\beta }^T} \phi = 0 \) only if \(\phi ^T(X_i-X_j) = 0\) for all i and j, which would mean that some linear combination of the covariates takes the same value for all subjects; in that case the corresponding regression parameter and \(\Lambda _0\) would be unidentifiable. Since the model was shown to be identifiable in the proof of Theorem 1, we can conclude that \(\phi ^T \frac{\partial ^2 H\left( \varvec{\beta },\theta ^{(l)}|S\right) }{\partial \varvec{\beta } \partial \varvec{\beta }^T} \phi < 0\) for any nonzero \(\phi \in R^q\), which indicates that \( \frac{\partial ^2 H\left( \varvec{\beta },\theta ^{(l)}|S\right) }{\partial \varvec{\beta } \partial \varvec{\beta }^T}\) is negative definite. That is, \(Q\left( \theta ,\theta ^{(l)}|S\right) \) is a concave function and \(Q\left( \theta ^{(l+1)},\theta ^{(l)}|S\right) \ge Q\left( \theta ^{(l)},\theta ^{(l)}|S\right) \) for each \(l>0\).
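Because this Hessian is negative definite, Newton's method is a natural way to update \(\varvec{\beta }\) in the second step. The sketch below (illustrative interfaces, not the authors' implementation) evaluates the gradient and the weighted-covariance form of the Hessian of H just derived and takes one Newton step:

```python
# Sketch of one Newton step for beta based on the gradient and the
# negative-definite Hessian of H(beta, theta^(l)|S) derived above.
# EZ, Emu, X, beta are as in the previous sketch.
import numpy as np

def newton_step_beta(EZ, Emu, X, beta):
    total_EZ = EZ.sum()                      # sum_{i,k} E(Z_ik)
    psi = Emu * np.exp(X @ beta)             # psi_i = E(mu_i) exp(X_i^T beta)
    wmean = (psi[:, None] * X).sum(axis=0) / psi.sum()  # weighted mean of X
    # Gradient: sum_{i,k} E(Z_ik) X_i - (sum_{i,k} E(Z_ik)) * weighted mean.
    grad = (EZ.sum(axis=1)[:, None] * X).sum(axis=0) - total_EZ * wmean
    Xc = X - wmean                           # centered covariates
    wcov = (psi[:, None] * Xc).T @ Xc / psi.sum()       # weighted covariance
    hess = -total_EZ * wcov                  # negative definite by Appendix B
    return beta - np.linalg.solve(hess, grad)
```

Alternating this step with the \(\lambda _k\) update above completes one iteration of the proposed EM algorithm.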

Appendix C: Proofs of the asymptotic properties

In this appendix, we sketch the proofs of Theorems 1 and 2, which are mainly based on results from empirical process theory given in van der Vaart and Wellner (1996). In the following, for a function f and a random variable Y with distribution P, define \({\mathbb {P}} f = \int f(y) \mathrm{d}P(y)\) and \({\mathbb {P}}_n f = n^{-1}\sum _{i=1}^n f(Y_i)\).

Proof of Theorem 1

Define the likelihood function for a single subject conditional on S as

$$\begin{aligned} L(\theta |{\mathcal {O}}, S)= & {} \left\{ \Lambda ^{\prime }(T-S) \exp (X^T \varvec{\beta }) \, G^{\prime } \{\Lambda (T-S) \exp (X^T \varvec{\beta })\} \right. \nonumber \\&\quad \left. \exp \left[ -G \{\Lambda (T-S) \exp (X^T \varvec{\beta })\} \right] \right\} ^{\delta _{o}}\nonumber \\&\times \left\{ \exp \left[ -G \{\Lambda (T_{L}-S) \exp (X^T\varvec{\beta })\} \right] \right. \nonumber \\&\quad \left. -\, \exp \left[ -G \{\Lambda (T_{R}-S) \exp (X^T \varvec{\beta })\} \right] \right\} ^{\delta _{c}}, \end{aligned}$$
(2)

where \(\theta = (\varvec{\beta }, \Lambda )\), \({\mathcal {O}}=(\delta _{o},\delta _{c},T,T_L,T_R,X)\), \(\delta _{c}= I(T_{L} < T_{R})\) and \(\delta _{o}= I(T_{L} = T_{R})\). For any \(0<\alpha <1,\) define the class of functions

$$\begin{aligned} {\mathcal {L}}=\{\log (1-\alpha +\alpha p({\mathcal {O}}|\theta , {\hat{F}}_{c})/p({\mathcal {O}}|\theta _0, {\hat{F}}_{c})):p\in {\mathcal {P}}\}, \end{aligned}$$

where \({\mathcal {P}}=\{p({\mathcal {O}}|\theta , {\hat{F}}_{c}):\theta \in \Theta \}\) and \(p({\mathcal {O}}|\theta , {\hat{F}}_{c}) = \int _{S_{L}}^{S_{R}} L(\theta |{\mathcal {O}},s) \, \mathrm{d} {\hat{F}}_{c}(s)\). In the following, we will first show that \({\mathcal {L}}\) is a Donsker class under conditions (A1)–(A5).

First, we show that \({\mathcal {P}}\) is a Donsker class. Note that \(\{X^T \varvec{\beta },\varvec{\beta } \in {\mathcal {B}}\}\) and \(\{\Lambda (\cdot ): \Lambda \) is a bounded and increasing right-continuous function with \(\Lambda (0) = 0\}\) are both Donsker classes, where the latter holds because of its bounded total variation. Since \(G\), \(G'\) and \(\exp (\cdot )\) are continuously differentiable functions, the preservation of the Donsker property based on Theorem 2.10.6 of van der Vaart and Wellner (1996) implies that the classes \(\{G^{(k)}(\Lambda (t)\exp (X^T\varvec{\beta }))\}\), \(k=0, 1\), and \(\{\exp (X^T\varvec{\beta }),\varvec{\beta }\in {\mathcal {B}}\}\) are Donsker classes. Furthermore, by using the preservation of the Donsker property under summation and products, as given in Examples 2.10.7–2.10.8 of van der Vaart and Wellner (1996), we can conclude that the class \({\mathcal {P}}\) is a Donsker class. Next, since \(p({\mathcal {O}}|\theta , {\hat{F}}_{c})\) is bounded away from zero, for any \(p({\mathcal {O}}|\theta _1, {\hat{F}}_{c}),p({\mathcal {O}}|\theta _2, {\hat{F}}_{c}) \in {\mathcal {P}},\) the mean value theorem gives

$$\begin{aligned}&|\log (1-\alpha +\alpha p({\mathcal {O}}|\theta _1, {\hat{F}}_{c})/p({\mathcal {O}}|\theta _0, {\hat{F}}_{c}))-\log (1-\alpha +\alpha p({\mathcal {O}}|\theta _2, {\hat{F}}_{c})/p({\mathcal {O}}|\theta _0, {\hat{F}}_{c}))|\\&\quad \le K|p({\mathcal {O}}|\theta _1, {\hat{F}}_{c})-p({\mathcal {O}}|\theta _2, {\hat{F}}_{c})|, \end{aligned}$$

where K denotes some positive constant. Then we can conclude that \({\mathcal {L}}\) is a Donsker class. Define \(p_n({\mathcal {O}}, {\hat{F}}_{c}) = p({\mathcal {O}}|{\hat{\theta }}_n, {\hat{F}}_{c})\) and \(p_0({\mathcal {O}},{\hat{F}}_{c}) = p({\mathcal {O}}|\theta _0, {\hat{F}}_{c})\). Since \({\hat{\theta }}_n\) is the maximum marginal likelihood estimator of \(\theta \) in \(\Theta \) and \(\theta _0\in \Theta ,\) we have

$$\begin{aligned} \sum _{i=1}^n\log p_n({\mathcal {O}}_i,{\hat{F}}_{c})\ge \sum _{i=1}^n\log p_0({\mathcal {O}}_i,{\hat{F}}_{c}), \end{aligned}$$

and hence

$$\begin{aligned} \sum _{i=1}^n\log \frac{p_n({\mathcal {O}}_i,{\hat{F}}_{c})}{p_0({\mathcal {O}}_i,{\hat{F}}_{c})}\ge 0. \end{aligned}$$

By concavity of the function \(x\rightarrow \log x,\) for any \(0<\alpha <1,\)

$$\begin{aligned} \frac{1}{n}\sum _{i=1}^n\log \biggl (1-\alpha +\alpha \frac{p_n({\mathcal {O}}_i,{\hat{F}}_{c})}{p_0({\mathcal {O}}_i,{\hat{F}}_{c})}\biggl )\ge 0. \end{aligned}$$
(L.1)

The left-hand side of (L.1) can be written as

$$\begin{aligned} ({\mathbb {P}}_n - {\mathbb {P}})\log \biggl (1-\alpha +\alpha \frac{p_n({\mathcal {O}},{\hat{F}}_{c})}{p_0({\mathcal {O}},{\hat{F}}_{c})}\biggl )+ {\mathbb {P}}\log \biggl (1-\alpha +\alpha \frac{p_n({\mathcal {O}},{\hat{F}}_{c})}{p_0({\mathcal {O}},{\hat{F}}_{c})}\biggl ). \end{aligned}$$
(L.2)

Since we have shown that \({\mathcal {L}}\) is a Donsker class, the first term of (L.2) converges to zero almost surely (Lemma 8.17 in Kosorok 2008). Since \({\mathcal {B}}\) is bounded, for any subsequence of \(\hat{\varvec{\beta }}_n\), we can find a further subsequence converging to \(\varvec{\beta }_*\in \bar{{\mathcal {B}}}\), the closure of \({\mathcal {B}}.\) Moreover, by Helly's selection theorem, for any subsequence of \({\hat{\Lambda }}_n,\) we can find a further subsequence converging to some increasing function \(\Lambda _*.\) Choose the convergent subsequences of \(\hat{\varvec{\beta }}_n\) and of \({\hat{\Lambda }}_n\) so that they have the same indices, and without loss of generality, assume that \(\hat{\varvec{\beta }}_n\) converges to \(\varvec{\beta }_*\) and that \({\hat{\Lambda }}_n\) converges to \({\Lambda }_*.\) Let \(p_*({\mathcal {O}},{\hat{F}}_{c})=p({\mathcal {O}}|\theta _*, {\hat{F}}_{c})\), where \(\theta _* = (\varvec{\beta }_*, \Lambda _*)\). By the bounded convergence theorem, the second term of (L.2) converges to

$$\begin{aligned} {\mathbb {P}}\log \biggl (1-\alpha +\alpha \frac{p_*({\mathcal {O}},{\hat{F}}_{c})}{p_0({\mathcal {O}},{\hat{F}}_{c})}\biggl ), \end{aligned}$$

which is nonnegative. However, by Jensen's inequality, it must be nonpositive, so it must be zero. It follows that \(p_*({\mathcal {O}},{\hat{F}}_{c})=p_0({\mathcal {O}},{\hat{F}}_{c})\) with probability one. Furthermore, note that \({\hat{F}}_{c}\) is the nonparametric maximum likelihood estimator of \(F_{c}\) based on \((S_L, S_R)\); by condition (A1) and the arguments of Huang (1999), \({\hat{F}}_c(s)\) is a consistent estimator of \(F_{c0}(s)\), where \(F_{c0}\) is the true value of \(F_{c}\), the conditional distribution function of the initial event. It is then easy to see that \(p_*({\mathcal {O}},F_{c0})=p_0({\mathcal {O}},F_{c0})\). In particular, by choosing \(\delta _c = 1\) and \(R =\infty \), we can obtain

$$\begin{aligned} \exp \left[ -G \left\{ \Lambda _0(L) \exp (X^T\varvec{\beta }_0)\right\} \right] = \exp \left[ -G \left\{ \Lambda _*(L) \exp (X^T\varvec{\beta }_* )\right\} \right] . \end{aligned}$$

Then by the monotonicity of \(G(\cdot )\) and \(\exp (\cdot ),\) we have \(\log \Lambda _0(\cdot ) + X^T\varvec{\beta }_0 = \log \Lambda _*(\cdot ) +X^T\varvec{\beta }_*.\) Therefore, by condition (A2), we obtain \(\varvec{\beta }_*=\varvec{\beta }_0\) and \(\Lambda _*(\cdot )=\Lambda _0(\cdot )\). It follows that \(\hat{\varvec{\beta }}_n\rightarrow \varvec{\beta }_0\) and that \({\hat{\Lambda }}_n(t)\) converges weakly to \(\Lambda _0(t)\) almost surely. The latter convergence can be strengthened to uniform convergence since \(\Lambda _0(t)\) is continuous. Thus, we have proved Theorem 1. \(\square \)

Proof of Theorem 2

To prove this theorem, we check the conditions given in Theorem 3.3.1 of van der Vaart and Wellner (1996). Consider the set \({\mathcal {H}}=\{(h_1,h_2):h_1\in R^q, h_2(\cdot ) \in BV[0, \tau ]; \Vert h_1\Vert \le 1, \Vert h_2\Vert _V\le 1\}\), where \(\Vert h_2\Vert _V\) denotes the total variation of \(h_2(\cdot )\) in \([0, \tau ].\) Let \(n_1\) denote the number of subjects whose subsequent events can be observed exactly, let \(n_2 = n - n_1\), and assume \(n_1/n \rightarrow \alpha _1 \) with \(0 < \alpha _1 \le 1\). Consider the submodels \(\varvec{\beta }_{\epsilon }= \varvec{\beta } + \epsilon h_1 \) and \(\Lambda _{\epsilon }(t) = \Lambda (t) + \epsilon \int _0^{t} h_2(v)\mathrm{d}\Lambda (v)\). Then the score functions along these submodels are given by

$$\begin{aligned}&U_{n1}(h_1) = \frac{1}{n} \frac{\partial l(\varvec{\beta }_{\epsilon }, \Lambda , {\hat{F}}_{c})}{\partial {\epsilon }}\bigg |_{\epsilon = 0}\\&\quad = \frac{ \int _{S_{L_1}}^{S_{R_1}} \cdots \int _{S_{L_n}}^{S_{R_n}} \, L_n(\varvec{\beta }, \Lambda |s) \left\{ \frac{n_1}{n}{\mathbb {P}}_{n1}A_{1}(T, s, X, h_1) + \frac{n_2}{n}{\mathbb {P}}_{n2}B_{1}(T, s, X, h_1) \right\} \, \prod _{i=1}^{n} \mathrm{d}{\hat{F}}_c(s_i)}{L_n(\varvec{\beta },\Lambda , {\hat{F}}_{c})},\\&U_{n2}(h_2) = \frac{1}{n} \frac{\partial l(\varvec{\beta }, \Lambda _{\epsilon }, {\hat{F}}_{c})}{\partial {\epsilon }}\bigg |_{\epsilon = 0}\\&\quad = \frac{ \int _{S_{L_1}}^{S_{R_1}} \cdots \int _{S_{L_n}}^{S_{R_n}} \, L_n(\varvec{\beta }, \Lambda |s) \left\{ \frac{n_1}{n}{\mathbb {P}}_{n1}A_{2}(T, s, X, h_2) + \frac{n_2}{n}{\mathbb {P}}_{n2}B_{2}(T, s, X, h_2) \right\} \, \prod _{i=1}^{n} \mathrm{d}{\hat{F}}_c(s_i)}{L_n(\varvec{\beta },\Lambda , {\hat{F}}_{c})}, \end{aligned}$$

where

$$\begin{aligned} {\mathbb {P}}_{n1} A_{1}(T, s, X, h_1)= & {} \frac{\sum _{i=1}^{n_1} A_{1i}(T_i, s_i, X_i, h_1)}{n_1}\\ {\mathbb {P}}_{n2} B_{1}(T, s, X, h_1)= & {} \frac{\sum _{i=n_1+1}^{n} B_{1i}(T_i, s_i, X_i, h_1)}{n_2} \\ {\mathbb {P}}_{n1} A_{2}(T, s, X, h_2)= & {} \frac{\sum _{i=1}^{n_1} A_{2i}(T_i, s_i, X_i, h_2)}{n_1}\\ {\mathbb {P}}_{n2} B_{2}(T, s, X, h_2)= & {} \frac{\sum _{i=n_1+1}^{n} B_{2i}(T_i, s_i, X_i, h_2)}{n_2}, \end{aligned}$$
$$\begin{aligned} A_{1i}(T_i, s_i, X_i, h_1)= & {} \frac{G^{\prime \prime } \{\Lambda (T_i-s_i) \exp (X^T_i \varvec{\beta })\} \Lambda (T_i-s_i) \exp (X^T_i \varvec{\beta }) {\tilde{X}}_i h_1}{\exp (-G\{\Lambda (T_i-s_i) \exp (X^T_i \varvec{\beta })\})},\\ B_{1i}(T_i, s_i, X_i, h_1)= & {} \frac{S_{i}(T_{R_i}) G^{\prime } \{\Lambda (T_{R_i}-s_i) \exp (X^T_i \varvec{\beta })\} \Lambda (T_{R_i}-s_i) \exp (X^T_i \varvec{\beta }) {\tilde{X}}_i h_1}{S_{i}(T_{L_i}) - S_{i}(T_{R_i})}\\&-\,\frac{S_{i}(T_{L_i}) G^{\prime } \{\Lambda (T_{L_i}-s_i) \exp (X^T_i \varvec{\beta })\} \Lambda (T_{L_i}-s_i) \exp (X^T_i \varvec{\beta }) {\tilde{X}}_i h_1}{S_{i}(T_{L_i}) - S_{i}(T_{R_i})},\\ A_{2i}(T_i, s_i, X_i, h_2)= & {} \frac{G^{\prime \prime } \{\Lambda (T_i-s_i) \exp (X^T_i \varvec{\beta })\} \exp (X^T_i \varvec{\beta }) \int _0^{T_i-s_i} h_2(v)\mathrm{d}\Lambda (v)}{\exp (-G\{\Lambda (T_i-s_i) \exp (X^T_i \varvec{\beta })\})},\\ B_{2i}(T_i, s_i, X_i, h_2)= & {} \frac{S_{i}(T_{R_i}) G^{\prime } \{\Lambda (T_{R_i}-s_i) \exp (X^T_i \varvec{\beta })\} \exp (X^T_i \varvec{\beta }) \int _0^{T_{R_i}-s_i} h_2(v)\mathrm{d}\Lambda (v)}{S_{i}(T_{L_i}) - S_{i}(T_{R_i})}\\&-\,\frac{S_{i}(T_{L_i}) G^{\prime } \{\Lambda (T_{L_i}-s_i) \exp (X^T_i \varvec{\beta })\} \exp (X^T_i \varvec{\beta }) \int _0^{T_{L_i}-s_i} h_2(v)\mathrm{d}\Lambda (v) }{S_{i}(T_{L_i}) - S_{i}(T_{R_i})}, \end{aligned}$$

and

$$\begin{aligned} S_{i}(t) = \exp \left[ -G \{\Lambda (t-s_i) \exp (X^T_i \varvec{\beta })\} \right] . \end{aligned}$$

Define \(U_{n}(\varvec{\beta }, \Lambda , {\hat{F}}_c)(h_1, h_2) = U_{n1}(h_1)+ U_{n2}(h_2)\) and \(u(\varvec{\beta }, \Lambda , {\hat{F}}_c)(h_1, h_2) = \lim _{n \rightarrow \infty } U_{n}(\varvec{\beta }, \Lambda , {\hat{F}}_c)(h_1, h_2) = u_{1}(h_1)+ u_{2}(h_2)\). It is easy to show that \(u(\varvec{\beta }_0, \Lambda _0, {\hat{F}}_c)(h_1, h_2) = 0\) and that \(u(\varvec{\beta }, \Lambda , {\hat{F}}_c)\) is Fréchet differentiable, since \(u(\varvec{\beta }, \Lambda , {\hat{F}}_c)\) is a smooth function of \(\varvec{\beta }\) and \(\Lambda \). Let \({\dot{u}}(\varvec{\beta }_0,\Lambda _0, {\hat{F}}_c)(\varvec{\beta }-\varvec{\beta }_0,\Lambda -\Lambda _0)[h_1,h_2] \) denote the corresponding Fréchet derivative of \(u(\varvec{\beta }, \Lambda , {\hat{F}}_c)\) at \( (\varvec{\beta }_0,\Lambda _0)\). After some algebra, we have

$$\begin{aligned}&{\dot{u}}(\varvec{\beta }_0,\Lambda _0, {\hat{F}}_c)(\varvec{\beta }-\varvec{\beta }_0,\Lambda -\Lambda _0)[h_1,h_2] \\&\quad = (\varvec{\beta } - \varvec{\beta }_0)^T {\mathcal {Q}}_1(h) + \int _{0}^{\tau } {\mathcal {Q}}_2(v,h) \mathrm{d}\{\Lambda (v)-\Lambda _0(v)\}, \end{aligned}$$

where \({\mathcal {Q}}_1(h) = B h_1+\int _0^{\tau } D_1(s)^T h_2(s) \mathrm{d} s\), and

$$\begin{aligned} {\mathcal {Q}}_2(v,h) = D_2(v)^Th_1 + \int _0^{\tau } D_3(s) h_2(s) \mathrm{d} s + D_4(v)h_2(v). \end{aligned}$$

Here, B is a constant matrix and \(D_1(\cdot )\), \(D_2(\cdot )\), \(D_3(\cdot )\) and \(D_4(\cdot )\) are continuously differentiable functions depending on the true distribution. Consider the following two classes of functions

$$\begin{aligned} {\mathcal {C}}_1(\varvec{\beta }_0, \Lambda _0)= & {} \{h_1^T U_1^*(\varvec{\beta }_0, \Lambda _0, {\hat{F}}_c): \Vert h_1\Vert \le 1\},\\ {\mathcal {C}}_2(\varvec{\beta }_0, \Lambda _0)= & {} \{U_2^*(\varvec{\beta }_0, \Lambda _0, {\hat{F}}_c)(h_2): h_2(\cdot ) \in BV[0, \tau ]\}, \end{aligned}$$

where

$$\begin{aligned} U_1^*(\varvec{\beta }_0, \Lambda _0, {\hat{F}}_c) = \frac{ \int _{S_{L}}^{S_{R}} \, L(\varvec{\beta }, \Lambda |s) \left\{ A_{1}(T, s, X, h_1) + B_{1}(T, s, {\tilde{X}}, h_1) \right\} \,\mathrm{d}{\hat{F}}_c(s)}{L(\varvec{\beta },\Lambda , {\hat{F}}_{c})}\Bigg |_{\varvec{\beta }=\varvec{\beta }_0, \Lambda = \Lambda _0} \end{aligned}$$

and

$$\begin{aligned} U_2^*(\varvec{\beta }_0, \Lambda _0, {\hat{F}}_c)(h_2) = \frac{ \int _{S_{L}}^{S_{R}} L(\varvec{\beta }, \Lambda |s) \left\{ A_{2}(T, s, X, h_2) + B_{2}(T, s, X, h_2) \right\} \, \mathrm{d}{\hat{F}}_c(s)}{L(\varvec{\beta },\Lambda , {\hat{F}}_{c})}\Bigg |_{\varvec{\beta }=\varvec{\beta }_0, \Lambda = \Lambda _0}. \end{aligned}$$

It is easy to show that \({\mathcal {C}}_1\) is a Donsker class since \(U_1^*(\varvec{\beta }_0, \Lambda _0, {\hat{F}}_c)\) is a bounded function under conditions (A1)–(A5). Since \(U_2^*(\varvec{\beta }_0, \Lambda _0, {\hat{F}}_c)(h_2)\) is also a bounded function for each \( h_2(\cdot ) \in BV[0, \tau ]\) and \({\mathcal {C}}_2\) can be viewed as the summation of Donsker classes, we can conclude that \({\mathcal {C}}_2\) is a Donsker class by the preservation of the Donsker property under summation. Therefore, \(n^{1/2} \{U_{n}(\varvec{\beta }_0, \Lambda _0, {\hat{F}}_c)(h) - u(\varvec{\beta }_0, \Lambda _0, {\hat{F}}_c)(h)\}\) converges weakly to a Gaussian process \(G^*\) on \(l^{\infty }({\mathcal {H}})\). Additionally, we can show that \({\mathcal {C}}_1(\varvec{\beta }, \Lambda )\) and \({\mathcal {C}}_2(\varvec{\beta }, \Lambda )\) are Donsker classes for any \(\varvec{\beta }\) and \(\Lambda \) satisfying \(\Vert \varvec{\beta }-\varvec{\beta }_0\Vert \rightarrow 0\) and \(\sup _{t\in [0, \tau ]}|\Lambda (t)-\Lambda _0(t)|\rightarrow 0\) as \(n \rightarrow \infty \), and thus we have

$$\begin{aligned} \sup _{h\in {\mathcal {H}}} \Big |(U_{n}-u)(\hat{\varvec{\beta }}_n, {\hat{\Lambda }}_n, {\hat{F}}_c)(h) - (U_{n}-u)(\varvec{\beta }_0, \Lambda _0, {\hat{F}}_c)(h)\Big |\rightarrow 0. \end{aligned}$$

Now we show that \({\dot{u}}(\varvec{\beta }_0,\Lambda _0, {\hat{F}}_c)\) is continuously invertible, which is equivalent to the invertibility of the linear operator \({\mathcal {Q}}(h)=({\mathcal {Q}}_1(h),{\mathcal {Q}}_2(h))\). It suffices to prove that \({\mathcal {Q}}(h)\) is a one-to-one map (Rudin 1973, pp. 99–103). Note that if \({\mathcal {Q}}(h)=0,\) then \({\dot{u}}(\varvec{\beta }_0,\Lambda _0, {\hat{F}}_c)=0\) for any \((\varvec{\beta },\Lambda )\) in the neighborhood of \((\varvec{\beta }_0,\Lambda _0).\) We choose \(\varvec{\beta } =\varvec{\beta }_0 + \epsilon h_1\) and \(\Lambda (t) = \Lambda _0(t) + \epsilon \int _0^t h_2(s) \mathrm{d} \Lambda (s).\) By the definition of \({\dot{u}}(\varvec{\beta }_0,\Lambda _0, {\hat{F}}_c),\) we have \({\dot{u}}(\varvec{\beta }_0,\Lambda _0)=-{\mathbb {P}}\{h_1^TU_1^*(\varvec{\beta }_0, \Lambda _0, {\hat{F}}_c)+ U_2^*(\varvec{\beta }_0, \Lambda _0, {\hat{F}}_c)(h_2)\}^2 = 0\), which implies \(h_1^TU_1^*(\varvec{\beta }_0, \Lambda _0, {\hat{F}}_c)+ U_2^*(\varvec{\beta }_0, \Lambda _0, {\hat{F}}_c)(h_2) = 0\) almost surely. Taking \(\delta _c=1\) and \(R=\infty \), after some algebra we have \(X^Th_1+h_2(t)=0\), and by condition (A2), we have \(h_1=0, h_2\equiv 0.\)

Based on Theorem 3.3.1 of van der Vaart and Wellner (1996), \(n^{1/2} [\{\hat{\varvec{\beta }}_n, {\hat{\Lambda }}_n(t)\} - \{\varvec{\beta }_0, \Lambda _0(t)\}]\) converges weakly to a tight Gaussian process \(\{{\dot{u}}(\varvec{\beta }_0,\Lambda _0, {\hat{F}}_c)\}^{-1}G^*\), and by the consistency of \({\hat{F}}_c(s)\), we know that \(\{{\dot{u}}(\varvec{\beta }_0,\Lambda _0, {\hat{F}}_c)\}^{-1}G^* \rightarrow \{{\dot{u}}(\varvec{\beta }_0,\Lambda _0, F_{c0})\}^{-1}G^*\), whose variance is given by

$$\begin{aligned} \int _0^{\tau } h_2(v) {\mathcal {Q}}^{-1}_2(v,h) \mathrm{d}\Lambda _0(v) + h_1^T {\mathcal {Q}}^{-1}_1(h), \end{aligned}$$

where \({\mathcal {Q}}^{-1}(h)=({\mathcal {Q}}^{-1}_1(h),{\mathcal {Q}}^{-1}_2(h))\) is the inverse of \({\mathcal {Q}}(h)\).

The resulting one-dimensional submodel is \(\epsilon \rightarrow \theta _{\epsilon ,h} =(\varvec{\beta }+\epsilon h_1,\Lambda (t)+\epsilon \int _0^th_2(s)\mathrm{d} \Lambda (s))\) and its derivative is \(\partial \theta _{\epsilon ,h}/\partial \epsilon |_{\epsilon =0}\equiv {\dot{\theta }}(h),\) where \(h=(h_1,h_2)\in {\mathcal {H}}\) and \(\theta =(\varvec{\beta },\Lambda )\). Define \(U_\theta (h)={\partial l(\theta _{\epsilon ,h}| {\mathcal {O}}, {\hat{F}}_c)}/{\partial \epsilon }|_{\epsilon =0},\) where \(l(\theta |{\mathcal {O}}; {\hat{F}}_c)=\log p( {\mathcal {O}}|\theta , {\hat{F}}_c)\). Denote \({\tilde{\theta }}=-{\dot{U}}^{-1}[U_{\theta _0}(\cdot )].\) To show the efficiency, we follow the same idea as in the proof of Corollary 3.2 in Kosorok (2008). The main idea is to characterize the influence function by the Riesz representation theorem (van der Vaart and Wellner 1996, p. 363). The fact that \({\dot{U}}^{-1}\) is linear implies that

$$\begin{aligned} {\mathbb {P}}_0\Big [{\tilde{\theta }}U_{\theta _0}(h)\Big ]= & {} {\mathbb {P}}_0\Big [-{\dot{U}}^{-1}[U_{\theta _0}(\cdot )]U_{\theta _0}(h)\Big ]\\= & {} -{\dot{U}}^{-1} {\mathbb {P}}_0\Big [U_{\theta _0}(\cdot )U_{\theta _0}(h)\Big ]\\= & {} -{\dot{U}}^{-1}\Big [-{\dot{U}}({\dot{\theta }}_0(h))(\cdot )\Big ] \\= & {} {\dot{\theta }}_0(h), \end{aligned}$$

where \({\mathbb {P}}_0 = {\mathbb {P}}_{\theta _0}.\) Then, by Proposition 18.2 in Kosorok (2008), \({\tilde{\theta }}=-{\dot{U}}^{-1}[U_{\theta _0}(\cdot )]\) is the efficient influence function. Therefore, the asymptotic variance of \(n^{1/2} (\hat{\varvec{\beta }}_n -\varvec{\beta }_0)\) achieves the semiparametric efficiency bound. \(\square \)

Cite this article

Li, S., Sun, J., Tian, T. et al. Semiparametric regression analysis of doubly censored failure time data from cohort studies. Lifetime Data Anal 26, 315–338 (2020). https://doi.org/10.1007/s10985-019-09477-x
