Abstract
Regression analysis of interval-censored failure time data has recently attracted a great deal of attention, partly because such data occur increasingly often in many fields. In this paper, we discuss one type of such data, multivariate current status data, where in addition to the complex interval data structure, one also faces dependent or informative censoring. For inference, a sieve maximum likelihood estimation procedure is developed, and the proposed estimators of the regression parameters are shown to be consistent and asymptotically efficient. To implement the method, an EM algorithm is provided, and the results of an extensive simulation study demonstrate the validity and good performance of the proposed inference procedure. As an illustration, the proposed approach is applied to a tumorigenicity experiment.
References
Chang, I.S., Wen, C.C., Wu, Y.J.: A profile likelihood theory for the correlated gamma-frailty model with current status family data. Statistica Sinica 17, 1023–1046 (2007)
Chen, C.M., Lu, T.F.C., Chen, M.H., Hsu, C.M.: Semiparametric transformation models for current status data with informative censoring. Biom. J. 19, 641–656 (2012)
Chen, C.M., Wei, J.C., Hsu, C.M., Lee, M.Y.: Regression analysis of multivariate current status data with dependent censoring: application to ankylosing spondylitis data. Stat. Med. 33, 772–785 (2014)
Chen, M.H., Tong, X.W., Sun, J.: The proportional odds model for multivariate interval-censored failure time data. Stat. Med. 26, 5147–5161 (2007)
Cox, D.R.: Regression analysis and life tables (with discussion). J. R. Stat. Soc. B 34, 187–220 (1972)
Dunson, D.B., Dinse, G.E.: Bayesian models for multivariate current status data with informative censoring. Biometrics 58, 79–88 (2002)
Efron, B.: Censored data and the bootstrap. J. Am. Stat. Assoc. 76, 312–319 (1981)
Finkelstein, D.M.: A proportional hazards model for interval-censored failure time data. Biometrics 42, 845–854 (1986)
Goggins, W.B., Finkelstein, D.M.: A proportional hazards model for multivariate interval-censored failure time data. Biometrics 56, 940–943 (2000)
Guo, G., Rodriguez, G.: Estimating a multivariate proportional hazards model for clustered data using the EM algorithm, with an application to child survival in Guatemala. J. Am. Stat. Assoc. 87, 969–976 (1992)
Hu, T., Zhou, Q., Sun, J.: Regression analysis of bivariate current status data under the proportional hazards model. Can. J. Stat. 45, 410–424 (2017)
Jewell, N.P., van der Laan, M.J., Lei, X.: Bivariate current status data with univariate monitoring times. Biometrika 92, 847–862 (2005)
Kalbfleisch, J.D., Prentice, R.L.: The Statistical Analysis of Failure Time Data, 2nd edn. Wiley, New York (2002)
Li, S.W., Hu, T., Wang, P.J., Sun, J.: Regression analysis of current status data in the presence of dependent censoring with applications to tumorigenicity experiments. Comput. Stat. Data Anal. 110, 75–86 (2017)
Lin, D.Y., Oakes, D., Ying, Z.: Additive hazards regression with current status data. Biometrika 85, 289–298 (1998)
Liu, Y.Q., Hu, T., Sun, J.: Regression analysis of current status data in the presence of a cured subgroup and dependent censoring. Lifetime Data Anal. 23, 626–650 (2017)
Lu, M., Zhang, Y., Huang, J.: Estimation of the mean function with panel count data using monotone polynomial splines. Biometrika 94, 705–706 (2007)
Ma, L., Hu, T., Sun, J.: Sieve maximum likelihood regression analysis of dependent current status data. Biometrika 85, 649–658 (2015)
National Toxicology Program: Toxicology and carcinogenesis studies of chloroprene (CAS No. 126-99-8) in \(F344/N\) rats and \(B6C3F_1\) mice (inhalation studies). Technical Report 467. U.S. Department of Health and Human Services, Public Health Service, National Institutes of Health, Bethesda, MD (1998)
Pakes, A., Pollard, D.: Simulation and the asymptotics of optimization estimators. Econometrica 57, 1027–1057 (1989)
Ramsay, J.O.: Monotone regression splines in action. Stat. Sci. 3, 425–441 (1988)
Shen, X., Wong, W.H.: Convergence rate of sieve estimates. Ann. Stat. 22, 580–615 (1994)
Su, Y.R., Wang, J.L.: Semiparametric efficient estimation for shared-frailty models with doubly-censored clustered data. Ann. Stat. 44, 1298–1331 (2016)
Sun, J.: The Statistical Analysis of Interval-Censored Failure Time Data. Springer, New York (2006)
van der Vaart, A.W.: Asymptotic Statistics. Cambridge University Press, New York (1998)
van der Vaart, A.W., Wellner, J.A.: Weak Convergence and Empirical Processes. Springer, New York (1996)
Wang, N., Wang, L., McMahan, C.S.: Regression analysis of bivariate current status data under the Gamma-frailty proportional hazards model using the EM algorithm. Comput. Stat. Data Anal. 83, 140–150 (2015)
Wen, C.C., Chen, Y.H.: Nonparametric maximum likelihood analysis of clustered current status data with the gamma-frailty Cox model. Comput. Stat. Data Anal. 83, 140–150 (2011)
Wei, L.J., Lin, D.Y., Weissfeld, L.: Regression analysis of multivariate incomplete failure time data by modeling marginal distributions. J. Am. Stat. Assoc. 84, 1065–1073 (1989)
Zhang, Z., Sun, J., Sun, L.: Statistical analysis of current status data with informative observation times. Stat. Med. 24, 1399–1407 (2005)
Zhao, S., Hu, T., Ma, L., Wang, P., Sun, J.: Regression analysis of informative current status data with the additive hazards model. Lifetime Data Anal. 21, 241–258 (2015)
Acknowledgements
The authors wish to thank the Editor-in-Chief, Dr. Zhiming Ma, the Associate Editor and two reviewers for their many helpful and insightful comments and suggestions, which greatly improved the paper. The research was partially supported by grants from the Natural Science Foundation of China [Grant Number 11731011] and by a grant from a key project of the Yunnan Province Foundation, China [Grant Number 202001BB050049].
Appendix: Proofs of the Asymptotic Properties of \(\hat{\theta }_n\)
In this Appendix, we sketch the proofs of Theorems 4.1, 4.2 and 4.3. For this, we mainly use some results on empirical processes given in van der Vaart and Wellner [26].
Proof of Theorem 4.1
To prove the consistency, we verify the conditions of Theorem 5.7 of van der Vaart [25]. First, we verify the condition \(J_1:= \lim _{n}\sup _{\theta _n\in \Theta _n}|\text{ P}_n l(\theta , O)-\text{ P }l(\theta , O)|=o_p(1).\) Note that
Therefore, it is sufficient to prove that \(J_{1k}=o_p(1), k=1,2\). To prove that \(J_{11}=o_p(1)\), we need only verify that \(\varepsilon =\{l(\theta _n, O), \theta _n\in \Theta _n\}\) is a Euclidean class with envelope function \(\max _{\theta _n\in \Theta _n}l(\theta _n, O)\). By (A1), (A2) and Lemma 2.14 in Pakes and Pollard [20], it is easy to see that \(\varepsilon \) is a Euclidean class. Hence, we have \(J_{11}=o_p(1)\). For \(J_{12}\), by Lemma A1 of Lu et al. [17] and the continuity of the log-likelihood function, we have \(J_{12}=o(1)\). Thus, the condition \(J_1:= \lim _{n}\sup _{\theta _n\in \Theta _n}|\text{ P}_n l(\theta , O)-\text{ P }l(\theta , O)|=o_p(1)\) holds.
Next, we verify the other condition of Theorem 5.7 of van der Vaart [25], namely that, for any \(\epsilon >0\),
Note that this condition is satisfied according to condition (A4). Now, by Theorem 5.7 of van der Vaart [25], we have \(d(\hat{\theta }_n, \theta _0)=o_p(1)\), which completes the proof of Theorem 4.1. \(\square \)
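For reference, the consistency theorem invoked in this proof can be written in its usual textbook form. The symbols \(M_n\) and \(M\) are introduced here only for illustration, and the identification \(M_n(\theta )=\text{ P}_n l(\theta , O)\), \(M(\theta )=\text{ P }l(\theta , O)\) is an assumed reading of the argument rather than something stated in the text. Suppose that, for every \(\epsilon >0\),
\[
\sup _{\theta }|M_n(\theta )-M(\theta )|\rightarrow 0 \ \text{in probability}, \qquad \sup _{\theta :\, d(\theta , \theta _0)\geqslant \epsilon }M(\theta )<M(\theta _0).
\]
Then any sequence of estimators \(\hat{\theta }_n\) with \(M_n(\hat{\theta }_n)\geqslant M_n(\theta _0)-o_p(1)\) satisfies \(d(\hat{\theta }_n, \theta _0)=o_p(1)\). Under this reading, the first requirement corresponds to \(J_1=o_p(1)\), with the supremum taken over the sieve \(\Theta _n\), and the second is the separation condition attributed to (A4).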
Proof of Theorem 4.2
To derive the convergence rate, for any \(\omega >0\), define the class \({\mathcal {F}}_{\omega }=\{l(\theta _{n0}, O)-l(\theta , O): \theta \in \Theta _n, d(\theta , \theta _{n0})\leqslant \omega \}\) with \(\theta _{n0}=(\beta _0,\gamma _0,\Sigma _0, \Lambda _{1n0}, \ldots , \Lambda _{Kn0}, \Lambda _{cn0})\). Following the calculation of Shen and Wong (1994, p. 597), we can establish that \(\log N_{[]}(\epsilon , {\mathcal {F}}_{\omega }, \Vert \cdot \Vert _{2})\leqslant CN \log (\omega /\epsilon )\) with \(N=2(s+k_n)\), where \(N_{[]}(\epsilon , {\mathcal {F}}, d)\) denotes the bracketing number (see Definition 2.1.6 in [26]) of a function class \({\mathcal {F}}\) with respect to the metric or semi-metric d. Moreover, some algebraic calculations lead to \(\Vert l(\theta _{n0},O)-l(\theta , O)\Vert ^2\leqslant C\omega ^2\) for any \(l(\theta _{n0},O)-l(\theta , O)\in {\mathcal {F}}_\omega \). Then Lemma 19.36 of van der Vaart [25] gives
where \(E^{*}\) is the outer expectation and \(M_1\) is a positive constant. Let \(\phi _n(\omega )=\omega ^{1/2}(1+\frac{\omega ^{1/2}}{\epsilon ^2\sqrt{n}}M_1)\). Then \(\phi _n(\omega )/\omega \) is a decreasing function, and \(n^{2/3}\phi _n(n^{-1/3})=O(\sqrt{n})\) for large n. Furthermore, by Theorem 4.1, we know that \(\hat{\theta }_n\) is consistent. According to Theorem 3.4.1 of van der Vaart and Wellner [26], we can conclude that \(d(\hat{\theta }_n, \theta _0)=\{ \Vert \hat{\zeta }_n-\zeta _0 \Vert ^{2}+\Vert \hat{\Lambda }_{cn}(c)-\Lambda _{c0}(c)\Vert ^{2}+\sum \nolimits _{k=1}^{K}\int [\hat{\Lambda }_{kn}(c)-\Lambda _{k0}(c)]^{2}f_k(c)\mathrm{d}c\}^{1/2}=O_p(n^{-1/3})\), which completes the proof of Theorem 4.2. \(\square \)
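The two properties of \(\phi _n\) used in this proof can be checked directly from its stated form. The following is a worked sketch, assuming that \(\epsilon \) and \(M_1\) are treated as fixed positive constants and writing \(r_n=n^{1/3}\) for the candidate rate (the symbol \(r_n\) is introduced here only for illustration). First,
\[
\frac{\phi _n(\omega )}{\omega }=\omega ^{-1/2}+\frac{M_1}{\epsilon ^2\sqrt{n}},
\]
which is decreasing in \(\omega \) because the second term does not depend on \(\omega \). Second,
\[
r_n^{2}\,\phi _n(1/r_n)=n^{2/3}\, n^{-1/6}\Big (1+\frac{n^{-1/6}}{\epsilon ^2\sqrt{n}}M_1\Big )=\sqrt{n}\,\Big (1+\frac{M_1}{\epsilon ^2}\,n^{-2/3}\Big )=O(\sqrt{n}).
\]
The correction term \(M_1\epsilon ^{-2}n^{-2/3}\) vanishes as \(n\rightarrow \infty \), so under these assumptions the rate requirement holds with \(r_n=n^{1/3}\), in line with the \(O_p(n^{-1/3})\) conclusion.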
Proof of Theorem 4.3
The score functions for \(\beta \) and \(\gamma \) are denoted by \(S_{\beta }(\theta )\) and \(S_\gamma (\theta )\), respectively, where \(S_\beta (\theta )=\frac{\partial l(\beta , \gamma , \Sigma , {\mathcal {A}})}{\partial \beta }\) and \(S_\gamma (\theta )=\frac{\partial l(\beta , \gamma , \Sigma , {\mathcal {A}})}{\partial \gamma }\). For \(k = 1, \ldots , K\), let \(h_k(t)\) be a nonnegative and nondecreasing function on \([\tau _1, \tau _2]\). Define \(H =\{h = (h_1(t), \ldots , h_K(t))\}.\) Consider parametric submodels \(\Lambda _\epsilon (t)=(\Lambda _{1, \epsilon }(t), \ldots , \Lambda _{K, \epsilon }(t))\), where \(\Lambda _{k, \epsilon }(t)=\Lambda _k(t)+\epsilon h_k(t)\). For each k, the score function along the kth submodel is given by \(S_{\Lambda _k}(\theta )[h_k]=\frac{\partial l(\beta , \gamma , \Sigma , \Lambda _{k, \epsilon })}{\partial \epsilon }|_{\epsilon =0}\). The efficient score for \(\zeta \) at \((\zeta _0, {\mathcal {A}}_0)\) is \({\tilde{l}}(\zeta _0, {\mathcal {A}}_0)=S_{\zeta }(\zeta _0, {\mathcal {A}}_0)-\sum \nolimits _{k=1}^{K}S_{\Lambda _{k}}(\zeta _0, {\mathcal {A}}_0)[h_k^{*}]\), where \(h_k^{*}\) is a \((d + 1)\)-vector function satisfying
for each \(h_k\) in H. By calculations similar to those in Section 3 of Chang et al. [1], we can establish the existence of \(h_k^{*}\) in the above equation.
The efficient Fisher information matrix \(I_0\) for \(\zeta \) at \((\zeta _0, {\mathcal {A}}_0)\) is defined as \({\mathbb {P}}({\tilde{l}}(\zeta _0, {\mathcal {A}}_0){\tilde{l}}'(\zeta _0, {\mathcal {A}}_0)).\) By Taylor expansion, we can obtain
Note that \({\mathbb {P}}{\tilde{l}}(\zeta _0, {\mathcal {A}}_0)=0\), \({\mathbb {P}}(S_{\zeta }(\theta )S_{\Lambda _k}(\theta )[h_k])=-{\mathbb {P}}(S_{\zeta , k}(\theta )[h_k])\) and \({\mathbb {P}}(S_{\Lambda _k}(\theta )[{\tilde{h}}_k]S_{\Lambda _j}(\theta )[h_j])=-{\mathbb {P}}(S_{k, j}(\zeta )[{\tilde{h}}_k, h_j])\). By the consistency and the convergence rate of \({\hat{\Lambda }}_n\), we can conclude that \({\mathbb {P}}{\tilde{l}}(\zeta _0, {\hat{\mathcal {A}}}_n)=O_p(n^{-2/3})\). Therefore, \(\sqrt{n}({\mathbb {P}}_n-{\mathbb {P}})({\tilde{l}}({\hat{\zeta }}_n, {\hat{\mathcal {A}}}_n)- {\tilde{l}}(\zeta _0, {\mathcal {A}}_0))=o_p(1)\). Since \({\mathbb {P}}_n{\tilde{l}}({\hat{\theta }}_n)={\mathbb {P}}{\tilde{l}}(\theta _0)=0\) and \({\mathbb {P}}{\tilde{l}}(\zeta _0, {\hat{\mathcal {A}}}_n)=o_p(1)\), we have
By the mean value theorem, we have
where \(\zeta '\) is a point between \(\hat{\zeta }_n\) and \(\zeta _0\). Since \(\hat{\theta }_n\) is consistent and \({\mathbb {P}}(-\frac{\partial {\tilde{l}}(\theta _0)}{\partial \zeta })={\mathbb {P}}({\tilde{l}}(\theta _0){\tilde{l}}'(\theta _0))=I_0\), we can conclude that
This completes the proof of Theorem 4.3. \(\square \)
Cite this article
Li, H., Ma, C., Sun, J. et al. A New Approach for Regression Analysis of Multivariate Current Status Data with Informative Censoring. Commun. Math. Stat. 11, 775–794 (2023). https://doi.org/10.1007/s40304-021-00274-3
DOI: https://doi.org/10.1007/s40304-021-00274-3