A New Approach for Regression Analysis of Multivariate Current Status Data with Informative Censoring

Li, Huiqiong; Ma, Chenchen; Sun, Jianguo; Tang, Niansheng

doi:10.1007/s40304-021-00274-3

Published: 19 November 2022

Volume 11, pages 775–794 (2023)

Communications in Mathematics and Statistics

Abstract

Regression analysis of interval-censored failure time data has recently attracted a great deal of attention, partly because such data occur increasingly often in many fields. In this paper, we discuss one type of such data, multivariate current status data, where in addition to the complex interval data structure, one also faces dependent or informative censoring. For inference, a sieve maximum likelihood estimation procedure is developed, and the proposed estimators of the regression parameters are shown to be consistent and asymptotically efficient. For the implementation of the method, an EM algorithm is provided, and the results from an extensive simulation study demonstrate the validity and good performance of the proposed inference procedure. As an illustration, the proposed approach is applied to a tumorigenicity experiment.
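To make the data structure concrete, the following is a minimal simulation sketch and not the model studied in this paper. It assumes, purely for illustration, exponential baseline hazards, a single binary covariate, and a normal frailty `b` shared by the failure times and the observation time; the names `simulate_subject`, `simulate`, `beta`, `gamma` and `sigma` are hypothetical. Each subject contributes only the observation time `c` and the indicators `delta[k] = 1` if the kth failure time is at most `c`, and because `c` depends on the same frailty as the failure times, the censoring is informative.

```python
import math
import random


def simulate_subject(rng, beta=0.5, gamma=0.5, sigma=1.0, n_events=2):
    """Draw one subject's multivariate current status observation.

    Illustrative assumptions (not taken from the paper): exponential
    baseline hazards with rate 1 and a shared normal frailty b.
    """
    z = rng.choice([0.0, 1.0])      # binary covariate
    b = rng.gauss(0.0, sigma)       # latent frailty shared by T_k and C
    # Given (z, b), each failure time T_k has hazard exp(beta * z + b).
    t = [rng.expovariate(math.exp(beta * z + b)) for _ in range(n_events)]
    # The observation time C depends on the same b: informative censoring.
    c = rng.expovariate(math.exp(gamma * z + b))
    # Only C and the indicators I(T_k <= C) are observed, never T_k itself.
    delta = [int(tk <= c) for tk in t]
    return {"c": c, "delta": delta, "z": z}


def simulate(n, seed=0, **kwargs):
    """Simulate n independent subjects with a reproducible seed."""
    rng = random.Random(seed)
    return [simulate_subject(rng, **kwargs) for _ in range(n)]
```

Calling `simulate(200, seed=1)` returns a list of 200 records, each holding one observation time, one covariate value and a list of 0/1 indicators, which is all an analyst sees with current status data.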

References

1. Chang, I.S., Wen, C.C., Wu, Y.J.: A profile likelihood theory for the correlated gamma-frailty model with current status family data. Statistica Sinica 17, 1023–1046 (2007)
2. Chen, C.M., Lu, T.F.C., Chen, M.H., Hsu, C.M.: Semiparametric transformation models for current status data with informative censoring. Biom. J. 19, 641–656 (2012)
3. Chen, C.M., Wei, J.C., Hsu, C.M., Lee, M.Y.: Regression analysis of multivariate current status data with dependent censoring: application to ankylosing spondylitis data. Stat. Med. 33, 772–785 (2014)
4. Chen, M.H., Tong, X.W., Sun, J.: The proportional odds model for multivariate interval-censored failure time data. Stat. Med. 26, 5147–5161 (2007)
5. Cox, D.R.: Regression analysis and life tables (with discussion). J. R. Stat. Soc. B 34, 187–220 (1972)
6. Dunson, D.B., Dinse, G.E.: Bayesian models for multivariate current status data with informative censoring. Biometrics 58, 79–88 (2002)
7. Efron, B.: Censored data and the bootstrap. J. Am. Stat. Assoc. 76, 312–319 (1981)
8. Finkelstein, D.M.: A proportional hazards model for interval-censored failure time data. Biometrics 42, 845–854 (1986)
9. Goggins, W.B., Finkelstein, D.M.: A proportional hazards model for multivariate interval-censored failure time data. Biometrics 56, 940–943 (2000)
10. Guo, G., Rodriguez, G.: Estimating a multivariate proportional hazards model for clustered data using the EM algorithm, with an application to child survival in Guatemala. J. Am. Stat. Assoc. 87, 969–976 (1992)
11. Hu, T., Zhou, Q., Sun, J.: Regression analysis of bivariate current status data under the proportional hazards model. Can. J. Stat. 45, 410–424 (2017)
12. Jewell, N.P., van der Laan, M.J., Lei, X.: Bivariate current status data with univariate monitoring times. Biometrika 92, 847–862 (2005)
13. Kalbfleisch, J.D., Prentice, R.L.: The Statistical Analysis of Failure Time Data, 2nd edn. Wiley, New York (2002)
14. Li, S.W., Hu, T., Wang, P.J., Sun, J.: Regression analysis of current status data in the presence of dependent censoring with applications to tumorigenicity experiments. Comput. Stat. Data Anal. 110, 75–86 (2017)
15. Lin, D.Y., Oakes, D., Ying, Z.: Additive hazards regression with current status data. Biometrika 85, 289–298 (1998)
16. Liu, Y.Q., Hu, T., Sun, J.: Regression analysis of current status data in the presence of a cured subgroup and dependent censoring. Lifetime Data Anal. 23, 626–650 (2017)
17. Lu, M., Zhang, Y., Huang, J.: Estimation of the mean function with panel count data using monotone polynomial splines. Biometrika 94, 705–718 (2007)
18. Ma, L., Hu, T., Sun, J.: Sieve maximum likelihood regression analysis of dependent current status data. Biometrika 85, 649–658 (2015)
19. National Toxicology Program: Toxicology and carcinogenesis studies of chloroprene (CAS No. 126-99-8) in $F344/N$ rats and $B6C3F_1$ mice (inhalation studies). Technical Report 467. U.S. Department of Health and Human Services, Public Health Service, National Institutes of Health, Bethesda, MD (1998)
20. Pakes, A., Pollard, D.: Simulation and the asymptotics of optimization estimators. Econometrica 57, 1027–1057 (1989)
21. Ramsay, J.O.: Monotone regression splines in action. Stat. Sci. 3, 425–441 (1988)
22. Shen, X., Wong, W.H.: Convergence rate of sieve estimates. Ann. Stat. 22, 580–615 (1994)
23. Su, Y.R., Wang, J.L.: Semiparametric efficient estimation for shared-frailty models with doubly-censored clustered data. Ann. Stat. 44, 1298–1331 (2016)
24. Sun, J.: The Statistical Analysis of Interval-Censored Failure Time Data. Springer, New York (2006)
25. Van der Vaart, A.W.: Asymptotic Statistics. Cambridge University Press, New York (1998)
26. Van der Vaart, A.W., Wellner, J.A.: Weak Convergence and Empirical Processes. Springer, New York (1996)
27. Wang, N., Wang, L., McMahan, C.S.: Regression analysis of bivariate current status data under the Gamma-frailty proportional hazards model using the EM algorithm. Comput. Stat. Data Anal. 83, 140–150 (2015)
28. Wen, C.C., Chen, Y.H.: Nonparametric maximum likelihood analysis of clustered current status data with the gamma-frailty Cox model. Comput. Stat. Data Anal. 83, 140–150 (2011)
29. Wei, L.J., Lin, D.Y., Weissfeld, L.: Regression analysis of multivariate incomplete failure time data by modeling marginal distributions. J. Am. Stat. Assoc. 84, 1065–1073 (1989)
30. Zhang, Z., Sun, J., Sun, L.: Statistical analysis of current status data with informative observation times. Stat. Med. 24, 1399–1407 (2005)
31. Zhao, S., Hu, T., Ma, L., Wang, P., Sun, J.: Regression analysis of informative current status data with the additive hazards model. Lifetime Data Anal. 21, 241–258 (2015)

Acknowledgements

The authors wish to thank the Editor-in-Chief, Dr. Zhiming Ma, the Associate Editor and two reviewers for their many helpful and insightful comments and suggestions, which greatly improved the paper. The research was partially supported by a grant from the Natural Science Foundation of China [Grant Number 11731011] and a grant from the key project of the Yunnan Province Foundation, China [Grant Number 202001BB050049].

Author information

Authors and Affiliations

Department of Statistics, Yunnan University, Kunming, 650091, People’s Republic of China
Huiqiong Li & Niansheng Tang
Department of Statistics, University of Missouri, Columbia, MO, 65211, USA
Chenchen Ma & Jianguo Sun

Corresponding author

Correspondence to Jianguo Sun.

Appendix: Proofs of the Asymptotic Properties of $\hat{\theta }_n$

In this Appendix, we sketch the proofs of Theorems 4.1, 4.2 and 4.3. For this, we mainly use some results on empirical processes given in van der Vaart and Wellner [26].

Proof of Theorem 4.1

To prove the consistency, we will verify the conditions of Theorem 5.7 of van der Vaart [25]. First we verify the condition $J_1:=\lim _{n}\sup _{\theta _n\in \Theta _n}|{\mathbb {P}}_n l(\theta , O)-{\mathbb {P}}l(\theta , O)|=o_p(1).$ Note that

$$\begin{aligned} J_1&\leqslant \lim _n \sup _{\theta _n\in \Theta _n}|{\mathbb {P}}_nl(\theta , O)-{\mathbb {P}}l(\theta _n,O)|+\lim _{n}\sup _{\theta _n\in \Theta _n}|{\mathbb {P}}l(\theta _n,O)-{\mathbb {P}}l(\theta ,O)|\\&=:J_{11}+J_{12}. \end{aligned}$$

Therefore, it is sufficient to prove that $J_{1k}=o_p(1)$, $k=1,2$. To prove that $J_{11}=o_p(1)$, we need only verify that $\varepsilon =\{l(\theta _n, O), \theta _n\in \Theta _n\}$ is a Euclidean class for its envelope function $\max _{\theta _n\in \Theta _n}l(\theta _n, O)$. By (A1), (A2) and Lemma 2.14 in Pakes and Pollard [20], the class $\varepsilon $ is a Euclidean class. Hence, we have $J_{11}=o_p(1)$. For $J_{12}$, by Lemma A1 of Lu et al. [17] and the continuity of the log-likelihood function, we have $J_{12}=o(1)$. Thus the condition $J_1=o_p(1)$ holds.

Next, we verify the other condition of Theorem 5.7 of van der Vaart [25], namely that for any $\epsilon >0$,

$$\begin{aligned} \text{ sup}_{d(\theta , \theta _0)>\epsilon }{\mathbb {P}}l(\theta , O)<{\mathbb {P}}l(\theta _0, O). \end{aligned}$$

This condition is satisfied by condition (A4). Hence, by Theorem 5.7 of van der Vaart [25], we have $d(\hat{\theta }_n, \theta _0)=o_p(1)$, which completes the proof of Theorem 4.1. $\square $

Proof of Theorem 4.2

To derive the convergence rate, for any $\omega >0$, define the class ${\mathcal {F}}_{\omega }=\{l(\theta _{n0}, O)-l(\theta , O): \theta \in \Theta _n, d(\theta , \theta _{n0})\leqslant \omega \}$ with $\theta _{n0}=(\beta _0,\gamma _0,\Sigma _0, \Lambda _{1n0}, \ldots , \Lambda _{Kn0}, \Lambda _{cn0})$. Following the calculation of Shen and Wong (1994, p. 597), we can establish that $\log N_{[]}(\epsilon , {\mathcal {F}}_{\omega }, \Vert \cdot \Vert _{2})\leqslant CN \log (\omega /\epsilon )$ with $N=2(s+k_n)$, where $N_{[]}(\epsilon , {\mathcal {F}}, d)$ denotes the bracketing number (see Definition 2.1.6 in [26]) of a function class ${\mathcal {F}}$ with respect to the metric or semi-metric $d$. Moreover, some algebraic calculations lead to $\Vert l(\theta _{n0},O)-l(\theta , O)\Vert ^2\leqslant C\omega ^2$ for any $l(\theta _{n0},O)-l(\theta , O)\in {\mathcal {F}}_\omega $. Then Lemma 19.36 of van der Vaart [25] gives

$$\begin{aligned} E^{*}\text{ sup}_{d(\theta , \theta _0)<\omega }\parallel \sqrt{n}({\mathbb {P}}_n-{\mathbb {P}})(l(\theta , O)-l(\theta _0, O))\parallel =O(1)\omega ^{1/2}(1+\frac{\omega ^{1/2}}{\epsilon ^2\sqrt{n}}M_1), \end{aligned}$$

where $E^{*}$ is the outer expectation and $M_1$ is a positive constant. Let $\phi _n(\omega )=\omega ^{1/2}(1+\frac{\omega ^{1/2}}{\epsilon ^2\sqrt{n}}M_1)$. Then $\phi _n(\omega )/\omega $ is a decreasing function of $\omega $, and $n^{2/3}\phi _n(n^{-1/3})=O(\sqrt{n})$ for large $n$. Furthermore, by Theorem 4.1, $\hat{\theta }_n$ is consistent. By Theorem 3.4.1 of van der Vaart and Wellner [26], we can conclude that $d(\hat{\theta }_n, \theta _0)=\{ \parallel \hat{\zeta }_n-\zeta _0 \parallel ^{2}+\parallel \hat{\Lambda }_{cn}(c)-\Lambda _{c0}(c)\parallel ^{2}+\sum \nolimits _{k=1}^{K}\int [\hat{\Lambda }_{kn}(c)-\Lambda _{k0}(c)]^{2}f_k(c)\mathrm{d}c\}^{\frac{1}{2}}=O_p(n^{-1/3})$, which completes the proof of Theorem 4.2. $\square $
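The two properties of $\phi _n$ used in this proof can be checked directly. The following short calculation is a sketch under the definition of $\phi _n$ given above, where $r_n=n^{1/3}$ is notation introduced here only for illustration:

$$\begin{aligned} \frac{\phi _n(\omega )}{\omega }&=\omega ^{-1/2}+\frac{M_1}{\epsilon ^2\sqrt{n}},\\ r_n^{2}\,\phi _n(r_n^{-1})&=n^{2/3}\,n^{-1/6}\Big (1+\frac{n^{-1/6}}{\epsilon ^2\sqrt{n}}M_1\Big )=\sqrt{n}\,\Big (1+\frac{M_1}{\epsilon ^2n^{2/3}}\Big )=O(\sqrt{n}). \end{aligned}$$

The first line is decreasing in $\omega $ because only the term $\omega ^{-1/2}$ depends on $\omega $. The second line shows that $r_n=n^{1/3}$ is compatible with a rate condition of the form $r_n^2\phi _n(1/r_n)\leqslant C\sqrt{n}$, which is how the rate $n^{-1/3}$ arises.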

Proof of Theorem 4.3

The score functions for $\beta $ and $\gamma $ are denoted by $S_{\beta }(\theta )$ and $S_\gamma (\theta )$, respectively, where $S_\beta (\theta )=\frac{\partial l(\beta , \gamma , \Sigma , {\mathcal {A}})}{\partial \beta }$ and $S_\gamma (\theta )=\frac{\partial l(\beta , \gamma , \Sigma , {\mathcal {A}})}{\partial \gamma }$. For $k = 1, \ldots , K$, let $h_k(t)$ be a nonnegative and nondecreasing function on $[\tau _1, \tau _2]$, and define $H =\{h = (h_1(t), \ldots , h_K(t))\}$. Consider the parametric submodels $\Lambda _\epsilon (t)=(\Lambda _{1, \epsilon }(t), \ldots , \Lambda _{K, \epsilon }(t))$, where $\Lambda _{k, \epsilon }(t)=\Lambda _k(t)+\epsilon h_k(t)$. For each $k$, the score function along the $k$th submodel is given by $S_{\Lambda _k}(\theta )[h_k]=\frac{\partial l(\beta , \gamma , \Sigma , \Lambda _{k, \epsilon })}{\partial \epsilon }|_{\epsilon =0}$. The efficient score for $\zeta $ at $(\zeta _0, {\mathcal {A}}_0)$ is ${\tilde{l}}(\zeta _0, {\mathcal {A}}_0)=S_{\zeta }(\zeta _0, {\mathcal {A}}_0)-\sum \nolimits _{k=1}^{K}S_{\Lambda _{k}}(\zeta _0, {\mathcal {A}}_0)[h_k^{*}]$, where $h_k^{*}$ is a $(d + 1)$-vector function satisfying

$$\begin{aligned} {\mathbb {P}}\Big [\Big (S_{\zeta }(\zeta _0, {\mathcal {A}}_0)-\sum \limits _{k=1}^{K}S_{\Lambda _{k}}(\zeta _0, {\mathcal {A}}_0)[h_k^{*}]\Big )'\Big (\sum \limits _{k=1}^{K}S_{\Lambda _{k}}(\zeta _0, {\mathcal {A}}_0)[h_k]\Big )\Big ]=0, \end{aligned}$$

for each $h_k$ in $H$. By following calculations similar to those in Section 3 of Chang et al. [1], we can establish the existence of $h_k^{*}$ in the above equation.

The efficient Fisher information matrix $I_0$ for $\zeta $ at $(\zeta _0, {\mathcal {A}}_0)$ is defined as ${\mathbb {P}}({\tilde{l}}(\zeta _0, {\mathcal {A}}_0){\tilde{l}}'(\zeta _0, {\mathcal {A}}_0)).$ By Taylor expansion, we can obtain

$$\begin{aligned}&{\mathbb {P}}{\tilde{l}}(\zeta _0, {\mathcal {A}})={\mathbb {P}}{\tilde{l}}(\zeta _0, {\mathcal {A}}_0)\\&\quad +{\mathbb {P}}\{\sum \limits _{k=1}^{K}S_{\zeta , k}(\theta )[\Lambda _k-\Lambda _{k0}]-\sum \limits _{k=1}^{K}\sum \limits _{j=1}^{K}S_{\zeta , j}(\theta )[h_k^{*}, \Lambda _k-\Lambda _{k0}]\}+O_p(\sum \limits _{k=1}^{K}\parallel \Lambda _k-\Lambda _{k0}\parallel ^{2}). \end{aligned}$$

Note that ${\mathbb {P}}{\tilde{l}}(\zeta _0, {\mathcal {A}}_0)=0$, ${\mathbb {P}}(S_{\zeta }(\theta )S_{\Lambda _k}(\theta )[h_k])=-{\mathbb {P}}(S_{\zeta , k}(\theta )[h_k])$ and ${\mathbb {P}}(S_{\Lambda _k}(\theta )[{\tilde{h}}_k]S_{\Lambda _j}(\theta )[h_j])=-{\mathbb {P}}(S_{k, j}(\zeta )[{\tilde{h}}_k, h_j])$. By the consistency and the convergence rate of ${\hat{\Lambda }}_n$, we can conclude that ${\mathbb {P}}{\tilde{l}}(\zeta _0, {\hat{\mathcal {A}}}_n)=O_p(n^{-2/3})$. Therefore, $\sqrt{n}({\mathbb {P}}_n-{\mathbb {P}})({\tilde{l}}({\hat{\zeta }}_n, {\hat{\mathcal {A}}}_n)- {\tilde{l}}(\zeta _0, {\mathcal {A}}_0))=o_p(1)$. Since ${\mathbb {P}}_n{\tilde{l}}({\hat{\theta }}_n)={\mathbb {P}}{\tilde{l}}(\theta _0)=0$ and $\sqrt{n}\,{\mathbb {P}}{\tilde{l}}(\zeta _0, {\hat{\mathcal {A}}}_n)=o_p(1)$, we have

$$\begin{aligned} -\sqrt{n}{\mathbb {P}}({\tilde{l}}({\hat{\theta }}_n)-{\tilde{l}}(\zeta _0, {\hat{\mathcal {A}}}_n))=\sqrt{n}{\mathbb {P}}_n{\tilde{l}}(\theta _0)+o_p(1). \end{aligned}$$
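The step leading to this display can be spelled out as follows; this is a sketch that uses only the facts stated in the preceding paragraph. Expanding the empirical process term and using ${\mathbb {P}}_n{\tilde{l}}({\hat{\theta }}_n)={\mathbb {P}}{\tilde{l}}(\theta _0)=0$ gives

$$\begin{aligned} o_p(1)&=\sqrt{n}({\mathbb {P}}_n-{\mathbb {P}})({\tilde{l}}({\hat{\theta }}_n)-{\tilde{l}}(\theta _0))\\&=\sqrt{n}{\mathbb {P}}_n{\tilde{l}}({\hat{\theta }}_n)-\sqrt{n}{\mathbb {P}}_n{\tilde{l}}(\theta _0)-\sqrt{n}{\mathbb {P}}{\tilde{l}}({\hat{\theta }}_n)+\sqrt{n}{\mathbb {P}}{\tilde{l}}(\theta _0)\\&=-\sqrt{n}{\mathbb {P}}_n{\tilde{l}}(\theta _0)-\sqrt{n}{\mathbb {P}}{\tilde{l}}({\hat{\theta }}_n). \end{aligned}$$

Moreover, $\sqrt{n}\,{\mathbb {P}}{\tilde{l}}(\zeta _0, {\hat{\mathcal {A}}}_n)=\sqrt{n}\,O_p(n^{-2/3})=O_p(n^{-1/6})=o_p(1)$, so this term can be added to the left-hand side at no cost, which yields the displayed identity.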

By the mean value theorem, we have

$$\begin{aligned} -\sqrt{n}{\mathbb {P}}\frac{\partial {\tilde{l}}(\zeta ', {\hat{\mathcal {A}}}_n)}{\partial \zeta }(\hat{\zeta }_n-\zeta _0)=\sqrt{n}{\mathbb {P}}_n({\tilde{l}}(\theta _0))+o_p(1), \end{aligned}$$

where $\zeta '$ is a point between $\hat{\zeta }_n$ and $\zeta _0$. Since $\hat{\theta }_n$ is consistent and ${\mathbb {P}}(-\frac{\partial {\tilde{l}}(\theta _0)}{\partial \zeta })={\mathbb {P}}({\tilde{l}}(\theta _0){\tilde{l}}'(\theta _0))=I_0$, we can conclude that

$$\begin{aligned} \sqrt{n}(\hat{\zeta }_n-\zeta _0)=I_0^{-1}\sqrt{n}{\mathbb {P}}_n{\tilde{l}}(\theta _0)+o_p(1){\mathop {\rightarrow }\limits ^{d}}N(0, I_0^{-1}). \end{aligned}$$

This completes the proof of Theorem 4.3. $\square $
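In practice, the limiting distribution $N(0, I_0^{-1})$ is what justifies Wald-type confidence intervals for the components of $\zeta $. The following is a minimal sketch of that use, not a procedure taken from this paper: it assumes the caller already has the diagonal entries of some estimate of $I_0^{-1}$, and the names `wald_intervals` and `inv_info_diag` are hypothetical.

```python
import math
from statistics import NormalDist


def wald_intervals(zeta_hat, inv_info_diag, n, level=0.95):
    """Wald intervals zeta_hat[j] +/- z * sqrt(v_j / n).

    v_j is the j-th diagonal entry of an estimate of I_0^{-1}; how that
    estimate is obtained is outside this sketch (an assumption).
    """
    if len(zeta_hat) != len(inv_info_diag):
        raise ValueError("zeta_hat and inv_info_diag must have equal length")
    if n <= 0 or not 0.0 < level < 1.0:
        raise ValueError("need n > 0 and 0 < level < 1")
    # Two-sided standard normal quantile, e.g. about 1.96 for level = 0.95.
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    intervals = []
    for estimate, variance in zip(zeta_hat, inv_info_diag):
        half_width = z * math.sqrt(variance / n)
        intervals.append((estimate - half_width, estimate + half_width))
    return intervals
```

For example, `wald_intervals([0.5], [4.0], 400)` uses a standard error of $\sqrt{4/400}=0.1$, so the interval is centered at 0.5 with half-width equal to the normal quantile times 0.1.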

About this article

Cite this article

Li, H., Ma, C., Sun, J. et al. A New Approach for Regression Analysis of Multivariate Current Status Data with Informative Censoring. Commun. Math. Stat. 11, 775–794 (2023). https://doi.org/10.1007/s40304-021-00274-3

Received: 16 March 2021
Revised: 25 May 2021
Accepted: 12 November 2021
Published: 19 November 2022
Issue Date: December 2023
DOI: https://doi.org/10.1007/s40304-021-00274-3
