Nonparametric inference for the joint distribution of recurrent marked variables and recurrent survival time

Lifetime Data Analysis

Abstract

Time between recurrent medical events may be correlated with the cost incurred at each event. As a result, it may be of interest to describe the relationship between recurrent events and recurrent medical costs by estimating a joint distribution. In this paper, we propose a nonparametric estimator for the joint distribution of recurrent events and recurrent medical costs in right-censored data. We also derive the asymptotic variance of our estimator and a test for equality of recurrent marker distributions, and we present simulation studies to demonstrate the performance of our point and variance estimators. The estimator performs well across a wide range of correlation levels, which suggests that it can be employed in a variety of situations where the correlation structure is unknown in advance. We apply our methods to hospitalization events and their corresponding costs in the second Multicenter Automatic Defibrillator Implantation Trial (MADIT-II), a randomized clinical trial studying the effect of implantable cardioverter-defibrillators in preventing ventricular arrhythmia.

References

  • Bang H, Tsiatis AA (2000) Estimating medical costs with censored data. Biometrika 87(2):329–343

  • Chan KCG, Wang MC (2010) Backward estimation of stochastic processes with failure events as time origins. Ann Appl Stat 4(3):1602–1620

  • Dabrowska DM (1988) Kaplan-Meier estimate on the plane. Ann Stat 16(4):1475–1489

  • Fang HB, Wang J, Deng D, Tang ML (2011) Estimating the mean of a mark variable under right censoring on the basis of a state function. Comput Stat Data Anal 55(4):1726–1735

  • Genest C (1987) Frank’s family of bivariate distributions. Biometrika 74(3):549–555

  • Glasziou PP, Simes RJ, Gelber RD (1990) Quality adjusted survival analysis. Stat Med 9(11):1259–1276

  • Huang Y (2009) Cost analysis with censored data. Med Care 47(7):S115–S119

  • Huang Y, Louis TA (1998) Nonparametric estimation of the joint distribution of survival time and mark variables. Biometrika 85(4):785–798

  • Huang CY, Wang MC (2005) Nonparametric estimation of the bivariate recurrence time distribution. Biometrics 61(2):392–402

  • Lin DY (1997) Non-parametric inference for cumulative incidence functions in competing risks studies. Stat Med 16(8):901–910

  • Lin DY (2000) Proportional means regression for censored medical costs. Biometrics 56(3):775–778

  • Luo X, Huang CY (2011) Analysis of recurrent gap time data using the weighted risk-set method and the modified within-cluster resampling method. Stat Med 30(4):301–311

  • Lin DY, Feuer EJ, Etzioni R, Wax Y (1997) Estimating medical costs from incomplete follow-up data. Biometrics 53(2):419–434

  • Lin DY, Wei LJ, Yang I, Ying Z (2000) Semiparametric regression for the mean and rate functions of recurrent events. J R Stat Soc Ser B 62(4):711–730

  • Liu L, Huang X, O’Quigley J (2008) Analysis of longitudinal data in the presence of informative observational times and a dependent terminal event, with application to medical cost data. Biometrics 64(3):950–958

  • Moss AJ, Zareba W, Hall WJ, Klein H, Wilber DJ, Cannom DS, Daubert JP, Higgins SL, Brown MW, Andrews ML (2002) Prophylactic implantation of a defibrillator in patients with myocardial infarction and reduced ejection fraction. New Engl J Med 346(12):877–883

  • Pena EA, Strawderman RL, Hollander M (2001) Nonparametric estimation with recurrent event data. J Am Stat Assoc 96(456):1299–1315

  • Rolski T, Schmidli H, Schmidt V, Teugels J (2009) Stochastic processes for insurance and finance. Wiley, New York

  • Shi H, Cheng Y, Li J (2014) Assessing diagnostic accuracy improvement for survival or competing-risk censored outcomes. Can J Stat 42(1):109–125

  • Strawderman RL (2000) Estimating the mean of an increasing stochastic process at a censored stopping time. J Am Stat Assoc 95(452):1192–1208

  • van der Vaart AW (1998) Asymptotic statistics. Cambridge University Press, New York

  • Wang MC, Chang SH (1999) Nonparametric estimation of a recurrent survival function. J Am Stat Assoc 94(445):146–153

  • Wang MC, Chen YQ (2000) Nonparametric and semiparametric trend analysis for stratified recurrence times. Biometrics 56(3):789–794

  • Zhu H (2013) Non-parametric analysis of gap times for multiple event data: an overview. Int Stat Rev 82(1):106–122

  • Zwanziger J, Hall WJ, Dick AW, Zhao H, Mushlin AI, Hahn RM, Wang H, Andrews ML, Mooney C, Wang H, Moss AJ (2006) The cost effectiveness of implantable cardioverter-defibrillators: results from the Multicenter Automatic Defibrillator Implantation Trial (MADIT)-II. J Am Coll Cardiol 47(11):2310–2318

Acknowledgments

The authors would like to thank Dr. Arthur Moss and Dr. Hongwei Zhao for access to the MADIT-II data. The authors would also like to thank the editor, an associate editor and three reviewers for their constructive comments that greatly improved the paper. Kwun Chuen Gary Chan is partially supported by US National Institutes of Health Grant R01 HL 122212.

Author information

Corresponding author

Correspondence to Laura M. Yee.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 38 KB)

Appendix

We wish to find the influence function of \(\hat{G}(t,m)\). To do so, we first discuss the uniform consistency of \(\hat{G}(t,m)\), then find the influence function of \(\hat{\varLambda }(t,m)\), and finally combine those results to obtain the influence function of interest. We use the same notation and setup as in Sect. 2 of the paper.

1.1 Uniform consistency of \(\hat{\varLambda }(t,m)\) and \(\hat{G}(t,m)\)

Note that \(\hat{F}(t,m)\) and \(\hat{H}(t)\) are empirical processes with \(E(\hat{F}(t,m))=F(t,m)\) and \(E(\hat{H}(t))=H(t)\). Therefore, \(\hat{F}(t,m)\) and \(\hat{H}(t)\) are uniformly consistent estimators of \(F(t,m)\) and \(H(t)\) by the Glivenko–Cantelli theorem (van der Vaart 1998). Because \(\hat{\varLambda }(t,m)\) is a functional of \(\hat{F}(t,m)\) and \(\hat{H}(t)\), it is uniformly consistent for \(\varLambda (t,m)\) by Lemma A.1 in Lin et al. (2000). Also, \(\hat{S}(t)\) is uniformly consistent according to Wang and Chang (1999). Since \(\hat{G}(t,m)\) is a functional of \(\hat{S}(t)\) and \(\hat{\varLambda }(t,m)\), it follows that \(\hat{G}(t,m)\) is uniformly consistent for \(G(t,m)\) by Lemma A.1 in Lin et al. (2000).
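
As a concrete illustration of the three estimators discussed above, the following Python sketch computes \(\hat{F}\), \(\hat{H}\) and \(\hat{\varLambda }\) on a small hypothetical data set. The weights are read off the expansion in Sect. 1.2 of this Appendix. The data layout (one list of gap times per subject whose last entry is the censored gap, and one list of marks for the completed gaps) and the convention \(k_i^*=k_i-1\) when \(k_i\ge 2\) and \(k_i^*=1\) otherwise are assumptions, since Sect. 2 is not reproduced here. All function names are illustrative.

```python
import numpy as np

def flatten(gaps, marks):
    """Flatten subject-level data into arrays (time, mark, event weight, risk weight).

    gaps[i]  : gap times of subject i; the last entry is the censored gap (assumed layout).
    marks[i] : marks of the completed gaps, i.e. len(gaps[i]) - 1 values.
    """
    n = len(gaps)
    y, m, a, r = [], [], [], []
    for g, mk in zip(gaps, marks):
        k = len(g)
        k_star = max(k - 1, 1)  # assumed convention for k_i^*
        for j in range(k_star):
            y.append(g[j])
            # a subject with k_i = 1 contributes only a censored gap, which has no mark
            m.append(mk[j] if k >= 2 else np.inf)
            a.append(float(k >= 2) / (n * k_star))  # weight in F_hat
            r.append(1.0 / (n * k_star))            # weight in H_hat
    return np.asarray(y), np.asarray(m), np.asarray(a), np.asarray(r)

def H_hat(s, y, r):
    """Weighted proportion of gaps still at risk at time s."""
    return r[y >= s].sum()

def F_hat(t, mm, y, m, a):
    """Weighted proportion of completed gaps with time <= t and mark <= mm."""
    return a[(y <= t) & (m <= mm)].sum()

def Lambda_hat(t, mm, y, m, a, r):
    """Integral over [0, t] of F_hat(ds, mm) / H_hat(s), a finite sum over jump points."""
    jump = (a > 0) & (y <= t) & (m <= mm)
    return sum(a[l] / H_hat(y[l], y, r) for l in np.flatnonzero(jump))

# Hypothetical data: three subjects with 3, 1 and 2 observed gaps respectively.
gaps = [[2.0, 3.0, 1.0], [4.0], [1.0, 2.5]]
marks = [[10.0, 20.0], [], [5.0]]
y, m, a, r = flatten(gaps, marks)
print(Lambda_hat(3.0, np.inf, y, m, a, r))
```

Because every quantity is a weighted sum of indicators, \(\hat{\varLambda }\) reduces to a finite sum over the observed completed gap times, which is what makes the plug-in steps in the remainder of the Appendix straightforward to compute.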

1.2 Influence function: \(\hat{\varLambda }(t,m)\)

First, note that \(\hat{\varLambda }(t,m)= \int _{[0,t]} \hat{F}(ds,m)/\hat{H}(s)\) depends on the pair (\(\hat{F}(t,m)\), \(\hat{H}(t)\)) through two maps:

$$\begin{aligned} (A,B) \rightarrow \left( A,\frac{1}{B}\right) \rightarrow \int _0^t \frac{1}{B}dA. \end{aligned}$$

Then, the functional derivative of the maps at \((F,H)\) is:

$$\begin{aligned} (\alpha ,\beta ) \rightarrow \left( \alpha ,-\frac{\beta }{H^2}\right) \rightarrow \int _0^t \frac{d\alpha }{H}- \int _0^t \frac{\beta dF}{H^2}, \end{aligned}$$

where \(\alpha =\hat{F}-F\) and \(\beta =\hat{H}-H\). Now, by the functional delta method (van der Vaart 1998) and simplification we have:

$$\begin{aligned} \hat{\varLambda }(t,m) -\varLambda (t,m)= & {} \int _0^t \frac{1}{H(s)}[\hat{F}(ds,m) - F(ds,m)]\\&-\,\int _0^t \frac{\hat{H}(s) - H(s)}{( H(s) )^2}F(ds,m) + o_p\left( \frac{1}{ \sqrt{n} } \right) \\= & {} \int _0^t \frac{\hat{F}(ds,m)}{H(s)} - \varLambda (t,m) - \int _0^t \frac{\hat{H}(s)F(ds,m)}{(H(s))^2}\\&+\, \varLambda (t,m) + o_p\left( \frac{1}{\sqrt{n}}\right) \\= & {} \frac{1}{n} \sum _{i=1}^n \left[ \frac{I(k_i \ge 2)}{k_i^*} \sum _{j=1}^{k_i^*}\left( I(y_{ij} \le t,m_{ij}\le m) \frac{1}{H(y_{ij})} \right) \right. \\&\left. -\, \int _0^t \frac{F(ds,m)}{(H(s))^2} \frac{1}{k_i^*} \sum _{j=1}^{k_i^*} I(y_{ij}\ge s) \right] \\&+\, o_p\left( \frac{1}{\sqrt{n}}\right) \\= & {} \frac{1}{n} \sum _{i=1}^n \phi _i(t,m)+ o_p\left( \frac{1}{\sqrt{n}}\right) , \end{aligned}$$

where

$$\begin{aligned} \phi _i(t,m)= & {} \frac{I(k_i \ge 2)}{k_i^*} \sum _{j=1}^{k_i^*}\left( I(y_{ij} \le t,m_{ij}\le m) \frac{1}{H(y_{ij})} \right) \\&-\, \int _0^t \frac{F(ds,m)}{(H(s))^2} \frac{1}{k_i^*} \sum _{j=1}^{k_i^*} I(y_{ij}\ge s) \end{aligned}$$

is the influence function of \(\hat{\varLambda }(t,m)-\varLambda (t,m)\). A consistent estimator \(\hat{\phi }_i(t,m)\) of \(\phi _i(t,m)\) can then be calculated by plugging in \(\hat{H}(t)\) and \(\hat{F}(t,m)\) for \(H(t)\) and \(F(t,m)\).
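
The plug-in step can be sketched in Python as follows. The function evaluates \(\hat{\phi }_i(t,m)\) for every subject by replacing \(H\) and \(F\) in the expression for \(\phi _i(t,m)\) with their empirical versions. The data layout (last gap of each subject is the censored one), the convention for \(k_i^*\), and the standard-error formula \(\{n^{-2}\sum _i \hat{\phi }_i^2\}^{1/2}\) are assumptions made for this illustration, not details taken from the text above.

```python
import numpy as np

def lambda_influence(gaps, marks, t, mm):
    """Plug-in influence values phi_hat_i(t, mm), one per subject (illustrative sketch)."""
    n = len(gaps)
    # Gaps entering the estimators: completed gaps if k_i >= 2, else the lone censored gap.
    used = [np.asarray(g[:max(len(g) - 1, 1)], dtype=float) for g in gaps]
    mk = [np.asarray(m, dtype=float) for m in marks]
    has_event = [len(g) >= 2 for g in gaps]

    def H_hat(s):
        # average over subjects of the within-subject at-risk proportion
        return np.mean([np.mean(u >= s) for u in used])

    # Jump points of F_hat(., mm) on [0, t] and their masses
    pts, mass = [], []
    for i in range(n):
        if has_event[i]:
            keep = (used[i] <= t) & (mk[i] <= mm)
            pts.extend(used[i][keep])
            mass.extend([1.0 / (n * len(used[i]))] * int(keep.sum()))
    pts, mass = np.asarray(pts, dtype=float), np.asarray(mass, dtype=float)
    H_at = np.array([H_hat(s) for s in pts])

    phi = np.zeros(n)
    for i in range(n):
        if has_event[i]:
            keep = (used[i] <= t) & (mk[i] <= mm)
            # first term: weighted sum of 1 / H_hat at the subject's own event times
            phi[i] += sum(1.0 / H_hat(s) for s in used[i][keep]) / len(used[i])
        # second term: integral of the subject's at-risk proportion against F_hat(ds, mm) / H_hat(s)^2
        at_risk = np.array([np.mean(used[i] >= s) for s in pts])
        phi[i] -= np.sum(mass * at_risk / H_at**2)
    return phi

# Hypothetical data: three subjects with 3, 1 and 2 observed gaps respectively.
gaps = [[2.0, 3.0, 1.0], [4.0], [1.0, 2.5]]
marks = [[10.0, 20.0], [], [5.0]]
phi = lambda_influence(gaps, marks, t=3.0, mm=np.inf)
se = np.sqrt(np.sum(phi**2)) / len(gaps)  # assumed standard-error formula
print(phi, se)
```

A useful sanity check on any implementation is that the plug-in values average to zero over subjects, because both terms of \(\hat{\phi }_i\) average to \(\hat{\varLambda }(t,m)\).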

1.3 Influence function: \(\hat{G}(t,m)\)

Now, we wish to find the influence function for \(\hat{G}(t,m)\), the estimated joint distribution of recurrent survival time and marked variable, where \(\hat{G}(t,m) = \int _0^t \hat{S}(s-) \hat{\varLambda }(ds,m)\). The influence function for the recurrent survival function, \(\hat{S}(t)\), as presented in Wang and Chang (1999), is written as follows:

$$\begin{aligned} \phi _i^{WC}(t)= & {} S(t)\left[ \int _0^t\frac{1}{H^2(u)}\left( \frac{1}{k_i^*}\sum _{j=1}^{k_i^*}I(y_{ij}\ge u)\right) F(du)\right. \\&\left. -\,\frac{I(k_i \ge 2)}{k_i^*}\sum _{j=1}^{k_i^*} \frac{I(y_{ij}< t)}{H(y_{ij})}\right] \ . \end{aligned}$$

Now, completing steps similar to those in the previous section of the Appendix, we use both \(\phi _i(t,m)\) and \(\phi _i^{WC}(t)\) to obtain the influence function for \(\hat{G}(t,m)\):

$$\begin{aligned} \hat{G}(t,m)-G(t,m)&= \int _0^t S(s-) \left[ \hat{\varLambda }(ds,m) - \varLambda (ds,m) \right] \\&\quad +\,\int _0^t\left[ \hat{S}(s-)-S(s-)\right] \varLambda (ds,m) + o_p\left( \frac{1}{\sqrt{n}}\right) \\&=\frac{1}{n} \sum _{i=1}^n \left[ \int _0^t S(s-) \phi _i(ds,m) + \int _0^t \phi _i^{WC}(s)\varLambda (ds,m) \right] \\&\quad +\,o_p\left( \frac{1}{\sqrt{n}}\right) \\&= \frac{1}{n} \sum _{i=1}^n \eta _i(t,m) + o_p\left( \frac{1}{\sqrt{n}}\right) \ . \end{aligned}$$

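Here \(\eta _i(t,m)=\int _0^t S(s-) \phi _i(ds,m) + \int _0^t \phi _i^{WC}(s)\varLambda (ds,m)\) is the influence function of \(\hat{G}(t,m)\). An estimator \(\hat{\eta }_i(t,m)\) of \(\eta _i(t,m)\) can then be obtained by plugging in \(\hat{S}(t)\), \(\hat{\phi }_i(t,m)\), \(\hat{\phi }_i^{WC}(t)\), and \(\hat{\varLambda }(t,m)\) for \(S(t)\), \(\phi _i(t,m)\), \(\phi _i^{WC}(t)\), and \(\varLambda (t,m)\).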
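
For orientation, the point estimator \(\hat{G}(t,m) = \int _0^t \hat{S}(s-) \hat{\varLambda }(ds,m)\) itself can be sketched in a few lines of Python. The sketch assumes that \(\hat{S}\) is the product-limit estimator built from the increments of \(\hat{\varLambda }(\cdot ,\infty )\), in the spirit of Wang and Chang (1999); that form, the data layout (last gap of each subject is censored) and the convention for \(k_i^*\) are assumptions, and the function name is illustrative.

```python
import numpy as np

def G_hat(gaps, marks, t, mm):
    """Sum over jump times s <= t of S_hat(s-) * (increment of Lambda_hat(., mm) at s)."""
    n = len(gaps)
    # Gaps entering the risk set: completed gaps if k_i >= 2, else the lone censored gap.
    used = [np.asarray(g[:max(len(g) - 1, 1)], dtype=float) for g in gaps]

    def H_hat(s):
        return np.mean([np.mean(u >= s) for u in used])

    # One (time, mark, Lambda_hat increment) triple per completed gap
    jumps = []
    for g, mk in zip(gaps, marks):
        k_star = len(g) - 1
        for y, m in zip(g[:k_star], mk):  # empty loop when k_i = 1
            jumps.append((y, m, 1.0 / (n * k_star * H_hat(y))))
    jumps.sort(key=lambda j: j[0])

    total, surv, i = 0.0, 1.0, 0  # surv holds S_hat(s-) at the current jump time
    while i < len(jumps) and jumps[i][0] <= t:
        s = jumps[i][0]
        tied = [j for j in jumps[i:] if j[0] == s]  # all jumps at time s
        total += surv * sum(d for _, m, d in tied if m <= mm)
        surv *= 1.0 - sum(d for _, _, d in tied)    # assumed product-limit update
        i += len(tied)
    return total

# Hypothetical data: three subjects with 3, 1 and 2 observed gaps respectively.
gaps = [[2.0, 3.0, 1.0], [4.0], [1.0, 2.5]]
marks = [[10.0, 20.0], [], [5.0]]
print(G_hat(gaps, marks, t=3.0, mm=10.0))
```

Under the product-limit assumption, setting the mark threshold to infinity gives \(\hat{G}(t,\infty )=1-\hat{S}(t)\), which provides a quick consistency check between the joint estimator and the marginal recurrent survival estimate.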

Cite this article

Yee, L.M., Chan, K.C.G. Nonparametric inference for the joint distribution of recurrent marked variables and recurrent survival time. Lifetime Data Anal 23, 207–222 (2017). https://doi.org/10.1007/s10985-015-9347-7

  • DOI: https://doi.org/10.1007/s10985-015-9347-7
