Case-cohort analysis of clusters of recurrent events

Chen, Feng; Chen, Kani

doi:10.1007/s10985-013-9275-3

Published: 06 July 2013

Lifetime Data Analysis, volume 20, pages 1–15 (2014)

Feng Chen¹ &
Kani Chen²

Abstract

Case-cohort sampling, first proposed by Prentice (Biometrika 73:1–11, 1986), is one of the most effective cohort designs for the analysis of event occurrence, with the Cox proportional hazards model as the typical regression model. This paper extends the case-cohort design to recurrent events with a specific clustering feature, which is captured by a suitably modified Cox-type self-exciting intensity model. We discuss the advantages of this model and validate the pseudo-likelihood method. Simulation studies are presented in support of the theory, and the method is illustrated with an analysis of bladder cancer data.

References

Akaike H (1974) A new look at the statistical model identification. IEEE Trans Autom Control 19(6):716–723. doi:10.1109/TAC.1974.1100705
Barlow WE (1994) Robust variance estimation for the case-cohort design. Biometrics 50(4):1064–1072
Barlow WE, Ichikawa L, Rosner D, Izumi S (1999) Analysis of case-cohort designs. J Clin Epidemiol 52(12):1165–1172
Borgan O, Langholz B, Samuelsen SO, Goldstein L, Pogoda J (2000) Exposure stratified case-cohort designs. Lifetime Data Anal 6(1):39–58
Breslow N (1972) Discussion of paper by D. R. Cox. J R Stat Soc Ser B (Methodol) 34:216–217
Byar D (1980) The veterans administration study of chemoprophylaxis for recurrent stage I bladder tumors: comparisons of placebo, pyridoxine, and topical thiotepa. In: Pavone-Macaluso M, Smith PH, Edsmyr F (eds) Bladder tumors and other topics in urological oncology. Plenum Press, New York, pp 363–370
Cai J, Prentice RL (1995) Estimating equations for hazard ratio parameters based on correlated failure time data. Biometrika 82(1):151–164. doi:10.1093/biomet/82.1.151. http://biomet.oxfordjournals.org/content/82/1/151.abstract, http://biomet.oxfordjournals.org/content/82/1/151.full.pdf+html
Cai J, Prentice RL (1997) Regression estimation using multivariate failure time data and a common baseline hazard function model. Lifetime Data Anal 3(3):197–213
Chornoboy E, Schramm L, Karr A (1988) Maximum likelihood identification of neural point process systems. Biol Cybern 59(4):265–275. doi:10.1007/BF00332915
Crane R, Sornette D (2008) Robust dynamic classes revealed by measuring the response function of a social system. Proc Natl Acad Sci 105:15649–15653
Engle RF, Russell JR (1998) Autoregressive conditional duration: a new model for irregularly spaced transaction data. Econometrica 66:1127–1162. http://www.jstor.org/stable/2999632
Errais E, Giesecke K, Goldberg LR (2010) Affine point processes and portfolio credit risk. SIAM J Financ Math 1:642–665
Felini M, Johnson E, Preacely N, Sarda V, Ndetan H, Bangara S (2011) A pilot case-cohort study of liver and pancreatic cancers in poultry workers. Annals of Epidemiology 21(10):755–766. doi:10.1016/j.annepidem.2011.07.001. http://www.sciencedirect.com/science/article/pii/S1047279711002079
Hawkes AG (1971) Spectra of some self-exciting and mutually exciting point processes. Biometrika 58(1):83–90
Koopman SJ, Lucas A, Monteiro A (2008) The multi-state latent factor intensity model for credit. J Econom 142:399–424
Kopperschmidt K, Stute W (2009) Purchase timing models in marketing: a review. AStA Adv Stat Anal 93:123–149. doi:10.1007/s10182-008-0096-8
Lu SE, Shih JH (2006) Case-cohort designs and analysis for clustered failure time data. Biometrics 62(4):1138–1148. doi:10.1111/j.1541-0420.2006.00584.x
Lu W, Tsiatis AA (2006) Semiparametric transformation models for the case-cohort study. Biometrika 93(1):207–214. doi:10.1093/biomet/93.1.207. http://biomet.oxfordjournals.org/content/93/1/207.abstract, http://biomet.oxfordjournals.org/content/93/1/207.full.pdf+html
Nan B (2004) Efficient estimation for case-cohort studies. Can J Stat 32(4):403–419. http://www.jstor.org/stable/3316024
Ogata Y (1988) Statistical models for earthquake occurrences and residual analysis for point processes. J Am Stat Assoc 83(401):9–27. http://www.jstor.org/stable/2288914
Prentice RL (1986) A case-cohort design for epidemiologic cohort studies and disease prevention trials. Biometrika 73:1–11
Self SG, Prentice RL (1988) Asymptotic distribution theory and efficiency results for case-cohort studies. Ann Stat 16(1):64–81
Teich MC, Saleh BEA (2000) Branching processes in quantum electronics. IEEE J Sel Top Quantum Electron 6(6):1450–1457
Therneau T (2012) A package for survival analysis in S. R package version 2.37-2. http://CRAN.R-project.org/package=survival
Therneau TM, Hamilton SA (1997) rhDNase as an example of recurrent event analysis. Stat Med 16:2029–2047. doi:10.1002/(SICI)1097-0258(19970930)16:18<2029::AID-SIM637>3.0.CO;2-H
Therneau TM, Li H (1999) Computing the Cox model for case cohort designs. Lifetime Data Anal 5(2):99–112
Vere-Jones D (1995) Forecasting earthquakes and earthquake risk. Int J Forecast 11(4):503–538. doi:10.1016/0169-2070(95)00621-4. http://www.sciencedirect.com/science/article/B6V92-3XWRN36-2/2/fe30ea6bc743e2109d4968681d73d61d
Wei LJ, Lin DY, Weissfeld L (1989) Regression analysis of multivariate incomplete failure time data by modeling marginal distributions. J Am Stat Assoc 84(408):1065–1073. http://www.jstor.org/stable/2290084
Zeng D, Lin DY, Avery CL, North KE, Bray MS (2006) Efficient semiparametric estimation of haplotype-disease associations in case-cohort and nested case-control studies. Biostatistics 7(3):486–502. doi:10.1093/biostatistics/kxj021. http://biostatistics.oxfordjournals.org/content/7/3/486.abstract, http://biostatistics.oxfordjournals.org/content/7/3/486.full.pdf+html
Zhang H, Schaubel DE, Kalbfleisch JD (2011) Proportional hazards regression for the analysis of clustered survival data from case-cohort studies. Biometrics 67(1):18–28
Zhuang J, Ogata Y, Vere-Jones D (2002) Stochastic declustering of space-time earthquake occurrences. J Am Stat Assoc 97(458):369–380. doi:10.1198/016214502760046925

Acknowledgments

We would like to thank the Associate Editor and two anonymous referees for their valuable comments, which have led to improved presentation. F.C. was supported by a University of New South Wales (UNSW) Early Career Researcher grant and a UNSW Faculty Research Grant. K.C. was supported by Hong Kong Research Grants Council Grants (601011 and 601612).

Author information

Authors and Affiliations

School of Mathematics and Statistics, The University of New South Wales, Sydney, NSW, 2052, Australia
Feng Chen
Department of Mathematics, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
Kani Chen

Corresponding author

Correspondence to Kani Chen.

Appendix

1.1 Sketch proof of Proposition 1

Although, because of the self-exciting component, our model is not a special case of the model considered by Self and Prentice (1988), the asymptotic properties of the pseudo-likelihood-based estimator can be proved along the same lines as in Self and Prentice (1988) for the Cox proportional intensity model. Here we present a sketch of the technical arguments using the empirical approximation method. Conditions C3–C5 ensure that $\varPsi (t, \theta )Y( t) $ is $P$-Glivenko–Cantelli over $[0, t_0] \times \varTheta $ for any fixed $t_0>0$. As a result,

$$\begin{aligned} \tilde{R}^{(k)} (t, \theta ) \rightarrow r^{(k)} (t, \theta ) \end{aligned}$$

uniformly over $[0, \tau ] \times \varTheta $ with probability one. It follows that, uniformly over $\varTheta $,

$$\begin{aligned} {1 \over n}\left[ \log \{L(\theta )\} -\log \{ L(\theta _0)\} \right] \rightarrow l(\theta ) - l(\theta _0), \end{aligned}$$

where

$$\begin{aligned} l(\theta )&= \mathrm{E }\left( \int _0^C \left[ \varPsi (t, \theta )- \log \mathrm{E }\{ \exp \{ \varPsi (t, \theta ) \} \mu _0(t) Y(t)\}\right] \,\mathrm d N(t) \right) \\&= \int _0^\tau \mathrm{E }\left( \left[ \varPsi (t, \theta )- \log \mathrm{E }\{ \exp \{ \varPsi (t, \theta ) \} \mu _0(t) Y(t) \}\right] \exp \{\varPsi (t, \theta _0) \} Y(t)\mu _0(t)\right) \mathrm d t. \end{aligned}$$

Observe that, for any positive random variable $\zeta $ and nonnegative random variable $\eta $ with positive mean, Jensen’s inequality implies $ \mathrm{E }[\eta \log \zeta ]/\mathrm{E }[\eta ] \le \log \mathrm{E }[\zeta \eta ] - \log \mathrm{E }[\eta ].$ Set $\zeta = \exp \{\varPsi (t, \theta ) - \varPsi (t, \theta _0)\}$ and $\eta = \exp \{\varPsi (t, \theta _0)\} Y(t) \mu _0(t)$. It follows that the integrand in the second expression of $l(\theta )$ is maximized at $\theta =\theta _0$. By C4, $l(\theta )$ achieves its maximum only at $\theta _0$. The uniform convergence over $\varTheta $ then implies that $ \hat{\theta }$ is strongly consistent.
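
The Jensen step can be spelled out as follows; this is a sketch in the notation above, and the symbol $g$ is introduced here only for illustration. Since $\eta \ge 0$ with $\mathrm{E }[\eta ]>0$, the weights $\eta /\mathrm{E }[\eta ]$ define a probability measure, and concavity of the logarithm gives the stated inequality. Write $g(t, \theta )$ for the integrand in the second expression of $l(\theta )$, and take the baseline factor in $\eta $ to be the $\mu _0(t)$ appearing there, so that $\zeta \eta = \exp \{\varPsi (t, \theta )\} Y(t) \mu _0(t)$. Then

$$\begin{aligned} g(t, \theta ) - g(t, \theta _0)&= \mathrm{E }\left[ \eta \{\varPsi (t, \theta ) - \varPsi (t, \theta _0)\}\right] - \mathrm{E }[\eta ] \left\{ \log \mathrm{E }[\zeta \eta ] - \log \mathrm{E }[\eta ]\right\} \\&= \mathrm{E }[\eta ] \left\{ \frac{\mathrm{E }[\eta \log \zeta ]}{\mathrm{E }[\eta ]} - \log \frac{\mathrm{E }[\zeta \eta ]}{\mathrm{E }[\eta ]} \right\} \le 0. \end{aligned}$$

Hence $g(t, \theta ) \le g(t, \theta _0)$ at every $t$ with $\mathrm{E }[\eta ]>0$, and integrating over $[0, \tau ]$ gives $l(\theta ) \le l(\theta _0)$.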

Asymptotic normality is proved by a Taylor expansion in a small neighborhood of $\theta _0$. In analogy with the definitions of $r^{(k)}$ and $\tilde{R}^{(k)}$, let

$$\begin{aligned} R^{(k)}(t, \theta ) = {1 \over n} \sum \limits _{i=1}^n \psi _i(t, \theta )^{\otimes k} Y_i(t) \exp \{ \varPsi _i(t, \theta ) \}, \end{aligned}$$

where $\psi ^{\otimes 0} = 1$ and $\psi ^{\otimes 1} = \psi $.
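
To make the full-cohort averages concrete, here is a minimal NumPy sketch that evaluates them at one fixed $(t, \theta )$. It reads the superscript $(k)$ as the $k$-th outer power of $\psi _i$, an assumption in the Self and Prentice (1988) convention that the ratios $R^{(1)}/R^{(0)}$ used in this proof suggest. The function name and the array layout are hypothetical and not taken from the paper.

```python
import numpy as np

def risk_set_averages(psi, Psi, Y):
    """Full-cohort averages R^(0), R^(1), R^(2) at one fixed (t, theta).

    psi : (n, p) array, row i holds psi_i(t, theta)   (hypothetical layout)
    Psi : (n,) array, entry i holds Psi_i(t, theta)
    Y   : (n,) array of at-risk indicators Y_i(t)
    """
    psi = np.asarray(psi, dtype=float)
    Psi = np.asarray(Psi, dtype=float)
    Y = np.asarray(Y, dtype=float)
    n = len(Y)
    w = Y * np.exp(Psi)                                # Y_i(t) exp{Psi_i(t, theta)}
    R0 = w.sum() / n                                   # k = 0: scalar
    R1 = (w[:, None] * psi).sum(axis=0) / n            # k = 1: vector of length p
    R2 = np.einsum("i,ij,ik->jk", w, psi, psi) / n     # k = 2: p x p matrix
    return R0, R1, R2

# Toy cohort of three subjects, the third no longer at risk.
R0, R1, R2 = risk_set_averages(
    psi=[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    Psi=[0.0, 0.0, 0.0],
    Y=[1, 1, 0],
)
centre = R1 / R0  # the quantity subtracted from psi_i in the score
```

The ratio `R1 / R0` is the risk-set weighted mean of $\psi _i$, which is the centring term that appears inside the stochastic integrals of the expansion.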

The principal derivation is

$$\begin{aligned} \hat{\theta }- \theta _0&= - \left[ \frac{\partial ^2}{\partial \theta \partial \theta ^{\top }}\log \{L(\theta _0)\} \right] ^{-1} \left[ \sum \limits _{i=1}^n \int _0^{C_i } \left\{ \psi _i(t, \theta _0) - { R^{(1)}(t, \theta _0) \over R^{(0)} (t, \theta _0)} \right\} \,\mathrm d N_i(t) \right. \\&\qquad \qquad \left. +\sum \limits _{i=1}^n \int _0^{C_i} \left\{ { R^{(1)}(t, \theta _0) \over R^{(0)} (t, \theta _0)} - { \tilde{R}^{(1)}(t, \theta _0) \over \tilde{R}^{(0)} (t, \theta _0)} \right\} \,\mathrm d N_i(t) \right] +o_P(n^{-1/2})\\&= - \left[ \frac{\partial ^2}{\partial \theta \partial \theta ^{\top }}\log \{L(\theta _0)\} \right] ^{-1} \left[ \sum \limits _{i=1}^n \int _0^{C_i} \left\{ \psi _i(t, \theta _0) - { R^{(1)}(t, \theta _0) \over R^{(0)} (t, \theta _0)} \right\} \,\mathrm d N_i(t)\right. \\&\qquad \qquad \left. + \sum \limits _{i=1}^n \left( {\epsilon _i \over n_*/n } - 1\right) \xi _i \right] + o_P(n^{-1/2}). \end{aligned}$$

Observe that the $\epsilon _i$ are membership indicators for simple random sampling without replacement, so a central limit theorem for such sampling applies to the second sum, and the asymptotic normality of $\hat{\theta }$ follows.
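
The role of the sampling indicators can be illustrated by a small Monte Carlo sketch. It treats the $\xi _i$ as fixed scalars, which is a simplifying assumption (in the proof they are random and may be vector valued), and draws the subcohort by simple random sampling without replacement. Under that assumption the sum has mean zero, and the classical finite-population formula gives its variance as $n^2 (1-f) S^2 / n_*$, with $f = n_*/n$ and $S^2$ the sample variance of the $\xi _i$. All names, sizes, and the normal draws for $\xi _i$ are hypothetical.

```python
import numpy as np

def srswor_term(xi, n_star, rng):
    """One draw of sum_i (eps_i / (n_star / n) - 1) * xi_i under SRSWOR."""
    n = len(xi)
    eps = np.zeros(n)
    # Subcohort membership: n_star distinct indices chosen uniformly at random.
    eps[rng.choice(n, size=n_star, replace=False)] = 1.0
    return np.sum((eps / (n_star / n) - 1.0) * xi)

def check_moments(n=200, n_star=50, reps=5000, seed=0):
    """Compare simulated mean and variance with the finite-population values."""
    rng = np.random.default_rng(seed)
    xi = rng.normal(size=n)  # hypothetical fixed values standing in for xi_i
    draws = np.array([srswor_term(xi, n_star, rng) for _ in range(reps)])
    f = n_star / n
    # Variance of the expansion estimator of a total under SRSWOR.
    theo_var = n**2 * (1.0 - f) * xi.var(ddof=1) / n_star
    return draws.mean(), draws.var(ddof=1), theo_var

emp_mean, emp_var, theo_var = check_moments()
print(emp_mean, emp_var, theo_var)
```

The simulated mean should be close to zero relative to the standard deviation, and the simulated variance close to the theoretical one; the factor $1-f$ is the finite-population correction that distinguishes sampling without replacement from independent Bernoulli sampling.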

About this article

Cite this article

Chen, F., Chen, K. Case-cohort analysis of clusters of recurrent events. Lifetime Data Anal 20, 1–15 (2014). https://doi.org/10.1007/s10985-013-9275-3

Received: 25 September 2012
Accepted: 18 June 2013
Published: 06 July 2013
Issue Date: January 2014
DOI: https://doi.org/10.1007/s10985-013-9275-3
