Elsevier

Handbook of Statistics

Volume 23, 2003, Pages 603-623
Analysis of Recurrent Event Data

https://doi.org/10.1016/S0169-7161(03)23034-0

Publisher Summary

This chapter describes non- and semiparametric methods of analyzing recurrent event data. It focuses on the functions that are modeled in the analysis of recurrent event data. Recurrent event data are often encountered in biomedicine (e.g., opportunistic infections among AIDS patients), demography (e.g., birth patterns among women of child-bearing age), and quality control (e.g., automobile repairs). The data structure for recurrent events represents a special case of multivariate survival data, where the failure times for a subject are ordered. As such, recurrent event data have often been analyzed using methods of multivariate survival analysis. Semiparametric regression methods are described in the chapter, with special attention given to elucidating the sometimes subtle differences between the methods with respect to the interpretation of parameter estimates. The methods are also illustrated using an example analysis of a preschool asthma data set.

Introduction

In many research settings, the event of interest can be experienced more than once per subject. Such outcomes have been termed recurrent events. Recurrent event data are often encountered in biomedicine (e.g., opportunistic infections among AIDS patients), demography (e.g., birth patterns among women of child-bearing age) and quality control (e.g., automobile repairs). Typically, the study will have a fixed termination date, such that the occurrence times are potentially censored. The data structure for recurrent events represents a special case of multivariate survival data, where the failure times for a subject are ordered. As such, recurrent event data have often been analyzed using methods of multivariate survival analysis (e.g., Wei et al., 1989; Prentice et al., 1981; Andersen and Gill, 1982). However, fundamental characteristics of recurrent event data mean that care must be exercised when applying methods designed for the larger class of general data structures amenable to multivariate survival analysis. Correspondingly, the analysis of recurrent event data has recently been the subject of much methodological research.

In this chapter, we describe non- and semiparametric methods of analyzing recurrent event data. The chapter is organized as follows. In Section 2, we set up essential notation, then focus on the functions which are modelled in the analysis of recurrent event data. In Section 3, semiparametric regression methods are described, with special attention given to elucidating the sometimes subtle differences between the methods with respect to the interpretation of parameter estimates. The methods are illustrated using an example analysis of a preschool asthma data set. In Section 4, we describe various nonparametric estimation methods which can be used for recurrent event data. The final section contains some concluding remarks, and lists some very recent work dealing with data structures which lie beyond the scope of this chapter.

Section snippets

Notation and basic functions of interest

Throughout this chapter, the notation will be as follows. $N_i(t)=\int_0^t dN_i(s)$ will represent the number of events in $[0,t]$ for subject $i$, for $i=1,\ldots,n$, where $dN_i(s)$ denotes the number of events in the small time interval $[s,s+ds)$. We assume that subject $i$ is observed over the $[0,C_i]$ interval, where $C_i$ denotes censoring time, and is observed to experience events at times $T_{i,1},\ldots,T_{i,m_i}$. We assume that $C_i$ is determined independently of $\{dN_i(t);\,t\geq 0\}$. In the presence of covariates, the assumption is
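As a rough illustration of this notation (not taken from the chapter), the sketch below evaluates the observed counting process for a single subject from a list of event times and a censoring time. The function name `count_events` and the example values are hypothetical; the only assumption beyond the text is that events after the censoring time C_i are unobserved, so the observed count at time t is N_i(min(t, C_i)).

```python
from bisect import bisect_right


def count_events(event_times, censor_time, t):
    """Observed count N_i(min(t, C_i)) for one subject.

    event_times : event times T_{i,1}, ..., T_{i,m_i} (any order)
    censor_time : censoring time C_i
    t           : time at which the counting process is evaluated
    """
    # Events after the censoring time are not observed (assumption of this sketch).
    observed = sorted(x for x in event_times if x <= censor_time)
    # Number of observed events in [0, min(t, C_i)].
    return bisect_right(observed, min(t, censor_time))


# Hypothetical subject: events at 2.0, 5.5 and 9.0, censored at 8.0.
events = [2.0, 5.5, 9.0]
C = 8.0
print([count_events(events, C, t) for t in (1, 2, 6, 10)])  # [0, 1, 2, 2]
```

The event at 9.0 falls after the censoring time, so it never contributes to the observed count, which stays at 2 for every t beyond 5.5.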

Semiparametric models for recurrent event data

The effects of certain covariates on the recurrent event process are often of interest to investigators. In this section, we describe various semiparametric models which often form the basis for regression analysis of recurrent event data. Several of the models are illustrated through the analysis of a retrospective cohort study on preschool asthma (Schaubel et al., 1996). For this study, children born during a one-year period, April 1, 1984 to March 31, 1985, were followed retrospectively from

Nonparametric estimation of the recurrent event survival and distribution functions

Each of the methods described in this chapter thus far has involved semiparametric modelling. We now consider nonparametric estimation of the gap time distribution, survival and related functions.

A complication in the estimation of gap time functions is induced dependent censoring. That is, even if total times are censored independently (e.g., loss to follow-up, administrative censoring), the gap times for the second and subsequent events will be subject to induced dependent censoring, except
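To make the induced dependence concrete, the following sketch (an illustration under stated assumptions, not the chapter's method) converts one subject's total event times into gap times. The helper name `gap_times` and the example values are hypothetical. The point to notice is that the censoring time applicable to the second gap is C_i minus the first event time, so it depends on the first gap; when gap times within a subject are correlated, the second gap and its censoring time are therefore correlated as well.

```python
def gap_times(event_times, censor_time):
    """Return (gap, observed) pairs for one subject.

    Each observed event contributes the time elapsed since the previous
    event (or since time 0). The final pair is the censored gap running
    from the last observed event to the censoring time C_i.
    """
    times = sorted(t for t in event_times if t <= censor_time)
    gaps = []
    prev = 0.0
    for t in times:
        gaps.append((t - prev, True))   # fully observed gap
        prev = t
    # Censored gap: its length is C_i minus the last observed event time,
    # so it is determined in part by the earlier gaps.
    gaps.append((censor_time - prev, False))
    return gaps


# Hypothetical subject: events at total times 2.0 and 5.5, censored at 8.0.
print(gap_times([2.0, 5.5], 8.0))
# [(2.0, True), (3.5, True), (2.5, False)]
```

A subject with a long first gap leaves less follow-up time for the second gap, which is the mechanism behind the induced dependent censoring described above.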

Conclusion

In this chapter, we have reviewed several methods useful in the analysis of recurrent event data. The example analysis of the preschool asthma data set illustrates the differences between the methods, and resulting parameter estimators. Ultimately, the choice of which model to fit depends on the objectives of the investigator, and possibly the specifics of the data set of interest. Methodological interest in recurrent event data persists. Methods have recently been developed for data structures

Acknowledgements

This work was partially supported by National Institutes of Health grant R01 HL-57444.

References (60)

  • P.K. Andersen et al. Statistical Models Based on Counting Processes (1993)
  • P.K. Andersen et al. Cox's regression model for counting processes: A large sample study. Ann. Statist. (1982)
  • U. Barai et al. Multiple statistics for multiple events, with application to repeated infections in the growth factor studies. Statist. Medicine (1997)
  • N. Breslow. Contribution to the discussion of the paper by D.R. Cox. J. Roy. Statist. Soc. Ser. B (1974)
  • J. Cai et al. Estimating equations for hazard ratio parameters based on correlated failure time data. Biometrika (1995)
  • J. Cai et al. Regression estimation using multivariate failure time data and a common baseline hazard function model. Lifetime Data Anal. (1997)
  • J. Cai et al. Marginal means and rates models for multiple type recurrent event data. Lifetime Data Anal. (2003)
  • G. Campbell et al. Large sample properties of nonparametric statistical inference
  • I.-S. Chang et al. Information and asymptotic efficiency in some generalized proportional hazards models for counting processes. Ann. Statist. (1994)
  • S.-H. Chang et al. Conditional regression analysis for recurrence time data. J. Amer. Statist. Assoc. (1999)
  • Chang, S.-H. (1995). Regression analysis for recurrent event data. Doctoral Dissertation. Johns Hopkins University,...
  • C.L. Chiang. Introduction to Stochastic Processes in Biostatistics (1968)
  • D. Clayton. Some approaches to the analysis of recurrent event data. Statist. Methods Medical Res. (1994)
  • R.J. Cook et al. Discussion of paper by Wei and Glidden. Statist. Medicine (1997)
  • R.J. Cook et al. Marginal analysis of recurrent events and a terminating event. Statist. Medicine (1997)
  • D.R. Cox. Regression models and life-tables (with discussion). J. Roy. Statist. Soc. Ser. B (1972)
  • D.R. Cox. Partial likelihood. Biometrika (1975)
  • D.R. Cox et al. Point Processes (1980)
  • D.M. Dabrowska. Kaplan–Meier estimate on the plane. Ann. Statist. (1988)
  • B. Efron. Censored data and the bootstrap. J. Amer. Statist. Assoc. (1981)
  • T.R. Fleming et al. Counting Processes and Survival Analysis (1991)
  • S. Gao et al. An empirical comparison of two semi-parametric approaches for the estimation of covariate effects from multivariate failure time data. Statist. Medicine (1997)
  • D. Ghosh et al. Nonparametric analysis of recurrent events and death. Biometrics (2000)
  • P.J. Huber. The behaviour of maximum likelihood estimates under nonstandard conditions
  • J.D. Kalbfleisch et al. The Statistical Analysis of Failure Time Data (2002)
  • E.L. Kaplan et al. Nonparametric estimation from incomplete samples. J. Amer. Statist. Assoc. (1958)
  • P. Kelly et al. Survival analysis for recurrent event data: An application to childhood infectious diseases. Statist. Medicine (2000)
  • N.M. Laird et al. Covariance analysis of censored survival data using log-linear analysis techniques. J. Amer. Statist. Assoc. (1981)
  • J.F. Lawless. The analysis of recurrent events for multiple subjects. Appl. Statist. (1995)
  • J.F. Lawless et al. Some simple robust methods for the analysis of recurrent events. Technometrics (1995)