frailtypack: An R Package for the Analysis of Correlated Survival Data with Frailty Models Using Penalized Likelihood Estimation or Parametric Estimation

Frailty models are very useful for analysing correlated survival data, when observations are clustered into groups or when there are recurrent events. The aim of this article is to present the new version of an R package called frailtypack. This package allows the user to fit Cox models and four types of frailty models (shared, nested, joint, additive) that can address several issues within biomedical research. It is well adapted to the analysis of recurrent events such as cancer relapses and/or terminal events (death or loss to follow-up). The approach uses maximum penalized likelihood estimation. Right-censored or left-truncated data are considered. Stratification and time-dependent covariates are also allowed in the analysis.


Introduction
Frailty models (Duchateau and Janssen 2008; Hougaard 2000; Wienke 2010; Hanagal 2011) are extensions of the Cox proportional hazards model (Cox 1972), which is the most popular model in survival analysis. In many clinical applications, the study population needs to be considered as a heterogeneous sample or as a cluster of homogeneous groups of individuals such as families or geographical areas. Sometimes, due to lack of knowledge or for economic reasons, some covariates related to the event of interest are not measured. The frailty approach is a statistical modelling method which aims to account for the heterogeneity caused by unmeasured covariates. It does so by adding random effects which act multiplicatively on the hazard function. frailtypack is an R package (R Development Core Team 2012) which allows the user to fit four types of frailty models, for left-truncated and right-censored data, adapted to most survival analysis issues. The aim of this paper is to present the new version of the R package frailtypack, which is available from the Comprehensive R Archive Network at http://CRAN.R-project.org/package=frailtypack, and the various new models it proposes. It depends on the R survival package (Therneau 2012). The initial version of this package (Rondeau and Gonzalez 2005) was proposed for a simple shared frailty model, and was then extended to more general frailty models (Rondeau et al. 2012). The shared frailty model (Rondeau et al. 2003) can be used when observations are supposed to be clustered into groups. The nested frailty model (Rondeau et al. 2006) is most appropriate when there are two levels of hierarchical clustering. In other settings, repeated relapses (recurrent events) are likely to increase the risk of death, so that the terminal event acts as informative censoring. Using a joint frailty model, it is possible to fit jointly the two hazard functions associated with recurrent and terminal events (Rondeau et al. 2007), when these events are supposed to be correlated. The additive frailty model (Rondeau et al. 2008) is better suited to studying both heterogeneity across trials and treatment-by-trial heterogeneity (for instance in meta-analyses or multicentre studies). Depending on the model, stratification and time-dependent covariates may or may not be allowed. The frailty models discussed in the recent literature present several drawbacks: their convergence is slow, they do not provide standard errors for the variance estimate of the random effects, and they cannot produce smooth estimates of the hazard function. frailtypack uses penalized likelihood estimation, and smooth estimates of the baseline hazard functions are obtained using an approximation by splines. frailtypack was first written in Fortran 77 and was implemented for the statistical software R. Section 2 presents the models that frailtypack can fit and the estimation method. Section 3 describes the functions and the arguments of frailtypack. Section 4 provides examples illustrating the frailtypack functions.

Models
In the following models, the covariates may be time-dependent, but to simplify the expressions we write, for instance, X_ij instead of X_ij(t).

Cox model
The proportional hazards Cox model is a frailty model without a random effect. With frailtypack, it is possible to fit such a model with the parameters estimated by penalized likelihood maximization.

Shared frailty model
When observations are clustered into groups such as hospitals or cities, or when observations are recurrent event times (cancer relapses), the shared gamma frailty model is the most commonly adapted model (Rondeau et al. 2003). In the following, we use recurrent event terminology; nevertheless, grouped data can also be treated. For the j-th (j = 1, ..., n_i) individual of the i-th group (i = 1, ..., G), let T_ij denote the recurrent event times under study, let C_ij be the right-censoring times and let L_ij be the left-truncation times. The observations are Y_ij = min(T_ij, C_ij) and the censoring indicators are δ_ij = I{Y_ij = T_ij}. The recurrent event times may be left-truncated and right-censored. Stratified analysis by a binary variable is allowed. In the hazard function for a shared frailty model, λ_0(t) is the baseline hazard function, X_ij is the covariate vector associated with the vector of regression parameters β, and v_i is the random effect associated with the i-th group. We assume that the v_i are independently and identically distributed (i.i.d.) from a gamma distribution with E(v_i) = 1 and Var(v_i) = θ, i.e., v_i ~ Γ(1/θ, 1/θ). We observe Y_ij, L_ij, δ_ij. The full marginal log-likelihood for this model has an analytical formulation (Klein 1992), where Φ = (λ_0(·), β, θ), the cumulative baseline hazard function is Λ_0(·) and m_i is the number of recurrent events in the i-th group.
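The displayed hazard function appears to have been lost in extraction; based on the definitions above (and on the standard shared gamma frailty model of Rondeau et al. 2003), it plausibly reads:

```latex
\lambda_{ij}(t \mid v_i) = v_i \, \lambda_0(t) \, \exp\!\left(\beta^{\top} X_{ij}\right),
\qquad v_i \sim \Gamma\!\left(\tfrac{1}{\theta}, \tfrac{1}{\theta}\right),
\quad \mathrm{E}(v_i) = 1, \; \mathrm{Var}(v_i) = \theta .
```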

Nested frailty model
Nested frailty models account for hierarchical clustering of the data by including two nested random effects that act multiplicatively on the hazard function. Such models are appropriate when observations are clustered at several hierarchical levels, such as geographical areas (Rondeau et al. 2006). For instance, the American Cancer Society study on air pollution and mortality (Pope et al. 1995) included 552,138 subjects from 151 metropolitan areas throughout the United States, themselves nested within 44 states. As the hypothesis of independent observations is not a priori obvious in this cohort, a flexible survival model with two nested random effects at the state and city levels is needed to obtain valid estimates. We consider G independent clusters and, within the i-th cluster, n_i subclusters. Left-truncated and right-censored data are allowed. Stratified analysis by a binary variable is allowed, too. Let T_ijk denote the survival times under study for the k-th subject (k = 1, ..., K_ij) from the j-th subgroup (j = 1, ..., n_i) of the i-th group (i = 1, ..., G), C_ijk the right-censoring times and L_ijk the left-truncation times. The observations are Y_ijk = min(T_ijk, C_ijk) and the censoring indicators δ_ijk. We assume that T_ijk, L_ijk and C_ijk are independent. We observe Y_ijk, L_ijk, δ_ijk. In the hazard function for a nested frailty model, λ_0(t) is the baseline hazard function and X_ijk is the covariate vector associated with β, the corresponding vector of regression parameters. The cluster random effect v_i and the subcluster random effect z_ij are both independently and identically gamma-distributed, with E(v_i) = 1, Var(v_i) = θ, E(z_ij) = 1 and Var(z_ij) = η.
If η is null, then observations from the same subgroup are independent, and if θ is null, then observations from the same group are independent. A larger variance implies greater heterogeneity in frailty across groups and a greater correlation of the survival times for individuals that belong to the same group. The full marginal log-likelihood takes the form given in Rondeau et al. (2006), where Φ = (λ_0(·), β, η, θ), the cumulative baseline hazard function is Λ_ijk(·), and m_i is the number of recurrent events in the i-th group.
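The displayed hazard function did not survive extraction; from the definitions above it is the usual nested gamma frailty form (a reconstruction, following Rondeau et al. 2006):

```latex
\lambda_{ijk}(t \mid v_i, z_{ij}) = v_i \, z_{ij} \, \lambda_0(t) \, \exp\!\left(\beta^{\top} X_{ijk}\right),
\qquad v_i \sim \Gamma\!\left(\tfrac{1}{\theta}, \tfrac{1}{\theta}\right),
\quad z_{ij} \sim \Gamma\!\left(\tfrac{1}{\eta}, \tfrac{1}{\eta}\right).
```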

Joint frailty model
The observation of successive events across time (recurrent events) for subjects in cohort studies may be terminated by loss to follow-up, end of study, or a major failure event such as death. In this context, the major failure event may be correlated with the recurrent events, and the usual assumption of noninformative censoring of the recurrent event process by death is not valid. Joint frailty models make it possible to study the joint evolution over time of two survival processes by treating the terminal event as informative censoring (Rondeau et al. 2007). A common frailty term v_i for the two rates takes into account the heterogeneity in the data associated with unobserved covariates. We assume that the v_i are independently and identically distributed (i.i.d.) from a gamma distribution with E(v_i) = 1 and Var(v_i) = θ, i.e., v_i ~ Γ(1/θ, 1/θ). The frailty term acts differently on the two rates (v_i on the recurrent event rate and v_i^α on the death rate). The parameters θ and α characterize the dependence between the recurrent event process T_ij and the terminal event time T*_i attributable to the unobserved random effects. A zero value of α implies that the dependence between T_ij and T*_i can be fully explained by the (observed) covariates. If θ is null, T_ij and T*_i are considered independent and the parameter α is not interpretable. On the other hand, if both θ and α are nonzero, then θ not only accounts for the intra-subject correlation, but also represents the dependence between the recurrent event rate and the terminal event rate. The covariates may differ between the recurrent rate and the death rate. This model can be fitted to right-censored and/or left-truncated data, but stratification by a binary variable is not allowed. For subject i (i = 1, ..., G), we denote by T_ij the j-th recurrent time (j = 1, ..., n_i) considered as the time-to-event, by C_i the censoring time, by L_i the left-truncation time and by D_i the death time.
Y_ij = min(T_ij, C_i, D_i) corresponds to each observation time, and δ_ij = I{Y_ij = T_ij} is the recurrent event indicator. T*_i = min(C_i, D_i) is the last follow-up time and δ*_i = I{T*_i = D_i} the death indicator. We observe Y_ij, T*_i, L_i, δ_ij, δ*_i. In the system of hazard functions for joint frailty models of recurrent events and death, r_0(t) (resp. λ_0(t)) is the recurrent (resp. terminal) event baseline hazard function, β_1 (resp. β_2) is the vector of regression coefficients associated with the covariate vector X_ij (resp. X_i), and the v_i ~ Γ(1/θ, 1/θ) are i.i.d. Contrary to the shared gamma frailty model, the full log-likelihood does not take a simple form, because the integrals do not have a closed form (Rondeau et al. 2007). In the full marginal log-likelihood expression, Φ = (r_0(·), λ_0(·), β, α, θ), with β = (β_1, β_2) the vectors of covariate coefficients for recurrent events and death, Y_i0 = 0, the cumulative hazard functions for death and for recurrent events appear, and m_i is the total number of recurrent events for subject i.
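The displayed system of hazard functions appears to have been lost in extraction; based on the definitions above and on Rondeau et al. (2007), it plausibly reads:

```latex
\begin{cases}
r_{ij}(t \mid v_i) \;=\; v_i \, r_0(t) \, \exp\!\left(\beta_1^{\top} X_{ij}\right) & \text{(recurrent events)} \\[4pt]
\lambda_i(t \mid v_i) \;=\; v_i^{\alpha} \, \lambda_0(t) \, \exp\!\left(\beta_2^{\top} X_i\right) & \text{(death)}
\end{cases}
```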

Additive frailty model
In a meta-analysis of clinical trials, the main objective is to examine the heterogeneity between trials of both the underlying risk and the treatment effects. An additive frailty model is a proportional hazards model with two correlated random effects at the trial level that act multiplicatively on the hazard function, one on the baseline and one in interaction with treatment. This approach allows not only variations of the baseline hazard function across trials but also variations of the treatment effect across trials in a meta-analysis of individual survival data (Vaida and Xu 2000; Legrand et al. 2005; Rondeau et al. 2008; Ha et al. 2011). If we are only interested in the variation of the baseline hazard function across trials, a simple shared frailty model can be used (see Section 2.1). In the case of an additive frailty model, only right-censored data are allowed, not left-truncated data. We suppose that the G trials are independent. For the j-th (j = 1, ..., n_i) individual of the i-th trial (i = 1, ..., G), let T_ij denote the survival times under study and let C_ij be the corresponding right-censoring times. The observations are Y_ij = min(T_ij, C_ij) and the censoring indicators δ_ij = I{Y_ij = T_ij}. For each subject, we observe a binary treatment arm indicator X_ij1, which equals 1 when the patient is in the experimental arm and 0 when in the standard arm. Stratified analysis by a binary variable is allowed. In the hazard function for the j-th individual of the i-th trial, with random trial effect u_i and random treatment-by-trial interaction w_i, λ_0(t) is the baseline hazard function, β_k is the fixed effect associated with the covariate X_ijk (k = 1, ..., p), β_1 is the treatment effect and X_ij1 the treatment variable. ρ is the correlation coefficient between the two frailty terms.
The variance σ² of u_i represents the heterogeneity between trials of the overall underlying baseline risk, and the variance τ² of w_i represents the heterogeneity between trials of the overall treatment effect β_1. If σ² = 0, then observations from the same trial are independent. A larger variance implies greater heterogeneity across trials and a greater correlation of the survival times for individuals belonging to the same trial. A null variance τ² indicates no heterogeneity of the treatment effect over trials. In the full marginal log-likelihood expression for an additive frailty model, Φ = (λ_0(·), β, σ², τ², ρ). The derivation of this log-likelihood is detailed elsewhere (Rondeau et al. 2008). This marginal log-likelihood depends on integrals that have no analytical solution. We approximate it using the Laplace integration technique, which seems to provide good estimates.
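The displayed hazard function did not survive extraction; based on the definitions above (and on Rondeau et al. 2008), a reconstruction is:

```latex
\lambda_{ij}(t \mid u_i, w_i) = \lambda_0(t) \, \exp\!\left( u_i + w_i X_{ij1} + \sum_{k=1}^{p} \beta_k X_{ijk} \right),
\qquad
\begin{pmatrix} u_i \\ w_i \end{pmatrix} \sim \mathcal{N}\!\left( \mathbf{0},
\begin{pmatrix} \sigma^2 & \rho\sigma\tau \\ \rho\sigma\tau & \tau^2 \end{pmatrix} \right).
```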

Computational methods and estimation
The estimation method used in frailtypack is the maximization of the penalized log-likelihood (Joly et al. 1998; Rondeau et al. 2003).

Penalized likelihood
The penalized log-likelihood has different expressions according to the model.
For the Cox, shared, nested and additive frailty models, the full log-likelihood is penalized by the squared second derivative of the baseline hazard function, where κ is a positive smoothing parameter which controls the trade-off between the fit to the data and the smoothness of the function.
For a stratified analysis with a maximum of two strata (for instance, stratification on gender), the penalized log-likelihood involves λ_0m(·) and λ_0f(·), the baseline hazard functions for men and women respectively. Since these two baseline hazard functions are assumed to differ, two positive smoothing parameters are required: κ_1 for stratum 1 (men) and κ_2 for stratum 2 (women).
In the penalized log-likelihood for joint frailty models, two positive smoothing parameters (κ_1 and κ_2) are necessary, because a joint frailty model involves two hazard functions (r_0(·) for recurrent events and λ_0(·) for death). Stratified analyses are not allowed with this model.
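The displayed penalized log-likelihoods did not survive extraction; based on the descriptions above and on Rondeau et al. (2003), they plausibly read:

```latex
% Cox, shared, nested and additive frailty models:
pl(\Phi) = l(\Phi) - \kappa \int_0^{\infty} \lambda_0''(t)^2 \, dt
% Stratified analysis (two strata):
pl(\Phi) = l(\Phi) - \kappa_1 \int_0^{\infty} \lambda_{0m}''(t)^2 \, dt
                   - \kappa_2 \int_0^{\infty} \lambda_{0f}''(t)^2 \, dt
% Joint frailty model:
pl(\Phi) = l(\Phi) - \kappa_1 \int_0^{\infty} r_0''(t)^2 \, dt
                   - \kappa_2 \int_0^{\infty} \lambda_0''(t)^2 \, dt
```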
For a fixed value of the smoothing parameter, the maximization of the penalized likelihood provides estimators for Φ, the parameters of the model.

Maximization of the penalized likelihood
The estimated parameters are obtained with the robust Marquardt algorithm (Marquardt 1963), which is a combination of the Newton-Raphson algorithm and the steepest descent algorithm. This algorithm has the advantage of being more stable than the Newton-Raphson algorithm while preserving its fast convergence. To guarantee positive hazard functions at all stages of the algorithm, the variance of the frailties and the spline coefficients need to be positive; we ensure this positiveness by using a squared transformation. The vector Φ of the parameters is updated until convergence using an iterative expression in which the step δ is equal to 1 by default but can be modified to ensure that the likelihood improves at each iteration. The matrix H is a diagonal-inflated Hessian matrix, which ensures positive definiteness, and the gradient of the penalized log-likelihood is evaluated at the r-th iteration. The iterations stop when the difference between two consecutive log-likelihoods is small (< 10^-4), the coefficients are stable (< 10^-4) and the gradient is small enough (< 10^-6). The first and second derivatives are calculated by finite differences. After convergence, the standard errors of the estimates are obtained directly from the inverse of the Hessian matrix.
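The iterative update itself did not survive extraction; from the description above (step δ, diagonal-inflated Hessian H, penalized log-likelihood gradient at iteration r), it plausibly reads:

```latex
\Phi^{(r+1)} = \Phi^{(r)} - \delta \left( H^{(r)} \right)^{-1} \nabla L\!\left(\Phi^{(r)}\right)
```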
The integrals in the full log-likelihood expression do not always have an analytical solution. They are evaluated using Gauss-Laguerre quadrature with 20 points for the nested and joint frailty models, and using the Laplace integration method for the additive frailty models.

Approximation with splines
The estimator of the baseline hazard function λ_0(·) has no analytical solution, but it can be approximated on a basis of splines with Q knots. Cubic M-splines (piecewise polynomial functions of third order, combined linearly to approximate a function on an interval), a variant of cubic B-splines, are used. The second derivative of the baseline hazard function λ_0(·) is then approximated by a sum of polynomial functions of first order. This approximation allows flexible shapes of the hazard function while reducing the number of parameters. The more knots used, the closer the approximation is to the true hazard function. An approximate 95% pointwise confidence band for λ_0(·) is provided, where M(t) = (M_1(t), ..., M_m(t)) is the vector of M-splines and I_η = ∂²pl(η)/∂η².
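The spline expansion and the confidence band referred to above appear to have been lost in extraction; with m = Q + 2 basis functions (matching the n.knots + 2 spline functions mentioned in Section 3), a plausible reconstruction is:

```latex
\tilde{\lambda}_0(t) = \sum_{j=1}^{m} \hat{\eta}_j \, M_j(t),
\qquad
\tilde{\lambda}_0(t) \;\pm\; 1.96 \, \sqrt{ M(t)^{\top} \, \hat{I}_{\eta}^{-1} \, M(t) } .
```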

Parameter initialization
The frailtypack programs include several algorithms which need initial values. Choosing good initial values is very important for the maximization of the penalized likelihood: the closer the initial values are to the true values, the faster the convergence. Depending on the model used, we implemented different methods to provide initial values.
For a shared frailty model, the spline and regression coefficients are initialized to 0.1. The program first fits an adjusted Cox model to provide new initial values for the spline and regression coefficients. The variance of the frailty term θ is initialized to 0.1. Then, a shared frailty model is fitted.
For a joint frailty model, the spline and regression coefficients are initialized to 0.5. The program fits an adjusted Cox model to provide new initial values for the regression and spline coefficients. The variance of the frailty term θ and the coefficient α are initialized to 1. Then, a joint frailty model is fitted.
For a nested frailty model, the splines and the regression coefficients are initialized to 0.1. The program fits an adjusted Cox model to provide new initial values for the regression and the splines coefficients. The variances of the frailties are initialized to 0.1. Then, a shared frailty model with covariates considering only subgroup frailties is fitted to give a new initial value for the variance of the subgroup frailty term. Then, a shared frailty model with covariates and considering only group frailties is fitted to give a new initial value for the variance of the group frailties. In a last step, a nested frailty model is fitted.
For an additive frailty model, the spline and regression coefficients are initialized to 0.1. An adjusted Cox model is fitted to provide new initial values for the spline and regression coefficients. The variances of the frailties are initialized to 0.1. Then an additive frailty model with independent frailties is fitted. Finally, an additive frailty model with correlated frailties is fitted.
Another important point is the choice of the smoothing parameters.
How to estimate the smoothing parameter κ

The smoothing parameter can be fixed by the user or evaluated by an automatic method: the maximization of a likelihood cross-validation criterion (Joly et al. 1998) for the Cox model. This method provides an estimated value of the smoothing parameter by minimizing a criterion V̂(κ), where Φ̂_κ is the maximum penalized likelihood estimator, Ĥ_l is minus the converged Hessian matrix of the log-likelihood, Ĥ_pl is minus the converged Hessian matrix of the penalized log-likelihood and l(·) is the full log-likelihood. To minimize V̂(κ), we compute it over a wide range of κ values (to avoid a local minimum).
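The criterion itself did not survive extraction; following Joly et al. (1998) and the definitions above, it plausibly takes the form:

```latex
\widehat{V}(\kappa) = \frac{1}{n} \left[ \operatorname{tr}\!\left( \widehat{H}_{pl}^{-1} \, \widehat{H}_{l} \right) - l\!\left( \widehat{\Phi}_{\kappa} \right) \right]
```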
The cross-validation method is not implemented for more than one smoothing parameter. Thus, it cannot be used for a stratified analysis or for a joint frailty model.
Concerning the choice of the kappas for a joint frailty model, we first fit two shared frailty models with the cross-validation method: one with the recurrences as the event of interest and the other with death as the event of interest. The two κ values obtained are then used in the joint frailty model. If the terminal event survival curve is oversmoothed, we can set κ_2 = κ_1 (see modJoint.gap in Section 4.2). Choosing the kappas for a stratified model is based on the same idea: first, we fit a shared frailty model without stratification (see mod.sha.gap in Section 4.2) using the cross-validation method for κ; then this κ is used for both strata in the stratified shared frailty model (see mod.sha.str.gap in Section 4.2).
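As a hedged sketch (not verbatim code from the text), and assuming the fitted object stores its cross-validated smoothing parameter in a component such as $kappa (an illustrative field name), the two-step procedure for a joint frailty model could look like:

```r
# Step 1: two shared frailty models with cross-validated smoothing parameters,
# one for the recurrent events and one for the terminal event (death).
mod.rec <- frailtyPenal(Surv(time, event) ~ cluster(id) + chemo,
                        data = readmission, n.knots = 8, kappa1 = 1,
                        cross.validation = TRUE)
mod.dth <- frailtyPenal(Surv(time, death) ~ cluster(id) + chemo,
                        data = readmission, n.knots = 8, kappa1 = 1,
                        cross.validation = TRUE)

# Step 2: reuse the two cross-validated kappas in the joint frailty model.
# terminal() flags the death indicator; the argument names follow the
# package conventions described in Section 3.
mod.joint <- frailtyPenal(Surv(time, event) ~ cluster(id) + chemo +
                            terminal(death),
                          formula.terminalEvent = ~ chemo,
                          data = readmission, n.knots = 8,
                          kappa1 = mod.rec$kappa, kappa2 = mod.dth$kappa)
```

If the terminal event survival curve looks oversmoothed, kappa2 can simply be set equal to kappa1, as noted above.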

Frailtypack arguments
The two main functions of frailtypack are frailtyPenal, for shared, joint and nested frailty models, and additivePenal, for additive frailty models. Different arguments are proposed for each type of model. We describe here each of the arguments needed by these functions and explain how to set them to fit each type of model.

Arguments of the main functions
The standard code for fitting a Cox model or a shared, joint or nested frailty model uses the frailtyPenal function.

Common arguments for fitting a model:

formula: Indicates the model to be fitted. It differs according to the model (see the following subsections).
data: Indicates the name of the data file. The database structure is different according to the model (see Section 4.1).
recurrentAG: Logical value (TRUE or FALSE). If recurrentAG = TRUE, the counting process approach of Andersen and Gill (Andersen and Gill 1982) with a calendar timescale for recurrent event times is used. In a calendar timescale, the time corresponds to the time since entry/inclusion in the study; in a gap timescale, the time corresponds to the time between two recurrent events. The default is FALSE, in particular for recurrent events or clustered data with gap time as the timescale. This argument is always FALSE when an additive frailty model is fitted.
n.knots: The number of knots to use, corresponding to n.knots + 2 spline functions for the approximation of the baseline hazard or survival functions. The number of knots must be between 4 and 20. It is recommended to start with a small number of knots (for instance, n.knots = 7) and to increase it until the graph of the baseline hazard function remains unchanged.
kappa1: The smoothing parameter of the penalized likelihood.
kappa2: The second smoothing parameter, required if the analysis is stratified (only for Cox, shared, nested and additive frailty models). This parameter corresponds to the smoothing parameter for the second baseline hazard function. If a joint frailty model is fitted, it corresponds to the smoothing parameter for the death baseline hazard function (kappa1 being the smoothing parameter for the recurrent events baseline hazard function).

An example of a call for a nested frailty model is:

frailtyPenal(Surv(t1, t2, event) ~ cluster(group) + subcluster(subgroup) +
  cov1 + cov2, data = database, n.knots = 8, kappa1 = 50000,
  recurrentAG = TRUE)

The subcluster function identifies the subgroup levels; the cluster function from the survival package identifies the group levels.
Specific arguments for fitting an additive frailty model:

formula: Indicates the model to be fitted.

correlation: Logical value (TRUE or FALSE). Are the two random effects (u_i and w_i) correlated? If so, the correlation coefficient ρ is estimated. The default is TRUE.

Objects returned by frailtyPenal and additivePenal
The objects returned by frailtyPenal and additivePenal are detailed in the reference manual (see Value part), which is available along with the package or from the Comprehensive R Archive Network at http://CRAN.R-project.org/package=frailtypack.
Other available functions: plot, summary, and print

frailtypack includes R methods for summarizing, printing, and plotting objects of different classes depending on the model fitted. summary gives estimates and confidence intervals for the hazard ratios of each covariate. print provides a short summary of the parameter estimates. plot is useful for drawing the baseline survival or hazard functions for each type of model. For a joint frailty model, it is possible to plot the terminal event and/or the recurrent hazard or survival functions (see Section 4.4).

Illustrating examples
The Cox proportional hazards model and the shared frailty model have already been introduced in the first version of frailtypack (Rondeau and Gonzalez 2005).
To illustrate the models provided by frailtypack, three datasets are proposed: readmission, dataNested, dataAdditive. The first subsection explains how to build a dataset adapted to each model. Then, in the second subsection we describe the R code to fit the different models and the parameter estimates in the output obtained with the R function print. We also present the R function summary adapted to these models. Finally, we present the different options of the R function plot according to the model.

Build an adapted dataset
Several variables are necessary to fit the models provided by frailtypack.
Cox, shared frailty, and joint frailty models

The following variables are common to all these models (see Table 1). For instance, when studying recurrent events with a Cox model or a shared frailty model, it is necessary to specify the identification number of the subject (id), an event indicator (event) and some covariates (e.g., chemo, sex, dukes, charlson) for each observation. If the interest is in the duration between two events, the gap-time timescale (i.e., recurrentAG = FALSE) is appropriate and a variable (time) that indicates the duration between two events is needed. If the focus is on the occurrence of events during the follow-up, the calendar-time timescale (i.e., recurrentAG = TRUE) must be used, and two variables such as time.start and time.stop are needed. time.start indicates the time at which the subject entered the study, or the time at which the event last occurred if it is not its first occurrence. time.stop corresponds to the time when the event occurs, or to the end of the follow-up when there is no further occurrence. For instance, patient 2 entered the study at time time.start = 0, then developed an event at time time.stop = 489 and was then censored at time time.stop = 1182. In a joint frailty model, we fit the terminal event rate and the recurrent event rate jointly.
The dataset readmission is used for the Cox, shared and joint frailty models. It contains rehospitalization data of patients diagnosed with colorectal cancer. The data describe the calendar time (in days) of the successive hospitalizations after the date of surgery. The first readmission time was defined as the time between the date of the surgical procedure and the first rehospitalization after discharge related to colorectal cancer. Each subsequent readmission time was defined as the difference between the current hospitalization date and the previous discharge date. A total of 861 rehospitalization events were recorded for the 403 patients included in the analysis. Several readmissions can occur for the same patient, and an individual frailty may influence the occurrence of subsequent rehospitalizations.

Nested frailty model
In the nested frailty model, we use two levels of clustering. We need to specify the group and the subgroup of each subject, via the variables group and subgroup in dataNested (see Table 2). t1 is the time of entry into the study (a left-truncation time), and t2 corresponds to the time of the event if the event indicator event equals 1; otherwise t2 corresponds to the end of follow-up. var1 and var2 are generated covariates. This dataset, included in the package, contains a simulated sample of 400 observations divided into 20 clusters, with 4 subgroups per cluster and 5 subjects in each subgroup, in order to obtain two levels of grouping. Two independent gamma frailties were generated, with variance fixed at 0.1 for the cluster effect and at 0.5 for the subcluster effect. Independent survival times were generated from a Weibull baseline risk function. The percentage of censored data was around 30 per cent. The right-censoring variables were generated from a uniform distribution on [1, 36] and a left-truncation variable was generated from a uniform distribution on [0, 10]. Observations were included only if the survival time was greater than the truncation time.

Additive frailty model
In an additive frailty model, we are interested in the heterogeneity across trials. We need to specify the treatment variable; here the treatment variable is var1 (see Table 3). This model does not allow left-truncated data, so the entry time t1 for each subject has to be equal to zero, and only the gap-time timescale can be used (i.e., recurrentAG = FALSE). To illustrate this model we use a generated dataset named dataAdditive, included in the package.
This dataset contains a simulated sample of 100 clusters with 100 subjects in each cluster, mimicking a compilation of clinical trial databases. Two correlated centred Gaussian random effects were generated, with the same variance fixed at 0.3 and the covariance at -0.2. The regression coefficient β was fixed at -0.11. The percentage of right-censored data is around 30 percent; the censoring times were generated from a uniform distribution on [1, 150]. Independent survival times were generated from a Weibull baseline risk function.
Then we can fit the adapted model.

Fit the different models and print the parameter estimates
The objective of the study described in Gonzalez et al. (2005) is to analyse the hospital readmission times related to colorectal cancer after a surgical procedure. We first estimate the hazard ratios of readmission time for the covariates analysed using a Cox proportional hazards model, then using a shared frailty model and finally using a joint frailty model. Before fitting the models, we need to load the package frailtypack and the dataset readmission. This step is necessary for every model. An extract from this dataset is provided in Table 1 of the previous Section 4.1.

Cox model
The following lines give the R code for fitting a Cox model. We use cross-validation (cross.validation = TRUE) for the smoothing parameter because it provides smoother functions (hazard or survival).
R> mod.cox.gap <- frailtyPenal(Surv(time, event) ~ cluster(id) + dukes +
+    charlson + sex + chemo, Frailty = FALSE, n.knots = 10, kappa1 = 1,
+    data = readmission, cross.validation = TRUE)

With the print function, the parameter estimates of the model can be displayed:

Exact number of knots used: 10
Best smoothing parameter estimated by an approximated cross-validation: 208159031, DoF: 11.00

The variance of the frailty term theta is significantly different from 0, meaning that there is heterogeneity between subjects. We can deduce this by using a modified Wald test: W_m(θ) = 0.672/0.142 = 4.73 > 1.64, with 1.64 the critical value for a one-sided normal test. The modified Wald test (W_m) is a significance test for the variance of a random effects distribution lying on the boundary of the parameter space. The usual squared Wald statistic is simplified to a mixture of two distributions, and hence the critical values must be derived from this mixture (Molenberghs and Verbeke 2007). We have a p value < 0.05 for the covariates dukes = 3, charlson = 3 and sex. This suggests a higher risk of rehospitalization for men with a Dukes stage of 3 and a Charlson index of 3.

Shared frailty model with stratification
Stratified analysis can be conducted with frailtypack if the stratification variable has two levels (the maximum number of strata is 2). We need to set the values of the two smoothing parameters (kappa1 and kappa2) and introduce the function strata in the model formula. As the cross-validation method is not available in this case, we use the kappa estimated in the previous model mod.sha as the initial value of the kappas in the stratified model. This stratification procedure is the same for all models. We will see below that stratification is allowed for nested and additive frailty models, but not for joint frailty models.
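A minimal sketch of such a call, assuming we stratify the readmission data by sex and reuse the smoothing parameter estimated in the non-stratified shared frailty model (the value of kappa.sha below is illustrative, not the one reported in the text):

```r
library(frailtypack)
data(readmission)

# Illustrative initial value, reused from the non-stratified model mod.sha
kappa.sha <- 208159031

# Shared frailty model stratified by sex: strata(sex) replaces sex as a
# covariate, and one smoothing parameter is supplied per stratum
mod.sha.str <- frailtyPenal(Surv(time, event) ~ cluster(id) + dukes +
    charlson + strata(sex) + chemo, Frailty = TRUE, n.knots = 10,
    kappa1 = kappa.sha, kappa2 = kappa.sha, data = readmission)
```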
In this model, for instance, chemotherapy does not seem to be a useful treatment for decreasing relapses (p = 0.29). However, chemotherapy was positively associated with death, i.e., people treated with chemotherapy have a higher risk of death, probably because they had a more severe form of the disease in the first place (p = 0.000022 < 0.05, with a hazard ratio of 2.99). Similar conclusions about the influence of chemotherapy or gender are not possible with the previous models (Cox, shared or stratified shared): only one event of interest is analysed with these models, whereas joint frailty models link two events of interest. The output also reports the estimated variance of the frailty term.

The likelihood cross-validation criterion (LCV) is adopted here to guide the choice of the model used in the analysis. As LCV is particularly computationally demanding when n is large, an approximate version LCV_a was proposed by O'Sullivan (1988) for the estimation of the hazard function in a survival setting and adapted recently by Commenges et al. (2007). Lower values of LCV_a indicate a better-fitting model. The LCV_a is defined as:

LCV_a = (1/n) { tr(H_pl^{-1} H_l) − l(·) }

with H_pl minus the converged Hessian of the penalized log-likelihood, H_l minus the converged Hessian of the log-likelihood, and l(·) the full log-likelihood. In the case of a parametric approach, tr(H_pl^{-1} H_l) represents the number of parameters and LCV_a is approximately equivalent to the AIC criterion. The LCV_a for parametric approaches is then defined as:

LCV_a = (1/n) { np − l(·) }

with np the total number of parameters. The LCV_a criterion is computed by default in frailtypack.
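The joint analysis discussed above can be sketched with frailtyPenal by flagging the terminal event with terminal() and supplying a separate formula for death via formula.terminalEvent. The smoothing parameters below are illustrative placeholders, not the values used in the text:

```r
library(frailtypack)
data(readmission)

# Joint frailty model: recurrent hospital readmissions and death are analysed
# jointly. terminal(death) flags the informative terminal event, and
# formula.terminalEvent gives the covariates acting on the death hazard.
# The kappa values are illustrative placeholders.
modJoint.gap <- frailtyPenal(Surv(time, event) ~ cluster(id) + dukes +
    charlson + sex + chemo + terminal(death),
    formula.terminalEvent = ~ dukes + charlson + sex + chemo,
    data = readmission, n.knots = 8, kappa1 = 1e10, kappa2 = 1e10,
    joint = TRUE)
```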
For instance, when comparing the two previously fitted shared frailty models (non-stratified vs. stratified), we observe similar results: LCV_a = 3.78 vs. LCV_a = 3.79. The gain of using a joint model instead of a shared model can be evaluated by comparing the LCV_a = 4.84 from the joint frailty model with the LCV_a computed by pooling the likelihoods of the two shared frailty models (one for the recurrent event and one for death), using the total number of parameters in these two models: LCV_a = 3.78 + 1.02 = 4.80. In our case, the LCV_a from the joint model was not better.
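The comparison above is simple arithmetic on the reported criteria (values taken from the text):

```r
# LCV_a values reported in the text
lcv.joint  <- 4.84          # joint frailty model
lcv.pooled <- 3.78 + 1.02   # two separate shared models, pooled

# Lower LCV_a indicates a better fit
lcv.pooled                  # 4.80
lcv.joint < lcv.pooled      # FALSE: the joint model does not fit better here
```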

Nested frailty model
The following code is an example for fitting a nested frailty model. An extract from the dataset dataNested is provided in
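A sketch of such a call, assuming dataNested contains a two-level clustering structure (group, with subgroup nested within it), gap times t1 and t2, an event indicator, and covariates cov1 and cov2; the smoothing parameter is an illustrative value:

```r
library(frailtypack)
data(dataNested)

# Nested frailty model: cluster() gives the upper clustering level and
# subcluster() the level nested within it. kappa1 is an illustrative value.
mod.nested <- frailtyPenal(Surv(t1, t2, event) ~ cluster(group) +
    subcluster(subgroup) + cov1 + cov2, data = dataNested,
    n.knots = 8, kappa1 = 50000)
```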

Additive frailty models with no correlation between the random effects
In the model modAdd2cov.withoutCorr, we suppose that the random effects are not correlated (correlation = FALSE). An extract from the dataset dataAdditive is provided in Table 3 of the previous section.
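A sketch of this fit, assuming the additive model is fitted with the package's additivePenal function on dataAdditive, with var1 entering both as a fixed effect and through a random slope via slope(var1); the smoothing parameter is an illustrative value:

```r
library(frailtypack)
data(dataAdditive)

# Additive frailty model with a random trial effect and a random slope on
# var1, assumed independent (correlation = FALSE). kappa1 is illustrative.
modAdd2cov.withoutCorr <- additivePenal(Surv(t1, t2, event) ~
    cluster(group) + var1 + slope(var1), correlation = FALSE,
    data = dataAdditive, n.knots = 8, kappa1 = 10000)
```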

Additive frailty models with a correlation between the random effects
If we suppose that the two random effects are correlated, we simply set the option correlation = TRUE.
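A sketch of the correlated version, again assuming the additivePenal function and the dataAdditive variables t1, t2, event, group and var1, with an illustrative smoothing parameter:

```r
library(frailtypack)
data(dataAdditive)

# Same additive model, now allowing the random intercept and the random
# slope on var1 to be correlated. kappa1 is an illustrative value.
modAdd2cov.withCorr <- additivePenal(Surv(t1, t2, event) ~
    cluster(group) + var1 + slope(var1), correlation = TRUE,
    data = dataAdditive, n.knots = 8, kappa1 = 10000)
```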

Print the hazard ratios
The package allows the user to estimate hazard ratios with the R function named summary, e.g., summary(model, level = 0.95), where the option level = 0.95 indicates the confidence level (here 95 percent). For instance, the summary function for the joint frailty model gives the following (truncated) output:

R> summary(modJoint.gap, level = 0.95)
Recurrences:
------------
hr 95% C.
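As a usage note, the confidence level can be changed through the level argument; for instance, 90 percent intervals for a previously fitted joint frailty model object (here assumed to be modJoint.gap):

```r
# 90 percent confidence intervals for the hazard ratios
summary(modJoint.gap, level = 0.90)
```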

Conclusion
The new version of the package frailtypack makes it possible to analyse correlated survival data using shared, nested, joint and additive frailty models, or the Cox model (Rondeau et al. 2012). These models can be used when survival data are clustered into several levels, when the terminal event is considered as informative censoring, and for meta-analyses or multicentric datasets. This article shows how to build the database according to the model, how to write the code for obtaining parameter estimates, hazard ratios and hazard or survival function curves, and how to interpret the results.
By developing the frailtypack package in R, we hope to have provided a useful, easy and pertinent tool that addresses many biomedical issues. We recently included parametric hazard functions, and we plan to add prediction methods in the near future and to extend the package regularly with more functions and other frailty models.