flexsurv: A Platform for Parametric Survival Modelling in R

flexsurv is an R package for fully-parametric modelling of survival data. Any parametric time-to-event distribution may be fitted if the user supplies a probability density or hazard function, and ideally also their cumulative versions. Standard survival distributions are built in, including the three and four-parameter generalized gamma and F distributions. Any parameter of any distribution can be modelled as a linear or log-linear function of covariates. The package also includes the spline model of Royston and Parmar (2002), in which both baseline survival and covariate effects can be arbitrarily flexible parametric functions of time. The main model-fitting function, flexsurvreg , uses the familiar syntax of survreg from the standard survival package (Therneau 2014). Censoring or left-truncation are specified in Surv objects. The models are fitted by maximising the full log-likelihood, and estimates and confidence intervals for any function of the model parameters can be printed or plotted. flexsurv also provides functions for fitting and predicting from fully-parametric multi-state models, and connects with the mstate package (de Wreede et al. 2011). This article explains the methods and design principles of the package, giving several worked examples of its use. [Note: A version of this vignette is published as Jackson (2016) in Journal of Statistical Software. All content there is included here. There have been no substantial changes in the survival modelling parts since then. Version 2.0 of flexsurv added new features for multi-state modelling, and since that version, multi-state modelling with flexsurv has been described in a separate vignette.]


Motivation and design
The Cox model for survival data is ubiquitous in medical research, since the effects of predictors can be estimated without needing to supply a baseline survival distribution that might be inaccurate.However, fully-parametric models have many advantages, and even the originator of the Cox model has expressed a preference for parametric modelling (see Reid 1994).Fully-specified models can be more convenient for representing complex data structures and processes (Aalen et al. 2008), e.g.hazards that vary predictably, interval censoring, frailties, multiple responses, datasets or time scales, and can help with out-of-sample prediction.For example, the mean survival E(T ) = ∞ 0 S(t)dt, used in health economic evaluations (Latimer 2013), needs the survivor function S(t) to be fully-specified for all times t, and parametric models that combine data from multiple time periods can facilitate this (Benaglia et al. 2014).
flexsurv for R (R Core Team 2014) allows parametric distributions of arbitrary complexity to be fitted to survival data, gaining the convenience of parametric modelling, while avoiding the risk of model misspecification.Built-in choices include spline-based models with any number of knots (Royston and Parmar 2002) and 3-4 parameter generalized gamma and F distribution families.Any user-defined model may be employed by supplying at minimum an R function to compute the probability density or hazard, and ideally also its cumulative form.Any parameters may be modelled in terms of covariates, and any function of the parameters may be printed or plotted in model summaries.
flexsurv is intended as a general platform for survival modelling in R. The survreg function in the R package survival (Therneau 2014) only supports two-parameter (location/scale) distributions, though users can supply their own distributions if they can be parameterised in this form.Some other contributed R packages can fit survival models, e.g., eha (Broström 2014) and VGAM (Yee and Wild 1996), though these are either limited to specific distribution families, or not specifically designed for survival analysis.Others, e.g.ActuDistns (Nadarajah and Bakar 2013), contain only the definitions of distribution functions.flexsurv enables such functions to be used in survival models.
It is similar in spirit to the Stata packages stpm2 (Lambert and Royston 2009) for spline-based survival modelling, and stgenreg (Crowther and Lambert 2013) for fitting survival models with user-defined hazard functions using numerical integration.Though in flexsurv, slow numerical integration can be avoided if the analytic cumulative distribution or hazard can be supplied, and optimisation can also be speeded by supplying analytic derivatives.flexsurv also has features for multi-state modelling and interval censoring, and general output reporting.It employs functional programming to work with user-defined or existing R functions.§2 explains the general model that flexsurv is based on.§3 gives examples of its use for fitting built-in survival distributions with a fixed number of parameters, and §4 explains how users can define new distributions.§5 concentrates on classes of models where the number of parameters can be chosen arbitrarily, such as splines.§6 mentions the use of flexsurv for fitting and predicting from fully-parametric multi-state models, which is described more fully in a separate vignette.Finally §7 suggests some potential future extensions.

General parametric survival model
The general model that flexsurv fits has probability density for death at time t: f (t|µ(z), α(z)), t ≥ 0 (1) The cumulative distribution function F (t), survivor function S(t) = 1 − F (t), cumulative hazard H(t) = − log S(t) and hazard h(t) = f (t)/S(t) are also defined (suppressing the conditioning for clarity).µ = α 0 is the parameter of primary interest, which usually governs the mean or location of the distribution.Other parameters α = (α 1 , . . ., α R ) are called "ancillary" and determine the shape, variance or higher moments.
Covariates All parameters may depend on a vector of covariates z through link-transformed linear models g 0 (µ(z)) = γ 0 + β ⊤ 0 z and g r (α r (z)) = γ r + β ⊤ r z. g() will typically be log() if the parameter is defined to be positive, or the identity function if the parameter is unrestricted.
Suppose that the location parameter, but not the ancillary parameters, depends on covariates.If the hazard function factorises as h(t|α, µ(z)) = µ(z)h 0 (t|α), then this is a proportional hazards (PH) model, so that the hazard ratio between two groups (defined by two different values of z) is constant over time t.Alternatively, if S(t|µ(z), α) = S 0 (µ(z)t|α) then it is an accelerated failure time (AFT) model, so that the effect of covariates is to speed or slow the passage of time.For example, doubling the value of a covariate with coefficient β = log(2) would give half the expected survival time.
Data and likelihood Let t i : i = 1, . . ., n be a sample of times from individuals i.Let c i = 1 if t i is an observed death time, or c i = 0 if this is censored.Most commonly, t i may be right-censored, thus the true death time is known only to be greater than t i .More generally, the survival time may be interval-censored on (t min i , t max i ).
Also let s i be corresponding left-truncation (or delayed-entry) times, meaning that the ith survival time is only observed conditionally on the individual having survived up to s i , thus s i = 0 if there is no left-truncation.
With at most right-censoring, the likelihood for the parameters θ = {γ, β} in Equation 1, given the corresponding data vectors, is where , and µ, α are related to γ, β and z i via the link functions defined above.The log-likelihood also has a concise form in terms of hazards and cumulative hazards, as With interval-censoring, the likelihood is These likelihoods assume that the times of censoring are fixed or otherwise distributed independently of the parameters θ that govern the survival times (see, e.g.Aalen et al. (2008)).
The individual survival times are also independent, so that flexsurv does not currently support shared frailty, clustered or random effects models (see §7).
The parameters are estimated by maximising the full log-likelihood with respect to θ, as detailed further in §3.6.

Fitting standard parametric survival models
An example dataset used throughout this paper is from 686 patients with primary node positive breast cancer, available in the package as bc.This was originally provided with stpm (Royston 2001), and analysed in much more detail by Sauerbrei and Royston (1999) and Royston and Parmar (2002)  The main model-fitting function is called flexsurvreg.Its first argument is an R formula object.The left hand side of the formula gives the response as a survival object, using the Surv function from the survival package.
R> fs1 <-flexsurvreg(Surv(recyrs, censrec) ~group, data = bc, + dist = "weibull") Here, this indicates that the response variable is recyrs.This represents the time (in years) of death or cancer recurrence when censrec is 1, or (right-)censoring when censrec is 0. The covariate group is a factor representing a prognostic score, with three levels "Good" (the baseline), "Medium" and "Poor".All of these variables are in the data frame bc.If the argument dist is a string, this denotes a built-in survival distribution.In this case we fit a Weibull survival model.
Printing the fitted model object gives estimates and confidence intervals for the model parameters and other useful information.Note that these are the same parameters as represented by the R distribution function dweibull: the shape α and the scale µ of the survivor function S(t) = exp(−(t/µ) α ), and group has a linear effect on log(µ).

R> fs1
Call The maximised log-likelihoods are the same, however the parameterisation is different: the first coefficient (Intercept) reported by survreg is log(µ), and survreg's "scale" is dweibull's (thus flexsurvreg)'s 1 / shape.The covariate effects β, however, have the same "accelerated failure time" interpretation, as linear effects on log(µ).The multiplicative effects exp(β) are printed in the output as exp(est).
The same model can be fitted in eha, also by maximum likelihood, as R> library(eha) R> aftreg(Surv(recyrs, censrec) ~group, data = bc, dist = "weibull") The results are presented in the same parameterisation as flexsurvreg, except that the shape and scale parameters are log-transformed, and (unless the argument param="lifeExp" is supplied) the covariate effects have the opposite sign.

Truncation and time-dependent covariates
If we also had left-truncation times in a variable called start, the response would be Surv (start, recyrs, censrec).Or if all responses were interval-censored between lower and upper bounds tmin and tmax, then we would write Surv(tmin, tmax, type = "interval2").Time-dependent covariates are not supported in flexsurv.In other packages, they are represented in "counting process" form -as a series of left-truncated survival times, which may also be right-censored.For each individual there would be multiple records, each corresponding to an interval where the covariate is assumed to be constant.The response would be of the form Surv(start, stop, censrec), where start and stop are the limits of each interval, and censrec indicates whether a death was observed at stop.However, using this approach would require the probability of survival up to the left-truncation time to be represented by a term of the form S i (s i ) = exp(−H i (s i )) in the likelihood (equation 2).The cumulative hazard H over the interval from time 0 to time s i depends on how the covariates change for an individual on this time interval, and flexsurv cannot keep track of different observations for an individual.Furthermore, prediction from models with time-dependent covariates is a difficult problem, as it requires knowledge of when the covariates are expected to change.Joint models for longitudinal and survival data are a more flexible approach for modelling the association of a survival time with a time-dependent predictor.
In versions of flexsurv since April 2020, models with individual-specific right-truncation times are also supported.These are used for situations with "retrospective ascertainment", where cases are only included in the data if they have died by a specific time.These models are specified through an argument rtrunc to flexsurvreg that names the variable with the truncation times.See the Supplementary Examples vignette for a worked example.

Relative survival
In relative survival models (Nelson et al. 2007), the survivor function is expressed as S(t) = S * (t)R(t), where S * (t) is the "expected" or "baseline" survival, and R(t) is the relative survival.Equivalently, the hazard is defined as h(t) = h * (t) + λ(t), where h * () is the baseline hazard function, and λ(t) is the excess mortality rate associated with the disease of interest.The baseline represents a reference population, and is typically obtained from national routinelycollected mortality statistics, adjusted (e.g. by age/sex) to represent the population under study.The parametric model is defined and estimated for R(t).
These models are implemented in flexsurv by supplying the variable in the data that represents the expected mortality rate h * (t) in the bhazard argument to flexsurvreg.This is only used for the individuals in the data who die, and bhazard describes the expected hazard at the death time.The values of bhazard for censored individuals are ignored.
Note that the parameters returned in the model fitted by flexsurvreg refer to the relative survival R(t), rather than the absolute survival.The likelihood returned by flexsurvreg here is a partial likelihood defined (as in Nelson et al. 2007, equations 4-5) by omitting the term i log(S * (t i )) (summed over all individuals i in the data, including both censored and uncensored times t i ) from the full likelihood.This term is equivalent to minus the sum of the cumulative hazards.It can be omitted from the likelihood for the purpose of estimating the parameters of the relative survival model, since it does not depend on these parameters.Hence if a full likelihood is required, (e.g. for model comparison) then this term should be added to the partial likelihood.
Similarly, the predicted survival or hazard (e.g. as returned by summary.flexsurvreg,see Section 3.4) from a relative survival model refers to R(t) or h(t).Hence if the overall survival or hazard is required, the predictions of relative survival should be converted to the "absolute" scale by combining with the baseline, though no specific tools for doing this are provided by flexsurv.

Weighting and subsetting
Case weights and data subsets can also be specified, as in standard R modelling functions, using weights or subset arguments.

Built-in models
flexsurvreg's currently built-in distributions are listed in Table 1.In each case, the probability density f () and parameters of the fitted model are taken from an existing R function of the same name but beginning with the letter d.For the Weibull, exponential (dexp), gamma (dgamma) and log-normal (dlnorm), the density functions are provided with standard installations of R.These density functions, and the corresponding cumulative distribution functions (with first letter p instead of d) are used internally in flexsurvreg to compute the likelihood.
flexsurv provides some additional survival distributions, including a Gompertz distribution with unrestricted shape parameter, Weibull with proportional hazards parameterisation, loglogistic, and the three-and four-parameter families described below.For all built-in distributions, flexsurv also defines functions beginning with h giving the hazard, and H for the cumulative hazard.
A package vignette "Distributions reference" lists the survivor function and parameterisation of covariate effects used by each built-in distribution.
Generalized gamma This three-parameter distribution includes the Weibull, gamma and log-normal as special cases.The original parameterisation from Stacy (1962) is available as dist = "gengamma.orig",however the newer parameterisation (Prentice 1974) is preferred: dist = "gengamma".This has parameters (µ,σ,q), and survivor function where is the incomplete gamma function (the cumulative gamma distribution with shape γ and scale 1), Φ is the standard normal cumulative distribution, u = γ exp(|q|z), z = (log(t) − µ)/σ, and γ = q −2 .The Prentice (1974) parameterisation extends the original one to include a further class of models with negative q, and survivor function I(γ, u), where z is replaced by −z.This stabilises estimation when the distribution is close to log-normal, since q = 0 is no longer near the boundary of the parameter space.In R notation, 2 the parameter values corresponding to the three special cases are dgengamma(x, mu, sigma, Q=0) == dlnorm(x, mu, sigma) dgengamma(x, mu, sigma, Q=1) == dweibull(x, shape = 1 / sigma, scale = exp(mu)) dgengamma(x, mu, sigma, Q=sigma) == dgamma(x, shape = 1 / sigma^2, rate = exp(-mu) / sigma^2) Generalized F This four-parameter distribution includes the generalized gamma, and also the log-logistic, as special cases.The variety of hazard shapes that can be represented is discussed by Cox (2008).It is provided here in alternative "original" (dist = "genf.orig")and "stable" parameterisations (dist = "genf") as presented by Prentice (1975).See help(GenF) and help(GenF.orig) in the package documentation for the exact definitions.

Covariates on ancillary parameters
The generalized gamma model is fitted to the breast cancer survival data.fs2 is an AFT model, where only the parameter µ depends on the prognostic covariate group.In a second model fs3, the first ancillary parameter sigma (α 1 ) also depends on this covariate, giving a model with a time-dependent effect that is neither PH nor AFT.The second ancillary parameter Q is still common between prognostic groups.

Plotting outputs
The plot() method for flexsurvreg objects is used as a quick check of model fit.By default, this draws a Kaplan-Meier estimate of the survivor function S(t), one for each combination of categorical covariates, or just a single "population average" curve if there are no categorical covariates (Figure 1).The corresponding estimates from the fitted model are overlaid.Fitted values from further models can be added with the lines() method.
The argument type = "hazard" can be set to plot hazards from parametric models against kernel density estimates obtained from muhaz (Hess 2010;Mueller and Wang 1994).Figure 2 shows more clearly that the Weibull model is inadequate for the breast cancer data: the hazard must be increasing or decreasing -while the generalized gamma can represent the increase and subsequent decline in hazard seen in the data.Similarly, type = "cumhaz" plots cumulative hazards.
The numbers plotted are available from the summary.flexsurvreg()method.Confidence intervals are produced by simulating a large sample from the asymptotic normal distribution of the maximum likelihood estimates of {β r : r = 0, . . ., R} (Mandel 2013) normboot.flexsurvreg.This very general method allows confidence intervals to be obtained for arbitrary functions of the parameters, as described in the next section.
In this example, there is only a single categorical covariate, and the plot and summary methods return one observed and fitted trajectory for each level of that covariate.For more complicated models, users should specify what covariate values they want summaries for, rather than relying on the default 3 .This is done by supplying the newdata argument, a data frame or list containing covariate values, just as in standard R functions like predict.lm.Timedependent covariates are not understood by these functions.
This plot() method is only for casual exploratory use.For publication-standard figures, it is preferable to set up the axes beforehand (plot(..., type = "n")), and use the lines() methods for flexsurvreg objects, or construct plots by hand using the data available from summary.flexsurvreg().

Custom model summaries
Any function of the parameters of a fitted model can be summarised or plotted by supplying the argument fn to summary.flexsurvreg or plot.flexsurvreg.This should be an R function, with optional first two arguments t representing time, and start representing a left-truncation point (if the result is conditional on survival up to that time).Any remaining arguments must be the parameters of the survival distribution.For example, median survival under the Weibull model fs1 can be summarised as follows R> median.weibull<-function(shape, scale) { 3 If there are only factor covariates, all combinations are plotted.If there are any continuous covariates, these methods by default return a "population average" curve, with the linear model design matrix set to its average values, including the 0/1 contrasts defining factors, which doesn't represent any specific covariate combination.+ qweibull(0.5, shape = shape, scale = scale) + } R> summary(fs1, fn = median.weibull,t = 1, B = 10000) Although the median of the Weibull has an analytic form as µ log(2) 1/α , the form of the code  given here generalises to other distributions.The argument t (or start) can be omitted from median.weibull, because the median is a time-constant function of the parameters, unlike the survival or hazard.
10000 random samples are drawn to produce a slightly more precise confidence interval than the default -users should adjust this until the desired level of precision is obtained.A useful future extension of the package would be to employ user-supplied (or built-in) derivatives of summary functions if possible, so that the delta method can be used to obtain approximate confidence intervals without simulation.

Computation
The likelihood is maximised in flexsurvreg using the optimisation methods available through the standard R optim function.By default, this is the "BFGS" method (Nash 1990) using the analytic derivatives of the likelihood with respect to the model parameters, if these are available, to improve the speed of convergence to the maximum.These derivatives are built-in for the exponential, Weibull, Gompertz, log-logistic, and hazard-and odds-based spline models (see §5.1).For custom distributions (see §4), the user can optionally supply functions with names beginning "DLd" and "DLS" respectively (e.g., DLdweibull, DLSweibull) to calculate the derivatives of the log density and log survivor functions with respect to the transformed baseline parameters γ (then the derivatives with respect to β are obtained automatically).
Arguments to optim can be passed to flexsurvreg -in particular, control options, such as convergence tolerance, iteration limit or function or parameter scaling, may need to be adjusted to achieve convergence.

Custom survival distributions
flexsurv is not limited to its built-in distributions.Any survival model of the form (1-3) can be fitted if the user can provide either the density function f () or the hazard h().Many contributed R packages provide probability density and cumulative distribution functions for positive distributions.Though survival models may be more naturally characterised by their hazard function, representing the changing risk of death through time.For example, for survival following major surgery we may want a "U-shaped" hazard curve, representing a high risk soon after the operation, which then decreases, but increases naturally as survivors grow older.
To supply a custom distribution, the dist argument to flexsurvreg is defined to be an R list object, rather than a character string.The list has the following elements.
name Name of the distribution.In the first example below, we use a log-logistic distribution, and the name is "llogis" 4 .Then there is assumed to be at least either a function to compute the probability density, which would be called dllogis here, or a function to compute the hazard, called hllogis.
There should also be a function called pllogis for the cumulative distribution (if d is given), or H for the cumulative hazard (to complement h), if analytic forms for these are available.If not, then flexsurv can compute them internally by numerical integration, as in stgenreg (Crowther and Lambert 2013).The default options of the built-in R routine integrate for adaptive quadrature are used, though these may be changed using the integ.optsargument to flexsurvreg.Models specified this way will take an order of magnitude more time to fit, and the fitting procedure may be unstable.An example is given in §5.2.
These functions must be vectorised, and the density function must also accept an argument log, which when TRUE, returns the log density.See the examples below.
In some cases, R's scoping rules may not find the functions in the working environment.They may then be supplied through the dfns argument to flexsurvreg.
pars Character vector naming the parameters of the distribution µ, α 1 , ..., α R .These must match the arguments of the R distribution function or functions, in the same order.
location Character: quoted name of the location parameter µ.The location parameter will not necessarily be the first one, e.g., in dweibull the scale comes after the shape.
transforms A list of functions g() which transform the parameters from their natural ranges to the real line, for example, c(log, identity) if the first is positive and the second unrestricted.5 inv.transforms List of corresponding inverse functions.
inits A function which provides plausible initial values of the parameters for maximum likelihood estimation.This is optional, but if not provided, then each call to flexsurvreg must have an inits argument containing a vector of initial values, which is inconvenient.Implausible initial values may produce a likelihood of zero, and a fatal error message (initial value in 'vmmin' is not finite) from the optimiser.
Each distribution will ideally have a heuristic for initialising parameters from summaries of the data.For example, since the median of the Weibull is µ log(2) 1/α , a sensible estimate of µ might be the median log survival time divided by log(2), with α = 1, assuming that in practice the true value of α is not far from 1. Then we would define the function, of one argument t giving the survival or censoring times, returning the initial values for the Weibull shape and scale respectively6 .
More complicated initial value functions may use other data such as the covariate values and censoring indicators: for an example, see the function flexsurv.splineinits in the package source that computes initial values for spline models ( §5.1).
Example: Using functions from a contributed package The following custom model uses the log-logistic distribution functions (dllogis and pllogis) available in the package eha.The survivor function is S(t|µ, α) = 1/(1 + (t/µ) α ), so that the log odds log((1 − S(t))/S(t)) of having died are a linear function of log time.
Example: Wrapping functions from a contributed package Sometimes there may be probability density and similar functions in a contributed package, but in a different format.For example, eha also provides a three-parameter Gompertz-Makeham distribution with hazard h(t|µ, α 1 , α 2 ) = α 2 + α 1 exp(t/µ).The shape parameters α 1 , α 2 are provided to dmakeham as a vector argument of length two.However, flexsurvreg expects distribution functions to have one argument for each parameter.Therefore we write our own functions that wrap around the third-party functions.

Arbitrary-dimension models
flexsurv also supports models where the number of parameters is arbitrary.In the models discussed previously, the number of parameters in the model family is fixed (e.g., three for the generalized gamma).In this section, the model complexity can be chosen by the user, given the model family.We may want to represent more irregular hazard curves by more flexible functions, or use bigger models if a bigger sample size makes it feasible to estimate more parameters.

Royston and Parmar spline model
In the spline-based survival model of Royston and Parmar (2002), a transformation g(S(t, z)) of the survival function is modelled as a natural cubic spline function of log time: g(S(t, z)) = s(x, γ) where x = log(t).This model can be fitted in flexsurv using the function Normal / probit Φ −1 (S(t, z)) Table 2: Alternative modelling scales for flexsurvspline, and equivalent distributions for m = 0 (with parameter definitions as in the R d functions referred to elsewhere in the paper).

Spline parameterisation
The complexity of the model, thus the dimension of γ, is governed by the number of knots in the spline function s().Natural cubic splines are piecewise cubic polynomials defined to be continuous, with continuous first and second derivatives at the knots, and also constrained to be linear beyond boundary knots k min , k max .As well as the boundary knots there may be up to m ≥ 0 internal knots k 1 , . . ., k m .Various spline parameterisations exist -the one used here is from Royston and Parmar (2002).
where v j (x) is the jth basis function . If m = 0 then there are only two parameters γ 0 , γ 1 , and this is a Weibull model if g() is the log cumulative hazard.Table 2 explains two further choices of g(), and the parameter values and distributions they simplify to for m = 0.The probability density and cumulative distribution functions for all these models are available as dsurvspline and psurvspline.A model with an absolute time scale (x = t) is also available through timescale="identity".
Covariates on spline parameters Covariates can be placed on any parameter γ through a linear model (with identity link function).Most straightforwardly, we can let the intercept γ 0 vary with covariates z, giving a proportional hazards or odds model (depending on g()).

g(S(t, z)) = s(log(t), γ) + β ⊤ z
The spline coefficients γ j : j = 1, 2 . .., the "ancillary" parameters, may also be modelled as linear functions of covariates z, as giving a model where the effects of covariates are arbitrarily flexible functions of time: a non-proportional hazards or odds model.

Spline models in flexsurv
The argument k to flexsurvspline defines the number of internal knots m.Knot locations are chosen by default from quantiles of the log uncensored death times, or users can supply their own locations in the knots argument.Initial values for numerical likelihood maximisation are chosen using the method described by Royston and Parmar (2002) of Cox regression combined with transforming an empirical survival estimate.
For example, the best-fitting model for the breast cancer dataset identified in Royston and Parmar (2002), a proportional odds model with one internal spline knot, is R> sp1 <-flexsurvspline(Surv(recyrs, censrec) ~group, data = bc, k = 1, + scale = "odds") A further model where the first ancillary parameter also depends on the prognostic group, giving a time-varying odds ratio, is fitted as R> sp2 <-flexsurvspline(Surv(recyrs, censrec) ~group + gamma1(group), + data = bc, k = 1, scale = "odds") These models give qualitatively similar results to the generalized gamma in this dataset (Figure 3), and have similar predictive ability as measured by AIC (Table 3).Though in general, an advantage of spline models is that extra flexibility is available where necessary.
In this example, proportional odds models (scale = "odds") are better-fitting than proportional hazards models (scale = "hazard") (Table 3).Note also that under a proportional hazards spline model with one internal knot (sp3), the log hazard ratios, and their standard errors, are substantively the same as under a standard Cox model (cox3).This illustrates that this class of flexible fully-parametric models may be a reasonable alternative to the (semi-parametric) Cox model.See Royston and Parmar (2002) for more discussion of these issues.

Implementing new general-dimension models
The spline model above is an example of the general parametric form (Equation 1), but the number of parameters, R + 1 in Equation 1, m + 2 in Equation 4, is arbitrary.flexsurv has the tools to deal with any model of this form.flexsurvspline works internally by building a custom distribution and then calling flexsurvreg.Similar models may in principle be built by users using the same method.This relies on a functional programming trick.

Creating distribution functions dynamically
The R distribution functions supplied to custom models are expected to have a fixed number of arguments, including one for each scalar parameter.However, the distribution functions for the spline model (e.g., dsurvspline) have an argument gamma representing the vector of parameters γ, whose length is determined by choosing the number of knots.Just as the scalar parameters of conventional distribution functions can be supplied as vector arguments (as explained in §4), similarly, the vector parameters of spline-like distribution functions can be supplied as matrix arguments, representing alternative parameter values.
To convert a spline-like distribution function into the correct form, flexsurv provides the utility unroll.function.This converts a function with one (or more) vector parameters (matrix arguments) to a function with an arbitrary number of scalar parameters (vector arguments).For example, the 5-year survival probability for the baseline group under the model sp1 is R> gamma <-sp1$res[c("gamma0", "gamma1", "gamma2"), "est"] R> 1 -psurvspline(5, gamma = gamma, knots = sp1$knots) [1] 0.6897025 An alternative function to compute this can be built by unroll.function.We tell it that the vector parameter gamma should be provided instead as three scalar parameters named gamma0, gamma1, gamma2.The resulting function pfn is in the correct form for a custom flexsurvreg distribution.
R> pfn <-unroll.function(psurvspline,gamma = 0:2) R> 1 -pfn(5, gamma0 = gamma[1], gamma1 = gamma[2], gamma2 = gamma[3], + knots = sp1$knots) [1] 0.6897025 Users wishing to fit a new spline-like model with a known number of parameters could just as easily write distribution functions specific to that number of parameters, and use the methods in §4.However the unroll.functionmethod is intended to simplify the process of extending the flexsurv package to implement new model families, through wrappers similar to flexsurvspline.
Example: splines on alternative scales An alternative to the Royston-Parmar spline model is to model the log hazard as a spline function of (log) time instead of the log cumulative hazard.Crowther and Lambert (2013) demonstrate this model using the Stata stgenreg package.An advantage explained by Royston and Lambert (2011) is that when there are multiple timedependent effects, time-dependent hazard ratios can be interpreted independently of the values of other covariates.
This can also be implemented in flexsurvreg using unroll.function.A disadvantage of this model is that the cumulative hazard (hence the survivor function) has no analytic form, therefore to compute the likelihood, the hazard function needs to be integrated numerically.This is done automatically in flexsurvreg (just as in stgenreg) if the cumulative hazard is not supplied.
Firstly, a function must be written to compute the hazard as a function of time x, the vector of parameters gamma (which can be supplied as a matrix argument so the function can give a vector of results), and a vector of knot locations.This uses flexsurv's function basis to compute the natural cubic spline basis (Equation 4), and replicates x and gamma to the length of the longest one.
R> hsurvspline.lh<-function(x, gamma, knots){ + if(!is.matrix(gamma)) gamma <-matrix(gamma, nrow = 1) Other arbitrary-dimension models Another potential application is to fractional polynomials (Royston and Altman 1994).These are of the form M m=1 α m x pm log(x) n where the power p m is in the standard set {2, −1, −0.5, 0, 0.5, 1, 2, 3} (except that log(x) is used instead of x 0 ), and n is a non-negative integer.They are similar to splines in that they can give arbitrarily close approximations to a nonlinear function, such as a hazard curve, and are particularly useful for expressing the effects of continuous predictors in regression models.See e.g., Sauerbrei et al. (2007), and several other publications by the same authors, for applications and discussion of their advantages over splines.The R package gamlss (Rigby and Stasinopoulos 2005) has a function to construct a fractional polynomial basis that might be employed in flexsurv models.
Polyhazard models (Louzada-Neto 1999) are another potential use of this technique.These express an overall hazard as a sum of latent cause-specific hazards, each one typically from the same class of distribution, e.g., a poly-Weibull model if they are all Weibull.For example, a U-shaped hazard curve following surgery may be the sum of early hazards from surgical mortality and later deaths from natural causes.However, such models may not always be identifiable without external information to fix or constrain the parameters of particular hazards (Demiris et al. 2011).

Multi-state models
A multi-state model represents how an individual moves between multiple states in continuous time.Survival analysis is a special case with two states, "alive" and "dead".Competing risks are a further special case, where there are multiple causes of death, that is, one starting state and multiple possible destination states.
Multi-state modelling with flexsurv was previously described in this section of the current vignette.Version 2.0 of flexsurv added several new features for multi-state modelling, including multi-state modelling using mixtures, and transition-specific distribution families in cause-specific hazards models.These models are now fully described in a separate flexsurv vignette, "Flexible parametric multi-state modelling with the flexsurv package".

Potential extensions
Multi-state modelling is still an area of ongoing work, and while version 2.0 extended flexsurv in this area, more tools and documentation in this area would still be useful.The msm package arguably has a more accessible interface for fitting and summarising multi-state models, but it was designed mainly for panel data rather than event time data, and therefore the event time distributions it fits are relatively inflexible.Models where multiple survival times are assumed to be correlated within groups, sometimes called (shared) frailty models (Hougaard 1995), would also be a useful development.See, e.g., Crowther et al. (2014) for a recent application based on parametric models.These might be implemented by exploiting tractability for specific distributions, such as gamma frailties, or by adjusting standard errors to account for clustering, as implemented in survreg.More complex random effects models would require numerical integration, for example, Crowther et al. (2014) provide Stata software based on Gauss-Hermite quadrature.Alternatively, a probabilistic modelling language such as Stan (Stan Development Team 2014) or BUGS (Lunn et al. 2012) would be naturally suited to complex extensions such as random effects on multiple parameters or multiple hierarchical levels.
flexsurv is intended as a platform for parametric survival modelling.Extensions of the software to deal with different models may be written by users themselves, through the facilities described in §4 and §5.2.These might then be included in the package as built-in distributions, or at least demonstrated in the package's other vignette flexsurv-examples.Each new class of models would ideally come with guidance on what situations the model is useful for, e.g., what shape of hazards it can represent some intuitive interpretation of the model parameters, their plausible values in typical situations, and potential identifiability problems.This would also help with choosing initial values for numerical maximum likelihood estimation, ideally through an inits function in the custom distribution list ( §4).

Figure 1 :
Figure 1: Survival by prognostic group from the breast cancer data: fitted from alternative parametric models and Kaplan-Meier estimates.

Figure 2 :
Figure 2: Hazards by prognostic group from the breast cancer data: fitted from alternative parametric models and kernel density estimates.

Figure 3 :
Figure 3: Comparison of spline and generalized gamma fitted hazards for the breast cancer survival data by prognostic group.

Table 3
compares all the models fitted to the breast cancer data, showing absolute fit to the data as measured by the maximised -2×log likelihood −2LL, number of parameters p, and Akaike's information criterion −2LL + 2p (AIC).The model fs2 has the lowest AIC, indicating the best estimated predictive ability.

Table 1 :
Built-in parametric survival distributions in flexsurv.

Table 3 :
Comparison of parametric survival models fitted to the breast cancer data.