Individual patient data meta-analysis of survival data using Poisson regression models

Background An Individual Patient Data (IPD) meta-analysis is often considered the gold-standard for synthesising survival data from clinical trials. An IPD meta-analysis can be achieved by either a two-stage or a one-stage approach, depending on whether the trials are analysed separately or simultaneously. A range of one-stage hierarchical Cox models have been previously proposed, but these are known to be computationally intensive and are not currently available in all standard statistical software. We describe an alternative approach using Poisson based Generalised Linear Models (GLMs). Methods We illustrate, through application and simulation, the Poisson approach both classically and in a Bayesian framework, in two-stage and one-stage approaches. We outline the benefits of our one-stage approach through extension to modelling treatment-covariate interactions and non-proportional hazards. Ten trials of hypertension treatment, with all-cause death the outcome of interest, are used to apply and assess the approach. Results We show that the Poisson approach obtains almost identical estimates to the Cox model, is additionally computationally efficient and directly estimates the baseline hazard. Some downward bias is observed in classical estimates of the heterogeneity in the treatment effect, with improved performance from the Bayesian approach. Conclusion Our approach provides a highly flexible and computationally efficient framework, available in all standard statistical software, to the investigation of not only heterogeneity, but the presence of non-proportional hazards and treatment effect modifiers.


Background
Meta-analysis methods are used to integrate quantitative findings from a set of related research studies with the aim of providing more reliable and accurate estimates of a treatment effect [1]. Traditionally a meta-analysis requires aggregate data (AD), extracted from publications or received directly from study authors. Summary statistics (e.g. log hazard ratios) are then synthesised using a fixed or random effects meta-analysis [2], where random effects can account for between study heterogeneity in the treatment effect. Meta-regression models [3] attempt to explain this excess heterogeneity with study-level covariates. However, the use of AD to conduct a metaanalysis has inherent problems, for example, hazard ratios are not always explicitly given in publications, leading to the development of alternative techniques to extract appropriate summary statistics [4]. Despite this, even when using the methods of Parmar et al., it can still be difficult to extract hazard ratios, as shown by Riley et al. [5].
An approach often considered the gold-standard alternative to an AD meta-analysis is a meta-analysis of individual patient data (IPD), which utilises the raw data from each study. IPD meta-analyses have been shown to be most common when analyzing time-to-event data [6]. The benefits of conducting an IPD meta-analysis with time-to-event data include: follow-up time can be longer and more up to date, analyses can be standardised across studies, model assumptions can be checked e.g. proportional hazards, and confounders can be adjusted for. However, IPD can be difficult to obtain, and a variety of methods have been developed to undertake meta-analyses from the published literature of time-to-event data. An early proposal by Dear [7] showed how to jointly synthesise survival proportions reported at multiple times, by utilising their correlation and combining them in a multivariate meta-analysis using generalised least squares. Dear investigated only fixed effects, leading the extension of Arends et al. to incorporate random effects [8]. Techniques to extract summary statistics from published studies have also been demonstated [4] for the use in standard AD meta-analyses. Fiocco et al. recently used a Poisson correlated gamma-frailty approach to combine survival curves under heterogeniety, allowing the investigation of both between-study variance and within and between-arm correlations [9]. A frailty approach has also been implemented by Feng et al. incorporating crossed random effects using penalized quasi-likelihood under a Poisson likelihood [10]. Further extensions of AD metaanalyses include assessment of the proportional hazards assumption [9,11] IPD meta-analyses of time-to-event data can use either a two-stage or one-stage approach. The most commonly used, the two-stage, is achieved by first fitting individual survival models to each trial. The chosen estimates of effect are then combined in a standard meta-analysis framework, now equivalent to an AD meta-analysis. In a one-stage IPD meta-analysis, patient data from all studies are analysed simultaneously within a hierarchical framework. This draws parallels with the analysis of IPD from multi-centre clinical trials, accommodating clustering within treatment centres; however, in a multi-centre trial the treatment effect is not often random, whereas in a meta-analysis it often is. This is because in a multi-centre trial we can achieve greater consistency in inclusion/exclusion criteria and other aspects of trial protocol, indicating that a fixed treatment effect is likely to be more appropriate. Senn discusses these issues in more detail [12], but we emphasise that, although random-effects models are rarely used to analyse multi-centre trials, they could also adopt the methods we present here. Indeed, published trial analysis guidelines do state: "mixed models may be used to explore heterogeneity of the treatment effect. These models consider centre and treatment-by-centre effects to be random, and are especially relevant when the number of sites is large" [13]. A range of hierarchical survival models within the Cox framework have been developed [14][15][16][17], which can effectively account for heterogeneity in treatment effect and baseline risk. However, these methods can have a high computational burden and/or rely on userwritten programs, not currently available in standard statistical software [16]. Furthermore, these models do not investigate the validity of the assumption of proportional hazards. These reasons serve as motivation to consider alternative approaches, such as the percentile ratio [18] as a target of inference in this setting, developed predominantly for when the proportional hazards assumption appears violated.
The aim of this paper is to explore the use of Poisson regression, and the generalised mixed model extensions, to incorporate random effects to perform one-and twostage IPD meta-analyses of time-to-event outcomes, as an alternative to hierarchical Cox models, and to extend the models to incorporate non-proportional hazards and treatment-effect modifiers.

Methods
The Poisson approach to survival analysis Poisson regression is used in the modelling of count data and contingency tables; however, the extension to modelling survival data via a piecewise exponential model [19] serves as an alternative approach to the widely used Cox model. It has been shown how the Cox model can be fitted using a Poisson GLM due to the shared form of the contribution to the partial log-likelihood, by splitting follow-up time into as many intervals as there are events [20]. However, this method can be computationally intensive. Alternatively, we can choose a smaller number of time intervals with fixed length, where patients are at risk of experiencing events within these [21], to closely approximate the Cox model. We also obtain direct estimates of the baseline hazard rate. Fine splitting of the timescale has been used to allow the use of splines and fractional polynomials to model the baseline hazard continuously [21,22].
A standard approach when choosing interval lengths is to use yearly splits [23]. The narrower the time interval, the more computationally intensive these methods will be; however, methods to compensate for this are available and described below. The shape of the underlying hazard function plays a crucial role in choosing the number of intervals necessary to successfully capture its variation. In this paper, quarter year, half year and yearly splits are compared.
Undertaking a one-stage IPD meta-analysis within a Poisson framework is beneficial due to the highly developed area of GLMs. Random effects GLMs are available within all commonly used statistical software packages (e.g. Stata, SAS and R), allowing models to be applied without the need for specialist software.

Model fitting in a single trial
Consider the analysis of time-to-event data from a single trial, investigating the effect of a treatment. For the i th patient, let x i denote treatment group, coded 0/1 to denote control/treatment. A standard Cox proportional hazards model can be applied (and estimated by maximising the partial likelihood [24]): where h 0 (t) is the unspecified baseline hazard rate and β 1 the log hazard ratio (i.e. the treatment effect) for the treatment group compared to the control group. By splitting follow-up time into k = 1,...,K intervals and assuming a constant hazard within each interval we can apply the Poisson model: where d ik is the event indicator, taking the value of 0 or 1 (censored or event), representing a Poisson process for each patient during each time interval. Note that d ik will not follow a Poisson distribution per se, but by doing so we recover the correct form of the likelihood for a piecewise exponential model. β 1 is once again the log hazard ratio for the treatment group compared to the control group. λ k is the baseline hazard rate during the k th time interval. Time at risk, y ik , is included as a log offset in the linear predictor. If we split follow-up time at each unique event time and apply the Poisson model, we would obtain an identical estimate of the treatment effect, β 1 , to that from a Cox model.

Two-stage IPD meta-analyses models for survival data
The two-stage approach can be thought of as more traditional, with individual survival models applied to each trial, and appropriate summary statistics extracted to allow AD meta-analysis techniques to be applied.
We extract from the j th trial: the log hazard ratio for the treatment group compared to the control group, β 1j , and its variance V(β 1j ) , using either Cox or Poisson models, which can then be combined in a standard AD meta-analysis. Such AD meta-analysis models include a fixed effect model, where we assume all trials are estimating the same true treatment effect, applied for example using the inverse variance weighted method [1]. Alternatively, a random effect model can be applied where we assume that each estimate of the treatment effect comes from a distribution of treatment effects, with mean β 1 and variance τ 2 . Following a random effect meta-analysis, a prediction interval can be calculated for the predicted treatment effect in an individual study, to help show the potential impact of between-trial heterogeneity [25,26].

One-stage IPD meta-analyses models for survival data
We now describe one-stage IPD meta-analyses models using the framework of proportional hazards models. The following models, if fitted using the Cox proportional hazards model, correspond to those developed by Tudur-Smith et. al. [14], which are estimated by maximising the penalized partial likelihood to find the best linear unbiased predictors, from which the REML estimators of the variance components were found [27].

Model A: Fixed treatment effect with proportional trial effects
For the i th patient, i = 1,...,n j , in the j th trial, j = 1,...,J, the hazard function at time t can be written as: where h 0 (t) is the baseline hazard function in the reference trial (say j = 1, so b 01 constrained to be zero). b 0j is the proportional effect on the baseline hazard function due to the j th trial, now j = 2,..., J. x ij is coded -0.5/0.5 to denote control/treatment group and b 1 is the log hazard ratio for the treatment group compared to the control group, assumed equal across all trials. Model A makes the restrictive assumption that the hazard functions in all trials are proportional to a common baseline function.
The treatment group coding of -0.5/0.5 is used in all one-stage models presented in this paper. Using this coding of the treatment group indicator, we assume equal variability in the log hazard rate across trials for both treatment groups. If we chose the 0/1 coding, this imposes the restrictive assumption that the variability in the log hazard rate of the treatment group coded 0, is zero [14,28].

Model B: Fixed treatment effect with baseline hazard stratified by trial
In reality, the assumption that the hazard functions in all trials are proportional is likely to be inappropriate. By allowing separate baseline hazard functions for each trial we can relax this assumption, whilst still assuming proportional hazards between treatment groups within each trial. Allowing separate baseline hazards, we have: where h 0j (t) is the baseline hazard function in the j th trial. As in Model A, b 1 represents the log hazard ratio for the treatment group compared to the control group, assumed constant across trials. No allowance for between study variation in the treatment effect is made in Models A and B.

Model C: Random treatment effect with proportional trial effects
Models which allow for between-trial heterogeneity in the treatment effect are now considered. The following formulations assume an underlying mean treatment effect, coming from a population of treatment effects. The hazard function for the i th patient in the j th trial can be written as: where h 0 (t) is the baseline hazard function in the reference trial (say j = 1, so b 01 constrained to be zero). b 0j is the proportional effect on the baseline hazard function due to the j th trial, now j = 2,...,J. b 1 is now interpreted as the mean log hazard ratio for a population of treatment effects, with b 1j the deviation of the log hazard ratio in the j th trial from the population mean. This assumes that the b 1j 's come from a Normal distribution with mean zero and variance τ 2 . This formulation produces a measure of the between-trial heterogeneity in the treatment effect, τ 2 .
Model D: Random treatment effect with baseline hazard stratified by trial Finally, separate baseline hazards are allowed, with a random treatment effect: where h 0j (t) is interpreted as in Model B, with b 1 , b ij and τ 2 defined in Model C. Model D, as in Model B, assumes proportional hazards across treatment groups only within trials.
Models A to D, within a hierarchical Cox framework, were applied by Tudur-Smith et al. [14] to IPD data from 5 trials comparing 2 anti-epileptic drugs with time-to-event outcome first seizure. A total of 1225 patients were analysed. To illustrate the computational burden of hierarchical Cox models, the application of Model C took 29 hours to achieve convergence, whilst the application of Model D took 53 minutes to achieve convergence.
The Poisson approach to one-stage IPD meta-analysis models of survival data We now introduce Poisson based GLM formulations of the models shown above. Techniques to increase the computational efficiency of the models are described in Section titled "Model fitting" below.

One-stage IPD Poisson generalised linear survival models
Models A and C: Fixed/random treatment effect with proportional trial effects. For time intervals, k = 1,...,K, we now have: where λ k represents the constant hazard rate in the k th interval for the control group, in the reference trial.
Models B and D: Fixed/random treatment effect with baseline hazard stratified by trial. Models B and D are similarly altered. For trials, j = 1,...,J, and time intervals, k = 1,...,K, we can write the baseline hazard function as: h 0jk (t) = λ jk (8) where λ jk represents the constant hazard rate in the j th trial during the k th time interval.

Model fitting
We present Model A in the form of a Poisson GLM: where d ijk is the event indicator, taking the value of 0 or 1 (censored or event), representing a Poisson process for each patient in each trial during each time interval. b 0j and b 1 are as in Model A, with l k once again the hazard rate in the control group of the reference trial. Time at risk, y ijk , is included as a log offset in the linear predictor. The extension to separate trial effects can be achieved by simply replacing the linear b 0j and l k terms with the interaction of them, i.e. Model B.
Fixed effect Models A and B can be implemented using any GLM software package, such as glm within Stata [29]. Models C and D, with random treatment effects, can be implemented using a multilevel mixed effects Poisson regression package, such as Stata's xtmepoisson.
It is widely known that within a mixed effects framework, maximum likelihood performs poorly when estimating variance parameters when there are a small number of studies [28]. This provides motivation for considering a Bayesian approach to the models discussed above, described and undertaken in the simulation study and results sections below.
If we have N independent Poisson distributed random variables, each with mean l, then the sum of these N distributions is itself a Poisson distributed random variable with mean Nl. Given this condition, it is possible to 'collapse' each split dataset across covariate patterns (for example, separately collapse the dataset for males and females) [30]. A Poisson GLM model can then be fitted to the collapsed dataset, giving identical parameter estimates to a Poisson GLM fitted to the non-collapsed dataset. This process dramatically reduces computation time when datasets are large; however, is only valid when categorical covariates are used. It is not possible to collapse across covariate patterns when including truly continuous covariates.
When handling sparse event data, the situation may arise when no events occur within a split time interval. In this case, when applying the models described in this section, we obtain nuisance estimates of the baseline hazard rate for any time interval in which no events occur. This can be remedied by the merging of time intervals.

Simulation study
To fully assess the performance of these methods a simulation study was devised. Data is simulated consisting of a random treatment effect and proportional trial effects. We investigate the impact of the number of studies and time interval length by simulating either 5, 10 or 30 trials, and applying Poisson one-stage models with time intervals of length 0.25, 0.5 or 1 year. Each trial is simulated under the following steps: 1. Generate 2000 patients; 50% assigned to treatment, 50% to control. 2. Simulate a random treatment effect (on the log scale) with mean, α = -0.4, and inherent betweentrial heterogeneity, τ = 0.2. Therefore b 1~N (-0.4,0.2 2 ), indicating a 33.0% (95% CI: 0.8%, 54.7%) reduction in the event rate due to treatment. 3. Generate a fixed trial effect, b 0~N (0,0.5 2 ), again on the log scale. 4. Generate survival times from a Weibull distribution using a formulation proposed by Bender et al. [31]. Scale and shape parameters were defined as l = 0.042 and g = 1.2, respectively. These values are based on fitting a Weibull survival model to the SHEP trial. All observations are censored after 5 years. This produces a 74.8% and 82.4% survival proportion after 5 years in the control and treatment groups, respectively.
This results in 9 scenarios, in which 1000 repetitions were simulated. For each simulated dataset, Model C was applied both classically using xtmepoisson within Stata, whilst WinBUGS, through the use of winbugsfromstata [32], was used to apply the equivalent Bayesian model. Each Bayesian model was applied with a burn-in of 1000 and sample of 5000. This was deemed adequate to achieve convergence through extensive testing of the simulations. Vague priors were assigned to all parameters under the Bayesian approach. The treatment group indicator was coded -0.5/0.5.

Extensions to the one-stage approach Treatment effect modifiers
It is becoming increasingly accepted that variation in treatment effects, as a source of heterogeneity, can only be sufficiently detected and explained when IPD are available [33]. IPD allows one to examine covariates and within-trial interactions at the patient-level. In contrast, meta-regression of only AD allows one to examine studylevel covariates and interactions across-trials, and this has been shown to have low power to detect true interactions between patient covariates and treatment effect [34], and may also be subject to ecological bias and study level confounding [35].
The discrimination between within-trial and acrosstrial treatment-covariate interactions is a current issue in IPD meta-analysis [35,36], which requires further work within the survival analysis field. Below we present a simple one-stage model which produces a weighted average of the within-and across-trial interactions, though in our applied example the within-trial interaction dominates. Fixed treatment effect with separate trial effects Let w ij be a patient-level covariate, e.g. overweight status (coded 0/1 for no/yes, see Table 1) for the i th patient in the j th trial. Extending Model B to incorporate a treatment-covariate interaction gives: where l jk is the baseline hazard rate during the k th time interval in the j th trial, β 1 now represents the treatment effect when w ij = 0, μ is the change in the log hazard rate of the control group for a one-unit increase in w ij and γ is the change in the treatment effect for a one-unit increase in w ij .

Non-proportional hazards for the treatment effect
It has been shown that the benefits of a treatment can be deemed greater during the initial period of follow-up time in certain contexts [37]. In this situation, an assumption of proportional hazards for the treatment effect will be violated. In other words, a beneficial treatment effect may diminish with time. We describe a simple approach of investigating the presence of non-proportional hazards in the treatment effect, which can be extended to any covariate within the model. h ijk (t) = λ jk exp β 1 x ij + φx ij z ijk (11) β 1 now represents the log hazard ratio for the treatment group compared to the control group when t <t s , withφ the change in the log hazard ratio when t ≥ t s , relative to when t <t s . The estimated log hazard ratio for the treatment group compared to the control group when t ≥ t s is therefore a linear combination;β 1 +φ . The inclusion of non-proportional hazards can be investigated using the likelihood ratio test, comparing with Model B. This can be extended by further splitting of follow-up time; however, the time variable, z ijk , would generally be assumed to have fewer intervals than those used to model the baseline hazard rate. Extension to include a time-varying treatment effect in Models A, C and D is easily conducted.

The hypertension data
The example dataset used to illustrate the models in this paper comes from an IPD meta-analysis investigating the effects of anti-hypertension drugs in lowering systolic and diastolic blood pressure as determinants of cardiovascular outcomes [38]. Randomised controlled trials (RCTs) were selected on the availability of IPD and the comparison of an active treatment to a placebo or control. This resulted in the inclusion of 10 trials consisting of 28,581 patients. Meta-analysis is important to summarise the average treatment effect, and any heterogeneity in the treatment effect, across these different trials, and it enables a broader assessment of the effects of hypertension treatments than is possible in a single trial alone.
Summary statistics for the time-to-event outcome allcause death and an overweight covariate are presented in Table 1. Overweight status is a binary covariate, coded 0/1 for no/yes, dichotomising Body-Mass Index (BMI) at 25 kg/m 2 . Detailed summary statistics can be found in the original meta-analysis [38].

Single trial application
Comparing approaches, we apply a proportional hazards model investigating the effect of the treatment. The SHEP trial is used as an example, with outcome allcause death. Estimated hazard ratios for the treatment effect are presented in Table 2. We observe complete agreement in estimates and 95% confidence intervals across models, showing a non-statistically significant reduction of 8.7% (95% CI: -10.1%, 24.3%) in the hazard of death for patients in the anti-hypertension treatment group compared to those in the control group.

Two-stage IPD meta-analyses models for survival data
We now apply two-stage random effects meta-analyses models to the hypertension data. In the first step we compare the Cox and Poisson models to obtain the estimates of the treatment effect in each trial,β 1j and associated variance V(β 1j ) . The second step is then conducted using the random effects AD meta-analysis model of DerSimonian and Laird [2]. Table 3 shows the estimates of the pooled hazard ratio. All 4 models produce consistent estimates of the pooled treatment effect, showing a 12% (95% CI: 2.6%, 20.4%) reduction in the hazard of death for patients in the active anti-hypertension treatment group compared to those in the control. No evidence of heterogeneity was found (τ 2 = 0) , indicating in this case a fixed effect model would suffice and would yield identical estimates. Forest plots from the two-stage meta-analyses using Cox models and Poisson models with 0.5 year splits are shown in Figures 1 and 2, respectively, illustrating the consistent estimates of the treatment effect at both the trial and meta-analysis level.

One-stage IPD meta-analyses models for survival data
We now apply each of the models described in the methods section "One-stage IPD meta-analyses models for survival data" to the hypertension data, using the Poisson method both classically and under a Bayesian approach. Further comparison of Models A(fixed treatment and fixed proportional trial effects) and B (fixed treatment and baseline stratified by trial) are made using Cox proportional hazards models, under a classical approach. Under Bayesian Models A, B, C and D all parameters are assigned a vague prior of N(0,1000 2 ), excluding the heterogeneity parameter in Models C and D, where τ~N(0,1) with τ > 0. A burn-in of 1000 was used, with 100,000 samples and thinning at every 20th sample to remove autocorrelation. Estimates of the treatment effect and 95% confidence/ credible interval are seen in Table 4. Comparing estimates obtained under classical Cox formulations of Models A and B with equivalent Poisson models, we observe almost identical estimates of the treatment effect and 95% confidence intervals for each time interval length. For example, under all 4 classical one-stage IPD meta-analysis models with fixed treatment effect and proportional trial effects, we observe a 12.3% (95% CI: 3.0%, 20.7%) reduction in the hazard of death for patients in the active anti-hypertension treatment group compared to those in the control group. Consistent estimates of the treatment effect are obtained across all 3 choices of time interval. Each mixed effects model also produces an estimate of heterogeneity in the treatment effect, seen in Table 5.  Hazard ratios from Cox models  Poisson GLMs, interval length = 0.5 years  to the equivalent Bayesian models estimate of τ = 0.081 (95% Cred. Int.: 0.004, 0.310). The classical model has estimated τ to be essentially zero, and consequently failed to construct a 95% confidence interval.

Random effects Meta−Analysis of Hazard Ratios
To illustrate the computational efficiency of the method, using interval lengths of 1 year; application of Models C and D to collapsed data under a classical approach took 4.6 seconds and 60 seconds, respectively, to achieve convergence. Under a Bayesian approach the equivalent models took 64 seconds and 63 seconds, respectively, to complete the sampling.
Example code to fit Model C both classically within Stata, and under a Bayesian approach in WinBUGS [39] can be found in the Appendix.

Simulation results
Results from the simulation study, detailing mean estimates and coverages of the treatment effect and heterogeneity can be found in Tables 6 and 7, respectively. From Table 6, the treatment effect estimates appear consistent across classical and Bayesian frameworks for each model. A scatter plot matrix can be seen in Figure 3, further illustrating agreement between classical and Bayesian estimates. Coverage improves as the number of trials increase; however, within the classical models coverage is much less informative due to the moderate downward biases seen in the estimates of heterogeneity in Table 7. There is clear evidence that, irrespective of the number of trials or interval length, the classical mixed effects models consistently underestimate the true underlying heterogeneity of τ = 0.2. Estimates from the Bayesian models are generally less biased. Figure 4 shows a scatter plot matrix comparing classical and Bayesian estimates of τ, illustrating the classical approach consistently producing lower estimates of τ, compared to the Bayesian approach.
We also conducted the simulations described above using a treatment group coding of 0/1. The estimates of heterogeneity from the classical model had much larger downward bias. For example, when using 0.5 year intervals, estimates of τ for 5, 10 and 30 studies were 0.112, 0.138 and 0.165, respectively when using the 0/1 treatment coding, compared with 0.147, 0.176 and 0.193 seen in Table 7 for the -0.5/0.5 coding. Estimates under a Bayesian approach remained consistent with those seen in Table 7.
We extended the simulation study to include application of Model D (random treatment effect with baseline hazard stratified by trial) to data simulated as described above. Unfortunately, due to excessive computation time, it proved infeasible to conduct the simulation study on all 9 scenarios. For example, a single run of the scenario including 10 trials with 0.25 year splits takes approximately 32 minutes. However, the 5 trial scenarios were completed and showed entirely consistent results to those described above. The computational difficulties are

One-stage approach extensions Treatment effect modifier
We apply Model (10), both classically and in a Bayesian framework, to the hypertension data to examine whether treatment effect is modified by being overweight (as defined by a BMI value ≥ 25). Note we dichotomise BMI to illustrate the methodology here, but in practice continuous variables are better analysed on their continuous scale. All parameters in the Bayesian approach use the vague prior N(0,1000 2 ). Results are shown in Table 8. We observe almost identical estimates across classical models for each of the parameters of interest. When a patient is not overweight, all classical models predict a treatment effect reducing the mortality rate by approximately 14.2% (95% CI: -0.1%, 26.4.4%, 21.6%) in the hazard of death for overweight patients in the active anti-hypertension treatment group compared to those in the control. Being overweight is estimated to produce a 27.4% (95% CI: 16.5%, 37.0%) reduction in the mortality rate, with treatment group held constant. The equivalent Bayesian models produce almost identical estimates of effect compared to the classical models. Using the approach of Riley et al. [36] we also separated within-study and between-study interactions but it did not change these findings.

Non-proportional hazards
We now apply Model (11) to the hypertension data, letting t s = 1. Results are presented in Table 9. From the classical models, a statistically significant (at the 5% level) 34.3% (95% CI: 16.2%, 48.5%) reduction in the hazard of death for patients in the active anti-hypertension treatment group compared to those in the control is observed in the first year of follow-up. The treatment effect after the first year is calculated by exp (β 1 + ). This produces a non-significant reduction of 6.4% (95% CI: -4.5%, 16.2%) in the hazard of death for patients in the active anti-hypertension treatment group compared to those in the control, showing evidence of a diminishing treatment effect. Figure 5 illustrates this change by plotting the piecewise constant hazard rate in each treatment arm for the COOP trial. Extension to incorporate a random treatment effect is also possible.

Discussion
The importance of having IPD available has been established, allowing a full exploration of between-study heterogeneity [34] and the verification of model assumptions. By obtaining IPD, computational issues may become apparent with the sheer size of patient data being analysed when incorporating random effects. This issue is clearly highlighted when using other large datasets within the hierarchical Cox framework [14]. However, it should be noted that a variety of techniques have been developed to  investigate heterogeneity and non-proportional hazards, for example, when combining aggregate level data from published studies [4,[7][8][9]11].
In this paper, our aim was to illustrate an effective alternative to hierarchical Cox models, minimising computational issues and providing further interpretational benefits. Through minimal splitting of follow-up time, reliable estimates of effect can be obtained. Choice of interval lengths will depend on the underlying shape of the hazard function; however, the hazard ratio may be insensitive to the baseline, as illustrated by consistent estimates of the treatment effect across the 3 choices of interval length used in this paper. By combining the Poisson approach with the collapsing technique described above, we can dramatically reduce computation time. When analysing data with rare events, such models may be further advantageous through the need of less intervals. Differential follow-up times between trials can also be accounted for through this approach. Our approach provides direct estimates of the baseline hazard rate which is clinically important. These estimates allow the calculation of risk differences, or number needed to treat [40]. However, a limitation of our approach is that the collapsing technique described cannot be used with truly continuous covariates, such as age measured in days.
Investigation of random treatment effect models showed a marked underestimation of heterogeneity under the classical approach. This may in fact be explained by the tendency of maximum likelihood to underestimate variance parameters [28]. Under the Bayesian approach we observed improved performance, with comparatively lower absolute biases; however, it must be noted that, given the nature of the MCMC algorithm, the Bayesian approach will always provide a positive estimate of between study heterogeneity. A recent simulation study emphasised the need for care when choosing non-informative priors on variance parameters [41], which has specific relevance when investigating heterogeneity in the treatment effect, as in Models C and D. An alternative estimation procedure, such as h-likelihood [42], could be investigated.
It must be noted that if purely interested in a pooled treatment effect, then there is no advantage in pursuing a one-stage over a two-stage approach; however, investigation of treatment effect modifiers and modelling assumptions should be conducted simultaneously, which can only be done effectively through a one-stage approach. Although previous work has provided effective methods to investigate heterogeneity in the metaanalysis setting [14][15][16][17], we feel our approach provides a highly simplistic alternative which can incorporate the investigation of non-proportional hazards in covariate effects, and that of treatment-effect modifiers, both of which should be considered in any IPD meta-analysis.
In our analysis of the hypertension dataset, we observed a 27.4% (95% CI: 16.55, 37.0%) reduction in the mortality rate when a patient is overweight compared to a non-overweight patient, with treatment group held constant. Although this is a surprising result, it is one that has been identified previously [43]. Previous work by one of the authors of this article has also observed this relationship between BMI and mortality; however, further identified that the true factor lowering risk is height, i.e. lower risk is seen for overweight patients because they tend to be taller [44].
The approach detailed in this paper has the further benefit of allowing adjustment for confounders to be implemented simply. This becomes important when analysing IPD from observational studies, where the need to adjust for confounders is often paramount [45].
The flexibility of the Poisson approach described may be extended through the use of splines to model not only the baseline hazard, but also any time-dependent effects [21]. This would result in more plausible predictions, allowing a continuous function estimate of both.
Finally, we recognise that the IPD approach does not necessarily solve all the problems for meta-analysis [46]; in particular, IPD may not be available from all the studies requested. In this situation a sensitivity analysis may be needed to examine whether IPD meta-analysis conclusions remain robust when aggregate data from non-IPD studies are additionally included as far as possible [35].

Conclusion
For an IPD meta-analysis of survival data, our approach provides a highly flexible and computationally efficient framework. The methods are available in all standard statistical software, allowing the investigation of not only heterogeneity, but the presence of non-proportional hazards and treatment effect modifiers.