Use of days alive without life support and similar count outcomes in randomised clinical trials – an overview and comparison of methodological choices and analysis methods

Background
Days alive without life support (DAWOLS) and similar outcomes that seek to summarise mortality and non-mortality experiences are increasingly used in critical care research. The use of these outcomes is challenged by different definitions and non-normal outcome distributions that complicate statistical analysis decisions.

Methods
We scrutinised the central methodological considerations when using DAWOLS and similar outcomes and provide a description and overview of the pros and cons of various statistical methods for analysis, supplemented with a comparison of these methods using data from the COVID STEROID 2 randomised clinical trial. We focused on readily available regression models of increasing complexity (linear, hurdle-negative binomial, zero–one-inflated beta, and cumulative logistic regression models) that allow comparison of multiple treatment arms, adjustment for covariates and interaction terms to assess treatment effect heterogeneity.

Results
In general, the simpler models adequately estimated group means despite not fitting the data well enough to mimic the input data. The more complex models better fitted and thus better replicated the input data, although this came with increased complexity and uncertainty of estimates. While the more complex models can model separate components of the outcome distributions (i.e., the probability of having zero DAWOLS), this complexity means that the specification of interpretable priors in a Bayesian setting is difficult. Finally, we present multiple examples of how these outcomes may be visualised to aid assessment and interpretation.

Conclusions
This summary of central methodological considerations when using, defining, and analysing DAWOLS and similar outcomes may help researchers choose the definition and analysis method that best fits their planned studies.

Trial registration
COVID STEROID 2 trial, ClinicalTrials.gov: NCT04509973, ctri.nic.in: CTRI/2020/10/028731.
Supplementary Information The online version contains supplementary material available at 10.1186/s12874-023-01963-z.


Additional file 1 - Supplementary tables and figures

Table of contents
Summary of models considered
Model syntax and priors
Figure S1. Assessment of the proportional odds assumption in cumulative logistic model
Posterior predictive check plots (Figures S2-S11)
References

Summary of models considered

Linear regression
- Mean difference.
- Measures derived from the above values, e.g., ratio of means.
- Unlikely to adequately fit the typical distribution of days alive without life support (and similar outcomes) and thus unable to generate new, similar data from the model.
- Works for outcomes with zero-inflation and values below zero (e.g., some HRQoL tools [2]), but interpretation of results is hampered if death is considered a worse outcome than 0 days [3].
- May adequately estimate the mean difference with uncertainty for larger sample sizes (according to the central limit theorem [1]) and can adequately assess uncertainty even if the residuals are not normally distributed if Bayesian posterior distributions for the means/mean differences are analysed, or if bootstrapping or robust standard errors are used in the frequentist setting.
- May provide implausible predictions in some cases due to the lack of upper and lower truncation.
- Quick to estimate and easy to interpret parameters; only one set of covariate estimates needed.

Hurdle-negative binomial regression
- Does not work for outcomes with both zero-inflation and values below zero (e.g., some HRQoL tools [2] or if death is considered a worse outcome than 0 days without life support [3]).

Zero-one-inflated beta regression (three-part* model)
- Can adequately model excess minimum and maximum values; all predicted values will be in the valid range.
- Does not work for outcomes with zero-inflation and values below 0 (e.g., some HRQoL tools [2] or if death is considered a worse outcome than 0 days without life support [3]).
- Flexible distribution but may not always fit the data perfectly.
- Three components, more complex and time-consuming to estimate than simpler models; three sets of estimates per covariate are needed, which can increase uncertainty, especially in smaller samples.

Cumulative logistic regression
- Can be considered a generalisation of common non-parametric tests that allows adjustment [7,9].
- Odds ratio for the treatment difference.
- Can also calculate the …
- No distributional assumptions; can also model values below 0 (e.g., if HRQoL is assessed [2] or if death is included as a distinct outcome worse than 0 days [3]).
- Primary result is an odds ratio; easier-to-interpret results may be derived from the model (although it may be necessary to, e.g., convert distinct event categories, such as death modelled as -1, to the valid range of values, e.g., 0).
- If the proportional odds assumption is violated, the result can still be seen as a valid average of the odds ratios at the different cut-points [7,9], but the model may then not generate data that closely mimic the input data, and calculation of derived estimates (e.g., on the absolute scale) may be incorrect.
- Many terms to estimate, as separate intercepts have to be estimated for each value of the outcome variable (minus one), so more time-consuming to estimate than simpler models.
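The derivation of absolute-scale results from a cumulative logistic model, including the conversion of death modelled as -1 to 0 days, can be sketched in base R as follows. This is a minimal illustration and not the analysis code used for the trial: it assumes the parameterisation P(Y <= k) = plogis(tau_k - eta), that the proportional odds assumption holds, and that all 30 values from -1 to 28 are observed. The threshold values, the odds ratio, and the function name cumulative_probs are hypothetical.

```r
# Category probabilities under a cumulative logistic model, assuming
# P(Y <= k) = plogis(tau_k - eta) with strictly increasing thresholds tau_k.
cumulative_probs <- function(thresholds, eta) {
  cum <- c(plogis(thresholds - eta), 1)  # cumulative probabilities, last one is 1
  diff(c(0, cum))                        # probability of each single category
}

values <- -1:28  # -1 = death, 0 to 28 = days alive without life support

# Hypothetical thresholds: one fewer than the number of outcome categories
thresholds <- seq(-1, 2, length.out = length(values) - 1)

eta_control <- 0         # linear predictor in the control group
eta_treat   <- log(1.3)  # hypothetical log odds ratio for the treatment

# On the absolute scale, death (-1) is converted to 0 days
days <- pmax(values, 0)

mean_control <- sum(days * cumulative_probs(thresholds, eta_control))
mean_treat   <- sum(days * cumulative_probs(thresholds, eta_treat))
mean_diff    <- mean_treat - mean_control
```

In a Bayesian analysis, the same calculation would be repeated for every posterior draw to obtain a posterior distribution for the mean difference.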

Model syntax and priors
This section presents the syntax required to fit the models with the brms R package, the link functions used, and the complete priors used (all flat or very vaguely informative, corresponding to the current default priors in brms). In the syntax listed, … denotes non-general parts of the model calls; dawols_dead0 denotes days alive without life support to day 28 with the value 0 assigned to non-survivors; dawols_dead0_prop corresponds to dawols_dead0 scaled to a proportion (by division by the maximum possible value, 28); and dawols_deadm1_fct denotes days alive without life support to day 28 with -1 assigned to non-survivors and encoded as a categorical variable. As the center argument was not specified (see the brmsformula documentation), the default was used: predictors were centred, and priors on the intercepts thus apply to the intercepts of the centred parameterisation.
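The three outcome encodings described above could be derived along the following lines. This is a sketch under stated assumptions: the raw column names (days_without_ls, dead_day28) and the example values are hypothetical and not taken from the trial dataset, and the categorical variable is encoded here as an ordered factor containing only the observed values.

```r
# Hypothetical raw data; column names and values are illustrative assumptions
raw <- data.frame(
  days_without_ls = c(28, 12, 0, 5, 20),  # days without life support up to day 28
  dead_day28      = c(FALSE, FALSE, FALSE, TRUE, TRUE)
)

max_days <- 28  # maximum possible value of the outcome

# Days alive without life support with 0 assigned to non-survivors
dawols_dead0 <- ifelse(raw$dead_day28, 0, raw$days_without_ls)

# The same outcome scaled to a proportion of the maximum possible value
dawols_dead0_prop <- dawols_dead0 / max_days

# -1 assigned to non-survivors, encoded as an ordered categorical variable
dawols_deadm1     <- ifelse(raw$dead_day28, -1, raw$days_without_ls)
dawols_deadm1_fct <- factor(dawols_deadm1, ordered = TRUE)
```

With this encoding, a survivor with 0 days and a non-survivor share the value 0 in dawols_dead0 but fall in different categories of dawols_deadm1_fct.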

Priors:
-treatment: a flat, i.e., improper/uniform prior on the treatment effect.
-sigma: a student_t(3, 0, 8.9) prior for the sigma parameter with a lower boundary of 0, i.e., a half-Student's t distribution prior.

Hurdle-negative binomial regression
Syntax: brm(brmsformula(dawols_dead0 ~ treatment, hu ~ treatment), family = hurdle_negbinomial(link = "log", link_shape = "identity", link_hu = "logit"), …) with the two parts of the formula specifying that both parts of the model (with hu denoting the logistic regression sub-model) are assumed to vary with the treatment allocation (of note, the default link function for the shape auxiliary parameter in brms is an identity link if not modelled separately, but a log link if a specific model formula is provided for this parameter).
Priors:
-treatment: flat, i.e., improper/uniform prior on the treatment effect in both sub-models.
-intercept, hu: a logistic(0, 1) prior on the intercept part of the logistic regression sub-model, i.e., a logistic prior with a location of 0 and a scale of 1.
-shape: a gamma(0.01, 0.01) prior for the shape (phi) parameter of the zero-truncated negative-binomial sub-model, i.e., a gamma prior with alpha and beta parameters of 0.01 and a lower boundary of 0.
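How the two sub-models combine into an overall expected value can be sketched in base R as follows. The sketch assumes that mu and shape parameterise the untruncated negative binomial distribution and that hu is the probability of a zero; the function name hurdle_nb_mean and all parameter values are hypothetical and are not estimates from the trial.

```r
# Expected value of a hurdle-negative binomial outcome.
# hu:        probability of observing zero (logistic sub-model, response scale)
# mu, shape: mean and shape of the untruncated negative binomial distribution
hurdle_nb_mean <- function(hu, mu, shape) {
  p0 <- (shape / (shape + mu))^shape  # P(Y = 0) under the untruncated distribution
  (1 - hu) * mu / (1 - p0)            # P(non-zero) times the zero-truncated mean
}

# Hypothetical values, back-transformed from the logit and log link scales
hu    <- plogis(-0.6)
mu    <- exp(2.7)
shape <- 2

expected <- hurdle_nb_mean(hu, mu, shape)

# Cross-check by direct summation over the zero-truncated distribution
y       <- 1:5000
p_trunc <- dnbinom(y, size = shape, mu = mu) / (1 - dnbinom(0, size = shape, mu = mu))
check   <- (1 - hu) * sum(y * p_trunc)
```

Treatment effects on the absolute scale would then be obtained as differences between such expected values computed separately for each treatment group.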

Zero-one-inflated beta regression
Syntax: brm(brmsformula(dawols_dead0_prop ~ treatment, zoi ~ treatment, coi ~ treatment), family = zero_one_inflated_beta(link = "logit", link_phi = "identity", link_zoi = "logit", link_coi = "logit"), …) with the three parts of the formula specifying that all three parts of the model (with zoi denoting the sub-model modelling the probability of having proportions of 0 or 1 and coi denoting the sub-model modelling the probability of having a proportion of 1 conditional on having a proportion of either 0 or 1) are assumed to vary with the treatment allocation (of note, the default link function for the phi auxiliary parameter in brms is an identity link if not modelled separately, but a log link if a specific model formula is provided for this parameter).
Priors:
-treatment: flat, i.e., improper/uniform prior on the treatment effect in all three sub-models.
-intercept, zoi and coi: logistic(0, 1) prior on the intercept part of the two logistic regression sub-models, i.e., a logistic prior with a location of 0 and a scale of 1.
-phi: a gamma(0.01, 0.01) prior for the phi parameter of the beta regression sub-model, i.e., a gamma prior with alpha and beta parameters of 0.01 and a lower boundary of 0.
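The way the three sub-models combine into an expected proportion, and from there into an expected number of days, can be sketched in base R as follows. The function name zoib_mean and all parameter values are hypothetical; they are chosen only to illustrate the calculation and are not estimates from the trial.

```r
# Expected proportion under a zero-one-inflated beta model.
# mu:  mean of the beta sub-model (proportions strictly between 0 and 1)
# zoi: probability of a proportion of exactly 0 or 1
# coi: probability of a proportion of 1, conditional on it being 0 or 1
zoib_mean <- function(mu, zoi, coi) {
  zoi * coi + (1 - zoi) * mu
}

# Hypothetical values, back-transformed from the logit link scale
mu  <- plogis(0.3)
zoi <- plogis(-0.4)
coi <- plogis(-2)

mean_prop <- zoib_mean(mu, zoi, coi)  # expected proportion of the 28 days
mean_days <- 28 * mean_prop           # rescaled to days alive without life support
```

Because each component lies between 0 and 1, the expected proportion always stays within the valid range, in line with the property noted for this model.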

Posterior predictive check plots
The following plots illustrate the posterior predictive checks [10] for all models. The plots labelled 'densities' present posterior predictions for all patients in the COVID STEROID 2 trial [11] dataset for the two treatment groups; of note, these overlain density plots give a good overall view of the distributions, but the smoothing may lead to some visual artefacts primarily close to the