swdpwr: A SAS macro and an R package for power calculations in stepped wedge cluster randomized trials
Introduction
In cluster randomized trials (CRTs), the unit of randomization is the cluster, which improves administrative convenience and reduces treatment contamination [1]. In stepped wedge designs (SWDs), all clusters start out in the control condition and switch to the intervention condition in a unidirectional and randomly assigned order; once treated, clusters maintain their intervention status until the end of the study. At pre-specified time periods, a random subset of clusters crosses over from the control to the intervention condition. Stepped wedge randomization may be preferred for estimating intervention effects when it is logistically more convenient to roll out the intervention in a staggered fashion and when the stakeholders or participating clusters perceive the intervention to be beneficial to the target population [2].
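The crossover pattern described above can be sketched as a cluster-by-period indicator matrix. The following is a minimal illustration, not part of the swdpwr software: the function name `stepped_wedge_design` is hypothetical, and it assumes a complete design with one all-control baseline period and an equal number of clusters crossing over at each subsequent period.

```python
import random


def stepped_wedge_design(n_clusters, n_periods, seed=0):
    """Build a 0/1 treatment matrix (clusters x periods) for a complete SWD.

    Assumptions (illustrative only): period 0 is all-control, and the
    clusters are split evenly across the remaining n_periods - 1 steps.
    """
    n_steps = n_periods - 1
    if n_steps < 1 or n_clusters % n_steps != 0:
        raise ValueError("clusters must divide evenly across the steps")

    # Randomly assign the order in which clusters cross over.
    rng = random.Random(seed)
    order = list(range(n_clusters))
    rng.shuffle(order)

    per_step = n_clusters // n_steps
    design = [[0] * n_periods for _ in range(n_clusters)]
    for position, cluster in enumerate(order):
        first_treated = 1 + position // per_step
        # Unidirectional: once treated, the cluster stays treated.
        for period in range(first_treated, n_periods):
            design[cluster][period] = 1
    return order, design


order, design = stepped_wedge_design(n_clusters=6, n_periods=4)
for row in design:
    print(row)
```

Each row is one cluster and each column one period; every row is a run of zeros followed by a run of ones, which is the defining feature of the design.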
Two different stepped wedge designs have been considered: the cross-sectional design and the closed cohort design [3]. We do not consider open-cohort designs in this work because the related methods for binary outcomes are currently unavailable. Only complete designs with no transition periods are included. In a cross-sectional design, different participants are recruited at each time period in each cluster, whereas in a closed cohort design (which for simplicity will be referred to as a cohort design hereafter), participants are recruited at the beginning of the study and have repeated outcome measures at different time periods [4]. A distinguishing feature of all CRTs is that outcomes within the same cluster tend to be correlated with one another [5]. Because outcomes in SWDs are measured at different time periods, the within-period and between-period intracluster correlation coefficients may differ and thus should be considered separately when designing SWDs [5]. In a cohort SWD, an additional within-individual correlation should be included to account for the repeated measures on the same individual over time [6]. Two statistical models can be used to account for these three levels of intraclass correlation: the conditional model and the marginal model. Conditional models are based on mixed effects models [7], [8], which accommodate the intraclass correlations via latent random effects. Marginal models describe the population-averaged responses across cluster-periods and are usually fitted with generalized estimating equations (GEE) [9]. The interpretations of regression parameters can differ under these two models, with the important exception of the identity and log links when the random effects and the covariates are independent, as is typically assumed [10]. The design and analysis of SWDs have mostly been based on conditional models; see, for instance, Hussey and Hughes [2], Woertman et al. [11], Hemming et al. [12], Hooper et al. [13], and Li et al. [14]. As a complementary approach, marginal models specify the mean and correlation models separately and carry a straightforward population-averaged interpretation. Accordingly, assuming a continuous or binary outcome, Li et al. [15] proposed methods for the design and analysis of SWDs using marginal models. By contrast, the conditional model considers the causal effect of the intervention on individuals, under the assumption of no unmeasured confounding or other sources of bias.
Binary outcomes are common endpoints in cluster randomized trials. However, existing methods for sample size calculation in SWDs have focused almost exclusively on continuous outcomes. Hussey and Hughes [2] proposed an approach based on linear mixed effects models, estimated by weighted least squares for continuous outcomes, and provided an approximation to this approach for binary outcomes. Systematic reviews indicate that the majority of SWDs with binary outcomes used this approximation method [16], [17], which may either overestimate or underestimate the power in different scenarios [18]. To improve on this approximation, Zhou et al. [18] developed a maximum likelihood method for power calculations of SWDs with binary outcomes based on the mixed effects model, and Li et al. [15] proposed a method for binary outcomes within the framework of GEE that employs a block exchangeable within-cluster correlation structure with three correlation parameters.
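To make the Hussey and Hughes approach concrete, the sketch below computes the closed-form variance of the treatment effect estimator under their linear mixed model with a random cluster intercept and fixed period effects, then plugs it into a two-sided Wald-type power formula. It is an illustration under stated assumptions, not the method implemented in swdpwr: the function names `hh_variance` and `wald_power` and all parameter names are hypothetical, cluster-period sizes are assumed equal, and the variance components are assumed known.

```python
from math import sqrt
from statistics import NormalDist


def hh_variance(design, n_per_cell, sigma2_e, tau2):
    """Closed-form Var(theta_hat) from Hussey and Hughes (2007).

    design     : 0/1 matrix, rows = clusters, columns = periods
    n_per_cell : individuals per cluster-period (assumed equal)
    sigma2_e   : individual-level residual variance
    tau2       : variance of the random cluster intercept
    """
    n_clusters, n_periods = len(design), len(design[0])
    sigma2 = sigma2_e / n_per_cell  # variance of a cluster-period mean
    u = sum(sum(row) for row in design)
    w = sum(sum(row[j] for row in design) ** 2 for j in range(n_periods))
    v = sum(sum(row) ** 2 for row in design)
    numerator = n_clusters * sigma2 * (sigma2 + n_periods * tau2)
    denominator = (n_clusters * u - w) * sigma2 + (
        u ** 2 + n_clusters * n_periods * u - n_periods * w - n_clusters * v
    ) * tau2
    return numerator / denominator


def wald_power(effect, variance, alpha=0.05):
    """Approximate power of a two-sided Wald test (far tail ignored)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(abs(effect) / sqrt(variance) - z)


# Hypothetical inputs: 3 clusters, 4 periods, 20 individuals per cell.
design = [[0, 1, 1, 1], [0, 0, 1, 1], [0, 0, 0, 1]]
var = hh_variance(design, n_per_cell=20, sigma2_e=1.0, tau2=0.05)
print(wald_power(effect=0.5, variance=var))
```

For a binary outcome, the approximation mentioned above amounts to substituting binomial variances into the same continuous-outcome formula, which is why its accuracy depends on how well the normal approximation holds.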
These new methods have not yet been implemented in publicly available software, making it difficult for researchers and practitioners to apply them to rigorously design their studies. A few software packages currently exist for power calculations in SWDs. The Hussey and Hughes approach [2] is implemented in the R package swCRTdesign [19] and in an Excel spreadsheet (http://faculty.washington.edu/jphughes/pubs.html). Hemming and Girling [20] developed a menu-driven Stata program, steppedwedge, based on the Hussey and Hughes model. These approaches consider the linear mixed effects model for continuous outcomes and perform approximate calculations for binary outcomes. Hemming et al. [21] developed the Shiny CRT Calculator, programmed in R, which uses linear mixed effects models for continuous outcomes and includes the normal approximation for binary outcomes (https://github.com/karlahemming/Cluster-RCT-Sample-Size-Calculator). This method accommodates cross-sectional and cohort designs with three intraclass correlations (the intracluster correlation, the cluster autocorrelation (CAC) and the individual autocorrelation (IAC)) [13] that differ from those in our work. The IAC and CAC are both cluster mean correlations, whereas the intracluster correlation coefficients in our software are defined for within-cluster individual-level observations (the differences are clarified in Li et al. [22]). These approximations for binary outcomes may or may not be accurate, depending on whether the normal approximation to the binomial distribution is reasonable. Alternatively, Baio et al. [23] developed the R package SWSamp (https://sites.google.com/a/statistica.it/gianluca/swsamp), which allows simulation-based sample size and power calculations for several general scenarios, including cross-sectional and cohort designs for continuous, binary and count outcomes. 
However, this package does not allow for a random cluster-by-time interaction (and therefore assumes that the within-period intracluster correlation coefficient (ICC) is the same as the between-period ICC). In addition, two more recent R packages, clusterPower [24] and CRTpowerdist [25], also allow for simulation-based power calculation for continuous, binary and count outcomes, but with a focus on cross-sectional designs. Hence, in an effort to make computationally efficient (and non-simulation-based) power methods for continuous and binary outcomes under both the cross-sectional and closed-cohort designs more accessible, we have developed user-friendly software for the methods proposed by Zhou et al. [18] and Li et al. [15], based on the conditional and marginal models, respectively. In particular, we focus on the exchangeable, nested exchangeable, and block exchangeable within-cluster correlation structures. Methods for binary outcomes under other correlation structures (e.g., the exponential decay structure) are not implemented in the current version of our software. The software engine has been developed in Fortran and is incorporated into the SAS macro %swdpwr, the R package swdpwr and a Shiny app.
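The three correlation structures named above can be illustrated by building the within-cluster correlation matrix directly. This is a sketch for intuition only, not code from swdpwr: the function name `block_exchangeable` is hypothetical, and the symbols `alpha0`, `alpha1` and `alpha2` are assumed labels for the within-period, between-period and within-individual correlations, respectively.

```python
def block_exchangeable(n_individuals, n_periods, alpha0, alpha1, alpha2):
    """Within-cluster correlation matrix for a closed cohort design.

    Observations are ordered period by period, so index
    a = period * n_individuals + individual.
      alpha0: two different individuals, same period
      alpha1: two different individuals, different periods
      alpha2: same individual, different periods
    """
    size = n_individuals * n_periods
    corr = [[0.0] * size for _ in range(size)]
    for a in range(size):
        period_a, person_a = divmod(a, n_individuals)
        for b in range(size):
            period_b, person_b = divmod(b, n_individuals)
            if a == b:
                corr[a][b] = 1.0
            elif period_a == period_b:
                corr[a][b] = alpha0
            elif person_a == person_b:
                corr[a][b] = alpha2
            else:
                corr[a][b] = alpha1
    return corr


# Block exchangeable: three distinct correlation parameters.
for row in block_exchangeable(2, 2, alpha0=0.10, alpha1=0.05, alpha2=0.40):
    print(row)
```

Under this parameterization, setting `alpha2 = alpha1` gives the nested exchangeable structure appropriate for cross-sectional designs, where no individual is measured twice, and additionally setting `alpha1 = alpha0` gives the simple exchangeable structure.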
Methods
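Throughout this article, the regression parameter $\beta$ denotes the treatment effect. For testing the treatment effect, we consider the following hypothesis:
$$H_0: \beta = 0 \quad \text{versus} \quad H_A: \beta = \beta_A,$$
where $\beta_A$ is the true value of $\beta$ under the alternative hypothesis that $\beta_A \neq 0$. In this software, power is calculated based on a two-sided Wald-type test given by:
$$\text{power} = \Phi\left(\frac{|\beta_A|}{\sqrt{\mathrm{Var}(\hat{\beta})}} - z_{1-\alpha/2}\right),$$
where $\Phi$ is the cumulative distribution function of the standard normal distribution, $\alpha$ is the significance level, and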
Software description
Table 1 displays all the scenarios and correlation structures that are implemented in the software, accommodating cases and methods with and without time effects. The input parameters are the same for R and SAS. Hereafter, the mean response refers to the average outcome for continuous outcomes and to the proportion for binary outcomes. Different input parameter values, including for the anticipated mean response in the control group at the start of the study, the anticipated mean response in
Examples
The software runs on both R and SAS, which require separate illustrations. The following sections are organized according to different scenarios, such as continuous and binary outcomes, cross-sectional and cohort settings, different model options, different link functions, and different time effect assumptions. Each section contains examples for both platforms. The implementation of the SAS macro for these examples is shown in Appendix C. When the input arguments
Application
We provide two real-data applications and show their implementation in SAS.
Discussion
This article has described the use of the R package swdpwr and the SAS macro %swdpwr for power calculations in SWDs. The software is available on two platforms, where users specify input parameters for different scenarios of interest, accommodating cross-sectional and cohort designs, binary and continuous outcomes, marginal and conditional models, three link functions, models with and without time effects, and unequal allocation of clusters per sequence. This software addresses the
Declaration of Competing Interest
The authors have declared no conflict of interest.
Acknowledgments
This work was supported by the grants NIH/R01AI112339 and NIH/DP1ES025459.
References (40)
- et al. Design and analysis of stepped wedge cluster randomized trials. Contemp. Clin. Trials (2007)
- et al. The efficiency of stepped wedge vs. cluster randomized trials: stepped wedge studies do not always require a smaller sample size. J. Clin. Epidemiol. (2013)
- et al. Current issues in the design and analysis of stepped wedge trials. Contemp. Clin. Trials (2015)
- et al. Stepped wedge designs could reduce the required sample size in cluster randomized trials. J. Clin. Epidemiol. (2013)
- et al. Optimal allocation of clusters in cohort stepped wedge designs. Stat. Probab. Lett. (2018)
- et al. Sample size calculations for stepped wedge and cluster randomised trials: a unified approach. J. Clin. Epidemiol. (2016)
- et al. CRTpowerdist: an R package to calculate attained power and construct the power distribution for cross-sectional stepped-wedge and parallel cluster randomized trials. Comput. Methods Programs Biomed. (2021)
- et al. “Cross-sectional” stepped wedge designs always reduce the required sample size when there is no time effect. J. Clin. Epidemiol. (2017)
- et al. A note on “Design and analysis of stepped wedge cluster randomized trials”. Contemp. Clin. Trials (2015)
- Design and Analysis of Group-Randomized Trials (1998)
- Designing a stepped wedge trial: three main designs, carry-over effects and randomisation approaches. Trials
- Intra-cluster and inter-period correlation coefficients for cross-sectional cluster randomised controlled trials for type-2 diabetes in UK primary care. Trials
- Mixed-Effects Models in S and S-PLUS
- Approximate inference in generalized linear mixed models. J. Am. Stat. Assoc.
- Longitudinal data analysis using generalized linear models. Biometrika
- Equivalence of conditional and marginal regression models for clustered and longitudinal data. Stat. Methods Med. Res.
- Stepped-wedge cluster randomised controlled trials: a generic framework including parallel and multiple-level designs. Stat. Med.
- Sample size calculation for stepped wedge and other longitudinal cluster randomised trials. Stat. Med.
- Sample size determination for GEE analyses of stepped wedge cluster randomized trials. Biometrics
- Systematic review finds major deficiencies in sample size methodology and reporting for stepped-wedge cluster randomised trials. BMJ Open