swdpwr: A SAS macro and an R package for power calculations in stepped wedge cluster randomized trials

doi:10.1016/j.cmpb.2021.106522

Computer Methods and Programs in Biomedicine

Volume 213, January 2022, 106522

Highlights

• User-friendly SAS macro, R package and Shiny app for power calculations in stepped wedge designs.
• Accommodation of various scenarios for binary and continuous outcomes.
• Addressing the implementation gap and providing more accurate power calculations for binary outcomes.

Abstract

Background and objective: The stepped wedge cluster randomized trial is a study design increasingly used in a wide variety of settings, including public health intervention evaluations and clinical and health service research. Previous studies presenting power calculation methods for stepped wedge designs (SWDs) have focused on continuous outcomes and relied on normal approximations for binary outcomes. These approximations may or may not be accurate, depending on whether the normal approximation to the binomial distribution is reasonable, yet they have been widely used for binary outcomes. To improve on them, two new methods for SWDs with binary outcomes have recently been published. However, these new methods have not been implemented in publicly available software. The objective of this paper is to present power calculation software for SWDs in various settings for both continuous and binary outcomes.

Methods: We have developed a SAS macro %swdpwr, an R package swdpwr and a Shiny app for power calculations in SWDs. The software accommodates cross-sectional and cohort designs, binary and continuous outcomes, marginal and conditional models, three link functions, and models with and without time effects, under exchangeable, nested exchangeable and block exchangeable correlation structures. Unequal numbers of clusters per sequence are also allowed. Power calculations for a closed cohort employ a block exchangeable within-cluster correlation structure that accounts for three intracluster (intraclass) correlations: the within-period, between-period, and within-individual correlations. Cross-sectional designs allow for nested exchangeable or exchangeable correlation structures defined by the within-period and between-period intracluster correlations only. Our software assumes a complete design and equal cluster-period sizes. While the methods accommodate correlation structures with a constant within-period intracluster correlation coefficient (ICC) as well as structures with different within- and between-period ICCs, they do not allow the between-period ICC to decay.

Results: swdpwr provides an efficient tool to support investigators in the design and analysis of stepped wedge cluster randomized trials. swdpwr addresses the implementation gap between newly proposed methodology and its application, yielding more accurate power calculations in SWDs.

Conclusions: In an effort to make computationally efficient (and non-simulation-based) power methods under both the cross-sectional and closed-cohort designs for continuous and binary outcomes more accessible, we have developed this user-friendly software. swdpwr is implemented on two platforms, SAS and R, satisfying the needs of investigators from various backgrounds. Additionally, the Shiny app enables users who are not able to use SAS or R to apply these methods online in a straightforward way.

Introduction

In cluster randomized trials (CRTs), the unit of randomization is the cluster, which improves administrative convenience and reduces treatment contamination [1]. In stepped wedge designs (SWDs), all clusters start out in the control condition and switch to the intervention condition in a unidirectional and randomly assigned order; once treated, the clusters maintain their intervention status until the end of the study. At pre-specified time periods, a random subset of clusters crosses over from the control to the intervention condition. Stepped wedge randomization may be preferred for estimating intervention effects when it is logistically more convenient to roll out the intervention in a staggered fashion and when the stakeholders or participating clusters perceive the intervention to be beneficial to the target population [2].
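The crossover pattern described above can be written down as a 0/1 treatment matrix with one row per cluster and one column per period. The following is a minimal sketch in Python, not code from the swdpwr software: the function name `stepped_wedge_design` and its argument are hypothetical, and it assumes a complete design in which every period after the first has exactly one sequence crossing over. The random assignment of clusters to sequences is not shown.

```python
def stepped_wedge_design(clusters_per_sequence):
    """Return a 0/1 treatment matrix with one row per cluster.

    clusters_per_sequence[s] is the number of clusters in sequence s
    (hypothetical argument name). Sequence s is under control for its
    first s + 1 periods and under intervention afterwards, so a design
    with S sequences has S + 1 periods and an all-control first period.
    """
    n_sequences = len(clusters_per_sequence)
    n_periods = n_sequences + 1
    design = []
    for seq, n_clusters in enumerate(clusters_per_sequence):
        # Unidirectional switch: zeros (control) followed by ones (intervention).
        row = [0] * (seq + 1) + [1] * (n_periods - seq - 1)
        for _ in range(n_clusters):
            design.append(list(row))
    return design


# Three sequences with an unequal number of clusters per sequence.
design = stepped_wedge_design([2, 1, 1])
for row in design:
    print(row)
```

Each row is non-decreasing, which encodes the rule that a cluster never returns to the control condition once treated.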

Two different stepped wedge designs have been considered: the cross-sectional design and the closed cohort design [3]. We do not consider open-cohort designs in this work as the related methods for binary outcomes are currently unavailable. Only complete designs with no transition periods are included. In a cross-sectional design, different participants are recruited at each time period in each cluster, while in a closed cohort design (which for simplicity will be referred to as a cohort design hereafter), participants are recruited at the beginning of the study and have repeated outcome measures at different time periods [4]. A distinguishing feature of all CRTs is that outcomes within the same cluster tend to be correlated with one another [5]. Because in SWDs outcomes are measured at different time periods, the within-period and between-period intracluster correlation coefficients may be different and thus should be separately considered in designing SWDs [5]. An additional within-individual correlation should be included in a cohort SWD to account for the repeated measures within the same individual over time [6]. Two statistical models can be used to account for these three levels of intraclass correlation: the conditional model and the marginal model. Conditional models are based on mixed effects models [7], [8], which accommodate the intraclass correlations via latent random effects. Marginal models describe the population-averaged responses across cluster-periods, and are usually fitted with generalized estimating equations (GEE) [9]. The interpretations of regression parameters can be different under these two models, with the important exception of the identity and log links when random effects and the covariates are independent, as is typically assumed [10]. The design and analysis of SWDs have been mostly based on conditional models, for instance, Hussey and Hughes [2], Woertman et al. [11], Hemming et al. [12], Hooper et al. [13], Li et al. [14]. As a complementary approach, the marginal models separately consider the mean and correlation models and carry a straightforward population-averaged interpretation. Accordingly, assuming a continuous or binary outcome, Li et al. [15] proposed methods for the design and analysis of SWDs using marginal models. Alternatively, the conditional model considers the causal effect of interventions on individuals, under the assumption that no unmeasured confounders or other sources of bias are present.
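To make the three correlation levels concrete, the sketch below builds the within-cluster correlation matrix for one cluster. It is an illustration only, not code from the software: the function names and the labels `alpha0` (within-period), `alpha1` (between-period) and `alpha2` (within-individual) are assumptions introduced here, and observations are assumed to be ordered period by period with the same number of individuals in every period.

```python
def block_exchangeable(n_periods, n_individuals, alpha0, alpha1, alpha2):
    """Correlation matrix for one cluster, observations ordered by period.

    alpha0: different individuals, same period (within-period)
    alpha1: different individuals, different periods (between-period)
    alpha2: same individual, different periods (within-individual)
    """
    size = n_periods * n_individuals
    corr = [[0.0] * size for _ in range(size)]
    for a in range(size):
        period_a, person_a = divmod(a, n_individuals)
        for b in range(size):
            period_b, person_b = divmod(b, n_individuals)
            if a == b:
                corr[a][b] = 1.0
            elif period_a == period_b:
                corr[a][b] = alpha0
            elif person_a == person_b:
                corr[a][b] = alpha2
            else:
                corr[a][b] = alpha1
    return corr


def nested_exchangeable(n_periods, n_individuals, alpha0, alpha1):
    # Cross-sectional case: nobody is measured twice, so there is no separate
    # within-individual correlation; it collapses to the between-period value.
    return block_exchangeable(n_periods, n_individuals, alpha0, alpha1, alpha1)


def exchangeable(n_periods, n_individuals, alpha0):
    # A single correlation shared by all pairs of observations in the cluster.
    return nested_exchangeable(n_periods, n_individuals, alpha0, alpha0)


# Two periods, two individuals per period, cohort design.
for row in block_exchangeable(2, 2, alpha0=0.10, alpha1=0.05, alpha2=0.40):
    print(row)
```

Writing the simpler structures as special cases of the block exchangeable one mirrors how the cross-sectional design is described by the within-period and between-period correlations only.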

Binary outcomes are frequently seen in cluster randomized trials as endpoints. However, existing methods for sample size calculation of SWDs have been almost exclusively focused on continuous outcomes. Hussey and Hughes [2] proposed an approach based on linear mixed effects models, estimated by weighted least squares for continuous outcomes, and provided an approximation to this approach for binary outcomes. Systematic reviews indicate that the majority of SWDs with binary outcomes used this approximation method [16], [17], which may either overestimate or underestimate the power in different scenarios [18]. To improve this approximation, Zhou et al. [18] developed a maximum likelihood method for power calculations of SWDs with binary outcomes based on the mixed effects model and Li et al. [15] proposed a method for binary outcomes within the framework of GEE that employed a block exchangeable within-cluster correlation structure with three correlation parameters.
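For orientation, the closed-form variance of the treatment effect estimator under the linear mixed model with a random cluster intercept and fixed period effects, as commonly attributed to Hussey and Hughes [2], can be sketched as follows. This illustrates the continuous-outcome approach that the approximation is built on, not the maximum likelihood or GEE methods implemented in swdpwr. The function name and argument names are hypothetical, and equal cluster-period sizes are assumed.

```python
def hussey_hughes_variance(design, n_per_cell, sigma2_e, tau2):
    """Variance of the estimated treatment effect for a 0/1 design matrix.

    design:     list of rows (clusters) of 0/1 treatment indicators per period
    n_per_cell: individuals per cluster-period (assumed equal)
    sigma2_e:   individual-level residual variance
    tau2:       variance of the random cluster intercept
    """
    n_clusters = len(design)
    n_periods = len(design[0])
    sigma2 = sigma2_e / n_per_cell  # variance of a cluster-period mean
    # U: number of treated cluster-periods; W, V: sums of squared column/row totals.
    u = sum(sum(row) for row in design)
    w = sum(sum(row[j] for row in design) ** 2 for j in range(n_periods))
    v = sum(sum(row) ** 2 for row in design)
    numerator = n_clusters * sigma2 * (sigma2 + n_periods * tau2)
    denominator = (
        (n_clusters * u - w) * sigma2
        + (u ** 2 + n_clusters * n_periods * u - n_periods * w - n_clusters * v) * tau2
    )
    return numerator / denominator


# Two clusters, three periods, one cluster crossing over at each step.
toy_design = [[0, 1, 1], [0, 0, 1]]
print(hussey_hughes_variance(toy_design, n_per_cell=1, sigma2_e=1.0, tau2=0.0))
print(hussey_hughes_variance(toy_design, n_per_cell=1, sigma2_e=1.0, tau2=1.0))
```

In the toy design only the second period contains one treated and one control cluster, so with no cluster effect (`tau2 = 0`) the variance reduces to that of a difference of two cell means.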

These new methods have not yet been implemented in publicly available software, making it difficult for researchers and practitioners to apply them to rigorously design their studies. There are a few current software packages for power calculations in SWDs. The Hussey and Hughes approach [2] is implemented in the R package swCRTdesign [19] and in an Excel spreadsheet (http://faculty.washington.edu/jphughes/pubs.html). Hemming and Girling [20] developed a Stata menu-driven program, steppedwedge, based on the Hussey and Hughes model. These approaches consider the linear mixed effects model for continuous outcomes and perform approximate calculations for binary outcomes. Hemming et al. [21] developed the Shiny CRT Calculator, programmed in R, using linear mixed effects models for continuous outcomes and the normal approximation for binary outcomes (https://github.com/karlahemming/Cluster-RCT-Sample-Size-Calculator). This method accommodates cross-sectional and cohort designs with three intraclass correlations (intracluster correlation, cluster autocorrelation (CAC) and individual autocorrelation (IAC)) [13] that are different from those in our work. The IAC and CAC are both cluster mean correlations, whereas the intracluster correlation coefficients in our software are defined for within-cluster individual-level observations (their differences are clarified in Li et al. [22]). These approximations for binary outcomes may or may not be accurate, depending on whether the normal approximation to the binomial distribution is reasonable. Alternatively, Baio et al. [23] developed the R package SWSamp (https://sites.google.com/a/statistica.it/gianluca/swsamp), which allows simulation-based sample size and power calculations for several general scenarios including cross-sectional and cohort designs for continuous, binary and count outcomes. However, this package does not allow for a random cluster-by-time interaction (therefore assuming the within-period intracluster correlation coefficient (ICC) is the same as the between-period ICC). In addition, two more recent R packages, clusterPower [24] and CRTpowerdist [25], also allow for simulation-based power calculation for continuous, binary and count outcomes, but with a focus on cross-sectional designs. Hence, in an effort to make computationally efficient (and non-simulation-based) power methods under both the cross-sectional and closed-cohort designs for continuous and binary outcomes more accessible, we have developed user-friendly software for the methods proposed by Zhou et al. [18] and Li et al. [15], based on the conditional and marginal models, respectively. In particular, we focus on the exchangeable, nested exchangeable, and block exchangeable within-cluster correlation structures. Methods for binary outcomes under other types of correlation structures (e.g., the exponential decay structure) are not implemented in the current version of our software. The software engine has been developed in Fortran and is incorporated into the SAS macro %swdpwr, the R package swdpwr and a Shiny app.

Section snippets

Methods

Throughout this article, the regression parameter $β$ denotes the treatment effect. For testing the treatment effect, we consider the following hypotheses: $H_{0} : β = 0$ versus $H_{A} : β \neq 0$, where $β_{A}$ is the true value of $β$ under the alternative hypothesis ($β_{A} \neq 0$). In this software, power is calculated based on a two-sided Wald-type test and is given by $Φ \left( \frac{| β_{A} |}{\sqrt{\mathrm{Var} (\hat{β})}} - Z_{1 - α / 2} \right) + Φ \left( - \frac{| β_{A} |}{\sqrt{\mathrm{Var} (\hat{β})}} - Z_{1 - α / 2} \right)$, where $Φ (\cdot)$ is the cumulative distribution function of the standard normal distribution, $α$ is the significance level, and $Z_{1 -}$
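The power expression above is straightforward to evaluate once a value for the variance of the treatment effect estimator is available. The sketch below uses only the Python standard library; the function name `wald_power` and its argument names are hypothetical, and the variance is taken as a given input rather than derived from a design, which is the part the software actually computes.

```python
from statistics import NormalDist


def wald_power(beta_a, var_beta_hat, alpha=0.05):
    """Power of a two-sided Wald-type test of H0: beta = 0.

    beta_a:       assumed true treatment effect under the alternative
    var_beta_hat: variance of the treatment effect estimator (an input here)
    alpha:        two-sided significance level
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = abs(beta_a) / var_beta_hat ** 0.5
    # Sum of the two rejection-region probabilities under the alternative.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


# Illustrative inputs only; they do not come from the article.
print(wald_power(beta_a=0.5, var_beta_hat=0.03, alpha=0.05))
```

As a sanity check on the formula, setting the effect to zero makes both terms equal to $α / 2$, so the function returns the significance level itself.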

Software description

Table 1 displays all the scenarios and correlation structures that are implemented in the software, accommodating cases and methods with and without time effects. The input parameters are the same for R and SAS. Hereafter, the mean response refers to the average outcome and the proportion for continuous and binary outcomes, respectively. Different input parameters values, including for the anticipated mean response in the control group at the start of the study, the anticipated mean response in

Examples

The usage of the software is based on platforms of R and SAS, which requires separate illustrations. The following sections are organized according to different scenarios such as continuous and binary outcomes, cross-sectional and cohort settings, different model options, different link functions, different time effects assumptions. Each section will contain examples under both platforms. The implementation of SAS macro for these examples will be shown in Appendix C. When the input arguments

Application

We provide two real-data applications and show their implementation in SAS.

Discussion

This article has described the use of the R package swdpwr and the SAS macro %swdpwr for power calculations in SWDs. The software is designed for two computer platforms on which users specify input parameters for different scenarios of interest, accommodating cross-sectional and cohort designs, binary and continuous outcomes, marginal and conditional models, three link functions, models with and without time effects, and unequal allocation of clusters per sequence. This software addresses the

Declaration of Competing Interest

The authors have declared no conflict of interest.

Acknowledgments

This work was supported by the grants NIH/R01AI112339 and NIH/DP1ES025459.

References (40)

M.A. Hussey et al. Design and analysis of stepped wedge cluster randomized trials. Contemp. Clin. Trials (2007)
K. Hemming et al. The efficiency of stepped wedge vs. cluster randomized trials: stepped wedge studies do not always require a smaller sample size. J. Clin. Epidemiol. (2013)
J.P. Hughes et al. Current issues in the design and analysis of stepped wedge trials. Contemp. Clin. Trials (2015)
W. Woertman et al. Stepped wedge designs could reduce the required sample size in cluster randomized trials. J. Clin. Epidemiol. (2013)
F. Li et al. Optimal allocation of clusters in cohort stepped wedge designs. Stat. Probab. Lett. (2018)
K. Hemming et al. Sample size calculations for stepped wedge and cluster randomised trials: a unified approach. J. Clin. Epidemiol. (2016)
Y. Ouyang et al. CRTpowerdist: an R package to calculate attained power and construct the power distribution for cross-sectional stepped-wedge and parallel cluster randomized trials. Comput. Methods Programs Biomed. (2021)
X. Zhou et al. “Cross-sectional” stepped wedge designs always reduce the required sample size when there is no time effect. J. Clin. Epidemiol. (2017)
X. Liao et al. A note on “Design and analysis of stepped wedge cluster randomized trials”. Contemp. Clin. Trials (2015)
D.M. Murray. Design and Analysis of Group-Randomized Trials (1998)
A.J. Copas et al. Designing a stepped wedge trial: three main designs, carry-over effects and randomisation approaches. Trials (2015)
J. Martin et al. Intra-cluster and inter-period correlation coefficients for cross-sectional cluster randomised controlled trials for type-2 diabetes in UK primary care. Trials (2016)
J. Pinheiro et al. Mixed-Effects Models in S and S-PLUS (2006)
N.E. Breslow et al. Approximate inference in generalized linear mixed models. J. Am. Stat. Assoc. (1993)
K.-Y. Liang et al. Longitudinal data analysis using generalized linear models. Biometrika (1986)
J. Ritz et al. Equivalence of conditional and marginal regression models for clustered and longitudinal data. Stat. Methods Med. Res. (2004)
K. Hemming et al. Stepped-wedge cluster randomised controlled trials: a generic framework including parallel and multiple-level designs. Stat. Med. (2015)
R. Hooper et al. Sample size calculation for stepped wedge and other longitudinal cluster randomised trials. Stat. Med. (2016)
F. Li et al. Sample size determination for GEE analyses of stepped wedge cluster randomized trials. Biometrics (2018)
J. Martin et al. Systematic review finds major deficiencies in sample size methodology and reporting for stepped-wedge cluster randomised trials. BMJ Open (2016)

Cited by (7)

Power calculation for detecting interaction effect in cross-sectional stepped-wedge cluster randomized trials: an important tool for disparity research
2024, BMC Medical Research Methodology
An introduction to the statistical analysis of stepped cluster randomized trials
2024, Chinese Journal of Evidence-Based Medicine
Estimating intra-cluster correlation coefficients for planning longitudinal cluster randomized trials: A tutorial
2023, International Journal of Epidemiology
A general method for calculating power for GEE analysis of complete and incomplete stepped wedge cluster randomized trials
2023, Statistical Methods in Medical Research
power swgee: GEE-based power calculations in stepped wedge cluster randomized trials
2022, Stata Journal
Sample size calculators for planning stepped-wedge cluster randomized trials: a review and comparison
2022, International Journal of Epidemiology
