Sample size calculation in cluster randomization clinical trials

The question of sample size is basic to the planning of any clinical study. When a survey is carried out using cluster sampling, between-cluster variation at each level of sampling contributes an additional source of variation, which must be allowed for in addition to between-subject (i.e. within-cluster) variation in order to validly estimate parameters or test their significance. The number of subjects needed for a cluster randomization trial is larger than for a study of the same power in which individual subjects are randomly sampled. This paper presents illustrations that quantify the impact of increasing intraclass correlation on sample size.


Introduction
Cluster randomization trials (CRTs) are experiments in which entire social units or clusters of subjects, rather than independent subjects, are randomly allocated to intervention groups. For example, villages are selected as the randomization unit in clinical trials evaluating the efficacy of disease screening programs, and schools are selected as the randomization unit in trials evaluating the impact of nutritional pills on children's health. Randomizing individuals to treatments is not always feasible, and cluster randomized trials are increasingly being utilized in the evaluation of health care interventions. [1] In school-based smoking intervention studies, randomization of schools rather than students to different treatment conditions is the usual approach to sampling. [2][3] CRTs are gaining popularity in health research to deal with large-scale population surveys. There are a few challenges in designing CRTs. One is who may provide consent on behalf of a particular group and on what authority they may do so. Another is that, in CRTs, the units of randomization and observation may not be the same: the group that receives the experimental treatment may not be the same as the group from which data are collected. For example, in a trial assessing the efficiency of two surgical instruments, surgeons are randomized to use one of the instruments, while the data collected are the post-operative pain scores of the patients operated on by those surgeons. Weijer et al. have discussed these challenges in designing CRTs, and the research community and regulators are working persistently to overcome them. [4] This research work addresses one such challenge: calculating the sample size for CRTs.

Intraclass correlation coefficient
The intraclass correlation coefficient ρ (ICC) measures the degree of similarity among responses within the same cluster. This parameter ρ may be interpreted as the standard Pearson correlation coefficient between any two responses in the same cluster. In designing cluster-based randomized trials or intervention studies, accurate estimates of ICCs are required for the sample size calculation to achieve the desired power. In my earlier research, the ICCs at two and three levels were illustrated in detail. [5][6]
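The interpretation of ρ as a within-cluster correlation can be written out under a one-way random-effects model. This model, and the symbols $Y_{ij}$, $\mu$, $u_i$, $e_{ij}$, $\sigma_b^2$ and $\sigma_w^2$, are assumptions introduced here for illustration and are not part of the original text.

```latex
% Assumed model: response of subject j in cluster i
Y_{ij} = \mu + u_i + e_{ij}, \qquad
u_i \sim (0, \sigma_b^2), \quad e_{ij} \sim (0, \sigma_w^2), \quad u_i \perp e_{ij}

% ICC as the share of total variance lying between clusters
\rho = \frac{\sigma_b^2}{\sigma_b^2 + \sigma_w^2}

% Equivalently, the correlation between two distinct subjects in the same cluster
\operatorname{Corr}(Y_{ij}, Y_{ik}) = \rho \qquad (j \neq k)
```

Under this sketch, ρ = 0 means that subjects in the same cluster are no more alike than subjects in different clusters, while values of ρ close to 1 mean that almost all of the variation lies between clusters.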

Variation inflation factor
Variation inflation factor (VIF) is the ratio of the variance of an overall sample mean estimated from cluster means to the variance of an overall sample mean estimated from subjects within clusters. Generally, the VIF is a function of the average cluster size and the intraclass correlation coefficient (ICC) for the outcome variable under study, i.e.

$$\mathrm{VIF} = 1 + (\gamma - 1)\rho$$

where γ is the average number of subjects per cluster and ρ is the ICC for the outcome variable. To estimate the required sample size, the design effect or variation inflation factor (VIF) must be incorporated into the sample-size calculation. [7] In my earlier research, the VIF at two and three levels was illustrated in detail. [5][6]

Concepts
Treating units as independent ignores the variability at higher levels and therefore leads to inferences with inflated power. [1][8][9][10][11] Nesting violates the assumption of independence of observations, and ignoring this dependency in the data yields inflated test statistics when observations are correlated.
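One way to see the cost of treating clustered subjects as independent is to compute the variation inflation factor, taken here in its usual form VIF = 1 + (γ − 1)ρ, together with the number of independent subjects that would carry the same information. The following is a minimal sketch; the function names and the example values (10 clusters of 100 subjects, ICC of 0.01) are hypothetical and chosen only for illustration.

```python
def variation_inflation_factor(cluster_size: float, icc: float) -> float:
    """Return VIF = 1 + (cluster_size - 1) * icc for an average cluster size."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc is assumed to lie in [0, 1]")
    return 1.0 + (cluster_size - 1.0) * icc


def effective_sample_size(n_clusters: int, cluster_size: float, icc: float) -> float:
    """Number of independent subjects equivalent to the clustered sample."""
    total_subjects = n_clusters * cluster_size
    return total_subjects / variation_inflation_factor(cluster_size, icc)


# Hypothetical example: 10 clusters of 100 subjects with ICC = 0.01.
vif = variation_inflation_factor(cluster_size=100, icc=0.01)
ess = effective_sample_size(n_clusters=10, cluster_size=100, icc=0.01)
print(f"VIF = {vif:.2f}; effective sample size = {ess:.1f} of 1000 enrolled")
```

In this example, an analysis that treats all 1000 subjects as independent would count roughly twice as much information as the clustered data actually carry, which is how the inflated power arises.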
Decisions have to be made first about the number of clusters to be selected and second about the number of units to be selected from each cluster. Even very small ICC values may have a big impact on sample-size estimation. Several authors have discussed how to use ICC estimates in calculating the number of clusters needed per treatment to detect a treatment effect. The use of ICC estimates for sample-size calculation is illustrated below for testing a hypothesis about the difference between the means of two treatment groups. The type I error is fixed at $\alpha$, and we want the test to have power $1-\beta$. If we were using simple random sampling (SRS), the sample size required would be:

$$N = \frac{2\left(Z_{1-\alpha/2} + Z_{1-\beta}\right)^{2}\sigma^{2}}{\delta^{2}}$$

Here, $N$ is the number of subjects required per treatment group, $Z_{1-\alpha/2}$ and $Z_{1-\beta}$ are the values of the standard normal variate for which the probability of smaller values is $1-\alpha/2$ and $1-\beta$ respectively, $\sigma^{2}$ is the variance (assumed common) in each treatment group, and $\delta$ is the difference, in either direction, between the treatment means that we want to detect. If we fix the number of subjects per cluster at $\gamma$, the number of clusters $n$ required using SRS is obtained from the above formula by taking $N = n\gamma$. To take into account the intraclass correlation, we have to multiply the variance by the variation inflation factor, VIF. The number of clusters required using cluster sampling for each treatment group will be:

$$n = \frac{2\left(Z_{1-\alpha/2} + Z_{1-\beta}\right)^{2}\sigma^{2}\,\mathrm{VIF}}{\gamma\,\delta^{2}}$$

where $\mathrm{VIF} = 1 + (\gamma - 1)\rho$ and $\rho$ is the intraclass correlation. [8][9]

Illustrations
This is a simulated case study to calculate the sample size in a cluster randomization clinical trial assessing the effect of nutrients on height in infants completing 3 years.
To detect a difference of 1.1 inches (34.5 and 33.4 inches in the treatment and placebo groups, respectively) with a common standard deviation of 6.2 inches in both treatment groups and a 5% type I error, Table 1 presents the total sample size per group required to achieve 80% power as the ICC increases, considering 100 infants in each cluster; hence 100 is the average number of subjects per cluster.
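The calculation for this case study can be sketched in a few lines of Python using only the standard library. The function and parameter names are hypothetical, the VIF is taken as 1 + (γ − 1)ρ, and the results are left unrounded because the rounding convention behind Table 1 is not stated; this is an illustration of the formulas, not a reproduction of the table.

```python
from statistics import NormalDist


def srs_sample_size(delta, sigma, alpha=0.05, power=0.80):
    """Subjects per group under simple random sampling (two-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # Z_{1 - alpha/2}
    z_beta = NormalDist().inv_cdf(power)           # Z_{1 - beta}
    return 2 * (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2


def clusters_per_group(delta, sigma, cluster_size, icc, alpha=0.05, power=0.80):
    """Clusters per group after inflating the variance by the VIF."""
    vif = 1 + (cluster_size - 1) * icc
    return srs_sample_size(delta, sigma, alpha, power) * vif / cluster_size


# Case-study inputs: difference 1.1 in, SD 6.2 in, 5% type I error, 80% power.
cluster_size = 100
n_srs = srs_sample_size(delta=1.1, sigma=6.2)
print(f"SRS subjects per group (unrounded): {n_srs:.1f}")

# A few ICC values chosen for illustration.
for icc in (0.0, 0.001, 0.01, 0.1):
    n = clusters_per_group(1.1, 6.2, cluster_size, icc)
    print(f"ICC={icc}: {n:.2f} clusters, {n * cluster_size:.0f} subjects per group")
```

The unrounded SRS figure is close to 499 subjects per group, which corresponds to five clusters of 100 once rounded up to whole clusters. In practice the number of clusters would be rounded up, for example with `math.ceil`.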

Discussion and conclusion
Focusing on Table 1, case 1 with ICC=0 represents SRS, and the total sample size per group is 500. Relative to this SRS case, the sample size and the % increase in sample size are calculated with increasing ICC. In case 2, even a very small ICC=0.001 increases the sample size by 10%, and almost the same increment in sample size is evident with each further increase in ICC of 0.001. In case 11, ICC=0.01 increases the total sample size almost twofold. In case 16, ICC=0.1 increases the sample size by almost 10 fold. Figure 1 shows an increasing trend, depicting the impact of increasing ICC on sample size. In conclusion, the role of even a very small ICC cannot be ignored while designing CRTs. To draw inferences from cluster randomized clinical trials, a larger sample size is required to produce the same power as under SRS schemes. The VIF and ICC have to be incorporated into sample size calculations in order to furnish a precise sample size that meets the power requirement.
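The increases quoted above can be checked by hand, assuming the variation inflation factor takes the form VIF = 1 + (γ − 1)ρ with γ = 100 subjects per cluster. The percentage increase over SRS is then (VIF − 1) × 100. This is a worked check of the arithmetic only, not a reproduction of Table 1.

```latex
\begin{aligned}
\rho = 0.001 &: \quad \mathrm{VIF} = 1 + 99 \times 0.001 = 1.099
  &&\Rightarrow\ \text{increase} = 9.9\% \\
\rho = 0.01  &: \quad \mathrm{VIF} = 1 + 99 \times 0.01 = 1.99
  &&\Rightarrow\ \text{increase} = 99\% \\
\rho = 0.1   &: \quad \mathrm{VIF} = 1 + 99 \times 0.1 = 10.9
  &&\Rightarrow\ \text{increase} = 990\%
\end{aligned}
```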