Yoko Tanaka

Multi-regional clinical trials have been widely used for efficient global new drug development. Because patient populations may be heterogeneous, it is critical to evaluate the consistency of treatment effects across regions in a multi-regional trial in order to determine whether the overall treatment effect applies to the patients in individual regions. Quan et al. (2010) proposed definitions for assessing consistency of treatment effects in multi-regional trials. To facilitate the application of their ideas to the design of multi-regional trials, in this paper we provide the corresponding R functions for calculating the unconditional and conditional probabilities of demonstrating consistency in relation to the overall and regional sample sizes and the anticipated treatment effects. Detailed step-by-step instructions and trial examples are also provided to illustrate the application of these R functions.


Introduction
The application of multi-regional clinical trials (MRCT) in global new drug development presents opportunities as well as challenges. One of the challenges is to assess consistency of treatment effects across different regions in order to determine the applicability of the overall treatment effect to the patients in individual regions. By consistency, we specifically refer to the similarity of treatment effects across regions. Some regulatory agencies (e.g., see Japanese guidance in Ministry of Health, Labour and Welfare of Japan 2007 for Joint International Clinical Trials) recommend incorporating consistency assessment as a part of the trial objective. To follow this recommendation, at the design stage, we have to first specify a method or definition for consistency assessment and then derive the appropriate overall and regional sample sizes to ensure desired probabilities for demonstrating consistency based on the selected definition.
There could be many different ways to define consistency. Quan et al. (2010) explored a number of consistency definitions. Due to space limitations, they only briefly presented the formulas for calculating the unconditional and conditional probabilities for demonstrating consistency. Nevertheless, they provided extensive computation and simulation results to compare these definitions under different parameter configurations. As they pointed out, many factors can affect the probability of demonstrating consistency. To facilitate the application of their ideas, in this paper we provide the source code of R (R Development Core Team 2011) functions for the computations and simulations. Thus, MRCT designers can conveniently perform explorations at the design stage to help them determine trial specifications.

Definitions for consistency assessment
Let $s$ be the number of regions in a MRCT and let $X_{ij}$ and $Y_{ij}$ denote, respectively, the control and experimental treatment group endpoint values for the $j$th patient within region $i$, assumed to have normal distributions:
$$X_{ij} \sim N(\mu_{iX}, \sigma_i^2), \qquad Y_{ij} \sim N(\mu_{iY}, \sigma_i^2).$$
To simplify the presentation, at the design stage, we assume that within a region there are equal numbers of patients in each treatment group, and we further assume that the variances are the same across groups and regions, i.e., $\sigma_1^2 = \cdots = \sigma_s^2 = \sigma^2$. If this is not the case, slight modifications in the following formulas and in the programs should be made before the calculations. Let $N_i$ denote the number of patients per treatment group for the $i$th region, and $N = \sum_{i=1}^{s} N_i$ be the total number of patients in the trial in each treatment arm. Further, let $\delta_i = \mu_{iY} - \mu_{iX}$ be the true treatment effect within region $i$, assuming that a larger value implies a better outcome. We estimate $\delta_i$ by the sample mean difference between treatment groups within region $i$:
$$\hat\delta_i = \bar{Y}_{i\cdot} - \bar{X}_{i\cdot} \sim N\left(\delta_i, \frac{2\sigma^2}{N_i}\right).$$
A common estimate of an overall treatment effect is
$$\hat\delta = \sum_{i=1}^{s} \frac{N_i \hat\delta_i}{N} \sim N\left(\delta, \frac{2\sigma^2}{N}\right), \qquad \delta = \sum_{i=1}^{s} \frac{N_i \delta_i}{N}. \tag{1}$$
Via Equation 1, the per group overall sample size that achieves $1 - \beta$ power to detect an overall treatment effect of $\delta$ with a significance level $\alpha$ one-sided test is
$$N = \frac{2\sigma^2 (z_\alpha + z_\beta)^2}{\delta^2},$$
where $z_a$ is the $(1 - a)$ quantile of the standard normal distribution.
Let $f_i$ denote the proportion of patients within region $i$, and $u_i$ the ratio of the treatment effect in region $i$ to the overall effect:
$$f_i = \frac{N_i}{N}, \qquad u_i = \frac{\delta_i}{\delta}.$$
Then $\sum_{i=1}^{s} f_i = 1$ and $\sum_{i=1}^{s} u_i f_i = 1$.
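As a quick numerical check of the sample size formula, the following R sketch computes the per-group overall sample size and splits it across regions. The helper name `overall_n` and its default `sigma = 1` (a standardized effect) are illustrative assumptions, not part of the functions described in this paper.

```r
# Per-group overall sample size for a one-sided level-alpha test
# with power 1 - beta. `overall_n` is a hypothetical helper name.
overall_n <- function(alpha, beta, delta, sigma = 1) {
  z_alpha <- qnorm(1 - alpha)  # (1 - alpha) quantile of N(0, 1)
  z_beta  <- qnorm(1 - beta)   # (1 - beta) quantile of N(0, 1)
  ceiling(2 * sigma^2 * (z_alpha + z_beta)^2 / delta^2)
}

N   <- overall_n(alpha = 0.025, beta = 0.2, delta = 0.25)
f   <- c(1/3, 1/3, 1/3)  # regional proportions, summing to 1
N_i <- N * f             # per-group regional sample sizes
```

With these inputs the formula rounds up to N = 252, the value used in the Examples section; lowering `beta` to 0.1 gives 337.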
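
Definition 1: Exceeding a proportion of the observed overall effect
Definition 1 claims consistency if $\hat\delta_1 > \pi\hat\delta$, $\hat\delta_2 > \pi\hat\delta$, . . ., and $\hat\delta_s > \pi\hat\delta$, where $\pi$ is a prespecified threshold, so that each region must achieve at least a proportion $\pi$ of the observed overall effect. The probability to claim consistency by Definition 1 is
$$\Pr(Z_1 > 0, \ldots, Z_s > 0),$$
where $Z_i = \hat\delta_i - \pi\hat\delta$ is normally distributed with mean $\delta(u_i - \pi)$ and variance $(2\sigma^2/N)(1/f_i - 2\pi + \pi^2)$. We may be interested in assessing consistency of treatment effects across regions only when the overall treatment effect is significant.
Given that there is an overall significant treatment effect, the conditional probability to claim consistency by Definition 1 is
$$\Pr(Z_1 > 0, \ldots, Z_s > 0 \mid Z_{s+1} > z_\alpha) = \frac{\Pr(Z_1 > 0, \ldots, Z_s > 0, Z_{s+1} > z_\alpha)}{\Pr(Z_{s+1} > z_\alpha)},$$
where $Z_{s+1} = \hat\delta / \sqrt{2\sigma^2/N}$ is the standardized overall effect estimate and $(Z_1, \ldots, Z_s, Z_{s+1})$ is a multivariate normal random vector with mean $\left(\delta(u_1 - \pi), \ldots, \delta(u_s - \pi), \delta\sqrt{N/(2\sigma^2)}\right)$. Its covariances follow from Equation 1: $\mathrm{Cov}(Z_i, Z_j) = (2\sigma^2/N)(\pi^2 - 2\pi)$ for $i \ne j \le s$, and $\mathrm{Cov}(Z_i, Z_{s+1}) = (1 - \pi)\sqrt{2\sigma^2/N}$ for $i \le s$.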
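The two probabilities for Definition 1 can also be approximated by simulation, which is a useful cross-check on any multivariate normal calculation. The sketch below is a Monte Carlo approximation in base R, not the `pmvnorm`-based `prob.def1`; the function name `sim_def1`, the default `sigma = 1`, and the number of simulated trials are assumptions made for illustration.

```r
# Monte Carlo sketch for Definition 1 (illustrative; not prob.def1).
sim_def1 <- function(alpha, beta, delta, f, u, pi, sigma = 1, nsim = 2e5) {
  N <- ceiling(2 * sigma^2 * (qnorm(1 - alpha) + qnorm(1 - beta))^2 / delta^2)
  s <- length(f)
  # Row = simulated trial; column i = delta_hat_i ~ N(u_i * delta, 2 sigma^2 / (N f_i)).
  d_hat <- matrix(rnorm(nsim * s,
                        mean = rep(u * delta, each = nsim),
                        sd   = rep(sigma * sqrt(2 / (N * f)), each = nsim)),
                  nrow = nsim)
  d_all <- as.vector(d_hat %*% f)  # overall estimate: sum_i f_i * delta_hat_i
  # Definition 1: every regional estimate exceeds pi times the overall estimate.
  consistent  <- rowSums(d_hat > pi * d_all) == s
  # Overall one-sided test at level alpha.
  significant <- d_all / (sigma * sqrt(2 / N)) > qnorm(1 - alpha)
  c(uncond.prob = mean(consistent),
    cond.prob   = mean(consistent & significant) / mean(significant))
}

set.seed(5000)
res_def1 <- sim_def1(alpha = 0.025, beta = 0.2, delta = 0.25,
                     f = rep(1/3, 3), u = rep(1, 3), pi = 1/3)
```

For three equal regions with equal effects, the simulated values should land close to the 67% and 76% reported in the Examples section, up to Monte Carlo error.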

Definition 2: Exceeding a prespecified effect size
As an alternative to Definition 1, consider Definition 2: $\hat\delta_1 > b$, $\hat\delta_2 > b$, . . ., and $\hat\delta_s > b$, where $b \ge 0$. A possible advantage of Definition 2 over Definition 1 is that if the overall treatment effect is robust and the effects of certain regions are reasonable but not exceptional, we may still be able to show consistency.
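Because the regional estimates come from disjoint sets of patients, they are independent, so the unconditional probability for Definition 2 reduces to a product of normal tail probabilities. The R sketch below shows this closed form; the function name `prob_def2_uncond` and its arguments are hypothetical and are not the interface of `prob.def2`. The conditional probability still requires a multivariate normal calculation and is not covered here.

```r
# Unconditional probability that every regional estimate exceeds b,
# assuming independent delta_hat_i ~ N(delta_i, 2 sigma^2 / N_i).
# `prob_def2_uncond` is an illustrative name only.
prob_def2_uncond <- function(delta_i, N_i, b = 0, sigma = 1) {
  se_i <- sigma * sqrt(2 / N_i)           # standard error per region
  prod(pnorm((delta_i - b) / se_i))       # product over independent regions
}

# Three regions of 84 patients per arm, true standardized effect 0.25 each.
p_b0  <- prob_def2_uncond(rep(0.25, 3), rep(84, 3), b = 0)
p_b01 <- prob_def2_uncond(rep(0.25, 3), rep(84, 3), b = 0.1)
```

Raising the threshold `b` can only lower the probability, which is why `p_b01` is smaller than `p_b0`.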

Definition 3: Significantly exceeding a proportion of the overall effect
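Definition 3 is in a hypothesis testing framework. The null and alternative hypotheses are
$$H_0^3: \delta_i \le \pi\delta \text{ for some } i \quad \text{versus} \quad H_a^3: \delta_i > \pi\delta \text{ for all } i.$$
Consistency will be claimed if $H_0^3$ is rejected and $H_a^3$ is accepted. Using a confidence interval approach, $H_0^3$ will be rejected at significance level $\alpha$ if, for all $i$,
$$\hat\delta_i - \pi\hat\delta - z_\alpha \sigma \sqrt{2(1/f_i - 2\pi + \pi^2)/N} > 0.$$
The unconditional and conditional probabilities of rejection are, respectively,
$$\Pr(R) \quad \text{and} \quad \Pr(R \mid Z_{s+1} > z_\alpha),$$
where $R$ denotes the rejection event above and $Z_{s+1}$ is defined in Section 2.1.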
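The standard error needed for a confidence interval on $\delta_i - \pi\delta$ follows directly from the distributions of $\hat\delta_i$ and $\hat\delta$ under the equal-variance assumption. The short derivation below is a worked sketch using only the notation of Section 2 ($f_i = N_i/N$):

```latex
\begin{align*}
\operatorname{Cov}(\hat\delta_i, \hat\delta)
  &= \frac{N_i}{N}\operatorname{Var}(\hat\delta_i) = \frac{2\sigma^2}{N}, \\
\operatorname{Var}(\hat\delta_i - \pi\hat\delta)
  &= \frac{2\sigma^2}{N_i} - 2\pi\,\frac{2\sigma^2}{N} + \pi^2\,\frac{2\sigma^2}{N}
   = \frac{2\sigma^2}{N}\left(\frac{1}{f_i} - 2\pi + \pi^2\right).
\end{align*}
```

Setting $\pi = 1$ gives $\operatorname{Var}(\hat\delta_i - \hat\delta) = (2\sigma^2/N)(1/f_i - 1)$, the variance relevant when a regional effect is compared with the full overall effect.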

Definition 4: Absence of significant treatment-by-region interaction
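In Definition 4, applying the treatment by subgroup interaction test to MRCT with regions as the subgroups, the null and alternative hypotheses are
$$H_0^4: \delta_1 = \cdots = \delta_s \quad \text{versus} \quad H_a^4: \delta_i \ne \delta_j \text{ for some } i \ne j.$$
Under $H_0^4$, the test statistic
$$Q = \sum_{i=1}^{s} \frac{N_i (\hat\delta_i - \hat\delta)^2}{2\sigma^2}$$
follows a central chi-square distribution with $(s - 1)$ degrees of freedom, and consistency will be claimed if $H_0^4$ is not rejected. Since $\hat\delta_i - \hat\delta$ and $\hat\delta$ are independent, the unconditional and conditional probabilities of showing consistency based on Definition 4 are identical.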
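When the regional effects differ, the interaction statistic follows a noncentral chi-square distribution, so the probability of not rejecting (and hence claiming consistency) can be read off `pchisq` with a noncentrality parameter. The sketch below assumes a known common variance and a user-chosen interaction test level; the function name `interaction_consistency_prob` and the argument `alpha_int` are hypothetical and do not describe the interface of `prob.def4`.

```r
# Probability that the treatment-by-region interaction test is NOT rejected.
# Assumes known sigma; noncentrality = sum_i N_i (delta_i - delta)^2 / (2 sigma^2).
interaction_consistency_prob <- function(delta_i, N_i, alpha_int, sigma = 1) {
  s     <- length(delta_i)
  delta <- sum(N_i * delta_i) / sum(N_i)                 # overall effect
  ncp   <- sum(N_i * (delta_i - delta)^2) / (2 * sigma^2)
  crit  <- qchisq(1 - alpha_int, df = s - 1)             # rejection threshold
  pchisq(crit, df = s - 1, ncp = ncp)
}

p_equal   <- interaction_consistency_prob(rep(0.25, 3), rep(84, 3), alpha_int = 0.1)
p_unequal <- interaction_consistency_prob(c(0.05, 0.25, 0.45), rep(84, 3), alpha_int = 0.1)
```

With identical regional effects the noncentrality is zero and the probability equals one minus the test level; heterogeneous effects reduce it.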

Definition 5: Not significantly worse than the overall effect
Definition 5 is a modified version of Definition 4 in which we test the one-sided individual hypotheses
$$H_{0i}: \delta_i \ge \delta \quad \text{versus} \quad H_{ai}: \delta_i < \delta, \qquad i = 1, \ldots, s.$$
If $H_{0i}$ is rejected and $H_{ai}$ is accepted at level $\alpha$, the region $i$ effect is not considered consistent with the overall effect, and therefore consistency across all regions has not been shown. In other words, in order to claim consistency, none of the $H_{0i}$'s can be rejected; using a confidence interval approach, the following must hold for all $i$:
$$\hat\delta_i - \hat\delta + z_\alpha \sigma \sqrt{2(1/f_i - 1)/N} \ge 0.$$
Again, since $\hat\delta_i - \hat\delta$ and $\hat\delta$ are independent, the unconditional and conditional probabilities of showing consistency based on Definition 5 are identical.
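The claim that conditioning on a significant overall effect does not change the probability can be checked empirically. The following base R simulation is a sketch under the equal-variance normal model; `sim_def5`, the regional test level `alpha_reg`, and `sigma = 1` are assumptions introduced here and are not the interface of `prob.def5`.

```r
# Monte Carlo sketch for Definition 5 (illustrative; not prob.def5).
sim_def5 <- function(alpha, beta, delta, f, u, alpha_reg, sigma = 1, nsim = 2e5) {
  N <- ceiling(2 * sigma^2 * (qnorm(1 - alpha) + qnorm(1 - beta))^2 / delta^2)
  s <- length(f)
  # Row = simulated trial; column i = delta_hat_i.
  d_hat <- matrix(rnorm(nsim * s,
                        mean = rep(u * delta, each = nsim),
                        sd   = rep(sigma * sqrt(2 / (N * f)), each = nsim)),
                  nrow = nsim)
  d_all  <- as.vector(d_hat %*% f)
  # Margin for region i: z * sd(delta_hat_i - delta_hat).
  margin <- qnorm(1 - alpha_reg) * sigma * sqrt(2 / N * (1 / f - 1))
  # Add margin[i] to column i of (delta_hat_i - delta_hat); none may fall below 0.
  not_worse   <- sweep(d_hat - d_all, 2, margin, "+") >= 0
  consistent  <- rowSums(not_worse) == s
  significant <- d_all / (sigma * sqrt(2 / N)) > qnorm(1 - alpha)
  c(uncond.prob = mean(consistent),
    cond.prob   = mean(consistent & significant) / mean(significant))
}

set.seed(1)
res_def5 <- sim_def5(alpha = 0.025, beta = 0.2, delta = 0.25,
                     f = rep(1/3, 3), u = rep(1, 3), alpha_reg = 0.1)
```

The two entries of `res_def5` should agree up to simulation noise, in line with the independence argument above.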

Other considerations: Random effect model
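Hung (2007) considered the following model for a MRCT, in which region effect is considered to be a random effect:
$$\hat\delta_i \mid \delta_i \sim N\left(\delta_i, \frac{2\sigma^2}{N_i}\right), \qquad \delta_i \sim N(\delta, \tau^2).$$
Two estimators of $\delta$ have been used in Quan et al. (2010): $\hat\delta$ and $\tilde\delta = \sum_{i=1}^{s} w_i \hat\delta_i / \sum_{i=1}^{s} w_i$, where $w_i = 1/(\tau^2 + 2\sigma^2/N_i)$. For those definitions involving only the observed treatment effects, like Definitions 1 and 2, the random effect model can be applied. For example, for Definition 1, the conditional probability of showing consistency based on this random effect model is again a multivariate normal probability for a random vector $(Z_1, \ldots, Z_s, Z_{s+1})$, whose mean and covariance now reflect the between-region variance $\tau^2$.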
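The weighted estimator can be computed in a few lines once the regional estimates and variance components are given. The R sketch below is illustrative only; the function name `weighted_effect` and the example numbers are made up for demonstration and do not come from the paper.

```r
# Inverse-variance weighted estimator under the random effect model:
# w_i = 1 / (tau^2 + 2 sigma^2 / N_i). `weighted_effect` is a hypothetical name.
weighted_effect <- function(d_hat, N_i, tau2, sigma = 1) {
  w <- 1 / (tau2 + 2 * sigma^2 / N_i)
  sum(w * d_hat) / sum(w)
}

d_hat <- c(0.10, 0.30, 0.35)   # made-up regional estimates
N_i   <- c(40, 100, 112)       # made-up per-group regional sample sizes

est_fixed  <- weighted_effect(d_hat, N_i, tau2 = 0)     # weights proportional to N_i
est_random <- weighted_effect(d_hat, N_i, tau2 = 0.05)  # weights pulled toward equality
```

With `tau2 = 0` the weights are proportional to $N_i$ and the estimator coincides with the sample-size-weighted estimate of Equation 1; as `tau2` grows, the weights approach equality and the estimator moves toward the simple average of the regional estimates.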

Program overview
R function prob.def1 calculates the unconditional and conditional probabilities to claim consistency of treatment effect based on Definition 1. R package mvtnorm (Genz et al. 2012; Genz and Bretz 2009) should be installed and loaded first in order to use function pmvnorm, which computes the cumulative distribution function of the multivariate normal distribution with any covariance matrix. The input arguments of prob.def1() are:
alpha: One-sided significance level for testing the overall treatment effect.
beta: Type II error rate, so that 1 − β is the power for detecting an overall treatment effect of delta.
delta: The standardized overall treatment effect for calculating the (1 − β) overall power.
s: Number of regions.
f: Vector of (f_1, f_2, . . ., f_s), where f_i is the proportion of patients within region i and the f_i sum to 1.
u: Vector of (u_1, u_2, . . ., u_s), where u_i is the ratio of the treatment effect for region i to the overall effect, u_i = δ_i/δ, and the sum of f_i u_i over all regions equals 1.
pi: The threshold π; each region should achieve at least a proportion pi of the observed overall effect in order to claim consistency.
The program then checks whether the lengths of f and u are both equal to s. If either check fails, the program stops and returns a warning message.
The returned value of prob.def1() is a list with two components:
uncond.prob: Unconditional probability to claim consistency.
cond.prob: Conditional probability to claim consistency given a significant overall treatment effect.
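The kind of input checking described above can be sketched in a few lines of R. This is not the code of prob.def1; the function name `check_design` is hypothetical, and the two checks on the sums are an added assumption based on the constraints stated for f and u, not something the text says the program verifies.

```r
# Illustrative input validation for a design with s regions.
# `check_design` is a hypothetical helper, not part of the paper's code.
check_design <- function(s, f, u, tol = 1e-8) {
  if (length(f) != s) stop("length of f must equal s")
  if (length(u) != s) stop("length of u must equal s")
  # Assumed extra checks, mirroring the constraints on f and u.
  if (abs(sum(f) - 1) > tol) stop("f must sum to 1")
  if (abs(sum(f * u) - 1) > tol) stop("sum of f * u must equal 1")
  invisible(TRUE)
}

ok  <- check_design(3, f = rep(1/3, 3), u = rep(1, 3))
bad <- tryCatch(check_design(3, f = c(0.5, 0.5), u = rep(1, 3)),
                error = function(e) conditionMessage(e))
```

Here `ok` is `TRUE`, while `bad` holds the error message raised for the mismatched length of `f`.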
The R functions for calculating the unconditional and conditional probabilities for claiming consistency based on Definitions 2-5 are prob.def2-prob.def5, respectively. They use similar input arguments and provide similar returned values. Notice that the unconditional and conditional probabilities for Definitions 4 and 5 are identical. The function prob.def4 could also be used for power calculation for general subgroup analysis.
Definition 1 can be applied to the random effect models, and functions rand1.def1 and rand2.def1 calculate the probabilities to claim consistency using $\tilde\delta$ and $\hat\delta$, respectively. Quan et al. (2010) also considered the minimum required proportion of sample size for a particular region, say region 1, so that there is a (1 − beta) probability of demonstrating consistency. When there are 4 regions and the region effects are the same, the minimum proportion f_1 under different configurations of the regional proportions can be calculated by functions f11.defx, f12.defx and f13.defx, respectively, for Definition x (x = 1, 2, 3).

Examples
In this section, trial examples are used to illustrate the applications of the programs. Suppose we want to conduct a MRCT in s = 3 regions. In order to have 80% (= 1 − β) overall power to detect an overall standardized treatment effect δ = 0.25 at one-sided significance level α = 0.025, the overall sample size per treatment group should be N = 252. Let the proportion vector of study patients in the 3 regions be f = (f_1, f_2, f_3) = (1/3, 1/3, 1/3), the ratio vector of the treatment effects to the overall treatment effect in the 3 regions be u = (u_1, u_2, u_3) = (1, 1, 1) and π = 1/s. Then, the input parameters for prob.def1 are
R> alpha <- 0.025
R> beta <- 0.2
R> delta <- 0.25
R> s <- 3
R> f <- c(1/3, 1/3, 1/3)
R> u <- rep(1, 3)
R> pi <- 1/s
To make the exact replication of the outputs possible, we set the random number generator seed using set.seed and then call prob.def1:
R> set.seed(5000)
R> prob.def1(alpha, beta, delta, s, f, u, pi)
The unconditional and conditional probabilities to claim consistency based on Definition 1 for such a design setting are 67% and 76%, respectively. If these probabilities are considered too small, we can increase the overall power from 80% to 90%, that is, decrease β from 0.2 to 0.1. Because of the increase in overall sample size from N = 252 to 337, the corresponding unconditional and conditional probabilities are then 76% and 81%, respectively. Basically, we can adjust the overall sample size and f to reach the desired probability for consistency assessment.
Consider the example in Quan et al. (2010). With 558 patients, 186 receiving placebo and 372 in the active treatment group, there was > 99% power to detect a between-treatment difference of δ = 0.005 with σ = 0.013 for a significance level α = 0.025 one-sided test (to have enough safety data to meet the regulatory requirement, this study was overpowered for efficacy analysis). At the design stage, we would like to determine the minimum required proportion of sample size for a particular region, say Region 1, so that there is an 80% probability of demonstrating consistency. If the sample sizes of the other 3 regions are the same (f_1 < f_2 = f_3 = f_4) and the conditional probability is the concern, then using Definition 1 with π = 1/4, f_1 should be 13%. If the unconditional probability is set to 80%, f_1 should be 14%. These can be obtained by f11.def1 with specifications:
R> alpha <- 0.025
R> beta <- 0.01
R> delta <- 0.005
R> sigma <- 0.013
R> pi <- 1/4
R> u <- rep(1, 4)
R> set.pow <- 0.8
R> f11.def1(alpha, beta, delta, sigma, u, pi, set.pow)
Notice that, when we run the program, if the desired probability for demonstrating consistency is set too high, the required minimum proportion f_1 may not exist unless we increase the overall sample size further.

Discussion
In the design stage of a MRCT, we are interested in the probability of claiming consistency of treatment effects and in the proportions of sample size required in individual regions to achieve it. This paper provides R source code for calculating these probabilities and sample sizes based on various definitions of consistency.
Functions prob.def1-prob.def5 calculate both the unconditional and conditional probabilities for claiming consistency of treatment effect in MRCTs based on Definitions 1-5 of Quan et al. (2010). For the random effect model, Definition 1 is implemented in rand1.def1 and rand2.def1 for the two types of estimators of δ, respectively. Definition 2 can also be applied to the random effect model with slight modifications to functions rand1.def1 and rand2.def1. The minimum required proportion of sample size for a particular region can be obtained by functions f11.defx, f12.defx and f13.defx, respectively, for Definitions x = 1, 2, 3. R function pmvnorm, which is included in R package mvtnorm of Genz et al. (2012), is used to compute the cumulative distribution function of the multivariate normal distribution. This function produces probabilities for both singular and nonsingular joint distributions.
As demonstrated in the trial examples in Section 4, program users have to adjust the overall and regional sample sizes, given the anticipated treatment effects, to reach the desired probability. If the sample sizes for some regions are too small, these regions could be combined to form a new region; otherwise, the probability of claiming consistency will be reduced dramatically. All of these choices have to be prespecified in the study protocol.