On sample size determination based on a Bayesian index of superiority for two independent binomial proportions

In clinical studies, sample size is often calculated within the frequentist framework. However, several methods of calculation using the Bayesian framework have been proposed. In this paper, we propose a method for the determination of sample sizes using the index θ = P(π1 > π2 | X1, X2), where X1 and X2 denote the binomial random variables for n1 and n2 trials with parameters π1 and π2, respectively. In the proposed method, the estimated difference between the posterior proportions and the value of θ to be attained at the end of the trial are determined in advance, and the necessary sample size is obtained from them. We provide lists of necessary sample sizes under various assumptions and the calculation program using Mathematica.
Subjects: Science; Mathematics & Statistics; Applied Mathematics; Mathematics for Biology & Medicine


Introduction
In clinical trials, the difference between two independent binomial proportions is frequently examined. The result of the comparison is used to determine whether a new treatment is appropriate. Biostatisticians determine the main analysis method, estimated difference between binomial proportions, significance level, and power before a clinical study is conducted. These quantities are then used to calculate the sample size within the frequentist framework.
A large body of literature describes methods for calculating sample size, including both frequentist (Lachin, 1981; Lemeshow, Hosmer, Klar, & Lwanga, 1990) and Bayesian approaches. Several criteria have been proposed for Bayesian sample size determination. The specific application to the binomial parameter was examined in detail by Pham-Gia and Turkkan (1992), while Adcock (1987) considered multinomial experiments, which include the binomial as a special case. Adcock (1992) compared the various approaches presented in the above two studies. Joseph, Wolfson, and Berger (1995) proposed three different Bayesian approaches to calculate sample size. These proposed methods are based on highest posterior density (HPD) credible intervals and are discussed and illustrated in the context of a binomial experiment. Decision-theoretic criteria (Lindley, 1997) and sample sizes based on average power of hypothesis tests (Spiegelhalter & Freedman, 1986) have also been considered. Chaloner and Verdinelli (1995) published a review of Bayesian optimal design, while Adcock (1997) reviewed both frequentist and Bayesian sample size criteria. More recently, M'Lan, Joseph, and Wolfson (2008) investigated the binomial sample size problem using generalized versions of the average length and average coverage criteria, median length and median coverage criteria, and worst outcome criterion and its modified version. These methods of calculating sample size are also based on HPD credible intervals.
Berry (2012) detailed a Bayesian approach for comparing the binomial proportions of two groups using various examples. Zaslavsky (2009) proposed a one-sided hypothesis based on a one-sample situation. Kawasaki and Miyaoka (2012) proposed the index

$$\theta = P(\pi_1 > \pi_2 \mid X_1, X_2), \tag{1.1}$$

where X1 and X2 denote the binomial random variables for n1 and n2 trials with parameters π1 and π2, respectively. They provided approximate and exact expressions for index θ using a beta prior and presented the results of actual clinical trials to demonstrate the utility of this index. This index gives the probability that the binomial proportion of a treatment group is superior to that of a control group. Thus, this index expresses the superiority of one of two independent binomial proportions over the other. The proposed index θ is advantageous in that it can be easily and intuitively understood and applied.
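To make the meaning of the index concrete, the following is a minimal Monte Carlo sketch in Python, not the authors' Mathematica program. It assumes independent Beta(alpha, beta) priors, so that each posterior is again a beta distribution (the standard beta-binomial update), and estimates θ as the fraction of posterior draws in which π1 exceeds π2. The function name `theta_monte_carlo` and its parameters are hypothetical names chosen for illustration.

```python
import random


def theta_monte_carlo(x1, n1, x2, n2, alpha=1.0, beta=1.0, draws=20_000, seed=0):
    """Estimate theta = P(pi1 > pi2 | X1 = x1, X2 = x2) by posterior sampling.

    Assumption: both groups share a Beta(alpha, beta) prior, so the posterior
    of group i is Beta(alpha + x_i, beta + n_i - x_i).
    """
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    a1, b1 = alpha + x1, beta + n1 - x1
    a2, b2 = alpha + x2, beta + n2 - x2
    wins = 0
    for _ in range(draws):
        # Draw one value from each posterior and record whether pi1 > pi2.
        if rng.betavariate(a1, b1) > rng.betavariate(a2, b2):
            wins += 1
    return wins / draws
```

With identical data in both groups the estimate should be close to 0.5, and it approaches 1 as the observed response rate of the first group pulls ahead of the second.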
In this paper, we propose a new method for calculating sample size using index θ in (1.1). In this method, the difference between two independent posterior binomial proportions and the value of index θ are decided before the clinical study, and the sample size required in that case is then calculated. Moreover, we compare the sample sizes obtained using the approximate and exact methods for index θ.
The remainder of this paper is organized as follows. We obtain approximate and exact expressions for index θ in Section 2. In Section 3, we propose a method for calculating sample size using θ in (1.1), and we provide lists of necessary sample sizes from simulation results with various assumptions in Section 4. In Section 5, we consider a hypothetical clinical trial of a new treatment and determine its sample size using the lists of necessary sample sizes provided in Section 4. Finally, we conclude the paper with a brief summary in Section 6.

Methodology
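Let X1 and X2 denote binomial random variables for n1 and n2 trials with parameters π1 and π2, respectively. The conjugate prior density for πi is a beta distribution with parameters αi and βi, where αi > 0, βi > 0, and i = 1, 2. The posterior density for πi is then also a beta distribution,

$$f(\pi_i \mid X_i = x_i) = \frac{1}{B(a_i, b_i)} \, \pi_i^{a_i - 1} (1 - \pi_i)^{b_i - 1}, \qquad a_i = \alpha_i + x_i, \quad b_i = \beta_i + n_i - x_i,$$

where B(·, ·) is the beta function.
http://dx.doi.org/10.1080/23311835.2017.1284972
Approximate expression for θ
One method of calculation for the index θ = P(π1 > π2 | X1, X2) in (1.1) is an approximation using the standard normal table. Let πi,post denote the binomial proportion of group i under the posterior density. We assume that the ai and bi of the posterior density are large. We need to find a Z-test statistic. The expectation and the variance of the difference under the posterior density are

$$E(\pi_{1,\mathrm{post}} - \pi_{2,\mathrm{post}}) = \frac{a_1}{a_1 + b_1} - \frac{a_2}{a_2 + b_2}, \qquad V(\pi_{1,\mathrm{post}} - \pi_{2,\mathrm{post}}) = \sum_{i=1}^{2} \frac{a_i b_i}{(a_i + b_i)^2 (a_i + b_i + 1)}.$$

The Z-test statistic

$$Z = \frac{(\pi_{1,\mathrm{post}} - \pi_{2,\mathrm{post}}) - E(\pi_{1,\mathrm{post}} - \pi_{2,\mathrm{post}})}{\sqrt{V(\pi_{1,\mathrm{post}} - \pi_{2,\mathrm{post}})}}$$

is approximately distributed as the standard normal distribution. Therefore, the approximate probability of index θ is

$$\theta \approx \Phi\!\left( \frac{E(\pi_{1,\mathrm{post}} - \pi_{2,\mathrm{post}})}{\sqrt{V(\pi_{1,\mathrm{post}} - \pi_{2,\mathrm{post}})}} \right),$$

where Φ(⋅) is the cumulative distribution function (CDF) of the standard normal distribution. The approximate probability is therefore easy to calculate. The difference between the exact and approximate probabilities is examined in Section 4.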
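The approximation can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' program: it assumes the posteriors are Beta(a1, b1) and Beta(a2, b2), uses the standard mean a/(a+b) and variance ab/((a+b)²(a+b+1)) of a beta distribution, and treats the two posteriors as independent so that the variances add. The names `normal_cdf` and `theta_approx` are hypothetical.

```python
from math import erf, sqrt


def normal_cdf(z):
    """CDF of the standard normal distribution, written via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def theta_approx(a1, b1, a2, b2):
    """Normal approximation to theta = P(pi1 > pi2) for beta posteriors.

    Assumption: a_i and b_i are large enough for the normal approximation
    of the posterior difference to be reasonable.
    """
    # Posterior mean of pi1 - pi2.
    mean_diff = a1 / (a1 + b1) - a2 / (a2 + b2)
    # Posterior variance of pi1 - pi2 (independent posteriors, so variances add).
    var_diff = (a1 * b1 / ((a1 + b1) ** 2 * (a1 + b1 + 1))
                + a2 * b2 / ((a2 + b2) ** 2 * (a2 + b2 + 1)))
    return normal_cdf(mean_diff / sqrt(var_diff))
```

When both posteriors are identical the posterior mean difference is zero and the function returns 0.5, as expected of an index of superiority.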
Exact expression for θ.
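In contrast, Kawasaki and Miyaoka (2012) derived an exact expression for index θ in (1.1) using the posterior density. With the posterior beta parameters ai and bi (i = 1, 2), an exact expression for index θ can be written as

$$\theta = \frac{B(a_1 + a_2, b_1)}{a_2 \, B(a_1, b_1) \, B(a_2, b_2)} \; {}_3F_2(a_2, 1 - b_2, a_1 + a_2; \; 1 + a_2, a_1 + a_2 + b_1; \; 1),$$

where

$${}_3F_2(k_1, k_2, k_3; \; l_1, l_2; \; c) = \sum_{t=0}^{\infty} \frac{(k_1)_t (k_2)_t (k_3)_t}{(l_1)_t (l_2)_t} \frac{c^t}{t!}$$

is the hypergeometric series, and (k)_t = k(k + 1)⋯(k + t − 1), with (k)_0 = 1, is the Pochhammer symbol.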
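As an illustration of an exact calculation, the sketch below evaluates one closed form of P(π1 > π2) for independent Beta(a1, b1) and Beta(a2, b2) posteriors, namely B(a1+a2, b1) / (a2 B(a1, b1) B(a2, b2)) times the series 3F2(a2, 1−b2, a1+a2; 1+a2, a1+a2+b1; 1). This form follows from integrating the posterior density of π1 against the posterior CDF of π2; it may be arranged differently from the expression in Kawasaki and Miyaoka (2012), and it is not the authors' Mathematica code. The sketch assumes positive integer parameters (as with the prior α = β = 1), in which case the series terminates and exact rational arithmetic avoids rounding error. All function names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def beta_int(a, b):
    """Beta function B(a, b) for positive integers, as an exact fraction."""
    return Fraction(factorial(a - 1) * factorial(b - 1), factorial(a + b - 1))


def hyp3f2_at_one(k, l, max_terms=100_000):
    """Sum the series 3F2(k1, k2, k3; l1, l2; 1) term by term.

    Assumption: one upper parameter is a non-positive integer, so the
    series terminates after finitely many terms.
    """
    total = Fraction(0)
    term = Fraction(1)  # the t = 0 term
    for t in range(max_terms):
        total += term
        if term == 0:  # every later term is zero as well
            return total
        # Ratio of consecutive terms, from the Pochhammer symbols and t!.
        num = (k[0] + t) * (k[1] + t) * (k[2] + t)
        den = (l[0] + t) * (l[1] + t) * (t + 1)
        term *= Fraction(num, den)
    raise ValueError("series did not terminate within max_terms")


def theta_exact(a1, b1, a2, b2):
    """Exact P(pi1 > pi2) for integer-parameter beta posteriors."""
    prefactor = beta_int(a1 + a2, b1) / (a2 * beta_int(a1, b1) * beta_int(a2, b2))
    series = hyp3f2_at_one((a2, 1 - b2, a1 + a2), (1 + a2, a1 + a2 + b1))
    return prefactor * series
```

The result is a `Fraction`; wrapping it in `float(...)` gives a decimal value that can be compared with the normal approximation. Two identical uniform posteriors, `theta_exact(1, 1, 1, 1)`, give exactly 1/2.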

Method of Calculation for Sample Size
We propose a method of calculation for sample size using θ in (1.1) in this section. The detailed flow of the method is as follows:
(1) Determine the values of the hyperparameters α1, β1, α2, and β2 of the beta distributions used as prior distributions.
(2) Determine the value of δ, which is the difference between the posterior proportions of the two groups: δ = π1,post − π2,post.
(3) Determine the value of θmin, which is the lower limit of θ.
(4) Determine the ratio r between the sample sizes of the two groups (where n1 = r × n2).
(5) Based on the settings in steps 1 to 4, set n2 = 1 and carry out the following steps.
(6) Find the combinations of the realized values x1 and x2 of the random variables X1 and X2 that satisfy the condition on δ in step 2.
(7) Calculate the value of θ for all combinations found in step 6.
(8) Update the value of n2 to n2 + 1 and go back to step 6 (iterate this calculation an appropriate number of times).
(9) Find the minimum n1 for which all values of θ obtained in step 7 exceed θmin, and take this value as the necessary sample size for the treatment group. The necessary sample size for the control group then follows from the relation n1 = r × n2.
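The flow above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' Mathematica program, and it is not expected to reproduce the published tables. Two readings are assumed here because the text leaves them open: the condition on δ in step 6 is taken to mean that the difference of posterior means is at least δ, and the "minimum" in step 9 is taken to be the smallest n2 from which the criterion holds for every size up to a search limit `n2_max`, since θ is not monotone in the sample size. For speed, θ is computed with the normal approximation; an exact routine could be substituted. The names `theta_normal`, `min_theta`, and `required_n2` are hypothetical.

```python
from math import erf, sqrt


def theta_normal(a1, b1, a2, b2):
    """Normal approximation to theta = P(pi1 > pi2) for beta posteriors."""
    mean_diff = a1 / (a1 + b1) - a2 / (a2 + b2)
    var_diff = (a1 * b1 / ((a1 + b1) ** 2 * (a1 + b1 + 1))
                + a2 * b2 / ((a2 + b2) ** 2 * (a2 + b2 + 1)))
    return 0.5 * (1.0 + erf(mean_diff / sqrt(2.0 * var_diff)))


def min_theta(n2, delta, r=1, prior=(1, 1, 1, 1)):
    """Steps 6-7: smallest theta over all (x1, x2) meeting the delta condition.

    ASSUMPTION: the condition is read as "posterior-mean difference >= delta".
    Returns None when no combination qualifies.
    """
    alpha1, beta1, alpha2, beta2 = prior
    n1 = r * n2
    worst = None
    for x1 in range(n1 + 1):
        a1, b1 = alpha1 + x1, beta1 + n1 - x1
        for x2 in range(n2 + 1):
            a2, b2 = alpha2 + x2, beta2 + n2 - x2
            diff = a1 / (a1 + b1) - a2 / (a2 + b2)
            if diff >= delta - 1e-12:  # tolerance guards against float rounding
                theta = theta_normal(a1, b1, a2, b2)
                if worst is None or theta < worst:
                    worst = theta
    return worst


def required_n2(delta, theta_min, r=1, prior=(1, 1, 1, 1), n2_max=60):
    """Steps 5, 8, 9: smallest n2 from which min_theta stays above theta_min.

    ASSUMPTION: "minimum" is read as the smallest n2 such that every size in
    [n2, n2_max] passes. Returns None if n2_max itself fails.
    """
    answer = None
    for n2 in range(n2_max, 0, -1):  # scan downward from the largest size
        worst = min_theta(n2, delta, r, prior)
        if worst is None or worst <= theta_min:
            break  # first failing size below the passing run
        answer = n2
    return answer
```

A call such as `required_n2(0.2, 0.9, r=1, n2_max=40)` returns the control-group size under these assumptions, and the treatment-group size is then `r` times that value. Scanning downward from `n2_max` is a design choice that handles the non-monotone behavior of θ: an isolated small size that happens to pass is not reported if a larger size fails.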

Simulation
In this section, we simulate the necessary sample size using θ in (1.1), which was defined in Section 2. Each assumption and setting is as follows: the noninformative prior is (α1 = β1 = α2 = β2 = 1), the expected difference between the posterior proportions of the two groups is δ = 0.05, 0.06, 0.07, 0.08, 0.09, 0.1, 0.11, 0.12, 0.13, 0.14, 0.15, the lower limit of θ is θmin = 0.7, 0.75, 0.8, 0.85, 0.9, 0.95, and the ratio of sample sizes between the treatment and control groups is r = 1, 2, 3. In addition, the necessary sample size under other conditions is easily calculated by simulation, although we do not include this in the present paper.
Simulation Results when n 1 = n 2 .
In this section, we display the results when the sample sizes of groups 1 and 2 are equal. Figure 1(a)-(d) features θ on the vertical axis and sample size on the horizontal axis. In Figure 1(a)-(d), we set the estimated difference between the posterior binomial proportions as 0.05, 0.10, 0.15, and 0.2, respectively. Each horizontal line indicates the value when θ is 0.7, 0.75, 0.8, 0.85, 0.9, or 0.95. Moreover, we display two sections in Table 1; one is calculated based on the exact probability of θ and the other on the approximate probability. Figure 1(a)-(d) shows that as the sample size becomes larger, θ increases, but not monotonically. This behavior stems from the discreteness of the binomial distribution, as is also seen in the frequentist framework. We also find that θ becomes higher as the difference between the posterior proportions becomes larger. Accordingly, the necessary sample size decreases.
We display the detailed results for sample size in Table 1. The rows correspond to the value of θ, and the columns to the difference between the posterior proportions. We can see at several points that the necessary sample size for exact calculation is slightly higher than for approximation.
Simulation Results when n1 ≠ n2
Table 2 shows the detailed results when n1 = 2 × n2, in the same format as above. The corresponding figures and tables show that the necessary sample size n1 + n2 is larger in this case than when sample sizes are equal among groups. The following result shows that this trend becomes stronger as the imbalance between n1 and n2 becomes larger. In addition, Figure 3(a)-(d) and Table 3 show the result when n1 = 3 × n2. They also show that when the sample sizes of the groups are imbalanced, the necessary sample size becomes larger. We also find that as the imbalance between n1 and n2 becomes larger, the difference between the necessary sample sizes for exact and approximate calculation tends to become larger.

Example
In this section, we provide an example of how to set the sample size, considering a hypothetical clinical trial of a new treatment and using the tables in Section 4. Suppose the following situation. Assume a non-informative prior as the prior distribution (α1 = β1 = α2 = β2 = 1). Set the same sample size for the treatment and control groups (n1 = n2). Assume that the expected difference between the posterior proportions of the two groups is 0.08 (δ = 0.08). Under these conditions, from Table 1, if the lower limit of θ is set to 0.9 (θmin = 0.9), the necessary sample size for the treatment and control groups is 132 by both exact and approximate calculation. If the lower limit is set to 0.95 (θmin = 0.95), the necessary sample size becomes 239 by exact calculation, but 225 by approximation.

Conclusion
In this paper, we suggested a new calculation method to obtain the sample size for the index of binomial proportions θ = P(π1 > π2 | X1, X2) suggested by Kawasaki and Miyaoka (2012) and simulated the actual value. We also defined a method of calculation for θ based on both approximation and exact methods and compared the results.
We used Mathematica, which is standard calculation software, for the simulations. The main reason we used Mathematica is that hypergeometric series calculation is necessary for exact calculation of θ, and this calculation is implemented in Mathematica. Detailed programming code is provided in Appendix A.
The method to define sample size suggested in this paper begins with setting the estimated difference between the posterior proportions and the lower limit of θ. In addition, by setting the ratio of the sample sizes of the treatment and control groups and adding prior information (if it is available), the sample size can be derived.
We found three important results regarding θ derived by simulation for both exact calculation and approximation. First, when the sample size becomes larger, θ increases, but not monotonically. Second, as the estimated difference between the posterior proportions becomes larger, the necessary sample size decreases. Third, when the imbalance between the sample sizes of the treatment and control groups becomes larger, the necessary sample size for both groups becomes larger.
In addition, comparison of the exact and approximate values in the simulation yields the following result. Exact calculation and approximation require almost the same sample size, and the trend for θ to increase when the sample size becomes larger is almost the same. The sample size for exact calculation is slightly larger than that for approximation. This difference tends to increase as the imbalance between the treatment group and the control group increases.
Finally, index θ, which has been extensively investigated in this research, is derived using the Bayesian framework rather than the frequentist framework as before. One of the merits of using the Bayesian framework is that it is easy to understand. Therefore, the index is expected to deliver very important information on clinical trials, where setting the sample size is considered very important. The setting method suggested in this paper may serve as a useful guideline in this regard.