Exact Conditional Inference for Two-way Randomized Bernoulli Experiments

Exact conditional inference for two-way randomized experiments with Bernoulli-distributed outcomes is a useful special case of exact logistic regression, but unlike the general case, it is trivial to compute. We present an R function which can easily be translated into any other language, making this type of analysis more readily accessible.


Minimalist experiments
Exact (conditional) logistic regression has a well-deserved reputation for computational difficulty (Hirji, Mehta, and Patel 1987). This has restricted its use to those who have access to comprehensive (and expensive) packages like SAS (SAS Institute Inc. 2003) or LogXact (Cytel Inc. 2006). As of this writing, R (R Development Core Team 2007) still has no exact logistic regression package; exactLoglinTest (Caffo 2006) is designed to test for independence in Poisson log-linear models, and is cumbersome to use for testing hypotheses about the parameters of binomial logistic models.
However, there are useful special cases of exact logistic regression which pose no computational difficulty but which are nevertheless underused due to the lack of specialized code. One of these cases is the analysis of small-sample fully randomized experiments with two two-level factors and Bernoulli-distributed data in cells of identical size. Such experiments are "minimalist" in the sense that they are the simplest possible experiments involving factors and their interaction. The analogue with larger samples and continuous data is the two-way factorial analysis of variance, taught in most introductory statistics textbooks. Yet without specialized code, the analysis of minimalist factorial experiments requires the application of exact logistic regression to data like that in Table 1, where $X$ and $Y$ represent the two factors and $N$ is the fixed cell size.

As can be seen from the table, minimalist experiments have very regular structures. In particular, a data set can be fully characterized by the marginals, $n_{11}$, and $N$. Moreover, the three independent variables are mutually orthogonal; computing the $p$ value for one, conditional on the other two, works the same way no matter which variable is analyzed. Full-fledged exact logistic regression is unnecessarily powerful here, yet standard tests for $2 \times 2$ contingency tables (e.g. Fisher's exact test or the exact McNemar test) are inappropriate for testing hypotheses like $H_0: X^+ = X^-$, where $X^+$ ($X^-$) denotes factor $X$ at the high (low) level.
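To make the regular structure concrete, a minimalist data set can be sketched in a few lines of R. The layout and the success counts below are our own illustration (Table 1 itself is not reproduced here):

```r
## A minimalist experiment: four cells of fixed size N, indexed by two
## two-level factors X and Y; the interaction column XY is the product
## of the +/-1 codings. The success counts are illustrative.
N <- 7
d <- data.frame(X = c(1, 1, -1, -1),
                Y = c(1, -1, 1, -1))
d$XY <- d$X * d$Y
d$successes <- c(4, 5, 0, 3)  # successes per cell, each out of N

## The three variables are mutually orthogonal, and the data set is
## fully characterized by N, n11, and the marginal success counts at
## the high level of each variable:
marginals <- sapply(d[c("X", "Y", "XY")],
                    function(v) sum(d$successes[v == 1]))
marginals
```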

A permutation test
Fortunately, minimalist experiments can be analyzed with a simple permutation test. Consider testing $H_0: A^+ = A^-$, where factor $A$ could be any one of $X$, $Y$ or $XY$. Conditional on $N$ and the marginals $(b^+, b^-)$ and $(c^+, c^-)$ associated with the other two factors $B$ and $C$, we group the data in the summary form

$$\begin{array}{c|cc}
 & C^+ & C^- \\ \hline
B^+ & N_{11} & b^+ - N_{11} \\
B^- & c^+ - N_{11} & b^- - c^+ + N_{11}
\end{array}$$

Under $H_0$, the conditional distribution of $N_{11}$ is

$$P(N_{11} = i) = \frac{\binom{N}{i}\binom{N}{b^+ - i}\binom{N}{c^+ - i}\binom{N}{b^- - c^+ + i}}{\sum_{j=\ell}^{u}\binom{N}{j}\binom{N}{b^+ - j}\binom{N}{c^+ - j}\binom{N}{b^- - c^+ + j}},$$

where

$$\ell = \max(0,\, b^+ - N,\, c^+ - N,\, c^+ - b^-), \qquad u = \min(N,\, b^+,\, c^+,\, N + c^+ - b^-)$$

are the smallest and largest values of $N_{11}$ compatible with the marginals. Moreover, the values of $A^+$ and $A^-$ are the sums of the diagonals and off-diagonals, respectively, that is,

$$A^+ = N_{11} + (b^- - c^+ + N_{11}) = 2N_{11} + b^- - c^+, \qquad A^- = (b^+ - N_{11}) + (c^+ - N_{11}) = b^+ + c^+ - 2N_{11}.$$

If the alternative hypothesis is $H_1: A^+ > A^-$, then analogous to Fisher's exact test, the exact one-tailed $p$ value is computed by

$$p = \sum_{i=n_{11}}^{u} P(N_{11} = i),$$

where $n_{11}$ denotes the observed value of $N_{11}$; the sum counts all the permutations which are at least as extreme in the direction of $H_1$. By contrast, if $H_1: A^+ < A^-$ is considered, then the exact one-tailed $p$ value is computed by

$$p = \sum_{i=\ell}^{n_{11}} P(N_{11} = i).$$

For $H_1: A^+ \neq A^-$, since the distribution may not be symmetric, two-tailed $p$ values should not be computed by doubling the one-tailed $p$. A commonly used approach (Agresti 2002, p. 93) is to sum all the permutations at least as extreme as the observed value in both directions, that is,

$$p = \sum_{i\,:\,|2i + b^- - c^+ - E(A^+)| \,\geq\, |a^+ - E(A^+)|} P(N_{11} = i),$$

where $a^+ = 2n_{11} + b^- - c^+$ is the observed value of $A^+$ and $E(A^+) = \sum_{i=\ell}^{u} (2i + b^- - c^+)\,P(N_{11} = i)$ denotes the expectation of $A^+$.
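The whole test fits comfortably in a few lines of R. The sketch below is our illustrative re-implementation of the procedure described above, not the paper's own minexp.p; the function name and argument names are invented for this example:

```r
## Exact conditional test for one factor (A) in a minimalist two-way
## Bernoulli experiment, conditional on the marginals of the other two
## variables B and C (one of which is the interaction).
## N: fixed cell size; n11: observed count in cell (B+, C+);
## bplus, bminus, cplus: success marginals of B and C.
## Illustrative sketch; not the authors' original minexp.p.
minexp.sketch <- function(N, n11, bplus, bminus, cplus) {
  ## support of N11 given the marginals
  l <- max(0, bplus - N, cplus - N, cplus - bminus)
  u <- min(N, bplus, cplus, N + cplus - bminus)
  i <- l:u
  ## unnormalized permutation counts for each value of N11
  w <- choose(N, i) * choose(N, bplus - i) *
       choose(N, cplus - i) * choose(N, bminus - cplus + i)
  p <- w / sum(w)                     # P(N11 = i) under H0
  aplus <- 2 * i + bminus - cplus     # A+ as a function of N11
  EA    <- sum(aplus * p)             # E(A+)
  obs   <- 2 * n11 + bminus - cplus   # observed A+
  list(p.upper = sum(p[i >= n11]),    # H1: A+ > A-
       p.lower = sum(p[i <= n11]),    # H1: A+ < A-
       p.two   = sum(p[abs(aplus - EA) >= abs(obs - EA)]))
}
```

A call such as `minexp.sketch(N = 7, n11 = 4, bplus = 4, bminus = 8, cplus = 7)` returns all three $p$ values as a list.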

Example
A psychologist hypothesizes that the ability to solve a certain puzzle is affected by working environment (indoors vs. outside). Worrying about a possible confound or interaction with time of day, she runs the experiment in both the morning and the afternoon. She randomly assigns 28 college students to the four conditions defined by the factors Place and Time, and records whether or not they solve the puzzle. The results are shown in Table 2.

Table 2: Number of successes out of N = 7 (28/4) per cell.

                Indoors   Outside
    Morning        0         4
    Afternoon      3         5

The researcher thus can conclude that Place does make a difference (two-tailed $p < 0.05$), even when Time and any interaction are taken into account. If the same data are analyzed with an asymptotic logistic regression, nonsensical results are produced (all $p > 0.9$, with parameter estimates varying across statistical software). This is due to the quasicomplete separation of the data points (Albert and Anderson 1984) caused by $n_{11} = 0$.
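For readers who want to check the numbers, the Place test can be reproduced directly in base R. Here we take B = Time (Morning at the high level) and C = the Place × Time interaction, so that the diagonal of the summary form collects the Outside cells; this worked example and its coding choices are ours:

```r
## Cell successes (each out of N = 7): Morning-Indoors 0,
## Morning-Outside 4, Afternoon-Indoors 3, Afternoon-Outside 5.
N      <- 7
n11    <- 4  # cell (B+, C+): Morning & interaction-high = Morning-Outside
bplus  <- 4  # Morning successes: 0 + 4
bminus <- 8  # Afternoon successes: 3 + 5
cplus  <- 7  # interaction-high successes: Morning-Outside + Afternoon-Indoors

l <- max(0, bplus - N, cplus - N, cplus - bminus)
u <- min(N, bplus, cplus, N + cplus - bminus)
i <- l:u
w <- choose(N, i) * choose(N, bplus - i) *
     choose(N, cplus - i) * choose(N, bminus - cplus + i)
p <- w / sum(w)                     # P(N11 = i) under H0
aplus <- 2 * i + bminus - cplus     # A+ for each permutation
obs   <- 2 * n11 + bminus - cplus   # observed A+ (Outside successes) = 9
EA    <- sum(aplus * p)
p.two <- sum(p[abs(aplus - EA) >= abs(obs - EA)])
p.two  # about 0.038 by this reconstruction, significant at the 0.05 level
```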
The smallest experiment with at least one two-tailed $p < 0.05$ has $N = 3$ (12 observations), the smallest with at least two has $N = 6$ (24 observations), and the smallest with three has $N = 9$ (36 observations). Tests with a variety of data sets show that the one-tailed and two-tailed $p$ values produced by minexp.p are identical to those given when the EXACT statement is used in the PROC LOGISTIC procedure of SAS 9.1 for Windows, except that SAS only reports the lower one-tailed $p$ value regardless of the direction of the alternative hypothesis.

Discussion
Our function is of course quite limited, and cannot be applied to experiments with varying cell maxima, more than two factors, factors with more than two levels, stratified data, and so on. Some modifications would not require an enormous increase in computational complexity. For example, computing $p$ values for an experiment with different cell maxima would simply involve substituting these maxima for $N$ in the appropriate places in the equations for $\ell$, $u$, and $P(N_{11} = i)$. Likewise, for an experiment with $n$ factors, each of the $2^n - 1$ $p$ values would be conditional on $2^n - 1$ marginal values, in a manner parallel to the conditioning on $b^+$, $c^+$, and $c^-$ in the two-factor case (the $2^n$th marginal value is superfluous given the sum of all cell maxima).
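As an illustration of the first modification, the conditional distribution with per-cell maxima might be sketched as follows. The interface and names here are our own invention, under the assumption that the four cells (in summary-form order) have maxima Na, Nb, Nc, Nd replacing the single N:

```r
## Sketch of the varying-cell-maxima extension: each of the four cells
## gets its own maximum, so each binomial coefficient uses the maximum
## of the cell it counts. Illustrative interface, not the paper's code.
perm.weights <- function(Na, Nb, Nc, Nd, bplus, bminus, cplus) {
  l <- max(0, bplus - Nb, cplus - Nc, cplus - bminus)
  u <- min(Na, bplus, cplus, Nd + cplus - bminus)
  i <- l:u
  w <- choose(Na, i) * choose(Nb, bplus - i) *
       choose(Nc, cplus - i) * choose(Nd, bminus - cplus + i)
  list(support = i, prob = w / sum(w))  # P(N11 = i) under H0
}
```

With all four maxima equal, this reduces to the equal-$N$ distribution of the minimalist case.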
The only real difficulty is that once we go beyond the conceptual simplicity of the minimalist experiment, there is no reason not to go all the way and create a full-fledged analogue to the analysis of variance for conditional inference in Bernoulli experiments. Such a function should also allow the user to define model formulas in the standard flexible R fashion, and output more than just the $p$ values, including objects for manipulation by other functions.
Nevertheless, our function makes a useful type of analysis available to researchers unable to obtain access to more comprehensive packages for exact statistics. Those impatient for a more comprehensive version are certainly welcome to build on the code snippet we have demonstrated here.