Biostatistical analysis of the micronucleus mutagenicity assay based on the assumption of a mixing distribution.

The in vivo micronucleus assay can be analyzed by comparing the number of micronuclei (MN) of several dose groups with those of a control group. In several publications, difficulties arose in estimating a suitable distribution for MN, even in the untreated historical control groups. Mitchell et al. described the presence of a subpopulation of more susceptible responders. Based on this assumption of such a subpopulation, score tests were used for the mixing distribution of responders and nonresponders (behavior same as in untreated control animals) within the dose groups. The power behavior of these tests was characterized with a simulation study. The advantage of score tests can be shown, even in the practical and important guideline case of only five animals per group.


Introduction
The statistical analysis of the in vivo micronucleus assay is based on significance tests for the differences between the numbers of micronucleated polychromatic erythrocytes (MN) in the control group and several dose groups. In several publications, difficulties arose in estimating a suitable distribution of the MN, even for the untreated case of historical control groups: a) Amphelett and Delow (1) described the validity of the Poisson distribution, b) Hart and Engberg-Petersen (2) found a good approximation to the binomial distribution, c) Mitchell et al. (3) reported a negative binomial distribution, d) Mackey and MacGregor (4) established an extra-binomial variation under treatment with clastogenic agents, e) Salsburg and Holden (5) detailed problems in choosing a suitable distribution for historical control data.
Mitchell et al. (3) discussed the presence of outliers in MN data in terms of the possible existence of a subpopulation of more susceptible responders. With this model, Ashby and Mirkova (6) explained the variation in the MN data. A theoretical background can be derived from the genetically based polymorphism in mammalian P-450 xenobiotic metabolizing enzymes (7). Another explanation is based on heritable strain differences in MN induced by polycyclic aromatic hydrocarbons (8). In addition, nonresponders may arise due to improper administration of the test substance in a single animal. This case will, however, not be considered here. Mitchell [...] analysis of historical control data in relation to concurrent control data and elimination of outliers with traditional statistical methods.
Due to the unclear distribution behavior of the outcome variable, rank tests were commonly used in several papers. For example, Leimer et al. (9) described the application of the Fisher-Pitman permutation test on micronucleus assay data. On the other hand, MacGregor et al. (10) recommended the use of Armitage's (11) trend test assuming a binomial distribution for MN in relation to the global number of polychromatic erythrocytes. In this respect, rank or permutation tests avoid the pooling of MN within the groups under the binomial sampling assumption and consider the importance of animal-to-animal variation. For this reason, special types of rank tests (so-called score tests), assuming a mixing distribution for the number of responders and nonresponders in the dose groups, will be considered here.
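The permutation idea behind tests such as the Fisher-Pitman test can be sketched as follows. This is a minimal illustration, not the procedure of Leimer et al. (9): the function name, the choice of the dose-group sum as test statistic, and the MN counts are assumptions made for the example.

```python
from itertools import combinations


def fisher_pitman_pvalue(control, dose):
    """Exact one-sided permutation p-value for an increase in the dose group.

    The statistic is the sum of the dose-group observations. The p-value is
    the share of all regroupings of the pooled animals whose dose-group sum
    is at least as large as the observed one.
    """
    pooled = list(control) + list(dose)
    n_dose = len(dose)
    observed = sum(dose)
    hits = 0
    total = 0
    for idx in combinations(range(len(pooled)), n_dose):
        total += 1
        if sum(pooled[i] for i in idx) >= observed:
            hits += 1
    return hits / total


# Hypothetical MN counts per animal (illustrative values, not data from the paper).
control = [1, 2, 0, 3, 1]
dose = [4, 6, 5, 7, 8]
p = fisher_pitman_pvalue(control, dose)
print(p)
```

Because each animal is kept as one observation, no pooling of MN within a group takes place, which is the property of permutation tests stressed above. With five animals per group there are only 252 regroupings, so the smallest attainable one-sided p-value is 1/252.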

Analysis Based on the Mixing Distribution Assumption
Several methods assuming a mixing distribution can be used to solve the test problem. Here, only nonparametric score tests for the Lehmann (12) alternative hypothesis will be used (formulated as a one-sided, two-sample problem without limiting generalization).
Let $X_1, \ldots, X_m$ be the MN responses of the control group with the distribution function $H(x)$, and let $Y_1, \ldots, Y_n$ be the MN responses of a dose group with the distribution function $G(x)$. The null hypothesis is $H_0: H(x) = G(x)$. Under the mixing distribution assumption, the alternative treats the dose group as a mixture of responders and nonresponders, where p is the proportion of nonresponders, (1 - p) is the proportion of responders, and p is assumed unknown.
Two types of Lehmann alternative will be considered here: the shift alternative $F_{patho}(x) = G(x - \delta)$ according to Good (13) and the power alternative $F_{patho}(x) = G^{a}(x)$ according to Lehmann (12). Johnson et al. (14) suggested approximate score statistics for the shift alternative based on a mixed normal score function, where d is a constant (in the simulation study reported below, d = 0.5, 1, 1.5, 2 were used; here, only the case d = 1 will be reported) and $\Phi$ is the distribution function of the standard normal distribution.
Conover and Salsburg (15) proposed the following approximate score function for the power alternative, as a generalization of Wilcoxon-Mann-Whitney (WMW) scores: $sc(i) = (i/(m + n + 1))^{a-1}$, where i is the rank in the combined (x+y) sample and a is an integer constant (in the simulation study, a = 3, 4, 5, 6 were used; here, only the case a = 4 will be reported). In toxicology, tests based on this mixing distribution assumption have been used for behavioral studies (16), teratological studies (17), sister chromatid exchange (I), and chronic studies (18).
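The score function for the power alternative can be turned into a two-sample test as sketched below. The sketch assumes that the exponent of the score function is a - 1, that ties receive midranks (the text does not say how ties are handled), and that the p-value is taken from the exact permutation distribution of the dose-group score sum. The function names are hypothetical.

```python
from itertools import combinations


def conover_salsburg_scores(control, dose, a=4):
    """Return scores (rank / (m + n + 1)) ** (a - 1) for control and dose values.

    Ties receive midranks, which is an assumption of this sketch.
    """
    pooled = list(control) + list(dose)
    total = len(pooled)

    def midrank(v):
        less = sum(1 for u in pooled if u < v)
        equal = sum(1 for u in pooled if u == v)
        return less + (equal + 1) / 2

    scores = [(midrank(v) / (total + 1)) ** (a - 1) for v in pooled]
    return scores[:len(control)], scores[len(control):]


def score_test_pvalue(control, dose, a=4):
    """Exact one-sided permutation p-value based on the dose-group score sum."""
    ctrl_scores, dose_scores = conover_salsburg_scores(control, dose, a)
    pooled = ctrl_scores + dose_scores
    observed = sum(dose_scores)
    hits = 0
    total = 0
    for idx in combinations(range(len(pooled)), len(dose_scores)):
        total += 1
        # Small tolerance guards against floating-point noise in equal sums.
        if sum(pooled[i] for i in idx) >= observed - 1e-12:
            hits += 1
    return hits / total


# Hypothetical MN counts per animal (illustrative values, not data from the paper).
control = [1, 2, 0, 3, 1]
dose = [4, 6, 5, 7, 8]
print(score_test_pvalue(control, dose, a=4))
```

With a = 2 the scores are linear in the ranks, so the permutation test reduces to the WMW test. Larger values of a put more weight on the highest ranks, which is where a subpopulation of responders would be expected to appear.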

Simulation Study
In a simulation study, two questions will be addressed: a) Is the assumption of such a mixing distribution a suitable approach for analyzing data from the micronucleus assay? b) Can we observe an increase in power (e.g., in relation to the commonly used WMW U-test), even in the guideline case considered here, where nj equals only 5?
The empirical distribution shown in Table 1 [(3), which approximates a negative binomial distribution] was generated for the control groups using a PC program. Power estimations were based on [...]. Table 2 shows that the score tests give a higher power than the WMW test for a medium-size effect between the control group and the dose group (represented by a shift of 2) and the typical α level of 0.05, even for only one nonresponder in five animals (p = 0.2). These power differences are not relevant for smaller shift parameters (e.g., 1). The differences arise at larger α levels, so the combination of nj = 5 and α = 0.01 should be avoided.
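A power simulation of this kind can be sketched as follows. The control distribution in the code is a hypothetical stand-in for Table 1, and the number of runs, the seed, and all function names are assumptions of the sketch. Each dose animal is a nonresponder with probability p and is otherwise shifted by a fixed amount. The WMW test is represented by the score exponent a = 2 and the score test by a = 4, both evaluated with exact permutation p-values.

```python
import random
from itertools import combinations

# Hypothetical control distribution of MN counts; NOT Table 1 of the paper.
CONTROL_COUNTS = [0, 1, 2, 3, 4, 5]
CONTROL_WEIGHTS = [20, 30, 25, 15, 7, 3]


def draw_control(rng):
    return rng.choices(CONTROL_COUNTS, weights=CONTROL_WEIGHTS)[0]


def draw_dose(rng, p, shift):
    """Nonresponder with probability p (control behavior), otherwise shifted."""
    value = draw_control(rng)
    return value if rng.random() < p else value + shift


def midranks(values):
    return [sum(u < v for u in values) + (sum(u == v for u in values) + 1) / 2
            for v in values]


def score_pvalue(control, dose, a):
    """Exact one-sided permutation p-value using scores (rank/(N+1))**(a-1)."""
    pooled = list(control) + list(dose)
    scores = [(r / (len(pooled) + 1)) ** (a - 1) for r in midranks(pooled)]
    observed = sum(scores[len(control):])
    sums = [sum(scores[i] for i in idx)
            for idx in combinations(range(len(pooled)), len(dose))]
    return sum(s >= observed - 1e-12 for s in sums) / len(sums)


def estimate_power(a, p, shift, n=5, alpha=0.05, runs=200, seed=1):
    """Share of simulated experiments in which the test rejects at level alpha."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(runs):
        control = [draw_control(rng) for _ in range(n)]
        dose = [draw_dose(rng, p, shift) for _ in range(n)]
        if score_pvalue(control, dose, a) <= alpha:
            rejections += 1
    return rejections / runs


if __name__ == "__main__":
    for label, a in (("WMW (a=2)", 2), ("score test (a=4)", 4)):
        print(label, estimate_power(a, p=0.2, shift=2))
```

The estimates depend on the assumed control distribution and on the small number of runs, so they are not expected to reproduce the values in Table 2. The sketch only shows how the two tests can be compared on identical simulated data.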
The question that arises is whether increasing the number of animals up to 10 will give clear advantages of the score tests. Table 3 presents the related power estimations. For small and medium shift parameters, the increase in power of the score tests is larger than in the small sample size situation.
The dependence of the power behavior on the proportion of nonresponders p is given in Table 4, which presents the differences between the score tests and the WMW test. These are seen to be negligible both in the direction of a small proportion of nonresponders (unimodal distribution of all animals exhibiting a large reaction) and in the direction of a high proportion of nonresponders (unimodal distribution of animals exhibiting a small reaction; the estimation for p = 0, equal to α, is not given in this table). Advantages of score tests are seen for proportions of p = 0.2-0.8, whereby the dependence (based on the efficiency [...] (19)) is not symmetric about p = 0.5 (Fig. 1). Only one point of the power function is given in Tables 2-4. Therefore, the power functions for selected values of p, nj, sD, and α are shown in Figure 2. For medium-size shifts, the differences among the power functions are important up to a maximum shift value (decreasing with smaller α levels), after which parallelism of the power functions holds true.
These simulation results for the biostatistical analysis of the micronucleus assay suggest that the score tests have an advantage in power in relation to the commonly used WMW test. These advantages are particularly relevant for a) medium to large effect differences between the control and dose group, b) ranges of p = 0.2, ..., 0.8, c) values of nj = 5 and α = 0.05. This advantage increases as the sample size, nj, and the α level become larger. The micronucleus assay sometimes represents a "control versus k dose groups design" for a one-sided, ordered alternative hypothesis (because only increasing MNs with increasing doses are considered biologically significant). Based on the two-sample tests described above, a simple a priori ordering procedure (20) can be used in this case.
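The text does not spell out the a priori ordering procedure of (20). One common scheme of this type, assumed here purely for illustration, is fixed-sequence testing: the dose groups are compared with the control in a prespecified order from the highest dose downward, each at the full α level, and testing stops at the first nonsignificant comparison. The function name and the p-values below are hypothetical.

```python
def fixed_sequence(p_values_high_to_low, alpha=0.05):
    """Return the dose labels declared significant under fixed-sequence testing.

    p_values_high_to_low: list of (dose_label, one_sided_p_value) pairs,
    ordered from the highest dose to the lowest. Testing stops at the first
    comparison whose p-value exceeds alpha.
    """
    significant = []
    for dose_label, p_value in p_values_high_to_low:
        if p_value > alpha:
            break
        significant.append(dose_label)
    return significant


# Hypothetical two-sample p-values (control versus each dose group).
example = [("high", 0.01), ("mid", 0.03), ("low", 0.20)]
print(fixed_sequence(example))
```

The two-sample p-values fed into such a scheme could come from any of the tests discussed above, including the score tests.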

An Example
Experimental data from Kliesch et al. (21) were used for a micronucleus assay on mice, 24 hr after single per os treatment with methyl methane sulfonate (MMS) (Table 5). Results of the biostatistical analysis are shown in Table 6. This example shows the greater sensitivity for the contrast between the control group and dose group 10 for both the Fisher permutation test and the score test.

Conclusions
From the results presented here, one can conclude that the choice of statistical method for the analysis of micronucleus assay data when MN is increasing relative to controls is not critical at the commonly used level of α = 0.05. However, a suitable choice of test is necessary for small or medium-sized increases in numbers of MN. This is applicable, for example, in the case of the no-observed-effect dose estimation. With a simulation study, based on an empirical negative binomial distribution of MN and a shift alternative, an advantage in the power behavior of selected score tests assuming a mixing distribution of responders and nonresponders is evident, even for the guideline case nj = 5, α ≥ 0.05, p > 0.1, and a medium-sized shift between dose and control groups.