A Web-based Simulator for Sample Size and Power Estimation in Animal Carcinogenicity Studies

A Web-based statistical tool for sample size and power estimation in animal carcinogenicity studies is presented in this paper. It can be used to provide a design with sufficient power for detecting a dose-related trend in the occurrence of a tumor of interest when competing risks are present. The tumors of interest typically are occult tumors for which the time to tumor onset is not directly observable. It is applicable to rodent tumorigenicity assays that have either a single terminal sacrifice or multiple (interval) sacrifices. The design is achieved by varying sample size per group, number of sacrifices, number of sacrificed animals at each interval, if any, and scheduled time points for sacrifice. Monte Carlo simulation is carried out in this tool to simulate experiments of rodent bioassays because no closed-form solution is available. It takes design parameters for sample size and power estimation as inputs through the World Wide Web. The core program is written in C and executed in the background. It communicates with the Web front end via a Component Object Model interface passing an Extensible Markup Language string. The proposed sta-1

the estimated power. In this study, the core simulation program is written in C.
Animal carcinogenicity bioassays are routinely used to evaluate the carcinogenic potential of chemical substances to which humans are exposed. In a typical animal carcinogenicity study on occult tumors, each animal is assumed to begin in a tumor-free state. Mice and rats are the commonly used species. They are randomized into a control group (typically, animals that are exposed to a control agent or observed without any exposure) or into 2 to 3 test groups that receive specified levels of exposure, and they are observed until they either die or are sacrificed. In an experiment with multiple sacrifices, the animals to be sacrificed are pre-assigned to a specific dose level and sacrifice time at the beginning of the experiment. In a single terminal sacrifice, all surviving animals are sacrificed and subjected to necropsy at the end of the experiment, which is typically a period of 104 weeks (2 years). During the study, age at death and information on the presence or absence of the tumor of interest are collected for each animal. The primary goal of the experiment is to assess a dose-related trend in the incidence of the tumor of interest across exposure levels of the test agent. The tumor of interest can be any occult tumor for which the time to tumor onset is not directly observable. Our software can also seek a reduced design (78-104 weeks) with acceptable power. The proposed statistical tool can also be used to seek an optimal design by choosing the design with the maximum power of the trend test for a given total sample size. This tool will help researchers conduct more efficient and cost-effective experiments.
The logrank test of Mantel and Haenszel (1959) may be used for comparing hazards of death from rapidly lethal tumors. To compare the prevalence of nonlethal (incidental) tumors, the prevalence test proposed by Hoel and Walburg (1972) may be used. However, the data obtained from a carcinogenicity experiment generally contain a combination of fatal and incidental tumors. Peto et al. (1980) suggested combining the fatal and incidental tests for comparing tumor onset distributions. The procedure proposed by Peto et al. has been called the cause-of-death test or the Peto test.
The development of the presented statistical tool is motivated by a series of animal studies at M. D. Anderson Cancer Center that explores the mechanisms underlying the chemopreventive effects of test agents. The first step is to establish the carcinogenic potential of the tobacco carcinogen NNK, a byproduct of tobacco smoke, in retinoic acid receptor-β (RAR-β) transgenic mice. The Peto test, recommended by the International Agency for Research on Cancer (IARC), is used to compare the tumor incidence rate among groups in the presence of potential confounders. It is a widely used statistical test to determine a dose-related trend for a test agent in the occurrence of occult tumors.
The purpose of this paper is to present a statistical tool for Web-based sample size and power estimation using the Peto test statistic to provide sufficient power for detecting a dose-related trend in the occurrence of a tumor of interest. This package simulates rodent bioassay experiments with either a single sacrifice or multiple sacrifices. A comprehensive list of design parameters can be specified by the users through the WWW.
The underlying models are described in Section 2. The use of the proposed estimator is described in detail in Section 3. The design of a carcinogenesis experiment for lung cancer prevention research is presented in Section 4 as an example. Concluding remarks and suggested future study directions are given in the last section.

Model and Test for a Sample Size and Power Estimator
Consider a carcinogenicity/tumorigenicity experiment with a control group and $G - 1$ dose groups of animals. Suppose that $N_i$ animals are randomly assigned to the $i$-th group, and they are followed over time for the development of irreversible and occult tumors. The animals in the $i$-th group receive a dose level of $\ell_i$ of a test agent. We assume that all animals come from the same population and have no tumor on day zero of the experiment. The time scale is divided into $J$ intervals such that the $j$-th interval is given by $I_j = (t_{j-1}, t_j]$, $j = 1, \ldots, J$. Note that $t_0 = 0$ and $t_j$ denotes the sacrifice time point for $j = 1, \ldots, J$. For an experiment with either a single sacrifice or multiple sacrifices, $t_J$ denotes the terminal sacrifice time point.
It is assumed that three independent random variables completely determine the observed outcome for each animal. The random variables are the time to onset of tumor, $T_1$; the time after onset until death from the tumor, $T_2$; and the time to death from competing risks, $T_C$. Also let $T_D = T_1 + T_2$, where $T_D$ represents the overall time to death from the tumor of interest. Thus the tumor of interest is present in an animal at the time of death if $T_1 \le \min\{T_C, T_S\}$, where $T_S$ denotes the scheduled time to sacrifice of the animal. When $T_D \le \min\{T_C, T_S\}$, an animal dies from the tumor of interest. Otherwise, it dies from competing risks or sacrifice.
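The outcome rules above can be written as a small function. The following is a minimal sketch in Python, not the authors' C implementation; the function name `classify_animal` and the cause labels are hypothetical and chosen only for illustration.

```python
def classify_animal(t1, t2, tc, ts):
    """Classify one animal from its latent times (illustrative sketch).

    t1: time to tumor onset (T_1)
    t2: time from onset to tumor death (T_2)
    tc: time to death from competing risks (T_C)
    ts: scheduled sacrifice time (T_S)
    Returns (exit_time, tumor_present, cause).
    """
    td = t1 + t2                       # T_D, overall time to tumor death
    exit_time = min(td, tc, ts)        # the animal leaves the study at the first event
    tumor_present = t1 <= min(tc, ts)  # tumor found at necropsy
    if td <= min(tc, ts):
        cause = "tumor"                # death caused by the tumor of interest
    elif tc <= ts:
        cause = "competing"            # death from competing risks
    else:
        cause = "sacrifice"            # scheduled sacrifice
    return exit_time, tumor_present, cause
```

For instance, an animal with onset at week 50, a further 60 weeks to tumor death, competing-risk death at week 90 and sacrifice at week 104 dies at week 90 from competing risks with the tumor present, so the tumor is recorded in the incidental context.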

Distribution of the Random Variables
• Time to tumor onset ($T_1$)
Let $S_i(t)$ be the survival function of the $i$-th group with respect to the random variable $T_1$ representing time to onset of the tumor of interest. Assume $S_i(t)$ follows a Weibull distribution:
$$S_i(t) = \exp\{-\theta_i \delta_1 (t/t_{\max})^{\delta_2}\}, \qquad (1)$$
where $\delta_1 \ge 0$, $\delta_2 \ge 0$, and $t_{\max}$ represents the duration of the study or the time of the terminal sacrifice. The hazard ratio $\theta_i$ between the $i$-th dose group and the control group ($i = 1$) is typically greater than or equal to 1 ($\theta_i = 1$ for $i = 1$ and $\theta_i > 1$ for $i > 1$) for $i = 1, \ldots, G$. The scale parameter $\delta_1$ can be calculated by specifying the tumor onset probability $1 - S_1(t_{\max})$ at the end of the study in the control group. With $\theta_1 = 1$ and a given shape parameter $\delta_2$, $\delta_1 = -\log S_1(t_{\max})$. In this estimator, we allow the shape parameter $\delta_2$ of the Weibull distribution to range between 1.0 and 6.0 in order to reflect a wide variety of tumor onset distributions. When there are no competing risks, the tumor onset probability at the end of the study in each dose group is determined by the hazard ratio and the baseline tumor onset probability in the control group by the end of the study.
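• Time to death from competing risks ($T_C$)
The survival function for time to death from competing risks, $T_C$, is taken to be
$$Q_i(t) = \exp\{-\phi_i (\gamma_1 t + \gamma_2 t^{\gamma_3})\}, \qquad (2)$$
where $\phi_i \ge 1$, $\gamma_1 \ge 0$, $\gamma_2 \ge 0$ and $\gamma_3 \ge 0$ (Portier et al., 1986). With $\phi_1 = 1$ in the control group ($i = 1$), $\gamma_3$ can be calculated as
$$\gamma_3 = \frac{\log\{-\log Q_1(t_{\max}) - \gamma_1 t_{\max}\} - \log \gamma_2}{\log t_{\max}},$$
where $Q_1(t_{\max})$ is the probability of survival with respect to competing risks in the control group at the end of the study.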
The values of $\gamma_1$ and $\gamma_2$ are chosen as $10^{-4}$ and $10^{-16}$, respectively. These values are close to the ones fitted to historical control data, such as those for Fischer 344 rats and B6C3F1 mice in Portier et al. (1986). These parameter values can also be found in many other settings (Ahn, 1996, 1997; Ahn, Zhu and Yang, 1998; Ahn et al., 2002). The competing risks survival rate can be determined according to tumor types and historical data showing the survival rates of mice and rats. The value of $\phi_i$ can be calculated after specifying the competing risks survival rate

• Time to tumor death ($T_2$)
For simplicity, the survival distribution for tumor-induced mortality, $T_2$, is taken to have the same form as that for death from competing risks, and the values of $\gamma_1$, $\gamma_2$ and $\gamma_3$ remain the same as in (2). These types of models using a modified Weibull distribution can be found in other literature (Kodell, Chen and Moore, 1994; Ahn and Kodell, 1995; for the distribution of $T_2$ can also be considered.
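The parameter calculations described for $T_1$ and $T_C$ can be sketched as follows. This is a minimal Python illustration under stated assumptions rather than the authors' C code: it assumes the proportional-hazards Weibull form $S_i(t) = \exp\{-\theta_i \delta_1 (t/t_{\max})^{\delta_2}\}$, which is consistent with $\delta_1 = -\log S_1(t_{\max})$, and the Portier et al. (1986) form $\exp\{-\phi_i(\gamma_1 t + \gamma_2 t^{\gamma_3})\}$ for competing risks. All function names are hypothetical, and the time unit (days are used in the usage note) is an assumption.

```python
import math
import random

GAMMA1 = 1e-4   # gamma_1 as given in the text
GAMMA2 = 1e-16  # gamma_2 as given in the text

def weibull_scale(p_onset_control):
    """delta_1 = -log S_1(t_max), from the control-group onset probability."""
    return -math.log(1.0 - p_onset_control)

def onset_survival(t, theta, delta1, delta2, t_max):
    """Assumed onset survival function S_i(t) with hazard ratio theta."""
    return math.exp(-theta * delta1 * (t / t_max) ** delta2)

def sample_onset_time(theta, delta1, delta2, t_max, rng):
    """Draw T_1 by inverse transform: solve S_i(T) = U for T."""
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
    return t_max * (-math.log(u) / (theta * delta1)) ** (1.0 / delta2)

def solve_gamma3(q_control, t_max):
    """gamma_3 such that the control group (phi = 1) survives to t_max
    with probability q_control under the assumed competing-risks form."""
    residual = -math.log(q_control) - GAMMA1 * t_max
    if residual <= 0.0:
        raise ValueError("survival rate too high for the fixed gamma_1 and t_max")
    return (math.log(residual) - math.log(GAMMA2)) / math.log(t_max)

def competing_survival(t, phi, gamma3):
    """Assumed competing-risks survival function with multiplier phi."""
    return math.exp(-phi * (GAMMA1 * t + GAMMA2 * t ** gamma3))

def solve_phi(q_dose, q_control):
    """phi_i from the survival rates, since Q_i(t_max) = Q_1(t_max) ** phi_i."""
    return math.log(q_dose) / math.log(q_control)
```

As a usage example, for a 78-week study measured in days (`t_max = 546`), `solve_gamma3(0.85, 546)` gives the $\gamma_3$ that reproduces an 85% competing risks survival rate in the control group, and `solve_phi(0.50, 0.85)` gives the multiplier for a dose group with a 50% rate.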

Construction of the Peto Test
The data are generated according to the distributions of $T_1$, $T_2$ and $T_C$ for each animal. They are collected at the $j$-th interval according to the following five events: animals On the other hand, an animal dies without the tumor of interest in the $j$-th interval if $T_C \in I_j$ and $T_1 > T_C$. A sacrificed animal has the tumor of interest at the time of It does not have the tumor at the time of sacrifice when $T_1 > t_j$ and $T_C > t_j$. These data are applied to the Peto test to estimate sample size and power.
First, consider the animals that did not have the specific tumor before death or tumor-bearing animals for which that tumor was not the cause of death. Let $n_{ij} = a_{1ij} + a_{2ij} + b_{1ij} + b_{2ij}$ be the number of animals in group $i$ dying during interval $I_j$ from causes unrelated to the presence of the tumor of interest, and let $y_{ij} = a_{1ij} + a_{2ij}$ be the number of these animals in which the tumor was observed in the incidental context, for $i = 1, \ldots, G$ and $j = 1, \ldots, J$. For each interval $I_j$, the tumor prevalence data may be summarized in a $2 \times G$ table, as in Table 1. All tumors found in sacrificed animals are classified as incidental. The intervals defined by the pre-assigned NTP intervals (Bailer and Portier, 1988) are recommended to implement the incidental part of the Peto test.
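The expected number of tumors in the $i$-th group for the $j$-th interval is
$$E_{ij} = y_{\cdot j} K_{ij},$$
where $K_{ij} = n_{ij}/n_{\cdot j}$ and a dot in place of a subscript denotes summation over that index. Thus, the observed and expected numbers of tumors in the $i$-th group over the entire experiment are
$$O_i = \sum_{j=1}^{J} y_{ij} \quad \text{and} \quad E_i = \sum_{j=1}^{J} E_{ij},$$
respectively. The covariance between the $r$-th and $i$-th groups is
$$V_{ri} = \sum_{j=1}^{J} \kappa_j K_{rj} (\delta_{ri} - K_{ij}), \qquad \kappa_j = \frac{y_{\cdot j}(n_{\cdot j} - y_{\cdot j})}{n_{\cdot j} - 1},$$
and $\delta_{ri}$ is defined as 1 if $r = i$ and 0 otherwise.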
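The per-interval calculation can be illustrated with a short Python sketch. It assumes the standard hypergeometric form of the Peto components, $E_{ij} = y_{\cdot j} K_{ij}$ and $V_{rij} = \kappa_j K_{rj}(\delta_{ri} - K_{ij})$ with $\kappa_j = y_{\cdot j}(n_{\cdot j} - y_{\cdot j})/(n_{\cdot j} - 1)$; the function name `peto_interval` is hypothetical and this is not the authors' code.

```python
def peto_interval(n, y):
    """Expected counts and covariance for one interval's 2 x G table.

    n[i]: animals of group i counted in the interval (n_ij)
    y[i]: those among them found with the tumor (y_ij)
    Returns (E, V): E[i] is the expected count, V[r][i] the covariance.
    """
    G = len(n)
    n_tot = sum(n)  # n_.j
    y_tot = sum(y)  # y_.j
    K = [n_i / n_tot for n_i in n]  # K_ij = n_ij / n_.j
    E = [y_tot * k for k in K]      # E_ij = y_.j * K_ij
    # kappa_j is the hypergeometric variance factor; zero when n_.j <= 1
    kappa = y_tot * (n_tot - y_tot) / (n_tot - 1) if n_tot > 1 else 0.0
    V = [[kappa * K[r] * ((1.0 if r == i else 0.0) - K[i]) for i in range(G)]
         for r in range(G)]
    return E, V
```

Summing `E` and `V` over all intervals gives the totals for the whole experiment. With two groups of 10 animals and 2 and 6 tumors, respectively, each group has an expected count of 4 and each row of `V` sums to zero, as it must because the total number of tumors in the interval is treated as fixed.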
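Second, consider the animals that died with a tumor of interest. The method used for the fatal tumors is similar to that used for the incidental tumors. Table 2 is a contingency table for the fatal tumors. The expected number of tumors for Table 2 is calculated in the same way as for the incidental tumors, and the corresponding covariance is obtained likewise. The analysis of data on occult tumors using contexts of observation is based on the sums of the incidental and fatal components: let $O$ and $E$ denote the $G$-vectors of observed and expected numbers of tumors summed over both contexts, and let $V$ denote the correspondingly summed covariance matrix. Then a dose-related trend test can be considered by using
$$Z_R = \frac{l^T (O - E)}{\sqrt{l^T V l}},$$
where $l = (\ell_1, \ldots, \ell_G)^T$, and $\ell_i$ stands for the dose metric for the $i$-th group with $0 = \ell_1 < \ell_2 < \cdots < \ell_G$. Under the null hypothesis, $Z_R$ is asymptotically distributed as a standard normal.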
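A trend statistic of this kind can be computed as in the sketch below. It assumes the usual form of the Peto trend statistic, $Z_R = l^T(O - E)/\sqrt{l^T V l}$, with $O$, $E$ and $V$ already summed over the incidental and fatal contexts; the function names are hypothetical.

```python
import math

def trend_z(dose, observed, expected, cov):
    """Trend statistic Z_R = l'(O - E) / sqrt(l' V l) (assumed standard form)."""
    G = len(dose)
    numerator = sum(dose[i] * (observed[i] - expected[i]) for i in range(G))
    variance = sum(dose[r] * cov[r][i] * dose[i]
                   for r in range(G) for i in range(G))
    if variance <= 0.0:
        raise ValueError("degenerate table: l'Vl is not positive")
    return numerator / math.sqrt(variance)

def one_sided_p(z):
    """Upper-tail standard normal probability P(Z > z)."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))
```

With dose metric `[0, 1]`, observed counts `[2, 6]`, expected counts `[4, 4]` and a covariance matrix whose entries are ±24/19, the statistic is 2 divided by the square root of 24/19. A one-sided test at the 5% level rejects when the statistic exceeds about 1.645.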

Usage of Sample Size and Power Estimator
The proposed estimator for sample size and power takes input parameters from a series of pages on the Web site (http://biostatistics.mdanderson.org/ACSS). The title page in Figure 1 provides a general description of the proposed estimator. Clicking the Continue button leads to a user log-in and registration page, as shown in Figure 2. A new user is required to register to obtain a user name and a password.
Users may save or retrieve their work by entering or selecting a file name. For a test run, the session may remain as "default", as shown in Figure 3. A default session name automatically generated by concatenating the user name, date and time is provided, and it shall be used as the "subject" of an e-mail for delivering the output. The session name may be changed as the user wishes. Figure 4 shows a page of the detailed input parameters in an experimental design. It starts with requesting three input parameters (a) the number of dose groups, (b) whether the experiment uses multiple sacrifices or a single terminal sacrifice, and (c) an integer seed for the random number generator.
The number of dose (or treatment) groups commonly considered is 2 to 4, including the control group. An experiment with multiple sacrifices is simulated with sacrifices at specified interim time points, as well as a terminal sacrifice at the end of the study. All the remaining live animals are assumed to be sacrificed at the end of the experiment. The number of scheduled sacrifices, including the terminal sacrifice, is typically either 3 or 4 in a two-year study. In addition, a seed for the random number generator can be chosen by the user as any positive integer for the Monte Carlo simulation. Either the same number of sacrificed animals or a different number of animals can be specified in each group and/or each interval.
The probability of tumor onset by the end of the study in the control group needs to be specified in a design to determine the time to tumor onset. The tumor onset probability of each dose group, therefore, is determined by a hazard ratio and the tumor onset probability in the control group. The underlying distribution of time to tumor onset is assumed to be a Weibull distribution. The shape parameter in the Weibull tumor onset distribution ranges between 1.0 (exponential distribution) and 6.0 in order to reflect a wide variety of tumor onset distributions. The hazard ratio is the ratio of the tumor onset hazard in a dose group to that in the control group. The significance level of the test typically can be specified as 1%, 5% or 10%. The choice of a one- or two-sided test is also needed to estimate power. The number of data sets (i.e., simulation runs) needs to be decided for the Monte Carlo simulation. In this example, 5000 simulation runs are given as a default value. The e-mail address of the user is required to receive simulated results via e-mail.
Before the input parameters are submitted, all design parameters are displayed once again on the Web page for verification (see Figure 7). Once all the input parameters are confirmed, they are submitted to the compiled C program for simulation runs.
When the simulation runs are completed, an output file is sent to the e-mail address specified by the user. A simulation with 5000 runs typically takes about 5 minutes.
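The way a Monte Carlo power estimate arises from the simulation runs can be sketched as follows. This is an illustration in Python, not the compiled C program: `estimate_power` is a hypothetical name, and `toy_z` is a placeholder that draws a normal variate where the real tool would simulate one complete bioassay and compute the Peto trend statistic.

```python
import math
import random

def estimate_power(simulate_z, n_runs, critical_value, seed):
    """Fraction of simulated data sets whose statistic exceeds the cutoff.

    simulate_z: function taking a random.Random and returning one statistic
    Returns (power, monte_carlo_standard_error).
    """
    rng = random.Random(seed)  # a user-chosen seed makes the runs reproducible
    rejections = sum(1 for _ in range(n_runs)
                     if simulate_z(rng) > critical_value)
    power = rejections / n_runs
    std_err = math.sqrt(power * (1.0 - power) / n_runs)
    return power, std_err

def toy_z(rng):
    """Placeholder for 'simulate one experiment and return Z_R' (assumption)."""
    return rng.gauss(2.5, 1.0)

# One-sided 5% test: reject when the statistic exceeds about 1.645.
power, std_err = estimate_power(toy_z, 5000, 1.645, seed=12345)
```

The binomial standard error gives a sense of the simulation precision: with 5000 runs and a true power near 80%, it is about 0.6 percentage points, so differences between designs smaller than roughly one percentage point should be interpreted with caution.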
The output file contains design parameters, average tumor rate, average competing risks survival rate, average lethality rate for each group, and average death rate with information on tumors and sacrifices per dose group. At the end, the power to detect a dose-related trend is shown. Figure 8 shows an image of the output file generated from the above example.

Example: Testing the Carcinogenic Potential of NNK in a Prevention Study for Lung Cancer
The development of this statistical tool was motivated by a recent lung cancer prevention study developed at M. D. Anderson Cancer Center. One particular goal was to evalu- The power computation is based on the following assumptions. One hundred antisense RAR-β hemizygous (+/0) mice and one hundred antisense RAR-β homozygous (+/+) mice will be obtained for the experiments. For each type of mice, fifty mice will be randomized into a group that either receives or does not receive exposure to NNK.
To test the carcinogenic potential of NNK, a dose metric of 0 or 1 is used for the control group or exposed group, respectively. Serial sacrifices are scheduled at weeks 39, 52, 65, and 78 (the end of the study). The time to lung cancer development is assumed to follow a Weibull distribution with shape parameter 3. It is expected that 55% of the antisense RAR-β hemizygous mice will develop lung cancer by 78 weeks. On the other hand, 86% of the antisense RAR-β homozygous mice are expected to develop lung cancer by 78 weeks. In this experimental design, an 85% competing risks survival rate is considered. The lung adenomas and adenocarcinomas are assumed to be highly lethal. Five thousand simulation trials are run.
The statistical power (in %) under the one-sided 5% nominal significance level is listed in Table 6. The hazard ratio between the treatment group and the control group in each type of mice is chosen as 2.0, 2.5 and 3.0. Three different designs are considered. The first estimates power with 6 mice at each serial sacrifice (39, 52 and 65 weeks) out of a total of 50 mice per group. The second calculates power with 6 sacrifices at 52 weeks out of a total of 55 mice in the control group and with 3 sacrifices at 52 weeks out of a total of 45 mice in the dose group. The third estimates power with a total of 30 mice per group and 3 mice per serial sacrifice at 39, 52 and 65 weeks.
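To make these hazard ratios concrete, the implied tumor onset probabilities in the exposed groups can be worked out by hand. The calculation below is a sketch that assumes proportional hazards for tumor onset, no competing risks, and that the 55% and 86% figures are the unexposed (control) onset probabilities $p_1$ at 78 weeks; the symbol $p_i$ is introduced here only for illustration.

```latex
% Under proportional hazards, S_i(t_max) = S_1(t_max)^{\theta_i}, hence
p_i = 1 - (1 - p_1)^{\theta_i}.

% Hemizygous mice, p_1 = 0.55:
\theta = 2.0:\quad 1 - 0.45^{2.0} = 0.7975 \\
\theta = 2.5:\quad 1 - 0.45^{2.5} \approx 0.8642 \\
\theta = 3.0:\quad 1 - 0.45^{3.0} \approx 0.9089

% Homozygous mice, p_1 = 0.86:
\theta = 2.0:\quad 1 - 0.14^{2.0} = 0.9804 \\
\theta = 2.5:\quad 1 - 0.14^{2.5} \approx 0.9927 \\
\theta = 3.0:\quad 1 - 0.14^{3.0} \approx 0.9973
```

Under these assumptions, the exposed homozygous mice are close to certain to develop the tumor by 78 weeks at every hazard ratio considered, whereas the hemizygous mice leave more room for the hazard ratio to change the onset probability.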
The same competing risks survival rate (85% for the control and a dose group) and different competing risks survival rates (85% for the control, 50% for a dose group) are considered in this example. Under the null hypothesis of no treatment effect on the tumor of interest, it may still be reasonable to assume different competing risk rates among different dose groups. For example, even though the carcinogen may have no effect on the development of lung cancer, it may increase competing risks, such as the development of a liver tumor or bladder tumor, and consequently result in lower competing risks survival rates in the dose groups. Lung cancer is typically considered highly lethal in humans. The median survival time is less than a year for stage III and less than 6 months for stage IV non-small cell lung cancer (Ginsberg et al., 1993).
Lethality parameters of 1500 and 800 are selected in hemizygous and homozygous mice, respectively, to reflect highly lethal tumors. Expected lethality corresponding to these parameters is about 80% in this example.
With 50 hemizygous mice in each group and the same competing risks survival rate between groups, 79.9% power was achieved to detect a hazard ratio of 2. However, the power decreased slightly to 75.8% with different competing risks survival rates. On the other hand, at least 80% power was achieved with 50 homozygous mice in each group. With 30 mice in each group of hemizygous and homozygous mice, we had less than 80% power to detect a hazard ratio of 2 in the presence of both the same and different intercurrent mortalities. A better design, in this example, was achieved with the control group of 6 mice for an interim sacrifice at 52 weeks out of 55 mice in total, and with a treatment group of 3 mice for an interim sacrifice at 52 weeks out of 45 mice in total. In this design, 80% power was achieved to detect a hazard ratio of 2 or higher among antisense RAR-β hemizygous and homozygous mice with the same and different competing risks survival rates.

Discussion
We have developed what is, to our knowledge, the first Web-based sample size and power simulator for animal carcinogenicity studies designed to detect a dose-related trend in tumor incidence following exposure to a putative carcinogen. It is applicable to studies on occult tumors for which the time to tumor onset is not directly observable. It was designed for rodent bioassays that have either multiple sacrifices or a single terminal sacrifice.
The Peto test (Peto et al., 1980) is used to compare the incidence rate of occult tumors among groups in the presence of potential confounders. It requires data with cause-of-death information determined by pathologists. Monte Carlo simulation is used in this tool to simulate experiments of rodent bioassays because no closed-form solution is available.
This package can be used to construct a design with sufficient power to detect specified hazard ratios by varying the sample size, number of sacrifices, time points for sacrifice, and number of sacrificed animals at each sacrifice, if any, under the given design considerations. As an example, the application of this tool was illustrated by an animal experiment for lung cancer prevention research. An advantage of the proposed Web-based sample size and power estimator is its wide accessibility, provided that an Internet connection and a Web browser are available (Microsoft Internet Explorer 5.5 and above, or Netscape 4.76 and above). In addition, this statistical tool provides a user-friendly environment so that the user can search for an optimal design. Through a Monte Carlo simulation study, the proposed tool can seek an efficient design. For a given sample size and study duration, an optimal design can be obtained using our tool by choosing the design with the maximum power of the trend test. This tool can help investigators conduct more efficient and cost-effective experiments.