Multiple comparison among groups of growth curves.

We discuss the problem of comparing a sequence of independent experiments, divided into several groups, with a control under logistic growth-curve models. We propose a method for constructing multiple testing procedures that combines closed testing procedures with a random-effect model for summarizing the estimated parameter values.


Introduction
In toxicological studies it is of great importance to evaluate the side effects or toxicity of new drugs, industrial compounds, and environmental contaminants. We are interested in finding the maximum dose level below which no toxicity is observed and, if such a level exists, in judging whether it is tolerable. We call this level the maximum noneffective dose (MNED). Ruberg (1) considers the problem of inference about the minimum effective dose (MED) by comparing various dose groups with a control and concludes from simulation studies that contrast procedures are superior to the other multiple comparison procedures (2,3). If we design an experiment in which the differences in dose among groups are small, the MNED and the MED are almost the same. In this article we focus on inferring the MNED from the viewpoint of a safe dose.
In the usual dose-response studies we can use the standard multiple comparison techniques described above. However, it is difficult to apply them when the responses are observed continually under constant exposure to some dose level and the observations are arranged on the time axis for each experiment. The problem is how to select the time point at which to compare the two groups. Even if we select one point for comparison, we may lose other information, such as the fact that the observations of one group are always lower than those of the other groups. This situation is illustrated in Figure 1. We are interested in comparing groups in which each animal is observed continually, and we propose a multiple comparison technique for these types of data sets.
The subscripts i, j, and k denote, respectively, the ith group, the jth individual in each group, and the kth observation for each individual on the time axis. For example, $x_{ijk}$ is the weight of the jth animal at time k in the ith group. We assume zero dose for i = 0 as a control group and that the dose level increases monotonically with index i. Our main purpose is to compare the weight curves observed on the time axis for each animal in the (a + 1) groups. We would like to know whether there are any differences in weight curves among dose groups. Can we draw a line at the dose level below which we may conclude that there are no effects on, or no obstructions to, growth compared with the control group? This question drives us to investigate the maximum noneffective dose statistically. Yoshimura (4) considers this problem without time-dependent observations from the viewpoint of the standard multiple comparisons. However, we cannot extend those multiple comparison techniques to the problem of weight curves. Instead, we first fit a growth-curve model with a few parameters and then compare the derived estimates of these parameters.
We assume logistic growth curves as the weight curves for animals, described as follows:
$$x_j = \frac{K}{1 + e^{-\beta(t_j + \mu)}} + \varepsilon_j \qquad (2)$$
Here $\varepsilon_j$ is independently normally distributed with mean 0 and variance $\sigma^2$. Note that the indexes i and k are omitted for simplicity. The parameter K is interpreted as the final weight of an animal under constant exposure to a dose of some level, so the estimate of K is one of our primary interests. We first compute the estimates by maximizing the likelihood for each individual. In fact, we must solve the n(a + 1) likelihood equations, one for each animal. We then obtain 4n(a + 1) parameter estimates, $\{\hat{K}_{ij}, \hat{\beta}_{ij}, \hat{\mu}_{ij}, \hat{\sigma}^2_{ij}\}$ (i = 0, 1, ..., a; j = 1, ..., n), and their asymptotic variances. In the following section we propose a method to summarize these estimates for multiple comparisons.
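As an illustration of the per-animal fit, the following minimal Python sketch evaluates a logistic growth curve and maximizes the normal likelihood for one animal. It is not the authors' procedure: the parameter names `beta` (growth rate) and `mu` (time shift), the coarse grid search that stands in for solving the likelihood equations, and the noise-free weights are all assumptions made for the example. For fixed `beta` and `mu` the model is linear in K, so the maximum likelihood estimates of K and of the error variance have closed forms, and maximizing the profile likelihood is equivalent to minimizing the estimated variance.

```python
import math

def logistic_curve(t, K, beta, mu):
    """Logistic growth curve: weight at time t for final weight K."""
    return K / (1.0 + math.exp(-beta * (t + mu)))

def profile_K_sigma2(times, weights, beta, mu):
    """ML estimates of K and sigma^2 for fixed (beta, mu) under normal errors."""
    g = [logistic_curve(t, 1.0, beta, mu) for t in times]
    K_hat = sum(x * gi for x, gi in zip(weights, g)) / sum(gi * gi for gi in g)
    resid = [x - K_hat * gi for x, gi in zip(weights, g)]
    sigma2_hat = sum(r * r for r in resid) / len(weights)
    return K_hat, sigma2_hat

def fit_by_grid(times, weights, betas, mus):
    """Pick the grid point (beta, mu) with the smallest residual variance,
    which is the point with the largest profile likelihood."""
    best = None
    for beta in betas:
        for mu in mus:
            K_hat, s2 = profile_K_sigma2(times, weights, beta, mu)
            if best is None or s2 < best[3]:
                best = (K_hat, beta, mu, s2)
    return best

# Hypothetical, noise-free weights for one animal (not data from the study).
times = list(range(0, 95, 5))
weights = [logistic_curve(t, 400.0, 0.1, -20.0) for t in times]
K_hat, beta_hat, mu_hat, sigma2_hat = fit_by_grid(
    times, weights, betas=[0.05, 0.1, 0.15], mus=[-30.0, -20.0, -10.0])
```

In practice a numerical optimizer over `beta` and `mu` would replace the grid, and the asymptotic variances would come from the inverse of the observed information matrix, which this sketch does not compute.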

Multiple Comparisons by Random-Effects Model
A random-effect model is useful for summarizing the estimated parameters for each group. Korn and Whittemore (5) use the random-effect model to obtain overall estimates, assuming a multiple logistic model for each patient.
We assume that $\hat{K}_{ij}$ is normally distributed with mean $K_{ij}$ and ... The operator Sb is used to generate the covariance matrix of the vector argument. In our case the covariance matrix is a simple diagonal matrix whose elements correspond to the asymptotic variances. Under the null hypothesis, the statistic given by Equation 6 has asymptotically a chi-square distribution with a degrees of freedom. For multiple comparisons we can use the closed testing procedures proposed by Marcus et al. (6), which require that the set of hypotheses is closed under intersection and that each test is of level $\alpha$. We can then assure that the overall error rate is no greater than $\alpha$ if we use these multiple comparison procedures. We consider the following set of hierarchical hypotheses closed under intersection: $H_1: K_0 = K_1$; $H_2: K_0 = K_1 = K_2$; ...; $H_a: K_0 = K_1 = K_2 = \cdots = K_a$.
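The idea of summarizing per-animal estimates with a random-effect model can be sketched as follows. Because the estimating equations are not reproduced in this text, the sketch substitutes a standard method-of-moments (DerSimonian-Laird type) estimator of the between-animal variance; this is a swapped-in technique, not necessarily the one used in the paper, and the function name, the symbol `tau2`, and the input numbers are hypothetical.

```python
def random_effects_summary(estimates, variances):
    """Summarize the K estimates of the animals in one group.

    estimates: per-animal estimates of K
    variances: their asymptotic variances
    Returns (group mean, variance of the group mean, between-animal variance).
    """
    n = len(estimates)
    if n < 2:
        raise ValueError("need at least two animals")
    # Fixed-effect (inverse-variance) weights and mean.
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed_mean = sum(wi * k for wi, k in zip(w, estimates)) / sw
    # Method-of-moments estimate of the between-animal variance tau2,
    # truncated at zero.
    q = sum(wi * (k - fixed_mean) ** 2 for wi, k in zip(w, estimates))
    c = sw - sum(wi * wi for wi in w) / sw
    tau2 = max(0.0, (q - (n - 1)) / c)
    # Random-effect weights include tau2 in each animal's variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    sw_star = sum(w_star)
    mean = sum(wi * k for wi, k in zip(w_star, estimates)) / sw_star
    return mean, 1.0 / sw_star, tau2

# Hypothetical estimates (grams) and variances for four animals.
mean_K, var_mean_K, tau2 = random_effects_summary(
    [400.0, 410.0, 390.0, 405.0], [25.0, 25.0, 25.0, 25.0])
```

The returned group mean and its variance are the kind of per-group summaries that a chi-square test of equality across groups would take as input.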
The closed procedure is constructed by testing each null hypothesis at level $\alpha$ and finding the hypothesis $H_{\lambda_0}$ such that $H_\lambda$ is not rejected for $\lambda < \lambda_0$ and $H_\lambda$ is rejected for $\lambda > \lambda_0$ at level $\alpha$. Here we can use the test statistic (Eq. 5) for each test.
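Because the hypotheses are nested, the closed procedure reduces to a step-down search: test the hypothesis involving all groups first, and drop the highest dose each time a hypothesis is rejected. The Python sketch below illustrates this under stated assumptions. The test statistic is an inverse-variance weighted homogeneity chi-square used as a stand-in for the paper's statistic, which is not reproduced here; the function names, the level 0.05, and the group summaries are hypothetical.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))
    # Recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1).
    while a < df / 2.0 - 1e-9:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q

def homogeneity_stat(means, variances):
    """Weighted sum of squared deviations from the pooled mean."""
    w = [1.0 / v for v in variances]
    pooled = sum(wi * m for wi, m in zip(w, means)) / sum(w)
    return sum(wi * (m - pooled) ** 2 for wi, m in zip(w, means))

def step_down(means, variances, alpha=0.05):
    """means[0] is the control; groups are ordered by increasing dose.

    Tests H_a, H_(a-1), ... in turn, each at level alpha, and returns the
    index of the first hypothesis that is not rejected (0 if all are rejected).
    """
    a = len(means) - 1
    for lam in range(a, 0, -1):
        stat = homogeneity_stat(means[:lam + 1], variances[:lam + 1])
        if chi2_sf(stat, lam) >= alpha:
            return lam
    return 0

# Hypothetical group summaries: control plus four dose groups.
index = step_down([400.0, 398.0, 402.0, 399.0, 360.0], [16.0] * 5)
```

With these made-up summaries only the highest dose group differs from the rest, so the search stops at the hypothesis that excludes it.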

Examples
We consider two examples in which a new drug is tested for toxicity. The first data set consists of male body weights (grams) for a control and a dose group (200 mg/kg) from a 5-week toxicity study in rats. The summary statistics are shown in Table 1.
What troubles toxicologists is that the comparison of the control with the dose group by the t-test shows a significant difference at days 4-32 but not at day 35. Should they conclude significance or nonsignificance? We estimate the final body weights, the K values, in the random-effects model. The mean K is 402.20 g for the control and 382.72 g for the dose group. The test statistic of Equation 6 with a = 1 is 1.53. Comparing this value with the upper percentage point of the chi-square distribution with 1 degree of freedom, we conclude that the two groups are not significantly different from the viewpoint of the final estimated body weights.
Second, we choose for illustration a data set consisting of a control group and four dose groups; for each group, the weights of 16 rats were taken at 27 time points. The dose levels are 15, 35, 85, and 200 mg/kg. Figure 2 shows the growth plots of the 16 rats in the 15 mg/kg dose group.
The estimate of the final body weight and its asymptotic variance in each group are given in Table 2. Table 3 shows the values of the statistics obtained from Equation 6. From the table, we find that the 200 mg/kg dose group is significantly different from the control group in the final body weight.
Sometimes laboratory workers make an error in administering drugs to rats and hurt the rats' throats. Subsequently the rats will not eat food, and they lose weight. If we come across this kind of suspicious data, we can remove the observations that are verified as abnormal by laboratory workers and continue to analyze the remaining data set. For example, in our data set a male rat of the 200 mg/kg dose group had an abnormally large weight loss at day 91. If we delete this observation, we obtain a slightly smaller chi-square value of 50.769, compared with 51.447, which is given in the last row of Table 3. In this example we can conclude that the dose of 85 mg/kg is the MNED with regard to the final body weights.

Conclusion and Discussion
We discussed the multiple comparison problem for parameter estimates, assuming logistic growth-curve models and jointly using a random-effect model and the closed procedure. It is important to build a model that includes only a few parameters to avoid the difficulty of handling the multiplicity of the observation time points; this is similar to the problem of detecting a trend. We assumed normality in order to use the random-effects model in this paper, but it would be better to check the normality of the parameter estimates. This may result in devising a new technique for data analysis using random-effects models.
Some researchers might oppose using this approach to find the MNED or MED by multiple testing procedures. Suppose that no effects are observed within some dose level in mechanism A but that a small effect is observed within this dose level in mechanism B. As for mechanism B, we sometimes consider the effect unimportant compared with curing a disease with that drug. In addition, if we have a large enough data set, we can detect very small effects. Statistically speaking, we do not distinguish between mechanisms A and B if we do not find any significance. Therefore, we must note that the derived MNED is a statistical result and that it strongly depends on the sample size. Further research would be needed to find the MNED.