Animal experimentation and its relevance to man.

The problem of quantitatively estimating human cancer risk from animal carcinogenesis studies is reviewed. Mathematical functions for dose-response relationships are discussed, with particular emphasis on multistage models. These models are based upon a single-cell somatic mutation theory of the carcinogenesis process. It is shown that the multistage model, and others which incorporate background additively, are well approximated in the low dose region by a linear function. The relationship between time-to-tumor and the multistage model is indicated. This relationship is important when dealing with less-than-lifetime exposure, such as the data from many occupational studies. The design of bioassay experiments and its impact on risk estimation is noted. Finally, the problem of species-to-species extrapolation is considered.


Introduction
The problem of identifying human carcinogens and assessing their potency will often depend, out of necessity, on laboratory experimentation. This ranges from DNA chemistry to chronic exposures in lifetime animal studies. Epidemiological studies in carcinogenesis are naturally more appropriate for assessing human risk. Unfortunately, they require long-term prior exposures in man, which do not exist for new compounds. For older compounds, laboratory and epidemiological studies can, one hopes, interact in a creative manner.
The problem I wish to discuss is how one can estimate human risk from a carcinogen based upon laboratory studies. This question involves several problems concerning the appropriateness of animal models for predicting effects in man. In the standard lifetime rodent bioassay, such as the one used by the National Cancer Institute, there are several ways in which to arrive at incorrect conclusions. For example, concerns are often expressed about the statistical errors relating to sample size, etc., and also about possible biological errors, in the sense that the particular rodent strains chosen may not be relevant to man for the type of carcinogen under consideration.
*National Institute of Environmental Health Sciences, National Institutes of Health, P.O. Box 12233, Research Triangle Park, North Carolina 27709.
If a particular animal model is a reasonable qualitative predictor of human cancer, how reasonable is it as a quantitative predictor of carcinogenesis potency in man? Clearly, if the regulation of environmental carcinogens involves risk-benefit considerations, we must have acceptable methods for estimating effects in man at low exposure levels based upon laboratory animal studies at relatively high exposure levels. These high experimental levels are necessitated by limited size (cost) of the animal study and the need for observed experimental effects. This raises interesting experimental design considerations. On the one hand we require high statistical power for confident qualitative identification of carcinogenesis, and this is obtained at high dose levels. On the other hand experimentation at low dose levels is desired for predicting effects at low exposure levels.

Dose-Response Models in Carcinogenesis
In order to estimate carcinogenic effects, some assumptions or models are needed which relate the frequency of tumor to the exposure level of the carcinogen in question. If the true form of the dose-response function were known, one could correctly estimate the effects of the carcinogen at low levels within a stated degree of statistical uncertainty. This would provide the first step in a human risk assessment. The low dose estimates based upon the experimental animal data and dose-response model must then be extrapolated to man. This requires a considerable number of biological assumptions and is an uncertain process at best.
Several quantitative theories of carcinogenesis have been proposed which relate tumor incidence to both the dose rate and the duration of exposure of the carcinogen. These models generally assume that the carcinogenesis process is single cell in origin and is the result of several stages, which can include somatic mutation. The transitional events are individually assumed to depend linearly on dose rate. This leads in general to a model in which the probability of tumor is approximately a low order polynomial in dose rate. In the low dose region, which would relate to environmental levels, one finds that the response is well approximated by a linear function of dose rate (1-4). Another class of dose-response models which has been applied is that with a history in bioassay work. For example, both the logistic and probit functions are used for estimating the LD50 in toxicology. Mantel and Bryan (5) have applied the probit to low dose estimation in carcinogenesis. Its drawback is primarily that the model is not based upon a mathematical description of a biologic mechanism, as is the case with the multistage somatic mutation model. Since it produces quite different results in the low dose region, it probably should not be applied in those cases where it is believed that the carcinogen acts directly on cellular DNA, as is proposed in the multistage models (4,6).
Of the multistage models, the one discussed by Armitage and Doll (3) has received considerable attention (4, 6-9). The model states that the probability of a tumor by time t at dose rate d is

P(d, t) = 1 - exp[-t^k (α_1 + β_1 d)(α_2 + β_2 d) ··· (α_k + β_k d)]

with α_i ≥ 0 and β_i ≥ 0 for i = 1, ..., k. This represents a k-stage, or mutational step, model in which dosage and time factor separately. The above mentioned authors have developed maximum likelihood estimates for the unknown parameters in the model, together with approximate confidence limits on the low dose estimates of the probability of tumor. These procedures are mathematically complicated in that computing algorithms are required in order to obtain the estimates. Recently Guess et al. (6) applied these algorithms to experimental carcinogenesis data on vinyl chloride, DDT, chloroform, DMN, and dieldrin. In all of these cases k = 1 or 2 was found to provide the best fit, although DMN yielded a zero estimate of the linear coefficient. In each case the upper bound of risk was estimated to be linear at low dose, as one would theoretically anticipate. It was previously mentioned that the multistage models are approximately linear at low dose rates. An important question is: how good is the linear approximation? It may be that linear extrapolation is a perfectly adequate method, and much simpler than using complicated computer algorithms. Furthermore, there is the intuitive belief that the lower convex portion of a typical dose-response curve would be bounded above by a straight line (10). It is sometimes felt, however, that a line is too conservative an upper bound, in that it greatly overestimates the risk at low dose levels. It has been shown by Crump et al. that this is not the case: linear extrapolation in the lower convex portion of the dose-response curve is both a reasonable upper bound and a direct estimate of the probability of tumor (Table 1).
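The quality of the linear approximation is easy to check numerically for the multistage form. The sketch below uses arbitrary two-stage parameter values chosen purely for illustration, not values fitted to any data set:

```python
import math

def multistage_p(d, t, alphas, betas):
    """Multistage probability of tumor by time t at dose rate d:
    P(d, t) = 1 - exp[-t^k * prod_i (alpha_i + beta_i * d)]."""
    k = len(alphas)
    hazard = t ** k * math.prod(a + b * d for a, b in zip(alphas, betas))
    return 1.0 - math.exp(-hazard)

# Arbitrary illustrative two-stage (k = 2) parameters
alphas, betas, t = (1.0, 1.0), (1.0, 2.0), 1.0

p0 = multistage_p(0.0, t, alphas, betas)            # background tumor probability
eps = 1e-8                                          # finite-difference step
slope = (multistage_p(eps, t, alphas, betas) - p0) / eps

for d in (1e-1, 1e-2, 1e-3):
    exact = multistage_p(d, t, alphas, betas) - p0  # exact excess risk over background
    print(d, exact, slope * d)                      # versus the linear approximation
```

With these parameters the linear term agrees with the exact excess risk to within about 0.1% at d = 10^-3, while at d = 10^-1 the curvature of the response already makes the straight line noticeably conservative.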
This is an appropriate procedure to use unless there is specific information concerning the mechanism of the carcinogenic process which would indicate a model other than the multistage. This might be the case, for example, with certain promoters or cocarcinogens.
The models discussed so far have been very simplistic in nature. The multistage model assumes that the mutational step is proportional to the administered dose rate. This ignores the influences of pharmacokinetics which may indicate a changing proportion of active metabolite with administered dose level. Also immune response and DNA repair systems are not taken into account. Even without these biological complications the models are fairly complex, which indicates that we are still a long way from a nonlinear methodology in which one would have much biological confidence.

Time To Tumor
Most dose-response studies in animal experimentation are plots of dose rate versus total tumor incidence for the duration of the study. In large studies there are often sufficient data to plot cumulative tumor incidence versus age. This is then done for each dose, and the curves are compared. Models are also fit to the data and may be of use in low dose estimation. Albert and Altshuler (11) have fit the log-normal model of Blum and Druckrey to several sets of laboratory data. Also, Whittemore and Altshuler (12) have fit both the Weibull and log-normal models to the physician smoking data of Hill and Doll. As with the ordinary dose-response models, one finds that there usually are not sufficient data to distinguish between competing models. Yet the choice of a time-to-tumor model can greatly change any estimated low dose effects.
It has been suggested (13) that the time to first tumor, or latency period, increases with decreasing dose. Then, with a sufficiently small dose, the time to first tumor will exceed man's lifetime, thus producing an effective carcinogenic threshold. Unfortunately, as shown by Guess and Hoel (14), the observed time to first tumor will increase with decreasing dose under the multistage model, yet no true threshold occurs at low doses with this model. In other words, a nonthreshold model will predict that the data will look as though a threshold in time exists.
Using the Armitage and Doll (3) multistage model with the incorporation of time-to-tumor, one has for the age-specific incidence rate

I(t, d) = k t^(k-1) (α_1 + β_1 d)(α_2 + β_2 d) ··· (α_k + β_k d)

with k the number of stages and d the dose rate. The power of the duration of exposure t corresponds to what is seen with vital statistics data on human cancer types. Whittemore and Altshuler (12) found, for example, with the lung cancer and cigarette smoking data that I(t, d) = c t^5 d, where t represented the duration of smoking while d was the daily consumption rate. This shows the importance of the duration of exposure in any attempt to estimate risks.
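The strong dependence on duration is easy to see from this form: the cumulative hazard through time t is t^k times the product term, so doubling the duration of exposure multiplies the cumulative hazard by 2^k. A small sketch, with parameter values that are arbitrary illustrative choices rather than fitted values:

```python
def cumulative_hazard(t, d, alphas, betas):
    """Integral of I(s, d) = k s^(k-1) * prod_i(alpha_i + beta_i d)
    from 0 to t, which equals t^k * prod_i(alpha_i + beta_i d)."""
    k = len(alphas)
    prod = 1.0
    for a, b in zip(alphas, betas):
        prod *= a + b * d
    return t ** k * prod

alphas = (1e-2,) * 6            # six stages, k = 6 (illustrative values)
betas = (0.0,) * 5 + (1e-2,)    # only the last stage is dose dependent
d = 1.0

ratio = cumulative_hazard(2.0, d, alphas, betas) / cumulative_hazard(1.0, d, alphas, betas)
print(ratio)  # 2^6 = 64: doubling duration raises the cumulative hazard 64-fold
```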

Design and Detection
The routine screening of compounds for carcinogenic activity is a slow and expensive process using lifetime rodent studies. These assays, such as those used by NCI, are designed for detection of activity and are not intended to provide the dose-response information needed for low dose risk estimation. It is necessary to use high dose experiments in order to generate as much statistical power as possible. Even so, weak carcinogenic activity may very well go undetected. For example, suppose one is comparing 50 control animals with 100 treated animals using Fisher's exact test and nonrandomized decisions. Then for α = 0.01 (one-tailed) one has less than a 5% chance of detecting an increased tumor incidence of 5% over the background rate. This would essentially go unnoticed. In actual assays both sexes and perhaps two species will be used. This will increase the power, assuming an appropriate decision rule is applied which considers the multiple testing of many tissues besides the replications in sex and species. Fears et al. (15) have looked into these issues in depth.
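The power in this example depends on the background tumor rate, which is not specified above. The sketch below assumes a 10% background (an arbitrary choice for illustration) and computes the exact power of the one-tailed Fisher test by enumerating all binomial outcomes:

```python
from math import comb

def fisher_one_sided_p(x_c, n_c, x_t, n_t):
    """One-sided Fisher exact p-value that the treated incidence exceeds
    control, conditioning on the total tumor count m = x_c + x_t."""
    m = x_c + x_t
    denom = comb(n_c + n_t, m)
    hi = min(n_t, m)
    # P(X >= x_t) under the hypergeometric null distribution
    return sum(comb(n_t, x) * comb(n_c, m - x) for x in range(x_t, hi + 1)) / denom

def power(n_c, n_t, p_c, p_t, alpha):
    """Exact power of the nonrandomized one-tailed Fisher test:
    sum the binomial probability of every outcome pair that rejects."""
    def binom_pmf(n, p, k):
        return comb(n, k) * p ** k * (1 - p) ** (n - k)
    total = 0.0
    for x_c in range(n_c + 1):
        pc = binom_pmf(n_c, p_c, x_c)
        for x_t in range(n_t + 1):
            if fisher_one_sided_p(x_c, n_c, x_t, n_t) <= alpha:
                total += pc * binom_pmf(n_t, p_t, x_t)
    return total

# 50 controls at an assumed 10% background vs. 100 treated at 15%
print(round(power(50, 100, 0.10, 0.15, 0.01), 4))
```

With this assumed background the computed power is indeed small, illustrating how easily a 5% excess incidence escapes detection at this sample size.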
If we want to estimate effects at low dose levels, the high dose bioassay often will be quite useless. Consider, for example, the optimal design for low dose linear extrapolation of Hoel and Jennrich (17) applied to Maltoni's vinyl chloride experiment. Maltoni placed his low dose experimental points equally at d = 0, 50, 250, and 500. The Hoel and Jennrich design suggests 27% at d = 0, 53% at d = 83, 15% at d = 342, and 5% at d = 500. The design question, which is still open, is how one would combine detection with low dose estimation. The problem is to have both high dose levels for power of detection and also low dose levels for risk estimation.

Species-to-Species Extrapolation
The problem of the appropriateness of the rodent carcinogenesis study for human risk prediction is most difficult. One usually begins by stating that of the 25 known human chemical carcinogens all except for possibly arsenic and benzene are also rodent carcinogens. There is good qualitative agreement between laboratory results and epidemiological studies. There are, however, few well done human studies as compared to the hundreds of compounds studied in rodent bioassays. Tomatis (18) discussed the predictiveness of long-term rodent studies. He points out the early identification of the carcinogenicity of diethylstilbestrol, 4-aminobiphenyl, and vinyl chloride through experimental animal testing.
Recently short-term in vitro testing has been compared with the long-term animal studies. Tests such as Ames' Salmonella mutagenicity assay are felt to be presumptive of carcinogenicity. This is based upon the proposition that many types of cancer are the result of somatic mutations and often are single cell in origin. Recently an IARC working group (19) expressed the opinion that "When there are inadequate animal data, positive results in validated short-term tests are an indication that the compound is a potential carcinogen and should be tested in animals for an assessment of its carcinogenicity. Negative results in short-term tests cannot be considered sufficient to rule out carcinogenicity." By the Ames test (20,21) most of the chemical carcinogens tested so far are mutagenic (157/175) and most noncarcinogens are not mutagenic (95/108). Further, there is some preliminary work suggesting that cancer potency in animal studies is quantitatively predicted by the Salmonella test (22). Much more work is needed in this area, since few compounds have been compared in this manner.
Meselson, in an NAS report (23), has considered several human carcinogens (benzidine, chlornaphazine, DES, aflatoxin B1, vinyl chloride, and cigarette smoke) and compared the sensitivity of man with that of laboratory animal species. He found that using the most sensitive animal species as a predictor for man was approximately correct for benzidine, chlornaphazine, and cigarette smoke. For the other three compounds, man was observed not to be as sensitive as predicted from the animal studies.
Studies such as the one conducted by Meselson will, at least empirically, answer the question of the precision of risk estimation. The process of quantitatively estimating carcinogenic effects in man will improve, especially with better incorporation of pharmacokinetic considerations and understanding of environmental and genetic differences in human populations. With respect to the precision of risk estimates, however, there is no apparent theoretical method for quantifying biological errors. Thus we are dependent upon empirical experience, which is somewhat inadequate at the present time.
Rall (24) has discussed species differences in carcinogenesis testing in some detail, paying particular attention to pharmacological differences and similarities in the species. He concludes that "laboratory animal carcinogenicity tests predict well for man and that such tests do offer a mechanism by which the prediction of human carcinogenesis is possible before human exposure and with reasonable accuracy." In gathering empirical evidence for the qualitative predictability of animal studies for man, careful use of epidemiological evidence is needed. The most straightforward situation is when one deals with data from direct studies. Lung cancer and smoking is a good example where careful modelling of both level and duration of exposure has been considered (25,12). From this study one feels that accurate estimates of risk can be made, at least for British physicians who smoke. Extending these estimates to the general population may still be quite reasonable. This assumes that we are interested in the median man and feel that the British physician is an adequate model. Errors of magnitude do occur, however, when we attempt to estimate the risks to susceptible subgroups in the population such as uranium miners and asbestos workers. Genetic and environmental susceptibles are often not identified nor predicted. Thus synergistic effects are a real problem and often will be unanticipated.
Ecological or indirect studies are much more difficult to assess with respect to both predictability and precision. These studies attempt to correlate vital statistics data with environmental exposure data. Further, the studies attempt to quantitatively predict health effects based upon exposure data, usually employing simple regression models. Examples include mortality rates and various air pollutants. Also, chloroform levels in drinking water and bladder, colon, and rectal cancer rates have been studied recently. It is known that the predicted health effect often will vary depending upon the statistical method of analysis used. The reliability, and thus the usefulness, of this quantitative approach to risk estimation is still not understood.
Estimates are needed for risks to the general population from various environmental agents when the only human data available is from high exposure groups. For example, risk calculations have been made for populations living in the vicinity of vinyl chloride plants. These estimates are extrapolations from highly exposed worker populations.
Even with the application of dose-response and exposure-duration models, childhood and in utero exposures are not included in the model. Thus serious biological errors could be made. It is the judgement of the possibility of such model errors which is needed for the quantification of the precision of the risk estimates. The statistical aspects can be properly assessed. What is needed is quantification of the likelihood of the model being representative of the true state of nature.
Finally, one technical point should be discussed. In the comparison of human data with animal data we are often dealing with less-than-lifetime human exposures. Much of the human data is obtained from industrial exposures. Applying the multistage model, one finds a critical dependency on the duration of exposure. As discussed earlier, the age-specific rate often is of the form d t^(k-1), where the number of stages k is 4, 5, or 6. For a human study with 20 years of exposure and follow-up, one would underestimate the lifetime risk by a factor of about 500 [(70/20)^(k-1) with k = 6] if the shortened duration is not taken into account. Thus we see the care with which human data must be treated when quantitative comparisons to animal studies are made.
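The factor of about 500 is simply (70/20)^(k-1); a quick check across the stage counts mentioned above:

```python
# Underestimation factor for lifetime risk when only 20 of 70 years of
# exposure and follow-up are observed, with age-specific rate ~ d * t^(k-1)
for k in (4, 5, 6):
    print(k, round((70 / 20) ** (k - 1), 1))  # k = 6 gives about 525, i.e. roughly 500
```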

Conclusion
The research area of low dose risk estimation has been, and will continue to be, of statistical interest. Dose-response modelling will also continue, although we must be on guard not to lose sight of the assumptions made. There seems always to be the danger of providing more mathematical sophistication than the biological knowledge warrants.
Statistical errors or confidence statements do not include a quantification of biological errors (e.g., wrong mathematical or biological model). Since the biological errors cannot be easily if at all quantified they tend to be ignored. The possible biological errors could easily overwhelm the statistical errors and thus considerable care is needed in the interpretation of confidence statements.
Specific research needs from a statistical standpoint include the following: (1) additional bioassay design of experiments research, including improved methods of statistical testing as well as risk estimation; (2) incorporation of pharmacokinetic considerations including repair systems in dose-response estimation; (3) empirical studies which attempt to quantitate the effects in man with those in laboratory animal experiments; (4) attention to quantifying the effects of susceptible subgroups and also synergistic activity.