Best-estimate low-dose extrapolation of carcinogenicity data.


by H. A. Guess* and K. S. Crump*

*Environmental Biometry Branch, National Institute of Environmental Health Sciences, Box 12233, Research Triangle Park, N. C. 27709.

At the 1974 NIEHS conference on low-dose extrapolation, Peto (1) reported that evidence concerning chronic exposure to direct carcinogens (i.e., those acting on the DNA, producing mutations) suggests that the excess rate of incidence over background should be given by the product of a function of the dose and a function of the exposure duration. The function of the dose should be a polynomial of the form

$$q_1 d + q_2 d^2 + \cdots + q_K d^K$$

with $q_i \ge 0$ and $q_1 > 0$. Expressions of this form for the excess rate of incidence over background correspond to lifetime response probabilities of the form

$$P(d) = 1 - \exp\{-(q_0 + q_1 d + \cdots + q_K d^K)\}, \qquad q_i \ge 0. \tag{1}$$

We have developed a low-dose extrapolation technique for dichotomous data, using an assumed dose-response relation of the form (1) but without making any assumptions about whether the coefficient of the linear term is positive or about the degree of the polynomial. We use maximum likelihood to determine both the degree $K$ of the polynomial and the coefficients $q_i \ge 0$, for $0 \le i \le K$. Allowance for nonzero background is made automatically by fitting the constant term, $q_0$. Our estimate of the increased response probability over background is given by

$$\hat{P}(d) - \hat{P}(0) = e^{-\hat{q}_0}\left[1 - \exp\{-(\hat{q}_1 d + \cdots + \hat{q}_K d^K)\}\right]. \tag{2}$$

At very low doses this looks like

$$\hat{P}(d) - \hat{P}(0) \approx e^{-\hat{q}_0}\, \hat{q}_l\, d^l \tag{3}$$

for $d \ll 1$, where $l$ is the smallest positive integer for which $\hat{q}_l > 0$. At high doses, the shape of the curve is governed by the highest-order coefficient, $q_K$.
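As a minimal sketch of Eqs. (1) and (3), the following Python functions evaluate the response probability, the excess over background, and the low-dose approximation. The function names and the example coefficients are illustrative assumptions and do not come from the paper.

```python
import math

def response_probability(d, q):
    """Eq. (1): P(d) = 1 - exp(-(q[0] + q[1]*d + ... + q[K]*d**K))."""
    if any(qi < 0 for qi in q):
        raise ValueError("coefficients must be nonnegative")
    s = sum(qi * d**i for i, qi in enumerate(q))
    return -math.expm1(-s)  # 1 - exp(-s), accurate for small s

def excess_risk(d, q):
    """P(d) - P(0), factored as exp(-q0) * (1 - exp(-(q1*d + ... + qK*d**K)))."""
    s = sum(qi * d**i for i, qi in enumerate(q) if i >= 1)
    return math.exp(-q[0]) * -math.expm1(-s)

def low_dose_approximation(d, q):
    """Eq. (3): exp(-q0) * q_l * d**l, with l the lowest positive index having q_l > 0."""
    for i, qi in enumerate(q):
        if i >= 1 and qi > 0:
            return math.exp(-q[0]) * qi * d**i
    return 0.0

# Hypothetical coefficients (assumed for illustration): background plus a quadratic term.
q_example = [0.1, 0.0, 2.0]
for dose in (1e-1, 1e-2, 1e-3):
    print(dose, excess_risk(dose, q_example), low_dose_approximation(dose, q_example))
```

With these assumed coefficients the linear term is zero, so the low-dose behavior is that of a 2-hit curve, and the approximation and the exact excess risk agree more closely as the dose shrinks.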
Unlike previous dose-response models, which contain only one or two free parameters to be determined, this model contains infinitely many, because the degree of the polynomial is left to be determined by the data. For any given set of data, only finitely many coefficients will be nonzero; however, we do not make any a priori assumptions about which coefficients will be nonzero. In view of Eq. (3), the resulting low-dose extrapolation will look like that of an $l$-hit model for some $l = 1, 2, \ldots$, depending on the data.
By considering at once all polynomials with nonnegative coefficients, unrestricted as to degree, we are able to compare dose-response curves that are linear in the low-dose range with dose-response curves that are extremely flat in the low-dose range. For each set of data we calculate the (in most cases) uniquely determined set of nonnegative coefficients which best fits the data, in the sense of maximizing the likelihood function over the class of all polynomials with nonnegative coefficients. Maximum likelihood estimation by use of polynomials with nonnegative coefficients (absolutely monotonic polynomials) is quite different from maximum likelihood estimation with polynomials with unrestricted coefficients. When the coefficients are unrestricted, it makes no sense to consider polynomials whose degree is greater than the number of doses at which tests are conducted. When the coefficients are constrained to be nonnegative, it is generally not possible to fit the data points exactly, no matter how high the degree. In general, increasing the degree beyond a certain point can actually cause the likelihood function to start decreasing. Unlike the situation with unrestricted coefficients, increasing the degree of absolutely monotonic polynomials does not lead to complex curve shapes. All that happens is that the high-dose part of the curve hooks up more sharply while the low-dose part of the curve flattens out. The flatter the low-dose part of the curve, the higher will be the predicted dose corresponding to a given level of increased risk over background. Thus it is desirable not to exclude high-degree polynomials, because doing so artificially limits the extent to which data that are best fit by very flat curves can be fit by curves of the form (1). For this reason, we have developed the theory needed for maximizing the likelihood function globally over all absolutely monotonic polynomials, unrestricted as to degree.
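To make the constrained maximization concrete, here is a minimal Python sketch that maximizes the binomial likelihood over nonnegative coefficients for a fixed maximum degree. It uses a generic EM-style multiplicative update, chosen only because it keeps every coefficient nonnegative and does not decrease the likelihood; it is not the efficient method referred to in this paper, and it converges slowly, so coefficients that belong at zero only shrink toward zero. The data are the Experiment 1 counts from Table 3. The function names, the dose rescaling, the maximum degree of 9, the starting values, and the iteration count are all assumptions.

```python
import math

def log_likelihood(q, u, n, x):
    """Binomial log-likelihood (constants dropped) for P = 1 - exp(-sum_j q[j]*u**j)."""
    total = 0.0
    for ui, ni, xi in zip(u, n, x):
        s = sum(qj * ui**j for j, qj in enumerate(q))
        if xi > 0:
            total += xi * math.log(-math.expm1(-s))  # responders contribute log P
        total -= (ni - xi) * s                        # non-responders: log(1 - P) = -s
    return total

def em_step(q, u, n, x):
    """One multiplicative update; every factor is positive, so q stays nonnegative."""
    s = [sum(qj * ui**j for j, qj in enumerate(q)) for ui in u]
    p = [-math.expm1(-si) for si in s]
    new_q = []
    for j, qj in enumerate(q):
        num = sum(xi * ui**j / pi for ui, xi, pi in zip(u, x, p) if xi > 0)
        den = sum(ni * ui**j for ui, ni in zip(u, n))
        new_q.append(qj * num / den)
    return new_q

# Experiment 1 of Table 3: doses, animals tested, animals responding.
doses = [0, 2, 15, 30, 35, 40, 45, 50]
animals = [1000] * 8
responders = [103, 99, 100, 109, 131, 187, 305, 506]

scale = max(doses)                 # rescale doses to [0, 1] for numerical stability (assumed)
u = [d / scale for d in doses]
q = [0.1] * 10                     # degree 9 and starting values are assumptions
history = [log_likelihood(q, u, animals, responders)]
for _ in range(500):
    q = em_step(q, u, animals, responders)
    history.append(log_likelihood(q, u, animals, responders))

# Coefficients expressed in the original dose units.
q_original = [qj / scale**j for j, qj in enumerate(q)]
print(history[0], history[-1])
```

The recorded log-likelihoods do not decrease from one iteration to the next, which is the usual EM property. Because the update cannot set a coefficient exactly to zero, low-dose extrapolations should not be read off such a partially converged fit; a practical implementation would use a faster constrained optimizer with a convergence test.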
We have an efficient method for calculating such estimates and have obtained theorems describing their statistical properties, such as asymptotic unbiasedness, strong consistency, and asymptotic normality. In addition, we derive and calculate asymptotic (large-sample) confidence intervals for the risk estimates $P(d) - P(0)$ and compare the calculated confidence intervals with those obtained by computer simulation. These confidence intervals are not the ordinary binomial confidence limits at the test doses. Rather, they are obtained by means of a limit theorem which we prove. Finally, we have developed a Monte Carlo goodness-of-fit test, which involves comparing the actual data with sets of artificial data simulated using the best-estimate response probabilities.
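A Monte Carlo goodness-of-fit test of the kind described can be sketched as follows. The text specifies only that the actual data are compared with data simulated from the best-estimate response probabilities, so the Pearson chi-square discrepancy used here as the test statistic, the number of simulations, and the function names are assumptions made for illustration. The example uses the Experiment 1 counts and the fitted curve reported with Table 3, and for simplicity it does not refit the model to each simulated data set.

```python
import math
import random

def fitted_probability(d):
    """Best-estimate curve reported for Experiment 1 in the note to Table 3."""
    return -math.expm1(-(0.105 + 1.1e-15 * d**7 + 1.5e-14 * d**8))

def pearson_statistic(counts, n, probs):
    """Sum over doses of (observed - expected)^2 / binomial variance (assumed statistic)."""
    return sum((x - ni * p) ** 2 / (ni * p * (1 - p))
               for x, ni, p in zip(counts, n, probs))

def monte_carlo_p_value(counts, n, probs, n_sim=200, seed=0):
    """Fraction of simulated data sets at least as discrepant as the observed data."""
    rng = random.Random(seed)
    observed = pearson_statistic(counts, n, probs)
    exceed = 0
    for _sim in range(n_sim):
        # Draw each dose group's response count binomially from the fitted probability.
        simulated = [sum(rng.random() < p for _animal in range(ni))
                     for ni, p in zip(n, probs)]
        if pearson_statistic(simulated, n, probs) >= observed:
            exceed += 1
    return observed, exceed / n_sim

doses = [0, 2, 15, 30, 35, 40, 45, 50]
animals = [1000] * 8
responders = [103, 99, 100, 109, 131, 187, 305, 506]
probs = [fitted_probability(d) for d in doses]

statistic, p_value = monte_carlo_p_value(responders, animals, probs)
print(statistic, p_value)
```

A small p-value would indicate that the observed data are more discrepant from the fitted curve than most data sets generated by that curve, which is the sense in which a fit is "rejectable" at a given level.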
All of these calculations have been assembled into one computer program which reads the dichotomous data and calculates estimates of the increased response probability over background, confidence intervals for these estimates, and goodness-of-fit-test results.
Our main conclusions are as follows:

We can easily specify hypothetical sets of dichotomous dose-response data for which the best-estimate (maximum likelihood) low-dose extrapolation of the form of Eq. (1) is much closer to a probit extrapolation than to a linear extrapolation. Table 1 illustrates such data. We cite these results to demonstrate that it is mathematically possible for risk estimates based on Eq. (1) to be approximately the same as risk estimates based on a probit model and to differ by several orders of magnitude from risk estimates based on a linear model.*

*Probit curves vanish to infinite order at the origin in the sense that $d^{-k}\,\Phi(a + b \log d) \to 0$ as $d \to 0$ for all $k = 1, 2, \ldots$, where $\Phi$ is the standard normal distribution function. Curves of the form (1), being analytic, can only vanish to finite order. Thus, given any curve of the form (1) and any probit curve, the probit curve will eventually undershoot the curve of form (1) for $d$ sufficiently close to 0. However, as Table 1 shows, it is not difficult to fit a probit curve with a curve of the form (1) down to doses corresponding to risk levels of about $10^{-7}$.

Table 1, note a: The data are expected values computed from the probit curve $P(d) = \Phi(-7.18 + 3.30 \log d)$. This curve was also used to get the probit low-dose extrapolation. The multistage and linear extrapolations were obtained from the above hypothetical data by maximum likelihood.

In every set of actual data we have analyzed, the linear term $q_1$ is positive, and the best-estimate low-dose extrapolation is much closer to a linear extrapolation than to a conservative Mantel-Bryan extrapolation. We have analyzed data for vinyl chloride, dieldrin, DDT, dimethylnitrosamine, and ionizing radiation. Figures 1 and 2 illustrate the results for dieldrin.† In the range of increased response between $10^{-5}$ and $10^{-8}$, the best-estimate dose-response curve of the form of Eq. (1) for these agents is typically one or two orders of magnitude above a conservative Mantel-Bryan extrapolation.

†With the dimethylnitrosamine data it is possible to make the linear coefficient $q_1$ vanish by excluding the response at the highest dose tested. However, the goodness-of-fit, as measured by our Monte Carlo goodness-of-fit test, was not rejectable at the 10% level with the highest dose included. For this reason, it did not seem valid to exclude this response.
Our results have implications which should be considered by anyone who intends to design a large-scale experiment to measure the shape of dose-response curves for chemical carcinogens in the very low-dose range. When both very flat curves and gradually sloping (linear or nearly linear) curves are considered together, it is extremely difficult [...] and the other in which there is an expected background effect of about 10%.

Figure 3 shows low-dose extrapolations for the hypothetical data simulated as described in Table 2. The expected response frequencies at the test doses lie on the curve

$$P(d) = 1 - \exp\{-1.54 \times 10^{-14}\, d^8\}. \tag{4}$$

One hundred sets of data were simulated binomially for the eight doses shown in Table 2 with 1000 animals per dose. A maximum likelihood best-estimate dose-response curve of the form (1) was calculated for each data set. At each of eight levels of increased risk over background, $P(d) - P(0) = 10^{-1}, 10^{-2}, \ldots, 10^{-8}$, the doses corresponding to these risk levels were ranked in decreasing order. The curve labeled "true dose response curve from which the test data were drawn" is the graph of Eq. (4). The other four curves in Figure 3 were obtained from the best-estimate low-dose extrapolations of the 100 sets of simulated data. For example, the curve labeled "10th lowest dose (out of 100) at each risk level" is the curve constructed by connecting the 10th lowest of the 100 best-estimate extrapolated doses corresponding to a risk level of $10^{-1}$ with the 10th lowest at a risk level of $10^{-2}$, and so on up to the 10th lowest at a risk level of $10^{-8}$. The median curve was constructed by connecting the median of the 100 best-estimate extrapolated doses at each of the risk levels $10^{-1}, \ldots, 10^{-8}$.

Table 2. [Dose (ppm); table body not recoverable apart from a final entry of 0.451.]

Table 2, note a: The response probabilities used lie on the curve $P(d) = 1 - \exp\{-1.54 \times 10^{-14}\, d^8\}$. One hundred sets of data were simulated, and a best-estimate dose-response function of the form (1) was calculated for each set. The doses corresponding to increased risks of $10^{-1}, 10^{-2}, \ldots, 10^{-8}$ were calculated for each set, and each of these 8 sets of 100 doses was tabulated in decreasing order.
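Because Eq. (4) has a single term and no background, the "true dose response curve" of Figure 3 can be computed in closed form by inverting the equation at each risk level. The sketch below does this; the function and variable names are illustrative assumptions.

```python
import math

COEFFICIENT = 1.54e-14  # coefficient of d**8 in Eq. (4)
DEGREE = 8

def true_response(d):
    """Eq. (4): P(d) = 1 - exp(-1.54e-14 * d**8); P(0) = 0, so this is also the excess risk."""
    return -math.expm1(-COEFFICIENT * d**DEGREE)

def dose_at_risk(risk):
    """Invert Eq. (4): d = (-ln(1 - risk) / coefficient) ** (1/8)."""
    return (-math.log1p(-risk) / COEFFICIENT) ** (1.0 / DEGREE)

# The eight risk levels used for Figure 3: 1e-1, 1e-2, ..., 1e-8.
risk_levels = [10.0 ** -k for k in range(1, 9)]
true_curve = [(r, dose_at_risk(r)) for r in risk_levels]
for r, d in true_curve:
    print(f"risk {r:.0e}: dose {d:.2f}")
```

At a risk of $10^{-6}$ the true dose is roughly 9.5 in the dose units of Table 2. Each factor of 10 in risk changes the dose by only a factor of about $10^{1/8} \approx 1.33$, which is what makes this curve so flat compared with a linear one.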
The median curve is quite close to the true dose-response curve (4), as it should be. However, the striking result of this study is that the envelope curve for the 10th lowest dose at each risk level is almost perfectly linear at risk levels below $10^{-3}$. This curve represents roughly an upper 90% confidence interval on the true dose-response curve. The study shows that if a carcinogen with the highly nonlinear (almost threshold-like) curve (4) as its true dose-response curve were to be tested in a perfectly conducted large-scale (8000 animal) test with no background effects present, using the doses shown in Table 2, the test data would probably not reject the hypothesis of linearity at the 10% significance level. In fact, the upper 90% confidence curve from the test would probably not be very different from a simple linear extrapolation of the expected response frequency at the lowest positive dose tested.
When background is present, it is even more difficult to reject the hypothesis of near linearity in the risk range of $10^{-6}$. Table 3 illustrates this dramatically. By changing the outcomes of only 11 out of 8000 animals, the best-estimate dose corresponding to an increased risk of $10^{-6}$ over background changes by more than three orders of magnitude.
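The size of this shift can be checked directly from the two best-estimate curves given in the note to Table 3. The sketch below solves $P(d) - P(0) = 10^{-6}$ for each curve by bisection. The coefficients are those of the table note; the function names, the data layout, and the bracketing interval are assumptions.

```python
import math

# (q0, {power: coefficient}) for each best-estimate curve in the note to Table 3.
EXPERIMENT_1 = (0.105, {7: 1.1e-15, 8: 1.5e-14})
EXPERIMENT_2 = (0.106, {1: 1.6e-4, 8: 1.2e-14, 9: 7.2e-17})

def excess_risk(d, curve):
    """P(d) - P(0) = exp(-q0) * (1 - exp(-sum_k q_k * d**k)) over the positive powers."""
    q0, terms = curve
    s = sum(c * d**k for k, c in terms.items())
    return math.exp(-q0) * -math.expm1(-s)

def dose_at_risk(curve, risk, lo=0.0, hi=50.0):
    """Bisection; excess risk is increasing in d because all coefficients are nonnegative."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if excess_risk(mid, curve) < risk:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

d1 = dose_at_risk(EXPERIMENT_1, 1e-6)
d2 = dose_at_risk(EXPERIMENT_2, 1e-6)
print(d1, d2, d1 / d2)
```

The two doses come out near 9.6 and 0.007, a ratio of more than 1000. Almost all of the difference comes from the small linear term in the Experiment 2 curve, which dominates every higher-order term at such low doses.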
Intuitively, what is happening here is that slight changes in the experimental outcome involving a few tenths of a percent of the animals at each dose have an appreciable probability of occurring by pure chance. Such changes can transform data which are best fit by an extremely flat dose-response curve into data which are best fit by a dose-response curve with just the slightest hint of a positive linear term. When one extrapolates back to risks of $10^{-6}$ or less, this small difference in slope is magnified by several orders of magnitude.

Table 3. Hypothetical data illustrating extreme sensitivity of best-estimate extrapolation to minute changes in data when background is present and true curve is flat.(a)

    Dose    Animals tested    Experiment 1    Experiment 2
       0         1000              103             100
       2         1000               99              99
      15         1000              100             105
      30         1000              109             112
      35         1000              131             131
      40         1000              187             187
      45         1000              305             305
      50         1000              506             506

Table 3, note a: Best-estimate dose-response curves: experiment 1: $P(d) = 1 - \exp\{-(0.105 + 1.1 \times 10^{-15} d^7 + 1.5 \times 10^{-14} d^8)\}$; experiment 2: $P(d) = 1 - \exp\{-(0.106 + 1.6 \times 10^{-4} d + 1.2 \times 10^{-14} d^8 + 7.2 \times 10^{-17} d^9)\}$.

Summary

An important tentative conclusion from our work is that when dose-response curves that are extremely flat (probit-like) in the low-dose range are compared with dose-response curves that are nearly linear in the low-dose range, the nearly linear curves fit the data better for each of the four chemical compounds for which we have analyzed data. A second conclusion is that if anyone has data to support a very flat dose-response curve in the low-dose range, this technique would permit one to infer the flat shape from the data rather than having to assume a flat curve shape, as one implicitly does when using a probit technique. (However, as the discussion above indicates, it is likely that any valid upper 5 or 10% confidence limits on the risk estimates from the data would not differ much from linear extrapolations.)
Finally, we believe that simulation studies such as are discussed above could be useful in designing animal carcinogenicity tests and in helping to decide whether a given test design has much chance of accomplishing its objectives. In our opinion, these results have implications both for test design and for risk assessment. They suggest that if the hypothesis of a nearly linear dose-response curve in the low-dose range cannot be ruled out by assumption, then it seems questionable that it can be rejected by the data, even in cases where it may be false.