A model-free approach to low-dose extrapolation.

Estimates of risk associated with exposure to low levels of carcinogenic substances present in the environment are generally obtained by linear extrapolation from higher exposure levels at which risks can be estimated directly. In this paper, we examine the scientific basis for the assumption of low-dose linearity in carcinogenic risk assessment and the different statistical methods that have been proposed for linear extrapolation. A model-free approach to linear extrapolation is described and illustrated using epidemiological data on radiation carcinogenesis. The statistical properties of this method are empirically assessed using 572 selected sets of bioassay data.


Introduction
The goal of cancer risk assessment is to predict the risk of tumor occurrence in people exposed to carcinogenic agents present in the environment. Such estimates of risk are useful in assessing the potential health impact of such exposures and in evaluating risk management strategies for exposure mitigation. In practice, cancer risk assessment is a complex process for a complex disease. There are many different forms of cancer, many with different disease etiologies. There exists uncertainty regarding the mechanisms of initiation, promotion, and progression of neoplastic changes; the pharmacokinetic distribution of reactive carcinogenic metabolites within exposed individuals; and the pharmacodynamic effects of the proximate carcinogen in target tissues.
The most relevant data for prediction of cancer risk are derived from human populations subjected to well-characterized conditions of exposure resulting in an elevated level of risk. Epidemiological data have been of great value in identifying a number of agents capable of causing cancer in humans, particularly through observations on certain occupational groups or individuals exposed to moderately high levels of the agent of interest. To estimate the potential risks associated with lower environmental exposures, downward extrapolation of these results may be required.
In many cases, epidemiological data on a suspect carcinogen may be nonexistent or inadequate for purposes of quantitative risk assessment. This can occur due to a lack of accurate information on exposure levels or the presence of confounding risk factors. In this event, prediction of human cancer risks may be attempted using laboratory studies of carcinogenicity, on the basis that animal carcinogens are presumptive human carcinogens (1) and that some degree of correlation in carcinogenic potency exists between animals and humans (2). Because of the need to elicit potential toxic effects using a limited number of experimental subjects, the doses used in laboratory studies are generally much higher than human exposure levels. Consequently, the need to extrapolate from high to low doses also arises with toxicological data.
Past approaches to the low-dose extrapolation problem have relied on an assumed mathematical function relating cancer risk to exposure. There are many different candidates for such a dose-response model, some with stronger biological bases than others (3). Tolerance distribution models such as the probit and logit have generally evolved in the study of noncarcinogenic end points to describe dose-response relationships in the observable response range. Mechanistic models describe carcinogenesis as a stochastic multistage process, in which neoplastic conversion of stem cells proceeds through a series of well-defined stages involving both genetic damage and changes in cell kinetics. Unfortunately, with the limited information provided by epidemiological and toxicological studies, it is possible to postulate different models that fit the data equally well, but which provide point estimates of risk at low doses that differ by several orders of magnitude (4).
The purpose of this paper is to provide a procedure for low-dose risk estimation that does not depend upon the selection of a specific dose-response model. Our goal is to obtain the best possible upper confidence limit on low-dose risk using only data on tumor occurrence rates from epidemiological or toxicological studies. The only assumption made is that the underlying dose-response curve is linear or sublinear at low doses. Estimates of low-dose risk based on the model-free procedure proposed in this paper are compared with corresponding estimates based on the linearized multistage model using a large number of data sets previously reported in the literature.

Multistage Model
The multistage model is currently the most widely used model for cancer risk estimation. As formulated by Armitage and Doll (5), the probability $P(d)$ of a tumor occurring following exposure to a fixed dose $d$ up to time $t$ is given by

$$P(d) = 1 - \exp\left\{-c\,t^{k} \prod_{i=1}^{k} (a_i + b_i d)\right\} \qquad (1)$$

where $i = 1, \ldots, k$ indexes the distinct stages of a $k$-stage process.
Here, $a_i + b_i d$ represents the rate at which transitions to stage $i$ occur, with $a_i > 0$ denoting the spontaneous transition rate and $b_i d$ ($b_i \ge 0$) representing the effects of dose $d$. (The constant $c$ is proportional to the number of individual cells at risk in the target tissue.) This model predicts that the age-specific cancer incidence rates will be proportional to age raised to the power $(k-1)$ and provides a good description of human cancer incidence curves with $2 \le k \le 6$. For applications, Crump et al. (6) proposed the modified multistage model

$$P(d) = 1 - \exp\left\{-(q_0 + q_1 d + \cdots + q_k d^{k})\right\} \qquad (2)$$

where the $q_i \ge 0$. Although the class of polynomials with nonnegative coefficients included in the exponent in Eq. (2) is broader than the corresponding class in Eq. (1), this formulation is easier to apply in terms of parameter estimation. For small $d$, we have

$$\frac{P(d) - P(0)}{1 - P(0)} \approx q_1 d \qquad (3)$$

Thus, when the background $P(0)$ is small, $q_1$ represents the slope of the dose-response curve in the low-dose region. Although the original model [Eq. (1)] is linear at low doses, the extension in Eq. (2) allows for the case $q_1 = 0$. In practice, an upper confidence limit on $q_1$ is used, which will be strictly positive (7). This upper bound has come to be known as $q_1^*$ and provides a measure of carcinogenic potency based on the linearized multistage (LMS) model.
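As a numerical illustration of the low-dose behavior just described, the following minimal Python sketch evaluates a modified multistage model of the form P(d) = 1 - exp{-(q0 + q1 d + ... + qk d^k)} and compares the extra risk over background with the linear term q1 d. The function names `multistage_prob` and `extra_risk` and the coefficient values are hypothetical, chosen only for illustration; they are not estimates from any data set discussed in this paper.

```python
import math


def multistage_prob(dose, q):
    """P(d) = 1 - exp(-(q0 + q1*d + ... + qk*d^k)), with all q_i >= 0."""
    if any(qi < 0 for qi in q):
        raise ValueError("multistage coefficients must be nonnegative")
    exponent = sum(qi * dose**i for i, qi in enumerate(q))
    return 1.0 - math.exp(-exponent)


def extra_risk(dose, q):
    """Extra risk over background: [P(d) - P(0)] / [1 - P(0)]."""
    p0 = multistage_prob(0.0, q)
    return (multistage_prob(dose, q) - p0) / (1.0 - p0)


# Hypothetical coefficients q0, q1, q2, q3 (illustration only).
q = [0.05, 0.02, 0.0, 0.3]
for d in (1.0, 0.1, 0.01, 0.001):
    print(f"d={d:<6} extra risk={extra_risk(d, q):.3e}  q1*d={q[1] * d:.3e}")
```

With these made-up coefficients the cubic term dominates at the highest dose, while at the lowest doses the extra risk and q1 d become nearly indistinguishable, which is the sense in which q1 acts as the low-dose slope.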

Point Estimates Versus Confidence Limits
The use of a 95% upper confidence limit on $q_1$ rather than its maximum likelihood estimate has been the subject of some discussion. Proponents of best estimates argue that the use of upper confidence limits leads to unwarranted conservatism in risk estimation (8). When decision-making allows for balancing risks against benefits, it has also been argued that best estimates of benefit should be compared to best estimates of risk. Upper bounds on risk based on the linearized multistage model have also been criticized in that they are highly insensitive to the data on which they are based (9). For various reasons, the U.S. Environmental Protection Agency (10) has taken the position that, in general, best estimates of risk cannot be reliably computed at this time. In the absence of a suitable best estimate of risk, the Agency advocates the use of linearized upper bounds. This position is based in part on the fact that the best estimate of $q_1$ may be 0, in conflict with the strict linearity implied by Eq. (1). Even when positive, the maximum likelihood estimator of $q_1$ can be relatively unstable, with minor perturbations to the data resulting in marked changes in its estimated value (11).

Biologically Based Cancer Models
The Armitage-Doll model has been subject to criticisms that it does not provide a complete description of the process of carcinogenesis. Specifically, the k stages envisaged in the model are largely phenomenological and do not necessarily represent well-defined biological changes. In particular, when the number of stages required to fit the data is large, it is difficult to interpret these stages as specific mutational events. The model also fails to provide for the development of target tissues with age and for the dynamics of cells involved in neoplastic conversion.
Moolgavkar (12) and his co-workers (13,14) have developed a two-stage biologically based model of carcinogenesis that explicitly provides for tissue growth and cell kinetics. This model assumes that two mutations, each occurring during cell division, are required for a stem cell to be transformed into a malignant cancer cell. Initiated cells that have sustained the first mutation may be promoted through nongenotoxic mechanisms that increase the net birth rate of initiated cells. Thorslund and Charnley (15) have applied a form of this model in the estimation of cancer risks associated with exposure to chlordane and dioxin. However, the estimability of the unknown model parameters requires further study (16).

Range-of-Risk Estimates
Since precise mechanisms of carcinogenic action are generally unknown, it follows that no model, no matter how elaborate, can claim to be correct. This uncertainty has prompted proposals for the use of a range of risk estimates based on different plausible models. Calculation of a range of point estimates serves little useful purpose and does not contribute to a real understanding of the uncertainty in the extrapolation process. Since point estimates depend on the form of the model selected, the number of point estimates is limited only by the number of models entertained.
In our view, a more realistic approach to expressing uncertainty is to recognize that the risk could be as high as that predicted by linear extrapolation or as low as 0. The risk will be 0 when a threshold exists below which neoplastic conversion does not occur. Paynter et al. (17) have suggested that a threshold may exist for thyroid tumor induction, although the evidence in this regard is not conclusive.

Is Linear Extrapolation Conservative?
Linear extrapolation is often criticized as being too conservative. Schell and Leysieffer (18) show that the one-hit model, which is linear at low to moderate doses, provides an upper bound on risk for any dose-response model satisfying an increasing failure rate condition with dose. (This condition holds for commonly encountered dose-response models, the probit model being an exception.) Bailar et al. (19) show that a significant fraction of bioassays conducted for the National Toxicology Program demonstrate supralinearity at high experimental doses and argue that at low doses the one-hit model may thus not be conservative in some cases. Crump et al. (20), Peto (21), and Hoel (22) all argue that low-dose linearity occurs when substances augment existing carcinogenic processes. The formation of DNA adducts, which may be predictive of certain tumors induced by genotoxic carcinogens, has often been observed to be linear at very low doses (23,24). The question is thus not so much whether low-dose linearity exists, but over what range the dose response is approximately linear. For the multistage model, Crump et al. (20) have shown that linear extrapolation will be quite accurate, at least when the excess risk does not exceed the spontaneous risk.

Tissue Dosimetry
Measurements or predictions of the dose of the proximate carcinogen reaching the target tissue can be used to obtain more accurate estimates of low-dose risks. This can be done using physiologically based pharmacokinetic (PBPK) models that describe the fate of chemical substances in the body (25). These models describe metabolic processes within a number of relevant physiological compartments and have been successfully used to model the metabolism of several chemical carcinogens (26).
When one or more steps in the process of metabolic activation are saturable, the dose delivered to the target tissue may not be directly proportional to the administered dose (27). In such cases, risk estimates based on the administered dose can be biased (28). At sufficiently low doses, however, most kinetic processes will be first-order, in which case the relationship between external and internal doses will be linear.
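The transition from saturable to first-order kinetics can be illustrated with a small Python sketch. It assumes, purely for illustration, that a single activation step follows a Michaelis-Menten type relation; this functional form, the function name `delivered_dose`, and the parameter values `vmax` and `km` are hypothetical and are not taken from the pharmacokinetic models cited above.

```python
def delivered_dose(administered, vmax, km):
    """Delivered dose under an assumed saturable (Michaelis-Menten type) step."""
    return vmax * administered / (km + administered)


# Hypothetical kinetic parameters (illustration only).
vmax, km = 10.0, 5.0
low_dose_slope = vmax / km  # first-order approximation valid when dose << km

for d in (0.005, 0.5, 5.0, 50.0):
    actual = delivered_dose(d, vmax, km)
    linear = low_dose_slope * d
    print(f"administered={d:<7} delivered={actual:8.4f}  linear approx={linear:8.4f}")
```

Under this assumed form, the delivered dose falls well below the linear approximation once the administered dose approaches or exceeds `km`, while at doses far below `km` the two are practically proportional.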

Linear Extrapolation
We have argued that dose-response curves for some carcinogens may be expected to be linear at low doses. If the dose-response curve is actually sublinear in the low-dose region, linear extrapolation provides an upper limit on low-dose risk. In this section, we first review previously proposed methods for linear extrapolation and then describe our model-free approach.

Previous Approaches
Gross et al. (29) suggested a method for linear model extrapolation based on discarding data starting at the upper end of the dose range until a linear model provided an adequate description of the remaining data. Van Ryzin (30) suggested the use of any model that fit the data reasonably well to estimate the dose producing an excess risk of 1% and then using simple linear extrapolation to lower doses. Gaylor and Kodell (31) proposed fitting a model to the available data and then using linear extrapolation below the lowest dose at which observations were taken.
Since the estimate at the lower doses might be unduly influenced by the choice of the model used in the experimental dose range, Farmer et al. (32) suggested linear extrapolation below the lowest dose or the dose corresponding to an estimated risk of 1%, whichever was larger.
Krewski et al. (33) proposed an entirely model-free procedure based on linear extrapolation below the lowest dose showing an excess (not necessarily statistically significant) risk. Krewski et al. (34) modified their procedure to consider linear extrapolation from all doses for which there were no statistically significant increases in tumor incidence above the baseline level, selecting the smallest slope for low-dose risk estimation. In a similar vein, Gaylor (35) considered the smallest slope obtained from all the possible combinations of data from the doses where the lowest dose was in the convex portion of the dose-response curve. In both cases, upper confidence limits on the slopes were used. Both Krewski et al. (33) and Gaylor (35) showed that low-dose risk estimates based on these model-free procedures were generally close to those obtained from the linearized multistage model.

Model-Free Approach
The only assumption that we wish to entertain in assessing low-dose cancer risks is that of linearity of the dose-response curve at low doses. Under this assumption, low-dose risk assessment requires estimation of the slope of the dose-response curve at the origin given by

$$\beta = \left.\frac{dP(d)}{dd}\right|_{d=0} > 0 \qquad (4)$$

Without making specific assumptions concerning the functional form of the dose-response curve other than low-dose linearity, a natural estimator of $\beta$ at a dose $d$ close to 0 would be the slope $\beta_d = [P(d) - P(0)]/d$ of the secant from $(d, P(d))$ to $(0, P(0))$, since $\beta_d \to \beta$ as $d \to 0$.
This approximation suggests a simple model-free approach to linear extrapolation.
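A short sketch of why a secant slope cannot understate the slope at the origin when the curve is sublinear may be helpful here. It rests on an added assumption: that "sublinear" is read as $P$ being differentiable and convex on $[0, d]$, so that $P'$ is nondecreasing there. Writing $\beta_d$ for the slope of the secant from $(0, P(0))$ to $(d, P(d))$ and $\beta = P'(0)$ for the slope at the origin:

```latex
\beta_d \;=\; \frac{P(d) - P(0)}{d}
        \;=\; \frac{1}{d}\int_0^d P'(u)\,du
        \;\ge\; \frac{1}{d}\int_0^d P'(0)\,du
        \;=\; P'(0) \;=\; \beta .
```

Equality holds when the curve is exactly linear on $[0, d]$. For a supralinear (concave) segment the inequality reverses, which is why the choice of doses entering the secant calculation matters.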
Consider a bioassay with $t+1$ dose levels $0 = d_0 < d_1 < \cdots < d_t$, where $d_0 = 0$ corresponds to the control group. Of the $n_i$ animals at dose $d_i$, suppose that $x_i$ develop the lesion of interest during the course of the study ($i = 0, 1, \ldots, t$). The probability $p_i$ of tumor development at dose $d_i$ may then be estimated by $\hat{p}_i = x_i/n_i$. Linear interpolation between a point $\hat{p}_i$ ($1 \le i \le t$) and $\hat{p}_0$ yields the secant approximation to the linear component of the dose-response curve.
To ensure that this approximation is reasonable, we need to restrict the set $\{\hat{p}_1, \ldots, \hat{p}_t\}$ to some subset $\{\hat{p}_1, \ldots, \hat{p}_{t^*}\}$ of points ($1 \le t^* \le t$) such that this subset lies within a region of the dose-response curve in which the secant approximations will not underestimate the low-dose slope. After smoothing the proportions so as to form a monotonically increasing set using isotonic regression, Gaylor (35) ... ) adopted a simpler approach in which $t^*$ was chosen to correspond to the largest dose below the first dose at which the observed response rate among the exposed groups was significantly greater than the response in controls. (Here, statistical significance is evaluated at the 5% level using the Fisher-Irwin exact test.) If the lowest dose exhibits a statistically significant increase in tumors, only this dose is used for extrapolation. In this case, the results should be interpreted with caution since there is less assurance of convexity. To allow for experimental error, an exact binomial upper confidence limit $p_i^{(U)}$ was calculated on $p_i$ ($i = 1, \ldots, t^*$), along with a lower confidence limit $p_0^{(L)}$ on $p_0$. The minimum (positive) value of the $t^*$ secants $(p_i^{(U)} - p_0^{(L)})/d_i$ ($i = 1, \ldots, t^*$) is then used as an upper confidence limit on the low-dose slope. Because no dose-response model has been assumed, we refer to this as model-free extrapolation (MFX).
Because the minimum of up to $t$ such secants is selected, the overall confidence level associated with this procedure requires consideration. By the Bonferroni inequality, an overall 95% confidence level may be achieved using individual confidence limits at the $(100 - 5/(t+1))\%$ level. Since not all $t$ secants are used when $t^* < t$, it is possible that this Bonferroni bound may be improved upon. This is currently under investigation.
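The steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the software used for the analyses in this paper: the exact binomial limits are computed as one-sided Clopper-Pearson limits by bisection, the Fisher-Irwin test is taken as the one-sided exact test of each exposed group against controls, each limit is given a Bonferroni share of 5/(t+1)%, and all function names and the example data are hypothetical.

```python
from math import comb


def binom_cdf(x, n, p):
    """P(X <= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1.0 - p) ** (n - k) for k in range(x + 1))


def upper_limit(x, n, alpha):
    """Exact one-sided upper limit: the p solving P(X <= x | p) = alpha."""
    if x >= n:
        return 1.0
    lo, hi = 0.0, 1.0
    for _ in range(60):  # bisection; the CDF decreases as p increases
        mid = 0.5 * (lo + hi)
        if binom_cdf(x, n, mid) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def lower_limit(x, n, alpha):
    """Exact one-sided lower limit, obtained by symmetry from upper_limit."""
    return 1.0 - upper_limit(n - x, n, alpha)


def fisher_one_sided(x_exp, n_exp, x_ctl, n_ctl):
    """One-sided exact p-value that the exposed response exceeds control."""
    total, tumors = n_exp + n_ctl, x_exp + x_ctl
    tail = sum(
        comb(tumors, j) * comb(total - tumors, n_exp - j)
        for j in range(x_exp, min(tumors, n_exp) + 1)
    )
    return tail / comb(total, n_exp)


def mfx_slope(doses, n, x, overall_alpha=0.05, test_alpha=0.05):
    """Upper confidence limit on the low-dose slope (index 0 is the control)."""
    t = len(doses) - 1
    alpha = overall_alpha / (t + 1)  # Bonferroni share for each limit
    # Index of the first exposed group significantly above control, if any.
    first_sig = next(
        (i for i in range(1, t + 1)
         if fisher_one_sided(x[i], n[i], x[0], n[0]) < test_alpha),
        t + 1,
    )
    # Use doses below the first significant one; if the lowest dose is
    # itself significant, use that dose alone.
    t_star = max(first_sig - 1, 1)
    p0_low = lower_limit(x[0], n[0], alpha)
    secants = [
        (upper_limit(x[i], n[i], alpha) - p0_low) / doses[i]
        for i in range(1, t_star + 1)
    ]
    positive = [s for s in secants if s > 0]
    if not positive:
        raise ValueError("no positive secant slope available")
    return min(positive)


# Hypothetical bioassay: control plus three dose groups of 50 animals each.
doses = [0.0, 1.0, 2.0, 4.0]
n = [50, 50, 50, 50]
x = [0, 1, 3, 20]
print(mfx_slope(doses, n, x))
```

In this made-up example the highest dose is significantly above control and is excluded, so the bound is the smaller of the secants through the two lower doses. Only the Python standard library is used; a statistics package would normally supply the binomial limits and the exact test directly.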

Illustrative Examples
To illustrate the application of the model-free approach to linear extrapolation, we consider the data on radiation-induced stomach cancer shown in Table 1, previously analyzed by Krewski et al. (34). These data are shown in graphical form in Figure 1A after re-expression in terms of relative risk. The secant bounds based on those exposure groups not demonstrating a significant increase in risk (p < 0.05) are shown in Figure 1B. The secant with the smallest slope represents the MFX bound on low-dose risk.
To compare the MFX approach with the traditional LMS, consider the bioassay data shown in Table 2 on kidney tumors induced in Fischer 344 rats following oral exposure to nitrilotriacetic acid (NTA) for 24 months (38). These same data are displayed graphically in Figure 2A, along with the fitted multistage model (39). The best-fitting model involves five stages but does provide a good description of the dose-response curve.
The (100 - 5/6) = 99.17% upper confidence limits on the response probabilities in each of the exposed groups are shown in Figure 2B, along with the associated secant bounds on the low-dose slope. (No secant is shown for the dose of 2% NTA in the diet, since the tumor response at this dose was significantly greater than 0, the control response.) The minimum slope of these secants occurs at a dose of 1.5% and has a value of 0.061 per percent NTA in the diet.
An upper confidence limit on the slope of the dose-response curve at the origin can also be derived under the LMS model. For comparability with MFX, this is calculated as $10^{-2}/d^*$ where $d^*$ represents a 95% lower confidence limit on the dose corresponding to an additional risk of 1% (39). (This differs slightly from $q_1^*$ when the background tumor response rate is not low.) This leads to a slope of 0.024, a factor of 2.5 lower than obtained with MFX.
[Figure 2. Panel titles: Dose Response; Low Dose Slope.]

Empirical Evaluation
The general performance of our model-free extrapolation (MFX) procedure in comparison with the traditional LMS may be empirically evaluated by applying both methods to experimental data on a more extensive series of test compounds. In this regard, Gold et al. (40)(41)(42) have assembled a useful reference database of bioassay data drawn from 3749 experiments reported in the literature. Here, an experiment is defined in terms of results for one sex of one species from one research report.
For our purposes, a subset of this database was selected for analysis. First, only data on rats, mice, and hamsters were considered to avoid studies with larger mammals such as dogs or monkeys in which exposure occurred over a relatively small fraction of their lifespan. Second, experiments in which exposure took place by more than one route were excluded because of pharmacokinetic complications arising with multiple exposure routes. Third, only experiments with at least two dose groups (in addition to the unexposed control group) were used in keeping with minimal standards of bioassay design. Fourth, experiments with reduced survival among exposed animals were excluded since this could bias tumor occurrence rates downward. (Using only low doses with MFX would help to alleviate this problem, as reduced survival generally occurs at higher doses.) Experiments selected according to these criteria often included data on tumor occurrence at more than one site. Gross aggregations of sites (all target sites or tumor bearing animals) were excluded on the basis that most carcinogens appear to be site specific. Similarly, aggregations of all tumors at a given site were omitted. For our purposes, only the most significant site was considered, this being the site on which most concern would likely focus in practice. Here, significance was defined in terms of the p-value of the Cochran-Armitage test for increasing linear trend in tumor response with increasing dose (43). In cases where two or more sites had the same p-value, the one with the smallest TD50 (the dose resulting in 50% tumor incidence) was selected.
To ensure that compounds selected for analysis were considered in some sense to be carcinogens, only those results for which the (one-sided) p-value for the trend test was less than 1% were admitted. Additional evidence of carcinogenicity was required by further demanding an expressed opinion by the original investigators that the compound was considered carcinogenic. Application of these criteria to the Gold database yielded 585 experiments for analysis. The slope of the dose-response curve at the origin was estimated using MFX and the LMS model. In the latter analyses, doses associated with a downturn in the dose-response curve at high doses were omitted. In 13 cases, the sample size limitations for MFX were exceeded. This left 572 experiments for comparison purposes.
The distribution of the ratios (MFX/LMS) of the two estimates across the 572 data sets is shown in Figure 3. The median ratio ...
There were eight instances in which MFX exceeded LMS by a factor of more than 10-fold. A case-by-case examination of those cases revealed a leveling off or even a decrease in the dose-response curve at higher doses, which tended to reduce the value of $q_1^*$. Since MFX does not generally use high-dose data, a higher (and likely more accurate) estimate of the slope of the dose-response curve at low doses is obtained.
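The trend test used above for site selection can be sketched as follows. This is a minimal Python illustration assuming the usual large-sample (normal approximation) form of the Cochran-Armitage statistic with the administered doses as scores; whether an exact or approximate version was used for the database analysis is not stated here, and the function name and example counts are hypothetical.

```python
from math import erfc, sqrt


def cochran_armitage(doses, n, x):
    """Return (Z, one-sided p-value) for an increasing linear trend in response."""
    total = sum(n)
    p_bar = sum(x) / total  # pooled response rate under the null hypothesis
    sum_nd = sum(ni * di for ni, di in zip(n, doses))
    # Dose-weighted excess of observed over expected tumor counts.
    score = sum(xi * di for xi, di in zip(x, doses)) - p_bar * sum_nd
    variance = p_bar * (1.0 - p_bar) * (
        sum(ni * di**2 for ni, di in zip(n, doses)) - sum_nd**2 / total
    )
    if variance <= 0.0:
        raise ValueError("trend statistic undefined: no variation in response or dose")
    z = score / sqrt(variance)
    return z, 0.5 * erfc(z / sqrt(2.0))  # upper-tail normal probability


# Hypothetical counts for a control and two dose groups of 50 animals each.
z, p = cochran_armitage([0.0, 1.0, 2.0], [50, 50, 50], [0, 5, 20])
print(f"Z = {z:.2f}, one-sided p = {p:.2e}, admitted at 1% level: {p < 0.01}")
```

A result would pass the 1% screening criterion described above only if the one-sided p-value returned here fell below 0.01.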

Summary and Conclusions
The quantitative assessment of risks associated with low-level exposure to carcinogens present in the environment continues to be an important problem upon which consensus remains to be attained. This issue is particularly contentious when extrapolations not only from high to low doses but from laboratory animals to humans must be made. Nonetheless, such estimates are often needed for purposes of risk management.
The LMS model has traditionally been used for low-dose risk estimation. It is now widely recognized that this model provides an incomplete description of chemical and radiation carcinogenesis, neglecting important factors such as tissue growth and cell kinetics. Dose-response relationships demonstrating a high degree of curvature at high doses can occur as a result of cellular proliferation or saturation of metabolic processes required to form the proximate carcinogen, but can be explained only with a large number of stages in the multistage model. Although more biologically based models have emerged within the last decade, these models involve additional unknown parameters that may not be directly estimable using epidemiological or toxicological data on tumor occurrence rates.
Irrespective of the actual dose-response model, there are a number of arguments that suggest that the dose-response curve may be linear at low doses. Specifically, low-dose linearity may be expected to hold with agents that act by augmenting ongoing carcinogenic processes. DNA adducts formed with genotoxic carcinogens also appear to be linearly related to dose at low levels of exposure.
For these reasons, a model-free approach to carcinogenic risk assessment that assumes nothing more than low-dose linearity seems appealing. The model-free extrapolation (MFX) procedure described in this article is based on a series of secant approximations to the slope of the dose-response curve in the low-dose region, with the minimum of such approximations selected for risk assessment purposes. This represents the best upper confidence limit on low-dose risk consistent with the data. An analysis of 572 experiments demonstrated that MFX yields estimates of low-dose risk that are largely comparable to estimates derived under the LMS model. In addition to making a minimal number of assumptions, MFX does not make use of data at high doses where survival may be impaired or normal physiological function disrupted.