Use of biological markers and pharmacokinetics in human health risk assessment.

There are two reasons to connect discussions of biological markers and pharmacokinetics. First, both tend to open up the black box between exposure and effect. Doing this promises more complete scientific understanding than simple input-output analysis, the possibility of better mechanism-based projection of risk beyond the range of possible direct observations, and the possibility of greater sensitivity of analysis, in some cases going from the organism to the cell as the unit of analysis. Second, pharmacokinetic (or similar pharmacodynamic) analysis will often be essential for appropriate interpretation of biological marker information. To use a biological marker, whether to form a better measure of dosage (accumulated past dose or biologically relevant dose) or to make an improved prediction of effect, one needs some sort of dynamic model of the generation and loss of the marker in relation to exposure. For example, the use of a blood cadmium level alone to predict kidney effects might be inferior to predictions based on aggregate past accumulation of cadmium in the kidney, based on the past history of cadmium blood levels × time. Several examples will be discussed of the use of biomarkers and pharmacokinetics in risk assessments for both carcinogenesis and other effects.


Introduction
There is an inescapable connection between the construction of dynamic models of pathological processes and the use of biomarkers, parameters that putatively represent some step along the causal pathway between exposure and effect. Implicitly or explicitly, any use of a biomarker in a risk assessment requires one to make some sets of quantitative dynamic assumptions about both the relationship between exposure and the biomarker and the relationship between the biomarker and the ultimate health consequences of interest. At the same time, any construction of a dynamic model for use in risk assessment must remain a theoretical exercise until specific predictions about the relationships of intermediate parameters to exposure and/or effect can be verified.

Philosophy of Science Issues
There are three basic reasons why it is desirable to use both models and markers to open up the black box between exposure and effect. The use of models and markers a) leads to a more complete scientific understanding and incorporates more relevant information about causal mechanisms than a simple input-output analysis; b) offers the eventual prospect of better mechanism-based projection of risk beyond the range of possible direct observations; and c) offers the possibility of greater sensitivity of detection and quantification of adverse effects, in some cases going from the organism to the cell as the unit of analysis. Realizing the potential of both biomarkers and pharmacokinetic modeling, however, requires overcoming some philosophical assumptions that are common in the scientific disciplines that must contribute the tools for both measuring biomarkers and studying health effects in human populations. On the one hand, experimental scientists in the Baconian tradition () are reluctant to build elaborate mathematical models, having been conditioned to view such theoretical efforts as unproductive speculations that divert attention from the necessary job of making measurements of natural phenomena as they really are (2). On the other hand, more mathematical/statistical workers, who have largely been in control of risk assessment procedures up to this point, often do not have the detailed familiarity with causal mechanisms needed to feel comfortable building realistic mechanism-based representations of complex biological processes. In any event, doing so would complicate the use of their usual black box curve-fitting approaches to analysis, introducing more variables than can be directly estimated from any single data set and therefore requiring relatively innovative (from a statistical standpoint) procedures to incorporate diverse information from different sources.
This paper will advance what may be a startling proposition to experimentalists: by uncovering anomalies in the fit between data and theory, analysis can be as fruitful in producing new knowledge in some cases as additional data gathering. Theoretical modeling and data-gathering activities can properly be thought of as complementary and synergistic enterprises in science.
[...] cooperative agreement with the National Institute for Occupational Safety and Health. The institutional context is mentioned to emphasize that these studies are directed at a practical aim: to respond to judicial requirements (from the Supreme Court's benzene decision among others) that regulators do the best they reasonably can to assess the magnitude of the risk posed by the substances under consideration for regulation, and the prospective benefits of the regulatory actions they propose.
Because the work is intended in part to serve decision-making, and because as a society we must necessarily make decisions on the control and acceptance of risk based on currently incomplete information, the use of different biomarkers and models should not be construed as an assertion either that we know everything we would ultimately like to know about the quantitative causal relationships implied, or that the markers and models are fully scientifically validated. Rather, the tests that should be applied in deciding whether a particular model or marker is appropriate for provisional use in risk assessment are: a) Does the model or marker, by incorporating additional relevant information on likely causal processes, help to better clarify the "range of not clearly incorrect estimates" (11) on the magnitude of the risks under study, thereby helping the risk managers and their constituents to appreciate the potential consequences of their actions under alternative, reasonably possible, states of the world? b) Can the model or marker serve as a useful point of departure for further scientific research, making specific predictions about measurable parameters that can be tested in future experimental or epidemiological efforts?
The presentation of these examples below necessarily focuses on only a few highlights of what has been learned in confronting the detailed analysis issues. The full reports tend to be book-length documents, which are difficult to publish in the shortened form required by most scientific journals. The length of the full reports results from an attempt to explore new methodology and from the fact that, to serve the decision-making function outlined above, the studies must include extensive analyses of the sensitivity of the conclusions to different structural assumptions and plausible values of key model parameters. Very often, all this simply will not fit within 20 pages even for one study, and it is even less feasible to do this for the range of studies listed in Table 1.

Use of Pharmacokinetics and Biological Markers to Improve Carcinogenesis Risk Assessment
One general goal of the three carcinogenesis case studies was to improve high-dose/low-dose interpolation. Through the use of better measures of the internal dose of DNA-alkylating substances at different external exposure levels, we hoped to avoid attributing high-dose pharmacokinetic nonlinearities to the fundamental multiple mutation mechanism of carcinogenesis. Another goal of these studies was to improve interspecies projection of risk.
The perchloroethylene and butadiene analyses attempt to quantify the metabolism of these substances to active epoxy intermediates; thus the intermediate parameter of interest is a function of metabolized dose. By now the basic structure of such models, where high-dose nonlinearities are assumed to result from saturation of a single liver enzyme with Michaelis-Menten enzyme kinetics (Fig. 1), is relatively familiar. For ethylene oxide, a preformed epoxide alkylating agent, the challenge was to determine the rates of detoxifying metabolism for different species and at different exposure rates and thus quantify the internal dose of the substance available for DNA reaction.
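The saturable single-enzyme structure described above can be illustrated with a minimal sketch. The function name and the parameter values (VMAX, KM) are hypothetical and chosen only for illustration; they are not taken from the perchloroethylene or butadiene models.

```python
def metabolism_rate(conc, vmax, km):
    """Michaelis-Menten rate of metabolism for a single saturable enzyme."""
    return vmax * conc / (km + conc)


# Hypothetical parameter values, for illustration only.
VMAX = 10.0  # maximum metabolic rate (arbitrary units per hour)
KM = 2.0     # concentration at which the rate is half of VMAX

print("conc      rate   fraction of low-dose linear rate")
for conc in (0.01, 0.1, 1.0, 10.0, 100.0):
    rate = metabolism_rate(conc, VMAX, KM)
    # At low concentrations the rate is approximately (VMAX / KM) * conc.
    low_dose_linear = VMAX / KM * conc
    print(f"{conc:8.2f} {rate:8.3f} {rate / low_dose_linear:8.3f}")
```

At low concentrations the metabolized amount is nearly proportional to concentration, while at high concentrations it approaches VMAX. This is the kind of high-dose nonlinearity that should not be attributed to the carcinogenic mechanism itself.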
In the cases of perchloroethylene and ethylene oxide, the metabolism models were calibrated with the aid of independent data for all three species of interest (mice, rats, and humans). One necessary caveat, however, is that the important issue of the appropriate dose metric for risk per unit of active metabolites × time after pharmacokinetic analysis is still not settled. For the best estimates of risk, a metabolized dose/(body weight)^{3/4} projection rule is used because this best fits the rat/mouse carcinogenic risk data for perchloroethylene and because it conforms with an assumption first articulated by Boxenbaum (12) that the active elimination of toxic substances should scale with body weights in parallel with general metabolic rates. This leads to an expectation that half-lives for elimination in larger animals should increase in proportion to (body weight)^{1/4}. This specific expectation on elimination rates was fulfilled by the results of the modeling for ethylene oxide. The half-lives for elimination of ethylene oxide were estimated as 6.4, 9.2, and 41 min in mice, rats, and humans, respectively, for the best-estimate series of models. When these were fit to a standard allometric equation (13), T_{1/2} = K (body weight)^m, the exponent m was estimated at 0.24, not very different from the value of 0.25 that would be expected from the metabolic rate scaling rule. When this regression equation in turn was used to make a prediction for a result not included in the original analysis, the half-life for 17.5-kg beagle dogs as studied by Martis and co-workers (14), the expectation was for a half-life of 28.4 min. Martis et al.'s actual findings were half-lives of 29.3 ± 5.7 min and 36.5 ± 18.5 min (SD) after IV administration of 25 and 75 mg/kg dose levels.
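The allometric fit described above can be sketched as an ordinary least-squares regression of log half-life on log body weight. The half-lives are those quoted in the text; the body weights (0.025, 0.25, and 70 kg) are assumptions introduced here for illustration, since the weights actually used in the analysis are not stated.

```python
import math

# Half-lives (min) quoted in the text for the best-estimate models.
half_life_min = {"mouse": 6.4, "rat": 9.2, "human": 41.0}
# ASSUMED body weights (kg); not given in the text.
body_weight_kg = {"mouse": 0.025, "rat": 0.25, "human": 70.0}


def fit_allometric(weights, half_lives):
    """Least-squares fit of ln(T_half) = ln(K) + m * ln(body weight)."""
    xs = [math.log(w) for w in weights]
    ys = [math.log(t) for t in half_lives]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    m = sxy / sxx
    k = math.exp(y_mean - m * x_mean)
    return k, m


species = list(half_life_min)
K, m = fit_allometric(
    [body_weight_kg[s] for s in species],
    [half_life_min[s] for s in species],
)
# Prediction for a species not used in the fit (17.5-kg beagle dogs).
dog_prediction = K * 17.5 ** m
print(f"exponent m = {m:.3f}, predicted dog half-life = {dog_prediction:.1f} min")
```

Under these assumed body weights the fitted exponent comes out close to the reported 0.24 and the dog prediction close to the reported 28 min, but a different choice of reference weights would shift both values somewhat.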
One of the important lessons from our work was the high frequency with which we found it necessary to modify or elaborate the standard pharmacokinetic model design represented in Figure 1 to accommodate the facts and data types available in specific cases. An earlier report (15) described some distinctive features of our human perchloroethylene models that were required to accommodate both the extensive alveolar air exhalation data of Stewart et al. (16) and the metabolite urinary excretion data for human workers. Further unanticipated assumptions were required to interpret data from available metabolic disposition experiments in animals: assumptions about gastrointestinal absorption rates were required for interpretation of gavage experiments, and assumptions about the rate of loss of metabolized material were required for interpretation of experiments in which the compound was administered by inhalation over an extended (6 hr) period. The difficulties in interpreting the animal data as published indicate that if modeling were undertaken in conjunction with the data collection, the experimenters might be alerted to the need to make additional or slightly different measurements that would make the data more useful. The rather large (5-fold) discrepancies that we found in comparing the low-dose implications of the two worker studies of metabolite excretion (17,18) indicate that the modeling exercise, by bringing diverse data together under a common analytical umbrella, can serve to identify inconsistencies among different data sets (puzzles that might be usefully resolved by additional observations). In this case, the authors of the later paper (18) failed to comment on the difference between their results and those of the earlier researchers, even though the author groups for the two papers have at least one name in common.
At the outset, after completing the perchloroethylene model, we thought that the ethylene oxide modeling effort would be very straightforward. However, there were two major surprises. The first major surprise was that rat absorption and excretion data for ethylene oxide (19) were only interpretable, whatever model structure and parameters we tried, if breathing rates declined at relatively high doses (100 and 1000 ppm). The experimenters in this case noted gasping and other signs of lung distress at the 1000 ppm level, and based on the experience of Alarie and co-workers (20,21), it is entirely reasonable to have expected that high levels of an irritant gas would tend to reduce respiration. Here, however, is another case where only our modeling revealed a primary implication of the data and, of course, the desirability of having measurements of, rather than a need to estimate, this important constant, which turned out to be a variable.
Figure 2. Conceptual model of ethylene oxide metabolism in the liver.
The second surprise, which caused us to restructure our ethylene oxide model even more radically, was an extensive body of evidence of glutathione depletion in different organs at high dose rates (22). This in itself would be expected to decrease the metabolism rate of ethylene oxide at high doses whether or not there was also saturation of any metabolizing enzymes that might be involved in catalyzing the reactions. Because we had no independent evidence of enzyme saturation, we decided to see if models based solely on glutathione depletion as a mechanism producing high-dose nonlinearities would be compatible with the available data.
As it happened, they were. Constructing such models, however, required, for each organ/organ group involved, setting up a baseline model of glutathione generation and loss, in the light of available data on glutathione equilibrium levels and turnover (23,24). The basic conceptual model for this system in the liver is diagrammed in Figure 2, where k2 is the rate of the bimolecular reaction between ethylene oxide and glutathione, and k1 is the fraction of total ethylene oxide metabolism that is accounted for by glutathione. These were the adjustable parameters in the ethylene oxide models, with the constraint, for the initial models, that k2 and k1 were kept uniform across different organs. Different sets of data were used to set the values of the adjustable parameters for different species: a) For rats, we used [...] Golkar et al. (28) following IP [...] of low doses. Given these metabolism rates, low-dose alveolar ventilation rates were set to reproduce the absorption data of Ehrenberg et al. (29).
After the initial series of models was constructed, rat model predictions were compared with the high-dose glutathione-depletion data of McKelvey and Zemaitis (22) and the hemoglobin adduct data of Osterman-Golkar et al. (30). For the final models, it was found that having k2 be twice as large in the liver as for other organs resulted in a somewhat improved fit to the animal glutathione-depletion data, but that this had little influence on the final risk results.
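How glutathione depletion alone can produce a high-dose nonlinearity can be shown with a minimal single-compartment sketch. It assumes a constant internal ethylene oxide concentration, first-order baseline glutathione turnover, and a bimolecular reaction with rate constant k2. All parameter values and the function name are hypothetical; this is not the multi-organ model described in the text.

```python
def simulate_gsh(eo_conc, hours, g0=1.0, k_turnover=0.3, k2=0.5, dt=0.001):
    """Euler integration of dG/dt = synthesis - k_turnover*G - k2*EO*G.

    Synthesis is set so that G stays at its baseline g0 when no ethylene
    oxide is present. Returns the final glutathione level and the
    cumulative amount of ethylene oxide conjugated with glutathione.
    All parameter values are hypothetical.
    """
    synthesis = k_turnover * g0
    g = g0
    conjugated = 0.0
    for _ in range(int(round(hours / dt))):
        reaction = k2 * eo_conc * g  # bimolecular reaction rate
        g += (synthesis - k_turnover * g - reaction) * dt
        conjugated += reaction * dt
    return g, conjugated


g_low, conj_low = simulate_gsh(eo_conc=0.1, hours=6.0)
g_high, conj_high = simulate_gsh(eo_conc=10.0, hours=6.0)
print(f"low dose : GSH = {g_low:.3f}, conjugated per unit conc = {conj_low / 0.1:.2f}")
print(f"high dose: GSH = {g_high:.3f}, conjugated per unit conc = {conj_high / 10.0:.2f}")
```

In this sketch the amount conjugated per unit concentration falls at the higher concentration because the glutathione pool is drawn down, without any enzyme saturation term in the model.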
The butadiene modeling took as an important point of departure our experience with the animal inhalation data for perchloroethylene: it was possible for metabolized material to be lost prior to the placement of animals in metabolism cages at the end of exposure. Two earlier risk assessments for butadiene (31,32) used as their measure of dose in animals the amount of butadiene retained at the end of 6-hr exposures in experiments by Bond et al. (33). To the degree that butadiene was processed and active metabolites effectively lost from the animals before being measured at the end of the 6-hr exposures, delivered doses in the animal bioassays would be underestimated by this approach. On the other hand, to the degree that some of the butadiene retained at the end of the 6-hr period is later exhaled unchanged (and therefore escapes activating metabolism), the delivered dose would be overestimated.
To calibrate our metabolism models in this case, we used inferences from the butadiene chamber absorption studies of Kreiling et al. (34) and blood butadiene measurements from the Bond et al. (33) data set. In doing this, we found that it was impossible for all metabolism to be occurring in the liver. Even if all of the blood flowing to the liver in rats and mice were to have been completely cleared of butadiene, the metabolic elimination would still not have been sufficient to account for the observed rate of butadiene metabolism. This, combined with the direct observations of Schmidt and Loeser (35) of substantial butadiene metabolism by lung tissue, caused us to combine the liver and vessel-rich tissue groups (including the kidney, etc.) in our butadiene models to allow the greater blood flow to the latter tissues to be exposed to our models' metabolizing enzymes. The resulting models did indeed indicate that appreciable butadiene was likely to have been lost prior to measurement in the Bond et al. experiments, and that moreover, the estimates of human delivered dose used in the risk assessments were likely to have been appreciably overstated.
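The flow-limitation argument above amounts to a simple bound: metabolism in the liver cannot exceed liver blood flow multiplied by the concentration in the blood delivered to it. A minimal sketch of that check follows. The function names and all numerical values are hypothetical, chosen only to show the comparison, and are not the rat or mouse values used in the analysis.

```python
def max_hepatic_metabolism(liver_blood_flow, arterial_conc):
    """Upper bound on liver metabolism: every molecule delivered is cleared."""
    return liver_blood_flow * arterial_conc


def requires_extrahepatic(observed_rate, liver_blood_flow, arterial_conc):
    """True if the observed metabolism exceeds what the liver alone could do."""
    return observed_rate > max_hepatic_metabolism(liver_blood_flow, arterial_conc)


# Hypothetical values, for illustration only.
LIVER_BLOOD_FLOW = 1.0  # L/hr
ARTERIAL_CONC = 2.0     # mg/L
OBSERVED_RATE = 3.5     # mg/hr of total metabolism inferred from uptake data

bound = max_hepatic_metabolism(LIVER_BLOOD_FLOW, ARTERIAL_CONC)
needs_other_tissues = requires_extrahepatic(OBSERVED_RATE, LIVER_BLOOD_FLOW, ARTERIAL_CONC)
print(f"liver-only upper bound = {bound:.1f} mg/hr; "
      f"extrahepatic metabolism required: {needs_other_tissues}")
```

When the observed rate exceeds the bound, some metabolism must be assigned to tissues receiving additional blood flow, which is the reasoning behind combining the liver with the vessel-rich tissue group.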
Table 2 shows the bottom-line results of all three of our pharmacokinetic-based carcinogenic risk assessments in comparison with the risks that would be expected from the more conventional assessments done by the EPA's Carcinogen Assessment Group. The differences between our best (or least unlikely) and plausible upper limit risk numbers reflect in each case a series of differences in procedures designed to give decision-makers some sense of the uncertainties of the analysis resulting from both uncertainties in the pharmacokinetics/metabolism and conventional uncertainties in projection of human risks from animal data.
Table footnotes: ND, not done. Implicitly, a best estimate equivalent to a lifetime risk of 0.104 was [...] in the Hogstedt study (37). This is a central tendency estimate because no statistical upper confidence limit procedure was used in [...].
For example, the best estimate numbers reflect a) maximum likelihood estimates of conventional multistage dose-response relationships in animals (modified in one butadiene data set to account for possible interactions between butadiene-induced mutagenic transitions and similar transitions causing background cancers in humans); b) the geometric mean of risk determinations in different species and sex groups in animals; c) an animal-to-human projection of risk depending on dose/(body weight)^{3/4}; and d) the best estimates of metabolic formation from animal and human pharmacokinetic models. The plausible upper limit numbers reflect a) the upper 95% confidence limit estimate of the linear term of multistage dose-response models; b) the carcinogenesis experience of the most sensitive species and sex tested; c) an animal-to-human projection of risk depending on dose/(body weight)6; and d) estimates of human dose in relation to animal dose derived from our plausible upper limit versions of our human pharmacokinetic models. Comparing our plausible upper limit results with those of EPA, it can be seen that the modeling sometimes increases and sometimes decreases the final upper bound estimates of risks, although in all three cases our least unlikely estimates are below EPA's upper bound figures.
In conclusion, pharmacokinetic analysis is nobody's unambiguous, quick solution to the problem of uncertainty in carcinogenic risk analysis. Each of the models I have developed to date has needed to undergo serious structural modification in the light of the data available for the specific case. The process of making these modifications is a developing art, requiring liberal doses of judgment rather than cookbook formulas. In addition, the models raised as many interesting questions as they answered, often revealing unsuspected sources of uncertainty and nonobvious difficulties in fairly assessing the extent of the uncertainties.
As often as not, the pharmacokinetic analysis does not make a major difference in the final numerical projection of risks (particularly for ethylene oxide). The exception is butadiene, where there was nearly an order-of-magnitude effect. Nevertheless, there is hope that in the long run, pharmacokinetic analysis can both facilitate the process of asking better and more relevant experimental scientific questions and help make risk assessment models somewhat better in the sense of incorporating more realistic and more experimentally testable information about the causal processes underlying both carcinogenesis and other adverse health effects.
[Table caption: Elements of a new analysis for noncancer health effects mediated ...]

Use of Various Intermediate Parameters to Improve Risk Assessments for Noncancer Effects
Beyond the previous examples in pharmacokinetic-based carcinogenesis risk assessment, we have also undertaken a number of ventures into quantitative risk assessment for a variety of other types of effects. This is an area with great potential importance both scientifically and for social policy. Traditionally, noncancer effects have not been the subjects of the same kind of quantification as has recently become standard for cases of cancer. We believe that the usual no-observed-effect level/safety factor approach has serious limitations, both from a scientific standpoint and for the needs of social decision making. Table 3 gives an overview of the kinds of analyses we believe scientists should seek to develop as a replacement, at least for those effects that can be causally linked to an accessible functional intermediate parameter.

Acrylamide Neurotoxicity
Classic chronic toxic damage processes are defined as those that are fundamentally reversible, at least in preclinical stages, but that take a relatively long time (weeks or months) for reversal/repair to occur (38). This applies to some, but possibly not all (39), of acrylamide's neurotoxic effects. Risk assessments for chemicals that act by such processes need to address a number of significant issues: a) What are the relationships between external dose and the generation of the internal damage/toxicant accumulation? b) What are the nature and dynamics of reversal of the slow step in the process that makes the process chronic? c) What are the differences among species in both the generation of damage/toxin accumulation and the repair/reversal process? d) How much interindividual variation can be expected among exposed people in both damage-producing and repair processes (and therefore susceptibility to toxicity)?
The acrylamide case is interesting in that it indicates the potential helpfulness of an entirely theoretical modeling exercise in basic toxicological research. As can be inferred from Table 1, we do not know the exact physical form of the incipient damage that accumulates over weeks or months to ultimately lead to the grosser manifestations of peripheral neuropathy. In the case of acrylamide, three decades of experimental observations have yielded an extensive characterization of neuropathic effects at both the morphological (40) and functional levels (41). At the key biochemical/molecular level, however, there is an almost embarrassing richness of candidates for causal intermediate processes in the generation of neurological damage. Among the most prominent of these is inhibition of retrograde transport systems (which convey material from the axons back to the cell body) (42,43). A number of other mechanisms have also received serious study.
For our modeling work, we elected to return to some of the most classical studies of acrylamide neurotoxicity (44-46) and apply a simple dynamic analysis model to them. The data sets analyzed are those that have provided information on some specific manifestation of toxicity produced by different combinations of acrylamide dose rate and duration of exposure (Table 4). Similar data were available for some other effects and some other species. We found that the pattern of increase in the time required to achieve a particular effect could provide us with two important pieces of information relevant to the assessment of risks. The first piece of information is the dynamics of repair of the incipient damage, i.e., how much of the past accumulated damage is repaired per day? How does this calculated repair rate appear to change a) across species, and b) for different adverse effect end points, with different amounts of calculated accumulated damage? The second piece of information is the dose of acrylamide that would be just barely able to produce each effect in each species if the experiment were conducted over the animal's entire lifespan.
Information of the first type may also be helpful in neurotoxicology research. Specific biomarkers for the main process causing a particular response should show repair in different locations and in different species with dynamics that are consistent with the repair rates calculated from the dose versus time-of-effect data.
Our model for analyzing acrylamide data (Fig. 3) is built around three assumptions: a) A particular adverse effect occurs whenever a specific amount of damage is accumulated in the relevant portions of the nervous system. There is no appreciable delay between the production of damage and the manifestation of the resulting effects. b) Damage is produced at a rate that is approximately linear with the milligram per kilogram dose administered to the animals. c) Repair of the accumulated damage occurs at a rate that depends directly on the amount of accumulated damage that there is to be repaired.
Figure 3. Model of acrylamide damage accumulation and repair.
The first assumption provided us with our primary tool for quantitatively analyzing the data. Basically, by trial and error, for each data set, we determined the repair rate that made the amount of accumulated damage approximately equal for each of the dose and time combinations that were observed to produce a particular response. Some variations on the second and third assumptions were explored during the course of model development.
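The trial-and-error procedure just described can be sketched as a search for the repair rate that makes calculated accumulated damage most nearly equal across dose and time combinations. The sketch assumes continuous dosing, damage produced in direct proportion to dose rate (expressed in dose units), and first-order repair. The observations are synthetic, generated from a hypothetical repair rate and critical damage level, so the example only shows that the procedure recovers a known value; it does not use the data of Table 4.

```python
import math


def accumulated_damage(dose_rate, days, repair_rate):
    """Damage after `days` of constant dosing with first-order repair.

    Solves dD/dt = dose_rate - repair_rate * D with D(0) = 0.
    """
    return dose_rate / repair_rate * (1.0 - math.exp(-repair_rate * days))


def spread(observations, repair_rate):
    """Coefficient of variation of calculated damage across observations."""
    damages = [accumulated_damage(d, t, repair_rate) for d, t in observations]
    mean = sum(damages) / len(damages)
    variance = sum((x - mean) ** 2 for x in damages) / len(damages)
    return math.sqrt(variance) / mean


def fit_repair_rate(observations, candidates):
    """Pick the candidate repair rate that best equalizes calculated damage."""
    return min(candidates, key=lambda k: spread(observations, k))


# HYPOTHETICAL "true" values used only to generate synthetic observations.
TRUE_REPAIR_RATE = 0.05  # fraction of accumulated damage repaired per day
CRITICAL_DAMAGE = 300.0  # damage (in mg/kg equivalents) producing the effect


def days_to_effect(dose_rate):
    """Time at which damage reaches CRITICAL_DAMAGE at a given dose rate."""
    return -math.log(1.0 - CRITICAL_DAMAGE * TRUE_REPAIR_RATE / dose_rate) / TRUE_REPAIR_RATE


# (dose rate in mg/kg/day, days of dosing needed to produce the effect)
observations = [(d, days_to_effect(d)) for d in (50.0, 25.0, 20.0)]

candidates = [i / 1000 for i in range(1, 201)]  # 0.001 to 0.200 per day
k_fit = fit_repair_rate(observations, candidates)
critical_damage = accumulated_damage(*observations[0], k_fit)
# Dose rate at which steady-state damage (dose_rate / k) just reaches the
# critical level; a stand-in for the lifetime just-effective dose.
lifetime_threshold = critical_damage * k_fit
print(f"fitted repair rate = {k_fit:.3f}/day, "
      f"lifetime threshold dose rate = {lifetime_threshold:.1f} mg/kg/day")
```

The steady-state expression treats the animal's lifespan as long relative to the repair time; for slow repair rates or short-lived species, the finite-duration formula in `accumulated_damage` would be the more appropriate basis for the threshold.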

Male Fertility Effects of Glycol Ethers
The assessment of the effect of glycol ethers on male fertility used an analysis of the pharmacokinetics of ethoxyethanol (EE) and its metabolite, ethoxyacetic acid (EAA), to help interpret observations of EAA excretion and sperm count distributions in recent studies of two groups of workers with EE exposure and concurrent controls (47-49). Based on existing observations of relationships between sperm concentrations and male fertility performance [which are not without controversy among andrologists (50)], we assessed the likely results of observed changes in sperm count distributions in the worker groups in two kinds of units: the increase in the numbers of couples expected to experience a sufficient delay in achieving pregnancy to seek medical treatment [analysis after the method of Meistrich and Brown (51)]; and the change in the monthly probability of achieving pregnancy [with a female partner drawn from a particular population, based on data from Steinberger and Rodriguez-Rigau (52)]. Tables 5 and 6 show these different perspectives on the implications of the changes in sperm count distributions in this case. The assumptions from Meistrich and Brown (51) underlying the calculations in Table 5 are a) a uniform multiplicative sperm reduction effect across the entire distribution of sperm counts, b) a linear relationship between the multiplicative sperm count "reduction factor" and the excess infertility risk for a "reduction factor" of 1.24, and c) a one-hit killing function for sperm progenitors in relation to dose rate. It can be seen in Table 5 that the two studies of worker groups, while qualitatively reinforcing each other, had appreciably different quantitative implications for sperm count changes. It may be relevant that the shipyard painters were exposed to much more variable concentrations of the glycol ethers.

Possible Effects on Infant Mortality As a Result of Processes Related to Reductions in Birth Weights
Finally, I would like to briefly review some intriguing data from work that is still in process on the effects of glycol ether exposure during pregnancy. We have two types of data to analyze: quantal data on the incidence of fetal death and teratogenic anomalies in exposed animals, and data on the change in a continuous variable, fetal weights. We were surprised to observe that the latter type of data (53,54) seemed to be compatible with a linear dose-response relationship (Figs. 4 and 5). Mechanistically, it seems possible that a rapidly growing organism, using essentially all the available metabolic energy it can muster to grow and differentiate, might well have little or no functional reserve capacity. Thus, even marginal additional stresses might cause effects without having a true threshold dose that could be absorbed without producing at least a marginal adverse change. If the indicated change in fetal weights in animals were to be paralleled by a change in average birth weights in humans, there could be an effect on infant mortality, which is very strongly associated with human birth weights (55). However, even if we are willing to make this leap, it is still not entirely clear exactly how we should project birth weight changes to the human situation.
Table footnotes: The variability for shorter averaging times would be greater, and hence the difference between the geometric mean exposure and the exposure exceeded 5% of the time would be greater. These data represent the estimated dosage and mean sperm count reductions observed in the actual epidemiological data. Data on other lines of the table represent projections using the Meistrich and Brown (51) assumptions.
Perhaps the most straightforward approach would be to simply assume that the entire human distribution of birth weights receives the same multiplicative reduction in weight. To see if this was a reasonable model of birth weight change in humans exposed to an array of different stressors, we have recently compared the population distributions of all black and all white singleton births from 1980 (Fig. 6, Table 7). Overall, it can be seen in Figure 6 that both black and white birth weight distributions tend to be bimodal, with the lower mode including 2.5 to 5% of all groups. When the differences in birth weights at different percentiles of the black and white distributions are compared (Table 7), it appears that there are more profound reductions at the lower end of the birth weight distribution than at the high end. Were this pattern to be produced by an environmental chemical, there would be greater implications for changes in infant mortality than if the agent caused a simple multiplicative reduction in birth weights at all percentiles of the distribution. It will be instructive to examine changes in birth weight distributions associated with more defined stressors (such as smoking and alcohol) to see what patterns of birth weight change might be indicated for different agents and whether accompanying changes in infant mortality from these agents are well predicted by associated changes in the distribution of birth weights.

Conclusions
The use of biomarkers and pharmacokinetic analysis can serve a number of purposes in risk assessment studies and basic scientific research on health hazards. These include a) raising interesting questions about causal mechanisms ("how much" and "when" dynamics issues) that need to be resolved in experimental and epidemiological observations and b) exploring the plausible social consequences for different risks if specific relationships between exposures, intermediate parameters, and end effects were to take on specific, reasonably likely forms.