Topics in cancer risk assessment.

The estimation of carcinogenic risks from exposure to chemicals has become an integral part of the regulatory process in the United States within the past decade. With it have come considerable controversy and debate over the scientific merits and shortcomings of the methods and their impact on risk management decisions. In this paper we highlight selected topics of current interest in the debate. As an indication of the level of public concern, we note the major recent reports on risk assessment from the National Academy of Sciences and the U.S. Environmental Protection Agency's proposed substantial revisions to its Guidelines for Carcinogen Risk Assessment. We identify and briefly frame several key scientific issues in cancer risk assessment, including the growing recognition of the importance of understanding the mode of action of carcinogenesis in experimental animals and in humans, the methodologies and challenges in quantitative extrapolation of cancer risks, and the question of how to assess and account for human variability in susceptibility to carcinogens. In addition, we discuss initiatives in progress that may fundamentally alter the carcinogenesis testing paradigm.


Introduction
Although risk assessment in various forms has been used since antiquity, its use to assess and regulate the carcinogenic hazards of environmental chemicals in the United States dates to the early 1970s (1-5). In 1983, the National Research Council (NRC) released its influential report Risk Assessment in the Federal Government: Managing the Process (6), which described a paradigm for cancer risk assessment that has become the template for its development and application in the United States. The U.S. Environmental Protection Agency (U.S. EPA) issued its first formal guidelines for the conduct of cancer risk assessment in 1986 (7), and since that time, risk assessment has been one of the most prominent scientific issues in legislative, policy, and regulatory discussions.
Discussions of risk assessment, particularly for carcinogens, have been especially intense during the 1990s. Among the significant products of these discussions are the report of the National Research Council's Committee on Risk Assessment Methodology: Issues in Risk Assessment (8), the report of the National Research Council's Committee on Risk Assessment of Hazardous Air Pollutants: Science and Judgment in Risk Assessment (9), and the Report of the President's Commission on Risk Assessment and Risk Management (10).
Risk reform legislation was introduced in both Houses of Congress during 1995 in an attempt to legislate the use of risk assessment and the structure of the risk assessment process. While risk assessment legislation has not been passed by Congress, relatively minor components have been incorporated in the 1996 reauthorization of the Safe Drinking Water Act and will likely be incorporated in other environmental statutes. Although risk assessment legislation has been rationalized, in part, by perceived shortcomings in the process, the NRC (9) generally endorsed the U.S. EPA's approach to quantitative risk assessment, including the agency's reliance on default assumptions. However, the U.S. EPA's 1986 guidelines, while in actuality allowing some flexibility in the conduct of risk assessment for carcinogens, have often been viewed as overly rigid in practice. The guidelines have also been viewed by many as well behind the times and unable to incorporate recent advances in toxicology, epidemiology, and related disciplines. The U.S. EPA's draft revisions to the 1986 guidelines (11), released for public comment in April 1996, have proposed several changes that should, if adopted, allow a more flexible process based on the best available science.
In the following paragraphs we highlight some of the scientific issues that are the focus of the current discussion.
Mode of Action and Dose-Response Relationships: Qualitative and Quantitative Assessment of Cancer Risks
Perhaps the most significant recent advances in cancer risk assessment have occurred through improvements in the understanding of the mechanisms and/or modes of action by which chemicals induce cancer. Data on mechanisms, toxicokinetics, and toxicodynamics have begun to be incorporated into the hazard evaluation and dose-response assessment components of cancer risk assessments. The U.S. EPA's proposed guidelines (11) provide a framework for the use of such scientific information, when available, in place of the default assumptions that have characterized most cancer risk assessments in the past.
Specifically, the guidelines call for the use of mode of action data whenever possible. In the current jargon of cancer risk assessment, identifying the mechanism of action of a chemical implies a comprehensive understanding of every event in the process by which the chemical produces tumors in a particular organism. Knowledge of the mode of action suggests that one understands in general the critical events in that process, but not necessarily the details. There is no carcinogen for which the mechanism of action is fully known; thus, the new U.S. EPA guidelines do not propose such an unrealistic standard. The guidelines state: "While the exact mechanism of action of an agent at the molecular level may not be clear from existing data, the available data will often provide support for deducing the general mode of action. Under these guidelines, using all of the available data to arrive at a view of the mode of action supports both characterization of human hazard potential and assessment of dose response relationships" (11).
The following section of this paper briefly highlights two cases in which consideration of the mode of action is already impacting the qualitative assessment of cancer risks. The proposed guidelines also offer new approaches to the quantitative assessment of cancer risks, including the consideration of mode of action in selecting the approach to characterize dose response. Because the extrapolation of dose-response data far beyond the range of experimental observations has been such a critical issue in the cancer risk assessment debate, it is also discussed in more detail in the section "Quantitative Extrapolation of Cancer Risks."
Mode of Action in the Qualitative Assessment of Cancer Risks: Two Cases
Cancer risk assessment is predicated upon evidence that exposure to a chemical is associated with tumorigenesis in humans or animals. In cases where hazard identification is based on studies in animals, specifically the 2-year rodent carcinogenicity bioassay, findings of chemically induced tumors in the animals are considered evidence that the chemical being tested has the potential to cause cancer in humans. This assumption may be supported or challenged by a weight-of-evidence examination of collateral scientific information that identifies and characterizes, to the extent possible, the similarities and differences between carcinogenic processes in humans and animals.
Such weight-of-evidence approaches may have considerable implications for regulatory policy (12). For example, kidney tumors in male rats, associated with the accumulation of hyaline droplets containing the urinary protein α2u-globulin (α2u-g), have been interpreted to result from regenerative hyperplasia in response to droplet-associated cellular necrosis (13). The accumulation of droplets is suggested to be due to reversible binding of chemicals (or their metabolites) to α2u-g, inhibiting its normal proteolytic degradation (14). Since neither α2u-g nor any functionally equivalent protein has been found in humans, kidney tumors in male rats attributable solely to chemically induced α2u-g deposition in the renal tubules are not considered to be relevant for humans. This interpretation is currently the basis for the U.S. EPA policy relative to the evaluation of chemicals associated with the induction of kidney cancer in male rats (13). This interpretation has been challenged by Melnick and colleagues (15,16), who suggest that test agents or their metabolites bind to α2u-g and are transported to the kidney tubules, where the bound material is released and causes localized cytotoxicity leading to tumor formation. Distinguishing whether it is the α2u-g-ligand complex that is causing cytotoxicity and regenerative hyperplasia, or whether α2u-g is simply transporting and concentrating cytotoxic chemicals in the kidney, likely can be resolved experimentally. The contribution of the weight-of-evidence approach is that it has identified hyaline droplet accumulation in the renal tubules as the key to understanding this type of lesion.
A similar weight-of-evidence evaluation has recently been applied to understanding the relevance of findings of urinary bladder tumors in rodents (17). Summaries by Gold et al. (18) and Huff et al. (19) indicate that approximately 4 to 10% of the chemicals evaluated in rodent carcinogenicity studies have been associated with tumors of the urinary bladder. While certain bladder tumor-eliciting compounds, e.g., 2-naphthylamine and 4-aminobiphenyl, operate through genotoxic mechanisms, many act through nongenotoxic processes. Examples include melamine, uracil, diethylene glycol, and sodium saccharin (20). Bladder tumors elicited by such chemicals typically occur at high doses and are accompanied by bladder stones composed of the test compound, its metabolites, or other treatment-related substances.
Careful examination of available scientific information suggests the following model to explain findings of urinary bladder tumors in rats exposed to nongenotoxic agents. Relative to humans, the urine of rodents is high in osmolality (21) and rich in protein content (22). The physical and chemical characteristics of rodent urine constitute a unique environment that is conducive to the formation of microcrystalluria, urinary precipitates, and/or bladder calculi (21-23). This may be particularly important during chemical carcinogenicity testing, where animals are chronically exposed to high doses of test agents such that the agents and/or their metabolites accumulate in the urine. These factors (i.e., crystals, precipitates, and stones) can be physically and/or chemically cytotoxic to bladder urothelial cells and thus can trigger a regenerative hyperplasia (23). Damage to the urothelium is aggravated because, on voiding, the rodent bladder forms folds and rugae that preclude complete voidance, resulting in precipitate and calculi retention (24). In humans, the anatomical orientation of the bladder is such that stones are apt to be voided or, if retained, the associated discomfort typically results in their removal by medical intervention. Epidemiologic studies suggest that the presence of calculi is only weakly associated with the occurrence of human bladder tumors (25).
This weight-of-evidence analysis suggests that findings in rodents of bladder tumors associated with calculi may be of diminished concern for human health risk assessment. Indeed, this model is likely to be reflected in U.S. EPA policy relative to the interpretation of such lesions (12). In order to support such a mode of action and the suggestion of diminished concern, an investigator would need to carefully characterize urine, precipitate, and stone chemistry; test agent metabolism, distribution, and excretion; and establish the dose dependency of the observations. Because urine chemistry is influenced by water consumption, diet, time of day, and other factors (21,22), careful attention to and documentation of laboratory and analytical procedures are critical to the evaluation and interpretation of such studies.

Quantitative Extrapolation of Cancer Risks
The fundamental problem of extrapolation to low doses in the quantitative estimation of carcinogenic risk was already acknowledged in the U.S. EPA's 1986 Guidelines for Carcinogen Risk Assessment (7): "Since risks at low exposure levels cannot be measured directly either by animal experiments or by epidemiologic studies, ... mathematical models have been developed to extrapolate from high to low dose. Different extrapolation models, however, may fit the observed data reasonably well but may lead to large differences in the projected risk at low doses." These guidelines established the linearized multistage model (26) as the U.S. EPA default procedure for low-dose extrapolation. The U.S. Food and Drug Administration (FDA) has generally used a linear extrapolation procedure developed by Gaylor and Kodell (27).
The risk estimates derived by these methods have considerable inherent variability and uncertainty, and the numbers are considered to be conservative upper bounds on the risk. The output of the linearized multistage model (LMS) that is most commonly used in regulatory applications is the 95th percentile upper bound on the calculated risk for a given exposure (26,28), but the uncertainty in this estimate is such that the risk characterization usually states that the true risk may actually be zero (7).
The controversy that has developed in recent years over cancer risk assessment and the resultant risk management and regulatory decisions in the United States derives, in large measure, from this uncertainty and from variability in the estimates. Our inability to accurately estimate risks at low doses, based on data from lifetime animal studies conducted at much higher doses, is a multifaceted problem that has stimulated considerable discussion and research (8,9,29,30,31), and this review can only mention some of the issues that are shaping the debate.
One issue that has attracted a great deal of attention is: What is the most appropriate expression of the output of a cancer dose-response or risk assessment? Some have argued that the 95th percentile upper bound on a risk estimate that already incorporated a number of conservative default assumptions was overly conservative and was leading to risk estimates that were too high and regulatory standards (e.g., target cleanup levels) that were too stringent (32,33). It has been argued that more realistic numbers that characterize the central tendency of the risk estimate (e.g., maximum likelihood estimate, best estimate) should be used instead. Others have countered that there is really little information available to assess the conservatism of the regulatory risk estimates for humans and that estimates based on the 95th percentile upper bound may even underestimate the risk for some populations and chemicals (34-36).
A problem with relying solely on estimates of central tendency is that they tend to be unstable; i.e., in extrapolating from experimental animal tumor data to risks in the one-in-a-million range, the maximum likelihood estimate can be highly sensitive to very small changes in tumor incidence that occur as normal variability in animal studies. For example, Kodell and Park (37) have shown that increasing the number of animals with tumors from 1 of 50 to 3 of 50 in the mid-dose level of a three-dose level (plus controls) study can decrease the LMS maximum likelihood estimate of the dose corresponding to a 10⁻⁶ risk by more than 100-fold.
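The instability described by Kodell and Park can be reproduced in a toy calculation. The sketch below is illustrative only: the doses, group sizes, grid-search fit, and the two-stage model P(d) = 1 - exp(-(q1·d + q2·d²)) are assumptions for demonstration, not the actual LMS software used in regulatory practice.

```python
import numpy as np

def multistage_mle_vsd(tumors, n=50, doses=(0.0, 1.0, 2.0, 3.0),
                       target_risk=1e-6):
    """Crude grid-search MLE for a two-stage model
    P(d) = 1 - exp(-(q1*d + q2*d**2)), returning the dose the fitted
    model associates with the target extra risk (illustrative only)."""
    doses = np.asarray(doses, dtype=float)
    tumors = np.asarray(tumors, dtype=float)
    # Candidate parameter values; 0 is included so either term can vanish.
    grid = np.concatenate(([0.0], np.logspace(-6, 0, 300)))
    q1, q2 = (g.ravel() for g in np.meshgrid(grid, grid, indexing="ij"))
    # Binomial log-likelihood for every (q1, q2) pair at once.
    p = 1.0 - np.exp(-(np.outer(doses, q1) + np.outer(doses**2, q2)))
    p = np.clip(p, 1e-12, 1.0 - 1e-12)
    loglik = (tumors[:, None] * np.log(p)
              + (n - tumors)[:, None] * np.log(1.0 - p)).sum(axis=0)
    a, b = q1[np.argmax(loglik)], q2[np.argmax(loglik)]
    # Solve a*d + b*d**2 = -ln(1 - target_risk) for the dose d.
    t = -np.log1p(-target_risk)
    return (-a + np.sqrt(a * a + 4.0 * b * t)) / (2.0 * b) if b > 0 else t / a

# Changing only the mid-dose group from 1/50 to 3/50 tumor-bearing
# animals shifts the maximum likelihood "virtually safe dose" sharply.
d_mid1 = multistage_mle_vsd([0, 1, 1, 12])
d_mid3 = multistage_mle_vsd([0, 1, 3, 12])
print(d_mid1, d_mid3, d_mid1 / d_mid3)
```

The point is not the particular numbers but the direction: a two-animal change in one dose group moves the low-dose maximum likelihood estimate by orders of magnitude, while an upper bound on risk is far more stable.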
Thus, a consensus has developed within the regulatory risk assessment community that risk estimates derived from mathematical models like the LMS model must be expressed as more than a single number (9). Several possible ways to express the estimates have been suggested. These include the presentation of some estimate of central tendency along with the upper and lower bounds (38), the application of Monte Carlo analysis to generate an overall probability distribution for the risk estimates (39-43), and other probabilistic approaches (44-46).
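As a minimal sketch of the Monte Carlo approach, one can propagate assumed input distributions through a low-dose linear risk model and report percentiles of the resulting risk distribution rather than a single upper bound. The slope-factor and dose distributions below are hypothetical, chosen only to show the mechanics.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical (assumed) input distributions:
# a lognormal cancer slope factor (per mg/kg-day) and a lognormal
# daily dose (mg/kg-day) for the exposed population.
slope = rng.lognormal(mean=np.log(0.05), sigma=0.6, size=n)
dose = rng.lognormal(mean=np.log(1e-4), sigma=0.8, size=n)

# Low-dose linear model: extra lifetime risk = slope * dose.
risk = slope * dose

p5, p50, p95 = np.percentile(risk, [5, 50, 95])
print(f"5th: {p5:.2e}  median: {p50:.2e}  95th: {p95:.2e}")
```

The output is a distribution, so a risk characterization can report a central estimate and the spread around it instead of a lone upper-bound number.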
The U.S. EPA's proposed revised Guidelines for Carcinogen Risk Assessment (11) will move the agency and environmental risk assessment away from reliance on the LMS model for extrapolation of cancer risks. Several alternatives are proposed in its place. The linear default that likely will replace the LMS model in most cases will be a linear extrapolation from the lower 95% confidence limit on the ED10, the dose associated with a 10% excess risk over background. The proposed guidelines suggest that a full characterization of the dose response should "report the central estimate of the ED10, the upper and lower 95% confidence limits, and a graphical representation of model fit." Exactly how this will translate into a risk characterization when extrapolated remains to be determined. [For example, Nilsson (47) notes, in passing, that linear extrapolation from the ED10 through the dose-response origin will "greatly exaggerate risk" for genotoxic carcinogens with a steep dose response and where "strongly promotive factors operate in the high dose range used in animal studies...."]
In the simplest case under the proposed guidelines (11), the ED10 will be based on tumor incidence data from a rodent bioassay. However, ancillary data also may be used to extend the dose-response curve below the dose range in which tumors are observed, if the ancillary data can be clearly linked to the carcinogenic response. For example, the dose-dependent formation of DNA adducts of the chemical may be quantifiable at levels below the range in which an increased incidence of tumors is detectable. Such data may be used to extend the dose-response curve for tumors if there is a high degree of confidence that the formation of these adducts is a requisite step in the development of the tumors in question and will display the same dose response as the tumor data at low doses (48-51).
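The arithmetic of the proposed linear default is simple: draw a straight line from the lower 95% bound on the ED10 (the LED10) to the origin. All numbers in the sketch below are hypothetical and serve only to illustrate the calculation.

```python
# All values are hypothetical, chosen only to show the arithmetic.
led10 = 2.5                  # mg/kg-day: lower 95% bound on the ED10
slope_factor = 0.10 / led10  # extra risk per mg/kg-day (line to origin)

exposure = 1e-3              # mg/kg-day: assumed human exposure
risk = slope_factor * exposure

# Dose corresponding to a one-in-a-million (1e-6) extra risk:
d_one_in_a_million = 1e-6 / slope_factor

print(f"slope = {slope_factor:.3f} per mg/kg-day")
print(f"risk at exposure = {risk:.1e}")
print(f"dose at 1e-6 risk = {d_one_in_a_million:.2e} mg/kg-day")
```

Because the slope is anchored at a dose inside (or near) the observed range, this default avoids fitting a dose-response model far below the data, which is where model choice dominates the answer.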
Many other biomarkers and preneoplastic changes have been reported that may be evaluated on a case-by-case basis for use in extending the dose-response curve (52-58). However, the dose-dependent link between the biomarker and tumor end point must be firmly established.
The shape of the cancer dose-response curve at low doses has been a topic of much theoretical discussion and debate. For DNA-reactive carcinogens, it has been argued that their additivity to the background rate of ongoing carcinogenic processes predicts that the dose-response curve will be linear at low doses (59-61). Lutz (62) argued that "the presence of endogenous DNA damage implies that exogenous DNA-carcinogen adducts give rise to an incremental damage that is expected to be proportional to the carcinogen dose at the lowest levels." This, of course, says nothing about the dose-response relationship at higher doses where curvature may occur due to saturation of critical metabolic pathways or DNA repair mechanisms, or where cytotoxicity may occur or cell proliferation may be induced (31,63-68). In addition, there is a body of experimental data that suggests that exposure to low levels of some carcinogens, including ionizing radiation, may induce and enhance the efficiency of general repair mechanisms, and the debate over this potentially beneficial (hormetic) effect is drawing increasing attention (69-73).
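The additivity-to-background argument can be illustrated numerically. In the hypothetical sketch below, a two-stage response is driven by the sum of a background process and the exogenous dose; the incremental (extra) risk per unit dose approaches a constant as the dose shrinks, i.e., low-dose linearity, even though the overall curve is nonlinear.

```python
import math

def p_tumor(d, background=1.0, k=0.01):
    """Hypothetical two-stage response in which the exogenous dose d
    adds to an ongoing background process of size `background`."""
    return 1.0 - math.exp(-k * (background + d) ** 2)

p0 = p_tumor(0.0)
for d in (1e-1, 1e-2, 1e-3, 1e-4):
    extra = p_tumor(d) - p0  # incremental risk above background
    print(f"d = {d:.0e}  extra risk = {extra:.3e}  extra/d = {extra / d:.5f}")
# As d -> 0, extra/d settles at the local slope of the curve at the
# background point, so incremental risk is proportional to dose there.
```

Had the dose acted alone (background = 0), the extra risk would fall off quadratically at low doses; it is the addition to an ongoing background process that produces the linear low-dose behavior.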
For rodent carcinogens that are functionally nongenotoxic (i.e., do not react directly with DNA in inducing tumors), new risk assessment paradigms are being proposed, based on information on the mode of action, that use a "margin of exposure" analysis analogous to that used commonly for noncancer endpoints. In such cases, "the risk is not extrapolated as a probability of an effect at low doses" (11). Rather, estimated human exposure levels are compared with the lower 95% confidence limit on the ED10 from the animal carcinogenicity study or other studies of precursor effects (with doses adjusted for the animal-to-human extrapolation). This margin of exposure analysis does not necessarily require the demonstration of a true threshold for the carcinogenic process but rather a sufficiently clear understanding of the mode of action to support the presumption of an effective threshold (highly nonlinear dose response) (17,74-77). It has been suggested that experimental proof of a biological threshold is somewhat akin to proving a negative (78). Indeed, the caution by Melnick et al. (16) against "excessive reliance on oversimplified classification schemes" (such as those postulating thresholds) is well taken and is one of the issues that led the U.S. EPA to discard the alphanumeric classification scheme for carcinogens in favor of a narrative description of the weight of the evidence (11).
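The margin-of-exposure comparison described above reduces to a ratio of the point of departure to the estimated human exposure; both values in the sketch below are hypothetical.

```python
# Margin of exposure (MOE): point of departure / human exposure.
# Both values are hypothetical, for illustration only.
led10 = 15.0            # mg/kg-day: lower 95% bound on the animal ED10
human_exposure = 0.003  # mg/kg-day: estimated human exposure

moe = led10 / human_exposure
print(f"margin of exposure = {moe:.0f}")
```

The resulting ratio is then judged against the mode-of-action evidence rather than converted into a low-dose probability of effect.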
The new U.S. EPA guidelines call for a fuller understanding of the carcinogenic process and the use of all of the information available, rather than sole reliance on rodent tumor data. Clearly, the way forward will bring with it pitfalls and problems as toxicologists attempt to evaluate and assimilate the rapidly advancing knowledge base in fundamental cancer biology, hormonal effects (79,80), chemical interactions with cellular components, and toxicokinetics (57) in their risk assessments. The temptation to oversimplify the complex may still lead to over- or underestimates of the carcinogenic potential of individual chemicals for humans. In the final analysis, the actual impact of the increasing emphasis on delineation of mechanisms and modes of action on regulatory risk assessment for carcinogens will be an important benchmark in assessing the value and usefulness of the proposed guidelines.
More complex biologically based mathematical models of the carcinogenic process also have been developed, but early hopes that these models might be useful in estimating small risks at low doses have not been realized as yet (81). Two general types of these models are the time-independent models exemplified by the work of Moolgavkar and others (82-85) and the time-dependent models exemplified by the work of Ellwein and Cohen (66,86-89). These models have been used, for example, in examining factors influencing the occurrence of lung cancer in uranium miners (90), the development of skin papillomas in mouse initiation-promotion studies (91), and the mechanisms of induction of liver and bladder tumors in mice in the ED01 study (66,88,92). Other biologically based mathematical models continue to be developed (93-98), but the predictive capability of such models still appears to be limited by the number of parameters requiring quantitative data input and by uncertainties regarding the detailed mechanisms of carcinogenesis (81,99-103).

Human Interindividual Variability in Susceptibility
Recent developments in the public health arena have heightened awareness that individuals vary in their susceptibilities to potential hazards in the environment and that such variability reflects genetic heterogeneity as well as differences in exposures. For example, the emergence of human retrovirus-mediated disease has focused attention on the vulnerability of immunocompromised individuals to potentially hazardous agents in the environment. At the same time, various reports (8,104) have identified a number of biologically based distinctions between adults and children with respect to vulnerability to environmental agents. The notion that humans vary in their susceptibilities to toxicants is supported by observations that not all workers exposed to benzidine develop urinary bladder cancer (105), that humans are not all equally susceptible to air pollutants (106), and that individual-to-individual variability in disease susceptibility is associated with a variety of genetic and other factors (107-109).
Concerns about human variability in susceptibility have recently crystallized around the question: Are safety factors or default assumptions currently used in risk assessment adequate to protect a human population composed of individuals who differ in susceptibility to potentially hazardous materials (9)? Superimposed on this question is the issue of environmental justice, which brings focus to the interactions between genetic variability and exposure variability. To address the question, the scientific community must determine how differential susceptibility may be quantitatively assessed and how well it is understood with respect to the human population.
Animal studies provide considerable insight into the issue of differential susceptibility. Within a single species (e.g., mouse), the toxicity of certain chemicals differs substantially between strains (110). Mechanistically, such differences in susceptibility may be accounted for by genetically determined differences in the expression or regulation of detoxifying enzymes or other metabolic factors. Susceptibility to potential developmental toxicants may also depend upon the maternal genotype as well as that of the offspring. Unfortunately, the range of factors and mechanisms associated with differential susceptibility in animals is neither fully characterized nor understood. Nevertheless, similar information, if available for humans, would be of value in examining differential sensitivity of individuals to potential hazards in the environment.
The NRC (9) recommended that the U.S. EPA should adopt an explicit default assumption for susceptibility, and that a default susceptibility factor greater than 1 or a default distribution of susceptibility should be incorporated in cancer risk estimates. The NRC also recommended that research should be conducted to explore the relationships between variability in factors such as DNA adduct formation and variability in susceptibility to carcinogenesis. Additionally, the NRC recommended that research be conducted to provide guidance on how to design epidemiologic studies to assess the influence of a number of factors on interindividual variability in susceptibility. Specific concerns include the contribution of age and gender, and genetic, metabolic, and physiologic parameters to interindividual variation in response and susceptibility; the identification of critical mechanisms; biomarkers to identify sensitive subpopulations; and the statistical and mathematical understanding of variability of response within the human population. Such information likely will have significant impact on the methods used to assess human variability as it applies to chemical risk assessment. Although there is some understanding of genetic variability, there is little understanding of the implications of that knowledge for risk assessment. Where there are physiological or other measures of variability in response, there is often limited knowledge of the genetic basis for those observations. More information is needed to determine the adequacy of uncertainty factors and conservative assumptions in protecting a highly variable human population (106,111,112).

Carcinogen Hazard Identification: Changing the Testing Paradigm
In the years following World War II, concerns about the adverse health effects of chemicals, particularly their potential to cause cancer, led to the development of routine testing protocols for assessing the carcinogenic potential of chemicals. Initially operating under the auspices of the National Cancer Institute, the FDA, and other government agencies, these programs evolved into the National Toxicology Program (NTP). Established in 1978, the NTP has provided an organizational umbrella for federal toxicity testing programs. In addition to its prominence as a premier toxicity testing and research program, the NTP has played a key role in identifying and standardizing methods for assessing chemically induced toxicity and carcinogenicity. During the last two decades, there has been an incredible expansion in our understanding of the biological principles and processes of carcinogenesis. Driven by technological advances in analytical instrumentation, molecular biology, and monoclonal antibody technology, the questions posed by scientists and the tools available to address such questions have become increasingly more sophisticated. Not surprisingly, such forces have changed profoundly the scientific approaches to hazard identification and pose a significant challenge to the risk assessment community in terms of incorporating new knowledge and new testing procedures into the risk assessment process. Two examples will illustrate this point.
Good animal husbandry is critical to the success of any animal study and thus figured prominently in the establishment of the NTP in vivo toxicity and carcinogenicity testing procedures. In the interest of good husbandry, an ad libitum feeding protocol was established for laboratory rodents to ensure the availability of adequate amounts of food on demand while obviating the need for, and costs associated with, individualized feed rationing and scheduled feedings. In recent years, it has become apparent that among rodents used in 2-year bioassays, survival has decreased, obesity has increased, and the incidences of background tumors and intercurrent disease have increased relative to their counterparts in studies conducted during the 1960s and 1970s (113-116). In control animals, these trends confound the interpretation of the results of the bioassay and call into question the reliability, reproducibility, and predictability of the procedure (117).
Careful examination of data from NTP-sponsored and NTP-conforming studies has revealed a consistent pattern of increased rate of animal body weight gain relative to the historical data (113,118,119). Moreover, Keenan et al. (120,121) report considerable intra- and interlaboratory variability in body weights of animals maintained under nominal ad libitum feeding protocols. Differences in feeding device configuration and accessibility accounted for a nearly 2-fold difference in body weight between same-age animals fed ad libitum. These and other studies suggested that animals of lower body weight exhibited greater 2-year survival, fewer background tumors, and less intercurrent disease than their obese counterparts (120,121). Similar findings occur when animals fed a portion-controlled ration of nutritionally sufficient feed are compared with ad libitum fed animals (122,123). The relationship between food intake and weight gain is further confounded by genetic drift in rodent breeding colonies because of selection for rapid growth and large litter size (124,125) to satisfy the demand for test animals.
These findings pose a considerable challenge to the NTP and to the risk assessment community. If ad libitum feeding is responsible for poor health of the animals and for inter-assay variability, what should be done to address the problem of overfeeding, how would the control of dietary intake affect animal metabolism and physiology, and what are the implications of changing feeding practices relative to the sensitivity of the bioassay and to the cumulative database as it currently exists? A considerable body of data (126,127) suggests that feeding rodents controlled portions results in healthier animals that perform better and more consistently in the bioassay. Many measures suggest that controlling dietary intake can produce metabolically and physiologically robust animals without substantively altering their sensitivity to chemical toxicants and carcinogens (128-130).
The FDA, led by scientists from the National Center for Toxicological Research, is drafting recommendations to be published in the Federal Register that call for the routine use of dietary control during chronic toxicity and carcinogenicity testing. These FDA recommendations describe a model for developing a controlled feeding protocol and suggest that dose range finding and other studies that support the design of long-term studies should also be conducted under the same feeding protocol as envisioned for the 2-year studies. Although controversial in terms of changing a fundamental component of the bioassay, the FDA's recommendations incorporate current scientific knowledge with the intent of improving the bioassay. The response of the risk assessment community to these recommendations and to the scientific challenges they pose will offer insight into how improved scientific understanding can be incorporated into the carcinogenicity testing process.
The 2-year carcinogenicity bioassay also has been criticized because the process of identifying potential human carcinogens is slow and costly (131); often many years elapse between the initiation of a study and the issuance of the final report. Scientists at the National Institute of Environmental Health Sciences (NIEHS) and elsewhere have proposed using genetically modified rodents for carcinogenicity testing (131,132). Advances in molecular biology and molecular genetics have provided the technological base for inserting selected genes into rodents during early embryologic development or, conversely, selectively inactivating certain genes, e.g., tumor suppressor genes. The resulting transgenic and gene knockout animals, respectively, afford unique opportunities to assess the role of specific genes and gene products in carcinogenic processes. The study of animals with potentially enhanced sensitivity to chemical carcinogens due to overexpression of selected gene products or inactivation of genes that suppress tumorigenesis could lead to more rapid and cost-effective detection of such compounds while reducing the number of animals needed for hazard identification.
One proposal currently before the scientific community is to screen for chemical carcinogens in two genetically modified strains of mice, the TG.AC transgenic animal and the p53 knockout mouse (131). TG.AC mice carry an activated v-Ha-ras oncogene that is expressed primarily in the skin (133). This gene is activated in transformed cells from various human and mouse tumors (134), and the expression of the activated form of the gene in TG.AC mice is thought to be functionally equivalent to an initiated animal in the context of the multistage model of carcinogenesis (132). The p53 tumor suppressor gene is inactivated in a variety of human and mouse tumors (135,136). When one copy of this gene is inactivated in mice, approximately 50% of the animals develop tumors by 18 months of age (136). Because p53 hemizygous animals have a propensity for developing tumors in a variety of tissues and organs, they have been proposed as models for screening for carcinogenic effects associated with the inactivation of tumor suppressor genes (137). Tennant et al. (131) have examined the responses of TG.AC transgenic and p53 knockout mice to exposure to eight and five chemicals, respectively, and are initiating studies with additional compounds (RW Tennant, personal communication). These include a number of compounds that have been evaluated in 2-year carcinogenicity studies, others that are currently being tested by the NTP, and still others that have not been and are unlikely to be tested by the NTP. Based on the small number of known carcinogens tested to date, the results of these studies suggest that p53 hemizygous mice can be used to specifically identify mutagenic carcinogens, whereas TG.AC mice respond to both mutagenic and nonmutagenic carcinogens but not to noncarcinogens (131).
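The reported response pattern — p53 hemizygous mice responding to mutagenic carcinogens, TG.AC mice responding to both mutagenic and nonmutagenic carcinogens, and neither strain responding to noncarcinogens — amounts to a simple decision rule for interpreting paired screen results. The sketch below is purely illustrative: the function name and category labels are ours, not part of any proposed protocol, and it assumes an unambiguous positive/negative call from each assay.

```python
def interpret_screen(tgac_positive: bool, p53_positive: bool) -> str:
    """Illustrative interpretation of paired TG.AC / p53 hemizygous
    screen results, based on the response pattern reported to date."""
    if not tgac_positive:
        if p53_positive:
            # p53-positive but TG.AC-negative does not fit the reported
            # pattern; flag it rather than force a classification.
            return "discordant: review assay conditions"
        return "presumptive noncarcinogen"
    if p53_positive:
        # Both assays positive: consistent with a mutagenic carcinogen.
        return "presumptive mutagenic carcinogen"
    # TG.AC-only positive: consistent with a nonmutagenic carcinogen.
    return "presumptive nonmutagenic carcinogen"
```

Note that this rule presupposes exactly the response pattern observed in the small number of chemicals tested so far; the validation questions discussed below concern whether that pattern holds more generally.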
The reliability, reproducibility, and predictability of carcinogen screening studies in genetically modified mice have yet to be established. Important questions surround the selection of chemicals to be evaluated during nominal validation studies such as those proposed by the NIEHS (138). The studies described by Tennant et al. (131) exposed hemizygous p53 knockout mice to the test chemicals for 24 weeks by gavage, in feed, or by topical administration, whereas TG.AC mice were exposed for 20 weeks by topical application. Issues of dose selection and route and duration of exposure warrant further consideration in the design of studies using genetically altered animals. The use of genetically modified mice will necessitate critical analysis of the relationship between dose and duration of exposure to facilitate discrimination between chemical effects and background responses. It is unclear whether chemically induced papillomas in TG.AC mice are predictive of nonskin, target organ lesions. Technology does exist to regulate transgenes to achieve organ-specific expression of the gene product (139), but it is probably premature to factor such approaches into carcinogen screening.
Considerable uncertainty surrounds the interpretation of findings of chemically induced tumors in genetically modified animals. There is concern that the greater sensitivity of the genetically modified mice may lead to a high rate of false positive results. In the context of prospective studies, it is unclear how to distinguish a false positive from a true positive and, indeed, what the appropriate standard is against which to make such a distinction. As many observers have pointed out, the 2-year carcinogenicity bioassay itself has never been validated, and although human cancer may constitute the gold standard, few data are available to use in such a fashion.
Another important issue is how, if appropriate, to phase in the use of screening studies in genetically altered animals. As currently configured, the four-cell bioassay provides for increased confidence in predicting human risk if positive results are obtained in both species and/or animals of each sex. Some have proposed replacing the 2-year mouse study with short-term studies in genetically modified mice. Clearly more studies will be needed to support such an approach. Although many anticipate the development and commercialization of transgenic rats, most, if not all, of these issues will apply to those animals as well.
The International Conference on Harmonization, an international coordinating body composed of representatives of pharmaceutical manufacturers and regulatory agencies, recently proposed that carcinogenicity testing in genetically modified mice or other assay systems be substituted for the conventional 2-year mouse bioassay (139). This proposal has stimulated a coordinated, international multilaboratory study to examine a number of the issues raised in the preceding paragraphs. The study is being coordinated by the ILSI Health and Environmental Sciences Institute.
The technology and techniques for genetic manipulation are rapidly evolving, as is understanding of the genetic basis for cancer susceptibility (11,140,141). This suggests that new animal models, each offering unique insights into the carcinogenic process(es), will come to the attention of the toxicology and risk assessment communities. Indeed, it is already apparent that the cancer profile of hemizygous p53 knockout mice differs depending upon the strain of mouse in which the p53 gene is inactivated (142). Thus, thoughtful evaluation of the existing genetically modified animals and their role in carcinogenicity testing should reflect the best available scientific understanding while recognizing that new, and potentially more relevant, animal models will emerge from the world's research laboratories.

Closing Thoughts
These are exciting times of discovery and change, of new knowledge and high expectations for cancer risk assessment. It is becoming increasingly apparent that an understanding of the mode of action for rodent carcinogens will often be essential for an adequate risk assessment. Such an understanding will provide insight for interpreting the rodent bioassay results and allow a fuller characterization of the cancer dose-response relationships for many chemicals. Developing the optimal database for a chemical may require a substantial investment of time and resources, imposing practical limitations on the process. Thus, there will be a continuing need for a flexible approach to risk assessment, reflecting the availability of data. Characterization of carcinogenic risks will need to be transparent with clear expression of what is known and what is not known, and with careful articulation of the variability and uncertainty in the assessment.
The issues that we have highlighted point to some fundamental questions that currently confront the risk assessment community. How do we improve our current testing methods to enhance their value for identifying potential human health hazards? What can we do to make these tests more rapid, more accurate, and more cost effective in an environment of diminishing resources, yet one that demands more certainty? How do we exploit current and emerging scientific technology and understanding to bring sensitive and specific assays into the armamentarium of the toxicologist? How can we effectively integrate the rapid expansion of knowledge of carcinogenic processes and modifying factors into cancer risk assessments? What criteria and processes will provide confidence that new and emerging methodologies will adequately protect the public from potentially harmful chemicals? These and other issues pose significant, but likely not insurmountable, challenges to the toxicology and risk assessment communities worldwide as we approach the new millennium.