The significance of mouse liver tumor formation for carcinogenic risk assessment: results and conclusions from a survey of ten years of testing by the agrochemical industry.

A survey was performed on the results of 138 carcinogenicity studies conducted in various mouse strains by the agrochemical industry over the period 1983-1993. Data for liver tumor incidence, liver weight, and histopathology were collected along with data on genotoxicity. Studies were judged positive or negative for liver tumor formation on the basis of apparent dose response, malignancy, and difference from historical control values using a weight of evidence approach. Thirty-seven studies were judged to be positive for liver tumorigenicity in one or both sexes. There was no evidence showing an influence of the mouse strain and the duration of the study on the proportion of positive studies. Although 8 of the chemicals tested in the 138 studies were positive in the Ames test, only one of these was judged positive for carcinogenicity. Only 6 of the 37 positive chemicals had any other reported positive genotoxicity findings. A clear relationship between hepatomegaly at 1 year after exposure and a positive tumorigenic outcome at 18 months or 2 years after exposure was demonstrated. Whereas the average relative liver weight of top dose animals was 110% of control in negative studies, it was 150% in positive studies. Likewise, very few negative studies demonstrated significant pathological findings after 1 year, whereas the majority of positive studies had significant liver pathology. The implications of these findings for extrapolation to humans are discussed.

1Rhône-Poulenc Agro, Sophia Antipolis, France; 2Bayer AG, Toxicology Department, Wuppertal, Germany; 3Zeneca Limited, Central Toxicology Laboratory, Macclesfield, United Kingdom; 4Novartis Crop Protection Inc., Toxicology/Cell Biology, Basel, Switzerland

Carcinogenicity studies are performed in animals to evaluate the risk of long-term exposure to chemicals for humans. The assumption made from the start of the toxicological evaluation process is that the effects found are relevant to humans in the absence of a compelling argument to the contrary. Over the years, organizations such as the National Toxicology Program (NTP) have assembled large databases on carcinogenicity testing which reveal that carcinogenicity studies run in rats are not necessarily predictive of the result of similar studies run in mice (1)(2)(3).
Available databases suggest that about 70% of mouse carcinogens are positive in the rat and 70 to 75% of rat carcinogens are positive in the mouse (2,3). While this situation is consistent with the concept of ensuring that critical effects are not missed, it also provides for the possibility that some of the effects seen in one species are not necessarily relevant for another. Three other findings are clear from examination of such databases: 1) some tumor types are more common in rats than in mice and vice versa; 2) many compounds produce only one tumor type in one of the two test species; and 3) many compounds produce tumors only at the highest dose tested.
In many respects these databases must be suspected of being misleading about the universe of chemicals. Compounds selected were not chosen at random, but were selected according to various criteria, one of which was suspicion of carcinogenicity (1,4). For this reason alone, it would be interesting to look at a group of chemicals that are highly varied in structure, having in common only that they are biologically active in some way that makes them useful in agriculture. In principle, such compounds can be assumed to be free of any convincing evidence of genotoxic activity, at least at the time they were developed. Therefore, for the majority of cases in which agrochemicals show positive results in carcinogenicity testing, they can be considered nongenotoxic carcinogens.
Liver tumors in mice show up as the sole evidence of carcinogenicity so often that their significance to humans has often been debated (5)(6)(7)(8). Indeed, current European Union (EU) legislation considers that if the sole evidence comes from liver tumors in "certain sensitive strains of mice," this should not be regarded as evidence of carcinogenicity. Unfortunately, it is not stated which strains of mice should be considered sensitive.
This study, performed by the European Crop Protection Association (ECPA) with assistance from the North American Crop Protection Association (NACA), has attempted to address the issues raised here. What is the role of toxicity in liver tumorigenesis in mice? Which strains appear more sensitive? Is there a strong correlation between liver weight increases and the induction of liver tumors? And finally, after analyzing these data, under which conditions should the finding of mouse liver tumors be considered as predictive of carcinogenicity in humans either in the liver or in other organs?

Methods
The survey covered mouse carcinogenicity studies conducted between January 1983 and January 1993. Data required were: 1) year of study; 2) duration of study (months); 3) strain of mouse; 4) doses as percentages of the top dose; 5) percentage survival by dose/sex group; 6) percent tumor incidence for hepatocellular adenoma, hepatocellular carcinoma, and other liver tumors; 7) dose/sex group mean body weight after 6 months; 8) dose/sex group mean relative liver weight at 1 year and/or at termination; 9) genotoxicity; and 10) histopathological findings in liver after 1 or 2 years according to the following scale: indistinguishable from control, slight but significant hypertrophy, more severe hypertrophy with or without fatty vacuolation, and severe degenerative changes.
A few submitted studies had to be rejected because the duration was less than 18 months or the study was outside the specified time range.
The assessment focused on the model, i.e., on liver tumor formation in mouse carcinogenicity studies, and was performed blind with respect to compound identity. Furthermore, no information was available on possible oncogenic effects in other organs or in corresponding rat carcinogenicity studies.

Treatment of data. For control data, analyses of variance using duration of study, sex, and strain as independent variables were used to assess the importance of these factors on each of the quantitative endpoints requested in the survey. Separate analyses of variance were carried out for each of the endpoints.
In assessing the data from the treated groups, the first step was to provide an evaluation of whether a study was positive or negative in terms of tumor profile. Information on whether studies were considered positive by the parent company was not requested on the original data sheets. Studies were judged to be positive or negative by a weight-of-evidence evaluation according to the following hierarchy:
* High dose compared to control: The existence of a treatment-related effect was considered possible from a more than 50% increase in response.
* [...] the relationship or reinforce it?
* Trend in malignancy: If a study was considered borderline after steps 1-3, it was regarded as positive if there was a dose-related trend towards a higher proportion of malignant tumors or if the high dose group showed a markedly higher proportion of malignant tumors than all other groups.

None of these differences or trends was tested statistically because the sample was considered too heterogeneous for this to add to the credibility of the results. For example, tumor incidences in control groups are known to be dependent not only on the strain but also on the supplier and the laboratory where the study was performed.
Clearly negative and positive studies were easily identified and the hierarchical approach was applied only to decide the borderline cases. The final decision should be seen as an expert judgment, and no attempt was made to ascertain whether this decision was consistent with that of the company supplying the data. As the identity of the compounds was unknown, no bias from this source was possible.
The importance of duration of study, sex, and strain on the proportion of positive studies was assessed using a logistic regression with sex, strain, and duration as independent variables.
For top dose effects, as defined in subsequent sections, the importance of body weight, survival, relative liver weight, and liver histopathology on the proportion of positive studies was assessed using logistic regression. Separate analyses were carried out with each individual parameter as the independent variable. Analyses for relative liver weight and liver histopathology were also carried out separately for lower dose groups.
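To illustrate the kind of single-parameter logistic regression described above, the following Python sketch fits the probability of a positive study against one top dose parameter by Newton-Raphson. It is not the authors' SAS analysis: the function names `fit_logistic` and `predict` are hypothetical, and the ten data points are invented purely for illustration and are not taken from the survey.

```python
import math


def fit_logistic(x, y, iterations=25):
    """Fit P(positive) = 1 / (1 + exp(-(b0 + b1 * x))) by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0          # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0  # entries of the 2x2 information matrix
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det == 0.0:
            break
        # Newton step: add (information matrix)^-1 times the gradient
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def predict(b0, b1, x):
    """Predicted probability of a positive study at parameter value x."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))


# HYPOTHETICAL example data (not from the survey):
# percent increase in relative liver weight at the top dose, and
# whether the study was judged positive (1) or negative (0).
liver_weight_increase = [0, 5, 10, 15, 20, 30, 40, 50, 60, 70]
study_positive = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1]

b0, b1 = fit_logistic(liver_weight_increase, study_positive)
print(f"intercept={b0:.3f}, slope={b1:.3f}")
```

A positive slope in such a fit would correspond to the association reported in the survey, namely that larger top dose effects go with a higher proportion of positive studies. The sketch assumes the data are not perfectly separable; otherwise the maximum likelihood estimates do not exist and the iteration would not settle.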
All data were stored and analyzed using the SAS software package (SAS Institute, Cary, NC) (9).

Survey Response
A total of 138 valid studies was received from 10 companies. The number of studies available for each strain, subdivided by duration, is summarized in Table 1. Seven strains of mice were used in the reported studies, although only four of these had 10 or more studies available. These were the CD-1 (67 studies), NMRI (31 studies), B6C3F1 (21 studies), and C57BL/10J (10 studies) strains. Alpk:AP(Swiss), SPF ICR (JCL:ICR), and Tif:MAGf mice were used in 1, 2, and 6 reported studies, respectively.
The studies ranged in duration from 18 to 27 months, but for the purposes of this report they were classified into three groups.

Control Group Data
Table 2 contains a summary of survival at study termination, absolute body weight at 6 months, and tumor incidence for the control groups for each study duration, split by sex and strain of mouse. The most important findings are listed below. There were clear statistically significant (p<0.01) differences in the survival rates for the control groups in the different strains. The B6C3F1 had very good survival to 2 years (about 80%), whereas the other strains averaged approximately 40-55% at 2 years. There were no conclusive differences in survival patterns between the sexes. Not surprisingly, survival was lower as the duration increased.
There were clear statistically significant (p<0.01) differences in absolute body weights at 6 months for the different strains, which were independent of study duration and sex. The C57BL/10J and B6C3F1 mice tended to be smaller; the Alpk:AP(Swiss), Tif:MAGf, and SPF ICR (JCL:ICR) mice were larger; and the body weights of NMRI and CD-1 mice were intermediate.

For hepatocellular adenomas and carcinomas, the incidence in males was generally higher than that seen in females. The incidence of these tumors in the C57BL/10J mice was very low, the highest incidences being seen in Tif:MAGf, B6C3F1, Alpk:AP(Swiss), and CD-1 mice. There was no conclusive evidence for an increase in tumor incidence with study duration in those strains for which this comparison could be made.
Other liver tumors were clearly more prevalent in studies >23 months in duration than in shorter-term studies. Their incidence was particularly high for the Tif:MAGf, C57BL/10J, and male Alpk:AP(Swiss) mice. Because these tumor types were not usually further identified in the survey forms, they were not analyzed further.

Treated Group Data
Classification of tumorigenicity. Of the 138 studies reported, 101 were judged to be negative in terms of tumorigenicity in both sexes, and 16 were positive in both sexes (Table 3). Of the remaining 21 positive studies, 16 were positive in males only and 5 were positive in females only. Although the incidence of positive studies was higher in males than in females, this difference was insufficient to attain statistical significance (p = 0.09).
The proportion of positive studies was not affected by either the duration of the study (Table 4) or the strain of mouse (Table 5).
Genotoxicity. Only 8 of the chemicals tested in the 138 studies showed positive Ames tests. Of these 8 studies, 7 were considered negative for tumors in both sexes and 1 was positive for tumors in males only.
Of the 37 studies judged positive for tumors, only 1 was Ames positive and only 6 showed a positive result in any genotoxic assay (predominantly chromosomal in vitro).
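The counts reported above can be arranged as a simple two-by-two comparison. The Python sketch below derives the proportion of tumor-positive studies among Ames-positive and Ames-negative compounds. It assumes, as a simplification not stated in the survey, that each of the 138 studies corresponds to one chemical with an Ames result; the variable and function names are illustrative only.

```python
# Counts reported in the survey
total_studies = 138
tumor_positive = 37
ames_positive = 8
ames_positive_tumor_positive = 1

# Derived counts (ASSUMPTION: every study has exactly one Ames result)
ames_negative = total_studies - ames_positive
ames_negative_tumor_positive = tumor_positive - ames_positive_tumor_positive


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


rate_ames_positive = percent(ames_positive_tumor_positive, ames_positive)
rate_ames_negative = percent(ames_negative_tumor_positive, ames_negative)
rate_overall = percent(tumor_positive, total_studies)

print(f"Ames positive: {rate_ames_positive:.1f}% tumor positive")  # 12.5%
print(f"Ames negative: {rate_ames_negative:.1f}% tumor positive")  # 27.7%
print(f"Overall:       {rate_overall:.1f}% tumor positive")        # 26.8%
```

Under this assumption, a positive Ames test did not go along with a higher proportion of tumor-positive studies, although with only 8 Ames-positive compounds the comparison is necessarily imprecise.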
Survival effects. Survival effects were calculated as the top dose group percentage survival minus the control group percentage survival. Top dose survival effects are summarized in Table 6.
No conclusive difference was seen in the survival effects between the sexes. A small but statistically significant (p = 0.02) association was seen between tumorigenic response and reduced survival for males only. On the basis of survival and body weight as described below, however, there was no evidence of excessive toxicity as a confounding effect in the overall database.
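The survival effect defined above is a simple difference in percentage points. A minimal Python sketch follows; the function name and the example survival figures are hypothetical and are not values from the survey.

```python
def survival_effect(top_dose_survival_pct, control_survival_pct):
    """Top dose percentage survival minus control percentage survival.

    A negative value means survival was lower in the top dose group.
    """
    return top_dose_survival_pct - control_survival_pct


# HYPOTHETICAL example: 48% survival at the top dose, 60% in controls
example_effect = survival_effect(48.0, 60.0)
print(example_effect)  # -12.0 percentage points
```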
Body weight effects. Body weight effects were calculated as the difference in absolute body weight at 6 months between the top dose group and the control group, expressed as a percentage of the control group body weight. Top dose body weight effects are summarized in Table 7.
A body weight reduction in excess of 10% was seen in 14% of the studies, with a total of 49% showing an effect in excess of 5%. No top dose reduction in body weight was seen in 13% of studies. Top dose body weight reductions were statistically significantly (p<0.01) greater in males than in females.
The information at 6 months can only provide a snapshot of the actual body weight effect achieved, in that different effects could have been observed earlier or later in a study. Also, examination of body weight gain would have given a greater percentage difference, but this could not be calculated from the information requested in the survey. However, it would appear that [...] on the basis of body weight alone. There was, however, no conclusive evidence for an association between the body weight effect achieved and the carcinogenic response.
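As a sketch of the body weight calculation described in this subsection, the Python functions below compute the top dose effect as a percentage of the control value and sort it into the bands used in the text (no reduction, a reduction of more than 5%, a reduction of more than 10%). The function names, the band labels, and the example weights are hypothetical.

```python
def body_weight_effect(top_dose_weight, control_weight):
    """Difference from control at 6 months, as a percentage of control."""
    return 100.0 * (top_dose_weight - control_weight) / control_weight


def classify_reduction(effect_pct):
    """Sort a body weight effect into the reduction bands used in the text."""
    if effect_pct >= 0.0:
        return "no reduction"
    reduction = -effect_pct
    if reduction > 10.0:
        return "reduction > 10%"
    if reduction > 5.0:
        return "reduction > 5% up to 10%"
    return "reduction up to 5%"


# HYPOTHETICAL example: control mean 40.0 g, top dose mean 35.0 g
effect = body_weight_effect(35.0, 40.0)
print(effect, classify_reduction(effect))  # -12.5 reduction > 10%
```

Note that the 49% figure quoted in the text is cumulative, that is, it includes the studies with a reduction in excess of 10%, whereas the bands in this sketch are mutually exclusive.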
Liver histopathology. Figures 1 and 2 show the relationship between tumorigenic response and liver histopathology findings for males and females at 1 and 2 years, respectively. A clear statistically significant (p<0.01) association is present for both sexes at both 1 and 2 years at the top dose. This was also present for the highest middle doses in both sexes at 2 years and in males only at 1 year. Histopathological findings were clearly more severe in positive studies than in negative studies. Indeed, the vast majority of findings of more severe hypertrophy with or without fatty vacuolation and severe degenerative changes were seen in positive studies. It was usual to see no histopathological effects at the low dose. These findings were present at 1 and 2 years, despite the increased background histopathology likely to be present at 2 years.
Relative liver weight. Relative liver weight effects were calculated as the treated group mean expressed as a percentage of the control group mean. Figures 3 and 4 show the relationship between the tumorigenic response and the relative liver weight effect at 1 and 2 years, respectively, by sex and dose level. A statistically significant association (p<0.01) was seen between tumorigenic response and increased relative liver weight at 1 and 2 years in both sexes at the top dose. This was also seen at the highest middle doses for both sexes at 1 and 2 years (p<0.05).
Relative liver weight effects were clearly greater in positive studies than in negative studies. Indeed, an increase in relative liver weight of less than 20% at 1 year would appear to be highly predictive of a negative tumor response at 2 years. Relative liver weights at the low dose were generally unaffected. The findings at 1 and 2 years were consistent, although the 2-year data are subject to more uncertainty owing to the possible inclusion or exclusion of tumor-bearing animals from the reported results.
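The relative liver weight effect and the 20% observation above can be expressed in a few lines of Python. This is a sketch only: the function names are hypothetical, the example values are the average effects quoted in the abstract (110% and 150% of control), and the 20% cut-off is the survey's empirical observation rather than a validated decision rule.

```python
def relative_liver_weight_effect(treated_mean, control_mean):
    """Treated group mean expressed as a percentage of the control group mean."""
    return 100.0 * treated_mean / control_mean


def below_20_percent_increase(effect_pct, threshold_increase_pct=20.0):
    """True if the increase over control (effect minus 100) is under the threshold."""
    return (effect_pct - 100.0) < threshold_increase_pct


# Average top dose effects quoted in the abstract
average_negative_studies = 110.0
average_positive_studies = 150.0

print(below_20_percent_increase(average_negative_studies))  # True
print(below_20_percent_increase(average_positive_studies))  # False
```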

Discussion and Conclusions
More than half of the chemicals tested to date in chronic rodent bioassays have been found to be tumorigenic in both rats and mice at the high doses administered. Even though the majority, in amount and number, of the chemicals to which humans are exposed are natural, only 77 [... (10) and references cited therein]. The high number of positives is not simply the result of selective testing because of a priori suspicion of mutagenicity and carcinogenicity. Many chemicals were selected because they were high-volume chemicals and were widely used in industry, crop protection, or food technology. The liver is the most frequent site of carcinogenicity in both rats and mice in these experiments (11,12). This fact is illustrated in a recent review (11) of 299 mouse tumorigens in a large database, of which 171 (57%) induced tumors in the liver in mice, and of 354 rat tumorigens, of which 143 (40%) induced tumors in the liver in rats. In the current survey, about 27% of the mouse bioassays reported were judged to be positive in terms of hepatocellular tumorigenicity. This does not necessarily mean that they were considered positive by the company submitting the data or by regulators, and the same is true for negative studies. The percentage of positive studies in the current survey is lower than that reported in the review (11), which may be due to the fact that the compounds were selected to be developed after excluding compounds with alerts for carcinogenicity. It could also be due to a different threshold for judging a study positive. In our approach, positive or negative carcinogenicity was judged by an examination of the study on its own and against the control data of other studies in the same strain, not by statistics.
Nevertheless, a significant proportion of the compounds tested were liver tumorigens, and this requires consideration of the mechanisms underlying spontaneous and chemically induced expression of mouse liver tumors and its likely significance for humans. Spontaneous formation of hepatocellular neoplasms in mice and rats may differ because spontaneous liver tumors are relatively rare in most strains of rats (13) but are frequent in many strains of mice. The historical control incidence for liver tumors in the NTP studies was 35% for male B6C3F1 mice and 13% for females, while it was 3% for male F344 rats and 1% for female F344 rats, indicating not only species but also sex differences with regard to spontaneous liver tumors (14). The existence of sex differences in spontaneous rates and in sensitivity to chemically induced tumors was confirmed in this study, although the difference in sensitivity was not dramatic.
Interspecies comparisons between rats and mice of liver-specific tumorigenic effects revealed that a chemical tumorigenic to the liver of male and/or female rats was four times more likely to be a mouse liver carcinogen than was a chemical not carcinogenic to rat liver (76% vs. 19%) (12). On the other hand, there appears to be no particular tendency for mouse liver tumors to predict rat liver tumors (15). These observations suggest a significantly higher sensitivity of the mouse liver towards tumorigenic effects and may question the predictivity of mouse liver tumorigenicity data for such effects in the rat. The predictivity is probably even less for humans, in whom chemically induced hepatocellular tumors have been described as rare events (16,17). In addition, major interstrain differences exist among mice. The C3H/HeN mouse, for example, is known to exhibit extremely high (up to 100%) incidence, whereas the C57BL/6N and BALB/cA strains are characterized by a very low (<1%) frequency of spontaneous hepatomas (18). These large strain differences in spontaneous rates were confirmed, but we did not find evidence of any striking difference in sensitivity between strains based on the criteria we used to judge positive studies.

Volume 105, Number 11, November 1997 * Environmental Health Perspectives
Among the 37 studies judged to be positive for mouse liver tumor formation in this survey, only one was Ames positive. This means that nearly all positive results were obtained with compounds which are likely to induce liver tumors by nongenotoxic mechanisms. The analysis of our data has demonstrated a strong association between the outcome of positive tumorigenicity and increased liver weight at 1 year. Top dose liver weights in positive studies averaged 150% of control values, whereas negative studies were at about 110%. [Strongly increased liver weights as a biological response might provide a good indication that a sufficient dose was administered. As an alternative to maximum tolerated dose (MTD) values based on reduced body weights, one might interpret a dose level that causes a strong increase in the (relative) liver weight as sufficient to represent an MTD.] Likewise, the association between positive histopathological findings (hypertrophy and/or degenerative changes) and liver tumor formation was equally strong. Very few negative studies had any positive findings in the 52-week histopathology, whereas the majority of positive studies had at least minor pathological changes. These findings, while not implying a common mechanism, suggest that for the investigated nongenotoxic compounds the features of hepatomegaly and/or histopathological changes at 1 year are predictive of a significantly increased chance of observing liver tumors at the final sacrifice.
Various hypotheses exist on the mechanisms by which nongenotoxic hepatocarcinogens induce hepatomegaly and/or histopathological changes, and finally tumor formation (19)(20)(21). Upon subchronic administration, many of these compounds stimulate hepatocyte proliferation and/or act as liver enzyme inducers. Compounds that act as liver mitogens exert a direct proliferative stimulus on hepatocytes. Alternatively, compounds that are cytotoxic to hepatocytes stimulate cell proliferation indirectly; this compensatory hyperplasia is regarded as a regenerative process in response to sustained cytotoxicity. Liver enzyme induction can be observed with many compounds upon short-term treatment and is interpreted as an adaptive, reversible response of the organ to a functional load. The increased incidences of liver tumors, as observed with such compounds upon long-term treatment of laboratory rodents, are generally confined to high dose levels at which the compounds interfere with normal liver homeostasis, i.e., act as mitogens, cytotoxicants, or enzyme inducers.
The effects caused by these nongenotoxic compounds do not imply a direct interaction with DNA and are therefore likely to be threshold related, i.e., dose levels exist at or below which no response will take place (20,21). If prolonged disturbance of organ function is a prerequisite of a positive outcome in the majority of studies, it casts considerable doubt on the potential to extrapolate these findings to lower levels of exposure where hepatomegaly or liver toxicity are absent. Consequently, these high dose phenomena are unlikely to occur under conditions of human exposure.
Additional mechanistic features specific to the mouse may support the view that tumor formation in mice is related to liver cell proliferation or enzyme induction and may differ from humans. There is good evidence for the existence of a considerable number of initiated or latent tumor cells specifically in mouse liver. A promotional effect of mouse liver carcinogens may be characterized by a proliferation stimulus on these cells that is specific for this species (22)(23)(24)(25). In addition, monooxygenase activities in the mouse liver are generally higher than those in rats or humans; among them are aryl hydrocarbon hydroxylase and biphenyl 2-hydroxylase, two activities mainly catalyzed by cytochrome P450 isoenzymes of gene family CYP1A (26)(27)(28). These isoenzymes are known to be efficient in the production of reactive intermediates, and their constitutive presence in mouse liver, albeit at low levels, suggests a certain endogenous initiating potential in this species, particularly affecting unprotected replicating DNA under conditions of increased mitogenic activity, which would not occur in humans (26).
In considering the relevance of mouse bioassays to human risk assessment, in addition to the mechanism of liver tumor formation in the mouse, risk factors relevant for carcinogenesis in humans are to be considered as well. Two questions are to be answered: Can potential human carcinogens be identified by a mouse liver tumor response? More specifically, can potential human liver carcinogens be identified by a mouse liver tumor response?
Benzidine provides an answer to the first question. Like some other genotoxic aromatic amines, benzidine is a bladder carcinogen in humans and induces liver tumors in rodents. In this case, liver tumor formation in mice and rats is obviously predictive for tumor formation in humans in another organ, i.e., the bladder (29,30). However, aromatic amines are genotoxic carcinogens, and their main mode of action is tumor initiation by covalent binding to DNA following metabolic activation (31). To our knowledge, there is no corresponding example of a nongenotoxic mouse liver carcinogen that was identified as a carcinogen in humans in another organ.
The following discussion will therefore concentrate on epidemiological data and etiological considerations that may elucidate the differences between liver tumor-inducing agents in rodents and humans.
Epidemiological evidence from the clinical use of high doses of phenobarbitone, a noncytotoxic stimulator of cell proliferation and inhibitor of apoptosis as well as an enzyme inducer in the rodent liver (21,32,33), for more than 80 years indicates neither increased incidences of hepatic cancer nor the formation of tumors at any other site in humans (34). The main risk factors for hepatocellular carcinomas in humans are viral hepatitis, alcoholism, and aflatoxin. In high incidence areas, viral infection is the predominant risk factor, whereas in the Western world alcohol is the single most important risk factor (35)(36)(37). Viral hepatitis has been proposed to be responsible for 75-90% of all hepatocellular carcinomas worldwide (38). For high incidence areas, the combined effects of hepatitis, aflatoxin, and alcohol complicate a risk calculation. Synergistic effects of ethanol and hepatitis on liver cancer have been observed in Japan (39)(40)(41)(42) and urban regions of Africa (43).
Although liver cirrhosis usually coexists with hepatocellular carcinoma, some patients with hepatitis B infection may develop hepatocellular carcinomas that are not superimposed on cirrhosis, indicating that viral hepatitis may exert some direct effect on hepatocarcinogenesis (37,44). Hepatocellular carcinomas and cirrhosis appear to share a common etiology, and the likely culprit in the high incidence areas is persistent infection with hepatitis B virus (45). In contrast, in low incidence areas, alcohol-related cirrhosis is probably the most important cause of hepatocellular carcinoma (35).
Even in the absence of cirrhosis, hepatocellular carcinomas in humans usually do not occur in an otherwise histologically normal liver. In an Italian study (46), more than 140 cases of hepatocellular carcinoma were investigated, and all but one occurred in patients with histologic evidence of liver cirrhosis. In a series of 618 hepatocellular carcinomas in Japanese patients, 66 occurred without cirrhosis, but only 3 cases (0.5%) lacked histological findings such as dysplasia and fibrosis in the surrounding liver (47).
The impact of natural or man-made chemicals on human hepatocellular carcinoma epidemiology is doubtful or very small (48). There were a few recent reports of an increased incidence of liver tumors in chemical workers (49)(50)(51) and in solvent-exposed workers in the Nordic countries (52,53). Epidemiological evidence for the induction of hepatocellular carcinomas by environmental pollution has not been reported except for a few reports from China. An association between hepatocellular carcinomas and drinking of ditch water was reported by several authors (54)(55)(56); however, this association could not be attributed to pesticides (55,57) and is more likely virus related.
It can be seen that the vast majority of liver tumors in humans can be attributed to known causes mostly associated with hepatitis and/or cirrhosis. In addition, the mechanism of liver tumorigenesis in mice and humans appears to differ, casting doubt on whether mouse carcinogenicity studies are predictive of a potential carcinogenic effect in humans. Moreover, there does not appear to be a corresponding public health problem. This leads to two alternative conclusions: either the high proportion of positive mouse (and, to a lesser extent, rat) studies is an artifact of their design and has very little relevance to humans, or the potential demonstrated by these studies may be relevant to very high levels of exposure but not to trace quantities typical of the everyday environment.