The OECD program to validate the rat uterotrophic bioassay. Phase 2: dietary phytoestrogen analyses.

Many commercial laboratory diets have detectable levels of isoflavones (e.g., phytoestrogens such as genistein [GN]) that have weak estrogenic activity both in vitro and in vivo. During validation studies of the uterotrophic bioassay, diet samples from 20 participating laboratories were collected and analyzed for three major phytoestrogens: GN, daidzein (DN), and coumestrol (CM). Soy phytoestrogens GN and DN were found at total phytoestrogen levels from 100 to 540 µg/g laboratory diet; a forage phytoestrogen, CM, ranged from nondetectable to 4 µg/g laboratory diet. The phytoestrogen levels were compared with both baseline uterine weights of the control groups and with the relative uterine weight increase of groups administered two weak estrogen agonists: bisphenol A (BPA) and nonylphenol (NP). The comparison uses a working assumption of additivity among the phytoestrogens, despite several significant qualifications to this assumption, to estimate total genistein equivalents (TGE). Some evidence was found that phytoestrogen levels in the diet > 325-350 µg/g TGE could diminish the responsiveness of the uterotrophic bioassay to weak agonists. This was especially true for the case of the intact, immature female version of the uterotrophic bioassay, where higher food consumption relative to body weight leads to higher intakes of dietary phytoestrogens versus ovariectomized adults. This dietary level is sufficient in the immature female to approach a biological lowest observable effect level for GN of 40-50 mg/kg/day. These same data, however, show that low to moderate levels of dietary phytoestrogens do not substantially affect the responsiveness of the assay with weak estrogen receptor agonists such as NP and BPA. Therefore, laboratories conducting the uterotrophic bioassay for either research or regulatory purposes may routinely use diets containing levels of phytoestrogens < 325-350 µg/g TGE without impairing the responsiveness of the bioassay.

The Organisation for Economic Cooperation and Development (OECD) initiated a high-priority activity in 1997 to develop new and revised guidelines for screening and testing potential endocrine disrupters (OECD 1998a). One activity is to validate the uterotrophic bioassay to screen suspected estrogen receptor (ER) agonists or antagonists to identify those with in vivo activity and to evaluate positive compounds for definitive testing. The validation program demonstrates that the uterotrophic bioassay is reproducible and reliable for a potent reference estrogen, 17α-ethinyl estradiol (EE); five weak ER agonists, bisphenol A (BPA), genistein (GN), methoxychlor, nonylphenol (NP), and 1,1,1-trichloro-2,2-bis(o,p´-chlorophenyl)ethane or o,p´-DDT; and a negative chemical, dibutylphthalate (Kanno et al. 2001, 2003a, 2003b).
Among several protocol parameters, phytoestrogens in the laboratory diet are a possible source of interference with the uterotrophic bioassay. There are rare reports that laboratory diets induced statistically significant increases in uterine weights or interfered with the bioassay (Drane et al. 1975, 1980; Huggins et al. 1954; Zarrow et al. 1953). Analyses show that many laboratory diets contain GN, daidzein (DN), and other phytoestrogens at levels of 100 µg/g diet (Boettger-Tong et al. 1998; Brown and Setchell 2001; Degen et al. 2002; Thigpen et al. 1999a, 1999b). Isolated phytoestrogens, such as GN, DN, and coumestrol (CM), have biological activity consistent with an estrogen mode of action (Whitten and Patisaul 2001). Phytoestrogens are weak agonists for the ER (Branham et al. 2002) and elicit statistically significant increases in uterine weights at high doses in the uterotrophic bioassay (Bickoff et al. 1962; Farmakalidis et al. 1985; Farmakalidis and Murphy 1984; Jefferson and Newbold 2000; Kanno et al. 2003a, 2003b; Markaverich et al. 1995; Odum et al. 1997; Perel and Linder 1970). The best-characterized phytoestrogen, GN, induces uterine weight increases, accelerates vaginal opening, and elicits other responses associated with estrogenic activity at doses of 300 µg/g diet or 30 mg/kg/day (Casanova et al. 1999; Fritz et al. 1998; Santell et al. 1997; You et al. 2002a, 2002b).
Three issues are addressed in this article: a) whether changes in uterine weights are associated with phytoestrogen levels in laboratory diets, suggesting a possible interference with the uterotrophic bioassay; b) what levels of phytoestrogens may lead to interference and should be avoided; and c) what other instances of apparent interference are present and what their causes are. This article addresses these questions by comparing the results of uterotrophic bioassays from the OECD phase 2 validation studies using both the intact, immature and the adult ovariectomized (OVX) versions of the uterotrophic bioassay (Kanno et al. 2003a, 2003b). Laboratory diets were submitted for phytoestrogen analyses, and the response of the bioassay with two particularly weak agonists, BPA and NP, was examined.
The analytical method was based on Setchell et al. (1987) and has been used for both food and biological samples (Morton et al. 1994; Thigpen et al. 1999b). The coefficient of variation for intra-assay precision was 3-13% (Morton et al. 1994) and was determined prior to the present study by multiple analyses of the same diet batch on separate occasions where results were within 10%.
GC-MS was carried out on a DB5 MS bonded silica capillary column (10 m × 0.25 mm, phase thickness 0.25 µm) using helium as carrier gas and a temperature program of 70-300°C at 40°C/min. Isotope dilution MS was performed using selective ion monitoring at mass 425 for DN, 429 for d4-DN, 555 for GN, 559 for d4-GN, 496 for CM, and 500 for d4-CM. Peak area ratios were determined for analytes and internal standards. Calibration curves were constructed, and sample concentrations of GN, DN, and CM were determined.
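The quantification step described above can be illustrated with a minimal Python sketch. It assumes a linear calibration of the analyte/internal-standard peak area ratio against known analyte concentration, which the text does not specify. The function names, calibration standards, and peak areas are hypothetical and are not data from this study.

```python
def fit_calibration(std_conc, std_ratio):
    """Least-squares line: area_ratio = slope * concentration + intercept."""
    n = len(std_conc)
    mean_x = sum(std_conc) / n
    mean_y = sum(std_ratio) / n
    sxx = sum((x - mean_x) ** 2 for x in std_conc)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(std_conc, std_ratio))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def concentration_from_ratio(area_ratio, slope, intercept):
    """Invert the calibration line to estimate the analyte concentration."""
    return (area_ratio - intercept) / slope


# Hypothetical calibration standards (µg analyte/g) and measured
# analyte / deuterated-standard peak area ratios (assumed values).
std_conc = [0.0, 50.0, 100.0, 200.0, 400.0]
std_ratio = [0.0, 0.25, 0.5, 1.0, 2.0]
slope, intercept = fit_calibration(std_conc, std_ratio)

# Hypothetical sample: analyte peak area divided by internal-standard peak area.
sample_ratio = 7500.0 / 10000.0
sample_conc = concentration_from_ratio(sample_ratio, slope, intercept)
print(f"estimated concentration: {sample_conc:.1f} µg/g diet")
```

Taking the ratio to a co-eluting deuterated standard is what makes the estimate insensitive to losses during extraction and injection, since analyte and standard are affected alike.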
Uterotrophic bioassay and statistics. All conditions and procedures for the uterotrophic bioassay and statistical methods have been reported previously (Kanno et al. 2001, 2003a, 2003b). Studies were performed in accordance with OECD's animal care guidelines (OECD 2000) and appropriate national regulations. The BPA and NP results are expressed as the ratio of the geometric means of uterine weights (relative to vehicle control) after adjusting for body weight (bw) of the animal at necropsy, along with lower and upper 95% confidence levels for those means.
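As a simplified illustration of the response measure, the sketch below computes the ratio of geometric mean uterine weights for a treated group relative to its vehicle control. It deliberately omits the body weight adjustment and the 95% confidence limits used in the published analysis, so it is not the statistical method of Kanno et al. The function names and uterine weights are hypothetical.

```python
import math


def geometric_mean(values):
    """Geometric mean computed on the log scale."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def relative_uterine_weight(treated_mg, control_mg):
    """Ratio of geometric means (treated / vehicle control).

    Assumption: no adjustment for body weight at necropsy is made here.
    """
    return geometric_mean(treated_mg) / geometric_mean(control_mg)


# Hypothetical blotted uterine weights (mg) for one laboratory.
control = [20.0, 25.0, 31.25]
treated = [40.0, 50.0, 62.5]
ratio = relative_uterine_weight(treated, control)
print(f"relative uterine weight: {ratio:.2f}")
```

A ratio of 1 means no increase over vehicle control; in the figures discussed below, a response is read as statistically significant when the lower 95% confidence limit of this ratio exceeds 1.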

Results
Analytical results. The analytical results for GN, DN, and CM are shown in Table 2.
The results indicate a substantial presence of soy products with all diets containing detectable GN and DN, with DN consistently lower than GN. Samples 17 and 22 had the lowest total GN and DN levels of approximately 100 µg phytoestrogens/g diet, sample 26 had the highest levels with 541 µg phytoestrogens/g diet, and most samples had levels between 150 and 350 µg phytoestrogens/g diet. In contrast, CM, found in forage crops such as alfalfa, was not detected in 8 of 26 samples and did not exceed 4.1 µg/g diet in any sample.
In three cases, the same lot number of diet was submitted by more than one laboratory. These three possible duplicates were diet codes 4 and 8, diet codes 5 and 9, and diet codes 12 and 13 (Tables 1 and 2). Although these were not precise split samples, it was interesting to compare their analytical results. Analyses of samples 4 and 8 closely correspond for all three substances. For samples 5 and 9, the correspondence is excellent for DN, but there is an approximately 20% difference in GN levels at 216 and 170 µg/g diet, respectively. For samples 12 and 13, the DN and CM correspondences are excellent, but the GN analyses differ with 218 and 180 µg/g diet, respectively. Where different lots of the same diet were submitted, the pattern of analytical results was consistent within a diet.
Food consumption. Eight participating laboratories recorded food consumption for intact, immature females, and three recorded data for OVX females. Intact, immature females displayed a rapid rise in food consumption from approximately 2-4 g/animal on day 1 to 6-11 g/animal on day 4 before necropsy. As intact, immature animals were group housed with no fewer than three animals per cage, the value is based on mean food consumption per animal for a cage. For the OVX females, food consumption was more stable at 14-24 g/animal on day 4 among the laboratories. All amounts may include wastage and spillage.
Where animals had diminished increases in body weight or even body weight losses, food consumption was lower, and these cases were not considered representative. The approximate food intake ranges used below were then 130-170 g/kg bw/day for immature animals and 60-75 g/kg bw/day for OVX animals.
Vehicle control uterine weights. The blotted vehicle control uterine weights from the dose-response and coded single-dose studies have been previously reported (Kanno et al. 2003a, 2003b). However, additional unreported controls for EE doses were conducted in some laboratories during the dose-response studies, and these are included herein. The arithmetic mean blotted weights for the intact, immature groups ranged from a minimum of 14.8 mg to a maximum of 58.0 mg in 60 groups. As shown in Figure 1A, when these data are plotted in rank numerical ascending order, an extended tail of high vehicle control uterine weights appears in five groups from laboratories 6, 20, and 21. These high uterine weight values are double the vehicle control uterine values recorded at the lower end of the range and are a possible concern for reducing the dynamic range and responsiveness of the uterotrophic bioassay. A similar tail was not clearly evident in the adult OVX vehicle control groups (Figure 2), where the arithmetic mean blotted weights ranged from 71.5 mg to 110.7 mg in 34 groups. The highest OVX control uterine weight value was again recorded in laboratory 6.
One other observation was notable for the adult OVX animals. In phase 1, the recommended time of regression between ovariectomy and the first substance administration was 10 days. This was changed to ≥ 14 days in phase 2; the added time appears to allow a further decrease in the mean blotted vehicle control weights (Figure 2) and should slightly improve the responsiveness of the OVX version.
Body weights and uterine weights. Differences in body weight are one factor that could lead to differences in blotted uterine weights. Therefore, the mean uterine weights of control groups were plotted against their respective body weights.
There is a modest increase in uterine weight with increasing body weight in the intact, immature animals (Figure 3). Five laboratories have been highlighted in Figure 3 and include laboratories 6, 20, and 21 noted above. First, high body weights do not account for higher blotted uterine weight values in these laboratories. Second, the lowest blotted uterine weights were in laboratory 14.

Mini-Monograph | Uterotrophic bioassay validation-dietary analyses
Environmental Health Perspectives • VOLUME 111 | NUMBER 12 | September 2003

Figure 1. Group mean absolute blotted uterine control weights (mg) for all intact, immature female groups in phase 2 dose-response and coded single-dose studies where dietary analyses were performed. The weights are arranged in ascending rank order to illustrate the distribution.

Figure 2. Group mean absolute blotted uterine control weights (milligrams) for all adult OVX groups in phase 2 dose-response and coded single-dose studies where dietary analyses were performed (dark blue diamonds), and for all adult OVX groups in phase 1 where no dietary analyses were performed (light blue squares). The weights are arranged in ascending rank order to illustrate the distribution. Animals in phase 1 underwent a 10-day period of regression after ovariectomy before test substances were administered, and animals in phase 2 underwent a 14-day period of regression.

Figure 3. Group mean absolute blotted uterine control weights (milligrams) plotted against the respective mean absolute body weights for intact, immature female groups in phase 2. The results of specific groups from laboratories 6, 14, 19, 20, and 21 that are discussed in the text have been highlighted in light blue and labeled.

Table 2 notes. ND, not determined. a The equivalency factor used to convert DN to GN was 0.8, and the equivalency factor used to convert CM to GN was 10. The converted microgram per gram diet values were then summed to give TGE. b In the case of laboratory 5 and diet sample 3, it was discovered that the diet sample had not been used in the uterotrophic studies, but had been used in parallel studies to validate the castrated male or Hershberger bioassay and was submitted in error. Although the phytoestrogen analyses are reported in Table 2, these data have not been used in Figures 3-6.
Third, laboratory 19 was noted to be less responsive in both dose-response and coded single-dose studies (Kanno et al. 2003a, 2003b), but its control uterine weights appear to be unremarkable. In the OVX studies, uterine weights were relatively uniform, and no association with body weight is discernible (data not shown).
Calculation of genistein equivalents. To assess any interaction between dietary phytoestrogens and uterine weights on responsiveness, a working assumption was that different phytoestrogens interact in a simple, additive manner. This permits a proxy calculation of total genistein equivalents (TGE) in the diet.
The assumption of additivity has significant qualifications. First, there are two forms of the ER, α and β, with different tissue distributions (Kuiper et al. 1996, 1997) and with some differences in binding affinity, particularly for phytoestrogens (Kuiper et al. 1997, 1998). Second, the ability of the ER to mediate gene transcription depends upon interaction with a set of co-activators at the external surface of the ligand-binding domain (McKenna et al. 1999; Moras and Gronemeyer 1998; Xu et al. 1999). These co-activators are tissue dependent, supporting the concept of selective estrogen modulators to explain the differential and even opposite response of certain ligands in one tissue and not another (Safe et al. 2001; Shang and Brown 2002). A classic example is tamoxifen, an estrogen antagonist in breast tissue, but a partial agonist in uterine tissue. Therefore, data from the same tissue and end point, for example, an increase in uterine weight, should be used to construct any estimated estrogen equivalents, and extrapolations to other tissues and end points should be done with care.
The second qualification arises from data generated from the co-administration of several estrogenic compounds in the uterotrophic bioassay (Edgren and Calhoun 1957, 1960, 1961). These results question direct additivity and linearity of the equivalency assumption across a range of doses, suggesting some degree of additivity in the lower region of the dose-response curve and antagonistic activity in the higher region of the dose-response curve. Because the concern here is in the lower region of the dose-response curve and all data used are deliberately drawn from that region, additivity will be presumed. The third qualification rests on the need to have relevant, high-quality, and comparable in vivo uterine data for each chemical. However, the available uterotrophic data for GN, DN, and CM are fragmented between rat and mouse, are often from vaguely described protocols, have different selected doses and spacing of those doses and different routes of administration, and use wet uterine weights with imbibed fluid in some studies and blotted uterine weight in others (Bickoff et al. 1962; Farmakalidis and Murphy 1984; Farmakalidis et al. 1985; Jefferson and Newbold 2000; Markaverich et al. 1995; Odum et al. 1997; Perel and Linder 1970). A review of these data supports the qualitative conclusions that a) GN is slightly more potent orally than DN; and b) CM is significantly more potent orally than either GN or DN, probably by an order of magnitude or more. Stressing the high degree of uncertainty, the values chosen for equivalency factors are 0.8 to convert DN into GN units on a weight basis, and 10 to convert CM into GN units on a weight basis.
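The additive TGE calculation can be written out as a short Python sketch. The equivalency factors (0.8 for DN, 10 for CM) are those stated in the text. The function name, the treatment of a nondetected CM value as zero, and the example diet composition are assumptions made for illustration only.

```python
# Equivalency factors stated in the text (weight basis, relative to GN).
DN_TO_GN = 0.8
CM_TO_GN = 10.0


def total_genistein_equivalents(gn_ug_g, dn_ug_g, cm_ug_g=0.0):
    """Return TGE in µg/g diet under the working assumption of additivity.

    Assumption: a nondetected CM level is entered as 0.0.
    """
    return gn_ug_g + DN_TO_GN * dn_ug_g + CM_TO_GN * cm_ug_g


# Hypothetical diet: 200 µg GN/g, 100 µg DN/g, 2 µg CM/g.
tge = total_genistein_equivalents(200.0, 100.0, 2.0)
print(f"TGE = {tge:.0f} µg/g diet")  # 200 + 80 + 20
```

Because of the factor of 10, even the few micrograms of CM per gram found in some diets contribute noticeably to the total, which is why CM is carried through the calculation despite its low absolute levels.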
Calculation of total phytoestrogen intake. To calculate approximate TGE intake per kilogram body weight per day, conversion factors are needed for both immature and OVX animals. For the immature animals, an average 55 g body weight and average intake of 8.2 g/day of laboratory diet yields a factor of 150 g/kg bw/day to convert the dietary level of TGE into a daily intake. For the OVX animals, an average 260 g body weight and average intake of 18.5 g/day laboratory diet yields a conversion factor of 67.5 g/kg bw/day. An essential point evident from these conversion factors is that phytoestrogen intakes of immature animals on a body weight basis will be slightly greater than 2-fold those of OVX animals on the same lot of laboratory diet.
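The conversion from dietary level to daily intake is simple arithmetic, sketched below in Python. The sketch uses the conversion factors exactly as stated in the text (150 and 67.5 g diet/kg bw/day) rather than recomputing them from body weight and food consumption. The function and dictionary names are hypothetical.

```python
# Food intake factors stated in the text (g diet per kg body weight per day).
INTAKE_FACTOR_G_PER_KG_DAY = {"immature": 150.0, "ovx": 67.5}


def tge_intake_mg_per_kg_day(tge_ug_per_g, animal_model):
    """Convert a dietary TGE level (µg/g diet) to intake (mg TGE/kg bw/day)."""
    factor = INTAKE_FACTOR_G_PER_KG_DAY[animal_model]
    return tge_ug_per_g * factor / 1000.0  # µg -> mg


for level in (200.0, 325.0, 350.0):
    immature = tge_intake_mg_per_kg_day(level, "immature")
    ovx = tge_intake_mg_per_kg_day(level, "ovx")
    print(f"{level:.0f} µg/g diet: immature {immature:.1f}, OVX {ovx:.1f} mg/kg bw/day")

# Ratio of immature to OVX intake on the same diet.
fold = INTAKE_FACTOR_G_PER_KG_DAY["immature"] / INTAKE_FACTOR_G_PER_KG_DAY["ovx"]
print(f"immature/OVX intake ratio: {fold:.2f}")
```

With these factors, a diet at 325-350 µg TGE/g corresponds to roughly 49-53 mg TGE/kg bw/day in immature animals but only about 22-24 mg TGE/kg bw/day in OVX adults.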
Comparison of control uterine weights to total genistein equivalents. The plausible effect of concern is that phytoestrogen intake would increase control uterine weights, thereby diminishing the dynamic range of the bioassay and affecting the ability to detect very weak agonists. To first examine this possibility, vehicle control uterine weights have been pooled into two sets, intact, immature and adult OVX, to plot uterine weight values against respective TGE intakes.
The intact, immature data intakes ranged from 18 to 75 mg TGE/kg bw/day (Figure 4A) and suggest that dietary phytoestrogens lead to a slow progressive increase in uterine control weights. This suggestion is supported by laboratories 20 and 21, where both have high uterine control values and the highest TGE intakes. This suggestion is also supported by laboratory 14 with the second lowest dietary TGE value and the lowest control uterine weights, but these were also the animals with the lowest body weights. A linear least-squares analysis of these data yields the equation y = 0.407x + 14.08 and an r² value of 0.319. Omission of laboratories 20 and 21 with the highest phytoestrogen diets reduces the equation to y = 0.223x + 20.31 with an r² value of 0.109. This still includes two possible inconsistencies: a) laboratory 6 has the highest control uterine weight value and an intermediate TGE value, and b) laboratory 19 with the lowest dietary TGE value has almost twice the control uterine weights as those in laboratory 14.
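For readers who wish to repeat this kind of trend analysis on their own control data, the following Python sketch shows an ordinary least-squares fit with its r² value. The group-level data points are hypothetical, because the underlying values are not tabulated here. Only the reported equation y = 0.407x + 14.08 is taken from the text, and it is used solely to show the control weights it implies at the ends of the intake range.

```python
def linear_fit(x, y):
    """Ordinary least squares for y = slope * x + intercept, with r squared."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical group data: TGE intake (mg/kg bw/day) and mean control
# blotted uterine weight (mg). These are NOT the study's values.
intake = [18.0, 25.0, 32.0, 40.0, 47.0, 55.0, 75.0]
weight = [22.0, 21.0, 30.0, 28.0, 35.0, 33.0, 47.0]
slope, intercept, r_squared = linear_fit(intake, weight)
print(f"y = {slope:.3f}x + {intercept:.2f}, r^2 = {r_squared:.3f}")


def reported_line(x):
    """Equation reported in the text for all intact, immature groups."""
    return 0.407 * x + 14.08


print(f"predicted control weight at 18 mg/kg/day: {reported_line(18.0):.1f} mg")
print(f"predicted control weight at 75 mg/kg/day: {reported_line(75.0):.1f} mg")
```

Refitting after dropping the highest-intake points, as done in the text for laboratories 20 and 21, is a quick way to see how much of the slope depends on a few influential groups.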
For the adult OVX groups, the estimated intakes ranged from 12 to 23 mg TGE/kg bw/day; the blotted vehicle control uterine weights are shown in Figure 4B. In this case, the diet appears to have no impact on the control uterine weights, and the OVX intakes were half those of the intact, immature animals on the same or similar diets.
Comparison of weak agonist responses to total genistein equivalents. A second method of evaluating the possible effect of phytoestrogens would be to examine the responsiveness of the uterus to weak estrogen agonists at various dietary intake levels. The largest comparable data sets for weak estrogen agonists are those from the administration of BPA and NP by oral gavage and sc injection in the OECD validation studies. Three subsets of data have again been analyzed for each substance: intact, immature females by po and sc administration, and OVX adults by sc administration (protocols A, B, and C, respectively) (Kanno et al. 2003a, 2003b). For BPA, 13 gavage studies at 600 mg/kg/day and 25 sc studies at 300 mg/kg/day in the intact, immature female and 12 sc studies at the same dose in the OVX adult are available. For NP, 13 gavage studies at 250 mg/kg/day and 26 sc studies at 80 mg/kg/day in the intact, immature female and 12 sc studies in the OVX adult are available. In this comparison, the ratios of geometric mean blotted uterine weights for treated animals relative to the vehicle controls have been plotted with lower and upper 95% confidence limits against the calculated TGE intake. The expectation is that any effect from dietary phytoestrogen levels would be evident in a decrease in the relative ratio as the dietary phytoestrogen intakes increase.

Figure 4. (A) Group mean absolute blotted uterine control weights (milligrams) plotted against the estimated total dietary genistein equivalents intake for intact, immature female groups in phase 2. The results of specific groups from laboratories 6, 14, 19, 20, and 21 that are discussed in the text have been highlighted in light blue and labeled. Laboratory 20 has two data points. Laboratory 14 encountered a high number of animal mortalities with protocol A (Kanno et al. 2003a, 2003b). (B) Group mean absolute blotted uterine control weights (milligrams) plotted against the estimated total dietary GN equivalents for adult OVX groups in phase 2. The results of specific groups from laboratories 6 and 19 that are discussed in the text have been highlighted in light blue and labeled.

The BPA oral gavage data for the intact, immature female are shown in Figure 5A. By this route, BPA is an extremely weak estrogen, with the uterine weight increasing only 35-40% at 600 mg/kg/day (Table 10 in Kanno et al. 2003a). Visually, the data suggest a slight decrease in responsiveness with the calculated TGE intakes. This suggestion is driven by the results of laboratory 8 with the highest mean relative uterine weight increase of 1.91 and an intake of 26-27 mg TGE/kg/day. A linear least-squares analysis yields a very modest negative slope and an r² value of only 0.265 (Table 3).
The BPA sc data for the intact, immature female are shown in Figure 5B. No clear decrease in bioassay responsiveness is visually evident. However, consistent with concerns for increased uterine control weights at high TGE intakes, laboratory 20 was not responsive to BPA at TGE intakes of about 75 mg TGE/kg/day, and laboratory 14 with the lowest TGE intake had the highest relative increase and achieved statistical significance by the widest margin. Laboratory 6 with high uterine control weights showed a limited relative increase, but did achieve statistical significance at this BPA dose. Several other laboratories with equal or higher TGE diets appeared to be more responsive than laboratory 6. The only other laboratory not achieving statistical significance is clustered with eight other laboratories that did achieve statistical significance (Figure 5B, lower 95% confidence level < 1 at 46 mg TGE/kg/day). Again, a linear least-squares analysis yields a very modest negative slope and an r² value of 0.296. When the data from laboratory 20 are excluded, the slope is reduced by half, and the r² value falls to 0.053 (Table 3).
The BPA sc data for adult OVX are shown in Figure 5C. All laboratories achieved statistical significance, and the relative increase in uterine weights actually rose slightly with higher TGE intakes. Laboratory 6, with the highest control uterine weight, and laboratory 19, with the lowest TGE intakes, had somewhat lower responses when compared with the results in other laboratories. In this case, the least-squares analysis gave a positive rather than negative slope, and the r² value was 0.196 (Table 3).
The NP po data for the intact, immature female are shown in Figure 6A. No decrease in responsiveness of the bioassay with increasing TGE intakes is evident. Only laboratory 12 did not achieve statistical significance. Although the mean relative increase in uterine weight was similar to the other laboratories, four animals died in this laboratory, leaving only two survivors and seriously reducing the power (Kanno et al. 2003a). Laboratory 14, with the lowest estimated TGE intake, showed no evident increase in responsiveness compared with other laboratories. Least-squares analysis showed a very modest negative slope and an r² value of only 0.017 (Table 3).
The NP sc data for the intact, immature female are shown in Figure 6B. Again, no visual trend in decreased responsiveness of the bioassay with the estimated TGE intakes is evident. Seven studies did not achieve statistical significance, but the TGE intakes for the first six ranged from 15 to 50 mg TGE/kg/day, where other laboratories successfully achieved statistical significance. Laboratory 19, with the lowest TGE intakes, also did not achieve statistical significance, and laboratory 14, with the second lowest TGE intakes, was no more responsive on average than other laboratories. Laboratory 6, with the highest uterine control weights, was one of the laboratories not achieving statistical significance. However, laboratory 20, with the highest estimated TGE intake of 75 mg/kg/day, had no evident response to this NP dose. The least-squares analysis showed a modest negative slope and an r² value of only 0.007, and the omission of laboratory 20 resulted in a slightly positive slope and an r² value of 0.154 (Table 3).
The data for the adult OVX animals administered NP via sc injection are shown in Figure 6C. No obvious decrease in responsiveness of the bioassay was evident with the intakes all < 25 mg TGE/kg/day. As in the immature groups, the response to this dose of NP is lower than the response to the BPA dose. Four of the studies did not achieve statistical significance, although the lower 95% confidence level approached statistical significance in all cases. All four of these studies had intermediate levels of dietary phytoestrogen intakes. One of these groups was in laboratory 6, with the highest control uterine weights. Laboratory 19, with the lowest dietary TGE intake, was not more responsive than other groups.

Figure 5. (A) The ratio of the geometric mean blotted uterine weights adjusted for body weight of the groups treated by oral gavage with 600 mg BPA/kg/day relative to control vehicle uterine weights is plotted against the estimated total dietary genistein equivalents intake for intact, immature female groups in phase 2. The 95% upper and lower confidence levels are included. The results of specific groups from laboratories 8 and 14 that are discussed in the text have been highlighted in light blue and labeled. (B) The ratio of the geometric mean blotted uterine weights adjusted for body weight of the groups treated by sc injection with 300 mg BPA/kg/day relative to control vehicle uterine weights is plotted against the estimated total dietary genistein equivalents intake for intact, immature female groups in phase 2. The 95% upper and lower confidence levels are included. The results of specific groups from laboratories 6, 14, and 20 that are discussed in the text have been highlighted in light blue and labeled. (C) The ratio of the geometric mean blotted uterine weights adjusted for body weight of the groups treated by sc injection with 300 mg BPA/kg/day relative to control vehicle uterine weights is plotted against the estimated total dietary genistein equivalents intake for adult OVX groups in phase 2. The 95% upper and lower confidence levels are included. The results of specific groups from laboratories 6 and 19 that are discussed in the text have been highlighted in light blue and labeled.

Table 3 note. The values for the high phytoestrogen diet in laboratory 20 were omitted and the least-squares analysis was performed to assess the influence of these data on the overall trend.

Laboratory 21 has not been included in Figures 4A and 5A, as it failed to record body weights at necropsy, and relative increase in uterine weights adjusted for body weights could not be calculated. As 54 mg TGE/kg/day in this laboratory was the second highest intake, a close examination of the uterine weights themselves for BPA and NP is warranted. In BPA dose-response studies, an increase in absolute uterine weights was present, dose related, and achieved statistical significance (Kanno et al. 2003a, 2003b). However, the relative values were not as high as other laboratories at the maximum BPA doses (see Table 3B and Figure 1 in Kanno et al. 2003b). In NP dose-response studies, the uterine weights displayed a pattern of higher baseline values and relative uterine weight increases were lower than other laboratories (see Table 5B and Figure 4 in Kanno et al. 2003b). Therefore, these data are generally consistent with a pattern of diminished bioassay responsiveness at high estimated TGE intakes.
To further assess the possibility of a dietary phytoestrogen impact, data for the other weak estrogen agonists GN, methoxychlor, and o,p´-DDT from protocols A, B, and C were analyzed for a linear trend using the least-squares method. The results are shown in Table 3. The slopes are again modestly negative for protocols A and B. The r² values for protocol A range from 0.002 to 0.282. The r² values for protocol B that include laboratory 20 are somewhat higher for GN and methoxychlor (0.630 and 0.519, respectively). The slopes for protocol C are slightly negative in the case of GN and positive for the other two weak agonists, and the r² values range from 0.444 to 0.003.

Discussion and Conclusions
These studies were performed to validate the intact, immature female and adult OVX versions of the uterotrophic bioassays (Kanno et al. 2001, 2003a, 2003b; OECD 1998a). Although not a controlled experiment, the size of the data set presents an excellent opportunity to test for possible influence of dietary phytoestrogens and other factors on the responsiveness of the bioassay. Because of the different nomenclatures between statistical sensitivity and its use in validation, namely, the proportion of all positive chemicals that are correctly classified as positive in an assay (ICCVAM 1997; OECD 1998b), the term responsiveness is used to describe the ability of an assay to respond to a substance at somewhat lower doses or even to achieve a statistically significant difference at any dose.
The data taken as a whole support the ability of laboratories using these uterotrophic protocols to detect weak estrogen agonists in vivo even when diets contain significant levels of phytoestrogens. This appears to resolve the concerns that dietary phytoestrogen levels per se may deleteriously impact the performance of the bioassay (Brown and Setchell 2001; Thigpen et al. 1999b). No evidence was found in the adult OVX version that estimated intakes < 25 mg TGE/kg/day would increase control blotted uterine weights or decrease the responsiveness of the bioassay (Figures 4B, 5C, 6C; Table 3). For the intact immature animals, the data do suggest a gradual increase in control uterine weights as phytoestrogen intakes rise, and two laboratories with intakes > 50 mg TGE/kg/day displayed an apparent decrease in responsiveness. Laboratory 20, with the highest TGE intakes, did not respond to either test substance when numerous other laboratories achieved statistically significant uterine weight increases, and the data from laboratory 21, with the second highest doses, provide tentative support (Figures 4A, 5A,B, 6A,B; Table 3). However, some caution is necessary before arriving at any conclusions. In the NP dose-response studies, the mean of the control uterine weight in laboratory 20 was 54 mg, and the means of the test substances were 34-41 mg. This suggests that the control value could be an anomaly and not due to dietary phytoestrogens. The BPA dose-response studies used the same control, and the test substance uterine weight means ranged from 27 to 97 mg, further suggesting the control could be an anomaly.
The higher food consumption rates of the immature animals relative to their body weights appears central to this possible difference between the intact, immature and adult OVX versions. Higher food consumption rates effectively double the phytoestrogen intake of the immature animals when compared with  The ratio of the geometric mean blotted uterine weights adjusted for body weights of the groups treated by oral gavage with 250 NP mg/kg/day relative to control vehicle uterine weights are plotted against the estimated total dietary genistein equivalents intake for intact, immature female groups in phase 2. The 95% upper and lower confidence levels are included. The results of specific groups from laboratory 14 that are discussed in the text have been highlighted in light blue and labeled. (B) The ratio of the geometric mean blotted uterine weights adjusted for body weight of the groups treated by sc injection with 80 NP mg/kg/day relative to control vehicle uterine weights are plotted against the estimated total dietary genistein equivalents intake for intact, immature female groups in phase 2. The 95% upper and lower confidence levels are included. The results of specific groups from laboratories 6, 14, 19, and 20 that are discussed in the text have been highlighted in light blue and labeled. (C) The ratio of the geometric mean blotted uterine weights adjusted for body weights of the groups treated by sc injection with 80 NP mg/kg/day relative to control vehicle uterine weights are plotted against the estimated total dietary genistein equivalents intake for adult OVX groups in phase 2. The 95% upper and lower confidence levels are included. The results of specific groups from laboratories 6 and 19 that are discussed in the text have been highlighted in light blue and labeled. Estimated intake of GN in equivalents (mg/kg bw/day) C the adult OVX animals on the same diet. 
The same higher food consumption is also noteworthy for the mouse, an alternative species for the uterotrophic bioassay, as mouse intakes from an equivalent diet would be even higher than those of the immature rat. Much of this examination and its conclusions hinge on chemical analyses of the diets and estimation of TGE intakes. The analytical method has been published and used previously (Morton et al. 1994; Odum et al. 2001). These data are based upon mass spectra ion fragments. Those samples that are effectively lot duplicates are in close agreement, and the levels and ratios of GN, DN, and CM are in close agreement with HPLC analyses with ultraviolet detection of various North American and European diets (Brown and Setchell 2001; Degen et al. 2002; Thigpen et al. 1999a, 1999b). This same method has been studied in a small interlaboratory study with a standardized soya flour sample (Wiseman et al. 2002). The expected result was returned with this method, indicating that enzymatic hydrolysis and recovery were adequate (Clarke D. Personal communication). Similar results for the analysis of rodent diet PMI 5002 (Purina Mills, St. Louis, MO), allowing for possible batch variations between diets, were obtained with this method [DN 88 µg/g, GN 204 µg/g diet for laboratory 5, and DN 117 µg/g, GN 218 µg/g diet for laboratory 12 (Table 2)] and by Thigpen et al. (1999b) (DN 86 µg/g, GN 73 µg/g diet), again indicating that this method is comparable with those of others. In contrast, the estimation of TGE rests on a) the largely untested assumption of direct additivity between weak ER agonists in vivo, and b) an assessment of various literature data to derive the potency of various phytoestrogens. The tissue specificity of ER ligands (Shang and Brown 2002) and the interaction studies of Edgren and Calhoun (1957, 1960, 1961) suggest both caution and the need for robust experimental data.
Based on the uterotrophic data, the analytical data, and the TGE assumptions and calculations, this analysis suggests that TGE intakes > 40-50 mg/kg/day should be questioned and avoided. This in turn suggests a limit of 325-350 µg TGE/g diet for immature animals. This judgment is consistent with the biological activity of pure phytoestrogens in other uterotrophic studies and other toxicologic studies. This intake limit is near the GN uterotrophic lowest observable effect level (LOEL) from this same validation program (Kanno et al. 2003b). In toxicologic studies employing soy-free diets, biological responses consistent with an estrogenic mode of action have been observed at GN LOELs between 300 and 1,000 µg/g diet (Casanova et al. 1999; Delclos et al. 2001; Fritz et al. 1998; Santell et al. 1997; You et al. 2002a, 2002b). Two controlled dietary studies have been recently performed specifically to test the impact of dietary phytoestrogens on the uterotrophic bioassay, and these data are consistent with our conclusion that low levels of dietary phytoestrogens do not significantly impair the uterotrophic bioassay. Yamasaki et al. (2002) fed rats diets containing approximately 20, 100, and 200 µg TGE/g diet as calculated herein with dose ranges of BPA, GN, and NP. No effect on the responsiveness of the uterotrophic bioassay was observed, and the phytoestrogen intakes at the high dietary level would have been approximately 30 mg TGE/kg/day. In a second study, rats were fed a series of diets containing 5, 50, 250, and 1,250 µg Novasoy extract/g diet, and the responsiveness of the uterotrophic assay and several other uterine indicators to administered doses of EE (1 µg/kg/day) and BPA (600 mg/kg/day) was compared. These doses are sufficient to attain near-maximal responses by sc injection (Kanno et al. 2003b). A statistically significant increase in uterine weights with 1,250 µg Novasoy extract/g diet was observed.
However, no interaction was evident with either the EE or BPA with the blotted uterine weight or the measured increase in the uterine epithelial cell height at any dietary level of the Novasoy extract (Wade MG, Lee A, McMahon A, Cooke G, Curran I. Personal communication). Both of these data sets are comparable to ours, as the same protocol B was used in those studies.
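The link between an intake limit (mg TGE/kg/day) and a dietary concentration limit (µg TGE/g diet) is the food consumed per unit body weight. As a minimal sketch, the relative food consumption implied by the figures quoted above can be back-calculated. The function name is hypothetical, and pairing 40 mg/kg/day with 325 µg/g and 50 mg/kg/day with 350 µg/g is an assumption made only for illustration; the text gives both as ranges.

```python
def implied_food_intake_g_per_kg(intake_mg_per_kg: float, diet_ug_per_g: float) -> float:
    """Relative food consumption (g diet/kg bw/day) implied by a TGE intake
    (mg/kg bw/day) and a dietary TGE concentration (ug/g diet)."""
    # mg -> ug (x 1000), then divide by ug TGE per g of diet
    return intake_mg_per_kg * 1000.0 / diet_ug_per_g


# Assumed pairing of the range endpoints quoted in the text (illustration only).
low = implied_food_intake_g_per_kg(40.0, 325.0)
high = implied_food_intake_g_per_kg(50.0, 350.0)

# Yamasaki et al. (2002): about 30 mg TGE/kg/day at 200 ug TGE/g diet.
yamasaki = implied_food_intake_g_per_kg(30.0, 200.0)

print(f"Implied immature intake: {low:.0f}-{high:.0f} g diet/kg bw/day")
print(f"Implied intake in Yamasaki et al.: {yamasaki:.0f} g diet/kg bw/day")
```

Under these assumptions the implied consumption is roughly 120-150 g diet/kg bw/day, about twice the 60-75 g/kg bw/day calculated herein for adult OVX animals.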
The implications of these observations extend to other areas of current interest. Experimental data have raised the question of whether very low doses of BPA, for example, 2 or 20 µg/kg/day orally administered, might result in observable changes in several end points in the mouse (Howdeshell et al. 2000; Nagel et al. 1997; vom Saal et al. 1998). These investigators reported using PMI 5001 and 5008 diets, which are reported elsewhere to contain levels of phytoestrogens that would approach or exceed 300-350 µg TGE/g diet (Brown and Setchell 2001; Degen et al. 2002; Odum et al. 2001; Thigpen et al. 1999b). As previously noted, mice consume even higher quantities of diet than rats on a relative body weight basis. For example, in National Toxicology Program studies, mice consumed a mean of 7.2 g diet/day for a 25 g bw, or 288 g/kg bw/day, and rats consumed a mean of 14.8 g diet/day for a 200 g bw, or 74 g/kg bw/day (Moore 1995). The latter quantity compares favorably with the data and calculations herein of 60-75 g/kg bw/day, considering that OVX animals here were somewhat higher in body weight and their food intake should have then slightly decreased. Using dietary phytoestrogen levels of 200 and 350 µg TGE/g diet, we then estimate the TGE intake from laboratory diets like those in the low-dose BPA studies to have been in the approximate range of 50-100 mg TGE/kg/day. Four presumptions are needed to estimate the dose of BPA relative to dietary TGE intakes in these studies: a) an equivalency factor for BPA of 0.06 based upon directly comparable data from po administration of BPA and GN in this validation program (Kanno et al. 2003a, 2003b); b) the additive interaction of phytoestrogens and BPA as estrogens; c) a molecular similarity of action via the ER in both rat uterus and mouse tissues; and d) similar pharmacokinetics in both species, such as similar levels of hepatic glucuronidation and rates of biliary excretion.
These presumptions lead to an estimate that the BPA doses would then have contributed only approximately 0.12 and 1.2 µg TGE/kg/day, or no more than about 0.002% of the estimated dietary TGE dose ingested by these animals. This may assist in explaining the inability of other workers to reproduce the original data (Ashby et al. 1999a; Cagen et al. 1999).
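The arithmetic behind this comparison can be reproduced in a few lines. The sketch below uses only the values quoted in the text (food consumption from Moore 1995, dietary levels of 200 and 350 µg TGE/g, BPA doses of 2 and 20 µg/kg/day, and the presumed equivalency factor of 0.06). The function and variable names are hypothetical, and the result depends on all four presumptions listed above.

```python
def relative_food_intake(diet_g_per_day: float, body_weight_g: float) -> float:
    """Food consumption in g diet per kg body weight per day."""
    return diet_g_per_day / body_weight_g * 1000.0


def tge_intake_mg_per_kg(diet_tge_ug_per_g: float, food_g_per_kg: float) -> float:
    """Dietary TGE intake in mg/kg bw/day (ug/g x g/kg = ug/kg; /1000 -> mg/kg)."""
    return diet_tge_ug_per_g * food_g_per_kg / 1000.0


BPA_EQUIVALENCY = 0.06  # presumption (a): BPA potency relative to GN

mouse_food = relative_food_intake(7.2, 25.0)   # about 288 g/kg bw/day
rat_food = relative_food_intake(14.8, 200.0)   # about 74 g/kg bw/day

# Dietary TGE intake of mice at the two dietary levels used in the text.
dietary_mg = [tge_intake_mg_per_kg(c, mouse_food) for c in (200.0, 350.0)]

# TGE contributed by the low BPA doses (ug TGE/kg/day).
bpa_tge_ug = [dose * BPA_EQUIVALENCY for dose in (2.0, 20.0)]

# BPA contribution as a percentage of the dietary TGE intake.
percentages = [
    bpa / (diet * 1000.0) * 100.0
    for bpa in bpa_tge_ug
    for diet in dietary_mg
]

print(f"Mouse dietary TGE: {dietary_mg[0]:.1f}-{dietary_mg[1]:.1f} mg/kg/day")
print(f"BPA contribution: {bpa_tge_ug[0]:.2f} and {bpa_tge_ug[1]:.2f} ug TGE/kg/day")
print(f"Share of dietary TGE: {min(percentages):.4f}% to {max(percentages):.4f}%")
```

With these inputs the dietary intake comes to roughly 58-101 mg TGE/kg/day, and the BPA contribution ranges from about 0.0001% to about 0.002% of it, depending on which BPA dose and dietary level are combined.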
Human exposures to phytoestrogens are variously estimated to range from 0.5 to 4 mg/kg/day for adults and 4.5 to 10 mg/kg/day for infants consuming soy-based infant formula (MAFF UK 1998; Setchell et al. 1997; Whitten and Patisaul 2001). This level of dietary intake would then be the predominant human exposure to exogenous estrogens other than pharmaceuticals. In addition to the apparent LOEL of approximately 50 mg/kg/day based on this dietary analysis, these intakes are within range of the GN po LOEL in this validation program (Kanno et al. 2003b) and the range of LOELs/lowest observable adverse effect levels observed in several toxicologic studies (Casanova et al. 1999; Delclos et al. 2001; Fritz et al. 1998; Newbold et al. 2001; Santell et al. 1997; You et al. 2002a, 2002b). This indicates that human phytoestrogen consumption warrants examination as a model for any risk presented by estrogenic substances. Other tasks at hand are to identify conditions that may impair the performance of the bioassay or prevent acceptance of data from the bioassay and to recommend remedies. The validation data indicate several instances where the responsiveness of the uterotrophic bioassay may have been diminished, particularly by high control uterine weights, that are not attributable to dietary phytoestrogen levels. For example, laboratory 6 had phytoestrogen levels similar to a number of other laboratories, and laboratory 19 had both low dietary TGE values and normal control uterine weights, yet decreased responsiveness in these laboratories was evident for several test substances (Kanno et al. 2003a, 2003b). To further reinforce the evidence against a role for phytoestrogens, note that laboratory 14 used the same diet as laboratory 6, laboratory 8 used the same diet as laboratory 19 (Table 1), and neither laboratory 14 nor laboratory 8 experienced any evident problems.
There is evidence that dietary factors other than phytoestrogen levels may affect uterine weights and the timing of vaginal opening (Ashby et al. 1999b, 2000; Odum et al. 2001; Thigpen et al. 1987a, 1987b). In fact, some purified diets, free of phytoestrogens, have yielded statistically significant increases in uterine weights when compared with diets with limited quantities of phytoestrogens (Ashby et al. 1999b). Upon investigation, increasing levels of endogenous, prepubertal estrogens are plausible factors, as co-administration of the estrogen receptor antagonist Faslodex (Ashby et al. 1999b) or the gonadotrophin-releasing hormone antagonist Antarelix reduces the increased uterine sizes from these diets to even lower levels.
We suggest that several precautions will benefit the development of guidelines for the uterotrophic bioassay. First, laboratories should request that diets have total GN and DN levels < 350 µg/g diet where immature animals are used. Occasional analyses may be needed to verify the levels. Second, laboratories should monitor the uterine weights of their laboratory colonies or those of the animals from their supplier. We suggest that blotted uterine weights for control immature animals should be consistently < 35 mg, and mean blotted uterine weights > 40 mg should be questioned. In the case of adult OVX animals, in addition to monitoring the animals at necropsy for complete ovariectomy, mean blotted uterine weights > 115 mg should be questioned. Third, the OECD validation studies provide a base data set to compare the performance of established laboratories and to qualify new laboratories. This data set includes the potent reference EE, and the large BPA or NP data sets apply to the likely target area of weak ER agonists. In addition, these data should also assist the design of more definitive and controlled experiments on the possible effect of laboratory diets.
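The suggested limits lend themselves to a simple screening routine. The following is a minimal sketch, not part of any guideline: the function names, the three-level "ok"/"watch"/"question" grading, and the treatment of means between 35 and 40 mg as "watch" are assumptions added for illustration. Only the numerical thresholds (35, 40, and 115 mg; 350 µg/g) come from the text.

```python
from statistics import mean

# Thresholds from the text (mean blotted control uterine weight, mg).
THRESHOLDS_MG = {
    "immature": {"target": 35.0, "question": 40.0},
    "ovx": {"target": None, "question": 115.0},
}
DIET_LIMIT_UG_PER_G = 350.0  # total GN + DN for diets fed to immature animals


def check_control_group(version: str, blotted_weights_mg: list[float]) -> str:
    """Grade a control group as 'ok', 'watch', or 'question' (labels are assumptions)."""
    limits = THRESHOLDS_MG[version]
    group_mean = mean(blotted_weights_mg)
    if group_mean > limits["question"]:
        return "question"
    if limits["target"] is not None and group_mean >= limits["target"]:
        return "watch"  # assumed intermediate band, not defined in the text
    return "ok"


def diet_acceptable_for_immature(gn_ug_per_g: float, dn_ug_per_g: float) -> bool:
    """True when total GN + DN is below the suggested 350 ug/g diet limit."""
    return gn_ug_per_g + dn_ug_per_g < DIET_LIMIT_UG_PER_G


# Hypothetical control-group weights, for illustration only.
print(check_control_group("immature", [28.0, 31.0, 30.0, 33.0]))
print(check_control_group("immature", [52.0, 55.0, 54.0, 55.0]))
print(check_control_group("ovx", [118.0, 120.0, 125.0]))

# Diet values reported for laboratory 5 (GN 204, DN 88 ug/g diet).
print(diet_acceptable_for_immature(204.0, 88.0))
```

A laboratory could run such a check on each new control group and on each diet analysis, keeping the results as part of its historical record.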
We conclude that modest to low levels of dietary phytoestrogens do not substantially increase control uterine weights or reduce the responsiveness of the uterotrophic bioassay to weak ER agonists. Therefore, laboratories conducting the uterotrophic bioassay for regulatory or research purposes can continue to use diets containing levels of phytoestrogens < 325-350 µg TGE/g diet. Above these levels, the evidence suggests that phytoestrogens may compromise the responsiveness in the case of the intact immature version of the bioassay. As food intake relative to body weight is an essential factor in determining actual intake, these cautions should also be applied to the mouse when used in the uterotrophic bioassay.
Laboratories should also be aware of data suggesting that even phytoestrogen-free diets may impair the responsiveness of the bioassay under some conditions. There were other instances of high baseline uterine weights and limited responsiveness that cannot be attributed to the phytoestrogen content of the diets. Therefore, we also conclude for purposes of quality control that laboratories conducting the uterotrophic bioassay for either research or regulatory purposes should monitor the uterine weights of their control groups for data acceptance. Laboratories should also periodically demonstrate the adequate responsiveness of their systems to estrogen agonists. In this respect, the OECD data set for EE and five weak agonists can be used for performance comparisons and to benchmark laboratory performance over time.