Summary of reference chemicals evaluated by the fish short‐term reproduction assay, OECD TG229, using Japanese Medaka, Oryzias latipes

Abstract Under the Organisation for Economic Co‐operation and Development (OECD), the Ministry of the Environment of Japan (MOE) added Japanese medaka (Oryzias latipes) to the test guideline fish short‐term reproduction assay (FSTRA) developed by the United States Environmental Protection Agency (US EPA) using fathead minnow (Pimephales promelas). The FSTRA was designed to detect endocrine disrupting effects of chemicals interacting with the hypothalamic–pituitary–gonadal axis (HPG axis) such as agonists or antagonists on the estrogen receptor (Esr) and/or the androgen receptor (AR) and steroidogenesis inhibitors. We conducted the FSTRA with Japanese medaka, in accordance with OECD test guideline number 229 (TG229), for 16 chemicals including four Esr agonists, two Esr antagonists, three AR agonists, two AR antagonists, two steroidogenesis inhibitors, two progesterone receptor agonists, and a negative substance, and evaluated the usability and the validity of the FSTRA (TG229) protocol. In addition, in vitro reporter gene assays (RGAs) using Esr1 and ARβ of Japanese medaka were performed for the 16 chemicals, to support the interpretation of the in vivo effects observed in the FSTRA. In the present study, all the test chemicals, except an antiandrogenic chemical and a weak Esr agonist, significantly reduced the reproductive status of the test fish, that is, fecundity or fertility, at concentrations where no overt toxicity was observed. Moreover, vitellogenin (VTG) induction in males and formation of secondary sex characteristics (SSC), papillary processes on the anal fin, in females were sensitive endpoints to Esr and AR agonistic effects, respectively, and might be indicators of the effect concentrations in long‐term exposure. Overall, it is suggested that the in vivo FSTRA supported by in vitro RGA data can adequately detect effects on the test fish, O. latipes, and probably identify the mode of action (MOA) of the chemicals tested.


| INTRODUCTION
In June 2016, the Ministry of the Environment of Japan (MOE) published its fourth program on endocrine disrupting effects of chemical substances (EDC), "EXTEND 2016" (MOE, 2016), whose basic concepts and framework were inherited from the preceding program "EXTEND 2010" (MOE, 2010). The EXTEND 2016 Program, which consists of Tier-1 screening to evaluate endocrine disrupting potency and Tier-2 testing to assess adverse endocrine disrupting effects on animals, has been progressing with the implementation of testing and assessment strategies, focusing on effects on reproduction (estrogen and androgen related), development (thyroid hormone related), and growth (juvenile and ecdysone hormones related). Regarding effects on reproduction, an in vitro reporter gene transcriptional assay (RGA) using the estrogen receptor (Esr) and androgen receptor (AR) of Japanese medaka (Oryzias latipes), for example, Esr1 (also known as ERα) and ARβ, and an in vivo fish short-term reproduction assay (FSTRA) with Japanese medaka are used in the Tier-1 screening. Under the Organization for Economic Co-operation and Development (OECD), the FSTRA and the fish 21-day assay, for which the MOE had been involved in developing the protocols using Japanese medaka, were established as OECD TG229 (OECD, 2012a) and OECD TG230 (OECD, 2009a), respectively. TG229, for which the OECD had revised the test conditions for Japanese medaka based on the proposal from the MOE in 2012 (OECD, 2012a), recommends the use of three small fish species as test species, that is, fathead minnow (Pimephales promelas), zebrafish (Danio rerio), and Japanese medaka. In the EXTEND 2016 Program, the Medaka Extended One Generation Reproduction Test (MEOGRT), which has been adopted as OECD TG240, is conducted for candidate chemicals for which effects on fish reproduction are suspected based on the results of the Tier-1 screening evaluation.
The FSTRA, conducted in accordance with OECD TG229, includes endpoints directly related to reproduction, that is, spawning status, and is designed to detect endocrine disrupting effects interacting with the hypothalamic-pituitary-gonadal axis (HPG axis), which may respond to substances that act on it at different levels (OECD, 2012a). Numerous papers on adverse effects on fish reproduction caused by estrogenic substances, such as 17β-estradiol (E2), 17α-ethynylestradiol (EE2), and alkylphenols, have been published since the late 1990s, when EDC issues became apparent (Hara, Hiramatsu, & Fujita, 2016; Matthiessen, Wheeler, & Weltje, 2018; OECD, 2009b; Urushitani et al., 2007). Moreover, much research has been conducted to develop novel test methods, biomarkers, and endpoints and to select test species to assess various adverse effects that might be related to the endocrine system (Carnevali, Santangeli, Forner-Piquer, & Basili, 2018; Manibusan & Touart, 2017; McArdle et al., 2020). On the other hand, there are few studies in which multiple chemical substances with various modes of action (MOA) are comparably evaluated using unified test species, conditions, and procedures, as in a screening test for regulatory use. In the FSTRA, vitellogenin (VTG) protein, secondary sex characteristics (SSC), and reproductive status are quantitatively measured, and the Esr agonistic and antagonistic, AR agonistic and antagonistic, and steroidogenesis inhibitory potency of the test chemicals can be assessed based on the response in the endpoints (OECD, 2012a). The reproductive status, that is, fecundity and fertility during the exposure period, is the most important endpoint and yet can be measured without high technical expertise.
On the other hand, it was also suggested that fecundity can be influenced by nonchemical factors (presumably selection of smaller females, low food, and/or water temperature, but not specified in the report) in the validation report of the FSTRA using fathead minnow by the United States Environmental Protection Agency (US EPA, 2007).
The FSTRA using fathead minnow is a key component of the US EPA's Endocrine Disruptor Screening Program (EDSP), which uses a weight-of-evidence analysis based on data from several assays to identify the potential for chemicals to act as agonists or antagonists of Esr or AR or as inhibitors of steroidogenic enzymes (Ankley & Jensen, 2014). However, it has been pointed out that VTG and fecundity in the controls had high intralaboratory and interlaboratory variabilities, based on the data from 49 studies performed in the US EPA's EDSP, and that the historical control data were of limited use during study interpretation (Wheeler, Valverde-Garcia, & Crane, 2019). Because females of Japanese medaka produce 10-30 eggs daily under appropriately controlled conditions (Ankley & Johnson, 2004; Flynn et al., 2017; Hirshfield, 1980; Koger, Teh, & Hinton, 1999; OECD, 2006b), the means of the daily fecundity during the FSTRA in controls are probably more stable than those of the other small test fish, for example, zebrafish and fathead minnow.
Nevertheless, to use the FSTRA with Japanese medaka practically and effectively as Tier-1 screening under the EXTEND 2016 Program, it is necessary to evaluate intralaboratory and interlaboratory variabilities in control fish for endpoint measurements, including reproductive status, and to verify the applicability of the endpoints for screening of chemical effects by various MOAs.
This report describes the results from a study of the in vivo FSTRA using Japanese medaka in accordance with OECD TG229 and the in vitro reporter gene assay (RGA) using Esr and AR of Japanese medaka, for 16 substances. The 16 test chemicals, which were selected by referring to other validation studies and the literature describing EDCs in fish, included progesterone receptor (PR) agonists known from mammals and a substance expected to have no effect on the endocrine system (negative substance), as well as Esr agonists and antagonists, AR agonists and antagonists, and steroidogenesis inhibitors. These reference chemical studies were conducted with the financial support of the MOE, in the process of establishing the test guideline for Japanese medaka and to obtain the knowledge and the data to support the Tier-1 evaluation under the EXTEND 2016 Program.

| MATERIALS AND METHODS
2.1 | Test chemicals
2.1.1 | Esr agonists
As Esr agonists, four chemicals were used. E2 is an endogenous estrogen in vertebrates and has been frequently used in several validation studies of fish testing, for example, the Medaka multigeneration test (MMT), which is the original method of MEOGRT (OECD, 2015; US EPA, 2013), the 21-Day Fish Screening Assay (OECD, 2006a, 2006b), and the Fish Sexual Development Test (FSDT) (OECD, 2011c). EE2 is a synthetic estrogen that, like E2, has been the most widely used in fish studies, as a reference chemical of a strong Esr agonist (OECD, 2009b). 4-tert-Pentylphenol (PTH), an alkylphenol, has weak estrogenic activity in fish and was used for the OECD validations for the 21-day Fish Assay and FSDT (OECD, 2006b, 2011a). 4-Chloro-3-methylphenol (CMP) was used in the validation of the MMT, as a weak estrogenic substance (US EPA, 2015). A study suggested that CMP bound to both the recombinant human and rainbow trout (Oncorhynchus mykiss) Esrs and induced rainbow trout VTG mRNA at about the same concentrations as overt toxicity (Schmieder et al., 2004; US EPA, 2013).

| Esr antagonists
Two chemicals, tamoxifen citrate (TAM) and raloxifene hydrochloride (RAL), were used for the Esr antagonist studies. Both chemicals are selective Esr modulators, which can act as Esr agonists or antagonists depending on the target tissues in mammals (Shang & Brown, 2002), and are used for treatment of breast cancer, osteoporosis, and postmenopausal symptoms.
Regarding TAM effects on fish, a decrease in VTG and fecundity was reported from the partial life cycle assay with female zebrafish (van der Ven, van den Brandhof, Vos, & Wester, 2007).

| AR agonists
As AR agonists, three chemicals were used. 17β-trenbolone (TRB) and 17α-methyltestosterone (MT) are synthetic anabolic-androgenic steroids. For example, TRB is used as a subcutaneous implant for growth promotion in beef cattle (Ankley et al., 2003). In the validation for the 21-day Fish Assay, FSTRA with fathead minnow and MMT (MEOGRT), TRB was used as an AR agonist test chemical (OECD, 2009a; US EPA, 2007, 2013). MT is also commonly used to validate fish assays (OECD, 2009b). Though 11-ketotestosterone (11KT) is an endogenous androgen in fish species, because of the cost of this androgen, 5α-dihydrotestosterone (DHT), produced from testosterone by 5α-reductase in mammals, was used for the validation of the FSDT (OECD, 2012b).

| Steroidogenesis inhibitors
Ketoconazole (KCZ) and prochloraz (PCL), which are known as azole-based fungicides that inhibit the synthesis of ergosterol, a vital component of the fungal cell membrane (Zarn, Bruschweiler, & Schlatter, 2003), were used as steroidogenesis inhibitors. Several studies have reported that these chemicals inhibited, directly and/or consequently, the activity of several cytochrome P450 enzymes, including aromatase, involved in steroidogenesis in vivo and/or in vitro (Andersen, Vinggaard, Rasmussen, Gjermandsen, & Bonefeld-Jorgensen, 2002; Blystone et al., 2007; Monod, De Mones, & Fostier, 1993; Skolness et al., 2011; Villeneuve et al., 2007). Both chemicals were used in the validation study of the FSTRA using fathead minnow by the US EPA (Ankley et al., 2005; US EPA, 2007), and PCL was also used to validate the FSDT and the MEOGRT (OECD, 2011b).

| PR agonists
A natural progesterone (P4) and a synthetic progestin, levonorgestrel (LNG), were subjected to the assay to assess the applicability of the FSTRA protocol to the effects of PR agonists on the test species (medaka). P4 is an endogenous steroid involved in the menstrual cycle, pregnancy, and embryogenesis in mammals, and LNG is a second-generation progestin derived from 19-nortestosterone (Lorenz et al., 2011). Because these two chemicals have been used as components of oral contraceptives, they have been detected in environmental water (Shen, Chang, Sun, Wang, & Wu, 2018). Zeilinger et al. (2009) and Runnalls, Beresford, Losty, Scott, and Sumpter (2013) reported that synthetic progestin had affected reproduction of wild fish at environmental levels.

| Negative (inactive) substance
To validate the protocols of fish short-term assays, a few substances, such as n-octanol, potassium permanganate, and sodium dodecyl sulfate (SDS), had been previously used, but problematic results, such as considerable decreases in measured concentration and unexpectedly high fish mortality, were observed in these experimental studies (OECD, 2007, 2010; US EPA, 2007). In the present study, SDS was selected as a negative substance because the dilution water had a relatively low degree of hardness, for example, 50 mg/L as CaCO3, which was expected to contribute to the stability of test concentration and the suppression of mortality by excessive toxicity.
For all the test chemicals, commercially available reagents with the highest degree of purity were obtained. The details of the reagents tested in the in vivo FSTRA and the in vitro RGA studies are shown in Table 1.

| Fish short-term reproduction assay
The experimental studies on the FSTRA for the 16 chemicals were conducted in two laboratories in Japan, the National Institute for Environmental Studies (NIES, Lab-1) and the Institute of Environmental Ecology, IDEA Consultants Inc. (Lab-2), in accordance with OECD TG229 (OECD, 2012a) (Table 1).

| Test fish
The NIES-R strain of Japanese medaka (O. latipes), one of the orange-red varieties, was used in all the 16 chemical studies. For the studies, healthy and mature male and female medaka at 16 ± 2 weeks old were selected from a single-stock population bred within the test facilities. The test fish were maintained in conditions similar to those of the assay for at least 7 days, and spawning status was checked during the acclimation period. On the day the chemical exposure commenced, to ensure a balanced distribution of reproductive status between the treatment groups, the replicate vessels (fish tanks), each containing three male and three female medaka, were distributed to each treatment and control group, for example, by a randomized block design based on the number of fertilized eggs laid in the last 5-7 days of the acclimation, in each assay.
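The allocation step described above can be illustrated with a minimal sketch. It assumes one common form of randomized block design: vessels are ranked by their acclimation egg counts, grouped into blocks of as many vessels as there are groups, and group labels are shuffled within each block. The function name, vessel identifiers, egg counts, and group labels are all hypothetical, and the laboratories may have used a different procedure.

```python
import random


def allocate_vessels(egg_counts, groups, seed=0):
    """Assign vessels to groups by a simple randomized block design.

    egg_counts: dict of vessel id -> fertilized eggs counted in acclimation
    groups: list of group labels (control plus treatments)
    """
    rng = random.Random(seed)
    # Rank vessels from most to least productive.
    ranked = sorted(egg_counts, key=egg_counts.get, reverse=True)
    allocation = {}
    # Each block holds one vessel per group; labels are shuffled per block.
    for start in range(0, len(ranked), len(groups)):
        block = ranked[start:start + len(groups)]
        labels = list(groups)
        rng.shuffle(labels)
        for vessel, label in zip(block, labels):
            allocation[vessel] = label
    return allocation


# Hypothetical data: 16 vessels and 4 groups give 4 replicates per group.
egg_counts = {f"V{i:02d}": 150 + 7 * i for i in range(16)}
groups = ["DWC", "low", "mid", "high"]
allocation = allocate_vessels(egg_counts, groups)
```

Because every block contributes exactly one vessel to each group, highly and poorly productive vessels are spread evenly across the control and the treatments.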

| Test concentrations
For each test chemical, three or four concentrations were set by reference to the results of previous validation studies and toxicity tests with Japanese medaka (Table 2). To prevent excessive lethal effects, the highest test concentration was determined based on acute toxicity (e.g., one third or one tenth of the 96-h LC50). The water solubility of the test chemical was also taken into consideration when deciding the highest test concentration. To set the lower concentrations, a spacing factor ranging between two and five was used. All assays included a dilution water control (DWC). and/or sonicating for an appropriate time or by using a solid-liquid saturator. In the two laboratories, flow-through exposure systems, in which a series of test solutions at the target concentrations can be continuously prepared and delivered to test vessels at controlled flow rates (Haselman et al., 2016; Watanabe et al., 2017), were used for the exposure experiments. Dechlorinated tap water in which the water quality had been checked at appropriate frequency was used as dilution water to prepare the test solutions and as a test solution for the controls.
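The concentration-setting rule described above (highest level from a fraction of the 96-h LC50, capped by water solubility, then a constant spacing factor between two and five) can be sketched as follows. The function name, parameter names, and numerical values are hypothetical and only illustrate the arithmetic; the actual concentrations are those in Table 2.

```python
def concentration_series(lc50_96h, safety_divisor=3.0, spacing_factor=3.0,
                         n_levels=4, solubility_limit=None):
    """Return nominal test concentrations from highest to lowest."""
    if not 2.0 <= spacing_factor <= 5.0:
        raise ValueError("spacing factor is expected to lie between 2 and 5")
    # Highest level: a fraction of the acute LC50 (e.g., 1/3 or 1/10).
    highest = lc50_96h / safety_divisor
    # Never exceed the water solubility of the chemical, if known.
    if solubility_limit is not None:
        highest = min(highest, solubility_limit)
    # Lower levels follow a geometric series.
    return [highest / spacing_factor ** i for i in range(n_levels)]


# Hypothetical chemical with a 96-h LC50 of 900 ug/L.
series = concentration_series(900.0, safety_divisor=3.0, spacing_factor=3.0)
capped = concentration_series(900.0, solubility_limit=90.0, n_levels=3)
```

With these illustrative inputs the first series starts at 300 μg/L and each lower level is one third of the level above it.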

| Chemical analysis
The chemical concentrations of test solutions, including DWC, were quantitatively measured once a week during the exposure period. The water samples, collected from the test vessels, were immediately subjected to chemical analysis or stored at 4 °C until analysis. Where necessary to achieve the required limit of quantification (LOQ), a suitable pretreatment procedure, for example, solid-liquid extraction, was applied to the water samples. The analytical method and LOQ in each study are summarized in Table 3.

| Chemical exposure and observation
In the FSTRA, three mature adult males and three mature adult females in a test vessel were exposed together to the test chemicals in flow-through conditions for 3 weeks. In all the assays, each treatment group, including the DWC group, contained four replicate tanks. During the exposure period, the fish were fed ad libitum with live brine shrimp nauplii (newly hatched Artemia sp.) in an amount sufficient for daily spawning. Fecal material in the tanks was removed by siphoning after feeding. Water temperature of test solutions was recorded for at least one vessel in each treatment and control group every day and for all vessels at least once a week. Furthermore, dissolved oxygen and pH of test solutions were measured for all test vessels at least once a week. Mortality and abnormal behavior and appearance in the fish exposed to test chemicals were observed daily, and any fish that died were removed from the tank as soon as possible. To assess the reproductive status, all eggs spawned by the females were collected every day. At the completion of the chemical exposure, surviving fish in each tank were sampled.
All of the chemical exposures were conducted in the conditions shown in Table 4 and satisfied the test acceptance criteria provided by the OECD TG229, except for the SDS study. In the SDS study, the dissolved oxygen level of test solution at the highest concentration dropped below 60% of saturation for a few days, due to microbial growth. However, it is considered that this problem did not have a significant impact on the test fish, because no mortality or any abnormal appearance or behavior was observed during this period.

| Endpoint measurements
Spawning status (fecundity and fertility)
The eggs that females released onto the bottom of the tank were collected by siphoning, and the egg clutches still attached to the female abdomen were carefully picked from the body of the females, which were captured using a small net.
All the eggs collected were observed microscopically, and fertilized and unfertilized eggs were counted separately. As an endpoint regarding fecundity, the mean number of total (fertilized plus unfertilized) eggs per female per day was determined, and the fertility rate, which was defined as the ratio (percentage) of the number of fertilized eggs to the number of total eggs over the 21-day exposure period, was calculated.
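As an illustration of the two spawning endpoints defined above, the following sketch computes them for one replicate tank. It assumes that fecundity is normalized by "female-days" (the number of surviving females summed over the exposure days), which is one common convention; the function name and the egg counts are hypothetical.

```python
def spawning_endpoints(daily_fertilized, daily_unfertilized, female_days):
    """Return (fecundity in eggs/female/day, fertility rate in %)."""
    total_fertilized = sum(daily_fertilized)
    total_eggs = total_fertilized + sum(daily_unfertilized)
    # Fecundity: all eggs (fertilized + unfertilized) per female per day.
    fecundity = total_eggs / female_days
    # Fertility: fertilized eggs as a percentage of all eggs over the period.
    fertility = 100.0 * total_fertilized / total_eggs if total_eggs else float("nan")
    return fecundity, fertility


# Hypothetical tank: 3 females surviving all 21 days (63 female-days),
# laying 60 fertilized and 3 unfertilized eggs per day.
fecundity, fertility = spawning_endpoints([60] * 21, [3] * 21, female_days=3 * 21)
```

For this hypothetical tank the fecundity is 21 eggs/female/day and the fertility rate is about 95%.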

Necropsy and sample collection
The fish surviving at the completion of the exposure were anesthetized in ice-cold water, their body length and weight were measured, and they were then dissected to sample the whole liver. The liver samples were weighed and immediately stored at −20 °C or below until VTG quantification. In addition, the anal fin was imaged in a flat and spread-out condition for each fish, or the posterior region of the fish including the anal fin was collected and stored in appropriate fixative, for example, 10% neutral buffered formalin, prior to SSC assessment.

TABLE 3 Methods and LOQs for test solution analysis in FSTRAs (columns: test chemical, sample pretreatment, analysis, LOQ)
Briefly, the liver samples were homogenized with the assay buffer included in the ELISA kit used. The homogenates in microtubes were centrifuged, and the supernatant (hepatic extract) was collected. The VTG concentrations in the hepatic extracts were quantitatively determined using commercially available ELISA kits, EnBio Medaka VTG ELISA (EnBio Tec Laboratories Co. Ltd., Tokyo, Japan) or medaka VTG ELISA assay kit (Trans Genic Inc., Fukuoka, Japan). In each of the 16 studies, VTG ELISA kits from the same lot were used to eliminate interlot variation. The limit of determination was 1.0 ng/mg of liver weight in all the VTG analyses.

Secondary sex characteristics
For SSC, the number of joint plates in which papillary processes (PPs) were visibly formed was counted on the images of anal fins or on the fixed samples under a microscope (Nakamura et al., 2014; OECD, 2012a).

| Statistical analysis for FSTRA data
For each endpoint, differences between the chemical treatment and the DWC were statistically analyzed on a replicate-mean basis (OECD, 2006c). Briefly, the homogeneity of variance was first assessed by Levene's or Bartlett's test, and the data for which homogeneity of variance was confirmed were then subjected to one-way analysis of variance (ANOVA) followed by Dunnett's test. The data for which the assumption of homogeneity of variance was rejected were appropriately transformed (e.g., by log transformation, square root transformation, or arcsine transformation) and reanalyzed for homogeneity of variance. If no homogeneity of variance was found, even in the transformed data, the data were analyzed by the nonparametric Kruskal-Wallis test followed by Steel's test or Dunn's test. A p value less than 0.05 was considered significant for all the statistical analyses.
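The decision flow described above can be summarized in a short sketch. It takes the p values of the homogeneity-of-variance test (on raw and, if needed, transformed replicate means) as inputs and returns the branch to follow; it does not perform the tests themselves. The function name and the returned labels are hypothetical.

```python
def choose_procedure(p_homogeneity_raw, p_homogeneity_transformed=None, alpha=0.05):
    """Pick the analysis branch from homogeneity-of-variance test results.

    Homogeneity is treated as confirmed when the test is not significant.
    """
    if p_homogeneity_raw >= alpha:
        return "ANOVA + Dunnett on raw data"
    # Variances differ: retry after log, square root, or arcsine transformation.
    if p_homogeneity_transformed is not None and p_homogeneity_transformed >= alpha:
        return "ANOVA + Dunnett on transformed data"
    # Still heterogeneous: fall back to the nonparametric branch.
    return "Kruskal-Wallis + Steel or Dunn"


# Hypothetical p values from Levene's or Bartlett's test.
branch_a = choose_procedure(0.40)
branch_b = choose_procedure(0.01, p_homogeneity_transformed=0.20)
branch_c = choose_procedure(0.01, p_homogeneity_transformed=0.02)
```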
Based on the results from the statistical analysis, the lowest observed effect concentration (LOEC) was determined in each endpoint. The LOEC was defined as the lowest tested concentration in which a significant effect (decrease or increase) compared with the control was observed and equal to or greater effects were found in all the concentrations higher than that (OECD, 2015).
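The LOEC rule defined above can be sketched as follows, under one reading of the definition: a concentration qualifies when its effect is statistically significant and every higher concentration shows an effect of equal or greater magnitude. The function name, the use of a single effect magnitude per concentration (e.g., percent change from control in the direction of the effect), and the example values are assumptions made for illustration.

```python
def find_loec(concentrations, effects, significant):
    """Return the LOEC, or None if no concentration qualifies.

    effects: magnitude of change from control at each concentration
    significant: True where the difference from control has p < 0.05
    """
    rows = sorted(zip(concentrations, effects, significant))
    for i, (conc, effect, is_sig) in enumerate(rows):
        higher_effects = [e for _, e, _ in rows[i + 1:]]
        # Significant here, and no weaker effect at any higher concentration.
        if is_sig and all(e >= effect for e in higher_effects):
            return conc
    return None


# Hypothetical percent decreases in fecundity at four concentrations.
concs = [10.0, 32.0, 100.0, 320.0]
loec_monotonic = find_loec(concs, [5, 30, 45, 70], [False, True, True, True])
loec_irregular = find_loec(concs, [5, 40, 35, 70], [False, True, False, True])
```

In the second hypothetical case the significant effect at 32 is not sustained at 100, so the LOEC moves up to 320.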

| In vitro RGA
To support interpretation of the biological effects observed in the FSTRA studies, in vitro RGAs using medaka Esr1 (mEsr1) and medaka ARβ (mARβ) were performed for the 16 test chemicals. In the present study, Esr1 and ARβ, which are ancestral subtypes among Esrs and ARs in Japanese medaka, were selected based on phylogenetic analyses of Esr and AR together with RGAs for each receptor (Ogino et al., 2016, 2018; Tohyama et al., 2015).

| Medaka Esr1 RGA
The mEsr1 RGA, consisting of agonist and antagonist assays, was performed according to the methods previously reported (Katsu et al., 2010; Lange et al., 2012; Miyagawa et al., 2014 were used as positive controls, respectively, to confirm the assays adequately worked.

| Medaka ARβ RGA
The mARβ RGA, in which the agonist and the antagonist assays were performed in the same manner as in the mEsr1 RGA, was carried out according to the methods previously reported by Lange et al. (2015). For the assays, the human liver cancer cell line HepG2 was used as the host cells, and mARβ/pcDNA3.1, MMTV-Luc, and pRL-TK-Rlu were used as the mARβ expression vector, reporter vector, and internal control vector, respectively. Other materials and methods, for example, operating procedure, reagents, and incubation conditions, were almost the same as in the mEsr1 RGA described previously, except that 11KT (Sigma-Aldrich, purity 99.0%), a main AR ligand in fish, was competitively spiked at 50 nM into the test medium with test chemicals in the antagonist assay. In the agonist and the antagonist assays, 11KT and 2-hydroxyflutamide (2HFLT; Sigma-Aldrich, purity 98.5%) were used as positive controls, respectively.

| Estimation of effect concentration
Effect concentrations, that is, EC50 (half maximal effective concentration) in the agonist assay and IC50 (half maximal inhibitory concentration) in the antagonist assay, were estimated by nonlinear curve fitting (e.g., three-parameter logistic regression curve) using
In the P4 study, a remarkable decrease of P4 concentration was found in the chemical analysis conducted on the 13th day of the exposure (in the second week) for all the exposure concentrations.
Because the cause was considered to be biodegradation due to the microbial growth, cleaning of the inside of the exposure system, for example, dilution tanks, solution supply pipes, and exposure aquariums, was conducted, and at the same time, the cause was verified.
When the test solutions prepared in the system were sampled into separate glass tanks and treated with ultraviolet (UV) irradiation, the P4 concentration after 24-h UV treatment remained at more than 80% of the nominal, whereas the P4 concentration in the sample without UV treatment decreased to less than 20% of the nominal, suggesting biodegradation (data not shown). Although the extreme reduction of P4 concentrations was resolved within 2 or 3 days by cleaning the system, the time-weighted means of the measured P4 concentrations were consequently less than 50% of the nominal for all exposure concentrations except the lowest.
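To illustrate how a time-weighted mean of measured concentrations can be obtained, the sketch below uses trapezoidal integration between sampling days, divided by the total elapsed time. The text does not state which formula was applied, so this is only one possible convention; the function name, sampling days, and concentrations (in % of nominal) are hypothetical.

```python
def time_weighted_mean(days, concentrations):
    """Time-weighted mean by trapezoidal integration over sampling days."""
    if len(days) != len(concentrations) or len(days) < 2:
        raise ValueError("need at least two matching (day, concentration) points")
    area = 0.0
    for i in range(1, len(days)):
        interval = days[i] - days[i - 1]
        # Area of the trapezoid between two consecutive measurements.
        area += interval * (concentrations[i] + concentrations[i - 1]) / 2.0
    return area / (days[-1] - days[0])


# Hypothetical weekly measurements (% of nominal) with a dip on day 13.
twm = time_weighted_mean([0, 6, 13, 20], [95.0, 90.0, 10.0, 85.0])
```

With these illustrative values the single low measurement pulls the time-weighted mean down to about 62% of nominal, showing how a transient loss affects the exposure estimate.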

| Mortality
The results of the FSTRA studies, that is, the mean of endpoint measurements for each treatment and control group, are summarized in Table 5. Other measurement data not included among the apical endpoints of OECD TG229, for example, lengths, weights, hepatosomatic indices (HSI), and gonadosomatic indices (GSI) of the exposed fish, are provided in Table 6.
For the mortalities in controls, the test validity criterion suggested in OECD TG229, less than 10%, was satisfied in all the studies (Table 5). In the chemical treatments, remarkably high mortality was found in males (33%) at the highest concentration of EE2, in both males (59%) and females (42%) for KCZ, and in both males (50%) and females (83%) in P4 exposure. The highest concentrations of E2 and PTH caused a slight increase in mortality (17%).

Number of total eggs
In the controls, the fecundity, denoted as the average number of total eggs laid per female medaka per day during the 21-day exposure period, ranged between 19 and 34 eggs over the 16 studies, as shown in
Note. Data are denoted as mean ± standard deviation, and arrows indicate that a significant increase (↑) or decrease (↓) from the control was found (p < 0.05).
TABLE 6 Length, weight, HSI, and GSI at the completion of the exposure in FSTRAs
in which no overt toxicity was observed. A concentration-dependent decrease and a significant difference from the control at the concentrations in which the mortality was less than 10% were observed in both the PR agonist (P4 and LNG) and the KCZ studies. No effect on fecundity within the test concentrations was suggested in the fish exposed to CMP, FLT, VCZ, PCL, and SDS (Table 5).

Number of fertile eggs
In the 12 studies other than those with EE2, TAM, P4, and LNG, the LOECs for the number of fertilized eggs were the same as the LOECs for fecundity (number of total eggs). In the EE2 and the TAM studies, the LOECs for the number of fertilized eggs were one concentration higher, that is, less sensitive, than those for the total eggs, and conversely, were one concentration lower in the two PR agonist studies (Table 5).

Fertility rate
The mean of the fertility rate during the 21-day exposure period in the controls ranged between 90% and 98% over the 16 studies. A significant decrease could be detected for four chemicals (RAL, VCZ, PCL, and SDS) for which no effect on either the number of total or fertilized eggs was detected (Table 5).

| Hepatic VTG
VTG is generally at a low level in male fish, and hence its induction has been widely used as a biomarker to screen chemicals for estrogenic activity on fish over the years (Hansen et al., 1998; Kime, Nash, & Scott, 1999; OECD, 2009a; Sumpter & Jobling, 1995; Tyler, van der Eerden, Jobling, Panter, & Sumpter, 1996). In the present study, the mean hepatic VTG levels in the control males were less than 10 ng/mg of liver, except for the FLT, PCL, and P4 studies, where the mean values for VTG in control males increased up to a maximum of 38 ng/mg of liver. As to the mean of VTG levels in control females over the 16 studies, a variation between 428 and 4,440 ng/mg of liver, which might depend on the types or interlot variation of the ELISA kits used, was recognized (Table 5).
In the Esr agonist studies, the male VTG levels were significantly elevated depending on the exposure concentrations and exceeded the VTG levels in the control females at the highest concentrations for E2, EE2, and PTH. The LOECs for VTG induction in males were 22.1 ng/L, 17.8 ng/L, 1,060 μg/L, and 96.5 μg/L for E2, EE2, CMP, and PTH, respectively. With regard to the female hepatic VTG, a significant increase was found in all the four Esr agonist studies, but the LOECs were higher than the LOECs on male VTG in all the studies, except for CMP. An increase of VTG in male medaka was also caused by the Esr antagonist exposure, but a concentration dependence was not obvious in the RAL study. On the other hand, VTG levels in females were concentration dependently decreased by both the TAM and the RAL treatment, and the LOECs were 10.0 and 721 μg/L, respectively. Among the AR agonists, only MT decreased female VTG levels in a concentration-dependent manner, and a significant difference from the control could be detected in the two highest concentrations. KCZ and PCL, steroidogenesis inhibitors, reduced hepatic VTG in females, and the LOECs at which no mortality was caused were suggested to be 233 and 44.9 μg/L, respectively. The AR antagonists altered the hepatic VTG levels in neither males nor females. Regarding the PR agonists, female hepatic VTG was significantly decreased at the highest concentration of 226 ng/L in the LNG treatment, but P4 caused no alteration at any exposure concentration.

Abbreviations: AR, androgen receptor; CMP, 4-chloro-3-methylphenol; DHT, 5α-dihydrotestosterone; E2, 17β-estradiol; EE2, 17α-ethynylestradiol; Esr, estrogen receptor; FLT, flutamide; FSTRA, fish short-term reproduction assay; GSI, gonadosomatic index; HSI, hepatosomatic index; KCZ, ketoconazole; LNG, levonorgestrel; MT, 17α-methyltestosterone; P4, progesterone; PCL, prochloraz; PTH, 4-tert-pentylphenol; RAL, raloxifene hydrochloride; SDS, sodium dodecyl sulfate; TAM, tamoxifen citrate; TRB, 17β-trenbolone; VCZ, vinclozolin.
In the SDS study, an increase and a decrease, which were statistically significant but suspected as not being caused by interaction with the HPG axis, were observed in females but no VTG induction was found in males.

| Secondary sex characteristics
The formation of PPs on anal fin rays is a masculine SSC in Japanese medaka. Ogino et al. (2014) reported that the development of PPs is promoted by androgen-dependent augmentation of bone morphogenetic protein 7 and lymphoid enhancer-binding factor-1 in males and can be induced in females by exogenous androgen exposure, that is, exposure to AR agonistic chemicals. In the controls over the 16 chemical studies, the mean values of the SSC, that is, the number of joint plates with PPs on anal fin rays, ranged from approximately 60 to 120 in males, whereas none were observed in females.
The three AR agonists, MT, DHT, and TRB, and both the PR ago-
In the AR agonist studies, a tendency for SSC to be slightly increased in a concentration-dependent manner was also found in males, but a statistically significant difference from the control was detected only in the DHT study. Regarding the other MOAs including AR antagonists, no alteration in either male or female SSC was observed, although a statistically significant difference, which might be caused by a factor not associated with the MOA of the test chemical, was found in the KCZ study (Table 5).

| Medaka Esr1 and ARβ RGAs
The results of the RGAs, that is, the agonist and antagonist assays using mEsr1 and mARβ, are summarized in Table 7. In the agonist assays using mEsr1, the EC50 of E2, the positive control used to verify the assay, was 0.00098 μM (9.8 × 10⁻¹⁰ M), and EC50s of 0.00088, 0.97, and 61 μM were obtained for EE2, PTH, and CMP, respectively.
Conversely, an EC50 could not be determined for the negative substance SDS because a significant increase in fold activation was not shown even at the highest concentration of 100 μM. In addition, the two PR agonists were assayed in the mEsr1 agonist assay, and an EC50 of 1.1 μM was obtained only for LNG. In the antagonist assays with mEsr1, IC50s of 0.14, 0.0026, and 0.00052 μM were obtained for TAM, RAL, and 4HTAM (a positive control), respectively, while no inhibition of the 1 nM E2-mediated mEsr1 transactivation was observed for any of the AR agonists, steroidogenesis inhibitors, PR agonists, or the negative substance SDS.
In the agonist assays using mARβ, the EC50 of the positive control 11KT was 0.0027 μM, and the EC50s of MT, DHT, and TRB were 0.00012, 0.49, and 0.0036 μM, respectively. The two PR agonists also gave a positive response; in particular, the EC50 for LNG was 0.000013 μM, indicating that it was 50 times more potent than the positive control 11KT. Again, no significant increase in fold activation was found for SDS. In the antagonist assays, an IC50 against the mARβ agonistic activity induced by 50 nM 11KT could be determined for three Esr agonists (E2, EE2, and CMP) and the steroidogenesis inhibitor KCZ, as well as for the AR antagonists, including the positive control 2HFLT. The IC50s for the AR antagonistic effect were 0.33, 12, and 5.1 μM for 2HFLT, FLT, and VCZ, respectively, whereas EE2 had the lowest IC50 of 0.14 μM.

| DISCUSSION
In the FSTRA using Japanese medaka, mature adult males and females were exposed together to each test chemical in a test vessel for 3 weeks. Each chemical was tested at a minimum of three concentrations plus an appropriate control, each comprising four replicate tanks to ensure adequate statistical power in the analysis of endpoint data. As apical endpoints for evaluating endocrine disrupting effects of the test chemicals, the eggs produced by the females in each tank were counted daily during the exposure period, with fertilized and unfertilized eggs recorded separately, and hepatic VTG and SSC (i.e., the number of joint plates on anal fin rays on which PPs had formed) were measured quantitatively in all fish surviving at the end of the exposure.
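The two reproductive endpoints derived from these daily counts can be illustrated with a minimal sketch. It assumes three females per replicate tank (consistent with the 12 control females across four replicates mentioned in the discussion of daily fecundity) and a 21-day exposure; the class name, function names, and egg counts are hypothetical and are not taken from the study data.

```python
from dataclasses import dataclass


@dataclass
class TankCounts:
    """Cumulative egg counts for one replicate tank (hypothetical structure)."""
    fertilized: int
    unfertilized: int
    females: int = 3   # assumed number of females per tank
    days: int = 21     # assumed exposure duration in days


def fecundity(tank: TankCounts) -> float:
    """Mean number of total eggs per female per day."""
    total = tank.fertilized + tank.unfertilized
    return total / (tank.females * tank.days)


def fertility_rate(tank: TankCounts) -> float:
    """Percentage of spawned eggs that were fertilized (NaN if no eggs)."""
    total = tank.fertilized + tank.unfertilized
    if total == 0:
        return float("nan")
    return 100.0 * tank.fertilized / total


# Illustrative counts only, not measured values.
example = TankCounts(fertilized=1200, unfertilized=60)
print(f"fecundity: {fecundity(example):.1f} eggs/female/day")
print(f"fertility: {fertility_rate(example):.1f} %")
```

Returning NaN for a tank with no eggs keeps an undefined fertility rate from being read as 0%, which matters when spawning ceases at high exposure concentrations.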
Fecundity and fertility can be the most useful indicators of the general reproductive condition of mature fish because these endpoints reflect the successful integration of a variety of physiological processes; for example, disturbances in the HPG axis that directly or indirectly impair gamete maturation and/or interfere with reproductive behavior will reduce spawning frequency and fecundity (US EPA, 2007). On the other hand, the FSTRA (TG229) guideline also notes that these endpoints are not intended to unequivocally identify specific cellular mechanisms of action (OECD, 2012a). In the present study, a statistically significant reduction in the numbers of both total and fertilized eggs was detected for 10 chemicals, that is, all except CMP, RAL, FLT, VCZ, PCL, and SDS (Table 5).
These results demonstrate that the various MOAs that might interact with the HPG axis, except AR antagonism, in most cases affect the reproduction of mature medaka, and they also suggest that it is probably difficult to determine the MOA of a test chemical from changes in the average egg production (/day/female) during the exposure period. In the FSTRA, because the eggs spawned in each tank must be recorded every day, the time course of fecundity and fertility over the exposure period can be assessed. As shown in Figure 1, the daily egg production over the exposure period followed different trends (variation patterns) that appeared to be associated with the MOA of the test chemicals. The fecundity of fish exposed to an Esr agonist at a higher concentration (e.g., EE2 at 424 ng/L) decreased gradually after the first week (Figure 1A). In contrast, exposure to TAM, an Esr antagonist, drastically reduced daily fecundity within 3 days of the start of chemical treatment (Figure 1B). Similarly, with strong AR agonist treatment, egg production dropped even more rapidly, on the day after the exposure began, but unlike in the TAM treatment, large fluctuations in daily fecundity were found in the MT exposure, especially at the lowest concentration (Figure 1C). A similar response was evident in fish exposed to PR agonists such as LNG (Figure 1D). Interestingly, in the PCL study, where no statistically significant reduction was detected in the mean number of total eggs over the 21 days, daily fecundity at the highest concentration was reduced for a few days after the start of the chemical exposure but recovered thereafter (Figure 1E).
The daily fecundity in the controls, calculated as the average number of total eggs spawned by 12 females, should be stable over the exposure period when feeding and environmental conditions are properly controlled under the FSTRA protocol, because Japanese medaka is a daily spawner (Ankley & Johnson, 2004; Leaf et al., 2011; Padilla et al., 2009). The variation in daily fecundity caused by the chemical treatments is therefore assumed to reflect the MOA of the chemicals tested and the reproductive response of the exposed fish, although the fecundity endpoint in the FSTRA might not be sensitive to AR antagonistic effects (Ankley et al., 2003; Dang, Traas, & Vermeire, 2011; Nakamura et al., 2014). The results of the EE2 study can be interpreted as showing that the strong Esr agonist interfered with certain functions of male medaka, for example, spermatogenesis and reproductive behavior (Islinger, Willimski, Völkl, & Braunbeck, 2003; Schultz, Skillman, Nicolas, Cyr, & Nagler, 2003; Seki et al., 2002), and then, as a secondary effect, reduced the reproductive activity of females in the same tank. The significant decrease in male GSI at the highest concentrations of E2, EE2, and PTH, shown in Table 6, supports this interpretation. Several studies have reported that short-term exposure of adult males and females to synthetic androgens such as MT and TRB reduced egg production in medaka, fathead minnow, and other small test fish (Ankley et al., 2003; Jensen, Makynen, Kahl, & Ankley, 2006; Kang et al., 2008; Pawlowski, Sauer, Shears, Tyler, & Braunbeck, 2004; Robinson, Staveley, & Constantine, 2017). In the present study, treatment with AR agonists caused large fluctuations in daily egg production in addition to the decrease in total egg production.
TABLE 7 Results of agonist and antagonist assays for mEsr1 and mARβ RGA
These results suggest that androgen treatment might not only interfere with oogenesis but also disrupt maturational and ovulatory mechanisms, resulting in inhibition of the normal release of mature oocytes during spawning (Hemmer et al., 2008). Consequently, in the females treated with the AR agonists, the GSI was probably elevated by the increase in mature oocytes retained in the ovaries. As shown in Table 6, an increased female GSI was also evident in the exposures to PR agonists. Several in vivo studies have suggested that synthetic progestins, including LNG, significantly affect the transcriptional expression of genes related to the HPG axis and reduce reproductive capability in fish (Frankel, Meyer, & Orlando, 2016; Han et al., 2014; Paulos et al., 2010; Runnalls et al., 2013; Svensson, Fick, Brandt, & Brunström, 2013; Zeilinger et al., 2009). The in vitro RGA with mARβ demonstrated that the natural progesterone P4 and the synthetic progestin LNG were potent AR agonists in Japanese medaka (Table 7), as in other fish species such as fathead minnow (Bain, Kumar, Ogino, & Iguchi, 2015; Ellestad et al., 2014). In the study of SDS, which was used as a negative substance, no significant effect on the number of total eggs was found compared with the control, but a statistically significant reduction in the fertility rate was detected at the highest concentration (Table 5). A significant reduction in the fertility rate compared with the control was also detected in the studies of RAL, VCZ, and PCL, in which no effect was statistically detected on the number of either total or fertile eggs.
These results suggest that the fertility rate is more sensitive than the other two reproductive endpoints in the FSTRA. Part of the reason is that the variation in fertility rates in the controls was relatively small compared with that in the other endpoints. At the same time, the possibility that some toxic effect unrelated to endocrine disruption caused the reduction in fertility should be considered. For example, in the exposure to SDS at 10 mg/L, it is presumed that lipid peroxidation caused by the surfactant SDS damaged the sperm and consequently reduced fertility (Dietrich et al., 2007; Rosety et al., 2001, 2007). The RGA results showed that SDS has neither agonistic nor antagonistic activity on mARβ or mEsr1, supporting this interpretation. Overall, the 16 chemical studies demonstrated that the reproductive endpoints in the FSTRA using Japanese medaka could be fairly sensitive and helpful for detecting the effects of test chemicals on the HPG axis, although the possibility that an activity unrelated to endocrine disruption influenced them needs to be kept in mind.
FIGURE 1 Daily change in the mean number of total eggs during the 21-day exposure period. Data denote the mean of the daily fecundity (n = 4). Exposure concentrations at which a statistically significant reduction from the control was detected in the number of total eggs throughout the exposure period are marked with an asterisk (p < 0.05). (A) 17α-Ethynylestradiol, (B) tamoxifen citrate, (C) 17α-methyltestosterone, (D) levonorgestrel, (E) prochloraz, and (F) sodium dodecyl sulfate
VTG in male fish is a sensitive biomarker of exposure to estrogenic (Esr agonistic) chemicals and has thus been used frequently and widely in laboratory and field studies of EDCs (Hara et al., 2016; Harries et al., 1997; Sumpter & Jobling, 1995; Wheeler, Gimeno, Crane, Lopez-Juez, & Morritt, 2005; Yamanaka et al., 1998). In Japanese medaka, the level of VTG induction is generally assessed either by measuring the amount of VTG protein in blood (Chikae, Ikeda, Hasan, Morita, & Tamiya, 2004; Tabata et al., 2003) or liver extract (Nakamura et al., 2014; Seki et al., 2002), for example, by ELISA, or by determining the amount of VTG mRNA expressed in liver samples, for example, by quantitative polymerase chain reaction (qPCR) (Lee, Jeon, Na, Choi, & Park, 2002). In the present study, hepatic VTG concentrations were determined quantitatively using two types of ELISA kits for which the intravariability and the intervariability had been assessed (Tatarazako et al., 2004).
In the Esr agonist studies, the VTG levels in male medaka exposed to E2, EE2, and PHT were significantly and concentration dependently and CMP, obtained from the MMT studies, were both 28 ng/L and 345 and >345 μg/L, respectively. Likewise, as to the effects of PHT on Japanese medaka, Seki et al. (2003) reported that the LOECs on reproductive impairment and VTG induction in F0 generation (i.e., in the medaka continuously exposed to the test chemical after fertilization) were 224 and ≤51.1 μg/L, respectively in full life cycle test (FLCT). Furthermore, EE2 significantly elevated male VTG and reduced fertility in F0 generation even at 9.26 ng/L in FLCT (MOE, 2006). These results indicated that the sensitivity of male VTG is comparable between the FSTRA and the long-term tests such as FLCT, MMT, and MEOGRT, although the endpoints related to reproductive status are rather less sensitive in the screening assay than in the definitive tests, especially with regard to Esr agonistic activities.
The relationship among the effective concentrations of the four Esr agonists was quite similar between the FSTRA and the mEsr1 RGA agonist assay. As shown above, in the FSTRA, the LOECs (in molarity) for male VTG were almost the same for EE2 and E2, and the LOECs for PHT and CMP were 14,000 and 110,000 times higher than that of E2, respectively. In the agonist assays, the EC50 for EE2 was slightly lower than that for E2, as in the FSTRA, and the EC50s for PHT and CMP were 990 and 6,200 times higher than that for E2, respectively (Table 7). These results demonstrate that in vivo estrogenic potency is predictable from in vitro data when the rankings of effective concentrations of Esr agonists are compared between in vivo LOECs and in vitro EC50s (Lange et al., 2012).
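The in vitro fold differences relative to E2 follow directly from the EC50 values given in the RGA results. The sketch below, restricted to E2, EE2, and PHT, recomputes them and ranks the chemicals by potency; the dictionary and function names are illustrative assumptions rather than part of the study's analysis.

```python
# EC50 values (μM) from the mEsr1 agonist assay, as quoted in the RGA results.
EC50_UM = {"E2": 0.00098, "EE2": 0.00088, "PHT": 0.97}


def fold_vs_reference(ec50: dict, reference: str) -> dict:
    """EC50 of each chemical divided by that of the reference (>1 means less potent)."""
    ref = ec50[reference]
    return {name: value / ref for name, value in ec50.items()}


fold = fold_vs_reference(EC50_UM, "E2")
ranking = sorted(EC50_UM, key=EC50_UM.get)  # most potent (lowest EC50) first

for name in ranking:
    print(f"{name}: {fold[name]:.3g} x E2")
```

The same function can be applied to in vivo LOECs expressed in molar units, so that the two potency rankings can be compared on a common scale.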
Significant VTG induction was also found in male medaka exposed to Esr antagonists. TAM, a selective Esr modulator, significantly induced VTG in males and reduced VTG in females at the same concentration, 10.0 μg/L. Flynn et al. (2017) reported that the LOECs for TAM in the MMT were 10 μg/L for reproduction (fertility) and 1.3 μg/L for female VTG (reduction); hence, the sensitivity of reproduction to Esr antagonistic activity in the FSTRA appears not to differ much from that in long-term exposure testing. With regard to Esr antagonistic activity, the FSTRA suggested that the LOEC for female VTG was about 70 times lower for TAM (10.0 μg/L) than for RAL (721 μg/L), whereas the mEsr1 RGA antagonist assay indicated that the IC50 for TAM (0.14 μM) was more than 50 times higher than that for RAL (0.0026 μM). Although based on only two chemicals, these results suggest that uncertainties remain in extrapolating from in vitro to in vivo toxicity because of differences in, for example, the physicochemical properties or the external and internal concentrations of the chemicals tested (Groothuis et al., 2015; Stadnicka-Michalak, Tanneberger, Schirmer, & Ashauer, 2014). VTG reduction in female fish is also a sensitive indicator for detecting the effects of chemicals with aromatase/steroidogenesis inhibitory activity (Panter et al., 2004; Villeneuve et al., 2009).
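The opposite potency rankings of TAM and RAL in vivo and in vitro can be made explicit with a small calculation. The sketch uses the female VTG LOECs and the mEsr1 IC50s quoted above; the in vivo ratio is taken on the mass-based LOECs as reported, without molar conversion, and the variable and function names are illustrative.

```python
# Female VTG LOECs from the FSTRA (μg/L) and mEsr1 antagonist IC50s (μM), as quoted in the text.
LOEC_UG_L = {"TAM": 10.0, "RAL": 721.0}
IC50_UM = {"TAM": 0.14, "RAL": 0.0026}


def more_potent(values: dict) -> str:
    """Return the chemical with the lowest effective concentration."""
    return min(values, key=values.get)


in_vivo_ratio = LOEC_UG_L["RAL"] / LOEC_UG_L["TAM"]   # RAL LOEC relative to TAM
in_vitro_ratio = IC50_UM["TAM"] / IC50_UM["RAL"]      # TAM IC50 relative to RAL

# True when the more potent chemical differs between the two systems.
rank_reversal = more_potent(LOEC_UG_L) != more_potent(IC50_UM)

print(f"in vivo:  {more_potent(LOEC_UG_L)} more potent, ratio {in_vivo_ratio:.0f}")
print(f"in vitro: {more_potent(IC50_UM)} more potent, ratio {in_vitro_ratio:.0f}")
print(f"rank reversal: {rank_reversal}")
```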
The formation of PPs on the anal fin is one of the masculine SSCs in Japanese medaka (Iguchi, Ogino, Miyagawa, Yatsu, & Tatarazako, 2019; Oka, 1931) and is an easily quantifiable in vivo assay endpoint. The molecular mechanisms of their development have been elucidated. In the AR agonist studies, MT, DHT, and TRB induced PPs on female anal fins, and their number increased in a concentration-dependent manner. The LOECs for PP formation in females for these three AR agonists were 20.1 ng/L, 1.03 μg/L, and 26.8 ng/L, respectively, making this the most sensitive endpoint in the FSTRA. In long-term testing (e.g., FLCT and MMT), induction of SSC in females and reduction of reproduction occurred at 9.98 ng/L for MT (Seki et al., 2004) and at 32 and 13 ng/L, respectively, for TRB. These results indicate that SSC induction in female Japanese medaka is similarly sensitive in short-term and long-term exposure; thus, the LOEC from the short-term screening assay may be indicative of the suspected effects on reproduction in long-term exposure, as is the case for male VTG with Esr agonists.
A reduction of SSC in males is an endpoint for identifying antiandrogenic (AR antagonistic) effects of test chemicals in a short-term assay. Panter et al. (2004) reported that FLT at 938.6 μg/L significantly reduced the number of nuptial tubercles in male fathead minnow in a 21-day exposure, and in the validation study of the FSTRA using fathead minnow, all three participating laboratories detected a statistically significant decrease in nuptial tubercle number in male fish exposed to VCZ at the nominal concentration of 900 μg/L (US EPA, 2007). In the present study, by contrast, no significant effect on male SSC was detected for either FLT or VCZ.
As for the effects of AR antagonists on Japanese medaka, Flynn et al. (2017) reported that the LOECs of VCZ for reproduction (fecundity and fertility) and SSC in females were 253 and 33 μg/L, respectively, in the MMT study. Furthermore, both FLT and VCZ inhibited the mARβ-mediated transactivation induced by 50 nM 11KT in the mARβ RGA antagonist assay (Table 7). In mature male medaka, a 21-day exposure period might not be long enough to cause visible degeneration of PPs that have already formed as branching bone nodules from bone segments in the anal fin rays. Hence, to adequately detect AR antagonistic activity in in vivo screening, a more sensitive novel test method needs to be established, for example, one using juvenile medaka (Nakamura et al., 2014).
In conclusion, the present study demonstrated that the FSTRA (TG229) using Japanese medaka is applicable to identifying the effects on fish reproduction of chemicals with Esr agonistic, Esr antagonistic, and AR agonistic potency, as well as steroidogenesis inhibitory activity, although the assay appeared to be insensitive to AR antagonists. In addition, the assay with Japanese medaka revealed the effects of the progestin LNG, which masculinized females and reduced their reproductive activity by activating ARs, similar to reports on other small test fish (Frankel et al., 2016; Svensson et al., 2013). Regarding the endpoints, VTG was sensitive to the activities of Esr agonists and antagonists and SSC to those of AR agonists, and their LOECs obtained from the FSTRA might be helpful in inferring the effect concentration on reproduction in long-term exposure. Furthermore, the spawning status, such as the change in daily fecundity over the exposure duration, will help to identify the MOA of a test chemical, although the possibility of toxic effects that do not involve the endocrine system, such as a surfactant action, must be taken into account. On the other hand, fairly high variability among the controls of the 16 chemical studies was found in female VTG levels and fecundity, suggesting that comparison between treated and control fish should be limited to the same study for these endpoints. To enrich the guidance on interpretation of the screening assay, including the in vitro RGA, it will be necessary to continually extend and update its knowledge base, for example, by incorporating data from the FSTRA (TG229) conducted under the EXTEND 2016 Program.