Consensus workshop on methods to evaluate developmental immunotoxicity.

A workshop cosponsored by the National Institute of Environmental Health Sciences and the National Institute for Occupational Safety and Health was convened in Washington, DC, on 17-18 October 2001 with the goal of developing a consensus document on the most appropriate experimental approaches and assays available to assess developmental immunotoxicity. The work group was composed of scientists from academia, the chemical and pharmaceutical industries, and federal agencies with expertise in developmental immunology, developmental toxicology, immunotoxicology, and risk evaluation. This consensus document presents an overview of the major conclusions reached by the work group. A summary of early work in the field, including potential immunotoxic agents, is provided, followed by a brief discussion of our current understanding of developmental immunology. This report concludes with the work group's consensus on the most appropriate experimental design and tests to screen for potential developmental immunotoxic agents in experimental models, including potential limitations and data gaps.

suppression" and "neonate" identified more than 1,000 reports published during the last decade. Although the majority of these studies described agents used for immunosuppressive therapies and chemotherapy, many described developmental immunotoxicity associated with exposure to environmental or occupational agents. Although only a few of this latter group represented studies in human populations and many suffered from design flaws (e.g., small populations), taken together they provided sufficient warning to raise concern regarding the potential health effects associated with exposures in infants and children. Of particular concern was that immunotoxicity often appeared more severe and/or persistent when the exposure occurred perinatally than when it occurred in adult animals. These concerns were highlighted in a 1993 report from the National Research Council titled Pesticides in the Diets of Infants and Children (National Research Council 1993), in which the immune, reproductive, and nervous systems were identified as potential targets for pesticide exposure. Although considerable attention has focused on the identification, characterization, and standardization of methods for developmental neurotoxicity and reproductive toxicity (e.g., Garman et al. 2001; Levine and Butcher 1990; Schwetz and Tyl 1987), only limited discussions on developmental immunotoxicity have occurred (Holsapple 2002). The present workshop extends these earlier discussions by establishing a consensus on the most appropriate experimental designs and methods to identify and characterize developmental immunotoxicity in experimental animals given our current knowledge.

Background
Because the committee had neither the opportunity nor the intent to conduct a thorough literature review, a complete list of agents reported to cause developmental immunotoxicity in humans and animals is not provided in this report; the reader is referred to earlier reviews on this subject (Barnett 1996; Holladay and Luster 1994; Holladay and Smialowicz 2000). The work group instead provided examples of agents believed to represent developmental immunotoxicants based on confirmed, peer-reviewed experimental studies and/or human reports (Table 1). Few, if any, of these agents can be considered uniquely immunotoxic to the neonate because most also demonstrate immunotoxicity after adult exposure. However, for the most part, the agents listed produce more severe or more persistent effects after perinatal exposure than after adult exposure, which appears consistent with our current understanding of the immune system. For example, damage to the thymus during lymphocyte selection, which occurs primarily during the perinatal period, would have a more profound effect than postmaturational thymic damage. Similarly, destruction of neonatally abundant pluripotent stem cells would likely have a more pervasive outcome than destruction of single lineages or differentiated cells that predominate in adults. This phenomenon is exemplified in immunotoxicity studies with 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD) and diethylstilbestrol, where the immunologic effects in the neonate are qualitatively and quantitatively different from those that occur in adult animals (Faith and Moore 1977; Gehrs and Smialowicz 1999; Kalland 1982; Luster et al. 1979; Ways et al. 1987).

Development of the Immune System
Ontogenesis of the immune system in vertebrates is well characterized and involves sequential multiple switching of hematopoietic compartments, resulting in highly regulated cell production, cell migration into primary and secondary lymphoid organs, and cell differentiation within these organs under microenvironmental influences. A detailed discussion of this topic is beyond the scope of this document; the reader is referred to recent reviews (Fadel and Sarzotti 2000; Marshall-Clarke et al. 2000; Landreth 2002). For the most part, the events involved in development of the immune system are similar in humans and species used in experimental studies, including rodents. One major difference, however, is that in humans the immune system is well developed by the end of the first trimester (13 weeks) of gestation, whereas in rodents immune system ontogenesis continues throughout gestation and into early postnatal life, until the animal almost reaches sexual maturity. Thus, the stage of immune development at the time of birth, as well as the level of immunocompetence during early postnatal life, varies between species. For example, hematopoiesis in rodents occurs primarily in the fetal liver until several days after birth, whereas in humans hematopoiesis begins to diminish in the liver and predominate in the bone marrow beginning in the fifth month of gestation (Abboud and Lichtman 2001). Although species-specific differences in maturational times occur in other organ systems as well (Stites et al. 1974), the work group indicated that it is essential that a study design accommodate these differences to allow for extrapolating from animal studies to humans.
"Windows of vulnerability" have been proposed to exist during specific periods of immune ontogeny. This concept is based on presumptions derived from established processes of immune system development rather than actual data with developmental immunotoxic agents. Figure 1 identifies five discrete windows of immune maturation in rodents, based on known periods of highly active cell expansion or cell colonization of lymphoid organs. In rodents, the period of susceptibility first begins during the embryonic period, as blood-forming precursors are formed, and continues into postnatal life. Although similar windows of vulnerability exist in humans, the relative times that these events occur are correspondingly earlier. This phenomenon has been demonstrated in developmental immunotoxicity studies with lead, where different effects may be observed depending on the test species and window of exposure. For example, alterations in macrophage and T-cell function occur in rodents given full gestational lead exposure. However, rats exposed to lead during the first half of embryonic development show persistent changes only in macrophage function, whereas exposure later in development results in both macrophage and T-dependent functional deficiencies (Bunn et al. 2001). Chickens exposed to lead at early stages of embryonic development produce lower levels of inflammatory mediators with no effects on T-cell function, but exposure during the latter stages of incubation significantly suppresses T-cell function (Lee et al. 2001). The consensus from the work group regarding these studies was that determining specific windows of vulnerability for individual agents would be beneficial for studying mechanisms of action but would have limited value for hazard identification, and any study design tailored for hazard identification should consider a more global approach.

Experimental Design and Immunologic Assays
As reflected in the peer-reviewed literature, there is no consensus on the most appropriate experimental species, designs, or tests for examining developmental immunotoxicity. This random study approach may affect not only the quality of the study but also the ability to use the data in assessing human risk. The committee felt that there was a critical need to identify the most appropriate experimental designs and test procedures for developmental immunotoxicology testing currently available.
The first issue considered by the work group was selection of an appropriate test species. After humans, the species in which we know the most about the immune system is the mouse. However, given that considerable information now exists on the immune system in rats, the historical reproductive toxicologic database available in the rat, and the low background incidence of malformations as well as stress effects in the rat, there was a consensus that the rat should be the species of choice for developmental immunotoxicology testing. For developmental immunotoxicology studies that address mechanisms of action, rather than hazard identification, the mouse may be the species of choice because methods, reagents, and animal models (e.g., transgenic mice) are more readily available. Few examples exist where both mouse and rat data are available for the same agent, and in most cases both species appear to respond in a similar manner. However, there are several notable exceptions. For example, changes in thymus weights were reported to be a more sensitive indicator than antibody responses after prenatal TCDD exposure in rats, whereas the reverse was found in mice (Gehrs et al. 1997). Concerning strain selection, no consensus was reached. It was noted that outbred rats are more representative of the human population and are commonly employed in reproductive studies, although inbred strains would introduce less variability in immune tests.
The next topic addressed by the work group was exposure design. Specifically, the committee was asked to identify a design that would best accommodate hazard identification for developmental immunotoxicology. Discussions focused on the following issues: the need to establish an exposure paradigm that would accommodate differences in kinetics of immune maturation that exist between humans and the test species (i.e., rats); whether to include all periods of potential vulnerability during immune ontogenesis (Figure 1); and the importance of discerning agents that prevent immune components from developing fully versus those that simply delay immune maturation. Also of concern to the committee was identifying an exposure design that would accommodate appropriate times for measuring immune response after antigen challenge, the hallmark of adaptive immunity. Several examples were provided that served to demonstrate the important interrelationship between time of exposure, time of measurement, and assay selection. In human neonates, for example, not only are immune responses weak but also there is skewing of responses such that T-helper 2 (TH2), rather than TH1, immunity predominates. Therefore, neonatal stimulation with antigens dependent on specific cell types or mediators may not allow for an optimal response to occur (Ridge et al. 1998), and evaluation of these responses at too early a time point could potentially mask chemical effects on immune function. This can be shown when hydrocortisone is administered late in gestation in rodents, because inhibition of the antibody response can be observed at postnatal day (PND) 28 but not at PND 14 (Ezine and Papiernik 1981).
The major objective of the work group was to identify a simple study design that best addressed hazard identification, so it was agreed that the exposure design should be sufficiently flexible to be incorporated into existing developmental toxicology protocols, rather than needing to be a "stand-alone" study. Existing study protocols, such as the one-generation, two-generation, or segment III studies (for pharmaceuticals) would be suitable because they cover most, if not all, of the critical periods for immune system development. Stimulated by the National Research Council report (1993), Pesticides in the Diets of Infants and Children, a collaborative research project was initiated between the National Institute of Environmental Health Sciences and the U.S. EPA to address some of the scientific and regulatory concerns in this area by exploring the long-term effects of pesticide exposure on noncancer end points in neonatal rats (Chapin et al. 1997; Smialowicz et al. 2001). The experimental design established by this group addressed many of the concerns voiced by the work group. Specifically, the design employed rats, was well suited for hazard identification, could be incorporated into a preexisting study design, covered all of "the windows of vulnerability," and could address possible long-lasting effects. A schematic diagram of the design that allows for immunotoxicology, neurotoxicology, and reproductive and general toxicology assessment is shown in Figure 2 (Smialowicz et al. 2001). In this protocol, pregnant dams are dosed from gestational day 12 to PND 7. On PND 8, the pups are dosed directly using the same dose levels, which continues until PND 42, the approximate end of puberty in rats and a time when the immune system is fully mature. Litters are standardized to four males and four females on PND 7 and weaned on PND 21. Subsets of rats (one male and one female from each litter) are used for immunologic evaluation on PND 42.
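As an illustration only, the dosing and evaluation schedule just described can be written down as a small lookup. The sketch below is not part of the published protocol: all identifiers are hypothetical, the day numbers are taken directly from the text, and the treatment of the day of birth as PND 0 is an assumption.

```python
# Day numbers come from the protocol text; every identifier is hypothetical.
DAM_DOSING_START_GD = 12         # dams dosed from gestational day (GD) 12
DAM_DOSING_END_PND = 7           # ... through postnatal day (PND) 7
PUP_DOSING_START_PND = 8         # pups dosed directly from PND 8
PUP_DOSING_END_PND = 42          # ... through PND 42
LITTER_STANDARDIZATION_PND = 7   # litters reduced to 4 males + 4 females
WEANING_PND = 21
IMMUNE_EVALUATION_PND = 42


def dosing_target(phase: str, day: int) -> str:
    """Return who is dosed on a given day: "dam", "pup", or "none".

    phase is "GD" (gestational day) or "PND" (postnatal day).
    Assumption: the day of birth is counted as PND 0.
    """
    if day < 0:
        raise ValueError("day must be non-negative")
    if phase == "GD":
        return "dam" if day >= DAM_DOSING_START_GD else "none"
    if phase == "PND":
        if day <= DAM_DOSING_END_PND:
            return "dam"
        if day <= PUP_DOSING_END_PND:
            return "pup"
        return "none"
    raise ValueError("phase must be 'GD' or 'PND'")


def immunology_subset_size(n_litters: int) -> int:
    """One male and one female per litter go to immunologic evaluation."""
    return 2 * n_litters
```

For example, `dosing_target("PND", 8)` returns `"pup"`, reflecting the switch from maternal to direct dosing.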
Other pups in the litter may be used for reproductive or lactational assessment and necropsy. The work group emphasized the need to include, as a minimum, plasma concentrations of the test agent in order to help document placental or breast milk transfer of the test material. If the agent is transferred in breast milk, the requirement of directly exposing the neonate before weaning will need to be reconsidered. Independent of the treatment route, exposure through PND 42 is required to accommodate the entire period of immunologic ontogenesis. Limitations in this study design are that relatively few animals are examined, limited mechanistic information is provided, and neither long-term effects nor recovery is routinely assessed.
The next issue addressed by the work group was determining the most appropriate assays for identifying developmental immunotoxicants. Although perinatal exposure to immunotoxic agents may have severe and persistent effects on immune function, there is no universal agreement on what constitutes the most predictive or sensitive perinatal tests, unlike studies with adults. Discussions focused on the need to employ end points that accommodate the temporal nature of the maturational processes, reflect the function and interactions of the major cellular components that constitute the immune system, and are well understood biologically.
The National Toxicology Program has conducted extensive studies in adult mice for immunotoxicity using a "tiered" approach (Luster et al. 1988). The database from these studies, which covers more than 50 compounds, has been collected and analyzed in an attempt to improve the accuracy and efficiency of screening chemicals for immunosuppression and to better identify those tests that predict immune-mediated diseases (Luster et al. 1993). Of the end points examined, quantification of the T-dependent antibody response was shown to be a highly reliable indicator for immunotoxicity, as indicated by its ability to identify immunotoxicity and predict changes in host resistance. This may be because the T-dependent antibody response represents a specific response to antigen and involves most of the major cell types in the immune system (i.e., B cells, TH2 cells, antigen-presenting cells) and mediators [i.e., tumor necrosis factor, interleukin-1 (IL-1), IL-4, interferon γ, adhesion molecules]. Subsequent studies in other experimental species, including rats (Temple et al. 1993), have helped confirm these observations. The clinical relevance of the antibody response (e.g., in relation to childhood vaccines and resistance to certain bacterial pathogens) and the opportunity to make direct comparisons with humans were also considered advantages. With this in mind, investigators have examined immune responses to childhood vaccines to determine immunotoxicity in Dutch preschool children exposed perinatally to environmental polychlorinated biphenyls and dioxins (Weisglas-Kuperus et al. 1995).
The committee's recommendations for an assay battery that fulfilled most of the criteria set forth are listed in Table 2 and include, among others, quantifying the primary antibody response to a T-dependent antigen, although the committee did not determine the optimal age at which the test animal should be examined. It has previously been established that the antibody response in rats approximates a fully mature response by PND 42, whereas at PND 21 the response has yet to reach maximum levels (Ladics et al. 2000). The responses obtained between PNDs 21 and 42 have yet to be closely examined, and kinetic studies will need to be conducted between these time points to more clearly identify when maximum antibody titers are first obtained. The committee did not provide recommendations for any specific antigen but discussed the benefits and disadvantages of some commonly used antigens, such as sheep red blood cells (RBCs) and tetanus toxoid. If antibody tests were conducted using antigens commonly employed in childhood vaccines, the results would be more applicable to human studies. However, there is no reason to believe that the response to sheep RBCs would be any different from those to other antigens, and reagents and methods for using sheep RBCs are established in rats. The committee also recommended measurement of thymus, spleen, and lymph node weights, although the latter two could be influenced by immunization. Careful measurement of thymus weights has provided more consistent data than has thymus cellularity and appears to be a consistent indicator of developmental immunotoxicity (Holladay and Luster 1996). There was broad consensus that complete blood counts (CBCs) are also a potentially useful and sensitive measure in neonates. Intense hematopoietic development occurs early in neonatal development, and any moderate to major loss in a cell lineage may be reflected as a decrease in the CBC or altered differential.
Lastly, the committee recommended, in addition to antibody response, inclusion of a second functional test that would provide a measure of TH1 immunity, such as the cytotoxic T-lymphocyte assay or the more commonly employed delayed hypersensitivity response.
Several assays were discussed that are commonly employed in clinical immunologic evaluations and immunotoxicity testing in adults, but their utility in developmental immunotoxicity testing would require further evaluation. These include macrophage function, complement analysis, and surface marker analysis (Table 2). The opportunities and limitations of cell surface marker analyses were given particular attention because they have been used extensively to study immune development and because animal studies have shown that alterations in fetal thymus and fetal liver hematopoietic cell numbers and phenotypes are a common occurrence after gestational exposure to developmental immunotoxic agents. In particular, depression of fetal thymus cell numbers and qualitative and quantitative changes in surface marker expression on precursor cells in the fetal liver and thymus have been observed by cytometric analyses in rodents after exposure to a variety of immunotoxic agents, including dioxins, diethylstilbestrol, ethylene glycol monomethylether, benzo[a]pyrene, and T2 mycotoxin [reviewed by Holladay and Luster (1996)]. The clinical consequences of these changes have not been established. However, if the effects are severe enough, long-term immunosuppression, anergy (i.e., failure to respond to antigen), or autoimmunity may result by affecting positive and negative selection processes that occur early in immune system development (Hardin et al. 1992; Kamath et al. 1998; Silverstone et al. 1994). Cytometric analysis for both the frequency and the intensity of common surface molecules (e.g., B220, CD3, CD4, and CD8), as well as precursor surface molecules (e.g., terminal deoxynucleotidyl transferase, c-Kit, CD25, CD19, CD43, CD69, and IL-7R), has often been used to examine lymphoid tissues from prenates and neonates. The work group recognized that cytometric analyses may be highly sensitive and can signal chemical-induced effects on maturation and development.
However, concerns regarding the use of cytometric analysis as a screening tool centered on the inability to interpret the biologic significance of quantitative changes in precursor and lineage-specific cell numbers. Therefore, its current use may be more applicable to mechanistic studies (Immunotoxicology Technical Committee 2001).
A third group of tests, representing general and lineage-specific stem cell assays (Weissman et al. 2001), were considered to have potential utility within the context of early (i.e., prenatal) developmental immunotoxicology screening (Table 2). From a mechanistic standpoint, these tests have greatly increased our understanding of hematopoietic processes. Although these assays are increasingly used in preclinical tests for pharmaceuticals to help identify potential adverse drug reactions of a hematologic nature (Parent-Massin 2001), as yet they have unknown utility in developmental immunotoxicology screening and will require further investigation.
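The three groups of assays discussed above can be summarized as a simple lookup structure. This is only an illustrative sketch assembled from the text of this section, not a reproduction of Table 2; the group labels and the identifiers `ASSAY_GROUPS` and `assay_group` are hypothetical.

```python
# Illustrative grouping drawn from the discussion in this section.
# Group labels and all identifiers are hypothetical, not taken from Table 2.
ASSAY_GROUPS = {
    "recommended": (
        "primary antibody response to a T-dependent antigen",
        "thymus weight",
        "spleen weight",
        "lymph node weight",
        "complete blood count",
        "TH1 functional test (cytotoxic T-lymphocyte assay "
        "or delayed hypersensitivity response)",
    ),
    "requires further evaluation": (
        "macrophage function",
        "complement analysis",
        "surface marker analysis",
    ),
    "potential utility in prenatal screening": (
        "general and lineage-specific stem cell assays",
    ),
}


def assay_group(assay: str) -> str:
    """Return the group an assay falls in; raise KeyError if it is not listed."""
    for group, assays in ASSAY_GROUPS.items():
        if assay in assays:
            return group
    raise KeyError(assay)
```

A lookup such as `assay_group("complement analysis")` returns `"requires further evaluation"`, mirroring the work group's view that this assay is not yet ready for routine screening.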

Summary
During the last 20 years, reproductive toxicologists and immunotoxicologists have identified a number of procedures and issues relative to testing methods, study design, and data interpretation for examining potential developmental immunotoxicants. This workshop brought together experts with a wide variety of backgrounds and areas of expertise. Their discussions focused on establishing a framework for the development of rational testing guidelines as regulatory agencies grapple with the issue of environmental exposures and children's health. Consensus was reached on most topics; for those on which it was not, data gaps and research needs were identified. The conclusions from the workshop described in this summary report are meant to aid in conducting developmental immunotoxicity studies and reduce some of the uncertainties that exist in extrapolating data from experimental animals to estimates of risk in humans. Given the limited experience in screening for developmental immunotoxic agents, the listed test procedures should be considered, at best, recommendations that, as time proceeds and data accumulate, will and should undergo revision.