The State of the Scientific Revolution in Toxicology

Correspondence: Thomas Hartung, MD, PhD, Center for Alternatives to Animal Testing (CAAT), Johns Hopkins University, 615 N Wolfe St., Baltimore, MD 21205, USA (THartun1@jhu.edu)


Introduction: Thomas S. Kuhn's view on scientific revolutions

Thomas Samuel Kuhn (1922-1996) remains one of the most influential philosophers of science. His seminal work The Structure of Scientific Revolutions (Kuhn, 1970) addresses its subject in thirteen chapters:
I - Introduction: A Role for History
II - The Route to Normal Science
III - The Nature of Normal Science
IV - Normal Science as Puzzle-solving
V - The Priority of Paradigms
VI - Anomaly and the Emergence of Scientific Discoveries
VII - Crisis and the Emergence of Scientific Theories
VIII - The Response to Crisis
IX - The Nature and Necessity of Scientific Revolutions
X - Revolutions as Changes of World View
XI - The Invisibility of Revolutions
XII - The Resolution of Revolutions
XIII - Progress Through Revolutions

An abridged version1 was prepared by Frank Pajares at Emory University. Kuhn uses the term scientific revolutions for what we more commonly call paradigm shifts: "Because paradigm shifts are generally viewed not as revolutions but as additions to scientific knowledge, and because the history of the field is represented in the new textbooks that accompany a new paradigm, a scientific revolution seems invisible." Kuhn's fundamental assertion that science does not change continuously but in waves was not really new, even more than five decades ago. For example, Swami Vivekananda (1863-1902) is quoted2, "Everything progresses in waves. The march of civilization, the progression of worlds, in waves. All human activities likewise progress in waves - art, literature, science, religion." Kuhn's merit lies in detailing the characteristics and mechanisms of these waves. Figure 1 summarizes the core of the concept of Kuhn's Scientific Revolution Cycle (Kerry et al., 2008).

In toxicology, we increasingly talk about evolution (Hartung et al., 2008a), revolution (Davis et al., 2013), future toxicology (Juberg et al., 2008), 21st century toxicology (Krewski et al., 2020), next-generation risk assessment (Moné et al., 2020), and new approach methods3, which might in itself indicate the broadly perceived need for change and hint that something is indeed happening. It is tempting to apply Kuhn's framework to this area of science. In 2008, we carried out such an analysis (Hartung, 2008a). At the time, we noted a number of anomalies challenging the current paradigm. We began with some assumptions as summarized by Frank Pajares1: A scientific community cannot practice its trade without some set of received beliefs. These beliefs form the foundation of the "educational initiation that prepares and licenses the student for professional practice". The nature of the "rigorous and rigid" preparation helps ensure that the received beliefs exert a "deep hold" on the student's mind.

Normal science "is predicated on the assumption that the scientific community knows what the world is like" - scientists take great pains to defend that assumption. To this end, "normal science often suppresses fundamental novelties because they are necessarily subversive of its basic commitments". Research is "a strenuous and devoted attempt to force nature into the conceptual boxes supplied by professional education". A shift in professional commitments to shared assumptions takes place when an anomaly "subverts the existing tradition of scientific practice". Let us expand on some of these assertions.

"A scientific community cannot practice its trade without some set of received beliefs." In 2008, we elaborated on some examples of such beliefs in a fairly cynical way but, at their core and to differing extents, they are shared by many in toxicology:
a) Chemicals are bad.
b) Animals can closely reflect human reactions, including the uptake, distribution, metabolism and excretion of substances, as well as organ-specific effects, defense mechanisms and cascade events.
c) We can extrapolate from high-dose/short-term effects to low-dose/long-term effects.
d) We do not need statistics.
e) Poor exposure information does not matter too much.
f) If in doubt, be precautionary.
g) Whatever the uncertainty was at the time of risk assessment, we can stop worrying afterward.

Further to some remarks in the 2008 article, we addressed many aspects individually later: We showed that the majority of chemicals on the market are not hazardous, i.e., we found no hazard classification for more than 20% of 9,800 chemicals analyzed (Luechtefeld et al., 2016). We discussed the challenge of animal-based ADME (absorption, distribution, metabolism and excretion) (Tsaioun et al., 2016). We considered defense mechanisms (Smirnova et al., 2015) and mechanisms in toxicology (Hartung and McBride, 2011; Kleensang et al., 2014; Smirnova et al., 2018). We discussed the challenges of extrapolation (Hartung, 2018a) and statistics (Hartung, 2013) in toxicology. The neglect of exposure information (Sillé et al., 2020), the problem of precautionary action (Hartung, 2017), and the uncertainty of toxicological assessments (Luechtefeld et al., 2018) were addressed. Readers are referred to these publications and others.

The important point is that many of the core "beliefs" of toxicology can be challenged, and the many "imperfections" add up. Most toxicologists are probably shocked to see that the six most common OECD guideline animal tests (analyzing 350 to 750 chemicals per test), where the same substance had been tested more than twice, sometimes dozens of times, found a toxic substance again in a repeat experiment in only 69% of cases (Luechtefeld et al., 2018), despite Good Laboratory Practice and highly standardized testing. The situation seems no better for the more complex systemic toxicities: 57% reproducibility for more than 120 cancer bioassays at close to $1 million per study is only one example. Wang and Gray (2016) evaluated, for 37 chemicals undergoing cancer bioassays in rats and mice of both genders in the US National Toxicology Program, the correspondence of results between species and genders as well as with available historical data from chronic studies - the result was sobering: "Overall, there is considerable uncertainty in predicting the site of toxic lesions in different species exposed to the same chemical and from short-term to long-term tests of the same chemical." In summary, whichever items of the above list an individual toxicologist's core beliefs include, there is increasing evidence that they do not hold.


Anomaly and the emergence of scientific discoveries

Kuhn sees the need for anomalies, i.e., something that "subverts the existing tradition of scientific practice", in order to push forward scientific progress and paradigm change. This is not how science normally works. In Kuhn's words, "Normal science does not aim at novelties of fact or theory and, when successful, finds none." In the case of toxicology, in 2008, we already listed the following anomalies:
- The effort of REACH
- The change in the drug industry to biologicals and the rise of nanomaterials
- The drying out of the pharmaceutical pipeline
- Market forces
- Legislation forestalling scientific developments (e.g., EU cosmetics legislation)
- Globalization

In fact, finally attempting to tackle the backlog of testing chemicals already on the market, first by REACH and in the meantime by other legislations world-wide, challenges the limits of our test capacities (Hartung and Rovida, 2009a; Rovida and Hartung, 2009; Hartung, 2010a). While these analyses were disputed by some stakeholders, especially the European Chemicals Agency and the Environmental Defense Fund (Hartung and Rovida, 2009b), the numbers were actually almost spot on: With the EU animal use statistics published in 2020 (Busquet et al., 2020a), the extent of animal use due to REACH becomes evident, though disguised by the fact that these statistics do not report embryos in reproductive toxicity tests, despite the fact that they fall under the EU Directive on the use of animals for scientific purposes, 2010/63/EU (Hartung, 2010b).

We had predicted that reproductive toxicity testing would dominate animal use for REACH (Hartung and Rovida, 2009a), with about 90% of animals used. In our 2020 analysis of the EU animal use statistics we found, "What remains outside of the scope of annual statistical reporting, even if covered by the scope of the Directive, are: a) Foetal forms of mammals" … "Reproductive and developmental toxicity include far more pups than adult animals, e.g., a two-generation study treats only 20 male and 20 female, but in total on average 3,200 animals are involved in case of rats (factor 80) and 2,100 in case of rabbits (factor 53). Similarly, the one generation study OECD TG 414 treats 40 animals but 784 rats (factor 20) or 560 rabbits (factor 14) are involved. The developmental toxicity screening test OECD TG 422 treats 20 animals but involves on average 412 (factor 21). Applying this to 140,513 animals for reproductive toxicity testing or 97,671 animals for developmental toxicity in 2017, several million animals would need to be added." (Busquet et al., 2020a). Thus, in 2017 alone, applying average factors for non-counted pups, 7.8 million animals were used for REACH reproductive toxicity testing. It is worthwhile comparing this with statements made at the time of passing the legislation: "…the vice-president of the European Commission, Guenther Verheugen, said on 7 November 20054 that in a 'worst-case scenario' 3.9 million more animals could be used for testing, which he said was 'not ethically defensible'. He added that the Commission had ideas that would enable it to reduce this extra testing by 70%." A number of "anomalies" for toxicity testing we identified concerned economical aspects, including globalization (Bottini et al., 2007), which we discussed around this time (Bottini and Hartung, 2009, 2010) and again more recently (Meigs et al., 2018).

Simply put, if a really comprehensive assessment such as pesticide safety testing requires $20 million, more than 30 animal tests, more than 5 years, and 20 kg of substance, this is simply not translatable, even in part, to the large number of chemicals on the market. A recent analysis from 19 countries suggests a total of 350,000 chemicals based on their inventories (Wang et al., 2020): "Here, 22 chemical inventories from 19 countries and regions are analyzed to achieve a first comprehensive overview of chemicals on the market as an essential first step toward a global understanding of chemical pollution. Over 350 000 chemicals and mixtures of chemicals have been registered for production and use, up to three times as many as previously estimated and with substantial differences across countries/regions. A noteworthy finding is that the identities of many chemicals remain publicly unknown because they are claimed as confidential (over 50 000) or ambiguously described (up to 70 000)." We have discussed elsewhere the effect of legislation, especially the ban on cosmetic ingredient testing in Europe (Hartung, 2008b). Other industrial sectors creating testing and regulation needs from today's perspective and not yet mentioned in 2008 are food (Hartung and Koëter, 2008; Hartung, 2018b) and non-combustible tobacco products such as e-cigarettes, especially with respect to their additives (Hartung, 2016a).

Another anomaly, already mentioned in the 2008 text, is the rise of nanomaterials. Without entering the discussion here, the fact that enormous numbers of different materials can be created from the same chemical with strongly altered biological (and inevitably toxicological) behavior represents an enormous challenge to the number of risk assessments required (Hartung, 2010c; Hartung and Sabbioni, 2011; Silbergeld et al., 2011).

Little is known about the potential adverse effects of long-term exposure to complex mixtures at low doses (Hernández and Tsatsakis, 2017), which actually represent the most common exposure to environmental chemicals. Therefore, traditional chronic toxicity evaluations for a single chemical could fail to identify all the risks adequately (Tsatsakis et al., 2016). Only an integrated approach of in vivo, in vitro and in silico data, together with systematic reviews or meta-analyses of high-quality epidemiological studies, will improve the robustness of risk assessment of chemical mixtures and provide a stronger basis for regulatory decisions. The increasing understanding that we need to study chemical effects at low and realistic dose levels around the regulatory limits, and with the simultaneous investigation of several key endpoints, represents another anomaly to current thinking.

We have to add a more recent anomaly, i.e., the pressure to develop treatments and vaccinations for the current COVID-19 pandemic as fast as possible. Years of safety assessments before going into humans are simply not acceptable. This goes far beyond the regulatory toxicology part of drug development - we need new approach methods for (Busquet et al., 2020b):
- Drug repurposing
- Target discovery
- Drug efficacy
- Vaccine development
- Combination therapies
- Drug safety
- Quality

Drug and vaccine development take on average 10-12 years (Meigs et al., 2018). Pressure to obtain more relevant safety assessments for humans, faster, cheaper, and thus for more chemicals, has led to shifts in the way we do toxicology. In fact, the US Food and Drug Administration (FDA) has just announced an Innovative Science and Technology Approaches for New Drugs (ISTAND) Pilot Program5 to expand drug development tools (DDTs) by encouraging the development of DDTs that are out of scope for existing DDT qualification programs but may still be beneficial for drug development. Three (of the six) examples they mention are highly relevant in our context:
- Use of tissue chips (i.e., microphysiological systems) to assess safety or efficacy questions
- Development of novel nonclinical pharmacology/toxicology assays
- Use of artificial intelligence (AI)-based algorithms to evaluate patients, develop novel endpoints, or inform study design

These shifts are what Kuhn describes as scientific revolutions - "the tradition-shattering complements to the tradition-bound activity of normal science".


Crisis and the emergence of scientific theories

According to Kuhn, paradigm debates of this kind cannot be settled by the means of normal science alone and take place, in part, outside of normal science altogether; the problems solvable by the old and by the new paradigm overlap, but never completely. For the discussion here, the approach of using some no-effect levels from animal studies plus safety factors as the traditional paradigm is in essence incompatible with an approach that is based on human pathophysiology. In fact, we take the traditional data when we like or can accept them and use human pathophysiology if we need to "de-risk" these findings.

As expressed in Kuhn's thoughts above, the two approaches do not just substitute for each other - they solve overlapping but somewhat different problems. For example, the new approaches are often cheaper and faster, allowing more testing, be it of the number of substances, replicates, doses or their mixtures. Mixture effects are still little addressed but a possible source of tremendous problems (Docea et al., 2018; Fountoucidou et al., 2019; Sergievich et al., 2020; Tsatsakis et al., 2016, 2019a). Similarly, inter-individual differences ("personalized toxicology") and human-specific effects cannot be studied in animals. The other way around, animal testing is still difficult to replace for anything involving complex metabolism of the chemical, immune reactions, the microbiome, behavioral effects, etc. These are the different problems that can be solved, and they will require us to decide what is more important or can be emphasized less. And then other values come in, completely outside the system, which influence these decisions. These are of an ethical, economical and also legal (can the safety of a product be defended in liability cases?) nature. These discussions also cross-fertilize with technical developments, i.e., new approach methods. Two main developments become evident, i.e., the increase in fidelity of (human) cell culture (Marx et al., 2016, 2020) and the advent of reliable computational toxicology. The latter is fueled by the enormous growth in big data relevant to toxicology (Hartung, 2016b; Luechtefeld and Hartung, 2017). As shown in Figure 2, many technologies now provide big data sources, which can be analyzed using machine learning, i.e., artificial intelligence (AI).

Fig. 2: Technologies that change toxicology from data-poor to data-rich, enabling and requiring computational analysis

These big data and AI technologies have already impacted every aspect of our life in recent years. Simply said, AI is making big sense from big data (Hartung, 2018c). Toxicology is no different: omics technologies, high-content imaging, sensor technologies, robotized testing such as by ToxCast and the Tox21 alliance, curated legacy databases, scientific publications, the grey literature of the internet, etc., all feed these big data. The magic of AI is that more and more of these data can be brought together and integrated, which is called data fusion or transfer learning.

In 2018 (Luechtefeld et al., 2018), we pushed this even further by introducing automated read-across into this modelling, i.e., taking advantage of the fact that chemical similarity brings additional structure into the data. At the basis (Hartung, 2019), a map of the chemical universe was created, based on 10 million structures, where similar chemicals are close and dissimilar ones are far from each other, the distance reflecting their degree of similarity. The resulting map allows us to place any chemical in its similarity space and not only interpolate its properties from its neighbors but also to assign a certainty to the prediction based on the specific constellation of information. In simple terms, if there is non-contradictory information from many closely similar structures, the result is pretty certain. The resulting read-across-based structure-activity relationship (RASAR) approach already impressed in its first implementation with a balanced accuracy of 87% for 190,000 cases of chemicals with a known hazard classification. This compares favorably with only 81% reproducibility of six OECD animal guideline tests in a subset of the database (Hartung, 2019).

This approach, based on only curated legacy databases, has been expanded by us and others since 2018. The vision is to integrate more types of information and increasingly use biological similarity, not only the similarity of chemical structures with respect to shared functional groups. This holistic evidence integration (Hernandez et al., 2019) puts a tool into the hands of the practitioner of toxicology to support the analysis of untested or large groups of substances. Formal evaluations are under way, and the first regulatory acceptance has just been achieved with Australia's new chemical legislation in July 2020. This illustrates how a technology not really noticed a decade ago (our own 2010 review on computer-aided toxicology (Hartung and Hoffmann, 2009) did not even mention it) is suddenly making an impact. Big data and AI are the new kids on the block of toxicology.


The nature and necessity of scientific revolutions and revolutions as changes of world view

It is fair to say that until the middle of the first decade of this century, alternative methods were seen more as complements to traditional approaches and as substitutes only in a few select cases. Many toxicologists considered their need more as a tribute to animal welfare activists and, subsequently, to policy makers. This has changed. To give just two quotes, the head of the US National Institutes of Health, Francis Collins, wrote (Collins, 2011), "With earlier and more rigorous target validation in human tissues, it may be justifiable to skip the animal model assessment of efficacy altogether." With a 2018 budget of $37 billion, NIH is the largest single public funder of biomedical research in the world6. In 2016, in his testimony to the Senate Labor, Health & Human Services Subcommittee on April 7, he went further, showing an organ-on-chip device: "I predict that, ten years from now, safety testing for newly developed drugs as well as assessment of the potential toxicity of numerous environmental exposures, will be largely carried out using human biochips … [and] will mostly replace animal testing for drug toxicity and environmental sensing giving results that are more accurate, at lower cost, and with higher throughput."7

The new paradigm has clearly arrived at NIH leadership. This reflects the changing world view toward new approach methods and their regulatory application in leading circles of toxicology. Similarly, the US FDA is embracing the new technology8,9. This is all the more important in light of the central role of the FDA for the world's largest drug market: While the US has only 4.3% of the world population, it consumes 48% of all drugs and 64% of those under patent10. Given the central role of the pharmaceutical industry in leading the development and use of new approaches in the safety sciences, this is of critical importance. Notably, the movement towards an "investigative toxicology" (Beilmann et al., 2019), based on de-risking and mechanistic elucidation of toxicities to support early safety decisions in the pharmaceutical industry, has been embraced by many of these companies.

As Kuhn puts it, perhaps the scientist who continues to resist after the whole profession has been converted has ipso facto ceased to be a scientist. Again, this seems to be a fair description of what is happening in toxicology in recent years. To provide another piece of quantitative evidence (Meigs et al., 2018): "Notably, despite increasing R&D budget, pharmaceutical industry is continuously reducing animal testing in Europe: the share of relatively stable 12 million animals used in Europe dropped from 31% (2005) to 23% (2008) and to 19% (2011), clearly indicating that a substitution by other technologies is taking place."


Summary and conclusions

In 2008, we set out with the need for all science to change (Hartung, 2008a): "A hallmark of science is its continuous development - tomorrow's experiments will challenge today's hypotheses. At least, that is the theory. In practice, science progresses in waves. Roughly with the average half-life of a university department chairman, ideas, hypotheses and schools of thought make it into textbooks and, with few exceptions, are then condemned to die a silent death. They have simply to make place for the new knowledge, which roughly doubles every seven years in the life sciences. Textbooks cannot double in size to keep up with this pace." This holds everywhere but in regulatory toxicology - at least it did up to 2008.

It is also worthwhile to attribute some of the change to the emergence of new technologies. We can only change if there are new options, and it only makes sense to complain about the state of the art if improvement is possible, as expressed by Peter Singer: "I don't think there's much point in bemoaning the state of the world unless there's some way you can think of to improve it. Otherwise, don't bother writing a book; go and find a tropical island and lie in the sun." However, we should not underestimate the importance of showing the need for change in order to prompt new developments. But only with the availability of alternatives will we see the push of agents of change, the revolutionary developers and the publicizers and vendors of their inventions. In the area of alternatives to animal testing as the engine of change in toxicology, this includes not only the novel cell culture technologies referred to above but increasingly big data and AI approaches.
This article aimed to illustrate that actual change is on the way, very much along the lines of Kuhn's analysis. In 2008, a number of anomalies had already started to induce a feeling of crisis, i.e., a sense that the toxicological toolbox and approach cannot serve societal and economic needs. Twelve years later, this has intensified, and the new approach is taking shape and winning a growing following. The discussion is no longer whether to change but how and how fast. According to Kuhn, we are entering the revolutionary phase. And we can be hopeful, as Carl Sagan wrote in 1987 (Sagan, 1987): "In science it often happens that scientists say, 'You know, that's a really good argument; my position is mistaken,' and then they would actually change their minds and you never hear that old view from them again. They really do it. It doesn't happen as often as it should, because scientists are human and change is sometimes painful. But it happens every day. I cannot recall the last time something like that happened in politics or religion."
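The read-across logic discussed above - interpolating a hazard call from a chemical's closest structural neighbors and attaching a certainty that grows when the evidence from closely similar structures is non-contradictory - can be illustrated in a few lines. The following Python fragment is a deliberately minimal sketch, not the published RASAR implementation: it assumes pre-computed similarity scores in [0, 1] and binary hazard labels as inputs, and the function name and the agreement-based certainty formula are illustrative choices of ours.

```python
def predict_hazard(neighbors, k=5):
    """Similarity-weighted read-across over the k nearest neighbors.

    `neighbors` is a list of (similarity, hazard_label) pairs, with
    similarity in [0, 1] and hazard_label 1 (toxic) or 0 (non-toxic).
    Returns (prediction, certainty).
    """
    # Keep the k most similar neighbors
    top = sorted(neighbors, key=lambda n: n[0], reverse=True)[:k]
    # Similarity-weighted vote: closer neighbors count more
    weight_toxic = sum(s for s, label in top if label == 1)
    weight_total = sum(s for s, _ in top)
    p_toxic = weight_toxic / weight_total
    prediction = 1 if p_toxic >= 0.5 else 0
    # Certainty: unanimous close neighbors give ~1, an even split gives ~0
    certainty = abs(2 * p_toxic - 1)
    return prediction, certainty


# Unanimous evidence from five close neighbors: a confident call
print(predict_hazard([(0.9, 1), (0.85, 1), (0.8, 1), (0.7, 1), (0.6, 1)]))  # -> (1, 1.0)

# Contradictory neighbors: same prediction machinery, but low certainty
print(predict_hazard([(0.9, 1), (0.85, 0), (0.8, 1), (0.7, 0)]))
```

The design point of the sketch is the second return value: as in the similarity-map idea described in the text, the same constellation of neighbor information that drives the prediction also quantifies how much the prediction should be trusted.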