Perspectives on the Evaluation and Adoption of Complex In Vitro Models in Drug Development: Workshop with the FDA and the Pharmaceutical Industry (IQ MPS Affiliate)

to facilitate the industry implementation and qualification of microphysiological systems (MPS).

Abstract
Complex in vitro models (CIVM) offer the potential to reduce clinical drug attrition caused by safety and/or efficacy concerns. For this technology to have an impact, robust characterization and qualification plans constructed around specific contexts of use (COU) must be established. This article covers the output from a workshop between the Food and Drug Administration (FDA) and the Innovation and Quality Microphysiological Systems (IQ MPS) Affiliate. The intent of the workshop was to understand, through various case studies, how CIVM technologies are currently being applied by pharmaceutical companies during drug development and tested at the FDA, in order to identify hurdles (real or perceived) to the adoption of microphysiological systems (MPS) technologies and to address evaluation/qualification pathways for these technologies. Output from the workshop includes alignment on a working definition of MPS, a detailed description of the eleven CIVM case studies presented at the workshop, in-depth analysis, and key takeaways from breakout sessions on ADME (absorption, distribution, metabolism, and excretion), pharmacology, and safety that covered topics such as qualification and performance criteria, species differences and concordance, and how industry can overcome barriers to regulatory submission of CIVM data. In conclusion, IQ MPS Affiliate and FDA scientists were able to build a general consensus on the need for animal CIVMs for preclinical species to better determine species concordance. Furthermore, there was an acceptance that CIVM technologies for use in ADME, pharmacology, and safety assessment will require qualification, which will vary depending on the specific COU.
exposed to a training set of 47 clinically studied CNS-acting drugs with a wide range of seizurogenic and neurotoxic liabilities at physiologically relevant exposures. The acute and chronic effects of these drugs on calcium burst characteristics were studied as six independent variables and combined with cellular viability using a logistic regression model to weigh and combine the endpoints into a predictive model. Individual endpoints (e.g., calcium peak amplitude, frequency, width) themselves had poor predictive value. However, when these endpoints were integrated into the logistic regression model, they were able to detect seizurogenic or neurodegenerative liability in 50% of the neurotoxic CNS-active drugs with 93% specificity. Cutoffs were established for each endpoint to favor specificity over sensitivity to minimize potential false positives. The neuronal spheroids were then exposed to 26 drugs with clinically established safety and exposure profiles. These drugs were selected due to their disparate and well-studied mechanisms of neurotoxicity or long track record of clinical safety. In this independent test set, the neuronal spheroids were able to detect 61% of the drugs with known neurotoxic liabilities, with 92% specificity across 10 independent mechanisms of toxicity. These studies have demonstrated that neuronal spheroids have the potential to detect neurotoxic liability during drug discovery and may help reduce the incidence of neurotoxicity in the clinic.
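The idea of weighting several endpoints into one score and then choosing a cutoff that favors specificity can be sketched as follows. This is a minimal illustration only, not the model used in the case study: the weights, bias, feature values, and the helper names `logistic_score`, `sensitivity_specificity`, and `pick_cutoff` are all hypothetical, and the sketch applies a single cutoff to the combined score rather than one cutoff per endpoint.

```python
import math

def logistic_score(features, weights, bias):
    """Combine endpoint measurements into a probability-like liability score."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def sensitivity_specificity(scores, labels, cutoff):
    """Call score >= cutoff positive and compare with known labels (1 = toxic)."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
    fn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 1)
    tn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 0)
    fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
    return tp / (tp + fn), tn / (tn + fp)

def pick_cutoff(scores, labels, min_specificity):
    """Return the lowest observed cutoff that meets the specificity floor.

    Raising the cutoff can only keep or raise specificity and keep or lower
    sensitivity, so the first cutoff that qualifies has the best sensitivity.
    Returns None if no observed score qualifies.
    """
    for cutoff in sorted(set(scores)):
        sens, spec = sensitivity_specificity(scores, labels, cutoff)
        if spec >= min_specificity:
            return cutoff, sens, spec
    return None

if __name__ == "__main__":
    # Hypothetical weights and data: three endpoint changes per compound.
    weights, bias = [1.2, 0.8, 2.0], -2.0
    training = [
        ([0.1, 0.2, 0.0], 0), ([0.3, 0.1, 0.1], 0), ([0.2, 0.4, 0.0], 0),
        ([1.5, 0.9, 0.6], 1), ([0.4, 0.3, 0.1], 1), ([2.0, 1.1, 0.9], 1),
    ]
    scores = [logistic_score(f, weights, bias) for f, _ in training]
    labels = [y for _, y in training]
    print(pick_cutoff(scores, labels, min_specificity=0.9))
```

In practice the weights would be fitted to the training set and the chosen cutoff checked on an independent test set, as the passage above describes.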

packages built around specific contexts of use (COU). The FDA Drug Development Tool (DDT) Qualification Program defines COU as follows: "Context of use is the manner and purpose of use for a DDT; when FDA qualifies a biomarker, it is qualified for a specific context of use. The context of use statement should describe all of the elements characterizing this purpose and manner of use. The qualified context of use defines the boundaries within which the available data adequately justify use of the DDT" (FDA, 2021a). Focused efforts to characterize performance and assess the analytical validity of new tissue/cellular platforms will be needed to build a high level of confidence in the technologies in order to drive industrialization, advance regulatory acceptance, and fully realize the promise of MPS. Creating a sufficient qualification data package around a specific COU can be a significant barrier depending on the technical complexity, throughput, and the associated resources pertaining to a particular MPS. Moreover, the burden of evidence required for qualification may vary depending on both the COU and how the data will be used for decision-making, both internally and to eventually sup-

potentially including two or more of the following: mechanical factors such as stretch or perfusion (e.g., breathing, gut peristalsis, flow), incorporating primary or stem cell-derived cells, and/or including immune system components" (Fig. 2). Since this workshop, the working definition of MPS has been further refined by the U.S. Food and Drug Administration (FDA) 2 and the IQ MPS Affiliate 3 with significant alignment.
MPS have promising applications in mechanistic pharmacology or toxicology investigations, preclinical safety screening, and evaluation of drug disposition that could ultimately improve the translational success of drug candidates into the clinic, with the additional potential benefit of impacting the 3Rs (replacing, reducing, and refining animal use) in preclinical studies. Over the last decade, the sophistication and capabilities of these cellular models have matured dramatically, and the landscape has become more diverse and increasingly complex. Yet, a number of challenges remain before these cellular technologies can be fully incorporated into drug discovery and development, with one of the most critical being establishment of robust qualification

2 https://www.fda.gov/science-research/about-science-research-fda/advancing-alternative-methods-fda
3 https://www.iqmps.org/

Fig. 1: Drug discovery and development
The image addresses when CIVM can be used for internal decision-making without undergoing qualification required for regulatory submission and shows common themes and specific topics that overlap between drug discovery and development phases. HT, high throughput; ID, identification; tox, toxicology

to support drug discovery may not be visible to regulatory agencies in regulatory submissions since the relevance to clinical outcomes may be unclear. These are some of the many challenges associated with MPS and other exploratory data generation.
To address these challenges and support the adoption and implementation of MPS in drug development, the FDA and IQ MPS Affiliate members convened an interactive face-to-face workshop at the FDA White Oak Campus on February 26, 2020.
The goals of this workshop were 1) to understand how CIVM are being applied by the FDA and pharmaceutical companies in the drug discovery and development process, 2) to identify hurdles, real or perceived, to the adoption of CIVM, and 3) to outline evaluation and qualification pathways for these technologies. This report summarizes presentations, breakout sessions, discussions, insights, and outcomes of the collaborative and interactive workshop.

FDA involvement in CIVM development and regulatory use
FDA has been actively involved in the field of complex in vitro models for many years. For example, in 2011 the Defense Advanced Research Projects Agency (DARPA) funded research on MPS. DARPA included FDA in this program from the beginning to help ensure that regulatory challenges of reviewing drug safety and efficacy were considered during development of MPS platforms. In 2012, the National Center for Advancing Translational Sciences (NCATS) funded the Tissue Chip Development Program and also included FDA. These programs recognized that it was critical to have regulator input if one aim was ultimately to use a method for regulatory use.
To encourage development of new toxicological approaches, the FDA published its Predictive Toxicology Roadmap in 2017, and FDA scientists also contributed to the development of the Interagency Coordinating Committee for the Validation of Alterna-

port regulatory filings (Hendrix et al., 2021; Leptak et al., 2017). For example, qualifying a liver-based microfluidic organ-on-a-chip model to predict risk for human drug-induced liver injury (DILI) would likely require a robust data package with responses from large test and validation sets that represent the many different etiologies associated with this toxicity. Conversely, the qualification and associated data package needed for the same hepatic MPS to address target engagement and pharmacodynamics of an exploratory therapeutic would presumably require less data. The qualification package for internal (e.g., Sponsor) decision-making may be different from what is required for regulatory acceptance. Similarly, qualification packages needed for MPS to augment traditional and accepted models/data would require less rigor than those needed to support claims that replace and/or refute traditional and accepted models/data in drug discovery and development. The factors outlined above make defining simple guidelines for qualification around COUs for MPS or other exploratory in vitro models challenging.
Lastly, there often can be hesitation in generating exploratory data on molecules in drug discovery and development if it is not clear how to translate those findings to humans for risk assessment. This is especially true for safety data related to molecules with ongoing active clinical programs. Accordingly, this becomes a "chicken or egg" conundrum, where one needs a robust qualification package in order to generate data on the molecules of most interest; yet, a major part of that qualification package is the translatability of the data to clinical outcomes, which may require prospective generation during drug discovery. This is becoming a bigger problem considering the emergence of new modalities such as cell and gene therapies, antisense oligonucleotides, bispecific antibodies, etc., where there will be fewer relevant comparators around which to build qualification packages. In these cases, there will be a greater need for deliberate data generation and exploration in the drug discovery space to eventually establish translatability later in the clinic. As such, MPS data generated in an exploratory fashion

Current regulations do not explicitly define the types of studies needed to address these sorts of issues. However, guidance documents do provide recommendations on the types of studies considered appropriate. Regulations and guidance also allow new approaches to be submitted to the Agency in regulatory applications. A method does not have to be formally validated before it is submitted but should be supported by suitable data sets to qualify the model for the intended use as described below.
When assessing data from CIVMs submitted to the Agency, reviewers consider how scientifically valid the information is for the particular purpose or COU based on supporting information (see Tab. 1 for example). Principles of validation are well described in Organization for Economic Co-operation and Development (OECD) and ICCVAM publications, and while it may not be necessary to apply all these principles rigorously for all COU, they do provide useful guidance for how new technologies can be evaluated. Sponsors are encouraged to discuss with the FDA and other regulatory bodies the potential use of CIVMs before submission so that feedback can be provided, and appropriate expectations established.
The interaction of the FDA with outside stakeholders such as the IQ MPS Affiliate provides venues for FDA personnel to learn about the new technologies and become familiar with their capabilities and limitations. Stakeholders learn about FDA perspectives and concerns through these venues. Ultimately, these interactions will help to build the confidence that both regulators and industry need to move these technologies forward.

Workshop planning
A key goal for the workshop was to facilitate discussions around building confidence in the applications of MPS for regulatory submission endpoints. To effectively accomplish this, the work-

tive Methods (ICCVAM) roadmap published in 2018. Both roadmaps focus on regulatory Agency needs and the development of new approaches and methodologies with COU that help to address these needs. FDA scientists continue to be engaged with both initiatives. In addition, the Agency has created an Alternative Methods Working Group to promote the development and use of new technologies to better predict human and animal responses to products relevant to its regulatory mission.
Despite these activities, the use of complex in vitro models in a regulatory context has been somewhat limited. It is relatively common for Sponsors to use CIVM (e.g., micropatterned cells, spheroids, organoids, 3D static tissues, etc.) to explore the pharmacology of drug candidates and to screen candidates for specific toxicities of concern early in drug discovery. Currently, the more advanced CIVM such as organs-on-a-chip are less commonly used as a screening tool in drug discovery. Despite disparities in the use of different model types in drug discovery and development, data derived from such models may be submitted to the Agency in investigational new drug applications at the discretion of the Sponsor. However, these studies are seldom pivotal in determining whether it is reasonably safe to proceed into clinical trials.
For a complex in vitro model, or any new technology, to be useful in a regulatory setting, it must provide information about an issue or question that needs to be addressed for the regulated product. This includes issues such as:
- Identifying dose levels or systemic exposures at which no adverse effects are observed
- Determining a reasonably safe first-in-human dose for human pharmaceuticals
- Identifying potential target organs of toxicity
- Identifying potential developmental and reproductive toxicity
- Identifying potential carcinogenicity
- Identifying and understanding the factors that affect different responses by sub-populations

- 3D spheroid model can be valuable for the early estimation of chondrogenic differentiation capacity of multipotent stromal cells.
- 3D microfluidic co-culture models could be useful in evaluating vasculogenic potential of multipotent stromal cell trophic factors.
- The results from both models could have predictive value for biological activity of cellular products.
- Characterizing liver MPS performance using the IQ MPS Affiliate recommendations is very helpful in identifying adequate models for safety testing (Baudy et al., 2020).
- Liver MPS models can be valuable for DILI de-risking and mechanistic investigation, but understanding their context of use is critical.
- Micropatterned models have been found to be especially helpful for liver toxicity screening.

- MPS can increase sensitivity to detect risk and/or integrate multiple DILI mechanisms into a single system.
- Each new model requires internal qualification, which can be challenging.
- The numerous DILI etiologies require multiparametric risk assessment, including data derived from liver MPS models.

- 3D spheroids improve hepatotoxicity prediction over current 2D HepG2 cell assay.
- 3D spheroids detected more clinically hepatotoxic compounds at day 14 compared to 5 other in vitro models.

- Model detected changes in ALT elevations when preclinical animal studies showed no hepatotoxicity with a proprietary compound.
COU and replace existing models (e.g., in vitro to in vivo extrapolation (IVIVE) including use of rat, human, dog CIVM data to build confidence). The topic of species differences and concordance allowed consideration of situations when there is a lack of concordance between data sets (e.g., negative in rat and dog in vivo studies but positive in the human CIVM). How to address translation of the data (e.g., tissue efficacy) between human (CIVM and/or clinical), dog, and rat in vivo data was also considered.
The final topic of discussion was focused on overcoming the barriers to including CIVM data in regulatory submissions. This topic enabled exploration into what types of supporting CIVM data regulators would expect to see included in submissions and, from an industry perspective, what data industry would be likely to submit. This topic also allowed the breakout subgroups to address whether animal (e.g., rat) versions of CIVM would be needed to investigate concordance across in vitro and in vivo models as well as across species. Each breakout session

shop format included two didactic sessions and a breakout session with three areas as described in Table 2.
The didactic sessions consisted of the FDA representatives describing their experiences and perspectives on CIVM with a microfluidic component (i.e., MPS), whereas the industry representatives presented case studies focused on either liver or other organ/tissue (lung, intestine, neurological) models (Tab. 2) that spanned the full breadth of the CIVM definition (Fig. 2).
The breakout subgroups were separated into three main areas that included ADME (absorption, distribution, metabolism, and excretion), pharmacology, and safety. This helped to better facilitate the discussions that focused on three key topics including:
- Performance criteria
- Species differences and concordance
- Overcoming the barriers to inclusion of CIVM data in regulatory submissions
Performance criteria was chosen as a topic to explore how to develop CIVM and use existing data to accept them for specific

- Robust CYP3A4 induction in response to known inducers in human adult ileal and colon organoids.
- Hepatocyte CYP3A4 inducer (phenytoin) did not induce CYP3A4 in intestinal organoids.
- Using hepatocyte induction data to predict gut induction by phenytoin may not be appropriate.
- Lung chip device with mechanical cyclic stretch and IL-2 stimulation was able to recapitulate human pulmonary edema with relevant pathophysiology.
- Clinical translatability with pharmacological agents was evidenced by suppressed pulmonary vascular leakage.

- Neural spheroids capture many, but not all, seizure mechanisms related to small molecules with high specificity in training and test sets.
- Neural spheroids have sufficient throughput and reproducibility to rapidly screen preclinical projects for backup molecules with lower seizure risk.

- 2D and 3D human iPSC-derived neuronal models produced the same compound ranking.
- Payload permeability appears important in ADC-mediated neurotoxicity in vitro.

(Rouse et al., 2018a,b), CIVMs are investigated in CDER to ensure the Agency's readiness for the regulatory outcomes of this emerging technology and in mission-critical applied research to assess the safety, efficacy, quality, and performance of drugs. As potential drug development tools, CIVMs must yield reproducible results, operate robustly, and perform under well-defined quality control criteria. Most importantly, CIVMs must be developed for a COU and be shown to improve on, or be equivalent to, the outcomes of currently used techniques, or to provide novel or unprecedented mechanistic insight into drug effects. With CIVMs, low reproducibility of results is thought to result from variations in device handling (assembly, protein coating, priming/pumping, media changes, operational calibration, etc.) or from variations in the origination of cellular materials (cell isolation protocols, genetic backgrounds, types of media, induced pluripotent stem cells (iPSC), differentiation, cell-specific handling, etc.). Given the different factors that can lead to variability in system performance, the following goals drive the research developed in CDER around variability in experiments that represent specific applications of CIVMs:
i) Assessing site-to-site variability by repeating experiments in CDER laboratories that were already done by developers or testing centers to evaluate experimental factors that ensure the robust use of CIVMs;
ii) Managing chip-to-chip variability by characterizing how different device or cell batches can affect results;
iii) Establishing protocols and standard operating procedures to standardize system preparation, drug treatment, and schedules for measurement of drug effects based on COU.
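One common way to put numbers on chip-to-chip and site-to-site variability is the coefficient of variation (CV). The sketch below is illustrative only and is not a method described in the workshop: the function names, site labels, and readout values are hypothetical, and it assumes one quantitative endpoint measured on at least two chips at each of at least two sites.

```python
from statistics import mean, stdev

def cv_percent(values):
    """Coefficient of variation (sample SD / mean) as a percentage.

    Needs at least two values and a non-zero mean.
    """
    return 100.0 * stdev(values) / mean(values)

def variability_summary(readouts_by_site):
    """Summarize variability for one endpoint.

    readouts_by_site maps a site name to the per-chip readouts at that site.
    Returns (chip-to-chip CV per site, site-to-site CV of the site means).
    """
    chip_to_chip = {
        site: cv_percent(chips) for site, chips in readouts_by_site.items()
    }
    site_means = [mean(chips) for chips in readouts_by_site.values()]
    site_to_site = cv_percent(site_means)
    return chip_to_chip, site_to_site

if __name__ == "__main__":
    # Hypothetical readouts (arbitrary units per chip) from two laboratories.
    readouts = {
        "developer_lab": [41.0, 38.5, 44.2, 40.1],
        "second_lab": [35.9, 39.4, 33.8, 37.0],
    }
    per_site, between_sites = variability_summary(readouts)
    print(per_site, between_sites)
```

Acceptance limits for such CVs would depend on the COU; none are implied here.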
To achieve such goals and increase the impact of research in the field, CDER participates in collaborations with stakeholders involved in drug development from other FDA centers, government agencies, industry, and academia (Isoherranen et al., 2019). The Health and Environmental Sciences Institute (HESI), ICCVAM, the National Institutes of Health (NIH) Tissue Chips Consortium, and The European Organ-on-Chip Society (EUROoCS) are some examples where multiple drug development stakeholders develop collaborative efforts at different levels around this field. Overall, the involvement of CDER researchers with industry consortia, such as the IQ MPS consortium, aims to identify and develop strategies to address gaps and

was tasked with identifying key outcomes to structure the discussion effectively.

Identifying gaps, opportunities and needs for using CIVMs in drug development
The potential of CIVMs to predict clinical drug effects stems from their ability to enhance the physiological relevance of cells in culture by exposing them to a microenvironment with tissue-specific properties that drive cellular function to better represent components of clinical drug effects (Low and Tagle, 2017; Khetani et al., 2015; Lancaster and Knoblich, 2014; Ribeiro et al., 2019b). Systems have been developed to enhance the physiology of cellular function through different approaches, such as culturing cells in 3D, in co-culture with other cell types, within biomimetic matrices, exposure to mechanical or electrical stimulation, or by inducing more mature cellular morphologies. These approaches vary depending on the type of tissue to be modeled, as well as on the cell types used. Several systems that enhance cellular physiology can be found in the literature for the same tissue type, such as for liver (Khetani et al., 2015; Ribeiro et al., 2019b; Baudy et al., 2020). Given the diversity of systems in the field for particular tissue types and the lack of clarity on their potential COU in drug development, CDER researchers have been characterizing applications of CIVMs in predicting clinical drug effects. Overall, scientific research in CDER is dedicated to applications related to the center's mission to protect and promote public health by ensuring that human drugs are safe and effective for their intended use, that they meet established quality standards, and that they are available to patients. As done in other research fields

blood, cellular and gene therapy, and vaccines (FDA, 2021b). In contrast to most drugs that are chemically synthesized, and whose structure is known, most biological products are complex and isolated from a variety of natural sources including humans, animals, or microorganisms.
Due to their inherent complexity, it is quite challenging to fully characterize biological products by conventional testing methods and, even after thorough characterization, it is possible that some components of a final biological product remain unknown. In addition to their complexity, most biological products are fragile and sensitive to changes in the manufacturing process, so even minor changes could significantly affect their quality and functional capacity. Therefore, to ensure product safety, effectiveness, consistency, and quality, it is critical to tightly control the manufacturing process as well as source materials. Furthermore, it is critical to develop new test methods for biological product characterization that provide enhanced sensitivity, specificity, and predictive value.
In the case of regenerative medicine and cellular therapy products that are regulated by the Office of Tissues and Advanced Therapies (OTAT) in CBER, there are several critical manufacturing issues to consider early in the development of a cellular product. These include the control of source materials, identification of critical product quality attributes, and development of a sufficiently robust manufacturing process. Regenerative medicine is the "process of creating living, functional tissues to repair or replace tissue or organ function lost due to age, disease, damage, or congenital defects" (Perez-Terzic and Childers, 2014; Mao and Mooney, 2015), and cell therapies rely on the administration of viable cells into a patient's body to grow, replace, or repair damaged tissue to treat a disease or an injury. When cells are starting materials as well as final, finished therapeutic products, the regenerative medicine cellular therapy industry faces challenges in manufacturing large-scale and high-quality cellular products at a cost meeting market expectations. Comprehensive cell characterizations that are designed to distinguish critical quality attributes of cellular products are considered to help meet regulatory expectations for product identity, purity, and potency (Sung et al., 2020). Reliable and predictable product characterization during the early stage of product development could help detect subpopulations of cells in starting materials as well as in final products that might not be effective or that could possibly present safety concerns.
Well-designed CIVMs that recapitulate critical physiological conditions have the potential to contribute toward developing and improving test methods for product characterization and toward identifying product attributes that are predictive of safety, efficacy, and potency. In addition, because CIVMs have the ability to tightly control microenvironmental factors and process parameters such as soluble factors and shear stress, CIVMs can also be potentially employed to understand the effect of various manufacturing process parameters on producing consistent cellular products. During the workshop, a CBER scientist presented two ongoing projects demonstrating how CIVMs could be potentially used to assess the functional capacity of regenerative medicine cellular products. Two forms of CIVMs, 3D organoids

opportunities of CIVMs in COU related to drug development and regulation.
In general, CIVMs have been demonstrated to hold high potential for predicting the efficacy and safety of new drugs or generic drugs by providing data related to drug mechanisms of action, effects of patient-specific properties, drug pharmacology, and toxicity mechanisms. The COU for CIVMs will depend on the type of drug to be tested and on the effects that need to be evaluated. To eventually shift drug testing paradigms that may rely on "black box" compensatory physiological mechanisms in animal models or humans, CIVMs are evaluated to meet the requirements of performance in specific applications that define the COU. Lists of compounds with well-known clinical outcomes and mechanistic effects are key for developing the appropriate COU for cell-based in vitro tools. CDER research on CIVMs was initiated around hepatic and cardiac systems because toxic effects in these organs are the main causes for drug attrition (Fermini et al., 2018; Weaver and Valentin, 2019) and because hepatic metabolism and transport play a strong role in modulating drug effects (Malki and Pearson, 2020).
CDER research on hepatic CIVMs is currently dedicated to MPS, spheroids, and sandwich cultures, with primary cells or hepatocytes differentiated from iPSCs (Ribeiro et al., 2019b; Dame and Ribeiro, 2021). Engineered heart tissues and heart-on-a-chip using iPSC-derived cardiomyocytes are also being characterized by CDER researchers for potential applications in predicting drug toxicity (Ribeiro et al., 2019a). Overall, results demonstrated that organ-specific cellular functions lasted longer in these systems than in more traditional culture platforms. With liver systems, reproducibility between two sites was initially tested around known mechanisms of trovafloxacin hepatotoxicity that were dependent on inflammatory factors. Other compounds are being used to further investigate variability in liver system performance. Results from cardiac systems demonstrated some potential for testing prolonged toxic effects of drugs on cardiomyocyte contractile function. Upcoming research is evaluating a heart-liver interconnected system for predicting cardiotoxic effects of drug metabolism (Dame and Ribeiro, 2021) and applications of generic drugs in lung MPS and kidney organoids. Since the quality of the cells utilized is central for performance of investigated CIVMs, CDER researchers are also studying performance standards for both primary and iPSC-derived cells. Long-term goals of CDER research in this field will strive to set specific COU for CIVMs, which will require multi-stakeholder partnerships (Isoherranen et al., 2019).

Research program in the FDA Center for Biologics Evaluation and Research (CBER): Assessing the functional capacity of regenerative medicine cellular products
CBER regulates various biological products for human use and protects and advances public health by ensuring that biological products are safe, effective, and available to those who need them. Toward the goal of advancing the scientific basis for the regulation of biologics, CBER scientists conduct a variety of mission-related programs, including research into allergenics,

aspects of fialuridine-induced mitochondrial toxicity as observed by the dose-dependent reductions in urea synthesis rates, whereas the 3D printed model could not. However, an advantage of the 3D printed model over the micropatterned model was the ability to detect the endogenous human glycine-conjugated bile acid glycochenodeoxycholic acid (GCDCA) in the supernatant, possibly due to the significantly higher hepatocyte number present. GCDCA could be used as a potential biomarker to evaluate bile salt export pump (BSEP) inhibitors associated with cholestatic liver toxicity.
It was concluded that identifying the COU for the application of CIVMs is essential and that more complexity is not necessarily always better. Micropatterned models can meet many of the critical hepatocyte performance criteria described by the IQ MPS Affiliate and provide a robust platform to study mechanism(s) of hepatocyte toxicity. Ideally, a single liver CIVM would be able to address all of the complex biology that is needed while also allowing for numerous assay endpoints to be measured. Current major gaps and challenges of CIVMs include a lack of hepatic blood and bile flow, lack of functional compartmentalization, and the need for significant investment and effort to incorporate computational data management tools to improve in vitro to in vivo predictions to enable in vitro model-based human risk assessment.

A comparison of complex in vitro liver models to recapitulate signals associated with clinical DILI (Aaron Fullerton, Genentech)
As a rapidly growing number of CIVMs enter the non-clinical safety landscape with the promise to address shortcomings in the prediction of drug candidates for DILI liability, it has become exceedingly important to understand the COU as well as the benefits and challenges of adopting these diverse CIVMs in drug discovery and lead optimization. In studies conducted at Genentech over the last few years, we have addressed the added value of 3D human liver microtissues over traditional primary human hepatocyte monoculture for the prediction of DILI risk. The initial portion of this work was published in Proctor et al. (2017), wherein human liver microtissues (hLiMTs) composed of primary human hepatocytes and Kupffer cells were reported to enhance sensitivity and the predictive value to detect drugs with high risk of clinical DILI. Expanding on this effort using a diverse test set of commercial compounds with well-characterized DILI in humans (80 DILI-positive/90 DILI-negative) to establish response thresholds in these assays, it was further reported that hLiMTs demonstrate an enhanced sensitivity over primary human hepatocyte monoculture while maintaining a high level of specificity for compounds with DILI liability.
Additionally, retrospective studies utilizing these in vitro models were performed to assess an internal molecule, i.e., a highly potent inhibitor of a novel (non-oncology) target for Genentech. Although no evidence of liver injury was observed in the non-clinical safety studies in either rats or dogs, the clinical development of this molecule was discontinued following transaminase elevations (with no Hy's law violations) in healthy volunteers during Phase 1. The subsequent in vitro studies further
and 3D microfluidic compartmentalized co-culture systems, are being tested to evaluate the chondrogenic and vasculogenic potential of multipotent stromal cells (MSCs), respectively. Although MSCs are being investigated in clinical trials to evaluate their ability to protect, restore, and repair tissues in the human body, no MSC-based product has yet been licensed in the US despite the significant investment in manufacturing and clinical trials. This is partly due to the inherent heterogeneity of MSC populations and partly due to the lack of reliable quantitative assays that can provide predictive values of manufactured MSCs. To evaluate the chondrogenic potential of MSCs, functionally relevant morphological profiling (FRMP) was applied to identify changes in the size and shape of stimulated MSC organoids that can help predict whether the cells will differentiate into effective therapeutic products. It was shown that certain morphological features, such as the size of the organoids, can predict as early as four days after stimulation which MSCs will eventually develop their chondrogenic ability. Using high-throughput screening-compatible CIVMs, the team was able to screen the chondrogenic and vasculogenic potential of MSCs manufactured from varying donors and cell passages.
Continued efforts on developing and testing relevant CIVMs on product characterization will advance the scientific basis for the regulation of complex biological products and will help enhance the safety, effectiveness, quality, and consistency of the products.
The CIVM field has been developing over the past 20 years. One of the earliest examples was the directed deposition of cells in specific spatial patterns, a process originating from Thomas Boland and accomplished with a retrofitted inkjet printer (Wilson and Boland, 2003). Since then, the notion that spatial cues can enhance and stabilize liver model function has been substantiated by numerous micropatterned and 3D printed liver CIVMs (Khetani and Bhatia, 2008; Nguyen and Pentoney, 2017). In our studies, we characterized such models by performing full transcriptome sequencing with comparison to 2D liver cell lines, iPSC-derived liver cells, primary hepatocytes, and human liver biopsies.
A major finding was that a micropatterned model could maintain high fidelity with human liver biopsies over time (Kang et al., 2020). Albumin production and urea synthesis rates were separately surveyed for 17 different CIVMs, and it was found that the majority of models had performance levels below the calculated in vivo human target levels (Baudy et al., 2020). It was notable that a micropatterned liver model and some liver chip microfluidic models could meet these performance criteria as well as demonstrate favorable drug metabolism capacity. In a study comparing a micropatterned model to a 3D printed liver model, it was found that the former was capable of reproducing
der an IRB/EC-approved protocol. The primary 3D PHH spheroid assay that utilized 384-well ultra-low attachment plates was characterized by studying growth via phase contrast, fluorescent live cell imaging, and monitoring viability by CellTiter-Glo®; it demonstrated sustained high viability over 22 days. Drug metabolism was demonstrated over 14 days by observing 7-ethoxycoumarin depletion. Compound penetration using amiodarone was observed in the hepatocyte spheroids and both typical hepatocyte morphology and the lack of an autolytic core were also observed. The 3D PHH model was compared to five other in vitro hepatocyte assays (2D HepG2 cell health assay, rat liver tissue, human liver tissue, sandwich-culture 2D primary human hepatocytes, and the HepatoPac 2D assay) to determine the predictive capacity of the assays. At day 14, only the 3D PHH model was able to correctly identify all five (troglitazone, acetaminophen, diclofenac, bosentan and fialuridine) of the selected high-DILI concern compounds, as defined by the FDA DILI rank and LTKB databases (Chen et al., 2011, 2016).
The 3D PHH model was then used to test a larger compound set of 199 compounds, which showed a specificity of 100% and a sensitivity of 33% when used together with dose, compared to 100% specificity and 20% sensitivity when using the 2D HepG2 cell health assay together with dose, reflecting that it can confidently flag more hepatotoxic compounds than the 2D HepG2 cell health assay under the conditions tested. Assay quality, miniaturized format, and robustness gave confidence to proceed with embedding the 3D PHH assay in production at GlaxoSmithKline as part of an integrated hepatotoxicity strategy, and the possibility to use the 3D PHH for further mechanistic readouts to enhance in silico DILI modelling programs (Ekert et al., 2020).

Characterization of a liver CIVM for internal decision-making (Kazuhiro Tetsuka, Astellas Pharma, Inc.)
This case study describes the characterization of a 3D-bioprinted co-culture transwell model of human liver and its qualification for assessment of hepatotoxic potential. The model consists of human primary hepatocytes, stellate cells, and umbilical vein endothelial cells. For characterization, the model was exposed to various concentrations of acetaminophen (APAP) for up to 28 days. Increasing the duration of APAP exposure reduced the half-toxic concentration (TC50). In addition, treatment of the model with 30 mM of APAP for 6 hours did not affect tissue viability but reduced tissue glutathione (GSH) levels to about 60%. The GSH levels recovered when APAP was removed after 6-hour exposure and culture was continued without APAP for 18 hours (Tetsuka et al., 2020). In another characterization study, we examined toxicokinetics/toxicodynamics of APAP-induced hepatotoxicity to investigate the relationship between drug exposure and tissue viability in the model. The model was exposed to APAP intermittently for 14 days with 1.5, 6, or 24 hours of daily exposure, and tissue viability was monitored. The area under the concentration-time curve of APAP and, to a lesser extent, maximum concentration of APAP were correlated with reduced tissue viability (Ohbuchi et al., 2018). After these investigations, the model was further qualified for assessment of hepatotoxicity. For
illustrated the value of CIVMs for prediction of DILI risk, as compound treatment in a traditional primary human hepatocyte monoculture in vitro model did not demonstrate an impact on hepatocyte cell viability up to the highest dose tested (100 µM), whereas similar treatment with this compound in the hLiMTs model resulted in an IC50 for cell viability of 10 µM, suggesting an increased DILI risk for the compound.
Hepatic response to the case study molecule was also evaluated using a liver-chip MPS in vitro model. This hepatic CIVM consisted of a microfluidic two-channel design with primary human hepatocytes populating one channel and the other seeded with liver sinusoidal endothelial cells to comprise a vascular compartment. Upon treatment with the compound, significant increases in LDH were detected in the liver-chip effluent, and baseline levels of albumin production decreased by >90% after 72 hours. These results are indicative of a substantial stress response to the molecule and suggest the liver-chip CIVM may have enhanced sensitivity to detect adverse drug effects associated with DILI as compared to 2D cell systems. However, these effects were only observed at much higher doses of the molecule (IC50 = 100 µM) as compared to those seen in hLiMTs (IC50 = 10 µM).
These studies illustrate the difficulty of translating dose-response relationships across in vitro models without the aid of sizable internal qualification studies in order to carefully calibrate thresholds for the various assay endpoints that can be employed in these CIVMs. As these MPS models present challenges related to compound throughput, cost, and experimental complexity, these internal qualification studies represent a significant commitment required for characterization and adoption of these CIVMs in drug development. However, available evidence continues to build that CIVMs have the potential to transform how we address DILI risk and elucidate mechanisms of hepatotoxicity. In particular, their value may be to increase sensitivity to detect risk and/or integrate multiple mechanisms of DILI into a single physiologically-relevant system.

3D primary human hepatocyte (PHH) assay for early hepatotoxicity screening (Jason Ekert, GSK)
The 3D PHH model was conceived and characterized in a precompetitive, Innovative Medicines Initiative-sponsored public/private consortium, MIP DILI (Mechanism-based Integrated Systems for the Prediction of Drug-Induced Liver Injury) (Bell et al., 2018). At GlaxoSmithKline, the 3D PHH spheroid assay was previously identified as an opportunity to improve hepatotoxicity prediction over the 2D HepG2 cell health assay. The key features in developing an early hepatotoxicity assay for small molecules are 1) capturing a considerable proportion of severely hepatotoxic compounds, as part of an integrated drug development strategy; 2) testing chronic exposure to compound over a 14-day period; 3) low batch-to-batch variability of hepatocytes and viability that can be maintained in culture for up to 14 days; and 4) maintaining cell functionality requiring the hepatocytes to be metabolically active and evaluated in a high-throughput fashion. The primary hepatocytes were sourced ethically, and their research use was in accord with the terms of the informed consents un-
hepatocyte incubations provided phase 1 and phase 2 metabolite profiles consistent with in vivo metabolism as observed in plasma from the respective preclinical species. Further, the system was able to detect species-specific metabolism, providing additional confidence in its use.
Building on the internal experience with the MPCC hepatocyte system, this model was utilized to enhance confidence in the identification of metabolic pathways for a low-clearance clinical candidate compound, using an experimental design comparable to the one described above. The MPCC system provided a necessary advantage over other low-clearance assay systems in this specific case due to the instability of the test compound in the relay hepatocyte assay buffer. The identification of glucuronidation and other non-CYP pathways in the MPCC model reduced the concern for drug-drug interactions by strong CYP3A4 inhibitors such as itraconazole. Further, the correlation of in vitro to in vivo metabolism provided confidence in metabolite safety testing. Importantly, IND submission of in vitro metabolism data obtained using the MPCC hepatocyte system for the investigated clinical compound was accepted without query, demonstrating receptiveness of advanced in vitro model data on the part of the regulators.

Assessment of cytochrome P450 induction in gut using adult stem cell-derived ileal and colon organoids (David Stresser, AbbVie)
Induction of cytochrome P450 enzymes by drug candidates can lead to elevated rates of metabolism and premature elimination of themselves or co-medications, leading to therapeutic failure. Although induction may occur in multiple tissues, only liver and gut are considered quantitatively meaningful in affecting pharmacokinetics. Preclinical or in vitro cellular models to evaluate induction in the intestine are generally absent or exhibit poor performance. Intestinal organoids represent a novel and physiologically relevant model possessing multi-cellular structures that retain traits of normal intestine physiology, such as an epithelial barrier and cellular diversity, and are amenable to multi-well plate experimental designs. In our investigation, matched human enteroid and colonoid lines, generated from ileal and colon biopsies from two donors, were cultured in extracellular matrix for three days, followed by a single 48-hour treatment with prototypical inducers rifampin, omeprazole, 6-(4-chlorophenyl)imidazo[2,1-b][1,3]thiazole-5-carbaldehyde O-(3,4-dichlorobenzyl)oxime (CITCO), and phenytoin at concentrations that are known to induce target genes in the plated human hepatocyte model. Following treatment, mRNA was analyzed for induction of target genes. Rifampin induced CYP3A4 with an estimated EC50 and maximum fold-induction of 3.75 µM and 8.96-fold, respectively, for ileal organoids, and 1.40 µM and 11.3-fold, respectively, for colon organoids. Ileal, but not colon, organoids exhibited nifedipine oxidase activity, which was induced by rifampin up to 14-fold. The test compounds did not increase mRNA expression of CYP1A2, CYP2B6, MDR1 (Pgp), BCRP, and UGT1A1 in ileal organoids. While omeprazole induced CYP3A4 (up to 5.3-fold, geomean, n = 4 experiments), constitutive androstane receptor activators, phenytoin and CITCO, did not induce CYP3A4.
Since phenytoin is a well-established inducer of CYP3A4 in liver, its failure to
this purpose, the hepatotoxic potential of proprietary compound X was compared among the 3D-bioprinted co-culture transwell model, animal toxicology studies, and clinical studies. Exposure of the 3D-bioprinted co-culture transwell model to compound X for 28 days elevated alanine aminotransferase (ALT) in culture media. Given that elevated ALT levels were also observed in clinical trials in which compound X was administered for 2 weeks, but not in animal toxicology studies of compound X even at high doses, this 3D-bioprinted co-culture transwell model may be useful for detecting human-relevant hepatotoxicity.

Application of a micropatterned co-culture (MPCC) hepatocyte system to support drug metabolite profiling in regulatory submissions (Jennifer Liras, Pfizer)
Evaluation of drug biotransformation is a critical component of the drug discovery and development process as it provides an understanding of the major metabolic pathways and enzymes involved in drug clearance. Such studies to determine metabolite profiles are also key to identifying active metabolites and ensuring human relevance of preclinical safety evaluations. These investigations typically involve comparison of human and preclinical in vitro metabolite profiles in addition to a determination of steady state plasma metabolites in preclinical safety species.
The most commonly used in vitro systems to study drug-metabolite profiles are subcellular hepatic fractions or suspensions of primary human hepatocytes (Dalvie et al., 2009). Cryopreserved, primary hepatocytes, considered an appropriate system for the prediction of human metabolic profiles, are limited to a relatively short incubation time as their metabolic activity plummets drastically in suspensions over 3-6 hours. This can prevent proper evaluation of low-clearance drugs as the incubations fail to generate adequate amounts of metabolite for detection. Some of these limitations can be overcome by adaptations to the suspension assay, such as the relay method, which involves transferring the supernatant, containing the drug and metabolites, from the hepatocyte incubation after 4 hours to freshly thawed hepatocytes, thereby extending test article residence time with active drug metabolizing enzymes (Ballard et al., 2014). Advanced in vitro systems are needed to more closely mimic the in vivo system with sustained metabolic activity.
To this end, we characterized a CIVM, a micropatterned co-culture (MPCC) hepatocyte system, as a cross-species metabolite profiling tool (Ballard et al., 2016). In the MPCC system, primary hepatocytes are seeded in carefully pre-fabricated patterns (i.e., "islets") surrounded by fibroblasts (stromal cells). This micropatterned design supports extended culture longevity, assuring adequate enzyme expression and activity. To demonstrate suitability of the system for use in biotransformation studies, a test set of compounds with diverse metabolic enzyme pathways as well as known species differences was selected for evaluation in the system. Briefly, MPCC hepatocyte systems utilizing human, cynomolgus monkey, beagle dog and Sprague-Dawley rat cells were treated with a set of 7 compounds. Metabolites in collected culture media were analyzed by ultra-high pressure liquid chromatography-tandem mass spectrometry (Ballard et al., 2016). The results revealed that MPCC
leakage. GSK2193874 demonstrated the ability to fully inhibit the increase in vascular permeability induced by IL-2 when cyclic mechanical strain was applied. In an in vivo comparison, the pulmonary edema lung-on-a-chip model with IL-2 and 10% mechanical strain resulted in changes in barrier permeability similar to those seen when IL-2 was administered to mice with mechanical ventilation. These studies were able to recapitulate pulmonary edema in a lung-on-a-chip model and show the utility of this model in early drug discovery to test small molecules for efficacy, as well as its potential to further explore PK/PD profiles (Huh et al., 2012).

Neural spheroid models for the preclinical detection of neurotoxicity (Matthew Wagoner, Takeda)
Neurotoxicity is a leading cause of safety-related attrition in drug development, in large part because of the dearth of predictive preclinical in vitro and in vivo models (Easter et al., 2009;Mead et al., 2016). The emergence of human stem cell-derived neural spheroid and organoid models in recent years has given preclinical safety scientists the opportunity to screen discovery-phase pharmaceuticals for potential adverse effects on human neuronal tissue function and viability years before heading to clinical trials. To determine if these microtissues could be useful in the early detection of neurotoxicity, we studied the viability and calcium electrical burst patterns of iPSC-derived neuronal spheroids composed of astrocytes and glutamatergic and GABAergic neurons in response to 84 structurally diverse clinically-tested pharmaceuticals divided into training and test sets (Sirenko et al., 2019). These neuronal spheroids displayed robust and reproducible calcium electrical burst patterns for weeks, allowing us to study the effect of acute and chronic exposure of these drugs on neuronal function and viability. The neuronal spheroids were exposed to a training set of 47 clinically studied CNS-acting drugs with a wide range of seizurogenic and neurotoxic liabilities at physiologically-relevant exposures. The acute and chronic effects of these drugs on calcium burst characteristics were studied as six independent variables and combined with cellular viability using a logistic regression model to weigh and combine the endpoints into a predictive model. Individual endpoints (e.g., calcium peak amplitude, frequency, width) themselves had poor predictive value. However, when these endpoints were integrated into the logistic regression model, they were able to detect seizurogenic or neurodegenerative liability in 50% of the neurotoxic CNS-active drugs with 93% specificity. 
Cutoffs were established for each endpoint to favor specificity over sensitivity to minimize potential false positives. The neuronal spheroids were then exposed to 26 drugs with clinically-established safety and exposure profiles. These drugs were selected due to their disparate and well-studied mechanisms of neurotoxicity or long track record of clinical safety. In this independent test set, the neuronal spheroids were able to detect 61% of the drugs with known neurotoxic liabilities, with 92% specificity across 10 independent mechanisms of toxicity. These studies have demonstrated that neuronal spheroids have the potential to detect neurotoxic liability during drug discovery and may help reduce the incidence of neurotoxicity in the clinic.
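The trade-off described above, where cutoffs are set to favor specificity over sensitivity, can be sketched in a few lines of Python. This is only an illustration: the scores, labels, cutoffs, and function names below are hypothetical and are not the workshop data or the published model.

```python
from typing import Sequence, Tuple


def confusion_counts(
    scores: Sequence[float], is_toxic: Sequence[bool], cutoff: float
) -> Tuple[int, int, int, int]:
    """Count (tp, fn, tn, fp) when a drug is flagged if its score >= cutoff."""
    tp = fn = tn = fp = 0
    for score, toxic in zip(scores, is_toxic):
        flagged = score >= cutoff
        if toxic and flagged:
            tp += 1
        elif toxic:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    return tp, fn, tn, fp


def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical model scores for four neurotoxic and four non-neurotoxic drugs.
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
is_toxic = [True, True, True, True, False, False, False, False]

# A lenient cutoff flags more true positives but also one false positive.
lenient = sensitivity_specificity(*confusion_counts(scores, is_toxic, cutoff=0.5))
# A stricter cutoff removes the false positive at the cost of sensitivity.
strict = sensitivity_specificity(*confusion_counts(scores, is_toxic, cutoff=0.75))

print("lenient cutoff (sensitivity, specificity):", lenient)
print("strict cutoff (sensitivity, specificity):", strict)
```

Raising the cutoff in this toy example trades sensitivity for specificity, which is the direction chosen in the study to minimize potential false positives.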
induce CYP3A4 in the organoid model (which was qualified by its marked responsiveness to rifampin), suggests that using hepatocytes alone to gauge induction in other tissues is not appropriate. Further investigation of the root causes of tissue-specific induction indicated that low-level expression of constitutive androstane receptors in the intestine and in intestinal organoids relative to pregnane X receptor could be responsible. Omeprazole failed to induce CYP1A2 mRNA but induced CYP1A1 mRNA (up to 7.7-fold and 15-fold in ileal and colon organoids, respectively, n = 4 experiments). The induction of CYP1A1 was notable, since this enzyme is generally considered to be low or absent in liver; it is expressed and inducible in the small intestine of some individuals, possibly those with prior exposure to CYP1A1 inducers. However, our model exhibited relatively high intra- and inter-experimental variability, an opportunity for improvement in future versions of this test system. Nevertheless, the data suggested intestinal organoid induction responses are distinct from those of hepatocytes and represent the prospect of improved risk assessment for induction by drug candidates. The data presented at the workshop as well as additional findings have recently been published (Stresser et al., 2021).
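To illustrate how an EC50 and a maximum fold-induction, such as the rifampin values reported for ileal organoids in this case study (3.75 µM and 8.96-fold), describe a concentration-response curve, the sketch below assumes a simple hyperbolic Emax model in which fold-induction rises from 1 (no induction) toward the maximum. The source does not state which model was fitted, so the functional form and the function name are assumptions made for illustration only.

```python
def fold_induction(conc_um: float, ec50_um: float, max_fold: float) -> float:
    """Assumed hyperbolic Emax model: fold = 1 + (max_fold - 1) * C / (EC50 + C)."""
    if conc_um < 0 or ec50_um <= 0:
        raise ValueError("concentration must be >= 0 and EC50 must be > 0")
    return 1.0 + (max_fold - 1.0) * conc_um / (ec50_um + conc_um)


# Values reported in the text for rifampin and CYP3A4 mRNA in ileal organoids.
EC50_UM = 3.75
MAX_FOLD = 8.96

# Under the assumed model, half of the inducible range is reached at the EC50.
for conc in (0.0, 1.0, EC50_UM, 10.0, 50.0):
    print(f"{conc:6.2f} uM -> {fold_induction(conc, EC50_UM, MAX_FOLD):.2f}-fold")
```

Under this assumed form, the response at the EC50 sits halfway between baseline and the maximum; a different parameterization would shift the numbers but not that interpretation.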

Pulmonary edema in a human lung-on-a-chip model for PD/efficacy (Jason Ekert, GSK)
When developing an in vitro pulmonary edema model, a number of key physiological characteristics should be considered, including 1) an air interface, 2) alveolar/airway epithelium, 3) a matrix-embedded fibroblast layer, 4) pulmonary endothelium, and 5) a media/blood channel that could include leukocytes or key immune cells like macrophages. Pathophysiological features to reproduce include fibrin clots, vascular leakage, and accumulation of fluid in the alveolar space. For these studies, GlaxoSmithKline worked with an academic lab and used a CIVM with a polydimethylsiloxane (PDMS) device that had two parallel microchannels separated by a 10 µm membrane, allowing cyclic stretch to be applied to reproduce physiological breathing movements. Human pulmonary microvascular endothelial cells and the human alveolar epithelial cell line H441 were used. IL-2 was employed as a trigger to recapitulate pulmonary edema pathophysiology. IL-2 resulted in accumulation of fluid in the model's alveolar space and interstitial tissue, "vascular leakage" across the alveolar-capillary barrier, and fibrin clot formation. Breathing-like strain on the PDMS chip demonstrated that mechanical strain in synergy with IL-2 enhanced the opening of cell-cell junctions in both alveolar epithelium and capillary endothelium, leading to increased vascular permeability. Clinical translatability was studied using pharmacological agents that alter vascular leakage. These included Ang-1, which stabilizes endothelial intercellular junctions, and Ang-2, an antagonist to Ang-1 that destabilizes the capillary barrier. Ang-1 prevented vascular leakage induced by Ang-2 or IL-2, showing clinical translatability of the pulmonary edema model. A lead pharmacological agent (GSK2193874, a transient receptor potential vanilloid 4 (TRPV4) blocker) in early drug discovery was tested using the pulmonary edema lung-on-a-chip.
TRPV4 ion channels can be activated by mechanical strain, leading to increased alveolar-capillary permeability and vascular
ta referenced. It was suggested that such qualification data could be appended to submissions or referenced as published literature. While both approaches could achieve the purpose of demonstrating CIVM qualification for an individual regulatory submission, it should be noted that publication of literature highlighting CIVM qualification for specific COU would be more widely applicable and beneficial to industry as a whole.
The discussion focused on the interpretation of conflicting data between preclinical species and human and subsequent translatability to the clinic. The concern regarding translatability and confidence in human CIVM data was noted as the perceived cause of hesitation to include such data in regulatory submissions for both the safety and ADME groups. The pharmacology breakout group expressed less concern about concordance between human CIVM data and animal models due to a paucity of available animal models for some diseases. While both the safety and the pharmacology discussion groups recognized the potential value of animal-based CIVM in the absence of appropriate animal models, the safety group also highlighted current challenges in the availability and feasibility of evaluating animal CIVM. The safety and pharmacology group discussions stressed the advantage that CIVM could afford in the absence of appropriate animal models.
Animal research remains a sensitive topic, and it is desirable to reduce, refine and replace animals with alternate testing strategies to protect human and environmental health and safety while meeting regulatory requirements. Within the pharmaceutical industry and regulatory authorities, there is great interest in reduction, refinement, and replacement (3Rs) efforts to ensure higher standards of animal welfare within the research sector. Through the IQ MPS Affiliate, the industry is striving to go beyond what is required and is working to implement the 3Rs. This is being attempted through ensuring that animal welfare is employed in tandem with high-quality science, with the end goal of benefiting both human and animal health. Lack of human-relevant complexity in the current high-throughput models is a major challenge that the MPS model developers could address. This said, their structural, functional, and biochemical attributes are not fully characterized. Currently, since CIVM mimic some aspects of organs such as brain, kidney, lung, intestine, stomach, and liver, they are being increasingly used by researchers to understand cellular physiology, interactions, processes, and disease modeling. However, despite the progress made, there are inherent challenges with these systems such as maintaining cell viability while continuing to replicate a biologically relevant organ model.

Key takeaways from the ADME breakout session
During the breakout session focused on CIVM applications for ADME, participants from the FDA and industry engaged in discussion around qualification and performance criteria. Example questions included:
- How do we develop performance criteria and use existing data to accept CIVM for specific COU or replacement of existing models (e.g., in early development, would in vitro to in vivo extrapolation of clearance from rat and dog or non-human primate CIVM data be needed to build confidence in predictions from human CIVM)?

Evaluating ADC-related peripheral neuropathy with human iPSC-derived neuronal models (Terry Van Vleet, AbbVie)
Antibody-drug conjugates (ADCs) are novel chemotherapeutics designed for more selective delivery of cytotoxic agents to cancer cells. However, toxicity of ADCs in normal cells has been reported in multiple preclinical and clinical studies, leading to significantly narrowed safety margins. Peripheral neuropathy is a frequent adverse event with microtubule inhibitor (MTI) ADCs that can be challenging to assess in short-term preclinical studies. For example, peripheral neuropathy was not predicted based on preclinical toxicology studies in non-human primates or rats treated with valine-citrulline monomethyl auristatin E (vc-MMAE) ADCs. MTI payloads monomethyl auristatin F (MMAF) (non-permeable) and MMAE (permeable) were tested alone and as ADCs linked to AB095, an antibody against the non-mammalian tetanus toxoid protein, in two in vitro human iPSC-derived neuronal models: 1) a 2D model in which test articles were administered to cells during neurite outgrowth and 2) a 3D model that examined test article effects on a pre-formed neurite network. Multi-parametric evaluation of imaging data resulted in ranking each test article's peripheral neuropathy potential as MMAE > AB095-MMAE > AB095-MMAF ≥ MMAF. This ranking is consistent with what can be expected based on clinical experience. Both 2D and 3D human iPSC-derived neuronal models produced essentially the same test article ranking. Results indicated that the effects on neurite outgrowth formation and on pre-formed neurite network integrity appear to provide the same outcome. Specifically, payload permeability appears to be important in ADC-mediated neurotoxicity in vitro, and target-independent (non-specific) ADC uptake is unlikely to contribute significantly to neurodegeneration.

Common themes that evolved from breakout sessions
The breakout sessions were organized primarily by functional area expertise in the ADME, pharmacology, and safety sciences. At the outset, participants were instructed to discuss three main topics: challenges of data submission to health authorities, performance criteria of models, and species concordance. As expected, the level of concern surrounding a given topic varied by functional area and was reflected in the reporting of each group. Model qualification was of particular concern to all breakout groups, though each approached the topic slightly differently. The safety and ADME breakout groups each discussed qualification within a specified COU and how to properly demonstrate model qualification. In contrast, the pharmacology breakout group stressed verification of pathway expression within a CIVM and differentiation from traditional 2D cell culture. Both the safety and ADME groups stressed a need to provide qualification data in regulatory submissions to boost confidence in a given model, though no details were determined regarding the specific level of qualification needed. The appropriateness of a CIVM for the intended COU would likely be judged in isolation within the regulatory submission in question as it relates to the qualification da-
be underestimated from current widely used simple platforms). Also, CIVMs might provide better characterization of certain parameters required for PBPK modeling than other current methods. Further future applications may include understanding dermal penetration and systemic bioavailability of dermal products. Finally, CIVMs may lend themselves to evaluating and leveraging potential endogenous drug metabolizing enzyme and transporter (DME-T) biomarkers to speed up DDI study design. The overarching theme was that CIVMs may hold high potential for improved and predictive ADME understanding, but a substantial amount of work is needed to realize this full potential.

Key takeaways from the pharmacology breakout session
The discussion in the pharmacology breakout session was centered around two main topics: 1) species differences and concordance and 2) performance criteria for MPS models. Unlike safety and ADME evaluation, which relies heavily on the use of preclinical species, pharmacology applications focus specifically on the human disease state, and therefore the use of animals may not be as critical for this COU. Specific examples illustrating disease-specific scenarios requiring animal model applications or lack thereof were discussed. In oncology, in vivo models are considered to be the gold standard and are generally preferred over in vitro models; however, in central nervous system disorders, translatability of animal models to the human disease is often not fully established, and the extent of their utility is therefore questionable in this context. In cases of schizophrenia and autism, iPSC-derived in vitro models may be more relevant because information may be available on patient-specific genetic backgrounds; however, studying social interactions or complex phenotypes can only be achieved by observing live animals (Grunwald et al., 2019; Dolmetsch and Geschwind, 2011). In general, the purpose of the assay, measured endpoints, and specific disease will influence whether an animal or in vitro human model is preferred. The rationale for selecting either could be acceptable as long as the model (human or animal) is predictive of the disease state. The participants of the breakout group agreed that the weight of evidence in pharmacology situations drives the decision-making, and a pragmatic approach should generally be favored.
In the discussion on the performance criteria of CIVMs, the key feature distinguishing CIVMs from traditional 2D in vitro models is that they are likely to be more predictive and, most importantly, more physiologically relevant to allow for prolonged (up to many weeks in culture) studies. Depending on the specific question, traditional 2D models may provide "sufficiently good" information, but this may not always be the case, and in these scenarios, application of CIVMs may provide additional endpoints that are otherwise not available. When a disease is heterogenous and is represented by multiple etiologies, it is not necessary for the CIVM to recapitulate all of them, but, at a minimum, the protein and pathway of interest should be expressed in the cells used and the disease-relevant phenotype should be recapitulated in the investigated CIVM platform. The cell source and cell relevance are also important considerations. In gener--What would be the stepwise process for acceptance? How broad or narrow can the COU be? Key takeaways from this breakout session were that all systems should be qualified with the COU in mind and that narrower COUs would be relatively easier to qualify than a more multi-faceted assay. Based on advances in the field, some members of the FDA expected CIVM data to be included in more submissions in the future. Data generated using new models should be supported by qualification data submitted either as an appendix and/or as a peer-reviewed publication. Additionally, inclusion of a Sponsor's internal non-public supportive data that demonstrates the utility of the model for decision-making should be considered for submission. Industry members highlighted that a significant limitation to current applications of CIVM in the ADME space is that current use is restricted primarily to qualitative assessments, and value in modeling and prediction would come from more quantitative applications. 
An example of a qualitative assay is the case study provided by Pfizer, which evaluated the cross-species metabolite profiles in a long-term co-cultured hepatocyte model (Ballard et al., 2016) (Tab. 2). Aspects of these models that limit quantitative potential were discussed and include the need for accurate cell counts, consistency between wells and/or seedings, and reproducibility across days. To qualify these aspects, it is important to understand the performance of the cells, identify donor cells with day-to-day reproducibility, and establish known and expected positive and negative control outcomes that can be used as between-study controls. Indeed, the primary concerns with these new model systems encompassed donor variability, the need to screen cells for preset criteria, reproducibility, precision, non-specific binding to the matrix/materials used, and incremental improvements over the current gold standard models. Intestine and liver CIVMs (i.e., single-organ and combination or multi-organ) that contain a microfluidic component would be of high interest as, in principle, this could enable evaluation of enterohepatic recirculation, first pass effect, and bioavailability. These models could allow for in vitro derived predictions of fraction absorbed (Fa), fraction escaping gut extraction (Fg), fraction escaping hepatic extraction (Fh), and understanding biliary elimination and the potential role of recirculation in a human-relevant system. Additional future goals, specifically for microfluidic CIVMs, included their utility for predicting tissue concentrations, for understanding the impact of disease states on AD-ME, and to enable full understanding of pharmacokinetic (PK) and pharmacodynamic (PD) relationships. CIVMs also provide an opportunity to enable the study of complex drug-drug interactions (DDI) and transporter enzyme interplay. 
This would open the possibility of mechanistic understanding within the gastrointestinal and liver axis, including interactions between P-glycoprotein (Pgp) and cytochrome P450 3A (CYP3A) substrates and inhibitors. Additionally, the impact of organic anion-transporting polypeptide (OATP) inhibitors on liver concentration and subsequent pharmacodynamics of OATP substrates (e.g., statins) could be predicted. A long-term goal could be development of full PK on a chip to provide better prediction than the current allometric scaling approach (from animal data) or IVIVE from human liver microsomes/hepatocytes (e.g., in vivo clearance of a drug may sults to the Agency. FDA noted that there are no regulatory barriers that would prevent Sponsors from submitting such data with a drug application. However, it is the Sponsor's responsibility to provide enough supporting data to demonstrate that the model has been qualified for that particular COU. The level of detail provided should be sufficient for the FDA to form an independent opinion of the scientific rigor of the analysis and the validity of the conclusion. Safety decisions would not be based only on human CIVM results. A weight of evidence approach should be used to determine the overall risks associated with any safety signals taking into consideration all available data, whether from CIVM or in vivo animal toxicity studies. Such an approach would consider the relevance of each model to the safety signal. If there were safety-related hazards identified in human cells, in the absence of clinical data, it would be incumbent upon the Sponsor to demonstrate the lack of risk associated with the in vitro data set and/or identify acceptable approaches to monitor for the safety signal implicated by the in vitro data in the clinic.

Species concordance
Both the IQ MPS Affiliate and FDA scientists generally agreed that animal CIVMs for preclinical species would be beneficial for drug safety assessment to determine species concordance. There are decades of safety and pharmacology data generated by the pharmaceutical industry from in vivo models that provide informative data before going into human clinical trials. Likewise, there are examples where compounds were stopped early for a safety signal during in vivo preclinical studies that may not translate to humans. While the initial focus has predominantly favored human-based CIVM, the creation of animal-based CIVM will likely be driven by this need. Some animal CIVM are already available (Chang et al., 2017;Deosarkar et al., 2015). These systems could be used to gain confidence around whether a particular preclinical finding will translate to humans. To do so, known species-specific effects should be reproducible in their CIVM counterparts as well. For drug toxicities already known to show poor correlation (for example liver), this could be of less value. Few papers have been published that examine a compound's action across preclinical species and in humans using in vitro systems, and no large-scale, comprehensive evaluations have been conducted. There is evidence for species differences with respect to drug effects on transporters as well as general toxicity (Cohen, 2004). Producing preclinical animal CIVM will help increase confidence in the human-based systems, as differences in in vivo species sensitivity may also be evident in these models (Ewart et al., 2017;Van Vleet et al., 2019). Similarly, recapitulating animal observations for ADME properties including distribution (i.e., liver concentrations) and metabolism using animal-derived CIVM would build confidence in human predictions from human CIVMs where observations are not attainable in the clinic by plasma or excreta sampling (i.e., liver concentrations or biliary excretion). 
Showing strong concordance between animal (in vitro and in vivo) and human CIVM data could lead to the al, patient-derived cells (primary, adult organoids or iPSCs) may be preferred, but if they are not available, genome engineering of healthy control cells can be an option to induce a disease state. Alternatively, drugs or chemical agents can be used to elicit the necessary pathophysiology. Other sets of considerations for evaluating performance criteria of CIVM include operational parameters, like assay stability, day-to-day variability, and the necessary quality control assays to ensure that models are acceptable every time an experiment is performed. Specific parameters for assay performance and qualification should be developed internally to provide a point of reference for establishing assay robustness.

Key takeaways from the safety breakout session
The safety breakout session primarily focused on how industry can overcome barriers, whether perceived or actual, to regulatory submission of CIVM data. The discussion was subdivided into clarity of terminology on qualification versus validation, how the Agency manages new in vitro data, species translation considerations with regards to safety, and potential ways to share data between industry and the FDA to continue advancement of the field.
The group discussed terminology for data rigor, particularly expectations for qualification for specific COU assays and/ or systems. Discrepancies in terminology were identified as a knowledge gap going into the discussion and, as such, considerable time was devoted to gaining a deeper understanding and agreement on expectations. The group discussed the extensive efforts for a specific COU such as irritation that have been undertaken in reconstructed skin models and published as guidelines by the OECD, which is not to be confused with activities leading to the official qualification of biomarkers used in clinical trials. For the purposes of determining safety of molecules for a specific endpoint, it was agreed that validation efforts conducted under the auspices of the OECD represent the highest level of validation. However, regulatory submission of CIVM data generated in a qualified model/assay would be sufficient. The IQ MPS Affiliate also recently published a series of manuscripts dealing with considerations, options, and tools for CIVM qualification (Baudy et al., 2020;Pointon et al., 2021;Fabre et al., 2020;Ainslie et al., 2019;Fowler et al., 2020;Hardwick et al., 2020;Peters et al., 2020;Peterson et al., 2020;Phillips et al., 2020). This then begged for clarification on what is meant by "qualified". Disclosure of background data supporting a model/assay for a specific COU, whether published literature or internal Sponsor data, was deemed sufficient to be designated as qualified. Further, it was highlighted that disclosure of data supporting model/assay qualification was a gap often observed in regulatory submissions. The group agreed that the supporting data would need to be specific for the intended COU and include the assay conditions and format, cell types, analyses, and controls used. 
Disclosure of such supporting data would then enable the regulatory reviewer to make an informed decision on the appropriateness and applicability of the CIVM data submitted for the Sponsor's drug entity/ program. The FDA encouraged Sponsors to publish and reference where possible, and when not possible to submit CIVM re-possibility of replacing or reducing the use of in vivo animal models with human CIVMs.
The overarching take-home message from this workshop is that acceptance of CIVM technologies for use in ADME, pharmacology, and safety assessment will require qualification, which will vary depending on the specific COU, to allow the assessment and prediction of adverse clinical outcomes and compound efficacy. This will demand continuous dialogue and feedback among all relevant stakeholders, including members of the pharmaceutical industry and regulatory bodies.
It is also clear that alignment on the terminology below would assist with this dialogue:
- Qualification
- Characterization
- Reliability
- Reproducibility
- Robustness
- Performance criteria
- Confidence
- Endpoint
- Biomarker
As highlighted in this workshop, three topics require further discussion, clarification, and data: 1) performance criteria, 2) species differences and concordance, and 3) overcoming the barriers to inclusion of CIVM data in regulatory submissions. There is an urgent need to reduce attrition within the drug development process, and integrating CIVM technologies has the potential to assist with this challenge.

Continual FDA & industry engagement and next steps
During the workshop, the importance of continued engagement between FDA and industry was discussed. Since the workshop, both stakeholders have continued interactions through frequent teleconferences and webinars to discuss specific topics such as current technology status, COU, and animal (non-human) cell-based CIVMs. Planning for a second workshop is underway.