International ring trial of the epidermal equivalent sensitizer potency assay: reproducibility and predictive capacity



Introduction
Repeated exposure to a chemical allergen increases the risk of becoming sensitized to that particular chemical. Once an individual has become sensitized, any following exposure to the same chemical may result in allergic contact dermatitis. The contact dermatitis can range from a mild skin rash to extensive skin blistering. Within the North American and Western European populations, the prevalence of skin sensitization to at least one chemical is approximately 20% (Peiser et al., 2012). The risk of developing allergic contact dermatitis is considered a serious health issue and the identification of potential sensitizing agents within consumer products is therefore crucial. [...] may demand toxicity tests for chemicals produced in quantities of over 1 ton per year (Grindon, 2007; Grindon et al., 2008; Rovida and Hartung, 2009).
Within the integrated European Framework Program 6 Project Sens-it-iv (LSHB-CT-2005-018681; 2005-2011) a number of potential in vitro assays were developed that mimic the key mechanisms of skin sensitization and that therefore may provide alternatives to animal methods (Roggen, 2013). When used in an integrated testing strategy, some of these assays may be able to assess whether or not a chemical is a potential sensitizer (chemical label), and some may also determine the potency of that sensitizer (chemical classification) (Basketter and Kimber, 2009; De Wever et al., 2012). In order to determine whether these assays may actually be suitable to replace the LLNA for risk assessment of potentially sensitizing substances, validation according to EURL-ECVAM guidelines of these assays and other assays developed in parallel to Sens-it-iv is required. The key mechanisms that the assays are based on are i) chemical penetration to the viable epidermal cell layers to result in cytokine release and cytotoxicity (EE potency assay, the subject of this manuscript) (dos Santos et al., 2011); ii) formation of hapten-protein complexes, the activation of the Keap1/Nrf-2 pathway, and triggering keratinocytes to release innate danger signals in the form of cytokines, ATP and reactive oxygen species (e.g., Keratinosens™, IL-18 NCTC assay, Direct Peptide Reactivity Assay (DPRA)) (Natsch et al., 2011, 2013; Galbiati et al., 2011; Gerberick et al., 2004); iii) dendritic cell maturation and changing biosignatures (e.g., MUTZ-3 GARD assay, hCLAT, MUSST, PBMDC) (Maxwell et al., 2011; Lindstedt and Borrebaeck, 2011; Johansson et al., 2013; dos Santos et al., 2009); iv) dendritic cell migration (e.g., MUTZ-DC migration assay) (Gibbs et al., 2013b); and, finally, T cell priming in the local lymph node (e.g., T cell amplification and differentiation assays) (Martin et al., 2010).
Many assays under development are aimed at distinguishing a sensitizer from a non-sensitizer (YES/NO answer: chemical label). An assay that addresses sensitizer potency (chemical classification) is of high importance when considering the need to totally replace in vivo animal testing for hazard and risk assessment of skin sensitizing chemicals (Mehling et al., 2012). This manuscript describes the international ring trial of an assay that may be able to rank sensitizers according to their potency (dos Santos et al., 2011; Spiekstra et al., 2009). The EE potency assay is a modification of the EURL-ECVAM validated EE assay for assessing the corrosive and irritant properties of a chemical and therefore, by definition, will not distinguish a sensitizer from a non-sensitizer (Fentem et al., 1998; Spielmann et al., 2007) (for epiCS® see OECD guideline Test Number 439: In vitro skin irritation). The validated skin irritation/corrosion test basically assesses the undiluted test chemical. Our EE potency assay is a modification in the sense that we have expanded the possibility to carry out a dose response of the diluted chemical using the same model (EE) and the same end point (cell viability as assessed by MTT reduction) to address sensitizer potency based on irritant potential. This sensitizer potency classification is based on the clinical observation that there is a clear role for irritancy in contact sensitization due to the irritant properties of many sensitizers (Agner et al., 2002; Basketter et al., 2007; Bonneville et al., 2007; McLelland et al., 1991). The local trauma results in an increase in epidermal cytokine production, e.g., IL-1α. Previously, we have shown that there is a relationship between the strength of the sensitizer and the irritant potential of the chemical (dos Santos et al., 2011; Spiekstra et al., 2009). The primary readout of the EE potency assay is the EC50 value, i.e., the chemical concentration leading to a 50% decrease in EE viability (MTT assay) compared to vehicle-exposed EE. The second readout parameter is the IL-1α2x, i.e., the chemical concentration resulting in a 2-fold increase in the release of the pro-inflammatory cytokine IL-1α into the culture supernatant. Using a panel of 12 test chemicals we have shown that the EC50 value in particular, and IL-1α release to a lesser extent, correlated well to LLNA-EC3 data and Human Repeat Insult Patch Test (HRIPT) data with regards to ranking sensitizer potency when using the VUMC-EE model (dos Santos et al., 2011). The EC3 concentration is the primary parameter used in the murine LLNA and represents the chemical concentration resulting in a three-fold increase of ³H-thymidine incorporation in the auricular draining lymph node, compared to vehicle control (Gerberick et al., 2007a). The HRIPT is a test that assesses the maximum no observed threshold effect level (NOEL in µg/cm²) of a chemical in human volunteers (Basketter et al., 2005). Since the EE potency assay does not identify sensitizers, it has to be used as a tier 2 assay on the sensitizers identified in a tier 1 assay (e.g., NCTC assay, DC maturation or migration assay). Since the assay assesses potency, it has the potential to identify the maximum safe threshold concentration of a chemical.
Once a potential assay has been developed, the next phase is optimization and testing transferability and reproducibility of the method in different naïve laboratories. This is essential for future widespread implementation of the assay. This ring trial set-up (also referred to as phase 1 of pre-validation) of the assay involves finalization of a preliminary standard operating procedure, testing the transferability of the assay in different laboratories, and finally testing the intra-laboratory and inter-laboratory reproducibility and predictive capacity of the assay with a coded panel of test chemicals. Pre-validation is required before an assay can enter the validation phase with an extended panel of test chemicals in multiple laboratories.
Previously, the transfer phase of the EE potency assay international ring trial has been reported (Teunis et al., 2013). The transferability of the standard operating procedure (SOP) from the lead laboratory (VUMC) to 3 other European laboratories (University of Applied Sciences Utrecht, The Netherlands (HU), University of Milan, Italy (DiSFeB) and BASF Chemical Company, Ludwigshafen, Germany (BASF)) was assessed using two training chemicals (DNCB and resorcinol). Furthermore, the transferability of the method from the VUMC-EE to the commercially available epiCS® (previously EST1000™) (CellSystems, Biotechnology GmbH, Troisdorf, Germany) was also described.
In the current study we report the results obtained from the international ring trial. The intra-laboratory and inter-laboratory reproducibility of the EE potency assay in four European laboratories is described along with putative positive and negative acceptance criteria. A test panel of 13 coded chemical sensitizers was used to test the predictive capacity of the assay in ranking sensitizer potency. For this, EC50 and IL-1α2x values were compared to published mouse LLNA-EC3 and human NOEL and DSA05 data. DSA05 is the chemical dose per skin area in µg/cm² leading to a sensitization incidence of 5% in the tested human population (Schneider and Akkan, 2004). Thereby it was possible to establish a prediction model and to establish a linear correlation graph to rank sensitizers with regards to their weak to extreme sensitizing potencies.

2 Materials and methods

Method outline
For a full description of the technology transfer and standard operating procedure (SOP) for the EE potency assay see supplementary materials in Teunis et al. (2013). Following the SOP, any chemical that is soluble in DMSO or a mixture (4:1) of acetone:olive oil (AOO) can be tested. The maximum solubility of all test chemicals in this ring trial was determined by an independent laboratory (TNO, Zeist, The Netherlands). For an overview of the EE potency assay method see Figure 1.

Notes to Tab. 1: The chemicals are listed according to their potency values obtained from a combined assessment of all data available from the human category scale, human NOEL, human DSA05, and murine LLNA-EC3 experiments. When human and murine data were conflicting or limited, the human data were prioritized in the ranking above murine data.
Human category scale: 1 = Extensive evidence of contact allergy in relation to degree of exposure and size of exposed population; 2 = A frequent cause of contact allergy, but of less significance compared with induction of skin sensitization in a HRIPT category 1; 3 = A common cause of contact allergy, perhaps requiring higher exposure compared with category 2; 4 = Infrequent cause of contact allergy in relation to level of exposure; 5 = A rare cause of contact allergy except perhaps in special circumstances (Basketter et al., 2014).
Human NOEL (µg/cm²) = no observed effect level; all available data for NOEL are shown.
Human DSA05 (µg/cm²) = induction dose per skin area (DSA) that produces a positive response in 5% of the tested population.
The LLNA-EC3 values are expressed as % according to Basketter et al. (1999): potency classification is based on the mathematical estimation of the concentration of chemical necessary to obtain a threshold positive response (SI=3); this is termed the EC3 value. Chemicals with an EC3 value (%) ≥10 to ≤100 are classified as weak, ≥1 to <10 moderate, ≥0.1 to <1 strong, <0.1 extreme.
In vivo data represent cobalt (II) sulphate, whereas in the EE potency assay cobalt (II) chloride was tested.

After having assessed the transferability of the EE potency assay (Teunis et al., 2013), the ring trial reported here, involving four European laboratories (VUMC (lead laboratory), HU, DiSFeB and BASF), was started. Thirteen coded well-known sensitizing chemicals were used (Tab. 1). Each chemical was tested in two independent experiments in order to obtain the EC50 and the IL-1α2x values (Fig. 1). Two independent experiments were defined as two experiments performed on different days and using different EE batches. Two readout parameters were assessed for each chemical:
- Readout A: Cell viability measured by MTT assay and expressed as the EC50 value (effective chemical concentration in mg/ml required to reduce cell viability to 50% compared to vehicle-exposed cultures).
- Readout B: Release of the pro-inflammatory cytokine IL-1α, measured by ELISA and expressed as IL-1α2x (effective chemical concentration required to result in a 2-fold increase in release of IL-1α into culture supernatant compared to vehicle-exposed cultures).

Following the SOP, the broad dose (BD) response range was determined by two consecutive range-finding experiments, named BD-A and BD-B (Fig. 1). From the BD-A, using 10-fold serial dilutions starting from the master starting stock solution, a chemical concentration was identified from the tested range that results in >60%, preferably >80%, reduction in EE viability compared to vehicle-exposed EE. Then, 3-fold serial dilutions from this starting point (identified from BD-A) were tested in BD-B.

From BD-B, again a chemical concentration was identified from the tested range that results in >60%, preferably >80%, reduction in EE viability compared to the vehicle. This chemical concentration was then the highest concentration used in the fine dose (FD) experiments. Two-fold serial dilutions of this starting point concentration were tested in the FD experiments. If a chemical failed to result in >60% reduction in EE viability in BD-A or BD-B, the chemical was excluded from the assay since it would not be possible to obtain an EC50 value in the FD.

For each test chemical, BD-A and BD-B were performed in single-fold, whereas the FD experiments were performed in two independent experiments in each laboratory. Only controls (unexposed, vehicle(s) and positive assessment conditions) were tested in duplicate per independent experiment. Statistical analysis and prediction models are described below.

After 24 h of incubation, filter paper disks were gently removed. Culture supernatant was harvested and stored at -20°C for IL-1α ELISA (FD concentrations only; see below) and epiCS® were harvested for MTT assay in order to assess cell viability (all cultures; see below).
MTT assay and quantification of IL-1α secretion: The MTT assay and quantification of IL-1α secretion by ELISA (R&D Systems Inc., Minneapolis, Minnesota) were performed as described in Teunis et al. (2013, supplementary SOP).

Acceptance criteria
Quality controls of the epiCS® models: All epiCS® came with a batch control certificate. Models were checked by CellSystems for barrier integrity (defined as within target when viability was >50% after treatment with Triton X-100 for 2 h).
Skin equivalent performance: In this international ring trial, only putative vehicle and positive control acceptance criteria are defined. Since the acceptance criteria have not been fully tested previously, if an experiment did not fulfill the quality criteria but an EC50 value could be obtained, then the EC50 value was still included in the final analysis.
Putative acceptance criteria for vehicles: Vehicle exposure alone should not result in more than a 30% decrease in cell viability compared to unexposed cultures. If the vehicle results in more than a 30% decrease in viability, then the EE batch does not fulfill the proposed quality criteria. The percentage difference between the unexposed and the vehicle-exposed EE was calculated as follows: ((average viability unexposed - average viability vehicle exposed) / average viability unexposed) x 100.
Putative acceptance criteria for positive control: Exposure to resorcinol should result in a 20-80% (preferably 50%) decrease in cell viability compared to vehicle.
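The two putative criteria above can be sketched as a small check. This is only an illustration under stated assumptions: the function names are hypothetical, the inputs are taken to be average viability readings (for example mean OD570 values) for each condition, and treating the 30%, 20% and 80% boundaries as inclusive is an assumption rather than something the SOP specifies.

```python
def percent_decrease(reference, treated):
    """Percentage decrease of `treated` relative to `reference`."""
    return (reference - treated) / reference * 100.0


def vehicle_ok(unexposed, vehicle, max_decrease=30.0):
    """Vehicle criterion: at most a 30% decrease versus unexposed cultures.

    Treating exactly 30% as acceptable is an assumption.
    """
    return percent_decrease(unexposed, vehicle) <= max_decrease


def positive_control_ok(vehicle, resorcinol, low=20.0, high=80.0):
    """Positive-control criterion: 20-80% decrease versus vehicle.

    Inclusive bounds are an assumption.
    """
    return low <= percent_decrease(vehicle, resorcinol) <= high


if __name__ == "__main__":
    # Hypothetical readings, not ring-trial data.
    print(vehicle_ok(unexposed=100.0, vehicle=75.0))
    print(positive_control_ok(vehicle=75.0, resorcinol=30.0))
```

A batch failing either check would be flagged only; in this ring trial such data were still carried into the analysis.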
Exclusion of chemicals: From the BD-B, a chemical concentration is chosen from the dilution range tested that results in a >60%, preferably >80%, decrease in EE viability compared to the vehicle. Then, 2-fold serial dilutions from this starting concentration were tested in the FD experiments. If a chemical failed to result in a >60% decrease in viability in BD-A or BD-B, this chemical was excluded from the assay since no EC50 value could be determined in FD.
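The selection and exclusion rule can be sketched as follows. The text does not say which concentration is chosen when several exceed the 60% threshold, so picking the lowest qualifying concentration is an assumption, as are the function names; viabilities are taken as percentages of the vehicle control, and the example values are hypothetical.

```python
def serial_dilution(start, factor, n):
    """Return n concentrations, starting at `start`, each `factor`-fold lower."""
    return [start / factor ** i for i in range(n)]


def select_fd_start(concentrations, viabilities, min_decrease=60.0):
    """Pick a starting concentration for the fine dose (FD) experiments.

    `viabilities` are % of vehicle control. Returns the lowest tested
    concentration giving more than `min_decrease` % loss of viability
    (an assumed tie-breaking rule), or None if the chemical is excluded.
    """
    qualifying = [
        c for c, v in zip(concentrations, viabilities)
        if 100.0 - v > min_decrease
    ]
    return min(qualifying) if qualifying else None


if __name__ == "__main__":
    # Hypothetical BD-B result: four 3-fold dilutions and their viabilities.
    bd_b = serial_dilution(90.0, 3, 4)
    start = select_fd_start(bd_b, [10.0, 35.0, 70.0, 95.0])
    if start is None:
        print("excluded: no >60% decrease in viability")
    else:
        # Five 2-fold dilutions for the FD experiment.
        print(serial_dilution(start, 2, 5))
```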

Data management and statistical analysis
All the data were collected prior to uncoding of the chemicals. For the statistical analyses, a summary template was designed by the statistician, and the results were transferred to this template by each participating laboratory. This summary template contained internal checks that ensured that no mistakes were made in the transfer of the results.
Reproducibility of the controls: The viability (MTT assay) of the unexposed, vehicle-exposed, and positive control was plotted for each batch of EE and the frequency of experiments fulfilling the putative acceptance criteria recorded.

Reproducibility of the BD experiments: The BD experiments provided the dose range for final testing of the chemical in the FD response experiments. The concentrations obtained in the BD experiments were tabulated and compared between the laboratories (exploratory).

Selection and coding of test chemicals
Chemicals were selected by an independent party (TNO). Initially, over 80 chemicals were short-listed by the project team and from this list, 13 sensitizers were selected and coded by TNO. Each laboratory received a uniquely coded set of test chemicals. The code for the chemicals was communicated directly to the statistician (Adriaens Consulting, Aalter, Belgium) after all data had been received by the statistician. All 13 tested chemicals along with the in vivo potency information, vehicles used, maximum solubility, and starting concentrations tested in BD-A and BD-B experiments are shown in Tables 1 and 2. With the exception of 2-mercaptobenzothiazole, which was purchased from Fisher-Scientific (ACROS Organics; Loughborough, UK), all chemicals were purchased from Sigma-Aldrich (Sigma, Aldrich, SAFC; St Louis, Missouri, USA). Chemicals were >95% pure with the exception of formaldehyde, which was 36.5-38% in H2O. Isoeugenol was a 98% mixture of the cis and trans form and oxazolone was purified by recrystallization.
Maintenance of EE models: Upon arrival in the laboratories, the epiCS® skin tissues were handled exactly as recommended by the supplier and as described in detail in Teunis et al. (2013, supplementary SOP). Maintenance medium (supplied by CellSystems) was used throughout the procedure and was also used for preparing dilutions of the test chemicals. In short, upon receipt, epiCS® cultures were transferred to a 6-well plate containing 1 ml maintenance medium and incubated overnight at 37°C, 5% CO2, 95% humidity to allow the cultures to equilibrate. After equilibration, cultures were used for the EE potency assay according to the SOP.
Preparation of chemicals: DMSO (1% in CellSystems® maintenance medium) or AOO (4:1), the choice depending on which resulted in the highest chemical solubility, was used as vehicle for dissolving the chemicals. BD-A, BD-B, and FD experiments were performed as described in Figure 1. The positive control chemical was resorcinol (60 mg/ml (545 mM) in 1% DMSO). This concentration was selected from past experience by the VUMC lead laboratory (Teunis et al., 2013; dos Santos et al., 2011). For each experiment, unexposed, vehicle-exposed, and positive controls were tested in duplicate and the test chemical concentrations in single fold. In the BD-A and BD-B, 4 concentrations were tested per chemical. In FD, 5 chemical concentrations were tested per chemical.
Exposure to test chemicals and controls: Pre-sterilized Finn Chamber filter paper discs of 7.5 mm (Epitest LTD Oy, Finland) were impregnated with 25 µl of the test samples (chemical dilutions, vehicles, positive control). Excess fluid was gently tapped from the filter and the impregnated filters were topically applied to the epiCS® stratum corneum. The epiCS® were then returned to the incubator (37°C, 5% CO2, 95% humidity) for 24 h.

Reproducibility of the fine dose experiments: The FD experiments were performed in duplicate. The agreement in EC50 concentration between the two independent experiments within each laboratory was assessed with scatter plots. Correlations between the two runs were determined by Pearson analysis (two-tailed) in combination with the line of equality. Analyses were performed with a 95% confidence interval using GraphPad Software, San Diego, CA, USA. Correlations were considered significant for p <0.05.
EE potency assay: The EC50 value is the effective chemical concentration required to reduce metabolic activity (corresponding to cell viability) to 50% of the maximum value. The 100% value for cell viability corresponds to the vehicle control (1% DMSO in culture medium or AOO 4:1). EC50 values were obtained by linear regression analysis based on changes in metabolic activity (MTT). In order to rank the chemicals, correlations between EC50 and LLNA, NOEL or DSA05 were determined by nonparametric two-tailed Spearman correlation analyses using GraphPad Software, San Diego, CA, USA.
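As an illustration of how an EC50 might be read off a fine dose curve by linear regression, the sketch below fits an ordinary least-squares line to viability versus concentration and solves it for 50% viability. The exact regression settings of the SOP are not restated here, so the use of untransformed concentrations, the inclusion of all FD points, and the names `fit_line` and `concentration_at` are assumptions; the example values are hypothetical and are not ring-trial data.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("all concentrations are identical")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration_at(target, concentrations, responses):
    """Concentration at which the fitted line reaches `target`."""
    slope, intercept = fit_line(concentrations, responses)
    if slope == 0:
        raise ValueError("flat response: target is never reached")
    return (target - intercept) / slope


if __name__ == "__main__":
    # Hypothetical FD data: concentration (mg/ml) vs viability (% of vehicle).
    conc = [1.25, 2.5, 5.0, 10.0, 20.0]
    viability = [96.0, 88.0, 71.0, 42.0, 9.0]
    print(concentration_at(50.0, conc, viability))
```

A fit on log-transformed concentrations, or on only the two points bracketing 50%, would be an equally plausible reading of "linear regression" and may give a different value.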
IL-1α release and potency: IL-1α2x values were obtained by linear regression analysis based on the chemical concentration resulting in a 2-fold increase in IL-1α release. In order to rank the chemicals, correlations between IL-1α2x and LLNA, NOEL, or DSA05 were determined by nonparametric two-tailed Spearman correlation analyses using GraphPad Software, San Diego, CA, USA.
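The rank correlations in this study were computed with GraphPad software. For readers who want to see the underlying calculation, the following minimal sketch computes the Spearman coefficient as a Pearson correlation on average ranks. It returns the coefficient only (no p-value), the function names are hypothetical, and the example values are placeholders rather than study data.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_r(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two paired observations")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    den = math.sqrt(
        sum((a - mean_x) ** 2 for a in rx) * sum((b - mean_y) ** 2 for b in ry)
    )
    if den == 0:
        raise ValueError("correlation undefined for constant input")
    return num / den


if __name__ == "__main__":
    # Placeholder in vitro values and in vivo potency values.
    in_vitro = [0.5, 3.0, 8.0, 15.0, 40.0]
    in_vivo = [0.05, 0.6, 2.0, 1.5, 20.0]
    print(spearman_r(in_vitro, in_vivo))
```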
Prediction model: The previously proposed prediction model ranks chemicals by EC50: the lower the EC50 value, the more cytotoxic (irritant) the chemical and the stronger its sensitizing potency. In addition, a second prediction model was identified in this study, in which strong and extreme sensitizers had EC50 values <7 mg/ml and the majority of the moderate and weak sensitizers had an EC50 value ≥7 mg/ml.
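The 7 mg/ml cut-off can be expressed as a simple two-class rule. The sketch below also computes sensitivity and specificity, with accuracy taken as the mean of the two, as defined in the notes to Tab. 4. The labels "E/S" and "M/W" follow the abbreviations used in that table; the function names and the example calls are illustrative assumptions, and ambiguous chemicals (discordant FD runs) are assumed to have been removed beforehand.

```python
CUTOFF_MG_PER_ML = 7.0  # EC50 cut-off proposed in the text


def predict_class(ec50_mg_per_ml, cutoff=CUTOFF_MG_PER_ML):
    """Return 'E/S' (extreme/strong) below the cut-off, otherwise 'M/W'."""
    return "E/S" if ec50_mg_per_ml < cutoff else "M/W"


def performance(predicted, reference):
    """Return (sensitivity, specificity, accuracy) in percent.

    Sensitivity: correctly identified extreme/strong sensitizers.
    Specificity: correctly identified moderate/weak sensitizers.
    Accuracy: mean of sensitivity and specificity.
    """
    pairs = list(zip(predicted, reference))
    strong = [p for p, r in pairs if r == "E/S"]
    weak = [p for p, r in pairs if r == "M/W"]
    if not strong or not weak:
        raise ValueError("both reference classes must be represented")
    sensitivity = 100.0 * strong.count("E/S") / len(strong)
    specificity = 100.0 * weak.count("M/W") / len(weak)
    return sensitivity, specificity, (sensitivity + specificity) / 2.0


if __name__ == "__main__":
    # Hypothetical EC50 values (mg/ml) and in vivo reference classes.
    ec50_values = [0.8, 3.5, 9.0, 25.0, 6.0]
    reference = ["E/S", "E/S", "E/S", "M/W", "M/W"]
    predicted = [predict_class(v) for v in ec50_values]
    print(performance(predicted, reference))
```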

Acceptance criteria: reproducibility of data for vehicle and positive control resorcinol
Very little batch variation was observed between the unexposed batches of epiCS® used in each laboratory. In total, 23 different batches were used in the 4 different laboratories, with the same batch often being delivered to multiple laboratories. Average OD570 values obtained from the MTT assay of unexposed EE for the different batches were as follows: VUMC (n=14): 2.866 ±0.279; HU (n=12): 2.695 ±0.266; DiSFeB (n=12): 2.881 ±0.586; BASF (n=12): 3.047 ±0.868. Vehicle exposure generally did not result in more than 30% decrease in cell viability compared to unexposed cultures, in accordance with the proposed acceptance criteria for this international ring trial. Of the 23 batches used in this study, only one batch in the VUMC lab (batch 5) and a different batch in the BASF lab (batch 12) showed slightly more than 30% cytotoxicity after vehicle exposure compared to unexposed cultures (Fig. 2). A putative acceptance criterion was also defined for the positive control resorcinol (545 mM): topical exposure to a single concentration of resorcinol should result in a 20-80% (preferably 50%) decrease in EE viability. However, variation was observed both within the labs and between the labs (Fig. 2). For VUMC, batches 1 and 4 showed <20% decrease in EE viability and batch 9 showed >80% decrease in EE viability when exposed to resorcinol. For HU, batch 11 showed <20% decrease in EE viability when exposed to resorcinol. For DiSFeB, batches 1, 7, and 11 showed >80% decrease in EE viability when exposed to resorcinol. For BASF, batches 7 and 8 showed >80% decrease in EE viability when exposed to resorcinol. Of note, the batch numbers allotted to each laboratory did not correlate between labs and therefore the deviations observed between laboratories were not due to the same batch of epiCS®. This indicates that variation was due to technical inter-laboratory variation rather than true batch variation. Since the vehicle and positive performance criteria had not been tested before the start of the study, BD and FD data obtained from batches not meeting the putative performance criteria were still included in all further analysis for determining EC50 values and potency.

Legend to Fig. 2: The batch numbers allotted to each laboratory did not correlate between laboratories and therefore the deviations observed between laboratories were not due to the same batch of epiCS®. The black line corresponds to the upper limit of 80% viability of the positive control exposed EE.

Broad dose B response
Chemical concentrations were selected from BD-A for further testing in BD-B (Fig. 1; Tab. 2). From BD-B the chemical concentration could be selected by each laboratory for use in FD and identification of the EC50 (Tab. 2). Of the 13 coded sensitizers selected for the study, in the VUMC, HU, and DiSFeB labs, 11 chemicals resulted in >60% decrease in EE viability, enabling a chemical concentration to be selected for further testing in FD (Tab. 2). In the BASF lab only 9 chemicals resulted in >60% decrease in EE viability. Exposure to p-phenylenediamine and cobalt (II) chloride was reported to give unreliable results or no 60% decrease in EE viability in 3 of the 4 laboratories. For both chemicals this was due to interference with the MTT photometric assay (p-phenylenediamine oxidized spontaneously to a brown compound and cobalt (II) chloride had a strong green color). When unreliable results were reported, the chemicals were excluded from further analysis in FD in the corresponding laboratories. Furthermore, for unknown reasons, a 60% decrease in EE viability was not reached when exposing EE to 2-mercaptobenzothiazole, phenylacetaldehyde, or formaldehyde in the BASF laboratory. The 60% decrease in EE viability was also not reached for formaldehyde in the DiSFeB laboratory. The chemical concentration selected to enter the FD was not identical in each laboratory and sometimes differed by a factor of up to 10. In conclusion, the start concentrations for the FD experiments for 11 sensitizers were identified in VUMC, HU, and DiSFeB laboratories, and for 9 sensitizers in the BASF laboratory (Tab. 2).

Fine dose response and determination of EC50 value
Inter-experiment variability: The inter-experiment variability within a laboratory and between laboratories is a measure for the robustness of the assay. For each chemical the EC50 value was determined in two separate runs (FD-1 and FD-2) and the FD-1 and FD-2 results were correlated with each other (Fig. 3). In general, many dots (chemicals) were near or touching the line of equality, indicating very good reproducibility within a laboratory. The weak sensitizers benzocaine (DiSFeB) and α-hexylcinnamaldehyde (VUMC, HU) and the strong sensitizer cobalt (II) chloride (BASF) showed poor reproducibility between runs. Of note, cobalt (II) chloride was already identified by VUMC, HU, and DiSFeB as giving unreliable results in the MTT assay due to its interference with the spectrophotometric assay readout. Pearson correlations of all chemicals tested yielded a strong correlation between both runs (FD-1 and FD-2) for VUMC, HU, and BASF. Pearson r values ranged from 0.965 to 0.989 (p-value: 0.0001) in these three laboratories. DiSFeB showed a slightly lower but still significant correlation (Pearson r value: 0.688, p-value: 0.019). These results indicate extremely low intra- and inter-laboratory variation with regards to the assay protocol.

EC50 potency ranking
From the FD experiments, EC50 values were determined. All individual results for each laboratory and each fine dose experiment are shown in Table 3. The EC50 values were used to rank the potency of the chemical: the lower the EC50 value, the more cytotoxic (irritant) the chemical and the stronger the sensitizing potency of the chemical. In general there was good agreement among the EC50 values obtained in the four laboratories. Two laboratories reported unreliable results for 2-mercaptobenzothiazole (VUMC, HU) and BASF did not test this chemical as no EC60 was obtained in the BD experiments. p-Phenylenediamine and cobalt (II) chloride were also not tested in 3 of the 4 laboratories, since no reliable EC60 was obtained in the BD experiments.

Notes to Tab. 3: Results are shown from two independent FD experiments (1, 2) with the exception of 2-mercaptobenzothiazole where the 2 runs were repeated (1R, 2R) due to inconclusive data in VUMC and HU laboratories. Areas with dark grey background represent chemicals with an EC50 <7 mg/ml and areas with light grey background represent chemicals with an EC50 ≥7 mg/ml.
NT = chemical not tested in FD as no EC60 concentration was obtained in BD-B; NR = EC50 value not reached in FD; All >50% = all concentrations in FD resulted in more than 50% reduction in viability so no EC50 could be obtained. For DiSFeB isoeugenol FD2: an EC50 value was not obtained and therefore the maximum tested FD concentration (20 mg/ml) identified from BD-B (>EC60) is used as the run was not repeated within the study.

Correlation of EC50 potency values with in vivo LLNA-EC3, and human DSA05
Next the EC50 data were correlated to human NOEL, DSA05 and LLNA-EC3 (Fig. 4). Clearly very reproducible, well correlating, and generally significant results were obtained by each independent laboratory, particularly when the data obtained from the four laboratories were averaged (all laboratories combined: EE-EC50 vs. NOEL Spearman r=0.720, p=0.034; EE-EC50 vs. DSA05 Spearman r=0.845, p=0.006 compared to EE-EC50 vs. LLNA-EC3 Spearman r=0.715, p=0.016). For the independent laboratories, the in vitro EE-EC50 correlation to the human DSA05 data was exceptionally high, although it should be noted that for the main outlier, oxazolone, only mouse LLNA data were available for the correlations.

Correlation of IL-1α 2x values with in vivo LLNA-EC 3 , and human DSA 05
Since IL-1α release is related to cytotoxicity and irritation, and therefore also possibly to sensitizer potency, it was next determined whether a correlation also existed between the IL-1α2x value and NOEL, DSA05, or LLNA-EC3 (Fig. 5; Tab. 5, 6). IL-1α2x is the chemical concentration that causes a 2-fold increase in IL-1α release from the EE into the culture supernatant. Indeed, again reproducible, well correlating and generally significant results were obtained by each independent laboratory, particularly when the data obtained from the four laboratories were averaged. Again, in all cases, the in vitro IL-1α2x correlation to the human DSA05 data was very high and in the same order of magnitude as that observed for the EC50 value correlations (all laboratories combined: IL-1α2x vs. DSA05 Spearman r=0.929, p=0.002 compared to IL-1α2x vs. LLNA-EC3 Spearman r=0.770, p=0.013 or IL-1α2x vs. NOEL Spearman r=0.810, p=0.022).

In the majority of the runs, strong and extreme sensitizers had EC50 values <7 mg/ml, whereas the majority of the moderate and weak sensitizers had an EC50 value ≥7 mg/ml (Tab. 3). Therefore, it was next determined whether it was possible to differentiate weak/moderate from strong/extreme sensitizers using a cut-off of 7 mg/ml (Tab. 4). Since only 2 FD runs were performed in this study, some chemicals scored an ambiguous result. Table 4 describes this prediction model excluding ambiguous results and also describes the worst case scenario if ambiguous chemicals were to score negative. For correct classification of ambiguous chemicals, a 3rd FD run would be required in the future. VUMC, HU, and BASF showed good sensitivity (60-83%), specificity (80-100%), and accuracy (73-82%) in the EE potency assay using 7 mg/ml as cut-off. Only the DiSFeB laboratory analyzed the strong sensitizer p-phenylenediamine, which resulted in an EC50 value ≥7 mg/ml and was the main reason for the generally lower sensitivity and accuracy obtained by DiSFeB compared to the other 3 laboratories. For this prediction model, the within-laboratory reproducibility of the FD runs had a concordance ranging from 77-100%, and the inter-laboratory concordance was 35% for all laboratories combined and 77% for the two best performing laboratories (VUMC and HU) (Tab. 4).

Tab. 4: Predictive capacity of EE potency for each laboratory based on an EC
Data are based on results obtained for the total number of chemicals tested per laboratory for all chemicals from which an EC50 value could be obtained (see Tab. 3). Where ambiguous results from the 2 independent runs were obtained for a single chemical, the result was neither correct nor incorrect (non-conclusive) and is indicated as worst case scenario as follows: +1 = +1 ambiguous chemical; +2 = +2 ambiguous chemicals. Bold underlined numbers indicate the number of chemicals showing correct potency classification according to in vivo data shown in Tab. 1. E/S <7 = extreme/strong sensitizer with EC50 cut-off value <7 mg/ml. M/W ≥7 = moderate/weak sensitizer with EC50 cut-off value ≥7 mg/ml. Sensitivity = percentage of correctly identified strong/extreme sensitizers; specificity = percentage of correctly identified weak/moderate sensitizers; accuracy = average of sensitivity and specificity. In the determination of sensitivity, specificity, and accuracy: no brackets = both ambiguous results and chemicals not tested (no EC50) are excluded; with brackets = worst case scenario is shown, with ambiguous results counted as possibly scoring negative and chemicals not tested (no EC50) excluded.
Intra-laboratory reproducibility: number of chemicals having the same prediction in FD1 and FD2 is shown. Inter-laboratory reproducibility: number of chemicals having the same FD prediction in all laboratories (without brackets) and between only VUMC and HU (with brackets) is shown.

Values shown indicate the chemical concentration (mg/ml) obtained from the dose response experiments where a 2-fold increase in IL-1α release (IL-1α2x) was observed. When the IL-1α2x corresponded to a lower chemical concentration than the lowest concentration tested, this is shown by the sign <.
NT: chemical not tested in FD as no EE-EC60 concentration was obtained in BD-B; NR = IL-1α2x value not reached in FD. Results are from two independent FD experiments (1, 2) with the exception of 2-mercaptobenzothiazole, where the two runs were repeated (1R, 2R) due to inconclusive data in the VUMC and HU laboratories.

Discussion
In this international ring trial the intra- and inter-laboratory variation and the predictive capacity of the EE potency assay were evaluated. Highly reproducible results were obtained in each laboratory. In all laboratories, human EE-EC50 data showed better correlation to human data than to mouse LLNA-EC3 data. Since acceptance criteria had not been previously described, putative acceptance criteria were defined at the start of the study and tested during the study (dos Santos et al., 2011).
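The predictive capacity reported in Tab. 4 rests on the 7 mg/ml EC50 cut-off and on the definitions of sensitivity, specificity, and accuracy given in the table footnotes. The following Python sketch illustrates that bookkeeping only. The function names and all EC50 values are hypothetical and are not study data, and only the variant that excludes ambiguous results (no brackets in Tab. 4) is shown.

```python
CUTOFF_MG_PER_ML = 7.0  # EC50 cut-off stated in the text


def classify_run(ec50):
    """'E/S' (extreme/strong) if EC50 < 7 mg/ml, else 'M/W' (moderate/weak)."""
    return "E/S" if ec50 < CUTOFF_MG_PER_ML else "M/W"


def classify_chemical(ec50_fd1, ec50_fd2):
    """Combine two FD runs; runs that disagree give an 'ambiguous' result."""
    first, second = classify_run(ec50_fd1), classify_run(ec50_fd2)
    return first if first == second else "ambiguous"


def predictive_capacity(results):
    """results: list of (in_vivo_class, predicted_class) pairs.

    Ambiguous predictions are excluded. Accuracy is the average of
    sensitivity and specificity, as defined in the Tab. 4 footnotes.
    Assumes at least one conclusive chemical in each in vivo class.
    """
    conclusive = [(truth, pred) for truth, pred in results if pred != "ambiguous"]
    strong = [pred for truth, pred in conclusive if truth == "E/S"]
    weak = [pred for truth, pred in conclusive if truth == "M/W"]
    sensitivity = 100.0 * strong.count("E/S") / len(strong)
    specificity = 100.0 * weak.count("M/W") / len(weak)
    return sensitivity, specificity, (sensitivity + specificity) / 2


# Hypothetical EC50 pairs (mg/ml) for illustration only, not study data.
hypothetical = [
    ("E/S", classify_chemical(0.3, 0.5)),    # both runs < 7: E/S
    ("E/S", classify_chemical(2.0, 9.0)),    # runs disagree: ambiguous
    ("E/S", classify_chemical(8.0, 12.0)),   # both runs >= 7: misclassified
    ("M/W", classify_chemical(15.0, 20.0)),  # M/W
    ("M/W", classify_chemical(7.0, 30.0)),   # 7.0 is not < 7: M/W
]
sensitivity, specificity, accuracy = predictive_capacity(hypothetical)
```

In this sketch an ambiguous chemical is simply dropped; a third deciding FD run would be needed to classify it.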
Batch variation between the unexposed batches of epiCS® was very low, indicating that the production procedure and transport of the EE were very standardized. For the vehicle exposure, the acceptance criterion "vehicle exposure alone should not result in more than 30% decrease in cell viability compared to unexposed EE" was met. This indicates that this putative acceptance criterion can now be accepted as a valid acceptance criterion for vehicle exposure when further implementing this assay. In contrast, the putative acceptance criterion defined for the positive control resorcinol (20-80% cytotoxicity compared to vehicle exposed EE) was found to be unsuitable for further studies. The degree of cytotoxicity exhibited often varied from <20% to >80% cytotoxicity between batches and between laboratories when testing a single concentration of resorcinol. In the future, the problem may possibly be solved by testing at least two different resorcinol concentrations, thus allowing for slight shifts in the dose response between experiments, or by testing a stronger sensitizer, e.g., DNCB, which shows less variation and is not a prohapten.

The SOP was designed to allow determination of the EC50 value of any unknown chemical using 3 consecutive dose response experiments. Notably, both the intra-laboratory and inter-laboratory variation was low throughout the BD-A, BD-B, and FD experiments. The same start chemical concentration for the FD was often identified by all 4 laboratories (e.g., cinnamaldehyde, citral, eugenol). Also, similar results (EC50 values) were often obtained from the FD between the laboratories despite sometimes up to 10-fold variation in the start chemical concentration being used (e.g., oxazolone). Even though 2 weak sensitizing chemicals (benzocaine, α-hexylcinnamaldehyde) showed poor reproducibility in the duplicate FD runs, these chemicals were still correctly ranked by all laboratories as weak sensitizers. Of note, the EE potency assay appears not only to be reproducible between laboratories but also to a certain extent between different EE cultures (dos Santos et al., 2011; Gibbs et al., 2013a). For example, the EC50 value obtained for DNCB was 0.3 mg/ml in this present study and 1.3 mg/ml in our previous study using in-house VUMC EE, which, when plotted on a log scale, represents very little variation. Taken together, these results emphasize the beneficial effect on final reproducibility of starting with a broad dose finding (10-fold dilutions) to identify and fine-tune the final fine dose finding (2-fold dilutions). The results created in this study have now been incorporated into our most recent yet unpublished developments in which we have been able to identify a single extended dose response of 2-fold dilutions starting at 200 mg/ml. This will enable all unknown sensitizers from weak to extreme to be tested. Notably, benzocaine has been reported to give highly variable results in vivo, in both the LLNA and GPMT (Basketter et al., 1995).

Two chemicals proved to be difficult to test. P-phenylenediamine and cobalt (II) chloride both interfered with the spectrophotometric MTT assay. Whereas p-phenylenediamine oxidizes the substrate in the absence of viable EE, cobalt (II) chloride has a green color. Three of the four laboratories excluded these two coded chemicals already in BD-B since no EC50 was obtained. DiSFeB did continue to test p-phenylenediamine, and BASF did continue to test cobalt (II) chloride in the FD. However, both laboratories wrongly classified the chemicals as weak/moderate sensitizers. This suggests a minor modification to the SOP is required specifying in more detail when a chemical should be excluded due to interference with the MTT assay. For example, the SOP should mention prior analysis of the chemical in the MTT assay in the absence of EE in order to determine whether the chemical distorts the spectrophotometric readout.

Until now, no classical prediction model for the EE potency assay has been defined. Using a test panel of chemicals, EC50 and IL-1α2x values were obtained and correlated to human or LLNA-EC3 data (dos Santos et al., 2011). By continuously adding values obtained from well-defined chemicals, this graph will provide a gold standard correlation graph for determining the potency of an unknown chemical allergen. The EC50 and IL-1α2x values of the unknown chemical can then be correlated to values obtained for the standard test panel and extrapolated to an in vivo value. Eventually enough data will be created in order for the EE potency assay to have its own assessment score in a similar manner to LLNA-EC3 and human NOEL or DSA05 scores, which rank sensitizer potency according to cut-off ranges (see Tab. 1). We foresee that such data will eventually enable the maximum safe threshold concentration of a chemical to be identified when sufficient NOEL and in vitro data are available. In this study an additional prediction model was identified. It was noticed that the potency of a coded chemical could be determined with high accuracy on the basis of a cut-off value for the EC50 (EE-EC50 ≥7 mg/ml = weak to moderate sensitizer; EE-EC50 <7 mg/ml = strong to extreme sensitizer). The average overall accuracy for this approach for the combined results of the four laboratories was 77%, meaning that when using the current assay SOP the chemical was correctly predicted to be either a strong to extreme or weak to moderate sensitizer in 77% of the test situations if two similarly scoring FD runs are obtained. A minor modification to the current SOP should allow for ambiguous scoring chemicals: if ambiguous scoring from the 2 FD runs is obtained, a 3rd deciding FD run should be performed. This prediction model could be very suitable to quickly screen for the most potent sensitizers. Importantly, the discrimination between two classes of sensitizers (weak and strong) coincides with the European Classification, Labeling and Packaging of substances (CLP) regulation, which is harmonized with the United Nations Globally Harmonized System (GHS) of Classification and Labeling of Chemicals (UN-GHS) (see review de Groot et al., 2010).

In order to test the EE potency assay prediction model further with regard to correlating the EC50 value to available human and LLNA-EC3 data, first a detailed review of the literature was performed to identify human and murine potency data. The majority of the data was found in two extensive reports published by ICCVAM (see ICCVAM, 2013a,b and references listed at footnote of Tab. S1 at http://dx.doi.org/10.14573/altex.1308021S). Also, a new human classification score ranging from 1 to 6, with 1 being the most potent sensitizer group and 6 being the non-sensitizer group, has very recently been proposed by Basketter et al. (2014) and has also been incorporated into Table 1. When animal and human data were conflicting or limited (e.g., 2-mercaptobenzothiazole), the human data were prioritized in the ranking above animal data (see Tab. 1). It was noticed from the reports that the LLNA-EC3 potency data were influenced considerably by the vehicle used and the type and duration of the chemical exposure. Therefore, for this study it was decided to use a range of potency data available for LLNA-EC3, all human DSA05 data, and the limited NOEL data available as described in the reports, and to correlate these not only to the results obtained from the individual laboratories but also to the average result obtained from the four laboratories combined. A very good correlation was observed between each laboratory and the in vivo data. This was particularly so with regard to DSA05. Taken together, the human data showed a notably better correlation to EE-EC50 data than the murine data did, although it must be noted that no human data were available for the major outlier oxazolone, thus possibly introducing a minor bias to this result.

Next it was determined whether the release of pro-inflammatory IL-1α into the culture supernatant could provide an additional potency assessment parameter to the EC50 value. Therefore, the IL-1α2x value was correlated to human data and LLNA-EC3 data. As with the EC50 values, a significant correlation was found, both generally on the individual laboratory level and in the overall (averaged) correlation. Again the human data showed a notably better correlation than the mouse data.

The major limitation of the EE potency assay is that, although it can classify chemical allergens according to potency, it is not able to determine whether or not the chemical is a potential sensitizer. Previously, we have shown that IL-18 production by epidermal keratinocytes (NCTC2544 cell line) is a biomarker for distinguishing a sensitizer from a non-sensitizer (Corsini et al., 2009, 2013; Galbiati et al., 2011). Parallel to this study, we found that the EE-EC50 potency assay could be combined with IL-18 release by EE in a single assay, thus greatly increasing the value of this assay that uses commercially available EE in the future (Gibbs et al., 2013a). Alternatively, the EE potency assay can be combined with any other assay or test battery that can distinguish a sensitizer from a non-sensitizer in a tiered or integrated approach (Corsini et al., 2009; Johansson et al., 2013; Natsch et al., 2013). Another limitation of the assay is that not all chemical exposures result in an EE-EC50 value being obtained. If there are no solubility issues, e.g., a maximum concentration of 200 mg/ml could be tested, and still no EE-EC50 value is obtained, it is possible that the chemical is a very weak sensitizer. However, it cannot be ruled out that the chemical does not penetrate the stratum corneum and therefore cannot be tested properly in the assay. In vivo, the penetration route for such a chemical may possibly be via the hair follicle. At the moment such chemicals are considered to fall outside of the applicability domain as they cannot be fully tested according to the SOP.

Whereas many assays are being developed or are under prevalidation for determining whether or not a chemical is a potential sensitizer, relatively few assays address sensitizer potency (Mehling et al., 2012). A complicating factor in comparing the potency data from other studies and our study is that correlation of the data to LLNA data is performed using different statistical means in the different studies, and few as well as different chemicals have been investigated (Kolle et al., 2013). One very promising assay is the Genomic Allergen Rapid Detection (GARD) assay (Johansson et al., 2011). This dendritic cell based assay uses a genomic biomarker signature of chemical exposed MUTZ-3 cells to determine whether a chemical is a potential sensitizer and also the potency of the sensitizer. Similar to our EE potency assay, oxazolone was a major outlier and α-hexylcinnamaldehyde scored as a weak sensitizer rather than, as sometimes reported in the LLNA, a moderate sensitizer. Another assay is the KeratinoSens™ assay (ARE-regulated luciferase activity assay using the cell line HaCaT containing a stable insert of the luciferase gene under control of the ARE element of the gene AKR1C2), which has recently undergone international pre-validation (Natsch et al., 2011). Also, a very different type of in vitro assay is the non-cell based peptide reactivity assay based on the ability of a chemical to react with two synthetic peptides containing either a single cysteine or lysine (Gerberick et al., 2007b, 2009). Interestingly, whereas these last 2 assays could correctly classify oxazolone as an extreme/strong sensitizer, they both have difficulty identifying and therefore also assessing the potency of pro-haptens (Emter et al., 2010). For example, resorcinol and eugenol are false negatives, whereas in our EE potency assay these pro-haptens can be accurately assessed (dos Santos et al., 2011). Taken together, these results indicate that if limitations are taken into account, such as chemical solubility, instability and metabolism, then in vitro assays may have the potential to assess sensitizer potency. At the moment it is too early to say whether one assay performs better than another. However, it would be interesting and important to determine whether the different assays are able to complement each other with regard to chemicals that perform poorly in one particular assay and well in another assay (Bauch et al., 2011, 2012). Such an approach to assess skin sensitizer potency may involve the inclusion of multiple toxicological parameters and a weight of evidence approach, using the data from multiple assays as suggested by Natsch and co-workers, who have started by analysing the assay end-points for 145 chemicals tested in the U937, the DPRA, and the KeratinoSens™ assay (Natsch et al., 2013). The EE potency assay may not only provide additional in vitro information but may also increase the relevance of the information for humans, since it involves penetration (bioavailability) of a chemical through the stratum corneum in order for the chemical to exert a cytotoxic/irritant effect on the viable epidermal layers below.

In conclusion, our results indicate that the EE potency assay is a robust assay for testing chemical sensitizers of unknown potency and that only minor modifications are required before entering further validation. Little intra- and inter-laboratory variation was observed and a good correlation was observed between our in vitro EC50 potency data and that derived from human and animal studies. At present, since only a few chemicals have been tested, it is too early to say whether a combined readout of EE-EC50 and IL-1α2x will further improve the prediction model. Our results suggest that this assay may now be suitable for validation as it will provide additional and complementary information to other assays already undergoing such developments.

Sensitizer values are obtained and ranked as described in Table 1.
EE-EC50 and IL-1α2x represent values obtained by linear regression analysis, mean ±SD of combined data obtained from the four laboratories. For each laboratory, the average value obtained from the independent runs was used in the analysis.

Fig. 1: Flow diagram for the pre-validation study illustrating the method used for chemical exposure, broad dose A and B finding, and fine dose finding

Fig. 2: Individual viability values for unexposed, vehicle exposed (1% DMSO or AOO 4:1) and resorcinol exposed (positive control) epiCS®. The batch numbers allotted to each laboratory did not correlate between laboratories and therefore the deviations observed between laboratories were not due to the same batch of epiCS®. The black line corresponds to the upper limit of 80% viability of the positive control exposed EE.

Fig. 3: Agreement in EC50 values between the fine dose run 1 and run 2. Dots refer to the values obtained for the different test chemicals in Table 3. Only chemicals that could be tested in FD are included. The line corresponds to the equality line. Note: dots falling on the line or near the line indicate good reproducibility within laboratories. Left side plots show the full range and right side plots show the range to 25 mg/ml.

Fig. 4: Correlation of EE-EC50 values with human NOEL and DSA05 data and murine LLNA-EC3 data. In vivo data are derived from Table 6 and represent the average ± range of values described in ICCVAM reports (see refs: ICCVAM, 2013a,b). In vitro data are derived from Tables 3 and 6; EE-EC50 values are obtained by linear regression analysis based on viability changes (MTT assay). For individual laboratories, EE-EC50 data represent the average obtained from the two FD runs ± range of the 2 values. For all laboratories combined, data represent the average of the 4 laboratories ±SD. Since the data are used to rank chemical potency, Spearman correlation (r) and p value (two-tailed) using all data are shown. The line represents the visual line of best fit when the major deviating chemical oxazolone is excluded from the line for LLNA-EC3.
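Because the in vitro and in vivo values are compared by rank, the Spearman coefficient can be computed as the Pearson correlation of the ranks. The following pure-Python sketch shows this, with tied values given their average rank. The function names and the example numbers are hypothetical and are not study data, and no p value is computed here.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the group of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_r(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical in vitro EC50 values (mg/ml) and in vivo potency values.
in_vitro = [0.3, 1.5, 7.0, 20.0, 50.0]
in_vivo = [0.05, 0.4, 0.2, 5.0, 12.0]
r = spearman_r(in_vitro, in_vivo)
```

Only the ordering of the values enters the calculation, so a single chemical with an extreme value affects the coefficient through its rank alone.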

Fig. 5: Correlation of IL-1α2x values with human NOEL and DSA05 data and murine LLNA-EC3 data. In vivo data are derived from Table 6 and represent the average ± range of values described in ICCVAM reports (see refs: ICCVAM, 2013a,b). In vitro data are derived from Tables 5 and 6; IL-1α2x values were obtained by linear regression analysis based on a 2-fold increase in IL-1α release into culture supernatants. For individual laboratories, IL-1α2x data represent the average obtained from the 2 FD runs ± range of the two values. For all laboratories combined, data represent the average of the four laboratories ±SD. Since the data are used to rank chemical potency, Spearman correlation (r) and p value (two-tailed) using all data are shown. The line represents the visual line of best fit when the major deviating chemical oxazolone is excluded from the line for LLNA-EC3.
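The captions state that EE-EC50 and IL-1α2x values were obtained by linear regression, but the exact regression set-up (which dose points, linear or log concentration axis) is not given in this section. The following Python sketch therefore assumes an ordinary least-squares line of response against concentration and solves it for the target response. The function names and the dose-response numbers are hypothetical and are not study data.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration_at(target, concentrations, responses):
    """Concentration at which the fitted line reaches the target response."""
    slope, intercept = fit_line(concentrations, responses)
    return (target - intercept) / slope


# Hypothetical fine-dose data (mg/ml): viability as % of vehicle control.
conc = [2.0, 4.0, 6.0, 8.0]
viability = [80.0, 60.0, 40.0, 20.0]
ec50 = concentration_at(50.0, conc, viability)          # 50% viability

# Hypothetical IL-1α release as fold increase over vehicle control.
il1a_conc = [1.0, 2.0, 4.0, 8.0]
il1a_fold = [1.0, 1.5, 2.5, 4.5]
il1a_2x = concentration_at(2.0, il1a_conc, il1a_fold)   # 2-fold increase
```

The same helper serves both readouts because each is defined as the concentration at which a fitted response crosses a fixed threshold.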

Tab. 2: Chemical information: vehicle, maximum solubility, starting concentrations used in broad dose and fine dose finding
Start concentrations for BD-B and FD are the chemical concentrations that result in >60%, preferably >80% reduction in EE viability compared to vehicle exposed EE in the prior run (BD-A and BD-B, respectively). a Vehicles used in this study for dissolving chemicals before applying topically to EE. DMSO = 1% DMSO in culture medium; AOO = acetone:olive oil (4:1). b Start chemical concentration used in the different laboratories. NT: not tested in BD-B in some laboratories as no starting concentration was obtained from BD-A; rep: BD-B run repeated because all concentrations in FD-1 resulted in >50% viability (HU, BASF); NR: result not reliable due to color interference with the MTT assay.
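The broad dose finding uses 10-fold dilutions and the fine dose finding uses 2-fold dilutions, each beginning at a start concentration. The following Python sketch generates such series. The number of dilution steps is not stated in this section, so `n_steps` is a hypothetical parameter, and the start concentrations are illustrative only.

```python
def dilution_series(start_mg_per_ml, fold, n_steps):
    """Concentrations from the start value downwards by a constant fold factor.

    n_steps is a hypothetical parameter; the SOP's actual number of
    concentrations per run is not given in this section.
    """
    return [start_mg_per_ml / fold ** k for k in range(n_steps)]


# Broad dose finding: 10-fold dilutions from an illustrative 200 mg/ml start.
broad = dilution_series(200.0, 10, 4)

# Fine dose finding: 2-fold dilutions from an illustrative 20 mg/ml start.
fine = dilution_series(20.0, 2, 4)
```

Here `broad` spans three orders of magnitude to locate the cytotoxic range, and `fine` then resolves the EC50 within a narrower window.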