Derivation of the no expected sensitization induction level for dermal quantitative risk assessment of fragrance ingredients using a weight of evidence approach

Some fragrance ingredients may have the potential to induce skin sensitization in humans but can still be safely formulated into consumer products. Quantitative Risk Assessment (QRA) for dermal sensitization is required to determine safe levels at which potential skin sensitizers can be incorporated into consumer products. The no expected sensitization induction level (NESIL) is the point of departure for the dermal QRA. Sensitization assessment factors are applied to the NESIL to determine acceptable exposure levels at which no skin sensitization induction would be expected in the general population. This paper details the key steps involved in deriving a weight of evidence (WoE) NESIL for a given fragrance ingredient using all existing data, including in vivo, in vitro, and in silico data. Read-across can be used to derive a NESIL for a group of structurally similar materials when data are insufficient. When sufficient target and read-across data are lacking, exposure-based waiving using the dermal sensitization threshold (DST) may be applied. We outline the process as it currently stands at the Research Institute for Fragrance Materials Inc. (RIFM) and provide examples; the process is dynamic and is bound to change with evolving science as new approach methodologies (NAMs) are actively incorporated.


Introduction
Chemicals, including fragrance ingredients, with the potential to cause skin sensitization can be safely formulated into consumer products at levels not expected to induce skin sensitization. Exposure-based quantitative risk assessment (QRA) for induction of dermal sensitization is applied to fragrance ingredients with sensitization potential to determine safe levels at which they can be used in different product types. The dermal sensitization QRA was developed with the aim of preventing the acquisition or induction of skin sensitization, as opposed to elicitation, because the factors associated with induction are currently better understood than those associated with elicitation. Nonetheless, when induction is prevented, elicitation of skin sensitization may be eliminated or reduced. A proposal for assessing the risk of induction of skin sensitization to fragrance materials in different product categories, quantitative risk assessment 1 (QRA1), was first published in 2008 and updated to quantitative risk assessment 2 (QRA2) in 2020.
Fragrance ingredients determined to be sensitizers based on weight of evidence (WoE) from all available data (in silico, in vitro, and in vivo [human and animal]) require the application of QRA2 for the protection of consumers. In RIFM's WoE approach, a no-observed-effect level (NOEL) for the induction of skin sensitization is confirmed for sensitizers through the human repeat insult patch test. Since the human repeat insult patch test is a confirmatory test, RIFM has proposed renaming it as the "confirmation of no induction in humans" (CNIH) (Na et al., 2020). A NOEL confirmed in a CNIH, conducted according to the RIFM protocol (Politano and Api, 2008), is primarily used to set the NESIL. The NESIL is the critical benchmark or point of departure for the application of the dermal sensitization QRA. Na et al. described how all the available skin sensitization data could be used to set a NESIL based on WoE (to be published). It may be possible in the future to use the categories described in this paper to establish a NESIL without using a CNIH.
Most fragrance ingredients are structurally simple, low molecular weight, predominantly semi-volatile substances consisting of carbon, hydrogen, and oxygen. Chemical structure helps to predict transdermal absorption, metabolism and disposition, and functional groups that can influence toxicity. Structural similarities among fragrance ingredients permit some generalizations because chemicals that share certain common structural elements typically have comparable physicochemical and toxicokinetic properties and may exhibit a common mode of action. Data from one or more tested chemicals can be used to predict the toxicity of a structurally similar chemical for the same test or endpoint (Date et al., 2020). The clustering of structurally related materials allows one to reasonably predict some degree of consistency of metabolism and toxicity. Chemical structure-based clustering of the RIFM fragrance chemical inventory has been completed. Fragrance ingredients with limited or insufficient data to determine their sensitization potential or potency can be "read-across" from structurally similar chemicals with sufficient data within the same or adjacent clusters. Read-across is based on the underlying hypothesis that the toxicity of a particular chemical is a function of its molecular structure (Date et al., 2020; T. W. Schultz et al., 2015; Terry W. Schultz, Richarz and Cronin, 2019).
When limited historical data are available for a material, and no appropriate read-across analogs are available, RIFM applies exposure-based waiving based on the dermal sensitization threshold (DST). The DST, the dermal exposure level below which no skin sensitization is expected for a chemical based on its reactivity, is an important tool that has resulted in a significant reduction in animal testing (Nishijo et al., 2019; Roberts et al., 2015; R. J. Safford, 2008; Robert J. Safford, Api et al., 2015; R. J. Safford, Aptula and Gilmour, 2011). Generally, fragrance ingredients with limited historical data are used in small quantities and therefore have low exposure to the general population. Further testing can be waived for low exposure materials depending on whether their levels of use fall below the reactive or non-reactive DST. The reactivity of chemicals is predicted based on expert judgment, with the aid of in silico, in vitro, and in vivo data.
Since 2013, RIFM has not conducted any skin sensitization studies in animals, and new testing has been limited to in vitro assays and CNIH. Testing is conducted to fill various data gaps for individual chemicals or chemical clusters. Defined approaches (DAs) amalgamating data from various in vitro sources studying the key events of the skin sensitization adverse outcome pathway (AOP) may be used to predict the sensitization potential of chemicals. An example is the "2 out of 3" DA, where in vitro assays of the skin sensitization AOP, including protein binding (direct peptide reactivity assay (DPRA)), keratinocyte activation (KeratinoSens or LuSens), and dendritic cell activation (human cell line activation test (h-CLAT) or U-SENS assay), are evaluated in combination to determine hazard (OECD, 2018a, 2018b, 2020; D. Urbisch et al., 2015). These in vitro assays may be conducted to determine the reactivity of chemicals in a cluster or to determine hazard based on the "2 out of 3" defined approach (Kolle et al., 2019; OECD, 2019b; Urbisch et al., 2015). In the 2 out of 3 approach, chemicals with at least two positive results in tests addressing key events 1-3 (DPRA, KeratinoSens, and U-SENS or h-CLAT) are predicted sensitizers, while chemicals with none or only one positive outcome are predicted non-sensitizers. For sensitizers, a CNIH from the target chemical or read-across must be available to set the NESIL. If no appropriate CNIH data are available to set the NESIL, then this test may be conducted on the target chemical of interest or a read-across analog to clear the materials in a cluster. RIFM currently only uses available OECD validated in vitro methods for hazard identification, but several NAMs are in the development and validation stages by industry shareholders. In the future, RIFM may use NAMs, alone or in combination, to determine potency and set the NESIL for QRA.
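The decision rule of the "2 out of 3" defined approach can be illustrated with a minimal Python sketch. The function name and the handling of a missing assay (no prediction unless two concordant results exist) are assumptions made for illustration and are not part of the RIFM process described here.

```python
from typing import Optional


def two_out_of_three(
    dpra: Optional[bool],
    keratinocyte: Optional[bool],
    dendritic: Optional[bool],
) -> Optional[str]:
    """Combine three in vitro calls (True = positive, None = not tested).

    Hypothetical helper: returns 'sensitizer', 'non-sensitizer', or None
    when fewer than two concordant results are available (an assumption).
    """
    results = [r for r in (dpra, keratinocyte, dendritic) if r is not None]
    positives = sum(results)
    negatives = len(results) - positives
    if positives >= 2:
        return "sensitizer"
    if negatives >= 2:
        return "non-sensitizer"
    return None
```

With all three assays available, the function always returns a prediction, matching the rule that two or more positives indicate a sensitizer and zero or one positive indicates a non-sensitizer.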
Outlined below is a guide on the current iterative steps involved in the derivation of a NESIL for fragrance ingredients in the RIFM chemical inventory.

Steps involved in conducting safety assessment for skin sensitization
Step 1: Determine the potential (hazard) to induce sensitization for a target material

A. Look at all historical data
Identifying hazard is the first step in the safety assessment of a fragrance material. To that end, all scientific data (published and 'in house') are included and considered for the safety evaluation of fragrance ingredients. This includes the physical and chemical properties of the materials under investigation, in silico data such as results obtained from (Q)SAR [(Quantitative) Structure Activity Relationship] modeling, chemical categories, grouping, in vitro data, and existing human and/or animal data relevant to skin sensitization. The in vitro data, including protein binding (direct peptide reactivity assay (DPRA)), keratinocyte activation (KeratinoSens or LuSens), and dendritic cell activation (h-CLAT or U-SENS assay), can also be used in determining hazard (OECD, 2018a, 2018b, 2020; Urbisch et al., 2015). These assays contribute to the identification of skin sensitization hazard, but there is insufficient evidence to date that they provide reliable indicators of potency. The OECD continues to evaluate a range of defined approaches for combining data from individual assays, which may also assist in potency determination (OECD, 2017, 2019a). Nevertheless, it remains challenging to achieve a complete replacement of in vivo testing for potency determination (D. Basketter et al., 2020). In addition to in vitro data, historical animal data, such as guinea pig studies and the local lymph node assay (LLNA) conducted according to established OECD test guidelines, are considered to predict the hazard potential of the material (OECD, 1992, 2010). Existing human data may include CNIH tests, human maximization tests (HMTs), and diagnostic patch tests. Any unequivocal reactions indicative of skin sensitization observed in these confirmatory human tests indicate that the material is a skin sensitizer; however, it should be noted that no human tests are conducted for hazard identification.
Since 2008, the methodology of the CNIH has been standardized by RIFM, and the studies are performed with approval from an ethical review board (Politano and Api, 2008). A material is considered a non-sensitizer when clear negative results are available from human, animal, and/or in vitro studies. The absence of protein binding alerts from in silico tools (i.e., OECD Toolbox and Toxtree) strengthens the WoE in the evaluation of non-sensitizers. If the material's potential to induce skin sensitization is demonstrated in any one of the in vivo tests and/or in at least 2 of the 3 in vitro tests, the material is considered a skin sensitizer (Bauch et al., 2012; Kolle et al., 2019).

B. If historical data are insufficient or not available: Determine a suitable read-across
When existing historical data are insufficient to adequately determine the sensitizing potential of a fragrance material, i.e., conclude that the substance is not a skin sensitizer and/or derive a NESIL, the next step is to find a suitable read-across. Read-across is a critical approach used by RIFM to waive testing by using information from structurally similar analogs to bridge data gaps for target materials. The RIFM Database presents an advantage in the search for structural analogs, as it holds the best collection of data on fragrance and flavor ingredients in the world (Api, 2002). Chemicals in this database are clustered into categories/groups that make it easier to search for fragrance ingredient read-across analogs for any endpoint of interest (Date et al., 2020).
Read-across analogs are selected by expert review of chemicals with the aid of computational or in silico methods. Structural, reactivity, metabolic, and physicochemical similarities are considered in the selection of read-across candidates. RIFM experts have several rules for selecting read-across analogs for each endpoint. For the skin sensitization endpoint, the reactivity of a chemical towards skin proteins is the most critical chemical property assessed. The read-across analog must be more reactive than the target chemical and have the same mechanism of reactivity (e.g., Michael addition, Schiff base formation, acylation). These are some of the most critical rules applied during an expert review of read-across analogs for skin sensitization, but the list is not exhaustive. Appropriate read-across analogs for skin sensitization provide data-bridging studies conducted according to OECD test guidelines or CNIH studies conducted according to the RIFM protocol (Politano and Api, 2008).
C. If no read-across analogs are available: Determine if exposure to a target fragrance material is below the DST for reactive and non-reactive chemicals
In the absence of sufficient data or read-across, a material may be evaluated by utilizing the DST. The DST applies the concept of threshold of toxicological concern (TTC) to the evaluation of dermal sensitization, by establishing a level below which there is no appreciable risk for the induction of skin sensitization (R. J. Safford, 2008). This is based on a probabilistic analysis of potency data for a diverse range of known chemical allergens. Available data on the material and materials in its cluster, as well as predictions from in silico tools and expert judgement, are used to determine if a material is non-reactive or reactive. If a material is considered non-reactive, a DST of 900 μg/cm² is applied (R. J. Safford et al., 2011), and a DST of 64 μg/cm² is utilized for reactive materials (Robert J. Safford et al., 2015). For reactive materials that are further classified in the high potency category (HPC), an HPC DST of 1.5 μg/cm² may be applied (Nishijo et al., 2019; Roberts et al., 2015); however, this threshold has yet to be utilized on fragrance ingredients in the RIFM Database. The thresholds of 900 μg/cm² and 64 μg/cm² are utilized in the dermal sensitization QRA. When the reported 95th percentile use concentration of a material in finished products does not exceed the maximum acceptable concentration for the non-reactive or reactive DST in all QRA categories, as previously described, the safety assessment for the material can be completed. Thresholds of toxicological concern for skin sensitization are constantly under review and may be updated or refined as new data become available. If the use of a material exceeds the maximum acceptable concentrations of its respective DST, testing may be required (described in Step 4).
Since exposure is critical in determining if the DST can be applied and continue to be applied, RIFM's policy is to update exposure data a minimum of every 5 years.
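The DST check described above can be sketched in Python as follows. The DST values are those quoted in the text; the function names, the category labels, and the convention that a category with no reported use counts as zero exposure are assumptions for illustration. The maximum acceptable concentrations per QRA category are taken as inputs because their derivation is described elsewhere.

```python
# DST values in ug/cm^2, as quoted in the text.
DST_UG_PER_CM2 = {
    "non-reactive": 900.0,
    "reactive": 64.0,
    "high-potency": 1.5,  # reported as not yet used for fragrance ingredients
}


def select_dst(reactivity_class: str) -> float:
    """Return the DST for a reactivity class assigned by expert review."""
    return DST_UG_PER_CM2[reactivity_class]


def can_waive_testing(use_conc_p95: dict, max_acceptable_conc: dict) -> bool:
    """True if the 95th percentile use concentration does not exceed the
    maximum acceptable concentration in every QRA category.

    Both dicts map a (hypothetical) category label to a concentration in %.
    A category missing from use_conc_p95 is treated as no use (assumption).
    """
    return all(
        use_conc_p95.get(category, 0.0) <= limit
        for category, limit in max_acceptable_conc.items()
    )
```

If the check fails in any single category, the material is not cleared by the DST and further testing may be needed.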

Step 2: Dose-response
A dose-response relationship provides information on how increasing levels of exposure will produce an increasing risk of dermal sensitization. Historically, several animal models have been used to determine the dose-response relationship for a fragrance ingredient to induce sensitization. Potency, which is derived from the dose-response relationship, is crucial information for determining the NESIL. Guinea pig tests (adjuvant and non-adjuvant) have been used for many years to assess the inherent contact sensitization potential of chemicals. Some of these tests are also used to indicate potency, although the murine LLNA (OECD, 2010) became the favored animal test to identify skin sensitization hazards as well as to measure relative potency. Relative potency is determined from the dose-response curve by deriving an EC3 value (i.e., the estimated dose of a substance required to induce a positive threshold response, as derived by linear interpolation) (D. A. Basketter et al., 1999). The EC3 value has been demonstrated to closely correlate with the NOEL from human sensitization tests designed to confirm lack of induction (Gerberick et al., 2001, 2004; Griem et al., 2003; Schneider and Akkan, 2004). Dose-response information determined from the LLNA is important in determining potency.
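As an illustration of the linear interpolation behind the EC3, the sketch below assumes the usual LLNA convention that the positive threshold response is a stimulation index (SI) of 3, a detail that is background knowledge rather than something stated in this paper. The function name and the example data are hypothetical.

```python
from typing import Optional, Sequence


def ec3_linear_interpolation(
    doses: Sequence[float], stimulation_indices: Sequence[float]
) -> Optional[float]:
    """Estimate the dose giving SI = 3 from LLNA dose-response data.

    Uses the two adjacent data points that bracket SI = 3: (c, d) just
    below the threshold and (a, b) at or above it, then interpolates
    EC3 = c + (3 - d) / (b - d) * (a - c).
    Returns None when no pair of adjacent doses brackets SI = 3.
    """
    points = sorted(zip(doses, stimulation_indices))
    for (c, d), (a, b) in zip(points, points[1:]):
        if d < 3.0 <= b:
            return c + (3.0 - d) / (b - d) * (a - c)
    return None


# Hypothetical data: doses in % and the SI measured at each dose.
example_ec3 = ec3_linear_interpolation([2.5, 5.0, 10.0], [1.5, 2.0, 4.0])
```

The result is in the same units as the input doses, so it must still be converted to a dose per unit area before it can inform a NESIL.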
However, efforts continue to eliminate the need for any in vivo testing. In the European Union, animal testing of cosmetics and their ingredients is banned, and consequently, non-animal tests are essential to provide the basis for hazard assessment. For each of the first 3 key events in the adverse outcome pathway (AOP) for skin sensitization (OECD, 2014), an approved in vitro test has become available. These include the DPRA (OECD TG 442C) (OECD, 2020), the ARE-Nrf2 luciferase methods (KeratinoSens/LuSens, OECD TG 442D) (OECD, 2018b), and the U-SENS/h-CLAT (OECD TG 442E) (OECD, 2018a). These assays are being used to identify skin sensitization hazards. While there is insufficient evidence that in vitro data alone can be used as indicators of potency, there are data to support that in vitro methods can be used in conjunction with other data to derive potency (OECD, 2021).

Step 3: Setting a WoE NESIL
The NESIL is a benchmark that is derived from all available data, including in silico, in vitro, animal, and human data, as well as read-across, through the application of the WoE approach. The NESIL is expressed as a dose per unit area (e.g., μg/cm²).
In contact allergy, there is overwhelming empirical support for using quantity per unit area rather than other dose metrics such as concentration applied to the skin (Kligman, 1966;Magnusson and Kligman, 1969;Rees et al., 1990;Upadhye and Maibach, 1992;White et al., 1986). An in-depth review of the published studies that support the use of dose per unit area in risk assessments for induction of dermal sensitization has been published (Kimber et al., 2008).
A human sensitization test is used to confirm the lack of sensitization at an exposure level which is identified as a likely NOEL from all available data, including quantitative structure activity relationships. The test most typically conducted is the human repeat insult patch test. A human repeat insult patch test that is undertaken to confirm the lack of sensitization is referred to as a CNIH. This test exaggerates exposure from normal use of fragrance ingredients in consumer products. Such tests must meet current ethical and methodological criteria and must fall within the remit of a properly constituted, independent, and transparent ethical review committee (institutional review board). With the implementation of the QRA1 approach, RIFM recommended the use of the RIFM standard CNIH protocol for the generation of confirmatory human data for use in QRA. Details of this standard protocol have been previously described (Politano and Api, 2008).
Diagnostic patch test data from dermatology clinics are not used in the determination of the NESIL. This is because these data are a measure of elicitation of allergic contact dermatitis, not induction of dermal sensitization. There are insufficient data to discern any quantitative relationship between induction and elicitation, and an expert group on skin sensitization concluded that it would not be appropriate to define elicitation thresholds as a function of skin sensitizing potency (Ezendam et al., 2012). Nevertheless, diagnostic patch test data can be useful to help determine the need for additional data. For example, these data may indicate where current exposures to a fragrance material may be a source of clinically relevant allergic contact dermatitis. The absence of positive diagnostic patch test reactions following testing in dermatology clinics may support current exposure levels (use concentrations) for that fragrance material.
A detailed guide on how the NESIL is used as a benchmark in QRA to derive maximum acceptable concentrations for dermal exposure in different product categories has previously been published. Briefly, the process entails application of sensitization assessment factors (SAFs) to the NESIL to account for uncertainties. The SAFs account for inter-individual variability, product composition, frequency/duration of use, and skin condition, and are used to determine acceptable exposure levels per product category in which a fragrance ingredient may be used. In QRA2, we adjust the maximum acceptable concentrations by taking aggregate exposure into account.
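The arithmetic of applying an SAF to the NESIL can be sketched as follows. In the published QRA methodology the acceptable exposure level (AEL) is the NESIL divided by the total SAF for a product category, and the consumer exposure level is then compared with the AEL. The function names and the SAF value used here are hypothetical placeholders, and the aggregate exposure adjustment of QRA2 is not modeled.

```python
def acceptable_exposure_level(nesil_ug_per_cm2: float, total_saf: float) -> float:
    """AEL (ug/cm^2) = NESIL / total SAF for one product category."""
    if total_saf <= 0:
        raise ValueError("total_saf must be positive")
    return nesil_ug_per_cm2 / total_saf


def exposure_is_acceptable(
    consumer_exposure_ug_per_cm2: float, nesil_ug_per_cm2: float, total_saf: float
) -> bool:
    """True when the consumer exposure does not exceed the AEL."""
    return consumer_exposure_ug_per_cm2 <= acceptable_exposure_level(
        nesil_ug_per_cm2, total_saf
    )


# Illustration only: the citral NESIL from the text with a placeholder SAF of 100.
citral_ael = acceptable_exposure_level(1400.0, 100.0)
```

A larger SAF yields a lower AEL, so product categories with greater uncertainty are held to lower acceptable exposures.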
Several criteria can assist in determining the NESIL. Using a WoE approach, all the available data for a chemical are taken into consideration. Historical animal and human (in vivo) data, quantitative structure activity relationships (QSAR) or in silico models, in vitro models (including in chemico models), and read-across data obtained on structurally and/or mechanistically related chemicals can be applied in the derivation, and uncertainty in the underlying data is considered when deriving a human NESIL.

Step 4: Testing
When the existing data on the material under investigation are insufficient to conclude the safety assessment, generation of additional data is required. In vitro testing can be used to determine hazards. If the material is determined to be a skin sensitizer, a CNIH is required to set the NESIL. In circumstances where a CNIH is considered essential for an assessment, a cautious approach is mandatory for the selection of the dose used for testing to minimize the likelihood of sensitizing the exposed study volunteers. All existing data on the target and structurally related analogs must be considered when selecting an appropriate dose for a CNIH.
Read-across chemicals may also be important for building the overall WoE to support conclusions made, even when sufficient data are available on a particular fragrance ingredient. In cases where testing cannot be avoided, read-across analogs with insufficient data may be tested. These analogs are prioritized for testing based on the number of tests required for data-gap-filling and the number of materials or clusters that can be cleared in safety evaluation using the read-across.
The steps described above are summarized in a flowchart in Fig. 1. All data and conclusions are reviewed by the Expert Panel for Fragrance Safety, comprising internationally known academic scientists, including dermatologists, pathologists, toxicologists, and environmental scientists (http://fragrancesafetypanel.org/), for approval before publication.
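The order in which the steps are tried can also be expressed as a small Python sketch. The three boolean inputs stand in for expert conclusions reached in Steps 1A to 1C; the names are hypothetical, and the sketch does not capture the iterative, judgment-based nature of the actual review.

```python
from enum import Enum


class Route(Enum):
    HISTORICAL_DATA = "conclude from data on the target material"
    READ_ACROSS = "conclude from a read-across analog"
    DST_WAIVER = "exposure-based waiving using the DST"
    TESTING = "generate new data (in vitro assays and, if needed, a CNIH)"


def select_route(
    sufficient_target_data: bool,
    suitable_read_across: bool,
    exposure_below_dst: bool,
) -> Route:
    """Pick the first applicable route, in the order described in the text."""
    if sufficient_target_data:
        return Route.HISTORICAL_DATA
    if suitable_read_across:
        return Route.READ_ACROSS
    if exposure_below_dst:
        return Route.DST_WAIVER
    return Route.TESTING
```

Testing is the fallback only when none of the earlier routes applies, which reflects the emphasis on using existing data first.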

Case studies: WoE approach for skin sensitization analysis and NESIL derivation
The robustness of all available data on a fragrance ingredient is evaluated to establish the WoE for hazard identification, potency, and derivation of the NESIL. Greater weight is placed on studies conducted according to established OECD guidelines and CNIHs conducted according to the RIFM protocol. Structural analysis is done based on in silico predictions from software such as OECD Toolbox, TIMES-SS, and Toxtree, as well as expert judgement by the Expert Panel for Fragrance Safety. Data from all in vitro assays on a chemical are also considered, but those conducted according to OECD test guidelines 442C, 442D, and 442E carry more weight. All available historical animal experiments, including those conducted in guinea pigs (Freund's complete adjuvant test (FCAT), open epicutaneous test (OET), closed epicutaneous test (CET), Draize, guinea pig maximization test (GPMT), Buehler) and mice (mouse ear swelling test and LLNA) are considered for hazard identification for a given fragrance ingredient. However, those with OECD test guidelines, i.e., the GPMT, Buehler, and LLNA, provide more WoE for safety assessment. RIFM classifies chemicals in potency categories according to the ECETOC Technical Report 87 (ECETOC, 2003) for animal studies conducted as described in their respective OECD test guidelines. Potency classification from guinea pig studies is not definitive and only provides a range, whereas the LLNA provides the EC3 as a specific quantitative potency value. The LLNA EC3 has been shown to correlate well with the human NOEL; therefore, the LLNA EC3 dose may be selected for testing in a CNIH to confirm a NESIL. Clinical patch tests and historical HMTs are primarily used for hazard identification. While both HMTs and CNIHs provide benchmarks for NOELs, only CNIHs conducted according to the RIFM protocol are considered for confirming a NOEL and setting the NESIL for QRA.
Fig. 1. Flowchart of steps involved in the derivation of a NESIL for application of QRA2.
Below are some case studies illustrating how the WoE approach is applied to derive a NESIL. Summarized in Table 1 are case studies of the WoE approach, while Table 2 summarizes some testing strategies and considerations made when deriving a NESIL.
3,3,5,5-Tetramethyl-4-ethoxyvinylcyclohexanone (CAS # 36306-87-3) was determined to be a non-sensitizer based on the 2 out of 3 in vitro defined approach. There were no direct protein binding alerts predicted in silico (Toxtree 3.1.0; OECD Toolbox v 4.2), but a radical reaction alert was predicted with the autoxidation simulator in the OECD Toolbox. A negative HMT and OET are also available on this material. According to the RIFM framework, the negative HMT and OET alone would not be sufficient to determine hazard but provide WoE to support the non-sensitizer evaluation from the in vitro studies.
Citral (CAS # 5392-40-5) was found to be a sensitizer based on 2 of 3 in vitro studies, animal tests, HMT, and CNIH. Given that the LLNA EC3 correlates well with the human NOEL, the mean EC3 dose of 1414 μg/cm² was selected for testing and produced no sensitization in the CNIH. The NESIL for citral was set at 1400 μg/cm².
In rare circumstances (such as low volume of use or low exposure levels), the EC3 value (or weighted mean when more than one study exists) can be used to define a default NESIL based on potency considerations (Gerberick et al., 2001). This approach requires expert judgment. The annual volume of use of α-butylcinnamaldehyde (CAS # 7492-44-6) was reported to be between 1 and 100 kg according to a 2015 volume of use survey of the fragrance industry (IFRA, 2015), and its 95th percentile total chronic systemic exposure (dermal, oral, and inhalation) was <0.00001 μg/kg/day (Creme RIFM Aggregate Exposure Model version 3.0). This chemical was found to be sensitizing in 2 LLNAs with a weighted mean EC3 value of 11.08% (2775 μg/cm²). A CNIH test was not conducted, and a default NESIL of 1000 μg/cm² was defined for this material, based on the potency considerations of Gerberick et al. (2001). The default NESIL was used instead of the reactive DST because there were two historical LLNAs on this material. There is good correlation between LLNA EC3 and human potency (Gerberick et al., 2001, 2004; Griem et al., 2003; Schneider and Akkan, 2004).
Analysis based on read-across
Data available on propyl alcohol (CAS # 71-23-8) were not sufficient to determine hazard. Even though this material was negative in a GPMT and Buehler test, the data were deemed insufficient due to the limited number of test animals in the GPMT and the unreported number of animals in the Buehler study. The read-across analog butyl alcohol (CAS # 71-36-3) had sufficient data to confirm that propyl alcohol is a non-sensitizer. cis-3-Nonen-1-ol (CAS # 10340-23-5) had no historical data but was determined to be a non-sensitizer based on read-across to cis-3-hexenol (CAS # 928-96-1), which had sufficient data.

DST exposure-based waiving
2-Decanone (CAS # 693-54-9) had no skin sensitization data available, while 4-hexen-1-ol, 5-methyl-2-(1-methylethenyl)- (CAS # 58461-27-1) and (Z)-2-penten-1-ol (CAS # 1576-95-0) had insufficient data to determine hazard. These materials were determined to be non-reactive based on in silico structural analysis and assessment by the Expert Panel for Fragrance Safety, but no appropriate read-across analog with sufficient data was found. The exposure to these materials falls below the non-reactive DST of 900 μg/cm², so they were deemed safe under the current declared levels of use.
2-Furanmethanethiol formate (CAS # 59020-90-5) and furfuryl thioacetate (CAS # 13678-68-7) had no skin sensitization studies available, but they were determined to be reactive based on structural analysis. 1-Octen-3-ol (CAS # 3391-86-4) was positive in 2 out of 3 in vitro assays, while p-tolyl acetate (CAS # 140-39-6) was found to be sensitizing in the HMT. Since no appropriate read-across was found for these materials, and their exposure was below the reactive DST of 64 μg/cm², they were concluded to be safe under the current declared levels of use.

Testing strategies
The primary alcohol cluster (Table 2) was found to be sensitizing based on positive guinea pig tests on several chemicals in the cluster and a positive LLNA. However, none of the materials in the cluster had sufficient data to set a NESIL, and no appropriate read-across analogs were available. Two chemicals in the cluster (heptyl alcohol and 1-decanol) were selected for testing in the DPRA to determine reactivity in order to choose a representative chemical for further testing in the CNIH. Both heptyl alcohol and 1-decanol had minimal reactivity in the DPRA with mean cysteine and lysine depletion of 0.80% and 0.38%, respectively. The DPRA results provided supporting evidence that the chain length of the alcohol did not impact the reactivity. Heptyl alcohol was selected for further testing in the CNIH, and a NESIL of 9400 μg/cm² was selected. The other 9 chemicals in the cluster were read across to heptyl alcohol. Thus, testing only 1 chemical cleared 9 structurally related chemicals by read-across.
A similar case to the primary alcohol cluster is the cinnamyl ester cluster (Table 2). In this cluster, cinnamyl acetate was initially presumed to be a sensitizer because a structurally similar alcohol, cinnamyl alcohol (CAS # 104-54-1), is a sensitizer. However, while in vitro data suggested that cinnamyl alcohol is reactive, cinnamyl acetate is not. Therefore, based on structural evaluation, cinnamyl esters were separated from the alcohols, and cinnamyl acetate was selected as the read-across for its cluster. Cinnamyl acetate was determined to be a non-sensitizer based on 2 out of 3 in vitro studies. This conclusion was supported by a CNIH study conducted at 3424 μg/cm² according to RIFM's standard protocol. Because cinnamyl acetate serves as the read-across for all the chemicals in the cluster, no further testing was required for the cluster.
6-Methyl-3,5-heptadien-2-one (CAS # 1604-28-0) was predicted to be a sensitizer based on positive results in the in vitro KeratinoSens and h-CLAT assays, with supporting evidence from in silico protein binding alerts. A CNIH was conducted on 6-methyl-3,5-heptadien-2-one at 1299 μg/cm², a dose estimated from the negative LLNA dose of 5% (1250 μg/cm²), but sensitization reactions were observed in 3/110 subjects. The CNIH was repeated at a lower dose of 118 μg/cm² (approximately 10 times lower than the positive dose) in 105 subjects, resulting in no reactions indicative of sensitization. The NESIL was therefore set at 110 μg/cm² based on this CNIH conducted according to the RIFM protocol. This case emphasizes the importance of confirmatory human studies for setting the NESIL and of using predictive tools and read-across to determine whether the LLNA is truly predictive of a NOEL in humans.
Some exceptions have been noted where the LLNA EC3 does not correlate well with the human NOEL. Benzaldehyde (CAS # 100-52-7) was not found to be sensitizing when tested up to 25% (6250 μg/cm²) in the LLNA. However, in addition to in vitro tests and historical animal tests demonstrating that it is a sensitizer, benzaldehyde was found to be sensitizing in 6 out of 88 subjects in a CNIH conducted at 5905 μg/cm². The NESIL for benzaldehyde was therefore set at 590 μg/cm² after confirmation in a CNIH study. We hypothesize that this disparity may be explained by differences in how the test chemical is applied. In the LLNA, the test chemical is applied openly to the ear, whereas CNIH studies use closed patches; therefore, the test chemical may volatilize in the LLNA but not in the CNIH. A similar example is hexen-2-al (CAS # 6728-26-3), which had an average EC3 of 4.05% (1012 μg/cm²) from 2 separate LLNAs but was found to be sensitizing in a CNIH at 236 μg/cm². A NESIL of 18 μg/cm² was determined, which is much lower than the EC3. In contrast to benzaldehyde and hexen-2-al, hexyl salicylate was an extreme sensitizer in the LLNA, with an EC3 of 0.18% (45 μg/cm²), but was not sensitizing in a CNIH study when tested at 30% (35,433 μg/cm²). The same pattern holds for other salicylates in the RIFM inventory.
These case studies emphasize the importance of the weight of evidence approach, in which all available data must be considered to determine the skin sensitization hazard and potency of a given chemical. They also show why confirmatory testing in humans is critical to ensuring the safety of fragranced products.
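The per-area doses quoted next to the LLNA concentrations in these case studies all follow a single linear conversion (5% gives 1250 μg/cm², 25% gives 6250 μg/cm², 0.18% gives 45 μg/cm²), that is, 250 μg/cm² per 1%. The Python sketch below reproduces that conversion and sets each LLNA value against the CNIH dose reported for the same material. The factor of 250 is inferred from the figures in the text rather than stated by it, and the function names, variable names, and table layout are illustrative assumptions only.

```python
# Conversion factor inferred from the LLNA values quoted in the text
# (e.g. 5% -> 1250 ug/cm2). It is an assumption, not a constant cited here.
LLNA_UG_PER_CM2_PER_PERCENT = 250.0


def llna_percent_to_dose(percent: float) -> float:
    """Convert an LLNA test concentration (%) to a dose per unit area (ug/cm2)."""
    if percent < 0:
        raise ValueError("concentration must be non-negative")
    return percent * LLNA_UG_PER_CM2_PER_PERCENT


# (material, LLNA concentration in %, meaning of that LLNA value,
#  CNIH dose in ug/cm2, CNIH outcome at that dose), all taken from the text.
CASES = [
    ("6-methyl-3,5-heptadien-2-one", 5.0, "negative dose", 1299.0, "positive"),
    ("benzaldehyde", 25.0, "highest negative dose", 5905.0, "positive"),
    ("hexen-2-al", 4.05, "EC3", 236.0, "positive"),
    ("hexyl salicylate", 0.18, "EC3", 35433.0, "negative"),
]


def compare_cases(cases=CASES):
    """Return (material, LLNA dose, CNIH dose, LLNA/CNIH ratio) for each case."""
    rows = []
    for name, percent, _llna_meaning, cnih_dose, _cnih_outcome in cases:
        llna_dose = llna_percent_to_dose(percent)
        rows.append((name, llna_dose, cnih_dose, llna_dose / cnih_dose))
    return rows


if __name__ == "__main__":
    for name, llna_dose, cnih_dose, ratio in compare_cases():
        print(f"{name}: LLNA {llna_dose:.1f} vs CNIH {cnih_dose:.1f} ug/cm2 "
              f"(LLNA/CNIH ratio {ratio:.3f})")
```

For hexen-2-al the ratio is well above 1, meaning the LLNA EC3 lies several-fold above a dose that still sensitized human subjects, whereas for hexyl salicylate the ratio is far below 1, meaning the EC3 lies far below a dose that produced no sensitization in the CNIH. The conversion yields 1012.5 μg/cm² for 4.05%, which the text reports as 1012 μg/cm².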

Discussion
The NESIL is the point of departure (PoD) for the dermal sensitization QRA. Deriving this PoD is an iterative process that involves assessing the quality of all available data (historical human and animal [in vivo], in vitro, and in silico) on a fragrance ingredient and then applying read-across, exposure waiving based on the DST, and/or integrated testing strategies, as needed, to determine a WoE NESIL. All fragrance ingredients in RIFM's inventory are evaluated on a five-year rotating basis to ensure that previously made conclusions still hold based on newly available exposure data. Revised safety assessments are published if new relevant data become available. We have outlined how RIFM currently evaluates fragrance ingredients through a stepwise process to derive a NESIL for QRA. This process is dynamic and may change as we actively seek to incorporate NAMs and tools into our safety assessment process.
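The order in which these options are considered in this paper (data on the ingredient itself, then read-across, then the DST, then integrated testing) can be pictured as a simple fallback chain. The Python sketch below is a deliberately simplified illustration and not RIFM's actual procedure: the function name, the boolean inputs, and the route labels are all hypothetical, and in practice each step rests on expert judgment and may be revisited as new data arrive.

```python
from enum import Enum


class Route(Enum):
    """Hypothetical labels for the ways a point of departure may be reached."""
    TARGET_DATA = "WoE NESIL from data on the ingredient itself"
    READ_ACROSS = "NESIL from a structurally similar read-across material"
    DST = "exposure waiving based on the DST"
    ITS = "integrated testing strategy to generate new data"


def select_route(sufficient_target_data: bool,
                 read_across_available: bool,
                 dst_applicable: bool) -> Route:
    """Pick the first applicable option in the fallback order described above.

    Each flag stands in for an expert judgment that this sketch does not model.
    """
    if sufficient_target_data:
        return Route.TARGET_DATA
    if read_across_available:
        return Route.READ_ACROSS
    if dst_applicable:
        return Route.DST
    # No usable data, no read-across, and the DST cannot be applied.
    return Route.ITS


if __name__ == "__main__":
    # An ingredient with no data of its own but a suitable read-across material.
    print(select_route(False, True, True).value)
```

Encoding the routes as an enumeration keeps the labels in one place; the ordering of the `if` statements is the only logic, so it is easy to adjust if the process changes.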
The RIFM Database and other publicly available channels, such as the European Chemicals Agency (ECHA), provide a wealth of historical data upon which the safety assessment of fragrance ingredients is built. Exposure data are primarily obtained from the Creme-RIFM Aggregate Exposure Model (Comiskey et al., 2017), which incorporates survey data on the use of fragrance ingredients from the fragrance industry. When limited or no skin sensitization data are available on an ingredient, the gap cannot be bridged by read-across, and the DST approach cannot be applied, integrated testing strategies (ITS) can be used to determine hazard. There has been significant progress in incorporating new in vitro data into the risk assessment process, as demonstrated by OECD test guideline 497 (OECD, 2021). However, challenges still exist in determining human potency using in vitro methods in order to confirm a NESIL to be used for QRA purposes. Some in vitro assays, such as SENS-IS (Cottrez et al., 2015), provide valuable insight into the potency of a sensitizer and could potentially be used to set a default NESIL, but using NAMs to determine potency is an active area of research, and more work is still needed. NAMs based on the integration of in silico and in vitro data are under development and will be useful for deriving a NESIL without animal testing (Natsch et al., 2018). In the future, RIFM will integrate these methods into the safety assessment program for new fragrance ingredients or those with limited historical data.
Derivation of a NESIL or QRA PoD for the large group of fragrance ingredients that fall into the Natural Complex Substance (NCS) class will primarily depend on the development of NAMs. NCS, such as essential oils, are complex mixtures of chemicals with varying functionalities and varying ability to induce skin sensitization. This complexity makes it challenging to develop NAMs specifically appropriate for NCS. Additionally, NAMs that may be useful for NESIL derivation are being developed largely based on the analysis of discrete chemical substances. Currently, RIFM primarily employs component-based analysis to assess the sensitization potential of NCS, but future NAMs may permit NESIL derivation without animal testing for this class of fragrance ingredients.