Critical review and probabilistic health hazard assessment of cleaning product ingredients in all-purpose cleaners, dish care products, and laundry care products

Though numerous chemical ingredients are used in cleaning products, empirical mammalian toxicology information is often limited for many substances. Such limited data inherently present challenges to environmental health practitioners performing hazard and risk assessments. Probabilistic hazard assessment using chemical toxicity distributions (CTDs) is an alternative approach for assessments of chemicals when toxicity information is lacking. The CTD concept allows for derivation of thresholds of toxicological concern (TTCs) to predict adverse effect thresholds for mammalian species. Unfortunately, comparative health hazard assessment of cleaning product ingredients in common use categories such as all-purpose cleaners (APC), dish care products (DCP) and laundry care products (LCP) has not been well studied. However, APC, DCP, and LCP are used routinely for household and industrial applications, resulting in residential and industrial occupational exposures. Therefore, we reviewed and then examined hazard information (median lethal dose (LD50), lowest-observed-adverse-effect level (LOAEL), and no-observed-adverse-effect level (NOAEL)) from different types of standard mammalian toxicity studies for oral toxicity in the rat model from the unique Cleaning Product Ingredient Safety Initiative mammalian toxicology database. Probabilistic distributions (CTDs) were subsequently constructed using LD50, NOAEL and LOAEL data from a specific toxicity study type for all available ingredients in these three use categories. Based on data availability, product type-specific and chemical category-specific CTDs were also generated and compared. For each CTD, threshold concentrations (TCs) and their 95% confidence intervals (95% CIs) at 1st, 5th, 10th, 50th, 90th, 95th and 99th percentiles were calculated using the log-normal model.
To test whether the common default uncertainty factor (UF) approach (e.g., 3, 10) in mammalian health risk assessment provides sufficient protection, UFs were also derived for LOAEL-to-NOAEL and exposure duration (e.g., subchronic-to-chronic) extrapolations. Relationships between CTDs of acute LD50s and sublethal LOAELs/NOAELs were also examined for acute-to-chronic ratio calculations, which may be useful in extreme circumstances. Results from our critical review and meta-analysis appear particularly useful for hazard and risk practitioners when identifying TTCs for ingredients in product use categories, and other chemical classes. This approach can also support development of regulatory data dossiers through read across, chemical substitutions and screening-level health risk assessments when limited or no empirical toxicity information exists for industrial chemicals.

Given the large number of products and possible associated consumer exposure scenarios, a priority setting process (e.g., screening-level risk assessment) is needed to identify consumer products and use scenarios for which more detailed hazard evaluation and/or exposure assessment may be needed to adequately characterize consumer risks and then to identify those substances that represent lower or higher concern to public health and the environment (ACI, 2016).
Current hazard assessment approaches tend to rely on the deterministic hazard quotient (HQ) approach and occasionally on probabilistic hazard assessment (PHA) approaches. The HQ approach often uses a worst-case scenario and point estimate, and is generally considered to provide (overly) conservative protection and hence is useful for lower tiers of hazard or risk assessment (Suter II, 1995). As part of the Cleaning Product Ingredient Safety Initiative (CPISI; http://www.cleaninginstitute.org/CPISI/), a critical review of product inventory and associated mammalian toxicology data was performed, and then a screening-level risk assessment based on the deterministic HQ approach was conducted to characterize the mammalian and human health safety for each ingredient of concern (588 in total). It is important to note that ingredients of concern examined by DeLeo et al. (2018) have a wide range of uses; for example, 588 ingredients are relevant to three main use categories including all-purpose cleaners (APC), dish care products (DCP) and laundry care products (LCP). Unfortunately, human health hazard assessment of all cleaning product ingredients on a product use category basis (APC, DCP, or LCP) has remained elusive. Moreover, extensive safety data exist for some of these ingredients, yet for others only limited data are currently available.
To support health risk assessment of these exposures when there are insufficient chemical-specific data, and to prioritize those most likely to present health risks, other approaches such as threshold of toxicological concern (TTC) or read across need to be applied to estimate potential human health impact and to make informed risk management decisions. The TTC approach can benefit from PHA tools establishing human TTCs for chemicals below which there apparently is no significant risk to human health (US FDA, 1995; JECFA, 2006). In the absence of chemical-specific data, TTCs can be used to assess the likelihood that a particular level of exposure to a chemical will have no toxic effects, relying on available toxicity data for a wide range of chemicals (Kroes et al., 2004). The TTC approach has been favorably reviewed for regulatory use in PHA of food contact materials and flavoring substances (see Table 1 for oral TTC values in current global regulatory uses).
A TTC-like approach was initially proposed under the United States Federal Food, Drug and Cosmetic Act (US FD&C Act, 1958) for chemicals in food contact materials (and their components) in the United States, in combination with the development of more sensitive and discriminating analytical methods to handle putative toxicological risks of low exposures. Frawley (1967) first presented an analysis to establish a generic threshold value (threshold of regulation; ToR) or range of values with the aim of reducing extensive toxicity studies and safety evaluations, and of addressing, within the available capacity, those substances for which the potential or actual intake is substantial. Rulis (1986) conducted a similar probabilistic assessment of carcinogenicity data (TD50 values) for 130 compounds from a carcinogenic potency database (CPDB) developed by Gold et al. (1984) and LD50s of 159 compounds (rats, oral; subchronic or chronic toxicity) contained in the Registry of Toxic Effects of Chemical Substances (RTECS) database, with a linear extrapolation to a 10⁻⁶ risk; a general TTC value of 0.5 ppb (equivalent to 1.5 μg/person/day or 0.025 μg/kg bw/day) was derived (Rulis, 1986; Flamm et al., 1987; Rulis, 1989). An estimated TTC value of 1.5 μg/person/day was then implemented by the United States Food and Drug Administration (US FDA, 1995) as the ToR for food contact materials. The TTC concept (TTC-based limits) has also been adopted by the European Medicines Agency (EMA, 2006) and US FDA (2008) for genotoxic impurities in pharmaceutical products and by EMA (2008) for genotoxic constituents in herbal substances/preparations, although the limits vary with the potency of carcinogens (i.e., direct or indirect mechanisms of DNA damage).
For non-carcinogenic endpoints, Munro et al. (1996) evaluated the implication of TTC using structural information based on a decision tree approach developed by Cramer et al. (1978). Human TTCs of 1800, 540 and 90 μg/person/day were proposed for Cramer class I, II, and III, respectively, using the 5th percentile of the distributions and a safety factor of 100. These distributions were developed using non-carcinogenic data of no-observed-effect levels (NOELs) for chronic, subchronic and reproductive toxicity for 613 organic substances. The European Food Safety Authority (EFSA) adopted those values for the regulation of flavoring substances used in food (EC, 2000). The Joint FAO/WHO Expert Committee on Food Additives (JECFA) also adopted the TTC principle in its evaluations of flavoring substances (JECFA, 1995), combining the general TTC values of 90, 540 and 1800 μg/person/day for non-carcinogenic responses with the TTC of 1.5 μg/person/day based on carcinogenic information.
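The arithmetic behind these class-specific TTCs can be sketched as follows. This is a back-calculation for illustration only: the 60 kg body weight is an assumption inferred from the equivalence of 1.5 μg/person/day and 0.025 μg/kg bw/day quoted earlier, and the function name is hypothetical.

```python
# Back-calculate the 5th percentile NOEL implied by each human TTC.
# ASSUMPTION: a 60 kg adult body weight, inferred from
# 1.5 ug/person/day == 0.025 ug/kg bw/day (1.5 / 0.025 = 60).
BODY_WEIGHT_KG = 60
SAFETY_FACTOR = 100  # safety factor stated for the Cramer class TTCs

def implied_noel_p5_mg_per_kg(ttc_ug_per_person_day):
    """Return the implied 5th percentile NOEL in mg/kg bw/day."""
    ttc_ug_per_kg = ttc_ug_per_person_day / BODY_WEIGHT_KG
    return ttc_ug_per_kg * SAFETY_FACTOR / 1000.0  # ug -> mg

# Human TTCs (ug/person/day) for Cramer classes I, II and III
cramer_ttc = {"I": 1800, "II": 540, "III": 90}
implied = {cls: implied_noel_p5_mg_per_kg(v) for cls, v in cramer_ttc.items()}

for cls, noel in implied.items():
    print(f"Cramer class {cls}: implied 5th percentile NOEL = {noel:g} mg/kg bw/day")
```

Under these assumptions, a lower TTC simply reflects a lower 5th percentile of the underlying NOEL distribution for that structural class.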
This TTC concept has also been commonly used for safety evaluation of chemicals, such as substances present in foods (Kroes et al., 2004), flavoring agents (Renwick, 2004), natural flavor complexes (essential oils) (Smith et al., 2004; Smith et al., 2005), consumer products (personal and household care products) (Blackburn et al., 2005), cosmetics (Kroes et al., 2007), metabolites of plant protection products (pesticides) (Brown et al., 2009), and fragrance ingredients (Api et al., 2015). It has further been considered for exposure-based waiving of toxicity tests under REACH (Bernauer et al., 2008; Rowbotham and Gibson, 2011). Our recent study identified unique TTCs for cleaning product ingredients from different common chemical classes using rodent (oral) hazard data; however, health hazard assessment of cleaning product ingredients in different use categories such as APC, DCP and LCP has not been well studied. Therefore, it may be useful to identify TTCs for cleaning product ingredients in a specific use category (APC, DCP or LCP) by employing a probabilistic approach such as chemical toxicity distributions (CTDs) commonly used in PHA.
Another consideration is that, since higher volume ingredients are potentially more data rich, it may be possible to build read across or quantitative structure-activity relationship (QSAR) arguments supporting one or more of the toxicological endpoints for a lower volume chemical by using data from the higher volume one (Api et al., 2015). In principle, read across utilizes common endpoint information, including physicochemical properties and toxicity for chemicals, to make a prediction on the same endpoint for another chemical. This process can help to avoid generating specific test data on every substance for every endpoint. To address the potential uncertainties associated with read across, uncertainty factors (UFs; often synonymous with assessment, adjustment or safety factors) are commonly applied to be more conservative (e.g., default UF approach). In cases where only a LOAEL is available, and a benchmark dose (BMD) approach is not applicable or available, selecting an appropriate UF is important to the practice of performing screening-level health risk assessments, including for cleaning product ingredients. However, whether default UFs (e.g., 10 or 100) are sufficient for various chemical uses remains understudied, particularly for cleaning product ingredients in different product use categories.
By leveraging a unique mammalian toxicology database, the objectives of the present study were: 1) to review and identify threshold concentrations (TCs) for all available ingredients in each common cleaning product use category (APC, DCP, LCP) and specific product types/chemical categories using the CTD approach; 2) to compare relative sensitivities of ingredients in different product types or common categories within a given use category or among the three use categories; and 3) to derive UFs, including acute-to-chronic ratios (ACRs), for each product use category by using two probability distribution approaches (i.e., CTD comparisons and individual UF probability distributions), accompanied by a Monte Carlo simulation approach. This effort thus extended our recent report of novel TTCs and UFs for cleaning products to examine specific categories of common product uses.

Table 1. Summary of threshold of toxicological concern (TTC) values (oral) and approaches as pragmatic risk assessment or prioritization tools, identified in areas of chemical hazard and risk assessment within a regulatory context.
Table 1 footnote (truncated): on EU criteria for the classification of the substance for repeated dose toxicity (R48 "danger of serious damage to health by prolonged exposure") after adjustment with an assessment factor.
Table 1 footnote m: Subchronic, chronic, reproductive and developmental toxicity studies for rats and rabbits on 613 chemicals with a wide range of structures and uses; potentially sensitive endpoints: immunotoxicity, developmental toxicity, neurotoxicity and developmental neurotoxicity, endocrine active compounds, and allergenicity.

Z. Wang et al. Environment International 125 (2019) 399-417

Data mining
Health hazard information for the rat model (oral; mg/kg bw/day), including median lethal dose (LD50), lowest-observed-adverse-effect level (LOAEL), and no-observed-adverse-effect level (NOAEL) values following a defined standard mammalian toxicology study type (Test Guideline), was collected and reviewed from the CPISI database (http://www.cleaninginstitute.org/CPISI/). These test types include: Repeated Dose 28-Day Oral Toxicity Study in Rodents (e.g., OECD Test Guideline ( Because some ingredients of cleaning products in a product use category contained little to no sublethal toxicology information (e.g., chronic LOAEL/NOAELs), acute LD50s (Acute Oral Toxicity; OECD TG 401, 420, 423 and 425) were still examined in the current study. Information for each ingredient corresponding to its use category (APC, DCP, or LCP), product type and chemical category was also collected.

Chemical toxicity distributions constructions and threshold concentrations calculations
Probabilistic hazard assessment (PHA) using the CTD approach was conducted following Wang et al. (2018). Briefly, outliers detected by Grubbs' test (Grubbs, 1969) or the Tietjen-Moore test (Tietjen and Moore, 1972) were excluded from each dataset. A geometric mean was used when there were multiple data from a common study type for an ingredient. To increase the robustness of our analyses, the dataset for a CTD had to contain a minimum of 5 data points (ingredients). Hazard data were ranked in ascending order, and percentiles were assigned using the Weibull formula (Eq. (1)):

$$ j = \frac{i}{n + 1} \times 100 \qquad (1) $$

where j is the assigned percentile, i is the rank of the datum in ascending order and n is the total number of data points. Normality of residuals of each dataset (log-transformed) was checked by the Shapiro-Francia test (Shapiro and Francia, 1972), and goodness of fit at the lower tail of each distribution was assessed by the Anderson-Darling test (Anderson and Darling, 1954). Each CTD was then fitted by the log-normal model (SigmaPlot, version 13.0, San Jose, CA, USA), and the TC values and 95% confidence intervals (95% CIs) were determined by the log-normal regression function and Monte Carlo simulation (resampled 5000 times; SAS, version 9.4, Cary, NC, USA). When there were sufficient data points (≥5) for a product type or chemical category in a dataset (use category), product type- or chemical category-specific CTDs were also constructed and compared.
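The ranking and fitting steps can be illustrated with a minimal sketch. It assumes that the log-normal model is fitted as a least-squares line of probit-transformed Weibull percentiles against log10 hazard values; the study itself used SigmaPlot and SAS, so this is not a reproduction of that workflow. The LOAEL values and all function names are hypothetical.

```python
import math
from statistics import NormalDist

def weibull_percentiles(n):
    """Plotting positions (in %) for ranks 1..n: 100 * i / (n + 1)."""
    return [100.0 * i / (n + 1) for i in range(1, n + 1)]

def fit_ctd(values):
    """Least-squares fit of probit(percentile) = a + b * log10(value).

    ASSUMPTION: a probit-scale linear regression stands in for the
    log-normal model fit described in the text.
    """
    xs = [math.log10(v) for v in sorted(values)]
    ys = [NormalDist().inv_cdf(p / 100.0) for p in weibull_percentiles(len(xs))]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b

def threshold_concentration(a, b, percentile):
    """Invert the fitted line to get the TC at a given percentile."""
    z = NormalDist().inv_cdf(percentile / 100.0)
    return 10 ** ((z - a) / b)

# Hypothetical LOAELs (mg/kg bw/day) for nine ingredients, illustration only
loaels = [5, 12, 30, 50, 100, 150, 300, 500, 1000]
a, b = fit_ctd(loaels)
tc5 = threshold_concentration(a, b, 5)
tc50 = threshold_concentration(a, b, 50)
print(f"TC5 = {tc5:.2f}, TC50 = {tc50:.1f} mg/kg bw/day")
```

With n = 9 the Weibull formula assigns percentiles of 10%, 20%, ..., 90%, so no datum is placed at 0% or 100% and the probit transform stays finite.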
To be consistent, the relative chemical sensitivities between two CTDs were compared based on the calculated TC5s and their 95% CIs; if the 95% CIs of the two TC5s did not overlap, then the two CTDs were considered significantly different (Sokal and Rohlf, 1995). One-way analysis of variance (ANOVA) and Tukey's post hoc tests were performed to compare the relative sensitivities among different use categories, or among different product types or chemical categories (≥3 CTDs) within a use category on the basis of TC5 values and their 95% CIs (significance level α = 0.05; GraphPad Prism™, version 5.00, San Diego, CA, USA).
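The pairwise non-overlap rule can be written as a short check. The sketch below applies it to the acute oral TC5 confidence intervals reported in the Results for the three use categories; note that the study compared three or more CTDs by ANOVA, so using the pairwise rule on these values is for illustration only, and the function name is hypothetical.

```python
def intervals_overlap(ci_a, ci_b):
    """True if two closed intervals (lower, upper) share any value."""
    lo_a, hi_a = ci_a
    lo_b, hi_b = ci_b
    return lo_a <= hi_b and lo_b <= hi_a

# 95% CIs of acute oral TC5s (mg/kg bw/day) as reported in the Results
tc5_ci = {
    "APC": (406, 479),
    "DCP": (529, 682),
    "LCP": (491, 571),
}

for first, second in [("APC", "DCP"), ("APC", "LCP"), ("DCP", "LCP")]:
    differ = not intervals_overlap(tc5_ci[first], tc5_ci[second])
    print(f"{first} vs {second}: significantly different = {differ}")
```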

Uncertainty factors derivations
Uncertainty factors were identified using both CTD comparisons and individual UF probabilities proposed by Wang et al. (2018). First, the UFs for a given use category were derived from corresponding TC ratios of the pairwise datasets using all and similar ingredients (hereafter, "all" indicates all hazard data were used for pairwise CTD construction, while "similar" indicates both datasets contain common ingredients). Relative TC ratios and their 95% CIs were computed by the Monte Carlo simulation approach described above. One-way analysis of covariance (ANCOVA) was also conducted to compare the slope and/or intercept parameters of the two log-normal fitted CTDs (SPSS, version 23, Chicago, IL, USA). To further quantify the difference between the two CTDs, the TC5 ratio and its 95% CI were also calculated for each comparison. If the 95% CI did not overlap with unity, then the ratio was considered significantly different from 1 (Sokal and Rohlf, 1995). Second, UFs were calculated separately for each individual ingredient within a use category from the respective pairwise datasets consisting of similar ingredients. All calculated individual UF values were then ranked and assigned percentiles following the Weibull formula (Eq. (1)) for a probability distribution. Afterwards, overall UFs (95% CIs) covering 1%, 5%, 10%, 50%, 90%, 95% and 99% of all ingredients from each log-normal fitted distribution were computed by an inverse prediction method.
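The second (individual UF) approach can be sketched as below. This is an illustration under stated assumptions: the paired LOAEL and NOAEL values are hypothetical, the log-normal fit is approximated by a probit-scale least-squares line, and the helper names are not from the study.

```python
import math
from statistics import NormalDist

def fit_lognormal_line(sorted_values):
    """Fit probit(Weibull percentile) = a + b * log10(value) by least squares."""
    n = len(sorted_values)
    xs = [math.log10(v) for v in sorted_values]
    ys = [NormalDist().inv_cdf(i / (n + 1)) for i in range(1, n + 1)]
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    return y_bar - b * x_bar, b

def uf_at_percent(a, b, percent):
    """Inverse prediction: UF covering the given % of ingredients."""
    return 10 ** ((NormalDist().inv_cdf(percent / 100.0) - a) / b)

def percent_covered(a, b, default_uf):
    """% of ingredients whose individual UF is at or below default_uf."""
    return 100.0 * NormalDist().cdf(a + b * math.log10(default_uf))

# Hypothetical (LOAEL, NOAEL) pairs in mg/kg bw/day, one pair per ingredient
pairs = [(30, 10), (100, 50), (50, 10), (300, 100), (15, 5), (200, 40), (1000, 100)]
ufs = sorted(loael / noael for loael, noael in pairs)  # individual UFs

a, b = fit_lognormal_line(ufs)
uf50 = uf_at_percent(a, b, 50)
uf95 = uf_at_percent(a, b, 95)
coverage_10 = percent_covered(a, b, 10)
print(f"UF50 = {uf50:.1f}, UF95 = {uf95:.1f}, default UF of 10 covers {coverage_10:.0f}%")
```

The same fitted line serves both directions: inverse prediction gives the UF needed to cover a chosen share of ingredients, and forward prediction gives the share covered by a default UF such as 10 or 100.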
Concurrent meta-analyses were conducted, respectively, using LOAELs and corresponding NOAELs for pairwise CTD comparisons and UF derivation, especially for cases with little or no LOAEL data. Relationships between CTDs using acute LD50s and sublethal LOAELs/NOAELs were also examined, and corresponding extrapolation factors (ACRs) were derived. Whether a default UF of 10 or 100 would be protective for various distributions of cleaning product ingredients within a given use category was also evaluated. This study was thus not intended to examine UFs associated with extrapolation from animals to humans or variation of sensitivities within or among human populations.

Chemical toxicity distributions and threshold concentrations for three common use categories
Among all datasets used for CTD construction, a few outliers were identified and removed from our analysis (see Table S1 in Supplementary information for datasets). Using sufficient toxicity data (n ≥ 5) for rats (oral), CTDs using LD50s, LOAELs (Fig. 1), and NOAELs (Fig. S1) were generated for each toxicity type. Calculated TC values and their 95% CIs at different percentiles are listed in Table 2 (for LD50s and LOAELs) and Table S2 (for NOAELs). These derived TCs and their 95% CIs (e.g., TC1 or TC5) can serve as TTCs for particular datasets and thus can be useful during future hazard and risk assessment for ingredients of cleaning products in different use categories.
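How a Monte Carlo confidence interval for a TC5 might be obtained is sketched below. The exact simulation scheme used in SAS is not described in the text, so this is an assumption: a parametric simulation that draws log10 values from the fitted normal distribution, re-estimates the TC5 each time, and takes the 2.5th and 97.5th percentiles of 5000 simulated TC5s. The input data and function names are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, stdev

Z05 = NormalDist().inv_cdf(0.05)  # standard normal 5th percentile

def tc5_from_logs(logs):
    """TC5 of a log-normal distribution estimated from log10 values."""
    return 10 ** (mean(logs) + Z05 * stdev(logs))

def monte_carlo_tc5(values, n_sim=5000, seed=1):
    """Point estimate and 95% percentile interval for the TC5.

    ASSUMPTION: parametric resampling from the fitted log-normal model;
    the study's own resampling scheme may differ.
    """
    rng = random.Random(seed)
    logs = [math.log10(v) for v in values]
    mu, sigma = mean(logs), stdev(logs)
    sims = sorted(
        tc5_from_logs([rng.gauss(mu, sigma) for _ in logs])
        for _ in range(n_sim)
    )
    lower = sims[int(0.025 * n_sim)]
    upper = sims[int(0.975 * n_sim) - 1]
    return tc5_from_logs(logs), lower, upper

# Hypothetical LOAELs (mg/kg bw/day), illustration only
loaels = [5, 12, 30, 50, 100, 150, 300, 500, 1000]
point, lower, upper = monte_carlo_tc5(loaels)
print(f"TC5 = {point:.1f} ({lower:.1f}, {upper:.1f}) mg/kg bw/day")
```

With few data points the simulated interval is wide at the lower tail, which is consistent with the broad 95% CIs reported for TC1 and TC5 in small datasets.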
Among the three use categories (Table 2), the rat model tended to be more acutely sensitive (Acute Oral Toxicity) to APC (lower TC5: 442 (406, 479) mg/kg bw/day) compared to DCP and LCP (602 (529, 682) and 530 (491, 571) mg/kg bw/day, respectively), whereas rats were less sensitive to APC (higher TC5: 6.7 (4.4, 9.0) mg/kg bw/day) than to DCP and LCP (2.6 (1.1, 4.9) and 1.7 (0.7, 3.5) mg/kg bw/day, respectively) when considering Repeated Dose 90-Day Oral Toxicity. Nonetheless, there were no significant differences in TC5s among the three use categories for other types of subacute and subchronic studies (ANOVA; p > 0.05). For example, TC5s of 17 (9.0, 29), 9.3 (3.0, 20) and 14 (7.6, 25) mg/kg bw/day for APC, DCP and LCP, and of 1.6 (0.3, 5.0) and 12 (0.3, 60) mg/kg bw/day for APC and LCP, were estimated when considering LOAEL values generated from Prenatal Development Toxicity Study and Reproduction and Fertility Effects studies, respectively. There was only one chronic CTD generated for LCP using LOAEL data from Chronic Toxicity Studies (Fig. 1f), and a chronic TC5 of 3.9 (0.6, 14) mg/kg bw/day was derived accordingly. Comparatively, there were more CTDs generated using NOAEL data from additional sublethal toxicology study types such as two reproductive/developmental screening tests, and three chronic toxicity tests (except for Chronic Toxicity Studies for LCP) (Fig. S1). Corresponding TC values were also estimated and listed in Table S2. For example, APC were found to be less toxic to rats (oral) than LCP when considering NOAELs of Chronic Toxicity Studies (DCP shared similar sensitivity to APC and LCP, respectively), while APC were the most toxic category to rats when considering NOAELs derived from Carcinogenicity Studies (that is, we did not examine carcinogenicity endpoints, but analyzed NOAELs for other responses reported from these studies), followed by LCP and DCP.

Product type-and chemical category-specific chemical toxicity distributions
Product type-specific CTDs and chemical category-specific CTDs (Figs. 2 and 3 using LD50s and LOAELs, and Figs. S2 and S3 using NOAELs) were developed and can be viewed as sub-distributions within most of the general datasets for a given type of toxicity test. Significant differences between or among product type-specific and chemical category-specific CTDs were also detected by checking whether the 95% CIs of their TC5s overlapped (n = 2) or by conducting one-way ANOVA and Tukey's post hoc test (n ≥ 3; Table S3 using LD50s and LOAELs, and Table S4 using NOAELs).
Using APC as an example, cleaning product ingredients used in aerosol spray, hand wash gel (diluted) and hand wash gel (undiluted) products were more acutely toxic to rodents compared to the general CTD consisting of all acute data (one-way ANOVA: p < 0.05, Table S3; left shift at the lower tails of the CTDs, Fig. 2a for APC), while the ingredients used in hand wash powder were the least acutely toxic. Significant differences in sensitivities among product type-specific CTDs were also found using LOAELs generated from Repeated Dose 90-Day Oral Toxicity in Rodents, and aerosol spray and hand wash wipe were the two most toxic types of cleaning products to rats compared to other product type-specific and global CTDs (Fig. 2c for APC, and Table S3).
Due to limited toxicity data, sub-distributions were only generated for 4 specific chemical categories (aliphatic acids and salts, chelants, ethers, inorganic acids and salts) within APC, 1 (inorganic acids and salts) within the DCP product use category, and 3 (aliphatic acids and salts, chelants, inorganic acids and salts) for cleaning products used in LCP considering Acute Oral Toxicity (LD50s; Fig. 3). Similarly, rodents (oral) tended to be less acutely sensitive to aliphatic acids and salts in APC, inorganic acids and salts in DCP, and chelants in LCP, respectively, compared to the CTD generated using all LD50s and to other chemical categories (Table S3). Ethers- and inorganic acids and salts-specific CTDs were also generated using subchronic NOAELs (Fig. S3), and rats (oral) were less subchronically sensitive (Prenatal Development Toxicity Study) to ingredients of inorganic acids and salts in DCP than the global CTD, while more subchronically sensitive to ingredients of inorganic acids and salts in LCP (Table S4).

Uncertainty factors including acute-to-chronic ratios
Most CTD comparisons using either all (Figs. S4 and S5) or similar (Figs. S6 and S7) ingredients diverged visually, indicating different sensitivities. This was supported by significantly different slope and/or intercept parameters for the two CTDs fitted by the log-normal model (Table S5). UFs were also estimated using the CTD comparison approach with all (Tables 3, S6 and S7) and similar ingredients of both datasets (e.g., Acute Oral Toxicity (LD50) vs Reproduction and Fertility Effects (LOAEL); Tables 3, S8 and S9). Using available pairwise data for each ingredient from the two datasets (Acute Oral Toxicity (LD50) vs Reproduction and Fertility Effects (LOAEL)), individual UFs were also calculated and ranked for construction of probability distributions (Figs. 4 and S8). UFs covering 1%, 5%, 10%, 50%, 90%, 95%, and 99% of ingredients were computed from each distribution using the log-normal function (Tables 3, S10 and S11). Corresponding TC values (95% CIs) for the CTD comparisons with similar ingredients were also computed (Tables S12 and S13).
The percentiles of ingredients being protected under default UFs of 10 and 100 were also estimated from each log-normal fitted distribution. [...] derived reproductive NOAELs were predicted to provide estimations of 61%, 2.0%, 56% the time for APC, DCP, and LCP, respectively (Table S11). In contrast, subacute NOAELs were predicted to be > 93% for the three use categories, respectively, if they were derived from corresponding LOAEL values from the Repeated Dose 28-Day Oral Toxicity Study in Rodents.

Chemical toxicity distributions and thresholds of toxicological concern
Probabilistic modeling approaches have been incorporated in hazard and risk assessments, and management paradigms in an effort to estimate potential adverse effects to human health or the environment (Hanson and Solomon, 2002). For example, PHA using the species sensitivity distribution (SSD) approach is based on a continuum of acute or chronic toxicity data for species from an array of taxonomic groups (Solomon and Takacs, 2002). When there is insufficient toxicity information, an intraspecies endpoint sensitivity distribution (IESD) can be developed using a range of endpoints for a given species, and leveraged in PHA (Hanson and Solomon, 2002). Similarly, the CTD concept, utilizing toxicity information of a group of chemicals (e.g., a use category or common category) for the same species and same experimental endpoints, has also been used in PHA, especially when data availability is limited (Brain et al., 2006). Based on this concept, CTDs can be assembled for various model species (e.g., Daphnia magna and Pimephales promelas) (Williams et al., 2011), test methods (e.g., in vivo and in vitro) (Dobbins et al., 2008), exposure durations (e.g., acute and chronic) (Berninger and Brooks, 2010), modes of action (MOA; e.g., inert chemicals, less inert chemicals, reactive chemicals, and specifically acting chemicals) (de Wolf et al., 2005), or chemical classes (e.g., Cramer class I, II, and III) (Munro et al., 1996), and plotted on the same axes for comparison of relative sensitivities. This approach provides insight into which organisms, methods, exposure durations, MOAs, or chemical classes are more sensitive than others for a group of chemicals of interest and thus provides a robust mechanism for prioritization of testing. We recently expanded this approach to identify novel UFs and to derive TTCs for different chemical classes of cleaning product ingredients.
In the present critical review and analysis of a unique mammalian toxicology dataset, we extended the application of CTDs using toxicity data of different study types for rats (oral) in three different cleaning product use categories (APC, DCP, LCP) and for ingredients from common product types or chemical categories within each use category, respectively (Figs. 1-3 and Figs. S1-S3).
The utility of a CTD, like an SSD or IESD, is largely dependent on the quantity and quality of data. Requirements on the minimum number of data points (from individual or multiple studies) for a distribution vary considerably. For instance, OECD (1992) requires at least five data points, whereas the US EPA (Stephan et al., 1985) and ANZECC and ARMCANZ (2000) recommended eight. In this study, in order to be consistent with our previously published methods (Dobbins et al., 2008; Williams et al., 2011; Wang et al., 2018) and to make full use of datasets from the CPISI database, a dataset containing a minimum of five data points was considered to be suitable for a CTD. Nonetheless, the CTD approach is more robust with increasing data, and is thus more accurate with narrower variances (95% CIs), especially for the lower ends (e.g., TC1 and TC5) of a distribution. Therefore, further study is warranted when additional new hazard data are available, which can serve to update and refine the observations presented here.
Fig. 4. Individual uncertainty factor probability distributions for median lethal dose (LD50)-to-lowest-observed-adverse-effect level (LOAEL) and LOAEL-to-no-observed-adverse-effect level (NOAEL) extrapolations.

Use of CTDs (or SSD/IESD) in PHA allows health and ecological risk assessors to generate a criterion threshold value to protect a desired level of effect to a particular endpoint (e.g., mortality, development or reproduction) for chemicals in a particular chemical class or a use category (e.g., APC, DCP, LCP). Considering SSDs in lower tiered risk assessment, for example, threshold levels (known as hazardous concentrations; HC) are selected based on what is considered an acceptable concentration of the substance, thereby protecting most organisms (typically 95%) in an assemblage of species (Wagner and Lokke, 1991; Aldenberg and Slob, 1993). The centile is normally set as 5% (i.e., HC5) by regulatory agencies and other practitioners, and is routinely used to establish water quality guidelines/criteria (Stephan et al., 1985; ANZECC and ARMCANZ, 2000; CCME, 2007) and to conduct ecological risk assessments (US EPA, 1998; Solomon and Takacs, 2002; ECHA, 2003). Other centiles have also been considered such as HC10 (Wang et al., 2014; Wang and Leung, 2015) or HC20 (Saari et al., 2017) in SSD comparisons due to data availability. Hanson and Solomon (2002) defined thresholds of toxicity as a low centile (0.1%) from EC10 distributions (IESD) and then used these centiles as toxicological benchmark concentrations, though they also suggested a higher centile (1% or 10%) may be more appropriate as these tend to be within the observed values with greater confidence in their representativeness. For CTDs, TC5s have been used to evaluate the relative sensitivities among different types of in vitro and in vivo assays (Dobbins et al., 2008; Dreier et al., 2015), between and among different distributions (Solomon et al., 2000; Wang et al., 2018), and to identify TTCs for industrial organic chemicals (Munro et al., 1996) and for substances present at low levels in diet (Kroes et al., 2004). In this study, TCs (95% CIs) at different percentiles for a CTD were computed (Tables 2 and S2), which were further used to identify TTCs (e.g., TC1 or TC5), compare relative sensitivities between or among CTDs (i.e., TC5), and derive novel UFs through TC ratios of pairwise CTD comparisons (e.g., TC5-to-TC5 ratios). Additionally, the linear regression function for a particular distribution was also derived in this study (y = bx + a; Tables 2 and S2).
These functions can be used by practitioners in future PHA to estimate the probability of finding a compound at or below a measured or predicted exposure threshold for a common endpoint (e.g., mortality, development or reproduction) of a defined type of toxicity test for a particular group of ingredients, such as cleaning product ingredients in a use category (APC, DCP or LCP) or in product types or chemical categories within a given use category.
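As a minimal sketch of how a regression function of the form y = bx + a might be applied, the Python snippet below assumes that y is the probit (standard normal z-score) of the cumulative rank and x is the log10-transformed toxicity value. This convention is common for log-normal CTDs but is an assumption here, and the slope and intercept are hypothetical placeholders rather than values from Table 2 or S2.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def threshold_concentration(percentile, slope, intercept):
    """Toxicity value at a given CTD percentile (e.g. 5 for a TC5).

    Assumes probit(y) = slope * log10(value) + intercept.
    """
    z = STD_NORMAL.inv_cdf(percentile / 100.0)
    return 10 ** ((z - intercept) / slope)


def fraction_at_or_below(value, slope, intercept):
    """Estimated fraction of ingredients whose toxicity value is <= value."""
    z = slope * math.log10(value) + intercept
    return STD_NORMAL.cdf(z)


# Hypothetical coefficients (assumption for illustration only)
b, a = 1.5, -4.5
print(f"TC5 = {threshold_concentration(5, b, a):.1f} mg/kg")
print(f"Fraction at or below 10 mg/kg = {fraction_at_or_below(10, b, a):.4f}")
```

The two functions are inverses of one another, so a TC computed at a given percentile maps back to that same percentile.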
Representativeness and composition of ingredients in a dataset may be another factor influencing CTD comparisons. Thus, in the present study the general datasets consisting of all available LD50s and LOAELs/NOAELs were separated into major groups (e.g., product types or chemical categories), and the relative sensitivities between or among different product types (~3 orders of magnitude difference) or chemical categories (same order of magnitude) were also examined (see Tables S3 and S4 for more details). With this information, it is feasible to further examine chemical effects on a particular group and minimize noise due to differences in ingredient composition in a general CTD (Brix et al., 2001; Wang et al., 2014). For instance, adding acute LD50 data for more sensitive ingredients (ethers) to a general acute LD50 dataset for APC would weight the general distribution toward the lower end, while adding numerous less sensitive data points for aliphatic acids and salts could shift the distribution toward the upper end, making the CTD appear more or less sensitive depending on the data (Fig. 3a). Information on the relative sensitivities of ingredients from different product types within a use category is also important because of consumer interest in which specific ingredients are used or are safer; interest among stakeholders in sustainable alternative ingredients; and the potential for use of frame formulas in determining ingredient concentrations as part of exposure assessment.
Previous studies demonstrated the possibility of using the CTD concept in TTC development for diverse industrial chemicals (de Wolf et al., 2005; Kroes et al., 2005; Connors et al., 2014; Belanger et al., 2015) and pharmaceuticals (Berninger and Brooks, 2010). The current study also supports the potential application of the CTD approach in TTC development for a group of cleaning product ingredients in a use category (APC, DCP, or LCP) and bolsters support for use of this approach for ingredients in different product types or chemical categories within a given use category. This approach is not limited to the identification of potential hazards, but also provides a quantitative estimate of potency (Veenstra and Kroese, 2005). As mentioned above, the derived TC values and their 95% CIs listed in Table 2 (for LD50s and LOAELs) and Table S2 (for NOAELs), which rely on the most robust mammalian toxicity dataset currently available for cleaning product ingredients in a use category (APC, DCP, LCP), can represent TTC values for future PHA. In practice, risk assessors or authorities can decide whether TC1 or TC5, or the lower bounds of their 95% CIs, should be used for a certain case of interest. In contrast to using a set of basic test data as the initial starting point for a screening-level risk assessment (HQ approach), particularly for chemicals with little or no data, this approach could be beneficial in both industrial and regulatory settings for avoiding extensive toxicity testing and safety evaluations when human intake or environmental exposure is at or below such thresholds.
The TTC concept proposes that a low level of exposure with a negligible risk can be identified for many chemicals, including those without toxicity information, based on knowledge of their chemical structures. For cleaning products, previous assessments have shown that human exposure to household cleaning product ingredients is very low for a number of scenarios in which ingredients of interest comprise up to 30% of the product (e.g., aggregated exposure of soap, linear alkylbenzene sulfonate, and alkyl sulfates < 0.006 mg/kg bw/day; www.heraproject.com). In practice, traditional risk assessment scenarios may not be necessary if there is an extremely large screening-level margin of exposure (lower risk characterization), and the TTC approach, when data are limited, can provide mammalian doses for a chemical that may not present a safety concern (EFSA, 2016). In the present study, our calculated TTCs were based on currently available mammalian hazard data for individual ingredients among use categories or product types. This PHA approach thus is not intended to account for effects of chemical mixtures, but can inform substitution of less hazardous chemicals within and among product uses. It further appears that this approach can be extended to other chemical classes and product types, depending on data availability.
It is also important to note that the BMD approach uses all dose-response data to estimate the shape of the overall dose-response relationship. It can thus be applied in a manner analogous to the LOAEL/NOAEL approach (EFSA Scientific Committee et al., 2017), while avoiding the dependence of NOAEL and LOAEL values on the ranges of treatment levels employed in toxicology experiments. In contrast to use of LOAELs/NOAELs, the BMD approach (lower bound of the 95% CI of the BMD; BMDL) can account for additional toxicological considerations (e.g., dose-response information), and may result in a lower threshold (the lowest BMDL). In the present study we critically examined and performed meta-analyses on a highly unique mammalian database; however, BMDL information was too limited for ingredients in the APC, DCP, and LCP categories to conduct PHA or to identify UFs using BMDLs for these cleaning product use categories. Further studies should be conducted when additional BMDL values are available to refine our current findings (e.g., TTCs and UFs).

Uncertainty factors including acute-to-chronic ratios
Two probabilistic approaches (CTD comparisons and individual UF probability distributions) were applied for UF derivation in the present study. Rather than comparing the individual toxicity of chemicals, CTDs incorporate responses for multiple compounds during development of probability distributions. Therefore, UFs derived from CTD comparisons (i.e., TC ratios; Tables 3 and S6-S9) may provide sufficient protection through a more data-driven approach. The PHA concept was also applied to estimate an overall UF covering most ingredients of concern from the distributions of individual UFs for a particular use category (Tables 3, S10 and S11). Again, this approach incorporated UFs for multiple ingredients during development of individual UF probability distributions. Moreover, a Monte Carlo simulation approach was applied, so that a range (i.e., 95% CIs) of UFs rather than a point estimate was obtained. This allows hazard and risk assessors to determine whether the median (50%), or lower (e.g., 2.5%) or higher (e.g., 97.5%) bounds of these UFs should be used in their practice for certain chemicals of interest. To reduce potential bias introduced by the representativeness of ingredients in each pairwise CTD comparison, we also conducted coherent meta-analyses using paired datasets consisting of similar ingredients when there were sufficient data points (≥5). Due to the limited LOAEL data in our critical review, this could not be done for some comparisons, especially those for exposure duration extrapolations. Further study should be conducted for those missing CTD comparisons using similar ingredients when more hazard data are available.
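One way a Monte Carlo estimate of a TC-ratio UF with a 95% interval could be set up is sketched below. This is an illustration under stated assumptions, not the procedure used in the study: each TC is treated as log-normally uncertain, its log10 standard error is back-calculated from its reported 95% CI, the two TCs are sampled independently, and all input values are hypothetical.

```python
import math
import random


def simulate_uf(tc_num, ci_num, tc_den, ci_den, n=20000, seed=1):
    """Monte Carlo ratio of two threshold concentrations (e.g. LOAEL TC5 / NOAEL TC5).

    Returns the 2.5th, 50th and 97.5th percentiles of the simulated ratio.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible

    def log_params(tc, ci):
        lower, upper = ci
        # Assumption: the 95% CI spans +/- 1.96 standard errors on the log10 scale
        se = (math.log10(upper) - math.log10(lower)) / (2 * 1.96)
        return math.log10(tc), se

    mu_n, se_n = log_params(tc_num, ci_num)
    mu_d, se_d = log_params(tc_den, ci_den)
    ratios = sorted(
        10 ** (rng.gauss(mu_n, se_n) - rng.gauss(mu_d, se_d)) for _ in range(n)
    )

    def pick(p):
        return ratios[min(n - 1, int(p * n))]

    return pick(0.025), pick(0.5), pick(0.975)


# Hypothetical TC5 values with 95% CIs (mg/kg/day), for illustration only
lower, median, upper = simulate_uf(30, (20, 45), 10, (6, 16))
print(f"UF = {median:.1f} ({lower:.1f}, {upper:.1f})")
```

An assessor could then choose the median or either bound of the simulated ratio, as described above.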
The UF concept has been integrated in human hazard and risk assessment for the explicit derivation of a human limit value (HLV) from experimental or epidemiological toxicity data (e.g., NOAEL or LOAEL) (IPCS, 1994; ECETOC, 1995; US EPA, 2002). A 10-fold default UF can be used to account for various uncertainties, such as inter- and intraspecies variances, subchronic-to-chronic extrapolations, LOAEL-to-NOAEL ratios, adequacy of the total database, and route-to-route extrapolations (JECFA, 1961; JMPR, 1962; US EPA, 1988). The rationale for such a default UF of 10 was also examined by Dourson and Stara (1983) for specific areas of uncertainty in risk calculations, so as to protect the majority of human populations from adverse health effects (Bigwood, 1973; Lu, 1979; Vettorazzi, 1980; Calabrese, 1985). However, results from the present study indicated that a factor of 10 might not be sufficient, depending on the specific situation, for cleaning product ingredients in the APC, DCP, or LCP use categories (Table 3).
When chronic NOAELs are not available, HLVs can be derived from LOAEL values with a UF L-N. Previous studies demonstrated that a factor of 10 or below is adequate for UF L-N (Weil and McCollister, 1963; Dourson and Stara, 1983; Naumann and Weideman, 1995), while a range of UF L-N values is commonly applied in hazard or risk assessment depending on the severity of the adverse effect at the LOAEL, including values of 1-10 by US EPA (2002) and 3-10 by IPCS (1994). Based on the UF L-N analysis in the present study using the CTD comparison approach (e.g., TC5 ratios; using all or similar ingredients), UF L-N values were estimated as 0.9 (0.6, 1.6)-4.7 (2.9, 8.1) for APC and 0.3 (0.1, 1.5)-3.0 (1.0, 11) for DCP, using the corresponding LOAEL-to-NOAEL TC5 ratios from the Repeated Dose 90-Day Oral Toxicity Study in Rodents (Table 3). Corresponding UF L-N values (90th percentile) for APC and DCP were estimated as 46 (23, 245) and 23 (13, 231) from the individual UF probability distributions, while a factor of 10 was observed to be protective only 65% and 75% of the time, respectively. Considering both this approach and conservatism, UF L-N values of 46 and 23 could be more appropriate for APC and DCP, respectively, when using LOAELs as surrogates for NOAEL derivation. Factors of 2.5-5.3 tended to provide adequate protection for LOAEL-to-NOAEL extrapolations for LCP in the Repeated Dose 90-Day Oral Toxicity Study in Rodents.
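The two quantities reported from an individual UF probability distribution, namely the UF at a chosen percentile and the share of ingredients covered by a fixed factor such as 10, can be sketched as follows. This is a minimal illustration assuming that ingredient-specific LOAEL-to-NOAEL ratios are log-normally distributed; the ratios listed are hypothetical placeholders, not data from this study.

```python
import math
from statistics import NormalDist, mean, stdev


def fit_lognormal(ratios):
    """Fit a normal distribution to log10-transformed individual UF ratios."""
    logs = [math.log10(r) for r in ratios]
    return NormalDist(mean(logs), stdev(logs))


def uf_at_percentile(dist, percentile):
    """UF that covers the given percentage of ingredients (e.g. 90)."""
    return 10 ** dist.inv_cdf(percentile / 100.0)


def coverage_of_factor(dist, factor):
    """Fraction of ingredients whose individual ratio is <= factor."""
    return dist.cdf(math.log10(factor))


# Hypothetical LOAEL/NOAEL ratios for individual ingredients (illustration only)
ratios = [2, 3, 3, 4, 5, 5, 8, 10, 20, 50]
dist = fit_lognormal(ratios)
print(f"90th percentile UF: {uf_at_percentile(dist, 90):.1f}")
print(f"Coverage of a factor of 10: {coverage_of_factor(dist, 10):.0%}")
```

Comparing the coverage of the default factor with the 90th percentile UF shows directly whether the default is conservative enough for a given use category.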
Exposure duration extrapolations, especially subchronic-to-chronic extrapolations, have been commonly used in hazard or risk assessments (ECETOC, 1995; Swartout, 1997; Dankovic et al., 2015). Previous studies also supported that a UF S-C of 10 would provide sufficient protection (Weil and McCollister, 1963; Dourson and Stara, 1983; Pieters et al., 1998), although other studies suggested a more flexible approach in which 10 could be an upper bound estimate (McNamara, 1976; ECETOC, 1995; Dourson et al., 1996). With limited LOAEL data, only two UF S-C values (3.7 (0.9, 27) and 3.0 (0.06, 45)) were computed in the present study, for developmental and reproductive responses in rats (oral) for LCP product chemicals. In this case, a factor of 10 would provide sufficient protection for extrapolation from the subchronic Prenatal Development Toxicity Study or Reproductive and Fertility Effects study to Chronic Toxicity Studies, while such values possibly could be decreased from 10 to 3.7 and 3.0, respectively, based on the existing information. However, such UF S-C values should be further examined and/or refined when more data are available for CTD comparisons consisting of similar ingredients or for individual UF probability distributions, and separate UF S-C values should be derived for DCP and LCP, respectively.
Appropriate selection of ACRs is crucial for read-across processes when extrapolating chronic LOAELs/NOAELs from acute LD50s for health hazard or risk assessment (Weil and Wright, 1967; Layton et al., 1987; Kramer et al., 1996). Though not consistently employed in practice, ACRs can be useful; for example, the State of Michigan (USA) uses a total ACR of 1,000,000 when setting a safety value using rodent LD50s from Acute Oral Toxicity studies as a point of departure (State of Michigan, 2017). Recently, we conducted an initial meta-analysis using all available acute LD50s and developmental and reproductive NOAELs for ingredients of cleaning products, and ACRs of 1996 and 317 were recommended when extrapolating developmental and reproductive chronic NOAELs, respectively, from rodent acute LD50s (oral). For cleaning product ingredients in a specific use category, results from the present study indicated that an ACR of 169 could be more appropriate for APC when extrapolating from acute LD50s to chronic LOAELs (Table 3). Similarly, ACRs of 26-399, 124-2336 and 45-990 may be more appropriate for APC, DCP, and LCP, respectively, when acute LD50s are used to extrapolate the corresponding chronic NOAELs (Tables S7, S9 and S11). In addition, other relationships between acute LD50s and subacute, subchronic or chronic LOAELs/NOAELs from other toxicity study types were also examined in the present study. For example, if acute LD50s were used as a surrogate, ACRs of 25-46, 37-50, and 37-67 may be useful for sublethal developmental LOAELs for APC, DCP and LCP, respectively (Table 3), while UFs of 28-66, 40-178, and 24-78 were identified, respectively, using LD50 and NOAEL relationships (Tables S7, S9 and S11).
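The arithmetic behind an ACR-based extrapolation is a simple division of the acute LD50 by the ratio. The sketch below applies three ACRs quoted in the paragraph above to a hypothetical LD50 of 2000 mg/kg; the function name and the LD50 value are assumptions for illustration, and the results are screening-level estimates only.

```python
def extrapolate_chronic_value(ld50_mg_per_kg, acr):
    """Screening-level chronic LOAEL/NOAEL estimate (mg/kg/day) = LD50 / ACR."""
    if acr <= 0:
        raise ValueError("ACR must be positive")
    return ld50_mg_per_kg / acr


ld50 = 2000.0  # hypothetical rat oral LD50 (mg/kg), not from the database

# ACR values quoted in the text above
scenarios = [
    ("developmental NOAEL, ACR 1996", 1996),
    ("reproductive NOAEL, ACR 317", 317),
    ("APC chronic LOAEL, ACR 169", 169),
]
for label, acr in scenarios:
    estimate = extrapolate_chronic_value(ld50, acr)
    print(f"{label}: {estimate:.2f} mg/kg/day")
```

Larger ACRs give proportionally lower, more conservative chronic estimates from the same LD50.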

Conclusions
Lack of available mammalian toxicity data for most chemicals in commerce inherently challenges human health practitioners performing risk assessments of chemical threats. We reviewed a unique mammalian toxicology database for specific cleaning product use categories and then identified how using the CTD concept in PHA can support assessments of these chemical classes and specific uses when robust toxicity data for hazard and risk assessment are lacking. We examined available hazard data (acute LD50s and sublethal LOAELs/NOAELs) generated from different types of standard mammalian toxicity studies for rats (oral) used in health risk assessments for APCs, DCPs and LCPs. Probabilistic CTDs were subsequently developed for all available ingredients in each use category and for specific product types or chemical categories; TCs (95% CIs) from each CTD were computed, and potential TTCs were identified. The present study also indicates that sub-distributions with different relative sensitivities exist among product types and chemical categories within a particular use category. Moreover, we extended a novel approach to identify UFs, particularly for LOAEL-to-NOAEL and exposure duration extrapolations (e.g., subchronic-to-chronic), and ACRs considering LD50-to-LOAEL/NOAEL relationships for ingredients in the APC, DCP and LCP categories of common household products. Findings from this study are anticipated to be useful for future mammalian health hazard and risk assessment, such as TTC identification, regulatory data dossier development through read-across, chemical substitutions, and screening-level health hazard and risk assessment, especially when limited or no toxicity data are available for industrial chemicals.

Conflict of interest
P. DeLeo worked for the American Cleaning Institute during this study. The authors have no other conflicts of interest.