Methods for estimating no‐effect toxicity concentrations in ecotoxicology

A range of new statistical approaches is being developed and/or adopted in ecotoxicology that, when combined, can greatly improve the estimation of no‐effect toxicity values from concentration–response (CR) experimental data. In particular, we compare the existing no‐effect‐concentration (NEC) threshold‐based toxicity metric with an alternative no‐significant‐effect‐concentration (NSEC) metric suitable for when CR data do not show evidence of a threshold effect. Using a model‐averaging approach, these metrics can be combined to yield estimates of N(S)EC and of their uncertainty within a single analysis framework. The outcome is a framework for CR analysis that is robust to uncertainty in the model formulation, and for which resulting estimates can be confidently integrated into risk assessment frameworks, such as the species sensitivity distribution (SSD). Integr Environ Assess Manag 2024;20:279–293. © 2023 Commonwealth of Australia and The Authors. Integrated Environmental Assessment and Management published by Wiley Periodicals LLC on behalf of Society of Environmental Toxicology & Chemistry (SETAC).


INTRODUCTION
The primary concern of most environmental regulatory risk assessments is whether there is a level of toxicity that poses a risk to the survival of a population.Generally, in a protective-risk management framework, the goal is to determine the concentration at which there can be considered no adverse effects on the ecosystem.For this reason, estimation of no-effect concentrations (NECs) for substances of concern is integral to both risk assessment regulation and for the establishment of environmental quality standards (de Bruijn & Hof, 1997).In several jurisdictions the species sensitivity distribution (SSD; Posthuma et al., 2001) is used to extrapolate, from individual estimates of safe concentrations for a range of species, the percentage of the community that is protected at concentration "x" (PCx), or hazardous concentrations (HCx) for 100−x% of all species, that are applied in most current formal water quality guideline value (GV) derivations (Fox et al., 2021).The data underpinning the SSD are the toxicity estimates extracted from individual species concentration-response (CR) experimental data.Ideally, if the goal is high confidence in the level of protection achieved, the input toxicity data should represent the maximum concentration of no "effect" for each species.
There is a wide range of statistical metrics that have been adopted to estimate no-effect toxicity values for use in SSDs (Table 1), and for pragmatic reasons, a range of metrics is generally allowed within most regulatory frameworks.These include the no-observed-effect-concentration (NOEC), effect-concentration of a defined percentage x (typically 10%, EC10, LC10), and the no-effect-concentration (NEC; Table 1).There is considerable debate in the literature regarding the validity and value of different approaches (Fox, 2008(Fox, , 2012;;Fox & Landis, 2016a, 2016b;Green et al., 2013;Warne & Van Dam, 2008).All three methods are used in practice with each having strengths and weaknesses across the myriad situations that arise in CR modeling.The current Australian guidelines state that: "… the preferred order of statistical estimates of chronic toxicity to calculate default and site-specific GVs is: chronic NEC, EC/IC/LCx where x ≤ 10, BEC10, EC/IC/LC15-20, and NOEC.Although all of these acceptable statistical estimates of toxicity are not numerically the same, they are all treated as equivalent for the purposes of deriving GVs …."However, for SSDs developed for the purpose of estimating protective guideline concentrations, the recommendation that "effect" concentrations are allowable in an SSD could be considered problematic because, by definition, such toxicity measures represent an "effect" of the specified amount, and the resulting SSD cannot be assumed to be protective (Fisher & Fox, 2023).Therefore, an NEC is the preferred toxicity metric for SSD modeling in Australia (Warne et al., 2018).
The NEC is typically estimated using a threshold model (Fox, 2008(Fox, , 2010;;Pires et al., 2002) and represents the maximum concentration for which there is no response for a given species, thereby providing a toxicity measure that is ideal for incorporation into SSDs aimed at estimating protective concentrations of contaminants.Although the NEC is considered the preferred measure for inclusion into SSDs in at least some jurisdictions (Warne et al., 2018), CR data do not necessarily exhibit abrupt threshold-like responses, and applying a threshold model in this case will lead to poor outcomes (Krull, 2020).This phenomenon has occurred in our own work, where at times attempts to fit a threshold model have failed using standard packages such as drc in R (Ritz et al., 2015) or, when it does fit successfully, may yield values that are higher than even the EC10 (Negri et al., 2021).In this case, a threshold model of response is clearly not appropriate, and the NEC will not be a suitable estimate of "no-effect".
More recently, Fisher and Fox (2023) introduced the nosignificant-effect concentration (NSEC) as an alternative to the NEC when a threshold is either untenable or not indicated by the data.Initially proposed by Bellio et al. (2000) and further developed by Chèvre et al. (2002), full details of the method are provided in Fisher and Fox (2023) and are not repeated here.Briefly, the technique involves fitting a nonlinear model to the CR data and evaluating the concentration above which the difference between the mean response at the control and that concentration is statistically significant, based on the estimated variability and a predefined level of statistical significance.The NSEC represents an improvement over the conceptually linked NOEC often used in ecotoxicology by decoupling the estimate from the treatment concentrations specifically used in a given experiment and allowing an estimate of precision (Fisher & Fox, 2023).Although Fisher and Fox (2023) provide a thorough description of the mathematical derivation of the frequentist version of the NSEC, as well as implementation of the Bayesian equivalent, the concept has yet to be evaluated more extensively using simulated and case study data.
In addition to the range of toxicity estimates that are used in SSD modeling, there is an even wider range of possible CR models that may be used to derive them.For example, the popular frequentist CR modeling R package drc (Ritz et al., 2015) contains 23 nonlinear functions that can be used, as does our recently developed Bayesian CR modeling package bayesnec (Fisher et al., 2021).These include multiple threshold models as well as a wide array of models representing a smooth decline with increasing toxicant concentration.Although there may be underlying physiological and toxicological mechanisms suggesting a threshold model is appropriate in some cases (Chapman & Wang, 2000), this is not always known, and clear thresholds may not always be apparent in the observed response relationship due to complex physiological responses and variable individual tolerances resulting in a wide tolerance distribution across a population (Van Straalen, 1997).Without a theoretical basis for model selection, it becomes difficult to determine a priori the best model to use for a given set of data.Given the many possible curves that may be used to describe the data, model uncertainty can be high and model selection  Interpolation from a CR model (Fisher & Fox, 2023) NEC No effect concentration-the minimum concentration above which an effect is predicted to occur Parameter estimate of a CR threshold model (Fox, 2010;Pires et al., 2002;Van Der Hoeven, 1997)  Interpolation from a CR model Note: Definitions are adapted from Warne et al. (2015), with the exception of NSEC, which is described in Fisher and Fox (2023).Abbreviations: CR, concentration-response; SSD, species sensitivity distribution.
ambiguous when many of these appear to fit the data equally well.Model averaging (Burnham & Anderson, 2002) represents an approach that can provide a robust way of accommodating model uncertainty.Model averaging involves fitting a candidate set of plausible models to the data and obtaining weighted averaged estimates of the metrics of interest (Burnham & Anderson, 2002).Weights are based on the relative fit of each model to the data.Model averaging is currently being considered as a potentially more robust framework for SSD modeling (Fox et al., 2021) and is widely used in ecology (Dormann et al., 2018).
Here, we explore estimation of no-effect toxicity values using the recently introduced NSEC toxicity measure (Fisher & Fox, 2023) in combination with NEC, within a modelaveraging framework.We begin with a simplified simulation study including only two alternative models, an NECthreshold model and a simple sigmoidal decay function.We then present a case study using real data revealing how the NSEC and NEC values, when combined using model averaging, can yield no-effect toxicity estimates along with estimates of their uncertainty within a single analysis framework.

SIMULATION STUDY
We constructed a simulation study to compare toxicity estimates obtained using NEC, ECx, and the NSEC, across a range of experimental designs with differing replications.Simulations were based on four alternative scenarios representing four different theoretical concentration-response relationships of a binary response endpoint.For all scenarios, we assume a data structure in the following form: {y i = number of "successful" outcomes out of n i replications of an experiment at concentration x i }.Our response y i is modeled using a binomial distribution, with expected value μ i given the number of binomial "trials" n i : where for the first two scenarios the expected value μ i was predicted using the three-parameter NEC exponential decay threshold model of Fox (2010), with the parameters α (top, y intercept), β (exponential decay rate), and γ (the NEC threshold, see Equation 1).
and the indicator function γ ( ) The second two scenarios were based on a threeparameter sigmoidal decay model representing a smooth decline with concentration (see Equation 2).This includes the same parameters α (top, y intercept), β (exponential decay rate), and an additional parameter δ influencing shape of the decay function.
These models were used to generate predicted data for a given concentration (x), using the two R functions given in the Supporting Information.
Parameters were selected for α, β, γ, and δ (Table 2) to yield the four curves shown in Figure 1A.Simulated datasets were generated randomly using these four curves based on a binomial distribution to represent a hypothetical binary endpoint (e.g., survival) with a mean value of 90% success for the control (α-top,).This was achieved by generating theoretical predicted probabilities for two theoretical experimental treatment sequences with either eight (low-density design) or 12 (high-density design) treatment concentration values distributed evenly from 0.01 to a maximum hypothetical value of 10 concentration units.We applied the base R function rbinom() to randomly simulate binomial response data at each treatment level for the predicted probability, with varying levels of n (five or 10 replicates) and size (10 or 20 binomial trials, representing the number of individual test organisms within each replicate).The number of trials used was based loosely on our experience with some of the commercial accredited ecotoxicological tests offered in Australia, such as the Saccostrea echinata (100 individuals/ replicate, https://www.ecotox.com.au/wp-content/uploads/2018/11/Testfactsheet7.pdf) and the Echinometra mathaei (WIECX-25-Sea Urchin Larval, Test ID: 69669, 200 individuals/ replicate) larval development tests.The combination of experimental conditions resulted in eight different sets of simulated data for each of the scenarios examined, and representative examples are shown for each scenario in Figure 1B.This simulation process was repeated 100 times for each design, for all four model scenarios.
"ECxsigmoidal"), which represent the two original model types used to generate the two NEC and sigmoidal scenarios (see Equations 1 and 2).Posterior estimates for the NEC were obtained directly from the NEC model fit for each simulated dataset as the estimated γ parameter.Posterior estimates for NSEC were obtained with the function extract_NSEC (Fisher et al., 2020), which implements the Bayesian method for estimating NSECs, as described in Fisher and Fox (2023), and for the simulation study was based solely on the estimates obtained from the smooth sigmoidal curve.In addition to the NEC and NSEC estimated from their respective threshold and sigmoidal models, a model-averaging approach is used to provide a combined estimate.This model-averaging strategy uses an information theoretic approach (Burnham & Anderson, 2002), and weights the average of the NEC and NSEC based on the relative support of each model given the data, providing a combined estimate of no-effect, which we denote as the N(S)EC.Within this framework, if the NEC model(s) fits the data better, these will have high weight, and the resulting estimate will be a true NEC estimate.Conversely, if the sigmoidal model(s) fit better, the resulting toxicity estimate will be based largely on an NSEC estimate.We also similarly obtained model averaged posterior estimates of EC10, EC5, and EC1 using the function extract_ECx (Fisher et al., 2020).

Comparing model weights
The jagsNEC package uses deviance information criterion (deviation information criterion [DIC]; Spiegelhalter et al., 2002) based model weights to generate model averaged estimates of the toxicity estimates (e.g., EC10 and NSEC).To compare N(S)EC estimates across the different scenarios, we first wanted to establish if the model weighting procedure works effectively in these examples.
We found that the underlying generating model usually had the highest weight for most simulated datasets (Figure 2).There were some exceptions, with a few simulations resulting in high weight for the model not used to generate the data, particularly for low sampling and treatment density (Figure 2A).
There was a tendency for the data generated from the second sigmoidal scenario to yield a higher weight for the NEC model, particularly when there are few treatments (Sigmoidal 2; Figure 2).This tendency persists, even with quite high replication within treatments.There was also a tendency for data generated using the first NEC scenario to  1or 2) relative to the alternative model when fitted using jagsNEC (Fisher et al., 2020) to data simulated according to four theoretical curves, including two no-effect-concentration (NEC) threshold models (NEC 1 and NEC 2, Equation 1) and two sigmoidal models (Sigmoidal 1 and Sigmoidal 2, Equation 2; see Table 2 for details).Simulations were run for a range of sampling designs, including eight or 12 treatment levels, five or 10 replicates within each treatment, and 10 or 20 binomial trials (individual survival estimates).The upper plots (A) show boxplots based on all simulation outcomes, whereas the lower plots (B) show the median weight across simulations as a function of the total number of replicates have a high weight for the sigmoidal model when sampling density is lower, with this occurring more frequently when there were few treatments (NEC 1; Figure 2).
Increasing the number of treatments had a more profound effect on improving the relative weights of the two competing models than increasing the number of trials or replicates within each treatment (Figure 2B).For example, the weight for the NEC model for the NEC 1 scenario was 0.987 for the design with the fewest replicates (five replicates each with 10 trials) and 12 treatments (600 trials; Figure 2B, Supporting Information: Table S1), which is similar to a design with eight treatments and double the level of within-treatment replication (five replicates each with 20 trials, 800 trials, weight = 0.984; Figure 2, Supporting Information: Table S1).For the Sigmoidal 2 scenario, this effect was even more pronounced, with median weights better for the design with 12 treatments across all levels of replication compared with eight treatments (Figure 2).

Comparing toxicity estimates
We examined the NSEC, NEC, and N(S)EC values estimated using jagsNEC (Fisher et al., 2020) for all the simulation scenarios examined, along with EC1, EC5, and EC10, to see how these compared with each other (Figure 3, Supporting Information: Table S2).The model averaged estimated N(S)EC values are close to the true NEC estimates for data simulated using an NEC-type model, although the NSEC estimates are considerably lower than the NEC (NEC 1 and NEC 2; Figure 3A).The close agreement between NEC and N(S)EC reflects the high weight that the NEC model has for FIGURE 3 Toxicity estimates (A) and their coefficient of variability (B) from data simulated according to four theoretical curves, including two no-effectconcentration (NEC) threshold models (NEC 1 and NEC 2, Equation 1) and two sigmoidal models (Sigmoidal 1 and Sigmoidal 2, Equation 2; see Table 2 for details).Simulations were run for a range of sampling designs, including eight or 12 treatment levels, five or 10 replicates within each treatment, and 10 or 20 binomial trials (individual survival estimates).All values were estimated using jagsNEC (Fisher et al., 2020).The estimated no-significant-effect-concentration (NSEC) value is based on a 99% certainty level, and N(S)EC and all ECx estimates are a deviation information criterion (DIC) weighted average of Markov chain Monte Carlo (MCMC)-based model fits for both the NEC-threshold (Equation 1) and the sigmoidal (Equation 2) models.No-effect-concentration values are estimated based only on the NEC model fit, and NSEC is based on the sigmoidal model fit.Also shown are the true toxicity estimates for each scenario (colored horizontal lines).For the NEC scenarios, this is the true theoretical NEC used in simulations.For the sigmoidal scenarios, there is no theoretical NEC, and theoretical EC1, EC5, and EC10 are shown instead.The coefficient of variability is calculated according to the method of Lovitt and Holtzclaw and provides a robust measure of coefficient of variation (Arachchige et al., 2022) the simulated data generated from an underlying NEC model and the NEC model's greater overall contribution to the combined N(S)EC estimate.
Model averaged estimated N(S)EC values are similar to the NSEC estimates for data simulated using a sigmoidal model and, for both sigmoidal scenarios, the NSEC estimates are considerably lower than an NEC estimated from these data (Sigmoidal 1 and Sigmoidal 2; Figure 3A).For sigmoidal models, there is no true theoretical NEC, and we can only compare with true ECx estimates.For the sigmoidal scenarios, the estimated N(S)EC and NSEC values fall across the range of EC1 to EC10 (Sigmoidal 1 and Sigmoidal 2; Figure 3A).
When NEC is estimated by fitting the NEC-threshold model to data generated using the two sigmoidal models (Sigmoidal 1 and Sigmoidal 2), estimated values are quite high relative to even the EC10 estimate, with most estimates higher than the true EC10 value (Figure 3).The estimated N(S)EC and NSEC values are always lower than the estimated NEC value for these sigmoidal models (Figure 3).
Estimates of all toxicity values (including NEC, NSEC, and ECx) are more variable with lower sampling effort across all four scenarios (Figure 3B).Estimates of NSEC as well as ECx exhibit greater variability with low sampling effort and, for some scenarios, variability is further reduced for a design with 12 compared with eight treatments for a similar sampling effort (the blue line is often lower than the green line; Figure 3B).

Actual effects of estimated toxicity
For our simulation study, the knowledge of the underlying response-generating model allows estimation of the actual effect size for the model averaged N(S)EC, NSEC, and NEC estimates, as well as the estimated EC1, EC5, and EC10 values (Figure 4, Supporting Information: Table S2).
For datasets generated using the two NEC models, estimates of NEC, NSEC, and the combined N(S)EC were all close to the theoretical expectation of 0% effect, with a mean difference from expected usually at zero, even for designs FIGURE 4 Actual estimated effects of a range of toxicity estimates (A) and their mean difference from expected (B) estimated from data simulated according to four theoretical curves, including two no-effect-concentration (NEC) threshold models (NEC 1 and NEC 2, Equation 1) and two sigmoidal models (Sigmoidal 1 and Sigmoidal 2, Equation 2; see Table 2 for details).Simulations were run for a range of sampling designs, including eight or 12 treatment levels, five or 10 replicates within each treatment, and 10 or 20 binomial trials (individual survival estimates).All values were estimated using jagsNEC (Fisher et al., 2020), as described in Figure 3. Colored lines show the true EC1, EC5, and EC10 effects.For the NEC, no-significant-effect-concentration (NSEC), and N(S)EC, the theoretically expected values are 0 with relatively low sampling effort (Figure 4B).The exception is the NEC estimate for the second NEC scenario for the design with the least sampling effort, although even then the mean estimated effect is still only approximately 3% (Figure 4B).This reflects the relative robustness of the NEC as a measure of no-effect toxicity for data that have a distinct threshold effect.This robustness appears to be retained when the NEC is combined with the NSEC in a model-averaging framework to estimate an N(S)EC.The ECx estimates derived from models fitted to datasets generated using the two NECthreshold models tended to be lower than expected, particularly for designs with low sampling effort (Figure 4B).There was a somewhat complex pattern for both EC05 and EC10 estimates with the sampling design for the second NEC scenario, possibly due to the position of the treatment concentrations relative to the theoretical position of the NEC (Figure 4B).
For data simulated using the two sigmoidal models, the actual effect sizes of all toxicity estimates varied substantially across simulations, and all were influenced by sampling density (Figure 4).The NEC values estimated from data generated using sigmoidal models represented relatively high actual effect sizes (15%-40% depending on the design) and did not necessarily improve with greater sampling effort (Figure 4B).
There was substantial variability in the actual effect size of the estimated NSEC and N(S)EC values (Figure 4A).Estimates for the NSEC and N(S)EC represented effects nearing 10% on average for designs with the poorest sampling effort, but the calculated effects decreased substantially with increasing sampling effort, representing an effect of approximately 1% on average for the most replicated design in our simulations (Figure 4B).Like the NSEC, there was also substantial variability in the actual effect size of the estimated ECx values, with ECx estimates tending to be greater than the theoretical value when the sampling effort is low, particularly for data based on the second sigmoidal scenario (Figure 4B).As sampling effort increases, this bias gradually decreases, and mean estimates correspond to their true value for the designs with the greatest sampling effort (Figure 4B).For the second sigmoidal scenario in particular, increasing the number of treatment concentrations reduces bias in estimates more quickly for a given sampling effort (Figure 4B).

Case Study 1
In our first case study, we demonstrate how the "no-effect" copper toxicity to the Antarctic marine microalgae Cryothecomonas armigera can be simply estimated using model averaging across a range of NEC thresholds and smooth sigmoidal curves.The data for this case study are from doi:10.4225/15/5746938EC8C8B(with the derived dataset available for download at https://github.com/open-AIMS/NSEC_IEAM/blob/main/case_study1_data.xlsx) and describe the decreasing microalgal population growth rate (normalized as a percentage of controls) after increasing exposure to dissolved copper (Koppel et al., 2017).As the growth-rate data were collated across several separate concentration experiments to ensure alignment, they were normalized to the maximum growth observed within each experiment.Graphical inspection of the data suggested that there was a natural upper bound to the growth rate.To accommodate a response variable with fixed lower (0 growth) and upper (maximum growth) bounds, we used a reparameterization of the Beta distribution (Equation 3) in the likelihood function, after normalization relative to the maximum.The likelihood function for the mean response (μ) is estimated as a function of the two shape parameters of the Beta distribution, using the latent parameter ϕ: 1 .
i i (3) Models were fit using the same jagsNEC package (Fisher et al., 2020) as used in the simulation study.This package uses a model-averaging approach based on DIC weighted averaged predictions across a potential candidate model set composed of a range of functional NEC models adapted from Fox (2010), including two NEC models (NEC3param, NEC4param) and a range of commonly used sigmoidal models (ECx models: ECxLinear, ECx4param, ECxExp, ECxSigmoidal, ECxWeibull1, and ECxWeibull2; see Fisher et al. [2020] and the Supporting Information for more details on the included models, including model formula).We used the default settings in the function fit.jagsMANEC, with 5000 iterations as burn in, 10 000 update iterations, and three chains.Chain mixing was always assessed visually and, where models exhibited poor mixing, they were excluded from model averaged estimates of toxicity estimates.
The DIC-based model weights for the resulting CR curves were spread relatively evenly across three smooth models, ECx4param, ECxsigmoidal, and ECxWeibull2 (Table 3 Figure 5).The NSEC estimates for these models were all comparable at 6 (1-11), 7 (3-11), and 6 (1-11) µg L −1 (Figure 5).We denote the model averaged "no-effect" concentration as the N(S)EC, to indicate it is a weighted average of both NEC and NSEC values.The NEC models had an averaged NEC of 4.6 (3.9-4.9)µg L −1 (Table 4) but low weights (0-0.001;Table 3), so the resulting N(S)EC of 7 (1-11) µg L −1 reflects the weight of the smooth models (Figure 5).Although the N(S)EC value is lower than the NEC, NOEC, or the EC10 (Table 4), the values are slightly higher than the EC1 estimate (Table 4).The estimated "effect" of the N(S)EC value is approximately 2.3% (Table 4).

Case Study 2
In our second case study, we use data from our recent publication (Negri et al., 2021) that was also originally analyzed using the jagsNEC package (Fisher et al., 2020).These data examined the responses of eight tropical marine species to the water accommodated fraction of gas condensate (light crude oil) from the Ichthys and Prelude gas fields off the tropical northwest coast of Australia.The main aim of the original study was to build an SSD following the standard guideline methods (Warne et al., 2015(Warne et al., , 2018) ) to validate mixture toxicity modeling for petroleum hydrocarbons.In their original analysis, either the EC10 or the NEC was used for input to the SSD (whichever was more conservative), as both of these are considered valid toxicity estimates for the purpose of developing SSDs (Warne et al., 2018).Here, we revisit the analysis of the underlying CR curves and use these to explore issues with fitting NEC models to real data, as well as how the N(S)EC toxicity estimate performs in practice.The original data considered eight species.However, three of these did not display a complete response at the highest concentration examined (Acropora muricata, Phyllospongia foliascens, and Rhodomonas salina), and a fourth exhibited evidence of hormesis (Cladocopium goreaui; Negri et al., 2021).For simplicity, these four species were excluded here, leaving the remaining four species-Acropora millepora (coral), Stomopneustes variolaris (sea urchin), Nassarius dorsatus (gastropod), and Amphibalanus amphitrite (barnacle)-to be used in our case study assessment.
Substantial details regarding the collection of these data including the experimental conditions and a description of each assay are described in the original study (Negri et al., 2021) and the supplementary files https://ars.els-cdn.com/content/image/1-s2.0-S0025326X21009334-mmc1.pdf.Response data for assays based on a decline in the percentage of larvae successfully completing metamorphosis (A.millepora and A. amphitrite) or fertilization success (S. variolaris) were initially fit using a binomial likelihood function.However, these initial models were always overdispersed.Instead, these were converted to a proportion and the data fit using a Beta likelihood function, which allows for a more flexible relationship between the mean and the variance relative to the binomial distribution.Response data for assays based on a specific growth rate were normalized if values exceeded one (N.dorsatus)-by dividing by the maximum observed value (to reflect proportional decline) and then also modeled using a Beta distribution, as these were similarly upper bounded as overserved for Case Study 1. Concentration data were included in all models on a log scale, because this was the natural scaling evident in the placement of treatments across the concentration range considered.All models were fit using the R package jagsNEC (Fisher et al., 2020).Chain mixing was always assessed visually and, where models exhibited poor mixing, they were excluded from model averaged estimates of toxicity estimates.As for Case Study 1, only models with NEC as a specific parameter were used to obtain a DIC weighted model averaged estimate of NEC (i.e., both models with the prefix NEC).Estimates of 1% and 10% effect (EC10) and N(S)EC were calculated using DIC weighted model averaged estimates obtained from all successfully fitted models (both models with a prefix NEC and models with the prefix ECx; see Fisher et al., 2020), although in the latter case N(S)EC is a weighted averaged of the NEC and NSEC estimates of the underlying models.
The highest weighting model varied across the four species examined (Figure 6).For A. amphitrite, there was strong weight for a single model-which was the original NEC three-parameter model (Fox, 2010;Figure 6).For this species, the model average based on "all" models was essentially identical to the NEC model set, and the estimated values for the NEC, EC10, and N(S)EC were all nearly identical (with NEC marginally lower than the other two estimates), although the confidence bands for the NEC were substantially narrower than for the other two toxicity estimates (Figure 6; Table 4).For the other three species, there Note: Case Study 1: The effects of copper on the Antarctic microalga Cryothecomonas armigera (Koppel et al., 2017).Case Study 2: The effects of 40% weathered condensate water accommodated fraction (WAF) on four tropical marine species (Negri et al., 2021).All values were estimated using jagsNEC using the default settings (Fisher et al., 2020).N(S)EC and all ECx estimates are a deviance information criterion (DIC) weighted average of Markov chain Monte Carlo (MCMC)-based model fits for both the NEC threshold and the sigmoidal models (see the Supporting Information for more details of individual models).No-effect-concentration (NEC) values are estimated based only on the NEC model fits.No-observed-effect-concentration (NOEC) values are based on fitting the concentration data as a treatment factor via the package glmmTMB (Brooks et al., 2017) and doing a Dunnett's test via glht from the package multcomp (Hothorn et al., 2008). a As the concentration data for this case study were individually measured, they were grouped using a factor of the square-root transformed concentrations rounded to one significant figure to enable a Dunnet's test to be performed.The reported value represents the cell mean for the lowest nonsignificant grouping.
was substantial support for more than one model, and this generally included support for one of the "NEC" model types, as well as a smooth "ECx" model type (see Figure 6).For N. dorsatus, the EC10 estimate is nearly identical to the estimated NEC (blue vertical line overlaps the red vertical line), with the estimated N(S)EC only slightly less, with overlapping confidence intervals (Figure 6; Table 4).Acropora millepora also had similar estimates for the N(S)EC, EC10, and NEC, which all have overlapping confidence intervals, although the N(S)EC was the most conservative and NEC the least conservative for this species (Figure 6; Table 4).For S. variolaris, the NEC estimate was definitively higher than the estimated EC10, and the N(S)EC estimate was lower than both the NEC or EC10 (Figure 6; Table 4).As the N(S)EC estimate was quite low for S. variolaris and in order provide context, we calculated the NOEC value also using a Beta distribution, fitting the concentration data as a treatment factor via the package glmmTMB (Brooks et al., 2017) and doing a Dunnett's test via glht from the package multcomp (Hothorn et al., 2008).We found that the NOEC value was 31.8 µg L −1 , which is far more conservative than our estimated N(S)EC (140 µg L −1 ; Table 4), providing further support that an EC10 estimate for these data would be highly nonconservative, representing a potentially environmentally significant effect.This also demonstrates that, although the estimated N(S)EC is quite low and may appear to some overly conservative, it is less conservative than a calculated NOEC based on the same data.The estimated "effect" of the N(S)EC value for this species is approximately 1.5% (Table 4).

DISCUSSION
The estimation of no-effect toxicity values is critical to deriving safe concentrations of thresholds in the environment.Threshold-based models for estimating toxicity estimates of no-effect are ideal because they directly estimate the NEC as a parameter in the model.However, some CR data exhibit no threshold effect but instead exhibit a gradual decline with concentration rendering an NEC model inappropriate (Fisher & Fox, 2023).Indeed, for both our simulation and the case study examples, when the threeparameter NEC model of Fox (2010) is fit to smooth sigmoidal data, the resulting estimates of NEC are often higher than even the EC10, with simulations suggesting true effects as high as 30%.This occurs because the three-parameter NEC model fit is unable to capture the gradual decline inherent in smoothly sigmoidal data and reinforces the case that fitting threshold models to smoothly sigmoidal data is clearly not appropriate (Fisher & Fox, 2023).In the original analysis of the second case study data, the authors adopted the conservative approach of selecting the lower value of either the NEC or EC10 (estimated using models properly capturing the smooth decline) for inclusion in the final SSD (Negri et al., 2021).This decision is consistent with the current Australian guidelines that aim to derive concentrations that are unambiguously protective of the whole community (Warne et al., 2018).Certainly, using the NEC in this case would result in the derivation of thresholds that are unlikely to be as protective of the community as intended.
The fact that NEC models provide a poor fit to smoothly sigmoidal data resulting in NEC estimates higher in some cases than the EC10 is to be expected, because the NEC three-parameter model fitted here was not used to generate the underlying sigmoidal simulation data.However, the general lack of alternative NEC threshold models means that this is exactly how the NEC would be estimated from these data in practice.We attempted to resolve this issue when it arose in the original analysis of Case Study 2 data by including an additional model that expanded the original three-parameter model of Fox (2010) to allow a sigmoidal decline in the response in the development of jagsNEC (Fisher et al., 2020).However, applying this model to real data often resulted in highly unresolved NEC estimates with extremely wide confidence bands (see Supporting Information: Figure S1).There may be value in further developing the NEC modeling framework to allow for sigmoidal declines using other threshold model parameterizations.However, it seems likely that similar unresolved NEC estimations will result from any sigmoidal model with a relatively flat upper asymptote as the transition point (step/threshold) will likely remain highly uncertain.
For simulated data based on the underlying NEC model, the model averaged estimated N(S)EC values are close to the true NEC estimates for data simulated using an NEC-type model, although the NSEC estimates are considerably lower than the NEC.The high level of similarity between NEC and N(S)EC reflects the high relative weight that the NEC model has when fit to data generated from an underlying NEC-threshold model, which results in the NEC model's greater overall contribution to the combined N(S)EC estimate.Model averaging alleviates the need to undertake separate analyses (threshold vs. nonthreshold models), providing a method for estimating an no-effect toxicity value within a single analysis framework.
Our simulation study demonstrated that, for well-designed experiments with many treatment concentrations, the DIC model weights we used here were usually highest for the underlying data generating model.This suggests the DIC weights did a reasonable job of resolving the true model, and that the model average inference is quite robust, at least for these simulation study data.Weights did vary more widely for simulations with fewer treatment concentrations (eight rather than 12), reinforcing the idea that replication within treatments (a requirement of the one-way ANOVA methods used to generate NOECs) should be reduced in favor of increasing the number of treatments (Fox, 2016).
The DIC weights used here are simple to extract from jags models which provide DIC by default.These are generally considered analogous to other information theoretic metrics such as akaike information criteria (AIC, Spiegelhalter et al., 2002).However, there are potentially much better weighting strategies available for Bayesian model fits (Gelman et al., 2014).This was one motivation for the development of the bayesnec package (Fisher et al., 2021) based on brms (Bürkner, 2017), for which the loo (Vehtari et al., 2020) package provides a range of weighting options (Vehtari et al., 2017).
Model averaging provides additional benefits to ecotoxicologists because it avoids having to decide which model to use when there is no overarching theoretical basis for model selection.Perhaps more importantly, model averaging eliminates the need to select only a single model for inference.This is helpful when more than one model fits the data well, such as in Case Study 1 where model weights were broadly equivalent for three competing models, although in this case the estimated curves (and resulting inference) are similar (Figure 4).We have come across more problematic examples when the log-logistic and Weibull type 1 and 2 models from drc (Ritz et al., 2015) can result in different effects estimates, particularly at the EC10 level, despite having similar AIC values and associated weights.Model averaging ensures that estimated toxicity values are robust, defensible, and usable within the relevant decision context or regulatory framework.Note that we are not suggesting here that model averaging should encourage uncritical "automated" analysis-indeed careful selection of the candidate model set, the appropriate application of statistical methods (see below), and critical evaluation of the model outputs will remain essential elements of the analysis of a CR experiment.
Within the model-averaging framework being put forward here, we suggest that the NSEC (as calculated from sigmoidal models; Fisher & Fox, 2023) be combined with the NEC estimated from threshold models to estimate an overall noeffect toxicity value: the N(S)EC.We believe that the NSEC is a more appropriate estimate of "no-effect" than a low-effect ECx estimate to use within this model-averaging framework, because an ECx, by definition, represents some level of effect.Because the NSEC estimate is based on a significance test relative to the controls (Fisher & Fox, 2023), these estimates will have some level of nonzero observed "effect" associated with them, in the same way that the NOEC can also represent some level of effect (see Mebane et al., 2008).Van Der Hoeven (1997) argues that, because there is a minimum nonzero effect size corresponding to any given NOEC, setting defined effects "x" (of similar size to that minimum) should therefore be acceptable.However, the difference is that the "x" in ECx is typically arbitrarily defined with no direct link to the biological implications of that effect.The "x" in a NOEC (or NSEC) is defined as 0, with any deviation from 0 being regarded as random noise.Although it is true that the probability level considered in the estimation of NOEC is often arbitrary (and perhaps should be more carefully considered with respect to the specific decision context; see e.g., the probabilistic thresholds in Fisher et al., 2018), the basic mechanism of hypothesis testing at the heart of the NOEC (and NSEC) at least provides a clear and coherent framework with which to make decisions under uncertainty.Indeed both Van Der Hoeven (1997) and de Bruijn and Hof (1997) point out that the selection of an appropriate "x" is problematic from a policy perspective.
Where there is sound ecological or biological justification with which to set a meaningful "essentially no-effect" value (x) for an ECx, the use of ECx may be justified in the derivation of protective guidelines.However, there are few situations in which it is possible to theoretically define such an "x," and "x" is typically set at an arbitarily small, fixed value.A fixed low level of effect as defined by an ECx may not translate into a meaningful biological effect.Alternatively, the fixed value of "x" may represent a large effect.Green et al. (2013) discusses a range of examples where statistically significant effects can occur well below a 10% effect, and such effects are likely of biological importance.In this case, using an EC10 may result in highly nonconservative toxicity estimates that will not provide an adequate surrogate for a true NEC.Although the NSEC can have an "effect" in the sense that it is estimated based on a response value that is less than the mean level of the response for the control (as it is a lower bound estimate of the control), this effect is defined as nonsignificant in the context of the variability observed under the experimental conditions.If the "effect" allowed remains within the bounds of the naturally occurring values in the absence of the toxicant, it seems reasonable to infer that such effects are unlikely to have ecological and/or biological consequence on the population.
It could be argued that the application of NEC and NSEC over ECx values in constructing SSDs may have little realized impact on resulting HC estimates.Iwasaki et al. (2015) compared SSDs constructed using NOECs with those based on EC10s and found that point estimates of HC5s based on EC10s were 1.2 (range of 0.6−1.9)times higher.Invariably, it will be necessary to accommodate a range of metrics for assessing toxicity in SSDs, and studies like those undertaken by Iwasaki et al. (2015) provide reassurance that, on average, the choice of metric is potentially not critical.However, this average outcome bias that depends on toxicity metrics can have ramifications for individual cases in environmental regulation.For example, when HCx values from SSDs are used to manage wastewater discharges, with permits awarded based on the assumption that the discharge is "safe" by a certain number of dilutions required to achieve the HCx.Although a bias of 1.2 may seem small, in a permitting situation if you require 1200 instead of 1000 dilutions, the permit conditions are now being exceeded.If we apply the maximum bias of 1.9 overserved across the studies examined by Iwasaki et al. (2015), the conditions are now being exceeded by nearly double.In that setting the choice of toxicity metric can be of great importance, and a clear framework outlining the required methods to be adopted is essential.
As discussed in Fisher and Fox (2023), the derivation of NSEC depends heavily on the estimation of uncertainty in the α (y-intercept) parameter.This means that the statistical methods used must be appropriate and able to accurately capture the level of variability observed in the control treatment.Furthermore, the NSEC will be sensitive to sample size in a way similar to the NOEC (Fisher & Fox, 2023).This is verified by our simulation study, which clearly demonstrates that, for data based on a sigmoidal model, the NSEC declines (becomes more conservative) as sample size increases.This is expected, because experiments with fewer replication result in greater uncertainty in parameter estimation, including α, which reduces the lowerbound estimate and results in higher estimates of the NSEC value.The impact of experimental design on the estimation of NOEC is well understood (Green et al., 2018) and is directly analogous to the NSEC.Guidance on appropriate experimental design and statistical methods for the estimation of NSEC toxicity values are clearly warranted if these approaches are to be adopted (Fisher & Fox, 2023).In addition, it may be reasonable to adopt a similar approach to that of Negri et al. (2021) where the estimated NSEC values are compared with a relevant ECx level, and the lower of the two adopted to ensure an appropriate level of protection is maintained, depending on the context of the analysis being undertaken.
Although the N(S)EC estimates indicated some dependence on sample size, as well as the number of treatment concentrations, values were lower than the EC10 for even the most poorly designed experiment in our simulation study, suggesting that experimental designs would have to be extremely poor for the NSEC to become a less conservative estimate of toxicity than the commonly used EC10 values and is therefore more appropriate as a measure of no-effect.An interesting result of the simulation study (where true ECx values are known) was that estimates of ECx can also be influenced by sampling effort, with some scenarios resulting in substantial overestimates of ECx values for poor experimental designs.Bias in the estimation of ECx highlights that issues associated with appropriate experimental design are not specific to estimation of NSEC and should be considered carefully by ecotoxicologists more broadly.The logistics and cost associated with large experimental designs may put them out of reach for common regulatory studies.However, many designs that we see in current commercial testing laboratories favor replication within treatments, likely because of the requirement of the one-way ANOVA methods used to generate NOECs.We have found that redistributing replication within treatments across a greater number of treatments can often be achieved with little additional cost.

CONCLUSION
Overall, we have demonstrated that the NSEC method proposed by Fisher and Fox (2023) could provide an effective estimate of NEC that can be used when response data do not demonstrate a clear threshold.Embedded in a model-averaging approach, the NSEC and NEC can be combined to yield estimates of N(S)ECs, along with estimation of their uncertainty within a single analysis framework.The outcome is a framework for CR analysis that is robust to uncertainty in the appropriate model formulation, and for which resulting no-effect toxicity estimates can be confidently integrated into the relevant risk assessment framework, such as the SSD.

FIGURE 1
FIGURE 1The four concentration-response (CR) curves generated for the simulation study, including two no-effect-concentration (NEC) and two sigmoidal curves (based on Equations 1 and 2, respectively).(A) Theoretical curves for all four scenarios.For all scenarios the y-intercept parameter (mean value of the response for the control) was set at 0.9, and for the two NEC models, we used the same exponential decay rate of 0.75.The dashed dark red line indicates the theoretical EC10, and the downward blue arrows the position of the NEC for each of the NEC scenarios.Note that there is no theoretical NEC for smooth sigmoidal curves.Colored tick marks along the concentration (x) axis show the placement of the simulated treatment locations for the eight (green) and 12 treatment (dark blue) designs.(B) Plotted realizations for each scenario for all four scenarios, showing an example dataset for a design with eight treatments (dark blue circles) and a design with 12 treatments (green circles).The upper row shows data simulated using the least replicated design (five replicates with 10 trials each), and the lower row shows data for the most replicated design (10 replicates with 20 trials each).

FIGURE 2
FIGURE 2 Deviation information criterion (DIC)-based model weight for the underlying data generating model (Equation1or 2) relative to the alternative model when fitted using jagsNEC(Fisher et al., 2020) to data simulated according to four theoretical curves, including two no-effect-concentration (NEC) threshold models (NEC 1 and NEC 2, Equation1) and two sigmoidal models (Sigmoidal 1 and Sigmoidal 2, Equation 2; see Table2for details).Simulations were run for a range of sampling designs, including eight or 12 treatment levels, five or 10 replicates within each treatment, and 10 or 20 binomial trials (individual survival estimates).The upper plots (A) show boxplots based on all simulation outcomes, whereas the lower plots (B) show the median weight across simulations as a function of the total number of replicates

FIGURE 5
FIGURE 5 Concentration-response relationships for the effects of copper on the Antarctic microalga Cryothecomonas armigera (Koppel et al., 2017).The model average (A) and contributing individual models (only those with weights >0.1, B-D) are shown.The vertical black line indicates the no-significant-effectconcentration (NSEC) estimated from each model (B-D), and the vertical purple line indicates the model averaged "no-effect" concentration (A), denoted as the N(S)EC to indicate it is a weighted average of both NEC and NSEC values

Integr
Environ Assess Manag 2024:279-293 © 2023 Commonwealth of Australia and The Authors wileyonlinelibrary.com/journal/ieamTABLE 4 Toxicity estimates based on two case studies, showing NEC, N(S), EC1, EC10, and the estimated % effect value of the reported N(

FIGURE 6
FIGURE 6 Concentration-response relationships for the effects of 40% weathered condensate water accommodated fraction (WAF) on four tropical marine species.Measured time-weighted average concentrations are expressed as total aromatic hydrocarbons (TAH) on a log scale.Curves are model averaged Bayesian nonlinear Beta model fits with 95% credible intervals indicated by the shaded ribbon where all models are included "All" or only no-effect-concentration (NEC) models are included "NEC."Binomial response data are the proportion of successes (Acropora millepora, and Stomopneustes variolaris; B, D); growth-rate response data taking values greater than 1 are normalized relative to the maximum value (Nassarius dorsatus; C); and growth-rate response data taking values less than 1 are modeled on the original scale (Amphibalanus amphitrite; A).The vertical lines indicate the estimated NEC, EC10, and model averaged "no-effect" N(S)EC values, respectively.The model averaged "no-effect" is denoted as the N(S)EC to indicate it is a weighted average of both NEC and no-significant-effect-concentration (NSEC) values.Note the individual NEC, EC10, and N(S)EC values are similar and therefore difficult to distinguish for (A)-(C)

TABLE 1
Toxicity estimates currently used for estimating no and low-effects for using in SSD modeling

TABLE 2
Scenario parameters used in simulations. ; Note: See the Supporting Information for details of model formulations for each model.