Microplastics could be marginally more hazardous than natural suspended solids – A meta-analysis

Microplastics (MP) are perceived as a threat to aquatic ecosystems but bear many similarities to suspended sediments, which are often considered less harmful. It is therefore pertinent to determine if, and to what extent, MP differ from other particles occurring in aquatic ecosystems in terms of their adverse effects. We applied meta-regressions to toxicity data extracted from the literature and harmonized the data to construct Species Sensitivity Distributions (SSDs) for both types of particles. The results were largely inconclusive due to high uncertainty, but the central tendencies of our estimates still indicate that MP could be marginally more hazardous than suspended sediments. In part, the high uncertainty stems from the general lack of comparable experimental studies and dose-dependent point estimates. We therefore argue that until more comparable data are presented, risk assessors should take a precautionary approach and treat MP in the 1-1000 µm size range as marginally more hazardous to aquatic organisms capable of ingesting such particles.


Introduction
Microplastic pollution has emerged as a potential threat to the environment. This has spurred the development of a rapidly expanding research field aiming to quantify the hazard and risk of these pollutants. Assessments of both hazards and risks are complicated by (1) the heterogeneous nature of microplastics (MP), (2) the lack of standardized test methods (Gerdes et al., 2019; Gouin et al., 2019; Redondo-Hasselerharm et al., 2018), (3) the general difficulty in identifying and quantifying MP in complex environmental samples (Cowger et al., 2020; Lusher et al., 2020), and (4) the lack of data comparability driven by inconsistent reporting of MP characteristics (Cowger et al., 2020; Provencher et al., 2019). As a consequence, quantitative risk assessments of MP (Adam et al., 2021, 2019; Besseling et al., 2019; Burns and Boxall, 2018; Everaert et al., 2020, 2018; Yang and Nowack, 2020) have been criticized for the lack of alignment between exposure and hazard data (Koelmans et al., 2020). More specifically, the problem stems from mismatches between the size, shape and density of particles used in ecotoxicological test assays and those actually quantified in the environment.
Drawing inferences from such data is difficult, and Koelmans et al. (2020) proposed to overcome these issues by rescaling hazard and exposure data to a comparable distribution of particles. This method rests on the assumption that MP are inert particles and that the main mode of toxic action is food dilution, implying that the physicochemical properties of the particles are less important (Mehinto et al., 2022). Assuming that food dilution predominates as a major mechanism also implies that MP and any other non-food particles present in the environment are analogous with regard to their effects. In fact, naturally occurring, non-palatable particles like suspended sediments (SS), chitin and cellulose are known to induce similar effects as MP in aquatic organisms (Gordon and Palmer, 2015; Newcombe and Macdonald, 1991; Ogonowski et al., 2018). If MP have unique adverse effects and modes of toxic action, they would need to be studied, assessed, and managed as a specific group of contaminants; otherwise, they should be considered an integral component of suspended matter. Whether MP are unique with regard to their adverse effects, or toxicologically identical to other non-food particles, is hence an important question for aquatic toxicology.
A wide range of experimental conditions has been used for testing the adverse effects of anthropogenic particles to aquatic organisms. This has resulted in a high level of heterogeneity in exposure conditions and experimental designs that, on the one hand, provides insight into the likely effects of various exposure scenarios. On the other hand, it hampers comparability across studies. The variability in test conditions means that any analysis of literature data aiming to assess the relative hazard between different particulate stressors needs to be able to account for several factors pertinent to (1) the test materials, (2) the sensitivity of the test species and specific endpoints, and (3) the experimental conditions (e.g., exposure duration).
The most straightforward approach to achieve comparable data is to subset data to one or several common denominators. However, this approach removes valuable information and is rarely feasible in practice due to the scarcity of studies with comparable experimental designs. One way to solve this misalignment is to statistically account for the variability and normalize the data to a common scale, which can be achieved using various multiple regression techniques (Sun et al., 2021; Thompson and Higgins, 2002). For example, if the goal is to compare the toxicities between different materials that also differ in particle size (a characteristic known to affect toxicity), we can statistically control for the difference in particle size. This approach will give a comparable test of the toxicity of the materials at a fixed size.
Here, we address the question of whether MP are toxicologically different from naturally occurring particles by evaluating the relative hazard of plastic particles and suspended sediments in the size range 1-1000 µm based on a comprehensive set of published ecotoxicological data. Using a series of probabilistic approaches coupled with data standardization, we account for the uncertainty in the data while keeping factors influential to the biological responses at fixed levels. This allows us to obtain more comparable measures of toxicity and, consequently, perform an improved hazard assessment.

Literature review and compilation of ecotoxicity data for microplastics and suspended sediments
Toxicity data on MP were primarily collected by means of a systematic review covering the period January 2016 to February 2019, published by the Norwegian Scientific Committee for Food and Environment (VKM, 2019). Details regarding the search criteria and the selection process are provided in Appendix I of the VKM report, and the raw data are provided as Supporting Information data table 1.
Data collection for toxicity studies with SS was mainly based on studies from previous data compilations and reviews (Gordon and Palmer, 2015; Ogonowski et al., 2018) but complemented with additional searches in Web of Science, Scopus and Google Scholar using the following search terms: "suspended solids", "suspended matter", "suspended material", "sediment", "mineral particles", "effects", "aquatic", "filter feeder", alone or in combination. In contrast to the search for MP toxicity data, no restrictions on publication date were made. For some older studies listed by Gordon and Palmer (2015), the original manuscripts were unavailable. In those cases, we used the toxicity data as reported by Gordon and Palmer (2015).
To obtain more recent data on both SS and MP toxicity, we performed an updated bibliographic search in May 2023. We targeted toxicity studies from 2019-2023 in which both particle types (MP and a non-plastic reference particle) were used in the same experiment. We used Google Scholar interfaced with Harzing's Publish or Perish software v.8.6.4198.8332 (Tarma Software Research Ltd, UK) with the search terms "microplastics", "microplastic effect", "toxicity", "hazard", "adverse effect", "control particle", "reference particle", "suspended sediment", "suspended solids". The search generated 660 research papers. To find suitable studies matching our selection criteria (see Section 2.2), we screened the titles and abstracts of all papers. In the end, we extracted data from seven studies that had used relevant non-plastic controls in their experiments.
Since manual data extraction for systematic reviews is prone to errors (Mathes et al., 2017), we conducted an additional error screening after the dataset had been compiled. Twenty percent of the data entries (rows in the dataset, Supporting Information data table 1) were selected at random to be reassessed by three of the co-authors. We found minor errors relevant to the data analyses in 6 out of 40 endpoints. Out of these six errors, five occurred in one publication and were related to the assessment factor. In the other case, the LOEC value was erroneously lower by one decimal point. As we account for publication ID in the analysis, we conclude that the risk of systematic errors is small and unlikely to affect our conclusions.
The particle size, either reported as the mean/median size or determined by visual inspection of the actual size distribution in each study, was assigned to sediment grain size classes according to the Wentworth scale (Wentworth, 1922). Size distributions that spanned several grain size classes were assigned to the most predominant class. The division into size classes was necessary as some studies, in particular the SS studies, lacked clearly defined size distributions. For reasons of consistency, we used a nominal particle density for the material used in the studies (Supporting Information data table 1). The only exception to this rule was for polymers of undisclosed chemical structure or type. In those cases, we used the actual, reported density.
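The class assignment described above can be sketched in a few lines. This is a minimal illustration, not the procedure used to build the dataset: the function name and the coarse three-class split are hypothetical, and the boundaries are assumed from the commonly cited Wentworth limits, with the lower clay bound set to the 0.98 µm cut-off used in this study.

```python
# Coarse grain size classes as (name, lower bound, upper bound) in µm.
# ASSUMPTION: three coarse classes with commonly cited Wentworth limits;
# the study itself may have used a finer subdivision.
WENTWORTH_CLASSES = [
    ("clay", 0.98, 3.9),
    ("silt", 3.9, 62.5),
    ("sand", 62.5, 2000.0),
]


def grain_size_class(size_um: float) -> str:
    """Return the grain size class for a mean/median particle size in µm."""
    for name, lower, upper in WENTWORTH_CLASSES:
        if lower <= size_um < upper:
            return name
    raise ValueError(f"{size_um} µm is outside the classes considered here")


if __name__ == "__main__":
    for size in (2.0, 10.0, 500.0):
        print(size, grain_size_class(size))
```

A size distribution spanning several classes would be assigned to its predominant class before calling such a function; that step is not shown here.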
The primary aim of the literature search was to compile a dataset for hazard assessment. For this purpose, we extracted acute and chronic effect concentrations reported in each study in the form of the lowest observed effect concentrations (LOEC), effect concentrations derived from dose-response relationships (EC10, EC20, EC50, LD50), and no-effect concentrations defined as the highest observed-no-effect concentration (HONEC). The raw toxicity data in the form of varying dose descriptors other than the no-observed effect concentrations (NOECs) were converted to estimated NOECs (eNOEC) using a conversion factor specific to each descriptor (Adam et al., 2019).

Data subsetting to make datasets comparable for hazard assessment
The compiled dataset was restricted to studies in which aquatic organisms were directly exposed to MP or SS added to the medium. Thus, studies in which the particles were incorporated into food or delivered via trophic transfer were excluded. Studies in which the test medium or particles were deliberately spiked with a toxic chemical were excluded since this would exacerbate toxicity and bias our comparison. Data on fibrous particles were omitted since this particle shape only occurred sporadically, particularly in the SS data. Studies employing particle sizes < 0.98 µm (clay-sized particles) were also removed since the mode of toxic action for nano-sized particles may be different due to their capacity to pass biological barriers and cell membranes (Matthews et al., 2021). Since most empirical evidence points towards food dilution being an important mode of action for microparticles > 1 µm (de Ruijter et al., 2020; Mehinto et al., 2022), we further restricted the data to only contain test organisms where the main route of exposure was through ingestion. This also excluded toxicity data involving primary producers, non-feeding larval stages, eggs, and embryos. We only considered higher levels of biological organization (Galloway et al., 2017), i.e., individual- and population-level endpoints limited to growth, mortality and reproduction, since lower-level endpoints may represent transient responses. The subsetted data used for analysis consisted of 50 studies, seven of which used particle controls (MP = 35, SS = 22), and 299 biological endpoints (MP = 193, SS = 106). Eighteen of the endpoints in the SS category consisted of exposures to natural sediment that contained organic carbon, which potentially could be utilized as food. The remaining 88 endpoints used pure minerals, mineral mixtures or some other sort of natural control particle like cellulose. The dataset used in the analyses is provided as supplemental material (Supporting Information data table 1).

Conversion from numerical and mass-based concentrations to volumetric units
The choice of dose metric, such as volume, mass, or number of particles, depends on a toxicant's main mode of action. The appropriateness of a particular dose metric for solid substances is still under debate, particularly in the field of nanomaterial toxicology (Delmaar et al., 2015; Teunenbroek et al., 2017). Assuming that MP as well as SS mainly affect organisms by means of food dilution, the correct dose metric should be based on volume of particles per volume of medium.
For studies where spherical particles were used and effect concentrations were reported as particle numbers per volume, we first converted the numerical concentrations to a mass-based concentration according to the following equation:

$$C_{mass} = C_{num} \cdot \frac{4}{3}\pi r^{3} \cdot D \cdot 10^{-9}$$

Second, the mass-based concentrations were converted to volumetric concentrations as

$$C_{vol} = \frac{C_{mass}}{D}$$

where $C_{mass}$ = the mass-based concentration (mg L⁻¹), $C_{num}$ = the numerical concentration (number of particles L⁻¹), $D$ = polymer density (g cm⁻³), $r$ = the particle radius in µm and $C_{vol}$ = the volumetric concentration (mm³ L⁻¹). The factor $10^{-9}$ follows from the stated units (1 µm³ = 10⁻¹² cm³ and 1 g = 10³ mg).
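As a worked illustration of the two conversion steps, the sketch below applies them to a single hypothetical exposure. The function names and the example values are assumptions made for illustration; the unit factor of 1e-9 is derived from the units stated above (µm, g cm⁻³, mg L⁻¹) rather than taken from the original analysis scripts.

```python
import math


def number_to_mass_conc(c_num: float, radius_um: float, density_g_cm3: float) -> float:
    """Convert particles per litre to mg per litre for spherical particles."""
    volume_um3 = (4.0 / 3.0) * math.pi * radius_um ** 3
    # 1 µm^3 = 1e-12 cm^3 and 1 g = 1e3 mg, hence the net factor 1e-9.
    return c_num * volume_um3 * density_g_cm3 * 1e-9


def mass_to_volume_conc(c_mass: float, density_g_cm3: float) -> float:
    """Convert mg per litre to mm^3 per litre (1 mg at 1 g/cm^3 occupies 1 mm^3)."""
    return c_mass / density_g_cm3


if __name__ == "__main__":
    # Hypothetical example: 1e6 spheres per litre, radius 5 µm, density 1.05 g/cm^3.
    c_mass = number_to_mass_conc(1e6, 5.0, 1.05)
    c_vol = mass_to_volume_conc(c_mass, 1.05)
    print(f"C_mass = {c_mass:.4f} mg/L, C_vol = {c_vol:.4f} mm^3/L")
```

Note that the density cancels when both steps are chained, so the volumetric concentration of spheres depends only on particle number and radius; density matters only when the starting point is a mass-based concentration.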

Probabilistic modeling of the relative toxicity of microplastics and suspended sediments
We used two slightly different statistical models to compare the relative toxicity of MP and SS and to assess the robustness of the predicted hazard. Both methods have a probabilistic foundation to account for the uncertainties in the data but differ somewhat conceptually. Since MP are considered emerging pollutants, we wanted our results to be conservative in terms of not providing a false negative conclusion. We hence used an alpha level of 11% rather than the conventional 5% when comparing toxicities between MP and SS. Moreover, the slightly higher alpha level has been deemed more stable in Bayesian analyses (Kruschke, 2015).

Model 1: The hierarchical standardized pSSD+ model
To compare the toxicities across species, we followed the probabilistic SSD approach first proposed by Gottschalk and Nowack (2013) and more recently adopted by Adam et al. (2021, 2019) for the risk assessment of MP. In brief, the pSSD+ model developed by Adam and colleagues does not assume the data to fit any specific theoretical distribution and avoids the loss of valuable data by incorporating all available toxicity data at the species level instead of using mean estimates.
In the pSSD+ model, the uncertainties in the underlying data are accounted for using arbitrary uncertainty factors. Instead of using such an ad hoc approach, the heterogeneity can be modeled from the data using well-established multiple regression techniques (Sun et al., 2021; Thompson and Higgins, 2002). Such an approach also has the advantage of estimating the toxicity for specific particle sizes, shapes, exposure times and other parameters. Thus, we utilized a two-step hierarchical approach to model the SSDs. In the first step, we used a Bayesian regression as implemented in the R-package brms (Bürkner, 2017) to predict the toxicity of MP and SS for a fixed set of parameters based on the collated literature data. In other words, we estimated the toxicity for each particle type separately, while keeping particle size, shape and exposure duration constant, making the data comparable. The probabilistic model also enabled the uncertainty in the estimated toxicity to propagate through all analytical steps. The precursory standardization model can be described as a basic linear regression model in which β are the regression coefficients, ε is the error term and i is a position indicator for the vectors. The toxicity data, translated into volumetric concentrations [mm³ L⁻¹] (Koelmans et al., 2020), were standardized by fixing the continuous predictors to specific values. Exposure duration was set to 28 days, which was the upper threshold in our data for a chronic exposure (Table S2). Grain size class was set to "1" (corresponding to clay, 0.98-3.9 µm), which is the smallest and most common size class in our dataset. Particle shape was set to "irregular" since this shape encompasses both MP and natural suspended solid particles. Material type and Particle shape were modeled as a composite categorical factor (equivalent to a factorial interaction term) with four levels (SS-irregular, SS-spherical, MP-irregular and MP-spherical). In contrast to standard SSDs (Aldenberg and Slob, 1993; Kooijman, 1987), we grouped species-level data to taxonomic class level in order not to overparameterize the model. This was motivated by the untested assumption that closely related species have similar feeding modes and sensitivities to particle exposures. Specification of the priors is provided in the Supporting Information and Table S3. Model residuals for the precursory standardization model were evaluated visually (Fig. S1 A).
In the second step, the predicted and standardized toxicity values were used as input to the pSSD+ model described by Adam et al. (2021, 2019) to produce two SSDs, one for MP and another for SS. To retain the uncertainty in the toxicity estimate throughout the analytical process, we used the full posterior distribution for each predicted toxicity value as input in the pSSD+ model. The relative hazard of MP and SS was evaluated by comparison of the 5th percentile hazardous concentration (HC5) from the two standardized pSSD+ models and by comparing the full posterior distributions of the HC5 values. Following the studies by Adam et al. (2021, 2019), HC5 was considered equal to the predicted no-effect concentration (PNEC). The pSSD+ model was generated using 10,000 random permutations.
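The idea of propagating full posterior distributions into an HC5 distribution can be illustrated with a simplified resampling sketch. This is not the pSSD+ implementation of Adam et al.; it is a stand-in that assumes one set of posterior draws per taxonomic group, an empirical (distribution-free) 5th percentile with linear interpolation, and hypothetical function names and inputs.

```python
import random


def percentile(values, q):
    """Empirical percentile with linear interpolation; q is in [0, 100]."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


def hc5_distribution(posteriors, n_iter=10_000, seed=1):
    """Resample one toxicity value per group and take the 5th percentile.

    posteriors: dict mapping a group label to a list of posterior draws
    (e.g. log10 standardized toxicity values). Returns n_iter HC5 values.
    """
    rng = random.Random(seed)
    hc5_values = []
    for _ in range(n_iter):
        draw = [rng.choice(samples) for samples in posteriors.values()]
        hc5_values.append(percentile(draw, 5))
    return hc5_values


if __name__ == "__main__":
    # Hypothetical posterior draws for three taxonomic classes (log10 scale).
    demo = {
        "Branchiopoda": [0.8, 1.0, 1.2],
        "Bivalvia": [1.9, 2.0, 2.3],
        "Actinopterygii": [2.8, 3.0, 3.1],
    }
    hc5 = hc5_distribution(demo, n_iter=1000)
    print("mean HC5 (log10):", sum(hc5) / len(hc5))
```

Running such a routine once for MP and once for SS would yield two HC5 distributions whose difference can then be summarized, which is the comparison described above.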

Model 2: Alternative approach to compare the hazard of microplastic and suspended sediments
In order to validate our approach, we also analyzed our data in an alternative framework using a Bayesian mixed model. The model was used to predict a NOEC (pNOEC) for MP and SS while accounting for the variability in experimental conditions, particle characteristics, exposure duration and variation across taxonomic groups and studies. In contrast to the pSSD+ model, this model accounted for the fact that no-effect studies (HONEC) are right-censored, and that studies reporting LOECs are left-censored when the LOEC equals the lowest test concentration, by explicitly incorporating this uncertainty into the model. Hence, the intention was twofold: (1) to compare chronic NOEC posterior distributions (the relative toxicity) between MP and SS while statistically controlling for other explanatory variables, and (2) to identify potential drivers of the toxicity.
The criteria and statistical approaches for determining hazardous or safe concentrations differ among studies, which makes toxicity data not directly comparable. To align toxicity data to the same scale, it is common to apply uncertainty factors (UF) to derive the chronic NOEC. Although there is no consensus on what these UFs should be, Wigger et al. (2020) suggested a range of conversion factors: one for the dose descriptor conversion (UF_dose, Table S4) and another to convert acute to chronic toxicity data (UF_time). Along these lines, we used the estimated NOEC (eNOEC) as the response variable in the model, calculated by dividing the reported dose descriptors by the appropriate uncertainty factor (UF_dose). Contrary to the common approach (Adam et al., 2021, 2019; Wigger et al., 2020), we did not multiply UF_dose with UF_time to derive the chronic NOEC. Instead, we modeled the effect of exposure duration as a predictor in the model. By doing so, we estimated the effect directly from the data instead of using an arbitrary UF with an ad hoc associated uncertainty, thus avoiding the a priori assumption of a positive relationship between exposure time and toxicity.
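The eNOEC calculation amounts to a lookup and a division, as the following sketch shows. The factor values in the dictionary are placeholders chosen purely for illustration and are not the values of Table S4; the function and variable names are likewise hypothetical.

```python
# PLACEHOLDER uncertainty factors for illustration only.
# The factors actually used in the study are listed in Table S4.
UF_DOSE = {
    "NOEC": 1.0,
    "HONEC": 1.0,
    "LOEC": 2.0,
    "EC50": 10.0,
}


def estimated_noec(value: float, descriptor: str, uf_dose=UF_DOSE) -> float:
    """Divide a reported dose descriptor by its descriptor-specific factor.

    No acute-to-chronic factor (UF_time) is applied, mirroring the text:
    exposure duration enters the regression as a predictor instead.
    """
    if descriptor not in uf_dose:
        raise KeyError(f"no uncertainty factor defined for {descriptor!r}")
    return value / uf_dose[descriptor]


if __name__ == "__main__":
    print(estimated_noec(10.0, "EC50"))
    print(estimated_noec(4.0, "LOEC"))
```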
Toxicity values that equal the highest or the lowest employed test concentrations are so-called "censored" data. This means that the true hazardous concentration lies outside the tested concentration range and is unknown. The censoring of HONEC and LOEC values was accommodated using the cens function in the R-package brms. Apart from the focal variable Material type, the variables Feeding strategy, Particle exposure, Grain size class and Particle shape were included as covariates in the model because they are intimately linked to exposure, food processing and the organism's sensitivity to particles. Grain size class was modeled as a monotonic variable due to the ordered nature of the size classes (Bürkner and Charpentier, 2015). To account for the variability between studies, we considered Study as a random effect on the intercept. To account for variability across species, we initially also included Species as a random intercept term together with the interaction between Species and Study, but this resulted in an overly complex model and an unsatisfactorily low effective sample size for the interaction term. The final model thus omitted the Species term but retained the interaction term, which rests on the assumption that the sensitivities of species within a particular study are more similar than across studies (Table S2). The error structure was modeled as a t-distribution to accommodate the presence of outlying data points further out in the tails of the distribution. For all models, we ran four chains with 5000 iterations each after a burn-in of 2500 iterations was discarded. Thinning was set to one. Markov chain Monte Carlo (MCMC) convergence to the equilibrium distribution was monitored visually using the bayesplot package (Gabry et al., 2019) and by evaluation of R̂ values and effective sample sizes. We found no sign of failed convergence, and R̂ values were equal to one, indicating that the MCMC chains had converged at similar values. Model residuals for the Bayesian mixed model were evaluated visually (Fig. S1 B). Specification of the priors is provided in the Supporting Information and Table S5.
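How censoring enters a likelihood can be shown with a short sketch. It is a simplification under stated assumptions: a normal error on the log10 scale is used here for brevity, whereas the model described above used a t-distributed error and the censoring machinery of brms; all names and example values are hypothetical.

```python
import math


def normal_logpdf(y, mu, sigma):
    """Log density of a normal distribution."""
    z = (y - mu) / sigma
    return -0.5 * math.log(2.0 * math.pi) - math.log(sigma) - 0.5 * z * z


def normal_cdf(y, mu, sigma):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((y - mu) / (sigma * math.sqrt(2.0))))


def censored_loglik(observations, mu, sigma):
    """Sum log-likelihood contributions for censored and uncensored values.

    observations: iterable of (log10_value, censoring), where censoring is
    'right' (true value above the reported one, e.g. HONEC), 'left' (true
    value below the reported one, e.g. LOEC at the lowest test
    concentration) or 'none'.
    """
    total = 0.0
    for y, censoring in observations:
        if censoring == "right":
            total += math.log(1.0 - normal_cdf(y, mu, sigma))
        elif censoring == "left":
            total += math.log(normal_cdf(y, mu, sigma))
        else:
            total += normal_logpdf(y, mu, sigma)
    return total


if __name__ == "__main__":
    data = [(1.0, "none"), (2.0, "right"), (0.5, "left")]
    print(censored_loglik(data, mu=1.0, sigma=1.0))
```

The key point is that a censored observation contributes the probability of the true value lying above or below the reported value, rather than a density at that value, so no-effect studies still inform the estimate without being treated as exact effect concentrations.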
To test whether our model parameters could be considered credibly distinct from zero, we plotted the posterior distributions against the Region of Practical Equivalence (ROPE, Fig. S2). As a complement to the ROPE, we also conducted one-sided hypothesis tests using the hypothesis function in brms to derive evidence ratios and posterior probabilities for our hypotheses against their alternatives.
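Both summaries can be computed directly from posterior draws, as sketched below. The sketch assumes that a one-sided evidence ratio is the posterior probability of the hypothesis divided by that of its alternative; the ROPE limits and the example draws are hypothetical and do not correspond to the values used in the study.

```python
def one_sided_summary(samples):
    """Posterior probability that a parameter is > 0, and its evidence ratio."""
    prob = sum(1 for s in samples if s > 0) / len(samples)
    ratio = prob / (1.0 - prob) if prob < 1.0 else float("inf")
    return prob, ratio


def share_in_rope(samples, low, high):
    """Fraction of posterior draws falling inside the ROPE [low, high]."""
    return sum(1 for s in samples if low <= s <= high) / len(samples)


if __name__ == "__main__":
    # Hypothetical posterior draws of a difference on the log10 scale.
    draws = [-1.0, 0.5, 1.0, 2.0]
    prob, ratio = one_sided_summary(draws)
    print("P(diff > 0) =", prob, "evidence ratio =", ratio)
    print("share in ROPE =", share_in_rope(draws, -0.6, 0.6))
```

For instance, a one-sided posterior probability of 93% corresponds to an evidence ratio of roughly 13 under this definition.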

Microplastic toxicity compared to suspended sediments
We used two different approaches to assess the relative hazard of MP compared to SS. The two models (Model 1 and 2) converged at similar patterns, indicating that the average MP used in ecotoxicological tests could be 3-8 times more harmful than a natural SS particle, depending on the statistical method used. The uncertainty of these estimates was, however, great and the results should be interpreted with caution.
Based on the Bayesian mixed model (Model 2), the standardized mean NOEC for MP on the data scale (pNOEC) was approximately 8-fold lower compared to that of SS, but the credible intervals (Table S1) for the groups overlapped and the partial coefficient for the difference fell slightly within the ROPE (Fig. S2), indicating no significant difference between MP and SS. This aligned with a one-sided hypothesis test where the 89% probability distribution of the difference in posterior means contained zero, suggesting that MP could be as harmful as SS. However, the mode of the difference in posterior distributions was centered around one unit on the log10-scale, corresponding to approximately a ten-fold difference in toxicity (Fig. 1), and the one-sided probability of irregular MP being more hazardous than SS was 93%. This pattern was consistent with the standardized pSSD+ model (Model 1), where the PNEC distributions for MP and SS overlapped (Fig. 2) and where the one-sided 89% probability distribution of the difference in PNEC posteriors also overlapped zero (Fig. 3). The mean PNEC for MP was, however, 3.2 times lower than that of SS. This is also largely in line with a previous assessment based on a smaller dataset where the LOEC (at individual and population level) was significantly lower for MP compared to SS (Ogonowski et al., 2018). Our results are also consistent with a more recent meta-analysis performed on studies where natural reference particles were used in the test controls (Doyle et al., 2022). Here, the authors found a relatively small but significant increase in average toxicity for MP as compared to SS, although the difference was surrounded by a high degree of uncertainty and prediction intervals overlapped zero. It is, however, important to consider that the apparent differences in hazard can be attributed to causes other than actual differences in toxicity, such as differences in experimental designs and exposure conditions that are difficult to account for statistically.

The use of model particles in test assays yields unrealistic toxicity estimates
Two aspects that we did not capture statistically may result in a higher toxicity of MP compared to SS. Both relate to the use of pristine MP versus SS in toxicity studies. First, MP will leach plastic chemicals that can, at least in some cases, drive the overall MP toxicity (Beiras et al., 2021; Heinrich et al., 2020; Martínez-Gómez et al., 2017; Zimmermann et al., 2020). In addition, commercially available MP can contain preservatives (e.g., sodium azide) that exacerbate the particles' toxicity (Yang and Nowack, 2020). Accordingly, the toxicity data we used here may include the toxicity of plastic chemicals and preservatives, which does not occur in studies with natural particles. Like MP, SS can also contain chemicals adsorbed from their environment, such as polycyclic aromatic hydrocarbons, polychlorinated biphenyls (PCBs) and other persistent pollutants (Rügner et al., 2019; Santiago et al., 1993). Thus, SS toxicity may also be caused by their physical and chemical composition (Lu et al., 2021; Rivetti et al., 2015). However, since many of our SS studies used pristine, unconditioned mineral particles (78% of the SS endpoints, SI data table 1), they likely underestimate the toxicity that would occur under natural conditions. Conversely, toxic preservatives and other plastic chemicals such as UV stabilizers, surfactants and monomer residues, which are specific to some MP studies, can leach from the MP during exposure and induce chemical toxicity. Indeed, Yang and Nowack (2020) demonstrated for nanoplastics that removing toxicity data for particles that contained sodium azide resulted in higher PNECs (i.e., lower hazard).

[Figure caption, beginning missing] [...] S2). The shaded area shows the 89% probability (one-sided test) for MP to have a lower pNOEC compared to SS. The data is presented on the log10-scale.
To compare MP and SS particles on equal terms, the potential effects of leachable chemicals should be accounted for, either by removing data for particles containing chemicals (e.g., if they have not been washed) from the meta-analysis or by modeling it statistically (e.g., as covariates in the meta-regression). However, to do so completely and without bias would be practically impossible because the presence of reported toxic preservatives is likely non-random and skewed towards commercially available MP. In addition, the latter may also contain a multitude of other proprietary and undisclosed chemicals which cannot be easily accounted for (Heinrich et al., 2020). Hence, we chose to treat the chemical component as an integral part of the toxic response, not discriminating between specific physical and chemical toxicity. This results in a higher-than-expected average toxicity but also higher variability associated with that estimate.
A second aspect is that MP aged under natural conditions will be more comparable to SS and have a different toxicity than the pristine MP that dominate our dataset. This has been demonstrated experimentally, with some studies reporting a lower toxicity of aged vs. pristine MP (Schür et al., 2021; Zou et al., 2020) and other studies reporting an increase in toxicity after weathering (Zhang et al., 2022, 2021). Although the causes of the altered toxicity are not fully understood, they seem to be related to the dissociation of plastic chemicals, the formation of a protein corona and biofilms as well as the fragmentation into smaller nano-sized particles. The latter can increase toxicity of the overall particle mixture during ageing but may reduce the toxicity of the particles in the same size fraction as pristine MP (Zhang et al., 2022, 2021).

Fig. 2. Probabilistic species sensitivity distribution based on volume-based toxicity data corrected for inter-study differences in particle characteristics and exposure conditions; A) for suspended sediments and B) for microplastics. The dark shaded horizontal bars represent the 25-75th percentile ranges and the lighter shaded area the 5-95th percentile range. PNEC = the Predicted No-Effect Concentration, which is equivalent to the hazardous concentration for 5% of the species (HC5). The inserted graphics show the posterior probability distributions for the PNEC.
The sorption of biomolecules on the particle surface (eco-corona) and ultimately biofilm formation (Galloway et al., 2017) may promote particle aggregation and larger average particle size (Michels et al., 2018; Motiei et al., 2021; Porter et al., 2018). Although the same would be theoretically true for mineral particles, there is evidence to suggest that these particles do not aggregate to the same extent as MP (Motiei et al., 2021). This shift in particle sizes may lead to the formation of MP aggregates that are too large to be consumed, which may decrease their bioavailability and hence their capacity to cause adverse effects. In addition, the eco-corona or biofilm on particles can provide extra nutrition and, thus, counteract food dilution effects for some types of aging (incubation in nutrient-rich raw wastewater) but not for others (e.g., treated wastewater and river water, which are lower in nutrients and microbial activity) (Amariei et al., 2022). Whether and to what extent such modulation also applies to SS is not clear from the literature. Given the lack of studies on the toxicity of aged MP, we could not account for this factor in our meta-analysis. Consequently, our MP and SS toxicity estimates are likely not directly translatable to natural systems since they reflect somewhat artificial conditions.

Differences in test-concentration ranges affect the predicted hazard
The pSSD+ model does not account for the fact that no-effect studies are right-censored (undefined upper effect concentration) or that LOEC values can be left-censored if they equal the lowest used test concentration (undefined lower effect concentration). This may lead to an over- or underestimation in toxicity, respectively. In our data collection, the distribution of no-effect data was unequal across MP and SS studies, with a higher frequency of such data points in the MP data compared to the SS data (73.1% vs. 40.6%). Also, neither of our models accounts for the fact that the experiments were conducted using different concentration ranges. Experiments involving natural suspended solids or minerals usually employ test concentrations in the order of g L⁻¹ (Cohen et al., 2014) to cover a natural range of concentrations. Most current MP studies, on the other hand, use orders of magnitude lower concentrations due to the desire to test "environmentally relevant" concentrations. In fact, in our data, the average highest concentration for SS was two orders of magnitude higher compared to the MP studies (Table S6). Although many MP studies have been criticized for using unrealistically high test concentrations (Connors et al., 2017; Cunningham and Sigwart, 2019; Lenz et al., 2016), these concentrations are still much lower than naturally occurring levels of SS. The difference in concentration ranges poses a problem when the objective is to compare the hazard of different toxicants based on dose metrics like the LOEC or the NOEC, which are directly dependent on the range of test concentrations used (Fox and Landis, 2016; Laskowski, 1995; van Dam et al., 2012; see also Tables S4 and S6). The high proportion of no-effect studies (HONEC) in the MP data indicates that the hazardous concentrations are likely higher and closer to those of SS than our statistical models suggest.

[Figure caption, beginning missing] [...] S2). Points and lines represent mean values of the posterior distribution. Error bars and blue bands denote the 89% credible interval. The predicted NOEC is shown on the log10-scale.
Even though an uncertainty factor was applied to adjust for the unknown effect concentration, it may not have been large enough. Such disparities in the experimental designs cannot be statistically accounted for unless dose-dependent point estimates are used exclusively (Van Der Hoeven et al., 1997). This was, however, not possible due to the general lack of such data, in particular for MP. Excluding HONEC data from the model was, on the other hand, not feasible either, because the number of available data points decreased drastically, leaving a model that was too complex for the remaining data. Moreover, the removal of censored data (e.g., HONEC) usually results in biased estimates and variances (Bouaziz, n.d.; Turkson et al., 2021).
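Why dropping censored observations biases estimates can be seen from how a censored likelihood is built. The following is a generic sketch, not the implementation used for our models: it assumes normally distributed log10 effect concentrations, and the function `censored_loglik` and the four data points are hypothetical.

```python
import math
from statistics import NormalDist


def censored_loglik(obs, mu, sigma):
    """Log-likelihood of a normal model with censored observations.

    obs: iterable of (value, kind), where value is on the log10 scale and
    kind is "exact", "right" (true value above `value`) or
    "left" (true value below `value`).
    """
    dist = NormalDist(mu, sigma)
    total = 0.0
    for value, kind in obs:
        if kind == "exact":
            total += math.log(dist.pdf(value))
        elif kind == "right":
            # Only P(X > value) is known, e.g. a no-effect result (HONEC).
            total += math.log(1.0 - dist.cdf(value))
        elif kind == "left":
            # Only P(X < value) is known, e.g. LOEC at the lowest test level.
            total += math.log(dist.cdf(value))
        else:
            raise ValueError(f"unknown censoring kind: {kind}")
    return total


# Hypothetical data: two exact values and two right-censored no-effect results.
data = [(1.0, "exact"), (2.0, "exact"), (2.0, "right"), (2.0, "right")]
exact_only = [d for d in data if d[1] == "exact"]

for mu in (1.5, 2.0):
    print(mu, censored_loglik(exact_only, mu, 1.0), censored_loglik(data, mu, 1.0))
```

With the censored points removed, the lower mean (1.5) has the higher likelihood; with them retained, the higher mean (2.0) is favoured. Discarding no-effect results therefore pulls the estimated effect concentration downwards in this toy example.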
The best approach to make data fully compatible would be to perform paired comparisons of different particle types within an experiment where the exposure conditions are the same. The use of natural reference particles in MP testing has recently been advocated (Arp et al., 2021; Connors et al., 2017; Gerdes et al., 2019; Gouin et al., 2019; Ogonowski et al., 2018, 2016; Scherer et al., 2018), but its adoption has so far been comparatively scarce in the scientific literature. Although it may be impossible to find reference particles that perfectly match the MP particles under study, they will to a high degree account for the main effects of food dilution, which impact most relevant biological endpoints (Mehinto et al., 2022; Ogonowski et al., 2018, 2016). Hence, we argue that the use of reference particles such as natural minerals is a way to increase the ecological relevance of ecotoxicological studies since it provides a benchmark for particle toxicity (Doyle et al., 2022; Scherer et al., 2020; Schür et al., 2020). Such setups will also help to identify the particle-specific mechanisms leading to adverse effects.

Drivers of toxicity
The Bayesian mixed model (Model 2) enabled the toxicity data to be standardized and made comparable between MP and SS studies. The heterogeneity in the toxicity data was large: the proportion of variance explained by the model due to variability across studies was on average 26%, while variability across species nested within studies was slightly higher (33% variance explained). Although this level of variation is expected given the wide variety of test materials, species and experimental designs, it contributed to a high degree of uncertainty in the regression coefficients, with the credible intervals all overlapping zero or being close to overlapping zero, indicating a low degree of confidence (Table S1, Fig. S2). In this context, one advantage of Bayesian over frequentist models is their ability to make probabilistic statements regarding the parameter estimates, which allows for a more nuanced interpretation. A closer inspection of the central tendencies of the coefficient posteriors reveals that even though the overall uncertainty was high, the highest probability densities were centered away from zero for several variables (Fig. S2, Fig. 4). Notably, the probability that Grain size class has a positive slope (one-sided evidence ratio) was 98%, suggesting decreasing toxicity (higher pNOEC) with increasing particle size. This is in line with previous observations of MPs in the current size range (Ziajahromi et al., 2018). We can also see that, on average, 52% of the total change in pNOEC due to Grain size class happens between the first two predictor categories (i.e., clay and silt, Table S1), which indicates that the relationship is non-linear. Although the uncertainty around this estimate was high, it did not overlap the ROPE (Fig. S2), suggesting that toxicity decreases between the clay and silt size categories. It is probable that very fine particles have additional effect mechanisms apart from food dilution, such as an obstruction of gas exchange through the gills in fishes and invertebrates (Hess et al., 2017, 2015; Lowe et al., 2015; Watts et al., 2016), clogged feeding appendages in filtrating invertebrates (Cole et al., 2013; Savinelli et al., 2020) or tissue translocation with potential consecutive downstream effects (Haave et al., 2021).
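The probabilistic statements used above (probability of a positive slope, overlap with the ROPE) are simple summaries of posterior draws. The sketch below shows how such summaries can be computed in principle; the helper names, the five example draws and the ROPE limits of ±0.1 are hypothetical, and the ROPE share is taken over all draws, which is a simplification that need not match the exact procedure behind Fig. S2.

```python
import math
from typing import Sequence


def prob_positive(draws: Sequence[float]) -> float:
    """Posterior probability that the coefficient is greater than zero."""
    return sum(d > 0 for d in draws) / len(draws)


def evidence_ratio_positive(draws: Sequence[float]) -> float:
    """Posterior odds of a positive versus a non-positive coefficient."""
    n_pos = sum(d > 0 for d in draws)
    n_rest = len(draws) - n_pos
    return math.inf if n_rest == 0 else n_pos / n_rest


def share_in_rope(draws: Sequence[float], low: float, high: float) -> float:
    """Fraction of posterior draws inside the region of practical equivalence."""
    return sum(low <= d <= high for d in draws) / len(draws)


# Hypothetical posterior draws of a slope (a real model yields thousands).
draws = [-0.1, 0.05, 0.2, 0.3, 0.5]

print(prob_positive(draws))             # share of draws above zero
print(evidence_ratio_positive(draws))   # odds in favour of a positive slope
print(share_in_rope(draws, -0.1, 0.1))  # share of draws within the ROPE
```

A coefficient can thus have a high probability of being positive while its credible interval still overlaps zero or the ROPE, which is the situation described for several predictors here.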
As for the variable Grain size class, we saw the same general pattern for Exposure duration (Fig. 4). Although the parameter estimate fell slightly within the ROPE (1.6%; Fig. S2), the probability of a positive slope was 98%, suggesting that there likely is a small but positive effect of exposure duration on the predicted NOEC. Although decreasing toxicity with increasing exposure time may seem counterintuitive at first, it is plausible in circumstances where sedimentation is allowed to occur without renewal of the test medium, or where food intake increases due to the secondary ingestion of nutritious biofilms associated with the particles (Amariei et al., 2022). Alternatively, it can be an artefact linked to the fact that experiments with longer exposures tend to employ lower test concentrations, which is problematic when concentration-dependent dose metrics, like LOECs and NOECs, are used (Supporting information, Figs. S3-S6). The failure to control for such effects can bias toxicity assessments when particles of different density and sedimentation rates are compared, in particular for suspension-feeding organisms (Connors et al., 2017; Gerdes et al., 2019; Gouin et al., 2019; Ogonowski et al., 2018). Although such experimental designs have been rather common in the past, procedures to overcome these shortcomings have recently been proposed (Gerdes et al., 2019; Motiei et al., 2021). Albeit not fully conclusive, the overall pattern of decreasing toxicity with exposure time remains when the more robust dose-dependent point estimates (EC₅₀) are considered (Fig. S7).
Moreover, we expected the sensitivity of a species/life stage to be linked to its native environment, meaning that organisms during different stages of development should be well adapted to cope with local turbidity levels (McFarland and Peddicord, 1980). Hence, it can be expected that organisms inhabiting areas naturally low in non-food particles, like tropical reefs, should be more sensitive to MP and SS exposure compared to those adapted to high turbidity. Contrary to our expectation, we did not find any coherent evidence to support this hypothesis. This is partly related to the inherent uncertainties in our modeling results, but also to our limited ability to assign a particular species and life stage to a particular habitat type, which can vary over seasons. Local adaptations likely also induce variance within species that is not generalizable.
Out of the three selected endpoints (growth, reproduction and mortality), growth was the most sensitive and mortality the least, the latter not overlapping the ROPE (Fig. 4, Table S1, Table S2, Fig. S2). The higher sensitivity of the sublethal endpoints was expected and indicates that the model behaved as predicted. This also supports the hypothesis that food dilution and/or increased energy expenditure are important mechanisms when organisms are exposed to non-caloric particles, because they compromise growth, which in turn decreases reproductive capacity and ultimately leads to starvation (Foley et al., 2018; Madon et al., 1998; Ogonowski et al., 2016; Wright et al., 2013). It does not, however, exclude other possible modes of action, for which Dynamic Energy Budget models or other individual-based modelling approaches would be needed.
Using a meta-regression based approach (i.e., a regression where the input data are derived estimates collected from the literature; Takeshita et al., 2022; Thompson and Higgins, 2002) combined with a novel standardization step to harmonize toxicity data for SSD analysis, we have demonstrated that the evidence in favor of the generally assumed higher hazard of MP relative to SS is only moderate. Although the central tendencies in the data suggest a marginally higher hazard for MP, the uncertainties around these estimates are substantial and the difference is, formally, not statistically significant. The apparent difference in hazard can partly be due to systematic differences in experimental designs that cannot be accounted for statistically. To accurately assess the effects of MPs in the field relative to those of SS, we see an urgent need for well-designed comparative experiments with plastic and non-plastic particles (fibers included), preferably in mixtures (Gerdes et al., 2019; Redondo-Hasselerharm et al., 2018), where: (1) the potential effect of associated chemicals is accounted for, (2) constant exposure conditions are maintained and (3) dose-dependent point estimates are derived. In the absence of better evidence, it is advisable to interpret the results with caution and not fully dismiss the possibility that MP are more hazardous than SS. Hence, irregularly shaped MP in the 1-1000 µm size range should, for the time being, be considered as marginally more hazardous to aquatic consumers capable of ingesting particles in this size range. Future studies will, however, be necessary to assess whether the patterns observed in this study also hold for aged, secondary MP that dominate in the environment (Bayo et al., 2022; ter Halle et al., 2017).

Fig. 1. Marginal mean difference in posterior probabilities between suspended sediments (SS) and microplastic (MP) groups in the censored, Bayesian mixed model (model 2, Table S2). The shaded area shows the 89% probability (one-sided test) for MP to have a lower pNOEC compared to SS. The data are presented on the log10-scale.

Fig. 3. Posterior distribution of the difference in PNEC values (mm³ L⁻¹) between suspended sediments (SS) and microplastic (MP) SSDs. The shaded area shows the 89% probability (one-sided test) for MP to have a lower PNEC compared to SS.

Fig. 4. Conditional effects plot for the explanatory variables in the censored, Bayesian mixed model (model 2, Table S2). Points and lines represent mean values of the posterior distribution. Error bars and blue bands denote the 89% credible interval. The predicted NOEC is shown on the log10-scale.