Critical review of the OSPAR risk‐based approach for offshore‐produced water discharges

The management of produced water (PW) discharges from offshore oil and gas installations in the North Atlantic is under the auspices of OSPAR (Oslo/Paris convention for Protection of the Marine Environment of the North‐East Atlantic). In 2010, OSPAR introduced the risk‐based approach (RBA) for PW management. The RBA includes a hazard assessment estimating PW ecotoxicity using two approaches: whole‐effluent toxicity (WET) and substance‐based (SB). Set against the framework of the WET and SB approach, we conducted a literature review on the magnitude and cause of PW ecotoxicity, respectively, and on the challenges of estimating these. A large variability in the reported magnitude of PW WET was found, with EC50 or LC50 values ranging from <1% to >100%, and a median of 11% (n = 301). Across the literature, metals, hydrocarbons, and production chemicals were identified as causing ecotoxicity. However, this review reveals how knowledge gaps on PW composition and high sample and species dependency of PW ecotoxicity make clear identification and generalization difficult. It also highlights how limitations regarding the availability and reliability of ecotoxicity data result in large uncertainties in the subsequent risk estimates, which is not adequately reflected in the RBA output (e.g., environmental impact factors). Thus, it is recommended to increase the focus on improving ecotoxicity data quality before further use in the RBA, and that WET should play a more pronounced role in the testing strategy. To increase the reliability of the SB approach, more attention should be paid to the actual composition of PW. Bioassay‐directed chemical analysis, combining outcomes of WET and SB in toxicity identification evaluations, may hold the key to identifying drivers of ecotoxicity in PW. 
Finally, an uncertainty appraisal must be an integrated part of all reporting of risk estimates in the RBA, to avoid mitigation actions based on uncertainties rather than reliable ecotoxicity estimations. Integr Environ Assess Manag 2023;19:1172–1187. © 2022 The Authors. Integrated Environmental Assessment and Management published by Wiley Periodicals LLC on behalf of Society of Environmental Toxicology & Chemistry (SETAC).


INTRODUCTION
Oil and gas have been extracted at production platforms in the North Sea since the 1960s, and produced water (PW) is an important waste stream of these offshore activities. Produced water is a mixture of formation water and water injected into the reservoir to maintain pressure and improve yield during the extraction. As oil fields are depleted, the global water-to-oil ratio is expected to grow from 2–3 in 2011 (Neff et al., 2011) to 12 in 2025 (Al-Ghouti et al., 2019). Produced water is either reinjected into the reservoir via dedicated wells or, more commonly, discharged directly into the sea (Neff et al., 2011). In 2020, the volume of PW produced in the North-East Atlantic was estimated at 428 million m³, with around 141 million m³ disposed of by reinjection and the remaining 287 million m³ discharged into the ocean (OSPAR, 2021).
Upon discharge, PW contains remnants of both naturally occurring petroleum- and formation-associated compounds as well as added production chemicals (PCs). Crude oil varies considerably in composition and easily constitutes over 100 000 individual compounds, depending on the organic nature of the source rock and the changes in conditions over geological time. Likewise, the water itself can vary from fresh to highly saline brines, with large variations in the electrolyte composition. During production, the water composition may change, and active bacteria may be introduced with the injected water. This again leads to compositional variations over the production lifetime of the reservoir. The nature of the water, oil, and gas in turn dictates the use of oilfield chemicals such as corrosion and scale inhibitors, demulsifiers, and H2S scavengers. Oilfield chemicals are industrial vendor-specific formulations made up of compound classes such as surfactants (both ionic and nonionic), cosurfactants, and polymers with different functionalities, tailored to enable smooth production. To avoid bacterial growth in the top-side facilities, batch treatments with biocide are frequently performed. The fate of these compounds depends on their partitioning between the oil phase and the water phase, and the individual partition coefficients depend on oil composition and water salinity as well as pressure and temperature.
As a result of the complexity in oil and gas production, PW is a highly complex chemical mixture that varies across platforms and over time for a given field in operation. Several of the PW constituents are of environmental and human health concern, for example, metals (especially heavy metals), benzene, toluene, ethylbenzene, and xylene (BTEX), polycyclic aromatic hydrocarbons (PAH), phenols, alkyl phenols (AP), organic acids, and naturally occurring radioactive materials (NORM) as well as certain PCs (Bento & Campos, 2020; Carlsson et al., 2014; Gabardo et al., 2011; Hamoutene et al., 2010).
The discharge of PW presents a potential ecotoxicological concern for the receiving environment, as it continuously delivers large amounts of organic and inorganic pollutants. In the North Sea region, discharges are governed by the Oslo-Paris convention (OSPAR). Internationally, regulatory discharge limits for PW are often based on the oil-in-water (OiW) content for both the monthly average and the daily maximum, typically ranging from 29 to 40 mg/L and 20 to 100 mg/L, respectively (Neff et al., 2011). For platforms under OSPAR governance, the limit is a monthly average OiW of 30 mg/L (OSPAR, 2021). The phase separation technologies installed at platforms today are commonly able to bring the OiW well below regulation limits (OSPAR, 2021). In 2020, the average OiW content within the OSPAR area was 14 mg/L, although 24 installations failed to meet the OiW requirement (OSPAR, 2021). Concerns of adverse effects caused by PW remain, as PW continues to be a large source of marine hydrocarbon contamination (OSPAR, 2021; Stephens et al., 2000). Furthermore, dispersed oil may not adequately reflect PW toxicity (Carroll et al., 2000), and contaminants associated with the aqueous fraction may be significant contributors to toxicity (Azetsu-Scott et al., 2007; Bento & Campos, 2020; Johnsen et al., 2000). OSPAR therefore introduced a new approach to PW management in 2010, the risk-based approach (RBA), which is "[…] based on a characterization of the risk to the environment of a produced water discharge by examining both the exposure resulting from discharge of the produced water effluent and the sensitivity of the receiving environment to this exposure" (OSPAR, 2012). The RBA follows a framework resembling that of the environmental risk assessment (ERA) of chemicals, which begins with data collection, followed by separate hazard and exposure assessments, which are then combined in a subsequent risk characterization. 
The output determines management efforts and actions in the following risk management and monitoring of the RBA.
Despite several decades of hydrocarbon extraction, the estimation and investigation of PW toxicity remain a challenge due to its highly complex physicochemical composition and properties. The knowledge gaps on PW toxicity may affect the certainty of the RBA and consequently the quality of the management decisions based on its outcome.
The objective of this study is therefore to provide a critical review of offshore PW toxicity estimations within the RBA framework, to show how the RBA utilizes ecotoxicity data, and to provide recommendations on how to overcome the identified shortcomings of the ecotoxicity-related parts of the RBA framework.

HAZARD ASSESSMENT OF PW IN THE RBA
For hazard assessment, the RBA allows for ecotoxicity evaluations following two approaches: the whole-effluent toxicity (WET) and the substance-based (SB) approaches (OSPAR, 2014b). While both are used to quantify the predicted no-effect concentration (PNEC) of PW, that is, the environmental concentration below which adverse effects are unlikely to occur, they are very different methodically. In brief, the WET approach relies on aquatic toxicity tests carried out directly on actual PW samples, whereas the SB approach relies on an analytical-chemical characterization of PW samples and assumes additivity of the individual aquatic ecotoxicity of each chemical compound (determined independently from PW) found in the PW sample. The following section will review and discuss the current state of knowledge regarding the magnitude and cause of PW ecotoxicity, set against each of the two approaches, respectively.

Whole-effluent toxicity of PW
The WET approach has been widely used in the RBA for PW management. A review of the open scientific literature revealed 16 studies reporting toxicity values obtained by WET of offshore PW (see Table 1). These studies were published from 1987 to 2022 covering 32 species. Most studies analyzed PW samples from oil and gas production platforms in the North Sea (six studies), the Bass Strait (three), and the Gulf of Mexico (three). Singular studies used samples from the NW Atlantic, the SW Atlantic, the Indian Ocean, and the Adriatic Sea. Crustaceans were the most common test organism (11 studies), followed by bacteria (eight), fish (seven), algae (five), sea urchins (four), bivalves (three), bristle worms (one), and polyp animals (one). The WET findings are summarized in Table 1 in terms of the percentage of PW sample that caused an effect or lethality on 50% of the studied organisms (i.e., the EC50 or LC50, respectively).
For the data shown in Table 1, no considerable trends in PW ecotoxicity were found. ANOVA testing found no significant differences within the factors of organism, exposure duration, and PW origin, with the exception of crustaceans being more sensitive than fish (p < 0.01). Previous studies have argued that comparison of PW ecotoxicity across studies is not feasible, because the test substance is not the same due to the intersample variability of PW composition (Azetsu-Scott et al., 2007; Hale et al., 2019; Hamoutene et al., 2010). Data analyzed in this literature review appear to validate this, as the EC50 and/or LC50 values across articles and species range from <0.2% to >100%, with a median of 10.7% and an average of 16.0% ± 16.5%, emphasizing the magnitude of cross-sample, cross-species toxicity variation. Nonetheless, Figure 1 shows that 50% of the EC50 and/or LC50 values reported in the literature were <10% PW and >75% of values were <20% PW, indicating that while PW WET ranges from almost 0 to >100% PW, it is more common for PW ecotoxicity to be in the lower end of that spectrum. An earlier review found a similar trend of PW WET ranging from 0.02% to above 100% PW (Stephenson et al., 1994), and another found that 95% of gathered WET values were between 0.87% and 21% PW (Karman & Smit, 2019).
For a hazard assessment of PW, the data listed in Table 1 offer the possibility to rank toxicities of whole samples. However, the most prominent use of the data in the RBA is to estimate the PNEC. For this, the lowest observed NOEC (no observed effect concentration), LOEC (lowest observed effect concentration), LC50 (median lethal concentration), or EC50 (median effect concentration) in a set of ecotoxicological tests covering at least three different trophic levels is divided by an assessment factor (AF). The RBA prescribes the use of a test battery consisting of bacteria, algae, and crustaceans. The size of the AF is based on the type (short-term vs. long-term) and the number of tests carried out (ECHA, 2008). For PW, typically, only results from short-term tests in the test battery are available, and an AF of 1000 has been agreed upon within the RBA (BEIS, 2020). It should be mentioned that the European Chemicals Agency (ECHA) recommends an AF of 10 000 for marine PNEC determinations when only results from the test battery of three short-term tests are used; however, this was deemed to be too conservative for PW by the Offshore Industry Committee, as the 10 000 AF was developed for near-coastal waters and discharges from offshore installations are assumed to be affected by greater and more rapid dilution (BEIS, 2020).
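The PNEC derivation described above amounts to a single division, which can be sketched as follows. The endpoint values are hypothetical, the function name is illustrative, and the dilution factor at the end is only an arithmetic consequence of expressing the PNEC in % PW (undiluted PW being 100%), not a quantity defined by the RBA.

```python
def pnec_from_test_battery(endpoints_pct_pw, assessment_factor=1000.0):
    """Return the PNEC (% PW): lowest endpoint divided by the assessment factor.

    endpoints_pct_pw maps each test organism group to its EC50/LC50 in % PW.
    At least three groups are required, mirroring the three trophic levels
    of the test battery described in the text.
    """
    if len(endpoints_pct_pw) < 3:
        raise ValueError("a battery covering at least three trophic levels is needed")
    lowest = min(endpoints_pct_pw.values())
    return lowest / assessment_factor


# Hypothetical short-term results for one PW sample (EC50/LC50 in % PW).
battery = {"bacteria": 8.0, "algae": 12.5, "crustaceans": 4.0}

pnec = pnec_from_test_battery(battery)               # AF of 1000
pnec_echa = pnec_from_test_battery(battery, 10_000)  # AF of 10 000

# Dilution needed to bring undiluted PW (100%) down to the PNEC.
dilution_factor = 100.0 / pnec

print(f"PNEC (AF 1000):   {pnec:.4f} % PW")
print(f"PNEC (AF 10 000): {pnec_echa:.5f} % PW")
print(f"Dilution factor:  {dilution_factor:.0f}")
```

The example makes the influence of the AF visible: with identical test data, the choice between 1000 and 10 000 shifts the PNEC by an order of magnitude.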

Substance-based toxicity of PW
In contrast to the WET approach, the aim of the SB approach in the RBA is to determine PNEC values for individual PW constituents (Karman & Smit, 2019) or for groups of constituents with similar toxicological modes of action (Johnsen et al., 2000; Smit et al., 2005). The individual PNECs are combined using concentration and response addition to estimate the overall PW ecotoxicity as the sum of individual compound or group toxicities (Smit et al., 2005). The PNEC of individual constituents can be determined by dividing the lowest observed toxicity value by an assessment factor (as described for WET above) or by ranking ecotoxicity data to derive a species sensitivity distribution (SSD), describing the potentially affected fraction (PAF) as a function of exposure concentration. In this case, the 5% hazard concentration (HC5) forms the basis for PNEC determination. HC5 is the concentration at which the PAF is 5% (van Straalen & Denneman, 1989).
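The relation between an SSD, the PAF, and HC5 can be illustrated with a minimal sketch. It assumes a log-normal SSD fitted by the mean and standard deviation of log10-transformed toxicity values; this distributional choice, the function names, and the data are assumptions made for illustration and are not taken from the RBA guidance.

```python
import math
from statistics import NormalDist, mean, stdev


def fit_lognormal_ssd(toxicity_values):
    """Fit a normal distribution to log10-transformed species toxicity values."""
    if len(toxicity_values) < 2:
        raise ValueError("at least two species values are needed")
    logs = [math.log10(v) for v in toxicity_values]
    return NormalDist(mean(logs), stdev(logs))


def potentially_affected_fraction(ssd, concentration):
    """PAF: fraction of species whose toxicity value lies below the concentration."""
    return ssd.cdf(math.log10(concentration))


def hazard_concentration(ssd, fraction=0.05):
    """Concentration at which the PAF equals the given fraction (HC5 by default)."""
    return 10 ** ssd.inv_cdf(fraction)


# Hypothetical chronic toxicity values for one constituent (µg/L), one per species.
values = [12.0, 35.0, 48.0, 90.0, 150.0, 210.0, 400.0, 950.0]

ssd = fit_lognormal_ssd(values)
hc5 = hazard_concentration(ssd)

print(f"HC5 = {hc5:.1f} µg/L")
print(f"PAF at HC5 = {potentially_affected_fraction(ssd, hc5):.3f}")
```

By construction, evaluating the PAF at the HC5 returns 0.05, which is the definition given in the text.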
In practice, it is not achievable to include all PW constituents in the SB approach, as PW samples often contain >1000 different compounds (Bergfors et al., 2020). Instead, PW is represented by a shorter list of compounds, containing those assumed to be the major contributors to its ecotoxicity. The SB approach therefore requires that drivers of PW ecotoxicity have been identified. The RBA includes a predefined list of compounds to be included in the SB approach. However, it is relevant to critically review this list as identifying drivers remains a challenge due to the complexity of PW composition. This is especially complicated by the large variability in the formulations and constituents of PCs that are vendor based, and therefore also difficult to analyze for.
In this study, the drivers of offshore PW ecotoxicity were identified following a literature review on the causes of PW ecotoxicity, including the scientific literature from 1987 to 2022. The findings were grouped into three different categories depending on how the results of the original studies were generated: (1) "bottom-up" evaluation, (2) "top-down" evaluation, and (3) "effect-directed" evaluation. An explanation for these categories is given below as well as in Figure 2, which provides a conceptual description of how each of the three evaluation methods isolates and identifies drivers of PW ecotoxicity.
Bottom-up evaluation for identifying drivers of ecotoxicity. The bottom-up evaluation is used by studies investigating drivers of PW ecotoxicity by assessing whether the inherent ecotoxicity of an individual constituent has the potential to cause (and drive) PW ecotoxicity at its actual PW concentration, often expressed as its toxic unit (TU). By this evaluation, constituents with the highest TU likely contribute most to the overall sample ecotoxicity.
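A minimal sketch of the TU ranking is given below. It assumes the common definition of a toxic unit as the measured concentration divided by an effect concentration (here an EC50) for the same constituent; the constituent names, concentrations, and EC50 values are hypothetical, and summing the TUs presumes concentration addition.

```python
def toxic_units(concentrations, effect_concentrations):
    """TU per constituent: concentration in PW divided by its effect concentration.

    Both dictionaries must use the same units (here mg/L) and the same keys.
    """
    return {
        name: concentrations[name] / effect_concentrations[name]
        for name in concentrations
    }


# Hypothetical PW sample concentrations and single-substance EC50 values (mg/L).
sample = {"zinc": 0.5, "phenol": 2.0, "naphthalene": 0.1}
ec50 = {"zinc": 1.0, "phenol": 10.0, "naphthalene": 2.0}

tus = toxic_units(sample, ec50)

# Constituents ranked from highest to lowest TU (the candidate drivers).
ranked = sorted(tus, key=tus.get, reverse=True)

# Under concentration addition, the TUs sum to the mixture TU.
total_tu = sum(tus.values())

for name in ranked:
    print(f"{name}: TU = {tus[name]:.2f} ({tus[name] / total_tu:.0%} of total)")
```

Because the TU depends on the concentration in the specific sample, the ranking changes from sample to sample even when the single-substance toxicity data stay the same.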
Metals like calcium, barium, cadmium, copper, mercury, and zinc (Neff, 2002) have been found in PW at concentrations potentially toxic to aquatic organisms. Zinc has been found in concentrations that may be directly toxic to plankton (Azetsu-Scott et al., 2007), bivalves, and algae (Strømgren et al., 1995) and, similarly, for copper toward plankton (Azetsu-Scott et al., 2007), algae, and bacteria (Brendehaug et al., 1992), and for chromium toward algae and bacteria (Brendehaug et al., 1992). Other studies, however, found that heavy metals were unlikely to be contributors to PW ecotoxicity (Jacobs et al., 1992). Concentrations of cadmium, mercury, lead, and nickel have moreover been found to be below toxic concentrations (Hannam et al., 2009) or present in concentration levels comparable to that of unpolluted seawater (Brendehaug et al., 1992). For inorganic ions, studies have argued that ion-related toxicity is likely caused by an altered ion ratio compared to seawater rather than the actual toxicity of the inorganic ions themselves (Moffitt et al., 1992; Neff, 2002). This is supported by the fact that most marine test organisms can tolerate salinities as low as 10‰ and as high as 40‰, provided that the ion ratio is similar to seawater (Neff, 2002). Ammonium is an exception to this as it has been found to be present at toxic levels (Neff et al., 2011).
Polycyclic aromatic hydrocarbons are present in PW but are often considered to be in subtoxic concentrations (Stephenson, 1992), and if present in significant concentrations, the majority would likely be the moderately toxic two- and three-ring PAHs, with the larger and more toxic PAHs present at negligible concentrations in the water phase (Neff, 2002). This was confirmed for one PW sample, in which naphthalene was the biggest contributor to PAH-related toxicity due to its high concentration (Hannam et al., 2009). In another study, however, naphthalene was found to be present at negligible concentrations (Binet et al., 2011). Similarly, for phenols, the more toxic, highly alkylated phenols were found to be at extremely low concentrations, while the total phenol content was within the toxic range (Brendehaug et al., 1992). For BTEX, high toxicities of PW samples from gas platforms were attributed to BTEX concentrations (Jacobs et al., 1992), but BTEX has also been found at concentrations well below toxic levels (Binet et al., 2011). These variations between studies are likely a result of the large variability in the composition of PW samples as well as the chosen test organism.
Production chemicals are often mentioned as contributors to PW ecotoxicity, but the nature and magnitude of that contribution are largely unknown (Bento & Campos, 2020). An analysis of nine PCs found a biocide, a corrosion inhibitor, an H2S scavenger, and a surfactant to be very toxic, and an antifoamer and a viscosifier to be nontoxic (Bento & Campos, 2020). A corrosion inhibitor was found to be present at TU > 10 in a PW sample (Binet et al., 2011) and even traces of a corrosion inhibitor were found to be within its toxic range for several species (Brendehaug et al., 1992).
FIGURE 2 The three different evaluation methods for identifying the drivers of produced water (PW) ecotoxicity. The "bottom-up" evaluation analyzes the individual PW constituents and combinations of these, while the "top-down" evaluation analyzes the full PW sample and fractions of this. Finally, the "effect-directed" evaluation compares the sublethal effects of the full PW sample and the individual constituents.
In some studies, the bottom-up evaluation is expanded by combining individual PW constituents and analyzing their combined toxicity potential, in an effort to reconstruct overall PW ecotoxicity (Figure 2). This is done either theoretically by mixture toxicology or experimentally by combining PW constituents to a synthetic PW sample. Karman et al. (1996) used the theoretical approach to show that hydrocarbons, phenols, organic acids, and PCs each contributed just over a fifth of the overall toxicity; the rest was attributed to metals. In a similar study by M. Reed et al. (1996), organic acids and aliphatic hydrocarbons were found to be the main drivers of PW ecotoxicity. In a study by de Vries and Jak (2018) covering PW from 25 platforms in the North Sea, aliphatic hydrocarbons, organic acids, and PCs were found to be the biggest contributors, while the contributions of phenols and PAHs were negligible. A similar study of 12 platforms in the Bass Strait found sulfide, aromatic hydrocarbons, and ammonia to be the main ecotoxicity drivers and cyanide, metals, and phenols to be negligible (Parkerton et al., 2018). The other, experimental method of combining constituents in the bottom-up approach was used by Johnsen et al. (1994) to show that the aromatic and phenolic fractions were the main contributors to PW ecotoxicity, while PCs were insignificant. On the other hand, several studies have found that addition of PCs drastically increased the toxicity of the aqueous fraction of PW (Bento & Campos, 2020; Henderson et al., 1999; Tornambè et al., 2012). It has been hypothesized that this is because certain PCs, in particular, biocides and H2S scavengers containing glycol derivatives, act as cosolvents and thus facilitate dissolution of oil-associated toxic compounds (Bento & Campos, 2020; Henderson et al., 1999; Tornambè et al., 2012). 
Recently, an increased oil droplet stability by PCs has been proven experimentally by microfluidics, leading to increased OiW and reporting of synergistic effects by combinations of PCs (Aliti et al., 2022). In PW, the solubility of low-molecular-weight aromatics (Henderson et al., 1999), for example, BTEX and PAHs (Tornambè et al., 2012), is believed to be altered by the presence of these PCs.
The varying and even contradictory findings of the bottom-up evaluation support previous claims that the cross-sample variation in PW composition hinders the comparability of results. Since constituent concentrations, and thus TUs, are sample dependent, the drivers of ecotoxicity likely are as well.
Top-down evaluation for identifying drivers of ecotoxicity. The top-down evaluation is applied in studies that investigate drivers of PW ecotoxicity by analyzing correlations between toxicity and composition across full PW samples. A study by Gabardo et al. (2011) found no correlation between PW toxicity and composition. A multisample, multispecies, comparative study also found that no single chemical was strongly correlated with PW toxicity (Schiff et al., 1992). Additionally, several studies have found no correlation between PW toxicity and the concentrations of oil, hydrocarbons, or organic extracts (Azetsu-Scott et al., 2007; Carlsson et al., 2014; Farmen et al., 2010; Sørensen et al., 2019; Strømgren et al., 1995; Utvik, 1999). Some correlation with ecotoxicity has been found for BTEX (Jacobs et al., 1992; Manfra et al., 2010) and phenols (Flynn et al., 1996), but the opposite has also been reported for both (Flynn et al., 1996; Strømgren et al., 1995). Zinc has been mentioned as being present at high concentrations in the most toxic samples (Manfra et al., 2010; Strømgren et al., 1995) and has also been found to correlate with ecotoxicity (Azetsu-Scott et al., 2007; Schiff et al., 1992). Production chemicals have not been highlighted explicitly by this method. However, they have been mentioned to potentially be the cause of high variance in PW ecotoxicity (Holdway, 2002) and the reason why otherwise known correlations did not fit (Jacobs et al., 1992). Traces of an H2S scavenger and a corrosion inhibitor (Brendehaug et al., 1992) were mentioned in two different studies as being the reasons for considerable ecotoxicity differences in samples that had otherwise similar compositions of natural constituents.
Like the bottom-up evaluation, the top-down evaluation shows contradictory findings. Unlike the bottom-up evaluation, cross-sample composition variation cannot be the cause, as it is the basis of this investigation method. Instead, the inconsistent findings may be caused by an incomplete chemical characterization of PW, hampering the possibilities for establishing correlations between PW ecotoxicity and composition. Moreover, an incomplete characterization of PW composition means that indications of correlation may even be coincidental, and instead caused by varying concentrations of unidentified drivers.
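The correlation analysis underlying the top-down evaluation can be sketched with a rank correlation between a constituent concentration and the EC50 across samples. Spearman's coefficient is used here as one possible choice; the reviewed studies do not necessarily use it, and the data are hypothetical. Since a lower EC50 means higher toxicity, a negative coefficient indicates that toxicity increases with concentration.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, assigning tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data for five PW samples.
zinc_mg_per_l = [0.1, 0.4, 0.9, 1.5, 2.2]
ec50_pct_pw = [40.0, 22.0, 15.0, 9.0, 3.0]

rho = spearman(zinc_mg_per_l, ec50_pct_pw)
print(f"Spearman rho = {rho:.2f}")
```

As noted in the text, even a strong coefficient of this kind may be coincidental when the sample is incompletely characterized, because an unmeasured constituent that covaries with the measured one could be the actual driver.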
Another application of the top-down evaluation is analyzing changes in the ecotoxicity of a single PW sample after dividing it into different fractions (Figure 2). By this method, both polar and non-polar fractions were found to be the main toxicity contributors, depending on the sample (Brown et al., 1992; Sørensen et al., 2019). The water-soluble fraction has been found to be the major contributor to toxicity compared to the oil and/or particulate fraction (Carlsson et al., 2014; Farmen et al., 2010). However, reports of significant toxicity associated with the oil-rich surface slick of samples and the presence of particles (Azetsu-Scott et al., 2007), together with a drastic decrease in toxicity upon sample filtration (Azetsu-Scott et al., 2007; Manfra et al., 2010), indicate that the oil and/or particulate fraction can also contain toxicity-contributing compounds. The organic fraction has frequently been indirectly targeted in previous literature through degradation experiments. Removal of phenols and sulfide by oxidation decreased toxicity for bacteria, and a subsequent removal of ammonia and metals by bioremediation reduced toxicity for bacteria and fish (Corrêa et al., 2010). Removal of organic matter, particularly phenols and aromatics, by biodegradation in seawater reduced toxicity toward bacteria (Brendehaug et al., 1992; Johnsen et al., 1994) and bivalves, but not algae (Strømgren et al., 1995). Studies have also found that removal of phenols resulted in a reduction in toxicity almost as large as when removing phenols, PAH, and BTEX together, suggesting that phenols were the main contributors of the three compound groups (Binet et al., 2011; Flynn et al., 1996). 
Finally, more formalized toxicity identification evaluation (TIE) procedures can be developed as a part of the top-down evaluation to systematically divide complex mixtures into fractions to identify groups of constituents responsible for ecotoxicity. While this has traditionally been applied for industrial wastewaters (USEPA, 1991), one study was found that used this approach on offshore PW samples (Sauer et al., 1997). The study found different drivers of ecotoxicity for each sample, no fraction was consistently the most toxic, and no sample had more than two fractions as drivers of ecotoxicity (Sauer et al., 1997).
Variation in PW composition results in different drivers across samples investigated by the top-down evaluation, as was also the case for the bottom-up evaluation. Moreover, without a fully characterized PW sample, removing fractions of PW may affect the concentrations of unidentified drivers of ecotoxicity, creating the risk of drawing incorrect conclusions from the observed changes in ecotoxicity.
Effect-directed evaluation for identifying drivers of ecotoxicity. The effect-directed evaluation compares the sensitive, sublethal effects (biomarkers) of PW samples to known effect patterns of individual constituents present in PW (Figure 2). A selection of PW biomarkers was used in previous literature to investigate drivers, as presented below. An exhaustive review of potentially suitable PW biomarkers for monitoring purposes was beyond the scope of this review, but one is provided by, for example, Beyer et al. (2020).
Produced water has been shown to cause endocrine disruption, and multiple studies point to alkyl phenols and naphthenic acids as the constituents acting as estrogen receptor agonists (Meier et al., 2010; Thomas et al., 2009; Tollefsen et al., 2006, 2011). Neurotoxicity of PW has also been documented and attributed primarily to butylated hydroxytoluene and 1-phenyl-1,2-dihydronaphthalene but also PAH, alkyl phenols, and naphthenic acids (Froment et al., 2016). Oxidative stress has been used as a biomarker for oil originating from the North Sea (Sturve et al., 2006), phenols, PAHs, metals (Bohne-Kjersem et al., 2009), and cadmium (Hannam et al., 2009), all of which are present in PW. Furthermore, changed regulations of CYP1A (cytochrome P450 1A) and EROD (ethoxyresorufin-O-deethylase) activity are common biomarkers for oil as well as PAH exposure (Casini et al., 2006; Knag & Taugbøl, 2013; Meier et al., 2010; Olsvik et al., 2007). Produced water was found both to upregulate CYP1A activity (Casini et al., 2006; Knag & Taugbøl, 2013; Meier et al., 2010) and to leave it unaffected (Pérez-Casanova et al., 2012). Produced water may thus contain both CYP1A-inducing and CYP1A-inhibiting compounds (Abrahamson et al., 2008; Meier et al., 2010; Sturve et al., 2006), indicating that the complexity of PW composition hinders the comparability of PW biomarkers and those known for isolated compounds. Finally, an analysis by Manduzio et al. (2005) of changes in gene and protein expression showed that exposure to PW resulted in a proteome pattern change that was larger than (and different from) that of oil-only exposure. This suggests that PW contains several unidentified ecotoxicity drivers of both oil- and non-oil-associated compounds. 
Produced water exposure also affected genes that are otherwise known to be affected by both organic and inorganic toxicants (Olsvik et al., 2007), phenols (Pérez-Casanova et al., 2010), and low MW PAHs (Bravo, 2005), emphasizing that what drives PW ecotoxicity is likely a wide array of different compounds. However, several biomarkers are broad indicators of environmental stress, and pinpointing specific drivers by this evaluation is inherently difficult. Nevertheless, their high sensitivity means that drivers identified by the effect-directed evaluation likely point to the compounds that drive PW ecotoxicity at environmentally relevant concentrations.
Drivers of ecotoxicity relevant for the RBA. Figure 3 presents the drivers of ecotoxicity as found by each of the three evaluation methods described above and includes constituents mentioned by two or more articles as driving PW toxicity. It is evident that contradictory results stem from the three different evaluation methods; this is likely caused by the difference in PW composition across studies as well as incomplete chemical characterizations of PW samples. Thus, what drives the ecotoxicity of PW is likely highly sample dependent (in addition to species dependent), and generic statements on the role of individual compounds or compound groups are difficult to deduce from the three independent evaluation methods reviewed. Nonetheless, Figure 3 shows that while many compounds were mentioned as ecotoxicity drivers, metals (particularly zinc), hydrocarbons (particularly phenols, including alkyl phenols), and PCs were mentioned more frequently than others. Amidst uncertainties regarding PW composition, the compounds in Figure 3 can be considered the minimum set to be included in the SB approach of the RBA, noting that the list is not exhaustive. Currently, dispersed oil and eight groups of naturally occurring constituents (covering 29 compounds) are included in the SB approach within OSPAR RBA for managing offshore installations (OSPAR, 2014a). Table 2 lists these compounds, their PNECs, and the AFs used to derive them. The majority of PNECs used in the RBA are derived from the lowest value of a limited set of ecotoxicological data combined with an AF. Only the PNECs of dispersed oil, fluoranthene, benzo(a)pyrene, and the majority of metals are derived by SSDs (OSPAR, 2014a). 
Establishing a common list of compounds and associated PNEC values to be included in the RBA makes communication and comparability more feasible (OSPAR, 2012), although it may only partly cover the actual drivers of ecotoxicity and may discourage further investigation of the chemical composition of PW.
Notes to Table 2: Constituents listed in parentheses are those whose concentrations are included; however, their ecotoxicity is represented by the PNEC of the constituent preceding the parenthesis. Abbreviations: AF, assessment factor; PNEC, predicted no-effect concentration. (a) The PNEC value is the listed concentration plus the background concentration, which is ideally site specific. (b) Most often, an AF of 1000 is used due to limited ecotoxicity data availability.
All compounds mentioned in Table 2, except for some of the metals, were also identified in the literature review summarized in Figure 3. From Figure 3, it is noteworthy that sulfide, ammonia, and organic acids (including naphthenic acids) are absent in Table 2. These compounds have been identified more than once in the literature as drivers of ecotoxicity but are not included in the standard RBA SB approach. It is therefore relevant to reevaluate which compounds should be included in the SB approach of the RBA.
Within OSPAR, the SB approach of RBA allows for an evaluation of which individual compounds, or groups of compounds, are major drivers of ecotoxicity. Figure 4 shows the relative distribution of natural and added chemicals mentioned as drivers at the platforms within the OSPAR region, presented in their annual RBA assessments (OSPAR, 2021). In the report covering 2020 discharges of 199 installations, 67 installations had naturally occurring substances mentioned as those "likely to pose the greatest risk to the marine environment"; 30 and 22 installations had BTEX and heavy metals singled out, respectively, and the two were most often mentioned together; and 21 installations pointed to PCs, with corrosion inhibitors, biocides, and H2S scavengers singled out at eight, six, and five platforms, respectively, while methanol and emulsion breakers were each mentioned for one platform (OSPAR, 2021). Published studies more often pointed to naturally occurring compounds as ecotoxicity drivers, while the 2020 RBA report by OSPAR had more frequent and specific mentions of PCs (Figure 4). Production chemicals may be mentioned less often in the literature, as the knowledge gap regarding their characteristics, concentrations, and ecotoxicological behavior makes it difficult to investigate them and target them in chemical analyses. It may be the same knowledge gap that results in frequent mentions of PCs in the RBA reports (Figure 4), as the lack of data on PC ecotoxicity is compensated by high AFs (500-1000), potentially overestimating their contribution to the overall PW ecotoxicity.

CHALLENGES AND RECOMMENDATIONS FOR IMPROVED HAZARD ASSESSMENT OF PW IN THE RBA
As shown above, the highly variable composition of PW creates challenges for estimating the whole-sample ecotoxicity irrespective of which approach is used. The composition of PW is affected by the physicochemical properties of the oil and the subsurface reservoir as well as the production practices and PW treatment at the platform as described previously (Hale et al., 2016; Utvik, 1999). These factors vary in the short and long term, and cause high variability of PW composition both within and across platforms, as well as over time (Holdway, 2002; Sauer et al., 1997). This not only affects the comparability of PW ecotoxicity (magnitude and causes) across previous studies but also means that sampling a few liters of PW for chemical or ecotoxicological characterization at a platform provides merely a snapshot of the thousands to millions of m³ PW discharged annually (Karman & Smit), making it relevant to assess the sampling frequency needed to properly estimate the large-scale ecotoxicity of a discharge, in addition to assessing PW on a platform-specific basis. Once sampled, it is challenging to maintain the chemical composition of the original PW during transportation, handling, and storage. The composition may change due to evaporation of volatile compounds (Abrahamson et al., 2008; Hansen et al., 2019; Sauer et al., 1997; Sørensen et al., 2019; Strømgren et al., 1995), particles adhering to containers and equipment (Azetsu-Scott et al., 2007; Sambusiti et al., 2020), and biological degradation or abiotic transformation of the constituents (Brendehaug et al., 1992; Johnsen et al., 1994). A study by Schiff et al. (1992) showed that samples of PW stored in glass containers with no headspace at 4°C maintained the concentrations of BTEX over an eight-day period, and concluded that sample integrity could be maintained if samples were subjected to insulation, refrigeration, minimal headspace, and quick transportation. Binet et al. (2011) similarly found no change in PW Microtox® toxicity over a four-day period if stored as described above.
Figure 4. The relative distribution between the frequencies of produced water constituents mentioned as the biggest contributors to the environmental impact factor of produced water at offshore platforms as assessed by the RBA within the OSPAR region in 2020. Based on analysis of data in OSPAR (2021). BTEX, benzene, toluene, ethylbenzene, and xylene; PAH, polycyclic aromatic hydrocarbons; RBA, risk-based approach.
Another overall challenge of the hazard assessment is defining threshold values and which "adverse effects" the environment should be protected against when the PNEC is defined. Studies have found that effects of PW are more often chronic than acute, and therefore, may only become evident after days or weeks of exposure (Azetsu-Scott et al., 2007; Gissi et al., 2021; D. C. Reed & Lewis, 1994; Hannam et al., 2009). This suggests that testing of PW ecotoxicity relevant for environmental protection would benefit from including extended exposure times. Furthermore, the "adverse effects" referred to in the PNEC definition are related to the experimental endpoints of the ecotoxicity data (Smit et al., 2011). For the standardized tests recommended in the RBA, these include survival, growth rate, and reproduction, and the PNECs are set to ensure community protection based on the test results. It has been estimated that PNECs derived by standardized tests do not ensure full community protection level against genotoxic damage and oxidative stress (Smit et al., 2009). Hence, if genotoxic damage and oxidative stress were to be included as relevant adverse effects, the resulting PNECs would be lower than the currently used ones. Production chemicals may moreover cause adverse effects that are not accounted for by current ecotoxicity estimations (Scholten et al., 2000).
Large-scale investigations of ecosystem health in areas of high offshore activity are necessary to assess whether the current endpoints included in the definition of risk as well as the 95% protection level sufficiently protect the receiving environment.

Challenges of the SB approach
In addition to the challenges described above, the SB approach is further challenged by its dependency on a completely characterized PW sample. A full chemical characterization is rarely possible due to the complex sample matrix and the multitude of possible chemical constituents.
A fraction of PW composition remains unknown, sometimes referred to as an "unresolved complex mixture (UCM)". A study used a nontarget screening of chemicals in the aqueous phase of PW and detected around 1500 organic compounds in each sample (Bergfors et al., 2020). Only a small fraction could be identified, leaving more than 1000 compounds unknown (Bergfors et al., 2020), not including those of the particulate phase of PW. The knowledge gap on PW composition seriously questions the accuracy and general usability of the SB approach, as it cannot include the UCM in the hazard assessment, and chemicals present at only a fraction of their toxic level can still contribute to combined toxicity (Deneer et al., 1988). The SB approach may thus underestimate PW ecotoxicity due to exclusion of unknown toxicants, in addition to the exclusion of even known drivers, as was found in this study by the discrepancies between Figure 3 and Table 2. The plausible, but often unquantifiable, role of PCs further weakens predictions made by the SB approach in RBA. Therefore, more attention should be paid to determining the actual composition of PW. Bioassay-directed chemical analyses, combining outcomes of WET and SB in toxicity identification evaluations, may hold the key to identifying the drivers of PW ecotoxicity.
Another challenge that the SB approach presents is the assumption that the ecotoxicity of PW within compound groups is the sum of individual toxicities and that PW toxicity is not matrix related. Hence, synergistic and antagonistic interactions are assumed to be either negligible or to cancel each other out (Johnsen et al., 2000; Stefania et al., 2009), which is in accordance with what has been found for mixtures with more than five chemicals present (Karman et al., 1996). However, studies of PW found indications of increased toxicity when adding PCs to synthetic PW (Bento & Campos, 2020; Tornambè et al., 2012), most likely due to increased solubility of oil-associated compounds. A statistical analysis on the original grouping of chemicals in PW found that they do not have comparable toxicity within groups (Johnsen et al., 2000; Smit et al., 2005). Both findings undermine the validity of concentration addition used in the SB approach. The subsequent response addition across compound groups based on SSDs (described in the Implications for Risk Estimation in the RBA section) requires at least 10 (preferably 15) NOEC values from long-term/chronic studies for species covering eight different taxonomic groups for estimation of a reliable PNEC (ECHA, 2008). This is an amount and diversity of ecotoxicity data that far exceeds what is needed by the WET approach and is currently unavailable for most of the chemicals in Figure 3 and Table 2.
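The consequence of unequal toxicity within a compound group can be sketched as follows. The example contrasts a grouped risk quotient, in which summed concentrations are compared with the PNEC of one representative constituent, with a sum of toxic units based on compound-specific PNECs. All compound names, concentrations, and PNECs are hypothetical, and the two functions are illustrative rather than the calculation prescribed by OSPAR.

```python
def grouped_quotient(concs, representative_pnec):
    """Summed group concentration divided by one representative PNEC."""
    return sum(concs.values()) / representative_pnec


def toxic_unit_sum(concs, pnecs):
    """Concentration addition with a separate PNEC for every compound."""
    return sum(concs[name] / pnecs[name] for name in concs)


# Hypothetical group of three compounds (concentrations and PNECs in ug/L).
concs = {"compound_a": 2.0, "compound_b": 6.0, "compound_c": 2.0}
pnecs = {"compound_a": 10.0, "compound_b": 100.0, "compound_c": 1.0}

# compound_a is taken as the representative of the group (assumption).
rq_grouped = grouped_quotient(concs, representative_pnec=pnecs["compound_a"])
rq_toxic_units = toxic_unit_sum(concs, pnecs)

print(f"grouped quotient:   {rq_grouped:.2f}")
print(f"sum of toxic units: {rq_toxic_units:.2f}")
```

In this constructed case the grouped quotient sits exactly at the acceptance limit, whereas the compound-specific sum exceeds it because one minor constituent is far more potent than the representative. With other inputs the difference can run in the opposite direction.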

Challenges of the WET approach
The WET approach mitigates some of the shortcomings identified with the SB approach, as it analyzes the PW sample as a uniform whole. It does not rely on chemical characterization, and it includes all constituents along with their potential interactions. It provides directly measured data on the ecotoxicity of whole samples towards a battery of biological tests, both for acute and for chronic assessments. Nonetheless, the WET approach is affected by the issues of sample representativeness and integrity, as with the SB approach. In addition, the whole sample composition may change further during pretreatments such as acidification (Hansen et al., 2019; Sørensen et al., 2019), filtration (Brendehaug et al., 1992; Manfra et al., 2010), pH and salinity adjustment (Baldwin et al., 1992; Elias-Samlalsingh & Agard, 2004; Hale et al., 2019; Parkerton et al., 2018), and aeration or degassing (Abrahamson et al., 2008; Meier et al., 2010; Tollefsen et al., 2011), which are not necessarily part of the WET testing procedure but may be used as means of handling PW samples prior to testing. Further changes to the sample composition may occur during ecotoxicity testing. A nonlinear change in Microtox® toxicity was observed over the duration of a standardized test (ISO, 2016) with exponentially growing algae, which could be related to changes in exposure concentration as a function of time. The changes were observed even when aeration, photodegradation, and biotransformation were accounted for. It was hypothesized that dilution of PW caused changes in the chemical equilibrium, resulting in higher bioavailability of oil-associated toxicants. This suggests that the multiphase characteristics of PW make it difficult to conduct the standard ecotoxicity tests on the mixture, especially long-term tests.
This is also highlighted by the difficulty of meeting validity criteria of standard ecotoxicity testing when testing PW, as reported by de Vries and Jak (2018). In their standardized ecotoxicity tests of PW from 25 platforms, 20 tests met the validity criteria defined in the ISO test for Vibrio fischeri, but only 11 and 10 met those of the Acartia tonsa and Skeletonema costatum tests, respectively. Only PW from six platforms fulfilled the validity criteria of all three tests (de Vries & Jak, 2018). This is important, since compliance with validity criteria of standardized ecotoxicity test methods is a cornerstone in determining the regulatory reliability of ecotoxicity test results, and only results that are assessed as "regulatory reliable" can be used to derive the PNEC required in the RBA (ECHA, 2008).
Another challenge of the WET approach is that despite the available WET values in the literature (Table 1), the inability to compare ecotoxicity across samples means the PNEC must be derived from ecotoxicological tests carried out on a site- and sample-specific basis. Consequently, the PNECs of PW derived by the WET approach are often estimated based on the results of a few tests and by applying a large AF (often 1000) (OSPAR, 2014b). The WET approach also provides no insight into the cause of PW ecotoxicity, meaning that it cannot in itself be used to guide efforts towards targeted treatment or substitutions of compounds.

IMPLICATIONS FOR RISK ESTIMATION IN THE RBA
In the RBA framework, hazard assessment is followed by an exposure assessment where the predicted environmental concentration (PEC) of PW or the individual constituents is estimated depending on whether the WET or SB approach was applied in the hazard assessment, respectively. The PEC can be derived at a predetermined distance by the Rye dispersion model (Karman & Smit, 2019) or in a three-dimensional, time-dependent grid by the Dose-related Risk and Effect Assessment Model (DREAM) (M. Reed & Hetland, 2002). In the subsequent risk estimation of the RBA framework, the ratio of PEC:PNEC can be used to assess risk. Alternatively and more commonly, the PEC is used in combination with an SSD to quantify risk, expressed as the PAF (Kabyl et al., 2020; Smit et al., 2005). For the SB approach, the PAF of each PW constituent can be combined into an overall risk by rules of probability addition, expressed as the multisubstance PAF (msPAF) (De Zwart & Posthuma, 2005). In cases where the PNEC was derived by the lowest ecotoxicological value, as is the case for the majority of compounds in Table 2, an SSD can still be derived for risk estimation, by using a toxicological mode of action-associated slope and scaling the SSD curve to the PNEC value as described by Smit et al. (2008).
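The chain from PEC and PNEC to PAF and msPAF can be sketched under explicit assumptions. Here each SSD is assumed to be log-normal and scaled so that 5% of species are affected when the PEC equals the PNEC, which mirrors the acceptance limits PEC:PNEC > 1 and PAF > 5%. The slope parameter (`sigma`, the standard deviation of log10 sensitivities), the constituent names, and all concentrations are hypothetical; the scaling procedure of Smit et al. (2008) may differ in detail.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()          # standard normal distribution
Z_05 = STD_NORMAL.inv_cdf(0.05)    # z-score of the 5th percentile


def paf(pec, pnec, sigma_log10):
    """PAF from a log-normal SSD whose 5th percentile sits at the PNEC."""
    if pec <= 0:
        return 0.0
    z = math.log10(pec / pnec) / sigma_log10 + Z_05
    return STD_NORMAL.cdf(z)


def ms_paf(pafs):
    """Probability addition: 1 minus the product of unaffected fractions."""
    unaffected = 1.0
    for p in pafs:
        unaffected *= 1.0 - p
    return 1.0 - unaffected


# Hypothetical constituents (PEC and PNEC in ug/L, illustrative slopes).
constituents = {
    "constituent_a": {"pec": 0.5, "pnec": 1.0, "sigma": 0.7},
    "constituent_b": {"pec": 2.0, "pnec": 1.0, "sigma": 0.7},
}

pafs = {
    name: paf(v["pec"], v["pnec"], v["sigma"])
    for name, v in constituents.items()
}
overall = ms_paf(pafs.values())

for name, value in pafs.items():
    print(f"{name}: PAF = {value:.3f}")
print(f"msPAF = {overall:.3f}")
```

Under probability addition the msPAF can never be lower than the largest single PAF, so one constituent with a conservative PNEC can dominate the combined estimate.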

Comparison of risk estimates derived by the two approaches
The quantification of risk expressed as the PAF and msPAF allows for a comparison of the WET and SB approach. A study by de Vries and Jak (2018) compared the outcomes of using the WET and SB approach in the RBA on PW samples from 25 platforms within the OSPAR region. The WET approach predicted a higher risk more often than the SB approach, thus representing a more conservative approach (albeit only slightly), while the SB approach provided valuable information about the drivers of ecotoxicity. The study concluded that "[…] the information obtained from the WET tests and a SB approach are complementary (address different aspects of hazard) and should not be used interchangeably" (de Vries & Jak, 2018). The shortcomings of each approach identified in the present study support this conclusion. The free choice between the SB and WET approach in the RBA should ideally be replaced by the requirement that elements of both be utilized, to provide a better estimation of true PW ecotoxicity. An example of how this can be implemented is found in the UK guidance for the RBA (BEIS, 2020), in which a tiered assessment beginning with WET screening and ending with SB evaluation is proposed.
The SB approach is currently used more frequently for hazard assessment than the WET approach within OSPAR (OSPAR, 2021), and is implemented fully in the Norwegian and Danish frameworks for RBA assessments. The SB approach and following assessments are indeed more readily available following a chemical analysis, due to predefined PNECs (Table 2). In addition, the SB approach is better aligned with the subsequent parts of the RBA framework, as seen in the exposure assessment, where the SB approach allows for an individual PEC estimation that accounts for the differences in partitioning, transformation, and fate across PW constituents (M. Reed & Hetland, 2002). For the risk estimation of the RBA framework, the SB approach is at times favored as it is compatible with the environmental impact factor (EIF). This makes it necessary to address the uncertainties associated with especially the SB approach, and how it relates to estimating risk.

Management of uncertainties in final risk estimates
In the risk estimation of RBA, risk (i.e., "the likelihood that adverse effects may occur" (OSPAR, 2012)) is deemed unacceptable in cases where PEC:PNEC > 1 or where PAF or msPAF > 5%. Risk is evaluated at either a predetermined distance from the platform (commonly 500 m) or in each grid cell affected by the discharge as determined by DREAM (OSPAR, 2014b). In the latter case, the sum of grid cells (of cell size 100 m × 100 m × 10 m) where risk exceeds acceptable levels forms the foundation of the EIF, as described by Johnsen et al. (2000) and Smit et al. (2003). The EIF is a decision-making tool developed along with DREAM (Smit et al., 2003). The advantage of the EIF is that it connects risk to a single unit of measure, which can be compared across platforms, simulated scenarios, and years (Smit et al., 2003). EIF calculations have therefore been used to identify discharges with the highest risk to prioritize where mitigation efforts are most needed (Rye et al., 2004).
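The cell-counting step that underlies the EIF can be sketched for a single time step. The grid values below are hypothetical msPAF estimates, the nested-list layout (depth layer, row, column) is an assumption for illustration, and the sketch does not reproduce how DREAM resolves or aggregates the time dimension.

```python
CELL_VOLUME_M3 = 100 * 100 * 10   # cell size quoted in the text (m^3)
RISK_THRESHOLD = 0.05             # msPAF > 5% is deemed unacceptable


def count_exceeding_cells(ms_paf_grid, threshold=RISK_THRESHOLD):
    """Count grid cells whose msPAF is strictly above the threshold."""
    return sum(
        1
        for layer in ms_paf_grid
        for row in layer
        for value in row
        if value > threshold
    )


# Hypothetical msPAF values: 2 depth layers x 2 rows x 3 columns.
ms_paf_grid = [
    [[0.01, 0.06, 0.20], [0.00, 0.05, 0.07]],
    [[0.00, 0.02, 0.08], [0.00, 0.00, 0.01]],
]

n_cells = count_exceeding_cells(ms_paf_grid)
volume_m3 = n_cells * CELL_VOLUME_M3
print(f"{n_cells} cells exceed the threshold ({volume_m3} m^3 of water)")
```

A cell at exactly 5% does not count, and a cell at 6% counts the same as one at 20%, which is the binary character of the measure.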
A limitation of the EIF and PEC:PNEC risk estimates is that they reflect a binary assessment of whether risk is above or below acceptable levels and thus have no associated margin of error. This is problematic, because although the RBA framework is reliable and based on scientifically sound principles, the ecotoxicological input data needed for the hazard assessment must be of sufficient quality and quantity to support the complexity of its use in the risk estimation of RBA. The PNECs used in RBA are commonly derived from a limited ecotoxicological data set that is rarely in compliance with the data needs for SSD estimations. In addition, the challenges presented here indicate that even when the data are available, they may not be reliable, resulting in a large knowledge gap regarding the actual ecotoxicity of PW and its constituents.
This knowledge gap or data uncertainty is compensated for by the use of AFs, which may be as large as 1000 or, as mentioned previously, even higher if ECHA's recommendation for marine PNEC determination was followed (ECHA, 2008). For the SB approach, this high AF is applied to every constituent (and for most compounds again at the SSD extrapolation from its PNEC) and the approach is therefore susceptible to a high accumulation of uncertainty in the final risk estimate. The AFs create uncertainty not only about the magnitude of PW ecotoxicity but also about what drives it. Large AFs may be unnecessarily conservative and thus inadvertently overestimate the individual chemicals' contribution to the overall risk estimate (the EIF). This may be the reason why PCs are often highlighted by RBA as the biggest contributors to risk, as the lack of ecotoxicity data for PCs de facto yields AFs of 1000 (see Table 2). Results from long-term and/or chronic ecotoxicity tests would reduce the uncertainty and provide a more reliable estimation of the PNEC for PCs. The focus for further data generation should thus be on PCs and other compounds with large AFs to assess their actual contribution to risk in the RBA. Doing so may also decrease the gap between the WET and SB approach, as differences in their risk estimates were found to be mainly attributed to difficulties in determining PC concentrations, which in turn highly affected SB toxicity due to their large AFs (de Vries et al., 2022). Failure to do so could result in wrong interpretation of what causes PW ecotoxicity and thus an unbalanced prioritization of mitigation efforts.
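How strongly the choice of AF steers the binary outcome can be shown with a small sensitivity check. The PEC and the lowest effect concentration are hypothetical, and the three AFs are chosen only to span the range discussed in this review; the sketch is not a substitute for a formal uncertainty appraisal.

```python
def risk_quotient(pec, lowest_effect_conc, assessment_factor):
    """PEC:PNEC ratio with PNEC = lowest effect concentration / AF."""
    pnec = lowest_effect_conc / assessment_factor
    return pec / pnec


# Hypothetical production chemical (concentrations in ug/L).
pec = 0.5
lowest_effect_conc = 100.0

quotients = {
    af: risk_quotient(pec, lowest_effect_conc, af)
    for af in (10, 100, 1000)
}

for af, rq in quotients.items():
    verdict = "unacceptable" if rq > 1 else "acceptable"
    print(f"AF = {af:>4}: PEC:PNEC = {rq:.2f} ({verdict})")
```

With identical exposure and test data, only the largest AF pushes the quotient above 1. Reporting the quotient alongside the AF used would make this dependence visible to decision makers.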

Recommendations
Despite the validity of the RBA framework, it lacks transparency. Reducing its final outcome to a single figure (the EIF or PEC:PNEC ratio) does not do justice to the underlying estimations and assumptions that derived it and may be overly simplified to properly inform decision-making. The findings of this literature review have led to the following recommendations for improved PW risk management:
• The WET and SB approaches provide complementary information, with the former assessing if the sample is toxic and the latter identifying drivers of toxicity as targets for treatment or chemical substitution.
• The experimental protocols of WET testing should be optimized for handling the complex physicochemical characteristics of PW.
• The chemical analysis of PW composition should be improved, targeting the UCM, and the array of compounds included in the SB approach should be expanded accordingly.
• The quality and quantity of ecotoxicity data underlying PNEC and SSD calculations should be increased, focusing on chemicals with high AFs.
As a minimum, an uncertainty appraisal must be an integrated part of the final reporting of risk estimates in the RBA to ensure that mitigation actions are given proper weight in relation to the reliability of the ecotoxicity estimations.

DATA AVAILABILITY STATEMENT
Data, associated metadata, and calculation tools are available from the corresponding author Lars Skjolding (lams@env.dtu.dk).