Comparing conventional and green fracturing fluids by chemical characterisation and effect-based screening

Highlights
• Results do not indicate lower concentrations present in green fluids.
• Results do not indicate lower number of chemicals present in green fluids.
• Similar genotoxic potential between green and conventional fluids.
• No clear difference in toxicity between green and conventional fluids.
• Tested green fluids are not environmentally friendly alternatives to conventional fluids.

Introduction
Hydraulic fracturing is a well stimulation technique which is used for oil and gas production from relatively impermeable rock formations, such as shale, sandstone or limestone. During this process fracturing fluid, made up of water (~90%), proppants such as sand (~9%), and chemical additives (~1%), is injected into the targeted earth formations (Vidic et al., 2013). The additives used in fracturing fluid include biocides, scale and corrosion inhibitors, oxygen scavengers, cleaners, gelling agents, friction reducers, iron controls, surfactants, cross-linkers, breakers, conditioners and clay stabilisers (Annevelink et al., 2016;Faber et al., 2017). The number and volume of chemicals needed depend on the local subsurface conditions and chemical properties of the water used (Vidic et al., 2013). Although chemical additives are used in relatively low concentrations in the fracturing fluid, due to the large total volumes of fluid needed during a hydraulic fracturing event total loads to the environment can still be high.
To protect health and the environment, air, soil and water contamination should be prevented (Gordalla et al., 2013;Grant et al., 2015;Elliott et al., 2017;Soeder, 2018;Sumner and Plata, 2018a;Faber et al., 2019;Hu et al., 2019;Mehler et al., 2020;Bradbury and Smith, 2020). Contamination is, however, known to occur through surface spills or underground leaks (Schout et al., 2019;Woda et al., 2018;Wen et al., 2019;Hammond et al., 2020;Wójcik and Kostowski, 2020). It can consist of chemical additives used in the fracturing fluid, compounds naturally present in the targeted formation, and reaction products created under the specific high pressure and high temperature conditions (Vidic et al., 2013;Kahrilas et al., 2016;Faber et al., 2017;Sumner and Plata, 2018a, 2018b). Potential contamination risks from hydraulic fracturing activities can be mitigated to some degree by proper contaminant management and wastewater treatment technologies (Boschee, 2014;Camarillo et al., 2016;Butkovskyi et al., 2017;Faber et al., 2017;Chen et al., 2019;Acharya et al., 2020). While it is difficult to control the compounds mobilised from the subsurface, the use of greener chemicals in fracturing fluid is one of the approaches proposed to reduce risks (Thomas et al., 2019;Fernandez et al., 2019;Wu et al., 2020). However, it is thus far unclear to what extent the current green fracturing fluids indeed reduce these risks.
This paper aims to evaluate the potential that current green fracturing fluids have to reduce the risks related to environmental contamination following hydraulic fracturing. In the research presented here, a selection of the twelve principles of green chemistry (Anastas and Eghbali, 2010;Anastas and Williamson, 1996) is used as a guideline to assess a number of conventional and green fracturing fluids available on the market. These principles of green chemistry relate to chemical properties, processes, their impacts on human health and the environment and their financial implications. The principles related to chemical properties that are used in this study aim to reduce toxicity of parent compounds and their transformation/degradation products and prevent persistence in the environment. Green fluids can be developed by substituting toxic and persistent chemicals with greener alternatives (following the twelve principles) that have the same functional properties. Due to the importance of finding such alternatives, the field of designing and synthesizing safe chemicals has grown in the past two decades (Coish et al., 2016;Kümmerer and Clark, 2016;Erythropel et al., 2018). Most of the focus, however, has been on 'greening' the production processes (i.e., less toxic and persistent feedstocks, solvents and by-products) and less attention has been given to the toxicity and persistence of the end products such as additives in fracturing fluids (Kümmerer, 2007;Zimmerman et al., 2014).
The composition of conventional fracturing fluids is relatively well known due to registration of chemicals used in these products in databases such as Fracfocus (Faber et al., 2017;Annevelink et al., 2016;Elsner and Hoelzer, 2016;Stringfellow et al., 2017;Vidic et al., 2013).
There is, however, only limited information available on the specific composition of green fracturing fluids (Tollefson, 2013;Hurley et al., 2016;Thomas et al., 2019). The available literature and patents (Berger and Berger, 2008;Crews, 2006;Hanes et al., 2011;Jung et al., 2015;Leshchyshyn et al., 2010;Leshchyshyn et al., 2013;Loveless et al., 2011;Saini et al., 2010;Shao et al., 2015;Sun et al., 2018;Weston et al., 2015;Wilkins et al., 2016;Yegin et al., 2017;Zhu et al., 2017) mainly relate to the partial composition of green fracturing fluids or well stimulation fluids, or to the use of environmentally friendly alternatives for additives such as surfactants or gelling agents, but do not allow for a complete overview of the composition. Furthermore, the patents relate to fracturing fluids in general or for use in specific domains, such as geothermal energy or coal bed methane, and are not specified towards fracturing fluids used for shale gas extraction.
In view of the limited chemical and toxicological information available on green fracturing fluids, analytical chemistry and bioassay techniques are used here to gain more insight into the properties of green fracturing fluids used. The aim of this study is to compare fracturing fluids which are marketed as containing either conventional or green chemicals, based on both their chemical composition and on their toxicity. This is used to evaluate a selection of adverse health and environmental effects most relevant for long term exposure that could result from contamination of environmental compartments such as sources of drinking water.

Samples and scientific approach
Two conventional fracturing fluids confidentially shared by two different suppliers ('supplier 1' and 'supplier 2') are compared to two green fracturing fluids from the same suppliers. Local tap water used to produce the fracturing fluids was included as control samples. Sample "tap water 1" was used to produce the "conventional 1" and "green 1" fracturing fluids, and "tap water 2" was used to produce the "conventional 2" and "green 2" fluids. A blank, using Millipore water, was also prepared as a control sample. These control samples were handled in the same way as all other samples and used for background subtraction, i.e. features that did not exceed 5 times the response in a control sample were not taken into account. The green and conventional products from supplier 2 could not be completely dissolved, so both a suspension and a water accommodated fraction (WAF) were obtained. All controls and fracturing samples or their WAF were analysed for their chemical composition using liquid chromatography-high resolution mass spectrometry (LC-HRMS) suspect screening and non-target screening (Sjerps et al., 2016;Hollender et al., 2017), and for their toxicity as assessed with a selection of in vitro bioassays. Advanced analytical techniques such as LC-HRMS are needed due to the potentially high number of chemicals used in fracturing fluids (Schymanski et al., 2014b;Faber et al., 2017). LC-HRMS allows for the detection of features and their potential identification using their specific m/z ratios and retention times. Environmental persistence data for the tentatively identified features were gathered in order to preliminarily assess whether the green fluids are expected to be less persistent in the environment than the conventional ones.
Furthermore, in vitro bioassays allow for the detection of biological effects of the chemical mixtures present in the samples that may be associated with adverse health effects (Escher and Leusch, 2012;Escher et al., 2014;König et al., 2017;Blackwell et al., 2018). The present study focused on mutagenicity and specific toxicity (oxidative stress, anti-androgenic activity, estrogenic activity, polyaromatic hydrocarbon activity, genotoxicity and cytotoxicity) using the Ames fluctuation test (Heringa et al., 2011;Reifferscheid et al., 2011;Albergamo et al., 2020) and a number of CALUX reporter gene assays (Pieterse et al., 2013;Sonneveld et al., 2005; Van der Linden et al., 2014).

Suspect list generation
Suspect screening was performed using an oil and gas related suspect list, which includes chemical additives from, among others, the US Fracfocus database and potential subsurface contaminants as found in the literature (presented earlier in Faber et al., 2017). This oil and gas related suspect list focuses on conventional fracturing fluids and includes 1386 chemical compounds, of which 403 can potentially be detected by the applied analytical techniques. These techniques focus on relatively polar organic compounds that are difficult to remove by water treatment technologies and therefore pose a threat to drinking water production (Reemtsma et al., 2016). For the purpose of the present study, this earlier oil and gas related suspect list (Faber et al., 2017) was extended to include green chemicals related to oil and gas activities as found in literature and patents. In order to search for those green fracturing related chemicals, the key words "green", "environmentally friendly", "(bio)degradable", "non-toxic", and/or "clean" were used in combination with "fracturing fluid", "hydraulic fracturing", "stimulation fluid", and/or "surfactants" in Scopus (Burnham, 2006) and Google Scholar (Jacsó, 2008). The last year of publications used to generate the suspect list presented here was 2018. In total 53 additional suspects were added to the original oil and gas related suspect list (Berger and Berger, 2008;Crews, 2006;Hanes et al., 2011;Jung et al., 2015;Leshchyshyn et al., 2010;Leshchyshyn et al., 2013;Loveless et al., 2011;Saini et al., 2010;Shao et al., 2015;Sun et al., 2018;Weston et al., 2015;Wilkins et al., 2016;Yegin et al., 2017;Zhu et al., 2017). An additional criterion was that the organic compounds had a mass between 80 and 1300 Da, so that they could be analysed using our methods. The total number of suspects, including conventional and green fracturing fluid related chemicals, that can be analysed using the analytical-chemical methods applied amounts to 456 (Table A.1).
This table includes information on the functions of the different chemicals in the fracking fluid, their use in conventional and/or green fluids, and their chemical properties, such as molecular formulae, molecular weight, and n-octanol-water partition coefficient. Information on toxicity is generally limited for these compounds (Faber et al., 2017) and was therefore not included.
For comparison, the 2018 SusDat database was also used. This database is provided by the European Network of reference laboratories, research centres and related organisations for monitoring of emerging environmental substances, known as NORMAN (Dulio and Slobodnik, 2009). It consists of almost 60,000 chemicals relevant for environmental monitoring (Schymanski and Williams, 2017), of which 57,214 can be detected by the analytical methods used in this study, based on the mass cut-off.
The samples and controls were all analysed in triplicate. All samples were transferred to a 50 mL flask and the internal standards atrazine-d5 and unlabelled fenuron were added at concentrations of 1.0 μg/L and 0.5 μg/L, respectively, which allowed for LC-HRMS performance evaluation and quality control. Subsequently, samples were centrifuged, except for the Millipore water control and the two tap water samples from suppliers 1 and 2, prior to filtration using Phenex™-RC 15 mm syringe filters with a pore size of 0.2 μm (Phenomenex, Torrance, USA). The Millipore water control was run at least every five samples to ensure signal stability of the internal standards and to check for carry-over and contamination.
The liquid chromatography (LC) method used for this study is the same as that described in Brunner et al. (2020). The aqueous phase or WAF of every sample was analysed after a tenfold dilution using liquid chromatography coupled to a Tribrid Orbitrap Fusion mass spectrometer (Thermo Fisher Scientific, Bremen, Germany), in both positive and negative ionisation mode, using an XBridge C18 column of 2.1 × 150 mm. The LC gradient started at 5% acetonitrile, 95% water and 0.05% formic acid (v/v/v). The acetonitrile fraction was then increased to 100% (with 0.05% formic acid) within 25 min and held constant for 4 min. The flow rate was 0.25 mL/min. The full scan mass range was 80-1300 Da with MS1 and MS2 resolutions of 120,000 and 15,000, respectively. The product ions measured by the Orbitrap were generated in the ion-routing multipole at a normalised collision energy setting of 35%, using an isolation width of 1.6 Da. Electrospray ionisation (ESI) source conditions were: vaporiser and capillary temperature 300 °C, sheath gas 40 Arb, auxiliary gas 10 Arb, sweep gas 5 Arb, RF lens 50%, spray voltage 3000 V (pos) and 2500 V (neg).
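As an illustration of the gradient described above, the following minimal sketch returns the acetonitrile percentage at a given run time. It assumes a linear ramp from 5% to 100% over the first 25 min, which the text does not state explicitly; the function name and constants are hypothetical and only mirror the values given in the paragraph.

```python
# Sketch of the LC gradient, ASSUMING a linear ramp (not stated in the text).
START_ACN = 5.0      # % acetonitrile at t = 0 min
END_ACN = 100.0      # % acetonitrile at the end of the ramp
RAMP_MIN = 25.0      # ramp duration in minutes
HOLD_MIN = 4.0       # hold time at 100% acetonitrile in minutes


def acetonitrile_percent(t_min):
    """Return the assumed % acetonitrile at run time t_min (minutes)."""
    if t_min < 0 or t_min > RAMP_MIN + HOLD_MIN:
        raise ValueError("time outside the 29 min gradient programme")
    if t_min >= RAMP_MIN:
        return END_ACN  # isocratic hold
    # Linear interpolation between the start and end composition.
    return START_ACN + (END_ACN - START_ACN) * t_min / RAMP_MIN
```

Under this assumption the mobile phase passes 52.5% acetonitrile halfway through the ramp, which gives a rough idea of the eluent strength at the retention times reported later.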

Data analysis
LC-HRMS raw data files for all samples including controls were processed using Compound Discoverer 3.0 (Thermo Scientific, San Jose, USA) for peak picking and suspect screening. The Compound Discoverer output consists of a feature list, i.e., a table with accurate mass / retention time pairs (features) and their intensity. The feature intensity is reported as peak area. The 'Group Area', which is the median response of the triplicates, was taken from Compound Discoverer for further statistical analysis. Only features with an intensity response of at least 50,000 and a response at least 5 times higher than that in the control were considered. Searches were performed with 5 ppm mass tolerance. The processed data was exported to RStudio as a .csv file for further data analysis and visualization (R Core Team, 2017). Violin plots were used to visualise the retention time and molecular mass distribution of the different samples (Hintze and Nelson, 1998). A third dimension was added as a color code to present the intensity of a certain feature. Two multivariate analysis techniques, i.e., principal component analysis (PCA) and hierarchical clustering, were applied to group and characterise samples and features. PCA reduces data complexity and can reveal relationships between samples when the principal components are depicted in a scores plot (Masiá et al., 2014). Cos2 was included in the PCA plots, which shows the importance of a principal component for a certain observation (Abdi and Williams, 2010). Hierarchical clustering groups samples and features based on their similarity, as calculated by a distance matrix. Both samples and features were clustered based on Euclidean distances and visualised in a heat map (Köhn and Hubert, 2014).
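The filtering criteria and mass tolerance described above can be sketched in a few lines of Python. This is only an illustration and not the Compound Discoverer workflow itself: the function names are hypothetical, and it is assumed that the control response is summarised by the median of its replicates in the same way as the sample 'Group Area'.

```python
from statistics import median

MIN_INTENSITY = 50_000  # minimum peak area, as stated in the text
MIN_FOLD = 5            # minimum sample/control response ratio, as stated
PPM_TOL = 5.0           # mass tolerance in ppm, as stated


def group_area(replicate_areas):
    """Median peak area of the replicate injections (the 'Group Area')."""
    return median(replicate_areas)


def keep_feature(sample_areas, control_areas):
    """Apply the intensity and background criteria to one feature.

    ASSUMPTION: the control is summarised by its median, like the sample.
    """
    sample = group_area(sample_areas)
    control = group_area(control_areas)
    return sample >= MIN_INTENSITY and sample >= MIN_FOLD * control


def ppm_error(measured_mass, theoretical_mass):
    """Signed mass error in parts per million."""
    return (measured_mass - theoretical_mass) / theoretical_mass * 1e6


def matches_suspect(measured_mass, suspect_mass, tol_ppm=PPM_TOL):
    """True if the measured mass lies within the ppm tolerance of a suspect."""
    return abs(ppm_error(measured_mass, suspect_mass)) <= tol_ppm
```

For a hypothetical suspect mass of 200.0000 Da, a 5 ppm tolerance corresponds to a window of ±0.0010 Da, so a measured mass of 200.0005 Da would count as a match while 200.0020 Da would not.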
Identification was attempted for the five retrieved features with the highest signal intensity. The elements used to determine the molecular formula were carbon, hydrogen, oxygen, nitrogen, and phosphorus, and if suggested by the MS1 spectra, chlorine and/or sulphur were also considered. The most likely formula was determined by taking into account the direction (+ or -) of the internal standard mass error for a specific spectrum and isotope information. Further identification was carried out using MS2 fragmentation data for spectral library searches against mzCloud (HighChem LLC, Slovakia) in Compound Discoverer 3.0 (Thermo Fisher Scientific), and MetFrag queries (Ruttkies et al., 2016), including MassBank of North America fragmentation similarity searches. The certainty of identification was reported according to Schymanski et al. (2014), where level 5 indicates low confidence and level 1 high confidence. The general uses of the compounds and persistence data were added for the top 5 features that were identified to a confidence level of at least 3. The uses were collected from the CompTox Chemicals Dashboard (CompTox, 2019; Williams et al., 2017) and Haz-Map (Haz-Map, 2019;Fitzpatrick, 2004). Environmental fate data were gathered in the form of volatilization half-lives, biodegradation rates, bioconcentration factors and removal percentages in wastewater treatment, obtained from the EPISuite n-lake model (Patel and Boethling, 2006), the OPERA model (Mansouri et al., 2018), the BCFBAF model (Garg and Smith, 2014) and the STPWin model (Ottmar et al., 2010), respectively.
In order to find out whether the features detected in the green and conventional products matched more or less with the green and conventional compounds on the oil and gas related suspect list, the percentages of matches to the different categories were calculated for each product. The compounds included in the oil and gas related suspect list were divided into three categories: green, conventional and conventional/green. The last category was necessary because some compounds may correspond to either a green or a conventional suspect on the oil and gas related suspect list.
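The per-product percentages can be illustrated with a short sketch. It assumes that each percentage is the share of a product's suspect matches that falls into a given category; the text leaves the exact denominator open, and the function name is hypothetical.

```python
from collections import Counter

# The three categories of the oil and gas related suspect list.
CATEGORIES = ("conventional", "conventional/green", "green")


def category_percentages(matched_categories):
    """Percentage of a product's suspect matches per category.

    matched_categories: one category label per matched feature.
    ASSUMPTION: the denominator is the product's total number of matches.
    """
    counts = Counter(matched_categories)
    total = sum(counts[c] for c in CATEGORIES)
    if total == 0:
        return {c: 0.0 for c in CATEGORIES}
    return {c: 100.0 * counts[c] / total for c in CATEGORIES}
```

A product with six conventional, two conventional/green and two green matches would, under this assumption, be reported as 60%, 20% and 20%.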

Ames fluctuation test
Mutagenicity was assessed by testing the samples as a whole or their WAF at different dilutions in Evian water (1, 1:10, 1:30 and 1:100), including the undiluted suspension, in the Ames fluctuation test. Ames fluctuation test bacterial strains, culture media, and S9 liver enzymes from phenobarbital/β-naphthoflavone-exposed rats were purchased from Xenometrix GmbH (Allschwil, Switzerland). Histidine, nutrient broth no. 2 oxoid, 2-AA, MgCl2·6H2O, NaH2PO4·H2O, and Na2HPO4·2H2O were obtained in analytical grade from Boom (Meppel, the Netherlands). NaCl and KCl were purchased from Avantor Performance Materials B.V. (Deventer, the Netherlands). 4-NOPD, 4-NQO, NF, D-glucose-6-phosphate, nicotinamide adenine dinucleotide phosphate, and ampicillin were purchased from Sigma-Aldrich (Zwijndrecht, the Netherlands). The 24- and 96-well plates were obtained from Greiner Bio-one (Alphen a/d Rijn, the Netherlands) and the Corning 384-well plates from Sigma-Aldrich. The Ames fluctuation test uses genetically modified Salmonella typhimurium bacteria to investigate whether a given sample can cause DNA mutations which may lead to genotoxic effects (Heringa et al., 2011;Reifferscheid et al., 2011). The Ames fluctuation test was performed as reported previously (Heringa et al., 2011;Vughs et al., 2018) with minor modifications, using strain TA98 for the detection of frame-shift mutations and TA100, which is sensitive to base-pair substitution, instead of TAmix (Albergamo et al., 2020). All samples were tested in triplicate in the Ames fluctuation test with and without S9 liver enzyme mix, as well as a solvent control (dimethyl sulfoxide; DMSO) and positive controls in DMSO. 20 μg/mL of 4-nitroquinoline N-oxide (4-NQO) and 500 μg/mL 4-nitro-o-phenylenediamine (4-NOPD) were used as positive controls for TA98 − S9.
For TA98 + S9, 5 μg/mL 2-aminoanthracene (2-AA) was used, for TA100 − S9, 12.5 μg/mL nitrofurantoin (NF) was used and for TA100 + S9, 20 μg/mL 2-aminoanthracene (2-AA) was used as positive control. Results are expressed as the number of cell culture wells in which the pH indicator in the culture medium turned yellow. For the test to be valid, the average of the triplicate solvent control should show ≤10 yellow wells, while ≥25 yellow wells need to be counted for the positive controls. The Ames fluctuation test gives a binomial response; therefore, a χ² test with p < 0.05 was performed to determine whether the response differed significantly from the Evian control. When a sample showed a statistically significant response in at least one of the test conditions (TA98 or TA100, with or without S9), the sample was considered to be mutagenic. Samples that test negative for genotoxicity but do show cytotoxicity might be false negatives.
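The validity criteria and the χ² comparison against the Evian control can be sketched as follows. This is a minimal illustration under several assumptions that are not stated in the text: 48 wells per replicate, a Pearson χ² test on a 2 × 2 table without continuity correction, and hypothetical function names. The p-value for one degree of freedom is obtained from the complementary error function.

```python
import math


def plate_is_valid(solvent_mean_wells, positive_control_wells):
    """Validity criteria from the text: solvent <= 10, positive control >= 25."""
    return solvent_mean_wells <= 10 and positive_control_wells >= 25


def chi_square_2x2(pos_sample, pos_control, wells=48):
    """Pearson chi-square (1 df, no continuity correction) and its p-value.

    ASSUMPTION: 'wells' is the number of wells per condition (48 here).
    """
    a, b = pos_sample, wells - pos_sample      # sample: yellow / not yellow
    c, d = pos_control, wells - pos_control    # control: yellow / not yellow
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0, 1.0  # no variation in the table, nothing to test
    chi2 = n * (a * d - b * c) ** 2 / denom
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # survival function for 1 df
    return chi2, p_value


def is_mutagenic(pos_sample, pos_control, wells=48, alpha=0.05):
    """Positive if the sample exceeds the control and p < alpha."""
    _, p_value = chi_square_2x2(pos_sample, pos_control, wells)
    return pos_sample > pos_control and p_value < alpha
```

With these assumptions, 30 yellow wells in a sample against 5 in the control would be scored as positive, whereas 6 against 5 would not.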
In some cases, cytotoxicity or stimulation of cell growth may have impacted the sensitivity of the Ames fluctuation assay. Cytotoxicity may lead to false negatives; however, if optical density was reduced by less than 10%, the occurrence of a false negative response is considered unlikely. Stimulation of cell growth may lead to false positives. Additionally, if the negative control showed a high response or the positive control a low response, then a false positive or a false negative, respectively, cannot be excluded for the test samples.

CALUX reporter gene assays
Fluids were tested as complete product or WAF in the CALUX reporter gene assays. The CALUX test uses modified mammalian cell lines to assess activation or inhibition of specific reporter genes as toxicological end-points. Tests were performed by Biodetection Systems (BDS, 2019). Fracturing fluid samples were blindly tested in a battery of bioassays including anti-AR CALUX® (anti-androgenic activity), ERα CALUX® (estrogenic activity), Nrf2 CALUX® (activation of the Nrf2 pathway which is associated with oxidative stress response), PAH CALUX® (polyaromatic hydrocarbon activity), P53 CALUX® (activation of the p53 pathway which is associated with genotoxicity, with and without metabolic activation by S9 liver enzymes) and cytotox CALUX® (cell death) (Sonneveld et al., 2005;Pieterse et al., 2013;Van der Linden et al., 2014;De Baat et al., 2019) using standard protocols of Biodetection Systems (BDS, 2019). DMSO served as a solvent control for all of the assays. Flutamide, 17β-estradiol, curcumin, benzo[a]pyrene, actinomycin D, cyclophosphamide, and tributyltin acetate were used as positive controls for the anti-AR CALUX®, ERα CALUX®, Nrf2 CALUX®, PAH CALUX®, P53 +/- S9 CALUX® and cytotox CALUX® assays, respectively. Exposure conditions included a DMSO dilution series of 1×, 10×, 30× and 100×. The selection of CALUX bioassays was based on earlier research on hydraulic fracturing related products (Faber et al., 2019). Because polyaromatic hydrocarbons cannot be chemically detected with the analytical methods used in this study, PAH CALUX® was performed in order to detect their presence. The cytotox CALUX assay was also included to study non-specific effects on cell viability that may confound positive or negative responses in the CALUX assays. If a result falls below the limit of quantification (LOQ), it is considered negative since no quantifiable activation of the specific pathway has occurred.

LC-HRMS based non-target screening
3.1.1. Overview of detected features, molecular weight and retention time ranges
The relative standard deviations of the retention time and the peak areas were calculated for the internal standards atrazine-d5 (positive ionisation) and bentazone-d6 (negative ionisation). The relative standard deviations of the retention time are 0.07% and 0.08% for positive and negative ionisation, respectively, showing a good reproducibility of the analysis. The relative standard deviation of the peak area shows a good reproducibility in positive ionisation (3.59%). In negative ionisation mode the reproducibility is lower (17.6%), although still acceptable.
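For reference, the relative standard deviation used here as a reproducibility measure is the standard deviation expressed as a percentage of the mean. The sketch below assumes the sample standard deviation (n − 1 in the denominator), which the text does not specify; the function name is hypothetical.

```python
from statistics import mean, stdev


def relative_standard_deviation(values):
    """RSD in percent: 100 * sample standard deviation / mean.

    ASSUMPTION: the sample (n - 1) standard deviation is used.
    """
    avg = mean(values)
    if avg == 0:
        raise ValueError("RSD is undefined for a mean of zero")
    return 100.0 * stdev(values) / abs(avg)
```

Applied to the retention times or peak areas of an internal standard across all injections, a small value indicates a stable chromatographic or detector response.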
There is no clear indication that the tested green products contain considerably fewer chemicals (number of features) or chemicals at lower concentrations (summed feature intensities) than the tested conventional products (Fig. 1). The green sample 1 has a substantially higher summed feature intensity in negative ionisation than the other three samples. The same is true for the green sample 2 in positive ionisation, where it has a higher summed feature intensity than the other samples. The conventional sample 2, however, contains the highest number of features in both positive and negative ionisation modes.
The distribution of retention times and molecular weights was assessed for the positive and negative ionisation results as a whole (see Figs. A.1 and A.2) and for each individual sample in positive (Fig. 2) and negative ionisation (Fig. A.3). Overall, the results show low molecular weights and a wide range of polarity. For all four fluids, the majority of features have low molecular weights with an average of 500 Da (pos) and 250 Da (neg). The two conventional fluids and the green fluid 2 have similar retention time distributions in both ionisation modes, with most features detected at retention times ranging from 8 to 12 min and, to a lesser extent, from 17 to 25 min in positive ionisation and between 8 and 12 min in negative ionisation mode. The retention time distribution of the green fluid 1, however, differs from that of the other samples in that most of the features are only detected around 10 min in positive ionisation and around 18 min in negative ionisation.

Fig. 3. PCA plots for positive (a) and negative (b) ionisation. Light blue represents the blanks, dark blue represents the tap waters used to prepare the products, brown represents the conventional fluids and green the green ones. The round symbols represent the products 1, the triangles represent the products 2 and the blank is represented by a square. Cos2 shows the importance of a principal component for a certain observation (the larger the object the higher the importance).

3.1.2. Similarities and differences among samples
The PCA plots (Fig. 3) based on the LC-HRMS results show the similarities and differences among the samples for positive and negative ionisation. The dimensions 1-3 were chosen based on the scree plots (Figs. A.4 and A.5), as these three dimensions together explain 79% and 86% of the variance in positive and negative mode, respectively.
In both ionisation modes, all triplicates cluster together, indicating reproducible measurements. The blanks and the tap waters used for preparing the fluids also cluster together. In positive ionisation mode the green product 1 (nrs 12-14) shows a high resemblance to the blanks and tap water samples, in contrast to the green product 2 (nrs 32-34). The conventional products 1 and 2 are clearly distinct from each other and differ substantially from the blanks and the green products. In negative ionisation, the green and conventional products also differ substantially. However, contrary to what was observed in positive ionisation, the green product 2 shows many similarities with the blanks for dimensions 1 and 2. The green product 1, however, seems different from the green product 2 and the blanks when the dimensions 1 and 3 are plotted. The conventional products 1 and 2 seem relatively similar when looking at dimensions 1 and 3, which is not true for dimensions 1 and 2. These observations are also supported by the hierarchical clustering heatmaps in Figs. A.6 and A.7.

Fig. A.8 shows the number of suspect hits based on accurate mass matching for the SusDat NORMAN and the oil and gas related suspect list for positive and negative ionisation. The percentage represents the number of suspect hits relative to the total number of suspects in each list. For both lists more matches were found in positive ionisation than in negative ionisation mode. The number of suspect matches per sample in positive and negative ionisation follows the same trend as that observed for the number of detected features in positive and negative ionisation (Fig. 1a). The samples, however, have a much higher percentage match with the oil and gas related suspect list, which can be explained by the specific composition of this list. This emphasizes the value of tailored suspect lists for the detection of chemicals based on non-target screening data.
The positive ionisation results show a higher percentage match than the negative ionisation results, suggesting that the oil and gas related suspect list is more relevant to positively ionisable compounds than to the negatively ionisable ones.

Suspect screening
The percentages of matches per category and per sample are presented for positive and negative ionisation modes in Fig. 4a and b, respectively. For both positive and negative ionisation, the conventional suspects account for the highest number of matches, followed by the conventional/green category, while the green suspects represent the lowest number of matches. Of all the samples, however, the green product 1 has the most matches with the green suspects in both ionisation modes. The green suspects have a higher percentage of matches in negative ionisation than in positive ionisation.
The matches described here are not further specified with regard to confidence levels (Schymanski et al., 2014) so these could potentially be false positives. Confirmation at level 3 or higher is needed to improve the certainty of identification (Vergeynst et al., 2015).

The 5 features with the highest intensities
The five features with the highest intensities were tentatively identified for each sample in positive and negative ionisation mode and reported in Table A.2ab. Laureth-4 (5274-68-0), Laureth-5 (3055-95-6), Laureth-6 (3055-96-7) and Laureth-7 (3055-97-8) are found as suspects in the conventional product 1, and dodecyl sulphate (151-41-7) is a candidate for the green product 1. These compounds are all reported to be used as surfactants in industrial formulations (Haz-Map, 2019;CompTox, 2019), which is relevant for hydraulic fracturing related activities. Myristyl sulfate (4754-44-3), a candidate for the green product 1, is an inert ingredient used in pesticides (CompTox, 2019). Azelaic acid (123-99-9), a candidate found in the conventional product 2, is used for example as a plasticiser or adhesive and is also a natural component found in some foods (Haz-Map, 2019). These two candidates (myristyl sulfate and azelaic acid) may be less relevant for hydraulic fracturing related activities. No uses were found for the other tentatively identified candidates.
Environmental fate parameters, i.e., volatilization half-life, bioconcentration factor and wastewater removal percentage, were available for a limited number of tentatively identified suspects (Table A.2ab). The volatilization half-lives range from 3.88 to 1.47E+03 years for the green product 1 and from 16 days to 7.07E+07 years for the green product 2. The volatilization half-lives for the conventional product 1 are substantially higher than for the green ones and range from 4.32E+08 to 5.15E+16 years, while the conventional product 2 has ranges similar to the green product 2. Biodegradation half-life estimates are comparable for all four products, at around four days, based on these tentatively identified suspects. There also does not seem to be a clear distinction between the conventional and green products with regard to the bioconcentration factor. The overall removal percentages of the tentatively identified suspects in wastewater are low (1.85%) to medium (48.40%), with the conventional products having slightly higher removal rates than the green ones. The results show that the environmental fate parameters do not strongly differ between the green and conventional products, based on available data for a selection of the top five chemicals that were identifiable to a confidence level of at least 3. More insight into the chemicals contained in the fracturing fluids is needed for a complete overview of the environmental persistence.
The intensities of the five features with the highest intensities were compared to the total summed feature intensity of each sample (Fig. 5ab). The top five features of the conventional products and the green product 2 represent about 5-12% (pos) and 10-15% (neg) of the total summed feature intensities, while the green product 1 has substantially higher percentages, with 48% (pos) to 90% (neg). Unfortunately, only limited publicly available toxicity data (EFSA, 2019; Toxnet, 2019) were found for the top five identified candidates. Because of this limitation, the toxicity data cannot be used to assess differences between conventional and green fracturing fluids.
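The percentage reported above is a simple ratio: the summed intensity of the five most intense features divided by the total summed feature intensity of the sample. A minimal sketch of that arithmetic in Python is given here for illustration only; the function name `top_n_share` and the example intensities are hypothetical and are not data from this study.

```python
def top_n_share(intensities, n=5):
    """Return the percentage of the total summed intensity carried by the n most intense features."""
    if not intensities:
        raise ValueError("at least one feature intensity is required")
    total = sum(intensities)
    if total <= 0:
        raise ValueError("total summed intensity must be positive")
    # Sort descending and keep the n largest intensities.
    top = sorted(intensities, reverse=True)[:n]
    return 100.0 * sum(top) / total


# Hypothetical feature intensities (arbitrary units), not measured values.
example = [900.0, 50.0, 20.0, 10.0, 10.0, 5.0, 5.0]
print(f"Top-5 share: {top_n_share(example):.1f}%")
```

A sample dominated by a few intense features, as in this made-up example, yields a share close to 100%, whereas a sample with many features of similar intensity yields a low share.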

Bioassay test results
For tests using TA98 with and without S9 and using TA100 with S9, the solvent and positive controls indicated that the outcome of the test can be considered valid. In the test with TA100 without S9, a relatively high response was observed in the negative DMSO control. The sample responses were compared to the Evian control (used for the dilution series), for which no positive responses were observed. For some samples, the exposure interfered with the read-out of the Ames fluctuation test (medium color change) due to an unknown mechanism, which is why data are not available for every test condition (combination of strain and absence or presence of S9 metabolic mixture). However, for each of the fluids, mutagenicity data are available in two or more test conditions. An overview of the Ames fluctuation test results is shown in Table 1. None of the control water samples scored positive in the Ames fluctuation test for mutagenicity. This indicates that any observed mutagenicity of a sample results from the chemicals of which it is composed. No data are available for the conventional product 1, except for the highest dilution (1:100), which gave a positive result (TA98 + S9). The conventional product 2 showed mutagenicity for the undiluted suspension (TA98 + S9) as well as for the undiluted water accommodated fraction (TA100 + S9). These results indicate that the conventional product 1 has a higher genotoxic potential than the conventional product 2. The green product 1 showed positive responses at 1:30, 1:10 and 1:1 dilutions in the TA100 without S9 condition. These positive responses cannot, however, be confirmed because the DMSO control also gave a positive response. Clear positive responses were, however, observed for the water accommodated fraction of the green product 2 at dilutions 1:1 and 1:10 (TA100 + S9). Overall, the conventional product 1 shows the highest genotoxic potential of all the samples tested. However, the green product 2 shows an overall higher genotoxic potential than the conventional product 2.
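The comparison of genotoxic potential above rests on the highest dilution at which a sample still tested positive. A minimal sketch of how such a summary value could be derived from a dilution series is given below; the function name `highest_positive_dilution`, the encoding of results as True/False/None, and the example read-outs are assumptions made for illustration and do not reproduce the study's data handling.

```python
def highest_positive_dilution(results):
    """Return the largest dilution factor that still tested positive.

    `results` maps a dilution factor (1 for 1:1, 10 for 1:10, ...) to
    True (positive), False (negative) or None (read-out not available).
    Returns None when no dilution tested positive.
    """
    positives = [factor for factor, outcome in results.items() if outcome is True]
    return max(positives) if positives else None


# Hypothetical read-outs for one strain/S9 condition, not data from this study.
example = {1: True, 10: True, 30: False, 100: None}
print(highest_positive_dilution(example))
```

Missing read-outs are kept distinct from negatives so that a condition without usable data is not mistaken for a negative result.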
Conventional product 1 induced effects in the CALUX reporter gene assays. This sample tested positive for anti-androgenic activity and cytotoxicity. The other samples all showed results below the level of quantification and are considered negative (Table 2 and Table A.3). For the conventional product 1, the results may be confounded by cytotoxicity, which may have resulted in false negatives. Overall, only one of the conventional products induced some effects in the CALUX assays, but the lack of results for the other fluids precludes a conclusion on the differences in toxic potential between conventional and green products.

Discussion and conclusion
The aim of this study was to determine whether the tested green fracturing fluids can be considered as more environmentally friendly than the tested conventional ones by verifying two principles of green chemistry, i.e., lower toxicity and lower environmental persistence.
The distribution of retention times and molecular weights of the green and conventional samples indicates that the compounds are relatively small and have a wide range of polarity and thus a wide range of solubility in water. There is hardly any overlap in features between the four types of fracturing fluids investigated. Considering their composition, the green fracturing fluids contain a smaller variety of chemicals than the conventional ones. However, the green product 2 shows a substantially higher summed feature intensity than any of the other samples in positive ionisation, and the green product 1 shows a substantially higher summed feature intensity than any of the other samples in negative ionisation. This means that in the event of a failure, a smaller variety of chemicals, but at potentially higher concentrations (as suggested by the higher signal intensities), could enter the environment with the green products than with the conventional ones.
The positively ionisable compounds have a higher number of matches with the oil and gas related suspect list than the negatively ionisable ones, suggesting that the suspect list is more relevant to positively ionisable compounds. It should, however, be noted that the chromatographic conditions slightly favor positive ionisation over negative ionisation due to the presence of formic acid, and that the Orbitrap used for this project is more sensitive in positive mode than in negative mode. The conventional suspects account for the highest number of matches, while the green suspects account for the lowest. Additionally, the green suspects have a higher percentage of matches in negative ionisation than in positive ionisation. There is a need to improve the suspect list by adding more relevant suspects that ionise in negative mode, and more suspects categorised as green. The matched suspects detected from the NORMAN database (Dulio and Slobodnik, 2009) could be a good starting point, although labour-intensive further identity confirmation would be required. This might provide further insight into oil and gas related chemicals that have green chemical properties and that are used within the European context, and so help to further improve the oil and gas related suspect list.
From a toxicological point of view, the mutagenicity results from the Ames fluctuation test do not indicate that the green fluids are safer than the conventional ones. All samples showed positive responses for mutagenicity, and the green product 2 showed a higher genotoxic potential than the conventional product 2. These results indicate that the green products contain one or more chemicals that can be mutagenic. The use of these fluids can thus not be regarded as safe, since it may introduce potentially genotoxic chemicals into the environment, which may result in a health risk. A comparison between the green and conventional fluids could not be made based on the CALUX data, as only the conventional product 1 showed anti-androgenic activity and cytotoxicity. Moreover, the Ames test and the P53 CALUX test both assess processes that may be related to potential mutagenicity, but the results are not the same for a given sample. The samples all tested positive in at least one of the Ames fluctuation tests, whereas none of the samples induced a P53 CALUX response. This may be explained by the different mechanisms assessed by the two tests. The Ames test determines DNA mutations, whereas the P53 CALUX test determines a cellular response to genotoxicity. The CALUX tests used within this project may not be sensitive enough to detect specific effects in the fracturing related samples, or the toxicological end-points targeted by the selected CALUX tests may not be relevant for these samples. A more extensive comparison could draw on the many other bioassays available for effect-based testing, which can assess molecular and cellular mechanisms such as those applied here, or effects on viability, growth and reproduction of small intact organisms as ecological models (Connon et al., 2012;Brunner et al., 2020;Moeris, 2020).
Based on the preliminary environmental fate assessment, there is no clear distinction between the green and conventional products. More insight into the chemical composition of the samples is, however, needed for a complete evaluation and assessment of their environmental persistence. Due to the limited number of tentatively identified peaks, it is difficult to verify the bioassay results with the known toxic effects of the tentatively identified candidates. A positive bioassay response suggests that a sample contains compounds that have an impact on biological mechanisms, that may be related to adverse health effects. Further chemical identifications, exact concentrations, detailed toxicological data and exposure assessments are needed for a comprehensive risk assessment. A higher number of detected features and/or a higher summed feature intensity does not necessarily result in a positive response in the bioassays (Table 2). Indeed, the toxicity of a sample will depend on the toxicity of the chemicals it is composed of.
Table notes: - indicates that the sample tested negative; (+) indicates a potential false positive; (−) indicates a potential false negative; / indicates that the sample was not analysed for this test; no data indicates that the results could not be read; a: WAF, water accommodated fraction; b: number of detected features; c: total summed feature intensities; d: the highest dilution at which a sample tested positive is displayed for the Ames test results.
It should be noted that the results presented here are limited to the analysis of two types of fracturing fluids marketed as either "conventional" or "green" provided by two different suppliers and that the composition and toxicity results might not be representative of fracturing fluids from other suppliers. These results do not support the claim that currently available, green-labeled fracturing fluids are more environmentally friendly alternatives to conventional fracturing fluids. Based on this first assessment, the green fracturing fluids cannot be considered distinctly safer than the conventional ones, and thus there is a need for more research on green alternatives for use in fracturing fluids. For future development of sustainable chemistry, there is a need for closer collaboration among the fields of chemistry, environmental sciences and toxicology (Anastas, 2016). New approaches have recently been developed in order to address these concerns. The concept of a circular chemistry includes the reuse of chemical waste streams (Keijer et al., 2019) and the design of chemicals with the goal of removing the toxic properties whilst keeping the functional properties intact (Anastas, 2019;Zimmerman et al., 2020). Still, legislative stimuli for chemical industries to design more environmentally friendly alternatives are currently largely lacking (Munthe et al., 2019;Posthuma et al., 2019).

Declaration of competing interest
The authors declare that they have no conflict of interest.