Benthic eDNA metabarcoding provides accurate assessments of impact from oil extraction, and ecological insights

Ba and Cu. Calculations of the macroinvertebrate-based Norwegian Sensitivity Index (NSI) based on COI metabarcoding data agreed well with corresponding morpho-taxonomy values and with the PI. Further, we identified a set of bioindicator taxa from both metabarcoding datasets, to develop novel biotic indices and demonstrate their predictive performance using cross-validation. Finally, we compared co-occurrence networks from impacted vs. non-impacted sites, to improve the understanding of the ecological consequences of impacts. Our study demonstrates that metabarcoding can act as a meaningful and relatively accurate complement to the current morpho-taxonomic approach.


Introduction
In order to ensure a balance between environmental impact and socioeconomic benefits, Norwegian legislation regulates the extent of offshore oil exploitation, and requires that extraction activities are routinely monitored according to a system that divides the Norwegian continental shelf into monitoring regions that are surveyed on a rotating basis (Norwegian Environment Agency, 2020).In addition to physicochemical parameters, this monitoring regime includes sampling of benthic sediment macrofaunal communities for a subset of sampling stations based on perceived risk of anthropogenic impact.This "morphotaxonomic" monitoring component has been successful (e.g.Gray et al., 1990), but represents a significant investment of time and resources, not only from sampling and processing of collected sediment samples (sieving and sorting), but also due to the manual taxonomic identification of a large number of sediment grab samples (Baird and Hajibabaei 2012;Aylagas et al., 2018;Pawlowski et al. 2018).This, in turn, limits the spatial and temporal resolution possible, as sampling stations are typically surveyed only every three years (Bakke et al., 2011;Norwegian Environment Agency 2020).In addition, it causes a significant time lag, typically more than one year, between surveys and final monitoring reports, which may slow down environmental management due to failure to identify warning signs at an early stage.
Morphological species identifications can also be subject to human error and bias by individual taxonomists, and are limited by a shortage of taxonomic expertise and by cryptic species complexes (Hynes, 1994;Maurer, 2000;Schander and Willassen 2005;Bourlat et al., 2013;Chariton et al., 2014;Cowart et al., 2015;Danovaro et al., 2016).Increased human pressure on marine ecosystems, including accelerating biodiversity loss, climate change and an expansion of extraction operations to vulnerable and previously unexploited ecosystems, often with lesser-known benthic fauna, poses a major challenge for marine monitoring (Glover et al., 2018).Thus, there is a clear and imminent need to adapt, evaluate and implement genomics-based monitoring methods such as metabarcoding, with a clearly demonstrated ability to increase the cost-effectiveness and capability of monitoring for future ocean stewardship (Baird and Hajibabaei 2012;Cordier et al., 2018;Cordier et al., 2020).
Complementing or replacing current monitoring protocols with metabarcoding could provide faster and cheaper results without being limited by available taxonomic expertise, thus allowing for more extensive and timely monitoring.Several studies have shown the feasibility of inferring existing macrofaunal biotic indices (BIs), such as AMBI, directly from macrofaunal taxonomic profiles predicted from metabarcoding data (Lejzerowicz et al., 2015;Aylagas et al., 2016;Cordier and Pawlowski, 2018), although insufficient coverage of reference sequence databases and their taxonomic labels are an important limitation for this approach (Hestetun et al., 2020;Cordier et al., 2020).
A significant strength of metabarcoding is that it also allows targeting of organisms that are unavailable in morpho-taxonomic data, including micro-and meiobenthos (Bourlat et al. 2013;Cowart et al. 2015;Cordier et al., 2018).This advantage has already been demonstrated specifically for monitoring impacts of offshore oil extraction, including the identification of several potential meiobenthic, protist and prokaryotic bioindicator taxa (Lanzén et al. 2016;Laroche et al. 2018a;Mauffrey et al., 2020).Yet, since little is known about the ecology of such species, their predictive power as bioindicators need to be better demonstrated, preferably in concert with developing one or several novel BIs.While de novo approaches can benefit from using taxonomyindependent OTUs as bioindicators, this brings new challenges, such as defining and mapping Operational Taxonomic Units (OTUs) across studies (Mächler et al., 2020;Cordier et al., 2020).
Here, we applied eDNA metabarcoding to a set of benthic monitoring stations sampled by triplicate sediment grabs (3x100 samples) spanning a gradient of environmental disturbance that includes both operational discharges and several instances of accidental leakage of drilling waste from injector wells.Routine monitoring sites from ten different platforms and regional control stations were included from the neighbouring monitoring regions II and III in the North Sea and region IX in the Barents Sea.Sampling and experimental design followed a recommended protocol based on a previous comparative study investigating the effects of replication and methods for DNA extraction and sample homogenisation (Hestetun et al., 2021).Two different taxonomic markers were used, namely mitochondrial cytochrome oxidase subunit I gene (COI) targeting metazoa, and the hypervariable regions V1-V2 of the small subunit rRNA gene, targeting total eukaryotes (18S).
In order to evaluate the accuracy of metabarcoding as a technique for monitoring the impacts of oil extraction, we compared alpha diversity measures and overall community structure between the metabarcoding datasets and the reported morpho-taxonomic data, as well as the correlation of both datasets to physicochemical parameters, including an environmental pressure index integrating four major physicochemical indicators of impact.Further, we estimated impact using the Norwegian Sensitivity Index (NSI; Rygg and Norling, 2013) based on COI metabarcoding data, again comparing its congruence with morphotaxonomic results and impact.Finally, we identified bioindicator taxa from both 18S and COI metabarcoding results, used them to develop de novo BIs that included a broader range of taxa compared to the NSI, and evaluated the predictive power of these de novo BIs using crossvalidation.

Study sites and samples
For the purpose of routine environmental monitoring of oil and gas extraction activities, the Norwegian continental shelf is divided into a system of 11 regions (I-XI) from the southern tip of Norway to the Barents Sea.Benthic monitoring surveys are conducted in 2-3 regions each year on a rotating basis, so that each region is surveyed every three years (Fig. 1A).This monitoring is organised by the industry itself, carried out by a handful of prequalified environmental consultancy companies, and overseen by and reported to the Norwegian Environment Agency.Routine monitoring of benthic environmental status is based on van Veen grab samples, and includes physicochemical parameters and sediment characteristics, as well as morphological identification and analysis of macrofaunal benthic communities (>1 mm).Monitoring stations are semi-fixed, resampled for each survey based on the results from the previous survey, and are typically placed as crosspattern gradients around installations with a smaller number of reference ("regional") stations for each subregion (Bakke et al., 2011;Norwegian Environment Agency 2020).
The samples from this study were collected during the regular monitoring surveys in 2018-2019, covering a subset of installations and sampling stations from regions II (2018) and III (2019) in the northern part of the North Sea between Scotland, Shetland and Norway, as well as region IX (2019) in the Barents Sea.The depths of the stations in regions II and III included in this study range from around 90-180 m, while region IX stations were located between 305 and 362 m below the surface.Sediment conditions are mostly dominated by sand, with slightly larger pelite fractions in the deeper areas (Hatlen et al., 2019;DNV 2020;Mannvik et al., 2020).In total, 291 samples, representing 97 stations distributed over ten installations and ten reference stations (3-4 per region), were selected for metabarcoding sampling.This included four installations from region II: Balder ("BAL", 12 stations), Gina Krog ("GK", 13 stations), Ivar Aasen ("IA", 14 stations) and Ringhorne ("RIN", 7 stations); five from region III: Brage ("BRA", 5 stations), Oseberg C ("OSC", 5 stations), Oseberg F ("OSF", 4 stations), Oseberg South ("OSS", 9 stations) and Veslefrikk ("VFR", 10 stations); and one installation from region IX, namely Snøhvit ("SG", 8 stations).The selection included several installations where regular monitoring had previously reported moderate to severe environmental impact from high hydrocarbon contamination (in particular RIN, OSS and VFR), resulting in degraded macrofaunal communities.The geographical distribution of platforms and reference stations roughly followed a North-East to South-West axis, which was therefore used to model the influence of the geographical distance in multivariate statistics modelling (Fig. 1).
Physiochemical, sediment and macrofaunal data collected and analysed as part of the regular monitoring survey were obtained from the environmental consultant companies STIM Miljøtjenester AS (2018, region II) (Hatlen et al. 2019), DNV-GL (2019, region III) (DNV, 2020) and Akvaplan-niva (2019, region IX) (Mannvik et al. 2020), or through the MOD Database, a repository of Norwegian offshore monitoring data (DNV GL, 2021).The morpho-taxonomy datasets were collected using standard methodology (ISO, 2014) with five grab replicates for each sampled station, sieved with a 1 mm sieve, separated from remaining sediment and identified to lowest possible taxon level by the consultant companies.Physicochemical parameter measurements were in turn subcontracted by the environmental consultant companies to Eurofins (Environment and Analyses departments) and Sintef Norlab AS.
For comparison with metabarcoding results, the reported morphotaxonomic data from the first three spatial replicate grabs from each station were pooled before being compared to pooled metabarcoding data (see below).This station-level pooling was done as the corresponding three metabarcoding replicates were not consistently collected from the exact same grab replicates as those of morphology-based inventories.When available for replicate grabs, arithmetic means of measured physicochemical values were also calculated for each station.All physicochemical and morphotaxonomy data for individual and pooled samples, are available in Supplementary Tables S1-S3.

Calculation of a physicochemical pressure index (PI)
As a proxy for environmental impact from oil extraction activities, a physicochemical pressure index (PI) was developed as previously described in Aylagas et al. (2017), in order to correspond directly to the ecological group (EG) scale, ranging from EG I including species that are very sensitive to disturbance, to EG V, including first order opportunistic species highly tolerant to disturbance (Borja et al., 2000).Total hydrocarbons (THC), total measured polyaromatic hydrocarbons (PAH16), barium (Ba) and copper (Cu) were included in the calculation of the PI.These variables were selected based on low background values compared to disturbed sites, correlation to the environmental status as estimated by morpho-taxonomic monitoring (NSI), and for being unequivocal markers of extraction activity (Gray et al 1990;Olsgard and Gray 1995;Bakke et al., 2013).Since no specific threshold classification exist for Norwegian offshore areas, coastal sediment classifications (AA-EQS), based on the Norwegian implementation of the Water Directive were used instead, where available (2 ppm for PAH16 and 84 mg / kg for Cu) (Direktoratsguppen vanndirektivet, 2018).For THC and Ba, lacking such threshold values, limits for BI were instead established using linear correlation to NSI, as fitted values at NSI = 20 (delimiting EGs II and III) resulting in a limit of 72 ppm for THC and 464 ppm for Ba.The former agrees relatively well with clear effects being reported previously in sites with a concentration above 50 ppm (Norwegian Environment Agency, 2020) Another modification from Aylagas et al. (2017) was to extend the PI to allow values between 5 and 6 (>32x AA-EQS), to decrease the issue of values reaching the theoretical maximum of 5.0 at heavily disturbed stations.
For ordination analyses, specifically principal component analysis (PCA) of parameters, and their correlation to non-metric dimensional scaling (NMDS) based on community dissimilarity, the same parameters were also used to develop two more specific pressure indices.The first, "HC PI", based on total hydrocarbons and PAH16, and the second, "Metal PI", based on Ba and Cu.

Sample processing and DNA extraction
For metabarcoding sampling, surface sediment (0-2 cm) was collected using three grab replicates per sampling station by pooling three random sediment samples from each van Veen grab (in total ca.30 g per grab) (Hestetun et al., 2021).Samples were immediately frozen on board (-20 • C) and kept frozen during transport to the lab as validated by the use of a temperature logger.Sediment samples were thawed at 4 • C and pre-mixed by hand with a spatula before three 0.5 g extract replicate samples per sediment sample were removed for DNA extraction.We employed a hybrid protocol involving Qiagen PowerBead tubes and C1 solution for initial steps, homogenisation with a Precellys 24 homogenizer (6000 rpm for 40 s) (Bertin Instruments) and centrifugation (10 K rpm for 1 min), followed by the QIAsymphony SP robot (DSP DNA kit, Tissue LC protocol) for remaining steps.To control for contamination, extraction negative controls were included for each extraction event.Extracts were quantified using a Qubit 3.0 instrument (Thermo Fischer Scientific).

Metabarcoding library preparation and sequencing
PCR amplification and library preparation was performed in two steps using two molecular markers, 313 bp of the partial cytochrome oxidase subunit I (COI) using modified "Leray" primers (Geller et al., 2013;Wangensteen et al., 2018) and 350-390 bp of the V1-V2 region of 18S rRNA gene, using the primers SSU_F04mod (5'-GCTTGWCTCAAA-GATTAAGCC-3') (Cordier pers. comm.) and SSU_R22mod (Sinniger et al., 2016) followed by a second PCR step using adapter-linked indexprimers with 12 inserted random bases to improve sequencing performance.The modified "Leray" primer pair was chosen based on its ability to target a wider range of marine eukaryotes (Wangensteen et al., 2018).PCR amplification was performed using KAPA3G Plant kit (Kapa Biosystems) with 2 µl extract template.The three extract replicates from Fig. 1.Geographical overview of the studied area.Norwegian offshore monitoring is divided into eleven regions as shown in (A), with installations and regional controls studied here in regions II and III in the North Sea (B) as well as region IX in Barents Sea (C).Regions including studied samples are shaded and only targeted installations or regional reference stations (beginning with "R2", "R3" or "R9") are shown.
A. Lanzén et al. each sediment sample were pooled prior to PCR, and negative controls were used to check PCR cross-contamination.Protocol conditions for 18S included an initial 3 min step at 95 • C, 30 cycles of 30 s each at 95, 57 and 72 • C, and a final 10 m step at 72 • C. The COI protocol was identical, except annealing temperature was set to 45 • C. To account for ambiguous bases in the COI primers, primer concentration was tripled for this marker.Library preparation was done using the Illumina TruSeq i5/i7 barcode set with equimolar concentrations of PCR products.Grab replicates were individually indexed.Sequencing was performed at the Norwegian Sequencing Centre (University of Oslo, Norway) using an Illumina MiSeq instrument with 2x300 bp v3 chemistry.Raw sequences are available from the INSDC Sequence Read Archive with BioProject number PRJNA727023.
Briefly, read pair merging, primer removal and initial quality filtering was performed using vsearch v2.11.1 and cutadapt v1.18.A maximum of 20 and 40 mismatches were accepted when overlapping 18S and COI paired reads, respectively.Primers were then trimmed from overlapped sequence reads allowing a maximum of two mismatches, with sequences lacking the complete forward and reverse primers discarded.All COI amplicons shorter than 274 bp or longer than 333 bp after trimming were discarded, as well as 18S amplicons shorter than 330 bp or longer than 450 bp.Clustering was done using SWARM v2.2.1 (Mahé et al., 2015), using default settings, followed by removal of singletons, reference-based and finally de novo filtering of probable chimeric OTUs, using vsearch, with the same reference databases as later used for taxonomic classification (BOLD and SilvaMod v128, respectively; details below).Dereplication prior to clustering, as well as conversion of SWARM output to an OTU contingency table, was carried out using the scripts fasta_merging.py and matrix_creation.pyfrom SLIM (Dufresne et al., 2019).Finally, we carried out post-clustering correction using LULU (Frøslev et al., 2017) with a 97% similarity cutoff, and taxonomic assignments using CREST (Lanzén et al., 2012).For COI, the BOLD database was used Ratnasingham and Herbert (2017); accessed February 2018 and adapted to CREST as part of release 3.2.1 at htt ps://github.com/lanzen/CREST),while for 18S, we used SilvaMod v128 as reference (https://github.com/lanzen/CREST)based on SILVA SSURef NR v128 (https://www.arb-silva.de/documentation/release-128).
Likely contaminants were identified and removed based on both abundance profiles in the PCR and extraction blanks, in a plate-wise manner, using decontam (Davis et al. 2018), removing 156 18S OTUs and none from COI.We also filtered both metabarcoding datasets in order to remove all OTUs unclassified at phylum rank or as nonmetazoan for COI, and kingdom rank for 18S.Further, filtration was carried out based on taxonomic assignments in order to remove OTUs of likely pelagic origin, as described in Hestetun et al. (2021); see Supplementary Table S4).Cross-contamination was reduced by setting OTU abundances to zero where it occurred in a sample at very low abundances compared to its average abundance across samples (<1%), similar to the UNCROSS algorithm (Edgar 2016).Four individual COI samples (representing individual grab replicates) were also removed from further analysis due to insufficient read depth (<1,000 reads) and one from 18S (<10,000 reads), leaving 96 stations represented by at least one grab replicate in the final COI datasets (i.e.pooled grabs) on which all statistical analyses were carried out.All 97 stations were represented by 18S data.
Alpha diversity estimates (rarefied, i.e. expected richness at minimum read depth, and Shannon diversity) were calculated using the R package vegan v3.2.1 (Oksanen et al., 2019).Bray-Curtis pairwise dissimilarities were calculated based on relative OTU abundances, filtered to compensate for differences in sequence depth and random sampling effects.To this end, we removed all OTUs with a maximum abundance across samples below 0.05% for COI or 0.01% for 18S.Non-metric dimensional scaling (NMDS) as implemented in vegan (function met-aMDS) was used to transform and visualise the dissimilarity matrix to a non-linear approximation in 2D space, and Mantel tests (Mantel, 1967; function mantel) was used to compare dissimilarity matrices between datasets.The function envfit was used to compare physicochemical and biological parameters to the resulting NMDS space.For correlation analyses, we used the lm function in R and agreement was tested by calculating Cohen's Kappa using the kappa2 function of the irr R package v0.84 with squared weight (Gamer et al., 2019;Cohen, 1968).All R scripts used for filtering and statistical analyses are available in the GitHub repository (https://doi.org/10.5281/zenodo.4826641;directory R).

Identification of bioindicators, development and cross-validation of biotic indices
To identify potential bioindicator taxa, we first used TITAN2 v2.1 (Baker and King, 2010) to identify candidate taxa with PI as the explanatory variable.This method allows for detection of changes in taxon distributions along a continuous environmental gradient, such as the PI in this case.Only taxa occurring in five or more stations were retained.All such taxa with predicted "reliability" and "purity" above 0.95 were selected and modelled individually using quantile regression splines (QRS) as described previously by Lanzén et al. (2020) for eco group (EG) assignment.Quantile regression models were constructed using the quantreg package (rq function) for the 90th percentile (Koenker and d'Orey, 1994), corresponding to the value below which 90% of the taxon abundances are expected along the impact gradient.Second to fifth order polynomial splines were fitted to the resulting model (bs function) and EG selected as the group whose weight was closest to the peak of the fitted spline that showed the lowest resulting Akaike information criterion (AIC).If disagreeing with TITAN (EGs I-II vs. TITAN group II or EGs IV-V vs. TITAN group I), the indicator taxa was discarded.We then used all potential indicator taxa identified thus to construct a biotic index, as described in Lanzén et al. (2020).
For cross-validation, installations or regional control groups were rotated and samples from each excluded in one round of TITAN and QRS indicator taxa identification and biotic index construction.PI values were then predicted using the excluded platform or control as a validation dataset.A final biotic index was also derived by using the full dataset.The taxonomic affiliations of all resulting indicator taxa in this final index were illustrated separately for EGs (merging I and II, IV and V) using KronaTools v2.7.1 (Ondov et al., 2011).

Inference and analysis of co-occurrence networks
Network analysis was performed on 18S and COI data using a subset of installations from regions II and III.First, based on their PI value, samples from these regions were split into one group of impacted (PI > 2) and another group of non-impacted sites (PI < 2).Samples belonging to installations that did not have impacted sites were removed from the non-impacted dataset in order not to bias the comparison.All network inferences were performed using CoNet (v1.1.1 beta; Faust et al., 2012) in Cytoscape (v3.8.2) with an identical procedure for all networks.Abundances were imported as read counts, and normalisation to relative A. Lanzén et al. abundances was performed within CoNet.Taxa with a minimum occurrence below 33% were removed (read sums of excluded taxa retained) prior to further steps.Five measures of association were selected: Pearson and Spearman correlations, mutual information, Bray-Curtis and Kullback-Leibler distances.Network inference was then performed according to the permut and boot procedure with 100 permutations and 100 bootstraps.For initial edge selection, automatic threshold selection was employed (number of top and bottom edges set to 1000 for 18S and 450 for COI data).The p-values from each measure were merged using Brown's method, and an adjusted (Benjamini-Hochberg) p-value cut-off of 0.05 was set for the final edge selection.Networks were visualised and analysed in Cytoscape with the built-in Analyze Network tool.Node tables and edge tables were exported and further analysed using R functions.A list of potential indicator species was then constructed by selecting the top 10 taxa based on their node degree, betweenness centrality and closeness centrality, followed by merging these lists and removing duplicated taxa.Finally, NetShift was used to analyse changes among species associations between the impacted (case) and non-impacted (control) networks (Kuntal et al., 2019; https://web.rniapps.net/netshift/).

Alpha diversity and physicochemical parameters
Metabarcoding resulted in 24 and 33 million reads for COI and 18S respectively, after quality filtering, removal of singletons and probable chimeras.Of the raw read pairs obtained, 67% and 88% of the COI and 18S datasets remained after overlapping and initial quality filtering.An additional 0.7% of COI reads were removed as singletons and 0.5% as predicted chimeras.For 18S, 1.0% and 1.9% of reads were removed correspondingly.After additional filtering to remove probable contaminants and cross contaminating reads, non-target sequences (likely pelagic taxa, non-metazoa for COI and unclassified for 18S) and grab replicates with insufficient reads, 4.6 and 18.7 million reads were retained for COI and 18S, respectively (Supplementary Table S5).The majority of COI reads were removed for being non-metazoan, while most 18S reads that were removed were classified as likely to have a pelagic origin.Read coverage per pooled sample varied considerably, ranging from 7 to 133 thousand for COI and from 36 to 341 thousand for 18S.Thus, only rarefied richness values, i.e. the expected richness encountered at minimum coverage, were used when comparing alphadiversity estimates.Further, filtering of rare OTUs was applied so as not to bias multivariate statistics based on community dissimilarity.
A total of 8,186 COI and 25,319 18S OTUs were retained, of which 1,857 and 4,505 were retained for community dissimilarity analysis, after removing insufficiently abundant OTUs.Rarefied richness (RS) of samples ranged from 39 to 705 (median = 395) for COI and 268-1968 (median = 1359) for 18S, and was strongly correlated between the two datasets (Fig. 2D).Morpho-taxonomy data for macrofauna contained a total of 715 different taxa (including genera or higher taxa not classified to species rank) for the corresponding samples (1-157 per sample).Morpho-taxonomic Shannon diversity (H' morpho ) was more strongly correlated to metabarcoding-based rarefied richness (RS COI or RS 18S ) than to metabarcoding-based H' (data not shown).Interestingly, the 18S dataset, which included all eukaryotes, corresponded better to morphotaxonomic diversity than the COI dataset did, although the later was restricted to metazoa, just like morpho-taxonomic data (Fig. 2A-B).
A physicochemical pressure index (PI) was calculated based on the measured values of total hydrocarbon (THC) concentration, PAH16, Ba A. Lanzén et al. and Cu (see Calculation of a physicochemical pressure index).The resulting PI values ranged from 0.05 to 4.6 with only two stations (Veslefrikk 32 and 33) corresponding to PI > 4.0 or "very bad" status, mainly due to extreme THC and PAH16 concentrations (up to 3.7 wt% and 40 ppm, respectively).The impact gradient was skewed towards low impact, with 86 out of the 100 samples being undisturbed or slightly disturbed according to the PI, with values below 2.0.Only six samples had 2.0 ≤ PI < 3.0 (corresponding to moderate disturbance, from Ringhorne and Oseberg F) and another six ranged between 3.0 ≤ PI < 4.0 (heavy disturbance, all from Veslefrikk and Oseberg S; Supplementary Table S2).Nonetheless, the PI correlated moderately well with H' morpho (R 2 = 0.43) and RS 18S (R 2 = 0.30), while correlation with RS COI was weaker (R 2 = 0.11; see Fig. 2C, E, F).

Analysis of community composition and inferred biotic indices
NMDS analysis based on pairwise Bray-Curtis dissimilarities between samples was carried out based on morpho-taxonomic counts (Fig. 3A) as well as relative abundances of abundance-filtered COI and 18S OTUs (Fig. 3B-C).PCA was used instead for key physicochemical parameters (depth, geographical distance, grain size, pelite, gravel, sand, hydrocarbon-specific PI and Meta-specific PI), of which 65% of the variation could be explained by the resulting first two principal components (Fig. 3D).In all of these ordination results, samples from region IX formed a separate cluster, whereas regions II and III could only be separated based on morpho-taxonomy or 18S data (Fig. 3).
The consistency of pairwise dissimilarity matrices between datasets according to Mantel tests (Mantel, 1967) deviated slightly from correlations of alpha diversity measures across the same datasets (see Fig. 3).Specifically, 18S-based results were more similar to morpho-taxonomic results than they were to COI metabarcoding-based ones.Further, 18Sbased clustering was the most similar to that of environmental parameters, according to the same test (R = 0.75 for 18S vs. R = 0.69 for morpho-taxonomy; see Fig. 3).
PI and NSI correlated significantly with NMDS space in all three datasets.The PI had the strongest correlation to COI, followed by 18S and weakest correlation to morpho-taxonomic data (Supplementary Table S6).Depth, geography (SW-NE), Ba, sand and pelite also correlated significantly in all datasets (Fig. 3A-C) with no consistent pattern in correlation strength across datasets, except that PAH16 and individual PAHs did not correlate significantly with the clustering pattern based on 18S (Supplementary Table S6).
The most abundant phylum according to both morpho-taxonomy and COI was Annelida, whereas nematodes were more abundant according to 18S results (see Supplementary Figure S1).Out of all final reads, 52% could be classified to genus rank for COI, compared to only 5% for 18S.To compensate for this discrepancy without losing taxonomic information at lower ranks, taxa representing the lowest assignments regardless of ranks were used in further analysis.Eight out of the 25 most abundant of these best assignment taxa were shared between morpho-taxonomic and COI metabarcoding classifications (Supplementary Figure S2).
NSI values derived from COI metabarcoding data agreed significantly with values calculated using morpho-taxonomic data (n = 97, κ = 0.63, R 2 = 0.52, p < 0.001; Fig. 4A).NSI values derived from morphotaxonomic data as well as COI data also showed strong agreement and correlation to the PI (Fig. 4B-C).NSI could not be predicted confidently using 18S data due to the low number of classifications that could be made to species or even genus rank (in average only 5% of total abundance).

Bioindicators and metabarcoding-based de novo biotic indices
Predicted de novo biotic indices using cross-validation, based on COI A. Lanzén et al. data, performed similar to NSI values derived from the full metabarcoding dataset in terms of correlation with the PI, but with a slightly better agreement (κ = 0.66 for biotic indices vs. κ = 0.57 for NSI; Fig. 4C  and 5A).In 67 out of 96 pooled samples, the correct environmental status based on PI could be predicted and an exact PI at 0.2 units below the measured value was predicted in average (Fig. 5A).In comparison to COI, biotic indices derived from 18S metabarcoding data performed worse, but did result in highly significant correlation (R 2 = 0.20, p < 0.001; Fig. 5B).
Using the whole metabarcoding datasets for de novo biotic index prediction, a total of 49 potential indicator taxa could be identified based on COI, and 118 based on 18S (Fig. 6; Supplementary Table S7).For both datasets, the majority of indicators appeared to be sensitive to contamination or disturbance (EGs I and II).For COI, the taxonomic composition at higher ranks were similar across EGs with Polychaeta and Cnidaria appearing in all groups from sensitive to tolerant or opportunistic taxa (Fig. 6A-C).However, at finer resolution, there was no overlap, and no family was found to harbour both genera or species sensitive to disturbance with others insensitive.
In general, arthropod taxa were more common among indicators of disturbance than taxa from other phyla (groups II-V; Fig. 6B-C).Synchaetidae and Nematoda only occurred in groups IV and V. Several indicators identified in the COI-based index were taxa that are already established as bioindicators, with 26 occurring in AMBI version 6.0 (Borja et al., 2000) of which 14 agreed regarding sensitivity, i.e. group membership (sensu group I-II vs. III vs. IV-V; see Supplementary Table S8).Few metazoan OTUs from 18S data could be classified to genus or species rank (Fig. 6D-F), but at family level many bioindicators identified using 18S overlapped with COI (Supplementary Table S8).
Examples were Capitellidae and Opheliidae, with species rank indicators predicted and assigned to the same ecological group (EG) based on COI data.Further, a large number of protists were represented based on 18S data, including Stramenopiles (mainly in groups I-III and notably Labyrinthulomycetes), Cercozoa (notably Imbricatea, across groups but with different genera represented) and Alveolata (notably ciliates across groups but again with different genera represented).Several fungal taxa were also found among the potential indicators of groups I-III whereas lobose amoeba were found among groups IV-V, along with Excavata (Andalucia and Oxymonadida), NAMAKO groups I and II and Palpitomonas.

Inference and analysis of co-occurrence networks
In total, samples from four platforms (OSS, RIN, OSF and VFR) were included in the network analysis of taxa, of which 40-42% were impacted (PI > 2, see Table 1).For both 18S and COI, the majority of taxa were shared between impacted and non-impacted sites (80% and 54%, for 18S and COI, respectively).A majority of the potential indicator taxa identified appeared in the 18S networks, while only about one third of identified COI indicators appeared (see Fig. 7).All network nodes, potential keystones and edge properties are summarised in Supplementary Tables S9-S14.
There were some notable differences between impacted and nonimpacted networks.For 18S-based results, the major difference was  the number of mutual exclusion type associations, shown by the higher average negative degree and the higher percentage of negative edges in the impacted network (Table 1).Essentially, taxa with a higher node degree tended to have negative edges connected to them in the impacted network, while higher degree taxa had mostly positive edges in the nonimpacted network (Fig. 8A-B).In addition, the non-impacted 18S network was larger, in terms of diameter and radius, with longer average shortest path length, and had somewhat higher density and centralization compared to the impacted network (Table 1).In many ways, the opposite trends were observed in the COI networks.The impacted COIbased network had higher average degree, higher density and significantly higher neighbourhood connectivity (p = 1.7E-11), compared to the non-impacted network.Further, the proportion of negative edges was higher in the non-impacted COI network (Fig. 8C-D).The impacted and non-impacted COI networks were identical in terms of diameter and radius.However, the relatively few shared edges compared to the high percentage of shared nodes suggests a significant reorganization of cooccurrence patterns between impacted vs. non-impacted sites.This was confirmed by low Jaccard edges scores (0.198 and 0.340 for 18S and COI respectively) between the most common sub-networks determined by NetShift.Changes in the composition of detected modules (communities) are shown as community shuffle plots (Supplementary Figure S3).
Considering the attributes of the bioindicator nodes in the 18S network, those with the highest degree typically belonged to EG IV-V in the impacted network, while the most connected belonged to EG I-II in the non-impacted networks.Further, the bioindicator nodes were significantly different from other nodes in the impacted network, having higher degrees due to a greater number of positive associations.There was little overlap between the potential keystone nodes in the impacted and non-impacted networks, although four 18S taxa and six COI taxa were present in both networks (see Supplementary Table S12).Among these, Amastigomonas mutabilis, Marimonadida, PW19, Ascidiacea and Capitella capitata, were also identified as bioindicators.

Discussion
Our results show that sediment eDNA metabarcoding can be used to directly and accurately infer impacts from offshore oil extraction, by  utilising biotic indices such as the Norwegian Sensitivity Index (NSI), given sufficient spatial replication, even though the accuracy obtained did not reach the same accuracy as values inferred using morphotaxonomic data.Here, we used three DNA extraction replicates and three grabs per station, as suggested by an earlier study using a subset of the samples analysed here for methodological benchmarking and optimisation (Hestetun et al., 2021).We also verify that inferred NSI values based on COI metabarcoding correlated and strongly agreed with morpho-taxonomic data and with a custom pressure index (PI) developed to take into account the main impacts caused by oil drilling and extraction in the studied habitats.
Inference of existing indices requires a barcode with sufficient taxonomic resolution for the targeted group, in this case macroinvertebrate metazoa, for which genera or species information is necessary for most indicators in the NSI (Rygg and Norling, 2013).Here, this could only be achieved using COI metabarcoding, as opposed to 18S.This is in line with previous studies (e.g.Gibson et al. 2014), highlighting that 18S, although a suitable marker for protists, is typically not informative at the species level for metazoans (Tang et al., 2012).For COI, sequence records in BOLD were available for 78% of the genera previously identified morphologically in monitoring region IV, just north of regions II and III studied here (Hestetun et al., 2020).While the database coverage for at least one part of 18S for identified morphotaxa in region IV was not very far behind, at 56%, only 5% of the 18S reads could be classified to genus rank, compared to 52% of COI reads.This discrepancy can be explained by Metazoa only corresponding to roughly half of the 18S sequence reads and being dominated by nematodes, which are not included in routine monitoring surveys and for which repositories of barcode reference sequences are only rudimentary (Schratzberger and Ingels, 2018).The relative sequence read abundance of nematodes may also have been overestimated compared to their true proportion of eDNA abundance or biomass, since the 18S primers used here were originally developed specifically for this phylum and thus likely to be negatively biased against other phyla due to mismatches (Blaxter et al., 1998).Finally, database coverage is likely lower for the less studied Barents Sea, from which we also included samples in this study.
In spite of the difficulty in taxonomic classification of the 18S data, we were able to successfully develop novel biotic indices based on this dataset as well as the COI metabarcoding data.We chose to use the lowest, best resolution taxonomic groupings for this purpose, i.e. including unclassified higher groups when assignment in cases where assignment to lower ranks was not possible.This facilitates re-use of developed indices in future datasets as long as the same reference  database is used (Lanzén et al., 2020).It also avoids the problem of choosing a specific taxonomic rank for analysis, which inevitably results in a compromise (Salis et al. 2017).However, the development of de novo indices is in itself a taxonomy independent strategy, and while not done here, can instead be applied directly to unlabelled OTUs based on sequence similarity as in e.g.Keeley et al. (2018), Lanzén et al. (2020) or Mauffrey et al. (2020).While this alternative strategy can bypass limitations imposed by taxonomic references, it has the drawback of requiring any new sequence data not included in the original analysis to be mapped to previously obtained OTU sequences.
Supervised machine learning offer another set of promising approaches that have been used successfully to predict the level of impact from benthic eDNA metabarcoding data (e.g.Cordier et al. 2017;Armstrong and Verhoeven 2020;Lanzén et al., 2020).We also attempted this on the present dataset, using the method described in Lanzén et al. (2020), which resulted in poor performance with non-significant agreement (data not shown).As for de novo index development, it requires a relatively large number of sampling stations spanning an appropriate impact gradient, and while machine learning is known to better handle noisy data and outliers (Cordier et al., 2018), it is possible that the 97 stations included here were not sufficient.An explanation for this poor performance could be that the impact gradient of the dataset was skewed towards low impact, while clearly impacted stations were found only at a few installations (OSS, VFR, RIN).This likely caused the cross-validation used here to under-predict classification accuracy as well.Nonetheless, as opposed to the attempted machine learning approach, impact estimated using our de novo indices correlated strongly with pressure (PI) for both COI and 18S metabarcoding data, with the COI data de novo index showing comparable accuracy to using COI to calculate NSI, and a performance close to NSI based on morphotaxonomy data.
Several of the installations and stations examined have been analysed using metabarcoding previously, specifically VFR and OSC in region III by Lanzén et al. (2016).However, 454 Pyrosequencing, and a primer targeting a different region of the 18S rRNA gene was used in these earlier studies, which makes it challenging to incorporate or compare the data with this study.Lanzén et al. (2016) demonstrated strong correlation between community composition and impact from oil extraction.Several possible indicator taxa were also identified, but their performance was not evaluated using cross-validation, or used to develop novel biotic indices, as in the current study.This was, however, attempted in a recent study by Mauffrey et al. (2020), who compared two other offshore installations in the North Sea, operated by Total E&P Denmark.
Whereas in the present study, where morpho-taxonomy-based NSI showed the best correlation to pressure, Mauffrey et al. (2020) predicted taxonomy-independent biotic indices using metabarcoding that correlated better with physicochemical parameters than morpho-taxonomic indices did.However, there are many differences to our study that make them difficult to compare directly.First of all, we constructed a pressure index independently of the studied sample set that was only based on parameters known to correspond to pressures from petroleum extraction (Ba, Cu, THC and PAHs).Individual parameters in this index scaled with limits established in Norwegian legislation based on ecotoxicology studies (for Cu and PAH16) or a correlation to NSI-based status thresholds (for Ba and THC).Mauffrey et al. (2020) instead used PCA to find a subset and scaling of physicochemical parameters that explained as much as possible of the total variation in measured parameters.These do not necessarily correspond to, or scale with, environmental impact, whereas biotic indices are designed to do exactly this, giving a somewhat unfair advantage to metabarcoding data over morpho-taxonomy in their comparison.Further, a different biotic index was used, namely AMBI, instead of, here, NSI.This likely had a minor influence on results, however.The same 18S primers were also used by Mauffrey et al., allowing for a future meta-analysis to combine or compare these results with the results from our study directly, but this is beyond the scope of the current study.
The strong correlation of the PI used here with NSI and community structure indicates that the PI was a suitable proxy for ecologically relevant impact.However, we noticed that the ratio of PAH to THC varied substantially between platforms (e.g.2E-3 in VFR compared to 6E-5 in OSS).Possible explanations include differences in age, origin and composition of the examined discharges (for instance the relative fraction of drilling waste vs. produced water), since our dataset includes both general impact from operational discharges, and more severe point impact from leaking injector wells.Further, the contribution to the PI of high levels of PAHs was modest compared to that of THC, which typically contributed to balance the PI towards better agreement with NSI.The annual average environmental quality standard (AA-EQS) of PAH16 used as threshold for "good" status in sediments according to current Norwegian legislation for coastal sediments (2000 ppb) may be underestimated, or based on less ecotoxic individual PAHs.
While several previous studies have found that distance to the platform was a main driver of community structure (Laroche et al., 2018a;Cordier et al., 2019;Mauffrey et al., 2020), this correlation was relatively weak in our study (Supplementary Table S6).The likely explanation for this is that the most impacted sites here were associated with injection well leakage, located some distance from the platform and causing high local impact.Indeed, the presence of sedimented hydrocarbons was clearly visible in the grab samples from the worst impacted stations in our study.As reported previously in e.g.Gray et al. (1990), Grant and Briggs (2002) and Bakke et al. (2013), disturbance of benthic communities is most readily visible in the area local to the impact (see Cordes et al. 2016 for a review).
Among the indicator taxa predicted using COI metabarcoding here with assigned EGs in AMBI, the most well known example is probably the annelid worm Capitella capitata (EG V), which is tolerant to poor oxygenation and recognised as a universal and classic indicator of high sediment content of organic matter (Pearson and Rosenberg, 1977).Sediments with high hydrocarbon content tend to become oxygen depleted (Breuer et al., 2004), and Capitella has been reported in previous studies at sites with high hydrocarbon content (Daan et al., 1992;Daan et al., 1994;Olsgard and Gray, 1995).The family Capitellidae was also identified from the 18S dataset (EG IV), while several groups of nematodes, nemerteans, cnidarians, polychaetes and Xenoacelomorpha were identified as sensitive to impact.At least one genus of nematodes, Desmodorida, has previously been reported as a sensitive indicator (Bianchelli et al., 2018).
Similarly to a previous study using 18S metabarcoding, the majority of identified potential bioindicators here were protists (Lanzén et al. 2016).Among them were nine different taxa belonging to the class Labyrinthulomycetes, all placed in EG I-II.This also agrees with Lanzén et al. (2016), where this class was identified as sensitive to high lead and barium concentrations.Labyrinthulomycetes have been found as decomposers or parasites on algae or marine invertebrates (Bongiorni, 2012).Fungi of the order Microascales was also found as a potential indicator tolerant to THC by Lanzén et al (2016).Here it was classified to EG III, which is inconclusive, and likely indicates intermediate tolerance, but sensitivity to very disturbed habitats.Apart from Labyrinthulomycetes, protist classes with the highest number of identified indicator taxa sensitive to impact (EG I-II) were Ciliophora (ciliates) and Imbricatea.The latter are cercazoan protists with secreted surface siliceous scales.Interestingly, two clades of Imbricatea were also identified as opportunistic and tolerant to impact (EG IV-V).Four taxa in the Discosea class of amoeba were also found among EGs IV-V, as well as the clades NAMAKO-1 and 2, possibly anaerobic and identified in anoxic sediments from a meromictic lake (Takishita et al., 2007).
Though not included in the present study, we note that prokaryotes also provide very relevant targets for future studies and attempts to develop de novo biotic indices.Their abundance, central role in ecosystem processes and functional diversity, including specialist taxa able to degrade hydrocarbons, make them appealing bioindicators.In a A. Lanzén et al. study that targeted both prokaryotes and eukaryotes as indicators the former showed higher correlation with disturbance from oil extraction (Laroche et al., 2018a).
Even though we attempted to remove all sequences likely to have a pelagic origin, we expect that we did not manage to find all, and that we may have mistakenly removed a few benthic OTUs collaterally.One way to improve the situation would be to include samples from the water column in future studies to identify pelagic taxa.The Biomarks survey found that about 10% of the OTUs were found both in the benthos and the water column in a mix of European coastal sites (Lopez-Escardo et al. 2018), compared to 16% of the OTUs being removed manually from 18S as likely having a pelagic origin.Lopez-Escardo et al. also noted several polychaetes and molluscs among these OTUs, likely representing larval life stages of predominantly benthic organisms.Though a large part of the pelagic OTUs found in the benthos were likely from debris or dead organisms, it is also possible that a small amount of planktonic organisms entered the sample during transport to the water surface of the van Veen grabs.Hence, a possible way to reduce the number of reads from pelagic OTUs would be to use a more closed system such as a multicorer, as done by e.g.Mauffrey et al. (2020).This also has the added benefit of providing better spatial replication with less sampling effort.On the other hand, as the current Norwegian regulations, following the EN-ISO 16665 standard, require 0.1 m 2 of surface area for morpho-taxonomic analysis, the use of a multicorer would not provide sufficient material for parallel eDNA and morphological sampling (ISO, 2014).
Co-occurrence network reconstruction based on metabarcoding data remains challenging, resulting in a large share of false positives, in spite of the extensive improvements achieved through continuously evolving approaches (Faust et al. 2012, Röttjers and Faust 2018, Berry and Widder, 2014;Barroso-Bergadà et al., 2020).Despite the methodological challenges and the fact that biological implications of network properties remain difficult to interpret, inference of association (cooccurrence) networks from eDNA data has recently gained significant momentum (Lima-Mendez et al. 2015;Röttjers and Faust 2018;Yuan et al., 2021).Nonetheless, association networks of protistan communities constructed from eDNA data have revealed that impacts of anthropogenic disturbances are reflected in network properties, with substantial differences found in the network structure of PAHcontaminated versus pristine coastal sediments inferred from data on all three domains of life (Jeanbille et al., 2016, Forster et al., 2021).Our analysis only indicated subtle differences between the impacted and non-impacted 18S and COI networks in terms of their network topology, with few common patterns between the two datasets.However, the 18Sand COI-based networks reconstructed here did agree in that networks from impacted sites showed higher average degrees and shorter average path lengths compared to those from non-impacted sites.This indicates a more compact yet also more complex structure.
The discrepancy between the two datasets was particularly apparent for the ratio of negative to positive interactions, expected to be higher in the impacted networks as seen for 18S-based networks.A higher proportion of positive interactions could indicate a more resilient community, which we assumed would be the case in non-impacted sites (Laroche et al., 2018b).However, the unexpected trend identified based on COI here has also been observed by DiBattista et al. (2020), who reported that seawater communities heavily impacted by oil and gas drilling exhibited fewer negative interactions based on presenceabsence of eukaryotic families (DiBattista et al., 2020).Nonetheless, both datasets in this study pointed towards a significant rewiring of the co-occurrence networks based on impact.Most of the taxa that persisted across the impact states did not maintain similar association patterns within the community.

Conclusions
This study illustrates how de novo biotic indices sensitive to impacts from offshore oil drilling can be developed from a dataset based on a modest number of stations (n = 97).These indices have the potential to perform comparably to existing morpho-taxonomy-based biotic indices in terms of predicting impact from oil drilling and extraction.Thus, we are confident that, using a larger set of samples from a more extensive gradient of pollutants, indices could be developed with a performance that goes beyond that of current monitoring practices.By avoiding costly and time consuming morphologic identification, metabarcodingbased indices would better use available resources and thus allow for a higher spatial and temporal resolution with the same amount of resources as current monitoring programs.In doing so, future metabarcoding-based surveys would also generate publicly available data useful for e.g.improving our understanding of benthic biodiversity and ecological interactions, and their sensitivity to anthropogenic pressures.We also show that inference of co-occurrence networks can play an important role in this respect by revealing trends in the interaction structure of communities not revealed directly by changes observed in composition and diversity of individual samples.However, it is also clear that morphology-based identification can complement results based on metabarcoding, and can help to identify taxonomic gaps for which reference sequences are missing.Thus, in future monitoring we recommend that metabarcoding is carried out in parallel to morphotaxonomic identification for a period of time, to firmly establish how the two methods compare across time, environment and impact types.The latter can then be applied on a subset of samples as a complementary methodology.

Fig. 2 .
Fig. 2. Comparisons of alpha diversity between datasets and to the physicochemical Pressure Index (PI).Shannon diversity (H') from reported morphologybased monitoring was compared to rarefied richness based on (A) COI metazoan, and (B) 18S total eukaryotic metabarcoding, then to PI (C).We then compared COIbased rarefied richness to 18S-based (D) and to PI (E).Finally, we compared 18S-based rarefied richness to PI (F).Grey lines indicate fitted linear regressions and their adjusted coefficients of determination (R2), and p-values are reported.

Fig. 3 .
Fig. 3. Mantel test scores and ordination plots.Non-metric dimensional scaling (NMDS) based on pairwise Bray-Curtis dissimilarities from (A) morphospecies counts, (B) COI relative OTU abundance, (C) 18S relative OTU abundance, and (D) biplot of Principal Components Analysis of standardised key physicochemical properties.Filled arrows between plots represent Mantel scores (R-values) across community dissimilarity matrices, while arrows inside NMDS plots represent correlation (or in D, eigenvectors) of physicochemical or biological parameters to the transformed ordination space.Roman numerals represent geographical regions.

Fig. 5 .
Fig. 5. Cross-validated predictions of Pressure Index (PI) using de novo biotic indices based on taxa from (A) COI and (B) 18S metabarcoding.Bar graphs represent the distribution of ecological status group predictions relative to corresponding reference values of the modelled impact variable, while box plots represent the distribution of exact predictions relative to measured values.

Fig. 6 .
Fig. 6.KRONA charts presenting the taxonomic distribution of identified indicator taxa per ecological group (EG) for taxa identified from COI (A-C) and 18S (D-F) metabarcoding.EG I -II (A, D) represent low impact, EG III (B, E) intermediate and EG IV -V (C, F) high impact.

Fig. 7 .
Fig. 7. Venn diagram describing the distribution of nodes included in impacted and non-impacted co-occurrence networks, and identified as potential indicator taxa.

Fig. 8 .
Fig. 8. Reconstructed ecological networks from 18S data and COI data.Networks generated with CoNet based on 5 different measures of association between taxa (Pearson and Spearman correlation, mutual information, Bray-Curtis and Kullback-Leibler distance) and using the "permut and boot" procedure.Impacted 18S network (A), non-impacted 18S network (B), impacted COI network (C), and nonimpacted COI (D).Node colours correspond to ecological groups (EG) for identified indicators (grey = not identified indicator, green = EG I-II, orange = II, red = IV-V).Node size is proportional to degree (number of edges) and edge colours correspond to cooccurrence type (green = positive, red = negative, indicating mutual exclusion).

Table 1
Association network metadata and properties.