Assessing the state of marine biodiversity in the Northeast Atlantic

The Northeast Atlantic, a highly productive maritime area, has been exposed to a wide range of direct human pressures, such as fishing, shipping, coastal development, pollution, and non-indigenous species (NIS) introductions, in addition to anthropogenically-driven global climate change. Nonetheless, this regional sea supports a high diversity of species and habitats, whose functioning provides a variety of ecosystem services, essential for human welfare. In 2017, OSPAR, the Northeast Atlantic Regional Seas Commission, delivered an assessment of marine biodiversity for the Northeast Atlantic. This assessment examined biodiversity indicators separately to identify changes in Northeast Atlantic biodiversity, but stopped short of determining the status of biodiversity for many species and habitats. Here, we expand on this work and, for the first time, apply a semi-quantitative approach to evaluate holistically the state of Northeast Atlantic marine biodiversity across marine food webs, from plankton to top predators, via fish, pelagic and benthic habitats, including xenobiodiversity (i.e. NIS). Our analysis reveals widespread degradation in marine ecosystems and biodiversity, particularly for marine birds and coastal bottlenose dolphins, as well as for benthic habitats and fish in some regions. The poor biodiversity status of these ecosystem components is likely the result of cumulative effects of human activities, such as habitat destruction or disturbance, overexploitation, eutrophication, the introduction of NIS, and climate change. Bright spots are also revealed, such as recent signs of recovery in some fish and marine bird communities and recovery in harbour and grey seal populations and the condition of coastal benthic

* Corresponding author. E-mail address: Abigail.mcquatters-gollop@plymouth.ac.uk (A. McQuatters-Gollop).


Introduction
In 2019 the Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services (IPBES) declared a global biodiversity emergency, recognising the degraded, and in many cases worsening, state of global biodiversity, including that of the marine environment (IPBES, 2019). Climate change, ocean acidification, and direct anthropogenic pressures such as overfishing, pollution, non-indigenous species (NIS), and intensive coastal development are affecting much of the global ocean, resulting in declines in, and even loss of, marine species and habitats (Halpern et al., 2008; IPBES, 2019; IPCC, 2019; Secretariat of the Convention on Biological Diversity, 2014). Such degradation affects ecosystem structure, processes, and functioning, consequently affecting essential marine ecosystem services including food provision, carbon and nutrient cycling, coastal protection, and recreation (IPBES, 2019). Global (e.g., Convention on Biological Diversity, Sustainable Development Goal 14) and regional (e.g., EU Marine Strategy Framework Directive; Northeast United States Shelf (NOAA, 2022); European Regional Seas Conventions, i.e. OSPAR's Northeast Atlantic Biodiversity Strategy (OSPAR, 2021)) policy mechanisms recognise this critical link between healthy marine biodiversity and human welfare, and have instated goals to assess, maintain, and improve marine biodiversity. A clear picture of the current state and past trends of marine biodiversity (including xeno-biodiversity, i.e. NIS) is therefore required as a baseline from which to detect ongoing and future changes, design effective management measures, and track progress towards policy targets. Furthermore, establishing the state of marine biodiversity at ecologically-relevant spatial and temporal scales may help to measure regional effects of global climate change and distinguish those from localised pressures.
The Northeast Atlantic has a long history of intensive human use (Roberts, 2007), which, along with climate change, has caused large-scale degradation of marine ecosystems and biodiversity from historic baselines. During the last century Northeast Atlantic marine ecosystems have experienced changes in ecosystem functioning (Capuzzo et al., 2018), alterations to marine food webs (Piroddi et al., 2021), the loss and extinction of species (Lotze et al., 2017; Reynolds et al., 2009; Sguotti et al., 2016), colonisation by NIS (Castro et al., 2017; Tsiamis et al., 2019), and coastal and marine habitat degradation and loss (Airoldi and Beck, 2007; Bennema et al., 2020; Torbjørn et al., 2003). To halt the decline of marine biodiversity and improve its state, the European Union brought the Marine Strategy Framework Directive (MSFD) into force in 2008. The MSFD applies a holistic, ecosystem approach to managing the marine environment with key descriptors of marine biodiversity (e.g. ecosystem functioning, NIS, food webs, and sea floor integrity) explicitly named as some aspects of the marine environment which must reach the objective of Good Environmental Status (GES). The MSFD represents one of the world's most ambitious applications of a comprehensive ecosystem approach, looking across multiple biodiversity components (i.e. pelagic and benthic habitats, fish, marine birds, marine mammals, food webs, and NIS) based on a variety of indicators. As part of this process, 22 indicators have been developed to assess the state of marine biodiversity at the Northeast Atlantic regional scale (OSPAR, 2017m). These biodiversity indicators are assessed to evaluate progress towards targets representing the overall objective of GES for each ecosystem component. In 2017, OSPAR (the Northeast Atlantic regional seas commission) evaluated each indicator individually as part of a Northeast Atlantic ecosystem assessment (OSPAR, 2017m).
In many cases, the assessment was challenging due to the scale of assessment, methodological or data limitations, or due to the novel approach of assessing biodiversity across datasets, ecosystems, and multiple scientific experts from OSPAR Contracting Parties at this scale. It was often equally challenging to identify pressures causing changes in the marine environment, either because of scientific uncertainty or due to resource and time limitations. In most cases (e.g. pelagic habitats, food webs, NIS) no quantitative targets for GES were agreed at the OSPAR level at the time, so no overall assessment of the state of marine biodiversity was made during IA2017. In other words, in many cases the indicators were not formally assessed for Good Environmental Status, but biodiversity change was merely described (OSPAR, 2017m).
This paper aims, therefore, for the first time, to assess the state of marine biodiversity across multiple ecosystem components at the Northeast Atlantic regional scale, by using a suite of policy-relevant indicators. Biodiversity status is assessed semi-quantitatively by linking indicator change to pressures or evaluating against threshold values (where available) to reveal a broadscale assessment of Northeast Atlantic marine biodiversity state. Challenges to this process are then identified and recommendations for achieving a more complete and holistic assessment of regional marine biodiversity are given.

Spatial area
The Northeast Atlantic OSPAR maritime area is bordered by 12 countries, ten of which were European Union Member States at the time of this work. Countries bordering the Northeast Atlantic have a total population of approximately 329 million people. Consequently, Northeast Atlantic marine waters support a wide variety of human activities including fishing, shipping, oil and gas provision, and recreation (ICES, 2020).
Within OSPAR, the Northeast Atlantic maritime area can be divided into five regions (Arctic Waters, Greater North Sea, Celtic Seas, Bay of Biscay and Iberian Coast, and Wider Atlantic; Fig. 1). These regions comprise a wide range of environmental conditions and marine ecosystems, from deep ocean to shelf and coastal waters. Where ecologically relevant or where data were limited, specific indicator assessments have used further geographic divisions, referred to here as sub-regions (e.g., the English Channel, the Southern North Sea). Here we focus on the Celtic Seas, Greater North Sea, and Bay of Biscay and Iberian Coast regions, where most of the recently developed indicators were adopted and applied by OSPAR Contracting Parties. Marine birds were also assessed for Arctic Waters, and one fish and one food web indicator for the Wider Atlantic, in 2017, but are outside the focus of this paper.

Science-policy process
Through indicator assessments, the 2017 OSPAR Intermediate Assessment (IA2017) further developed our understanding of the marine environment of the Northeast Atlantic and its current ecological status. IA2017 also provided an opportunity for OSPAR Contracting Parties, most of which are also European Union Member States, to use the information and methodology in their reporting obligations under the MSFD.
A. McQuatters-Gollop et al. Ecological Indicators 141 (2022)
The IA2017 biodiversity assessments were carried out through OSPAR's Intersessional Correspondence Group on Biodiversity and Monitoring (ICG-COBAM). Due to the science-policy nature of OSPAR, IA2017 enabled indicators to be co-developed with policy-makers to meet policy needs. The process was organised according to seven ecosystem components (pelagic habitats, benthic habitats, fish, marine birds, marine mammals, food webs, and NIS), each of which was represented by an expert group consisting of scientific experts from OSPAR Contracting Party countries. The expert groups drove scientific indicator development and assessment, engaging in the wider OSPAR process to ensure policy alignment. Science-policy interaction primarily occurred at workshops where scientists and policy-makers worked together to review indicator advances and subsequently draft assessments. This collaborative process promoted scientific progress on indicator development with continual feedback and steer from policy-makers throughout the OSPAR area. For food webs and pelagic and benthic habitats, collaborative work was also progressed under the EU-funded "Applying an Ecosystem Approach to (sub) Regional Habitat Assessments: Addressing gaps in biodiversity indicator development for the OSPAR Region from data to ecosystem assessment" (EcApRHA) project, which enabled the acceleration of indicator development. This process produced indicators and assessments that are scientifically robust; can be communicated appropriately to scientists, policy-makers, and the public; and are useful to policy-makers for supporting contributions to national MSFD assessments. The marine biodiversity portion of the IA2017 process mobilised considerable scientific effort, with approximately 200 scientists and 40 policy-makers involved (https://oap.ospar.org/en/ospar-assessments/intermediate-assessment-2017/acknowledgements/).

Biodiversity indicators: Background context and key characteristics
Biodiversity indicators were developed to represent key aspects of the seven OSPAR/MSFD ecosystem components (Tables 1-7). The biodiversity indicator development process started in 2010 with >140 theoretical, published, or national indicators identified as relevant for biodiversity assessment in the Northeast Atlantic (OSPAR, 2011). A combination of policy relevance and scientific robustness was used to create a shortlist of indicators, some of which had previously been used in assessments and some of which were purely conceptual. The common methodological principles and approaches to determining GES (for OSPAR or MSFD), setting environmental targets and selecting indicators were published as an advice manual (OSPAR, 2012). Additional scrutiny was given to the indicator suite selected by ICES working groups (e.g. ICES, 2013; ICES, 2014b) and in the scientific literature where many new indicators and assessment methods were developed and published during this decade, notably influenced by the progressive implementation of the MSFD in European seas (Borja et al., 2019; Heiskanen et al., 2016; Uusitalo et al., 2016). Expert groups developed these indicators into operational indicators, through a process of testing, validation, and science and policy peer-review. For IA2017, each indicator was assessed individually, from relevant data made available by Contracting Parties, representing a significant milestone in describing the ecological status of Northeast Atlantic biodiversity across multiple ecosystem components. Each indicator was given a short code (PH = Pelagic Habitats, BH = Benthic Habitats, FC = Fish and Cephalopods (although cephalopod data were not used here), B = (Marine) Birds, M = Mammals, FW = Food Webs, NIS = Non-Indigenous Species) and identifying number (Tables 1-7).
Some indicators were not operational, nor agreed by OSPAR Contracting Parties, at the time of IA2017, often as a result of lack of development or testing, and were designated as "candidate" indicators that might be used for future assessments (Elliott et al., 2017b; Padegimas et al., 2017). The entire process was performed in collaboration with policy-makers, to ensure policy-relevance, and with (inter)national expert networks to ensure scientific consensus via peer-review.
A variety of indicator types was developed to assess marine biodiversity status. The basic DPSIR (Driver-Pressure-State-Impact-Response) framework can be used to classify indicator types for the purpose of this work (Elliott et al., 2017a). In brief, Drivers of basic human needs lead to Pressures on the marine ecosystem. The Pressures cause State change in the natural system, which then leads to Impacts on human welfare. These Impacts then require management Responses. The majority of the biodiversity indicators here assess ecological state, although some assess the effect of pressure on state, combining data on extent and intensity of pressures with biological data (e.g. The common conceptual approach to assessing condition of benthic habitat-defining communities (BH2), Extent of physical damage to predominant special habitats (BH3), Harbour porpoise bycatch (M6); Tables 2, 5). Some indicators comprise multiple sub-indicators, depending on the pressure or ecological information to be assessed (e.g. Changes in plankton diversity (PH3) and Condition of benthic habitat-defining communities (BH2); Tables 1, 2). Food web indicators are central to this analysis and represent an important step towards holistic assessment. For that reason, some food web indicators span multiple ecosystem components (e.g., Changes in phytoplankton and zooplankton communities (PH1/FW5) is both a pelagic and a food web indicator; Production of phytoplankton (FW2) encompasses primary productivity, an aspect of pelagic habitats; Size composition in fish communities (FW3) is related to fish community size structure; Change in mean trophic level of consumers or marine predators (FW4) includes invertebrate and fish data; Table 6).
A brief synopsis of each indicator, including its underlying data, data analysis methods, time span, spatial coverage of data, link to detailed methodological guidelines under the OSPAR Coordinated Environmental Monitoring Programme (CEMP), and further references are provided in Tables 1-7.

Data
Data were gathered through OSPAR formal data calls, requests to ICES Working Groups and ICES data calls, national expert group contributions, and direct requests to data providers, many of whom were members of the OSPAR Expert Groups. Spatial and temporal data coverage varied by dataset and indicator due to differences in data availability; for this reason, and because ecological dynamics vary throughout the ecosystem, a common temporal baseline was not used across indicators and ecosystem components. For example, broad-scale benthic habitat, cetacean abundance, and plankton abundance and biomass data were used throughout several regions, while some other datasets, such as benthic community condition, harbour porpoise bycatch data or changes in plankton diversity, were available only subregionally. For NIS, data were available on a national scale and then processed to meet the regional scale. Approximately 86 datasets were used to develop the indicators, but in some cases (e.g. pelagic habitats, benthic habitats) lack of formal data contribution by Contracting Parties restricted the inclusion of all possible datasets.
A wide spectrum of data types was used to construct indicators. The harmonisation of diverse datasets was often required (e.g. to construct pelagic and benthic habitats and food web indicators). Data and metadata can be obtained from each indicator's IA2017 webpage (see Tables 1-7) and associated Coordinated Environmental Monitoring Programme (CEMP) guidelines, which contain official and detailed methodologies for all indicators contributing to IA2017 (see Tables 1-7; further information in CEMP at https://www.ospar.org/work-areas/cross-cutting-issues/cemp).

Pelagic habitat data
Two types of plankton time-series data were used to construct pelagic habitat indicators: fixed-point sampling data from the UK, France, and Sweden, and underway data from the UK-based Continuous Plankton Recorder (CPR) survey. Differences in sampling and analysis methods exist between all time-series datasets. For example, fixed-point phytoplankton data were collected with a variety of methods, while fixed-point zooplankton data were collected using nets. The CPR survey is a phyto- and zooplankton monitoring programme which samples plankton using a device towed behind merchant navy ships on commercial routes spanning the Northeast Atlantic (Batten et al., 2003). As a result of these differences, pelagic data could not be simply aggregated across datasets. Instead, pelagic indicators were purposely developed to deal with this wide range of plankton data types, regardless of sampling and enumeration method or level of taxonomic identification (see Table 1). Indicator Changes in phytoplankton and zooplankton communities (PH1/FW5) aggregates plankton taxa into "lifeforms" based on common functional traits. Changes in lifeforms can be interrogated temporally, spatially, or as ecologically-relevant lifeform pairs (McQuatters-Gollop et al., 2019a).

Table 1
The pelagic habitats ecosystem components and associated indicator characteristics used for the OSPAR 2017 intermediate assessment. The underlying data, method, time period, region(s)/sub-region(s) assessed, and spatial coverage are also identified. Lastly, links to its use in OSPAR IA2017, its methodology, published as part of the OSPAR Coordinated Environmental Monitoring Programme (CEMP) or ICES advice, and further publications using the indicator are included.
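The lifeform aggregation used by PH1/FW5 can be illustrated with a minimal Python sketch. The taxon-to-lifeform lookup, the function name, and the abundance values below are hypothetical and for illustration only; the actual trait-based assignments are described in the CEMP guideline and in McQuatters-Gollop et al. (2019a).

```python
from collections import defaultdict

# Hypothetical taxon-to-lifeform lookup (illustrative, not the OSPAR master list).
LIFEFORM_OF = {
    "Thalassiosira": "diatoms",
    "Chaetoceros": "diatoms",
    "Ceratium": "dinoflagellates",
    "Calanus": "large copepods",
}

def aggregate_to_lifeforms(sample_counts, lookup=LIFEFORM_OF):
    """Sum taxon abundances within each lifeform; taxa missing from the lookup are skipped."""
    totals = defaultdict(float)
    for taxon, abundance in sample_counts.items():
        lifeform = lookup.get(taxon)
        if lifeform is not None:
            totals[lifeform] += abundance
    return dict(totals)

# Made-up abundances for one sample
sample = {"Thalassiosira": 120.0, "Chaetoceros": 80.0, "Ceratium": 50.0, "Unknown sp.": 5.0}
lifeform_totals = aggregate_to_lifeforms(sample)
```

Because each dataset only needs its own lookup, the same lifeform time-series can in principle be built from datasets that differ in sampling method or taxonomic resolution, which is the property the text attributes to the indicator.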

Benthic habitat data
The data requirements for all OSPAR benthic indicators can be classified into three main categories: a) benthic species' community composition; b) benthic habitat mapping data; and c) spatial data of anthropogenic pressures. These data streams support the assessment of each indicator and in combination provided for an integrated assessment (Table 2). Data on benthic macro-fauna and macro-flora species and communities were collected via standardised methods (see the CEMP guideline for The common conceptual approach to assessing condition of benthic habitat-defining communities (BH2); Table 2; OSPAR, 2018b). Benthic habitat mapping data provide information on seabed type, classified according to EUNIS (European Nature Information System) levels, and distribution. Benthic habitat maps were produced using EMODNet methods and data (EUNIS level 3), combined with local datasets, at different spatial resolutions, depending on methods used (e.g. ground-truthed sampling, acoustic mapping, and modelling) (Diaz et al., 2004; Ellwood, 2014; Kenny et al., 2003). The information on environmental variables (e.g. depth, exposure, sediment type, etc.) was used to classify and model habitats and produce habitat maps (Brown et al., 2011). Information on the extent and distribution of benthic habitats is used alongside the associated sensitivity, in terms of resilience and resistance capacity of benthic species, biotopes and habitat types.

Table 2
The benthic habitats ecosystem components and associated indicator characteristics used for the OSPAR 2017 intermediate assessment. The underlying data, method, time period, region(s)/sub-region(s) assessed, and spatial coverage are also identified. Lastly, links to its use in OSPAR IA2017, its methodology, published as part of the OSPAR Coordinated Environmental Monitoring Programme (CEMP) or ICES advice, and further publications using the indicator are included.
Data for surface and sub-surface abrasion by bottom-trawling fisheries provided an indirect measure of pressure on the seabed by fishing. These data were obtained from ICES (2016b). The pressure layer on abrasion by bottom trawling was produced from bottom trawl data provided by OSPAR Contracting Parties, the assessment of related pressure by ICES, and finally an assessment of disturbance of habitats by the OSPAR benthic habitat expert group.

Fish data
Quantitative spatio-temporal data across the OSPAR area were available from scientific trawl surveys coordinated by ICES and supporting the fisheries management process. These surveys collect data on the number and weight of each species of fish along with measurements of body length and mass. To maintain transparency and for quality assurance, a selection of surveys was used for the analyses. The data resulting from the surveys were made available through the routinely updated Database of Trawl Surveys (DATRAS). Relevant ICES Working Groups (ICES, 2021a; ICES, 2021b) were consulted on the appropriate usage of these data and a single data product was subsequently created and, following discussion within ICG-COBAM, stored in an online repository to support the assessment process, with comprehensive documentation available (Moriarty et al., 2017; Moriarty et al., 2019). Cephalopods and other data collected through these surveys (e.g. benthic invertebrates) were excluded from the data product due to inconsistencies in species identification and recording practices between countries.

Marine bird data
The assessment used data on marine bird species that, at some point in their annual life cycle, rely on coastal and/or offshore marine areas; the term 'marine birds' therefore includes seabirds and waterbirds. Taxonomic groups considered include Anseriformes (ducks, geese, swans), Procellariiformes (fulmars, petrels, shearwaters), Suliformes (gannets, cormorants), Gaviiformes (divers), Podicipediformes (grebes) and Charadriiformes (shorebirds, skuas, gulls, terns, auks). Data were available from national monitoring schemes, which each country submitted to the OSPAR Seabird Database via the ICES Biodiversity Data Portal (https://www.ices.dk/data/data-portals/Pages/Biodiversity.aspx). Breeding abundance was assessed using trends in annual estimates of breeding bird abundance derived from counts of adult birds or pairs when they aggregate on land at coastal breeding colonies or sites, where they are easier to monitor than when dispersed at sea over large areas for foraging. Some colonially breeding species were also assessed using data on breeding success (the average number of chicks fledged per breeding pair per year). Non-breeding bird abundance was not derived from at-sea surveys, but was restricted to counts of birds in intertidal areas or close to the shore and surveyed from land during migration or over the winter. During these times the birds in these habitats aggregate in large numbers and are much easier to count than during the breeding season, when they are dispersed over large areas of remote nesting habitat, such as the Arctic tundra. Many marine bird species assessed in the non-breeding season (mainly shorebirds, ducks, geese, swans, divers, and grebes) breed inland and in areas outside the OSPAR Maritime Area but are reliant on the Northeast Atlantic most of the year during migration and/or over the winter.
Indicator Marine bird breeding success/failure (B3) was populated with seabird data for IA2017, although it is anticipated that waterbird data will be included in future assessments.

Table 3
The fish ecosystem components and associated indicator characteristics used for the OSPAR 2017 intermediate assessment. The underlying data, method, time period, region(s)/sub-region(s) assessed, and spatial coverage are also identified. Lastly, links to its use in OSPAR IA2017, its methodology, published as part of the OSPAR Coordinated Environmental Monitoring Programme (CEMP) or ICES advice, and further publications using the indicator are included.

Marine mammal data
Abundance and distribution data for both cetaceans and phocid seals were available from dedicated national monitoring programmes as well as from international large-scale surveys such as SCANS (Small Cetacean Abundance in the European Atlantic and North Sea; Hammond et al., 2013). These surveys were timed for key periods, i.e. for grey and harbour seals during key life stages such as breeding (pupping) and moulting, whereas the surveys for cetaceans concentrated on the summer months. The sampling methods included line-transect distance sampling as well as mark-recapture analyses of photo-identification data, and for the seals, aerial photographic surveys, and land- and ship-based counts (Brasseur et al., 2015; Buckland et al., 2001, 2004; Galatius et al., 2021). ICES Working Groups (ICES, 2014a) were consulted on relevant population-specific assessment units as well as threshold setting methods for population trends and bycatch. Data on bycatch rates of harbour porpoises were primarily obtained through observer schemes that operated according to European Union Council Regulation (EC) 812/2004, and were analysed by the ICES Working Group on Bycatch of Protected Species (WGBYC). ICES advice (2015a) was used for assessment of this pressure indicator.

Food web data
Food web indicators differ in their data requirements according to the complexity of the indicator and the number of ecosystem components considered. Different types of phytoplankton primary production data were used to construct food web indicators. For Production of phytoplankton (FW2), for example, the estimation of primary production comes from different data collection and processing methods such as carbon or oxygen isotope techniques, fluorometric techniques and modelling methods. Since the time-series were collected from fixed stations and transects, a step-by-step approach was developed to integrate these datasets for the indicator calculation. The data requirements for the Change in mean trophic level of consumers or marine predators (FW4) food web indicator can be classified into two main categories: a) biomass data per species for fish and invertebrates from surveys and landings and b) data on the trophic level of each species. Trophic level values were estimated from stomach content analyses, stable isotope analysis, models or collected from published literature (Table 6). Food web assessments drew upon the already available data for the fish and plankton indicators (Tables 1, 3, 6).

Table 4
The marine birds ecosystem components and associated indicator characteristics used for the OSPAR 2017 intermediate assessment. The underlying data, method, time period, region(s)/sub-region(s) assessed, and spatial coverage are also identified. Lastly, links to its use in OSPAR IA2017, its methodology, published as part of the OSPAR Coordinated Environmental Monitoring Programme (CEMP) or ICES advice, and further publications using the indicator are included.
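The two data categories required for FW4 (biomass per species and a trophic level per species) combine naturally into a weighted mean. The following Python sketch assumes the indicator takes the usual biomass-weighted form of a mean trophic level; the exact OSPAR calculation is set out in the CEMP guideline, and the function name, species, and values here are hypothetical.

```python
def mean_trophic_level(biomass, trophic_level):
    """Biomass-weighted mean trophic level for one survey year.

    biomass: dict mapping species -> biomass (any consistent unit)
    trophic_level: dict mapping species -> trophic level
    """
    total = sum(biomass.values())
    if total <= 0:
        raise ValueError("total biomass must be positive")
    weighted = sum(b * trophic_level[species] for species, b in biomass.items())
    return weighted / total

# Made-up values for illustration only
biomass = {"cod": 100.0, "herring": 300.0, "shrimp": 100.0}
trophic = {"cod": 4.0, "herring": 3.0, "shrimp": 2.0}
mtl = mean_trophic_level(biomass, trophic)  # (400 + 900 + 200) / 500
```

Computing this value per year and per survey would give the kind of time-series whose change the indicator name refers to.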

NIS data
Most OSPAR parties provided data on NIS newly recorded within their waters between the 1st of January 2003 and the 31st of December 2014. A distinction was made between newly recorded NIS and new introductions to account for limitations in regional monitoring programmes (at time of data collection) and the frequent time delay between introduction and detection of NIS especially in aquatic systems. The period assessed was selected because it represents two six-year periods (2003-2008 and 2009-2014), from which Contracting Parties are most likely to have complete data sets, enabling comparison of new NIS records between these two six-year reporting periods. This was in order to test the indicator for further use within the six-year rolling programme scenario of the MSFD. Data included the date on which a NIS was first recorded within each of the regions and the location of the observation.
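The comparison between the two six-year reporting periods can be sketched in Python as a simple count of first records per period. The record layout (species, region, date of first record) follows the description above, but the function name and the example records are hypothetical, and the sketch assumes one row per species and region.

```python
from datetime import date

# The two six-year reporting periods named in the text
PERIODS = {
    "2003-2008": (date(2003, 1, 1), date(2008, 12, 31)),
    "2009-2014": (date(2009, 1, 1), date(2014, 12, 31)),
}

def count_new_records(first_records):
    """Count newly recorded NIS per reporting period.

    first_records: iterable of (species, region, date_first_recorded),
    assumed to hold one row per species and region.
    """
    counts = {name: 0 for name in PERIODS}
    for _species, _region, first_seen in first_records:
        for name, (start, end) in PERIODS.items():
            if start <= first_seen <= end:
                counts[name] += 1
    return counts

# Made-up records for illustration only
records = [
    ("species A", "Greater North Sea", date(2004, 6, 1)),
    ("species B", "Celtic Seas", date(2008, 12, 31)),
    ("species C", "Greater North Sea", date(2011, 3, 15)),
    ("species D", "Celtic Seas", date(2015, 1, 1)),  # outside both periods
]
new_records = count_new_records(records)
```

Note that these counts describe new records, not new introductions, for the detection-delay reasons given above.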

Creating a holistic overview of marine biodiversity in the Northeast Atlantic
The 2017 OSPAR Intermediate Assessment (IA2017) in many cases merely described, rather than assessed, biodiversity change. Building on that work, we extend the assessment of biodiversity for the first time by applying an interpretive assessment to each indicator, classifying indicator change within the wider ecosystem context (Table 8). This categorical assessment used expert interpretation of indicator change (from the same experts who led the development of the 2017 OSPAR indicator assessments) with respect to assessment thresholds (where available), links to pressures, and knowledge of indicator state to categorise indicators as in poor, uncertain, or good biodiversity status (Table 8). Assigning indicators to these three biodiversity status categories has been done for the purpose of this research and related scientific interpretation, and currently has no formal link to any policy regulation (e.g. OSPAR, MSFD) for GES assessments. However, it is expected that this exercise will aid in preparation and delivery of the next OSPAR biodiversity Quality Status Report assessment to be published in 2023, which will in turn contribute to MSFD national reporting due in 2024.
Some indicators were assessed against thresholds in IA2017 and so their biodiversity status was straightforward to determine (see summarised threshold information in Table A1). More information can be found in the IA2017 assessment links in Tables 1-7, but key information is summarised here. Indicator Assessment of coastal habitats in relation to nutrient and/or organic enrichment (BH2-A) was assessed against the Water Framework Directive's benthic quality elements (for multimetric indices and associated quality ranges) for each nation's coastal waterbodies. For Recovery in the population abundance of sensitive fish species (FC1), each sensitive species was assessed relative to a percentile of its annual catch rate from all available years of data, with the upper 25th percentile suggesting "recovery" (good status), and the lower 25th percentile indicating continued decline (poor status) following Greenstreet et al. (2012). Sensitive species were then grouped within surveys and overall change for the sensitive fish group was assessed for each survey through application of a binomial test with a significance level of 0.05, where significant change depends on the number of species monitored and the probability of each reaching the percentile threshold by chance based on a random walk model, following Greenstreet et al. (2012). Multiple survey-based assessments were grouped within OSPAR regions and an overall assessment made using an averaging integration approach. For the indicator Proportion of large fish - Large Fish Index (FC2), a temporal baseline (a recent historic period) was taken where available; this baseline differed by survey. Where a baseline was not available a trend was assessed, with a positive trend in abundance assessed as in good status.
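The survey-level step for FC1 can be illustrated with a one-sided binomial test in Python. This is a sketch under stated assumptions, not the method of Greenstreet et al. (2012): the chance probability `p_chance` is a placeholder (in the source it is derived from a random walk model), and the function names and the "no significant change" label are hypothetical.

```python
from math import comb

def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

def sensitive_group_status(n_species, n_recovering, n_declining, p_chance, alpha=0.05):
    """Classify a survey's sensitive-fish group.

    n_recovering / n_declining: species whose catch rate falls in the upper /
    lower percentile band. p_chance is an ASSUMED per-species probability of
    reaching a band by chance.
    """
    if binomial_upper_tail(n_recovering, n_species, p_chance) < alpha:
        return "good"
    if binomial_upper_tail(n_declining, n_species, p_chance) < alpha:
        return "poor"
    return "no significant change"

# Made-up example: 20 species, 10 in the upper band, 2 in the lower band
status = sensitive_group_status(20, 10, 2, p_chance=0.25)
```

As the text notes, the number of species needed for a significant result depends on both the number monitored and the chance probability, which is why the two are passed in separately here.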
For the marine bird abundance indicators, Marine birds: Non-breeding abundance (B1-A) and Marine birds: Breeding abundance (B1-B), thresholds were applied as follows. A species was in good status if 70% of baseline abundance was achieved (80% in species laying only 1 egg/year); good status of a species group was then determined if at least 75% of species assessed were in good status. For Marine bird breeding success/failure (B3) a partial threshold was used in IA2017, which was then expanded here. All species in the indicator apart from terns were assessed as in good status if widespread breeding failure (<0.1 chicks/pair in 5% of colonies or more) happened in fewer than four of the last six years. For terns, widespread breeding failure was defined as <0.1 chicks/pair in the mean percentage of colonies failing per year, over the preceding 15 years. To categorise the status of Marine bird breeding success/failure (B3) here, good status of the indicator was then determined if at least 75% of species assessed were in good status; otherwise the status was categorised as poor.
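The two-level abundance threshold for B1-A and B1-B (species level, then species-group level) can be written out as a short Python sketch. The function names and example values are hypothetical; the percentages are those stated above.

```python
def species_in_good_status(abundance, baseline, single_egg=False):
    """Species-level rule: at least 70% of baseline abundance,
    or 80% for species laying only one egg per year."""
    threshold = 0.8 if single_egg else 0.7
    return abundance >= threshold * baseline

def group_in_good_status(species_flags):
    """Group-level rule: at least 75% of assessed species in good status."""
    flags = list(species_flags)
    if not flags:
        raise ValueError("no species assessed")
    return sum(flags) / len(flags) >= 0.75

# Made-up abundances relative to a baseline of 100
flags = [
    species_in_good_status(75, 100),                   # True: 75 >= 70
    species_in_good_status(75, 100, single_egg=True),  # False: 75 < 80
    species_in_good_status(90, 100),                   # True
    species_in_good_status(120, 100),                  # True
]
group_ok = group_in_good_status(flags)  # 3 of 4 species = 75%
```

The same group-level function applies to the B3 categorisation described above, since it uses the same 75% rule over species-level results.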
Table 7. The Non-Indigenous species ecosystem components and associated indicator characteristics used for the OSPAR 2017 intermediate assessment.
Table 8 (fragment). Good status: indicator value is above assessment threshold, or indicator represents improving state, or indicator shows no change but state is satisfactory. Unassessed: indicator was not assessed in a region due to lack of data, lack of expert resource, or lack of policy support.
Two marine mammal indicators were assessed against thresholds in IA2017. For both Seal abundance and distribution (M3) and Grey seal pup production (M5), good status was defined as no decline in seal abundance of > 1% per year in the previous six-year period (a decline of approximately 6% over six years), or no decline in seal abundance of > 25% since the fixed baseline in 1992 (or closest value). Most indicators did not have thresholds, including all pelagic habitats, food web, and NIS indicators as well as some benthic indicators (Subtidal habitats of the southern North Sea (BH2-B), Extent of physical damage to predominant special habitats (BH3)), one fish indicator (Mean Maximum Length of fish (FC3)), and some mammal indicators (Abundance and distribution of cetaceans (M4-A), Abundance and distribution of coastal bottlenose dolphins (M4-A1), Abundance and distribution of cetaceans other than killer whales and coastal bottlenose dolphins (M4-B), and Numbers of harbour porpoise bycaught in commercial nets (M6)). Biodiversity status for these indicators was characterised according to Table 8. If an indicator did not have an assessment threshold, expert judgement was used to categorise indicator status based on whether indicator change represented a declining/improving state (poor/good status) or there was no change in the indicator but its state was considered unsatisfactory/satisfactory (poor/good status). If it was unclear whether indicator change represented a declining or improving state, or if an indicator showed no change but it was uncertain whether the state represented was satisfactory, the indicator was categorised as in uncertain status. If an indicator demonstrated change linked to increasing effect of anthropogenic pressure (including climate change), the indicator was categorised as in poor status.
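The categorisation rules described above for indicators without thresholds can be written as a small decision function. This is a sketch of the logic as stated in the text, not a reproduction of Table 8: the names `Status` and `categorise`, the string encoding of trends, and the choice to check the pressure link first are assumptions. In practice these inputs come from expert judgement rather than from a computed value.

```python
from enum import Enum
from typing import Optional


class Status(Enum):
    GOOD = "good"
    UNCERTAIN = "uncertain"
    POOR = "poor"


def categorise(trend: Optional[str],
               state_satisfactory: Optional[bool] = None,
               pressure_linked: bool = False) -> Status:
    """Categorise an indicator that has no assessment threshold.

    trend: 'improving', 'declining', 'no change', or None when the direction
           of change cannot be interpreted.
    state_satisfactory: expert view of the current state, or None if unknown.
    pressure_linked: True if the change is linked to an increasing effect of
                     anthropogenic pressure (including climate change).
    """
    if pressure_linked:
        return Status.POOR
    if trend == "improving":
        return Status.GOOD
    if trend == "declining":
        return Status.POOR
    if trend == "no change" and state_satisfactory is not None:
        return Status.GOOD if state_satisfactory else Status.POOR
    # Unclear direction of change, or no change with an unknown state.
    return Status.UNCERTAIN


print(categorise("no change", state_satisfactory=None).value)
```

Encoding the rules this way makes clear why so many indicators end up as uncertain: any case where neither the direction of change nor the adequacy of the current state can be judged falls through to the final branch.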
Indicators were not assessed in all regions for several reasons. For some indicator-region combinations adequate data were not available to make an assessment. This was the case in the Bay of Biscay and Iberian Coast for Harbour porpoise bycatch (M6) and Production of phytoplankton (FW2). Because seals do not occur in the Bay of Biscay and Iberian Coast, Seal abundance and distribution (M3) and Grey seal pup production (M5) were also not assessed there. Similarly, indicator Subtidal habitats of the southern North Sea (BH2-B) was too nascent to be sufficiently developed and tested in other regions. Other indicators did not have sufficient policy support from countries bordering a region for their development. This was true in the Celtic Seas for Changes in plankton diversity (PH3) and in the Bay of Biscay and Iberian Coast for Recovery in the population abundance of sensitive fish species (FC1) and Proportion of large fish - Large Fish Index (FC2). The indicator Change in mean trophic level of consumers or marine predators (FW4) lacked both policy support and expert resource for indicator development and assessment in the Greater North Sea and the Celtic Seas, as did Subtidal habitats of the southern North Sea (BH2-B), and all three marine bird indicators (Marine birds: Non-breeding abundance (B1-A), Marine birds: Breeding abundance (B1-B), and Marine bird breeding success/failure (B3)) for the Bay of Biscay and Iberian Coast. The focus of IA2017 was on the Greater North Sea, Bay of Biscay and Iberian Coast, and Celtic Seas regions, but future assessments should aim to expand further the indicator assessments to Arctic Waters and the Wider Atlantic as resources, data, and policy support allow.

Results
The assessment of marine biodiversity indicators has revealed widespread changes across all parts of the Northeast Atlantic marine ecosystem at various scales. Here, we build on the work done under IA2017, which examined each indicator separately, to now holistically examine changes in each ecosystem component, assessing biodiversity status and drawing conclusions, where possible, across the whole OSPAR area (Fig. 2). This is a high-level summary focusing on key changes detected in each indicator in each region; for further detail on individual indicator change, refer to the references in Tables 1-7.

Pelagic habitats
Statistically significant changes were observed in pelagic habitats across all regions (references in Table 1). Since these indicators did not have thresholds and pressures causing indicator change were not identified during the assessment, the biodiversity status of Changes in phytoplankton and zooplankton communities (PH1/FW5), Changes in phytoplankton biomass and zooplankton abundance (PH2), and Changes in plankton diversity (PH3) was classified as "uncertain" (Fig. 2, Table A1). Most ecologically-relevant pairs of plankton (referred to here as lifeform pairs; Changes in phytoplankton and zooplankton communities (PH1/FW5)) experienced significant changes throughout the Greater North Sea, Celtic Seas, and Bay of Biscay and Iberian Coast regions, indicating alterations to pelagic habitat structure and functioning. Among the most pronounced was the relative change in the interannual mean abundance of holoplankton and meroplankton, particularly in the Greater North Sea and Celtic Seas, suggesting changes in the coupling of the benthic and pelagic components of the ecosystem.
Fig. 2 (caption; opening text not recovered). [...] Tables 1-7. Icons in red represent poor status, orange represent uncertain status, and green represent good status. Some indicators were not assessed in some regions (grey) due to lack of data, lack of expert resource, or lack of policy support. The Greater North Sea region had the most assessed indicators, and the Bay of Biscay and Iberian Coast the fewest. The Celtic Seas region had the most indicators categorised as in good status, but most of the assessed biodiversity indicators are uncertain or poor for any of the OSPAR regions.
Although multiple time-series datasets were analysed (see Table 1), changes in phytoplankton biomass (Changes in phytoplankton biomass and zooplankton abundance (PH2)) were remarkably consistent across the OSPAR maritime area, with anomalies of phytoplankton biomass proxies rapidly increasing from the 1980s, a manifestation of the well-documented region-wide regime shift (Beaugrand, 2004). Similarly, zooplankton abundance largely decreased across the whole area, although it did increase in some regions post-2000s. For plankton diversity (Changes in plankton diversity (PH3)), an increase in species dominance (the relative abundance of certain species compared to the overall community) was observed in 2007-2008 for the Greater North Sea stations, suggesting a shift in community composition, but the presence of a definitive trend in species dominance patterns is unclear. In the Bay of Biscay and Iberian Coast, phytoplankton communities showed an increase in species dominance over the time period of assessment.

Benthic habitats
The 2017 assessment of coastal benthic habitats indicated that 89% of the coastal water bodies in the OSPAR Maritime Area, including most of the Bay of Biscay and Iberian Coast and Celtic Seas regions, had benthic habitats meeting thresholds for good status in relation to nutrient and/or organic enrichment (Assessment of coastal habitats in relation to nutrient and/or organic enrichment (BH2-A)) with regard to macroalgae and angiosperms, and 74% with regard to benthic invertebrates (Fig. 2, Table A1). However, there were wide regional variations and several data gaps existed, with localised problem areas for nutrient and/or organic enrichment, notably in the Greater North Sea, which was assessed as uncertain status due to lack of data or uncertainties in response of the indicator to nutrients. No thresholds were agreed for IA2017, but the extent of physical damage to the seafloor (Extent of physical damage to predominant special habitats (BH3)) showed that 86% of the assessed areas in the Greater North Sea and the Celtic Seas had physical disturbance due to bottom trawling, of which 58% were assessed as highly disturbed and thus in poor status (Fig. 2, Table 2). Three quarters (74%) of all assessed areas experienced consistent fishing pressure year on year, which is very likely to affect the ability of habitats to recover. The absence of baseline habitat maps for most of the deeper areas of the Bay of Biscay and Iberian Coast prevented assessment of most of this area, with assessment only possible for shallow and coastal regions. Benthic habitat community quality was sub-regionally assessed in terms of species richness in the southern North Sea (Subtidal habitats of the southern North Sea (BH2-B)) where community quality was found to be generally lower in coastal areas than further offshore, partly due to higher fishing pressure in coastal areas. 
The indicator was therefore assessed here as in uncertain status due to spatial gaps in the data used; no threshold was agreed in IA2017. Other anthropogenic activities potentially causing pressures on benthic habitats were not assessed in 2017.

Fish
The decline in abundance of fish species sensitive to fishing effects (Recovery in the population abundance of sensitive fish species (FC1)) was found to have halted in the Greater North Sea (i.e. a statistically significant proportion of species was above the 75th percentile of each species' annual catch rate), but recovery (above the upper 25th percentile) had not occurred, meaning the primary assessment threshold was not met and so the indicator was assessed as in poor status in the region. Recovery (i.e. meeting the threshold of a statistically significant proportion of species above the 25th percentile of each species' annual catch rate) was apparent overall in the Celtic Seas, resulting in good status in that region (Fig. 2, Table A1).
For the wider demersal fish community, there were improvements in parts of the OSPAR area. Recovery in the Large Fish Index (LFI; Proportion of large fish - Large Fish Index (FC2)) in the northern Celtic Seas (west of Scotland) suggests achievement of good status in this subdivision, but elsewhere in the Celtic Seas region the increases observed did not meet targets, resulting in poor status as the threshold was not met. Increases in LFI (Proportion of large fish - Large Fish Index (FC2)) in the Greater North Sea suggested that recovery could be achieved before the next assessment if fishing pressure does not increase, but the region is not yet at good status and is hence categorised as poor. Despite these improvements for the region overall, additional analyses of the species composition of demersal fish communities (Mean Maximum Length of fish (FC3)) found that the south-eastern and central-western North Sea, some areas along the continental shelf edge to the northeast and southwest of Ireland, and parts of the Iberian coast were at the minimum observed state (lowest recorded Mean Maximum Length values); in these areas the community was dominated by small-bodied species that are generally more resilient to fishing than large-bodied species. This indicator did not have an assessment threshold in IA2017, and was therefore categorised as of uncertain status in all regions (Fig. 2, Table A1). Thus, the increase in the LFI for demersal fish in the Greater North Sea overall is dependent on only part of the region surveyed and, without a further recovery of large-bodied species elsewhere, may be limited.
Community indicators for pelagic fish were newly developed, with a weaker assessment basis, and assessment outcomes were largely of uncertain status. Nevertheless, an increase in the dominance of smaller species in the southern North Sea and west of Scotland merits further surveillance.

Marine birds
All three marine bird indicators were assessed against thresholds. Breeding populations of marine birds were determined to be in poor status in the Greater North Sea and Celtic Seas regions (Marine birds: breeding abundance (B1-B); Fig. 2, Table A1). In all three regions, and for more than a quarter of the marine bird species assessed, there has been a considerable (>20-30%) drop in abundance compared to the levels observed 25 years ago (thresholds depending on clutch size, see section 2.5). Declines in breeding abundance started mainly in the mid-2000s. Frequent and widespread breeding failure has also been observed for > 25% of species assessed in the Greater North Sea during the period assessed (2010 to 2015 inclusive), resulting in poor status. In the Celtic Seas, 75% of species did not show frequent and widespread breeding failure, indicating good status (Marine bird breeding success/failure (B3)). Breeding abundance declines (Marine birds: breeding abundance (B1-B)) and breeding failures (Marine bird breeding success/failure (B3)) were most apparent in those species feeding on small fish in the surface waters of the Greater North Sea and Celtic Seas, highlighting food availability as an important driver in the population dynamics of marine birds in the Northeast Atlantic (ICES, 2017).
Similarly, the non-breeding abundance (Marine birds: non-breeding abundance (B1-A)) of more than a quarter of the waterbird species that visit the Celtic Seas during migration and/or during winter has also declined by >20-30% between 1992 and 2014/15, indicating poor biodiversity status. In contrast, non-breeding populations of marine birds visiting the Greater North Sea were classed as in good biodiversity status, with >75% of species being present in higher numbers compared to levels in the early 1990s.

Marine mammals
Marine mammals, as top predators, can provide suitable indicators to reflect the state of the marine ecosystem. Grey seal abundance (Seal abundance and distribution (M3)) and pup production (Grey seal pup production (M5)) have increased in the Greater North Sea and eastern Celtic Seas since 1992, while the number of grey seal breeding colonies has increased or remained unchanged since 2003. Harbour seal abundance (Seal abundance and distribution (M3)) has been stable or has increased in most places but is in decline in others, though some areas have insufficient data. Indicators Seal abundance and distribution (M3) and Grey seal pup production (M5) have therefore been classed here as achieving good status in the Greater North Sea and eastern Celtic Sea as assessment thresholds were met (Fig. 2, Table A1). Most of the coastal bottlenose dolphin populations (Abundance and distribution of coastal bottlenose dolphins (M4-A1)) remained stable when compared to populations in 1990, but the number of animals in those populations remains low while the population is in decline in Portugal; this suggests poor status across the Greater North Sea, Celtic Seas, and Bay of Biscay and Iberian Coast (Fig. 2, Table A1). There is no evidence for changes in abundance of white-beaked dolphin, minke whale and harbour porpoise since 1994 while there is insufficient evidence for other species (Abundance and distribution of cetaceans other than killer whales and coastal bottlenose dolphins (M4-B)), resulting in uncertain status for this indicator. The distribution of harbour porpoise and minke whale has shifted southward in the Greater North Sea. 
Insufficient data and high levels of uncertainty prevented a robust assessment of harbour porpoise bycatch (Harbour porpoise bycatch (M6)), resulting in uncertain status, but at least 4000 porpoises (from a population of > 490,000) are incidentally caught in fishing nets annually and bycatch is considered one of the main direct human pressures on this species.

Food webs
The structure of marine food webs was found to be altered across the OSPAR area, but it is unclear how the ecosystem components are affecting each other or responding to the multiple and cumulative effects of human activities and climate change in the area. The food web indicators had no assessment thresholds in IA2017.
The pilot assessment on phytoplankton productivity (Production of phytoplankton (FW2)) was inconclusive across the whole OSPAR area, but showed changes in ecosystem functioning in some discrete sites and in parts of the North Sea, where a decreasing trend in annual phytoplankton production was reported after 1988. This trend is in contrast to the phytoplankton component of indicator Changes in phytoplankton biomass and zooplankton abundance (PH2) and suggests a change in phytoplankton community structure, which potentially could affect upper trophic levels. Scientific uncertainty and data gaps have resulted in uncertain status for this indicator (Fig. 2, Table A1). However, the pilot assessment illustrates the potential of this indicator, representing the main production source of organic matter in the marine environment, to provide valuable information to a broader and holistic ecosystem assessment.
Typical length (Size composition in fish communities (FW3)) was used to assess the size structure within fish communities but as no threshold was proposed the status is uncertain. For demersal fish the indicator was generally increasing across all OSPAR regions since 2010, while no long-term trend was found for pelagic fish, suggesting overall that recovery is ongoing (Fig. 2, Table A1). However, this follows long-term decreases in the Greater North Sea between the 1980s and 2000s, with particular declines in the southern and central North Sea. In addition, the lack of recovery in species composition (as evidenced by indicator Mean Maximum Length of fish (FC3)) in the North Sea suggests the ongoing recovery in the size structure of the fish community may be limited. Long-term decrease was also evidenced between the 1990s and 2005 in the Celtic Seas, especially in the western Irish Sea, on the continental shelf edge north-west of Ireland, and to the west of Brittany. In contrast, long-term increases overall for typical length of demersal fish were evident in the Bay of Biscay and Iberian Coast.
Change in average trophic level of consumers and marine predators (Change in mean trophic level of consumers or marine predators (FW4)), applied in the Bay of Biscay and Iberian Coast, showed no apparent change in overall food web structure over recent decades. However, there are some signs of improvement when only predators (Mean Trophic Level (MTL) 3.25 and MTL4) were considered. The status of the average trophic level in the Bay of Biscay and Iberian Coast therefore remains uncertain (Fig. 2, Table A1). These changes occurred alongside the underlying shifts observed in plankton communities assessed through plankton lifeforms (Changes in phytoplankton and zooplankton communities (PH1/FW5)).

NIS
The number of newly recorded non-indigenous species (Trends in new records of non-indigenous species introduced by human activities (NIS3)) in the Greater North Sea, Celtic Seas and the Bay of Biscay and Iberian Coast regions over the period 2003-2014 varied by year and region (references in Table 7); no assessment thresholds were applied to this indicator in IA2017. In the Greater North Sea there was a relatively constant linear increase in the number of newly recorded NIS over time, whereas the other two regions had particular years with comparatively high numbers of new NIS recorded (for the Celtic Seas, 2006 and 2012, and in the Bay of Biscay and Iberian Coast, 2004), which meant a similar linear increase did not occur. Within the Celtic Seas this resulted in a stepwise increase in newly recorded NIS over the period; however, within the Bay of Biscay and Iberian Coast the cumulative number of recorded new NIS plateaued after 2004, with far fewer new NIS recorded after this date. These results are highly likely to be an artefact of varying monitoring and reporting efforts between the regions over this period rather than discrete episodic events leading to high levels of introductions occurring in particular years (Fig. 2, Table A1), resulting in uncertain status. The information presented does indicate a consistent rate of new introductions occurring within the Greater North Sea and the Celtic Seas. While this does highlight the limitations of assessing new records of NIS and the dependence of the indicator on accurate monitoring data, it also illustrates the effectiveness of comparing six-year blocks of data as an indicator for MSFD purposes.

Discussion
This first science-policy collaborative, regional sea scale effort at assessing the status of Northeast Atlantic marine biodiversity has revealed variation in biodiversity status between the biodiversity indicators, the seven ecosystem components, and the OSPAR regions assessed here (Fig. 2). The semi-quantitative method applied here used expert interpretation of change in a total of 52 indicator-region combinations with respect to assessment thresholds, links to pressures, and/or knowledge of indicator state to categorise indicators as in poor, uncertain, or good biodiversity status. Indicators were not assessed in all regions, due to lack of data, lack of expert resource to conduct an assessment, or lack of policy support (icons shown as grey in Fig. 2). The status of most assessed indicator-region combinations was either uncertain (n = 30; 58% of indicator-region combinations) or poor (n = 13; 25% of indicator-region combinations) (Fig. 2). In fact, very few examples of good status (n = 9; 17% of indicator-region combinations) were identified, which is not in line with meeting MSFD objectives, or wider policy initiatives for sustainability such as the Aichi Biodiversity targets (Convention on Biological Diversity Secretariat, 2010) or United Nations Sustainable Development Goals (United Nations, 2015). Some bright spots categorised in good status were revealed, such as with marine mammals (i.e. seals) in the Greater North Sea and Celtic Seas, coastal benthic habitats in the Bay of Biscay and Iberian Coast and Celtic Seas (with respect to organic and nutrient enrichment), some fish communities in the Greater North Sea and Celtic Seas, and some birds in the Celtic Seas. Of the three regions, the Celtic Seas had the most (n = 5) indicators in good status, including those of coastal benthic habitats, fish, birds, and marine mammals, with 5 indicators categorised as poor status. 
Similarly, the Greater North Sea had the most (n = 6) indicators with poor status but also had 3 indicators with good status. The Bay of Biscay and Iberian Coast had the highest proportion of indicators categorised as in uncertain status (9 out of 12 assessed indicators). This could be because the Greater North Sea and Celtic Seas have relatively good data coverage when compared to the Bay of Biscay and Iberian Coast, which reduced uncertainty and increased the ability to determine status.
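The overall proportions quoted above follow directly from the reported counts. The short check below recomputes them; the function name `status_shares` is hypothetical, and the counts are those stated in the text for the 52 assessed indicator-region combinations.

```python
def status_shares(counts: dict) -> dict:
    """Return each status category as a whole-number percentage of all
    assessed indicator-region combinations."""
    total = sum(counts.values())
    return {status: round(100 * n / total) for status, n in counts.items()}


# Counts of assessed indicator-region combinations reported in the text.
reported = {"uncertain": 30, "poor": 13, "good": 9}

print(sum(reported.values()))   # 52
print(status_shares(reported))  # {'uncertain': 58, 'poor': 25, 'good': 17}
```

The three categories sum to the 52 assessed combinations, and the rounded shares match the 58%, 25% and 17% reported.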
Most of the assessed biodiversity indicator results per region (n = 30; 58% of indicator-region combinations) fell into the uncertain status category as data limitations, such as the quality and quantity of information, prevented robust status categorisation. Data gaps were mostly a result of insufficient geographical and temporal coverage for both pressure and biodiversity data. For example, some marine mammal indicators could not be assessed with confidence since the available pressure data, i.e. harbour porpoise bycatch, and the biological data, i.e. the abundance of cetaceans other than bottlenose dolphins, were not at a wide enough spatial scale. In terms of the temporal scale of analysis, the length of the time-series is an important factor for the interpretation of the indicator and hence, biodiversity assessment. In the case of plankton, the available datasets have different timespans, from 14 to 60 years, making it challenging to establish a common reference period across datasets to assess potential large-scale long-term changes. Paucity of long-term benthic datasets hinders the development of reference points to measure change over time. In the past few years, however, the implementation of the Water Framework Directive (2000) and MSFD (2008) and increased numbers of environmental impact studies and Marine Protected Areas have slightly increased the number of standardised surveys and datasets collected. These datasets, used to assess benthic biodiversity status, improve quantitative and in situ validated knowledge on species and biotopes, but only for a limited number of geographic areas and for specific environmental management scales and objectives. This also highlights the need to develop more coordinated benthic monitoring. 
The variety of methods and protocols (gears, metadata, taxonomic detail, quality assurance and sampling designs) used to acquire the different datasets also limits the spatial aggregation and comparability of biodiversity data at (sub)regional scales. Likewise, the data collected for NIS were patchy both temporally and spatially due to limited, uncoordinated, and varying efforts in monitoring and species categories considered. The absence of a clear baseline from which assessments of new arrivals could be made has been indicated as an issue for NIS which has subsequently been addressed (Tsiamis et al., 2019). The temporal scale at which indicators are assessed must therefore be appropriate to the policy question being addressed. In future assessments, reporting more indicators over multiple temporal scales (as was done for Recovery in the population abundance of sensitive fish species (FC1), Mean Maximum Length of fish (FC3), and Size composition in fish communities (FW3)) could provide further information relevant to management and use all available data, while still retaining wider ecological context, including multi-decadal trends caused by climate change (Bedford et al., 2020c).
Even where data were available, the novel nature of this work was another contributing factor to the high number of indicators with uncertain status, as a result of their relatively low state of methodological development and/or limited previous application in a management context. Some indicators, such as the Proportion of large fish - Large Fish Index (FC2) and Marine birds: non-breeding abundance (B1-A), have a history of assessment. While they were further developed for IA2017, the previous assessment experience, including an established network of experts and data flows, enabled the status category addressed by these indicators to be clearly determined. Most OSPAR common benthic, pelagic and food web indicators were developed explicitly for the MSFD and were assessed for the first time for the OSPAR Intermediate Assessment 2017. As part of this assessment process, expert networks and data flows had to be established and appropriate indicators had to be selected and developed. An important yet often complex aspect of biodiversity indicators is a clear understanding of the nature and extent of potential pressures that could drive community changes. Modelling and statistical analyses provide a clear route forward to develop our understanding, and such studies have been furthered for fish and food web indicators since OSPAR Intermediate Assessment 2017 (Piroddi et al., 2021;Queirós et al., 2018;Spence et al., 2021;Thompson et al., 2020;Thorpe et al., 2022). This understanding is essential for policy-makers to manage human activities effectively to reduce pressures and restore biodiversity.
Currently, scientific knowledge gaps exist around the mechanisms through which multiple pressures enact change in these new benthic, pelagic and food web indicators; this can be due to lack of data availability and/or relatively low scientific understanding of pressure-state relationships for some benthic, pelagic and food web components. Such relationships are often not linear (e.g. Elliott et al., 2018;Mackey and Currie, 2001;McQuatters-Gollop et al., 2007;Noël et al., 2009;Samhouri et al., 2010) and pressures can act in a cumulative manner making it difficult to identify the most important causes of biodiversity change. For example, pelagic habitat indicators may be influenced at local scales by nutrients while climate change is also acting at wide spatial and long temporal scales (Bedford et al., 2020b;McQuatters-Gollop et al., 2007). Similarly, the cumulative effects of multiple anthropogenic activities, most notably land-based contamination, and fishing, along with climate change, are key pressures on the benthic (Condition of benthic habitat-defining communities (BH2), Extent of physical damage to predominant special habitats (BH3)), food web (Size composition in fish communities (FW3), Change in mean trophic level of consumers or marine predators (FW4)) and fish (Recovery in the population abundance of sensitive fish species (FC1), Proportion of large fish -Large Fish Index (FC2)) indicators.
The effects of pressures on benthic habitats can also be measured by evaluating the sensitivity ranges of the species and biotopes within each habitat type and the spatial and temporal overlaps with those activities causing the pressures (de Juan and Demestre, 2012;de Juan et al., 2020;Dupaix et al., 2021;Tillin and Tyler-Walters, 2014). Activities operating at large scale, such as fishing, will have stronger effects on the sensitivity of benthic habitats and in particular their ability to recover from pressures, than those limited to a geographical area or with a small footprint (Halpern et al., 2008;Jennings et al., 2012;OSPAR, 2017m). The Production of phytoplankton (FW2) indicator was a pilot indicator in development at the time of IA2017, and relationships with pressures had not yet been evaluated. However, climate change and riverine inputs of nutrients to the North Sea have elsewhere been found to be important factors influencing primary productivity (e.g. Capuzzo et al., 2018).
Climate change was identified as a common key pressure on Northeast Atlantic marine biodiversity. However, in many cases (particularly for pelagic habitat, benthic habitat, and food web indicators) the links between climate change and indicator change were not quantified in IA2017. Climate change was not a key focus of IA2017 and was at the time considered an outside pressure, or prevailing condition, on the system (e.g. Elliott et al., 2015), with its role in driving change unexplored or unquantified for many of the indicators. Climate change was identified as a major pressure causing change in plankton indicators, but resource limitation and scientific knowledge gaps prevented the quantification of this pressure-state relationship as part of IA2017. Since IA2017, however, temperature change has been identified as a key pressure, notably causing changes in phytoplankton and zooplankton communities (Bedford et al., 2020b) and the proportion of large fish (Queirós et al., 2018), knowledge that will progress the scientific robustness of the future assessments.
In benthic habitats, fishing is clearly driving change as evaluated in the Extent of physical damage to predominant special habitats (BH3) indicator, but the ability to quantify the relationship between fishing and changes in species diversity of subtidal benthic communities (Subtidal habitats of the southern North Sea (BH2-B)) is highly dependent on data available from different habitat types under a variety of levels of exposure to pressures (Pitcher et al., 2022;van Loon et al., 2018). For marine mammals, fishing by-catch is one of the main human pressures on harbour porpoises (Harbour porpoise bycatch (M6)). However, there are high levels of uncertainty in estimates of harbour porpoise by-catch rates, making it difficult to confidently categorise ecological status.
Indicators for some ecosystem components are responding to both fishing and climate change. Change in bird indicators (Marine birds: Non-breeding abundance (B1-A), Marine birds: breeding abundance (B1-B), Marine bird breeding success/failure (B3)) is driven by both climate change and fishing, with climate change having an indirect effect on birds through food supply (e.g. forage fish, which are responding to warming), while fishing directly removes fish as a food source for birds from the ecosystem (Cook et al., 2014;Frederiksen et al., 2004). The indicators for Recovery in the population abundance of sensitive fish species (FC1, Greenstreet et al., 2012), Proportion of large fish - Large Fish Index (FC2, Greenstreet et al., 2010), and Mean Maximum Length of fish (FC3, Greenstreet and Rogers, 2006) were constructed to be responsive to the effects of fishing. Nevertheless, in the long term, each indicator will likely be affected by climate since the recovery of sensitive species (Bluemel et al., 2022) and communities (Queirós et al., 2018) will depend on prevailing environmental conditions. Similarly, the Change in mean trophic level of consumers or marine predators (FW4) indicator appears to be mainly sensitive to fishing pressure (e.g., Arroyo et al., 2019;Preciado et al., 2019), although lack of reliable landings data and the inaccuracy of the trophic levels assigned to predators has challenged the quantification of this relationship. Even though results of IA2017 showed a promising apparent increase in the MTL trend based on survey data, indicating a recovery of the Bay of Biscay and Iberian Coast's bentho-demersal system, further analyses have revealed spatial differences in MTL values and demonstrated the negative effect of fishing pressure on the indicator at local scale.
These results suggest that the apparent stability revealed by the indicator based on landed catch data may be masking other processes such as the expansion of demersal fisheries to deeper waters, and an over-exploitation of, especially, pelagic species (Arroyo et al., 2017). Overall, the stability of the indicator when analysed at the regional level may indicate that the ecosystem is resilient (withstanding the pressure) but consistently overexploited. The effects of climate change and shifts in productivity interact with fishing, making it difficult to disentangle the effects of various pressures or to identify targets and thresholds.
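As an illustration of how a mean trophic level (MTL) value of the kind discussed above is commonly derived, the following minimal sketch computes a biomass-weighted mean of trophic levels. It assumes the general weighted-mean form of the metric; the exact specification of the FW4 indicator, including which taxa are included and any trophic-level cut-offs, is not reproduced here, and the function name and example values are hypothetical.

```python
from typing import Iterable, Tuple


def mean_trophic_level(records: Iterable[Tuple[float, float]]) -> float:
    """Biomass-weighted mean trophic level.

    Each record is a (trophic_level, biomass) pair for one taxon.
    Biomass may come from landings or survey catches; the weighting
    is the same in either case.
    """
    records = list(records)
    total_biomass = sum(biomass for _, biomass in records)
    if total_biomass <= 0:
        raise ValueError("total biomass must be positive")
    weighted_sum = sum(tl * biomass for tl, biomass in records)
    return weighted_sum / total_biomass


# Hypothetical community: (trophic level, biomass in arbitrary units).
# These values are illustrative only and are not taken from IA2017.
survey = [(4.2, 10.0), (3.4, 30.0), (2.6, 60.0)]
mtl = mean_trophic_level(survey)
print(f"MTL = {mtl:.2f}")
```

Because the metric is a ratio of sums, a shift in catch composition towards abundant low-trophic-level pelagic species can lower or stabilise the value even when predator biomass is unchanged, which is one reason landings-based and survey-based trends may diverge.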
Some other anthropogenic pressures were also found to be important for Northeast Atlantic marine ecosystems. An increase in the Trends in new records of non-indigenous species introduced by human activities (NIS3) indicator is likely linked to ballast water, hull fouling and aquaculture (Tidbury et al., 2016). However, this pressure-state relationship was not quantified in IA2017, due to uncertainties and gaps in the data, so the status of the indicator is uncertain (Fig. 2, Table A1). For the three pelagic habitats indicators (Changes in phytoplankton and zooplankton communities (PH1/FW5), Changes in phytoplankton biomass and zooplankton abundance (PH2), and Changes in plankton diversity (PH3)), it is likely that nutrients are influencing change in some localised areas (McQuatters-Gollop et al., 2007), but, again, this relationship needs to be quantified in reference to wide scale changes caused by warming (e.g. Bedford et al., 2020b). Nutrients also influence other ecosystem components such as benthic habitats. The Assessment of coastal habitats in relation to nutrient and/or organic enrichment (BH2-A) indicator clearly showed links to nutrient loading, but data are only available for some coastal habitats, assessed through various multimetric indices, resulting in uncertain status. Chemical and noise pollution affect individual marine mammals, but the effects of these pressures on populations are not yet well understood (Erbe et al., 2018; Jepson et al., 2016; Murphy et al., 2015). Habitat loss is an increasing pressure for marine bird species displaced from areas where offshore wind farms are operating (Mendel et al., 2019) or where ship traffic causes disturbance (Mercker et al., 2021; Schwemmer et al., 2011).
Finally, gaps in data or knowledge were not the only challenges for determining if indicator status in an area was 'good' or 'not good'. The original approach for the biodiversity components of the MSFD was based on traditional environmental aspects, such as contaminant effects on biota, with known pressure-state relationships (e.g. Sahlmann et al., 2017). In such cases, indicators are monitored and if the state indicator exceeds a threshold value, it is considered 'not Good Environmental Status'. The IA2017 experience has revealed that, for some ecosystem component indicators, this process is not so simple or clear, and that 'good' or 'not good' is difficult to define with a single threshold value allowing categorisation of status. For pelagic habitats (Changes in phytoplankton and zooplankton communities (PH1/FW5)), for example, the maintenance of ecosystem functioning is more important than changes in the abundance of plankton lifeforms or individual taxa, but is difficult to define with a threshold value. There has been a long history of development of thresholds for commercial fisheries, where the concept is relatively straightforward (Schaefer, 1991), but even here multi-species approaches that take into account food web interactions are considered problematic (Thorpe and De Oliveira, 2019). For fish species of concern that are not commercially fished, it is challenging to categorise environmental status due to the lack of a baseline and/or reference conditions and, often, limited data. Furthermore, discerning the influence of environmental/natural drivers versus fishing on predator-prey interactions and, ultimately, ecosystem functioning, is an issue that is far from being solved, although great progress has been made since IA2017 (e.g. Link and Watson, 2019; Piroddi et al., 2021; Thompson et al., 2020).
Although it is possible to measure seafloor disturbance using a combination of sensitivity ranges with levels of exposure to pressures (Halpern et al., 2008; OSPAR, 2017m; Tillin and Tyler-Walters, 2014), it is difficult with the data available to define the boundaries for benthic habitat recovery and therefore to determine 'good' status. This lack of boundary definition presents challenges for quantifying the magnitude of habitat damage, although modelling approaches could be used to estimate the levels of pressure effects or the proportion of habitat area to be protected (Jennings et al., 2012; Rondinini, 2011; Rovellini and Shaffer, 2020). There are also still methodological gaps around how to extrapolate the fine scale measure of the state-pressure relationship (Condition of benthic habitat-defining communities (BH2)) to wider biological and biogeographical scales (Extent of physical damage to predominant special habitats (BH3)). However, the use of benthic condition indicators (Condition of benthic habitat-defining communities (BH2)) to assess habitat condition and develop pressure-state curves (Elliott et al., 2018) is a key step in defining these boundaries as well as in establishing condition (also called quality) thresholds. When this is not possible, a precautionary approach is required, stopping the pressure by managing the related activities (e.g. bottom trawling) to avoid continued habitat damage while environmental status is evaluated. This would notably be possible and important in Marine Protected Areas, even if the benthic habitats are often not the initial and key objective of management (Greathead et al., 2020). Similarly, the presence of newly arrived NIS does not directly relate to poor or degrading environmental conditions as there can be considerable delay, sometimes of several decades, between the introduction of a NIS and related effects on the environment.
In many cases, especially in the marine environment, the effects of NIS are poorly understood, further restricting the capability to directly link new records to environmental effects. In terms of management, the prevention of NIS introduction is crucial, as it is very difficult locally, and almost impossible at wider scales, to avoid the proliferation of invasive marine species, which lead to high socio-economic impacts.
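The single-threshold categorisation described in this discussion, and the way it breaks down when no meaningful threshold exists, can be sketched as follows. This is a simplified illustration only: the function name, the `exceeding_is_bad` flag, and the fallback to 'uncertain' are assumptions made for the example, and the sketch does not reproduce the assessment criteria actually applied in this study.

```python
from typing import Optional


def categorise_status(
    value: Optional[float],
    threshold: Optional[float],
    exceeding_is_bad: bool = True,
) -> str:
    """Categorise an indicator value against a single threshold.

    Returns 'uncertain' when either the indicator value or the
    threshold is unavailable, which is the situation described for
    many biodiversity indicators. The direction flag is hypothetical:
    for a contaminant-type indicator, exceeding the threshold is bad;
    for an indicator such as a large-fish proportion, falling below
    it would be.
    """
    if value is None or threshold is None:
        return "uncertain"
    breached = value > threshold if exceeding_is_bad else value < threshold
    return "not good" if breached else "good"


# Illustrative values only; none are taken from IA2017.
print(categorise_status(0.8, 0.5))                          # exceeds threshold
print(categorise_status(0.2, 0.3, exceeding_is_bad=False))  # below a minimum
print(categorise_status(12.0, None))                        # no threshold defined
```

The sketch makes the limitation explicit: an indicator that summarises ecosystem functioning or multiple lifeforms has no single `threshold` to pass in, so a rule of this kind can only return 'uncertain' unless status is inferred by other means, such as links to known pressures.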
Even though knowledge gaps and challenges remain around categorising the status of biodiversity indicators, the work presented here demonstrates that science-policy collaboration is a successful approach to delivering robust science to support policy needs. IA2017 represents one of, if not the, world's most comprehensive applications of the ecosystem approach, which looked across multiple ecosystem components using a variety of indicator types at multiple spatio-temporal scales to better understand and manage marine biodiversity and ultimately realise the ambitions of the MSFD. This process enabled synergistic working between 188 nominated scientists, as well as policymakers from 15 Contracting Parties, to develop policy-useful biodiversity indicators, evaluate indicator change, assess environmental status where possible, and communicate the work appropriately (e.g. McQuatters-Gollop et al., 2019b). Furthermore, the IA2017 process established data flows which increased the level of participation in the assessment from contracting parties. It was found that top-down data calls, issued by OSPAR through the heads of Contracting Party delegations, work better at mobilising data into the assessment process than bottom-up data calls among experts. Stabilising and formalising data flows has facilitated the whole assessment process, particularly for indicator datasets that rely on individual national or even finer scale monitoring programmes (e.g. datasets that are maintained by nongovernmental research laboratories). The working relationships, information flows, cross-cutting issues, and lessons learned during the IA2017 process will underpin and facilitate future biodiversity assessments, and will be important to consider for future policy mechanisms such as the development of the CBD Post-2020 Global Biodiversity Framework. The process, however, does have room for improvement. 
For example, extending scientific expert contribution by enlarging the expert groups to gain an even more robust scientific consensus, and enabling official citable authorship of experts by OSPAR could contribute positively to the next assessment cycles.

Recommendations for future assessments
To increase the robustness of future assessments, biodiversity monitoring should be improved by filling the gaps in observations. Monitoring should be extended and coordinated so that the entire communities of plankton, benthos, fish, and NIS can be assessed in a more comprehensive way, for example, by including data on the smaller size fraction of the plankton, by monitoring both benthic biomass and diversity, including cephalopods, and by incorporating NIS monitoring in risk areas across all biodiversity components. Also, in further indicator development, other metrics and parameters from novel sampling and monitoring techniques and technologies (e.g. Pitois et al., 2021), and their potential contribution towards the production of biodiversity data products, should be considered (Hardisty et al., 2019).
Secondly, conceptual understanding is still evolving around pressure-state relationships for many biodiversity indicators. There are particular gaps around how multiple pressures act on benthic, pelagic, and marine mammal communities, and how climate change affects indicators. These pressure-state relationships need to be quantified, and the effects of multiple pressures on indicators' sensitivity disentangled, so that policy can implement effective management measures. Similar to the spatially-nested approach taken for pelagic habitats indicators (Bedford et al., 2020b;McQuatters-Gollop et al., 2019a), which considers pressures at multiple spatial scales, a set of indicators across ecosystem components, and assessed at the regional sea scale, could help to spatially separate areas experiencing direct pressure effects, such as from fishing or nutrients, from long-term change such as climate change or natural variability.
Third, additional work needs to be done on alternative methods of systematically assessing indicators for GES where threshold values are not meaningful. More thinking is needed to accommodate the nuance of interpreting indicators that capture magnitude of change and recovery after halting pressures, or complex indicators that represent multiple aspects of ecosystem functioning. This also includes the further development and testing of existing candidate indicators. For example, the candidate food web indicator Ecological Network Analysis Indices (FW9) is necessarily complex as it represents a set of indices across multiple trophic levels, which form an overall picture of ecosystem functioning and organization (de Jonge and Schückel, 2021; Safi et al., 2019). The simple method we applied here, of assessing indicator status by linking indicator results to pressures in the absence of threshold values, can underpin this development.
Fourth, the conceptual modelling of indicator interactions is needed to enable integrated ecosystem assessment. Links between indicators need to be defined and, ultimately, quantified with data. The conceptual understanding of these interconnections is developing but the data or understanding to define and quantify these relationships is lacking. This is likely to be challenging due to multiple spatial and temporal scales, but some work on conceptualising these relationships has already begun (Elliott et al., 2017b; Elliott et al., 2018; Lockerbie et al., 2018). Understanding the interactions between indicators will help us provide even more robust evidence for policy. The Ecological Network Analysis approach (ENA; candidate indicator Ecological Network Analysis Indices (FW9)) could help to understand such interconnections since the set of indices encompasses all components of the benthic and pelagic systems (Fath et al., 2019; Safi et al., 2019). An approach similar to Raoux et al. (2019) and Nogues et al. (2021) could bring together ENA indices (Ecological Network Analysis Indices (FW9)) and the mean trophic level indicator (Change in mean trophic level of consumers or marine predators (FW4)) to assess the status of food webs. The challenge would then be to make the conceptual complexity of the ENA approach understandable to policy makers and the general public (de Jonge and Schückel, 2021).
Finally, there is a clear and demonstrable need for this type of work and so it must be adequately resourced. In many instances indicators could not be assessed in IA2017 due to lack of expert resource to develop indicators and carry out assessments. For benthic and pelagic habitats and food webs, specific recommendations were developed and published as an action plan to guide through collective priorities for how to better progress towards a more ecosystem-based approach to holistic assessment, but this work needs to be consistently resourced (Padegimas et al., 2017). Much of the indicator development thus far has been funded ad hoc through external projects. This meant that expert group leads and indicator leads spent much effort and time responding to competitive national and European funding calls by writing proposals, many of which were not funded. A more efficient way to resource this policy work would be through consistent investment from Contracting Parties.

Conclusions
This work adds value to OSPAR IA2017 by determining, for the first time, the status of biodiversity, as addressed by indicators, across multiple ecosystem components and at a Northeast Atlantic scale, based on an understanding of pressure-state relationships. Some 'good' status bright spots were identified, along with some 'poor' status lowlights. The status of most biodiversity indicators across much of the Northeast Atlantic, however, was found to be 'uncertain', due to data and knowledge gaps, and challenges around the nuanced nature of categorising status for complex biodiversity indicators. The authors hope that this work will contribute to the upcoming formal OSPAR Quality Status Report and Contracting Party MSFD assessment processes by interpreting changes reported in IA2017 with new perspectives and insights that can underpin future formal assessments and, even more importantly, the achievement of Good Environmental Status.
More work urgently needs to be done to fill data gaps; identify and quantify pressure-state relationships; understand the influence of climate change on indicators; test options for categorising environmental status where threshold values are unavailable or inappropriate; and conceptualise and quantify links between indicators and ecosystem components. These issues are not straightforward to resolve, particularly when working across multiple spatio-temporal scales. Crucially, success depends on the adequate resourcing of this work, which becomes ever more urgent as climate change and direct pressures on Northeast Atlantic marine ecosystems accelerate.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Table A1
Building on IA2017, the biodiversity status of each indicator has been assessed according to the criteria described in Table 8. Red cells represent 'poor' status for an indicator in a region, while amber cells denote 'uncertain' status and green cells indicate 'good' status. Indicators were not assessed as part of IA2017 for cells in grey. Indicator icons are also used in Fig. 2.