Assessing the environmental status of selected North Atlantic deep-sea ecosystems

a more systematic assessment of habitat and ecosystem components in the deep sea. However, stronger cross-sectoral collaborations, the use of novel technologies and open data-sharing platforms will be critical for establishing environmental baseline indicator values in the deep sea that will contribute to the science base supporting the implementation of marine policies and stimulating Blue Growth.


Introduction
Over the last decades technological development has enabled the exploration of the deep sea, i.e. areas below 200 m water depth. This has led to the discovery of a great diversity of habitats and ecosystem components including hotspots of biodiversity like cold-water coral (CWC) reefs (Roberts et al., 2006; Buhl-Mortensen et al., 2010; Henry and Roberts, 2017) and deep-sea sponge aggregations (Maldonado et al., 2017; Kazanidis et al., 2019). Studies have also revealed that deep-sea ecosystems are under multiple pressures arising from human activities such as bottom trawling (Clark et al., 2016; Kroodsma et al., 2018), operation of oil and gas infrastructures (Cordes et al., 2016; Vad et al., 2018), deep-sea mining (Van Dover et al., 2017; Gollner et al., 2017) and changes in water mass characteristics as a consequence of global climate change (Sweetman et al., 2017; Morato et al., 2020). Halpern et al. (2019) assert that most of the ocean (59%) is experiencing significantly increasing cumulative impacts. Human activities can have severe and long-term impacts on deep-sea Vulnerable Marine Ecosystems (VMEs) as by definition (FAO, 2009, 2016) these are formed from slow-growing, long-lived organisms whose recovery following human disturbance can take decades or even centuries (Larcom et al., 2014; Huvenne et al., 2016; Clark et al., 2019). Thus, the assessment of the health status of deep-sea ecosystems and the implementation of management and conservation strategies are of utmost importance considering the provision of goods and services such as climate regulation, food and energy supply, potential for bioprospecting (Folkersen et al., 2018), cultural services and inspirational contributions (Armstrong et al., 2019).
Despite recent advances in deep-sea research, current knowledge of the structure and functioning of deep-sea ecosystems remains extremely limited. It is often composed of discrete observations over small areas and over short time scales. This is in contrast to the more comprehensive knowledge for coastal ecosystems regarding aspects like the spatial distribution of habitats, the natural variability of ecosystems (at short and long temporal scales), the extent and intensity of multiple human activities and the resilience of ecosystems to human pressures (Glover et al., 2010; Ramirez-Llodra et al., 2011; Murillo et al., 2016; OSPAR, 2017; Dailianis et al., 2018; Miloslavich et al., 2018; Pham et al., 2019). Moreover, information about the distribution of multiple human pressures across the deep sea (Benn et al., 2010; Pham et al., 2014) and experimental work about the response and resilience of deep-sea species and habitats to multiple human pressures are also very limited (Lunden et al., 2014; Büscher et al., 2017; Levin et al., 2019).
The lack of available data, especially of long-term time series, from deep-sea ecosystems makes the selection of suitable indicators for assessing the deep-sea environmental status challenging. In 2008, the European Union (EU) established the Marine Strategy Framework Directive (MSFD) with the overarching aims of achieving Good Environmental Status (GES) in European Member States' seas and protecting the resource base upon which marine-related economic and social activities depend (European Commission, 2008). The MSFD sets out descriptors and criteria in order to evaluate the achievement of GES in European marine waters. The 11 qualitative descriptors comprise: D1-Biodiversity, D2-Non-indigenous species, D3-Commercial fish and shellfish, D4-Food webs, D5-Eutrophication, D6-Sea-floor integrity, D7-Hydrographic conditions, D8-Contaminants, D9-Contaminants in seafood, D10-Marine litter, and D11-Energy incl. underwater noise. Obligations arising from the MSFD apply to areas under the sovereignty and jurisdiction of the EU Member States. The first MSFD reports (initial assessment, GES determination and target setting) were submitted by EU Member States to the EC in 2012. These reports included very little information about the deep-sea areas in the Member States' national jurisdiction (i.e. France, Ireland, Portugal, Spain, and United Kingdom as countries with Exclusive Economic Zones (EEZ) in the North Atlantic). This is a major gap as deep-sea areas in the North Atlantic and Mediterranean Sea support rich habitats (rocky and sedimentary habitats, VMEs like CWC reefs, deep-sea sponge aggregations or chemosynthetic ecosystems), the conservation of which could be jeopardized by expanding multisectoral human activities and changing ocean conditions (North Atlantic: Bett, 2001; Reveillaud et al., 2008; Buhl-Mortensen et al., 2015; Hátún et al., 2017; OSPAR, 2017; Mediterranean Sea: Fabri et al., 2019; Maier et al., 2019; Danovaro et al., 2020).
Integrated assessments of deep-sea environmental status require tools able to accommodate different and disaggregated data sets (e.g. density of habitat-forming species, body-size distribution of commercial fishes, intensity of bottom trawling on the seafloor) collected across various spatial and temporal scales (OSPAR, 2017; Fabri et al., 2019; see Danovaro et al., 2020 for the work of the "Implementation of the MSFD to the DEep Mediterranean Sea" project, IDEM, assessing the environmental status in Mediterranean deep-sea areas). In this context, the "Nested Environmental status Assessment Tool" (NEAT; http://www.devotes-project.eu/neat/) was developed to assess the environmental status in marine environments and specifically for the MSFD and GES (Borja et al., 2016). One of the key features of NEAT is that it assembles data from various indicators (average value and standard error) over different spatial and temporal scales in a nested hierarchical structure which is composed of interconnected indicators, spatial assessment units, habitats and ecosystem components. Each of the indicators is linked to a different MSFD descriptor. Therefore, NEAT allows for an integrated assessment of marine environmental status. Up to now NEAT has mostly been used in coastal areas, which usually have a fair amount of data; in contrast, data availability in the deep sea is in general very limited. Those NEAT assessments in shallow-water ecosystems produced useful and sound results, always depending on the amount and quality of data available (Nemati et al., 2017, 2018; Pavlidou et al., 2019; Borja et al., 2019).
In order to contribute to the implementation of the MSFD in the deep North Atlantic, the H2020 ATLAS project ("A transatlantic assessment and deep-water ecosystem-based spatial management plan for Europe"; https://www.eu-atlas.org/) carried out an assessment of the environmental status of nine strategically-selected North Atlantic deep-sea areas. The ATLAS work focused on descriptors D1, D3, D4, D6 and D10 as they are particularly relevant for the deep sea. The objectives of the current manuscript, in line with the aim of ATLAS's work, were to: (1) explore and propose indicators and identify the relevant MSFD criteria that could be addressed in deep-sea ecosystems, (2) evaluate the performance of the NEAT tool in the assessment of deep-sea environmental status, (3) identify challenges (e.g. data standardization) and opportunities (e.g. use of new technology) in the assessment of deep-sea environmental status.

Assessment of environmental status using NEAT
NEAT is a freely downloadable assessment tool (http://www.devotes-project.eu/neat). Data analysis classifies Spatial Assessment Units (SAUs), habitats and ecosystem components into five predefined categories of environmental status (bad, poor, moderate, good and high). Assessments of marine environmental status using NEAT are based on the selection of indicators, accompanied by the supply of data to measure these indicators and the setup of boundary values for each of the indicators. These steps are followed by the designation of the above-mentioned SAUs, habitats and ecosystem components in each of the study areas.

Selection of indicators
NEAT includes a database of >500 indicators for MSFD descriptors D1-D8, D11 (Teixeira et al., 2016). From this database, indicators deemed to be the most appropriate for assessing deep-sea environmental status were selected (e.g. biomass of commercial fish, areal extent of human affected area). This selection took into account the special features of the deep sea including habitat remoteness (and associated difficulties in sampling) and the high sensitivity of deep-sea organisms to human activities. In addition, new indicators were suggested by 15 individual deep-sea experts from the ATLAS consortium and added to those extracted from the NEAT database. This was done to supplement the assessment using deep-sea specific indicators (the NEAT database contains several indicators that are not applicable to the deep sea e.g. indicators related to seagrasses) or to address the lack of D10 (marine litter) indicators in the NEAT database. In total 305 indicators were extracted. To reduce this list to a manageable number, a further selection was made based on ATLAS expert judgement, taking into account the relevance of an indicator for the assessment of the deep-sea environmental status and the feasibility of measurements (considering the technological challenges involved in the exploration of the deep sea and the associated limited availability of data). Indicators advocated by 80% of the 15 ATLAS experts were shortlisted. The only exception to that threshold was for D3 indicators, where the level of agreement was lower and it was decided to reduce the threshold to 65% to ensure an adequate list of D3 indicators.

G. Kazanidis, et al. Ecological Indicators 119 (2020) 106624
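The shortlisting rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the indicator names and vote counts are hypothetical placeholders (the source does not report per-indicator votes), and "advocated by 80%" is read here as "at least 80%".

```python
from dataclasses import dataclass

N_EXPERTS = 15               # number of ATLAS experts, as stated in the text
DEFAULT_THRESHOLD_PCT = 80   # agreement threshold for most descriptors
D3_THRESHOLD_PCT = 65        # relaxed threshold for D3 indicators


@dataclass(frozen=True)
class Candidate:
    name: str        # hypothetical indicator name
    descriptor: str  # MSFD descriptor, e.g. "D3" or "D6"
    votes: int       # hypothetical number of experts advocating the indicator


def is_shortlisted(candidate: Candidate, n_experts: int = N_EXPERTS) -> bool:
    """Return True if the share of advocating experts meets the threshold."""
    threshold = D3_THRESHOLD_PCT if candidate.descriptor == "D3" else DEFAULT_THRESHOLD_PCT
    # Integer arithmetic avoids floating-point edge cases at exactly 80%.
    return candidate.votes * 100 >= threshold * n_experts


# Hypothetical candidates, for illustration only.
candidates = [
    Candidate("indicator A", "D6", 14),  # 93% agreement: shortlisted
    Candidate("indicator B", "D3", 10),  # 67% agreement: passes the D3 threshold
    Candidate("indicator C", "D4", 11),  # 73% agreement: below 80%
]
shortlist = [c.name for c in candidates if is_shortlisted(c)]
```

With 15 experts, the 80% threshold corresponds to at least 12 votes and the 65% threshold to at least 10 votes.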

Data requirements and collection
Following the selection of indicators, data were amassed from each study area in order to fulfil NEAT data requirements which are described in detail by Berg et al. (2016) and Uusitalo et al. (2016). In summary, the main data requirements to apply NEAT are the following: (1) Designation of the Spatial Assessment Units (SAUs) i.e. areas where the assessment of environmental status takes place. Specific SAUs were designated for each study area. Whenever it was meaningful, SAUs were divided into smaller ones (e.g. the parental SAU was divided into four smaller SAU-1, SAU-2, SAU-3 and SAU-4 in the Rockall Bank; Fig. 2, Table 2) based on data availability, presence/absence of human activities, bathymetry and topography.
In some study areas (e.g. in the Reykjanes Ridge) there is overlap among the SAUs driven by data availability for the indicators used in each of the SAUs (see Table 1 in Supplementary File); (2) Habitats, i.e. the habitat types present in each of the SAUs (e.g. rocky and sedimentary habitats in the Gulf of Cádiz); (3) Ecosystem components, i.e. the organisms present in the habitats (e.g. CWCs, sea-pens and sponges in the Gulf of Cádiz); (4) Boundary values, i.e. the values separating the five categories of environmental status used in NEAT. The setup of boundary values by ATLAS experts followed six approaches (see recommendations in Borja et al., 2016): 1) Expert judgement only, 2) Expert judgement and use of scientific literature/information from other places or comparison with protected areas, 3) Expert judgement and national legislation, 4) Expert judgement and OSPAR guidance, 5) Expert judgement and data/guidance from International Council for the Exploration of the Sea (ICES)/United Nations' Convention on Biological Diversity (CBD)/United Nations' Framework Convention on Climate Change Decision, and 6) National legislation only. In order to facilitate the task of setting up these boundary values an extensive literature review was carried out. The review spanned over 300 scientific papers, and quantitative data were extracted to set up boundary values in different locations and ecosystems and for each of the indicators (see Table 1 in Supplementary File). The relative frequency of approaches used for the setup of boundary values across the study areas can be seen in Fig. 3. Examples for each of the six approaches used in the setup of boundary values are available (Table 1 in Supplementary File). (5) Supply of average and standard error values for each of the indicators used in the study areas.
Categories for habitats and ecosystem components were established depending on the data available for each study area. For example, when information was available it was possible to set up "aggregations of CWCs" as a habitat and not just "benthic".
Each indicator must be connected to a SAU, a habitat and an ecosystem component. The same indicator e.g. "Biomass of selected fish species" can be used in the analysis multiple times for multiple SAUs and habitats when it has been assessed for multiple areas. For example, the indicator "Biomass of selected fish species" was used across the four SAUs and nine fish species in the Rockall Bank (see Table 1 in Supplementary File for details). These combinations are called "indicator values" (according to Uusitalo et al., 2016) while "Biomass of selected fish species" is called a "unique indicator". For example, the unique indicator "Biomass of selected fish species" had 36 indicator values across the Rockall Bank, one for each fish species, SAU, habitat and ecosystem component. In order to assess the quality of the data submitted, information supplied by ATLAS experts was also considered for each study area in terms of: a) the representativity of the selected SAUs for the whole study area (poor/moderate/good), and b) the quality of the data (poor/acceptable/good). The evaluation and characterisation of the data quality were based on the long-term experience of the ATLAS case study leaders, taking into account parameters such as the methods used for the collection of data and the spatial coverage. Quality of the data as well as their type, quantity and spatial coverage were used to describe the representativity. These quality checks are not a prerequisite of NEAT, but they were included in the ATLAS assessment to achieve a more robust evaluation of the results.

Table 1
Study areas and their key features. Abbreviations: NEAFC (North East Atlantic Fisheries Commission), SAC (Special Area of Conservation). The references cited apply to more than one column. (See below-mentioned references for further information.)
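The distinction between a unique indicator and its indicator values can be illustrated with a short Python sketch for the Rockall Bank example. The species labels are placeholders (the nine species are not named in this section), and the dictionary layout is an assumption made for illustration, not the NEAT input format.

```python
from itertools import product

unique_indicator = "Biomass of selected fish species"
saus = ["SAU-1", "SAU-2", "SAU-3", "SAU-4"]                  # four Rockall Bank SAUs
fish_species = [f"fish species {i}" for i in range(1, 10)]   # nine placeholder labels

# One indicator value per (SAU, species) combination; each value is tied
# to a habitat and an ecosystem component, as NEAT requires.
indicator_values = [
    {
        "indicator": unique_indicator,
        "sau": sau,
        "habitat": "Benthic",
        "ecosystem_component": "fish",
        "species": species,
    }
    for sau, species in product(saus, fish_species)
]

n_indicator_values = len(indicator_values)  # 4 SAUs x 9 species
```

Four SAUs combined with nine species give the 36 indicator values reported for this unique indicator in the Rockall Bank.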

Data analyses and evaluation of results
Once the data collection was completed, the boundary values set for each indicator were transformed linearly onto the normalized common scale of environmental status used in NEAT (bad 0-0.199; poor 0.200-0.399; moderate 0.400-0.599; good 0.600-0.799; high 0.800-1.000). The normalization of boundary values is followed by weighting of the SAUs and indicators by NEAT. The weighting process takes into consideration the size of the SAUs in the assessment. For example, the parental SAU of Bay of Biscay (= 20308 km²) is divided into SAU-1 (slope; 4118 km²), SAU-2 (canyons; 10654 km²) and SAU-3 (interfluves; 5536 km²). The difference in size between the three SAUs is considered in the assessment of the environmental status of the parental SAU (weights of 20.3, 52.5 and 27.3%, respectively). Indicator weighting also takes place to balance the assessments where there are more indicators in one habitat than in others (e.g. five indicators in the pelagic habitat vs. one indicator in the benthic habitat). In this study the analysis was performed with the NEAT default settings "Weight by SAU Area" and "Do not weight by habitat area". This was done because data were available for the size of the SAUs (Table 2) but not for the size of benthic habitats (due to the lack of habitat maps in some areas). The NEAT value is calculated as the weighted average of all the indicators and is expressed with a value between 0 (bad) and 1 (high) (see above). NEAT values are accompanied by confidence level assessments based on the mean and error values provided for each indicator, followed by 1000 Monte Carlo simulations.
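The three steps described above (linear normalization, area weighting and Monte Carlo confidence assessment) can be sketched as follows. This is a simplified illustration and not the NEAT implementation: the function names, the use of six boundary values per indicator, the normal sampling of indicator means and the example scores are all assumptions made here. Only the SAU areas and the class limits come from the text.

```python
import random

# Class limits on the common NEAT scale (from the text above).
SCALE = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
CLASSES = ["bad", "poor", "moderate", "good", "high"]


def normalise(value, boundaries):
    """Piecewise-linear map of a raw indicator value onto [0, 1].

    `boundaries` holds six strictly monotonic raw values ordered from
    worst to best (worst, bad/poor, poor/moderate, moderate/good,
    good/high, best). Descending lists handle indicators where lower
    raw values mean better status.
    """
    sign = 1.0 if boundaries[-1] > boundaries[0] else -1.0
    v = sign * value
    b = [sign * x for x in boundaries]
    if v <= b[0]:
        return 0.0
    if v >= b[-1]:
        return 1.0
    for i in range(5):
        if v <= b[i + 1]:
            frac = (v - b[i]) / (b[i + 1] - b[i])
            return SCALE[i] + frac * (SCALE[i + 1] - SCALE[i])


def classify(score):
    """Return the status class for a normalised score."""
    for upper, name in zip(SCALE[1:-1], CLASSES):
        if score < upper:
            return name
    return "high"


def weighted_mean(scores, weights):
    return sum(s * w for s, w in zip(scores, weights)) / sum(weights)


def confidence(indicators, n_draws=1000, seed=0):
    """Share of Monte Carlo draws falling in each status class.

    Assumes each indicator mean is normally distributed with the
    supplied standard error (an assumption of this sketch).
    """
    rng = random.Random(seed)
    counts = dict.fromkeys(CLASSES, 0)
    weights = [ind["weight"] for ind in indicators]
    for _ in range(n_draws):
        scores = [
            normalise(rng.gauss(ind["mean"], ind["se"]), ind["boundaries"])
            for ind in indicators
        ]
        counts[classify(weighted_mean(scores, weights))] += 1
    return {name: c / n_draws for name, c in counts.items()}


# Bay of Biscay SAU areas from the text (km^2).
areas_km2 = {"SAU-1": 4118, "SAU-2": 10654, "SAU-3": 5536}
total_area = sum(areas_km2.values())
weights = {sau: area / total_area for sau, area in areas_km2.items()}

# Hypothetical normalised scores per SAU, for illustration only.
hypothetical_scores = {"SAU-1": 0.7, "SAU-2": 0.5, "SAU-3": 0.9}
parent_score = weighted_mean(
    [hypothetical_scores[s] for s in areas_km2],
    [areas_km2[s] for s in areas_km2],
)
parent_status = classify(parent_score)
```

Because SAU-2 (canyons) covers more than half of the parental SAU, its score dominates the area-weighted result, which is the behaviour the "Weight by SAU Area" setting is meant to produce.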
The results produced by applying NEAT were evaluated by ATLAS experts, considering the following aspects: i) level of agreement between the NEAT outcome and expert judgement, ii) level of satisfaction regarding the amount/type/quality/time frame/spatial extension of the submitted data and confidence in the boundary values set for each of the indicators, and iii) existing knowledge from scientific literature and previous surveys in the study areas. The level of agreement was set as follows: "complete", "good", "moderate", "little" and "not at all". Furthermore, recommendations were provided by experts on various aspects (e.g. how to improve the NEAT software, the data availability and the data quality in the future) in order to improve our understanding of the environmental status of deep-sea ecosystems.

Assessment of the design
The number of SAUs within each study area varied widely, ranging from one, e.g. in Condor seamount (Azores), up to four in the Rockall Bank. In seven out of nine study areas, the parental SAUs were split into subunits (Table 2). The representativity of the SAUs for the study areas was characterized as "good" in five areas: the LoVe Ocean Observatory, Faroe-Shetland Channel, Mingulay Reef Complex, Condor seamount (Azores), and the Gulf of Cádiz; "moderate" in one area: Porcupine Seabight, and "poor" in two areas: Reykjanes Ridge and Bay of Biscay. In Rockall Bank there were differences among the SAUs regarding the representativity, i.e. the representativity was good in SAU-1 and SAU-2 but poor in SAU-3 and SAU-4 (Table 2).
In seven out of nine areas only one habitat type was assessed, i.e. "Benthic". Nevertheless, higher resolution data were available in the Bay of Biscay and the Gulf of Cádiz. Specifically, data on the habitats "Lophelia pertusa (now known as Desmophyllum pertusum)/Madrepora oculata on hard substrates", "L. pertusa/M. oculata on soft sediments", "Sea pens/Alcyonaceans on soft sediments" and "Antipatharians/Alcyonaceans on hard substrates" were analyzed in the Bay of Biscay while "rocky" and "sedimentary" benthic habitats were assessed in the Gulf of Cádiz (Table 2).
In total, three ecosystem components were assessed i.e. "benthic invertebrates", "fish" and "benthos". "Benthic invertebrates" were used in those cases where a linkage with a specific faunal group could take place (e.g. corals, sponges) while "benthos" was used in those cases where such a specific linkage was not possible (e.g. when using indicators like "Areal extent of human affected area", or "Areal extent of litter"). Due to limited data availability all three ecosystem components were not assessed consistently across all the nine study areas. For example, the only ecosystem component assessed in the Condor seamount (Azores) was "fish" while "benthic invertebrates", "fish" and "benthos" were assessed in Mingulay Reef Complex (Table 2; see also Table 1 in Supplementary File for a detailed description of all ecosystem components and their links with SAUs and habitats).

Indicators across Descriptors, criteria and study areas
In total, 24 unique indicators were used across the five descriptors (Table 3): one for D1, one for D3, seven for D4, 13 for D6 and two for D10. Indicators used more frequently were: "Areal extent of human-affected area" (used in eight out of nine study areas), "Areal extent of protected sea areas" (applied in six areas) and "Density of biogenic reef-forming species" (used in four areas).

Table 2
Habitats and ecosystem components addressed in each of the nine ATLAS study areas. The areal size (km²) of each case study and of the spatial assessment units (SAUs) within each case study are also given. Categories for habitats and ecosystem components were established depending on the data available for each study area. For example, when information was available it was possible to set up "aggregations of cold-water corals" as a habitat and not just "benthic". "L. pertusa": Lophelia pertusa; "M. oculata": Madrepora oculata.
These 24 indicators addressed a total of eight criteria from the GES Decision (European Commission, 2017) (Table 3). Criterion D6C5 (extent of adverse effects from anthropogenic pressures on the condition of the habitat type; see Table 3 for a detailed description of the criteria) was the one addressed by the highest number of indicators (n = 9). It was followed by Criterion D4C1 (diversity of trophic guilds), addressed by seven indicators, and Criterion D6C2 (spatial extent and distribution of physical disturbance), addressed by two indicators. Five criteria were addressed by only one or two indicators (Table 3). Differences in the development of methodological standards for the MSFD descriptors do not always allow for a direct allocation of the case studies' indicators to MSFD criteria. In such cases, the indicators could work as proxies for the criteria assessments and contribute to the open debates on the proper indicators per MSFD criterion. Most of the indicators were used in a relatively small number of areas and a few indicators were more broadly used (Table 3). Specifically, five indicators were used in only two areas (e.g. "Biomass of selected fish species" in Rockall Bank and in the Condor seamount, Azores; "Species richness of non-commercial fish" in Rockall Bank and in Mingulay Reef Complex), three indicators in three areas (e.g. "Ratio of live versus dead/overgrown coral cover" in LoVe Ocean Observatory, in Mingulay Reef Complex and in Bay of Biscay) while 13 indicators were applied only once (e.g. the indicator "Body length distribution of fish" was only used in Condor seamount (Azores); the indicator "Density of abandoned fishing gear (e.g. lines, nets, etc.)" was only used in Bay of Biscay) (Table 3).

Data quantity and quality across indicators, descriptors and study areas
The number of indicator values ranged from 1 to 40. The highest number of indicator values was for the indicator "Biomass of selected fish species" followed by "Abundance of non-commercial demersal fish and cephalopods" and "Areal extent of human affected area" (Fig. 4A). Across the study areas the highest number of indicator values was provided for Rockall Bank (n = 80; 30% in D1 indicators, 60% in D4, 10% in D6) followed by the Bay of Biscay (n = 37; 18.9% in D4 indicators, 64.9% in D6 and 16.2% in D10) and the Mingulay Reef Complex (n = 15; 40% in D4 indicators, 53.3% in D6 and 6.7% in D10) (Fig. 4B). The level of satisfaction regarding the quantity of the data was moderate for most study areas (LoVe Ocean Observatory, Faroe-Shetland Channel, Rockall Bank, Porcupine Seabight, Bay of Biscay, Condor seamount (Azores) and Gulf of Cádiz); there was good (but incomplete) satisfaction at the Mingulay Reef Complex and little satisfaction for the Reykjanes Ridge. The quality of the data across indicators, descriptors and study areas was variable (Fig. 5). For 12 indicators the data quality was considered mainly "poor", for eight indicators it was either "acceptable" or "good" and for the remaining four it was balanced across "poor", "acceptable" and "good" (Fig. 5A). Across the study areas the data quality was characterized mainly as "poor" in the Bay of Biscay. For the LoVe Ocean Observatory, Mingulay Reef Complex and the Gulf of Cádiz the data quality was considered mainly "acceptable" while in the Faroe-Shetland Channel, Reykjanes Ridge, Porcupine Seabight and the Condor seamount (Azores), the quality of the data was considered mainly "good" (Fig. 5B).

NEAT results for environmental status and comparison with expert judgement
The results of the NEAT analyses revealed differences among the study areas in their environmental status (Tables 4-5). Areas where more indicator values were used showed a higher number of categories of environmental status than areas where fewer indicator values were used. For example, at Rockall Bank, where 80 indicator values were used, five categories of environmental status were identified (i.e. bad, poor, moderate, good and high) while in the Condor seamount (Azores), where four indicator values were used, just one category of environmental status was found (Fig. 6). The assessment suggested that the SAUs of the LoVe Ocean Observatory, Faroe-Shetland Channel, deep Porcupine Seabight, Bay of Biscay and Condor seamount (Azores) (after fisheries closure) were in good status (Tables 4-5), while those of the Reykjanes Ridge, Rockall Bank, Mingulay Reef Complex, Azores (before fisheries closure) and Gulf of Cádiz were not in good status (Tables 4-5). There was good agreement between the NEAT results and expert judgement for the LoVe Ocean Observatory, Faroe-Shetland Channel, Reykjanes Ridge and Porcupine Seabight but the agreement was just moderate for the Rockall Bank, Mingulay Reef Complex, Condor seamount (Azores) and Gulf of Cádiz. For the Bay of Biscay no judgement was expressed (Table 4).
Habitats were assessed to be in different environmental status across the study areas. The environmental status of "Benthic" habitats (i.e. the habitat with the highest number of indicator values) ranged from bad to high. Specifically, NEAT suggested a balance between habitats in good status (18.2% in high and 27.3% in good) and habitats in non-good status (27.3% in moderate, 22.7% in poor and 4.5% in bad environmental status).
Ecosystem components were suggested to be in different environmental status across the study areas. NEAT suggested that the status of "benthic invertebrates" ranged from moderate to high while both "fish" and "benthos" components ranged across all five categories of environmental status. Most "fish" components were assessed to be in moderate or poor status (58% in total) while most "benthos" components were in good or high status (61.1% in total). In the LoVe Ocean Observatory 50% were in moderate and 50% in high status whereas in the Rockall Bank all five categories of environmental status were found (16.7% in high, 33.3% in good, 8.3% in moderate, 25% in poor and 16.7% in bad status) (Table 5; see also Fig. 2 in Supplementary File). In Porcupine Seabight, Bay of Biscay and Mingulay Reef Complex the ecosystem components were assessed to be mainly in good or high status while in the rest of the study areas they were not (Table 5).

1 Criteria in EU, 2017.
D1C2: The population abundance of the species is not adversely affected due to anthropogenic pressures, such that its long-term viability is ensured.
D3C3: The age and size distribution of individuals in the populations of commercially exploited species is indicative of a healthy population. This shall include a high proportion of old/large individuals and limited adverse effects of exploitation on genetic diversity.
D4C1: The diversity (species composition and their relative abundance) of the trophic guild is not adversely affected due to anthropogenic pressures.
D6C2: Spatial extent and distribution of physical disturbance pressures on the seabed.
D6C3: Spatial extent of each habitat type which is adversely affected, through change in its biotic and abiotic structure and its functions (e.g. through changes in species composition and their relative abundance, absence of particularly sensitive or fragile species or species providing a key function, size structure of species), by physical disturbance.
D6C4: The extent of loss of the habitat type, resulting from anthropogenic pressures, does not exceed a specified proportion of the natural extent of the habitat type in the assessment area.
D6C5: The extent of adverse effects from anthropogenic pressures on the condition of the habitat type, including alteration to its biotic and abiotic structure and its functions (e.g. its typical species composition and their relative abundance, absence of particularly sensitive or fragile species or species providing a key function, size structure of species), does not exceed a specified proportion of the natural extent of the habitat type in the assessment area.
D10C1: The composition, amount and spatial distribution of litter on the coastline, in the surface layer of the water column, and on the seabed, are at levels that do not cause harm to the coastal and marine environment.

Fig. 4. Indicator values across indicators (A) and across study areas (B) (see Table 3 for the corresponding indicator name). In (B) the number of indicator values in each study area is shown on the top of the bars.

Fig. 5. Data quality ("poor", "acceptable" and "good") expressed in % across indicators and descriptors (A) and across study areas (B). In (A) each number on the x axis represents an indicator (see Table 3 for the corresponding indicator name). In (B) the number of indicator values in each study area is shown on the top of the bars.

Assessing the assessment design, quantity and quality of data
Data availability for this analysis was much lower than in previous analyses using NEAT in shallow European seas, as expressed through comparisons of SAUs, habitats, ecosystem components, unique indicators, and indicator values. This is mainly due to the remoteness of deep-sea environments. It should also be taken into account that there is a smaller number of species in the deep sea than on continental shelves due to the lower temperature, productivity, and habitat heterogeneity in the deep sea (Costello and Chaudhary, 2017). In the analyses performed here, there were two hierarchical levels in most cases whereas in some coastal areas studied by Uusitalo et al. (2016) there were often three to four hierarchical levels of SAUs, up to nine habitats and up to nine ecosystem components. Likewise, the number of unique indicators and indicator values in shallow waters was much higher, ranging from 11 to 116 and 20 to 466 per study area, respectively (Pavlidou et al., 2019), compared to the range of 1-11 unique indicators and 1-40 indicator values in our study. Considering these limitations, data analyses and boundary value set-up here (see Section 4.2) have been performed with caution, following Uusitalo et al. (2016); the same applies to the interpretation of the NEAT outcomes, especially in those cases where a small number of indicator values were linked to large SAUs (e.g. the Porcupine Seabight where five indicator values were used over an area of 35500 km²). The cautious interpretation of the results was supported also by considering a) ATLAS experts' judgement, b) results from previous assessments (e.g. OSPAR, 2017) and c) information on the areas gathered from the scientific literature for North Atlantic deep-sea areas (see below).
Given the large differences among study areas in terms of their size, data availability and limited knowledge on the ecology of species/habitats and their response to human activities, comparisons of environmental status between study areas are currently of limited value. However, comparisons of environmental status of SAUs/habitats/ components within each study area can be made especially in cases where there are gradients of human pressures (e.g. the Gulf of Cádiz) or when management measures have been implemented (e.g., in Condor seamount, Azores).
Table 4
NEAT values, environmental status and level of agreement between NEAT results and expert judgment for the nine study areas in the North Atlantic. NEAT values are given for the whole study area. Categories of environmental status: bad 0-0.199; poor 0.200-0.399; moderate 0.400-0.599; good 0.600-0.799; high 0.800-1.000.

Table 5
Assessment of the environmental status in nine study areas in the North Atlantic. Results are given for each study area, spatial assessment unit (SAU), habitat and ecosystem component. Explanation of colour code for each of the five categories of environmental status defined by the NEAT software: blue: high; green: good; yellow: moderate; orange: poor; red: bad. "L. pertusa": Lophelia pertusa; "M. oculata": Madrepora oculata.

G. Kazanidis, et al. Ecological Indicators 119 (2020) 106624

In addition to limited data availability, there was also limited data standardization for almost all the indicators selected (Table 3). For example, for the indicator "Areal extent of litter" the measurement units were "Items/km²", "Number of items/minute of video tow" and "Litter/image". This highlights the need to increase standardization in quantification and reporting of marine litter (OSPAR, 2017). A similar situation was also recorded for the units used to express density of specimens (Table 3). ATLAS experts also identified cases of limited data quality for some indicators. Specifically, in LoVe Ocean Observatory and Bay of Biscay the low quality of the data was due to technical issues (e.g. absence of laser points from the Remotely Operated Vehicle (ROV) cameras, hampering image scaling) that did not allow the quantification of ecosystem components. In the case of Mingulay Reef Complex, data on fish assemblages were collected with a ROV which, in general, is not the most appropriate method to measure fish species richness, abundance and biomass because fish tend to avoid objects like large ROVs (Andaloro et al., 2013). This, combined with limited visibility during ROV dives at Mingulay Reef Complex (Milligan et al., 2016), helps explain why species richness and abundance of fish in that area were lower than in other CWC reefs where trawling/baited traps were used (Durán-Muñoz et al., 2012). However, as the Mingulay Reef Complex is a protected area, demersal fishing techniques are not permitted. Landers using time-lapse photography are an alternative and non-invasive methodology that could be used for assessing the abundance of fishes. In the case of the Bay of Biscay and Rockall Bank the low quality of the data for the indicators "Areal extent of biogenic habitats" and "Abundance/biomass of fish" was due to the limited or non-homogeneously distributed sampling effort within the study areas, as noted by ATLAS experts working in the two areas. Besides these cases, there were others where the quality of data was mainly acceptable or good (Fig. 5A). Interestingly, the indicator "Areal extent of human affected area" was mainly based on acceptable/good-quality data across almost all the study areas, confirming the good potential for this indicator to be used in environmental status assessments. In contrast, the indicator "Areal extent of biogenic/vulnerable habitats" was mainly characterized by poor-quality data (Fig. 5A), reflecting the current limited knowledge about the spatial distribution of these important habitats and the urgent need to increase the knowledge on spatial and temporal distribution of habitats (Georgian et al., 2019; Kenchington et al., 2019a; see Section 4.5). Across the study areas, acceptable and good-quality data were supplied mainly for the Faroe-Shetland Channel, the Reykjanes Ridge, the Porcupine Seabight and the Condor seamount (Azores) (Fig. 5B). In the Faroe-Shetland Channel the good quality of the data was mainly due to rigorously designed towed camera surveys collecting high-quality video material from areas inside and outside the Faroe-Shetland Channel Marine Protected Area, which allowed quantitative analyses using collected image material.
Quantitative data available online for demersal fisheries inside and outside the FSC MPA (https://marinescotland.atkinsgeospatial.com/nmpi/) also contributed to this high data quality. In the Condor seamount (Azores) the collection of reliable measurements on fish body size and biomass of the commercially-important fishes Helicolenus dactylopterus and Pagellus bogaraveo using appropriate sampling methods (long-line surveys) over the period 1983-2017 (with gaps) allowed the collection of high-quality data for the analyses. Vessel monitoring system (VMS) and electronic logbook data for the Reykjanes Ridge were available at very high spatial resolution (gridded as fine as 150 × 150 m) that allowed the collection of high-quality information. In Porcupine Seabight fishing pressure data were represented by swept area ratio collated in ICES (2016a, 2018) and calculated by analysis of VMS data.
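As an illustration of how gridded fishing pressure of this kind can be derived, the following is a minimal sketch of a swept area ratio (SAR) calculation, taking SAR in a grid cell as the total seabed area swept in that cell divided by the cell area. The input record layout (a position plus a swept area already assigned to each VMS ping) and the function name are assumptions made for this example and are not taken from the ICES workflow; only the 150 m cell size comes from the text above.

```python
from collections import defaultdict

CELL_SIZE_M = 150.0  # grid resolution quoted above for the Reykjanes Ridge data


def swept_area_ratio(tows, cell_size_m=CELL_SIZE_M):
    """Return {(col, row): SAR} for a square grid.

    Each record in `tows` is (x_m, y_m, swept_area_m2): a hypothetical layout
    giving the projected position of a VMS ping and the seabed area swept
    that has been assigned to it.
    """
    swept = defaultdict(float)
    for x_m, y_m, area_m2 in tows:
        # Floor division assigns each ping to the grid cell that contains it.
        cell = (int(x_m // cell_size_m), int(y_m // cell_size_m))
        swept[cell] += area_m2
    cell_area_m2 = cell_size_m ** 2
    # SAR = swept area / cell area; a value of 1 means the cell was,
    # on average, swept once over the period covered by the records.
    return {cell: total / cell_area_m2 for cell, total in swept.items()}


example_tows = [(10.0, 10.0, 9000.0), (100.0, 140.0, 13500.0), (160.0, 20.0, 4500.0)]
sar = swept_area_ratio(example_tows)
```

In this toy example the first two pings fall in the same cell and the third in the neighbouring one, so the two cells receive different SAR values even though the pings are only tens of metres apart; this is the kind of fine-scale contrast that coarse grids would average out.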

Setup of boundary values for the assessment of environmental status
One of the biggest challenges in assessing deep-sea environmental status is the lack of baselines (i.e. specific and quantifiable reference points against which environmental status will be assessed) and the difficulties of setting boundary values (e.g. a threshold value defining a degree of deviation from the baseline) (Borja et al., 2012). Establishing methods to set boundary/threshold values for assessing GES is especially challenging for deep-sea ecosystems due to limited historical data availability which reduces confidence around the established boundary values.
In this study various approaches were used to create boundary values. These mainly relied upon "Expert judgement" followed by "Expert judgement and use of scientific literature/information from other places or comparison with protected areas" (Fig. 3; see also Table 1 in Supplementary File). Expert judgment/use of scientific literature were also used in previous studies assessing environmental status in areas inside (Pavlidou et al., 2019) and outside Europe (Nemati et al., 2017), whereas Korpinen et al. (2013) followed the EU Habitats Directive (European Union, 1992) in their assessment of the Baltic Sea, taking 25% of the seafloor being under significant impacts as a threshold value for the assessment of environmental status of soft and hard substrates.
Another approach for setting a baseline is using past conditions and historic data, i.e. the status of an area before the occurrence of human activities and their impacts (Borja et al., 2012). Unfortunately, this type of information is almost non-existent in the deep sea, even for the northeast Atlantic, the region with the longest tradition in deep-sea biology studies. A telling example is the Bay of Biscay, where trawling on the slope began a century ago (Joubin, 1922) and the first biological data produced by the oceanographic expeditions go back to the 19th century (Le Danois, 1948); however, this information does not meet the standards (e.g. presence of quantitative data) necessary to establish a baseline. Long-term time series through which a past date/period can be considered as a baseline could also be used (OSPAR, 2017; Hartman et al., 2012 for the Porcupine Abyssal Plain Sustained Observatory fixed-point time-series). Although long-term data (e.g. > 10 years) in the deep sea are very limited, some exceptions exist: for example, the 30-year time series of fishery data available from Rockall Bank, which offers information on fish body length and abundance. However, even in that case the establishment of boundary values for assessing the environmental status is problematic as there is limited understanding about the role of environmental natural variability [e.g. type of substrate (Landres et al., 1999; Henry and Roberts, 2007; Bourque and Demopoulos, 2018), water-mass characteristics (Hátún et al., 2009; Puerta et al., 2020), and food supply (Klitgaard and Tendal, 2004)] in shaping ecosystem structure and functioning. Indeed, ecosystem variability has been identified as one of the most common critical issues complicating the implementation of policy frameworks (Alexander et al., 2015).
For example, in the North Atlantic pronounced changes in the fauna across trophic levels over nearly a century have been attributed to changes in the strength and extent of the subpolar gyre (Hátún et al., 2009), which complicates the establishment of threshold values.
The identification of pristine areas free from anthropogenic pressures, e.g. deep-sea Marine Protected Areas (MPAs) or Special Areas of Conservation (SACs), was also suggested as a possible pathway for the setup of baselines. In the northeast Atlantic a number of deep-sea MPAs/SACs have been established (e.g. the Sedlo seamount MPA, the Faroe-Shetland Channel MPA, the Rosemary Bank seamount MPA, the East Mingulay and Darwin Mounds SACs, Natura 2000 areas in Bay of Biscay, west coast of Ireland and northern Spain; http://natura2000.eea.europa.eu/). Comparisons between areas inside and outside the Faroe-Shetland Channel MPA have shown that deep-sea sponge aggregations inside the MPA had higher sponge morphotype diversity/richness/density and a higher number of body-size cohorts than areas outside the MPA; the main parameter driving these differences was the lower bottom fishing pressure inside than outside the MPA. Furthermore, monitoring of cold-water corals in the Darwin Mounds revealed no coral recolonisation and very little regrowth eight years after the closure to bottom contact fisheries. It is, however, unknown to what extent deep-sea MPAs could be used as a basis to set up baselines, as differences in the structure and functioning of deep-sea ecosystems can arise not only due to human activities but also due to environmental natural variability [e.g. habitat type (Henry and Roberts, 2007; Bourque and Demopoulos, 2018), water-mass characteristics (Hátún et al., 2009; Puerta et al., 2020), and food supply (Klitgaard and Tendal, 2004)].

Potential most suitable indicators for the assessment of deep-sea environmental status
The widespread use of the indicator "Areal extent of human affected area" indicates the relatively high data availability regarding deep-sea areas affected by demersal fisheries. This suggests the high potential of this indicator to be used for future assessments of deep-sea environmental status, specifically addressing the GES Decision (European Commission, 2017) D6.2 Criterion "Spatial extent and distribution of physical disturbance pressures on the seabed". Knowledge on the spatial extent of human-affected areas in the deep sea already exists for some areas (e.g. Hatton and Rockall Bank: Benn et al., 2010; Norwegian reefs: Buhl-Mortensen and Buhl-Mortensen, 2017; Porcupine Seabight: Vieira et al., 2019, 2020; Gulf of Cádiz: Díaz-del-Río et al., 2014), and it is fundamental to assess the deep-sea status due to the high sensitivity of deep-sea species to physical damage and sedimentation following seabed disturbance (Roberts et al., 2000; Puig et al., 2012; Pusceddu et al., 2014; Huvenne et al., 2016). The usefulness of this indicator is enhanced when data on "Areal extent of biogenic/vulnerable habitats" are also available, as this allows pressure and status indicators to be overlaid. In this assessment data for these two indicators were available for the LoVe Ocean Observatory and Bay of Biscay; the data quality, however, for "Areal extent of biogenic/vulnerable habitats" was mainly considered "poor". This highlights the urgent need to improve the quantity and quality of data on the spatial distribution of biogenic/vulnerable habitats (Section 4.5). In previous assessments the indicators "Physical Damage" and "Cumulative impacts on seabed habitats" were used in OSPAR (2017) and Korpinen et al. (2013) to assess the impact of anthropogenic activities in northeast Atlantic and Baltic Sea deep-sea habitats, respectively.
In principle, these indicators used habitat maps, sensitivity/resilience to disturbance and distribution/intensity of human activities highlighting the urgent need to advance our understanding about the distribution of habitats, and especially those that are particularly vulnerable to human pressures, in the deep sea.
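The overlay of a pressure indicator on a status indicator mentioned above can be sketched very simply when both are expressed on the same grid. The example below is an assumption-laden illustration, not the procedure used in this study or in OSPAR (2017): it treats both "Areal extent of biogenic/vulnerable habitats" and "Areal extent of human affected area" as sets of grid-cell identifiers (a hypothetical representation) and reports the fraction of mapped habitat cells that are also human-affected.

```python
def pressure_habitat_overlap(habitat_cells, affected_cells):
    """Fraction of mapped habitat cells that are also human-affected.

    Both arguments are iterables of hashable grid-cell identifiers,
    e.g. (col, row) tuples on a common grid (a hypothetical layout).
    """
    habitat = set(habitat_cells)
    if not habitat:
        raise ValueError("no habitat cells mapped; overlap is undefined")
    # Set intersection keeps only the cells present in both layers.
    return len(habitat & set(affected_cells)) / len(habitat)


habitat = {(0, 0), (0, 1), (1, 1), (2, 2)}
affected = {(0, 1), (2, 2), (5, 5)}
overlap = pressure_habitat_overlap(habitat, affected)
```

Raising an error when no habitat is mapped, rather than returning zero, reflects the data-quality point made above: an absence of habitat data is not evidence of an absence of impact.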
Case-study leaders have identified some new indicators (i.e. other than those already existing in the NEAT database; highlighted in grey in Table 3) that could contribute to the assessment of deep-sea environmental status. Three of these indicators are related to non-commercial fish (i.e. "Abundance of non-commercial demersal fish and cephalopods", "Species richness of non-commercial fish" and "Species diversity (Shannon index) of non-commercial fish"). Several of these non-commercial species play an important role in ecosystem functioning and/or are included in endangered species lists and thus their assessment is important. For example, 1) several fish species feed on Trisopterus minutus (Magnussen and Magnussen, 2009), 2) Capros aper has a pivotal position in marine food webs, grazing on zooplankton and being also an important prey for commercial species (Lopes et al., 2006) and 3) black-legged kittiwake seabirds rely on the sand eel Ammodytes marinus (Arnott and Ruxton, 2002). Furthermore, Eutrigla gurnardus is a new potential commercial species and ICES has recommended that landings and discards are monitored (McCarthy et al., 2018). Finally, skates in the genus Dipturus are critically endangered according to the IUCN Red List of Threatened Species (Dulvy et al., 2006). In addition to the indicators related to non-commercial fish, a new indicator related to commercial fish was also suggested, i.e. "Abundance of commercial fish". The abundance of Micromesistius poutassou has shown striking shifts since the late 1990s, with the mechanisms driving these changes remaining unknown (Payne et al., 2012), while the distribution of Helicolenus dactylopterus in the North Atlantic is expected to continue to shift northwards by 2100, following changes in water-mass characteristics.
The indicator "Abundance of commercial fish" mentioned here can in the future support the assessments of fish stocks carried out using indicators like "Proportion of large fish" and "Change in average trophic level of marine predators" (OSPAR, 2017). The indicator "Ratio of live versus dead/overgrown coral cover" can be useful in the assessment of environmental status of coral reefs and gardens. Corals are exposed to multiple pressures (e.g. ocean warming, acidification, deoxygenation) which can cause a decline in their health (Hennige et al., 2015; Büscher et al., 2017). Visual surveys accompanied by machine learning are expected to contribute efficiently to the assessment of the status of coral habitats in the coming years (Vad et al., 2017; Piechaud et al., 2019; see also Section 4.5). The indicator "Areal extent of sedimentary seafloor" can contribute to the assessment of deep-sea sedimentary habitats. This is particularly relevant as some studies have shown that bottom trawling can have large-scale detrimental impacts on sediments, reducing their structural complexity, carbon content, carbon turnover and meiofauna biodiversity (Puig et al., 2012; Pusceddu et al., 2014). Finally, two indicators related to marine litter have been suggested which can also facilitate the assessment of deep-sea environmental status. Indeed, experimental work has shown that both macro- and microplastics significantly reduced skeletal growth rates in the reef-forming cold-water coral Lophelia pertusa (Chapron et al., 2018), while abandoned fishing gear can be harmful for deep-sea fauna (e.g. ghost fishing; Tubau et al., 2015).
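For readers less familiar with the proposed indicator "Species diversity (Shannon index) of non-commercial fish", the index can be computed from per-species counts as H' = -Σ p_i ln p_i, where p_i is the proportion of individuals belonging to species i. The sketch below assumes the natural logarithm (other bases are also in use) and a plain list of counts as input; neither choice is stated in the source.

```python
import math


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln(p_i)) from per-species counts.

    `counts` is a sequence of non-negative abundances, one per species.
    Species with a zero count contribute nothing to the sum.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("at least one individual is required")
    return -sum((n / total) * math.log(n / total) for n in counts if n > 0)


even_community = shannon_index([10, 10, 10, 10])    # four equally common species
dominated_community = shannon_index([37, 1, 1, 1])  # same richness, one dominant
```

Both example communities have the same species richness, but the evenly distributed one yields the higher index, which is why richness and Shannon diversity are listed above as separate indicators.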

Evaluating the performance of the NEAT software - references to other approaches previously used to assess deep-sea environmental status
Considering a) the results of the NEAT analyses, b) the level of agreement between these results and expert judgement and c) the available information in scientific literature, it can be concluded that, overall, NEAT performed meaningfully in the assessment of the environmental status in the nine deep-sea areas studied here. This is especially supported by the good match between the gradient of human pressures and the resulting environmental status: for example, NEAT showed that SAUs under long-standing human pressures were in lower environmental status than other SAUs where human pressures were lower (see below). In addition, NEAT captured the response of ecosystem components to management measures, showing an improvement in the status of commercially important fish following fisheries closure, as in the Condor seamount (Azores).
The level of agreement between the outcome of NEAT analysis and ATLAS expert judgement ranged from moderate to excellent (Table 4). The deeper areas of the Porcupine Seabight were in good status (Tables 4-5; see also Figs. 1-2 in Supplementary File) mainly due to low swept area ratio values; expert judgement was in agreement with NEAT results. After a long period of exhaustive exploitation until the early 2000s, the implementation of restrictive measures by the EC in 2002 reduced the fishing pressure on deep-water fishes and the overall biomass of commercial stocks increased (ICES, 2016b). This improvement of the stock situations was also described in OSPAR (2017). The Deep-Sea regulation (European Union, 2016) limited the use of specific fishing gears below 800 m water depth, including the prohibition of bottom fisheries in areas hosting VMEs, such as the Porcupine Seabight. The NEAT outcome for the LoVe Ocean Observatory showed that the area was also in good status (Tables 4-5; see also Figs. 1-2 in Supplementary File), which was in agreement with expert judgement. This outcome was supported by the relatively high areal extent and density of biogenic/vulnerable habitats as well as by the small (< 5%) extent of areas affected by bottom-contact fisheries. The Faroe-Shetland Channel and the Reykjanes Ridge showed good agreement between NEAT results and expert judgement (Tables 4-5; see also Figs. 1-2 in Supplementary File). Recent findings showed that sponge aggregations inside the Faroe-Shetland Channel Marine Protected Area (MPA) were in better status than those outside; very little marine litter was also observed (Pham et al., 2014) and demersal landings/ICES rectangle were much lower inside than outside the MPA. The NEAT analysis suggested a moderate environmental status for the Reykjanes Ridge. This outcome was supported by the fact that some of the examined areas were under strong fishing pressure (Ragnarsson and Steingrimsson, 2003).
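The status labels used in this section follow the five NEAT value ranges listed in the caption of Table 4 (bad 0-0.199; poor 0.200-0.399; moderate 0.400-0.599; good 0.600-0.799; high 0.800-1.000). A minimal sketch of that mapping is given below. It is not part of the NEAT software; the function name is hypothetical, and treating each range as closed at its lower bound (so that, for instance, 0.1995 counts as "bad") is an assumption, since the caption gives values to three decimals only.

```python
import bisect

# Lower bounds of the poor, moderate, good and high classes (Table 4 caption).
BOUNDARIES = [0.2, 0.4, 0.6, 0.8]
CLASSES = ["bad", "poor", "moderate", "good", "high"]


def neat_status(value):
    """Map a NEAT value in [0, 1] to one of the five status classes."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("NEAT values are expected to lie between 0 and 1")
    # bisect_right counts how many boundaries are <= value, which is
    # exactly the index of the class the value falls into.
    return CLASSES[bisect.bisect_right(BOUNDARIES, value)]
```

With this convention a value of exactly 0.600 is classed as "good" and a value of 0.599 as "moderate", in line with the ranges in the caption.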
In four study areas there was a moderate agreement between NEAT results and expert judgement. NEAT showed an overall poor environmental status for Mingulay Reef Complex (Tables 4-5; see also Figs. 1-2 in Supplementary File). This outcome contradicted expert judgement, according to which good status should be expected for this CWC reef. This discrepancy was an outcome of the very low values (due to the methodology used) measured for the "Species richness of non-commercial fish" and "Abundance of commercial fish" indicators (Table 1 in Supplementary File), which were linked to SAU-1. The combination of very low measured values for indicators in a large SAU had a major impact on the overall result of the assessment. The result of the assessment for the ecosystem components "benthic invertebrates" and "benthos" (for which data quality was good) was that both components were in good status, which was in complete agreement with expert judgement and the information from the area in the public domain (Roberts et al., 2005; Henry et al., 2013; De Clippele et al., 2017). The results for the Mingulay Reef Complex highlight that low-quality data should be treated with much caution. The Rockall Bank assessment also revealed a moderate agreement between NEAT and expert judgement. Specifically, the NEAT results showed a poor environmental status (Tables 4-5; see also Figs. 1-2 in Supplementary File). This outcome agrees with previous results that highlighted the unfavorable condition of habitats (e.g. stony, bedrock and biogenic reefs) and ecosystem components in Rockall Bank's Special Areas of Conservation due to continued mobile and static demersal fishing (JNCC 2018a, b). The Rockall Bank has supported fisheries for > 200 years (Newton et al., 2008). The reason for the moderate agreement between NEAT and expert judgment is the high natural variability of oceanographic conditions across time and its effect on the fish community (OSPAR, 2017 and Section 4.2).
For the Condor seamount (Azores), NEAT showed an increase from moderate to good status (Tables 4-5; see also Figs. 1-2 in Supplementary File) for body length and biomass of two commercially-important fishes after the bottom longline and handline fisheries closures. This outcome agrees with the signs of recovery for the commercially most important species here (Eva Giacomello pers. comm.). ATLAS experts highlighted that although NEAT results reflect the current situation for the two fishes, they do not reflect the status of the whole Condor seamount ecosystem, as the overall condition is perceived to be good. No bottom trawling has ever occurred, coral gardens are still abundant and fish stocks have not dramatically collapsed (Gomes-Pereira et al., 2017; Parra et al., 2017). In the Gulf of Cádiz there was also moderate agreement between NEAT and expert judgement. SAU-2 had a lower environmental status than SAU-1 due to the higher trawling activity in SAU-2 and the much lower values for the density of biogenic reef-forming species, mainly sea pens in this case. The lack of historical data and the limited knowledge on the natural variability of this area with complex oceanography and seafloor morphology (Díaz-del-Río et al., 2014) hampered the establishment of boundary values, which in turn led to a moderate agreement between NEAT results and expert judgement. Finally, for the Bay of Biscay there was no judgement expressed regarding the outcome of the NEAT assessment, due to the limited knowledge about the distribution of ecosystem components, low quality of data collected, limited understanding of the role of environmental natural variability in shaping ecosystem structure and functioning and a lack of historical data.
It should be mentioned that analysis of long time series in the Bay of Biscay showed an increase in the "Typical Length" of demersal fishes, no significant changes over time for the pelagic fish communities and a widespread presence of marine litter, especially plastics, on the seabed (OSPAR, 2017).
Considering the results of the present study, the reasons that lead to a less-than-good level of agreement can be grouped in two broad categories: 1) issues intrinsic to NEAT and 2) issues related to the quantity/quality of data supplied. As regards the first category, one of the main issues is the limited transparency in the algorithms used by NEAT for the assessment of environmental status. Addressing this issue in the future would increase confidence in the results produced by this software. Furthermore, NEAT does not currently account for the amount of data available per SAU or for data of low quality. Development of the relevant weighting schemes in the future would also increase confidence in NEAT outcomes. Finally, in a number of cases the standard error is set to zero, which artificially inflates the confidence level. We suggest that this is an important aspect that needs to be addressed (Chen et al., 2014 and references therein; Lavallé and Beaumont, 2015). It is also important to note that expert judgement by its very nature will be biased by the viewpoints of the experts concerned. This issue was beyond the scope of the present study. Based on the replies we received it seems that the characteristics of the data used by case study leaders (i.e. amount, type, quality, time frame covered, spatial extension across the study area), the methods used/sampling design followed in data collection, environmental natural variability as well as knowledge based on scientific literature and previous surveys in the study areas played a role in shaping expert judgement.
In the second category of drivers leading to moderate agreement between NEAT and expert judgement, explanations included scarce data on the structure and functioning of deep-sea ecosystems, limited understanding of the role of environmental natural variability across/within study areas and a lack of historical data, which hindered the setup of baselines. These issues are among the greatest challenges to be addressed by experts and the tools they use for assessing the environmental status of the deep sea. The limited availability of information does not solely hinder the current assessment using NEAT. Indeed, these challenges are intrinsic to all efforts aimed at the assessment of deep-sea environmental status, irrespective of the approach used (expert judgement/assessment tools/ecological models; e.g. Borja et al., 2011; Piroddi et al., 2015; Elliott et al., 2018). The shortcomings mentioned above about data availability are analogous to those reported in a previous assessment of disturbance in OSPAR Threatened and/or Declining Habitats in the northeast Atlantic using the "Physical Damage" indicator. Limited data availability on habitat distribution and species' sensitivity to physical damage led to a partial assessment and classification of carbonate mounds, deep-sea sponge aggregations and Lophelia reefs across the nine categories of disturbance shown in OSPAR (2017). Furthermore, it should be mentioned that the role of natural variability and difficulties in the setup of baselines are also a challenge for assessments in shallow-water areas (Korpinen et al., 2013; Myrberg et al., 2019), where traditionally there is much higher availability of data than in the deep sea. Improvements in these fields will enhance the robustness of the analysis and the level of confidence in the produced outcomes (Section 4.5; see also González-Irusta et al., 2018; ICES, 2019).
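One of the NEAT-related issues raised in this section is that a standard error set to zero artificially inflates the confidence level. The sketch below illustrates why, under stated assumptions. It is explicitly not NEAT's algorithm, whose limited transparency is noted above: it simply assumes a normally distributed error around the assessment value, draws simulated values, and reports the share of draws that fall in the same status class as the point estimate, using the class boundaries from the Table 4 caption. All function names are hypothetical.

```python
import bisect
import random

BOUNDARIES = [0.2, 0.4, 0.6, 0.8]  # class limits from the Table 4 caption


def status_index(value):
    """Index of the status class (0 = bad ... 4 = high) for a value in [0, 1]."""
    return bisect.bisect_right(BOUNDARIES, value)


def class_confidence(value, standard_error, draws=10_000, seed=0):
    """Share of simulated values that stay in the class of the point estimate.

    Assumes a normal error model, which is an illustrative choice only.
    """
    rng = random.Random(seed)
    target = status_index(value)
    hits = 0
    for _ in range(draws):
        # Clamp each draw to the valid 0-1 range before classifying it.
        sample = min(1.0, max(0.0, rng.gauss(value, standard_error)))
        if status_index(sample) == target:
            hits += 1
    return hits / draws


no_error = class_confidence(0.61, 0.0)     # standard error set to zero
with_error = class_confidence(0.61, 0.05)  # a modest, non-zero standard error
```

For a value just above the good/moderate boundary, a zero standard error returns full confidence in "good" status by construction, whereas even a modest non-zero standard error places a substantial share of the simulated values in the "moderate" class.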

Next steps in the assessment of deep-sea environmental status
The results of the current and previous efforts to assess the environmental status in deep-sea areas highlight the big challenge that scientists, policy makers, industry and the public will face in the coming years (Orejas et al., in press). Here, we summarize how the amount and quality of deep-sea data could be improved. This will also facilitate the setup of boundary/threshold values and improve the confidence in the assessment of deep-sea environmental status.
Information on human pressures remains scarce and therefore easier access to Vessel Monitoring System data is imperative in order to improve understanding about the spatial distribution of bottom trawling, which is currently the main human activity resulting in physical damage in the deep sea over large spatial scales (Benn et al., 2010; Amoroso et al., 2018). Intersectoral collaborations (e.g. industry-academia; in the present study data from the LoVe Ocean Observatory were supplied by Equinor, an energy company) and (meta)data archiving in online repositories (e.g. ICES VME database, EMODnet, PANGAEA) will improve knowledge about the impacts of human activities and ecosystem functioning in general (Stratmann et al., 2019; Vad et al., 2020). The assessment of the distribution of marine litter on the seafloor will be facilitated through data acquisition that takes advantage of the already existing regular scientific trawling surveys and through the development of specific common protocols (see OSPAR, 2017; Baudrier et al., 2018).
Advances in recording the spatial and long-term distribution of ecosystem components are also fundamental to improve the accuracy of the deep-sea environmental status assessments. The use of updated technological equipment (e.g. towed cameras, autonomous underwater vehicles (AUVs), ROVs, benthic landers; McIntyre et al., 2013; Lavaleye et al., 2017; Levin et al., 2019; Van Engeland et al., 2019) contributes to the collection of large volumes of imagery and environmental data, facilitating an improved understanding of the ecosystems' natural variability across space and time. This understanding will be accelerated through machine learning enabling the automatic extraction of features (e.g. benthic ecosystem components) (Osterloff et al., 2016; Piechaud et al., 2019). This will contribute to a better understanding of habitat and species distribution when it is incorporated into ensemble mapping techniques (Araujo and New, 2007; Robert et al., 2016; OSPAR, 2017; Kenchington et al., 2019a; Ramiro-Sánchez et al., 2019).
Research efforts need to address key points like the response of habitat-forming species to multiple stressors (Lunden et al., 2014; Hennige et al., 2015; Büscher et al., 2017) and the behavior of deep-sea species' larvae (Larsson et al., 2014; Strömberg and Larsson, 2017). The latter is especially important in order to improve our understanding about connectivity among cold-water ecosystems and source-sink dynamics (Fox et al., 2016; Kenchington et al., 2019b), which can help to set priorities for environmental status assessments and conservation strategies. The quality of data collection will benefit from the standardization of methodologies used, e.g. through the establishment of standardized reference image databases and protocols on data collection and reporting [e.g. marine litter as number of items/km²; European Commission (2017)].
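To make the call for standardized litter reporting concrete, the sketch below converts two of the units encountered in this study ("Litter/image" and "Number of items/minute of video tow") to items/km². It is a minimal illustration under stated assumptions: the image footprint area, tow speed and field-of-view width are hypothetical inputs that would have to come from scaled imagery (e.g. laser points) and navigation data, and the function names are not taken from any existing protocol.

```python
M2_PER_KM2 = 1_000_000.0


def items_per_km2_from_images(item_count, n_images, image_area_m2):
    """Litter density from still images with a known seabed footprint."""
    if n_images <= 0 or image_area_m2 <= 0:
        raise ValueError("number of images and image area must be positive")
    surveyed_km2 = n_images * image_area_m2 / M2_PER_KM2
    return item_count / surveyed_km2


def items_per_km2_from_video(items_per_minute, speed_m_per_min, view_width_m):
    """Litter density from a video tow with known speed and field-of-view width."""
    if speed_m_per_min <= 0 or view_width_m <= 0:
        raise ValueError("tow speed and view width must be positive")
    # Seabed area covered per minute of tow, expressed in km2.
    swept_km2_per_min = speed_m_per_min * view_width_m / M2_PER_KM2
    return items_per_minute / swept_km2_per_min


from_images = items_per_km2_from_images(item_count=3, n_images=200, image_area_m2=5.0)
from_video = items_per_km2_from_video(items_per_minute=0.5, speed_m_per_min=15.0, view_width_m=2.0)
```

Both conversions depend on knowing the seabed area actually imaged, which is why the absence of laser points for image scaling, noted earlier for two study areas, prevents results from being expressed in a common unit.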

Conclusions
Based on the findings of the present study, we conclude that the indicators "Areal extent of human affected area", "Areal extent of biogenic/vulnerable habitats" and "Density of biogenic reef forming species" should be considered in future regional assessments of deep-sea environmental status as these would be more inclusive of deep-sea environments including those that form VMEs. As shown here, the NEAT software produced meaningful results which were in good/moderate agreement with expert judgement. There are, however, a number of issues that should prevent its use without advice from experts in the field of deep-sea ecology. We think that it would be wise to use NEAT in the assessment of environmental status of areas under clear gradients of human pressures. The software could also be used to assess the status of areas/habitats/ecosystem components before and after the implementation of management measures (e.g. in the monitoring of habitats/ecosystem components over time within MPAs). Furthermore, considering a) the data-limited situation in the deep sea, b) the need to nest data from studies conducted at different spatial/temporal scales and the challenges related to deep-sea monitoring, we suggest that the assessment of deep-sea environmental status should take place at habitat and ecosystem level (rather than at species level), in comparison to shallow-water areas. It would also be prudent to set scientific, technical and financial criteria to prioritize the assessment of environmental status in areas of special importance such as areas acting as critical larval sources for overall network connectivity and/or refugia from climate change (Fox et al., 2016; Johnson et al., 2018; Johnson and Kenchington, 2019).

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Funding
This study was funded by the European Union's Horizon 2020 research and innovation programme under grant agreement no. 678760 (ATLAS) to JMR. This output reflects only the authors' view and the European Union cannot be held responsible for any use that may be made of the information contained therein. TS was partly funded by the Dutch Research Council (NWO) within the research programme Rubicon with project number 019.182EN.012. Funding for EK contributions was through Fisheries and Oceans Canada, International Governance Strategy Science Program. HE was funded by the Icelandic Research Fund, RANNÍS, under grant agreement no. 174552. TM received additional support from the Program Investigador FCT (IF/01194/2013), IFCT Exploratory Project (IF/01194/2013/CP1199/CT0002) from the Fundação para a Ciência e Tecnologia (POPH and QREN), and from the PO2020 Azores MapGES project (Acores-01-0145-FEDER-000056). EG was supported by DEMERSAIS and CONDOR project (funded by the Azores Regional Government).