Global distribution of the cold-water coral Lophelia pertusa

Lophelia pertusa is a major contributor to many cold-water coral reefs and supports a high diversity of associated benthic and benthopelagic species. Because L. pertusa is highly sensitive to human activity, it has been classified as an indicator species for Vulnerable Marine Ecosystems. However, its global spatial distribution is far from well known. In this study, a database of L. pertusa presence data was compiled from the large number of occurrence records published in recent years. In conjunction with data layers covering a range of environmental drivers, habitat suitability for L. pertusa was predicted using the Random Forest approach. Suitable habitat was predicted to occur primarily on continental margins, with the most suitable habitat likely to occur in the North East Atlantic and off the South Eastern United States of America. Aragonite saturation state, temperature and salinity were identified as the most important contributors to the habitat suitability model. Given the high vulnerability of reef-forming cold-water corals to anthropogenic impacts, habitat suitability models are critical in developing worldwide conservation and management strategies for biodiverse and biomass-rich cold-water coral ecosystems.


Introduction
Framework-forming cold-water corals play important roles in the ecological functioning of marine ecosystems (Corbera et al., 2019; Matos et al., 2021). They construct complex three-dimensional skeletons of calcium carbonate that may form mounds or reefs, which provide hard substrate for the settlement of larvae of sessile animals. Cold-water coral reefs also serve as feeding sites, breeding and nursery habitats, and refuges from predators for benthic species, significantly enhancing local biomass and biodiversity (Capezzuto et al.). Lophelia pertusa is a scleractinian framework-forming cold-water coral species with a cosmopolitan distribution. It is mainly found at depths from 39 to 3380 m, and is particularly abundant in North Atlantic waters (Roberts et al., 2009; Mortensen, 2000). Lophelia pertusa reefs support a high diversity of benthic species, with more than 1300 species found at such reefs in the North East Atlantic (Buhl-Mortensen et al., 2010; Kutti et al., 2014).
Due to their physical fragility and slow recovery rates, reef-forming cold-water corals are very sensitive to anthropogenic disturbance, particularly bottom trawling, and to climate change (Pecl et al., 2017; Georgian et al., 2016; Movilla et al., 2014). Spatial distribution information for cold-water corals is necessary for the development of conservation strategies, but at present there remain considerable gaps  Lecours, 2017), providing important input for the creation of conservation and management plans for deep-sea ecosystems (Matos et al., 2021; Reiss et al., 2015). The global habitat suitability of L. pertusa has been predicted in previous studies: first using 2060 records, coarse-resolution (1°) environmental data and the Ecological-Niche Factor Analysis (ENFA) method (Davies et al., 2008), and later using 863 occurrence records, environmental data at 30 arc-second resolution and the Maximum Entropy modelling method (Davies and Guinotte, 2011).
This study aims to create an updated database of occurrence records of the framework-forming cold-water coral L. pertusa and to develop new global habitat suitability maps, using the model outputs to describe its potential global distribution more accurately than was possible with the smaller databases used previously.

Species data
Georeferenced presence-only records were obtained from peer-reviewed scientific outputs and public databases, including the NOAA Deep Sea Coral Data Portal, the ICES Vulnerable Marine Ecosystems data portal and the Ocean Biogeographic Information System (OBIS) portal. To reduce potential errors in the spatial position of the occurrence records used for modelling, presence records were processed in four steps. Firstly, records with neither position accuracy data nor depth information were excluded. Secondly, records with a position accuracy > 1000 m of linear distance were excluded. Thirdly, for records with no position accuracy information but with depth or depth range data, the provided depth or depth range was compared to the depth extracted from the gridded bathymetry data, and any record where the two differed by more than 50 m in absolute depth was excluded. Lastly, remaining records for which the absolute depth extracted from the bathymetry grid was < 40 m were excluded. To decrease the influence of sampling bias, only one occurrence record was retained per cell (after Davies and Guinotte, 2011). In addition, a similar number of cells was randomly chosen to act as pseudo-absence background data.
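The four exclusion steps and the one-record-per-cell rule can be sketched as follows. This is a minimal illustration rather than the processing code used in the study: the `Record` fields, the function names and the way a depth range is compared with the grid depth (distance from the grid depth to the reported interval) are all assumptions made for the example.

```python
import math
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple

CELLS_PER_DEGREE = 120  # 30 arc-second cells: 3600 / 30


@dataclass
class Record:
    """Hypothetical occurrence record; field names are illustrative."""
    lon: float
    lat: float
    accuracy_m: Optional[float]              # reported position accuracy
    depth_m: Optional[Tuple[float, float]]   # (shallowest, deepest); equal for a single depth
    grid_depth_m: float                      # absolute depth from the bathymetry grid


def passes_filters(rec: Record) -> bool:
    # Step 1: neither position accuracy nor depth information
    if rec.accuracy_m is None and rec.depth_m is None:
        return False
    # Step 2: position accuracy worse than 1000 m
    if rec.accuracy_m is not None and rec.accuracy_m > 1000:
        return False
    # Step 3: no accuracy, so check reported depth against the grid depth
    if rec.accuracy_m is None:
        shallow, deep = rec.depth_m
        mismatch = max(shallow - rec.grid_depth_m, rec.grid_depth_m - deep, 0.0)
        if mismatch > 50:
            return False
    # Step 4: grid depth shallower than 40 m
    if rec.grid_depth_m < 40:
        return False
    return True


def cell_index(lon: float, lat: float) -> Tuple[int, int]:
    """Row and column of the 30 arc-second cell containing a WGS 84 point."""
    row = math.floor((90.0 - lat) * CELLS_PER_DEGREE)
    col = math.floor((lon + 180.0) * CELLS_PER_DEGREE)
    return row, col


def thin_to_cells(records: Iterable[Record]) -> Dict[Tuple[int, int], Record]:
    """Keep the first record that passes the filters in each cell."""
    kept: Dict[Tuple[int, int], Record] = {}
    for rec in records:
        if passes_filters(rec):
            kept.setdefault(cell_index(rec.lon, rec.lat), rec)
    return kept
```

Keeping the first passing record per cell is an arbitrary choice here; since only presence is modelled, which record survives in a cell does not change the presence grid.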

Environmental variables
Thirty-three environmental variables were used as candidate predictors in this study, each with a cell size of 30 arc-seconds in the WGS 84 coordinate system (after Davies and Guinotte, 2011). These variables fall into four ecologically relevant groups: 9 terrain variables plus gridded bathymetry, 14 water chemistry variables, 7 biological variables and 2 hydrographic variables (Table 2).
Five environmental variables, namely depth, bottom temperature, aragonite saturation state, bottom horizontal current velocity and vertical current velocity, were selected a priori for inclusion in the model because of their proven ecological relevance in recent habitat suitability studies of framework-forming cold-water corals (Matos et al.). Since the inclusion of highly correlated environmental variables may inhibit model performance and interpretation (Huang et al., 2011), correlation among the remaining variables was investigated within each variable group using the variance inflation factor (VIF), with the five a priori variables excluded from the VIF calculations. In addition to these five variables, variables with VIF < 5, indicating a low level of collinearity, were included in the final model (Yesson et al., 2017).
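For reference, the VIF of a predictor j is 1 / (1 - R_j²), where R_j² is obtained by regressing predictor j on the other predictors in its group; this equals the j-th diagonal element of the inverse correlation matrix. The sketch below uses that identity with the standard library only. It applies a single-pass VIF < 5 filter, which is an assumption: the text does not say whether variables were removed in one pass or stepwise. The function names are illustrative.

```python
import math
from typing import Dict, List, Sequence


def correlation_matrix(columns: Sequence[Sequence[float]]) -> List[List[float]]:
    """Pearson correlation matrix of equally long predictor columns."""
    n = len(columns[0])
    centered = []
    for col in columns:
        mean = sum(col) / n
        centered.append([v - mean for v in col])
    norms = [math.sqrt(sum(v * v for v in c)) for c in centered]
    return [
        [sum(a * b for a, b in zip(ci, cj)) / (ni * nj)
         for cj, nj in zip(centered, norms)]
        for ci, ni in zip(centered, norms)
    ]


def invert(matrix: Sequence[Sequence[float]]) -> List[List[float]]:
    """Gauss-Jordan inversion with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("perfectly collinear predictors")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(columns: Sequence[Sequence[float]]) -> List[float]:
    """VIF of each column: diagonal of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[i][i] for i in range(len(columns))]


def select_low_vif(named: Dict[str, Sequence[float]],
                   threshold: float = 5.0) -> List[str]:
    """Names of the predictors whose VIF is below the threshold."""
    names = list(named)
    scores = vif([named[name] for name in names])
    return [name for name, score in zip(names, scores) if score < threshold]
```

In practice each column would hold the values of one environmental layer sampled at the same set of cells, and the filter would be run once per variable group.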

Predictive modeling approach
In this study, the Random Forest (RF) approach for classification and regression was used to predict the global habitat suitability for L. pertusa. RF is an ensemble algorithm that builds a number of regression trees and averages their predictions (Breiman, 2001). Each tree is grown on a different subset of the training data created by bagging, which reduces the correlation between trees (Rodriguez-Galiano et al., 2015). RF has been applied to predict benthic species distributions in a series of studies and has demonstrated good predictive performance (Matos et al.). The species presence and pseudo-absence dataset was randomly split into two parts, with 80% used as calibration data and the remaining 20% as evaluation data. Three widely used statistics, the true skill statistic (TSS), the area under the receiver operating characteristic curve (AUC) and the kappa statistic, were used to evaluate model performance (Allouche et al., 2006). To aid interpretation of the predicted habitat suitability, three thresholds were applied to reclassify the predicted map: a threshold balancing sensitivity and specificity, a 90% sensitivity threshold and a 95% sensitivity threshold (Thuiller et al., 2003; Tong et al., 2013).
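The evaluation statistics named above have standard definitions that can be written out from a 2 x 2 confusion matrix. The sketch below is illustrative only and is not the evaluation code used in the study: the function names, the random seed and the reading of a "90% sensitivity threshold" as the largest score cut-off that still classifies at least 90% of evaluation presences as present are assumptions.

```python
import math
import random
from typing import List, Sequence, Tuple


def split(rows: Sequence, calibration_fraction: float = 0.8,
          seed: int = 0) -> Tuple[list, list]:
    """Random calibration/evaluation split (80/20 by default)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(calibration_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def confusion(observed: Sequence[int], scores: Sequence[float],
              threshold: float) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for presences (1) and pseudo-absences (0)."""
    tp = fp = fn = tn = 0
    for obs, score in zip(observed, scores):
        predicted = score >= threshold
        if obs and predicted:
            tp += 1
        elif obs:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def tss(tp: int, fp: int, fn: int, tn: int) -> float:
    """True skill statistic: sensitivity + specificity - 1."""
    return tp / (tp + fn) + tn / (tn + fp) - 1


def kappa(tp: int, fp: int, fn: int, tn: int) -> float:
    """Cohen's kappa: agreement corrected for chance agreement."""
    n = tp + fp + fn + tn
    observed_agreement = (tp + tn) / n
    chance = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    return (observed_agreement - chance) / (1 - chance)


def auc(observed: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a presence outscores a pseudo-absence (ties count half)."""
    pos = [s for o, s in zip(observed, scores) if o]
    neg = [s for o, s in zip(observed, scores) if not o]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_threshold(observed: Sequence[int], scores: Sequence[float],
                          target: float = 0.90) -> float:
    """Largest cut-off at which at least `target` of presences score >= cut-off."""
    pos: List[float] = sorted((s for o, s in zip(observed, scores) if o),
                              reverse=True)
    k = math.ceil(target * len(pos))
    return pos[k - 1]
```

The pairwise AUC is quadratic in the number of records, which is acceptable for an evaluation set of a few hundred cells but would be replaced by a rank-based formula for larger sets.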

Results
In total, 31635 records were obtained from papers and public databases (Table 1, Figure 1). After the four processing steps to remove inaccurate records, and after removal of duplicate occurrences within a cell that derived from two or more input sources, 3660 records were retained for analysis. In addition, 3660 cells were randomly chosen as pseudo-absence data. Eight of the 28 environmental variables from the three groups had VIF < 5, indicating low collinearity, and were selected: aspect, profile curvature, rugosity, topographic position index, salinity, seasonal variation index, particulate organic carbon and vertically generalized productivity model VPGM_min (Table 2).
The model evaluation showed that the model performed well, with a TSS of 0.974, an AUC of 0.997 and a kappa statistic of 0.974. Bottom temperature and aragonite saturation state were the most important variables contributing to the RF model. Suitable habitat was predicted to occur primarily on continental shelves and slopes (Figure 2a). The most suitable habitat for L. pertusa was predicted to occur in the North East Atlantic and off the South Eastern United States of America (Figure 2b).
Figure 2 (caption fragment). The classified habitat suitability map using three thresholds: the sensitivity and specificity balanced threshold, the 90% sensitivity threshold and the 95% sensitivity threshold. (c) Detailed map of the rectangle section of (b) with hillshade.

Discussion and Conclusions
The habitat suitability model developed for L. pertusa in this study was influenced primarily by temperature and aragonite saturation state, followed by particulate organic carbon and salinity, which is consistent with findings from previous studies of global stony cold-water coral distributions (Davies et al.). In this study, the database of 31635 L. pertusa records was constrained by limitations of position accuracy and/or depth data. The increase in published L. pertusa occurrence records since 2011 meant that considerably more records were available to this study than to previous ones (from 863 in Davies and Guinotte (2011) to 31635, roughly a 37-fold increase). Global habitat suitability for L. pertusa predicted in this study appears greater than that found by Davies and Guinotte (2011) (Figure 2a). A number of areas of the global ocean, such as large seamounts, ridges and canyons, were predicted to have higher habitat suitability than in the Davies and Guinotte (2011) study (Figure 2a). However, the high suitability of continental shelves and slopes, particularly in the North East Atlantic and off the South Eastern United States of America, was similar to previous predictions (Davies and Guinotte, 2011) (Figure 2b). Additionally, the high habitat suitability in the North East Atlantic and along continental margins was similar to that demonstrated for other cold-water corals, such as Antipatharia, octocorals and other framework-forming corals (Davies et al., 2008; Davies and Guinotte, 2011; Yesson et al., 2012; Yesson et al., 2017). A narrow region along the continental shelf of China was predicted to have relatively low habitat suitability for L. pertusa (Figure 2c).
The spatial resolution of environmental variables is very important for developing useful and accurate habitat suitability models. The global bathymetric dataset used in this study (SRTM30; Becker et al., 2009) has a resolution of 30 arc-seconds, giving a cell size of approximately 1 km². It successfully captures large terrain features such as large canyons and seamounts (> 1 km in diameter), but misses smaller terrain features such as mounds and ridges < 1 km in diameter (Davies and Guinotte, 2011). However, smaller terrain relief structures are also known to be important areas for deep-sea scleractinian framework-forming corals. Future studies should consider using SRTM15+, the highest-resolution global bathymetry dataset available at present (Tozer et al., 2019), which may substantially improve the accuracy of habitat suitability modelling and allow more species records to be incorporated as resolution increases. The habitat suitability predicted in this study may still be influenced by sampling bias in the L. pertusa records. Selecting background data that reflect the same bias as the occurrence data may help to reduce the effects of sampling bias (Phillips et al., 2009). Using a kernel density estimate of sampling effort to reduce the influence of sampling bias could significantly improve model performance (Burgos et al., 2020; Georgian et al., 2019). Such methods should be considered in future to improve the performance of global habitat suitability model predictions.
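One way such a bias correction could look is sketched below: background cells are drawn with probability proportional to a Gaussian kernel density of the occurrence locations, so that the background shares the spatial bias of the presences. This was not done in the present study, and the sketch makes several simplifying assumptions: occurrence density stands in for sampling effort, distances are computed in plain longitude/latitude degrees rather than on the sphere, and the bandwidth, seed and function names are illustrative.

```python
import math
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (lon, lat) in degrees


def kernel_density(point: Point, occurrences: Sequence[Point],
                   bandwidth_deg: float) -> float:
    """Unnormalised Gaussian kernel density of occurrences at a point."""
    x, y = point
    return sum(
        math.exp(-((x - ox) ** 2 + (y - oy) ** 2) / (2 * bandwidth_deg ** 2))
        for ox, oy in occurrences
    )


def sample_background(candidates: Sequence[Point],
                      occurrences: Sequence[Point],
                      n: int,
                      bandwidth_deg: float = 1.0,
                      seed: int = 0) -> List[Point]:
    """Draw n distinct background cells, weighted by occurrence density."""
    rng = random.Random(seed)
    pool = list(candidates)
    # A tiny floor keeps every cell selectable and avoids an all-zero
    # weight vector when the kernel underflows far from any occurrence.
    weights = [kernel_density(c, occurrences, bandwidth_deg) + 1e-12
               for c in pool]
    chosen: List[Point] = []
    for _ in range(min(n, len(pool))):
        i = rng.choices(range(len(pool)), weights=weights)[0]
        chosen.append(pool.pop(i))
        weights.pop(i)
    return chosen
```

With uniform weights this reduces to the random pseudo-absence selection described in the methods, so the two schemes can be compared by changing only the weighting.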