Predicting at-sea distribution of Razorbill in the St. Lawrence Gulf and Estuary, Québec, Canada during the breeding period using GPS telemetry Prédiction de la répartition en mer du Petit Pingouin pendant la période de reproduction dans le golfe et du Canada, à l’aide de la télémétrie GPS

. Seabirds in the St. Lawrence Gulf and Estuary are vulnerable to anthropogenic threats such as oil spills and fisheries bycatch. A better understanding of their at-sea distribution is needed to determine occurrence and abundance hotspots where protection and conservation efforts should be concentrated. The goal of this study was to develop an at-sea distribution map of Razorbills ( Alca torda ) during the breeding period throughout the St. Lawrence to respond efficiently in the event of an environmental emergency. We tracked breeding Razorbills using GPS transmitters (n = 58) in six colonies located in the province of Québec along the St. Lawrence between 2015 and 2018. Two sets of models and maps (habitat suitability and density) were developed for Razorbills from GPS locations and a set of eight environmental covariates using a machine-learning approach (boosted regression trees). We then predicted at-sea habitat suitability and density around all known and active Razorbill colonies throughout the St. Lawrence (n = 85). The main covariates affecting habitat suitability and density of Razorbills were distance to the colony and distance to shore. Sea surface temperature and chlorophyll- a concentration were also important for habitat suitability. The model allowed generating at-sea maps for the entire targeted area during the breeding period. We identified large areas of high suitability (hotspots) to determine locations where Razorbills are most at risk and where conservation efforts should be focused. This work will be important to assess risk and minimize impacts in the case of an environmental emergency such as oil spills occurring in the St. Lawrence. détermine points de protection de les colonies de pingouins connues et actives dans le Saint-Laurent (n = 85). Les principales covariables qui ont affecté le caractère propice de l’habitat et la densité des pingouins étaient la distance à la colonie et la distance à la côte. La température de surface de la mer et la concentration de chlorophylle-a étaient également importantes pour la qualité de l’habitat. Le modèle a permis de générer des cartes en mer pour l’ensemble de la zone ciblée pendant la période de reproduction. Nous avons défini de grandes zones dans lesquelles l’habitat était très propice (points chauds), afin de déterminer les endroits où les pingouins sont les plus menacés et où les efforts de conservation devraient être concentrés. Les résultats de cette étude sont importants pour nous permettre d’évaluer le risque et de minimiser les impacts advenant une urgence environnementale, tel un déversement d’hydrocarbures, dans le Saint-Laurent.


INTRODUCTION
The St. Lawrence is a vast and complex ecosystem where biological productivity is high. It is a major transportation corridor for international commercial vessels subjected to high anthropogenic pressure resulting in cumulative exposure and environmental changes (Beauchesne et al. 2020, Blais et al. 2021.
Few areas of the St. Lawrence are free from stressors such as pollution, acidification, fisheries, invasive species, and ship traffic (Beauchesne et al. 2020). Approximately 4000 vessels per year transit through the seaway, transporting various commodities including over 3 million metric tonnes of petroleum products (St. Lawrence Seaway Corporation Management 2019). Oil spills from oil tankers, ship-to-ship transfer, and oil handling facilities To properly address conservation issues for seabirds, it is imperative to characterize the areas of high use and identify sites of importance. Several seabird monitoring programs have been developed in Canada to monitor breeding colonies (ECCC 2016(ECCC , 2020 as well as offshore distribution and abundance (ECCC 2022). These long-term standardized programs provide a strong basis for population trends and biodiversity hotspots. However, large-scale aerial or boat surveys can be costly and provide a snapshot at the time of the survey only. Alternatively, or in tandem, seabirds can be tracked with GPS transmitters to determine their at-sea distribution over the course of the breeding season. Tracking data from individuals can provide information on important areas based on frequency of use (Irvine et al. 2014). However, it is often not realistic to deploy GPS transmitters on a large number of bird colonies because of the high cumulative cost and the inaccessibility to some of them (e.g., cliff-nesting individuals). Deploying transmitters on individuals from a subset of colonies is therefore more appropriate. Predictive spatial analyses can then be used to extrapolate seabirds' distribution to other colonies based on a set of environmental variables (Lieske et al. 2020). For instance, the foraging range of breeding colonial seabirds is constrained around their nest (Elliott et al. 2009) and the distance to the colony is typically strongly limiting. Other variables such as prey density, environmental variability, bathymetry, and sea surface temperature can also explain seabirds' foraging behavior around their colony (Burke and Montevecchi 2009, Kowalczyk et al. 2015, Soanes et al. 2016, Delord et al. 2020. Recent advances in GPS tracking technologies and associated mathematical models have allowed researchers to: (1) identify environmental variables with the most explanatory power based on a subset of prior information (e.g., GPS positions of tracked individuals), (2) identify optimal values for these environmental variables, and (3) generate spatial predictions for non-sampled areas (e.g., bird distribution; Elith et al. 2008, Lieske et al. 2020. The resulting distribution maps can show where areas of high occurrence and density are located and where conservation actions are required. The goal of this study was to develop at-sea habitat suitability and density maps of Razorbills during the breeding period for 85 colonies located in the St. Lawrence based on tracked individuals from six colonies. We also aimed to identify the location of the largest habitat suitability hotspots. The at-sea distribution maps of the St. Lawrence produced in this study will provide valuable information on their range during the breeding season, which will help better inform management and conservation agencies. In addition, the results will be an important resource in case of environmental emergency events (e.g., oil spill) for preparedness, planning, to direct intervention efforts, to minimize impacts, and for damage assessment.

Study area and tracking data
We captured Razorbills on six breeding colonies located in the province of Québec (Canada) along the St. Lawrence Estuary and Gulf (Table 1; Fig. 1). We deployed 85 GPS transmitters (URIA series, Ecotone Telemetry Inc., Gdynia, Poland) on breeding adult Razorbills between 26 May and 25 July from 2015 to 2018. We captured individuals at their nest site during the incubation period using nest boxes equipped with a remote-triggered door system. Transmitters were programmed to acquire GPS locations at fixed intervals (5-30 minutes depending on colony and year of sampling). GPS fixes were logged on the transmitter until they were automatically downloaded on a base-station located near the nest.

Data processing
We discarded all GPS locations recorded between the capture and the first time the bird came back to the colony to avoid the inclusion of erratic movements caused by post-capture stress. We excluded all locations at the nest using a 20 m buffer around the colony to take into account the theoretical accuracy of the GPS module (L. Iliszko, Ecotone Telemetry Inc. 2020 personal communication). To reduce the variability in sampling effort among individuals, all tracks were standardized to 30 minute Lawrence where birds were equipped with a GPS transmitter during breeding seasons of 2015-2018 (n = 85 deployed; 58 kept in the model; gray circles). These colonies were used to model and predict the spatial distribution of all known and active Razorbill colonies (white circles) in Québec, Canada (gray background). The dividing line shows the limit between Estuary (west) and Gulf (east). Individual bird tracks are shown for the six colonies in insets A to C (letters correspond to the boxes in the top left panel).
intervals. In addition, we restricted the number of positions per individual within the 5-95 percentiles (i.e., 23-367 positions), considering this a reasonable trade-off between standardization of sampling effort and data loss. Because Razorbills rarely spend time on land other than at the colony, locations occurring on land were discarded (n = 221). After data processing, a total of 9803 locations from 58 GPS transmitters were kept for the analysis ( Fig. 1).
Razorbill space use was modeled to relate the occurrence and density of GPS locations to a set of biologically relevant environmental covariates (Lieske et al. 2020). For the purpose of this study, we assumed that the occurrence and density of GPS locations in a given area were considered a good proxy of the occurrence and density of individuals in that area. To estimate density of birds in an area, we first binned GPS locations of each individual into a 1 x 1 km resolution grid centered on the breeding colony. The extent of each individual grid was determined using the mean of maximum foraging distances recorded across all colonies (mean ± SD; 79 ± 46 km). To determine spatial at-sea distribution, we also gave a binary (0, 1) occurrence value to each individual grid cell. For occurrence, the goal was to produce a probability map of the potential distribution, hereafter called habitat suitability index, ranging from 0 to 1 (Hazen et al. 2021). Only grid cells occurring over seawater were kept for further analysis (inland values were set to not available [NA]).
The recent advances in telemetry technology allows the collection of a large amount of data that enables researchers to predict spatial distribution and density. However, remote sampling only produces presence data and thus, pseudo-absence must be inferred from the non-used, available environment (Warton and Shepherd 2010, Oppel et al. 2012, Hazen et al. 2021. Random selection of pseudo-absences ("background") is the most commonly used technique and was chosen to match the goal of our study (Iturbide et al. 2015, Hazen et al. 2021. Based on recommendations of Barbet-Massin et al. (2012), we selected the same number of pseudo-absences as presences. For each individual, we sampled pseudo-absence grid cells for both occurrence and density inside a radius equal to its maximum distance from the colony, but no longer than the chosen grid extent (79 km). The median percent coverage of pseudo-absences or presences in the area was 3% (5th-95th percentiles; 1-15%). A sensitivity analysis of the effect of pseudo-absence sampling on the models outputs is presented in Figure A1.1.

Environmental data
We modeled density and a habitat suitability index for Razorbills using a set of environmental covariates that have been shown or hypothesized to influence the distribution of colonial alcids: distance to the colony, distance to the nearest shoreline, depth, seafloor slope, rugosity, chlorophyll-a concentration, and sea surface temperature (SST; McDuie et al. 2018, Lieske et al. 2020. Satellite imagery of chlorophyll-a concentrations has been successfully used as a proxy for productivity and presence of prey to examine spatio-temporal variations of seabird and whale densities (Suryan et al. 2012, McDuie et al. 2018, Abrahms et al. 2019. Similarly, SST has been found to explain seabirds' occurrence and abundance (Domalik et al. 2018, McDuie et al. 2018, Delord et al. 2020. All covariates were extracted as spatial raster from various sources (Table A1. All individual occurrence and density grids were then stacked with colony-specific environmental rasters. As a final data processing step, individual raster stacks were combined (concatenated) to form a single data set containing occurrence, density, and environmental covariates values. Using the resulting final data set, we tested spatial autocorrelation for each colony using Moran's I ) and Geary's C ; A1.2).

Modeling process
Razorbill habitat suitability and density were modeled using boosted regression trees (BRT; Friedman 2002, Elith et al. 2008). BRT models were chosen for this study over other methods commonly used for telemetry data (i.e., resource selection function [RSF] using spatial point process; Johnson et al. 2013), generalized linear models, generalized additive models, and other machine learning models (Oppel et al. 2012) because they are more robust and less sensitive to outliers, missing and nonindependent data, and collinearity between predictors (Elith et al. 2008, Buston andElith 2011). This machine learning technique can also accommodate for complex and non-linear relationships (Mellin et al. 2012). As a result, BRT modeling has been increasingly popular to predict spatial distribution of seabirds and mammals (Lieske et al. 2020, Hazen et al. 2021. Fitting a BRT requires specification of three important parameters: (1) the number of trees fitted (nt), (2) the learning rate (lr; determines the contribution of each single tree to the growing model), and (3) the tree depth (td; the complexity of interactions that are allowed to be fitted). We tested multiple combinations of td and lr to find the optimal associated nt following Elith (2008 ; Table A1.2). Because the St. Lawrence Gulf and Estuary are different based on many environmental features, we added an additional region covariate (levels=estuary or gulf) to further reduce variation. We tested whether models detected pairwise interactions between predictors and reported their relative strength (Table 2). BRT model building and parametrization were done using gbm package (Ridgeway 2019) along with complementary functions provided in Elith (2008).

Model validation
Density and habitat suitability models were validated using a 10fold cross-validation where the data set comprising all the grid cells of the sampled colonies was randomly divided in 10 fold with 90% of the data set used as training data, which was then used to predict values on the remaining test fold (i.e., 10% of the data set). This procedure was repeated nine times with a different fold until all folds were used as test data set.
Performance of the habitat suitability model was assessed using discrimination and calibration (Pearce and Ferrier 2000). Discrimination performance was reported using the area under the receiver operating characteristic curve (AUC; Fielding and Bell 1997). AUC values range between 0 and 1, with values between 0.7 and 0.9 considered as good for prediction purpose, and values > 0.9 considered as excellent. A linear regression of the relative frequency of observed presences over 10 bins of predicted habitat suitability was used to determine the model calibration (slope) and bias (intercept; Phillips and Elith 2010). A perfect model is represented by calibration = 1 and bias = 0 Elith 2010, Oppel et al. 2012). For the density model, predictive accuracy was reported using the root mean square error (RMSE) and Pearson's correlation coefficient between predicted and observed counts. Calibration and bias were assessed using the slope and intercept of a major axis regression of observed over predicted counts (Potts andElith 2006, Piñeiro et al. 2008). The coefficient of determination (R²) of this regression was also presented as an index of the proportion of the variation in observed counts that is explained by the predictors.

Predictions
Final models were fitted on the entire data set to predict the habitat suitability and density of Razorbills on a radius of 79 km around each independent colony from the St. Lawrence Gulf and Estuary. Information on colonies (i.e., name, location, estimated number of birds) were obtained from the Computerized Database of Québec Seabirds (ECCC 2020) as part of the Québec Seabird Colony Monitoring Program led by Canadian Wildlife Service, Environment and Climate Change Canada. All colonies that were active (i.e., at least one bird detected) during the last survey (excluding surveys older than 1986) were included (J. -F. Rail 2019, personal communication). Similarly to the colony-grid used for modeling, prediction grids were built around all colonies (n = 85) using the same extent and environmental covariates as described above. Habitat suitability index around the colony was predicted for each grid cell using the habitat suitability (occurrence) model. Predicted bird density around the colony (birds/km²) were obtained by re-scaling predicted counts (number of GPS locations in a grid cell) between 0 and 1 (i.e., a utilization distribution): where i = a given grid cell and j = a given colony. We then multiplied by colony size (C) for each colony to obtain bird density (birds/ km²): for grid cell i and colony j. We obtained colony size estimates (number of breeding individuals) from the Computerized Database of Québec Seabirds (ECCC 2020). We divided the number of individuals by two to obtain the number of pairs; we assumed that half of the individuals were at-sea while the other half (the partners) were at the colony. To produce an estimate of bird density for the entire study area, we combined each prediction grid and overlapping cells using the mosaic function in the raster package . For the habitat suitability mosaic, overlapping values were combined based on the probability of a union of events (DeGroot and Schervish 2012): where p i is the habitat suitability index for a pixel of one colony and is the number of overlapping pixel of different independent colonies.
This resulted in a raster of habitat suitability index ranging between 0 and 1. For the density mosaic, overlapping values were summed. Largest areas of high habitat suitability (i.e., hotspot) were determined by keeping cells with a habitat suitability index higher than 0.75 and merging contiguous cells.

Effect of environmental covariates
Distance to the colony was the main covariate affecting habitat suitability and density of Razorbills with a relative influence of 20% and 48%, respectively (Fig. 2). Habitat suitability and density decreased drastically as distance to the colony increased ( Fig.  A1.3). In general, habitat suitability index and density were higher near the coast and in shallower depths. The relative influences of SST and chlorophyll-a were in the second (18%) and third (15%) ranks, respectively, after distance from the colony in the habitat suitability model, but only 6% and 7%, respectively in the density model (Fig. 2). Both metrics decreased with SST and peaked at a chlorophyll-a concentration near 5 mg/m³ (Fig. A1.3).

Predictions
Density and habitat suitability maps for all colonies are presented in Figures 3 and 4, respectively, and in raster formats in Appendix 2 and 3, respectively. The five largest habitat suitability hotspots were: (1) (sst), chlorophyll-a concentration (chloro), distance from the shore (distshore), depth, slope, rugosity, and region. Detailed description of each covariate is presented in Table A1.1 and Figure A1.3.

DISCUSSION
Razorbill habitat suitability and density were successfully predicted for the 85 monitored colonies throughout the St. Lawrence during the breeding season. Using BRT modeling, we determined hotspots of high-habitat suitability, which can be used for conservation efforts. The relative influence of environmental covariates differed between habitat suitability and density models, but distance from the colony ranked first in both models. This latter result was expected, because Razorbills are central place foragers during the breeding period, which forces them to return to the colony to incubate their egg or feed their chick. The foraging range of Razorbills is usually closer to the colony than other sympatric alcids in the North Atlantic (Lavers et al. 2020). In the Mingan Archipelago (Gulf of St. Lawrence), Razorbills traveled on average 7.2 ± 0.1 km from their colony during a foraging trip compared to 30.7 ± 0.6 km for Common Murres (Uria aalge) and 56.7 ± 0.5 km for Black-legged Kittiwake (Rissa tridactyla; Petalas et al. 2021). Breeding Razorbills tracked with GPS in Saint Pierre and Miquelon Archipelago (northwestern Atlantic) foraged at a maximum distance from the colony of 11 ± 10 km (Delord et al. 2020). Distance from the colony was among the principal factors influencing foraging of breeding Wedge-tailed Shearwaters (Ardenna pacifica) in Australian marine waters using a similar statistical approach (i.e., BRT on GPS-tracked adults; McDuie et al. 2018).
Distance to shore was the fourth and second most important predictor for habitat suitability and density models, respectively. In addition, the combined bathymetry metrics (slope, rugosity, and depth) had a moderate relative influence in both models. The foraging habitat of Razorbill is generally found in relatively shallow waters, close to shore, and where feeding conditions (often at fronts and upwellings) and concentrations of their prey are predictable (Huettmann et al. 2005, Lavers et al. 2020. Razorbills are more selective in the choice of their foraging habitat than other sympatric alcids (Wanless et al. 1990, Huettmann et al. 2005. In the Gulf of St. Lawrence, Razorbills are known to forage almost exclusively on capelin (Mallotus villosus) and sandlance (Ammodytes sp.) during the chick-rearing period (Chapdelaine and Brousseau 1996, Lavoie et al. 2012, Petalas et al. 2021. Razorbills use waters closer to shore compared with many other sympatric seabirds (Gaston and Jones 1998, Thompson et al. 1999, Huettmann et al. 2005. Dynamic variables such as chlorophyll-a concentration and SST had a clear effect on the habitat suitability model, but not on the density model. Previous studies found that dynamic variables explained foraging locations of seabirds during the chick-rearing period (Domalik et al. 2018, McDuie et al. 2018. Foraging areas of seabird species can be predicted by low SST and high chlorophyll-a concentrations, which are indicative of upwelling and primary productivity (Grecian et al. 2016). For Razorbills, Delord (2020) found that SST explained the behavioral state (transit vs. foraging) of breeding adults in the Saint Pierre and Miquelon Archipelago. Areas where mixing of water masses occurs (with different temperatures and salinities) have also been found to favor Razorbill aggregation (Begg andReid 1997, Camphuysen andWebb 1999). In our study, habitat suitability decreased with temperature, which is consistent with their preferred prey (capelin and sandlance) that are found in cold waters (Scott and Scott 1988).
Razorbill habitat suitability hotspots were found in five main areas (Fig. 4), which are known colonial seabird diversity and abundance hotspots. The North Shore, near Iles Sainte-Marie, is notable for its large numbers of Razorbill colonies and other alcid species such as Common Murres and Atlantic Puffins (Fratercula arctica; Rail and Cotter 2015). In addition, primary and secondary productivity is notably high in the Saguenay-St. Lawrence Marine Park where large numbers of marine mammals and seabirds can be found (Bédard et al. 1997, Rail 2018 Rail et al. 2013, Pelletier andGuillemette 2022). The Mingan Archipelago National Park Reserve is also an area of high seabird biodiversity and abundance. The five largest habitat suitability hotspots found in our study were corroborated by the results of the Atlas of Seabirds at Sea in Eastern Canada 2006(ECCC 2022 during the breeding season. The Atlas is a series of maps based on the Eastern Canada Seabird at Sea (ECSAS) survey conducted annually by the Canadian Wildlife Service in the Quebec and Atlantic regions. This survey is conducted by recording seabirds at sea from ships using line and point transects followed by a distance sampling approach to estimate bird density (Gjerdrum et al. 2012). Although the maps generated were of lower resolution (100 x 100 km) than the current study (1 x 1 km), these two independent assessment methods captured a very similar spatial pattern throughout the St. Lawrence during the breeding season, which increases the confidence in the validity of our model. The hotspot areas presented in this study also correspond to zones of high cumulative threats (ship-source oiling, marine traffic, bycatch, and light pollution) according to the assessment of Lieske (2020) for seabirds in general, and more specifically for alcids. Finally, each hotspot area identified in our study overlaps with one or more Important Bird Areas (http://www.ibacanada. com). The areas presented here will help to rapidly and efficiently communicate with response agencies the level of threat that Razorbills are facing in the event of an environmental emergency such as an oil spill. They will be useful to designate areas of special concern to prioritize and direct response strategies.
The BRT model performed well for the studied environment and led to comprehensive maps of the Razorbill's at-sea distribution in the St. Lawrence. A similar BRT model to ours was used on the Atlantic Coast of Canada for 13 species/groups of seabirds from over 5000 species-specific colonies to predict abundance (Lieske et al. 2020). The authors used seabird abundance and anthropogenic threats (e.g., ship-source oil pollution) as well as a sensitivity index to those threats to map cumulative risks. Our study incorporated additional environmental covariates in the model and presented maps of finer resolution. BRT models have been used on GPS-tracked marine mammals, sharks, turtles, and seabirds in various marine systems to identify or to predict foraging patches, sensitive areas, or densities at various stages of the animals' life cycle , Hazen et al. 2018, McDuie et al. 2018, Maxwell et al. 2019, Lieske et al. 2020, which shows the wide applications of this method.
The maps presented here will help with conservation efforts, monitoring, risk mitigation associated with anthropogenic activities, and scientific purposes. Seabirds are vulnerable to fisheries bycatch and it is estimated that between 2679 to 45,586 birds die annually across Canada as a result (Ellis et al. 2013). In the St. Lawrence, approximately 464 birds die annually (range: 26-2082; Ellis et al. 2013). These are likely underestimated numbers because of carcasses lost at sea and spatial data gaps due to low observer coverage on fishing boats (Ellis et al. 2013). A better understanding of birds' at-sea distribution may help in identifying ways to mitigate the mortality rate. For instance, certain key areas presented here could be avoided by fisheries activities during critical times such as the breeding period. In addition, seabirds are vulnerable to oil spills because they spend most of their time at sea and any exposure to oil can lead to adverse health effects and death. Spills at sea can occur from accidental or deliberate discharge. As an example, between 1000 and 4000 birds died as a result of an oil spill from the ship the Gordon C. Leitch that occurred during the non-breeding period in 1999 near Havre-Saint-Pierre (Roberge and Chapdelaine 2000), an area used by some of our tagged Razorbills from the Betchouane Migratory Bird Sanctuary. Such an event occurring during the breeding season would undeniably have more serious consequences on the avian community. At-sea distribution of seabirds that are susceptible to oiling like Razorbills can help to assess risk and develop a response strategy to prevent adverse situations (IPIECA 2014). Decisions need to be made quickly in order to mitigate risk to wildlife and prior knowledge on areas of greater or lower risk are needed for a successful operation and an efficient use of resources (ECCC, in press). In the St. Lawrence, higher boat traffic, marine pollution, and overall cumulative risk occurs in the narrower and southern sections (Beauchesne et al. 2020). Our analysis shows that Razorbills in the Saguenay-St.
Lawrence Estuary confluence and in the Gaspé Peninsula face greater oiling and cumulative risks (Fig. 4).
The maps presented in this study will be a good tool to help governmental bodies (e.g., Canadian Wildlife Service) to better prepare and respond to environmental emergencies. For instance, rapid estimates of birds' occurrence and density will be readily available to efficiently communicate information to, and properly advise governments, Indigenous organizations, industry, response organizations, and other stakeholders. These estimates are important to identify wildlife and their key habitats that are expected to be affected (ECCC, in press). Appropriate actions can then be planned proportionally to the potential impact on birds.
Efforts are currently underway to use this model template on other species and obtain a more complete overview of seabirds' at sea across the St. Lawrence. A better general understanding of at-sea distribution will help with conservation efforts and mitigate impacts in case of environmental emergencies.
Responses to this article can be read online at: https://www.ace-eco.org/issues/responses.php/2188