Relating marine mammal distribution to water column prey structure derived from echosounding

: Managing human impacts on marine mammal populations depends on understanding their distributions in space and time. Knowledge of these distributions is becoming increasingly important due to offshore energy development and climate-induced shifts in oceanic habitat. Most models assessing pelagic marine mammal abundance and distribution primarily use ocean surface or ocean floor variables as proxies for water column habitat. Here, we used a ship-based marine mammal sighting survey to test the utility of echosounder-based predictive variables for modeling marine mammal distribution and abundance. We assessed the distribution of 7 marine mammal taxa and 3 feeding guilds along the shelf break off the northeast USA relative to prey structure derived from acoustic data. We classified prey into 4 categories: (1) fish with swim-bladders; (2) small resonant bubbles as phytoplankton, fish larvae, and gelatinous zooplankton; (3) fluid-like scatterers such as krill and copepods; and (4) fish with no swimbladder. We quantified the spatial structure of prey by calculating backscattering strength, location, dispersion, occupied areas, evenness, and aggregation. Spatial resolution along the survey track line was set to bins 1000 m in distance and either 50 or 200 m in depth. We then built generalized additive models (GAMs) using these acoustically derived variables to explain marine mammal distribution. The resulting GAMs explained between 9 and 38% of deviance, with model fit often reflecting aspects of foraging depth and prey preference. This approach could contribute to improved management through more accurate species distribution models that employ direct measurements of prey.


INTRODUCTION
Human endeavors are increasingly encroaching on marine pelagic habitat. Fishing and shipping have long occurred in pelagic waters, and now wind energy development is expanding further into offshore environments (United States 2017). Constraining the impacts of these activities on marine life requires a thorough understanding of the distribution, abundance, and habitat use of the animals that occupy these waters. Marine mammals are of particular concern, whether due to their protected status under laws such as the US Marine Mammal Protec-tion Act (MMPA), specific concerns for small and endangered populations (Forney et al. 2017), high sensitivity to sonar (DeRuiter et al. 2013), or construction impacts such as those from pile driving (Brandt et al. 2011).
Recent studies modeling coast-wide distributions of marine mammals in both the Atlantic and Pacific have linked data from visual surveys to environmental covariates, nearly all of which represented either ocean-surface or ocean-floor characteristics (Forney et al. 2012, Roberts et al. 2016, Becker et al. 2019, Chavez-Rosales et al. 2019). These environmental covariates, while proven useful, act as proxies for water column characteristics such as prey concentrations that define oceanic habitats.
Multiple studies have examined spatial organism structure within the water column as it relates to marine mammal distribution (e.g. Benoit-Bird & McManus 2012, Benoit-Bird et al. 2013, Abecassis et al. 2015. However, few studies have attempted to apply this approach to large-scale abundance surveys involving numerous predator and prey species, and these attempts have had limited success, particularly in applying echosounding data directly (LaBrecque 2016, Roberts et al. 2016, Virgili et al. 2021. To pursue offshore energy development in the USA, companies must comply with the MMPA and Endangered Species Act (ESA) and assess potential effects on marine mammal habitat and prey. The portion of the northwest Atlantic shelf break south of the northeast USA (approximately 41°N and east of approximately 73°W) (see Fig. 1) has high densities of marine mammals during the summer compared to surrounding areas (Chavez-Rosales et al. 2019), yet we know little about the distribution of prey resources in this region. Typical fish sampling for management purposes focuses on the continental shelf and benthic habitat, with limited effort focused on areas off the continental shelf and pelagic habitat. For MMPA strategic stocks (those of particular concern), the MMPA requires an assessment of factors that could cause a decline or impede recovery of strategic stocks (which, in this region, include right whale Eubalaena glacialis, fin whale Balaenoptera physalus, sei whale B. borealis, blue whale B. musculus, sperm whale Physeter microcephalus, longfinned pilot Globicephala melas, and short-finned pilot whale G. macrorhynchus).
Two options to sample the prey field in surveys designed to estimate broad-scale abundance include net sampling and echosounding. Net-based pelagic prey sampling is difficult to accomplish on a marine mammal abundance survey where the primary objective is to visually count cetaceans along a track line that requires a consistent ship speed that is faster than a net can be towed (e.g. 10 knots versus 3− 4 knots). Net sampling during the day limits marine mammal sightings, and sampling at night results in a temporal mismatch between sightings of highly mobile animals and prey resources in the water column. An alternative is to examine organismal spatial structure in the water column with echosounding while simultaneously searching for marine mammals at survey speeds.
Biological acousticians have conducted extensive research to distinguish marine organisms using their acoustic backscatter (Benoit-Bird & Lawson 2016). The principles are founded in mathematical acoustic scattering models (Stanton et al. 1998, Stanton & Chu 2000 and have been validated by both laboratory measurements (Wiebe et al. 1990, Stanton et al. 2004) and field testing (Wiebe et al. 1996, Lawson et al. 2001. Received backscatter depends on the acoustic frequency encountering the targets, as well as their shape, orientation, and body structures. Fish, plankton, or cephalopods can often be categorized acoustically by their relative frequency response (Korneliussen et al. 2016). This approach uses the fact that organisms have a characteristic backscattering response to particular echo sounding frequencies, allowing researchers to distinguish unique patterns particular to a certain class of nekton or plankton. Multi-frequency classification approaches using this technique have been em ployed successfully numerous times in the field (Korneliussen & Ona 2002, Jech & Michaels 2006, Benoit-Bird 2009, De Robertis et al. 2010, Trenkel & Berger 2013, Jech et al. 2018. In this study, we employed the multi-frequency organism classification algorithm developed by Trenkel & Berger (2013) to quantify the acoustic response in the upper 200 m into 4 organism types: (1) fish with swimbladders; (2) organisms with small, en trained resonant gas bubbles such as phytoplankton, fish larvae, and siphonophores; (3) fluidlike zooplankton such as krill and copepods; and (4) fish with no swimbladder (which may also include cephalo pods). We also assessed the frequency response for individual echosounder frequencies and applied a suite of metrics to both the multi-frequency indicator and individual frequency responses that characterize the vertical distribution with measures of density, abundance, location, dispersion, occupancy, evenness, and aggregation (Urmy et al. 2012). We then used these backscatter variables to model marine mammal distribution along the track lines. In conducting this analysis, we assessed the applicability of these various metrics to marine mammal modeling.
We hope the results of this study will spur incorporation of prey and water column structure in future marine mammal abundance and distribution estimates. Current abundance and distribution models focus on static and remotely sensed environmental variables, in part because they are easily measured and attainable over large areas. Incorporating measurements of acoustic backscatter into these models may require additional modeling to estimate prey abundance in areas not surveyed (e.g. Lehodey et al. 2015), incorporation of additional acoustic survey data from areas not surveyed on the marine mammal abundance cruises, redesigning survey track density, or a combination of these measures. This study aims to serve as a proof of concept that measurements of prey and water column structure could provide valuable information for modeling marine mammal distribution from abundance surveys, while also exploring habitat and prey preferences that could inform future use of acoustic data to improve marine mammal abundance and distribution modeling.

Survey design
Marine mammal observations and echosounding data used in this study were collected during the summers of 2011, 2013, and 2016 as part of the Atlantic Marine Assessment Program for Protected Species (AMAPPS). AMAPPS is a comprehensive, multi-agency research program to assess the abundance, distribution, ecology, and behavior of marine mammals, sea turtles, and seabirds throughout the US Atlantic Outer Continental Shelf, from Maine to the Florida Keys. The AMAPPS program has conducted research that spans taxonomic groups and trophic levels and provides an integrative view of protected species, placing them in an ecosystem context that can help inform management decisions regarding federally protected marine species (NOAA Fisheries 2013).
Data used in this study were collected on 6 shipboard surveys (2 yr −1 ) conducted on the NOAA ship 'Henry B. Bigelow' (hereafter referred to as the 'Bigelow') in June and July 2011 and July and August 2013 and 2016. The ship ran established track lines that were repeated each abundance survey. The 26 transects included in this study traversed the continental shelf break, where each transect was a straight line intersecting the shelf break at diagonal angles ( Fig. 1). Echosounding equipment was set to passive mode (listening, not actively pinging) for portions of the survey because of multiple cruise objectives (Cholewiak et al. 2017); these portions were not considered in our analysis. Among these 26 transects, 17 were incorporated from 2011, 22 from 2013, and 18 from 2016 ( Fig. 1). Occasionally, only portions of a transect were sampled due to weather or time considerations, and the remainder of a transect was sampled the following day when the echosounding equipment was recording in passive mode. In all, sur-vey data from approximately 3500 km of ship tracks were used in this study.

Marine mammal observing
While running these transects, 2 independent observing teams visually scanned for cetaceans, seals, turtles, and some fish species. Each team consisted of 2 people surveying with high-powered 'Big Eye' binoculars (magnification power of 25 with a 150 mm lens), and at least one person recording sightings and observing with the naked eye. The data were recorded in VisSurvey, a custom NOAA software program. Data included species, a best estimate of group size, behavior, location, distance and angle from the ship, and sea state, among other variables. If a species could not be discerned, the sighting was classified into a general taxonomic group.
For this analysis, the time and geographic location of visual sightings along the transect were adjusted so that they aligned with the corresponding echosounding data. For example, if a sighting was recorded at a distance of 1700 m in front of the vessel and along the track line, that sighting was assigned a time, latitude, and longitude associated with the time, latitude, and longitude of the echosounding data at which the vessel steamed 1700 m, given its mean speed.

Echosounding processing
Echosounding data were collected using a Simrad EK60 system consisting of 18, 38, 70, 120, and 200 kHz transducers. The transducers were mounted in a retractable centerboard of the 'Bigelow' and calibrated prior to the cruises using a 38.1 mm diameter sphere of tungsten carbide with 6% cobalt binder, for all frequencies and following standard calibration techniques (Foote et al. 1987, Demer et al. 2015. Each transect was processed separately in Echoview software (version 11.1; Echoview 2020) using an identical set of functions and variables for each transect. The acoustic outputs were batch-processed and exported from Echoview with custom Python (ver-  Several steps were taken to clean the data within Echoview. The data from each frequency were evaluated for noise contamination and attenuation and cleaned following the methods proposed by Ryan et al. (2015) and implemented in Echoview (version 11.1; Echoview 2020). Impulse noise removal, attenuated signal correction, transient noise removal, and background noise removal were evaluated for each frequency, and parameters were tuned accordingly. The transient noise filter was only applied to the 18 and 38 kHz frequencies, as its effect was not apparent in the higher frequencies.
The first 10 m of the water column and areas near the sea floor were removed from the analysis. We removed the surface 10 m due to the echosounder near-field effect, in which detected values close to the echosounders are inaccurate, and because of potential bubble interference in the surface waters. Since we were primarily concerned with pelagic structure, specifically those areas above the ocean floor, we ensured that echosounding values from the sea floor were not interpreted as pelagic marine organisms. The ocean bottom line was first drawn automatically by Echoview's best bottom candidate algorithm, and then buffered 3−5 m upwards and manually adjusted where necessary. The final data retained for analysis only included those data collected on the transect when marine mammal survey effort was underway.
After cleaning the data, we created grids of 1000 m in length and either 50 or 200 m in depth that were used to export summary data. Both the 50 and 200 m depth bins were later evaluated for inclusion in each marine mammal model. We chose a horizontal bin distance of 1000 m in an attempt to retain the potential fine-scale relationships that have been suggested to drive predator−prey interactions. Recent research into marine patchiness examining the scales shaping zooplankton, fish, and seabird distributions via acoustics found that submesoscale processes from 100 m to 1 km play a significant role in shaping the pelagic seascape (Bertrand et al. 2014). The 1000 m horizontal bin size acknowledges the nature of the survey and the locations of the animals sighted relative to the ship. At this scale, we are attempting to assess submesoscale characteristics that may indicate the likelihood of prey patches, but we are not attempting to identify a particular small prey patch. However, a number of the metrics described in the following paragraph are intended to capture characteristics within a cell that may indicate the likelihood of a small prey patch within a particular cell. This 1000 m horizontal scale Fig. 1 (continued) also strikes a balance among modeling tradeoffs with regard to excess zeros and small sample sizes. Use of an overly small horizontal scale could also result in autocorrelation and artificially inflate explanatory power, although generalized additive models (GAMs) used in this study limit that possibility since they are considered robust to autocorrelation.
Similar to horizontal bin size choice, vertical bin sizes were chosen through a combination of modeling practicality and ability to capture the appropriate metrics at multiple frequencies. Higher frequencies are attenuated faster than lower frequencies with increasing depth. Given this fact, we did not feel comfortable using the 200 kHz backscatter below 200 m, which informed our choice of 200 m as one depth bin choice. This larger vertical bin size also fit larger deep-water characteristics such as the deep scattering layer. In addition, we included 50 m depth bins that may be informative in the upper water column and also serve as a test to examine whether this vertical scale would be appropriate for particular species. As with the horizontal scale, these choices were made in an attempt to avoid excess zeros and overly segmenting the water column. Overly segmenting the water column could add unnecessary speci-ficity and would have further ballooned an already substantial list of covariates to examine.
We next created metrics to assess water column structure and prey distribution. To assess water column structure, we calculated measures of density, abundance, location, dispersion, occupancy, equivalent areas (hereafter referred to as evenness), and aggregation (Table 1) per acoustic bin. Density was represented by the mean volume-backscattering strength (S v ) (dB re 1 m −1 ), and abundance was represented by the area-backscattering coefficient (s a ) (m 2 m −2 ) (Echoview 2020). Measures of location (center of mass, m), dispersion (inertia, m −2 ), occupied area (proportion occupied), evenness (m), and aggregation (index of aggregation, m −1 ) were derived from Urmy et al. (2012) as implemented in Echoview 11.1 (Echoview 2020).
To assess potential prey distribution, data were processed with a multi-frequency indicator (MFI) algorithm published by Trenkel & Berger (2013). This MFI classifies acoustic backscatter into 4 categories: (1) large gas bubbles (0.4−1.4 mm), used as a proxy for fish with swimbladders; (2) small resonant bubbles (0.1−0.2 mm), used as a proxy for organisms with small, entrained bubbles such as larval fish, phytoplankton, and gelatinous zooplankton (e.g. siphon -  Table 1. Summary of acoustic metrics used to summarize prey data ophores); (3) small fluid-like scatterers, used as a proxy for zooplankton such as krill and copepods (hereafter this category will be referred to as describing krill and copepods); and (4) large fluid-like scatterers, used as a proxy for fish with no swimbladder (e.g. mackerel) or possibly squid (Table 1). In the MFI calculation, a small portion of the backscatter does not fall into one of these 4 categories; however, in our experience, this percentage was typically less than 10%.
The approach operates on the idea that different organisms have unique acoustic responses to a suite of acoustic frequencies. By comparing the relative responses to different frequencies, the dominant organism types in a particular parcel of water can be distinguished. This is a foundational characteristic in marine acoustics that has been validated both in the lab and in the field (Wiebe et al. 1990, 1996, Lawson et al. 2001, Stanton et al. 2004). Similar techniques have been successfully employed numerous times in the field (Korneliussen & Ona 2002, Jech & Michaels 2006, Benoit-Bird 2009, De Robertis et al. 2010, Trenkel & Berger 2013, and this MFI should be transferable to regions other than that in which it was developed. The classification of krill and copepods using this MFI in particular has been validated directly in the study area region (Jech et al. 2018). That said, one goal of incorporating the MFI outputs into this modeling process was to test its applicability to marine mammal distribution in our region.
After classifying the backscatter to scattering type, S v (dB re 1 m −1 ) was extracted from the frequency most associated with that particular scattering type (Table 1). For example, the S v from 38 kHz was extracted for the areas classified as small resonant bubbles because the small bubbles associated with these organisms had their peak resonance at 38 kHz. We also used the 38 kHz S v for fish with swimbladders, since this frequency has often been used to represent these fish in other studies, particularly when paired with 120 kHz data (Logerwell & Wilson 2004, Simmonds & MacLennan 2005, Jech & Michaels 2006. The 120 kHz S v was used for krill and copepods, as this frequency is often used for distinguishing euphausiids (Jech et al. 2018). The 200 kHz was used for fish without swimbladders because they are weak scatterers, their modeled frequency response was highest at 200 kHz, and this frequency was recommended for mackerel assessments (Trenkel & Berger 2013, Scoulding et al. 2017. These processing steps resulted in 3 backscatter categories at the spatial resolution of the raw data: MFI prey classification, prey structure, and individual frequency responses. The data output of these processing steps was binned to cells 1000 m in length and either 50 or 200 m in depth to output summary values per cell from the raw data. These variables were used individually (e.g. 38 kHz frequency response in the 400−600 m depth bin) or in combination (e.g. MFI prey and water column structure representing the dispersion of fish with swimbladders in the 0−200 m depth layer, or the index of aggregation for 38 kHz backscatter in a 0−50 m depth bin).
Not all variables were used for each depth bin. As depth increases, echosounder frequency response is increasingly affected by absorption, resulting in lower signal-to-noise ratios that are increasingly difficult to distinguish from background noise. The higher the frequency, the more strongly the signal is affected. Through examination of echograms and background noise calculations, we determined cut-off depths for each frequency. In the 50 m depth bins, the 200 kHz frequency response was limited to 200 m, the 120 kHz frequency response was limited to 300 m, the 70 kHz was limited to 400 m, and only 18 and 38 kHz frequency responses included data deeper than 600 m. For the 200 m depth bins, we retained all data from the 18 and 38 kHz frequencies down to 2000 m. For the 70, 120, and 200 kHz frequencies, we retained data through the top 600 m, recognizing that only the stronger frequency responses would be reflected in the deeper bins. MFI variables were limited to the top 200 m because the Trenkel & Berger (2013) MFI incorporates all 5 available echosounder frequencies on the 'Bigelow' (18,38,70,120,and 200 kHz), and the MFI assumes equal relative contributions from each frequency.
Ocean depths over the course of the transects ranged from less than 100 m to over 2000 m, resulting in missing values for deeper depth bins. Missing S v values were set to noise values of −105 dB re 1 m −1 because values used in the analysis were limited to those greater than −100 dB re 1 m −1 . Index of aggregation, evenness, inertia, proportion occupied, and s a were set to zero when missing values were present. Center-of-mass variables were set to the center of the respective depth bin as a neutral point, although in practice these variables were only considered for modeling when there were few or no missing values.

Marine mammal modeling
GAMs (Hastie & Tibshirani 1990, Wood 2017 were built to explain the density of marine mammals observed along the transects, using the 'mgcv' package (Wood 2011) implemented in the R statistical lan-guage (version 4.2.2; R Core Team 2022). We built density models for 7 commonly detected species and 3 guilds. The individual species models included short-beaked common dolphins Delphinus delphis (hereafter referred to as common dolphins), bottlenose dolphins Tursiops truncatus, Risso's dolphins Grampus griseus, pilot whales Globicephala spp., sperm whales Physeter macrocephalus, fin whales Balaenoptera physalus, and humpback whales Mega ptera novaeangliae. Models for beaked whales (Ziphiidae) and individual species models for striped dolphins Stenella coeruleoalba were not completed because the sample size was not large enough to model those species groups accurately. We then built models for 3 species guilds based on foraging and dive preferences, which allowed us to include additional species for which the sample size was too small to build an individual species model. The 3 guilds consisted of (1) dolphins, which included common, bottlenose, and striped dolphins; (2) rorquals, which included fin, sei, humpback, and minke Balaenoptera acutorostrata whales; and (3) deep-diving squid-focused species, which included Risso's dolphins, pilot whales, and sperm whales.
We calculated species density estimates using mark−recapture distance sampling (Thomas et al. 2010) as implemented in version 2.2.4 of the 'mrds' R package. This aligns with previous methods used to create abundance estimates and distribution maps with AMAPPS data (Chavez-Rosales et al. 2019, Palka et al. 2021a,b). We fit marine mammal sightings data used in this study to previously established AMAPPS distance models. Model specifications can be found in Palka et al. (2021a), and a detailed discussion of this distance modeling approach can be found in Laake & Borchers (2004). Specifically, we used distance sampling to estimate the probability of detection, p(a), at distance a for each sighting. A sighting in this case was defined as either an observation of one animal of a particular species or a group of animals of the same species group that were sighted together in roughly the same location. The p(a) values for each sighting in a cell were summed by species group to obtain a total p(a) value for each species group observed in a particular cell. We then used this probability to convert counts of sighted marine mammals per cell to densities using the following formula: (1) where D is the density of a particular marine mammal in a specified cell, the numerator on the right side of the equation is the number of observed animals (S is the number of sightings per cell, G is the mean group size per cell), and the denominator is the effective area searched (L is the length of the cell [1000 m], and the number 2 accounts for 2 sides of the ship because p(a) applies to only one side of the ship). Estimating p(a) involved applying truncation distances specific to each species group such that some animals that were sighted were not included in the analysis; sighted animals were not included when the distance resulted in a probability of detection that was very small. Density estimation for most species applied the AMAPPS models from Palka et al. (2021b) to the shelf break transects used in this study. We took this approach to best align established and validated models with the data used in this study. However, we needed to include data from the full AMAPPS study area -which also includes transects extending further off the shelf break -for 2 species, humpback whales and common dolphins, in order to attain a satisfactory model fit.
Density models were fit using the restricted maximum likelihood criterion with the Tweedie family, allowing for flexibility among candidate distributions and the ability to deal with excess zeros and overdispersed data. Plots for individual variables were smoothed with thin plate regression splines, including an additional null space penalization that allowed the smoothed term to reduce to a linear fit, which provided additional restraint against overfitting (Wood 2003).
Model building was conducted in a stepwise fashion whereby the best-fitting single variable was selected as the first variable; the remaining subset of variables were then assessed for fit with the first variable, and the process was repeated if inclusion of additional variables improved the fit. In selecting variables for inclusion, we first sorted by ascending Akaike's information criterion (AIC) values and model deviance explained, creating a subset of models to consider. We then examined potential models through summary statistics such as p-values (< 0.01), plots assessing model fit, and whether the relationship of the variable and species density made ecological sense based on likely foraging depths, potential prey sources, known ocean depth preferences, and general knowledge of the species.
For example, variables representing krill and copepod density were considered unlikely for deepdiving species that often forage on squid but were considered reasonable for rorqual models. Indicators of deep scattering layers were considered likely for deep-diving species but less likely for rorquals. Fish concentrations were considered for all species, although small resonant bubbles that may represent phytoplankton, fish larvae, or siphonophore distributions were discounted for all models, given that they would likely represent a secondary relationship as a food source for cetacean prey. Similarly, variables representing aggregations of prey were considered reasonable candidate variables. In most instances, low evenness and low inertia were considered likely candidate variables and interpreted as patchiness that could represent dense prey aggregations within a cell, although high evenness values could represent the presence of the deep scattering layer, particularly as the acoustic beam spreads out with depth and decreases in resolution. While we used a priori knowledge to guide our choice of model variables, if a particular relationship was strong and consistent, it was considered for inclusion in models even if the relationship was not immediately intuitive.
Fit of potential models was further examined by applying k-fold cross-validation with 1000 random data subsets to calculate a number of diagnostic tests, including relative mean absolute error (rel-MAE) and mean absolute percentage error (MAPE). Concurvity was examined in the candidate models to ensure that strongly related terms were not included in the same model. GAMs should be robust to correlation and concurvity, but models with worst-case concurvity values > 0.6 were discarded from consideration (Chavez-Rosales et al. 2019). The estimated degree of freedom in relation to the sample size was examined to ensure we did not overfit models. We did not impose an a priori limit on the number of variables in each model but instead stopped adding variables when the improvement of fit was marginal in terms of the metrics mentioned above.

RESULTS
GAMs built for 7 taxa and 3 species guilds included as many as 4 significant acoustic variables per model (Table 2). GAM plots of each significant variable provided for each species model showed 3 types of relationships: (1) a negative slope, (2) a positive slope, and (3) more complex curves (Figs. 2−4, S1-S4 & S6-S8 in the Supplement at www.int-res.com/articles/suppl/ m711p101_supp.pdf). Values near zero on the y-axis suggest a neutral relationship with the predictor variable and values greater than zero indicate positive relationships, with larger positive numbers suggesting a stronger relationship; conversely, values less than zero indicate negative relationships with the predictor variable, with more negative numbers suggesting a stronger negative relationship. For example, a de scending slope in a GAM plot from positive to negative values on the y-axis indicates a positive association between marine mammal abundance at low values of the predictor variable and a negative association with high values of the predictor variable.
The best models, defined as those that explained the most deviance, combined measures of water column prey structure and individual frequency responses and had positive relationships with suspected prey (Table 2). Models with the 7 taxa and the 3 species guilds generally reflected foraging preference through depth and prey preference. Models for deeper diving species typically contained variables in the region of the deep scattering layer, models for shallower diving species such as rorquals contained variables depicting the upper water column, and dolphins that have a more plastic foraging approach had a mix of variables representing different portions of the water column. Measures of prey patch aggregations were a key component in many models, usually with negative slopes relative to increasing evenness or inertia (Figs. 2, 3, S1, S3 & S6−S8), implying a stronger association with patchiness that could represent dense aggregations of prey. MFI variables characterizing prey type also proved important (present in 8 out of 10 models), particularly in the form of variables that combined prey type with measures of prey aggregation (7 of 10 models; e.g. evenness of non-swimbladder fish at 10−50 m). Models for shallower diving groups such as rorquals and dolphins fit better than those for deeper diving species known to forage on squid.
For the individual taxa models, the pilot whale model was fit with only one variable; the sperm whale, bottlenose dolphin, and Risso's dolphin models included 2 variables; the humpback whale in cluded 3 variables; and the fin whale and common dolphin models included 4 variables (  Kinlan et al. 2012), with none ranked as poor.
All 10 models included variables that describe concentrations of organisms, including evenness (5 models), inertia (5 models), index of aggregation (1 model), proportion occupied (occupied area) (2 models), and center of mass (1 model) (Tables 1 & 2). The best-fitting models were for fin whales, humpback whales, rorquals, and common dolphins, which incorporated metrics describing water column structure, prey types, and either density or abundance. The fin whales, Risso's dolphins, common dolphins, pilot whales, and the deep-divers group all had positive associations with areas containing low evenness (Figs. 2, 3 . The fin whale, bottlenose dolphin, and deepdiver models incorporated measures of abundance (i.e. s a ), and the humpback whale, common dolphin, Risso's dolphin, dolphin guild, and rorqual guild models incorporated estimates of density (i.e. S v ) ( Table 2). Models for fin whales, humpback whales, common dolphins, Risso's dolphins, and all guild models included variables that combined prey classification and water column structure, and the bottlenose dol-phin model included prey classification with a measure of abundance (Table 2). The fin, humpback, and rorqual whale models included ocean depth, whereas the remaining models only incorporated variables derived from echosounding. For all these models, most observations occurred on the shelf in waters less than 200 m. The initial variable-selection options were too deep to make ecological sense and were effectively serving as a proxy for ocean depth, so we made the decision

DISCUSSION
We demonstrate that acoustically derived measures of prey spatial structure can be used to interpret marine mammal distribution. Acoustically derived measures of 2-dimensional prey distribution, used as covariates in GAMs, begin to address uncertainty in our understanding of how marine mammals use their habitat and are spatially organized in this region. This study demonstrates that acoustically derived prey covariates in abundance surveys can generate similar goodness-of-fit as traditional environmental covariates (Table 2) (Forney et al. 2012, Palacios et al. 2013, Becker et al. 2019.
Most recent attempts to model similar marine mammal density data for abundance estimation have relied almost entirely on parameters measuring surface or ocean floor characteristics, most of which were physical rather than biotic variables (Forney et al. 2012, Roberts et al. 2016, Becker et al. 2019, Chavez-Rosales et al. 2019. While there have been previous attempts to apply acoustically derived prey layers to abundance surveys, many of these efforts either did not significantly improve model performance or did not find spatial relationships between the acoustic variables and cetacean areas of interest (Lehodey et al. 2008, LaBrecque 2016, Roberts et al. 2016, Virgili et al. 2021. Comparison of our models to other marine mammal abundance models prepared from the same region with overlapping data further validate the potential of acoustic prey variables (Chavez-Rosales et al. 2019). The explained deviance between these 2 studies was comparable for rorquals and similar but somewhat lower for common dolphins, and poorer for species known to dive deep to forage on squid (Risso's dolphin, pilot whale, and sperm whale) ( Table 2). The rorqual guild model explained deviance similar to individual rorqual species models. The dolphin and deep-diver models explained less deviance than individual species models, perhaps due to modeling multiple species with somewhat different habi-tat preferences, although variable selection for species guilds generally reinforced the variable selection in individual species models.
Interpretation of prey distribution variables (Table 1) (e.g. Urmy et al. 2012) requires knowledge of each predator's biology, behavior, and possible relationships to prey distribution, as well as the spatial scale (i.e. range and resolution) at which we are making inferences. We assumed that marine mammal distribution is primarily driven by prey resources that can be observed through echosounding and, similarly, can be detected by odontocete echolocation. Our metrics are able to discern degrees of evenness and patchiness of potential foraging resources and infer some aspects of their composition. Through the use of these variables, we can assess our hypotheses and gain insights into how marine mammals interact with prey resources and how this shapes cetacean distribution.
We hypothesized that prey patchiness or consistency would be an important driver of marine mammal distribution and that it may vary by depth and individual frequency response; our model results bear that out. For shallower diving species, aggregated patches of prey within a portion of an acoustic cell may be more discernible than for deep-diving species and may denote important foraging areas with dense concentrations of prey. This is reflected in individual and guild rorqual models that include variables representing shallow aggregations of small schooling fish or krill, which fits with expected foraging (  (Flinn et al. 2002, Hazen et al. 2009, Derville et al. 2020, Jory et al. 2021. We surmise that deep-diving species may prefer an area depicted as an even distribution of organisms encompassing an entire acoustic cell that would represent the presence of the deep scattering layer. The spreading of the acoustic beams with increasing depth and the attenuation of higher frequencies with depth make it more difficult to discern individual prey patches in deeper layers, such that distinct prey patches may be smoothed over an acoustic cell. The penetration depths of higher frequencies limits individual frequency use and also restricts our ability to apply the MFI (e.g. Trenkel & Berger 2013) beyond 200 m depth. However, these classifications would likely still be depth-limited given the attenuation of high frequencies at depths that are likely needed to classify weaker scattering organisms such as squid. This highlights the need for echosounding systems that can be positioned in and near the deep scattering layers, rather than from surface-bound ships, to improve our classification capabilities (Cotter et al. 2021). This MFI is also lacking true classification for cephalopods, although the non-swimbladdered fish category may encompass squid. Future research should aim to include direct classification of cephalopods and deep foraging layers by applying acoustic squid models (e.g. Jones et al. 2009).
Despite these limitations, most of our models for deep-diving species reflected either the presumed foraging depth of the modeled species (Kawakami 1980, Gannon et al. 1997a,b, Watwood et al. 2006, Aguilar Soto et al. 2008, Smith et al. 2015, Quick et al. 2017, Arranz et al. 2019 or the region denoting the top part of the deep scattering layer, within which marine mammals may forage (Fig. 5). Additionally, inclusion of 70 and 120 kHz frequency response variables in the 150−400 m range in most of these deep-diver models suggests the importance of higher frequency response values in depths toward the end of their useful ranges. At these depths, weak signals would not be distinguished from noise and may indicate important habitats by only returning signals from dense concentrations of organisms or strongly reflecting organisms.
An additional challenge for this modeling approach involves species that primarily feed at night on diel-migrating prey, which could pose a temporal mismatch in prey and marine mammal sightings. For example, common dolphins feed on both epipelagic and mesopelagic schooling fish and squid, with most dives less than 100 m (Waring et al. 1990, Overholtz & Waring 1991, Lahaye et al. 2005, Pusineri et al. 2007. A relationship with patchy prey in the top 50 m may reflect shallower dives and a diversified daytime diet, but much feeding is thought to occur at dusk or at night as myctophids migrate upwards in the water column to feed (Waring et al. 1990, Lahaye et al. 2005, Pusineri et al. 2007). In the case of the common dolphin and other dolphins, this nighttime foraging was potentially accounted for by the positive association with deeper concentrations of prey at depth that would vertically migrate in the evening (Fig. 3). Incorporation of passive acoustics and studies that tag animals paired with echosounding would help clarify some of the relationships described in this paper (e.g. Hazen et al. 2009, Abecassis et al. 2015, Burrows et al. 2016) and, in particular, could also better assess prey relationships at night. Identification of portions of the deep scattering layer that migrate upwards in the evening could improve predictive models for species that appear to take advantage of this resource at night.
Diel migration could also potentially explain the lack of a krill variable in the individual humpback and fin whale models. Given that these observations were collected during the day in oceanic waters, concentrations of krill may have been deeper in the water column due to diel vertical migration (Gjøsaeter et al. 2017). However, in this case, this may not have been an issue because most of the observations were not in deep waters. In addition, other variables under consideration for inclusion in the fin whale model did show a positive relationship, with an increasing proportion of krill and copepods at 50−100 m (Fig. S5). In addition, there was a relationship in the final rorqual model that suggests an affinity for dense aggregations of krill in the top 200 m. The lack of a variable representing krill in the individual species models may be due to a relatively low sample size for the individual species models that was overcome with a larger sample size in the rorqual model.
While most model variables could be explained by prey type and dive depth, the inclusion of 18 kHz variables in dolphin and rorqual models is more challenging to explain. These results counterintuitively suggest that areas with low 18 kHz returns are favorable habitat. These same relationships resurfaced in both the individual and guild models for dolphins and rorquals (Figs. 2, 3 & S5-S7) and so are not expected to be a result of sample-size issues. Additional model plots of variables that were considered for the dolphin guild model suggest that high 18 kHz returns may be associated with cells where the area is fully occupied by consistent 18 kHz returns, therefore resulting in high mean density and abundance values but without indications of dense prey patches that could be distinguished within the cell (Fig. S9). Interpretation is complicated because individual frequency responses do not allow classification of the community of organisms. The returns observed here could be indicative of the shelf break oceanography, where there is a marked difference between the upper and lower water columns that may result in consistent areas with low backscatter in the upper water column, with animals feeding just below the upper layer in the more complex bottom half of the water column containing internal waves and presumed density differences where deep bottom water can be seen intruding from off the shelf (Fig. 6).
In future studies, areas off the shelf break should be included to allow for abundance estimations over a wider region. The shelf break region was chosen for this study because it encompassed a constrained area that could serve as a good test case given the range of habitats and depths encompassed there, although future work should include additional offshore regions. Future abundance estimation models should also combine acoustically sensed prey layers with more traditional variables, such as sea surface temperature and latitude, that could provide structure within which acoustic variables may become more informative, improving accuracy. In order to achieve prediction of distribution and abundance in areas not on the ship track, prey structure would need to be modeled using bathymetry, satellitederived oceanographic variables, and ocean models, as is done in the SEAPODYM model (Virgili et al. 2021).

CONCLUSIONS
This study demonstrated the application of acoustic classifications from echosounding data for use in modeling marine mammal abundance and suggests that echosounding-derived variables may be powerful predictors when paired with more traditional variables. We used backscatter from organisms of interest to model the fit between marine mammal abundance and potential prey characteristics such as prey availability and prey structure within the water column. All models incorporated at least one variable describing the structure of prey within acoustic cells, suggesting an affinity for dense aggregations of prey. Most models included at least one variable describing the type of potential prey in these cells, usually combined with characteristics describing the structure of prey aggregations. These models fit best for cetaceans that feed in the photic zone, with a poorer fit for species that forage heavily on cephalopods and those that feed deeper than 200 m, although overall depth preferences were reflected for these deeper diving species. Application of acoustically sensed prey layers could improve abundance models so that they are more responsive to changing ocean conditions and result in higher accuracy by directly modeling prey abundance. Improved abundance models should assist in managing human impacts on marine mammals as the changing climate shifts preferred habitats, and human demand for ocean-based energy resources increases.