Acoustic Classification of Juvenile Pacific Salmon (Oncorhynchus spp) and Pacific Herring (Clupea pallasii) Schools Using Random Forests

Rousseau, Shani; Gauthier, Stéphane; Neville, Chrys; Johnson, Stewart; Trudel, Marc

doi:10.3389/fmars.2022.857645

ORIGINAL RESEARCH article

Front. Mar. Sci., 11 July 2022
Sec. Marine Fisheries, Aquaculture and Living Resources
Volume 9 - 2022 | https://doi.org/10.3389/fmars.2022.857645

Shani Rousseau¹*

Stéphane Gauthier²,³

Chrys Neville⁴

Stewart Johnson⁴

Marc Trudel⁵

¹Maurice Lamontagne Institute, Fisheries and Oceans Canada, Mont-Joli, QC, Canada
²Institute of Ocean Sciences, Fisheries and Oceans Canada, Sidney, BC, Canada
³Department of Biology, University of Victoria, Victoria, BC, Canada
⁴Pacific Biological Station, Fisheries and Oceans Canada, Nanaimo, BC, Canada
⁵St. Andrews Biological Station, Fisheries and Oceans Canada, St. Andrews, NB, Canada

Acoustic surveys are the standard approach for evaluating many fish stocks around the world. The analysis of such survey data requires the accurate echo-classification of target species. This classification is often challenging as many organisms exhibit overlapping characteristics in terms of shape, acoustic amplitude, and behavior. In this study, a random forest approach was used to distinguish juvenile Pacific salmon (Oncorhynchus spp) from Pacific herring (Clupea pallasii) aggregations using the acoustic and morphological characteristics of their echo traces. The acoustic data were collected with an autonomous, multi-frequency echosounder deployed on the seafloor in the Discovery Islands, British Columbia, from May to September 2015. The model differentiated juvenile Pacific salmon from Pacific herring with 98% accuracy. School depth and school mean volume backscattering strength were the most important predictors in determining the school classification. This study supports other publications suggesting that random forests represent a promising approach to acoustic target classification in fisheries science.

Introduction

Acoustic surveys are commonly used to monitor fish and zooplankton in many parts of the world. These surveys can be conducted from a moving vessel (Johannesson and Mitson, 1983; Simmonds et al., 1991; Simmonds and MacLennan, 2005; Parker-Stetter et al., 2009) or from fixed platforms (Thomson and Allen, 2000; Kaartvedt et al., 2009; Sato et al., 2013). Acoustic surveys are an integral part of a number of fish stock management programs. Examples in Canada include the Pacific Hake joint stock assessment (Edwards et al., 2022), the Atlantic herring stock assessment in the northern Gulf of St. Lawrence (Chamberland et al., 2022), and the capelin stock assessment in Newfoundland (Bourne et al., 2021). In 2015, a project was initiated to monitor juvenile Pacific salmon during their northward out-migration to reach the Pacific Ocean from the Strait of Georgia through the Discovery Islands and Johnstone Strait. Several fixed, upward looking echosounders were deployed in small bays in order to track the migration timing and relative abundance of juvenile Pacific salmon through the area (Rousseau et al., 2018). The Discovery Islands archipelago has been characterized as a high nutrient low chlorophyll (HNLC) region limited by light availability, and has been suggested to constitute a bottleneck region for juvenile Pacific salmon due to reduced food availability (Mckinnell et al., 2014). Pacific herring is also common in the area (Haegele et al., 2005).

The study involved the recording of several months of acoustic data each year, and a need emerged for replicable, automated methods of data processing and classification. The identification and classification of fish species is a persistent challenge in fisheries acoustics research. In many cases, acoustic targets are classified through “expert scrutiny”, based on information gathered from various means of validation sampling such as pelagic trawl, seine, and underwater video cameras, as well as prior knowledge of the frequency response, schooling behavior and typical habitat of each species. In addition to being highly time-consuming, this method involves a significant level of subjectivity as it relies heavily on the analyst’s knowledge and experience, and each validation method is subject to its own bias (McClatchie et al., 2000; Fernandes, 2009; Boldt et al., 2019).

In recent years, efforts have been undertaken to automate part or all of the acoustic classification process in order to improve accuracy and replicability. Multi-frequency analysis can provide a way to increase objectivity and has been used successfully to differentiate krill from fish (Watkins and Brierley, 2002; De Robertis et al., 2010). However, when applied to more similar species, such as fish with a swim bladder or similar size classes of zooplankton, the success of this technique is often limited by the overlap in their frequency response (Lavery et al., 2007; Gauthier et al., 2014; Sato et al., 2015). Several attempts have been made to improve the classification process by introducing multiple predictors, such as combining acoustic response with morphological characteristics of the aggregations using linear statistical models (Lawson et al., 2001; Woodd-Walker et al., 2003), with promising results. More recently, several studies have explored machine learning methods such as neural networks and random forests as a way to integrate acoustic frequency response and school shape characteristics to automate and improve classification (Cabreira et al., 2009; Fallon et al., 2016; Brautaset et al., 2020; Proud et al., 2020).

In this study, we used acoustic data from one of the autonomous echosounders deployed in the Discovery Islands, British Columbia, to evaluate the use of a random forest classifier to distinguish juvenile Pacific salmon (Oncorhynchus spp) from Pacific herring (Clupea pallasii) aggregations using their acoustic response and the morphological characteristics of their echo traces. We developed a training dataset using acoustic data classified through expert scrutiny informed by net-based fish sampling from a purse seining program, a high-resolution imaging sonar, and prior knowledge from extensive herring survey monitoring programs in the Strait of Georgia and the west coast of Vancouver Island.

Materials and Methods

Data Sources

As part of a larger study aiming to better understand the early marine survival of juvenile Pacific salmon in the coastal waters of British Columbia, several autonomous, single-beam echosounders (Acoustic Zooplankton and Fish Profiler (AZFP), ASL Environmental Sciences) were deployed in the Discovery Islands between 2015 and 2020. The instruments were deployed on the seafloor looking upward. This enabled us to continuously monitor the relative abundance, distribution, and behavior of juvenile salmon through the area (Rousseau et al., 2018). In the lower mainland of British Columbia, a large number of rivers and streams, which support wild salmon populations, enter the Strait of Georgia, and the majority of juvenile salmon from these systems reach the Pacific Ocean by migrating north through the Discovery Islands and Johnstone Strait (Tucker et al., 2009; Beacham et al., 2014), a region located between the eastern side of central Vancouver Island and the British Columbia mainland (Figure 1). These areas are characterized by narrow restricted channels that have very high current velocities and often high wind conditions, and this often restricts the use of net-based surveys for juvenile salmon.

Figure 1 Location of AZFP mooring (black star) in the Discovery Islands, between Vancouver Island and the mainland of British Columbia

Data Collection

For the purposes of this study, we used data collected by one autonomous echosounder deployed in Okisollo Channel from May to September 2015. Okisollo Channel is a sheltered body of water separating the islands of Sonora and Quadra in the Discovery Islands, British Columbia (Figure 1). This site was the most accessible among all sites sampled, and in 2015 we were able to obtain bi-weekly fishing data and conduct bi-weekly high-resolution sonar surveys of the area during the expected migration window of juvenile salmon (Neville et al., 2016; Freshwater et al., 2019) to provide groundtruthing. The site was located in a small bay approximately 170 m from shore, at a bottom depth of 55 m. The echosounder operated at 4 frequencies (67, 125, 200, 455 kHz); however, only the three lower frequencies were used in this study. The highest frequency (455 kHz) exhibited attenuation beyond our acceptable 10 dB signal-to-noise ratio at ranges greater than 30 meters. Each transducer was calibrated by ASL Environmental Sciences using a calibrated hydrophone and transmitter in a freshwater tank. Calibration checks using a 12.7 mm diameter tungsten-carbide sphere were carried out each year (before and after each deployment) to ensure that measured outputs were within 1 dB of the sphere’s theoretical value on or near axis.

A sampling interval of 3 s and a vertical sampling of 0.09 m were chosen as a compromise between data resolution, battery consumption and data storage space. A pulse duration of 500 µs was used at 67 kHz, and 300 µs was used for higher frequencies. A digitization rate of 64 kHz was used for all frequencies.

Monthly average sound speed (Mackenzie, 1981) and absorption coefficient (Francois and Garrison, 1982) were calculated from temperature and salinity profiles collected between May and July 2015 (SBE-25 Sea-Bird Scientific). Monthly values for August and September were calculated from a linear interpolation of the May to July time series.

Purse seine surveys were conducted twice a week from May 12 to July 15 in 2015 following the known presence of juvenile salmon in the area. Sampling was performed with a small mesh purse seine on a commercial seiner during slack and low flow tides (Neville et al., 2016; Freshwater et al., 2019). The seine was equipped with a bunt of 7 mm mesh designed to retain juvenile salmon and other small pelagics. Purse seining was carried out close to the acoustic mooring site, often within 200 m distance and on a few occasions directly on top of the mooring. As the seine sampled the top 20 m of the water column, it is likely that Pacific herring, which is often found deeper in the water column during daytime (Thompson et al., 2016), was under-sampled compared to juvenile Pacific salmon. All species captured by the purse seiner were counted and identified (Table 1). Oncorhynchus spp and Clupea pallasii were the only fish species caught by the purse seine in the study area.

Additional information on Pacific herring acoustic signature and school characteristics was obtained from midwater trawl-verified echograms collected on mobile acoustic surveys off coastal British Columbia. These data were collected from the CCGS W.E. Ricker operating a hull-mounted Simrad EK60 echosounder at 38 and 120 kHz (see Boldt et al., 2016 and Gauthier et al., 2016 for a general description of the main surveys). These surveys confirmed that Pacific herring was found at depth during the day and was identified as high-density and vertically elongated schools in the acoustic echograms. Boldt et al. (2019) provide information regarding the challenges of target validation in Pacific herring and other forage fish acoustic surveys.
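As an illustration of the sound speed calculation mentioned in this section, the sketch below evaluates the nine-term Mackenzie (1981) equation for a set of profile samples and averages the result. The profile values are hypothetical, and averaging over the whole profile is an assumption made for the example; the text does not specify how the monthly averages were formed.

```python
def mackenzie_sound_speed(temp_c: float, salinity: float, depth_m: float) -> float:
    """Sound speed (m/s) from the nine-term Mackenzie (1981) equation."""
    t, s, d = temp_c, salinity, depth_m
    return (
        1448.96
        + 4.591 * t
        - 5.304e-2 * t**2
        + 2.374e-4 * t**3
        + 1.340 * (s - 35.0)
        + 1.630e-2 * d
        + 1.675e-7 * d**2
        - 1.025e-2 * t * (s - 35.0)
        - 7.139e-13 * t * d**3
    )


def profile_mean_sound_speed(profile) -> float:
    """Average sound speed over (temperature, salinity, depth) samples."""
    speeds = [mackenzie_sound_speed(t, s, d) for t, s, d in profile]
    return sum(speeds) / len(speeds)


# Hypothetical profile samples: temperature (deg C), salinity, depth (m).
example_profile = [(11.0, 28.5, 5.0), (9.5, 29.5, 25.0), (9.0, 30.0, 50.0)]
mean_c = profile_mean_sound_speed(example_profile)
print(f"mean sound speed: {mean_c:.1f} m/s")
```

The absorption coefficient of Francois and Garrison (1982) would be computed from the same profiles in a similar way; it is omitted here because its formulation is considerably longer.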

Table 1 Total number of juvenile Pacific salmon and Pacific herring individuals caught by the purse seiner in Okisollo Channel.

Small mobile surveys, using a vessel-mounted, side-looking imaging-sonar (Sound Metrics DIDSON), were also conducted in the area of the mooring and helped inform juvenile salmon target classification. The beam of the DIDSON was oriented horizontally from the starboard side of the vessel, with a detection window range of 5 to 10 meters. Surveys were conducted twice a week between June 11 and July 7 2015. Aggregations of juvenile salmon forming near the surface and detected by the DIDSON were visually confirmed by the vessel operator. The DIDSON data revealed that juvenile salmon aggregated rather loosely near the surface at our sampling site, in contrast to the denser, deeper schools typical of herring (Trumble and Humphreys, 1985).

Acoustic Data Analysis

The acoustic analysis was performed with Echoview (version 8.0) (Echoview Software Pty Ltd, 2015) and the R software for statistical computing (version 3.5.3) (R Core Team, 2019) with RStudio (version 1.1.463) (RStudio Team, 2018). Echograms showing the main steps of the data analysis process are displayed in Figure 2. Background noise was removed from the acoustic data by linear subtraction using the Background Noise Removal algorithm implemented in Echoview (De Robertis and Higginbottom, 2007). Thresholds for maximum estimated noise were -125 dB at all three operational frequencies and were determined empirically by estimating the volume backscattering strength in the background signal where no biological targets were detected. A signal-to-noise ratio of 10 dB specified the acceptable limit for a signal to be deemed distinguishable from noise.
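The noise removal step can be illustrated with a minimal sketch of linear-domain subtraction followed by a signal-to-noise test. This is not the Echoview implementation: the function names, the no-data placeholder, the absorption value, and the assumption that the noise estimate is referenced to 1 m and scaled to range with the usual time-varied gain terms are all choices made for this example.

```python
import math

NO_DATA = -999.0  # placeholder for rejected samples (assumed convention)


def noise_at_range(noise_1m_db: float, range_m: float, alpha_db_per_m: float) -> float:
    """Scale a noise estimate referenced to 1 m to a given range (assumed TVG form)."""
    return noise_1m_db + 20.0 * math.log10(range_m) + 2.0 * alpha_db_per_m * range_m


def remove_background_noise(sv_db: float, noise_db: float, min_snr_db: float = 10.0) -> float:
    """Subtract noise in the linear domain, then reject samples with low SNR."""
    signal_lin = 10.0 ** (sv_db / 10.0) - 10.0 ** (noise_db / 10.0)
    if signal_lin <= 0.0:
        return NO_DATA  # the sample is at or below the noise level
    corrected_db = 10.0 * math.log10(signal_lin)
    snr_db = corrected_db - noise_db
    return corrected_db if snr_db >= min_snr_db else NO_DATA


# Hypothetical absorption coefficient of 0.02 dB/m at a range of 50 m.
noise_50m = noise_at_range(-125.0, 50.0, 0.02)
strong = remove_background_noise(-60.0, noise_50m)  # kept, almost unchanged
weak = remove_background_noise(-85.0, noise_50m)    # rejected, SNR below 10 dB
```

In this sketch a -60 dB sample at 50 m is barely altered by the subtraction, while a -85 dB sample at the same range falls below the 10 dB signal-to-noise limit and is discarded.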

Figure 2 Steps of the acoustic data analysis: raw data at 67, 125 and 200 kHz, respectively (A–C); raw data after background noise has been removed and the surface and near-field have been masked (D–F); sum of all three echograms and school detection (G); and school regions applied to echograms (H–J). The colorbar shows the volume backscattering strength (dB re 1 m⁻¹). The colorbar on the right-hand side also applies to echograms (A–F), while echogram (G) shows a different scale. The vertical range represented in the echograms is approximately 50 meters. The echograms cover a period of 20 minutes on May 13, 2015 (herring school), and a period of 30 minutes on May 19, 2015 (juvenile salmon schools).

A multi-frequency method developed by Fernandes (2009) was used to remove data outside of fish schools for the purpose of improving the single-frequency SHAPES algorithm (Coetzee, 2000) implemented in Echoview for school detection (Barange, 1994). This method proved efficient at removing all non-fish signals, as well as several bands of noise caused by side lobes and multiple surface reverberations present in the 67 kHz echograms (see Rousseau et al., 2018 for details). Acoustic data at 67, 125 and 200 kHz were thresholded to -70 dB to remove non-fish echoes and summed across all frequencies. The resulting combined virtual echograms were thresholded empirically to -180 dB. A 5x5 median convolution kernel was applied to remove single targets and noise spikes, followed by a 7x7 dilation convolution kernel to compensate for any loss of data within schools caused by the previous filtering steps. Finally, a mask was applied to the raw data to remove all acoustic signal excluded through the previous steps.
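The masking sequence described above (threshold, sum, threshold, 5x5 median, 7x7 dilation) can be sketched on a small synthetic echogram. This is a simplified illustration rather than the Echoview operator chain: the floor value used for sub-threshold samples, the rule that a sample below threshold at any frequency is dropped from the sum, and the edge padding are assumptions, and the demonstration grid is hypothetical.

```python
from statistics import median

FLOOR = -999.0  # value assigned to samples below a threshold (assumed convention)


def threshold(grid, min_db):
    """Replace samples below min_db with FLOOR."""
    return [[v if v >= min_db else FLOOR for v in row] for row in grid]


def sum_frequencies(grids):
    """Sum Sv (dB) sample by sample; a FLOOR at any frequency gives FLOOR."""
    rows, cols = len(grids[0]), len(grids[0][0])
    out = [[FLOOR] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            vals = [g[r][c] for g in grids]
            out[r][c] = FLOOR if FLOOR in vals else sum(vals)
    return out


def convolve(grid, size, func):
    """Apply func to each size x size neighbourhood, padding edges with FLOOR."""
    half = size // 2
    rows, cols = len(grid), len(grid[0])
    out = [[FLOOR] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                grid[rr][cc] if 0 <= rr < rows and 0 <= cc < cols else FLOOR
                for rr in range(r - half, r + half + 1)
                for cc in range(c - half, c + half + 1)
            ]
            out[r][c] = func(window)
    return out


def school_mask(sv_67, sv_125, sv_200):
    """Return a boolean grid that is True where samples are retained."""
    summed = sum_frequencies([threshold(g, -70.0) for g in (sv_67, sv_125, sv_200)])
    summed = threshold(summed, -180.0)
    smoothed = convolve(summed, 5, median)  # removes isolated targets and spikes
    dilated = convolve(smoothed, 7, max)    # restores school edges lost above
    return [[v > FLOOR for v in row] for row in dilated]


def demo_grid():
    """Hypothetical 15 x 15 echogram: a 7 x 7 school plus one isolated spike."""
    grid = [[-90.0] * 15 for _ in range(15)]
    for r in range(4, 11):
        for c in range(4, 11):
            grid[r][c] = -50.0
    grid[0][14] = -40.0
    return grid


mask = school_mask(demo_grid(), demo_grid(), demo_grid())
```

In the demonstration grid the isolated spike is removed by the median kernel, while the corners of the school that the median kernel erodes are recovered by the dilation.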

Fish schools were detected on the masked raw data thresholded at -70 dB. An imaginary GPS linear track of 1 knot (0.51 m/s) was generated to convert the time units into virtual distance, since Echoview required GPS input to apply the school detection algorithm. A minimum horizontal threshold corresponding to 29 seconds and a minimum vertical threshold of 1 m were selected for fish school detection.
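Because the mooring is stationary, the horizontal detection thresholds are durations expressed as distances along the imaginary 1-knot track. The short sketch below shows the conversion; the function and variable names are illustrative only.

```python
KNOT_IN_M_PER_S = 1852.0 / 3600.0  # one nautical mile per hour, about 0.514 m/s


def virtual_distance_m(duration_s: float, speed_knots: float = 1.0) -> float:
    """Distance covered along the imaginary track during duration_s."""
    return duration_s * speed_knots * KNOT_IN_M_PER_S


min_school_length_m = virtual_distance_m(29.0)  # about 14.9 m of virtual track
ping_spacing_m = virtual_distance_m(3.0)        # about 1.5 m between pings
```

Under this conversion, the 29-second minimum corresponds to roughly 15 m of virtual track, or about ten pings at the 3-second sampling interval.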

The depth of each school was determined by subtracting its mean range from the range of the acoustically detected surface. Schools were detected between sunrise and sunset hours only. At night, both juvenile salmon and herring lose their schooling behavior and tend to spread out in scattering layers, making it more difficult to distinguish the two species.

Schools were classified through expert scrutiny based on the groundtruth information gathered from the on-site imaging sonar and purse seine surveys, as well as information from trawl-verified Pacific herring schools along the coast of Vancouver Island. All salmon species were included in the same class, as their echoes and behavior are likely too similar to allow for an acoustic classification.

The pairwise differences in mean volume backscattering strength between the three frequencies (ΔMVBS) were calculated (in the logarithmic domain) for each school and used as predictor variables in the random forest. The transducers’ beam width at half power varied from 10° at 67 kHz to 8° and 9° at 125 kHz and 200 kHz, respectively, resulting in a maximum diameter difference of approximately 2 meters at the surface. The relatively long period for school detection (29 seconds at a minimum) ensured that the school occupied the transducers’ beam footprint, and that the packing density (fish m^-3) inside the different beams was comparable.
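The beam footprint figure quoted above follows from simple cone geometry, and ΔMVBS is a subtraction of mean values in dB. The sketch below reproduces both, assuming the surface lies 55 m from the transducers (the bottom depth at the mooring) and using hypothetical mean S_v values for one school.

```python
import math


def beam_diameter_m(range_m: float, beam_width_deg: float) -> float:
    """Diameter of a conical beam at a given range (half-power beam width)."""
    return 2.0 * range_m * math.tan(math.radians(beam_width_deg / 2.0))


BEAM_WIDTH_DEG = {67: 10.0, 125: 8.0, 200: 9.0}
SURFACE_RANGE_M = 55.0  # assumed equal to the bottom depth at the mooring

diameters = {f: beam_diameter_m(SURFACE_RANGE_M, w) for f, w in BEAM_WIDTH_DEG.items()}
max_difference_m = max(diameters.values()) - min(diameters.values())


def delta_mvbs(mean_sv_db: dict) -> dict:
    """Pairwise differences in mean Sv (dB), computed in the logarithmic domain."""
    return {
        "67-125": mean_sv_db[67] - mean_sv_db[125],
        "67-200": mean_sv_db[67] - mean_sv_db[200],
        "125-200": mean_sv_db[125] - mean_sv_db[200],
    }


# Hypothetical mean Sv values (dB) for a single school.
example = delta_mvbs({67: -52.0, 125: -54.5, 200: -55.0})
```

With these beam widths the footprint at the surface is about 9.6 m at 67 kHz and 7.7 m at 125 kHz, which gives the difference of approximately 2 m.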

The following 23 variables were exported from Echoview for each school: mean S_{v 67kHz} (dB), mean S_{v 125kHz} (dB), mean S_{v 200kHz} (dB), minimum S_{v 67kHz} (dB), maximum S_{v 67kHz} (dB), thickness (vertical extent) (m), perimeter (m), area (m²), skewness of S_{v 67kHz} distribution, kurtosis of S_{v 67kHz} distribution, coefficient of variation of S_{v 67kHz} distribution, horizontal roughness coefficient of the S_{v 67kHz} distribution (dB re 1 m² m^-3), 3D volume (m³), area backscattering coefficient at 67 kHz (ABC, m² m^-2), mean distance from transducer (m), mean depth (m), number of samples, time interval (seconds), date, time of day, ΔMVBS_67-125kHz (dB), ΔMVBS_67-200kHz (dB), and ΔMVBS_125-200kHz (dB). Unless otherwise stated, predictors derived from the acoustic data were calculated from the school’s signal at 67 kHz, because swimbladdered fish backscatter is slightly stronger at lower frequencies (Lavery et al., 2007).

Random Forest Model

Classification and regression trees are an increasingly popular statistical approach and have been used notably in ecology (De’ath and Fabricius, 2000), psychology and medical sciences (Strobl et al., 2009). This approach is particularly useful in cases where the data are complex, relationships are non-linear, and the number of predictor variables is high (Breiman et al., 1984). Random forest models, in particular, are easy to implement, with very little tuning required (Hastie et al., 2009). They are robust to correlations and interactions between variables, and do not overfit as more trees are added (Breiman, 2001).

A random forest is a modification of bagging in which a collection of classification trees, grown on bootstrap samples of observations, cast a vote on the most popular class. During the tree growing process, each node is split using a random subset (m) of predictor variables (p). The unpruned trees and ensemble averaging work to reduce bias (the difference between the true mean and the average of the estimate) and variance (the expected deviation around its mean) (Hastie et al., 2009).

The random forest analysis was implemented in the R software environment using the party package (Hothorn et al., 2006; Strobl et al., 2007; Strobl et al., 2008). The cforest function in the party package uses conditional inference, an unbiased recursive partitioning algorithm, to select predictor variables through permutation-based significance tests. Strobl et al. (2007) demonstrate that this unbiased recursive partitioning method, applied by subsampling without replacement rather than bootstrapping, results in unbiased variable selection, in contrast to the CART algorithm implemented in the randomForest package (Liaw and Wiener, 2002).

The dataset was composed of 2659 acoustic regions (see Figure 2) representing 343 herring schools and 2316 juvenile salmon schools. Because such an imbalanced dataset can lead to poor prediction performance (Chen et al., 2004), we chose to downsample the dominant class, as it was shown to lead to better performance than over-sampling (Drummond and Holte, 2003). A training dataset was first created by randomly extracting 80% of the schools from each class, and then extracting a 15% subsample from the juvenile salmon class. As a result, 80% of the data corresponding to herring was used in the training dataset, in contrast to 12% for the juvenile salmon schools. The remaining 20% of data from each class was used for validation of the random forest model.

The 23 variables initially selected were evaluated for collinearity to eliminate redundancy. Random forests do not require that all variables be non-correlated; however, correlated variables may make the importance order of each variable unclear (Breiman, 2001). The following diagnostics were used to assess collinearity: partial correlation (Whittaker, 1990), collinearity in a linear model (Chambers et al., 1992), and the variance inflation factor (Naimi et al., 2014). As a result, the following 13 non-correlated predictors were retained for the analyses (Table 2): mean S_{v 67kHz}, mean S_{v 125kHz}, minimum S_{v 67kHz}, ΔMVBS_67-125kHz and ΔMVBS_67-200kHz, thickness, area, kurtosis, coefficient of variation, mean depth, time interval, date and time of day.

Several parameters can be tuned to optimize the random forest model. In practice, values should be selected so as to minimize the out-of-bag error estimate (oob error). The oob error is a means to estimate the accuracy of the model (Liaw and Wiener, 2002; Hastie et al., 2009). Its value is obtained by dividing the number of misclassifications by the total number of samples. The oob error becomes almost identical to a K-fold cross-validation as bootstrap samples increase (Hastie et al., 2009). For node size, Hastie et al. (2009) recommend a value of 1, the minimum size of the terminal nodes of the forest. Increasing this number causes smaller trees to be grown. The choice of the number of trees to generate (ntree) should minimize the oob error; however, a larger number of trees is preferable to increase the accuracy of variable importance (Breiman, 2002). The variable importance describes the ranked importance of each variable in improving classification accuracy. For classification, the default value for the number of predictors m randomly selected at each node is √p, but it should be chosen in order to minimize the oob error. Increasing m will decrease bias but increase variance (Hastie et al., 2009). In situations where few variables are relevant in the prediction process, a very small m may result in a decrease in prediction accuracy, because many of the trees being built will not incorporate any of the relevant variables (Díaz-Uriarte and Alvarez de Andrés, 2006).

Here, an m value of 2 predictors per split was selected as it minimized the oob error. For the number of trees, the oob error was computed for ntree up to 2000. The oob error decreased rapidly until it reached a minimum around ntree = 50. It reached stability at approximately 1250 trees. To ensure the accuracy of variable importance and the stability of the model, we chose an ntree value of 1500, as it did not increase computing time significantly.

Variable importance was measured using the permutation accuracy importance method. This method as well as several alternatives are discussed in Strobl et al. (2007, 2008, 2009). The permutation accuracy importance looks at the difference in accuracy before and after permuting each predictor variable to assess the importance of each variable in predicting the classification. This method allows for an unbiased measure of variable importance compared to other methods such as the Mean Decrease in Accuracy and the Gini Index, in particular in cases where missing values are present, when predictor variables vary in their scale of measurement or their number of categories, and when some of the predictor variables are correlated.

We evaluated model performance with three different metrics: accuracy (K = 1 – oob), true skill statistic (TSS = Sensitivity + Specificity - 1, Allouche et al., 2006) and area under the receiver operating characteristic (ROC) curve (hereafter AUC), which represents the true positive rate (TPR, or sensitivity) plotted against the false positive rate (FPR, 1 – specificity) over a continuum of thresholds (Fielding and Bell, 1997; Kuhn and Johnson, 2013). Accuracy is the most direct way to evaluate the model; however, ROC curves and TSS are insensitive to class imbalance (Allouche et al., 2006; Kuhn and Johnson, 2013). Allouche et al. (2006) show that TSS is correlated with AUC, as they are both derived from sensitivity and specificity. Nevertheless, we present and discuss both metrics here. For the purpose of specificity and sensitivity calculations, juvenile salmon schools were selected as the primary class (positives), while herring schools were selected as the secondary class (negatives).
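The stratified split and downsampling of the dominant class described in this section can be sketched as follows. The analysis itself was run in R; this standard-library Python version only illustrates the sampling logic, with a hypothetical function name, seed and rounding rule. The resulting counts depend on that rounding and on details not given in the text, so they need not match the dataset sizes reported in Tables 4 and 5.

```python
import random


def split_and_downsample(labels, train_frac=0.8, keep_frac=0.15,
                         majority="salmon", seed=1):
    """Return (training indices, validation indices) for a list of class labels."""
    rng = random.Random(seed)
    train, validation = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, lab in enumerate(labels) if lab == cls]
        rng.shuffle(idx)
        n_train = round(train_frac * len(idx))
        cls_train, cls_valid = idx[:n_train], idx[n_train:]
        if cls == majority:
            # Keep only a subsample of the dominant class for training.
            cls_train = cls_train[: round(keep_frac * len(cls_train))]
        train.extend(cls_train)
        validation.extend(cls_valid)
    return train, validation


labels = ["herring"] * 343 + ["salmon"] * 2316
train_idx, valid_idx = split_and_downsample(labels)

n_herring_train = sum(labels[i] == "herring" for i in train_idx)
n_salmon_train = sum(labels[i] == "salmon" for i in train_idx)
salmon_fraction_used = n_salmon_train / 2316  # about 0.12, as stated in the text
```

Keeping 15% of the 80% training share of the salmon class is what yields the 12% figure quoted above (0.80 x 0.15 = 0.12), and it brings the two classes to a similar size in the training set.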

Table 2 Variables included in each random forest model.

Results

When including all non-correlated variables, the accuracy measures on the validation dataset were 98.1, 97.8 and 98.9% for K, TSS and AUC, respectively (Table 3). Table 4 shows the confusion matrix and the oob error for the training and validation datasets. Mean depth was the most important variable, followed by mean S_{v 67kHz} and date. Coefficient of variation, mean S_{v 125kHz}, ΔMVBS_67-200kHz, ΔMVBS_67-125kHz and thickness followed, but their order varied slightly depending on the random selection of the training dataset. Minimum S_{v 67kHz}, area, time interval, time of day and kurtosis were the least important predictors (Figure 3).
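The ranking above rests on the permutation accuracy importance described in the Methods. A minimal sketch of the idea is given below, using a stand-in rule in place of the fitted forest and hypothetical data. In a real random forest the accuracy drop is computed per tree on out-of-bag samples, and the party package also offers a conditional variant, neither of which is reproduced here.

```python
import random


def accuracy(predict, rows, labels):
    """Fraction of rows for which the prediction matches the label."""
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(predict, rows, labels, n_repeats=20, seed=0):
    """Mean drop in accuracy after shuffling each predictor in turn."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importance = {}
    for name in rows[0]:
        drops = []
        for _ in range(n_repeats):
            shuffled = [r[name] for r in rows]
            rng.shuffle(shuffled)
            permuted = [dict(r, **{name: v}) for r, v in zip(rows, shuffled)]
            drops.append(baseline - accuracy(predict, permuted, labels))
        importance[name] = sum(drops) / n_repeats
    return importance


def toy_predict(row):
    """Stand-in classifier (hypothetical rule, not the fitted forest)."""
    return "salmon" if row["mean_depth"] < 15.0 else "herring"


# Hypothetical schools: shallow salmon, deep herring, uninformative time of day.
rows = [{"mean_depth": 5.0 + i % 5, "time_of_day": float(i % 12)} for i in range(50)]
rows += [{"mean_depth": 30.0 + i % 5, "time_of_day": float(i % 12)} for i in range(50)]
labels = ["salmon"] * 50 + ["herring"] * 50

importance = permutation_importance(toy_predict, rows, labels)
```

Shuffling a predictor that the classifier relies on degrades accuracy, whereas shuffling one that it ignores leaves accuracy unchanged, which is why uninformative predictors fall at the bottom of the ranking.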

Table 3 Performance metrics of the validation dataset for each random forest model.

Table 4 Confusion matrix for random forest generated using training dataset of 555 and validation dataset of 528 randomly selected regions.

Figure 3 Importance of variables in the random forest model using all non-correlated variables (A) and 5 most important variables (B).

When conducting the random forest on the five most important variables (Table 5), K increased slightly to 98.3%, while TSS and AUC both decreased (93.0 and 96.5%, respectively). The variable importance remained unchanged, except for a swap between the mean S_{v 125kHz} and coefficient of variation variables. Kuhn and Johnson (2013) show that using non-informative predictors in random forests may lower performance. However, the decrease in the TSS and AUC values, originating from a low (0.94) specificity in the validation dataset, suggests that in this case, keeping all predictors did improve performance.
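To make the relationship between these metrics concrete, the sketch below computes accuracy, sensitivity, specificity and TSS from a confusion matrix, and AUC from class scores using its rank-based definition. The counts and scores are hypothetical and are not taken from Tables 4 or 5; juvenile salmon are treated as the positive class, as in the Methods.

```python
def performance(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Accuracy, sensitivity, specificity and TSS from confusion matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "tss": sensitivity + specificity - 1.0,
    }


def auc(scores_pos, scores_neg) -> float:
    """Probability that a random positive scores higher than a random negative."""
    wins = sum((p > n) + 0.5 * (p == n) for p in scores_pos for n in scores_neg)
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical counts: 100 salmon schools (positives), 50 herring schools (negatives).
example = performance(tp=95, fn=5, tn=45, fp=5)

# Hypothetical predicted probabilities of the salmon class.
example_auc = auc([0.9, 0.8, 0.4], [0.7, 0.3])

# Rearranging TSS = sensitivity + specificity - 1 with the rounded values
# quoted above (TSS = 0.930, specificity = 0.94) implies a sensitivity near 0.99.
implied_sensitivity = 0.930 - 0.94 + 1.0
```

Because the validation set contains far more salmon than herring schools, a handful of misclassified herring lowers specificity, and therefore TSS, much more than it lowers overall accuracy.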

Table 5 Confusion matrix for random forest generated using training dataset of 555 and validation dataset of 528 randomly selected schools.

Using only the two most important variables, mean S_{v 67kHz} and mean depth, all metrics remained close to the previous values (97.9, 93.8 and 96.9% for K, TSS and AUC, respectively). Using only the main morphometric variables (thickness, time interval and area) decreased the performance metrics to 89.2, 68.8 and 84.4% for K, TSS and AUC, respectively. Using only the main acoustic variables (mean S_{v 67kHz}, mean S_{v 125kHz}, ΔMVBS_67-125kHz, ΔMVBS_67-200kHz and minimum S_{v 67kHz}) resulted in values of 92.6, 90.3 and 95.1% for K, TSS and AUC, respectively.

TSS consistently provided the lowest performance metric, due to lower specificity values in the validation dataset. TSS decreased by 4.8 and 4% when removing all but five and two predictors, respectively. Run 5, which used only acoustic variables, presented the lowest sensitivity-to-specificity ratio, meaning that the model performed better at classifying herring than juvenile salmon in this case. AUC values were higher than TSS but led to similar conclusions in model performance.

Discussion

Variable Importance

School depth and mean S_{v 67kHz} were the dominant predictors to classify juvenile salmon and herring schools. Many other studies report the importance of school depth in classification accuracy (Lawson et al., 2001; Cabreira et al., 2009; D’Elia et al., 2014). Juvenile salmon were commonly found nearer the surface, whereas herring schools were most often found at mid and deep water during daytime. The acoustic signal of a herring school is generally strong, as they form tight schools and swim in a synchronized fashion (Blaxter, 1985). In contrast, our DIDSON surveys showed that juvenile salmon, although exhibiting strong directional swimming following a disturbance (i.e. avoidance reaction), generally formed loose aggregations and were less polarized, resulting in a generally lower mean S_{v 67kHz} and mean S_{v 125kHz}.

During this study, a temporal mismatch was observed between juvenile salmon and herring in the area (Figure 4). Juvenile salmon migrated through the area mainly between May and July, while herring, although present throughout the summer, were found in larger numbers in August and September. This explains the importance of the date predictor. Including data from several years may help improve the transferability of the model if the date variable is included, since the timing of juvenile salmon migration and the presence of herring in the area may change from one year to the next. The inclusion of environmental data, such as temperature, may also help improve classification, since it can influence migration timing (Corten, 2001; Sykes et al., 2009).

Figure 4 Nautical area backscattering coefficient (m² nmi^-2) for juvenile salmon (A) and herring (B).

The model results suggest that mean S_v is a better classifier than ΔMVBS in the case of juvenile salmon and herring schools. Mean S_{v 67kHz} and mean S_{v 125kHz} ranked second and fifth in prediction importance, respectively, while ΔMVBS_67-125kHz and ΔMVBS_67-200kHz ranked seventh and sixth, respectively. Indeed, the probability density functions of juvenile salmon and herring schools ΔMVBS show a greater overlap than that of their mean S_v (Figure 5).

Figure 5 Probability density functions of the mean volume backscattering strength of juvenile salmon and herring schools at 67, 125 and 200 kHz (A-C), and of the difference in mean volume backscattering strength between the 67, 125 and 200 kHz frequencies for juvenile salmon and herring schools (D-F).

The importance of the coefficient of variation reflects the schooling behavior of each species. The acoustic signal of herring schools was stronger at the center, possibly reflecting a higher fish density compared to along the edges, although this could be the result of beam volume averaging. In contrast, the acoustic signature of juvenile salmon schools was generally weaker and more uniform, following our findings from the DIDSON surveys that individuals in juvenile salmon aggregations swam rather loosely and did not exhibit strongly polarized swimming, except during escaping events.

Our results suggest that using all 13 non-correlated predictors selected for this study leads to the best model performance, mostly due to the higher error rates in the classification of the herring class when applying the model to the validation dataset, when variables are removed. Mean depth and mean S_{v 67kHz} were the most important variables in the classification process, and were sufficient on their own to provide a model accuracy greater than 90% with all performance metrics.

Advantage of Random Forests Over Other Classification Methods

Echoview offers a school classification algorithm that allows the user to conduct classification using many selection criteria, including the most important variables in this study, mean S_v and range (or depth). However, the thresholds for these criteria must be selected manually by the analyst, through trial and error, which makes the selection process subjective and time-consuming. The advantage of the random forest method is that variable interaction is taken into account, and that the threshold selection does not rely on the analyst, because it is chosen within the model to minimize error.
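The contrast with manual threshold selection can be illustrated with a single data-driven split. The sketch below picks the depth threshold that misclassifies the fewest schools in a small hypothetical dataset. It is only an analogy: the conditional inference trees in the party package choose splits through permutation-based significance tests, and a forest combines many such splits across predictors.

```python
def best_depth_threshold(depths_m, is_salmon):
    """Return the split depth that misclassifies the fewest schools."""
    levels = sorted(set(depths_m))
    candidates = [(a + b) / 2.0 for a, b in zip(levels, levels[1:])]

    def errors(threshold):
        # Rule: schools shallower than the threshold are called salmon.
        return sum((d < threshold) != s for d, s in zip(depths_m, is_salmon))

    return min(candidates, key=errors)


# Hypothetical mean school depths (m) and expert labels.
depths_m = [3.0, 5.0, 8.0, 12.0, 25.0, 30.0, 40.0]
is_salmon = [True, True, True, True, False, False, False]

threshold_m = best_depth_threshold(depths_m, is_salmon)  # 18.5 for these data
```

The threshold is obtained from the labelled data rather than by trial and error, so the same procedure gives the same answer for any analyst working from the same training set.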

Fallon et al. (2016) used a random forest approach to distinguish icefish, krill and mixed aggregations of weak scattering fish species and obtained an accuracy (K) of 95%. In their study, minimum S_v, mean aggregation depth (m), mean distance from the seabed (m), and geographic positional data were the most important variables in improving classification accuracy. Proud et al. (2020) also used a random forest model to distinguish Dagaa fish from other scatterers in Lake Victoria. Their most important variables were school length, school height and school Nautical Area Scattering Coefficient (NASC), whereas temperature, dissolved oxygen and turbidity were not important classification factors. Although environmental variables did not contribute significantly to the classification in their study, it may still be relevant to consider such variables since they are often measured in conjunction with acoustic surveys, are easily included in a random forest model, and can influence the distribution of many species of fish, including Pacific salmon and Pacific herring (Schweigert, 1995; Azumaya et al., 2007).
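Because reported accuracies depend on the metric used, it can help to see how several common metrics are derived from the same confusion matrix. The sketch below computes overall accuracy, Cohen's kappa and the true skill statistic (TSS; Allouche et al., 2006) for a two-class problem. The function name `binary_metrics` and the counts are hypothetical, and herring is arbitrarily treated as the positive class.

```python
from typing import Dict


def binary_metrics(tp: int, fn: int, fp: int, tn: int) -> Dict[str, float]:
    """Accuracy, Cohen's kappa and TSS from a 2x2 confusion matrix."""
    n = tp + fn + fp + tn
    accuracy = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    p_expected = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / n**2
    kappa = (accuracy - p_expected) / (1.0 - p_expected)
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    tss = sensitivity + specificity - 1.0
    return {"accuracy": accuracy, "kappa": kappa, "tss": tss}


# Hypothetical counts for an imbalanced validation set (not data from the study):
# 50 herring schools (positive class) and 150 juvenile salmon schools.
metrics = binary_metrics(tp=45, fn=5, fp=2, tn=148)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With imbalanced classes, overall accuracy can remain high even when the minority class is classified poorly, which is why kappa and TSS are often reported alongside it.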

A brief review of the literature reveals that in recent years, the development of machine learning methods to classify active acoustic data has converged toward random forests and deep neural networks (Fallon et al., 2016; Brautaset et al., 2020; Proud et al., 2020; Sarr et al., 2020; Marques et al., 2021; Blanluet et al., 2022). In their study, Sarr et al. (2020) obtained the best and most stable performance with a random forest model when the training dataset was small, while a deep neural network gave the best performance with a large training dataset. Woodd-Walker et al. (2003) also evaluated a simple artificial neural network technique with a small training dataset, but reported poor results for the least dominant class of their imbalanced dataset. Given that fisheries acoustic studies are often plagued by a lack of groundtruthing, random forests are likely the best candidate in many situations, although more studies comparing both methods would be required to draw conclusions. Using the same data source as this study, Marques et al. (2021) proposed a deep learning approach based on an instance segmentation framework to identify Pacific herring and juvenile Pacific salmon schools, using pixel-level annotations of 67 kHz echograms. While they obtained good results on unclassified schools (best performance of 92.12 mean average precision (mAP)), the performance was lower for classified schools (average precision (AP) of 60.18 and 40.19 for herring and juvenile salmon, respectively, using the same model configuration as for unclassified schools).

Deep neural networks have the advantage of requiring very little data pre-processing as they can draw information from the raw data (Brautaset et al., 2020), resulting in reduced analysis time and effectively removing challenges associated with bottom detection and school definition. On the other hand, random forests can incorporate a large number of variables to improve classification and provide information on the importance of each variable. This can prove advantageous not only to improve the classifier, but also to guide which variables should be monitored and sampled in the field in priority and which samples may be dropped with limited consequences on classification accuracy. It can also provide useful scientific insight into which environmental factors may be important in influencing fish population dynamics, thus contributing to the improvement of population models used in an ecosystem approach to fisheries management.
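One model-agnostic way to quantify the importance of a variable is permutation importance: shuffle one predictor at a time and record the resulting drop in accuracy. The sketch below is an illustration only and is not necessarily the importance measure used in this study. The fixed-threshold `toy_model`, the feature names and all values are hypothetical stand-ins for a trained classifier and real school descriptors.

```python
import random
from typing import Callable, Dict, List, Sequence

Row = Dict[str, float]


def accuracy(model: Callable[[Row], str], rows: Sequence[Row], labels: Sequence[str]) -> float:
    """Fraction of rows for which the model predicts the correct label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(
    model: Callable[[Row], str],
    rows: Sequence[Row],
    labels: Sequence[str],
    features: Sequence[str],
    repeats: int = 20,
    seed: int = 0,
) -> Dict[str, float]:
    """Mean drop in accuracy when each feature is shuffled across rows."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importance: Dict[str, float] = {}
    for name in features:
        drops: List[float] = []
        for _ in range(repeats):
            shuffled = [r[name] for r in rows]
            rng.shuffle(shuffled)
            # Copy each row with only the current feature replaced.
            permuted = [{**r, name: v} for r, v in zip(rows, shuffled)]
            drops.append(baseline - accuracy(model, permuted, labels))
        importance[name] = sum(drops) / repeats
    return importance


def toy_model(row: Row) -> str:
    """Hypothetical stand-in for a trained classifier: uses mean depth only."""
    return "herring" if row["mean_depth"] > 12.0 else "salmon"


# Hypothetical school descriptors (illustrative values, not data from the study).
depth_values = [3.0, 4.5, 5.0, 6.5, 18.0, 22.0, 25.5, 30.0]
length_values = [12.0, 30.0, 8.0, 21.0, 15.0, 9.0, 27.0, 18.0]
rows = [{"mean_depth": d, "school_length": s} for d, s in zip(depth_values, length_values)]
labels = ["salmon"] * 4 + ["herring"] * 4

importance = permutation_importance(toy_model, rows, labels, ["mean_depth", "school_length"])
print(importance)
```

A predictor the model never uses (here `school_length`) shows zero importance, which illustrates how such a measure can indicate which field measurements contribute little to the classification.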

Conclusion

To our knowledge, this study is the first to attempt to classify two swim bladdered fish species using a random forest approach. We automated the school detection and classification process using the multi-frequency method of Fernandes (2009) and Echoview’s school detection algorithm, followed by a random forest model, and were able to differentiate between juvenile salmon and herring schools with an accuracy of 98% or higher, depending on the performance metric used. For long-term monitoring studies, consistency in the classification is as important as accuracy. Machine learning methods such as random forests not only provide this consistency by removing human bias associated with manual post-processing of the acoustic data, but they also reduce data processing time significantly. Proper data labeling is critical to reduce bias and improve accuracy of the classification model. However, a model biased by errors in the validation of the acoustic data only propagates an uncertainty that was already present with the “expert scrutiny” approach. Thus, although not ideal, a machine learning approach trained with imperfectly validated data is still preferable, because it leads to a systematic bias rather than an unpredictable uncertainty that is linked to the analyst’s choices. Over time, the algorithm’s accuracy can be iteratively improved with the addition of newly validated data to the training dataset, and this updated algorithm can easily be applied to previously analyzed data to ensure consistency.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Ethics Statement

Lethal sampling of fish for inspection purposes, abundance estimates and other population parameters required for stock assessments is exempt from requiring an animal use protocol under the Department of Fisheries and Oceans' Pacific Region Animal Care Committee (PRACC).

Author Contributions

SR contributed to conceptualization, study design, investigation, methodology, data curation, formal analyses, programming, writing of the original draft and manuscript revision. SG contributed to conceptualization, investigation, study design, funding acquisition, project administration, resources, and manuscript revision. CN contributed to study design, investigation, funding acquisition and data curation. MT contributed to study design, funding acquisition and manuscript revision. SJ contributed to study design and funding acquisition. All authors contributed to the article and approved the submitted version.

Funding

Funding was provided by the Pacific Salmon Commission Southern Boundary Restoration and Enhancement Fund (grant number P-15-01-002). The purpose of this fund is to support activities that support salmon stocks and their habitat by developing improved information for resource management.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Acknowledgments

We thank George Cronkite, John Morrison, Ben Snow, Chelsea Stanley, and Svein Vagle for their assistance with acoustic-related field work, as well as Nordic Queen Captain Harol Sewig and his crew for their assistance with purse seine sampling. Julia Bradshaw, Dylan Conover, Cameron Freshwater, Yeongha Jung and Lenora Turcotte assisted with fish sampling and measurements in the field. Jackie Detering analyzed the DIDSON data. This work was supported by the Pacific Salmon Commission Southern Boundary Restoration and Enhancement Fund and the Aquaculture Collaborative Research and Development Program.

References

Allouche O., Tsoar A., Kadmon R. (2006). Assessing the Accuracy of Species Distribution Models: Prevalence, Kappa and the True Skill Statistic (TSS). J. Appl. Ecol. 43, 1223–1232. doi: 10.1111/j.1365-2664.2006.01214.x
