Evaluating machine-learning techniques for recruitment forecasting of seven North East Atlantic fish species
Introduction
Early in fisheries research, recruitment was identified as a key element in management. As a result, recruitment and the factors determining it have been the subject of intense research (e.g. Cushing, 1971, Myers et al., 1995, Ricker, 1954, Rothschild, 2000). Such research has evolved from considering only the biomass of spawners to also including environmental factors that can modulate recruitment (e.g. Planque and Buffaz, 2008, Schirripa and Colbert, 2006). From a data-analysis perspective, the main limitation to achieving good forecasts is the sparse and ‘noisy’ nature of the available data (Fernandes et al., 2010, Francis, 2006).
A further problem is that data on some of the factors that may control recruitment directly (e.g. food availability, larval growth) can be more laborious to obtain than the recruitment estimate itself (Irigoien et al., 2009, Zarauz et al., 2008, Zarauz et al., 2009). As a simplification, fisheries management has been moving towards the use of environmental relationships based on oceanographic data, which are collected routinely, as proxies of recruitment conditions (Bartolino et al., 2008, Borja et al., 2008, De Oliveira et al., 2005). Nevertheless, the problem remains difficult because the mechanisms behind such relationships are often poorly understood. This, in turn, makes it difficult to determine the robustness of forecast estimates, and some proposed relationships, methods and performance estimates have failed when new data became available (Myers et al., 1995). Such failures may be related to new controls that were not considered previously (Myers et al., 1995, Planque and Buffaz, 2008), or to limitations in the available data (Schirripa and Colbert, 2006).
Recruitment forecasting is a problem of high uncertainty (Mäntyniemi et al., in press). Machine-learning techniques have been proposed as an appropriate approach, with some desirable properties, to address such problems (Dreyfus-León and Chen, 2007, Dreyfus-León and Schweigert, 2008, Fernandes et al., 2010, Fernandes et al., 2013, Uusitalo, 2007). In this study, an update of a previously proposed machine-learning-based framework (Fernandes et al., 2010) is applied to several North Atlantic species of commercial interest, which share a spawning and nursing environment at the shelf break (Ibaibarriaga et al., 2007, Sagarminaga and Arrizabalaga, 2010). The main properties of this methodology are: (i) forecasts with an estimate of their uncertainty; (ii) forecasts and scenarios that are easy to interpret; (iii) recruitment and factor boundaries that can be interpreted easily; (iv) high stability of the selected factors, using a ‘leaving-one-out’ scheme; (v) error balanced across all recruitment levels; and (vi) robust and honest performance estimation.
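One standard way to score probabilistic forecasts such as those in property (i) is the Brier score, introduced in the forecast-verification paper that appears in the reference list. The following is a minimal sketch of the multi-class form; the class labels and probabilities are hypothetical, and it is not necessarily the exact variant used in this study.

```python
from typing import Dict, Sequence


def brier_score(forecasts: Sequence[Dict[str, float]],
                outcomes: Sequence[str]) -> float:
    """Mean multi-class Brier score: 0 is perfect, larger is worse."""
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("need one observed class per forecast")
    total = 0.0
    for probs, observed in zip(forecasts, outcomes):
        # Squared distance between the forecast probabilities and the
        # 0/1 vector that marks the class actually observed.
        total += sum((p - (1.0 if label == observed else 0.0)) ** 2
                     for label, p in probs.items())
    return total / len(forecasts)


# Hypothetical recruitment classes and probabilities, for illustration only.
sharp = [{"low": 0.8, "medium": 0.1, "high": 0.1}]
vague = [{"low": 1 / 3, "medium": 1 / 3, "high": 1 / 3}]
sharp_score = brier_score(sharp, ["low"])
vague_score = brier_score(vague, ["low"])
```

A forecast that concentrates probability on the observed class scores lower (better) than a uniform forecast, so the score rewards both accuracy and well-placed confidence.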
Within this context, this work has three aims: (i) to identify factors for forecasting the recruitment of North Atlantic species that share a spawning and nursing area; (ii) to propose a novel model that modifies the previous framework in order to produce more accurate probabilistic forecasts; and (iii) to provide a comparison between goodness-of-fit and generalization power, in order to assess the reliability of the final forecasting models. This comparison is necessary because the methods used are non-parametric and might over-fit the data. The three objectives are crucial to producing reliable forecasts that can be used for decision-making in the fisheries management of those species that share a spawning and nursing area.
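The gap between goodness-of-fit and generalization power can be illustrated with a deliberately simple non-parametric method. The sketch below uses a 1-nearest-neighbour classifier as a stand-in (it is not one of the classifiers used in this study) and hypothetical toy data. Accuracy on the training data is perfect by construction, because each point is its own nearest neighbour, whereas the leave-one-out estimate holds each point out before predicting it.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[Tuple[float, ...], str]


def predict_1nn(train: Sequence[Sample], x: Sequence[float]) -> str:
    """Label of the training point closest to x (squared Euclidean distance)."""
    nearest = min(train,
                  key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], x)))
    return nearest[1]


def resubstitution_accuracy(data: Sequence[Sample]) -> float:
    """Goodness-of-fit: predict each point with a model that has seen it."""
    hits = sum(predict_1nn(data, x) == y for x, y in data)
    return hits / len(data)


def leave_one_out_accuracy(data: Sequence[Sample]) -> float:
    """Generalization estimate: predict each point with it held out."""
    hits = 0
    for i, (x, y) in enumerate(data):
        train: List[Sample] = list(data[:i]) + list(data[i + 1:])
        hits += predict_1nn(train, x) == y
    return hits / len(data)


# Hypothetical toy data: one environmental index -> recruitment class.
data: List[Sample] = [
    ((0.10,), "low"), ((0.20,), "low"), ((0.35,), "high"), ((0.40,), "low"),
    ((0.60,), "high"), ((0.70,), "high"), ((0.75,), "low"), ((0.90,), "high"),
]
fit = resubstitution_accuracy(data)
loo = leave_one_out_accuracy(data)
print(f"goodness-of-fit accuracy: {fit:.3f}, leave-one-out accuracy: {loo:.3f}")
```

On noisy data such as this, the fit looks flawless while the held-out accuracy is much lower, which is why a fit statistic alone says little about the reliability of a forecast.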
Section snippets
Target species
The recruitment time series analysed, for North East Atlantic species that share the shelf break as spawning and nursing area, are summarized below: 1) The anchovy recruitment mixed time-series (ARM) is a combination of two anchovy recruitment time-series: the long anchovy recruitment index time-series (ARI; Borja et al., 1996), established from the percentage of age 1 in the landings (40 years), and the Anchovy Recruitment (AR; ICES, 2008a; 23 years). The resulting time-series contains 45 years
Pipeline comparison
Missing-data imputation can also be applied to the ‘NBC-Pipeline’; however, no significant improvement was observed. This result was expected, since NBC can be learned with missing data and no factor had a high proportion of missing values.
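To see why a naive Bayes classifier can be learned without imputation, note that each conditional probability is estimated from counts of a single factor, so a missing value can simply be left out of that factor's counts. The sketch below shows this available-case approach for discrete factors; the class name, the smoothing parameter `alpha` and the toy data are assumptions for illustration, and the source does not state how NBC handled missing values in this study.

```python
from collections import defaultdict
from typing import Dict, Optional, Sequence, Tuple

Row = Tuple[Sequence[Optional[str]], str]


class NaiveBayesWithMissing:
    """Categorical naive Bayes that skips missing (None) values."""

    def __init__(self, alpha: float = 1.0) -> None:
        self.alpha = alpha  # Laplace smoothing constant (assumed value)

    def fit(self, rows: Sequence[Row]) -> "NaiveBayesWithMissing":
        self.class_counts = defaultdict(int)
        self.value_counts = defaultdict(int)  # (factor, value, class) -> n
        self.observed = defaultdict(int)      # (factor, class) -> non-missing n
        self.values = defaultdict(set)        # factor -> values seen
        for features, label in rows:
            self.class_counts[label] += 1
            for j, v in enumerate(features):
                if v is None:
                    continue  # a missing value adds nothing to this factor
                self.value_counts[(j, v, label)] += 1
                self.observed[(j, label)] += 1
                self.values[j].add(v)
        return self

    def predict_proba(self, features: Sequence[Optional[str]]) -> Dict[str, float]:
        total = sum(self.class_counts.values())
        scores = {}
        for label, n in self.class_counts.items():
            p = n / total
            for j, v in enumerate(features):
                if v is None or not self.values[j]:
                    continue  # ignore factors that are missing or never observed
                num = self.value_counts[(j, v, label)] + self.alpha
                den = self.observed[(j, label)] + self.alpha * len(self.values[j])
                p *= num / den
            scores[label] = p
        z = sum(scores.values())
        return {label: p / z for label, p in scores.items()}


# Hypothetical discretized factors: (temperature, upwelling) -> recruitment.
rows = [
    (("warm", "strong"), "low"),
    (("warm", None), "low"),
    (("cold", "weak"), "high"),
    ((None, "weak"), "high"),
    (("cold", "strong"), "high"),
]
model = NaiveBayesWithMissing().fit(rows)
probs = model.predict_proba(("cold", None))
```

Rows with a missing factor still contribute to the class prior and to the counts of their observed factors, so no record is discarded and no value has to be invented.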
Both classifiers, NB and FNB, show a good fit for most of the species considered (Fig. 1). The ‘MIS + FNB-Pipeline’ produces the best fit for the seven species (Table 2). The most interesting property of this fit for fisheries management is
Discussion
The main contribution of this work is the application of the methodology developed in Fernandes et al. (2010) to a broad set of species, using a global set of variables. The forecast estimates for each species can be improved by applying more specific knowledge (more specific environmental data) to each species. However, the results show that, even with a global approach, useful information can be obtained by applying machine-learning techniques to the recruitment forecasting problem. The
Acknowledgements
The research of Jose A. Fernandes and Nerea Goikoetxea is supported by a Doctoral Fellowship from the Fundación Centros Tecnológicos Iñaki Goenaga. This study has been supported by the following projects: Ecoanchoa (funded by the Department of Agriculture, Fisheries and Food of the Basque Country Government); the Saiotek and Research Groups 2007–2012 (IT-242-07) programs (Basque Government), TIN2008-06815-C02-01 (Spanish Ministry of Education and Science); COMBIOMED network in computational
References (48)
- et al.
Modelling recruitment dynamics of hake, Merluccius merluccius, in the central Mediterranean in relation to key environmental variables
Fish. Res.
(2008)
Theory refinement on Bayesian networks
- et al.
Potential improvements in the management of Bay of Biscay anchovy by incorporating environmental indices as recruitment predictors
Fish. Res.
(2005)
- et al.
Recruitment prediction with genetic algorithms with application to the Pacific Herring fishery
Ecol. Model.
(2007)
- et al.
Recruitment prediction for Pacific herring (Clupea pallasi) on the west coast of Vancouver Island, Canada
Ecol. Inf.
(2008)
- et al.
Fish recruitment prediction, using robust supervised classification methods
Ecol. Model.
(2010)
- et al.
Supervised pre-processing approaches in multiple class variables classification for fish recruitment forecasting
Environ. Model. Softw.
(2013)
- et al.
Bayesian classifiers based on kernel density estimation: flexible classifiers
Int. J. Approx. Reason.
(2009)
Advantages and challenges of Bayesian networks in environmental modelling
Ecol. Model.
(2007)
- et al.
Relationship between anchovy (Engraulis encrasicholus) recruitment and the environment in the Bay of Biscay
Sci. Mar.
(1996)
Climate, oceanography, and recruitment: the Bay of Biscay anchovy paradigm
Fish. Oceanogr.
Verification of forecasts expressed in terms of probability
Mon. Weather Rev.
The dependence of recruitment on parent stock in different groups of fishes
ICES J. Mar. Sci.
Using entropy to impute missing data in a classification task
Pattern Classification and Scene Analysis
Bootstrap methods: another look at the jackknife
Ann. Stat.
Multi-interval discretization of continuous valued attributes for classification learning
Measuring the strength of environment-recruitment relationships: the importance of including predictor screening within cross-validations
ICES J. Mar. Sci.
An extension on “statistical comparisons of classifiers over multiple data sets” for all pairwise comparisons
JMLR
Egg and larval distributions of seven species in north-east Atlantic waters
Fish. Oceanogr.
A two-stage biomass dynamic model for Bay of Biscay anchovy: a Bayesian approach
ICES J. Mar. Sci.
Report of the ICES/GLOBEC Workshop on Long-term Variability in SW Europe (WKLTVSWE), February 13–16, Lisbon, Portugal
Report of the Working Group on the Anchovy, ICES Headquarters, June 13–16