Challenges of transferring models of fish abundance between coral reefs

Reliable abundance estimates for species are fundamental in ecology, fisheries, and conservation. Consequently, predictive models able to provide reliable estimates for un- or poorly-surveyed locations would prove a valuable tool for management. Based on commonly used environmental and physical predictors, we developed predictive models of total fish abundance and of abundance by fish family for ten representative taxonomic families for the Great Barrier Reef (GBR) using multiple temporal scenarios. We then tested if models developed for the GBR (reference system) could predict fish abundances at Ningaloo Reef (NR; target system), i.e., if these GBR models could be successfully transferred to NR. Models of abundance by fish family resulted in improved performance (e.g., 44.1% < R² < 50.6% for Acanthuridae) compared to total fish abundance (9% < R² < 18.6%). However, in contrast with previous transferability obtained for similar models for fish species richness from the GBR to NR, transferability for these fish abundance models was poor. When compared with observations of fish abundance collected in NR, our transferability results had low validation scores (R² < 6%, p > 0.05). High spatio-temporal variability of patterns in fish abundance at the family and population levels in both reef systems likely affected the transferability of these models. Inclusion of additional predictors with potential direct effects on abundance, such as local fishing effort or topographic complexity, may improve transferability of fish abundance models. However, observations of these local-scale predictors are often not available, and might thereby hinder studies on model transferability and its usefulness for conservation planning and management.


INTRODUCTION
Understanding patterns of species' abundance and their causes remains a central challenge for ecological management (Stuart-Smith et al., 2013; Atwood et al., 2015), particularly in marine ecosystems (e.g., Worm et al., 2006). Obtaining reliable abundance estimates for marine fishes can be challenging (Pauly, Hilborn & Branch, 2013) as abundances can vary rapidly and substantially through time and space (Caley et al., 1996; Rana et al., 2014). Marine surveys can also be prohibitively expensive, rendering many marine systems severely under-surveyed, and when surveys are possible, estimates are often constrained by their efficiency (Reynolds, Thompson & Russell, 2011). This situation is exemplified on coral reefs where underwater visual censuses (UVCs) are commonly used to estimate fish abundances. The application of UVCs is constrained in space and time by the cost of field expeditions and by bottom time and depth limits for divers. These censuses do, however, have the advantage of estimating abundances directly (Halford & Thompson, 1996), including exploited (e.g., lutjanids and lethrinids) and non-exploited species (e.g., pomacentrids). They have been successfully used for estimating fish distributions and population trends in coastal ecosystems (Willis, Millar & Babcock, 2003; Stuart-Smith et al., 2013; among many others), and UVC-estimated fish abundances have served as input for predictive models aiming to estimate abundances at the scale of entire reef ecosystems using broad-scale predictors (e.g., Mellin et al., 2010).
Predictive models can assist in identifying affinities of species or individuals for particular environmental conditions and thereby predict their patterns of spatial distribution. By providing such information, predictive models can also help fill knowledge gaps essential for coral reef management (Fisher et al., 2011b). However, the robustness (degree of model fit) of predictive models of fish abundances can vary substantially. While relatively robust predictions of abundance have been obtained for single species (up to ∼70% deviance explained) (e.g., Young & Carr, 2015), considerably less variation has been explained for total fish abundances (∼35%) (e.g., Mellin et al., 2010) and even less for some functional groups (e.g., 16% for piscivores; Pittman, Costa & Battista, 2009).
Because abundance is an important and commonly used ecological metric, having readily available spatial predictions of fish abundances in situations where direct estimates are currently unavailable could be highly beneficial. For example, such information could be used to support stock assessments or guide the establishment of management strategies (Bejarano, Mumby & Sotheran, 2011). In the absence of predictive models, or data with which to build them for a particular location, successful transfer of a pre-existing model built for a data-rich location could be of considerable utility. This procedure of using an existing model to derive predictions for a new location is defined as model transfer, with model transferability being the model's ability to transfer to new locations (Sequeira et al., 2018). Indeed, a recent study (Sequeira et al., 2016) demonstrated the transferability of predictive models of fish species richness between distant coral reefs. However, species richness and fish abundance are two distinct ecological metrics, with inherent differences in spatio-temporal variability caused by different processes. While species richness can be informative for conservation management in the design of marine protected areas, abundance models can provide relevant information for the management of fisheries in stock assessments and setting spatially explicit catch thresholds (Bejarano, Mumby & Sotheran, 2011). Here, we explore the transferability of models for the more variable metric fish abundance considering the same two widely separated reefs recently investigated by Sequeira et al. (2016): the Great Barrier Reef (GBR), a relatively well-studied reef system located off the northeast coast of Australia, and Ningaloo Reef (NR), a fundamentally different and less-studied reef, located off the northwest coast of Australia. In this study, we examine the effects of annual variability in fish counts by accounting for the number of years when fish were surveyed. 
We investigate a number of scenarios with subsets of data collected in different years and also covering a range of 1 or 4 years at a time. The latter multi-year scenarios allow averaging of inherently variable fish abundance estimates. We also investigate potential effects associated with fishing effort by considering ten representative fish families separately including the exploited fish families Lethrinidae and Lutjanidae. The number of species within each fish family also varies (e.g., Pomacentridae comprises several hundred species, while Zanclidae only one species). Therefore, examining these ten fish families separately allows testing for effects associated with different degrees of abundance variability within and among fish families. We then provide an overview of the difficulties of modelling fish abundances and highlight some of the issues that will need to be overcome before such models can be successfully transferred.
We then partitioned this dataset of fish abundance estimates collected on the GBR into a series of scenarios to explore the influence of different input datasets on the transferability of GBR models of fish abundance to NR. One of our objectives was to test if better transferability could be obtained by using data collected in the same year at both locations, in any single year, or in an averaged set of years when data from only a single year are available for the location to which the model is being transferred. In our scenarios, we varied (i) the length of the subset of data used in each scenario (i.e., including only 1 or 4 years of data) to account for annual variability, which is averaged out in multi-year scenarios, allowing us to test whether such variability compromises prediction and transferability and whether longer time series may provide better results; (ii) survey years (2003-2007, 2007-2013, 2007, and 2013) to account for effects of temporal variability (i.e., old versus more recent) and to test for any potential effects associated with differences in survey times between the reference and the target system; and (iii) transect size, by downscaling the GBR data from 50 m to 25 m (the resolution of the dataset available at NR), to assess how a mismatch in data resolution between the reference and target systems might also affect model transferability. In all, we tested seven different scenarios.
Our downscaling procedure consisted of two steps. First, we used all transects within the same site to calculate the expected average increase in the number of individuals (N) with the addition of the area surveyed in each added transect (T) for each site (S) (i.e., to estimate N at S based on the areas sampled: T1, T1 + T2, T1 + T2 + T3, T1 + T2 + T3 + T4, and T1 + T2 + T3 + T4 + T5). Doing this procedure per site allowed us to account for the inherent spatial differences observed across sites within reefs. To quantify the increase in abundances with area sampled, we repeated this calculation multiple times by randomising the sequence in which the data from each transect were added. This random procedure allowed the calculation of an average increase in N for each transect added within the same site. Second, we subtracted from each transect count the calculated average increase in N divided by the ratio of transect lengths between the two reef systems (50/25 m = 2), i.e., Transect N - (estimated average increase/2). This downscaling procedure was only applied to the GBR data and assumed that the increase in the number of individuals per transect sampled would only be similar among all transects within the same site (i.e., with no assumption made beyond the reef scale). The final set of seven scenarios examined here included data collected in 2003-2007 (scenario A), in 2007-2013 (B and C), in 2007 (D and E), and in 2013 (F and G) with scenarios C, E and F including downscaled data, as described above. All models were tested for each of these scenarios.
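The two downscaling steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code used in the study: the function names, the number of randomisations, the example counts, and the flooring of negative values at zero are all hypothetical, and the "average increase" is interpreted here as the mean step of the cumulative count when transects are added in random order.

```python
import itertools
import random


def mean_increase_per_transect(counts, n_shuffles=1000, seed=0):
    """Average increase in cumulative N per added transect at one site.

    counts: fish counts for the transects of a single site (e.g. T1..T5).
    n_shuffles and seed are illustrative choices, not values from the study.
    """
    if len(counts) < 2:
        raise ValueError("at least two transects are needed per site")
    rng = random.Random(seed)
    steps = []
    for _ in range(n_shuffles):
        order = list(counts)
        rng.shuffle(order)  # randomise the sequence in which transects are added
        cumulative = list(itertools.accumulate(order))  # T1, T1+T2, ...
        steps.extend(b - a for a, b in zip(cumulative, cumulative[1:]))
    return sum(steps) / len(steps)


def downscale_site(counts, length_ratio=2.0, n_shuffles=1000, seed=0):
    """Apply 'Transect N - (estimated average increase / 2)' to one site.

    length_ratio is 50 m / 25 m = 2. Flooring at zero is an assumption added
    here so that counts stay non-negative; the text does not specify it.
    """
    increase = mean_increase_per_transect(counts, n_shuffles, seed)
    return [max(0.0, n - increase / length_ratio) for n in counts]


# Hypothetical counts for five 50 m transects at one GBR site
site_counts = [120, 95, 140, 110, 130]
downscaled = downscale_site(site_counts)
```

Because the calculation is done per site, each site would be passed through `downscale_site` separately, in line with the stated assumption that the increase per transect is only comparable among transects of the same site.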
For NR, we used reef fish UVC data from 81 sites on NR (Fig. S1) collected by the Commonwealth Scientific and Industrial Research Organisation (CSIRO). These data were collected in 2013 using a similar procedure to that used in the GBR, but with shorter sets of three transects of 25 × 5 m (25 × 1 m for small, sedentary species) per site.

Environmental predictors
Table 1 Models relating coral reef fish abundance (N) to spatial and environmental properties in the Great Barrier Reef (GBR) and Ningaloo Reef (NR). Coast: distance to coast; barrier: distance to the outer limit of the reef; crbnt: percentage of carbonates; gravel, sand, and mud also represented as percentage and derived from the Marine Sediment Database (MARS; available at npm.mars.search) (Passlow et al., 2005; Mathews, Heap & Woods, 2007); NO3: nitrate, PO4: phosphate, SI: silicate, O2: dissolved oxygen, Sal: salinity, all represented as mean concentrations and derived from the CSIRO Atlas of Regional Seas (CARS; available at: http://www.marine.csiro.au) (Dunn & Ridgway, 2002; Ridgway, Dunn & Wilkin, 2002); SST: annual sea surface temperature derived from the NASA standard monthly data products from the Advanced Very High Resolution Radiometer (AVHRR) Pathfinder V5; Chla: chlorophyll a, and K490: coefficient of light attenuation at 490 nm derived from the ocean colour standard monthly data products from the Sea-viewing Wide Field-of-view Sensor (SeaWiFS) and Moderate Resolution Imaging Spectroradiometer satellite (MODIS) from the National Aeronautics and Space Administration (NASA); subscript av: average. All predictors were mean centred, and coast, barrier, depth, and SSTav were included as quadratic terms in a second-order polynomial function. Sediment variables were mostly collinear and were therefore included in separate models (4-6). Bold face indicates predictors not included in both systems (due to collinearity between variables observed in NR).
Model; Predictor category; GBR model; NR model

Large-scale predictors including nutrient inputs (such as NO3, which influences fish abundance through increasing local primary productivity), and sedimentation (which is relevant for some species' feeding behaviour), have previously been suggested as important predictors of reef fish communities (Pinca et al., 2012). Similarly, water temperature affects the aerobic performance of reef fishes with effects being trait- and species-dependent (Mellin et al., 2016), and sea surface temperature has been shown to contribute to the best performing models for predicting fish abundance (Mellin et al., 2010; but see Barneche et al., 2016). Consequently, we used large-scale environmental predictors available at a national scale in Australia (0.01° resolution) (http://www.marinehub.org), including sea surface temperature (SST), salinity, nutrients, light (K490av), depth (as proxy for habitat), oxygen, and sediment characteristics including percentages of carbonates, gravel, sand and mud (Table 1). To account for geographical effects in the distributional patterns of reef fish, we also included two spatial predictors: the shortest distances to coast (coast) and to the outer limit of the reefs (barrier), which have been used to successfully predict their species richness and abundances (Mellin et al., 2010; Sequeira et al., 2016). We calculated these distances for each sampled site and node on the 0.01° national grid using the Near tool in ArcGIS 10.1 (ESRI, Redlands, CA, USA) and an equidistant cylindrical coordinate system. For NR, we also used hyperspectral bathymetric data (Heyward, 2006). We then assigned each site to the closest node on the 0.01° national grid and used the environmental predictors corresponding to these locations. Our model set included different combinations of these environmental predictors, avoiding inclusion of correlated predictors in the same model (Table 1), and accounted for second order polynomials for bathymetric features (depth), distances to domain boundaries (coast and barrier) and sea surface temperature (SST). To the extent permitted by the computational demands of the modelling procedures used here (i.e., negative binomial with mixed effects, as described below), we kept the model set as similar as possible to that used in the previously published study of the transferability of species richness models (Sequeira et al., 2016). We did this to facilitate better comparisons of the overall results obtained here with those obtained for species richness. We considered the same set of models for all scenarios for the GBR and also for NR (Fig. S2).
All predictors were centred prior to modelling to ease interpretation of resulting coefficient estimates. Estimates of rugosity, the variation in the height of the substrate quantified along each transect by the absolute horizontal distance covered by 10 m of light chain moulded to the contours of the substratum, were also available at NR for each sampled transect, but not for the GBR.
These rugosity values at NR were included in an additional model with a quadratic term for this predictor to account for a potential optimal range of rugosity values for higher fish abundance. This additional model was used to indicate how much the NR models could be improved by including local, fine-scale topographical predictors that can directly influence fish abundance (e.g., Caley & St John, 1996; Wilson, Graham & Polunin, 2007).
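As an illustration of how a centred predictor and its quadratic term might be prepared, the following Python sketch builds the two columns for a single predictor. It assumes a raw (non-orthogonal) second-order polynomial, which the text does not specify; the function names and the rugosity values are hypothetical.

```python
def centre(values):
    """Subtract the mean so that the predictor is centred on zero."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def quadratic_terms(values):
    """Return (linear, squared) pairs for a centred predictor.

    Assumption: a raw second-order polynomial (x and x^2) is used; an
    orthogonal polynomial basis would give different columns.
    """
    return [(x, x * x) for x in centre(values)]


# Hypothetical rugosity values for three transects
rugosity = [1.0, 2.0, 3.0]
terms = quadratic_terms(rugosity)
```

Centring before squaring reduces the correlation between the linear and squared columns, and a negative coefficient on the squared term would correspond to an intermediate range of values with higher predicted abundance.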

Reference models for GBR and NR
Using the resulting counts for fish abundances (Ntotal and Nfam), we developed generalised linear models (GLM) for each ecosystem (GBR and NR), using a negative binomial distribution with a log-link function with the glm.nb() function from the package MASS in R (R Core Team, 2016). Due to spatial autocorrelation of the residuals from the GBR model (Moran's I > 0.5 at first lag), we developed a generalised linear mixed-effects model (GLMM using the glmer.nb() function from the R package lme4) for the GBR by including reef as a random effect. Inclusion of this random effect was also necessary because several sites fell within the 0.01° resolution that characterises the environmental layers. Thus, sites can be considered pseudo-replicates at this resolution, which can be addressed with the use of a 'reef' random effect (Mellin et al., 2010; Mellin et al., 2012). We used a negative binomial distribution to account for the large dispersion observed in fish abundances (9.2 < dispersion statistic < 23.1, calculated after a test model run using a GLMM with a Poisson distribution). We used the weight of Akaike's information criterion corrected for small sample sizes (wAICc) (Burnham & Anderson, 2004) to rank models, and the marginal R²m and conditional R²c (Nakagawa & Schielzeth, 2013) to assess their performance. We also calculated the mean effect size for each predictor following the method described in Sequeira et al. (2014). We assessed the correlation between observed and predicted values on sampled sites using the R² and p-values from a linear regression between the two sets of values to allow direct comparison between all modelling results (reference and transferred models). We then used the models to predict Ntotal and Nfam at the scale of the entire GBR and NR separately.
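The text does not state how the dispersion statistic was computed. One common choice, assumed here for illustration only, is the Pearson chi-square statistic divided by the residual degrees of freedom of the Poisson fit, with values well above 1 indicating overdispersion. The function name and the example values below are hypothetical.

```python
def pearson_dispersion(observed, fitted, n_params):
    """Pearson chi-square divided by residual degrees of freedom.

    observed: observed counts; fitted: Poisson fitted means (all > 0);
    n_params: number of estimated model parameters.
    Assumption: this is the Pearson-based statistic, which may differ from
    the one used in the study.
    """
    if len(observed) != len(fitted):
        raise ValueError("observed and fitted must have the same length")
    df = len(observed) - n_params
    if df <= 0:
        raise ValueError("not enough observations for the number of parameters")
    chi2 = sum((y - mu) ** 2 / mu for y, mu in zip(observed, fitted))
    return chi2 / df


# Hypothetical counts and fitted means from an intercept-only Poisson model
dispersion = pearson_dispersion([0, 10], [5.0, 5.0], n_params=1)
```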
We used 10-fold cross-validation (Davison & Hinkley, 1997) for estimating prediction errors for each model, as well as the model-averaged prediction error (i.e., across the entire model set) (Table 1). We checked for spatial autocorrelation in model residuals using the sp.correlogram function from the R package spdep (Bivand, 2013) to calculate Moran's I (Diggle & Ribeiro, 2007). We repeated this procedure for all seven scenarios in the GBR and for NR, separately. We then tested the transferability of the GBR model to NR by transferring the Ntotal and Nfam models corresponding to each GBR scenario to NR.
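For readers unfamiliar with wAICc, the following Python sketch shows the standard AICc formula and the Akaike weights used to rank a model set (following the general approach in Burnham & Anderson, 2004). The log-likelihoods and parameter counts are hypothetical and are not results from this study; only the sample size of 133 GBR sites is taken from the text.

```python
import math


def aicc(log_likelihood, k, n):
    """AIC corrected for small sample sizes.

    k: number of estimated parameters; n: number of observations.
    """
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined when n <= k + 1")
    aic = -2.0 * log_likelihood + 2.0 * k
    return aic + (2.0 * k * (k + 1)) / (n - k - 1)


def akaike_weights(aicc_values):
    """Relative support for each model in the set (weights sum to 1)."""
    best = min(aicc_values)
    relative = [math.exp(-0.5 * (v - best)) for v in aicc_values]
    total = sum(relative)
    return [r / total for r in relative]


# Hypothetical (log-likelihood, number of parameters) for three candidate models
candidates = [(-512.3, 6), (-515.9, 4), (-520.1, 2)]
n_sites = 133  # number of GBR sites reported in the text
scores = [aicc(ll, k, n_sites) for ll, k in candidates]
weights = akaike_weights(scores)
```

A model with wAICc above 0.5 has more support than all other models in the set combined, which is the threshold used when highlighting top-ranked models in the tables.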

Transferred models from the GBR to NR
Assuming stationarity in species abundance distribution across both ecosystems, we assessed the transferability of the GBR models to NR using R² and p-values of a linear regression between the observed values for NR and the resulting predictions from the transferred model. We then compared the predictions from the transferred models with those obtained from the model developed for NR. We made the latter comparison following Sequeira et al. (2016), and using (i) the mean direct validation errors of each transferred GBR scenario against observations for NR, (ii) the mean absolute difference between values obtained by the transferred and reference models on a grid-cell basis, (iii) the percentage of grid-cells where the predictions from the transferred models differed by a small amount (≤15%) from the predictions from the reference NR model, and (iv) the N patterns predicted for NR by both the reference and transferred models after rescaling the predictions from the transferred models to the same maximum and minimum as the reference predictions.
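The comparison metrics listed above can be sketched in Python as follows. This is an illustrative outline, not the study's code: the function names and example values are hypothetical, the ≤15% difference is assumed to be measured relative to the reference prediction in each grid cell, and the p-value of the regression is omitted because it requires a t-distribution that the standard library does not provide.

```python
def r_squared(observed, predicted):
    """R² of a simple linear regression of observed on predicted values
    (equal to the squared Pearson correlation)."""
    n = len(observed)
    mo, mp = sum(observed) / n, sum(predicted) / n
    sxy = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    sxx = sum((p - mp) ** 2 for p in predicted)
    syy = sum((o - mo) ** 2 for o in observed)
    return (sxy * sxy) / (sxx * syy)


def mean_abs_diff(transferred, reference):
    """Mean absolute difference between two prediction grids."""
    return sum(abs(t - r) for t, r in zip(transferred, reference)) / len(reference)


def pct_within(transferred, reference, tol=0.15):
    """Percentage of grid cells where predictions differ by <= tol.

    Assumption: the difference is taken relative to the reference value.
    """
    hits = sum(abs(t - r) <= tol * abs(r) for t, r in zip(transferred, reference))
    return 100.0 * hits / len(reference)


def rescale(values, new_min, new_max):
    """Linearly rescale values to the range [new_min, new_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant grid: map everything to the lower bound
        return [float(new_min) for _ in values]
    return [new_min + (v - lo) * (new_max - new_min) / (hi - lo) for v in values]


# Hypothetical predictions on three grid cells
reference = [100.0, 100.0, 100.0]
transferred = [100.0, 120.0, 50.0]
share_close = pct_within(transferred, reference)
```

Rescaling the transferred predictions to the reference minimum and maximum compares the spatial pattern of low and high abundance rather than the absolute values, which is why a model can score poorly on (ii) and (iii) yet reasonably on (iv).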

Models for GBR (Ntotal and Nfam)
The Ntotal models for the GBR resulted in R² ranging from 9.0-18.6% across all scenarios, with significant correlations between predictions and observations (p ≤ 0.002) (Table 2). Model rankings were similar across all scenarios (Table 2, A-G) with the highest wAICc obtained for model 7 (nutrients) but sometimes shared with models 3 (depth), 4 (sand) or 5 (gravel). The most important predictors commonly included NO3. R²m was ∼20% for the best model in all scenarios while R²c was close to 100% (>98.5% for all Ntotal models in all scenarios). Cross-validation errors varied between 12.8-22.5%, with predicted ranges generally overestimating minimum observed fish abundances in the GBR but with maximum values close to those observed for most scenarios (Table 2). The predicted patterns of Ntotal for the GBR were also similar for all scenarios with generally greater abundances predicted in two main areas of the southern sections of the GBR (Fig. 1-Ntotal and Fig. S3).
Modelling results were generally improved when the data were analysed by fish family. However, for most families, the wAICc rank obtained for all GBR scenarios varied and no evident pattern for best scenarios was obtained across all families (Table 3). Acanthuridae and Zanclidae were the only fish families for which all GBR scenarios led to the same wAICc results with model 2 (distance to domain boundaries) being the highest ranked for both families (Table 3). Across all fish families, R²m was highest for all Acanthuridae scenarios (73.7-80.0%) while cross-validation errors were lowest for Zanclidae (<1.3 ± 0.4%) (Table 3). R²c was generally high for all fish families and across all scenarios highlighting high variability within each reef section of the GBR, as was already evident in the Ntotal models. Predictions from most GBR scenarios for all families were positively and significantly correlated with values observed on the GBR and resulted in higher R² (e.g., 44.1 ≤ R² ≤ 50.6%; p < 0.001 for Acanthuridae, and 31.1 ≤ R² ≤ 70.7%; p < 0.001 for Lutjanidae). Exceptions included some scenarios for Lethrinidae (D, E and G), Scaridae (F and G), Serranidae (G) and Siganidae (B, C, D, E and G). Cross-validation errors were generally low (<10%) but ∼25% for Acanthuridae and Scaridae, and >100% for Pomacentridae (Table 3). Predictions for each GBR scenario are presented in Fig. 1.
Table 2 Modelling results for the Great Barrier Reef (GBR) scenarios predicting total fish abundance (Ntotal) to the GBR and to Ningaloo Reef (NR). Observed N: observed fish abundance; Top model: the best performing model/s ranked by the weight of the Akaike Information Criteria corrected for small sample sizes (wAICc); R²m: marginal R²; R²c: conditional R² with R²: variance explained; High effect: predictors with the highest effect size; Pred N: range of predicted fish abundance and the respective standard error (Pred se N); CVerror: cross-validation error and its percentage (CVerror(%)); Val_R²: results of the direct validation of the observed abundances versus the predicted values for the same locations and the respective p-value. All scenarios resulted in some wAICc support for the null model. Italics indicate non-significant correlations between the observed values at NR and transferred predictions from the GBR based on a p-value < 0.05. A total of 133 sites were considered in the GBR.

Models for NR (Ntotal and Nfam)
When modelling Ntotal for NR, model 7 (nutrients) also received the highest wAICc support, explaining 16.4% of the deviance (Table 4), and the predictor with greatest effect was also NO3. Predicted Ntotal was highest around Point Cloates (central area of NR) (Figs. 2A and 2B-All); however, the cross-validation error was high (49.3 ± 21.2%) and model-averaged predictions were unrelated to observed values in NR (R² = 9.3% and p = 0.108) (Table 4). In the test with the additional model, the model including a quadratic term for the predictor rugosity ranked the highest (wAICc = 0.882) and explained 20.6% of deviance, followed by model 7 (wAICc = 0.108, DE% = 16.4) (Table S1). Rugosity also had a large effect jointly with NO3; however, the cross-validation error was still ∼50%. For the Nfam models for NR, highest wAICc support also varied among models and fish families. It was highest for model 7 (nutrients) for Acanthuridae and Chaetodontidae, model 3 (physical predictors) for Lutjanidae, and model 10 (temperature) for Siganidae. However, for other families, wAICc support was below 0.5 for all models. Generally, most of the deviance was explained by the full model (model 1) (e.g., >23% for Acanthuridae and Chaetodontidae) except for Lutjanidae where model 3 explained most of the deviance (26.5%; Table 4). The most common predictors with biggest effect sizes were NO3 and O2, notably for Acanthuridae, Chaetodontidae, Pomacentridae and Scaridae (Table 4). Predicted ranges of fish abundances for NR were always narrower than the observed range, and resulted in high cross-validation errors (≫50%) that were lowest for Labridae (∼52.2%) (Table 4, Fig. S4), possibly associated with the limited amount of data available for NR to detect large-scale patterns, i.e., those not associated with fine-scale changes in the habitat. Direct validation of predicted abundances against observed values in NR resulted in positive significant correlations only for Chaetodontidae (R² = 37.1%, p < 0.001). For this family, higher predicted abundances were concentrated in the central area of NR (Figs. 2A and 2B). The test with the additional model including a quadratic term for rugosity in NR was ranked highest for five fish families: Chaetodontidae, Labridae, Lethrinidae, Pomacentridae and Scaridae. Rugosity generally had a large effect size in these tests (Table S1); however, as for the Ntotal model, cross-validation errors were of the same order of magnitude as those obtained without rugosity in the model set (Table S1).
Table 3 Results from the Great Barrier Reef (GBR) scenarios predicting fish abundance by fish family to the GBR. Results are summarised for all scenarios for each family unless otherwise specified by splitting results by rows and using letters A-G to identify scenarios. Observed N: observed fish abundance; Best model: the best performing model/s ranked by the weight of the Akaike Information Criteria corrected for small sample sizes (wAICc); R²m: marginal R²; R²c: conditional R² with R²: variance explained; Highest effect: predictors with the highest effect size; Pred N: range of predicted fish abundance; CVerror: cross-validation error; Val_R²: results of the direct validation of the observed abundances versus the predicted values for the same locations and the respective p-value indicated with asterisks: <0.001 (***), <0.01 (**) and <0.05 (*). Scenario F for Lethrinidae resulted in highest wAICc support for the null model and it is not shown. Italicised text: results where direct validation with values observed in the GBR was non-significant. For details on predictors with highest effect refer to Table 1.
Family; Observed N

Models transferred from the GBR to NR
Table 4 Results for the models predicting total fish abundance (Ntotal) and abundance by fish family (Nfam) for Ningaloo Reef (NR). Observed N: observed fish abundance; Top model: the best performing model/s ranked by the weight of the Akaike Information Criteria corrected for small sample sizes (wAICc); R²: variance explained; High effect: predictors with the highest effect size; Pred N: range of predicted fish abundance and the respective standard error (Pred se N); CVerror: cross-validation error and its percentage (CVerror(%)); Val_R²: results of the direct validation of the observed abundances versus the predicted values for the same locations and the respective p-value. Models for the fish families Serranidae and Zanclidae resulted in high wAICc support for the null model and therefore results are not shown. Underlined wAICc values indicate that the highest ranked model in the model set received wAICc > 0.5. Italicised text: values for which non-significant correlations (i.e., p-value > 0.05) for the direct validation of observed versus predicted abundance were obtained. A total of 81 sites were used in NR.

When transferring the Ntotal models from the GBR to NR, the range of predicted abundances was always narrower than that observed at NR (Tables 2, 4 and 5) and we obtained no significant correlations between observed and predicted values (Fig. 3). Similar results were obtained for predictions from the transferred Nfam models from the GBR to NR, which resulted in low, non-significant R² (<6%, p > 0.05) when directly compared with observed values at NR. Also, none of the transferred models (downscaled or not) resulted in prediction ranges close to the observations at NR (Tables 3 and 5). Absolute differences between transferred and reference NR predictions were generally high for all scenarios and fish families, but lowest for Chaetodontidae, Lethrinidae and Pomacentridae (<55.3%) (Table 5).
Transferred predictions for Chaetodontidae, the only family for which the NR model resulted in a significant correlation between predictions and observed values, resulted in the highest percentage of grid cells with values differing by ≤15% from those obtained by the reference model (Chaetodontidae: 18.2% for scenario D, and 13% for both scenarios A and G) (Table 5). Comparable rescaled patterns between transferred and reference NR predictions were obtained for this fish family (96.1%; scenario A) (Table 5 and Fig. 3; for comparison, see Fig. 2).

DISCUSSION
Readily available models that reliably predict fish abundances would have the potential to assist management and conservation of marine ecosystems (Knowlton et al., 2010; Bejarano, Mumby & Sotheran, 2011; Plaisance et al., 2011). Following recent success in transferring predictive models of fish species richness between coral reefs (Sequeira et al., 2016), we assessed here the transferability of predictive models of reef fish abundances between the same two coral reef systems and using the same set of environmental predictors. Due to the relevance of fish abundance models for stock assessments (Bejarano, Mumby & Sotheran, 2011), we specifically developed models of total abundance and of abundance by fish family to compare transferability results when the response variable does and does not include exploited species. Model transferability from the GBR to NR varied among the response variables tested (i.e., for Ntotal and Nfam) with no obviously superior model emerging for the GBR scenarios tested.
Predicting fish abundance for each system individually was challenging but generally improved when considering fish families separately. For example, our Nfam models for most GBR scenarios resulted in greater and significant R² for all families, including the exploited Lutjanidae for which predictions resulted in the highest R² (70%). This result highlights that good predictive models of abundance by fish family, including for exploited species, are achievable, and that predictions will be useful to understand patterns in abundance. This improvement when considering fish families separately was also verified for NR, where models of Chaetodontidae abundance were of moderate goodness-of-fit. Chaetodontidae are of little commercial interest, and are usually sedentary (site-attached), occurring as single individuals or in pairs. These factors may reduce survey bias in estimating their abundances, contributing to the better performance of the NR models despite the low number of sampling locations considered. However, with limited availability of fish abundance data at the scale of the entire NR, most models resulted in weak, non-significant correlations with observed abundances in that system (Table 2).
Due to the general lack of significant modelling results at NR, we could only directly assess and compare the transferred and reference models' results for Chaetodontidae. For this fish family, we found small absolute differences between the transferred and reference predictions and also a high similarity in the predicted patterns of low and high abundance (i.e., %low-versus-high predictions). With Chaetodontidae including highly sedentary and territorial species, our results suggest that local-scale processes of relevance to these species might be well captured by the variables in our models, whereas the possible reasons behind the lack of transferability for other fish families deserve further investigation. While the focus here was to understand what is required to transfer models successfully, application of these findings is also likely to be valuable for families including commercial species. Overall, our results indicate that similarities between predictions from transferred and reference models of fish abundance may be achievable. Transferring a model of abundance could, therefore, be useful where data are insufficient to build a new reference model for that location. However, at this point, we were unable to identify a consistent approach to transferring these models that would give confidence in other similar applications.
Our inability here to construct consistently transferable models of abundance should not be taken as failure, but rather as an opportunity to begin learning about how transferable models can be built, and ultimately, their utility improved. For example, the importance of nitrate (NO3) as a predictor suggests that productivity might be an important habitat characteristic that helps determine abundance in these reef systems (Barneche et al., 2016). Therefore, if available, related predictors could be included to try to improve the models and their transferability. Indeed, the results of our models highlight that a major challenge in understanding and improving model transferability lies first in obtaining consistently robust predictions of fish abundances for the reference and target locations. Therefore, a useful way forward might be to first construct better, and more ecologically meaningful, predictive models of abundance rather than attempting to construct more transferable ones.
To assist model improvement, where available, inclusion of variables that better represent the biology of reef systems or their benthic communities and habitats (Pittman, Costa & Battista, 2009; Yates et al., 2016) should be considered. As shown here for the NR models with the inclusion of rugosity (which in coral reefs is a largely biogenic variable), local, fine-scale topographic predictors might considerably improve model fit. If such predictors, more directly associated with patterns of fish abundances, were available for the GBR as well, it is plausible that better model transferability would have been obtained. Our contention here is supported by previous studies showing that models using different abundance-related response variables have different data needs, such as levels of replication (sensu Jones et al., 2015). In turn, these needs will depend, for example, on the variability of fish abundances, as these can be strongly influenced by many factors, such as recruitment, predation or fishing, i.e., demographic stochasticity (Sale & Douglas, 1984), or ecosystem biodiversity and human impacts (Ridgway, Dunn & Wilkin, 2002), with better models expected for fish species with more stable abundances (Jones et al., 2015).
The fish abundance data used here resulted from the application of different survey designs targeting different habitats in each reef system. On the GBR, surveys included the exposed side of the reef slope with greater coral cover (Sweatman et al., 2008), while at NR, most samples were taken on reef flat areas with less three-dimensional structure. Because greater structural complexity (i.e., high coral cover) can promote greater abundances of reef fishes (e.g., Gratwicke & Speight, 2005), this difference in the habitats sampled on the GBR and NR may have hindered the transferability of the GBR models to NR. Results for Zanclidae are a good example. Zanclidae being a monospecific family with low population densities, predictions for this species were consistent across all GBR scenarios considered, whereas for the NR reference model wAICc support was greatest for the null model. This result is in agreement with this species' occurrence on reef flats, where its densities are much lower than in other habitats. Therefore, targeting equivalent habitats at the two locations may also assist model transferability, but direct tests will be required to better understand the magnitude of such an effect.
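For readers less familiar with wAICc (the weight of evidence for a model based on the small-sample corrected Akaike's information criterion), the following minimal sketch shows the standard calculation. The log-likelihoods, parameter counts, and sample size are hypothetical and are not values from this study; they are chosen only to show how a null model can receive the greatest support when added predictors barely improve the fit.

```python
import math


def aicc(log_likelihood, k, n):
    """Small-sample corrected AIC for a model with k parameters and n observations."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    aic = 2 * k - 2 * log_likelihood
    return aic + (2 * k * (k + 1)) / (n - k - 1)


def akaike_weights(aicc_values):
    """Relative support (weights summing to 1) for each model in a candidate set."""
    best = min(aicc_values)
    relative = [math.exp(-0.5 * (value - best)) for value in aicc_values]
    total = sum(relative)
    return [r / total for r in relative]


# Hypothetical candidate set fitted to n = 20 sites (illustrative values only):
# a null model and two models with extra predictors that barely improve the fit.
n_sites = 20
candidates = {
    "null": aicc(log_likelihood=-50.0, k=1, n=n_sites),
    "one_predictor": aicc(log_likelihood=-49.8, k=2, n=n_sites),
    "two_predictors": aicc(log_likelihood=-49.7, k=3, n=n_sites),
}
weights = dict(zip(candidates, akaike_weights(list(candidates.values()))))
best_model = max(weights, key=weights.get)
```

Because each extra parameter is penalized, a model with more predictors must improve the log-likelihood enough to offset the penalty; otherwise, as in this hypothetical set, the null model retains the greatest weight.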
Another possible reason for the lack of model transferability observed here might be the spatial heterogeneity of the habitats in both reef ecosystems, as both geographical and geomorphological drivers affect the distribution of reef fish species (Pinca et al., 2012). Moreover, some species groups vary among habitats within the GBR. For example, the cross-shelf patterns of herbivores can be markedly different between inner-, mid- and outer-shelf reefs (Wismer, Hoey & Bellwood, 2009). When testing model transferability in space, observations for the same set of predictors also need to be available for both study systems. Here we used environmental and spatial predictors available around Australia to maximize the number of predictors that could be included and tested in our model set. Even so, this information was only available at a coarse resolution, and the results we report here may contrast with those that would have been obtained from finer-resolution predictors such as roughness (e.g., estimated by acoustic methods) and fishing pressure, which can affect fish abundances by removing individuals and destabilizing population dynamics (Hsieh et al., 2006; Anderson et al., 2008; Bejarano, Mumby & Sotheran, 2011; Rana et al., 2014).
To test the transferability of predictive models of fish abundances between two widely separated coral reefs, we constructed models that used a similar set of predictors and scenarios previously shown to produce transferable models for fish species richness (Sequeira et al., 2016). The poor transferability of our abundance models compared to those developed for species richness highlights potential dissimilarities between the drivers of richness and abundance or in the way different ecological processes are captured by the predictors used and translated into this response variable. In particular, spatial predictors contributed to the transferability of the species richness models applied to the same two distant reefs (Sequeira et al., 2016) but were much less important when predicting fish abundances, with the exception of models for Acanthuridae and Zanclidae on the GBR. Interestingly, Acanthuridae were also previously shown to be an exception in beta-diversity stability relationships across fish families, potentially because of the way they respond to temporal variation in macro-algal availability (Mellin et al., 2014). Such differences in model transferability suggest that species richness patterns are more stable and better explained at large spatial scales (Caley & Schluter, 1997) than fish abundances, and models may therefore be more transferable for species richness. Conversely, fish abundances in marine systems are notable for their variability (Sale & Douglas, 1984; Caley et al., 1996), which might diminish model transferability. Generally, the predictors we used explained more variance in fish species richness than in fish abundance (Mellin et al., 2010; Sutcliffe et al., 2014), but because changes in abundance may be a more immediate indicator of changes in climate or responses to management intervention, it is also important to find ways to overcome challenges associated with building transferable predictive models of abundances.
Potentially, improvements to both fish abundance models and their transferability could be achieved by using different combinations of predictors (e.g., mixing physical and biological variables in the same model), by using biomass estimates, or by grouping species in different ways, such as by trophic role or ecological function. Here we grouped species by family and then tested for predictability and transferability. However, because species within families might differ considerably in behaviour, territoriality, food requirements, and thus, their resilience to change, other classification schemes, such as by functional group or size spectra, might result in better transferability. Furthermore, due to high abundance variability, relating yearly abundances to lagged effects of disturbance and/or fishing instead of averaging across years might also assist in improving transferability.