Using temporal occupancy to predict avian species distributions

Species distribution models (SDMs) are ubiquitous in ecology to predict species occurrence throughout their range. Typically, SDMs are created using presence‐only or presence–absence data. We hypothesize that the continuous metric of temporal occupancy, the proportion of time a species is observed at a given site, provides more detail about species occurrence than binary presence‐based SDMs.

negatives. Most SDM modelling methods use species presence and absence as input to generate a probability surface of predicted occurrences (Elith & Leathwick, 2009;Gomes et al., 2018;Guillera-Arroita et al., 2015;Liu et al., 2011). However, the binary nature of presence-absence data provides less information than continuous measures that can reflect gradations of habitat suitability. Some SDM approaches use continuous information on species abundance, which may account for how many individuals of the species occur at a given site and allow the model to account for more specific site preferences (Bahn & McGill, 2007;Gomes et al., 2018). Another continuous measure with the potential to better characterize site suitability is temporal occupancy, or the proportion of years a species is observed at a given site. While temporal occupancy and abundance are usually positively correlated, species commonly exist at sites with high temporal occupancy but at low abundance (Coyle et al., 2013), meaning these two measures may contain unique information. Specifically, temporal occupancy can help distinguish sites that are regularly versus intermittently occupied over time, and thus may reflect longer-term population viability at a site (Magurran & Henderson, 2003;Ulrich & Ollik, 2004;Coyle et al., 2013;Supp et al., 2015;Umaña et al., 2017;Snell Taylor et al., 2018;Jenkins et al., 2018).
This potential difference between geographic patterns in temporal occupancy and abundance has consequences for questions considering temporal turnover and richness-environment correlations (Belmaker, 2009;Coyle et al., 2013;Snell Taylor, et al., 2018;Umaña et al., 2017). For example, species with low temporal occupancy exhibit higher temporal turnover, shallower species-area relationships and richness patterns that are less correlated with the local environment compared to species with high temporal occupancy (Snell Taylor, et al., 2018). Additionally, environmental predictors explained more variation in temporal occupancy than variation in abundance on average across nearly 200 bird species (Snell Taylor, Umbanhowar, et al., 2020). This suggests that temporal occupancy may indeed capture important information about habitat suitability, and yet, modelling spatial variation in temporal occupancy remains understudied in ecology (but see Green et al., 2019;Rushing et al., 2019;Zuckerberg et al., 2011). We seek to determine whether using temporal occupancy data can provide more reliable SDMs than using presenceabsence data for a large suite of avian species in North America.
We also want to ensure that any observed differences in performance between SDMs created with temporal occupancy versus presence-absence data are independent of the modelling approach used. SDMs have been implemented using a wide range of approaches, and the optimal method may depend on the available data (Araújo et al., 2014;Elith et al., 2006;Guillera-Arroita et al., 2015;Merow et al., 2014;Phillips et al., 2006). The simplest SDMs (e.g. MaxEnt; Phillips et al., 2006) rely on species presence-only and environmental data as the inputs. In contrast, some SDM approaches also take into account absences where species are sampled but not observed (Elith & Leathwick, 2009;Guillera-Arroita et al., 2015).
Traditional occupancy models use more detailed information about species occurrence, generally spatial or temporal occurrence, which can be used to infer a more detailed gradient of suitability (Comte & Grenouillet, 2013;Gomes et al., 2018;Guillera-Arroita et al., 2015).
In this study, we evaluate the utility of using temporal occupancy data for building SDMs relative to traditionally used presenceabsence or presence-background data, comparing across four common modelling approaches (generalized linear models, generalized additive models, random forest and MaxEnt). We assess the relative performance of presence-absence and temporal occupancy data using both spatial cross-validation and temporal cross-validation.

| ME THODS
We used the North American Breeding Bird Survey (BBS) to characterize geographic variation in species presence-absence and temporal occupancy. BBS surveys are distributed across the continent and monitor breeding birds via a series of fifty 3-min point counts spaced at 0.8-km intervals along a roadside route (USGS Patuxent Wildlife Research Center 2012). Each survey is conducted during the breeding season (typically June) by a single observer who records all avian species seen or heard along the route. We used the 1,004 BBS survey routes that were continuously surveyed from 2001 to 2015 and excluded species that were poorly sampled by the survey design such as waterbirds, raptors and nocturnal species ( Figure S1; Butcher, 1987). In total, we identified 189 focal land bird species that were present on at least 60 BBS routes over 15 sampling years. These thresholds were chosen to ensure that a model could be fit for each focal species.
We constructed SDMs based on seven environmental variables summarized at the route level: temperature seasonality (intra-annual standard deviation *100), mean maximum temperature of the warmest month, mean minimum temperature of the coolest month, mean precipitation during the wettest month, mean precipitation during the driest month, mean elevation from the 30-m resolution National Elevation Dataset (ned.usgs.gov) and normalized difference vegetation index (NDVI) from the Global Inventory Mapping and Monitoring Studies database (https://glam1.gsfc.nasa.gov/). Because each BBS route is 40 km long and because only the starting coordinates are known for many of the routes, we calculated the means of each environmental variable within a 40-km buffer around each BBS route's starting coordinates. The five climate variables included in the models were acquired from Bioclim at 2.5-min resolution and have been commonly used in previous distribution modelling studies (Hijmans et al., 2005;Thuiller et al., 2014), while elevation and NDVI are also known to play a role in determining bird distributions (Bahn & McGill, 2007;Coyle et al., 2013;Hurlbert & Haskell, 2003;Rich & Currie, 2018;Snell Taylor, Umbanhowar, et al., 2020). The NDVI data were 250 m, monthly resolution, averaged over the summer months of June, July and August (https://gimms.gsfc.nasa.gov/MODIS/). We selected four commonly used methods for building SDMs: generalized linear models (GLMs), generalized additive models (GAMs), random forest (RF) and MaxEnt (Elith & Leathwick, 2009;Merow et al., 2014). We used R packages glm2 (Marschner, 2019; GLM), mgcv (Wood, 2017; GAM), randomForest (Liaw and Wiener, 2002; RF) and dismo (Hijmans et al., 2011;maxent) for each of the modelling approaches, respectively. For GAM, GLM and RF, we created SDMs using each method for each focal species based on presence-absence. We included all presences for each species during the 15-year period based on BBS data, meaning that any detection of a species during this period was recorded as a presence. We included as absences all survey routes within each species' breeding range map polygon from BirdLife International (www.birdl ife.org) on which the species was not detected during the 15-year period.
GLM, GAM and RF models were additive and used a binomial logit link function. We generated GAMs with cubic-smooth splines, and the smoothing parameter was chosen by generalized cross-validation (Godsoe et al., 2017;Oliver et al., 2012). We used regression random forests with the following default settings: 500 trees, the number of variables tried at each split was the number of predictor variables divided by 3, sampling was conducted with replacement, node size was 5, and sampling class was not stratified (Steen et al., 2019). All model fits were assessed using root mean square error (RMSE). We also calculated the range occupancy of each species by dividing the number of routes on which a species was present by the total number of routes within their breeding range polygon, which has been used to characterize the extent to which a range is fully versus patchily occupied (Hurlbert & White, 2007).
For maxent, we created SDMs for each focal species using presence points only. Although maxent cannot take the continuous temporal occupancy data as input, we include it in this study as a benchmark for comparison to the other methods because it is so widely used (Melo-Merino et al., 2020;Merow et al., 2013). We ran maxent using the default settings, which was shown to outperform a range of alternative settings for avian species using the BBS dataset specifically (Ralston et al., 2016), as well as in other applications (Elith et al., 2010;Qiao et al., 2015). Just as in the GAM, GLM and RF model, environmental data were characterized over a 40-km buffer.
Next, we created SDMs using temporal occupancy for the three modelling approaches that accept continuous variables as input: GLM, GAM and RF. We calculated temporal occupancy using a 15year window from 2001 to 2015, which provides greater resolution to temporal occupancy estimates and inclusion of more focal species, and a five-year window, which provides a more realistic representation of typical ecological sampling efforts. We used sampling years 2001, 2004, 2007, 2010 and 2013 for the five-year temporal occupancy calculation in order to sample evenly from our 15-year time period. Our intention for the 5-year analysis was to maintain the temporal extent of the analysis (15 years) and evaluate the efficacy F I G U R E 1 Map depicting the observed distribution and seven species distribution models (SDMs) of the Yellow-throated Vireo (Vireoflavifrons). (a) Presence-absence points for the Yellow-throated Vireo over the 15-year sampling period. Green dots represent sites where the species was present five or more sampling years (temporal occupancy of ≥33%), grey dots indicate sites where the species was present less than five sampling years, and Xs indicate no occurrences observed during the sampling period. SDMs in the left column (c, e, g) are based on temporal occupancy, while SDMs in the right column are based on (b) presence-only or (d, f, h) presence-absence data. Modelling approaches include (b) MaxEnt, (c, d) generalized linear models (GLMs), (e, f) generalized additive models (GAMs) and (g, h) random forest (RF). Purple indicates high suitability or relative probability of presence and is scaled independently for each plot based on quantile bins of using ecological datasets with more limited temporal sampling, and results were qualitatively similar using five consecutive years (not shown). The 15-year analysis included 189 focal species, but for the five-year analysis 16 species no longer met presence and route occurrence thresholds with the restricted time window and were excluded. Thus, for each species we used presence or temporal occurrence data ( Figure 1a) to generate seven SDMs based on the different methodological approaches (Figure 1b-h). We calculated root mean square error (RMSE) for each model using the hy-droGOF package (Zambrano-Bigiarini, 2017), with values closer to zero denoting a better fit to the observed values. This method is commonly used to describe the predictive performance of SDMs (Bell & Schlaepfer, 2016;Liu et al., 2011;Norberg et al., 2019;Tobler et al., 2019).  (Table 1). The validations included 173 species since 16 species had only predicted presences, meaning we could not generate false omission rates for those species because the denominator, predicted absences, would be zero.
For spatial cross-validation, we split the data into ten randomly generated bins, and for each bin, we evaluated how well the SDM based on the other nine bins could predict occurrences in the bin that was withheld. In order to convert the continuous output of the predicted values to binary presence-absence, we again rescaled predicted values by the maximum for each species and then used a threshold of 0.5 to generate the predicted presences for each bin and created confusion matrices and summary metrics for each species.

| RE SULTS
For most species, root mean square error (RMSE) for SDMs based on temporal occupancy was lower than RMSEs for SDMs based on presence-absence regardless of modelling approach (Figure 2a Table S1. RMSE values were slightly lower for models using temporal occupancy from five rather than 15 years of data. For 5-year models, 91% (157) of species performed better with temporal occupancy than presence-absence with the RF SDMs (Figure 3a), and 81%-82% (140-142) of species performed better using temporal occupancy than presence-absence for GLM and GAM (Figure 3b,c).
Again, MaxEnt models had the highest RMSE values of all modelling approaches examined, especially compared with models using temporal occupancy (Figure 3d). Overall, the extent to which temporal occupancy-based models performed better than presence-based models was similar whether using 5-year or 15year windows.

Best performing variable
False discovery rate Proportion of predicted presences that were incorrectly identified

TA B L E 1 Definitions of error rates and the best performing input variable for each metric
Next, in an effort to understand why some species are better predicted (have lower RMSE) using temporal occupancy and other species are better predicted by presence-absence, we examined the relationship between range occupancy (the proportion of routes within the breeding range of a species on which it was actually observed) and the difference in RMSE between 15-year temporal occupancy-based and presence-based models.
We found that the species for which temporal occupancy-based models had much lower RMSE (i.e. ∆RMSE was most negative) generally had lower range occupancy independent of modelling approach (Figure 4).

F I G U R E 3
Comparisons of root mean square error (RMSE) for species distribution models using temporal occupancy based on 5 years of data versus presence. Panels as in Figure 2 rates (t = 14.7, p = .005) and false-negative rates (t = 22.0, p = .002) compared with models based on presence-absence (Figure 5a).

| D ISCUSS I ON
SDMs are important tools for conservation and resource management, and many studies have considered how to optimize them Guillera-Arroita et al., 2015;Liu et al., 2011;Merow et al., 2014). Using a dataset of 189 avian species in North America and a variety of commonly employed modelling approaches, we found that for most species, SDMs based on temporal occupancy generally outperformed SDMs based on presence-absence. SDMs using temporal occupancy were more conservative in predicting presences relative to presence-based SDMs because locations where a species was observed infrequently were down-weighted by this measure. Additionally, the improvement in performance was greatest for species with lower range occupancy, which had more patchy distributions (Hurlbert & White, 2007) within their breeding range.

F I G U R E 4
The difference between RMSE for 15-year temporal occupancy models and RMSE for 15-year presence-absence models as a function of range occupancy (number presences/ number presences and absences across the range) for each species.
Points are coloured by model method, and values below zero represent species that had lower RMSE for temporal occupancybased models than presence-absence models Initially, we used a 15-year time series to calculate temporal occupancy because a long sampling window provides a more precise estimate of occupancy. However, it is uncommon to have datasets with such a long continuous time series, so we compared the 15year temporal occupancy results to those from a shorter sampling window of five non-contiguous years. We saw similar results between the five-year and 15-year time series, with the five-year time series performing slightly better. In the 5-year time series, temporal occupancy models outperformed presence-absence models for even more species than in the 15-year time series. Finally, species with large differences between temporal occupancy and presenceabsence RMSEs performed the same regardless of the duration of sampling window, whereas species with smaller ranges or fewer occurrences at a site were more likely to be excluded from the analysis with the shorter sampling window. Overall, there appears to be no reduction in effectiveness of temporal occupancy-based SDMs using only five sampling years, indicating that SDMs using temporal occupancy for sites or taxonomic groups with only modest monitoring efforts over time likely still provide useful habitat suitability data.
There was interspecific variation in whether GLM, GAM or RF models had the lowest RMSE, but MaxEnt was consistently the poorest performer for most species. This is concerning because MaxEnt is a widely accepted method of creating SDMs (Elith & Leathwick, 2009;Guisan & Thuiller, 2005). Our results indicate that MaxEnt may be adequate when a species is expected to be distributed uniformly in space with few "range holes," but may be problematic for species with few presences or low range occupancy. More broadly, several species with many presences and few absences were overfit using the presence-absence GAMs, while the same species were not overfit using the GAM temporal occupancy SDMs.
These species had either complete or partial separation, meaning the outcome values were almost perfectly fit by the predictors. All overfit species had range occupancy values of 70% or more. This overfitting demonstrates a potential drawback of using presence to create SDMs, and a potential advantage of using a continuous response variable like temporal occupancy.
One of the challenges of creating SDMs is balancing types of errors in order to maximize correct prediction of presences while not under-predicting correct absences. Any statistical approach for developing SDMs will be subject to a trade-off between errors of omission (false negatives) and errors of commission (false positives; Barry & Elith, 2006;Elith et al., 2006;Wintle et al., 2005). Our results demonstrate that these trade-offs need to be considered carefully when creating SDMs, as the various modelling methods we examined performed differently with respect to these two types of errors and depending on whether performance was evaluated based on spatial or temporal cross-validation. One consistent result was a much lower false-positive rate for models using temporal occupancy than for those using presence-absence. As such, temporal occupancy might be the superior metric when striving to minimize errors of commission, such as when creating SDMs for rare species or species with low range occupancy that are known to patchily occupy the landscape.
Differences in error rates between temporal and spatial crossvalidation were consistent across all models. When predicting presence-absence at the same set of sites but outside of the temporal window in which models were trained, false discovery rates were undesirably high (40%-60% on average), while false omission rates were low. On the one hand, this implies that presences were being predicted too liberally. Sites at which a species had been observed previously might be predicted to be presences again, and yet, the species might fail to be detected in a particular year, resulting in apparent false positives. On the other hand, absences were being predicted conservatively, as sites at which a species had never been observed previously could reliably be predicted as absences by the models. In contrast, when making predictions in new spatial locations during the same time window, false discovery rates were low, while false omission rates were 40%-60%.
This means that, spatially, SDMs are good at predicting a core set of environmental conditions strongly associated with species presences, but fail to adequately recognize the full set of conditions over which presences are possible. We encourage others working with SDMs to compare results from both spatial crossvalidation and temporal cross-validation of their models when possible to shed more light on how, where and why the models perform as they do.
Temporal occupancy is easy to calculate for annual monitoring datasets in which individual sites have been surveyed over multiple years, and our work suggests that the information encoded in this simple metric can improve species distribution models relative to presence-absence-based models. However, it is important to realize that temporal occupancy will necessarily underestimate true occupancy at a site given that species are not always detected when present. Dynamic occupancy models (MacKenzie et al., 2005) provide a framework for explicitly accounting for imperfect detection and have been increasingly used for range-wide species distribution models (Clement et al., 2016;Green et al., 2019;Kéry et al., 2013;Rota et al., 2011;Rushing et al., 2019). These dynamic occupancy models provide the additional advantage of explicitly modelling the colonization and extinction processes (Kéry et al., 2013;MacKenzie et al., 2003), and thus, the further development of such models is clearly a productive area of continued research. In particular, it would be useful to understand how the error rates from such models were compared to those from the temporal occupancy-based models we examined here.
Nevertheless, the failure to account for imperfect detection does not impede our ability to compare SDM performance between presence-absence and temporal occupancy models, the primary goal of this manuscript. Furthermore, the use of temporal occupancy as an indicator of whether a species truly maintains a viable population at a given site despite imperfect detection has been validated with simulation models (Snell Taylor, Coyle, et al., 2020). High temporal occupancy accurately reflected sites with high suitability except in cases where the focal site was an unsuitable population sink surrounded by a landscape of more than 60% suitable habitat. Low temporal occupancy accurately reflected sites with low suitability as long as the probability of detection was greater than ~30%, which is usually the case in birds based on several multispecies compilations (Boulinier et al., 1998;Johnston et al., 2014). That said, developing reliable SDMs for species known to be difficult to detect may benefit from (a) the use of data obtained from targeted monitoring efforts rather than generic surveys and (b) analysis by more sophisticated dynamic occupancy modelling approaches.
Overall, the use of temporal occupancy can improve SDMs by prioritizing areas where the species persists more regularly and using only a few years of data over a 15-year sampling period proved to be equally as effective for bird species. Temporal occupancy does not necessarily improve the performance of SDMs for abundant and wide-ranging species, but is beneficial for modelling species that occur infrequently or occupy a smaller proportion of their range. We hope this study encourages scientists to consider the role time series data can play in improving SDMs using this simple framework, and to assess whether it is appropriate to use temporal occupancy in their analyses when multiple years of surveys are available.

ACK N OWLED G EM ENTS
We are grateful to the volunteers who collect data for the North American Breeding Bird Survey. We thank J. Umbanhowar for statistical consultation and E. White for suggestions regarding methodology. This work was made possible by funding from the National Science Foundation through DEB-1354563. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.

PE E R R E V I E W
The peer review history for this article is available at https://publo ns.com/publo n/10.1111/ddi.13296.

DATA AVA I L A B I L I T Y S TAT E M E N T
Data and code associated with this study are available from the Zenodo Digital Repository (https://doi.org/10.5281/zenodo.4672696).