From pet to pest? Differences in ensemble SDM predictions for an exotic reptile using both native and nonnative presence data

. As a result of the pet trade, Africa’s Nile monitor (Varanus niloticus) is now established in North America (Florida). This generalist carnivore is a potential threat to native wildlife, requiring proactive measures to effectively prevent further spread into novel regions. To determine regions at risk, we create and compare 24 alternative ensemble species distribution models (SDMs) using a model selection approach (with 10 possible modeling algorithms grouped according to assumptions). The ensemble SDMs used presence and environmental data from both native (Africa) and nonnative (Florida) locations. The most predictive consensus SDMs for native and native + nonnative data sets (TSS = 0.87; Sensitivity = 93%; Specificity = 94%) were based on the boosted regression tree (BRT), classification tree analysis (CTA), and random forest (RF) modeling algorithms with all environmental predictor variables used. The global Nile monitor SDMs predict strong habitat suitability in tropical and subtropical regions in the Americas, the Caribbean, Madagascar, Southeast Asia, and Australia. Florida Nile monitor populations are less likely to spread into the Neotropics than if pets now in the Southwest USA are released intentionally or accidentally. Management options to avoid this spread into vulnerable regions are to actively prohibit/regulate Nile monitors as pets, enforce those restrictions, and promote exotic pet amnesty programs. The model selection approach for ensemble SDMs used here may help improve future SDM research.


Introduction
Successful vertebrate invaders often exhibit a combination of the following traits: close association with humans, abundance in a wide native range, competitive nature, large size, broad diet, high tolerance to various physical conditions, and rapid reproduction (Ehrlich 1989, Sakai et al. 2001, Sol 2008). Africa's Nile monitor (Varanus niloticus; Linnaeus 1766) exemplifies these traits: it is established in several urban areas of Florida ( Fig. 1), has the largest geographic distribution of the African varanids, ranging from South Africa to the Nile delta, is relatively abundant (population density 40-60/km 2 ), is a semi-aquatic, generalist carnivore, reaches lengths ≤ 2.4 m with a body mass of ≤ 7.3 kg, thrives in various environments (e.g., grasslands, lowland forests, swamps, seashores, and semi-deserts) by extending its occupied thermal range when burrowing, and reaches sexual maturity in two years with clutches ≤ 60 eggs (Edroma and Ssali 1983;Losos and Greene 1988, Lenz 1995, Luiselli et al. 1999, Faust 2001, Bayless 2002, Bennett 2002, de Buffrenil and Hemery 2002, Ciliberti et al. 2012. These characteristics cause the Nile monitor to become a potential invasive threat if introduced to new areas across the globe, particularly in areas where native top predators have been diminished or exterminated. These concerns are further exacerbated by its popularity in the exotic pet industry, potentially allowing further introductions.
This Old-World monitor has already been established in southern Florida since ~1990 due to the global pet trade (Campbell 2003, Enge et al. 2004, Campbell 2005, raising concerns about where else it might become established if introduced, and where it might become invasive once established. Evidence from this nonnative population already mirrors several of the native monitor characteristics listed above. Monitors Frontiers of Biogeography 2019, 11.2, e42596 in Florida have a broad diet, as evidenced by gut content analyses that reveal marine, freshwater, and terrestrial invertebrates and vertebrates, including reptile and bird eggs, and even a Florida burrowing owl-a protected species within the state (Enge et al. 2004, Campbell 2005. Nile monitors are most often reported in South Florida, where they appear to have anthropophilic habitat use because they are often reported in residential areas and on roadways (Campbell 2005). A Google search (November 2018) for the phrase "Nile monitor for sale" and restricted to the past decade obtained > 250 web hits, demonstrating its sustained availability and demand via global trade. Although stricter permitting regulations in Florida have been issued for the Nile monitor (designated a "conditional species" by the Florida Fish and Wildlife Conservation Commission; FWC; Rule Chapter 68-5), it may be introduced to other regions, imported and released even under enforced permitting rules, or spread from established populations (Faust 2001, Enge et al. 2004. By predicting geographic regions with high habitat suitability for the Nile monitor, we can better inform management decisions that may prevent them from reaching those areas via continued trade (Sakai et al. 2001, Campbell 2007. To date, only Dowell et al. (2016) have attempted to predict suitable habitat for Nile monitors in North America, but did so using only 71 native presences, one algorithm (i.e., Maxent), and only climate and elevation predictor variables. These predictions showed highly suitable habitat in places as disparate as the Yucatan and Olympic peninsulas, Florida and western Montana, and southern California and southern Maine. This surprising distribution may be consistent with Nile monitor traits (described above) but may also result from a reliance on relatively few locations from the native and invaded ranges to inform the model, the use of one algorithm rather than a model selection approach among ensembles of multiple algorithms, and metrics such as choice of predictor variables and model performance thresholds (Meller et al. 2014).
Here, we aim to develop a more robust prediction of Nile monitor habitat suitability on a global scale, using 1,072 total location points from both its known native (Africa) and nonnative (Florida) ranges, 24 alternative ensemble species distribution models (SDMs) comprised of a possible 10 modeling algorithms, and using combinations of predictor variables (e.g., climate, vegetation, and elevation data) to inform the models. We agree with George Box (1978) that "all models are wrong, but some are useful," and propose that ensemble SDMs produced here will be the most useful tools to date to prevent this "pet" from becoming an unwanted pest.

Data/Software
We geo-referenced 507 unique native range locations for the Nile monitor from Africa (de Buffrenil and Francillon-Viellot 2001, Bayless 2002, Berny et al. 2006, Ciliberti et al. 2011; Fig. 1) and used 565 unique nonnative range locations from Florida to comprise our presence data (provided by FWC; Fig. 1) (see Supplementary Dataset S1). Both sets of data were used as reference points for current climate (19 bioclimatic variables based on temperature and precipitation from WorldClim 1.4; Hijmans et al. 2005), vegetation (mean annual net primary productivity or NPP based on Moderate Resolution Imaging Spectroradiometer or MODIS; Zhoa et al. 2015), and elevation predictor variables (based on Shuttle Radar Topography Mission or SRTM from WorldClim 1.4; Hijmans et al. 2005) for our models at a 30 arc-second (1km 2 ) resolution. We did not select among correlated bioclimatic variables because too little is known about the monitor's habitat needs in a novel habitat to justify removing variables and because the goal here was to develop highly predictive models that are biologically relevant to the Nile monitor, as opposed to a most parsimonious model (Braunisch et al. 2013).
Ensemble SDMs were projected and analyzed using the 'Biomod2' package in R with 80/20 calibration and evaluation, 0.5 prevalence, and a high 0.70 quality threshold . Rather than relying on individual predictions that can vary among algorithms, Biomod2 enables an ensemble framework that reduces predictive uncertainty by combining up to 10 modeling algorithms (Table 1) to find a central trend (Marmion et al. 2009, Elith et al. 2010, Barbet-Massin et al. 2012, Meller et al. 2014, Breiner et al. 2015, Liu et al. 2018. Only algorithms that met the 0.7 quality threshold were included in the ensemble calculations; those that do not meet this standard were discarded . Importantly, this a priori step improved confidence in model results compared to models without a threshold. We supplemented our presence-only dataset by creating pseudo-absence (PA)/background data with the "random" algorithm in Biomod2 since all 10 modeling algorithms require presence-absence/background data to differentiate environmental conditions that predict species habitat preferences (Phillips et al. 2009, Barbet-Massin et al. 2012). To compensate for variation between the number of PAs, PA replicates, and model runs based on which algorithm is used, we grouped the 10 algorithms into three PA subsets to allow for optimal ensemble model performance-Groups A, B, and C (Table 1; Barbet-Massin et al. 2012, Brown andYoder 2015). Group results were compared for accuracy using the true skill statistic (TSS; scaled -1 to +1, where performance ≤ 0 means the model output is no better than random and > 0 means the proposed model successfully distinguishes between suitable and unsuitable habitat). TSS is preferred to the kappa statistic and area under the curve (AUC) since it is insensitive to prevalence or size of the dataset and accounts for both omission and commission errors (Allouche et al. 2006, Lobo et al. 2008, Somodi et al. 2017). By comparing model sets for their TSS scores before choosing the desired set of predictor variables or algorithms, our approach was analogous to model selection based on information theoretic criteria  Bayless 2002, Berny et al. 2006, Ciliberti et al. 2011FWC data). (Burnham and Anderson 2002) to offer a more accurate and justifiable model rather than one based on a single variable set or algorithm (Meller et al. 2014).
Models were also evaluated for sensitivity and specificity. Sensitivity is the accuracy rate for true positive outcomes (i.e., the probability that the model correctly predicted presence). Specificity is the accuracy rate for true negative outcomes (i.e., the probability that the model correctly predicted absence). Both sensitivity and specificity are scaled relative to random guesses and are scaled -1 to +1 in the same manner as TSS (Allouche et al. 2006). Overall, the use here of four criteria for model selection (i.e., an a priori and relatively high quality threshold, TSS, sensitivity, and specificity) ensured confidence in resulting model performance. We considered this methodological detail important to predictive SDMs.

Ensemble SDMs
For the native (Africa) presence data, seven environmental predictor variables (i.e., climate, vegetation, elevation, and all possible combinations) considered for each of the three PA Groups (Table 1) resulted in 21 alternative models to predict global Nile monitor distribution. The most informative model was selected based on overall model performance (TSS, sensitivity, and specificity scores), projected, and evaluated for Nile monitor habitat suitability. This consensus projection was then compared to the monitor's known nonnative range (Florida; Fig. 1) to further evaluate predictive performance. A second set of global ensemble SDMs was based on the combination of native and nonnative presence data for each PA Group, resulting in a new total of 24 alternative models to predict global Nile monitor distribution. The most informative combination SDM was compared to the most informative native-based ensemble SDM results to evaluate differences in their projections and performances.
In summary, our analyses resulted in the projection of two high-performing ensemble SDMs from a possible 24 to identify suitable habitat for the Nile monitor across the globe based on 1) native presence data and 2) native + nonnative presence data. Model comparisons here demonstrate consequences in performance and predictive accuracy for ensemble SDMs that are informed with different sets of presence data (Broennimann and Guisan 2008).

Ensemble SDM Performance
All of the algorithms tested met the 0.7 quality threshold and were considered within their respective ensemble SDM PA Groups with low levels of uncertainty. The most informative model hypothesis based on native Nile monitor presence data included all three predictor variables-Climate + Vegetation + Elevation -and the BRT, CTA, and RF algorithms (PA Group C; TSS = 0.87; Table 2 and 3). Model accuracy for the chosen hypothesis was further supported by high rates of sensitivity (93%) and specificity (94%) ( Table 2 and 3). This variable combination was then used to inform our remaining global ensemble SDMs. The most informative native + nonnative presence model hypotheses once again included the BRT, CTA, and RF algorithms (PA Group C; TSS = 0.87; Table 3), with high rates of sensitivity (93%) and specificity (94%) ( Table 3) supporting model accuracy.
Overall, our final global ensemble SDMs produced highly accurate and robust projections based on all three predictor variables (climate, vegetation, and elevation), both sets of presence data available, and were optimized using the PA Group C algorithms to create the most informative predictions for Nile monitor habitat suitability across the globe (Table 3).

Native Ensemble Projection
The Nile monitor projection based on the native presence data (Table 3; Fig. 2) predicted medium to high habitat suitability (range ~300-1,000 or orange to green) on five continents: North America, South America, Africa, Asia/Indopacific, and Australia. In North America, all Hawaiian Islands appear to have high predicted suitability whereas the established populations in Florida appear to be constrained to the subtropical portion of the peninsula, with limited potential to spread northward (Fig. 2). However, should the lizard be released and become established in the southwestern US or coastal California, it could spread southward into large areas of Mexico, Central America, and into South America (Fig. 2). In that case, the Nile monitor is likely to inhabit much of the tropical and subtropical Neotropics. The Nile monitor is also likely to become established on many Caribbean islands if introduced there (Fig. 2). Comparing with other SDM efforts (Dowell et al. 2016), we note that the most predictive model here does not project spread northward in continental North America. High areas of suitability in South America include (but are not limited to) coastal  (Table 3). Scaled from 0 (low suitability; white) to 1,000 (high suitability; green). Venezuela, large areas east and south of the Amazon Rainforest, and an apparent elevational band in the western Andes (Fig. 2). Southern limits for the Nile monitor's potential range in South America extend to about 41 o S in Chile and western Argentina (Fig. 2). If introduced, the South American range of the Nile monitor would likely be extensive and comparable to its native African range (Fig. 2).
If introduced beyond the Americas, the Nile monitor could spread into most of Madagascar but particularly south Madagascar (Fig. 2). More likely are invasions in much of Southeast Asia, including western and central India, Sri Lanka, the Indochina Peninsula, Northern Philippines, and parts of Indonesia (Fig. 2). Relatively high topographic relief in the Malay Peninsula, Borneo, and New Guinea may partially constrain the Nile monitor, but it could inhabit various Indonesian islands. In Australia, suitable habitat exists along much of the continent's coast, especially in northern tropical and subtropical regions (Fig. 2). It is also likely to inhabit portions of eastern Tasmania but is not likely to succeed in New Zealand (Fig. 2).  (Table 3). Scaled from 0 (low suitability; white) to 1,000 (high suitability; green).

Native + Nonnative Ensemble Projection
All of the algorithms tested met the 0.7 quality threshold and were considered within their respective ensemble SDM PA Groups with low levels of uncertainty. The Nile monitor projection based on the combination of the native and nonnative presence data (Table 3; Fig. 3) predicts similar habitat suitability ranges to the SDM informed by native presence data alone ( Table  3; Fig. 2). In North America, increased suitability is predicted in subtropical Florida and extending north along the Gulf coast, but reduced suitability throughout Mexico and into Central America (Fig. 3). South America shows slightly reduced suitability in Brazil, with an increased suitability range from Chile into central Argentina, expanding the monitor's southern limits (Fig. 3). While Africa remains similar, Madagascar's suitability decreased throughout (Fig. 3). Asia shows slightly reduced suitability in India and the Indochina Peninsula, though there were increased suitability ranges predicted along the coasts of the Indonesian Islands (Fig. 3). Finally, Australia shows reduced habitat suitability throughout the continent with the only increase being throughout most of Tasmania (Fig. 3).

Discussion
Our work resulted in two high-performing global ensemble SDMs for predicting Nile monitor habitat suitability across the globe. These projections included both native and nonnative presences in our analyses, highlighting the differences that can result in model predictions based on different initial data with different geographic scales. Our ensemble SDMs based on native and native + nonnative presence data showed suitable Nile monitor habitat across many tropical, subtropical, and warm temperate regions, with less potential spread into cooler regions, contrary to that in Dowell et al. (2016). This distribution is consistent with our a priori understanding of the monitor's native range and habitat preferences in Sub-Saharan Africa (de Buffrenil and Francillon-Viellot 2001, Bayless 2002, Berny et al. 2006, Ciliberti et al. 2011, encompassing many regions that are evolutionarily naïve to any of the known 53 Varanus species, let alone the Nile monitor (Campbell 2003). While some regions may already have indigenous monitor species (e.g., Africa, Asia, Southeast Asia, and Australia), potential prey species and even native varanids may not be immune to the negative effects of a new congener such as the Nile monitor (Pianka et al. 2004). Furthermore, terrestrial and aquatic wildlife inhabiting numerous biodiversity hotspots in North America, Central and South America, the Caribbean, Madagascar, Southeast Asia, and Australia could suffer from predation or competition by this long-lived, opportunistic, generalist carnivore (Luiselli et al. 1999, Myers et al. 2000, Faust 2001, Bennett 2002, Enge et al. 2004, Campbell 2005, Mittermeier et al. 2011, Noss et al. 2015.
Predictive accuracy was high for our resulting ensemble models, but there are still potentially important considerations for nonnative distribution modeling. Presence data from a species native range are typically assumed to indicate habitat requirements in accordance with the basal assumptions of SDMs (Guisan and Thuiller 2005, Broennimann et al. 2007, Bellard et al. 2014). However, presence data from a species developing a nonnative range do not yet fully indicate suitable habitat throughout a novel region and disrupt assumptions of SDMs. Invasion (i.e., colonization, establishment, spread, and adaptation) has no timeline to obtain equilibrium with its new range (Broennimann et al. 2007, Broennimann and Guisan 2008, Bellard et al. 2014. Overall, this violation of the equilibrium assumption within the nonnative range is a common obstacle for modeling range-shifting taxa such as introduced species, requiring careful interpretation of resulting SDM outputs (Guisan and Thuiller 2005, Broennimann et al. 2007, Elith et al. 2010, Bellard et al. 2014. As a result, it is important to map and compare the native and native + nonnative ranges for model performance and range projections to better infer the reliability of invaded range limits. The technical methods and general approach with ensemble modeling applied here help overcome some of the above challenges of nonnative species modeling and may guide future native and nonnative SDM research. These methods include a three-fold advance over previous attempts at predicting Nile monitor SDMs (Dowel et al. 2016) by: 1) using a more robust ensemble SDM approach (via model selection) as opposed to reliance on a single algorithm (e.g., Maxent); 2) incorporating both native and nonnative presence data and in high abundance (507 and 565 locations, respectively, compared to only 71 native presences); and 3) extending habitat selection across the globe to aid in informing proactive mitigation practices. While findings here are useful for informing Nile monitor mitigation specifically, we further recommend these methods be applied to various native and nonnative species where our Nile monitor work may serve as a case study for future ensemble SDM practices. A stronger inference approach based on the comparison of multiple working techniques to form an ensemble model is not only possible using the Biomod2 package in R ), but also highly recommended based on results seen here. We did not construct one ensemble model from all 10 available algorithms, but instead ensembles represented subsets based on optima and replicates of pseudo-absences (Barbet-Massin et al. 2012). This was an important inferential step, as Groups differed in TSS, sensitivity, and specificity scores for all analyses. Rather than constructing one model based on native or invaded range locations, we evaluated two possible combinations of native and nonnative presence data, in combination with SDM variants, to analyze differences in model projections and performance (Broennimann et al. 2008). Finally, we used a quality threshold to accept models and TSS and sensitivity and specificity values to compare and select among alternative ensemble SDMs. The net result of all features above was that we applied an approach consistent with the use of multiple working hypotheses and model selection (Chamberlin 1890, Burnham andAnderson 2002) to obtain very accurate SDMs with high proportions of true presences and absences, as well as highlighted differences in model output based solely on the presence data used to inform the predictive models-a cautionary measure for future SDM work.
Overall, our Nile monitor ensemble SDMs forecast a potentially wide spread for the reptile across the globe should current pet trade practices serve as further conduits of its dispersal. Despite repeated, early warnings (Campbell 2003, Enge et al. 2004, Campbell 2005, Campbell 2007), legal restrictions for this reptile did not begin until ~20 years after its initial introduction to North America in 2009, leading to the establishment of multiple breeding populations in Florida. If legal restrictions are to prevent the spread of this large, generalist carnivore and consequent impacts in biodiverse regions, those restrictions must be pre-emptive rather than post-hoc (Simberloff 2003, Simberloff et al. 2005, Campbell 2007). This goal is even more difficult since Nile monitors (and congeners) are readily available as pets via direct and online sales, including multiple vendors located in the US (Faust 2001, Enge et al. 2004). Based on our SDM results, management options to avoid further introductions into regions of high habitat suitability (where Nile monitors are not native) are to actively prohibit the import, sales, and ownership of Nile monitors (and potentially other Varanus species) before they are introduced (Simberloff et al. 2005, Campbell 2007). Without greater and more consistent trade restrictions, our data strongly suggest that introduced Nile monitors could spread and inhabit extensive regions of the world that currently support high biodiversity, including myriad sensitive species (Myers et al. 2000, Mittermeier et al. 2011, Noss et al. 2015. The well-supported predicted scenarios described here are only prevented now by voluntary actions of monitor owners, many of whom fully understand the predatory capabilities of their pets. However, the vast numbers of Nile monitors in the pet trade mean that chance events (e.g., escapes) and intentional releases are possible (Campbell 2007). As observed in Florida, Nile monitors can soon establish large populations once they are introduced to the wild (Enge et al. 2004, Campbell 2005. The effects of a 2 m, 7 kg, semi-aquatic, generalist carnivore in tropical and subtropical ecosystems of the world should be sufficient to warrant stronger trade restrictions, daunting penalties for release, and enhanced exotic pet amnesty programs (Campbell 2007).