Forecasting insect dynamics in a changing world

Predicting how insects will respond to stressors through time is difficult because of the diversity of insects, environments, and approaches used to monitor and model. Forecasting models take correlative/statistical, mechanistic, and integrated forms; in some cases, temporal processes can be inferred from spatial models. Because of heterogeneity associated with broad community measurements, models are often unable to identify mechanistic explanations. Many present efforts to forecast insect dynamics are restricted to single-species models, which can offer precise predictions but limited generalizability. Trait-based approaches may offer a good compromise which limits the masking of the ranges of responses while still offering insight. Regardless of modeling approach, the data used to parameterize a forecasting model should be carefully evaluated for temporal autocorrelation, minimum data needs


Introduction
Insect ecologists have generally approached forecasting insect dynamics in a piecemeal way, with individual solutions developed as needed to predict vital metrics for a few key species. Yet in an era of profound biodiversity loss, understanding and predicting long-term trends is key to mitigating functional losses [1]. The critical importance of insects to most ecosystems has led to dire projections, but also to considerable scientific debate on the nature of these predictions [2]. At present, most attempts at modeling the insect decline phenomenon fall more accurately into explanatory modeling with implied extrapolation rather than predictive modeling. Indeed, most true forecasts of insect dynamics have focused on individual species of economic or cultural significance, that is, primarily pests and a few well-studied species of conservation concern [3,4]. Given controversies, modeling disagreements, data needs, and natural variability in insect population sizes, a fundamental question emerges: how forecastable are insect populations? (Figure 1)

Forecasting biodiversity dynamics
In forecasting responses of biodiversity to environmental change, a wide variety of modeling techniques are commonly used, including combinations of correlative approaches (i.e. species distribution models), mechanistic approaches (i.e. demography and temperature dependence), and theory [5]. Predicting the behavior of ecological systems is a means to test scientific understanding, yet much of the field of ecology has often focused on explanatory models [6]. Although some authors define ecological forecasting as a strictly quantitative endeavor [e.g. 7], more colloquially in biodiversity science, predictions yielded by modeling and synthesis may be qualitative, directional, or quantitative. Quantitative outputs are desirable from a hypothesis-testing standpoint because these predictions can be explicitly tested [6].
Because biodiversity processes are driven, in part, by environmental variables, the accuracy of the projection will depend on the accuracy and uncertainty of the projection of these covariates [8,9]. The uncertainty surrounding forecasts of biodiversity parameters inherently depends on the uncertainties associated with the information used in the models, including future uncertainties in driving variables, which variables are included, and the underlying model structure. The interaction of these factors ultimately determines how far into the future a model may be used to predict [10]. While understating uncertainty is not desirable, models which incorporate all possible uncertainties may produce unrealistic and unreliable predictions [11].
Explanatory predictions tend to be based in mechanistic hypotheses: they can be used to describe the behavior of individual systems under testable conditions which can then be corroborated by data. Anticipatory predictions are forecasts (also referred to as projections and scenarios): they represent the extension of a hypothesis into the future, assuming a theory holds [5]. Forecasts may be conditional rather than explicitly temporal, that is, their results depend on certain driver conditions occurring, rather than explicitly predicting a given metric at a point in time. For example, models can be used to forecast the likelihood that animals experience mortality during extreme heat events [12], or the locations where invasive insects are most likely to be detected [3]. However, these predictions have an inherent temporal aspect: the implication is that, should the modeled conditions be realized at some time in the future, the projected outcomes would (or could) occur at that point in time. In fact, many forecasts are not necessarily intended to predict the next state of the system under study, but may be used in an anticipatory way, to extrapolate explanatory models to possible scenarios, given uncertainty in driving parameters [5].
Quantifying the change in biodiversity metrics (whether for a single species population or a broader taxon) is difficult because of the data needed to adequately characterize temporal processes [9]. Simply detecting temporal trajectories of population processes (much less extrapolating from them) may require more than a decade of annual data when no underlying structure of the data is assumed, especially in environments with high inherent thermal variability [13]. Given the challenges of simply measuring trends in many biodiversity systems and the peculiarities of insect biology, explicit efforts to forecast the dynamics of a system are relatively rare in insect ecology.

Explaining insect dynamics is challenging
Prediction of insect population responses, even to a single stressor, is not necessarily straightforward [14]. It is likely that, as a general rule, anthropogenic change will negatively affect insect abundance and biodiversity [15]. However, insect herbivore populations may be negatively, neutrally or positively affected by a stressor, depending on the nature of the disturbance [16]. Responses to stressors may have immediate population effects or more idiosyncratic physiological effects [17], and may be mediated by behavioral adaptations [18,19]. Insect biology can present a particular challenge because responses can be non-uniform, even within a single species, at different life stages [20,21]. Specific taxa may be sensitive to lesser-documented stressors [22]. Furthermore, given their rapid generation time, eco-evolutionary dynamics will inevitably affect range and population sizes of insects over time [23]. Ultimately, forecasting insect dynamics relies on an understanding of these complex biologies: they increase the complexity of the task of predicting future dynamics in insect taxa, and undermine researchers in their quest for generality. Due to the complexity of these interactions, some authors have argued that knowledge gaps remain too great and that understanding and predicting insect decline cannot be achieved without directed experimentation [24], while others have argued that extremely large-scale observational approaches are key to understanding and ultimately testing forecasts of insect dynamics [25].

Impediments to forecasting insect dynamics
A major impediment to forecasting biodiversity dynamics in insects is the sheer difficulty in collecting insect species data: the taxonomic expertise needed to process biodiversity samples to species is rare [26]. Even in situations where standardized sampling approaches are employed [e.g. 27], significant lags may hinder the timely production of data, and thus, the viability of forecasts [28]. Another major hindrance to forecasting is that insect biodiversity data may not be collected at the scale of the process being modeled, leading to biased inferences or inflation of observed precision [29]. However, recent advances in automated identification show promise in increasing capacity and speed for insect monitoring data, which may soon increase our ability to meaningfully quantify insect variability across space and time [30].
Trends observed in insect dynamics also depend highly on how they are monitored. Estimates of extent and area of occupancy may differ dramatically when predicted using different data sources [31]. Data may be taken from locations whose attributes make them more inviting to insects, like gardens or preserves [32]. Similar biases are likely present in the data that the community considers the highest quality: much of the long-term, systematic data taken for insects comes from areas under protection [e.g. 33], with less monitoring undertaken in areas under increasing disturbance [34]. Biases may also be present in unstructured and untargeted records (like those produced by community scientists), with less experienced users contributing more observations of larger species with more striking visual traits [35] (Box 1). The increasing reliance on unstructured community science to estimate biodiversity trends may increase the likelihood of misleading results [36,37] (Box 2).
The selection of drivers used in models also plays a profound role in how predictions of insect populations manifest. For instance, using temperature extremes rather than average temperatures in extinction risk models to account for thermal stress results in substantial changes in predictions [38]. An additional element of complexity arises from the non-uniformity of drivers of insect biodiversity trends through both time and space (Box 3). Finally, it is well established that species are affected unequally by change: many species are negatively impacted by human activities, but a few thrive under the conditions of continuous disturbance of human-altered environments [39]. This 'winners and losers' dynamic presents a barrier to generalizability when it comes to selecting metrics that authentically capture the broad scale of the insect decline problem without masking the details through unwarranted statistical lumping of very different groups of organisms.

Predictability of different metrics
The question of whether forecasting insect dynamics is possible depends greatly on the specifics of the question being asked, on the information available to address it, and, indeed, on the inherent predictability of the biodiversity metric or property to be modeled [40]. In most cases, the reliability of forecasting predictions decreases with time, while it increases with the amount of historical data informing the predictions [41,42]. However, the inherent predictability, and the scale at which prediction can occur, will ultimately dictate the limitations on the accuracy of a forecast.
Aggregate and derivative measures may be more accurately predicted than simpler metrics; however, this comes at a cost to characterization of drivers and precision of estimates [28]. Whereas forecasting models for single-species abundance or distribution are common and offer detailed mechanistic explanations [e.g. 3], whole-community metrics like diversity, evenness and richness may provide a more holistic picture of insect well-being. But these metrics may also mask unequal responses across a community, particularly in groups of insects with traits that cause widely divergent responses to environmental conditions [43]. The temporal grain of the underlying data and the desired predictions inevitably interact with the selection of the metric, with longer time spans (i.e. inter-annual variation vs intra-annual variation) representing both different processes and the integration of more short-term underlying processes, but metrics that can be used for nearer-term forecasts are more inherently testable [44]. Because most studies have a particular focus, most attempts to forecast insect well-being as a whole suffer from phylogenetic, functional, spatial and temporal biases; it has been argued that to optimize these broad-scale predictions, standardized monitoring schemes focusing on net abundance and biomass are needed to capture authentic estimates of these processes [45].
These more aggregated measures for biodiversity are often used to imply more general future predictions, or provide qualitative predictions associated with a management scenario [40]: for example, a recent study found that habitats with more rare plant species supported more rare insects, regardless of habitat size [46], implying that restoration efforts that focus on improving plant richness rather than protecting more habitat would result in better outcomes for insect richness. However, other authors caution against using richness as a measure for biodiversity change because this metric is highly sensitive to plot size, making it unreliable for measuring, much less predicting, biodiversity change [47].
Functional and trait-based approaches to measuring biodiversity processes may yield some more generalizable, if often qualitative, predictions that offer a workable compromise between the highly stochastic species-focused metrics and the limited mechanistic explanation of all-insect-level metrics [48]. For instance, functional trait approaches to measuring biodiversity may provide generalities beyond taxonomic classification: climatic niche breadth was associated with degree of range shifts under climate conditions, and this association held in both vertebrates (birds) and invertebrates (moths and butterflies) across a latitudinal gradient in Europe [49]. Thus, these approaches offer a viable compromise that may provide broad generalizability in prediction without the cost to mechanistic explanation, and some traits may be more conducive to building viable predictions than others [e.g. 50].

What tools can we use and where are they appropriate?
Several classes of tools hold promise for forecasting insect populations, depending on the desired scale and precision of predictions. A subset of the most commonly used current approaches is presented here.

Correlative/statistical approaches
Often, projections in ecological systems are based on linear trends applied to time series data [9,13,51]. This is often statistically inappropriate because of the underlying autocorrelation structure of biodiversity metrics (i.e. the current state of the metric in question is dependent on both the environmental drivers and the previous state of that same metric); however, these linear trends are often essential for communicating change over time and provide more intuitive model outputs, such as expected change in population size. Thus, we can evaluate the length of time needed to establish a linear trend in the system under study, given the actual structure of historical data [52,53], but more importantly, it is essential that entomologists use models which statistically account for this underlying structure in their estimates of rates of change. Weiss et al. (2023) provide an accessible approach for correcting annual data using random year intercepts in generalized linear models (GLMs). Their approach was able to produce more conservative, less biased estimates of rates of change for ground beetle abundance over a 24-year study, and also demonstrated how sensitivity analysis could be applied to identify influential observations [54].
In data-rich systems where there is limited functional understanding (e.g. data produced by large distributed monitoring networks), other tools can be employed. Generalized additive modeling (GAM) approaches can be used where the shape of the relationship between variables is unknown: this suite of tools allows the estimation of smoothing functions between variables of interest, allowing predictions to be 'data-led' and not necessarily reliant on foreknowledge of the mechanistic explanation of their relationships [55]. For example, GAMs were used to explain patterns in carabid beetle richness relative to climatic variables and to forecast the distribution of biodiversity hotspots, and this information was used to develop conservation recommendations for a protected temperate steppe area in northwestern China [56]. Machine learning models such as artificial neural networks may be used to take this data-driven approach further in cases where system knowledge is limited, making it possible to forecast systems with very limited knowledge of their ecology. For example, an early warning system for rice gall midge was developed using an autoregressive neural network approach on time series data documenting abundance of the midge, and the model outperformed more typical statistical approaches because the method does not assume linear relationships in the data [57].
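The autocorrelation concern raised in this section can be checked with a few lines of code before a linear trend is reported. The sketch below is not the random-year-intercept GLM approach cited above; it is a simpler diagnostic that fits an ordinary least-squares trend, measures lag-1 autocorrelation in the residuals, and applies the common approximation n_eff = n(1 - r1)/(1 + r1) for the effective number of independent observations. The 24-year series, its trend, and all function names are hypothetical and chosen only for illustration.

```python
import math

def fit_linear_trend(y):
    """Ordinary least-squares fit of y on time steps 0..n-1.

    Returns (slope, intercept, residuals)."""
    n = len(y)
    x_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxx = sum((i - x_mean) ** 2 for i in range(n))
    sxy = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    residuals = [v - (intercept + slope * i) for i, v in enumerate(y)]
    return slope, intercept, residuals

def lag1_autocorrelation(values):
    """Sample lag-1 autocorrelation of a sequence."""
    mean = sum(values) / len(values)
    dev = [v - mean for v in values]
    denom = sum(d * d for d in dev)
    if denom == 0:
        return 0.0
    return sum(dev[i] * dev[i + 1] for i in range(len(dev) - 1)) / denom

def effective_sample_size(n, r1):
    """Approximate number of independent observations under AR(1) errors.

    Only positive autocorrelation shrinks the sample; r1 is capped below 1."""
    r1 = max(0.0, min(r1, 0.99))
    return n * (1 - r1) / (1 + r1)

# Hypothetical 24-year series of log abundance: a decline of 0.05 per year
# plus a slow multi-year cycle that induces autocorrelated residuals.
log_abundance = [5.0 - 0.05 * year + 0.5 * math.sin(year / 3) for year in range(24)]

slope, intercept, residuals = fit_linear_trend(log_abundance)
r1 = lag1_autocorrelation(residuals)
n_eff = effective_sample_size(len(log_abundance), r1)
print(f"slope={slope:.3f}, lag-1 autocorrelation={r1:.2f}, effective n={n_eff:.1f}")
```

When the effective sample size is much smaller than the number of survey years, confidence intervals from a naive linear regression are likely too narrow, which is one reason to prefer models that represent the temporal structure explicitly.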

Mechanistic and physiological population models
Mechanistic and physiological population models come in a wide variety of scales. In applied entomology, short-term forecasts of insects are commonly constructed, usually from mechanistic models describing the phenology, population growth and immediate environmental responses of a particular species or complex [58]. These models often include spatially explicit elements to indicate risk, and may include management information (i.e. economic injury levels, action thresholds), often providing these forecasts at a weekly interval, aligned with how farmers and foresters make pest management decisions [59]. Near-term forecasting models may be extended (e.g. to the length of a growing season) for a specific population of well-studied insects using models that account for many of the major parameters; however, these models may have very limited transferability if they incorporate site-specific information and highly specific dynamics [e.g. 60]. Yet, mechanistic models can be used to gain more general insights when applied to broader groups using trait-based approaches. Mechanistic modeling essentially leverages very specific understanding of insect ecophysiological responses to predict higher-level phenomena in insect populations, and can be used under longer-term scenarios where statistical extrapolations are likely to break down [61] or to explicitly link physiological traits to ecological theory [62]. For instance, thermal sensitivity traits were used to forecast insect community responses under future climate scenarios: these analyses suggested greater extinction risk among insects in tropical environments without rapid adaptation or migration [63]. Mechanistic approaches can be used to predict future selection patterns in plastic or variable traits within their ranges: e.g. selection for lighter wing colors to avoid overheating in warming climates [64].
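As a small illustration of the phenology models described in this section, the sketch below uses the simple-average degree-day method: development accumulates each day in proportion to how far the mean temperature exceeds a lower developmental threshold, and an event (such as adult emergence) is forecast for the day the accumulated total reaches a required sum. The base temperature, the degree-day requirement, and the temperature series are all hypothetical values, not parameters for any real species.

```python
def daily_degree_days(t_min, t_max, base_temp):
    """Degree-days for one day using the simple-average method."""
    return max(0.0, (t_min + t_max) / 2 - base_temp)

def predict_event_day(daily_temps, base_temp, required_dd):
    """Return the 1-based day on which accumulated degree-days first reach
    required_dd, or None if the requirement is never met."""
    total = 0.0
    for day, (t_min, t_max) in enumerate(daily_temps, start=1):
        total += daily_degree_days(t_min, t_max, base_temp)
        if total >= required_dd:
            return day
    return None

# Hypothetical season: 10 cool days followed by 20 warm days (min, max in deg C).
baseline = [(5, 15)] * 10 + [(10, 20)] * 20
# Hypothetical warming scenario: every day is 2 deg C warmer.
warmer = [(t_min + 2, t_max + 2) for t_min, t_max in baseline]

BASE_TEMP = 10      # assumed lower developmental threshold (deg C)
REQUIRED_DD = 50    # assumed degree-day requirement for the event

print(predict_event_day(baseline, BASE_TEMP, REQUIRED_DD))  # 20
print(predict_event_day(warmer, BASE_TEMP, REQUIRED_DD))    # 15
```

The same structure shows why such models can transfer poorly: the forecast depends entirely on threshold and requirement values that are typically estimated for one species, and often one population.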

Integrating heterogeneous data into forecasting
Integrated population modeling is an approach commonly used in wildlife conservation, where taxa under management, such as game species, are monitored using varied survey protocols, at different life stages, across different parts of their range, creating a highly heterogeneous but very rich set of observational data [65]. Animals monitored across their ranges or lifecycles often yield discrepant patterns which can be difficult to resolve in isolation, often resulting from factors such as asynchronies between metapopulations and density-dependent demographic effects [66]. This approach allows researchers to identify which data and monitoring strategies provide the most informative estimates [67]. It is generally applied to well-monitored species with complex life histories, but may be used to estimate and forecast a wide variety of metrics regarding that population at various points in its lifecycle [66]. This integrated modeling approach has recently been extended to integrated community occupancy modeling, which allows the integration of single-species distribution models and hierarchical community occupancy models to forecast biodiversity dynamics of bird communities [68].

Inferring temporal processes from spatial approaches
Spatial processes may serve as a proxy for temporal processes in developing forecasts for insect decline, or as part of direct experiments to identify drivers that might be managed through time [69]. Distribution models can be used to estimate range size and occupancy to prioritize protections and listings of species with contracting or vulnerable populations within their ranges, based on projected extinction risks [70]. Spatial approaches may provide a means for forecasting other vital parameters in cases where abundance data are unavailable. For instance, an extinction risk index was developed based on range size and was used to examine how species traits like thermal limits and body size affect extirpation risk in 600 Odonata species, using occurrence data [43].
Future ranges forecast through distribution modeling can be refined by combining this approach with dynamic evolutionary models that account for the genetic potential of the species to respond to changes in their environment [71], and may provide anticipatory predictions that go beyond interactions with the abiotic environment.Range dynamics models can be further refined beyond the niche-implicit aspects typical to species distribution models by the superimposition of process-explicit, mechanistic models (for organismal physiology, biotic interactions, and demography), helping to mitigate extrapolation issues created by distribution models based on correlative characteristics alone [72].

Iterative forecasting methods
With all methods described above, iterative, near-term forecasting approaches can be applied. In this case, the forecasts are made repeatedly and updated as new data become available, effectively re-running the model for each new system state as it is realized [40]. This approach allows explicit testing of the performance of predictive models, allows multiple competing models to be evaluated in real time, and provides insights into situations where relationships between drivers may not hold. It is currently in use in the NEON Ecological Forecasting Challenge, a community-driven scientific networking activity designed to bring about scientific interest in advancing approaches to ecological forecasting [7]. The project challenges users to develop forecasting models to predict the next state of data collected by the National Ecological Observatory Network [73]. Among the challenges, users have been tasked with developing models for the richness and abundance of carabid beetles collected in pitfall traps at all the sites [7]. At time of writing, the challenge was still ongoing.
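The iterative workflow can be sketched in a few lines: at each time step, every competing model forecasts the next observation using only the data available so far, the forecast is scored once the observation arrives, and the history is extended before the next round. The example below is a minimal illustration with two deliberately naive models (persistence and long-term mean) and a made-up series of annual counts; it is not the scoring procedure used by any particular forecasting challenge.

```python
def persistence_forecast(history):
    """Forecast that the next value equals the most recent observation."""
    return history[-1]

def mean_forecast(history):
    """Forecast that the next value equals the mean of all past observations."""
    return sum(history) / len(history)

def iterative_evaluation(series, models, min_history=3):
    """Score each model by mean absolute error over one-step-ahead forecasts,
    re-forecasting each time a new observation becomes available."""
    errors = {name: [] for name in models}
    for t in range(min_history, len(series)):
        history = series[:t]      # only data observed before time t
        observed = series[t]      # the newly realized system state
        for name, model in models.items():
            errors[name].append(abs(model(history) - observed))
    return {name: sum(errs) / len(errs) for name, errs in errors.items()}

# Hypothetical annual counts of a focal taxon at one site.
counts = [10, 12, 11, 13, 15, 14, 16, 18]
models = {"persistence": persistence_forecast, "mean": mean_forecast}

scores = iterative_evaluation(counts, models)
for name, mae in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: mean absolute error = {mae:.2f}")
```

Because every forecast is made before its target is seen, the comparison is a genuine out-of-sample test, and simple baselines such as these give a reference point against which more elaborate models can be judged.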

Conclusions
Forecasting insect populations, as a whole, with simultaneous great generalization and precision is unlikely due to the diversity of insects, ecologies, life histories, behaviors and environments in which they occur, and also to the diversity of metrics, data sources, inherent biases in monitoring strategies, and tools available. However, several approaches, including integrated population monitoring for single-species predictions and near-term iterative approaches to testing forecasts, hold promise for developing novel insights into drivers, particularly when underlying data are classified using relevant species traits. Yet, broader generalities may not be needed when speaking of biodiversity trends as a whole: it is well established that rates of anthropogenic change in the environment generally have negative consequences for all but a handful of species that have traits that favor disturbed environments and tend to be associated with humans. Because of the non-uniqueness of models, it is likely that the quest for the 'best' (i.e. most precise) model to inform management is both ill-informed and potentially dangerous [74]. Although a nuanced approach to predicting insect responses to stressors is desirable from a scientific and management standpoint, core conservation and policy efforts do not require this level of detail in order to enact positive changes for insects more generally [75,76].

This paper examines a principal challenge in macroecology: when studying ecology at broad scales, this inherently means integration of heterogeneous data. The authors examine common barriers to data integration and provide instructive commentary on possible approaches to manage data integration issues.

This perspective piece highlights a particularly insidious bias associated with insect biodiversity monitoring: insect populations are monitored in habitats that are "good" for insects, like conservation areas. This ignores the biodiversity dynamics occurring in impacted areas, even while human activities encroach into more and more relatively pristine habitats. This effect almost certainly has compromised our ability to measure insect dynamics across broader scales.

Using natural history collection data (and similarly, data produced by community science surveys like iNaturalist) in explanatory and forecasting models is a subject of ongoing concern in the quantitative ecology community because of the unstructured nature of these data [36,77]. Yet, one of the principal challenges in understanding and predicting insect decline is the lack of historical baseline data [2]. If used with caution, these data represent an unprecedented resource for understanding how insect communities have changed over time [78]. A technique that could capitalize on this data resource is to use a community of specimens instead of single species from within the collection data, where multiple species with a similar probability of being captured are examined together, using total captures across the community to control for sampling effort over time. This approach allows relative, if not absolute, abundance, and thus long-term responses to historical drivers, to be evaluated [79]. Similarly, researchers might use detection data of similar species within a given species' expected range, at a given date and time, to infer non-detection for the construction of occupancy models [80]. Furthermore, these records can be brought into integrated modeling approaches which have the ability to couple these long-term but unstructured data with contemporary experimentally produced data in a single analytical framework [77].
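The community-of-specimens idea described above can be illustrated with a short calculation: the focal species' records in each period are divided by the total records for a group of species assumed to be collected with similar probability, so that changes in collecting effort cancel out. The sketch below is a minimal version of that logic; the function name, the species group, and all counts are hypothetical.

```python
def relative_abundance(focal_counts, community_counts):
    """Focal-species captures as a fraction of total community captures.

    Both arguments map a period (e.g. a year) to a count. Periods with no
    community captures are skipped because effort cannot be estimated."""
    result = {}
    for period, total in sorted(community_counts.items()):
        if total > 0:
            result[period] = focal_counts.get(period, 0) / total
    return result

# Hypothetical specimen counts from a collection database.
focal = {1950: 10, 1980: 20, 2010: 10}
# Hypothetical totals for a group of species with similar capture probability,
# including the focal species; totals rise as collecting effort increases.
community = {1950: 100, 1980: 400, 2010: 500}

shares = relative_abundance(focal, community)
for period, share in shares.items():
    print(period, f"{share:.2f}")
```

In this made-up example the raw focal counts double between 1950 and 1980, but the effort-corrected share halves, showing how uncorrected collection records could suggest an increase where the relative trend is a decline.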

Box 2: Tool Highlight: Evaluating bias in time series
Because of the high degree of temporal and spatial autocorrelation present in occurrence and abundance surveys, Boyd et al. [81] developed ROBITT: Risk Of Bias In Studies of Temporal Trends. ROBITT is a tool which provides a structured approach for a researcher to essentially 'interview' their data in the context of bias assessment, focusing on explicitly defining the questions, scales, data reliability and provenance, as well as any apparent geographical, environmental and taxonomic biases. This tool is especially useful for assessing limitations of data from unstructured surveys and how these biases might manifest in any projection models [81].

Box 3: Case study: Forecasting the dynamics of complex insects
In addition to different species being sensitive to different disturbances through their varied biologies, different stressors may act on populations at different times, and one stressor may predispose a species to sensitivity to another. In the iconic and well-studied monarch butterfly, a number of conditions have been linked to the dynamics of this species, including pesticide use in breeding grounds, unfavorable conditions at migratory stopover points, or loss of integrity of overwintering sites [82]. While time-series methods may be used to identify periods of change in internal rules of population regulation, providing insight into when the most changes have occurred historically [83], a hierarchical modeling approach was used to integrate population data across the monarch lifecycle and isolate the effects of these potential drivers, disentangling those with historical effects from those currently driving the dynamics of this species [84]. This approach revealed that, in recent years, breeding season temperatures played a larger role in monarch dynamics than previously thought: when it was used in concert with climate projections to forecast future populations of the species, it highlighted particular vulnerability of monarch breeding in parts of the US Midwest experiencing higher rates of temperature increase [4].

systematic literature review of forecasting and predictive models for cyanobacteria blooms in freshwater lakes.
EV, Sistri G, Bonifacino M, Menchetti M, Pasquali L, Salvati V, Balletto E, Bonelli S, Cini A, Portera M, et al.: Discard butterfly local extinctions through untargeted citizen science: the interplay between species traits and user effort. In Review; 2023.
36. Boyd RJ, Powney GD, Pescott OL: We need to talk about nonprobability samples.

Figure 1. Core elements required to forecast insect dynamics.
Researchers must consider the research question and context to select appropriate data, metrics, models and validation approaches to be used for forecasting insect dynamics. Figure constructed using Canva.