Time to better integrate paleoecological research infrastructures with neoecology to improve understanding of biodiversity long-term dynamics and to inform future conservation

Anthropogenic pressures are causing a global decline in biodiversity. Successful attempts at biodiversity conservation require an understanding of biodiversity patterns as well as the drivers and processes that determine those patterns. To deepen this knowledge, neoecologists have focused on studying present-day or recent historical data, while paleoecologists usually study long-term data through the composition of various biological proxies and environmental indicators. By establishing standard protocols or gathering databases, research infrastructures (RIs) have been instrumental in fostering exchange and collaboration among scientists within neoecology (e.g. Global Biodiversity Information Facility or National Ecological Observatory Network) and paleoecology (e.g. Paleobiology Database, Neotoma Paleoecology Database or European Pollen Database). However, these two subdisciplines (and their RIs) have traditionally remained segregated, although both provide valuable information that, combined, can improve our understanding of biodiversity drivers and underlying processes, as well as our predictions of biodiversity responses in the future. For instance, integrative studies between paleo- and neoecology have addressed the global challenge of biodiversity loss by validating climate and ecological models, estimating species fundamental niches, understanding ecological changes and trajectories, or establishing baseline conditions for restoration. Supporting and contributing to research infrastructures from both paleo- and neoecology, as well as their further integration, could boost the amount and improve the quality of such integrative studies. We argue that this will improve our capability to anticipate the impacts of global change and biodiversity losses.
To boost such integration and illustrate our arguments, we (1) review studies integrating paleo- and neoecology to address the challenges of global change, (2) describe RIs developed in paleoecology, and (3) discuss opportunities for further integration of RIs from both disciplines (i.e. paleo- and neoecology).


Introduction
The pace of global change has accelerated since the 1950s, and society currently faces major challenges at the global scale (Steffen et al 2005). In fact, humans are potentially causing the sixth mass extinction in the history of life on Earth (Barnosky et al 2011, Ceballos et al 2015, 2017), and biodiversity loss has been recognized as one of the most relevant challenges that humanity must face in the coming decades (Díaz et al 2006, European Commission 2011, Gardner et al 2013). Anticipating those changes, especially those affecting biodiversity, has become one of the main goals for scientists from disparate disciplines such as climatology, geology, and/or ecology (Vitousek 1994, Bonan 2008, Heller and Zavaleta 2009, Allan et al 2015, Chaudhary and Mooers 2018). However, this global challenge, like many others, is a wicked (Rittel and Webber 1973, DeFries and Nagendra 2017) and multifaceted problem that requires many cooperative efforts if it is to be addressed (Whyte and Thompson 2012). Solving this environmental challenge will require an integrative study of several different interconnected components of the Earth system, which in turn will require interdisciplinary approaches, methods, resources, and efforts.
One of the most intriguing and elusive facets of the global change challenge is understanding the linkages between temporal scales when dealing with biodiversity loss and ecosystem degradation (Bunnell and Huggard 1999, Azaele et al 2015). Many questions remain unsolved regarding this issue. To what extent does the past configuration of landscapes affect the current conservation status of species (Kissling et al 2012, Eiserhardt et al 2015)? How far into the future should we expect ecological legacies to be influential (Moorhead et al 1999)? To what extent can we use the structure of past ecosystems as analogs for present ones when we try to restore a degraded ecosystem (Suding et al 2004, Perring et al 2015, Wingard et al 2017)? Can we use hindcasting methods to test the predictive ability of ecological forecasting models under no-analog environments (Maguire et al 2015, Fitzpatrick et al 2018)? Can we anticipate the effects of climate change on biodiversity by understanding past events of biodiversity loss (Willis et al 2010, Barnosky et al 2011, Willis and MacDonald 2011)? What triggers abrupt and non-linear regime shifts in ecosystems (Ratajczak et al 2018)? Rather than providing an exhaustive list of pending work, these questions illustrate the importance of considering different time scales to understand and avoid biodiversity loss and to support conservation.
Answering these questions, however, requires a deep understanding of biodiversity patterns (e.g. species distributions, community composition and assembly, or macroecological patterns), drivers of change (e.g. geology, climate, fire, or human-induced landscape transformations) and processes that determine those patterns. Neoecologists have traditionally focused on studying current or recent historical processes (intra-annual to decadal or centennial) to address these questions, while paleoecologists have usually studied long-term processes (from decadal to millions of years) through the fossil record. Although this distinction between paleo- and neoecology might be over-simplistic and there are, in fact, multiple exceptions (see Rull 2010, Reitalu et al 2014, Jackson and Blois 2015), the two fields have traditionally been segregated for multiple and diverse reasons (e.g. differences in the nature of samples, different jargon, or different journals; see Rull 2010, Reitalu et al 2014, Jackson and Blois 2015). Nevertheless, ecological elements and processes in the past, present and future are interconnected in a spatio-temporal continuum (Delcourt and Delcourt 1988, Turner et al 1989, Reitalu et al 2014). Therefore, both disciplines provide valuable information at different and complementary time scales that, combined, can improve our understanding of biodiversity drivers and underlying processes or improve predictions of biodiversity responses in the future (Rull 2010, Blois et al 2013, Williams et al 2013, Jackson and Blois 2015, Maguire et al 2015). Thus, further integrating these two perspectives is a necessary step towards understanding and anticipating potential ecological changes.
Research Infrastructures (RIs) may play a critical role in bridging the gap between both disciplines. RIs are tools specifically designed to enhance science by providing a wide range of services to scientific communities, from physical infrastructures (experimental sites or facilities) to computational infrastructures (databases and data portals), as well as entities that define and manage standard protocols and/or universal identifiers of samples. Although this term has different meanings around the world, the European Commission has created a definition that properly gathers most of the 'traits' of an RI (European Commission 2017): 'research infrastructures are facilities, resources and related services that are used by the scientific community to conduct top-level research in their respective fields and cover major scientific equipment or sets of instruments; knowledge-based resources such as collections, archives or structures for scientific information; enabling information and communication technology-based infrastructures such as grid, computing, software and communication, or any other entity of a unique nature essential to achieve excellence in research'. This definition might include many initiatives from both neoecology (e.g. Long Term Ecological Research networks, LTER; National Ecological Observatory Network, NEON; or the Global Biodiversity Information Facility, GBIF) and paleoecology (e.g. Neotoma Paleoecology Database, Neotoma; or Earth Life Consortium). By documenting data and protocols, improving accessibility to data and analysis, as well as by exchanging and connecting databases and services, RIs might have a primary role in establishing collaborations within and between the two fields (Peters et al 2014, Bonet 2016, RISCAPE-project 2017). In our experience, while some of the neoecology RIs are widely known and used by scientists from different fields (including paleoecology), paleoecology RIs remain comparatively less known and are mostly used only by paleoecologists. However, paleoecology RIs are crucial to reveal insights about the long-term response of biodiversity to environmental and climate changes in the past.
In this manuscript, we aim to encourage integration between paleo- and neoecology through the integration of their RIs. Given that paleoecological RIs are comparatively less well known and less used, we focus on introducing paleoecology and its RIs to a broader audience. To do so, we provide a non-exhaustive review of fruitful studies that have successfully integrated paleo- and neoecological data.
Using these examples, we aim to describe some cooperation threads between these fields that could be useful to determine the present and future impacts of global change. Then, we describe the past and current initiatives in the paleoecological community to build RIs (i.e. to foster data sharing and collaborative studies), discuss some of their main opportunities and limitations, and suggest further steps to improve integration of paleo- and neoecology through RIs. Additionally, for those unfamiliar with paleoecology, we provide an overview of the nature of paleoecological data and their particularities (Box Paleoecological record) that should be taken into consideration when designing, adapting, connecting, integrating, and/or using RIs that host paleoecological specimens, data, models, or analytical procedures.

Integrating ecology and paleoecology: overview and needs
The importance of integrating paleo- and neoecology has been recognized since the beginning of the 20th century (Clements 1924, Foster et al 1990, Schoonmaker and Foster 1991, Willis and Birks 2006, Willis et al 2007, Rull 2010, Reitalu et al 2014, Jackson and Blois 2015). Indeed, both disciplines are increasingly exchanging theories (e.g. community assembly theories/rules; Jackson and Blois 2015), concepts (e.g. almost all niche-related concepts, such as realized and fundamental niche or disequilibrium; Veloz et al 2012, Nogués-Bravo et al 2016, Saarinen and Lister 2016), and/or tools (e.g. species distribution models, time series analyses or multivariate approaches). By combining elements from paleo- and neoecology, these integrative studies provide insightful information to understand long-term ecological processes and dynamics.
The relationship between biodiversity and climate has been scientifically recognized and studied since Humboldt's foundational works (Von Humboldt and Bonpland 2009). This relationship is at the heart of biodiversity responses to global change. Indeed, anticipating those responses increasingly relies on models to predict climate in the future (Global Circulation Models and Regional Climate Models, GCMs and RCMs, respectively; Navarro-Racines et al 2020). Because model predictions of future conditions cannot be validated directly, the models are frequently hindcasted to past conditions and then validated with paleoecological data (both fossils and environmental proxies). This sort of validation has been, and will be, instrumental in intercomparison projects to quantify model uncertainties and to improve their performance (Pinot et al 1999). Similarly, paleoecological information can be used to validate ecological models used to predict biodiversity responses to global changes (Maguire et al 2016, Cheddadi et al 2017). These sorts of models are usually calibrated using neoecological data and then projected into future conditions using climate simulations. These models can also be hindcasted using paleoclimate simulations and then validated against paleoecological records (Alba-Sánchez et al 2015). These validations can be used to select the best models to calculate future predictions (e.g. Macias-Fauria and Willis 2012) or to quantify model uncertainties (e.g. Garrido-García et al 2018).
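The hindcast-and-validate workflow described above can be sketched in a few lines of Python. This is a minimal illustration only, not the method of any of the cited studies: it assumes a hypothetical one-variable Gaussian climate envelope as the ecological model, invented temperature values, and a rank-based AUC as the validation score. All function names and data are illustrative assumptions.

```python
import math

def fit_envelope(temps_at_presences):
    """Calibrate a toy climate envelope: mean and SD of temperature at modern presences."""
    n = len(temps_at_presences)
    mean = sum(temps_at_presences) / n
    var = sum((t - mean) ** 2 for t in temps_at_presences) / n
    return mean, math.sqrt(var)

def suitability(temp, mean, sd):
    """Gaussian suitability score in (0, 1]; equals 1 at the envelope mean."""
    return math.exp(-0.5 * ((temp - mean) / sd) ** 2)

def auc(presence_scores, background_scores):
    """Rank-based AUC: chance that a presence site outscores a background site (ties count 0.5)."""
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))

# Step 1: calibrate on (hypothetical) modern temperatures at sites where the taxon occurs.
modern_presence_temps = [8.0, 9.5, 10.0, 11.0, 12.5]
mean, sd = fit_envelope(modern_presence_temps)

# Step 2: hindcast, i.e. score (hypothetical) simulated paleotemperatures at
# fossil sites with the taxon and at background sites without fossil evidence.
fossil_presence_temps = [9.0, 10.5, 11.5]
background_temps = [2.0, 3.5, 18.0, 20.0]
presence_scores = [suitability(t, mean, sd) for t in fossil_presence_temps]
background_scores = [suitability(t, mean, sd) for t in background_temps]

# Step 3: validate the hindcast against the fossil record.
hindcast_auc = auc(presence_scores, background_scores)
print(f"Hindcast AUC: {hindcast_auc:.2f}")
```

A high hindcast AUC for one candidate model and a low one for another would be one way to rank models before projecting them into the future. Because fossil data are presence-only, the background sites here are not true absences, so the score should be read with that caveat in mind.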
Box: Paleoecological record: nature and structure of the data
Paleoecologists study macro- and microscopic fossils (e.g. shells, bones, spores, plant tissues, pollen, or resistant structures of unicellular organisms), in combination with paleoenvironmental indicators (e.g. sedimentological, geochemical, or tree-ring records), from a particular location (Maguire et al 2015) to understand interactions between organisms and between organisms and their environment in the past.
Records are usually derived from sedimentary deposits with favorable conditions for preservation of biological samples (e.g. lake bottoms, peat bogs, tar pits, biogenic accumulations such as middens). However, they can also be found in archeological deposits or open-air settings. Both macro- and microscopic fossils can provide information about the occurrence (presence, but not absence) and/or relative abundance of a wide range of organisms (table 1). The strength of fossil data lies in their ability to document biological and ecological patterns on time scales of decades to millions of years; in some cases, as series of continuous records (e.g. microfossils from sedimentary deposits like lakes and marine cores), in others, as discontinuous samples in time (e.g. plant or vertebrate macrofossil remains in discrete alluvial deposits). For instance, continuous deposits (e.g. diatoms, dinoflagellates, pollen and fungal spores), as well as rodent middens, deposits in caves, tar pits, and shallow marine deposits (with marine invertebrates) have been used to study dynamics of ecological communities (Faegri and Iversen 1975, Odgaard 1999, Maguire et al 2015). Palynology stands out in this regard since it often provides continuous information about the relative abundances of certain taxa of land plants (see taxonomic biases below). Note that abundances from the paleoecological record are usually relative abundances. Hence, they might be difficult to compare with abundance data from neoecological studies.
For certain taxa it is possible to estimate their continuous occurrence by using indirect indicators from other continuous paleo records (e.g. herbivores from dung fungal spores; Gill et al 2012, Perrotti and Van Asperen 2019). The study of dynamics for taxa with a discontinuous fossil record requires pooling information from different time periods.
Most fossil records are multivariate, indicating the relative composition and/or the co-occurrence of multiple species in a particular region, allowing both single- and multiple-taxa studies (Maguire et al 2015). Although single-taxon fossils (e.g. many macrofossils) are also frequent, they can still be used to infer community composition (e.g. allowing analysis of plant and animal communities altogether) by combining data from different taxa in a particular region and time period (Magri and Palombo 2013, Saarinen and Lister 2016). The increasing availability (Magri and Palombo 2013, Saarinen and Lister 2016) and accessibility (Saarinen and Lister 2016) of fossil data enables pooling information for multiple taxa from different locations and time periods, which strengthens the ability of the fossil record to study multivariate biodiversity patterns through time.
Like all ecological data, fossil records are potentially affected by several types of uncertainty (namely temporal, taxonomic, and taphonomic; Maguire et al 2015). For instance, taphonomic uncertainties arise from the geological processes that biological remains undergo from their origin until their fossilized forms are found (movement of the remains, sedimentation and burial, etc.). Despite these uncertainties, it is possible to make insightful reconstructions of the variability of past landscapes and environments, especially if key features of the fossil record are assessed, quantified, and documented during the analytical process. New developments in proxy-system modelling encourage documenting each step of the analytical process (i.e. sampling, processing, analyzing, dating, and identifying the samples) so that any uncertainties can be incorporated in either qualitative or quantitative ways (e.g. Jackson 2012, Evans et al 2013, Seddon et al 2019). In fact, paleoecology has a long tradition of such practices.
Temporal uncertainty is usually high relative to most neoecological observations and most frequently arises from the fact that the age of fossil samples needs to be inferred. Sometimes the fossil samples are dated directly by different dating techniques (e.g. radionuclides of C or U/Th, amino acid racemization, or luminescence dating) depending on the nature of the sample and/or the age. Each technique has its own assumptions and potential biases, which lead to different levels of uncertainty. Age estimates can also be indirect. In these cases, such as pollen grains from sediments, fossils are not directly dated, but age is inferred indirectly through age-depth models based on certain control points (Blaauw 2010). The use of such models implies an increasing level of uncertainty (Blaauw and Christen 2011, Blois et al 2011). Sedimentation rates may change through time, affecting the accumulation rates and thus producing irregular time intervals in sediment cores. In addition, developing reliable age models for the interface between paleo and modern systems might be error prone (Tylmann et al 2016, Arias-Ortiz et al 2018).
Spatial uncertainty in the fossil record is generally recognized by the fact that the absence of fossil evidence does not indicate the absence of a taxon, because there might not have been appropriate conditions for fossilization and/or preservation (Laplana and Sevilla 2013). Although this challenge of presence-only data is also common in many present-day biodiversity datasets, the additional uncertainties and limitations of paleoecological datasets make it more difficult to circumvent. Furthermore, fossil samples might be affected by taphonomic processes due to erosion, topographical changes, tectonic plate dynamics and/or animal and human action (Varela et al 2011, Martín-Perea et al 2019).
Fossil remains can often be incomplete or degraded, making identification difficult. In other cases, like pollen grains, fossil remains are identified at higher taxonomic levels (e.g. genus or family; Rull 2012) because they are morphologically similar or do not provide enough information to distinguish between taxonomic units (Alba-Sánchez et al 2010). Other taphonomic uncertainties arise because different organisms fossilize and preserve differently, leading to a positive bias towards those groups with better preservation (Behrensmeyer et al 2000). Furthermore, pollen grain counts do not linearly correlate with vegetation abundance. For instance, Pinus pollen can disperse over very long distances before deposition, blurring the signal of local taxon occurrence (Bunting et al 2004, Broström et al 2016, Hicks 2001, Lisitsyna et al 2012, Goring et al 2013). Factors like weather, pollen morphology, depositional basin size, and especially pollen productivity affect such uncertainty (Davis 2000, Bunting et al 2004, Sugita 2007b, Sugita 2007a, Hellman et al 2009, Bunting et al 2013).

Fossil records provide the evidence necessary to both infer and study changes in species distribution and/or community composition (e.g. Foster et al 1990, Schoonmaker and Foster 1991, Davis 1994, Huntley 1996, Jackson and Overpeck 2000, Williams and Jackson 2007, Rull 2010, Ostling 2012, Jackson and Blois 2015). This information has been used to test ecological theory, such as niche stability (Veloz et al 2012), or to test for community assembly rules (Blois et al 2014). For instance, Veloz et al (2012) compared the climate distributions (based on paleoclimate simulations from GCMs) for fossil-pollen data from the Last Glacial Maximum (21-15 ka BP; LGM) to observed modern pollen assemblages. They found that certain taxa, such as Fraxinus, Ostrya/Carpinus and Ulmus, substantially shifted their realized niches from the late glacial period to present, whereas other taxa, such as Quercus, Picea, or Pinus strobus, had relatively stable realized niches. Consequently, Species Distribution Models (SDMs) for the former taxa had low predictive accuracy when projected to modern climates, despite demonstrating high predictive accuracy for late glacial pollen distributions. For the latter taxa, models tended to have higher predictive accuracy when projected to present. These findings reinforce the point that the realized niche at any time often represents only a subset of the climate conditions in which a taxon can persist, and allowed the authors to conclude that projections from SDMs into future climate conditions that are based solely on contemporary realized distributions are potentially misleading for assessing the vulnerability of species to future climate change. Paleoecological information has also been used to fit multitemporal models, with the aim of better estimating the fundamental niche and partially circumventing shifted realized niches (Nogués-Bravo 2009). In this vein, Nogués-Bravo et al (2016) projected changes in abundance and conservation status under a climate warming scenario for 187 plant taxa using niche-based models calibrated with paleorecords for the last 21 000 years. Incorporating long-term data into niche-based models increased the magnitude of projected changes for abundance and community turnover. Those larger projected changes translated into different, and often more threatened, projected conservation status for declining taxa, compared with traditional single-time approaches.
Interestingly, they also found that few models predicted total disappearance of taxa, suggesting that these taxa are resilient if climate is the only extinction driver. These findings demonstrate how linking paleorecords and forecasting techniques has the potential to improve conservation assessments and inform future conservation measures. Furthermore, information derived from paleorecords can help to improve environmental management and decision making. For instance, information from paleolimnological studies has been proposed to select reference sites and determine reference conditions at those sites in order to define current aquatic ecosystem statuses and restoration goals in the light of the European Union Water Framework Directive (Bennion and Battarbee 2007).
Paleoecological information can also help to understand biodiversity dynamics and responses to climate and anthropic changes (e.g. Garrido-García et al 2018, Gaüzère et al 2020). For instance, Lozano et al (2016) studied how hominin species affected large mammals' interactions during the Early and Middle Pleistocene in Western Eurasia, by constructing and analyzing paleo food-webs from the archaeopaleontological records. Pleistocene food webs shared basic features with modern food webs, although several parameters differed significantly. Very interestingly, the results also highlight the central position of hominins in the trophic web, modifying energy fluxes. Other studies have identified the effect of human pressure on many other aspects of paleobiodiversity, like body size (Faurby and Svenning 2016) or equilibrium in plant functional trait responses to climate (Gaüzère et al 2020).
While the previous studies exemplify the use of paleoecological information with neoecological theories and tools, they are biased towards relatively recent time periods (mostly the Quaternary, and most specifically the Pleistocene and the Holocene). However, paleoecological information from distant periods in the past (millions of years ago) is also crucial to analyze and understand current and future patterns and responses of biodiversity. For instance, advances in molecular methods now allow whole genomes to be analyzed, which enables phylogenies to be estimated with unprecedented levels of confidence (Armstrong et al 2020). Furthermore, analytical methods have been developed to assemble multiple phylogenies into megatrees, which increase the taxonomic breadth of phylogenies to cover the whole tree of life (Redelings and Holder 2017). However, dated fossils remain essential to constrain node ages in all those phylogenetic trees (Anderson et al 2005, Beck 2008). Age-calibrated trees are crucial to estimate speciation and extinction rates, as well as phylogenetic diversity, and are therefore essential for most eco-evolutionary studies. The previous links between paleo- and neoecology illustrate the relevance of such integrative studies and how they can advance the biodiversity loss and conservation agenda. However, this agenda is far from complete, and several areas of research can still benefit from further integration and advance the study of ecological processes and dynamics within the context of long temporal scales (table 2).

RIs in paleoecology: state of the art
Table 2 (excerpt).
Better understanding the multitemporal biodiversity and ecosystem responses to climate and other global changes: Willis and Birks (2006) and Jackson and Sax (2010).
Setting temporal constraints when calibrating phylogenetic trees and incorporating explicitly the fourth dimension in eco-evolutionary studies: Donoghue and Benton (2007).

RIs can play an important role in boosting such integration by ensuring all stages involved in successful management and preservation of data for use and reuse (a data life cycle; Michener and Jones 2012) in paleoecology. The paleoecological community has developed its own set of RIs to cover different parts of the cycle (see below in this section). We propose a tentative roadmap illustrating the data life cycle of paleoecological records (figure 1) that integrates all possible actions of the cycle in three main stages: (1) collect and assure samples, (2) describe, preserve, and discover, and (3) integrate and analyze. Standardized methods and protocols to collect, store, preserve, and document fossil records are well developed, some of them with long histories that trace to the foundations of their disciplines (e.g. fossil pollen; Faegri and Iversen 1950). Most frequently, paleoecological samples (e.g. fossils) are preserved in museums and biological collections (Jagt et al 2006), while others (e.g. sediment cores) are preserved in facilities of research institutions (Sampériz et al 2013). The International Geo Sample Number (IGSN) provides a system to assign unique identifiers to geological samples in order to locate, identify, and cite physical samples (including fossils and paleoecological samples) with confidence, which is highly relevant to ensure accessibility of those samples. Although IGSN was established only in 2011, it has already issued more than 7 million identifiers. Several organizations (e.g. the European Geosciences Union and the American Geophysical Union; EGU/AGU) recommend reporting IGSN for samples in their publications (e.g. poster sessions in AGU conferences and articles in AGU journals). Furthermore, important data repositories, like Pangaea (www.pangaea.de) or Neotoma DB (see below in this section), include fields for IGSN in their data structure. Hence, the first stage of the data life cycle of paleoecological records (figure 1) is well established and implemented. However, wider use of the IGSN, with more journals and data repositories adopting it and making it mandatory, could improve the accessibility of samples.

Figure 1. Proposed roadmap illustrating the data life cycle of paleoecological records and its further integration with harmonized neoecological datasets. We have considered three domains adapted from Michener and Jones (2012) that are related to the data life cycle: (1) collect and assure samples, (2) describe, preserve, and discover datasets, and (3) integrate and analyze datasets.
Similarly, methods and protocols to analyze paleoecological samples and produce useful information (e.g. age-depth models and sedimentation rates, microorganism/charcoal counts, or isotope ratios) are generally well developed and standardized. Furthermore, there is also a long tradition in the paleosciences of building databases that store, preserve, and share this processed information (see table 3 for some of the main paleoecological databases). For instance, in the 1980s, several databases, like the European Pollen Database (EPD) or the North American Pollen Database (NAPD), emerged to preserve and share Quaternary pollen data at continental scales (Pollen Database Administration 2007, Fyfe et al 2009, Grimm et al 2018). More recently, these initiatives have been complemented with the development of more databases covering different taxonomic groups/proxies and/or temporal scales and resolutions (e.g. Paleobiology Database, PBDB; Global Charcoal Database), database aggregations (e.g. Neotoma), data repositories (e.g. Pangaea), or metadatabases (compiled during the execution of research projects; e.g. Past Global Changes metadatabases). Although some paleoecological subfields might lack such developments, overall, the second stage of the data life cycle of paleoecological records (figure 1) is also well advanced and implemented.
The paleosciences also have a long tradition of collaborative and integrative projects and initiatives. For instance, in 1991 the National Science Foundation funded the Past Global Changes project (PAGES; www.pastglobalchanges.org), which encourages international and interdisciplinary collaborations to understand the Earth's past environment, in order to obtain better predictions of future climate and environment and inform strategies for sustainability. More recent developments include the Earth Life Consortium (http://earthlifeconsortium.org; Uhen et al 2018) or the EarthCube community (www.earthcube.org), which have common and overlapping objectives. The Earth Life Consortium aims to develop an Application Programming Interface (API) to interconnect and interoperate databases (i.e. Neotoma and the PBDB). EarthCube is more ambitious and aims to boost data science, integration, and collaboration across the geosciences by developing many types of cyberinfrastructures (and not only APIs to interoperate databases). Two of the main outcomes from EarthCube activities are the LinkedEarth (http://linked.earth; Emile-Geay et al 2018) and the Linked Paleo Data (http://lipd.net; McKay and Emile-Geay 2018) projects. LinkedEarth aims to better organize and share Earth Science data, especially paleoclimate information, through curation, developing standards to store and share paleodata, and crafting tools to analyze those data; Linked Paleo Data aims to develop the framework (which includes data structure, API, and tools) necessary to reach the goals of LinkedEarth. While APIs and cyberinfrastructures would allow a decentralized interoperability of databases, databases like Neotoma have started to centralize and aggregate other databases (e.g. the EPD has started the migration into Neotoma). Note that all these initiatives (developments and database aggregations) contribute to the third stage of the data life cycle of paleoecological records (figure 1) but remain limited to the paleoscience domain. The Enhancing Paleontological and Neontological Data Discovery API (ePANDDA; https://epandda.org), a project in active development, has developed an API that connects data from the paleo- and neoecological domains. More specifically, it interconnects the PBDB, iDigPaleo (www.idigpaleo.org), and iDigBio (www.idigbio.org). In line with the integration of paleo- and neoecological data, some of the paleo-databases have been integrated with present-day database aggregators (e.g. the PBDB has been connected to GBIF).

Table 3 (excerpt), PAGES entry: Facilitate activities that address past changes in the Earth System in a quantitative and process-oriented way in order to improve predictions of future climate and environment, and inform strategies for sustainability. Working groups in PAGES have developed databases and metadatabases to support their projects.

Opportunities from closer integration
Most of the past and current initiatives within the paleoecological community strongly resemble the process followed by neoecologists when building RIs: e.g. definition of protocols and standards, data harmonization, and use of metadata standards. This suggests that paleoscience could benefit from a higher-level RI that organizes and coordinates all these initiatives. In fact, this gap has been partially filled by certain scientific initiatives like Past Global Changes (www.pastglobalchanges.org/) or the Earth Life Consortium (http://earthlifeconsortium.org). In recognition of this gap, a white paper was recently submitted to the National Science Foundation (USA) proposing the creation of a paleoecological cyberinfrastructure (Williams et al 2017). Nevertheless, even if approved, this proposal would cover only part of the data life cycle. Alternatively, paleoecological RIs (i.e. data, procedures, analysis, and services) could be directly integrated with neoecological RIs. In any case, further steps in the development of paleoecological RIs should follow a flexible and integrative approach that enables close collaboration and interoperability with neoecological RIs, to elicit a stronger integration of both fields.
Regardless of the route taken, we describe next some of the aspects in which such integration can benefit both paleo- and neoecology and their RIs in terms of the three main stages of the data life cycle.

Collect and assure samples and data
Although protocols and standards for collecting and assuring paleoecological samples are well developed and established, RIs could foster harmonization by requiring that protocols and methods be reviewed or, if necessary, created. Those protocols should cover not only collecting and assuring samples but also storing and curating information. Such an RI would promote these protocols and methods (e.g. IGSN) among the participating entities, which would in turn ensure that samples are correctly stored, preserved, and located, while data are correct, properly documented, searchable, and easily accessible. Integrating paleoecology with neoecological RIs would have the additional advantage of sharing experience with other infrastructures also concerned with the curation of samples (e.g. NEON biorepository; www.neonscience.org/data/neon-biorepository).

Describe, preserve, discover
Here, we see at least three main areas to develop for the integration of paleoecology and neoecology and their RIs: (1) promoting the use of standards, (2) improving/completing paleoecological databases, and (3) increasing the discoverability and accessibility of paleoecological data.
As in the first stage of the data life cycle, RIs should promote the use of standards for describing, preserving, and discovering paleoecological data. This would require increasing the participation of the paleoecological community in the international initiatives defining Biodiversity Information Standards (e.g. www.tdwg.org), so that these standards consider and incorporate the peculiarities of paleoecological information (e.g. modifying the Darwin Core, a standard to facilitate the sharing of information about biological diversity, according to modifications proposed by the Earth Life Consortium).
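To make the mapping problem concrete, the following minimal Python sketch translates a paleoecological record into Darwin Core style terms. It is an illustration under stated assumptions, not the modification proposed by the Earth Life Consortium: `scientificName`, `decimalLatitude`, `decimalLongitude`, and `basisOfRecord` are existing Darwin Core terms, whereas the `paleo:` keys, the input field names, and the function `to_darwin_core` are hypothetical placeholders for the age and proxy information that the standard would need to accommodate.

```python
# Input field -> existing Darwin Core term.
DWC_FIELD_MAP = {
    "taxon": "scientificName",
    "lat": "decimalLatitude",
    "lon": "decimalLongitude",
}

# Input field -> HYPOTHETICAL extension term (not part of Darwin Core).
# These stand in for the paleo-specific information discussed in the text.
PALEO_FIELD_MAP = {
    "age_cal_bp": "paleo:ageCalBP",
    "age_error_yr": "paleo:ageUncertaintyYears",
    "proxy": "paleo:proxyType",
}


def to_darwin_core(record, basis_of_record="FossilSpecimen"):
    """Return (mapped, unmapped) dictionaries for one input record.

    Fields with a known Darwin Core term are renamed, paleo-specific
    fields go to the hypothetical 'paleo:' namespace, and everything
    else is returned separately so that nothing is silently dropped.
    """
    mapped = {"basisOfRecord": basis_of_record}
    unmapped = {}
    for key, value in record.items():
        if key in DWC_FIELD_MAP:
            mapped[DWC_FIELD_MAP[key]] = value
        elif key in PALEO_FIELD_MAP:
            mapped[PALEO_FIELD_MAP[key]] = value
        else:
            unmapped[key] = value
    return mapped, unmapped


if __name__ == "__main__":
    # Made-up example values, for illustration only.
    example = {"taxon": "Pinus", "lat": 37.0, "lon": -3.5,
               "age_cal_bp": 5200, "age_error_yr": 150,
               "proxy": "pollen", "depth_cm": 140}
    print(to_darwin_core(example))
```

Returning the unmapped fields explicitly is a deliberate choice in this sketch: it exposes which pieces of paleoecological information still lack a home in the target standard.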
RIs can also help to improve and complete paleoecological databases. For many biological groups (like vertebrate fossils), the actual specimens are housed in museums with their own databases, which may or may not be easily exposed to the public or available for integration. Nonetheless, RIs are powerful agents for articulating institutions (see GBIF, which articulates more than a thousand institutions around the world) and databases, and could therefore help to mobilize those museum records into the existing databases or the corresponding cyberinfrastructures (like GBIF itself). Along these lines, iDigBio is trying to mobilize specimens from both present-day and paleo collections. Of course, incorporating data into databases is not easy and serious difficulties are expected, for instance when incorporating taxonomic updates into data from legacy and/or institutional databases. Because these problems are not trivial, RIs should increase the participation of paleoscientists in current initiatives dealing with taxonomic backbones (e.g. www.itis.gov) for present-day biodiversity. Furthermore, the use of common standards, apart from improving the description, preservation, and discoverability of the data (see above), should ease the combination and integration of paleoecological databases.
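The taxonomic-update problem can be sketched in a few lines of Python. The example below is a simplified illustration, assuming that a taxonomic backbone can be reduced to a lookup from synonyms to accepted names; the function `harmonize_taxa`, the record layout, and the `verbatim_taxon` field are hypothetical, and in practice the lookup would be derived from a backbone service rather than written by hand.

```python
def harmonize_taxa(records, synonyms):
    """Update taxon names in legacy records against a synonym lookup.

    records  : list of dicts, each with a 'taxon' key
    synonyms : dict mapping outdated names to accepted names
    Returns (resolved, unresolved). Resolved records keep the original
    name in 'verbatim_taxon' so that the update remains traceable.
    """
    accepted_names = set(synonyms.values())
    resolved, unresolved = [], []
    for rec in records:
        name = rec["taxon"].strip()
        if name in synonyms:
            # Outdated name: replace it with the accepted one.
            resolved.append({**rec, "taxon": synonyms[name],
                             "verbatim_taxon": name})
        elif name in accepted_names:
            # Already an accepted name: keep it unchanged.
            resolved.append({**rec, "taxon": name, "verbatim_taxon": name})
        else:
            # Unknown to the lookup: flag for expert review.
            unresolved.append(rec)
    return resolved, unresolved


if __name__ == "__main__":
    # Illustrative lookup and made-up counts.
    lookup = {"Picea excelsa": "Picea abies"}
    legacy = [{"taxon": "Picea excelsa", "count": 3},
              {"taxon": "Picea abies", "count": 1},
              {"taxon": "Unknown sp.", "count": 2}]
    print(harmonize_taxa(legacy, lookup))
```

Names that cannot be matched are returned separately rather than guessed, since resolving them typically requires taxonomic expertise.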
Although some paleoecological fields and databases have a long history of data sharing, many others are difficult to find and access. Moreover, most of them (if not all) might be little known to non-experts and/or difficult for them to use. These problems could be partially solved by creating or improving data portals where datasets and metadata are searchable, citable (via DOIs), and downloadable. Again, data contained in these portals should be compliant with international standards commonly used to document ecological and biodiversity data (e.g. Ecological Metadata Language, Darwin Core, etc.). Furthermore, the existing databases and catalogs could be integrated into other initiatives like eLTER (www.ltereurope.net) or DataONE (www.dataone.org).

Integrate and analyze
Regarding the last stage of the data life cycle, we recognize at least two areas of interest for paleoecology. First, RIs could coordinate the implementation of standards and protocols to facilitate/automate data homogenization and standardization, which would elicit the harmonization of data among paleoecological fields (e.g. requesting data from pollen and diatoms for the same region and time in a single query). Second, RIs would help to develop tools that allow documenting workflows (e.g. statistical analysis or hindcasting and forecasting models; Bonet et al 2014), which could also be advanced with the integration of paleoecological workflows into Virtual Labs within LifeWatch ERIC (www.lifewatch.eu). Such workflows should be designed with paleo- and neoecology integration and interoperability in mind (e.g. getting paleo- and present-day data for a specific region in a single query). This would require database integration to overcome the numerous challenges described here. For instance, current neoecological databases and RIs cannot handle the spatial, temporal, and taxonomic uncertainties that are idiosyncratic to paleoecological data (see Box Paleoecological record), whilst discrepancies in taxonomic nomenclature between paleo- and neoecological fields need to be addressed and resolved. Paleoecology could also join existing theoretical frameworks for indicators of biodiversity, like the essential biodiversity variables (e.g. https://geobon.org/ebvs; Pereira et al 2013). By generalizing beyond individual species data, these frameworks might provide an alternative that circumvents part of the issues in integrating paleo- and neoecological databases.
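The idea of a single query across proxies and across paleo- and present-day data can be sketched as follows. This is a minimal Python illustration under simplifying assumptions, not the interface of any existing database: the `Occurrence` schema and the `query` function are hypothetical, ages are expressed in years before present with a symmetric uncertainty, and a record is considered a match when its age interval overlaps the requested time window.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Occurrence:
    """Hypothetical common schema shared by all sources."""
    source: str       # e.g. "pollen", "diatom", "present-day"
    taxon: str
    lat: float
    lon: float
    age_bp: float     # years before present; 0 for present-day records
    age_error: float  # symmetric age uncertainty in years; 0 if exact


def query(occurrences, bbox, age_min, age_max):
    """Return occurrences inside bbox whose age interval overlaps the window.

    bbox is (min_lon, min_lat, max_lon, max_lat); age_min and age_max
    are in years before present, with age_min <= age_max.
    """
    min_lon, min_lat, max_lon, max_lat = bbox
    hits = []
    for occ in occurrences:
        in_region = (min_lat <= occ.lat <= max_lat
                     and min_lon <= occ.lon <= max_lon)
        # Interval overlap: [age - error, age + error] vs [age_min, age_max].
        youngest = occ.age_bp - occ.age_error
        oldest = occ.age_bp + occ.age_error
        overlaps = youngest <= age_max and oldest >= age_min
        if in_region and overlaps:
            hits.append(occ)
    return hits


if __name__ == "__main__":
    # Made-up records, for illustration only.
    data = [
        Occurrence("pollen", "Pinus", 37.1, -3.4, 5000, 150),
        Occurrence("diatom", "Cyclotella", 37.2, -3.3, 6100, 200),
        Occurrence("present-day", "Pinus", 37.0, -3.5, 0, 0),
    ]
    for hit in query(data, bbox=(-4.0, 36.5, -3.0, 37.5),
                     age_min=4000, age_max=6000):
        print(hit)
```

Treating each age as an interval rather than a point is one simple way to carry temporal uncertainty through a query; spatial and taxonomic uncertainties would need analogous treatment.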

Networking activities
Finally, and regardless of the data life cycle, promoting networking activities is at the heart of RIs. Among the countless opportunities, we highlight the possibility of accessing paleoecological facilities (e.g. laboratories, sampling sites, etc.) through transnational activities, like the free access to RIs' facilities supported by the European Union (https://ec.europa.eu/research/infrastructures/index.cfm?pg=access). eLTER, EMSO, ACTRIS-2, INTERACT, and AQUACOSM are examples of European RIs that share their facilities and in which certain paleoecological facilities might fit. Furthermore, RIs can create training programs in the network of research facilities. These programs could train researchers from other paleoecological fields, as well as non-paleoecologists, in appropriate paleoecological methods and in working with samples (i.e. collecting, processing, and analyzing them). These programs would reinforce all the initiatives from the RI regarding the data life cycle but, most importantly, they would help bridge the gap between paleo- and neoecology.
Taken together, the points above indicate not only that paleoecology has the potential to become part of the environmental RI ecosystem, but also that environmental RIs would benefit from that move (i.e. a win-win situation). Much work has already been done by both the paleo- and neoecological communities, but plenty remains to be done. Nevertheless, the importance of such integration for facing global challenges justifies the effort.

Data availability statement
No new data were created or analyzed in this study.

References
Alba-Sánchez F, López-Sáez J A, Nieto-Lugilde D and Svenning J-C 2015 Long-term climate forcings to assess vulnerability in North Africa dry argan woodlands Appl.