Integration of satellite remote sensing data in ecosystem modelling at local scales: Practices and trends

Spatiotemporal ecological modelling of terrestrial ecosystems relies on climatological and biophysical Earth observations. Due to their increasing availability, global coverage, frequent acquisition and high spatial resolution, satellite remote sensing (SRS) products are frequently integrated with in situ data in the development of ecosystem models (EMs) quantifying the interactions between the vegetation component and the hydrological, energy and nutrient cycles. This review highlights the main advances achieved in the last decade in combining SRS data with EMs, with particular attention to the challenges modellers face for applications at local scales (e.g. small watersheds). We critically review the literature on progress made towards integration of SRS data into terrestrial EMs: (1) as input to define model drivers; (2) as reference to validate model results; and (3) as a tool to sequentially update the state variables, and to quantify and reduce model uncertainty. The number of applications provided in the literature shows that EMs may profit greatly from the inclusion of spatial parameters and forcings provided by vegetation and climate-related SRS products. Limiting factors for the application of such models to local scales are: (1) mismatch between the resolution of SRS products and model grid; (2) unavailability of specific products in free and public online repositories; (3) temporal gaps in SRS data; and (4) quantification of model and measurement uncertainties. This review provides examples of possible solutions adopted in recent literature, with particular reference to the spatiotemporal scales of analysis and data accuracy. We propose that analysis methods such as stochastic downscaling techniques and multi-sensor/multi-platform fusion approaches are necessary to improve the quality of SRS data for local applications. Moreover, we suggest coupling models with data assimilation techniques to improve their forecast abilities.
This review encourages the use of SRS data in EMs for local applications, and underlines the necessity for a closer collaboration among EM developers and remote sensing scientists. With more upcoming satellite missions, especially the Sentinel platforms, concerted efforts to further integrate SRS into modelling are in great demand and these types of applications will certainly proliferate.


| INTRODUCTION
Anthropogenic and climate change pressures constitute serious threats to the integrity of the delicate ecosystems of several protected areas, such as National Parks, UNESCO World Heritage sites and Natura 2000 sites (Marris, 2011). Ecosystem models (EMs) help researchers to understand the dynamics of these terrestrial environments, and improve monitoring capabilities by filling spatiotemporal data gaps and predicting short- and long-term impacts of different management strategies. Mechanistic ecohydrological models that couple hydrological and vegetation processes (Chen, Wang, Ma, & Liu, 2015) can estimate features like forest productivity and growth (Huber et al., 2013), or evaluate the system water stress under different climate scenarios (Bhattarai, Wagle, Gowda, & Kakani, 2017).
These EMs mainly differ in the complexity of the vegetation component and its interaction with the carbon, nutrient and water cycles.
Research in this field has mainly focused on improving the physical description of physiological processes (Chen et al., 2015; Fatichi, Ivanov, & Caporali, 2012) to accurately quantify vegetation photosynthesis and growth. However, this modelling effort requires the measurement or estimation of several biophysical parameters (e.g. LAI, canopy height) and input fluxes (e.g. precipitation, land surface temperature, irradiation) that are heterogeneous at the landscape scale and evolve in time (Fatichi, Pappas, & Ivanov, 2016; Pappas, Fatichi, & Burlando, 2016).
Monitoring systems based on in situ, airborne and unmanned aerial vehicle (UAV) measurements may not always be sufficient to provide the large amount of data required by EMs. Despite their very high spatial resolution (a few centimetres), UAV data suffer from low spectral resolution, the limited flight endurance of drones, and, in several countries, strict laws regulating UAV use for research purposes (Paneque-Gálvez, McCall, Napoletano, Wich, & Koh, 2014), while in situ and airborne measurements are time-, cost-, and labour-intensive. The constantly expanding list of ready-to-use satellite remote sensing (SRS) products is useful to supplement and, if necessary, replace these measurements (Lawley, Lewis, Clarke, & Ostendorf, 2016), with the main advantages of being freely available for the past two or three decades (e.g. since 1972 for Landsat or 2000 for MODIS), with almost-regular repetition (depending on atmospheric conditions), and at high spectral resolution.
In a seminal work, Plummer (2000) highlighted the crucial role that SRS was already playing two decades ago in the improvement of terrestrial models, defining four strategies for linking SRS data and EMs:
1. using SRS data for the estimation of EM forcings;
2. using SRS data for the calibration and validation of EMs;
3. using SRS data for updating the state variables of EMs;
4. using EMs to interpret SRS data.
As foreseen by Plummer's conclusions, research efforts in the past decade have been promoted by a closer collaboration between ecological modellers and the RS community (e.g. the ESA Climate Change Initiative), which led to improved SRS products together with the estimation of SRS uncertainties (Merchant et al., 2017). Nowadays, a number of online repositories allow direct access to images from different satellite missions and sensors in near real-time, and to higher level global SRS products certified through quality assurance (QA) tests and standards. While SRS data have been extensively used for updating the state of global terrestrial models, e.g. for hydrological (De Lannoy & Reichle, 2016) and carbon cycle (Scholze, Buchwitz, Dorigo, Guanter, & Quegan, 2017) models, the potential of combining SRS data and EMs for near real-time monitoring at local scales, such as those of small watersheds and protected areas (area up to a few hundred km²), is yet to be fully exploited. This step is fundamental to achieve the objectives of projects such as the ECOPOTENTIAL H2020 project (http://www.ecopotential-project.eu), which brought together ecologists, park managers and SRS experts with the goal of monitoring several European protected areas through integration of SRS products into EMs.

KEYWORDS
data assimilation, ecohydrological models, satellite remote sensing, stochastic downscaling
Given the marked advances since Plummer (2000), we present updates on the state-of-the-art approaches and discuss some remaining difficulties that modellers may face when integrating SRS data into EMs. As high spatial and temporal resolutions of model drivers and parameters are particularly important to describe vegetation dynamics at local scales, Section 2 highlights the strategies adopted in the literature to properly downscale high-level SRS products (i.e. level 2 and level 3 products, Section 2.1) or to directly obtain the quantities of interest through processing of the observations at the sensor level (level 1 products, Sections 2.2 and 2.3). Best practices to compare EM outputs to the associated SRS products are described in Section 3. Data assimilation (DA) techniques that are proving successful in enhancing EM forecast abilities through SRS data are described in Section 4, together with techniques for the evaluation of model and data uncertainties (Sections 4.1 and 4.2).
Finally, Section 5 concludes the paper and suggests strategies for further advancements.

| PREPARING SRS DATA FOR USE IN LOCAL TERRESTRIAL EMS
Satellite remote sensing products are useful to characterize spatial parameters and forcings related to the hydrological component (pre-  Table 1 for further examples).
The spatial resolution, temporal frequency and accuracy of "off-the-shelf" high-level SRS products (Table S1) frequently do not match modelling requirements for local applications, where grid cells from tens to hundreds of metres in size are used. Moreover, the assumptions and ancillary data used for the computation of these high-level products (which are often "hidden" in the technical documentation, in cascades of scientific articles, or on providers' websites for more recent changes) might not be consistent with the assumptions or other inputs used in local scale EMs. For example, MODIS LAI estimates are based on a specific land cover map characterized by eight biomes (Yan et al., 2016), which may differ from those used in the EM. We present three typical strategies that modellers are currently adopting and combining to address the aforementioned problems:
1. downscaling low-resolution SRS products, which is particularly useful for climatic products having resolutions of several kilometres (Table S1);
2. deriving high-resolution products from SRS level 1 data, which is relevant to obtain accurate vegetation-related parameters for the domain under study (Table S2);
3. applying multi-sensor/multi-platform fusion techniques, which can fill temporal gaps among the estimated vegetation parameters.

| Downscaling methods for climatic products
Downscaling helps modellers overcome the scale mismatch between high-level (levels 2, 3) SRS products and the desired model resolution. While applicable to many types of SRS, downscaling methods are frequently used for climatic variables (e.g. precipitation and LST), which are among the main driving factors of terrestrial EMs describing ecosystem seasonality and long-term trends.

| Precipitation
Satellite remote sensing-based precipitation measurements (Table S1) constitute a valid alternative to datasets obtained through spatial interpolation of in situ measurements. These datasets achieve regional- Moreover, from the temporal perspective, the monthly resolution of the provided climatologies is too coarse to effectively drive many EMs relying on daily or sub-daily forcing data (Table 1). The main problems arising from SRS-based rainfall products concern spatial coverage, because different satellites cover different ranges of latitudes (e.g. GPCP, CMAP or GPM), and their generally coarse spatial resolution, from 2.5 deg (about 280 km at the equator) up to 0.1 deg (about 11 km). A recent comparison of precipitation datasets derived from gauges, models and SRS data also highlighted the variability of different products, in particular in SRS-derived seasonal precipitation and distribution of extreme events (Sun et al., 2018).
Statistical downscaling has been used to obtain rainfall data at about 1 km resolution from, e.g., TRMM using classic geostatistical analysis (Chen, Liu, Liu, & Li, 2014; Shi & Song, 2015). Covariates available at higher spatial resolution (e.g. VIs, elevation and other topographic parameters, and in situ weather data) are used to explain part of the large spatiotemporal variability of the precipitation field.
Stochastic methods are particularly suitable to generate synthetic precipitation patterns whose statistical properties are consistent with those of observed precipitation.
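The covariate-based idea described above can be illustrated with a minimal sketch. It assumes a single fine-resolution covariate (for instance elevation or a VI), a linear relation fitted at the coarse scale, and an integer ratio between coarse and fine grids. The function name `downscale_with_covariate` and all variable names are hypothetical, and the linear regression is a stand-in for the geostatistical methods used in the cited studies, not a reproduction of them.

```python
import numpy as np

def downscale_with_covariate(coarse_precip, fine_covariate, factor):
    """Regression-based downscaling sketch (hypothetical helper).

    coarse_precip : 2D array on the coarse grid, shape (ny, nx)
    fine_covariate: 2D array on the fine grid, shape (ny*factor, nx*factor)
    factor        : integer number of fine cells per coarse cell side
    """
    ny, nx = coarse_precip.shape
    # Aggregate the covariate to the coarse grid by block averaging.
    blocks = fine_covariate.reshape(ny, factor, nx, factor)
    coarse_cov = blocks.mean(axis=(1, 3))
    # Fit precip = intercept + slope * covariate at the coarse scale.
    slope, intercept = np.polyfit(coarse_cov.ravel(), coarse_precip.ravel(), deg=1)
    # The coarse-scale residual is the part the covariate does not explain.
    residual = coarse_precip - (intercept + slope * coarse_cov)
    # Apply the regression on the fine grid and add back the residual,
    # replicated over each coarse cell (nearest-neighbour disaggregation).
    fine_residual = np.kron(residual, np.ones((factor, factor)))
    fine = intercept + slope * fine_covariate + fine_residual
    # Precipitation cannot be negative; note that clipping, when it acts,
    # breaks the exact preservation of the coarse-cell means.
    return np.clip(fine, 0.0, None)
```

When no value is clipped, the block average of the downscaled field reproduces the coarse product exactly, which is a convenient sanity check before using the result as a model forcing.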

| Land surface temperature
Satellite remote sensing measurements provide daily global spatial coverage of LST at resolutions that vary from 1 km (Sentinel-3) to 56 km (AMSR-E). Higher resolution data (30 m) are provided at 16-day intervals by Landsat 8 (Table S1). Spatial downscaling of LST (also known as sharpening or disaggregating) relies on information about the soil type, emissivity and vegetation cover, frequently using NDVI or LAI as a proxy for the latter. Downscaling has been applied to estimate LST at higher spatial resolution (e.g. 250 m) from MODIS and AVHRR data while maintaining their original temporal resolution (Liu & Pu, 2008; Metz, Rocchini, & Neteler, 2014). Fusion approaches have been tested to retain the daily temporal resolution of MODIS while downscaling to the spatial resolution of Landsat (30 m) or ASTER (90 m; Weng, Fu, & Gao, 2014; Yang et al., 2016). It is worth stressing that LST may differ by several degrees from near-surface air temperature measured by surface stations (Good, 2016).
Approaches to estimate near-surface air temperature from satellite observations are still under development, and daily global mapping will be accomplished in the EUSTACE H2020 project (Brugnara, Auchmann, & Brönnimann, 2017).
It is important to remember that SRS climate products might have low accuracy in topographically complex areas, mainly due to their low resolution. To improve the accuracy for local applications, downscaling can be applied in conjunction with bias correction techniques and considering the ancillary data used to obtain the original SRS products (e.g. Maggioni, Meyers, & Robinson, 2016).

TABLE 1 A non-exhaustive list of examples of ecosystem model (EM) applications at local scales using satellite remote sensing (SRS) products to describe model parameters and/or forcings. Detailed model descriptions and references are available in Supplementary

| Deriving high-resolution SRS products
Assessing the key parameters for describing photosynthesis, ET, and NPP is crucial to compute energy, water and carbon fluxes. Several global SRS products related to vegetation biogeochemistry are freely available (Table S1). Although these products have the clear advantage of passing external QA, they differ in terms of algorithms, ancillary data and product uncertainty, and might not be consistent with the particular requirements and assumptions of EMs, especially for local applications. Exploitation of satellite images using empirical, semi-empirical or physically based approaches may be required to obtain consistent products at high spatiotemporal resolutions.
We underline, however, that deep expertise in remote sensing is needed to correctly exploit the available datasets depending on the desired accuracy and application, in particular for assessing the uncertainty of the retrieved variables. The indirect nature of SRS measurements makes them potentially hard to interpret and relate to physically measurable quantities (Disney, 2016).

| Empirical approaches
Empirical approaches establish mathematical relationships between SRS data and the biophysical variables of interest via calibration on in situ data (Chuvieco & Huete, 2009).
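As an illustration of such a calibration, the sketch below computes NDVI from red and near-infrared reflectance and fits a linear transfer function to plot-level LAI measurements. The linear form, the function names and the clipping of negative predictions are assumptions made for the example; in practice the functional form is site- and sensor-specific and has to be chosen and validated against the in situ data.

```python
import numpy as np

def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance.

    Pixels where nir + red equals zero are undefined and should be masked
    by the caller beforehand.
    """
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    return (nir - red) / (nir + red)

def calibrate_linear_lai(ndvi_plots, lai_plots):
    """Least-squares fit of LAI = intercept + slope * NDVI on in situ plots."""
    slope, intercept = np.polyfit(ndvi_plots, lai_plots, deg=1)
    return slope, intercept

def predict_lai(ndvi_map, slope, intercept):
    """Apply the calibrated relation to an NDVI map; LAI is kept non-negative."""
    estimate = intercept + slope * np.asarray(ndvi_map, dtype=float)
    return np.clip(estimate, 0.0, None)
```

A relation calibrated this way is only valid within the range of conditions sampled by the plots, which is the main limitation of empirical approaches compared with the model-based alternatives.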

| Physically based models
Physically based models attempt to describe the surface reflectance through the physical laws of radiation transfer inside the canopy and its interaction with the soil surface, and offer an explicit connection between the biophysical variables of vegetation and soil and canopy reflectance (Banskota et al., 2015; Houborg, McCabe, Cescatti, et al., 2015). Physically based models have strong advantages over empirical approaches: they allow causal inference and prediction, can be adapted to a wide range of land cover situations, time periods and sensor configurations, and do not require the simultaneous acquisition of in situ and SRS data.
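To give a flavour of what an explicit physical link looks like, the sketch below uses the Beer-Lambert attenuation law, which relates the fraction of radiation intercepted by a canopy to its LAI, together with its analytical inversion. This is a deliberately simplified stand-in and not one of the canopy radiative transfer models cited above; the extinction coefficient `k = 0.5` and the function names are assumptions made for illustration.

```python
import numpy as np

def intercepted_fraction(lai, k=0.5):
    """Forward model: fraction of incoming radiation intercepted by the canopy.

    k is an assumed extinction coefficient; it depends on leaf angle
    distribution and sun geometry in a real application.
    """
    return 1.0 - np.exp(-k * np.asarray(lai, dtype=float))

def lai_from_intercepted_fraction(fraction, k=0.5):
    """Inverse model: LAI retrieved from an intercepted fraction in [0, 1)."""
    fraction = np.asarray(fraction, dtype=float)
    return -np.log(1.0 - fraction) / k
```

Because the forward relation saturates at high LAI, small errors in the observed fraction translate into large errors in the retrieved LAI for dense canopies, a behaviour that more complete radiative transfer models share.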

| Semi-empirical models
Semi-empirical models rely on the theoretical formulations used in physical models, while adjusting some parameters through empirical relationships based on SRS data. Such models reduce the com-  Recent studies showed that the low temporal frequency (16 days) of LAI products derived from Landsat can be improved by combining MODIS reflectance and LAI data with higher temporal resolution (Myneni, Knyazikhin, & Park, 2015), while keeping Landsat spatial resolution.

| Multi-sensor/multi-platform approaches
Mountain areas pose particular challenges for SRS applications, making thorough topographic and atmospheric correction indispensable (for optical data), as well as methods to account for foreshortening and layover effects and for variations in surface water content affecting dielectric properties (for radar data; Gupta, 2018). In particular, multi-sensor/multi-platform approaches require radiometrically homogeneous and consistent re-

| MODEL CALIBRATION AND VALIDATION USING SRS DATA
Satellite remote sensing is currently used not only as an input to EMs, but also to assess model reliability through validation techniques (Bennett et al., 2013). A large number of SRS products, such as LAI, FAPAR, soil moisture, GPP and NPP, have served to assess EM results at the ecosystem level (see examples in Table S4).
The operations that process the model outputs to obtain a variable consistent with the measured data constitute the so-called "observation operator" (Kaminski & Mathieu, 2017), which is necessary for the implementation of calibration, validation and assimilation schemes. Two strategies can be used to compare EM outputs to SRS data (Plummer, 2000), namely indirect and direct comparison, which correspond to two different observation operators.

| Indirect comparison
Indirect comparison considers high-level SRS products (e.g. LAI, FAPAR, etc.). In this case, the observation operator adapts model outputs to the measurements, typically through downscaling/upscaling procedures.
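A very simple observation operator of this kind is the block averaging of a fine-resolution model output onto the coarser pixel grid of the SRS product. The sketch below assumes that the two grids are aligned and that the product pixel size is an integer multiple of the model cell size; the function name is hypothetical, and real applications usually also need reprojection and a sensor footprint weighting.

```python
import numpy as np

def aggregate_to_product_grid(model_field, factor):
    """Upscale a model raster to the product grid by block averaging.

    model_field: 2D array on the model grid
    factor     : integer number of model cells per product pixel side
    """
    ny, nx = model_field.shape
    if ny % factor or nx % factor:
        raise ValueError("model grid must be an integer multiple of the product grid")
    blocks = model_field.reshape(ny // factor, factor, nx // factor, factor)
    # nanmean ignores model cells flagged as missing (NaN) within each block.
    return np.nanmean(blocks, axis=(1, 3))
```

The aggregated field can then be compared pixel by pixel with the SRS product, keeping in mind that averaging is only appropriate for variables that scale linearly.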
The correction of biases due to different assumptions between a particular EM and products (e.g. over-simplification of the vegetation layer in the EM) is a particularly important step that, if not considered, might lead to large discrepancies in the results (Liu et al., 2018). Due to these difficulties and the free availability of many SRS products, indirect comparison is still frequently adopted, especially when validation is simply performed by qualitative approaches (Table S4).

| Direct comparison
Note that indirect comparison requires the evaluation of error metrics between raster maps (Stow et al., 2009), which are typically characterized by strong spatial autocorrelation. Due to the errors introduced by SRS downscaling procedures, the accuracy of the sensor and the complex nature of environmental systems, classical validation metrics based on a per-pixel comparison (such as the root mean squared error, the Pearson's correlation coefficient or Nash-Sutcliffe efficiency) might not be able to evince common spatial patterns, thus limiting the comparison to mere qualitative considerations. In these cases, residual-based metrics should be replaced by the analysis of statistical moments, spectra and other quantitative measures of spatial structure that have been developed to objectively reveal common patterns among maps (Koch, Jensen, & Stisen, 2015). The quantified uncertainties associated with the model results and SRS observations are frequently neglected during the validation of model results, but should be included to weigh their relative contribution to the error metric (see Section 4.2).
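For reference, the per-pixel metrics named above can be computed as in the following sketch, which masks pixels missing in either map. The function name and the returned dictionary keys are choices made for the example. The metrics are residual-based, so they inherit the limitations discussed in this section and do not replace pattern-oriented measures.

```python
import numpy as np

def pixel_metrics(simulated, observed):
    """RMSE, Pearson correlation and Nash-Sutcliffe efficiency between two maps."""
    sim = np.asarray(simulated, dtype=float).ravel()
    obs = np.asarray(observed, dtype=float).ravel()
    # Use only pixels that are valid in both maps.
    valid = np.isfinite(sim) & np.isfinite(obs)
    sim, obs = sim[valid], obs[valid]
    squared_error = (sim - obs) ** 2
    rmse = np.sqrt(squared_error.mean())
    pearson_r = np.corrcoef(sim, obs)[0, 1]
    nse = 1.0 - squared_error.sum() / ((obs - obs.mean()) ** 2).sum()
    return {"rmse": rmse, "pearson_r": pearson_r, "nse": nse}
```

A constant bias between model and product leaves the correlation unchanged while degrading RMSE and Nash-Sutcliffe efficiency, which is one reason why the three metrics are usually reported together.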

2. Kalman-based methods, which extend the well-known Kalman filter to nonlinear/non-Gaussian models, e.g. using MC simulations as in the Ensemble Kalman filter (EnKF; Quaife et al., 2008).

The analysis step of DA updates the model state variables based on a balance between model and observation uncertainties, described by estimates of their probability distribution. This is a particularly difficult task for both EM outputs and SRS data.
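The analysis step can be sketched for the stochastic (perturbed-observation) variant of the EnKF as follows. The sketch assumes a linear observation operator given as a matrix `H` and a known observation error covariance `R`; the function name, argument layout and the choice of this particular EnKF variant are assumptions for illustration and are not taken from the cited studies.

```python
import numpy as np

def enkf_analysis(ensemble, obs, R, H, rng):
    """Stochastic EnKF analysis step (perturbed observations).

    ensemble: forecast states, shape (n_state, n_members)
    obs     : observation vector, shape (n_obs,)
    R       : observation error covariance, shape (n_obs, n_obs)
    H       : linear observation operator, shape (n_obs, n_state)
    rng     : numpy.random.Generator used to perturb the observations
    """
    n_members = ensemble.shape[1]
    # Sample forecast error covariance from the ensemble anomalies.
    anomalies = ensemble - ensemble.mean(axis=1, keepdims=True)
    P = anomalies @ anomalies.T / (n_members - 1)
    # Kalman gain K = P H^T (H P H^T + R)^-1, computed via a linear solve.
    S = H @ P @ H.T + R
    K = np.linalg.solve(S, H @ P).T
    # Each member assimilates its own noisy copy of the observation.
    noise = rng.multivariate_normal(np.zeros(obs.size), R, size=n_members).T
    innovations = (obs[:, None] + noise) - H @ ensemble
    return ensemble + K @ innovations
```

The gain weighs the ensemble spread against `R`, so the update moves the states towards the observation only as far as the relative uncertainties justify, and unobserved states are corrected through their sampled covariance with the observed ones.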

| SRS measurement uncertainty
The assimilation of SRS data requires the computation of error cross-covariances at the pixel level, taking into account the cumulative effect of different sources of uncertainty: sensor errors, errors in the radiative transfer model (RTM), errors of representativity (due to upscaling or downscaling steps) and errors introduced when processing the data, e.g.
However, the propagation of the instrument and parameter errors is frequently neglected during the inversion of RTMs, and the product accuracy is assessed a posteriori, by means of costly validations against in situ measurements, or comparison with the output of calibrated process-based models (Wanders, Karssenberg, De Roo, De Jong, & Bierkens, 2012). QA of online products usually provides qualitative information on pixel values (e.g. the QA band for Landsat VIs specifies the pixel condition regarding cloud, snow, water, etc.) and only rarely quantitative information on the accuracy of the data (e.g. SD at the pixel level for MODIS LAI/FAPAR), which is what is needed for DA. The lack of information on the errors of SRS products can be mitigated by using the statistics associated with the residuals between EM outputs and SRS measurements evaluated during the DA updates (Crow & Reichle, 2008). Direct coupling represents a valid alternative, allowing the assimilation of the optical signal at the sensor, thus describing the observation error as the accuracy of the sensor (e.g. De Lannoy & Reichle, 2016; Zhang, Shi, & Dou, 2012). In this latter case, the evaluation of model uncertainty (see Section 4.2) has to be propagated through the observation operator (described in Section 3). Moreover, model uncertainties might be amplified by the unknown parameters of the RTM. As an example, EnKF assimilation of canopy reflectance from MODIS has been shown to improve EM estimates of GPP and reduce model uncertainty (Quaife et al., 2008).

| EM uncertainty
Correct estimation of model error is fundamental for DA. Underestimating model uncertainty reduces the relevance of the data (the assimilation would only marginally correct the system state variables), with increased risk of divergence from the actual state of the system. Overestimating it, instead, would result in poor forecast capabilities, with the model forecast spanning a large range of possible solutions. The latter is more favourable to DA analysis, since the true system state is more likely to fall within the prediction interval.
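The effect of the assumed model error on the update can be seen in the scalar form of the Kalman analysis, sketched below for a single state variable that is observed directly. The function name and the numerical values in the usage lines are illustrative assumptions.

```python
def scalar_kalman_update(forecast, forecast_var, obs, obs_var):
    """Scalar Kalman analysis for a directly observed state variable."""
    # The gain is the share of the total variance attributed to the model.
    gain = forecast_var / (forecast_var + obs_var)
    analysis = forecast + gain * (obs - forecast)
    analysis_var = (1.0 - gain) * forecast_var
    return gain, analysis, analysis_var

# Same forecast and observation, two different assumptions on model error.
balanced = scalar_kalman_update(2.0, 0.25, 3.0, 0.25)
overconfident = scalar_kalman_update(2.0, 0.0025, 3.0, 0.25)
```

With equal variances the analysis falls halfway between forecast and observation, whereas with a strongly underestimated model variance the gain is close to zero and the observation is almost ignored, which is the divergence risk described above.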
Ecosystem model uncertainties are mainly evaluated during the forecast step, which propagates the state variables forward until the following observation time. The main sources of uncertainty are input variables (initial conditions, external forcing), unknown parameters, structural uncertainties due to the physical simplification of the governing processes, and numerical approximations for the discretization of continuous processes (Refsgaard, van der Sluijs, Hojberg, & Vanrolleghem, 2007) (Uusitalo, Lehikoinen, Helle, & Myrberg, 2015).
Many studies neglect these different sources of uncertainty, thus possibly underestimating the model output variances (Matott et al., 2009). Further research and inquiry are required to provide standard methods for the estimation of model structural uncertainty.

| CONCLUSIONS
In the two decades since Plummer (2000)

To fully explore the potential of SRS data, we suggest that the development of spatially explicit EMs should adopt the following strategies:
1. Direct coupling of EMs with RTMs, a strategy that is rarely adopted in ecological applications at present. Although direct coupling would require the calibration of a larger number of model parameters, the possibility of directly simulating the system reflectance clearly gives considerable advantages for (1) the validation of model outputs, (2) the assimilation of SRS data in near real time (without the need to wait for the production of higher level products) and (3) the estimation of measurement uncertainties, which is required in DA techniques and allows for prediction.
2. Structuring the simulation codes so that they can be easily connected to the available DA platforms. In fact, the relatively small investment required in terms of updating model implementation would be greatly compensated by the potential to access a number of state-of-the-art, well-tested algorithms for the assimilation of SRS data and the assessment of model uncertainty.
Further collaborations between the remote sensing and the ecosystem modelling scientific communities will help to overcome the outlined difficulties and develop standardized techniques to include SRS data into EMs. There are promising prospects for new SRS missions, and for the development of new algorithms, standards and platforms.
Realizing these will require collaborative projects demonstrating the use of SRS in EMs, similar to the Horizon 2020 ECOPOTENTIAL project whose focus was on improving protected areas management using remote sensing.
The coming years promise great potential for SRS and EMs. In the next decade, several new missions are planned for launch, including P-band SAR (BIOMASS mission) and ISS-mounted LiDAR (GEDI) for estimating biomass, hyperspectral imaging spectrometers for more accurate RTM inversion (EnMAP, HyspIRI, HISUI), high-resolution sun-induced fluorescence for estimating plant photosynthetic activity (FLEX), and a thermal radiometer to monitor water stress (ECOSTRESS). In addition, constellations of small and less-expensive satellites taking high-resolution imagery of the entire Earth every day, like PlanetScope (optical) and ICEYE (SAR), are becoming a reality. In combination with longer time-series data from Sentinel missions and the computational power of cloud platforms such as Google Earth Engine and the upcoming DIAS platforms, the sky is the limit for ecological modellers.

ACKNOWLEDGEMENTS
This work has been carried out within the H2020 project

DATA ACCESSIBILITY
The manuscript does not include any data.