Spatially downscaling sun-induced chlorophyll fluorescence leads to an improved temporal correlation with gross primary productivity

Article history: Received 30 September 2015 Received in revised form 4 April 2016 Accepted 30 April 2016 Available online 20 May 2016 Sun-induced chlorophyll fluorescence (SIF) is known to relate directly to leaf and canopy scale photosynthesis. Retrieving SIF from space can thus provide an indication on the temporal and spatial patterns of the terrestrial gross primary productivity (GPP). Recent studies have successfully demonstrated the serendipitous retrieval of SIF from satellite remote sensing instruments originally destined to atmospheric studies. However, thefinest spatial resolution achieved by these products is 0.5°, which remains too coarse for many applications, including the early detection of drought impacts on vegetation and the integration with ground GPPmeasurements from fluxtowers. This paper proposes a methodology to spatially disaggregate the information contained within each coarse SIF pixels by using a non-linear model based on the concept of light use efficiency (LUE). The strategy involves the aggregation of high-resolution (0.05°) remote sensing biophysical variables to calibrate the downscaling model locally and independently at each time step, which can then be applied to non-aggregated data to create a new layer, denoted SIF*, with a spatial resolution of 0.05°. A global SIF* dataset is generated by applying thismethodology globally to 7 years ofmonthly GOME-2 SIF data. SIF* is shown to be a better proxy for GPP than the original coarse spatial resolution product according to flux-tower eddy covariance measurements. Its performance is comparable to dedicated GPP products despite that (unlike SIF*) these are calibrated based on the same flux towers, driven by meteorological data and not hampered by the large noise caused by the SIF retrieval. To further illustrate the added-value of the global SIF* product, this paper also presents: (1) an ecosystem level assessment showing a considerable reduction of noise with respect to the original SIF; (2) a spatio-temporal intercomparison with existing GPP products; and (3) estimations of global terrestrial productivity per selected vegetation types based on SIF*. © 2016 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).


Introduction
Photosynthesis is the main process governing the global carbon cycle. Gross primary productivity (GPP), defined as the amount of carbon that terrestrial ecosystems assimilate per unit of area and time, is the largest planetary CO 2 flux (123 ± 8 Pg C y − 1 , Beer et al., 2010). GPP drives the inter-annual variability of the CO 2 mixing ratio and may substantially affect the future climate trajectory (Le Quéré et al., 2015). The quantification of the spatio-temporal variations of GPP is therefore fundamental for the detection of biogeochemical signals in the terrestrial biosphere, as driven by climate and other environmental drivers connected to global change like atmospheric CO 2 concentration and nitrogen deposition. Accurate data-driven estimates of GPP are also required to evaluate and improve biogeochemical land surface models used in Earth system models to predict future climate trajectories.
Techniques to estimate GPP range from eddy-covariance measurements at flux towers sites (Baldocchi, 2003;Baldocchi et al., 2001) to globally-distributed mechanistic land surface model simulations (Friedlingstein et al., 2006;Sitch et al., 2008). At point level GPP can be derived from the partitioning of the net ecosystem exchange as measured with the eddy covariance techniques at a global network of sites. While arguably the most reliable estimations of GPP, eddy-covariance measurements are limited by the uncertainty in the partitioning method (Lasslop et al., 2012), by the restricted area covered by their observation footprints and by the limited and biased spatial distribution of towers across the globe (Schimel et al., 2015). On the other side, land surface models can simulate GPP over a range of spatial and temporal scales across the globe, but the reliability of such estimation is heavily dependent on both the input data and the strong modelling assumptions made about the system, which do not always hold in space or time. Between these extremes are various data-driven approaches that try to remain as close as possible to observations while tolerating a variable degree of modelling abstraction (Durgun, Gobin, Gilliams, Duveiller, & Tychon, 2016;Jung et al., 2011;King, Turner, & Ritts, 2011;Mäkelä et al., 2007;Ogutu, Dash, & Dawson, 2013;Papale & Valentini, 2003;Ruimy, Dedieu, & Saugier, 1996;Running et al., 2004;Veroustraete, Sabbe, & Eerens, 2002;Zhao, Heinsch, Nemani, & Running, 2005).
Data-driven methods for globally distributed GPP estimations generally rely, at least partly, on satellite Earth Observation (EO) either as driving variables or to calibrate model parameters. A typical approach to link remotely-sensed data to GPP involves the light-use efficiency (LUE) approach proposed by Monteith (Monteith, 1972(Monteith, , 1977, which states that GPP is proportional to the product of incoming photosynthetically active radiation (PAR), the fraction absorbed by vegetation (fAPAR), and the efficiency at which absorbed radiation is used in the process of photosynthesis (ε P ): Much research effort has been devoted on estimating the first two terms of this equation from satellite remote sensing (see reviews : Hilker, Coops, Wulder, Black, & Guy, 2008;Malenovský, Mishra, Zemek, Rascher, & Nedbal, 2009. The spatio-temporal variations of fAPAR, which can be derived either from empirical relationships with vegetation indices (VIs) or from inversion of radiative transfer models (Baret et al., 2013;Gobron et al., 2007;Myneni et al., 2002), are typically the major remote sensing contribution to GPP estimations. However, the presence of chlorophyll, which is estimated by fAPAR or certain VIs, is a required but not sufficient condition for the occurrence of photosynthesis. The photosynthetic efficiency (ε P ) varies in space and time depending on both vegetation type and key climate drivers, such as air temperature and soil water content. GPP models (e.g. Running et al., 2004) commonly assume a potential LUE, ideally set to a different value for every major plant functional type (PFT), and then downregulate it based on environmental constrains. Such methodologies are heavily dependent on the availability and quality of both climatic and land-cover data, along with strong assumptions on the capability of these variables to drive the spatio-temporal variations of ε P (Garbulsky, Peñuelas, Papale, & Filella, 2008). Given that important short and long-term dynamics in ecosystem carbon fluxes such as GPP depend on variations in the physiological properties of plant canopies combined with the trends in environmental drivers (Reichstein, Bahn, Mahecha, Kattge, & Baldocchi, 2014), there is an increasing interest in improving the representation of ε P in satellite-based GPP estimations (Garbulsky, Filella, Verger, & Peñuelas, 2014;Grace et al., 2007). Fortunately, photosynthesis emits an optical electromagnetic signal, called chlorophyll a fluorescence, that is sensitive not only to PAR and FAPAR, but also to ε P (Porcar-Castell et al., 2014).
Chlorophyll a fluorescence originates from the core of the photosynthetic machinery and consists of a re-emission of absorbed photons at lower energy wavelengths (from 650 to 850 nm, with peaks at approximately 690 and 740 nm). It is considered to be a mechanism that photosynthetic organisms developed to respond instantaneously to rapid perturbations in environmental conditions of light, temperature and water availability, before the heat dissipation mechanism of nonphotochemical quenching can be triggered (Maxwell & Johnson, 2000). Chlorophyll a fluorescence has been extensively studied in laboratory at scales ranging from the subcellular up to the leaf for the past Fig. 1. Flowchart of the processing steps taken to generate SIF*, which is a proxy of gross primary productivity derived from downscaling a sun-induced fluorescence (SIF) signal at a spatial resolution of 0.5°using three explanatory variables at 0.05°spatial resolution: Normalized difference vegetation index (NDVI), evapotranspiration (ET) and land surface temperature (LST). Violet boxes represent input data, grey boxes represent processing steps and green boxes represent output or intermediary datasets. decades (Baker, 2008). However, recent development have enabled the passive measurement from space-borne remote sensing platforms of what is known as solar induced chlorophyll fluorescence (SIF). Under sunlit conditions, SIF is positively correlated with leaf photochemistry and should thus serve as valid proxy for GPP, even though the mechanistic link between the two remains unclear (Porcar-Castell et al., 2014).
SIF is a very weak signal consisting of less than 5% of the total light absorbed by the plant. However, it is comparably stronger in some narrow parts of the solar and atmospheric spectrum in which irradiance is strongly reduced (the Fraunhofer lines), making the signal detectable passively from narrow-band instruments (for a review see Meroni et al., 2009). Various space-borne instruments destined for atmospheric research can provide the necessary spectral and radiometric sensitivities for SIF retrieval from space, despite not having been designed for this purpose. Monthly global SIF maps have thus been recently produced from the GOSAT, SCIAMACHY, and GOME-2 instruments (Frankenberg et al., 2011;Guanter et al., 2012;Joiner et al., 2013;Joiner, Yoshida, Vasilkov, Corp, & Middleton, 2011;Köhler, Guanter, & Joiner, 2014). However, because the signal remains weak and the measurement noise is comparatively large, the SIF measurements currently need to be averaged in time and space, resulting in datasets with a coarse spatial resolution (0.5°at best with GOME-2) and the temporal resolution (typically 1 month). Although several studies have already shown that these datasets can be related to GPP (Frankenberg et al., 2011;Guan et al., 2015;Guanter et al., 2014;Parazoo et al., 2014;Yoshida et al., 2015), the coarse spatial resolution remains a bottleneck as most ecosystems are highly heterogeneous at spatial scales larger than 5-10 km. The situation will be improved with the recently launched NASA Orbiting Carbon Observatory-2 (OCO-2) (Crisp et al., 2004;Frankenberg et al., 2014), the upcoming TROPO-spheric Monitoring Instrument (TROPOMI) on the Sentinel-5 Precursor Veefkind et al., 2012) and the proposed FLEX mission specifically dedicated to measuring SIF and photosynthesis (Kraft et al., 2013;Mohammed et al., 2014;Moreno et al., 2006). However, these will not be able to go back in time to provide GPP information for the past, and many applications in ecology, agronomy, forestry and climate sciences will benefit of having the longest archives possible.
The aim of the present work is to contribute in the overarching goal of exploiting current coarse resolution SIF products to improve spaceborne GPP estimation by tackling the spatial resolution issue. The objective is to spatially disaggregate the information contained within each coarse SIF pixels by using a non-linear model based on the concept of light use efficiency. The strategy involves the aggregation of highresolution remote sensing biophysical variables to calibrate the downscaling model locally at every individual time step, which can then be applied to finer spatial resolution data to provide a downscaled estimate. The resulting dataset should be considered as a proxy for fluorescence that could later be used for several purposes, like the detection of vegetation stresses, the quantification of drought impacts (e.g. Sun et al., 2015;Wang et al., 2016), or incorporated in a mechanistic model of primary productivity. To assess the value of this new modelled proxy, the coherence in the temporal trend with respect to GPP is explored by both analysing the global signal per plant functional type and comparing it locally to flux-tower GPP estimates. As this dataset is expected to help further research in using SIF-based modelling for different applications, this paper further aims at making an inter-comparison with existing GPP products and present an estimate of global terrestrial productivity per selected vegetation types.

GOME-2 fluorescence
The global SIF dataset used in this study is the one proposed by Joiner et al. (2013) and provided by the NASA Aura Validation Data Center (http://avdc.gsfc.nasa.gov/). It consists of far-red fluorescence (referenced at 740 nm) retrieved from hyperspectral observations from the Global Ozone Monitoring Experiment-2 (GOME-2) instruments on-board of MetOp-A and MetOp-B satellites. The methodology for SIF retrievals is based on a simplified radiative transfer model to disentangle the spectral signatures caused by atmospheric absorption, surface reflectance, and fluorescence radiance. This involves an empirical principal component analysis approach to compute the atmospheric absorption component. Retrievals are done on the daily observations (level 2 product), but given that the resulting SIF estimates are very noisy, these are typically averaged in time and space resulting in the level 3 product. This level 3 dataset has a spatial resolution of 0.5°and at a monthly time step. Version 25 of this product covering the period 2007-2013 was used in this study. Additional pre-processing steps included here are removing SIF monthly values falling outside the plausible range of 0-6 mW/m 2 /sr/nm, along with those for which the individual valid observations in the monthly period are less than 5 or have a standard deviation of more than 2 mW/m 2 /sr/nm. Fig. 2. Schematic representation of (1) how the downscaling function is applied to the high spatial resolution explanatory variables using (up to) 9 sets of parameter values taken from the spatial vicinity; and (2) how the resulting downscaled estimations are averaged with weights depending on the high spatial resolution pixel's location within the coarse pixel.  . Flowchart summarizing the steps taken to make the flux-tower evaluation comparing the SIF* product with other GPP products and proxies. Violet boxes represent input data, grey boxes represent processing steps and green boxes represent output or intermediary datasets.

Biophysical variables from MODIS
Three biophysical variables derived from satellite remote sensing data at a spatial resolution of 0.05°are used to downscale the GOME-2 SIF dataset. All three are derived from the MODIS instrument, which has been flying on-board of 2 platforms: Terra since 2000 and Aqua since 2002. All data are gathered for the same period of 2007-2013 with a monthly temporal resolution.
The first variable is the Normalized Difference Vegetation Index (NDVI) obtained from the MYD13C1 version 5 product provided by Fig. 5. Comparison between the input coarse spatial resolution SIF product and the fine spatial resolution downscaled SIF* product for a bimonthly temporal sequence over 3 selected zones. Fig. 7. Example of the advantages of using the methodology adopted in this paper. The top row illustrates SIF* obtained by a simple downscaling, resulting in several gaps and squared artefacts in the zoomed region. The bottom row shows the method adopted (summarized in Fig. 2) based on spatially averaging (up to) nine sets of SIF* downscaled values, each based on parameters obtained within the 3 × 3 coarse pixel vicinity. NASA. NDVI has widely been used in Earth monitoring as it responds to the amount of green vegetation biomass present on the ground, and can be used as a proxy for fAPAR to a certain extent. This particular product (MYD13C1) consists of cloud-free spatial composites of the gridded 16 day 1 km NDVI product obtained from the AQUA MODIS instrument, and are provided as a level-3 product projected on a 0.05°(5600 m at the equator) geographic Climate Modelling Grid (CMG). Cloud-free global coverage is achieved by replacing clouds with the historical MODIS time series climatology record. The data is available from the NASA LPDAAC website (https://lpdaac.usgs.gov/).
The second variable is evapotranspiration (ET), obtained from the MOD16 product proposed by NTSG (Mu, Zhao, & Running, 2011). Evapotranspiration is the quantity of water lost by the land surface by the combined processes of evaporation and transpiration through leaf stomata. Plants close their stomata when water is limiting factor for photosynthesis, thereby reducing ET, which can thus serve as an indicator of the water stress in the soil-plant continuum. This product is obtained by integrating several MODIS products (land cover, albedo, leaf area index, and Enhanced Vegetation Index) with meteorological data. MODIS ET is also available in the CMG grid with a monthly temporal resolution. The data is available from the NTSG website (http://www.ntsg. umt.edu/project/mod16).
The third variable is the daytime land surface temperature (LST) obtained from the MYD11C3 version 5 product of NASA (Wan, 2008). In this product, LST is the radiometric (kinetic) temperature derived from the thermal infrared radiation emitted from the land surface. Thus, over vegetated areas LST mostly relates to the canopy top and is therefore representative of the leaf layers with the highest photosynthetic rates. The daytime LST from the MYD11C3 product consists of 1 km measurements performed by MODIS on-board of the AQUA platform at around 13:30 local time (depending on the latitude) and averaged over a monthly period and over the CMG grid. The choice of using AQUA rather than TERRA is that the early afternoon overpass of the former better coincides with the moment when the heat stress on vegetation is likely to be more pronounced. Data are available from the NASA LPDAAC website (https://lpdaac.usgs.gov/).

Data at flux tower site level
A valuable source of GPP estimates comes from eddy covariance measurements performed at the global network of flux towers. Time series of GPP available in the FLUXNET LaThuile dataset under the "fair use" data policy were evaluated for the present analysis. In total, 143 sites were initially considered, with data covering variable time periods spanning from 1991 to 2007 depending on the site.
To evaluate the spatial representativeness of the flux sites in the 0.05°grid cell an assessment of the surrounding of the sites in an area of 0.3°by 0.3°has been performed. For this purpose two sources of spatial data were collected over this area. The first dataset consists of high spatial resolution remote sensing imagery extracts from Google Earth that will serve to visually assess the homogeneity of the landscape. The second dataset consists of NDVI time series at a 16-daily time step and with a spatial resolution of 250 m obtained from the MODIS instrument on-board of the Terra platform, available under the MOD13Q1 product of the NASA LPDAAC.

GPP global products
The first data-driven global GPP product considered in this study is the one proposed by NASA based on the MODIS instrument. This MOD17 product (Running et al., 2004;Zhao et al., 2005) is produced with a light use efficiency model by combining canopy biophysical information derived from MODIS with meteorological and land cover information, with some adjustments based on localised in situ fluxtower observations. The product used is a version that has been aggregated to a spatial resolution of 0.05°and a monthly temporal resolution (available here: http://www.ntsg.umt.edu/project/mod17). This product is henceforth referred to as simply MOD.
The second GPP dataset is the MTE-GPP product of the Max Planck Institute's Biogeochemical integration group (Jung et al., 2011). It is constructed using a machine learning method (ensemble of regression trees) to upscale information from flux-towers up to a 0.5°grid, aided by gridded meteorological and remote sensing co-variables (available here: https://www.bgc-jena.mpg.de/geodb/projects/Data.php). This product will henceforth referred to as MPI.

Ancillary data
The analyses in this paper require information on the global distribution of different vegetation types or ecosystems. The delimitation of the spatial extent of these ecosystems is based on a combination of climatic zones and land cover classifications.
The climate zones are based on the Koppen-Geiger climate classification system updated with the climate from 1950 to 2000 by Kottek, Grieser, Beck, Rudolf, and Rubel (2006). Only the major climate zones are considered (Tropical, Dry, Temperate and Continental) after having been split between the northern and southern hemisphere counterparts. The spatial resolution of the maps is 0.5°. When these maps need to be crossed with datasets at 0.05°spatial resolution, all 0.05°pixels falling in the 0.5°cells are considered to have the same climate.
Land cover is obtained from the MODIS land cover products (Friedl et al., 2010). The MCD12C1 version 5 product with a 0.05°spatial resolution is used, which consists of the aggregation of the finer 500 m land cover product by selecting the dominant land cover types within the coarser 0.05°grid cells. Separate land cover maps are available for

Downscaling approach
The downscaling approach consists in establishing a model that physically relates the coarse spatial resolution dependent variable (SIF) with explanatory variables that are available at both fine and coarse resolution. The resulting product is a modelled spatio-temporal field and is henceforth referred to as SIF* to distinguish it from the original SIF signal proposed by Joiner et al. (2013). The methodological steps to go from coarse SIF to fine SIF* are summarized in Fig. 1.
The model is to be calibrated and applied only over a local spatiotemporal moving window. Spatially, this window is defined for each coarse resolution pixel by selecting the 40 nearest valid pixels (including the central one) within a region of 11 × 11 pixels (i.e. 5.5°× 5.5°). This method allows having the same number of samples for every selected grid cell, but it also adapts (to a certain extent) to the coastlines, thereby avoiding the limitation of a symmetric moving window (e.g. Fig. S1 in the supplementary information). Temporally, the window is here fixed to a single month, which is the finest resolution of the Level 3 SIF GOME-2 data used. By limiting the windows to a single time slice, this approach ensures that the explanatory variables only serve to disaggregate the coarse SIF at each time step, and not to impose their temporal behaviour to the derived SIF* signal.
To construct the model, several assumptions are made. The first assumption is that, as described by several authors (Berry et al., 2013;Joiner et al., 2014;Schimel et al., 2015), SIF can be expressed using an analogous equation to that linking GPP with LUE (Eq. (1)): The main differences lies in the term ε F , describing the efficiency at which the plant emits photons back as fluorescence, instead of the ε P term that relates to photosynthesis efficiency. It is further assumed that the spatiotemporal variations of ε F can be modelled as a function of hydric and thermic stresses, for which the remote sensing biophysical variables ET and LST can serve as proxies: In this formulation, b 0 is a constant value that is downregulated by f(ET) and f(LST), which can both only range from 0 to 1. Another assumption is that over the local spatio-temporal window, PAR remains relatively constant and that the resulting APAR (i.e. PAR × fAPAR) can be modelled as a function of NDVI, resulting in the following model structure: Fig. 10. Comparison of the original SIF signal at 0.5°spatial resolution (LR) with the downscaled SIF* product at 0.05°spatial resolution (HR) when aggregating to specific land cover classes within different climate zones. The panes on the left depict the 5, 25, 50, 75 and 95 percentiles of the aggregated time series. The panes on the right illustrate the temporal evolution of the coefficient of variation, which is defined as the ratio between the standard deviation and the mean expressed in percentages. EBF and ENF respectively stand for evergreen broadleaf forest and evergreen needle-leaf forest.
A quadratic, a sigmoid and a Gaussian function are respectively used to model f(NDVI), f(ET) and f(LST), resulting in the following expanded expression: The b i parameters are to be estimated at monthly resolution and locally over the moving window through a calibration process. For this purpose the first step is to aggregate linearly the datasets of explanatory variables (NDVI, ET, and LST) from their native 0.05°spatial resolution to the 0.5°resolution of the GOME-2 SIF dataset. In case of missing values at fine spatial resolution, the aggregation is done only if a minimum of 50 valid values are available (otherwise the coarse data point is flagged as not available). Since Eq. (5) is non-linear, a Quasi-Newton "L-BFGS-B" optimization algorithm, as implemented in the core R package stats (R Core Team, 2015), is used to find the optimum b parameters over every local window. This algorithm is based on the gradient projection method and uses a limited memory Broyden-Fletcher-Goldfarb-Shanno (BFGS) matrix to approximate the Hessian matrix of the objective function (Byrd, Lu, Nocedal, & Zhu, 1995). It allows the specification of a lower and upper bound for each parameter. In this case, these are set based on educated guesses on the limits that each parameter may encounter. Table 1 reports the values set here for these bounds, along with the initial conditions, which are chosen as to ground the resulting sets of parameter to physiologically plausible solutions. The same initialization values are used across the globe irrespective of climate or land cover type. The algorithm minimizes the cost function defined as the sum of squares of the residuals between measured and modelled SIF values. The result of the optimization process is a raster stack with a coarse (0.5°) spatial resolution containing the spatialized values of each of the 6 parameters in different layers.
Once the vector of optimized parameter values are available for each coarse grid cell and time step, the relationship of Eq. (5) can be applied to all fine spatial resolution pixels located in that coarse grid cell to result in 10 × 10 pixels of SIF*. Two issues arise when performing this operation. First, some places do not have values for coarse grid cell because the SIF values might have been discarded prior to the calibration step (due to excessive noise or insufficient observations) or because no SIF value was available in the first place (such as a peninsula delimitated at fine spatial resolution but absent at the coarser one). Second, abrupt transitions in optimized parameter values may occur between adjacent coarse spatial resolution grid cells that may generate an artificial 'large square' pattern in the downscaled data due to the relatively large difference of spatial scales. Due to the non-linear nature of the downscaling Fig. 11. Comparison of the temporally averaged coefficient of variation of the original SIF signal at 0.5°spatial resolution (LR) with the downscaled SIF* product at 0.05°spatial resolution (HR) when aggregating to specific land cover classes within different climate zones. The error bars represent the standard deviations of the values in time. Only cases in which at least 500 pixels are encountered are plotted. Climate zones have been separated according to their location in northern and southern hemispheres to avoid artefacts due to differences in seasonality. DBF, EBF and ENF respectively stand for deciduous broadleaf forest, evergreen broadleaf forest and evergreen needle-leaf forest. function (Eq. (5)), the parameters cannot be smoothed out spatially prior to downscaling. To minimize artefacts due to these two issues (data gaps and step transition in parameter values), each high spatial resolution SIF* estimate is obtained as a spatially weighted average of nine SIF* estimations obtained by applying the downscaling functions with an ensemble of nine different sets of parameters, those located within a 3 × 3 coarse pixel window around it. The weights within this window are calculated using a 2-D Gaussian function (with standard deviation of 15 km) over the high spatial resolution grid. The selection of weights for the (up to) nine SIF* estimations depends on the relative position of the high spatial resolution pixel within the low spatial resolution grid (see Fig. 2). This method allows a smoother spatial transition at the fine scale while filling up the gaps based on adjacent SIF estimations.

Comparison with flux tower GPP
The comparison of a coarsely gridded product, such as SIF or even the finer SIF*, with point observations like flux tower data is hampered by the large differences in the coverage of each measurement. Specifically, these differences include the temporal frequency of the measurements, their spatial footprints and the period over which the data are available. The various steps taken to overcome these hurdles are summarized below and in Fig. 3.
The first task is to prepare the GPP time series for each flux tower from the available half-hourly GPP values resulting from the flux partitioning method of Reichstein et al. (2005). These are aggregated to monthly values in order to match the temporal resolution of the available SIF dataset and to that of the other global GPP products. Only the site-months for which more than 25 daily values per month were available are retained, in order to avoid non-representative monthly GPP estimates.
The second step consists in reconciling the spatial scales of the different data-streams. The spatial footprint of eddy covariance measurements is typically less than 1 km 2 (Rebmann et al., 2005), which is a much smaller area than the 0.05°grid cell of the downscaled SIF* product (let alone the original SIF data pixels that are 100 times larger). Fig. 12. Spatial and temporal pair-wise correlations between the SIF* downscaled product and the MODIS MOD17 GPP product (MOD) and the MPI-BCG MTE GPP product (MPI) when aggregated by land cover class and climate zones. Only cases in which at least 500 pixels are encountered are plotted. Climate zones have been separated according to their location in northern and southern hemispheres. DBF, EBF and ENF respectively stand for deciduous broadleaf forest, evergreen broadleaf forest and evergreen needle leaf forest.
In addition, flux tower measurements are associated with a single vegetation type, but there is no guarantee that this vegetation type is present far beyond the expected footprint of the measurement. Given these potential sources of point-to-pixel mismatch, the suitability of every single tower to represent the signal of a 0.05°grid cell has been carefully inspected using a combination of high spatial resolution Google Earth imagery with time series of indicators derived from 250 m NDVI images. These indicators are: 1. h pix : The heterogeneity of the 250 m NDVI over the 0.05°grid cell, quantified using the standard deviation of all NDVI values at a given date. 2. δ tow −pix : The absolute difference between 250 m NDVI at the tower site and the average NDVI over the 0.05°grid cell at a given time. 3. τ tow : The mean absolute difference between a given yearly NDVI curve for the flux tower 250 m pixel and the mean "climatological" curve for the 2000-2013 period for the same pixel. 4. τ pix : Same as τ tow but with the mean NDVI values over 0.05°grid cell instead of the flux tower 250 m pixel.
All indicators are calculated for a single time slice but they can also be averaged over a year to provide a mean annual metric. These indicators have been jointly examined for each site as illustrated in Fig. 4. The main criteria used to screen the sites is whether the land cover corresponding to the flux tower represents the area of the 0.05°g rid cell (as a rule of thumb, a minimum threshold of 3/4 of the grid cell was considered). Large values in any of the indicators mentioned above also served to remove sites in which the behaviour of the vegetation at the tower appears to be too different from the vegetation covering the rest of the 0.05°grid cell. Other exclusion criteria included: cases in which the plant functional type (PFT) associated with the tower did not seem to match with what can be observed in the high resolution imagery; cases in which large water bodies are present in the 0.05°grid cell; cases in which it was deemed there was not enough data (of either SIF ⁎ or GPP) to make a proper comparison. Finally, all towers over crops were discarded because the variability inter-annual dynamics, driven essentially by management of this land use, compromises the comparison for non-overlapping periods, as it will be made clear in the next paragraph.
Once the screening of suitable towers is completed, the last step before confronting the SIF and SIF* products and the tower-based GPP is to reconcile the fact they are not covering the same period. For this purpose we limited the comparison to the multi-year average curve, or Fig. 13. Map illustrating the pair-wise temporal correlation between 3 datasets: terrestrial chlorophyll fluorescence (SIF), the MOD17 GPP product (MOD) and the MPI-BCG MTE GPP product (MPI). The top map (a) uses downscaled SIF* while the bottom map (b) uses the original SIF. Each primary colour represents the squared correlation coefficient between 2 datasets (squares are used to avoid negative numbers which would complicated the colour representation). If all 3 datasets are equally correlated at a given area, that area is coloured in a greyscale. Dark areas show poor agreement for all paired comparisons. If a given hue stands out, it indicates a given pair of dataset agrees more between each other than with the third. climatology, of each variable. The 14-year 250 m NDVI climatology is used to ensure that anomalous years are not included in the calculation of the tower GPP and satellite SIF climatology curves. More specifically, the anomalies in the τ tow and τ pix metrics are used to identify "non-normal" years respectively in the tower GPP and in the gridded SIF time series (year i is considered anomalous if τ i N μ(τ)+2 σ(τ) where μ is the mean and σ the standard deviation). The temporal agreement between the climatological curves of GPP derived from flux tower and those of different GPP proxies (such as NDVI, SIF, MOD17 GPP or MPI-MTE GPP) can then be obtained per flux tower site. In this study, the metric of agreement that is considered is the adjusted coefficient of determination (R 2 ), obtained after fitting a linear regression between the GPP proxies and the flux tower GPP.

Ecosystem level assessments
To extent the evaluation of the proposed downscaled product beyond the limitations imposed by the flux-tower data, a product intercomparison is done at ecosystem level based on three separate assessments. The first test consists in analysing the temporal variability of the original SIF and the downscaled SIF* product over the 7 years of data when clustered by land cover and climate zones. For each group and at each time step, the following metrics are calculated: the 5, 25, 50, 75 and 95 percentiles; and the coefficient of variation, defined as the ratio (expressed in percentages) between the standard deviation and the mean of a given set of values. The temporal evolution of these metrics is expected to illustrate the improvement realized by downscaling the data to a spatial resolution at which ecosystems are less heterogeneous.
In a second evaluation, the spatio-temporal evolution of the downscaled SIF* dataset is compared to that of the two data-driven GPP products: MOD and MPI. This assessment looks only at the correlation between datasets, following the protocol proposed by Meroni et al. (2012). The temporal correlation analysis is performed by collecting time series for each pixel in each dataset, calculating the Pearson moment-product correlation coefficient among these vectors and then mapping the resulting values in space. The result is a map of the temporal correlation between the datasets, which can then be summarized per vegetation type. The spatial correlation analysis is done by collecting for each product all values at a given time over predefined areas (delimited by vegetation type). The correlation between values obtained for the different products is retained for each time step, resulting in a time series of correlations for each vegetation type. As the inter-comparison exercise involves 3 different products, 3 separate pair-wise analyses have been produced (SIF* vs MPI, SIF* vs MOD, MOD vs MPI).
A third assessment looks at actual estimations of global GPP from SIF and compares them with the equivalent values calculated from the other two GPP products. Establishing a direct relationship between SIF (or SIF*) and GPP in a mechanistic way is beyond the scope of this work, but an empirical analysis is performed based on linear regressions between the flux tower GPP and the gridded SIF at coarse and fine resolution. Because such empirical relationships can be expected to be dependent on the vegetation type, separate regressions are considered for four different groups for which there were sufficient data from flux stations. These are: (1) evergreen needleleaf forests (ENF); (2) deciduous broadleaf (DBF) and mixed forests (MF); (3) grasslands; and (4) savannahs and woody savannahs. In this analysis, space and time are combined by considering together all site-years pertaining to a given vegetation type, in order to maximize the data points available to fit each regression. Note that separate regression coefficients are obtained for the coarse 0.5°SIF data and for the fine 0.05°SIF* product, each based on their respective values confronted with the tower GPP estimates. Because both SIF and SIF* datasets contain gaps in space, either because the original retrievals or the downscaling method was unsuccessful, these are filled with the 7-year monthly climatology of either products. The SIF or SIF* are then cumulated annually for every grid cell. The yearly land cover maps are then used to separate spatially the areas corresponding to the four above-mentioned vegetation groups. The regression coefficients are applied to each grid cell to convert SIF or SIF* to GPP. To obtain global values in petagrams of carbon per year (Pg/ year), all grid cell values for a given vegetation group are cumulated taking into account the variable land area of each cell. Despite the temporal gap-filling that was previously applied, some areas can still have missing SIF* data compared to the equivalent area covered by SIF (e.g. because SIF cannot be downscaled properly at all times), which may in turn underestimate the total amount of carbon fixed. To avoid this problem, the first estimate is rescaled by multiplying it by the ratio between the total area covered by the target ecosystem and the actual area covered by the effectively available SIF* pixels.

Overview of the downscaled product
The improvement obtained from downscaling the GOME SIF data from 0.5°to the derived 0.05°SIF* product can be appreciated in the subsets provided in Fig. 5. The downscaled product fully captures the same spatial patterns present in the original data, and disentangles them with much finer spatial detail at sub-pixel scale, allowing for instance the identification of river systems.
If the downscaled data is re-aggregated to original 0.5°spatial resolution, it can then be subtracted from the original SIF data to expose the residual errors of the downscaling procedure. As it can be seen in Fig. 6, the residuals show a largely random spatio-temporal structure. The values of these errors are generally within the margin of 0.1 to 0.4 mW/m 2 /sr/nm, which is the retrieval error of the original product reported by (Joiner et al., 2013).
The benefits of using the 9-member weighting procedure are illustrated in Fig. 7. These include reducing the number of gaps and removing 0.5°pixel edge effects remaining after the downscaling.

Comparison with flux tower GPP
The final set of flux towers retained for this analysis consists of 40 out of the initial 143 flux towers considered from the LaThuile dataset. Their spatial distribution is depicted in Fig. 8, while an exhaustive list is provided in the supplementary material. Despite having only a limited amount of towers and not being able to assess the inter-annual variability due to the lack of temporal overlap between the time series, the results are encouraging. The results are presented as a curve specifying, for each GPP proxy considered here, the proportion of towers within the ensemble that have an R 2 above a given threshold. As could be expected, the original SIF is overall a better predictor of flux-tower GPP than NDVI (see Fig. 9a), confirming previous results by other authors (Joiner et al., 2014). However, the downscaled SIF* product performs even better than the original signal, arguably because the spatial coverage of the finer grid cell is closer to the footprint of the flux tower measurements. In addition, the 0.05°SIF* performance almost reaches the level of the dedicated GPP products, even though the latter use ancillary information (meteorology and spectral indexes) and are calibrated on these same flux towers used in this evaluation, while the former uses no such information. When considering the overall performance of SIF* in Fig. 9, it must also be stressed that the temporal dynamic is driven by the original SIF signal and not by the three biophysical downscaling variables. The contribution of these variables is to spatialize the coarse SIF signal at each timestep individually, and thus do not impose their temporal behaviour onto the SIF* signal. Declining the comparison of performances by the plant functional types (see Fig. 9b) shows how the reduced performance of SIF and SIF* is considerably lower for ENFs, while for grassland ecosystems it can even outperform the other products.

Ecosystem level assessment
The first ecosystem-level assessment demonstrates the increased coherence of the SIF* signal with respect to the original SIF when considering an aggregation per climate zone and per land cover type. Fig. 10 illustrates the evolution in time of the distribution of all SIF and SIF* values for a selected number of climate zone/land cover classes. The increase in ecosystem-level signal purity is demonstrated by the narrowing of the distributions, which furthermore appear more consistent with the expected behaviour of the targeted ecosystems. To quantify this increase in coherence, Fig. 10 also compares the evolution of the coefficient of variation (CV, defined as the standard deviation divided by the mean in per cents) from the coarse and fine datasets. In all cases, the CV goes down. In some cases, such as boreal evergreen needle-leaf forests, it does so in a very considerable way. Fig. 11 summarizes this reduction of the CVs for major land cover types in all climate zones.
The second ecosystem-level analysis illustrates how 3 GPP proxies, SIF*, MOD and MPI, are correlated in space and time. The pair-wise correlations per classes of land cover and climate zones are summarized in Fig. 12. It shows both that the three products do not always agree (which would make them redundant), and that no pair is systematically more correlated than the others. For example, for croplands and DBFs in the tropical climate downscaled SIF* and MPI GPP are more highly correlated in time between each other than they respectively are with the MOD product. However, MOD and MPI have higher temporal correlation among each other in continental climate for all land covers excepting croplands and DBFs. Temporal correlations is equal or above spatial correlation in most cases. This suggests how it is easier to catch changes in vegetation productivity in time that in space, particularly when there is a strong seasonality such as in croplands and DBFs. When there is little or no seasonality, e.g. in tropical evergreen broadleaf forests, correlation across products is typically poor (interestingly, the highest correlation pair in these cases always includes SIF*). To better visualize the spatial patterns of where products agree temporally, Fig. 13a shows the averaged (squared) correlations per pixel as a map. The spatial resolution is aggregated to 1°using an arithmetic average. Large parts of the World are shown in light hue-less taints in Fig. 13a, indicating high agreement between all 3 datasets, while other regions vary in hue indicating how 2 out the 3 datasets agree more between each other than with the third. Fig. 13b shows the same map using the original SIF dataset instead of the downscaled SIF*. These two maps show general consistency, but the map including SIF* does show higher overall agreement with the other products (reduced blue fraction in many areas) and a smoother surface.
The third and last ecosystem-level analysis moves beyond correlation and the three GPP proxies directly with GPP estimates for the selected vegetation types. The scatter plots and regression lines between SIF or SIF* and flux-tower GPP are presented in Fig. 14 per vegetation type. These include all valid sample points in both space and time, so each is not fully independent from each other. As a consequence, Fig. 14 should not be considered as an appropriate diagnostic of the improvements in relationship with GPP when passing from SIF to SIF*, but rather as a summary of the regressions used to make an estimate of global ecosystem-level GPP. All scatterplots appear fairly linear with the exception of the ENF in which low values of GPP do not seem to be so well related with low (near-zero) SIF values. Despite this suboptimal situation for ENF, a linear regression is still considered for the sake of simplicity. Fig. 15 shows the temporal evolution of the global annual GPP when the relationships in Fig. 14 are used to convert SIF/SIF ⁎ to GPP.

Discussion
The two main outcomes of this study are i) the methodology developed to produce the downscaled SIF* dataset and ii) the dataset itself, for which the increase in spatial resolution has resulted in an enhanced proxy for GPP. The novelty in the methodology consists of adopting a physically and physiologically sound concept of light-use efficiency to spatially disentangle the coarse SIF signal that is known to correlate with GPP. This approach can now serve as a baseline that can be refined by exploring if better explanatory variables exist, adjusting how these are arranged in the model and applying it to other SIF datasets. The arrival of new instruments capable of measuring SIF combined with new updated validation datasets from flux-tower or flights will undoubtedly stimulate further developments in this direction. The other methodological outcome is the downscaling framework itself, involving an adaptable moving window and a 3 × 3 weighted smoothing procedure that result in a reduction of both gaps and residual tessellation artefacts. This approach could be adapted other downscaling applications beyond the domain of carbon balance or remote sensing.
Various analyses done in this study attest to the potential value of the generated SIF* dataset. This is best illustrated by SIF* increased performance, with respect to the original SIF, in representing the temporal dynamics of GPP as estimated by independent eddy-covariance measurements. Resolving the spatial patterns of different land cover types from within the coarse SIF estimate is arguably the main reason for the increase of performance, as the selected flux tower sites were selected to ensure only one (and the same) dominant land cover class falls both within the SIF* and the eddy-covariance instrument footprints. It should be stressed again that because the downscaling is done independently at every time step, the temporal dynamic of SIF* is driven by SIF only and not by the temporal dynamics of the downscaling variables. The predictive capacity of SIF* to estimate GPP is comparable to that of dedicated data-driven GPP products that are (unlike SIF) both: (1) driven (partly) by meteorological data and other EO products like NDVI or fAPAR, and (2) calibrated with the same flux tower GPP estimates used here for evaluating their performance. SIF is also further penalized by the strong noise generated during its retrieval (caused because the actual SIF signal is so weak) that may reduce the R 2 with GPP even if the underlying correlation should be high. From this point of view it is very encouraging to see the excellent performance of SIF*, suggesting that combining it with meteorological drivers could further increase the performance and perhaps lead to a new generation of datadriven GPP products such as MOD17 (Running et al., 2004;Zhao et al., 2005) or MPI-BGC MTE (Jung et al., 2011). While these deductions are based on improved temporal correlations with flux-tower GPP, they should hold also for spatial correlations. A small analysis of spatial correlation, presented in the supplementary material, did not reveal such improvement. However, to make such analysis properly a spatially dense network of flux tower is necessary (ideally with several towers per coarse SIF pixel), and the severe screening of sites needed to cope with ensuring the tower-pixel adequacy probably reduced the site numbers excessively. More evaluation is needed with updated FLUXNET data to assess both this spatial improvement and to investigate the inter-annual variability with synchronous flux and satellite measurements.
The added-value of disentangling the spatial components within the coarse SIF signal can also be appreciated at ecosystem level. The reduction of noise when passing from SIF to SIF*, as expressed by the reduction in coefficient of variation in Fig. 10 and Fig. 11, can be attributed to two effects. First, the generation of SIF* entails an extra degree of modelling that smooths out noise. Second, SIF* produces ensembles of time series that are more similar to each other when looking at different clusters of land cover/climate zones. The reason behind this increased temporal homogeneity is that land cover specific curves can be selected more easily from the 0.05°SIF* pixels than from the mixed coarse 0.5°S IF pixels. The reduction in variability due to this spatial un-mixing is clearer in forested environments that in croplands, most probably because croplands still have a large diversity of contrasting crop specific temporal profiles within a 0.05°pixel.
The inter-comparison of SIF* with other GPP products at ecosystemlevel brings several interesting points worth discussing. First, the graphs in Fig. 12 and the maps in Fig. 13 indicate that SIF* agrees widely with the other two GPP products in large parts of the world, comforting its value as a GPP proxy. Second, there are several places of disagreement, notably in the tropics and the boreal zones, which are the areas of greater interest for monitoring carbon fluxes and stocks, and which coincide with the sparser availability of flux towers sites (Schimel et al., 2015). In tropical evergreen broadleaf forest, the higher correlations occur when SIF* is involved, suggesting this signal is bringing a compromise where the other two products disagree more. In continental biomes including evergreen needleleaf forests, shrublands and savannas, the pair-wise agreement between GPP products is systematically higher than any agreement with SIF*. This could be attributed to the GPP products being strongly correlated to the small number of flux-towers available in these northern latitudes, but the relationship between SIF/SIF* and flux tower GPP also appears to be weaker by itself for evergreen needleleaf forests (see Figs. 9 and 14), perhaps due to the more complex canopy structure of these ecosystems. Another notable point is the general consistency between the two maps in Fig. 13, which illustrates how the SIF signal from the original data is driving itself a large part of the agreement or disagreement with the GPP products. However, the SIF* map does show higher overall agreement with the other products, indicating that the modelling, downscaling and smoothing steps used to generate SIF* have a positive effect in providing a proxy closer to GPP. Finally, when looking in terms of total terrestrial annual GPP per ecosystem (Fig. 15), the temporal patterns match those proposed by the other GPP products. This synchronicity may partially be driven by the annual changes (and errors) reported by land cover products, but overall the values are of the same order of magnitude. Overall, it is difficult to conclude whether the passage from SIF to SIF* brings an improvement as neither of the GPP products can be considered a definitive reference, but in the case of deciduous and mixed forest, the downscaling results in a strong alignment with the MPI-BGC MTE product. Remarkably, in the four examined cases total GPP estimates are increased when using SIF* rather than SIF. A possible explanation is that estimates based on SIF* are made using spatially purer samples, while those using the original SIF dataset rely on coarser (and thus more mixed) pixels. Because the ecosystem considered here are rather highly productive (such as forests and grasslands), purer samples result in higher total GPP values that are less affected by less productive ecosystems (such as shrublands and wetlands) that are more likely to be included in the coarser pixels.
The SIF* dataset could find various uses for applications for which the current SIF datasets are too coarse in spatial resolution. These include: (1) a better parametrization of GPP models at flux tower level based on SIF; (2) making studies at ecosystem or plant functional type level avoiding systematically mixed pixels; (3) fine spatial resolution studies on the impact of climate constrains (and in particular water limitation) to primary productivity; (4) using the downscaled product to explore the possibility of synergistically combining SIF products coming from different satellite platforms with contrasting sampling strategies (e.g. with OCO-2); and (5) use these data to calibrate mechanistic models of fluorescence (e.g. van der Tol, Verhoef, Timmermans, Verhoef, & Su, 2009) in view of the improvements of GPP estimates from carbon cycle data assimilation system (e.g. Koffi, Rayner, Norton, Frankenberg, & Scholze, 2015). However, one important caveat is warranted. The SIF* product should be regarded as a baseline dataset that can serve as a proxy for GPP, rather than as a sun-induced fluorescence dataset sensu stricto. SIF varies instantaneously along the day while the input data used in this study (the level 3 GOME-2 SIF product) are themselves based on monthly average of daily observations acquired at the specific overpass time of the MetOp satellite. Therefore, the downscaled product cannot be expected to accurately represent how SIF behaves at finer temporal resolution.
To do so, downscaling should be done directly on the level 2 GOME-2 SIF data (which are the instantaneous retrievals before averaging temporally and spatially) with the synchronised instantaneous values of explanatory variables. Such approach would be more appropriate, but also more complicated because of the necessity to match the spatial and temporal supports of the SIF instantaneous retrievals with those of the explanatory variables. As new missions and new retrieving algorithms become capable of producing SIF estimates with higher accuracy, exploring whether the downscaling can be applied to shorter temporal composites should also be considered, as the monthly temporal resolution is also a limitation for many applications. In the meantime, further research should now be devoted to consolidate the present approach by applying it to other SIF retrievals from GOME-2 (Köhler et al., 2014), or from other satellites such as GOSAT (Frankenberg et al., 2011;Guanter et al., 2012) or SCIAMACHY (Joiner et al., 2012;Köhler et al., 2014). This could serve as a basis to generate a consolidated archive of SIF* that could enable a better exploitation of the data that will be available from new instruments such as TROPOMI or FLEX.

Conclusions
The work presented shows how spatially downscaling a coarse resolution sun-induced chlorophyll fluorescence product with ancillary remote sensing data leads to an improved temporal correlation with GPP. By ensuring that the downscaling methodology remains independent at every time step, the temporal behaviour of the resulting SIF* signal is not driven by that of the downscaling variables. The resulting performance is comparable to dedicated GPP products despite that (unlike SIF*) these are calibrated based on the same flux towers, driven by meteorological data and not hampered by the large noise caused by the SIF retrieval. This encouraging outcome suggests SIF* could be incorporated in a more dedicated mechanistic model of primary productivity to provide new datasets of remote sensing driven GPP estimates.