Inference of spatial heterogeneity in surface fluxes from eddy covariance data A case study from a subarctic mire ecosystem

Horizontal heterogeneity causes di ﬃ culties in the eddy covariance technique for measuring surface ﬂ uxes, re- lated to both advection and the confounding of temporal and spatial variability. Our aim here was to address this problem, using statistical modelling and footprint analysis, applied to a case study of ﬂ uxes of sensible heat and methane in a subarctic mire. We applied a new method to infer the spatial heterogeneity in ﬂ uxes of sensible heat and methane from a subarctic ecosystem in northern Sweden, where there were clear di ﬀ erences in surface types within the landscape. We inferred the ﬂ ux from each of these surface types, using a Bayesian approach to estimate the parameters of a hierarchical model which includes coe ﬃ cients for the di ﬀ erent surface types. The approach is based on the variation in the ﬂ ux observed at a single eddy covariance tower as the footprint changes over time. The method has applications wherever spatial heterogeneity is a concern in the interpretation of eddy covariance ﬂ uxes.


Introduction
Eddy covariance has become a widely used technique for measuring surface-atmosphere exchange of energy and scalars (Baldocchi et al., 2001). The technique has advantages in that it integrates over a relatively large area, and can run near-continuously over long time spans. It can thereby provide the best available data with which to estimate the magnitude and controls on ecosystem-scale trace gas and energy fluxes. An implicit assumption in the method is horizontal homogeneity in the wind field and surface flux (Aubinet et al., 2012;Kaimal and Finnigan, 1994). Horizontal heterogeneity in the wind field is associated with problems of flux divergence, and the misinterpretation of advective fluxes as part of the vertical eddy flux (Aubinet et al., 2002;Finnigan et al., 2003).
Here, we focus on horizontal heterogeneity in the flux from the surface, which causes problems for the eddy covariance method by causing advection (horizontal flux), and difficulties of interpretation; the measured flux will vary in time, depending on which parts of the surface contribute, and this is not straightforward to account for. In other words, temporal variability is confounded with spatial variability. Our aim here was to address this issue using statistical modelling and footprint analysis, applied to a case study of methane fluxes in a subarctic mire.
Temporal variability in surface fluxes occurs at diurnal and seasonal scales because of patterns in driving variables such as incoming radiation, temperature and leaf area, and also at intermediate time scales with synoptic weather patterns. Horizontal heterogeneity in surface fluxes can occur at a rangle of spatial scales, depending on the ecosystem type. For example, single-tree gaps and larger clearings occur in forests, and CO 2 fluxes will be different here because of the lower leaf area. In mires, the vegetation may be a mosaic of hummocks and hollows at a small scale (tens of cm), and made up of patches of different vegetation communities at a larger scale because of variations in soil parent material, topography and drainage. Spatial variability in arable fields and managed grasslands may be driven by gradients in fertility, but extreme hotspots of N 2 O productions have been also reported (Cowan et al., 2015).
In previous work, the simplest and most common approach to dealing with spatial heterogeneity has been to define sectors in the wind rose based on prior knowledge, and treat these as independent sub-sets of the data. For example, Jones et al. (2016) located an eddy covariance tower on a boundary between two fields, and allocated observations to data sets for either the north field or the south field, depending on the mean wind vector for each half-hour. In a case study which we examine further here, Jammet et al. (2017) used an eddy covariance tower located at the edge of a lake in a subarctic mire ecosystem (Fig. 1), and allocated observations to data sets for the lake or mire, depending on the mean wind vector for each half-hour.
Rather than simply considering the azimuth angle of the mean wind vector, a more advanced step is to consider the way in which eddy covariance measurements are influenced by the so-called "flux footprint" (Leclerc and Foken, 2014;Schmid and Oke, 1990;Schuepp et al., 1990). The footprint defines the relative contribution of each element of the surface area to the measured vertical flux, according to the advection-diffusion equation. Models of the footprint have used analytical approximations to the advection/diffusion equation (based on K-theory and power-law approximations of the wind and diffusivity profles) (Horst and Weil, 1992), stochastic Lagrangian approaches (Baldocchi, 1997;Hsieh and Katul, 2009), or large-eddy simulation (Leclerc et al., 1997). For computational reasons, the first of these is the most common approach.
The footprint acts as a weighting function, such that some areas contribute strongly to the measured flux, and others not at all (Fig. 1). In mathematical terms, the mean flux F of a scalar over a landscape represented by a discretised gridded domain with dimensions n x by n y at time t is given by: where the overbar denotes spatial averaging. Eddy covariance effectively measures a weighted mean, where the footprint provides the set of weights, ϕ, to give: where we use the hat -symbol to indicate that this is an estimator, not F t itself. The weights = ϕ ϕ { } xy sum to unity at each time interval. We typically have a large number of measurements over time (n t , potentially all the half-hourly flux measurements over a number of years). If the number of samples is large, the magnitude of variability is small relative to the mean, and the footprint coverage approximates the domain of interest, we can assume 〈 〉 F is a reasonable approximation of the true time-averaged mean flux for the whole domain, i.e. 〈 〉 ≈ 〈 〉 F F¯, where angled brackets denote time-average means. In many circumstances, this may not be the case. Our goal here is to find the appropriate balance in identifying and accounting for spatial heterogeneity as far as possible, whilst keeping the model reasonably simple and identifiable.
Fluxes measured by eddy covariance are typically analysed as a time series, where the temporal variation in F is related to variation in independent temporal covariates X, such as solar radiation, air temperature and soil moisture, measured at a single point near the flux tower. Ignoring footprint sampling issues, this can be approached using a multiple regression model with a vector of coefficients β and a residual error term ϵ: In the presence of spatial heterogeneity in F around the tower, this approach is incomplete, and additional terms should be considered in the analysis. We do this here by adding terms for the different surface types to Eq. (3), and by adding terms for the residual spatial variation. We use the approach variously called "hierarchical", "mixed-effects" or showing the eddy covariance tower (orange triangle) on the border of the lake, and the locations of flux chamber measurements (blue circles). Isopleths of the flux footprint averaged over the threeyear data set are shown, showing the predominant easterly and westerly wind directions. Gridded lines sub-divide the domain around the tower into a number of regions in which we estimate spatial variability of the surface flux. Orange lines denote the four cardinal quadrants (n q = 4); Grey lines subdivide these in to further quadrants (n q × n s = 16). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.) "multi-level" modelling in the frequentist (Bates et al., 2015;Laird and Ware, 1982;Pinheiro and Bates, 2006) or Bayesian paradigms (Goodrich et al., 2018). This is a development of linear regression, where the effects of a grouping structure in the data are explicitly represented and accounted for. There is now a large literature on the application of hierachical modelling to agricultural and ecological science, and we refer the reader to this (Bolker et al., 2009;Peek et al., 2002;Zuur et al., 2009). Models that ignore (spatial) grouping structures in the data tend to underfit or overfit (Gelman et al., 2013). Hierachical modeling allows parameters to vary by group at lower levels of the hierarchy while estimating common parameters at higher levels, often referred to as "borrowing strength".
In the case of the landscape around a flux tower, we can often classify groups of discrete surface types forming a heterogeneous matrix, = I I { } xy .
In the examples cited above, the surface types could simply be north field/south field in Jones et al. (2016), or water or land in Jammet et al. (2017) (Fig. 2), In our subarctic mire example, the surface types could be divided into the more detailed set i = {water, hummock, semi-wet, wet, graminoid} (Fig. 2). Additionally, we can consider the domain around the tower as sub-divided into a number of spatial regions (e.g. the squares in Fig. 1), where the local parameters will differ from the domain average, after accounting for the distribution of known surface types. This represents spatial variability which is not accounted for by any other measured covariate, and constitutes residual variation at a lower level in the hierarchy. Additionally, we can also represent the spatial relationship between grid cells as a hierarchical structure, with each cardinal quadrant q further subdivided into nested sub-regions s, as a means to represent spatial autocorrelation (i.e. the fact that neighouring grid cells tend to be similar).
We combine these terms in a model of the general form described by Bates et al. (2015), where the time series of fluxes from surface type i in spatial region s within region q is estimated by: where β i is the vector of coefficients used to predict the flux from surface type i with the temporal covariates X; Z q is the so-called "design matrix" or a subset of the temporal covariates with responses that vary among spatial regions; and b q is the vector of the local deviations in their coefficients in surface sub-type, q. Similarily, Z s is a subset of the temporal covariates with responses that vary among the nore finely subdivided spatial regions, and b s is the vector of the local deviations in their coefficients in spatial region, s. That is, we fit a model which describes the response of the flux to independent variables in different surface types ("fixed-effects" in the mixed-effect modelling parlance), but allow the coefficients to be consistently higher or lower in each spatial region through the ("random effect") terms b. The spatial deviations b are assumed to be independently drawn from normal distributions with mean zero and variance Ψ q and Ψ s . The model is hierarchical in the sense that the b q parameters are drawn from a parent distribution which is determined by the fitted β parameters, and the b s parameters are drawn from a parent distribution which is determined by the fitted b q parameters. The b terms are treated as random variables, where we are more concerned with the variance at this level, rather than the specific values for the groups we sampled. We refer the reader to textbooks on the subject for further discussion of the "mixed-effect" modelling approach (Pinheiro and Bates, 2006;Zuur et al., 2009). To estimate the contribution from each of these to the measured flux, we require the footprint weighting for each group at time t, given by: where the square brackets are Iverson notation, evaluating to 1 where true and zero otherwise. We can now add in these terms and estimate the measured flux at time t to be: The model fitted to observations of F t can be used to make explicit predictions of F t over a wider domain, if the area covered by each surface type is available (and any other covariates). In this case, the spatial terms, being random effects with mean zero, drop out to leave our global model for the predicted flux as: where A i denotes the area occupied by surface type i and l 2 is the area of a single grid square. In this way, we remove the artefacts from the data induced by the footprint sampling characteristics and local spatial variability to infer our best estimate of mean flux F t over a wider domain.
The parameters of mixed-effect models are typically estimated by maximum likelihood estimation. Here we use Bayesian methods instead, which are increasingly being used for ecological data analysis (Ogle, 2009;Van Oijen et al., 2017). This has the advantages that it provides a robust means of estimating uncertainty on the parameters and predictions, and it allows us to combine data from flux chambers as informative priors. Bayes Theorem relates the model parameters to the data as: which equates the posterior probability of the parameter set θ given the observed data D to the prior probability of the parameters P(θ), the likelihood of the data given the parameters, and the marginal probability of observing the data. In our case, the parameter vector θ consists of the β and b values, and the eddy covariance flux meaurements provide the observed data D. The prior probability of the parameters P (θ) comes from fitting the model to flux chamber data collected ast the same site. Eq. (8) cannopt usually be solved analytically, but is generally estimated by Markov chain Monte Carlo (MCMC) sampling, an iterative approach for calculating numerical approximations of multidimensional integrals. Many MCMC algorithms are available, and the mechanics of performing Bayesian statistical analysis are described in several textbooks (e.g. Gelman et al., 2013). Here we use the Gibbs sampling algorithm, which provides a computationally efficient means of estimating the posterior distribution of the parameters (Plummer, 2016).

Case study: methane fluxes in a subarctic wetland
We apply the approach outlined above to a case study of fluxes in a subarctic wetland (Jammet et al., 2017). Methane (CH 4 ) is an important greenhouse gas, and has other damaging effects related to stratospheric ozone, aerosols and water vapour (Myhre et al., 2013). The recent increase in global atmospheric CH 4 concentration has been attributed to a rise in emissions from natural wetlands, as well as fossil fuel use (Kirschke et al., 2013). Boreal and arctic wetlands comprise around 50% of the global wetland area (Lehner and Döll, 2004), and have experienced a temperature increase of almost double the global rate (IPCC, 2013). If this has the effect of promoting arctic CH 4 emissions, there is the potential for a positive feedback on global warming. The magnitude and controls on arctic CH 4 fluxes remain highly uncertain (Kirschke et al., 2013), largely due to the heterogeneous nature of the ecosystems. Based on field studies using static chamber methods, CH 4 fluxes are known to be influenced by factors such as soil temperature, water table height, soil moisture, active layer thaw depth and vegetation type (Davidson et al., 2016;Hommeltenberg et al., 2014;Levy et al., 2012;Sturtevant et al., 2012;Zona et al., 2009). We need to understand these responses if we are to correctly predict the effects of future climate change. However, extrapolating to larger scales from these point measurements is difficult, given the small scale of the measurements and the extent of the spatial heterogeneity . Using the modelling approach described above applied to a subarctic ecosystem where there are clear differences in surface characteristics within the landscape, we demonstrate the inference of spatial heterogeneity in CH 4 flux. Our aims here were to: • quantify and account for the spatial heterogeneity in CH 4 flux around the tower • combine information from flux chamber and eddy covariance methods in a coherent way • describe the response of CH 4 fluxes to environmental variables, accounting for spatial variability.
With this case study, we aimed to show that the approach aids the interpretation of eddy covariance measurements, and maximises the information content which can be retrieved from the data.

Site description
Stordalen mire is a subarctic peatland located near Abisko, Northern Sweden (68 ∘ 20' N, 19 ∘ 03' E). At a broad level, the landscape is heterogeneouis, comprising freshwater lakes, smaller pools, and non-inundated bog with a water table below the surface (Fig. 1). At a finer level, the terrestrial vegetation types can be classified into: (i) moss hummocks containing shrubs, (ii) semiwet (ombro-minerotrophic) bog with a water table below the surface, (iii) wet (ombrotrophic) bog pools with a water table at or near the surface, and (iv) graminoid-dominated bog with a water table above the peat surface (Johansson et al., 2006;Malmer et al., 2005). The distribution of these vegetation types has been mapped using remote sensing (Giljum, 2014) (Fig. 2).

Chamber flux measurements
Static chamber measurements of CH 4 flux were made in June, August and September 2013. 46 collars were located across the mire, clustered in five locations accessible from boardwalk. Circular PVC collars (40 cm diameter), fitted with a flange, were inserted to a depth of 5 cm into the ground prior to the start of measurements and remained in the ground for the duration of the study. Cylindrical PVC chambers (40 cm diameter, 20 cm height) with a matching flange, were sealed on the collars for the duration of each flux measurement (45 min). Each chamber lid had a vent (pressure compensation plug; Mouser Electronics, London, UK) to compensate for pressure changes whilst minimising diffusion losses (Xu et al., 2006). During each flux measurement, gas samples were extracted from the chamber headspace at 5, 15, 30 and 45 min via a polypropylene syringe and a three-way stopcock. Samples were transferred to 20-ml vials fitted with chlorobutyl rubber septa. The samples and three sets of four certified standard concentrations were analysed on a gas chromatograph (HP5890 Series II, Hewlett Packard, Agilent Technologies, Stockport, UK) with a flame ionization detector. The limit of detection was 0.07 ppm CH 4 . Peak integration was carried out with Clarity chromatography software (DataApex, Prague, Czech Republic). The flux was calculated from the change in mixing ratio within the chamber headspace against time: where F is the gas flux in − − nmol m s , 2 1 dC/dt 0 is the initial rate of change in mixing ratio with time in − − nmol mol s , 1 1 ρ is the density of air in − mol m , 3 V is the volume of the chamber in m 3 and A is the ground area enclosed by the chamber in m 2 . dC/dt 0 was estimated by linear and nonlinear regression as described in Levy et al. (2011).

Eddy covariance flux measurements
Fluxes of CH 4 , CO 2 , H 2 O and sensible heat were monitored nearly continuously with an eddy covariance system located at the shore of Villasjon (Fig. 1), as described in full by Jammet et al. (2017). The system comprised a 3-D sonic anemometer (R3-50, Gill Instruments Ltd.), an open-path infrared gas analyser (IRGA) for CO 2 and H 2 O (LI7500, LICOR Environment, NE, USA) and a closed-path analyser for CH 4 (FGGA, Los Gatos Research, CA, USA), logged on a data logger (CR1000, Campbell Scientific, Inc., UT, USA). The sonic and IRGA were mounted on a mast at 2.5 m, with a sample inlet line to the FGGA adjacent to these. Fluxes were calculated using the EddyPro version 5.2 open-source software (hosted by LICOR Environment, USA). Processing of the 10-Hz raw data followed standard eddy covariance procedures (Aubinet et al., 2012;Lee et al., 2006). Fluxes were rejected for some time periods on the basis of quality checks, including the number of spikes, skewness and kurtosis, stationarity, turbulence (low u* values) and implausible time lags (Papale et al., 2006). Although Jammet et al. (2017) imputed missing values, only observed fluxes were analysed here. For every time period with a valid flux measurement, we calculated the flux footprint ϕ over a 200 × 200 m domain around the tower at 2-m resolution. For simplicity, we use the commonly-used simple analytical model of Kormann and Meixner (2001) to calculate ϕ.

Statistical analysis
We consider two examples to demonstrate the application of the approach to the Stordalen flux data. Firstly, we analyse the response of sensible heat to incoming solar radiation. In this case, we have a clearly defined spatial pattern, which we know in advance, because land and water surfaces behave quite differently for clear physical reasons. Without specifying anything about the two surface types, we fit Eq. (10), a simple model for sensible heat flux H with a global intercept β and a slope term b relating solar radiation G to H for each spatial quadrant q (i.e. a single temporal covariate = G Z ). This gives a prediction for the measured flux as: We use this to test the approach to detecting spatial heterogeneity, where we have a good knowledge of the true pattern a priori.
Secondly, we analyse the more complex case of methane flux, where we define five different surface types (i = {water, hummock, semi-wet, wet, graminoid}), all with potentially different responses of CH 4 flux to multiple temporal covariates. In both cases, we can sub-divide the domain, with increasing granularity, into 4 or 16 spatial regions (the cardinal quadrants centred on the tower, and further repeated subdivisions of these). We explored several terms in models of the form of Eq. (6). Eq. (11) shows an example of a near-optimal form, as measured by AIC, written out more fully: where T wt is the lake water temperature at time t, T lt is the land soil temperature, u t is the windspeed, G t is the total solar radiation, and b q and b qs are intercepts for each spatial region s within each quadrant q. Applying Eq. (5) provides the footprint weightings for each defined surface type and each spatial region. Equation 11 gives the prediction for what we expect to observe by eddy covariance at time t, given a calculated footprint ϕ, a known distribution of surface types, a predefined matrix of spatial regions, and the parameter vectors for fixed and random effects, β and b. The likelihood of a series of n t observations of F t arising from normal distributions with means determined by Eq. (6) is: where σ t obs is the uncertainty in the observed (eddy covariance) flux at each time point. The posterior distribution for β, b and the predicted flux were estimated using Gibbs sampling MCMC algorithm implemented in the R package rjags (Plummer, 2016). Initial fits were obtained by the restricted maximum likelihood method using the 4 lme package in R (Bates et al., 2015).
We based the choice of independent variables to use in modelling the CH 4 flux on the earlier analysis of (Jammet et al., 2017). Although many of the variables are strongly correlated, this suggested that CH 4 fluxes from the lake surface were controlled by water temperature (as a control on biological production) and turbulence (as a physical mechanism driving the flux at the water surface). CH 4 fluxes from the land surface were controlled by soil temperature, with additional effects of solar radiation and humidity (as drivers of stomatal opening and transport through plant pathways). The chamber flux data were used to establish the prior distribution for model parameters. The chambers were allocated to the same set of surface types, and the same model was fitted to the chamber flux data for each of these groups, in the same way as for the eddy covariance data. The posterior distribtion of parameters established in this way was then used as the prior distribution in the analysis of the eddy covariance data. The uncertainty in each halfhourly flux observation was estimated using the method of Finkelstein and Sims (2001). The effect of including different terms in the model fitted, and the granularity of the grid on which spatial heterogeneity was resolved, was assessed using common model selection criteria: AIC, BIC, and DIC. . 3 a shows the basis of our analysis of sensible heat fluxes to demonstrate the method. On land, a relatively constant fraction of incoming solar radiation, G, is converted to sensible heat, producing a clear relationship in the green data points. At night, the land cools via sensible heat loss, so the intercept is negative. Over water, most of the incoming radiation is converted to latent heat (i.e. evaporation), and there is no strong dependence of H on G in the dark blue points, meaning the slope is close to zero over water. Without using any a priori knowledge of the surface type classification or its spatial distribution, we can use Eq. (10) to estimate the slope b q (relating G to H) in the four cardinal quadrants around the flux tower. In this simplest case, we use a single intercept β and a single temporal covariate = G Z . With this, we fit the Ψ q parameter governing inter-quadrant variability and the slope term b q for each quadrant, and with the Bayesian methodology, we estimate their posterior distributions (Fig. 4). This shows that b q is higher in the all-land quadrants (q 1 and q 3 ), lowest in the all-water quadrant (q 2 ), and intermediate in the mixed quadrant (q 4 ). The model thus correctly retrieves the spatial pattern that we know to be present in the data. The posterior distributions are narrow, meaning we have high certainty about these parameter values.

Fig
We repeat a similar analysis for methane fluxes, where the underlying model is much less well-known, although a reasonably clear temperature response provides a similar basis (Figs. 3b). The chamber data show that there are some distinct differences in methane fluxes between the different surface types (Fig. 5). The semi-wet class has a much lower mean than the graminoid and hummock classes, but these latter appear similar. The data are rather variable, and a substantial part of this is point-to-point variability, characterised by location but P. Levy, et al. Agricultural and Forest Meteorology 280 (2020) 107783 not related to any measured covariate. There were no chamber measurements in the wet or water surface types. When the footprint was on the land, ecosystem-scale fluxes were similar in magnitude to the mean of fluxes measured in the chambers at a corresponding temperature (of the order of 100 nmol m −2 s −1 at 10°C , Fig. 3b). Methane fluxes increase with biological production in the peat, producing a clear relationship in the green data points, described by the slope. The lake temperature is not strongly coupled to soil temperature, and lake fluxes are also influenced by physical turbulence (Jammet et al., 2017), so there is no strong dependence of F CH4 on soil temperature in the dark blue points, meaning the slope is close to zero.
We explored terms to include in the model, based on temporal covariates identified by Jammet et al. (2017), using information-theoretic criteria. A model corresponding to Eq. (11) was close to optimal by these measures (comprising β terms for the effects of lake water temperature and windspeed on fluxes from water, with separate β terms for the effects of soil temperature and global radiation on fluxes from the different land surface types, and with b qs intercept terms). Many model variants could be fitted, and a single optimal model was not obvious. In this circumstance, some form of model averaging may be appropriate, weighting variants according to their likelihood (Yao et al., 2018). Fig. 6 shows the maximum a posteriori predictions of methane flux from the optimal model over the spatial domain. This shows the expected broad spatial pattern, with lower fluxes from the lake surface than the land average. Within the land vegetation sub-classes, the model estimates the highest fluxes from the hummock class, the lowest fluxes in the wet type, with semi-wet and graminoids being intermediate. This partially matches expectations from the chamber data, in that hummocks show higher fluxes than the semi-wet class, but fluxes On land, a relatively constant fraction of incoming radiation is converted to sensible heat, producing a clear relationship in the green data points. At night, the land cools via sensible heat loss, so the intercept is negative. On water, most of the incoming radiation is converted to latent heat (i.e. evaporation), and there is no strong dependence of H on G in the dark blue points, meaning the slope is close to zero. (b) Relationship between methane flux and soil temperature, as influenced by the proportion of surface water in the flux footprint ϕ water at the Stordalen flux tower. On land, methane fluxes increase with biological production in the peat, producing a clear relationship in the green data points. The lake temperature is not strongly to soil temperature, and lake fluxes are also influenced by physical turbulence, so there is no strong dependence of F CH4 on soil temperature in the dark blue points, meaning the slope is close to zero. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.) from the graminoids were expected to be higher. However, with the scatter and the lognormal distribution of the chamber data, the prior distributions are relatively wide and not strongly informative. Given the number of eddy covariance observations, it is expected that these dominate the likelihood calculation and dictate these rankings. There is an issue, which needs further exploration, of how to correctly weight samples from different measurement techniques, which have different sampling properties, such that the weightings relect their true information content . The rankings are also dependent on temperature, as different temperature sensitivities are fitted for each class.
We investigated the influence of the prior by analysing two variations: using the model fitted to the chamber data as the prior, or using "weakly-informative" priors with mean zero and wide standard deviations. For the wet and water surface types, there was no chamber data, so default weak priors were used here. Distributions of the β temperature coefficient for the five surface types are shown in Fig. 7. Generally, the red and green curves are not very different, so the prior does not have a very strong effect on the results. The posterior moves substantially away from the prior in the case of the hummock, and graminoid surface types. The priors for these are strongly influenced by a small number of high fluxes in the chamber data. These are plausible for individual point flux measurements, which often show a lognormal distribution, but not plausible for the ecosystem-scale mean which eddy covariance measures. The difference is thus partly due to the difference in measurement scales, and could be accounted for explicitly in future work (e.g. see Levy et al., 2017).
The uncertainty in a given parameter estimate is characterised by the spread of its posterior distribution, and this can be summarised as a credibility interval or standard deviation, σ, or its reciprocal the precision (1/σ). To examine the level of uncertainty in our inference of spatial heterogeneity in residual methane flux, Fig. 8 shows the plot of the precision of our estimates of the spatial random intercept term, b s (from the model fitted as decribed above, with = n 2 i and = n 16 s ). This gives a quantitative representation of our intuitive expectation, that we have higher certainty in flux estimates from regions near the flux tower compared with those at a distance. It also quantifies the effect of prevailing wind patterns on the extent to which different regions are sampled by the footprint. In addition, this incorporates the differing degrees of variability in fluxes from different regions, and the extent to which these are explained by temporal or spatial covariates.

Discussion
The method described here provides a means of accounting for spatial heterogeneity in the interpretation of eddy covariance data. The approach allows for the inclusion of terms related to: (i) temporal variability, (ii) the expected pattern of spatial variability, as characterised by maps of surface classification and surface properties, and (iii) latent spatial variability, unrelated to known explanatory variables. In the methane case study, and more generally, it allows us to describe the source (or sink) strength of the defined surface-type or land-cover classes. By using Bayesian methods, we can incorporate prior knowledge, such as flux data derived from chambers, and characterise the uncertainty in parameters and fluxes as posterior density distributions. The example using sensible heat flux, where we know what the true parameter values should be a priori, shows that the method works well. This gives us confidence that the results for methane flux, and more generally, should be reliable. Mathematically, the method is similar to the "inverse" methods applied at much larger scales, to infer surface fluxes from variations in the atmospheric mixing ratio observed on tall towers and by satellites. (e.g. Ganesan et al., 2015, The method has applications wherever spatial heterogeneity is a concern in the interpretation of eddy covariance fluxes. Examples  include sites where vegetation shows distinct patterning (Schmid and Lloyd, 1999), where hotspots of N 2 O production are known to occur in grazed grassland systems (Cowan et al., 2015), or where features such as riparian zones are suspected of influencing the measured flux (Mc Namara et al., 2008). Other authors have used footprint-weighting previously to upscale from chamber measurements to the ecosystem scale measured by eddy covariance, using a map of surface-type classes (Budishchev et al., 2014;Hartley et al., 2015). This provides a way of comparing chamber observations with eddy covariance observations, accounting for footprint variability in the latter. Our approach goes beyond this, in combining chamber and eddy covariance data to infer the properties of the surface-type classes, with associated uncertainties. Matthes et al. (2014) applied a hierarchical mixed-effects model to methane flux data from three eddy covariance towers in a wetland in California, but in this case the grouping referred to the three towers, rather than attempting to infer parameters of the different vegetation types present.
Various degrees of complexity in the statistical model represented by Eq. (6) can be envisaged. Although we only use discrete spatial covariates (i.e. surface-type classes) in these examples, the method is equally applicable to continuous spatial covariates where maps of these are available (e.g. topographic variables, leaf area index, or remotelysensed indices). We could thus include terms for temporal covariates, both continuous (e.g. temperature) and discrete (e.g. seasons), and spatial covariates, both continuous (e.g. NDVI) and discrete (e.g. surface-type classes). One improvement would be to model spatial autocorrelation using a geostatistical approach, so that spatial groupings are not merely discrete classes, but the latent spatial variability is modelled explicitly as a continuous function of x and y. We currently only consider additive terms, but some terms may act in a multiplicative way. For example, absorbed radiation is more related to the product of LAI and incoming radiation.
There are some shortcomings in the approach as presented here. No uncertainty is accounted for in the footprint model, whereas we know that it is imperfect. This could be included in further development, but requires a methodology for quantifying footprint uncertainty. At the expense of computation time, more sophisticated footprint models could be used which deal with, for example, a wider range of stability conditions (Kljun et al., 2015).
More fundamentally, the eddy covariance method assumes that spatial heterogeneity in the fetch is small enough such that all the advective and horizontal eddy flux divergence terms in the mass balance equation are zero, to leave: In the presence of spatial heterogeneity, the advective and divergence terms will not be zero, but in practice are very hard to measure, and are generally ignored. When applying eddy covariance in imperfect conditions, we can justify the approximation of Eq. (13) if we assume that one or more of the following are true: • the magnitude of spatial heterogeneity in F is neglibly small, relative to the domain mean F ; • the spatial structure of heterogeneity is at a small scale, so that the large-scale gradient across the domain ∂ ∂ c x/ (and thereby advection) is small; • the advective and divergence terms do not show strong diurnal patterns or correlations with other temporal covariates, so the errors produced are random rather than systematic, and do not bias estimates of parameters or long-term mean fluxes.
In the extreme case of a flux tower at a land/lake boundary, none of these conditions may be well met and the characterisation of advection errors when applying eddy covariance in heterogeneous environments remains a challenge for the micrometeorological research community (Rebmann et al., 2005). The validity of these assumptions and the role of advection in causing systematic errors in eddy covariance measurements is still an on-going debate (e.g. Griebel et al., 2016). Most work on advection has focussed on heterogeneity in the wind field and the problems of separating out the mean advective flux from the turbulent eddy flux (via Reynolds decomposition) in an appropriate coordinate system, such that advective terms do not contaminate the vertical eddy flux when w is non-zero (Finnigan, 2004;Finnigan et al., 2003;Lee, 1998;Lee et al., 2006). Relatively little work has tackled the role of heterogeneity in the surface flux itself on advective errors. The method outlined here provides a possible approach for attempting this, in that spatially explicit predictions of surface flux in the domain around a tower provide a starting point for formulating our expectation of the horizontal advective term. Fig. 8. Satellite image of the field site showing the eddy covariance tower (orange triangle) and isopleths of the flux footprint averaged over the three-year data set (grey lines). The raster colour scale superimposed on this shows the precision (= 1/σ) of posterior estimates of the spatial random intercept, b s in 16 spatial regions around the flux tower, estimated without any a priori knowledge of the surface type classification or its spatial distribution.