1 Introduction

A recurring problem within climatology and meteorology is the optimization of interpolation techniques to generate maps of meteorological and climatic parameters using point measurements from climatic stations. A variety of interpolation techniques, ranging from splines, regression and kriging to neural networks and machine learning techniques have been suggested and evaluated (Antonić et al. 2001; Jarvis and Stuart 2001; Boer et al. 2001; Hartkamp et al. 1999). Most of these techniques perform better if auxiliary gridded maps, i.e. information on topography (Digital Elevation Model), exposition, distance from the coast-line and similar, are used for interpolation. Jeffrey et al. (2001), for example, produced an archive of climatic variables for Australia using thin plate splines and elevation as auxiliary predictor. Hijmans et al. (2005) use latitude, longitude, and elevation as auxiliary variables to produce global maps of a variety of climatic variables by thin plate smoothing splines: monthly total precipitation, monthly mean, minimum and maximum temperature, and 19 derived bioclimatic variables. Extending this idea, the PRISM group of the Oregon State University used weighted regression with an extended list of predictors—elevation, coastal proximity, vertical layer, topographic facet, effective terrain weights etc.—to produce (retrospective) daily precipitation and temperature data sets for the conterminous USA (Daly et al. 2008).

Although techniques such as smoothing splines have been well accepted by the climatic modeling community, they may be criticized by geostatisticians for allowing users to set the model parameters subjectively. Also, the quantification of interpolation error is less sophisticated than with geostatistical interpolation techniques (Boer et al. 2001). On the other hand, splines are more efficient in dealing with noisy (weather station) data as they do not need to go through the measured values. Unlike the kriging methods that often ignore measurement error and force predictions to go through actual measured values.

In recent years, there has been a shift in the discipline to improve two important aspects of climatic mapping: (1) statistical robustness—by using uni- or multivariate (dynamic) spatio-temporal prediction models; (2) temporal coverage—by extending the list of auxiliary variables to time series of remote-sensing based meteorological images. A geostatistical interpolation technique known as “Kriging with External Drift” (KED), “Universal kriging” (UK) and “regression-kriging” (RK) is now largely recognized as a flexible and well-performing technique for unbiased estimation of spatially continuous features (fields), and is also increasingly used for interpolation of meteorological and climatic variables (Hudson and Wackernagel 1994; Pebesma 2006; Hengl 2009).

Geostatistical methods used in meteorology are now also increasingly 3-D (2-D plus time) and 4-D (3-D plus time). Gelfand et al. (2005) proposed a Bayesian inference framework for predicting temperature and extended it to multivariate dynamic spatial models. Schuurmans et al. (2007) further propose an automated framework for prediction of rainfall fields using spatio-temporal data and KED. Carrera-Hernández and Gaskin (2007) compare KED and KED in a local window for spatio-temporal interpolation of temperatures and rainfall over the Basin of Mexico. Kebaili Bargaoui and Chebbi (2009) show that 3-D kriging leads to significantly lower prediction errors than classical 2-D kriging. Inclusion of temporal auto-correlation has in general shown to increase information content in the generated maps (Spadavecchia and Williams 2009). The remaining issue is how to combine time series of meteorological images with ground observations.

In this article we present a procedure to interpolate daily mean temperature over a whole year period by using time series of auxiliary predictors. We first build a spatio-temporal regression model using the complete point data and covariate data, then estimate the deterministic part of variation, analyze residuals for spatial and temporal auto-correlation, and finally generate gridded maps by spatio-temporal regression-kriging. Our objective is to promote space-time prediction techniques for operational mapping versus purely spatial methods, and motivate meteorological agencies to utilize publicly available time series of remote sensing products (MODIS) for production of meteorological and climatic maps.

2 Theoretical concepts

Spatial and temporal variation in temperature are governed by physical processes. For example, land surface temperature at some ‘location’ in space and time (\({\bf{s}}_0, t_0 | {\bf{s}} \in \mathbb{S}, \; t \in \mathbb{T}\)) is a function of incoming solar radiation, cooling factor by wind, coastal effects, land cover, temperature inversion and other effects. The temperature patterns differ between day and night time also: during the night temperature patterns are mainly determined by land cover, air humidity and proximity to water bodies and/or soil moisture (van Leeuwen et al. 2011). In urban and industrial areas, temperature is often locally somewhat higher due to heat emissions from industrial activities or heating (see e.g. Cheval and Dumitrescu 2009). Therefore, it can be said that surface temperature is a function of:

$$ T({\bf{s}}_0 ,t_0 ) = f\left \{ \begin{matrix} \hbox{Incoming solar radiation}: q_1 \\ \hbox{Wind/rain cooling ef\/fects}: q_2 \\ \hbox{Coastal ef\/fects}: q_3 \\ \hbox{Temperature Inversions}: q_4 \\ \hbox{Local thermal radiation sources}: q_5 \end{matrix} \right \} $$
(1)

where each fraction can be modeled separately:

$$\begin{array}{rll} q_1 &= & f(\hbox{latitude}, \hbox{longitude}, \hbox{elevation}, \hbox{exposition}, \hbox{date}) \\ q_2 &= & f(\hbox{wind exposition}, \hbox{cold air f\/low}, \hbox{precipitation}, \hbox{clouds}) \\ q_3 &= & f(\hbox{distance from coast line, \hbox{orography}}) \\ q_4 &= & f(\hbox{depth to sink}, \hbox{exposition}, \hbox{land cover}) \\ q_5 &= & f(\hbox{land cover}, \hbox{industrial activities}) \end{array}$$

A physical-deterministic model would in theory do the best job in explaining the spatio-temporal patterns of temperature, but this is in practice far from trivial, for several reasons. Number one, there are many parameters that need to be included in the model. This is not a big problem for factors such as the incoming solar radiation, which can be globally derived as a function of latitude, longitude, elevation, exposition and day of the year, and coastal effects which can be modeled by using the distance from the sea and exposition of terrain. Both models are inexpensive considering that the geomorphology of the Earth is well sampled through e.g. the SRTM mission. On the other hand, factors such as wind, temperature inversions and local radiation are more difficult to represent because their temporal variability is high. It would become even more expensive to collect field data on wind intensity and direction and cloud status, than to measure air temperature over a dense network. Number two, the model in Eq. 1 is fairly complex and model inputs, initial and boundary conditions, are poorly known, so that deterministic modeling becomes close to impossible (at least at the current level of technology).

Historically, temperature, precipitation and other crucial meteorological variables have been observed locally at meteorological stations, and subsequently interpolated over large areas to produce complete maps. Modern meteorology is enriched with remote sensing imagery. A range of meteorological images at different wavelengths (visible, infrared, thermal and microwaves), in passive and active modes (microwave radiometers and precipitation radar), and from low and geostationary orbits are now available to the meteorological monitoring community and are increasingly used to produce direct (and global) estimates of meteorological variables (Baum and Platnick 2006; Prigent 2010). One popular source of remote-sensing based estimates of Land Surface Temperature are the Moderate Resolution Imaging Spectroradiometer (MODIS) Land Surface Temperature (LST) images, derived from the MODIS thermal bands. According to Wan et al. (2004), the accuracy of MODIS LST images is cca ±1°K, which is more than satisfactory considering that MODIS products are available on near-to-daily basis and distributed freely via the Level 1 and Atmosphere Archive and Distribution System (LAADS FTP).

The MODIS LST images can be used to improve spatial prediction of ground measured values. In other words: ground measurements of temperature can be used to calibrate the RS-estimated climatic products through spatio-temporal regression-kriging. The statistical model to predict temperatures at an unobserved locations (s 0, t 0) is thus (Hengl 2009, p.45):

$$ \hat{T}({\bf{s}}_0,t_0) = {\bf{q}}_0 {\bf{\times}} \hat{\bf{\beta}} + {\bf{C}^{-1}} {\bf{\times}} {\bf{c}}_{\bf{0}} {\bf{\times}} \left({\bf{T}} - {\bf{q}} {\bf{\times}} \hat{\bf{\beta}} \right) $$
(2)

where \(\hat{T}\) is the predicted temperature, T is a vector of measured values of the target variable at ground stations (n · m measurements in space and time; \({\bf{s}} \in \left( (x_1, y_1), (x_2, y_2), \ldots , (x_n, y_n) \right)\); t ∈ ( t 1, t 2, ..., t m )), q 0 and q are a vector and matrix of the auxiliary variables at the target and observation locations, C is the covariance matrix of the n · m residuals at sampling locations, c 0 is the vector of covariances between residuals at the observation and target locations, and \(\hat{\bf{\beta}}\) is a vector of regression coefficients.

The prediction system in Eq. 2 basically follows the universal kriging model for spatio-temporal data described in more detail in e.g. Kyriakidis and Journel (1999) and/or Heuvelink and Griffith (2010):

$$T({\bf{s}},t) = m({\bf{s}},t) + \varepsilon ({\bf{s}},t) $$
(3)

where m(s, t) is the deterministic part of the variation (i.e. a linear function of the auxiliary variables), ε(s, t) is the residual for every (s, t). To ease statistical inference it is commonly assumed that the zero-mean residual part is multivariate normally distributed. Note also that the prediction locations are typically nodes of a fine space-time grid.

The regression kriging Eq. 2 is basically the same for purely spatial and for space-time data, hence the extension from space-based geostatistics to space-time seems to be trivial. It is not. There are two large differences between purely spatial and spatio-temporal prediction models. First, the auxiliary variables (i.e. predictors) need to be exhaustively available and thus must be known for all locations and all time steps for which predictions must be made. In other words, the predictors need to be available as a time series of images (or copies of the same values if these are static). Second, estimation of the space-time semivariance,

$$ \gamma ({\bf{s}}_i,t_i; {\bf{s}}_j,t_j) = 0.5 \cdot E \left[ \left( \epsilon ({\bf{s}}_i,t_i) - \varepsilon ({\bf{s}}_j,t_j) \right)^2 \right] $$
(4)

is inherently difficult because the time and space domains do not have similar properties (different scales, different causality principles). Likewise, predicting in time domain is different than predicting in space: predictions for the right side of the time axis are basically forecasts, i.e. extrapolations; using values from future events to explain the past is conceptually awkward (reversed causality) and can lead to artifacts (Snepvangers et al. 2003). All these problems do not exist if we deal with purely spatial models.

If we ignore the causality problem in the time domain, we are still left with a problem of representing the space-time autocorrelation accurately because space and time have different variability patterns. In practice, when dealing with real-world data, space-time variograms are fitted by introducing simplifying statistical assumptions. Basically, two main groups of approaches exist: (a) separable (purely spatial and purely temporal models) and (b) non-separable (space-time) approach (Ma 2005; Huang et al. 2007).

Separable models suffer from unrealistic assumptions and properties. An attractive inseparable model to space-time variogram modeling, also used in this paper, is the so-called “sum-metric” model. This assumes that the residuals (ϵ) consist of three stationary and independent components (Heuvelink and Griffith 2010):

$$ \epsilon ({\bf{s}},t) = \epsilon_s ({\bf{s}}) + \epsilon_t (t) + \epsilon_{s,t} ({\bf{s}}, t) $$
(5)

where ϵ s (s) is a purely spatial process (with constant realizations over time), ϵ t (t) is a purely temporal process, and ϵ s,t (s, t) is a space-time process for which distance in space is made comparable to distance in time by introducing a space-time anisotropy ratio. Thus, the covariance can be represented by (Snepvangers et al. 2003):

$$ C ({\bf{h}},u) = C_s ({\bf{h}}) + C_t (u) + C_{s,t}\left(\sqrt{ {\bf{h}}^2+(\alpha \cdot u)^2}\right) $$
(6)

where C (h, u) is the covariance at distance h in space, and time-distance u, C s (h) + C t (u) allow the presence of zonal anisotropies (different variogram sills in different directions), and \(C_{s,t} (\sqrt{ {\bf{h}}^2+(\alpha \cdot u)^2})\) allows the presence of geometric anisotropy represented with the ratio α. After parameterizing the three covariance functions to a common structure such as the exponential model, the entire sum-metric covariance model consists of ten parameters: three times a nugget, sill and range parameter (C 0, C 1, R) for the marginal temporal, marginal spatial and space-time covariance functions, and the anisotropy ratio (α).

Another methodological difficulty that must be tackled is the problem of missing pixels in the MODIS LST images. MODIS images can contain up to 100% missing pixels in an area, which can be due to clouds and other unfavorable atmospheric conditions (Neteler 2010). In addition, MODIS images, especially the mosaicked MODIS scenes, are known to contain artifacts and noisy features. Hence, a MODIS LST image can be represented as a composite of three components:

$$ {\rm{LST}}_{\rm{MODIS}}({\bf{s}},t) = {\rm{LST}}^*({\bf{s}},t) \cdot \Phi ({\bf{s}},t) + \xi ({\bf{s}},t) $$
(7)

where LST* is the temperature estimated under perfect atmospheric conditions, Φ(s, t) is the masking function (Φ(s, t) ∈ [0, 1]) and ξ is the noise component. This means that, although MODIS LST is potentially an accurate estimator of temperature, the missing pixels (Φ(s, t)) and noise (ξ(s, t)) represent a problem for geostatistical mapping because they will deteriorate the predictive power of the MODIS data.

One way to reduce the noise and artifacts in the MODIS images is to also exploit information in other inexpensive static predictors that are typically available at all locations and contain much less noise, e.g. latitude (LAT), longitude (LON), distance from the coast line (DSEA), Digital Elevation Model (DEM), Topographic Wetness Index (TWI) and incoming solar radiation (INSOL). In order to use these predictors in the regression kriging model we first transform these to uncorrelated principal components:

$$\begin{array}{lll} && \left\{ {\rm{PC}}_1({\bf{s}},t), \ldots , {\rm{PC}}_p({\bf{s}},t) \right\} \\ &&\quad\; = \Psi \big\{ q_{\rm{LAT}}({\bf{s}}), q_{\rm{LON}}({\bf{s}}), q_{\rm{DEM}}({\bf{s}}), q_{\rm{DSEA}}({\bf{s}}), q_{\rm{TWI}}({\bf{s}}),\\ &&\qquad\qquad q_{\rm{INSOL}}({\bf{s}},t), q_{\rm{LST}}({\bf{s}},t) \big\} \end{array}$$
(8)

where Ψ{ } is the principal component transformation function and p is the number of inputs to the principal component analysis.

Principal component transformation helps reducing the noise and artifacts in remote sensing images because impurities and uncorrelated features are typically moved to higher order components (Liszka 2004), which are not used in the regression kriging (i.e. the number of predictors used in regression kriging is smaller than p). By combining principal component analysis with step-wise regression, one can filter out such noisy components from the regression analysis and in this way make better use of the LST* component (Eq. 7). Hence the regression model from Eq. 2 is in fact:

$$\begin{array}{rll} {\bf{T^*}}({\bf{s}},t) &=& b_0 + b_1 \cdot \cos \left( {[t - \phi ] \cdot \frac{2\pi }{{365}}} \right) \\ &&+\, {\bf{\beta}}^{\bf{T}} {\bf{\times}} \left\{ {\rm{PC}}_1({\bf{s}},t), \ldots , {\rm{PC}}_p({\bf{s}},t) \right\} + \varepsilon ({\bf{s}},t) \end{array}$$
(9)

where t is the date (cumulative days), ϕ is the time delay from the coldest day and a trigonometric function is assumed to model seasonal fluctuation of daily temperature. Note that some predictors that define the PC i in Eq. 9 are temporally constant or static (e.g. DEM), while INSOL and MODIS LST are available as time series. The residual term (ε) is assumed to be normally distributed with zero mean (\(E\left\{ \varepsilon({\bf{s}},t) \right\} = 0\)), and is modeled using the sum-metric covariance function explained in Eq. 5.

The remaining issue is the missing pixels in the MODIS LST images which are due to clouds and similar atmospheric disturbances (see further Fig. 4). These missing pixels can be iteratively filtered by interpolating the neighboring values of PCs using some mechanical interpolator such as splines. By combining these two processes—PCA and filling of missing pixels—one can generate predictors that are: (a) uncorrelated and (b) available for the whole area of interest. The complete procedure used in this paper to generate time series of temperature images is explained in the Methods section (see also Fig. 1).

Fig. 1
figure 1

Computational procedure for spatio-temporal prediction of meteorological variables using time series of predictors. The R script and data needed to reproduce the analysis are available from the contact author’s website

3 Methods and materials

3.1 Study area

We use the meteorological data from the Croatian National Meteorological Service to demonstrate the method and assess the accuracy of predictions. Croatia is a relatively small country but it comprises several different climatic regions, which is a result of its specific position near the Adriatic Sea and its fairly diverse topography, ranging from plains in the East, through a hilly central part to the mountains in the West separating the continental from the maritime part of the country. Weather systems originating or crossing over Croatian territory are strongly influenced by this topography, thus the influence that these have on weather and climate is highly dependent on the region (Hiebl et al. 2009; Perčec Tadić 2010). This diversity of climate and terrain at relatively short distances makes this region of special interest for testing more complex climatic models (Antonić et al. 2001).

Some previous results of climatic modeling for a wider (Alpe-Adria) region can be followed in the work of Hiebl et al. (2009). 1 km resolution grids of main climatic parameters for the wider region can also be obtained from the web site of the Central Institute for Meteorology and Geodynamics in Vienna.

3.2 Ground data

The data set contains 57,282 measurements of daily average temperature for the year 2008. The location of the 159 meteorological stations is shown in Fig. 2. Temperature was measured with mercury-in-glass thermometers that are situated inside a wooden shelter, the so called “Stevenson” screen, which allows air circulation while sheltering the sensors from direct sun exposure. The precision of the temperature readings is tenth of °C. The number of observations that can be used for model building is about 5% smaller than the maximum possible number (159 · 365) due to missing observations.

Fig. 2
figure 2

Location of climatic stations in Croatia and static topographic predictors: Digital Elevation Model (DEM, in meters), total annual Incoming solar radiation (INSOL, expressed in Joules), topographically weighted distance from the coast line (DSEA, in km) and Topographic Wetness Index (TWI)

On most climatological stations temperature is measured three times a day, at 7 am, 1 pm and 9 pm. The mean day-time temperature (ΔT = 1 day) is calculated as a weighted average (Hiebl et al. 2009):

$$ \Delta T = \frac{T_{(7 \text{\,am})} + T_{(1 \text{\,pm})} + 2 \cdot T_{(9 \text{\,pm})}}{4} $$
(10)

The spatial distribution of stations in Croatia is not ideal for mapping purposes (Fig. 2). There is a certain under-sampling at higher elevations and in areas with lower population density. For practical reasons, areas of higher population density have been given a priority (Perčec Tadić 2010).

Figure 3 shows temporal dynamics of daily temperatures at three representative meteorological stations. In general the daily temperature varies smoothly over time and differs systematically with elevation. Plots in Fig. 3 also indicate that even daily temperature can differ largely between consecutive days. This variability is primarily caused by the local meteorological conditions, which can differ substantially from day to day.

Fig. 3
figure 3

Temporal dynamics of daily temperatures at three selected meteorological stations in Croatia (for year 2008): island Hvar station (20 m) representing the Mediterranean climate region, station Pleso airport (106 m) representing the continental part, and station Zavižan (1594 m) located in the mountainous part of Croatia. The numbers on the axis represent the cumulative days since 1970-01-01 (Unix epoch)

3.3 Auxiliary gridded predictors

We used five topographic predictors (see also Hiebl et al. 2009) and a time series of MODIS images to aid the interpolation of daily temperature. The static predictors are latitude (LAT), longitude (LON), SRTM Digital Elevation Model (DEM), topographically weighted distance from the coast line (DSEA), and topographic wetness index (TWI), which is also often used to describe cold air accumulation potential. Dynamic predictors are DEM-derived total insolation (INSOL) and MODIS LST images.

The SAGA GIS TWI was first computed using a 100 m SRTM DEM (Conrad 2007), then aggregated to 1 km resolution (Fig. 2). Total incoming solar radiation i.e. insolation (expressed in Joules per grid cell) was derived for each day using the SAGA GIS lightning module. This follows the algorithm developed by Böhner & Trachinow and described in detail in Böhner and Antonić (2008). Inputs to the derivation of total (sum of direct and diffuse) insolation is a DEM, latitude and longitude grids of the study area, the solar constant, and day of year.

We prepared a time series of 46 day–time and night–time 8-day composite LST images (MOD11A2 product bands #1 and #5) that we obtained from the NASA’s FTP server.Footnote 1 The 8-day composite images were created by patching together images from a period of ±4 days, so that the proportion of clouds can be reduced to a minimum. We decided to use the 8-day composite images because the proportion of missing pixels in the daily LST images will often be so high so that such images would be of very limited use for mapping purposes (Neteler 2005, 2010; van Leeuwen et al. 2011). Even these patched images can contain a large proportion of clouds and atmospheric disturbances (up to 100%; see also white patches in Fig. 4). The SRTM DEM and its derivatives are, on the other hand, complete and consistent maps, and can be used to compensate for the areas where MODIS images are of variable quality and contain missing pixels.

Fig. 4
figure 4

Sample of 12 time series of 1–km MODIS Land Surface Temperature (LST in °C) 8-day images. Notice the missing pixels (especially in the winter months), which are due to clouds and other atmospheric disturbances

To prevent the generation of incomplete maps, we filtered the missing pixels in the MODIS LST images first by taking the average of values between two neighboring dates. The remaining empty pixels are filtered using the close gap function in SAGA GIS available via the module grid_tools. This function iteratively filters all missing pixels from its neighbors by using a spline interpolation (Conrad 2007).

3.4 Data processing steps

For analysis of data and spatio-temporal prediction we use the R environment for statistical computing in combination with GIS packages SAGA GIS and GDAL utilities (Bivand et al. 2008; Hengl 2009). First, we import the point data and time series of images to R using the GDAL translation library, next overlay and reorganize the data into a space-time matrix where each grid node is represented as a 3D point with x, y, t coordinates (as illustrated in Fig. 1).

For spatio-temporal prediction we used the spatio-temporal regression-kriging framework as implemented in the gstat package via 3D kriging (Pebesma et al. 2007; Heuvelink and Griffith 2010), i.e. the time dimension is simply modeled as a third dimension (Huerta et al. 2004; Jost et al. 2005). The regression and residual kriging parts are dealt with separately: we first produced predictions for the regression part of Eq. 9 on a fine grid, next extract residuals for all observations and fit a global sum-metric variogram model. The residuals were then interpolated to the same grid and added to the predicted trend.

The sum-metric covariance model (Eq. 6) was fitted by using the optim function available in the stats package. We first determined initial numbers for: (1) the nugget of the marginal temporal variogram, (2) sill of marginal temporal variogram, (3) temporal range parameter, (4) nugget of marginal spatial variogram, (5) sill of marginal spatial variogram, (6) spatial range parameter and (7) total sill, visually by plotting the marginal experimental variograms. Next, the adjusted parameters were determined using the optim function until a satisfactory fit was achieved (see further Fig. 7). The procedure to estimate a sum-metric space-time variogram is explained in more detail in Heuvelink and Griffith (2010).

Precise estimation of the space-time covariance model is important if a search radius is used to speed up kriging, which is common practice and indeed necessary in case of many observations (recall that there are 57,282 observations in this case). Ideally, the algorithm must use only those observations that have the strongest correlation with the variable at the prediction location. This is not only affected by their distance in space and time, but also by the temporal and spatial ranges and sills. By selecting the right space-time covariance model (especially the right geometric anisotropy ratio), we can prevent generating artifacts. For example, in this case point measurements are based on station data—the observations are stacked at top of each other in the space-time cube—which can lead to near-singularity problems if inappropriate variogram parameters are used.

Principal components were extracted using the prcomp method, as implemented in the basic R package stats (Venables and Ripley 2002). This method allows setting of unit variance for diverse predictors (DEM, TWI, DSEA, MODIS LST, Latitude, Longitude). Note that we build a global Principal Component Analysis (PCA) model (Ψ) that is fitted using values of auxiliary predictors at all sampling locations. This global model can then be used to predict PC components for each new time step j, at any given location where auxiliary predictors are available. After fitting a global regression model for all n · m observation pairs (T, t, PC1, PC2, ..., PC p ), spatio-temporal auto-correlation in the residuals can be represented by fitting the variograms as described before. Once all the parameters of the model have been estimated—principal component transformation coefficients, regression coefficients, variogram model parameters—the model in Eq. 9 is used to make predictions for any location in the space-time cube (see also Fig. 1).

3D kriging can be implemented in gstat both in the Kriging with External Drift (KED) or in the regression-kriging (RK) version via the krige or krige0 function that allows parsing of any type of covariance modelFootnote 2 (Bivand et al. 2008; Pebesma 2010). In ideal situations both KED and RK algorithms should produce the same results, but there can be differences in processing speed and modeling possibilities. Regression kriging implies that the trend and residual part of the model are fitted and estimated separately. The maps of interpolated residuals produced using ordinary kriging and predictions of the linear model are at the end added together to produce final predictions. In the case of Kriging with External Drift (or universal kriging), both regression and kriging are implemented in a single (matrix) operation (Hengl 2009). We use the regression kriging approach for two main reasons: first, gstat accepts only linear trend models and separation of regression and kriging allows more possibilities to include non-linear regression models; and second, by separating regression from kriging the complexity of the analysis is reduced and the processing can be speeded up by limiting the search radius for kriging predictions, while still being able to model the regression part of the model globally.

Most of the data processing steps including import, reorganization of the data, model fitting and predictions in R are explained in detail in Hengl (2009, ch.11) and Pebesma (2010). GIS operations and statistical operations in R can be combined with the help of R packages RSAGA, sp, rgdal and similar, so that the complete process can be put in a single script (this matches the computational procedure in Fig. 1). The script, including the input data—MODIS images, SRTM DEM, distance to coast line map, original point measurements of temperature and precipitation—used in this case study, can be freely downloaded from the contact author’s websiteFootnote 3 and adopted to similar case studies.

A review of spatio-temporal models (dynamic linear state-space models), and some practical suggestions how to analyze such data and fit spatially varying coefficients is given in Banerjee et al. (2004, ch.8), Huang et al. (2007) and Heuvelink and Griffith (2010).

3.5 Accuracy assessment

We compare the results of spatio-temporal prediction versus plain 2D geostatistical prediction—ordinary kriging (space domain only)—using 10-fold cross validation as implemented in the gstat package (Bivand et al. 2008, pp. 221–226). We focus on three measures of accuracy: the mean error (ME), root mean squared error (RMSE), and relative \(\it{RMSE}_r\):

$$ {\it{RMSE}} = \sqrt {\frac{1}{l} \cdot \sum\limits_{j = 1}^l {\left[ {\hat T({\bf{s}}_j ) - T ({\bf{s}}_j )} \right]^2 } } $$
(11)

where l is the number of validation points. In order to see how much of the global variation budget has been explained by the model we use:

$$ {\it RMSE}_r (\mbox{\%}) = \frac{{{\it RMSE}}}{{s_T}} \cdot 100 $$
(12)

where s T is the sampled variation of the target variable, and \({\it RMSE}_r (\mbox{\%})\) is a global estimate of the map accuracy.

The comparison between ordinary kriging and spatio-temporal kriging is done only to quantify the added value of using time series of predictors. We do not evaluate the quality of individual stations.

4 Results

The results of regression modeling show that the predictors explain 86% of the variability in daily temperature values for the year 2008. In fact, just by knowing the date, one can explain about 60% of the variation in the measured temperatures. For a comparison, note that the temperature from the day before explains 81% of the variation. From the auxiliary maps, MODIS LST images are the most significant estimators of the daily temperatures: the correlation plot in Fig. 5 indicates an average precision of ±4°C, and a close to linear relationship. The complete list of predictors achieves a precision of ±3.4°C.

Fig. 5
figure 5

Scatter plots showing the general relationship between daily temperature and MODIS LST images. Line indicates locally fitted polynomial

The results of the principal component analysis shows that the predictors are strongly correlated—for example DEM and TWI, and INSOL and MODIS LST. The existence of significant correlation between the auxiliary predictors justifies the use of a principal component analysis. The step-wise regression, as anticipated, typically filters out the last few components which visually show dominantly noisy features and artificial breaks in values. After the PCA, however, it is more difficult to specify which original predictors are more significant, what is their coefficient and sign etc.

The variogram models of the original temperature measurements can be almost without exception fitted using the linear variogram model (Fig. 6), the average nugget variation is ±1.6°C. This indicates that temperature varies smoothly over large areas, which is probably a reflection of smooth changes in topography. The same is true in the time dimension, however, the nugget variation in the time dimension is about twice as large as in the geographical space. The semivariance at short ‘distances’ in time ranges from 2.4 to 10.1 (Fig. 6, below); average short range standard deviation is about 2.5°C. Notice also that autocorrelation structure is more variable in space than in time domain.

Fig. 6
figure 6

Daily spatial variograms (left) and temporal variogram (right) for the original target variable (temperatures). Lines fitted using an automated fitting procedure in the gstat package

Histograms show that both the target variable (array of daily temperatures for the whole year) and the residuals have close to normal distribution. The marginal spatial and temporal variograms for residuals are shown in Fig. 7. This indicates that temporal auto-correlation exist up to a ‘distance’ of about 12 days. Because much of the variation is explained by the auxiliary predictors, the residual variogram now shows a bounded (stable) sill at 10°C2 in time domain and at 2.6°C2 in space domain. The resulting sum-metric variogram, formatted as a vgm object in the gstat package in R, is:

Fig. 7
figure 7

The marginal experimental variograms for residuals and fitted models: (left) space-domain only, (right) time-domain only

From Fig. 7 it is clear that the majority of variation in the residuals is accounted for by the time component. This means that the regression model has better success in explaining the spatial pattern of the temperatures, while the temporal pattern is poorly captured. The nugget variation (day-to-day variability) is still relatively high, which indicates that it is probably impossible to predict daily temperatures with a precision better than ±2°C (global average). The space-time component of the variogram (Eq. 6) is least significant, which indicates that the regression residual is nearly space-time separable. In other words, if we know at a single station that a specific day is colder than the long-term average, then it is likely that this day will be proportionally colder at all stations and vice versa. This is not surprising because the study area, in climatological terms, is relatively small and the overall meteorological conditions are highly correlated.

The results of 10-fold cross-validation confirm that the prediction model is highly accurate. The spatio-temporal regression-kriging explains 91% of the variation in the original values (daily average); compared to 44% for plain ordinary kriging. The cross-validation accuracy increases first if the DEM map is added to the model (78%), and then grows up to 91% if both MODIS LST images and time dimension are used as input to prediction. The RMSE at 50,309 validation points was 2.4°C, which closely corresponds to the estimated nugget variation. The kriging variance maps for both ordinary and space-time regression-kriging in general correspond to the values estimated by the cross-validation. The average value for the ordinary kriging error for the whole study area was, however, somewhat lower than the actual prediction error (3.6°C versus 4.0°C).

It is important to emphasize that the results of cross-validation are of limited value because the position of meteorological stations is somewhat biased toward populated regions and open land cover types. Visual inspection of the output maps in Fig. 8 indicates that auxiliary predictors are significant in predicting temperatures including the areas with no ground stations (neighboring countries, hill tops and unpopulated areas). It is also evident from Fig. 8 that spatial patterns in the output maps reflect mainly patterns in the MODIS images. Because the model is significant, it can also be used to extrapolate outside the country borders. Nevertheless, projects where various national meteorological data are combined to build global prediction models would probably be more accurate.

Fig. 8
figure 8

Mean daily temperatures for four arbitrary dates predicted using spatio-temporal regression-kriging and actual observed values. Because the prediction model is significant (84% of variability explained by the model), it may be used to map space-time patterns in the neighboring countries

5 Discussion and conclusions

The result of this case study indicate that MODIS time series of LST images can be successfully combined with ground measurements of temperatures to produce more accurate and more detailed predictions of daily temperature. The MODIS images used in this case study almost always contained about 10–30% missing pixels (Fig. 3), so that some filtering steps are highly recommended. In addition, visual inspection showed that these images can be fairly noisy with many strange patterns (e.g. artificial line or polygon features, jumps in values) which are obvious artifacts. The precision of MODIS LST estimates is smaller than obtained in the results of Wan et al. (2004) for the simple reason that we used averaged (8-day) estimates rather than daily estimates of LST. If we had used the daily LST images we would have probably been able to predict the values with a higher precision, but then we would be constrained with meteorological conditions because coverage of the daily MODIS LST is typically <50% of the total area (Neteler 2010).

The advantage of using MODIS LST images, on the other hand, is that they account for small differences in temperature that are due to different land cover, moisture content, and non-orographic effects, which cannot be modeled with constant physical parameters such as elevation, latitude, longitude and distance from the coast line. The results of this case study also confirm that the Zagreb urban heat island is 0.5–2°C warmer than the surrounding countryside, which would be impossible to represent by using only topographic parameters. The resulting patterns in Fig. 8 clearly demonstrate that our proposed combination of PCA and filtering with neighbors leads to consistently ‘cleaner’ predictions i.e. maps that typically contain less artifacts and noise than the original MODIS LST images.

An important issue for the success of the space-time regression kriging is the quality of the parameter fitting techniques. We have used a simple regression kriging model so that both regression and residual components were dealt with separately. More realistic model parameters could have been estimated if we had used some multivariate, Maximum Likelihood-based methods that jointly model the deterministic and stochastic parts of variation and possible non-stationarities (estimation in a moving window). The problem is that the mixed spatio-temporal processes (Gelfand et al. 2005) are still rather experimental, and so is the software. For example, fitting space-time variograms is still cumbersome.

A related issue is how to visually explore space-time variograms. In our case (Fig. 6) we have decided to fit parameters for the space and time domain separately (for visualization purposes), and then build the sum-metric model by adding the different components one by one (which can then be visualized by using the marginal variograms shown in Fig. 7). There are many alternatives to represent the spatio-temporal auto-correlation, these need to be tested on a variety of climatic variables. Computational complexity is also an issue: the methods used in this article are time consuming and can take several hours to generate predictions. All this indicates that there is still opportunity to improve the proposed data processing flow (Fig. 1).

Binding spatio-temporal data is also an open issue (Bivand et al. 2008; Pebesma 2010). Although the work on hybrid space-time classes and methods in R has already started (spacetime and spt packages available via R-forge), it will take time until one will be able to use wrapper functions that fit both space-time regression models and allow visual (3D) exploration of space-time correlograms.

It is also worth emphasizing that the proposed model for daily temperatures (Eq. 9) makes use of a statistical model and can lead to poor predictions depending on the density and coverage of ground observations, and the amount of artifacts in the meteorological images. Hence the final quality of predictions is a product of various factors: noise and quantity of missing pixels in the predictors, quality and density of sampling, accuracy of ground measured parameters and strength of relationship between explanatory and dependent variables. There is also the issue of the day-to-day variability in temperatures that is highly random and can be modeled only up to a certain level (Jarvis and Stuart 2001).

Evolution of spatio-temporal analysis of climatic station data is today largely driven by increasingly powerful analysis tools and increasingly rich remote sensing data—new generation meteorological satellites, but also SRTM DEM-derived topo-climatic variables and similar global layers. Many of these layers are now available even at no cost, which is an additional motive to replace purely geographical interpolation techniques such as splines or ordinary kriging by methods such as proposed here (Pebesma 2006). This method has demonstrated an operational potential to produce land surface temperature maps of higher quality than purely station-based and purely remote sensing products. We envisage that a single global space-time model could be built using the method described in Fig. 1 to produce calibrated maps of daily temperatures. Time series of complete and calibrated LST images could then be aggregated per pixels to depict local regional and global trends and produce more accurate estimates of climatic parameters.

We encourage researchers and/or project team leaders to use freely publicly available time series of atmospheric and surface reflectance images (MODIS, Meteosat, GOES, GMS) to generate detailed daily maps of climatic variables, instead of using purely static predictors. We encourage readers of this article to use free and open source academic software such as R and SAGA GIS to generate climatic maps, share code and model outputs.