Spatial ARMA model for wind speed data

Spatial time series model for wind speed data is proposed. Based on few similar papers, first at each location univariate time series model, containing seasonal component and higher order autoregressive component is fitted. After eliminating time dependence in time series, empirical semivariogram based on time independent residuals is fitted and spatial weights for new location are calculated.


Introduction
In last year's there are a lot of different spatial-temporal modeling techniques suggested. Some authors analyse Wind Speed (WS) data. One of the first studies describing the spatial-temporal structure of WS has been performed by Haslett and Raftery [3] on Irland's winds. The authors base their inference on the model including deseasonalization, kriging, and fractional ARMA modeling with the main purpose of evaluating the average power output to be expected in the long term from a wind turbine at a given site. Yan et al. [5] apply a generalized linear model for modeling daily WS time series in Northwestern Europe. Cripps et al. [2] use a Bayesian hierarchical model to predict the surface wind field 1 day ahead within Sydney Harbour. Ailliot et al. [1] propose a Markov switching AR model for modeling the spatial-temporal evolution of WS. Similar results are described in Šaltytė-Benth and Šaltytė [4]. Modeling technique in this paper is based on the same idea. Data about daily WS data in 20 meteorological stations in Lithuania were analyzed. Lithuanian Hidrometeorological Service has provided data in pdf format starting from 1967 year. For modeling 18 stations where selected, and in each of them seasonal time series model was fitted. After eliminating dependence in time, spatial dependence was analyzed. Spatial dependence in Šaltytė-Benth and Šaltytė [4] was based on spatial exponential correlation function. In this paper spatial connection, based on semivariogram is presented.

Time series modeling
We consider spatial-temporal data to be a realization of a random field where s defines spatial coordinates and t indexes time. As it is usual, we decompose a spatial-temporal model Z(s; t) into a mean component µ(s; t) modeling the trend of the field, and residual component ε(s; t) modeling the variations around the trend in both space and time. Then, the spatial-temporal random field Z(s; t) may be written as Z(s; t) = µ(s; t) + ε(s; t), where µ(s; t) is a deterministic function of the space and time coordinates and is defined by here S(s; t) is a seasonal function of space and time, and φ i (s) are space-dependent parameters of an AR(p s ) process. Next, we decompose the residual field into where σ(s; t) is a non-random function, satisfying the condition σ(s; t) = σ(s; t + 365) for any time t.
We also assume that is a zero-mean stationary Gaussian spatial-temporal random field which is independent in time and has a spatial correlation function. We therefore assume that the cross-correlation in time and space is equal to zero. We also note that the residuals ε(s; t) are uncorrelated but dependent in time and correlated in space. Therefore, the model of the spatial-temporal covariance function for ε(s; t): Where ρ(h s ; θ s ) is spatial correlation function and s (θ t ) is a diagonal variance matrix at the location s ∈ D. First time series model at a single spatial location independently of the spatial information is considered, and then model spatial correlation function on timeindependent residuals. All parameters from time series analysis into a spatial model are combined.
where µ k (t) and ε k (t) denote the mean and the residual process at the moment t = 1, . . . , T at the spatial location s k ∈ D, k = 1, . . . , n respectively. Here with φ k i being parameters of a AR(p k ) process and describing the seasonality of WS in spatial location s k ∈ D. We assume that the residual process ε k (t) is of the following form where σ k (t) is (possibly) time-dependent standard deviation function, and ǫ k (t) is a zero-mean temporally independent Gaussian random process with standard deviation equal to one. The model was estimated on in-sample data from the period 1 January 1977 to 31 December 2006, consisting of 10,950 observations recorded in 18 stations. First seasonal function (9) for WS data at each station separately was estimated. The function with only three parameters was sufficient to remove seasonal fluctuations in data.
In Fig. 1, the ACF and partial ACF (PACF) of residuals obtained after eliminating seasonal effects from data in Klaipėda are plotted. Seasonal variations were removed in the data; however, the residuals still exhibited strong autocorrelations. As indicated by the PACF plot, a higher-order AR process is needed to explain the autocorrelations.
As AR order selection criteria AIC together with Box-Ljung test and histograms of the residuals were used. We started with a low-order AR process and proceeded with estimating the processes with an increasing order. The order was bounded to 10. For most of the stations, AR(3) or AR(4) was sufficient to capture the variations in the residuals. However, in four stations an AR(6) or AR(8) in Nida was needed. An ARMA process with the combination of AR and MA components did not provide a better fit.
The residuals after fitting AR became not autocorrelated according to the Box-Ljung statistics. The autocorrelations of residuals seemed to vary randomly around zero; however, the ACF of the squared residuals revealed time dependency in the variance (Fig. 2).
The residuals were first normalized by dividing them by the empirical variance of residuals.
For model validation, we use 365 out-of-sample WS observations from the year 2007 available in all 18 stations. To validate the temporal model, one-step-ahead predictions were generated for out-of-sample observations and the prediction errors (PE) defined as the differences between the observations and predictions were calculated. The PEs were normally distributed for all but two stations Even though the normality assumption was not verified at the 5% significance level in Šiauliai and Utena, the histograms demonstrated a reasonable symmetry in these stations.

Spatial modeling
In spatial connection method for fitted time series parameters semivariogram is used. Semivariogram is fitted to time independent and normally distributed residuals. As in each location there was 10950 observations (residuals), before fitting empirical semivariogram model, residuals where averaged for each day, resulting in 365 days. Exponential anisotropic semivariogram was fitted: with parameters θ 0 = 0.03; θ 1 = 0.002; θ 2 = 38472.13 and anisotpropy parameter t = 1.15. Time series model for new k-th location is based on spatial weights: there γ(s ij ) is the semivariogram between i-th and j-th locations. Inverse dimension is used, while semivariogram is measure of dissimilarity and inverse dimension let us consider that stronger dependence is between nearest locations.
In the estimated temporal model, there are 12 parameters in each station. The AR(3) or AR(4) process was sufficient to capture the autocorrelations in the deseasonalized data. In four stations, however, a higher-order AR was needed. We therefore assumed that the AR(8) process was fitted in all stations and set unestimated parameters equal to zero.
In Table 1 estimated parameters for two out of sample stations are shown. By using estimated parameters, spatio-temporal predictions were calculated for out of sample data (2007 January 1st), predictions were compared to real data in Palanga and Traku Vokė stations. Prediction in Palanga was 7.9 m/s, while observed WS is 8.8. Prediction in Traku Vokė was 6.1 m/s, while observed WS for this day was 6.6 m/s. Results shows, that spatial-temporal model for WS data fit quite well.