Seasonal Characteristics Analysis and Uncertainty Measurement for Wind Speed Time Series

The distributional nature of wind speed, in particular its uncertainty and randomness, makes high-accuracy forecasting challenging. Based on the energy distribution over the extracted amplitudes and associated frequencies, the uncertainty is measured with Rényi entropy, a time-frequency analysis method. A nonparametric statistical method is used to test the randomness of wind speed, more precisely, whether or not the wind speed time series is independent and identically distributed (i.i.d.) based on the output probability. Seasonal characteristics of wind speed are analyzed through self-similarity in the periodogram over a range of scales generated by the wavelet transformation, so that the original dataset can be reasonably divided and the seasonal distribution characteristics effectively reflected. An experimental evaluation on a dataset from the National Renewable Energy Laboratory (NREL) demonstrates the performance of the proposed approach.


Introduction
Renewable energy resources such as wind energy have been treated as an alternative for addressing environmental pollution and the energy crisis [1], and they are widely used in electricity generation. Accurate and reliable uncertainty measurement of the wind resource is conducive to establishing forecasting models with high generalization ability [2]. Pryor et al. [1] briefly analyzed the inherent challenges and uncertainty factors, such as the uncertain nature of wind and the climate distribution, in the attribution of future wind energy based on historical data trends. To improve the validity of wind resource analysis, many studies have proposed measuring the uncertainty of the wind resource through probabilistic analysis. Kwon [3] proposed a probability model to quantitatively analyze the uncertainty of wind energy and found about 11% uncertainty in the average energy production after normalization. Aien et al. [4] discussed the necessity of uncertainty analysis of renewable energy in engineering systems. Analogously, approaches based on hydrological evaluation methods [5][6][7] were presented to analyze the uncertainty effects of climate change, wind power penetration, etc. Randomness is another issue that causes problems in wind power forecasting [8,9]. Du et al. [10] discussed the significant characteristics of the randomness in wind power system output related to the generated power, in particular its potential influence on the power capacity credit at different output levels. Roy et al. [11] gave a comprehensive and quantitative description of the randomness of wind speed to evaluate the output variability used for power integration and reserve. Ding et al. [12] used a stochastic optimization method to address the negative effect of the randomness of wind energy, and proposed an optimized pumped-hydro-storage plant schedule to reduce the potential influence on power system operation. Wang et al. [13] proposed a forecasting method to analyze the wind speed uncertainty based on an analytical model; in particular, the uncertainty of the wind speed can be analyzed qualitatively and quantitatively based on the multi-objective water cycle algorithm. However, this method still requires a large amount of data as an analysis sample to obtain a quantitatively estimated interval. Soulouknga et al. [14] investigated the wind speed at 10 m height to analyze the sample characteristics and the energy potential, and recommended installation strategies by evaluating the power density and available energy. Groch et al. [15] provided a method to empirically assess loss-of-generation events of wind power based on historical observations. More precisely, the statistical analysis of mesoscale wind speed obtained by the turbines at a wind farm was discussed in detail. That work showed that the methodology can be applied to high-wind-speed shutdown events, as well as to any event of interest, even below cut-in. Ciulla et al. [16] showed that the wind power curve can generally be described by statistical analysis of the instantaneous wind speed. Probability density functions, power curves and neural networks were used to analyze the first, second and third phases, respectively, based on fitting to one year of real production samples. Ren et al. [17] analyzed the challenges caused by the uncertainty and intermittency of wind power, and provided a definition of wind power intermittency as well as quantitative results for evaluation. Experiments based on one wind farm in China were given to demonstrate the final performance of the proposed approaches.

Typically, the generalization ability of the outlined methods in wind speed forecasting can be further improved by analyzing the following issues:
1. Uncertainty measurement. How can the negative effect of the uncertainty in the wind resource be reduced? What are the variation patterns of uncertainty in different seasons?
2. Randomness evaluation. Is the wind speed time series independent and identically distributed at different time scales?
The main objective of this paper is to analyze the outlined issues in order to improve the forecasting accuracy, support the model configuration and enhance the model robustness. This paper is organized as follows. The proposed approach, including the uncertainty measurement and randomness evaluation, is given in Section 2. In Section 3, an experimental evaluation is given to verify the performance of the proposed approach, and the paper is concluded in Section 4.

Proposed Approach
The proposal for the outlined issues is as follows: (1) Rényi entropy, a powerful time-frequency analysis method, supports the investigation of the energy distribution according to amplitude and frequency. The wavelet decomposition method is used to extract the stationary component of wind speed to reduce the negative effect of the uncertainty in the wind resource. (2) A nonparametric statistical method is used to test whether there are mixed distributions in wind speed and whether the wind speed is independent and identically distributed in different seasons. The uncertainty measurement analyzes the wind speed time series by evaluating the energy distribution based on the extracted amplitude and associated frequency. The randomness evaluation checks whether or not the observation is a randomly generated series based on the output probability. Finally, an experimental comparison based on the NREL dataset from 2004 is given to evaluate the performance of the proposed approach.

Uncertainty Measurement
Time series uncertainty measurement is one of the bases of information-based decision making and model control. Statistical methods are usually selected to measure the uncertainty of a time series in mathematical modeling. Effective uncertainty measurement improves the capability of time series computation and analysis. Uncertainty variability exists in wind power systems at all levels. The nature of wind speed, such as randomness, seasonality and uncertainty, increases the difficulty of high-accuracy forecasting modeling. Wind flow modeling will be reliable if the uncertainty in wind resources is properly estimated [18]. The uncertainty in the wind resource is mostly related to the energy distribution, which depends on the wind turbine model, the wind flow frequency and other climate-related factors. Rényi entropy, a generalization of Shannon entropy and Hartley entropy, is a time-frequency analysis method dedicated to analyzing nonlinear and non-stationary time series by evaluating the energy distribution based on the extracted amplitude and associated frequency. For the given time series $\{X_t\}_{t=1,2,\ldots}$ with the corresponding probabilities $p_i$, the Rényi entropy is

$$H_\alpha(X_t) = \frac{1}{1-\alpha}\log\Big(\sum_i p_i^{\alpha}\Big), \tag{1}$$

where $\alpha \ge 0$, $\alpha \neq 1$ is the order of the Rényi entropy, which can be adjusted to calculate the spectrum of the Rényi entropy used to measure how the energy distribution varies over time and scales.

In general, the lower $H_\alpha(X_t)$ is, the smaller the uncertainty associated with the time series $X_t$. Otherwise, $X_t$ may be composed of many uncertainty factors associated with energy flows, and the corresponding distribution is loosely packed, which corresponds to a higher entropy. Rényi entropy yields a good-quality measurement of the wind speed uncertainty over time, and also provides an estimate for the time series based on a quantitative measure of sample quality. Typically, the longer the time series, the greater the uncertainty.

Besides the uncertainty measurement, another significant issue, randomness, should be considered. In wind speed modeling, the forecasting accuracy would be beyond control if the observed sample obeyed a completely random distribution. The objective of the randomness test is to test whether the given time series $\{X_t\}_{t=1,2,\ldots}$ is completely random by inferring the associated characteristics. The model's forecasting accuracy, output robustness, etc. cannot be guaranteed for an approximately random time series; the forecasting of a chaotic series is a typical example.
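The entropy measurement described above can be sketched numerically as follows. This is a minimal illustration, not the paper's implementation: it assumes the probabilities $p_i$ are estimated from histogram bin frequencies of the series, and the function name `renyi_entropy`, the bin count and the synthetic Weibull sample are all hypothetical choices made for the example.

```python
import numpy as np

def renyi_entropy(x, alpha, bins=32):
    """Rényi entropy (in nats) of a series.

    Assumption: p_i are the relative frequencies of histogram bins.
    """
    if alpha < 0:
        raise ValueError("alpha must be non-negative")
    counts, _ = np.histogram(np.asarray(x, dtype=float), bins=bins)
    p = counts[counts > 0] / counts.sum()  # drop empty bins so logs are finite
    if np.isclose(alpha, 1.0):
        # The order alpha = 1 is excluded from the definition; use the Shannon limit.
        return float(-np.sum(p * np.log(p)))
    return float(np.log(np.sum(p ** alpha)) / (1.0 - alpha))

# Synthetic stand-in for ten days of 10-minute wind speed (not NREL data).
rng = np.random.default_rng(0)
speed = rng.weibull(2.0, size=1440) * 8.0

orders = (0.0, 0.5, 1.0, 2.0, 5.0)
entropies = [renyi_entropy(speed, a) for a in orders]
for a, h in zip(orders, entropies):
    print(f"alpha={a:3.1f}  H={h:.4f}")
```

Evaluating the function on sub-series of different lengths (a day, a month, a season) gives one way to compare uncertainty across time scales, under the histogram assumption stated above.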

Randomness Evaluation
Statistical methods known as nonparametric tests are usually used to test whether the wind speed time series is independent and identically distributed (i.i.d.). More precisely, the randomness in wind speed is evaluated based on the distribution-invariant properties associated with random processes. Goodness-of-fit criteria and several entropies are well-known methods for comparing sample distributions [19]. Nonparametric testing usually assumes no underlying distribution for the sample. The null and alternative hypotheses are simplified as follows. Null hypothesis: the wind speed time series is i.i.d., i.e., random. Alternative hypothesis: the sequence is not random. Under the null hypothesis, the number of runs in the given time series has mean

$$\mu_1 = \frac{2 n_1 n_2}{n_1 + n_2} + 1$$

and variance

$$\sigma_1^2 = \frac{2 n_1 n_2 \,(2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 \,(n_1 + n_2 - 1)},$$

and the corresponding marginal probability distributions, based on permutations and combinations, are formulated as

$$P(R = 2r) = \frac{2\binom{n_1 - 1}{r - 1}\binom{n_2 - 1}{r - 1}}{\binom{n_1 + n_2}{n_1}}, \qquad P(R = 2r + 1) = \frac{\binom{n_1 - 1}{r}\binom{n_2 - 1}{r - 1} + \binom{n_1 - 1}{r - 1}\binom{n_2 - 1}{r}}{\binom{n_1 + n_2}{n_1}},$$

where $r$ is a positive integer for the order, and $R$ is the observed number of runs. $n_1$ is the number of samples in the given time series that are larger than the median, and $n_2$ is the number of samples that are smaller than or equal to the median. The critical region is determined by the probability of false alarm (PFA). The test for randomness checks whether or not the observation is a randomly generated series based on the output probability.
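The runs test around the median can be sketched as below. This is an illustrative sketch under stated assumptions: it uses the large-sample normal approximation with the standard mean and variance of the number of runs rather than the exact run-count distribution, it counts values equal to the median in $n_2$ as described in the text, and the function name `runs_test_median` and the sample series are hypothetical.

```python
import math
import numpy as np

def runs_test_median(x):
    """Two-sided runs test about the median (normal approximation).

    Returns (number of runs, z statistic, p-value).
    """
    x = np.asarray(x, dtype=float)
    above = x > np.median(x)       # True for samples above the median
    n1 = int(above.sum())          # samples larger than the median
    n2 = int(above.size - n1)      # samples smaller than or equal to the median
    if n1 == 0 or n2 == 0:
        raise ValueError("series must contain values on both sides of the median")
    # A new run starts whenever consecutive samples fall on different sides.
    runs = 1 + int(np.count_nonzero(above[1:] != above[:-1]))
    mean = 2.0 * n1 * n2 / (n1 + n2) + 1.0
    var = (2.0 * n1 * n2 * (2.0 * n1 * n2 - n1 - n2)
           / ((n1 + n2) ** 2 * (n1 + n2 - 1.0)))
    z = (runs - mean) / math.sqrt(var)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return runs, z, p_value

# A steadily increasing series has only two runs, so randomness is rejected.
trend_runs, trend_z, trend_p = runs_test_median(np.arange(100))
# A strictly alternating series has too many runs, so randomness is also rejected.
alt_runs, alt_z, alt_p = runs_test_median([0, 1] * 10)
print(trend_runs, trend_p < 0.05)
print(alt_runs, alt_p < 0.05)
```

Rejecting the null hypothesis when the p-value falls below the chosen significance level corresponds to the observed number of runs lying in the critical region.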

Seasonal Characteristics Analysis
Wind speed is essentially a non-stationary meteorological time series, and its distribution presents different characteristics in different seasons. The wavelet transformation (WT) comprises the discrete and the continuous wavelet transformation, and is widely used in time series analysis through multiple frequency bands with multiple resolutions over time. The WT can detect the wind variation pattern and capture the seasonal features in different seasons [20]. The Morlet function is well suited to meteorological time series analysis because its waveform shape is close to the analyzed signal. The continuous wavelet transformation is defined by

$$W(s,\tau) = \frac{1}{\sqrt{s}} \int_{-\infty}^{+\infty} x(t)\, \psi^{*}\!\left(\frac{t-\tau}{s}\right) dt,$$

where $x(t)$ is the given time series, and $\psi^{*}$ denotes the complex conjugate of the mother function $\psi(t) \in L^2(\mathbb{R})$. The Fourier transformation $\hat{\psi}(\omega)$ of $\psi(t)$ satisfies the admissibility condition $\int_{-\infty}^{+\infty} |\hat{\psi}(\omega)|^2 / |\omega| \, d\omega < \infty$. $W(s,\tau)$ is the wavelet coefficient, and $\tau$ is the translation parameter associated with time $t$. $s$ is a scale factor with respect to $x(t)$, which is used for the frequency measurement. The similarity of the seasonal characteristics of $x(t)$ can be captured based on the scale $s$ over time $t$. The similarity among the different seasonal characteristics in the periodogram is higher if the corresponding $W(s,\tau)$ is larger. This procedure is conducive to capturing the seasonal characteristics of the wind speed by measuring the periodic similarity in the periodogram.
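The continuous wavelet transformation with a Morlet mother function can be illustrated by evaluating the defining integral directly. The sketch below is an assumption-laden illustration, not the authors' code: it uses a complex Morlet wavelet with center frequency `w0 = 6` (the small zero-mean correction term is omitted), a plain O(n²) summation per scale instead of an FFT, and a synthetic sinusoid with a 72-sample period standing in for a half-day cycle at 10-minute sampling. The names `morlet` and `cwt_morlet` are hypothetical.

```python
import numpy as np

def morlet(t, w0=6.0):
    """Complex Morlet mother wavelet (correction term omitted, fine for w0 >= 5)."""
    return np.pi ** -0.25 * np.exp(1j * w0 * t) * np.exp(-0.5 * t ** 2)

def cwt_morlet(x, scales, dt=1.0, w0=6.0):
    """Return W[i, k] = W(scales[i], tau_k) by direct summation of the integral."""
    x = np.asarray(x, dtype=float)
    t = np.arange(x.size) * dt
    W = np.empty((len(scales), x.size), dtype=complex)
    for i, s in enumerate(scales):
        # arg[k, j] = (t_j - tau_k) / s, with translations tau_k on the sample grid
        arg = (t[None, :] - t[:, None]) / s
        W[i] = (np.conj(morlet(arg, w0)) @ x) * dt / np.sqrt(s)
    return W

# Synthetic series: five days of 10-minute samples with a 72-sample period.
n = 720
x = np.sin(2.0 * np.pi * np.arange(n) / 72.0)

scales = np.array([9, 18, 35, 70, 140])
W = cwt_morlet(x, scales)

# Share of scalogram energy per scale; the dominant period shows up as a peak.
energy = np.mean(np.abs(W) ** 2, axis=1)
share = energy / energy.sum()
for s, p in zip(scales, share):
    print(f"scale={s:4d}  energy share={p:.3f}")
```

For this wavelet the scale is close to the Fourier period (in samples), so the energy concentrates near scale 70 for the 72-sample cycle.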

Data Description
The sample used for experimental evaluation was downloaded from the National Renewable Energy Laboratory (NREL) at: http://www.nrel.gov/electricity/transmission/. The sample contains two variables: wind speed (m/s) and netpower (MW). The site number of the sample is 06996, and the sampling interval is 10 minutes per point. To effectively reflect the seasonal characteristics of the wind speed trend over a whole year, a sample of size 52703 × 1 (wind speed only), from January 1, 2004 to December 31, 2004, is selected for analysis.

Uncertainty Measurement and Randomness Evaluation
Measuring the uncertainty of wind speed helps to estimate potential changes. Rényi entropy is conducive to analyzing the uncertainty of wind speed based on the corresponding amplitude and frequency information. A smaller Rényi entropy means less uncertainty in wind speed; otherwise, the analyzed signal is more complex. The trend between the order $\alpha$ and the Rényi entropy $H_\alpha(X_t)$ in formula (1) is shown in Fig. 1. In general, $H_\alpha(X_t)$ decreases as $\alpha$ increases, based on Fig. 2. $H_\alpha(X_t)$ tends to a constant value as $\alpha \to \infty$. $H_\alpha(X_t)$ accounts for all possible probabilities related to the wind speed time series $X_t$ as $\alpha \to 0$. $H_\alpha(X_t)$ approaches the Shannon entropy as $\alpha \to 1$, which can be used for the quantitative measurement of the sample information. Without loss of generality, the time scales day, month and season are used to demonstrate the trend in wind speed. The uncertainty of wind speed in winter is higher than in the other three seasons, whether measured by day or by month. The randomness analysis of the wind speed based on the nonparametric test method is given in Tab. 1.
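The behavior of the entropy with respect to its order can be made explicit through the limiting cases. The block below is a short worked summary, assuming formula (1) is the standard Rényi entropy over probabilities $p_i$ with the natural logarithm; the labels in parentheses are the usual names of the limits.

```latex
\begin{align}
H_\alpha(X_t) &= \frac{1}{1-\alpha}\log\Big(\sum_i p_i^{\alpha}\Big),
  \qquad \alpha \ge 0,\ \alpha \ne 1, \\
\lim_{\alpha\to 0} H_\alpha(X_t) &= \log\big|\{\, i : p_i > 0 \,\}\big|
  \quad \text{(Hartley entropy)}, \\
\lim_{\alpha\to 1} H_\alpha(X_t) &= -\sum_i p_i \log p_i
  \quad \text{(Shannon entropy)}, \\
\lim_{\alpha\to\infty} H_\alpha(X_t) &= -\log \max_i p_i
  \quad \text{(min-entropy)}, \\
H_\alpha(X_t) &\ge H_\beta(X_t) \quad \text{for } \alpha < \beta .
\end{align}
```

As a small check with $p = (0.5, 0.25, 0.25)$: the Hartley limit is $\log 3 \approx 1.0986$, the Shannon limit is about $1.0397$, the order-2 value is $-\log 0.375 \approx 0.9808$, and the limit for large orders is $\log 2 \approx 0.6931$, which is consistent with the entropy decreasing toward a constant as the order grows.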
The longer the wind speed time series, the greater the uncertainty. LB and UB indicate the lower and upper bounds of the evaluation, respectively. For instance, the uncertainty over the whole spring is significantly higher than that over a month or a day. The sequences are ordered by priority according to their uncertainty and randomness. In fact, many studies have proposed strategies to reduce the negative effect of the uncertainty in the wind resource. The uncertainty in the wind resource can be treated as a percentage of the wind speed. The quantitative measurement of the uncertainty in wind speed is implemented through a specified confidence level that the wind speed falls within a specified interval. This paper provides an optimal solution to reduce the influence of the uncertainty in the wind resource by utilizing the WT.
Moreover, the median crossing test, which is essentially a nonparametric test for randomness, is used to estimate the randomness of wind speed. In fact, the forecasting accuracy would be beyond control if the observation or sample were truly random. In particular, the long-term forecasting accuracy cannot be guaranteed for an approximately random time series; the forecasting of chaotic series is a typical example. For the three time scales (day, month and whole season), all test decisions obtained with the MATLAB function 'runstest' indicate that the null hypothesis should be rejected. More precisely, the hypothesis that the wind speed time series is i.i.d. (random) is rejected at the default 5% significance level. This means that wind speed is not a randomly generated series based on the output probability at the outlined time scales.

Seasonal Characteristics Analysis
The uncertainty measurement and randomness evaluation of wind speed indicate that a high-accuracy forecasting model should take into account the differences in the distribution features across seasons. Usually, the distribution of wind speed has a continuous spectrum because of periodic changes and external climate conditions. In fact, the precise period of change of the wind speed is difficult to estimate because of the complicated meteorological interactions and distribution characteristics. However, the seasonal characteristics, including the approximate period, can be effectively derived from the scalogram percentage of energy distribution generated by the WT. The trend and seasonality of wind speed can be sufficiently reflected by the self-similarity of the WT coefficients in the periodogram under various scales, which is used to investigate the corresponding seasonal characteristics. Note that the WT coefficients in the periodogram may not be independent and identically distributed because of potential noise. Since the energy distributions of the signal and the noise differ, a wavelet filter method is used to improve the accuracy of the seasonal characteristics analysis. The seasonal characteristics analysis results for July and November are displayed in Figs. 3 and 4, respectively.
Two seasonal distribution characteristics of the wind speed can be derived from the seasonal characteristics analysis (SCA). The first is a distinct seasonal characteristic with local periodicity; July is a typical month with this distribution. Half a day (about 67 points) is approximately treated as the minimum period under scales 1-256 over time. The second is seasonal but has no obvious periodicity over time; November is a typical month with this characteristic. The frequency associated with the scale is approximately treated as the 'true' frequency to investigate the seasonal characteristics. There are thus two seasonal distribution patterns based on the WT spectrum: (1) seasons with a half-day period, which are seasonal and locally periodic, such as summer and winter; (2) seasons that are seasonal but show no obvious periodicity over time, such as autumn and spring. To accurately analyze the seasonal characteristics of wind speed, the statistical results for the wavelet coefficient matrices $\{W(s,\tau)\}_{m \times n}$ are given in Tab. 2. Months are merged into one subset if the self-similarity coefficient $r_{SS}$ is smaller than the correlation coefficient $r_{W(s,\tau)}$ with the previous and next month. In particular, June, July and August are merged into one subset because $r_{W(s,\tau)} > r_{SS}$. This indicates that the similarity of seasonal characteristics among these months is higher than the self-similarity within each month. Based on this discussion, a larger sample may not benefit the establishment of a high-accuracy model unless the training sample contains sufficient seasonal characteristics information for testing. In this way, the seasonal characteristics of wind speed will be sufficiently reflected in the wind speed forecasting process.
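The merging rule for months can be illustrated with a small sketch. It rests on one reading of the rule, which is an assumption: two adjacent months are placed in the same subset when the correlation coefficient between them exceeds the self-similarity coefficient of both months. The function name `merge_months` is hypothetical, and the coefficient values are made up for illustration; they are not the values reported in Tab. 2.

```python
def merge_months(months, r_ss, r_adjacent):
    """Group consecutive months into subsets.

    months      : month labels in calendar order
    r_ss        : self-similarity coefficient of each month
    r_adjacent  : r_adjacent[i] is the correlation between months[i] and months[i + 1]
    Assumed rule: merge neighbors when their correlation exceeds both r_ss values.
    """
    groups = [[months[0]]]
    for i in range(1, len(months)):
        if r_adjacent[i - 1] > max(r_ss[i - 1], r_ss[i]):
            groups[-1].append(months[i])   # similar enough: extend the current subset
        else:
            groups.append([months[i]])     # otherwise start a new subset
    return groups

# Made-up coefficients, for illustration only.
months = ["May", "Jun", "Jul", "Aug", "Sep"]
r_ss = [0.60, 0.40, 0.35, 0.42, 0.65]
r_adjacent = [0.30, 0.55, 0.58, 0.25]

groups = merge_months(months, r_ss, r_adjacent)
print(groups)  # [['May'], ['Jun', 'Jul', 'Aug'], ['Sep']]
```

With these illustrative numbers, June, July and August fall into one subset, mirroring the kind of grouping described in the text.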

Conclusions
The uncertainty, randomness and seasonal characteristics of wind speed are the main considerations of this paper. Three time scales, i.e., day, month and whole season, are analyzed for the uncertainty evaluation of wind speed based on the Rényi entropy. Nonparametric statistical evaluation methods are used to check whether the time series is i.i.d. and to provide the lower and upper bounds of the probability evaluation. The similarity among the different seasonal characteristics of wind speed is used to analyze the seasonal distribution characteristics and to provide effective strategies for properly dividing the original dataset. Finally, an experimental evaluation based on the NREL dataset from 2004 is given to verify the effectiveness of the proposed approach. In future work, dynamical analysis with error correction and adaptive adjustment capabilities, in combination with the approach proposed in this paper, will be considered.

Acknowledgement:
The authors thank the reviewers for their valuable comments and helpful suggestions, which improved the manuscript.

Conflicts of Interest:
The authors declare that they have no conflicts of interest to report regarding the present study.