Probabilistic wind power forecasting and its application in the scheduling of gas-fired generators

• Accurate wind forecasting is essential for the integration of wind farms into power systems.
• This paper presents a methodology for producing wind power forecast scenarios.
• The impact of wind uncertainty on the operation of gas plants was investigated.


Introduction
Many countries are committed to reducing their greenhouse gas (GHG) emissions by at least 80% by 2050, from 1990 levels [1]. Ambitious plans have therefore been set to deploy low carbon and renewable sources of energy. The large scale integration of wind generation into the power systems of western European countries is perceived to be an efficient strategy for meeting short term as well as long term emissions and renewable targets [2]. For example, in the UK, according to a number of low carbon scenarios developed by industry and governmental bodies, the capacity of wind generation in 2030 is expected to be between 48 GW and 65 GW [3,4].
The uncertainty of wind power forecasts makes the balancing of electricity supply and demand more challenging [5]. As a result, a larger level of flexibility is required in the system to compensate for forecast errors. The additional investment and operating costs required for employing flexibility options (such as fast ramping back-up generators and electrical storage) can be minimised by obtaining accurate information about possible day-ahead wind power generation.
During the past decade many methods have been developed for forecasting wind power. Generally, these approaches can be classified into two broad categories, namely physical methods and time series methods [6]. Physical methods use physical and meteorological information, including descriptions of orography, roughness, obstacles, pressure and temperature, to model wind power and forecast its future values. These approaches perform satisfactorily for long-term prediction of wind power [7]. Time series approaches, on the other hand, require a smaller volume of data and information than physical methods. Some key meteorological variables, such as wind speed and direction, are needed by a time series approach to build a wind forecast model [8]. The historical data of wind power generation can also be used directly by time series models to forecast wind power [8]. Conventional statistical models such as auto-regressive (AR) models [9] and auto-regressive integrated moving average (ARIMA) models have been proposed for wind speed and wind power forecasting.
http://dx.doi.org/10.1016/j.apenergy.2016.10.019 0306-2619/© 2016 Published by Elsevier Ltd.
A number of recently reported methods for wind power forecasting are summarized in the following. A Markov-switching method for forecasting wind speed, which is able to produce both point forecasts and interval forecasts, is examined in [10]. In [11], a recursive model for short term (1-24 h) forecasting of wind speed is reported. The model was developed based on the Hammerstein model and is capable of capturing the chaotic dynamics of wind speed time series. A wind speed forecasting method based on a secondary decomposition algorithm and Elman neural networks is reported in [12]. An approach based on backward extreme learning machine (ELM) forecasting was proposed in [13] to address the issue of ultra-short term wind power time series forecasting. For a detailed review of wind power forecasting models, refer to [14].
From the system operators' point of view, the forecast values and the associated uncertainty of aggregated wind power from wind farms in a region are of high importance. This information is important for optimal scheduling of storage and thermal units as well as for determining the level of reserve required to deal with uncertainty in wind and load forecasts [15][16][17]. Most wind power forecasting models generate a single forecast and thus do not provide information about the uncertainty of the wind power forecasts. In an effort to enhance the information provided by forecasters, probabilistic forecasting has been a recent area of development in wind power forecasting [18][19][20]. Probabilistic predictions can be derived from meteorological ensembles [21], based on physical considerations [22], or produced by one of the numerous statistical methods that have appeared in the literature; see [23][24][25][26][27][28] among others.
Models producing probabilistic wind power forecasts provide useful information for studying the power system impacts of wind energy. In [28], the impact of wind power forecasting on the market integration of wind energy in Spain is studied using time series analysis. The impact of wind power forecast uncertainty on unit commitment was investigated in [29]; however, the value of improved forecasts was not quantified.
The main objective of this study is to propose a framework for generating probabilistic aggregate wind power forecast scenarios using historical wind power time series data. The advantage of the proposed forecasting model is that its training process does not depend on additional attributes (i.e. weather data). Real aggregate power data from wind farms across Great Britain were used to examine the performance of this model. The time granularity of the historical data and the generated forecasts is 30 min. The historical values of wind power were classified based on their normalized values and trend. Data in the same class were used to create probability density functions based on kernel density estimators. These probability density functions were used to generate forecast scenarios using a rolling process. To demonstrate the application of the proposed forecasting approach, the wind forecast scenarios were used in an optimal unit commitment and economic dispatch model to investigate the impacts of wind forecast uncertainty on the operation of gas-fired generators.
The rest of the paper is organized as follows. In Section 2 the forecasting methodology is described. In Section 3, a case study is presented to demonstrate the performance of the forecasting model using real data from wind farms in the UK. Section 4 discusses the impact of forecasting errors on the unit commitment of gas-fired generators. Conclusions are drawn in Section 5.

Methodology
A probabilistic forecasting model was developed to generate wind power forecast scenarios and provide insights on the uncertainty of forecasts. As shown in Fig. 1, the model consists of three stages: Data pre-processing, Training and Forecasting.

Data pre-processing stage
A data pre-processing stage is required to prepare the inputs for the forecasting model. In this stage, the time series of the wind power data is normalized between zero and one, according to the installed wind generation capacity. The normalized data are then separated into two datasets, namely the Training and Testing datasets. The Training dataset is used in the Training process, where the model finds the hidden correlations and patterns in the data. The Testing dataset represents the actual data (to be forecasted) and is used to evaluate the performance of the forecasting method. The largest part of the data forms the Training dataset, whereas the rest forms the Testing dataset.
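The pre-processing step can be illustrated with a minimal Python sketch. The function name, the synthetic power series and the capacity value of 10,000 MW are illustrative assumptions, not values from the paper; clipping to [0, 1] is likewise an assumption added to guard against readings above the assumed capacity.

```python
import numpy as np

def normalize_and_split(power_mw, installed_capacity_mw, n_test):
    """Scale power by installed capacity and split chronologically."""
    power = np.asarray(power_mw, dtype=float)
    # Assumption: values are clipped to [0, 1] after scaling.
    normalized = np.clip(power / installed_capacity_mw, 0.0, 1.0)
    if not 0 < n_test < normalized.size:
        raise ValueError("n_test must be between 1 and len(power_mw) - 1")
    # The most recent n_test samples form the Testing dataset.
    return normalized[:-n_test], normalized[-n_test:]

# Synthetic half-hourly series: 31 weeks * 7 days * 48 steps = 10,416 values.
rng = np.random.default_rng(0)
power = rng.uniform(0.0, 8000.0, size=31 * 7 * 48)

# Hypothetical capacity; the last week (7 * 48 steps) is held out for testing.
train, test = normalize_and_split(power, installed_capacity_mw=10000.0,
                                  n_test=7 * 48)
```

A chronological split is used here because the forecasting stage starts from the last two values of the Training dataset.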

Training stage
During the Training stage, the relationship between two consecutive values of the training dataset is identified. Each value of the time series is classified according to its magnitude and trend. N equal intervals between zero and one are used to define the magnitude classes of the normalized values of the time series. For the trend classification, there are three possible classes, namely "Increase", "Decrease" and "Constant". To determine the trend class of a specific value of the time series, the magnitude class of the previous value is considered. If the previous value belongs to a smaller magnitude interval, the trend class is "Increase"; if it belongs to a larger magnitude interval, the trend class is "Decrease". If both values belong to the same magnitude interval, the trend class of the most recent value is "Constant". Fig. 2 illustrates the structure of the classification tree. Further investigation is required to determine the optimal number of consecutive data points needed for the trend classification, which is beyond the scope of this study.
According to the combination of the Magnitude and Trend classification, each value of the time series is assigned to exactly one class. A "Future Value Bin" is defined for each class, containing the successor (next value) of each classified value. For example, if both the Mth and (M − 1)th values of the time series belong to Magnitude Interval N, then the (M + 1)th value is assigned to "Future Value Bin 3N − 1". This procedure is called Arrangement.
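The classification and Arrangement steps can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: bins are keyed by a (magnitude class, trend class) pair instead of the bin numbering of Fig. 2, magnitude classes are 0-based, and all function names are hypothetical.

```python
from collections import defaultdict

def magnitude_class(x, n_intervals):
    """0-based index of the equal-width interval of [0, 1] containing x."""
    # A value of exactly 1.0 is placed in the last interval.
    return min(int(x * n_intervals), n_intervals - 1)

def trend_class(prev_mag, curr_mag):
    """Trend of the current value relative to the previous magnitude class."""
    if prev_mag < curr_mag:
        return "Increase"
    if prev_mag > curr_mag:
        return "Decrease"
    return "Constant"

def arrange(series, n_intervals):
    """Collect the successor of every classified value into its bin."""
    bins = defaultdict(list)
    for m in range(1, len(series) - 1):
        prev_mag = magnitude_class(series[m - 1], n_intervals)
        curr_mag = magnitude_class(series[m], n_intervals)
        key = (curr_mag, trend_class(prev_mag, curr_mag))
        bins[key].append(series[m + 1])
    return dict(bins)

series = [0.12, 0.18, 0.35, 0.31, 0.22]
bins = arrange(series, n_intervals=10)
```

With 10 magnitude intervals and three trend classes, at most 30 bins can be populated; bins that receive no training data have no density to sample from, which a full implementation would need to handle.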
The final step of the training process is the calculation of the probability density function of each "Future Value Bin" group. Applying the kernel distribution fitting methodology described in [30], the probability density function (pdf) of the data in these groups is calculated using Eq. (1):

$$\hat{f}_h(x) = \frac{1}{nh} \sum_{i=1}^{n} K\left(\frac{x - x_i}{h}\right) \tag{1}$$

where x_1, ..., x_n are the n data values in the group. The kernel distribution is defined by a weight function K(x) and a bandwidth value h that controls the smoothness of the resulting density curve. Unlike a histogram, which discretizes the data values into separate bins, the kernel distribution sums the weight functions for each data value to produce a smooth, continuous probability curve. In this model, the Epanechnikov kernel weight function is used, described in Eq. (2):

$$K(x) = \begin{cases} \frac{3}{4}\left(1 - x^2\right), & |x| \le 1 \\ 0, & \text{otherwise} \end{cases} \tag{2}$$

The bandwidth value h is set equal to 2. The training process is summarized in Algorithm 1.
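As an illustration of kernel density estimation with the Epanechnikov kernel, the sketch below evaluates the density of one hypothetical "Future Value Bin" and draws random values from it. The function names, the sample data and the bandwidth of 0.05 are assumptions chosen for readability (the paper states h = 2), and rejection sampling is only one of several ways to draw from the kernel.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov weight: 0.75 * (1 - u^2) on [-1, 1], zero elsewhere."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u ** 2), 0.0)

def kde_pdf(x, data, h):
    """Kernel density estimate at the points x for bandwidth h."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    data = np.asarray(data, dtype=float)
    u = (x[:, None] - data[None, :]) / h
    return epanechnikov(u).sum(axis=1) / (data.size * h)

def kde_sample(data, h, rng):
    """Draw one value: pick a data point, then add scaled kernel noise."""
    centre = rng.choice(np.asarray(data, dtype=float))
    while True:
        # Rejection sampling from the Epanechnikov kernel on [-1, 1].
        u = rng.uniform(-1.0, 1.0)
        if rng.uniform() <= 1.0 - u ** 2:
            return centre + h * u

data = [0.30, 0.32, 0.35, 0.41]   # hypothetical contents of one bin
h = 0.05                          # illustrative bandwidth, not the paper's
grid = np.linspace(0.0, 1.0, 2001)
area = float(np.sum(kde_pdf(grid, data, h)) * (grid[1] - grid[0]))
rng = np.random.default_rng(1)
draws = [kde_sample(data, h, rng) for _ in range(500)]
```

Because the kernel has compact support, every draw lies within h of a training value in the bin.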

Forecasting stage
This model forecasts the normalized value of one time step by considering the two most recent time steps. The normalized values of these two time steps are defined as base value1 (one time step before) and base value2 (two time steps before). The forecasting model first identifies the Magnitude Class of both base values and then identifies the Trend Class of base value1. Once the class of base value1 is identified, the forecasting model retrieves the parameters of the corresponding probability density function. The forecasted value is a random number generated according to this probability density function. This process is repeated for the whole forecasting period (N time steps), using the two most recent forecasted values to produce the forecast of the next time step.
The wind power values of the future time steps are forecasted using a rolling process. Every forecasted wind power value of a time step is used as base value1 to produce the forecast for the next time step. This results in less accurate predictions as the forecast time horizon increases. To improve the accuracy of the forecasts, the model can periodically update the base values using the two most recent actual values of the wind power. Frequent updates of the base values decrease the forecasting errors but increase the computational cost. Therefore, the frequency of updating the base values is chosen according to the desired accuracy level or the computational limitations. Because larger forecast errors occur at time steps further ahead, the impact of updating the forecasts using the most recent available data on wind power outturn is worth exploring. However, analyzing the computational cost of increasing the update frequency is outside the scope of this paper. Algorithm 2 describes the detailed actions required to produce a new wind power forecast using the probability density functions calculated during the Training stage. In this algorithm, two tasks are implemented, namely Base Values Calculation and Random Forecast Generation. A list containing the future time steps (of the Testing period) at which the forecasting model needs to update its base values is assigned to an "UpdateFrequency" variable.
For the first time step of the forecasting period, the last two values of the Training dataset are used as the base values. For the remaining time steps, the forecasting model checks whether the current time step is included in the "UpdateFrequency" list. If the current time step is included in the "UpdateFrequency" list, the actual wind power values of the two previous time steps are used as base value1 (one time step before) and base value2 (two time steps before). In the specific case when the current time step is the second time step of the Testing period, the model uses the last time step of the Training dataset as base value2 and the actual wind power value of the first time step (of the Testing period) as base value1. If the current time step is not included in the "UpdateFrequency" list, the forecasted values of the two past time steps are used as base value1 (one time step before) and base value2 (two time steps before) respectively. The Base Values Calculation task is described in lines 2-16 of Algorithm 2.
Once the Base Values Calculation task is complete, the next task is the Random Forecast Generation. First, the model identifies the Magnitude Class of base value1 and compares it with the Magnitude Class of base value2 to determine the Trend Class. According to the Magnitude Class and Trend Class of base value1, the parameters of the corresponding probability density function are retrieved. A random number is generated using these parameters, which is the forecast value of wind power for the current time step. This process is described in lines 17-30 of Algorithm 2.
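The base value logic of the rolling process can be sketched in Python as below. This is not Algorithm 2 itself: time steps are 0-based, `sampler` is a hypothetical callable standing in for the draw from the class-specific probability density function, and using the last training value as base value2 at the second time step when no update occurs is an assumption.

```python
import random

def rolling_forecast(train, actual_test, update_steps, sampler, rng):
    """Generate one forecast scenario over the Testing period."""
    forecast = []
    for t in range(len(actual_test)):
        if t == 0:
            # First step: start from the last two training values.
            base1, base2 = train[-1], train[-2]
        elif t in update_steps:
            # Update step: reset the base values to actual outturn.
            base1 = actual_test[t - 1]
            base2 = actual_test[t - 2] if t >= 2 else train[-1]
        else:
            # Otherwise: roll forward on the model's own forecasts.
            base1 = forecast[t - 1]
            base2 = forecast[t - 2] if t >= 2 else train[-1]
        forecast.append(sampler(base1, base2, rng))
    return forecast

def toy_sampler(base1, base2, rng):
    """Hypothetical stand-in for sampling from the class pdf."""
    drift = base1 - base2
    return min(1.0, max(0.0, base1 + 0.5 * drift + rng.gauss(0.0, 0.02)))

train = [0.30, 0.32]
actual = [0.35, 0.40, 0.38, 0.36, 0.41, 0.45]
scenario = rolling_forecast(train, actual, update_steps={2, 4},
                            sampler=toy_sampler, rng=random.Random(0))
```

Calling `rolling_forecast` repeatedly with different random seeds yields a set of forecast scenarios for the same period.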

Wind power forecast performance indices
Accurate short term forecasting of wind power up to 24 h ahead is important for efficient operation of power systems. Therefore, it is important that the performance of a wind power forecasting model is properly evaluated. Evaluation criteria for wind power forecasts are reviewed in [31][32][33][34]. In this paper, the following criteria were used for evaluating the performance of the forecasting model: Mean Error (ME), Normalized Mean Absolute Error (NMAE) and Standard Deviation of the Errors (SDE), where

$$e_{t+k|t} = P_{t+k} - \hat{P}_{t+k}$$

is the error corresponding to time t + k for the prediction made at time t.
Any prediction error consists of a systematic error (μ_e) and a random error (χ_e), where μ_e is a constant and χ_e is a zero mean random variable. MAE is affected by both systematic and random errors, whereas only random errors affect the SDE criterion, which describes the error distribution. According to [32], it is essential to include the MAE criterion in the error evaluation of a forecasting model.
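A minimal sketch of how these criteria could be computed is given below. It assumes the commonly used definitions (ME as the mean of the errors, NMAE as the mean absolute error divided by installed capacity, SDE as the sample standard deviation of the errors with one degree of freedom removed); the exact formulas used by the authors are not reproduced here, and the numbers are invented for illustration.

```python
import numpy as np

def forecast_errors(actual, forecast):
    """Errors defined as actual minus forecast, as in the text."""
    return np.asarray(actual, dtype=float) - np.asarray(forecast, dtype=float)

def mean_error(errors):
    """ME: estimates the systematic part of the error."""
    return float(np.mean(errors))

def nmae(errors, installed_capacity):
    """NMAE: mean absolute error normalized by installed capacity."""
    return float(np.mean(np.abs(errors)) / installed_capacity)

def sde(errors):
    """SDE: spread of the errors around their mean (assumes ddof=1)."""
    return float(np.std(errors, ddof=1))

actual = [100.0, 120.0, 90.0, 110.0]     # illustrative values in MW
forecast = [90.0, 130.0, 95.0, 100.0]
e = forecast_errors(actual, forecast)    # [10, -10, -5, 10]

me_value = mean_error(e)
nmae_value = nmae(e, installed_capacity=200.0)   # hypothetical capacity
sde_value = sde(e)
```

Subtracting the mean inside the standard deviation removes the constant component of the error, which is why SDE reflects only the random part.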

Data description
The forecasting model was applied to real aggregate wind power data, obtained from [35]. The data consisted of 10,416 half-hourly aggregate wind power values from wind farms across Great Britain, for the period 01/03/2014-03/10/2014. Fig. 3 shows the time series of the actual wind power and its first difference over time.
The data were preprocessed according to the methodology of Fig. 1. The total installed capacity of wind farms across Great Britain in 2014 was used to normalize the wind power data. The normalized data were separated into two groups, the Training and the Testing datasets. Data from the first 30 weeks were used as the Training dataset, whereas data from the last week were used as the Testing dataset. Some analysis metrics are presented in Table 1.
During the training process, the training data were classified according to their magnitude and trend. Intervals of 0.01 (normalized wind power) were used in the magnitude classification, whereas the trend was determined by comparing with the previous data entry. The classification tree therefore consists of 300 classes (100 magnitude intervals with three trend classes each), and consequently 300 "future" groups.

Wind forecast scenarios
After the training process, the forecasting model was used to forecast the wind power of the last week (31st) of the dataset. As mentioned before, the impact of frequent updating of the forecasts using the latest realized wind power was investigated. Seven different update frequencies were considered, namely every 48, 24, 16, 12, 8, 4 and 2 half-hourly time steps. The forecasting model was run 20 times (i.e. 20 scenarios) for every update frequency, resulting in 140 forecasts in total for the 31st week of the dataset. The results are shown in Fig. 4. Table 2 presents the performance indices which were used to evaluate the accuracy of the forecast.
As seen from Fig. 4, updating the base values more frequently results in more accurate forecasts. This is also reflected in Table 2, where the indices show an improvement in the accuracy of the forecasts as the update frequency is increased. This is because of the chain-like behavior of the forecasting model. An error in forecasting the first time step contributes to the next forecast, creating another error, and so on. This error chain breaks when the base values are updated with the actual wind power, and the model generates the next forecast without any previous errors. The wind power on days 2 and 3 fluctuates strongly; therefore the forecasting errors are higher. On the other hand, day 6 has almost constant wind power generation, and the forecasting errors are very small. Overall, all the performance indices improve when the update frequency is increased. The MAPE ranges between 32.79% and 172.08% when the base values are updated every 48 time steps (day-ahead forecasting). When the update frequency is increased to every 2 time steps (hourly), the MAPE ranges between 4.9% and 14.25%, an average improvement of 84.827%. Average improvements of 80.657% and 85.073% are also observed for the SDE and NMAE respectively.
The performance of the proposed model was compared to the Persistence model described in [31]. The Persistence model assumes that the future wind power production remains constant and equal to the last measured value of wind power. The comparison is presented in Fig. 5. The results show that as the forecast time horizon increases, the proposed model provides more accurate forecasts.
Fig. 6 presents the distribution of the error on every day of the forecasted week. On each box, the red line (see footnote 1) is the median and the edges of the blue box are the 25th and 75th percentiles. Every data point outside the box is considered an outlier and is drawn in black.
The stochastic nature of the forecasting model results in different forecasted values each time. To investigate and capture any possible systemic errors, 1000 forecasts were generated for the same week. The half-hourly forecast errors were calculated, and the results are shown in Fig. 7. All seven days of the 31st week were considered, for every update frequency. In all cases the error median is close to zero for every half-hour, which indicates that there is no systemic error in the model. More frequent updating of the forecasts results in a narrower error range, improving the forecasting accuracy.
Another significant factor that affects the performance of the model is the number of magnitude intervals used in the classification stage. To investigate the effect of this factor, the training process was repeated for magnitude intervals of 0.1, 0.05 and 0.02. For every interval the model was used to forecast the same week (31st), using the seven different update frequencies. Fig. 8 presents the aggregated MAPE results for each case. The effect of the size of the magnitude interval depends on the update frequency. As seen from Fig. 8, at low update frequencies the size of the magnitude interval has no significant impact. At high update frequencies, however, reducing the size of the magnitude interval (increasing the number of magnitude classes) results in further improvement of the forecasting accuracy.
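The Persistence benchmark described above is simple enough to state in a few lines. The sketch below is only an illustration: the function names and the sample values are assumptions, and the mean absolute error is used here merely as a convenient way to compare a forecast against outturn.

```python
def persistence_forecast(last_measured, horizon):
    """Hold the last measured value constant over the forecast horizon."""
    return [last_measured] * horizon

def mean_absolute_error(actual, forecast):
    """Average absolute difference between outturn and forecast."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

# Normalized outturn for four half-hourly steps (illustrative values).
actual = [0.42, 0.45, 0.50, 0.55]
baseline = persistence_forecast(last_measured=0.40, horizon=len(actual))
baseline_mae = mean_absolute_error(actual, baseline)
```

Because the persistence forecast never moves away from the last measurement, its error grows whenever the wind power drifts over the horizon.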
A model's complexity is always correlated with the required simulation time. The training time was used to characterize the complexity of the forecasting model because training is the most time consuming part of the model. Therefore, the training time for all the magnitude intervals considered was calculated. The time required for training the model when using 10 and 100 magnitude intervals was 147 s and 186 s respectively. This indicates that the training time increased by approximately 27%; however, this additional time is short and does not affect the real time operation of the model, as the shortest forecast horizon is 2 h. The efficiency is a trade-off between the forecast effort and the forecast accuracy. The simulations were conducted on an Intel i3 Processor Platform (3.3 GHz) with 16 GB RAM and the Microsoft Windows 7 operating system.
Footnote 1: For interpretation of color in Fig. 6, the reader is referred to the web version of this article.
In order to achieve the best possible forecasts, the training of the model needs to be updated with the latest available data. For example, the model cannot provide accurate wind power forecasts for a typical summer day, when it was trained using the wind generation data in winter.
Overall, considering 10 Magnitude Classes, the average MAPE for 1000 forecasts was reduced from 44.2% to 10.15% when the update interval was shortened from every 48 time steps to every 2 time steps. This represents a reduction of 77.03% in the forecasting error. The accuracy can be further improved by increasing the number of Magnitude Classes used in the Training stage. Considering 100 Magnitude Classes, shortening the update interval from every 48 time steps to every 2 time steps reduced the average MAPE for 1000 forecasts by 84.79%, from 42.3% to 6.43%. In addition, increasing the number of Magnitude Classes reduces the error range. For 10 magnitude classes and an update frequency of 2 half-hours the error range is 23.48%, whereas for 100 magnitude classes the error range is 7.96%.

Impacts of the wind power forecast uncertainty on gas-fired generators
The proposed probabilistic aggregate wind power forecasting methodology can be used to investigate different aspects of the integration of wind into power systems. The wind forecast scenarios will help system operators and market participants to take wind power uncertainty into account in their decision making.
In the following sections the use of the probabilistic aggregate wind power forecasts in optimal commitment and dispatch of power generating units is demonstrated.

Unit Commitment and Economic Dispatch (UC&ED) model
An optimisation model for UC&ED of different types of generators, including gas-fired generating units, was used in this study to analyze how uncertainty in wind power forecasting affects the operational cost of the power system and, in particular, the operation of gas-fired generators.
The UC&ED model is described in detail in [36][37][38]. Here a brief overview of the model is provided. The model minimises the operational costs of the power system to determine the optimal operation of thermal generating units in the presence of a large capacity of wind generation. The objective function includes fuel, variable O&M, shut down and start-up costs of different generating units, as well as the cost of unserved electricity (Eq. (7)). Minimising the objective function is subject to constraints that include the generation limits of the units (Eq. (8)), their ramp limits (Eq. (9)) and their minimum up and down times (Eqs. (10) and (11)). The gas demand for power generation was calculated taking into account the efficiency of gas-fired generators (Eq. (12)).
$$\underline{\mathrm{Power}}_i \, \nu_{i,t} \le \mathrm{Power}_{i,t} \le \overline{\mathrm{Power}}_i \, \nu_{i,t} \tag{8}$$

$$\left| \mathrm{Power}_{i,t} - \mathrm{Power}_{i,t-1} \right| \le R_i \tag{9}$$

$$\nu_{i,t'} - \nu_{i,t'-1} \le \nu_{i,t}, \quad t' = [t - UT_i + 1,\; t - 1] \tag{10}$$

$$\nu_{i,t'-1} - \nu_{i,t'} \le 1 - \nu_{i,t}, \quad t' = [t - DT_i + 1,\; t - 1] \tag{11}$$

The wind power forecast scenarios produced above were used as inputs to the UC&ED model to investigate the impacts of wind power uncertainty on the operation of gas-fired generators and consequently on the gas demand for power generation. The Great Britain generation mix in 2030 reported in National Grid's Gone Green scenario [39] was used as a case study (Table 3).
The UC&ED model was developed using the FICO Xpress Optimisation Suite [40]. Due to the binary variables used to represent the ON/OFF state of the thermal generating units, the UC&ED optimisation model is a mixed integer linear programming problem and is therefore non-convex. The optimisation problem was solved using a Branch and Bound framework.
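To make the unit-level constraints of Eqs. (8)-(11) concrete, the following sketch checks whether a given schedule for a single generating unit satisfies them. It is a feasibility check only, not the optimisation model of [36][37][38]. The interpretation of the symbols (a binary ON/OFF state per time step, a ramp limit R, a minimum up time UT and a minimum down time DT) is an assumption, as are the function name and the example values; periods before the start of the horizon are ignored.

```python
def schedule_is_feasible(power, on, p_min, p_max, ramp, min_up, min_down):
    """Check one unit's schedule against Eqs. (8)-(11)."""
    for t in range(len(power)):
        # Eq. (8): output within limits when ON, zero when OFF.
        if not (p_min * on[t] <= power[t] <= p_max * on[t]):
            return False
        # Eq. (9): change in output limited by the ramp rate.
        if t >= 1 and abs(power[t] - power[t - 1]) > ramp:
            return False
        # Eq. (10): a start-up within the last UT - 1 steps forces ON now.
        for tp in range(max(1, t - min_up + 1), t):
            if on[tp] - on[tp - 1] > on[t]:
                return False
        # Eq. (11): a shut-down within the last DT - 1 steps forces OFF now.
        for tp in range(max(1, t - min_down + 1), t):
            if on[tp - 1] - on[tp] > 1 - on[t]:
                return False
    return True

limits = dict(p_min=100.0, p_max=400.0, ramp=150.0, min_up=3, min_down=2)

ok = schedule_is_feasible([0, 100, 200, 100, 0, 0], [0, 1, 1, 1, 0, 0], **limits)
too_short = schedule_is_feasible([0, 100, 0, 0, 0, 0], [0, 1, 0, 0, 0, 0], **limits)
too_steep = schedule_is_feasible([0, 100, 400, 400, 400, 400], [0, 1, 1, 1, 1, 1], **limits)
```

In the optimisation model these conditions are linear constraints on binary and continuous variables, which is what makes the problem a mixed integer linear programme.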

Operational costs of power systems
The total operational costs of the Great Britain power system for a representative winter week in 2030 are shown in Fig. 9 for different wind power forecast scenarios. Updating the wind power forecasts more frequently using the latest wind power out-turn resulted in smaller variations of the expected operational cost. On the other hand, when the wind forecasts are updated every 24 h, a significant difference of £45 million between minimum and maximum expected operational costs of the power system was observed. The variation in the expected operational costs of the system can be considered as a proxy for the uncertainty in electricity price which poses risks to the participation in the electricity market.

Standard deviation of power output from gas-fired generators
The standard deviation of the power output from gas-fired generators was calculated for different forecast scenarios. Fig. 10 shows that when the wind power data are updated on an hourly basis, the standard deviation of the gas-fired generation is around 1 GW. On the other hand, when the wind power data are updated every 24 h, the standard deviation of the gas-fired generation for different forecast scenarios can be as large as 6 GW. A large standard deviation of the gas-fired generation means that the gas network should be able to deal with a wider range of possible gas demand. This requires more flexibility to be made available to the gas network via fast cycle local gas storage or linepack.

Conclusions
A model was developed for producing day-ahead probabilistic wind power forecast scenarios. Using a rolling forecasting approach, the impact of frequent updating of the forecasts was investigated. The modelling framework consisted of three parts: Data pre-processing, Training and Forecasting. The Data pre-processing stage included data normalization according to the maximum installed capacity of the wind farms. The Training stage extracts the knowledge hidden in the aggregate wind power data. Each normalized data point was classified according to its magnitude level and trend. For every combination of these classes, the probability density functions were calculated using kernel density estimators. Finally, the model provides probabilistic forecasts for the next time step according to the Magnitude classes of the two previous time steps and the Trend Class of the previous time step. The model repeats this process to forecast wind power for the following time steps using a rolling approach.
It was demonstrated that increasing the number of magnitude classes together with the update frequency results in more accurate forecasts. The impact of various data updating frequencies on the accuracy of the forecasts was investigated. The results showed an improvement in forecast accuracy as the model updates the base values more frequently.
The proposed forecasting model can be used to generate probabilistic aggregate wind power forecast scenarios which are necessary for stochastic scheduling models to investigate optimal operation of a generation portfolio under wind forecast uncertainty. This includes optimal commitment and dispatch of a generation portfolio, in addition to optimal allocation of spinning and operating reserves. Furthermore, the value of more frequent updating of the wind power forecasts in the scheduling of generation portfolios can be quantified.
The advantage of the proposed forecasting model is that its training process does not depend on additional attributes (i.e. weather data). Real aggregate power data from wind farms across Great Britain were used to examine the performance of this model. The application of the forecast scenarios was demonstrated in the unit commitment of gas-fired generators.