Ultra-Short-Term Prediction of Wind Power Based on Fuzzy Clustering and RBF Neural Network

High-precision wind power forecasting can mitigate the impact of the volatility and intermittency of wind power output, which is conducive to the stable operation of the power system and improves the system's effective capacity for large-scale wind power consumption. In a wind farm, the wind turbines are located at different spatial positions, and their output characteristics are also affected by wind direction, wake effect, and operating conditions. Based on this, a two-step ultra-short-term forecast model is proposed. First, fuzzy C-means (FCM) clustering is used to cluster the units according to the output characteristics of the wind turbines. Second, an RBF neural network prediction model is established for each cluster, and the ultra-short-term power forecast is performed for each unit. Finally, the results are compared with a single RBF prediction model established for the unclassified wind turbines. A case study of a wind farm in northern China is carried out. The results show that the proposed method can effectively improve the prediction accuracy of wind power.


Introduction
To address the challenges of current energy development, such as resource constraints, environmental pollution, and climate change, future electric power systems are adopting low-carbon, green, and clean development, which will increase the installed capacity of renewable energy, represented by wind energy. Due to the randomness and volatility of wind energy, grid-connected wind power generation causes fluctuations in grid voltage and frequency, which directly affect power quality and the operation of the power system and also bring many uncertainties to power grid dispatching [1]. Accurate wind power prediction is one of the effective ways to mitigate these effects.
At present, approaches to short-term wind power forecasting are mainly divided into two categories: statistical models based purely on historical data and physical models based on numerical weather prediction (NWP). They are better suited to short-term forecasting 6-72 hours ahead. Although statistical models are relatively concise and fast to compute, their prediction accuracy decreases sharply as the prediction horizon increases. Models based on numerical weather prediction can provide wind power predictions for the next 1 to 3 days. Their prediction accuracy is relatively stable, but the computational load is huge and often requires supercomputers to run continuously for hours [2]. Statistical models that take NWP data as additional exogenous inputs and consider spatial relationships are one of the main research directions for improving wind power prediction accuracy. Among studies of spatial correlation, most of the literature establishes wind speed prediction models based on an analysis of wind velocity and its spatial correlation [2][3][4][5]. In [4], based on historical wind speed statistics, a data-coupled clustering method (the SODCC algorithm) is used to study the temporal evolution and spatial extent of the data statistics by calculating the cluster size probabilities and the node-to-cluster-size probabilities, and a spatiotemporal model of wind speed is established. The method provides guidance for the forecasting analysis of wind farm output. Another approach assumes that low-dimensional structures govern the interactions among a set of historical data from meteorological stations and uses the Wavelet Transform (WT) to decompose the wind speed data into more stationary components. Based on Compressive Sensing (CS) and structured-sparse recovery algorithms, a spatiotemporal model of each subsequence is then established to predict wind speed.
Another line of research concerns the spatial distribution of regional wind farms: it analyzes the spatial-temporal correlation of the output of wind farms and establishes wind power forecasting models based on measured historical data. In [6], based on the geographic spatial distribution of multiple wind farms and their historical time series, probabilistic power forecasting of wind farms is carried out with parametric and nonparametric regression by exploiting the correlation of wind power output at different locations. Using EFO decomposition to extract the characteristics of the regional wind farm, a representative unit of each wind farm is selected to predict the output of that wind farm. Finally, a statistical upscaling method is used to predict the total power of the regional wind farm. At present, most research based on spatial correlation mainly considers changes in wind speed caused by, for example, the geographical location of different wind farms, terrain, wind direction, roughness, temperature, and atmospheric pressure. Literature that directly analyzes the influence of spatial correlation on the output characteristics of different wind turbines within a wind farm is rare.
Ultra-short-term forecasting (typically minutes to hours ahead) relies on time-series models, which use historical wind speed or power measurements and take past values of the predicted variable itself as explanatory variables. They can capture the hidden stochastic characteristics of wind speed or wind power [7]. Reference [8] applied a hybrid model to develop multipoint and single-point prediction for ultra-short-term wind power prediction. Reference [9] proposed a novel hybrid wind power time series prediction model to improve the accuracy of ultra-short-term wind power forecasting. Some studies also examine the impact of wind speed or wind direction on output power [10,11]. In fact, for the same NWP data, the output power of wind turbines depends not only on the above factors but also on the turbines' geographical locations and their own structural characteristics. In this paper, the spatial distribution of the units within the wind farm is considered. First, taking the historical output data of the wind turbines as samples, the fuzzy C-means clustering method is used to classify the units in the wind farm. Second, an RBF neural network prediction model is set up for each class of units, and the prediction results are summed to obtain the total power forecast of the wind farm.

Fuzzy Clustering and RBF Network Forecasting Model

Flow Chart of Two-Step Forecasting Model.
The proposed method considers that wind turbines at different spatial positions contribute differently to the output power of the wind farm. Fuzzy clustering is performed on the measured historical power data of the wind turbines to classify the units. Then, exploiting the nonlinear fitting capability of the RBF network, a sample of historical data from 33 wind turbines in a wind farm is used for training and testing. The flow chart is shown in Figure 1.

Fuzzy C-Means Clustering (FCM).
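Fuzzy clustering is one of the commonly used approaches to data analysis. Fuzzy C-means clustering is adopted here to classify the historical power data of the wind turbines and so discover the output characteristics of different turbines. Consider a sample set of the $n$ wind turbine units in the wind farm. The $j$-th sample is a feature vector $x_j = (x_{j1}, x_{j2}, \ldots, x_{js})$, whose $s$ components are the time series values characterizing the output of the $j$-th unit. All samples are classified into $c$ categories by the fuzzy clustering algorithm [12,13]. The sample set $X$ can be expressed as follows:

$$X = \{x_1, x_2, \ldots, x_n\}. \qquad (1)$$

Let $p_i$ ($i = 1, 2, \ldots, c$) denote the cluster prototype vector of the $i$-th class, and let $P = (p_1, p_2, \ldots, p_c)$ be the cluster prototype matrix. The relationship between each sample and all clusters is represented by the membership matrix $U = [u_{ij}]_{c \times n}$, where $u_{ij}$ represents the degree of membership of the $j$-th sample in the $i$-th cluster and $\sum_{i=1}^{c} u_{ij} = 1$ for every $j$. In the clustering process, the distance-weighted sum of squares from each sample to all cluster centers is taken as the objective function, defined as follows:

$$J(U, P) = \sum_{j=1}^{n} \sum_{i=1}^{c} u_{ij}^{m} d_{ij}^{2}, \qquad d_{ij} = \lVert x_j - p_i \rVert, \qquad (2)$$

where $m$ is the fuzzy coefficient, taken as 2 in this article; $d_{ij}$ is the distance between the wind power history data of the $j$-th unit and the cluster prototype of the $i$-th class; and $J(U, P)$ is the weighted sum of squared errors between the sample data and the cluster prototypes. Given an initial prototype matrix $P^{(0)}$ and a tolerance $\varepsilon$, the specific steps of the FCM algorithm [14] are as follows:
Step 1. Update the membership matrix $U^{(k)}$, which gives the membership value of each sample with respect to each cluster prototype:

$$u_{ij}^{(k)} = \left[ \sum_{r=1}^{c} \left( \frac{d_{ij}^{(k)}}{d_{rj}^{(k)}} \right)^{\frac{2}{m-1}} \right]^{-1}. \qquad (3)$$

If there exist $j$ and $r$ such that $d_{rj}^{(k)} = 0$, set

$$u_{rj}^{(k)} = 1 \quad \text{and} \quad u_{ij}^{(k)} = 0 \ \text{for } i \neq r. \qquad (4)$$

Step 2. Update the cluster prototype matrix $P^{(k+1)}$:

$$p_i^{(k+1)} = \frac{\sum_{j=1}^{n} \left( u_{ij}^{(k)} \right)^{m} x_j}{\sum_{j=1}^{n} \left( u_{ij}^{(k)} \right)^{m}}. \qquad (5)$$

Step 3. Iterate: if $\lVert P^{(k)} - P^{(k+1)} \rVert < \varepsilon$, the algorithm stops and outputs the membership matrix $U$ and the cluster prototype matrix $P$. Otherwise, set $k = k + 1$ and return to Step 1.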
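The iteration in Steps 1 to 3 can be sketched in a few lines of NumPy. This is a minimal illustration, not the implementation used in the study: the function name `fcm`, the optional initial prototypes `P0`, the random choice of initial prototypes when `P0` is omitted, and the default tolerance are all assumptions made for the example. The membership matrix is stored with one row per cluster and one column per sample.

```python
import numpy as np

def fcm(X, c, m=2.0, eps=1e-5, max_iter=300, P0=None, seed=0):
    """Fuzzy C-means sketch. X: (n, s) samples. Returns U (c, n) and P (c, s)."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if P0 is None:
        # Assumption: start from c distinct, randomly chosen samples.
        rng = np.random.default_rng(seed)
        P = X[rng.choice(n, size=c, replace=False)].copy()
    else:
        P = np.asarray(P0, dtype=float).copy()
    for _ in range(max_iter):
        # d[i, j]: Euclidean distance from sample j to prototype i.
        d = np.linalg.norm(X[None, :, :] - P[:, None, :], axis=2)
        zero = d == 0.0
        hit = zero.any(axis=0)  # samples that coincide with a prototype
        U = np.zeros((c, n))
        # Singular case: full membership in the coinciding prototype(s).
        U[:, hit] = zero[:, hit] / zero[:, hit].sum(axis=0)
        # Regular case: u_ij = 1 / sum_r (d_ij / d_rj)^(2 / (m - 1)).
        dr = d[:, ~hit]
        ratio = (dr[:, None, :] / dr[None, :, :]) ** (2.0 / (m - 1.0))
        U[:, ~hit] = 1.0 / ratio.sum(axis=1)
        # Prototype update: membership-weighted mean of the samples.
        Um = U ** m
        P_new = (Um @ X) / Um.sum(axis=1, keepdims=True)
        converged = np.linalg.norm(P_new - P) < eps
        P = P_new
        if converged:
            break
    return U, P

# Toy data (hypothetical): six "units" described by two features each.
X_toy = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                  [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
U_toy, P_toy = fcm(X_toy, c=2, P0=X_toy[[0, 3]])
labels_toy = U_toy.argmax(axis=0)  # hard labels from the largest membership
```

Reading hard labels from the largest membership, as in the last line, is one common convention; the membership values themselves carry more information about units that sit between clusters.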
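In order to evaluate the clustering results of the wind turbines and determine the optimal number of clusters, two evaluation indexes, the partition coefficient $F_c$ and the classification entropy $H_c$, are introduced [15]:

$$F_c = \frac{1}{n} \sum_{j=1}^{n} \sum_{i=1}^{c} u_{ij}^{2}, \qquad H_c = -\frac{1}{n} \sum_{j=1}^{n} \sum_{i=1}^{c} u_{ij} \ln u_{ij}, \qquad (6)$$

where $u_{ij}$ is the membership of the $j$-th of the $n$ units in the $i$-th of the $c$ clusters. $F_c$ is used to evaluate the degree of separation between clusters of different units; the larger the value, the better. $H_c$ is used to evaluate the degree of fuzziness of the clustering of the wind turbines; the smaller the value, the better.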
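Both indexes are simple functions of the membership matrix. The sketch below assumes the standard definitions of the partition coefficient and the classification entropy and a membership matrix stored as a $c \times n$ array (clusters in rows, units in columns); the function names are illustrative only.

```python
import numpy as np

def partition_coefficient(U):
    """Mean over units of the sum of squared memberships (1 = crisp partition)."""
    U = np.asarray(U, dtype=float)
    return float((U ** 2).sum() / U.shape[1])

def classification_entropy(U):
    """Mean over units of -sum(u * ln u), with 0 * ln 0 treated as 0."""
    U = np.asarray(U, dtype=float)
    safe = np.where(U > 0.0, U, 1.0)  # ln(1) = 0, so zero memberships add nothing
    return float(-(U * np.log(safe)).sum() / U.shape[1])

# Hypothetical memberships for two clusters and two units.
crisp = [[1.0, 0.0], [0.0, 1.0]]
fuzzy = [[0.5, 0.5], [0.5, 0.5]]
scores = {name: (partition_coefficient(U), classification_entropy(U))
          for name, U in (("crisp", crisp), ("fuzzy", fuzzy))}
```

Computing both values for each candidate number of clusters and preferring a large partition coefficient together with a small entropy mirrors the selection rule described above.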

RBF Neural Network Prediction Model.
The RBF neural network is a highly efficient multilayer feedforward neural network. Using multidimensional spatial interpolation, it can approximate any nonlinear function. Compared with other feedforward neural networks, the RBF network has good best-approximation performance and global optimality. It is composed of three layers: an input layer, a hidden layer, and an output layer, as shown in Figure 1. $x_j = (x_{j1}, x_{j2}, \ldots, x_{js})$ is the $j$-th input sample with $s$ components, $j = 1, 2, \ldots, n$, where $n$ is the total number of units; $w$ is the connection weight between the hidden layer and the output layer; $h$ is the number of hidden-layer neurons [16,17]. This structure is also illustrated in Figure 2.
The determination of the RBF network structure requires three key parameters: the centers of the basis functions, the variance, and the connection weights from the hidden layer to the output layer. The parameters are solved as follows:
Step 1. The centers of the basis functions are obtained by the K-means clustering method. First, the network is initialized: $k$ training samples are randomly selected as the initial cluster centers $c_i$ ($i = 1, 2, \ldots, k$; one center per hidden-layer neuron), the Euclidean distance between each input sample $x_j$ and each cluster center $c_i$ is calculated, and the samples are assigned to clusters according to the nearest neighbor rule. Second, the cluster centers are readjusted: the average of the samples in each cluster is taken as the new cluster center. If the new cluster centers no longer change, the calculation stops and the resulting centers are taken as the centers of the basis functions; otherwise, the procedure returns to the previous step.
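Step 2. The basis function of the RBF network is the Gaussian function, and its variance parameter $\sigma_i$ is obtained as follows:

$$\sigma_i = \frac{c_{\max}}{\sqrt{2h}}, \qquad (7)$$

where $i = 1, 2, \ldots, h$, $h$ is the number of hidden-layer neurons, and $c_{\max}$ is the maximum distance between the selected basis function centers.
Step 3. The connection weights from the hidden layer to the output layer can be calculated directly using the least squares method:

$$w = \Phi^{+} y, \qquad \Phi_{ji} = \exp\left( -\frac{h}{c_{\max}^{2}} \lVert x_j - c_i \rVert^{2} \right), \qquad (8)$$

where $\Phi$ is the matrix of hidden-layer outputs for the training samples $x_j$ and centers $c_i$, $\Phi^{+}$ is its pseudoinverse, and $y$ is the vector of target outputs.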
The mapping from the input layer to the hidden layer of the RBF network is nonlinear, whereas the mapping from the hidden layer to the output layer is linear. The centers $c_i$ and weights $w$ are adjusted according to the error between the network output and the target output, and the internal coefficients of the network are updated accordingly through repeated iterations. When the output error of the network reaches the preset accuracy requirement, the network terminates the calculation and outputs the predicted value.
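The three parameter-solving steps can be put together in a short NumPy sketch. It is an illustration under stated assumptions, not the network used in the case study: the function names are hypothetical, a single shared Gaussian width $\sigma = c_{\max} / \sqrt{2h}$ is used for all hidden neurons, the weights come from a single least-squares solve, and the iterative error-driven adjustment described above is omitted. At least two distinct centers are needed so that $c_{\max} > 0$.

```python
import numpy as np

def kmeans_centers(X, k, max_iter=100, seed=0):
    """Step 1: plain K-means; returns k centers of shape (k, s)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(max_iter):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)  # nearest-center assignment
        new = centers.copy()
        for i in range(k):
            members = X[labels == i]
            if len(members):  # an empty cluster keeps its old center
                new[i] = members.mean(axis=0)
        if np.allclose(new, centers):
            break
        centers = new
    return centers

def hidden_outputs(X, centers, sigma):
    """Gaussian hidden-layer outputs, shape (n, h)."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def train_rbf(X, y, h, seed=0):
    centers = kmeans_centers(X, h, seed=seed)
    # Step 2: width from the largest distance between any two centers.
    diff = centers[:, None, :] - centers[None, :, :]
    c_max = np.sqrt((diff ** 2).sum(axis=2)).max()
    sigma = c_max / np.sqrt(2.0 * h)
    # Step 3: output weights by linear least squares.
    Phi = hidden_outputs(X, centers, sigma)
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return centers, sigma, w

def predict_rbf(X, centers, sigma, w):
    return hidden_outputs(X, centers, sigma) @ w

# Toy regression problem (hypothetical data, not wind power measurements).
X_demo = np.linspace(0.0, 2.0 * np.pi, 60).reshape(-1, 1)
y_demo = np.sin(X_demo).ravel()
centers_demo, sigma_demo, w_demo = train_rbf(X_demo, y_demo, h=10)
y_fit = predict_rbf(X_demo, centers_demo, sigma_demo, w_demo)
```

Because the output layer is linear in the weights, the least-squares solve is exact for fixed centers and width, which is why only the centers and the width need an iterative or heuristic choice.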

Case Study
Twelve months of historical power data from 33 wind turbines measured at a wind farm in northern China were selected; the capacity of each unit is 1.5 MW. The power curves are shown in Figure 3. It can be seen that the generated power of the 33 wind turbines is correlated along the time axis (horizontally) and also shows a certain correlation with the spatial distribution of the units (longitudinally).
Taking the 12 months of historical power data as the input to fuzzy clustering, the wind turbines are clustered and grouped. Figure 4 shows the membership matrix curves when the units are divided into 2 clusters, and Figure 5 shows the membership matrix curves when the units are divided into 3 clusters.
For different numbers of clusters $c$, the membership matrix values are shown in Figures 4 and 5. The two index values are calculated by formula (6) and are shown in Table 1.
According to the membership matrix and Table 1, when the number of clusters is 2, the partition coefficient $F_c$ is large and the classification entropy $H_c$ is small. Therefore, it is better to divide the wind turbines into 2 clusters. The first group includes 14 units, and the second group includes 19 units. The 10-minute historical data of the wind farm for March 2017, 733 sampling points in total, are used for RBF neural network modeling and prediction. The error goal of the objective function is set to 0.001, the spread constant (sc) is 3, the maximum number of neurons (MN) is 20, and the display frequency (DF) is 1. RBF neural network prediction is carried out for each cluster group and for the entire wind farm, respectively. The prediction curves are shown in the following figures. Figure 6 is the RBF prediction curve for the first cluster, and Figure 7 is the RBF prediction curve for the second cluster. Figure 8 depicts the RBF prediction curve for all the wind turbines in the wind farm.
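One way to turn the clustering result into training data for the per-group networks is sketched below. Several points are assumptions made for illustration and are not stated in the text: each unit is assigned to the cluster with its largest membership, the power of a group is the sum over its member units, and the network inputs are the most recent `n_lags` values of the group power series. The function names and the toy numbers are hypothetical.

```python
import numpy as np

def group_power(power, U):
    """power: (T, n) per-unit power; U: (c, n) memberships. Returns (T, c)."""
    labels = U.argmax(axis=0)  # hard assignment by largest membership
    return np.stack([power[:, labels == i].sum(axis=1)
                     for i in range(U.shape[0])], axis=1)

def lagged_samples(series, n_lags):
    """Build (inputs, targets) where each target follows n_lags past values."""
    X = np.stack([series[i:i + n_lags] for i in range(len(series) - n_lags)])
    y = series[n_lags:]
    return X, y

# Toy example: 4 time steps, 3 units, 2 clusters (hypothetical numbers).
power_toy = np.arange(12.0).reshape(4, 3)
U_toy2 = np.array([[0.9, 0.2, 0.6],
                   [0.1, 0.8, 0.4]])
grouped = group_power(power_toy, U_toy2)        # one column per group
X_lag, y_lag = lagged_samples(np.arange(6.0), n_lags=3)
```

With 10-minute data, each row of `X_lag` would hold the previous `n_lags` readings and `y_lag` the reading one step ahead; the appropriate number of lags is a modeling choice.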
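In wind power forecasting, the commonly used evaluation indexes are the root mean squared error (RMSE) and the mean absolute error (MAE). They are defined as follows [18][19][20]:

$$e_{\mathrm{RMSE}} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} e_i^{2}}, \qquad e_{\mathrm{MAE}} = \frac{1}{N} \sum_{i=1}^{N} \lvert e_i \rvert, \qquad (9)$$

where $e_i = y_i - \hat{y}_i$, $y_i$ and $\hat{y}_i$ are the actual value and the predicted value, respectively, and $N$ is the number of sampling points.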
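As a small worked example of the two error measures, the following sketch assumes the standard, unnormalized definitions of RMSE and MAE; the sample values are hypothetical.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error of predicted against actual values."""
    errors = [a - p for a, p in zip(actual, predicted)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def mae(actual, predicted):
    """Mean absolute error of predicted against actual values."""
    errors = [a - p for a, p in zip(actual, predicted)]
    return sum(abs(e) for e in errors) / len(errors)

# Hypothetical values: two exact predictions and one that is off by 2.
actual_demo = [1.0, 2.0, 3.0]
predicted_demo = [1.0, 2.0, 5.0]
rmse_demo = rmse(actual_demo, predicted_demo)  # sqrt(4 / 3)
mae_demo = mae(actual_demo, predicted_demo)    # 2 / 3
```

The single large error weighs more heavily in the RMSE than in the MAE, which is why the two are usually reported together.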
An RBF neural network prediction model is set up for each group of units, and the prediction error analysis is shown in Table 2. The predicted values of the two groups of wind turbines are summed with equal weights to obtain the output of the combined model for the wind farm. As Table 2 shows, the error of the combined model based on two groups is lower than that of the single model.
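The equal-weight combination amounts to adding the group forecasts time step by time step. A minimal sketch follows; the function name and the forecast values are hypothetical and are not results from the case study.

```python
def combine_forecasts(group_forecasts):
    """Sum the per-group forecasts at each time step (equal weights)."""
    return [sum(values) for values in zip(*group_forecasts)]

# Hypothetical forecasts in MW for three time steps.
group_1 = [4.0, 5.5, 6.0]
group_2 = [7.0, 6.5, 8.0]
farm_forecast = combine_forecasts([group_1, group_2])
```

The combined series can then be scored against the measured farm output with the same error measures as the single-model forecast.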
For comparison with the RBF neural network prediction model, the error curves of the ARIMA forecast model are shown in Figure 9.
According to the above comparison and analysis, the prediction error of the ARIMA model is larger than that of the ultra-short-term prediction model proposed in this paper. The two-step ultra-short-term prediction approach can effectively improve the accuracy of wind power forecasting.
Advances in Fuzzy Systems

Conclusions
In a wind farm, the power of the generators is derived from wind energy. The power output of the wind farm is affected by, for example, wind speed, wind direction, and the wake effect of the units, and the outputs of the units influence one another. Based on the output of the wind turbines and taking into account the uncertain relationship between these factors, fuzzy clustering and an RBF neural network are combined to establish a two-step prediction model. The different contributions of wind turbines at different spatial positions to the power of the wind farm and the correlation of the wind power time series are also considered. Compared with the ARIMA forecast model and the single RBF model, the case study verifies that the two-step forecasting method proposed in this paper can effectively improve the precision of ultra-short-term power prediction and has practical value in engineering.

Figure 1: Fuzzy clustering and RBF network prediction model flow diagram.

Figure 4: Membership matrix value curves of two types of units.

Figure 5: Membership matrix value curves of three types of units.

Figure 6: The first group power forecast curve.

Figure 7: The second group power forecast curve.

Figure 8: All wind turbines power forecast curve.

Figure 9: Wind power forecast error curves based on ARIMA model.


Table 1: Evaluation indicators of clustering results.

Table 2: Wind power forecast error comparison analysis.