Ultra-Short-term Power Prediction of the Wind Farm Based on Multivariate Data Combination

Accurate wind power prediction is an important means of promoting large-scale grid connection of wind power. First, to address the abnormal measured data caused by wind curtailment and power limiting, the DBSCAN method is used to pre-process the measured wind farm data and eliminate the abnormal points. Then, a short-term wind power prediction model based on a weighted combination of GA-LSSVM and ARIMA is established, and the Lagrange multiplier method is used to obtain the weight of each single model in the combined model, from which the wind power prediction results are obtained. Finally, the effectiveness of the proposed method is verified by a numerical example, and the results show that the proposed model and method can effectively improve the prediction accuracy of short-term wind power.


Introduction
Wind power prediction methods can be classified by time scale, research object, historical data, characteristics of the prediction model, and implementation method, but academia and industry have not yet agreed on a specific, unified standard. Currently, short-term wind power forecasting methods for wind farms mainly include numerical weather prediction (NWP) methods, physical prediction methods and statistical prediction methods [1]. The numerical weather prediction method [2] starts from the actual state of the atmosphere, numerically solves the equations of thermodynamics and fluid mechanics that describe the evolution of the weather, and uses a trained model to predict short-term wind power output. The physical prediction method [3] uses data such as meteorological parameters and geographic information of the wind farm to calculate the wind direction and wind speed at the hub height of the turbines, and then obtains the short-term output power from the wind farm's power curve, which relates output power to wind speed. The statistical prediction method [4] constructs a mapping from inputs such as wind farm measurement data and numerical weather forecasts to the wind power, capturing the temporal and spatial correlations in the data in order to predict the wind power. Researchers at home and abroad have recognized that the accuracy of a single prediction model is limited and struggles to meet the needs of actual grid-connected operation of wind farms [5][6][7][8].
This paper first uses density-based spatial clustering of applications with noise (DBSCAN) to preprocess the sample data, and then applies a wind power prediction method based on a weighted combination of a genetic algorithm-least squares support vector machine (GA-LSSVM) model and an autoregressive integrated moving average (ARIMA) model. The Lagrange multiplier method is used to obtain the weights of the combined model so as to achieve accurate wind power prediction. Finally, a case study is analyzed and calculated.

Abnormal Data Elimination Based on the DBSCAN Algorithm
DBSCAN is the most widely used density-based clustering algorithm. It can find clusters of arbitrary shape, excludes noise points automatically, and does not require the number of clusters to be specified in advance, which avoids the subjective choice of the number of clusters that k-means requires. It uses neighborhood parameters to characterize how closely the samples are distributed. The process of the DBSCAN algorithm is as follows.
(2) Compute the Euclidean distance between each point and all other points.
(3) Set the values of the parameters ε and MinPts, find all core points, and establish a mapping between core points.
(4) Based on the set of core points obtained and the radius ε, determine the core points that can be connected to one another.
(5) Group each set of connected core points into a cluster, and then assign the points within the neighborhood of those core points to the same cluster.
(6) Repeat steps (4)-(5) until no more core points are found.
The measured power data of one wind turbine over one month are taken, and the data processing method proposed in this paper is used to correct the original data. Figure 1 shows the original data, and Figure 2 shows the processed data.
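The clustering steps above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the paper: the function name `dbscan`, the sample points (wind speed in m/s, power in kW) and the values of `eps` and `min_pts` are all assumptions chosen for the example, and only the standard library is used so that the sketch runs anywhere.

```python
import math


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    n = len(points)
    # Neighbourhood of each point: indices within Euclidean distance eps
    # (a point counts as its own neighbour).
    neighbours = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    # Core points have at least min_pts points in their eps-neighbourhood.
    is_core = [len(nb) >= min_pts for nb in neighbours]

    labels = [-1] * n
    cluster = 0
    for i in range(n):
        if not is_core[i] or labels[i] != -1:
            continue
        # Grow a new cluster from an unassigned core point.
        labels[i] = cluster
        stack = [i]
        while stack:
            p = stack.pop()  # p is always a core point
            for q in neighbours[p]:
                if labels[q] == -1:
                    labels[q] = cluster
                    if is_core[q]:  # only core points extend the cluster
                        stack.append(q)
        cluster += 1
    return labels


# Hypothetical samples: five normal operating points and one curtailed point.
samples = [(5.0, 100.0), (5.1, 102.0), (5.2, 104.0),
           (5.3, 103.0), (5.4, 106.0), (9.0, 0.0)]
labels = dbscan(samples, eps=5.0, min_pts=3)
cleaned = [p for p, lab in zip(samples, labels) if lab != -1]
```

Points left with the label -1 belong to no dense region and are treated as abnormal data. In practice the two coordinates would usually be rescaled before computing Euclidean distances, since wind speed and power have very different ranges.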

Wind Power Prediction Modeling
According to the basic features of short-term wind power prediction, a prediction scheme is proposed, as shown in Figure 3. The input data consist of meteorological department data and wind farm data. The meteorological data include meteorological forecast data, measured meteorological data, climate background data of the wind farm and topographic data of the wind farm site. The wind farm data include wind turbine parameters, wind speed data measured by the wind measurement tower and operating condition data of the wind turbines. These data are used as sample data for wind power prediction, and the DBSCAN algorithm is used to pre-process them. After pre-processing, the data are fed into the wind power prediction system, which can perform both ultra-short-term and short-term wind power prediction, and the corresponding prediction algorithms are given. This paper focuses on ultra-short-term wind power prediction and analyzes GA-LSSVM and ARIMA in detail.
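The LSSVM regression function can be written as:

$$ f(x) = w^{T}\varphi(x) + b $$

where $\varphi(x)$ is the kernel-space mapping function, $w$ is the weight vector, and $b$ is the bias.
Based on the principle of structural risk minimization, the parameter optimization model of LSSVM can be expressed as:

$$ \min_{w,b,\zeta} J(w,\zeta) = \frac{1}{2} w^{T} w + \frac{\gamma}{2} \sum_{i=1}^{N} \zeta_i^{2} \quad \text{s.t.} \quad y_i = w^{T}\varphi(x_i) + b + \zeta_i $$

where $\zeta_i$ is the relaxation factor, $\gamma$ is the penalty coefficient, and $i = 1, 2, \ldots, N$ indexes the samples. The Lagrange function is constructed as follows:

$$ L(w,b,\zeta,\alpha) = J(w,\zeta) - \sum_{i=1}^{N} \alpha_i \left[ w^{T}\varphi(x_i) + b + \zeta_i - y_i \right] $$

where $\alpha_i$ is the Lagrange multiplier.
Setting the partial derivatives of $L$ to zero and eliminating $w$ and $\zeta$ gives the linear system:

$$ \begin{bmatrix} 0 & l^{T} \\ l & K + \gamma^{-1} I \end{bmatrix} \begin{bmatrix} b \\ \alpha \end{bmatrix} = \begin{bmatrix} 0 \\ y \end{bmatrix} $$

where $l = [1, 1, \ldots, 1]^{T}$, $\alpha = [\alpha_1, \ldots, \alpha_N]^{T}$, $y = [y_1, \ldots, y_N]^{T}$, and $K$ is the matrix with entries $K(x_i, x_j)$, the kernel function satisfying Mercer's condition. The radial basis function (RBF) is selected as the kernel function of LSSVM, which can be expressed as follows:

$$ K(x_i, x_j) = \exp\left( -\frac{\lVert x_i - x_j \rVert^{2}}{2\sigma^{2}} \right) $$

The LSSVM model can then be obtained as follows:

$$ f(x) = \sum_{i=1}^{N} \alpha_i K(x, x_i) + b $$

The performance of the LSSVM model depends largely on the value of the penalty coefficient $\gamma$ and the kernel parameter $\sigma$. In this paper, GA is used to optimize $\gamma$ and $\sigma$; the algorithm is mature, and its principle is described in [11]. The calculation process proposed in this paper is shown in Figure 1.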
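As a rough illustration of how an LSSVM with an RBF kernel is trained, the sketch below builds the standard LSSVM linear system in b and the multipliers α and solves it by Gaussian elimination. It assumes the common formulation with kernel exp(-||x_i - x_j||² / (2σ²)) and regularisation term I/γ; the function names, the toy data and the fixed values of `gamma` and `sigma` are hypothetical, and the GA search over γ and σ used in the paper is not shown.

```python
import math


def rbf(a, b, sigma):
    """RBF kernel K(a, b) = exp(-||a - b||^2 / (2 * sigma^2))."""
    sq = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq / (2.0 * sigma ** 2))


def solve(A, rhs):
    """Solve A z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(M[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (M[r][n] - s) / M[r][r]
    return z


def lssvm_fit(X, y, gamma, sigma):
    """Return (alpha, b) solving [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = len(X)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        A.append([1.0] + [
            rbf(X[i], X[j], sigma) + (1.0 / gamma if i == j else 0.0)
            for j in range(n)
        ])
    z = solve(A, [0.0] + list(y))
    return z[1:], z[0]


def lssvm_predict(x, X, alpha, b, sigma):
    """Evaluate f(x) = sum_i alpha_i * K(x, x_i) + b."""
    return sum(a * rbf(x, xi, sigma) for a, xi in zip(alpha, X)) + b


# Toy one-dimensional training set (illustrative values only).
X = [(0.0,), (1.0,), (2.0,), (3.0,)]
y = [0.0, 0.8, 0.9, 0.1]
gamma, sigma = 10.0, 1.0
alpha, b = lssvm_fit(X, y, gamma, sigma)
fitted = [lssvm_predict(x, X, alpha, b, sigma) for x in X]
```

Because the constraints are equalities, training reduces to one linear solve rather than a quadratic program; a GA would wrap `lssvm_fit` and score each candidate pair (γ, σ) on validation error.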

Wind Power Prediction Model Based on ARIMA
Wind power data form a typical non-stationary, non-linear time series. Common linear time series models include the autoregressive (AR) model and the moving average (MA) model, and the ARIMA model combines the two. AR and MA models are used for stationary time series, while the ARIMA model can handle non-stationary time series. Differencing subtracts from the value of the sequence at time t its lagged value, and a random non-stationary time series can be converted into a stationary one by differencing. Assume that the non-stationary sequence $\{y(t): t \in T\}$ yields the stationary sequence $x(t)$ after $d$ differences, where $d$ is the minimum number of differences that makes $x(t)$ stationary, that is, the optimal number of differences. This can be represented as:

$$ x(t) = (1 - B)^{d} y(t), \qquad d = \min\{\, d : \mathrm{adftest}[x(t)] = 1 \,\} $$

where $B$ is the lag operator, $B^{j} y(t) = y(t - j)$, $d$ is the number of differences, and $\mathrm{adftest}[x(t)] = 1$ indicates that the series is stationary. After differencing, the autoregressive moving average model of the stationary time series $x(t)$ can be established. The specific model is as follows:

$$ x(t) = \theta_0 + \sum_{i=1}^{p} \varphi_i\, x(t - i) + a(t) - \sum_{i=1}^{q} \theta_i\, a(t - i) $$

where $\varphi_i$ is the autoregressive parameter, $\theta_i$ is the moving average parameter, $\theta_0$ is a deterministic constant greater than 0, $p$ is the autoregressive order, $q$ is the moving average order, and $a(t)$ is white noise with zero mean.
Denote the predicted values of the differenced sequence by $\hat{x}(t)$. The time series of the short-term wind power prediction, $\hat{y}(t)$, is then recovered from $\hat{x}(t)$ by inverting the $d$-th order difference $x(t) = (1 - B)^{d} y(t)$. The specific operation is as follows:

$$ \hat{y}(t) = \hat{x}(t) - \sum_{j=1}^{d} (-1)^{j} \binom{d}{j}\, y(t - j) $$
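The differencing step and its inversion can be illustrated with a short sketch. It only shows how a forecast of the differenced series is mapped back to the original power series; fitting the ARMA part is not shown, and the function names and the toy series are assumptions made for this example.

```python
from math import comb


def difference(y, d):
    """Apply (1 - B)^d to the series y; the result is d samples shorter."""
    x = list(y)
    for _ in range(d):
        x = [x[t] - x[t - 1] for t in range(1, len(x))]
    return x


def undifference_next(history, x_hat, d):
    """One-step forecast of y from a forecast x_hat of the differenced series.

    From (1 - B)^d y(t) = x(t) it follows that
    y(t) = x(t) - sum_{j=1..d} (-1)^j * C(d, j) * y(t - j).
    """
    return x_hat - sum(
        (-1) ** j * comb(d, j) * history[-j] for j in range(1, d + 1)
    )


# Toy series: squares of 1..5, whose second difference is the constant 2.
y = [1.0, 4.0, 9.0, 16.0, 25.0]
x = difference(y, d=2)
# Pretend the ARMA model forecasts the next differenced value as 2.0.
y_next = undifference_next(y, x_hat=2.0, d=2)
```

Here the recovered value is the next square, 36, which confirms that the inversion is consistent with the differencing. For forecasts several steps ahead, the most recent entries of `history` would be earlier forecasts rather than measurements.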

Combined Wind Power Prediction Model
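Since different prediction models contribute differently to the result, this paper carries out a weighted combination of the single-model predictions:

$$ \hat{P} = \sum_{i=1}^{n} k_i \hat{P}_i, \qquad \sum_{i=1}^{n} k_i = 1 $$

where $n$ is the total number of prediction models, $\hat{P}_i$ is the power predicted by model $i$, and $k_1, k_2, \ldots, k_n$ are the weight coefficients corresponding to the predicted powers. Assume that the prediction errors of the models are $e_1, e_2, \ldots, e_n$, with variances $\sigma_1^{2}, \sigma_2^{2}, \ldots, \sigma_n^{2}$. The weight coefficients are determined by taking the minimum variance of the combined prediction error as the objective function, which should meet the following requirements:

$$ \min \operatorname{Var}(e) = \sum_{i=1}^{n} k_i^{2} \sigma_i^{2} + 2 \sum_{i<j} k_i k_j \operatorname{Cov}(e_i, e_j) \quad \text{s.t.} \quad \sum_{i=1}^{n} k_i = 1 $$

where Cov is the covariance. The Lagrange method is used to find the minimum of the prediction variance, from which the weight coefficients $k_i$ of the combined model are obtained. The Lagrange function is:

$$ L(k, \lambda) = \operatorname{Var}(e) + \lambda \left( \sum_{i=1}^{n} k_i - 1 \right) $$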
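For the two-model case considered here (GA-LSSVM and ARIMA), minimising the variance of the combined error subject to the weights summing to one has a closed-form solution, k1 = (σ2² - σ12) / (σ1² + σ2² - 2σ12) and k2 = 1 - k1, where σ12 is the covariance of the two error series. The sketch below applies it to two hypothetical error series; the function names and all numbers are illustrative and are not taken from the paper.

```python
def covariance(a, b):
    """Sample covariance of two equally long series (variance when a is b)."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / (len(a) - 1)


def min_variance_weights(e1, e2):
    """Weights (k1, k2), k1 + k2 = 1, minimising Var(k1*e1 + k2*e2)."""
    v1, v2, c12 = covariance(e1, e1), covariance(e2, e2), covariance(e1, e2)
    # Undefined when the two error series are identical (denominator is zero).
    k1 = (v2 - c12) / (v1 + v2 - 2.0 * c12)
    return k1, 1.0 - k1


def combine(p1, p2, k1, k2):
    """Weighted combination of two prediction series."""
    return [k1 * a + k2 * b for a, b in zip(p1, p2)]


# Hypothetical historical prediction errors of the two single models.
e1 = [1.0, -1.0, 1.0, -1.0]   # the more accurate model
e2 = [2.0, 2.0, -2.0, -2.0]   # the less accurate model
k1, k2 = min_variance_weights(e1, e2)
e_combined = combine(e1, e2, k1, k2)
var_combined = covariance(e_combined, e_combined)
```

With these series the more accurate model receives weight 0.8, and the variance of the combined error is below that of either single model. Nothing in this formula forces the weights to be non-negative, so strongly correlated errors can give a negative weight.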

Algorithm Validation
The experimental sample is the power series collected by the SCADA system of a 49.5 MW wind farm from June 1 to June 30 in 2018, 2019 and 2020. The sampling interval of the input wind power is 15 min, giving a total of 8640 power sampling points. The first 8592 points are selected as the model experimental sample, and the last 16 points in 2020 are the test sample. Before wind power prediction, the DBSCAN method was used to pre-process the data, with the parameters ε and MinPts set to 20 kW and 5, respectively. To quantify and analyze the forecast accuracy of the different forecasting methods, the normalized root-mean-squared error (NRMSE), normalized mean absolute error (NMAE) and normalized absolute error (NAE) are used in this paper.
To verify the effectiveness of the proposed method and demonstrate the advantage of the combined prediction method, the GA-LSSVM prediction, the ARIMA prediction and the proposed method are compared directly. The calculation results are shown in Figure 4.
Figure 4. Comparison of wind power prediction results
As can be seen in Figure 4, the forecast value of the combined weighted model is closer to the actual value, and its prediction accuracy is higher than that of the other methods. This shows that the proposed combined weighted wind power prediction model is reasonable. The specific calculation results are shown in Table 1. The comparison in Table 1 shows that the prediction errors of the proposed model are smaller than those of the other two methods. The single ARIMA model has the worst prediction performance and is not suitable for short-term wind power forecasting on its own; it needs to be effectively combined with other methods.
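The three error measures can be sketched as follows. The exact formulas used in the paper are not reproduced in the text, so this example assumes the common convention of normalising by the installed capacity of the wind farm; the function names and the sample values are hypothetical.

```python
import math


def nrmse(actual, predicted, capacity):
    """Root-mean-squared error divided by the installed capacity."""
    n = len(actual)
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n
    return math.sqrt(mse) / capacity


def nmae(actual, predicted, capacity):
    """Mean absolute error divided by the installed capacity."""
    n = len(actual)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / n / capacity


def nae(actual, predicted, capacity):
    """Point-by-point absolute error divided by the installed capacity."""
    return [abs(a - p) / capacity for a, p in zip(actual, predicted)]


# Illustrative values in MW with an assumed capacity of 50 MW.
actual = [10.0, 20.0, 30.0]
predicted = [12.0, 18.0, 33.0]
capacity = 50.0
score_nrmse = nrmse(actual, predicted, capacity)
score_nmae = nmae(actual, predicted, capacity)
score_nae = nae(actual, predicted, capacity)
```

NRMSE penalises large individual errors more heavily than NMAE, so reporting both gives a view of typical and worst-case behaviour, while NAE shows the error at each test point.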

Summary
In this paper, we propose combining GA-LSSVM and ARIMA for wind power forecasting, obtain the forecast by weighting the two models, and compare the results with those of the traditional single models through a numerical example. The combined model is suitable for wind power prediction.