An Advanced Hybrid Forecasting System for Wind Speed Point Forecasting and Interval Forecasting

Ultra-short-term wind speed prediction can assist the operation and scheduling of wind turbines in the short term and further reduce the adverse effects of wind power integration. However, because wind is irregular, nonlinear, and nonstationary, accurately predicting wind speed is a difficult task. Researchers have made many attempts to this end; however, they often use only point forecasting or only interval forecasting, resulting in imperfect prediction results. Therefore, in this paper, we developed a prediction system integrating an advanced data preprocessing strategy, a novel optimization model, and multiple prediction algorithms. This combined forecasting system can overcome the inherent disadvantages of the traditional forecasting methods and further improve the prediction performance. To test the effectiveness of the forecasting system, 10-min and one-hour wind speed sequences from the Sotavento wind farm in Spain were used in comparison experiments. The results of both the interval forecasting and point forecasting indicated that, in terms of forecasting capability and stability, the proposed system was better than the compared models. Therefore, because of its minimum prediction error and excellent generalization ability, we consider this forecasting system to be an effective tool to assist smart grid programming.


Introduction
Along with the rapid development of modern industry, the issues of environmental deterioration and fossil fuel depletion are becoming increasingly serious. Seeking clean and renewable energy is a hot topic worldwide. Across the globe, the scale of wind power generation is growing rapidly, and wind energy has attracted considerable attention [1].
However, when wind power is connected to the grid, its intermittence and volatility affect the balance of power demand and supply, the security of the power system, and the power quality. Therefore, in order to ensure the stability of the wind power system and promote the large-scale integration of wind power, it is necessary to predict short-term wind power with high accuracy [2]. To obtain more accurate prediction results, a great deal of effort has been made and many studies have been performed. On the basis of the applied data processing models, we summarized the previous research into four classes: (a) physical class, (b) statistical class, (c) spatial correlation class, and (d) artificial intelligence class [3].
Generally speaking, the physical strategy uses atmospheric data and numerical weather simulation to predict the wind speed. Cassola and Burlando [4] used an improved numerical weather prediction (NWP) model to provide a 2-year-long wind speed dataset for two wind farms in eastern Liguria (Italy). Zhao et al. [5] put forward a novel multistep forecasting method based on the weather research and forecasting model for predicting the one-day-ahead wind speed and power. Dong et al. [6] proposed a new clustering analysis model for NWP information, which showed that the clustering analysis of NWP can be effectively used in the field of wind speed forecasting. Yang and Tsai [7] developed two linear regression models and a microgenetic algorithm (MGA) model to predict the gust wind speed during a typhoon. NWP needs to observe not only the conventional data of temperature, humidity, wind pressure, and visibility but also the cloud top conditions, ozone, solar radiation, sea surface temperature, and so on, and it also requires the use of supercomputers. Therefore, it is not well suited to short-term prediction problems.
Compared with the physical strategy, the statistical strategy has clear advantages in addressing short-term forecasting problems due to the full use of historical data. From the perspective of existing statistical methods, exponential smoothing (ES), ARMA, GARCH, and other typical models are universally employed and can achieve reliable prediction effects.
Kavasseri and Seetharaman [8] applied ARIMA for forecasting wind speeds on the day-ahead (24 h) and two-day-ahead (48 h) horizons, which resulted in a better forecasting accuracy. Qian and Wang [9] proposed an improved seasonal gray prediction model (GM (1,1)) to forecast the wind generation of 2020 and 2021 in China for further development of the wind power industry. The spatial correlation algorithm takes the spatial relationship between wind speeds at different locations into consideration. For instance, Chen et al. [10] examined the application of a multifactor spatial-temporal correlation model and achieved an improvement over other benchmark models. Nevertheless, these models had to measure the wind speed at multiple spatially correlated points. Because of the strict measurement requirements and time delay, these requirements are difficult to meet.
Apart from the abovementioned models, artificial intelligence algorithms have attracted a great deal of attention since their development. Different from the previously mentioned strategies, artificial neural networks are completely dependent on historical data, which means that the models have good fault tolerance, strong robustness, and adaptability in wind speed prediction. In addition, artificial neural networks are good at dealing with nonlinear data, self-adaptive adjustment, determination of prediction mode, fuzzy relationship judgments, and so on. Scholars have used artificial intelligence algorithms to forecast wind speed, using SVM [11,12], ANN [13-15], fuzzy logic (FL) methods [16,17], and so on.
Some applications of artificial intelligence algorithms from previous works are described below. Guo et al. [18] proposed a new method based on BPNN combined with seasonal exponential adjustment, which greatly improved the wind speed forecasting accuracy. Liu et al. [19] designed a novel prediction method based on the Elman neural network (ENN) and a secondary decomposition method, which demonstrated excellent abilities in multistep wind speed prediction.
At present, artificial neural networks are often combined with other intelligent algorithms. Yang et al. [20] proposed a new forecasting system for electricity market management with multistep electricity price forecasting. A deep learning framework with two-stage feature selection was developed by Niu et al. [21] for multivariate financial time series prediction. Although the abovementioned models achieved satisfactory results, the neural network model still has some inherent defects, such as a tendency to overfit, a tendency to fall into local optima, and slow convergence.
However, the majority of the methods only produce point forecasts and ignore forecast volatility, which makes it difficult to meet the needs of a wind system and its risk management. Uncertainty always accompanies the forecast process and has an important impact on the decision-making process of the energy system [22]. Therefore, many researchers have studied wind speed interval forecasting models, which use the interval forecast to quantify this uncertainty. Based on the existing literature, interval forecasting models can be summed up in four types: point forecasting error models, bootstrap models, nonparametric theory models, and quantile regression models. The quantile regression method relies on quantile regression, within which each component must be modelled [23]. Bootstrap is a statistical method that applies sampling with replacement to evaluate the robustness of statistics, including the confidence interval of a parameter, the standard error, the correlation coefficient, and the regression coefficient [24]. In addition, an interval forecasting model that applies both ANNs and lower upper bound estimation (LUBE) was established to construct high-quality forecasting intervals in a short time [25].
By reviewing the previous literature, we concluded that there is no single model that can currently be regarded as the best model for wind speed prediction [26]. A more intuitive comparison of all models is drawn in Table 1.
From the above analysis, a conclusion can be drawn that each method has inherent disadvantages due to its particular characteristics. To address this, the combined forecasting theory, first proposed by Bates and Granger [27] in 1969, significantly improves on the prediction performance of a single model and has been widely used since. Accordingly, a novel combined system is put forward in this paper for wind speed prediction. The system integrates an advanced data pretreatment theory, a new optimization technique, and several prediction methods, namely, ARIMA, GRNN, BPNN, and BiLSTM.
Specifically, in contrast to the previous subjective judgment of the number of denoising layers, we performed a weighted reconstruction of the sequences obtained through decomposition to effectively control the interference of noise data. To capture the linear and nonlinear characteristics hidden in the collected wind speed sequence, BPNN, GRNN, BiLSTM, and ARIMA were selected as submodels to forecast the pretreated series. Then, an advanced weight-deciding technology, the nondominated neighbor immune algorithm (NNIA), was introduced to determine the output results.
This paper makes up for the deficiencies in the previous research in the following aspects: (1) In contrast to the previous processing ideology, this paper proposes a more effective data preprocessing theory for controlling the interference of noise data. According to the decomposition and reconstruction theory, the collected time series was decomposed into a set of series with different frequencies, and these series were then reconstructed with combination weights determined by NNIA. Through such preprocessing, the randomness and volatility of a wind speed sequence are successfully alleviated.
(2) To fully capture all data characteristics and make up for the defects of individual methods, one linear algorithm and three neural networks were employed as submodels to build a combination forecasting system. Specifically, ARIMA performs well in capturing linear information, while BPNN, GRNN, and BiLSTM are effective at grasping nonlinear characteristics. The combination system not only integrates the advantages of these submodels but is also more suitable for wind speed time series with complex characteristics.
(3) To comprehensively compare the capability of different models, we established an efficient assessment system comprising a point prediction module and an interval prediction module. This system contained eight assessment indicators, five of which were used to indicate point forecasting errors, while three were employed for the accuracy of interval prediction.
(4) The proposed combination system, with its high accuracy and stability, is a powerful tool for the smart grid. On the basis of the dataset from the Sotavento wind farm in Spain, the simulation results demonstrated that, relative to other benchmark models, the system achieved reliable improvements.
The structure of this article is as follows. Section 2 provides the modelling structure and the related methodologies. In Section 3, we introduce the implementation process of the three experiments and the obtained results. Section 4 offers our main conclusions. The abbreviations that appear in the text are listed in Table 2.

Design of the Combined Forecasting System
This section provides a detailed introduction to the modelling structure and the related methodology. Figure 1 illustrates the whole process of building the combination system in this paper.

Modelling Structure.
In the 1960s, Bates et al. found that by combining two or more prediction models, the prediction accuracy could be greatly improved. Since then, combination forecasting theory has been widely used. We propose a novel combination wind speed prediction system, and the specific steps are as follows.
Step 1. Data preprocessing ideology: The collected wind speed sequence was first decomposed into a set of filters with different frequencies. Then, an advanced optimization model, NNIA, was employed to determine the combination weights between these filters. This weight adaptive combined denoising ideology successfully removed the adverse impacts of high-frequency noise. The objective function of the adaptive denoising algorithm is as follows: objective: min(MAE), where $\hat{x}_i$ is the $i$-th data point after denoising, $x_i$ is the $i$-th data point before denoising, and $\mathrm{IMF}_{in}$ is the $n$-th intrinsic mode function of the $i$-th data point. The constraint for the weighting parameters $\omega$ is (0, 1).
Step 2. Design of contrast models: Verifying the validity of a model requires benchmarks. In order to demonstrate the advantages of the developed forecasting system from multiple perspectives, we constructed three different types of contrast models: single prediction models, nonoptimized models with different data preprocessing methods, and hybrid models with the same data preprocessing and an internal structure optimized by the same optimization algorithm.
Step 3. Point forecasting and interval forecasting: For point forecasting, the proposed system combined four different types of forecasting models to provide high-precision point-prediction results. The objective function is defined in terms of $\hat{y}_i$, the $i$-th combined forecast; $\hat{y}_{ij}$, the $i$-th forecast of the $j$-th model; and $y_i$, the $i$-th actual value. The constraint for the weighting parameters $\omega$ is [0, 1]. On the basis of the point forecasting, we utilized the fuzzy c-means clustering (FCM) algorithm and the optimization algorithm to build an appropriate forecast interval. Because the value of the wind speed affects the construction of the prediction interval, FCM was used to cluster the wind speed data into 10 categories, and a prediction interval was then constructed for each category. After that, NNIA was employed to search for the global optimum of $\omega$, the coefficient of the interval; data in the same category share the same $\omega$. From the optimization point of view, a higher forecast interval coverage probability (FICP) and a lower forecast interval normalized average width (FINAW) are the two objectives for high-quality interval forecasting. Therefore, the primary problem for NNIA is modelled as a constrained single-objective problem in which $\mu$ is the nominal confidence level. Finally, the complete prediction interval is constructed.
Step 4. Establishment of developed forecasting system: The preprocessed wind speed sequence was divided into two parts: the training set and test set. The best

The Related Methods.
This part introduces the implementation process of several algorithms related to this paper.

Data Pretreatment Methodology.
For each signal, the associated analytic signal was calculated using the Hilbert transform to obtain a unilateral frequency spectrum. The Hilbert transform can be expressed as the Cauchy principal value (denoted p.v.) of the convolution integral [28]:

$$\mathcal{H}f(t) = \frac{1}{\pi}\,\mathrm{p.v.}\int_{-\infty}^{\infty}\frac{f(v)}{t-v}\,dv.$$

Next, the bandwidth was calculated according to the Gaussian smoothness of the demodulated signal, i.e., the squared norm of the gradient. Then, the decomposition problem was transformed into a constrained variational problem:

$$\min_{\{\mu_k\},\{\varphi_k\}} \left\{ \sum_{k=1}^{K} \left\| \partial_t\left[\left(\delta(t)+\frac{j}{\pi t}\right) * \mu_k(t)\right] e^{-j\varphi_k t} \right\|_2^2 \right\} \quad \text{s.t.} \quad \sum_{k=1}^{K}\mu_k = f,$$

where $\{\mu_k\} = \{\mu_1, \ldots, \mu_K\}$ and $\{\varphi_k\} = \{\varphi_1, \ldots, \varphi_K\}$ denote the set of all modes and their center frequencies, respectively, and $\sum_k := \sum_{k=1}^{K}$ denotes the summation over all modes.

[Figure 1: The combined system of short-term wind speed forecasting. Data preprocessing: Step 1, apply the Hilbert transform to the original time series to obtain a unilateral frequency spectrum; Step 2, mix with a tuned exponential and estimate the bandwidth to obtain K IMFs; Step 3, weighted reconstruction of the K IMFs by the optimization algorithm according to formula (1) (antibody population, recombination). Module 2: three different types of contrast models; Step 1, ARIMA, BPNN, GRNN, and BiLSTM are selected as submodels to capture the linear and nonlinear characteristics of the wind speed sequence; Step 2, point forecasting; Step 3, interval forecasting. Module 3: point forecasting and interval forecasting. Module 4: simulation experiment.]
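By introducing the Lagrangian multiplier $\lambda(t)$ and the penalty factor $\alpha$, the constrained variational problem is transformed into an unconstrained one, which can be expressed as [29]

$$L(\{\mu_k\},\{\varphi_k\},\lambda) = \alpha \sum_{k=1}^{K} \left\| \partial_t\left[\left(\delta(t)+\frac{j}{\pi t}\right) * \mu_k(t)\right] e^{-j\varphi_k t} \right\|_2^2 + \left\| f(t) - \sum_{k=1}^{K}\mu_k(t) \right\|_2^2 + \left\langle \lambda(t),\, f(t) - \sum_{k=1}^{K}\mu_k(t) \right\rangle.$$

By alternately updating $\mu_k$, $\varphi_k$, and $\lambda$, the saddle point of the augmented Lagrangian can be obtained. The update equations of $\mu_k$, $\varphi_k$, and $\lambda$ are as follows:

$$\hat{\mu}_k^{n+1}(\omega) = \frac{\hat{f}(\omega) - \sum_{i \neq k}\hat{\mu}_i(\omega) + \hat{\lambda}(\omega)/2}{1 + 2\alpha(\omega - \varphi_k)^2},$$

$$\varphi_k^{n+1} = \frac{\int_0^{\infty}\omega\,|\hat{\mu}_k(\omega)|^2\,d\omega}{\int_0^{\infty}|\hat{\mu}_k(\omega)|^2\,d\omega},$$

$$\hat{\lambda}^{n+1}(\omega) = \hat{\lambda}^{n}(\omega) + \tau\left(\hat{f}(\omega) - \sum_{k=1}^{K}\hat{\mu}_k^{n+1}(\omega)\right),$$

where a hat denotes the Fourier transform, $\omega$ is the frequency variable, $n$ is the iteration index, and $\tau$ is the update step of the multiplier. Finally, based on the idea of sequence reconstruction, the processed sequence is expressed in the following form:

$$\hat{x}(t) = \sum_{k=1}^{K} w_k\,\mu_k(t), \tag{10}$$

where the weights $w_1, w_2, \ldots, w_K$ are determined by a multiobjective optimization algorithm, NNIA.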
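The weighted reconstruction described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `weighted_reconstruction`, the (K, N) array layout, and the toy two-mode signal are all hypothetical, and the weights are simply passed in rather than optimized by NNIA.

```python
import numpy as np

def weighted_reconstruction(modes, weights):
    """Return sum_k w_k * mode_k for a (K, N) array of decomposed modes."""
    modes = np.asarray(modes, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if modes.ndim != 2 or weights.shape != (modes.shape[0],):
        raise ValueError("expected modes of shape (K, N) and K weights")
    if np.any(weights < 0.0) or np.any(weights > 1.0):
        raise ValueError("weights are assumed to lie in [0, 1]")
    # A (K,) vector times a (K, N) matrix gives the length-N reconstructed series.
    return weights @ modes

# Toy example (assumed data): a slow component plus a fast oscillation
# standing in for a low-frequency and a high-frequency mode.
t = np.linspace(0.0, 1.0, 200)
slow = 5.0 + np.sin(2 * np.pi * t)
fast = 0.3 * np.sin(2 * np.pi * 40 * t)
denoised = weighted_reconstruction([slow, fast], [1.0, 0.0])
```

A weight of 1 keeps a mode unchanged and a weight of 0 discards it, so intermediate values attenuate a mode instead of forcing a keep-or-drop decision.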

Optimization Theory.
To build up a complete combination system, we introduced a novel multiobjective optimization algorithm, the nondominated neighbor immune algorithm (NNIA). The algorithm is used to optimize the weights in the combined system, including those of the data preprocessing module and of the different single forecasting models. The main steps of NNIA are as follows [30]:
Step 1. Initialize the parameters in the model: create an initial antibody population $B_0$ of size $N_D$.
Step 2. Update the dominant population: identify the dominant antibodies in $B_t$. Copy all dominant antibodies to form the temporary dominant population (denoted $DT_{t+1}$). If the size of $DT_{t+1}$ is smaller than or equal to $N_D$, set $D_{t+1} = DT_{t+1}$. Otherwise, calculate the crowding distances of all individuals in $DT_{t+1}$, sort the individuals in descending order of crowding distance, and select the first $N_D$ individuals to form $D_{t+1}$.
Step 3. Determine whether the termination condition is met: if $t \ge G_{\max}$ (the maximum number of generations), output $D_{t+1}$ and stop; otherwise, set $t = t + 1$.
Step 4. Nondominated neighbor-based selection: if the size of $D_t$ is smaller than or equal to $N_A$ (the maximum size of the active population), set $A_t = D_t$. Otherwise, calculate the crowding distances of all individuals in $D_t$, sort the individuals in descending order of crowding distance, and choose the first $N_A$ individuals to form $A_t$.
Step 5. Proportional cloning: obtain the clone population $C_t$ by applying proportional cloning $T^C$ to $A_t$:

$$C_t = T^C(A_t) = T^C(a_1) + T^C(a_2) + \cdots + T^C(a_{|A|}).$$

In this formula, $T^C(a_i) = \{a_i^1 + a_i^2 + \cdots + a_i^{q_i}\}$, where $a_i^j = a_i$, $i = 1, 2, \ldots, |A|$, $j = 1, 2, \ldots, q_i$, and $q_i$ is a self-adaptive parameter. The symbol $+$ does not represent the arithmetical operator here but separates the antibodies. $q_i = 1$ means no cloning of antibody $a_i$. The greater the crowding-distance value of an individual, the larger its $q_i$. The calculation of $q_i$ is as follows:

$$q_i = \left\lceil n_C \times \frac{\zeta(a_i, A)}{\sum_{j=1}^{|A|}\zeta(a_j, A)} \right\rceil,$$

where $\zeta(a_j, A)$ is the crowding-distance value of the active antibody $a_j$ and $n_C$ is the expected size of the clone population.
Step 6. Hypermutation and recombination: carry out hypermutation and recombination on $C_t$ and let $C'_t$ be the generated population.
Step 7. Combine $C'_t$ and $D_t$ to obtain the antibody population $B_t$; return to Step 2.
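The proportional cloning step can be illustrated with a short sketch. It assumes that the clone count of each active antibody is the ceiling of its share of the total crowding distance multiplied by the expected clone population size, and that all crowding distances are finite and positive (full implementations typically treat boundary individuals with infinite crowding distance separately). The names `clone_counts` and `proportional_clone` are hypothetical.

```python
import math

def clone_counts(crowding, n_clones):
    """q_i = ceil(n_clones * crowding_i / sum(crowding)) for each active antibody."""
    total = sum(crowding)
    if total <= 0:
        raise ValueError("crowding distances must have a positive sum")
    return [math.ceil(n_clones * c / total) for c in crowding]

def proportional_clone(active, crowding, n_clones):
    """Copy each active antibody q_i times; less crowded antibodies get more copies."""
    clones = []
    for antibody, q in zip(active, clone_counts(crowding, n_clones)):
        clones.extend([list(antibody) for _ in range(q)])
    return clones

# Three antibodies (assumed weight vectors); the third sits in the sparsest region.
active = [[0.1, 0.9], [0.5, 0.5], [0.9, 0.1]]
clones = proportional_clone(active, crowding=[1.0, 1.0, 2.0], n_clones=8)
```

Because of the ceiling, the actual number of clones can slightly exceed the expected size; in this toy case the counts are 2, 2, and 4.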

The Basic Theory about Submodels in a Combined System.
The core forecasting algorithms involved in the combination system are ARIMA, BPNN, GRNN, and BiLSTM. In this section, their basic theories are described briefly [31].
(1) ARIMA. The ARIMA model has three parameters: p, d, and q.
p is the number of lags of the data used by ARIMA (the autoregressive order)
d is the number of times the data are differenced in order to make them stationary
q is the number of lags of the forecasting error (the moving-average order)
ARIMA is expressed in terms of the following quantities: $\varepsilon_t$ is the error term at time $t$, $\theta_j$ is the $j$-th coefficient of AR, and $\varphi_i$ is the $i$-th coefficient of MA.
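As a point of reference, one common way to write an ARIMA(p, d, q) model with the coefficient names given above is sketched below. The differenced series $w_t$, the backshift operator $B$, the omission of a constant term, and the plus sign on the moving-average terms are assumptions, since conventions differ between texts.

```latex
w_t = (1 - B)^d\, y_t, \qquad
w_t = \sum_{j=1}^{p} \theta_j\, w_{t-j} + \varepsilon_t + \sum_{i=1}^{q} \varphi_i\, \varepsilon_{t-i}
```

For example, with $d = 1$ the model is fitted to the first differences $w_t = y_t - y_{t-1}$, and a forecast of $y_t$ is recovered by adding the predicted difference to $y_{t-1}$.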
(2) BPNN. BPNN simulates the activation and transmission of human neurons. Taking a three-layer network as an example, the BP neural network consists of input, hidden, and output layers. The information of each layer of neurons is transmitted to the next layer of neurons through the activation function. The BP algorithm is used to optimize the weights of the network so that the mapping between input and output remains consistent. BP then minimizes the objective function by changing the weights of each layer; in the objective function, $N$ is the number of training samples, $\hat{y}_{ij}$ is the forecasting value, and $y_{ij}$ is the actual value. The sigmoid function is used as the neuron transfer function. No matter how complex the network structure is, each parameter can be updated by calculating the gradient. This is the basic idea of the multilayer perceptron backpropagation algorithm. The sigmoid function is as follows:

$$f(x) = \frac{1}{1 + e^{-x}}.$$

(3) GRNN. GRNN learns quickly and has a strong nonlinear fitting capability. Even when the sample size is small, GRNN forecasts accurately. Furthermore, unstable data can be forecasted by GRNN. Generally, a GRNN is built from linear neurons and radial basis function neurons. GRNN consists of four layers: the input layer, pattern layer, summation layer, and output layer. The activation function of the pattern layer is a radial basis function.

(4) BiLSTM. LSTM was first proposed by Hochreiter and Schmidhuber [32] and has since been refined and popularized by many others. BiLSTM is a special kind of RNN that can learn long-range dependencies in sequences. BiLSTM has been widely used in function fitting. In the LSTM cell equations, $Z_t$, $I_t$, and $F_t$ are the input block, input gate, and forget gate activations, respectively; $C_t$ is the cell state; $Y_t$ is the output of the cell at time $t$; $o_t$ is the activation of the output gate; and $g(x)$, $\sigma(x)$, and $h(x)$ are activation functions.
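To make the roles of the gates concrete, the following sketch implements one step of a standard LSTM cell and a bidirectional pass built from it. It is an illustration under assumptions, not the configuration used in the paper: peephole connections are omitted, $g$ and $h$ are taken to be tanh, $\sigma$ is the logistic sigmoid, and the parameter layout (dicts `W`, `R`, `b` keyed by `'z'`, `'i'`, `'f'`, `'o'`) is hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, y_prev, c_prev, W, R, b):
    """One LSTM time step; W, R, b are dicts keyed by 'z', 'i', 'f', 'o'."""
    z_t = np.tanh(W["z"] @ x_t + R["z"] @ y_prev + b["z"])   # input block Z_t (g = tanh)
    i_t = sigmoid(W["i"] @ x_t + R["i"] @ y_prev + b["i"])   # input gate I_t
    f_t = sigmoid(W["f"] @ x_t + R["f"] @ y_prev + b["f"])   # forget gate F_t
    c_t = z_t * i_t + c_prev * f_t                            # cell state C_t
    o_t = sigmoid(W["o"] @ x_t + R["o"] @ y_prev + b["o"])   # output gate o_t
    y_t = np.tanh(c_t) * o_t                                  # cell output Y_t (h = tanh)
    return y_t, c_t

def bilstm_outputs(xs, params_fwd, params_bwd, n_hidden):
    """Run one LSTM forward and one backward in time, then concatenate the outputs."""
    def run(seq, params):
        y, c, outs = np.zeros(n_hidden), np.zeros(n_hidden), []
        for x_t in seq:
            y, c = lstm_step(x_t, y, c, *params)
            outs.append(y)
        return outs
    fwd = run(xs, params_fwd)
    bwd = run(xs[::-1], params_bwd)[::-1]   # reverse back so time indices line up
    return [np.concatenate([f, b]) for f, b in zip(fwd, bwd)]
```

The bidirectional wrapper is what distinguishes BiLSTM from LSTM: each time step sees a summary of both the preceding and the following values in the input window.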

Experiments and Analysis
We conducted three experiments and the associated analyses to verify the forecasting capacity of the proposed model. The data sources, evaluation metrics, and the experiments are also introduced in this section.

Study Area Description and Datasets.
Original wind speed sequences were collected from Sotavento, which is located in the southwest of Europe, in Galicia, Spain. Sotavento was highlighted as an example of good practice by UNESCO in 2014. The time intervals were 10 minutes and one hour. The preprocessed wind speed sequence was divided into two parts: the training set and test set. The best ratio, 4:1, was determined from many experiments; therefore, in the dataset with a total length of 1440, the first 1152 data points were used to train the model and the remaining 288 data points were used as the test set. Dataset 1 and dataset 2 contained 10-minute intervals; dataset 3 contained one-hour intervals. The data structure of the three sites is listed in Table 3.
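The 4:1 chronological split can be reproduced with integer arithmetic, as in the following sketch. The helper name `chronological_split` and the placeholder series are assumptions for illustration; the actual Sotavento data are not included.

```python
def chronological_split(series, train_parts=4, test_parts=1):
    """Split a time-ordered sequence into leading training and trailing test parts."""
    n_train = len(series) * train_parts // (train_parts + test_parts)
    return series[:n_train], series[n_train:]

# Placeholder for a 1440-point wind speed sequence.
series = list(range(1440))
train, test = chronological_split(series)
```

The split is not shuffled: the test set is the final fifth of the sequence, so the model is always evaluated on observations that come after everything it was trained on.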

Evaluation Metrics.
After obtaining the prediction results, there must be a clear and unified standard to evaluate the prediction ability of the model. Therefore, we established a comprehensive evaluation system composed of eight indexes and evaluated the prediction ability in terms of both point prediction performance and interval precision. In this paper, MAE, MAPE, MSE, RMSE, and DC were selected to reflect the point prediction error, and FICP, FINAW, and AWD represent the interval prediction capability. Table 4 shows the specific information regarding these indexes.
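The sketch below shows one way these indexes could be computed. The error metrics follow their usual textbook definitions; the exact forms of DC, FINAW, and AWD are assumptions based on common usage in the interval forecasting literature and may differ in detail from Table 4. The function names and the toy numbers are hypothetical.

```python
import numpy as np

def point_metrics(actual, forecast):
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    err = forecast - actual
    mse = float(np.mean(err ** 2))
    # DC (assumed form): share of steps where the forecast moves in the same
    # direction as the actual series, relative to the previous actual value.
    actual_move = actual[1:] - actual[:-1]
    forecast_move = forecast[1:] - actual[:-1]
    return {
        "MAE": float(np.mean(np.abs(err))),
        "MAPE": float(np.mean(np.abs(err / actual))) * 100.0,
        "MSE": mse,
        "RMSE": mse ** 0.5,
        "DC": float(np.mean(actual_move * forecast_move >= 0)) * 100.0,
    }

def interval_metrics(actual, lower, upper):
    actual, lower, upper = (np.asarray(a, float) for a in (actual, lower, upper))
    width = upper - lower
    covered = (actual >= lower) & (actual <= upper)
    # AWD_i (assumed form): relative distance by which a point escapes its interval.
    below = np.where(actual < lower, (lower - actual) / width, 0.0)
    above = np.where(actual > upper, (actual - upper) / width, 0.0)
    return {
        "FICP": float(np.mean(covered)) * 100.0,
        "FINAW": float(np.mean(width) / (actual.max() - actual.min())),
        "AWD": float(np.mean(below + above)),
    }

actual = [2.0, 4.0, 5.0, 4.0]
pm = point_metrics(actual, [2.5, 3.5, 5.0, 4.5])
im = interval_metrics(actual, [1.0, 3.0, 5.5, 3.0], [3.0, 5.0, 6.5, 5.0])
```

Reading FICP together with FINAW matters because a very wide interval trivially covers every observation; a good interval forecast keeps coverage high while keeping the normalized width and the deviation of missed points small.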

Experiment Design.
According to the obtained wind speed time series, three experiments were conducted to compare the proposed forecasting system with other models. The input was the wind speed value of the first four days, and the output was the wind speed value of the fifth day.
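A rolling input/output scheme of this kind can be sketched as follows. The sketch reads the description as "four consecutive observations predict the next one", which is an assumption about the intended meaning; the function name `make_lagged_samples`, the `n_lags` and `horizon` parameters, and the direct treatment of two-step forecasting are likewise hypothetical.

```python
import numpy as np

def make_lagged_samples(series, n_lags=4, horizon=1):
    """Build (X, y) pairs: X holds n_lags consecutive values, y the value horizon steps later."""
    series = np.asarray(series, dtype=float)
    n_samples = len(series) - n_lags - horizon + 1
    if n_samples <= 0:
        raise ValueError("series is too short for the requested lags and horizon")
    X = np.stack([series[i:i + n_lags] for i in range(n_samples)])
    y = series[n_lags + horizon - 1:]
    return X, y

# Toy series 0..9: each row of X holds four values and y holds the target.
X1, y1 = make_lagged_samples(np.arange(10), n_lags=4, horizon=1)
X2, y2 = make_lagged_samples(np.arange(10), n_lags=4, horizon=2)
```

With `horizon=2` the target skips one step ahead, which corresponds to a direct two-step forecast; an iterated scheme that feeds one-step forecasts back as inputs would be an alternative.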

The Related Parameters of the MATLAB Operation of the Proposed Forecasting Model.
In this part, the relevant parameters of each model used in this paper are discussed to achieve the best prediction effect.
In the VMD method, there are six parameters that must be set to achieve a better denoising effect. Alpha is the balancing parameter of the data fidelity constraint, and we set it to 2000; we chose 0 for the noise slack (tau); and the number of modes K to be recovered was 10 in this paper. We did not require the DC part; therefore, DC was 0. All omegas started at 0; therefore, init = 0. The tolerance of the convergence criterion (tol) was 1e-7.
As for NNIA, the population size, simulated binary crossover parameter, polynomial mutation parameter, frequency minimum, frequency maximum, and number of iterations were the parameters that had to be set. A small change in one of the parameters has little effect on the accuracy; however, the program run time differs considerably. Thus, considering these two factors, the default parameters were set as follows: the population size was 10, the simulated binary crossover parameter was 0.25, the polynomial mutation parameter was 0.5, the frequency minimum was 0, the frequency maximum was 2, and the number of iterations was 100. The weights of the IMFs of dataset 1 of the Sotavento wind farm calculated by NNIA were 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 0.79, 0.53, 0.08, and 0.00. The parameters of BiLSTM were as follows: the learning method was deep learning; the maximum number of training epochs was 300; the mini-batch size was 120; the number of hidden units in the first BiLSTM layer was 250; the number of hidden units in the second BiLSTM layer was 250; the number of hidden neurons in the feedforward network was 20-50; and the execution environment was CPU.
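For reproducibility, the reported settings can be collected in one place, for example as plain dictionaries. The values below are those stated in the text; the key names are illustrative only and do not correspond to any particular toolbox.

```python
# Settings as reported in the text; key names are hypothetical.
VMD_PARAMS = {
    "alpha": 2000,   # balancing parameter of the data fidelity constraint
    "tau": 0,        # noise slack
    "K": 10,         # number of modes to recover
    "DC": 0,         # no DC part imposed
    "init": 0,       # all center frequencies start at 0
    "tol": 1e-7,     # tolerance of the convergence criterion
}

NNIA_PARAMS = {
    "population_size": 10,
    "crossover_parameter": 0.25,
    "mutation_parameter": 0.5,
    "frequency_min": 0,
    "frequency_max": 2,
    "iterations": 100,
}

BILSTM_PARAMS = {
    "max_epochs": 300,
    "batch_size": 120,
    "bilstm_unit_one_size": 250,
    "bilstm_unit_two_size": 250,
    "feedforward_size_range": (20, 50),
    "execution_environment": "cpu",
}

# NNIA weights of the ten IMFs reported for dataset 1.
IMF_WEIGHTS_DATASET_1 = [1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 0.79, 0.53, 0.08, 0.00]
```

The reported weights keep the first six modes intact and progressively suppress the last four, which is consistent with treating the highest-frequency modes as noise.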

Experiment I: Comparison with Other Weight Adaptive Combined Denoising (WACD)-Based Models.
In this experiment, a group of hybrid models that were also preprocessed by weight adaptive combined denoising (WACD) were selected as benchmark models. The experimental results are exhibited in Figure 2 and Table 5. The following conclusions can be drawn.
In both one-step and two-step forecasting for dataset 1, the developed system performed remarkably well. For example, the developed system achieved the smallest MAE values, 0.1648 and 0.2435, which were better than those of the other hybrid models. As for the DC, the correct directional change was 92.0139%, showing that the proposed model forecast the trend very accurately. For interval prediction, FICP and FINAW cannot be analyzed separately. If the FICP is large but the interval normalized average width is also large, the model performs poorly. According to Table 5, when the FINAW is similar to that of the other models, the FICP of the proposed model is larger than those of WACD-ARIMA, WACD-BPNN, WACD-GRNN, and WACD-BiLSTM. In addition, the accumulated width deviation of the proposed model was 0.0193, which is smaller than those of the other models. The results for dataset 2 were similar to those for dataset 1. Therefore, we concluded that the forecasting effect of the developed model was better than that of the other models in both point forecasting and interval forecasting for 10-minute wind speed data.
For the 1-hour wind speed data (dataset 3), the WACD-NNIA-ABGB system still provided the best prediction performance, not only in one-step but also in two-step forecasting. For instance, in two-step prediction, our proposed system had the smallest errors and precisely fit the changing trend of 89.9305% of the points, whereas the worst error indicator values (MAPE of 18.9532%, MAE of 0.7059, RMSE of 0.8934, and MSE of 0.7981) were obtained by the WACD-GRNN model in one-step forecasting.
Remark 1. According to Figure 2, compared with the hybrid models, there is clear evidence that the developed prediction system, with the lowest statistical error values in both point and interval prediction, had the best prediction performance. In addition, its other assessment indicator values were also the most favorable, which supports the feasibility of this combination system in short-term wind speed prediction.

Experiment II: Comparison with Models with Different Data Preprocessing Technologies.
The aim of experiment II was to compare several common data preprocessing technologies, namely, EMD, CEEMD, SSA, and VMD, with the developed pretreatment strategy. The assessment indicator values are listed in Table 6 and Figure 3. The prediction results are shown in Figure 4.
For Set 1, the WACD-based combination system was more successful in processing noise data, as observed from all indicator values. In addition, this strategy obtained higher DC values, meaning that after removing the interference, the forecasting models better fit the changes of wind speed. The denoising effect of EMD ranked second after the developed system, with MAPE values of 5.4278% and 7.6773%, respectively. Moreover, when the number of prediction steps varied, the prediction ability of the CEEMD-based and VMD-based combined models fluctuated greatly. When the FINAW of all models was almost the same, the forecast interval coverage probability of the proposed model was approximately 10% higher than that of the other models in one-step forecasting. The AWD of the proposed model was 0.0345 smaller than that of the EMD-based model.
For Set 2, the results showed that the prediction errors obtained by the WACD-based combined system were under 7% and smaller than those of the systems based on other data pretreatment techniques. Taking one-step prediction as an example, the developed system's RMSE was 0.3065, improving 30.40% and 54.12%, respectively, relative to the values of the CEEMD-based (0.4697) and SSA-based prediction systems (0.7126).
Note to Table 4: In the formulas, $L_i$ and $U_i$ represent the lower and the upper limits of the prediction interval, respectively; $c_i$ indicates whether the true value is contained in the constructed interval; $N$ is the size of the testing set [16]; Data denotes the actual data; and $\mathrm{AWD}_i$ is the width deviation of the constructed interval for each sample.
Table 6: Prediction performance comparison of the proposed model and combined models using different data pretreatment techniques.

In Set 3, for both one-step and multistep prediction, every evaluation index of the prediction system put forward in this paper was at the optimal level. More specifically, in one-step forecasting, the WACD-based system had the best MAE, MAPE, RMSE, MSE, and DC at 0.2946, 8.2657%, 0.3794, 0.1456, and 89.9305%, respectively. By contrast, the prediction performance of the SSA-based system was the worst, with the largest MAPE at 26.9696%.
Remark 2. Different from the recombination idea in which the number of denoising layers is based on subjective judgment, each subsequence obtained through decomposition was weighted and reconstructed to eliminate the adverse influence of high-frequency noise. The results of experiment II show that the combination system based on WACD was always better than those based on the other preprocessing technologies, which indicates that the idea of weight-adaptive combination reconstruction has great potential.

Experiment III: Comparison with Traditional Forecasting Models.
This experiment highlighted the contrast between the prediction capability of the developed system and several traditional prediction methods, including ARIMA, BPNN, ELMNN, GRNN, and ELMAN. The detailed numerical values are shown in Table 7 and Figure 4. For Set 1, although the assessment index values changed with the number of prediction steps, the point prediction and interval prediction performance of the proposed combination system was always optimal among all models. This phenomenon reveals that the combined system put forward in this study not only had the smallest prediction error but also achieved the highest stability.
For Set 2, the RMSE values of ARIMA, BPNN, ELMAN, GRNN, and ELM were 1.1928, 1.0728, 1.1324, 1.0922, and 1.0630, respectively, in two-step forecasting. Apart from the worst ELMAN model, there was no significant difference among the prediction results obtained from the other single models; however, they were significantly weaker than the proposed composite system, whose RMSE was 0.5477, roughly half of the values above.
In Set 3, when comparing the interval forecasting capability, the developed system was still the best relative to the other comparison methods in terms of the values of the predictive performance indexes. The values of FICP and FINAW should be studied together. For instance, in two-step prediction, the developed system had the smallest FINAW and the largest FICP, the latter being almost twice that of the single models.
Remark 3. Compared with the single models, the advantages of the combined strategy were outstanding. This conclusion holds regardless of the selected evaluation index and the number of prediction steps. Therefore, these traditional individual prediction models are no longer suitable for current wind speed prediction.

[Figure panels: point forecasting results; interval forecasting results]

Conclusions
Wind energy, as one of the fastest growing natural resources, is gradually occupying the leading position in energy structures. Wind speed forecasting, as the basis of wind energy utilization, has become a research focus [21]. In this paper, an advanced combination system, which integrates the strengths of data preprocessing techniques and multiobjective optimization theory, was proposed.
The system achieved great improvements over previous models in both prediction accuracy and stability. Specifically, the proposed forecasting system overcame the defects exposed in previous studies in the following aspects. (1) In contrast to the recombination idea in which the number of denoising layers is based on subjective judgment, each subsequence obtained through decomposition was weighted and reconstructed to eliminate the adverse influence of high-frequency noise. (2) One widely applied statistical method, two traditional neural networks, and one deep learning model were used as base models to fully capture all important characteristics in wind speed series. (3) An advanced multiobjective optimization algorithm, NNIA, was successfully applied in the forecasting system to decide the weights in the data preprocessing link and the combined forecasting link.
Empirical studies on the basis of realistic wind speed data indicated that the developed WACD-NNIA-ABGB system was clearly superior to other benchmark models. (1) Experiment I made a comparison between the proposed system and hybrid models with the same data pretreatment technique, and the results showed that the developed system obtained the lowest prediction error, with the MAPE mean values at 4.7405%, 9.2661%, and 8.9389%, which were reduced by more than 50%, compared with other hybrid models. (2) When contrasted with the models with diverse data preprocessing techniques, similar results indicated that the forecasting ability assessment values were still better than the other comparative models, revealing that the proposed decomposition and weighted reconstruction strategy surpassed the previous strategies and was a more effective tool. (3) In experiment III, the proposed forecasting system was compared with several traditional algorithms. According to the experiment results, we observed that regardless of the sites, the prediction performance of our system always ranked first, indicating that the combination theory was effective and successful. (4) Further discussions indicated that the developed system realized higher precision and stability relative to the other models. In conclusion, it is evident that the prediction system proposed in this paper shows great application potential and will be a reliable tool in smart grids.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.