An Ultra-Fast Power Prediction Method Based on Simplified LSSVM Hyperparameters Optimization for PV Power Smoothing

Abstract: With existing power prediction algorithms, it is difficult to satisfy the requirements for both prediction accuracy and prediction time when PV output power fluctuates sharply within seconds, so this paper proposes a high-precision and ultra-fast PV power prediction algorithm. First, in order to shorten the optimization time and improve the optimization accuracy, the Single-iteration Gray Wolf Optimization (SiGWO) method is used to simplify the iteration process for the hyperparameters of the Least Squares Support Vector Machine (LSSVM). Then, a hybrid local search algorithm composed of Iterative Local Search (ILS) and Self-adaptive Differential Evolution (SaDE) is used to improve the accuracy of the hyperparameters, so as to achieve high-precision and ultra-fast PV power prediction. A power prediction model is established, and in a test experiment the proposed algorithm completes the power prediction within 3 s with an RMSE of only 0.44%. Finally, combined with the PV-storage advanced smoothing control strategy, it is verified that the performance of the proposed algorithm can satisfy the system's requirements for prediction accuracy and time under the condition of power mutation in a PV power generation system.


Introduction
In recent years, the penetration rate of renewable energy such as solar energy has increased [1]. PV power generation is easily affected by the environment, resulting in power fluctuations lasting for several seconds to several minutes [2], in which the maximum instantaneous power fluctuation rate can reach 75%/s [3], causing grid voltage and frequency flicker, which will reduce power quality and power supply reliability [4,5].
It has been proven that the PV power station equipped with energy storage can smooth the power fluctuation effectively [6,7]; especially, the energy storage system with high power density can effectively smooth the short-term and severe PV power fluctuation [8]. In Guo T., Liu Y., Zhao J., et al. [9], a new robust dynamic wavelet-enabled method is proposed, which can optimize the wavelet parameters adaptively and adjust the state of charge (SOC) and depth of charge or discharge of the hybrid energy storage system (HESS) composed of supercapacitors and batteries so as to smooth the fluctuations of the output power. In Sun Y., Tang X., Sun X., et al. [10], an improved low-pass filtering algorithm (ILFA) is proposed to optimize the power distribution of the battery and the supercapacitor, and it combines with the fuzzy control (FC) to smooth the power fluctuations based on the SC priority control strategy. In Lamsal, D., Sreeram, V., et al. [7], a fuzzy-based

LSSVM Algorithm and Hyperparameters
Compared with wavelet analysis and neural networks, the support vector machine (SVM) has obvious advantages in self-learning, self-adaptation, and non-linear mapping, but its training is slow because it requires solving a quadratic programming problem. The least squares support vector machine (LSSVM), proposed by Suykens on the basis of SVM [20], follows the principle of structural risk minimization and converts the optimization problem into a form similar to ridge regression. This reduces the difficulty of the solution to a certain extent and improves the solution speed. According to the principle of structural risk minimization, the regression problem is transformed into an equality-constrained optimization problem:
$$\min_{\beta, e} g(\beta, e) = \frac{1}{2}\beta^{T}\beta + \frac{C}{2}\sum_{i=1}^{N} e_i^{2} \quad \text{s.t.} \quad y_i = \beta^{T}\phi(x_i) + b + e_i, \quad i = 1, \ldots, N$$
where $e_i$ is the error variable, $\beta$ is the hyperplane normal vector in high-dimensional space, $b$ is the offset, $\phi(x)$ is the non-linear mapping function, and $C$ is the regularization parameter, which is used to balance the complexity of the model. The model of LSSVM is as follows:
$$f(x) = \sum_{i=1}^{N} \alpha_i K(x, x_i) + b$$
where $\alpha_i$ are the Lagrange multipliers and $K(x, x_i)$ is the kernel function. This paper selects the RBF kernel as the kernel function, defined as a monotonic function of the Euclidean distance from any point $x$ to a certain center $x_c$ in space [21].
The kernel function is as follows:
$$K(x, x_c) = \exp\left(-\frac{\lVert x - x_c \rVert^{2}}{2\sigma^{2}}\right)$$
where $x$ is any point in space, $x_c$ is the center point in space, and $\sigma$ is the width parameter of the kernel function. The regression performance of LSSVM depends on the choice of hyperparameters, namely the regularization parameter $C$ and the width parameter $\sigma$ of the kernel function [22].
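To make the role of the two hyperparameters concrete, the following is a minimal pure-Python sketch of LSSVM regression. It assumes the standard LSSVM dual linear system (a bias row of ones plus the kernel matrix with 1/C added on the diagonal) and the Gaussian form of the RBF kernel; the function names `rbf_kernel`, `solve_linear`, `lssvm_fit`, and `lssvm_predict` are illustrative and not taken from the paper.

```python
import math


def rbf_kernel(x, xc, sigma):
    """Gaussian RBF kernel exp(-||x - xc||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, xc))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def solve_linear(A, rhs):
    """Solve A z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (M[r][n] - s) / M[r][r]
    return z


def lssvm_fit(X, y, C, sigma):
    """Return (alpha, b) from the LSSVM dual system (assumed standard form)."""
    n = len(X)
    A = [[0.0] + [1.0] * n]                  # first row enforces sum(alpha) = 0
    for i in range(n):
        row = [1.0] + [rbf_kernel(X[i], X[j], sigma) for j in range(n)]
        row[i + 1] += 1.0 / C                # regularization on the diagonal
        A.append(row)
    sol = solve_linear(A, [0.0] + list(y))
    return sol[1:], sol[0]


def lssvm_predict(x, X, alpha, b, sigma):
    """Evaluate f(x) = sum_i alpha_i * K(x, x_i) + b."""
    return sum(a * rbf_kernel(x, xi, sigma) for a, xi in zip(alpha, X)) + b
```

In this sketch a larger C fits the training data more tightly and a smaller sigma makes the kernel more local, which is why the pair (C, sigma) is the search space for the optimization steps that follow.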

Hyperparameters Optimization for the First Time
Mirjalili proposed the Gray Wolf Optimization (GWO) algorithm by imitating the hunting process of gray wolves [23]. Guided by the optimization idea of GWO, this paper proposes a Single-iteration Gray Wolf Optimization (SiGWO) to perform the first optimization of the hyperparameters. In the GWO algorithm, iteration is the process by which wolf α, wolf β, and wolf δ constantly approach their prey. The distances $D_\alpha$, $D_\beta$, and $D_\delta$ between the wolves ω and wolf α, wolf β, and wolf δ are continuously shortened by iterative calculation to narrow the encircling radius, and the optimal hyperparameters are obtained.
The schematic diagram of the hunting process of the gray wolves is shown in Figure 1.
According to the single-iteration principle of SiGWO, the gray wolves hunt for the prey (the optimal solution) only once and narrow the hunting radius to the range r shown in Figure 1. The approximate position of the prey (the optimal solution) is recorded as the position solution $X_{pos}$. This process is the first optimization of the hyperparameters.
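The single encircling step can be sketched as follows. This is only an illustration that assumes the standard GWO position update (each wolf moves toward the average of three estimates derived from wolf α, wolf β, and wolf δ); the function name `single_gwo_iteration`, the default control value `a=2.0`, and the clipping to the bounds are assumptions, since the paper gives its procedure only as pseudo code in a figure.

```python
import random


def single_gwo_iteration(wolves, fitness, lb, ub, a=2.0, rng=random):
    """Run one GWO encircling step and return (new_wolves, x_pos).

    wolves  : list of positions, each a list of floats (at least 3 wolves)
    fitness : callable returning the value to minimize for a position
    lb, ub  : per-dimension lower and upper bounds
    a       : GWO control parameter (assumed value for a single step)
    """
    ranked = sorted(wolves, key=fitness)
    alpha, beta, delta = ranked[0], ranked[1], ranked[2]
    new_wolves = []
    for w in wolves:
        new_w = []
        for d in range(len(w)):
            estimates = []
            for leader in (alpha, beta, delta):
                A = 2.0 * a * rng.random() - a          # step coefficient
                Cc = 2.0 * rng.random()                 # leader weighting
                dist = abs(Cc * leader[d] - w[d])       # distance to leader
                estimates.append(leader[d] - A * dist)
            value = sum(estimates) / 3.0
            new_w.append(min(max(value, lb[d]), ub[d]))  # keep inside bounds
        new_wolves.append(new_w)
    # X_pos: best position seen, including the previous leader
    x_pos = min(new_wolves + [alpha], key=fitness)
    return new_wolves, x_pos
```

Here `fitness` would typically be the validation error of an LSSVM trained with the candidate hyperparameters, and the returned `x_pos` plays the role of the approximate prey position that is handed to the local search.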

Chaos Initialization
Chaos is a widespread phenomenon in nonlinear systems. Initializing the population with a chaotic map instead of a traditional probability distribution can enhance the ergodicity and uniformity of the population [24]. The cube map, which has better uniformity, is chosen to initialize the gray wolf population. The formula for cube mapping is as follows:
$$y(n+1) = 4y(n)^{3} - 3y(n), \quad -1 \le y(n) \le 1, \quad n = 0, 1, 2, \ldots$$
where $y(n)$ is the chaos number generated by chaos initialization and $n$ runs up to the size of the gray wolf population.
Chaos is a complex system with unpredictable behavior, and a chaotic map associates this behavior with a parameter through a function [24]. In the proposed algorithm, the original pseudo-random numbers are replaced by chaotic numbers, from which the initialization position, Position, is calculated between the lower and upper bounds lb and ub of the parameter value.
The SiGWO algorithm and the chaos initialization described above are summarized as pseudo code in Figure 2.
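The chaos initialization can be illustrated with a short sketch. The cube map recurrence follows the formula in the text; the linear mapping of chaotic numbers from [-1, 1] onto [lb, ub] is an assumption, because the position formula itself did not survive in this copy. The names `cube_map_sequence` and `chaotic_positions` are hypothetical.

```python
def cube_map_sequence(y0, count):
    """Generate `count` chaotic numbers with y(n+1) = 4*y(n)**3 - 3*y(n)."""
    seq = []
    y = y0
    for _ in range(count):
        y = 4.0 * y ** 3 - 3.0 * y
        y = max(-1.0, min(1.0, y))   # guard against floating-point drift
        seq.append(y)
    return seq


def chaotic_positions(y0, n_wolves, lb, ub):
    """Map chaotic numbers linearly from [-1, 1] onto [lb, ub] (assumed mapping)."""
    dim = len(lb)
    chaos = cube_map_sequence(y0, n_wolves * dim)
    positions = []
    for i in range(n_wolves):
        wolf = [
            lb[d] + (ub[d] - lb[d]) * (chaos[i * dim + d] + 1.0) / 2.0
            for d in range(dim)
        ]
        positions.append(wolf)
    return positions
```

The seed `y0` should avoid values such as 0, ±0.5, and ±1, because the cube map sends them to a fixed point and the sequence stops varying.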


Hyperparameters Accuracy Optimized by Hybrid Local Search
A hybrid local search is introduced to improve the accuracy of the hyperparameters. The position solution $X_{pos}$, obtained by SiGWO with chaos initialization, is used as the initial solution for the hybrid local search.

(1) Preliminary optimization of accuracy. Iterative local search (ILS) exploits the common characteristics of good solutions by adding a local disturbance to the existing position solution $X_{pos}$ [25]. The Griewank function is used as the perturbation function [26]; it disturbs the existing solution so that it can jump out of a local optimum and reach a new, better position solution $better_{pos}$.
The formula of the Griewank function is as follows:
$$G = \frac{1}{4000}\sum_{i} x_i^{2} - \prod_{i} \cos\left(\frac{x_i}{\sqrt{i}}\right) + 1$$
where $X_{pos}$ is the position solution obtained by the SiGWO algorithm, randn is a stochastic number in [−1, 1], $x_i$ is the stochastic product of $X_{pos}$, and $G$ is the disturbance output.
(2) Re-optimization of accuracy. In order to find the optimal hyperparameters, a Self-adaptive Differential Evolution (SaDE) algorithm is introduced. This second local search confirms the position of wolf α in the optimal population and thereby yields the optimal hyperparameters. The global and local search ability of the algorithm is affected by the definition of the variation factor $F$ [27]. Therefore, the variation factor $F$ of the SaDE algorithm is defined adaptively in terms of the initial variation factor $F_0$, the population index $i$, and the maximum number of iterations Max_iteration. The mutation strategy of the SaDE algorithm combines the adaptive parameter $dd$, the better position solution $better_{pos}$, the random vectors $N_{i2}(t)$ and $N_{i3}(t)$, and the position of the optimal solution $best_{pos}$. Based on the content above, the pseudo code of the hybrid local search algorithm is shown in Figure 3.
Under the principle of one-to-one correspondence between population fitness and position, the optimal position is determined by searching for the minimum population fitness. The optimal position contains the information of the optimal hyperparameters.
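The perturbation step in (1) can be sketched as below. The `griewank` function is the standard Griewank definition; how the disturbance output G is applied to the position is not recoverable from this copy, so the update rule in `ils_step` (a random signed shift scaled by G, accepted only if the fitness improves) is an assumption made purely for illustration, as are the function names.

```python
import math
import random


def griewank(x):
    """Standard Griewank function; it equals 0 at the origin."""
    total = sum(v * v for v in x) / 4000.0
    prod = 1.0
    for i, v in enumerate(x, start=1):
        prod *= math.cos(v / math.sqrt(i))
    return total - prod + 1.0


def ils_step(x_pos, fitness, lb, ub, rng=random):
    """Perturb x_pos once and keep the candidate only if fitness improves.

    The way G enters the update is an assumed rule, not the paper's formula.
    """
    x_rand = [rng.uniform(-1.0, 1.0) * v for v in x_pos]   # stochastic product of X_pos
    g = griewank(x_rand)                                    # disturbance output G
    candidate = [
        min(max(v + rng.uniform(-1.0, 1.0) * g, lo), hi)
        for v, lo, hi in zip(x_pos, lb, ub)
    ]
    return candidate if fitness(candidate) < fitness(x_pos) else x_pos
```

Calling `ils_step` repeatedly yields a sequence of positions whose fitness never gets worse, which matches the role of $better_{pos}$ as an improved starting point for the SaDE stage.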
The main flow chart of the proposed algorithm is shown in Figure 4.

Data Collection
The existing power prediction methods are based on solar radiation intensity, historical power, and meteorological factors to complete the power prediction by statistical prediction or intelligent algorithm. However, these methods cannot adequately study the power fluctuation characteristics, and the accuracy of the prediction is impacted by over-reliance on Numerical Weather Prediction (NWP), which is low-precision and high-cost [28].
This paper samples the historical output power of a PV power station in Shenzhen at 1 min intervals. Following the cyclic forecasting idea, the sampled power is updated by adding the latest measured data and discarding the oldest measured data. The basic weather conditions of the selected sample data are as follows: a temperature of 24-27 °C at the time of sample collection, cloudy, and a level 3 northeast wind.
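The cyclic update of the sample window can be sketched with a fixed-length queue. This is a minimal illustration; the helper names `make_window` and `update_window` are hypothetical, and the window length is left as a parameter.

```python
from collections import deque


def make_window(history, size):
    """Create a fixed-length window holding the most recent `size` samples."""
    return deque(history[-size:], maxlen=size)


def update_window(window, new_sample):
    """Add the latest measurement; the oldest one is dropped automatically."""
    window.append(new_sample)
    return window
```

With 1 min sampling, a six-hour window would hold 360 samples, so `make_window(history, 360)` followed by one `update_window` call per minute keeps the training data current.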

Data Classification and Normalization
This paper selects six hours of historical output power of the PV power station as the sample, with the last half-hour of output power as the test data and the rest as the training data.
Normalizing the sample data improves both the speed of convergence and the accuracy. The min-max standardization method is selected to linearly transform the original data, so that the result value $x^{*}$ is mapped to [0, 1].
The conversion function is as follows:
$$x^{*} = \frac{x - x_{min}}{x_{max} - x_{min}}$$
where $x_{max}$ and $x_{min}$ are the maximum and minimum values in the sample data, respectively, and $x^{*}$ is the normalized value.
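The split and the min-max transform can be written in a few lines. The sketch below assumes 1 min samples, so that the last half-hour corresponds to 30 points; the function names are illustrative.

```python
def split_train_test(samples, test_len=30):
    """Use the last `test_len` samples as test data and the rest for training."""
    return samples[:-test_len], samples[-test_len:]


def min_max_normalize(values):
    """Map values linearly onto [0, 1]; also return x_min and x_max for reuse."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        raise ValueError("min-max normalization needs at least two distinct values")
    scaled = [(v - x_min) / (x_max - x_min) for v in values]
    return scaled, x_min, x_max


def denormalize(x_star, x_min, x_max):
    """Invert the min-max transform to recover a power value."""
    return x_star * (x_max - x_min) + x_min
```

Keeping `x_min` and `x_max` from the training data lets the predicted values be converted back to power units before the error indices are computed.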

Predictive Evaluation Index
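A simulation is built to record the prediction time and the accuracy of the predicted power at the same time. The mean absolute percentage error (MAPE) and root mean square error (RMSE) are used to evaluate the accuracy of the predicted power:
$$MAPE = \frac{1}{N}\sum_{i=1}^{N}\left|\frac{y_i - \hat{y}_i}{y_i}\right| \times 100\%$$
$$RMSE = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^{2}}$$
where $N$ is the number of training or test samples, $y_i$ is the actual value, and $\hat{y}_i$ is the predicted value.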
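Both indices are straightforward to compute. The sketch below uses the textbook definitions of MAPE and RMSE, which is an assumption about the exact form used in the paper (for example, whether the RMSE is reported on normalized data).

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error in percent; actual values must be non-zero."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


def rmse(actual, predicted):
    """Root mean square error in the same unit as the inputs."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))
```

Because RMSE squares each deviation before averaging, a single large miss raises it more than it raises MAPE, which is why it is the more sensitive index for sharp power changes.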

Simulation Verification
Several existing high-precision power prediction algorithms such as QPSO-LSSVM, SaDE-GWO-LSSVM, and ABC-LSSVM are chosen as the control group, compared with the proposed algorithm, and run under the same sample data. MATLAB is used to simulate the above algorithm, and the prediction algorithm is evaluated from the two aspects of prediction accuracy and time.
In order to reduce the effect of prediction randomness, the average of 20 predictions for the same period is taken as the final predicted power.
The fitting curves of the predicted power obtained by each power prediction algorithm and the actual power are shown in Figure 5. As shown in Figure 5, the predicted power fits the actual power closely for all algorithms, and the degree of fit of the power curve is positively correlated with the accuracy. Comparing the deviation of each predicted power gives the following ranking of prediction accuracy: QPSO-LSSVM > HLSGWO-LSSVM > ABC-LSSVM > SaDE-GWO-LSSVM.
According to the definition of the power prediction evaluation index, this paper uses the RMSE to quantify the prediction error. The resulting chart of error-time comparison is shown in Figure 6. As can be seen from Figure 6, the proposed algorithm completes the power prediction within 3 s, which greatly shortens the time required for power prediction. The sampling interval could therefore be shortened further, which would enhance the sensitivity of the smoothing system to power fluctuations over ultra-short periods.
Understanding the computing resources occupied by power prediction in PV power generation systems is of great significance to the internal resource allocation of these systems. This paper uses AIDA64 software to monitor the CPU occupancy rate of each power prediction algorithm while it runs. A histogram is used to represent the total occupancy rate of each power prediction algorithm in the PV power generation system, as shown in Figure 7.
The total occupancy rate of the prediction algorithm is the sum of the system occupancy rates over the time periods in which the prediction algorithm is running. The higher its value, the higher the performance requirements on the CPU, and a high total occupancy rate will affect the operation of other parts of the PV power generation system.
As shown in Figure 7, the proposed algorithm greatly reduces the resource occupancy of the power prediction calculation in the PV power generation system, which relieves the computational load and improves the operating efficiency of the power generation system.

Comprehensive Analysis of Predictive Power
The RMSE, which is sensitive to outliers, is selected as the main evaluation index of the prediction error, and the RMSE between the predicted power and the actual power is calculated for every minute of the prediction horizon. The distribution diagram of the prediction error (RMSE) at each time point of power prediction is shown in Figure 8. As shown in Figure 8, the RMSE curve of the proposed algorithm changes most gently among the compared algorithms. Comparing Figures 5 and 8, the power prediction curve of the proposed algorithm can be regarded as a translation of the actual power curve.
To check the generality of this observation, further verification was carried out. It was found that the power prediction curves of the same PV power station at different times or under different weather conditions fit the actual power curve closely after translation, and that the translation amount is relatively stable, fluctuating within a small range. Therefore, the predicted power above is translated, and the fitting diagram of the translated power prediction curve is drawn in Figure 9.
The new power prediction fitting curve after translation and calculation is summarized in Table 1. According to the comprehensive analysis of Table 1 and Figure 9, the proposed algorithm gives the best fit between the predicted power curve and the actual power curve, which greatly improves the accuracy of power prediction.
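The translation step can be illustrated as follows. The paper does not state how the translation amount is obtained, so this sketch assumes a constant vertical offset estimated as the mean difference between actual and predicted power over a calibration window; `estimate_offset` and `translate` are hypothetical names.

```python
def estimate_offset(actual, predicted):
    """Mean difference between actual and predicted power on a calibration window."""
    diffs = [a - p for a, p in zip(actual, predicted)]
    return sum(diffs) / len(diffs)


def translate(predicted, offset):
    """Shift every predicted value by the same offset."""
    return [p + offset for p in predicted]
```

If the offset is as stable as the text reports, it can be estimated once from recent data and reused for subsequent prediction windows.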

PV Power Generation System Equipped with HESS
HESS, which is composed of energy storage batteries and supercapacitors, is selected to perform the power smoothing. It combines the advantages of both, offering high energy density and high power density at the same time, so that the energy storage can smooth power fluctuations efficiently and quickly.
The schematic diagram of the PV power generation system equipped with HESS is shown in Figure 10. All parts in Figure 10 obey the law of conservation of energy, which can be summarized as the following formula:
$$P_{pv} + P_{HESS} = P_{Grid} \qquad (12)$$
where $P_{pv}$ is the output power of PV, $P_{HESS}$ is the charge or discharge power of HESS, and $P_{Grid}$ is the grid-connected power.

Related Parameter Settings
It is assumed that the energy storage system in this paper is an ideal energy storage, meaning that the capacity is sufficient to satisfy the requirements for power smoothing.
(1) Sampling interval $T_0$. PV power generation is a continuous process: as long as the power generation conditions are met, electric energy is generated in real time. The PV output power is assumed to be constant within each $T_0$.
(2) Power fluctuation rate $\Delta P$. This is the ratio of the output power difference between adjacent power sampling points to the time between them. The calculation formula is:
$$\Delta P(t) = \frac{P(t) - P(t-1)}{T_0}$$
where $t$ is the $t$th sampling point after the prediction time, $P(t)$ is the predicted power at time $t$, and $\Delta P(t)$ is the power fluctuation rate at time $t$.
(3) Target volatility $D_{et}$. $D_{et}$ is a parameter that reflects the power grid's frequency modulation capability; it is the upper bound of the grid-connected power fluctuation allowed in the guidelines [29]. The grid-connected PV power needs to satisfy the following formula:
$$\left|\Delta P(t)\right| \le D_{et}$$
Energies 2021, 14, 5752
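The parameters above translate into a simple check. The sketch assumes a backward difference between adjacent sampling points divided by the sampling interval, and compares the magnitude of the rate with the target volatility; the function names and the choice of index are illustrative assumptions.

```python
def fluctuation_rates(power, t0):
    """Rate of change between adjacent samples: (P(t) - P(t-1)) / T0."""
    return [(power[t] - power[t - 1]) / t0 for t in range(1, len(power))]


def violations(power, t0, d_et):
    """Indices t whose fluctuation rate magnitude exceeds the target volatility."""
    rates = fluctuation_rates(power, t0)
    return [t for t, r in enumerate(rates, start=1) if abs(r) > d_et]
```

Any index returned by `violations` marks a sampling point where the storage system would have to act to keep the grid-connected power within the allowed fluctuation.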

The Design of PV-Storage Advanced Smoothing Control Strategy
The charge or discharge action of HESS is determined by comparing the fluctuation rate of the predicted power with the target volatility $D_{et}$. When the predicted power fluctuation rate is greater than $D_{et}$, HESS will charge or discharge; otherwise, there is no action [30].
The specific control strategy is given by Equation (15), where $P_{HESS}(t)$ is the charge or discharge power of HESS at time $t$.
Because the predicted power is available in advance, the energy exchange can be completed ahead of time, which ensures that the energy storage system can meet the charge or discharge demand.
The flow chart of PV-storage advanced smooth control is shown in Figure 11.
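A possible reading of this control logic is sketched below. Equation (15) is not recoverable from this copy, so the rule used here is an assumption: the grid-connected power is allowed to change by at most the target volatility per sampling interval, and HESS supplies or absorbs the remainder according to the power balance of Equation (12). The sign convention (negative HESS power means charging) and the name `smooth_with_hess` are also assumptions.

```python
def smooth_with_hess(predicted, t0, d_et):
    """Return (p_grid, p_hess) with the grid ramp limited to d_et per unit time.

    predicted : predicted PV power at each sampling point
    t0        : sampling interval
    d_et      : target volatility (power change allowed per unit time)
    """
    max_step = d_et * t0
    p_grid = [predicted[0]]
    p_hess = [0.0]
    for p_pv in predicted[1:]:
        prev = p_grid[-1]
        if abs(p_pv - prev) / t0 > d_et:
            # Fluctuation too fast: move the grid power only by the allowed step
            target = prev + max_step if p_pv > prev else prev - max_step
        else:
            # Within the limit: HESS takes no action
            target = p_pv
        p_grid.append(target)
        p_hess.append(target - p_pv)   # P_HESS = P_Grid - P_pv; negative = charging
    return p_grid, p_hess
```

Because the inputs are predicted values, the HESS set-points in `p_hess` are known before the fluctuation occurs, which is what allows the energy exchange to be prepared in advance.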

The Verification of Power Smoothing Simulation
The predicted power is obtained by the power prediction algorithm. HESS follows the advanced smoothing control strategy to charge or discharge, compensating for the difference between the predicted power and the grid-connected target value.
In addition, according to China's State Grid Enterprise Standard Q/GDW617-2011 "Technical Regulations for Connecting Photovoltaic Power Stations to the Grid", the maximum active power change of a small PV power station within 1 min is limited to 0.2 MW. Therefore, this paper sets the target volatility $D_{et}$ to 2 kW/min.
A schematic diagram of power smoothing based on predicted power is shown in Figure 12.
As shown in Figure 12, power smoothing based on the predicted power can effectively solve the problem of sharp fluctuations within seconds. Combining Figure 8 with Table 1, the relationship between the power smoothing performance and the power prediction accuracy shows that storage-based power smoothing performs well when it is guided by high-precision predicted power. A slight power fluctuation, such as the one at the 14th sampling time point, can be smoothed well by HESS, and the smoothness of the grid-connected power is improved, which helps guarantee the power quality of the whole power system.
The charge or discharge power of HESS is shown in Figure 13. As can be seen from Figures 12 and 13, the grid-connected power under the guidance of high-precision power prediction has a higher degree of fit with the actual power.
The power required for a single smoothing action is reduced by 4.5% to 5%. At the same time, the power smoothing guided by the proposed algorithm reduces the capacity requirements of the energy storage equipment.

Conclusions
This paper proposes a high-precision and ultra-fast PV power prediction algorithm, in view of the difficulty of the existing power prediction algorithms to simultaneously satisfy the requirements for prediction accuracy and time when the PV output power fluctuates sharply within seconds.
Compared with the existing power prediction algorithms, the proposed algorithm completes the power prediction within 3 s and greatly reduces the time required for power prediction. According to the predicted power error distribution, the RMSE of the optimized proposed algorithm is only 0.44%.
The proposed algorithm is applied to the PV-storage advanced smoothing control to show that it can effectively guide the smoothing performed by HESS and ensure the smoothness of the grid-connected power. In addition, the proposed algorithm can reduce the required energy storage capacity to a certain degree.
