Research on Power Load Forecast Based on CEEMDAN Optimization Algorithm

With the continuous development of the power industry, power data has become increasingly complex. Common shallow neural networks struggle to fully extract the features of the original load data, which greatly limits load prediction accuracy. Given the advantages of CEEMDAN, this paper therefore proposes a prediction model based on CEEMDAN and verifies it through simulation analysis of measured power data. The experiments show that the proposed algorithm achieves higher prediction accuracy: it effectively overcomes the modal aliasing problem of EMD and yields a more complete decomposition, which in turn improves the accuracy of the subsequent prediction model.

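Consistent with EEMD, the CEEMDAN algorithm applies EMD to noise-perturbed copies of the signal. Let $E_k(\cdot)$ denote the operator that extracts the $k$-th modal component by EMD, $n^i(t)$ the $i$-th white-noise realization ($i = 1, \dots, N$), and $\varepsilon_k$ the noise coefficient at stage $k$. The steps are as follows.
(1) EMD is used to decompose the signal $X(t) + \varepsilon_0 n^i(t)$ $N$ times, and the first modal component is obtained by averaging:
$$IMF_1(t) = \frac{1}{N} \sum_{i=1}^{N} IMF_1^i(t) \tag{1}$$
(2) Calculate the first residual signal $r_1(t)$:
$$r_1(t) = X(t) - IMF_1(t) \tag{2}$$
(3) EMD is used to decompose the signal $r_1(t) + \varepsilon_1 E_1(n^i(t))$ $N$ times, and the second modal component is obtained by averaging:
$$IMF_2(t) = \frac{1}{N} \sum_{i=1}^{N} E_1\left(r_1(t) + \varepsilon_1 E_1(n^i(t))\right) \tag{3}$$
(4) For $k = 2, \dots, K$, the $k$-th residual signal is calculated:
$$r_k(t) = r_{k-1}(t) - IMF_k(t) \tag{4}$$
(5) Repeating the calculation process of step (3), the $(k+1)$-th modal function is obtained as follows:
$$IMF_{k+1}(t) = \frac{1}{N} \sum_{i=1}^{N} E_1\left(r_k(t) + \varepsilon_k E_k(n^i(t))\right) \tag{5}$$
(6) Repeat steps (4) and (5) until the residual signal meets the termination condition of the decomposition, at which point $K$ modal components have been obtained. The final residual signal is:
$$R(t) = X(t) - \sum_{k=1}^{K} IMF_k(t) \tag{6}$$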
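The noise averaging and residual updates in the steps above can be illustrated with a small Python sketch. It is not a real CEEMDAN implementation: the EMD sifting operator is replaced by a toy stand-in (the signal minus its centered moving average), and the names `first_mode`, `kth_mode`, `ceemdan_sketch`, the single noise coefficient `eps`, and the fixed `max_modes` stopping rule are all assumptions made for illustration. Only the bookkeeping (perturb, extract the first mode, average, subtract) follows the steps described.

```python
import math
import random


def first_mode(signal, window=5):
    """Toy stand-in for E_1: the signal minus its centered moving average.

    This is NOT EMD sifting; it only supplies a fast-oscillation extractor
    so that the averaging and residual bookkeeping below can run.
    """
    n = len(signal)
    half = window // 2
    mode = []
    for t in range(n):
        lo, hi = max(0, t - half), min(n, t + half + 1)
        local_mean = sum(signal[lo:hi]) / (hi - lo)
        mode.append(signal[t] - local_mean)
    return mode


def kth_mode(signal, k):
    """Toy stand-in for E_k: peel off k - 1 modes, then extract one more."""
    residual = list(signal)
    for _ in range(k - 1):
        mode = first_mode(residual)
        residual = [r - m for r, m in zip(residual, mode)]
    return first_mode(residual)


def ceemdan_sketch(x, n_trials=50, eps=0.3, max_modes=4, seed=0):
    """Return (modes, residual) following the averaging scheme of the steps.

    Assumptions: one noise coefficient eps for every stage, and a fixed
    number of modes instead of a termination test on the residual.
    """
    rng = random.Random(seed)
    n = len(x)
    # The same noise realizations n^i(t) are reused at every stage.
    noises = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(n_trials)]
    modes = []
    residual = list(x)
    for k in range(max_modes):
        total = [0.0] * n
        for noise in noises:
            # Stage 0 perturbs X(t) with raw noise; stage k >= 1 perturbs
            # the residual r_k(t) with the k-th mode of the noise.
            perturbation = noise if k == 0 else kth_mode(noise, k)
            trial = [r + eps * p for r, p in zip(residual, perturbation)]
            for t, value in enumerate(first_mode(trial)):
                total[t] += value
        imf = [v / n_trials for v in total]          # average over realizations
        modes.append(imf)
        residual = [r - m for r, m in zip(residual, imf)]  # residual update
    return modes, residual


if __name__ == "__main__":
    x = [math.sin(0.3 * t) + 0.5 * math.sin(2.0 * t) for t in range(240)]
    modes, residual = ceemdan_sketch(x)
    recon = [sum(m[t] for m in modes) + residual[t] for t in range(len(x))]
    print("max reconstruction error:", max(abs(a - b) for a, b in zip(recon, x)))
```

Because each mode is subtracted from the running residual, the modes plus the final residual add back up to the input, which mirrors the completeness property expressed by the final residual formula. A real decomposition would replace `first_mode` with proper EMD sifting.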

Sample Entropy
Similar to approximate entropy, sample entropy measures the complexity of a time series by estimating the probability that the signal generates new patterns. A larger entropy value indicates a more complex time series. Compared with approximate entropy, however, sample entropy is more robust and more consistent, and it is largely independent of the data length. For a time series $X = \{x(n),\ n = 1, 2, \dots, N\}$, the sample entropy calculation process is as follows.
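(1) The original sequence is reconstructed into a set of $m$-dimensional vectors:
$$X_m(i) = \{x(i), x(i+1), \dots, x(i+m-1)\}, \quad 1 \le i \le N-m+1$$
(2) The distance between $X_m(i)$ and $X_m(j)$ is defined as the absolute value of the maximum difference between their corresponding elements:
$$d[X_m(i), X_m(j)] = \max_{0 \le k \le m-1} \left| x(i+k) - x(j+k) \right|$$
(3) For a given similarity tolerance $r$ and each $X_m(i)$, count the number $B_i$ of vectors $X_m(j)$ ($1 \le j \le N-m$, $j \ne i$) whose distance from $X_m(i)$ does not exceed $r$, and define:
$$B_i^m(r) = \frac{B_i}{N-m-1}$$
(4) Average over all $i$:
$$B^m(r) = \frac{1}{N-m} \sum_{i=1}^{N-m} B_i^m(r)$$
(5) Increase the dimension to $m+1$ and repeat steps (1) to (4), with $A_i$ counting the matches among the $(m+1)$-dimensional vectors:
$$A_i^m(r) = \frac{A_i}{N-m-1}, \qquad A^m(r) = \frac{1}{N-m} \sum_{i=1}^{N-m} A_i^m(r)$$
(6) The sample entropy is defined as:
$$SE(m, r) = \lim_{N \to \infty} \left\{ -\ln \frac{A^m(r)}{B^m(r)} \right\}$$
(7) For the finite value $N$, SE can be approximated as:
$$SE(m, r, N) = -\ln \frac{A^m(r)}{B^m(r)}$$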
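The sample entropy calculation can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions rather than the code used in the paper: the function name `sample_entropy`, the choice of the population standard deviation for the tolerance (r = `r_factor` times the standard deviation), the use of the first N - m templates for both dimensions, and the "distance no greater than r" matching rule are all assumptions. Since the normalizing constants of the two match counts cancel in the ratio, the sketch compares raw pair counts directly.

```python
import math
import random
import statistics


def sample_entropy(x, m=2, r_factor=0.2):
    """Estimate SampEn(m, r, N) with r = r_factor * std(x) (an assumption)."""
    n = len(x)
    if n <= m + 1:
        raise ValueError("series too short for the chosen m")
    r = r_factor * statistics.pstdev(x)

    def count_matches(dim):
        # Use the first n - m starting indices for both dimensions so that
        # the two counts are taken over the same set of templates.
        templates = [x[i:i + dim] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                # Chebyshev distance: largest element-wise absolute difference.
                dist = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if dist <= r:
                    count += 1
        return count

    b = count_matches(m)       # matches of length m
    a = count_matches(m + 1)   # matches of length m + 1
    if a == 0 or b == 0:
        return float("inf")    # no matches: the estimate is undefined
    return -math.log(a / b)


if __name__ == "__main__":
    random.seed(0)
    regular = [math.sin(2 * math.pi * k / 24) for k in range(240)]
    noisy = [random.gauss(0.0, 1.0) for _ in range(240)]
    print("periodic series:", sample_entropy(regular))
    print("white noise:    ", sample_entropy(noisy))
```

A regular, periodic series should score lower than white noise, which matches the interpretation of entropy as a complexity measure. The double loop is quadratic in the series length, so a vectorized version would be preferable for long sequences.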

Pearson Correlation Coefficient
The Pearson correlation coefficient is a statistical measure of the degree of linear correlation between two variables, and the strength of the correlation is given by the value of R. The calculation formula is as follows.
$$R = \frac{\sum_{i=1}^{K} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{K} (X_i - \bar{X})^2}\ \sqrt{\sum_{i=1}^{K} (Y_i - \bar{Y})^2}}$$
In the formula, X and Y are the sets of the two variables, $\bar{X}$ and $\bar{Y}$ are their means, K is the number of sample points, and R is the ratio of the covariance of the two variables to the product of their standard deviations. The value of R ranges from -1 to +1. The closer R is to +1, the stronger the positive correlation between the variables; the closer R is to -1, the stronger the negative correlation.
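As a small worked illustration of the coefficient described above, the following Python sketch computes R directly from its definition. The function name `pearson_r` and the temperature and load values in the demo are made up for illustration and are not data from the paper.

```python
import math


def pearson_r(x, y):
    """Pearson correlation: covariance over the product of standard deviations."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    k = len(x)
    mean_x = sum(x) / k
    mean_y = sum(y) / k
    # The common 1/K factors cancel, so plain sums of deviations suffice.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("R is undefined when one variable is constant")
    return cov / (dev_x * dev_y)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    temperature = [24.0, 26.5, 29.0, 31.5, 33.0, 30.0]
    load = [410.0, 455.0, 510.0, 560.0, 600.0, 530.0]
    print("R(temperature, load) =", pearson_r(temperature, load))
```

In a forecasting setting, a value of R close to +1 or -1 between a candidate input such as temperature and a load component suggests that the input is worth including for that component.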

Experiment Analysis
The experimental data consist of the power load data and meteorological data of a certain area over three consecutive years. To verify the advantages of the proposed model, the hourly load data (24 points per day over 60 days) from June 26, 2016 to August 24, 2019 are selected as the training set. The auxiliary data are the temperature data and day type for the same period. Together they are used to forecast the hourly load on August 25. The simulation software is MATLAB 2016b.
Because the original load sequence is random and non-stationary, CEEMDAN is first used to decompose the load data of the training set, that is, 1440 consecutive data points. The decomposition results are shown in Figure 1. During the decomposition, NR = 400 groups of white noise with standard deviation Nstd = 0.3 are added. Figure 1 shows that the CEEMDAN algorithm decomposes the original load sequence into modal components with different characteristics plus a residual signal, and that the features at different scales in the original sequence are well separated. However, because there are many modal components, modeling each component directly would increase the computational scale and make the overall prediction model more complicated.
Therefore, sample entropy is used in the paper to evaluate the complexity of each component. Experiments show that the parameters m = 2 and r = 0.2 Std (where Std is the standard deviation of the sequence) best reflect the differences in complexity among the components; the entropy curve of the components is shown in Figure 2. Figure 2 shows that as the frequency of the IMF components decreases, the corresponding SE value also decreases, that is, the complexity of the components decreases steadily. The entropy value of each component is then used as the criterion for reorganizing the IMF components. As can be seen in Figure 5, IMF1, the component with the highest complexity, has the largest entropy value, which is significantly higher than that of the other components.
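The paragraph above does not state the exact rule used to reorganize the components, so the following Python sketch shows only one plausible reading: consecutive components whose entropy values lie within a tolerance of each other are summed into a single recombined sequence. The function name `regroup_by_entropy`, the tolerance `tol`, and all numbers in the demo are hypothetical and are not taken from the paper.

```python
def regroup_by_entropy(components, entropies, tol=0.1):
    """Merge consecutive components with similar entropy (assumed rule).

    A component joins the current group when its entropy is within `tol`
    of the entropy of the group's first member; otherwise it starts a new
    group. Returns (groups, merged), where groups holds component indices
    and merged holds the element-wise sums of each group.
    """
    if len(components) != len(entropies):
        raise ValueError("one entropy value is required per component")
    groups = []
    for idx, se in enumerate(entropies):
        if groups and abs(entropies[groups[-1][0]] - se) <= tol:
            groups[-1].append(idx)
        else:
            groups.append([idx])
    length = len(components[0]) if components else 0
    merged = [
        [sum(components[i][t] for i in group) for t in range(length)]
        for group in groups
    ]
    return groups, merged


if __name__ == "__main__":
    # Hypothetical components (three samples each) and entropy values.
    components = [[1, -1, 1], [2, 0, -2], [1, 1, 0], [3, 3, 3], [0, 1, 2], [5, 5, 5]]
    entropies = [1.50, 0.62, 0.58, 0.20, 0.15, 0.12]
    groups, merged = regroup_by_entropy(components, entropies)
    print(groups)
    print(merged)
```

Grouping in this way reduces the number of sequences that need a separate prediction model while keeping the sum of all recombined sequences equal to the sum of the original components.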

Conclusion
The CEEMDAN algorithm is introduced into the short-term load forecasting problem. It overcomes the problems of conventional decomposition methods such as EMD and EEMD, so that the original non-stationary load sequence is decomposed more completely into features at different scales. In addition, sample entropy is used to analyze the complexity of each modal component, and the components are then reorganized to reduce the overall complexity of the model. Analyzing the periodicity and influencing factors of each recombined sequence therefore helps to overcome two problems of shallow neural networks: they cannot fully extract the features of the original data, and their initial parameters are difficult to determine.