Improved Chicken Swarm Algorithms Based on Chaos Theory and Its Application in Wind Power Interval Prediction

Probabilistic interval prediction can be used to quantitatively analyse the uncertainty of wind energy. In this paper, a wind power interval prediction model based on chaotic chicken swarm optimization and extreme learning machine (CCSO-ELM) is proposed. Traditional optimization has limitations of low population diversity and a tendency to easily fall into local minima. To address these limitations, chaos theory is adopted in the chicken swarm optimization (CSO), which improves its performance and efficiency. In addition, the traditional cost function does not reflect the deviation degree of off-interval points; hence, an evaluation index considering the relative deviation of off-interval points is proposed in this paper. Finally, the new cost function is taken as the fitness function, the output layer weight of ELM is optimized using CCSO, and the lower upper bound estimation (LUBE) is adopted to output the prediction interval directly. The simulation result shows that the proposed method can effectively reduce the average bandwidth, improve the quality of interval prediction, and guarantee the interval coverage.


Introduction
Recently, with the development of power systems and the popularization of renewable energy, distributed renewable energy has been growing rapidly, and the system complexity and uncertainty levels have increased significantly as well. As a clean and widely used resource, wind has become one of the most popular renewable sources. However, the intermittency and randomness of wind bring great challenges to power generation, transmission, and distribution. Therefore, it is necessary to predict wind power accurately to improve power quality [1,2].
The widely used prediction models of wind power are mainly divided into physical models and statistical models. The former need to analyse and model the internal structure of a wind turbine, and the process is more complex [3,4]. The latter require a large amount of historical data, and models such as the artificial neural network (ANN) [5], the support vector machine (SVM) [6], and the autoregressive integrated moving average (ARIMA) [7] are applied to analyse the nonlinear relationship between wind power and its impact factors.
However, these models are mainly focused on deterministic point prediction, and prediction error always exists and cannot be eliminated because of the randomness and nonstationarity of wind [8,9]. Therefore, from the perspective of decision-making, the use of point prediction will cause some negative impact on the stability and reliability of the power grid.
Compared to point prediction, probabilistic interval prediction can provide more quantitative information on the uncertainty of wind power [10]. The reliability and clarity of the prediction interval can be evaluated by the confidence coefficient and average bandwidth, respectively [11]. These aspects are instructive for the decision-making of the power grid [12]. Traditional interval construction methods are often placed after the deterministic prediction model and rely on specific prior assumptions. Quantile regression [13], Bootstrap [14], mean-variance estimation [15,16], delta methods [17], and Bayesian methods [18] are usually used to obtain the prediction interval. These methods require a large amount of computation and are highly complex. In [19], an interval prediction model based on the lower upper bound estimation (LUBE) is proposed. A dual-output neural network is applied to output the prediction interval directly, which makes the model relatively simple and efficient. In [20][21][22], the objective function is improved for the LUBE model, which limits the deviation of the points outside the interval, effectively reduces the average bandwidth, and improves the interval quality. However, traditional neural networks (NNs) employed in the LUBE method suffer from overtraining and a high computational burden [20]. Compared to the traditional NNs, the extreme learning machine (ELM) has the advantages of simpler and faster learning, better generalization performance, and not having to adjust the weights of the input layer [23]. Still, the weights and thresholds of ELM need further optimization.
Mathematical Problems in Engineering
Chicken swarm optimization (CSO) is a new bioinspired algorithm for optimization problems [24]. The chickens' diverse movements help the algorithm strike a good balance between randomness and determinacy when searching for the optima. Recently, CSO has attracted considerable attention. For example, Wu analysed the convergence of CSO [25] and demonstrated that the algorithm meets two convergence criteria, which ensures global convergence. However, the weak local search ability of CSO limits its optimization performance; hence CSO can be improved by combining it with the strong search ability of other algorithms. Chaotic systems are known for their inherent ergodicity, irregularity, and pseudorandomness; these basic traits make them perform better than random operators in local searching [26]. Chaotic systems search extensively for solutions and can find a desirable solution within a practical time. In [27], the phenomenon of chaos is embedded at different stages of PSO in order to make the search process more efficient. In this paper, the local search ability of chaos is introduced into the CSO's position updating rules, and chaotic chicken swarm optimization (CCSO) is proposed and then applied to wind power interval prediction.
Based on the above discussion, the detailed procedures of this study are as follows. First, the basic characteristics of the CSO and the chaos theory are considered, the performance of CSO is improved by the chaos theory, and CCSO is proposed. Second, the performance of CCSO is tested on benchmark functions, and the results show that it performs better than the other optimization algorithms considered. Third, ELM is applied to learn historical data; in order to reduce the possibility of falling into a local optimum because of premature convergence, CCSO is adopted to optimize the weight and threshold of ELM [28], and then the LUBE method is adopted to directly output the prediction interval of wind power. Finally, the feasibility of the model is verified by simulation.
In summary, the main contributions of this paper include the following three aspects. First, an improved CSO based on chaos theory is proposed in this paper, and its performance is verified to be better than that of the original CSO. Second, a new objective function considering the deviation of the points outside the interval is proposed, which improves the overall quality of the interval prediction. Third, a hybrid wind power interval prediction model based on ELM optimized by CCSO is proposed, which maintains a narrower average bandwidth and a higher interval quality.
The organization of this paper is as follows. In Section 2, the original CSO, chaotic optimization strategy, ELM, and LUBE method are introduced. In Section 3, an improved CSO based on chaos theory is proposed, and its performance is tested; then a new objective function of interval prediction is proposed. In Section 4, the simulation results are explained. Finally, Section 5 provides the conclusions.

Literature Review
2.1. Original Chicken Swarm Optimization. CSO is a new bioinspired algorithm proposed by Meng in 2014 [23]. It optimizes problems by simulating the hierarchical order and the behaviors of the chicken swarm. This paper makes the following assumptions:
(1) The whole swarm is composed of several groups, and each group is composed of a dominant rooster and several hens and chicks.
(2) Each rooster leads its group. The chicken with the best fitness would be selected as the rooster in the group, the chickens with the worst fitness would be selected as chicks, and the other chickens would be hens. Furthermore, the number of hens is greater than the number of chicks in each group.
(3) The relationship between the hens and the rooster is randomly established, as well as the relationship between the mother hens and the chicks. These relationships remain unchanged within $G$ iterations ($G \le M$, where $M$ is the total number of iterations) and update every $G$ iterations to maintain the diversity of the population and reduce the possibility of local convergence.
(4) In the group, each hen forages around the rooster, each chick forages around its mother hen, and each rooster randomly steals food from other roosters.
(5) Assume that $N_r$, $N_h$, $N_c$, and $N_m$, respectively, indicate the number of roosters, hens, chicks, and mother hens in the whole chicken swarm.
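In [23], the position of the rooster can be updated as follows:
$$x_{i,j}^{t+1} = x_{i,j}^{t}\left(1 + \mathrm{Randn}(0, \sigma^{2})\right),$$
$$\sigma^{2} = \begin{cases} 1, & f_i \le f_k, \\ \exp\left(\dfrac{f_k - f_i}{|f_i| + \varepsilon}\right), & \text{otherwise}, \end{cases} \qquad k \ne i,$$
where $\mathrm{Randn}(0, \sigma^{2})$ is a normal distribution with mean 0 and variance $\sigma^{2}$, $\varepsilon$ denotes a minimal constant to prevent zero division errors, $k$ denotes the rooster of the other groups, and $f$ denotes the corresponding fitness of the individual.
The position of the hen can be updated as follows:
$$x_{i,j}^{t+1} = x_{i,j}^{t} + S_1 \cdot \mathrm{rand} \cdot \left(x_{r_1,j}^{t} - x_{i,j}^{t}\right) + S_2 \cdot \mathrm{rand} \cdot \left(x_{r_2,j}^{t} - x_{i,j}^{t}\right),$$
$$S_1 = \exp\left(\frac{f_i - f_{r_1}}{|f_i| + \varepsilon}\right), \qquad S_2 = \exp\left(f_{r_2} - f_i\right),$$
where rand denotes random numbers uniformly distributed within [0, 1], $r_1$ is the index of the rooster that is the group-mate of the $i$th hen, $r_2$ is the index of a chicken (rooster or hen) randomly selected from the swarm, and $r_1 \ne r_2$.
The position of the chick can be updated as follows:
$$x_{i,j}^{t+1} = x_{i,j}^{t} + FL \cdot \left(x_{m,j}^{t} - x_{i,j}^{t}\right),$$
where $x_{m,j}^{t}$ indicates the position of the mother hen of the $i$th chick, and $FL$ is a parameter that determines how closely the chick follows its mother to forage for food.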
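As an illustration, the three position updates can be sketched in Python following the standard CSO formulation for a minimization problem. This is a minimal sketch rather than the authors' implementation: the function names, the argument layout, and the cap on the exponent (added only to avoid floating-point overflow) are assumptions.

```python
import math
import random

EPS = 1e-12  # minimal constant that prevents division by zero


def _exp(z, cap=50.0):
    """exp with a capped exponent (assumed overflow guard, not part of CSO)."""
    return math.exp(min(z, cap))


def update_rooster(x, f_i, f_k, rng):
    """Rooster: scale each coordinate by (1 + Randn(0, sigma^2)).

    f_k is the fitness of another, randomly chosen rooster.
    """
    sigma2 = 1.0 if f_i <= f_k else _exp((f_k - f_i) / (abs(f_i) + EPS))
    sigma = math.sqrt(sigma2)
    return [xj * (1.0 + rng.gauss(0.0, sigma)) for xj in x]


def update_hen(x, f_i, x_r1, f_r1, x_r2, f_r2, rng):
    """Hen: move toward its group rooster r1 and a random chicken r2."""
    s1 = _exp((f_i - f_r1) / (abs(f_i) + EPS))
    s2 = _exp(f_r2 - f_i)
    return [
        xj + s1 * rng.random() * (a - xj) + s2 * rng.random() * (b - xj)
        for xj, a, b in zip(x, x_r1, x_r2)
    ]


def update_chick(x, x_mother, fl):
    """Chick: follow its mother hen with following coefficient fl."""
    return [xj + fl * (m - xj) for xj, m in zip(x, x_mother)]
```

With a following coefficient of 1 the chick lands exactly on its mother hen, which gives a quick sanity check of the chick rule.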
2.2. Chaos Optimization Strategy. Chaotic variables exhibit randomness, regularity, and ergodicity, and the logistic map is widely used to generate them. The logistic map is defined by
$$x_{n+1} = \mu x_n (1 - x_n),$$
where $\mu \in [0, 4]$, $x_n \in [0, 1]$, $n = 1, 2, 3, \cdots$. When $\mu = 4$, the logistic map is in the chaos region, and there are 4 points in the system that are fixed or map onto a fixed point: 0, 0.25, 0.75, and 1. When these points are removed, the chaotic variables of the logistic map show ergodicity in the interval [0, 1] as long as $n$ is large enough.
A Tent map is a piecewise linear map, which has better uniform distribution and higher iterative speed [29]. Its iterative formula is defined by
$$y_{n+1} = \begin{cases} 2a\,y_n, & 0 \le y_n \le 0.5, \\ 2a\,(1 - y_n), & 0.5 < y_n \le 1, \end{cases}$$
where $a \in [0, 1]$, $y_n \in [0, 1]$, $n = 1, 2, 3, \cdots$. When $a = 1$, the Tent map is in the chaos region, and there are 4 points in the system that are fixed or map onto the fixed point 0: 0, 0.25, 0.5, and 0.75. In addition, 4 other points fall into a periodic cycle: 0.2, 0.4, 0.6, and 0.8. When these fixed points and periodic points are removed, the chaotic variables of the Tent map show ergodicity in the interval [0, 1]. Figure 1 shows the distribution of two map sequences after 5000 iterations with the initial points ($x_0 = 0.3451$, $y_0 = 0.3452$). As we can see from Figure 1, the values of the logistic map are relatively concentrated in the two intervals [0, 0.1] and [0.9, 1]. As a whole, its uniformity is not as good as that of the Tent map. This affects the efficiency of optimization.
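The difference in uniformity between the two maps can be checked numerically. The sketch below iterates both maps 5000 times from the initial points quoted for Figure 1 and counts how many values fall into each of ten equal bins. The function names and the re-seeding rule are assumptions: in binary floating point the slope-2 Tent map loses one bit per iteration and collapses to 0 after a few dozen steps, so degenerate values are replaced by a fresh random number, which mirrors the removal of fixed and periodic points described in the text.

```python
import random

# Points that are fixed, collapse onto a fixed point, or fall into a short cycle.
DEGENERATE = {0.0, 0.2, 0.25, 0.4, 0.5, 0.6, 0.75, 0.8, 1.0}


def logistic_sequence(x0, n, mu=4.0):
    """Iterate the logistic map x <- mu * x * (1 - x)."""
    seq, x = [], x0
    for _ in range(n):
        x = mu * x * (1.0 - x)
        seq.append(x)
    return seq


def tent_sequence(y0, n, rng, a=1.0):
    """Iterate the Tent map, re-seeding whenever a degenerate point is hit."""
    seq, y = [], y0
    for _ in range(n):
        y = 2.0 * a * y if y < 0.5 else 2.0 * a * (1.0 - y)
        while y in DEGENERATE:  # re-seed instead of getting stuck
            y = rng.random()
        seq.append(y)
    return seq


def histogram(seq, bins=10):
    """Count values of seq (all in [0, 1]) per equal-width bin."""
    counts = [0] * bins
    for v in seq:
        counts[min(int(v * bins), bins - 1)] += 1
    return counts


rng = random.Random(0)
logistic_counts = histogram(logistic_sequence(0.3451, 5000))
tent_counts = histogram(tent_sequence(0.3452, 5000, rng))
print("logistic:", logistic_counts)
print("tent:    ", tent_counts)
```

The logistic counts should pile up in the first and last bins, whereas the Tent counts should stay close to one another.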
2.3. Extreme Learning Machine. ELM is a single hidden layer feedforward neural network. Different from traditional backpropagation (BP) networks, the input layer weights and the thresholds of the hidden layer neurons are randomly generated in ELM and are not adjusted afterwards, including during the training process. Generalized inverse matrix theory is applied in ELM to solve the output layer weight [30], which gives ELM the advantages of fast learning speed and outstanding generalization performance.
Assume that $n$, $l$, and $m$, respectively, indicate the number of neurons in the input layer, the hidden layer, and the output layer. For $Q$ samples, the model output can be expressed as follows:
$$o_j = \sum_{i=1}^{l} \beta_i\, g(w_i \cdot x_j + b_i), \qquad j = 1, 2, \cdots, Q,$$
where $w_i$ is the input weight vector, $b_i$ indicates the threshold of the hidden layer neurons, $\beta_i$ is the weight between the hidden layer and the output layer, $x_j$ is the input vector of ELM, and $g(\cdot)$ is the activation function of the hidden layer neurons. The specific steps of ELM are described as follows.
(1) Determine the number of hidden layer neurons, and randomly set $w$ and $b$.
(2) An infinitely differentiable function is selected as the activation function of the neurons in the hidden layer, and then the output matrix $H$ of the hidden layer is calculated.
(3) Calculate the output layer weight matrix $\hat{\beta} = H^{+} T$, where $H^{+}$ is the Moore-Penrose generalized inverse of $H$ and $T$ is the target output matrix.
2.4. LUBE Method. Traditional probabilistic interval prediction, such as the delta and Bayesian methods, is based on an assumption about the probability distribution of wind power data or prediction errors, and its calculation process is complicated, which affects the reliability and robustness of the prediction. The LUBE method is a nonparametric statistical method that constructs the prediction interval directly [12]. A dual-output neural network is applied in LUBE to output the lower and upper bounds, without considering the distribution hypothesis. The LUBE network structure based on ELM is shown in Figure 2.
$U$ and $L$ are the upper and lower bounds, respectively, for the prediction of wind power. In this paper, CCSO is applied to train the network, that is, to optimize the objective function in the ELM training process. The diversity of the chicken swarm algorithm is combined with the local search ability of the Tent map, which improves the overall optimization performance of CCSO. A set of output layer weights is found in the solution space to minimize the objective function; then $U$ and $L$ are taken as the outputs directly.
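To make the ELM steps and the dual-output structure concrete, the following sketch trains a small ELM with two output columns (an upper and a lower bound) in pure Python. Several choices are assumptions made for illustration only: the sigmoid activation, the uniform initialization on [-1, 1], the toy targets, and all function names. In addition, the Moore-Penrose inverse is replaced here by ridge-regularized normal equations so that the example needs no external library, and the output weights are solved directly rather than searched by CCSO as in the proposed model.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def hidden_output(X, W, b):
    """Hidden-layer output matrix H, one row per sample."""
    return [[sigmoid(sum(w * x for w, x in zip(W[k], row)) + b[k])
             for k in range(len(W))] for row in X]


def solve(A, B):
    """Solve A Z = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [A[i][:] + B[i][:] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [u - f * v for u, v in zip(M[r], M[c])]
    return [[M[i][n + j] / M[i][i] for j in range(len(B[0]))] for i in range(n)]


def train_elm(X, T, n_hidden, rng, ridge=1e-8):
    """Steps (1)-(3): random W and b, hidden matrix H, then output weights."""
    n_in = len(X[0])
    W = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_hidden)]
    b = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
    H = hidden_output(X, W, b)
    # Normal equations (H^T H + ridge * I) beta = H^T T, a stand-in for H^+ T.
    HtH = [[sum(h[i] * h[j] for h in H) + (ridge if i == j else 0.0)
            for j in range(n_hidden)] for i in range(n_hidden)]
    HtT = [[sum(h[i] * t[j] for h, t in zip(H, T)) for j in range(len(T[0]))]
           for i in range(n_hidden)]
    return W, b, solve(HtH, HtT)


def predict(X, W, b, beta):
    """Return one [upper, lower] pair per sample."""
    H = hidden_output(X, W, b)
    return [[sum(h[k] * beta[k][j] for k in range(len(beta)))
             for j in range(len(beta[0]))] for h in H]


rng = random.Random(0)
X = [[-1.0 + 2.0 * i / 19] for i in range(20)]      # toy one-dimensional inputs
T = [[x[0] + 0.1, x[0] - 0.1] for x in X]           # hypothetical bound targets
W, b, beta = train_elm(X, T, n_hidden=8, rng=rng)
bounds = predict(X, W, b, beta)
```

Each row of `bounds` holds the two network outputs for one sample, which is the structure that LUBE reads off as the prediction interval.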

Proposed Methods
3.1. Chaotic Chicken Swarm Optimization. In this paper, the Tent map is applied to a local search based on the best individual of the chicken swarm, and a randomly selected rooster is replaced by the individual found by this search. The resulting algorithm is called chaotic chicken swarm optimization (CCSO). The key points are described as follows.
(1) Self-Adaptation Search Space. Chaotic search is more effective in a small space, but it takes a long time to search in a large space, which affects the efficiency of the algorithm. In this paper, the chaotic search space is adaptively adjusted according to the progress of the algorithm.
In this adjustment, $x_{\min}(j)$ is the lower bound of the search space for the $j$th dimension, $x_{\max}(j)$ is the upper bound of the search space for the $j$th dimension, $x_g(j)$ is the $j$th dimension of the individual with the best fitness in the chicken swarm, and $r \in (0, 0.5)$ is the chaotic search factor, which is applied to adaptively adjust the search space.
(2) Self-Adaptation Chaotic Mutation Probability. In the early search period, chaotic mutation leads to a decrease in convergence rate. In this paper, the chaotic mutation probability is adaptively adjusted to have a smaller value in the early period and a larger value in the later period.
Here, $P_t$ is the chaotic mutation probability in the $t$th iteration. A logarithmic function is applied to increase the mutation probability during the iterative process, which improves the efficiency of the algorithm as well.
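The two self-adaptation rules can be illustrated with a short sketch. The formulas used below are assumptions chosen to match the description (a search window of half-width proportional to the chaotic search factor around the best individual, clipped to the original bounds, and a mutation probability that grows logarithmically with the iteration number); they are one plausible form, not necessarily the exact expressions used by the authors, and the function names are hypothetical.

```python
import math


def shrink_bounds(x_min, x_max, best, r):
    """Assumed rule: window of half-width r * (x_max - x_min) around `best`,
    clipped to the original bounds, computed per dimension."""
    if not 0.0 < r < 0.5:
        raise ValueError("chaotic search factor r must lie in (0, 0.5)")
    lo, hi = [], []
    for a, b, g in zip(x_min, x_max, best):
        half_width = r * (b - a)
        lo.append(max(a, g - half_width))
        hi.append(min(b, g + half_width))
    return lo, hi


def mutation_probability(t):
    """Assumed logarithmic schedule: 0 at t = 1, rising toward 1 as t grows."""
    if t < 1:
        raise ValueError("iteration index t starts at 1")
    return 1.0 - 1.0 / (1.0 + math.log(t))


lo, hi = shrink_bounds([-10.0, -10.0], [10.0, 10.0], [0.0, 8.0], 0.25)
print(lo, hi)
print([round(mutation_probability(t), 3) for t in (1, 10, 100, 300)])
```

In the second dimension the window is cut off at the original upper bound, so the chaotic search never leaves the feasible region.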
The CCSO is described as follows.
Step 1. Determine the parameters: the chicken swarm size $N$, the number of iterations $M$, the individual dimension $d$, the update frequency of the swarm $G$, the numbers of roosters, hens, chicks, and mother hens, which are, respectively, $N_1$, $N_2$, $N_3$, and $N_4$, the following coefficient $FL$, the maximum step of chaotic search $k_{\max}$, and the chaotic search factor $r$.
Step 2. Initialize the swarm. The initial swarm is randomly generated between the upper and lower bounds, and the iteration counter is initialized.
Step 6. Randomly select a rooster in the swarm, replace it with $\hat{x}$, and then update the global optimal individual.
Step 7. If the maximum number of iterations is reached or the error requirements are met, stop; otherwise, return to Step 3.
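The chaotic local search and the rooster replacement can be sketched as follows. This is an illustrative reading of the description, not the authors' code: the Tent map drives a fixed number of trial points inside a given search window, the best trial is kept, and a randomly chosen rooster is overwritten with it. The function names, the re-seeding of degenerate chaotic values, and the toy sphere fitness are assumptions.

```python
import random


def tent(y):
    """Tent map with parameter 1 (chaos region)."""
    return 2.0 * y if y < 0.5 else 2.0 * (1.0 - y)


def chaotic_local_search(fitness, best, lo, hi, max_steps, rng):
    """Search the window [lo, hi] with Tent-map variables; minimization."""
    best_x, best_f = list(best), fitness(best)
    y = [rng.random() for _ in best]
    for _ in range(max_steps):
        y = [tent(v) for v in y]
        # Re-seed values that collapsed onto 0 or 1 (floating-point artifact).
        y = [v if 0.0 < v < 1.0 else rng.random() for v in y]
        candidate = [a + v * (b - a) for v, a, b in zip(y, lo, hi)]
        f = fitness(candidate)
        if f < best_f:
            best_x, best_f = candidate, f
    return best_x, best_f


def replace_random_rooster(swarm, rooster_indices, x_hat, rng):
    """Overwrite one randomly selected rooster with the searched individual."""
    idx = rng.choice(rooster_indices)
    swarm[idx] = list(x_hat)
    return idx


def sphere(x):
    return sum(v * v for v in x)


rng = random.Random(0)
start = [1.0, -1.0]
x_hat, f_hat = chaotic_local_search(sphere, start, [-1.5, -1.5], [1.5, 1.5], 200, rng)
swarm = [[0.5, 0.5], [5.0, 5.0], [9.0, 9.0]]
idx = replace_random_rooster(swarm, [0, 1], x_hat, rng)
```

Because a trial point is accepted only when it improves the fitness, the returned individual is never worse than the starting one.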
3.2. Benchmark Function Tests. In this paper, 6 multipeak functions [31] are introduced to test the performance of CCSO compared to that of chicken swarm optimization (CSO), particle swarm optimization (PSO), and chaotic particle swarm optimization (CPSO).
It can be seen from Table 2 that CCSO has higher accuracy than PSO, CSO, and CPSO. The Tent map in CCSO guarantees the diversity of the swarm while avoiding premature convergence in the algorithm, which greatly decreases the possibility of falling into a local minimum. In addition, the self-adaptation of the chaotic search space and mutation probability improves the efficiency and robustness of the algorithm. The benchmark functions validate the optimization performance of CCSO, but its practicability needs to be further studied. Therefore, the application of CCSO in wind power interval prediction is introduced next.
3.3. Objective Function of Interval Prediction. The construction of the prediction interval consists of estimating the upper and lower bounds of the interval under a certain confidence level, which indicates that the evaluation needs to consider the accuracy and quality of the interval [14,32,33]. In order to evaluate the accuracy of the interval, the prediction interval coverage probability (PICP) was introduced, which is defined as follows:
$$\mathrm{PICP} = \frac{1}{N_p} \sum_{i=1}^{N_p} c_i,$$
where $N_p$ is the number of predicted points, and if the actual target is within the prediction interval, $c_i = 1$; otherwise $c_i = 0$. $\mathrm{PICP} \in [0, 1]$; a bigger PICP indicates higher accuracy, and a smaller one indicates the reverse.
On the other hand, to evaluate the quality of the prediction interval, the PI normalized average width (PINAW) is introduced, which is defined as follows:
$$\mathrm{PINAW} = \frac{1}{N_p R} \sum_{i=1}^{N_p} (U_i - L_i),$$
where $U_i$ and $L_i$, respectively, indicate the upper and lower bound of the interval, and $R$ is the range of target values, which is used to normalize the average bandwidth. $\mathrm{PINAW} \in [0, 1]$; a smaller PINAW indicates a narrower interval and hence higher quality, and a bigger one indicates the reverse. PICP and PINAW can only assess the effect of the prediction interval unilaterally. Therefore, to transform complex multiobjective problems into single-objective problems, a comprehensive index consisting of both PICP and PINAW known as the coverage width criterion (CWC) has been developed [34], which is expressed as follows:
$$\mathrm{CWC} = \mathrm{PINAW}\left(1 + \gamma(\mathrm{PICP})\, e^{-\eta(\mathrm{PICP} - \mu)}\right),$$
where
$$\gamma(\mathrm{PICP}) = \begin{cases} 0, & \mathrm{PICP} \ge \mu, \\ 1, & \mathrm{PICP} < \mu, \end{cases}$$
$\mu$ is a predetermined confidence degree, and $\eta$ is the penalty coefficient when PICP is less than $\mu$. CWC takes coverage rate and average bandwidth into account, but it does not evaluate the deviation of the points outside the interval. Furthermore, known information is not fully utilized in the training process. Therefore, a third evaluation index is designed in this paper: prediction interval relative deviation (PIRD).
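The three indices PICP, PINAW, and CWC are straightforward to compute from a set of intervals. The sketch below assumes the commonly used exponential-penalty form of CWC, in which the penalty is switched on only when PICP falls below the confidence level; the function names and the small numerical example are hypothetical.

```python
import math


def picp(targets, lower, upper):
    """Fraction of targets that fall inside their prediction interval."""
    hits = sum(1 for t, l, u in zip(targets, lower, upper) if l <= t <= u)
    return hits / len(targets)


def pinaw(lower, upper, target_range):
    """Average interval width normalized by the range of the targets."""
    total_width = sum(u - l for l, u in zip(lower, upper))
    return total_width / (len(lower) * target_range)


def cwc(picp_value, pinaw_value, mu, eta):
    """Coverage width criterion with an exponential coverage penalty."""
    gamma = 0.0 if picp_value >= mu else 1.0
    return pinaw_value * (1.0 + gamma * math.exp(-eta * (picp_value - mu)))


targets = [1.0, 2.0, 3.0, 4.0]
lower = [0.0, 1.5, 3.5, 3.0]
upper = [2.0, 2.5, 4.5, 5.0]

coverage = picp(targets, lower, upper)                      # third point is missed
width = pinaw(lower, upper, max(targets) - min(targets))
print(coverage, width, cwc(coverage, width, mu=0.9, eta=50.0))
```

With three of four targets covered, the coverage is below the 90% level and the exponential term dominates the criterion; once the coverage reaches the confidence level, CWC reduces to PINAW.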
PIRD measures the relative deviation of the points outside the interval with respect to the actual target value $t_i$. The improved objective function $F$ adds PIRD to the original CWC, where $\eta_1$ replaces $\eta$ in CWC and $\eta_2$ is the penalty coefficient of PIRD. The penalty cost is high when $\mathrm{PICP} < \mu$, which is unfavorable to the optimization of the objective function. When $\mathrm{PICP} > \mu$, the penalty cost of PIRD will be considered to meet the minimum target. Both coverage rate and average bandwidth are considered by minimizing the new cost function, and the deviation of the outer points is effectively decreased. The known training information is fully utilized to acquire a narrower interval.
3.4. Model Implementation. Based on the CCSO-ELM network model, the LUBE method is applied to construct the prediction interval. Details of the main steps are discussed below.
(1) Data preprocessing: the training data and test data are normalized to [-1, 1] to avoid errors caused by different dimensional data.
(2) Constructing training sample set: determine the input vector $x = [x_1, \cdots, x_4, x_5, x_6, x_7]$, where $x_1 \sim x_4$ are wind power data for the 4 moments before the forecast point, and $x_5$, $x_6$, and $x_7$, respectively, indicate the wind speed, the sine of the wind direction, and the temperature at the forecast point. The target $y$ of each original training sample $(x, y)$ is slightly perturbed using rand, a random number uniformly distributed on [0, 1], and the resulting upper bound $U$ and lower bound $L$ are used as the outputs of the training data.
(3) Parameter initialization: randomly generate the input weights and thresholds of ELM, initialize the parameters of the CCSO according to Step 1 in Section 3.1, and randomly generate the initial swarm.
(4) Constructing interval prediction model: a double-output extreme learning machine prediction model is constructed according to the LUBE method, the output weight is taken as an individual of the chicken swarm, and the objective function is taken as the fitness of the individual; the lower the fitness, the better the individual.
(5) Iterative optimization: update the chicken swarm and optimize the objective function according to Section 3.1 Step 3∼Step 6. The training is terminated when the maximum number of iterations is reached or the fitness of the best individual can no longer be decreased. Then, the best individual and its corresponding fitness are taken as the outputs.
(6) Interval calculation: the individual obtained in (5) is used as the weight of the output layer, and the prediction interval is directly calculated by ELM. Then the PICP, PINAW, and PIRD of the interval are calculated.
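The data preparation in steps (1) and (2) can be sketched as follows. This is a minimal illustration under stated assumptions: min-max scaling is used to map each series to [-1, 1], the wind direction is assumed to be given in degrees, and the function names and the toy series are hypothetical.

```python
import math


def scale_to_unit(values):
    """Min-max scale a series to [-1, 1]."""
    lo, hi = min(values), max(values)
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]


def build_samples(power, speed, direction_deg, temperature, n_lags=4):
    """Build 7-element inputs: 4 lagged power values plus wind speed,
    sine of wind direction, and temperature at the forecast point."""
    X, y = [], []
    for t in range(n_lags, len(power)):
        exogenous = [speed[t],
                     math.sin(math.radians(direction_deg[t])),
                     temperature[t]]
        X.append(list(power[t - n_lags:t]) + exogenous)
        y.append(power[t])
    return X, y


power = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
speed = [3.0, 3.5, 4.0, 4.5, 5.0, 5.5]
direction = [0.0, 30.0, 60.0, 90.0, 90.0, 180.0]
temperature = [10.0, 10.5, 11.0, 11.5, 12.0, 12.5]

X, y = build_samples(power, speed, direction, temperature)
print(scale_to_unit([0.0, 5.0, 10.0]))
print(X[0], y[0])
```

Each input row pairs the four most recent power values with the exogenous variables at the forecast point, so six observations yield two training samples.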

Case Studies
In this section, the proposed model is verified in two parts. The first part shows the performance of CCSO applied in interval prediction compared to other optimization algorithms. In the second part, wind power interval prediction results are analyzed under a confidence level of 90%. The wind data from an existing wind farm in China is taken as an example. In this study, the first 85% of the data is taken as the training sample set, and the last 15% is taken as the test sample set.
4.1. Comparison and Analysis of Optimization Performance. In this paper, the PSO, CSO, CPSO, and CCSO are used to optimize the training process of ELM. Then, the performance of the 4 algorithms is compared and analyzed. The maximum number of iterations is 300, the swarm size is 100, the confidence level is 90%, the penalty coefficients $\eta_1$ and $\eta_2$ in the objective function are 50 and 0.3, respectively, and the other parameters are shown in Table 3. MATLAB 2014a is used to code and run the simulation. The PICP, PINAW, PIRD, and the convergence curve of the fitness function in each algorithm are shown in Figures 3-7.
Because of the penalty, the PICP of the 4 algorithms is always above 90%. If the constraints are not met, the fitness function rises rapidly. In the early period of PSO, PINAW decreases rapidly, but it changes very little after the 40th iteration. This is because all the individuals in the swarm learn from the global best individual, lack population diversity in the later period, and cannot jump out after falling into a local extremum; that is, the algorithm converges prematurely. PINAW in CSO has decreased by 30% by the 10th generation, which shows that the subgroup division method of CSO can greatly increase the diversity of the population. However, due to the lack of strong local search ability in the later period, the decline of PINAW slows significantly until it no longer declines. CPSO combines chaotic search with particle swarm optimization; the decline trend of PINAW in CPSO is clearly improved, and PINAW drops to about 28.5% after the 150th iteration. The subgroup division method and chaos theory are combined in CCSO, so both the population diversity and the local search ability are improved. Furthermore, the self-adaptation search space and self-adaptation chaotic mutation probability are used to enhance the optimization efficiency and greatly decrease the possibility of falling into a local optimum. PINAW has dropped to 27.7% at the 50th iteration. Compared to the other algorithms, CCSO shows better optimization ability.
4.2. Results and Discussions. Four kinds of trained models were simulated and verified by the test sample set. The prediction results of the wind power interval of each model are shown in Figures 8-11.
Under a confidence level of 90%, the prediction interval of CCSO is narrower than that of the other three algorithms, and only a few points fall outside the interval. Furthermore, the deviation of these peripheral points is greatly decreased by PIRD. Therefore, the negative effects of points outside the interval will be minimized.
It can be seen from Table 4 that different optimization algorithms differ greatly in the prediction results. The PICP of each algorithm is higher than 90% because of the penalty and PICP of CCSO is slightly lower than that of PSO; meanwhile,

Conclusion
Accurate prediction of wind power directly affects the stable operation of the power system. Traditional point prediction can only give the wind power value at the corresponding moment, and there is no quantitative analysis of the uncertainty of the error. However, interval prediction is an effective method to quantitatively predict the uncertainty of wind power, and the interval can be evaluated with both clarity and reliability. In this paper, an improved chicken swarm optimization based on chaos theory is proposed, and an interval prediction model of wind power based on CCSO optimized ELM is constructed. To address the problem of large deviation in the traditional objective function CWC, a new objective function considering deviation PIRD is proposed. The benchmark function tests and case studies show the following.
(1) CCSO proposed in this paper outperforms the PSO, CSO, and CPSO in terms of both optimization accuracy and robustness; the reason is that CCSO inherits the population diversity of CSO and the local search ability of the chaotic system.
(2) PIRD proposed in this paper takes into account the influence of the points outside the interval of the prediction results, effectively decreases the relative deviation, and improves the overall quality of the interval prediction.
(3) Compared to other prediction models, CCSO-ELM has faster convergence speed, narrower PINAW, and lower PIRD; in short, the proposed CCSO-ELM model constructs higher quality prediction intervals.
High-quality training samples can effectively improve the prediction accuracy, and advanced feature selection methods can construct an optimal set of inputs for the CCSO-ELM models. Therefore, future research will focus on advanced feature selection methods.

Data Availability
The wind power data used to support the findings of this study have not been made available because of a confidentiality agreement.

Conflicts of Interest
The authors declare that they have no conflicts of interest.