An Enriched Prediction Intervals Construction Method with Hybrid Intelligent Optimization

Prediction intervals (PIs), within which future observations of time series are expected to fall, are a powerfulmethod for uncertainty modeling and forecasting.This paper presents the construction of optimal PIs using an enriched extreme learningmachine (ELM)based method. While quality evaluation indices for PIs on reliability and sharpness of prediction results have been defined in the literature, this paper proposes a new PIs evaluation index, robustness, which focuses on the forecasting error. Combined with the above three indices, a more comprehensive objective function is then formed for optimal PIs construction.The paper also proposes an efficient hybrid quantum-behaved particle swarm optimization method with bacterial foraging mechanism to optimize the parameters in the ELM model. The effectiveness of the additional robustness index and the proposed improved ELM approach in determining higher quality PIs is demonstrated by applying them to PIs constructions for the cases of prediction in different datasets.


Introduction
To make optimal water resource allocation, precise runoff prediction is always needed and is one of the most important issues in the field of hydrology [1]. In the literature, most of forecasting methods are focused on developing accurate deterministic point forecasting methods for runoff time series [2][3][4]. In actual case applications, single point prediction result is more popular than prediction interval because of its convenience to implement. However, for the point forecasting method, the predicted runoff value is provided without any information about associated uncertainties. During the process of decision making and operational planning, it is important to know how well the predictions match the real targets and how large the risk of unmatching is. Prediction Intervals (PIs) are excellent tools for the quantification of uncertainties associated with point forecasting [5,6]. By definition, a PI is an interval consisting of lower and upper bound, in which the predicted value will fall with a certain probability [7]. Recently, Prediction Intervals (PIs) have been widely accepted to quantify uncertainties associated with point forecasts [8,9].
According to the existing interval forecasting methods, the probability density of the point forecasting error is always used to generate PIs [10,11]. Without any prior knowledge of the point forecasting, a novel method in [12] adopts the two outputs of Neural Networks (NNs) to directly construct upper and lower bounds of PIs. Traditional NNs are widely used to construct PIs owing to outstanding generalization performance and approximation ability [13][14][15]. However, conventional NNs-based method causes inevitable limitations, such as overtraining [16]. As a new learning algorithm for training traditional feed forward neural networks, Extreme Learning Machine (ELM) [17] has been applied to construct PIs for its iterative-free learning mechanism [18,19]. The performance of ELM has also been shown to have faster learning speed and better generalization ability than traditional NNs in [20][21][22]. In these ELM-based prediction methods, PIs with associated confidence levels are generated through minimizing the PIs evaluation functions and optimizing the parameters in the ELM to produce high quality PIs. But the problem to determine the optimal PI still remains to be solved.
For the optimal PIs, higher reliability and sharpness are both expected [23]. Previously, a number of objective functions for training the prediction model have been proposed aiming to assess the overall quality of PIs with a single value. The Coverage Width-based Criterion (CWC) [12] based PIs objective function gives more weight to the property of reliability based on the exponential penalty term. Considering that the reliability is usually defined as the fundamental feature and determines the validity of PIs, the authors of [12] then modified the CWC in [24,25] by regarding the reliability as the hard constraint. However, this method would cause the value of sharpness becomes larger than it needs to be, since the hard reliability constraint is difficult to be satisfied without sacrificing the sharpness. Hence, PIs construction still has room to be improved, with respect to PIs evaluation framework and optimization objective function.
In this paper, a new PIs evaluation index and an objective function have been formulated to comprehensively account for the properties of obtained PIs. Usually, reliability and sharpness are considered as main required properties of interval forecasting. However, the PIs error information for samples with their target values beyond the bounds is a key measure to assess the potential risk of operational planning. Regarding the actual values beyond the bounds of PIs, it will cause the overload or loss load. Hence, lower error values are also needed to be considered in the objective function of PIs optimization. An additional PIs evaluation index, which focuses on the forecasting error information, is therefore developed and defined as robustness. Combined with the reliability and sharpness indices, a more comprehensive objective function is formulated for training the ELM to obtain optimal PIs. To minimize the new constructed objective function of PIs optimization, this paper proposes the integration of the bacterial foraging mechanism into quantum-behaved particle swarm optimization method to form a hybrid intelligent optimization method for the determination of the parameters in the prediction model.
The rest of this paper is organized as follows. Section 2 describes the required PIs evaluation indices and the proposed objective function. Section 3 describes the implementation of proposed PIs construction approach based on ELM and hybrid intelligent optimization method. Case studies results and comparisons of proposed approach with benchmarks are presented in Section 4. Finally, Section 5 concludes the whole paper.

Evaluation Framework for Optimal PIs
Reliability and sharpness are widely considered as the main required properties of optimal PIs [12][13][14]. In this section, an additional PIs evaluation index that focuses on the interval forecasting error is defined as robustness. A novel objective function for PIs optimization is then formulated to comprehensively account for the reliability, sharpness, and robustness properties of constructed PIs.
. . Reliability. According to the PIs definition [12], a PI with lower bound 훼 ( 푖 ) and upper bound 훼 ( 푖 ) bracket the target value 푖 with nominal confidence level can be expressed as follows: Reliability is referred to as the statistical consistency of PI coverage probability and nominal confidence level. To assess the reliability of constructed PIs, PI coverage probability (PICP) is proposed in [5] to describe the probability that target values will be covered by the upper and lower bounds. The PICP is defined as follows: where 푡 is the size of the entire test samples, and 푖 is the indicator of PICP. If the future target value 푖 is covered between the lower bound 푖 and upper bound 푖 , then 푖 =1, otherwise 푖 =0. 푖 can be expressed as . . Sharpness. Sharpness is referred to as the ability of PIs to concentrate the probabilistic forecasts information about future outcome [5]. For the optimal PIs, the target 푖 is expected to lie within the bounds of PIs with higher reliability. This can be easily achieved if we just take the PICP into account, which will result in the lacking of useful forecasting information for decision-maker. The PI normalized average width (PINAW) index is applied in [12] to quantitatively measure the sharpness of PIs, which is defined as where is the maximum range of targets.
. . Robustness. Robustness is referred to as the ability of PIs to resist the probabilistic forecasts error during execution. On the basis of satisfactory reliability requirement and narrower interval width, this paper proposes that the forecasting errors for samples with target values beyond PIs bounds should also diminish towards zero as closely as possible. According to this, the Average Width Error (AWE) is proposed as a key measure to assess the potential risk of operation, and it is defined by Mathematical Problems in Engineering   3 where ⋅ 푡 means the size of nominal error samples, and 푖 is the indicator of width error information for each sample.
. . Objective Function. A number of objective functions have been proposed in the literature [12,18,24,25], aiming to assess the overall quality of PIs based on the reliability and sharpness evaluation index. The Coverage Width-based Criterion (CWC) [12] gives more weight to the variation of PICP and if it is lower than the nominal confidence level of Eq. (1), the corresponding exponential term penalty will be accounted for. By using this objection function, it is hard to decide the coefficient for penalty term. For the Constrained CWC (CCWC) [24], the PICP is regarded as the hard constraint but it also results in large width of the PIs. For the interval score criterion (ISC) [18], the score of each prediction point is calculated according to the modified sharpness information. Different from the above objection function, the proposed objective function focuses on the best combination of reliability, sharpness, and robustness. Taking the criterion of robustness into account together with that of reliability and sharpness, a new objective function for PIs optimization can be defined as where is the penalty coefficient to ensure the reliability, and PICP is a function of PICP. If PICP is less than the PI nominal confidence level, PICP =1 and the penalty item will be accounted for to ensure the reliability. Otherwise, PICP =0 and the optimization process will maximize the sharpness and robustness of constructed PIs. An examination of Eq. (7) shows that this objective function gives the reliability index a higher priority over sharpness and robustness but complemented by the other two indices. The objective function aims to obtain highquality PIs with smaller values for PINAW and AWE based on satisfied confidence level. Compared to the existing objective functions, the new objective function provides a more comprehensive PIs evaluation for the PIs construction method presented in the next section.

Hybrid Intelligent Optimization Method for Optimal PIs Construction
To construct PIs for the runoff time series, the ELM-based PIs construction method can be applied. The construction of PIs based on ELM model is described in Section 3.1 and the theory and implementation of hybrid intelligent optimization method in optimizing the objective function are described in Sections 3.2 and 3.3 below, respectively.

. . Construction of PIs Based on Extreme Learning Machine.
As a novel learning algorithm for training single hiddenlayer feedforward neural networks (SLFNs), ELM generates the hidden node parameters randomly and determines the output weight matrix analytically [16]. For distinct samples , SLFNs with hidden nodes and activation function ( ) are mathematically modeled as [17]: where w 푗 is the weight vector connecting the j-th hidden neuron and the input nodes, 푗 is the weight vector connecting the j-th hidden neuron and the output nodes, and 푗 is the threshold of the j-th hidden node. If SLFNs can approximate these samples with zero error, it means that i.e., there exist 푗 , w 푗 and 푗 such that The above equation can then be written as [17] ]푁×퐾 (12) where H is the hidden layer output matrix, and T is the matrix of targets. Unlike standard NNs where all parameters need to be tuned, in ELM, the input weights and hidden biases are randomly assigned and fixed. Then the hidden layer output matrix H can be determined, and the approximation of * , * and * can be obtained such that With fixed input weights and hidden layer biases of ELM, training an SLFN is simply equivalent to finding the smallest norm least-squares solution of the above linear system: where H † is the Moore-Penrose generalized inverse of the hidden layer output matrix H.
To construct PIs for the runoff time series, the ELMbased interval forecasting method with the PIs generated at the output layer of the neural network is illustrated in Figure 1. In which, the first and second outputs of the ELM model correspond to the upper and lower bound of the PIs separately [6].
. . An Efficient Hybrid Intelligent Optimization Method for PIs Optimization. The optimization for the above ELM-based PIs construction aims to obtain the minimum objective function values of Eq. (7) with respect to the best output weights in the ELM model. To minimize the constrained optimization function and optimize the parameters of ELM, conventional optimization methods, such as Dynamic Programming (DP), Particle Swarm Optimization (PSO) algorithm, always suffer from the problem of being trapped into local optima [26][27][28][29][30]. Inspired by quantum mechanics, a new version of PSO named Quantum-behaved Particle Swarm Optimization (QPSO) [31] was proposed due to its guaranteed characteristic of global convergence. However, it still needs a local search mechanism to make a balance between the exploitation and exploration [32]. This paper proposes the integration of the bacterial foraging mechanism [33] into QPSO to form an efficient hybrid intelligent optimization method for the determination of the parameters in the ELM prediction model based on the above objective function.
. . . e Description for Quantum-Behaved Particle Swarm Optimization Method. The QPSO algorithm assumes that there is a quantum delta potential well on each dimension at the local attractor point 푑 [31] where = 1, 2, ⋅ ⋅ ⋅ , indicates the particles of population, d= 1, 2, ⋅ ⋅ ⋅ , indicates the dimensions of each particle, 푖푑 is the best position of ith particle, 푔푑 is the position of global best particle, and 1푑 and 2푑 are random vectors with range of (0, 1). In QPSO, every particle has quantum behavior with its state formulated by wave function [34]. The probability density functions of particle's position can be deduced from the Schrödinger equation, and then the measurement of particle's position from the quantum state to the classical one can be implemented by using the Monte Carlo simulation method. The position of the i-th particle is updated as follows: where ℎ and are two random numbers distributed uniformly with range of (0, 1), and is suggested to decrease the value of linearly from max to min generally; mbest is the mean best position, which is defined as the mean of 푖푑 positions of all particles.
. . . e Description for Hybrid Intelligent Optimization Method. It is evident from Eq. (16) that each component of the particle's updated position is determined by the local attractor 푑 and the disturbing part. Like the original PSO algorithm, convergence of the particles' positions to their local attractors can guarantee the convergence of particles [28]. As a result, the local attractors in Eq. (16) will gather toward the global best value, which in turn makes the particle's current position converge to the global best position. The local attractors in Eq. (15) can be translated into It can be seen that the global best position guides the movement of local attractors. During the process of iteration, if the global best position is trapped into a local optimal point, it will mislead the particle's current position convergence, resulting in premature convergence. Aiming at this problem, this paper develops a Hybrid Quantumbehaved Particle Swarm Optimization (HQPSO) algorithm by introducing the bacterial foraging mechanism to update the position of global best position. As investigated in [35,36], one of the major driving forces of Bacterial Foraging Optimization (BFO) is the chemotactic movement. However, the chemotaxis employed by BFO usually results in sustained oscillation, especially on flat fitness landscapes, due to the fixed chemotactic step size.
To make a balance between the global searching capability and local searching capability, this paper proposes a dynamic approximation control strategy to update the chemotactic step size. For the particles, the loss of global searching capability means it only flies within a small space, which will result in premature convergence, while the loss of local searching capability means that the possible flying cannot lead to perceptible effect on its fitness, which will result in a slow convergence speed [37]. To overcome these shortages, the proposed dynamic approximation control strategy aims to leave a large search area in the early iteration process and shrink the search range adaptively in the later iterations.
Suppose 푔,푑 ( ) represents the global best particle at -th bacterial foraging operations, ( ) is the size of the chemotactic step taken in the random direction specified by the tumble, the update of global best position based on the Mathematical Problems in Engineering 5 bacterial foraging mechanism and dynamic approximation step control can be expressed as where Δ indicates a vector in the random direction whose elements lie in [−1, 1], 푐 indicates the number of chemotactic steps, 푑 is the vector of global best position with respect to the d-th dimension, max 푑 and min 푑 are the bounds of the -th dimension, and is an range parameter that used to control the decrease rate of chemotactic step size.

. . Implementation of the HQPSO-Based PIs Construction.
In this section, the proposed HQPSO algorithm is applied to minimize the objective function of optimal PIs construction. During the optimization process, the output weight 푗 ( = 1, 2, . . . , ) of ELM is designed as the decision variable and the objective function value of each particle is designed to evaluate the quality of obtained PIs. The dimensions of each particle [ 푖1 , 푖2 , . . . , 푖퐾 ] in HQPSO algorithm are , and the value of each particle 푖푑 represents the value of output weight 푗 ( = 1, 2, . . . , ). The steps of optimal PIs construction are given as follows: (1) Normalization. Normalize the runoff database to [−1, 1], and then construct the PIs optimal model.
(3) Construct PIs and calculate fitness . Construct PIs based on the parameters 푗 ( = 1, 2, . . . , ) and then calculate the objective function value for each particle.
(4) Update the value of particles. The parameters 푗 of each particle are updated by (16).
(5) Update the value of local attractor point. Construct a new ELM-based model by using the updated , and then calculate the objective function value for new particles. If this new objective function value is smaller than that of 푖 , then 푖 is updated by this new particle. What is more, if it is smaller than 푔 , then 푔 is updated.
(6) Update the value of global best particle. By using the bacterial foraging mechanism, the position of global best particle can be updated as in Algorithm 1.
(7) Loop. If the maximum iterations have not been reached, go to step (4). Otherwise, the iteration process is finished and the optimal output weight matrix of the ELM is obtained.

Results and Discussion
. . Datasets. The effectiveness of the additional robustness index and the proposed ELM optimization approach in determining higher quality PIs is demonstrated by applying them to PIs constructions for the predictions of runoff time series. The runoff datasets are from Zhexi hydropower station, which is located in Hunan province. The hourly inflow runoff data measured in 2015 is adopted in this research, out of which the first six months is used for training and the rest of the data are used for testing. All these data are normalized to [−1, 1] before constructing interval forecasting model.
For training of the proposed prediction intervals construction method, as described in Section 3.1, ELM generates the hidden node parameters w and b randomly. The only job left for the users is to determine the output weights and the inputs of ELM model. In this paper, the inputs of ELM are chosen based on the partial autocorrelation function (PACF) analyses [38], which is a useful tool to analyze the correlation between the candidate variables and the historical datasets. After normalizing the training data, the PACF values of inflow runoff data are calculated and shown in Figure 2. It can be seen that the lag orders of the inflow runoff time series are three and the inputs of ELM model are [ 푖−1 , 푖−2 , 푖−3 ]. The number of neurons in hidden layers is set as seven according to the Kolmogorov theorem (i.e. = 2 + 1).
The training process during the PIs optimization shows the convergence behavior of the constructed PIs for global best particle, which aims to find the best values for the output weights of the ELM. To evaluate the performance of proposed algorithm, three methods including PSO, QPSO, and HQPSO are applied to minimize the objective function in (7). In the case studies, the PINC is set to 90%, and the coverage probability for training phase is set 1-2% higher than the nominal confidence level as pointed out in [11]. The penalty coefficient is set to 10. The major parameters for the three optimization methods are given in Table 1. The number for iterations and population size in three methods are set to 500 and 100, respectively.  Figure 3. The ycoordinate of Figure 2 means the optimization results of PIs construction according to (7). It is shown that the proposed HQPSO algorithm has much better global searching ability and convergence property than PSO and QPSO method. For the beginning of iteration, owing to the introduction of local searching mechanism, the fitness of global best particle with HQPSO decreases rapidly, which ensures the efficiency of the swam search. For the later evolution process, owing to the decline of population diversity, there will appear premature convergence. Among three optimization methods, the PSO performs worst since it traps into a local optimum easily. Compared to PSO, QPSO has a better global searching ability, but it makes only a little improvement from 300 to 500 iterations due to the poor local searching ability. Conversely, the HQPSO in the same iteration range still has the global best particle decreasing in the fitness value as shown in Figure 3.
The original QPSO and HQPSO method employ the same parametric setup, except with the difference of the chemotactic step size and swimming length in the bacterial foraging mechanism. The chemotactic step size ( ) was kept at 0.1 in the classical BFO [29]. For the dynamic approximation control strategy, the chemotactic step size shows exponential decline as the bacterial foraging iterative process advances. For PIs optimization, the optimization range for output weight is set to be [−100, 100]. To obtain optimum accuracy, the chemotactic steps in the later iterations are required to be reduced to nearly 0.001. Hence, the value of can be calculated by a given value of 푐 . Figure 4 illustrates the relationship between the objective function and the number of generations for different chemotactic steps 푐 . As evident from the results, when the chemotactic step size is fixed at 0.1, the objective function decreases rapidly at the beginning, but it suffers from premature convergence. In the results of dynamic step size control strategy, for larger chemotactic steps 푐 , the objective function converges faster. Figure 5 illustrates the characteristics between PIs optimization objective function and the number of generations for different swimming lengths 푠 of the bacteria. As evident, when the swimming length is bigger, the objective function decreases faster and converges faster. However, the larger chemotactic steps and bigger swimming lengths will lead to higher computational time. From Figure 5, it can also observed that, for longer iteration process, e.g., 500 generations and above, the small change of the values for chemotactic steps and swimming lengths does not affect the optimization results, and therefore the values for 푠 and 푐 can be set at smaller values to keep the computational time down. Conversely, the values can be at larger for a shorter iteration process, e.g., 200 generations.
. . Discussions on Different Objective Functions. The effectiveness of ELM has already been demonstrated to have faster learning speed and better generalization ability than traditional NNs in [20][21][22]. In this section, the effectiveness of proposed PIs objective function in (7) is demonstrated by comparing it with CWC, constrained CWC (CCWC), and the interval score-based criterion (ISC). The merits of using CWC or ISC have previously been compared with conventional interval forecasts methods, e.g., exponential smoothing and quantile regression. Hence, this paper focuses on the effectiveness of proposed PIs objective function. The PIs are constructed with 90% and 80% nominal confidence level, respectively. The numerical results of different case studies are given in Table 2 including the reliability index PICP, the sharpness index PINAW, and the robustness index AWE. At two different confidence levels, it can be seen from Table 2 that all these methods with different objection functions can provide fairly satisfactory coverage probability. The reason for this is obvious since all of these cost functions take the reliability index as the primary requirement of forecasting. For the performance of sharpness and robustness, it can be seen that the PINAWs of the proposed objective function are the smallest for all the case studies, indicating the highest sharpness of the obtained PIs. For CWC and CCWC, the penalty function is only related to the value of PICP, and the hard PICP constraint in CCWC is difficult to be satisfied without sacrificing the width of the PIs, which makes the values of PINAW become larger than they need to be. Besides, the CWC and CWCC objective functions do not take the error information into account and they just focus on the narrower width of PIs, which will lead to much larger values for AWE. For ISC, the generated PIs demonstrate fair robustness due to the introduction of interval score, but the interval score cannot quantitatively distinguish the contributions of robustness and sharpness, and they generally have larger width than the present proposed approach indicating a lower sharpness.
The obtained PIs for runoff datasets with PINC=90% and PINC=80% are displayed in Figure 6, where the actual measured data are covered by the constructed PIs in a great percentage. According to Figure 6, it can be observed that the lower and upper bounds have a good performance in following the change of real test samples, which demonstrates that the proposed approach is effective in construction of optimal PIs.

. . Performance Variation for Different Application Dataset.
The effectiveness of the proposed ELM optimization approach in determining higher quality PIs is demonstrated by applying it to PIs constructions for the predictions of load demand. The load demand data are from Tasmania regional market in Australian National Electricity Market Mathematical Problems in Engineering 7 (a) Initialize parameters: s=1, d=1. (b) Tumble: Generate a random vector û, with each element û 푑 belongs to a random number on [−1, 1]. (c) Move: Update 푔,푑 ( + 1) according to (19) and calculate its new fitness . (d) Swim: Let m=0 (counter for swimming length), while m is smaller than the maximum swimming length 푠 , let m=m+1.
If F is smaller than ( 푔 ), then update 푔 . Else, set m to be the maximum swimming length. (e) Loop: If < , d=d+1, go to step (b). In this case, continue the foraging operation for next dimension variable. (f) Loop: If < 푐 , s=s+1, update the chemotactic step size ( ) according to (21) and go to step (b). Else, terminate the life of bacterial foraging operation.
Algorithm 1: Bacterial Foraging Searching Mechanism.  (ANEM) [39]. The chosen time periods are from Jan 2016 to June 2016 with half an hour trading interval, out of which the first three months is used for training the ELM model and the rest of the data are used for testing the prediction performance of the proposed algorithm. Half-hour ahead PIs are implemented for load demand. The numerical results of different case studies are given in Table 3 in terms of the reliability index PICP, the sharpness index PINAW, and the robustness index AWE. At two different confidence levels, it can be seen from Table 3 that the proposed method generally has fairly satisfactory coverage probability and lower values for PINAW index and AWE index (i.e., higher sharpness and robustness) than traditional methods.
The obtained PIs for load demand data with PINC=90% are displayed in Figure 7, where the actual measured data are covered by the constructed PIs in a great percentage.  According to Figure 7, it can also be observed that the lower and upper bounds have a good performance in following the change of real test samples for all case studies. Therefore, experimental testing results demonstrate the effectiveness of the proposed ELM-based interval forecasting approach in improving the quality of obtained PIs with a combination of higher reliability, sharpness and robustness.

Conclusions
The ELM based PIs construction method has been applied and extended for PIs optimization in this paper. For the training of ELM model, a new evaluation index that focuses on the PIs error information has been developed to evaluate the robustness of PIs. A novel objective function for PIs   optimization has also been formulated to comprehensively account for the properties of reliability, sharpness, and robustness. To solve the proposed nonlinear objective function in the training phase of the ELM, an improved QPSO with the bacterial foraging mechanism has been proposed to minimize the proposed objective function and optimize the parameters of the ELM model. To make a balance between the exploitation and exploration of the search space, this paper also develops a dynamic approximation control strategy to update the chemotactic step size. The effectiveness of proposed method has been validated by using it to construct optimal PIs for different datasets. The results have illustrated that the proposed ELM-based method can provide much higher quality interval forecasting information.

Data Availability
Data generated by the authors or analyzed during the study are available from the following options: (1) the data related to this research is available upon request by contacting the corresponding author. (2) All data generated or analyzed during the study are included in the published paper.