Wind Power Interval Forecasting Based on Confidence Interval Optimization

Abstract: Most current wind power interval forecast methods assume that the point forecast error follows a known distribution (such as a normal or beta distribution). The interval forecast of wind power is then obtained by solving the confidence interval of that distribution. However, this assumption does not reflect reality, because the error distribution is random and does not necessarily obey any known distribution. Moreover, the current methods for calculating the confidence interval are only suited to known distributions. Therefore, those interval forecast methods cannot be applied generally, and their forecast quality is limited. In this paper, a general method is proposed to determine the optimal interval forecast of wind power. Firstly, the distribution of the point forecast error is estimated using the non-parametric Parzen window method, which is suitable for distributions of arbitrary shape. Secondly, an optimization method is used to find the minimum confidence interval of an arbitrary distribution. Finally, the optimal forecast interval is obtained. Simulation results indicate that this method is not only generally applicable, but also achieves a better comprehensive evaluation index.


Introduction
With the development of advanced generation technologies, the amount of renewable energy generation has increased enormously in recent years [1,2]. As a clean and widely available resource, wind energy has become one of the most popular renewable sources. However, the intermittent nature and volatility of renewable energy pose serious challenges for the safety and stability of the electric power system when large-scale renewable generation is connected to it [3]. Accurate forecasting of renewable power generation is the basis for the energy internet. Traditional forecast methods are mainly point forecasts with a certain degree of error. Interval forecasts can effectively describe the uncertainty of wind power, so they have attracted more attention in recent years [4,5].
Point forecast methods generate only a deterministic forecast value for a given time, without any indication of the associated uncertainty. Point forecasts are based on climate parameters (such as wind speed, wind direction, and temperature) and historic wind power data. Interval forecasting builds on the point forecast result. From the errors between the point forecasts and the real wind power, an error distribution is obtained. By analyzing this distribution, we can obtain the confidence interval of the errors at a given confidence level. The interval forecast at that confidence level is then obtained by adding the lower and upper bounds of the errors' confidence interval to the point forecast. Compared with point forecasting, interval forecasting provides additional quantitative information about the uncertainty of wind power generation [6,7].
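The construction described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the error is defined as true power minus forecast power and that the same error bounds apply at every time step; the function name and the sample numbers are hypothetical.

```python
def interval_forecast(point_forecasts, err_lower, err_upper):
    """Shift each point forecast by the bounds of the error confidence interval.

    Assumes error = true power - forecast power, so the true power is expected
    to lie in [forecast + err_lower, forecast + err_upper].
    """
    return [(p + err_lower, p + err_upper) for p in point_forecasts]


# Hypothetical point forecasts (MW) and error bounds (MW), for illustration only.
forecasts = [50.0, 62.5, 48.0]
intervals = interval_forecast(forecasts, err_lower=-10.0, err_upper=15.0)
```

Each tuple in `intervals` holds the lower and upper bound of the interval forecast for one time step.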
Wind power interval forecasts help decision-makers better manage power grid planning and scheduling, risk analysis, and reliability evaluation. They are an effective way to alleviate peaking pressure, reduce the reserve capacity, and improve the capacity of the power grid to accept wind power [8]. Power losses can also be decreased. However, current interval forecast methods have some defects.
First, regarding the error distribution, most previous reports assumed that the error follows a certain known distribution. The errors are often assumed to follow a normal distribution [9,10], but the Weibull distribution [11], beta distribution [12,13] and Cauchy distribution [14] have also been used. In fact, wind power forecast errors are influenced by many factors, and there is still no specific known distribution that describes them accurately. Real error data usually do not fit any of these known distributions. Approximating the error with a known distribution reduces the accuracy of the error distribution estimate and thus harms the quality of the wind power interval forecast. Finding a general method to accurately estimate the error distribution is therefore a critical task.
Second, concerning how to solve the confidence interval: the confidence interval is not unique at a given confidence level. A narrower interval is always preferred for a meaningful interval forecast, as very wide intervals hardly yield any valuable information and may be useless in practice. If the minimum confidence interval can be obtained, the wind power interval forecast will be more accurate. Most of the literature uses the traditional symmetric point (symmetric quantile) method to obtain the confidence interval. For a symmetric distribution, the narrowest confidence interval can be obtained using this method [15,16], but for an asymmetric distribution, the confidence interval obtained by this method may not be the minimum. In [13], the minimum confidence interval is obtained for an asymmetric distribution, but the method in that paper is only suitable for an asymmetric and unimodal distribution (such as a beta distribution). It is not applicable to multimodal distributions. There are few reports on how to obtain the shortest confidence interval for distributions of arbitrary shape (asymmetric, multimodal, etc.).
Finally, regarding the evaluation index of the interval forecast: although the interval width is the key index of interval forecasts, we should not focus on it alone. In addition to the interval width, the percentage of real values falling into the forecast interval (the prediction interval coverage probability, PICP) should also be considered. Generally, the narrower the interval, the smaller the percentage of real values falling into it, and the wider the interval, the larger that percentage. The interval width and the PICP are thus contradictory and mutually constrained. There are very few studies on how to comprehensively evaluate these two factors.
In short, for wind power interval forecasting there is no general method suitable for various situations; for example, some methods only work for a normal, symmetric, or unimodal distribution. There is also a lack of comprehensive evaluation criteria for the quality of interval forecasts. In view of these problems, this study focuses on the following aspects:
• First, in order to describe the error distribution more accurately with a general method, the non-parametric Parzen window method [16,17] is introduced to estimate the error distribution. The Parzen window method is suitable for error distributions of arbitrary shape. It can be applied more generally and fits the distribution of the real data more accurately.
• Second, in order to solve the minimum confidence interval more accurately with a general method, a trust region optimization method is used to minimize the confidence interval of an error distribution of arbitrary shape. This method is suitable for distributions of arbitrary shape and can be generally applied.
• Finally, in order to evaluate the quality of the interval forecast more comprehensively and objectively, a comprehensive evaluation index, the F value, is introduced for the first time in this paper. The F value considers the two mutually constrained factors, the forecast interval width (D) and the PICP, and so evaluates the quality of the wind power interval forecast more objectively and comprehensively.
The proposed method is tested in simulation and shown to be not only suitable for distributions of arbitrary shape and generally applicable, but also to obtain a better comprehensive evaluation index. This method effectively improves the quality of the wind power interval forecast.

Parametric Estimation and Non-Parametric Estimation
In order to estimate the error distribution of the wind power point forecast, a correct error distribution model should be set up first. Finding the probability density function from the given error samples is a problem of probability and statistics.
Parametric estimation assumes that the form of the sample's probability density is known, for example, that the sample follows a normal distribution or another known distribution. However, this assumption is sometimes incorrect. In fact, the sample's probability distribution is usually unknown, so a known function cannot describe the true probability density accurately. Moreover, some classical density functions, such as the normal and beta distributions, are unimodal, while many sample probability density functions are multimodal.
Non-parametric estimation works on probability distributions of arbitrary shape and does not require a known type of probability density function. It uses neither prior knowledge of nor assumptions about the data distribution. Instead, it studies the characteristics of the data distribution from the data sample itself. Because it depends entirely on the training data, it can be used to estimate distributions of arbitrary shape and thus fits the true data distribution more accurately.

Parzen Window Estimation Method
The Parzen window estimation method, also known as kernel density estimation, is a very effective non-parametric estimation method [5]. Its general expression is:
$$f(x) = \frac{1}{Nh} \sum_{i=1}^{N} K\left(\frac{x - x_i}{h}\right) \qquad (1)$$
where $f(x)$ is the probability density function, $N$ is the total number of samples, $h$ is the bandwidth or smoothing parameter, $x_i$ is a given sample, and $K(\cdot)$ is the window function. Common window functions include the uniform, Gaussian, trigonometric, and gamma functions. When the Gaussian window function is used, the probability density function is:
$$f(x) = \frac{1}{Nh\sqrt{2\pi}} \sum_{i=1}^{N} \exp\left(-\frac{(x - x_i)^2}{2h^2}\right) \qquad (2)$$
The advantage of Parzen window estimation is its general applicability. It can be used for uniform or non-uniform, symmetric or asymmetric, and unimodal or multimodal distributions. It fits distributions of arbitrary shape and is more consistent with the true data distribution, which leads to better estimation accuracy.
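The Gaussian Parzen window estimate can be sketched directly from its definition. The following is a minimal pure-Python illustration, not the implementation used in this paper. The bandwidth rule (the normal reference rule, 1.06 times the sample standard deviation times N to the power of -1/5) is an assumption, since the choice of h is not specified here, and the error sample is hypothetical.

```python
import math


def parzen_gaussian_pdf(x, samples, h):
    """Gaussian Parzen window estimate of the density at x."""
    n = len(samples)
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-((x - xi) ** 2) / (2.0 * h * h)) for xi in samples)


def normal_reference_bandwidth(samples):
    """Rule-of-thumb bandwidth (an assumption, not taken from the paper)."""
    n = len(samples)
    mean = sum(samples) / n
    std = math.sqrt(sum((s - mean) ** 2 for s in samples) / (n - 1))
    return 1.06 * std * n ** (-0.2)


# Hypothetical bimodal forecast-error sample (MW), for illustration only.
errors = [-12.0, -11.0, -10.5, -9.0, 4.0, 5.0, 5.5, 6.0, 7.0, 20.0]
h = normal_reference_bandwidth(errors)
density_at_zero = parzen_gaussian_pdf(0.0, errors, h)
```

Because every sample contributes its own Gaussian bump, the estimate can follow several modes or a skewed shape without any distributional assumption; a smaller h gives a rougher curve and a larger h a smoother one.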

Solving the Shortest Confidence Interval
In this paper, we use an optimization method to obtain the shortest confidence interval of an arbitrary distribution. The probability density function of the error, $f(x)$, is given by Equation (2). Let $F(x)$ be the probability distribution function of the error, let $x_1$ be the lower bound and $x_2$ the upper bound of the confidence interval, and let $P(x_1 \le x \le x_2)$ be the confidence level, that is, the probability that the error falls in the interval $[x_1, x_2]$. Then:
$$P(x_1 \le x \le x_2) = F(x_2) - F(x_1) = \int_{x_1}^{x_2} f(x)\,dx \qquad (3)$$
The minimum confidence interval of the wind power forecast error at a given confidence level $1-\alpha$ is the one that minimizes $x_2 - x_1$ while also satisfying the condition:
$$P(x_1 \le x \le x_2) = 1 - \alpha \qquad (4)$$
That is:
$$\min\ (x_2 - x_1) \qquad (5)$$
Constraint condition:
$$F(x_2) - F(x_1) - (1 - \alpha) = 0 \qquad (6)$$
Writing $x = (x_1, x_2)^T$, $\phi(x) = x_2 - x_1$ and $c(x) = F(x_2) - F(x_1) - (1 - \alpha)$, the above optimization problem can be expressed as:
$$\min_{x}\ \phi(x) \qquad (7)$$
$$\text{s.t.}\ c(x) = 0 \qquad (8)$$
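To make the optimization problem concrete, the sketch below solves it by brute force for a Gaussian Parzen window estimate: it scans candidate lower bounds, solves the coverage constraint for the upper bound by bisection, and keeps the narrowest pair. This is only an illustrative stand-in under stated assumptions, not the trust region method used in this paper; the function names, the sample, the bandwidth and the grid size are all hypothetical.

```python
import math


def kde_cdf(x, samples, h):
    """Distribution function F(x) of the Gaussian Parzen window estimate."""
    s = sum(0.5 * (1.0 + math.erf((x - xi) / (h * math.sqrt(2.0)))) for xi in samples)
    return s / len(samples)


def solve_upper(x1, p, samples, h, hi):
    """Bisection for x2 with F(x2) - F(x1) = p; None if infeasible on [x1, hi]."""
    target = kde_cdf(x1, samples, h) + p
    if kde_cdf(hi, samples, h) < target:
        return None
    a, b = x1, hi
    for _ in range(60):
        mid = 0.5 * (a + b)
        if kde_cdf(mid, samples, h) < target:
            a = mid
        else:
            b = mid
    return b


def shortest_interval(samples, h, p, steps=400):
    """Scan lower bounds on a grid and keep the narrowest feasible interval."""
    lo = min(samples) - 5.0 * h
    hi = max(samples) + 5.0 * h
    best = None
    for i in range(steps + 1):
        x1 = lo + (hi - lo) * i / steps
        x2 = solve_upper(x1, p, samples, h, hi)
        if x2 is None:
            break  # F is increasing, so larger x1 values are infeasible too
        if best is None or x2 - x1 < best[1] - best[0]:
            best = (x1, x2)
    return best


# Hypothetical skewed error sample (MW) and bandwidth, for illustration only.
errors = [-12.0, -9.0, -7.5, -6.0, -5.0, -4.0, -2.0, 0.0, 3.0, 8.0, 15.0, 24.0]
h = 3.0
p = 0.9
x1, x2 = shortest_interval(errors, h, p)
coverage = kde_cdf(x2, errors, h) - kde_cdf(x1, errors, h)
```

The scan makes no assumption about symmetry or the number of modes, which is the property the optimization formulation is meant to provide; a gradient-based solver reaches the same kind of solution with far fewer function evaluations.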

Trust Region Method
There are many optimization methods for nonlinear optimization problems. This paper uses the trust region method, which has good global convergence properties and is reliable in practical computation, making it an effective numerical algorithm for nonlinear optimization problems [18,19].
The trust region method first defines a region around the current best solution, in which a certain model (usually a quadratic model) approximates the original objective function to some extent. The method then takes a step according to the model within the region. Unlike line search methods, the trust region method usually determines the step size before (or at the same time as) the improving direction. If a notable decrease is gained after the step, the model is believed to be a good representation of the original objective function. If the improvement is too small, or the objective even gets worse, the model is not believed to be a good representation of the original objective function within that region.
If we are using the quadratic model to approximate the original objective function, then our optimization problem is simplified to solving the trust-region subproblem.
The trust-region subproblem is:
$$\min_{d_k}\ \phi(x_k) + \nabla\phi(x_k)^T d_k + \frac{1}{2} d_k^T B_k d_k \qquad (9)$$
$$\text{s.t.}\ c(x_k) + d_k^T \nabla c(x_k) = 0 \qquad (10)$$
where $d_k$ is the trial step, $\nabla\phi(x_k)$ is the gradient at the current point and $B_k$ is the Hessian (or a Hessian approximation). It is easy to find the solution to the trust-region subproblem if $B_k$ is positive definite. By solving the trust-region subproblem, $d_k$ is obtained, which should satisfy Equation (11):
$$\|d_k\| \le \Delta_k \qquad (11)$$
where $\Delta_k$ is the trust region radius. The basic calculation steps are as follows:
Step 1: give the initial values $x_1 \in \mathbb{R}^n$, $\Delta_1 > 0$, $k = 1$;
Step 2: solve the trust-region subproblem and calculate $d_k$;
Step 3: if a certain condition is satisfied, accept the trial step, $x_{k+1} = x_k + d_k$; otherwise keep $x_{k+1} = x_k$;
Step 4: adjust $\Delta_{k+1}$ according to some conditions, set $k = k + 1$, and go to Step 2.
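The four steps can be sketched for the simplest possible case. The code below is a minimal one-dimensional, unconstrained trust region loop, given only as an illustration: it omits the equality constraint, and it accepts steps with the conventional ratio of actual to predicted decrease instead of the filter used in this paper. The thresholds (0.1, 0.25, 0.75), the radius factors and the test function are assumptions chosen for the example.

```python
def trust_region_minimize(f, grad, hess, x, radius=1.0, max_radius=10.0,
                          tol=1e-8, max_iter=200):
    """Minimize a smooth 1-D function with a basic trust region iteration."""
    for _ in range(max_iter):
        g, b = grad(x), hess(x)
        if abs(g) < tol:
            break
        # Step 2: minimize m(d) = f(x) + g*d + 0.5*b*d^2 subject to |d| <= radius.
        if b > 0 and abs(g / b) <= radius:
            d = -g / b                        # interior (Newton) step
        else:
            d = -radius if g > 0 else radius  # downhill step to the boundary
        predicted = -(g * d + 0.5 * b * d * d)  # decrease promised by the model
        actual = f(x) - f(x + d)                # decrease actually obtained
        rho = actual / predicted
        # Step 3: accept the trial point only if the model was good enough.
        if rho > 0.1:
            x = x + d
        # Step 4: shrink the radius after a poor step, grow it after a good
        # step that was limited by the boundary.
        if rho < 0.25:
            radius *= 0.25
        elif rho > 0.75 and abs(d) == radius:
            radius = min(2.0 * radius, max_radius)
    return x


def f(x):
    return x ** 4 - 3.0 * x ** 2 + x


def grad(x):
    return 4.0 * x ** 3 - 6.0 * x + 1.0


def hess(x):
    return 12.0 * x ** 2 - 6.0


x_min = trust_region_minimize(f, grad, hess, x=2.0)
```

Starting from x = 2.0, the iteration settles on the nearby local minimizer of this non-convex test function. Replacing the ratio test in Step 3 with a filter check gives the structure of the filter variant.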

Filter Concept
For a constrained optimization problem, whether the line search method or the trust region method is used, it is usually necessary to select a suitable penalty function. The penalty function is required to decrease gradually over the iterations, but selecting the penalty factor in the penalty function method is difficult. To overcome the difficulties caused by the penalty factor selection, Fletcher and Leyffer proposed the filter concept [20,21]. Iteration points are then evaluated not with a penalty function, but with a two-dimensional array.
The filter method introduces a two-dimensional filter to determine whether a new iteration point can be accepted. The two-dimensional filter has two parameters: the objective function value $\phi_k$ and the constraint violation $h_k$.
When the constraints are equality constraints, the filter entry is the pair:
$$(\phi(x_k),\ h_k), \quad h_k = \|c(x_k)\| \qquad (12)$$
where $\phi(x_k)$ is the objective function value when $x = x_k$, $h_k = \|c(x_k)\|$ is the constraint violation when $x = x_k$, and $\|\cdot\|$ is the norm in $\mathbb{R}^n$.
If the objective function value $\phi(x_k)$ or the constraint violation $\|c(x_k)\|$ is reduced, the step is accepted by the filter; this is the basic idea of the filter. The algorithm flow chart is shown in Figure 1.
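The acceptance rule can be illustrated with a small sketch. It treats the filter as a list of (objective value, constraint violation) pairs and accepts a trial pair if, compared with every stored pair, it reduces at least one of the two quantities. This is a simplified illustration; practical filter methods also use small margins around the stored pairs, which are omitted here, and the function names and numbers are hypothetical.

```python
def is_acceptable(candidate, filter_set):
    """Accept (phi, h) if it improves phi or h relative to every stored pair."""
    phi, h = candidate
    return all(phi < phi_j or h < h_j for phi_j, h_j in filter_set)


def add_to_filter(candidate, filter_set):
    """Store an accepted pair and drop the stored pairs that it dominates."""
    phi, h = candidate
    kept = [(p, v) for p, v in filter_set if p < phi or v < h]
    kept.append(candidate)
    return kept


# Hypothetical filter contents: (objective value, constraint violation).
current_filter = [(10.0, 0.5), (8.0, 1.0)]
accepted = is_acceptable((9.0, 0.4), current_filter)    # improves h on both pairs
rejected = is_acceptable((11.0, 0.6), current_filter)   # worse than (10.0, 0.5)
updated = add_to_filter((7.0, 0.3), current_filter)
```

No penalty factor appears anywhere: a trial point only has to avoid being worse in both the objective and the constraint violation than a point already seen.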

Some Explanations
In Step 3, "a certain condition" refers to whether the pair $(\phi_k, h_k)$ is accepted by the filter set. In Step 4, "$\Delta_{k+1}$ is adjusted according to some conditions" means that if the new point is accepted, the trust region radius $\Delta_{k+1}$ is increased, and if the new point is rejected, the trust region radius $\Delta_{k+1}$ is decreased and the step is recomputed. The basic idea of this method is that in every step, the trust region approach produces a new iterate, and filter rules are then used to determine whether this point is accepted by the filter set.

Interval Forecast Evaluation Index
An important evaluation index of an interval forecast is the interval width D. A narrower interval width D is always preferred; if the forecast interval is too wide, the interval forecast is meaningless.
Should the evaluation index of an interval forecast then consider only the interval width D? In fact, the forecast accuracy, that is, the percentage of real values falling into the forecast interval, should also be considered [22,23].
A high quality interval forecast is expected to have higher reliability and a narrower width. To assess these two aspects, two indices, the prediction interval coverage probability (PICP) and the interval width D, are employed to quantitatively measure the forecast intervals. PICP shows the probability that target values will be covered by the forecast intervals. A larger PICP value indicates that more targets lie in the interval.
The prediction interval coverage probability (PICP) expresses the percentage of real values falling into the forecast interval:
$$\mathrm{PICP} = \frac{1}{N} \sum_{i=1}^{N} \kappa_i$$
where $N$ is the number of test samples, and $\kappa_i = 1$ if the $i$-th test sample is located in the forecast interval and $\kappa_i = 0$ otherwise. We expect a better forecast result when the interval width D is narrower and the PICP is larger. However, D and the PICP restrict each other. With the same distribution and at the same confidence level, the narrower the D, the smaller the PICP, and the wider the D, the larger the PICP. How to find the balance between D and PICP, and how to comprehensively evaluate these two contradictory indexes, is therefore a pressing problem.
To solve this problem, a comprehensive evaluation index from the field of information retrieval, the F value, is introduced for the first time in this paper. The F value can comprehensively evaluate two factors that restrict and contradict each other. Here the F value is defined as the weighted harmonic average of PICP and 1/D. It evaluates both the confidence interval width and the interval forecast accuracy, and thus the overall quality of the interval forecast. The F value gives us an easy way to evaluate the quality of an interval forecast: a higher F value means the tested method is more effective.
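The two indices and their combination can be sketched as follows. The PICP function follows the definition above. For the F value, the code assumes the usual information-retrieval form of a weighted harmonic mean, with PICP and 1/D in the roles of the two scores and a weight parameter beta; the weights actually used in this paper are not reproduced here, so `f_value` and its `beta` argument are illustrative assumptions.

```python
def picp(actual, lower, upper):
    """Fraction of real values that fall inside their forecast interval."""
    hits = sum(1 for y, lo, up in zip(actual, lower, upper) if lo <= y <= up)
    return hits / len(actual)


def f_value(picp_value, width, beta=1.0):
    """Weighted harmonic mean of PICP and 1/D (assumed F-measure form)."""
    inv_width = 1.0 / width
    b2 = beta * beta
    return (1.0 + b2) * picp_value * inv_width / (b2 * picp_value + inv_width)


# Hypothetical real values and interval bounds (MW), for illustration only.
real = [48.0, 55.0, 70.0, 41.0]
low = [40.0, 45.0, 50.0, 42.0]
up = [60.0, 65.0, 68.0, 62.0]
coverage = picp(real, low, up)
score = f_value(coverage, width=20.0)
```

With beta = 1 the two terms are weighted equally. For a fixed PICP, a narrower interval gives a higher score, and for a fixed width, a higher PICP gives a higher score, which is the trade-off the F value is meant to capture.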

Case Analysis
The field data of a wind farm located in Ohio (USA) from 1 January 2011 to 31 December 2011 were analyzed using our proposed method. Data were collected with a sampling interval of ten minutes, and the maximum output of the wind farm was 150 MW.
The wind power point forecast error at different time points (see Figure 2) was calculated by subtracting the forecast power from the true power (see Figure 3). The Parzen window method was used to estimate the error distribution, and then the optimization method was used to minimize the confidence interval at a given confidence level.

Estimation of Wind Power Forecast Error Distribution by Parzen Window Method
For the given error samples, the probability density function of the error was obtained using the Parzen window estimation method, as shown in Figure 4. For comparison, the error frequency histogram and two known distributions are also drawn in the figure: the normal distribution, which is commonly used, and the gamma distribution, which a hypothesis test identified as the best fit to the sample among the known distributions. Figure 4 thus shows the wind power forecast error frequency histogram, the normal density, the gamma density, and the Parzen window estimate. The Parzen window estimate clearly fits the real error distribution better than the normal and gamma distributions. It is neither symmetric nor unimodal; it has an arbitrary shape.

To illustrate the general applicability and accuracy of the Parzen window estimation method, we take several other groups of error data, estimate the distribution of each, and compare them with a normal distribution, as shown in Figure 5. From Figures 4 and 5, we can see that the Parzen window method fits the real distribution of various data better, with higher accuracy and better generality. Thus, the Parzen window estimation method is suitable for distributions of arbitrary shape, more generally applicable, and more consistent with the data.

Solving the Minimum Confidence Interval
As shown above, the errors' probability density function is asymmetric and multimodal (the black line in Figure 4). The minimum confidence interval at a given confidence level can be obtained using the optimization method for arbitrary error distributions described above.
Using the general method in this paper, we obtain the narrowest interval width. When the confidence level is 90%, the narrowest interval is (−14.6, 21.2) and the interval width D is 35.8, as shown in Figure 6.
Figure 6. The minimum confidence interval when the confidence level is 90% (using the general method in this paper).
For comparison, we also use the traditional method (symmetric point method) to calculate the width of the confidence interval. When the confidence level is 90%, the significance level is:
$$\alpha = 1 - 0.90 = 0.10$$
Then the lower and upper symmetry points are −13.92 and 25.31, respectively, so the interval length is 39.2, as shown in Figure 7.
Figure 7. The confidence interval when the confidence level is 90% (using the symmetric quantiles method).
Table 1 shows the confidence interval results using the two methods when the confidence levels are 80%, 85%, 90%, and 95%. From Table 1, the confidence intervals obtained by the general method we propose are shorter at every confidence level. Thus our general method is not only suitable for distributions of arbitrary shape, but also effectively reduces the confidence interval width.

Comparison of Confidence Intervals of Different Distributions
To further evaluate our general method, we compared the confidence intervals of different distributions (normal distribution and gamma distribution) using the same set of error samples described above. First, assuming that the error obeys the normal distribution, the mean and the standard deviation of the normal distribution were calculated to be µ = 0.0969 and σ = 11.1033. The probability density function curve is shown as the red dotted line in Figure 4.
To solve the confidence intervals of this normal distribution, we set the confidence level P from 80% to 95% and calculate the symmetry points. When P = 90%, the symmetry points are −17.3 and 19.3, respectively. Thus the width of the confidence interval is 36.6. The confidence intervals at the other confidence levels for the normal distribution are summarized in Table 2. Similarly, assuming that the error obeys the gamma distribution, the two parameters were calculated to be α = 3.6634 and β = 5.7139. The probability density function curve is shown as the green line in Figure 4. The confidence intervals at different confidence levels for the gamma distribution are shown in Table 2.
Comparing the data in Table 2, the method we propose gave the narrowest confidence intervals at different confidence levels among the three tested methods.
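The difference between the two ways of building an interval can be reproduced on a small scale. The sketch below computes the symmetric-quantile interval of a fitted normal distribution and, for contrast, the narrowest interval that covers a given share of a raw sample. The second function works directly on sorted samples and is a simple stand-in for the Parzen window plus optimization approach, not the method of this paper; the function names and the sample are hypothetical.

```python
import math
from statistics import NormalDist


def normal_symmetric_interval(mu, sigma, p):
    """Symmetric-quantile interval of a normal distribution at confidence level p."""
    alpha = 1.0 - p
    dist = NormalDist(mu, sigma)
    return dist.inv_cdf(alpha / 2.0), dist.inv_cdf(1.0 - alpha / 2.0)


def empirical_shortest_interval(samples, p):
    """Narrowest window of sorted samples containing at least a share p of them."""
    xs = sorted(samples)
    k = math.ceil(p * len(xs))  # number of points the window must contain
    widths = [(xs[i + k - 1] - xs[i], i) for i in range(len(xs) - k + 1)]
    _, i = min(widths)
    return xs[i], xs[i + k - 1]


# Symmetric 90% interval for the fitted normal parameters quoted above,
# treating sigma as a standard deviation.
lo_n, hi_n = normal_symmetric_interval(0.0969, 11.1033, 0.90)
normal_width = hi_n - lo_n

# Hypothetical right-skewed sample, for illustration only.
skewed = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 50.0, 60.0]
lo_s, hi_s = empirical_shortest_interval(skewed, 0.75)
```

For the fitted normal parameters the symmetric 90% width comes out at about 36.5, close to the 36.6 reported above. On the skewed sample, the shortest window sits on the dense left part of the data instead of being centered, which is why it can be narrower than a symmetric interval with the same coverage.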

The Interval Forecast Result
The interval forecast results were obtained according to the upper and lower bounds of the narrowest confidence interval in the Table 2, shown in Figures 8-11. The red dotted lines represent the upper and lower bounds. To solve the confidence intervals of this normal distribution, we set the confidence level P at 80%-95% and calculate the symmetry points. When P = 90%, the symmetry points are −17.3 and 19.3, respectively. Thus the width of the confidence interval is 36.6. Other confidence intervals of different confidence levels for normal distribution are summarized in Table 2.
Similarly, assuming that the error obeys the gamma distribution, the two parameters were calculated to be α = 3.6634 and β = 5.7139. The probability density function curve was shown in the green line in Figure 4. The confidence intervals of different confidence levels for the gamma distribution are shown in Table 2.
Comparing the data in Table 2, the method we propose gave the narrowest confidence intervals at different confidence levels among the three tested methods.

The Interval Forecast Result
Figure 10. The interval forecast result when the confidence level is 90%.
Figure 11. The interval forecast result when the confidence level is 95%.
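As a rough illustration of how a narrowest error interval can be turned into a forecast band, the following minimal sketch estimates the error distribution with a Gaussian-kernel Parzen window, searches for the shortest interval holding probability P, and shifts that interval onto the point forecasts. It rests on several assumptions that are not taken from the paper: the error sample and point forecasts are made up, the bandwidth uses a common rule of thumb, the error is taken as actual minus forecast, and a simple grid search stands in for the trust region method that the paper adopts for this step. The function names are hypothetical.

```python
import math
import statistics


def parzen_cdf(samples, h):
    """CDF of a Gaussian-kernel Parzen window estimate with bandwidth h."""
    n = len(samples)
    scale = h * math.sqrt(2.0)

    def cdf(x):
        return sum(0.5 * (1.0 + math.erf((x - s) / scale)) for s in samples) / n

    return cdf


def narrowest_interval(cdf, lo, hi, p, steps=2000):
    """Shortest [a, b] on a uniform grid over [lo, hi] with cdf(b) - cdf(a) >= p.

    Grid search used here for simplicity (not the paper's trust region method).
    Returns None if no grid interval reaches probability p.
    """
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    F = [cdf(x) for x in grid]
    best = None
    j = 0
    for i in range(len(grid)):
        j = max(j, i)
        # Advance the right end until the interval holds at least p.
        while j < len(grid) and F[j] - F[i] < p:
            j += 1
        if j == len(grid):
            break  # neither this left end nor any later one can reach p
        if best is None or grid[j] - grid[i] < best[1] - best[0]:
            best = (grid[i], grid[j])
    return best


def forecast_band(point_forecasts, a, b):
    """Shift the error interval [a, b] onto each point forecast.

    Assumes error = actual - forecast (a sign convention chosen for this sketch).
    """
    return [(f + a, f + b) for f in point_forecasts]


# Made-up, right-skewed error sample; not data from the paper.
errors = [-6.0, -4.5, -3.0, -2.0, -1.0, -0.5, 0.0, 0.5, 1.5, 3.0, 6.0, 11.0, 19.0]
# Rule-of-thumb bandwidth (an assumption; the paper's choice may differ).
h = 1.06 * statistics.stdev(errors) * len(errors) ** (-0.2)
cdf = parzen_cdf(errors, h)
a, b = narrowest_interval(cdf, min(errors) - 4 * h, max(errors) + 4 * h, p=0.90)
band = forecast_band([120.0, 135.0, 128.0], a, b)
```

Because the search only requires the interval to hold probability P, the resulting bounds need not be symmetric about the mean, which is why such an interval can be shorter than a symmetric one when the error distribution is skewed.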

Result Analysis
From Sections 5.2-5.4, the narrowest forecast interval was obtained. Does this mean the interval forecast has the best quality? Not necessarily, because the PICP also needs to be considered. Ideally, the interval width D would be as narrow as possible and the PICP as large as possible, but the two goals conflict: in general, a narrower D gives a smaller PICP and a wider D gives a larger PICP.
We calculate the PICP of the different distributions at each confidence level according to Formula (6). PICP and D are shown in Table 3, and Figure 12 shows the relationship between D and PICP. From Figure 12 we can see that PICP decreases as D decreases and increases as D increases.
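The trade-off between the two indexes can be made concrete with a small sketch. It assumes that Formula (6) is the usual definition of PICP, namely the fraction of observations that fall inside their forecast interval, and it takes D as the average interval width. The function names and the numbers are illustrative only and are not taken from the paper.

```python
def picp(actuals, lowers, uppers):
    """Fraction of observations that fall inside their forecast interval."""
    hits = sum(1 for y, lo, up in zip(actuals, lowers, uppers) if lo <= y <= up)
    return hits / len(actuals)


def mean_width(lowers, uppers):
    """Average interval width D."""
    return sum(up - lo for lo, up in zip(lowers, uppers)) / len(lowers)


# Toy numbers, not from the paper.
actuals = [10.0, 12.0, 9.0, 15.0, 11.0]
point = [11.0, 11.5, 10.0, 12.0, 11.0]

# Two bands around the same point forecasts: half-width 1 and half-width 4.
narrow = ([f - 1.0 for f in point], [f + 1.0 for f in point])
wide = ([f - 4.0 for f in point], [f + 4.0 for f in point])

picp_narrow, d_narrow = picp(actuals, *narrow), mean_width(*narrow)
picp_wide, d_wide = picp(actuals, *wide), mean_width(*wide)
```

On these toy numbers the narrow band (D = 2) covers four of the five observations, giving a PICP of 0.8, while the wide band (D = 8) covers all five, giving a PICP of 1.0.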

Evaluation Index of the Quality of Wind Power Interval Forecast
To reflect the conflicting indexes PICP and D together, an F value is introduced to evaluate the overall quality of the forecast. The F value was calculated according to Equation (14), and Table 4 shows the F values at the different confidence levels. For convenience, the index D is normalized. From Table 4, we observe that at every confidence level the general method in this paper gave the narrowest interval width, though its PICP was not always the largest. It also gave the largest F value (shared with the gamma distribution at the 85% confidence level), indicating the best interval forecast quality.
Notice that, at the 85% confidence level, the F value of the gamma distribution equals that of the general method in this paper. Table 4 shows that although the D of the gamma distribution is wider than that of the method in this paper, its PICP is also larger. Taking the two indexes together, the two methods have the same F value, which shows that at the 85% confidence level the interval forecast based on the gamma distribution is as good as that of the method in this paper.
We also observed that, at the 90% confidence level, the F value of the normal distribution (1.002) is very close to that of the method in this paper (1.003). Although the D of the normal distribution is the widest, its PICP is the largest, and the combination of the two indexes brings its F value close to that of the method in this paper. This indicates that the interval forecast based on a normal distribution performs nearly as well at the 90% confidence level.
The F value comprehensively reflects two aspects: the confidence interval width D and the prediction interval coverage probability (PICP). From Table 4, we can see that at every confidence level the F value obtained by the general method presented in this paper is the largest (equal to that of the gamma distribution at the 85% level), that is, its comprehensive index of interval forecast is the best. This shows that, among the three methods compared, the interval forecasts based on Parzen window estimation and confidence interval optimization have the best quality.

Conclusions
This paper proposed a general method for the optimal interval forecasting of wind power. The following conclusions can be drawn:

1. First, the Parzen window method was tested as a general method for estimating the distribution of the point forecast error. This method is suitable for distributions of arbitrary shape, generally applicable, and more accurate.

2. Then, a trust region method was adopted to calculate the minimum confidence interval length of the arbitrary-shape distribution. Compared with the traditional methods, it is not only generally applicable but also effectively shortens the confidence interval length.