Probabilistic Load Flow Calculation of Power System Integrated with Wind Farm Based on Kriging Model

Because of the randomness and uncertainty, integration of large-scale wind farms in a power system will exert significant influences on the distribution of power flow. This paper uses polynomial normal transformation method to deal with non-normal random variable correlation, and solves probabilistic load flow based on Kriging method. This method is a kind of smallest unbiased variance estimation method which estimates unknown information via employing a point within the confidence scope of weighted linear combination. Compared with traditional approaches which need a greater number of calculation times, long simulation time, and large memory space, Kriging method can rapidly estimate node state variables and branch current power distribution situation. As one of the generator nodes in the western Yunnan power grid, a certain wind farm is chosen for empirical analysis. Results are used to compare with those by Monte Carlo-based accurate solution, which proves the validity and veracity of the model in wind farm power modeling as output of the actual turbine through PSD-BPA.


Introduction
With the large-scale integration of renewable energy sources such as wind power and photovoltaics (PV) into the grid, there are more and more random factors in the operation of power system, which intensifies the uncertainty of system operation [1]. The calculation of probabilistic load flow (PLF) can take into account the influence of random disturbance or uncertain factors in the operation of power system on steady state operation of system, comprehensively evaluate the weak points and potential risks in the operation of system, and provide valuable information for planning and scheduling department to make decisions, which has always been a focus of research [2,3].
The common calculation methods of PLF include simulation methods and analytical methods. In simulation methods, Monte Carlo simulation (MCS) method [4] owns high accuracy when the sampling scale is large enough, but it takes a long time, so it is usually used as a standard method to evaluate the accuracy of other PLF methods. In analytical methods, convolution method performs convolution calculation according to the probability distribution function of input random variable and obtains the probability distribution characteristics of output random variable [5], which concept is clear but calculation burden is large. In order to solve the above problems, an approximate method is introduced, that is, on the premise of not reducing the accuracy, a mathematical model with a small amount of calculation and a calculation result similar to the actual simulation result is constructed to replace the actual simulation program. Moreover, common approximate models include polynomial response surface method (RSM) [6], artificial neural network (ANN) model [7], and support vector machine (SVM) function model [8], etc. However, polynomial response surface model has a lower fitting degree for nonlinear problems; the characteristics of SVM model varies with its used functions; ANN model is a "black box" model, which lacks strict statistical test and is easy to fall into local extremum. Kriging model is a kind of smallest unbiased variance estimation method estimating the unknown information, which can achieve global approximation in the design space and has high fitting accuracy [9,10].
Both analytical and approximate methods assume that random variables in probabilistic power flows are independent of each other. When it deals with random variables with correlation, new steps are needed to solve the correlation problem of random variables. Haesen et al. [11] used Nataf inverse transformation to predict wind speed to obtain a wind speed series with correlation. Liu et al. [12] transformed multidimensionally correlated non-normal random variables into independent normal random variables based on marginal probability distribution. Bin et al. [13] proposed a simulation method of correlated random variables based on Copula theory and rank correlation coefficient based on cumulative distribution function of random variables. The above correlation modeling methods all need to know the probability distribution function of random variables, but for non-normal multi-dimensional random variables, it is difficult to give a complete joint probability distribution [14].
Kriging is a regression algorithm for spatial modeling and prediction (interpolation) of stochastic process based on covariance function. This paper proposes a combined Kriging model and polynomial normal transformation (PNT) method to calculate PLF of wind farm access to system. Firstly, PNT is used for wind speed prediction to obtain wind speed sequences related to wind turbine clusters, while output model of wind turbine cluster is established. This method does not need to know the probability distribution function of input random variable, but only needs to use polynomial normal transformation according to its digital characteristics to quickly and accurately obtain relevant wind speed sequences. Then, taking the output of PNT as input, relationship between system response and random input is established by Kriging model, thus probability statistics of system response can be calculated. Finally, a wind farm in Yunnan power grid is taken as an example to verify the effectiveness, accuracy, and rapidity of the proposed method.

Polynomial Normal Transformation
In this paper, PNT is adopted to transform the non-normally correlated multi-dimensional random variables into the normal uncorrelated variable space, take sampling points in the normally uncorrelated variable space, and then inversely transform these sampling points to the original non-correlated variable space. This transformation-inverse transformation process enables the method to deal with the PLF problem with correlated random variables [15,16].
Common polynomial normal transformation methods mainly include third-order polynomial normal transformation (TPNT) [17], fifth-order polynomial normal transformation (FPNT) [18] and ninth-order polynomial normal transformation (NPNT) [19]. Considering the calculation accuracy and complexity of polynomial normal transformation, this paper employs the most widely applied third-order polynomial normal transformation (TPNT).
Assume a set of correlated multidimensional random variables, denoted X ¼ x 1 ; x 2 ; Á Á Á ; x m ½ T , where the mean value and standard deviation of x i is l x i and r x i , and the correlation coefficient matrix of these variables is expressed by where q x i ; x j represents the correlation coefficient between x i and x j .
Besides, x i can be expressed as a third-order polynomial of a standard normal distribution vector, i.e., where a 0;i , a 1;i , a 2;i , and a 3;i denote polynomial coefficients.
The polynomial coefficient can be obtained from the probabilistic weighted moment (PWM) of random variable. Suppose PWM of random variable x i is b ic , and its unbiased estimation [20] can be expressed as The linear moment of x i can be obtained from PWM of the variable, as follows [21] 1 The coefficient of the polynomial can be obtained according to the linear moment of x i , as follows The above transformation transforms non-normal random variables into normal random variables. In order to ensure that the correlated coefficient matrix of random variables remains unchanged in the transformation process, correlated coefficient matrix R Z between variables that obey normal distribution needs to be further given by The detailed relationship between R X and R Z can be referenced in literature [22], as follows where q z i ;z j is the correlated coefficient between z i and z j ; r x i and r x j denote the standard deviation of x i and x j , respectively; l x i and l x j mean the expected value of x i and x j , respectively.
Solve the above unary cubic equation and select the root that satisfies À1 q z i ;z j 1, q x i ;x j q z i ;z j ! 0 as the value of q z i ;z j .
At this time, Z is still correlated, so it needs to be transformed into an independent normal distribution by using orthogonal transformation, as follows where the lower triangular matrix L is the Cholesky decomposition of correlation coefficient matrix T is a random variable subject to independent normal distribution.
Substitute Eq. (8) into Eq. (2), the original space X can be obtained. The calculation process of TPNT prediction of wind farm groups related to wind speed can be referred to Appendix A.

Modelling of Kriging
Kriging model was proposed by Krige, a South African geologist, in 1951, which was first applied to geostatistics. Subsequently, Sack et al. [23] applied it to engineering design, and Remero et al. [24] applied it to structural reliability analysis in 2004. Since then, the application field of Kriging model has been explored and studied continuously.
As an approximate model, the essence of Kriging model instead of actual simulation program is to approximate the relationship between system response and design variables with Kriging model, which is composed of a parametric model and a non-parametric model. It is more flexible than the simple parametric model and simultaneously overcomes the limitation of non-parametric model in processing high-dimensional data [25]. Kriging model simulates a certain point with the help of known information around this point, so it establishes a model related to known information, which requires few parameters to be determined, small calculation amount, and simple model. In addition, the estimation model can be determined through a small number of sample tests.
Kriging model assumes that the actual relationship between system response and random variables can be expressed as where b denotes the regression coefficient; f x ð Þ is the regression model; z x ð Þ is a random process, which obeys the normal distribution N 0; r 2 ð Þ, but the covariance is non-zero. The covariance of z x ð Þ can be expressed as follows where R h; x i ; x j À Á is the correlation function between sample points x i and x j ; h is the relevant parameter.
The commonly used regression models and related functions are listed in Appendices B and C respectively. Research shows that the type of regression model does not play a decisive role in the accuracy of simulation [26]. In order to simplify the calculation process, the regression value f(x) is set as 1. The existence of z x ð Þ is the fundamental reason why Kriging model is different from the response surface, and it plays a decisive role in simulation accuracy of model. In this paper, Gaussian correlation equation [27] is employed as where n N denotes the number of random variables; x i k and x j k are the kth components of x i and x j ; h k means the correlated parameter. After the regression model and correlation function are determined, the expression of approximate responseŷ x ð Þ of y x ð Þ concerning point x to be measured can be established. The known training sample is denoted as S ¼ x 1 ; x 2 ; Á Á Á ; x m ½ T , its actual response value is Y ¼ y 1 ; y 2 ; Á Á Á ; y m ½ T , and m is the capacity of training sample. Then the estimated value of any point x to be measured iŝ where f x new ð Þ is the regression quantity; R is a symmetric matrix consisting of R h k ; x i ; x j À Á with a diagonal element of 1 and a size of m Â m; F is the regression matrix Þ is the correlation vector between measured point and sample point, which can be expressed as The maximum probability estimation factor is expressed aŝ The correlated model needs to solve the relevant parameter h, and according to the maximum probability estimation, it can be obtained aŝ The relevant parameter h is determined by solving an optimization problem as follows max h>0 À m lnr 2 À Á þ ln detR ð Þ 2 (16) At this point, Kriging model is established.

Case Studies of Kriging Models
The numerical example of Eq. (17) After the model is established, the accuracy of the model should be tested. Randomly generate 100 sets of sample points to be tested in the sampling space, and substitute them into Kriging model and Eq. (17) to obtain the corresponding predicted and actual values. Their relative error is provided in Tab. 2, which shows that the established Kriging model owns good approximate accuracy. Fig. 1 indicates the mean relative error between the predicted value and the actual value of Kriging model. It can be seen from that the accuracy of Kriging model is related to the size of training sample. The larger the training sample size is, the higher the accuracy of Kriging model will be.

Calculation Steps
In the deterministic power flow calculation, node injection power equation and branch power equation [29] of system can be expressed by where W is the power injected into the node; X is the node voltage; L is the branch current; f is the power equation injected into the node; g is the branch power equation.
This paper defines output of wind turbine clusters in the node injected power as input variable and the amplitude of node voltage and the power of branch as output variable. When the input variable is a random variable, the output variable becomes a random variable. PLF problem is to obtain the probability distribution or digital characteristics of output variable when input variable is random.
During calculation, the sample value of output of n groups of wind turbine clusters is firstly generated by TPNT, and then the deterministic power flow is calculated by employing sample value of each group to obtain the corresponding node voltage and branch power of n groups, and then Kriging model is established [30]. Then, TPNT generates output sample values of N groups of wind turbine clusters, and Kriging model is used to obtain the estimated values of node voltage and branch power corresponding to N groups, calculate their probability statistics and compare them with the exact solution based on Monte Carlo method.
The calculation process is shown in Fig. 2 while the basic steps are as follows: 1. Input basic data: the number of wind turbine clusters m, the training sample size N , the sample size n to be tested; 2. TPNT method is used to generate a series of wind speed sequences with a scale of n Â m and calculate the output of the corresponding wind turbine clusters; 3. The node voltage and branch power corresponding to N groups are calculated by deterministic power flow calculation, and Kriging model is established; 4. TPNT method is used to generate a set of wind speed series with a scale of N Â m, and the corresponding output of wind turbine clusters are calculated; 5. The estimated values of node voltage and branch power corresponding to N groups are obtained by Kriging model; 6. The probability statistics of Kriging model results are calculated and compared with the exact solution based on Monte Carlo method.

Case Studies
Based on historical wind speed data of an actual wind farm in Yunnan Power Grid, Kriging model is established, and the power flow calculation is performed on PSD-BPA through command line programming [31]. All case studies are undertaken by Matlab 2019a through a personal computer with IntelRcoreTMi7 CPU at 2.0 GHz and 32 GB of RAM. Monte Carlo method runs 5000 times to obtain near the wind farm node voltage and branch power of statistical information as accurate values, and 50 times in comparison with the results of Kriging model simulation input.
This wind farm has four wind turbine clusters, with installed capacity of 49.5 MW, 42 MW, 34.5 MW, and 42 MW respectively, and total installed capacity of 168 MW, operating with constant power factor of 0.95. Fig. 3 is a regional grid structure diagram of wind farm access system, which is connected to 220 kV main network in the 220 kV area by Node 2 through 110 kV line of two LGJ-300 conductors [32,33]. Fig. 4 shows the probability density curve of historical wind speed sequences and TPNT generated wind speed sequences (wind turbine cluster 1 and wind turbine cluster 2 are taken as examples). It can be seen from that wind speed sequences generated by TPNT are very close to the measured historical data, which indicates that TPNT method is effective.
Suppose historical wind speed and parameters of marginal probability distribution function of wind speed generated by TPNT are g ar and g at , respectively. The relative error index of the parameters of marginal probability distribution function can be expressed by Eq. (19) where g ar and g at are the historical given value of parameter a and the fitting value of TPNT method, respectively.

Start
Set the number of wind turbine clusters, the training sample size, and under test sample size to m, n, and N, respectively.
According to Eq. (2), generate a set of training sample related wind speed sequence S via TPNT, and establish the output model of wind turbine clusters.
Calculate basic power flow to obtain its true response Y.
Determine the regression quantity, correlation function and related parameters; construct Kriging prediction model according to Eq. (12).
Generates a set of related wind speed series X of samples to be tested applying TPNT according to Eq. (2), and establish the output model of wind turbine clusters.
Substitute the sequence of the sample to be tested into Kriging prediction model of Eq. (12) to obtain the estimated output response .
Calculate the probabilistic and statistical information of Y and y.

Compare and analyze statistical results
End Generate a set of training sample related wind speed sequence via TPNT based on Eq. (2), and establish the output model of wind clusters.
Calculate output response y employing Monte Carlo power flow calculation method. In order to facilitate the calculation of Eq. (19), probability distribution model of wind speed adopts Weibull distribution that fits well with the actual statistical distribution of wind speed [34], and its probability distribution function can be expressed as where v represents wind speed; k and c are the shape parameters and scale parameters of Weibull distribution, respectively, which can be approximated by the expected value of wind speed l v and standard deviation r v : where À represents the gamma function.
The accuracy of correlation model established by TPNT method can be evaluated through Eq. (19), and calculation results are shown in Tab. 3. It can be seen that the correlation samples generated by TPNT have high accuracy in fitting the parameters of marginal probability distribution function of input random variables, and the sample quality keeps improving with the increase of sampling scale.
The voltage amplitude of two nodes near wind farm and power of three branches are selected as the observed values, which are numbered as Node 1, Node 2, Branch 1, Branch 2, and Branch 3, respectively, as shown in Fig. 3.
According to the historical wind speed data of 4 wind turbine clusters, a set of relevant wind speed sequences are generated from TPNT to establish wind turbine cluster output model, which is connected to the system for 50 basic power flow calculations to obtain the corresponding 50 voltage groups of Nodes 1 and 2, along with the response values of active power of Branch 1, Branch 2 and Branch 3. Kriging model is constructed according to 50 groups of response values, and 5000 groups of node voltage and branch power are generated by the constructed Kriging model, and the statistical results of 5000 Monte Carlo simulations are compared and analyzed to verify the validity, accuracy, and rapidity of Kriging model.
In Fig. 5, the probability density distribution curve of the voltage amplitude of Node 1 and Node 2 are obtained by Kriging model, and compared with the simulation result based on Monte Carlo method. As can be seen from Fig. 5, the results of 50 times Kriging model method have a high degree of fitting with 5000 times MCS, indicating that Kriging model can well simulate system output.    Let the simulation results of Kriging model and Monte Carlo method be n mc and n krig , respectively, and the relative error between them is: Tab. 4 compares mean value and standard deviation of the grid related node voltage and branch power. It can be seen from that the mean value and standard deviation of the Kriging model results and the relative error between the exact value are below "0.2%" and "3%", respectively, which once again verifies the high accuracy of Kriging model. Tab. 5 shows the simulation time of two methods. It shows that simulation time of Kriging model method is far less than that of Monte Carlo method, which proves the rapidity of the method proposed in this paper, and solves the problems of the traditional random analysis method with many simulation times and long simulation time.

Conclusions
Due to the proximity of geographical locations in the same wind farm, the wind speed between different wind power clusters has a strong correlation, so that the output of each wind turbine cluster has a strong correlation. In this paper, TPNT method is adopted to model wind turbine clusters with relevant wind speed, and Kriging model is applied to PLF analysis of system. An example of a real wind power farm in Yunnan power grid shows that Kriging model owns 100 times less than MC computation burden. Kriging also owns higher simulation accuracy, which verifies the effectiveness, accuracy, and rapidity of this method.  This paper provides an effective modeling and analysis method for studying PLF problem based on correlation. Compared with traditional random analysis method, it has fast calculation speed and less memory, and can quickly calculate system probability statistics with correlated random variables. Besides, based on calculation and analysis of PLF in this paper, Kriging model method could be applied to the Optimal power flow calculation and analysis of power system. The measured data will be used to verify feasibility of the proposed method.
Funding Statement: The authors received no specific funding for this study.
Conflicts of Interest: The authors declare that they have no conflicts of interest to report regarding the present study.