A least square support vector machine-based approach for contingency classification and ranking in a large power system

Abstract: This paper proposes an effective supervised learning approach for static security assessment of a large power system. The approach employs a least square support vector machine (LS-SVM) to rank the contingencies and predict the system severity level. The severity of a contingency is measured by two scalar performance indices (PIs): the line MVA performance index (PI_MVA) and the voltage-reactive power performance index (PI_VQ). The LS-SVM works in two steps. In Step I, both standard indices (PI_MVA and PI_VQ) are estimated under different operating scenarios; in Step II, contingency ranking is carried out based on the values of the PIs. The effectiveness of the proposed methodology is demonstrated on the IEEE 39-bus (New England) system. The approach can be a beneficial tool for fast and accurate security assessment and contingency analysis at an energy management center.


PUBLIC INTEREST STATEMENT
With the increase in population and the growing demand for electricity, power utilities are working close to their operating limits. Relevant information about the power system's state is beneficial for the operation and control of the power system at the energy management center. This paper presents a supervised learning approach for ranking contingencies in three states, namely not critical, critical, and most critical. The supervised learning model is developed from offline studies, and two standard performance indices are used. After reading this paper, readers will be familiar with static security assessment and online contingency ranking of a large power system.

Introduction
A modern power system is a complex interconnected network having multiple utilities of different nature at the generation, transmission, and distribution ends. This diverse nature of the devices makes the operation of the power system more complex than in earlier days. This complexity is also increasing due to the exponential increase in population and escalating load demands. Under these conditions, the reliable operation of the power system comes into question (Souza, Do Coutto Filho, & Schilling, 2002). The utilities at different ends are operating at their operating and security limits. To make the system highly reliable, offline studies of probable contingencies under different operating scenarios are of interest. Contingency analysis is carried out by screening and sensitivity-based ranking methods (Devaraj, Yegnanarayana, & Ramar, 2002; Niazi, Arora, & Surana, 2004; Patidar & Sharma, 2007; Refaee, Mohandes, & Maghrabi, 1999; Shanti Swarup & Sudhakar, 2006; Singh & Srivastava, 2007; Verma & Niazi, 2012). In ranking methods, standard indices are calculated. These calculations are based on the line MVA, bus voltage, and reactive power generation of the system. In screening methods, however, different load flow methods are employed, namely DC load flow, AC load flow, and local solution methods. Sensitivity-based methods are efficient but inaccurate; screening methods, on the other hand, are less efficient.
In the past few decades, this area has attracted researchers seeking to develop a foolproof contingency evaluation scheme. Souza et al. (2002) proposed fast contingency selection based on pattern search analysis; in this approach, the authors employed a Multi Layer Perceptron (MLP). However, this method was tested only on the small IEEE 24-bus system. Srivastava et al. (2000) proposed fast voltage contingency screening through a hybrid neural network, obtained by combining a filter module and a ranking modular network. Radial Basis Function Neural Networks (RBFNN) were used in several approaches (Devaraj et al., 2002; Singh & Srivastava, 2007; Srivastava et al., 2000). These networks were exploited as supervised agents to estimate the line loadings and bus voltages of different power systems. The RBFNN was employed for this problem because of its simplicity and training efficiency. Further approaches employed RBFNN for the calculation of the indices (Refaee et al., 1999). In Singh and Srivastava (2007), the mutual information method is used for selecting the features of the neural net; this method was employed to define the relationship between independent and dependent variables. The method of correlation coefficients has been discussed and employed in Verma and Niazi (2012), where the 11 best features were chosen as input features. A Euclidean-based clustering technique was applied by Jain, Srivastava, and Singh (2003) to select the appropriate number of hidden layers for the RBFNN for voltage contingency screening. For feature selection, approaches based on the class separability index and correlation coefficients were employed in many studies (Devaraj et al., 2002; Patidar & Sharma, 2007; Singh et al., 2000; Verma & Niazi, 2012).
Supervised learning approaches, namely the feed forward neural network (FFNN) (Shanti Swarup & Sudhakar, 2006; Verma & Niazi, 2012), RBFNN (Devaraj et al., 2002; Singh & Srivastava, 2007; Srivastava et al., 2000), and cascaded neural network (CNN) (Niazi et al., 2004; Singh et al., 2000), have been presented to estimate and classify the critical contingencies for many models of power networks. The most important parts of these learning approaches are input feature selection and the choice of the parameters that determine the micro and macro structure of the neural nets. In the literature, bus injections and state variables associated with generating and loading conditions were employed to generate a large database. To aggregate research in a more promising way, two major thrust areas are identified: first, the development of an intelligent feature selection algorithm that can map dependent and independent variables; and second, the use of a fast and accurate supervised learning model for accurate contingency ranking in a contemporary power system. In recent years, LS-SVM has been used as a classifier in many approaches (Ekici, 2012; Erişti, Yıldırım, Erişti, & Demir, 2013; Jain et al., 2003). Erişti et al. (2013) presented a study based on the wavelet transform to classify power quality events into fault events; these events were self-regulating faults, line energizing events, and non-fault interruption events. Nine different features were extracted for this study. Similar work is reported by Ekici (2012) to classify power system disturbances. Power load forecasting along with Ant Colony Optimization is presented by Niu, Wang, and Wu (2010). In Niu et al. (2012), optimal feature selection is performed by Ant Colony Optimization. Different neural network topologies are presented in other works (Goyal & Goyal, 2011; Ghosh & Lubkeman, 1995; Niu et al., 2010; Toha & Osman Tokhi, 2008; Williams & Zipser, 1989). Size reduction of the data and optimal feature selection are the key issues addressed in these approaches.
In view of the above literature review, the objectives of this research paper are as follows.
(i) To develop a supervised learning based model that can predict the performance indices based on MVA power flow and line voltage reactive power flow for the large interconnected standard IEEE 39-bus test system under a dynamic operating scenario.
(ii) To develop a classifier that can screen the contingencies of the power system into three states, namely not critical, critical, and most critical.
(iii) To present a comparative analysis of the reported approaches and the proposed approach based on accuracy in prediction of the PIs.
The paper is organized as follows. Section 2 contains the details and mathematical formulation of the performance indices. Section 3 discusses the philosophy of the support vector machine. Sections 4 and 5 present the proposed methodology and the simulation results. Section 6 concludes the paper and lists the main findings of this work.

Contingency analysis
Contingency evaluation is an essential practice for identifying emergency situations in power networks. Without knowing the severity and impact of a particular contingency, the system operator at the energy management center cannot initiate preventive action (Devaraj et al., 2002). Contingency analysis is therefore an important tool for security assessment. In addition, early prediction of critical contingencies, which can present a potential threat to system stability (voltage or rotor angle), helps the system operator to operate the power system in a secure state and to initiate corrective measures. In this paper, line outages at every bus in the New England system are considered as potential threats to system stability. Performance index (PI) methods are widely used for contingency ranking (Jain et al., 2003; Singh & Srivastava, 2007; Srivastava et al., 2000; Verma & Niazi, 2012). The following subsections define the performance indices used for contingency ranking.

Line MVA performance index (PI_MVA)
The literature review shows that contingency ranking is commonly performed using performance indices. System loading conditions in a modern power system are dynamic in nature and have a great impact on the performance of the power system. An index based on line MVA flow is determined to estimate the extent of overload. Equation (1) shows the mathematical representation.
$$\mathrm{PI}_{\mathrm{MVA}} = \sum_{i=1}^{N_L} \frac{W_{Li}}{M} \left( \frac{S_i^{\mathrm{post}}}{S_i^{\max}} \right)^{M} \tag{1}$$
where $S_i^{\mathrm{post}}$ is the post-contingency MVA flow of line $i$, $S_i^{\max}$ is the MVA rating of line $i$, $N_L$ is the number of lines in the system (in this study, $N_L = 46$), $W_{Li}$ is the weighting factor (= 1), and $M$ (= 2n) is the order of the exponent of the penalty function (Verma & Niazi, 2012). To avoid misranking, a high value of the exponential order (n = 4) is chosen in this paper. To classify the power system security states, the status of the power system is subdivided into three categories on the basis of the calculated PIs, as indicated in Figure 1: Class A, non-critical contingencies; Class B, critical contingencies; and Class C, most critical contingencies. Class B contingencies are related to violations of the loading limits or voltage limits. Class C contingencies, however, are not safe under any operating condition.
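The behaviour of the line MVA index can be illustrated with a short Python sketch. It assumes the usual form of the index, a sum over lines of (W_Li / M)(S_post / S_max)^M with M = 2n; the function name `pi_mva` and the sample flows and ratings are hypothetical and are not taken from the IEEE 39-bus data.

```python
def pi_mva(s_post, s_max, n=4, weights=None):
    """Line MVA performance index: sum_i (W_Li / M) * (S_post_i / S_max_i)**M, M = 2n."""
    m = 2 * n
    if weights is None:
        weights = [1.0] * len(s_post)  # unit weighting factors, as stated in the text
    return sum(w / m * (sp / sm) ** m
               for w, sp, sm in zip(weights, s_post, s_max))

# Hypothetical ratings and post-contingency flows (MVA) for three lines.
ratings = [600.0, 500.0, 900.0]
secure = pi_mva([300.0, 250.0, 450.0], ratings)      # every line at 50 % loading
overloaded = pi_mva([300.0, 600.0, 450.0], ratings)  # second line at 120 % loading
print(secure, overloaded)
```

With n = 4 the exponent is 8, so a single line loaded above its rating dominates the index while lightly loaded lines contribute almost nothing. This is the effect that a high exponent is intended to achieve when it is chosen to avoid misranking.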

Line voltage reactive performance index (PI_VQ)
The system stress is measured in terms of bus voltage limit violations and transmission line overloads. An index based on line VQ flow is determined to estimate the extent of overload. In this index, M (= 2n) is the order of the exponent of the penalty function. The first summation is a function of only the limit-violated buses and is chosen to quantify the system deficiency due to out-of-limit bus voltages. The second summation penalizes any violation of the reactive power constraints of the generating units, where Q_i is the reactive power produced at bus i, Q_i^max is the maximum limit for reactive power production of a generating unit, N_G is the number of generating units, and W_Gi is the real non-negative weighting factor (= 1). The determination of the proper value of "n" is system specific. The integer value of "n" used in this paper is 4.
In the following section, the basic details of least square support vector machines (LS-SVMs) are presented to explain the role of this supervised learning model as a regression agent and classifier (Figure 2).
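For orientation, a commonly used form of the voltage-reactive power index that matches the description of the two summations is sketched below. The second summation uses only the symbols defined in this subsection. The voltage-term symbols ($V_i$, $V_i^{\mathrm{sp}}$, $\Delta V_i^{\lim}$, $W_{Vi}$, $N_B$) are assumptions borrowed from the standard formulation of such indices, so the exact expression used in this study may differ.

```latex
\mathrm{PI}_{VQ} =
  \sum_{i=1}^{N_B} \frac{W_{Vi}}{M}
    \left( \frac{V_i - V_i^{\mathrm{sp}}}{\Delta V_i^{\lim}} \right)^{M}
  + \sum_{i=1}^{N_G} \frac{W_{Gi}}{M}
    \left( \frac{Q_i}{Q_i^{\max}} \right)^{M},
  \qquad M = 2n
```

Under these assumptions, $V_i$ would be the post-contingency voltage magnitude at bus $i$, $V_i^{\mathrm{sp}}$ the specified voltage, $\Delta V_i^{\lim}$ the permitted voltage deviation, $W_{Vi}$ a voltage weighting factor, and $N_B$ the number of buses included in the first summation.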

Support vector machine
Recently, mapping and classification problems have been handled well by artificial neural networks (ANNs). Two basic properties of neural nets distinguish them from other conventional approaches. These properties are:
The output of the LS-SVM model is expressed as
$$y(x) = \sum_{k=1}^{N} \alpha_k K(x, x_k) + b \tag{2}$$
where $\alpha_k$ is the weighting factor, $x$ are the training samples, $x_k$ are the support vectors, $K(x, x_k)$ is the kernel function, $b$ represents the bias, and $N$ is the number of training samples. The architecture of the LS-SVM is shown in Figure 3.
The RBF kernel function for the proposed SVM tool is given by Equation (4). In the present simulation work, the LS-SVM is interfaced with MATLAB software (Matpower 4.0 User's Manual, 2015; Power system test cases, 2010), and a data-set of 13,800 different operating conditions, covering the outage of each line, is considered.
The values of the PIs obtained from the standard Newton Raphson method are used for training. The least square estimator uses the optimized value of σ (the kernel width). The larger the value, the wider the kernel; a large value indicates that the system is global and close to a linear system. Unlike a neural network, the SVM trains in less time and possesses no hidden layers.
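The way an LS-SVM regressor obtains its weights and bias can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the MATLAB tool used in the study: the Gaussian kernel form exp(-||x - x_k||^2 / (2σ^2)), the regularization constant `gamma`, the helper names `solve`, `lssvm_fit`, and `lssvm_predict`, and the toy data are all hypothetical. The weights α_k and bias b come from one linear system in which the kernel matrix is augmented by I/gamma.

```python
import math

def rbf_kernel(a, b, sigma):
    """Gaussian RBF kernel exp(-||a - b||^2 / (2 sigma^2)); this exact form is an assumption."""
    sq = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq / (2.0 * sigma ** 2))

def solve(A, rhs):
    """Solve A z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (M[r][n] - sum(M[r][c] * z[c] for c in range(r + 1, n))) / M[r][r]
    return z

def lssvm_fit(X, y, gamma, sigma):
    """Solve [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y] for the bias and weights."""
    n = len(X)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0] + [rbf_kernel(X[i], X[j], sigma) for j in range(n)]
        row[i + 1] += 1.0 / gamma  # regularization on the diagonal
        A.append(row)
    z = solve(A, [0.0] + list(y))
    return z[0], z[1:]

def lssvm_predict(x, X, alpha, b, sigma):
    """Evaluate y(x) = sum_k alpha_k K(x, x_k) + b."""
    return sum(a * rbf_kernel(x, xk, sigma) for a, xk in zip(alpha, X)) + b

# Toy one-dimensional data standing in for (feature vector, normalized PI) pairs.
X = [[0.0], [0.25], [0.5], [0.75], [1.0]]
y = [x[0] ** 2 for x in X]
gamma, sigma = 100.0, 0.5
b, alpha = lssvm_fit(X, y, gamma, sigma)
print(lssvm_predict([0.6], X, alpha, b, sigma))
```

Because training reduces to one linear solve, there are no hidden layers or iterative weight updates to tune, which is consistent with the short training time reported for the SVM. In this formulation every training sample receives a weight, so all samples act as support vectors.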

Proposed methodology
Data generation is an important task in a supervised learning approach. In this study, a rich data-set of 13,800 samples is employed to train, test, and validate the networks. The steps involved in the process are as follows.
(i) A large number of load patterns are generated by randomly perturbing the real and reactive loads on all the buses and the real and reactive generation at the generator buses.
(ii) The features are selected as per Verma and Niazi (2012). A total of 11 features, as indicated in that work, are chosen for training. These features are P_g10, Q_g1, Q_g2, Q_g3, Q_g4, Q_g5, Q_g7, Q_g8, Q_g9, Q_g10, and Q_d14. A contingency set covering all credible contingencies is employed. N−1 contingencies are the most common events in a power system. Single line outages are considered for each load pattern, and the value of the index is stored for each iteration of the simulation.
(iii) The obtained values of the index are normalized between 0.1 and 0.9 to train the SVM. Further, binary classification is done to train the classifier.
The system operating state, the contingency type, and the regression performance of the network are stored for each operating scenario (Figure 4).
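The data generation steps above can be sketched as follows. The perturbation range of ±20 %, the base loads, the raw index values, and the helper names `perturb_loads` and `normalize` are assumptions made for illustration only; the counts of 300 load patterns and 46 line outages are those reported for this study.

```python
import random

def perturb_loads(base_loads, spread=0.2, rng=random):
    """Scale each base load by a random factor in [1 - spread, 1 + spread] (spread is assumed)."""
    return [p * rng.uniform(1.0 - spread, 1.0 + spread) for p in base_loads]

def normalize(values, lo=0.1, hi=0.9):
    """Min-max scale the values linearly into [lo, hi]."""
    v_min, v_max = min(values), max(values)
    span = v_max - v_min
    if span == 0:
        return [(lo + hi) / 2.0 for _ in values]  # degenerate case: all values equal
    return [lo + (hi - lo) * (v - v_min) / span for v in values]

random.seed(0)
base_loads = [100.0, 250.0, 80.0]  # hypothetical bus loads (MW)
N_PATTERNS, N_LINES = 300, 46

# One case per (load pattern, single line outage) pair.
patterns = [perturb_loads(base_loads) for _ in range(N_PATTERNS)]
cases = [(p, line) for p in patterns for line in range(1, N_LINES + 1)]
print(len(cases))  # 300 * 46 = 13,800 operating cases

raw_index = [0.02, 0.35, 1.40]  # hypothetical raw PI values
scaled = normalize(raw_index)
print(scaled)
```

In the actual procedure, each case would be passed to a load flow solver to obtain the index value before normalization; that step is omitted here.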

Simulation results
The proposed approach has been implemented in MATLAB/Simulink and tested on the IEEE 39-bus test system (New England) (Lin, Horne, Tiňo, & Lee Giles, 1996), shown in Figure 5. The modeling of the system and the simulation studies are performed on an Intel® Core™ i7 2.9 GHz processor with 4.00 GB RAM. Bus no. 39 is taken as the slack bus. For line contingencies, 13,800 patterns are generated, covering 46 line outages under 300 different loading patterns. Of these, 200 are patterns for which the Newton Raphson (NR) method failed to converge.
Table 1 shows that the LS-SVM yields lower values of mean square error (MSE) and higher values of the regression coefficient (R). In practice, the performance of ranking methods is often questioned because of wrong detection or misranking of a critical contingency.
The comparative results for the performance of the neural networks in determining the PIs are shown in Figures 6 and 7, based on the value of MSE and the percentage R², respectively.
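The two comparison metrics can be computed as in the following sketch. It assumes that R² denotes the usual coefficient of determination, 1 − SS_res / SS_tot; the index values are hypothetical and are not entries of Table 1 or Table 2.

```python
def mse(actual, predicted):
    """Mean square error between reference and predicted index values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def r_squared(actual, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Hypothetical reference values (e.g. from a load flow) and two sets of predictions.
reference = [0.10, 0.30, 0.55, 0.85]
close_fit = [0.11, 0.29, 0.56, 0.84]
loose_fit = [0.20, 0.25, 0.45, 0.70]

print(mse(reference, close_fit), r_squared(reference, close_fit))
print(mse(reference, loose_fit), r_squared(reference, loose_fit))
```

A model whose predictions track the reference values closely has an MSE near zero and an R² near 1, which is the pattern the comparison attributes to the LS-SVM.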
The values of the calculated indices for the contingencies considered are shown in Table 2. A number of samples are shown to demonstrate the efficacy of the different methods. From Table 2, it can be judged that the line outage 6-7 during loading condition 1345 is the critical one, as the values of the indices are higher for every method.
For sample no. 2984, the value of PI_MVA from the NR method is 0.1102, whereas the values predicted by Elman backprop, NARX, and Cascaded FBNN are around 0.15. Such overestimated values can cluster near the class boundaries, where a crisp classifier may fail to classify the state of the power system correctly.
On the other hand, the values predicted by the LS-SVM method are lower. The LS-SVM outperformed the recently available neural network (NN) topologies in the prediction of the performance indices. The classification of contingencies is compared with the NR method, and it is observed that the LS-SVM classifies the contingencies well. For ease of understanding, Excel plots are also included with the analysis. It can be observed from Figure 6 that the values of MSE for FFNN are the highest, which shows the inability of FFNN to predict the contingencies. MSE is the residual mean square; in statistical interpolation, a value closer to zero indicates that the fit is more useful for prediction. From these values, it can be concluded that the LS-SVM is the best regression agent for the prediction of both indices and is suitable for the prediction of contingencies. The values of R² are lowest for Elman backprop, as shown in Figure 7. In statistical studies, these values indicate how successful the fit is in explaining the variation of the data. Values approaching 1, as in the case of the LS-SVM, show that the machine learning model predicts the data very well. The value of adjusted R² is highest for the LS-SVM for both indices.
The following points emerge from Table 2 (PI_MVA):
(i) It is observed that for line outage 6-7 the value of PI_MVA from the NR method is 0.8683. For this operating scenario, this contingency is not only a potential threat to the system stability but also most critical in nature. The value predicted by the backprop method is 0.7438, by Cascaded FBNN 0.7844, and by FFDTDNN 0.77859. The LS-SVM predicted 0.8588, which is very near to the NR value.
(ii) It is also observed that for line 20-34 the contingency is neither severe nor critical; hence, the values predicted by all network topologies fall in the same range. However, the value predicted by the LS-SVM is quite close to the original value.
(iii) The classes identified by the NR method are the same as the classes identified by the LS-SVM. In this simulation study, a total of 13,800 cases were simulated. Of these, 1,232 cases were identified as critical contingencies and 10,243 were identified as most severe critical contingencies. The classification produced by the LS-SVM is verified through NR and presented for some contingencies.
(iv) The confusion matrix is an error matrix. The rows represent the instances in the predicted class and the columns represent the instances in the actual class. The confusion matrix for the index detection is shown in Figure 8. It is a classical way to determine the accuracy of a classifier. From Figure 8, it can be observed that for Class C, 100% of cases were identified. For Class B, 92% of cases were identified, and the remaining 8% were identified as Class A or Class C contingencies. The classification efficiency for Classes A and C is 100%. This shows the efficacy of the proposed approach in identifying critical contingencies accurately.
(v) It is concluded that both the classification and prediction parts are performed by the SVM with the selected features. The efficacy of the proposed method is compared with contemporary types of neural networks. The LS-SVM proved to be a better supervised approach for a real power system problem. Most contingencies are ranked correctly by the LS-SVM, which shows the efficacy of the proposed approach over recent conventional supervised learning models. In a few cases, the neural networks detect the class wrongly. To draw a fair comparison between all the neural nets, the number of hidden layers is kept at 2 and the number of neurons at 4. The neural nets are trained several times and the final results are taken from the best-performing networks. It is observed that the SVM classifies the contingencies efficiently and estimates the PIs under different outages more effectively than the other approaches.
The following points emerge from Table 2 (PI_VQ):
(i) It is observed that for line outage 6-7 the value of PI_VQ from the NR method is 0.8421. For this operating scenario, this contingency is not only a potential threat to the system stability but also most critical in nature. The value predicted by the backprop method is 0.8413, by Cascaded FBNN 0.8001, and by FFDTDNN 0.8015. The LS-SVM predicted 0.8425, which is very near to the NR value.
(ii) It is also observed that for line 20-34 the contingency is neither severe nor critical; hence, the values predicted by all network topologies fall in the same range. However, the value predicted by the LS-SVM is quite close to the original value.
(iii) The classes identified by the NR method are the same as the classes identified by the LS-SVM. In this simulation study, a total of 13,800 cases were simulated. Of these, 1,232 cases were identified as critical contingencies and 10,243 were identified as most severe critical contingencies. The classification produced by the LS-SVM is verified through NR and presented for some contingencies.
(iv) It is concluded that both the classification and prediction parts are performed by the SVM with the selected features. The efficacy of the proposed method is compared with contemporary types of neural networks. The LS-SVM proved to be a better supervised approach for a real power system problem. The confusion matrix for the index detection is shown in Figure 9. From Figure 9, it can be observed that for Class C, 100% of cases were identified. For Class B, 93% of cases were identified, i.e. 1,146 of the 1,232 cases are identified by the classifier. The remaining 7% were identified as Class A contingencies. It is observed that the SVM classifies the contingencies efficiently and estimates the PIs under different outages more effectively than the other approaches.
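The per-class figures quoted for the PI_VQ classifier can be reproduced from a confusion matrix, as in the sketch below. The Class B and Class C counts and the 1,146 correctly identified Class B cases are taken from the text. The Class A count is an assumption, obtained as the remainder of the 13,800 cases, and the label lists are synthetic constructions that merely match those totals.

```python
CLASSES = ["A", "B", "C"]

def confusion_matrix(actual, predicted, classes=CLASSES):
    """Rows index the predicted class, columns the actual class."""
    idx = {c: i for i, c in enumerate(classes)}
    m = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        m[idx[p]][idx[a]] += 1
    return m

def per_class_accuracy(matrix):
    """Fraction of each actual class (column) that lies on the diagonal."""
    n = len(matrix)
    return [matrix[j][j] / sum(matrix[i][j] for i in range(n)) for j in range(n)]

n_b, n_c = 1232, 10243       # critical and most critical cases reported in the text
n_a = 13800 - n_b - n_c      # assumption: all remaining cases are non-critical
b_correct = 1146             # Class B cases identified correctly

actual = ["A"] * n_a + ["B"] * n_b + ["C"] * n_c
predicted = (["A"] * n_a
             + ["B"] * b_correct + ["A"] * (n_b - b_correct)  # missed B cases fall into A
             + ["C"] * n_c)

cm = confusion_matrix(actual, predicted)
acc = per_class_accuracy(cm)
for name, value in zip(CLASSES, acc):
    print(name, round(100 * value, 1))
```

Reading down a column shows how the cases of one actual class were distributed over the predicted classes, so every off-diagonal entry corresponds to a misranked contingency.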

Conclusions
This paper proposes a supervised learning model based on a least square loss function with an RBF kernel function to estimate the contingency ranking in the standard IEEE 39-bus system. The main highlights of this work are as follows.
(a) A comparative analysis of existing learning-based approaches for contingency ranking through standard performance indices is carried out on a large interconnected power system while considering dynamic operating conditions. It is observed that neural nets of different topologies are able to act as regression agents. However, the best regression results, based on MSE and R, are exhibited by the LS-SVM. The numerical results obtained for the calculation of the indices support the efficacy of the proposed approach.
(b) In the second part, the classification of the contingencies is carried out by the LS-SVM. A classifier with three binary classes is obtained based on the values of the performance indices. The performance of the SVM as a classifier is demonstrated by comparing its results with the NR method. It is concluded that the SVM classifies the contingencies satisfactorily.
(c) The proposed approach is suitable for online application. The operator at the energy management center can easily obtain the details and severity of a contingency with the help of these offline tested results. The study of a larger system with multiple contingencies lies in the future scope.