Study on quick judgement of small signal stability using CNN

Abstract: Dynamic security assessment (DSA) is widely used in dispatching operation systems, and small signal stability analysis is one of its most time-consuming calculations. This article proposes a fast method for predicting the small signal stability metrics of a designated oscillation mode, such as its frequency or damping ratio. The method is much faster than simulation and is suitable for online application. First, the t-distributed stochastic neighbour embedding (t-SNE) algorithm creates a mapping from the power system components to 2D coordinates based on the electrical distances between them; the coordinates are then transformed into a grid structure by a meshing operation, on which a convolutional neural network (CNN) model can operate. Finally, with a large number of simulation samples, the CNN model is trained using static quantities as its input and small signal stability metrics as its prediction target. When a new operation mode needs to be evaluated, the result is obtained directly from the CNN. The validity of the proposed method is verified using online data of State Grid Corp of China. The results show that the method meets the speed and accuracy requirements of the online analysis system.


Introduction
To ensure the reliable transmission of electric energy, the China power grid has carried out many projects, such as west-to-east power transmission, national networking, and UHV transmission, and an extra-large-scale AC/DC hybrid power grid has been built in China [1]. As the grid expands, its security and stability characteristics become more and more varied. To strengthen control over large grid operation, dynamic security assessment (DSA) applications are deployed in all dispatching systems above the provincial level [2][3][4][5]. Online small signal stability analysis is one of the main modules of the online analysis system [6].
The online small signal stability analysis usually adopts the eigenvalue method, which can be divided into three main steps [7]: (i) Linearisation of the model: the dynamic models of the power elements are linearised around the current system operating point. (ii) Solution of the eigenvalues and eigenvectors of the state matrix of the power system. (iii) Coherent generator and oscillation mode analysis: from the eigenvectors, the coherent generators are determined, a representative generator is selected, and the corresponding typical oscillation modes are analysed.
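The two metrics predicted in this paper follow directly from an oscillatory eigenvalue of the state matrix. As a minimal sketch, assuming the usual convention λ = σ ± jω (frequency f = |ω|/2π, damping ratio ζ = −σ/√(σ² + ω²)); the function name and the sample eigenvalue are hypothetical and not taken from the online data:

```python
import math

def mode_metrics(eigenvalue):
    """Return (frequency in Hz, damping ratio) of an oscillatory eigenvalue sigma + j*omega."""
    sigma, omega = eigenvalue.real, eigenvalue.imag
    frequency_hz = abs(omega) / (2.0 * math.pi)
    # A negative real part (decaying oscillation) gives a positive damping ratio.
    damping_ratio = -sigma / math.hypot(sigma, omega)
    return frequency_hz, damping_ratio

# Hypothetical inter-area mode: 0.2 Hz oscillation with a slowly decaying envelope.
freq, zeta = mode_metrics(complex(-0.1, 2.0 * math.pi * 0.2))
print(f"frequency = {freq:.3f} Hz, damping ratio = {zeta:.4f}")
```

The formula is undefined for a zero eigenvalue, so the sketch applies only to oscillatory modes with a non-zero imaginary part.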
At present, the online small signal stability analysis has been put into practical application, but it still has the following drawbacks: (i) The computation time is too long. With online calculation data from a real power system, the calculation usually takes 1-2 min, which is the longest of all the online analysis modules. (ii) The key oscillation mode may not be found. Because the modern power system is so large, the dimension of the state matrix is very high, and it is impractical to compute all the eigenvalues. In practice, Rayleigh quotient iteration is usually used to solve only a small number of eigenvalues, between 30 and 200, so the eigenvalue of the key oscillation mode may be missed.
This paper proposes a method for predicting the frequency and damping ratio of key oscillation modes from a large amount of historical data and results. The method is very fast: a prediction is computed within milliseconds.
The main contribution of this paper is as follows: we propose using the t-SNE algorithm to map all the power system features to a matrix on which a convolutional neural network (CNN) model can be run properly. The rest of the paper is organised as follows: Section 2 introduces the main ideas and analysis steps of the method; Section 3 describes the t-SNE method used to arrange the input features to meet the requirements of the CNN model; Section 4 illustrates and evaluates the results using actual data; and Section 5 concludes the paper.

Methodology
CNN is a very popular deep-learning model that has achieved great success in the field of image processing. In the lower layers of a CNN model, small convolution kernels are usually used to extract local features of the input data. As the number of layers increases, the CNN gradually integrates the local features of each part into overall features of the input data, from which the properties of the sample are judged. In other words, a CNN starts from partial data and gradually builds up the global features of the sample, which is similar to the way small signal stability analysis works. Online analysis focuses on the key oscillation modes between regions and provinces, and generators in close proximity tend to behave consistently for a particular oscillation mode. Coherent generators are therefore assigned to the same cluster, and the operating statuses of all the generators collectively determine the frequency and the damping ratio of the oscillation mode. This is also a process that moves from local to global analysis. Therefore, this paper proposes to build a CNN model for quick judgement of small signal stability.
Unlike the two-dimensional structure of an image, the structure of the power grid is more complicated, and the connections between power elements are more diverse. Some stations have only one connecting AC line, while others have more than 10. There is no uniform pattern, so it is difficult to establish a CNN model directly. Therefore, to use a CNN model effectively, we first apply the t-SNE algorithm to establish a mapping that arranges the input features into a two-dimensional matrix, using the electrical distances between power elements as the measure of the strength of their mutual relations. The CNN model can then be established and trained, and can rapidly produce the stability result. The detailed steps are as follows:
(i) Collect the historical data of online analysis, and set up the sample set using the frequency and damping ratio of the oscillation mode of concern as sample labels.
(ii) Calculate the electrical distance matrix in the full connection mode of the power system.
(iii) Use the t-SNE algorithm to map the high-dimensional electrical distance relationship to two-dimensional coordinates.
(iv) Mesh the two-dimensional coordinate points to form a matrix, in which each element represents a bus or a station.
(v) Build the CNN model based on the matrix, and use the historical sample set for training and testing.

t-SNE algorithm
An electrical distance exists between every pair of nodes in an interconnected system. Together, these distances can be seen as high-dimensional data, but they cannot be applied directly to a CNN. The stochastic neighbour embedding (SNE) algorithm and its improved version, the t-SNE algorithm, are effective non-linear dimensionality reduction methods. While preserving the relative distances between nodes as far as possible, they map the original structural relationship into 2D or 3D space [8].

SNE
The principle of the SNE algorithm is to preserve the distances between data points during the mapping: points that are closer (or farther) in the original high-dimensional space are mapped to points that are also relatively closer (or farther) in the low-dimensional space. Euclidean distance is a common distance metric. Let $x_i$, $x_j$ be any two points in the original high-dimensional space, to be mapped to the points $y_i$, $y_j$ in the low-dimensional space. $\|x_i - x_j\|^2$ and $\|y_i - y_j\|^2$ are the squared Euclidean distances in the high and low dimensions, respectively. Based on the Euclidean distance, the SNE algorithm uses a conditional probability to indicate how close the data points are. In a high-dimensional dataset X, the conditional probability is defined as follows:

$$p_{j|i} = \frac{\exp\left(-\|x_i - x_j\|^2 / 2\sigma_i^2\right)}{\sum_{k \neq i} \exp\left(-\|x_i - x_k\|^2 / 2\sigma_i^2\right)} \tag{1}$$

$p_{j|i}$ represents the conditional probability that the point $x_j$ appears near the point $x_i$. It takes the form of a Gaussian distribution centred on $x_i$, with variance $\sigma_i^2$. The closer $x_j$ is to $x_i$, the larger $p_{j|i}$ will be, which indicates a higher similarity between them. A similar conditional probability is used to indicate the degree of proximity of data points in the low-dimensional dataset Y, defined as follows:

$$q_{j|i} = \frac{\exp\left(-\|y_i - y_j\|^2\right)}{\sum_{k \neq i} \exp\left(-\|y_i - y_k\|^2\right)} \tag{2}$$

$q_{j|i}$ represents the conditional probability that the point $y_j$ appears near the point $y_i$. Here the variance of the Gaussian distribution is fixed at 1/2. In this way, for any i, conditional probability distributions $P_i$ and $Q_i$ are formed in the high-dimensional space and the low-dimensional space, respectively. To preserve the distances between the data points before and after the mapping, $P_i$ and $Q_i$ should be as consistent as possible.
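The Gaussian conditional probability $p_{j|i}$ described above can be sketched in a few lines. This is an illustration only: the function name, the three sample points, and the fixed bandwidth of 1.0 are assumptions (practical SNE implementations tune each $\sigma_i$ from a perplexity setting, which is not shown here).

```python
import math

def conditional_probabilities(sq_dist, sigma):
    """Return p where p[i][j] = p_{j|i}.

    sq_dist[i][j] is the squared distance between points i and j;
    sigma[i] is the Gaussian bandwidth assumed for point i.
    """
    n = len(sq_dist)
    p = []
    for i in range(n):
        # Unnormalised Gaussian weights; a point is never its own neighbour.
        weights = [
            0.0 if k == i else math.exp(-sq_dist[i][k] / (2.0 * sigma[i] ** 2))
            for k in range(n)
        ]
        total = sum(weights)
        p.append([w / total for w in weights])
    return p

# Hypothetical points on a line at positions 0, 1 and 3 (squared distances).
sq_dist = [[0.0, 1.0, 9.0],
           [1.0, 0.0, 4.0],
           [9.0, 4.0, 0.0]]
p = conditional_probabilities(sq_dist, sigma=[1.0, 1.0, 1.0])
```

Each row of `p` is one distribution $P_i$ and sums to 1, while `p[0][1]` and `p[1][0]` differ, which is the asymmetry that t-SNE later removes.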
Kullback-Leibler (KL) divergence is a common measure of the consistency of two probability distributions. Summed over all i, the KL divergence between $P_i$ and $Q_i$ is as follows:

$$C = \sum_i \mathrm{KL}(P_i \,\|\, Q_i) = \sum_i \sum_j p_{j|i} \log \frac{p_{j|i}}{q_{j|i}} \tag{3}$$

C is the cost function of the SNE algorithm: the smaller the value of C, the more consistent the two distributions. The gradient descent algorithm can be used to solve for the $y_i$ that minimise C. The $y_i$ can be two-dimensional or three-dimensional coordinates, so the solution can be displayed as planar or stereo graphics.
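A minimal sketch of the KL-based cost, assuming the probabilities are stored as nested lists with `p[i][j]` holding $p_{j|i}$; the function name and the sample values are hypothetical. Terms with zero probability in P contribute nothing and are skipped.

```python
import math

def sne_cost(p, q):
    """Sum over i of KL(P_i || Q_i) for row-wise conditional distributions."""
    total = 0.0
    for p_row, q_row in zip(p, q):
        for p_ji, q_ji in zip(p_row, q_row):
            if p_ji > 0.0:  # 0 * log(0 / q) is taken as 0
                total += p_ji * math.log(p_ji / q_ji)
    return total

# Hypothetical distributions for three points.
p = [[0.0, 0.9, 0.1], [0.8, 0.0, 0.2], [0.3, 0.7, 0.0]]
q_good = [[0.0, 0.9, 0.1], [0.8, 0.0, 0.2], [0.3, 0.7, 0.0]]
q_poor = [[0.0, 0.5, 0.5], [0.5, 0.0, 0.5], [0.5, 0.5, 0.0]]

cost_good = sne_cost(p, q_good)   # identical distributions: zero cost
cost_poor = sne_cost(p, q_poor)   # mismatched distributions: positive cost
```

Gradient descent on the low-dimensional coordinates would reduce this value step by step; the optimisation loop itself is not shown.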

t-SNE
t-SNE is an improved algorithm from SNE, which mainly solves two problems in the SNE algorithm: the asymmetric problem and the crowding problem.

Asymmetry:
The conditional probabilities defined by the SNE algorithm are asymmetric, that is, $p_{j|i} \neq p_{i|j}$ and $q_{j|i} \neq q_{i|j}$. Intuitively, the distance or similarity between two points should be unique, so the two values should be equal. The t-SNE algorithm therefore redefines $p_{j|i}$ and $q_{j|i}$ on the basis of formulas (1) and (2) as the joint probabilities $p_{ij}$ and $q_{ij}$:

$$p_{ij} = \frac{p_{j|i} + p_{i|j}}{2n}, \qquad q_{ij} = \frac{\exp\left(-\|y_i - y_j\|^2\right)}{\sum_{k \neq l} \exp\left(-\|y_k - y_l\|^2\right)} \tag{4}$$

In formula (4), n is the total number of original data points. With these definitions, both $p_{ij}$ and $q_{ij}$ are symmetric.
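The symmetrisation of the high-dimensional probabilities can be sketched as below, assuming the standard t-SNE form $p_{ij} = (p_{j|i} + p_{i|j}) / 2n$; the function name and the sample matrix are hypothetical.

```python
def symmetrise(p_cond):
    """Turn conditional probabilities p_cond[i][j] = p_{j|i} into joint p_ij."""
    n = len(p_cond)
    return [
        [(p_cond[i][j] + p_cond[j][i]) / (2.0 * n) for j in range(n)]
        for i in range(n)
    ]

# Hypothetical asymmetric conditional probabilities (each row sums to 1).
p_cond = [[0.00, 0.98, 0.02],
          [0.82, 0.00, 0.18],
          [0.30, 0.70, 0.00]]
p_joint = symmetrise(p_cond)
```

Dividing by 2n makes the whole matrix, rather than each row, sum to 1, so `p_joint` is a single joint distribution over pairs of points.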

Crowding problem:
The ideal outcome of dimensionality reduction for visualisation is to gather similar or close points into the same cluster and to separate different clusters as far as possible. In the SNE algorithm, the conditional probabilities in both the high-dimensional space and the low-dimensional space are Gaussian, so the two distributions have approximately the same shape. Because the low-dimensional space has fewer dimensions, clusters of different categories tend to end up close to each other, causing crowding. To solve this problem, the t-SNE algorithm replaces the probability distribution in the low-dimensional space with the t-distribution:

$$q_{ij} = \frac{\left(1 + \|y_i - y_j\|^2\right)^{-1}}{\sum_{k \neq l} \left(1 + \|y_k - y_l\|^2\right)^{-1}} \tag{5}$$

Compared with the Gaussian distribution, the t-distribution is lower in its central part and has a higher and longer tail. Substituting the $p_{ij}$ of formula (4) and the $q_{ij}$ of formula (5) into formula (3) gives the new cost function:

$$C = \mathrm{KL}(P \,\|\, Q) = \sum_i \sum_j p_{ij} \log \frac{p_{ij}}{q_{ij}} \tag{6}$$

Minimising the cost function C in formula (6) makes $q_{ij}$ as consistent as possible with $p_{ij}$.
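The heavy-tailed low-dimensional similarity can be sketched as follows, assuming the standard Student-t kernel with one degree of freedom, $(1 + \|y_i - y_j\|^2)^{-1}$, normalised over all pairs; the function name and the sample coordinates are hypothetical.

```python
import math

def student_t_similarities(points):
    """Return q where q[i][j] is the joint low-dimensional similarity q_ij."""
    n = len(points)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                sq_dist = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                weights[i][j] = 1.0 / (1.0 + sq_dist)
    total = sum(sum(row) for row in weights)  # normalise over all pairs k != l
    return [[w / total for w in row] for row in weights]

# Hypothetical 2D embedding of three points.
q = student_t_similarities([(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)])

# Kernel values at distance 3: the t-kernel decays far more slowly.
gaussian_tail = math.exp(-(3.0 ** 2))
student_tail = 1.0 / (1.0 + 3.0 ** 2)
```

Because `student_tail` is much larger than `gaussian_tail`, moderately distant points can be placed further apart in the map at little cost, which is what relieves the crowding.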

Results
The method proposed in this paper is verified using online data of the State Grid Corporation of China from January to October 2015. In all, 29,254 different operation modes were collected as the data set. As the North China and Central China power grids were interconnected at that time, we choose the frequency and damping ratio of the North China-Central China oscillation mode as the prediction targets. We choose the operating status of the generators and the electrical distance between each station and the oscillation centre (Nanyang Station) as the input features.

t-SNE
Usually, most of the AC lines are in operation, so we use the full connection mode of the power system to generate the CNN model. Based on the online data at 10:00 on 16 September 2015, all the AC lines were put into operation to calculate the electrical distance between each pair of stations, and the dimension was then reduced using t-SNE. The result is shown in Fig. 1. The circles in the figure represent the North China power plants, the 'x' marks represent the Central China power plants, and the different colours represent power plants in different provinces. It can be seen from Fig. 1 that the generators clearly show the different characteristics of each province. The electrical distance between stations in the same province is usually smaller than that between stations in different provinces, especially in Jiangxi, Chongqing, and Hunan. Sichuan, Shandong, and Jingjinji have more plants than the other provinces, so they occupy larger areas of the canvas. The visualisation result is consistent with the characteristics of the power grid, which supports the effectiveness of the proposed algorithm.
Based on the output of the t-SNE algorithm, the two-dimensional coordinate points are then gridded into a matrix. The gridding is carried out gradually from the centre of the matrix to the periphery, one element after another. For each matrix element, the closest point in the unselected set is chosen. Finally, a 24 × 32 matrix is formed as the base model. Since there are 475 power plants in the North China and Central China power grids in the online data, the base model has some blank elements with no input. Most of the blank elements are in the corners of the base model.
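The gridding step can be sketched as a greedy assignment. The text specifies only the centre-to-periphery order and the closest-unselected-point rule, so the rest is assumed here: the coordinates are linearly rescaled to the grid extent, distances are Euclidean, and ties go to the lower point index. The function name and the sample coordinates are hypothetical.

```python
import math

def grid_points(coords, n_rows, n_cols):
    """Place each 2D point in one grid cell; returns point indices (None = blank)."""
    if len(coords) > n_rows * n_cols:
        raise ValueError("more points than grid cells")

    def rescale(value, low, high, size):
        # Map [low, high] onto [0, size - 1]; degenerate ranges go to the middle.
        return (size - 1) / 2.0 if high == low else (value - low) / (high - low) * (size - 1)

    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    scaled = [(rescale(y, min(ys), max(ys), n_rows), rescale(x, min(xs), max(xs), n_cols))
              for x, y in coords]

    # Visit the cells from the centre of the matrix outwards.
    centre = ((n_rows - 1) / 2.0, (n_cols - 1) / 2.0)
    cells = sorted(((r, c) for r in range(n_rows) for c in range(n_cols)),
                   key=lambda cell: math.dist(cell, centre))

    grid = [[None] * n_cols for _ in range(n_rows)]
    unselected = set(range(len(coords)))
    for r, c in cells:
        if not unselected:
            break  # the remaining (outer) cells stay blank
        nearest = min(unselected, key=lambda k: (math.dist(scaled[k], (r, c)), k))
        grid[r][c] = nearest
        unselected.remove(nearest)
    return grid

# Hypothetical t-SNE output: five points placed on a 3 x 3 grid.
grid = grid_points([(0, 0), (1, 0), (0, 1), (1, 1), (0.5, 0.5)], 3, 3)
```

Because the cells are filled from the centre outwards, any leftover blank elements fall on the periphery, which agrees with the observation that most blank elements lie in the corners.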

CNN
After Section 4.1, the generator input data has been arranged effectively and can be used in the CNN model. In our example, there are 24,542 samples in the training set and 2,727 in the test set. Following the LeNet-5 [9] CNN structure, we constructed a CNN model containing two convolution layers (5 × 5), two pooling layers (2 × 2), one dropout layer (keep probability 0.7), and two fully connected layers (1024 hidden units), using ReLU as the activation function. Since the oscillation frequency and damping ratio are continuous values, the final layer has no activation function. The entire CNN model structure is shown in Fig. 2.
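To make the layer stack concrete, the feature-map sizes can be traced through the network. Several details are not stated in the text and are assumed here purely for illustration: a single input channel, 'same' padding with stride 1 in the convolutions, non-overlapping 2 × 2 pooling, each pooling layer directly following a convolution layer, and 32 and 64 convolution channels.

```python
def conv_same(shape, out_channels):
    """5 x 5 convolution with 'same' padding and stride 1: spatial size unchanged."""
    rows, cols, _ = shape
    return (rows, cols, out_channels)

def max_pool_2x2(shape):
    """Non-overlapping 2 x 2 pooling: halves each spatial dimension."""
    rows, cols, channels = shape
    return (rows // 2, cols // 2, channels)

shape = (24, 32, 1)  # the 24 x 32 base model with one input feature (assumed)
trace = [("input", shape)]
for name, channels in (("conv1", 32), ("conv2", 64)):  # channel counts are assumptions
    shape = conv_same(shape, channels)
    trace.append((name, shape))
    shape = max_pool_2x2(shape)
    trace.append((name + "_pool", shape))

flat_size = shape[0] * shape[1] * shape[2]  # inputs to the first fully connected layer
for name, layer_shape in trace:
    print(name, layer_shape)
print("flattened:", flat_size)
```

Under these assumptions the 24 × 32 input shrinks to 6 × 8 feature maps before the 1024-unit fully connected layer; other padding or channel choices would change these numbers.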
The mean-squared error between the predicted value and the ground-truth value is used as the loss function of the training process. The batch size is 32, the Adam optimisation strategy is adopted, and the learning rate is 0.00003. The total output power of the power plant, its average terminal voltage, and the electrical distance from the power plant to the oscillation centre (Nanyang Station) are each used separately as the input feature. The results are shown in Tables 1 and 2.
Table 2 shows that the training set and the test set have roughly the same performance, so there is no overfitting problem. The results show the following: (i) among the three kinds of input variables, the total power performs best, with an average error rate of <1% for the oscillation frequency and <2.5% for the damping ratio, which fully meets the requirements of online application; (ii) the damping ratio is predicted less accurately than the frequency, which indicates that predicting the damping ratio is a more complicated problem; (iii) the maximum error rate is very high, which means that the model does not fit all samples properly and needs further improvement.
As the frequency and the damping ratio are related to each other, it is reasonable to modify the CNN model to predict them both at the same time. The prediction results are shown in Table 3 using total power of plant as input.
It can be seen from Table 3 that the average error rate of the prediction results increases slightly but is still satisfactory for online analysis, and this model has higher computational efficiency than two separate models.
All the above models were trained and tested under the TensorFlow framework on an Intel Core (TM) i5-6200U. Training took about 40 min, but prediction was extremely fast: a single sample took only about 1 ms, far less than the online simulation.

Conclusion
An online small signal stability analysis method using CNN is introduced, including its framework and key technologies. The method is verified using 10 months of online data of State Grid Corp of China and has a great advantage in calculation speed. Further improvements are still necessary, such as:
(i) Try more features that influence small signal stability.
(ii) Analyse the sensitivity of the CNN model so that it can be used for decision-making.
(iii) Consider combining the CNN model with simulation.

Acknowledgments
This project is supported by the Science and Technology Foundation of the State Grid Corporation of China: Research on Key Technologies to Enhance Performance of Power Grid Online Analysis System.