A Lossy Compression Algorithm for Phasor Measurement Unit Data Based on Auto-encoder and Long Short-Term Memory

With the wide deployment of phasor measurement units (PMUs) in power systems, the amount of data generated by various measurement devices is increasing rapidly, which brings new challenges to data transmission and storage. In this paper, a lossy data compression algorithm based on an auto-encoder and Long Short-Term Memory (LSTM) is proposed for PMU data. Specifically, the auto-encoder first extracts features from the measurement data, and the resulting feature vectors are saved to achieve data compression. When the measurement data needs to be restored, the compressed data is fed into the decoder for decompression. Compared with traditional algorithms, the proposed algorithm effectively reduces the reconstruction error. Moreover, it can be combined with existing lossless compression algorithms to further improve the compression rate.


Introduction
With the expansion of the power system and the acceleration of its informatization, reliable and stable operation increasingly depends on the support of large amounts of information. At the same time, the massive data generated by various measuring devices places a heavy burden on data storage and transmission, especially with the widespread deployment of phasor measurement units (PMUs), which produce ever more electrical measurement data [1]. General-purpose commercial compression software is not designed for the characteristics and requirements of such data and cannot achieve the desired compression performance. It is therefore increasingly important to study compression algorithms tailored to the characteristics of electrical measurement data.
Data compression algorithms fall into two main categories: lossless compression and lossy compression [2]. Lossless compression is reversible: the original data can be recovered exactly after compression and decompression. Huffman coding, arithmetic coding and dictionary coding are the main lossless techniques in current use; in the power system domain, for example, Golomb-Rice encoding has been used to compress phasor angle data [3]. Lossy compression is irreversible: such algorithms allow a small amount of information loss but usually achieve a higher compression rate, saving more storage space. Various lossy compression algorithms have been widely applied in power systems, especially in distribution systems. For example, PCA has been used to compress SCADA steady-state data [4], SVD to compress distribution system measurement data [5], the K-SVD algorithm to compress residential power data [6], and SVD-CE has been shown to be suitable for PMU data compression [7]. In addition, lossy and lossless compression can be combined to further improve the compression rate. Considering the high sampling frequency and large data volume of PMUs, lossy compression is more practical.
Therefore, this paper presents a PMU data compression algorithm based on an auto-encoder (AE) and an LSTM decoder. First, the PMU data is fed into the AE for encoding, which reduces the dimension of each time series and thereby the storage space. Then a decoder based on LSTM is constructed; the temporal modelling capability of LSTM is used to fit the complex nonlinear relationships and reconstruct the data. Compared with traditional algorithms such as PCA and T-SVD, the proposed algorithm achieves a clearly lower reconstruction error at the same compression rate, which demonstrates its effectiveness.

LSTM
Since the voltage, current, and phase angle data collected by PMU devices are all time series, the collected data exhibits a certain periodicity and volatility. To analyse this kind of time series data, this paper uses LSTM cells for network modelling [8]. LSTM is a special kind of RNN whose gating mechanism alleviates, to a certain extent, the gradient explosion and gradient vanishing problems of ordinary RNNs. Its structure is shown in Figure 1. Each LSTM cell contains three gates: an input gate, a forget gate, and an output gate. The input gate combines the input $x_t$ and the previous output $h_{t-1}$ into an input vector. The forget gate is responsible for discarding and retaining information and forms the cell state $c_t$. The output gate determines the next hidden state $h_t$. The specific mechanism is given by equations (1)-(6):

$i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i)$ (1)
$f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f)$ (2)
$o_t = \sigma(W_o x_t + U_o h_{t-1} + b_o)$ (3)
$\tilde{c}_t = \tanh(W_c x_t + U_c h_{t-1} + b_c)$ (4)
$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$ (5)
$h_t = o_t \odot \tanh(c_t)$ (6)
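As a concrete illustration, the standard LSTM gating equations can be sketched in a few lines of NumPy. This is a minimal sketch with randomly initialized weights; the dimensions and variable names are illustrative and not taken from the paper's model.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell(x_t, h_prev, c_prev, W, U, b):
    """One LSTM step. W, U, b hold the stacked parameters of the
    input (i), forget (f) and output (o) gates and the candidate (g)."""
    z = W @ x_t + U @ h_prev + b           # stacked pre-activations, shape (4H,)
    H = h_prev.shape[0]
    i = sigmoid(z[0:H])                    # input gate
    f = sigmoid(z[H:2*H])                  # forget gate
    o = sigmoid(z[2*H:3*H])                # output gate
    g = np.tanh(z[3*H:4*H])                # candidate cell state
    c_t = f * c_prev + i * g               # new cell state
    h_t = o * np.tanh(c_t)                 # new hidden state
    return h_t, c_t

# toy dimensions: input size 3, hidden size 4
rng = np.random.default_rng(0)
D, H = 3, 4
W = rng.standard_normal((4 * H, D)) * 0.1
U = rng.standard_normal((4 * H, H)) * 0.1
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
h, c = lstm_cell(rng.standard_normal(D), h, c, W, U, b)
```

Unrolling `lstm_cell` over the 50 samples of a one-second PMU window would give the sequence of hidden states the decoder operates on.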

Model of LSTM-AE
The LSTM-AE model constructed in this paper consists of two parts: an AE designed for data compression, and an LSTM-based decoder used for data reconstruction. The AE is composed of an input layer, a hidden layer and an output layer, with symmetrical input and output. The input X is passed through the nonlinear mapping of the hidden layer to obtain a new representation Y, and the vector Y is then forced to be decoded back to X at the output layer. After training the network, the feature vector Y can be extracted from the original information X. To achieve data compression, the number of hidden layer neurons is designed to be smaller than the dimension of the input data; the specific structure is shown in Figure 2. In this paper, the voltage, current and phase angle series in the PMU data are fed into the AE to obtain feature vectors of lower dimension than the input.

Figure 2. Structure of LSTM-AE model

To reduce the error between the decoded data and the original data and reconstruct the real data as faithfully as possible, this paper constructs an LSTM-based decoder. The basic idea is to feed the features extracted by the AE into the LSTM decoder and train it in an unsupervised manner to learn the nonlinear relationship between the feature vector and the original data, thereby reconstructing the compressed data. The overall algorithm proceeds as follows:
1. Data compression: features of the input PMU data are extracted by exploiting the AE's ability to fit nonlinear relationships, and the feature vectors are saved to achieve data compression.
2. Data decoding: the LSTM decoder reconstructs the original PMU data from the saved feature vectors.
3. Backward propagation of the gradient: the loss is computed from the decoded output and the original input, and the network weights are updated through back-propagation. After repeated training, the decoder learns the mapping of the data.
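The compress/decode/back-propagate loop can be sketched end to end in NumPy. For brevity, a single linear encoder and decoder stand in for the paper's nonlinear AE and LSTM decoder; the window length (50 samples, one second at 50 Hz), feature count, data and learning rate are all illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
T, n = 50, 6                      # 1-second window at 50 Hz -> 6 features
# synthetic "PMU channel": a periodic waveform plus small noise
X = np.sin(np.linspace(0, 2 * np.pi, T))[None, :] \
    + 0.01 * rng.standard_normal((256, T))

W_enc = rng.standard_normal((T, n)) * 0.1   # encoder weights
W_dec = rng.standard_normal((n, T)) * 0.1   # decoder weights
lr = 0.01

for _ in range(500):
    Y = X @ W_enc                 # 1. compress: 50 samples -> 6 features
    X_hat = Y @ W_dec             # 2. decode: reconstruct the window
    err = X_hat - X
    # 3. back-propagate the MSE gradient and update the weights
    W_dec -= lr * Y.T @ err / len(X)
    W_enc -= lr * X.T @ (err @ W_dec.T) / len(X)

mse = float(np.mean((X @ W_enc @ W_dec - X) ** 2))
```

Only `Y` (6 values per window) needs to be stored; the decoder weights are shared across all windows, which is where the storage saving comes from.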

Dataset description
This paper uses the public PMU measurement data set of the campus power grid of École Polytechnique Fédérale de Lausanne (EPFL) [9]. A total of five PMUs and one state estimation process are installed on the campus grid. The sampling frequency of the PMUs is 50 Hz, and the collected data includes three-phase voltage magnitude, current magnitude, voltage phase angle and current phase angle. In addition, because the campus is equipped with 2 MW of photovoltaic panels and 6 MW of combined heat and power generation units, there are active power injections at the nodes, which makes the voltage and current waveforms more complicated.

The network structure of LSTM-Decoder
In this paper, the network structure of the decoder in the data compression model is designed according to the characteristics of the PMU data; its main structure is shown in Table 1. The training parameters are as follows: the initial learning rate is 0.005, the loss function is MSE, the batch size is 256, and Adam is used as the optimizer.
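The training setup stated above can be captured in a small sketch. The Adam update below is the standard formulation with the paper's initial learning rate; the toy objective and default moment coefficients (0.9, 0.999) are illustrative assumptions, and the decoder architecture of Table 1 is not reproduced here.

```python
import numpy as np

# hyperparameters as stated in the text
config = {"lr": 0.005, "loss": "mse", "batch_size": 256, "optimizer": "adam"}

def adam_step(w, g, m, v, t, lr=0.005, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update with the paper's initial learning rate."""
    m = b1 * m + (1 - b1) * g             # first-moment estimate
    v = b2 * v + (1 - b2) * g * g         # second-moment estimate
    m_hat = m / (1 - b1 ** t)             # bias correction
    v_hat = v / (1 - b2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

# toy use: minimize the MSE-style objective (w - 3)^2
w, m, v = np.array([0.0]), np.zeros(1), np.zeros(1)
for t in range(1, 2001):
    g = 2 * (w - 3.0)                     # gradient of (w - 3)^2
    w, m, v = adam_step(w, g, m, v, t)
```

In practice the same update would be applied to every weight tensor of the decoder, with gradients computed over mini-batches of 256 windows.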

Data reconstruction effect analysis
We use K-PCA, T-SVD, a plain AE and the algorithm proposed in this paper to compress the data. In the pre-processing step, the data collected by the PMU in each second is treated as one time series; since the sampling frequency of the PMU is 50 Hz, each time series has length 50. To evaluate the decompression quality of each compression algorithm, we calculate the MAE for different numbers of extracted features. The details are shown in Figure 3, where the abscissa is the number of features extracted from each time series.

Figure 3 shows that the proposed algorithm yields smaller errors than the other algorithms, and its advantage is especially obvious at high compression rates. Therefore, more storage space can be saved while keeping the error within a given range. Comparing the errors on voltage and current data, the proposed algorithm shows a more pronounced advantage when compressing current data. This is because current data fluctuates more severely than voltage data and its features are less obvious, so traditional algorithms based on mathematical operations struggle to extract the features accurately, whereas the proposed algorithm uses a deep neural network that can fit various complex nonlinear relations and thus reconstruct the original data more accurately. The specific effect is shown in Figure 4 and Figure 5.

Figure 4. Voltage reconstruction effect (n=6)
Figure 5. Current reconstruction effect (n=6)

These figures show that even at a high compression rate, the proposed algorithm reconstructs fine details such as tiny fluctuations while preserving the overall characteristics of the data, which intuitively illustrates its effectiveness.
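For reference, the windowing and error metric described above are straightforward to state in code. This is a minimal sketch of the evaluation pipeline; the helper names are my own, not the paper's.

```python
import numpy as np

def to_windows(stream, fs=50):
    """Split a PMU channel sampled at fs Hz into 1-second windows
    of length fs (any trailing partial window is dropped)."""
    n = len(stream) // fs
    return np.reshape(np.asarray(stream)[:n * fs], (n, fs))

def mae(x, x_hat):
    """Mean absolute reconstruction error used to compare algorithms."""
    return float(np.mean(np.abs(np.asarray(x) - np.asarray(x_hat))))

# toy use: a 2-second stream becomes two length-50 windows
windows = to_windows(np.arange(100.0))
```

Sweeping the number of retained features n and plotting `mae(original, reconstructed)` against n reproduces the kind of comparison shown in Figure 3.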

Data compression effect analysis
To further improve the compression effect, we perform a secondary compression on the compressed data. The secondary compression uses a lossless method, so it further saves storage space without increasing the error. The specific compression effect is shown in Table 2: the compression rate after secondary compression is 2 to 3 times higher than that of lossy compression alone. This is because the two categories of algorithms exploit different redundancies, and the overall compression rate is approximately equal to the product of the compression rates of the two algorithms.
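The two-stage scheme and the product rule for the overall rate can be sketched as follows. The paper does not specify which lossless codec is used, so zlib serves as a stand-in here, and the quantization step and all sizes are illustrative assumptions.

```python
import zlib
import numpy as np

rng = np.random.default_rng(2)
# stage 1 (lossy): hypothetical AE output, 6 features per 50-sample window
features = rng.standard_normal((1000, 6)).astype(np.float32)

# coarse quantization before entropy coding (illustrative; this is
# where the lossy scheme's precision/size trade-off is set)
q = np.round(features * 100).astype(np.int16)

# stage 2 (lossless): a generic codec on the quantized feature bytes
raw = q.tobytes()
packed = zlib.compress(raw, level=9)

lossy_ratio = 50 / 6                      # 50 samples -> 6 features per window
lossless_ratio = len(raw) / len(packed)
overall = lossy_ratio * lossless_ratio    # overall rate ~ product of the two
```

Because the lossless stage only re-encodes the already-compressed bytes, it changes the storage size but not the reconstruction error, matching the behaviour reported in Table 2.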

Conclusion
This paper proposes a lossy compression algorithm for PMU data based on an AE and LSTM, which can greatly reduce the storage space required for PMU data. Through unsupervised training, the neural network learns the characteristics of the PMU data and encodes it into a lower-dimensional representation, achieving data compression. The encoded data is then fed into the decoder to reconstruct the compressed measurement data. Compared with traditional algorithms, the proposed algorithm retains more information at the same compression rate; in particular, it preserves fine details along with the overall characteristics of the data, which benefits subsequent data analysis. Moreover, combining the proposed algorithm with a traditional lossless compression algorithm enables secondary compression without introducing new errors, further saving storage space. Future work will focus on joint compression to further reduce the compression loss.