Data-driven online distributed disturbance location for large-scale power grids

: Timely detecting disturbances and locating their sources are critical to the reliable operation of power grids. This capability enables operators to effectively diagnose disturbances over wide areas and earns time for remedial reactions. In this study, a travelling-wave based scheme, namely data-driven online distributed disturbance location (DODDL), is proposed to quickly detect disturbances and determine their geographic location in large-scale power grids when the grids’ topology is not available. The proposed DODDL scheme consists of two function blocks: (i) a singular spectrum analysis-based change-point detection method, which can quickly detect disturbances and determine their arrival time at distributed sensors, and (ii) a novel temporal scanning algorithm, which can accurately determine the geographic location of the disturbance source point. Utilising field measurement data sets recorded by the frequency disturbance recorders from the frequency monitoring network, it is shown that the DODDL scheme is not only quicker and more robust to grid non-homogeneity than existing approaches, but also can capture and locate more subtle and concealed disturbances.


Introduction
Timely and efficiently detecting and locating disturbances are critical to the reliability and security of power grid operations. Once a fault occurs in power systems, operators need to detect the disturbance and locate the hypocenter before launching appropriate remedial countermeasures. Although many fine-grained methods have been developed [1,2] to detect and locate disturbances, they may not be reliable all the way. For example, in the Northeast blackout of 2003 [3], a software 'bug' in a computer at First Energy prevented its energy management system from detecting and reporting important transmission line tripping events to the operators. The operators were unaware of the malfunction, the transmission line tripping events, and the needs of re-dispatching power generation. Consequently, a transmission line tripping event led to widespread power outage. If a backup disturbance detection and location system had worked in parallel with the current detection system and increased the situational awareness (SAW) over a wide geographic area, the risk of cascading failures like the Northeast blackout of 2003 could have been significantly reduced.
Several considerations to building the backup disturbance detection and location system are listed as follows: (i) Wide coverage. To obtain SAW in wide areas, the method should be able to monitor disturbance over the entire grid, and give a warning if the local fault detection fails.
(ii) Lightweight. The method should not rely on a high volume of sensors over the whole system. (iii) Independence to system models. Due to the administrative constraints, it is very difficult to build a detailed operational model in real time for a large interconnected power grid. Therefore, the method should function properly even without a detailed model.
Frequency monitoring network (FNET) is a platform for building a backup disturbance detection and location system. FNET is an Internet-based real-time, wide-area frequency monitoring network proposed in 2000 [4], which measures and records frequencies of typical office outlet at 120 V voltage level using GPS-synchronised clocks. The objective is to create a low cost and rapidly deployable wide-area frequency measurement system with high accuracy and minimal installation cost. The frequency disturbance recorders (FDRs), as a major part of FNET, calculate frequency using algorithms of phasor analysis and signal sampling techniques. The wide area measurement data obtained from different FDRs has revealed valuable information about power system disturbances.
Researchers have proposed to leverage the FDR data to locate disturbance sources based on the principle that disturbances travel as an electromechanical wave in power grid network [4]. When a disturbance happens, the contingency frequency perturbation will propagate in the entire power grid through transmission lines. The sampled frequency measurement data at sensors carries information about disturbance characteristics, such as disturbance categories and intensities. In general, a disturbance can lead to frequency changes with the same properties at different terminals, such as frequency decrease following a generator tripping or frequency increases following a load shedding. However, the widearea frequency measurements have a time difference of arrival (TDOA) of frequency perturbation by the distance between the measurement locations and disturbance source [5]. Consequently, it is a critical step to accurately detect the arrival time of the disturbance signals at sensors that are deployed at different geographic locations [6][7][8] and then the disturbance source can be located. Recently, researchers have leveraged the novel edge/fog computing paradigm to tackle the early anomaly detection problem in wide-area phasor measurement unit (PMU) network [9], and the results are pretty encouraging because the PMU fog can effectively reduce the data flow end-to-end delay without sacrificing data completeness.
In this paper, a data-driven online distributed disturbance location (DODDL) scheme is proposed to detect disturbances and locate their sources. The major contribution lies in two folds: (i) a singular spectrum analysis (SSA) [10] based change-point detection method is proposed to quickly detect disturbances and determine their arrival time; and (ii) a novel temporal scanning (TeSc) algorithm is proposed to identify the geographic location of the disturbance source. Utilising the FNET data, we have validated the DODDL scheme with satisfying results. Not only had it instantly detected the change in frequency with high accuracy, but it also located the source precisely.
The rest of the paper is organised as follows. A brief review of related work is presented in Section 2. Section 3 describes the proposed DODDL scheme. Section 4 discusses the rationale and design of the SSA based change-point detection method to quickly detect a disturbance and determine its arrival time at distributed sensors, along with conceptual-proof simulation experimental results. Section 5 introduces the TeSc algorithm, which can determine the geographic location of disturbance sources. Section 6 discusses the computational complexity issue. Finally, Section 7 concludes the paper with some discussions and our on-going efforts.

Disturbance arrival time detection
Quick and accurate detection of disturbances is the prerequisite of successful disturbance location. Thanks to the development of synchronised system measurement and communication technologies, many data analysis and machine learning algorithms have been introduced to detect disturbance. These methods can be roughly divided into model-based methods and non-model based methods. The non-model based methods directly catch the instantaneous changes from the observed waveform, induced by the abrupt transition when the disturbance occurs. For example, principal component analysis has been used on PMU data to extract features and then obtain the disturbance information like type and intense [11]. In [12], support vector machine and wavelet analysis are adopted to identify the faulty-section and faulty-half. In another work [13], the FDRs data was described as a hidden Markov model, and then disturbances were identified using matching pursuit decomposition. All of these machine learning and data analysis methods show significant potential in disturbance detection.
Under the same framework of TDOA using FDR data, current disturbance arrival time methods are simple and straightforward. A method has been proposed that employed a morphological filter to remove the noise before extracting wave-front arrival time [7]. The arrival time was defined as the first sample that presents several consecutive and large-enough differences. Another disturbance detection method is the event start time (EST) method [6], which also pre-processed the noises and spikes with an adaptive median filter. After de-noising, the EST method detects the EST as the last point with the most 'normal' value in a section of frequency data series. These methods suffer from not only single detection capability but also non-trivial detection delay. Therefore, they are good candidates for postmortem analysis, but not suitable for online detection.
In this paper, to achieve the goal of online disturbance detection, we considered the SSA, a non-model time-series analysis technique. Because the measurement data is synchronised, it is appropriate to consider it as a time series. Intuitively, arrival time detection can be considered as a change detection problem, which identifies the change point of the time series. And similar time-series analysis techniques have been tentatively used for other applications in the smart grid. Although the time-series analysis applications exist in the voltage quality monitoring [14,15], anomaly detection [16], false data injection attack detection [17] and so on, to date there is not reported research that has tried to leverage the SSA method in power system data analysis.

Geolocation methods
In this subsection, we focus the discussion on the travelling-wave based disturbance location methods. The major challenge of this kind of methods is the high heterogeneity in the network. For instance, the propagation velocities are not the same in the entire power grid. Most of the reported approaches tried to reduce the location error resulted from the heterogeneity through the additional information of the network topology, or just using single-end data. For example, the approach proposed in [13] detected the impact of the fault on each recorded bus and combined with topology information to give a fault contour map. In [12], the authors firstly identified the transmission configuration (either cable or line) and then used the corresponding wave travelling velocity to estimate the distance from the single end. These methods either require the topology information or only cover limited local grid network.
To locate disturbances in wide areas without using power grid topology, we used geographic distance instead of actual transmission line distance. The reported method using the TDOA to identify the geographic location of a disturbance source is defined by the equation below [8,18,19]: where V is the mean velocity of electromechanical waves propagating in the power grid, t i is the disturbance arrival time at a frequency measurement unit i, and t h is the actual disturbance start time. (x i , y i ) is the Cartesian coordinates of the ith frequency measurement unit and (x h , y h ) represents the disturbance hypocentral location. The equation set for estimating the disturbance source location can be solved using Newton's method or the least squares (LS) method [8,18]. The LS method linearises the model by making a difference between neighbour equations. One concern is that the root close to the real situation is often eliminated because of the order reduction. Newton's method is a gradient approaching method. However, because of the limited, unknown number, only a subset of the data items is needed. The method itself cannot take advantage of the abundant collected measurement information. The location estimation errors under these methods were at the level of 100 km. Besides the TDOA based methods, there are other approaches using triangulation location. Following the principle of local minimum, researchers have tried to triangulate disturbance location with only three measurement units, and they successfully detected in which nation a disturbance happened [16]. Researchers have estimated the location of disturbance sources by studying the frequency oscillations [19]; the smallest estimation error is still >160 km.
Because the major estimation errors result from the unequal velocities and simplification of electrical distance to geographic distance, we propose a location estimation method to mitigate these two issues by leveraging redundant measurements.

DODDL: system level overview
The DODDL scheme is a travelling-wave based method that aims at detecting disturbances and determining their geographic locations in real time using measurement data. A frequency perturbation will be observed at monitors distributed at different geographic locations in the identical morphology. Fig. 1 shows the frequency responses recorded by six different FDRs after a generator tripping event. The figure illustrates that FDRs at different locations observed different disturbance arrival time. The proposed disturbance location scheme is based on the geometrical triangulation algorithm which leverages the measurement of TDOA. According to the event triangulation principle, the DODDL scheme consists of two function blocks as shown in Fig. 2. Table 1 summarises the symbols to be used in the paper.

Arrival time estimation
An SSA-based change-point detection method is implemented to quickly detect the arrival of a disturbance at distributed sensing units. This detection method calculates the disturbance arrival time without any additional pre-processing steps for de-noising the data sequence. Consequently, it achieves much higher efficiency and is applicable to detect different categories of disturbances.

Disturbance source location
Leveraging the disturbance arrival time reported by each sensor, a novel TeSc algorithm is introduced to accurately identify the geographic location of the disturbance source point. There are two main challenges to accurately locate disturbances. Firstly, the propagation speed of the electromechanical waves may vary from 100 to 1000 km/s among different paths in the power grid due to variant transmission technologies. Secondly, it is the absolute geological distance that is used to simplify the calculation. However, in the real world, the electromechanical waves propagate along power grid transmission lines. To mitigate the negative impacts of the module simplification, the proposed TeSc algorithm takes advantage of the abundant FDRs data.

Estimation of disturbance arrival time
One fundamental step for disturbance location is that each sensor should accurately and quickly estimate the time point at which a disturbance starts to impact. With synchronisation, it is feasible to compare the disturbance record data marked with time stamps. Therefore, a time series is naturally suitable for corresponding data analysis technique. In the normal state, the collected frequency measurement time series will operate stably. When a disturbance happens and affects the local condition, the frequency measurement time series will present different patterns. Consequently, the detection of disturbance arrival problem is formulated as a change point detection problem, and the SSA algorithm is adopted to solve it.

Change-point detection method based on the SSA
Change-point detection [20][21][22] is concerned with the design and analysis of procedures for on-the-go detection of possible changes in the characteristics of a running (random) process. The time instance at which the state of the process changes is referred to as the change-point, which is not known in advance. The SSA theory, proposed in 1986 by Broomhead and King [10], has shown its significant ability in a wide field of practical applications, such as finding data structure, extracting periodic patterns and complex trends, smoothing and change point detection [17,23]. Actually, the SSA method has shown good potential in the detection of the sustained oscillation in the power grid [24]. Applied by a branch of applications, the SSA approach features lower false alarm rate and higher detection accuracy. Practically, matrix operation-based SSA conversion is more regular and easier for implementation than hash-function-based carnality estimator. Results of SSA procedure have already outlined the potential point of change, which simplifies the following operation, to benefit the overall process [21,22,25].
The main idea of SSA-based change-point detection method is as follows.
is an ongoing frequency measurement time series. At the time point of t + o, a section of previous data f t + 1 , f t + 2 , …, f t + N will provide a template of normal data operation, while a section of current data f t + 1 + s , f t + 2 + s , …, f t + o is supposed to describe the situation at the current time point. These two sections will be transformed into two trajectory matrices to keep the completed information. Meanwhile, the matrix of previous data, i.e. the target matrix, will be projected to an l-dimensional subspace ℒ l by the singular value decomposition (SVD). After that, the distance between the mentioned subspace and the matrix of the current situation, i.e. test matrix, will be calculated. A change point will be declared if the distance between the previous situation and the current situation is larger than a pre-determined threshold.
This SVD-based change-point detection method is resistant to noise. When projected by the SVD, the target trajectory matrix can be divided into different components. Each component can be represented by one or a set of eigenvalues along with their corresponding right eigenvectors and left eigenvectors. The signal containing disturbance information usually is either slowly-change trend or simple oscillation. So the signal can be described by several low-rank components with larger eigenvalues than others, while the noise is complex and its power is distributed to several components. Therefore, even with low signal-to-noise ratio, extracting the primary components (components with large eigenvalues) can effectively retain the disturbance information and eliminate noise.
The detection method involves the following three steps. Assuming N, M, s and o be constant integers, Step 1: Construction of the target matrix. Define f target (t) as the target time series starting from time point t with length N: Map the vector f target (t) into an M × K matrix F target (t) as follows: Apply the SVD operation on f target (t) . Assume , and get M eigenvalues λ 1 , λ 2 , …, λ M (in decreasing order) and the corresponding eigenvectors U 1 , U 2 , …, U M , which are biorthogonal, i.e. (U i , U j ) = 0, when i ≠ j. Then, let's choose the first l eigenvalues, calculate their corresponding rank-one biorthogonal matrices F i and get l-dimensional target matrix F target, l (n) , which is the sum of F i .
Step 2: Construction of the test matrix. Similar to Step 1, construct the test time series Step 3: Computing distance. Compute the difference between the two matrices measured via calculating the sum of squared Euclidean distance where U is composed by the chosen eigenvectors U i 1 , U i 2 , …, U i l of R (n) from step 1, and i indicates the ith column of the matrix. D is used to detect the change points through a specific threshold.

Online estimation of disturbance arrival time
The SSA-based change-point detection method is developed following the above principle. As shown in Algorithm 1 (see Fig. 3), it is straightforward to detect the disturbance arrival time. In a power grid, frequency data measured by each monitoring unit is collected as a one-dimensional time series. When an event occurs, such as a generator tripping or a transmission line tripping, a frequency perturbation is introduced. A generator tripping will result in decreasing of frequency, while a transmission line tripping will create a short-term oscillation of frequency and then the frequency will come back to the original value. Thus the detection of the change-point of the frequency data sequence identifies the arrival time of the disturbance.
In general, the longer the target/test window is, the more details can be extracted from the working pattern of that particular point. However, the tradeoff includes both higher computing complexity and longer processing time. Considering the length of the time series and the sizes of both target matrix and test matrix, a relatively small value was selected to control the computational complexity to a tolerable level with little sacrifice of the accuracy. In this paper, the parameters were experimentally set as: N = 36, M = 18, s = 36, o = 66 and l = 6.

Experimental validation
Because the actual disturbance arrival time is not known in the real-world FDRs data, simulation data were used to validate the proposed method. A generator tripping case and a transmission line tripping case were reported in this subsection. In our experimental study, the disturbance and fault locations were randomly selected without loss of generality. Actually multiple simulation experiments were conducted with different locations. The obtained results are similar, which shows that the proposed method is not sensitive to the location of the faults and disturbances. We compared the proposed method with the EST method, which mainly considered white noise and impulsive noise. An adaptive median filter was applied to smooth the spikes and noises [6].

Scenario 1: generator tripping:
The simulative data set, based on the 16-machine 68-bus system [26] shown in Fig. 4, was generated using the Power System Toolbox for 20 s with a simulation time step of 0.0033 s. In the 16-machine system, generator G01 was tripped off by incurring a fault on the transformer between bus 2 and bus 53 at 10.1 s. The fault was cleared at the near end at 10.15 s and at the remote end at 10.20 s. The load noise is usually around 3-5% in real-world power grid system [4]. To sufficiently mimic ambient responses to random load changes, 1-8% of Gaussian white noise was added at each load.
The frequency measurements at the 16 generators were collected and used as testing data. Because of the short distance, the electromechanical effect can be ignored. The actual disturbance arrival times at each generator had a very small random delay in the range of 1-2 ms after the disturbance start time, 10.1 s. Compared to the detection delays, the propagation delay was trivial. So approximation was made by using the disturbance start time instead of the disturbance arrival time, ignoring the negligible delay errors.
The performance of the SSA and EST algorithms was compared under noise levels from 1 to 8%. Fig. 5a shows the average arrival time detected by the two algorithms at the 15 generators (i.e. G02, ..., G16). When the noise level grew, the start time estimated by the SSA was consistently close to the actual start time 10.1 s, with some ignorable increasing. In contrast, the start time estimated by the EST algorithm suffered obvious longer time delays. This degradation was due to two factors: (i) the effect of the disturbance was initially weakened by the median filter; and (ii) in the EST the disturbance start time was defined as the last intersection with the benchmark, while the vibration after the generator tripping made the last intersection away from the yielding point. Fig. 5a also compares the detection rate of the SSA and EST algorithms. Given the generator tripping start time at 10.1 s. The detection rate R d in Fig. 5 is defined as No . of events detected by the algorithm No . of total events Generally, both the EST and SSA algorithms were able to successfully detect the generator tripping. The relatively lower detection rate of the EST may be caused by its median filter and its approximate treatment of the yielding point, which may lead to detection before the disturbance.

Scenario 2:
transmission line tripping: Another tested power system event is a transmission line tripping, which generally speaking does not result in frequency changes as significant as the generator tripping does. Instead, the frequency profile will experience a transitory oscillation with smaller magnitudes than those in generator tripping. Then, the frequency will return to its original value as it was before the disturbance. In the simulation, we tripped the transmission line between bus 47 and bus 48. The fault was cleared at the near end at 10.15 s and at the remote end at 10.20 s. And the total simulation window length was 20 s. The general performance evaluations actually concur with the intuition. Fig. 5b shows the average estimated start time of the disturbance and the detection rate for each algorithm. The EST algorithm suffered from a huge delay of detection and a much lower detection rate with higher noise levels. One reason lies in the denoising scheme of the EST, which mistakenly removed the head of the disturbance. Also, in the transmission line tripping, the frequency returned to the original value after the disturbance, which invalidates the assumption used by the EST detection.
The SSA outperformed the EST in both the generator tripping and line tripping scenarios. In the generator tripping simulation, although both algorithms were able to detect the disturbance with a high detection rate, the SSA was outstanding with shorter delays. In the transmission line tripping case, the SSA maintained a capability of quick and accurate detection. Meanwhile the EST gradually lost its efficacy of detection as the noise levels got higher.
For the generator tripping event shown in Fig. 1, the disturbance arrival time at FDRs was estimated by the SSA. The results shown in Table 2 match with the real data.

Location of disturbance sources
Once the arrival time of the disturbance has been accurately determined at distributed FDRs, the second phase is to find out the actual beginning time t h and the geographic location (x h , y h ) of the disturbance source based on the location of FDRs.

Propagation model analysis
Before building the disturbance propagation model, two assumptions are made as follows: • The frequency perturbation propagates to every measurement unit (i.e. FDR) at different locations as a continuum. It spreads over the entire electric grid at a uniform speed. • Although the perturbation propagates along electric transmission lines, geological distance is used instead of electric distance.
Thus the disturbance propagation model is written as where V is the mean velocity of electromechanical waves, which is treated as a constant according to the first assumption. t i is the disturbance arrival time at the frequency measurement unit i, while t h represents the actual disturbance or event start time. (x i , y i ) is the Cartesian coordinates of the ith frequency measurement unit and (x h , y h ) represents the Cartesian coordinates of the disturbance hypocentral location.
As the location information of the frequency measurement unit was provided in the form of latitude with longitude, a transform is needed before the computation. The Greenwich coordinate of each unit (Lat i , Lon i ) can be converted into Cartesian coordinate (x i , y i ) using the following equation: where 111,300 m is the approximate distance between two locations with the same longitudes and 1° difference in the latitudes.
The assumptions have simplified the propagation model but also introduced some systemic errors. One issue is that the actual propagation velocities are measured in the range from 100 to 1000 km/s along different paths. Meanwhile, earlier studies have shown that for a given path, the propagation velocity depends on timevarying system conditions [27]. Also, the difference between the real electric distance and the geometric distance can be unignorably large. However, addressing these issues requires considering the actual propagation velocity and the real-world electric network configuration and topology, which will lead to very high computational cost. It is worthy of considering how to reduce the negative effect of these noise by leveraging the abundant FDRs data.
Another issue that calls for attention is the overdetermined equation set. Normally the number of the measurement units is larger than the number of the unknowns ((x h , y h ), t h , and V at most), which means the result can be attained using only part of the data set. Meanwhile, it is critical to select data points that are less affected by the underlying noise. If the selected data points were with a high level of noise, it is possible that the measurement units far away from the disturbance source will mistakenly detect the disturbance earlier than the ones that actually are located closer. Then the location result will point to the opposite direction.
Considering the measurement samples were more than the unknown variables, we firstly provided some estimates about the disturbance location and then used all the measurement data to evaluate and determine the estimation of the most possible. Although the individual measurement data had propagation error and modification error, according to the law of large number, with a large enough sample size, the estimation will be sufficiently close to the actual situation. Following this rationale, the proposed TeSc algorithm firstly solved the equation set with partial selected data set and start time assumption and got a series of estimations. Then the estimates were evaluated by all measurement samples using the mean squared error (MSE).

Temporal scanning algorithm
Based on the following observations, the TeSc algorithm is proposed to locate the source of the detected disturbance.
1. Although the accurate start time of a disturbance is unknown, a reasonable range of the actual time point can be estimated. 2. Similarly, the propagation velocity of the electromechanical wave in the power grid can be assumed reasonably well.
3. Intuitively, the noises and interferences cannot make the signals propagate faster but only slower. 4. The actual location (x h , y h ) and start time t h should be able to fit in (2) with small error.   (x h , y h ) and t h . 5. Below is a detailed explanation of the operations.

Value range of the disturbance start time:
As the disturbance event must have started earlier than the time point when the signal is detected, the upper bound of start time was selected as the earliest of the detected arrival times, t max . The lower bound t min was defined in the following equation: where d max is the largest distance between two nodes in the power grid. It is the extreme case when the measurement unit and the disturbance source are at the two furthest ends of the network, such that the unit is the last one to detect the disturbance. Hence, in any situation, the disturbance should happen between t min and t max .

Dataset selection algorithms:
Given a particular value of event start time t h and a reasonably assumed velocity V, there are only two unknowns. As such, two data points are sufficient. More than enough data is collected by measurement units to locate the source of a disturbance, but arbitrarily picking two may end up with a huge error. The fewer errors or noises, the more accurate the spatiotemporal location will be. Once a [(x h , y h )] t h is solved, the rest of the redundant measurement units are used to calculate the MSE of this [(x h , y h )] t h point. It is not practical to expect an MSE of zero, but the minimum MSE does indicate the best estimates. For this purpose, the following two-step data selection method is proposed. On the one hand, the entire map is divided into regions with equal size along the longitude and latitude, and an average disturbance arrival time detected by the measurement units in each region was calculated. Intuitively, an average of noises which lead to inconsistency of the detection sequence concerning the distance between the measurement units and the disturbance source point were also used. By applying the average, the influence of the random noises on the individual data items can be effectively reduced.
On the other hand, to solve (2) to find the geographic location, data points with the minimal disturbance arrival times were selected from two regions in which the average arrival times were minimal. The main source of the error is the assumption that the disturbance signals propagate along a straight line, while they actually follow the grid transmission lines. Hence, it is reasonable to assume that the earlier a sensor detects a disturbance, the less effect from noise or distance error is involved. Unconditionally choosing the single point that is the earliest detection of the disturbance will increase the risk of false detection. Therefore, it is proposed to select the earliest detected points from two regions with minimal average arrival time.
The above-mentioned rationale had been validated by a generator tripping case. In Fig. 6a, the green dots represent the 53 FDRs; and three monitoring units that detected the disturbance are marked as U A , U B , and U C . We divided the regions by each 10° in longitude and 5° in latitude and calculated the average arrival time, show them in Fig. 6b.
A reasonable guess is that the source of the disturbance should be located between 70° and 90° in longitude and 35° and 45° in latitude. Although the calculation using U A claims that the disturbance is located in between 90° and 100° in longitude and 40° and 45° in latitude, considering the regional large average arrival time of 18.22 s, the detection of U A probably was not accurate. Meanwhile, the second and the third points, U B and U C , belong to the two regions with minimal average arrival times, which made them more trustworthy. The calculation result also verified that higher accuracy was achieved by using the data at U B and U C rather than selecting the first two FDRs at U A and U B to solve the equation set.

MSE calculation:
The MSE assesses the quality of an estimator (i.e. a mathematical function mapping a sample of data to a parameter of the population from which the data is sampled) or a predictor (i.e. a function mapping arbitrary inputs to a sample of values of some random variable). The definition of the MSE is where Ŷ is a vector for n predictions, and Y is the vector of observed values corresponding to the inputs to the function that generates the predictions. Let's define the error function of one set of the estimator (x^h, y^h, t^h), such that Ideally, if there is not any noise or error, it is expected that Suppose the number of selected data items is n, and the estimated event start time t^h = t min + mΔt, where m ∈ (0, 1, …, (t max − t min )/Δt) Therefore, the MSE can be calculated as

TeSc algorithm:
The operations of the TeSc algorithm are illustrated by the pseudo-code in the following algorithm description table (see Fig. 7).

Validation
We validated the proposed TeSc algorithm using the generator tripping event which took place on 14 November 2013. As shown in Fig. 1, among all FDR units, significant perturbation had been observed by six sets of FDR frequency signals, which were selected for the validation analysis. The SSA based change-point detection method was used to detect the disturbance arrival time at each single FDR. Table 2 shows the Greenwich coordinate and disturbance arrival times of the six FDRs.
In this experimental study, the parameters are set as follows. The largest distance in this power grid d max = 3 × 10 6 m, the propagation velocity V = 6.9 × 10 5 m/s. Therefore, the estimated start time range was [13.8000, 17. The last important issue is the impacts that are introduced by the estimation error of electromechanical wave propagation speed, because we do not know the actual propagation speed in the power grid. Fig. 9b illustrates the differences in the detection results under various velocity assumptions, from 500,000 m/s to 900,000 m/s. This result revealed that the location accuracy is insensitive to errors in the assumed electromechanical wave propagation speed.

Computational complexity
The computational complexity is an important issue to the algorithms for real-time applications. In this section, the SSAbased change-point detection method and TeSc location algorithm are evaluated. Generally speaking, the SSA-based change-point detection method should work all the time for each local measurement to monitor if there is any change, while the TeSc location algorithm only runs when a disturbance is detected.

Computational complexity of the SSA-based changepoint detection method
In the calculation of the SSA procedure for a single time point, the matrix algebra computation accounts for a very high share. So the matrix size parameters N, M, s, o and subspace dimension l determine the computational complexity. The complexity of each operation in the SSA procedure is shown in Table 3. In practice, the target matrix usually is a square matrix, i.e. M = N /2 > l, and

Computational complexity of the TeSc location algorithm
When a major disturbance is detected, and the disturbance arrival time at each FDR is obtained, TeSc scheme is triggered to locate the source of the disturbance. In the procedure of the TeSc location algorithm, the disturbance time scans on the range of (t min , t max ), with the step Δt. In every turn of scanning, with a possible disturbance start time t h , a binary quadratic equation set is solved for a set of possible disturbance location (x h , y h ), and then the MSE of the disturbance estimator (x h , y h , t h ) is calculated. In this procedure, the solving of the binary quadratic equation set costs most of the computing resources. The experiment platform is a laptop with Intel i5-4200M CPU @ 2.50 GHz, 8.00GB RAM. The operation time on MATLAB R2012a is 19.1881 s of CPU time.

Conclusions
In this paper, we proposed a novel method, namely DODDL, which can rapidly detect disturbances and accurately locate their sources in a large scale power grid leveraging the distributed measurement units. The SSA-based change-point detection method was proposed to the field of power grid monitoring and evaluated through intensive experimental study. The simulation experiments have shown the robustness of the proposed DODDL method in a noisy environment and its high sensitivity to the changes in the power grid, such as line tripping. Quick and accurate detection of the arrival time of disturbance signal at multiple frequency measurement units inspired the TeSc algorithm, which can estimate the geographic location of the source of a disturbance. Using realworld data, the experimental results verified the effectiveness and correctness of the DODDL scheme. Furthermore, the DODDL scheme achieved high accuracy, and effectively avoided the invalid solution brought by the quadratic equation set, and thus, is robust to the errors of propagation velocity. The SSA method and TeSc algorithm are non-parameter, model-less algorithms. It is the advantage of the proposal as it is insensitive to the changes in the actual grid properties. However, it is also the main constraint that prevents the algorithms from leveraging the characteristics of the grid to achieve better performance. Accordingly, our on-going work mainly consists of the following components: (i) The effects of a fine-grained electromechanical wave propagation model and more detailed network topology of the power grid will be considered in the TeSc algorithm for more accurate location. (ii) New digital signal processing techniques will be explored to denoise the data without losing useful information. (iii) New signal processing algorithms will be developed to detect, distinguish and locate multiple anomalies and disturbances that happen within short time intervals.