Clusterization for Distributed Timely Detection of Changes in Smart Grids



INTRODUCTION
The Smart Grid concept evolved as a response to the ever-increasing demand for highly reliable electric power, the increasing penetration of renewable sources, and the need for a robust and resilient grid design that can surmount the challenges of an aging infrastructure. Modernizing the grid encompasses both the distribution and the transmission levels; because it is the less advanced of the two, the distribution level currently receives most of the design attention. At the distribution level, the main objectives include improved efficiency, reliability and power quality, high penetration of renewables, active load control, self-healing, and vulnerability monitoring. These objectives could in principle be achieved through the fast control of hundreds of individual distributed generators (DGs) and other devices linked to the grid through a power electronic interface. However, this would require real-time information on each DG unit and on key loads, leading to a daunting control problem. The control complexity of such a system may be greatly reduced, and its reliability improved, if the distribution system is broken down into smaller partitions, named clusters, each containing a supporting data communications infrastructure and resembling the concept of microgrids [1].
The challenge addressed in this paper is the formalization of an optimization criterion for the appropriate formation of clusters at the distribution level. Our formalization is based on the following idea: clusterization evolves from the necessity for fast response in the presence of significant power flow (PF) or power loss (PL) changes, reflected by measurable bus voltage and/or current changes. Thus, the pertinent optimization criterion adopted is accurate and timely detection of such changes. The adopted criterion leads to the development of a distributed stochastic sequential algorithm, implemented by the supporting data communication network, whose distributed characteristics are initially defined by sufficient identifiability conditions and whose accuracy and speed performances are controlled by a set of threshold parameters. The algorithm is additionally asymptotically optimal in a mathematically precise sense.
In distribution networks, real-time monitoring is a critical issue, since the number of supervisory devices installed may be limited. To provide reasonable and meaningful estimates of the distribution network state from a very limited set of real measurements, State Estimation (SE) methods have been widely studied in the literature [2][3][4][5][6][7][8][9][10][11][12]. The network state is typically given by the voltages at all nodes [13], estimated from a few real measurements complemented with pseudo-measurements to ensure system observability; the pseudo-measurements need to be accurately modeled to improve the estimation quality [2]. In [3], the network is first split into smaller sections, before an SE process is deployed to estimate voltage magnitudes via artificial neural networks. In [4], SE based on a statistical technique is utilized to estimate the level, location and impact of voltage unbalance. In [5] and [6], the status of the distribution network is estimated using a distributed measurement system in a multi-area framework.
The work in [7] utilizes synchrophasor measurements to study dynamic SE techniques in distribution grids. The SE approach in [8] is proposed for the identification of network topology changes. With the integration of distributed generation, the evolution of the distribution grids may involve increasingly complex behavior. As mentioned in [9], traditional SE methods may be insufficient and inaccurate in the evaluation of network status. Hence, while robust SE techniques are required to support the secure operation of the network, such approaches may in some cases [10,11] require a significant number of measurements, leading to complex computational processing. In addition, inaccuracies of the deployed SE methodologies may affect confidence in their results [12].
In the present work, we depart from SE methods in monitoring. Instead, we propose a sequential distributed algorithm which monitors the failure probabilities of distribution lines (including the equipment they incorporate), to effectively detect alarming dysfunctionalities while simultaneously imposing network clusterization. The organization of the paper is as follows. In Section 2, the system model and problem formulation are presented; in addition, a maximum likelihood (ML) estimation optimization criterion is formulated and identifiability analysis is performed. In Section 3, a stochastic approximation approach to the maximum likelihood solution is presented and analyzed. In Section 4, the distributed sequential detection of change algorithm is presented and analyzed. In Section 5, discussions on the structure of numerical evaluations and an example are included. In Section 6, conclusions are drawn.

SYSTEM MODEL, PROBLEM FORMULATION AND ML IDENTIFIABILITY
We view a power distribution network as a graph comprised of nodes (DGs and power consuming units) and lines. For each of the lines, we determine metrics of good condition or acceptable functionality, which may be represented by acceptable ranges of bus voltages and currents. It is important to note here that the status of DGs and power consuming units may be incorporated into the acceptable line functionality concept; alternatively, dysfunctionality of lines may be caused by failing equipment at their ends. Thus, when a line is declared faulty, a functionality investigation of all its components (including the equipment at its ends) should be initiated next. If a line meets the acceptable functionality conditions, it is given a success score 0; otherwise, it is given a failure score 1. Similarly, we determine acceptable functionality for power demands raised at some network node k and addressing another network node l, giving the same 0, 1 scores. The (kl) acceptable functionality is determined by the same metrics as those for line functionality, except that the ordered pair (kl) is generally a route involving several lines, so that a (kl) unacceptable functionality may be caused by the failure of any of the lines involved in the route. We will assume that the metrics of acceptable functionality are well defined. For a given topology of the power distribution network, let us denote:
For a given network topology, we assume that relative loads and routing probabilities are design quantities and remain unchanged during the monitoring process (before an increase of their values above the upper bounds is estimated). We use (kl) outcomes as observations. We first formulate a maximum likelihood (ML) estimation approach [14] for the probabilities $\{p_i\}$, which encompasses identifiability conditions. We then transform the result of this approach into a sequential stochastic approximation [15] format, which we finally use to develop a sequential detection of change algorithm for the set $\{p_i\}$.
A similar approach was first taken in [16] for the distributed monitoring of the telephone network. In this paper, the delay characteristics induced by the distributed sequential detection of change algorithm are additionally used as design guidelines for the clusterization of the power distribution network.
To avoid formulating a cumbersome problem, we make the following simplifying assumption: for each demand failure, there is a major contributor, namely a single line which is its initial or main cause. This assumption is consistent with our network maintenance objective: we wish to monitor possibly deteriorating network conditions, characterized as "soft faults", rather than recognize "obvious" catastrophic events. Under this assumption we have:
Without much effort, we can also derive the expression below, which expresses the probability $f_{(kl)}(x)$ as a function of load, routing and failure probabilities.
The overall failure probability for pairs (kl) is obtained by summing up the above expression over all the network lines. Finally, we consider a sequence of pair observations, where $x_{j,kl}$ denotes the j-th outcome of a (kl) pair demand and $N_{kl}$ is the number of (kl) pair demands made. Then, assuming that demands are independent from each other and summing up over all (kl) pair demands, we form the following ML function for the probabilities $\{p_i\}$:
We note that the independence assumption made in the formulation of the ML function in (4) represents a worst case scenario: in the presence of dependent demands, the probability of error induced by the then optimal ML function is bounded from above by that induced by the ML function in (4) [14].
The existence of unique solutions of the system in (5)
From the above derivations and discussions, we conclude that $\{p_i\}$ identifiability is represented by the sign of the quadratic expression in (8), where this sign is negative if and only if the vectors in (9) are all linearly independent. Thus, the set $\{p_i\}$ is identifiable if and only if the vectors in (9) are linearly independent. Alternatively, subsets of components in $\{p_i\}$ which are identifiable correspond to linearly independent routing vectors in (9). Such subsets determine an initial clusterization of the network, where the cluster sizes arising from them are generally larger than those which will arise from delay constraints imposed on the reliable detection of changes.
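The linear independence test behind identifiability can be sketched in a few lines. The following is a minimal illustration, not the paper's procedure: the function name `independent_columns` and the small matrix `Q` are hypothetical, with rows standing for (kl) pairs and columns for the routing vectors of individual lines. Columns are scanned left to right and kept only if they are not a linear combination of the columns already kept; exact rational arithmetic avoids rounding issues.

```python
from fractions import Fraction


def independent_columns(matrix):
    """Return indices of a maximal set of linearly independent columns.

    Columns are scanned left to right (greedy); each kept column is stored
    in reduced form together with the row index of its pivot entry.
    """
    n_rows = len(matrix)
    n_cols = len(matrix[0]) if matrix else 0
    basis = []   # list of (pivot_row, reduced_column)
    chosen = []
    for c in range(n_cols):
        v = [Fraction(matrix[r][c]) for r in range(n_rows)]
        # Eliminate the components of v along the columns kept so far.
        for pivot, b in basis:
            if v[pivot] != 0:
                factor = v[pivot] / b[pivot]
                v = [vi - factor * bi for vi, bi in zip(v, b)]
        # A nonzero remainder means the column is independent of the basis.
        pivot = next((r for r, vi in enumerate(v) if vi != 0), None)
        if pivot is not None:
            basis.append((pivot, v))
            chosen.append(c)
    return chosen


# Hypothetical routing matrix: 3 pairs (rows) x 4 lines (columns).
# Lines 0 and 1 appear on exactly the same routes, so only one of them
# can be identified from pair observations.
Q = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
]
print(independent_columns(Q))
```

In this toy matrix the second column duplicates the first, so it is dropped; such indistinguishable lines are the kind of case that a smaller cluster with its own measurements would have to resolve.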
Let us now assume that we have identified $M_a$ linearly independent routing vectors and the matching identifiable lines $a_s$; $s = 1, \dots, M_a$, in the network. For such vectors, let us define:
Then, the ML system in (5) can be written as:
is substituted by the routing probability $q_{s,(kl)}$. The resulting ML system then depends only on observations and routing probabilities, not on relative loads.
We will complete this section by stating a theorem whose proof is in the appendix.

Theorem 1
The ML estimate in (11) is asymptotically consistent and efficient if each true value of the identifiable components in $\{p_i\}$ is larger than some $e_j > 0$, where $1$ [...] $e_j$ [...] $j$.
Theorem 1 states an important algorithmic guideline. It basically implies that perfectly operating or totally disconnected (kl) power demand-to-power source pairs should be excluded from the ML algorithm, because they may dominate the estimation scheme and lead to false overall estimates.

A STOCHASTIC APPROXIMATION ESTIMATION ALGORITHM
In this section, we seek sequential ML algorithms for the identifiable lines in the network. Drawing from the notation in Section 2 and using the vector notation p for the set $\{p_i\}$ of probabilities, we first define a vector for a given identifiable pair $(kl)_n$ and observed outcome x from the pair:
We then express the following recursive approximation algorithm, where x(t+1) is the outcome of the (t+1)-th pair observation:
In (13), sequential estimates of all the identifiable components of the $\{p_i\}$ probabilities are computed at successive observation instants, and the evolution of the scalar $L(\cdot)$ in time is termed the gain sequence. If x(t+1) is a pair (kl) observation, then:
In the appendix, we prove that the error of the stochastic approximation estimate in (13) converges asymptotically to a Gaussian variable, if the gain sequence and its first order derivatives are all bounded in the region of probability values where the conditions in Theorem 1 are satisfied. The specific choice of the gain sequence fine-tunes the convergence rate. From the above, as well as from results in [17] and [18], as discussed in the appendix, we conclude the following refinement of the algorithm in (13):
for the two terms in (17) being the smallest and largest eigenvalues of the ML covariance matrix below.
The stochastic approximation algorithm for the set $\{v_i\}$ of probabilities is identical to that for the set $\{p_i\}$, as presented in this section.
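To make the idea of a recursive estimate with a gain sequence concrete, the following is a deliberately simplified scalar sketch. It is not the vector recursion in (13): it assumes a single line observed directly through 0/1 outcomes, and the gain 1/(t+1) is an illustrative choice (the classical Robbins-Monro gain), under which the recursion reduces to a running average. The names `sa_update` and `estimate` are hypothetical.

```python
def sa_update(p_hat, x, t, gain=lambda t: 1.0 / (t + 1)):
    """One stochastic approximation step for a scalar failure probability.

    p_hat : current estimate
    x     : 0/1 outcome of the (t+1)-th observation
    t     : number of observations already processed
    gain  : gain sequence (assumed 1/(t+1) here, purely for illustration)
    """
    return p_hat + gain(t) * (x - p_hat)


def estimate(outcomes, p0=0.5):
    """Run the recursion over a sequence of 0/1 outcomes."""
    p_hat = p0
    for t, x in enumerate(outcomes):
        p_hat = sa_update(p_hat, x, t)
    return p_hat


# One failure in four observations.
print(estimate([0, 1, 0, 0]))
```

With this particular gain the initial guess `p0` is forgotten after the first observation; other gain sequences trade off tracking speed against estimation variance, which is the tuning role the text assigns to the gain sequence.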

THE DISTRIBUTED DETECTION OF CHANGE ALGORITHM (DDCA)
In Section 2, we determined an initial network clusterization by isolating network lines which are identifiable by an ML estimation algorithm. In Section 3, we developed a sequential stochastic approximation format of the latter ML algorithm, as implemented within a set of identifiable network lines. In this section, we develop a sequential Distributed Detection of Change Algorithm (DDCA), still maintaining operations within an identifiable set of lines. In particular, the algorithm uses the sequential steps in the stochastic approximation algorithm of Section 3, in conjunction with the sequential evolution of the
is developed to detect a change from one distribution to another in an optimal fashion and involves a threshold parameter $\delta > 0$. In particular, the algorithm is asymptotically optimal in the sense that, for $\delta \to \infty$, the expected time for detecting a correct change is of order $\log \delta$, while the expected time for an incorrect decision is of order $\delta$; moreover, no algorithm attains faster correct decisions subject to an order-$\delta$ expected time for incorrect decisions. Here, we transfer the concept of sequential stochastic approximation estimation to the concept of asymptotically optimal sequential detection of change via the following logical steps:
(i) Let us assume that a satisfactory functionality of each line i is reflected by a probability $v_i$ which is bounded from above by a given value $\rho_i$. Let us assume that unsatisfactory functionality is then reflected by a $v_i$ value which is bounded from below by $\rho_i + \eta_i$, where $\eta_i > 0$ and $\rho_i + \eta_i < 1$.
(ii) Let us then require that we detect a $\rho_i$ to $\rho_i + \eta_i$ change per $v_i$ by detecting rapidly a change from a
likelihood function is given by (4)
By substitution in the modified expression (4), as applying to the probabilities $\{v_i\}$, we conclude that
Equivalently, let us consider the case, where subject to $\{v_i = \rho_i;\ i \neq s,\ v_s = \rho_s + \eta_s\}$ versus $\{v_i = \rho_i;\ \text{all } i\}$, the stochastic approximation algorithms in (13)
Given design threshold parameter $\delta_s > 0$: Stop the first time n(s), such that $W(n(s)) \geq \delta_s$, and decide that the change $\rho_s \to \rho_s + \eta_s$ has occurred.
The above algorithm uses only observations and routing probabilities and basically detects a change from a Bernoulli random variable with parameter $\rho_s$ to another Bernoulli random variable with parameter $\rho_s + \eta_s$, where the Kullback-Leibler number between these two Bernoulli variables is:
Non-asymptotically, the performance of the algorithm is determined by the power and false alarm probabilities it induces, as functions of time. Given a finite threshold value $\delta_s$, the false alarm probability $\alpha(\delta_s, n)$ is the probability that the algorithm crosses the threshold at the n-th observation for the first time, given that no change $\rho_s \to \rho_s + \eta_s$ has occurred. Given the same threshold, the power probability $\beta(\delta_s, n)$ is the probability that the algorithm crosses the threshold at the n-th observation for the first time, given that the change $\rho_s \to \rho_s + \eta_s$ occurred before the collection of observations began. The methodology for the recursive in time computations of the probabilities $\alpha(\delta_s, n)$ and $\beta(\delta_s, n)$ in the presence of the Bernoulli model represented by the algorithm in (24) is an extension of that found in [14] and [19]. The design decision regarding the operating threshold $\delta_s$ is based on the required power versus false alarm tradeoff. The specific requirement is then that, at a given time n, the power probability exceeds a given lower bound, while the false alarm probability remains below another given upper bound. Extensive discussion on the selection of operating decision thresholds for the simpler, non-distributed version of the algorithm can be found in [22].
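For reference, the Kullback-Leibler number between two Bernoulli variables has a standard closed form, sketched below. The direction used here (post-change parameter relative to pre-change parameter) is an assumption, chosen because that is the quantity that governs detection delay in classical change-detection theory; the function name `bernoulli_kl` is hypothetical and the result is in nats.

```python
import math


def bernoulli_kl(p_after, p_before):
    """Kullback-Leibler number of Bernoulli(p_after) relative to
    Bernoulli(p_before), in nats. Both parameters must lie strictly
    between 0 and 1.
    """
    return (p_after * math.log(p_after / p_before)
            + (1.0 - p_after) * math.log((1.0 - p_after) / (1.0 - p_before)))


# Illustrative values only: a change from parameter 0.25 to 0.5.
print(bernoulli_kl(0.5, 0.25))
```

The number is zero when the two parameters coincide and grows as they separate, so a larger gap between the pre-change and post-change parameters makes the change easier to detect quickly.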

NUMERICAL EVALUATIONS - AN EXAMPLE
Given a power distribution system, the DDCA is implemented stepwise, as follows:
(i) The global system matrix with columns as those in (9)
We note that in the case where $\rho_s = \rho$ and $\eta_s = \eta$ for all s, the updating steps in (22) take the following simplified form:
In this section, we consider an example of a simple hypothetical 12.66 kV distribution system with 33 buses, 37 lines, and a looping (auxiliary) branch [23], exhibited in Fig. 1, where it is assumed that the power sources in the system are located at buses 1, 6 and 12 and the loads are located at buses 8, 14, 18, 19, 23 and 26. We also consider the possibility that the auxiliary line 35 may be utilized 20% of the time, when line 18 is disconnected (80% of the time), e.g. for minimizing real power losses and improving the voltage profile [24] or for optimal day-ahead operational scheduling [25] via network topology reconfiguration.
Subsequently, two different scenarios arise: one referring to the above utilization of line 35 and one in which line 35 is absent. In Tables 1 and 5, we exhibit the routing probabilities induced by the two different scenarios, where we indicate in bold the vulnerable lines which need to be monitored: lines 1, 4, 6, 17, 22, 25 and 35 (35 is absent in scenario 1). In view of the latter lines of interest, the matrix in Table 1 has a maximum of three independent columns, corresponding to lines 17, 22 and 25, while Table 5 has a maximum of four independent columns, corresponding to lines 17, 22, 25 and 35. Thus, in both scenarios, not all lines of interest can be monitored via source-to-demand/load measurements without clusterization. Non-unique clusterization approaches are exhibited for each scenario in Tables 2, 3 and 4 and Tables 6, 7 and 8, respectively, where a final clusterization decision may be dictated by possible additional limiting physical factors of the network, such as line ratings. We note that, in both scenarios, lines 17, 22 and 25 may be monitored by the global non-clustered system, in which case measurements from all three sources, 1, 6 and 12, contribute to their monitoring; then, as compared to the clusterized approaches, their monitoring is accelerated.
Table 1. Global system matrix $\{q_{i,(kl)}\}$. Identifiability probabilities for the power distribution system in Fig. 1

Without clusterization, only lines 17, 22, 25 and 35 can be monitored. A clusterization choice is shown below.
Regarding the monitoring of the vulnerable lines, let us also assume that $\rho_s = \rho$ and $\eta_s = \eta$ for all s, in which case the simplified form of the updating step in (28) is used in the implementation of the DDCA in (24). Finally, let us assume that the tolerable versus non-tolerable power system conditions are reflected by the choices $\rho = 0.01$ and $\eta = 0.04$ for non-auxiliary lines, and $\rho = 0.01$ and $\eta = 0.2$ for auxiliary lines. The latter choices, in conjunction with the probabilities in Tables 2, 3
In both scenarios, update the algorithm each time a (1,8), (1,14), (1,18), (1,19), (1,23) or (1,26) source-to-demand/load pair measurement is collected. For each source-to-demand pair measurement, use the updating step in (29) below.
In scenario 1, update the algorithm each time a (6,19) or (6,23) source-to-demand/load pair measurement is collected. In scenario 2, update each time a (6,23) source-to-demand/load pair measurement is collected. For each source-to-demand pair measurement, use the updating step in (29).

(d) For line 25
In both scenarios, update the algorithm each time a (6,26) source-to-demand/load pair measurement is collected. For each source-to-demand pair measurement, use the updating step in (29).

(e) For line 17
In both scenarios, update the algorithm each time a (12,18) source-to-demand/load pair measurement is collected. For each source-to-demand pair measurement, use the updating step in (29).

(f) For line 22
In both scenarios, update the algorithm each time a (12,23) source-to-demand/load pair measurement is collected. For each source-to-demand pair measurement, use the updating step in (29).

(g) For line 35
In scenario 2, update the algorithm each time a (12,19) source-to-demand/load pair measurement is collected. For each source-to-demand pair measurement, use the updating step in (29).
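The per-line update rules in items (d) to (g) amount to a lookup from an observed source-to-demand/load pair to the monitored lines whose detectors should be updated. A minimal sketch of such a lookup follows; the names `UPDATE_PAIRS` and `lines_to_update` are hypothetical, scenario 1 is taken to be the one without line 35, and only the items whose line labels are stated explicitly above are transcribed.

```python
# Pairs (source bus, load bus) that trigger an update, per monitored line
# and per scenario, transcribed from items (d)-(g).
UPDATE_PAIRS = {
    25: {1: {(6, 26)}, 2: {(6, 26)}},
    17: {1: {(12, 18)}, 2: {(12, 18)}},
    22: {1: {(12, 23)}, 2: {(12, 23)}},
    35: {2: {(12, 19)}},  # line 35 is absent in scenario 1
}


def lines_to_update(pair, scenario):
    """Return the monitored lines whose detector is updated when a
    measurement for the given (source, load) pair arrives."""
    return sorted(
        line
        for line, by_scenario in UPDATE_PAIRS.items()
        if pair in by_scenario.get(scenario, set())
    )


print(lines_to_update((6, 26), scenario=1))
print(lines_to_update((12, 19), scenario=2))
```

Keeping the mapping as data rather than as branching logic makes it straightforward to swap in a different clusterization choice without touching the detector code.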
Selecting a threshold value $\delta_s = 4.47$ for the algorithm with updating step as in (29), we attain favorable values of the false alarm and power probabilities, $\alpha(\delta_s, n)$ and $\beta(\delta_s, n)$. Specifically, in 100 measurements, the power then equals 0.98, while the false alarm probability is practically zero (see [14], Figure 8.5.1).
The specific tolerable upper limit on the number of measurements is determined by the time required for their collection, in conjunction with the time-limit demanded for the detection of substantial changes.
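The behavior of a threshold test of this kind can be illustrated with a short sketch. Since the updating step (29) is not reproduced here, the code below assumes the standard log-likelihood-ratio recursion for a Bernoulli change (Page's test, reset at zero) and assumes that the threshold is compared against the statistic on that same logarithmic scale; both are assumptions, and `cusum_detect` is a hypothetical name. The parameter values are those of the example.

```python
import math


def cusum_detect(outcomes, rho, eta, threshold):
    """Return the 1-based index of the first threshold crossing, or None.

    outcomes  : iterable of 0/1 scores (1 = failure)
    rho       : pre-change failure probability
    eta       : size of the change to detect (post-change = rho + eta)
    threshold : decision threshold on the log-likelihood-ratio scale
                (an assumption of this sketch)
    """
    step_fail = math.log((rho + eta) / rho)              # increment on a failure
    step_ok = math.log((1.0 - rho - eta) / (1.0 - rho))  # decrement on a success
    w = 0.0
    for n, x in enumerate(outcomes, start=1):
        w = max(0.0, w + (step_fail if x == 1 else step_ok))
        if w >= threshold:
            return n
    return None


# Non-auxiliary line: rho = 0.01, eta = 0.04, threshold 4.47.
print(cusum_detect([0, 0, 1, 1, 1], rho=0.01, eta=0.04, threshold=4.47))
# Auxiliary line: rho = 0.01, eta = 0.2, same threshold.
print(cusum_detect([1, 1], rho=0.01, eta=0.2, threshold=4.47))
```

Under these assumptions, each failure raises the statistic sharply while each success lowers it only slightly, so a few failures in close succession trigger an alert, whereas a run of successes never does.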

CONCLUSIONS
Our overall objective has been the performance monitoring of a power distribution system, for real-time dynamic operational adjustment in the presence of substantial operational changes. Such changes need to be identified promptly and accurately before pertinent adjustments are performed, necessitating a distributed approach. This approach may be implemented by network clusterization, whose design should be dictated by the accuracy and delay constraints imposed on the detection and identification of operational changes.
The general approach taken in this paper involves the following steps:
(i) We first consider the initially non-clusterized power distribution system and determine the current, voltage and power variations perceived as considerable changes; we also determine the vulnerable lines which need monitoring.
(ii) We then formulate a recursive maximum likelihood (ML) approach which naturally points to an initial network clusterization via incorporated sufficient identifiability conditions.
(iii) We subsequently develop, analyze and evaluate a distributed sequential detection of change algorithm, implemented by the supporting data computer-communication network, whose performance (including accuracy and decision delay) is controlled by a set of threshold parameters and by the architecture and dimensionality of each cluster.
Specifically, we have proposed a distributed algorithm for monitoring the quality of power lines (and their incorporated equipment) in power distribution systems.
The algorithm utilizes sequentially processed power source-to-demand measurements within an identifiable system to generate alerts about faulting lines, rapidly and with a high level of accuracy. System identifiability, in conjunction with constraints on the speed of correct decisions, provides system clusterization guidelines.
It is easily seen that here:

Convergence of the Stochastic Approximation Algorithm in (11)
Let us define the vector regression function below.
It can easily be seen that, due to the identifiability of the system, B is positive definite. It is obviously also symmetric.
We may further observe that if