1 Introduction

Power systems are vulnerable to cyber attacks. Modern IT technologies are heavily used in today’s supervisory control and data acquisition (SCADA) systems of industrial control systems including power systems. While IT technologies bring a lot of benefits, many security risks are introduced as well. For example, the connectivity of SCADA systems and enterprise networks improves business visibility and efficiency, but it makes SCADA systems more vulnerable to cyber attacks. According to the 2003~2006 data from Eric Byres, BCIT, 49 % cyber attacks at industrial control systems are launched via connected enterprise networks. One highly publicized example is Stuxnet, which attacked an industrial control system by infecting those organization networks that interact with the target [1].

In 2006, US Department of Energy (DOE) published “Roadmap to secure control systems in the energy sector” (updated in 2011) [2]. It envisions that: in 10 years, control systems for critical applications will be designed, installed, operated, and maintained to survive an intentional cyber assault with no loss of any critical function. Much effort has been made to secure power facilities. The DOE National SCADA Test Bed (NSTB) Program, established in 2003, supports industry and government efforts to enhance cyber security of control systems in the energy sector. The NERC standards CIP-002-4 through CIP-009-4 provide a cyber security framework for the identification and protection of critical cyber assets to support reliable operations of the bulk electric system [3]. The International Electrotechnical Commission Technical Council (IEC TC 57), i.e., power system management and associated information exchange, has advanced the standard communication protocol security in IEC 62351 with stronger encryption and authentication mechanisms [4]. The Hallmark Project by Schweitzer Engineering Laboratories, Inc. presents the secure SCADA communications protocol (SSCP) technology which provides integrity for SCADA messages. United States Computer Emergency Readiness Team (US-CERT) has set up awareness programs about system vulnerabilities to improve control system security [5]. The cyber security audit and attack detection toolkit by Digital Bond, Inc. is developed to identify vulnerable configurations in control system devices and applications. Reference [6] presents a risk assessment methodology that accounts for both physical and cyber security of critical infrastructures. In [7], a SCADA security framework is proposed. System vulnerabilities are assessed quantitatively through an attack tree. The impact of a cyber attack on SCADA systems is studied systematically in [8]. It is evaluated by the resultant loss of load through a power flow computation.

This paper presents a new risk assessment framework for SCADA systems of power grids. Individual vulnerabilities within control systems are evaluated based on the duality element relative fuzzy evaluation method (DERFEM). An attack graph is developed to identify possible intrusion scenarios that exploit multiple security vulnerabilities. An intrusion response system (IRS) based on the phasor measurement unit (PMU) data is proposed to assess the impact of intrusion scenarios on power system dynamics.

The main contribution is IRS, which is an on-line monitoring and control scheme based on PMUs. It monitors the impact of cyber intrusions on power system dynamics in real time. If power system instability, such as voltage instability, is judged to be likely after a cyber attack, IRS will act as a mitigation mechanism to prevent power system instability. Unlike traditional security mechanisms, such as encryption and authentication, which increase the complexity of power systems, and may cost additional time in power system operations, IRS uses a control strategy based on the conditional Lyapunov exponents (CLEs) to enhance the resilience of power systems against cyber attacks.

2 Risk assessment framework

The risk assessment framework is shown in Fig. 1. For SCADA systems of a power system, the procedure starts with identification of the configuration of its cyber system. Vulnerabilities within the cyber system are then identified. Each vulnerability is evaluated quantitatively by DERFEM. An attack graph is built to identify possible intrusion scenarios that exploit multiple vulnerabilities. The probability of occurrence of every intrusion scenario is calculated. Once an intrusion scenario is successfully executed, IRS will monitor its impact on power system dynamics in real time. The impact is characterized by CLEs computed on PMU data. If the values of CLEs are high, it implies that voltage instability is likely to happen, and then control actions based on CLEs will be taken to prevent voltage instability.

Fig. 1
figure 1

Proposed risk assessment framework

2.1 DERFEM

Assume that a cyber system has l identified vulnerabilities: r 1, r 2···r l . DERFEM is employed to assign each vulnerability a scaled value within [0, 1] which quantitatively characterizes the vulnerable level. The larger the scaled value is, the higher the vulnerable level will be.

DERFEM proceeds as follows.

1) Compare a pair of different vulnerabilities (r i , r j ) so as to obtain the scaled values \(\tau_{{r_{j} }} (r_{i} )\) and \(\tau_{{r_{i} }} (r_{j} )\). \(\tau_{{r_{j} }} (r_{i} )\) represents the vulnerable level of r i compared to r j . Likewise, \(\tau_{{r_{i} }} (r_{j} )\) represents the vulnerable level of r j compared to r i . \(0 \leqslant \tau_{{r_{j} }} (r_{i} ) \leqslant 1\); \(0 \leqslant \tau_{{r_{i} }} (r_{j} ) \leqslant 1\). If \(\tau_{{r_{j} }} (r_{i} ) > \tau_{{r_{i} }} (r_{j} )\), it implies that the vulnerability r i has a higher vulnerable level than r j does. \(\tau_{{r_{j} }} (r_{i} )\) and \(\tau_{{r_{i} }} (r_{j} )\) are from engineering judgments. This method is valid, because engineering judgments from different sources are statistically close when it is to compare two vulnerabilities.

2) Continue the comparison of different pairs of individual vulnerabilities until a matrix like Table 1 is generated (\(\tau_{{r_{i} }} (r_{i} )\) is set to be 1 here for convenience of the calculation).

Table 1 Comparison results of the vulnerabilities

3) In each row of Table 1, substitute \(\tau_{{r_{j} }} (r_{i} )\) with \(\tau (r_{i} /r_{j} )\), where \(\tau (r_{i} /r_{j} ) = \tau_{{r_{j} }} (r_{i} )/\hbox{max} (\tau_{{r_{j} }} (r_{i} ),\tau_{{r_{i} }} (r_{j} ))\).

4) Finally, the vulnerable level of r i is quantitatively characterized by \(\sigma (r_{i} )\), \(\sigma (r_{i} ) = \hbox{min} (\tau (r_{i} /r_{1} )\), \(\tau (r_{i} /r_{2} ), \cdots ,\tau (r_{i} /r_{n} ))\).

DERFEM does not measure the vulnerable level of certain vulnerability directly, which could be difficult. It reveals the relatively vulnerable level of the vulnerability compared to the others.

2.2 Attack graph

In practice, a hacker may have to compromise a couple of interconnected hosts within a cyber system before he/she gains access to the control systems. For example, an outside intruder has to compromise an enterprise network, and then attacks its connected industrial control systems via the enterprise network. This procedure is modeled as an intrusion scenario in this research. An intrusion scenario is comprised of several intrusion actions, each action involves exploiting one security vulnerability.

An attack graph is employed to capture possible intrusion scenarios within a cyber system. The attack graph depicts ways in which a hacker compromises interconnected hosts sequentially by exploiting the corresponding vulnerabilities so as to achieve a specific goal. The benefits of the attack graph take into account the effects of interactions of local vulnerabilities and find global security holes introduced by the interconnections [9].

Basic concepts of the attack graph are defined as follows.

Definition 1: Subject (S T). Subject is the initiator of actions. S t ∈ S T can be an attacker or a compromised device.

Definition 2: node (N D). An electronic device in a cyber system is a node, using \(n^{\text{d}} = (i^{\text{d}} ),n^{\text{d}} \in N^{\text{D}}\) to denote. i d is the identifier of the node, and it could be set as an equipment name. If a node is compromised by a subject, the node itself will become a subject.

Definition 3: privilege (P G). It is used to describe the operating privilege of a subject in a node. When s t ∈ S T and n d ∈ N D, the function \(P^{\text{G}} \left( {S^{\text{t}} ,n^{\text{d}} } \right) \to \{ 0,1,2,3,4,5\}\) expresses the privilege level of s t in n d. \(P^{\text{G}} (s_{i}^{t} ,n_{j}^{\text{d}} ) = 0\) implies that subject \(s_{i}^{t}\) has no access to node \(n_{j}^{\text{d}}\); \(P^{\text{G}} (s_{i}^{t} ,n_{j}^{\text{d}} ) = 1\) indicates that subject \(s_{i}^{t}\) is able to read the inbound and outbound messages of node \(n_{j}^{\text{d}}\); \(P^{\text{G}} (s_{i}^{t} ,n_{j}^{\text{d}} ) = 2\) means that subject \(s_{i}^{t}\) is able to block the inbound and outbound messages of node \(n_{j}^{\text{d}}\); \(P^{\text{G}} (s_{i}^{t} ,n_{j}^{\text{d}} ) = 3\) represents that subject \(s_{i}^{t}\) can read and block the inbound and outbound messages of node \(n_{j}^{\text{d}}\); \(P^{\text{G}} (s_{i}^{t} ,n_{j}^{\text{d}} ) = 4\) denotes that Subject \(s_{i}^{t}\) can send messages to node \(n_{j}^{\text{d}}\); \(P^{\text{G}} (s_{i}^{t} ,n_{j}^{\text{d}} ) = 5\) signifies that subject \(s_{i}^{t}\) has the full control access to node \(n_{j}^{\text{d}}\).

Definition 4: state (Z). State is a triple \(z = (s^{t} ,n^{\text{d}} ,P^{\text{G}} (s^{t} ,n^{\text{d}} ))\). State is the prerequisite of the next attack action to be implemented.

Definition 5: interconnection (I C). Interconnection refers to connections between nodes, using a quadruplet \(i^{\text{c}} = (n_{i}^{\text{d}} ,n_{j}^{\text{d}} ,C_{ij} ,M_{ij} )\), \(i^{\text{c}} \in I^{\text{C}}\), \(n_{i}^{\text{d}} ,n_{j}^{\text{d}} \in N^{\text{D}}\) to denote. C ij represents the communication channel between \(n_{i}^{\text{d}}\) and \(n_{j}^{\text{d}}\). C ij could be copper wires, optical fibers, wireless, dial-up, virtual private network (VPN), or digital microwave. M ij is the type of messages from \(n_{i}^{\text{d}}\) to \(n_{j}^{\text{d}}\). M ij could be measurements or control signals. M ij does not necessarily equal to M ji .

Definition 6: action (A). Action represents the set of possible actions of the subjects in a cyber system. Action is a quadruplet \(a = (n_{\text{name}} ,z_{\text{s}} ,z_{\text{d}} ,\gamma )\), \(a \in A\), \(z_{\text{s}} ,z_{\text{d}} \in Z\). n name is the name of an attack action such as the denial-of-service (DOS) attack or the man-in-the-middle attack; z s and z d represent the initial and final states of the action; γ is the vulnerability exploited in the action. γ is used to denote the difficult level of action a.

The algorithm to construct an attack graph proceeds as follows.

1) Identify N D and I C. Develop a directed graph (N D, I C). The vertex is \(n^{\text{d}} \in N^{\text{D}}\), and the edge is \(i^{\text{c}} \in I^{\text{C}}\).

2) Identify the node \(n_{k}^{\text{d}}\) which will be the target of attacks. \(n_{k}^{\text{d}}\) could be a SCADA server or a programmable logic controller (PLC).

3) Determine the goals of attacks—the state of \(n_{k}^{\text{d}}\) after being attacked, formulated as follows: \(z_{\text{d}} = (s_{i}^{\text{t}} ,n_{k}^{\text{d}} ,P^{\text{G}} (s_{i}^{\text{t}} ,n_{k}^{\text{d}} ) > 0\)), in which \(s_{i}^{\text{t}}\) represents the initial intruding subject (hackers).

4) Traverse the directed graph (N D, I C). Identify the node \(n_{{k^{\prime}}}^{\text{d}}\) that is connected to \(n_{k}^{\text{d}}\) directly. Assume that node \(n_{{k^{\prime}}}^{\text{d}}\) has been compromised by \(s_{i}^{\text{t}}\), and it becomes an intruding subject, say \(s_{{i^{\prime}}}^{\text{t}}\).

5) Extract an attack action aimed at \(n_{{k}}^{\text{d}}\) from \(s_{{i^{\prime}}}^{\text{t}}\), such that \(a = (n_{\text{name}} ,z_{\text{s}} ,z_{\text{d}} ,\gamma_{\text{a}} )\), \(z_{\text{d}} = (s_{i'}^{\text{t}} ,n_{k}^{\text{d}} ,P^{\text{G}} (s_{i'}^{\text{t}} ,n_{k}^{\text{d}} ) = P^{\text{G}} (s_{i}^{\text{t}} ,n_{k}^{\text{d}} ))\). \(\gamma_{\text{a}}\) is the vulnerability of node \(n_{k}^{\text{d}}\) exploited in action a.

6) Establish the prerequisite of action a: z s, formulated as follows: \(z_{\text{s}} = (s_{i}^{\text{t}} ,n_{{k^{\prime}}}^{\text{d}} ,P^{\text{G}} (s_{{i}}^{\text{t}} ,n_{{k^{\prime}}}^{\text{d}} ) > 0)\).

7) Set \(n_{{k^{\prime}}}^{\text{d}}\) as a new target node, and z s becomes another z d. Repeat step 4, 5 and 6, until \(s_{{i^{\prime}}}^{\text{t}} = s_{i}^{\text{t}}\).

After the attack graph is built, it gives a bird’s-eye view of possible intrusion scenarios. For each scenario, the probability of occurrence P b is calculated as follows.

a) If the intrusion scenario is comprised of two serial intrusion actions a i and a j , then

$$P^{\text{b}} = \sigma (\gamma_{{a_{i} }} )\sigma (\gamma_{{a_{j} }} )$$
(1)

where \(\gamma_{{a_{i} }}\) and \(\gamma_{{a_{j} }}\) are the local vulnerabilities exploited in the attack actions a i and a j . Note that P b is relative as \(\sigma (\gamma_{{a_{i} }} )\) and \(\sigma (\gamma_{{a_{j} }} )\) are relative. P b tells how possible an intrusion scenario is compared to the others.

b) If the intrusion scenario consists of two parallel intrusion actions a i and a j , then

$$P^{\text{b}} = \sigma \left( {\gamma_{{a_{i} }} } \right) + \sigma \left( {\gamma_{{a_{j} }} } \right) - \sigma (\gamma_{{a_{i} }} )\sigma (\gamma_{{a_{j} }} )$$
(2)

c) If the intrusion scenario is more complicated, the calculation of its P b will be the synthesis of (1) and (2).

2.3 Intrusion response system

The concept of IRS is illustrated in Fig. 2. It is intended to be an application in the control center of a power system. The proposed algorithm, which will be discussed in detail in Section 3, obtains updated power network configurations from the state estimator (SE), say, every 5 minutes. If an intrusion scenario is executed successfully, and it results in disruptions in power system operations such as breaker opening or loss of generation, such sudden changes of the power network configurations will be reported to the proposed algorithm through SCADA systems in real time. The post-attack dynamical model of the power system is then built. After that, the algorithm extracts synchronized phasor measurements from the PMU data concentrator, which obtains real time PMU data from substations equipped with PMUs. A number of the state variables of the dynamical model are observed from PMU data. Based on the dynamical model and PMU measurements, CLEs are calculated to monitor the impact of the intrusion on power system dynamics.

Fig. 2
figure 2

Concept of IRS

If CLEs have only low values, the prediction is that voltage instability will not happen; otherwise, voltage instability is likely to occur, and the proposed algorithm will send proper control signals to the energy management system (EMS) to prevent voltage instability.

3 Proposed algorithm

3.1 Dynamical model

In this algorithm, generators are represented by classical models, and loads are represented by ZIP models. After a cyber intrusion, the dynamical model of a power system is established as shown below:

$$\left\{ \begin{array}{ll} \varvec{Y}_{\text{bus}} \dot{\varvec{V}} = \dot{\varvec{I}} \hfill \\ - \dot{V}_{i} \dot{I}_{i}^{*} = P_{{{\text{D}},i}} + {\text{j}}Q_{{{\text{D}},i}} \hfill \\ \dot{V}_{j} \dot{I}_{j}^{*} = \dot{V}_{j} \left( {\frac{{\varOmega_{j} \angle \delta_{j} - \dot{V}_{j} }}{{Z_{j} }}} \right)^{*} \hfill \\ \end{array} \right.$$
(3)
$$\left\{ \begin{array}{ll} \frac{{{\text{d}}\delta_{j} }}{{{\text{d}}t}} = \omega_{j} \hfill \\ \frac{{2H_{j} }}{{\omega_{\text{Re}} }}\frac{{{\text{d}}\omega_{j} }}{{{\text{d}}t}} + \frac{{O_{j} }}{{\omega_{\text{Re}} }}\omega_{j} = P_{{{\text{m}},j}} - \text{Re} ((\varOmega_{j} \angle \delta_{j} )\dot{I}_{j}^{*} ) \hfill \\ \end{array} \right.$$
(4)

where i = 1, 2,···, n − m; j = n – m + 1, n – m + 2,···, n; n is the total number of buses; m is the total number of generators; P D,i  + jQ D,i is the power consumption at load bus i; \(P_{{{\text{D}},i}} = P_{0,i} \left[ {A_{i} + B_{i} \left( {\frac{{\left| {V_{i} } \right|}}{{\left| {V_{0,i} } \right|}}} \right) + C_{i} \left( {\frac{{\left| {V_{i} } \right|}}{{\left| {V_{0,i} } \right|}}} \right)^{2} } \right]\left( {1 + L_{P,i} \Delta f} \right);\) \(Q_{{{\text{D}},i}} = Q_{0,i} \left[ {D_{i} + E_{i} \left( {\frac{{\left| {V_{i} } \right|}}{{\left| {V_{0,i} } \right|}}} \right) + F_{i} \left( {\frac{{\left| {V_{i} } \right|}}{{\left| {V_{0,i} } \right|}}} \right)^{2} } \right]\left( {1 + L_{Q,i} \Delta f} \right);\) A i , B i , C i , D i , E i , F i , L P,i , and L Q,i are load parameters; P 0,i  + jQ 0,i is the steady-state power consumption; V 0,i is the steady-state voltage; \(\Delta f\) is the frequency deviation in p.u.; H j and O j are generator inertias; δ j is the rotor angle of generator j; ω j is the angular speed of generator j; ω Re is the reference speed; Ω j is the internal voltage magnitude at generator j; Z j is the impedance between generator j and its generator bus; P m,j is the mechanical power input to generator j.

Excitation systems of the generators are assumed to function in some way to keep internal voltage magnitudes at reference values during the transient period. The time constant of modern excitation systems is less than 0.5 s. If a new reference value is issued to an excitation system, the corresponding voltage magnitude will change rapidly due to the fast response of the excitation system. CLEs will be computed based on an updated dynamical model to reassess system stability.

Let x denote \(\left[ {\left| {V_{1} } \right|,\angle V_{1} ,\left| {V_{2} } \right|,\angle V_{2} , \cdots ,\left| {V_{n} } \right|,\angle V_{n} } \right]^{\text{T}}\), and y denote \(\left[ {\delta_{1} ,\omega_{1} , \cdots ,\delta_{m} ,\omega_{m} } \right]^{\text{T}}\). Equations (3) and (4) are represented by:

$$\varvec{G}\left( {\varvec{x},\varvec{y}} \right) = 0$$
(5)
$$\frac{{{\text{d}}\varvec{y}}}{{{\text{d}}t}} = \varvec{F}\left( {\varvec{x},\varvec{y}} \right)$$
(6)

Since

$$\frac{{{\text{d}}\varvec{G}\left( {\varvec{x},\varvec{y}} \right)}}{{{\text{d}}t}} = 0 = \varvec{G}_{\varvec{x}} \frac{{{\text{d}}\varvec{x}}}{{{\text{d}}t}} + \varvec{G}_{\varvec{y}} \frac{{{\text{d}}\varvec{y}}}{{{\text{d}}t}}$$
(7)

It is obtained that:

$$\frac{{{\text{d}}\varvec{x}}}{{{\text{d}}t}} = - \left( {\varvec{G}_{\varvec{x}} } \right)^{ - 1} \varvec{G}_{\varvec{y}} \frac{{{\text{d}}\varvec{y}}}{{{\text{d}}t}} = - \left( {\varvec{G}_{\varvec{x}} } \right)^{ - 1} \varvec{G}_{\varvec{y}} \varvec{F}\left( {\varvec{x},\varvec{y}} \right)$$
(8)

where G x and G y are the Jacobian matrixs of G with respect to x and y.

When det(G x ) = 0 and \(\varvec{G}_{\varvec{y}} \frac{{{\text{d}}\varvec{y}}}{{{\text{d}}t}} \ne 0\), \(\frac{{{\text{d}}\varvec{x}}}{{\text{d}t}}\) has very large values. Correspondingly, x will change dramatically, and voltage instability is likely to happen.

3.2 Methodology: CLEs

The notion of CLEs (originally called sub-Lyapunov exponents) is introduced by Pecora and Carroll in their study of synchronization of chaotic systems [10] and [11]. Similar to the full Lyapunov exponents, CLEs are well defined ergodic invariants.

Consider a N-dimensional continuous-time dynamical system \(\frac{{{\text{d}}\varvec{z}}}{{{\text{d}}t}} = \varvec{H}(z)\). Split the state vector z into two vectors: \(\varvec{z}_{1} \in {\mathbf{R}}^{K}\), and \(\varvec{z}_{2} \in {\mathbf{R}}^{N - K}\) (0 < K < N), one will obtain two sub systems: \(\frac{{{\text{d}}\varvec{z}_{1} }}{{\text{d}}t} = \varvec{H}_{1} (\varvec{z}_{1} ,\varvec{z}_{2} )\) and \(\frac{{{\text{d}}\varvec{z}_{2}}}{{\text{d}}t} = \varvec{H}_{2} (\varvec{z}_{1} ,\varvec{z}_{2} )\). Let \(\varvec{z}_{1} \left( t \right) = \varvec{\varphi }(t,\varvec{v}_{1} ,\varvec{v}_{2} )\) be the solution of the first sub system at time t starting from the initial conditions \(\varvec{z}_{1}^{0} = \varvec{v}_{1}\), \(\varvec{z}_{2}^{0} = \varvec{v}_{2}\). The CLEs C i for the sub system \(\frac{{{\text{d}}\varvec{z}_{1} }}{{{\text{d}}t}} = \varvec{H}_{1} (\varvec{z}_{1} ,\varvec{z}_{2} )\) are defined as eigenvalues of the following limiting.

$$\varvec{\varLambda}(\varvec{v}_{1} ) = \mathop {\lim }\limits_{t \to \infty } [\varvec{K}^{\text{T}} (t,\varvec{v}_{1} ,\varvec{v}_{2} )\varvec{K}(t,\varvec{v}_{1} ,\varvec{v}_{2} )]^{\frac{1}{2t}}$$
(9)
$$C_{i} (\varvec{v}_{1} ) = \ln (\bar{\lambda }_{i} (\varvec{v}_{1} ))$$
(10)

where i = 1, 2,···, K; K(t, v 1, v 2) is the Jacobian matrix of φ(t, v 1, v 2) with respect to v 1; \(\bar{\lambda }_{i} (\varvec{v}_{1} )\) is the ith eigenvalue of \(\varvec{\varLambda}(\varvec{v}_{1} )\). The existence of CLEs is guaranteed under the same conditions that establish the existence of the Lyapunov exponents [12].

The relationship between CLEs and system stability is discussed in the following. In ergodic theory of dynamical systems, the Lyapunov exponents are used to characterize the exponential divergence or convergence of nearby trajectories, as shown in Fig. 3. For the sub system \(\frac{{{\text{d}}\varvec{z}_{1} }}{{{\text{d}}t}} = \varvec{H}_{1} (\varvec{z}_{1} ,\varvec{z}_{2} )\), its maximal conditional Lyapunov exponent (MCLE) M MCLE determines the exponential convergence of nearby system trajectories. This is true due to the approximation of

$$\left\| {\Delta \varvec{z}_{1} (t)} \right\| \approx e^{{M_{\text{MCLE}} t}} \left\| {\Delta \varvec{z}_{1}^{0} } \right\|$$
(11)
Fig. 3
figure 3

Nearby trajectories in the state space

If \(\frac{{{\text{d}}\varvec{z}_{1} }}{{{\text{d}}t}}\) has very large values, the nearby system trajectories will diverge. Correspondingly, \(M_{\text{MCLE}} \gg 0\). Otherwise, the nearby trajectories will converge, and MCLE has a low or even negative value. Therefore, the value of MCLE reveals the magnitude of time derivatives of related state variables. When the state variables are voltages of a power system, MCLE can be used to monitor the magnitude of time derivatives of the voltages, and hence voltage stability.

In this work, the dynamical system in (8) is split into n sub systems. The ith sub system has the state variables \(\left[ {\left| {V_{i} } \right|,\angle V_{i} } \right]^{\text{T}}\), where i = 1, 2,···, n. MCLE is computed for each sub system to monitor voltage stability within it.

Let \(\varvec{G}_{\varvec{y}} \frac{{{\text{d}}\varvec{y}}}{{{\text{d}}t}} =\varvec{\varPhi}\in {\mathbf{R}}^{2n}\), one may obtain

$$\left\{ {\begin{array}{l} {\varvec{\varPhi} _{{2i - 1}} = \frac{{\left| {V_{i} } \right|\varOmega _{i} \cos \left( {\angle \left( {V_{i} + Z_{i} - \delta _{i} } \right)} \right)}}{{\left| {Z_{i} } \right|}}\frac{{{\text{d}}\delta _{i} }}{{{\text{d}}t}}} \\ \quad\quad\quad +{ Q_{{0,i}} \left[ {D_{i} + E_{i} \left( {\frac{{\left| {V_{i} } \right|}}{{\left| {V_{{0,i}} } \right|}}} \right) + F_{i} \left( {\frac{{\left| {V_{i} } \right|}}{{\left| {V_{{0,i}} } \right|}}} \right)^{2} } \right]L_{{Q,i}} \frac{{{\text{d}}\Delta f}}{{{\text{d}}t}}} \\ {\varvec{\varPhi} _{{2i}} = - \frac{{\left| {V_{i} } \right|\varOmega _{i} \sin \left( {\angle \left( {V_{i} + Z_{i} - \delta _{i} } \right)} \right)}}{{\left| {Z_{i} } \right|}}\frac{{{\text{d}}\delta _{i} }}{{{\text{d}}t}} } \\ \quad\quad\quad + { P_{{0,i}} \left[ {A_{i} + B_{i} \left( {\frac{{\left| {V_{i} } \right|}}{{\left| {V_{{0,i}} } \right|}}} \right) + C_{i} \left( {\frac{{\left| {V_{i} } \right|}}{{\left| {V_{{0,i}} } \right|}}} \right)^{2} } \right]L_{{P,i}} \frac{{{\text{d}}\Delta f}}{{{\text{d}}t}}} \\ \end{array} } \right.$$
(12)

where i = 1,2,,n. Ω i = 0, |Z i | = ∞, and δ i  = 0 if there is no generator at bus i.

As \(\frac{{{\text{d}}\Delta f}}{{\text{d}t}}\) is small,

$$\left\{ \begin{aligned}\varvec{\varPhi}_{2i - 1} \approx \frac{{\left| {V_{i} } \right|\varOmega_{i} \cos \left( {\angle \left( {V_{i} + Z_{i} - \delta_{i} } \right)} \right)}}{{\left| {Z_{i} } \right|}}\omega_{i} \hfill \\\varvec{\varPhi}_{2i} \approx - \frac{{\left| {V_{i} } \right|\varOmega_{i} \sin \left( {\angle \left( {V_{i} + Z_{i} - \delta_{i} } \right)} \right)}}{{\left| {Z_{i} } \right|}}\omega_{i} \hfill \\ \end{aligned} \right.$$
(13)

One can assume that G x is diagonal in computation without compromising the accuracy, and then the ith sub system of (8) is represented by:

$$\begin{aligned} \frac{{{\text{d}}\left| {V_{i} } \right|}}{{{\text{d}}t}} &= - \frac{{\varvec{\varPhi}_{2i - 1} }}{{\varvec{G}_{\varvec{x}} (2i - 1,2i - 1)}} \hfill \\ \frac{{{\text{d}}\angle V_{i} }}{{{\text{d}}t}} &= - \frac{{\varvec{\varPhi}_{2i} }}{{\varvec{G}_{\varvec{x}} (2i,2i)}} \hfill \\ \end{aligned}$$
(14)

where i = 1, 2,,n; G x (2i − 1, 2i − 1) is the element at row 2i  1 and column 2i − 1 of G x . It is noted that \(\frac{{{\text{d}}\left| {V_{i} } \right|}}{{\text{d}t}} = \frac{{{\text{d}}\angle V_{i} }}{{\text{d}t}} = 0\) if there is no generator at bus i, which is reasonable since the change of the voltages at load buses is driven by the voltages at generator buses. Consequently, \(\frac{{{\text{d}}\left| {V_{i} } \right|}}{{{\text{d}}t}}\) and \(\frac{{{\text{d}}\angle V_{i} }}{{{\text{d}}t}}\) do not depend on |V i | and \(\angle V_{i}\).

The proposed algorithm calculates MCLEs of the sub systems that have generators at the corresponding buses. The computation method is introduced in the following.

3.3 Computation method

MCLEs are calculated over a limited time window. PMU measurements are extracted to observe time-varying values of the state variables of the sub systems. The unobservable part of the state variables is approximated through the implicit integration method with trapezoidal rule [13]. At the same time, the observable part is estimated by the same method as a backup of PMU data. If a PMU is compromised, it will be detected by comparing the PMU data and the corresponding estimation results. The estimation results will be used in the MCLE calculation. The algorithm in [13], the standard method with Gram-Schmidt reorthonormalization (GSR), is then used to compute MCLEs. If the values of MCLEs are over a predefined limit, it is predicted that voltage instability will happen. Control signals will be sent to EMS to prevent the voltage instability.

Selection of the length of the time interval could be arbitrary. Study shows that MCLEs exhibit robustness to the length of the time interval: MCLEs computed over different length time intervals all have very high values if voltage instability is going to happen. In this research, the time interval length is set to be 0.2 s, so that it is short while it has enough PMU measurements.

3.4 Control actions

When the value of MCLE of a sub system is over a predefined limit, the proposed algorithm will send a control signal to the excitation system of the generator related to the sub system through EMS. The reference value of the generator internal voltage magnitude is modified as follows:

$$\varvec{\varOmega}_{\text{Gen}}^{{{\text{ref}},{\text{new}}}} = \left( {1 + \frac{{M_{\text{MCLE}} }}{{C_{\text{const}} }}} \right)\varvec{\varOmega}_{\text{Gen}}^{{{\text{ref}},{\text{old}}}}$$
(15)

where C const is a predefined constant value. Voltage instability can be prevented with the fast response of the exciting system.

4 Case study

Wind farm SCADA systems are selected for case study due to the fact that wind power is a fast-emerging renewable resource on power grids, and it has the potential to affect the dynamical performance of power systems.

4.1 Wind farm SCADA systems

The generic network configuration of wind farm SCADA systems is identified and shown in Fig. 4. Every wind turbine is equipped with a wind turbine control panel (WTCP), which monitors and controls the wind turbine. WTCP is normally mounted in the tower base and is easily accessible. Through WTCPs, servers in a control room support monitoring and control of the wind turbines within a wind farm. However the control room is normally not staffed and it is only for maintenance occasions. Wind farms in separate locations are integrated into a single EMS in a main control center through a control wide area network (WAN). In the control center, system analysts oversee every turbine at the wind farms. The control center interfaces restrictively with corporate networks for business and operational reasons.

Fig. 4
figure 4

Generic network configuration of wind farm SCADA systems

Vulnerabilities are identified in [14], including configuration management of WTCPs (r 1), implicit trust between WTCPs and a control room (r 2), implicit trust between control rooms and a control center (r 3), wireless network (r 4), optical fibers (r 5), virtual private network (r 6), digital microwave (r 7), poor access control within a control room (r 8), poor access control within a control center (r 9), bad configuration of remote access (r 10), weak firewall policy (r 11), and human errors (r 12).

The vulnerabilities are evaluated through DERFEM. The results are shown in Table 2. An attack graph is built as shown in Fig. 5. Nine possible intrusion scenarios are identified, and the probability of occurrence of every scenario is calculated, as shown in Table 3.

Table 2 Results of DERFEM
Fig. 5
figure 5

Constructed attack graph

Table 3 Intrusion scenarios and probabilities

The intrusion scenarios show that, if successfully executed, a hacker will gain some levels of control access to several or even hundreds of WTCPs. The output of compromised wind farms will be maliciously manipulated. The impact on power system dynamics is studied next.

In Fig. 5, z 1 = (hacker, WTCP, 5); z 2 = (hacker, WTCP, 0); z 3 = (hacker, WTCPs in a wind farm, 2); z 4 = (hacker, WTCPs in a wind farm, 0); z 5 = (hacker, WTCPs in a wind farm, 1); z 6 = (hacker, WTCPs in a wind farm, 4); z 7 = (hacker, SCADA server in the control room, 3); z 8 = (hacker, SCADA server in the control room, 0); z 9 = (hacker, SCADA server in the control room, 4); z 10 = (hacker, SCADA server in the control center, 2); z 11 = (hacker, SCADA server in the control center, 0); z 12 = (hacker, SCADA server in the control room, 5); z 13 = (hacker, workstation in the control room, 5); z 14 = (hacker, workstation in the control room, 0); z 15 = (hacker, SCADA server in the control center, 5); z 16 = (hacker, workstation in the control center, 5); z 17 = (hacker, workstation in the control center, 0); z 18 = (hacker, workstation in the corporate LAN, 5); z 19 = (hacker, workstation in the corporate LAN, 0); z 20 = (hacker, remote access point, 5); z 21 = (hacker, remote access point, 0); a 1 = (password cracking, z 2, z 1, r 1); a 2 = (jamming, z 4, z 3, r 4); a 3 = (passive tapping, z 4, z 5, r 5 ); a 4 = (man-in-the-middle attack, z 7, z 6, r 2); a 5 = (active tapping, z 8, z 7, r 5); a 6 = (spoof, z 9, z 6, r 2); a 7 = (spoof, z 10, z 9, r 3); a 8 = (DOS attack, z 11, z 10, r 6); a 9 = (jamming, z 11, z 10, r 7); a 10 = (spoof, z 12, z 6, r 2); a 11 = (internal attack, z 8, z 12, r 12); a 12 = (malware infection, z 13, z 12, r 8); a 13 = (infected portable storage device attack, z 14, z 13, r 12); a 14 = (malware infection, z 15, z 12, r 3); a 15 = (malware infection, z 16, z 15, r 9); a 16 = (infected portable storage device attack, z 17, z 16, r 12); a 17 = (malware infection, z 18, z 16, r 11); a 18 = (infected portable storage device attack, z 19, z 18, r 12); a 19 = (phishing, z 19, z 18, r 12); a 20 = (malware infection, z 20, z 18, r 10); a 21 = (infected portable storage device attack, z 21, z 20, r 12); a 22 = (phishing, z 21, z 20, r 12).

4.2 Simulation results

The IEEE 39 bus system [15] shown in Fig. 6 is used for simulations. Generator G5 and G9 (marked with two rectangles) are replaced by two wind farms comprised of hundreds of variable speed wind turbines utilizing the doubly-fed induction generators (DFIGs). The rating of each wind turbine is 2.0 MW. From the system point of view, the wind farms are considered as constant negative loads during the transient period, due to the fast control capacity of the power electronic technology within wind turbines. The other generators are classically modeled and the loads are represented by ZIP models.

Fig. 6
figure 6

IEEE 10 generator 39 bus system

MCLEs are calculated for the generator buses (except G5 and G9) by the proposed algorithm every 0.2 s to monitor power system stability. Assume that at t = 0.4 s, a hacker maliciously manipulates the power output of G5 (or G9) to some extent. Part of the simulation results is shown in Table 4.

Table 4 MCLE of bus G3

The explains of Table 4 are as following.

Attack 1: P Gen of G5 is reduced by 10 MW. Attack 2: Q Gen of G5 is reduced by 10 Mvar. Attack 3: P Gen of G5 is reduced by 100 MW. Attack 4: Q Gen of G5 is reduced by 100 Mvar. Attack 5: P Gen of G9 is reduced by 10 MW. Attack 6: Q Gen of G9 is reduced by 7.5 Mvar. Attack 7: P Gen of G9 is reduced by 100 MW. Attack 8: Q Gen of G9 is reduced by 75 Mvar. Attack 9: P Gen of G5 is reduced by half. Attack 10: Q Gen of G5 is reduced by half. Attack 11: Q Gen of G5 is reduced to -Q Gen. Attack 12: P Gen of G9 is reduced by half. Attack 13: P Gen of G5 is reduced by half. Q Gen of G5 is reduced by half. P Gen of G9 is reduced by half. Attack 14: P Gen of G5 is reduced by 30 MW. Q Gen of G5 is reduced by15 Mvar. P Gen of G9 is reduced by 50 MW. Q Gen of G9 is reduced by 10 Mvar.

The simulation results come to the following conclusions.

1) The values of MCLEs are close to 0, when the power system is in the steady state.

2) Upon an attack, the values of MCLEs oscillate as time evolves, but have limited values if voltage instability is not likely to happen. During Attack 2, the reactive power output of G5 is reduced by 10 Mvar at t = 0.4 s. MCLEs increase for a while, and then decrease, as shown in Fig. 7a. The values are below 200.

Fig. 7
figure 7

Simulation results

3) The values of MCLEs constantly increase as time evolves, if voltage instability is likely to happen within the power system. During Attack 10, the reactive power output of G5 is reduced by half at t = 0.4 s. Voltage instability happens at t = 1.42 s, as shown in Fig. 7b. The values of MCLEs keep increasing after the attack, as shown in Fig. 7c.

4) Voltage instability is likely to occur around the generator buses where MCLEs have high values. Take Attack 10 as an example, MCLEs of G2, G3, G4, G6 and G7 (circled in Fig. 6) are over 1000 at t = 1.4 s. Time-domain simulation results show that voltage instability happens around those generator buses. It is reasonable as G2, G3, G4, G6 and G7 are close to G5.

Based on the simulation results, a predefined limit for the values of MCLEs is set to be 800. If the value of MCLE of a generator bus exceeds the limit, it is predicted that voltage instability will happen around the generator bus. Control signal

$$\varvec{\varOmega}_{\text{Gen}}^{{{\text{ref}},{\text{new}}}} = \left( {1 + \frac{{M_{\text{MCLE}} }}{10000}} \right)\varvec{\varOmega}_{\text{Gen}}^{{{\text{ref}},{\text{old}}}}$$
(15)

will be sent to the excitation system of the related generator. Simulation results show that voltage instability can be avoided. For example, during Attack 10, MCLEs of G3, G4, G6 and G7 are over 800 at t = 1.2 s. The corresponding control signals are then sent to G3, G4, G6 and G7. Voltage instability is prevented, as shown in Fig. 7d.

5 Conclusion

A risk assessment framework with a PMU-based IRS is proposed for power control systems. The main idea of IRS is to calculate MCLEs for generator buses in order to monitor voltage stability. The higher values MCLEs have, the more likely voltage instability occur around the corresponding generator buses. MCLE method is based on a solid analytical foundation and it is validated by simulation results.

This research leads to significant contributions to the development of a more reliable and secure power grid. Future research includes the following aspects.

1) For a large cyber system with numerous security vulnerabilities, DERFEM may not be sufficient. Some statistical analysis techniques may be coupled with DERFEM to improve evaluation results.

2) A dedicated control strategy will be developed in IRS for control actions to prevent voltage instability. The voltages are over 1.2 after 1.8 s in Fig. 7d. It is because IRS employs a control action on a simplified excitation system. The dedicated control strategy will be studied with full-scale excitation systems.

3) IRS is not only able to monitor voltage stability under cyber intrusions, but also can be used to monitor voltage stability after disturbances. It is promising to integrate IRS and the on-line monitor scheme in [13], so that a control center can monitor both voltage dynamics and rotor angle dynamics.