Security Attacks on Smart Grid Scheduling and Their Defences: A Game-Theoretic Approach

The introduction of advanced communication infrastructure into the power grid raises a plethora of new opportunities to tackle climate change. This paper is concerned with the security of energy management systems which are expected to be implemented in the future smart grid. The existence of a novel class of false data injection attacks that are based on modifying forecasted demand data is demonstrated, and the impact of the attacks on a typical system's parameters is identified, using a simulated scenario. Monitoring strategies that the utility company may employ in order to detect the attacks are proposed and a game-theoretic approach is used to support the utility company's decision-making process for the allocation of their defence resources. Informed by these findings, a generic security game is devised and solved, revealing the existence of several Nash Equilibrium strategies. The practical outcomes of these results for the utility company are discussed in detail and a proposal is made, suggesting how the generic model may be applied to other scenarios.


I. INTRODUCTION
During the last decade the rise of the smart grid has shown significant potential not only to address the traditional grid problems but also to support the development of power generation from renewable sources. Indeed, since electricity suppliers must meet customers' demand during peak hours, they traditionally invest in power generation capacity able to sustain those high power consumption periods. This is an expensive solution as some of those resources are only exploited sporadically. On the other hand, with the increase of greenhouse gases that impact negatively on the Earth's ecosystem, better exploitation of renewable energy sources is seen as a way to reduce their emissions [1]. However, the inherent intermittency and unpredictability of renewable sources make their integration into the power grid particularly difficult. Therefore, management of consumption and production plays a crucial role in facilitating power distribution as well as in reducing costs for both suppliers and consumers [2].
Traditional Demand-Side Management has designed strategies to change consumers' consumption patterns so that they better match energy generation profiles: these include peak clipping, load shifting, and flattening consumers' loads [3]. Advancements in energy storage and renewable energy generation provide further opportunities to devise smarter and more efficient power grids. For instance, storing energy during off-peak times eases supply during peak hours when demand is high. Furthermore, local electricity generation substantially reduces power dissipation and transmission costs. Accordingly, the concept of 'microgrids' was introduced to facilitate distribution by dividing the power grid into several smaller local grids [4]. Efficient management of these microgrids requires a two-way communication system between suppliers and consumers, so that those smart grids can exploit distributed information for storage scheduling and pricing purposes [5].
Taking advantage of smart meters, energy storage and trading strategies, a variety of energy consumption scheduling techniques aiming at optimally distributing daily power consumption have been put forward to reduce a smart grid's peak-to-average ratio (PAR) of the aggregated load. In particular, dynamic game-theoretic frameworks have been proposed to optimise energy cost using their Nash Equilibrium [6], [7]. Some consider advanced battery models [8] and integrate forecasting errors [9]. Alternatively, usage of a Stackelberg game minimising both the PAR and the system total cost has also proved promising [10]. More generally, comprehensive reviews reveal the significant contribution that game-theoretic solutions offer in terms of reducing consumer costs and PAR values [11]-[15].
Since smart grids rely on a communication network and smart meters, they may be vulnerable to cyber attacks [16]. As a result, appropriate defence strategies need to be put in place [17]-[22]. False Data Injection (FDI) is one of the most common approaches to attack cyber-physical systems [23]. In general, FDI attacks breach data integrity in order to make a profit or disturb a system. Since, in power grids, state estimators are the main data sources used for monitoring and controlling purposes, they are the target of data injection [24]. Such FDI attacks and possible defence strategies have been investigated in several scenarios: (i) the 'ideal' undetectable attack where the attack vector is built from complete knowledge of the state estimators' parameters [17]; (ii) a more realistic attack relying on a probability distribution function where only incomplete information about the system's parameters is available [25]; (iii) a stealth data injection in which an attacker has complete information about the system's topology [26]. Detection of cyber attacks and associated defence strategies are essential for a reliable grid. For instance, a fast detection algorithm has been proposed to deal with FDI and jamming attacks in smart grids [27].
Since game theory has been a very successful framework to improve cyber security [28], it has been applied in several scenarios dealing with grid security. When attackers target either a single or multiple state estimators, both Markovian and static strategies have been investigated to defend against load redistribution attacks by allocating optimal budgets to energy suppliers [29]. If attackers manipulate price information from the utility companies, the impact can be mitigated by exploiting a Stackelberg formulation [30]. Furthermore, it has been proposed to defend against coalitional attacks by multiple attackers using an iterated game-theoretic model [31], where a probability of attack detection is considered in each iteration: correlation between payoffs and penalty factors demonstrated the effectiveness of the defence system. Finally, a defence system against switching attacks based on a zero-determinant iterative game between controller and attacker showed that transient stabilisation could be achieved over time [32]. Although grid cyber security has been an active field of research, no defence scheme has yet been proposed to protect forecasting data in smart grids.
The contributions of this paper are as follows:
1) The design of a novel class of false data injection attacks, preserving average daily load in a smart energy scheduling system. The forecasted demand data is corrupted by a single attacker, targeting one or several households. Using extensive simulations, two families of attacks are investigated. The impact on both the PAR of the aggregated load and consumer bills as well as the resulting benefit for the attacker are analysed.
2) The design and analysis of an augmented security game for monitoring average-preserving false data injection attacks, based on a detailed model with strategies and payoff functions informed by the simulation findings. The conditions under which a pure Nash Equilibrium exists are derived. This extends previous work by providing additional strategies and a more detailed payoff design, informed by the various cost and benefit functions of the utility company and the attacker.
3) Practical guidelines for the utility company on how to protect itself against such attacks. The recommendations are based on combining a range of mitigation strategies and the results of the equilibrium analysis of the game, to aid the utility company with the decision-making process of investing in the security defence. The given advice is motivated by the simulation scenario, but can also be adapted to other situations. This is demonstrated using a concrete example.
This paper is organised as follows. The underlying smart grid management model is introduced in Section II. Different types of attacks are developed and their impacts are analysed in Section III. A game-theoretic defence strategy for the utility company is proposed in Section IV. Finally, the paper is concluded in Section V.
II. SMART GRID MANAGEMENT MODEL
This section describes the game-theoretic scheduling model used for smart grid management. After specifying the smart grid scenario including the battery model, cost function and data specifications, the scheduling game is presented. Note that a more detailed description can be found in [9].

A. Scenario
The scenario of interest considers a residential neighbourhood comprising $M$ households, where each household is equipped with a smart meter. The set of households that participate in the demand-side management (DSM) program is denoted by $\mathcal{N} \subset \mathcal{M}$, where $\mathcal{M}$ is the set of all households in the neighbourhood. The total number of participants is $N = |\mathcal{N}|$. It is assumed that all $M$ households are served by the same utility company (UC).
Each day is split into $T$ discrete intervals, where the set of all intervals is represented by $\mathcal{T}$. The DSM protocol runs as follows: the forecasted demand is sent to the UC, where demand data are aggregated and sent to each DSM participant. Based on this input, the households play a dynamic non-cooperative game (cf. Section II-B). Its outcome is a set of schedules, one for each household, that specify how they can make best use of their battery system. The households follow these schedules, even if their actual demand differs from the forecasted one. Instead of using a forecasting algorithm, random errors were added to actual demands in order to simulate a realistic average error of 8% in individual forecasted data as reported in [33]. More details about the process used to simulate realistic forecasts can be found in Appendix A.
Households that participate in the DSM scheme are equipped with a lithium-ion battery. Using the battery model proposed in [8], [9], which includes charging, discharging, and self-discharging characteristics of the battery, storing decisions are made. They are denoted by the variable a.
The demand $d_m^t \geq 0$ of a household $m \in \mathcal{M}$ is defined as the amount of electricity that is needed to run all its appliances during the time interval $t \in \mathcal{T}$. Let $l_m^t$ denote the load, i.e. the amount of energy drawn from the grid by household $m \in \mathcal{M}$ during the interval $t \in \mathcal{T}$. For the scheme participants, the load depends on the decision $a_n^t$ taken at that specific interval: it combines the demand with the amount of energy that is charged or discharged by the battery, $l_n^t = d_n^t + a_n^t$. Thus, the total grid load during interval $t$ is given by $L^t = \sum_{m \in \mathcal{M}} l_m^t$.
In order to incentivise a reduction of load at peak times, the UC charges the DSM participants using a dynamic pricing tariff: the cost per energy unit is based on the aggregated load of all users and is calculated separately for each interval. As in [6], [34], this is expressed by a quadratic cost function $g_t(y) = c_2 \cdot y^2 + c_1 \cdot y + c_0$, where $y$ is the aggregated load at time $t$ given by $L^t$, and the coefficients satisfy $c_2 > 0$, $c_1 \geq 0$ and $c_0 \geq 0$. The electricity bill $B_n$ (cf. [6], [8], [10]) of each participant is given by:
[...] (1)
where $\Omega_n = \sum_t l^t$ [...]

B. Scheduling Game
Formally, the game used to schedule battery usage is a discrete time dynamic game, in which players, i.e. households, have to decide how to use their battery during each individual interval of the upcoming day. In this game, each household has the objective to minimise their own costs as defined by the electricity bill (1). As the electricity bill itself depends on the aggregated load, the selfish behaviour of each individual becomes equivalent to minimising the peak-to-average ratio of the aggregated load (cf. [10]). It is defined as
$$\mathrm{PAR} = \frac{\max_{t \in \mathcal{T}} L^t}{\frac{1}{T} \sum_{t \in \mathcal{T}} L^t}. \tag{2}$$
The theoretically optimal result is a perfectly flat curve with a PAR value equal to 1.0. Since the mathematical details of the game mechanics lie outside the scope of this manuscript, the interested reader should refer to [9] for a thorough description.
In the following, the scheduling game is treated as a black box that takes forecasted demand data as an input and then outputs schedules (one for each household) of optimal battery usage for the upcoming day as defined by its Nash Equilibrium. A Nash Equilibrium (NE) strategy has a local optimality property: no single player can increase their payoff by deviating from the NE strategy while all other players keep to it. It is important to note that only unilateral strategy changes are considered in this concept. Hence, applying game theory to real-life scenarios is only a valid and useful tool if all participants in the modelled scenario agree to adopt it as a contract for strategic decision-making. Fig. 1 shows an example of the scheduling impact of the game on the aggregated load for one day. Whereas the load profile without playing the game shows the usual peaks in the morning and evening, it is possible to obtain a relatively flat profile by means of the scheduling game. The first row of Fig. 9 (in Appendix B) illustrates actual battery usage of each household using a battery. As the dashed curve in the last row of Fig. 9 shows, the higher the participation in the game, the flatter the aggregated load.
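The quantities of Section II can be illustrated with a minimal sketch. It only computes household loads, the aggregated load, the quadratic unit cost and the peak-to-average ratio; it does not implement the scheduling game of [9]. The demand values, the battery decisions and the cost coefficients are made-up assumptions, and the battery is idealised (no losses or capacity limits).

```python
# Illustrative sketch only: all numbers and coefficient values are assumptions.

def household_load(demand, decisions):
    """Load l = d + a per interval (a > 0: charging, a < 0: discharging)."""
    return [d + a for d, a in zip(demand, decisions)]

def aggregated_load(loads):
    """Aggregated load per interval: sum over all household loads."""
    return [sum(interval) for interval in zip(*loads)]

def unit_cost(y, c2=0.02, c1=0.1, c0=0.0):
    """Quadratic cost g(y) = c2*y^2 + c1*y + c0 (coefficients are assumed)."""
    return c2 * y * y + c1 * y + c0

def par(agg):
    """Peak-to-average ratio of an aggregated load profile."""
    return max(agg) / (sum(agg) / len(agg))

# Two hypothetical households, four intervals per day.
demand_a = [1.0, 3.0, 1.0, 3.0]
demand_b = [1.0, 3.0, 1.0, 3.0]
# Household b charges off-peak and discharges at peak times.
decisions_b = [1.0, -1.0, 1.0, -1.0]

without_battery = aggregated_load([demand_a, demand_b])
with_battery = aggregated_load([demand_a, household_load(demand_b, decisions_b)])

print(par(without_battery))  # 1.5
print(par(with_battery))     # 1.25
```

In this toy example a single battery already lowers the PAR, and the peak-interval unit cost drops with it because the cost function is increasing in the aggregated load.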

C. Experimental Setup
Throughout the paper, all simulations are performed for a neighbourhood of M = 25 households over a period of 365 consecutive days to allow for statistical analysis of the outcomes. Each day is split into T = 24 intervals. The respective demand data are taken from [35]. Every participant of the DSM scheme is equipped with the same type of battery, i.e. the Tesla Powerwall 2 (cf. [36]). Battery characteristics such as efficiency, capacity, charging and discharging rates, and degeneration behaviour are read off its data sheet. This setup is deliberately chosen to be similar to the one investigated in [9], [37] to allow for comparison of the outcomes.

III. FALSE DATA INJECTION ON FORECASTS
As motivated in Section I, the security of a smart energy system is of extreme importance and there is a lack of research on possible attacks on forecasted data. This section describes different types of potential attacks that may take advantage of the game-theoretic smart grid management model presented previously. Furthermore, outcomes of those attacks are analysed from the point of view of the attacker, the UC and the other players. Various defence strategies to detect those attacks are proposed and analysed. Finally, attack mitigation is discussed.
Fig. 1. The aggregated load of the households for a single day is shown for two scenarios: without the game, and after playing the game. Every household is equipped with a battery. As players try to lower their electricity bill (1) (by means of their battery usage), they directly affect the load profile. In this example, the PAR value of the aggregated load (2) decreases from 1.69 to 1.04.

A. Description of Attacks
All attack scenarios investigated in this section rely on the following assumptions. First, the attacker (who is one of the players) exploits the vulnerability of the smart grid communication network: they have the ability not only to intercept forecasting data from all the other players, but also to replace them. Second, after the game has been played based on the tampered data, the attacker adapts their storage schedule and takes advantage of the erroneous schedules that the other players follow. Finally, in order to limit the risk of having their attack detected, the attacker makes sure that the average daily aggregated load is not affected by their actions. Although there are many strategies which can be applied to change forecasts while maintaining a constant aggregated value of the load, this study investigates two simple families of attacks: forecast shifting and scaling.
a) Shift Attack: The shift attack replaces a given forecast with a copy of the original forecast that has undergone a circular shift of σ time intervals, where σ is an integer. Since experimental results have shown that a shift attack of 4 hours, see Fig. 2, produces the most dramatic impact for the dataset of interest (cf. Section II-C), that value is used for the rest of the study.
b) Scale Attack: The scale attack substitutes a given forecast with a scaled version centred around its average value for the day. To ensure that the day average is not affected, the scaling parameter τ should be chosen such that no load becomes negative after scaling. Note that for the dataset of interest (cf. Section II-C), a value of τ = 2 remains acceptable: although a couple of values do become negative, they are set to 0; the day average is slightly increased, but it remains within a realistic forecast uncertainty (cf. Appendix A). Fig. 3 illustrates the effect of various scale attacks, i.e. τ = −1, τ = 0 and τ = 2. While τ = 1 returns the initial forecast, τ = 0 and τ = −1 produce a flat and a mirrored forecast, respectively. In the rest of the paper, these two attacks are called flat and mirror attack.
Fig. 4. Compared to the scenario without an attack (Fig. 1), the load curves after the attack differ considerably. This is a direct result of the attacker taking advantage of the falsely injected data. The bottom graph displays the change of price per unit during those two scenarios. As expected, the attacker's load has a high inverse correlation with the unit price (≈ −0.96).
The outcome of an attack does not only depend on the type of attack and its associated parameter, but also on the number of forecasts which are replaced among all the players of a game: the higher the percentage ρ of attacked households, the more room for manoeuvre the attacker has to profit from their attack.
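The shift and scale transformations described above can be sketched as follows. This is an illustration under stated assumptions rather than the simulation code used in the study: the direction of the circular shift is assumed (the text only fixes its size σ), the scaling formula mean + τ · (d − mean) is one reading of "scaled version centred around its average value", and the four-interval forecast is made up.

```python
def shift_attack(forecast, sigma):
    """Circular shift by sigma intervals (assumed direction: later in the day)."""
    sigma %= len(forecast)
    return forecast[-sigma:] + forecast[:-sigma] if sigma else list(forecast)

def scale_attack(forecast, tau):
    """Scale a forecast around its daily mean; negative values are set to 0."""
    mean = sum(forecast) / len(forecast)
    return [max(0.0, mean + tau * (d - mean)) for d in forecast]

# Hypothetical forecast with a daily mean of 2.0.
forecast = [1.0, 2.0, 4.0, 1.0]

shifted = shift_attack(forecast, 1)    # [1.0, 1.0, 2.0, 4.0]
flat = scale_attack(forecast, 0)       # [2.0, 2.0, 2.0, 2.0]
mirror = scale_attack(forecast, -1)    # [3.0, 2.0, 0.0, 3.0]
scaled = scale_attack(forecast, 2)     # [0.0, 2.0, 6.0, 0.0]

# Every variant keeps the daily total (and hence the daily average) unchanged
# here, because no value had to be clipped to 0.
for attacked in (shifted, flat, mirror, scaled):
    assert sum(attacked) == sum(forecast)
```

With a larger τ some values would fall below zero and be clipped, which raises the daily average slightly, as noted for τ = 2 in the text.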

B. Attack Outcomes

1) Outcome for the Attacker
Fig. 4 illustrates the resulting load curves of attacker and victim in the case of a shift attack (σ = 4). The attacker benefits by having a high load during the periods when the victims have a low one and vice versa, so that the attacker's higher consumption takes place when the aggregated load, and thus unit price, is low. This is exactly what the attacker tried to achieve by manipulating the forecasting data and thus the input to the scheduling game. More details about the cost function model can be found in Section II-A and [11], [38]. In this attack example, there is a high inverse correlation, i.e. ≈ −0.96, between the attacker's load and the unit price.
An attacker's financial benefit depends not only on the type of attack, but also on the number of households using a battery, i.e. the participation rate N/M, as well as the proportion ρ of targeted households whose forecasts have been changed. In order to investigate this, attack simulations were conducted on a smart grid comprising M = 25 households for a duration of one year. Fig. 5 displays the percentage change in the attacker's bill compared to the non-attack scenario (yearly median of the daily changes) according to those parameters in the cases of shift (σ = 4), mirror (τ = −1) and scale (τ = 2) attacks. Simulations have revealed that a flat attack (τ = 0) results in benefits similar to those of the shift attack (σ = 4) and is thus not shown. For the shift and mirror attacks, gains increase with both participation rate and percentage of targeted players. Bill reductions for the attacker reach up to 25.5% and 35.7%, respectively. However, in the case of the scale attack (τ = 2), the graph displays a different picture: below a relatively high participation rate (N/M < 55%), the attacker is financially penalised by their attack. Indeed, while the other attacks lead players to charge their battery at the wrong time, this scale attack tends to make players charge their battery more than they need at a time when the attacker would also need to charge their battery. As Fig. 9 (cf. Appendix B) reveals, when the participation rate is high, the aggregated load profile is inverted due to a large number of players charging their battery excessively at a time that was initially of low load and discharging their battery when a peak was expected. As a consequence, the aggregated load profile is now almost ideal for the attacker, who can benefit from low prices at their time of high needs. Thus, they hardly need to use their battery and can gain up to 9.5% of bill reduction.

2) Outcome for the Utility Company and the Other Players
As mentioned in Section II, for the utility company, the efficiency of a microgrid is assessed by its PAR value. Since attacks change the aggregated load, the PAR is directly affected. The effect of the previously introduced attacks on PAR values is presented in Fig. 6. Each attack type is shown in a separate graph, which presents several curves, one for each percentage ρ of targeted players, showing the relationship between participation rate N/M and PAR value. For the shift (σ = 4) and mirror (τ = −1) attacks, an increase of ρ leads to a worsening of PAR values. Moreover, as in the non-attack scenario, PAR values tend to improve with an increase of participation rate. The mirror attack is an exception: if a high percentage of players are targeted, an increase of the participation rate contributes to the degradation of PAR values.
As analysed in the previous section, the outcomes of the scale attack (τ = 2) are different from the others when the participation rate is below N/M = 55%. In fact, Fig. 6 shows an improvement of the PAR values compared to the non-attack scenario when the percentage ρ of targeted players increases. Fig. 9 clearly shows that at low participation rates the aggregated load is flatter than without an attack. The explanation is that this positive scaling incentivises participating households to work harder to flatten the load curve: as seen in Fig. 9, charging takes place at the same time but with a higher intensity. As a consequence, a 52% participation rate is sufficient to achieve a PAR that is similar to the one resulting from a 100% participation rate without any attack, i.e. PAR = 1.11 and PAR = 1.07, respectively. Participants work twice as hard, which has the same effect as if everybody was working as they should. This extra work leads to higher bills for those households. An improved PAR value may suggest that the UC benefits from such attacks. In practice, this is not the case because in those scenarios the electricity bills of all players, including the attacker, increase substantially (data not shown), which will eventually lead to a loss of reputation and customers for the UC.
All attacks leading to the reduction of a single player's (the attacker's) bill result in an increase of all the other players' bills by a usually comparable amount, see Tables I and II. As a consequence, the aggregated bill for the whole neighbourhood is significantly increased. For example, a mirror (τ = −1) attack targeting all players (ρ = 100%) rewards the attacker with a 35.7% bill cut, while the other players must endure a 54.0% rise on average. Similarly, the attacker benefits from a scale attack (τ = 2, ρ = 28%) with a bill reduction of 1%, penalising the other households by a 2.3% increase.

C. Attack Detection Strategies
All investigated attacks affect the utility company negatively: When the participation rate is high, PAR values are systematically degraded compared to the non-attack scenario; otherwise, either PAR values worsen, or their improvement is at the cost of higher electricity bills for the average household. This is detrimental to the UC's credibility and competitiveness. Consequently, the UC needs to design defence strategies to prevent attacks that affect the storage scheduling process. In this study, the focus is on the detection of false data injection by monitoring the forecasting data that are transmitted every day on the smart grid communication system.

1) Attack Detection Through System Monitoring
Forecast monitoring is considered at three different levels:
• Aggregated consumption forecast average, i.e. average amount monitoring
• Aggregated consumption forecast profile, i.e. deep aggregated monitoring
• Household consumption forecast profiles, i.e. deep individual monitoring
In each case, the UC would compare the received forecast data with its own estimate. While average amount monitoring only requires the UC to forecast the daily total electricity consumption of the smart grid community as a whole, deep monitoring relies on producing hourly consumption estimates for either the entire community or each individual household. The more precise the monitoring, the more resources are needed to implement it.
Since an individual average hourly forecast error for a 24-hour period is expected to be lower than 8% [33], the difference between two forecasts, i.e. the forecast provided by the received forecast data and the forecast estimated by the UC, is expected to be lower than twice the 8% error of a single forecast, i.e. 16%. As a consequence, it is reasonable to assume that the UC could use a threshold of 20% to identify an attack when using deep individual monitoring. In the case of deep aggregated monitoring, the combination of forecasts tends to lead to error reduction. As a consequence, a discrepancy of at least 10% is used here to detect an attack. Finally, since in the proposed attack scenarios the attacker always makes sure that their attack does not change the average daily aggregated forecast, a UC relying only on average amount monitoring would not be able to detect any attack. Ultimately, the detection of a given attack depends not only on the chosen monitoring strategy, but also on the type of attack, the participation rate N/M and the percentage ρ of targeted players.
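A minimal sketch of the three monitoring levels is given below. The 10% and 20% thresholds are taken from the text; everything else is an assumption made for illustration, in particular the discrepancy measure (summed absolute interval-wise difference relative to the UC's estimated daily total), the function names, and the example profiles.

```python
# Thresholds quoted in the text; the discrepancy measure itself is an assumption.
DEEP_INDIVIDUAL_THRESHOLD = 0.20
DEEP_AGGREGATED_THRESHOLD = 0.10

def total_discrepancy(received, estimate):
    """Average amount monitoring: relative difference of the daily totals."""
    return abs(sum(received) - sum(estimate)) / sum(estimate)

def profile_discrepancy(received, estimate):
    """Interval-wise absolute difference, relative to the estimated daily total."""
    return sum(abs(r - e) for r, e in zip(received, estimate)) / sum(estimate)

def deep_aggregated_alarm(received_agg, estimated_agg):
    """Deep aggregated monitoring: compare aggregated hourly profiles."""
    return profile_discrepancy(received_agg, estimated_agg) >= DEEP_AGGREGATED_THRESHOLD

def deep_individual_alarm(received_profiles, estimated_profiles):
    """Deep individual monitoring: flag if any household profile deviates too much."""
    return any(
        profile_discrepancy(r, e) >= DEEP_INDIVIDUAL_THRESHOLD
        for r, e in zip(received_profiles, estimated_profiles)
    )

# Hypothetical aggregated estimate of the UC and a mirrored (tau = -1) version of it.
estimate = [2.0, 6.0, 2.0, 6.0]
received = [6.0, 2.0, 6.0, 2.0]

print(total_discrepancy(received, estimate))      # 0.0: daily total is unchanged
print(deep_aggregated_alarm(received, estimate))  # True: the profile differs strongly
```

The example reflects the argument of the text: an average-preserving attack is invisible to average amount monitoring, while a profile-level comparison reveals it.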

2) Attack Impact Analysis
Based on the three proposed monitoring strategies, the consequences of undetected attacks are studied. These are evaluated by estimating an attack's impact in terms of average bill change for the attacker and the other players, bill revenue change for the UC and PAR values. Assuming a participation rate of N/M = 100%, this set of experiments considers, for each attack type of interest, i.e. shift (σ = 4), flat (τ = 0), mirror (τ = −1) and scale (τ = 2 and τ = 1.29), the most severe attack, in terms of the highest percentage ρ of targeted players, that has remained undetected according to the monitoring strategy.
As Tables I and II show, all of those attacks prove beneficial to the attacker in terms of reducing their bill, while other players suffer a bill increase. The UC benefits financially from the general bill rise, but sees its PAR value degraded. Note that the impact of a scale (τ = 1.29) attack is evaluated because it is the most powerful scale attack which can target all players (ρ = 100%) without being detected by any of the proposed monitoring strategies. Table I reports the impact of attacks that remain undetected despite average amount monitoring. As such monitoring is ineffective against the considered attacks, the attacker is able to carry out their attack with maximum strength, i.e. ρ = 100%, without being detected. The mirror (τ = −1) attack is particularly efficient: the attacker's bill is reduced by 35.7% at the expense of the other players, whose bills rise by 54.0%, and of a large increase of the PAR value to 2.06 from a non-attack value of 1.12.
Once deep aggregated monitoring is in place, the strength of the attacks that remain undetectable is reduced significantly. As Table II shows, the attacker's bill is lowered by 1.9% at most. However, although in this case the other players are hardly affected (their bills only increase by 0.3%), the UC suffers from a significant degradation of the PAR to 1.23. One should note that although the scale (τ = 2) attack with [...]
Finally, although the most stringent monitoring strategy, i.e. deep individual monitoring, would detect most attacks whatever ρ, i.e. shift (σ = 4), flat (τ = 0), mirror (τ = −1) and scale (τ = 2), some limited scale attacks such as (τ = 1.29, ρ = 100%) still cannot be discovered (cf. last line of Table II). Although none of the proposed monitoring strategies can detect all attacks, they are able to recognise the most severe ones. Moreover, they can detect false data injection for a wide range of attacks.

D. Attack Mitigation
Once an attack has been detected, some response needs to be provided. For the most serious attacks, households may be instructed not to follow the calculated battery schedule, but use an alternative one. Several options are possible such as keeping the same schedule as the previous day or recalculating their schedule only taking into account their own data. In the latter case, scheduling is executed without using the game-theoretic framework, but by performing a simple optimisation of battery usage for their own consumption forecast.
Those options were evaluated in a previous study [37]. It showed that, although both approaches lead to a PAR reduction, local scheduling should be the defence of choice since it systematically outperforms previous day scheduling. Still, this mitigating strategy has its own cost: at medium participation rates N/M, the PAR reduction can be up to ≈ 25% lower than when the game is played. As Tables I and II show, only the most powerful attacks have an impact on the PAR which is higher than that of reverting to the local scheduling strategy. This suggests that the best reaction to a low-impact attack would be to let it happen. In terms of monitoring, only deep aggregated monitoring would prove useful, since it is able to detect all attacks for which the proposed mitigation strategy is beneficial. Therefore, a two-level detection system may be the most suitable strategy for the UC: it should conduct either no monitoring at all, or deep aggregated monitoring.
Before deciding on a complete defence strategy, which includes detection and mitigation, all costs and benefits must be taken into account by the UC, i.e. cost of monitoring, cost of mitigating action, cost of reputation loss and benefit of increased consumption. The main challenge for the utility company is to control the spending on their security measures, as organisations typically have a restricted budget. For example, if the expected probability of an attack is low, a low investment in security could be justified. On the other hand, if an attacker is aware of such a strategy, they would be more likely to attack as they would expect less resistance. Finding a solution to this decision-making problem cannot be achieved by optimisation alone, but instead non-cooperative game theory helps in devising suitable models and advising on the expected likelihood of attacks.

IV. GAME-THEORETIC DEFENCE STRATEGY
When planning to defend against the false data injection attacks described in the previous section, the need for the utility company to allocate resources for the defence in the most efficient way has been highlighted. This section proposes to use game theory in order to support this decision-making process. The game is first motivated and then introduced in normal form by detailing the payoff functions of the two players. It is subsequently solved under various assumptions. Finally, the solution is discussed with respect to its implications for the simulated scenario and potential alternatives.

A. Game Theory for Security
Game theory is increasingly being employed for modelling attacker-defender scenarios in cyber security, for a broad range of scenarios such as intrusion detection in network security [39], managing the security of information in an organisation [40] and predicting the likelihood of cyber attacks [41]. Non-cooperative game theory is based on the assumption that players are rational, i.e. they choose between actions such that they maximise their payoffs. The associated optimal strategies can be identified using the fundamental concept of the Nash Equilibrium (cf. Section II-B). Although not all games have a pure Nash Equilibrium, Nash's theorem states that finite nonzero-sum games always admit a mixed strategy equilibrium. However, for practical applications a mixed equilibrium may not be easy to interpret [42].
In this paper, x and y denote a pure or mixed strategy of the first and second player in a two-player game, and x* and y* are used for optimal strategies of these players. A strategy profile s = (x, y) groups strategies of each player together. If the grouped strategies are optimal, the optimal strategy profile is written as s*. A two-player nonzero-sum game can be represented in normal form, based on the players' payoff matrices A and B [43].
An optimal Nash Equilibrium strategy profile is a strategy profile s* = (x*, y*) satisfying

$$x^{*} A y^{*} \ge x A y^{*} \;\; \forall x, \qquad x^{*} B y^{*} \ge x^{*} B y \;\; \forall y. \tag{3}$$

Here, the strategies may be pure or mixed, and the corresponding NE is referred to as pure or mixed. Furthermore, if all of the inequalities in the above definition are strict, one has a strict NE. Otherwise, the NE is non-strict.
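Condition (3) can be checked numerically for a given strategy profile. The following is a minimal sketch, not part of the original study: the function name `is_nash_equilibrium`, the tolerance `tol` and the matching pennies example are illustrative assumptions. It relies on the standard fact that expected payoffs are linear in a player's own mixed strategy, so testing deviations to pure strategies is sufficient.

```python
import numpy as np

def is_nash_equilibrium(A, B, x, y, tol=1e-9):
    """Check condition (3) for the profile (x, y) of the bimatrix game (A, B).

    A[i, j] and B[i, j] are the payoffs of the first and second player when
    the first player uses pure strategy i and the second uses pure strategy j.
    Only deviations to pure strategies are tested, which is sufficient because
    the expected payoff is linear in the deviating player's mixed strategy.
    """
    A, B = np.asarray(A, dtype=float), np.asarray(B, dtype=float)
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    payoff_1 = x @ A @ y            # x* A y*
    payoff_2 = x @ B @ y            # x* B y*
    best_deviation_1 = np.max(A @ y)  # best pure reply of player 1 to y
    best_deviation_2 = np.max(x @ B)  # best pure reply of player 2 to x
    return bool(payoff_1 >= best_deviation_1 - tol
                and payoff_2 >= best_deviation_2 - tol)

# Matching pennies: a textbook game used only for illustration.
A = np.array([[1, -1], [-1, 1]])
B = -A
print(is_nash_equilibrium(A, B, [0.5, 0.5], [0.5, 0.5]))  # True
print(is_nash_equilibrium(A, B, [1, 0], [1, 0]))          # False
```

The example illustrates a game whose only equilibrium is mixed: the uniform profile passes the test, whereas a pure profile does not.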

B. Proposed Security Game
The proposed security game is a two-player nonzero-sum complete information game [43] between the utility company U and the attacker A. The game is inspired by the nonzero-sum Intrusion Detection System (IDS) game of [39], which has been thoroughly analysed in the literature and is well understood. Table III illustrates the game, where the two strategies available to the defender are to monitor or not, denoted by the strategy space S D = {s D mon , s D −mon }, and the attacker chooses between attacking or not attacking, denoted by the strategy space S A = {s A att , s A −att }. The positive parameters α c , α f , α m , β c and β s are used to denote the payoffs corresponding to the various strategies. The main characteristic of this game is the design of the payoff functions in such a way that the monitoring defender only has an incentive to defend in the presence of an attack, while the attacker is discouraged from attacking if there is defence in place. This design leads to a circular path when considering payoff-incrementing unilateral changes of strategy, hence prohibiting the existence of a pure Nash Equilibrium.
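The circular path of unilateral changes can be traced with a few lines of code. The sketch below is illustrative only: the payoff layout is an assumed reading of the parameter descriptions given in this paper (α c gain from detection, α f false alarm cost, α m cost of a missed attack, β c detection penalty, β s attacker gain), Table III remains the authoritative definition, and the numerical values are hypothetical.

```python
# Hypothetical positive parameters, chosen only for illustration.
alpha_c, alpha_f, alpha_m = 2.0, 1.0, 5.0  # detection gain, false alarm cost, miss cost
beta_c, beta_s = 3.0, 4.0                  # detection penalty, attacker gain

# payoffs[(defender, attacker)] = (defender payoff, attacker payoff)
# Assumed layout, read from the parameter descriptions in the text.
payoffs = {
    ("mon", "att"): (alpha_c, -beta_c),
    ("mon", "-att"): (-alpha_f, 0.0),
    ("-mon", "att"): (-alpha_m, beta_s),
    ("-mon", "-att"): (0.0, 0.0),
}

def improving_move(profile):
    """Return a profile reached by a payoff-increasing unilateral change, or None."""
    d, a = profile
    other_d = "-mon" if d == "mon" else "mon"
    other_a = "-att" if a == "att" else "att"
    if payoffs[(other_d, a)][0] > payoffs[profile][0]:
        return (other_d, a)
    if payoffs[(d, other_a)][1] > payoffs[profile][1]:
        return (d, other_a)
    return None

def follow_improving_path(start, max_steps=10):
    """Follow improving moves until none exists or the start profile recurs."""
    path = [start]
    for _ in range(max_steps):
        nxt = improving_move(path[-1])
        if nxt is None:
            break
        path.append(nxt)
        if nxt == start:
            break
    return path

path = follow_improving_path(("mon", "att"))
print(" -> ".join(f"({d}, {a})" for d, a in path))
# A pure NE would be a profile without any improving move.
print(any(improving_move(p) is None for p in payoffs))  # False
```

With these values, the path visits all four profiles and returns to its starting point, so no profile is stable against unilateral changes.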

1) Description of the Game
Here, an augmented security game is introduced, extending the IDS game described previously by an additional action. The rationale behind this extended game model is twofold. First, Section III-C1 demonstrated the existence of low-impact attacks which cannot be detected by standard monitoring techniques, and it would be desirable to capture these in a more sophisticated game model. Second, an extended game might better match real-world scenarios and might lead to simpler solutions, in this case pure equilibria rather than mixed ones.
a) Game Strategies: Section III presented three possible monitoring strategies for U: to monitor the daily average of forecasting data, to inspect the daily profile of the aggregated forecast, and to inspect the individual forecast data with the same level of detail. In this work, the assumption is made that the first and second monitoring strategies are most useful in a realistic setting, as they have an observable impact on the strength and outcome of successful attacks, while the third monitoring strategy merely eliminates attacks that are possible for weaker monitoring levels. Furthermore, as the data of aggregated forecasts is readily available to the UC, the first monitoring strategy is not very costly and is identified with the strategy s U −mon . The second monitoring strategy is denoted as s U mon , so that the strategy space for the defender U is, as in the previous game, S U = {s U mon , s U −mon }. The attacker A has three different strategies: to attack strongly with high impact, to perform a weaker attack with low impact, or not to attack at all. This is denoted by the strategy space S A = {s A att° , s A att , s A −att }. The additional weak attack strategy s A att° offers the UC an alternative incentive not to monitor, since it may prefer to save the monitoring cost when facing a weak attack.
No assumption is made on the relationship between the attacker's overall payoffs for the two different attack types; a discussion of conditions clarifying this relationship is the main subject of the game analysis in the next section.
b) Game Payoff Functions: The following notation for the payoffs of U is introduced: c U mon is the cost of monitoring the daily profile of the aggregated forecast (second monitoring strategy) and c U def is an additional cost for investing in defence mechanisms such as the actions discussed in Section III-D. Losses from weak and strong attacks are denoted by l U att° and l U att , respectively. The payoff functions corresponding to A are the benefits and costs associated with weak and strong attacks, denoted by b A att° and c A att° , and b A att and c A att , respectively.
The monitoring activity always leads to monitoring costs for U. If there is no monitoring, U incurs losses l U att° and l U att due to weak and strong attacks, respectively. If there is monitoring, weak attacks still cannot be detected, hence there is a resulting loss l U att° . Strong attacks, however, are detected and mitigated through countermeasures, preventing any losses but leading to a defence cost c U def . Finally, if there is no attack, the only nonzero payoff involved is the monitoring cost for U. The attacker A obtains a benefit b A att° from a weak attack, but has to invest the attack cost c A att° . Similarly, the cost c A att arises from a strong attack; however, if the attack is detected, the model assumes that A obtains no benefit, owing to the UC's defence mechanism. Using these notations, the proposed security game G can be represented in normal form as shown in Table IV.
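As a cross-check of the verbal description, the payoffs can be collected in a bimatrix. The array below is only a reading of the prose of this subsection, with each cell listing the payoff of U followed by the payoff of A; Table IV remains the authoritative definition of G.

```latex
% Rows: strategies of U. Columns: strategies of A.
% Each cell: (payoff of U, payoff of A), as read from the description above.
\[
\begin{array}{c|ccc}
 & s^{A}_{att^{\circ}} & s^{A}_{att} & s^{A}_{-att} \\ \hline
s^{U}_{mon}
 & \left(-c^{U}_{mon} - l^{U}_{att^{\circ}},\; b^{A}_{att^{\circ}} - c^{A}_{att^{\circ}}\right)
 & \left(-c^{U}_{mon} - c^{U}_{def},\; -c^{A}_{att}\right)
 & \left(-c^{U}_{mon},\; 0\right) \\
s^{U}_{-mon}
 & \left(-l^{U}_{att^{\circ}},\; b^{A}_{att^{\circ}} - c^{A}_{att^{\circ}}\right)
 & \left(-l^{U}_{att},\; b^{A}_{att} - c^{A}_{att}\right)
 & \left(0,\; 0\right)
\end{array}
\]
```

In this reading, the only cell in which monitoring changes the attacker's payoff is the strong attack, which is what drives the case distinction in the game analysis.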

2) Game Assumptions
In this section, assumptions on the relationships between the various cost and benefit functions, which are part of the game payoff matrices, are listed and justified.
a) Assumptions from the IDS Game: The cost for missing an attack α m = l U att > 0 is interpreted as the loss from an attack that is not mitigated, the false alarm cost α f = c U mon > 0 as an ongoing monitoring cost, and the detection penalty β c = c A att > 0 as the cost for the attacker to conduct a strong attack. The gain from detection α c = −c U mon − c U def is reformulated as the necessary cost to monitor and to defend in order to prevent damage; unlike in the IDS game, it is therefore negative. In order to preserve the mixed equilibrium property of the security game given by −α m < α c , it is then assumed that this attack prevention cost is less than the damage actually incurred from an attack, i.e. c U mon + c U def < l U att . This assumption is natural: in a typical security game, the defender does not spend more on attack prevention than what they could potentially lose from an attack. Finally, β s = l U att − c A att > 0 is the difference between the benefit from an undetected attack and the attack effort. This expresses a similar principle as above, but this time applied to the attacker A, who does not spend more on an attack than the expected gain from it. These assumptions are referred to as the Security Game Assumptions.
b) Augmented Security Game: The assumptions required for the augmented security game are in part inspired by those of the IDS game, and also motivated by the experimental results presented in Section III, which suggest that strong attacks require targeting more victims, i.e. a bigger effort.
For a weak attack, the attacker receives a greater payoff than the cost of the attack, implying

$$b^{A}_{att^{\circ}} - c^{A}_{att^{\circ}} > 0. \tag{4}$$

It can also be assumed that the cost for launching a strong attack is higher than that for a weak attack, since a higher number of households has to be attacked:

$$c^{A}_{att} > c^{A}_{att^{\circ}}. \tag{5}$$

Finally, a strong attack leads to higher losses for the utility (cf. Section III-B1):

$$l^{U}_{att} > l^{U}_{att^{\circ}}. \tag{6}$$

Note that, in order to aid the game analysis, an assumption made in this game is that the benefit of the attacker is equal to the loss of the defender, b A att = l U att and b A att° = l U att° .

C. Game Analysis
In this section, analysis of the security game G reveals the existence of several NE strategies. Following the study of practical examples, the relevance of these strategies is discussed so that they can be used to inform the UC's security investments.

1) Optimal Nash Equilibrium Strategies
To solve the augmented security game, three distinct cases are considered. This is based on discussing the second order difference

$$\Delta = q_{att^{\circ}} - q_{att}, \tag{7}$$

where q att° = l U att° − c A att° and q att = l U att − c A att describe the net benefit for the attacker in case of a weak and strong attack, respectively.
a) Case 1 (∆ > 0): In this case, the existence of a unique pure NE for the game G can be asserted. The corresponding NE strategy is for the UC to not monitor, and for the attacker to carry out a weak attack. Due to the uniqueness property these solutions are globally optimal.
Proposition 4.1: If l U att° − c A att° > l U att − c A att , the game G admits a unique pure Nash Equilibrium strategy profile of the form s* = (s U −mon , s A att° ), and the corresponding payoffs s* U = −l U att° and s* A = l U att° − c A att° are globally optimal.
Proof: First, it needs to be verified that, when choosing the pure strategy profile (s U −mon , s A att° ), neither of the two players benefits from a unilateral change of pure strategy.
Focusing on the UC, the change of strategy s U −mon → s U mon diminishes its payoff since −l U att° > −c U mon − l U att° due to the assumption of a positive monitoring cost. Considering the attacker, the change s A att° → s A att is not beneficial because of the main assumption ∆ > 0 of this case. Finally, the change of strategy s A att° → s A −att reduces the payoff due to Assumption (4). Second, a careful inspection of the payoff functions of the remaining strategies of the game, together with the fact that the assumption of Case 1 implies l U att° − c A att° > −c A att , shows that there is no other pure NE.
b) Case 2 (∆ < 0): Similarly to the IDS game, the augmented security game has the property of circular paths when performing unilateral changes of strategy with increasing payoffs, hence prohibiting the existence of any pure NE.
Proposition 4.2: If l U att°− c A att°< l U att − c A att , the game G admits no pure NE.
Proof: The proof of this proposition is very similar to that of Proposition 4.1 and compares the changes in payoff following a unilateral change of strategy. It is clear that there is no pure NE in the game restricted to the attacker strategies s A att and s A −att , as the resulting subgame is identical to the IDS game. When augmented by the weak attack strategy s A att° , two cases may arise, depending on which of the strategy changes s A att → s A att° or s A att° → s A att , starting from the initial strategy profile (s U mon , s A att ), leads to an increased payoff for the attacker.
In the first case, one observes the additional sequence of strategy changes s U mon → s U −mon , s A att°→ s A att and s U −mon → s U mon leading back to the original strategy profile. These changes entail increased payoffs due to the assumption of positive monitoring cost, the condition ∆ < 0 and the Security Game Assumptions. In the second case, the unilateral payoff change joins the circular path of the IDS game, from which the proof follows as shown earlier.
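Propositions 4.1 and 4.2 can be illustrated by enumerating the pure NE of G for concrete parameter values. The sketch below is not the authors' implementation: the payoff matrices are an assumed reading of the description of the game (with b A att = l U att and b A att° = l U att° ), the function names are illustrative, and both parameter sets are hypothetical values chosen to satisfy the game assumptions.

```python
import numpy as np

U_STRATEGIES = ["mon", "-mon"]
A_STRATEGIES = ["att°", "att", "-att"]

def build_game(c_mon, c_def, l_weak, l_strong, c_weak, c_strong):
    """Payoff matrices of U and A (rows: U, columns: A), assuming b = l."""
    U = np.array([
        [-c_mon - l_weak, -c_mon - c_def, -c_mon],   # U monitors
        [-l_weak,         -l_strong,       0.0],     # U does not monitor
    ])
    A = np.array([
        [l_weak - c_weak, -c_strong,           0.0],  # U monitors
        [l_weak - c_weak, l_strong - c_strong, 0.0],  # U does not monitor
    ])
    return U, A

def pure_nash_equilibria(U, A):
    """Profiles where neither player gains from a unilateral pure deviation."""
    found = []
    for i in range(U.shape[0]):
        for j in range(U.shape[1]):
            if U[i, j] >= U[:, j].max() and A[i, j] >= A[i, :].max():
                found.append((U_STRATEGIES[i], A_STRATEGIES[j]))
    return found

# Hypothetical values: delta = (20 - 2) - (40 - 30) = 8 > 0 (Case 1).
case_1 = build_game(c_mon=10, c_def=20, l_weak=20, l_strong=40, c_weak=2, c_strong=30)
# Hypothetical values: delta = (10 - 2) - (60 - 10) = -42 < 0 (Case 2).
case_2 = build_game(c_mon=10, c_def=20, l_weak=10, l_strong=60, c_weak=2, c_strong=10)

print(pure_nash_equilibria(*case_1))  # [('-mon', 'att°')]
print(pure_nash_equilibria(*case_2))  # []
```

For these values, the first game has the single pure NE predicted for Case 1, while the second has none, in line with Case 2.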

2) Quantitative Examples
Attacks discussed in Section III are further analysed using the proposed augmented security game. In order to establish which case they correspond to, the sign of ∆ (7) is estimated using the previous simulation results. More specifically, b A att and b A att° are represented by the values of the 'Attacker bill change' (γ and γ°), reported in Tables I and II respectively, multiplied by the actual amount of the bill λ, e.g. b A att = l U att = γ · λ. Moreover, assuming a linear relationship between the number of attacked players and the cost of an attack, c A att and c A att° can be expressed using the percentages of targeted players (ρ and ρ°) shown in Tables I and II respectively, e.g. c A att = ρ · κ. As a consequence, an attack type corresponds to Case 2, i.e. ∆ < 0, iff the following inequality is satisfied:

$$\frac{\gamma - \gamma^{\circ}}{\rho - \rho^{\circ}} > \frac{\kappa}{\lambda}, \tag{8}$$

with Assumption (4) stating γ°/ρ° > κ/λ. Evaluations of the attacks reported in Tables I and II show that Case 2 applies to the shift (σ = 4), flat (τ = 0), mirror (τ = −1) and scale (τ = 2) attacks. Hence, for none of those attacks does a pure NE exist, and only mixed strategies can be offered. Using the mirror attack as an example, Equation (8) requires 0.41 > κ/λ and Assumption (4) imposes 0.11 > κ/λ.
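The condition for Case 2 can be derived in a few steps from the definition of ∆ (7). The derivation below is a sketch that uses only the substitutions stated above (benefits γ · λ and γ° · λ, costs ρ · κ and ρ° · κ) and additionally assumes λ > 0 and ρ > ρ°, the latter being the form taken by Assumption (5) under the linear cost model.

```latex
\begin{align*}
\Delta < 0
 &\iff (\gamma^{\circ}\lambda - \rho^{\circ}\kappa) - (\gamma\lambda - \rho\kappa) < 0 \\
 &\iff (\rho - \rho^{\circ})\,\kappa < (\gamma - \gamma^{\circ})\,\lambda \\
 &\iff \frac{\gamma - \gamma^{\circ}}{\rho - \rho^{\circ}} > \frac{\kappa}{\lambda}
 \qquad \text{(dividing by } \lambda\,(\rho - \rho^{\circ}) > 0\text{)}.
\end{align*}
% Assumption (4) in the same variables:
\[
\gamma^{\circ}\lambda - \rho^{\circ}\kappa > 0
 \iff \frac{\gamma^{\circ}}{\rho^{\circ}} > \frac{\kappa}{\lambda}.
\]
```

Both conditions bound the same ratio κ/λ from above, so the smaller of the two left-hand sides is the binding one.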
Since the scale (τ = 1.29) attack was specifically designed to be undetectable by the proposed monitoring solution, it cannot be analysed by the game, which assumes that a successful monitoring strategy is available. On the other hand, the best strategy for such an attack is self-evident: since every attack results in a gain for the attacker, they should attack, while the UC should not waste any resources on ineffective defence.
In order to investigate the mixed strategies associated with those attacks, numerical values were selected so that mixed strategies could be computed using an NE solver [44]: λ = 100, κ = 10, c U mon = 10 and c U def = 20. Table V shows representative mixed strategy probabilities associated with the investigated Case 2 attacks, here the mirror attack. The attacker performs either a strong (63.7% probability) or a weak (36.3% probability) attack, while the UC chooses to use monitoring with a 71.7% probability. Note that the choice of numerical values is not critical. As long as all the game assumptions are fulfilled, the probability for the monitoring action of the UC is at least 70%.
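A mixed strategy of this kind can also be obtained by hand through the usual indifference argument, as sketched below. This is not the NE solver of [44]: it assumes the payoff structure read from the description of G (with b = l), an attacker who mixes only between strong and weak attacks, and hypothetical loss and cost values rather than those of Tables I and II. Making U indifferent between its two strategies gives the probability of a strong attack, c U mon / (l U att − c U def ), and making A indifferent between the two attack types gives the monitoring probability, −∆ / l U att .

```python
def mixed_equilibrium(c_mon, c_def, l_weak, l_strong, c_weak, c_strong):
    """Return (P[U monitors], P[A attacks strongly]) for Case 2 (delta < 0)."""
    delta = (l_weak - c_weak) - (l_strong - c_strong)
    if delta >= 0:
        raise ValueError("the indifference argument only applies when delta < 0")
    # U indifferent: c_mon + p * c_def = p * l_strong
    p_strong = c_mon / (l_strong - c_def)
    # A indifferent: (1 - m) * l_strong - c_strong = l_weak - c_weak
    p_monitor = -delta / l_strong
    return p_monitor, p_strong

# Hypothetical values satisfying the game assumptions (not taken from the tables).
v = dict(c_mon=10.0, c_def=20.0, l_weak=10.0, l_strong=60.0, c_weak=2.0, c_strong=10.0)
m, p = mixed_equilibrium(**v)

# Expected payoffs at the candidate equilibrium, to verify indifference.
u_mon = -v["c_mon"] - p * v["c_def"] - (1 - p) * v["l_weak"]
u_no_mon = -p * v["l_strong"] - (1 - p) * v["l_weak"]
a_strong = (1 - m) * v["l_strong"] - v["c_strong"]
a_weak = v["l_weak"] - v["c_weak"]

print(f"P[monitor] = {m:.3f}, P[strong attack] = {p:.3f}")  # 0.700, 0.250
print(abs(u_mon - u_no_mon) < 1e-9, abs(a_strong - a_weak) < 1e-9)  # True True
```

Since the weak attack yields a positive net benefit by Assumption (4), not attacking is never a best reply here, which is why it can be left out of the attacker's mix.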

3) Discussion
Theoretical analysis of the proposed extended game model has shown that, according to the sign of ∆ (7), three different cases should be considered. While both Case 1 (∆ > 0) and Case 3 (∆ = 0) are associated with a pure NE, only that of Case 1 is strict. However, in both cases, the optimal NE strategy for the UC is the same: not to monitor. On the other hand, Case 2 (∆ < 0) only leads to mixed strategies. Practical analysis, investigating the attack examples described in Section III based on a 100% participation rate, revealed that only Case 2 was practically relevant. This is in line with the expectation that the net benefits, i.e. benefits minus costs, of strong attacks are higher than those of weak attacks. Note that for the scale (τ = 2) attack, different cases could arise at lower participation rates due to its specific behaviour, as shown in Fig. 5 and Fig. 9.
Regarding Case 2, the practical application by a UC of optimal strategies such as those illustrated in Table V is not straightforward. Many suggestions have been made regarding possible interpretations of mixed strategies [45]-[47]. In the specific context of this work, the interpretation proposed by [47] is of particular interest: assuming that the UC supplies a set of microgrids, where the security strategy is decided at the microgrid level, the microgrids, seen as a population, would choose defence strategies following the mixed probabilities. Alternatively, as suggested in [43], [48], the probability associated with defence could be interpreted as an index of security criticality which would inform the UC's decisions regarding its defence investments. Interestingly, experiments (not shown) indicate that when the cost of attacking a single player, i.e. κ, decreases, the mixed strategy probability for monitoring grows, increasing defence needs.
Finally, the undetectable scale (τ = 1.29) attack is a reminder that no practical monitoring strategy is perfect, and that the best defence strategy may be not to defend if the losses associated with an attack can be considered acceptable.

V. CONCLUSION
Protecting smart grids from cyber attacks is essential for them to deliver on their promises. Different classes of false data injection attacks against the forecasts required for smart energy scheduling were investigated, and extensive simulations showed the extent of the damage that a single attacker can cause to both the utility company (growth of PAR value by up to 84%) and its consumers (bill increase by up to 54%). The need for mitigation having been established, monitoring and defence strategies were proposed. In order to assess their value and advise utility companies on their optimal attack prevention strategy, a novel and generic security game that considers low and high-impact attacks was designed. Its analysis highlighted, in particular, conditions under which a pure Nash Equilibrium exists. Interestingly, in those cases, the best strategy is for the utility company not to invest in any monitoring and for the attacker to conduct low-impact attacks. Numerical evaluations considering the previously studied classes of attacks revealed that there is a type of attack where, indeed, no monitoring is the best strategy. However, in all the other cases, only mixed strategies can be offered. Their practical interpretation by UCs was discussed. In conclusion, the proposed security game offers utility companies the ability to investigate the most appropriate monitoring and defence strategies so that false data injection attacks have only very limited, if any, impact on smart energy scheduling.

APPENDIX A SIMULATING FORECASTING ERRORS
Since forecasting electricity consumption is out of the scope of this study, forecasts were simulated instead of being produced by a forecasting algorithm. However, in order to be considered realistic, forecasts must show some deviation from the actual consumption. As it has been reported that the average error in individual forecasted data is around 8% [33], some random error is added to the actual consumption values to produce sufficiently inaccurate forecasts. Although errors could be added following a Gaussian law, the obtained forecasted profile would prove unrealistic since it would display random jumps. As a consequence, some smoothing effect is added by linking successive values. More specifically, for each value i, a random error e_i is initially calculated; the actual error added to the value i is then the average of the corresponding e_i and its neighbours, i.e.

$$E_i = \frac{e_{i-1} + e_i + e_{i+1}}{3}.$$

As seen in Fig. 7, with this approach, the simulated forecast is smoother and, as a consequence, more realistic. Due to the relatively large number of players, despite the added errors, the aggregated forecast remains quite similar to the aggregated demand (an average error of around 2% was estimated experimentally). As a consequence, game solutions based on forecasts with and without errors are close: drawing the histogram of the error per day during a whole year (not shown) reveals an average error of 8% [9].

Fig. 7. Individual forecast created by adding random errors. While the dashed curve is the actual demand of a household, addition and subtraction of 10% are represented by the two dotted curves. The bold curve is one example of a simulated forecast produced using the described method. Here, whereas the average error is 7.5%, there are some values outside the 10% error area.

Fig. 8 shows a flow diagram of the augmented security game which helps to understand the analysis in Sec. IV. Fig. 9 provides details to the discussion in Sec. III-B1 about individual household schedules and the influence of the scale attack with τ = 2.

Fig. 8. Flow diagram of the augmented security game of Table IV, including the relations between the respective quantities. The arrows indicate which strategy would be preferable in terms of the individual players' utility function. As discussed in Section IV-C, the connection between the IDS game (in green) and the proposed augmented security game is defined by the second order difference ∆ (7), which is highlighted here by the red dotted lines. Depending on the sign of ∆ (7), the direction of the arrows varies as illustrated in the three cases. Note that the double line represents equality.

Fig. 9. Individual household schedules under the scale attack with τ = 2. Each column corresponds to a different participation rate, i.e. from left to right ρ = 28%, ρ = 52% and ρ = 100%. The first row shows battery schedules of each individual household; the second row shows battery schedules of each individual household under attack (note that the first household is the attacker); the third row compares aggregated loads without (dashed curves) and with (bold curves) attacks. Without attack, participation of all households, i.e. ρ = 100%, is required to flatten the aggregated load (PAR = 1.07). However, excessive battery usage by attacked households (the second row shows stronger charges and discharges) leads to a relatively flat (PAR = 1.11) aggregated load at ρ = 52%. At ρ = 100%, by contrast, the aggregated load profile is almost inverted; in this case the attacker hardly needs to use their battery.
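The error smoothing described in Appendix A can be sketched as follows. This is a minimal illustration rather than the code used in the study: the function name `simulate_forecast`, the Gaussian distribution and spread `sigma` of the raw errors e_i, their use as relative (multiplicative) errors, and the handling of the first and last values are all assumptions, since the text does not specify them.

```python
import random

def simulate_forecast(actual, sigma=0.08, seed=None):
    """Return a simulated forecast obtained by adding smoothed random errors.

    Assumptions (not specified in the text): raw errors e_i are relative
    errors drawn from a zero-mean Gaussian with standard deviation `sigma`,
    and the missing neighbour at each end of the profile is replaced by the
    end value itself.
    """
    rng = random.Random(seed)
    n = len(actual)
    raw = [rng.gauss(0.0, sigma) for _ in range(n)]  # e_i
    forecast = []
    for i in range(n):
        left = raw[i - 1] if i > 0 else raw[i]
        right = raw[i + 1] if i < n - 1 else raw[i]
        smoothed = (left + raw[i] + right) / 3.0     # E_i
        forecast.append(actual[i] * (1.0 + smoothed))
    return forecast

# Hypothetical flat demand profile with 24 hourly values.
demand = [1.0] * 24
forecast = simulate_forecast(demand, seed=1)
mean_abs_error = sum(abs(f - d) / d for f, d in zip(forecast, demand)) / len(demand)
print(f"mean absolute relative error: {mean_abs_error:.3f}")
```

Averaging three neighbouring errors reduces their spread, so `sigma` would need to be tuned to reach a given target average error such as the one quoted in the appendix.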