A Cooperative Learning Approach for Decentralized Peer-to-Peer Energy Trading Markets And Its Structural Robustness Against Cyberattacks

Peer-to-peer (P2P) energy trading has recently emerged as a promising paradigm for integrating renewable and distributed energy resources into local energy grids in the presence of active prosumers. However, prosumers often have different preferences on energy trading price and amount. Therefore, in decentralized P2P energy markets, a negotiation between prosumers is needed to obtain a commonly satisfactory set of preferences, i.e., a market-clearing solution. To achieve that, this paper proposes a novel approach in which a decentralized inverse optimization problem is solved by prosumers to cooperatively learn to set their objective function parameters, given their preferential intervals of energy prices and amounts. As such, prosumers' parameters can be determined in specific intervals computed analytically from the lower and upper bounds of their preferential intervals, if a certain learning condition is satisfied. Next, the structural robustness of the prosumers' cooperative learning against the malicious and Byzantine models of cyberattacks is studied with the weighted-mean-subsequence-reduced (WMSR) resilient consensus algorithm. A novel sufficient robustness condition is then derived. Finally, case studies are conducted on the IEEE European Low Voltage Test Feeder system to validate the effectiveness of the proposed theoretical results.


DER: Distributed energy resource.
MAS: Multi-agent system.
P2P: Peer to peer.
$P_{ij}$, $P_i$: Traded power/energy between peers $i$ and $j$, and vector of peer $i$'s traded powers/energy.
$a_{i,s}$, $b_{i,s}$: Parameters of selling prosumer/peer $i$'s cost function.
$a_{i,b}$, $b_{i,b}$: Parameters of buying prosumer/peer $i$'s cost function.
$t$, $T$: Time step, and the number of considered time steps.
$\mathbf{1}_n$, $I_n$: Vector with $n$ elements equal to 1, and $n \times n$ identity matrix.
diag{}, vec(): Diagonal or block-diagonal matrices, and stacked vector.
$\mathbb{R}$, $\mathbb{R}^n$, $\mathbb{R}^{n \times m}$: Set of real numbers, real $n$-dimensional vectors, and real matrices with dimensions $n \times m$.
$\lfloor \cdot \rfloor$, $\lceil \cdot \rceil$: Flooring and ceiling functions.

I. INTRODUCTION
The wide adoption of renewable energy and DERs around the world, as part of the effort to reduce carbon emissions, poses many challenges to the operation and management of energy systems, but also brings opportunities to develop novel approaches for revolutionizing them. The P2P energy system is one such approach that has recently attracted much attention, owing to its suitability for integrating renewables and DERs and to many other advantages [1]-[4]. For example, P2P energy systems have the potential to reduce energy losses (through energy transfer within local areas), provide flexibility for demand-side management services, offer better security and privacy protection with distributed ledger technologies, and support multiple business models [2], [3], [5], [6], to name a few. In addition, P2P energy systems strengthen the role of prosumers, who are both energy producers and consumers, allowing them to participate proactively in energy markets instead of being just passive consumers. Each peer/prosumer in a P2P energy system can directly communicate and trade energy with other peers/prosumers (similar to the P2P protocol in computer science), which is essentially different from pool-based energy markets. Therefore, new approaches for the operation and management of P2P energy markets need to be developed. Hitherto, a body of work has proposed distinct P2P energy trading mechanisms, including bilateral contracts [7]-[10], game-theoretic approaches [3], [6], [11]-[13], distribution optimal power flow [14], supply-demand ratio based pricing [15], mixed performance indexes [16], multi-class energy management [17], continuous double auction [14], etc. Multi-agent based decentralized optimization and control algorithms, e.g., [7], [18], have also been derived for P2P energy systems in which each agent acts on behalf of a prosumer. Dynamic pricing and demand response can also be used to design trading schemes in P2P energy markets; see, e.g., [19], [20].
Different characteristics of decentralized P2P energy trading markets have recently been analyzed; see, e.g., [7], [18]. For instance, there can be a unique or multiple P2P market-clearing prices, while clustered P2P energy markets can also exist, partly due to unsuccessful energy transactions [18]. Two fundamental assumptions have been commonly employed in the P2P energy systems literature: one is that all prosumers trade successfully, and the other is that each prosumer selects the right cost function parameters to obtain the expected energy transactions. However, those assumptions can easily be violated in realistic situations because: (i) prosumers have distinct preferences on the amounts of power/energy and the energy prices to be traded; and (ii) the successfully traded power/energy and prices depend on the private information of all participating prosumers, which is unavailable to an individual prosumer.
To relax the abovementioned assumptions, a heuristic approach for selecting the parameters of prosumers' cost functions in P2P energy systems composed of multiple selling and buying prosumers was introduced in [18]. This heuristic approach has been shown to be quite effective in assuring the success of P2P energy trading and in increasing the traded energy amounts, and has been applied to several decentralized P2P energy trading systems [18], [21]. In a more recent work [22], an analytical method was introduced for the cooperative learning of prosumers/agents, but only for special cases where only one selling or one buying prosumer exists.
Meanwhile, information exchange between prosumers is inevitable in P2P energy trading, and it is threatened by cyberattacks. Without suitable protection, intruders can, via cyberattacks, break down the energy trading or arbitrarily alter its outcomes. Despite this critical issue, the structural robustness of P2P energy trading systems under cyberattacks has not been mathematically investigated, to the best of our knowledge. Instead, blockchains and smart contracts have often been employed to secure the energy transactions between prosumers; see, e.g., [23]-[25]. However, the consensus algorithms underlying blockchains are well known to scale poorly and to be time- and energy-consuming.
This work fills the research gaps described above and makes the following contributions. To the author's knowledge, this is the first time these results have been reported in the literature.
• An analytical cooperative learning approach for setting the parameters of prosumers' cost functions to guarantee their energy preferences, in decentralized P2P energy systems consisting of multiple buying and multiple selling prosumers.
• A tight certificate for resilient consensus, derived for P2P energy trading markets with bipartite structures under Byzantine and malicious cyberattack models. This result is much stronger than the existing ones in [26] when applied to the same system.
The rest of this paper is organized as follows. Section II presents the considered P2P energy systems and their decentralized inverse optimization problem. Then a decentralized analytical cooperative learning approach for prosumers is proposed in Section III to solve the introduced inverse problem. Next, the structural robustness of P2P energy systems with the WMSR algorithm against cyberattacks is studied in Section IV. Afterward, case studies are presented in Section V to demonstrate the proposed theoretical results. Finally, Section VI concludes the paper and provides directions for future research.

II. P2P ELECTRICITY TRADING PROBLEM
Consider the P2P energy trading in an energy system consisting of n prosumers during a time interval [1, T], in which each prosumer is regarded as a peer or agent who can both produce and consume power. In addition, prosumers are assumed to behave non-strategically.
Let $P_{ij}(t)$ be the traded power/energy between prosumers $i$ and $j$ at time step $t$, where $P_{ij}(t) > 0$ ($< 0$) means that prosumer $i$ sells to (buys from) prosumer $j$. Furthermore, to simplify the prosumers' power/energy trading, we assume that at each time step a prosumer either sells or buys power, but does not do both.

A. COMMUNICATION STRUCTURE BETWEEN PROSUMERS
The inter-prosumer communication structure is bipartite and time-varying (see, e.g., [18, Figure 2] for an illustration), and is represented by a graph $G(t)$, which is undirected because of the bilateral trading between prosumers. The node set $V(t)$ of $G(t)$ consists of two disjoint subsets $V_s(t)$ and $V_b(t)$ corresponding to selling and buying prosumers, respectively. Next, denote by $N_i(t)$ the neighboring set of prosumer $i$ at time step $t$, i.e., the set of other prosumers with which it communicates. Let $0 \le a_{ij}(t) \le 1$ be the elements of the inter-prosumer communication matrix $A(t)$, where $a_{ij}(t) > 0$ means prosumers $i$ and $j$ are connected at time step $t$, and $a_{ij}(t) = 0$ otherwise. Moreover, $A(t)$ is a symmetric and doubly-stochastic matrix (i.e., the row sums and column sums of $A(t)$ are all equal to 1).
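As an illustration of how a communication matrix with the stated properties (symmetric, doubly-stochastic, entries in [0, 1]) could be obtained, the following sketch builds one with Metropolis weights on a small bipartite graph. The Metropolis rule, the function name `metropolis_weights`, and the 2-seller/3-buyer example are assumptions made here for illustration; the paper does not specify how $A(t)$ is constructed.

```python
def metropolis_weights(neighbors):
    """Build a symmetric, doubly-stochastic matrix for an undirected graph.

    neighbors: dict mapping each node to the set of its neighbors.
    Assumption: Metropolis weights are used; the source does not state the rule.
    """
    nodes = sorted(neighbors)
    index = {v: k for k, v in enumerate(nodes)}
    n = len(nodes)
    A = [[0.0] * n for _ in range(n)]
    for i in nodes:
        row = index[i]
        for j in neighbors[i]:
            # Edge weight depends on the larger of the two node degrees.
            A[row][index[j]] = 1.0 / (1 + max(len(neighbors[i]), len(neighbors[j])))
        # The self-weight absorbs the remainder so that the row sums to 1.
        A[row][row] = 1.0 - sum(A[row])
    return A


# Hypothetical example: sellers 0 and 1, buyers 2, 3 and 4,
# with every seller connected to every buyer.
neighbors = {0: {2, 3, 4}, 1: {2, 3, 4}, 2: {0, 1}, 3: {0, 1}, 4: {0, 1}}
A = metropolis_weights(neighbors)
for row in A:
    print([round(w, 3) for w in row])
```

Because each edge weight is the same seen from either endpoint, the matrix is symmetric, and symmetric row-stochastic matrices are automatically column-stochastic.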

B. OBJECTIVE FUNCTION
Denote the total traded power/energy of prosumer $i$ by $P_{i,tr}(t) = \mathbf{1}_{n_i(t)}^T P_i(t)$. Here, the notation $P_{i,tr}(t)$ is used to simplify the representation of results, and subsequently it will be replaced by $P_{i,s}(t)$ or $P_{i,b}(t)$, depending on whether the associated prosumer/agent is a seller or a buyer, to clearly distinguish its role.
Let C i (P i (t)) denote the overall cost function of prosumer i for trading in the P2P energy market. Individual components of C i (P i (t)) are presented below.
Eq. (1a) gives the production cost and consumer utility function of each prosumer, assumed to be quadratic [7] (see [27] for more insight into this quadratic form for energy generators, loads, and storage devices). Piecewise linear functions may also be used to represent the production cost and consumer utility function, but they are non-differentiable and could be non-convex, hence they are not considered here. The time-varying private parameters $a_i(t) > 0$ and $b_i(t) > 0$ capture the time-varying and complex behaviors of prosumers. Eq. (1b) is the total trading cost of the energy transactions with other prosumers, where $p^e_{ij}$ is the energy price of the individual bilateral trade between prosumers $i$ and $j$. Lastly, Eq. (1c) is the implementation cost paid to the bulk power grid for physically executing P2P energy transactions, with a fixed rate $\beta > 0$. Another component representing the bilateral trading cost between prosumers/agents could be included (see, e.g., [7], [18]); however, it is ignored here for simplicity.
The overall cost function $C_i(P_i(t))$ in (2) is the sum of its components in (1a)-(1c), and is a convex function.

C. SYSTEM CONSTRAINTS
Let $[1, T]$ be the trading period. The constraint induced by the bilateral trading between prosumers is given in (3), and the limits on the power/energy that can be traded are expressed in (4). Such limits are based on each prosumer's own profiles of power generation and consumption, as well as on guidance from power system operators, if any, to avoid physical problems (e.g., violations of grid voltage and frequency constraints) that may affect grid stability, reliability, etc. Power flow constraints are not included in the prosumers' optimal energy management problem. Instead, prosumers pay the cost shown in (1c) for power network operators to handle such network constraints. Time-binding constraints can also be integrated with (4), similarly to what was shown in [18], [21]; hence the details are omitted here for brevity.
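Under the sign convention stated at the beginning of this section ($P_{ij}(t) > 0$ when prosumer $i$ sells to prosumer $j$), whatever one prosumer sells to another is exactly what the other buys. The bilateral trading constraint can therefore be sketched as below. This is a reconstruction from the sign convention rather than a verbatim copy of the numbered equation, and the quantifier over $N_i(t)$ and $V(t)$ is an assumption.

```latex
% Sketch: bilateral trading balance implied by the sign convention
% (P_{ij}(t) > 0 means prosumer i sells to prosumer j).
P_{ij}(t) + P_{ji}(t) = 0, \qquad \forall j \in N_i(t),\ \forall i \in V(t).
```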

D. DECENTRALIZED FORWARD OPTIMIZATION PROBLEM
The market-clearing problem for P2P energy markets, which is often studied in the literature, is as follows.
Decentralized Forward Optimization Problem for P2P Energy Market: Given the parameters $a_i$ and $b_i$ of the prosumers'/agents' cost functions (2), find the market-clearing energy price and trading amounts of all prosumers/agents. This forward optimization problem is written as (5), which is in fact similar to the dynamic social welfare maximization problem for pool-based markets (see, e.g., [28], [29] and references therein). Because of the bilateral trading, the sum of all $C_{i,2}(P_i(t))$ vanishes in (5a), and only $C_{i,1}(P_i(t))$ and $C_{i,3}(P_i(t))$ remain. As shown in [18], (5) can be solved at each time step; hence the time index is dropped hereafter for conciseness. Solving (5) then gives the P2P market-clearing energy price and the power/energy trading amounts $P_{ij}$.
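To make the forward problem concrete, the sketch below computes a clearing price and the total traded powers in closed form for a deliberately simplified setting. It assumes that the quadratic component has the form $a_i P_{i,tr}^2 + b_i P_{i,tr}$, that the implementation cost (1c) and the power limits are ignored, and that a single price clears the market so that the total traded powers sum to zero. These are assumptions for illustration, not the paper's exact formulation, and the function names are hypothetical.

```python
def clearing_price(a, b):
    """Price at which the total traded powers sum to zero.

    Assumed model: minimize sum_i (a_i * P_i**2 + b_i * P_i) subject to sum_i P_i = 0.
    Stationarity gives 2 * a_i * P_i + b_i = lam, hence the formula below.
    """
    return sum(bi / ai for ai, bi in zip(a, b)) / sum(1.0 / ai for ai in a)


def traded_power(a, b, lam):
    """Total traded power of each prosumer (positive = sells, negative = buys)."""
    return [(lam - bi) / (2.0 * ai) for ai, bi in zip(a, b)]


# Hypothetical two-prosumer market: prosumer 0 sells, prosumer 1 buys.
a = [0.5, 0.5]
b = [20.0, 24.0]
lam = clearing_price(a, b)
P = traded_power(a, b, lam)
print(lam, P)
```

In this simplified model a prosumer with $b_i$ below the clearing price sells and one with $b_i$ above it buys, which is why the choice of $a_i$ and $b_i$ determines whether a prosumer trades successfully.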

E. DECENTRALIZED INVERSE OPTIMIZATION PROBLEM
An issue that arises when solving the decentralized forward optimization problem (5) is that some prosumers might be unsuccessful in trading, as shown in [18]. Another issue is that the energy trading price, i.e., the dual variable associated with the equality constraint (5b), might not be satisfactory to all prosumers. These issues stem from the prosumers' different preferences on energy trading price and amount, caused by, for instance, uncertainties in prosumers' renewable generation and energy consumption, or distinct capacities and sources of prosumers' energy assets.
In current electricity markets, the above-mentioned issues might be acceptable, hence the parameters of participants' cost functions are fixed. Nevertheless, P2P electricity trading markets are expected to benefit all of their participating prosumers and consumers. Therefore, it is arguable that in P2P electricity trading markets the participants' cost function parameters could be adjustable, so that all participants can successfully trade with their expected electricity trading price and amounts. That led to the development of heuristic learning methods in [18] to tune the parameters $a_i$, $b_i$ in the prosumers' cost functions. Following this line of research, the current work aims to derive an analytical learning method by solving the inverse problem described below, before solving the forward optimization problem (5).
VOLUME 4, 2016
Decentralized Inverse Optimization Problem for P2P Energy Market: Each participating prosumer/agent $i$, $i = 1, \ldots, n$, sets the following a priori.
• Its preferred range $[\underline{\lambda}_i, \overline{\lambda}_i]$ of P2P energy trading price.
• Its preferred range $[\underline{P}_{i,b}, 0)$ or $(0, \overline{P}_{i,s}]$ of power amount when it is a buying or selling prosumer, respectively.
The inverse problem here is to find the parameters $a_i$ and $b_i$ of the prosumers' cost functions (2) that achieve successful energy trading with the above desired quantities.
This inverse problem was investigated for a special context with only one selling or one buying prosumer in [22]. The current research proposes an approach for the general scenario of multiple buying and selling prosumers, hence the results derived here include that in [22] as special cases.
To illustrate the relation between the forward and inverse problems described above, Figure 1 considers one selling and one buying prosumer whose parameters $a_i$ can vary within some intervals. Accordingly, the market-clearing solution of the forward optimization problem, i.e., the optimal set of energy price and amounts, belongs to the smaller, gray-shaded region in Figure 1. This region could be very complicated when multiple selling and buying prosumers are considered, so it is not easy for prosumers to cooperatively learn from it. This motivates us to study the inverse optimization problem, in which the energy preferences of prosumers are known, as represented by the larger, yellow-shaded rectangle in Figure 1, and from which we need to find the variation ranges of the prosumers' parameters. Once those ranges are found, solving the forward optimization problem will surely give a market-clearing solution in the gray-shaded region, and thus in the yellow-shaded rectangle, i.e., the prosumers' energy preferences are guaranteed.
Next, the proposed approach to guarantee the energy preferences in P2P energy trading markets is summarized in Figure 2. Our focus in this paper is on Step 1, i.e., the decentralized inverse optimization problem, whereas the decentralized forward optimization problem in Step 2 (deriving the optimal energy trading price and amounts) can be found in the existing literature.
FIGURE 2. Two-step procedure for preferences-guaranteed P2P energy trading markets.

III. ANALYTICAL COOPERATIVE LEARNING FOR GUARANTEEING ENERGY PREFERENCES OF PROSUMERS
Define the Lagrangian (6) associated with (5), where $\lambda_{ij}$ are the Lagrange multipliers associated with the power/energy trading equations (5b), which are regarded as the market-clearing prices for energy transactions between pairs of prosumers.
When the inequality constraints (5c)-(5d) are omitted and the communication graph between successfully trading peers is connected, the P2P energy clearing price was shown to be unique [18] and is given by (7), while the optimal total trading power/energy of each peer/prosumer is given by (8). Formulas (7)-(8) of the forward optimization problem serve as the basis for the proposed inter-prosumer cooperative learning approach to the inverse optimization problem described in Section II-E.
First, prosumers with individual price ranges $[\underline{\lambda}_i, \overline{\lambda}_i]$ need to negotiate a common range; otherwise, energy trading will not happen. There are several ways to do so, e.g., min-max and averaging. In the former, the maximum of the lower bounds and the minimum of the upper bounds of the prosumers' price intervals are derived, while in the latter the averages of the lower and upper bounds are obtained. For simplicity, we assume that the preferred price intervals of the prosumers overlap, hence either strategy could be used. Here, we choose the averaging method, in which the ordinary consensus algorithm (9) is run on the bounds $\underline{\lambda}_i$ and $\overline{\lambda}_i$ (see [22] and references therein for the proof of convergence).
Next, based on interval analysis, the following theorem reveals the intervals for the prosumers' parameters such that the inverse optimization problem in Section II-E is solved.
Theorem 1: Having the consensus price range $[\underline{\lambda}, \overline{\lambda}]$, the lower bounds $\underline{P}_{i,b}$ of the power amounts to be bought by buying prosumers, and the upper bounds $\overline{P}_{i,s}$ of the power amounts to be sold by selling prosumers, the conditions (10) are sufficient to strictly satisfy the constraints (5c)-(5d).
Proof: See Appendix VII-A.
It can be observed from (10) that, in order to satisfy the global condition (10d), the private parameters $\underline{b}_b$, $\overline{b}_b$, $\underline{b}_s$, $\overline{b}_s$ need to be exchanged between prosumers. This is not acceptable from the prosumers' privacy-preserving viewpoint.
Therefore, in the following, other sufficient conditions are proposed, which are more conservative than those in (10) but better in terms of privacy protection. Denote by $\xi$ the quantity determined by the sums $\sum_{i \in V_s} \overline{P}_{i,s}$ and $\sum_{i \in V_b} \underline{P}_{i,b}$.
Theorem 2: Let $\gamma > 3$, $0 < \gamma_s < 2$, and $0 < \gamma_b < 2$ be such that (11) holds. Then the conditions (12) are sufficient to guarantee the strict feasibility of the constraints (5c)-(5d).
Proof: See Appendix VII-B.
Conditions (11)-(12) in fact show a way to satisfy the conditions (10) of Theorem 1, based on interval analysis. Once the new global condition (11) is fulfilled, the parameters $a_{i,b}$, $a_{i,s}$, $b_{i,b}$, $b_{i,s}$ of the prosumers' cost functions are randomly selected in a fully decentralized manner as in (12). Additionally, only the upper or lower bounds on the trading powers of prosumers are exchanged in (11), instead of the private cost function parameters exchanged in (10d). Thus, the privacy of prosumers is preserved.
Note that there are infinitely many choices for the parameters $\gamma$, $\gamma_s$, $\gamma_b$ in (11), given the values of $\underline{P}_{i,b}$ and $\overline{P}_{i,s}$. Moreover, among those three parameters, $\gamma$ is the least restricted. Therefore, we can choose $\gamma_s$ and $\gamma_b$ first, then select $\gamma$ appropriately to satisfy (11). One possibility is shown in the following corollary.
Corollary 1: Let $\gamma_s = 1$, $\gamma_b = 1$, and choose $\gamma > 4$ such that (13) holds. Then the strict feasibility of the constraints (5c)-(5d) is guaranteed by the selections of prosumers' cost function parameters in (12).
Corollary 1 follows directly from Theorem 2, so its proof is omitted for brevity.
Remark 1: It is easy to select $\gamma$ to satisfy (13) or (11), simply by making it sufficiently large. As $\gamma$ grows, the interval $\left(\frac{2}{\gamma-2}, \frac{\gamma-2}{2}\right)$ widens and will eventually contain $\xi$. The increase of $\gamma$ also makes $b_{i,s}$ smaller and $b_{i,b}$ bigger, within the interval $[\underline{\lambda}, \overline{\lambda}]$, in order to satisfy the constraint (5d), i.e., to obtain successful trading. This is indeed in line with the heuristic learning strategy proposed in [18], and can be used to explain that strategy analytically. However, there is also a tradeoff for the proposed analytical cooperative learning, since the parameters of prosumers' cost functions cannot be arbitrarily increased or decreased as in [18] but are bounded by (12).
The global conditions (11) and (13) can be verified analytically in a decentralized fashion through prosumer cooperation. First, buying prosumers broadcast their lower bounds $\underline{P}_{i,b}$ of buying powers to all selling prosumers, and vice versa selling prosumers broadcast their upper bounds $\overline{P}_{i,s}$ of selling powers to all buying prosumers. Second, each buying prosumer calculates $\sum_{i \in V_s} \overline{P}_{i,s}$ and sends it back to the selling prosumers. Meanwhile, each selling prosumer computes $\sum_{i \in V_b} \underline{P}_{i,b}$ and sends it back to the buying prosumers. As a result, each prosumer can calculate $\xi$. Then prosumers choose $\gamma$, $\gamma_s$, $\gamma_b$ to satisfy (11) or (13). For example, since (13) is equivalent to (14), each prosumer can choose an initial value of $\gamma$ satisfying (14), denoted by $\gamma_i$. Afterward, a decentralized consensus algorithm, similar to (9), is run by all prosumers to derive the average of the $\gamma_i$, which certainly satisfies (14). This average value is then used as $\gamma$ by all prosumers to choose their private parameters as in (12).
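The two negotiation strategies for the price range mentioned in this section (min-max and averaging) can be contrasted with a small sketch. It assumes that the ordinary consensus iteration takes the linear form x(k+1) = A x(k) with a doubly-stochastic matrix A, which is a common form but is not confirmed by the text shown here. The four-prosumer graph, the weights of 1/3, and the price bounds are hypothetical values chosen for illustration.

```python
def min_max_range(lower, upper):
    """Min-max strategy: largest lower bound and smallest upper bound."""
    return max(lower), min(upper)


def average_consensus(A, x0, steps=100):
    """Iterate x(k+1) = A x(k); with a doubly-stochastic A on a connected
    graph, every entry approaches the average of the initial values."""
    x = list(x0)
    n = len(x)
    for _ in range(steps):
        x = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    return x


# Hypothetical bipartite network: sellers 0, 1 and buyers 2, 3.
# Each node weights itself and its two neighbors by 1/3.
w = 1.0 / 3.0
A = [[w, 0, w, w],
     [0, w, w, w],
     [w, w, w, 0],
     [w, w, 0, w]]

lower = [21.0, 21.4, 20.0, 20.6]   # preferred lower price bounds (JPY/kWh)
upper = [23.8, 23.4, 22.6, 23.0]   # preferred upper price bounds (JPY/kWh)

print(min_max_range(lower, upper))          # min-max strategy
lower_avg = average_consensus(A, lower)     # averaging strategy, lower bound
upper_avg = average_consensus(A, upper)     # averaging strategy, upper bound
print(lower_avg[0], upper_avg[0])
```

The min-max range is always contained in every prosumer's own interval, whereas the averaged range is generally wider and may extend beyond some individual intervals. This difference is worth keeping in mind when reading the case-study results.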
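The decentralized selection of $\gamma$ described above can be sketched numerically. The sketch assumes that the condition to be met is that $\xi$ lies strictly inside the interval $(2/(\gamma-2), (\gamma-2)/2)$ mentioned in Remark 1, and that $\xi$ is the ratio between the magnitude of the buyers' summed lower bounds and the sellers' summed upper bounds. Both are readings of partly lost formulas, and the function names are hypothetical. Under this reading the threshold is the same whichever way the ratio is taken. The bounds used are those of the case study in Section V.

```python
def xi_ratio(P_s_upper, P_b_lower):
    """Assumed definition: |sum of buyers' lower bounds| / sum of sellers' upper bounds."""
    return abs(sum(P_b_lower)) / sum(P_s_upper)


def gamma_threshold(xi):
    """Smallest value that gamma must exceed so that
    2 / (gamma - 2) < xi < (gamma - 2) / 2 holds (for gamma > 2)."""
    return 2.0 + 2.0 * max(xi, 1.0 / xi)


# Case-study bounds: 25 sellers with at most 2 kW, 30 buyers with at most 3 kW.
P_s_upper = [2.0] * 25
P_b_lower = [-3.0] * 30

xi = xi_ratio(P_s_upper, P_b_lower)
threshold = gamma_threshold(xi)

# Each prosumer picks its own gamma_i above the threshold (hypothetical picks);
# the average of such values is also above the threshold.
gamma_local = [threshold + 0.05 * (k + 1) for k in range(len(P_s_upper) + len(P_b_lower))]
gamma_common = sum(gamma_local) / len(gamma_local)
print(xi, threshold, gamma_common)
```

With these numbers the threshold evaluates to about 5.6, matching the bound reported for the case study, which lends some support to the assumed reading of the condition.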
Remark 2: An essential point in our proposed analytical cooperative learning approach for solving the decentralized inverse optimization problem in P2P energy trading markets is the global inequality stated in (10d), (11), or (13).
As discussed above, this can be achieved via consensus algorithms whose scalability has been widely reported in the literature. Therefore, our proposed analytical cooperative learning approach is scalable.

Remark 3:
The results shown in Theorems 1-2 and Corollary 1 are derived for P2P energy systems with multiple buying and multiple selling prosumers. Those results will be simpler when only one selling or buying prosumer exists.
If there is only one selling prosumer, then the first inequality on the left-hand side of (10d) becomes trivial, because in this case $\underline{b}_s = \overline{b}_s$; hence it can be omitted. Similarly, if there is only one buying prosumer, then the second inequality on the right-hand side of (10d) is not needed. Accordingly, the results in Theorem 2 can be simplified, as shown below.

Corollary 2:
If there is only one buying prosumer, let γ > 1. Then parameters of prosumers' cost functions can be chosen as follows, which are sufficient for the strict feasibility of the constraints (5c)-(5d). Here, a b , b b are parameters of the buying prosumer's cost function, and P b is the minimum amount of power it wants to trade.
Proof: See Appendix VII-C.
Corollary 3:
If there is only one selling prosumer, then parameters of prosumers' cost functions can be chosen as follows, which are sufficient for the strict feasibility of the constraints (5c)-(5d). Here, $a_s$, $b_s$ are parameters of the selling prosumer's cost function, and $P_s$ is the maximum amount of power it wants to trade.
Proof: See Appendix VII-D.
The results provided in Corollaries 2-3 include those in [22] as a special case, where k was set to 2 in [22].

IV. ROBUSTNESS OF P2P ENERGY SYSTEMS UNDER COMPROMISED PEERS
As seen in Section III, inter-prosumer negotiation via consensus algorithms is essential for the cooperative learning of prosumers. Such communication-based negotiation is a potential target for cyberattacks. Therefore, this section is devoted to studying the structural robustness of the considered P2P energy trading markets against cyberattacks, in the presence of a specific resilient consensus algorithm.
First, we introduce the models of cyberattacks [26] considered in this research, which result in misbehaving (compromised) nodes in P2P energy systems.
Definition 1 (Byzantine node): A node $i \in V$ is said to be Byzantine if it sends different values to different neighbors at some time step, or if it applies some other update rule at some time step.
Definition 2 (Malicious node): A node $i \in V$ is said to be malicious if it sends $x_i(k)$ to all of its neighbors at each time step $k$ but applies some other update rule at some time step.
Throughout this work, we consider scenarios in which the set of adversary nodes is $F$-local or $f$-fraction local. Consequently, the following notions of robustness [26] are employed for P2P energy systems.
Definition 5 ($r$-reachable set): A nonempty set $S \subset V$ is $r$-reachable ($r \ge 0$) if $\exists j \in S$ such that $|N_j \setminus S| \ge r$.
Definition 6 ($p$-fraction reachable set): A nonempty set $S \subset V$ is $p$-fraction reachable ($0 \le p \le 1$) if $\exists j \in S$ such that $|N_j \setminus S| \ge p|N_j|$ and $|N_j| > 0$.
Definition 7 ($r$-robustness): A graph $(G, V, E)$ with at least two nodes is $r$-robust ($r \ge 0$) if for every pair of nonempty, disjoint subsets of $V$, at least one of the subsets is $r$-reachable.
Definition 8 (p-fraction robustness): A graph (G, V, E) with at least two nodes is p-fraction robust (0 ≤ p ≤ 1) if for every pair of nonempty, disjoint subsets of V, at least one of the subsets is p-fraction reachable.
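For small graphs, Definitions 5 and 7 can be checked directly by enumerating every pair of nonempty, disjoint subsets. The brute-force sketch below does this; it is exponential in the number of nodes and is meant only to make the definitions tangible, not to reproduce the paper's analytical result. The helper names and the complete bipartite test graph are assumptions for illustration.

```python
from itertools import product


def is_r_reachable(S, neighbors, r):
    """Definition 5: some node in S has at least r neighbors outside S."""
    return any(len(neighbors[j] - S) >= r for j in S)


def is_r_robust(neighbors, r):
    """Definition 7, checked by brute force over all pairs of disjoint subsets."""
    nodes = sorted(neighbors)
    # Label each node: 0 = in neither subset, 1 = in S1, 2 = in S2.
    for labels in product((0, 1, 2), repeat=len(nodes)):
        S1 = {v for v, lab in zip(nodes, labels) if lab == 1}
        S2 = {v for v, lab in zip(nodes, labels) if lab == 2}
        if not S1 or not S2:
            continue
        if not (is_r_reachable(S1, neighbors, r) or is_r_reachable(S2, neighbors, r)):
            return False
    return True


def max_robustness(neighbors):
    """Largest r for which the graph is r-robust."""
    r = 0
    while is_r_robust(neighbors, r + 1):
        r += 1
    return r


def complete_bipartite(n_s, n_b):
    """Hypothetical test graph: every seller is connected to every buyer."""
    sellers = range(n_s)
    buyers = range(n_s, n_s + n_b)
    neighbors = {i: set(buyers) for i in sellers}
    neighbors.update({j: set(sellers) for j in buyers})
    return neighbors


K33 = complete_bipartite(3, 3)
print(max_robustness(K33))
```

The enumeration visits 3^n labelings for n nodes, so it is practical only for roughly a dozen nodes. The 55-node case study would need the analytical certificate instead.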
Consequently, in order to cope with the cyberattacks resulting in misbehaving agents as described in the attack models above, the WMSR algorithm [26] below is employed.

Algorithm 1 WMSR Algorithm
for k = 1, 2, . . . do
• Each normal agent i receives its neighbors' states and sorts them.
• If there are fewer than F neighbors' state values strictly larger than agent i's state value, then it removes all such values; otherwise it removes precisely the largest F values. Likewise, if there are fewer than F values strictly smaller than its own state value, it removes all of them; otherwise it removes precisely the smallest F values.
• Agent i updates its state as a weighted average of its own state and the retained neighbors' states, where $N_i(k)$ is the set of agent i's neighbors whose state values are kept.
end for
Next, let us denote the quantities in (18). Then the robustness of the considered P2P energy systems with bipartite structures under the different cyberattack models defined above is revealed in the following theorem.
Theorem 3: A bipartite graph with $n_s + n_b$ nodes ($n_s, n_b \ge 2$) achieves resilient asymptotic consensus by the WMSR algorithm under: (i) the $F$-local or $f$-fraction local malicious model, (ii) the $F$-local or $f$-fraction local Byzantine model, where $F$ equals the integer bound defined in (18), $0 < f < h/2$, and $h$ is also defined in (18).
Proof: The key point is to show that a bipartite graph with $n_s + n_b$ nodes ($n_s, n_b \ge 2$) is robust in the sense of Definition 7, with $r$ equal to the integer bound in (18), and $h$-fraction robust in the sense of Definition 8, in the presence of the WMSR algorithm. This follows from Definition 7, Definition 8, and the structure of bipartite graphs.
The results in Theorem 3 are tight, and are much stronger than the robustness obtained by a direct use of [26], for which the admissible locality bound of the malicious or Byzantine model is only the integer bound in (18) minus one, divided by two. This is the main contribution of this section.
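The filtering-and-averaging step of Algorithm 1 can be sketched as follows. Several choices here are assumptions rather than details taken from the paper: equal weights over the retained values, a static complete bipartite graph with five sellers and five buyers, and a single compromised seller that holds a constant value, which matches the malicious model of Definition 2. The parameter F = 1 is chosen so that the standard sufficient condition from the resilient consensus literature (a (2F+1)-robust graph) is met by this small graph.

```python
def wmsr_step(x, neighbors, F, misbehaving=frozenset()):
    """One synchronous WMSR update. x maps node -> state value."""
    new_x = dict(x)
    for i, nbrs in neighbors.items():
        if i in misbehaving:
            continue  # a compromised node ignores the rule (here: keeps its value)
        own = x[i]
        received = sorted(x[j] for j in nbrs)
        larger = [v for v in received if v > own]
        smaller = [v for v in received if v < own]
        equal = [v for v in received if v == own]
        # Drop the F largest of the strictly larger values (all of them if fewer than F).
        kept_larger = larger[:max(len(larger) - F, 0)]
        # Drop the F smallest of the strictly smaller values (all of them if fewer than F).
        kept_smaller = smaller[F:]
        kept = kept_smaller + equal + kept_larger + [own]
        new_x[i] = sum(kept) / len(kept)  # equal weights (assumption)
    return new_x


# Hypothetical network: sellers 0-4 and buyers 5-9, fully connected across sides.
sellers, buyers = range(0, 5), range(5, 10)
neighbors = {i: set(buyers) for i in sellers}
neighbors.update({j: set(sellers) for j in buyers})

# Seller 0 is compromised and keeps reporting 100.0 (hypothetical attack value).
x = {0: 100.0, 1: 21.5, 2: 22.0, 3: 22.5, 4: 23.0,
     5: 20.0, 6: 20.5, 7: 21.0, 8: 21.5, 9: 22.0}
attacked = frozenset({0})

for _ in range(300):
    x = wmsr_step(x, neighbors, F=1, misbehaving=attacked)

normal_values = [x[i] for i in sorted(x) if i not in attacked]
print(min(normal_values), max(normal_values))
```

In this run the normal prosumers agree on a value inside the range of their own initial values, while the injected constant is discarded at every step. The agreed value is in general not the exact average of the normal initial values.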

V. CASE STUDIES
This section illustrates the proposed multi-seller multi-buyer analytical cooperative learning approach by applying it to the IEEE European Low Voltage Test Feeder consisting of 55 nodes [30] (Figure 3). Similarly to [18], we assume here the existence of 5.5 kW rooftop solar power generation for 25 nodes and 3 kWh battery systems for the other 30 nodes, hence each node is a potential prosumer who can perform P2P energy trading, assumed to occur every hour. Realistic hourly load patterns of all nodes are taken from [30], and solar power generation is computed based on the average daily global solar irradiance data given in [31] for Spain in July, similarly to [18]. As such, the 25 nodes with rooftop solar power generation have a maximum of approximately 2 kW of power for selling, whereas the other 30 nodes can buy a maximum of 3 kW of power, i.e., $n_s = 25$, $n_b = 30$, $\overline{P}_{i,s} = 2$ and $\underline{P}_{i,b} = -3$.
There was a feed-in-tariff system for renewable energy in Spain, but it was terminated in 2014, and it was not until recently (in 2020) that rooftop solar customers in Spain could sell power to the grid, and only in the form of communities regulated by auctions. Hence, a P2P electricity market could provide another way for rooftop solar prosumers to trade electricity. The current electricity price in Spain is ≈ 0.0075 €/kWh, which is converted to ≈ 24.8 JPY/kWh. JPY is used to achieve better computing precision and clearer figures, which would likely be lost with € or US$ as the monetary unit because of the very small numerical values. Next, we assume that selling prosumers randomly set their intervals of preferred energy prices within [21, 23.8] JPY/kWh, whereas buying prosumers randomly select their expected prices within [20, 23] JPY/kWh.
The scalability of our proposed multi-agent-based method can be verified through simulations in a similar manner as that reported in [18], hence it is omitted in the current work for conciseness.

A. NO MISBEHAVING PROSUMERS
Employing our proposed analytical cooperative learning approach, prosumers first negotiate to obtain an agreed price interval by running the consensus algorithm (9) with initial price intervals randomly generated by prosumers. The negotiation results are shown in Figure 4, which reveals that $\underline{\lambda} = 20.95$ JPY/kWh and $\overline{\lambda} = 23.81$ JPY/kWh.
Next, prosumers cooperatively learn to check the global condition (14). It turns out that the global parameter $\gamma$ should satisfy $\gamma > 5.6$. As discussed after (14), prosumers can initially choose their local copies of $\gamma$ to fulfill (14), and then run a consensus algorithm to derive a common, global value of $\gamma$. Here, we assume, for the sake of conciseness, that all prosumers reach a consensus on $\gamma = 5.7$. Then selling and buying prosumers select their cost function parameters according to (12). The distributed and parallel ADMM algorithm proposed in [18] with ρ = 0.011, ψ = 0.012, φ = 0.012 is utilized here for prosumers to obtain the optimal P2P energy trading price and amounts, which are shown in Figures 5-6, respectively. It can be observed that the optimal price is 22.07 JPY/kWh, which indeed belongs to the interval $[\underline{\lambda}, \overline{\lambda}]$.

B. EXISTENCE OF MISBEHAVING PROSUMERS
For clarity of illustrating the resilience of P2P electricity trading with the WMSR algorithm, we assume that two prosumers are suffered from a fault-data injection attack (FDIA), where constant values are added to the states of selling prosumer 1 and buying prosumer 1. As a result, during the communication of prosumers, the states of selling prosumer 1 and buying prosumer 1 are always kept constant. Therefore, under the ordinary consensus algorithm (9), the price negotiation between prosumers can reach consensus, but not to their average nor the average of unattacked prosumers, as shown in Figure 7. Instead, the consensus value will be the average of the faulty states with injected constants, as observed in Figure 7. This means the attackers can drive the consensus values for the energy price range negotiation between prosumers to any values they want by injecting some constants to several successfully attacked prosumers, if no robust consensus algorithm is used. On the other hand, with WMSR algorithm, Figure 8 depicts the negotiation of all prosumers for the range of P2P energy trading price in such scenario. As can be seen, during the negotiation for the lower and upper bounds of the energy price, the states of selling prosumer 1 and buying prosumer 1 are unchanged, however the remaining prosumers still can reach average consensus on the price range bounds. That is because the WMSR algorithm discards the two abnormal values from selling prosumer 1 and buying prosumer 1.
Next, we verify the usefulness of the theoretical results in Theorem 3. The value computed from (18) is 13; hence, we test the system under 13 FDIAs targeting 13 selling prosumers. Note that the results presented in [26] state that the considered system is robust only up to 6 FDIAs, so the bound in Theorem 3 certifies robustness against more than twice as many attacks. The simulation result is shown in Figure 9, where consensus on the price range negotiation is still achieved. However, when the number of FDIAs is increased to 14, consensus is no longer reached, as depicted in Figure 10. This confirms the tightness of the theoretical results derived in Theorem 3.

VI. CONCLUSION
This paper addresses two central issues in P2P energy systems: (i) the guarantee of prosumers' energy preferences; and (ii) the cyber robustness of inter-prosumer communications. For the first issue, an analytical cooperative learning approach is proposed in which prosumers can locally and randomly select their cost function parameters in certain intervals to achieve successful energy transactions with the desired price and amounts. The issue is formulated as a decentralized inverse optimization problem, which is solved using interval analysis and multi-agent consensus. For the second issue, a resilient consensus scheme, the WMSR algorithm, is used to detect and isolate prosumers compromised by Byzantine and malicious cyberattacks. A novel and tight sufficient condition is then derived on the structural robustness of P2P energy systems with a bipartite communication structure. This result shows that bipartite networks have much stronger structural robustness against Byzantine and malicious cyberattacks than indicated by the original result reported in [26], and hence constitutes a significant contribution of the current research.
Case studies are then conducted on the IEEE European Low Voltage Test Feeder, which validate our derived theoretical results. In particular, the proposed cooperative learning approach is shown to help prosumers attain successful trading with energy price and amounts in their desired intervals, hence bringing benefits to all of them. Next, the proposed certificate on the structural robustness of bipartite P2P energy trading markets is demonstrated to be strong and tight via simulations of the system under false data injection attacks.
On the other hand, a limitation of the proposed cooperative learning approach is that prosumers cannot adjust their parameters arbitrarily but must choose them in specific intervals, unlike the learning approach in [18]. This leaves room for future improvement. Another direction for future research is to develop different resilient consensus algorithms for dealing with cyberattacks that provide better structural robustness for networked systems such as the P2P energy trading markets considered here.

A. PROOF OF THEOREM 1
First, to guarantee the successful trading of prosumers, the constraint (5d) must be satisfied. From (8), this means $\lambda^* < b_{i,b}$ for buying prosumers, and $\lambda^* > b_{i,s}$ for selling prosumers. We have (19). Hence, a sufficient condition for $\lambda^* < b_{i,b}$ is that the right-hand side of (19) is negative, which is equivalent to (20). On the other hand, we have (21).

Therefore, a sufficient condition for $\lambda^* > b_{i,s}$ is that the right-hand side of (21) is positive, which is equivalent to (22). With the parameters $b_{i,b}$, $b_{i,s}$ selected as in (10a), and with more than one selling prosumer and more than one buying prosumer, the inequalities (20) and (22) lead to (10d). Next, the constraint (5c) is equivalent to (23). It can be deduced that (24) holds. Consequently, the condition (25) is sufficient for the first inequality in (23), and it is exactly (10b). On the other hand, (26) holds. Thus, the condition (27) is sufficient for the second inequality in (23), and it is the same as (10c).

B. PROOF OF THEOREM 2
Clearly, the conditions (12a)-(12b) are one way to satisfy (10a), where $b_{i,s}$ and $b_{i,b}$ are forced to lie in smaller, disjoint intervals. Moreover, we can easily deduce (28). On the other hand, the selections of $a_{i,s}$ and $a_{i,b}$ in (12c)-(12d) lead to (29). Subsequently, substituting (11) into (29) gives (30), which, together with (28), clearly shows that the condition (10d) is satisfied.

C. PROOF OF COROLLARY 2
If there is only one buying prosumer, then the right-hand side of (10d) is not needed. In addition, (15a), (15b), and (15c) trivially guarantee (10a), (10c), and (10b), respectively. With the choices of $b_s$ and $b_{i,s}$ as in (15a), we obtain (31). Next, the right-hand sides of (15c) and (15b) lead to (32). The combination of (31) and (32) gives the left-hand side of (10d).

D. PROOF OF COROLLARY 3
If there is only one selling prosumer, then the left-hand side of (10d) is not needed. Moreover, the conditions (10a), (10c), and (10b) are satisfied by (16a), (16b), and (16c), respectively. Then the choices of $b_b$ and $b_{i,b}$ as in (16a) give (33). On the other hand, it is deduced from the right-hand sides of (16b) and (16c) that (34) holds. Hence, the right-hand side of (10d) is guaranteed by (33) and (34).