Noncooperative Equilibrium-Seeking in Distributed Energy Systems Under AC Power Flow Nonlinear Constraints

Power distribution grids are commonly controlled through centralized approaches, such as the optimal power flow. However, the current pervasive deployment of distributed renewable energy sources and the increasing growth of active players, providing ancillary services to the grid, have made these centralized frameworks no longer appropriate. In this context, we propose a novel noncooperative control mechanism for optimally regulating the operation of power distribution networks equipped with traditional loads, distributed generation, and active users. The latter, also known as prosumers, contribute to the grid optimization process by leveraging their flexible demand, dispatchable generation capability, and/or energy storage potential. Active users participate in a noncooperative liberalized market designed to increase the penetration of renewable generation and improve the predictability of power injection from the high-voltage grid. The novelty of our game-theoretical approach consists in incorporating economic factors as well as physical constraints and grid stability aspects. Finally, by integrating the proposed framework into a rolling-horizon approach, we show its effectiveness and resiliency through numerical experiments.


I. INTRODUCTION
P OWER grids around the world have acquired an increasingly distributed feature in the last few years. Indeed, several independent players are now typically involved in the control and optimization of power grids, strongly affecting their dynamics [1]. The integration in the grid of these active users -also known as prosumers-is strictly related to the so-called demand-side management, which is recognized as a very effective solution for the planning and optimization of power grid operations. This new paradigm is widely used to tackle the complexity issues and uncertainties of modern power grids, which are mainly caused by the extensive diffusion of noncontrollable renewable energy sources (RESs) at different levels of energy distribution systems [2]. Moreover, distributed generation systems (DGSs) and distributed energy storage systems (ESSs) are currently used in power systems aiming at suitably supporting the grid operations and providing ancillary services, such as frequency regulation and voltage control. Currently, most of these new elements are mainly integrated into the medium-voltage (MV) and low-voltage (LV) power distribution networks that are characterized by a high R/X ratio (i.e., resistance to reactance ratio). Moreover, while these grids were traditionally designed for monodirectional, radial, and weakly meshed power flows, they are now massively stressed by an incessant swap in direction and magnitude of power flows, caused by the continuous integration of unpredictable RESs [3]. The integration in the grid of new proactive players, responsible for the control of various parts of power grids, can considerably reduce these issues. On the one hand, the cooperation of such independent entities is challenging. These entities act selfishly aiming at their profit; in addition, the control of a large number of independent entities can be computationally demanding, ill scalable, limitedly fault-tolerant, and poorly respectful of privacy. On the other hand, since they are physically interconnected through power lines, they must be effectively coordinated to ensure safe and secure grid operations [4].

A. Literature Review
One of the critical problems in power systems is the wellknown optimal power flow (OPF) problem, which consists in optimally scheduling the committed generators based on the results of the day-ahead economic dispatch problem [5]. In greater detail, the OPF problem has the objective of calculating the optimal operating point for each generator, resulting in the least cost required to supply a given power demand, while fulfilling the power system and equipment constraints. In case this computation is performed in a centralized approach, the so-called independent system operator must have access to all the generators' cost functions and system parameters.
However, in the presence of selfish players operating in power grids, as well as flexible loads, DGSs, and DSSs, some issues discourage the use of this centralized decision-making structure. In fact, this is usually computationally demanding, ill-scalable, This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ limitedly fault-tolerant, and poorly respectful of privacy [4]. In order to improve the drawbacks of such an architecture, several distributed/decentralized OPF and economic dispatch frameworks have been recently proposed [6], [7]. Most of these works either focus on the economic dispatch problem only or address the direct current OPF (DCOPF) system, thus ignoring a reliable model of the power network dynamics. In addition, most works deal with the high-voltage (HV) transmission network only, which is modeled more straightforwardly than medium-voltage/low-voltage (MV/LV) networks. Specifically, several economic dispatch and linearized OPF models (e.g., the DCOPF) are largely implemented in a decentralized architecture [8], [9]. However, these models are not applicable to MV/LV distribution networks, where different physical assumptions must be made, and they cannot fully capture the behavior of independent elements in the grid. Only very few distributed approaches, such as [10], employ convexified models for distribution networks, for instance, based on conic relaxation. Nevertheless, this model may include some error and inexactness.
All these strategies focus on the control of cooperative integrated elements: In fact, the final goal of these approaches is to optimize the global welfare rather than the single players' revenues. Conversely, to cope with the situation where several noncooperative agents are involved in the optimization of power grids, game-theoretic methods have been proposed for the generation and storage control in power grids. These control approaches are based on centralized architectures or on distributed ones that have the advantage of requiring only a little sharing of private information [11], [12]. In detail, several works focus their efforts on defining efficient noncooperative decentralized control policies that rely on effective models of power distribution components. For instance, the authors in [13] and [14] propose a noncooperative approach for distributed generation in power systems, considering economic and power flow constraints. However, the common weakness of these works is the inadequate representation of the underlying power system. Indeed, their main focus is on the game perspective, while physical constraints are partially disregarded. Several studies (e.g., [15], [16]) include a constraint barely on the mismatch between the generated and the requested power. Nevertheless, this assumption is sufficient only during the economic dispatch phase, while it is not adequate for real-time control of the grid.
Summing up, in the case of HV power grids, certain simplifying assumptions hold and, thus several approximated convex models ensure a reliable solution of the control problem using game-theoretical techniques. On the contrary, in distribution grids, these hypotheses become inappropriate; hence, the classical ac power flow must be used. In particular, in case the underlying models are nonconvex and, in general, nonlinear, few game-theoretical approaches are available in the literature, while their application to power systems is, to the best of our knowledge, almost absent.

B. Article Contribution
The above reported literature review shows that little attention has been devoted to game theory-based frameworks able to satisfy the ac power flow constraints. Therefore, in this article, we propose a reliable distribution grid model that comprehends several buses connected with noncontrollable loads and RESs as well as active users. The distribution grid is connected with the HV transmission grid through a slack bus from which power is injected. The distribution system operator (DSO) schedules this injection over a control horizon, to satisfy the load demand and taking into account the RESs' production. However, due to the uncertainties in both demand and production, this schedule may be not accurate. For the sake of increasing the predictability of the distribution grid, from the HV grid's perspective, the DSO encourages, by means of economical rewards, noncooperative active users to modify their energy scheduling. In particular, these users have some flexible loads, with an elastic energy demand, or own either a traditional energy generator or a storage system.
The contributions of this work with respect to the related literature can be summarized as follows. 1) We model the system not only in a decentralized or distributed fashion, such as in works [8], [9], but we consider multiple agents with a noncooperative behavior for a better representation of the market. This representation aims at increasing the predictability and flexibility of the power grid, while allowing each user to preserve its privacy. In addition, different from [10], where the offered flexibility is indeed not properly valued at the individual level but only shared among all members, we provide users with an incentivizing control goal. Moreover, different from other game-theoretical approaches presented, for instance, in [13]- [16], in addition to mere economical aspects, we take into account the nonlinear ac power flow equations, and consequently the power quality constraints.
2) We extensively improve our previous work [17], where we preliminarily introduce a noncooperative framework of energy storage providers participating in the grid optimization under power flow constraints. In particular, in this work, we significantly enhance the grid model by integrating flexible loads and dispatchable generators, while additionally including power quality constraints.
3) Due to the nonconvexity of the power flow constraints, we define a weaker game-theoretical equilibrium concept showing the condition required for the existence and uniqueness. Moreover, we propose a novel decentralized algorithm that iteratively leads the single profit strategies to converge to a local variational equilibrium, if this exists. We implement this algorithm in a rolling-horizon approach and we assess its effectiveness over a set of numerical experiments based on two IEEE test bus systems.

C. Article Organization
The rest of this article is organized as follows. In Section II, we describe the proposed distribution grid model. In Section III, we recall some preliminaries on the power flow. In Section IV, after describing the noncooperative framework, we introduce the novel concept of weaker equilibrium. In Section V, we present an algorithm to compute this equilibrium. In Section VI, we present some numerical experiments. Finally, Section VII concludes this article.

D. Mathematical Notation
Let us introduce some basic notation. R, R >0 , and R ≥0 denote the set of real, positive real, and non-negative real numbers, respectively. Moreover, R n , R n >0 , and R n ≥0 denote the set of real, positive real, and non-negative real n-dimensional vectors, respectively. A denotes the transpose of A. A is the square norm of A. A ⊗ B and A • B indicate the Kronecker and the Hadamard product between matrices A and B. 0 n and 1 n indicate the column vectors with n entries all equal to 0 and to 1, respectively, e.g., 0 n = (0, ..., 0) ∈ R n . We define the n × n identity matrix as I n = diag(1, 1, . . . , 1). Moreover, x = col(x 1 , ...., x n ) is equal to x = (x 1 , ...., x n ) . We define the mapping Π X : R n → X as the projection into the generic closed nonempty set X ⊆ R n , i.e., Π X (y) = argmin x∈X x − y . The set-valued mapping N X : R n → R n denotes the normal cone operator for the the set X ⊆ R n , i.e., N X (y) : For a fixed sampling interval ΔH, the value of a variable x at the hth sampling time h ΔH is denoted by x(h), where h = 1, 2, ... ∈ N is the discrete-time index. We indicate a real number with x and a complex number withx. Forx ∈ C, we indicate its modulus by |x| and its conjugate byx * .

II. DISTRIBUTION NETWORK MODEL
In this section, we introduce the model of the MV distribution grid. The grid is composed by several interconnected buses and is connected with the HV transmission network through a slack bus, from which the DSO injects power into the MV distribution grid. We suppose that the grid is controlled through a predictive approach characterized by a control and prediction horizon with fixed length H. The control horizon at time step t is denoted as H t = {t, ..., h, ..., t + H − 1}, where h > t is a generic time slot of the moving horizon.
We employ a graph G = (B, E) to model the power distribution grid elements and interconnections (i.e., buses and transmission lines), where B is the set of nodes with cardinality B, while E ∈ B × B is the set of pairs of distinct nodes called edges with cardinality E. The nodes b ∈ B of the graph represent the buses of the power system and the edges (b, r) ∈ E represent the transmission lines among these buses.
We assume that each bus b ∈ B is connected with several noncontrollable loads, whose aggregated per-slot active and reactive power demand scheduling vectors over the control horizon H t are P c , respectively. Moreover, we assume that the RESs (i.e., nondispatchable generators) connected with the bus b have aggregated active and reactive power production scheduling vectors over the control horizon H t equal to P w b : ≥0 , respectively. For the sake of clarity, let us define the net active and reactive power associated with bus b Moreover, we assume that the power distribution grid includes a set N of N active users that modify their energy-scheduling profile aiming at decreasing their total cost, while providing flexibility services to the grid. In particular, users i ∈ N are classified into the three categories N c , N g , and N s , denoting, respectively, the sets of users that can dynamically modify the profile of their controllable energy consumption, their dispatchable energy generation, and their energy charging/discharging strategy.

A. Flexible Loads
Users i ∈ N c have a controllable energy consumption profile ≥0 that can be modified in accordance with the market conditions. First, we assume that the controllable energy consumption is bounded by lower (i.e., c min i ) and upper (i.e., c max i ) limits Moreover, we assume that a given amount of controllable energy demand (i.e., ξ t,i ) must be satisfied over the control horizon H t For the sake of compactness, let us define for each user i ∈ N c a set of feasible strategies as follows:

B. Dispatchable Generators
The group of users i ∈ N g is equipped with dispatchable energy devices that aim at producing a generation profile We assume that energy producers are subject to variable production costs in accordance with a linear cost function W i (g i ) = η i 1 H g i , where η i is the generation cost per time slot [12]. The non-negative generation profile is upper bounded by the maximum per-slot energy generation capacity For the sake of compactness, let us define for each producer i ∈ N g a set of feasible strategies as follows:

C. Energy Storage Systems
Users i ∈ N s are equipped with an ESS whose energy storage profile is denoted as is the charging profile and s d, ≥0 the discharging profile. Neglecting the ESS technology specificity, for each user i ∈ N s , we characterize the corresponding ESS with its storage capacity SoC max i , maximum charging/discharging rate s max i , charging and discharging efficiency 0 < β c,i ≤ 1 and β d,i ≥ 1, and leakage rate 0 < α i ≤ 1. The reactive power does not affect the ESSs state of charge [18]: Indeed, ESSs are mainly used for active power support; as a consequence, we neglect the reactive power in the storage model.
In particular, the ESS state of charge at a given time step is equal to the value at the previous time step, decreased by the leakage rate and incremented or reduced by the energy storage profile s i (h) = (s c,i (h), s d,i (h)) altered through the charging and discharging efficiency The charging and discharging profile is, respectively, limited by the available state of charge and the storage capacity The maximum charging/discharging rate must be respected All the various ESS technologies suffer from degradation in terms of capacity decreasing and resistance increase. For the sake of simplicity, we introduce a simple quadratic cost Finally, for the sake of compactness, for each user i ∈ N s , we define the set of feasible strategies as

III. PRELIMINARIES ON POWER FLOW
In this section, we recall some basic concepts on the network power flow analysis, which aims at determining the steady-state condition of electric power systems. Specifically, the final goal is the computation of the voltage magnitude and phase angle in each bus of the network, given a power injection profile in each bus as input data [20].
Remark 1: In this work, we employ a simplified version of the ac OPF problem, as we disregard power losses and consider each bus as a PQ bus. The latter assumption is reasonable since generators can be modeled as PQ buses when a frequency droop control approach is not used [21]. However, we remark that the proposed approach can be easily improved including PV buses and considering power losses or it can be extended including other techniques to more accurately model the power flow in distribution systems (e.g., employing current injection models to represent unbalanced networks) [22]- [24].
For the sake of keeping the notation light, in this section, we refer to the value of variables at time h avoiding the timedependency (h) in the symbols. Using the classical π-model for the power grid, and disregarding the power losses, the relation between the current injectionĪ b at bus b and the voltage in a network of B nodes can be defined as whereȲ br is the element (b, r) of the admittance matrix, defined asȲ whereȳ bb is the admittance to the ground at bus b andȳ br =ȳ rb is the line admittance between buses b and r. It is apparent that y br =ȳ rb = 0 holds if there is no link between bus b and r. The complex power at bus b is given bȳ By representing the variables in a polar form, it is worthwhile where θ b is the phase angle at bus b while θ br is the phase difference between bus b and bus r. Consequently, by splitting the complex power into its active and reactive part, we obtain The set of nonlinear equations (13)-(14)-also known as power flow equations-can be solved in several ways; however, the most common approach in the literature is the Newton-Raphson approach [25]. The Newton-Raphson approach works iteratively, employing an initial guess for the value of parameters, most of the time with the so-called flat start. The results of this procedure correspond to the steady-state solution of the network; however, this may not be feasible due to some constraints violations. Indeed, in real-world applications, several quality constraints are applied to the power system, the most critical ones concerning the lower and upper bounding of the voltage magnitude and the maximum active power transfer where the active power P br transmitted between two interconnected buses b and r is calculated by IV. PROPOSED GAME MODEL The goal of this section is to illustrate the overall optimization problem for the proposed distribution network model. We preliminarily remark that, on the one hand, in traditional power systems, distribution grids are modeled in an approximated way as highly predictable time-constant loads. As a consequence, from the viewpoint of the HV network, which is connected to the MV network through the slack bus, the distribution infrastructure is merely considered passive and it is accordingly sized on the specific demand. On the other hand, the current pervasive integration of distributed renewable sources and flexible loads at the distribution level may affect the predictability of modern MV grids. Therefore, our goal is to design an intelligent mechanism that allows the DSO to increase the distribution network predictability with respect to the active power injected through the slack bus. To this aim, we introduce the reference signal P r := (P r (t), ..., P r (h), ..., P r (t + H − 1)) ∈ R H as the planned active power that the DSO intends to inject into the distribution network. Then, we define the mismatch as where P := (P 1 , ..., P b , ..., P B ) ∈ R H×B collects the active power demands actually forecasted in all the buses over the whole control horizon. We now suppose that the deviation between the planned and the forecasted power demand is dynamically regulated by leveraging on the optimal control of the independent active users. In particular, we assume the existence of two different electricity markets. The passive users connected to the distribution grid participate in a traditional electricity market employing a standard pricing mechanism (e.g., with time tariff based on a liberalized market). On the other side, the DSO coordinates a market where the prosumers participate to offer ancillary services in the distribution network. The DSO encourages active users to minimize the above-defined mismatch through a pricing mechanism based on such a deviation. At the same time, each prosumer tries to maximize its own profit over the given time horizon, while fully satisfying the power grid topology and physical limits. A natural framework to capture such a competitive and decentralized decision-making process relies on game theory. In particular, in the sequel, we propose a novel noncooperative game model to drive the individual decisions of active users towards a feasible solution.

A. Game Model
A noncooperative game is usually defined by a set of players (here corresponding to the active users), a strategy of each player, a cost or payoff function for each player and a set of feasible strategies for each player [16]. Each player behaves selfishly to optimize its own welfare, quantified through a net cost function. However, the main difference between a game with N players and N single independent optimization problems relies on the assumption that the net cost function for each player depends not only on its strategy but also on the strategies of the other players. More in detail, let us define for each active player i ∈ N its strategy by ≥0 , that collects the controllable consumption, dispatchable generation, and ESS charging/discharging profile. Moreover, x −i := col(x 1 , ...., x i−1 , x i+1 , ..., x N ) ∈ R 4H(N −1) collects the strategies of all the active users different from i, while x : = col(x 1 , ...., x i , ..., x N ) ∈ R 4HN collects the strategies of all the players. Each player can choose a strategy from its local feasible set defined as Note that x must belong to the intersection of the local feasible set Ω = N i=1 Ω x i . As mentioned above, the net cost for the prosumers is based on the mismatch in (18). In particular, we assume that the net cost function of each independent active user equals the operating costs (degradation and generation) minus the monetary exchange with the DSO over the control horizon H as where we define the vectors δ g = col (0 H , 1 H , 0 H , 0 H  However, in our model, the interaction between the players occurs also at the feasible set level, since the power systems constraints must be respected. For each time slot h ∈ H, the two power flow equations in (13) and (14) must be respected for each bus b ∈ B. Hence, L = 2BH equality constraints must hold. In particular, we define the mapping ϕ(x) : R 4HN → R L by collecting the L mappings ϕ l (x) that represent constraints (13) and (14) ϕ l (x) = 0 ∀l = 1, . . . , L.
Moreover, for each time slot h ∈ H, the two power quality constraints in (15) and the two constraints (16) must be respected for each bus b ∈ B and for each line (b, r) ∈ E, respectively. Hence, M = 2H(E + B) inequality constraints must hold. First, let us define the vector of power transfer between two buses (b, r) ∈ E over the control horizon as P br := (P br (t), . . . , P br (h), . . . , P br (t + H − 1)) ∈ R H : This is dependent on the strategies of all active users in the grid and is computed by (17). Consequently, we define P f (x) = col(P 11 , . . . , P br , . . . , P BB ) ∈ R HB as the vector collecting all power transfers in the network and we assume that the maximum transfer values are constant over time and are defined as P max f = col(P max 11 1 H , . . . , P max br 1 H , . . . , P max BB 1 H ) ∈ R HB . Thus, the following relations must hold: Similarly, let us denote the voltage magnitude for each bus b ∈ B over the control horizon as . . . , V b , . . . , V B ) ∈ R HE as the vector collecting all the voltage magnitudes in the network and we assume that the maximum (minimum) voltage magnitude is constant over time and defined as V max Thus, the following relations must hold: We now define the mapping φ(x) : i.e., by collecting the M = 2H(E + B) mappings φ m (x) that represent constraints (22) and (23) Hence, we finally define the overall feasible set X as the coupling constraint set by Having introduced all the game elements, let us formalize the overall problem as a generalized Nash equilibrium problem (GNEP), whose solution is the generalized Nash equilibrium (GNE), defined as the set of the interdependent optimization problems as follows: The GNE is formally defined as follows.

Definition 1 (GNE):
Solving the GNEP means computing a GNE, which is a collective strategy x * ∈ X such that for each However, this definition does not ensure the uniqueness of the equilibrium point. In order to overcome these difficulties, in the related literature, some classes of GNEP are solved by finding a solution for the associated variational inequality (VI) problem defined in the following. Definition 2 (VI problem): Given a subset K ⊆ R n and a map G : K → R n , the VI problem VI(K, G(x)) consists in finding a vector x * ∈ K such that Therefore, we can define the VI associated with the GNEP (26) as VI(X , N , x −N )). The VI is much more tractable than the original GNEP: In fact, a very substantial set of results is available. The resulting solution, the so-called variational equilibrium, is also a solution of the associated GNEP and is thus called variational generalized Nash equilibrium (vGNE). It should be noted that not all the solutions of the GNEP are solutions of the VI but all the solutions of the VI are solutions for the respective GNEP.
The GNEP solutions achieved through the VI are exactly those for which all players' Karush-Kuhn-Tucker (KKT) conditions have the same Lagrangian multipliers for the respective constraints, i.e., λ * = λ i , ∀i ∈ N , where λ i = col(λ 1 i , ..., λ M i ) ∈ R M is the vector collecting the Lagrangian multipliers for all the coupling constraints (we refer to [26] for a more formal definition). The equivalence of the Lagrangian multipliers (acting as penalizing coefficients) impose to the vGNE having a "fair" behavior between all the possible GNE.

B. Local Equilibrium in Nonconvex Games
In the formulated problem (26), the feasible set X may be nonconvex and, in general, the coupling constraints are nonlinear. In this context, the existence of a GNE cannot be proven and, as for the OPF problem, its seeking is challenging [27]. In order to overcome these issues, let us propose a novel concept, namely the local generalized Nash equilibrium problem (LGNEP), and let us search for its possible solutions, i.e., the local generalized Nash equilibrium (LGNE). For nonconvex problems, the common approach-that is well-accepted in practice-is then to look for a stationary and possibly locally optimal solution [28]. Hence, we define this weaker equilibrium point by substituting the original feasible set with its linear approximation in a generic point. In detail, we defineX (y) as the linear approximation of the feasible set in the point y. Therefore, let us formally define the novel concept as follows.
Definition 3 (LGNE) : We define the LGNEP as the problem of computing a LGNE, which is a collective strategy x * ∈ X with the property that no single player i ∈ N can benefit from a unilateral deviation from x * i contained in the linearized set X (x * ) if all the other players act according to the LGNE. More formally We can still define the VI problem associated with the LGNEP as However, (30) is a quasi-variational inequality (QVI), that is an extension of a VI, and is defined in detail as follows. Definition 4 (QVI problem): Given a subset K(x) ⊆ R n and a mapping G : K → R n , the QVI problem QVI(K(x), G(x)) consists in finding a vector x * ∈ K(x) such that The solution of the QVI is a quasi-variational equilibrium (QVE) and has the same advantages of the variational equilibrium in a local fashion. Similarly, we call the solution the LGNEP, which is also the solution for the associated QVI, as variational local generalized Nash equilibrium (vLGNE). Let us now generalize the results presented in [26] for the proposed local problem. Lemma 1: Let us consider the LGNEP associated with the proposed game (26), then the following implications hold.
1) Let x * be a solution of the LGNEP at which KKT conditions for all players hold with the same Lagrangian multipliers λ * = λ i ∀i ∈ N . Then, x * is a solution of the QVI (30) and thus it is a vLGNE. 2) Vice versa, let x * be a solution of the QVI (30) and thus a vLGNE. Then, x * is a solution of the LGNEP at which the KKT conditions hold with the same Lagrangian multipliers λ * = λ i ∀i ∈ N . Proof: For each player i ∈ N and for every x −i , the cost function f i (·, x −i ) is convex and is differentiable with respect to x i . Moreover, for each coupling constraint and for every x −i , the functions φ m (·, x −i ) and ϕ m (·, x −i ) are differentiable with respect to x i , and X is closed and bounded due to the limits on the local variables. Moreover, for each point y, the setX (y) is convex and therefore we can employ Theorem 3.1 in [26] to conclude the proof.
In other words, for a given LGNE, we have that, in a linearized local subset of X , each user unilaterally maximizes its own function, while keeping the rivals' strategies fixed at the optimal value. Moreover, if at this point, the common optimal multiplier associated with the individual constraints are the same for all the players λ * = λ i ∀i ∈ N , then the point is a vLGNE, and thus a locally "fair" equilibrium point. We can characterize a vLGNE also employing the normal cone since it is trivial proving that the optimality condition −F (x) ∈ N X (x) must be verified. Nevertheless, this optimality condition is hard to verify since the construction of the normal cone is usually hard.

C. Existence and Uniqueness of vLGNE
In game theory, all the results regarding the existence and in particular the uniqueness of an equilibrium are considered only for the convex setting. Therefore, in order to prove the existence of a vLGNE, let us employ the concept of proximality. In particular, it is possible to show that when the functions defining the feasible set are continuously differentiable, the resulting feasible set is a proximally smooth set first proposed by Clarke et al. [29]. A set is said to be proximally smooth if there exists a constant r > 0 such that the distance operator Δ X (·) is continuously differentiable on a "tube" of the form From [29], [30], it is possible to show the following properties hold for the proximally smooth sets. Lemma 2 (Clarke et al., 1995 [29]): Let r ∈ (0, +∞] and X be a nonempty closed set. If X is r-proximally smooth, the following properties hold. 1) The projection operator is single valued and for all x ∈ U X (r), Π X (x) = 0.

2) The projection operator is Lipschitz continuous on U X (r )
for any r ∈ (0, r), with constant r r−r .
3) The r-proximal normal cone is closed as a set-valued mapping and is equivalent to the normal cone, i.e., N r X (x) = N X (x). By focusing only on proximally smooth sets, the weaker properties of the projection operator described in Lemma 2 can be used. Hence, let us prove that a vLGNE is equivalent to a fixed point of a gradient projection as follows.
Lemma 3: Let the set X be r-proximally smooth; moreover, there is a finite constantF such that F (x) ≤F ∀x ∈ X . For any γ ∈ (0, r/F ), the following two statements are equal: i) x * is a vLGNE; ii) x * = Π X (x * − γF (x * )). Proof: Note that, since γ ∈ (0, r/F ), we have that x * − γF (x * ) ∈ N r X (x * ). Moreover, since the set is proximally smooth N r X (x * ) = N X (x * ) and that −F (x * ) ∈ N X (x * ), we can conclude the proof. Now, let us prove the existence of a vLGNE.

Theorem 1:
The LGNEP associated with the proposed game (26) has at least one vLGNE.
Proof: For each coupling constraint, and for every x −i , the functions φ m (·, x −i ) and ϕ m (·, x −i ) are differentiable with respect to x i ; hence, when considered individually, they are proximally smooth. It is easy to verify that the intersection of this different constraints is metrically calm and therefore X is r-proximally smooth with r > 0 [31]. Moreover, X is closed and bounded due to the limits on the local variables and consequentlyF is finite. From Lemma 3, we have that any fixed point of the projected mapping is a vLGNE. Hence, if there exists γ > 0 sufficiently small such that properties of Lemma 2 and r r−r where the first inequality holds since X is proximally smooth, while for the last one, we use the strong monotonicity and the Lipschitz continuity of the mapping F . As regards the uniqueness of vLGNEs, due to the nonconvexity of the feasible sets, the global uniqueness can not be ensured; however, the local uniqueness can be proved with respect to the linearized feasible setX (·). Proposition 1: If the mapping F (·) is strictly monotone, the strict inequality holds in (30) and thus any vLGNE x ∈ X is unique within its linearized feasible setX (x).
Proof: This proposition can be easily derived from Proposition 12.11 in [32] as the linearized feasible set is a convex set, and for every player i ∈ N and for every x −i , the function f i (·, x −i ) is convex and continuously differentiable.

V. DECENTRALIZED CONVERGENCE TO EQUILIBRIUM
In this work, the goal is to reach an equilibrium which is nontrivial due to presence of the nonlinear and possibly nonconvex coupling constraints. Hence, let us propose an approach based on the coupling constraints reformulation and the LGNEP definition to find a possible equilibrium.

A. Coupling Constraints Reformulation
In our model, satisfying the power quality constraints requires the variation of the power injection in some buses. Hence, given a power injection profile for the overall network, the DSO must identify which bus injection causes the constraint violation and calculate the corresponding violation level. Moreover, we need to linearize the constraints to compute the linearized set of coupling constraints.
Therefore, let us employ the so-called sensitivity factors (SFs), i.e., the first order derivatives at the points corresponding to the Newton-Raphson solutions. Note that, for the sake of keeping the notation light, in this subsection, we refer to the value of variables at time h avoiding indicating the time-dependency (h) in the corresponding symbols.
The SF for the voltage magnitude can be directly calculated from the inverse of the Jacobian matrix of the last Newton-Raphson iteration. More formally, for bus b ∈ B, the voltage SF with respect to a power injection change in the generic bus i ∈ B is defined as ∂|V | b /∂P i , then we arrange it in a compact form for all the control horizon as SF ) ∈ R H . Therefore, we define it for all the buses in the grid as The power transfer SF cannot be directly calculated. However, by differentiating (17), we calculate the power transfer SF with respect to phase angle and the voltage magnitude as follows: Subsequently, we link the above calculated SFs to the power transfer SFs with respect to the active and reactive power injection at each bus by using the Jacobian matrix J [25]. More formally, for each branch between buses b and r, (b, r) ∈ E, we get ⎡ Hence, let us define the power transfer variation in branch (b, r) ∈ E for a power injection in bus i ∈ B as ∂P br /∂P i and let us collect these for all the control horizon as SF f i,br := (∂P br /∂P i (t), . . . , ∂P br /∂P i (h), . . . , ∂P br /∂P i (t +H −1)) ∈ R H and for all branches as SF f i := col(SF f i,11 , . . . , SF f i,br , . . . , SF f i,BB ) ∈ R EH . Finally, we define the linearized coupling constraints in a point y, with respect to the variation of the strategy of user i, as where δ g := (I BE ⊗ δ ) is an auxiliary matrix and SF i :

B. Decentralized Control Algorithm
Let us now present a decentralized control architecture to reach an equilibrium: Each agent computes its control action, while a central coordinator attempts to satisfy the shared constraints possibly forcing the agents' strategies. It should be noted that in this approach, only a limited information exchange occurs between agents; in particular, when the shared constraints are satisfied without the intervention of the central coordinator, this approach becomes distributed.
Several decentralized penalty approaches have been proposed to solve the GNEP leveraging on the definition of a penalized Nash equilibrium problem (PNEP) of the original problem. Therefore, let us propose an algorithm based on the penalized reformulation of the original generalized game. The core idea of the proposed control algorithm is to solve at each step a penalized problem defined employing the coupling constraint reformulation around the current solution. In particular, by modifying the approach in [33] substituting the original constraints with their corresponding linear reformulation, we can rewrite the LGNEP as a penalized game P(y, ρ i ) = argmin where | · | + is the projection onto the positive orthant and is the vector collecting the penalizing factors for the player i ∈ N . Note that the reformulated problem P(y, ρ i ) is a convex NEP and thus a solution always exists and is unique. Moreover, for sufficiently large and finite penalizing factors ρ i , all players will be forced to satisfy the constraints until a feasible point is reached. However, the correct value of the penalty parameters is not known in advance and therefore a strategy must be selected that allows to update the values of the penalty parameters so that eventually the correct values are reached. In order to reach a vLGNE, the penalizing factors associated with an individual constraint should be the same for all the players, and thus we assume an equal update rule for all the players.
We now propose the novel decentralized procedure to reach a vLGNE formally defined in Algorithm 1. In the first phase, players initialize their starting strategies x 0 and the coordinator defines the initial penalty factors ρ 0 . The proposed algorithm works iteratively (lines 1 and 2, [18][19]. At each iteration, given the overall players' strategies, the coordinator computes the SF to connect the overall outage to a specific bus power injection for the prediction horizon (line 3). Then, the coordinator broadcasts the penalty factors and the SFs to all active players. Note that the coordinator computes the SF for each bus of the grid b ∈ B; however, it broadcasts to the generic active user i ∈ N only the factors related to its position (line 4).
After this phase, each player updates its strategy based on its previous step by solving the penalized problem (38) defined with the coupling constraints linearized around the current solution x k . However, the next strategy is selected employing a standard regularization approach with coefficient γ, in order to improve convergence (lines 5 and 6).
Finally, the coordinator terminates the iterative process when an adequate termination criterion is reached.
Remark 2: It should be noted that, due to the decentralized scheme of Algorithm 1, any untruthful report or behavior of a player may lead to a higher individual outcome. Consequently, as usually done in related works, let us assume the presence of a surveillance mechanism that acts to prevent such an issue [34], [35].
Remark 3: Note that in Algorithm 1 (lines 10-17), we do not consider (13) and (14) that defining the feasible set X . In fact, we assume that any solution computed with the Newton-Raphson approach (line 9)-that is feasible for (22) and (23)-is also feasible for (13) and (14).
We finally note that the vLGNEs are quasi-variational equilibria that-differently from the well-known variational equilibrium in jointly convex games-may not be unique. In fact, starting from different initial players' strategies, a different equilibrium may be reached. Nevertheless, as commonly done in the ac OPF, we initialize all the strategies to a standard value; hence, all players begin from the same situation. In this way, we guarantee to always reach the same quasi-variational equilibrium for the same scenario.

VI. NUMERICAL EXPERIMENTS
In this section, we show the performance of the proposed algorithm through several numerical experiments, by analyzing distribution systems under a high power flow variation due to a large penetration of distributed generation and active users. In particular, we refer to two standard test bus systems widely used by researchers as a benchmark for evaluating the performance of control solutions proposed in the power systems field: The IEEE 33-bus system [36] composed of 33 buses, which are connected through 37 branches in accordance with the radial topology typically employed in power distribution networks, and the IEEE 118-bus system [37], which includes 118 buses and 177 lines and has a meshed structure with several radial branches.
In the Newton-Raphson method, we use the so-called flat start [38] as a starting point for the unknown variables, i.e., all the angle phases are equal to zero, and all the voltage magnitudes are set to 1 in p.u. system. Note that this is extremely important for a fast power flow convergence and, therefore, for the overall algorithm's performance.
We consider a control horizon of 24 h and we assume for the PQ buses b = 1 (i.e., in all buses except the slack bus b = 1) a net power profile randomly generated with an average active and reactive power equal to 0.15 MW and 0.5 MVAr, respectively. The controllable loads have a randomly generated profile with an average active power equal to 0.02 MW. All data are updated at each time step to simulate the variability of their values in a rolling horizon framework. For the sake of simplicity, we assume that the minimum and maximum voltage magnitudes (i.e., V min b and V max b ) at each bus are 0.95 and 1.05 p.u., respectively. Moreover, the maximum power transfer of each line (i.e., P max br ) is set equal to 200% of the average power transfer in the merely passive case (i.e., Scenario 1 described in the sequel).
For the sake of simplicity, we assume that the energy price coefficients of active users κ i are all equal to 0. To assess the proposed framework, we perform several experiments both on the IEEE-33 and IEEE-118 bus systems addressing the three scenarios defined as follows.
1) Scenario 1. The distribution system comprehends only noncontrollable loads, without distributed renewable generation and active users. 2) Scenario 2. The system comprises noncontrollable loads and RESs as well as noncooperative active users, even though the latter neglects the coupling constraints on the maximum power transfer and voltage magnitude (i.e., users play a traditional Nash game rather than a generalized one). 3) Scenario 3. The system comprises all the elements as in the previous case; in addition, active users address the power quality coupling constraints. As for the first set of experiments, we employ the IEEE-33 bus system with seven active users, namely in buses 2, 3, 6, 19, 13, 14, and 15. In Fig. 1, we show a colormap graph representing the network state in a time step computed with the Newton-Raphson method. In particular, for each line, the percentage loading level is normalized with respect to the maximum value by employing different colors and line weights, i.e., hot colors indicate congestion in the line.
In the first scenario, presented in Fig. 1(a), it is evident that the grid is merely a passive system; therefore, this case represents the current state of traditional distribution systems around the world. Note that the power flow constraints are satisfied in a passive way because the capacity of the distribution system was correctly sized in the design phase, suitably taking into account the power demand on each bus.
In Fig. 1(b), we show the results related to Scenario 2, where the system comprehends the same noncontrollable loads profile in addition to several unpredictable RESs. Additionally, active users participate in the grid optimization to increase their profit. In particular, the active users modify their own strategies without any power flow restriction, i.e., active users' strategies are coupled only via the cost function. As a consequence, in such a scenario, the increased power production, together with lower power demand and the uncoordinated active users participation, may cause a high instability in the grid (represented by all the hot colors) causing an outage in the grid.
In order to show the effectiveness of our algorithm, in Fig. 1(c), we show the results related to Scenario 3: The distribution system is equipped with loads, RESs, and active users but the proposed noncooperative coordination scheme is applied. The algorithm is able to steer active users' strategies to a global equilibrium while satisfying the power flow constraints.
Moreover, we show the convergence properties of the proposed algorithm for Scenarios 2 and 3. In particular, we reduce the maximum allowable power transfer for all the branches linked with active users to 80% of the power transfer in the passive case (i.e., Scenario 1). We remark that, even though this setup may correspond to unrealistic data, it allows testing the proposed approach under extremely severe conditions, which could make also the passive scenario without active users infeasible. In particular, in Fig. 2, we show the convergence of the overall power exchange of each player with the main grid. Since the players have the same characteristic coefficients and initial  conditions, it is evident that in Scenario 2, shown in Fig. 2(a), the only possible equilibrium corresponds to bringing all players to have the same strategy. Conversely, in Scenario 3, convergence is reached with different strategies, as shown in Fig. 2(b). This is caused by the different displacements of active users in the distribution grid and therefore of their impact on the power quality constraints. In fact, we notably reduced the maximum admissible power transfer by forcing the active users responsible for the power outages to work against the market to satisfy the constraints.
As for the second set of experiments, we consider the IEEE-118 bus system, including 30 active users randomly displaced in the power grid. First, in Fig. 3, we show the mismatch between the active power injected into the slack bus and the reference value over the time window, in the case the proposed approach is deployed (Scenario 3) or not (Scenario 1). From Fig. 3, it is evident that the injected power profile is flatter in Scenario 3 than in Scenario 1, thus proving that the proposed framework increases the predictability and the power stability.
Finally, we analyze the impact of the coupling constraints on the payoff functions of the different players. In fact, if the power quality constraints are tight, the prosumers in the grid may be forced to work against the market. For instance, in a situation where it is profitable to charge a battery, a prosumer may be forced instead to discharge it in order to satisfy the constraints. In this experiment, we aim at exacerbating this situation by extremely varying the maximum power transfer. In particular, we modify the maximum power transfer of each line from 40% up to 200% of the average power transfer in the passive case (i.e., Scenario 1), and then computing the players' net cost. The results are shown in Fig. 4, where it is clear that in the case of a lower maximum power transfer, the net cost spread over the users is very large, while when the maximum power transfer increases, the net cost tends to be equal for all users. It should be noted that, as the active players are equipped with equal resources, in the case of no power quality constraints, the net costs should be equal; conversely, if the power quality constraints are included, the users' net costs are different due to their diverse displacement in the power grid.

VII. CONCLUSION
In this article, we propose a novel game-theoretic control mechanism for optimally regulating the operation of power distribution networks including active users with demand flexibility, dispatchable generation capability, and/or energy storage potential. Differently from the existing literature, this work investigates a decentralized model aimed at efficiently enhancing the penetration of distributed generation in medium voltage networks, as well as the predictability of the power injection from the transmission grid, while fulfilling the power system physical constraints. In addition, the conducted numerical experiments show that the proposed optimal control of active users is effective in increasing the predictability and power stability of the distribution networks. Future research will address the following: Demonstrating the optimality and the convergence properties of the proposed algorithm, and extending the proposed approach to different control goals (e.g., by replacing the cost functions with terms that take the reactive power mismatch into account) and different power flow models possibly including power losses.
Moreover, it should be noted that the proposed framework can be modified to directly take into account uncertainty.