Network games with dynamic players: Stabilization and output convergence to Nash equilibrium

This paper addresses a class of network games played by dynamic agents using their outputs. Unlike most existing related works, the Nash equilibrium in this work is defined by functions of agent outputs instead of full agent states, which allows the agents to have more general and heterogeneous dynamics and to maintain some privacy of their local states. The network game is formulated with agents modeled by uncertain linear systems subject to external disturbances. The cost function of each agent is a linear quadratic function depending on its own output and the outputs of its neighbors in the underlying graph. The main challenge stemming from this game formulation is that merely driving the agent outputs to the Nash equilibrium does not guarantee the stability of the agent dynamics. Using the local output of each agent and the outputs of its neighbors, we aim to design game strategies that achieve output Nash equilibrium seeking and stabilization of the closed-loop dynamics. In particular, when each agent knows how the actions of its neighbors affect its cost function, a game strategy is developed for network games with digraph topology. When each agent is also allowed to exchange part of its compensator state, a distributed strategy can be designed for networks with connected undirected graphs or connected digraphs.


Introduction
Game theory has various applications to the control of multiagent systems, including smart grids, optical networks, and mobile sensor networks; see, for example, Wang et al. (2014); Stanković et al. (2012); Pavel (2006). In these applications, the agents try to minimize local cost functions that depend on their own actions and those of other players. The aim of the game is often to seek a Nash equilibrium (NE), i.e., a strategy profile at which no agent can gain by unilaterally changing its strategy. As far as we know, all NE seeking problems in the literature use full agent states as decision variables, and the sole purpose of the game strategy is to drive all the components of each agent state to the NE. These game theoretic problems have undesirable characteristics, including the lack of privacy among agents and restrictions on the agent dynamics. First, as full agent states are used as decision variables, each agent has to know or, in the case of limited communication, observe the full states of all other agents. In this setting, it is impossible for the agents to converge to the NE while keeping some parts of their states unknown to the others. Second, in most existing works, the agents either have no independent dynamics, as in Ye and Hu (2017), or have simple homogeneous dynamics, as in Romano and Pavel (2019a). The recent work of Romano and Pavel (2019b) develops distributed NE seeking strategies for a class of heterogeneous linear systems and defines the NE using partial local state.
Email addresses: meichen.guo@rug.nl (Meichen Guo), c.de.persis@rug.nl (Claudio De Persis)

However, the agent dynamics therein take the special form of multi-integrators and, at the defined NE, the part of the local state that does not explicitly appear in the NE must be zero. In summary, the existing NE seeking strategies are not applicable to engineering problems of multi-agent systems with general linear local dynamics. Motivated by these drawbacks, we aim to solve an NE seeking problem such that (i) if part of the local agent state is not directly involved in decision making, it can remain private from other agents, and (ii) the agents can have more general and heterogeneous dynamics. It should be pointed out, however, that this game formulation brings a major challenge: convergence of the outputs to the NE does not imply the stability of each agent. As we also assume that the agent states are not measurable, our goal is to design output feedback strategies that serve two purposes: stabilizing the local dynamics and driving the outputs to the NE.
One important feature of the NE seeking strategies in references such as Lin et al. (2014); Parise et al. (2015); Koshal et al. (2016); Salehisadaghiani and Pavel (2016); Ye and Hu (2017); Deng and Liang (2019); De Persis and Grammatico (2019); Romano and Pavel (2019a) is that the designed strategies are distributed with respect to the underlying communication graph. The distributed game strategies proposed in Salehisadaghiani and Pavel (2016); Romano and Pavel (2019a) exploit the underlying communication network topology so that each agent estimates the actions of others using information from its neighbors. Nonetheless, in general games, each agent needs information on the actions of all other agents to determine its own action. When the size of the network is large, the computational burden of each agent can be extremely heavy. This limitation has led to research on network games. In Parise et al. (2015), distributed iterative strategies are proposed for agents to converge to an NE in network games with quadratic cost and convex constraints. Grammatico (2018) investigates the convergence of different equilibrium seeking proximal dynamics for multiagent network games with convex cost functions, time-varying communication graphs, and coupling constraints. Our work focuses on NE seeking in a class of network games, since in cooperative control of large networks it can be computationally difficult or unrealistic for each agent to consider the actions of all others. Specifically, the network games considered in this work are a class of quasi-aggregative games in which each agent minimizes a cost function depending on its own actions and the actions of its neighbors in the underlying network topology. If the network is connected by a complete graph, the network games become general or aggregative games such as the ones studied in Koshal et al. (2016); Salehisadaghiani and Pavel (2016); Deng and Liang (2019); De Persis and Grammatico (2019); Romano and Pavel (2019a).
Another practical and important consideration in game theoretic engineering problems is the influence of external disturbances. In engineering applications, agents are often subject to external disturbances, for instance, the energy consumption demand in Wang et al. (2014) and wind pushing the mobile robots in formation control in Romano and Pavel (2019a). However, the design of NE seeking strategies capable of disturbance rejection has received little attention apart from a few works such as Stanković et al. (2012); Romano and Pavel (2019a). In this paper, the game is formulated with agents subject to deterministic disturbances generated by linear exosystems, which can be a combination of step functions and finitely many sinusoidal functions.
For general or aggregative game settings, some recent works have investigated distributed NE seeking problems either without local agent dynamics or with simple local agent dynamics. Ye and Hu (2017) considers the NE seeking problem for agents communicating through an undirected and connected graph. Based on a consensus protocol and the gradient play approach, distributed NE seeking strategies are proposed for games with quadratic and nonquadratic cost functions. In the recent work Romano and Pavel (2019a), dynamic NE seeking strategies are proposed for single and double integrator networks subject to external disturbances modeled by linear exosystems. The cost function of each agent is a general convex cost function that depends on the actions of all the agents in the network. Under the assumption that the underlying communication graph is undirected and connected, dynamic strategies can be developed to estimate the actions of others and for the agents to converge to the unique Nash equilibrium. Deng and Liang (2019) investigates an aggregative game of Euler-Lagrange systems in the presence of uncertain parameters. The network topology is assumed to be undirected and connected, and the agents are free of external disturbances. As Euler-Lagrange systems are nonlinear second-order systems, the game strategies in Deng and Liang (2019) are proposed based on double integrators.
The distributed robust NE seeking strategies proposed in this work are inspired by methods in linear output regulation. Compared with existing works on distributed NE seeking strategy design, this paper makes the following contributions. First, we define the NE using outputs of general linear systems as decision variables, which leads to strategies with a wider range of engineering applications. Second, the designed NE seeking strategies achieve not only convergence of the outputs to the NE but also stabilization of the agent dynamics despite model uncertainties and external disturbances. Last but not least, the proposed strategies can be applied to networks with communication digraphs, which is not the case for most existing game strategies, including the ones in Ye and Hu (2017); Romano and Pavel (2019a); Deng and Liang (2019).
The rest of the paper is arranged as follows. The network game formulation and some preliminaries are given in Section 2. The distributed NE seeking strategies are developed and analyzed in Section 3. In Section 4, simulation results for the control of sensor networks are presented. Finally, some concluding remarks are given in Section 5.
Notations. $\mathbb{R}$ denotes the set of real numbers. $I_n$ denotes the $n \times n$ identity matrix. For vectors $x_1, \dots, x_N$, $\mathrm{col}(x_1, \dots, x_N) = [x_1^T \ \cdots \ x_N^T]^T$. Given matrices $A_1, \dots, A_N$, $\mathrm{blockdiag}(A_1, \dots, A_N)$ denotes the block diagonal matrix with $A_i$ on the diagonal.

Game formulation and preliminaries
In this section, the formulation of the NE seeking problem for a class of network games will be presented. Then, we give some preliminaries on linear output regulation that will be used in the subsequent sections. It is shown that any controller that solves the output regulation problem of the reformulated agent dynamics also solves the NE seeking problem.

Game formulation
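Consider a game denoted by $G(I, J_i, \Psi_i)$ played by $N$ agents, indexed by $I = \{1, \dots, N\}$, having dynamics
$$\dot{x}_i = A_i(\mu_i) x_i + B_i(\mu_i) u_i + P_i(\mu_i) w_i, \qquad y_i = C_i(\mu_i) x_i, \qquad (1)$$
where $x_i \in \Phi_i \subseteq \mathbb{R}^{n_i}$ is the state, $u_i \in \mathbb{R}^{m_i}$ the input and strategy vector, $y_i \in \Psi_i \subseteq \mathbb{R}^{p_i}$ the output and decision variable, $w_i \in \mathbb{R}^{q_i}$ the disturbance, and $\mu_i \in \mathbb{R}^{n_i(n_i + m_i + q_i + p_i)}$ denotes the uncertainty. The disturbance $w_i$ is generated by
$$\dot{w}_i = S_i w_i \qquad (2)$$
from an initial value $w_i(0)$. The system matrices are $A_i(\mu_i) = A_i + \Delta A_i$, $B_i(\mu_i) = B_i + \Delta B_i$, $P_i(\mu_i) = P_i + \Delta P_i$, $C_i(\mu_i) = C_i + \Delta C_i$, where $A_i, B_i, P_i, C_i$ are the known nominal parts and $\Delta A_i, \Delta B_i, \Delta P_i, \Delta C_i$ are the uncertain parts. The uncertainty $\mu_i$ of each agent $i \in I$ collects the entries of $\Delta A_i, \Delta B_i, \Delta P_i, \Delta C_i$, consistent with its dimension $n_i(n_i + m_i + q_i + p_i)$.

Define $G_c = (I, E)$ as the underlying communication graph of the network, where $I$ is the index set as defined above and $E \subset I \times I$ is the edge set. There is an edge between nodes $i$ and $j$ if $(i, j) \in E$, $i, j \in I$. For an undirected graph, if $(i, j) \in E$ then $(j, i) \in E$. Denote the neighbor set of agent $i$ as $N_i \subset I$. An undirected graph is connected if there exists a path between every pair of nodes. A digraph is weakly connected if there exists an undirected path between every pair of nodes, and is strongly connected if there exists a directed path between every pair of nodes.

Now, we give the definition of NE using the outputs of (1) as decision variables in the network game $G(I, J_i, \Psi_i)$. In the game, each agent tries to minimize a local cost function denoted by $J_i$, $i \in I$.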
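Definition 1 (NE in network games). Given a network game $G(I, J_i, \Psi_i)$, a vector of outputs $y^* = \mathrm{col}(y_1^*, \dots, y_N^*)$ is an NE if, for every $i \in I$, $J_i(y_i^*, y_{-i}^*) \le J_i(y_i, y_{-i}^*)$ for all $y_i \in \Psi_i$, where $y_{-i}$ collects the outputs of all agents other than $i$.

The local cost function considered in this work is the linear quadratic function (3), with weighting matrices $R_{ij}$ and $Q_{ij}$, $i, j \in I$. Denote the neighbor set of agent $i$ as $N_i$. Assume that $R_{ij} \neq 0$, $Q_{ij} \neq 0$ for $j \in N_i$, and $R_{ij} = 0$, $Q_{ij} = 0$ otherwise. Then, the cost function can also be written as $J_i(y_i, y_{N_i})$. Note that since $R_{ii} > 0$ and, for each $i \in I$, $\Psi_i = \mathbb{R}^{p_i}$ and $\Phi_i = \mathbb{R}^{n_i}$, the local cost function $J_i$ is strictly convex and radially unbounded in $y_i$ for all $y_{N_i} \in \Psi_{N_i}$. Then, according to Başar and Olsder (1999, Corollary 4.2), there exists an NE for the network game $G(I, J_i, \Psi_i)$.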
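Denote the partial gradient of the cost function $J_i$ as
$$\frac{\partial J_i}{\partial y_i}(y_i, y_{N_i}) = (R_{ii} + R_{ii}^T) y_i + \sum_{j \in N_i} R_{ij} y_j + Q_{ii}^T.$$
Then, the pseudogradient can be written as $F(y) = \bar{R} y + \bar{Q}$, where $\bar{R}$ is the block matrix with diagonal blocks $R_{ii} + R_{ii}^T$ and off-diagonal blocks $R_{ij}$, and $\bar{Q} = \mathrm{col}(Q_{11}^T, \dots, Q_{NN}^T)$.

Assumption 2.1. The matrix $\bar{R}$ is positive definite.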
Following Facchinei and Pang (2003, Theorem 2.3.3), under Assumption 2.1, the mapping $F$ is strictly monotone, and the game has a unique NE $y^*$ satisfying $F(y^*) = \bar{R} y^* + \bar{Q} = 0$.
Remark 2.2 (Monotonicity of F). Different assumptions on the monotonicity of mapping F have been used in existing works to guarantee the uniqueness of the NE. For example, for general cost functions, Koshal et al. (2016); Salehisadaghiani and Pavel (2016) assume the mapping to be strictly monotone, and Deng and Liang (2019); De Persis and Grammatico (2019) assume that the mapping is strongly monotone. Note that for general cost functions, the assumption of strict monotonicity is weaker than that of strong monotonicity. In this work, as the cost function of each agent is in the linear quadratic form (3), the strict monotonicity and the strong monotonicity of mapping F are equivalent.
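The uniqueness condition and Remark 2.2 can be illustrated numerically. The following is a minimal sketch, not the authors' code: the three-agent network, the scalar weights, and the assumed block structure of the pseudogradient matrix (diagonal blocks R_ii + R_ii^T, off-diagonal blocks R_ij for neighbors) are all hypothetical choices made for illustration. It checks positive definiteness through the symmetric part, solves the linear equation for the NE, and reports the strong monotonicity constant, which for an affine map is the smallest eigenvalue of the symmetric part.

```python
import numpy as np

# Hypothetical 3-agent example with scalar outputs (p_i = 1).
# Assumed structure of F(y) = R_bar @ y + Q_bar:
#   block (i, i) = R_ii + R_ii^T, block (i, j) = R_ij for j in N_i, 0 otherwise.
neighbors = {0: [1], 1: [0, 2], 2: [1]}        # illustrative neighbor sets N_i
R_diag = {0: 1.0, 1: 1.5, 2: 2.0}              # R_ii > 0 (scalars here)
R_off = {(0, 1): -0.5, (1, 0): -0.4, (1, 2): -0.3, (2, 1): -0.6}
Q_bar = np.array([1.0, -2.0, 0.5])             # stacked linear terms (illustrative)

N = len(neighbors)
R_bar = np.zeros((N, N))
for i in range(N):
    R_bar[i, i] = 2.0 * R_diag[i]              # R_ii + R_ii^T for a scalar R_ii
    for j in neighbors[i]:
        R_bar[i, j] = R_off[(i, j)]


def F(y):
    """Pseudogradient of the illustrative game."""
    return R_bar @ y + Q_bar


# R_bar need not be symmetric: y^T R_bar y > 0 for all y != 0
# holds if and only if the symmetric part is positive definite.
sym_part = 0.5 * (R_bar + R_bar.T)
mono_const = np.linalg.eigvalsh(sym_part).min()

# Unique NE: solve R_bar y* + Q_bar = 0.
y_star = np.linalg.solve(R_bar, -Q_bar)

print("strong monotonicity constant:", mono_const)
print("NE:", y_star, "residual norm:", np.linalg.norm(F(y_star)))
```

Because F is affine, (F(a) - F(b))^T (a - b) equals (a - b)^T R_bar (a - b), which is bounded below by `mono_const` times the squared distance; this is why strict and strong monotonicity coincide in the linear quadratic case.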
Another assumption is made to exclude the trivial case where the disturbance w i for i ∈ I exponentially decays to 0.
Assumption 2.2. For each $i \in I$, $S_i$ has no eigenvalues with negative real parts.

Now we give the formulation of the NE seeking problem for the network game $G$.

Problem 1. Given the network game $G(I, J_i, \Psi_i)$, the problem is, for each $i \in I$, to design a strategy $u_i$ such that (i) the closed-loop dynamics of all agents are stable, and (ii) the output $y := \mathrm{col}(y_1, \dots, y_N)$ converges to the NE $y^*$ that satisfies $\bar{R} y^* + \bar{Q} = 0$.
In contrast to most existing works on NE seeking, the solution of Problem 1 needs not only to drive the decision variables to the NE, but also to guarantee the stability of the closed-loop systems. This objective is similar to output regulation problems, where the aim of the regulator is to achieve reference tracking and/or disturbance rejection while guaranteeing that the closed-loop system is stable. In fact, we will draw inspiration from output regulation for the game strategy design in Section 3.

Preliminaries
First, we write the dynamics of all the agents in a stacked form. Denote $\bar{m} = \sum_{i \in I} m_i$, $\bar{n} = \sum_{i \in I} n_i$, $\bar{p} = \sum_{i \in I} p_i$, and $\bar{q} = \sum_{i \in I} q_i$. Use the pseudo-gradient as the regulated error, i.e., $e = F(y) = \bar{R} y + \bar{Q}$.
In order to write the regulated error $e$ as a function of the output $y$ and an exogenous signal, we construct a linear exosystem that consists of the disturbance generator (2) and a constant component. Specifically, for $i \in I$,
$$\dot{v}_i = \bar{S}_i v_i, \qquad v_i = \mathrm{col}(v_{i1}, v_{i2}), \qquad \bar{S}_i = \mathrm{blockdiag}(S_i, 0). \qquad (4)$$
Then, the dynamics of the $i$th agent can be written as
$$\dot{x}_i = A_i(\mu_i) x_i + B_i(\mu_i) u_i + \bar{P}_i(\mu_i) v_i, \qquad y_i = C_i(\mu_i) x_i, \qquad (5)$$
where $\bar{P}_i(\mu_i) = [P_i(\mu_i) \ \ 0]$. The local regulated error $e_i$ is defined as the local partial gradient described by
$$e_i = (R_{ii} + R_{ii}^T) y_i + \sum_{j \in N_i} R_{ij} y_j + Q_{ii}^T v_{i2}. \qquad (6)$$
In (4), $v_{i1}$ is the same as $w_i$ in (2) and $v_{i2}$ is the constant 1. Then, we can use $Q_{ii}^T v_{i2}$ to replace $Q_{ii}^T$ in the expression of the local partial gradient $e_i$.

It should be pointed out that adding the constant component $v_{i2}$ is necessary and does not complicate the strategy design. If (2) already has a constant component, i.e., $S_i$ has an eigenvalue at 0, the addition of $v_{i2}$ will not change the subsequent design of the internal model. On the other hand, if $S_i$ has no eigenvalue at 0, it is necessary to take the constant $v_{i2}$ into consideration. We will show later that for the special case where the agents are not subject to external disturbances, due to the constant vector $Q_{ii}$ in the cost function, the controller needs to contain an integrator, which is the internal model for constant exogenous signals.
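The augmentation of the exosystem with a constant component can be sketched as follows. This is an illustrative construction under stated assumptions: the sinusoidal generator `S_i` and the disturbance input matrix `P_i` are hypothetical values, and the augmented matrix is taken to be blockdiag(S_i, 0), which is the reading implied by v_i1 = w_i and v_i2 = 1. The sketch also checks Assumption 2.2 numerically.

```python
import numpy as np

omega = np.pi / 10                                # assumed angular frequency
S_i = np.array([[0.0, omega], [-omega, 0.0]])     # hypothetical sinusoid generator
P_i = np.array([[0.0, 0.0], [1.0, 0.0]])          # hypothetical P_i (n_i = 2, q_i = 2)

# Assumption 2.2: no eigenvalue of S_i has a negative real part.
eigs = np.linalg.eigvals(S_i)
assumption_2_2 = bool(np.all(eigs.real >= -1e-12))

# Augmented exosystem: v_i = col(v_i1, v_i2) with v_i1 = w_i and v_i2 = 1.
q_i = S_i.shape[0]
S_bar_i = np.zeros((q_i + 1, q_i + 1))
S_bar_i[:q_i, :q_i] = S_i                         # blockdiag(S_i, 0)

# P_bar_i = [P_i 0]: the constant component does not enter the state equation.
P_bar_i = np.hstack([P_i, np.zeros((P_i.shape[0], 1))])

v0 = np.array([1.0, 0.0, 1.0])                    # w_i(0) = (1, 0), v_i2(0) = 1
v_dot = S_bar_i @ v0                              # last entry is 0, so v_i2 stays 1

print("Assumption 2.2 holds:", assumption_2_2)
print("spec(S_bar_i):", np.linalg.eigvals(S_bar_i))
```

The zero row and column appended to `S_i` add an eigenvalue at 0, so the internal model built for the augmented exosystem always covers constant signals.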
We denote $(\mathbf{x}_i, \mathbf{u}_i)$ for each $i \in I$ as the steady state of (5) satisfying (7), and the corresponding steady-state output as $\mathbf{y}_i = C_i(\mu_i) \mathbf{x}_i$, where $\mathbf{x} = \mathrm{col}(\mathbf{x}_1, \dots, \mathbf{x}_N)$. Then, seeing $e_i$ as the local regulated error, the distributed cooperative output regulation problem of (5) is to design a controller using $e_i$ such that $(x_i, u_i)$ converges to $(\mathbf{x}_i, \mathbf{u}_i)$ for all $i \in I$. In fact, the solution to the cooperative output regulation problem also solves the NE seeking Problem 1.

Proposition 1. Under Assumptions 2.1 to 2.2, if there exists $(\mathbf{x}_i, \mathbf{u}_i)$ such that (7) holds for all $i \in I$, the corresponding $\mathbf{y} = \mathrm{col}(\mathbf{y}_1, \dots, \mathbf{y}_N)$ is the NE of the network game $G(I, J_i, \Psi_i)$.

Assumption 2.3. For each $i \in I$, $(A_i, B_i)$ is stabilizable and $(A_i, C_i)$ is detectable.
Remark 2.4. (Overall network stabilizability and detectability) As the matrices $R_{ii}$ are positive definite for all $i \in I$, the detectability of the pair $(A_i, C_i)$ is equivalent to the detectability of the pair $(A_i, (R_{ii} + R_{ii}^T) C_i)$. Moreover, since the dynamics of the agents are decoupled, the pair $(\bar{A}, \bar{B})$ is stabilizable, and the pairs $(\bar{A}, \bar{C})$ and $(\bar{A}, \bar{R}\bar{C})$ are detectable.
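Stabilizability and detectability of a single agent can be checked with the Popov-Belevitch-Hautus (PBH) rank test. The sketch below is an illustration under assumptions: the helper names are hypothetical, and the agent is a position/velocity model with friction chosen for the example, not a model taken from the text. It also confirms the equivalence stated in Remark 2.4 for a scalar R_ii.

```python
import numpy as np


def is_stabilizable(A, B, tol=1e-9):
    """PBH test: rank [A - lam*I, B] = n for every eigenvalue with Re(lam) >= 0."""
    n = A.shape[0]
    for lam in np.linalg.eigvals(A):
        if lam.real >= -tol:
            M = np.hstack([A - lam * np.eye(n), B])
            if np.linalg.matrix_rank(M, tol=1e-8) < n:
                return False
    return True


def is_detectable(A, C, tol=1e-9):
    """Detectability of (A, C) is stabilizability of the dual pair (A^T, C^T)."""
    return is_stabilizable(A.T, C.T, tol)


# Hypothetical agent: position x1, velocity x2, friction c, force input.
c = 0.2
A = np.array([[0.0, 1.0], [0.0, -c]])
B = np.array([[0.0], [1.0]])
C_pos = np.array([[1.0, 0.0]])      # output = position
C_vel = np.array([[0.0, 1.0]])      # output = velocity (for comparison)
R_ii = np.array([[1.0]])            # positive definite weight (scalar here)

print(is_stabilizable(A, B))                         # True
print(is_detectable(A, C_pos))                       # True
print(is_detectable(A, (R_ii + R_ii.T) @ C_pos))     # True, as in Remark 2.4
print(is_detectable(A, C_vel))                       # False
```

With a velocity output the eigenvalue at 0 (the position mode) is unobservable, so the pair fails the test; this shows that the choice of output matters for Assumption 2.3.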
Remark 2.5. (Existence of steady state) Assumption 2.4 guarantees the existence of the steady state $(\mathbf{x}_i, \mathbf{u}_i)$; see Huang (2004, Theorem 1.9). The appendix presents some conditions guaranteed by Assumption 2.4 that will be used in subsequent sections.
To reject the disturbance generated by exosystem (4) and handle the uncertainties in the agent dynamics, an internal model is constructed for each agent $i$, $i \in I$. The following is a general definition of an internal model given an exosystem in the form $\dot{w} = S w$.
Definition 2 (Internal model). Huang (2004, Definition 1.25) Given any square matrix $S$, a pair of matrices $(M_1, M_2)$ incorporates a $p$-copy internal model of $S$ if
$$M_1 = V \begin{bmatrix} T_1 & T_2 \\ 0 & G_1 \end{bmatrix} V^{-1}, \qquad M_2 = V \begin{bmatrix} T_3 \\ G_2 \end{bmatrix},$$
where $(T_1, T_2, T_3)$ are arbitrary constant matrices of any compatible dimensions, $V$ is any nonsingular matrix with the same dimension as $M_1$, and $G_1 = \mathrm{blockdiag}(\beta_1, \dots, \beta_p)$, $G_2 = \mathrm{blockdiag}(\sigma_1, \dots, \sigma_p)$, where $\beta_i$ are square matrices and $\sigma_i$ are column vectors with appropriate dimensions, $(\beta_i, \sigma_i)$ are controllable pairs, and the characteristic polynomials of $\beta_i$ are the same as the minimal polynomial of $S$ for all $i = 1, \dots, p$.
Remark 2.6. By Definition 2, as a special case of (M 1 , M 2 ), the pair of (G 1 , G 2 ) also incorporates a p-copy internal model of matrix S .
Remark 2.7. ("p-copy" Internal model) Under Definition 2, the dimension of the internal model is the dimension of the output times the order of the minimal polynomial of S . The internal model design in Isidori et al. (2003) has the same dimension but does not use the term "p-copy".
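One common way to obtain a pair (G_1, G_2) as in Definition 2 is to take each beta_i as the companion matrix of the minimal polynomial of S and each sigma_i as the last unit vector. The sketch below follows this route under assumptions: the function name is hypothetical, the minimal polynomial is supplied by the user, and the example exosystem (a sinusoid of angular frequency pi/10 plus a constant) has distinct eigenvalues, so its minimal polynomial equals its characteristic polynomial.

```python
import numpy as np


def p_copy_internal_model(min_poly, p):
    """Build (beta, sigma, G1, G2) from monic coefficients [1, a_{s-1}, ..., a_0]."""
    a = np.asarray(min_poly, dtype=float)
    s = len(a) - 1                         # degree of the minimal polynomial
    beta = np.zeros((s, s))
    beta[:-1, 1:] = np.eye(s - 1)          # ones on the superdiagonal
    beta[-1, :] = -a[:0:-1]                # last row: -a_0, -a_1, ..., -a_{s-1}
    sigma = np.zeros((s, 1))
    sigma[-1, 0] = 1.0
    G1 = np.kron(np.eye(p), beta)          # blockdiag(beta, ..., beta), p copies
    G2 = np.kron(np.eye(p), sigma)         # blockdiag(sigma, ..., sigma)
    return beta, sigma, G1, G2


omega = np.pi / 10
# Eigenvalues +j*omega, -j*omega, 0 are distinct, so the minimal polynomial
# is s^3 + omega^2 * s.
min_poly = [1.0, 0.0, omega**2, 0.0]
p = 2                                      # output dimension (illustrative)
beta, sigma, G1, G2 = p_copy_internal_model(min_poly, p)

# Controllability matrix of (beta, sigma): [sigma, beta sigma, beta^2 sigma].
ctrb = np.hstack([np.linalg.matrix_power(beta, k) @ sigma for k in range(3)])

print(G1.shape, G2.shape)                  # (6, 6) (6, 2)
print(np.linalg.matrix_rank(ctrb))         # 3
```

The resulting dimension of G_1 is the output dimension times the degree of the minimal polynomial (here 2 x 3 = 6), in line with Remark 2.7.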

Distributed game strategies
In this section, two distributed output feedback control strategies are developed for the NE seeking problem. We first consider the case where the agents are connected by a digraph without loops. Then, by letting the agents communicate more information with their neighbors, we relax the assumption on the information graph.

Directed communication graph
Using the internal model approach, a distributed error feedback strategy is designed in the form of (9), where $\eta_i \in \mathbb{R}^{n_i + p_i s_i}$, $K_i = [K_{i1} \ \ K_{i2}]$ is the gain matrix, and the pair $(G_{i1}, G_{i2})$ incorporates a $p_i$-copy internal model of matrix $S_i$. By Definition 2, the pair $(M_{i1}, M_{i2})$ also incorporates a $p_i$-copy internal model of matrix $S_i$.
Before presenting the main result, we make the following assumption on the graph.

Assumption 3.1. The communication digraph contains no loops.

In most existing works on distributed NE seeking strategies for games, for instance Salehisadaghiani and Pavel (2016) (2019); Romano and Pavel (2019a), the network topology is assumed to be undirected and connected. As far as we know, the strategies proposed in the aforementioned works are not applicable to games with directed communication graphs. However, it is noted that the controller (9) cannot solve the NE seeking problem with undirected communication graphs. A distributed strategy that is able to handle both directed and undirected communication graphs will be presented in the next subsection by allowing the neighboring agents to exchange some additional information.
Remark 3.1. (State privacy in (9)) In controller (9), each agent $i$, $i \in I$, exchanges its output $y_i$ with its neighbors. Note that its neighbors are not able to reconstruct the full state $x_i$ of agent $i$ using only this output. In the case where $C_i \neq I_{n_i}$ and $p_i \neq n_i$, at least part of the agent state can remain private from its neighbors.

Now we are ready to present the main result of this subsection.
Theorem 1 (Distributed strategy under communication digraphs). Under Assumptions 2.3, 2.4, and 3.1, the distributed strategy (9) is a solution to the NE seeking Problem 1.
Proof. Denote $z_i = \mathrm{col}(x_i, \eta_i)$ and $z = \mathrm{col}(z_1, \dots, z_N)$. The closed-loop system can be written in terms of $z$ and the exogenous signal $v$, where $v = \mathrm{col}(v_1, \dots, v_N)$ and $\bar{S} = \mathrm{blockdiag}(\bar{S}_1, \dots, \bar{S}_N)$. We use $A_c$, $P_c$ and $C_c$ to denote the closed-loop system composed of the nominal dynamics and the controller (9), where $A_c$ is a block matrix with diagonal blocks $A_{ci}$ and off-diagonal blocks $E_{ij}$, $C_c$ is a block matrix with blocks $C_{ij}$ for $i, j \in I$, $i \neq j$, $P_c = \bar{P}$, and $Q_c = \bar{Q}$.
According to Huang (2004, Theorem 1.31), under Assumptions 2.2 and 2.3, there exists a dynamic output feedback controller in the form of (9) such that the closed-loop system is stable and the error $e$ converges to 0 asymptotically if and only if the matrix $A_c$ is Hurwitz and the regulator equations (11) have a unique solution $Z$ for any $\mu$ in an open neighborhood of $\mu = 0$.

First, we examine the stability of the nominal matrix $A_c$. Under Assumption 3.1, we can label the agents such that $i < j$ if $(i, j) \in E$ for $i, j \in I$. Then, $A_c$ becomes a block lower triangular matrix. For each $i \in I$, the diagonal block $A_{ci}$ is similar to a matrix denoted by $\bar{A}_{ci}$. Note that there exists $L_i$ for $i \in I$ such that $A_i - L_i (R_{ii} + R_{ii}^T) C_i$ is Hurwitz, as the pair $(A_i, (R_{ii} + R_{ii}^T) C_i)$ is detectable. Under Assumption 2.4 and by the definition of the internal model, a rank condition holds for all $\lambda \in \mathrm{spec}(G_{i1})$. Then, according to Huang (2004, Lemma 1.26), under Assumptions 2.2 and 2.3, the gain $K_i$ can be chosen such that the corresponding matrix is Hurwitz. Then, for $i \in I$, $\bar{A}_{ci}$, and consequently $A_{ci}$, are Hurwitz, which shows that the diagonal blocks of $A_c$ are all Hurwitz. As $A_c$ is a block lower triangular matrix, it is also Hurwitz.
Next, we show that there exists a unique solution $Z$ to the regulator equations (11). As $A_c$ is Hurwitz, according to Huang (2004, Lemma 1.27), the matrix equations (13), the first of which reads $X \bar{S} = \bar{A} X + \bar{B} \bar{K} \Xi + \bar{P}$, have a unique solution $(X, \Xi)$ for any matrices $\bar{P}$ and $\bar{Q}$, where $\bar{K} = \mathrm{blockdiag}(K_1, \dots, K_N)$. Note that the matrix equations (13) can be put into the form of (14), and the solvability of (14) means the solvability of the regulator equations (11) for any $\mu$ in any open neighborhood of $\mu = 0$. Therefore, by Huang (2004, Lemma 1.20), the distributed controller (9) solves the output regulation problem. Then, following Proposition 1, controller (9) also solves the NE seeking Problem 1.
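The relabeling step used in the proof of Theorem 1 amounts to a topological ordering of the digraph. The following is a minimal sketch using Kahn's algorithm; the function name and the example edge set are hypothetical, and the convention that an edge (i, j) makes the dynamics of agent j depend on agent i is an assumption made here for illustration.

```python
from collections import deque


def relabel_without_cycles(num_agents, edges):
    """Return labels with new_label[i] < new_label[j] for every edge (i, j).

    Raises ValueError if the digraph contains a directed cycle.
    """
    successors = {i: [] for i in range(num_agents)}
    in_degree = [0] * num_agents
    for i, j in edges:
        successors[i].append(j)
        in_degree[j] += 1

    queue = deque(i for i in range(num_agents) if in_degree[i] == 0)
    order = []
    while queue:
        i = queue.popleft()
        order.append(i)
        for j in successors[i]:
            in_degree[j] -= 1
            if in_degree[j] == 0:
                queue.append(j)

    if len(order) != num_agents:
        raise ValueError("the digraph contains a directed cycle")
    return {node: k for k, node in enumerate(order)}


edges = [(3, 1), (1, 0), (3, 2), (2, 0)]          # illustrative digraph
new_label = relabel_without_cycles(4, edges)

# Coupling pattern: entry (row, col) is nonzero when agent `row` uses agent `col`.
# With the assumed convention, edge (i, j) gives a nonzero block at (j, i).
pattern = [(new_label[j], new_label[i]) for i, j in edges]
is_lower_triangular = all(row > col for row, col in pattern)
print(new_label, is_lower_triangular)
```

Once the coupling blocks sit strictly below the diagonal, the spectrum of the overall matrix is the union of the spectra of its diagonal blocks, which is why it suffices to make each diagonal block Hurwitz.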
For the case where the agents are free of disturbances or subject to constant external disturbances, the exosystem (4) satisfies $\mathrm{spec}(\bar{S}_i) = \{0\}$. Then, the controller (9) can be simplified. The distributed output feedback controller for the disturbance-free NE seeking problem can be derived directly from Theorem 1.
Corollary 1. Consider an NE seeking problem defined in Problem 1 with the disturbance-free agent dynamics
$$\dot{x}_i = A_i(\mu_i) x_i + B_i(\mu_i) u_i, \qquad y_i = C_i(\mu_i) x_i, \qquad (15)$$
and cost functions (3) for all $i \in I$. Under Assumptions 2.3, 2.4 and 3.1, there exist matrices $L_i$, $K_{i1}$ and $K_{i2}$ such that the NE seeking problem has a solution in the form of (9).

Remark 3.2. (Integrator in the strategy) When the agents are not affected by external disturbances, we can design the $p_i$-copy internal model $(G_{i1}, G_{i2})$ as $(0_{p_i \times p_i}, I_{p_i})$ for each agent $i$, $i \in I$. Note that in this case, it is still necessary to include an integrator in the controller. This is because the steady-state output $\mathbf{y}$ satisfying $\bar{R} \mathbf{y} + \bar{Q} = 0$ depends on the constant matrix $\bar{Q}$. If the agents have dynamics (1) and the disturbance $w_i$ is a constant vector, the NE seeking strategy can use the same design as shown in Corollary 1, as $\mathrm{spec}(\bar{S}_i) = \{0\}$ still holds.
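As a small worked check of Remark 3.2 against Definition 2, consider the special case where the augmented exosystem matrix is the zero matrix (constant exogenous signals only). The internal-model state is written here as $\zeta_i$ and is assumed to be driven by the regulated error $e_i$; both the symbol and the driving signal are assumptions made for illustration rather than statements taken from the controller (9).

```latex
% Worked check: (0_{p_i x p_i}, I_{p_i}) as a p_i-copy internal model when \bar S_i = 0.
\begin{align*}
&\text{minimal polynomial of } \bar{S}_i = 0:
  && m(s) = s, \\
&\beta_k = 0,\ \sigma_k = 1 \quad (k = 1, \dots, p_i):
  && \det(s - \beta_k) = s = m(s), \quad (\beta_k, \sigma_k) \text{ controllable}, \\
&G_{i1} = \mathrm{blockdiag}(\beta_1, \dots, \beta_{p_i}) = 0_{p_i \times p_i},
  && G_{i2} = \mathrm{blockdiag}(\sigma_1, \dots, \sigma_{p_i}) = I_{p_i}, \\
&\dot{\zeta}_i = G_{i1} \zeta_i + G_{i2} e_i = e_i
  && \Longrightarrow \quad \zeta_i(t) = \zeta_i(0) + \int_0^t e_i(\tau)\, d\tau .
\end{align*}
```

Under these assumptions the internal model is a bank of $p_i$ integrators, and any equilibrium of the closed loop requires $\dot{\zeta}_i = e_i = 0$, which is how the integrator enforces a zero partial gradient at steady state.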

General communication graph
Assumption 3.1 can be relaxed if the distributed game strategy is designed as (16), where $\hat{e}_i = (R_{ii} + R_{ii}^T) C_i \xi_i + \sum_{j \in N_i} R_{ij} C_j \xi_j$, and the matrix $L_i$ and the pair $(G_{i1}, G_{i2})$ have the same definitions as in (9). The difference between (16) and (9) is that in (16) the agents exchange $C_i \xi_i$ with their neighbors. To rule out the case where the network contains isolated agents solving an optimization problem instead of playing games with neighbors, we make the following assumption on the communication graph.
Assumption 3.2. The communication graph among the agents is connected.
Under Assumption 3.2, the network topology can be a connected undirected graph, or a weakly or strongly connected digraph.
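The graph notions behind Assumption 3.2 can be checked with a breadth-first search. The sketch below is illustrative only: the function names and the example edge lists are hypothetical. Weak connectivity is tested on the undirected version of the digraph, and strong connectivity by searching both the digraph and its reverse from one node.

```python
from collections import deque


def reachable(start, adj):
    """Set of nodes reachable from `start` by breadth-first search."""
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen


def adjacency(nodes, edges, undirected=False):
    adj = {u: set() for u in nodes}
    for u, v in edges:
        adj[u].add(v)
        if undirected:
            adj[v].add(u)
    return adj


def is_weakly_connected(nodes, edges):
    nodes = list(nodes)
    return len(reachable(nodes[0], adjacency(nodes, edges, True))) == len(nodes)


def is_strongly_connected(nodes, edges):
    nodes = list(nodes)
    forward = adjacency(nodes, edges)
    backward = adjacency(nodes, [(v, u) for u, v in edges])
    return (len(reachable(nodes[0], forward)) == len(nodes)
            and len(reachable(nodes[0], backward)) == len(nodes))


agents = range(1, 6)
chain = [(1, 2), (2, 3), (3, 4), (4, 5)]           # directed path
ring = chain + [(5, 1)]                            # directed cycle
print(is_weakly_connected(agents, chain), is_strongly_connected(agents, chain))
print(is_weakly_connected(agents, ring), is_strongly_connected(agents, ring))
```

A network with an isolated agent fails even the weak test, which is the situation the assumption is meant to exclude.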
Theorem 2 (Distributed strategy under connected communication graphs). Under Assumptions 2.3, 2.4, and 3.2, distributed strategy (16) is a solution to the NE seeking Problem 1.
Proof. Denote $\xi = \mathrm{col}(\xi_1, \dots, \xi_N)$, $\zeta = \mathrm{col}(\zeta_1, \dots, \zeta_N)$, and $z = \mathrm{col}(x, \xi, \zeta)$, and let $A_c$ be the resulting system matrix of the nominal closed-loop system. Note that $A_c$ here is in the same form as $A_{ci}$ in the proof of Theorem 1. Therefore, under the assumption that $\bar{R}$ is positive definite, it can be proved in the same fashion that $A_c$ is Hurwitz and the regulator equations have a unique solution. Then, applying Huang (2004, Theorem 1.31), we can prove that there exists a dynamic output feedback controller in the form of (16) that stabilizes the closed-loop system and drives the error $e$ to zero asymptotically. Hence, following Proposition 1, (16) also solves the NE seeking problem.
By allowing the agents to exchange the additional information $C_i \xi_i$ with their neighbors, we can relax the assumption on the communication graph in Theorem 1. In (16), the $\xi_i$-subsystem can be seen as an observer for the state $x_i$ and the $\zeta_i$-subsystem is the internal model for the exosystem (4). As the agents exchange $C_i \xi_i$ instead of the full state estimate $\xi_i$, the agents maintain some privacy, by arguments similar to those in Remark 3.1.
Similar to the previous subsection, we can derive from Theorem 2 a corollary for a disturbance-free NE seeking problem under a general communication graph.
Corollary 2. Consider an NE seeking problem with agent dynamics (15) and cost functions (3) for all $i \in I$. Under Assumptions 2.3 and 2.4, there exist matrices $L_i$, $K_{i1}$ and $K_{i2}$ such that the NE seeking problem has a solution in the form of (16).

Simulation results
In this section, the proposed NE seeking strategies are applied to the connectivity control of sensor networks studied in Stanković et al. (2012). The network is composed of mobile robot agents to be positioned at optimal sensing points while keeping good connections with selected neighboring agents. In this example, we consider mobile robots whose dynamics are described in terms of $x_{i1}$ and $x_{i2}$, the position and velocity of each agent, respectively, and the friction parameter $c_i > 0$. The decision variable is the position of each agent, i.e., $y_i = x_{i1}$. The cost function of each agent $i \in I$ depends on $r_i$, the objective position of the agent. Then, by converging to the NE, the agents compromise between the individual objective of moving to position $r_i$ and the collective objective of maintaining the connectivity with their neighbors.
In the simulation, the sensor network is composed of 5 mobile robots subject to disturbances generated by an exosystem that produces sinusoidal signals with frequency $\pi/10$. The friction parameter is set as $c_i = 0.2$. The initial positions of the agents are $(0, 0)$, $(1, 1)$, $(1, -1)$, $(2, 1)$, $(2, -1)$, respectively. First, we consider a network connected by the digraph illustrated in Figure 1(a). In Figure 2, the initial positions of the agents are denoted by circles and the NE is denoted by a collection of crosses. Figure 3 illustrates the regulated errors $e_i$ of the agents in the $x$ and $y$ axes.
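A sinusoidal disturbance of the kind used in the simulation can be generated as sketched below. This is an illustration under assumptions: the exosystem matrix is taken to be the standard two-dimensional harmonic oscillator with angular frequency pi/10, which is not confirmed by the text, and the initial condition `w0` is arbitrary.

```python
import numpy as np

omega = np.pi / 10                                 # assumed angular frequency
S = np.array([[0.0, omega], [-omega, 0.0]])        # assumed exosystem matrix
period = 2.0 * np.pi / omega                       # equals 20 time units


def disturbance(t, w0):
    """Solution w(t) = exp(S t) w0 of dw/dt = S w, using the closed form of exp(S t)."""
    c, s = np.cos(omega * t), np.sin(omega * t)
    Phi = np.array([[c, s], [-s, c]])              # matrix exponential of S * t
    return Phi @ w0


w0 = np.array([1.0, 0.5])                          # arbitrary initial condition
samples = np.array([disturbance(t, w0) for t in np.linspace(0.0, period, 5)])
print(samples)
```

Both eigenvalues of `S` lie on the imaginary axis, so the signal neither decays nor grows, consistent with the requirement that the exosystem has no eigenvalues with negative real parts.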
The second case is when the underlying graph of the agents is undirected as illustrated in Figure 1

Conclusion
Motivated by engineering applications, this paper considers output NE seeking problems in a class of network games with uncertain linear agent dynamics subject to external disturbances. The main challenge is to design a strategy capable of both driving the agent outputs to the NE and stabilizing the closed-loop dynamics. Other difficulties stem from the uncertainties in the dynamics, the external disturbances, and the relaxed assumptions on the communication graphs. By overcoming these difficulties, our proposed game strategies have a wider range of applications to engineering multi-agent problems. Future work may consider similar output network games with general cost functions and/or nonlinear agent dynamics.

Appendix
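Some insights regarding Assumption 2.4

Under Assumption 2.4, it can be proved that a rank condition holds for any nonsingular matrix $D_i \in \mathbb{R}^{p_i \times p_i}$ and any $\lambda \in \mathrm{spec}(S_i) \cup \{0\}$. To prove this, we introduce two matrices $F_i$ and $H_i$. Then, $F_i \in \mathbb{R}^{(n_i + p_i) \times (n_i + p_i)}$ is nonsingular, $\mathrm{rank}\, F_i = n_i + p_i$, $H_i \in \mathbb{R}^{(n_i + p_i) \times (n_i + p_i)}$, and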