On the Upper Bound of Near Potential Differential Games

This letter presents an extended analysis and a novel upper bound for the subclass of Linear Quadratic Near Potential Differential Games (LQ NPDG). LQ NPDGs are a subclass of potential differential games, for which a distance between an LQ exact potential differential game and the LQ NPDG is defined. LQ NPDGs exhibit a unique characteristic: the smaller the distance from an LQ exact potential differential game, the closer their dynamic trajectories. This letter introduces a novel upper bound for this distance. Moreover, a linear relation between this distance and the resulting trajectory errors is established, opening the possibility for further applications of LQ NPDGs.


Introduction
Game theory is a widely used mathematical tool to model interactions between multiple agents [1]. In a game, different players interact with each other in order to optimize their own cost functions. Due to the interaction between them, the optimal solution has to be computed in a coupled manner. One of the solution concepts is the so-called Nash Equilibrium (NE), which emerges as a solution in non-cooperative games where players independently pursue their goals without forming agreements [2]. This necessitates coupled optimization processes for each player in an N-player game. For a comprehensive overview of the theory of dynamic games, the reader is referred to [3].
In the case of the so-called potential games, the game can be characterized by one single cost (potential) function instead of N coupled optimizations. This enables the calculation of the Nash Equilibrium (NE) by simply optimizing this potential function. Furthermore, the uniqueness of the NE is assured when dealing with a convex potential function, enhancing the appeal of this game characterization in practical scenarios, like motion planning [4], communication network management [5], modeling human-robot interactions [6], multi-agent systems [7] or network-flow control problems [8].
The core idea of near potential games is the usage of a distance metric between two differential games. In that way, the required exactness of the exact potential differential games is transformed into a less restrictive condition, which permits a small, remaining difference between the two games. The concept of near potential static games is introduced in [9,10], based on the intuitive idea that if two games are close in terms of the properties of the players' strategy sets, their properties in terms of the NE should be similar. A systematic framework for static games was developed in [9]. It was shown that a near potential static game has a similar convergence of the strategies compared to an exact potential static game: similar changes in the input strategies lead to similar changes in the payoffs of the game. Furthermore, it is also shown that the meaning of close can be quantified in the developed framework, see [9].
In this letter, a specific subclass, the Near Potential Differential Games (NPDGs), is discussed. In [11], the concept of NPDGs was introduced, in which the similarity of the trajectories is given as a non-linear function of the closeness of two games. In this letter, a novel upper bound is provided: a linear relation is derived, facilitating a more feasible application of this upper bound. The primary contribution is the derivation of this novel upper bound for NPDGs.

Preliminaries
In the following, the focus of this letter lies on linear quadratic (LQ) differential games. LQ differential games are useful for modeling a wide range of engineering problems since they provide a simple and effective way to trade off conflicting objectives and make optimal decisions for dynamic systems.

Exact Potential Differential Games
Definition 1 (LQ Differential Game [12]). An LQ differential game Γ_d is defined as a tuple of
• a set of N players i ∈ P = {1, 2, ..., N},
• a dynamic system with the system matrix A and the input matrix B^(i) of player i,
• the joint set of control strategies of the players U = U^(1) × ... × U^(N) and
• the set of the players' cost functions J = {J^(1), ..., J^(N)},
where Q^(i) and R^(ij) represent the penalty matrices for the system states and the system inputs of player i. The end of the game is τ_end. It is assumed that the matrices of the cost functions have a diagonal structure, Q^(i) = diag(q^(i)_1, ..., q^(i)_n) and R^(ij) = diag(r^(ij)_1, ..., r^(ij)_{p_i}), and are positive semi-definite and positive definite, respectively.
Definition 2 (Nash Equilibrium [12]). The game is in a Nash equilibrium (NE) if the players cannot deviate from their current strategies without increasing their costs. In order to compute the NE of a differential game, the so-called coupled Riccati equations are set up [2, Chapter 7], for which the Hamiltonians of the players are computed as in (3). For further details on the solution of the coupled Riccati equations, the reader is referred to [1, Chapter 3].
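To make the coupled structure concrete, the following sketch computes a feedback Nash equilibrium for a scalar two-player LQ game by iterating single-player Riccati solutions against the other player's current feedback gain. All numerical values, and the fixed-point iteration itself, are illustrative assumptions and not taken from this letter.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Scalar 2-player LQ game (illustrative values, not from the letter):
# dx/dt = a*x + b1*u1 + b2*u2, J_i = integral of x'Q_i x + u_i'R_ii u_i dt.
a, b1, b2 = -1.0, 1.0, 1.0
A = np.array([[a]])
B1, B2 = np.array([[b1]]), np.array([[b2]])
Q1 = Q2 = np.array([[1.0]])
R11 = R22 = np.array([[1.0]])

# Fixed-point iteration over the coupled Riccati equations: each player
# solves a single-player ARE with the other's feedback folded into the drift.
K1 = np.zeros((1, 1))
K2 = np.zeros((1, 1))
for _ in range(100):
    P1 = solve_continuous_are(A - B2 @ K2, B1, Q1, R11)
    K1 = np.linalg.solve(R11, B1.T @ P1)
    P2 = solve_continuous_are(A - B1 @ K1, B2, Q2, R22)
    K2 = np.linalg.solve(R22, B2.T @ P2)

# For this symmetric game the coupled AREs have the closed-form solution
# P1 = P2 = 1/3 (positive root of 3p^2 + 2p - 1 = 0), which the
# iteration reproduces.
print(K1[0, 0], K2[0, 0])
```

The iteration is a common heuristic for coupled Riccati equations; its convergence is not guaranteed in general and is only observed here for this stable, symmetric example.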
Definition 3 (LQ Exact Potential Differential Games [13]). Let an LQ differential game Γ_epd with system dynamics (1) be given. Furthermore, let the quadratic cost functions (2) and the Hamiltonian functions (3) of the players be given. Assume that the aggregated input of the players and the aggregated input matrix are defined as u = [u^(1)⊤, u^(2)⊤, ..., u^(N)⊤]⊤ and B = [B^(1), B^(2), ..., B^(N)], respectively. Furthermore, consider an LQ optimal control problem over an infinite time horizon τ_end → ∞ with the cost function (4) as well as the corresponding Hamiltonian function, where the matrices Q^(p) and R^(p) are positive semi-definite and positive definite, respectively. If (6) holds for all i ∈ P, the LQ differential game Γ_epd is an LQ exact potential differential game, which has the potential function J^(p).
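The condition referenced as (6) does not survive in this version of the text. Consistent with Definition 3 and the use of the Hamiltonians in the proof of Theorem 1, it can be sketched (as a reconstruction following the cited literature on exact potential differential games, not a verbatim restatement) as the agreement of the Hamiltonian derivatives with respect to each player's own input:

```latex
\frac{\partial H^{(p)}\left(x, u, \lambda^{(p)}\right)}{\partial u^{(i)}}
= \frac{\partial H^{(i)}\left(x, u, \lambda^{(i)}\right)}{\partial u^{(i)}},
\quad \forall i \in \mathcal{P}
```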
Definition 3 reveals that the NE can be computed by the optimal control problem of (1) and (4) in the case of an exact potential differential game, as long as (6) holds. For further discussions and examples, the reader is referred to [14] and [15].

Distance between two Potential Differential Games
Similar to the static case [10], a distance measure between two differential games is introduced.
Definition 4 (Differential Distance [11]). Let an exact potential differential game Γ_epd with the potential function J^(p) be given. Furthermore, let an arbitrary LQ differential game Γ_npd according to Definition 1 be given. The differential distance (DD) between Γ_epd and Γ_npd is defined as (7). Note 4: Definition 4 defines a vector space in which two games can be compared and their "closeness" can be quantified. It is the intuitive extension of Definition 3 because for an exact potential differential game, σ^(i)_d(t) = 0, ∀t ∈ [t_0, τ_end] holds, meaning that Γ_npd has the same characteristics as Γ_epd. Softening the condition σ^(i)_d(t) = 0 enables a broader use. Using Definition 4, the subclass of NPDGs is formally defined.
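The defining equation (7) of the DD does not survive in this version. A form consistent with Definition 3 and with the proof of Theorem 1 (which substitutes the Hamiltonian derivatives into (7)) would be, as a hedged reconstruction:

```latex
\sigma_d^{(i)}(t) = \left\|
\frac{\partial H^{(p)}\left(x, u, \lambda^{(p)}\right)}{\partial u^{(i)}}
- \frac{\partial H^{(i)}\left(x, u, \lambda^{(i)}\right)}{\partial u^{(i)}}
\right\|_2,
\quad \forall i \in \mathcal{P}, \; t \in [t_0, \tau_{\mathrm{end}}]
```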
Definition 5 (Near Potential Differential Game [11]). A differential game Γ_npd is said to be an NPDG if the DD between Γ_npd and an arbitrary exact potential differential game Γ_epd is bounded by ∆, where ∆ ≥ 0 is a small constant. Note 5.1: Definition 5 does not exclude the subclass of exact potential differential games, as ∆ = 0 is possible. Thus, exact potential differential games are a subset of NPDGs.
Note 5.2: The maximum DD is a measure of the likeness of the games. As the maximum DD increases, the deviations of the state and input trajectories of the NPDG gradually become larger. Thus, the main question is, for a given upper bound ∆, how large a perturbation of the state and input dynamics between Γ_npd and Γ_epd is admissible. Therefore, this perturbation is quantitatively characterized for LQ differential games in the following.

Upper Bound of NPDGs
The main results of this letter are presented in this section: the novel upper bound of the DD and a further analysis of the boundedness of an NPDG.

Properties of an NPDG
Theorem 1 (LQ NPDG). Let an LQ exact potential differential game Γ_epd with its state trajectories x^(p)(t) in its NE be given. Furthermore, let an arbitrary LQ differential game Γ_npd according to Definition 1 with its state trajectories x*(t) in the NE of Γ_npd be given. It is also assumed that there is a ∆x^(p)(t) ≥ 0 such that x^(p)(t) = x*(t) + ∆x^(p)(t) or (9) holds, where ∆ is defined in (5). Furthermore, P^(p) is the Riccati matrix obtained from the optimum of the potential function (4). The matrix P^(i) is the solution of the coupled Riccati equation (3) for player i, see [12]. Then Γ_npd is an LQ NPDG in accordance with Definition 5.
Proof. The derivative of H^(i) is expressed as (12), which holds for i ∈ P. Since (12) is zero at the optimal control laws of the players, a small perturbation around the optimal solution is sought. Based on [6], the derivatives of the Hamiltonian of player i and of the Hamiltonian of the potential function can be rewritten in terms of scalar perturbation functions. Substituting these derivatives into (7), the DD is restated. Introducing an upper bound of the variation, the DD is rewritten accordingly. On the one hand, if (9) holds, the upper bound of σ^(i)_d(t) is rewritten to (15). On the other hand, if (10) holds, the upper bound of σ^(i)_d(t) is given by (17). Introducing the notation x_max for the maximum magnitude of the state vectors, the estimations (15) and (17) can be combined for all i ∈ P,
proving that Γ npd is an NPDG with an upper bound of ∆ * .
If the upper bound ∆ of the DD σ_d between the NPDG and the exact potential differential game is sufficiently small, conclusions about similar closed-loop characteristics can be drawn. In the case of differential games, the system state trajectories are analyzed. The terms small and similar are described more precisely in the next subsection.

Dynamics of LQ NPDGs
The analysis of the so-called (approximate) ε-NE can be found in [16] or [17]. In this letter, the dynamics of the system trajectories are analyzed in order to provide a bound on the differences between two LQ differential games. In contrast to [11], this letter provides a new, linear relation between the DD and the trajectory error.
Let it be assumed for the LQ differential game Γ_npd that the control laws of the players i ∈ P are obtained from the solution to the coupled Riccati equations over an infinite time horizon, which leads to the closed-loop system dynamics (18), and that the unique solution to (18) is (19). For the LQ exact potential differential game Γ_epd, the control law u^(p) is obtained from the optimization of the potential function (4), which is used to compute the feedback system dynamics (20). The solution to (20) is (21). From the state trajectories x^(p)(t) and x*(t), an upper bound (η) of the errors is provided for a given ∆ between two games. For this, a notion of the difference between two closed-loop system behaviors is introduced in Definition 6.
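The closed-loop solutions (19) and (21) are matrix-exponential expressions of the form x(t) = e^{Ãt} x_0. The following sketch cross-checks this closed form against direct numerical integration for an assumed stable closed-loop matrix; the values are illustrative, not taken from the letter.

```python
import numpy as np
from scipy.linalg import expm
from scipy.integrate import solve_ivp

# Illustrative stable closed-loop matrix, standing in for A - sum_i B^(i) K^(i).
A_cl = np.array([[0.0, 1.0],
                 [-2.0, -3.0]])
x0 = np.array([1.0, 0.5])
t1 = 2.0

# Closed-form solution of x' = A_cl x:  x(t1) = expm(A_cl * t1) @ x0.
x_closed = expm(A_cl * t1) @ x0

# Cross-check by numerically integrating the same linear ODE.
sol = solve_ivp(lambda t, x: A_cl @ x, (0.0, t1), x0, rtol=1e-10, atol=1e-12)
x_num = sol.y[:, -1]

# Both routes agree up to the integration tolerance.
print(np.linalg.norm(x_closed - x_num))
```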
Definition 6 (Closed-Loop System Matrix Error). Consider an LQ exact potential differential game Γ_epd with the system trajectories (21). Furthermore, assume that an arbitrary LQ differential game Γ_npd is an NPDG with the system trajectories (19). Then, the closed-loop system matrix error between Γ_epd and Γ_npd is defined as (22). Note 6: Two differential games are similar if the closed-loop system matrix error is small and, consequently, the system trajectories x*(t) and x^(p)(t) of these two games are close to each other. In this case, Γ_npd is an NPDG. This closeness between an NPDG and an LQ exact potential differential game is quantified in Theorem 2.
Theorem 2 (Boundedness of NPDGs). Let an LQ NPDG Γ_npd and an exact potential differential game Γ_epd be given. Let the system state trajectories of the two games Γ_epd and Γ_npd be x^(p)(t) and x*(t), respectively. Moreover, let x^(p)(t_0) = x*(t_0) hold for the initial values. Then, the error between the system state trajectories of Γ_npd and Γ_epd is bounded over an arbitrary time interval [t_0, t_1] such that ‖x^(p)(t) − x*(t)‖_2 ≤ C_NPDG(t) · ∆, where C_NPDG(t) ≥ 0 is a positive, time-dependent coefficient.
Proof. From the solution to the differential equations (18) and (20), (24) is obtained. As (23) holds, using Definition 6 and [18, Theorem 11.16.7] leads to (25). In the following, an upper bound of ∆K is sought. Let the notation (26) be introduced. Substituting (19), (20) and (26) in (25), the upper bound (27) is obtained. In addition, let a block partition of R^(p) be defined, in which R^(p)_i is the submatrix for the inputs u^(i) of player i. Thus, (27) can be reformulated. Due to the well-known scaling ambiguity, there is a manifold of potential functions (4) that result in an identical feedback gain matrix; thus, a scaling factor κ_p > 0, κ_p ∈ R, can be chosen to scale J^(p) such that ‖R^(p)‖_2 > 1 holds. Assuming a suitable scaling, (27) leads to (30). Then, let the following matrix be introduced.
The so-called Frobenius norm is defined as the entry-wise Euclidean norm of a matrix (see [19]), for which the property (31) holds (see [20, Chapter 5] or [18, Section 9.8.12]). Applying the definition of the Frobenius norm to (30), the estimate (32) is obtained. Using the properties (31) and (32) leads to an upper bound. Due to the scaling ambiguity, each player's cost can be scaled as κ_i · J^(i) with κ_i > 0, κ_i ∈ R, and κ_i and κ_p can be chosen to scale R^(i) and R^(p) such that the required norm inequality holds (see [18, Section 9.9.42]). This leads to (36). The substitution of the upper bound of ∆K in (25) by (36) leads to the coefficient (37), which results in the following upper bound of the trajectory error (38). Remark 1: From (38), it can be seen that the upper bound of the DD governs the maximal admissible error between the trajectories, where the function C_NPDG(t) depends only on the initial value, the system structure and the time interval [t_0, t_1].
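The linear dependence of the trajectory error on the perturbation, as expressed by (38), can be illustrated numerically: doubling the size of the closed-loop system matrix error should, to first order, double the trajectory error. The matrices below are illustrative assumptions in the spirit of Definition 6, not values from the letter.

```python
import numpy as np
from scipy.linalg import expm

# Nominal stable closed-loop matrix (stands in for the potential game's
# closed loop); values are illustrative, not from the letter.
A_p = np.array([[0.0, 1.0],
                [-2.0, -3.0]])
E = np.array([[0.0, 0.0],
              [1.0, 0.0]])   # direction of the closed-loop system matrix error
x0 = np.array([1.0, 0.0])
t = 1.0

def traj_error(eps):
    """||x^(p)(t) - x*(t)||_2 when the perturbed closed loop is A_p + eps*E."""
    x_pot = expm(A_p * t) @ x0
    x_np = expm((A_p + eps * E) * t) @ x0
    return np.linalg.norm(x_pot - x_np)

# Doubling the perturbation roughly doubles the trajectory error,
# mirroring the linear relation between the DD bound and the error.
e1, e2 = traj_error(1e-4), traj_error(2e-4)
print(e2 / e1)
```

For small perturbations the ratio approaches 2, since second-order terms in the perturbation size are negligible.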

Remark 2:
In (37), C_NPDG(t) is bounded in the time interval [t_0, t_1]. Thus, Theorem 2 holds for t ∈ [t_0, t_1] only. However, ∆ can be defined piecewise over successive time intervals. In the case of asymptotically stable system state trajectories x^(p)(t) and x*(t), a monotonically decreasing series ∆_{N−1} ≥ ∆_N can be assumed, which prevents C_NPDG(t) from growing exponentially as t → ∞. Consequently, Theorem 2 also holds for t → ∞.
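The piecewise definition referred to in Remark 2 is missing in this version. Based on the surrounding text (the series ∆_{N−1} ≥ ∆_N over successive intervals), a plausible reconstruction is a piecewise-constant bound:

```latex
\Delta(t) = \Delta_N \quad \text{for } t \in [t_N, t_{N+1}), \qquad N = 0, 1, 2, \ldots,
\qquad \text{with } \Delta_{N-1} \ge \Delta_N
```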
Remark 3: Note that Theorem 2 differs from the upper bound on the distance between solutions of two general initial value problems of differential equations: the upper bound between two general initial value problems is given as a function of the Lipschitz constant and is usually proved with the Grönwall-Bellman inequality, see e.g. [21, Theorem 3.4]. In contrast, Theorem 2 provides the link between the upper bound on ‖x^(p)(t) − x*(t)‖_2 and the DD ∆ of the two games, which differs from general initial value problems. Thus, Theorem 2 is a special case of Theorem 3.4 in [21].

Discussion
The main result of this letter enables a broader understanding of the concepts of NPDGs, which provide a more compact representation of strategic games.This makes them suitable for engineering applications, as the strictness of exact potential differential games is softened, thereby extending the applicability of the concept of potential games.
Illustrative engineering examples include human-human or robot-human interactions, for which NPDGs are suitable models. Such interactions are modeled by differential games in the literature [22,23], and studies have demonstrated that the resulting motions of human-human or robot-human interactions can be characterized by the NE of such a differential game [24]. Nevertheless, the assumption of an NE can be violated in some cases due to the so-called bounded rationality of humans (cf. [25,26]). In cases where such violations of the NE occur in human-machine interaction scenarios, the proposed upper bound of the DD is a helpful tool to quantify the deviation from the NE. Thus, the concept can be used to analyze and design human-machine interactions.

Summary and Outlook
This letter introduces a novel upper bound on the distance between an NPDG and an exact potential differential game. Moreover, this letter shows that the resulting trajectory error has a linear relation to the defined upper bound, which enables the prediction of the maximal trajectory error between an NPDG and an exact potential differential game. In the future, the proposed NPDG concept will be applied to model human-machine interactions.