A decentralized control approach in hypergraph distributed optimization decomposition cases

We study the main decomposition approaches (primal, dual and primal–dual) for a distributed optimization problem from a dynamical system perspective, where the couplings among the variables of the optimization problem are described by an undirected, unweighted hypergraph. We conduct stability analysis for the dynamical systems associated with each decomposition case by using nonlinear decentralized control techniques and spectral properties of the respective communication matrices, i.e., the incidence and Laplacian matrices of the hypergraph. Finally, we provide numerical simulations under a specific coalitional setting that demonstrate the superiority of the hypergraph over its graph analogue, the clique expansion graph, for the given decomposition algorithms in terms of convergence rate and information transmission efficiency.


Introduction
With the extensive study of multi-agent systems in almost all aspects of applied mathematics, many optimization algorithms based on standard optimization theory (Bertsekas 2009) were developed around the graphical structure of these systems, as in Bertsekas (1991), Cerquides (2014) and Shakkottai and Rayadurgam (2008). One of these approaches is distributed optimization, which has its roots in Tsitsiklis (1984). The main trait of distributed optimization is the presence of coupling (common) variables among the objective functions of the agents of the system; a review of distributed optimization is provided in Yang (2019). The couplings of these variables are usually described by a graph, but in this work we instead use a hypergraph, in a similar fashion to Samar et al. (2007) and Boyd (2007).
Hypergraphs (Berge 1973) are a generalization of graphs in the sense that they allow multiple nodes, and not just two as in the case of a graph, to be attached to the same edge (hyperedge). As a result, this type of graphical structure can depict more complex communication structures than a graph. More details on the properties of hypergraphs are provided in Dai and Gao (2023). Besides describing a communication structure more efficiently than a graph, hypergraphs can also accelerate the underlying optimization algorithms. This comparison is most meaningful in the coalition or clustering setting, where we usually have unions of complete subgraphs which can be viewed as the clique expansions of hypergraphs (Agarwal et al. 2006). It is also important to note that in the clustering setting the notion of consensus is quite important, and in the hypergraph setting each hyperedge can be interpreted as a consensus variable among the attached agents. For more details on consensus theory we refer the reader to Kia et al. (2019). In the examples of this work we provide numerical simulations under a specific coalitional setting, where we show that the hypergraph is more efficient in terms of information transmission and faster in terms of convergence rate than its respective clique expansion graph for the dual decomposition and primal–dual algorithms, which utilize the respective Laplacian matrices.
In most cases, the stability analysis of distributed optimization algorithms is conducted by utilizing spectral properties and matrix-theoretic techniques (Zhang 2011) of the respective communication matrix (Kosaraju 2017; Kvaternik and Pavel 2011), usually the graph Laplacian matrix. Our approach is similar, thanks to the hypergraph Laplacian matrix proposed by Bolla (1993), which satisfies all the properties of a Laplacian matrix. In this paper we focus on algorithms that were presented in Palomar and Mung (2006) and Boyd (2007). We study the main decomposition cases (primal, dual and primal–dual) for the hypergraph distributed optimization problem from a dynamical system viewpoint. We conduct the stability analysis of these dynamical systems by utilizing tools from nonlinear control theory (Nijmeijer and Van der Schaft 1990), mostly passivity and Lyapunov techniques. We also show that the equilibrium points of the dynamical systems coincide with the optimal solution of the hypergraph distributed optimization problem. This work is an extended version of Papastaikoudis and Lestas (2023), which was presented at the 12th International Conference on Complex Networks and their Applications. Here we additionally study the primal and dual decomposition cases by expressing the problem with the help of the vector of common values and the hypergraph incidence matrix. We also show how the hypergraph Laplacian matrix, which we use directly in the primal–dual algorithm, arises naturally from the dual decomposition algorithm.
The paper is organized in the following way: in "Preliminaries" section we present the main mathematical preliminaries used in this work. In "Hypergraph distributed optimization" section we introduce the hypergraph distributed optimization problem, and in "Decomposition cases" section we present the decomposition cases along with the respective stability analysis proofs. Examples that highlight the advantage of the hypergraph over its graph analogue, the clique expansion graph, are presented throughout the paper.

Basic notions
By R we denote the set of real numbers and by R^n its n-dimensional version. For x ∈ R^n, by x ≥ 0 (x > 0) we mean that the components of x are nonnegative (positive). By || · ||_2 we denote the l_2 (Euclidean) norm. Function | · | denotes the cardinality function. For a matrix M ∈ R^{n×n}, by M^T and S(M) we denote its transpose and its spectrum respectively. Matrix P ∈ R^{n×n} is an orthogonal projection matrix when P^2 = P = P^T, in which case its spectrum satisfies S(P) ⊆ {0, 1}. A symmetric matrix M is called positive definite, M ≻ 0 (positive semidefinite, M ⪰ 0), if and only if x^T M x > 0 (≥ 0) for every nonzero x ∈ R^n.

Non-linear control theory
We study the following continuous-time, time-invariant system,

ẋ(t) = f(x(t)),     (1)

with x(t) ∈ R^n and f : R^n → R^n being continuous. A function V : R^n → R for which the relationship ||x|| → ∞ ⇒ V(x) → ∞ is satisfied is called radially unbounded. By V̇ : R^n → R we denote the Lie derivative of V along (1), which is expressed by the formula V̇(x) = ∇V(x)^T f(x).

Theorem 1 Let x* be an equilibrium point of (1). If V : R^n → R is radially unbounded, positive definite with respect to x* (i.e., V(x*) = 0 and V(x) > 0 for x ≠ x*) and V̇(x) < 0, ∀ x ≠ x*, then x* is globally asymptotically stable and V is a valid Lyapunov function for (1).
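As a numerical sanity check of Theorem 1 (our own illustration, not from the paper), one can integrate a simple system whose Lyapunov function is known in closed form; here we assume the dynamics ẋ = −x with V(x) = ||x||², which is radially unbounded and satisfies V̇ = −2||x||² < 0 away from the origin:

```python
import numpy as np

# Hypothetical illustration of Theorem 1: for x' = -x, the radially
# unbounded function V(x) = ||x||^2 decreases along every trajectory,
# so the origin is globally asymptotically stable.
def simulate(x0, dt=1e-3, steps=5000):
    x = np.array(x0, dtype=float)
    V_vals = [x @ x]
    for _ in range(steps):
        x += dt * (-x)            # forward-Euler step of x' = -x
        V_vals.append(x @ x)
    return x, V_vals

x_final, V_vals = simulate([3.0, -2.0])
# V is monotonically decreasing and the state approaches x* = 0
assert all(b <= a for a, b in zip(V_vals, V_vals[1:]))
assert np.linalg.norm(x_final) < 0.1
```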
Definition 1 Consider a system Σ with the following state space expression,

ẋ(t) = f(x(t), u(t)),   y(t) = h(x(t)),

where x(t), u(t), y(t) ∈ R^n are the state, the input and the output of Σ respectively, and f : R^n × R^n → R^n, h : R^n → R^n. The system Σ is passive if there exists a positive semidefinite function S : R^n → R_+ := [0, ∞), called a storage function, such that

Ṡ(x) ≤ u^T y

holds ∀ x(t), u(t), y(t) ∈ R^n. For the special case that

Ṡ(x) ≤ u^T y − ψ(x)

for some positive definite function ψ, then Σ is strictly passive.

Theorem 2
The negative feedback interconnection of two passive systems is a stable system (Fig. 1).
Theorem 3 (LaSalle's Invariance Principle) Let Ω ⊂ D be a compact positively invariant set with respect to (1). Let V : D → R be a continuously differentiable function such that V̇(x) ≤ 0 in Ω. Let X be the set of all points in Ω where V̇(x) = 0. Let M be the largest invariant set in X. Then every solution starting in Ω approaches M as t → ∞.
Lemma 1 Consider a convex function f : R^n → R. Its gradient ∇f : R^n → R^n is incrementally passive, i.e., the following inequality holds:

(∇f(x) − ∇f(y))^T (x − y) ≥ 0, ∀ x, y ∈ R^n.     (4)

If f is strictly convex, inequality (4) holds strictly ∀ x ≠ y, and then ∇f is strictly incrementally passive.
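Lemma 1 can be checked numerically for any concrete convex function; the sketch below assumes the strictly convex test function f(x) = ||x||²/2 + log Σ_k e^{x_k} (our choice, not from the paper) and evaluates inequality (4) at random point pairs:

```python
import numpy as np

# Gradient of the strictly convex f(x) = ||x||^2/2 + logsumexp(x):
# the quadratic part makes the Hessian at least the identity, so the
# gradient should be strictly incrementally passive (Lemma 1).
def grad_f(x):
    e = np.exp(x - x.max())          # numerically stable softmax term
    return x + e / e.sum()

rng = np.random.default_rng(0)
for _ in range(100):
    x, y = rng.normal(size=4), rng.normal(size=4)
    gap = (grad_f(x) - grad_f(y)) @ (x - y)
    assert gap > 0                   # strict version of inequality (4)
```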

Graphs
A graph G = (V, E) is an ordered pair, where V = {v_1, . . ., v_n} is the node set and E = {e_1, . . ., e_m} is the edge set. The degree of a node v_i, denoted by |v_i|, is the total number of edges that are adjacent to this node. The total number of nodes in the graph is called the order of the graph and is denoted by |V|. We define by D_V the diagonal |V| × |V| matrix whose entries are the degrees of each node, i.e., D_V = diag{|v_1|, . . ., |v_n|}. The adjacency matrix of graph G, denoted by A, is the |V| × |V| matrix whose (i, j)-th entry equals 1 if nodes v_i and v_j are joined by an edge and 0 otherwise. The graph Laplacian is the |V| × |V| matrix

L = D_V − A.     (5)

Hypergraphs
A hypergraph H = (V, E), where V = {v_1, . . ., v_n} is the finite set of nodes and E = {E_1, . . ., E_m} is the corresponding set of hyperedges, is a generalization of a graph in the sense that each hyperedge can join any number of nodes and not just two. In the case that a hyperedge joins two nodes, it is called a "standard" hyperedge. The degree of a node v_i, denoted by |v_i|, is the total number of hyperedges adjacent to this node, while the degree of a hyperedge E_j, denoted by |E_j|, is the total number of nodes adjacent to this hyperedge. We define by D_V the diagonal |V| × |V| matrix whose entries are the degrees of each node, i.e., D_V = diag{|v_1|, . . ., |v_n|}, and by D_E the diagonal |E| × |E| matrix whose entries are the degrees of each hyperedge, i.e., D_E = diag{|E_1|, . . ., |E_m|}. For a hypergraph H, the incidence matrix, denoted by E, is the |V| × |E| matrix whose (i, j)-th entry equals 1 if node v_i belongs to hyperedge E_j and 0 otherwise. The hypergraph Laplacian (Bolla's Laplacian) Q is the |V| × |V| matrix given by the formula

Q = D_V − E D_E^{-1} E^T.     (6)

The clique of a hyperedge E_j is a complete subgraph on the hyperedge's adjacent nodes with |E_j|(|E_j| − 1)/2 pairwise interactions. We call a graph the clique expansion of a hypergraph if we substitute all the hyperedges with their respective cliques.
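The matrices defined above are straightforward to assemble; the following sketch (for an assumed toy hypergraph with hyperedges E_1 = {v_1, v_2, v_3} and E_2 = {v_3, v_4}, which is not an example from the paper) builds the incidence matrix, the degree matrices and Bolla's Laplacian and verifies its basic Laplacian properties:

```python
import numpy as np

# Incidence matrix E of a hypergraph with 4 nodes and 2 hyperedges:
# E1 = {v1, v2, v3} and E2 = {v3, v4} (E2 is a "standard" hyperedge).
E = np.array([[1, 0],
              [1, 0],
              [1, 1],
              [0, 1]], dtype=float)

D_V = np.diag(E.sum(axis=1))             # node degrees |v_i|
D_E = np.diag(E.sum(axis=0))             # hyperedge degrees |E_j|
Q = D_V - E @ np.linalg.inv(D_E) @ E.T   # Bolla's hypergraph Laplacian

assert np.allclose(Q, Q.T)                       # symmetric
assert np.all(np.linalg.eigvalsh(Q) >= -1e-12)   # positive semidefinite
assert np.allclose(Q @ np.ones(4), 0)            # constant vector in null space
```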

Hypergraph distributed optimization
We have the following distributed optimization problem for K different subsystems,

minimize Σ_{i=1}^{K} F_i(x_i, y_i)
subject to x_i = E_i z, (x_i, y_i) ∈ C_i, i = 1, . . ., K,

where
• F_i : R^{p_i + m_i} → R is the objective function of the ith subsystem and is considered to be strictly convex and continuously differentiable, with its gradient ∇F_i being locally Lipschitz.
• Vectors x_i ∈ R^{p_i}, ∀ 1 ≤ i ≤ K, denote the variables of the subsystems which we assume are coupling (i.e., their components appear in the variables of other subsystems as well).
• Vectors y_i ∈ R^{m_i}, ∀ 1 ≤ i ≤ K, denote the variables of the subsystems which we assume are local (i.e., they appear in only one subsystem).
• C_i is a feasible set for subsystem i, described by linear equalities and convex inequalities.
• Vector z ∈ R^N gives the respective common values of the N different groups of coupling variable components.
• The relationship x_i = E_i z allocates the variable components of the ith subsystem to their respective common values, where E_i is a p_i × N matrix whose (l, j)-th entry equals 1 if the lth component x_i^l of variable x_i corresponds to the common value z_j and 0 otherwise.
We consider a hypergraph H = (V, E) to represent the coupling variables of the different subsystems.
Each hyperedge E_j of H is associated with the jth component of vector z. Therefore the hyperedge set E is associated with the couplings of the different variable components.
For the hypergraph H, the node set consists of the coupling variable components x_i^j and the hyperedge set is E = {E_1, . . ., E_N}. Hence, by x_i^j we denote the coupling variable of the ith subsystem that belongs to the jth group of coupling variables, for i = 1, . . ., K and j = 1, . . ., N. The relationship x = Ez, where x = (x_1, . . ., x_K) and E is the incidence matrix of H, collects the allocation constraints of all subsystems. We let f_i(x_i) denote the optimal value of the ith subproblem over the local variable y_i, as a function of x_i. Functions f_i, i = 1, . . ., K, are strictly convex as well, since minimization of a function with respect to a subset of its variables preserves strict convexity. The problem then becomes

minimize Σ_{i=1}^{K} f_i(x_i)
subject to x = Ez.     (7)

The Lagrangian of (7) is

L(x, z, v) = Σ_{i=1}^{K} f_i(x_i) + v^T (Ez − x),

which can also be written as

L(x, z, v) = Σ_{i=1}^{K} (f_i(x_i) − v_i^T x_i) + v^T Ez,     (8)

where v_i corresponds to the ith subsystem and is the subvector of the Lagrange multiplier v associated with x = Ez. The optimality conditions are:

∇f_i(x_i*) = v_i*, i = 1, . . ., K,     (11a)
x* = Ez*,     (11b)
E^T v* = 0.     (11c)

Example 1 Figure 2 presents a coalitional hypergraph H information structure among three coalitions, where each agent from each coalition is allowed to engage in exactly one consensus relationship. Variables {x_1^1, x_2^1, x_3^1} are attached to hyperedge E_1 and describe the agents of the three coalitions engaging in the first consensus relationship. Variables {x_1^2, x_2^2} are attached to hyperedge E_2, while the variables {x_1^3, x_3^3} are attached to hyperedge E_3, and they similarly describe the engagement of the respective agents in the consensus relationships. Hypergraph H has the following characteristics: |V| = 7, |E| = 3, D_V = I and D_E = diag{3, 2, 2}.

Fig. 2 Hypergraph communication
The respective clique expansion graph of the hypergraph in Fig. 2 is given in Fig. 3. We notice that the only difference between the hypergraph and its clique expansion graph concerns hyperedge E_1 and its respective clique C_1, since the other hyperedges are standard and their cliques remain the same. For the clique C_1, three edges are required to substitute the information transmission of hyperedge E_1, and for this reason the hypergraph is more efficient than its respective clique expansion graph. The clique expansion graph has the following characteristics: it has the same seven nodes, five edges in total, and node degree two for the three agents of clique C_1 and one for the remaining agents.
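The edge-count comparison of the example can be reproduced in a few lines; the sketch below only assumes the hyperedge sizes |E_1| = 3, |E_2| = |E_3| = 2 stated above:

```python
from math import comb

# Information-transmission comparison (sketch): each hyperedge E_j is
# replaced in the clique expansion by comb(|E_j|, 2) pairwise edges.
hyperedge_sizes = [3, 2, 2]          # |E1|, |E2|, |E3| from Example 1

num_hyperedges = len(hyperedge_sizes)
num_clique_edges = sum(comb(k, 2) for k in hyperedge_sizes)

assert num_hyperedges == 3
assert num_clique_edges == 5        # C1 needs 3 edges, C2 and C3 one each
```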

Decomposition cases
In this section we introduce the continuous-time dynamical system interpretation of the primal, dual and primal–dual decomposition cases. We prove that the equilibrium points of the decomposition dynamical systems are the optimal solutions of the hypergraph distributed optimization problem, and we prove the convergence of these dynamical systems using passivity techniques and Lyapunov theory.

Fig. 3 Clique expansion graph communication
In the primal decomposition, the master problem is

minimize_z f(z) := Σ_{i=1}^{K} f_i(E_i z).     (12)

To find the gradient of f, we calculate ∇f_i, i = 1, . . ., K, which are strictly incrementally passive functions due to the strict convexity of f_i, i = 1, . . ., K. We then have ∇f(z) = E^T ∇f(Ez). By solving (12) using a gradient method, we have the following algorithm.

Dynamical System
Theorem 4 Let x* be an equilibrium point of the dynamical system (13a)-(13d); then x* is a solution to the optimization problem (7).
Proof We find the equilibrium point of the dynamical system (13a)-(13d) from ż(t) = 0, which is equal to the last of the optimality conditions (11c). The other optimality conditions are satisfied by the equations of the dynamical system and, as a result, the equilibrium point of (13a)-(13d) solves the optimization problem (7).
Theorem 5 The dynamical system (13a)-(13d) is a negative feedback interconnection of a passive and a strictly passive system.

Proof The dynamical system (13a)-(13d) can be seen as a negative feedback interconnection of a passive and a strictly passive system, as illustrated in Fig. 4. Each non-linearity ∇f_i, i = 1, . . ., K, is strictly incrementally passive and, as a result, the system of non-linearities is strictly incrementally passive with zero storage function ∀ x(t) ∈ R^n. The pre/post multiplication of a passive system by E^T and E respectively preserves passivity, since the matrix E^T E = D_E is positive definite. Regarding the integrator system, in order for it to be passive there must exist a suitable storage function S(t). An appropriate choice of storage function that satisfies the passivity inequality condition is S(t) = (1/2)||z(t) − z*||_2^2, where we have used that E^T v* = 0. As a result, the integrator system is passive and we have a negative feedback interconnection of a passive and a strictly passive system.

Theorem 6
The equilibrium point of the dynamical system (13a)-(13d) is globally asymptotically stable.

Proof
The negative feedback interconnection of a passive and a strictly passive system is a stable system, and a candidate Lyapunov function for the interconnected system is the sum of the storage functions of the individual systems. As a result, for the dynamical system (13a)-(13d) we choose as Lyapunov candidate the function V(t) = (1/2)||z(t) − z*||_2^2, where V(t) : R^N → R is radially unbounded. The Lie derivative of the Lyapunov function satisfies V̇(t) < 0 for z ≠ z*, since ∇f is strictly incrementally passive; in the derivation we made use of E^T ∇f(x*) = 0. From the above we conclude that the equilibrium point z = z* is globally asymptotically stable (G.A.S.) for the dynamics (13a)-(13d).
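For an assumed quadratic objective f(x) = ½||x − c||² (our illustrative choice, not the paper's), the primal master gradient flow ż = −E^T ∇f(Ez) can be simulated by forward Euler on the Example 1 incidence matrix; the node ordering used below is our assumption:

```python
import numpy as np

# Sketch of the primal-decomposition gradient flow z' = -E^T grad f(E z)
# for the assumed quadratic objective f(x) = 0.5 * ||x - c||^2 on the
# Example 1 hypergraph (node order x1_1, x1_2, x1_3, x2_1, x2_2, x3_1, x3_3).
E = np.zeros((7, 3))
E[[0, 1, 2], 0] = 1    # hyperedge E1
E[[3, 4], 1] = 1       # hyperedge E2
E[[5, 6], 2] = 1       # hyperedge E3

rng = np.random.default_rng(1)
c = rng.normal(size=7)

z, dt = np.zeros(3), 0.05
for _ in range(2000):
    z += dt * (-E.T @ (E @ z - c))   # Euler step of the master gradient flow

z_star = np.linalg.solve(E.T @ E, E.T @ c)   # per-hyperedge averages of c
assert np.allclose(z, z_star, atol=1e-6)     # flow reaches the minimizer
```

The limit point is the per-hyperedge average of c, which matches the consensus interpretation of the common values z.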

Dual decomposition
We now use dual decomposition to solve the optimization problem (7). To form the dual problem we first minimize over z, which results in the optimality condition E^T v = 0, and we then solve the following subproblems,

minimize f_i(x_i) − v_i^T x_i, i = 1, . . ., K.     (16)

Since the objective functions f_i(x_i) are strictly convex, the solution to each subproblem is unique. The dual of the original problem (7) is the maximization of the resulting dual function (17), and we refer to (17) as the dual decomposition master problem. Assuming that strong duality holds, the primal problem (7) can be equivalently solved by solving the dual problem. Below we present the dynamical system of the dual decomposition algorithm. A detailed derivation of the dual algorithm is given in Samar et al. (2007).

Dynamical system
We initialize v so that E T v = 0 .
where ∇f_i^{-1} is strictly incrementally passive due to the strict convexity of f_i, and we also assume that ∇f^{-1} admits an analytical expression. If we substitute (18b) into (18c) we obtain

v̇(t) = −Q ∇f^{-1}(v(t)),     (19)

where Q = I − E(E^T E)^{-1} E^T.

Lemma 2 Matrix Q = I − E(E^T E)^{-1} E^T is an orthogonal projection matrix.

Proof
We have that Q^2 = (I − E(E^T E)^{-1} E^T)(I − E(E^T E)^{-1} E^T) = I − E(E^T E)^{-1} E^T = Q, and also Q = Q^T. As a result, Q is an orthogonal projection matrix. Matrix Q is also positive semidefinite, with spectrum S(Q) ⊆ {0, 1}.
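The projection properties proved above are easy to confirm numerically on the Example 1 incidence matrix (node and hyperedge ordering assumed):

```python
import numpy as np

# Numerical check that Q = I - E (E^T E)^{-1} E^T is an orthogonal projection
# for the Example 1 incidence matrix.
E = np.zeros((7, 3))
E[[0, 1, 2], 0] = 1
E[[3, 4], 1] = 1
E[[5, 6], 2] = 1

Q = np.eye(7) - E @ np.linalg.inv(E.T @ E) @ E.T

assert np.allclose(Q @ Q, Q)     # idempotent
assert np.allclose(Q, Q.T)       # symmetric
eigs = np.linalg.eigvalsh(Q)
assert np.all(np.isclose(eigs, 0, atol=1e-9) | np.isclose(eigs, 1, atol=1e-9))
assert np.allclose(Q @ E, 0)     # range of E lies in the null space of Q
```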
Example 2 Continuing the previous example, the Laplacian matrices of the hypergraph, Q, and of the clique expansion graph, L, are block diagonal with one block per hyperedge/clique: for the hypergraph, the blocks are I_3 − (1/3)1_3 1_3^T for E_1 and I_2 − (1/2)1_2 1_2^T for each of E_2 and E_3, while for the clique expansion graph the corresponding blocks are the standard graph Laplacians of the triangle C_1 and of the two single edges.

Remark 2 An interesting observation is that the communication matrix Q used in the dual decomposition algorithm, apart from being a projection matrix, is also Bolla's Laplacian for hypergraphs (Bolla 1993), since D_V = I and D_E = E^T E.

Theorem 7 Let v* be an equilibrium point of the dynamical system (18a)-(18c); then v* is a solution to the optimization problem (7) under the initial conditions E^T v_0 = 0, v_0 ∈ T and v ∈ T, where T = R[Q] is the range of matrix Q.
Proof We find the equilibrium point of the dynamical system (18a)-(18c) from v̇(t) = 0 ⇒ −(x(t) − Ez(t)) = 0 ⇒ x(t) = Ez(t), which is the same as the optimality condition (11b). The rest of the optimality conditions are satisfied by the equations of the dynamical system at steady state and, as a result, the equilibrium solution of (18a)-(18c) solves the optimization problem (7).
Theorem 8 The dynamical system (18a)-(18c) is a negative feedback interconnection of a passive and a strictly passive system.

Proof
The dynamical system (18a)-(18c) can be seen as a negative feedback interconnection of a passive and a strictly passive system, as illustrated in Fig. 5. The system of non-linearities is ∇f^{-1}; each ∇f_i^{-1}, i = 1, . . ., K, is strictly incrementally passive and, as a result, the system of non-linearities is strictly passive. The pre/post multiplication of a passive system's input and output by Q preserves passivity, since matrix Q^2 = Q is positive semidefinite. Regarding the integrator system, it has input −(x(t) − x*) and output v(t) − v*. In order for this system to be passive there must exist a storage function; the choice S(t) = (1/2)||v(t) − v*||_2^2 satisfies the passivity inequality, where we have used that Qx* = 0. As a result, the integrator system is passive and we have a negative feedback interconnection of a passive and a strictly passive system.

Fig. 5 Dual decomposition
Theorem 9 The trajectories of the dynamical system (18a)-(18c) converge asymptotically to the equilibrium point identified in Theorem 7, given initial conditions v_0 that satisfy E^T v_0 = 0.
Proof Define T = R[Q] as the range of matrix Q and let T^⊥ be its orthogonal complement. We notice that v(t) ∈ T, since v̇ ∈ T and v_0 ∈ T, so we can write v = Qk for some k. By premultiplying v by Q we get Qv = Q^2 k = Qk = v, since Q is a projection matrix; as a result, Qv = v. The negative feedback interconnection of passive systems is a stable system, and a candidate Lyapunov function for the interconnected system is the sum of the storage functions of the individual systems. As a result, for the dynamical system (18a)-(18c) we choose as Lyapunov candidate the function V(t) = (1/2)||v(t) − v*||_2^2, where V(t) : R^p → R is radially unbounded. The Lie derivative of the candidate Lyapunov function satisfies V̇(t) ≤ 0, since ∇f^{-1} is strictly incrementally passive, Qx* = 0 and Qv = v. In order to prove convergence we make use of LaSalle's invariance principle. The set Ω = {v ∈ T : V(v) ≤ l} is a compact set for every constant l, due to the fact that V is radially unbounded. From V̇(t) = 0 we get v(t) = v*, and since x = ∇f^{-1}(v), the invariant set is the equilibrium point. As a result, the solution converges to the optimal solution of the distributed optimization problem for the given initial conditions.
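A minimal simulation of the dual flow v̇ = −Q∇f^{−1}(v) illustrates Theorems 7 and 9 on the Example 1 hypergraph; we again assume a quadratic objective f(x) = ½||x − c||², so that ∇f^{−1}(v) = v + c (orderings and objective are our assumptions):

```python
import numpy as np

# Sketch of the dual-decomposition flow v' = -Q grad f^{-1}(v) for the
# assumed quadratic objective, where grad f^{-1}(v) = v + c.
E = np.zeros((7, 3))
E[[0, 1, 2], 0] = 1
E[[3, 4], 1] = 1
E[[5, 6], 2] = 1
Q = np.eye(7) - E @ np.linalg.inv(E.T @ E) @ E.T

rng = np.random.default_rng(2)
c = rng.normal(size=7)

v, dt = np.zeros(7), 0.1             # E^T v0 = 0, so v0 lies in T = R[Q]
for _ in range(500):
    v += dt * (-Q @ (v + c))

x = v + c                            # primal recovery x = grad f^{-1}(v)
assert np.allclose(Q @ x, 0, atol=1e-6)     # consensus on each hyperedge
assert np.allclose(E.T @ v, 0, atol=1e-6)   # multipliers stay in T
```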

Primal dual algorithm
For the primal–dual algorithm we will not make use of the common values vector z; instead we directly utilize the hypergraph Laplacian matrix in the formulation of the distributed optimization problem:

minimize f(x) := Σ_{i=1}^{K} f_i(x_i)
subject to Qx = 0.     (24)

We define the Lagrangian of (24) to be

L(x, v) = f(x) − v^T Qx.

The primal–dual dynamics of problem (24) are given below:

ẋ(t) = −(∇f(x(t)) − Qv(t)),     (25a)
v̇(t) = −Qx(t).     (25b)

Theorem 10 Let (x*, v*) be an equilibrium point of the dynamical system (25a)-(25b); then (x*, v*) satisfies the optimality conditions of the optimization problem (24).
Proof We find the equilibrium point of the dynamical system from ẋ(t) = 0 ⇒ ∇f(x*) = Qv* and v̇(t) = 0 ⇒ Qx* = 0. We notice that the equilibrium point satisfies the optimality conditions (11a)-(11c) and, as a result, the equilibrium point of the dynamical system (25a)-(25b) solves the optimization problem (24).
Theorem 11 The dynamical system (25a)-(25b) is a negative feedback interconnection of a passive and a strictly passive system.

Proof The dynamical system (25a)-(25b) can be seen as a negative feedback interconnection of two passive systems, as illustrated in Fig. 6.
The first system, enclosed by the blue parallelogram, refers to the primal dynamics and is a negative feedback interconnection of two systems. The first of these is the system of non-linearities ∇f; each ∇f_i, i = 1, . . ., K, is strictly incrementally passive and, as a result, the system of non-linearities is strictly passive. The other system enclosed by the blue parallelogram is an integrator system; in order for it to be passive there must exist a storage function S_1(t) : R^p → R, and the choice S_1(t) = (1/2)||x(t) − x*||_2^2 satisfies the passivity inequality, where we have used that ∇f(x*) = Qv*. As a result, the primal dynamics integrator system is passive, and the overall system enclosed by the blue parallelogram is strictly passive.
The second system, enclosed by the red parallelogram, refers to the dual dynamics. The system is an integrator which is premultiplied and postmultiplied by Q; this pre/post multiplication preserves passivity, since matrix Q^2 = Q is positive semidefinite. For the integrator system we have u_2(t) = −Q(x(t) − x*) and y_2(t) = Q(v(t) − v*). In order for this system to be passive there must exist a storage function; the choice S_2(t) = (1/2)||v(t) − v*||_2^2 satisfies the passivity inequality, where we have used that Qx* = 0. As a result, the second system is passive. In conclusion, we have a passive and a strictly passive system interconnected in negative feedback.

Theorem 12
The equilibrium point of the dynamical system (25a)-(25b) converges asymptotically to the equilibrium point identified in Theorem 10.

Proof
The Lyapunov function of the dynamical system (25a)-(25b) is constructed as the sum of the respective storage functions. We choose as candidate Lyapunov function V(t) : R^p → R, V(t) = S_1(t) + S_2(t), whose Lie derivative satisfies V̇(t) ≤ 0, since ∇f is strictly incrementally passive. From LaSalle's invariance principle we have convergence to the largest invariant set for which V̇ = 0, i.e., x = x*. This set includes only the equilibrium point (x*, v*). As a result, the dynamical system (25a)-(25b) converges asymptotically to the solution (x*, v*). In the derivation we have used that v^T(t)Qx(t) = x^T(t)Qv(t), as well as v^T(t)Qx* = 0, x^T(t)Qv* = ∇f(x*)^T x(t) and ∇f(x*)^T x* = 0, which follow from the equilibrium properties.
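The primal–dual flow can be sketched in the same way; the quadratic objective f(x) = ½||x − c||² and the sign convention ẋ = −(∇f(x) − Qv), v̇ = −Qx are our assumptions for illustration:

```python
import numpy as np

# Sketch of the primal-dual flow for an assumed quadratic objective
# f(x) = 0.5 * ||x - c||^2 on the Example 1 hypergraph Laplacian Q.
E = np.zeros((7, 3))
E[[0, 1, 2], 0] = 1
E[[3, 4], 1] = 1
E[[5, 6], 2] = 1
Q = np.eye(7) - E @ np.linalg.inv(E.T @ E) @ E.T

rng = np.random.default_rng(3)
c = rng.normal(size=7)

x, v, dt = np.zeros(7), np.zeros(7), 0.05
for _ in range(5000):
    x_dot = -((x - c) - Q @ v)       # primal descent on the Lagrangian
    v_dot = -Q @ x                   # dual update through the Laplacian
    x, v = x + dt * x_dot, v + dt * v_dot

assert np.allclose(Q @ x, 0, atol=1e-4)      # constraint Q x* = 0
assert np.allclose(x - c, Q @ v, atol=1e-4)  # stationarity grad f(x*) = Q v*
```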
Remark 3 From the control-theoretic interpretation of the decomposition cases it is possible to design controllers that improve the time response of the system. Since the dynamical systems satisfy the passivity property, an appropriate choice would be passivity-based controllers (Chopra 2006; Ortega et al. 1997), such as PID controllers. A problem with this type of controller is that there is no standard rule for selecting its parameters (Somefun et al. 2021), making the tuning a rather ad hoc procedure. An alternative way of achieving acceleration is the use of proximal algorithm methods, as presented in Parikh and Boyd (2014).
Remark 4 In Papastaikoudis and Lestas (2023) we have also presented a Laplacian matrix formula for a directed weighted hypergraph and we have shown how this matrix is decomposed for the stability analysis setting that we studied throughout this paper.
Example 3 Continuing the previous example, we set the objective functions of the coalitions to be quadratic, with the constants c_1, c_2, c_3 randomly chosen, so that the overall objective function can be written compactly in terms of the stacked variable x. The dual decomposition algorithm (19) with a quadratic objective function takes the form v̇(t) = −L∇f^{-1}(v(t)), where L ∈ {L_H, L_C}, with L_H corresponding to the hypergraph Laplacian matrix and L_C to the clique expansion graph Laplacian matrix. The convergence of the primal variables, the dual variables and the objective function is presented in Figs. 7, 8 and 9 respectively, and the hypergraph provides a better convergence rate in all cases.
For the primal–dual algorithm, the dynamical system (25a)-(25b) takes an analogous form for quadratic objective functions, where L ∈ {L_H, L_C}, with L_H corresponding to the hypergraph Laplacian matrix and L_C to the clique expansion graph Laplacian matrix. The convergence of the primal variables for the primal–dual algorithm is presented in Fig. 10.

Remark 5 It is important to note that for continuous-time optimization algorithms the convergence rate can be made arbitrarily fast by choosing the gains in the passive components to be arbitrarily large (as passivity is preserved for arbitrary positive gains). For this reason we also present an example for the discretized dual decomposition algorithm.
The convergence of the primal variables, the dual variables and the objective function is presented in Figs. 15, 16 and 17 respectively, and the hypergraph provides a better convergence rate in all cases.

Remark 6 It is important to note that the hypergraph Laplacian matrix (6) has a greater computational complexity than the respective clique expansion graph Laplacian matrix (5). However, the tradeoff between this computational cost and the convergence rate depends on the nature of the problem: the improved convergence rate can be important enough to outweigh the cost of computing the Laplacian, or vice versa. This tradeoff has to be assessed based on the nature of the problem and the desired goals.
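The convergence-rate comparison can also be viewed spectrally; as a rough proxy (our assumption, not a claim from the paper) we compare the spread of the nonzero eigenvalues of the two Laplacians of Example 1, which governs the conditioning of the Laplacian-driven flows:

```python
import numpy as np

# Spectral comparison (sketch): the nonzero eigenvalues of the hypergraph
# Laplacian Q are all equal, while those of the clique-expansion Laplacian
# L_C are spread out, i.e., the clique expansion is worse conditioned.
E = np.zeros((7, 3))
E[[0, 1, 2], 0] = 1
E[[3, 4], 1] = 1
E[[5, 6], 2] = 1
Q = np.eye(7) - E @ np.linalg.inv(E.T @ E) @ E.T

A = np.zeros((7, 7))                     # clique-expansion adjacency
for members in ([0, 1, 2], [3, 4], [5, 6]):
    for i in members:
        for j in members:
            if i != j:
                A[i, j] = 1
L_C = np.diag(A.sum(axis=1)) - A

def nonzero_spread(M):
    eigs = np.linalg.eigvalsh(M)
    nz = eigs[eigs > 1e-9]
    return nz.max() / nz.min()

assert np.isclose(nonzero_spread(Q), 1.0)   # all nonzero eigenvalues equal 1
assert nonzero_spread(L_C) > 1.0            # eigenvalues {2, 2, 3, 3} here
```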

Conclusion
We have studied, from a dynamical system viewpoint, various distributed algorithms that solve a network optimization problem whose communication structure is an undirected, unweighted hypergraph. We proved the stability of these dynamical systems with the use of nonlinear control theory and an appropriate decomposition of the respective incidence and Laplacian matrices of the hypergraph. We also highlighted numerically the superiority of the hypergraph over its respective clique expansion graph in terms of information transmission efficiency and convergence rate for the given algorithms.

Fig. 1
Fig. 1 Negative feedback interconnection of systems

Fig. 4
Fig. 4 Primal decomposition and average primal variables over each hyperedge.

Fig. 7
Fig. 7 Primal variables convergence for dual algorithm

Fig. 8
Fig. 8 Dual variables convergence for dual algorithm

Fig. 10
Fig. 10 Primal variables convergence for primal dual algorithm

Fig. 15
Fig. 15 Primal variables convergence for discrete dual algorithm