Automatica

Volume 131, September 2021, 109738
Subgradient averaging for multi-agent optimisation with different constraint sets

https://doi.org/10.1016/j.automatica.2021.109738

Abstract

We consider a multi-agent setting with agents exchanging information over a possibly time-varying network, aiming at minimising a separable objective function subject to constraints. To achieve this objective we propose a novel subgradient averaging algorithm that allows for non-differentiable objective functions and different constraint sets per agent. Allowing different constraints per agent simultaneously with a time-varying communication network constitutes a distinctive feature of our approach, extending existing results on distributed subgradient methods. To highlight the necessity of dealing with different constraint sets within a distributed optimisation context, we analyse a problem instance where an existing algorithm does not exhibit convergent behaviour if adapted to account for different constraint sets. For our proposed iterative scheme we show asymptotic convergence of the iterates to a minimum of the underlying optimisation problem for step sizes of the form η/(k+1), η > 0. We also analyse this scheme under a step size choice of η/√(k+1), η > 0, and establish a convergence rate of O(ln k/√k) in objective value. To demonstrate the efficacy of the proposed method, we investigate a robust regression problem and an ℓ2 regression problem with regularisation.

Introduction

Distributed optimisation deals with multiple agents interacting over a network and has found numerous applications in different domains, such as wireless sensor networks (Baingana et al., 2014, Mateos and Giannakis, 2012), robotics (Martinez, Bullo, Cortes, & Frazzoli, 2007), and power systems (Bolognani, Carli, Cavraro, & Zampieri, 2015), due to its ability to parallelise computation and prevent agents from sharing information considered as private. Typically, distributed algorithms are based on an iterative process in which agents maintain some estimate about the decision vector in an optimisation context, exchange this information with neighbouring agents according to an underlying communication protocol/network, and update their estimate on the basis of the received information.

Despite the intense research activity in this area, only a few algorithms can simultaneously deal with time-varying networks, non-differentiable objective functions and account for the presence of constraints (Liang et al., 2019, Margellos et al., 2018, Nedić and Olshevsky, 2015, Xi and Khan, 2017, Zhu and Martinez, 2012), features that are often treated separately in the literature. Several of the commonly employed methods are based on a projected subgradient or a proximal step and their analysis consists of selecting the step size underlying these algorithms, establishing a convergence rate analysis, and quantifying practical convergence for (near-)real time applications.

In this paper, we study a class of optimisation problems that involves a separable objective function, while the feasible set can be decomposed as an intersection of different compact convex sets. A centralised version of this class of problems has been studied under a stochastic setting in Bianchi (2016) and Patrascu and Necoara (2018). Distributed algorithms for this class have been proposed in Johansson et al., 2008, Lee and Nedić, 2013, Lin et al., 2016, Mai and Abed, 2019, Margellos et al., 2018, Nedic and Ozdaglar, 2009, Nedic et al., 2010 and Zhu and Martinez (2012). References Johansson et al., 2008, Nedic and Ozdaglar, 2009 and Nedic et al. (2010) rely on Bertsekas and Tsitsiklis (1989) and Tsitsiklis, Bertsekas, and Athans (1986) to propose a distributed strategy based on projected sub-gradient methods. These results consist of an averaging step followed by a local sub-gradient projection update. In Margellos et al. (2018) a distributed scheme based on a proximal update is proposed, thus extending Johansson et al. (2008) and Nedic et al. (2010) to the case where different local constraint sets and an arbitrarily time-varying network are considered. The authors in Zhu and Martinez (2012) provide asymptotic convergence for a primal–dual algorithm that allows coupling between agents’ local estimates. We discuss additional related results in Section 4, after the proposed algorithm is presented and some notation introduced.

We motivate our approach by constructing an example showing that extending available algorithms to the case of different constraint sets might not exhibit a convergent behaviour for all problem instances. Hence, a direct adaptation of existing schemes is not always possible when dealing with different constraint sets. Notice also that distributed algorithms developed for the unconstrained case cannot be trivially adapted to our setting, as lifting the constraints in the objective (e.g., via characteristic functions) would violate boundedness of the subgradient, a typical requirement for such algorithms (Duchi et al., 2012, Margellos et al., 2018, Nedić and Olshevsky, 2015, Nedic et al., 2010).

The main contribution of this paper is the introduction and the characterisation of the convergence rate for a new subgradient averaging algorithm. The proposed scheme allows us to account for time-varying networks, non-differentiable objective functions and different constraint sets per agent as in Margellos et al. (2018), while achieving faster practical convergence as it is based on subgradient averaging as in Duchi et al., 2012, Johansson et al., 2008 and Mai and Abed (2019). Note that allowing simultaneously for different constraint sets per agent and time-varying communication network by means of a subgradient averaging scheme is a distinct feature of the algorithm in this paper. Preliminary results related to this paper appeared in Romao, Margellos, Notarstefano, and Papachristodoulou (2019), where several proofs have been omitted. Moreover, the construction of Section 2.2 that motivates the analysis of algorithms with different constraint sets is novel, and offers insight on the limitations of existing algorithms. We also provide detailed numerical examples, not included in the conference version.

The paper is organised as follows. In Section 2 we present the problem statement, the network communication structure, and the main assumptions adopted in this paper, followed by a numerical construction that motivates the algorithm of this paper. In Section 3 we present the proposed scheme and the main convergence results, namely, asymptotic convergence of the iterates and a convergence rate in optimal value. Section 4 provides a detailed discussion and comparison of our scheme with other results in the literature. In Section 5 we study the robust linear regression problem and ℓ2 regression with regularisation to demonstrate the main algorithmic features of our scheme and to compare our strategy against existing methods. Finally, some concluding remarks and future research directions are provided in Section 6. For ease of exposition, all proofs are deferred to the Appendix.

Notation: We denote by ℝ the set of real numbers and by ℕ the set of natural numbers (excluding zero). The symbol ℝⁿ stands for the Cartesian product ℝ × ⋯ × ℝ with n terms. A sequence of elements in ℝⁿ is denoted by (x(k))_{k∈ℕ}. For any set X ⊆ ℝⁿ, we denote its interior, relative interior and convex hull by int(X), ri(X), and conv(X), respectively. We also denote by f(X) the image of the set X under a function f. The subdifferential of f at a point x ∈ dom f is denoted by ∂f(x). For any point x ∈ ℝⁿ, ‖x‖₂ stands for the Euclidean norm of x and ‖x‖₁ for the ℓ1 norm; both reduce to |x| if x is scalar.

Section snippets

Problem set-up and network communication

Consider the optimisation problem

minimise over x:  f(x) = ∑_{i=1}^m f_i(x)
subject to:       x ∈ ∩_{i=1}^m X_i,

where x ∈ ℝⁿ is the vector of decision variables, and f_i : ℝⁿ → ℝ and X_i ⊆ ℝⁿ constitute the local objective function and constraint set, respectively, for agent i, i = 1, …, m. We suppose that each agent i possesses as private information the pair (f_i, X_i) and maintains a local estimate x_i of the common decision vector x.
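As a concrete illustration (with hypothetical data, not taken from the paper), the following Python sketch encodes a two-agent instance of this problem class: non-differentiable ℓ1 local objectives f_i and box constraint sets X_i whose intersection forms the feasible set.

```python
import numpy as np

# Hypothetical instance with m = 2 agents in R^2: local objectives
# f_i(x) = ||x - d_i||_1 (non-differentiable) and box sets X_i.
d = [np.array([0.0, 0.0]), np.array([2.0, 2.0])]
boxes = [(np.array([0.0, 0.0]), np.array([1.5, 1.5])),   # X_1
         (np.array([0.5, 0.5]), np.array([3.0, 3.0]))]   # X_2

def f(x):
    """Separable global objective f(x) = sum_i f_i(x)."""
    return sum(np.abs(x - di).sum() for di in d)

def in_intersection(x):
    """Membership test for the feasible set X_1 ∩ X_2."""
    return all((lo <= x).all() and (x <= hi).all() for lo, hi in boxes)

x = np.array([1.0, 1.0])
print(f(x), in_intersection(x))  # 4.0 True
```

Agent i knows only its own pair (f_i, X_i); neither the full sum f nor the intersection is available to any single agent, which is what necessitates a distributed scheme.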

The goal is for all agents to agree on the local variables, that is, x_i = x⋆, for all i = 1, …, m, where x⋆ is an

Proposed algorithm

The main steps of the proposed scheme are summarised in Algorithm 1. We initialise each agent's local variable with an arbitrary x_i(0) ∈ X_i, i = 1, …, m; such points are not required to belong to ∩_{i=1}^m X_i.

At iteration k, agent i receives x_j from its neighbouring agents and averages them through A(k), which captures the communication network, to obtain z_i(k). Recall that we denote the element in the j-th row and i-th column of matrix A(k) by [A(k)]_{ji}. Agent i then calculates a subgradient, g_i, of its
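To make the average-then-update pattern concrete, here is a minimal Python sketch of this iteration on a hypothetical three-agent instance (ℓ1 objectives, box sets, a fixed uniform doubly stochastic A). It mimics only the steps described above — averaging through A(k), a local subgradient, and projection onto the agent's own X_i with step size η/(k+1) — and is not a faithful implementation of Algorithm 1, whose precise subgradient-averaging update is given in the paper.

```python
import numpy as np

m = 3
A = np.full((m, m), 1.0 / m)                    # doubly stochastic A(k), fixed here
boxes = [(0.0, 1.0), (0.5, 1.5), (0.0, 2.0)]    # X_i = [lo_i, hi_i]^2, all different
targets = [np.array([2.0, 2.0]), np.array([0.0, 0.0]), np.array([1.0, 1.0])]

def subgrad(i, z):
    """A subgradient of the hypothetical local objective f_i(z) = ||z - t_i||_1."""
    return np.sign(z - targets[i])

def project(i, y):
    """Euclidean projection onto the box X_i (coordinate-wise clip)."""
    lo, hi = boxes[i]
    return np.clip(y, lo, hi)

x = [project(i, np.zeros(2)) for i in range(m)]  # x_i(0) in X_i
eta = 1.0
for k in range(500):
    z = [A[i] @ np.vstack(x) for i in range(m)]  # averaging through A(k)
    s = eta / (k + 1)                            # diminishing step size eta/(k+1)
    x = [project(i, z[i] - s * subgrad(i, z[i])) for i in range(m)]

# Iterates approach consensus near the constrained minimiser (1, 1)
print([np.round(xi, 1) for xi in x])
```

Here the minimiser of ∑_i ‖x − t_i‖₁ over the intersection [0.5, 1]² is the coordinate-wise median (1, 1) of the hypothetical targets, and all three local iterates settle near it while each remains feasible for its own set.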

Comparison with related algorithms

In this section we provide a detailed comparison of the proposed algorithm with other results in the literature. To this end, note that in Johansson et al. (2008) a similar distributed sub-gradient scheme is mentioned, but no analysis of such a scheme is presented. References Lee and Nedić (2013) and Lin et al. (2016) characterise the convergence rate of a sub-gradient algorithm under different constraint sets per agent that does not possess subgradient averaging. References Margellos et al.

Problem instance of Section 2.2 — revisited

We revisit the two-agent problem in (3), for which the iterative scheme in (2) is not guaranteed to converge, and apply this time our algorithm. Note that the optimal solution of (3) is given by

x⋆ = P_{[0.5,1]²}[(1/8) Q⁻¹ (q₁ + q₂)] = (0.5, 1)ᵀ,

where P_{[0.5,1]²}[⋅] represents the projection onto the feasible set of problem (3). Pictorially, x⋆ is shown in Fig. 1. To illustrate the convergence properties of Algorithm 1 we monitor the evolution of ∑_{i=1}^2 ‖x_i(k) − x⋆‖₂², where (x_i(k))_{k∈ℕ}, i = 1, 2, are the iterates generated
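The projection and the monitored quantity are easy to reproduce. In the Python sketch below, Q, q₁, q₂ from Section 2.2 are not reproduced in this excerpt, so v stands in as a hypothetical value of the unconstrained minimiser; only the projection onto [0.5, 1]² and the error metric follow the text.

```python
import numpy as np

# Hypothetical stand-in for (1/8) Q^{-1} (q_1 + q_2); Q, q_1, q_2 are defined
# in Section 2.2 of the paper and are not reproduced in this excerpt.
v = np.array([0.3, 1.7])

# Projection onto the box [0.5, 1]^2 is a coordinate-wise clip.
x_star = np.clip(v, 0.5, 1.0)

def convergence_metric(x_iterates):
    """sum_i ||x_i(k) - x*||_2^2, the quantity monitored for Algorithm 1."""
    return sum(np.linalg.norm(xi - x_star) ** 2 for xi in x_iterates)

print(x_star, convergence_metric([x_star, x_star]))  # metric is 0 at x*
```

Convergence of Algorithm 1 corresponds to this metric decaying to zero as k grows.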

Conclusion

In this paper we proposed a subgradient averaging algorithm for multi-agent optimisation problems involving non-differentiable objective functions and different constraint sets per agent. For this set-up we showed by means of a geometric construction that available schemes involving subgradient averaging cannot be used. For the proposed scheme we showed convergence of the algorithm iterates to some minimiser of a centralised problem counterpart. Moreover, we have also established a convergence

Acknowledgements

L. Romao is supported by the Coordination for the Improvement of Higher Education Personnel (CAPES) - Brazil. The work of K. Margellos and A. Papachristodoulou has been supported by EPSRC UK under grants EP/P03277X/1 and EP/M002454/1, respectively. Giuseppe Notarstefano is supported by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program (Grant 638992-OPT4SMART).


References (37)

  • Bolognani, S., et al. (2015). Distributed reactive power feedback control for voltage regulation and loss minimization. IEEE Transactions on Automatic Control.
  • Chen, A. I., & Ozdaglar, A. (2012). A fast distributed proximal-gradient method. In 50th Annual Allerton Conference on…
  • Duchi, J. C., et al. (2012). Dual averaging for distributed optimization: Convergence analysis and network scaling. IEEE Transactions on Automatic Control.
  • Jakovetic, D., Moura, J. M., & Xavier, J. (2012). Distributed Nesterov-like gradient algorithms. In 51st IEEE…
  • Johansson, B., et al. (2008). Subgradient methods and consensus algorithms for solving convex optimization problems. In Proceedings of the IEEE Conference on Decision and Control.
  • Lee, S., et al. (2013). Distributed random projection algorithm for convex optimization. IEEE Journal of Selected Topics in Signal Processing.
  • Margellos, K., et al. (2018). Distributed constrained optimization and consensus in uncertain networks via proximal minimization. IEEE Transactions on Automatic Control.
  • Martinez, S., et al. (2007). On synchronous robotic networks - Part I: Models, tasks, and complexity. IEEE Transactions on Automatic Control.

    Licio Romao received the B.S. degree in Electrical Engineering from the Universidade Federal de Campina Grande, Brazil, in 2014, and M.S. degree in Electrical Engineering from the University of Campinas, Brazil, in 2017. He is currently pursuing the Ph.D. degree in Control Engineering at the University of Oxford.

    He visited the University of Rome Tor Vergata in 2012, the University of California San Diego in 2015, and the University of Bologna in 2019. His research interests include optimisation algorithms and control strategies applied to large-scale, uncertain systems.

    Kostas Margellos received the Diploma in Electrical Engineering from the University of Patras, Patras, Greece, in 2008, and the Ph.D. degree in control engineering from ETH Zurich, Zurich, Switzerland, in 2012.

    He spent 2013, 2014 and 2015 as a Postdoctoral Researcher at ETH Zurich, UC Berkeley, and Politecnico di Milano, respectively. In 2016, he joined the Control Group, Department of Engineering Science, University of Oxford, Oxford, U.K., where he is currently an Associate Professor. He is also a Lecturer at Worcester College, Oxford. His research interests include optimisation and control of complex uncertain systems, with applications to generation and load side control for power networks.

    Giuseppe Notarstefano is a Professor in the Department of Electrical, Electronic, and Information Engineering G. Marconi at Alma Mater Studiorum Università di Bologna. He was Associate Professor (June ‘16–June ‘18) and previously Assistant Professor (Ricercatore, from Feb ‘07) at the Università del Salento, Lecce, Italy. He received the Laurea degree “summa cum laude” in Electronics Engineering from the Università di Pisa in 2003 and the Ph.D. degree in Automation and Operation Research from the Università di Padova in 2007. He has been a visiting scholar at the University of Stuttgart, the University of California Santa Barbara, and the University of Colorado Boulder. His research interests include distributed optimisation, cooperative control in complex networks, applied nonlinear optimal control, and trajectory optimisation and manoeuvring of aerial and car vehicles. He serves as an Associate Editor for IEEE Transactions on Automatic Control, IEEE Transactions on Control Systems Technology and IEEE Control Systems Letters. He has been part of the Conference Editorial Board of the IEEE Control Systems Society and EUCA. He is the recipient of an ERC Starting Grant.

    Antonis Papachristodoulou (Fellow, IEEE) received the M.A./M.Eng. degrees in electrical and information sciences from the University of Cambridge, Cambridge, U.K., and the Ph.D. degree in Control and Dynamical Systems (with a minor in aeronautics) from the California Institute of Technology, Pasadena, CA, USA. He is currently Professor of Engineering Science at the University of Oxford, Oxford, U.K., and a Tutorial Fellow at Worcester College, Oxford. He is also an EPSRC Fellow for Growth in Synthetic Biology and the Director of the EPSRC & BBSRC Centre for Doctoral Training in Synthetic Biology. His research interests include large-scale nonlinear systems analysis, sum of squares programming, synthetic and systems biology, networked systems, and flow control. Professor Papachristodoulou received the 2015 European Control Award for his contributions to robustness analysis and applications to networked control systems and systems biology. In the same year, he received the O. Hugo Schuck Best Paper Award.

    The material in this paper was presented at the 58th IEEE Conference on Decision and Control, December 11–13, 2019, Nice, France. This paper was recommended for publication in revised form by Associate Editor Julien M. Hendrickx under the direction of Editor Christos G. Cassandras.
