Abstract
We consider the minimization of a sum \(\sum_{i=1}^m f_i(x)\) consisting of a large number of convex component functions \(f_i\). For this problem, incremental methods consisting of gradient or subgradient iterations applied to single components have proved very effective. We propose new incremental methods, consisting of proximal iterations applied to single components, as well as combinations of gradient, subgradient, and proximal iterations. We provide a convergence and rate of convergence analysis of a variety of such methods, including some that involve randomization in the selection of components. We also discuss applications in a few contexts, including signal processing and inference/machine learning.
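To make the basic iteration concrete, the following is a minimal sketch (not taken from the paper) of an incremental proximal method for quadratic components \(f_i(x) = \tfrac{1}{2}(a_i^\top x - b_i)^2\), for which the proximal step has a closed form. The function names and the fixed stepsize are illustrative assumptions; the paper analyzes far more general components and stepsize rules.

```python
import numpy as np

def prox_quadratic(z, a, b, alpha):
    # Closed-form proximal step for f(x) = 0.5*(a@x - b)**2:
    #   argmin_x  alpha*f(x) + 0.5*||x - z||^2
    return z - alpha * (a @ z - b) / (1.0 + alpha * (a @ a)) * a

def incremental_proximal(A, b, alpha=0.5, cycles=2000):
    # Cycle through the components f_i(x) = 0.5*(A[i]@x - b[i])**2,
    # applying one proximal step per component (fixed stepsize alpha).
    x = np.zeros(A.shape[1])
    for _ in range(cycles):
        for a_i, b_i in zip(A, b):
            x = prox_quadratic(x, a_i, b_i, alpha)
    return x

# Consistent 3x2 system with exact solution (1, 2).
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
b = np.array([1.0, 2.0, 3.0])
x = incremental_proximal(A, b)
```

Each inner step touches a single component, which is the point of the incremental approach: no pass ever forms the full gradient of the sum. For these quadratic components the step reduces to an under-relaxed Kaczmarz-type update, so on a consistent system it converges even with a constant stepsize.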
Additional information
Laboratory for Information and Decision Systems Report LIDS-P-2847, August 2010 (revised March 2011); to appear in Math. Programming Journal, 2011. Research supported by AFOSR Grant FA9550-10-1-0412. Many thanks are due to Huizhen (Janey) Yu for extensive helpful discussions and suggestions.
Cite this article
Bertsekas, D.P. Incremental proximal methods for large scale convex optimization. Math. Program. 129, 163–195 (2011). https://doi.org/10.1007/s10107-011-0472-0