
Proximal Gradient/Semismooth Newton Methods for Projection onto a Polyhedron via the Duality-Gap-Active-Set Strategy

Journal of Scientific Computing

Abstract

The polyhedral projection problem arises in various applications. A crucial issue in efficiently solving its dual problem is to safely identify the zero elements, as well as the signs of the nonzero elements, of the optimal solution. In this paper, relying on the nonsmooth dual problem and active-set techniques, we first propose a Duality-Gap-Active-Set strategy (DGASS) to effectively identify the indices of the zero elements and the signs of the nonzero elements of the optimal solution. Serving as an efficient acceleration strategy, DGASS can be embedded into certain iterative methods. In particular, by applying DGASS to the proximal gradient algorithm (PGA) and the proximal semismooth Newton algorithm (PSNA), we propose the methods PGA-DGASS and PSNA-DGASS, respectively. Global convergence and the local quadratic convergence rate are discussed. We report numerical results on both synthetic and real data sets to demonstrate the high efficiency of the two DGASS-accelerated methods.
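To make the setting concrete, the sketch below runs a plain proximal (projected) gradient iteration on a dual of the projection problem and monitors a duality-gap-like quantity as a stopping test. It assumes the polyhedron is written as \(\{x : Ax\le b\}\) and uses the standard Lagrangian dual in the multipliers \(\lambda\ge 0\); the paper's own nonsmooth dual formulation and the DGASS screening rule are not reproduced here, so this is only an illustrative baseline, not the PGA-DGASS method.

    import numpy as np

    def project_polyhedron_pga(A, b, y, max_iter=5000, tol=1e-8):
        """Illustrative proximal-gradient solver for projecting y onto {x : A x <= b}.

        Minimal sketch under the assumption that the primal problem is
            min_x 0.5*||x - y||^2  subject to  A x <= b,
        whose Lagrangian dual in the multipliers lam >= 0, with x(lam) = y - A.T @ lam, is
            min_{lam >= 0} 0.5*||A.T @ lam||^2 - lam @ (A @ y - b).
        This is NOT the paper's PGA-DGASS method: the nonsmooth dual used there and the
        DGASS active-set screening are omitted; only a duality-gap-like quantity is
        monitored as a stopping test.
        """
        lam = np.zeros(A.shape[0])
        step = 1.0 / max(np.linalg.norm(A, 2) ** 2, 1e-12)  # 1 / Lipschitz constant of the dual gradient
        Ay_b = A @ y - b
        x = y.copy()
        for _ in range(max_iter):
            grad = A @ (A.T @ lam) - Ay_b             # gradient of the smooth dual objective
            lam = np.maximum(lam - step * grad, 0.0)  # prox step = projection onto lam >= 0
            x = y - A.T @ lam                         # primal point induced by lam
            residual = A @ x - b
            infeas = np.maximum(residual, 0.0).max()  # primal feasibility violation
            gap = abs(lam @ residual)                 # equals |P(x(lam)) - D(lam)| for this x
            if max(infeas, gap) <= tol:
                break
        return x, lam

A typical call would be x_proj, lam = project_polyhedron_pga(A, b, y); on this dual, the projected gradient step coincides with the proximal gradient step for the indicator function of the nonnegative orthant.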



Data Availability

Enquiries about data availability should be directed to the authors.

Notes

  1. https://www.gurobi.com/.

  2. https://coin-or.github.io/Ipopt/.

  3. To see that (2.11) and (2.4) share the same set of solutions, let \(\Lambda \) be the solution set of (1.10). By Proposition 2.1, every \(\lambda ^*\in \Lambda \) satisfies \(\lambda ^*\in {{{\mathcal {Q}}}}\), and therefore \(\Lambda \subseteq {{{\mathcal {Q}}}}\); since \(\lambda ^*\) attains the minimum of \(D(\lambda )\) over \(\lambda \in {\mathbb {R}}^m\), \(\Lambda \) is the solution set of (2.4). Analogously, by Proposition 2.2, every \(\lambda ^*\in \Lambda \) satisfies \(\lambda ^*\in {{{\mathcal {Q}}}}_{{{\mathcal {C}}}}\), which yields \(\Lambda \subseteq \mathcal{Q}_{{{\mathcal {C}}}}\), i.e., \(\Lambda \) is the solution set of (2.11). Thus, \(\Lambda \) is the solution set of (1.10), (2.4), and (2.11).

  4. In the case that \({\tilde{D}}(\lambda ^1)<{\tilde{D}}(\lambda ^0)\), from the proof of Theorem 4.1, we know that \({\tilde{D}}(\lambda ^k)\le {\tilde{D}}_R^k\) for all k, and the sequence \(\{{\tilde{D}}_R^k\}\) is monotonically decreasing. Thus, for all sufficiently large k, \({\tilde{D}}(\lambda ^k+d_N^k)\) does not exceed \({\tilde{D}}(\lambda ^0)\). The trivial case \({\tilde{D}}(\lambda ^1)={\tilde{D}}(\lambda ^0)\) leads to \(\Delta _N^0=0\) due to \({\tilde{D}}(\lambda ^0)={\tilde{D}}_R^0\) and the sufficient descent condition (4.9); using the positive definiteness of \(H^0\) and the definition of \(\Delta _N^0\), we have \(d_N^0=0\), and by (4.8), the initial point \(\lambda ^0\) is optimal.

  5. https://www.gurobi.com/.

  6. https://coin-or.github.io/Ipopt/.

  7. The functions randi and rand are used to generate uniformly distributed pseudo-random integers and uniformly distributed pseudo-random numbers, respectively, and min(v) and max(v) compute the minimum and maximum values of v, respectively. Also, to ensure that \(y=0\) is a feasible solution for (1.1), the generated random A and y satisfy \(\texttt {min(A*y)<0}\) and \(\texttt {max(A*y)>0}\) (a NumPy analogue of this generation is sketched after these notes).

  8. https://www.cuter.rl.ac.uk/Problems/netlib.shtml.

  9. http://miplib.zib.de/.
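The following NumPy sketch mirrors the test-data generation described in Note 7. The matrix dimensions, entry ranges, and the rejection loop are assumptions made for illustration; the note itself specifies only the use of randi/rand and the sign condition on A*y.

    import numpy as np

    def generate_instance(m, n, seed=0, max_tries=1000):
        """NumPy analogue of the randi/rand generation in Note 7 (illustrative)."""
        rng = np.random.default_rng(seed)
        for _ in range(max_tries):
            A = rng.integers(-10, 11, size=(m, n)).astype(float)  # randi-style integer entries (range assumed)
            y = rng.random(n) - 0.5                                # rand-style entries, shifted to allow both signs
            Ay = A @ y
            if Ay.min() < 0 and Ay.max() > 0:                      # sign condition stated in Note 7
                return A, y
        raise RuntimeError("no instance satisfying the sign condition was found")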



Acknowledgements

The authors are grateful to the associate editor and the two anonymous referees for their valuable comments and suggestions.

Author information


Corresponding author

Correspondence to Chungen Shen.

Ethics declarations

Conflict of interest

The authors have not disclosed any competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The work of the first author was supported in part by the National Natural Science Foundation of China NSFC-72025201.

The work of the third author was supported in part by the National Natural Science Foundation of China NSFC-12071332.

The work of the last author was supported by the National Natural Science Foundation of China NSFC-11971118.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Wang, Y., Shen, C., Zhang, LH. et al. Proximal Gradient/Semismooth Newton Methods for Projection onto a Polyhedron via the Duality-Gap-Active-Set Strategy. J Sci Comput 97, 3 (2023). https://doi.org/10.1007/s10915-023-02302-6


  • DOI: https://doi.org/10.1007/s10915-023-02302-6
