
Proximal bundle methods for nonsmooth DC programming

Journal of Global Optimization

Abstract

We consider the problem of minimizing the difference of two nonsmooth convex functions over a simple convex set. To deal with this class of nonsmooth and nonconvex optimization problems, we propose new proximal bundle algorithms and show that the given approaches generate subsequences of iterates that converge to critical points. Trial points are obtained by solving strictly convex master programs defined by the sum of a convex cutting-plane model and a freely-chosen Bregman function. In the unconstrained case with the Bregman function being the Euclidean distance, new iterates are solutions of strictly convex quadratic programs of limited sizes. Stronger convergence results (d-stationarity) can be achieved depending on (a) further assumptions on the second DC component of the objective function and (b) solving possibly more than one master program at certain iterations. The given approaches are validated by encouraging numerical results on some academic DC programs.
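The mechanics sketched in the abstract (a convex cutting-plane model of \(f_1\), a linearization of \(f_2\) at the stability center, and a Bregman-regularized master program) can be illustrated numerically. The code below is a minimal one-dimensional sketch, not the paper's Algorithm 1: it assumes the Euclidean Bregman function \(D(x,y)=(x-y)^2/2\), a fixed prox parameter \(\mu \), a bracketed golden-section solver for the one-dimensional master program, and the illustrative DC decomposition \(f_1(x)=x^2\), \(f_2(x)=|x|\); all parameter values are choices made for the demo.

```python
# Minimal 1-D proximal bundle sketch for f = f1 - f2 (illustrative only).
# Here f1(x) = x^2 and f2(x) = |x|, so f(x) = x^2 - |x| is minimized at the
# critical points x = +1/2 and x = -1/2.

def f1(x): return x * x
def g1(x): return 2.0 * x                    # gradient of f1
def f2(x): return abs(x)
def g2(x): return 1.0 if x >= 0 else -1.0    # a subgradient of f2
def D(x, y): return 0.5 * (x - y) ** 2       # Euclidean Bregman distance

def golden_min(F, a, b, iters=100):
    """Golden-section search: minimizer of a unimodal F on [a, b]."""
    phi = (5 ** 0.5 - 1) / 2
    c, d = b - phi * (b - a), a + phi * (b - a)
    Fc, Fd = F(c), F(d)
    for _ in range(iters):
        if Fc < Fd:
            b, d, Fd = d, c, Fc
            c = b - phi * (b - a)
            Fc = F(c)
        else:
            a, c, Fc = c, d, Fd
            d = a + phi * (b - a)
            Fd = F(d)
    return 0.5 * (a + b)

mu, kappa = 1.0, 0.1        # prox and descent parameters (illustrative)
x_hat = 2.0                 # initial stability center
g2_hat = g2(x_hat)          # subgradient of f2, frozen between serious steps
cuts = [(f1(x_hat), g1(x_hat), x_hat)]    # bundle of linearizations of f1

for _ in range(50):
    def model(x):           # convex cutting-plane model of f1
        return max(fv + gv * (x - xv) for fv, gv, xv in cuts)
    def master(x):          # strictly convex master program
        return model(x) - g2_hat * (x - x_hat) + mu * D(x, x_hat)
    trial = golden_min(master, -10.0, 10.0)
    if abs(trial - x_hat) < 1e-6:
        break               # stability center is (approximately) critical
    cuts.append((f1(trial), g1(trial), trial))
    # descent test: accept the trial point as the new stability center
    # if f1 decreased enough along the linearization of f2
    if f1(trial) <= f1(x_hat) + g2_hat * (trial - x_hat) - kappa * mu * D(trial, x_hat):
        x_hat, g2_hat = trial, g2(trial)   # serious step
    # otherwise: null step -- the new cut alone enriches the model
```

For this data the run takes a handful of null and serious steps and stops at the critical point \(x=1/2\); replacing \(D\) changes only the regularization of the master program.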

Fig. 1
Fig. 2

Notes

  1. We have used \(\rho =10^4\times n\) in our numerical experiments.

  2. Probability computed using Matlab's mvncdf function.

References

  1. Astorino, A., Miglionico, G.: Optimizing sensor cover energy via DC programming. Optim. Lett. 10(2), 355–368 (2016)

  2. Bagirov, A.M.: A method for minimization of quasidifferentiable functions. Optim. Methods Softw. 17(1), 31–60 (2002)

  3. Bagirov, A.M., Yearwood, J.: A new nonsmooth optimization algorithm for minimum sum-of-squares clustering problems. Eur. J. Oper. Res. 170(2), 578–596 (2006)

  4. Ben-Tal, A., Nemirovski, A.: Non-Euclidean restricted memory level method for large-scale convex optimization. Math. Program. 102, 407–456 (2005)

  5. Bonnans, J., Gilbert, J., Lemaréchal, C., Sagastizábal, C.: Numerical Optimization: Theoretical and Practical Aspects, 2nd edn. Springer, Berlin (2006)

  6. Clarke, F.H.: Optimization and Nonsmooth Analysis. Soc. Ind. Appl. Math. (1990). https://doi.org/10.1137/1.9781611971309

  7. Cruz Neto, J.X., Oliveira, P.R., Soubeyran, A., Souza, J.C.O.: A generalized proximal linearized algorithm for DC functions with application to the optimal size of the firm problem. Ann. Oper. Res. (2018). https://doi.org/10.1007/s10479-018-3104-8

  8. de Oliveira, W., Solodov, M.: A doubly stabilized bundle method for nonsmooth convex optimization. Math. Program. 156(1), 125–159 (2016)

  9. de Oliveira, W.: Target radius methods for nonsmooth convex optimization. Oper. Res. Lett. 45(6), 659–664 (2017)

  10. de Oliveira, W., Sagastizábal, C., Lemaréchal, C.: Convex proximal bundle methods in depth: a unified analysis for inexact oracles. Math. Program. Ser. B 148, 241–277 (2014)

  11. Dolan, E.D., Moré, J.J.: Benchmarking optimization software with performance profiles. Math. Program. 91, 201–213 (2002)

  12. Frangioni, A.: Generalized bundle methods. SIAM J. Optim. 13(1), 117–156 (2002)

  13. Frangioni, A., Gorgone, E.: Generalized bundle methods for sum-functions with “easy” components: applications to multicommodity network design. Math. Program. 145(1), 133–161 (2014)

  14. Fuduli, A., Gaudioso, M., Giallombardo, G.: A DC piecewise affine model and a bundling technique in nonconvex nonsmooth minimization. Optim. Methods Softw. 19(1), 89–102 (2004)

  15. Gaudioso, M., Giallombardo, G., Miglionico, G.: Minimizing piecewise-concave functions over polyhedra. Math. Oper. Res. 43(2), 580–597 (2018)

  16. Gaudioso, M., Giallombardo, G., Miglionico, G., Bagirov, A.M.: Minimizing nonsmooth DC functions via successive DC piecewise-affine approximations. J. Glob. Optim. 71(1), 37–55 (2018)

  17. Hare, W., Sagastizábal, C.: A redistributed proximal bundle method for nonconvex optimization. SIAM J. Optim. 20(5), 2442–2473 (2010)

  18. Hare, W., Sagastizábal, C., Solodov, M.: A proximal bundle method for nonsmooth nonconvex functions with inexact information. Comput. Optim. Appl. 63(1), 1–28 (2016)

  19. Henrion, R.: A Critical Note on Empirical (Sample Average, Monte Carlo) Approximation of Solutions to Chance Constrained Programs (Chapter 3 in [24]). Springer, Berlin (2013)

  20. Hiriart-Urruty, J.B., Lemaréchal, C.: Convex Analysis and Minimization Algorithms I. Grundlehren der mathematischen Wissenschaften, vol. 305, 2nd edn. Springer, Berlin (1996)

  21. Hiriart-Urruty, J.B., Lemaréchal, C.: Convex Analysis and Minimization Algorithms II. Grundlehren der mathematischen Wissenschaften, vol. 306, 2nd edn. Springer, Berlin (1996)

  22. Hiriart-Urruty, J.B.: Generalized Differentiability/Duality and Optimization for Problems Dealing with Differences of Convex Functions, pp. 37–70. Springer, Berlin, Heidelberg (1985)

  23. Holmberg, K., Tuy, H.: A production–transportation problem with stochastic demand and concave production costs. Math. Program. 85(1), 157–179 (1999)

  24. Hömberg, D., Tröltzsch, F. (eds.): System Modeling and Optimization. IFIP Advances in Information and Communication, vol. 391. Springer, Berlin (2013)

  25. Hong, L.J., Yang, Y., Zhang, L.: Sequential convex approximations to joint chance constrained programs: A Monte Carlo approach. Oper. Res. 59(3), 617–630 (2011)

  26. Joki, K., Bagirov, A., Karmitsa, N., Mäkelä, M.M., Taheri, S.: Double bundle method for finding Clarke stationary points in nonsmooth DC programming. SIAM J. Optim. 28(2), 1892–1919 (2018)

  27. Joki, K., Bagirov, A.M., Karmitsa, N., Mäkelä, M.M.: A proximal bundle method for nonsmooth DC optimization utilizing nonconvex cutting planes. J. Glob. Optim. 68(3), 501–535 (2017)

  28. Kelley, J.E.: The cutting-plane method for solving convex programs. J. Soc. Ind. Appl. Math. 8(4), 703–712 (1960)

  29. Khalaf, W., Astorino, A., d’Alessandro, P., Gaudioso, M.: A DC optimization-based clustering technique for edge detection. Optim. Lett. 11(3), 627–640 (2017)

  30. Kiwiel, K.C.: A proximal bundle method with approximate subgradient linearizations. SIAM J. Optim. 16(4), 1007–1023 (2006)

  31. Le Thi, H.A., Tao, P.D.: DC programming in communication systems: challenging problems and methods. Vietnam J. Comput. Sci. 1(1), 15–28 (2014)

  32. Le Thi, H.A., Pham Dinh, T., Ngai, H.V.: Exact penalty and error bounds in DC programming. J. Glob. Optim. 52(3), 509–535 (2012)

  33. Lemaréchal, C.: An algorithm for minimizing convex functions. Inf. Process. 1, 552–556 (1974)

  34. Lemaréchal, C., Nemirovskii, A., Nesterov, Y.: New variants of bundle methods. Math. Program. 69(1), 111–147 (1995)

  35. Lewis, A.S., Overton, M.L.: Nonsmooth optimization via quasi-Newton methods. Math. Program. 141(1), 135–163 (2013)

  36. Mäkelä, M.M., Miettinen, M., Lukšan, L., Vlček, J.: Comparing nonsmooth nonconvex bundle methods in solving hemivariational inequalities. J. Glob. Optim. 14(2), 117–135 (1999)

  37. Noll, D., Apkarian, P.: Spectral bundle methods for non-convex maximum eigenvalue functions: first-order methods. Math. Program. 104(2), 701–727 (2005)

  38. Pang, J.S., Razaviyayn, M., Alvarado, A.: Computing B-stationary points of nonsmooth DC programs. Math. Oper. Res. 42(1), 95–118 (2017)

  39. Prékopa, A.: Stochastic Programming. Kluwer, Dordrecht (1995)

  40. Rockafellar, R.: Convex Analysis, 1st edn. Princeton University Press, Princeton (1970)

  41. Souza, J.C.O., Oliveira, P.R., Soubeyran, A.: Global convergence of a proximal linearized algorithm for difference of convex functions. Optim. Lett. 10(7), 1529–1539 (2016)

  42. Tao, P.D., Le Thi, H.A.: Convex analysis approach to DC programming: theory, algorithms and applications. Acta Math. Vietnam. 22(1), 289–355 (1997)

  43. Tuy, H.: Convex Analysis and Global Optimization. Springer Optimization and Its Applications, 2nd edn. Springer, Berlin (2016)

  44. van Ackooij, W.: Eventual convexity of chance constrained feasible sets. Optimization 64(5), 1263–1284 (2015)

  45. van Ackooij, W., Cruz, J.B., de Oliveira, W.: A strongly convergent proximal bundle method for convex minimization in Hilbert spaces. Optimization 65(1), 145–167 (2016)

  46. van Ackooij, W., Henrion, R.: Gradient formulae for nonlinear probabilistic constraints with Gaussian and Gaussian-like distributions. SIAM J. Optim. 24(4), 1864–1889 (2014)

  47. van Ackooij, W., de Oliveira, W.: Level bundle methods for constrained convex optimization with various oracles. Comput. Optim. Appl. 57(3), 555–597 (2014)

  48. van Ackooij, W., Sagastizábal, C.: Constrained bundle methods for upper inexact oracles with application to joint chance constrained energy problems. SIAM J. Optim. 24(2), 733–765 (2014)

Acknowledgements

The author is grateful to the reviewers for their remarks and constructive suggestions that considerably improved the original version of this article.

Author information


Correspondence to Welington de Oliveira.


A Appendix

A.1 A self-contained analysis of the sequence of infinitely many null steps generated after a last serious step

We assume that after the \({\hat{\ell }}\mathrm{th}\)-stability center \(x^{k({\hat{\ell }})} = {\hat{x}}\) only null steps are performed, i.e.,

$$\begin{aligned} f_1(x^{k+1})> f_1({\hat{x}}) +\langle {\hat{g}}_2,x^{k+1}-{\hat{x}}\rangle - \kappa {\underline{\mu }}D(x^{k+1},{\hat{x}})\,, \end{aligned}$$

where \({\hat{g}}_2 = g_2^{k({\hat{\ell }})}\) is the last subgradient computed for \(f_2\). Notice that in this case the sequence \(\{\mu _k\}_{k\ge k({\hat{\ell }})}\) is nondecreasing. In what follows we present some auxiliary results to prove Lemma 4. We start by defining the following two useful functions:

$$\begin{aligned}&F^k(x):={\check{f}}_1^k(x) -\langle {\hat{g}}_2,x-{\hat{x}}\rangle +\mu _k\,D(x,{\hat{x}}) \end{aligned}$$
(31)
$$\begin{aligned}&F^{-k}(x) ={\check{f}}_1^k(x^{k+1})+ \langle p^{k+1}+s^{k+1},x-x^{k+1}\rangle -\langle {\hat{g}}_2,x-{\hat{x}}\rangle + \mu _k\,D(x,{\hat{x}}). \end{aligned}$$
(32)

Notice that \(F^{-k}\) is twice differentiable (because the function \(\omega \) defining \(D\) is):

$$\begin{aligned} \nabla F^{-k}(x) = p^{k+1}+ s^{k+1}-{\hat{g}}_2 + \mu _k\left( \nabla \omega (x)-\nabla \omega ({\hat{x}})\right) \; \hbox { and }\; \nabla ^2 F^{-k}(x)=\mu _k\, \nabla ^2 \omega (x)\,, \end{aligned}$$

where \(\nabla ^2 \omega (x) \in {\mathbb {R}}^{n\times n}\) is the Hessian of the function \(\omega \). Since \(\omega \) is strongly convex, \(\nabla ^2 \omega (x)\) is positive definite for all \(x \in {\mathbb {R}}^n\). It follows from (10) that \(\nabla F^{-k}(x^{k+1})=0\), i.e., the point \(x^{k+1}\) is the unique minimizer of \( F^{-k}(x)\) over \({\mathbb {R}}^n\). The Taylor and mean value theorems [5, Sect. 13] give, for some \(z = \lambda x^{k+1}+(1-\lambda ){\hat{x}}\) and \(\lambda \in [0,1]\),

$$\begin{aligned} F^{-k}(x)= & {} F^{-k}(x^{k+1}) + \langle \nabla F^{-k}(x^{k+1}),x-x^{k+1}\rangle + \frac{1}{2}\langle \nabla ^2F^{-k}(z) (x-x^{k+1}),x-x^{k+1}\rangle \nonumber \\= & {} F^{-k}(x^{k+1}) + \langle 0,x-x^{k+1}\rangle + \frac{\mu _k}{2} \langle \nabla ^2 \omega (z) (x-x^{k+1}),x-x^{k+1}\rangle \nonumber \\\ge & {} F^{-k}(x^{k+1}) + \frac{\mu _k\,\rho }{2}\Arrowvert x-x^{k+1} \Arrowvert ^2_p\nonumber \\= & {} F^{k}(x^{k+1}) + \frac{\mu _k\,\rho }{2}\Arrowvert x-x^{k+1} \Arrowvert ^2_p\,, \end{aligned}$$
(33)

where the inequality is due to the assumption that \(\omega \) is strongly convex with parameter \(\rho >0\) and norm \(\Arrowvert \cdot \Arrowvert _p\) in (5) (\(\langle \nabla ^2 \omega (z) (x-x^{k+1}),x-x^{k+1}\rangle \ge \rho \Arrowvert x-x^{k+1} \Arrowvert _p^2\)), and the last equality follows from (31) and (32). The above development is crucial to show the following lemma, which is essentially a reformulation of [10, Lemma 6.3] to our setting.
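In the Euclidean case this chain can be checked numerically. The sketch below is a toy verification under stated assumptions, not part of the paper: it takes \(\omega (x)=x^2/2\) (so \(\rho =1\) and \(D(x,y)=(x-y)^2/2\)), picks arbitrary points and an arbitrary model value, and builds \(F^{-k}\) up to an additive constant with its linear term chosen so that \(\nabla F^{-k}(x^{k+1})=0\), as (10) requires. Since \(F^{-k}\) is then an exact quadratic, (33) holds with equality for this \(\omega \).

```python
# Toy check of inequality (33) for the Euclidean Bregman function
# omega(x) = x^2/2, for which rho = 1 and D(x, y) = (x - y)^2 / 2.
mu, rho = 2.0, 1.0
xhat, xk1 = 0.3, -1.2        # stability center and trial point (arbitrary)

def D(x, y):
    return 0.5 * (x - y) ** 2

# Optimality condition (10) forces the aggregate linear term
# p^{k+1} + s^{k+1} - ghat2 to equal -mu * (omega'(xk1) - omega'(xhat)):
agg = -mu * (xk1 - xhat)
c0 = 1.7                     # model value \check f_1^k(x^{k+1}); arbitrary

def F_minus_k(x):            # F^{-k} of (32), up to an additive constant
    return c0 + agg * (x - xk1) + mu * D(x, xhat)

def lower_bound(x):          # right-hand side of (33)
    return F_minus_k(xk1) + 0.5 * mu * rho * (x - xk1) ** 2

# For this omega the gap F^{-k}(x) - lower_bound(x) vanishes identically:
worst_gap = max(abs(F_minus_k(i / 10 - 5) - lower_bound(i / 10 - 5))
                for i in range(101))
```

For a general strongly convex \(\omega \) the gap would be nonnegative rather than zero; the Euclidean choice is the tight case.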

Lemma 7

Let \({\hat{x}}= x^{k({\hat{\ell }})}\) be the last stability center generated by Algorithm 1 during the iteration \(k({\hat{\ell }})\) after which only null steps are performed. Assume also that for \(k\ge k({\hat{\ell }})\) the function \(F^k\) is the model given in (31) and \(x^{k+1}\) is an iterate obtained from a null step. If \(\{\mu _k\}_{k\ge k({\hat{\ell }})}\) is nondecreasing, then

  1. (i)

    the sequence \(\{ F^{k}(x^{k+1})\}_{{k\ge k({\hat{\ell }})}}\) is nondecreasing and satisfies

    $$\begin{aligned} F^{k}(x^{k+1})+\frac{\mu _k \rho }{2}\Arrowvert x^{k+2}-x^{k+1} \Arrowvert ^2_p \le F^{{k+1}}(x^{k+2}) \;{\hbox { for all }\; k\ge k({\hat{\ell }})}; \end{aligned}$$
  2. (ii)

    the sequence \(\{F^{k}(x^{k+1})\}_{{k\ge k(\hat{\ell })}}\) is bounded from above:

    $$\begin{aligned} F^{k}(x^{k+1})+\frac{\mu _k \rho }{2}\Arrowvert {\hat{x}}-x^{k+1} \Arrowvert ^2_p \le f_1({\hat{x}}) \;{\hbox { for all }\; k\ge k({\hat{\ell }})}; \end{aligned}$$
  3. (iii)

    the following inequality holds true for all \(k\ge k({\hat{\ell }})\)

    $$\begin{aligned} {\check{f}}_1^{k}(x^{k+1})-{\check{f}}_1^{k-1}(x^k)\le & {} F^{k}(x^{k+1}) -F^{k-1}(x^k) + \mu _{k-1}[D(x^k,{\hat{x}}) \\&- D(x^{k+1},{\hat{x}})] -\langle {\hat{g}}_2,x^k-x^{k+1}\rangle . \end{aligned}$$

Proof

Algorithm 1 ensures that the aggregate index \(-k\) enters the bundle in every null step. In particular \(-k \in {\mathcal {B}}_1^{{k+1}}\) for all \(k\ge k({\hat{\ell }})\). Then for all \(x \in X\) and all \(k\ge k({\hat{\ell }})\) we have that

$$\begin{aligned} F^{-k}(x)= & {} [{\check{f}}_1^k(x^{k+1})+ \langle p^{k+1},x-x^{k+1}\rangle ]+\langle s^{k+1},x-x^{k+1}\rangle -\langle {\hat{g}}_2,x-{\hat{x}}\rangle + \mu _k\,D(x,{\hat{x}})\\= & {} \bar{f}_1^{-k}(x)+\langle s^{k+1},x-x^{k+1}\rangle -\langle {\hat{g}}_2,x-{\hat{x}}\rangle + \mu _k\,D(x,{\hat{x}})\\\le & {} \bar{f}_1^{-k}(x) -\langle {\hat{g}}_2,x-{\hat{x}}\rangle + \mu _k\,D(x,{\hat{x}})\\\le & {} {\check{f}}_1^{k+1}(x) -\langle {\hat{g}}_2,x-{\hat{x}}\rangle + \mu _k\,D(x,{\hat{x}})\\\le & {} {\check{f}}_1^{k+1}(x) -\langle {\hat{g}}_2,x-{\hat{x}}\rangle + \mu _{k+1}\,D(x,{\hat{x}}) = F^{k+1}(x)\,, \end{aligned}$$

where the first inequality is due to \(s^{k+1}\in N_X(x^{k+1})\), the second follows from inequality \(\bar{f}_1^{-k}{(\cdot )}\le {\check{f}}_1^{k+1}{(\cdot )}\) ensured because \(-k \in {\mathcal {B}}_1^{{k+1}}\), and the last inequality follows from the assumption \(\mu _{k+1}\ge \mu _{k}\). Set \(x=x^{k+2}\) in (33) to obtain (i), and \(x={\hat{x}}\) to obtain \(F^{k}(x^{k+1}) + \frac{\mu _k\rho }{2}\Arrowvert {\hat{x}}-x^{k+1} \Arrowvert ^2_p\le F^{-k}({\hat{x}}) \le F^{k+1}({\hat{x}})= {\check{f}}_1^{k+1}({\hat{x}}) \le f_1({\hat{x}})\). To show (iii), note that for all \(k\ge k({\hat{\ell }})\)

$$\begin{aligned}&F^{k}(x^{k+1})-{\check{f}}_1^{k}(x^{k+1})\\&\quad =-\langle {\hat{g}}_2,x^{k+1}-{\hat{x}}\rangle +\mu _k\, D(x^{k+1},{\hat{x}})\\&\quad \ge -\langle {\hat{g}}_2,x^{k+1}-{\hat{x}}\rangle +\mu _{k-1}\, D(x^{k+1},{\hat{x}})\\&\quad =-\langle {\hat{g}}_2,x^{k+1}-{\hat{x}}\rangle +\mu _{k-1}\, D(x^k,{\hat{x}}) +\mu _{k-1}[ D(x^{k+1},{\hat{x}})- D(x^k,{\hat{x}})]\\&\quad =-\langle {\hat{g}}_2,x^k-{\hat{x}}\rangle +\mu _{k-1}\, D(x^k,{\hat{x}}) +\mu _{k-1}[ D(x^{k+1},{\hat{x}})- D(x^k,{\hat{x}})]- \langle {\hat{g}}_2,x^{k+1}-x^k\rangle \\&\quad = F^{k-1}(x^k)-{\check{f}}_1^{k-1}(x^k) +\mu _{k-1}[ D(x^{k+1},{\hat{x}})- D(x^k,{\hat{x}})]- \langle {\hat{g}}_2,x^{k+1}-x^k\rangle , \end{aligned}$$

where the inequality is due to \(\mu _k\ge \mu _{k-1}\) and the last equality follows from (31) (with k therein replaced with \(k-1\)). The result thus follows. \(\square \)

Given the above properties, the following lemma shows that the cutting-plane model \({\check{f}}_1^k\) asymptotically approximates the DC component \(f_1\) on the sequence of null iterates.

Lemma 8

Under the assumptions of Lemma 7, Algorithm 1 ensures that \(\{x^k\}_{{k}}\) is a bounded sequence and

$$\begin{aligned} \lim _{k\rightarrow \infty }[f_1(x^k)-{\check{f}}_1^{k-1}(x^k)]=0. \end{aligned}$$

Proof

Lemma 7(i) ensures that the sequence \(\{F^k(x^{k+1})\}_{{k\ge k({\hat{\ell }})}}\) is nondecreasing. Thus, there exists a constant \(C>0\) such that \(F^k(x^{k+1}) \ge -C\) for all \(k\ge k({\hat{\ell }})\). Using Lemma 7(ii) we conclude that

$$\begin{aligned} \frac{\mu _k\rho }{2}\Arrowvert {\hat{x}}-x^{k+1} \Arrowvert _p^2 \le f_1({\hat{x}})-F^{k}(x^{k+1}) \le f_1({\hat{x}})+C \quad {\forall \; k\ge k({\hat{\ell }})}\,, \end{aligned}$$
(34)

showing that the sequence \(\{\Arrowvert {\hat{x}}-x^k \Arrowvert _p\}_{{k\ge k({\hat{\ell }})}}\) is bounded because \(\{\mu _k\}_{{k\ge k({\hat{\ell }})}}\) is nondecreasing. Accordingly, \(\{x^k\}_{{k}}\) is also bounded. It follows from Lemma 7(ii) that the sequence \(\{F^k(x^{k+1})\}_{{k\ge k({\hat{\ell }})}}\) is bounded from above by \(f_1({\hat{x}})\). Lemma 7(i) shows that \(\{F^k(x^{k+1})\}_{{k\ge k({\hat{\ell }})}}\) is nondecreasing and hence

$$\begin{aligned} \lim _{k\rightarrow \infty } [ F^{k+1}(x^{k+2})- F^{k}(x^{k+1})] = 0\; \hbox { and }\; \lim _{k\rightarrow \infty } \Arrowvert x^{k+2} -x^{k+1} \Arrowvert ^2_p=0\,, \end{aligned}$$
(35)

where the second limit follows from Lemma 7(i) and the assumption that \(\{\mu _k\}_{{k\ge k({\hat{\ell }})}}\) is nondecreasing. Note that, by (5),

$$\begin{aligned} D(x^k,{\hat{x}})-D(x^{k+1},{\hat{x}})&= [\omega (x^k)-\omega ({\hat{x}})-\langle \nabla \omega ({\hat{x}}),x^k-{\hat{x}}\rangle ]\\&\quad -[\omega (x^{k+1})-\omega ({\hat{x}})-\langle \nabla \omega ({\hat{x}}),x^{k+1}-{\hat{x}}\rangle ]\\&= \omega (x^k)-\omega (x^{k+1})-\langle \nabla \omega ({\hat{x}}),x^k-x^{k+1}\rangle . \end{aligned}$$

Since \(\{\mu _k\}_{{k}}\) is a bounded sequence and \(\omega \) is a continuous function, we conclude that

$$\begin{aligned} \lim _{k\rightarrow \infty } \mu _{k-1}[D(x^k,{\hat{x}})-D(x^{k+1},{\hat{x}})] = 0. \end{aligned}$$
(36)

The inclusion \(k \in {\mathcal {B}}_1^k\) implies \( f_1(x^k)+\langle g_1^k,x-x^k\rangle =\bar{f}_1^k(x)\le {\check{f}}_1^k(x) \; \hbox { for all }x\in {\mathbb {R}}^n. \) Setting \(x=x^{k+1}\) in this inequality yields \( f_1(x^k)=\bar{f}_1^k(x^{k+1}) + \langle g_1^k,x^k-x^{k+1}\rangle \). Therefore, for \(k>k({\hat{\ell }})\)

$$\begin{aligned} f_1(x^k)-{\check{f}}_1^{k-1}(x^k)&=\bar{f}_1^k(x^{k+1}) + \langle g_1^k,x^k-x^{k+1}\rangle -{\check{f}}_1^{k-1}(x^k)\\&\le {\check{f}}_1^k(x^{k+1}) + \langle g_1^k,x^k-x^{k+1}\rangle -{\check{f}}_1^{k-1}(x^k)\\&\le F^{k}(x^{k+1}) -F^{k-1}(x^k) + \mu _{k-1}[D(x^k,{\hat{x}}) \\&\quad - D(x^{k+1},{\hat{x}})] + \langle g_1^k- {\hat{g}}_2,x^k-x^{k+1}\rangle \,, \end{aligned}$$

where the last inequality is due to Lemma 7(iii). Taking the limit as \(k\rightarrow \infty \) in the above inequalities, using (35) and (36) and recalling that \(\{x^k\}_{{k}}\) and \(\{g_1^k\}_{{k}}\) are bounded sequences, we conclude that \(\limsup _{k\rightarrow \infty } [f_1(x^k)-{\check{f}}_1^{k-1}(x^k)]\le 0\). Since \(f_1\) is convex we have \(f_1(x^k)\ge {\check{f}}_1^{k-1}(x^k)\) and the result follows. \(\square \)
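The conclusions of Lemma 7(i)-(ii) and Lemma 8 can be observed numerically. The toy run below is an illustration under simplifying assumptions, not the paper's algorithm: Euclidean Bregman function, \(f_1(x)=x^2\), frozen center \({\hat{x}}=1\) and frozen \({\hat{g}}_2=0.5\), constant \(\mu _k\equiv 1\), and every iteration treated as a null step. The optimal master values \(F^{k}(x^{k+1})\) should then increase toward a limit below \(f_1({\hat{x}})\), while the model gaps \(f_1(x^{k+1})-{\check{f}}_1^{k}(x^{k+1})\) vanish.

```python
# Toy null-step loop illustrating Lemma 7(i)-(ii) and Lemma 8 (illustrative
# data, Euclidean Bregman function; the stability center is never updated).

def f1(x): return x * x
def g1(x): return 2.0 * x
def D(x, y): return 0.5 * (x - y) ** 2

def golden_min(F, a, b, iters=100):
    """Golden-section search: minimizer of a unimodal F on [a, b]."""
    phi = (5 ** 0.5 - 1) / 2
    c, d = b - phi * (b - a), a + phi * (b - a)
    Fc, Fd = F(c), F(d)
    for _ in range(iters):
        if Fc < Fd:
            b, d, Fd = d, c, Fc
            c = b - phi * (b - a)
            Fc = F(c)
        else:
            a, c, Fc = c, d, Fd
            d = a + phi * (b - a)
            Fd = F(d)
    return 0.5 * (a + b)

x_hat, g2_hat, mu = 1.0, 0.5, 1.0
cuts = [(f1(x_hat), g1(x_hat), x_hat)]    # linearizations of f1
F_vals, gaps = [], []                     # F^k(x^{k+1}) and model gaps
for _ in range(25):
    def model(x):                         # cutting-plane model \check f_1^k
        return max(fv + gv * (x - xv) for fv, gv, xv in cuts)
    def master(x):                        # F^k of (31)
        return model(x) - g2_hat * (x - x_hat) + mu * D(x, x_hat)
    trial = golden_min(master, -10.0, 10.0)
    F_vals.append(master(trial))          # nondecreasing, <= f1(x_hat)
    gaps.append(f1(trial) - model(trial))  # -> 0 as in Lemma 8
    cuts.append((f1(trial), g1(trial), trial))   # null step: enrich the model
```

In this smooth example the gaps shrink quickly; with a nonsmooth \(f_1\) the decrease is slower but the same monotonicity and vanishing-gap behavior appear.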

A.2 Proof of Lemma 4

Suppose that \({\hat{x}}\) is not a cluster point of \(\{x^{k+1}\}_{{k\ge k({\hat{\ell }})}}\). Then there would exist \(\epsilon >0\) and an index \({\tilde{k}}\ge k({\hat{\ell }})\) such that

$$\begin{aligned} (1-\kappa ){\underline{\mu }}\,D(x^{k+1},{\hat{x}})\ge \frac{(1-\kappa ){\underline{\mu }}\rho }{2}\Arrowvert x^{k+1}-{\hat{x}} \Arrowvert ^2_p> \epsilon \end{aligned}$$

for all index \(k+1\ge {\tilde{k}}\). It follows from Lemma 8 that there exists an index \({\bar{k}}\ge k({\hat{\ell }})\) such that

$$\begin{aligned} f_1(x^{k+1}) -{\check{f}}_1^{k}(x^{k+1})\le \epsilon \hbox { for all } k+1 \ge {\bar{k}}. \end{aligned}$$

The definition of \(x^{k+1}\), the feasibility of \({\hat{x}}\), and the inequality \({\check{f}}_1^k {(\cdot )}\le f_1{(\cdot )}\) yield

$$\begin{aligned} {\check{f}}_1^k(x^{k+1}) -\langle {\hat{g}}_2,x^{k+1}-{\hat{x}}\rangle +\mu _k\, D(x^{k+1}, {\hat{x}}) \le f_1({\hat{x}}). \end{aligned}$$

Since \(\kappa \in (0,1)\) and \(\mu _k \ge {\underline{\mu }}\) by assumption, taking \(k+1 > \max \{{\tilde{k}},\,{\bar{k}}\}\) we get

$$\begin{aligned} f_1({\hat{x}})&\ge f_1(x^{k+1}) -\langle {\hat{g}}_2,x^{k+1}-{\hat{x}}\rangle +\kappa {\underline{\mu }}D(x^{k+1}, {\hat{x}}) \\&\quad + (1-\kappa ){\underline{\mu }}D(x^{k+1}, {\hat{x}}) + {\check{f}}_1^k(x^{k+1})- f_1(x^{k+1}) \\&> f_1(x^{k+1}) -\langle {\hat{g}}_2,x^{k+1}-{\hat{x}}\rangle +\kappa {\underline{\mu }}D(x^{k+1}, {\hat{x}}) + \epsilon -\epsilon \,, \end{aligned}$$

showing that \(x^{k+1}\) satisfies the descent test (13), i.e. \(x^{k+1}\) becomes the new stability center. But this contradicts the fact that \({\hat{x}}\) is the last stability center. Hence, the sequence \(\{x^{k+1}\}_{{k}}\) has a subsequence that converges to \({\hat{x}}\), i.e., \(\lim _{k\in {\mathcal {K}}} x^{k} ={\hat{x}}\) for some index set \({\mathcal {K}}\subset {\{k({\hat{\ell }})+1,k({\hat{\ell }})+2,\ldots \}}\). We now proceed to show that indeed the whole sequence converges to \({\hat{x}}\): it follows from (31) that

$$\begin{aligned} F^{k-1}(x^k) = {\check{f}}_1^{{k-1}}(x^k) -\langle {\hat{g}}_2,x^k-{\hat{x}}\rangle +\mu _{{k-1}}\,D(x^k,{\hat{x}}) \end{aligned}$$

and, therefore, \(\lim _{k\in {\mathcal {K}}} F^{k-1}(x^k) = f_1({\hat{x}})\) from Lemma 8. Lemma 7(i) shows that \(\{F^{k-1}(x^k)\}_{{k\ge k({\hat{\ell }})}}\) is nondecreasing and hence \(\lim _{k\rightarrow \infty } F^{k}(x^{k+1}) =\lim _{k\in {\mathcal {K}}} F^{k-1}(x^k)= f_1({\hat{x}})\). This property combined with (34) shows that the whole sequence \(\{x^k\}_{{k}}\) converges to \({\hat{x}}\). \(\square \)

About this article

Cite this article

de Oliveira, W. Proximal bundle methods for nonsmooth DC programming. J Glob Optim 75, 523–563 (2019). https://doi.org/10.1007/s10898-019-00755-4
