Abstract
We consider the problem of minimizing the difference of two nonsmooth convex functions over a simple convex set. To deal with this class of nonsmooth and nonconvex optimization problems, we propose new proximal bundle algorithms and show that the given approaches generate subsequences of iterates that converge to critical points. Trial points are obtained by solving strictly convex master programs defined by the sum of a convex cutting-plane model and a freely-chosen Bregman function. In the unconstrained case with the Bregman function being the Euclidean distance, new iterates are solutions of strictly convex quadratic programs of limited sizes. Stronger convergence results (d-stationarity) can be achieved depending on (a) further assumptions on the second DC component of the objective function and (b) solving possibly more than one master program at certain iterations. The given approaches are validated by encouraging numerical results on some academic DC programs.
Notes
We have used \(\rho =10^4\times n\) in our numerical experiments.
Probability computed using Matlab's mvncdf function.
References
Astorino, A., Miglionico, G.: Optimizing sensor cover energy via DC programming. Optim. Lett. 10(2), 355–368 (2016)
Bagirov, A.M.: A method for minimization of quasidifferentiable functions. Optim. Methods Softw. 17(1), 31–60 (2002)
Bagirov, A.M., Yearwood, J.: A new nonsmooth optimization algorithm for minimum sum-of-squares clustering problems. Eur. J. Oper. Res. 170(2), 578–596 (2006)
Ben-Tal, A., Nemirovski, A.: Non-Euclidean restricted memory level method for large-scale convex optimization. Math. Program. 102, 407–456 (2005)
Bonnans, J., Gilbert, J., Lemaréchal, C., Sagastizábal, C.: Numerical Optimization: Theoretical and Practical Aspects, 2nd edn. Springer, Berlin (2006)
Clarke, F.H.: Optimization and Nonsmooth Analysis. Soc. Ind. Appl. Math. (1990). https://doi.org/10.1137/1.9781611971309
Cruz Neto, J.X., Oliveira, P.R., Soubeyran, A., Souza, J.C.O.: A generalized proximal linearized algorithm for DC functions with application to the optimal size of the firm problem. Ann. Oper. Res. (2018). https://doi.org/10.1007/s10479-018-3104-8
de Oliveira, W., Solodov, M.: A doubly stabilized bundle method for nonsmooth convex optimization. Math. Program. 156(1), 125–159 (2016)
de Oliveira, W.: Target radius methods for nonsmooth convex optimization. Oper. Res. Lett. 45(6), 659–664 (2017)
de Oliveira, W., Sagastizábal, C., Lemaréchal, C.: Convex proximal bundle methods in depth: a unified analysis for inexact oracles. Math. Program. Ser. B 148, 241–277 (2014)
Dolan, E.D., Moré, J.J.: Benchmarking optimization software with performance profiles. Math. Program. 91, 201–213 (2002)
Frangioni, A.: Generalized bundle methods. SIAM J. Optim. 13(1), 117–156 (2002)
Frangioni, A., Gorgone, E.: Generalized bundle methods for sum-functions with “easy” components: applications to multicommodity network design. Math. Program. 145(1), 133–161 (2014)
Fuduli, A., Gaudioso, M., Giallombardo, G.: A DC piecewise affine model and a bundling technique in nonconvex nonsmooth minimization. Optim. Methods Softw. 19(1), 89–102 (2004)
Gaudioso, M., Giallombardo, G., Miglionico, G.: Minimizing piecewise-concave functions over polyhedra. Math. Oper. Res. 43(2), 580–597 (2018)
Gaudioso, M., Giallombardo, G., Miglionico, G., Bagirov, A.M.: Minimizing nonsmooth DC functions via successive DC piecewise-affine approximations. J. Glob. Optim. 71(1), 37–55 (2018)
Hare, W., Sagastizábal, C.: A redistributed proximal bundle method for nonconvex optimization. SIAM J. Optim. 20(5), 2442–2473 (2010)
Hare, W., Sagastizábal, C., Solodov, M.: A proximal bundle method for nonsmooth nonconvex functions with inexact information. Comput. Optim. Appl. 63(1), 1–28 (2016)
Henrion, R.: A Critical Note on Empirical (Sample Average, Monte Carlo) Approximation of Solutions to Chance Constrained Programs (Chapter 3 in [24]). Springer, Berlin (2013)
Hiriart-Urruty, J.B., Lemaréchal, C.: Convex Analysis and Minimization Algorithms I. Grundlehren der mathematischen Wissenschaften, vol. 305, 2nd edn. Springer, Berlin (1996)
Hiriart-Urruty, J.B., Lemaréchal, C.: Convex Analysis and Minimization Algorithms II. Grundlehren der mathematischen Wissenschaften, vol. 306, 2nd edn. Springer, Berlin (1996)
Hiriart-Urruty, J.B.: Generalized Differentiability/Duality and Optimization for Problems Dealing with Differences of Convex Functions, pp. 37–70. Springer, Berlin, Heidelberg (1985)
Holmberg, K., Tuy, H.: A production–transportation problem with stochastic demand and concave production costs. Math. Program. 85(1), 157–179 (1999)
Hömberg, D., Tröltzsch, F. (eds.): System Modeling and Optimization. IFIP Advances in Information and Communication, vol. 391. Springer, Berlin (2013)
Hong, L.J., Yang, Y., Zhang, L.: Sequential convex approximations to joint chance constrained programs: A Monte Carlo approach. Oper. Res. 59(3), 617–630 (2011)
Joki, K., Bagirov, A., Karmitsa, N., Mäkelä, M.M., Taheri, S.: Double bundle method for finding Clarke stationary points in nonsmooth DC programming. SIAM J. Optim. 28(2), 1892–1919 (2018)
Joki, K., Bagirov, A.M., Karmitsa, N., Mäkelä, M.M.: A proximal bundle method for nonsmooth DC optimization utilizing nonconvex cutting planes. J. Glob. Optim. 68(3), 501–535 (2017)
Kelley, J.E.: The cutting-plane method for solving convex programs. J. Soc. Ind. Appl. Math. 8(4), 703–712 (1960)
Khalaf, W., Astorino, A., d’Alessandro, P., Gaudioso, M.: A DC optimization-based clustering technique for edge detection. Optim. Lett. 11(3), 627–640 (2017)
Kiwiel, K.C.: A proximal bundle method with approximate subgradient linearizations. SIAM J. Optim. 16(4), 1007–1023 (2006)
Le Thi, H.A., Tao, P.D.: DC programming in communication systems: challenging problems and methods. Vietnam J. Comput. Sci. 1(1), 15–28 (2014)
Le Thi, H.A., Pham Dinh, T., Ngai, H.V.: Exact penalty and error bounds in DC programming. J. Glob. Optim. 52(3), 509–535 (2012)
Lemaréchal, C.: An algorithm for minimizing convex functions. Inf. Process. 1, 552–556 (1974)
Lemaréchal, C., Nemirovskii, A., Nesterov, Y.: New variants of bundle methods. Math. Program. 69(1), 111–147 (1995)
Lewis, A.S., Overton, M.L.: Nonsmooth optimization via quasi-Newton methods. Math. Program. 141(1), 135–163 (2013)
Mäkelä, M.M., Miettinen, M., Lukšan, L., Vlček, J.: Comparing nonsmooth nonconvex bundle methods in solving hemivariational inequalities. J. Glob. Optim. 14(2), 117–135 (1999)
Noll, D., Apkarian, P.: Spectral bundle methods for non-convex maximum eigenvalue functions: first-order methods. Math. Program. 104(2), 701–727 (2005)
Pang, J.S., Razaviyayn, M., Alvarado, A.: Computing B-stationary points of nonsmooth DC programs. Math. Oper. Res. 42(1), 95–118 (2017)
Prékopa, A.: Stochastic Programming. Kluwer, Dordrecht (1995)
Rockafellar, R.: Convex Analysis, 1st edn. Princeton University Press, Princeton (1970)
Souza, J.C.O., Oliveira, P.R., Soubeyran, A.: Global convergence of a proximal linearized algorithm for difference of convex functions. Optim. Lett. 10(7), 1529–1539 (2016)
Tao, P.D., Le Thi, H.A.: Convex analysis approach to DC programming: theory, algorithms and applications. Acta Math. Vietnam. 22(1), 289–355 (1997)
Tuy, H.: Convex Analysis and Global Optimization. Springer Optimization and Its Applications, 2nd edn. Springer, Berlin (2016)
van Ackooij, W.: Eventual convexity of chance constrained feasible sets. Optimization 64(5), 1263–1284 (2015)
van Ackooij, W., Cruz, J.B., de Oliveira, W.: A strongly convergent proximal bundle method for convex minimization in Hilbert spaces. Optimization 65(1), 145–167 (2016)
van Ackooij, W., Henrion, R.: Gradient formulae for nonlinear probabilistic constraints with Gaussian and Gaussian-like distributions. SIAM J. Optim. 24(4), 1864–1889 (2014)
van Ackooij, W., de Oliveira, W.: Level bundle methods for constrained convex optimization with various oracles. Comput. Optim. Appl. 57(3), 555–597 (2014)
van Ackooij, W., Sagastizábal, C.: Constrained bundle methods for upper inexact oracles with application to joint chance constrained energy problems. SIAM J. Optim. 24(2), 733–765 (2014)
Acknowledgements
The author is grateful to the reviewers for their remarks and constructive suggestions that considerably improved the original version of this article.
A Appendix
A.1 A self-contained analysis of the sequence of infinitely many null steps generated after a last serious step
We assume that after the \({\hat{\ell }}\)th stability center \(x^{k({\hat{\ell }})} = {\hat{x}}\) only null steps are performed, i.e.,
where \({\hat{g}}_2 = g_2^{k({\hat{\ell }})}\) is the last subgradient computed for \(f_2\). Notice that in this case the sequence \(\{\mu _k\}_{k\ge k({\hat{\ell }})}\) is nondecreasing. In what follows we present some auxiliary results to prove Lemma 4. We start by defining the following two useful functions:
Notice that \(F^{-k}\) is twice differentiable (because \(\omega \) defining D is so):
where \(\nabla ^2 \omega (x) \in {\mathbb {R}}^{n\times n}\) is the Hessian of the function \(\omega \). Since \(\omega \) is strongly convex, \(\nabla ^2 \omega (x)\) is positive definite for all \(x \in {\mathbb {R}}^n\). It follows from (10) that \(\nabla F^{-k}(x^{k+1})=0\), i.e., the point \(x^{k+1}\) is the unique minimizer of \( F^{-k}(x)\) over \({\mathbb {R}}^n\). The Taylor and mean value theorems [5, Sect. 13] give, for some \(z = \lambda x^{k+1}+(1-\lambda ){\hat{x}}\) and \(\lambda \in [0,1]\),
where the inequality is due to the assumption that \(\omega \) is strongly convex with parameter \(\rho >0\) and norm \(\Arrowvert \cdot \Arrowvert _p\) in (5) (\(\langle \nabla ^2 \omega (z) (x-x^{k+1}),x-x^{k+1}\rangle \ge \rho \Arrowvert x-x^{k+1} \Arrowvert _p^2\)), and the last equality follows from (31) and (32). The above development is crucial to show the following lemma, which is essentially a reformulation of [10, Lemma 6.3] to our setting.
Lemma 7
Let \({\hat{x}}= x^{k({\hat{\ell }})}\) be the last stability center, generated by Algorithm 1 at iteration \(k({\hat{\ell }})\), after which only null steps are performed. Assume also that for \(k\ge k({\hat{\ell }})\) the function \(F^k\) is the model given in (31) and \(x^{k+1}\) is an iterate obtained from a null step. If \(\{\mu _k\}_{k\ge k({\hat{\ell }})}\) is nondecreasing, then
(i) the sequence \(\{ F^{k}(x^{k+1})\}_{{k\ge k({\hat{\ell }})}}\) is nondecreasing and satisfies
$$\begin{aligned} F^{k}(x^{k+1})+\frac{\mu _k \rho }{2}\Arrowvert x^{k+2}-x^{k+1} \Arrowvert ^2_p \le F^{{k+1}}(x^{k+2}) \;{\hbox { for all }\; k\ge k({\hat{\ell }})}; \end{aligned}$$
(ii) the sequence \(\{F^{k}(x^{k+1})\}_{{k\ge k(\hat{\ell })}}\) is bounded from above:
$$\begin{aligned} F^{k}(x^{k+1})+\frac{\mu _k \rho }{2}\Arrowvert {\hat{x}}-x^{k+1} \Arrowvert ^2_p \le f_1({\hat{x}}) \;{\hbox { for all }\; k\ge k({\hat{\ell }})}; \end{aligned}$$
(iii) the following inequality holds for all \(k\ge k({\hat{\ell }})\):
$$\begin{aligned} {\check{f}}_1^{k}(x^{k+1})-{\check{f}}_1^{k-1}(x^k) \le F^{k}(x^{k+1}) -F^{k-1}(x^k) + \mu _{k-1}[D(x^k,{\hat{x}}) - D(x^{k+1},{\hat{x}})] -\langle {\hat{g}}_2,x^k-x^{k+1}\rangle . \end{aligned}$$
Proof
Algorithm 1 ensures that the aggregate index \(-k\) enters the bundle in every null step. In particular \(-k \in {\mathcal {B}}_1^{{k+1}}\) for all \(k\ge k({\hat{\ell }})\). Then for all \(x \in X\) and all \(k\ge k({\hat{\ell }})\) we have that
where the first inequality is due to \(s^{k+1}\in N_X(x^{k+1})\), the second follows from inequality \(\bar{f}_1^{-k}{(\cdot )}\le {\check{f}}_1^{k+1}{(\cdot )}\) ensured because \(-k \in {\mathcal {B}}_1^{{k+1}}\), and the last inequality follows from the assumption \(\mu _{k+1}\ge \mu _{k}\). Set \(x=x^{k+2}\) in (33) to obtain (i), and \(x={\hat{x}}\) to obtain \(F^{k}(x^{k+1}) + \frac{\mu _k\rho }{2}\Arrowvert {\hat{x}}-x^{k+1} \Arrowvert ^2_p\le F^{-k}({\hat{x}}) \le F^{k+1}({\hat{x}})= {\check{f}}_1^{k+1}({\hat{x}}) \le f_1({\hat{x}})\). To show (iii), note that for all \(k\ge k({\hat{\ell }})\)
where the inequality is due to \(\mu _k\ge \mu _{k-1}\) and the last equality follows from (31) (with k therein replaced with \(k-1\)). The result thus follows. \(\square \)
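The prox terms in Lemma 7(i)-(ii) rest on the strong-convexity lower bound \(D(x,y) \ge \frac{\rho }{2}\Arrowvert x-y \Arrowvert _p^2\) for the Bregman distance generated by \(\omega \). A minimal numerical check of this bound, assuming the Euclidean choice \(\omega (x)=\tfrac{1}{2}\Arrowvert x \Arrowvert _2^2\) (for which \(\rho =1\) and \(D\) is exactly half the squared distance), can be sketched as follows:

```python
import random

# Euclidean choice: omega(x) = 0.5 * ||x||_2^2, strongly convex with rho = 1
def omega(x):
    return 0.5 * sum(xi * xi for xi in x)

def grad_omega(x):
    return list(x)

def bregman(x, y):
    # D(x, y) = omega(x) - omega(y) - <grad omega(y), x - y>
    inner = sum(g * (xi - yi) for g, xi, yi in zip(grad_omega(y), x, y))
    return omega(x) - omega(y) - inner

rho = 1.0
random.seed(0)
for _ in range(1000):
    x = [random.uniform(-5.0, 5.0) for _ in range(3)]
    y = [random.uniform(-5.0, 5.0) for _ in range(3)]
    lower = 0.5 * rho * sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    assert bregman(x, y) >= lower - 1e-9  # D(x, y) >= (rho/2)||x - y||^2
```

For a general strongly convex kernel \(\omega \) the same inequality holds with the corresponding \(\rho \) and norm \(\Arrowvert \cdot \Arrowvert _p\), which is what the proof above actually uses.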
Given the above properties, the following lemma shows that the cutting-plane model \({\check{f}}_1^k\) asymptotically approximates the DC component \(f_1\) on the sequence of null iterates.
Lemma 8
Under the assumptions of Lemma 7, Algorithm 1 ensures that \(\{x^k\}_{{k}}\) is a bounded sequence and
Proof
Lemma 7(i) ensures that the sequence \(\{F^k(x^{k+1})\}_{{k\ge k({\hat{\ell }})}}\) is nondecreasing. Thus, there exists a constant \(C>0\) such that \(F^k(x^{k+1}) \ge -C\) for all \(k\ge k({\hat{\ell }})\). Using Lemma 7(ii) we conclude that
showing that the sequence \(\{\Arrowvert {\hat{x}}-x^k \Arrowvert _p\}_{{k\ge k({\hat{\ell }})}}\) is bounded because \(\{\mu _k\}_{{k> k({\hat{\ell }})}}\) is nondecreasing. Accordingly, \(\{x^k\}_{{k}}\) is also bounded. It follows from Lemma 7(ii) that the sequence \(\{F^k(x^{k+1})\}_{{k\ge k({\hat{\ell }})}}\) is bounded from above by \(f_1({\hat{x}})\). Lemma 7(i) shows that \(\{F^k(x^{k+1})\}_{{k\ge k({\hat{\ell }})}}\) is nondecreasing and hence
where the second limit follows from Lemma 7(i) and the assumption that \(\{\mu _k\}_{{k\ge k({\hat{\ell }})}}\) is nondecreasing. Note that, by (5),
Since \(\{\mu _k\}_{{k}}\) is a bounded sequence and \(\omega \) is a continuous function, we conclude that
The inclusion \(k \in {\mathcal {B}}_1^k\) implies \( f_1(x^k)+\langle g_1^k,x-x^k\rangle =\bar{f}_1^k(x)\le {\check{f}}_1^k(x) \; \hbox { for all }x\in {\mathbb {R}}^n. \) Setting \(x=x^{k+1}\) in this inequality yields \( f_1(x^k)=\bar{f}_1^k(x^{k+1}) + \langle g_1^k,x^k-x^{k+1}\rangle \). Therefore, for \(k>k({\hat{\ell }})\)
where the last inequality is due to Lemma 7(iii). Passing to the limit as \(k\rightarrow \infty \) in the above inequalities, taking into account (35) and (36), and recalling that \(\{x^k\}_{{k}}\) and \(\{g_1^k\}_{{k}}\) are bounded sequences, we conclude that \(\limsup _{k\rightarrow \infty } [f_1(x^k)-{\check{f}}_1^{k-1}(x^k)]\le 0\). Since \(f_1\) is convex we have \(f_1(x^k)\ge {\check{f}}_1^{k-1}(x^k)\) and the result follows. \(\square \)
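Lemma 8 is the classical cutting-plane phenomenon: during a run of null steps with a fixed center, the model catches up with \(f_1\) along the trial points. A self-contained sketch, not the paper's Algorithm 1 but the same mechanism on hypothetical data (\(f_1(x)=x^2\) in one dimension, Euclidean prox with \(\mu =1\), center \({\hat{x}}=1\), and a crude grid search standing in for the master program), illustrates the vanishing gap:

```python
def f1(x):          # hypothetical 1-D first DC component
    return x * x

def subgrad_f1(x):  # a subgradient of f1
    return 2.0 * x

xhat, mu = 1.0, 1.0
cuts = [(f1(xhat), subgrad_f1(xhat), xhat)]     # linearizations (value, slope, point)
grid = [i / 1000.0 - 2.0 for i in range(4001)]  # crude stand-in for the master QP

def model(x):       # cutting-plane model of f1
    return max(fv + g * (x - xv) for fv, g, xv in cuts)

gaps = []
for _ in range(15):  # a run of null steps: the center xhat never moves
    xk = min(grid, key=lambda x: model(x) + 0.5 * mu * (x - xhat) ** 2)
    gaps.append(f1(xk) - model(xk))             # f1 minus model at the trial point
    cuts.append((f1(xk), subgrad_f1(xk), xk))   # enrich the bundle

assert gaps[0] > 0.0 and gaps[-1] <= 1e-2       # the gap vanishes, as in Lemma 8
```

The first gap is large (the single initial cut is a poor model), and each added cut makes the model exact at the previous trial point, driving \(f_1(x^k)-{\check{f}}_1^{k-1}(x^k)\) to zero.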
A.2 Proof of Lemma 4
Suppose that \({\hat{x}}\) is not a cluster point of \(\{x^{k+1}\}_{{k\ge k({\hat{\ell }})}}\). Then there would exist \(\epsilon >0\) and an index \({\tilde{k}}\ge k({\hat{\ell }})\) such that
for all indices \(k+1\ge {\tilde{k}}\). It follows from Lemma 8 that there exists an index \({\bar{k}}\ge k({\hat{\ell }})\) such that
The definition of \(x^{k+1}\), the feasibility of \({\hat{x}}\), and the inequality \({\check{f}}_1^k {(\cdot )}\le f_1{(\cdot )}\) yield
Since by assumption \(\kappa \in (0,1)\) and \(\mu _k \ge {\underline{\mu }}\), taking \(k+1 > \max \{{\tilde{k}},\,{\bar{k}}\}\) we get
showing that \(x^{k+1}\) satisfies the descent test (13), i.e., \(x^{k+1}\) becomes the new stability center. This contradicts the fact that \({\hat{x}}\) is the last stability center. Hence, the sequence \(\{x^{k+1}\}_{{k}}\) has a subsequence that converges to \({\hat{x}}\), i.e., \(\lim _{k\in {\mathcal {K}}} x^{k} ={\hat{x}}\) for some index set \({\mathcal {K}}\subset {\{k({\hat{\ell }})+1,k({\hat{\ell }})+2,\ldots \}}\). We now proceed to show that the whole sequence in fact converges to \({\hat{x}}\): it follows from (31) that
and, therefore, \(\lim _{k\in {\mathcal {K}}} F^{k-1}(x^k) = f_1({\hat{x}})\) from Lemma 8. Lemma 7(i) shows that \(\{F^{k-1}(x^k)\}_{{k\ge k({\hat{\ell }})}}\) is nondecreasing and hence \(\lim _{k\rightarrow \infty } F^{k}(x^{k+1}) =\lim _{k\in {\mathcal {K}}} F^{k-1}(x^k)= f_1({\hat{x}})\). This property combined with (34) shows that the whole sequence \(\{x^k\}_{{k}}\) converges to \({\hat{x}}\). \(\square \)
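The contradiction argument above hinges on the descent test: a trial point achieving a fixed fraction \(\kappa \) of the predicted decrease is promoted to a stability center, otherwise a null step enriches the bundle. A toy serious/null loop in that spirit (again hypothetical data rather than Algorithm 1 itself: \(f_1(x)=x^2\), \(f_2(x)=|x|\), Euclidean prox, grid search in place of the master program, and a descent test of the usual form on the convex function \(f_1 - \langle {\hat{g}}_2,\cdot \rangle \)) can be sketched as:

```python
def f1(x): return x * x                          # first DC component
def g1(x): return 2.0 * x                        # subgradient of f1
def f2(x): return abs(x)                         # second DC component
def g2sub(x): return 1.0 if x >= 0.0 else -1.0   # subgradient of f2

mu, kappa = 1.0, 0.1
grid = [i / 1000.0 - 2.0 for i in range(4001)]

xhat = 1.0                                       # initial stability center
g2 = g2sub(xhat)                                 # linearization of f2 at the center
cuts = [(f1(xhat), g1(xhat), xhat)]

def model(x):                                    # cutting-plane model of f1
    return max(fv + g * (x - xv) for fv, g, xv in cuts)

for _ in range(25):
    # master program: model of f1 minus linearized f2, plus the prox term
    xk = min(grid, key=lambda x: model(x) - g2 * x + 0.5 * mu * (x - xhat) ** 2)
    predicted = (f1(xhat) - g2 * xhat) - (model(xk) - g2 * xk)
    if f1(xk) - g2 * xk <= f1(xhat) - g2 * xhat - kappa * predicted:
        xhat = xk                                # serious step: move the center...
        g2 = g2sub(xhat)                         # ...and re-linearize f2 there
    cuts.append((f1(xk), g1(xk), xk))            # in both cases, enrich the bundle

assert abs(xhat - 0.5) < 1e-2                    # x = 1/2 is critical for x^2 - |x|
```

Starting from \({\hat{x}}=1\), the loop performs a null step, then a serious step to \(x=1/2\), a critical point of \(x^2-|x|\), where it settles; in the regime analyzed in this appendix the serious step never fires, and Lemma 4 shows the null iterates then converge to the last center.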
Cite this article
de Oliveira, W. Proximal bundle methods for nonsmooth DC programming. J Glob Optim 75, 523–563 (2019). https://doi.org/10.1007/s10898-019-00755-4