Linear Convergence of Prox-SVRG Method for Separable Non-smooth Convex Optimization Problems under Bounded Metric Subregularity

Zhang, Jin; Zhu, Xide

doi:10.1007/s10957-021-01978-w

Published: 06 January 2022

Volume 192, pages 564–597, (2022)

Journal of Optimization Theory and Applications

Abstract

Using bounded metric subregularity, which is weaker than strong convexity, we show the linear convergence of the proximal stochastic variance-reduced gradient (Prox-SVRG) method for solving a class of separable non-smooth convex optimization problems in which the smooth term is the composition of a strongly convex function and a linear function. We introduce an equivalent characterization of bounded metric subregularity in terms of the calmness condition of a perturbed linear system. This equivalent characterization allows us to provide a verifiable sufficient condition that ensures linear convergence of Prox-SVRG and randomized block-coordinate proximal gradient methods. Furthermore, we verify that this sufficient condition holds automatically when the non-smooth term is the generalized sparse group Lasso regularizer.

Acknowledgements

This paper is supported by the National Natural Science Foundation of China (Nos. 11901380, 11971220), Shenzhen Science and Technology Program (No. RCYX20200714114700072), the Stable Support Plan Program of Shenzhen Natural Science Fund (No. 20200925152128002), Guangdong Basic and Applied Basic Research Foundation (No. 2019A1515011152) and Shanghai Pujiang Program (No. 2020PJC058). We are grateful to the editor and two anonymous referees for their comments which have helped us improve the paper substantially.

Author information

Authors and Affiliations

Department of Mathematics, Southern University of Science and Technology, National Center for Applied Mathematics Shenzhen, Shenzhen, China
Jin Zhang
School of Management, Shanghai University, Shanghai, China
Xide Zhu

Corresponding author

Correspondence to Xide Zhu.

Additional information

Communicated by Xiaojun Chen.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Proof of Lemma 5.1

If $(\varvec{s}_J)_j\in [-\lambda ,\lambda ]$, we have $\big (\varvec{\mathcal {T}}_{\lambda }(\varvec{s}_J)\big )_j=0$. Clearly, (22) is true for any $\varvec{y}_J\in \partial \Vert \varvec{x}_J\Vert _1$. If $(\varvec{s}_J)_j>\lambda $, we have $\big (\varvec{\mathcal {T}}_{\lambda }(\varvec{s}_J)\big )_j=(\varvec{s}_J)_j-\lambda $. Considering $\Vert \varvec{y}_J\Vert _{\infty }\le 1$, we can further obtain $ 0<(\varvec{s}_J)_j-\lambda \le (\varvec{s}_J-\lambda \varvec{y}_J)_j. $ If $(\varvec{s}_J)_j<-\lambda $, we have $\big (\varvec{\mathcal {T}}_{\lambda }(\varvec{s}_J)\big )_j=(\varvec{s}_J)_j+\lambda $. Similarly, we obtain $ (\varvec{s}_J-\lambda \varvec{y}_J)_j\le (\varvec{s}_J)_j+\lambda <0. $ In either case, we have (22). Consequently, (23) holds for any $\varvec{x}_J \in \mathbb {R}^{|J|}$ and $\varvec{y}_J \in \partial \Vert \varvec{x}_J\Vert _1$. $\square $
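The case analysis above can be checked numerically. The following is a minimal sketch, assuming that $\varvec{\mathcal {T}}_{\lambda }$ is the componentwise soft-thresholding operator described by the three cases in the proof and that the componentwise bound being verified is $|(\varvec{\mathcal {T}}_{\lambda }(\varvec{s}_J))_j|\le |(\varvec{s}_J-\lambda \varvec{y}_J)_j|$ for $\Vert \varvec{y}_J\Vert _{\infty }\le 1$. The function names and the random test data are illustrative and not taken from the paper.

```python
import numpy as np

def soft_threshold(s, lam):
    """Componentwise soft-thresholding: s_j - lam if s_j > lam,
    s_j + lam if s_j < -lam, and 0 if s_j lies in [-lam, lam]."""
    return np.sign(s) * np.maximum(np.abs(s) - lam, 0.0)

def check_componentwise_bound(num_trials=1000, dim=5, lam=0.7, seed=0):
    """Return True if |T_lam(s)_j| <= |(s - lam*y)_j| held in every random trial."""
    rng = np.random.default_rng(seed)
    for _ in range(num_trials):
        s = rng.normal(scale=2.0, size=dim)
        y = rng.uniform(-1.0, 1.0, size=dim)  # any y with ||y||_inf <= 1
        lhs = np.abs(soft_threshold(s, lam))
        rhs = np.abs(s - lam * y)
        if not np.all(lhs <= rhs + 1e-12):  # small slack for rounding
            return False
    return True
```

Since the bound holds componentwise, it carries over to any monotone norm of the two vectors, which is how the proof passes from the componentwise statement to the norm inequality.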

Proof of Lemma 5.2

From (24), we know $ \varvec{s}_J\in w_J \partial \Vert \varvec{x}_J\Vert _p+\lambda {\partial \Vert \varvec{x}_J\Vert }_1. $ Since $p\in ]1, \infty [$, we have $\frac{p}{q}\in ]0, \infty [$. From (19), we know that since $\varvec{x}_J\ne \varvec{0}$, there exists $\varvec{y}_J\in \partial \Vert \varvec{x}_J\Vert _1$ such that

$$\begin{aligned} \varvec{s}_J-\lambda \varvec{y}_J = w_J \frac{\varvec{\varvec{\varphi }}_{\frac{p}{q}}(\varvec{x}_J)}{\big \Vert \varvec{\varvec{\varphi }}_{\frac{p}{q}}(\varvec{x}_J)\big \Vert _q}. \end{aligned}$$

(A.1)

Let $j\in J$ be arbitrary. If $(\varvec{x}_J)_j>0$, then $\big (\varvec{\varvec{\varphi }}_{\frac{p}{q}}(\varvec{x}_J)\big )_j>0$ and $(\varvec{y}_J)_j=1$. It follows from (A.1) that $(\varvec{s}_J)_j-\lambda =(\varvec{s}_J-\lambda \varvec{y}_J)_j\ge 0$ holds for such $j$. If $(\varvec{x}_J)_j=0$, then $\big (\varvec{\varvec{\varphi }}_{\frac{p}{q}}(\varvec{x}_J)\big )_j=0$ and $(\varvec{y}_J)_j\in [-1,1]$. It follows from (A.1) that $(\varvec{s}_J-\lambda \varvec{y}_J)_j=0$ holds for such $j$. If $(\varvec{x}_J)_j<0$, then $\big (\varvec{\varvec{\varphi }}_{\frac{p}{q}}(\varvec{x}_J)\big )_j<0$ and $(\varvec{y}_J)_j=-1$. It follows from (A.1) that $(\varvec{s}_J)_j+\lambda =(\varvec{s}_J-\lambda \varvec{y}_J)_j\le 0$ holds for such $j$. In each of the three cases, we know that $\big (\varvec{\mathcal {T}}_{\lambda }(\varvec{s}_J)\big )_j=(\varvec{s}_J-\lambda \varvec{y}_J)_j$ holds for each $j\in J$. Thus, we obtain (25). $\square $
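As an illustration of the identity just proved, the sketch below builds $\varvec{s}_J$ from a nonzero $\varvec{x}_J$ via (A.1) and compares $\varvec{\mathcal {T}}_{\lambda }(\varvec{s}_J)$ with $\varvec{s}_J-\lambda \varvec{y}_J$. It assumes that $\varvec{\varphi }_t$ acts componentwise as $z\mapsto \text{sign}(z)|z|^{t}$, that $\varvec{\mathcal {T}}_{\lambda }$ is componentwise soft-thresholding, and that $q$ is the conjugate exponent of $p$. The helper names and the argument `y_free` (the free choice of $(\varvec{y}_J)_j\in [-1,1]$ on zero entries) are hypothetical.

```python
import numpy as np

def phi(x, t):
    """Componentwise signed power sign(x_j) * |x_j|**t (assumed form of varphi_t)."""
    return np.sign(x) * np.abs(x) ** t

def soft_threshold(s, lam):
    """Componentwise soft-thresholding (assumed form of T_lambda)."""
    return np.sign(s) * np.maximum(np.abs(s) - lam, 0.0)

def lemma_5_2_residual(x, y_free, p, w, lam):
    """Build s from a nonzero x as in (A.1) and return
    max_j |T_lam(s)_j - (s - lam*y)_j|, which should vanish up to rounding."""
    q = p / (p - 1.0)  # conjugate exponent, 1/p + 1/q = 1
    g = phi(x, p / q)
    direction = g / np.linalg.norm(g, ord=q)
    # y is a subgradient of ||.||_1 at x: sign(x_j) on the support,
    # a free value in [-1, 1] on the zero entries
    y = np.where(x != 0, np.sign(x), np.clip(y_free, -1.0, 1.0))
    s = w * direction + lam * y
    return float(np.max(np.abs(soft_threshold(s, lam) - (s - lam * y))))
```

For example, `lemma_5_2_residual(np.array([1.5, 0.0, -2.0, 0.3]), np.array([0.0, 0.4, 0.0, 0.0]), p=1.5, w=2.0, lam=0.5)` should return a value at the level of rounding error.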

Proof of Proposition 5.1

The case (i) is trivial since $\partial g_J(\varvec{x}_J) \equiv \{\varvec{0}\}$ for all $\varvec{x}_J\in \mathbb {R}^{|J|}$. We now consider the case (ii). In this case, we have $\partial g_J(\varvec{x}_J)=\lambda {\partial \Vert \varvec{x}_J\Vert }_1$ for all $\varvec{x}_J\in \mathbb {R}^{|J|}$, which means that for any fixed $\varvec{s}_J\in \mathbb {R}^{|J|}$,

$$\begin{aligned} (\partial g_J)^{-1}(\varvec{s}_J)=\big (\lambda {\partial \Vert \cdot \Vert }_1\big )^{-1}(\varvec{s}_J). \end{aligned}$$

The case $\Vert \varvec{s}_J\Vert _{\infty }>\lambda $ is trivial. Now, we suppose that there exists $\varvec{x}_J\in \mathbb {R}^{|J|}$ satisfying $\varvec{x}_J\in \big (\lambda {\partial \Vert \cdot \Vert }_1\big )^{-1}(\varvec{s}_J)$, and then, we have $\varvec{s}_J\in \lambda {\partial \Vert \varvec{x}_J\Vert _1}$. Clearly, we have $\Vert \varvec{s}_J\Vert _{\infty }\le \lambda $. Moreover, we have $\big (\lambda {\partial \Vert \cdot \Vert }_1\big )^{-1}(\varvec{s}_J)=\{\varvec{0}\}$ if $\Vert \varvec{s}_J\Vert _{\infty }<\lambda $. Thus, let us consider the case where $\Vert \varvec{s}_J\Vert _{\infty }=\lambda $. According to (15), we know that if $\Vert \varvec{s}_J\Vert _{\infty }=\lambda $, then

$$\begin{aligned} (\varvec{x}_J)_j \in \left\{ \begin{array}{ll} [0,+\infty [ &{} ~\text{ if }~~ (\varvec{s}_J)_j=\lambda ,\\ ]-\infty ,0] &{} ~\text{ if }~~ (\varvec{s}_J)_j=-\lambda ,\\ \{0\} &{} ~\text{ if }~~ (\varvec{s}_J)_j\in ]-\lambda ,\lambda [. \end{array}\right. \end{aligned}$$

Combined with (28), we obtain (27) immediately.
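The description of $\big (\lambda {\partial \Vert \cdot \Vert }_1\big )^{-1}(\varvec{s}_J)$ in case (ii) can be compared with the definition of the subdifferential on concrete vectors. The sketch below assumes $\lambda >0$ and implements two membership tests, one from the definition $\varvec{s}_J\in \lambda \partial \Vert \varvec{x}_J\Vert _1$ and one from the displayed case distinction. The function names and the tolerance `tol` are illustrative assumptions.

```python
import numpy as np

def s_in_scaled_l1_subdiff(s, x, lam, tol=1e-12):
    """Definition-based test of s in lam * subdifferential of ||.||_1 at x:
    s_j = lam * sign(x_j) where x_j != 0, and |s_j| <= lam where x_j = 0."""
    on_support = x != 0
    ok_support = np.all(np.abs(s[on_support] - lam * np.sign(x[on_support])) <= tol)
    ok_zero = np.all(np.abs(s[~on_support]) <= lam + tol)
    return bool(ok_support and ok_zero)

def x_in_inverse_by_cases(x, s, lam, tol=1e-12):
    """Test based on the displayed case distinction for (x_J)_j (lam > 0 assumed)."""
    if np.max(np.abs(s)) > lam + tol:
        return False  # the inverse image is empty in this case
    at_plus = np.abs(s - lam) <= tol    # components with s_j = lam
    at_minus = np.abs(s + lam) <= tol   # components with s_j = -lam
    interior = ~(at_plus | at_minus)    # components with |s_j| < lam
    return bool(np.all(x[at_plus] >= 0)
                and np.all(x[at_minus] <= 0)
                and np.all(x[interior] == 0))
```

On any pair $(\varvec{x}_J,\varvec{s}_J)$ the two tests are expected to agree, which is the content of the case analysis.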

(iii) If $p\in ]1, \infty [$, then we have $q\in ]1,\infty [$ and $\frac{q}{p}\in ]0,\infty [$. For any fixed $\varvec{s}_J\in \mathbb {R}^{|J|}$, we distinguish three cases according to how $\Vert \varvec{s}_J\Vert _q$ compares with $w_J$, that is, $\Vert \varvec{s}_J\Vert _q>w_J$, $\Vert \varvec{s}_J\Vert _q<w_J$ and $\Vert \varvec{s}_J\Vert _q=w_J$. It follows from (19) that if there exists $\varvec{x}_J\in \mathbb {R}^{|J|}$ such that $\varvec{s}_J\in w_J \partial \Vert \varvec{x}_J\Vert _p$, then we have $\Vert \varvec{s}_J\Vert _q\le w_J$, and moreover, if $\varvec{x}_J\ne \varvec{0}$, then we have $\Vert \varvec{s}_J\Vert _q=w_J$. Thus, we immediately obtain that $\big (w_J \partial \Vert \cdot \Vert _p\big )^{-1}(\varvec{s}_J)=\emptyset $ if $\Vert \varvec{s}_J\Vert _q>w_J$ and $\big (w_J \partial \Vert \cdot \Vert _p\big )^{-1}(\varvec{s}_J)=\{\varvec{0}\}$ if $\Vert \varvec{s}_J\Vert _q<w_J$. Next, we consider the case $\Vert \varvec{s}_J\Vert _q=w_J$. Suppose that there exists $\varvec{x}_J\in \mathbb {R}^{|J|}$ such that $\varvec{s}_J\in w_J \partial \Vert \varvec{x}_J\Vert _p$. Then either $\varvec{x}_J=\varvec{0}$ or $\varvec{x}_J\ne \varvec{0}$ satisfying

$$\begin{aligned} \varvec{s}_J=w_J \frac{\varvec{\varvec{\varphi }}_{\frac{p}{q}}(\varvec{x}_J)}{\big \Vert \varvec{\varvec{\varphi }}_{\frac{p}{q}}(\varvec{x}_J)\big \Vert _q}. \end{aligned}$$

(A.2)

Clearly, (A.2) means that the vectors $\varvec{s}_J$ and $\varvec{\varvec{\varphi }}_{\frac{p}{q}}(\varvec{x}_J)$ are linearly dependent. In either case, there must exist ${\alpha }_J \ge 0$ such that

$$\begin{aligned} \varvec{\varvec{\varphi }}_{\frac{p}{q}}(\varvec{x}_J)={({\alpha }_J)^{\frac{p}{q}}} \varvec{s}_J. \end{aligned}$$

(A.3)

Substituting (A.3) into (17), we can further obtain

$$\begin{aligned} \varvec{x}_J=\varvec{\varvec{\varphi }}_{{\frac{q}{p}}}\big (\varvec{\varvec{\varphi }}_{\frac{p}{q}}(\varvec{x}_J)\big )=\varvec{\varvec{\varphi }}_{{\frac{q}{p}}}\big ({({\alpha }_J)^{\frac{p}{q}}} \varvec{s}_J\big ) = {\alpha }_J {\varvec{\varvec{\varphi }}_{{\frac{q}{p}}}(\varvec{s}_J)}. \end{aligned}$$

Thus, we have (29).
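The parametrization $\varvec{x}_J={\alpha }_J \varvec{\varphi }_{\frac{q}{p}}(\varvec{s}_J)$ obtained in case (iii) can be sanity-checked by mapping such an $\varvec{x}_J$ back through the right-hand side of (A.2). The sketch below assumes that $\varvec{\varphi }_t$ is the componentwise signed power $z\mapsto \text{sign}(z)|z|^{t}$ and that $q$ is the conjugate exponent of $p$. The function names are hypothetical.

```python
import numpy as np

def phi(x, t):
    """Componentwise signed power sign(x_j) * |x_j|**t (assumed form of varphi_t)."""
    return np.sign(x) * np.abs(x) ** t

def recovered_subgradient(s, alpha, p, w):
    """Form x = alpha * phi_{q/p}(s) and return the right-hand side of (A.2),
    w * phi_{p/q}(x) / ||phi_{p/q}(x)||_q, evaluated at that x (alpha > 0)."""
    q = p / (p - 1.0)  # conjugate exponent, 1/p + 1/q = 1
    x = alpha * phi(s, q / p)
    g = phi(x, p / q)
    return w * g / np.linalg.norm(g, ord=q)
```

When $\Vert \varvec{s}_J\Vert _q=w_J$ and ${\alpha }_J>0$, the returned vector should coincide with $\varvec{s}_J$ up to rounding, independently of the value of ${\alpha }_J$.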

(iv) In this case, we have $\partial g_J(\varvec{x}_J)=w_J \partial \Vert \varvec{x}_J\Vert _1$. Arguing as in the proof of case (ii), we obtain (30).

(v) We first consider the case where $\big \Vert \varvec{\mathcal {T}}_{\lambda }(\varvec{s}_J)\big \Vert _q>w_J$. In this case, we claim that

$$\begin{aligned} \big (w_J \partial \Vert \cdot \Vert _p+\lambda {\partial \Vert \cdot \Vert }_1\big )^{-1}(\varvec{s}_J)=\emptyset . \end{aligned}$$

(A.4)

Indeed, if (A.4) does not hold, then there exists $\varvec{x}_J\in \mathbb {R}^{|J|}$ such that

$$\begin{aligned} \varvec{s}_J\in w_J \partial \Vert \varvec{x}_J\Vert _p+\lambda {\partial \Vert \varvec{x}_J\Vert }_1. \end{aligned}$$

(A.5)

Moreover, from (A.5), there exists $\varvec{y}_J\in {\partial \Vert \varvec{x}_J\Vert }_1$ such that $ \varvec{s}_J-\lambda \varvec{y}_J\in w_J \partial \Vert \varvec{x}_J\Vert _p. $ Clearly, we have $\Vert \varvec{s}_J-\lambda \varvec{y}_J \Vert _q\le w_J$. Since $\big \Vert \varvec{\mathcal {T}}_{\lambda }(\varvec{s}_J)\big \Vert _q>w_J$, we have

$$\begin{aligned} \Vert \varvec{s}_J-\lambda \varvec{y}_J \Vert _q<\big \Vert \varvec{\mathcal {T}}_{\lambda }(\varvec{s}_J)\big \Vert _q. \end{aligned}$$

Clearly, this result conflicts with (23). Thus, we have (A.4) if $\big \Vert \varvec{\mathcal {T}}_{\lambda }(\varvec{s}_J)\big \Vert _q>w_J$. Next, let us consider the second case where $\big \Vert \varvec{\mathcal {T}}_{\lambda }(\varvec{s}_J)\big \Vert _q<w_J$. In this case, we know from (19) that $\varvec{\mathcal {T}}_{\lambda }(\varvec{s}_J)\in w_J \partial \Vert \varvec{0}\Vert _p$. Moreover, from (21), we have $\varvec{s}_J-\varvec{\mathcal {T}}_{\lambda }(\varvec{s}_J)\in \lambda {\partial \Vert \varvec{0}\Vert }_1$. Thus, we have $ \varvec{s}_J\in w_J \partial \Vert \varvec{0}\Vert _p+\lambda {\partial \Vert \varvec{0}\Vert }_1, $ that is,

$$\begin{aligned} \varvec{0} \in \big (w_J \partial \Vert \cdot \Vert _p+\lambda {\partial \Vert \cdot \Vert }_1\big )^{-1}(\varvec{s}_J). \end{aligned}$$

(A.6)

We claim that if $\big \Vert \varvec{\mathcal {T}}_{\lambda }(\varvec{s}_J)\big \Vert _q<w_J$, then

$$\begin{aligned} \big (w_J \partial \Vert \cdot \Vert _p+\lambda {\partial \Vert \cdot \Vert }_1\big )^{-1}(\varvec{s}_J)=\{\varvec{0}\}. \end{aligned}$$

(A.7)

Indeed, if there exists $\varvec{x}_J\ne \varvec{0}$ satisfying (A.5), then by Lemma 5.2 we have

$$\begin{aligned} \varvec{\mathcal {T}}_{\lambda }(\varvec{s}_J) = w_J \frac{\varvec{\varvec{\varphi }}_{\frac{p}{q}}(\varvec{x}_J)}{\big \Vert \varvec{\varvec{\varphi }}_{\frac{p}{q}}(\varvec{x}_J)\big \Vert _q}. \end{aligned}$$

(A.8)

From (A.8), we can further obtain that $\big \Vert \varvec{\mathcal {T}}_{\lambda }(\varvec{s}_J)\big \Vert _q=w_J$. Clearly, this result conflicts with the assumption that $\big \Vert \varvec{\mathcal {T}}_{\lambda }(\varvec{s}_J)\big \Vert _q<w_J$. Thus, we have (A.7) if $\big \Vert \varvec{\mathcal {T}}_{\lambda }(\varvec{s}_J)\big \Vert _q<w_J$. Lastly, we consider the third case $\big \Vert \varvec{\mathcal {T}}_{\lambda }(\varvec{s}_J)\big \Vert _q=w_J$. Clearly, $\varvec{\mathcal {T}}_{\lambda }(\varvec{s}_J)\in w_J \partial \Vert \varvec{0}\Vert _p$. Similarly, we have (A.6). Now, we suppose that there exists $\varvec{x}_J\ne \varvec{0}$ satisfying (A.5). According to Lemma 5.2, we have (A.8). Since $\big \Vert \varvec{\mathcal {T}}_{\lambda }(\varvec{s}_J)\big \Vert _q=w_J$, we can rewrite (A.8) as

$$\begin{aligned} \frac{\varvec{\mathcal {T}}_{\lambda }(\varvec{s}_J)}{\big \Vert \varvec{\mathcal {T}}_{\lambda }(\varvec{s}_J)\big \Vert _q} = \frac{\varvec{\varvec{\varphi }}_{\frac{p}{q}}(\varvec{x}_J)}{\big \Vert \varvec{\varvec{\varphi }}_{\frac{p}{q}}(\varvec{x}_J)\big \Vert _q}. \end{aligned}$$

(A.9)

Clearly, (A.9) means that $\varvec{\mathcal {T}}_{\lambda }(\varvec{s}_J)$ and $\varvec{\varvec{\varphi }}_{\frac{p}{q}}(\varvec{x}_J)$ are linearly dependent nonzero vectors. Thus, we know that there must exist ${\alpha }_J > {0}$ such that

$$\begin{aligned} \varvec{\varvec{\varphi }}_{\frac{p}{q}}(\varvec{x}_J)= (\alpha _J)^{\frac{p}{q}} \varvec{\mathcal {T}}_{\lambda }(\varvec{s}_J). \end{aligned}$$

(A.10)

Using (17), we can further obtain from (A.10) that $ \varvec{x}_J=\varvec{\varvec{\varphi }}_{\frac{q}{p}}\big (\varvec{\varvec{\varphi }}_{\frac{p}{q}}(\varvec{x}_J)\big ) = {\alpha }_J \varvec{\varvec{\varphi }}_{\frac{q}{p}}\big (\varvec{\mathcal {T}}_{\lambda }(\varvec{s}_J)\big ). $ This, together with (A.6), yields

$$\begin{aligned} \big ( w_J \partial \Vert \cdot \Vert _p + \lambda {\partial \Vert \cdot \Vert }_1 \big )^{-1}(\varvec{s}_J) = \big \{\alpha _J \varvec{\varvec{\varphi }}_{{\frac{q}{p}}}\big (\varvec{\mathcal {T}}_{\lambda }(\varvec{s}_J)\big )\in \mathbb {R}^{|J|}: {\alpha }_J \ge {0}\big \}. \end{aligned}$$

From the above analysis, we obtain (31).
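The third case of (v) states that, when $\big \Vert \varvec{\mathcal {T}}_{\lambda }(\varvec{s}_J)\big \Vert _q=w_J$, every point ${\alpha }_J \varvec{\varphi }_{\frac{q}{p}}\big (\varvec{\mathcal {T}}_{\lambda }(\varvec{s}_J)\big )$ with ${\alpha }_J\ge 0$ satisfies (A.5). The sketch below checks this inclusion directly for ${\alpha }_J>0$, again assuming componentwise soft-thresholding for $\varvec{\mathcal {T}}_{\lambda }$ and the componentwise signed power for $\varvec{\varphi }_t$. The function names and the tolerance are illustrative assumptions.

```python
import numpy as np

def phi(x, t):
    """Componentwise signed power sign(x_j) * |x_j|**t (assumed form of varphi_t)."""
    return np.sign(x) * np.abs(x) ** t

def soft_threshold(s, lam):
    """Componentwise soft-thresholding (assumed form of T_lambda)."""
    return np.sign(s) * np.maximum(np.abs(s) - lam, 0.0)

def candidate_solution(s, alpha, p, lam):
    """Point alpha * phi_{q/p}(T_lam(s)) from the set displayed before (31)."""
    q = p / (p - 1.0)
    return alpha * phi(soft_threshold(s, lam), q / p)

def satisfies_inclusion(s, x, p, w, lam, tol=1e-10):
    """Check s in w * d||x||_p + lam * d||x||_1 for x != 0 and p in ]1, inf[."""
    q = p / (p - 1.0)
    g = phi(x, p / q)
    r = s - w * g / np.linalg.norm(g, ord=q)  # must lie in lam * d||x||_1
    supp = x != 0
    return bool(np.all(np.abs(r[supp] - lam * np.sign(x[supp])) <= tol)
                and np.all(np.abs(r[~supp]) <= lam + tol))
```

If $\varvec{s}_J$ is instead chosen with $\big \Vert \varvec{\mathcal {T}}_{\lambda }(\varvec{s}_J)\big \Vert _q<w_J$, the same check is expected to fail for every nonzero candidate, in line with (A.7).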

(vi) In this case, we have $\partial g_J(\varvec{x}_J) = (w_J + \lambda ) \partial \Vert \varvec{x}_J\Vert _1$. Arguing as in the proof of case (ii), we obtain (32). This completes the proof. $\square $

Proof of Lemma 5.3

Since $p \in ]1, 2]$, we have $\frac{q}{p} \in [1,\infty [$ and $\frac{p}{q} \in ]0, 1]$. For any $p \in ]1, 2]$, it is clear that the function $z \rightarrow \text{ sign }(z) |z|^{\frac{q}{p}}$ is continuously differentiable on $\mathbb {R}$ and hence locally Lipschitz. Thus, for any fixed $\varvec{x}^0\in \mathbb {R}^n$, there exist $\delta _{J}, \kappa _{J}>0$ such that for all $\varvec{\xi }_{J,1}, \varvec{\xi }_{J,2} \in \mathbb {U}_{\delta _{J}}\big (\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}^0_J)\big )$,

$$\begin{aligned} \left\| \varvec{\varphi }_{\frac{q}{p}}(\varvec{\xi }_{J,1}) - \varvec{\varphi }_{\frac{q}{p}}(\varvec{\xi }_{J,2})\right\| \le \kappa _{J} \Vert \varvec{\xi }_{J,1} - \varvec{\xi }_{J,2}\Vert . \end{aligned}$$

(A.11)

Next, we will show that there exists $\epsilon _J>0$ such that for all $\varvec{x}_J\in \mathbb {U}_{\epsilon _J}(\varvec{x}_J^0)$,

$$\begin{aligned} \big \Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)-\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}^0_J)\big \Vert \le \delta _{J}, \end{aligned}$$

(A.12)

and

$$\begin{aligned} \Big \Vert \varvec{s}_J^0 {w_J}^{-1} {\Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\Vert _q}-\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}^0_J)\Big \Vert \le \delta _{J}. \end{aligned}$$

(A.13)

Since the function $z\rightarrow \text{ sign }(z) |z|^{\frac{p}{q}}$ is continuous on $\mathbb {R}$, there exists $\epsilon _{J,1}>0$ such that (A.12) is satisfied for all $\varvec{x}_J\in \mathbb {U}_{\epsilon _{J,1}}(\varvec{x}_J^0)$. Next, let us consider (A.13). If $\varvec{x}^0_J=\varvec{0}$, then we have $\lim _{\varvec{x}_J\rightarrow \varvec{x}^0_J}\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)=\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}^0_J)=\varvec{0}$ since the function $z\rightarrow \text{ sign }(z) |z|^{\frac{p}{q}}$ is continuous. Hence, in this case, we have

$$\begin{aligned} \lim _{\varvec{x}_J\rightarrow \varvec{x}^0_J}\Big \Vert \varvec{s}_J^0 {w_J}^{-1} {\Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\Vert _q}-\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}^0_J)\Big \Vert =0. \end{aligned}$$

(A.14)

Otherwise, from (19), we have $\varvec{s}_J^0=w_J {\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}^0_J)}\big /{\big \Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}^0_J)\big \Vert _q}$. Considering the continuity of the function $z\rightarrow \text{ sign }(z) |z|^{\frac{p}{q}}$, we have $\lim _{\varvec{x}_J\rightarrow \varvec{x}^0_J}\varvec{s}_J^0 {w_J}^{-1}{\Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\Vert _q}=\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}^0_J)$. In either case, we have (A.14). Thus, there exists $\epsilon _{J,2}>0$ such that (A.13) holds for all $\varvec{x}_J\in \mathbb {U}_{\epsilon _{J,2}}(\varvec{x}_J^0)$.

Let $\epsilon _J:=\min \{\epsilon _{J,1},\epsilon _{J,2}\}$. For any $\varvec{x}_J\in \mathbb {U}_{\epsilon _{J}}(\varvec{x}_J^0)$, we have (A.12) and (A.13). Combining (A.11) and (17), we obtain that the estimate claimed in Lemma 5.3 holds for all $\varvec{x}_J\in \mathbb {U}_{\epsilon _{J}}(\varvec{x}_J^0)$. $\square $
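The restriction $p\in ]1,2]$ enters the proof through the exponent $\frac{q}{p}\ge 1$, which makes $z\mapsto \text{sign}(z)|z|^{\frac{q}{p}}$ continuously differentiable and hence locally Lipschitz. The following sketch illustrates the contrast with an exponent in $]0,1[$ by computing difference quotients on a uniform grid. The function names, the grid size and the radius are illustrative choices rather than quantities from the paper.

```python
import numpy as np

def signed_power(z, t):
    """Scalar map z -> sign(z) * |z|**t applied elementwise."""
    return np.sign(z) * np.abs(z) ** t

def max_difference_quotient(t, radius, num=2001):
    """Largest slope between neighbouring points of a uniform grid on
    [-radius, radius]; a grid-based lower estimate of the Lipschitz constant."""
    z = np.linspace(-radius, radius, num)
    vals = signed_power(z, t)
    return float(np.max(np.abs(np.diff(vals)) / np.diff(z)))
```

For an exponent $t\ge 1$ the quotient stays below $t\cdot \text{radius}^{t-1}$ however fine the grid is, whereas for $t\in ]0,1[$ it grows without bound as the grid is refined around $0$.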

Proof of Lemma 5.4

Since $p\in ]1,2]$, we have $\frac{q}{p}\ge 1$. For any $p\in ]1,2]$, it is clear that the function $z\rightarrow \text{ sign }(z) |z|^{\frac{q}{p}}$ is continuously differentiable on $\mathbb {R}$ and hence locally Lipschitz. Thus, for $\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}^0_J)$, there exist $\delta _{J},\kappa _{J}>0$ such that for all $\varvec{\xi }_{J,1}, \varvec{\xi }_{J,2} \in \mathbb {U}_{\delta _{J}}\big (\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}^0_J)\big )$,

$$\begin{aligned} \left\| \varvec{\varphi }_{\frac{q}{p}}(\varvec{\xi }_{J,1}) - \varvec{\varphi }_{\frac{q}{p}}(\varvec{\xi }_{J,2})\right\| \le \kappa _{J} \Vert \varvec{\xi }_{J,1}-\varvec{\xi }_{J,2}\Vert . \end{aligned}$$

(A.15)

Next, we will show that there exists $\epsilon _J>0$ such that for all $\varvec{x}_J\in \mathbb {U}_{\epsilon _J}(\varvec{x}_J^0)$,

$$\begin{aligned} \left\| \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)-\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}^0_J)\right\| \le \delta _{J}, \end{aligned}$$

(A.16)

and

$$\begin{aligned} \left\| \varvec{\mathcal {T}}_{\lambda }(\varvec{s}_J^0) {w_J}^{-1} {\Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\Vert _q}-\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}^0_J)\right\| \le \delta _{J}. \end{aligned}$$

(A.17)

Since $p\in ]1,2]$, we have $\frac{p}{q}\in ]0, 1]$. Since the function $z\rightarrow \text{ sign }(z) |z|^{t}$ is continuous for any fixed $t\in ]0, 1]$, there exists $\epsilon _{J,1}>0$ such that (A.16) holds for all $\varvec{x}_J\in \mathbb {U}_{\epsilon _{J,1}}(\varvec{x}_J^0)$. Next, let us consider (A.17). If $\varvec{x}^0_J=\varvec{0}$, we have $\lim _{\varvec{x}_J\rightarrow \varvec{x}^0_J}\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)=\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}^0_J)=\varvec{0}$ since the function $z\rightarrow \text{ sign }(z) |z|^{\frac{p}{q}}$ is continuous. Hence, in this case, we obtain

$$\begin{aligned} \lim _{\varvec{x}_J\rightarrow \varvec{x}^0_J} \left\| \varvec{\mathcal {T}}_{\lambda }(\varvec{s}_J^0) {w_J}^{-1} {\Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\Vert _q}-\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}^0_J)\right\| = 0. \end{aligned}$$

(A.18)

If $\varvec{x}^0_J \ne \varvec{0}$, then by Lemma 5.2 we have $\varvec{\mathcal {T}}_{\lambda }(\varvec{s}_J^0)=w_J {\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}^0_J)}\big /{\big \Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}^0_J)\big \Vert _q}$. Considering the continuity of the function $z\rightarrow \text{ sign }(z) |z|^{\frac{p}{q}}$, we have $\lim _{\varvec{x}_J\rightarrow \varvec{x}^0_J} \varvec{\mathcal {T}}_{\lambda }(\varvec{s}_J^0) {w_J}^{-1} {\Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\Vert _q}=\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}^0_J)$. In either case, we have (A.18). Thus, there exists $\epsilon _{J,2}>0$ such that (A.17) holds for all $\varvec{x}_J\in \mathbb {U}_{\epsilon _{J,2}}(\varvec{x}_J^0)$.

Let $\epsilon _J:=\min \{\epsilon _{J,1},\epsilon _{J,2}\}$. For any $\varvec{x}_J\in \mathbb {U}_{\epsilon _{J}}(\varvec{x}_J^0)$, we have (A.16) and (A.17). With (A.15) and (17), we obtain that (35) is satisfied for all $\varvec{x}_J\in \mathbb {U}_{\epsilon _{J}}(\varvec{x}_J^0)$. $\square $

Proof of Lemma 5.5

Let $j\in J$ be arbitrary. Since $\varvec{x}_J\ne \varvec{0}$, we have

$$\begin{aligned} \partial \Vert \varvec{x}_J\Vert _p=\frac{\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)}{\big \Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\big \Vert _q}. \end{aligned}$$

(i) In the case where $(\varvec{s}_J^0)_j>\lambda $, we have $\big (\varvec{\mathcal {T}}_{\lambda }(\varvec{s}^0_J)\big )_j=(\varvec{s}_J^0)_j-\lambda >0$. If $(\varvec{x}_J)_j>0$, then $\big (\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\big )_j>0$ and $(\partial \Vert \varvec{x}_J\Vert _1)_j=1$. Clearly, we have

$$\begin{aligned} \Bigg (\varvec{\mathcal {T}}_{\lambda }(\varvec{s}^0_J)-w_J \frac{\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)}{\big \Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\big \Vert _q}\Bigg )_j = \Bigg (\varvec{s}_J^0-w_J \frac{\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)}{\big \Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\big \Vert _q}-\lambda \partial \Vert \varvec{x}_J\Vert _1\Bigg )_j. \end{aligned}$$

If $(\varvec{x}_J)_j<0$, then $\big (\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\big )_j<0$ and $(\partial \Vert \varvec{x}_J\Vert _1)_j=-1$. Clearly, we have

$$\begin{aligned} 0<\Bigg (\varvec{\mathcal {T}}_{\lambda }(\varvec{s}^0_J)-w_J \frac{\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)}{\big \Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\big \Vert _q}\Bigg )_j < \Bigg (\varvec{s}_J^0-w_J \frac{\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)}{\big \Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\big \Vert _q}-\lambda \partial \Vert \varvec{x}_J\Vert _1\Bigg )_j. \end{aligned}$$

If $(\varvec{x}_J)_j=0$, then $\big (\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\big )_j=0$ and $(\partial \Vert \varvec{x}_J\Vert _1)_j=[-1,1]$. Clearly, for any $\varvec{y}_J\in \partial \Vert \varvec{x}_J\Vert _1$,

$$\begin{aligned} 0<\Bigg (\varvec{\mathcal {T}}_{\lambda }(\varvec{s}^0_J)-w_J \frac{\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)}{\big \Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\big \Vert _q}\Bigg )_j \le \Bigg (\varvec{s}_J^0-w_J \frac{\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)}{\big \Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\big \Vert _q}-\lambda \varvec{y}_J\Bigg )_j. \end{aligned}$$

(ii) In the case where $(\varvec{s}_J^0)_j<-\lambda $, we have $\big (\varvec{\mathcal {T}}_{\lambda }(\varvec{s}^0_J)\big )_j=(\varvec{s}_J^0)_j+\lambda <0$. If $(\varvec{x}_J)_j>0$, then $\big (\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\big )_j>0$ and $(\partial \Vert \varvec{x}_J\Vert _1)_j=1$. Clearly, we have

$$\begin{aligned} 0>\Bigg (\varvec{\mathcal {T}}_{\lambda }(\varvec{s}^0_J)-w_J \frac{\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)}{\big \Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\big \Vert _q}\Bigg )_j > \Bigg (\varvec{s}_J^0-w_J \frac{\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)}{\big \Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\big \Vert _q}-\lambda \partial \Vert \varvec{x}_J\Vert _1\Bigg )_j. \end{aligned}$$

If $(\varvec{x}_J)_j<0$, then $\big (\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\big )_j<0$ and $(\partial \Vert \varvec{x}_J\Vert _1)_j=-1$. Clearly, we have

$$\begin{aligned} \Bigg (\varvec{\mathcal {T}}_{\lambda }(\varvec{s}^0_J)-w_J \frac{\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)}{\big \Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\big \Vert _q}\Bigg )_j = \Bigg (\varvec{s}_J^0-w_J \frac{\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)}{\big \Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\big \Vert _q}-\lambda \partial \Vert \varvec{x}_J\Vert _1\Bigg )_j. \end{aligned}$$

If $(\varvec{x}_J)_j=0$, then $\big (\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\big )_j=0$ and $(\partial \Vert \varvec{x}_J\Vert _1)_j=[-1,1]$. Clearly, for any $\varvec{y}_J\in \partial \Vert \varvec{x}_J\Vert _1$,

$$\begin{aligned} 0 > \Bigg (\varvec{\mathcal {T}}_{\lambda }(\varvec{s}^0_J)-w_J \frac{\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)}{\big \Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\big \Vert _q}\Bigg )_j \ge \Bigg (\varvec{s}_J^0-w_J \frac{\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)}{\big \Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\big \Vert _q}-\lambda \varvec{y}_J\Bigg )_j. \end{aligned}$$

(iii) In the case where $(\varvec{s}_J^0)_j\in [-\lambda , \lambda ]$, we have $\big (\varvec{\mathcal {T}}_{\lambda }(\varvec{s}^0_J)\big )_j=0$. If $(\varvec{x}_J)_j>0$, then $\big (\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\big )_j>0$ and $(\partial \Vert \varvec{x}_J\Vert _1)_j=1$. Clearly, we have

$$\begin{aligned} 0>\Bigg (\varvec{\mathcal {T}}_{\lambda }(\varvec{s}^0_J)-w_J \frac{\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)}{\big \Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\big \Vert _q}\Bigg )_j \ge \Bigg (\varvec{s}_J^0-w_J \frac{\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)}{\big \Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\big \Vert _q}-\lambda \partial \Vert \varvec{x}_J\Vert _1\Bigg )_j. \end{aligned}$$

If $(\varvec{x}_J)_j<0$, then $\big (\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\big )_j<0$ and $(\partial \Vert \varvec{x}_J\Vert _1)_j=-1$. Clearly, we have

$$\begin{aligned} 0<\Bigg (\varvec{\mathcal {T}}_{\lambda }(\varvec{s}^0_J)-w_J \frac{\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)}{\big \Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\big \Vert _q}\Bigg )_j \le \Bigg (\varvec{s}_J^0-w_J \frac{\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)}{\big \Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\big \Vert _q}-\lambda \partial \Vert \varvec{x}_J\Vert _1\Bigg )_j. \end{aligned}$$

If $(\varvec{x}_J)_j=0$, then $\big (\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\big )_j=0$. Hence, we have

$$\begin{aligned} \Bigg (\varvec{\mathcal {T}}_{\lambda }(\varvec{s}^0_J)-w_J \frac{\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)}{\big \Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\big \Vert _q}\Bigg )_j=0. \end{aligned}$$

From the above analysis, in every case and for any $\varvec{y}_J\in \partial \Vert \varvec{x}_J\Vert _1$, we have

$$\begin{aligned} \Bigg |\Bigg (\varvec{\mathcal {T}}_{\lambda }(\varvec{s}^0_J)-w_J \frac{\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)}{\big \Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\big \Vert _q}\Bigg )_j\Bigg | \le \Bigg |\Bigg (\varvec{s}_J^0-w_J \frac{\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)}{\big \Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\big \Vert _q}-\lambda \varvec{y}_J\Bigg )_j\Bigg |,~~\forall ~j\in J. \end{aligned}$$

Thus, we obtain (36). This completes the proof. $\square $
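The component-wise inequality displayed just before the end of this proof can be spot-checked numerically. The sketch below is not part of the original proof. It assumes that $\varvec{\mathcal {T}}_{\lambda }$ is the usual soft-thresholding operator (consistent with the three cases above) and that $\varvec{\varphi }_{a}$ acts component-wise as $\mathrm{sign}(t)|t|^{a}$; the function names are hypothetical.

```python
import math
import random


def soft_threshold(s, lam):
    # Assumed form of T_lambda: shrink each entry towards zero by lam.
    return [math.copysign(max(abs(t) - lam, 0.0), t) for t in s]


def phi(x, a):
    # Assumed form of varphi_a: signed component-wise power sign(t)*|t|**a.
    return [math.copysign(abs(t) ** a, t) for t in x]


def q_norm(v, q):
    return sum(abs(t) ** q for t in v) ** (1.0 / q)


def max_violation(p, w, lam, trials=2000, dim=5, seed=0):
    """Largest observed value of |lhs_j| - |rhs_j| over random samples.

    lhs_j = (T_lam(s) - w*phi(x)/||phi(x)||_q)_j
    rhs_j = (s - w*phi(x)/||phi(x)||_q - lam*y)_j, with y in the l1 subdifferential at x.
    A non-positive result (up to rounding) is consistent with the inequality.
    """
    rng = random.Random(seed)
    q = p / (p - 1.0)
    worst = -math.inf
    for _ in range(trials):
        s = [rng.gauss(0.0, 2.0) for _ in range(dim)]
        # Force some exact zeros in x so the case x_j = 0 is exercised.
        x = [0.0 if rng.random() < 0.3 else rng.gauss(0.0, 1.0) for _ in range(dim)]
        if not any(x):
            continue  # the normalised vector is undefined at x = 0
        # Subgradient of the l1 norm: sign(x_j), or any value in [-1, 1] if x_j = 0.
        y = [math.copysign(1.0, t) if t != 0.0 else rng.uniform(-1.0, 1.0) for t in x]
        v = phi(x, p / q)
        n = q_norm(v, q)
        t_s = soft_threshold(s, lam)
        for j in range(dim):
            u = w * v[j] / n
            lhs = abs(t_s[j] - u)
            rhs = abs(s[j] - u - lam * y[j])
            worst = max(worst, lhs - rhs)
    return worst
```

Calling `max_violation(1.5, 2.0, 1.0)` should return a value that is zero or negative up to floating-point rounding; this is a sanity check on random samples, not a substitute for the case analysis.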

Proof of Proposition 5.3

Let $\left( \varvec{x}^0_J, \varvec{s}^0_J \right) \in \mathrm{gph}\, (\partial g_J)$ be arbitrary. Then

$$\begin{aligned} \varvec{s}_J^0 \in w_J \partial \Vert \varvec{x}_J^0\Vert _p + \lambda \partial \Vert \varvec{x}_J^0\Vert _1. \end{aligned}$$

(A.19)

We consider the following five cases: (i) $p=1$; (ii) $p \in ]1, 2]$, $w_J=0$ and $\lambda = 0$; (iii) $p \in ]1, 2]$, $w_J>0$ and $\lambda =0$; (iv) $p \in ]1, 2]$, $w_J=0$ and $\lambda >0$; (v) $p \in ]1, 2]$, $w_J>0$ and $\lambda >0$.

(i) In this case, we have $g_J(\varvec{x}_J) = (w_J+\lambda ) \Vert \varvec{x}_J\Vert _1$ for all $\varvec{x}_J \in \mathbb {R}^{|J|}$. If $w_J + \lambda > 0$, then $g_J$ is a polyhedral convex function, and [39, Section 4.2] shows that $\partial g_J$ is metrically subregular at $\left( \varvec{x}_J^0, \varvec{s}_J^0 \right) $. Otherwise, $w_J+\lambda =0$ and $g_J( \varvec{x}_J ) \equiv 0$ for all $\varvec{x}_J \in \mathbb {R}^{|J|}$. By (A.19), we have $\varvec{s}_J^0 = \varvec{0}$. It follows from (26) that $\left( \partial g_J \right) ^{-1}\left( \varvec{s}^0_J\right) = \mathbb {R}^{|J|}$. Thus, for all $\epsilon _J > 0$ and $\varvec{x}_J \in \mathbb {U}_{\epsilon _J}(\varvec{x}_J^0)$,

$$\begin{aligned} \mathrm{{dist}}\left( \varvec{x}_J,(\partial g_J)^{-1}(\varvec{s}^0_J)\right) =\mathrm{{dist}}\big (\varvec{s}_J^0,\partial g_J(\varvec{x}_J)\big )=0. \end{aligned}$$

(A.20)

(ii) In this case, we have $g_J(\varvec{x}_J)\equiv 0$ for all $\varvec{x}_J\in \mathbb {R}^{|J|}$. Similar to the proof of case (i), we can show that (A.20) holds for all $\epsilon _J>0$ and $\varvec{x}_J\in \mathbb {U}_{\epsilon _J}(\varvec{x}_J^0)$.

(iii) In this case, we have $g_J(\varvec{x}_J) = w_J \Vert \varvec{x}_J\Vert _p$ for all $\varvec{x}_J \in \mathbb {R}^{|J|}$. It follows from (19) that either (iiia) $\Vert \varvec{s}_J^0\Vert _q< w_J$ or (iiib) $\Vert \varvec{s}_J^0\Vert _q=w_J$.

(iiia) If $\Vert \varvec{s}_J^0\Vert _q<w_J$, then we have $\big (w_J \partial \Vert \cdot \Vert _p\big )^{-1}(\varvec{s}_J^0)=\{\varvec{0}\}$ and hence $\varvec{x}_J^0=\varvec{0}$. Thus, for any $\varvec{x}_{J}\in \mathbb {R}^{|J|}$, we have

$$\begin{aligned} \mathrm{{dist}}\Big (\varvec{x}_J,(\partial g_J)^{-1}(\varvec{s}^0_J)\Big ) = \left\| \varvec{x}_J-\varvec{x}_J^0 \right\| . \end{aligned}$$

(A.21)

Set $ \epsilon _J {:}{=} \min _{\varvec{z} \in \mathbb {R}^{|J|}}\left\{ \left\| \varvec{s}_J^0-w_J \varvec{z} \right\| : \Vert \varvec{z}\Vert _q=1 \right\} , $ then $\epsilon _J>0$ due to $\Vert \varvec{s}_J^0\Vert _q<w_J$.

Let $\varvec{x}_J\in \mathbb {U}_{\epsilon _J}(\varvec{x}_J^0)$ be arbitrary. In the case where $\varvec{x}_J=\varvec{0}$, we can obtain (A.20) immediately. In the case where $\varvec{x}_J\ne \varvec{0}$, from (A.21), we have $ \mathrm{{dist}}\Big (\varvec{x}_J,(\partial g_J)^{-1}(\varvec{s}^0_J)\Big )=\Vert \varvec{x}_J-\varvec{x}_J^0\Vert \le \epsilon _J. $ From (19), we have $\partial \Vert \varvec{x}_J\Vert _p={\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)}\big /{\big \Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\big \Vert _q}$. Since $\left\| {\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)}\big /{\big \Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\big \Vert _q}\right\| _q=1$, we have

$$\begin{aligned} \mathrm{{dist}}\big (\varvec{s}_J^0,\partial g_J(\varvec{x}_J)\big )=\left\| \varvec{s}_J^0-w_J \frac{\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)}{\big \Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\big \Vert _q}\right\| \ge \epsilon _J. \end{aligned}$$

In either case, we obtain that for all $\varvec{x}_J\in \mathbb {U}_{\epsilon _J}(\varvec{x}_J^0)$,

$$\begin{aligned} \mathrm{{dist}}\Big (\varvec{x}_J,(\partial g_J)^{-1}(\varvec{s}^0_J)\Big )\le \mathrm{{dist}}\big (\varvec{s}_J^0,\partial g_J(\varvec{x}_J)\big ). \end{aligned}$$
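As an illustration of case (iiia), which is not part of the original proof, consider the special case $p=q=2$ and assume that $\Vert \cdot \Vert $ denotes the Euclidean norm. The reverse triangle inequality then gives the radius in closed form:

$$\begin{aligned} \epsilon _J=\min _{\Vert \varvec{z}\Vert _2=1}\left\| \varvec{s}_J^0-w_J \varvec{z}\right\| _2=w_J-\Vert \varvec{s}_J^0\Vert _2>0, \end{aligned}$$

where the minimum is attained at $\varvec{z}=\varvec{s}_J^0/\Vert \varvec{s}_J^0\Vert _2$ if $\varvec{s}_J^0\ne \varvec{0}$, and at any unit vector otherwise. Under the same assumption, for every $\varvec{x}_J\ne \varvec{0}$ with $\Vert \varvec{x}_J\Vert _2\le \epsilon _J$,

$$\begin{aligned} \mathrm{{dist}}\big (\varvec{s}_J^0,\partial g_J(\varvec{x}_J)\big )=\left\| \varvec{s}_J^0-w_J \frac{\varvec{x}_J}{\Vert \varvec{x}_J\Vert _2}\right\| _2\ge w_J-\Vert \varvec{s}_J^0\Vert _2=\epsilon _J\ge \Vert \varvec{x}_J\Vert _2=\mathrm{{dist}}\Big (\varvec{x}_J,(\partial g_J)^{-1}(\varvec{s}^0_J)\Big ), \end{aligned}$$

so the subregularity modulus can be taken equal to 1 in this special case.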

(iiib) If $\Vert \varvec{s}_J^0\Vert _q=w_J$, it is clear that

$$\begin{aligned} \varvec{s}_J^0\in w_J \partial \Vert \varvec{0}\Vert _p. \end{aligned}$$

(A.22)

In addition, from (29), we have

$$\begin{aligned} (\partial g_J)^{-1}(\varvec{s}^0_J)=\Big \{\alpha _J \varvec{\varphi }_{\frac{q}{p}}(\varvec{s}_J^0): \alpha _J\ge 0\Big \}. \end{aligned}$$

(A.23)

In the case where $\varvec{x}_J=\varvec{0}$, together with (A.22) and (A.23), we can obtain (A.20) immediately. Next, we focus on the case where $\varvec{x}_J\ne \varvec{0}$. First, from (19), we have $\partial \Vert \varvec{x}_J\Vert _p={\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)}\big /{\big \Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\big \Vert _q}$. Using (A.23), we have

$$\begin{aligned} \mathrm{{dist}}\left( \varvec{x}_J,(\partial g_J)^{-1}(\varvec{s}^0_J)\right)= & {} \min \left\{ \left\| \varvec{x}_J-\alpha _J \varvec{\varphi }_{\frac{q}{p}}(\varvec{s}_J^0)\right\| : \alpha _J\ge 0\right\} \nonumber \\\le & {} \left\| \varvec{x}_J-\left( {w_J}^{-1} {\Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\Vert _q}\right) ^{\frac{q}{p}} \varvec{\varphi }_{\frac{q}{p}}\big (\varvec{s}_J^0\big )\right\| . \end{aligned}$$

This, together with (17), yields that

$$\begin{aligned} \mathrm{{dist}}\Big (\varvec{x}_J,(\partial g_J)^{-1}(\varvec{s}^0_J)\Big )\le \Big \Vert \varvec{x}_J-\varvec{\varphi }_{\frac{q}{p}}\big (\varvec{s}_J^0 {w_J}^{-1} {\Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\Vert _q} \big )\Big \Vert . \end{aligned}$$

(A.24)
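The passage to (A.24) appears to move the nonnegative scalar ${w_J}^{-1}\Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\Vert _q$ inside $\varvec{\varphi }_{\frac{q}{p}}$, and the formula for $\partial \Vert \varvec{x}_J\Vert _p$ is used repeatedly. Both facts can be checked numerically under the assumption, not confirmed by this section, that $\varvec{\varphi }_{a}$ is the signed component-wise power $\mathrm{sign}(t)|t|^{a}$. The helper names below are hypothetical.

```python
import math


def phi(x, a):
    # Assumed form of varphi_a: sign(t) * |t|**a, applied to each component.
    return [math.copysign(abs(t) ** a, t) for t in x]


def r_norm(v, r):
    return sum(abs(t) ** r for t in v) ** (1.0 / r)


def pnorm_gradient(x, p):
    # Candidate formula for the gradient of ||x||_p at x != 0:
    # varphi_{p/q}(x) / ||varphi_{p/q}(x)||_q with 1/p + 1/q = 1.
    q = p / (p - 1.0)
    v = phi(x, p / q)
    n = r_norm(v, q)
    return [t / n for t in v]


def finite_difference_gradient(x, p, h=1e-6):
    # Central differences of the p-norm, used only as an independent reference.
    grad = []
    for j in range(len(x)):
        up = list(x)
        dn = list(x)
        up[j] += h
        dn[j] -= h
        grad.append((r_norm(up, p) - r_norm(dn, p)) / (2.0 * h))
    return grad


def scaling_gap(s, c, p):
    # Gap between c**(q/p) * varphi_{q/p}(s) and varphi_{q/p}(c * s) for c >= 0.
    q = p / (p - 1.0)
    a = q / p
    left = [c ** a * t for t in phi(s, a)]
    right = phi([c * t for t in s], a)
    return max(abs(l - r) for l, r in zip(left, right))
```

For example, with `p = 1.5` and `x = [1.0, -2.0, 0.5]`, `pnorm_gradient(x, p)` and `finite_difference_gradient(x, p)` should agree to within the finite-difference error, and `scaling_gap` should be zero up to rounding.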

By Lemma 5.3, there exist $\epsilon _J,\kappa _{J,1}>0$ such that for all $\varvec{x}_J\in \mathbb {U}_{\epsilon _J}(\varvec{x}_J^0)$,

$$\begin{aligned} \Big \Vert \varvec{x}_J-\varvec{\varphi }_{\frac{q}{p}}\big (\varvec{s}_J^0 {w_J}^{-1} {\Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\Vert _q}\big )\Big \Vert \le \kappa _{J,1} \Big \Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)-\varvec{s}_J^0 {w_J}^{-1} {\Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\Vert _q}\Big \Vert ,\nonumber \\ \end{aligned}$$

(A.25)

and moreover, there must exist $\kappa _{J,2}>0$ such that

$$\begin{aligned} \big \Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J) \big \Vert _q \le \kappa _{J,2}, ~~\forall ~\varvec{x}_J\in \mathbb {U}_{\epsilon _J}(\varvec{x}_J^0). \end{aligned}$$

(A.26)

Let $\kappa _J{:}{=}\kappa _{J,1} \kappa _{J,2} {w_J}^{-1}$, then $\kappa _J>0$. Thus, for all $\varvec{x}_J\in \mathbb {U}_{\epsilon _J}(\varvec{x}_J^0)$, we have

$$\begin{aligned} \mathrm{{dist}}\Big (\varvec{x}_J,(\partial g_J)^{-1}(\varvec{s}^0_J)\Big )\le & {} \Big \Vert \varvec{x}_J-\varvec{\varphi }_{\frac{q}{p}}\big ({w_J}^{-1} {\Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\Vert _q} \varvec{s}_J^0\big )\Big \Vert \nonumber \\\le & {} \kappa _{J,1} \Big \Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)-{w_J}^{-1} {\Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\Vert _q} \varvec{s}_J^0\Big \Vert \nonumber \\\le & {} \kappa _{J,1} \kappa _{J,2} {w_J}^{-1} \Big \Vert w_J {\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)}\big /{\big \Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\big \Vert _q}-\varvec{s}_J^0\Big \Vert \nonumber \\= & {} \kappa _J \mathrm{{dist}}\big (\varvec{s}_J^0,\partial g_J(\varvec{x}_J)\big ), \end{aligned}$$

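where the first inequality follows from (A.24), the second from (A.25), the third from (A.26) after factoring out ${w_J}^{-1} \Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\Vert _q$, and the final equality from the definition of $\kappa _J$ together with the expression for $\partial \Vert \varvec{x}_J\Vert _p$ given by (19).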
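The estimate of case (iiib) can be explored empirically. The sketch below is an illustration under stated assumptions rather than the authors' method: it takes $\Vert \cdot \Vert $ to be Euclidean, takes $\varvec{\varphi }_{a}$ to be the signed component-wise power, and uses the description (A.23) of $(\partial g_J)^{-1}(\varvec{s}^0_J)$ as a ray. It samples points near $\varvec{x}_J^0$ and reports the largest observed ratio of the two distances. All function names are hypothetical.

```python
import math
import random


def phi(x, a):
    # Assumed form of varphi_a: sign(t) * |t|**a, applied to each component.
    return [math.copysign(abs(t) ** a, t) for t in x]


def r_norm(v, r=2.0):
    return sum(abs(t) ** r for t in v) ** (1.0 / r)


def dist_to_ray(x, d):
    # Euclidean distance from x to the ray {alpha * d : alpha >= 0}.
    alpha = max(0.0, sum(a * b for a, b in zip(x, d)) / sum(t * t for t in d))
    return r_norm([a - alpha * b for a, b in zip(x, d)])


def dist_to_subdiff(s0, x, w, p):
    # For x != 0 the subdifferential of w*||.||_p is the single point
    # w * varphi_{p/q}(x) / ||varphi_{p/q}(x)||_q.
    q = p / (p - 1.0)
    v = phi(x, p / q)
    n = r_norm(v, q)
    return r_norm([a - w * b / n for a, b in zip(s0, v)])


def worst_ratio(s0, x0, w, p, eps, trials=5000, seed=0):
    rng = random.Random(seed)
    q = p / (p - 1.0)
    d = phi(s0, q / p)  # direction of the solution ray, as in (A.23)
    half_side = eps / math.sqrt(len(x0))  # cube inscribed in the eps-ball
    worst = 0.0
    for _ in range(trials):
        x = [c + rng.uniform(-1.0, 1.0) * half_side for c in x0]
        if not any(x):
            continue
        den = dist_to_subdiff(s0, x, w, p)
        if den < 1e-9:
            continue  # x lies (numerically) on the ray; both distances vanish
        worst = max(worst, dist_to_ray(x, d) / den)
    return worst
```

For $p=2$ an elementary angle computation bounds the ratio by $\Vert \varvec{x}_J\Vert /w_J$, so with `s0 = [1.2, 1.6, 0.0]`, `x0 = [0.6, 0.8, 0.0]`, `w = 2.0` and `eps = 0.5` the returned value should not exceed $0.75$. For other $p\in ]1,2]$ the function only gives an empirical lower estimate of a valid $\kappa _J$.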

(iv) In this case, we have $g_J(\varvec{x}_J)=\lambda \Vert \varvec{x}_J\Vert _1$ for all $\varvec{x}_J\in \mathbb {R}^{|J|}$. Since $\lambda >0$, $g_J$ is a polyhedral convex function, so, as in the first part of case (i), [39, Section 4.2] shows that $\partial g_J$ is metrically subregular at $\left( \varvec{x}_J^0, \varvec{s}_J^0 \right) $.

(v) In this case, we have $g_J(\varvec{x}_J)=w_J \Vert \varvec{x}_J\Vert _p+\lambda \Vert \varvec{x}_J\Vert _1$ for all $\varvec{x}_J\in \mathbb {R}^{|J|}$. It follows from (31) that either (va) $\Vert \varvec{\mathcal {T}}_{\lambda }(\varvec{s}_J^0)\Vert _q< w_J$ or (vb) $\Vert \varvec{\mathcal {T}}_{\lambda }(\varvec{s}_J^0)\Vert _q=w_J$.

(va) If $\Vert \varvec{\mathcal {T}}_{\lambda }(\varvec{s}_J^0)\Vert _q<w_J$, then from (31) we have

$$\begin{aligned} (\partial g_J)^{-1}(\varvec{s}^0_J)=\{\varvec{0}\}, \end{aligned}$$

(A.27)

and hence, $\varvec{x}_J^0 = \varvec{0}$. Thus, for any $\varvec{x}_{J}\in \mathbb {R}^{|J|}$, we have

$$\begin{aligned} \mathrm{{dist}}\left( \varvec{x}_J, (\partial g_J)^{-1}(\varvec{s}^0_J) \right) = {\left\| \varvec{x}_J-\varvec{x}_J^0 \right\| } = \Vert \varvec{x}_J\Vert . \end{aligned}$$

(A.28)

In the case $\varvec{x}_J=\varvec{0}$, we have (A.20) immediately. In the case $\varvec{x}_J \ne \varvec{0}$, we set

$$\begin{aligned} \epsilon _J {:}{=} \min _{\varvec{z}_J,\varvec{y}_J\in \mathbb {R}^{|J|}} \Big \{{\left\| \varvec{s}_J^0-(w_J \varvec{z}_J+\lambda \varvec{y}_J) \right\| } : \Vert \varvec{z}_J\Vert _q=1, \varvec{y}_J\in \partial \Vert \varvec{z}_J\Vert _1 \Big \}. \end{aligned}$$

We claim $\epsilon _J>0$. Otherwise, if $\epsilon _J=0$, then there exist $\varvec{z}^0_J\in \mathbb {R}^{|J|}$ satisfying $\Vert \varvec{z}^0_J\Vert _q=1$ and $\varvec{y}^0_J\in \partial \Vert \varvec{z}^0_J\Vert _1$ such that $ \varvec{s}_J^0=w_J \varvec{z}^0_J+\lambda \varvec{y}^0_J. $ Since $\Vert \varvec{z}^0_J\Vert _q=1$, we know $\varvec{z}^0_J\ne \varvec{0}$. Take $\tilde{\varvec{x}}^0_J{:}{=}\varvec{\varphi }_{\frac{q}{p}}(\varvec{z}^0_J)$. It is easy to verify that $\varvec{z}^0_J={\varvec{\varphi }_{\frac{p}{q}}(\tilde{\varvec{x}}^0_J)}\big /{\big \Vert \varvec{\varphi }_{\frac{p}{q}}(\tilde{\varvec{x}}^0_J)\big \Vert _q}$. From (19), we have $\varvec{z}^0_J\in \partial \Vert \tilde{\varvec{x}}^0_J\Vert _p$. Moreover, we have $\partial \Vert \varvec{z}^0_J\Vert _1=\partial \Vert \tilde{\varvec{x}}^0_J\Vert _1$ since $\varvec{z}^0_J\ne \varvec{0}$. Thus, we have

$$\begin{aligned} \varvec{s}_J^0\in w_J \partial \Vert \tilde{\varvec{x}}^0_J \Vert _p + \lambda \partial \Vert \tilde{\varvec{x}}^0_J \Vert _1. \end{aligned}$$

Since $\tilde{\varvec{x}}^0_J\ne \varvec{0}$, this contradicts (A.27). Hence, we have $\epsilon _J>0$.

Since $\varvec{x}_J\ne \varvec{0}$, we have $\partial \Vert \varvec{x}_J\Vert _p={\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)}\big /{\big \Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\big \Vert _q}$. Moreover, we have $\partial \Vert \varvec{x}_J\Vert _1=\partial \Big \Vert {\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)}\big /{\big \Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\big \Vert _q}\Big \Vert _1$. Considering the definition of $\epsilon _J$, we can obtain $ \mathrm{{dist}}\big (\varvec{s}_J^0,\partial g_J(\varvec{x}_J)\big )\ge \epsilon _J. $ This, together with (A.28), yields that

$$\begin{aligned} \mathrm{{dist}}\Big (\varvec{x}_J,(\partial g_J)^{-1}(\varvec{s}^0_J)\Big )\le \mathrm{{dist}}\big (\varvec{s}_J^0,\partial g_J(\varvec{x}_J)\big ),~~\forall ~\varvec{x}_J\in \mathbb {U}_{\epsilon _J}(\varvec{x}_J^0). \end{aligned}$$

(vb) If $\Vert \varvec{\mathcal {T}}_{\lambda }(\varvec{s}^0_J)\Vert _q= w_J$, then from (31) we have

$$\begin{aligned} (\partial g_J)^{-1}(\varvec{s}^0_J)= \big \{\beta _J \varvec{\varphi }_{{\frac{q}{p}}}\big (\varvec{\mathcal {T}}_{\lambda }(\varvec{s}^0_J)\big )\in \mathbb {R}^{|J|}:\beta _J\ge 0\big \}. \end{aligned}$$

(A.29)

Clearly, we have $\varvec{0} \in (\partial g_J)^{-1}(\varvec{s}^0_J)$, namely

$$\begin{aligned} \varvec{s}_J^0\in w_J \partial \Vert \varvec{0}\Vert _p+\lambda \partial \Vert \varvec{0}\Vert _1. \end{aligned}$$
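The dichotomy between (va) and (vb) can be illustrated for $p=2$ by testing membership in $(\partial g_J)^{-1}(\varvec{s}^0_J)$ through the fixed-point relation $\varvec{x}_J=\mathrm{prox}_{g_J}(\varvec{x}_J+\varvec{s}_J^0)$. The sketch below is not taken from the paper. It relies on the known fact that, for $p=2$, the proximal map of $w_J\Vert \cdot \Vert _2+\lambda \Vert \cdot \Vert _1$ is soft-thresholding followed by group shrinkage, and it assumes that $\varvec{\mathcal {T}}_{\lambda }$ is soft-thresholding. The function names are hypothetical.

```python
import math


def soft_threshold(v, lam):
    # Assumed form of T_lambda: shrink each entry towards zero by lam.
    return [math.copysign(max(abs(t) - lam, 0.0), t) for t in v]


def prox_sparse_group_lasso(v, w, lam):
    # Prox of w*||.||_2 + lam*||.||_1 (p = 2 only): soft-threshold, then
    # shrink the whole block towards zero by w in Euclidean norm.
    t = soft_threshold(v, lam)
    n = math.sqrt(sum(c * c for c in t))
    if n <= w:
        return [0.0] * len(v)
    return [(1.0 - w / n) * c for c in t]


def in_inverse_subdiff(x, s, w, lam, tol=1e-10):
    # x belongs to (subdiff g)^{-1}(s) exactly when x = prox_g(x + s).
    z = prox_sparse_group_lasso([a + b for a, b in zip(x, s)], w, lam)
    return all(abs(a - b) <= tol for a, b in zip(x, z))
```

With `w = 5.0` and `lam = 1.0`, the vector `s = [3.0, -2.0, 0.5]` satisfies $\Vert \varvec{\mathcal {T}}_{\lambda }(\varvec{s})\Vert _2<w$, and only `x = [0.0, 0.0, 0.0]` passes the test, as in (A.27). For `s = [5.0, -4.0, 0.5]` one has $\varvec{\mathcal {T}}_{\lambda }(\varvec{s})=(4,-3,0)$ with norm exactly $w$, and points such as `x = [2.0, -1.5, 0.0]` on the ray in (A.29) pass as well.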

In the case where $\varvec{x}_J=\varvec{0}$, we can obtain (A.20) immediately. Next, we consider the case $\varvec{x}_J\ne \varvec{0}$. Using (A.29), we have

$$\begin{aligned} \mathrm{{dist}}\Big (\varvec{x}_J,(\partial g_J)^{-1}(\varvec{s}^0_J)\Big )= & {} \min \Big \{\big \Vert \varvec{x}_J-\beta _J \varvec{\varphi }_{{\frac{q}{p}}}\big (\varvec{\mathcal {T}}_{\lambda }(\varvec{s}^0_J)\big )\big \Vert : \beta _J\ge 0\Big \}\nonumber \\\le & {} \Big \Vert \varvec{x}_J-\left( {w_J}^{-1} {\Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\Vert _q}\right) ^{\frac{q}{p}} \varvec{\varphi }_{{\frac{q}{p}}}\big (\varvec{\mathcal {T}}_{\lambda }(\varvec{s}^0_J)\big )\Big \Vert . \end{aligned}$$

This, together with (17), yields

$$\begin{aligned} \mathrm{{dist}}\Big (\varvec{x}_J,(\partial g_J)^{-1}(\varvec{s}^0_J)\Big ) \le \Big \Vert \varvec{x}_J-\varvec{\varphi }_{\frac{q}{p}}\big (\varvec{\mathcal {T}}_{\lambda }(\varvec{s}^0_J) {w_J}^{-1} {\Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\Vert _q} \big )\Big \Vert . \end{aligned}$$

(A.30)

By Lemma 5.4, there exist $\epsilon _{J,1},\kappa _{J,1}>0$ such that for all $\varvec{x}_J\in \mathbb {U}_{\epsilon _{J,1}}(\varvec{x}_J^0)$,

$$\begin{aligned}&\Big \Vert \varvec{x}_J-\varvec{\varphi }_{\frac{q}{p}}\big (\varvec{\mathcal {T}}_{\lambda }(\varvec{s}_J^0) {w_J}^{-1} {\Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\Vert _q}\big )\Big \Vert \nonumber \\\le & {} \kappa _{J,1} \Big \Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)-\varvec{\mathcal {T}}_{\lambda }(\varvec{s}_J^0) {w_J}^{-1} {\Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\Vert _q}\Big \Vert , \end{aligned}$$

(A.31)

and moreover, there exists $\kappa _{J,2}>0$ such that for all $\varvec{x}_J\in \mathbb {U}_{\epsilon _{J,1}}(\varvec{x}_J^0)$,

$$\begin{aligned} \big \Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J) \big \Vert _q\le \kappa _{J,2}. \end{aligned}$$

(A.32)

Let $\epsilon _J{:}{=}\epsilon _{J,1}$ and $\kappa _J{:}{=}\kappa _{J,1} \kappa _{J,2} {w_J}^{-1}$; then $\kappa _J>0$. Thus, for all $\varvec{x}_J\in \mathbb {U}_{\epsilon _J}(\varvec{x}_J^0)$, we obtain

$$\begin{aligned} \mathrm{{dist}}\Big (\varvec{x}_J,(\partial g_J)^{-1}(\varvec{s}^0_J)\Big )\le & {} \Big \Vert \varvec{x}_J-\varvec{\varphi }_{\frac{q}{p}}\big (\varvec{\mathcal {T}}_{\lambda }(\varvec{s}^0_J) {w_J}^{-1} {\Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\Vert _q} \big )\Big \Vert \nonumber \\\le & {} \kappa _{J,1} \Big \Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)-\varvec{\mathcal {T}}_{\lambda }(\varvec{s}^0_J) {w_J}^{-1} {\Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\Vert _q}\Big \Vert \nonumber \\\le & {} \kappa _{J} \Big \Vert \varvec{\mathcal {T}}_{\lambda }(\varvec{s}^0_J)-w_J {\varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)}\big /{\big \Vert \varvec{\varphi }_{\frac{p}{q}}(\varvec{x}_J)\big \Vert _q}\Big \Vert \nonumber \\\le & {} \kappa _J \mathrm{{dist}}\big (\varvec{s}_J^0,\partial g_J(\varvec{x}_J)\big ), \end{aligned}$$

where the first inequality follows from (A.30), the second from (A.31), the third from (A.32) and the definition of $\kappa _J$, and the last from Lemma 5.5.

In summary, in every case, there exist $\epsilon _J,\kappa _{J}>0$ such that

$$\begin{aligned} \mathrm{{dist}}\Big (\varvec{x}_J,(\partial g_J)^{-1}(\varvec{s}^0_J)\Big ) \le \kappa _J \mathrm{{dist}}\big (\varvec{s}_J^0,\partial g_J(\varvec{x}_J)\big ),~~\forall ~\varvec{x}_J\in \mathbb {U}_{\epsilon _J}(\varvec{x}_J^0). \end{aligned}$$

Consequently, $\partial g_J$ is metrically subregular at $\left( \varvec{x}_J^0, \varvec{s}_J^0 \right) \in \mathrm{gph}\, (\partial g_J)$. $\square $

Cite this article

Zhang, J., Zhu, X. Linear Convergence of Prox-SVRG Method for Separable Non-smooth Convex Optimization Problems under Bounded Metric Subregularity. J Optim Theory Appl 192, 564–597 (2022). https://doi.org/10.1007/s10957-021-01978-w

Received: 08 April 2021
Accepted: 08 November 2021
Published: 06 January 2022
Issue Date: February 2022
DOI: https://doi.org/10.1007/s10957-021-01978-w
