Abstract
Support vector machines (SVMs) are widely recognized as a powerful tool for linear classification. When combined with a sparsity-inducing nonconvex penalty, SVMs can perform classification and variable selection simultaneously. However, nonconvex penalized SVMs generally cannot be solved globally and efficiently because of their nondifferentiability, nonconvexity, and nonsmoothness. Existing solvers for nonconvex penalized SVMs typically run serially and therefore cannot fully exploit the parallel computing power of modern multi-core machines. Moreover, much real-world data is stored in a distributed manner, which calls for a parallel and distributed solution to nonconvex penalized SVMs. To address this challenge, we propose an efficient algorithm based on the alternating direction method of multipliers (ADMM) that solves nonconvex penalized SVMs in a parallel and distributed way. We develop several techniques to reduce the computation and synchronization costs of the proposed parallel algorithm. A time complexity analysis shows that the proposed parallel algorithm has low time complexity. Moreover, the convergence of the parallel algorithm is guaranteed. Experimental evaluations on four LIBSVM benchmark datasets demonstrate the efficiency of the proposed parallel algorithm.
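To make the objective concrete, the following is a minimal sketch of a nonconvex penalized linear SVM objective: the average hinge loss plus a coordinate-wise sparsity-inducing penalty. The abstract does not say which penalties are used, so the SCAD and MCP penalties here are assumed as two common examples, and the function names and default parameters (`a=3.7`, `gamma=3.0`) are illustrative choices rather than the authors' settings. The sketch only evaluates the objective; it does not implement the proposed ADMM algorithm.

```python
def scad(t, lam, a=3.7):
    """SCAD penalty for one coefficient (assumed example of a nonconvex penalty)."""
    t = abs(t)
    if t <= lam:
        return lam * t  # behaves like the l1 penalty near zero
    if t <= a * lam:
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    return lam * lam * (a + 1) / 2  # constant for large coefficients


def mcp(t, lam, gamma=3.0):
    """MCP penalty for one coefficient (assumed example of a nonconvex penalty)."""
    t = abs(t)
    if t <= gamma * lam:
        return lam * t - t * t / (2 * gamma)
    return gamma * lam * lam / 2  # constant for large coefficients


def penalized_svm_objective(w, b, X, y, lam, penalty=scad):
    """Average hinge loss over the samples plus the summed penalty over w."""
    hinge = 0.0
    for x_i, y_i in zip(X, y):
        score = sum(w_j * x_ij for w_j, x_ij in zip(w, x_i)) + b
        hinge += max(0.0, 1.0 - y_i * score)
    hinge /= len(X)
    return hinge + sum(penalty(w_j, lam) for w_j in w)


# Hypothetical toy data: two samples, two features, labels in {+1, -1}.
X = [[1.0, 0.0], [-1.0, 0.0]]
y = [1, -1]
obj = penalized_svm_objective([2.0, 0.0], 0.0, X, y, lam=1.0)
```

Both penalties grow like the ℓ1 penalty near zero and flatten for large coefficients, which is what makes the overall problem nonconvex.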
Additional information
Project supported by the Major State Research Development Program, China (No. 2016YFB0201305)
Compliance with ethics guidelines
Lei GUAN, Tao SUN, Lin-bo QIAO, Zhi-hui YANG, Dong-sheng LI, Ke-shi GE, and Xi-cheng LU declare that they have no conflict of interest.
About this article
Cite this article
Guan, L., Sun, T., Qiao, Lb. et al. An efficient parallel and distributed solution to nonconvex penalized linear SVMs. Front Inform Technol Electron Eng 21, 587–603 (2020). https://doi.org/10.1631/FITEE.1800566
DOI: https://doi.org/10.1631/FITEE.1800566
Key words
- Linear classification
- Support vector machine (SVM)
- Nonconvex penalty
- Alternating direction method of multipliers (ADMM)
- Parallel algorithm