Abstract
Although attention has recently shifted toward non-convex problems, convex optimization remains important in machine learning, especially when an interpretable model is required. The solution to a convex problem is a global minimum, and the final model can be explained mathematically. Typically, the convex problem is recast as a regularized risk minimization problem to prevent overfitting. The cutting plane method (CPM) is one of the best solvers for convex problems, whether or not the objective function is differentiable. However, CPM and its variants fail to scale to large, data-intensive cases because these algorithms access the entire dataset in each iteration, which substantially increases the computational burden and memory cost. To alleviate this problem, we propose a novel algorithm named the mini-batch cutting plane method (MBCPM), which iterates with estimated cutting planes computed on a small batch of sampled data and is therefore capable of handling large-scale problems. Furthermore, the proposed MBCPM adopts a “sink” operation that detects and adjusts noisy estimations to guarantee convergence. Numerical experiments on extensive real-world datasets demonstrate the effectiveness of MBCPM, which outperforms bundle methods for regularized risk minimization as well as popular stochastic gradient descent methods in terms of convergence speed.
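To make the idea concrete, the following is a minimal sketch of a mini-batch cutting plane method for the regularized hinge-loss objective λ/2‖w‖² + R_emp(w). It is an illustrative reconstruction, not the paper's algorithm: it builds each cutting plane from a subgradient of the empirical risk estimated on a random mini-batch and minimizes the piecewise-linear lower model with a crude inner subgradient solver (a real CPM would solve the inner problem exactly, e.g. via its QP dual), and it omits the paper's "sink" correction for noisy planes. All names (`mbcpm`, `planes`) are ours.

```python
import numpy as np

def mbcpm(X, y, lam=0.1, batch_size=32, n_iter=50, seed=0):
    """Illustrative mini-batch cutting plane method for
        min_w  lam/2 ||w||^2 + (1/n) sum_i max(0, 1 - y_i x_i . w).

    Each iteration adds one plane (a, b) with R_batch(w) = a.w + b at the
    current w, i.e. a first-order lower model of the (estimated) risk, then
    minimizes the regularizer plus the max over all planes collected so far.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    planes = []  # each (a, b) models R(v) >= a.v + b
    for _ in range(n_iter):
        idx = rng.choice(n, size=min(batch_size, n), replace=False)
        Xb, yb = X[idx], y[idx]
        margins = yb * (Xb @ w)
        active = margins < 1.0  # points inside the hinge
        # subgradient of the batch risk at w, and intercept making the plane tight
        a = -(yb[active, None] * Xb[active]).sum(axis=0) / len(idx)
        risk = np.maximum(0.0, 1.0 - margins).mean()
        b = risk - a @ w
        planes.append((a, b))
        # crude inner solver: subgradient descent on lam/2||v||^2 + max_j(a_j.v + b_j)
        v = w.copy()
        for _ in range(200):
            vals = [aj @ v + bj for aj, bj in planes]
            aj, bj = planes[int(np.argmax(vals))]
            v -= 0.05 * (lam * v + aj)
        w = v
    return w
```

The key departure from classical CPM is that each plane is only an estimate of the full-data risk; with noisy planes the lower model can over- or under-estimate the true objective, which is the situation the paper's "sink" operation is designed to detect and repair.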
Ethics declarations
Meng-long LU, Lin-bo QIAO, Da-wei FENG, Dongsheng LI, and Xi-cheng LU declare that they have no conflict of interest.
Additional information
Project supported by the National Key R&D Program of China (No. 2018YFB0204300) and the National Natural Science Foundation of China (Nos. 61872376 and 61806216)
Cite this article
Lu, Ml., Qiao, Lb., Feng, Dw. et al. Mini-batch cutting plane method for regularized risk minimization. Front Inform Technol Electron Eng 20, 1551–1563 (2019). https://doi.org/10.1631/FITEE.1800596