A Smoothing Conjugate Gradient Algorithm for Sparse Logistic Regression Problems
DOI: 10.12677/AAM.2023.128364. Supported by the National Social Science Fund of China.
Authors: 李飘云, 韦潇鹏*, 唐敏笙 (School of Mathematics and Computational Science, Guilin University of Electronic Technology, Guilin, Guangxi)
Keywords: Sparse Logistic Regression, Conjugate Gradient Method, Smoothing Function
Abstract: Sparse logistic regression is a logistic regression model with a sparsity constraint, widely used in neural networks, machine learning, and bioinformatics. In this paper, based on the idea of approximating the l1-norm, six smoothing functions are used to approximate each component of the l1-norm in the sparse logistic regression model, transforming the problem into a smoothed unconstrained minimization problem. A conjugate gradient method is then designed to solve the approximated model, and a convergence analysis is given. Finally, numerical experiments comparing the method with four known algorithms for the sparse logistic regression model show that the conjugate gradient method is effective for solving the sparse logistic regression problem.
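To make the approach concrete, the sketch below illustrates the smoothing idea with one common smoothing function, phi_mu(t) = sqrt(t^2 + mu^2) ≈ |t| (one standard choice from the smoothing literature; the paper itself studies six such functions). Replacing each component |w_j| of the l1-norm with phi_mu(w_j) turns the model min_w sum_i log(1 + exp(-y_i x_i^T w)) + lambda * ||w||_1 into a smooth unconstrained problem, which is then minimized with a Polak-Ribière (PRP+) conjugate gradient iteration. The Armijo line search, the fixed smoothing parameter mu, and all function names here are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def smoothed_objective(w, X, y, lam, mu):
    """Logistic loss plus the smoothed l1 penalty sum_j sqrt(w_j^2 + mu^2)."""
    z = y * (X @ w)
    loss = np.sum(np.logaddexp(0.0, -z))       # sum_i log(1 + exp(-y_i x_i^T w))
    return loss + lam * np.sum(np.sqrt(w**2 + mu**2))

def smoothed_gradient(w, X, y, lam, mu):
    z = y * (X @ w)
    sigma = np.exp(-np.logaddexp(0.0, z))      # 1 / (1 + exp(z)), computed stably
    grad_loss = -X.T @ (y * sigma)
    grad_pen = lam * w / np.sqrt(w**2 + mu**2) # gradient of the smoothed penalty
    return grad_loss + grad_pen

def smoothing_prp_cg(X, y, lam=0.1, mu=1e-2, max_iter=500, tol=1e-6):
    """PRP+ conjugate gradient on the smoothed surrogate (fixed mu for simplicity)."""
    w = np.zeros(X.shape[1])
    g = smoothed_gradient(w, X, y, lam, mu)
    d = -g
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        # Armijo backtracking line search along the search direction d
        t, f0, slope = 1.0, smoothed_objective(w, X, y, lam, mu), g @ d
        while smoothed_objective(w + t * d, X, y, lam, mu) > f0 + 1e-4 * t * slope:
            t *= 0.5
            if t < 1e-12:
                break
        w_new = w + t * d
        g_new = smoothed_gradient(w_new, X, y, lam, mu)
        beta = max(0.0, g_new @ (g_new - g) / (g @ g))  # PRP+ coefficient
        d = -g_new + beta * d
        if g_new @ d >= 0:                              # restart if not a descent direction
            d = -g_new
        w, g = w_new, g_new
    return w

# Tiny usage example on synthetic sparse data
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 50))
w_true = np.zeros(50)
w_true[:5] = 1.0
y = np.sign(X @ w_true + 0.1 * rng.standard_normal(200))
w_hat = smoothing_prp_cg(X, y)
print("recovered support size:", int(np.sum(np.abs(w_hat) > 0.1)))
```

In practice, smoothing methods typically drive mu toward zero over the iterations so that the smooth surrogate approaches the original nonsmooth objective; the fixed mu above keeps the sketch minimal.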
Article citation: 李飘云, 韦潇鹏, 唐敏笙. A Smoothing Conjugate Gradient Algorithm for Sparse Logistic Regression Problems [J]. Advances in Applied Mathematics (应用数学进展), 2023, 12(8): 3665-3683. https://doi.org/10.12677/AAM.2023.128364
