
Feature selection in SVM via polyhedral k-norm

  • Original Paper
  • Published in Optimization Letters

Abstract

We treat the feature selection problem in the support vector machine (SVM) framework by adopting an optimization model based on the use of the \(\ell _0\) pseudo-norm. The objective is to control the number of non-zero components of the normal vector to the separating hyperplane, while maintaining satisfactory classification accuracy. In our model the polyhedral norm \(\Vert .\Vert _{[k]}\), intermediate between \(\Vert .\Vert _1\) and \(\Vert .\Vert _{\infty }\), plays a significant role, allowing us to arrive at a DC (difference of convex) optimization problem that is tackled by means of the DCA algorithm. The results of several numerical experiments on benchmark classification datasets are reported.
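The polyhedral k-norm \(\Vert x\Vert _{[k]}\) is the sum of the k largest absolute components of x, so \(\Vert x\Vert _{[1]} = \Vert x\Vert _{\infty }\) and \(\Vert x\Vert _{[n]} = \Vert x\Vert _1\). A key fact exploited in this line of work (see, e.g., [12, 34]) is that \(\Vert x\Vert _0 \le k\) exactly when \(\Vert x\Vert _1 - \Vert x\Vert _{[k]} = 0\), which expresses the cardinality condition as the difference of two convex polyhedral functions. A minimal NumPy sketch of these two quantities (illustrative only, not the paper's implementation):

```python
import numpy as np

def k_norm(x, k):
    """Polyhedral k-norm: sum of the k largest absolute components of x."""
    a = np.sort(np.abs(np.asarray(x, dtype=float)))[::-1]  # |x_i| sorted descending
    return a[:k].sum()

def dc_gap(x, k):
    """DC penalty ||x||_1 - ||x||_[k]: non-negative, and zero
    if and only if x has at most k non-zero components."""
    return np.abs(x).sum() - k_norm(x, k)

w = np.array([3.0, 0.0, -1.5, 0.0, 0.2])  # 3 non-zero components
print(k_norm(w, 1))   # 3.0, i.e. the infinity norm of w
print(k_norm(w, 5))   # 4.7, i.e. the 1-norm of w
print(dc_gap(w, 3))   # 0.0   -> w has at most 3 non-zeros
print(dc_gap(w, 2))   # ~0.2  -> w has more than 2 non-zeros
```

Penalizing `dc_gap` in the SVM objective thus drives the normal vector toward at most k non-zero components, which is the sparsity mechanism the model above builds on.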


Notes

  1. The symbol “*” in the Dataset column of Table 5 indicates that the parameter C has been set to 2.

References

  1. Amaldi, E., Kann, V.: On the approximability of minimizing nonzero variables or unsatisfied relations in linear systems. Theor. Comput. Sci. 209(1–2), 237–260 (1998)


  2. Bertolazzi, P., Felici, G., Festa, P., Fiscon, G., Weitschek, E.: Integer programming models for feature selection: new extensions and a randomized solution algorithm. Eur. J. Oper. Res. 250(2), 389–399 (2016)


  3. Bradley, P.S., Mangasarian, O.L., Street, W.N.: Feature selection via mathematical programming. INFORMS J. Comput. 10(2), 209–217 (1998)


  4. Bradley, P.S., Mangasarian, O.L.: Feature selection via concave minimization and support vector machines. In: Shavlik, J. (ed.) Proceedings of the Fifteenth International Conference on Machine Learning (ICML ’98), pp. 82–90. Morgan Kaufmann, San Francisco (1998)

  5. Cristianini, N., Shawe-Taylor, J.: An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods. Cambridge University Press, Cambridge (2000)


  6. Di Pillo, G., Grippo, L.: Exact penalty functions in constrained optimization. SIAM J. Control Optim. 27(6), 1333–1360 (1989)


  7. Dy, J.G., Brodley, C.E., Wrobel, S.: Feature selection for unsupervised learning. J. Mach. Learn. Res. 5, 845–889 (2004)


  8. Gasso, G., Rakotomamonjy, A., Canu, S.: Recovering sparse signals with a certain family of nonconvex penalties and DC programming. IEEE Trans. Signal Process. 57(12), 4686–4698 (2009)


  9. Gaudioso, M., Gorgone, E., Labbé, M., Rodríguez-Chía, A.M.: Lagrangian relaxation for SVM feature selection. Comput. Oper. Res. 87, 137–145 (2017)


  10. Gaudioso, M., Giallombardo, G., Miglionico, G.: Minimizing piecewise-concave functions over polytopes. Math. Oper. Res. 43(2), 580–597 (2018)


  11. Gaudioso, M., Giallombardo, G., Miglionico, G., Bagirov, A.M.: Minimizing nonsmooth DC functions via successive DC piecewise-affine approximations. J. Glob. Optim. 71(1), 37–55 (2018)


  12. Gotoh, J., Takeda, A., Tono, K.: DC formulations and algorithms for sparse optimization problems. Math. Program. Ser. B 169(1), 141–176 (2018)


  13. Guyon, I., Elisseeff, A.: An introduction to variable and feature selection. J. Mach. Learn. Res. 3, 1157–1182 (2003)


  14. Hastie, T., Tibshirani, R., Wainwright, M.: Statistical Learning with Sparsity: The Lasso and Generalizations. CRC Press, Boca Raton (2015)


  15. Hempel, A.B., Goulart, P.J.: A novel method for modelling cardinality and rank constraints. In: 53rd IEEE Conference on Decision and Control, Los Angeles, CA, USA, December 15–17, pp. 4322–4327 (2014)

  16. Hiriart-Urruty, J.-B.: Generalized Differentiability/Duality and Optimization for Problems Dealing with Differences of Convex Functions. Lecture Notes in Economics and Mathematical Systems, vol. 256, pp. 37–70. Springer, Berlin (1986)

  17. Hiriart-Urruty, J.-B., Ye, D.: Sensitivity analysis of all eigenvalues of a symmetric matrix. Numer. Math. 70(1), 45–72 (1995)

  18. Joki, K., Bagirov, A.M., Karmitsa, N., Mäkelä, M.M.: A proximal bundle method for nonsmooth DC optimization utilizing nonconvex cutting planes. J. Glob. Optim. 68, 501–535 (2017)


  19. Le Thi, H.A., Dinh, T.P.: The DC (difference of convex functions) programming and DCA revisited with DC models of real world nonconvex optimization problems. Ann. Oper. Res. 133, 23–46 (2005)

  20. Le Thi, H.A., Le, H.M., Nguyen, V.V., Dinh, T.P.: A DC programming approach for feature selection in support vector machines learning. Adv. Data Anal. Classif. 2, 259–278 (2008)


  21. Maldonado, S., Pérez, J., Weber, R., Labbé, M.: Feature selection for support vector machines via mixed integer linear programming. Inf. Sci. 279, 163–175 (2014)


  22. Mangasarian, O.L.: Nonlinear Programming. McGraw-Hill, New York (1969)


  23. Overton, M.L., Womersley, R.S.: Optimality conditions and duality theory for minimizing sums of the largest eigenvalues of symmetric matrices. Math. Program. 62(1–3), 321–357 (1993)


  24. Pilanci, M., Wainwright, M.J., El Ghaoui, L.: Sparse learning via Boolean relaxations. Math. Program. Ser. B 151, 63–87 (2015)


  25. Rinaldi, F., Schoen, F., Sciandrone, M.: Concave programming for minimizing the zero-norm over polyhedral sets. Comput. Optim. Appl. 46, 467–486 (2010)


  26. Strekalovsky, A.S.: Global optimality conditions for nonconvex optimization. J. Glob. Optim. 12, 415–434 (1998)


  27. Soubies, E., Blanc-Féraud, L., Aubert, G.: A unified view of exact continuous penalties for \(\ell _2\)-\(\ell _0\) minimization. SIAM J. Optim. 27(3), 2034–2060 (2017)


  28. Tibshirani, R.: Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B 58, 267–288 (1996)


  29. Vapnik, V.: The Nature of Statistical Learning Theory. Springer, Berlin (1995)

  30. Wang, L., Zhu, J., Zou, H.: The doubly regularized support vector machine. Stat. Sin. 16, 589–615 (2006)


  31. Watson, G.A.: Linear best approximation using a class of polyhedral norms. Numer. Algorithms 2, 321–336 (1992)


  32. Weston, J., Elisseeff, A., Schölkopf, B., Tipping, M.: Use of the zero-norm with linear models and kernel methods. J. Mach. Learn. Res. 3, 1439–1461 (2003)


  33. Wright, S.J.: Accelerated block-coordinate relaxation for regularized optimization. SIAM J. Optim. 22(1), 159–186 (2012)

  34. Wu, B., Ding, C., Sun, D., Toh, K.-C.: On the Moreau–Yosida regularization of the vector \(k\)-norm related functions. SIAM J. Optim. 24(2), 766–794 (2014)


  35. Zou, H., Hastie, T.: Regularization and variable selection via the elastic net. J. R. Stat. Soc. Ser. B 67, 301–320 (2005)



Acknowledgements

We are grateful to Francesco Rinaldi for providing us with the datasets Colon Cancer and Nova, which we used to compare our results with those in [25].

Author information

Correspondence to Manlio Gaudioso.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Gaudioso, M., Gorgone, E. & Hiriart-Urruty, JB. Feature selection in SVM via polyhedral k-norm. Optim Lett 14, 19–36 (2020). https://doi.org/10.1007/s11590-019-01482-1

