Multi-label feature selection via feature manifold learning and sparsity regularization

Cai, Zhiling; Zhu, William

doi:10.1007/s13042-017-0647-y

Original Article
Published: 01 March 2017

International Journal of Machine Learning and Cybernetics, Volume 9, pages 1321–1334 (2018)

1923 Accesses
78 Citations

Abstract

Multi-label learning deals with data in which each instance is associated with several labels simultaneously. Like traditional single-label learning, it suffers from the curse of dimensionality. Feature selection is an effective technique for improving learning efficiency on high-dimensional data. Building on the least squares regression model, we incorporate feature manifold learning and sparse regularization into a joint framework for multi-label feature selection. Graph regularization is used to exploit the geometric structure of the features, yielding a better regression coefficient matrix that reflects the importance of the individual features. In addition, the \(\ell _{2,1}\)-norm is used as the sparsity term to promote row-wise sparsity of the regression coefficients. We design an iterative updating algorithm with proven convergence to solve the resulting problem. The proposed method is validated on six publicly available data sets from real-world applications, and extensive experimental results demonstrate its superiority over the compared state-of-the-art multi-label feature selection methods.
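The abstract names the ingredients of the framework but not its exact formulation, so the following is only a minimal sketch under stated assumptions, not the authors' algorithm. It assumes an objective of the form \(\Vert XW-Y\Vert _F^2+\alpha \,\mathrm {tr}(W^{T}LW)+\beta \Vert W\Vert _{2,1}\), where \(L\) is the Laplacian of a heat-kernel graph built over the features (columns of \(X\)), and it solves that objective with a standard iteratively reweighted least squares update. The function names, the graph construction, the parameters `alpha` and `beta`, and the solver are all illustrative choices that do not come from the article.

```python
import numpy as np


def l21_norm(W):
    """l2,1-norm: sum of the Euclidean norms of the rows of W."""
    return float(np.sum(np.linalg.norm(W, axis=1)))


def feature_graph_laplacian(X):
    """Unnormalized Laplacian of a dense heat-kernel graph over features.

    Assumption: every column of X (one feature observed on all samples)
    is a node, and the kernel width is the median squared distance.
    Requires at least two features.
    """
    F = X.T                                        # (d, n), one row per feature
    sq = np.sum(F ** 2, axis=1)
    dist2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * F @ F.T, 0.0)
    width = np.median(dist2[np.triu_indices(F.shape[0], k=1)])
    S = np.exp(-dist2 / max(width, 1e-12))         # feature similarities
    np.fill_diagonal(S, 0.0)
    return np.diag(S.sum(axis=1)) - S


def objective(X, Y, W, L, alpha, beta):
    """Assumed objective: fit term + graph regularizer + l2,1 sparsity term."""
    fit = np.linalg.norm(X @ W - Y, "fro") ** 2
    smooth = np.trace(W.T @ L @ W)
    return fit + alpha * smooth + beta * l21_norm(W)


def select_features(X, Y, alpha=0.1, beta=1.0, n_iter=30, eps=1e-8):
    """Rank features by the row norms of W (hypothetical helper).

    Each iteration solves the reweighted problem in closed form and then
    refreshes the diagonal weights D_ii = 1 / (2 * ||w_i||).
    """
    d = X.shape[1]
    L = feature_graph_laplacian(X)
    D = np.eye(d)
    history = []
    for _ in range(n_iter):
        W = np.linalg.solve(X.T @ X + alpha * L + beta * D, X.T @ Y)
        row_norms = np.maximum(np.linalg.norm(W, axis=1), eps)
        D = np.diag(1.0 / (2.0 * row_norms))
        history.append(objective(X, Y, W, L, alpha, beta))
    ranking = np.argsort(-np.linalg.norm(W, axis=1))  # most important first
    return W, ranking, history


if __name__ == "__main__":
    # Synthetic data: only features 0 and 1 influence the three labels.
    rng = np.random.default_rng(0)
    X = rng.standard_normal((100, 10))
    B = np.zeros((10, 3))
    B[0] = [2.0, -1.0, 1.5]
    B[1] = [-1.5, 2.0, 1.0]
    Y = (X @ B > 0).astype(float)                  # binary label matrix
    W, ranking, history = select_features(X, Y)
    print("top features:", ranking[:3])
    print("objective, first and last iteration:", history[0], history[-1])
```

In this sketch the importance of a feature is read off as the Euclidean norm of its row in `W`, which is the usual way an \(\ell _{2,1}\) penalty is turned into a ranking; the graph term pulls the coefficient rows of similar features toward each other.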

Notes

http://mulan.sourceforge.net/datasets-mlc.html.

Acknowledgements

This work is supported in part by the National Natural Science Foundation of China under Grant Nos. 61379049 and 61379089.

Author information

Authors and Affiliations

Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
Zhiling Cai & William Zhu
Lab of Granular Computing, Minnan Normal University, Zhangzhou, China
Zhiling Cai

Corresponding author

Correspondence to William Zhu.

About this article

Cite this article

Cai, Z., Zhu, W. Multi-label feature selection via feature manifold learning and sparsity regularization. Int. J. Mach. Learn. & Cyber. 9, 1321–1334 (2018). https://doi.org/10.1007/s13042-017-0647-y

Received: 07 July 2016
Accepted: 20 January 2017
Published: 01 March 2017
Issue Date: August 2018
DOI: https://doi.org/10.1007/s13042-017-0647-y
