Neurocomputing, Volume 236, 2 May 2017, Pages 5-13
Unsupervised feature selection for visual classification via feature-representation property

https://doi.org/10.1016/j.neucom.2016.07.064

Abstract

Feature selection is designed to select a subset of features to avoid the issue of the 'curse of dimensionality'. In this paper, we propose a new feature-level self-representation framework for unsupervised feature selection. Specifically, the proposed method first uses a feature-level self-representation loss function to sparsely represent each feature by the other features, and then employs an ℓ2,p-norm regularization term to induce row sparsity on the coefficient matrix for conducting feature selection. Experimental results on benchmark databases showed that the proposed method selected more relevant features than the state-of-the-art methods.

Introduction

High-dimensional data lead to expensive computation and suffer from the issue of the 'curse of dimensionality', which degrades the performance of learning from the data [1], [2], [3]. Over the past decades, dimensionality reduction (including feature selection and subspace learning) has become an effective solution to high-dimensional data [4], [5].

Feature selection directly removes a subset of features to output interpretable results, which makes it practical in real applications [6]. Previous feature selection methods can be classified into three categories, i.e., supervised, semi-supervised, and unsupervised feature selection [1]. Supervised feature selection methods usually select features according to the labels of the training data. For example, Gu et al. proposed to seek a subset of features by maximizing the lower bound of the traditional Fisher score [4], while Zhang et al. proposed to use spectral-spatial feature combination for hyperspectral image analysis [7]. Since supervised feature selection methods exploit label information to conduct feature selection, they are able to select discriminative features.

Semi-supervised feature selection mainly utilizes a small number of labeled samples together with a large number of unlabeled samples in the training stage [8]. For example, Lv et al. employed a manifold regularization term to conduct discriminative semi-supervised feature selection [9]. Wang et al. proposed to first learn the class labels of unlabeled samples, and then use the learned class labels to define the margins for feature weight learning [10].

However, for various reasons, such as unknown labels or the time-consuming process of obtaining them, it is often difficult to collect enough labels for learning from data; unsupervised feature selection is thus a practical way to alleviate the impact of irrelevant features [11], [7]. Compared to either supervised or semi-supervised feature selection, unsupervised feature selection lacks label information, which makes it very challenging [12]. Recently, unsupervised feature selection methods have mainly utilized evaluation indicators to remove redundant features. For example, Liu et al. combined the Laplacian score with a distance-based entropy measure to conduct unsupervised feature selection [13], while Nie et al. proposed a score-based criterion to conduct feature selection [14].

In this paper, we propose a new unsupervised feature selection method that utilizes the property of feature self-representation, i.e., that features can represent themselves, to find representative feature ingredients. Motivated by the successful application of self-similarity in subspace clustering [15], [7], [16], this paper first proposes a feature-level self-representation for unsupervised learning, and then adds an ℓ2,1-norm regularizer to the objective function to yield sparse feature selection. The proposed loss function represents each feature by the other features, with the rationale that important features are frequently used to represent other features, while unimportant features are seldom used. The group sparsity (i.e., the ℓ2,1-norm regularization term) penalizes all coefficients in the same row of the regression matrix together, so that they are jointly selected or un-selected in predicting the response variables. Besides, this paper devises a novel and efficient optimization method to solve the resulting objective function and proves its convergence. It should be noted that self-representation is not a new concept; it has been popularly used in machine learning and computer vision, for example in sparse coding [17] and low-rank representation [18]. However, previous literature [19], [20] focused on sample-level self-similarity, where each sample is represented by all samples. In this paper, we instead represent each feature by its relevant features. That is, we conduct feature selection by devising a feature-level self-representation loss function.
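To make the formulation concrete, the general shape of such a self-representation objective (written here, as an assumption, with a squared Frobenius loss; the paper's exact loss and its ℓ2,p generalization are developed in Section 3) is

    \min_{W} \; \|X - XW\|_F^2 + \lambda \|W\|_{2,1}, \qquad \|W\|_{2,1} = \sum_{i=1}^{d} \|w^i\|_2,

where X ∈ R^{n×d} stacks n samples with d features, w^i denotes the i-th row of the coefficient matrix W, and λ > 0 balances the two terms. Features whose rows of W have large ℓ2-norms are heavily used to represent the other features and are therefore retained, while rows driven to zero mark features to discard.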

The contributions of our method are as follows:

  • Unlike previous unsupervised feature selection methods, which mainly utilize evaluation indicators to remove redundant features, we propose a novel feature-level self-representation to remove irrelevant features. The proposed feature-level self-representation differs from sample-level self-similarity, which represents each sample by all samples.

  • We propose a novel iterative optimization algorithm to solve the resulting objective function, which we also prove converges efficiently to the optimal solution (a sketch of this style of solver follows this list).
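As a concrete illustration of the kind of iterative solver that such ℓ2,1-regularized objectives admit, the following is a minimal sketch in Python of iteratively reweighted least squares (IRLS), assuming the squared Frobenius loss written above. It is not the authors' exact algorithm, whose derivation and convergence proof appear in Section 3.

```python
# Minimal IRLS sketch for  min_W ||X - X W||_F^2 + lam * ||W||_{2,1}.
# Assumption: squared Frobenius loss; NOT the paper's exact algorithm.
import numpy as np

def self_representation_fs(X, lam=1.0, n_iter=50, eps=1e-8):
    """X: (n_samples, d_features) array. Returns a (d x d) coefficient matrix W."""
    d = X.shape[1]
    G = X.T @ X                  # d x d Gram matrix, reused at every iteration
    W = np.eye(d)
    for _ in range(n_iter):
        # Standard IRLS reweighting: D is diagonal with d_ii = 1 / (2 ||w^i||_2);
        # eps guards against division by zero when a row collapses to zero.
        row_norms = np.sqrt((W ** 2).sum(axis=1)) + eps
        D = np.diag(1.0 / (2.0 * row_norms))
        # Closed-form update of the weighted subproblem:
        # (X^T X + lam * D) W = X^T X
        W = np.linalg.solve(G + lam * D, G)
    return W

def feature_scores(W):
    # Rank feature i by the l2-norm of the i-th row of W: features with large
    # rows are heavily used to represent the other features.
    return np.sqrt((W ** 2).sum(axis=1))
```

Each iteration fixes the row weights, solves a ridge-like linear system in closed form, and re-estimates the weights; this alternation is what the convergence analysis of such solvers typically exploits.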

The remainder of this paper is organized as follows: Section 2 introduces related work on feature selection methods and Section 3 gives the details of our proposed feature selection model. Sections 4 and 5 present our experimental results and conclude the paper, respectively.

Section snippets

Related work

Dimensionality reduction methods are usually divided into two groups: feature selection methods [21] and subspace learning methods [22], [6]. Feature selection methods are widely used to reduce the dimensions of high-dimensional data and output interpretable results [23], [24]. That is, feature selection methods select a subset of features in accordance with criteria, such as distinguishing features with good characteristics and correlating with the predefined goal. The state-of-the-art feature …

Approach

In this section, we first define the notations used in this paper, and then describe the details of the proposed method, followed by the proposed optimization method for solving the resulting objective function.

Experiments

In this section, we compare our proposed Self-Representation Feature Selection method (SR_FS for short) with several comparison methods in terms of classification performance. Specifically, we first used each dimensionality reduction method to map the original high-dimensional data into a low-dimensional space, and then used the resulting reduced data to conduct classification with a Support Vector Machine (SVM) via the LIBSVM toolbox. Then the …
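For illustration, a hypothetical version of this pipeline might look as follows, reusing the helper functions from the sketch in the Introduction. scikit-learn's SVC (which wraps LIBSVM) stands in for the LIBSVM toolbox, and the dataset, the number of selected features, and the SVM parameters are placeholder choices, not the paper's experimental setup.

```python
# Hypothetical end-to-end usage: score features via self-representation,
# keep the top-k, then classify with an SVM. Placeholder dataset and settings.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

W = self_representation_fs(X_tr, lam=1.0)          # from the earlier sketch
top_k = np.argsort(feature_scores(W))[::-1][:20]   # keep the 20 top-ranked features

clf = SVC(C=1.0, kernel="rbf").fit(X_tr[:, top_k], y_tr)
print("accuracy on selected features:", clf.score(X_te[:, top_k], y_te))
```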

Conclusion

In this paper, we have proposed a new feature selection method based on the feature self-representation property. Experimental results showed the advantages of the proposed method over the comparison methods on both binary and multi-class classification.

In real applications, high-dimensional data often contain missing values [5], [17]. In future work, we will extend the proposed method to conduct feature selection on high-dimensional data with missing values.

Acknowledgement

This work was supported in part by the National Natural Science Foundation of China (Grant Nos. 61263035, 61573270, 61450001 and 61363009), the China 973 Program (Grant No. 2013CB329404), the Guangxi Natural Science Foundation (Grant No. 2015GXNSFCB139011), the Guangxi Higher Institutions' Program of Introducing 100 High-Level Overseas Talents, the Guangxi Collaborative Innovation Center of Multi-Source Information Integration and Intelligent Processing, the Innovation Project of Guangxi Graduate …


References (47)

  • T. Bouwmans et al., Robust PCA via principal component pursuit: a review for a comparative evaluation in video surveillance, Comput. Vis. Image Underst. (2014).
  • Y. Xu et al., A novel local preserving projection scheme for use with face recognition, Expert Syst. Appl. (2010).
  • K.H. Thung et al., Neurodegenerative disease diagnosis using incomplete multi-modality data via matrix shrinkage and completion, NeuroImage (2014).
  • J. Zhang et al., Continuous rotation invariant local descriptors for texton dictionary-based texture classification, Comput. Vis. Image Underst. (2013).
  • J. Cao et al., Shilling attack detection utilizing semi-supervised learning method for collaborative recommender system, World Wide Web: Internet Web Inf. Syst. (2013).
  • Y. Qin et al., Semi-parametric optimization for missing data imputation, Appl. Intell. (2007).
  • X. Zhu, X. Li, S. Zhang, C. Ju, X. Wu, Robust joint graph sparse coding for unsupervised spectral feature selection, …
  • Q. Gu, Z. Li, J. Han, Generalized Fisher score for feature selection, in: UAI, 2012, pp. …
  • C. Zhang, Y. Qin, X. Zhu, J. Zhang, S. Zhang, Clustering-based missing value imputation for data preprocessing, in: …
  • Q. Zhang et al., Automatic spatial-spectral feature selection for hyperspectral image via discriminative sparse multimodal learning, IEEE Trans. Geosci. Remote Sens. (2015).
  • J. Zhang et al., Local energy pattern for texture classification using self-adaptive quantization thresholds, IEEE Trans. Image Process. (2013).
  • S. Lv, H. Jiang, L. Zhao, D. Wang, M. Fan, Manifold based Fisher method for semi-supervised feature selection, …
  • J.Y. Wang, J. Yao, Y. Sun, Semi-supervised local-learning-based feature selection, in: IJCNN, 2014, pp. …

Wei He is with the Guangxi Key Lab of Multi-source Information Mining & Security, Guangxi Normal University, Guilin, China. Email: [email protected].

Xiaofeng Zhu is with the Guangxi Key Lab of Multi-source Information Mining & Security, Guangxi Normal University, Guilin, China. Email: [email protected]. His research topics include feature selection and analysis, pattern recognition, and data mining.

Debo Cheng is with the Guangxi Key Lab of Multi-source Information Mining & Security, Guangxi Normal University, Guilin, China. Email: [email protected].

Rongyao Hu is with the Guangxi Key Lab of Multi-source Information Mining & Security, Guangxi Normal University, Guilin, China. Email: [email protected].

Shichao Zhang is a Distinguished Professor and the director of the Institute of the School of Computer Science and Information Technology at Guangxi Normal University, Guilin, China. He holds a Ph.D. degree in Computer Science from Deakin University, Australia. His research interests include data analysis and smart pattern discovery. He has published over 50 international journal papers and over 60 international conference papers. He has won over 10 nation-class grants, such as the China NSF, China 863 Program, China 973 Program, and Australia Large ARC. He is the Editor-in-Chief of the International Journal of Information Quality and Computing, and serves as an associate editor for IEEE Transactions on Knowledge and Data Engineering, Knowledge and Information Systems, and the IEEE Intelligent Informatics Bulletin. Email: [email protected].

1 Wei He and Debo Cheng contributed equally to this work.
