
Sparse Feature Learning

  • Chapter in Feature Learning and Understanding

Part of the book series: Information Fusion and Data Science (IFDS)

Abstract

Traditional linear feature extraction methods focus on either the global structure or the local structure of the data. Although these methods perform well to some extent in real applications, they still have limitations. In this chapter, sparse representation problems with different norm regularizations are first reviewed. Then the classical sparse learning method, i.e., Lasso, and its variants are introduced, which reduce the effect of outliers and produce sparse outputs. Several sparse feature learning methods based on generalized regression are then presented, including generalized robust regression (GRR), robust jointly sparse regression (RJSR), and locally joint sparse marginal embedding (LJSME). These methods not only preserve the local structure but also enhance robustness, owing to the use of the ℓ2,1-norm and feature selection. In addition, the traditional projection matrix is factorized into a new representation, i.e., the product of a projection matrix and an orthogonal rotation matrix, so that the small-class problem can be overcome.
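As a concrete illustration of the two ingredients the abstract names (this sketch is not from the chapter itself), the following NumPy code solves the Lasso problem min_w 0.5‖Xw − y‖² + λ‖w‖₁ by iterative soft thresholding (ISTA), and also shows the row-wise shrinkage operator behind ℓ2,1-norm regularization, which zeroes entire rows of a projection matrix and thereby performs joint feature selection. The data, names, and parameter values are all hypothetical.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1 (element-wise shrinkage)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_ista(X, y, lam, n_iter=500):
    """Minimise 0.5*||Xw - y||^2 + lam*||w||_1 by iterative
    shrinkage-thresholding (ISTA)."""
    L = np.linalg.norm(X, 2) ** 2        # Lipschitz constant of the gradient
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y)         # gradient of the quadratic term
        w = soft_threshold(w - grad / L, lam / L)
    return w

def row_soft_threshold(W, t):
    """Proximal operator of t * ||.||_{2,1}: shrink whole rows, so each row
    of a projection matrix is either kept or zeroed (feature selection)."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(1.0 - t / np.maximum(norms, 1e-12), 0.0)
    return W * scale

# Toy data: only features 2, 7 and 11 carry signal.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 20))
true_w = np.zeros(20)
true_w[[2, 7, 11]] = [1.5, -2.0, 1.0]
y = X @ true_w + 0.01 * rng.standard_normal(100)

w = lasso_ista(X, y, lam=1.0)
print(np.flatnonzero(np.abs(w) > 1e-3))  # typically the informative features
```

The key property both operators share is exact zeros: soft thresholding truncates small coefficients to zero rather than merely shrinking them, which is what distinguishes ℓ1/ℓ2,1 regularization from ridge regression and makes the learned model interpretable as a feature selector.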


References

  • Chen J, Huo X (2006) Theoretical results on sparse representations of multiple-measurement vectors. IEEE Trans Signal Process 54:4634–4643

  • Clemmensen L, Hastie T, Witten D, Ersbøll B (2011) Sparse discriminant analysis. Technometrics 53:406–413

  • Cui Y, Fan L (2012) A novel supervised dimensionality reduction algorithm: graph-based Fisher analysis. Pattern Recogn 45:1471–1481

  • Ding C, Zhou D, He X, Zha H (2006) R1-PCA: rotational invariant L1-norm principal component analysis for robust subspace factorization. In: Proceedings of the 23rd international conference on machine learning. ACM, New York

  • Donoho DL (2006) For most large underdetermined systems of linear equations the minimal ℓ1-norm solution is also the sparsest solution. Commun Pure Appl Math 59:797–829

  • Donoho DL, Elad M (2003) Optimally sparse representation in general (nonorthogonal) dictionaries via ℓ1 minimization. Proc Natl Acad Sci 100:2197–2202

  • Donoho DL, Tsaig Y, Drori I, Starck J-L (2012) Sparse solution of underdetermined systems of linear equations by stagewise orthogonal matching pursuit. IEEE Trans Inf Theory 58:1094–1121

  • Draper NR, Nostrand RCV (1979) Ridge regression and James-Stein estimation: review and comments. Technometrics 21:451–466

  • Efron B, Hastie T, Johnstone I, Tibshirani R et al (2004) Least angle regression. Ann Stat 32:407–499

  • He X (2003) Locality preserving projections. Adv Neural Inf Proces Syst 16:186–197

  • He X, Yan S, Hu Y, Niyogi P, Zhang H-J (2005b) Face recognition using Laplacianfaces. IEEE Trans Pattern Anal Mach Intell 27:328–340

  • Hou C, Nie F, Yi D, Wu Y (2011) Feature selection via joint embedding learning and sparse regression. In: Proceedings of the twenty-second international joint conference on artificial intelligence

  • Hou C, Nie F, Li X, Yi D, Wu Y (2014) Joint embedding learning and sparse regression: a framework for unsupervised feature selection. IEEE Trans Cybern 44:793–804

  • Kwak N (2008) Principal component analysis based on L1-norm maximization. IEEE Trans Pattern Anal Mach Intell 30:1672–1680

  • Lai Z, Xu Y, Yang J, Tang J, Zhang D (2013) Sparse tensor discriminant analysis. IEEE Trans Image Process 22:3904–3915

  • Lai Z, Mo D, Wen J, Shen L, Wong W (2018) Generalized robust regression for jointly sparse subspace learning. IEEE Trans Circuits Syst Video Technol 29(3):756–772

  • Li H, Jiang T, Zhang K (2006) Efficient and robust feature extraction by maximum margin criterion. IEEE Trans Neural Netw 17:157–165

  • Ma Z, Nie F, Yang Y, Uijlings JRR, Sebe N, Hauptmann AG (2012) Discriminating joint feature analysis for multimedia data understanding. IEEE Trans Multimedia 14:1662–1672

  • Martinez A, Benavente R (1998) The AR face database. CVC technical report

  • Mo D, Lai Z (2019) Robust jointly sparse regression with generalized orthogonal learning for image feature selection. Pattern Recogn 93:164–178

  • Mo D, Lai Z, Wong W (2019) Locally joint sparse marginal embedding for feature extraction. IEEE Trans Multimedia 21(12):3038–3052

  • Nie F, Huang H, Cai X, Ding C (2010a) Efficient and robust feature selection via joint ℓ2,1-norms minimization. In: Proceedings of advances in neural information processing systems, vol 23

  • Nie F, Xu D, Tsang IW-H, Zhang C (2010b) Flexible manifold embedding: a framework for semi-supervised and unsupervised dimension reduction. IEEE Trans Image Process 19:1921–1932

  • Pang Y, Yuan Y (2010) Outlier-resisting graph embedding. Neurocomputing 73:968–974

  • Pima I, Aladjem M (2004) Regularized discriminant analysis for face recognition. Pattern Recogn 37:1945–1948

  • Ren C-X, Dai D-Q, Yan H (2012) Robust classification using ℓ2,1-norm based regression model. Pattern Recogn 45:2708–2718

  • Roweis ST, Saul LK (2000) Nonlinear dimensionality reduction by locally linear embedding. Science 290(5500):2323–2326

  • Saab R, Chartrand R, Yilmaz O (2008) Stable sparse approximations via nonconvex optimization. In: 2008 IEEE international conference on acoustics, speech and signal processing

  • Shao J, Wang Y, Deng X, Wang S (2011) Sparse linear discriminant analysis with high dimensional data. Ann Stat 39(2):1241–1265

  • Sindhwani V, Niyogi P, Belkin M, Keerthi S (2005) Linear manifold regularization for large scale semi-supervised learning. In: Proceedings of the 22nd ICML workshop on learning with partially classified training data, vol 28

  • Tibshirani R (1996) Regression shrinkage and selection via the LASSO. J R Stat Soc Ser B Methodol 58:267–288

  • Tropp JA, Gilbert AC (2007) Signal recovery from random measurements via orthogonal matching pursuit. IEEE Trans Inf Theory 53:4655–4666

  • Wright J, Yang AY, Ganesh A, Sastry SS, Ma Y (2009b) Robust face recognition via sparse representation. IEEE Trans Pattern Anal Mach Intell 31:210–227

  • Xiang S, Nie F, Meng G, Pan C, Zhang C (2012) Discriminative least squares regression for multiclass classification and feature selection. IEEE Trans Neural Netw Learn Syst 23:1738–1754

  • Xu Z, Chang X, Xu F, Zhang H (2012) L1/2 regularization: a thresholding representation theory and a fast solver. IEEE Trans Neural Netw Learn Syst 23:1013–1027

  • Yang J, Yin W, Zhang Y, Wang Y (2009) A fast algorithm for edge-preserving variational multichannel image restoration. SIAM J Imaging Sci 2:569–592

  • Yang AY, Sastry SS, Ganesh A, Ma Y (2010) Fast ℓ1-minimization algorithms and an application in robust face recognition: a review. In: 2010 IEEE international conference on image processing

  • Yang Y, Shen HT, Ma Z, Huang Z, Zhou X (2011) ℓ2,1-norm regularized discriminative feature selection for unsupervised learning. In: Proceedings of the twenty-second international joint conference on artificial intelligence. AAAI Press

  • Yang Y, Ma Z, Hauptmann AG, Sebe N (2012) Feature selection for multimedia analysis by sharing information among multiple tasks. IEEE Trans Multimedia 15:661–669

  • Yang J, Chu D, Zhang L, Xu Y, Yang J (2013) Sparse representation classifier steered discriminative projection with applications to face recognition. IEEE Trans Neural Netw Learn Syst 24:1023–1035

  • Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. J R Stat Soc Ser B Stat Methodol 68:49–67

  • Zhang J, Yu J, Wan J, Zeng Z (2015) ℓ2,1-norm regularized Fisher criterion for optimal feature selection. Neurocomputing 166:455–463

  • Zhong F, Zhang J (2013) Linear discriminant analysis based on L1-norm maximization. IEEE Trans Image Process 22:3018–3027

  • Zou H (2006) The adaptive lasso and its oracle properties. J Am Stat Assoc 101:1418–1429

  • Zou H, Hastie T (2005) Regularization and variable selection via the elastic net. J R Stat Soc Ser B 67:301–320

  • Zou H, Hastie T, Tibshirani R (2006) Sparse principal component analysis. J Comput Graph Stat 15:265–286


Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter


Cite this chapter

Zhao, H., Lai, Z., Leung, H., Zhang, X. (2020). Sparse Feature Learning. In: Feature Learning and Understanding. Information Fusion and Data Science. Springer, Cham. https://doi.org/10.1007/978-3-030-40794-0_7
