Skip to main content
Log in

Robust graph regularized nonnegative matrix factorization for clustering

  • Published:
Data Mining and Knowledge Discovery Aims and scope Submit manuscript

Abstract

Nonnegative matrix factorization and its graph regularized extensions have received significant attention in machine learning and data mining. However, existing approaches are sensitive to outliers and noise due to the utilization of the squared loss function in measuring the quality of graph regularization and data reconstruction. In this paper, we present a novel robust graph regularized NMF model (RGNMF) to approximate the data matrix for clustering. Our assumption is that there may exist some entries of the data corrupted arbitrarily, but the corruption is sparse. To address this problem, an error matrix is introduced to capture the sparse corruption. With this sparse outlier matrix, a robust factorization result could be obtained since a much cleaned data could be reconstructed. Moreover, the \(\ell _{1}\)-norm function is used to alleviate the influence of unreliable regularization which is incurred by unexpected graphs. That is, the sparse error matrix alleviates the impact of noise and outliers, and the \(\ell _{1}\)-norm function leads to a faithful regularization since the influence of the unreliable regularization errors can be reduced. Thus, RGNMF is robust to unreliable graphs and noisy data. In order to solve the optimization problem of our method, an iterative updating algorithm is proposed and its convergence is also guaranteed theoretically. Experimental results show that the proposed method consistently outperforms many state-of-the-art methods.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3

Similar content being viewed by others

Notes

  1. http://archive.ics.uci.edu/ml/datasets.html

  2. http://glaros.dtc.umn.edu/gkhome/views/cluto/

References

  • Bach F, Jenatton R, Mairal J, Obozinski G (2012) Optimization with sparsity-inducing penalties. Found Trends Mach Learn 4(1):1–106

    Article  MATH  Google Scholar 

  • Belkin M, Niyogi P, Sindhwani V (2006) Manifold regularization: a geometric framework for learning from labeled and unlabeled examples. J Mach Learn Res 7:2399–2434

    MathSciNet  MATH  Google Scholar 

  • Blake C, Merz CJ (1998) UCI repository of machine learning databases. Department of Information and Computer Sciences, University of California. http://www.ics.uci.edu/~mlearn/MLRepository.html

  • Cai D, He X, Han J, Huang TS (2011) Graph regularized nonnegative matrix factorization for data representation. IEEE Trans Pattern Anal Mach Intell 33(8):1548–1560

    Article  Google Scholar 

  • Cai D, He X, Wu X, Han J (2008) Non-negative matrix factorization on manifold. yIn: IEEE International Conference on Data Mining, pp. 63–72

  • Candès EJ, Li X, Ma Y, Wright J (2011) Robust principal component analysis? J ACM 58(3):11

    Article  MathSciNet  MATH  Google Scholar 

  • Chen J, Wu L, Audhkhasi K, Kingsbury B, Ramabhadrari B (2016) Efficient one-vs-one kernel ridge regression for speech recognition. yIn: IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 2454–2458

  • Ding C, Li T, Jordan MI (2010) Convex and semi-nonnegative matrix factorizations. IEEE Trans Pattern Anal Mach Intell 32(1):45–55

    Article  Google Scholar 

  • Ding C, Li T, Peng W, Park H (2006) Orthogonal nonnegative matrix t-factorizations for clustering. In: SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 126–135

  • Du L, Shen YD (2013) Towards robust co-clustering. In: International Joint Conference on Artificial Intelligence, pp. 1317–1322

  • Gu Q, Ding C, Han J (2011) On trivial solution and scale transfer problems in graph regularized nmf. In: International Joint Conference on Artificial Intelligence, vol. 22, p. 1288

  • Gu Q, Zhou J (2009) Local learning regularized nonnegative matrix factorization. In: International Joint Conference on Artificial Intelligence

  • Hampel FR, Ronchetti EM, Rousseeuw PJ, Stahel WA (2011) Robust statistics: the approach based on influence functions, vol 114. John Wiley and Sons

  • Hoyer PO (2004) Non-negative matrix factorization with sparseness constraints. J Mach Learn Res 5:1457–1469

    MathSciNet  MATH  Google Scholar 

  • Huang J, Nie F, Huang H, Ding C (2014) Robust manifold nonnegative matrix factorization. ACM Trans Knowl Discov Data 8(3):11

    Article  Google Scholar 

  • Huang S, Xu Z, Wang F (2017) Nonnegative matrix factorization with adaptive neighbors. In: IEEE International Joint Conference on Neural Networks, pp. 486–493

  • Karypis G (2002) CLUTO-a clustering toolkit. Technical Report 02-017. Department of Computer Science, University of Minnesota. http://www.cs.umn.edu/cluto

  • Kong D, Ding C, Huang H (2011) Robust nonnegative matrix factorization using l21-norm. In: ACM International Conference on Information and Knowledge Management, pp. 673–682

  • Le Q, Sarlós T, Smola A (2013) Fastfood-approximating kernel expansions in loglinear time. In: International Conference on Machine Learning, vol. 85

  • Lee DD, Seung HS (2001) Algorithms for non-negative matrix factorization. In: Proceedings of Advances in Neural Information Processing Systems, pp. 556–562

  • Liu H, Li X, Zheng X (2013) Solving non-negative matrix factorization by alternating least squares with a modified strategy. Data Min Knowl Disc 26(3):435–451

    Article  MathSciNet  MATH  Google Scholar 

  • Liu Y, Jiao L, Shang F (2013) A fast tri-factorization method for low-rank matrix recovery and completion. Pattern Recogn 46(1):163–173

    Article  MATH  Google Scholar 

  • Long M, Wang J, Ding G, Shen D, Yang Q (2014) Transfer learning with graph co-regularization. IEEE Trans Knowl Data Eng 26(7):1805–1818

    Article  Google Scholar 

  • Pauca VP, Piper J, Plemmons RJ (2006) Nonnegative matrix factorization for spectral data analysis. Linear Algebra Appl 416(1):29–47

    Article  MathSciNet  MATH  Google Scholar 

  • Pauca VP, Shahnaz F, Berry MW, Plemmons RJ (2004) Text mining using non-negative matrix factorizations. In: SIAM International Conference on Data Mining, pp. 452–456

  • Seung HS, Lee DD (2000) The manifold ways of perception. Science 290(5500):2268–2269

    Article  Google Scholar 

  • Shang F, Jiao L, Wang F (2012) Graph dual regularization non-negative matrix factorization for co-clustering. Pattern Recogn 45(6):2237–2250

    Article  MATH  Google Scholar 

  • Sun M, Van Hamme H (2012) Large scale graph regularized non-negative matrix factorization with \(\ell _1\) normalization based on kullback-leibler divergence. IEEE Trans Signal Process 60(7):3876–3880

    Article  MathSciNet  Google Scholar 

  • Tenenbaum JB, De Silva V, Langford JC (2000) A global geometric framework for nonlinear dimensionality reduction. Science 290(5500):2319–2323

    Article  Google Scholar 

  • Wang F, Li T, Wang X, Zhu S, Ding C (2011) Community discovery using nonnegative matrix factorization. Data Min Knowl Disc 22(3):493–521

    Article  MathSciNet  MATH  Google Scholar 

  • Wang F, Zhang C, Li T (2009) Clustering with local and global regularization. IEEE Trans Knowl Data Eng 21(12):1665–1678

    Article  Google Scholar 

  • Wang H, Nie F, Huang H, Makedon F (2011) Fast nonnegative matrix tri-factorization for large-scale data co-clustering. In: International Joint Conference on Artificial Intelligence, pp. 1553–1558

  • Wu L, Yen IE, Chen J, Yan R (2016) Revisiting random binning features: Fast convergence and strong parallelizability. In: SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1265–1274

  • Yoo J, Choi S (2010) Orthogonal nonnegative matrix tri-factorization for co-clustering: multiplicative updates on stiefel manifolds. Inf Process Manag 46(5):559–570

    Article  Google Scholar 

  • Zhang H, Zha ZJ, Yan S, Wang M, Chua TS (2012) Robust non-negative graph embedding: Towards noisy data, unreliable graphs, and noisy labels. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 2464–2471

Download references

Acknowledgements

This paper was in part supported by Grants from the Natural Science Foundation of China (No. 61572111), a 985 Project of UESTC (No.A1098531023601041), a Fundamental Research Fund for the Central Universities of China (No. ZYGX2016Z003) and the National High Technology Research and Development Program of China (863 Program) (No. 2015AA015408).

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Hongjun Wang.

Additional information

Responsible editor: Fei Wang.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Huang, S., Wang, H., Li, T. et al. Robust graph regularized nonnegative matrix factorization for clustering. Data Min Knowl Disc 32, 483–503 (2018). https://doi.org/10.1007/s10618-017-0543-9

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10618-017-0543-9

Keywords

Navigation