On ill-conceived initialization in archetypal analysis

  • Regular Article
Advances in Data Analysis and Classification

Abstract

We show that an improper initialization of the matrix of prototypes, \({\mathbf {V}}\), can be misleading and may give rise to a degenerate fuzzy partition when fuzzy clustering is performed by means of archetypal analysis. We then propose an algorithm to correct the initial guess for \({\mathbf {V}}\), grounded in two theoretical results on convex hulls. A numerical experiment carried out to assess its accuracy, involving more than 200,000 initializations, shows a failure rate below 0.8%.
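As a rough illustration of what checking an initial guess can look like, the sketch below flags candidate prototypes that are not extreme points of the convex hull of the prototype set. It is not the algorithm proposed in the article: the reading of "extremality" used here, the restriction to two dimensions, the monotone-chain hull, and the names `convex_hull_2d`, `non_extreme_prototypes` and `V0` are all assumptions made for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull_2d(points: List[Point]) -> List[Point]:
    """Hull vertices via Andrew's monotone chain; collinear points are dropped."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other chain.
    return lower[:-1] + upper[:-1]


def non_extreme_prototypes(prototypes: List[Point]) -> List[int]:
    """Indices of prototypes that are not vertices of the prototype hull."""
    vertices = set(convex_hull_2d(prototypes))
    return [i for i, v in enumerate(prototypes) if v not in vertices]


# Hypothetical initial guess (assumption): the fourth prototype lies inside
# the triangle spanned by the first three, so it is not an extreme point.
V0 = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0), (1.0, 1.0)]
flagged = non_extreme_prototypes(V0)
```

Here `flagged` lists the rows of the hypothetical `V0` that would need to be replaced or moved before running the analysis. In higher dimensions a general-purpose hull routine such as quickhull (Barber et al. 1996) would take the place of the planar monotone chain.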

Notes

  1. See Wild et al. (2004) for two alternative objective functions.

  2. They used a random initialization and did not check the extremality of the prototypes before performing an archetypal analysis (AA).

  3. The data are categorized into two classes: window and non-window glass.

References

  • Barber CB, Dobkin DP, Huhdanpaa H (1996) The quickhull algorithm for convex hulls. ACM Trans Math Softw 22(4):469–483

  • Bauckhage C, Thurau C (2009) Making archetypal analysis practical. In: Proceedings of the 31st DAGM symposium on pattern recognition. Springer, Berlin, pp 272–281

  • Bemporad A, Fukuda K, Torrisi FD (2001) Convexity recognition of the union of polyhedra. Comput Geom 18:141–154

  • Bezdek JC (1981) Pattern recognition with fuzzy objective function algorithms. Plenum Press, New York

  • Casalino G, Buono ND, Mencar C (2014) Subtractive clustering for seeding non-negative matrix factorizations. Inf Sci 257:369–387

  • Cutler A, Breiman L (1994) Archetypal analysis. Technometrics 36(4):338–347

  • D’Urso P (2015) Fuzzy clustering. In: Hennig C, Meila M, Murtagh F, Rocci R (eds) Handbook of cluster analysis. Chapman & Hall/CRC Handbooks of Modern Statistical Methods, pp 545–573

  • Demaine ED, Schulz A (2016) Embedding stacked polytopes on a polynomial-size grid. https://arxiv.org/abs/1403.7980. Accessed 3 July 2017

  • Ding C, Li T, Jordan MI (2010) Convex and semi-nonnegative matrix factorizations. IEEE Trans Pattern Anal Mach Intell 32(1):45–55

  • Donoho DL, Gasko M (1992) Breakdown properties of location estimates based on halfspace depth and projected outlyingness. Ann Stat 20:1803–1827

  • Donoho D, Stodden V (2004) When does non-negative matrix factorization give a correct decomposition into parts? In: Thrun S, Saul LK, Schölkopf B (eds) Advances in Neural Information Processing Systems 16. MIT Press, Cambridge, pp 1141–1148

  • Dulá JH, Helgason RV (1996) A new procedure for identifying the frame of the convex hull of a finite collection of points in multidimensional space. Eur J Oper Res 92:352–367

  • Eugster MJA, Leisch F (2009) From spider-man to hero—archetypal analysis in R. J Stat Softw 30(8):1–23

  • Gawrilow E, Joswig M (2000) polymake: a framework for analyzing convex polytopes. In: Kalai G, Ziegler GM (eds) Polytopes combinatorics and computation. Birkhäuser, Basel, pp 43–74

  • Gonska B, Ziegler GM (2013) Inscribable stacked polytopes. Adv Geom 8(4):723–740

  • Hochbaum DS, Shmoys DB (1985) A best possible heuristic for the \(k\)-center problem. Math Oper Res 10(2):180–184

  • Johnson B, Tateishi R, Xie Z (2012) Using geographically-weighted variables for image classification. Remote Sens Lett 3(6):491–499

  • Kalai G (1994) Some aspects of the combinatorial theory of convex polytopes. In: Bisztriczky T, McMullen P, Schneider R, Weiss AI (eds) Polytopes: abstract, convex and computational. Springer, Berlin, pp 205–229

  • Klingenberg B, Curry J, Dougherty A (2009) Non-negative matrix factorization: ill-posedness and a geometric algorithm. Pattern Recognit 42:918–928

  • Koren Y, Bell R, Volinsky C (2009) Matrix factorization techniques for recommender systems. Computer 42(8):30–37

  • Lee DD, Seung HS (1999) Learning the parts of objects by non-negative matrix factorization. Nature 401:788–791

  • Lichman M (2013) UCI machine learning repository, School of Information and Computer Sciences, University of California, Irvine, CA, USA. http://archive.ics.uci.edu/ml. Accessed 3 July 2017

  • Mangasarian OL, Wolberg WH (1990) Cancer diagnosis via linear programming. SIAM News 23(5):1–18

  • Mirkin B, Satarov G (1990) Method of fuzzy additive types for analysis of multidimensional data I. Autom Remote Control 51(5):683–688

  • Mørup M, Hansen LK (2012) Archetypal analysis for machine learning and data mining. Neurocomputing 80:54–63

  • Nascimento S, Mirkin B (2017) Ideal type model and an associated method for relational fuzzy clustering. In: Proceedings of the 2017 IEEE international conference on fuzzy systems (FUZZ-IEEE), IEEE, Naples, Italy. https://doi.org/10.1109/FUZZ-IEEE.2017.8015473. http://ieeexplore.ieee.org/document/8015473/?reload=true

  • Nascimento S, Mirkin B, Moura-Pires F (2003) Modeling proportional membership in fuzzy clustering. IEEE Trans Fuzzy Syst 11(2):173–186

  • Paatero P, Tapper U (1994) Positive matrix factorization: a non-negative factor model with optimal utilization of error estimates of data values. Environmetrics 5:111–126

  • Pal NR, Bezdek JC (1995) On cluster validity for fuzzy c-means model. IEEE Trans Fuzzy Syst 3(3):370–379

  • Rezaei M, Boostani R, Rezaei M (2004) An efficient initialization method for nonnegative matrix factorization. J Appl Sci 11(2):354–359

  • Seidel R (1986) Constructing higher-dimensional convex hulls at logarithmic cost per face. In: Proceedings of the 18th ACM symposium on the theory of computing, pp 404–413

  • Steuer RE (1986) Multiple criteria optimization: theory, computation, and application. Wiley, New York

  • Suleman A (2015a) A convex semi-nonnegative matrix factorisation approach to fuzzy c-means clustering. Fuzzy Sets Syst 270:90–110

  • Suleman A (2015b) A new perspective of modified partition coefficient. Pattern Recognit Lett 56:1–6

  • Wild S, Curry J, Dougherty A (2004) Improving non-negative matrix factorization through structured initialization. Pattern Recognit 37:2217–2232

  • Woodbury MA, Clive J (1974) Clinical pure types as a fuzzy partition. J Cybern 11:277–298

  • Zheng Z, Yang J, Zhu Y (2007) Initialization enhancer for non-negative matrix factorization. Eng Appl Artif Intell 20:101–110

  • Ziegler GM (2004) Convex polytopes: extremal constructions and f-vector shapes. IAS/Park City Math Ser 14:1–73

  • Ziegler GM (2007) Lectures on polytopes, 7th edn. Springer, New York

Acknowledgements

The author is indebted to Günter M. Ziegler and C. Bradford Barber for their advice which significantly contributed to this research work. However, the work is the exclusive responsibility of the author. He also thanks the three anonymous reviewers for their comments, suggestions and careful reading of an earlier version of this manuscript.

Author information

Corresponding author

Correspondence to Abdul Suleman.

About this article

Cite this article

Suleman, A. On ill-conceived initialization in archetypal analysis. Adv Data Anal Classif 11, 785–808 (2017). https://doi.org/10.1007/s11634-017-0303-0

  • DOI: https://doi.org/10.1007/s11634-017-0303-0
