Abstract
We show that an improper initialization of the matrix of prototypes, \({\mathbf {V}}\), can be misleading and may give rise to a degenerate fuzzy partition when fuzzy clustering is performed by means of archetypal analysis. We then propose an algorithm, grounded in two theoretical results on convex hulls, to correct the initial guess for \({\mathbf {V}}\). A numerical experiment involving more than 200,000 initializations, carried out to assess the algorithm's accuracy, shows a failure rate below 0.8%.
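To make the notion of an ill-conceived initial guess concrete, the following minimal sketch flags candidate prototypes that are not extreme points. It is not the correction algorithm proposed in the paper. It assumes that "extremality" means each row of \({\mathbf {V}}\) is a vertex of the convex hull of the rows of \({\mathbf {V}}\), it is restricted to two-dimensional data, and the function names (`cross`, `hull_vertices`, `non_extreme_prototypes`) are hypothetical. The hull is computed with Andrew's monotone chain rather than Quickhull.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def hull_vertices(points: Sequence[Point]) -> List[Point]:
    """Extreme points of the 2-D convex hull (Andrew's monotone chain)."""
    pts = sorted(set(points))  # duplicates are collapsed here
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        # Pop while the last turn is not strictly counter-clockwise;
        # this also discards points lying on a hull edge.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def non_extreme_prototypes(prototypes: Sequence[Point]) -> List[Point]:
    """Assumption: a prototype is acceptable only if it is a hull vertex
    of the prototype set itself. Returns the prototypes that are not."""
    vertices = set(hull_vertices(prototypes))
    return [v for v in prototypes if v not in vertices]


if __name__ == "__main__":
    # (1, 1) lies inside the triangle spanned by the other three points.
    V0 = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0), (1.0, 1.0)]
    print(non_extreme_prototypes(V0))
```

Coordinates are compared exactly, so repeated prototypes are not flagged and nearly collinear points may be classified either way. In higher dimensions a general-purpose hull code such as Quickhull (Barber et al. 1996) would be the natural replacement for the planar routine.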
Notes
See Wild et al. (2004) for two alternative objective functions.
They used a random initialization and did not check the extremality of the prototypes before performing an AA.
The data are categorized into two classes: window and non-window glass.
References
Barber CB, Dobkin DP, Huhdanpaa H (1996) The quickhull algorithm for convex hulls. ACM Trans Math Softw 22(4):469–483
Bauckhage C, Thurau C (2009) Making archetypal analysis practical. In: Proceedings of the 31st DAGM symposium on pattern recognition. Springer, Berlin, pp 272–281
Bemporad A, Fukuda K, Torrisi FD (2001) Convexity recognition of the union of polyhedra. Comput Geom 18:141–154
Bezdek JC (1981) Pattern recognition with fuzzy objective function algorithms. Plenum Press, New York
Casalino G, Del Buono N, Mencar C (2014) Subtractive clustering for seeding non-negative matrix factorizations. Inf Sci 257:369–387
Cutler A, Breiman L (1994) Archetypal analysis. Technometrics 36(4):338–347
D’Urso P (2015) Fuzzy clustering. In: Hennig C, Meila M, Murtagh F, Rocci R (eds) Handbook of cluster analysis. Chapman & Hall/CRC Handbooks of Modern Statistical Methods, pp 545–573
Demaine ED, Schulz A (2016) Embedding stacked polytopes on a polynomial-size grid. https://arxiv.org/abs/1403.7980. Accessed 3 July 2017
Ding C, Li T, Jordan MI (2010) Convex and semi-nonnegative matrix factorizations. IEEE Trans Pattern Anal Mach Intell 32(1):45–55
Donoho DL, Gasko M (1992) Breakdown properties of location estimates based on halfspace depth and projected outlyingness. Ann Stat 20:1803–1827
Donoho D, Stodden V (2004) When does non-negative matrix factorization give a correct decomposition into parts? In: Thrun S, Saul LK, Schölkopf B (eds) Advances in Neural Information Processing Systems 16. MIT Press, Cambridge, pp 1141–1148
Dulá JH, Helgason RV (1996) A new procedure for identifying the frame of the convex hull of a finite collection of points in multidimensional space. Eur J Oper Res 92:352–367
Eugster MJA, Leisch F (2009) From spider-man to hero—archetypal analysis in R. J Stat Softw 30(8):1–23
Gawrilow E, Joswig M (2000) polymake: a framework for analyzing convex polytopes. In: Kalai G, Ziegler GM (eds) Polytopes combinatorics and computation. Birkhäuser, Basel, pp 43–74
Gonska B, Ziegler GM (2013) Inscribable stacked polytopes. Adv Geom 8(4):723–740
Hochbaum DS, Shmoys DB (1985) A best possible heuristic for the \(k\)-center problem. Math Oper Res 10(2):180–184
Johnson B, Tateishi R, Xie Z (2012) Using geographically-weighted variables for image classification. Remote Sens Lett 3(6):491–499
Kalai G (1994) Some aspects of the combinatorial theory of convex polytopes. In: Bisztriczky T, McMullen P, Schneider R, Weiss AI (eds) Polytopes: abstract, convex and computational. Springer, Berlin, pp 205–229
Klingenberg B, Curry J, Dougherty A (2009) Non-negative matrix factorization: ill-posedness and a geometric algorithm. Pattern Recognit 42:918–928
Koren Y, Bell R, Volinsky C (2009) Matrix factorization techniques for recommender systems. Computer 42(8):30–37
Lee DD, Seung HS (1999) Learning the parts of objects by non-negative matrix factorization. Nature 401:788–791
Lichman M (2013) UCI machine learning repository, School of Information and Computer Sciences, University of California, Irvine, CA, USA. http://archive.ics.uci.edu/ml. Accessed 3 July 2017
Mangasarian OL, Wolberg WH (1990) Cancer diagnosis via linear programming. SIAM News 23(5):1–18
Mirkin B, Satarov G (1990) Method of fuzzy additive types for analysis of multidimensional data I. Autom Remote Control 51(5):683–688
Mørup M, Hansen LK (2012) Archetypal analysis for machine learning and data mining. Neurocomputing 80:54–63
Nascimento S, Mirkin B (2017) Ideal type model and an associated method for relational fuzzy clustering. In: Proceedings of the 2017 IEEE international conference on fuzzy systems (FUZZ-IEEE), IEEE, Naples, Italy. https://doi.org/10.1109/FUZZ-IEEE.2017.8015473. http://ieeexplore.ieee.org/document/8015473/?reload=true
Nascimento S, Mirkin B, Moura-Pires F (2003) Modeling proportional membership in fuzzy clustering. IEEE Trans Fuzzy Syst 11(2):173–186
Paatero P, Tapper U (1994) Positive matrix factorization: a non-negative factor model with optimal utilization of error estimates of data values. Environmetrics 5:111–126
Pal NR, Bezdek JC (1995) On cluster validity for fuzzy c-means model. IEEE Trans Fuzzy Syst 3(3):370–379
Rezaei M, Boostani R, Rezaei M (2004) An efficient initialization method for nonnegative matrix factorization. J Appl Sci 11(2):354–359
Seidel R (1986) Constructing higher-dimensional convex hulls at logarithmic cost per face. In: Proceedings of the 18th ACM symposium on the theory of computing, pp 404–413
Steuer RE (1986) Multiple criteria optimization: theory, computation, and application. Wiley, New York
Suleman A (2015a) A convex semi-nonnegative matrix factorisation approach to fuzzy c-means clustering. Fuzzy Sets Syst 270:90–110
Suleman A (2015b) A new perspective of modified partition coefficient. Pattern Recognit Lett 56:1–6
Wild S, Curry J, Dougherty A (2004) Improving non-negative matrix factorization through structured initialization. Pattern Recognit 37:2217–2232
Woodbury MA, Clive J (1974) Clinical pure types as a fuzzy partition. J Cybern 11:277–298
Zheng Z, Yang J, Zhu Y (2007) Initialization enhancer for non-negative matrix factorization. Eng Appl Artif Intell 20:101–110
Ziegler GM (2004) Convex polytopes: extremal constructions and f-vector shapes. IAS/Park City Math Ser 14:1–73
Ziegler GM (2007) Lectures on polytopes, 7th edn. Springer, New York
Acknowledgements
The author is indebted to Günter M. Ziegler and C. Bradford Barber for their advice which significantly contributed to this research work. However, the work is the exclusive responsibility of the author. He also thanks the three anonymous reviewers for their comments, suggestions and careful reading of an earlier version of this manuscript.
Cite this article
Suleman, A. On ill-conceived initialization in archetypal analysis. Adv Data Anal Classif 11, 785–808 (2017). https://doi.org/10.1007/s11634-017-0303-0
DOI: https://doi.org/10.1007/s11634-017-0303-0