Abstract
The basic concepts of distance-based classification are introduced in terms of clear-cut example systems. The classical k-Nearest-Neighbor (kNN) classifier serves as the starting point of the discussion. Learning Vector Quantization (LVQ) is then introduced; it represents the reference data by only a few prototypes and therefore requires a data-driven training process, for which examples of heuristic and cost-function-based prescriptions are presented. While the most popular dissimilarity measure in this context is the Euclidean distance, that choice is frequently made without justification, and alternative distances can yield better performance in practical problems. Several examples are discussed, including more general Minkowski metrics and statistical divergences for the comparison of, e.g., histogram data. Furthermore, the framework of relevance learning in LVQ is presented, in which parameters of adaptive distance measures are optimized during the training phase. A practical application of Matrix Relevance LVQ in the context of tumor classification illustrates the approach.
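The two starting points of the chapter, the kNN rule and a heuristic prototype-based training prescription in the spirit of LVQ1, can be sketched as follows. This is a minimal illustration only; the function names and the learning parameters (`eta`, `epochs`) are illustrative choices, not the chapter's notation, and plain Euclidean distance is assumed throughout.

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """Classify x by majority vote among its k nearest training
    points under Euclidean distance (the classical kNN rule)."""
    d = np.linalg.norm(X_train - x, axis=1)
    nearest_labels = y_train[np.argsort(d)[:k]]
    labels, counts = np.unique(nearest_labels, return_counts=True)
    return labels[np.argmax(counts)]

def lvq1_train(X, y, prototypes, proto_labels, eta=0.05, epochs=20, seed=0):
    """Heuristic LVQ1-style training: for each example, move the
    winning (closest) prototype toward the example if their labels
    agree, and away from it otherwise."""
    rng = np.random.default_rng(seed)
    W = prototypes.astype(float).copy()
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            j = np.argmin(np.linalg.norm(W - X[i], axis=1))  # winner
            sign = 1.0 if proto_labels[j] == y[i] else -1.0
            W[j] += sign * eta * (X[i] - W[j])
    return W
```

After training, classification reduces to a nearest-prototype lookup over the few vectors in `W` instead of the full reference set, which is the practical appeal of LVQ over kNN.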
Notes
1. In this article, we use the term distance in its general sense, not necessarily implying symmetry or other metric properties.
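The note above matters in practice: the statistical divergences mentioned in the abstract, such as the Kullback-Leibler divergence for histogram data, are dissimilarities that need not be symmetric. A minimal sketch, assuming normalized histograms and a small smoothing constant `eps` (an illustrative regularization choice, not part of the original text):

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """Kullback-Leibler divergence D(p||q) between two histograms.
    Note that D(p||q) != D(q||p) in general, i.e. this dissimilarity
    lacks the symmetry property of a metric."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))

def nearest_prototype(hist, prototypes, labels):
    """Assign the label of the prototype with the smallest
    divergence from the given histogram."""
    d = [kl_divergence(hist, w) for w in prototypes]
    return labels[int(np.argmin(d))]
```

Because the divergence is asymmetric, one must fix a convention (here: data histogram in the first argument, prototype in the second); differentiable divergences of this kind can then replace the squared Euclidean distance in LVQ cost functions.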
© 2014 Springer International Publishing Switzerland
Biehl, M., Hammer, B., Villmann, T. (2014). Distance Measures for Prototype Based Classification. In: Grandinetti, L., Lippert, T., Petkov, N. (eds) Brain-Inspired Computing. BrainComp 2013. Lecture Notes in Computer Science(), vol 8603. Springer, Cham. https://doi.org/10.1007/978-3-319-12084-3_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-12083-6
Online ISBN: 978-3-319-12084-3
eBook Packages: Computer Science (R0)