ABSTRACT
We study the problem of learning a classification task when only a dissimilarity function between objects is accessible. That is, data are represented not by feature vectors but by their pairwise dissimilarities. We investigate sufficient conditions for a dissimilarity function to allow building accurate classifiers. Our results have the advantage that they apply to unbounded dissimilarities and are invariant under order-preserving transformations. The theory immediately suggests a learning paradigm: construct an ensemble of decision stumps, each of which depends on a pair of examples, and then find a convex combination of them that achieves a large margin. Guided by this theory, we develop a practical algorithm, Dissimilarity-based Boosting (DBoost), for learning with dissimilarity functions. Experimental results demonstrate that DBoost compares favorably with several existing approaches on a variety of databases and under different conditions.
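The paradigm above can be sketched in a few lines. The following is a minimal illustration, not the authors' DBoost implementation: each weak learner is a decision stump that compares an object's dissimilarity to a pair of reference training examples, d(x, a) versus d(x, b), and the stumps are combined with standard AdaBoost weights (the function names, the toy 1-D data, and the exhaustive stump search are all assumptions made for this sketch).

```python
import numpy as np

def stump_predict(D, a, b, sign):
    # Predict sign * (+1 if x is less dissimilar to example a than to b, else -1).
    return sign * np.where(D[:, a] < D[:, b], 1.0, -1.0)

def dissimilarity_boost(D, y, n_rounds=20):
    """AdaBoost-style convex combination of pairwise-dissimilarity stumps.
    D: (n, n) dissimilarity matrix over training objects; y: labels in {-1, +1}."""
    n = len(y)
    w = np.full(n, 1.0 / n)                      # example weights
    ensemble = []                                # (alpha, a, b, sign) tuples
    pairs = [(a, b) for a in range(n) for b in range(n) if a != b]
    for _ in range(n_rounds):
        # Exhaustively pick the stump with the smallest weighted error.
        best = min(((np.sum(w * (stump_predict(D, a, b, s) != y)), a, b, s)
                    for a, b in pairs for s in (1, -1)),
                   key=lambda t: t[0])
        err, a, b, sign = best
        if err >= 0.5:                           # no stump beats random guessing
            break
        alpha = 0.5 * np.log((1 - max(err, 1e-10)) / max(err, 1e-10))
        pred = stump_predict(D, a, b, sign)
        w *= np.exp(-alpha * y * pred)           # reweight toward mistakes
        w /= w.sum()
        ensemble.append((alpha, a, b, sign))
    return ensemble

def ensemble_predict(D, ensemble):
    # D[:, a] must give each test object's dissimilarity to training example a.
    score = np.zeros(D.shape[0])
    for alpha, a, b, sign in ensemble:
        score += alpha * stump_predict(D, a, b, sign)
    return np.sign(score)

# Toy data: two clusters on the line, observed only through |x_i - x_j|.
x = np.array([0.0, 0.2, 0.4, 0.5, 3.0, 3.1, 3.4, 3.6])
y = np.array([-1, -1, -1, -1, 1, 1, 1, 1])
D = np.abs(x[:, None] - x[None, :])
model = dissimilarity_boost(D, y)
print((ensemble_predict(D, model) == y).mean())  # training accuracy
```

Note that the sketch never embeds objects in a vector space: the stumps consume only dissimilarity values, so any (possibly unbounded, non-metric) dissimilarity matrix can be plugged in, and the decisions are unchanged under order-preserving transformations of D.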