Abstract
The widely known binary relevance method for multi-label classification, which considers each label as an independent binary problem, has been sidelined in the literature due to the perceived inadequacy of its label-independence assumption. Instead, most current methods invest considerable complexity to model interdependencies between labels. This paper shows that binary relevance-based methods have much to offer, especially in terms of scalability to large datasets. We exemplify this with a novel chaining method that can model label correlations while maintaining acceptable computational complexity. Empirical evaluation over a broad range of multi-label datasets with a variety of evaluation metrics demonstrates the competitiveness of our chaining method against related and state-of-the-art methods, both in terms of predictive performance and time complexity.
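The abstract does not spell out the chaining mechanism, so the following is only a minimal sketch, assuming the common formulation in which the binary classifier for each label also receives the preceding labels as extra input features. The `NearestCentroid` base learner, the function names, and the toy data are all hypothetical and chosen for brevity; any binary classifier could take the base learner's place.

```python
from typing import Dict, List, Sequence


class NearestCentroid:
    """Tiny stand-in binary base learner (an assumption, not from the paper)."""

    def fit(self, X: Sequence[Sequence[float]], y: Sequence[int]) -> "NearestCentroid":
        self.centroids: Dict[int, List[float]] = {}
        for cls in (0, 1):
            rows = [x for x, t in zip(X, y) if t == cls]
            if rows:  # skip a class that never occurs in the training data
                self.centroids[cls] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict_one(self, x: Sequence[float]) -> int:
        def sq_dist(c: Sequence[float]) -> float:
            return sum((a - b) ** 2 for a, b in zip(x, c))

        return min(self.centroids, key=lambda cls: sq_dist(self.centroids[cls]))


def fit_binary_relevance(X, Y):
    """Binary relevance: one independent binary model per label."""
    n_labels = len(Y[0])
    return [NearestCentroid().fit(X, [y[j] for y in Y]) for j in range(n_labels)]


def predict_binary_relevance(models, x):
    return [m.predict_one(x) for m in models]


def fit_chain(X, Y):
    """Chain: the model for label j also sees the true labels 0..j-1 as features."""
    n_labels = len(Y[0])
    models = []
    for j in range(n_labels):
        X_aug = [list(x) + list(y[:j]) for x, y in zip(X, Y)]
        models.append(NearestCentroid().fit(X_aug, [y[j] for y in Y]))
    return models


def predict_chain(models, x):
    """At prediction time, earlier predictions are fed along the chain."""
    preds: List[int] = []
    for m in models:
        preds.append(m.predict_one(list(x) + preds))
    return preds


if __name__ == "__main__":
    # Hypothetical toy data: one feature, two perfectly correlated labels.
    X = [[0.0], [0.1], [0.9], [1.0]]
    Y = [[0, 0], [0, 0], [1, 1], [1, 1]]
    print(predict_binary_relevance(fit_binary_relevance(X, Y), [0.95]))
    print(predict_chain(fit_chain(X, Y), [0.95]))
```

Both approaches train exactly one binary model per label, which is why a chain of this kind stays close to binary relevance in cost; the only difference is that the input to the j-th model grows by j label features.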
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Read, J., Pfahringer, B., Holmes, G., Frank, E. (2009). Classifier Chains for Multi-label Classification. In: Buntine, W., Grobelnik, M., Mladenić, D., Shawe-Taylor, J. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2009. Lecture Notes in Computer Science, vol 5782. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04174-7_17
DOI: https://doi.org/10.1007/978-3-642-04174-7_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04173-0
Online ISBN: 978-3-642-04174-7
eBook Packages: Computer Science, Computer Science (R0)