Abstract
Relational classification incorporates relations among entities into the classification process, for example by taking relations among documents such as common authors or citations into account. Considering more than one relation can improve classification accuracy further. Here we introduce a new approach that uses ensemble methods to exploit several relations, as well as both relations and local attributes, for classification. To accomplish this, we present a generic relational ensemble model that can use different relational and local classifiers as components. Furthermore, we discuss solutions for several problems concerning relational data, such as heterogeneity, sparsity, and multiple relations. We discuss the sparsity problem in particular detail and introduce a new method called PRNMultiHop to handle it. We also categorize relational methods in a systematic way. Finally, we provide empirical evidence that our relational ensemble methods outperform existing relational classification methods, including rather complex models such as relational probability trees (RPTs), relational dependency networks (RDNs), and relational Bayesian classifiers (RBCs).
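As a rough illustration of the ensemble idea described above, the following minimal Python sketch combines one simple neighbor-vote classifier per relation with an optional local class distribution by unweighted averaging. It is not the model from the article: the function names `neighbor_vote` and `ensemble_predict`, the toy document graph, the uniform fallback for unlabeled neighborhoods, and the averaging rule are all assumptions made for this example.

```python
from collections import Counter


def neighbor_vote(node, relation, labels, classes):
    """Class distribution from the labeled neighbors of `node` under one relation."""
    counts = Counter(labels[n] for n in relation.get(node, ()) if n in labels)
    total = sum(counts.values())
    if total == 0:
        # No labeled neighbors (a sparse relation): assumed uniform fallback.
        return {c: 1.0 / len(classes) for c in classes}
    return {c: counts[c] / total for c in classes}


def ensemble_predict(node, relations, labels, classes, local_probs=None):
    """Average the per-relation distributions and an optional local one."""
    members = [neighbor_vote(node, rel, labels, classes) for rel in relations]
    if local_probs is not None:
        members.append(local_probs)
    avg = {c: sum(m[c] for m in members) / len(members) for c in classes}
    return max(classes, key=lambda c: avg[c]), avg


# Hypothetical toy data: three labeled documents and one unlabeled document d4.
classes = ["ml", "db"]
labels = {"d1": "ml", "d2": "ml", "d3": "db"}
citations = {"d4": ["d1", "d2"]}      # d4 cites d1 and d2
coauthors = {"d4": ["d3", "d1"]}      # d4 shares authors with d3 and d1
local = {"ml": 0.4, "db": 0.6}        # stand-in output of a local (attribute) classifier

label, avg = ensemble_predict("d4", [citations, coauthors], labels, classes, local)
print(label, avg)
```

In this toy case the citation relation votes entirely for one class, the co-author relation is split, and the local distribution leans the other way, so the averaged prediction reflects all three components rather than any single one. A weighted vote or a stacked meta-classifier could replace the plain average.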
Cite this article
Preisach, C., Schmidt-Thieme, L. Ensembles of relational classifiers. Knowl Inf Syst 14, 249–272 (2008). https://doi.org/10.1007/s10115-007-0093-3
DOI: https://doi.org/10.1007/s10115-007-0093-3