
Ensembles of relational classifiers

  • Regular Paper
Knowledge and Information Systems

Abstract

Relational classification incorporates relations among entities into the classification process, for example by taking relations among documents, such as common authors or citations, into account. Considering more than one relation can further improve classification accuracy. Here we introduce a new approach that uses ensemble methods to exploit several relations, as well as both relations and local attributes, for classification. To accomplish this, we present a generic relational ensemble model that can use different relational and local classifiers as components. Furthermore, we discuss solutions for several problems concerning relational data, such as heterogeneity, sparsity, and multiple relations. The sparsity problem in particular is discussed in more detail, and we introduce a new method called PRNMultiHop that tries to handle it. We also categorize relational methods in a systematic way. Finally, we provide empirical evidence that our relational ensemble methods outperform existing relational classification methods, even rather complex models such as relational probability trees (RPTs), relational dependency networks (RDNs), and relational Bayesian classifiers (RBCs).
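
To make the idea of a relational ensemble concrete, the following minimal sketch combines one simple neighbor-vote classifier per relation with an optional local (attribute-based) class distribution by unweighted averaging. It is an illustration under stated assumptions, not the model from the article: the abstract does not specify the component classifiers or the combination rule, and the names `neighbor_vote`, `ensemble_predict`, and the toy data are hypothetical.

```python
from collections import Counter


def neighbor_vote(node, edges, labels, classes):
    """Class distribution for `node` from the known labels of its
    neighbors in a single relation (assumed component classifier)."""
    counts = Counter(labels[n] for n in edges.get(node, ()) if n in labels)
    total = sum(counts.values())
    if total == 0:
        # No labeled neighbors in this relation (a sparse neighborhood):
        # fall back to a uniform distribution.
        return {c: 1.0 / len(classes) for c in classes}
    return {c: counts[c] / total for c in classes}


def ensemble_predict(node, relations, labels, classes, local_probs=None):
    """Average the per-relation distributions and, if given, a local
    distribution (assumed combination rule: unweighted averaging)."""
    members = [neighbor_vote(node, edges, labels, classes)
               for edges in relations.values()]
    if local_probs is not None:
        members.append(local_probs)
    avg = {c: sum(m[c] for m in members) / len(members) for c in classes}
    return max(classes, key=lambda c: avg[c]), avg


# Hypothetical toy data: three labeled documents and one unlabeled (p4).
classes = ["db", "ml"]
labels = {"p1": "ml", "p2": "ml", "p3": "db"}
relations = {
    "cites": {"p4": ["p1", "p2"]},
    "same_author": {"p4": ["p3", "p1"]},
}
local = {"db": 0.4, "ml": 0.6}  # stand-in for a local classifier's output

pred, dist = ensemble_predict("p4", relations, labels, classes, local_probs=local)
print(pred, dist)
```

Each relation contributes its own prediction, so a relation with no labeled neighbors for a node only dilutes the vote instead of blocking classification. Weighted voting or stacking could replace the plain average without changing the overall structure.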

Author information

Corresponding author

Correspondence to Christine Preisach.

About this article

Cite this article

Preisach, C., Schmidt-Thieme, L. Ensembles of relational classifiers. Knowl Inf Syst 14, 249–272 (2008). https://doi.org/10.1007/s10115-007-0093-3

  • DOI: https://doi.org/10.1007/s10115-007-0093-3
