Ensembles of relational classifiers

Preisach, Christine; Schmidt-Thieme, Lars

doi:10.1007/s10115-007-0093-3

Regular Paper
Published: 12 July 2007

Volume 14, pages 249–272 (2008)

Knowledge and Information Systems

Christine Preisach¹ &
Lars Schmidt-Thieme¹

Abstract

Relational classification aims to include relations among entities in the classification process, for example by taking relations among documents, such as common authors or citations, into account. Considering more than one relation can further improve classification accuracy. Here we introduce a new approach that uses ensemble methods to exploit several relations, as well as both relations and local attributes, for classification. To accomplish this, we present a generic relational ensemble model that can use different relational and local classifiers as components. Furthermore, we discuss solutions for several problems concerning relational data, such as heterogeneity, sparsity, and multiple relations. The sparsity problem in particular is discussed in more detail, and we introduce a new method called PRNMultiHop that tries to handle it. We also categorize relational methods in a systematic way. Finally, we provide empirical evidence that our relational ensemble methods outperform existing relational classification methods, even rather complex models such as relational probability trees (RPTs), relational dependency networks (RDNs) and relational Bayesian classifiers (RBCs).
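
To make the idea of an ensemble over several relations concrete, the following is a minimal sketch and not the authors' model. It assumes one simple neighbor-vote classifier per relation and combines the per-relation class distributions with an optional local classifier output by plain averaging. The function names (`neighbor_vote`, `ensemble_predict`), the averaging rule, and the toy documents are all illustrative assumptions; the abstract does not specify the component classifiers or the combination scheme.

```python
from collections import Counter


def neighbor_vote(node, edges, labels, classes):
    """Class distribution for `node` under ONE relation, estimated from the
    known labels of its neighbors (hypothetical component classifier).
    Falls back to a uniform distribution if no neighbor is labeled."""
    counts = Counter(labels[n] for n in edges.get(node, ()) if n in labels)
    total = sum(counts.values())
    if total == 0:
        return {c: 1.0 / len(classes) for c in classes}
    return {c: counts[c] / total for c in classes}


def ensemble_predict(node, relations, labels, classes, local_probs=None):
    """Average the per-relation distributions (plus an optional distribution
    from a local, attribute-based classifier) and return the most probable
    class together with the averaged distribution. Unweighted averaging is
    an assumption made for this sketch."""
    members = [
        neighbor_vote(node, edges, labels, classes)
        for edges in relations.values()
    ]
    if local_probs is not None:
        members.append(local_probs)
    avg = {c: sum(m[c] for m in members) / len(members) for c in classes}
    return max(classes, key=lambda c: avg[c]), avg


# Toy data (invented for illustration): document d4 is unlabeled and is
# linked to labeled documents through two different relations.
classes = ["ML", "DB"]
labels = {"d1": "ML", "d2": "ML", "d3": "DB"}
relations = {
    "cites": {"d4": ["d1", "d2"]},
    "same_author": {"d4": ["d3"]},
}
local = {"ML": 0.6, "DB": 0.4}  # assumed output of a text-based classifier

pred, dist = ensemble_predict("d4", relations, labels, classes, local)
print(pred, dist)
```

In this toy case the two relations disagree, and the local classifier's distribution tips the averaged result. A node with no labeled neighbors under a relation receives a uniform distribution from that component, which is one simple way to see why sparse relations contribute little information.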

Author information

Authors and Affiliations

Information Systems and Machine Learning Lab, University of Hildesheim, Marienburger Platz 22, 31141, Hildesheim, Germany
Christine Preisach & Lars Schmidt-Thieme

Corresponding author

Correspondence to Christine Preisach.

About this article

Cite this article

Preisach, C., Schmidt-Thieme, L. Ensembles of relational classifiers. Knowl Inf Syst 14, 249–272 (2008). https://doi.org/10.1007/s10115-007-0093-3

Received: 26 March 2007
Accepted: 28 April 2007
Published: 12 July 2007
Issue Date: March 2008
DOI: https://doi.org/10.1007/s10115-007-0093-3
