Substructure counting graph kernels for machine learning from RDF data

doi:10.1016/j.websem.2015.08.002

Journal of Web Semantics

Volume 35, Part 2, December 2015, Pages 71-84

https://doi.org/10.1016/j.websem.2015.08.002

Highlights

• Systematic graph kernel framework for RDF.
• Fast computation algorithms.
• Low frequency labels and hub removal on RDF to enhance machine learning.

Abstract

In this paper we introduce a framework for learning from RDF data using graph kernels that count substructures in RDF graphs, which systematically covers most of the previously defined kernels and provides a number of new variants. Our definitions include fast kernel variants that are computed directly on the RDF graph. To improve the performance of these kernels we detail two strategies. The first strategy involves ignoring the vertex labels that have a low frequency among the instances. Our second strategy is to remove hubs to simplify the RDF graphs. We test our kernels in a number of classification experiments with real-world RDF datasets. Overall the kernels that count subtrees show the best performance. However, they are closely followed by simple bag of labels baseline kernels. The direct kernels substantially decrease computation time, while maintaining the same performance. For the walks counting kernel this decrease in computation time is so large that the kernel becomes computationally viable. Ignoring low-frequency labels improves the performance for all datasets. The hub removal algorithm increases performance on two out of three of our smaller datasets, but has little impact when used on our larger datasets.

Introduction

In recent years graph kernels have been introduced as a promising method to perform data mining and machine learning on Linked Data and the Semantic Web. These methods take Resource Description Framework (RDF) data as input.

One main advantage of this approach is that the techniques are widely applicable to all kinds of Linked Data. Almost no assumptions are made about the semantics or content of the data, other than that it is in RDF. As a result, these methods can be employed with very little knowledge of Semantic Web technologies.

Another advantage is the host of existing machine learning algorithms, called kernel methods [1], [2], that can be used with these graph kernels. The best known of these algorithms is the Support Vector Machine (SVM) for classification and regression. However, algorithms also exist for ranking [3], [4], clustering [5], outlier detection [6], etc., all of which can be used directly with these kernels. More recently, interest has grown in large-scale linear classification [7] for larger datasets, for which a number of graph kernels can also be used.
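To make the link between graph kernels and kernel methods concrete, the following minimal sketch computes a kernel (Gram) matrix as inner products of sparse substructure count vectors. The feature names, the example counts, and the helpers `dot` and `gram_matrix` are illustrative assumptions and are not taken from the paper.

```python
from collections import Counter


def dot(a: Counter, b: Counter) -> int:
    """Inner product of two sparse count vectors."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the smaller vector
    return sum(count * b[key] for key, count in a.items())


def gram_matrix(features):
    """Symmetric matrix K with K[i][j] = <features[i], features[j]>."""
    n = len(features)
    K = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            K[i][j] = K[j][i] = dot(features[i], features[j])
    return K


# Hypothetical substructure counts for three instances.
features = [
    Counter({"ex:Person": 1, "ex:knows": 2}),
    Counter({"ex:Person": 1, "ex:worksAt": 1}),
    Counter({"ex:knows": 1}),
]
K = gram_matrix(features)
```

A matrix such as `K` is the kind of input that an SVM implementation accepting a precomputed kernel would consume, while the explicit count vectors themselves are what a linear classifier would use.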

In this paper we give a comprehensive overview of graph kernels for learning from RDF data. We introduce a framework for these kernels, which are based on counting different graph substructures, that encompasses most of the graph kernels previously introduced for RDF, but also introduces new variants. The framework includes fast kernel variants that are computed directly on the RDF graph. We also detail the necessary adaptation of the Weisfeiler–Lehman graph kernel [8] needed to compute a number of kernels in our framework. Furthermore, we give two strategies to further improve the machine learning performance with these kernels. The first strategy ignores vertex labels which have a low frequency of occurrence among the instances and the second strategy removes hubs to simplify the RDF graphs. All of our kernels defined in the framework can be used with large scale linear classification methods.
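The first strategy can be pictured with a small sketch on a bag of labels baseline. It assumes that "low frequency" means a label occurs in fewer than `min_freq` instances; the threshold semantics, the example labels, and the function names are assumptions made for illustration, since the exact definition is not given in this excerpt.

```python
from collections import Counter


def frequent_labels(instance_labels, min_freq):
    """Labels that occur in at least `min_freq` instances (assumed criterion)."""
    instance_freq = Counter()
    for labels in instance_labels:
        instance_freq.update(set(labels))  # count each label once per instance
    return {label for label, n in instance_freq.items() if n >= min_freq}


def bag_of_labels(labels, keep):
    """Count the labels of one instance, ignoring labels not in `keep`."""
    return Counter(label for label in labels if label in keep)


def kernel(a, b):
    """Inner product of two bags of labels."""
    return sum(count * b[label] for label, count in a.items())


# Hypothetical vertex and edge labels collected per instance.
instance_labels = [
    ["ex:Person", "ex:Amsterdam", "ex:knows", "ex:knows"],
    ["ex:Person", "ex:Utrecht", "ex:knows"],
    ["ex:Person", "ex:Amsterdam"],
]
keep = frequent_labels(instance_labels, min_freq=2)
bags = [bag_of_labels(labels, keep) for labels in instance_labels]
```

Here `ex:Utrecht` occurs in only one instance, so it is dropped before the kernel is computed and cannot contribute to any kernel value.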

The kernels are studied in a number of classification experiments on different RDF datasets. The goal of these experiments is to study the influence of the different choices for graph kernels defined in our framework. It turns out that kernel performance differs per dataset. Overall, kernels that count subtrees in the graphs are the best choice. However, simple bag of labels baseline kernels also perform well and are significantly cheaper to compute. The strategy to ignore low frequency labels has a positive effect on performance in all tasks, whereas the hub removal strategy only has a positive effect in a number of tasks and has no influence for larger datasets.
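As a rough illustration of hub removal, the sketch below treats the `k` highest-degree vertices as hubs and drops every triple that touches one of them. This degree-based criterion is an assumption made for the example; the excerpt does not specify how the paper identifies hubs or what exactly is removed, and `remove_hubs` is a hypothetical helper.

```python
from collections import Counter


def remove_hubs(triples, k):
    """Drop all triples incident to the k highest-degree vertices (assumed criterion)."""
    degree = Counter()
    for s, _, o in triples:
        degree[s] += 1
        degree[o] += 1
    hubs = {vertex for vertex, _ in degree.most_common(k)}
    return {(s, p, o) for s, p, o in triples if s not in hubs and o not in hubs}


# Hypothetical data: the class vertex ex:Person is connected to every instance.
triples = [
    ("ex:alice", "rdf:type", "ex:Person"),
    ("ex:bob", "rdf:type", "ex:Person"),
    ("ex:carol", "rdf:type", "ex:Person"),
    ("ex:alice", "ex:knows", "ex:bob"),
]
simplified = remove_hubs(triples, k=1)
```

In this toy graph `ex:Person` has the highest degree, so only the `ex:knows` triple survives, which shows how removing a hub can shrink the graph considerably.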

The work presented in this paper consolidates and expands our earlier papers on graph kernels for RDF [9], [10] and hub removal [11].

The rest of this paper is structured as follows. We begin with an overview of related work. In Section 3 we introduce our kernel framework and algorithms. Section 4 covers our experiments with these kernels. We end with conclusions and suggestions for future work.

Section snippets

Related work

Graph kernels, such as those introduced in [12], [8], [13], are methods to perform machine learning on graph structured data, using kernel methods [1], [2].

For learning from RDF data, the intersection subtree and intersection graph kernels were introduced in [14]. A fast approximation of the Weisfeiler–Lehman graph kernel [8], specifically designed for RDF, was introduced in [9]. In [10] a fast and simple graph kernel, similar to the intersection subtree kernel, was defined. In the context

Graph kernels for RDF

The Resource Description Framework (RDF) is the foundation of Linked Data and the Semantic Web. The central idea is to store statements about resources in subject–predicate–object form, called triples, which define relations between a set of terms. A triple $(s, p, o)$ in a set of triples $T$ specifies that the subject term $s$ is related to the object term $o$ by the predicate $p$.
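The triple model can be sketched in a few lines: a set $T$ of $(s, p, o)$ tuples, indexed by subject so that the objects related to a subject by a given predicate can be looked up. The `ex:` terms and the helpers `index_by_subject` and `objects` are hypothetical names chosen for this example.

```python
from collections import defaultdict

# A small, hypothetical set of triples T.
T = {
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:alice", "ex:worksAt", "ex:vu"),
    ("ex:bob", "ex:worksAt", "ex:vu"),
}


def index_by_subject(triples):
    """Map each subject s to the set of (p, o) pairs it appears with."""
    index = defaultdict(set)
    for s, p, o in triples:
        index[s].add((p, o))
    return index


def objects(index, s, p):
    """All object terms o such that (s, p, o) is in the indexed triples."""
    return {o for pred, o in index.get(s, ()) if pred == p}


index = index_by_subject(T)
```

For instance, `objects(index, "ex:alice", "ex:knows")` returns the terms that `ex:alice` is related to by `ex:knows`.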

Often a set of triples $T$ is referred to and visualized as a graph, where the subject

Experiments

In this section we present results for five sets of experiments using the kernels presented above. The first four sets are classification experiments with Support Vector Machines (SVMs): the first using the kernels on regular datasets, the second using the MinFreq kernels, the third using the kernels in combination with hub removal, and the fourth using the kernels on unlabeled RDF graphs. The final set of experiments presents the runtimes for the different kernels.

The goal of these experiments is

Conclusions and future work

We have presented a framework for substructure counting graph kernels for RDF data. This framework systematically covers most of the graph kernels introduced for RDF and also provides a number of new kernels. The definitions include kernel variants that are computed directly on the RDF graph. We detailed the adaptation of the Weisfeiler–Lehman graph kernel [8], needed to compute the subtree counting kernels, to ensure that identical subtrees are not repeatedly counted. Furthermore, we

Acknowledgments

This publication was supported by the Dutch national program COMMIT and by the Netherlands eScience Center. The authors thank Peter Bloem for valuable discussions.

References (41)

B. Schölkopf et al., Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond (2001)
J. Shawe-Taylor et al., Kernel Methods for Pattern Analysis (2004)
T. Joachims, Optimizing search engines using clickthrough data
O. Chapelle et al., Efficient algorithms for ranking with SVMs, Inform. Retrieval (2010)
I.S. Dhillon et al., Weighted graph cuts without eigenvectors: a multilevel approach, IEEE Trans. Pattern Anal. Mach. Intell. (2007)
B. Schölkopf et al., Estimating the support of a high-dimensional distribution, Neural Comput. (2001)
G.-X. Yuan et al., Recent advances of large-scale linear classification, Proc. IEEE (2012)
N. Shervashidze et al., Weisfeiler-Lehman graph kernels, J. Mach. Learn. Res. (2011)
G.K.D. de Vries, A fast approximation of the Weisfeiler-Lehman graph kernel for RDF data
G.K.D. de Vries, S. de Rooij, A fast and simple graph kernel for RDF, in: C. d’Amato, P. Berka, V. Svátek, K. Wecel...
P. Bloem, A. Wibisono, G.K.D. de Vries, Simplifying RDF data for graph-based machine learning, in: KNOW@LOD,...
S.V.N. Vishwanathan et al., Graph kernels, J. Mach. Learn. Res. (2010)
N. Shervashidze, T. Petri, K. Mehlhorn, K.M. Borgwardt, S. Viswanathan, Efficient graphlet kernels for large graph...
U. Lösch et al., Graph kernels for RDF data
V.C. Ostuni et al., A linked data recommender system using a neighborhood-based graph kernel
M. Rowe, Transferring semantic categories with vertex kernels: Recommendations with SemanticSVD++
S. Bloehdorn et al., Kernel Methods for Mining Instance Data in Ontologies
N. Fanizzi et al., Induction of robust classifiers for web ontologies through kernel machines, J. Web Sem. (2012)
V. Bicer et al., Relational kernel machines for learning from graph-structured RDF data
Y. Huang et al., Multivariate prediction for learning on the semantic web
