Substructure counting graph kernels for machine learning from RDF data
Introduction
In recent years graph kernels have been introduced as a promising method to perform data mining and machine learning on Linked Data and the Semantic Web. These methods take Resource Description Framework (RDF) data as input.
One main advantage of this approach is that the techniques are widely applicable to all kinds of Linked Data. Almost no assumptions are made about the semantics of the data and its content, other than that it is in RDF. As a result, these methods also require very little knowledge of Semantic Web technologies to be employed.
Another advantage is the host of existing machine learning algorithms, called kernel methods [1], [2], that can be used with these graph kernels. The best known of these algorithms is the Support Vector Machine (SVM) for classification and regression. However, algorithms also exist for ranking [3], [4], clustering [5], outlier detection [6], etc., all of which can be used directly with these kernels. More recently, interest has grown in large-scale linear classification [7] for larger datasets, for which a number of graph kernels can also be used.
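To illustrate why any kernel method can be paired with a graph kernel, the sketch below trains a kernel perceptron using nothing but a precomputed kernel (Gram) matrix. The perceptron is a deliberately simple stand-in for an SVM, not the method used in this paper, and the toy matrix and all function names are assumptions made for the example.

```python
def train_kernel_perceptron(gram, labels, epochs=10):
    """Return dual coefficients for labels in {-1, +1}.

    `gram[i][j]` is the kernel value between instances i and j. The learner
    never looks at the instances themselves, only at these values.
    """
    n = len(labels)
    alpha = [0] * n
    for _ in range(epochs):
        mistakes = 0
        for i in range(n):
            score = sum(alpha[j] * labels[j] * gram[j][i] for j in range(n))
            if labels[i] * score <= 0:  # misclassified (or undecided)
                alpha[i] += 1
                mistakes += 1
        if mistakes == 0:  # converged on the training data
            break
    return alpha


def predict(alpha, labels, kernel_row):
    """Classify an instance from its kernel values with the training set."""
    score = sum(a * y * k for a, y, k in zip(alpha, labels, kernel_row))
    return 1 if score > 0 else -1


# Hypothetical Gram matrix for four instances forming two similar pairs.
gram = [
    [2.0, 1.0, 0.0, 0.0],
    [1.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 2.0, 1.0],
    [0.0, 0.0, 1.0, 2.0],
]
labels = [1, 1, -1, -1]

alpha = train_kernel_perceptron(gram, labels)
predictions = [predict(alpha, labels, gram[i]) for i in range(len(labels))]
```

Swapping in a different graph kernel only changes how `gram` is filled; the learning algorithm stays the same.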
In this paper we give a comprehensive overview of graph kernels for learning from RDF data. We introduce a framework for these kernels, which are based on counting different graph substructures. The framework encompasses most of the graph kernels previously introduced for RDF, but also introduces new variants, and it includes fast kernel variants that are computed directly on the RDF graph. We also detail the adaptation of the Weisfeiler–Lehman graph kernel [8] needed to compute a number of kernels in our framework. Furthermore, we give two strategies to further improve the machine learning performance with these kernels. The first strategy ignores vertex labels that have a low frequency of occurrence among the instances, and the second strategy removes hubs to simplify the RDF graphs. All kernels defined in the framework can be used with large-scale linear classification methods.
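As a rough picture of what counting subtree-like substructures involves, the following sketch implements the standard Weisfeiler–Lehman relabeling on small, separate, undirected labeled graphs. It is not the RDF-specific adaptation detailed in this paper; the graph encoding, the string-based label signatures, and the function names are assumptions made for illustration.

```python
from collections import Counter


def wl_features(adjacency, labels, iterations):
    """Count every label seen during `iterations` rounds of WL relabeling.

    `adjacency` maps a vertex to its list of neighbours and `labels` maps a
    vertex to a string label. New labels are written out as strings such as
    "A(B,B)" so that they are comparable across graphs. This assumes the
    original labels contain no parentheses or commas.
    """
    counts = Counter(labels.values())
    current = dict(labels)
    for _ in range(iterations):
        updated = {}
        for vertex, neighbours in adjacency.items():
            neighbour_labels = sorted(current[n] for n in neighbours)
            updated[vertex] = current[vertex] + "(" + ",".join(neighbour_labels) + ")"
        current = updated
        counts.update(current.values())
    return counts


def wl_kernel(features_a, features_b):
    """Dot product of two label-count vectors (missing labels count as 0)."""
    return sum(count * features_b[label] for label, count in features_a.items())


# Graph A is the path A - B - A, graph B is the single edge A - B.
features_a = wl_features({1: [2], 2: [1, 3], 3: [2]}, {1: "A", 2: "B", 3: "A"}, 1)
features_b = wl_features({1: [2], 2: [1]}, {1: "A", 2: "B"}, 1)
similarity = wl_kernel(features_a, features_b)
```

Because the kernel is an explicit dot product of count vectors, the same feature vectors could also be passed to a linear classifier.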
The kernels are studied in a number of classification experiments on different RDF datasets. The goal of these experiments is to study the influence of the different choices for graph kernels defined in our framework. It turns out that kernel performance differs per dataset. Overall, kernels that count subtrees in the graphs are the best choice. However, simple bag-of-labels baseline kernels also perform well and are significantly cheaper to compute. The strategy of ignoring low-frequency labels has a positive effect on performance in all tasks, whereas the hub removal strategy has a positive effect only in some tasks and has no influence on larger datasets.
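A bag-of-labels baseline combined with low-frequency label filtering might look roughly like the sketch below. It assumes that "frequency" means the number of instances whose bag contains a label, which may differ from the definition used in the paper; the example labels and function names are hypothetical.

```python
from collections import Counter


def drop_rare_labels(bags, min_freq):
    """Keep only labels that occur in at least `min_freq` instance bags."""
    instance_freq = Counter()
    for bag in bags:
        instance_freq.update(bag.keys())  # each bag counts a label once
    return [
        Counter({label: n for label, n in bag.items() if instance_freq[label] >= min_freq})
        for bag in bags
    ]


def linear_kernel(bag_a, bag_b):
    """Dot product of two label-count vectors (missing labels count as 0)."""
    return sum(n * bag_b[label] for label, n in bag_a.items())


# Hypothetical labels found in the neighbourhood of three instances.
bags = [
    Counter(["ex:Person", "ex:worksAt", "ex:Amsterdam", "ex:id1"]),
    Counter(["ex:Person", "ex:worksAt", "ex:Utrecht", "ex:id2"]),
    Counter(["ex:Person", "ex:Amsterdam", "ex:id3"]),
]

filtered = drop_rare_labels(bags, min_freq=2)
k_01 = linear_kernel(filtered[0], filtered[1])
k_00 = linear_kernel(filtered[0], filtered[0])
```

Labels that are unique to one instance, such as identifiers, only ever contribute to the kernel value of an instance with itself, which is one intuition for why dropping them can help.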
The work presented in this paper consolidates and expands our earlier papers on graph kernels for RDF [9], [10] and hub removal [11].
The rest of this paper is structured as follows. We begin with an overview of related work. In Section 3 we introduce our kernel framework and algorithms. Section 4 covers our experiments with these kernels. We end with conclusions and suggestions for future work.
Related work
Graph kernels, such as those introduced in [12], [8], [13], are methods to perform machine learning on graph structured data, using kernel methods [1], [2].
For learning from RDF data, the intersection subtree and intersection graph kernels were introduced in [14]. A fast approximation of the Weisfeiler–Lehman graph kernel [8], specifically designed for RDF, was introduced in [9]. In [10] a fast and simple graph kernel, similar to the intersection subtree kernel, was defined. In the context
Graph kernels for RDF
The Resource Description Framework (RDF) is the foundation of Linked Data and the Semantic Web. The central idea is to store statements about resources in subject–predicate–object form, called triples, which define relations between a set of terms. A triple (s, p, o) in a set of triples specifies that the subject term s is related to the object term o by the predicate p.
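As a concrete picture of this data model, the sketch below stores triples as plain Python 3-tuples, indexes them by subject, and collects the triples reachable from one resource within a fixed number of steps. The `ex:` terms and the function names are hypothetical, and a real application would normally load the data with an RDF library rather than write it by hand.

```python
from collections import defaultdict

# Each triple is (subject, predicate, object).
triples = [
    ("ex:alice", "ex:worksAt", "ex:vu"),
    ("ex:bob", "ex:worksAt", "ex:vu"),
    ("ex:vu", "ex:locatedIn", "ex:amsterdam"),
]


def build_graph(triples):
    """Map each subject to its outgoing (predicate, object) edges."""
    outgoing = defaultdict(list)
    for s, p, o in triples:
        outgoing[s].append((p, o))
    return outgoing


def neighbourhood(outgoing, root, depth):
    """Collect the triples reachable from `root` in at most `depth` steps."""
    frontier, seen, found = [root], {root}, []
    for _ in range(depth):
        next_frontier = []
        for s in frontier:
            for p, o in outgoing.get(s, []):
                found.append((s, p, o))
                if o not in seen:  # expand each vertex only once
                    seen.add(o)
                    next_frontier.append(o)
        frontier = next_frontier
    return found


graph = build_graph(triples)
one_step = neighbourhood(graph, "ex:alice", 1)
two_steps = neighbourhood(graph, "ex:alice", 2)
```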
Often a set of triples is referred to and visualized as a graph, where the subject
Experiments
In this section we present results for five sets of experiments using the kernels presented above. The first four sets are classification experiments with Support Vector Machines (SVMs): the first using the kernels on regular datasets, the second using the MinFreq kernels, the third using the kernels in combination with hub removal, and the fourth using the kernels on unlabeled RDF graphs. The final set of experiments presents the runtimes for the different kernels.
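The hub removal step can be pictured with the minimal sketch below. It assumes that a hub is simply a vertex whose degree exceeds a fixed threshold and that every triple touching a hub is dropped; the criterion actually used in [11] may differ, and the threshold, data, and function name are hypothetical.

```python
from collections import Counter


def remove_hubs(triples, max_degree):
    """Drop all triples that touch a vertex with degree above `max_degree`.

    Degree here counts every appearance of a term as subject or object.
    Returns the remaining triples and the set of removed hub vertices.
    """
    degree = Counter()
    for s, _p, o in triples:
        degree[s] += 1
        degree[o] += 1
    hubs = {vertex for vertex, d in degree.items() if d > max_degree}
    kept = [(s, p, o) for s, p, o in triples if s not in hubs and o not in hubs]
    return kept, hubs


example_triples = [
    ("ex:a", "rdf:type", "ex:Person"),
    ("ex:b", "rdf:type", "ex:Person"),
    ("ex:c", "rdf:type", "ex:Person"),
    ("ex:a", "ex:knows", "ex:b"),
]

kept, hubs = remove_hubs(example_triples, max_degree=2)
```

In this toy graph the class vertex `ex:Person` is the only hub, so removing it leaves just the `ex:knows` relation between the two instances.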
The goal of these experiments is
Conclusions and future work
We have presented a framework for substructure counting graph kernels for RDF data. This framework systematically covers most of the graph kernels introduced for RDF and also provides a number of new kernels. The definitions include kernel variants that are computed directly on the RDF graph. We detailed the adaptation of the Weisfeiler-Lehman graph kernel [8], needed to compute the subtree counting kernels, to ensure that identical subtrees are not repeatedly counted. Furthermore, we
Acknowledgments
This publication was supported by the Dutch national program COMMIT and by the Netherlands eScience Center. The authors thank Peter Bloem for valuable discussions.
References (41)
- et al., Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond (2001)
- et al., Kernel Methods for Pattern Analysis (2004)
- Optimizing search engines using clickthrough data
- et al., Efficient algorithms for ranking with SVMs, Inform. Retrieval (2010)
- et al., Weighted graph cuts without eigenvectors: a multilevel approach, IEEE Trans. Pattern Anal. Mach. Intell. (2007)
- et al., Estimating the support of a high-dimensional distribution, Neural Comput. (2001)
- et al., Recent advances of large-scale linear classification, Proc. IEEE (2012)
- et al., Weisfeiler-Lehman graph kernels, J. Mach. Learn. Res. (2011)
- A fast approximation of the Weisfeiler-Lehman graph kernel for RDF data
- G.K.D. de Vries, S. de Rooij, A fast and simple graph kernel for RDF, in: C. d’Amato, P. Berka, V. Svátek, K. Wecel...