
1 Background

In the past few years, recommender systems leveraging knowledge graphs have proven to be competitive with state-of-the-art collaborative filtering systems and to effectively address issues such as new items and data sparsity [1, 7,8,9,10, 12]. node2vec has been shown to effectively learn features from graph structures, outperforming existing systems in node classification and link prediction tasks [3]. In this paper, we show that node2vec can be effectively used to learn knowledge graph embeddings to perform item recommendation. node2vec is applied to a knowledge graph that includes user feedback on items, modelled by the special relation 'feedback', as well as item relations to other entities. Recommendations are then generated using the relatedness between users and items in the vector space. The evaluation on the MovieLens dataset shows that: (1) node2vec with default hyper-parameters outperforms collaborative filtering baselines on all metrics and the MostPop algorithm on most metrics; (2) node2vec with optimized hyper-parameters significantly outperforms all baselines under consideration.

2 Approach

Item Recommendation: given a set of items I and a set of users U, the problem of item recommendation is that of ranking a set of N candidate items \(I_{candidates} \subset I\) according to what a user may like. More formally, the problem consists of defining a ranking function \(\rho (u,i)\) that assigns a score to any user-item pair \((u,i) \in U \times I_{candidates}\) and then sorting the items according to \(\rho (u,i)\):

$$\begin{aligned} L(u) = \{i_1, i_2, \ldots , i_N\} \end{aligned}$$
(1)

where \(\rho (u,i_j) > \rho (u,i_{j+1})\) for any \(j = 1, \ldots , N-1\).
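Concretely, producing L(u) amounts to scoring every candidate item for a user and sorting by decreasing score. A minimal sketch (the function names `rho` and `rank_items` are our own; the paper later instantiates the ranking function as cosine similarity between embedding vectors):

```python
import math

def rho(x_u, x_i):
    # Cosine similarity between a user vector and an item vector.
    dot = sum(a * b for a, b in zip(x_u, x_i))
    norm = math.sqrt(sum(a * a for a in x_u)) * math.sqrt(sum(b * b for b in x_i))
    return dot / norm

def rank_items(x_u, candidates, n):
    # candidates: dict mapping item id -> item vector.
    # Sort candidate items by decreasing score and keep the top N.
    ranked = sorted(candidates, key=lambda item_id: rho(x_u, candidates[item_id]), reverse=True)
    return ranked[:n]
```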

node2vec [3] learns representations of nodes in a graph through the application of the word2vec model on sequences of nodes sampled through random walks (Fig. 1). The innovation introduced by node2vec is a flexible random walk exploration strategy that adapts to the diversity of connectivity patterns that a network may present. Given a knowledge graph K encompassing users U, items I (the objects of the recommendations, e.g. a movie) and other entities E (objects connected to items, e.g. the director of a movie), node2vec generates vector representations of the users \(x_u\), of the items \(x_i\) and of the other entities \(x_e\). We thus propose to use the relatedness between the user and item vectors as the ranking function: \(\rho (u,i) = d(x_u,x_i)\), where d is the cosine similarity in this work.
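The walk-sampling step can be sketched as follows. This is a simplified, uniform sampler, i.e. the special case \(p = q = 1\); the full node2vec sampler uses biased second-order walks controlled by p and q, which we omit here for brevity:

```python
import random

def sample_walks(adj, num_walks, walk_length, seed=1):
    # adj: dict mapping each node (user, item or entity) to its neighbour list.
    # Uniform first-order random walks (the p = q = 1 special case of node2vec).
    rng = random.Random(seed)
    walks = []
    for _ in range(num_walks):
        for start in sorted(adj):
            walk = [start]
            while len(walk) < walk_length:
                neighbours = adj[walk[-1]]
                if not neighbours:
                    break  # dead end: stop the walk early
                walk.append(rng.choice(neighbours))
            walks.append(walk)
    return walks
```

The resulting node sequences would then be fed to a word2vec implementation (e.g. gensim's `Word2Vec`, as in the reference node2vec code) to obtain the vectors \(x_u\), \(x_i\) and \(x_e\).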

Knowledge Graph Construction: the dataset used for the evaluation is MovieLens 1M [4]. We used the publicly available mappings from MovieLens 1M items to the corresponding DBpedia entities [8] to create the knowledge graph K using DBpedia data. We split the data into a training set \(X_{train}\), a validation set \(X_{val}\) and a test set \(X_{test}\), containing respectively 70%, 10% and 20% of the ratings of each user. We selected a set of properties based on their frequency of occurrence: [dbo:director, dbo:starring, dbo:distributor, dbo:writer, dbo:musicComposer, dbo:producer, dbo:cinematography, dbo:editing]. We add dct:subject to this set of properties, as it provides an extremely rich categorization of items. For each property p, we include in K all the triples (i, p, e) where \(i \in I\) and \(e \in E\), e.g. (dbr:Pulp_Fiction, dbo:director, dbr:Quentin_Tarantino). We finally add the 'feedback' property, modelling all movie ratings \(r \ge 4\) in \(X_{train}\) as triples (u, feedback, i).
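The construction of K from the DBpedia triples and the training ratings can be sketched as follows (the function name `build_kg` and the tuple-based triple representation are our own illustration, not the paper's code):

```python
def build_kg(item_triples, train_ratings, threshold=4):
    # item_triples: iterable of (item, property, entity) triples from DBpedia,
    # restricted to the selected properties.
    # train_ratings: iterable of (user, item, rating) from the training split.
    kg = list(item_triples)
    # Model each positive rating (r >= 4) as a (u, feedback, i) triple.
    for user, item, rating in train_ratings:
        if rating >= threshold:
            kg.append((user, "feedback", item))
    return kg
```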

Evaluation: we use the evaluation protocol known as AllUnratedItems [11] and measure standard information retrieval metrics: P@5, P@10, R@5, R@10, Mean Average Precision (MAP), Normalized Discounted Cumulative Gain (NDCG) and Mean Reciprocal Rank (MRR). As baselines, we use collaborative filtering algorithms based on Singular Value Decomposition [6] and ItemKNN with baselines [5], as well as the MostPop recommendation strategy, which ranks items by their popularity (i.e. the total number of positive ratings). The baselines are implemented using the surprise Python library.
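For reference, two of the reported metrics have straightforward definitions over a ranked list and a set of relevant (held-out positive) items; a minimal sketch, with our own function names:

```python
def precision_at_k(ranked, relevant, k):
    # P@k: fraction of the top-k recommendations that are relevant.
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / k

def reciprocal_rank(ranked, relevant):
    # Per-user reciprocal rank: 1 / position of the first relevant item,
    # or 0 if no relevant item appears. MRR is the mean over all users.
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0
```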

Fig. 1. Node2vec for item recommendation using the knowledge graph. Users are represented in black, items in orange and entities in grey. node2vec learns knowledge graph embeddings by sampling sequences of nodes through random walks and then applying the word2vec model on the sequences. The ranking function for item recommendation is then given by the node relatedness in the vector space.

3 Results

The results of the evaluation are reported in Table 1. In "node2vec (default)", the hyper-parameters are set to the default values reported in the original paper [3] and in the reference Python implementation available on GitHub (\(p=1, q=1, num\_walks=10, walk\_length=80, window\_size=10, iter=1, dimensions=128\)). We observe that "node2vec (default)" outperforms SVD and ItemKNN on all metrics, but that the MostPop approach performs slightly better on P@5, P@10 and MRR. Note that MostPop, although trivial, is known to be quite effective on the MovieLens 1M dataset as a consequence of the strong concentration of feedback on a small number of highly popular items [2]. In "node2vec (opt)", we optimized the hyper-parameters through a combination of grid search and manual search on the validation set, exploring the ranges: \(p \in \{0.25, 1, 4\}\), \(q \in \{0.25, 1, 4\}\), \(dimensions \in \{200,500\}\), \(walk\_length \in \{10,20,30,50,100\}\), \(window\_size \in \{10, 20, 30\}\), \(num\_walks \in \{10, 50\}\). We found the configuration (\(p=4, q=1, num\_walks=50, walk\_length=100, window\_size=30, iter=5, dimensions=200\)) to be optimal on the validation set within the explored range. We observed that the number of walks per node, the walk length (i.e. the maximum length of a random walk) and the context size are particularly important for improving performance. However, hyper-parameter optimization is a time-consuming endeavour, as it requires running the whole evaluation pipeline with multiple configurations. In future work, we will therefore extend the evaluation to other datasets and investigate the relation between the hyper-parameters and the graph structure, with the aim of deriving guidelines to guide the hyper-parameter search process.
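The grid-search part of this procedure can be sketched as follows. Here `score` is a placeholder standing in for the full train-and-evaluate pipeline (training node2vec with a given configuration and returning a validation-set metric such as NDCG); it is our own illustration, not the paper's code:

```python
import itertools

def grid_search(score, grid):
    # Exhaustively evaluate every configuration in the Cartesian product of
    # the hyper-parameter ranges and keep the best-scoring one.
    keys = sorted(grid)
    best_cfg, best_val = None, float("-inf")
    for values in itertools.product(*(grid[k] for k in keys)):
        cfg = dict(zip(keys, values))
        val = score(cfg)  # e.g. NDCG on the validation set
        if val > best_val:
            best_cfg, best_val = cfg, val
    return best_cfg, best_val
```

Note that with the ranges explored in the paper the product already contains hundreds of configurations, which is why each added range multiplies the cost of the search.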

Table 1. Results on the MovieLens 1M dataset sorted by NDCG