Graph neural networks for preference social recommendation

Social recommendation aims to improve the performance of recommendation systems with additional social network information. In the state of the art, there are two major problems in applying graph neural networks (GNNs) to social recommendation: (i) the social network is connected through social relationships, not item preferences, i.e., connected users may have completely different preferences, and (ii) in each graph neural network layer, the user representations of the social network and the user-item interaction network are computed from the mixed user representation output by the previous layer, which causes information redundancy. To address these problems, we propose graph neural networks for preference social recommendation (PSR). First, a friend influence indicator is proposed to transform the social network into a new view that describes the similarity of friend preferences. We name the new view the Social Preference Network. Next, we use different GNNs to capture the respective information of the social preference network and the user-item interaction network, which effectively avoids information redundancy. Finally, we use two losses to penalize the unobserved user-item interactions and the unit space vector angle, respectively, to preserve the original connection relationship and widen the distance between positive and negative samples. Experimental results show that the proposed PSR is effective and lightweight for recommendation tasks, especially in dealing with cold-start problems.


INTRODUCTION
Recommendation systems are a hot spot in current network applications and research (Wang, Wang & Yeung, 2015; Ebesu, Shen & Fang, 2018). High-quality recommendations can help users quickly discover interesting content and increase product sales. In recent years, with the rise of graph neural networks, recommendation systems based on graph neural networks have received extensive attention. However, the traditional user-item interaction network (U-I network) suffers from data sparsity (Guo et al., 2019), which affects the performance of the recommendation system. Social recommendation (Guo, Zhang & Yorke-Smith, 2015) enhances the user representation by introducing additional user-user information, and further enhances the item representation through the information aggregation of the graph neural network. In addition, recommendation systems also suffer from the cold-start problem (Wahab et al., 2022), i.e., the amount of information about new users is too small for personalized recommendation. Social recommendation assigns an initial preference vector to new users based on user-user information. This vector is used to recommend suitable items for new users.
In social recommendation, users capture information from the social network and the U-I network. According to the form of integration, social recommendation models based on graph neural networks can be divided into unified graph models and separated graph models (Wu et al., 2022). The unified graph model merges the social network and the U-I network, and directly extracts the joint information of the two networks through the graph neural network. As shown in Fig. 1A, in the unified graph, the social network and the U-I network share the same user representation, which effectively ensures the consistency of information updates in both networks. Considering the information differences between users and items, Neural graph collaborative filtering (NGCF) designed different aggregation methods for neighboring users and neighboring items. However, such hand-crafted designs cannot adapt to complex network environments. Diffnet++ (Wu et al., 2020) and SEFrame (Chen & Wong, 2021) used an attention mechanism to adaptively capture the information interaction between neighboring users, between neighboring items, and between neighboring users and neighboring items. However, the network is sparse, i.e., there are a large number of unobserved edges, which leads to information bias, especially after attention is used to strongly aggregate the neighbors. Therefore, some social network models used user similarity (Song et al., 2021), generative adversarial networks (Yu et al., 2019), and other methods to complement the network relationships. Both social networks and U-I networks have their own unique information. The unified graph model lacks separate representations of the two networks, which affects the representation performance to some extent. The separated graph model handles the information of the social network and the U-I network separately, and extracts the information of the two networks through different graph neural networks.
Therefore, the choice of graph neural networks is more flexible under the separated graph model. In contrast to the graph neural network-based social recommendation framework (GNN-SoR) (Guo & Wang, 2020) and SocialLGN (Liao et al., 2022), which used classical graph neural network models, AGREE (Cao et al., 2019) grouped nodes and used attention for each group to capture local information. DANSER (Wu et al., 2019c) further proposed dual attention to capture the interaction between the two graph neural networks. The user representations of social networks and U-I networks obtained from the above separated graph models are able to effectively capture the differences between the two networks. As shown in Fig. 1B, in the separated graph, the user representations of the two networks need to be merged. Diffnet (Wu et al., 2019b) simply summed the two types of user representations, which greatly reduces the computational complexity. However, this method requires that the amounts of information contained in the two types of user representations do not differ significantly. GraphRec (Fan et al., 2019) used a multi-layer neural network to further explore the latent information of the two types of user representations. This can improve the performance of the representations, but may lead to over-fitting. Other articles (Song et al., 2019; Xu et al., 2020; Liao et al., 2022) concatenated the user representations, which handles the difference in the amount of information at a low computational cost. However, in the existing separated graph models, the user representation output of the current layer is the combination of the user representations of the two networks. Whether for the social network or the U-I network, it is redundant to use the combined user representation output of the current layer as the input of the next graph neural network layer. As the number of layers increases, the redundant information continues to accumulate.
Figure 1 Unified graph model and separated graph model of GNN-based social recommendation, where $u$ is a user, $i$ is an item, $h^S_u$ is the user representation of the social network, $h^I_u$ is the user representation of the U-I network, $h^k_u$ is the user representation output of layer $k$, and $h^k_i$ is the item representation output of layer $k$. (A) Unified graph model, which cannot represent the social network and the U-I network separately. (B) Separated graph model. $h^k_u$ is the common user input of the two networks at layer $k+1$. In layer $k+1$, the user representation of the social network contains redundant user-item interaction information, and the user representation of the U-I network contains redundant user-user interaction information. Aggregating the updated redundant user representations $h^{k+1}_u$ causes further redundancy. DOI: 10.7717/peerjcs.1393/fig-1
This article proposes graph neural networks for preference social recommendation (PSR). PSR adopts the separated graph model to fully capture the independent information of the social network and the U-I network. Compared with previous separated graph models, we further separate the updating and combining operations of the user representations of the two networks to avoid information redundancy. Furthermore, few articles note that not all social network relationships contribute to the U-I network. Social networks are noisy, and friends do not necessarily share the same preferences. Therefore, we propose the social preference network to enhance the social network. The main contributions of the article are summarized as follows.
1. A friend influence indicator is proposed. It captures user preferences through user-item interaction information, and then transforms social networks into social preference networks that are more suitable for recommendation systems.
2. A PSR model is proposed. It can effectively avoid information redundancy, and can fully capture the respective information and joint information of the two networks.
3. Two losses in the objective function are used. These two losses are used to preserve the initial connection relationship between nodes and widen the distance between positive and negative samples, respectively.
The rest of the article is organized as follows. Section 2 reviews related work, Section 3 describes the PSR model, Section 4 presents the experiments, and Section 5 gives the conclusions.

RELATED WORK
We propose the social preference network. Its main idea is to use a heterogeneous network to complement the heterogeneous information of a homogeneous network, thus reducing the information difference between the two networks. The idea is applicable to social networks and can be generalized to other networks, such as social opportunistic networks (Liu et al., 2018; Zhang et al., 2019). In addition, we use different graph neural network models to capture information from the social network and the U-I network, respectively. Our approach is based on the separated graph model. The related work is reviewed below.

Graph neural network in social recommendation
Graph neural networks, especially graph convolutional networks, can achieve fast and efficient information aggregation and update through network topology information. GCN (Welling & Kipf, 2016) aggregates neighbor nodes with a degree penalty, realizing convolution on the network. NGCF changes the GCN convolution kernel by adding additional interaction information between nodes and neighbor nodes, and successfully introduces graph convolution into the recommendation system. LightGCN adopts the idea of SGC (Wu et al., 2019a) and removes the nonlinear activation function of NGCF. GraphRec (Fan et al., 2019) uses the attention mechanism, and adds rating embeddings in aggregation to improve node representation. SEPT (Yu et al., 2021a) follows the Deep Graph Infomax (DGI) (Velickovic et al., 2019) model and uses contrastive learning as the loss function to effectively mine the neighborhood information of nodes. Our method uses two modified graph neural network models to update node representations in social networks and U-I networks, respectively.

Influence of friends in social recommendation
Social recommendation is based on the assumption that a user's friends influence the user's preferences. Exploring social relationships is therefore important in social recommendation. DiffNet (Wu et al., 2019b) assumes that a user is influenced equally by all friends. This idea is simple, but it may be unrealistic in practice. GraphRec (Fan et al., 2019), GAT-NSR (Mu et al., 2019) and DGRec (Song et al., 2019) use neural networks to learn the similarity between users and friends, with some success. DANSER (Wu et al., 2019c) learns the weights of social relations with a dual graph attention mechanism to mine the importance of users.
The above separated graph methods fully mine the social relationships of users in social networks. However, users' social relationships are not always positive for item recommendation, that is, friends may have completely different item preferences. One way to solve this problem is to use the unified graph model. For example, DiffNetLG (Song et al., 2021) uses user similarity to complement social relationships. Since the two networks in the unified graph share the same user representation, the added edges can reflect the item preference relationship between users to a certain extent. However, the unified graph model lacks separate representations for social networks and U-I networks. In the separated graph model, a more reasonable method is to mine the influence of users' friends in social networks. The enhanced social recommendation framework (ESRF) (Yu et al., 2020) uses an autoencoder to reconstruct complex and high-order friend influences in networks, and uses the original social network relationships as a constraint to ensure the validity of the obtained user preference relationship network. SEPT (Yu et al., 2021a) mines strongly connected social relationships from the original social network. These social relationships are then used to constrain the preference similarity of the original social network. HOSR (Liu et al., 2020) uses topological information to capture high-order social relationships, so as to mine possibly consistent item preferences between users who are not directly connected. MHCN (Yu et al., 2021b) uses hypergraphs to model high-order relationships among users, and uses multiple channels to construct different hypergraphs to improve robustness. MTRTrust (Mauro, Ardissono & Hu, 2019) introduces additional global user influence information, which is used together with the local influence of users to evaluate the importance of different user preferences.
Our method uses the user's real item preference to mine the user's friend influence to ensure the consistency of social relationships and preference relationships. We constrain influence through original social networks to preserve the original social network information.

GRAPH NEURAL NETWORKS FOR PREFERENCE SOCIAL RECOMMENDATION (PSR)
We propose graph neural networks for preference social recommendation (PSR). The algorithm fully mines users' preferences and the preference relationships between users.

Problem description
In this article, we use two networks: the user-item interaction network (U-I network) $G_I = (U, I, E_I)$ and the user-user interaction network (social network) $G_S = (U, E_S)$, where $U = \{u_1, \dots, u_N\}$ denotes the user nodes, $I = \{i_1, \dots, i_M\}$ denotes the item nodes, $E_I$ and $E_S$ are the edge sets of the two networks, respectively, $N$ is the number of users, and $M$ is the number of items. $p_u$ is the user representation of the social network, and $q_u$ and $q_i$ are the user representation and item representation of the U-I network, respectively. $A_S \in \mathbb{R}^{N \times N}$ is the adjacency matrix of the social network and $A_I \in \mathbb{R}^{N \times M}$ is the rating matrix of the U-I network.
Our goal is to enrich the node representation information in U-I network through the generated social preference network.

Algorithm framework
As shown in Fig. 2, our algorithm framework consists of three parts:
1. Social preference network representation: The social preference network is constructed from the social network and the U-I network. It is represented by $\mathrm{GNN}_1$ with $l$ parameter-sharing layers, and $p^k_u$ is the user representation output of layer $k$.
2. U-I network representation: The U-I network uses $\mathrm{GNN}_2$ with $l$ parameter-sharing layers for user representation and $\mathrm{GNN}_3$ with $l$ parameter-sharing layers for item representation. $q^k_u$ and $q^k_i$ are the corresponding user and item representation outputs of layer $k$.
3. Two losses-based PSR model training: The final representation is the mean of all layer representations. The two losses are used to preserve the original connection relationship and widen the distance between positive and negative samples.
$h^k_u$ is the combined user representation of layer $k$. Instead of $h^k_u$, the framework feeds $p^k_u$ and $q^k_u$ to the next layer of the respective network, which resolves the problem of information redundancy.
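The layer-wise flow described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names `layer_mean` and `propagate`, the per-vector step functions, and the choice to include the layer-0 input in the mean are all assumptions made for the example.

```python
def layer_mean(layers):
    """Element-wise mean over a list of equal-length vectors (one per layer)."""
    return [sum(vals) / len(layers) for vals in zip(*layers)]


def propagate(p0, q0, social_step, ui_step, num_layers=3):
    """Run the two branches separately.

    Each branch feeds its own previous output to its next layer, so the
    combined representation h^k_u is never used as a layer input.
    Including layer 0 in the mean is an assumption of this sketch.
    """
    p_layers, q_layers = [p0], [q0]
    for _ in range(num_layers):
        p_layers.append(social_step(p_layers[-1]))  # social branch sees only p
        q_layers.append(ui_step(q_layers[-1]))      # U-I branch sees only q
    return layer_mean(p_layers), layer_mean(q_layers)


# Toy step functions (hypothetical stand-ins for the real GNN layers)
p_final, q_final = propagate(
    [8.0], [0.0],
    social_step=lambda v: [0.5 * x for x in v],
    ui_step=lambda v: [x + 1.0 for x in v],
)
```

The point of the design is visible in the loop: the two branches never exchange information between layers, and combination happens only once, after the layer means are taken.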

Model description
Our model is a separated graph model. For clarity, we disassemble the whole model into the following three parts.

Social preference network representation
In most cases, friends influence each other, resulting in similar preferences, but this is not always the case. If friends have completely different preferences, recommendations based on social relationships are unreliable. This means that a social network cannot always be used directly in a recommendation system; it needs to be adjusted beforehand.
We use the user preference information of the U-I network to obtain the friend influence in the social network. Analyzing the U-I network, let $H_x$ be the set of items preferred by user $x$ and let $C$ be the common preference matrix of users; the number of common preferences between user $x$ and user $y$ is then $C_{xy} = |H_x \cap H_y|$. By fully considering the preference relationship and preference difference between users, we propose the friend influence indicator $T_{xy}$ in Eq. (1). We use the social network to constrain the indicator to ensure the validity of friend influence. That is, the indicator is only calculated between users who are connected in the social network. The value range of $T_{xy}$ is $[0, 1]$. If $T_{xy} = 0$, the mutual influence is 0, which means that user $x$ and user $y$ have no common preference. If $T_{xy} = 1$, the mutual influence reaches its maximum, which means that the preferences of the two users are highly correlated. In particular, if all friend influence indicator values $T_{x\cdot}$ of user $x$ are 0, user $x$ only has U-I network information and no social network information.
Taking the friend influence indicator as the edge weight of the social network, and removing edges with weight value of 0, a new view is obtained. In this article, we name the view the Social Preference Network.
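The construction of the social preference network can be sketched as follows. The exact form of the friend influence indicator is not reproduced in this text, so the sketch substitutes a cosine-style normalisation, $C_{xy} / \sqrt{|H_x||H_y|}$, purely as a stand-in assumption; it only shares the stated properties (values in $[0, 1]$, zero when there is no common preference). The function names and the toy data are hypothetical.

```python
from math import sqrt


def friend_influence(h_x, h_y):
    """Stand-in for T_xy (assumed cosine-style form, not the paper's formula)."""
    if not h_x or not h_y:
        return 0.0
    common = len(h_x & h_y)  # C_xy = |H_x ∩ H_y|
    return common / sqrt(len(h_x) * len(h_y))


def build_social_preference_network(social_edges, item_sets):
    """Weight each existing social edge by T_xy and drop edges with weight 0."""
    weighted = {}
    for x, y in social_edges:
        t = friend_influence(item_sets.get(x, set()), item_sets.get(y, set()))
        if t > 0:
            weighted[(x, y)] = t
    return weighted


# Toy data (hypothetical): item sets H_x and friendship edges
item_sets = {"a": {1, 2, 3}, "b": {2, 3, 4}, "c": {7, 8}}
social_edges = [("a", "b"), ("a", "c")]
spn = build_social_preference_network(social_edges, item_sets)
```

In the toy data, users a and c are friends but share no items, so their edge is removed; only friendships backed by common preferences survive in the new view.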
Next, we update the user representation in the social preference network. We use friend influence as the aggregation weight of neighbor nodes, and define the update of the representation $p^k_{u_x}$ of user $x$ at layer $k$ in Eq. (2), where $S_x$ is the set of neighbors of user $x$ in the social preference network, $\sigma$ is the tanh activation function, $W_1 \in \mathbb{R}^{d \times d}$ is the weight matrix, and $d$ is the dimension of the hidden layer. In particular, $T_{xy} = 1$ when $y = x$.
We use $D_u = \mathrm{diag}\{D_{u_1}, \dots, D_{u_N}\} \in \mathbb{R}^{N \times N}$ as the diagonal degree matrix, where $D_{u_x} = |H_x|$ is the degree of user $x$ in the U-I network. Combining Eqs. (1) and (2), the matrix formulation of the node update of layer $k$ in the social preference network is given in Eq. (3), where the weight matrix $W_1$ is shared across layers.
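As a rough illustration of the user update in the social preference network, the sketch below aggregates neighbor representations weighted by $T_{xy}$, adds the self term with $T_{xx} = 1$, applies a shared weight matrix, and passes the result through tanh. The update equation itself is not reproduced in this text, so any additional normalisation it may contain is omitted; treating edges as undirected and the names `matvec_row` and `social_layer` are assumptions of this sketch.

```python
from math import tanh


def matvec_row(vec, W):
    """Row vector times matrix: returns vec @ W."""
    return [sum(vec[i] * W[i][j] for i in range(len(vec))) for j in range(len(W[0]))]


def social_layer(p_prev, weights, W1):
    """One layer: tanh((sum over y in S_x and x itself of T_xy * p_y) W1), T_xx = 1."""
    p_next = {}
    for x, p_x in p_prev.items():
        agg = list(p_x)  # self term, T_xx = 1
        for (u, v), t in weights.items():
            # edges are treated as undirected in this sketch
            if u == x:
                nbr = v
            elif v == x:
                nbr = u
            else:
                continue
            agg = [a + t * b for a, b in zip(agg, p_prev[nbr])]
        p_next[x] = [tanh(z) for z in matvec_row(agg, W1)]
    return p_next


# Toy example (hypothetical values); the same W1 would be reused at every layer
p0 = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.5, 0.5]}
T = {("a", "b"): 0.5}
W1 = [[1.0, 0.0], [0.0, 1.0]]
p1 = social_layer(p0, T, W1)
```

User c has no edge in the social preference network, so its update depends only on its own previous representation, which matches the remark that such users carry no social network information.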

U-I network representation
The U-I network directly reflects the user's preference information and can be used directly in the recommendation system. We define the update of the representation $q^k_{u_x}$ of user $x$ at layer $k$ in Eq. (4), where $I_x$ is the set of neighbor items of user $x$, and $D_{u_x}$ and $D_{i_y}$ are the degrees of user $x$ and item $y$ in the U-I network, respectively. To be consistent with the user representation of the social preference network, we use the same activation function $\sigma = \tanh$ and the same dimension for the weight matrix $W_2 \in \mathbb{R}^{d \times d}$.
In order to reduce the number of parameters and the computational complexity, we remove the weight matrix and the nonlinear activation function from the item update. Therefore, we define the update of the representation $q^k_{i_z}$ of item $z$ at layer $k$ in Eq. (5), where $I_z$ is the set of neighbor users of item $z$, and $D_{i_z}$ and $D_{u_y}$ are the degrees of item $z$ and user $y$ in the U-I network, respectively. Combining Eqs. (4), (5) and (6), the matrix formulation of the node update of layer $k$ in the U-I network is given in Eq. (7), where $f$ is the activation function, with $f(x) = \sigma(x)$ for the first $N$ rows and $f(x) = x$ for the last $M$ rows, and the weight matrix $W_2$ is shared across layers.
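The asymmetry between the user update (weight matrix plus tanh) and the item update (plain aggregation) can be illustrated as below. Because the update equations are not reproduced in this text, the sketch assumes a symmetric degree normalisation $1/\sqrt{D_{u_x} D_{i_y}}$, as used in LightGCN-style models, and omits any self term; the function name `ui_layer` and the toy data are hypothetical.

```python
from math import sqrt, tanh


def ui_layer(q_user, q_item, interactions, W2):
    """One U-I layer. interactions is a list of (user, item) pairs."""
    deg_u, deg_i = {}, {}
    for u, i in interactions:
        deg_u[u] = deg_u.get(u, 0) + 1
        deg_i[i] = deg_i.get(i, 0) + 1
    d = len(W2)
    agg_u = {u: [0.0] * d for u in q_user}
    agg_i = {i: [0.0] * d for i in q_item}
    for u, i in interactions:
        norm = 1.0 / sqrt(deg_u[u] * deg_i[i])  # assumed normalisation
        agg_u[u] = [a + norm * b for a, b in zip(agg_u[u], q_item[i])]
        agg_i[i] = [a + norm * b for a, b in zip(agg_i[i], q_user[u])]
    # Users: linear map with W2, then tanh. Items: aggregation only.
    new_user = {
        u: [tanh(sum(v[r] * W2[r][c] for r in range(d))) for c in range(d)]
        for u, v in agg_u.items()
    }
    return new_user, agg_i


# Toy example (hypothetical values)
q_user = {"u1": [1.0, 0.0]}
q_item = {"i1": [0.0, 2.0], "i2": [2.0, 0.0]}
interactions = [("u1", "i1"), ("u1", "i2")]
W2 = [[1.0, 0.0], [0.0, 1.0]]
new_user, new_item = ui_layer(q_user, q_item, interactions, W2)
```

Only the user branch carries trainable parameters here, which is where the reduction in parameters and computation described above comes from.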

Two losses-based PSR model training
We use a fully connected layer to process the user representations of the social preference network and the U-I network to obtain the final user representation $h_u$ (Eq. (8)), where $\|$ denotes concatenation and $W_3 \in \mathbb{R}^{2d \times d}$ is the weight matrix. The final item representation $h_i$ is given in Eq. (9). We use the inner product $y_{xz} = h_{u_x} \cdot h_{i_z}$ as the predicted rating of user $x$ for item $z$. The model is trained by minimizing the Bayesian Personalized Ranking (BPR) loss (Rendle et al., 2009) and the Global Orthogonal Regularization (GOR) loss (Zhang et al., 2017). The BPR loss is used to preserve the original connection relationship and is given in Eq. (10), in which the layer-0 user representation is randomly initialized and $q^0_i$ is the randomly initialized item representation. To keep consistent with the comparison algorithm (Liao et al., 2022), $\lambda$ is set to 1e-4.
The GOR loss is used to widen the distance between positive and negative samples and is given in Eq. (11), where $N$ is the number of negative samples. To make the activation function $\varphi$ smoother, we use Softplus instead of the ReLU used in the original article. Through Eqs. (10) and (11), the final loss function is obtained as Eq. (12), where $\alpha$ is a hyperparameter used to balance the two losses.
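To make the two-loss objective concrete, the following sketch scores user-item pairs with the inner product and combines a BPR term with a GOR-style term. Since the loss equations are not reproduced in this text, several points are assumptions: the BPR term omits the $\lambda$-weighted regularisation, the GOR term follows the form from Zhang et al. (2017) (squared first moment plus a penalty on the second moment minus $1/d$) with Softplus replacing the hinge, it is computed on unit-normalised user and negative-item vectors, and the two losses are combined additively with weight $\alpha$. All function names are hypothetical.

```python
from math import exp, log, sqrt


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softplus(x):
    """Numerically stable log(1 + e^x)."""
    return max(x, 0.0) + log(1.0 + exp(-abs(x)))


def unit(v):
    n = sqrt(dot(v, v))
    return [x / n for x in v] if n > 0 else list(v)


def bpr_loss(users, pos_items, neg_items):
    """Mean of -ln sigmoid(y_pos - y_neg); regularisation term omitted."""
    total = 0.0
    for h_u, h_p, h_n in zip(users, pos_items, neg_items):
        # -ln sigmoid(z) equals softplus(-z)
        total += softplus(-(dot(h_u, h_p) - dot(h_u, h_n)))
    return total / len(users)


def gor_loss(users, neg_items):
    """Assumed GOR form: M1^2 + softplus(M2 - 1/d) over negative pairs."""
    d = len(users[0])
    sims = [dot(unit(u), unit(n)) for u, n in zip(users, neg_items)]
    m1 = sum(sims) / len(sims)
    m2 = sum(s * s for s in sims) / len(sims)
    return m1 * m1 + softplus(m2 - 1.0 / d)


def psr_loss(users, pos_items, neg_items, alpha=5.0):
    """Assumed additive combination of the two losses."""
    return bpr_loss(users, pos_items, neg_items) + alpha * gor_loss(users, neg_items)


# Toy example (hypothetical values)
users = [[1.0, 0.0]]
pos = [[1.0, 0.0]]
neg = [[0.0, 1.0]]
loss = psr_loss(users, pos, neg)
```

Under these assumptions the GOR-style term is smallest when user vectors and negative-item vectors are close to orthogonal, which is one way to read "widening the distance" between positive and negative samples.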

Model implementation steps
In summary, PSR introduces a social preference network that transforms the friend relationships in the social network into item preference relationships. In addition, PSR uses a social recommendation model that effectively reduces information redundancy, and uses two losses to constrain the obtained node representations. The specific implementation process of the model is shown in Algorithm 1.

Computational complexity
We analyze the space and time complexity of PSR, and add LightGCN and SocialLGN (Liao et al., 2022) for comparison.

Space complexity
In PSR, there are two parts of trainable parameters: (i) the initial representations of the nodes, and (ii) the weight matrices of the neural networks. For (i), the space complexity is $(N + M)d$, which is consistent with most neural network models (e.g., LightGCN and SocialLGN). For (ii), PSR uses three weight matrices $W_1 \in \mathbb{R}^{d \times d}$, $W_2 \in \mathbb{R}^{d \times d}$ and $W_3 \in \mathbb{R}^{2d \times d}$. Since each weight matrix is shared among layers, the space complexity of PSR in this part is $4d^2$.
In summary, the total space complexity of PSR is $(N + M + 4d)d$, which is consistent with SocialLGN. Since $\min(N, M) \gg d$, the $4d$ term can be ignored, which means that the space complexity of PSR is also approximately equal to that of LightGCN.

Time complexity
Similar to most graph convolution kernels, the friend influence indicator can be calculated as preprocessing. For a single-layer neural network, considering the sparsity of the network, the time complexity of node aggregation is $O(|E_S|d)$ for the social network and $O(|E_I|d)$ for the U-I network, and the time complexity of the graph diffusion operation through the weight matrices is $O(4Nd^2)$. Therefore, the total time complexity of PSR is $O(|E_S|dl + |E_I|dl + 4Nd^2l)$, which is linear in $\max(|E_S|, |E_I|, N)$. It is consistent with SocialLGN, which means that it is lower than that of most existing GNN-based social recommendation models (SocialLGN is a light GNN-based model).
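The two complexity expressions can be checked with a few lines of arithmetic. The helper names and the example sizes below are hypothetical and are not taken from the datasets used in the experiments.

```python
def psr_param_count(n_users, n_items, d):
    """Trainable parameters: node embeddings plus W1 (d x d), W2 (d x d), W3 (2d x d)."""
    embeddings = (n_users + n_items) * d
    weights = d * d + d * d + 2 * d * d
    return embeddings + weights


def psr_time_cost(e_s, e_i, n_users, d, num_layers):
    """Operation count matching O(|E_S| d l + |E_I| d l + 4 N d^2 l)."""
    return e_s * d * num_layers + e_i * d * num_layers + 4 * n_users * d * d * num_layers


# Hypothetical sizes, for illustration only
n_users, n_items, d = 1000, 2000, 64
total = psr_param_count(n_users, n_items, d)
weight_share = 4 * d * d / total  # fraction of parameters held by the weight matrices
```

With these example sizes the three weight matrices account for under a tenth of the parameters, and the share shrinks further as the number of users and items grows, which is the sense in which the $4d$ term can be ignored.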

Datasets
We use LastFM (Yu et al., 2021a; Yu et al., 2021b) and Ciao (Fan et al., 2019; Fan et al., 2020) to analyze the performance of the model. Both are real-world datasets that are often used in recommendation systems. LastFM is a music dataset that includes friend relationships and users' music preferences. Ciao is an online shopping dataset that includes friend relationships and users' shopping information. The dataset statistics are shown in Table 1.

Comparison algorithms
We compare PSR with some well-known methods to verify the performance of the model. The comparison algorithms are as follows.
- BPR (Rendle et al., 2009): A non-graph neural network model which ranks items by maximizing the posterior probability.
- SBPR (Zhao, McAuley & King, 2014): The first model to introduce social relationships into recommender systems. It sorts the items according to the user's preference, the user's friends' preferences, and the remaining preference.
- DiffNet (Wu et al., 2019b): It treats the influence of all friends equally, and updates user and item information by accumulation.
- NGCF: A social recommendation model which only aggregates neighborhood information without aggregating central node information.
- LightGCN: It removes the weight matrix and nonlinear activation function in NGCF.
- SocialLGN (Liao et al., 2022): It designs a graph fusion component for user update, and removes the nonlinear activation function and weight matrix for item update.

Parameter settings
For better comparison, we align with the experimental settings of the current SOTA model (SocialLGN). We take 80% of the data as the training set. We set the random seed to 2020, the representation dimension to 64, the number of neural network layers $l$ to 3, $\lambda$ to 1e-4, and the initial learning rate to 1e-3, and use Adam as the optimizer. For the new hyperparameter $\alpha$ in PSR, we choose from $\{0, 1, \dots, 10\}$. To reduce the influence of the hyperparameter $\alpha$, we choose $\alpha = 5$ by default.
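For reference, the settings listed above can be collected in a single configuration object. The dictionary name and key names below are hypothetical; only the values come from the text.

```python
# Hypothetical configuration dictionary; values follow the stated settings.
PSR_CONFIG = {
    "train_ratio": 0.8,               # 80% of the data used for training
    "seed": 2020,                     # random seed
    "embedding_dim": 64,              # representation dimension d
    "num_layers": 3,                  # number of GNN layers l
    "lam": 1e-4,                      # lambda in the BPR loss
    "learning_rate": 1e-3,            # initial learning rate
    "optimizer": "Adam",
    "alpha": 5,                       # default weight balancing the two losses
    "alpha_grid": list(range(0, 11)), # candidate values {0, 1, ..., 10}
}
```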

Evaluation indicators
We use three mainstream evaluation indicators (Wu et al., 2020;Liao et al., 2022), namely Precision@K, Recall@K, and NDCG@K, to evaluate the recommendation performance of top-K ranking.
Precision@K indicates the probability of correct prediction in the predicted positive sample set: $\mathrm{Precision@}K = \frac{TP}{TP + FP}$.
Recall@K indicates the probability of correct prediction in the real positive sample set: $\mathrm{Recall@}K = \frac{TP}{TP + FN}$.
For the connection relationship of the U-I network, $TP$ is the number of predicted connected edges which are actually connected, $FP$ is the number of predicted connected edges which are actually disconnected, and $FN$ is the number of predicted disconnected edges which are actually connected.
NDCG@K additionally considers the ranking order of the prediction results. It is computed as the ratio of the discounted cumulative gain DCG@K of the predicted ranking to that of the ideal ranking, where $rel_i$ is the relevance score of the item at position $i$ and $|REL|$ refers to the ranking result sorted by relevance.
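A minimal sketch of the three indicators for a single user is given below. It assumes binary relevance (an item is either in the user's test set or not), a $\log_2$ position discount for NDCG, and a ranked list with at least K entries; these choices and the function names are assumptions of the sketch, not details confirmed by the text.

```python
from math import log2


def precision_recall_at_k(ranked, relevant, k):
    """Precision@K = TP / K and Recall@K = TP / |relevant| for one user."""
    hits = sum(1 for item in ranked[:k] if item in relevant)  # TP
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def ndcg_at_k(ranked, relevant, k):
    """NDCG@K with binary relevance and a log2 position discount."""
    dcg = sum(
        1.0 / log2(pos + 2)
        for pos, item in enumerate(ranked[:k])
        if item in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / log2(pos + 2) for pos in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


# Toy example (hypothetical item ids)
ranked = [3, 1, 7, 9]
relevant = {1, 9, 5}
precision, recall = precision_recall_at_k(ranked, relevant, 3)
ndcg = ndcg_at_k(ranked, relevant, 3)
```

In practice these per-user values would be averaged over all test users for each K.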

Recommendation performance evaluation
PSR is compared with six well-known algorithms on the LastFM and Ciao datasets. We use Precision@K, Recall@K and NDCG@K to evaluate the recommendation performance, where $K \in \{10, 20\}$. The results are shown in Table 2. Considering the importance of the cold-start problem in recommendation systems, we also run the experiments under cold start. Cold start refers to personalized recommendation for new users. Table 3 shows the experimental results under cold start.
Tables 2 and 3 show that PSR obtains the best results for 11/12 indicators on the LastFM dataset. In particular, in the cold-start experiment, PSR improves by an average of 21.8% compared to SocialLGN. On the Ciao dataset, PSR obtains the best results for 9/12 indicators. From Table 1, it can be seen that the social network density of the Ciao dataset is low, which means that a lot of social information may be missing. For the PSR model, the generation of the social preference network is constrained by the social network. Therefore, the performance improvement of the model on the Ciao dataset is lower than that on the LastFM dataset. Next, we further analyze the evaluation indicators. Under the Precision index, PSR obtains the best performance in all cases, which indicates that PSR has the highest prediction accuracy for user preferences. Under the Recall index, PSR obtains the best performance in 6/8 cases, which indicates that PSR recovers the largest share of users' real preferences overall. Similarly, under the NDCG index, PSR achieves the best performance in 6/8 cases, which shows that PSR is able to rank the preferred items well. When $K = 10$, PSR obtains 9/12 best results, while when $K = 20$, PSR obtains 11/12 best results. This shows that PSR is relatively more suitable for recommending longer lists of items.

Parameter sensitivity analysis
There are two core hyperparameters in PSR: (i) the number of neural network layers $l$, and (ii) the value of the hyperparameter $\alpha$. We conduct experiments on these two hyperparameters to analyze the parameter sensitivity of PSR. Figure 3 shows the parameter sensitivity experiment for $l$, and Fig. 4 shows the parameter sensitivity experiment for $\alpha$, where CS means cold start. From Fig. 3, it can be seen that, as the number of neural network layers increases, the overall performance of PSR tends to rise first and then decline. When $l = 3$, PSR achieves the best overall performance. When $l > 3$, PSR suffers from oversmoothing (Welling & Kipf, 2016), which leads to performance degradation. Compared to the Ciao dataset, the performance on the LastFM dataset is more variable. We think the possible reason is that the LastFM dataset is composed of multiple disconnected sub-networks. The different sub-networks have different structures, resulting in different speeds of oversmoothing. In contrast, the network composed of the Ciao dataset is a connected graph, making the speed of oversmoothing relatively consistent. Therefore, the LastFM dataset is more variable under oversmoothing. Compared with the general recommendation performance, the cold-start recommendation performance is more affected by oversmoothing. We think the possible reason is that the cold-start user representations rely entirely on the representations of connected users in the social network, which makes them more sensitive. From Fig. 4, it can be seen that PSR is less sensitive to the hyperparameter $\alpha$. In Section 4.2, in order to obtain the best overall performance, we finally choose $\alpha = 5$.

Ablation experiment
We analyze each part of PSR through ablation experiments. Table 4 shows the performance of the PSR model under different ablation experiments, where PSR-BPR is the model without the BPR loss, PSR-GOR is the model without the GOR loss, PSR-Pre is the model without the social preference network, PSR-each is the model that replaces the respective user representations with the redundant combined user representation as the next layer input, PSR-item is the model that adds a weight matrix and tanh activation function to the item update, PSR-cat is the model that changes the concatenate operation in Eq. (8) to addition, and PSR-output is the model that only uses the output layer as the node representation. PSR obtains 11/12 optimal results, demonstrating the necessity of each component of PSR. Similar to the analysis in Section 4.2, we believe that the Ciao dataset lacks a lot of social information, which reduces the information gained by transforming the social network into the social preference network. Therefore, PSR-Pre achieves two optimal results on the Ciao dataset. The experimental results of PSR-BPR and PSR-GOR demonstrate the necessity of the BPR loss and the GOR loss. The experimental results of PSR-each show that the redundant user representations used by previous articles based on separated graph models damage the final recommendation performance. PSR-output uses the representation of the last layer, and PSR uses the average of the representations over different layers. Both can capture information of different orders. However, PSR is able to capture richer information about the network structure than PSR-output. Moreover, the average operation of PSR is equivalent to reducing the weight of high-order information and increasing the weight of low-order information. This implies the hypothesis that the closer the information is to the node, the more important it is to the node.
In addition, the average operation can, to some extent, alleviate the possible oversmoothing of the last layer representation. The experimental results also show that the recommendation performance of PSR-output is lower than that of PSR. Compared with the addition operation used by PSR-cat, the concatenate operation used by PSR is better. This indicates that a large information difference exists between the user representations of the social network and the U-I network. Compared with PSR-item, PSR has fewer parameters and lower computational complexity, but obtains better performance. We try to analyze the possible reasons for this. PSR-item is the model that adds a weight matrix and a nonlinear activation function to the items in the U-I network. Compared with PSR, PSR-item can better fit the relationship between user representations and item representations in the U-I network, which should improve recommendation performance within that network. However, in social recommendation, we fit the relationship between item representations and user representations that carry additional social network information. Therefore, item representations that over-fit the U-I network information may harm the final social recommendation performance.

CONCLUSION
In this article, we propose an approach called graph neural networks for preference social recommendation (PSR). The approach introduces the social preference network, which is used to solve the problem of inconsistency between friend relations and preference relations. Next, PSR uses a separated graph model. By independently updating the social network and the U-I network, it reduces information redundancy and fully captures the information of each network. Finally, PSR uses two losses to preserve the original connection relationship and widen the distance between positive and negative samples, respectively. Experimental results show that PSR performs well in social recommendation, especially under cold start. Our approach provides an initial exploration of preference relations in social networks, and it may be affected by the sparsity of the social network. In the future, we will focus on social networks with a large amount of missing information, and further mine users' preference relationships to generate a more suitable social preference network, so as to improve the performance of social recommendation.

ADDITIONAL INFORMATION AND DECLARATIONS

Funding
This work was supported by the National Natural Science Foundation of China under Grant 62176236. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.