Sign Prediction on Unlabeled Social Networks Using Branch and Bound Optimized Transfer Learning

The sign prediction problem aims to predict the signs of links in signed networks, and it is widely used in a variety of applications. Because labeled data are often insufficient, transfer learning has been adopted to leverage auxiliary data to improve sign prediction in the target domain. Existing works suffer from two limitations. First, they cannot work if no target label is available. Second, their generalization performance is not guaranteed because the solution of their objective functions is not a global optimal solution. To solve these problems, we propose a novel sign prediction model, sign prediction on unlabeled social networks using branch and bound optimized transfer learning (SP BBTL). The main idea of SP BBTL is to use target feature vectors to reconstruct source domain feature vectors based on relationship projection. This reconstruction is a complicated optimization problem, which is solved by the proposed branch and bound based optimization that can obtain the global optimal solution. With this design, target domain label information is not required to train the classifier. Finally, experimental results on large scale signed social networks validate the superiority of the proposed model.


Introduction
Sign prediction predicts signs for links of signed networks, in which signed networks are networks whose edges have signs representing the relationship between nodes. The sign of a link is either positive or negative. A link with a positive sign is also called a positive link, which means the two end nodes of this link trust or like each other. A link with a negative sign is also called a negative link, which means the two end nodes of this link distrust or dislike each other. Compared with unsigned networks which only contain values representing the existence of links, signed networks contain more valuable node relationship information. Because of this rich preserved information, signed networks have been widely used in many applications, such as recommender systems [1,2] and community detection [3,4].
Most link prediction methods for signed networks are supervised or semisupervised, so it is difficult for them to predict unlabeled target networks without any prior target label information. For link prediction methods that use transfer learning or ensemble learning, there is non-negligible knowledge loss in knowledge transfer or domain mapping. The main challenge of link prediction in signed networks is data insufficiency, which is the motivation for using an auxiliary labeled network to predict signs for an unlabeled target network. Signs are manually labeled by experts, which is time consuming and expensive. This leads to an insufficient number of labeled signs in real applications. Transfer learning, which is able to transfer knowledge from other domains to assist sign prediction, has therefore been used to address this problem [5,6]. The domain containing the signs to be predicted is called the target domain, and the domain whose knowledge is transferred to the target domain is called the source domain.
Though transfer learning based sign prediction methods perform well on labeled signed networks, they are unable to predict signs on unlabeled signed networks. The methods in [7][8][9] map the feature vectors of both the source domain and the target domain into a high dimensional space to obtain the common knowledge as the transferable knowledge. These mapping approaches lose knowledge during computation in the high dimensional space, such as a reproducing kernel Hilbert space, and they also lose knowledge in the inverse transformation when calculating the reconstruction errors. Existing methods need a number of labeled signs of the target domain to train the sign classifier. Labeled signs of the target domain are used together with the labeled signs of the source domain to map feature vectors of both domains. These feature vectors are mapped to a common feature vector space, and the mapped feature vectors are called the common knowledge for transferring. The common knowledge along with the labeled signs of both domains is then used to train the sign classifier. However, in real applications, it is sometimes impossible to get any labeled signs of the target domain, which makes it impossible for the existing methods [10] to map both domains for the common knowledge.
In addition, the sign prediction performance of existing works needs further improvement, since the optimization of existing objective functions always leads to local optimal solutions or ill-conditioned solutions. Transferring knowledge between different domains is a complicated process, so the objective functions of sign prediction are usually nonconvex. Most existing works like [11] use gradient descent algorithms to optimize their nonconvex objective functions to predict signs. However, since the integration length of the gradient descent algorithms in these works is fixed, it is not always possible for the optimal solution to be selected as the final solution of the objective function. As a result, these works obtain only a local optimal solution of their objective functions, and their sign prediction performance is not stable. Some other existing works [12][13][14] optimize their objective functions with least angle regression methods or iterations. However, when computing the analytic solution of the objective function, the error of the mapping transformations is usually large with these optimization methods [15][16][17]. This leads to a great loss of the common knowledge used for knowledge transfer.
To solve the problems of existing works, we propose a novel sign prediction model using branch and bound optimized transfer learning (SP BBTL). SP BBTL differs from existing works [18][19][20], which rely on the target labels to establish the relationship between the source domain and the target domain. Instead, SP BBTL establishes a direct projection from source feature vectors to target feature vectors and uses the reconstruction errors as the relationship between the source domain and the target domain. Direct mapping preserves more of the original and specific information and knowledge in the source domain, and this is enough to train the classifier and predict target labels without any prior information about target labels. Besides, SP BBTL adopts branch and bound optimization to calculate the global optimal solution. Specifically, the proposed model optimizes the objective functions via branch and bound (BB), which can obtain the global optimal solution by ensuring the bounds of solutions. BB can also be applied to many combinatorial optimization problems.
SP BBTL has three main advantages. First, it does not require any sign labels in the target domain because of the feature vector mapping. Second, the BB based model can be used to compute the global optimal solution of a nonconvex mixed optimization problem with feature vectors in social networks. Third, the proposed method performs well on imbalanced networks compared with existing works, because SP BBTL obtains the global optimal solution during source feature vector reconstruction, which preserves more complete and original transferable knowledge from the source domain.
The rest of this paper is organized as follows. Section 2 gives a brief review of the related works; Section 3 presents the details of the proposed method; Section 4 demonstrates the experimental results; Section 5 concludes this paper and points out the future works.

Related Works
In this paper, we propose a novel sign prediction method based on transfer learning. Thus, the related works are mainly separated into two parts: sign prediction and transfer learning.

Sign Prediction.
There are mainly three categories of sign prediction approaches. The first type constructs a non-Bayesian model based on a set of vertex attributes. The second type derives the joint probability of each sample based on probabilistic knowledge. The third type leverages linear algebra methods to calculate the similarities between network nodes based on rank-reduced similarity matrices. The methods in [7-9, 21, 22] are supervised and require a sufficient number of training samples to construct the sign prediction model. All of the existing approaches require some prior knowledge to train classifiers, yet obtaining this prior knowledge is expensive in real applications. Besides, many sign prediction problems face the class imbalance problem in practice. The method in [23] is not suitable for the class imbalance problem, while [24] utilizes the adjacency matrix and the Laplacian matrix to train and test the classifier. However, the objective functions in this approach are optimized by iterative computation, which generates considerable error in the course of calculation.
In addition to model design, another focus of sign prediction is the extraction of useful feature vectors to construct the sign prediction model. There are mainly two types of features: vertex features and edge features. Vertex features consist of neighborhood node based features, path based features, Katz value [25,26], cluster coefficient scores, etc. Edge features are actually the features of a pair of nodes, which mainly include kernel features conjunction, extended graph formation, and generic SimRank.

Transfer Learning.
In real social networks, it is very hard or expensive to obtain the labels for the target problem, which results in the insufficiency of available data. To solve this problem, transfer learning has been adopted in the sign prediction problem, which tries to utilize the knowledge from the source domain to predict the signs in the target domain. Current transfer learning based sign prediction approaches can be divided into three types: transferring knowledge of instances, transferring knowledge of parameters, and transferring knowledge of feature representations [27]. The method in [28] belongs to the instance based methods, but it cannot work without target labels. The approaches of transferring knowledge of parameters [29] assume that the model parameters of related learning tasks can always be shared, which rarely holds for real networks.
To deal with the data insufficiency problem, there are some unsupervised transfer learning approaches [12,30,31], which do not require any sign label in the target domain. But they require designing a pivot feature for the model to achieve good performance. Unfortunately, the design of the pivot feature is usually very challenging. Another related approach, SO [32], showed good performance on regular datasets, yet it suffers a substantial performance degradation on class imbalanced datasets.
In general, the analysis of related work shows that traditional sign prediction methods require a sufficient number of sign labels for training. To alleviate this, transfer learning based approaches have been proposed, yet most of these approaches still need some sign labels in the target domain. The existing unsupervised sign prediction approaches based on transfer learning do not need any sign labels in the target domain, but they are usually designed to solve a specific sign prediction problem and are hard to use as a universal solution. Therefore, a novel transfer learning based approach for sign prediction is required. In this paper, we propose a novel sign prediction model named sign prediction on unlabeled social networks using branch and bound optimized transfer learning (SP BBTL). The detailed introduction of SP BBTL is presented in the next section.

Problem Definition.
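A signed network can be represented as a directed graph $G = (V, E, Y)$, where $V$ represents the nodes, $E$ represents the edges, and $Y$ is the sign of $E$. The edge pointing from $V_i$ to $V_j$ is denoted by $E_{ij} \in E$, and its sign is $Y_{ij} \in Y$. An adjacency matrix $A$ is used to describe the connection of $G$, in which $A_{ij}$ is the connection between $V_i$ and $V_j$, here $A_{ij} = Y_{ij}$.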
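As a small illustration of this representation, the sketch below builds a signed adjacency matrix from an edge list. It assumes that the entry for an edge from node i to node j stores the sign of that edge and that 0 marks the absence of an edge; the function name `signed_adjacency` and the toy edge list are hypothetical and not part of the original work.

```python
import numpy as np

def signed_adjacency(n_nodes, signed_edges):
    """Build a signed adjacency matrix from (source, target, sign) triples.

    Assumption: A[i, j] stores the sign of the directed edge i -> j,
    and 0 marks the absence of an edge.
    """
    A = np.zeros((n_nodes, n_nodes), dtype=int)
    for i, j, sign in signed_edges:
        if sign not in (-1, 1):
            raise ValueError(f"sign must be +1 or -1, got {sign}")
        A[i, j] = sign
    return A

# Hypothetical toy network: three nodes, four directed signed edges.
toy_edges = [(0, 1, 1), (0, 2, -1), (1, 2, -1), (2, 0, 1)]
A = signed_adjacency(3, toy_edges)
```

Because the graph is directed, the matrix is generally not symmetric: here node 0 trusts node 1, but no edge is recorded from node 1 back to node 0.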
Sign prediction predicts $Y_{ij}$ for $E_{ij}$ ($E_{ij} \in E$) of $G$. To predict signs of links, a feature vector $F$ is extracted from $A$ to describe $E$. $F$ is used to train the sign classifier and then to predict signs of target links. Link prediction is a learning task that predicts whether a link exists in a labeled or unlabeled network. Sign prediction is a learning task that predicts the signs of links, which are also called the labels or weights of links. Labels used in this work consist of the positive label and the negative label. In this work, predicting links of a network is the same as predicting the labels of links.
The framework of SP BBTL is illustrated in Figure 1, where $S$ denotes the source domain and $T$ denotes the target domain, with signed networks $G_S$ and $G_T$, feature vectors $F_S$ and $F_T$, and signs $Y_S$ and $Y_T$, respectively. SP BBTL first constructs a sign classifier based on the knowledge of $S$, i.e., $F_S$ and $Y_S$. SP BBTL then discovers the relationship between $F_T$ and $F_S$. This relationship projects $F_T$ to $F_S$ and generates a new representation $F_{T2S}$, which is used to establish the relationship between $G_S$ and $G_T$. $F_{T2S}$ is then used as the input of the trained sign classifier to get the output: the predicted signs $Y_{T2S}$.
The key step of SP BBTL is to achieve domain adaptation from $G_T$ to $G_S$ and to establish a mapping from $F_T$ to $F_S$:
$$F_{T2S} = H(F_T), \tag{1}$$
where $F_{T2S}$ is the projection of $F_T$ into $F_S$, and $H$ is the mapping function. The mapping in (1) should maximize the similarity while minimizing the difference between $G_S$ and $G_T$. The detailed architecture of the proposed SP BBTL model is shown in Figure 2, in which the grey rectangles represent the functional modules and the white rectangles represent the data. The inputs of SP BBTL are the adjacency matrices of $T$ and $S$. The output is the predicted signs $Y$ of $T$. The feature vector extraction module extracts $F_T$ and $F_S$ from the input matrices. A branch and bound optimization based domain reconstruction module is proposed to establish the mapping from $F_T$ to $F_S$. The reconstructed feature vectors $F_{T2S}$ are then used as the input of the trained sign classifier to predict the signs of $T$.

Feature Vectors Extraction.
As shown in Figure 2, given the adjacency matrices of the source domain and the target domain, feature vectors $F_S$ and $F_T$ are first extracted to describe the links of $S$ and $T$ for link prediction. In this work, five features are extracted for each feature vector. These features are link positive outdegree, link negative outdegree, link positive indegree, link negative indegree, and link embeddedness [33].
Link positive outdegree $d_{out}^{+}(V_i)$ denotes the number of positive edges pointing from $V_i$ to other nodes. $d_{out}^{+}(V_i)$ reflects the likelihood that $V_i$ gives a positive sign to a connected link. The higher the value of $d_{out}^{+}(V_i)$, the more probable it is that $Y_{ij} = 1$. Link negative outdegree $d_{out}^{-}(V_i)$ is the number of negative edges pointing from $V_i$ to other nodes. $d_{out}^{-}(V_i)$ reflects the likelihood that $V_i$ gives a negative sign to a connected link. The higher the value of $d_{out}^{-}(V_i)$, the more probable it is that $Y_{ij} = -1$. Link positive indegree $d_{in}^{+}(V_j)$ is the number of positive edges pointing to $V_j$. $d_{in}^{+}(V_j)$ reflects the likelihood that $V_j$ gets a positive sign from a connected link. The higher the value of $d_{in}^{+}(V_j)$, the more probable it is that $Y_{ij} = 1$. Link negative indegree $d_{in}^{-}(V_j)$ is the number of negative edges pointing to $V_j$. $d_{in}^{-}(V_j)$ reflects the likelihood that $V_j$ gets a negative sign from a connected link. The higher the value of $d_{in}^{-}(V_j)$, the more probable it is that $Y_{ij} = -1$. Link embeddedness $em(E_{ij})$ is the number of common neighbors of $V_i$ and $V_j$. The link embeddedness of each edge (or link) captures the essential relationship among its neighbor nodes, which reflects the structural feature of a subnetwork within the whole network. $em(E_{ij})$ also represents the structural balance of $E_{ij}$: according to structural balance theory, the higher the value of $em(E_{ij})$, the more probable it is that a positive relationship between $V_i$ and $V_j$ exists, i.e., that $Y_{ij} = 1$.
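The five features described above can be computed directly from a signed adjacency matrix. The following is a minimal sketch, not the authors' implementation. It assumes that rows index source nodes and columns index target nodes, and that a neighbor is any node linked in either direction regardless of sign; the helper name `link_features` and the toy matrix are hypothetical.

```python
import numpy as np

def link_features(A, i, j):
    """Five-dimensional feature vector for the link i -> j.

    Assumptions: A[u, v] is the sign of edge u -> v (0 if absent), and a
    neighbor is any node connected in either direction, whatever the sign.
    """
    pos_out = int(np.sum(A[i, :] > 0))   # positive edges leaving node i
    neg_out = int(np.sum(A[i, :] < 0))   # negative edges leaving node i
    pos_in = int(np.sum(A[:, j] > 0))    # positive edges entering node j
    neg_in = int(np.sum(A[:, j] < 0))    # negative edges entering node j
    linked = (A != 0) | (A.T != 0)       # undirected neighborhood mask
    common = linked[i] & linked[j]       # nodes adjacent to both i and j
    common[[i, j]] = False               # the endpoints do not count
    return np.array([pos_out, neg_out, pos_in, neg_in, int(common.sum())])

# Toy network: 0 -> 1 (+), 0 -> 2 (-), 1 -> 2 (-), 2 -> 0 (+).
A_toy = np.array([[0, 1, -1],
                  [0, 0, -1],
                  [1, 0, 0]])
features_01 = link_features(A_toy, 0, 1)
```

In this sketch the degree counts include the link i -> j itself when its sign is stored in the matrix. For a link whose sign is to be predicted, that entry would have to be masked out first, which is a design choice the text does not specify.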

Branch and Bound Optimized Domain Reconstruction.
Domain reconstruction is the key part of the SP BBTL model. It builds up the latent relationship between source feature vectors and target feature vectors collaboratively. Reconstructing the domain from $F_T$ to $F_S$ can be represented as
$$F_S = F_T X, \tag{2}$$
where $X$ is the solution of (2). In essence, $X$ is the bridge that enables the collaborative use of knowledge in $S$ and $T$. $X$ is denoted by $F_{T2S}$ in Figure 1, and (2) can also be written as
$$F_S = F_T F_{T2S}. \tag{3}$$
A branch and bound based method is proposed for solving (3) with the minimum global error to get a global optimal solution. With minimum error, (3) can be rewritten as
$$X^{*} = \arg\min_{X} \left\| F_S - F_T X \right\|_1, \tag{4}$$
where $X^{*}$ is the optimum solution of (3). The matrix 1-norm sums the absolute values of the entries along each column and selects the maximum column sum. If the sum of each column of the residual $F_S - F_T X$ is minimized, the reconstruction error is minimized. This ensures that the divergence of the transfer learning task is minimized.
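To make the 1-norm reconstruction error concrete, the sketch below evaluates it for a given candidate solution. It assumes that feature vectors are stored as matrix columns so that the product of the target features and X has the same shape as the source features; the function name and the toy matrices are hypothetical.

```python
import numpy as np

def reconstruction_error(F_S, F_T, X):
    """Matrix 1-norm of the residual F_S - F_T X (maximum absolute column sum)."""
    residual = F_S - F_T @ X
    return float(np.abs(residual).sum(axis=0).max())

# Hypothetical toy matrices, chosen only so the arithmetic is easy to follow.
F_T_toy = np.array([[1.0, 0.0],
                    [0.0, 1.0]])
X_toy = np.array([[1.0, 2.0],
                  [3.0, 4.0]])
F_S_toy = np.array([[1.0, 2.5],
                    [2.0, 5.0]])
error = reconstruction_error(F_S_toy, F_T_toy, X_toy)
```

Here the residual columns have absolute sums 1.0 and 1.5, so the error is 1.5. The same value is returned by `np.linalg.norm(residual, 1)`, which implements the induced matrix 1-norm.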
To minimize the error of (4), motivated by the idea of sparse coding, the constraint $g$ is set to be $g: \|X_i\|_0 \le \lambda,\ i = 1, 2, \ldots, n$, where $X_i$ is the $i$-th column of $X$ and $\lambda$ bounds the number of nonzero elements in each column of $X$. $g$ controls the sparsity of $F_S$ for the reconstruction. This constraint ensures that the nonzero elements corresponding to the selected samples are neighbors of $X$.
Calculating (2) amounts to solving a mixed optimization problem. However, existing methods can only calculate a local optimal solution and cannot obtain the global optimal solution of (2). Therefore, the branch and bound method is proposed to calculate the optimal solution. Branch and bound is a general search algorithm that combines searching and iterating. Specifically, it compares the gap between the upper bound and the lower bound of the solution with a given error tolerance, and then it adjusts the upper bound and the lower bound according to the result of this comparison. This method controls the complexity of the solution vectors at different network scales via the constraint parameter of $g$. It can calculate the global optimal solution of the mixed optimization problem [34]. The details are shown in Algorithm 1.
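Since Algorithm 1 is not reproduced in this text, the following sketch only illustrates the general branch and bound idea on a simplified version of the problem. It reconstructs a single column with at most `k` nonzero coefficients and, as a deliberate simplification, swaps the 1-norm objective for a squared 2-norm so that each bound is a plain least-squares fit. All names (`ls_fit`, `sparse_reconstruct_bb`, `k`, `tol`) are hypothetical, and this is not the authors' algorithm.

```python
import numpy as np

def ls_fit(A, b, support):
    """Least-squares fit of b using only the columns of A listed in support."""
    if not support:
        return float(b @ b), np.zeros(0)
    coef, *_ = np.linalg.lstsq(A[:, support], b, rcond=None)
    residual = b - A[:, support] @ coef
    return float(residual @ residual), coef

def sparse_reconstruct_bb(A, b, k, tol=1e-9):
    """Minimize ||b - A x||_2^2 subject to at most k nonzeros in x."""
    n = A.shape[1]
    best = {"err": np.inf, "support": [], "coef": np.zeros(0)}

    def visit(included, next_idx):
        candidates = included + list(range(next_idx, n))
        # Lower bound: drop the sparsity constraint on the undecided columns.
        lower, coef = ls_fit(A, b, candidates)
        if lower >= best["err"] - tol:
            return  # bound: this subtree cannot beat the incumbent
        if len(candidates) <= k:
            # The relaxed solution is already feasible, so it is optimal here.
            best.update(err=lower, support=candidates, coef=coef)
            return
        if len(included) == k:
            # No room for more columns: the only feasible support is `included`.
            err, coef = ls_fit(A, b, included)
            if err < best["err"]:
                best.update(err=err, support=included, coef=coef)
            return
        visit(included + [next_idx], next_idx + 1)  # branch: use this column
        visit(included, next_idx + 1)               # branch: exclude it

    visit([], 0)
    x = np.zeros(n)
    x[best["support"]] = best["coef"]
    return x, best["err"]

# Toy check: b is an exact combination of columns 1 and 4.
rng = np.random.default_rng(0)
A_demo = rng.normal(size=(5, 6))
b_demo = 2.0 * A_demo[:, 1] - 3.0 * A_demo[:, 4]
x_hat, err_hat = sparse_reconstruct_bb(A_demo, b_demo, k=2)
```

The lower bound is valid because removing the sparsity constraint can only reduce the residual, so any subtree whose relaxed residual is no better than the incumbent (within `tol`) can be discarded without missing the global optimum. The search is still exponential in the worst case, which is why the pruning step matters.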

Sign Classification.
With the extracted global optimal solution $X$, which is $F_{T2S}$ in (3), $Y_T$ is predicted as
$$Y_{T2S} = C(F_{T2S}), \tag{5}$$
where $C$ is a sign classifier that is trained by $F_S$ and $Y_S$, and $F_{T2S}$ is the input of $C$. $Y_{T2S}$ is the output of $C$, i.e., the predicted signs of $T$.

Experimental Results
Four datasets extracted from real-world applications are used in the experiments to verify the performance of the proposed method. These datasets (http://snap.stanford.edu/data/index.html) are Bitcoinotc (denoted as OTC) [35], Bitcoinalpha (denoted as ALP) [35], Epinions (denoted as EPI) [36], and Slashdot (denoted as SLA) [37]. The data labels of EPI and SLA take values in {-1,1}. The data labels of OTC and ALP are mapped into {-1,1} by setting the label to -1 if the original label is less than 0 and to 1 if the original label is greater than 0. The details of the experimental datasets are given in Table 1. Accuracy and F1-score [38] are used to measure the sign prediction performance of the proposed method. Two baseline methods are compared with SP BBTL in this paper. The first is the Source-Only (SO) model, which predicts signs of links with data from the source domain only [32]. The second is Nonnegative Matrix Trifactorization (NMTF) [14]. NMTF predicts signs of links by matrix trifactorization using source domain labels and target domain feature vectors. By factorizing the adjacency matrices of the source domain and the target domain, NMTF obtains the latent feature vectors of each domain, which are used together with the explicit feature vectors of each domain to transfer knowledge from the source domain to the target domain.
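The label mapping and the two evaluation metrics can be sketched as follows. This is an illustrative sketch under stated assumptions: the F1-score is computed in its standard form with the positive sign as the positive class, which the text does not state explicitly, and a rating of exactly 0 is rejected because the mapping above does not cover it. All function names are hypothetical.

```python
def to_sign(rating):
    """Map a signed rating to a label in {-1, 1}; a rating of 0 is left undefined."""
    if rating > 0:
        return 1
    if rating < 0:
        return -1
    raise ValueError("a rating of 0 has no sign under this mapping")

def accuracy(y_true, y_pred):
    """Fraction of links whose predicted sign matches the true sign."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical ratings and predictions, used only to exercise the functions.
signs = [to_sign(r) for r in [-10, 3, 1, -1]]
y_true = [1, 1, 1, -1, -1]
y_pred = [1, 1, -1, -1, 1]
acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
```

In this toy case three of five predictions are correct, and precision and recall for the positive class are both 2/3. Reporting both metrics matters on imbalanced data, where accuracy alone can look high for a classifier that mostly predicts the majority sign.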
The sign prediction performance of SP BBTL is first measured with various network sizes. In the experiments, the number of samples in the target domain is fixed to 3000, while the number of source domain samples varies from 3500 to 9500. The proportion of positive links to negative links is set to 7:3 (± 5%). The experimental results are given in Figure 3, in which AAA-BBB means that AAA is the source domain and BBB is the target domain of sign prediction. For example, OTC-ALP means predicting signs of ALP by using OTC as the source domain. As shown in Figure 3, for each source domain-target domain pair, the performance decreases slightly as the network size grows, since accuracy is negatively related to network size. The proposed SP BBTL performs better than the baselines because it can transfer more useful knowledge from the source domain to the target domain. In addition, the more efficient optimization also contributes to the superior sign prediction performance of SP BBTL. Compared with the proposed model, the two baseline methods fail to decrease the transfer loss. This means that the solution of their objective functions is not globally optimal, which limits their link prediction performance.
The sign prediction performance of SP BBTL is then measured with various negative link ratios. In the experiments, the number of samples in the target domain is set to 3000, while the number of samples in the source domain is set to 4500 (OTC-ALP and ALP-OTC) and 6500 (EPI-SLA and SLA-EPI), respectively. The ratio of negative links varies from 10% to 90%. The experimental results are given in Figure 4. The accuracy and F1-score of SP BBTL are superior to those of the baselines at different negative link ratios. The accuracy of SP BBTL and the baseline methods tends to follow a micro-W-like distribution on the OTC-ALP and ALP-OTC datasets, while it tends to follow a micro-V-like distribution on the EPI-SLA and SLA-EPI datasets. The F1-score of SP BBTL and the baseline methods decreases as the negative link ratio increases.
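The experimental setup above fixes the sample count and the negative link ratio of each domain. The text does not describe how the subsets are drawn, so the sketch below shows one plausible way to do it, by sampling without replacement from each sign class separately. The function name, the seed handling, and the toy label array are assumptions.

```python
import numpy as np

def sample_with_negative_ratio(signs, n_samples, neg_ratio, seed=0):
    """Return indices of n_samples links with the requested share of negative links."""
    signs = np.asarray(signs)
    rng = np.random.default_rng(seed)
    n_neg = int(round(n_samples * neg_ratio))
    n_pos = n_samples - n_neg
    neg_idx = np.flatnonzero(signs == -1)
    pos_idx = np.flatnonzero(signs == 1)
    if n_neg > neg_idx.size or n_pos > pos_idx.size:
        raise ValueError("not enough links of one sign for the requested ratio")
    chosen = np.concatenate([
        rng.choice(neg_idx, size=n_neg, replace=False),  # negative links
        rng.choice(pos_idx, size=n_pos, replace=False),  # positive links
    ])
    rng.shuffle(chosen)  # avoid an ordering that groups links by sign
    return chosen

# Hypothetical label array: 700 positive and 300 negative links.
all_signs = np.array([1] * 700 + [-1] * 300)
subset = sample_with_negative_ratio(all_signs, n_samples=100, neg_ratio=0.3)
```

Sampling each class separately gives the exact requested ratio, whereas uniform sampling over all links would only match it on average.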
In addition, SP BBTL is insensitive to changes in the negative link ratio, while the performance of the baseline methods decreases significantly as the negative link ratio increases, especially on the OTC-ALP and ALP-OTC datasets.
The influence of the constraint parameter on the sign prediction performance of SP BBTL is further analyzed. In the experiments on OTC-ALP and ALP-OTC, the scales of the source domain and the target domain are 4500 and 3000, respectively. In the experiments on EPI-SLA and SLA-EPI, the scales of the source domain and the target domain are 6500 and 3000, respectively. The value of the constraint parameter varies from 0.1 to 2.5, and the negative link ratio is set to 0.7 in both the source domain and the target domain. The experimental results are given in Figure 5. Based on the experimental results, the performance of SP BBTL is relatively stable when the constraint parameter is larger than 0.5. When the constraint parameter is close to 0, (3) yields the zero solution, which is meaningless in sign prediction. So, the constraint parameter is suggested to be a value around 0.5, which contributes to the best prediction performance of SP BBTL.

Conclusion and Future Work
In this paper, a novel method named sign prediction on unlabeled social networks using branch and bound optimized transfer learning (SP BBTL) is proposed to solve the sign prediction problem via feature vector projections. In SP BBTL, labeled source feature vectors are mapped into unlabeled target feature vectors, and the relationship between the two domains is then established so that the classifier can be trained without any target label. In addition, the proposed optimization based on branch and bound (BB) performs efficiently on social networks, because the branch and bound optimization method adopted in the proposed model can ensure the global optimal solution of the objective function. Branch and bound obtains the global optimal solution by highly efficient searching and iteration. It can maximize the transferable knowledge of the source domain while minimizing the transfer loss. Experimental evaluation validates the superior effectiveness and stability of SP BBTL in real social networks. Finally, we give the suggested value for the constraint parameter of the proposed model. In the future, we will try to improve the proposed method in several aspects. First, we will try to develop a generalized algorithm, which could not only minimize the influence of negative transfer but also discover transferable knowledge from different categories of source domains, such as text data and image data. Second, we will improve the model to minimize the number of source domain instances used for knowledge transfer, with only a small cost in link prediction performance. Lastly, we will extend our model from the binary sign prediction problem to the multilabel sign prediction problem.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that there is no conflict of interest regarding the publication of this paper.