Self-Avoiding Pruning Random Walk on Signed Network

A signed network represents how a set of nodes are connected by two logically contradictory types of links: positive and negative links. In a signed products network, two products can be complementary (purchased together) or substitutable (purchased instead of each other). Such contradictory types of links may play dramatically different roles in the spreading of information, opinions, behaviour, etc. In this work, we propose a Self-Avoiding Pruning (SAP) random walk on a signed network to model, e.g., a user's purchase activity on a signed products network. A SAP walk starts at a random node. At each step, the walker moves to a randomly selected positive neighbour, and the previously visited node together with its negative neighbours is removed. We explore both analytically and numerically how signed network topological features influence the key performance of a SAP walk: the evolution of the pruned network resulting from the node removals, the length of a SAP walk and the visiting probability of each node. These findings in signed network models are further verified in two real-world signed networks. Our findings may inspire the design of recommender systems regarding how recommendations and competitions may influence consumers' purchases and products' popularity.


Introduction
The concept of multi-layer networks was proposed in 2010 [1][2][3][4][5] to capture different types of relationships or links among the same set of nodes. For example, the rapid development of the Internet, smartphones and information technology has boosted online platforms, such as Facebook and YouTube, for communicating, creating and sharing information and knowledge. Users may participate in one or several online networks besides their physical contacts, forming a multi-layer network where the nodes represent the users and the links in each layer represent a specific type of connection, such as physical contacts or online follower-followee relationships. Such multi-layer networks support the spreading of, e.g., information, behavioural patterns, opinions and fashion within each layer, and also allow the spreading processes on different layers to interact, introducing new phenomena that differ dramatically from a single spreading process on a single network [6][7][8][9][10][11][12][13][14][15][16][17][18][19].
A signed network is a special type of two-layer network where the same set of nodes is connected by two logically contradictory types of links, the so-called positive and negative links. The positive and negative links may represent friendly and antagonistic interactions respectively in a signed social network [20], and complementary (i.e. when a product, e.g. a phone, is purchased, the other product, e.g. a phone charger, is likely to be bought in addition) and substitutable (two products can be purchased instead of each other, such as phones from two competing brands) relationships respectively in a signed network of products [21][22][23].
Whereas all types of links in most multi-layer networks, such as physical contacts and online friendships, are mostly positive and thus facilitate the spread of information, opinions, etc., the positive and negative links in a signed network usually play dramatically different roles in a general spreading process. The random walk (RW) and self-avoiding walk have been used to model users' purchase activity on a recommendation network of
Figure 1. Schematic plot of a SAP walk on a signed network with pruning probability r=1. The signed network is represented at the top as a two-layer network, with a negative layer and a positive layer, where dashed lines between the two layers emphasise that the nodes are the same individuals across layers, and at the bottom as a single network with two types of links: positive (solid lines) and negative (dotted lines). At t=0, a walker visits a random node in the network, which is in red. At step 1, or t=1, the walker moves to a random positive neighbour (in red) and the previously visited node and its negative neighbours (in yellow) are removed. Such steps repeat until the walker has no node left to visit. The pruned network, in grey, shrinks over time.
the SAP model where the complementary products of a product are recommended or preferred with different strength.
The paper is organised as follows. We introduce the basic definitions related to signed network models and RWs in section 2. The influence of the signed network features on the aforementioned properties is studied in signed network models when the pruning probability is r=1 in section 4, and when $r \neq 1$ in section 5. Our observations and understanding obtained in signed network models are further verified in two real-world signed networks in section 7. We summarise our findings and discuss promising future work in section 8.

Definitions
In this section, we introduce basic definitions regarding signed network representation and models, different types of RWs [26] and their relation to the SAP walk.

Signed network representation
In a signed network with N nodes, two N×N adjacency matrices $A^+$ and $A^-$ can be used to represent the positive and negative connections respectively. Element $a^+_{ij}=1$ ($a^-_{ij}=1$) if nodes i and j are connected by a positive (negative) link, and 0 otherwise. An example of a signed network is shown in figure 1, which is plotted both as a two-layer network (top) and as a single network with two types of links (bottom).

Signed network models
The simplest signed networks can be constructed by generating the positive layer and the negative layer independently, either from the same network model or from two different network models, such as the Erdős-Rényi (ER) and scale-free (SF) random network models.
The Erdős-Rényi (ER) random network is one of the most studied random network models and allows many problems to be treated analytically [27,28]. To generate an Erdős-Rényi random network with N nodes and average degree E[D], we start with N nodes and place each link between two nodes that are chosen at random among the N nodes until a total number $L = N E[D]/2$ of links have been placed, where D is the degree of a random node in the network, and the link density is $p = E[D]/(N-1)$.
We use the hidden parameter model [29][30][31][32] to generate scale-free networks, which have a power-law degree distribution $\Pr[D=k] = c k^{-\lambda}$ as observed in many real-world networks [33][34][35]. The hidden parameter model is considered because the degree distribution and the average degree of the generated scale-free networks are both controllable. We start with N isolated nodes and assign each node i a hidden parameter $\eta_i = 1/i^{\alpha}$, i=1, 2, ..., N. At each step, two nodes i and j are chosen randomly with a probability proportional to $\eta_i$ and $\eta_j$, and they are connected by a link if they were not connected previously. Such steps are repeated until $L = E[D] N/2$ links have been added. In this case, the generated random network has a power-law degree distribution $\Pr[D=k] \sim k^{-(1+1/\alpha)}$, i.e. $\lambda = 1 + 1/\alpha$. In this paper, we consider N=1000, average degree E[D]=4 and λ=3, such that the ER and SF networks have the same average degree and a size of the largest connected component close to N.
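The hidden parameter construction described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name `hidden_parameter_network`, its arguments and the use of the standard `random` module are assumptions made here, and the exponent relation λ = 1 + 1/α is used to pick α (α = 0.5 for λ = 3).

```python
import itertools
import random


def hidden_parameter_network(n, avg_degree, alpha, seed=None):
    """Sample a simple undirected graph with the hidden parameter model.

    Hypothetical helper: nodes are labelled 0..n-1 and node with label i-1
    carries the hidden parameter eta_i = 1 / i**alpha.
    """
    rng = random.Random(seed)
    eta = [1.0 / (i ** alpha) for i in range(1, n + 1)]
    cum_eta = list(itertools.accumulate(eta))  # cumulative weights for sampling
    target_links = int(avg_degree * n / 2)     # L = E[D] * N / 2
    links = set()
    while len(links) < target_links:
        # Draw both end nodes with probability proportional to eta.
        i, j = rng.choices(range(n), cum_weights=cum_eta, k=2)
        if i != j:
            # A set ignores pairs that are already connected.
            links.add((min(i, j), max(i, j)))
    return links


# Example: a small network with the degree exponent used in the text.
links = hidden_parameter_network(n=200, avg_degree=4, alpha=0.5, seed=1)
```

Rejecting self-loops and duplicate pairs keeps the graph simple, so the loop runs until exactly L distinct links exist.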
The positive and negative degree of a node are possibly correlated, actually positively correlated as shown in the real-world signed networks in section 7. Moreover, triangles with an odd number of positive links, so called balanced triangles, have been shown to appear more frequently than the other types of signed triangles [36].
We focus on the simplest signed networks where the positive and negative connections are generated independently from either the same or different network models, i.e. the ER or SF model. In this case, the positive and negative degrees of a node are uncorrelated. We construct four types of signed networks: ER-ER, ER-SF, SF-SF and SF-ER, where N=1000 and the average degree of both layers is 4. Moreover, we also consider signed networks in which the positive and negative degrees of a node are correlated with a linear correlation coefficient ρ. Such networks are generated as follows. First, an ER (or SF) network is generated as the positive network layer. Second, the negative degree of each node is set equal to its positive degree. Third, a fraction 1−ρ of the nodes is selected at random and their negative degrees are randomly shuffled. After the shuffling, the degree sequences of the two layers are correlated with linear correlation coefficient ρ [37,38]. Finally, given the negative degree of each node, the negative network layer is constructed according to the configuration model [39].
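The second and third steps of this procedure (copying the positive degrees and shuffling a fraction 1−ρ of them) can be illustrated as follows. The sketch assumes the positive degree sequence is already available as a list; the function name `correlated_negative_degrees` is hypothetical, and the final wiring by the configuration model is not shown.

```python
import random


def correlated_negative_degrees(positive_degrees, rho, seed=None):
    """Return a negative degree sequence correlated with the positive one.

    A fraction 1 - rho of the nodes is selected at random and the degrees
    of those nodes are permuted among themselves; all other nodes keep a
    negative degree equal to their positive degree.
    """
    rng = random.Random(seed)
    n = len(positive_degrees)
    negative = list(positive_degrees)          # step 2: copy the degrees
    n_shuffled = round((1 - rho) * n)          # step 3: how many to shuffle
    chosen = rng.sample(range(n), n_shuffled)
    values = [negative[i] for i in chosen]
    rng.shuffle(values)
    for i, value in zip(chosen, values):
        negative[i] = value
    return negative


positive = [1, 2, 3, 4, 5, 6, 7, 8]
same = correlated_negative_degrees(positive, rho=1.0, seed=0)      # unchanged
shuffled = correlated_negative_degrees(positive, rho=0.0, seed=0)  # a permutation
```

Because the shuffle only permutes existing values, both layers keep exactly the same degree distribution for every ρ.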

Related work
The classic RW starts at a random node in an unsigned network. At each step, the walker moves from its current node i to a neighbour that is selected uniformly at random. In this process, the walker can visit any node repeatedly if the network is connected. The RW has been widely applied, e.g. to model network routing protocols and users' visits to websites via hyperlinks, and to detect network topology [40][41][42][43][44]. The self-avoiding random walk (SAW) is the same as the RW except that at each step the walker moves to a random neighbour that has not yet been visited. Hence, each node can be visited at most once. A SAW stops when the walker has no unvisited neighbour to move to. The SAW was first introduced by the chemist Flory to study the behaviour of polymers on lattice graphs [45]. The SAW has also been applied to detect protein-protein interactions [46], to detect network structure, which it does more efficiently than the classic RW by avoiding previously visited nodes at each step [47], and to detect unidentified network traffic [48].
The performance of these two types of RWs has been analytically studied [49]. The probability that a node is visited by a classic random walker has been shown to be proportional to the degree of that node. The path length of a SAW is the total number of links traversed in the SAW. The path length of the SAW has been studied, especially regarding its average and probability distribution [50]. Tishby et al found that the path length of a SAW on an Erdős-Rényi random network follows the Gompertz distribution in the tail [51].
RW and SAW have been used to model users' purchase activity on a recommendation network [24,25]. In contrast to RW and SAW, SAP walk addresses further that products can be substitutable to each other and are seldom or not purchased by the same user.
Jung et al considered signed RW, where the sign of the walker changes depending on the signs of the links that walker has traversed [52]. This work addresses, for the first time, that the signed links could influence the dynamics of the walk thus the walkers' trajectories. The SAP walk is equivalent to a self-avoiding walk on the positive network layer if the negative network layer is empty, i.e. no negative links exist.
Opinion diffusion (voter model) on a signed network has been proposed in [53]. Dynamics of influence diffusion and influence maximisation problem on signed networks have been explored [53] beyond the influence maximisation problem on single unsigned networks [54]. Viral spreading processes on signed networks have been studied in [55]. From an analytic point of view, SAP walk is more challenging to trace because the earlier trajectory of a walker influences its future moves, in contrast to spreading models where state transitions of each node depend only on the current states of neighbours and the local dynamic rule.

Evolution of the pruned network structure
The SAP walk on a signed network is more complex than previous RW models. At each step, the walker moves to a random positive neighbour, and afterwards, not only the previously visited node but also its negative neighbours are removed/pruned from the signed network. As shown in figure 1, the pruned signed network (in grey) G(t) is shrinking over time. The pruned signed network at a step t refers to the remaining signed network after the removal of nodes at step t. The initial signed network corresponds to G(0).
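A single SAP walk with pruning probability 1 can be simulated as sketched below. The data layout (two dictionaries mapping each node to a list of its positive or negative neighbours) and the function name `sap_walk` are assumptions made for illustration. The text does not specify what happens when the newly visited node is itself a negative neighbour of the previous node; this sketch assumes the node currently occupied by the walker is never pruned.

```python
import random


def sap_walk(pos_adj, neg_adj, start, rng):
    """Simulate one SAP walk (pruning probability 1) and return the path."""
    alive = set(pos_adj)      # nodes still present in the pruned network
    current = start
    path = [current]
    while True:
        # Positive neighbours of the current node that have not been removed.
        candidates = [v for v in pos_adj[current] if v in alive and v != current]
        if not candidates:
            return path       # the walker has no node to move to
        nxt = rng.choice(candidates)
        # Prune the previously visited node and its negative neighbours.
        alive.discard(current)
        for u in neg_adj[current]:
            if u != nxt:      # assumption: the walker's new position is kept
                alive.discard(u)
        current = nxt
        path.append(current)


# Positive layer: path 0-1-2-3. Negative layer: a single link 0-3.
pos_adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
neg_adj = {0: [3], 1: [], 2: [], 3: [0]}
path = sap_walk(pos_adj, neg_adj, start=0, rng=random.Random(0))
hopcount = len(path) - 1      # number of positive links traversed
```

In this toy example, leaving node 0 removes node 3 as well, so the walk ends at node 2 instead of reaching node 3 as a plain self-avoiding walk would.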
The pruned positive network layer $G^+(t)$ is the sub-graph of the original signed network that the walker could further explore via a SAP walk. In this section, we explore how the topology, especially the average degree, of the pruned positive network layer changes over time. We start with the simpler case where both the initial positive and negative layers are ER networks, with possibly different average degrees. Firstly, we examine the case where the initial negative layer is an empty graph, i.e. its average degree is zero. A SAP walk on such a signed network is equivalent to a SAW on the positive network layer $G^+(0)$. Initially, the network has N nodes. At any step t, the pruned positive network layer has N−t nodes. An insightful observation about a SAW on an ER random graph in [51] is as follows. A SAW walker has a higher probability of visiting a neighbour with a higher degree. Take the step t=1 as an example. Starting from the random node that is visited at step t=0, the walker moves to one of its neighbours in $G^+(0)$. The probability of moving to a node with degree k in $G^+(0)$, thus with degree k−1 in $G^+(1)$ after the removal of the previously visited node and its links, is $k\Pr[D^+(0)=k]/E[D^+(0)]$, which for the Poisson degree distribution of a sparse ER network equals $\Pr[D^+(0)=k-1]$. The node to visit at t=1 is thus as if chosen randomly from $G^+(1)$.
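The observation that the next node looks as if it were chosen at random rests on a standard property of the Poisson distribution. As a brief check, assume that the degree D of a node in a sparse ER positive layer is Poisson distributed with mean c (the symbol c is introduced here only for illustration). The probability that a link leads to a node of degree k is proportional to k Pr[D=k], and normalising by the mean gives

```latex
\frac{k \Pr[D = k]}{E[D]}
  = \frac{k}{c} \cdot \frac{c^{k} e^{-c}}{k!}
  = \frac{c^{k-1} e^{-c}}{(k-1)!}
  = \Pr[D = k-1], \qquad k \ge 1 .
```

After discounting the link to the removed node, the degree of the reached node therefore has the same Poisson distribution as the degree of a uniformly chosen node.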
Note that when a randomly selected node together with its links is removed from an ER network, the remaining network is again an ER network with the same link density, i.e. the probability that two nodes are connected. Hence, the network pruning resulting from a SAP walk on an ER positive network with an empty negative layer is statistically equivalent to a node removal process on the initial ER network where, at each step, a node is randomly selected and removed together with its links. The pruned positive network $G^+(t)$ at any step t is thus an ER network with N−t nodes and link density $p^+ = E[D^+(0)]/(N-1)$. The average degree of the pruned positive network at step t, i.e. the average degree at step t over all the nodes and over a large number of SAP walk realisations, is
$$E[D^+(t)] = p^+ (N-t-1) \approx E[D^+(0)] \left(1 - \frac{t}{N}\right) \tag{1}$$
when the original signed network is sparse and the size N is large.
Furthermore, we consider the case where the negative network layer $G^-(0)$ is not empty but an ER network. At any step t of a SAP walk, the network is pruned by the removal of the previously visited node and its negative neighbours. Since the negative and positive ER networks are generated independently, the node to visit at each step is as if chosen randomly from the negative layer $G^-(t)$. From the view of the negative network layer, a random node and its negative neighbours together with all their negative links are removed from the negative layer at each step. The negative network layer $G^-(t)$ remains approximately an ER network with the same link density over time. This is an approximation because the neighbour of a random node tends to have a higher degree. The link density remains approximately the same, $p^- = E[D^-(0)]/(N-1)$, and at step t on average $1 + p^-(N(t-1)-1)$ nodes are removed. The size N(t) refers to the average size of the pruned network at step t over a large number of realisations of the stochastic SAP walk.
Hence, the size of the pruned negative layer, which is also the size of the pruned positive layer, follows
$$N(t) = N(t-1) - 1 - p^- \left(N(t-1) - 1\right), \qquad N(0) = N. \tag{2}$$
From the perspective of the positive layer, at step t, the negative neighbours of the previously visited node are as if chosen randomly from the positive layer $G^+(t-1)$. The positive layer remains an ER network with the same link density $p^+ = E[D^+(0)]/(N-1)$. The average degree of the pruned positive network layer at time t is approximately
$$E[D^+(t)] = p^+ \left(N(t) - 1\right) \tag{3}$$
when the original signed network is sparse and the size N is large.
ER(positive)-SF(negative) signed networks are pruned slightly less than ER-ER networks when both layers have the same average degree 4 (see figure 2). This can be explained as follows. If a visited node has a large negative degree, its removal leads to the removal of many nodes, namely its negative neighbours. If a negative neighbour of a visited node has a large negative degree, however, the removal of such a negative neighbour together with its negative links does not remove extra nodes but makes the negative layer sparser, protecting the network from pruning. In ER-SF networks, the visited nodes are as if randomly chosen from the negative layer, and thus tend to have a low negative degree. Nodes with a high negative degree in the SF negative layer are likely to be removed as the negative neighbour of a visited node, which reduces the pruning. We expect that SAP walks on ER-SF networks with a given average degree in the negative layer perform similarly to SAP walks on ER-ER networks with a lower average degree $E[D^-(0)]$ in the negative layer. Hence, we fit the average degree $E[D^+(t)]$ of the positive pruned network as a function of the SAP walk step in ER-SF networks by our theory (3) for ER-ER networks. Figure 2 shows the optimal fit over $E[D^-(0)]$. If we remove the top 2% of the nodes with the highest degree from the SF negative network layer whose original average degree is 4, the resultant average degree becomes 3.4.
Figure 2 shows that the pruned positive network, e.g. $E[D^+(t)]$, shrinks faster if the initial network is an SF-SF signed network than if it is an ER-SF signed network. This is mainly because a node with a large positive degree is likely to be visited in early steps and removed, significantly reducing the average degree of the positive pruned layer. However, the negative neighbours of a visited node are as if chosen randomly in the positive layer and tend to have a lower positive degree in an SF-SF network than in an ER-SF network, slightly reducing the pruning effect.
Hence, a SF or in general a heterogeneous positive layer and a dense negative layer tend to facilitate the pruning of the network whereas a SF (heterogeneous) negative layer reduces the pruning effect.
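The mean-field picture for ER-ER networks can be evaluated numerically. The sketch below is not taken from the paper's equations verbatim: it assumes that each step removes the previously visited node plus, on average, a fraction p⁻ of the other remaining nodes, and that the positive layer keeps its initial link density p⁺. The function names and argument names are hypothetical.

```python
def pruned_network_size(n, avg_neg_degree, steps):
    """Average size N(t) of the pruned network for t = 0..steps.

    Assumed recursion: N(t) = N(t-1) - 1 - p_neg * (N(t-1) - 1).
    """
    p_neg = avg_neg_degree / (n - 1)   # link density of the negative layer
    sizes = [float(n)]
    for _ in range(steps):
        prev = sizes[-1]
        nxt = prev - 1 - p_neg * (prev - 1)
        sizes.append(max(nxt, 0.0))    # the size cannot become negative
    return sizes


def pruned_positive_degree(n, avg_pos_degree, avg_neg_degree, steps):
    """Average degree of the pruned positive layer, p_pos * (N(t) - 1)."""
    p_pos = avg_pos_degree / (n - 1)   # link density of the positive layer
    sizes = pruned_network_size(n, avg_neg_degree, steps)
    return [p_pos * max(size - 1, 0.0) for size in sizes]


# With an empty negative layer exactly one node is removed per step.
sizes_saw = pruned_network_size(n=1000, avg_neg_degree=0, steps=3)
# With average negative degree 4, about five nodes disappear in the first step.
degrees = pruned_positive_degree(n=1000, avg_pos_degree=4, avg_neg_degree=4, steps=50)
```

Comparing `degrees` for several values of `avg_neg_degree` gives a quick feel for how a denser negative layer accelerates the decay of the positive average degree under these assumptions.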

Length of a SAP walk
The length or hopcount H of a SAP walk counts the total number of positive links, or the total number of move steps, that a SAP walker traverses until it has no other node to move to. In the context of a signed product network, H+1 is the total number of purchases of a consumer. Signed networks leading to a large H+1 promote the purchase of more products. We would like to understand how the original signed network topology influences the length of a SAP walk.
The probability distribution of the length H of a SAP walk is shown in figure 3(a) for various types of signed networks. Intuitively, a SAP walk stops when the walker has no other node to move to, which is likely to happen if the current pruned positive network layer is poorly connected, in the sense that no giant connected component exists but only small connected clusters. Hence, a dense initial negative network layer $G^-(0)$ leads to the removal of many nodes in each step and effectively reduces the connectivity of the positive layer, resulting in a small length H. This explains our observation in ER-ER signed networks (figure 3(a)) that a dense $G^-(0)$ leads to a small length H on average. The distribution of the length H of a SAP walk on an ER-ER network can be analytically derived. A SAP walk having length H=h requires that the node that the walker visits at step h has degree 0 in the pruned positive layer $G^+(h)$ and that each node visited in a previous step t, where $0 \le t < h$, has a positive degree larger than 0 in the corresponding pruned positive network layer. As discussed in section 4.2, the pruned positive layer remains an ER network with the same link density $p^+$ but with a shrinking size N(t). Hence
$$\Pr[H=h] = (1-p^+)^{N(h)-1} \prod_{t=0}^{h-1} \left[1 - (1-p^+)^{N(t)-1}\right]. \tag{4}$$
Which type of signed network tends to lead to a long SAP walk? If we rank the signed networks by average path length, from the highest to the lowest, the resulting ordering is consistent with our previous explanation: a heterogeneous positive layer, such as an SF network, and a dense negative layer facilitate the pruning of the network, leading to a short SAP walk, whereas a heterogeneous, e.g. SF, negative layer reduces the pruning effect, contributing to a long SAP walk.
The length of a SAP walk actually depends not only on the link density but also on the connectivity of the pruned positive layer. The negative neighbours of a visited node are as if chosen randomly in the positive layer, since the two layers are connected independently. Removing such random nodes reduces the connectivity of the positive layer less if the original positive layer is an SF network, since SF networks are more robust against random node removals than ER networks. However, hubs in the positive layer are more likely to be visited and removed, which reduces the density of the SF positive layer more significantly.

Nodal visiting probability
The probability that a node is visited by a SAP walk implies a certain kind of importance of the node, e.g. the probability that a product is purchased when the signed network represents the network of products. Intuitively, a node with a higher positive degree in the initial signed network has a higher chance to be visited by a SAP walk. Hence, we examine the visiting probability of a node given its initial positive degree, which is shown in figure 4.
Firstly, we analytically derive the nodal visiting probability in ER-ER networks. Specifically, we compute the probability $v_k$ that a random node j with degree $d_j^+(0)=k$ in the initial positive layer is visited by a SAP walk starting at a random node. We denote by $X_t$ the node that is visited by a SAP walk at step t. Since node j can be visited at any step $0 \le t \le h$ by a SAP walk of length h, $v_k$ sums the probabilities that j is visited at each step t, assuming that the probability that node j is visited at any step t is independent of the length H of the walk as long as $H \ge t$. For node j to be visited at step t by a SAP walk of length $H \ge t$, node j must be neither visited nor removed in the previous steps and must be connected with the node $X_{t-1}$ visited at step t−1. This probability is expressed in terms of $D_j^+(t-1)$, the degree of node j at step t−1 in the pruned positive network layer given that j is not visited in the first t−1 steps. The node $X_t$ to be visited at step t as well as its negative neighbours are as if randomly chosen from the pruned positive layer $G^+(t-1)$. The pruned network remains approximately (precisely if the negative layer is empty) an ER network with the same link density $p^+$ when the visited node and its negative neighbours are removed at each step. The ratio $D_j^+(t-1)/(N(t-1)-1)$ is the probability that node j is connected with the node $X_{t-1}$ visited in the previous step, and $1/(p^+(N(t-1)-1))$ is the probability that the walker chooses node j out of the $p^+(N(t-1)-1)$ positive neighbours of $X_{t-1}$ to move to. We approximate the degree $D_j^+(t')$ by its average, using the same symbol, which follows a two-term recursion for $t' < t$, i.e. before the node is visited. The first (second) term corresponds to the case that node j is (not) connected with the node visited at step $t'-1$. In the first case, where j is connected with $X_{t'-1}$, the degree $D_j^+(t')$ at step $t'$ could be reduced from $D_j^+(t'-1)$ due to the removal of $X_{t'-1}$ and of those of its negative neighbours which happen to be positive neighbours of node j.
In the second case, the degree $D_j^+(t')$ decreases from $D_j^+(t'-1)$ due to the removal of those negative neighbours of $X_{t'-1}$ which happen to be positive neighbours of node j. Combining equations (2) and (5)-(8), we can derive the probability $v_k$ that a random node j with degree $d_j^+(0)=k$ in the initial positive layer is visited by a SAP walk on an ER-ER signed network.
As shown in figure 4, our numerical solution of the nodal visiting probability approximates the simulation results well, especially when the initial negative ER network is sparse. When the initial negative ER network is denser, the actual visiting probability is lower than the prediction of the numerical solution. This is because our theoretical analysis assumes that the negative layer remains an ER network with the same link density after the removal of each visited node and its negative neighbours, as if all the removed nodes were chosen randomly. In fact, nodes with a high negative degree are more likely to be removed as a negative neighbour of a visited node. The actual $D_j^+(t')$, and thus also the visiting probability, is smaller than the corresponding analytic estimate.
Figure 4 shows that the visiting probability of a node grows approximately linearly with the initial positive degree of the node in each signed network model. A large slope of a curve in figure 4 indicates a fast growth of the nodal visiting probability as a function of the initial positive degree of a node, i.e. a high heterogeneity of nodal visiting probabilities. Interestingly, the order of the various signed network models in the heterogeneity of nodal visiting probabilities, their order in the average SAP walk length and their order in the average degree of the pruned positive layer at a given step are the same. A SAP walk that prunes the network slowly tends to be long and to lead to a high heterogeneity in nodal visiting probabilities. A SAP walker tends to visit high degree nodes in the pruned positive layer at each step. A longer SAP walk thus contributes to a higher visiting probability of a node with a large initial positive degree, leading to more heterogeneous nodal visiting probabilities.

Influence of degree-degree correlation
The positive and negative degree of a node can be correlated. The real-world networks considered in section 7 have a positive correlation between the degrees of a node in the two layers. In this subsection, we explore how the degree-degree correlation between the positive and negative layers may influence the aforementioned performance of a SAP walk.
We consider ER-ER and SF-SF signed networks with $E[D^+] = E[D^-] = 4$ and N=1000, where the degree-degree correlation ρ varies within [0,1]. We illustrate the three properties of a SAP walk using the two extreme cases ρ=0 and ρ=1; results for other values of ρ within [0,1] lead to the same observations. We simulate 100 independent realisations of a SAP walk on each of 100 independently generated signed networks to derive the three properties of SAP walks. In both ER-ER and SF-SF signed networks, we find that a positive degree-degree correlation evidently facilitates the pruning of the network, reduces the average path length and leads to more homogeneous visiting probabilities among the nodes (see figures 5(a)-(c)). Such effects are more evident in SF-SF networks than in ER-ER networks and can be explained as follows. When the degree-degree correlation is positive, a high degree node in the positive layer tends to have a high degree in the negative layer. The high positive degree of such a node tends to let the walker visit the node in earlier steps. After being visited, the node together with its many negative neighbours is removed, pruning the network significantly. The high negative degree of such a node also tends to let the node be removed as a negative neighbour of a node that has been visited. The removal of such a node together with its many positive links significantly prunes the positive layer and reduces its connectivity.

Influence of the community structure in signed networks
Beyond the degree distribution and correlation in the two layers of a network, we explore further how the community structure of a signed network may influence the SAP walks. We consider Girvan and Newman (GN) networks to model networks with a community structure [56]. Each GN network has N=1000 nodes and average degree 4, the same as our ER and SF network models. The N nodes are divided into four groups, each with 250 nodes. Within each group, each of the $E[D_{in}] \cdot N/8$ links is placed between two nodes that are randomly selected from the group. On average, each node has $E[D_{in}]$ links connecting it to nodes within the same group. We further place each of the $E[D_{out}] \cdot N/2$ links between two nodes that are randomly chosen from the N nodes but belong to different groups. The average out-degree $E[D_{out}]$ is the expected number of links that connect a node to nodes from a different group. The average degree 4 requires $E[D_{in}] + E[D_{out}] = 4$. When $E[D_{out}]=0$, the GN network is composed of four isolated ER networks. When $E[D_{out}]=3$, the GN network becomes an ER network. A GN network with $E[D_{out}]<3$ represents a network with a community structure, where nodes within the same group are more likely to be connected than nodes from different groups.
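The GN construction above can be sketched as follows. The function name `gn_network` and its arguments are hypothetical, nodes are assumed to be assigned to groups in contiguous blocks, and duplicate links are assumed to be rejected so that the graph stays simple (the text does not state how repeated pairs are handled).

```python
import random


def gn_network(n, groups, avg_in, avg_out, seed=None):
    """Generate a GN-type network; return (links, group_of)."""
    rng = random.Random(seed)
    size = n // groups
    group_of = [i // size for i in range(n)]   # contiguous, equally sized groups
    links = set()

    def add_links(count, pick_pair):
        # Add `count` new distinct links drawn with `pick_pair`.
        target = len(links) + count
        while len(links) < target:
            i, j = pick_pair()
            links.add((min(i, j), max(i, j)))

    # Intra-group links: E[D_in] * size / 2 per group (N/8 * E[D_in] for 4 groups).
    for g in range(groups):
        members = range(g * size, (g + 1) * size)
        add_links(round(avg_in * size / 2), lambda: rng.sample(members, 2))

    def pick_inter_group_pair():
        # Rejection sampling: redraw until the nodes are in different groups.
        while True:
            i, j = rng.sample(range(n), 2)
            if group_of[i] != group_of[j]:
                return i, j

    # Inter-group links: E[D_out] * N / 2 in total.
    add_links(round(avg_out * n / 2), pick_inter_group_pair)
    return links, group_of


links, group_of = gn_network(n=1000, groups=4, avg_in=3.5, avg_out=0.5, seed=7)
```

With `avg_in + avg_out = 4` the total number of links is 2N, which matches the average degree 4 used for the ER and SF models.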
We focus on the case where the two layers of a signed network are independent. In this case, the SAP walks perform similarly on ER-GN (negative layer) networks and on ER-ER networks, since the visited nodes are as if randomly chosen from the negative layer. When the positive layer has the community structure, SAP walks perform quite differently. When $E[D_{out}]=0$, the walker is confined to the group where it starts, and the walk is equivalent to a SAP walk on an ER-ER network restricted to that group (see figure 6), because the sub-graph of the ER negative layer that corresponds to the group where the walker resides is again an ER network with the same link density. Hence, the distribution of the SAP walk length and the nodal visiting probabilities on GN-ER networks when $E[D_{out}]=0$ can be analytically deduced by our theories for ER-ER networks. An interesting observation is that as $E[D_{out}]$ increases slightly from 0, the average length of a SAP walk increases significantly, as shown in figure 6(a). The influence of the community structure of the positive layer on a SAP walk is evident when the community structure is significant. As the walk becomes shorter when the positive layer becomes more modular, the nodal visiting probabilities become more homogeneous, which is consistent with our previous findings. The influence of the community structure in the positive layer on the pruning speed, i.e. the average degree $E[D^+(t)]$ of the positive pruned network as a function of the step t, is not evident. This is mainly due to the independence between the two layers: the negative neighbours of a visited node are as if chosen randomly from the positive layer.
When the two layers are correlated, the influence of the community structure on SAP walks is non-trivial. Consider, for example, that both layers are GN networks with $E[D_{out}]=0$ and follow the same grouping (any two nodes that belong to the same group in one layer are also within the same group in the other layer). The average length of a SAP walk is further reduced to E[H]=28.31 due to the correlated community structure in the negative layer.

Influence of signed network topology on SAP walk when $r \neq 1$
In this section, we consider the general case of the SAP walk in which each negative neighbour of a visited node is removed independently with a probability r, where $0 \le r \le 1$. We first consider the ER-ER and SF-SF networks where the positive and negative layers are generated independently.
The SAP walk with a pruning probability r on an ER-ER signed network where the average degrees of the two layers are $E[D^+]$ and $E[D^-]$ respectively is equivalent to the SAP walk with pruning probability 1 on an ER-ER signed network whose average degrees in the two layers are $E[D^+]$ and $r \cdot E[D^-]$ respectively. Scaling the pruning probability by r in a SAP walk model is equivalent to scaling the link density of the negative layer by r. Hence, all our theoretical results in section 4 for SAP walks with r=1 on ER-ER networks can be extended to SAP walks with an arbitrary pruning probability r on ER-ER networks. However, such equivalence holds neither when the positive and negative degrees of a node are correlated nor in SF-SF networks. We take the average path length of a SAP walk as an example and explore the effect of the pruning probability r and the degree-degree correlation ρ on the average path length. As shown in figure 7, the effect of the pruning probability on the average hopcount is more evident as the degree-degree correlation increases. When the degree-degree correlation is high, nodes with a high degree in both layers tend to be removed in early steps of a walk. In this case, a smaller pruning probability could effectively reduce the pruning.
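The equivalence between a pruning probability r and a rescaled negative link density for independent ER layers can be checked in one line. As a sketch, let $p^-$ denote the link density of the ER negative layer, X the node just visited and u any other remaining node (this notation is assumed here for illustration). Averaging over the ER ensemble,

```latex
\Pr[\,u \text{ is pruned} \mid X \text{ is visited}\,]
  = \Pr[\,a^-_{Xu} = 1\,] \cdot r
  = p^- r ,
```

independently for each u. The number of nodes pruned alongside X among n other remaining nodes is therefore binomial with parameters n and $r p^-$, exactly as for pruning probability 1 on an ER negative layer with link density $r p^-$, i.e. average degree $r \cdot E[D^-]$.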

Generalisation of the SAP walk model
The SAP walk model can be generalised from multiple perspectives to better approximate the real-world purchase behaviour of users. We illustrate one possible generalisation that takes into account the heterogeneous preference of a user over the complementary/recommended products. It has been observed that customers tend to prefer popular products and that more engaged customers, i.e. those who have purchased/walked more, are more likely to buy niche or less popular products [57]. We have shown that the popularity of a product, i.e. the visiting probability of the corresponding node, grows approximately linearly with its initial positive degree. Hence, we consider the generalised model where the probability for a walker at step t residing at node i to walk to a positive neighbour j of i is proportional to a weight that depends on d_j^+(0), t and γ, where d_j^+(0) is the positive degree of node j in the initial signed network and γ > 0. In this case, a node with a high initial positive degree is preferred, whereas this preference becomes weaker as the walker moves more, i.e. becomes more engaged. Our classic SAP model discussed earlier corresponds to the case γ=0. In figure 8, we compare the three walk features of our classic model (γ=0) and of the generalised model with γ=1. We find that the preference for visiting a node with a high initial positive degree leads to a longer walk on average and more heterogeneous nodal visiting probabilities. A node with a high initial positive degree is visited more and thus becomes more popular when γ=1.
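A degree-biased next-hop rule of this kind can be sketched as follows. The weight d_j^+(0)^(γ/t) used here is an assumption chosen only because it reproduces the stated properties (uniform choice when γ=0, a preference for high initial positive degree that weakens as t grows); it should not be read as the exact expression of the model. The names `hop_weights` and `next_hop` are hypothetical.

```python
import random


def hop_weights(candidates, init_pos_degree, gamma, t):
    """Unnormalised weights for the positive neighbours in `candidates`.

    Assumed form: d_j^+(0) ** (gamma / t), with step index t >= 1.
    """
    return [init_pos_degree[j] ** (gamma / t) for j in candidates]


def next_hop(candidates, init_pos_degree, gamma, t, rng):
    """Draw the next node with probability proportional to its weight."""
    weights = hop_weights(candidates, init_pos_degree, gamma, t)
    return rng.choices(candidates, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    degrees = {1: 4, 2: 16}  # initial positive degrees, illustrative only
    print(hop_weights([1, 2], degrees, gamma=1.0, t=1))
    print(next_hop([1, 2], degrees, gamma=1.0, t=1, rng=rng))
```

With γ=0 every weight equals 1, so the rule reduces to the uniform choice of the classic SAP walk.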
The SAP model can also be extended by letting the probability for a walker at step t residing at node i to walk to a positive neighbour j of i be proportional to the popularity of j, i.e. the number of walkers that have visited node j, with this preference decreasing as the walker moves more. Based on our observations of the SAP walk with γ=1, we hypothesise that a node with a large initial positive degree may become more and more popular over time.

SAP walks on real-world signed networks
Finally, we choose two real-world signed networks and explore their network features and how these features may influence the SAP walks on these networks. We consider the Wikipedia adminship election network and an extracted Epinions social network [58]. In the Wiki network, a positive (negative) link between two nodes suggests that the two users support (reject) each other to be an administrator. A positive (negative) link in the Epinions network means that the corresponding two users trust (distrust) each other's reviews.
The Epinions network is far larger than Wiki. We have sampled the Epinions network by first removing all nodes with zero positive degree or zero negative degree and then randomly selecting the same number of nodes as in Wiki from the largest connected component of the positive layer of Epinions, together with the positive and negative links among these nodes. Basic topological features of these two networks of the same size are shown in table 1. The degree correlation ρ_D is the linear correlation coefficient between the positive degree and the negative degree of a node. The positive and negative layers tend to be positively correlated in their degrees, i.e. ρ_D > 0, instead of independent as assumed in our signed network models.
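As an illustration, the degree correlation between the two layers can be computed as a Pearson coefficient over per-node degree pairs. The sketch below assumes a hypothetical edge-list format of `(u, v, sign)` tuples with sign +1 or -1 and undirected links; the function names are illustrative, and no real data set is loaded.

```python
from math import sqrt


def signed_degrees(edges):
    """Positive and negative degree of every node in a signed edge list."""
    pos, neg = {}, {}
    for u, v, sign in edges:
        target = pos if sign > 0 else neg
        for node in (u, v):
            pos.setdefault(node, 0)
            neg.setdefault(node, 0)
            target[node] += 1
    nodes = sorted(pos)
    return [pos[x] for x in nodes], [neg[x] for x in nodes]


def degree_correlation(pos_deg, neg_deg):
    """Linear (Pearson) correlation between positive and negative degrees."""
    n = len(pos_deg)
    mean_p = sum(pos_deg) / n
    mean_n = sum(neg_deg) / n
    cov = sum((a - mean_p) * (b - mean_n) for a, b in zip(pos_deg, neg_deg))
    var_p = sum((a - mean_p) ** 2 for a in pos_deg)
    var_n = sum((b - mean_n) ** 2 for b in neg_deg)
    return cov / sqrt(var_p * var_n)


if __name__ == "__main__":
    # Toy signed network, purely illustrative.
    edges = [(0, 1, +1), (0, 2, +1), (0, 3, -1), (1, 2, -1), (0, 1, -1)]
    pos_deg, neg_deg = signed_degrees(edges)
    print(degree_correlation(pos_deg, neg_deg))
```

The coefficient is undefined when either degree sequence is constant, so a real pipeline would need to guard against a zero variance.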
The degree distributions of the positive and negative layers in both Wiki and Epinions, shown in figure 9, are highly heterogeneous and closer to a scale-free distribution than to a Poisson distribution.
On each real-world signed network, we independently simulate 10^5 SAP walks and investigate their key properties discussed earlier. The negative network layer has so far been missing in models of a user's purchase behaviour. Hence, we also consider the SAP walk on these two real-world networks where the negative network layer is replaced by an empty network without any link. Figure 10(a) shows that the positive layer is pruned, i.e. shrinks, faster in the Wiki network than in the Epinions network. Wiki also has a shorter average SAP walk length than Epinions, as shown in figure 10(b). One explanation for both observations is that Wiki has a slightly denser initial negative layer (larger E[D−(0)]) than Epinions, as shown in table 1, which removes on average more nodes per step. Moreover, the high degree correlation ρ_D in Wiki also contributes to a fast pruning of the positive layer and a short length of SAP walks.
When the negative layer is empty, i.e. E[D−(0)] = 0, the positive layer is pruned far slower and the average path length E[H] is far larger. A SAP walk on a signed network with an empty negative layer is equivalent to a self-avoiding walk on the positive network layer. In this case, the Epinions positive layer leads to a longer average path length than the Wiki positive layer. This is likely due to the higher standard deviation of the degree in the Wiki positive layer (49.55) compared with the Epinions positive layer (45.58). A higher degree standard deviation implies an earlier visit of the hubs, whose removal may significantly prune the network and reduce its connectivity.
The visiting probability of a node versus its initial positive degree tends to have a larger slope in Epinions than in Wiki. This is consistent with our observations in signed network models that the visiting probability v_k increases faster with k in a signed network that leads to a higher average degree E[D(t)].
The negative network layer dramatically prunes a signed network and reduces the length of a SAP walk. Moreover, a heterogeneous degree distribution in the positive layer and a positive degree-degree correlation between the positive and negative layers may further enhance the pruning effect, shorten the SAP walk length and facilitate homogeneous visiting probabilities of nodes. These effects have been observed consistently in both network models and real-world networks.

Conclusion
Classic spreading models assume that all network links are beneficial for information diffusion. However, the positive and negative links in a signed network may, respectively, facilitate and prevent the contagion of information, opinion, behaviour etc. As a start, we propose a SAP random walk on a signed network to model, for example, a user's purchase activity on a signed network of products. We unravel the significant effect of the negative links, and of the signed network structure in general, on SAP walks. We found that a more heterogeneous degree distribution of the positive network layer such as a power-law distribution, a denser negative layer and a high degree-degree correlation between the two layers tend to prune the network faster, suppress the length of SAP walks and reduce the heterogeneity in nodal visiting probabilities. When the two layers are independent, however, a more heterogeneous degree distribution of the negative network layer tends to slow down the pruning and contributes to longer SAP walks and more heterogeneity in nodal visiting probabilities. These observations have been obtained from both signed network models and real-world signed networks, and have been analytically proved in signed ER-ER networks. Real-world networks tend to have a heterogeneous degree distribution in the positive layer and a positive degree-degree correlation, which reduce the total purchases of users but increase the homogeneity of the popularity of products. Our findings point to the possibility of influencing users' purchases and product popularity via recommendations and competitions.
It would be interesting to explore further the influence of other key features. We have shown that the community structure in the positive network layer may reduce the length of a SAP walk and the heterogeneity of nodal visiting probabilities. Beyond the community structure, balanced triangles have been shown to appear more frequently than unbalanced ones in real-world networks. The effect of the fraction of balanced triangles on SAP walks and other dynamic processes remains to be investigated. We could also improve the SAP walk towards a more realistic model of e.g. a user's purchase activity by taking into account, for example, the choice of the initial node to visit, the possibility that a walker/user may stop the walk earlier and the heterogeneity of link preferences over recommendations. Optimisation problems worth exploring further include how to add nodes to an existing signed network, how to add positive links via e.g. recommendations and how to recommend a product path/sub-graph to maximise the visiting probabilities of a group of nodes.