A New Method Based on Evolutionary Algorithm for Symbolic Network Weak Unbalance

A symbolic network adds sentiment information to relationships, namely the “+” and “-” signs on its edges, which greatly enhances its modeling ability and gives it wide application in many fields. Weak unbalance is an important indicator for measuring the tension in a network. Starting from the weak structural balance theorem and building on previous work, this paper proposes EAWSB, an evolutionary algorithm for computing weak unbalance. Experiments on the large symbolic networks Epinions, Slashdot and WikiElections show the effectiveness and efficiency of the proposed method. EAWSB uses a compression-based indirect representation, which effectively reduces the size of the genotype space, so that the search covers the space more thoroughly and finds better solutions more easily.

Cartwright and Harary [Cartwright and Harary (1956)] then redefined and extended the theory in graph-theoretic terms in the 1950s. Barahona [Barahona (1999)] pointed out that computing the structural imbalance is an NP-hard problem. Terzi et al. [Terzi and Winkler (2011)] proposed a spectral method for computing the imbalance. Facchetti et al. [Facchetti, Iacono and Altafini (2011)] used a canonical transformation to obtain an efficient greedy algorithm for computing the imbalance. Chiang et al. [Chiang, Hsieh and Natarajan (2013)] used the Katz metric to count negative cycles and used this count to measure the imbalance of a symbolic network. Sun et al. [Sun, Du and Gong (2014)] proposed a memetic algorithm for computing structural imbalance, exploiting the global optimization capability of evolutionary algorithms.

In the 1960s, Davis [Davis (1977)] improved the structural balance theory. He argued that "the enemy of my enemy is my friend" does not necessarily hold in many situations. His theory is called the weak structural balance theory. Leskovec et al. [Leskovec, Huttenlocher and Kleinberg (2010)] have shown experimentally that weak structural balance is more common than structural balance in a large number of real symbolic networks. However, methods for computing structural imbalance, such as that of Sun et al. [Sun, Du and Gong (2014)], usually cannot simply be extended to compute weak structural imbalance. Early research on this problem was carried out by Doreian and Mrvar [Doreian and Mrvar (1996)].

In view of the good performance of evolutionary algorithms [Jong (2016)] on many NP-hard problems, and inspired by [Sun, Du and Gong (2014)], this paper proposes EAWSB, an evolutionary algorithm for computing the weak imbalance of symbolic networks. Experiments on the large symbolic networks Epinions, Slashdot and WikiElections show that this method is effective and efficient.

Structural balance and weak structural balance
A symbolic network can be defined as a graph G(V, E, σ), where V and E are the node set and the edge set, respectively. The mapping σ: E→{+, −} defines the sign of each edge. Fig. 1 shows the four basic patterns of a symbolic network. Under structural balance, (a) and (b) are balanced, while (c) and (d) are unbalanced. Under weak structural balance, (a), (b) and (d) are balanced and (c) is unbalanced. However, for a general symbolic network, the statistical method of counting such patterns is no longer valid. In that case, its balance and weak balance are characterized by Theorem 1 [Cartwright and Harary (1956)] and Theorem 2 [Davis (1977)].

Theorem 1: A symbolic network is structurally balanced if and only if its node set can be divided into two classes such that all edges within the same class are positive and all edges between different classes are negative.

Theorem 2: A symbolic network is weakly structurally balanced if and only if its node set can be divided into multiple classes such that all edges within the same class are positive and all edges between different classes are negative.
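The conditions of Theorem 1 and Theorem 2 can be checked mechanically for a given partition. The following is a minimal sketch, not the paper's code: the edge format `(u, v, sign)` with `sign` in {+1, -1} and the helper names `count_violations` and `min_violations` are assumptions made for illustration, and the brute-force search is only practical for very small graphs.

```python
from itertools import product


def count_violations(edges, cls):
    """Count edges that break the conditions of Theorem 1/2 for partition `cls`.

    edges: list of (u, v, sign) with sign +1 or -1 (assumed format).
    cls:   dict mapping each node to its class label.
    """
    violations = 0
    for u, v, sign in edges:
        same_class = cls[u] == cls[v]
        # A positive edge must stay inside a class,
        # a negative edge must connect two different classes.
        if (sign > 0) != same_class:
            violations += 1
    return violations


def min_violations(nodes, edges, k):
    """Brute force over all assignments of at most k classes (tiny graphs only)."""
    best = len(edges)
    for labels in product(range(k), repeat=len(nodes)):
        best = min(best, count_violations(edges, dict(zip(nodes, labels))))
    return best


if __name__ == "__main__":
    nodes = ["a", "b", "c"]
    all_negative = [("a", "b", -1), ("b", "c", -1), ("a", "c", -1)]
    print(min_violations(nodes, all_negative, 2))  # 1: not structurally balanced
    print(min_violations(nodes, all_negative, 3))  # 0: weakly balanced
```

A triangle with three negative edges cannot be split into two classes without a violation, but three classes satisfy Theorem 2, which is the difference between the two notions of balance.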

Calculation of structural balance and weak structural balance
Using Theorem 1 and Theorem 2, we can give another definition of (weak) unbalance. The nodes of a symbolic network are divided into several classes. At present, the main methods for computing (weak) imbalance are the spectral method [Terzi and Winkler (2011)], the canonical transformation [Facchetti, Iacono and Altafini (2011)], the Katz metric [Chiang, Hsieh and Natarajan (2013)], the evolutionary algorithm [Sun, Du and Gong (2014)] and the block model [Doreian and Mrvar (1996)]. The work in [Sun, Du and Gong (2014)] is also based on an evolutionary algorithm, but it only addresses relatively small symbolic networks and the structural balance case.

The Algorithm EAWSB
Considering the complexity of large-scale symbolic networks and the global optimization capability of evolutionary algorithms, and building on Theorem 2, this paper proposes the EAWSB (Evolutionary Algorithms for Weak Structural Balance) algorithm for computing the weak imbalance of symbolic networks. The details are as follows.

Energy function and fitness function
According to Theorem 2, an energy function E(S) reflecting the weak imbalance can be defined, together with a fitness function F(S) for the algorithm EAWSB. Since E(S)=(m-F(S))/2, minimizing E(S) is equivalent to maximizing F(S). If the maximum number of categories k is specified in advance, the problem to be solved by EAWSB becomes the optimization problem (3).
vol.1, no.2, pp.41-53, 2019
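The displayed formulas are not reproduced in this text. One formulation that is consistent with the stated relation E(S)=(m-F(S))/2 is sketched below. It is an assumption, not necessarily the authors' exact definition: J_{ij} ∈ {+1, −1} denotes the sign σ of edge (i, j), s_i ∈ {1, ..., k} the class of node v_i, δ the Kronecker delta, and m = |E| the number of edges.

```latex
\begin{align}
E(S) &= \sum_{(i,j)\in E} \frac{1 - J_{ij}\,\bigl(2\,\delta(s_i,s_j)-1\bigr)}{2}, \\
F(S) &= \sum_{(i,j)\in E} J_{ij}\,\bigl(2\,\delta(s_i,s_j)-1\bigr), \\
\max_{S}\; F(S) \quad &\text{subject to } s_i \in \{1,\dots,k\},\quad i = 1,\dots,n.
\end{align}
```

Under this reading, each edge that satisfies the conditions of Theorem 2 contributes 0 to E(S) and +1 to F(S), and each violating edge contributes 1 to E(S) and −1 to F(S), so E(S) counts the violating edges and summing over the m edges gives E(S)=(m-F(S))/2.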

Natural representation and compressed representation of the individual
Individual representations are generally divided into direct and indirect representations [Jong (2016)]. In the natural (direct) representation, an individual holds one category value per node, so its length is n. For large symbolic networks, n is too large, often tens of thousands, which seriously degrades the performance of the genetic operations and of the overall algorithm [Liu, Meng, Ding et al. (2019)].

Theorem 3: Given the symbolic network G(V, E, σ), suppose that S* is the optimal solution of the optimization problem (3). Then for any i ∈ {1, ..., n}, Eq. (4) holds.

Proof: Suppose S* does not satisfy the condition in the theorem. Note that h appears only in the first summation of the resulting formula and does not appear in the second summation, so we can define a new solution, which is better than S*. This contradicts the optimality of S*.

Theorem 3 tells us that the state of a node, that is, the optimal class to which it belongs, can be determined from the states of its neighbor nodes using Eq. (4). An individual therefore only needs to encode the states of the nodes in a dominating set U. So how do we find the dominating set U? Algorithm 1 gives a solution: it generates the compressed representation.

Algorithm 1 consists of 3 parts. Part 1 (lines 2-4) defines three arrays; among them, ori_deg and deg hold exactly the same node degree information at the beginning. Part 2 (lines 5-11) handles leaf nodes (i.e., nodes with degree 1). A leaf node has a unique neighbor node. Part 3 (lines 12-18) uses a degree-proportional selection strategy to select a node, i.e., the probability that a node is selected is its degree divided by the sum of the degrees of all nodes.

Fig. 2 is an example of Algorithm 1 generating a compressed code. E is a leaf node, which is generally not selected, but its neighbor node must be selected into the dominating set. In the end, the three nodes A, C and D are selected into the dominating set U. The compressed code of the individual is ind_c=sAsCsD, the natural code is ind=sAsBsCsDsEsF, and the compression ratio is 50%.
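The listing of Algorithm 1 is not reproduced in this text, so the following is only a rough sketch of the idea described above, under stated assumptions: the graph is given as a hypothetical `adj` dictionary mapping each node to the set of its neighbors (no self-loops), the function name `build_dominating_set` is illustrative, and the handling of isolated nodes and of the stopping condition is a guess rather than the authors' procedure. It keeps the two ingredients named in the text: the neighbor of a leaf node is forced into U, and the remaining nodes are drawn with probability proportional to their degree.

```python
import random


def build_dominating_set(adj, rng=None):
    """Return a set U such that every node is in U or has a neighbor in U.

    adj: dict mapping node -> set of neighbor nodes (undirected, assumed format).
    """
    rng = rng or random.Random()
    ori_deg = {v: len(adj[v]) for v in adj}  # original degrees
    U, dominated = set(), set()

    def select(v):
        # Put v into U; v and all of its neighbors are now dominated.
        U.add(v)
        dominated.add(v)
        dominated.update(adj[v])

    for v in adj:
        if ori_deg[v] == 0:
            select(v)              # an isolated node can only represent itself
        elif ori_deg[v] == 1 and v not in dominated:
            (u,) = adj[v]          # the unique neighbor of a leaf node
            select(u)

    while len(dominated) < len(adj):
        # Candidates are nodes whose selection dominates at least one new node.
        candidates = [v for v in adj
                      if v not in U and (v not in dominated or not adj[v] <= dominated)]
        weights = [ori_deg[v] for v in candidates]  # degree-proportional selection
        select(rng.choices(candidates, weights=weights, k=1)[0])
    return U


if __name__ == "__main__":
    # Hypothetical 6-node graph, not the graph of Fig. 2.
    adj = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"},
           "D": {"C", "E", "F"}, "E": {"D"}, "F": {"D"}}
    U = build_dominating_set(adj, random.Random(0))
    print(sorted(U), "compression ratio:", len(U) / len(adj))
```

The compression ratio is taken here as |U|/n, which matches the 3-out-of-6 example in the text. Rebuilding the candidate list in every iteration keeps the sketch short but is quadratic in the worst case; a real implementation would presumably maintain the working degrees in the deg array instead.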

Population initialization
The principle of homophily [Easley and Kleinberg (2019)] tells us that we become more similar to our friends. The above selection-assignment process is repeated iniK times, where iniK is a positive integer representing the initialization strength. The time complexity of population initialization is O(iniK*davg).
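The description of the selection-assignment step itself is not reproduced in this text. A plausible reading, sketched below as an assumption, is that a node is selected at random and its category value is assigned to its positive neighbors ("friends"), repeated iniK times. The names `init_individual`, `pos_adj` and `ini_k` are hypothetical, and davg is read as the average node degree.

```python
import random


def init_individual(pos_adj, k, ini_k, rng=None):
    """Homophily-style initialization (assumed procedure, not the paper's listing).

    pos_adj: dict mapping node -> list of neighbors joined by a positive edge.
    k:       maximum number of categories.
    ini_k:   number of selection-assignment rounds (the iniK of the text).
    """
    rng = rng or random.Random()
    nodes = list(pos_adj)
    # Start from uniformly random category values.
    ind = {v: rng.randrange(k) for v in nodes}
    for _ in range(ini_k):
        v = rng.choice(nodes)      # selection
        for u in pos_adj[v]:       # assignment: friends adopt the category of v
            ind[u] = ind[v]
    return ind


if __name__ == "__main__":
    pos_adj = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: []}
    print(init_individual(pos_adj, k=3, ini_k=2, rng=random.Random(42)))
```

Each round touches one node and its positive neighbors, so the loop costs about iniK*davg on average, in line with the stated complexity; the random start adds a one-time O(n) pass in this sketch.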

Genetic operators
1) Crossover: This paper uses the one-way crossover operator proposed by Tasgin et al. [Tasgin, Herdagdelen and Bingol (2006)]. The main idea is as follows. For a chosen category value s, find all the nodes in ind1 whose category value is s, change the category values of these nodes to s in ind2, and return the modified ind2.
2) Mutation: This paper uses single-point mutation, which randomly selects a node on the individual to be mutated and assigns it a new category value. The time complexity of the mutation is O(1).
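The two operators can be sketched as follows. This is a minimal illustration, assuming that an individual is a list of category values indexed by node, and that the category s is taken from a randomly chosen node of ind1 when it is not supplied; how s is chosen is not stated in the text. The function names are hypothetical.

```python
import random


def one_way_crossover(ind1, ind2, s=None, rng=None):
    """Copy category s from the source ind1 into a copy of the target ind2."""
    rng = rng or random.Random()
    if s is None:
        # Assumption: use the category of a random node of ind1.
        s = ind1[rng.randrange(len(ind1))]
    child = list(ind2)  # work on a copy so the caller's ind2 is left unchanged
    for i, category in enumerate(ind1):
        if category == s:
            child[i] = s
    return child


def single_point_mutation(ind, k, rng=None):
    """Assign a new category value to one random node, in place (needs k >= 2)."""
    rng = rng or random.Random()
    i = rng.randrange(len(ind))
    ind[i] = rng.choice([c for c in range(k) if c != ind[i]])
    return ind


if __name__ == "__main__":
    print(one_way_crossover([0, 0, 1, 1], [2, 2, 2, 2], s=1))  # [2, 2, 1, 1]
    print(single_point_mutation([0, 1, 2, 0], k=3, rng=random.Random(1)))
```

The crossover moves a whole class from one parent to the other in a single step, so nodes that were grouped together in ind1 stay together in the child.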

Local search
Starting from Theorem 3, the local search can be designed as follows: for a given individual ind, a node vi is selected at random and its state is modified based on the states of its neighbor nodes.
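One way to realize such a move is sketched below. Since Eq. (4) is not reproduced in this text, the update rule is an assumption: the selected node takes the class that maximizes the sum of the edge signs over its neighbors in that class, so that positive neighbors attract and negative neighbors repel. Nodes are assumed to be numbered 0..n-1, and `adj[i]` is a hypothetical list of `(neighbor, sign)` pairs.

```python
import random


def best_class(i, ind, adj, k):
    """Return the class for node i that agrees best with its neighbors."""
    scores = [0] * k
    for j, sign in adj[i]:
        # A positive neighbor votes for its own class, a negative one against it.
        scores[ind[j]] += sign
    best = max(range(k), key=lambda c: scores[c])
    # Keep the current class when it is already as good as the best one.
    return ind[i] if scores[ind[i]] == scores[best] else best


def local_search(ind, adj, k, rng=None):
    """Re-assign one randomly selected node, in place."""
    rng = rng or random.Random()
    i = rng.randrange(len(ind))
    ind[i] = best_class(i, ind, adj, k)
    return ind


if __name__ == "__main__":
    # All-negative triangle with every node in class 0.
    adj = [[(1, -1), (2, -1)], [(0, -1), (2, -1)], [(0, -1), (1, -1)]]
    print(best_class(0, [0, 0, 0], adj, k=3))  # 1: node 0 leaves class 0
```

A move of this kind costs O(deg(vi) + k), because only the edges incident to the selected node are inspected.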

EAWSB algorithm composition
To be exact, EAWSB is a family of algorithms consisting of EAWSB_N, EAWSB_I, EAWSB_C and EAWSB_IC. They serve the same purpose but perform differently in different situations. The differences between them are shown in Tab. 1.

Experimental environment
Tab. 2 lists the experimental environment used for the algorithm EAWSB in this paper.

Data set
The experiments in this paper were conducted on three large symbolic network datasets: Epinions, Slashdot and WikiElections. Epinions (epinions.com) is a product review website [Guha, Kumar and Raghavan (2004)]. Slashdot (slashdot.com) is a technology news site [Jérôme, Lommatzsch and Bauckhage (2009)] that allows users to mark the authors of articles as "friends" or "enemies", forming a network of friends and enemies. WikiElections [Leskovec, Huttenlocher and Kleinberg (2010)] is a dataset of votes cast by Wikipedia users in elections; it forms a network of support and opposition. Tab. 3 shows the original statistics of the three datasets. The experiments are mainly carried out on the large undirected symbolic networks shown in Tab. 4.

Operation results
Figs. 3-5 show the results of the four algorithms EAWSB_N, EAWSB_I, EAWSB_C and EAWSB_IC on Epinions, Slashdot and WikiElections. The algorithms are run for five values of the number of categories, k=2, 3, 4, 5 and 6, where k=2 is the structural balance case, which can be regarded as a special case of weak structural balance.

Performance comparison with similar algorithms
Meme-sb [Sun, Du and Gong (2014)] is an algorithm for computing structural unbalance based on the memetic algorithm. Tab. 5 and Tab. 6 show the experimental results and running times of EAWSB_I and meme-sb on the three large-scale symbolic network datasets for the number of categories k=2~6. The experiments show that EAWSB_I is significantly better than meme-sb on the two large datasets Epinions and Slashdot, whereas meme-sb is slightly better than EAWSB_I on WikiElections. The 2-point crossover used by meme-sb has a large negative "tearing" effect, while the one-way crossover used by EAWSB_I more easily maintains the integrity of building blocks.

Conclusion
Weak unbalance is an important indicator for measuring the tension of a network [Hou, Wei, Wang et al. (2018)]. Starting from the weak structural balance theorem and building on previous work, this paper proposes EAWSB, an evolutionary algorithm for computing weak unbalance. Experiments on the large symbolic networks Epinions, Slashdot and WikiElections demonstrate the effectiveness and efficiency of this approach. EAWSB uses a compression-based indirect representation of individuals, which effectively reduces the size of the genotype space, so that the search covers the space more thoroughly and finds better solutions more easily. This paper also proposes an incremental fitness calculation method, which reduces the time complexity of the fitness calculation from O(n) to O(davg) and greatly improves the efficiency of the algorithm.