A Method to Select the Edges in the Minimum Hamiltonian Cycle

The goal of traveling salesman problem (TSP) is to find the minimum Hamiltonian cycle (Min-HC) i.e., a cycle that visits each city once and exactly once and incurs the least length, time or cost, etc. It has been proven to be NP-complete [1]. There is no exact polynomial algorithm for TSP until NP=P. Till now the classical time complexity of TSP is believed as O*(2n). TSP has been extensively studied in the fields of combinatorial optimization and computer science due to its theoretical and practical values [2].


Introduction
The goal of traveling salesman problem (TSP) is to find the minimum Hamiltonian cycle (Min-HC) i.e., a cycle that visits each city once and exactly once and incurs the least length, time or cost, etc. It has been proven to be NP-complete [1]. There is no exact polynomial algorithm for TSP until NP=P. Till now the classical time complexity of TSP is believed as O*(2 n ). TSP has been extensively studied in the fields of combinatorial optimization and computer science due to its theoretical and practical values [2].
In mathematics, an undirected weighted graph (WG) is used to represent the symmetrical TSP. For a WG with n vertices, it is noted as G={V, E, W}, where V={v 1 , v 2 ,…, v n } are the vertex sets, E=[e ij ] n × n are the edges and W=[w ij ] n × n are the weights of edges, v i (1 ≤ i ≤ n) is the vertex, e ij (1 ≤ i, j ≤ n) is the edge linking the vertices v i and v j and w ij is the weight of e ij . For the symmetrical TSP, w ij =w ji . A cycle visiting each vertex once and exactly once is called a Hamiltonian cycle (HC) and the Min-HC is the HC with the minimum weights, i.e., optimal Hamiltonian cycle.
Given an undirected WG with n vertices, the number of the HCs is (n-1)1/2. It is impossible to find the Min-HC by evaluating each of them. The exact algorithms have been designed by many researchers to deal with TSP. The graph search algorithms [3] are feasible for TSP with less than tens of cities. Considering more constraints, the integer programming [4,5] and dynamic programming methods [6] can resolve the TSP with thousands of cities on super computers. For larger scale of TSP, the computation time is too long. It is reported that a large scale of TSP with 85,900 has been resolved with Concorde package [6]. A 128 networked computers system is used to execute the task and 1.2 years was consumed. The computation time is equal to 136 years when one computer completes the computation task. To resolve more complex TSP cases, the new methods are advocated to reduce the computation time.
The hardness of TSP results from a large number of HCs whereas the number of Min-HCs is usually very small. Generally, we strive to find one Min-HC. With a WG, the edges in the Min-HC are hard to discriminate in view of their weights. In most cases, not all the small weighted edges are included in the Min-HC. The method to determine the edges in the Min-HC is valuable. Most scholars design the exact algorithms to search the edges, paths in the Min-HC until the Min-HC is found [7][8][9]. The results show that the exact polynomial algorithm does not exist until NP=P. The heuristics based on the weights usually search the approximations [10][11][12]. Different from above research, the frequency graph is introduced and used to determine the possible edges in the Min-HC. The frequency graph is computed with a set of optimal i-vertex paths and it includes the law of conversion between the optimal i-vertex paths and the Min-HC. In other words, the frequencies of the edges in the Min-HC are generally bigger than those of most of the other edges. Hence, only the edges with big frequencies above a threshold will be taken as the candidate edges in the Min-HC. The threshold is changed according to number i. When the optimal i-vertex paths are short, the frequencies of a few edges in the Min-HC will not be outstanding. If i is big enough, the frequencies of all the edges in the Min-HC will be much bigger than those of the other edges and the Min-HC is easy to find. The frequency graph has been introduced in a previous paper [13]. However, the change of the frequencies of the edges in the Min-HC is not discussed. In this paper, we use mathematical methods to reveal two characteristics of the frequencies of the edges in the Min-HC. Firstly, the frequencies of the edges in the Min-HC are generally bigger than the average frequency and those of most of the other edges in a given frequency graph. In addition, the frequencies of the edges in the Min-HC increase faster than the average frequency does according to i. Here the frequency graphs are computed with different set of optimal i-vertex paths (the change of the frequencies of the edges in the Min-HC will be shown with a series of frequency graphs). The edges with these characteristics are used to compose the Min-HC. The findings can be as the basis to design algorithms for TSP.

Abstract
To find the minimum Hamiltonian cycle is the objective of traveling salesman problem (TSP) whereas it has been proven to be NP-complete. To select the right edges in the minimum Hamiltonian cycle, the frequency graph is computed with a set of optimal i-vertex paths. The number of optimal i-vertex paths is formulated according to i. In these frequency graphs, the frequencies of the edges in the minimum Hamiltonian cycle are generally bigger than the average frequency and those of most of the other edges. Before i ≤ n/2+1 for even number n and i ≤ (n ± 1)/2+1 for odd number n, the frequencies of the edges in the minimum Hamiltonian cycle increase faster than the average frequency does according to i. The frequencies on the edges and their change can be taken as the heuristic information to select the edges in the minimum Hamiltonian cycle. Two simple examples are given to show the change of the frequencies of the edges in the minimum Hamiltonian cycle. The results show that they are different from those of the other edges according to i. The results can be used as the basis to design algorithms for TSP.
for TSP, the NOP i s should be neglected in the search process.
Only n OP i s belong to the Min-HC and the other cannot if only one Min-HC is taken into account. In addition, the OP j s with more vertices are combined with the OP i s (i<j) with fewer vertices. Similarly, none of the NOP i s will belong to an arbitrary OP j . Given all of the OP j s, the OP i s emulated from the OP j s are the intersections between the OP i s and the OP j s. These OP i s have the potential to form the longer OP j s even the Min-HC. On the other hand, these OP i s not in the OP j s have no chance to form the longer OP j s and the Min-HC. If a set of OP j s with more vertices are given, the frequencies of the OP i s can be emulated from them. The OP i s with bigger frequencies are included by more number of OP j s and they are apt to belong to the Min-HC. In the extreme case, an OP i s included by all of the OP j s must belong to the Min-HC. Certainly, its frequency is maximal. In addition, the nearer the set of OP j s approach to the Min-HC, the bigger the frequencies of the OP i s in the Min-HC will be. The nearness represents the number of intersections between the OP j s and the Min-HC, i.e., the number of OP i s emulated from the set of OP j s. On the other hand, the OP i s with small or zero frequencies have little change to compose the Min-HC.
The Min-HC is combined with the edges and the frequencies of edges are concerned. Given a set of OP i s with more than 2 vertices, the frequencies of the edges are emulated and an FG is computed. The FG has the similar topological structure as the WG. In an FG, the numbers on the edges are their frequencies emulated from the set of OP j s. For TSP, the FGs in which the frequencies of the edges in the Min-HC are much bigger than those of the other edges are regarded. Then the edges in the Min-HC will be easily selected from these FGs.
When all of the OP i s are used to compute an FG, the average frequency of the edges is computed as formula (4) in case that the FG includes all of the edges.
It notes that the average frequency changes as a combinatorial number according to i. In fact, the frequencies of the edges have big distinctions when the FG is computed with the OP i s. Though the number of optimal i-vertex paths increases first and then decreases, the number of edges they include may become less and less according to i. The frequencies of some edges will be much bigger than those of the other edges because they are included in many long OP i s. In reverse, some edges are seldom able to compose the long OP i s and their frequencies will be small or zero. For each edge in the Min-HC, at least i-1 OP i s include it based on its different positions in these OP i s. These i-1 OP i s are used to compose the OP 2i-2 s in the Min-HC. For example in Figure 1, the edge (v i , v j ) will be included by 3 OP 4 s, 4 OP 5 s, 5 OP 6 s and etc.
Even if no other OP i s include the edges in the Min-HC, the frequencies of the edges in the Min-HC will be no less than i-1. Furthermore, the edges in the Min-HC have more potential than the other edges to compose the long OP i s because all the OP i s approach to

The Frequency Graph (Fg) and its Function
The relationships between the optimal i-vertex paths and the Min-HC are described first. Given a WG with n vertices, an HC is the sequence of all of the vertices and it is represented as The superscript i indicates the number of vertices in the P i . It has two end vertices v 1 and v i and the other vertices are the middle vertices. Similarly, the Min-HC includes the same number of P i s. It notes that the P i s in the Min-HC are different from most of the other P i s. The P i s in the MIN-HC are named as the optimal i-vertex paths (OP i ). As the Min-HC, an optimal i-vertex path does not change short whatever the middle vertices are exchanged. An optimal i-vertex path with i vertices is noted as Given an OP i in the MIN-HC, its two end vertices are concluded. In addition, the OP i is shorter than that of the other P i s with the same vertices in condition that their two end vertices are identical. With the description of the optimal i-vertex paths, the number of OP i s is computed as formula (1) [13] for symmetrical TSP.
Where i n C is the number of the combinations in case that i vertices are selected from n vertices. The number of OP i s rises first if i ≤ n/2+1 for even number n or i ≤ (n ± 1)/2+1 for odd number n and then decreases according to i. There is a maximal number when i=n/2+1 for even number n (i=(n ± 1)/2+1 for odd number n). The ratio between the number of OP i+1 s and OP i s is computed as formula (2).
In view of formula (2), the number of OP i s reaches its maximum when i is less than or equal to (n+1)/2. It is interesting that the ratio r i is related to the Harmonic Progression. The partial sum of r i is computed as formula (3).
Where ln is the sign of the nature logarithm and γ is the Euler constant.
The number of P i s is i n P / 2 , where i n P is the number of the permutations in case that i vertices are selected from n vertices. It always increases until i=n. The P i s is classified into OP i s and non-OP i s (NOP) whose number is equal to ( ) It is clear that the number of NOP i s is much bigger than that of the OP i s once i is bigger than 4. As we know, none of these NOP i s belong to the Min-HC. These NOP i s are useless to search the Min-HC. When we design algorithms the Min-HC according to i until they become the OP n s. Among the n × (n-1)/2 OP n s, n OP n s turn into the Min-HC when their two end vertices are connected. Therefore, the edges in the Min-HC have more chance to compose the OP i s than the other edges do, especially to compose the long OP i s. Their frequencies changes as formula (4) and they will be bigger than the average frequency in most cases. On the other hand, the frequencies of most of the other edges will be much smaller than the average frequency when i is big enough. That's their difference.
The intrinsic reason is that they cannot compose the long OP i s near to the Min-HC. The edges with big frequencies are included by many OP i s and they have the potential to form the Min-HC. In the extreme case, the edges included by all of the OP i s have the biggest frequencies and they belong to the Min-HC. Hence, we take the edges with big frequencies as the candidate edges in the Min-HC.

The Changes of the Frequencies of the Edges in the Min-Hc
Given an FG computed with a set of OP i s and it includes K edges (K>n), the formula (5) is given as follows. Where f O represents the frequency of an arbitrary edge in the Min-HC and f R is the average frequency of the rest K-1 edges, f is the average frequency of the K edges. O It believes that f O is bigger than f and f is bigger than f R (it is right in most cases). When K is near to n × (n-1)/2, f O -f will be much bigger than f-f R . It says f O is much bigger than f whereas f R is near to f. In an ideal case, each of the OP i s includes the edge in the Min-HC and f O is almost equal to formula (1). f R is computed as formula (6) as K=n × (n-1)/2.
It is more close to formula (4) and they have a big gap with the maximum value of f O (see formula (1)).
There is the other special case that f O is equal to f. For example, f O =f if i=2 and 3. When i=2 and 3, the f O is equal to 1 and 2 × (n-2), respectively. As we know, the number of OP i s are always becoming bigger according to i until i is bigger than n/2+1 for even number n ((n+1)/2+1 for odd number n). Some of the other edges have less and less chance to compose the OP i s according to i. Their frequencies will increase slower than the average frequency does. On the other hand, the edges in the Min-HC have more and more possibility to compose the long OP i s. Therefore, f O will be bigger than f in most cases according to i.
The readers wonder f O may be smaller than f in some FGs. How can we discriminate the edges in the Min-HC? In addition, there will be a lot of the other edges with frequencies bigger than f, which will enhance the difficulty to select the right edges in the Min-HC. From above analysis, the edges in the Min-HC will have more chance to compose the OP i s according to i than most of the other edges do. It is easy to infer that the frequencies of the edges in the Min-HC increase faster than the average does according to i. The edges in the Min-HC can be selected with respect to their frequency changes. Given two sets of OP i+1 s and OP i s, the change of the average frequency is computed as formula (7).
Formula (7) illustrates the change between two average frequencies computed with the OP i+1 s and OP i s. The change of the frequencies of the edges in the Min-HC will be bigger than that computed with formula (7) for most of OP i s. We can compute a series of FGs with the corresponding OP i s and then find the edges whose frequencies become faster than the average frequency does according to i. These edges with these characteristics are taken as the candidate edges in the Min-HC.
When i become big, it is hard to compute the OP i s with many vertices. In addition, the minimum frequency of the Min-HC edges is concerned to be as the threshold to select the right edges. In a recent research [14], the frequencies of edges are computed with the frequency polygons with i vertices. Given a TSP with n vertices (cities), it includes
Due to the simplicity of the example, all of the OP i s (4 ≤ i ≤ 10) are generated to compute 10 FGs. Each FG is computed with the OP i s with the same number of vertices. The frequencies of the 10 edges in the Min-HC are recorded according to i. The changes of the frequencies of these edges are lined according to i and shown in Figure  2. The horizontal axis represents the number i, which means the FG is computed with the OP i s. The vertical axis means the frequencies of the edges (in the Min-HC). The bottom bold line is the change of the average frequency computed with formula (4). It notes that when i is big enough, the FG will include smaller and smaller number of edges.
The formula (4) should be modified. When i=8, 9, 10 in this example, In view of Figure 2, the frequencies of the edges in the Min-HC are bigger than the average frequency in most cases. The frequencies of the edge (6,2) are always the minimum comparing with those of the other edges. It is smaller than the average frequency when i=4. Except i=4, the frequencies of all of the edges are always bigger than the average frequency. In addition, the frequencies of these edges are much bigger than the average frequency after i>4. It says the edges in the Min-HC have more chance to compose the OP i s than most of the other edges do according to i. That is to say, the edges in the Min-HC are apt to form the OP i s near to Min-HC.
The minimum frequency of the Min-HC edge is given according to i in Table 2. The parameter a is computed in view of formula (8). It is found the parameter a is smaller than 3. In most cases, the parameter a is near to 2 and the minimum frequency of the Min-HC edge is approximately equal to ( ) On the other hand, the frequencies of most of the other edges will not always be bigger than the average frequency according to i, especially when i is big. The edges with small frequencies are seldom included by the OP i s and they have little chance to compose the long OP i s and the Min-HC. Among the other 35 edges, 10 edges whose frequencies are the maximum when i=4 are chosen to illustrate their frequencies according to i. The changes of the frequencies of these edges are shown in Figure 3. The average frequency is also given with bold lines for comparisons. They are different from the change of the frequencies of the Min-HC edges. Before i ≤ 5, the frequencies of these edges are bigger than the average frequency. After i>5, the frequencies of some edges begin to become smaller than the average frequency. When i=6, 7, 8, 9 and 10, the frequencies of 2, 3, 5, 10 and 10 edges are smaller than the average frequency, respectively. The frequencies of the rest 25 edges are smaller than the average frequency after i>5 and they are smaller than those of the edges shown in Figure 3. They are impossible to compose the Min-HC.
Although the frequencies of some other edges are bigger than those of a few Min-HC edges when i is small, their changes have obvious distinctions according to i. The frequencies of the edges in the Min-HC will become bigger than the average frequency whereas those of most of the other edges will be smaller than the average frequency once i is big.
Come back to formula (1), the number of OP i s reaches the maximum when i=n/2+1 for even number n. In theory, the frequencies of the edges emulated from the OP i s will rise before i ≤ n/2+1. After i>n/2+1, their frequencies will decrease owing to the reduction of the OP i s. The average frequency just conforms to the rule. For the simple example, the average frequency computed with OP 7 s is smaller than that computed with OP 6 s. The frequencies of the edges will be the maximum when i=6. Before i=6, the frequencies of the edges in Figures 2 and 3 increase more rapidly than the average frequency does (the frequencies of the OHC increases faster than the other edges). However, it is interesting that the frequencies of the edges in the Min-HC still arise until i=7. It means the Min-HC edges have more chance to compose the long OP i s than the other edges do according to i. See Figure 2, the frequencies of the Min-HC edges becomes bigger and bigger until i=7. After that, the frequencies of them are still much bigger than the average frequency. It is different from the change of the frequencies of the other edges. In Figure 3, the frequencies of the other edges and the average frequency begin to decrease when i>6. In     addition, the frequencies of the other edges decrease more rapidly than the average frequency does until they become smaller than the average frequency. It is the second difference between the frequencies of the edges in the Min-HC and those of the other edges.
When i is smaller than n/2+1 (n is even), the frequencies of the edges rise according to i. When i=n/2+2 for even number n, the frequencies of the edges in the Min-HC will still rise whereas those of the other edges will begin to decrease. Figures 2 and 3 give us an overview of the changes of the frequencies of the edges according to i when n is even. Next the other simple example is adopted to illustrate the changes of the frequencies of the edges if n is odd. The first 9 vertices in Table 1 are selected to compute all of the OP i s (4 ≤ i ≤ 9) and then 9 FGs are computed with these OP i s. The Min-HC is (0, 1,2,6,7,8,4,5,3,0). The changes of the frequencies of the 9 edges in the Min-HC are shown in Figure 4. The average frequency is drawn with bold lines in Figure 4. When i=8 and 9, the FG includes 31 and 21 edges, respectively. They are similar to those in Figure 2. Only two frequencies of edge (2,6) are smaller than the average frequency when i=4 and 5. All of the frequencies of the other edges are bigger than the average frequency according to i. The minimum frequency of the Min-HC edge is listed in Table 3 according to i. The parameter a is also computed with formula (8). Similarly, parameter a is below 3 according to i. It says the number of edges in the frequency polygons is less than 3i according to i.
The other 9 edges with the maximum frequencies when i=4 are chosen and their frequency changes are shown in Figure 5 according to i. Different from the changes of the frequencies in Figure 4, some of these frequencies are bigger than the average frequency first and then become smaller than the average frequency according to i.
The number of OP i s reaches the maximum value when i=(n ± 1)/2+1 for odd number n. In this case, the number of OP 5 s and OP 6 s is the maximum. In theory, the frequencies of the edges will be the maximum when i=6. Figure 4 shows the frequencies of the Min-HC edges reach their maximum number when i=6 and then decreases.
The changes are different from those in Figure 2 where they reach the maximum number when i=7(i=n/2+2). Before i=6, the frequencies of the Min-HC edges increase sharply in Figure 4. However, the frequencies of the other edges increase slower from i=5 to i=6 in Figure  5. This is a big difference of Figures 4 and 5. Although the frequencies of some other edges increases before i=(n+1)/2+1 for odd number n, the frequencies of the Min-HC edges increase faster than the other edges do from i=(n-1)/2+1 to i=(n+1)/2+1. For the frequencies of the Min-HC edges, they increase sharply to reach their maximum before i=(n+1)/2+1. On the other hand, the frequencies of the other edges rise a little slower from i=(n-1)/2+1 to i=(n+1)/2+1 than they before act. It verifies that the edges in the Min-HC have more possibility to compose the long OP i s than the other edges do according to i. Some useful conclusions are drawn according to the experiments with the two simple examples. When an FG is computed with the OP i s, the frequencies of the edges in the Min-HC will be bigger than that computed with formula (4) in most cases. When i becomes big, the formula (8) is more suitable to compute the minimum frequency of the Min-HC edge. The frequencies of the Min-HC edges become bigger and bigger until they reach the maximum frequencies. In this stage, they increase faster than the average frequency does according to i. When i is big, they are much bigger than the average frequency and those of most of the other edges. Moreover, the frequencies of the edges in the Min-HC will still arise when i=n/2+2 for even number n. For odd number n from i=(n-1)/2+1 to i=(n+1)/2+1, the frequencies of the edges in the Min-HC increase as before whereas those of the other edges will increase more slowly than before. After i>n/2+2 for even number n and i>(n+1)/2+1 for odd number n, the frequencies of the edges in the Min-HC will be much bigger than the average frequency according to i. On the other hand, the frequencies of most of the other edges decrease faster than the average frequency does until they become smaller than the average frequency.

Conclusion and Future Work
The revelation of the change of the frequencies of the Min-HC edges is the main contribution of the research. The Min-HC is combined with n OP i s but not the NOP i s whereas it is hard to conclude which OP i s belong to the Min-HC. When these OP i s are converted into the FGs, the difference between the Min-HC edges and the other edges is illuminated by their frequencies. When the FGs are computed with the OP i s, the edges in the Min-HC usually have bigger frequencies than those of the other edges. The intrinsic reason is that these OP i s approach to the Min-HC nearer and nearer according to i. The edges in the Min-HC have more chance to compose the OP i s than that the other edges, especially to combine the long OP i s. Hence, the edges with big frequencies emulated from them have the potential to compose the Min-HC.
In general, the frequencies of the edges in the Min-HC are bigger than the average frequency and they will be much bigger than the average frequency when i is big. The formula (8) can be used to approximate the minimum frequency of the Min-HC edge. The frequencies of the edges in the Min-HC increase faster than the average frequency does according to i if i<n/2+1 for even number n and i<(n+1)/2+1 for odd number n. In the two simple examples, it is observed that the frequencies of the Min-HC edges will still become bigger when i=n/2+2 for even number n. On the other hand, the frequencies of the other edges begin to decrease once i>n/2+1. For odd number n from i=(n-1)/2+1 to i=(n+1)/2+1, the frequencies of the edges in the Min-HC increase as before whereas those of the other edges will increase  more slowly than before. The results mean the Min-HC edges have more possibility to compose the long OP i s than the other edges do. The future work will focus on the minimum frequency of the edges in the Min-HC. We will verify if the minimum frequency is bigger than a given number in the worst case.