Study on a hybrid algorithm combining enhanced ant colony optimization and double improved simulated annealing via clustering in the Traveling Salesman Problem (TSP)

In the process of solving the Traveling Salesman Problem (TSP), both Ant Colony Optimization and simulated annealing exhibit different limitations depending on the dataset. This article aims to address these limitations by improving and combining these two algorithms using the clustering method. The problems tackled include Ant Colony Optimization’s susceptibility to stagnation, slow convergence, excessive computations, and local optima, as well as simulated annealing’s slow convergence and limited local search capability. By conducting tests on various TSPLIB datasets, the algorithm proposed in this article demonstrates improved convergence speed and solution quality compared to traditional algorithms. Furthermore, it exhibits certain advantages over other existing improved algorithms. Finally, this article applies this algorithm to logistics transportation, yielding excellent results.


INTRODUCTION
Since the Traveling Salesman Problem (TSP) is widely used in artificial intelligence, logistics transportation, circuit board design, and other fields (Di Placido, Archetti & Cerrone, 2022; Crişan, Pintea & Palade, 2017; Yu, Lian & Yang, 2021; Wang et al., 2016; Lim, Kanagaraj & Ponnambalam, 2014; Li et al., 2018), it has been studied by a large number of scholars. Some researchers use exact algorithms such as branch-and-bound, mixed integer linear programming, and dynamic programming to solve TSP instances (Dell'Amico, Montemanni & Novellani, 2021; Gelareh et al., 2020; Lu, Benlic & Wu, 2018). However, as instances become more complex and data sets become larger, exact algorithms lose their advantage, and approximate algorithms become more suitable.
The bionic algorithm is a heuristic algorithm that simulates natural phenomena or processes. Many scholars have used bionic algorithms, such as the genetic algorithm and particle swarm optimization, to approximate the solution of TSP instances (Zheng et al., 2023; Zheng, Zhang & Yang, 2022; Al-Gaphari, Al-Amry & Al-Nuzaili, 2021; Khan & Maiti, 2019; Zhong et al., 2018). Ant Colony Optimization and simulated annealing are also considered suitable for solving the TSP.
In nature, the foraging habits of ants served as the basis for the Ant Colony Optimization (ACO) (Colorni, Dorigo & Maniezzo, 1991). The Italian researcher M. Dorigo based it on the fact that ants can usually choose an optimal route between their nest and a food supply. In the years since the ACO was proposed, numerous scholars have shown interest in enhancing its performance and have put forth various approaches for improvement (Dorigo, Maniezzo & Colorni, 1996; Stutzle & Hoos, 2000; Dong, Guo & Tickle, 2012). Even today, many researchers continue to refine the ACO, driven primarily by its significant potential for solving complex optimization problems efficiently and effectively. The researchers discovered that ants release a chemical known as a pheromone when foraging. The ACO uses this pheromone as a cue for its pathfinding direction and always proceeds in the direction of higher pheromone concentration. For small-scale instances, the ACO is characterized by excellent robustness, high accuracy, strong local exploration capability, and distributed parallel computation. In large-scale instances, the algorithm performs poorly because it is prone to stalling, slow convergence, excessive computing effort, and falling into local optimal solutions. Scholars have suggested a variety of improved methods to address these flaws.
Many academics regard the pheromone update method as the main area for development, because the pheromone is essential to the ACO (Du et al., 2022; Ning et al., 2021, 2018). Du et al. (2022) propose a method to correct pheromone levels, which encourages ants to deposit pheromones in nearby cities, enabling them to select superior cities. Ning et al. (2021) propose a pheromone matrix with negative feedback. This expands the diversity of the links that ants choose and ranks links that have not been visited ahead of inferior links, so the effectiveness of the ants at constructing paths improves dramatically. To enhance the global search capability, Ning et al. (2018) propose a novel pheromone smoothing mechanism designed to reinitialize the pheromone matrix when the ACO's search process approaches a defined stagnation state. However, since pheromones are among the quantities produced by the model, the enhanced algorithms that depend on them cannot overcome the model's restrictions. Meanwhile, in TSP instances, the ACO uses a roulette wheel strategy to choose the next city to visit and then travels across every city. Therefore, the pheromone strategy of the improved ACO is also unable to avoid selecting poor path points that lead the algorithm to a local optimal solution.
Some scholars have identified shortcomings in the search strategy and framework of the ACO and have proposed their own methods (Gao, 2020; Gülcü et al., 2018; Wei, Han & Hong, 2014; Ratanavilisagul, 2017). Gao (2020) proposes a new ACO that combines pairs of searching ants to diversify the solution space. Additionally, to reduce the influence of having a limited number of meeting ants, a threshold constant is introduced. This method improves the solution accuracy and reduces the work of the ant colony. Gülcü et al. (2018) propose using the 3-opt operator to improve the quality of the ACO's solution. Wei, Han & Hong (2014) propose embedding the ACO into the cultural algorithm framework using a dual inheritance mechanism, so that the optimal solution evolves in both the population space and the belief space. Although these methods improve the quality of the solution, their complex search strategies prolong the solution time, which leads to stagnation of the algorithm on large-scale TSP instances. Many other scholars have found that the shortcomings of the ACO can be compensated for by combining it with other algorithms (Gong et al., 2022; Wang & Han, 2021; Rokbani et al., 2021; Yang et al., 2020; Gulcu et al., 2018; Qian & Su, 2018; Gunduz, Kiran & Ozceylan, 2015); they obtain excellent solutions for the TSP. Gong et al. (2022) propose a hybrid algorithm based on a state-adaptive slime mold model and fractional-order ant system (SSMFAS) to address the TSP. Wang & Han (2021) propose the hybrid symbiotic organisms search (SOS) and ACO (SOS-ACO). Gulcu et al. (2018) propose a parallel cooperative hybrid algorithm for solving TSP instances.
Physical annealing served as the basis for the simulated annealing (SA) concept (Metropolis et al., 1953). This concept was brought into combinatorial optimization by Kirkpatrick, Gelatt & Vecchi (1983). In the years since the SA was proposed, numerous scholars have shown interest in enhancing its performance and have put forth various approaches for improvement (Allwright & Carpenter, 1989; Lin, Kao & Hsu, 1993; Geng et al., 2011). The SA is a global search algorithm whose advantages include flexibility, wide applicability, efficient operation, and few initial-condition requirements. Many researchers have found that the search strategy and parameter tuning of the SA can be challenging, and they have proposed various improvement methods to address these difficulties (Wang et al., 2015; Zhao, Xiong & Shu, 2015; Lin, Bian & Liu, 2016). Wang et al. (2015) propose a multi-agent SA with instance-based sampling (MSA-IBS) that exploits the learning ability of instance-based search algorithms to solve TSP instances. Zhao, Xiong & Shu (2015) propose an SA with a hybrid local search for the TSP, which improves solution accuracy. Lin, Bian & Liu (2016) propose a hybrid SA-tabu search algorithm to solve the TSP. Fully considering the characteristics of the hybrid algorithm, they develop a dynamic neighborhood structure that improves search efficiency by reducing the randomness of the conventional 2-opt neighborhood.
More scholars have found that the SA can be combined with other algorithms to provide a better result. A popular hybrid SA for the TSP is the list-based SA, which mainly combines the SA with the list-based threshold accepting (LBTA) algorithm. It has proven effective for solving large-scale TSP instances (Zhan et al., 2016; Wang et al., 2019; Ilin et al., 2022; Ilhan & Gokmen, 2022). Other researchers have discovered that combining the SA with algorithms other than the LBTA can also yield promising results and enhance the solving capability of the SA (Deng, Xiong & Wang, 2021; He, Wu & Xu, 2018; Ezugwu, Adewumi & Frîncu, 2017). Deng, Xiong & Wang (2021) propose a hybrid Cellular Genetic Algorithm with the SA (SCGA), which comes closer to the theoretical optimal value and has good robustness. He, Wu & Xu (2018) combine the SA and the genetic algorithm to propose the Improved Genetic Simulated Annealing (IGSAA). This method makes the SA more effective in avoiding local optima. Ezugwu, Adewumi & Frîncu (2017) merge the symbiotic organisms search algorithm with the SA to increase the accuracy of the solution and speed up the convergence of the SA.
Firstly, in this article, we introduce two strategies to address the slow convergence and susceptibility to local optima of the ACO, which accelerate its convergence and enhance its accuracy to a considerable extent. Secondly, we propose two strategies to overcome the accuracy limitations of the simulated annealing, which traditionally lacks optimization ability. Subsequently, we analyze the shortcomings of the two improved algorithms and combine their advantages to devise a new hybrid algorithm. We test this algorithm on 22 different TSP instances and compare it with traditional algorithms and state-of-the-art techniques from the literature. Finally, we apply the algorithm to the domain of logistics and transportation, showcasing its potential and practicality in real-world scenarios.

BASIC INTRODUCTION AND IMPROVEMENT OF BASIC ALGORITHM
Description of the traveling salesman problem
In the DFJ formulation, the TSP is represented on a complete graph $G = (V, E)$, where $V$ is the set of vertices and $E$ is the set of edges. The distance between vertices $i$ and $j$ is denoted $d_{ij}$ and is assumed to be known. To represent the TSP mathematically, we introduce binary decision variables $x_{ij}$, where $x_{ij} = 1$ if the edge $(i, j)$ is included in the loop path and $x_{ij} = 0$ otherwise. The objective is to minimize the total distance traveled:

$$\min \sum_{i \in V} \sum_{j \in V,\, j \neq i} d_{ij}\, x_{ij} \tag{1}$$

subject to the following constraints. Each city must be visited exactly once:

$$\sum_{i \in V,\, i \neq j} x_{ij} = 1, \quad j \in V \tag{2}$$

Each city must be left exactly once:

$$\sum_{j \in V,\, j \neq i} x_{ij} = 1, \quad i \in V \tag{3}$$

Subtour elimination constraints to prevent subtours:

$$\sum_{i \in S} \sum_{j \in S,\, j \neq i} x_{ij} \le |S| - 1, \quad S \subset V,\ 2 \le |S| \le |V| - 1 \tag{4}$$

$$x_{ij} \in \{0, 1\}, \quad i, j \in V \tag{5}$$
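As a small illustration of the objective in the formulation above, the following sketch evaluates the length of a closed tour from a distance matrix. The four-city coordinates and the helper names `distance_matrix` and `tour_length` are hypothetical and chosen only for this example; representing a solution as a permutation of city indices satisfies the visit-once and leave-once constraints by construction.

```python
import math

def distance_matrix(coords):
    """Euclidean distance d[i][j] between every pair of cities."""
    return [[math.dist(a, b) for b in coords] for a in coords]

def tour_length(dist, tour):
    """Length of the closed tour given as a permutation of city indices."""
    n = len(tour)
    return sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))

# Hypothetical 4-city instance: the corners of a 4 x 3 rectangle.
coords = [(0, 0), (0, 3), (4, 3), (4, 0)]
dist = distance_matrix(coords)
length = tour_length(dist, [0, 1, 2, 3])      # walks the perimeter
crossing = tour_length(dist, [0, 2, 1, 3])    # uses both diagonals
```

Comparing `length` with `crossing` shows how the same set of cities yields different objective values depending on the visiting order.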

Introduction and improvement of the elite ant colony optimization
The elite ant colony optimization
The ACO has been under development for over 20 years, and numerous researchers have continuously improved and refined it. Notable improved variants include the Maximum-Minimum Ant Colony Optimization, the Elite Ant Colony Optimization, and the Sorting-Based Ant Colony Optimization. Among them, the Elite Ant Colony Optimization (EACO) (Dorigo, Maniezzo & Colorni, 1996) introduces an elite ant strategy, which rewards ants that discover the optimal path in the current cycle with additional pheromones. This strategy reduces the number of iterations required by the ACO and improves the quality of the solution to some extent.
(1) Transition probability
Each ant uses a probabilistic selection method to decide on the transfer from the current city $i$ to the next city $j$ and releases a certain amount of pheromone during the transfer. Initially, the pheromone concentration on every path is equal. The probability that ant $k$ transfers from the current city $i$ to the next city $j$ is given by Eq. (6):

$$p_{ij}^{k} = \begin{cases} \dfrac{[\tau_{ij}]^{\alpha}\,[\eta_{ij}]^{\beta}}{\sum_{s \notin \mathrm{tabu}_k} [\tau_{is}]^{\alpha}\,[\eta_{is}]^{\beta}}, & j \notin \mathrm{tabu}_k \\ 0, & \text{otherwise} \end{cases} \tag{6}$$

where $\eta_{ij}$ is the heuristic factor between the current city $i$ and the next city $j$, $\tau_{ij}$ is the pheromone concentration left by the ants between the current city $i$ and the next city $j$, $\mathrm{tabu}_k$ ($k = 1, 2, 3, \ldots$) is the tabu list recording the cities that ant $k$ has already visited, $\alpha$ is the pheromone factor, which weights the pheromone remaining on a path, and $\beta$ is the heuristic factor, which weights the heuristic information.
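The transition rule of Eq. (6) and the roulette wheel draw that follows it can be sketched as below. This is a minimal illustration, not the authors' implementation: the function names, the default values `alpha=1.0` and `beta=5.0`, the three-city distance matrix, and the choice of the heuristic factor as the inverse distance are all assumptions made for the example.

```python
import random

def transition_probabilities(i, unvisited, tau, eta, alpha=1.0, beta=5.0):
    """Probability of moving from city i to each city not in the tabu list."""
    weights = {j: (tau[i][j] ** alpha) * (eta[i][j] ** beta) for j in unvisited}
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}

def roulette_select(probs, rng=random):
    """Draw one city with probability proportional to its share of the wheel."""
    r = rng.random()
    cumulative = 0.0
    for city, p in probs.items():
        cumulative += p
        if r <= cumulative:
            return city
    return city  # guard against floating-point round-off on the last slot

# Hypothetical data: three cities on a line, equal initial pheromone.
dist = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
eta = [[1.0 / d if d else 0.0 for d in row] for row in dist]  # assumed 1/d
tau = [[1.0] * 3 for _ in range(3)]
probs = transition_probabilities(0, [1, 2], tau, eta)
next_city = roulette_select(probs, random.Random(0))
```

With equal pheromone everywhere, the nearer city receives the larger probability, which is the behavior the heuristic factor is meant to produce.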
(2) Update of pheromones
The pheromones on each path are updated once all the ants have traveled through all the cities. The EACO's pheromone update has three steps: pheromone volatilization, pheromone release by each ant along its own path, and a pheromone reward for elite ants.
Pheromone volatilization equation:

$$\tau_{ij} \leftarrow (1 - \mu)\,\tau_{ij} \tag{7}$$

where $\mu$ is the rate of volatilization. The amount of pheromone that ant $k$ leaves on the path in the current iteration is calculated as:

$$\Delta\tau_{ij}^{k} = \begin{cases} \dfrac{Q}{d_{(ij)}}, & (i, j) \in T_k \\ 0, & \text{otherwise} \end{cases} \tag{8}$$

where $Q$ is the pheromone augmentation factor, $T_k$ is the path of ant $k$, and $d_{(ij)}$ is the length of the current edge; the smaller $d_{(ij)}$ is, the more pheromone the edge receives.
Update formula for the additional pheromones awarded to elite ants:

$$\Delta\tau_{ij}^{bk} = \begin{cases} \dfrac{e}{L_{best}}, & (i, j) \in T_{bk} \\ 0, & \text{otherwise} \end{cases} \tag{9}$$

where $e$ is the pheromone augmentation factor for the optimal path, $L_{best}$ is the optimal solution of the current loop, and $T_{bk}$ is the path of the elite ants.
The pheromones for all ants are updated using the following equation:

$$\tau_{ij}(t+1) = (1 - \mu)\,\tau_{ij}(t) + \sum_{k=1}^{m} \Delta\tau_{ij}^{k} + \Delta\tau_{ij}^{bk} \tag{10}$$

Adaptive elite ant colony optimization (AEACO)
The EACO exhibits similar robustness to the ACO and can be easily integrated with other algorithms. While the EACO reduces the number of iterations required by the ACO, it still inherits the limitations of slow convergence and susceptibility to local optima. Therefore, this article proposes two improvement strategies for the slow convergence of the EACO.
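The three-step EACO pheromone update described above (volatilization, per-ant deposit, elite reward) can be sketched as follows. This is an illustration under stated assumptions: the deposit `Q / d_ij` follows the wording of the text, while the elite reward `e / best_length`, the symmetric update of both edge directions, and all parameter values are assumptions made for the example.

```python
def update_pheromones(tau, tours, dist, best_tour, best_length,
                      mu=0.1, Q=1.0, e=5.0):
    """Apply one EACO-style pheromone update in place and return tau."""
    n = len(tau)
    # Step 1: volatilization on every edge.
    for i in range(n):
        for j in range(n):
            tau[i][j] *= (1.0 - mu)
    # Step 2: every ant deposits Q / d_ij on the edges of its own tour.
    for tour in tours:
        for a, b in zip(tour, tour[1:] + tour[:1]):
            tau[a][b] += Q / dist[a][b]
            tau[b][a] += Q / dist[a][b]   # assumed symmetric instance
    # Step 3: extra reward on the edges of the elite (best) tour.
    for a, b in zip(best_tour, best_tour[1:] + best_tour[:1]):
        tau[a][b] += e / best_length      # assumed form of the reward
        tau[b][a] += e / best_length
    return tau

# Hypothetical 3-city instance with unit distances and one ant.
dist = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
tau = [[1.0] * 3 for _ in range(3)]
update_pheromones(tau, [[0, 1, 2]], dist,
                  best_tour=[0, 1, 2], best_length=3.0,
                  mu=0.5, Q=1.0, e=3.0)
```

Edges on the elite tour end up with more pheromone than they would from the ordinary deposit alone, which is what concentrates later ants on the best path found so far.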

Strategy one
The optimal path that a colony finds between its nest and a food source relies heavily on the information transmitted through pheromones. If the ant colony already knows the starting city $a$ and the ending city $b$ (the ending city is the last city visited before the colony returns to its starting point), it tends to choose cities closer to the ending city, which biases the selection of the next city during exploration. By utilizing the best pheromone information between the starting city $a$ and the ending city $b$, the colony can determine an appropriate direction to explore, thereby accelerating the search. To this end, this article proposes an improved pheromone update strategy in conjunction with the EACO, in which all edges are initialized based on the distance between the starting city $a$ and the ending city $b$. The initial pheromone concentration is given by Eq. (11), where $\tau_{ij}(0)$ is the initial pheromone concentration between the current city $i$ and the next city $j$, $d_{ab}$ is the linear distance between the starting city $a$ and the ending city $b$, $d_{aj}$ is the distance from the starting city $a$ to the next city $j$, and $d_{jb}$ is the distance from the next city $j$ to the ending city $b$. This strategy changes the initial pheromone concentration of the EACO and focuses on the distance between the current city and the ending city, which gives the initial ant colony directional guidance and avoids blind searches, thus improving both the speed and the accuracy of the solution.

Strategy two
Ants select the next city from the current city primarily by the roulette wheel method. This method keeps the algorithm well balanced: candidates with higher fitness values are more likely to be selected, while candidates with lower fitness values still have a chance of being chosen. This allows the ant colony to explore and experiment within the solution space, since every unvisited city can be selected. However, in the actual solution process, it is important to avoid consecutively visiting two cities that are particularly far apart. If an ant selects a distant city as its next destination, iterations are wasted and the solution time of the algorithm increases. More critically, such a choice can significantly affect the overall direction of the search and trap the algorithm in a local optimal solution. Therefore, this article restricts the city selection process within the roulette wheel method: ants are not allowed to choose a distant city as the next destination. This constraint is expressed by Eqs. (12) and (13), where $R$ is the maximum distance at which another city can still be visited, $\lambda$ is a parameter in $[1, 2]$, $d_{ir}$ is the distance from the current city $i$ to an unvisited city $r$, and $x_{ir}$ is the decision variable for moving from the current city $i$ to the unvisited city $r$. Eq. (13) indicates whether the unvisited city $r$ is admitted to the roulette wheel: $x_{ir} = 1$ means it is accepted and $x_{ir} = 0$ means it is not.
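The distance restriction of strategy two can be illustrated with a short filter that runs before the roulette wheel draw. In this sketch the threshold `R` is passed in as a plain number rather than computed, and the fallback to the single nearest city when no candidate lies within `R` is an assumption added so that the ant can always move; neither detail is taken from the article.

```python
def candidate_cities(i, unvisited, dist, R):
    """Unvisited cities within distance R of city i (the cities with x_ir = 1)."""
    near = [r for r in unvisited if dist[i][r] <= R]
    if not near:
        # Assumed fallback: keep the nearest city so the tour can continue.
        near = [min(unvisited, key=lambda r: dist[i][r])]
    return near

# Hypothetical distance matrix: city 3 is far away from the others.
dist = [[0, 1, 2, 9],
        [1, 0, 1, 8],
        [2, 1, 0, 7],
        [9, 8, 7, 0]]
close = candidate_cities(0, [1, 2, 3], dist, R=2.5)
forced = candidate_cities(0, [3], dist, R=2.5)
```

Only the cities returned by `candidate_cities` would then enter the roulette wheel, so a distant city such as city 3 is chosen only when nothing closer remains.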
Our algorithm, called the Adaptive Elite Ant Colony Optimization (AEACO), is an efficient optimization-seeking algorithm for small-scale TSP instances that automatically adjusts the number of iterations and the number of ants for different small-scale instances. To evaluate its performance, the AEACO is compared with the ACO and the EACO. Each of the three algorithms is run 30 times, and the optimal solution, average error rate, and solution time of each are reported. Table 1 and Figs. 1 and 2 show the experimental results. Time is the time the algorithm takes to solve the TSP instance 30 times. SD is the error rate calculated by Eq. (14), the relative difference between the optimal solution obtained by the algorithm (denoted Opt) and the known optimal solution from TSPLIB (denoted KopS):

$$SD = \frac{Opt - KopS}{KopS} \times 100\% \tag{14}$$

$SD_{avg}$ is the average error of the final results over the 30 runs, $SD_{best}$ is the error of the best solution over the 30 runs, and Best is the best solution found in the 30 runs.
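The error measures defined above can be computed as in the following sketch. The known optimum and the run results are hypothetical numbers chosen for the example, not values from the experiments, and expressing the error rate as a percentage is an assumption.

```python
def error_rate(opt, known_opt):
    """Relative gap between a found tour length and the known optimum, in percent."""
    return (opt - known_opt) / known_opt * 100.0

# Hypothetical values for illustration only.
known_opt = 100.0
run_results = [100.0, 102.0, 105.0]   # final tour length of each run

best = min(run_results)
sd_best = error_rate(best, known_opt)
sd_avg = sum(error_rate(r, known_opt) for r in run_results) / len(run_results)
```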
Parameter setting of EACO: 5, the number of ants m is the number of cities.
Parameter setting of AEACO: 2, the number of ants m is the number of cities, the number of iterations of the algorithm is 0.5 times the number of cities.
As Fig. 1 shows, the AEACO outperforms the ACO and the EACO in both speed and accuracy: it converges to the optimal solution with the least time and the fewest iterations. Fig. 2 and Table 1 likewise show that the AEACO converges quickly and produces better results. Nevertheless, as the instance scale increases, the algorithm's accuracy diminishes and the solution time lengthens considerably.

Simulated annealing (SA)
According to Metropolis et al. (1953), the basic idea behind solid annealing is to heat the solid to a specific temperature and then cool it slowly. When a solid is heated, its interior particles become disordered as the temperature rises, and its internal energy increases. When the solid is cooled, its interior particles become ordered as the temperature drops, reaching an equilibrium state at each temperature, and its internal energy decreases. The SA replicates the steps of this process, including the setting of the initial temperature, the initial solution, and the temperature decline.
The SA starts from an initial solution, denoted $x$, and perturbs $x$ according to predefined rules to generate a candidate solution, denoted $y$. Whether $y$ is accepted is determined by the Metropolis criterion, the fundamental acceptance rule of the SA:

$$P = \begin{cases} 1, & E_{new} < E_{old} \\ \exp\!\left(-\dfrac{E_{new} - E_{old}}{T_k}\right), & E_{new} \ge E_{old} \end{cases} \tag{15}$$

where $E_{old}$ and $E_{new}$ are the objective function values of the current and candidate solutions, and $T_k$ is the current temperature. If accepted, $y$ replaces $x$ as the current solution, from which further candidate solutions are generated. As the temperature decreases, the current solution evolves iteratively, and this progressive evolution leads the algorithm to converge towards the global optimal solution.
The SA accepts worse solutions with a certain probability. Its hill-climbing ability is therefore strong and it does not easily fall into a local optimum, but it converges slowly and has poor local search ability.
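The acceptance rule described above can be written in a few lines. The sketch below is a generic Metropolis test for a minimization problem; the function name and the two temperatures used in the demonstration are assumptions for illustration.

```python
import math
import random

def metropolis_accept(e_old, e_new, temperature, rng=random):
    """Accept improvements always; accept worse moves with prob. exp(-dE / T)."""
    if e_new <= e_old:
        return True
    return rng.random() < math.exp(-(e_new - e_old) / temperature)

rng = random.Random(1)
# At a high temperature a slightly worse tour is accepted most of the time.
hot = sum(metropolis_accept(100.0, 101.0, 1000.0, rng) for _ in range(1000))
# At a very low temperature a worse tour is practically never accepted.
cold = sum(metropolis_accept(100.0, 110.0, 1e-3, rng) for _ in range(1000))
```

The contrast between `hot` and `cold` is the mechanism behind the trade-off noted in the text: high temperatures let the search climb out of local optima, while low temperatures make it behave like a pure descent.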

Simulated annealing with multiple optimization seeking methods (MSA)
The SA is a powerful global optimization algorithm known for its excellent hill-climbing ability.However, the SA is heavily dependent on the initial temperature and a single optimization search method.As a result, the algorithm can fall into premature convergence and become trapped in local optimal solutions.Therefore, this article proposes two strategies to improve the deficiencies of the SA.

Strategy one
In TSP instances, the traditional simulated annealing perturbs the solution sequence by randomly exchanging the positions of a single pair of cities, which is the primary cause of the algorithm's slow convergence. The perturbation mechanism, however, plays a critical role in the algorithm's performance. Because the perturbation step of the SA cannot be overly complicated, we employ three conventional perturbation operations to change the old solution sequence.
(1) Swap method: randomly swap two positions in the solution sequence.
(2) Random insert method: randomly swap two adjacent positions in the solution sequence.
(3) 2-opt method: randomly select two positions in the solution sequence and reverse the order of the segment between them.
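The three perturbation operations listed above can be sketched as follows. The function names are hypothetical, and the insert move is implemented exactly as the list words it (a swap of two adjacent positions); each function returns a new sequence and leaves its input unchanged.

```python
import random

def swap_move(seq, rng=random):
    """(1) Swap: exchange two randomly chosen positions."""
    s = list(seq)
    i, j = rng.sample(range(len(s)), 2)
    s[i], s[j] = s[j], s[i]
    return s

def adjacent_insert_move(seq, rng=random):
    """(2) Random insert, as worded in the list: swap two adjacent positions."""
    s = list(seq)
    i = rng.randrange(len(s) - 1)
    s[i], s[i + 1] = s[i + 1], s[i]
    return s

def two_opt_move(seq, rng=random):
    """(3) 2-opt: reverse the segment between two randomly chosen positions."""
    s = list(seq)
    i, j = sorted(rng.sample(range(len(s)), 2))
    s[i:j + 1] = s[i:j + 1][::-1]
    return s

rng = random.Random(7)
tour = [0, 1, 2, 3, 4, 5]
swapped = swap_move(tour, rng)
shifted = adjacent_insert_move(tour, rng)
reversed_part = two_opt_move(tour, rng)
```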

Strategy two
The traditional SA typically employs a sufficiently large initial temperature to enhance search performance. However, there is no universally recommended method for determining the initial temperature, and an inappropriate choice both wastes time and hampers the effectiveness of the search. Following the method described in the literature (Lin, Kao & Hsu, 1993), we employ an initial temperature tailored to the characteristics of our algorithm, given by Eq. (16), where $E_{avg}$ and $E_{min}$ are the expected average and minimum values, respectively, of the objective function over $N$ randomly selected feasible solutions in the solution space, $p$ is a parameter in $[0, 1]$, and $a$ is a parameter that prevents the starting temperature from being too high, which would slow the convergence of the algorithm.
The MSA-1 denotes the MSA using only strategy one and not strategy two: it perturbs the solution sequence with a method selected at random among the swap, random insertion, and 2-opt methods. In this article, three instances, pr76, tsp225, and pcb1173, are used to test the MSA-1 under four different proportions of the swap, random insertion, and 2-opt perturbations. The experimental results are shown in Table 2. In the table, K is the number of temperature changes and Errors is the average error of the solution over 30 experiments. Agiter is the average iteration number at which the best result emerges over the 30 runs, and $SD_{avg}$ is the average error rate over the 30 runs.
As shown in Table 2, the solution is best when swap method : random insertion method : 2-opt method = 1:1:2. A larger proportion of the 2-opt method is beneficial for obtaining better solutions and reducing the number of iterations.
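Choosing a perturbation method in the 1:1:2 proportion can be done with a weighted random draw. The sketch below only selects the name of the move; the labels in `MOVES` and the sample size are assumptions for illustration.

```python
import random
from collections import Counter

MOVES = ["swap", "insert", "two_opt"]
WEIGHTS = [1, 1, 2]   # swap : random insertion : 2-opt = 1:1:2

def pick_move(rng=random):
    """Draw one perturbation method according to the 1:1:2 proportion."""
    return rng.choices(MOVES, weights=WEIGHTS, k=1)[0]

rng = random.Random(42)
counts = Counter(pick_move(rng) for _ in range(10_000))
```

Over many draws, roughly half of the perturbations are 2-opt reversals and a quarter each are swaps and insertions.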
Perturbing with the swap, random insertion, and 2-opt methods improves the perturbation capability of the SA, which can lead to more accurate solutions. Using strategy two to set the initial temperature of the SA avoids a temperature that is too high or too low, either of which would cause the algorithm to fall into a local optimal solution. In this study, comparison tests between the MSA-1 and the conventional SA are carried out using the same initial solution sequence, initial temperature, termination temperature, cooling factor, and maximum number of iterations. The MSA, by contrast, does not use the same initial temperature. A total of 30 tests are carried out, and the experimental results are displayed in Table 3 and Fig. 3.
The parameters were set as follows: initial temperature: 300, termination temperature: 1, cooling factor: 0.998, and the maximum number of iterations: 100.
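To give a sense of the search effort these settings imply, the following sketch counts the temperature levels between the initial and termination temperatures. It assumes a geometric cooling schedule (the temperature is multiplied by the cooling factor at each step) and treats the maximum number of iterations as a per-temperature limit; both are assumptions, since the text lists only the parameter values.

```python
def cooling_steps(t0=300.0, t_end=1.0, q=0.998):
    """Number of geometric cooling steps until the temperature drops to t_end."""
    steps, t = 0, t0
    while t > t_end:
        t *= q          # assumed schedule: T <- q * T
        steps += 1
    return steps

steps = cooling_steps()
# Assumed interpretation: 100 candidate moves are tried at each temperature.
total_moves = steps * 100
```

A cooling factor this close to 1 yields a few thousand temperature levels, which helps explain why solution time becomes a concern for the SA variants.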
Experiments show that the solution accuracy of the MSA-1 is much higher than that of the SA in the same environment, mainly because strategy one improves the exploration pattern of the SA. Compared with the MSA-1, the MSA adds strategy two, which provides an effective initial temperature and leads to high convergence accuracy. However, the average time of the MSA is longer than that of the SA, so the MSA cannot improve solution accuracy and reduce solution time simultaneously, mainly because its local search capability is insufficient.
A hybrid algorithm combining enhanced ant colony optimization and improved double simulated annealing via clustering (ACO-DSA)
After the improvements to Ant Colony Optimization and simulated annealing, the AEACO helps the initial ant colony form a good search path and rewards the elite ants that find the optimal path with extra pheromones. However, this algorithm cannot be used for larger TSP instances, mainly because the AEACO cannot get rid of the inherent defects of the ACO. The enhanced simulated annealing increases the perturbation of the solution sequence and improves climbing ability and solution accuracy, but it also increases the solution time. From an overall perspective, the two algorithms complement each other: the AEACO is responsible for the rapid search of small-scale TSP instances, and the MSA is responsible for helping the overall TSP instance jump out of local optimal solutions. Based on this analysis, this article uses a clustering algorithm to combine the advantages of the AEACO and the MSA.

Steps of the algorithm
The algorithm in this article is carried out in three processes.
Initialization process: Formation of small cluster classes and cluster sequences, intra-cluster optimization.
First annealing process: Optimizing the sequence of clusters and intra-cluster optimization.
Second annealing process: Global optimization.

Formation of small cluster classes and cluster sequences
TSP instances can be fundamentally viewed as sorting problems, allowing large-scale TSP instances to be decomposed into multiple smaller ones.Solving the TSP involves determining the sequence in which the salesman selects the next city from the current city, typically focusing on cities around the current location.This approach is preferred over selecting cities across significant distances to maintain precision.To address this, as shown in Fig. 4, a clustering algorithm can be applied to partition all cities in the TSP instances into several small clusters, comprising cities that are in close proximity to each other.These clusters represent groups of closely located cities.
Steps for forming small cluster classes:
Step 1: Randomly select k cities as Medoids.
Step 2: Assign each remaining city to the cluster class of its closest Medoid.
Step 3: Update: for each cluster formed by the assigned data points, select a new Medoid that minimizes the total dissimilarity or distance within that cluster. Iterate through each data point within the cluster and calculate the total dissimilarity as the sum of distances between the data point and all other points in the cluster. Choose the data point with the lowest total dissimilarity as the new Medoid for that cluster.
Step 4: Evaluate whether each cluster exceeds its maximum member capacity and consider reclassifying any surplus members from the exceeding clusters to other appropriate clusters.
Step 5: Repeat Steps 2-4 until all Medoids no longer change or the set maximum number of iterations has been reached.
Step 6: Form the cluster sequence: use the greedy algorithm and the cluster centroids to sort the clusters.

[Fig. 4: Segmentation of all cities into clusters using the K-M algorithm, followed by sorting of all clusters. The legend distinguishes clusters from cities.]
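Steps 1 to 3 and 5 of the clustering procedure above can be sketched as a plain k-medoids loop. This is a simplified illustration, not the article's implementation: the capacity check of Step 4 and the greedy ordering of Step 6 are omitted, the function name and the six-city instance are hypothetical, and the cities are assumed to have distinct coordinates so that no cluster becomes empty.

```python
import math
import random

def k_medoids(coords, k, max_iter=100, rng=random):
    """Partition cities into k clusters; returns {medoid_index: [member indices]}."""
    n = len(coords)
    d = [[math.dist(a, b) for b in coords] for a in coords]
    medoids = rng.sample(range(n), k)                      # Step 1
    clusters = {}
    for _ in range(max_iter):
        # Step 2: assign every city to its closest medoid.
        clusters = {m: [] for m in medoids}
        for c in range(n):
            nearest = min(medoids, key=lambda m: d[c][m])
            clusters[nearest].append(c)
        # Step 3: the new medoid minimizes the total distance within its cluster.
        new_medoids = [
            min(members, key=lambda p: sum(d[p][q] for q in members))
            for members in clusters.values()
        ]
        # Step 5: stop when the medoids no longer change.
        if set(new_medoids) == set(medoids):
            break
        medoids = new_medoids
    return clusters

# Hypothetical instance: two groups of three cities each.
coords = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
clusters = k_medoids(coords, k=2, rng=random.Random(0))
```

Each resulting cluster is a small TSP instance that the intra-cluster optimization can then solve on its own.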

Intra-cluster optimization
The AEACO is an efficient algorithm for solving small-scale TSP instances with a known starting city and ending city. Therefore, in this article's algorithm, the AEACO is used to find the best solution for each small-scale instance into which the TSP instance is split. As shown in Fig. 5, the AEACO finds the best solution inside each of the m cluster classes as well as the best solution sequence.
Steps for finding the best solutions and solution sequences within a cluster class:
Step 1: Parameter initialization: the number of ants m (equal to the number of cities), the pheromone importance factor $\alpha$, the heuristic function importance factor $\beta$, the pheromone volatility factor $\mu$, the total pheromone release Q, the maximal number of iterations itermax (0.5 times the number of cities), the elite reward strategy value e, and the parameter $\lambda$.
Step 2: Initialization of pheromones: initialize the pheromone concentrations on each path according to Eq. (11).
Step 3: Construct the solution space: select the candidate set from the unvisited cities according to strategy two of the AEACO (Eqs. (12) and (13)). Each ant selects its next city from the candidate set based on pheromone trails and heuristic information. Repeat the process until all ants have visited all cities, ensuring that each city is visited exactly once. Calculate the path length of each ant and identify the ant with the best path (the elite ant).
Step 4: Pheromone update: leave pheromones on the edges each ant has passed and reward the elite ants with a certain amount of extra pheromones on the edges they have passed.
Step 5: Termination condition: if iter < itermax (the current iteration number is less than the maximum iteration number), clear the path record table and return to Step 3; if iter ≥ itermax, terminate the calculation and output the best solution found.

Optimizing the sequence of clusters
The small-scale clusters must be recombined into a solution of the large-scale TSP instance, and this recombination requires a suitable ordering of the clusters. The algorithm in this article therefore uses the MSA to find the optimal cluster sequence. First, the current cluster sequence is perturbed to produce a new cluster sequence. Next, each cluster in the new sequence is searched internally, and the optimal solution and cluster sequence are obtained using the table of nearest cities together with the optimal solutions from the intra-cluster searches. Lastly, the Metropolis criterion decides whether to accept the new cluster sequence, as shown in Fig. 6.

SE (Starting-Ending) strategy
As illustrated in Fig. 7, the algorithm presented in this article primarily utilizes the clustering method to partition cities into distinct clusters and then determines the optimal sequence for connecting these clusters. Hence, the most crucial consideration is the selection of the cities that connect neighboring clusters. After the perturbation, the starting and ending cities within the perturbed part must be determined for each cluster. To achieve this, pheromones and distances from the surrounding clusters to the cities in the perturbed part are taken into account.
Selection probabilities are established using these factors and are combined with the roulette pair method to select the starting city and ending city.The probability of selection of interconnected cities in the current cluster selection and other clusters is given by the formula: Pheromone update of SE strategy: a denotes a city in the cluster k in the disturbed part, m ab denotes the heuristic factor between the city a and the city b, x ab is the pheromone concentration between the city i and the city b, cluster k denotes the cluster of the disturbed part, cluster k denotes disturbed parts of the surrounding cluster, a 1 denotes the pheromone factor, b 1 denotes the heuristic factor.l 1 is representative of the rate of volatilization, U 1 is a constant; L best 1 is the current loop optimal solution of MSA.
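The roulette wheel selection used by the SE strategy can be illustrated with a short sketch. It assumes the familiar ACO-style weight, pheromone raised to a_1 times inverse distance raised to b_1, summed over the cities of the surrounding clusters; this weight and the names `roulette_select` and `pick_connection_city` are assumptions made for illustration, not the article's exact formula.

```python
import random


def roulette_select(weights, rng):
    """Return an index drawn with probability proportional to its weight."""
    total = sum(weights)
    r = rng.random() * total
    acc = 0.0
    for i, w in enumerate(weights):
        acc += w
        if r < acc:
            return i
    return len(weights) - 1  # guard against floating-point round-off


def pick_connection_city(cluster, neighbours, pher, dist, a1=7, b1=10, rng=None):
    """Pick a city of `cluster` to connect towards the cities in `neighbours`.

    Assumed weight of city a: sum over b of pher[a][b]**a1 * (1 / dist[a][b])**b1.
    """
    rng = rng or random.Random()
    weights = [
        sum((pher[a][b] ** a1) * ((1.0 / dist[a][b]) ** b1) for b in neighbours)
        for a in cluster
    ]
    return cluster[roulette_select(weights, rng)]
```

With a large heuristic exponent such as b_1 = 10, cities close to the neighbouring cluster dominate the draw, so the selection is nearly greedy while still leaving room for pheromone to shift it.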
Steps to optimize the sequence of clusters:
Step 1: Parameter initialization: initialize the temperature T_0 with Eq. (16), the maximum number of iterations L_0, the termination temperature T_end, and the cooling factor q_0 (0 < q_0 < 1).
Step 2: Initialization of pheromones: set the same pheromone on each path.
Step 3: The initialization process of the cluster sequence: perform the internal cluster search for each cluster, and build up the optimal solution table of each cluster and the internal sequence table of each cluster according to the sequence of clusters.
Step 4: Perturbation of cluster sequence: perturb the cluster sequence using the swap method, the random insertion method, and the 2-opt method, chosen with probabilities in the ratio 1:1:2.
Step 5: Calculate the results after the perturbation: use the SE strategy to find the starting city and the ending city of the clusters in the perturbed part, run an intra-cluster optimization search on those clusters, and update the optimal solution table and the internal sequence table of the clusters.
Step 6: The Metropolis criterion: use the Metropolis criterion to determine whether to accept the new cluster sequence.
Step 7: Determine if the maximum number of iterations has been reached: if so, exit the current loop, otherwise return to Step 3.
Step 8: Pheromone update: deposit pheromone on the edges traversed by each ant and reward the elite ant with an extra amount of pheromone on the edges of its path.
Step 9: Cooling: lower the current temperature T_now according to the cooling factor q_0.
Step 10: Stop condition: determine whether the termination temperature T_end is reached; if T_now = T_end, the algorithm ends. Otherwise, return to Step 3.
Step 11: Formation of a quality solution: connect all clusters.
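The perturbation in Step 4 can be sketched as follows. This is a minimal sketch, assuming the two positions are drawn uniformly at random; the function name `perturb` and the move labels are illustrative.

```python
import random


def perturb(seq, rng):
    """Return a perturbed copy of seq and the move used.

    The move is swap, random insertion, or 2-opt reversal,
    drawn with weights 1:1:2.
    """
    new = list(seq)
    i, j = sorted(rng.sample(range(len(new)), 2))  # two distinct positions, i < j
    move = rng.choices(["swap", "insert", "two_opt"], weights=[1, 1, 2])[0]
    if move == "swap":
        new[i], new[j] = new[j], new[i]
    elif move == "insert":
        new.insert(i, new.pop(j))  # move the element at j in front of position i
    else:
        new[i:j + 1] = reversed(new[i:j + 1])  # reverse the segment between i and j
    return new, move
```

For example, `perturb(list(range(8)), random.Random(0))` returns a new ordering of the eight cluster indices together with the name of the move that produced it; the input sequence is left unchanged.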

Global optimization
Because the clustering algorithm splits a TSP instance into several small TSP instances mainly according to the nearest principle, the solution assembled from the clusters is not necessarily the optimal solution. The MSA has a strong climbing ability and can jump out of local optima, so a global search is performed on the sequence obtained from the cluster-sequence optimization to increase the quality of the solution of the algorithm in this article.
Steps for global optimization search:
Step 1: Parameter initialization: initial temperature T_1 (calculated by combining Eq. (16) with historical data from Optimizing the sequence of clusters), the maximum number of iterations L_1, the termination temperature T_END, and the cooling factor q_1 (0 < q_1 < 1).
Step 2: Solution sequence perturbation: the solution sequence is perturbed using the swap method, the random insertion method, and the 2-opt method, chosen with probabilities in the ratio 1:1:2.
Step 3: The Metropolis criterion: the Metropolis criterion is used to determine whether to accept the new sequence and the new solution.
Step 4: Determine if the maximum number of iterations has been reached: if so, exit the current loop, otherwise return to Step 2.
Step 5: Cooling: lower the current temperature T_NOW according to the cooling factor q_1.
Step 6: Stop condition: determine whether the termination temperature T_END is reached; if T_NOW = T_END, the algorithm ends. Otherwise, return to Step 2.
Step 7: Output the best solution found: output the best solution found and solution sequence.
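The global search loop above can be sketched as a plain simulated annealing routine. This is a minimal sketch under stated assumptions: geometric cooling (temperature multiplied by the cooling factor), the usual exp(-delta / T) acceptance probability, and only the 2-opt move for brevity; the names `metropolis_accept` and `anneal` are illustrative, and the MSA's own improvements are not reproduced.

```python
import math
import random


def metropolis_accept(delta, temperature, rng):
    """Accept improvements always; accept a worse solution with probability exp(-delta / T)."""
    return delta <= 0 or rng.random() < math.exp(-delta / temperature)


def anneal(tour, dist, t_start, t_end, q, inner_iters, rng):
    """Simulated annealing on a closed tour with 2-opt moves and geometric cooling."""

    def length(t):
        return sum(dist[t[k]][t[(k + 1) % len(t)]] for k in range(len(t)))

    current, current_len = list(tour), length(tour)
    best, best_len = list(current), current_len
    temperature = t_start
    while temperature > t_end:              # Step 6: stop condition
        for _ in range(inner_iters):        # Step 4: inner iteration limit
            i, j = sorted(rng.sample(range(len(current)), 2))
            candidate = current[:i] + current[i:j + 1][::-1] + current[j + 1:]
            cand_len = length(candidate)
            if metropolis_accept(cand_len - current_len, temperature, rng):  # Step 3
                current, current_len = candidate, cand_len
                if current_len < best_len:
                    best, best_len = list(current), current_len
        temperature *= q                    # Step 5: cooling (assumed geometric)
    return best, best_len
```

Tracking `best` separately from `current` means that accepting a worse solution at high temperature never loses the best tour found so far.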

The flow of ACO-DSA
Figure 8 shows the entire flow of the algorithm.

The process of initialization
Step 1: The n cities are clustered using the K-M algorithm.
Step 2: Using the greedy algorithm and cluster centroids, create a cluster sequence.
Step 3: The starting and ending cities of each cluster are selected using the SE strategy. The AEACO is then employed to find the best solution for each cluster, and the clusters are arranged according to the cluster sequence. Finally, the optimal solution table and the internal sequence table are constructed.
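Step 2 of the initialization can be sketched as a nearest-neighbour pass over the cluster centroids. This is a minimal sketch, assuming the clusters are already available as lists of (x, y) coordinates and that "greedy" means always moving to the closest unvisited centroid; the names `centroid` and `greedy_cluster_sequence` are illustrative.

```python
import math


def centroid(points):
    """Arithmetic mean of a list of (x, y) points."""
    xs, ys = zip(*points)
    return (sum(xs) / len(points), sum(ys) / len(points))


def greedy_cluster_sequence(clusters, start=0):
    """Order clusters by repeatedly moving to the nearest unvisited centroid."""
    centres = [centroid(c) for c in clusters]
    order = [start]
    remaining = set(range(len(clusters))) - {start}
    while remaining:
        last = centres[order[-1]]
        nxt = min(remaining, key=lambda k: math.dist(last, centres[k]))
        order.append(nxt)
        remaining.remove(nxt)
    return order
```

The resulting order is only a starting point; in the article it is subsequently refined by the first annealing process.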

First annealing process
Step 4: Initialize the initial temperature, termination temperature, number of iterations, and cooling factor of the intra-cluster optimization.
Step 5: Perturb the cluster sequence and perform intra-cluster optimization search for the cities in the perturbed part of the cluster to calculate the solution value of the perturbed part.
Step 6: Use the Metropolis acceptance criterion to judge whether the new solution sequence is accepted; if so, update the optimal solution table and the internal sequence table of each cluster; if not, do not update these two tables.
Step 7: Judge whether the number of iterations is reached, if not, return to Step 5, otherwise, execute the next step.
Step 8: Judge whether the termination temperature is reached, if not, return to Step 5, otherwise find out the solution sequence and solution based on the optimal solution table of each cluster and the internal sequence table of each cluster, and execute the next step.

Second annealing process
Step 9: Initialize the initial temperature, termination temperature, number of iterations, and cooling factor of the global optimization.
Step 10: Perturb the solution sequence and calculate the new solution.
Step 11: Use the Metropolis acceptance criterion to judge whether the new solution sequence is accepted; if so, update the solution sequence; otherwise, do not update the solution sequence.
Step 12: Judge whether the number of iterations is reached, if not, return to Step 10, otherwise, execute the next step.
Step 13: Judge whether the termination temperature is reached, if not, return to Step 10, otherwise, output the solution sequence and the best solution found.

EXPERIMENT AND RESULT ANALYSIS
To test the effectiveness of the ACO-DSA, experiments are conducted on TSP instances of different sizes from the TSPLIB database, arranged as follows:
(1) The ACO-DSA is used to find the optimal solution for TSP instances of different sizes.
(2) The effect of varying the number of clusters on the first annealing process is examined at the same TSP scale and temperature control.

Testing the solution effect of the ACO-DSA
To verify the operational effectiveness of the ACO-DSA, 30 tests are run on TSP instances of different sizes. Table 4 shows the parameter settings of the algorithm in this article for each instance; in the table, Size represents the number of clusters into which the TSP instance is divided. The parameter settings of the AEACO and the SE strategy are: a = 7, b = 10, q = 0.1, Q = 1, e = 0.5, k = 1.2, a_1 = 7, and b_1 = 10; the number of ants equals the number of cities, and the number of iterations of the algorithm is 0.5 times the number of cities. Table 5 shows the results of each process for each instance, together with the average of the best solutions obtained over the 30 runs and the total time required for each experiment. Time is the solution time of the process; SD is the error rate, calculated by Eq. (14); SD_avg is the average error of the result at the end of each process over the 30 runs; SD_best is the error of the best result over the 30 runs; and Best is the best result over the 30 runs.
The experiments show that, when the ACO-DSA is used to solve instances of various sizes, the solution accuracy increases at the end of each process and converges to a specific range of accuracy. Figure 9 shows some examples of the best paths obtained by the ACO-DSA; the results are good in every case shown.
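The reported quantities can be computed from a list of run results as in the sketch below. It assumes that the error rate is the relative deviation from the known optimum in percent, (length - optimum) / optimum × 100; this form is an assumption made for illustration and should be checked against Eq. (14). The names `error_rate` and `summarize_runs` are hypothetical.

```python
from statistics import mean


def error_rate(length, optimum):
    """Relative deviation from the known optimum, in percent (assumed form of SD)."""
    return (length - optimum) / optimum * 100.0


def summarize_runs(lengths, optimum):
    """Summarize repeated runs on one instance as Best, SD_best, and SD_avg."""
    best = min(lengths)
    return {
        "Best": best,
        "SD_best": error_rate(best, optimum),
        "SD_avg": mean(error_rate(x, optimum) for x in lengths),
    }
```

Here `lengths` would hold the tour lengths of the 30 runs on one instance, and `optimum` the best known tour length for that instance.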

Testing the impact of cluster size
Decomposing the instance into numerous small clusters and effectively ranking each cluster is the primary priority of the first annealing process of the ACO-DSA. The effect of the number of clusters on the time and accuracy of the ACO-DSA is investigated by varying the number of clusters and the number of iterations while maintaining the same termination temperature and cooling factor. Table 6 illustrates the influence of the number of clusters and the number of iterations on the initial stage of the proposed algorithm using the pr299 instance. The columns of the table correspond to the number of cities in the cluster (5, 10, 15, and 20), while the rows correspond to the number of iterations L_0 (1, 5, 10, 15, 20, and 25).
The formula for calculating the number of cities in the cluster is shown below:
Parameter settings for the pr299 instance: termination temperature: 10, cooling factor: 0.998, a_0 = 2, p_0 = 0.01 (Error indicates the average error rate over 10 runs; Time represents the solution time for the first annealing process of the ACO-DSA to solve the TSP instance 10 times).
The results indicate that increasing the number of cities in the cluster leads to longer optimization times but also achieves better optimization results for the same case. Similarly, for the same number of cities in the cluster, a higher number of iterations yields better optimization outcomes, albeit with a significant increase in processing time. Specifically, the solution accuracy reaches its peak with 20 cities in the cluster and 25 iterations. Nevertheless, a configuration with a large number of cities in the cluster and a high number of iterations is not recommended because of the considerable time consumption. Figure 10 compares the results of this article's algorithm with those of the ACO, SA, GA, and PSO. The line graphs show the error rates of the optimal solutions obtained by the SA, GA, PSO, and this article's algorithm over 30 runs on 10 TSP instances; this article's algorithm achieves the lowest error rate. The bar chart compares the two error rates of the ACO, SA, GA, and PSO.

Comparison with other algorithms
Table 8 presents the experimental results of the algorithm proposed in this article and of several algorithms from the literature. The design idea of the ACO-ABC (Gunduz, Kiran & Ozceylan, 2015) is the same as that of the algorithm in this article: both use the ACO to obtain an initial solution and then use another optimization method to improve it. The IGSAA (He, Wu & Xu, 2018) and the ACO-PSO (Qian & Su, 2018) are improvements to simulated annealing and Ant Colony Optimization, respectively (note: '-' indicates that the value is not mentioned in the corresponding literature).
Figure 11 compares this article's algorithm with other algorithms in the literature. The average of SD_avg and the average of SD_best are, respectively, the average error rate and the optimal solution error rate, averaged over all the TSP instances reported in the other literature and compared with the results of this article's algorithm. Figure 11 shows that this article's algorithm is better than the ACO-ABC and the ACO-PSO and slightly worse than the IGSAA in terms of the average error rate; however, the IGSAA can only be used on small-scale TSP instances and does not apply to large-scale TSP instances.

APPLICATION OF THE ACO-DSA ON LOGISTICS
Figure 12 shows a map of 44 cities that a company must visit to transport supplies. The journey starts and ends in Guizhou, forming a closed-loop route. The primary objective is to determine the optimized route through all 44 cities and the corresponding distance to be traveled. The actual distribution process is subject to the following conditions: (1) the geographic locations of the distributing center and the customer points are known; (2) the shortest path between each customer point and the distributing center is known; (3) the distribution path starts from the distributing center and returns to it after all deliveries are completed, forming a closed-loop distribution route; (4) each customer point can be visited only once; (5) the effect of road factors on the vehicle is not considered. Figure 12 also shows the customer points that the logistics company needs to serve and the optimal path obtained by using the ACO-DSA (the white car represents the distributing center).

Results of the simulation
The path planning for this company's problem is carried out by the ACO, EACO, AEACO, SA, MSA, and this article's algorithm based on the customer points indicated above. Table 9 records the minimum, maximum, and average values of the results. In the simulation results, both the MSA and this article's algorithm find the minimum optimized path length of 13,524.50 km for this company, but this article's algorithm has the shortest solution time and the highest solution quality. The average value also shows that this algorithm gives the most stable results.
The approximate iterative trajectories of the SA, MSA, and the algorithm in this article are shown for better comparison. Figure 13 shows the iterative results of the ACO, EACO, AEACO, and this article's algorithm for solving the company's path planning problem. Compared with the ACO, EACO, and AEACO, this article's algorithm converges faster, has better initialization results, and takes less time to solve. Figure 14 compares the optimization effect of the SA, MSA, and the algorithm in this article. The results show that the convergence speed and optimization effect of this article's algorithm are much better than those of the SA and MSA.

Figure 3
Several errors in the results of the SA, MSA-1 and MSA solving different size instances. DOI: 10.7717/peerj-cs.1609/fig-3

Figure 10
Comparison of the ACO-DSA with other traditional algorithms in terms of solution accuracy for different TSP instances. DOI: 10.7717/peerj-cs.1609/fig-10

Figure 11
Comparison of the ACO-DSA with other methods in the literature in terms of solution accuracy for different TSP instances. DOI: 10.7717/peerj-cs.1609/fig-11

Table 1
Results of the ACO, EACO and AEACO for solving small-scale instances.

Table 2
MSA-1 with different ratios for solving TSP instances.

Table 3
Results of the SA, MSA-1 and MSA for solving different size instances.

Table 4
Parameter settings of the ACO-DSA for different TSP instances.

Table 5
Results of the operation of the ACO-DSA clustering for different TSP instances.

The experimental records (Table 7) indicate that, as the size of the instance increases, both ACO and PSO exhibit a deteriorating trend in solution accuracy and solution time. However, GA and PSO tend to converge to a certain level of accuracy. Compared with the traditional algorithms, the algorithm proposed in this article demonstrates superior performance. In terms of the average optimal solution error rate across all instances, the algorithm proposed in this article achieves a value of 1.49, while ACO, SA, GA, and PSO attain values of 29.39, 5.59, 2.84, and 11.16, respectively. Additionally, considering the average value of the average error rate across all instances, this article's algorithm exhibits a value of 3.98, while ACO, SA, GA, and PSO exhibit values of 35.62, 9.30, 4.97, and 13.44, respectively. Moreover, the algorithm proposed in this article also demonstrates advantages in terms of solution time.

Table 6
Results of testing the number of different clusters and different iterations of pr299 instance.
Columns: Time (s) and Error for each number of cities in the cluster (5, 10, 15, and 20).

Table 7
Comparison results of the ACO-DSA with other traditional algorithms for different TSP instances.

Table 8
Comparison results of the ACO-DSA with other methods in the literature for different TSP instances.

Table 9
Results of the company's logistics transportation path solution.

Figure 13
Iteration results of the ACO, EACO, AEACO and ACO-DSA.