Ant Colony Optimization with Warm-Up

Ant Colony Optimization (ACO) is a probabilistic technique, inspired by the behavior of ants, for solving computational problems that can be reduced to finding the best path through a graph. Some species of ants deposit pheromone on the ground to mark favorable paths that other members of the colony should follow. Ant colony optimization implements a similar mechanism for solving optimization problems. In this paper, a warm-up procedure for the ACO is proposed. During the warm-up, the pheromone matrix is initialized to provide an efficient starting point for the algorithm, so that it can obtain the same (or better) results with fewer iterations. The warm-up is based exclusively on the graph, which, in most applications, is given and does not need to be recalculated every time before executing the algorithm. The warm-up can therefore be carried out only once, and it speeds up the algorithm every time it is used from then on. The proposed solution is validated on a set of traveling salesman problem instances and in the simulation of a real industrial application for the routing of pickers in a manual warehouse. During the validation, it is compared with other ACO variants that adopt a pheromone initialization technique, and the results show that, in most cases, the proposed warm-up allows the ACO to obtain the same or better results with fewer iterations.


Introduction
Ant Colony Optimization (ACO) is a combinatorial optimization technique inspired by the behavior of some species of ants. Broadly, when an ant must choose one route over another, it senses the quantity of pheromone left by other members of the colony. A higher level of pheromone indicates a better route, usually because it is shorter than the others. This behavior inspired the creation of a probabilistic operational research technique for solving computational problems that can be reduced to finding the best path through a graph. The first version was proposed by [1], and it was originally called Ant System. Since then, many versions and applications of the ACO have been studied, and the algorithm is nowadays known to be a well-established and efficient approach for many practical problems, primarily the well-known traveling salesman problem (TSP) [2]. Consequently, an improvement in the ACO would benefit many industrial and nonindustrial fields. Since the ACO is a metaheuristic algorithm, most of the problems approached with it are strictly time-critical. Usually, they are NP-hard problems, in which the search for the global optimum is abandoned a priori in favor of a reasonably good suboptimal solution. However, the ACO, like all evolutionary algorithms, needs many iterations to converge to a good solution, and, in the case of large problems, this process can be very time-consuming [3]. For this reason, using the ACO to solve large problems in real time (i.e., in a few seconds or less) might be problematic. This is also the first open problem highlighted by [4], and it is the reason why a technique able to speed up the ACO may be very useful.
In this paper, a warm-up procedure is proposed to reduce the number of iterations required by the ACO to converge to a good solution. The success of the ACO is essentially based on a variable called the pheromone matrix. The pheromone matrix records the current quantity of pheromone on each edge of the graph, and it therefore determines the probability of including each specific edge in a newly generated solution. In the classic version of the ACO, every time the algorithm is executed, all the elements of the pheromone matrix are set equal to a starting (generally low) value. Then, as the iterations proceed, the pheromone on the most promising paths is increased and that on the less convenient paths is reduced. However, in most real implementations, the graph of nodes is given and is the same in every execution. Furthermore, it was verified that, given a graph, in many cases the pheromone matrix was always very similar after a certain number of iterations, each time the ACO was executed. The warm-up procedure proposed in this paper aims to fine-tune the pheromone matrix on a specific graph, so that, every time the ACO is executed, it starts from an already weighted graph, where the promising paths are highlighted with a high level of pheromone and the bad paths are excluded a priori. This process is expected to reduce the number of iterations that the ACO requires to converge every time it is executed. The remainder of this paper is organized as follows. Firstly, a brief overview of the scientific contributions to ACO is presented in Section 2. Then, the warm-up procedure proposed in this paper is described in Section 3. The computational experiments are shown in Section 4, where the ACO with warm-up is compared to the classic ACO and to two other ACO versions that carry out an initialization of the pheromone matrix. Finally, the conclusions are presented in Section 5.

Literature Review
Ant Colony Optimization (ACO) was introduced by [1] as a novel nature-inspired metaheuristic for the solution of hard combinatorial optimization problems. ACO belongs to the class of metaheuristics, which are approximate algorithms used to obtain good enough solutions to NP-hard problems in a reasonable amount of time. When searching for food, some species of ants initially explore the area surrounding the nest randomly. As soon as an ant finds a food source, it evaluates the quantity and the quality of the food. On the way back, the ant deposits a chemical pheromone trail on the ground. The quantity of pheromone deposited depends on the quantity and quality of the food and guides other ants to the food source. As shown by [5], the communication via pheromone between the ants enables them to find the shortest paths between their nest and food sources, and the same consideration also applies to ant colony optimization algorithms for solving combinatorial optimization problems. Although the first proof-of-concept application of the ACO was a traveling salesman problem (TSP), the algorithm has since been applied to many combinatorial optimization problems. For instance, it was applied to assignment problems [6][7][8], routing problems [9][10][11], and scheduling problems [12,13]. Less known but equally efficient applications concern the resource-constrained project scheduling problem [14,15], flow shop scheduling [16], the sequential ordering problem [17], and the open shop scheduling problem [18]. The scientific community also proposed many applications for nonindustrial environments, such as solutions for DNA sequencing or web page ranking [19]. The variants of the ACO generally differ from each other in the pheromone update rules. In particular, most applications belong to one of two categories: the iteration-best update or the best-so-far update. In the first case, the update of pheromone takes place at every iteration, while in the second case it takes place only when a new best solution is found, which introduces a much stronger bias towards the good solutions found. The most successful ACO variants are the Ant Colony System [20] and the Min-Max Ant System [21], which are also the most used in practice. Since this is intended only as a brief overview of the key points concerning ACO, the literature reviews by [4,22] are suggested for a deeper analysis of the scientific contributions on this algorithm.

General Considerations
In most applications where the ACO is used or might be used, the graph of nodes that characterizes the problem is given and constant. Consequently, the matrix of the costs associated with the edges is also constant. This is well known to practitioners and confirmed by many scientific publications: see for example [4,23,24], which take the matrix of the costs as given. For instance, in classic traveling salesman problems for vehicle routing or picker routing in manual warehouses, the nodes represent the locations to visit and the cost associated with each edge represents the distance between the two connected nodes. Hence, the matrix of distances does not change until the road network changes (in the case of vehicle routing) or the warehouse layout changes (in the case of picker routing). As a matter of fact, in almost all the papers that treat these topics, the matrix of distances is defined only once using an exhaustive algorithm such as Floyd-Warshall to find the shortest path between all the nodes of the graph (see for instance [10]). Hence, when the graph and the matrix of costs are formalized, a warm-up may also be carried out. The warm-up tunes the pheromone matrix used by the ACO so that, every time the ACO is executed from then on, the number of iterations it needs to converge to a good solution is reduced and, consequently, so is its computational time. The aim of the warm-up is therefore to highlight a priori the most promising paths and to exclude a priori the worst ones. These aspects were already stated and well described by other scientific contributions focused on the initialization of the pheromone matrix (see for instance [25,26]).
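As an illustration of the one-off preprocessing mentioned above, the following is a minimal sketch of the Floyd-Warshall algorithm applied to a small graph. The function name `floyd_warshall` and the four-node example `roads` are hypothetical and chosen only for illustration; they are not taken from the cited works.

```python
import math


def floyd_warshall(weights):
    """Return the all-pairs shortest-path costs.

    weights[i][j] is the cost of the direct edge i -> j,
    or math.inf when the two nodes are not directly connected.
    """
    n = len(weights)
    dist = [row[:] for row in weights]  # copy, so the input is untouched
    for i in range(n):
        dist[i][i] = 0.0
    # Allow paths through intermediate node k, one k at a time.
    for k in range(n):
        for i in range(n):
            d_ik = dist[i][k]
            if d_ik == math.inf:
                continue
            for j in range(n):
                if d_ik + dist[k][j] < dist[i][j]:
                    dist[i][j] = d_ik + dist[k][j]
    return dist


INF = math.inf
# Hypothetical road network: direct distances between four locations.
roads = [
    [0, 4, INF, 10],
    [4, 0, 3, INF],
    [INF, 3, 0, 2],
    [10, INF, 2, 0],
]
cost_matrix = floyd_warshall(roads)
```

The resulting `cost_matrix` plays the role of the constant matrix of costs: it only needs to be recomputed when the underlying network changes.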

The Notation Used
In the remainder of this section, the following notation is used to describe the proposed procedure. • is the Hadamard product.

The Procedure
The warm-up emulates the pheromone update that, in the classic ACO, is made during the first iterations of the algorithm, i.e., those generally aimed at exploring the graph. The procedure is iterative and relatively simple. First, the pheromone matrix $T$ is initialized by setting each element $\tau_{i,j} = \tau_0$ ($\forall i, j \in \{1, \dots, N\} \mid i \neq j$) and $\tau_{i,j} = 0$ ($\forall i, j \in \{1, \dots, N\} \mid i = j$). Similarly, to avoid divisions by zero, all the elements on the diagonal of the matrix of costs are made equal to 1 ($C = C + I_N$). At each iteration $m$, the probability matrix $P(m)$ is built by calculating each of its elements as in the following equation.
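Assuming that Equation (1) takes the standard Ant System form, with the desirability of an edge taken as the inverse of its cost $c_{i,j}$ and the normalization taken over all nodes $k$, each element of $P(m)$ can be written as:

```latex
p_{i,j}(m) =
  \frac{\tau_{i,j}(m)^{\alpha} \, \left(1 / c_{i,j}\right)^{\beta}}
       {\sum_{k=1}^{N} \tau_{i,k}(m)^{\alpha} \, \left(1 / c_{i,k}\right)^{\beta}}
\qquad (1)
```

Under this assumption, the zero pheromone on the diagonal gives $p_{i,i}(m) = 0$, while the unit cost placed on the diagonal keeps the term $1 / c_{i,i}$ defined.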
Note that Equation (1) is the same as that used by many authors and mentioned by [4] to compute the probability of including the edge (i, j) in the newly generated solution at each iteration of the algorithm. Then, the matrix of updates U(m) is calculated as in Equation (2), according to [1].
Then, the pheromone matrix T is updated according to Equation (3). In particular, the pheromone on each edge is updated according to its cost, and its corresponding value in the matrix of probabilities.
Finally, the pheromone evaporates as expressed in Equation (4).
The process is then repeated until the maximum number of iterations M is reached. In general, the warm-up emulates the same process that takes place during the iterations of the ACO. However, while during each iteration of the classic ACO only the pheromone on the edges belonging to a newly generated best solution is increased, in this case, at each iteration, the pheromone on all the edges of the graph is increased, and this increase is proportional to their attractiveness. This is also a peculiarity of the proposed approach when compared to existing ones in the literature (see for instance [25] or [26]), which generally initialize the pheromone depending only on the cost associated with each edge of the graph. The author is aware that, over the years, several versions of the ACO were proposed by the scientific community, and most of them differ from the others in the formulas adopted to calculate the increase of pheromone [27], the evaporation [9], and the probability of choosing one edge over another [28]. In this study, reference is made to the first version by [1]. As several different versions of the ACO exist, many different versions of this warm-up procedure can be obtained by slightly modifying the formulas.
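The procedure can be sketched as follows. This is only an illustrative reading of the steps above, not the author's exact formulas: it assumes that Equation (1) is the standard Ant System transition rule, that the update of an edge is $Q / c_{i,j}$ weighted by its probability, and that evaporation multiplies the pheromone by $\rho_{wu}$ (so that $\rho_{wu} = 1$ means no evaporation). The function name `warm_up` and the example matrix are hypothetical.

```python
def warm_up(costs, iterations=400, rho_wu=1.0,
            alpha=1.0, beta=2.0, q=5.0, tau0=0.1):
    """Return a pre-weighted pheromone matrix for the given cost matrix."""
    n = len(costs)
    # C = C + I_N: unit cost on the diagonal avoids divisions by zero
    # (assumes the input diagonal is zero).
    c = [[costs[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    # tau_0 everywhere except the diagonal, which carries no pheromone.
    t = [[0.0 if i == j else tau0 for j in range(n)] for i in range(n)]

    for _ in range(iterations):
        for i in range(n):
            # Assumed Eq. (1): attractiveness of each edge leaving node i.
            weights = [t[i][j] ** alpha * (1.0 / c[i][j]) ** beta
                       for j in range(n)]
            total = sum(weights)
            for j in range(n):
                p = weights[j] / total      # probability of edge (i, j)
                u = q / c[i][j]             # assumed Eq. (2): cost-based update
                # Assumed Eqs. (3)-(4): deposit weighted by p, then evaporate.
                t[i][j] = rho_wu * (t[i][j] + u * p)
    return t


# Hypothetical three-node instance.
example_costs = [
    [0, 1, 5],
    [1, 0, 2],
    [5, 2, 0],
]
pheromone = warm_up(example_costs, iterations=10)
```

In this sketch every edge receives some pheromone at every iteration, and cheap, already attractive edges receive the most, which mirrors the distinction drawn above between the warm-up and the classic ACO update.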

The Parameters Tuning
Concerning the tuning of parameters, the same setting analyzed and defined as 'optimal' by [1] is used (i.e., $\alpha = 1$, $\beta = 2$, $\rho = 0.9$, $Q = 5$, $\tau_0 = 0.1$). The additional parameters used in the warm-up that need to be optimized are the evaporation rate used during the warm-up (i.e., $\rho_{wu}$) and the number of iterations of the warm-up (i.e., $M$). There is no real optimum for these parameters that can be defined a priori; both depend on the size of the problem, its complexity, and the type of connections in the graph. In this study, to obtain a good setting before the computational experiments described in the next section, three different traveling salesman problem benchmarks are used. Each of these problems consists of constructing the cheapest Hamiltonian cycle through a set of nodes, and each has a different complexity identified by the number of nodes to connect (i.e., 20, 30, 40). Three different levels were identified for each parameter, i.e., $\rho_{wu} \in \{0.5, 0.9, 1.0\}$ and $M \in \{200, 400, 600\}$, and the ACO with warm-up was tested on each problem using all the possible combinations. Moreover, because of the randomness of the procedure, for each benchmark problem and each combination of $\rho_{wu}$ and $M$, the algorithm was executed five times under the same conditions, and its average result and standard deviation were recorded. The results are reported in Table 1, which shows that the best results (those highlighted in green) are obtained for $\rho_{wu} = 1$ and $M = 400$. As suggested by $\rho_{wu} = 1$, evaporation should be avoided during the warm-up.
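The tuning protocol above amounts to a full factorial grid over the two parameters with repeated runs. A minimal sketch follows; `run_once` is a hypothetical callable standing in for "warm up, run the ACO, return the cost of the best solution", and the stand-in `dummy_run` only returns noise so that the example can be executed. The use of the sample standard deviation is an assumption, since the estimator is not specified.

```python
import itertools
import random
import statistics


def tune(run_once, rho_levels=(0.5, 0.9, 1.0),
         m_levels=(200, 400, 600), repetitions=5):
    """Map each (rho_wu, M) pair to (mean cost, standard deviation)."""
    results = {}
    for rho_wu, m in itertools.product(rho_levels, m_levels):
        costs = [run_once(rho_wu, m) for _ in range(repetitions)]
        results[(rho_wu, m)] = (statistics.mean(costs),
                                statistics.stdev(costs))
    return results


_rng = random.Random(0)  # fixed seed, only to make the example repeatable


def dummy_run(rho_wu, m):
    # Placeholder for a real ACO run; it ignores its arguments
    # and returns a noisy constant.
    return 100.0 + _rng.random()


table = tune(dummy_run)
best_setting = min(table, key=lambda key: table[key][0])
```

With a real `run_once`, `best_setting` would be the combination with the lowest average cost on the benchmark considered.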

General Considerations
To validate the efficiency of the proposed warm-up approach, a set of computational experiments is presented in this section. All the experiments are based on the traveling salesman problem (TSP), which, to the author's best knowledge, is also the most frequent and popular application of the ACO. The objective of the algorithm is therefore the definition of a low-cost Hamiltonian cycle: given (i) a set of nodes to visit, (ii) a set of edges connecting them to each other, and (iii) a cost associated with each edge, the algorithm has to define the sequence in which the nodes should be visited so as to minimize the total cost of the traversed edges. Firstly, a set of generic TSP instances is used. In particular, five different graphs are generated, and, on each graph, five experiments of different complexity are carried out. Each experiment considers a different set of nodes of the graph: the larger the set of nodes, the higher the complexity of the problem. Then, to validate the proposed approach in a more realistic context, the simulation of a real industrial case is used. The layout of a manual warehouse for order picking is considered, and the proposed algorithm is used to define the optimal (or near-optimal) paths followed by pickers to collect the desired products. No capacity limits are imposed on pickers or aisles; hence, the situation is perfectly comparable to a classic TSP, although the graph is more constrained and has all the characteristics of those used to model warehouses.
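The objective can be made concrete with a short sketch that evaluates the cost of a candidate Hamiltonian cycle. The function name `tour_cost` and the three-node matrix are hypothetical and used only for illustration.

```python
def tour_cost(costs, tour):
    """Total cost of visiting the nodes in `tour` and returning to the start."""
    # Pair each node with its successor; the last node is paired with the first.
    successors = tour[1:] + tour[:1]
    return sum(costs[a][b] for a, b in zip(tour, successors))


example_costs = [
    [0, 2, 9],
    [2, 0, 4],
    [9, 4, 0],
]
cycle_cost = tour_cost(example_costs, [0, 1, 2])  # 2 + 4 + 9
```

Every algorithm compared in the experiments is ultimately judged on a quantity of this kind.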

The Comparison Algorithms
The proposed ACO with warm-up (ACOWU) is compared to a classic ACO (i.e., without warm-up) with the same parameter settings, and to two ACOs using a pheromone initialization technique (i.e., [25,26]).
The first comparison algorithm with pheromone initialization, proposed by [26] (hereafter simply referred to as Dai), is based on the Minimal Spanning Tree (MST). Given the graph of nodes, once the MST is calculated using the well-known Prim's algorithm, and given the starting pheromone $\tau_0$, the pheromone on the edges belonging to the MST is set to $\tau_0^{1/\beta}$. Conversely, the algorithm proposed by [25] (hereafter simply referred to as Bellaachia) sets the pheromone on edge (i, j), namely $\tau_{i,j}$ : where $N^*$ (i.e., $\subset N$) is the set of nodes, different from j, which can be reached from i.
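The MST-based initialization can be sketched as follows. This is an illustrative reading of the description above rather than the implementation of [26]: the sketch assumes a complete, symmetric cost matrix, leaves every edge outside the MST at $\tau_0$, and keeps the diagonal at zero. The function names `prim_mst_edges` and `mst_initial_pheromone` are hypothetical.

```python
def prim_mst_edges(costs):
    """Return the edges of a minimal spanning tree (Prim's algorithm)."""
    n = len(costs)
    in_tree = [False] * n
    in_tree[0] = True
    best_cost = list(costs[0])   # cheapest known link from the tree to each node
    parent = [0] * n             # tree endpoint of that cheapest link
    edges = []
    for _ in range(n - 1):
        # Pick the node outside the tree that is cheapest to attach.
        v = min((j for j in range(n) if not in_tree[j]),
                key=lambda j: best_cost[j])
        edges.append((parent[v], v))
        in_tree[v] = True
        for j in range(n):
            if not in_tree[j] and costs[v][j] < best_cost[j]:
                best_cost[j] = costs[v][j]
                parent[j] = v
    return edges


def mst_initial_pheromone(costs, tau0=0.1, beta=2.0):
    """Pheromone matrix with MST edges raised to tau0 ** (1 / beta)."""
    n = len(costs)
    t = [[0.0 if i == j else tau0 for j in range(n)] for i in range(n)]
    boosted = tau0 ** (1.0 / beta)
    for i, j in prim_mst_edges(costs):
        t[i][j] = t[j][i] = boosted   # symmetric graph assumed
    return t


example_costs = [
    [0, 1, 5, 4],
    [1, 0, 2, 6],
    [5, 2, 0, 3],
    [4, 6, 3, 0],
]
mst = prim_mst_edges(example_costs)
initial_pheromone = mst_initial_pheromone(example_costs)
```

Since $\tau_0 < 1$ and $\beta > 1$ in the setting used here, $\tau_0^{1/\beta}$ is larger than $\tau_0$, so the MST edges start with more pheromone than the others.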

Collected Information
The warm-up is made only once on each graph, while at each run of the algorithms three main quantities are recorded: (i) the cost of the best solution found, (ii) the number of iterations needed to find it, and (iii) the computational time. Since all the observed algorithms are subject to some randomness, to better understand their reliability, each was run 10 times on each experiment, and the averages and standard deviations are reported. For the sake of clarity, in all the following tables, the results of the proposed algorithm are written in bold when it outperforms the classic ACO, and highlighted in grey when it outperforms all the other algorithms.

Results Obtained in the Simulation of the Real Warehouse
After studying the effect of the proposed warm-up on a set of generic TSP instances, it is of interest to analyze its effect in a more realistic and complex environment. The simulation of a manual picking warehouse was therefore used, and the proposed ant colony optimization with warm-up is used to define the routing of pickers, i.e., once the picking locations that the picker has to visit are defined, the order in which they are visited is determined. The problem faced is essentially a TSP, but the graph of nodes and paths is more constrained, with fewer possible paths between nodes and many mandatory walkways. Importantly, no additional constraints such as the capacity of pickers' baskets, the definition of batches, or interference between pickers moving through the aisles are considered.
Starting from the warehouse layout, a graph of accessible positions is generated by placing a node in front of each storage location and a node where aisles cross each other; then, using the well-known Floyd-Warshall algorithm, the matrix of minimum distances between nodes is generated. The starting warehouse is made of 20 aisles with 16 storage locations each, crossed by a single cross-aisle in the middle (i.e., between the 8th and the 9th locations). Each storage location is 2 × 2 m, aisles are 4 m wide, and the cross-aisle is 8 m wide. The resulting graph used in the tests is shown in Figure 4.
The results obtained by the compared algorithms in the simulation of the warehouse are reported in Table 5 and can be visualized in Figures 5-7, respectively in terms of (i) cost of the best solution found, (ii) solutions explored before finding the best, and (iii) computational times. The results are broadly consistent with what was already seen in the previous experiments. On average, the proposed ACOWU is still the best in terms of cost, even if sometimes it cannot provide a better solution than the classic ACO; the same can be said of the other algorithms using a pheromone initialization strategy. The Dai algorithm is still in second position and proved to be a very good alternative. Concerning the solutions explored, and therefore the computational time, the Bellaachia algorithm is the best (as already seen in the previous experiments). However, the proposed ACOWU is again a good alternative, as is clearly visible in Figures 6 and 7. Again, as in the previous experiments on generic TSP instances, the difference in terms of solutions explored and computational time is small. However, the use of a pheromone initialization technique, as already shown in the literature, provides some advantages over the classic ACO.

Conclusions
In this paper, a warm-up procedure for the ACO was proposed and validated. During the warm-up, the pheromone matrix of the ACO is initialized to provide an efficient starting point for the algorithm, so that it can obtain the same (or better) results with fewer iterations. The warm-up is based exclusively on the graph made of the nodes and the edges that formalize the problem. This graph, in most applications, is given and does not need to be recalculated every time before executing the algorithm. Because of this, the warm-up procedure can be carried out only once, when setting the hyper-parameters of the algorithm, to speed it up every time it is used from then on. Firstly, a parameter tuning was carried out to find a good setting for the warm-up. Then, two sets of experiments were carried out to validate the proposed approach. The first set of experiments used some generic TSP instances; then, to validate the algorithm in a more realistic context, a second set of experiments was carried out in a simulated picking warehouse. The ant colony with warm-up was compared with a classic ACO (without warm-up) and with two ACOs using a pheromone initialization technique. The results obtained are promising, and the warm-up approach is generic enough to find application in almost all the contexts where the ACO can be applied. Of course, the impact and the efficiency of the warm-up might change from one application to another, but the preliminary results shown in this paper suggest that the approach is worth further study, paving the way for many possible extensions.