Modified ant colony optimization algorithm for solving the vehicle routing problem with a given load capacity

The article presents a mathematical model of the Vehicle routing problem with loading constraints in the form of a mathematical programming problem. A modification of the Ant colony optimization algorithm is presented to solve the problem. The main idea of the modification is to use the probabilistic rule of returning the agent (ant) to the depot before full unloading. The source of such behavior of the agent can be the concept of «fullness» of the ant. The computational experiment for comparing the classical and modified Ant colony optimization algorithms is presented. Testing was carried out according to two schemes: intensive testing and cross-testing. The conclusions about the efficiency of the proposed algorithm are presented.


Introduction
The Vehicle Routing Problem (VRP) is one of the key combinatorial optimization problems in transport and logistics. The goal of the VRP is to find a set of routes by which agents (vehicles) deliver a certain resource to each point and then return to the common starting location, the depot (center, base). Given a map of cities with depots and the roads connecting them as input, it is necessary to produce a set of routes with a minimum total length as output [1,2]. The problem is NP-hard [3], so the running time of known exact algorithms grows exponentially with the size of the input data. Figure 1 shows an example of a map of cities, depots, and roads for which a solution to a classic VRP must be found. The map of delivery points (cities) is represented by a graph in which the numbered circles are delivery points, the lines connecting the circles are roads, and the vertical oval numbered zero is the depot. In general, there is a direct road from the depot to each of the delivery points, and every pair of delivery points is connected directly. All distances are known. In this example, it is necessary to deliver resources from the depot to 18 delivery points using 6 agents.
An example of a possible solution to this problem is shown in Figure 2. It can be seen how the set of available paths has formed clusters: routes with a minimum total length, which are the solution to the VRP.

Problem statement
Let the load capacity W of each agent and the number of agents be given.
It is necessary to find a set of routes that start and end at the depot, so that their total length is minimal.
To construct a mathematical model in the form of a mathematical programming problem, introduce the decision variables (1). The model minimizes the total route length subject to the following conditions: each route starts and ends at the depot; each vertex belongs to exactly one route and is visited exactly once; the total weight of the goods delivered on each route does not exceed W; and a special condition excludes sub-cycles [4].
Thus, a mathematical model (1)-(6) of the VRP with limited load capacity is obtained. This problem is successfully solved by metaheuristic algorithms, one of which is the ant colony optimization algorithm (ACO). ACO is very flexible and can be easily extended and modified.
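For orientation, one common three-index formulation of the capacitated VRP that matches the conditions listed above is sketched below. It is not necessarily the authors' model (1)-(6): the symbols $n$ (number of delivery points), $m$ (number of agents), $c_{ij}$ (road length), $d_j$ (demand of vertex $j$, with $d_0 = 0$), and the auxiliary ordering variables $u_i$ are assumptions introduced for illustration, and $x_{iik} = 0$ is assumed throughout.

```latex
\begin{align*}
& x_{ijk} \in \{0, 1\}, \quad x_{ijk} = 1 \text{ if agent } k \text{ travels directly from vertex } i \text{ to vertex } j, \\
& \min \sum_{k=1}^{m} \sum_{i=0}^{n} \sum_{j=0}^{n} c_{ij}\, x_{ijk}, \\
& \sum_{j=1}^{n} x_{0jk} = \sum_{i=1}^{n} x_{i0k} = 1, \quad k = 1, \dots, m
  && \text{(each route starts and ends at the depot)} \\
& \sum_{k=1}^{m} \sum_{i=0}^{n} x_{ijk} = 1, \quad j = 1, \dots, n
  && \text{(each vertex is visited exactly once)} \\
& \sum_{i=0}^{n} x_{ihk} = \sum_{j=0}^{n} x_{hjk}, \quad h = 1, \dots, n,\; k = 1, \dots, m
  && \text{(an agent that enters a vertex also leaves it)} \\
& \sum_{i=0}^{n} \sum_{j=1}^{n} d_j\, x_{ijk} \le W, \quad k = 1, \dots, m
  && \text{(load capacity)} \\
& u_i - u_j + n \sum_{k=1}^{m} x_{ijk} \le n - 1, \quad 1 \le i \ne j \le n
  && \text{(no sub-cycles)}
\end{align*}
```

The last line is the Miller-Tucker-Zemlin form of the sub-cycle condition; other equivalent forms exist.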

Classification of solution algorithms
Different approaches can be used to solve the VRP with limited load capacity. By convention, the algorithms are divided into two classes: exact and approximate.
Exact algorithms produce an optimal solution for a given data set. Since the problem is NP-hard, no exact algorithm is known whose running time is polynomial in the size of the input data. Therefore, such algorithms are used for small inputs or when the time required for the solution is not an important criterion.
Examples of exact algorithms are [5]:
1) Dynamic programming methods. In such methods, the general problem is broken down into simpler, recursively related subproblems, and each subproblem is solved once.
2) The branch and bound method. The basic idea of this method is to partition the feasible set into subsets, each of which is checked and eliminated if it does not contain optimal solutions.
Approximate algorithms find a suboptimal solution in polynomial time. They are used when the input data is large or computing resources are limited. Such algorithms do not guarantee optimality but give results that are accurate enough for practical purposes. Among the approximate algorithms, the following can be distinguished:
1) Heuristic algorithms [6]. Such algorithms implement a practical method whose «correctness» is not proved for all possible conditions but which shows results good enough to continue using it (ant colony optimization algorithm, genetic algorithm, simulated annealing algorithm).
2) Greedy algorithms [7]. When choosing a solution, preference is given to locally optimal options, on the assumption that the resulting global solution will also be optimal.
3) Penalty function methods [8]. An additional function is added to the objective function. This additional function «penalizes» non-optimal solutions by some rule.
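To make the greedy idea concrete, the following is a minimal sketch of a nearest-neighbor construction for the capacity-constrained VRP. It is an illustration only, not a method evaluated in this article; the function name, the coordinate-based distances, and the convention that vertex 0 is the depot are assumptions.

```python
import math

def nearest_neighbor_routes(coords, demand, capacity):
    """Greedy construction: always drive to the closest unvisited customer
    whose demand still fits; otherwise return to the depot (vertex 0)."""
    unvisited = set(range(1, len(coords)))
    routes = []
    while unvisited:
        route, load, here = [0], 0, 0
        while True:
            # Customers that can still be served with the remaining capacity.
            feasible = [j for j in unvisited if load + demand[j] <= capacity]
            if not feasible:
                break
            nxt = min(feasible, key=lambda j: math.dist(coords[here], coords[j]))
            route.append(nxt)
            load += demand[nxt]
            unvisited.remove(nxt)
            here = nxt
        if len(route) == 1:
            raise ValueError("a customer demand exceeds the vehicle capacity")
        route.append(0)  # close the route at the depot
        routes.append(route)
    return routes
```

Each locally best step is cheap, but nothing guarantees that the resulting set of routes has minimum total length.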
Among the algorithms presented above, a heuristic algorithm simulating the behavior of an ant colony, the so-called ant colony optimization (ACO), is of particular interest [9,10].
The ACO has interesting and intuitive biological prerequisites. Its structure provides wide opportunities for modification and easy integration of testing and analysis systems. The algorithm allows monitoring the construction of the solution at every moment.

Modification of the ACO
The purpose and main idea of this study is to estimate the convergence rate of the algorithm (based on the average number of iterations) in the case when each ant (agent) can choose to return to the depot before it is fully unloaded. The agent makes this decision probabilistically, depending on its current load. To do this, a new rule is added to the basic rules of the classical ACO [9, 10]: the probability of return based on the «fullness» of the ant. Thus, the modification uses the rules presented below.
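The probability of selecting a city j from the current city i:
$$P_{ij} = \frac{\tau_{ij}^{\alpha}\,\eta_{ij}^{\beta}}{\sum_{h \in J_i} \tau_{ih}^{\alpha}\,\eta_{ih}^{\beta}}, \qquad (7)$$
where $h$ is a city from the set of available cities $J_i$; $\tau_{ih}$ is the amount of pheromone on the road $(i, h)$; $\eta_{ih} = 1/\ell_{ih}$ is the attractiveness of the road $(i, h)$, defined as the inverse of its length $\ell_{ih}$, so that the longer the road, the less attractive it is; $\alpha$, $\beta$ are the configurable ACO parameters.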
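The transition rule can be sketched in Python as follows. This is a minimal illustration under assumed names: `tau` and `length` are square matrices of pheromone levels and road lengths, and the default values of `alpha` and `beta` are placeholders rather than the settings used in the experiments.

```python
import random

def transition_probabilities(i, available, tau, length, alpha=1.0, beta=2.0):
    """Probability of moving from city i to each city in `available`:
    pheromone**alpha times (1/length)**beta, normalized over the candidates."""
    weights = {
        j: (tau[i][j] ** alpha) * ((1.0 / length[i][j]) ** beta)
        for j in available
    }
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}

def choose_next_city(i, available, tau, length, alpha=1.0, beta=2.0, rng=random):
    """Draw the next city at random according to the transition probabilities."""
    probs = transition_probabilities(i, available, tau, length, alpha, beta)
    cities = list(probs)
    return rng.choices(cities, weights=[probs[j] for j in cities], k=1)[0]
```

With equal pheromone on all roads and `beta = 1`, a road twice as long is chosen half as often.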
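Pheromone update rule:
$$\tau_{ij}(t+1) = (1-\rho)\,\tau_{ij}(t) + \sum_{k \in A_{ij}(t)} \frac{Q}{L_k(t)}, \qquad (8)$$
where $\rho$ is the pheromone evaporation coefficient; $A_{ij}(t)$ is the set of ants that passed along the road $(i, j)$ on iteration $t$; $Q$ is a value that characterizes the order of the optimal solution; $L_k(t)$ is the total route length of the $k$-th ant.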
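A minimal sketch of one pheromone update is given below. It assumes the common form of the rule in which pheromone first evaporates by a factor of one minus the evaporation coefficient and each ant then deposits Q divided by its total route length on every road it used; the function name, the symmetric deposit, and the default parameter values are assumptions.

```python
def update_pheromone(tau, ant_routes, rho=0.1, q=1.0):
    """Apply evaporation and deposits in place.

    tau        -- square matrix of pheromone levels
    ant_routes -- list of (route, route_length) pairs, one per ant, where
                  route is a vertex sequence such as [0, 3, 1, 0, 2, 0]
    """
    n = len(tau)
    # Evaporation on every road.
    for i in range(n):
        for j in range(n):
            tau[i][j] *= 1.0 - rho
    # Deposit: shorter total routes leave more pheromone per road.
    for route, route_length in ant_routes:
        deposit = q / route_length
        for i, j in zip(route, route[1:]):
            tau[i][j] += deposit
            tau[j][i] += deposit  # assumption: roads are undirected
    return tau
```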
The probability of returning to the depot at the current fullness is defined by formula (9). Here c is the current fullness of the ant. This parameter not only describes the idea of the modification but also corresponds to common sense: ants return to the anthill before they starve to death; a is the parameter responsible for the steepness of the curve and its displacement. Figure 3 shows the graph of the function (9) at a = 5.
Figure 3. Graph of the function that evaluates the probability of a return.
With this definition, the lower the remaining capacity of a particular ant agent, the more likely it is to decide to return to the depot, and vice versa (Figure 3).
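The qualitative behavior described here (a return probability that grows as fullness falls, with a single parameter controlling steepness) can be illustrated with a decreasing logistic curve. The function below is a hypothetical stand-in and should not be read as formula (9); the logistic form, the normalization of fullness to the interval from 0 to 1, and the `midpoint` argument are all assumptions.

```python
import math
import random

def return_probability(c, a=5.0, midpoint=0.5):
    """Hypothetical return probability for normalized fullness c in [0, 1].

    Decreasing in c: close to 1 when the ant is nearly empty, close to 0
    when it is full. `a` sets the steepness of the curve.
    """
    return 1.0 / (1.0 + math.exp(a * (c - midpoint)))

def decides_to_return(c, a=5.0, rng=random):
    """Make the probabilistic decision for the current fullness."""
    return rng.random() < return_probability(c, a)
```

Any monotonically decreasing function of fullness with values between 0 and 1 could be substituted without changing the rest of the algorithm.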

Modified ACO algorithm
Algorithm 1.
Step 1. If the current system does not meet the exit conditions (a limit on the maximum number of iterations, a limit on the maximum number of iterations without improvement of the current solution, etc.), go to Step 2. Otherwise, stop.
Step 2. Create K ants. Set the number of the current ant k = 1.
Step 3. Select the k-th ant. Place it in the depot and restore its fullness c to the full load.
Step 4. If there are unvisited cities, go to Step 5. Otherwise, go to Step 9.
Step 5. If the ant is still full, that is, c > 0, go to Step 6. Otherwise, go to Step 3.
Step 6. Make a probabilistic decision to return to the depot or to continue driving according to formula (9). If the decision to return is made, go to Step 3; otherwise, go to Step 7.
Step 7. For each of the unvisited cities, calculate the transition probability using formula (7). Make a probabilistic transition to one of the available cities.
Step 8. Reduce the capacity (fullness) by the value d (the demand of the current city). Go to Step 4.
Step 9. If necessary, update the best solution.
Step 10. If the current ant is not the last (k < K), take the next ant (k = k + 1) and go to Step 3. Otherwise, go to Step 11.
Step 11. For each road, update the pheromone concentration according to rule (8). Go to Step 1.
Thus, the proposed modification is implemented in Steps 5 and 6. The remaining steps of the algorithm correspond to the classical ACO algorithm.
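The route construction of a single ant (Steps 3 to 8) can be sketched as follows. This is an illustrative reading of the steps, not the authors' implementation. The assumptions are: vertex 0 is the depot, `return_prob` is any function of the normalized fullness that stands in for formula (9), only cities whose demand fits the remaining fullness are candidates, and the ant never decides to return while it is standing at the depot.

```python
import random

def build_solution(length, demand, capacity, tau, return_prob,
                   alpha=1.0, beta=2.0, rng=random):
    """Build the routes of one ant, with optional early returns to the depot."""
    unvisited = set(range(1, len(demand)))
    routes, route = [], [0]
    here, c = 0, capacity                      # Step 3: at the depot, full
    while unvisited:                           # Step 4
        feasible = [j for j in sorted(unvisited) if demand[j] <= c]
        # Steps 5-6: forced return if nothing fits, optional return otherwise.
        wants_return = here != 0 and rng.random() < return_prob(c / capacity)
        if not feasible or wants_return:
            if here == 0:
                raise ValueError("a city demand exceeds the load capacity")
            route.append(0)
            routes.append(route)
            route, here, c = [0], 0, capacity  # back to Step 3
            continue
        # Step 7: probabilistic transition to one of the candidate cities.
        weights = [tau[here][j] ** alpha * (1.0 / length[here][j]) ** beta
                   for j in feasible]
        nxt = rng.choices(feasible, weights=weights, k=1)[0]
        route.append(nxt)
        unvisited.remove(nxt)
        c -= demand[nxt]                       # Step 8
        here = nxt
    route.append(0)                            # final return to the depot
    routes.append(route)
    return routes
```

Passing `return_prob=lambda c: 0.0` disables the optional return, so the same sketch also covers the classical behavior in which the ant returns only when it can serve no further city.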

Test results
To evaluate the efficiency of both versions of the algorithm, we used fixed data sets of different dimensions with known solutions [11] and two testing schemes.
1. Intensive testing: for each data set, the solution of the classical and modified versions of the ACO is found 100 times. After that, the average deviation from the optimal solution, the average number of iterations required for convergence, and the frequency of obtaining optimal solutions are calculated.
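The three statistics of intensive testing can be computed as sketched below. The function name and the input layout (one pair of best route length and iteration count per run) are assumptions, as is the choice to express the deviation as a percentage of the known optimum.

```python
def summarize_runs(runs, optimum, tol=1e-9):
    """Summarize repeated solves of one data set.

    runs    -- list of (best_length, iterations) pairs, one per run
    optimum -- known optimal total route length for the data set
    """
    n = len(runs)
    deviations = [100.0 * (length - optimum) / optimum for length, _ in runs]
    hits = sum(1 for length, _ in runs if abs(length - optimum) <= tol)
    return {
        "avg_deviation_pct": sum(deviations) / n,
        "avg_iterations": sum(iters for _, iters in runs) / n,
        "optimal_frequency": hits / n,
    }
```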
2. Cross-testing: a series of 14 consecutive tests is performed, two for each data set, in which the problem is solved by the classical and the modified ACO, respectively. The purpose of this test is to …
The name of each of the sets [11] contains a description of the system: E is the source and stands for «Eilon and Christofides»; n is the dimension of the system; k is the minimum number of agents required for the algorithm to converge to the optimal value.
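If the set names follow the usual benchmark pattern of source letter, dimension, and number of agents joined by hyphens (for example `E-n22-k4`, a pattern assumed here rather than stated in the text), the description can be extracted with a short helper. The function name is hypothetical.

```python
import re

def parse_instance_name(name):
    """Split a name such as 'E-n22-k4' into source, dimension n, and agents k."""
    match = re.fullmatch(r"([A-Za-z]+)-n(\d+)-k(\d+)", name)
    if match is None:
        raise ValueError(f"unrecognized data set name: {name!r}")
    return {
        "source": match.group(1),
        "n": int(match.group(2)),
        "k": int(match.group(3)),
    }
```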
The results of the classical ACO are presented in Table 1, and those of the modified ACO in Table 2. For small dimensions, the modified ACO shows better results both in accuracy and in the number of iterations. As the size of the input data increases, the classical ACO shows greater accuracy with a greater number of iterations.
The results of cross-testing are presented in Table 3. An algorithm is considered more efficient if it shows better values for both the accuracy of the solution and the number of iterations, or if, with a small loss in accuracy, it finishes in a much smaller number of iterations. For example, on the third, fourth, and sixth data sets, the modified ACO was slightly inferior to the classical ACO in accuracy (relative to the average deviation in intensive testing), but it required a significantly smaller number of iterations. In general, it can be argued that the algorithms show approximately the same results. Although the modified ACO showed better results on small input data and was more often the efficient one in cross-testing, these results are not enough to prefer the modified ACO over the classical version.
If we turn to the practical question, namely how the overall operation of a system for forming shortest paths would be affected if the driver of a vehicle could choose to return to the depot before the resource is fully delivered, then we can assume that such behavior, when it is irregular in nature, will not harm efficiency.