A novel approach for solving travelling thief problem using enhanced simulated annealing

Real-world optimization problems are becoming increasingly complex because of the interdependencies among their components, and they require more advanced optimization techniques. The Traveling Thief Problem (TTP) is an optimization problem that combines two well-known NP-hard problems: the 0/1 knapsack problem and the traveling salesman problem. In the TTP, a thief plans a tour and collects items to fill a knapsack so as to gain maximum profit at minimum cost within a standard time limit of 600 s. This paper proposes an efficient technique for solving the TTP by rearranging the order of the knapsack and tour steps. The picking plan is first generated randomly, and a traversal plan is then generated with the Lin-Kernighan heuristic. This traversal is improved with a modified simulated annealing technique that eliminates the insignificant cities, that is, the cities that affect the profit adversely. On different instances, the proposed technique shows promising results compared with other state-of-the-art algorithms: it outperforms them on small and medium-size instances and obtains competitive results on larger instances.


INTRODUCTION
Optimization problems are being handled more efficiently every day, yet the problems themselves are also becoming more complex. Many real-world optimization problems (Laporte, 1992; Kellerer, Pferschy & Pisinger, 2004) interact with each other, and problems that are not independent are especially hard to solve. Real-life optimization problems usually consist of several sub-problems that are interrelated. To solve the problem studied here, it is important to identify the cities from which no item is picked and to consider only the cities from which items are picked. A tour planned in this way yields a considerably higher profit.
The TTP is a benchmark problem intended to address the concern that there is a gap between theory and practice in the field of metaheuristics for combinatorial optimization problems. It is claimed that the definition of complexity is the main difference between the benchmark problems used in theoretical work and the real-world problems to which the results are intended to be applied in practice. For benchmark problems, it is claimed, there is a tendency to equate complexity with size (for example, the number of cities for the TSP), while real-world problems usually include additional sources of complexity, such as the interdependence of component problems, as intentionally featured by the TTP (Mei, Li & Yao, 2016).
For the TTP, there exist different kinds of heuristic solvers that feature different levels of communication between the sub-problems, and there appears to be an expectation that such communication is necessary (Mei et al., 2015). As new algorithms are developed, it will be interesting to see how well this expectation is met. This research presents a novel technique for an optimization problem that models real-world problems by interconnecting two different problems. The main challenge in solving such problems is to reduce the complexity that arises because each sub-problem influences the other (Bonyadi et al., 2019). The benchmark instances cover a wide range of features, from a few cities with few items and a small knapsack to a large set of cities with many items and a large knapsack capacity (Polyakovskiy et al., 2014). While small and medium-size instances are solved quickly and give an optimal solution, large instances have remained unsolved for a number of years. These instances were originally introduced for the TSP and were then extended with knapsack parameters. They are taken from the TSPLIB library (Reinelt, 1991).
The rest of the article is organized as follows: In "Background", we briefly discuss the basic concepts of traveling salesman problem, knapsack problem, and traveling thief problem. "Methodology" describes the methodology of the proposed approach. The experiments and results are presented in "Results and Discussions". "Conclusion" concludes the article.

BACKGROUND
Travelling salesman problem
The travelling salesman problem (TSP) is a well-known NP-hard problem in which a salesman must make a tour of n cities, visiting every city exactly once (Nallaperuma et al., 2013). The salesman starts the tour at a starting city and ends at the same city after visiting all the others. The goal is to minimize the travel time, which is treated as the travelling cost; the classic problem specifies neither a time limit nor a velocity. The parameters of the TSP, with its single objective of minimizing the cost, are stated here with a constant velocity so that they match the TTP formulation. The city set is $X = \{1, 2, \ldots, n\}$. $D_{i,j}$ is a distance matrix that gives the distance from each city to every other city. Each city may contain several items, which become relevant in the TTP. The velocity is denoted by $V$ and remains constant during the tour. A tour $X = (x_1, x_2, \ldots, x_n)$ contains all cities in the order in which they are visited. The travel time between consecutive cities $x_i$ and $x_{i+1}$, $\forall i = 1, 2, \ldots, n$ (with $x_{n+1} = x_1$), is calculated as

$$t_{x_i, x_{i+1}} = \frac{D_{x_i, x_{i+1}}}{V}$$

and the objective function $f(X)$ is

$$f(X) = \sum_{i=1}^{n-1} t_{x_i, x_{i+1}} + t_{x_n, x_1} \qquad (1)$$
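As a minimal sketch of the tour cost described above, the following Python function sums the travel time over a closed tour at constant velocity. The function name `tour_travel_time`, the 0-based city indices and the small distance matrix `DIST` are illustrative assumptions, not part of the original formulation.

```python
from typing import Sequence


def tour_travel_time(tour: Sequence[int],
                     dist: Sequence[Sequence[float]],
                     velocity: float = 1.0) -> float:
    """Total travel time of a closed tour at constant velocity.

    tour holds 0-based city indices, each city once;
    dist[i][j] is the distance from city i to city j.
    """
    if velocity <= 0:
        raise ValueError("velocity must be positive")
    total = 0.0
    n = len(tour)
    for k in range(n):
        a = tour[k]
        b = tour[(k + 1) % n]  # the last leg wraps around to the start city
        total += dist[a][b] / velocity
    return total


# Hypothetical 4-city symmetric instance (illustrative numbers only).
DIST = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]

print(tour_travel_time([0, 1, 3, 2], DIST))  # legs 2 + 4 + 3 + 9
```

Because the velocity is constant, dividing by it only rescales the tour length, so minimizing the time in this setting is equivalent to minimizing the distance.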

Knapsack problem
The knapsack problem (KP) is an NP-hard problem that can be tackled with different optimization techniques. The knapsack has a limited capacity, denoted by $C$, and the items have different weights and values. Items are indexed by $i$, and the weight and value of item $i$ are denoted by $w_i$ and $p_i$, respectively. Items are packed into the knapsack within its capacity by considering the value and weight of each item, and the aim is to gain maximum profit by selecting the optimal combination of items (Stolk et al., 2013). The parameters are as follows: the items are indexed $i = 1, \ldots, m$; $p_i \in \mathbb{R}$ is the value of item $i$; $w_i \in \mathbb{R}$ is its weight; $C$ is the limited capacity of the knapsack; and the item set $\vec{Z} = \{z_1, z_2, \ldots, z_m\}$, $z_i \in \{0, 1\}$, records which items are packed, that is, $z_i = 1$ if item $i$ is packed and $z_i = 0$ if it is not. The objective of the KP is to maximize the total value of the selected items without exceeding the knapsack capacity, which in the standard 0/1 form reads

$$\max \; g(\vec{Z}) = \sum_{i=1}^{m} p_i z_i \quad \text{subject to} \quad \sum_{i=1}^{m} w_i z_i \le C$$
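The evaluation of a packing can be illustrated with a short Python sketch. It computes the total value and total weight of a 0/1 packing vector and checks the capacity constraint; the function name `evaluate_packing` and the three-item data are hypothetical and serve only as an illustration.

```python
from typing import Sequence, Tuple


def evaluate_packing(z: Sequence[int],
                     profits: Sequence[float],
                     weights: Sequence[float],
                     capacity: float) -> Tuple[float, float, bool]:
    """Return (total profit, total weight, feasible) for a 0/1 packing z."""
    if not (len(z) == len(profits) == len(weights)):
        raise ValueError("z, profits and weights must have the same length")
    total_profit = sum(p for p, zi in zip(profits, z) if zi)
    total_weight = sum(w for w, zi in zip(weights, z) if zi)
    return total_profit, total_weight, total_weight <= capacity


# Hypothetical three-item instance (illustrative numbers only).
PROFITS = [60, 100, 120]
WEIGHTS = [10, 20, 30]
CAPACITY = 50

print(evaluate_packing([0, 1, 1], PROFITS, WEIGHTS, CAPACITY))  # fits exactly
print(evaluate_packing([1, 1, 1], PROFITS, WEIGHTS, CAPACITY))  # too heavy
```

Returning the feasibility flag separately, rather than folding it into the profit, leaves the caller free to reject or penalize an overweight packing.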
Travelling thief problem
A benchmark set is used to study the TTP, and the authors of the benchmark also presented many approaches and variants for solving this optimization problem (Polyakovskiy et al., 2014). The problem was originally derived by combining two well-known NP-hard problems, and two models of the TTP were proposed; most researchers focus on the first model. Different researchers have used different approaches to solve this problem: for example, random local search (Deb & Sinha, 2010) was applied early on, as were evolutionary algorithms with a simple (1+1) approach. Optimization algorithms such as ACO and genetic algorithms (NSGA) (Laszczyk & Myszkowski, 2019) have also been applied, but the efficiency of these approaches is limited to specific scenarios (Wagner, 2016). The first model has a single objective, maximizing the profit, and introduces two new parameters that make the sub-problems interdependent. Faulkner therefore computes the fitness only after multiple items have been added and backtracks if the score becomes worse (Faulkner et al., 2015; Bonyadi et al., 2019). The travelling speed depends on the knapsack weight, and a knapsack rent of R is paid per unit of time. The second model considers two objectives, maximizing the total profit and minimizing the time and cost, by adding three parameters (Deb & Sinha, 2009). Many other variants have been brought into the travelling thief problem by recasting the knapsack component in other forms, such as the multiple knapsack problem (Lalami et al., 2012), the multi-objective knapsack (Bazgan, Hugot & Vanderpooten, 2009), the fractional knapsack (Ishii, Ibaraki & Mine, 1977) and the bi-level knapsack (Chen & Zhang, 2013). Subsequently, the benchmark was designed together with different algorithms for solving the TTP, but these are simple techniques intended to verify the problem (Polyakovskiy et al., 2014).
We now formulate the TTP, which combines two well-known benchmark problems: the knapsack problem and the travelling salesman problem. Each item $i$ has a weight $w_i$ and a value $p_i$; $W$ is the maximum capacity of the knapsack; $V_{max} = 1$ is the maximum velocity; $V_{min} = 0.1$ is the minimum velocity; and $D$ is the distance matrix.
The distance matrix $D$ gives the cost of travelling from each city to every other city; a star symbol indicates that no path exists between two cities.
In the example shown in Fig. 1, the thief starts the tour at node 1 and moves to node 2 or node 5, which are at distances 5 and 4, respectively. The current weight of the knapsack ($W_c$) is 0; thus $V_c = V_{max} = 1$, which gives $t_{1,2} = 5$ (the cost, or time taken, from city 1 to city 2). Equation (4) is used to calculate the current speed of the thief:

$$V_c = V_{max} - W_c \, \frac{V_{max} - V_{min}}{W} \qquad (4)$$

The speed of the thief therefore varies with the weight of the knapsack: as the knapsack fills, the thief slows down, from $V_{max}$ when the knapsack is empty to $V_{min}$ when it is full. After the whole process is complete, the solution of the example is x = (1, 3, 2, 4, 5), which gives the order in which the cities are visited, and z = (0, 5, 0, 3, 1), which indicates the item picked at each city by city index; for example, item $I_5$ is picked at city 2, $I_3$ at city 4 and $I_1$ at city 5. The goal of the problem is to find the maximum profit $G(x, z)$ from the value $g(\vec{Z})$ produced by the picking plan using Eq. (3) and the rent $R \times f(X)$ produced using Eq. (1):

$$G(x, z) = g(\vec{Z}) - R \times f(X)$$
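The interaction between the picking plan and the tour can be made concrete with a small Python sketch of the profit computation. It assumes the usual TTP speed rule, in which the speed falls linearly from V max to V min as the knapsack fills, and it assumes at most one picked item per city, as in the example above. The function name `ttp_objective`, the `picks` dictionary and all numbers are hypothetical.

```python
from typing import Dict, Sequence, Tuple


def ttp_objective(tour: Sequence[int],
                  dist: Sequence[Sequence[float]],
                  picks: Dict[int, Tuple[float, float]],
                  capacity: float,
                  rent: float,
                  v_max: float = 1.0,
                  v_min: float = 0.1) -> float:
    """Profit of a tour: picked value minus rent times total travel time.

    picks maps a city index to the (profit, weight) of the item picked there.
    """
    total_profit = 0.0
    total_time = 0.0
    load = 0.0
    n = len(tour)
    for k in range(n):
        city = tour[k]
        if city in picks:
            profit, weight = picks[city]
            total_profit += profit
            load += weight
        if load > capacity:
            raise ValueError("picking plan exceeds the knapsack capacity")
        # Assumed speed rule: linear slowdown with the current load.
        speed = v_max - load * (v_max - v_min) / capacity
        nxt = tour[(k + 1) % n]  # the last leg returns to the start city
        total_time += dist[city][nxt] / speed
    return total_profit - rent * total_time


# Hypothetical 3-city instance: one item (profit 20, weight 5) at city 1.
DIST = [
    [0, 5, 3],
    [5, 0, 4],
    [3, 4, 0],
]
PICKS = {1: (20.0, 5.0)}

early = ttp_objective([0, 1, 2], DIST, PICKS, capacity=10, rent=1.0)
late = ttp_objective([0, 2, 1], DIST, PICKS, capacity=10, rent=1.0)
print(early, late)
```

In this toy instance both tours have the same length, yet collecting the item last gives the higher profit, because the thief carries the load over a single leg instead of two. This is the interdependence that makes the tour and the picking plan hard to optimize separately.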

METHODOLOGY
This section describes the methodology used to tackle the travelling thief problem. We propose an efficient technique for solving the TTP by rearranging the steps used for this problem. The picking plan is first generated randomly, and a traversal plan is then generated with the Lin-Kernighan heuristic. This traversal is improved by eliminating the insignificant cities with the modified simulated annealing technique. The knapsack is filled by picking the most profitable items, and three fitness functions are used to maximize the profit (Martins et al., 2017). The Extended Simulated Annealing (ESA) is summarized in the flow chart in Fig. 2.
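The city-elimination idea can be sketched in a few lines of Python. This is not the authors' implementation: it assumes that every pair of cities is directly connected (Euclidean coordinates), so removing a city never disconnects the tour, and it leaves out the simulated annealing step entirely. The names `drop_unpicked_cities`, `tour_length` and `COORDS` are hypothetical.

```python
import math
from typing import List, Sequence, Set, Tuple

Point = Tuple[float, float]


def tour_length(tour: Sequence[int], coords: Sequence[Point]) -> float:
    """Euclidean length of a closed tour over the given coordinates."""
    n = len(tour)
    return sum(math.dist(coords[tour[k]], coords[tour[(k + 1) % n]])
               for k in range(n))


def drop_unpicked_cities(tour: Sequence[int],
                         picked_cities: Set[int],
                         start: int) -> List[int]:
    """Keep the start city and every city where an item is picked."""
    return [c for c in tour if c == start or c in picked_cities]


# Hypothetical instance: four cities on the corners of a 3 x 4 rectangle.
COORDS = [(0, 0), (3, 0), (3, 4), (0, 4)]
full_tour = [0, 1, 2, 3]
pruned = drop_unpicked_cities(full_tour, picked_cities={2}, start=0)

print(pruned, tour_length(full_tour, COORDS), tour_length(pruned, COORDS))
```

In this toy case only city 2 holds a picked item, so the tour shrinks from length 14 to length 10. Because the knapsack rent is charged per unit of travel time, a shorter tour directly reduces the cost that is subtracted from the profit.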

Initialization
In this stage, we initialize the basic parameters, randomly produce an initial solution by picking items from different cities, and generate the tour.
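The text does not specify how the random picking plan is drawn, so the following Python sketch makes its own assumptions: items are visited in shuffled order, and each one is picked with probability 0.5 provided it still fits in the knapsack. The function name `random_picking_plan`, the `pick_prob` parameter and the item data are hypothetical.

```python
import random
from typing import List, Sequence


def random_picking_plan(weights: Sequence[float],
                        capacity: float,
                        rng: random.Random,
                        pick_prob: float = 0.5) -> List[int]:
    """Return a random 0/1 picking plan whose total weight fits the capacity."""
    order = list(range(len(weights)))
    rng.shuffle(order)  # visit the items in random order
    plan = [0] * len(weights)
    load = 0.0
    for i in order:
        # Pick the item with probability pick_prob, but only if it still fits.
        if rng.random() < pick_prob and load + weights[i] <= capacity:
            plan[i] = 1
            load += weights[i]
    return plan


# Hypothetical item weights and capacity (illustrative numbers only).
WEIGHTS = [4, 7, 2, 9, 5]
CAPACITY = 12

plan = random_picking_plan(WEIGHTS, CAPACITY, random.Random(0))
print(plan, sum(w for w, z in zip(WEIGHTS, plan) if z))
```

Passing an explicit `random.Random` instance keeps runs reproducible, which matters when results are averaged over repeated executions.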

KP search
After initialization, we select the items with the help of the objective function and then eliminate the cities from which no item is picked. This saves travel time and cost, which in turn increases the profit. A limitation of this technique appears when a city $X_i$ is the only city connected to another city $X_{i+1}$: if $X_i$ is deleted because none of its items is of interest, but an item must be picked at $X_{i+1}$, then $X_{i+1}$ can no longer be reached. In that situation, the profit may evaluate to an infinite value. Simulated annealing (SA) is used for this sub-problem, with the following parameters:
The absolute temperature $T_{abs}$, set to 1.
The initial temperature $T_0$, set to 100.
The temperature cooling parameter $\alpha$, set to 0.95.
The number of iterations, which depends on the size of the instance, that is, on the m items scattered among the n cities.
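The parameter values above can be placed in a generic simulated annealing loop, sketched below in Python. Only the three temperature parameters come from the text; the single-item flip neighbourhood, the Metropolis acceptance rule, the fixed `iters_per_temp` and the toy knapsack objective are assumptions made for illustration and should not be read as the ESA procedure itself.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple


def temperature_levels(t0: float = 100.0, t_abs: float = 1.0,
                       alpha: float = 0.95) -> List[float]:
    """Geometric cooling schedule from t0 down to (but not below) t_abs."""
    levels = []
    t = t0
    while t > t_abs:
        levels.append(t)
        t *= alpha
    return levels


def simulated_annealing(z0: Sequence[int],
                        objective: Callable[[Sequence[int]], float],
                        rng: random.Random,
                        iters_per_temp: int = 50) -> Tuple[List[int], float]:
    """Maximize objective over 0/1 vectors by flipping one entry per step."""
    current = list(z0)
    current_val = objective(current)
    best, best_val = list(current), current_val
    for t in temperature_levels():
        for _ in range(iters_per_temp):
            cand = list(current)
            i = rng.randrange(len(cand))
            cand[i] = 1 - cand[i]  # assumed neighbourhood: flip one item
            cand_val = objective(cand)
            delta = cand_val - current_val
            # Assumed Metropolis rule: always accept improvements,
            # accept worse moves with probability exp(delta / t).
            if delta >= 0 or rng.random() < math.exp(delta / t):
                current, current_val = cand, cand_val
                if current_val > best_val:
                    best, best_val = list(current), current_val
    return best, best_val


# Hypothetical toy knapsack used as the objective (illustrative numbers only).
PROFITS = [60, 100, 120]
WEIGHTS = [10, 20, 30]
CAPACITY = 50


def toy_objective(z: Sequence[int]) -> float:
    weight = sum(w for w, zi in zip(WEIGHTS, z) if zi)
    if weight > CAPACITY:
        return float("-inf")  # overweight plans are never accepted
    return float(sum(p for p, zi in zip(PROFITS, z) if zi))


best, best_val = simulated_annealing([0, 0, 0], toy_objective, random.Random(1))
print(len(temperature_levels()), best, best_val)
```

With $T_0 = 100$, $\alpha = 0.95$ and $T_{abs} = 1$, the schedule passes through 90 temperature levels, so the total number of evaluations is governed by how many iterations are run at each level.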

TSP search
In the TSP search, we find the best tour x for the previous picking plan z in order to maximize the profit. We also focus on the travel cost rather than on the profit alone, because the travel cost is subtracted from the gained profit; we therefore also try to minimize the total travelling cost. Similarly, the picking plan is improved by comparing the current picking plan with the previous one and keeping the better plan for the next step.

RESULTS AND DISCUSSIONS
The results are computed on the instances described in Table 1. This section also presents the results of all the algorithms graphically for individual instances, and Table 2 compares the profit and time obtained on each instance.
Table 1 lists the parameters of each instance, including the number of cities and the number of items. The item factor is the number of items per city; each item has its own profit and weight. In all instances, the first node, which is the starting node, has no items. Table 1 also gives the knapsack capacity, which is limited and differs for each instance.
The benchmark dataset used in this research includes thousands of files. These files are divided into three categories based on their size: small, medium and large instances. The size ranges are defined by the number of cities and the number of items per city. The small category covers files with up to 1,000 cities and includes all types of files, that is, bounded strongly correlated (bsc), uncorrelated (un) and uncorrelated with similar weights (usw).
Table 2 presents the overall results on the instances by comparing ESA with two existing algorithms. It reports the mean value, which is the profit of the thief, and the execution time of each instance in seconds, where 600 s is the maximum time allowed for an instance file.
In Table 2, the proposed technique is compared with two existing techniques: a memetic algorithm and a hybrid of hill climbing and simulated annealing (Yafrani & Ahiod, 2016; Lourenço, Pereira & Costa, 2016). The first column of the table lists the individual instance files, and the remaining columns give the mean objective value, the standard deviation (std) and the time (T) in seconds. The values in Table 2 are averages over 10 executions per input file.
The table shows that the proposed technique performs very well with respect to both mean value and time. The small instances, shown in the first twenty-one rows, give good results. The proposed technique is very efficient on these instances, where the minimum time recorded is 25 s, which is very low. This is because eliminating cities reduces the travel time of the generated tour. The time sometimes rises to the maximum limit, as in the a280 bounded strongly correlated instance, but this happens in only a few instances.
The medium-size instances, which range from rl1304 to brd14051, also perform better. As Table 2 shows, the last bounded strongly correlated file is computed in 174 s, whereas the existing techniques need 586 and 600 s, respectively. In all medium-size instances, the performance is better with respect to both time and objective value.
The large category includes only nine instance files, and the results on them are also good. The existing techniques perform better on only three of these instances; on the other six, the proposed algorithm is very effective. In terms of time on large instances, the existing techniques are faster, but the difference is minor. We therefore conclude that this approach performs significantly well on small and medium sizes and is competitive on large instances. Eliminating the non-beneficial cities from the tour is what distinguishes this technique on large instances and gives better results. Graphical representations of the results for all categories are given below. Fig. 3 shows that our technique performs better on all types of instance, gaining 18,835, 155,765 and 59,072 for BSC, UN and USW, respectively. The performance of ESA is therefore much better on the eil76 instance.
The existing algorithms perform worse than ESA and take the maximum time to solve this problem. For small instances, this study therefore concludes that ESA performs better in most instances and is considerably more efficient. The execution times for these instances are compared in Fig. 4: the previous techniques take 600 s, whereas ESA takes 108, 159 and 345 s, respectively.
Fig. 5 shows that ESA performs much better on the bounded strongly correlated instance, where it gains 1,705,291; on the uncorrelated instance the performance degrades, and on the uncorrelated instance with similar weights it again performs much better. Fig. 6 shows that the execution time is also much better than that of the existing algorithms: for instance rl1304, the previous techniques take 600 s, whereas ESA takes only 300, 310 and 280 s, respectively.
Most studies focus on large instances. In this paper, the large instances are those with more than 15,000 cities. Fig. 7 compares the results on these instances with two other state-of-the-art algorithms: two files perform better with ESA, and the performance degrades only on the uncorrelated file, which gains a profit of 29,618,079.
Fig. 8 shows that all large instances run up to the maximum time limit. For large instances, this study therefore concludes that ESA performs better or competitively in most instances but takes the maximum time to execute a file. Although more time is taken, the higher profit obtained on these instances compensates for the execution time.
Table 3 demonstrates the overall performance of our technique across experiments on three related instance types with diverse weights: bounded strongly correlated, uncorrelated with similar weights, and uncorrelated, as described above. The results in Table 2 show that our algorithm performs much better by eliminating the cities from which no profit can be gained. The reasoning is that if no item is taken from a city, there is no reason to travel to it; doing so only adds to the cost of travelling from city $X_i$ to $X_{i+1}$. Recall that the knapsack is rented, so travelling from one city to another carries a cost. We therefore eliminate such cities to improve the profit. Because the travel cost is subtracted from the profit, the profit increases as the cost of travelling decreases. The time needed to solve the TTP also decreases.
We have discussed how different techniques for the TTP behave on instances of different sizes. Many researchers have applied optimization approaches to this problem, and many of them target a specific size range, that is, small, medium-size or large. We conclude that our approach performs significantly well on small and medium sizes and is competitive on large instances (Mei, Li & Yao, 2014). Eliminating the non-beneficial cities from the tour is what distinguishes this technique on large instances and gives a better objective value, since we aim to maximize the profit while minimizing cost and time. However, because the tour is generated randomly, we believe that the results would improve further if the tour and the picking plan were refined more thoroughly; the total travel time is also reduced in many instances.

CONCLUSION
In this article, we have proposed a technique in which simulated annealing is modified to solve the TTP. In this problem, a thief plans a tour and collects items to fill a knapsack so as to gain maximum profit at minimum cost within 600 s, the standard TTP time limit. The technique solves the TTP efficiently by rearranging the steps: it first creates a picking plan and then generates a tour. The tour is then improved by eliminating the cities that contribute nothing. On different instances, the proposed technique shows promising results compared with other state-of-the-art algorithms. Of the many instances we computed, a selection is reported and compared; on many of them the proposed technique outperforms the existing algorithms, and on the rest it is competitive. A remaining limitation is that, when an eliminated city is the only connection to a city that must be visited, the profit may evaluate to an infinite value.