Memetic algorithm for the dynamic vehicle routing problem with simultaneous delivery and pickup

In recent years, the Vehicle Routing Problem (VRP) has become an important issue for distribution companies. Also, the rapid development of communication means and the appearance of reverse logistics have given rise to new variants of the VRP. This article deals with an important variant of the VRP which is Dynamic Vehicle Routing Problem with Simultaneous Delivery and Pickup (DVRPSDP), in which new customers appear during the working day and each customer requires simultaneous delivery and pickup. A Memetic Algorithm (MA) that combines Genetic Algorithm (GA) and local search procedure have been proposed to solve the problem. The performance of the algorithm is evaluated with the tests carried out on a set of benchmarks found in the literature. The proposed memetic algorithm is very efficient and gives many good solutions.


Introduction
In recent years, the world has experienced an economic, technological and environmental evolution. So, transportation companies, large and small, are seeking to design an efficient distribution system that increases their competitiveness and satisfies their customers. The first challenge encountered by these companies is the protection of the environment which has become a main objective of most countries, so companies should adopt environmentally friendly approaches and integrate environmental preservation into all stages of supply chain management. Among the measures to be taken we have the collection of reusable or recyclable packaging, empty packaging and end-of-life products, and deposit them in the recycling depots. The second challenge is the emergencies, it consists in the appearance of urgent customers when the working day has already started and the vehicles are in circulation, so the new customers must be inserted in the distribution plan as soon as possible. Combining the two challenges, i.e. the environmental protection and the emergencies issue, will give rise to a dynamic and inverse distribution logistics that aims to optimize the delivery of products to customers and the simultaneous pickup of reusable or empty packaging in order to recycle them, in a context where new customers who have a demand for delivery and collection appear when the working day has already started, which requires their integration into planned tours or creating new tours. In this context, a very important variant of the VRP is created and studied: The Dynamic Vehicle Routing Problem (Sáez et al., 2008;Berbeglia et al., 2010;Zachariadis et al., 2010;Mitrović-Minić & Laporte, 2004;Pureza & Laporte, 2008;Ropke & Pisinger, 2006;Sheridan et al., 2013;Srour et al., 2018) with Simultaneous Delivery and Pickup (DVRPSDP). Fig. 1 shows an example of DVRPSDP.
The rest of this paper is organized as follows: Section 2 presents a detailed literature review of DVRPSDP. In section 3 the description of the problem is introduced and a mathematical model is proposed. Section 4 is reserved to the description of the proposed algorithm. Section 5 presents the parameter of the algorithm, the benchmark used for the problem and the numerical results. The conclusion is presented in section 6.

Literature Review
The VRP (Barkaoui et al., 2015) is a classical problem of combinatorial optimization. It consists of determining the routes for a set of vehicles available at the depot to serve a set of customers dispersed geographically. Each vehicle begins its tour from the depot and returns to the same depot. VRP has several variants including DVRP, VRPSDP and DVRPSDP. The main objective of our study deals with the arrival of new customers with delivery and collection demands (DVRPSDP) which is an important problem encountered by the distribution systems of large companies but unfortunately it is not very studied by researchers. To our knowledge, there are no studies in the literature that address the dynamic case of VRPSDP in the periodic and deterministic case. In this section, we will therefore be content to review the VRPSDP and DVRP literature.

VRPSDP
The VRPSDP is an extension of VRP in which the customer visited has a quantity to deliver and a quantity to be collected. When the vehicle finishes its tour, it returns to the depot to deposit what it has collected. The VRPSDP was introduced by Min (1989), he proposed a mathematical model for this variant, and developed a resolution approach based on the construction of Clusters of customers, then Assigning trucks/drivers to clusters and finally Creating the route structure. And then the VRPSDP became an objective of several research due to its importance in the protection of the environment. Several methods and techniques have been proposed in the literature to solve VRPSDP. We will mention the most relevant ones. For example, Dethloff (2001) described the important role played by the VRPSDP in reverse logistics. He suggested 2 an insertion-based heuristic which is based on three insertion criteria: Travel Distance (TD), Residual Capacity (RC), and Radial overload, to solve the problem. Angelelli and Mansini (2002) were the first to develop an exact approach based on the Branch and Price methods to solve the VRPSDP with time windows (VRPSDPTW). The same technique is used by Dell'Amico et al. (2006) to solve the VRPSDP. Chen and Wu (2006) have solved the VRPSDP using the insertion-based procedure to create a good initial solution, then a hybrid heuristic based on the record-to-record travel, tabu lists, and route improvement procedures are suggested to improve the initial solutions. Subramanian et al. (2007) suggested a greedy randomized adaptive search procedure that uses a variable neighborhood descent method to perform the local search for solving the VRPSDP. Gajpal and Abad (2009) proposed an approach to solve the VRPSDP, this approach is based on an Ant Colony System (ACS) algorithm that uses 3 local search schemes: 2-opt scheme, the customer insertion/interchange multi-route scheme, and the sub-path exchange multi-route scheme. Ai and Kachitvichyanukul (2009) developed a Particle Swarm Optimization (PSO) algorithm (Goksal et al., 2013;Lan & Xia, 2015) to solve the VRPSDP. Subramanian et al. (2010) solved the VRPSDP by using a multi start heuristic which consists in a variable neighborhood descent procedure, a random neighborhood ordering procedure and an Iterated Local Search (ILS) framework. A heuristic algorithm is proposed by Liu et al. (2013) to solve VRPSDPTW. The first heuristic is based on a GA, and the second one is a Tabu Search (TS) that is based on route assignment, an augmented cost function, route re-optimization, and attribute-based aspiration levels. Aguiar et al. (2014) solved capacitated VRPSDP with heterogeneous fleet by meta-heuristic PSO Discrete Adapted (PSODA). Hu et al. (2015) addressed the VRPSDP, in a context where the quantity to be delivered to customers is known at the beginning but the quantity to be collected is uncertain and known when the vehicle delivers to customer, and also the goods to be delivered and collected are incompatible. They proposed the VNS to solve the VRPSDP, as well as other methods to deal with uncertainty pickup. Kalayci and Kaya (2016) proposed an ACS empowered variable neighborhood search algorithm to solve VRPSDP. Zhao et al. (2018) has studied the disruption caused by the arrival of new pickup customers (Catay, 2010) on the original plan of VRPSDP. They established a model for the problem, and proposed a new dispatching method (Gendreau et al., 2006) using tabu search algorithm to solve the problem. Bouanane et al. (2020) carried out a taxonomic survey of the VRPSDP, conducting a comprehensive taxonomic classification of the VRPSDP literature published since 1989. Tao et al. (2020) studied the disruption of VRPSDP. They proposed a mathematical model to identify the disruption caused by the change of the original distribution and they developed a hybrid quantum bacterial foraging scheduling algorithm to solve the  Christopher et al. (2021) compares the Variable Neighborhood Descent algorithm (VND) and TS algorithm in solving VRPSDP.

DVRP
The Dynamic Vehicle Routing Problem (DVRP), is the problem where new customers appear when the working day has already started and vehicle tours are planned, so the new customers must be included in the working day. The DVRP with capacity and time duration constraints was introduced by Kilby et al. (1998). Later, the DVRP became the object of several researches and several extensions, among these researches we have the work of Montemanni et al. (2005), they gave a description of the DVRP, and proposed the approach of dividing the DVRP into a sequence of static VRPs. Each one is solved with ACS. The same algorithm is used by Schyns (2015) to solve the capacitated DVRP with TW (DVRPTW) with split delivery and heterogeneous fleets. Necula et al. (2017) also proposed the resolution of the DVRPTW using an ant colony optimization algorithm. Chang et al. (2003) proposed Tabu Search algorithm (TS) to solve Dynamic Pickup (Mes et al., 2010) and delivery problems with TW(DPDPTW), in a deterministic and continuous perspective, i.e., the optimization is applied after the arrival of each new request. Hanshar et al. (2007) suggested a GA to solve the DVRP using the approach adopted by Montemanni et al. (2005), the proposed algorithm has given good results in terms of minimizing travel costs. The GA is also used by Elhassania et al. (2014) to solve DVRP, they combine the GA with some heuristics for the construction of the initial population and the crossover. Luo et al. (2011) treated the DVRP with stochastic (Swihart & Papastavrou, 1999) requests for emergencies. They developed a hybrid algorithm based on Clarke and Write saving algorithms with the vehicle capacity and TW to build the initial solution, and the TS to optimize the solution. Khouadija et al. (2012) developed a PSO and a Variable Neighborhood Search (VNS) to solve the DVRP. Hong (2012) proposed an NS-based approach to solve the DVRP with hard time windows. He used the remove-reinsert mechanism to insert new customers in the planned tours. Euchi et al. (2015) proposed the resolution of the DVRP with a hybrid meta-heuristic approach that uses an Artificial Ant Colony based on 2-opt local search. Mandziuk and Zychowski (2016)

Problem Formulation
The problem studied in this paper consists in satisfying the customers who appear when the planning of the day is already realized, and the vehicles have started the distribution. Each new customer has two kinds of demand: distribution demand and collecting demand. To serve the new customers there are two ways: either use the vehicles available at the depot, or insert the new customers into the existing vehicle tours. We adopted the approach proposed by Montemanni et al. (2005) which consists of dividing the working day in time slices (periods). During the first period, vehicles start their tours from the depot, and serve the customers received in the previous day after (Time limit for receipt customers). During the following period, each vehicle begins its tour from the last customer it visited in the previous period. The capacity of the vehicle is the remaining capacity in the vehicle after serving the last customer of the previous period. The start time begins after serving the last customer of the previous period. The mathematical model of each period is already described in Berahhou and Benadada (2020).

Notation
T: is the length of the working day.
: Set of new customer locations. : Set of vehicles available at the depot. : Set of vehicles in circulation (vehicles that have already visited customers). V = ∪ : Set of all vehicles. p: Index of the period. 0: Index of the depot.
: is the last customer visited by the vehicle k at the period . : is the total quantity delivered to customers already committed to vehicle k at the end of the period . : is the total quantity collected from customers already committed to vehicle k at the end of the period . : is the end of the serving time for the last customer committed to vehicle k at the period . : Load of vehicle in circulation k ∈ when leaving the depot. : Delivery quantity of customer j ∈ . : Pick-up quantity of customer j ∈ . : Service time of customer j.
: Travel costs between customer i and customer j. : Load of vehicle k ∈ V at the beginning of the period . : Load of vehicle after having serviced customer j ∈ . : Variable used to prohibit sub tours; can be interpreted as position of node j ∈ in the route.
The objective function is Under the following constraints: , The mathematical model obtained is a mixed linear program. The objective function (1) minimizes the total cost of the tours.
Constraints (2), (3) and (4) impose that all vehicles start their tours from the last customer that have been committed to it and return to the depot. Constraints (5) require that each customer is visited once and only once by one vehicle. The flow conservation constraints are carried out by the constraints (6). Constraints (7) represent the load of vehicles in circulation at (1) the beginning of the period, and the constraints (8) represent the load of vehicles coming from the depot. The Vehicle load after visiting the first customer of the period is limited by constraints (9). Constraints (10) ensure the vehicle load on route. Constraints (11) and (12) express that the quantity loaded in a vehicle does not exceed capacity limits. Constraints (13) ensure the availability of deliveries. Constraints (14) assure that the total duration of each route does not exceed a predefined limit. Constraints (15) impose that the number of vehicles added must not exceed the number of vehicles available in the depot. Finally, constraints (16) are Subtours breaking constraints.

The Proposed Memetic Algorithm
The DVRPSDP is a problem related to the VRP which is NP-Complex, the Cplex solver is used to validate the proposed mathematical model and to solve small instances, but it requires a long execution time. So, in order to find a good solution for large problem instances in a reasonable time, we will propose a meta-heuristic solution. The memetic algorithm (MA) nominated by Moscato (1989), is part of the family of evolutionary algorithms and is based on the mechanisms of natural evolution; it is inspired by Darwin's theory of species evolution. It consists in combining the GA, first introduced by Holland (1975), with local search heuristics to improve the solution. It is widely used to solve several optimization problems, because it gives good solutions when the exact methods cannot give a solution within a reasonable time, among these problems we find the VRP for which the MA has given good results. Our algorithm was inspired from the MA used by Ayadi and Benadada (2013) to solve the multi-objective VRP with multiple trips. Fig. 2 represents the life cycle of MA.

Coding the solution
For each period , we represent a chromosome by an array. Each chromosome represents the customers with the order in which they are visited by a vehicle. The first box contains the point of departure of the vehicle. If it starts from the central depot the box contains 0, otherwise it contains the last customer visited in the precedent period. Fig. 3 shows the coding of a solution for 6 customers and 2 vehicles.

Initialization of the population
The first step of the GA consists in building a population of individuals that will be the basis of future generations. To generate the initial population, we use the nearest neighbor method first, if the number of solutions obtained is less than the number necessary to build the initial population, we use a random insertion method to generate new solutions.

Nearest neighbor method or k nearest neighbors
This method is widely used to solve the VRP, because it is fast in execution. First, we look for the nearest customer to the depot, and add him to the tour. Then, the closest customer of the last customer visited is searched and inserted at the end of Fig. 3. Chromosome representation the tour, and so on until the maximum capacity or time is reached. Then, a new tour is started and the addition of customers is done in the same way until all customers are inserted.

Random insertion method
This method starts with the creation of a tour, then customers are randomly selected and added to the tour one by one. If adding a customer violates the constraints of the problem a new tour is created. The algorithm stops when all customers are added.

Selection and Evaluation
After the construction of the initial population, it is necessary to select the individuals that will participate in the reproduction. We will use the selection by tournament which is widely used by evolutionary algorithms. It consists in choosing randomly a subset of individuals, then selecting the best individual according to its fitness function. The process is repeated until the desired number of individuals is obtained (Miller & Goldberg, 1995). The fitness function chosen for our problem is the total distance traveled to visit all customers.

Crossover
The crossover is a genetic operator that combines two chromosomes called parents to produce a new one called child. The importance of the crossover is that the chromosome created can be better if it takes the best features of each parent and is applied with a crossover probability . Due to the nature of DVRP, traditional crossover methods such as Partially-Mapped-Crossover (PMX) are not adaptable for its resolution, so a new version of crossover is represented in Figure 4.
The Best Cost Route Crossover with priority to Capacity Constraint (BCRCPCC) method is a combination of Best Cost Route Crossover (BCRC) and capacity priority. First it checks the remaining capacity in the vehicle, making sure if it can meet the quantity requested by this customer. If no vehicle can meet this need, we create a new tour using an empty vehicle that is available at the depot. If a vehicle is chosen the customer will be inserted into the location that minimizes the total cost of each parent with the respect of the constraint. This requires calculating the insertion cost of each customer in an area of the tour.

Mutation
The mutation is a genetic operator that allows exploration of the research space, and avoids that the algorithm converges too quickly towards a local optimum. It is applied, with a probability of mutation , to the chromosomes generated by the crossing step, or directly to the selected parents if the crossing is not performed. Our algorithm will use two mutation methods, it depends on the number of vehicles used, the first one is used in the case of a single vehicle and the second one is used in the case of several vehicles.

One vehicle mutation
In the one vehicle mutation there is only one line so the mutation consists in choosing randomly two cut points, and then reverse the order of visit of the customers who are between the 2 chosen points. Fig. 5 shows an example of one vehicle mutation.

Several vehicles mutation
In several vehicles' mutation, two customers are randomly selected from two different tours and they are exchanged. Fig. 6 shows an example of several vehicles' mutation.

Correction process
After applying the crossover and mutation operations on the parents, the constraints are violated in most cases. Therefore, it is necessary to apply a correction procedure to check the feasibility of the generated children. This procedure launches a local search to improve the order in which customers are visiting. Then the tours that exceed the capacity of the vehicles or the maximum service time are partitioned according to the position that guarantees the best total cost of the two new tours. Finally, the short turns are combined using the Clark and Wright method (Clarke & Wright, 1964). Fig. 7 shows the steps of the correction process.

Local Search
Local search is applied to each solution of the initial population and to each new child obtained through crossing and mutation operations. The proposed local improvement procedure uses two basic movements, highly responsive in solving the vehicles routing problems; reintegration and exchange. In order to accelerate research, the strategy of first improvement is adopted  rather than the best improvement. The procedure begins with the improvement of each individual's tour to be processed separately from the other tours. Then, local research is applied between the individual's tours two by two.

Local Search Intra-tour
Local search intra-tour consists in applying the reintegration and exchange operators on each tour separately from the other solution tours. The algorithm of local search intra-tour is detailed below Exchange ci and cj 5: If the new solution is better than the original solution 6: Save the exchange and go to 1 7: For each customer ci of Ri, 8: Insert ci in the best position of Ri. 9: If the new solution is better than the original solution 10: Save the movement and go to 1 11: End

Local Search Inter-tour
In the local search inter-tour, the exchange and reintegration operators are applied between two different tours of the solution. First of all, the solution tours are sorted from the longest to the shortest in terms of distance. Indeed, the longest tours are the first to improve, considering that they are the first to be responsible for the high cost of the solution. The algorithm of local search inter-tour is detailed below: If the exchange between ci and cj reduces the cost and all constraints are verified. 7: Exchange ci and cj 8: If the new solution is better than the original solution 9: Save the exchange and go to 1 10: For each customer ci of Ri, 11: If the total quantity of ci and Rj does not exceed the capacity of the vehicle 12: Insert c i in the best position of R j . 13: If the new solution is better than the original solution 14: Save the movement and go to 1 15: End

Parameter tuning with Taguchi method
The parameters of the MA have a great impact on the results obtained. The quality of the solution changes with the variation of parameter values. In order to choose the best parameters for our algorithm we adopted the Taguchi method. The Taguchi experimental design is widely used for optimization problems; it helps to identify the best parameters for the algorithms. Taguchi method is based on two tools: Orthogonal Array (OA) and signal to noise (ratio S/N). The Taguchi method based on OA allows to reduce the number of experiments; the experimental schemes are constructed from the different levels of the parameters. In the ratio S/N, S represents the variable response signal and N is the standard deviation noise. So, the S/N ratio represents the strength of change in the response variable. The objective of Taguchi method is to maximize the S/N ratio (Azadeh et al. (2017); Sarbijan et al. (2022)).
Parameters of the proposed MA considered for tuning are: population size, crossover probability and mutation probability. Different levels of these parameters for the tuning process are shown in Table 1. By studying the different designs of the Taguchi method, we choose the L16 design using Minitab 21 software. More details about Taguchi method and the different designs can be found in the book written by Roy (2010). For each scheme, the instance of 50 customers was tested, and 10 runs of the MA is performed for each design. The algorithm stops after 200 iterations without any improvement in the objective function. The proposed algorithm is coded in Java and tested on a machine with Intel Core I7 processor, 2Ghz, 32GB of RAM and the operating system windows 10. The results obtained are summarized in Table 2.  According to the S/N ratios diagram of the Taguchi design represented in Fig. 8, the optimal values of the parameters of the MA are the maximum values of the diagram. Table 3 represents the optimal value of each parameter.

Benchmarks
The DVRP and DVRPSDP was tested on the instances of Christofides and Beasley (1984) and Christofides (1979) (7 instances), Taillard (1993) (12 instances), and Fisher et al. (1981) (2 instances) proposed by Kilby et al. (1998) and extended by Montemanni et al. (2005). The arrival time of each customer is proportional to the arrival time proposed by Kilby et al. (1998). Concerning the number of periods and , we used the same parameter of Montemanni et al. (2005) (25 periods and = ). The number of vehicles is set to 50 for each problem. To generate the delivery and pickup demands of each client from their original demand, we adopted the method used by Salhi and Nagy (1999). For each client i of Xi and Yi coordinates, and original demand , we calculate the ratio , and then the delivery demand and pickup demand are calculated as follows: = min ( , ) , = * and = (1 − ) * Each instance is divided as sets X and Y. A set X of instances is created, and another set Y of instances is generated by exchanging the delivery and pickup demand for each customer.

Results
To our knowledge, there are no studies in the literature that address the dynamic case of VRPSDP with periodic reoptimization, so our results have not been compared with other works. Thus, before giving the numerical results obtained by the application of the MA to large instances of the DVRPSDP, we started by solving small instances of the VRPSDP with Cplex and then with the MA. This allowed us, on the one hand, to validate the mathematical model, and on the other hand, to test the performance of the MA. Our test is performed on the data set A of Augerat et al. (1995) by considering only the first 10, 15, 20, 25 and 30 customers. According to the results presented in Table 4, we can see that for the small instances of 10, 15, 20, 25 customers the Cplex solver was able to give the optimal result but this requires a long time that goes up to 7h for 25 customers, however the MA was able to give results very close to the optimal value in a time that does not exceed 1 minute. For the problem of 30 customers, Cplex could not give a solution because of the running out of memory. We conclude that the MA has shown its efficiency by giving solutions very close to the optimal value and in less than one minute of CPU time.   The table shows the distance for the first period T1(static case) and the total distance of all periods T (dynamic case). We notice that for all the instances tested the MA was able to provide a feasible solution in a reasonable time that does not exceed 10 min.

Results of classical DVRP
As we are the first to study the DVRPSDP with periodic reoptimization, to the best of our knowledge, we could not compare our results with other works. Thus, to better evaluate the performance of the proposed algorithm, we will apply the MA to the classical case of DVRP. Next, we compare our MA with the Ant Colony System (DVRP-ACS) developed by Montemanni et al. (2005), and with GA (DVRP-GA) proposed by Hanshar and Beatrice (2007). The results of table 6 indicate that the MA was able to find a new best solution for 18 instances (out of 21 instances) with a considerable improvement that has reached 9%, the MA gives better results than Hanshar's algorithm for 18 instances (86%) and Montemanni's algorithm for all instances (100%). The Gap measures the difference between the MA and the best result (between DVRP-ACS and DVRP-GA). We conclude that the performance of the MA is better than the performance of ACS algorithm and genetic algorithm in solving DVRP.

Conclusion
In this paper, we treated the dynamic vehicle routing problem with simultaneous delivery and pickup (DVRPSDP). A memetic algorithm which combines genetic algorithm with a local search procedure is developed to solve the problem. The algorithm was tested on DVRP and DVRPSDP using problem instances reported in the literature. For the DVRP we compared the MA results with those of Montemanni et al. (2005) results and Hanshar and Beatrice (2007) results, The MA was able to outperform the ACS algorithm and 86% of the MA results are better than GA results. The DVRPSDP is not widely studied, we have not found any work to compare our results. As future work, we will test the performance of the algorithm on larger instances, and also, we will propose other metaheuristics to solve the problem and compare the results with those obtained by the MA. Another avenue of research is to add more constraints to the problem such as stock restriction, time windows and more objective functions.

Appendix
In order to compare the results of the static VRPSDP with works in the literature, MA was tested using the data of Christofides et al. (1979) (12 instances and 50 to 199 customers), used by Salhi and Nagy (1999) to test the VRPSDP. Table 5 provides comparison of the solution quality for the static VRPSDP in terms of minimizing travel distance is done between Parallel Iterated Local Search (PILS) proposed by Subramanian et al. (2010), and the ant colony system empowered variable neighborhood search algorithm (ACSEVNS) developed by Kalayci and Kaya (2016) and the MA solution. The results prove that the efficiency of the MA is better than the precedent algorithms. The results shows that MA has been able to provide new best solutions for all 24 instances, and it improves the results of Kalayci and Kaya (2016) and Subramanian et al. (2010) with a considerable decrease of up to 6% for the CMT2X and CMT2Y instances. In terms of the number of vehicles used, we notice that the Memetic algorithm uses less vehicles for all instances. Since Kalayci and Kaya (2016) did not specify the number of vehicles in their paper, some of the best-known results in Table 5 do not contain the number of vehicles. N: number of customers, V: number of vehicles, Gap%: Percentage difference between the best known and the best-found solution, the best solution of is highlighted in bold, -: means the value was not mentioned in the source.