An Adapted Version of the Water Wave Optimization Algorithm for the Capacitated Vehicle Routing Problem with Time Windows with Application to a Real Case Using Probe Data

: Customers’ habits, as far as shipping requests are concerned, have changed in the last decade, due to the fast spread of e-commerce and business to consumer (B2C) systems, thus generating more and more vehicles on the road, tra ﬃ c congestion and, consequently, more pollution. Trying to partially solve this problem, the operational research ﬁeld dedicates part of its research to possible ways to optimize transports in terms of costs, travel times, full loads etc., with the aim of reducing ine ﬃ ciencies and impacts on proﬁt, planet and people, i.e., the well-known triple bottom line approach to sustainability, also thanks to new technologies able to instantly provide probe data, which can detail information as far as the vehicle behavior. In line with this, an adapted version of the metaheuristic water wave optimization algorithm is here presented and applied to the context of the capacitated vehicle routing problem with time windows. This latter one is a particular case of the vehicle routing problem, whose aim is to deﬁne the best route in terms of travel time for visiting a set of customers, given the vehicles capacity and time constraints in which some customers need to be visited. The algorithm is then tested on a real case study of an express courier operating in the South of Italy. A nearest neighbor heuristic is applied, as well, to the same set of data, to test the e ﬀ ectiveness and accuracy of the algorithm. Results show a better performance of the proposed metaheuristic, which could improve the journeys by reducing the travel time by up to 23.64%.


Introduction
There is no doubt that together with industry 4.0, big data, artificial intelligence, climate change and many others, sustainability is one of the keywords of the 21st century we are living right now. Indeed, concerns about the planet and living beings must be put first, as the future strongly depends on our behavior and actions; according to that, authorities as well as companies and simple consumers are encouraged to act by taking care of this aspect. Researchers and practitioners also play a crucial role, as their fundamental contribution is to develop innovations, metrics, models and tools which can support and stimulate sustainable practices in all the fields in which sustainability can be declined, as well as on the other side understand how current activities are unsustainable [1]. One of the areas which deserves particular attention, since facilities and activities involved have significant impacts, is the transportation activity [2]. In terms of the well-known triple bottom line (TBL) approach, from an logistics issues; examples of these contexts include the travelling salesman problem (TSP) [20], the problem of routing a picker in manual warehouses [21], or the scheduling of high-speed trains [19]. To the best of the authors' knowledge, however, there is no evidence in the literature of its application to the CVRPTW and this is the gap intended to fill in this paper.
Clearly, there is no doubt that these innovative solutions can be implemented only thanks to the availability of data which are directly generated from vehicles, such as their speed or their locations, which can generate useful information in terms of travel time.
To test the efficiency and effectiveness of this tool, the adapted WWO was then applied to the case of an express courier operating in Caserta, a city in the South of Italy. Probe data were provided by the company itself, which instantly collect information as far as the localization of its vehicles (origin-destination), through a Global Positioning System. Results from this case are then compared with those obtained by applying a different algorithm to the same set of data, namely the heuristic nearest neighbor [22], considered one of the best known heuristics for solving the CVRP [23]; indeed, previous studies used its results as a benchmark for their assessment [24,25].
In the remainder of the paper, Section 2 provides a background on the WWO, followed by Section 3, where the adapted version of the algorithm is presented; Section 4 shows the case study and its results, which are also compared with those obtained applying the nearest neighbor algorithm. Finally, implications and conclusions are provided (Section 5).

Background: The Water Wave Optimization Algorithm
According to Zheng [19], the WWO algorithm is a nature-inspired algorithm that mimics the shallow water wave models. It consists of an initial population of waves randomly generated, each of which defined by two parameters: the wavelength λ ∈ R + and the wave height (or amplitude) h ∈ Z + . The aim is to explore the seabed area, corresponding to the solution space X, and to identify the "best wave" x * ∈ X, namely the wave with the highest energy level; this involves determining a fitness value f(x), acting as discriminatory in preferring a certain wave rather than another one. The final aim is to define the best f(x), i.e., the higher fitness value for a maximization problem, the minimum for a minimization one. The lower the distance between the wave and the seabed and the wavelength is, the higher the wave height and the fitness value are. In the general case, a wave can have n dimensions, corresponding to n aspects of the problem under examination. At each iteration of the algorithm, three operations are carried out on the waves to modify them and try to improve the current best solution of the problem; the operations are named propagation, refraction and breaking [26] and recall the natural behavior of waves. The development of the algorithm can be resumed in the following five steps [21]: Step 1, Initialization. A population of N solutions (i.e., waves) for the problem under examination is randomly generated; each wave has an initial h set at h MAX .
Step 2. The fitness value f(x) is determined and the wave having the best value x * is identified.
Step 3, Propagation. Propagation is the operation by which the solutions of the problem change at each iteration of the algorithm. It is therefore carried out on all waves of the population; anytime a wave is propagated, it gets a new position, which is obtained by applying the following formula: where x(d) represents the original wave, rand(−1, 1) is a uniformly distributed random number in the range [−1;1], d (1 ≤ d ≤ n) is a generic dimension of the problem, L(d) is the length of the d dimension of the search space. The fitness value f (x ) of the new solution is calculated and then compared to the corresponding fitness value before the propagation; to choose whether to maintain x or replace it with x , the following set of inequalities is applied (for a maximization problem): The wavelength is thus updated according to the following formula: where f max and f min are respectively the maximum and the minimum fitness values among the current population, α is the wavelength reduction coefficient and ε > 0 is a very small positive number, to avoid division-by-zero. The propagation is repeated once for each iteration of the algorithm.
Step 4, Refraction. After some iterations, it is possible that some waves reach h = 0. This means that, despite their propagation for h max times, these waves were not able to identify an effective solution for the problem under examination; the logical consequence would be their removal from the set of problem solutions. Refraction is performed exactly on those waves that are expected to disappear, and consists in identifying a new wave, determined as a Gaussian random number with mean µ and standard deviation σ: where x * is the best solution found so far; hence, the aim of refraction is to replace the ineffective solution of the problem with a good new one. The height of the new wave is also restored to h max , as previously done in the propagation phase when better solutions were obtained, and the wavelength is updated as follows: Step 5, Breaking. Breaking is a procedure that mimics the real behavior of waves when moving to a position where the water depth is low, i.e., their breaking into a train of solitary waves. This procedure is only applied to those waves that returned the best solutions to the problem in question, i.e., on waves x that became x * after propagation. By breaking, a local search is performed around this best solution by taking k random dimensions (with 1 ≤ k ≤ k max < n, being k max a predefined number) and generating a solitary wave x for each dimension d, as follows: where β is the breaking coefficient. If there are no solitary waves better than x * , x * is maintained; otherwise it is replaced by the fittest one among the solitary waves.
Steps 2-5 are repeated for each iteration of the algorithm on the whole set of waves in the population. Generally, a maximum number I of iterations is set in advance, and once it has been completely carried out, the algorithm stops.

The Adapted Water Wave Optimization to the CVRPTW
The original formulation of the WWO cannot be directly applied to the CVRPTW; hence, in this section, we illustrate its adaptation to make it suitable for solving the problem under examination.
The application of the WWO to the CVRPTW is intended to generate an effective route for a set of M trucks, by minimizing their travel distance and time (namely a minimization problem), according to capacity and availability constraints of the vehicles and time constraints. Vehicles are assumed to start from a depot (indicated as vertex 0) and return to the depot at the end of the trip (indicated as vertex N + 1, where N corresponds to the number of customers). Consequently, the objective function corresponds to the sum of the single contributions (i.e., the (i, j) arches in the graph where i j) forming the route. Note that each route originates at vertex 0 and ends at vertex N+1; on the contrary, no arc terminates in vertex 0, or originates from vertex N + 1.
When applying the WWO to this problem, each wave x (denoting the solution of the problem) will correspond to a generic route, consisting of a vector containing at least two zero elements, Sustainability 2020, 12, 3666 5 of 13 which indicate the location of the depot and represent respectively the origin and destination nodes. The remaining n ∈ N + non-zero nodes represent the customers to be visited (1, 2, . . . , N). The fitness function f (x) will instead denote the total travel time to cover route x. For instance, considering 5 customers to be visited by 2 vehicles (i.e., N = 5 and M = 2), a potential solution to the problem, reflecting a route configuration, could be the following: The phases of the algorithm are below detailed, recalling the previous subdivision in five steps, preceded by Table 1, which illustrates the nomenclature adopted. Before proceeding, two variables are introduced: a binary variable x ij for each arc (i, j), where i j, i N+1, j 0, which scores 1 in the case that the arc belongs to the optimal solution, 0 otherwise; s iC k instead is defined for each node i of each k-th route and indicates the time in which the vehicle services customer i.
The algorithm aims to determine the following value: Subject to: x ij = 1 i f arc (i, j) belongs to the optimal solution 0 otherwise ∀i, j ∈ N, i j, i N + 1, j 0 (14) Clearly, f (x) corresponds to the objective function, namely the fitness function, and expresses the total time for servicing the customers, computed on the set of optimal routes C k covered by the trucks. According to what has been said in Section 2, being a minimization problem, the aim in this case is to determine the lowest fitness value.
Constraint (8) states that each customer is visited exactly once, while the next inequality (9) corresponds to the vehicle capacity constraint. The three following equations (10, 11, and 12) impose that each vehicle leaves at vertex 0 (the depot), and after arriving at a customer, the vehicle leaves again, and finally reaches vertex N + 1 (the depot). Inequality (13) ensures that the time windows are observed, and finally, (14) corresponds to the integrality constraint.
Step 1, Initialization. A population of possible routes is randomly generated. To this end, an initial round trip tour is set, whose corresponding string includes N − 1 + 2 = N + 1 zeros: two respectively opening and closing the route, and the remaining ones placed between each pair of nodes, since at this stage it is assumed that each customer is directly connected to the depot. Again, for instance, considering the previous example with N = 5, a possible configuration (with N + 1 = 6 zeros) can be as follows: 0 1 0 2 0 3 0 4 0 5 0 Clearly, it would be possible to start with any other random configuration. In literature, the idea of generating a population starting from a unique configuration is known as seed initialization, and it was noted that both heuristics and metaheuristics implementing this kind of initialization returned better results [27]. After I permutations, whose number is set at the start of the algorithm and defines the problem size, the result is an initial population with the following characteristics: • the number of solutions corresponds to the number of permutations; • all solutions have the same size (2N+1) and the same value of the objective function, i.e., the same fitness value (the journey time for each route is steady, as each node is always included between two zeros); • all solutions are possible and realistic.
These solutions form the matrix V(i, j), whose dimensions are (I; 2N+1). Note that after many iterations of (ite) launched, it was observed that the best number of iterations is equal to N-M.
Step 2, Propagation. In the description of the original algorithm, at the second step, the fitness value f(x) was determined, and the wave having the best value x * was identified. In the adapted version, however, after initialization, all of the possible solutions own the same fitness value (cf. step 1), meaning that it is not possible to identify the best one. As a result, propagation immediately comes, performed on all the solutions determined in the initialization phase.
From each x solution, a new x solution is identified, and this last will replace x (if consistent with the constraints, otherwise it will not be considered), even if the objective function value gets worse, namely, it gets a higher value, in this specific case.
For each solution, a k ∈ [1; K] integer number is randomly generated, where K = V (i, j + 1 : j − 1) is the number of the zeros in each string; j*, the zero occupying the k-th position, is removed. This corresponds to the union of two arches; in other words, two nodes initially belonging to two different sub-circuits are connected, meaning that two customers will be served without any stop at the depot. Of course, this union could generate either a saving or an increase in time depending on the route in question.
The benefit of the seed initialization is that, after the permutations, it is no longer possible to have two identical solutions, even if the k number generated is the same. Indeed, for instance, assuming k = 3 for two different solutions, still having N = 5 the two updated solutions could be as follows: The benefit of the seed initialization is that, after the permutations, it is no longer possible to have two identical solutions, even if the k number generated is the same. Indeed, for instance, assuming = 3 for two different solutions, still having = 5 the two updated solutions could be as follows: The value of the objective function is then computed, for each c containing j*-1 and j*+1. Finally, the capacity constraint is checked: given , demand vector the following summation is determined: ∑ , ; ∀c, ∀i solution and compared to .
After each iteration, K is updated and decreased by one until its value reaches −1.
Step 3, Refraction. Aiming to improve solutions and making the search be as thorough as possible, refraction operator is performed on those solutions that generated sub-circuits breaking the capacity constraint, i.e., when ∑ , > , as it is imposed that the vehicle capacity loading is not exceeded. These sub-circuits c are identified, and given that J is the number of their elements, a random value j ∈ [1; ] is generated, to which demand corresponds. This number reflects the position occupied by the node (i.e., customer) candidate for switching its position with a node belonging to another sub-circuit. The criteria for selecting this second node is that of having a request lower than the one owned by the node that occupies the j-th position, in the attempt of respecting the capacity constraint actually violated. Those nodes to which a lower capacity corresponds constitute the Q set (Q= ∉ ∪ < ), and among them, the one contributing to minimizing the total time involved, i.e., the objective function, is chosen j'*. These two nodes are then switched. In case the capacity constraint is still not respected, this operation is iterated until Q becomes an empty set; anytime there is the need to repeat the procedure, Q is updated, excluding j'*.
The value of the objective function and the capacity constraints are evaluated again.
Step 4, Breaking. Net of refraction; if the capacity constraint is still violated, breaking is activated. In the adapted approach, this procedure simply consists in permuting two random nodes: one of the zeros of the string in question (whose position belongs to the range [1; K], where K is appropriately updated according to the current iteration) and one of the non-zero elements of the solution. In mathematical terms, the swap is operated on and , where ∈V(i,j)=0, while ∈V(i,j)≠0. The check on the constraint is repeated, and if it is still not respected (i.e.,∑ , >cm), the refraction process is repeated. Theoretically, breaking and refraction could alternate n times; however, to avoid excessive computational efforts, the algorithm is set so that this can occur just once. If after one breaking and refraction process, the solution is still not acceptable, there is probably little potential to exploit this solution effectively. Therefore, propagation will be repeated on this solution to get a more effective one. On the contrary, if the solution becomes acceptable, it will be checked for correspondence to the time windows constraint.
Step 5. At this step, finally, the check on the time constraint is activated. For each of the nodes requested to be visited within a time window [a; b], the expected time of the visit is computed; for node j this accounts for = ∑ . In the case ∉ [aj, bj] (i.e., the visit does not fall into the time span), the maximum value max within the sub-circuit is determined; then, a new value for the arrival time is computed as = − max . A new node j'' is then selected, so that '' ≤ , and inserted instead of j in the sub-circuit. This procedure is iterated until the time windows are respected, i.e., ∀ , ∀ ∈ , ∀ ∈ c → ∑ ∈ [aj, bj], and finally, the best solution, in terms of minimal time, is proclaimed.
Steps 2-5 are repeated for each iteration of the algorithm on the whole set of routes in the population. As in the original formulation of the algorithm, a number ite of iterations is allowed and once it has been completely carried out the algorithm stops.
The flowchart depicted in Figure 1 resumes the main steps of the algorithm.
The value of the objective function is then computed, for each c containing j*−1 and j*+1. Finally, the capacity constraint is checked: given G, demand vector G = [d h ], where h ∈ [1,n], the following summation is determined: h= j k ,j k+1 d h ; ∀c, ∀i solution and compared to c m .
After each iteration, K is updated and decreased by one until its value reaches m − 1.
Step 3, Refraction. Aiming to improve solutions and making the search be as thorough as possible, refraction operator is performed on those solutions that generated sub-circuits breaking the capacity constraint, i.e., when h= j k ,j k+1 d h > c m , as it is imposed that the vehicle capacity loading is not exceeded.
These sub-circuits c are identified, and given that J is the number of their elements, a random value j ∈ [1; J] is generated, to which d j demand corresponds. This number reflects the position occupied by the node (i.e., customer) candidate for switching its position with a node belonging to another sub-circuit. The criteria for selecting this second node is that of having a request lower than the one owned by the node that occupies the j-th position, in the attempt of respecting the capacity constraint actually violated. Those nodes to which a lower capacity corresponds constitute the Q set (Q= j c ∪ d j < d j ), and among them, the one contributing to minimizing the total time involved, i.e., the objective function, is chosen j'*. These two nodes are then switched. In case the capacity constraint is still not respected, this operation is iterated until Q becomes an empty set; anytime there is the need to repeat the procedure, Q is updated, excluding j'*.
The value of the objective function and the capacity constraints are evaluated again.
Step 4, Breaking. Net of refraction; if the capacity constraint is still violated, breaking is activated. In the adapted approach, this procedure simply consists in permuting two random nodes: one of the zeros of the string in question (whose position belongs to the range [1; K], where K is appropriately updated according to the current iteration) and one of the non-zero elements of the solution. In mathematical terms, the swap is operated on y 1 and y 2 , where y 1 ∈V(i,j)=0, while y 2 ∈V(i,j) . The check on the constraint is repeated, and if it is still not respected (i.e., h= j k ,j k+1 d h >c m ), the refraction process is repeated. Theoretically, breaking and refraction could alternate n times; however, to avoid excessive computational efforts, the algorithm is set so that this can occur just once. If after one breaking and refraction process, the solution is still not acceptable, there is probably little potential to exploit this solution effectively. Therefore, propagation will be repeated on this solution to get a more effective one. On the contrary, if the solution becomes acceptable, it will be checked for correspondence to the time windows constraint.
Step 5. At this step, finally, the check on the time constraint is activated. For each of the nodes requested to be visited within a time window [a; b], the expected time of the visit is computed; for node j this accounts for T j = i t ij . In the case T j [a j , b j ] (i.e., the visit does not fall into the time span), the maximum value maxt ij within the sub-circuit is determined; then, a new value for the arrival time a j is computed as a j = b j − maxt ij . A new node j" is then selected, so that b j ≤ a j , and inserted instead of j in the sub-circuit.
This procedure is iterated until the time windows are respected, i.e.,∀i, ∀c ∈ I, ∀j ∈ c → j t ij ∈ [a j , b j ], and finally, the best solution, in terms of minimal time, is proclaimed.
Steps 2-5 are repeated for each iteration of the algorithm on the whole set of routes in the population. As in the original formulation of the algorithm, a number ite of iterations is allowed and once it has been completely carried out the algorithm stops.
The flowchart depicted in Figure 1 resumes the main steps of the algorithm.

Case Study
In order to test its effectiveness, the adapted version of the WWO to the CVRPTW was applied to a real case study using the software MATLAB®, with real data obtained from an express courier operating in Caserta, South of Italy, and serving customers who ordered products through an ecommerce platform. Related results are discussed in this section.
According to the taxonomy proposed by Braekers et al. [28], the scenario and the physical characteristics of the problem in question are below resumed, respectively in Table 2 and Table 3. Table 2. Scenario characteristics according to [28] for the problem in question.

Case Study
In order to test its effectiveness, the adapted version of the WWO to the CVRPTW was applied to a real case study using the software MATLAB®, with real data obtained from an express courier operating in Caserta, South of Italy, and serving customers who ordered products through an e-commerce platform. Related results are discussed in this section.
According to the taxonomy proposed by Braekers et al. [28], the scenario and the physical characteristics of the problem in question are below resumed, respectively in Tables 2 and 3. Table 2. Scenario characteristics according to [28] for the problem in question.

Scenario Characteristics
Number of stops on the route Known (deterministic) Customer service demand quantity Deterministic Onsite services/waiting times Deterministic Time windows structure Strict time windows Node/Arc covering constraints Precedence + Subset covering constraints Table 3. Problem physical characteristics according to [28]. Data about the case study were obtained thanks to interviews with a manager from the company itself, and they reproduce a real working day, specifically Thursday 19th of September 2019. Note that the sequencing followed on that specific day was not provided, as well as the tools used by the company to determine the routes their vehicles have to travel.

Physical Characteristics
That day, exactly 67 customers needed to be visited (this means N = 67, reflecting the number of nodes of the graph in question) and two vehicles were available (M = 2), both with a load capacity of 300 kg. The depot is unique. As far as the time windows are concerned, 55 customers shall be reached between 10 a.m. and 6 p.m., while the remaining 12 benefit from the "sprinter service" offered by the company and need to be visited within the timeframe 10 a.m. -1 p.m., i.e., an interval of three hours. This characteristic makes this CVRP a CVRPTW, since the time spans impose constraints on how the routes can be built. The necessary data for the application of the propose approach are the distances between all the possible pairs of nodes (constituting the distance matrix), appropriately converted into travel time, and the weights associated to each delivery. This latter one reflects the capacity constraint, which is expressed in terms of weight of the freight shipped; the volume of the shipment, instead, is not taken into account in the evaluation. The full set of data is too big to be fully reported in the paper, but it could be provided to the interested readers upon request.
Note that no information was provided as far as the goods transported; anyway, considering that in an e-commerce context the quantities ordered can be very low (e.g., a single item such as a book, a garment rather than a small domestic accessory) as well as their weight (for instance, a book may weigh approximately 1-2 kg), the available capacity of the vehicles appears to be reasonable.
The number of permutations I made on the initial round trip configuration was set at 200, which corresponds to the number of solutions in the population; the number of iterations ite, instead, was set as N − M = 65.
At the end of the last iteration, among the whole set of solutions generated (200), 50 of them resulted in being respectful of all the constraints, and the route returning the best value in terms of objective function, i.e., the minimum travel time, was found to be as follows (Tables 4 and 5, respectively, for the two vehicles): The two objective values for the first and the second route, respectively, are 166 (2.76 h) and 329 min (5.48 h), serving 34 and 33 different customers. The critical path is again the second, which however, is quite different from that returned by the adapted WWO, and in particular, it is 103 min longer. Specifically, as far as vehicle 1, the nearest neighbor made the solution 9.2% worse in terms of extra time; same reasoning goes for vehicle 2, which should have to travel 45.6% longer in time.
Moreover, if we compute the total time from the two routes as returned by the algorithms tested, we obtain 378 min (6.3 h) in the case of WWO and 495 min (8.25 h) with the nearest neighbor heuristic, meaning that the WWO could improve the solution by reducing the overall travel time by up to 23.64%. This latter value in minutes, however, would be outside the normal work shift of one day (8 h) and therefore, the resulting solution would not be suitable for application in practice. Overall, the proposed algorithm is able to identify a better solution in terms of efficiency, which is also suitable for direct implementation.
Results were clearly provided to the express courier as well, and the company's managers positively assessed the tool according to their practical experience.

Conclusions
In the present manuscript, an adapted version of a metaheuristic algorithm, namely the water wave optimization is presented, applied to the vehicle routing problem with time windows and capacity constraints. The aim of the application is to optimize the route travelled by a fleet of transport means for reducing unnecessary road travelled and thus reducing impactful emissions and inefficiencies. The algorithm was implemented through the software MATLAB®, and it was tested on a working day of an express courier operating in the South of Italy, in the province of Caserta. Results were satisfactory, both in terms of computational time (22 s) and of the solution identified; indeed, by comparing the routes found by the adapted WWO to those obtained by applying the heuristic nearest neighbor, the first algorithm has demonstrated better performance and outcomes, being able to improve the solution found by the heuristic algorithm by 23.64%, in terms of overall time spent by both the vehicles.
Moreover, the proposed approach could be easily adapted to problems different from those modelled in this paper. For instance, it could be used: • for minimizing cost instead of travelling time; • for modelling and solving a simple CVRP without time windows, by removing step 5 (i.e., the check on time constraints) from the algorithm structure.
Note that the time values used as input data do not take into account any inconvenience, traffic condition or real situations which can compromise the possibility of being respected; this could be particularly the case for big cities, where traffic or congestion can affect the real travel time of vehicles. Although neglecting these inconveniences is what is typically done in similar studies, this is for sure also a limitation of the current formulation of the problem. Moreover, the capacity constraint was modelled and assessed according to the weights of the items transported, without taking into account their volumes. In the case under examination, due to the limited size of the goods, it is reasonable to think that using the item's weight instead of the volume could be acceptable; however, in other contexts, the lack of one of these aspects could represent a stringent limitation. This point will be taken into account in future applications.
Theoretically, anyway, the outcomes are brilliant. With regard to the future, our ongoing research activity is intended to improve the programming language, to avoid redundancies and streamline the procedure, for reducing the average number of iterations and further improve the computational time. Higher programming languages and/or software applications can also be involved as well.
For sure, we mean to carry out a set of experiments on known benchmark sets (e.g., see [8] or [26]) and compare their results; a further possible research activity could also address the comparison of the results and the effectiveness of the algorithm with other metaheuristics (e.g., genetic algorithms), whose original formulation would probably need to be adapted for the resolution of the VRP problem with time windows, in case appropriate adaptations do not exist.
Finally, other case studies are highly recommended to further validate the adapted version of the WWO to the CVRPTW. Funding: This research received no external funding.

Conflicts of Interest:
The authors declare no conflict of interest.