Benchmark dataset for the Asymmetric and Clustered Vehicle Routing Problem with Simultaneous Pickup and Deliveries, Variable Costs and Forbidden Paths

This paper presents a benchmark dataset for the Asymmetric and Clustered Vehicle Routing Problem with Simultaneous Pickup and Deliveries, Variable Costs and Forbidden Paths (AC-VRP-SPDVCFP). This problem is a specific multi-attribute variant of the well-known Vehicle Routing Problem, and it was originally built to model and solve a real-world newspaper distribution problem with recycling policies. The benchmark is composed of 15 instances with 50–100 nodes each. The dataset was designed using real geographical positions located in the province of Bizkaia, Spain. A detailed description of the benchmark is provided in this paper, extending the details and experimentation given in the paper "A discrete firefly algorithm to solve a rich vehicle routing problem modelling a newspaper distribution system with recycling policy" (Osaba et al.) [1]. The dataset is publicly available for use and modification.


Data description
The benchmark accompanying this article consists of 15 dataset instances for the Asymmetric and Clustered Vehicle Routing Problem with Simultaneous Pickup and Deliveries, Variable Costs and Forbidden Paths (AC-VRP-SPDVCFP). This problem is a rich variant [3] of the well-known Vehicle Routing Problem, and it represents a newspaper delivery problem with recycling policies. These are the main characteristics of the problem:
Asymmetry [4]: The cost of traveling from one client to another differs from the cost of the reverse trip. Furthermore, every relation between two different nodes is affected by this asymmetry.
Clustered [5]: All nodes in the scenario are grouped into different sets or clusters. Additionally, this condition involves a hard restriction: if a vehicle visits a client, it must [...]

Specifications Table
Subject: Artificial Intelligence
Specific subject area: Control and Optimization, Discrete Mathematics and Optimization
Type of data: [...]

All datasets are freely available in Ref. [9], and also in Mendeley. The benchmark is composed of 15 different instances, a descriptive map with the geographical distribution of the clients, an informative text file with the latitudes and longitudes of all customers, and an indicative XML parser to facilitate the reading of the instances. All datasets are named Osaba_X_Y_Z, where X is the number of customers, Y the distribution of the clusters, and Z the distribution of the clients. These details are explained in depth in the following section.
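The naming convention can be decoded mechanically. The following Python sketch is only illustrative: the function and field names are hypothetical, and the optional ".xml" suffix is an assumption based on the instances being XML files.

```python
from typing import NamedTuple


class InstanceName(NamedTuple):
    customers: int             # X: number of customers
    cluster_distribution: int  # Y: distribution of the clusters
    client_distribution: int   # Z: distribution of the clients


def parse_instance_name(name: str) -> InstanceName:
    """Split a name such as 'Osaba_50_1_3' into its X, Y and Z fields."""
    # Assumption: a trailing '.xml' extension may be present and is ignored.
    stem = name[:-4] if name.lower().endswith(".xml") else name
    prefix, x, y, z = stem.split("_")
    if prefix != "Osaba":
        raise ValueError(f"unexpected prefix in {name!r}")
    return InstanceName(int(x), int(y), int(z))


info = parse_instance_name("Osaba_50_1_3")
print(info.customers, info.cluster_distribution, info.client_distribution)
```

A named tuple keeps the three fields readable at the call site while still comparing equal to a plain tuple.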
Fig. 1 shows a brief excerpt of a dataset. All instances are XML files containing the list of all clients that comprise the dataset. The information of the depot is also included in this list. For each client, the following information is provided: address, identification, cluster, delivery demand, pickup demand, coordinate X and coordinate Y.
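As a rough illustration of how such a file could be read, the sketch below parses a small XML string with Python's standard library. The tag names, the sample values and the `Client` class are all assumptions made for this example; they are not taken from the published files or from the parser shipped with the benchmark, so they would need to be adapted to the actual schema.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List

# Hypothetical excerpt: tag names and values are invented for illustration
# and do not reproduce the schema of the published instances.
SAMPLE = """
<clients>
  <client>
    <address>Example street 1</address>
    <id>1</id>
    <cluster>1</cluster>
    <deliveryDemand>10</deliveryDemand>
    <pickupDemand>5</pickupDemand>
    <coordX>43.26</coordX>
    <coordY>-2.93</coordY>
  </client>
</clients>
"""


@dataclass
class Client:
    address: str
    ident: int
    cluster: int
    delivery: int
    pickup: int
    x: float
    y: float


def read_clients(xml_text: str) -> List[Client]:
    """Collect the seven per-client fields listed in the text."""
    root = ET.fromstring(xml_text.strip())
    clients = []
    for node in root.iter("client"):
        clients.append(Client(
            address=node.findtext("address", default=""),
            ident=int(node.findtext("id")),
            cluster=int(node.findtext("cluster")),
            delivery=int(node.findtext("deliveryDemand")),
            pickup=int(node.findtext("pickupDemand")),
            x=float(node.findtext("coordX")),
            y=float(node.findtext("coordY")),
        ))
    return clients


for client in read_clients(SAMPLE):
    print(client)
```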

Experimental design, materials, and methods
This section details the process followed to generate the benchmark. To put the datasets in context, the geographical locations of the depot, customers and clusters around the province of Bizkaia are illustrated in Fig. 2. Further information can be found in Ref. [1].
As introduced above, the benchmark is composed of 15 instances, each with 50–100 nodes. Each node represents a client placed at a specific geographical location. The maximum number of clusters has been set to 10, and there are also datasets with 5 and 8 clusters.
The clusters have been built following the order of appearance: the first ten customers comprise the first cluster, the next ten clients form the second cluster, and so on. It is also important to highlight that all the clusters in a dataset contain the same number of nodes. Regarding the assignment of delivery and pick-up demands, the following formula has been used, where d_i and p_i denote the delivery and pick-up demand of client i:
\[ d_i = 10, \quad p_i = 5, \quad \forall i \in \{1, 5, 9, \ldots, 97\} \]
\[ d_i = 10, \quad p_i = 0, \quad \forall i \in \{2, 6, 10, \ldots, 98\} \]
Additionally, the well-known Euclidean distance has been employed to calculate the cost of traveling from any client i to another customer j, using a constant of either 0.8 or 1.2 to guarantee the asymmetry feature of the problem. Furthermore, peak travel costs are obtained by increasing the valley travel costs with an additional constant of 1.2 or 1.4 (for odd or even clients, respectively).
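The generation rules above can be sketched in a few lines of Python. Several details are not fixed by the description and are assumptions of this sketch only: the 0.8 factor is applied when i < j and 1.2 otherwise, the peak multiplier is chosen by the parity of the origin client i, and clients outside the two demand classes given above return no demand. All function names are hypothetical.

```python
import math
from typing import Optional, Tuple

Point = Tuple[float, float]


def cluster_of(client: int, cluster_size: int = 10) -> int:
    """1-based cluster index of a 1-based client, by order of appearance."""
    return (client - 1) // cluster_size + 1


def demands(client: int) -> Optional[Tuple[int, int]]:
    """(delivery, pick-up) demand for the two client classes stated above."""
    if client % 4 == 1:   # clients 1, 5, 9, ..., 97
        return 10, 5
    if client % 4 == 2:   # clients 2, 6, 10, ..., 98
        return 10, 0
    return None           # other classes are not specified in this section


def valley_cost(a: Point, b: Point, i: int, j: int) -> float:
    """Asymmetric valley cost from client i (at a) to client j (at b).

    ASSUMPTION: factor 0.8 when i < j and 1.2 otherwise; the text only
    states that one of the two constants is used.
    """
    factor = 0.8 if i < j else 1.2
    return factor * math.dist(a, b)


def peak_cost(a: Point, b: Point, i: int, j: int) -> float:
    """Peak cost as an increased valley cost.

    ASSUMPTION: the parity of the origin client i selects the multiplier
    (1.2 for odd, 1.4 for even).
    """
    multiplier = 1.2 if i % 2 == 1 else 1.4
    return multiplier * valley_cost(a, b, i, j)


a, b = (0.0, 0.0), (3.0, 4.0)
print(cluster_of(11), demands(5), valley_cost(a, b, 1, 2), valley_cost(b, a, 2, 1))
```

Under these assumptions the two directions between the same pair of clients always differ, which is the property the asymmetry feature requires.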
Lastly, a pre-established number of random streets are selected as forbidden. Table 1 summarizes the characteristics of each generated instance. To make this table easier to understand, the following clarifications should be made: Osaba_50_1_1 and Osaba_50_1_2 are composed of 5 clusters, namely clusters {1, 3, 5, 7, 9}. Osaba_50_2_1 and Osaba_50_2_2 are built from clusters {2, 4, 6, 8, 10}. Clusters in Osaba_50_1_3 and Osaba_50_1_4 comprise 5 nodes each, taking the first five customers of each cluster. By contrast, the 10 clusters of Osaba_50_2_3 and Osaba_50_2_4 are built from the last five clients of every set. For building the Osaba_80_X datasets, the first 8 clusters or nodes (depending on the case) have been chosen. Finally, for vehicle capacities and forbidden paths, values of 240 and 5, respectively, have been chosen for odd instances (Osaba_X_X_1, Osaba_X_X_3, …), and values of 160 and 10 for even datasets.
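The parity rule for capacities and forbidden paths can be expressed as a small lookup. In the Python sketch below, the function names are hypothetical, and two points are assumptions: the forbidden-path value is treated as a count (the unit is not stated here), and forbidden paths are drawn uniformly at random among directed arcs between distinct nodes.

```python
import random
from itertools import permutations
from typing import Dict, List, Tuple


def instance_settings(name: str) -> Dict[str, int]:
    """Vehicle capacity and forbidden-path value from the last name field Z."""
    z = int(name.rsplit("_", 1)[1])
    if z % 2 == 1:   # odd instances: Osaba_X_X_1, Osaba_X_X_3, ...
        return {"capacity": 240, "forbidden_paths": 5}
    return {"capacity": 160, "forbidden_paths": 10}


def pick_forbidden_arcs(n_nodes: int, how_many: int,
                        seed: int = 0) -> List[Tuple[int, int]]:
    """Draw distinct directed arcs (i, j), i != j, to mark as forbidden.

    ASSUMPTION: uniform sampling without replacement; the seed is only
    there to make the illustration reproducible.
    """
    arcs = list(permutations(range(n_nodes), 2))
    return random.Random(seed).sample(arcs, how_many)


settings = instance_settings("Osaba_50_1_1")
forbidden = pick_forbidden_arcs(n_nodes=6, how_many=settings["forbidden_paths"])
print(settings, len(forbidden))
```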