A tailored Benders decomposition approach for last-mile delivery with autonomous robots

This work addresses an operational problem of a logistics service provider that consists of ﬁnding an optimal route for a vehicle carrying customer parcels from a central depot to selected facilities, from where autonomous devices like robots are launched to perform last-mile deliveries. The objective is to minimize a tardiness indicator based on the customer delivery deadlines. This article provides a better understanding of how three major tardiness indicators can be used to improve the quality of service by minimizing the maximum tardiness, the total tardiness, or the number of late deliveries. We study the problem complexity, devise a unifying Mixed Integer Programming formulation and propose an eﬃcient branch-and-Benders-cut scheme to deal with instances of realistic size. Numerical results show that this novel Benders approach with a tailored combinatorial algorithm for generating Benders cuts largely outperforms all other alternatives. In our managerial study, we vary the number of available facilities, the coverage radius of autonomous robots and their speed, to assess their impact on the quality of service and environmental costs.


Introduction
The growth of the world population living in urban areas, reaching 54%, poses many logistic challenges.
Designing efficient and effective transportation systems for both goods and passengers, while ensuring mobility, safety and sustainability, is the main challenge for modern city logistics. While both environmental and social factors are increasingly considered in new transportation technologies and models, the pressure for efficient distribution of goods also increases, with users often requiring the so-called same-day deliveries (Bertsimas et al., 2019;Savelsbergh & Woensel, 2016;Taniguchi, 2014;Taniguchi et al., 2014Taniguchi et al., , 2016. In 2016, the cost of global parcel delivery, excluding pickup, line-haul, and sorting, amounted to approximately EUR 70 billion (Joerss et al., 2016). According to the aforementioned McKinsey report, over the next ten years, market volumes in Germany and the US might reach 5 billion and 25 billion parcels per year, respectively. The biggest share (often higher than 50%) in total parcel delivery cost goes to last-mile delivery. Innovative and disruptive last-mile concepts have been proposed to cope with the increasing demand for logistic efficiency and competitive prices: pickup points networks, integrated public and freight transportation, deliveries directly into the customer car's trunk, crowd-shipping, and more recently, the use of unmanned aerial vehicles (drones) and self-driving autonomous robots (Qi et al., 2018;Savelsbergh & Woensel, 2016).
The use of drones for performing deliveries has gained increasing interest recently, with many authors investigating new mathematical models, exact and heuristic algorithms, expanding the literature on classical Traveling Salesman Problems (TSP) and the Vehicle Routing Problems (VRP). However, from a regulation point of view, the adoption of drones on practical delivery scenarios has been rendered increasingly difficult due to stricter rules concerning their operation and safety, especially in urban areas. In this context, selfdriving robots have an advantage as they are designed to operate at low speeds, e.g., pedestrian speed, so that they can safely share existing sidewalks and bike lanes with people. Self-driving delivery robots were introduced much later than drones, however many initiatives can now be found such as the self-driving robots developed by e-novia (2020), , and Twinswheel (2020) that have been tested in many cities around the world. More recently, Amazon also announced the development of their own self-driving delivery robots, called Scout (Amazon, 2020). FedEx also tested its six-wheeled autonomous robot, called the Roxo SameDay Bot (FedEx, 2020) and is testing the Ford Digit robot for delivery. The robots by e-novia, Startship and Amazon are depicted in Fig. 1. These are sidewalk robots, with a capacity to perform a single delivery before returning to the assigned facility. There exist other recent studies (see, e.g., Jennings & Figliozzi (2019) in which the routing of road autonomous robots with a larger capacity is considered.
The latter allows robots to deliver multiple customers within the same route, but this is out of scope of the current paper.
From the modeling and planning perspective, drone-based and robot-based deliveries share some similarities. As pointed out by Boysen et al. (2018), a major difference (which also distinguishes our research from the existing literature) is the fact that drones allow for unattended delivery, which is (currently) not possible with robots. This is why the literature on drones focuses on minimizing the makespan, rather than on the attended-delivery tardiness indicators studied in this paper. Moreover, drones travel at a higher speed, so that the trucks can collect them en-route and reuse them for later deliveries, which does not apply to self-driving robots. Further important differences are highlighted in Boysen et al. (2018), Clarke & Moses (2014), Goodchild & Toy (2018), and Jones (2017).
Our contribution. This work addresses the problem of a logistics service provider (LSP) of finding an optimal route for a vehicle carrying customer parcels from a central depot to selected facilities, from where autonomous devices like self-driving robots are launched to perform attended last-mile deliveries. We call that problem the Uncapacitated Routing-Scheduling Problem (URSP). Due dates for customer delivery are agreed beforehand. The objective is to serve all customers in a timely fashion.
Our contributions can be summarized as follows: • As delays in the last-mile delivery may be unavoidable, we introduce three problem variants in which we minimize the following tardiness measures: the maximum tardiness, the total tardiness, or the number of late deliveries. While similar tardiness indicators have been considered in the humanitarian relief operations (Campbell et al., 2008;Huang et al., 2012), they have been mostly neglected for last-mile delivery.
• We study the complexity of all these problem variants and devise a generic Mixed Integer Linear Programming (MILP) formulation for them.
• The resulting MILP model involves a large number of variables and becomes intractable for instances of realistic size. We therefore propose a new problem-tailored Benders decomposition framework capable of handling all three URSP variants in a unifying way.
• This new exact method is based on a normalization approach which guarantees that the generated Benders cuts are sparse and numerically stable and that they can be separated using an efficient combinatorial procedure.
• The method is implemented in a branch-and-cut fashion using a modern general purpose MILP solver, which significantly improves the scalability of our approach.
• We use our new mathematical model and the exact algorithm applied in a realistic setting to answer the following questions: -What is the impact of the coverage radius or the speed of self-driving robots (or alternatively, the number of available facilities) on the quality of service (QoS) measured by one of the three tardiness indicators? -How are the robot speed and coverage radius affecting the environment in terms of: a) the distance traveled by the delivery truck, and b) the CO 2 emissions? -How is the structure of the optimal solution affected by the choice of the tardiness indicator?
While VRPs combined with drone deliveries can be seen as an established concept in the literature, to the best of our knowledge our paper provides a first exact solution framework to optimize three different tardiness indicators in the context of robot-based last-mile operations. The recent work by Boysen et al. (2018) also exploits self-driving robots in combination with classical transportation problems in a more general variant where robots can also be collected en-route and launched from the trucks. They propose a mathematical model with a single tardiness indicator and provide optimal solutions for instances with less than 10 customers, and a multi-start heuristic for larger instances. Our exact method provides optimal solutions for instances with up to 100 customers, which allows an in-depth managerial study involving three different tardiness indicators. It is complementary to the one provided by Boysen et al. (2018) as the robots cannot be carried on board on the latter one.
Outline of the paper. In the remainder of this section we provide a literature overview. A formal definition for the URSP is given in Section 2, where we also investigate its complexity. A multi-commodity network flow MILP formulation is presented in Section 3. Due to the complexity of the model, we propose to solve it by a Benders Decomposition approach outlined in Section 4. Details on the separation of Benders cuts and subtour-elimination inequalities, and the computation of lower and upper bounds, are given in Section 5. In Section 6, the three variants of our Benders approach are implemented and their performance is analyzed.
We assess the impact of varying the number of available facilities, the robots speed and the robots coverage radius. Concluding remarks and future works are discussed in Section 7.

Related literature
To the best of our knowledge, the work of Boysen et al. (2018) (see below) is the only one that combines scheduling and routing aspects, in the context of two-tier urban logistics for parcel deliveries using robots.
The remaining literature focuses on the minimization of operational costs, which may include facility opening and/or allocation costs, while (optionally) respecting time windows. Moreover, most of the articles provide heuristic methods, sometimes combined with simulations. In this section we briefly overview these recent related works and point out the differences to some classical network design and routing problems. Boysen et al. (2018) studied a capacitated single vehicle scheduling problem with truck-based autonomous robots for last-mile deliveries. Different from the URSP, in their problem setting the truck is loaded with both customer packets and robots. Also, besides the central depot, the customers and the robot facilities, the network has additional drop-off points from where robots in truck can be launched to perform deliveries.
The truck can be reloaded with new robots in the robot facilities. The aim is to find a feasible route for the vehicle such that the number of customers served late is minimized. The authors analyze the complexity of the problem and propose a MILP formulation (which is capable of solving only small instances with up to 10 customers) and a heuristic method for larger instances, evaluated on randomly generated instances.
Related Network Design and Routing Problems. The URSP resembles in some aspects the ring-star problem (Labbé et al., 2004), the median cycle problem (Labbé et al., 2005), the Steiner ring-star problem (Xu et al., 1999), or the Traveling Purchaser Problem (Manerba et al., 2017), which are all routing-location-allocation problems with an underlying ring-star structure. Traditional applications of these problems are in the design of circular shaped transportation infrastructure (e.g, a metro line or a motorway), in the design of telecommunication networks, or routing.
The major difference between the URSP and the above mentioned problems is in the nature of the objective function, which minimizes the setup costs of the selected nodes and edges in the ring and the assignment cost of those nodes not in the ring. Hence, these are pure network network-design problems that mainly address strategic decisions with no scheduling aspects included. On the other hand, the URSP is a scheduling problem in which we assume that the infrastructure is already available, and hence, the major goal is to minimize the indicators for late deliveries. Indeed, in the URSP, we have facility nodes and customer nodes, the ring is composed of facility nodes only, but not necessarily all of them will be used and there is no construction or setup cost as no ring is actually constructed.
In another family of two-echelon Vehicle Routing Problems (2E-VRP), the second leg of delivery corresponds to routes instead of stars (see, e.g., the recent surveys by Cuda et al. (2015) and Guastaroba et al. (2016)). Among the 2E-VRP studies that focus on sustainable applications for e-commerce and city logistics, we highlight the work of Enthoven et al. (2020) who recently introduced the 2E-VRP with Covering Options. In this problem, the first echelon consists of truck routes departing from a single depot to visit two type of locations: covering locations or satellite locations, from where goods are picked up by or delivered to the customers. At covering locations with parcel lockers, customers can pick up goods themselves, whereas at satellite locations, goods are transferred to zero-emission vehicles (such as cargo bikes) that perform last-mile deliveries. The objective is to design routes covering the demands at minimum cost.
Related Two-Tier Urban Logistics Problems. We now highlight several recent studies that model last-mile deliveries using self-driving robots. The major difference to our approach is in the nature of the objective function (these studies focus on the minimization of operational costs) and in the methodology (they are all heuristics). Many of these studies also analyse the trade-offs and the pros and cons of the green last-mile deliveries, whereas in our work we primarily focus on the scheduling aspects, assuming that the necessary infrastructure is already provided. Bakach et al. (2021) study a robot-based urban delivery by applying a sequential approach: in the first phase, they solve a facility location MILP formulation to find the minimum number p of locations of robot depots to open. In the second phase, they solve a p-median MILP formulation in which the operational cost for robot deliveries are minimized. An advantage of such a two-phase method (as opposed to an integrated approach, like ours) is that larger instances can be tackled. On the other hand, the approach of Bakach et al. (2021) is a heuristic and there is no guarantee that a globally optimal solution can be found.
Using a similar two-tier network for performing deliveries with robots, Poeting et al. (2019a) analyze two time slot selection policies for the LSP: in the first one, the customer chooses a due window, whereas in the second one, the customer triggers the delivery on demand. The authors use a TSP formulation to find the tour of the truck, and a Simulated Annealing heuristic to find the reassignment, allocation, and scheduling of the delivery robots. An agent-based model is used to simulate several days of parcel deliveries, based on the geographical data for the city of Cologne in Germany. The agent-based simulation approach is also used in Poeting et al. (2019b), where the authors compare the last-mile delivery of parcels using conventional truck-based deliveries against robot-based deliveries. For the truck only optimization problem, they propose a heuristic based on an MILP model for solving the TSP with Precedence Constraints. The robot-based delivery problem is modeled as an Orienteering Problem with Multiple Time Windows and a heuristic is presented. Sonneberg et al. (2019) study the Electric Location Routing Problem with Time-Windows, which employs autonomous robots for last-mile delivery of parcels. The objective is to select the best location of the robot depots and the corresponding robot routes, while minimizing daily operational costs. The authors analyze how the number of compartments in the robot affects the solution costs and the amount of robots used.
Another simulation-optimization framework that focuses on the trade-off between costs and operations for multi-modal last-mile deliveries can be found in Brotcorne et al. (2019); Perboli et al. (2018). A very recent survey on the routing problems for e-commerce and last-mile delivery is given by Archetti & Bertazzi (2021).
Drone-based deliveries. As pointed out in the introduction, drone-based deliveries allow for unattended delivery, and therefore the underlying models do not need to take into account scheduling constraints.
Concerning the related literature on drone-based deliveries, the closest setting to ours has been considered by Kim & Moon (2018), in which a delivery truck loaded with parcels leaves the depot and has to serve all the customers. However, contrary to our assumptions, only a single drone station is allowed (see our central facility policy in Section 2.2) and deliveries are also allowed directly from the truck without using drones. Chauhan et al. (2019) investigate the Maximum Coverage Facility Location Problem with Drones 6 (MCFLPD) which seeks to locate a given number of facilities and assign drones to them, which will serve the demand points. The authors describe an MILP formulation which incorporates drone energy consumption and range constraints. They introduce greedy and three-stage heuristics for solving large instances. Multiple scenarios are considered to show the impact of the drone battery capabilities on coverage. Even though the URSP resembles the MCFLPD in the way that it also has to select a subset of robot facilities, no scheduling aspects are incorporated into the MCFLPD. Kitjacharoenchai et al. (2020) study a synchronized truck-drone 2E-VRP which allows multiple drones to fly from a truck, serve one or multiple customers, and return to the same truck for battery swap and package retrieval. Truck routes start and finish at a main depot and customers are visited either by a truck or a drone. In the second echelon of the problem, the trucks can be seen as moveable intermediate depots. The objective minimizes the total arrival time at the depot after completing the deliveries.
For a broader overview, we refer the interested reader to recent surveys on drone-delivery problems which can be found in Chung et al. (2020) and Macrina et al. (2020).

Problem statement and complexity analysis
In this section we provide a formal problem definition and show that finding optimal routes and allocations of customers to distribution facilities is an N P-hard problem, whatever the tardiness indicator. We also show that if the automated vehicle is at least as fast as the truck and the coverage radius of each device is large enough, the problem becomes tractable. In this case, we show that a truck delivery to a "central" facility represents the optimal delivery policy.

The three variants of the Uncapacitated Routing-Scheduling Problem (URSP)
In the Uncapacitated Routing-Scheduling Problem (URSP) a vehicle departing from a central distribution depot at a predefined time, loaded with the packets for a group of customers, has to visit a subset of facilities where the packets are unloaded and then transported to their respective customers by autonomous robots.
In addition, each delivery is supposed to take place before a given deadline. The objective is to minimize a tardiness indicator concerning the packets delivered after the deadline. Fig. 2 shows an instance of the problem and a feasible solution. The delivery deadlines for each customer are in brackets, and the travel times between two points are on the arcs. On the solution proposed, 6 clients are served late, with a maximum tardiness of 3 units, and a cumulative tardiness of 10 units.
More formally, we consider a distribution network composed of a central depot, denoted by 0, a set F of local distribution facilities and a set of customers C. Let F 0 = F ∪ {0} denote the set of facilities including the depot, then set A F = {(i, j) ∈ F 0 × F 0 , i = j} and A C = F × C. The network is modeled by a directed graph G = (V, A), where V = C ∪ F 0 is the set of nodes, and A = A F ∪ A C , is the set of arcs. For each arc (i, j) ∈ A, let t ij be the time required for traveling from i to j. Observe that if some customer k ∈ C is not connected to facility i ∈ F , one can still assign a time value t ik ≥ L on arc (i, k), where L is a sufficiently large value. Therefore, we can assume all arcs exist between C and F . The following assumptions are made: • A single vehicle is used to transfer the parcels for a set C of customers from the depot to a subset • At each facility i ∈ F , autonomous robots are launched to deliver the packets to some subset of customers C i ⊆ C assigned to it. Each robot can deliver a single customer, and then it returns to its facility.
• No customer is directly served by the depot. This is not a limiting assumption as a facility could have the same position as the depot.
• A sufficient number of autonomous robots is available at each open facility.
• Without loss of generality, service times and preparation times are integrated into the travel times t ik .
• The range of robots allows to reach every customer from at least one of the facilities (otherwise, the customers that cannot be reached from any of the available locations can be preprocessed and removed from the model).
• The delivery to each customer k ∈ C has to take place before a due date u k . A penalty w k may be applied per late time unit. These penalty weights w k , if different than one, can be used to express priorities among customers.
Let T ⊂ A be the tour which starts at 0 and visits all facilities in F ⊂ F . We denote by t T (i, j) the traversal time from node i to node j in tour T , for i, j ∈ F ∪ {0}. Assuming that the truck departs from the depot at time 0, for each k ∈ C let t * T (0, k) = min i∈F {t T (0, i) + t ik } be the traversal time of the shortest path, denoted P * T (0, k), connecting the depot to client k via arcs of tour T . Then, we define the positive slack between the travel time along the routing path P * T (0, k) and due date time as: Additionally, for each k ∈ C, let l k (T ) = 1 if k is served late in tour T , i.e., [t * T (0, k) − u k ] + > 0, and l k (T ) = 0 otherwise. The following objective functions (tardiness indicators) are considered in this paper: • (min-max) Minimize the (weighted) maximum tardiness: • (min-sum) Minimize the sum of (weighted) tardiness: • (min-num) Minimize the (weighted) number of late deliveries : These objective functions allow to capture different preferences of decision makers. Objective min-num is the most restrictive as it focuses on minimizing the number of late deliveries, thus making no differentiation between deliveries which are one minute late or one hour late. Thus, it tends to preserve the overall quality of service and to reduce the number of potential customer complaints. On the other hand, objectives min-max and min-sum minimize the amount of time by which a customer is served after the deadline. Nevertheless, among the latter two functions, objective min-max does not directly reduce the number of customers served late as it only focuses on a single value which is the maximum delay and all other late deliveries can reach that maximum delay with no impact on the objective.
We also make use of the following notation. Given a customer k ∈ C, let F (k) ⊆ F be the subset of facilities that can serve k (i.e., facilities i with t ik < L). Given the subset

Complexity and polynomially solvable cases
So far we have not made any particular assumption related to the speed of the delivery truck versus the speed of the autonomous robots, i.e., regarding the structure of the travel times t ij , (i, j) ∈ A. Nevertheless, the URSP is a two-echelon transportation problem in which different types of vehicles are used in each delivery stage. Such vehicles are clearly traveling at different speeds, with the autonomous robots being significantly slower than the truck Consequently, the travel times t ij , (i, j) ∈ A, can be decomposed into two components: times on arcs in A F between facilities (referring to the travel times of the truck), and times on arcs in A C between facilities and customers (referring to the travel times of autonomous vehicles).
For the special case of the URSP in which the travel matrix satisfies the triangle inequality, which happens, e.g., when the robots are at least as fast as the truck, we define the central-facility policy as the strategy in which the truck is sent directly from the depot to a single facility i * ∈ F from where all customers k ∈ C will be served by autonomous robots, before turning back to the depot. Under this policy we can show that all three problem variants can be solved in polynomial time. Notice that if we assume that the robots have a limited range coverage the triangle inequality may be violated, as not all customers could be reached from each facility.
Proposition 1. The central-facility policy is optimal for all three variants of the URSP if the travel times Proof where v is the speed of the truck. Let us assume that shortest path distances d ij are pre-computed for each pair of distinct nodes i, j ∈ F ∪ C. These distances d ij satisfy the triangle inequality, so on A F travel times t ij = d ij /v also satisfy the triangle inequality. Now for i, j ∈ F Hence travel times satisfy the triangle inequality on A, so the conditions of Proposition 1 are satisfied.
The following proposition shows that, in general, the URSP is N P-hard to solve. All related proofs are given in Appendix B.
Proposition 2. If the travel times t ij , (i, j) ∈ A, do not satisfy the triangle inequality, then all three variants of the URSP studied in this paper are N P-hard, even if travel times on A F satisfy the triangle inequality.
We point out that Boysen et al. (2018) prove N P-hardness for the min-num variant of the problem, in which the truck is allowed to carry robots along the route. If we assume that the robot-capacity of the truck is zero, then their proof of NP-hardness is also valid for our min-num URSP. This is why in the Appendix B we only provide a detailed proof for the min-max and min-sum URSP.
The next section describes an extended MILP formulation for the problem. The formulation uses flow variables to model delivery paths for each single customer, i.e., it belongs to a family of multi-commodity flow (MCF) formulations for routing problems (see, e.g., Letchford & Salazar-González (2015)).

MILP formulations
In the URSP, we assume that there are enough autonomous robots available at the distribution facilities so that all the delivery requests can be covered. The model is based on the following property: Property 3. For all three URSP variants, there always exists an optimal solution such that the delivery truck stops at each distribution facility at most once.
We now devise a multi-commodity flow MILP formulation for the problem. The proposed model uses the following decision variables: is on the tour of the delivery truck 0, otherwise Notice that variables f k ij can be used to model both arcs defining the vehicle tour and assignment of customers to facilities. However, to avoid redundancies, variables f k are defined only on A F and for the arcs on A C we use the additional set of variables z ik to represent the assignment of customer k ∈ C to facility The set of URSP feasible solutions can now be modeled by the following set of constraints: The left hand side of constraints (1) computes the travel time to reach a customer k ∈ C. Constraints (2) enforce flow conservation, while constraints (3) and (4) forbid flow through arcs which are not in the tour.
Finally, with (9), we guarantee that the arcs chosen from A F define a tour composed of facility locations and the depot. The set of possible tours X for the delivery truck must satisfy the degree constraints (10)- (12) and the subtour elimination constraints (13): To model the min-max objective function, we introduce an auxiliary variable t ≥ 0 which represents the objective value. The overall model for the min-max URSP reads as follows: We notice that constraints (1)-(9) may not be sufficient to prevent that facilities with no assigned customers are included in a tour. These empty facilities could only appear at the end of the tour, when the delivery truck is empty. Since it has no influence on the solution cost, they can be removed in a post-processing phase. Otherwise, if the travel times on A F satisfy the triangle inequality, additional constraints (15) can be added to ensure that a selected facility i ∈ F serves at least one customer: Similarly, constraints (13) are not redundant in our model. This is a difference between the URSP and some classical network design or routing problems (see Section 1.1). In the ring-star problems for example, constraints (2)-(3), together with non-negative route costs guarantee that optimal solutions will contain no subtours. However, because there is no cost associated to variables x in the URSP, and the objective function focuses on the tardiness indicators, if we leave out constraints (13), redundant subtours could be created, and they would need to be eliminated in a post-processing phase.
The other two variants of the URSP can be modeled in a similar way. First, the min-sum URSP, which aims at minimizing the sum of (weighted) tardiness, can be obtained as: The minimization of the (weighted) number of late deliveries in min-num requires new binary variables: l k equals 1 if customer k is served late, and 0, otherwise. Then, min-num URSP is modeled as: where M k is an upper bound on the length of the path connecting 0 to k, computed as M k = (|F | * Variables l k are used to count the number of late deliveries and the validity of the bound for M k follows from Property 3, i.e., the number of stops made by the delivery truck is at most |F |. In all three models, variables s k , k ∈ C assume positive values whenever the traversal time of the path P T (0, k) chosen to reach customer k is longer than the due date u k , as defined in constraints (1). Initial lower bounds for the values of s k , k ∈ C can be obtained in a combinatorial way, using the following proposition.
Proposition 4. For each k ∈ C, let P * G (0, k) be the shortest path from 0 to k on graph G, and t * as the shortest path from 0 to k in G is shorter than the shortest path in the subgraph of G induced by T .
It is not difficult to see that variables z ik do not have to be binary in the above models. We have: Observation 5. In all three models described above, constraints z ik ∈ {0, 1} can be replaced by z ik ≥ 0, for all k ∈ C, i ∈ F (k), as flow conservation constraints (2) will force them to assume integral values.

13
MILP model (1)-(9) provides stronger lower bounds than some compact formulations typically used in vehicle routing problems (see, e.g., the Miller-Tucker-Zemlin formulations in Toth & Vigo (2014)). However, with the increasing number of customers, the model becomes intractable for many modern MILP solvers, as the number of decision variables is in O(|C| · |A|). Fortunately, Observation 5 allows us to project out flow and assignment variables (f and z, respectively) in a Benders-like fashion. Hence, at the cost of introducing a family of Benders cuts (which is exponential in size), the number of variables can be reduced to O(|A F | + |C|), making it intrinsically more scalable for larger input data. This, along with an efficient combinatorial algorithm for separating Benders cuts, will allow us to derive a highly effective exact solution method based on Branch-and-Benders-cuts, as detailed in the next section.

Benders decomposition
In the standard decomposition approach by Benders (1962), a mixed integer linear program can be solved in an iterative fashion by keeping the "complicating variables" (whose values must be integer) in the relaxed master problem and projecting out continuous variables (and the associated constraints) by replacing them with two families of Benders cuts: optimality and feasibility cuts. The former ones are used to properly bound the contribution of continuous variables in the objective, and the latter ones are added to guarantee that any solution of the relaxed master problem remains feasible with respect to original constraints. For each solution of the relaxed master problem, these cuts are separated by solving linear programs (LP), and hence, the process is repeated by alternating between solving integer linear programs (relaxed master problem) and LPs (Benders subproblems), until an optimal solution is found.
In this section we propose to solve the three variants of the URSP using a Benders decomposition approach in which routes for the distribution truck are generated at the master level (described by x variables), along with the delays s k (in time units) for serving customers k ∈ C. Flow and assignment variables are eliminated from the model, and hence, a vector (x, s) obtained by solving the relaxed master problem does not necessarily correspond to a feasible solution. In the jargon of Benders decomposition, this means that Benders feasibility cuts need to be generated to discard any infeasible combination (x * , s * ) in which delays s * cannot be guaranteed by letting the delivery truck follow the route described by x * ∈ X.
Our approach deviates from the above described classical Benders decomposition approach in two important aspects: 1. Benders feasibility cuts are typically separated by choosing (extreme) rays of the unbounded dual of the Benders subproblem. Many authors have observed that such implementations suffer from slowconvergence and instability, due to degeneracy of the underlying Benders subproblem and the fact that the rays returned by the LP solvers do not even have to be extreme (Fischetti et al., 2010;Magnanti & Wong, 1981;Wentges, 1996). Some of the proposed techniques to alleviate these issues are the generation of Pareto-optimal cuts (Magnanti & Wong, 1981;Papadakos, 2008), facet-defining cuts (Conforti & Wolsey, 2018) or a normalization of the dual of the Benders subproblem (Fischetti et al., 2010). Contrary to Fischetti et al. (2010), who propose to intersect the unbounded dual cone with a simplex, we propose an alternative and problem-tailored normalization approach. Moreover, instead of solving an LP, our normalization allows to exploit an efficient combinatorial procedure inspired by the one of Magnanti et al. (1986) to obtain the coefficients of the Benders cut.
2. Instead of solving each relaxed master problem as an integer linear program (ILP) as done in traditional implementations, we follow the line of research in which Benders cuts are separated on the fly in a branch-and-cut fashion (also called Branch-and- Benders-cut). Recent studies have shown that the latter allows for a significant boost in the performance of MILP solvers (Contreras & Fernández, 2014;Fischetti et al., 2016Fischetti et al., , 2017b.

Problem reformulation using problem-tailored normalized Benders cuts
For the ease of explanation, we will focus on the min-max URSP variant, the other two variants can be modeled analogously. After projecting out f and z variables from the MILP model shown in Section 3, the relaxed master problem (RMP) is initialized as follows: Once the solution (s * , x * ) of the RMP is found, it has to be checked whether there exists a possibility to ship the parcels through the network defined by the values of x * , and whether the tardiness for each customer does not exceed s * . In a standard Benders decomposition approach this is done by checking the feasibility of the LP defined by (1)-(5) in which the values of (x, s) variables are fixed to (x * , s * ). By Farkas Lemma, an (extreme) ray of the unbounded dual of this LP is taken to generate a Benders feasibility cut that will be added to the RMP to cut off the point (x * , s * ) and the process is repeated until a feasible (and hence optimal) (x * , s * ) is found.
However, we can look at the problem from a different perspective. There are namely two possible sources of infeasibility of the solution of the RMP: a) there exists a customer k ∈ C such that there is no path between the depot and k in the network defined by x * , and hence the delivery time is equal to +∞, or b) all customers are connected to the depot, but there exists k ∈ C such that the parcel cannot be delivered within the deadline u k + s * k . The first source of infeasibility can be resolved by explicitly imposing connectivity between the depot and each customer, adding constraints: which can be interpreted as Benders feasibility cuts, together with the subtour elimination constraints (which are part of the description of the set X).
To deal with the second source of infeasibility, we propose a problem-tailored normalization approach. We first observe that, after adding the feasibility cuts of the first type, the min-max URSP can be reformulated as follows: where, for a given route x * ∈ X, the function θ k (x * ) calculates the serving time for a customer k ∈ C: In the jargon of Benders decomposition, we observe that for any vector (x, s), Benders decomposition requires a solution of |C| independent subproblems of the form (24)-(29), called Benders subproblem associated to customer k ∈ C (given in its primal form).
Proof. Each x * ∈ X represents a non-trivial tour starting at the depot. Constraints (21) ensure that at least one among the facilities from which customer k ∈ C can be reached, must be visited by the delivery truck. Hence, the values x * provide a network along which one unit of flow can be sent from the depot to each k ∈ C, which is required to ensure that (24)-(29) is feasible.
We observe that constraints (21) are implied by the compact model, but once the flow variables are projected out, they need to be imposed explicitly. Furthermore, we observe that the results of Lemma 6 hold, even for fractional values of x * .
The function θ k (x) is convex in x, and to derive a valid Benders reformulation of the problem, we will replace the value function reformulation constraints (20) with an exponential family of linear underestimators of θ k (x) that we will refer to as normalized Benders cuts. "Normalization" comes from the fact that the polytope of the unbounded dual of the Benders subproblem (when the problem is formulated in a traditional way, for separating Benders feasibility cuts) is intersected with a hyperplane in which the dual variable associated to the constraint (1) is fixed to one. We point out that our approach is different from a standard normalization recipe proposed by Fischetti et al. (2010), in which the dual cone is intersected with a simplex (i.e., the sum of all dual variables is imposed to be equal to one). The major advantage of our problem-tailored normalization is summarized in the following proposition which gives a characterization of the Benders subproblem (24)-(29).
Proposition 7. Let x * ∈ X be a binary vector representing a subtour T that contains node 0 and also satisfies (21). Let F T ⊂ F 0 and A T ⊂ F T × F T represent the node set and arc set of T .
be the support graph that extends this subtour to arcs connecting k. Then the primal Benders subproblem (24)-(29) is the shortest path problem (SPP) on G T , whose value is t * T (0, k).
Proof. It is easy to see that objective function (24)  those arcs for which the master solution x * ij = 0, (i, j) ∈ A F , and also prevent facilities disconnected from 0 from serving customers. Consequently, only arcs in A T can be used to send flow from the depot to the last facility. Hence, the SPP is not longer solved on G, but on G T , which is a subgraph of G.
Consequently, given a customer k ∈ C, and a solution of the relaxed master problem (x * , s * ), a path P T (0, k) from 0 to customer k in subgraph G T defines a valid Benders cut if t * T (0, k) > u k + s * k . Moreover, valid Benders cuts can also be derived at fractional points x that satisfy all the constraints from the set X. It is not difficult to see that for fractional solutions x * of the master problem, the primal Benders subproblem becomes the minimum cost flow problem on graph G, in which arc capacities are defined as x * ij for (i, j) ∈ A F , and x * (δ − (i)) for (i, k) ∈ A C .
To derive appropriate Benders cuts, we exploit the LP-duality theory. The value of θ k (x * ) can be equivalently obtained by solving the dual of (24)-(29). As the following lemma shows, this dual can be slightly simplified. Let us associate the dual variables α i to constraints (25) for i ∈ F 0 , −β ij with β ij ≥ 0 to constraints (26) for (i, j) ∈ A F and α k to constraint (28).
Lemma 8. The dual Benders subproblem associated to customer k ∈ C can be formulated as: Proof. This dual formulation is obtained by eliminating constraints (27) from the primal Benders subproblem as we will show that they are redundant. Consider a facility node i ∈ F , and its incoming flow (j,i)∈δ − (i) f k ji . By the flow preservation constraints, the incoming flow is equal to the outgoing flow. By construction of graph G T and the nature of the objective function, either the total incoming flow will be routed within the facility subnetwork, or towards customer k. In the latter case, the flow routed from i to k corresponds to z ik , , where the last inequality follows from constraints (26).
We observe that the polytope does not depend on x * anymore, hence, by enumerating all extreme points (α * , β * ) of P k , we can replace the non-linear constraint (20) with the following exponential family of normalized Benders cuts associated to customer k ∈ C: Expression (34) can be further simplified by setting α 0 = 0 when solving formulation (30)-(33). Moreover, if β ij = 0, ∀(i, j) ∈ A F , then α k is expected to be equal to the length of the shortest path from the depot to customer k on the support graph G T , as we will show later.

Combinatorial algorithm for separating normalized Benders cuts
Feasible values for the dual variables (α, β) can be obtained by solving the linear program (30)-(33) for each k ∈ C with any available LP solver. However, the impact on the total computational time caused by the successive calls to the solver on the Benders subproblems is expected to grow with the problem size. As the number of facilities increases, so does the number of possible routes, which in turn requires more cuts to be generated. So as to avoid the computational burden of solving a LP for each subproblem, one may want to compute the values for (α, β) in a combinatorial (and faster) way.
From Proposition 7 and Lemma 8, we can solve the dual problem (30)-(33) by exploiting the structure of the classical shortest path problem (SPP). Indeed, observe that the well known dual formulation for the SPP over the original graph G can be obtained from model (30)-(33) by completely removing the β variables. Nonetheless, if not properly penalized in the objective function (e.g., if x ij < 1, ∀(i, j) ∈ A F ), the presence of the β variables may allow the path length constraints (31) to be relaxed, thus allowing the model to find artificially long paths. Consequently, the role of the master solution x * on the dual objective function (30) is to restrain some of the β variables so that instead of solving the SPP on the original graph G, model (30)-(33) is equivalent to solving the problem on the support graph G T induced by x * , as discussed in Proposition 7.
The optimal shortest paths P * T (0, k) over the support graph G T , for all k ∈ C, can be obtained by applying a labelling algorithm, similar to Dijkstra's algorithm on G T , with the advantage that it takes only O(|F T |) time because G T is a tour. As a result, we have that α * 0 = 0 and α * k = t * T (0, k), i.e., the traversal time of P * T (0, k). We furthermore denote by t T (0, i) the length of the path from the depot to facility i ∈ F T along this tour.
When the Benders subproblem is a shortest path, Magnanti et al. (1986) showed (in the context of the uncapacitated network design problem) that the coefficients of Benders cuts can be computed as indicated in Proposition 9, which adapts their result to our problem's specificities and notations. Then we show in Proposition 11 that this particular setting provides Benders cuts that are particularly sparse for our problem.
Proposition 9. For a master solution x * ∈ {0, 1} |A F | and customer k ∈ C, an optimal (α * , β * ) solution for the Benders subproblem (30)-(33) can be computed as: and for facility nodes i ∈ F : As a corollary, the separation of Benders cuts for x * ∈ X and k ∈ C can be performed in time O(|F T |).
Note that in the strengthened Benders cuts of Magnanti et al. (1986), there is a single formula of α * i for any node i, which would correspond to α * i = min(t T (0, i), α * k − t * G (i, k)) with our time notation, i.e. formula (ii). Since in our problem a node i is not reachable if i ∈ T , we need to distinguish three cases (i)-(iii), also for the proof of Proposition 11. The proofs for this and the following propositions can be found in the Appendix C.
The major implication of the above result is that we can avoid solving an LP (or finding a min-cost flow) for calculating a violated Benders cut.
The following results provide additional theoretical arguments for choosing this specific calibration of Benders cuts. We show that the optimal dual multipliers (α * , β * ) computed in Proposition 9 result in sparse and numerically stable cuts for our specific problem. We show below that more than 50% of β * ij multipliers are equal to zero. If the values of t ij are integer, the cuts are numerically stable because the dual multipliers are calculated in a combinatorial way so that they also remain integer. On the contrary, if one would use cut-generating LPs as an alternative way to derive Benders cuts (as e.g., for deriving Paretooptimal or facet-defining cuts, see Magnanti & Wong (1981) and Conforti & Wolsey (2018), respectively), this numerical stability would be lost.
We thus get that the fraction of β * variables equal to zero in our Benders cuts is always at least 0.5, and tends to 1 when the tour is small, the smallest tour being composed of a single direct trip from the depot to one facility (ρ → 2/m, i.e., |F T | → 2). As mentioned earlier, sparser Benders cuts increase the computational performance of the method, which could not be necessarily achieved without a combinatorial algorithm for the cut generation.

Implementation details
In this section we provide implementation details and explain how feasible solutions are obtained. In the initialization phase, we calculate combinatorial lower bounds according to Proposition 4 and insert these values as default lower bounds for s k , k ∈ C. In addition, we add constraints (10)-(12).

Separation algorithms
There are two types of subtour elimination constraints that we consider in our implementation: and Constraints (38) are separated only in the case F (k) = F . To this end, for a given fractional solution x * of the master and a given k ∈ C, we generate an auxiliary graph k)} and set the arc capacities to: If the maximum 0-k flow on G k is smaller than one, let W be a subset of facilities inducing the associated min-cut, such that 0 ∈ W , and let W be another subset of nodes from F 0 ∪ {k} inducing the min-cut, such that k ∈ W . Assuming that W ∪ W = F 0 ∪ {k}, two valid cuts are then added to the model: x(δ + (W )) ≥ 1 and x(δ − (W ∩ F 0 )) ≥ 1, referred to as forward and backward cut, respectively. The maximum flow is calculated using the preflow-push implementation by Cherkassky & Goldberg (1997). See also (Gollowitzer & Ljubić, 2011;Manerba et al., 2017) for the separation of these cuts for related routing and network design problems. Alternatively, to separate an integer solution x * , we perform a graph traversal on G k starting from 0, and add a violated cut for a customer that cannot be reached (if any). For the separation of constraints (39), see (Ljubić et al., 2006;Fischetti et al., 2017a).
Finally, normalized Benders cuts (34) are separated at integer points only (in the so-called lazy cut fashion), and only when no violated cuts of types (38)-(39) are found. For a given solution (x * , s * ) of the relaxed master problem and each k ∈ C, we find the 0-k shortest path on the support graph induced by x * , calculate the values of (α * , β * ) according to Proposition 9, and add the corresponding cut if s

Obtaining feasible solutions: A local search heuristic
Feasible solutions are obtained by applying a simple constructive heuristic followed by a local search phase. Besides computing initial upper bounds, this heuristic is also used as a primal heuristic within the branch-and-bound tree. It takes as input a set of available facilities F : when the heuristic is called for the first time to initialize upper bounds we have F = F , and within the branching nodes we define F as a set of facilities visited by the current LP-optimal solution (x * , s * ), i.e., F = {i ∈ F : x * (δ − (i)) > 0}.
In a pre-processing phase, the shortest paths P * G (0, k) from the depot to each customer k ∈ C in G are pre-computed. The values u k − t * G (0, k) (difference between the due date and the length of the shortest path for each customer) are then stored in a sorted list L in non-decreasing order.
Construction phase: we initialize a new route T with the depot only. We then apply the following two steps: • Greedy insertion following the most-urgent-deadline-first policy: We select a customer from the top of the list L and follow the 0-k shortest path to insert all the facilities along this path to T (the insertion consider only the facilities in F which are not already in T ). We apply the best insertion policy with respect to the total tardiness criterion, i.e., we choose the insertion position so as to minimize the sum of tardiness for all reachable customers. The customer k is then removed from the list and this step is repeated until L is empty.
• Greedy insertion of the remaining available facilities: If T does not visit all the facilities from F , we insert the remaining ones using the same best insertion policy from above.
Local search phase: In the local search phase, we now try to improve the current tour T by applying some of the standard moves typically used in the vehicle routing literature (Gendreau et al., 1994;Vidal et al., 2012;Vidal, 2017). More precisely, we perform node re-insertions and swaps in a Variable Neighborhood Descent fashion (Hansen & Mladenović, 2003), and finally remove unused facilities.
This local search heuristic is used to initialize feasible solutions for all exact methods tested in our computational study.

Computational experiments
This section presents the computational results obtained with the proposed formulations and decomposition methods. The goal is two-fold: 1) to demonstrate the computational efficiency of the proposed branch-and-Benders-cut approach when compared to alternative exact methods, and 2) to conduct a managerial study analyzing how some of the major features (like the number of available facilities, the coverage radius or the travel speed of self-driving robots) affect the late deliveries.
In what follows, we first present the computational setting. Second, we describe how the benchmark instances are generated. Third, the numerical results for each method are discussed and compared. Finally, managerial insights are provided. In addition to the proposed URSP MILP formulation given in Section 3, three Benders decomposition procedures were implemented; • Auto-Benders, obtained by solving the URSP model with the built-in Benders decomposition implemented by CPLEX. In practice, we relaxed the integrality constraints on the flow and assignment variables and select the fully automated strategy.
• LP-Benders, implemented as described in Section 4, and the Benders cuts are obtained by solving the subproblems (24)-(29) as LPs.
• SP-Benders, where the Benders cuts are computed using the combinatorial shortest path-based (SP) procedure described in Subsection 4.2.
A computation time limit of one hour (3600 s) was imposed for each instance. All the other parameters of CPLEX were left at their default values. Additionally, all the methods are initialized with a feasible solution obtained by applying the local search heuristic described in Section 5.2.

Instance generation
The instances were generated by a procedure similar to the one described by Boysen et al. (2018). Both facilities and customers coordinates are uniformly generated on a 10 km square grid. Each benchmark set contains either 25 or 50 symmetric Euclidean problems which are classified according to the number of facilities and customers.
The travel times among facilities are based on an average truck speed of 30 km/h as in Boysen et al. (2018). Likewise, in the default scenarios, the travel times between facilities and customers were defined for autonomous robots moving at 5 km/h, i.e., at pedestrian speed. For each customer k ∈ C, the due date is computed as: The parameters {ρ min , ρ max } control the tightness and the width of the due dates time horizon, and the random number δ k , drawn uniformly from the interval (0, 1], defines how the due dates are spread. Based on preliminary experiments, the more challenging problems are those with tighter due dates. Such instances can be obtained by setting ρ min = 1.
During our experiments the most restrictive scenarios in terms of facility-customer coverage were obtained by setting the robot speed to 5 km/h and the robot coverage radius to 30 minutes. To generate feasible instances we made sure that each customer can be reached by at least one facility and that each facility can reach at least one customer.

Analysis of computational performance
In this section we discuss the numerical results obtained with the four proposed methods for solving the URSP (MILP Formulation from Section 3, Auto-Benders, LP-Benders and our SP-Benders approach) and compare their performance. When running the compact MILP formulation from Section 3, we leave out the subtour elimination constraints (as they will not affect the solution value, and redundant subtours could be removed in a post-processing phase).
The experiments were carried on benchmark problems with |C| ∈ {25, 50, 75, 100} customers, and |F | = 20 facilities. The problems are grouped according to the number of customers and for each group 25 instances were generated. The customer deadlines were generated using ρ min = 1 and ρ max = 5. The speed of autonomous robots was set to 5 km/h with no coverage radius limit. The three tardiness objective functions were used with uniform weights w k = 1.0, for all k ∈ C. The results obtained by each of the four methods are detailed in Table 1. For each instance group and objective function, we provide the number of instances solved to optimality within the time limit (#Solved), and the corresponding average CPU time in seconds. For those instances which were not solved within the time limit, the average gap between the best integer solution found (among the four methods) and the best lower bound (for each method) is reported.
In total, SP-Benders was able to solve to optimality 231 instances out of 300 (100 instances per each objective function). At the same time, the compact formulation and automatic Benders approach find optimal solutions for only 112 instances, whereas for LP-Benders this figure drops to 70. As expected, the compact MILP formulation scales very badly with the increasing number of commodities: already for |C| = 75 and the min-num objective function not a single instance could be solved to optimality, and the average gap for the unsolved instances is higher than 95%. The remaining three Benders decomposition methods scale better with the increasing size of C, but there are still significant differences between them. SP-Benders drastically outperforms all other methods both in terms of computational time and average gap. SP-Benders shows an average gap lower than 12% over all unsolved instances and all three objective functions, whereas for the second best approach, Auto-Benders, this gap is above 60%. The weak performance of LP-Benders can be explained by the fact that the dual Benders subproblem is highly degenerate, so that the optimal dual solution (α, β) obtained from the LP solver produces numerically unstable, possibly too dense and shallow cuts. Auto-Benders uses sophisticated stabilization techniques (see, e.g., in-out approach described in (Fischetti et al., 2017a)) to overcome these issues, whereas our SP-Benders relies on normalization, sparsity of derived cuts and their numerical stability thanks to the combinatorial procedure for calculating the dual multipliers (cf. Propositions 9 and 11).
Table 1 also shows that the min-num and min-sum objective functions are generally more computationally expensive than the min-max objective function, and that the amount of instances solved to optimality decreases as the number of customers increases. Nevertheless, the performance of SP-Benders remains stable even with the increasing number of customers, and shows the potential of our method to be applied to even larger instances. Additional analysis of the computational performance for the min-max objective function is provided in the Appendix D.

Managerial insights
In this section we analyze how the QoS is affected by the number of available facilities, the speed of robots and their coverage radius. In addition, to assess the environmental impacts, we focus on the savings in km (and respectively the CO 2 emissions) for the delivery trucks. This emission analysis is the most suitable for our purposes, given the differences in electricity generation (for powering the robots) between different cities/countries. Assuming that the delivery trucks do not belong to the last-generation vehicles, we estimate emission of 200 grams of CO 2 per kilometer and estimate annual CO 2 emissions assuming 300 working days per year. We also report the length of the paths traversed by the robots. We are not analyzing operational costs, as done in e.g., (Brotcorne et al., 2019) and  or (Bakach et al., 2021), for several reasons: 1) Our model deals with operational (scheduling) decisions while assuming that the underlying infrastructure (robots, facilities, delivery truck with a sufficiently large capacity) is available.
2) Investment cost data (e.g., operating costs related to fleet management and maintenance, personnel costs, costs per stop, etc) vary between cities and organizations; 3) In a case study conducted by Bakach et al.
(2021) the authors show that, compared with conventional truck-based deliveries, robot-based deliveries can save about 70% of operational cost. Therefore, in this section we focus on the impact of the available infrastructure and technology on: a) the quality of service, and b) the environmental costs. The obtained results are summarized in Table 3 and Fig. 3. Each row in Table 3 reports average values across 50 instances from our benchmark set. When varying the number of available facilities (respectively, the robots' speed or coverage radius), we report the average number of used facilities in an optimal solution (second column), the average distance (in km) traveled by the delivery truck (third column), the average distance (in km) traveled by all robots (fourth column), and the average distance (in km) traveled by each robot (last column). In Fig. 3 we vary the same parameters and report the average solution value (max-tardiness in our case), together with the bars corresponding to the 95% confidence interval.

Number of Customers (|C|) 50
Number of Facilities (|F |) 10, 15, 20 * , 25 Device speed (km/h) 5 * , 6, 10, 15 Device coverage radius (R min.) 30, 35, 40, 45, 60 * Objective function min-max * , min-sum, min-num By default, we consider the maximum tardiness indicator as the objective function. Table 2 summarizes the values of the test parameters. The selected intervals are based on the technical capabilities of existing delivery robots. For example, on the most conservative side we have the autonomous robots produced by , with a maximum speed of 6 km/h and maximum coverage range of 6 km, on the other side we have the robots produced by e-novia (2020), which have a maximum coverage radius of 80 km and different speed modes: it imposes a maximum speed of 6 km/h on sidewalks (pedestrian speed), and can travel at up to 20 km/h using bicycle lanes.
Effect of increasing the number of facilities. In this setting, we iteratively expanded the set of available facilities from |F | = 10 to |F | ∈ {10, 15, 20, 25} by adding facilities to existing ones, with a uniform distribution in the space. The chart given in Fig. 3a shows that the maximum tardiness improves in a rather linear way with the increasing number of facilities available for robot delivery, with a gain of 30% when doubling the number of facilities. Interestingly,  Let us now focus on a logistics service provider and compare two possible solutions: 1) vertical integration of the last-mile delivery using own infrastructure (i.e., distribution facilities and self-driving robots) versus 2) outsourcing to one or several third-party logistics (3PL) providers that offer the last-mile delivery using their own distribution facilities and self-driving robots. As we can see from Fig. 3a, having a larger set of available facilities is highly beneficial for reducing the late deliveries. On the other hand, the upfront investment cost for constructing/renting distribution facilities may be quite large for a single company. Therefore, it can be interesting for the LSP to expand the infrastructure by adding the facilities of another 3PL provider as soon as it significantly improves the coverage of the service area, and potentially sign contracts with multiple 3PL providers, sorting them by marginal improvement of the QoS. A full economic analysis and trade-off cost study, introducing the costs per stop of standard delivery, could be further conducted, following the findings of, e.g., Brotcorne et al. (2019) and Perboli et al. (2018).
Effect of increasing the device speed. In this experiment, we change the robot speed from 5 to 6, 10 and 15 km/h, respectively, while keeping |F | = 20, R = 60 minutes, |C| = 50. Fig. 3b shows that a significant gain in the QoS is obtained by employing robots with higher speed (the max-tardiness of more than 6 minutes drops down to near zero when robots with 5 km/h speed are replaced by ones with the speed of 10 km/h).
Surprisingly, faster robots seem to be useful only up a certain threshold. For example, on our benchmark set the max-tardiness can no longer be improved by employing robots whose speed is 15 km/h, which can be explained by the fact that already with the speed of 10 km/h, most of the customers could be reached in due time. Naturally, this threshold switch varies with the problem data. The optimal selection of the speed of robots depends on the structure of the underlying city network, and hence needs to be numerically tested on the specific case study. Table 3 provides complementary information with respect to Fig. 3b. Besides the significant reduction of the maximum tardiness, we observe another general trend: by increasing the speed of robots, less facilities are being visited (the average number of stops drops down from 6 to 2.9), the routes of the delivery truck are becoming shorter (the average length drops down from 27.3 km to 16 km), which in turn leads to longer distances traversed by self-driving robots (the average distance traversed by a robot doubles from 3.7 km to 7.2 km). In terms of annual CO 2 emission savings, increasing the speed of robots from 5 km/h to 15 km/h, results in annual savings of ≈ 675 kg CO 2 , for a single urban area represented by a 10 km square grid considered in our study. Fig. 4 shows optimal solutions for an URSP instance in which the robot speeds are set to 5 and 15 km/h, respectively. Besides the significant reduction of the maximum tardiness (from ≈ 11 to ≈ 0.5 minutes), the example also shows that for the given instance the truck route is reduced by more than 50%, whereas the average distance traveled by robots increases by approximately 45%, and fewer facilities are visited.
Effect of increasing the coverage radius. In this experiment we change the coverage radius of robots from 60 to 45, 40, 35 and 30 minutes, while keeping |F | = 20, |C| = 50, and the speed of 5 km/h. Again, the gain in the QoS is significant up to the point where the radius is large enough to cover all customers in due time.
The curve of Fig. 3c has a convex shape, which indicates that the lower the radius of the current technology, the higher the marginal impact of increasing it. Table 3 provides more detailed information for all instances from our benchmark set. As in the previous case, when increasing the radius we notice a general trend that less facilities are visited (the number of stops drops from 9 to 6), the truck routes are shorter (the average  Hence, among the three factors analyzed in this work, increasing the coverage radius of robots from 30 to 60 minutes, has the highest environmental impact in terms of annual CO 2 emissions savings. Annual savings of ≈ 750 kg CO 2 can be achieved for a single urban area represented by a 10 km square grid considered in our study.

Comparing how the choice of the KPI affects the resulting solution
In this article, we have suggested three tardiness KPIs as possible objective functions: min-max, min-sum and min-num. We now attempt to answer the question: how does the choice of the KPI influence the quality of the obtained solution, i.e., if we were to find an optimal schedule according to, say, min-max criterion, how does such solution perform with respect to the other two criteria, namely min-sum and min-num?
To answer this question, in Figure 5 and Table 4 we compare how the value of the objective function changes when the same solution is evaluated using an alternative lateness indicator. For example, we take the solutions found by our method using the min-sum objective function and evaluate them using the min-num and min-max objectives, and vice-versa. That way, we indirectly measure how different the solutions are.
For this analysis, we consider all the instances from our benchmark set with the default setting |F | = 20, |C| = 50, the speed of 5 km/h, and the coverage radius of 60 minutes. The obtained results are summarized in Fig. 5 and Table 4. In Fig. 5a, the boxplots show the distribution of the maximum tardiness by evaluating the cost of the optimal solutions obtained by all three objective functions using the min-max objective function. Similar re-evaluations of optimal solutions using the min-sum and min-num KPIs are shown in Figs. 5b and 5c, respectively. From Fig. 5a and Fig. 5b, we observe that the min-max and min-sum objective functions behave in a similar way, with the same median values obtained when switching from min-max to min-sum and vice-versa.
The average optimal min-max tardiness is 4.74 minutes (cf . Table 4), and it increases to 6 minutes (on average) if an optimal min-sum solution is taken instead. Similarly, the average optimal min-sum tardiness is 11.7 minutes, and this value increases to 15.6 minutes (on average) if an optimal min-max solution is taken instead.
These deteriorations of the average objective value are much more pronounced for the min-num KPI that turns out to be significantly more disruptive compared to the other two KPIs. Indeed, the min-num objective induces a deterioration of max-tardiness of 175 % and sum-tardiness of 77 % on average, with a factor 3 to 4 for the median. This can be explained as follows. To minimize the number of late customers (min-num), the model sacrifices a few customers that will be served very late as tardiness measured in time units is no more controlled, in order to serve more customers in due time, and vice-versa.
Overall, the obtained results indicate that min-sum and min-max solutions are very similar, but that they can be quite different from the solutions obtained when min-num KPI is optimized. We therefore conclude that, if the decision-maker wishes to control both conflicting criteria (which are the number of late customers

Concluding remarks
Last-mile delivery is currently being disrupted by the introduction of innovative technologies like selfdriving robots, drones and other autonomous devices which impact delivery cost and safety. In this paper we studied a scheduling problem in this disruptive environment from the perspective of a logistics service provider: The problem optimizes the quality of service (on-time deliveries) by combining routing and lastmile delivery decisions, where the last-mile delivery is performed by self-driving robots. Given the limited capacities of an LSP (namely, the existing logistics network, current resources and available technology), there is potentially a large number of customers that will be served late. We proposed to optimize the delivery schedule by controlling one of the three key performance indicators: maximum tardiness over all customers, total tardiness, or the number of late deliveries. We provided a compact MILP formulation and a problem reformulation based on normalized Benders cuts. We showed that the separation of normalized Benders cuts is tractable and can be solved in a combinatorial fashion by employing a labelling algorithm.
Our normalized Benders cuts provide sparse and numerically more stable cuts than their generic counterparts obtained by solving a linear program. Our exact method based on branch-and-Benders-cut allowed us to efficiently solve the problem on realistic instances of larger size.
Several managerial insights have been also derived. One of the key take-aways is that all three tardiness indicators are very sensitive to the number of available facilities. This should drive the LSP to sign contracts with several third party providers that use robot-deliveries, if available, to improve the coverage of the network and get closer to customers. The three tardiness KPIs are also highly sensitive to the speed and coverage radius of the self-driving robots, up to the point where speed or radius get high enough to cover all customers in due time. Finally, our tests showed that the total/maximum tardiness and the number of late deliveries are somewhat conflicting criteria that do not necessarily behave in the same way. These key performance indicators could be controlled in the same decision model to improve the quality of service. The problem defined in this paper is generic and could be extended to many real-case applications and variants.
When it comes to future work, several interesting directions are possible. For example, for VRP applications with large scale input data, the state-of-the-art methods are typically based on math/metaheuriscs (Vidal et al., 2012;Vidal, 2017;Gendreau et al., 1994). We therefore believe that developing matheuristics derived from the proposed mathematical framework could be a promising direction for future research, in particular for the logistics providers whose business model is based on the same-day delivery. Another promising direction would be to replace the standard delivery truck in our model by (a fleet of) electric vehicles (Desaulniers et al., 2016). Also, the available number of self-driving robots at any given facility may be limited, which leads to natural extensions of the proposed problem to the variants of the Capacitated Routing-Scheduling Problem, that cannot be handled directly with the same techniques developed in this paper. Stochastic problem variants that capture the uncertainty on travel times, would result in more challenging but highly interesting models for future research.
In addition, extending our concept of due dates to time-windows and comparing different delivery policies Graph representing the logistic network t ij Time required for traveling from node i to node j, i, j ∈ V u k Delivery due date for client k ∈ C w k Penalty per late time unit for client k ∈ C T Vehicle tour connecting a subset of distribution facilities with the depot P * T (0, k) Shortest path connecting the depot to client k via arcs of tour T t T (i, j) Traversal time from node i to node j via arcs of tour T t * T (0, k) Traversal time from the depot to client k via the shortest path P * T (0, k) P * G (0, k) Shortest path connecting the depot to client k in the network G t * G (0, k) Shortest traversal time from the depot to client k in the network G R Device's coverage radius (in minutes) δ − (S) Set of arcs entering a set of nodes S ⊂ V δ + (S) Set of arcs leaving a set of nodes S ⊂ V Let T be an optimal tour visiting two or more facilities, and let i be the first facility visited in that tour.
Then, a new solution T can be created by stopping at i, and serving all customers from i. Let k ∈ C be a customer that was served from some facility j, j = i in the optimal solution. Due to triangle inequality, we have: t * T (0, k) = t T (0, j) + t jk = t T (0, i) + t T (i, j) + t jk ≥ t T (0, i) + t ik = t * T (0, k), so potentially travel times to k can be improved, and hence, the new solution associated with T will be at least as good as the starting one.
To find the single-facility optimal solution, one can simply evaluate all single-facility solutions and take the best among them.
Proof of Proposition 2. Note that for the min-num objective, our problem is a particular case of Boysen et al. (2018) where their number δ of robots initially on the truck is zero. Even though their N P-hardness proof assumes δ is equal to the number of customers, with a slight adaptation, a reduction from the decision version of the Hamiltonian Path problem ("is there a Hamiltonian path of length ≤ L?") can be used to prove the N P-hardness of the min-num URSP.
We now provide a detailed proof for the min-max URSP, and will provide a sketch of the proof for the other min-sum variant.
The polynomial reduction for the min-max URSP is from the metric Shortest Hamiltonian Path problem, which is known to be N P-hard. Consider an instance G = (V, E) of the metric Shortest Hamiltonian Path problem with V = {1, . . . , n}, and c ij the cost (or distance) between nodes i and j which satisfy the triangle inequality. We construct an instance of the min-max URSP as follows. Define the set of potential facilities as F 0 = {0, f 1 , . . . , f n }, and the set of customers as C = {k 1 , . . . , k n } and u ki = 0 for i = 1, . . . , n. Set M = max (i,j)∈V ×V,i =j c ij . For travel times on arcs inside F 0 , set t 0,fi = t fi,0 = M n/2 for i = 1, . . . , n, and t fi,fj = c ij for i, j = 1, . . . , n, i = j. This way, the triangle inequality holds on A F because any path from f i to f j has length at most nM = t fi,0 + t 0,fj .
For times on arcs from F to C, set t fi,ki = 0 for i = 1, . . . , n, and t fi,kj = ∞ for j = i. Fig. B.6 illustrates a transformed min-max URSP instance for V = {1, 2, 3, 4}. Now, because the only arc that reaches customer k i with no infinite time is arc (f i , k i ) (with time 0), the tour T in the URSP solution necessarily passes through each node f i , i = 1, . . . , n, which defines a Hamiltonian path over V in the original instance. The time to reach customer k i will be the value of the path in tour T connecting 0 to k i . To prove the N P-hardness for the min-sum URSP, we use a similar construction, with u k = 0, k ∈ C, as well, and the reduction follows from the Traveling Repairman Problem (TRP) (Afrati et al., 1986;Sahni & Gonzalez, 1976). model, and more than 50% for LP-Benders. Similar observations can be derived for the other two objective functions. q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q Figure

Appendix D.2. Examples
To illustrate the effect of increasing the number of facilities on the max-tardiness, two optimal solutions for an URSP instance with |F | = 10 and |F | = 25 are depicted in Fig. D.9. We observe that by incorporating more facilities, all customers could be served on time.
Similarly, the effect of the increased coverage radius is illustrated in Fig. D.10. We show two optimal solutions for an URSP instance in which the coverage radius is set to 30 and 60 min, respectively. When increasing the radius we notice that for the given instance the truck route is reduced by more than 40%, whereas the average distance traveled by robots increases by 20%, and fewer facilities are visited.