The Probabilistic Travelling Salesman Problem with Crowdsourcing

We study a variant of the Probabilistic Travelling Salesman Problem that arises when retailers crowdsource last-mile deliveries to their own customers, who can accept or refuse in exchange for a reward. A planner must identify which deliveries to offer, knowing that all deliveries need fulfilment, either via crowdsourcing or using the retailer's own vehicle. We formalise the problem and position it both in the literature on crowdsourcing and among routing problems in which not all customers need a visit. We show that evaluating the objective function of this stochastic problem for even one solution requires solving an exponential number of Travelling Salesman Problems. To address this complexity, we propose Machine Learning and Monte Carlo simulation methods to approximate the objective function, and both a branch-and-bound algorithm and heuristics to reduce the number of evaluations. We show that these approaches work well on small-size instances and derive managerial insights on the economic and environmental benefits of crowdsourcing to customers.


Introduction
With the rapid growth of e-commerce worldwide, business models for last-mile delivery (LMD) of parcels need to innovate and consider fast, cheap and reliable transportation to end customers. One possibility is to crowdsource some deliveries through digital platforms to non-professional couriers who use their own transportation means. While reducing costs and providing a source of income for people who might otherwise be off the labour market [18], crowdsourcing also raises concerns about job quality, environmental impact [34] and trust [22].
The main motivation for this work is to investigate the practice of crowdsourcing delivery to end customers as a system that takes advantage of the benefits of crowdsourcing while mitigating the major associated drawbacks [see, e.g., 55]. We address the case of retail companies with physical shops that also sell online. The real-life case prompting this work is, indeed, that of a supermarket chain that offers home delivery of groceries. In traditional LMD, a professional fleet, either owned by the chain or outsourced, would deliver the groceries. In current crowdsourced LMD models, the chain would have a platform where potential couriers enrol, get offers for some deliveries, and accept or refuse the offers (this is the model of, e.g., Amazon Flex). Because there is no consolidation of parcels and couriers make journeys that they otherwise would not, this system generates both extra traffic and emissions.
In our case, instead, the chain has a loyalty programme in which the enrolled clients provide their home address. This enables us to defend a slightly different model where the crowd is composed of clients that are already in the store and whose home address is close to one of the delivery points.
Because the clients are already heading home, we reduce the number of cars associated with the supermarket-delivery ecosystem and also the number of miles travelled. Under this model, the planner offers participating clients the opportunity to deliver someone else's groceries in exchange for a discount. Since clients can accept or refuse the offer, at the end of the day a supermarket vehicle will serve all deliveries which the planner did not crowdsource. (Note that, in grocery retail, it is commonplace to schedule the supermarket vehicle tour at the end of the day, because this time coincides with most customers returning home after a workday [52,53].) Therefore, in a typical implementation, the following would happen: (i) in the early afternoon the supermarket fixes the list of grocery deliveries to perform at the end of the day; (ii) then, the supermarket selects which deliveries it will attempt to crowdsource; (iii) one to two hours before closing time, which corresponds to peak hour for in-store visits, the supermarket attempts to crowdsource the set of deliveries selected in step (ii); (iv) at closing time, the truck performs all remaining deliveries: those not selected for crowdsourcing and those selected but not successfully crowdsourced.
A similar scheme was first discussed in the work of Archetti, Savelsbergh, and Speranza [7], in which the authors consider that either a professional fleet or occasional drivers can carry out deliveries, but assume that the occasional drivers will always accept offers. In this work we assume that each crowdsourcing fee (i.e., the compensation or discount offered to the customer) is fixed and that the probability that some customer accepts a delivery during the day, provided the decision-maker offers it, is also fixed and known. We refer the reader to, e.g., the recent work of Yildiz and Savelsbergh [62] discussing optimisation models to set the fees and estimate the probabilities, and to [24,25,26] for the similar problem of optimising crowdsourcing fees to maximise connectivity coverage for internet-of-things devices. We also assume that the supermarket's fleet comprises a single vehicle, because in our motivating example each supermarket manages its own home deliveries in a small geographical area. This assumption is, in any case, easy to relax.
Under the assumptions above, our problem becomes a stochastic generalisation of the Travelling Salesman Problem which we name the Probabilistic Travelling Salesman Problem with Crowdsourcing (PTSPC). Other generalisations of the TSP which do not require visiting all customers have been studied extensively. In particular, as we discuss in Section 2, the literature considers the extreme cases in which the tour planner has either no power or total power in deciding which clients should not be visited. Our problem sits in between these two extremes.

Contributions
The main contributions of this paper are the following:
• We introduce the PTSPC, a stochastic generalisation of the TSP. Other well-known routing problems, such as the Probabilistic TSP and the Profitable Tour Problem, are special cases of the PTSPC (see Section 2.2).
• From the point of view of applications, our problem models the case of a company that wants to crowdsource its deliveries to its own customers, who can either accept or refuse the offer. From a theoretical point of view, our problem occupies a novel niche in the spectrum of TSP generalisations in which the planner has intermediate power of deciding which customers to visit (see Section 2).
• We formalise this problem and show that computing the objective function of even one solution requires solving an exponential number of TSPs (each of which is NP-hard). Thus, we devise efficient (exact and heuristic) ways to explore the solution space and to approximate the objective value of the solutions (see Section 4). In particular, in Section 4.3.2 we show how to use Machine Learning techniques to accurately predict the value of the objective function of the PTSPC.
• With an extensive computational analysis, in Section 5 we derive insights on the benefits of crowdsourcing and show that the algorithms we propose are suitable for integration in a decision support system because they can provide high-quality solutions in short times.

Problem description
We begin by placing the PTSPC in a spectrum of TSP generalisations which do not require visiting all customers, along a gradient of "decision power" attributed to the tour planner.
At one extreme of this gradient lies the case in which the tour planner has no choice over which customer will require a delivery (we assume that we visit customers to deliver some goods) and which will not, because a random variable determines customer presence. This problem is the Probabilistic Travelling Salesman Problem (PTSP) [42]. In the PTSP a decision-maker has to devise a tour to visit a set of delivery points, some of which might be later revealed not to be available. Because the decision-maker does not know in advance which customers will drop out, he/she faces two options.
• The first option is to solve a TSP for each possible set of delivery points, wait until the status of all customers is revealed, and use the TSP tour visiting the customers requiring delivery. This strategy is computationally expensive, but it can also be inconvenient for operational reasons; for example, if used over multiple days, it can produce a radically different tour every day while still visiting a similar set of delivery points (see, e.g., [33] for the importance of consistency in parcel delivery).
• A second option, called the a priori approach, addresses this concern. It consists in first planning an a priori tour visiting all the customers; when the stochastic outcome is revealed, the decision-maker amends the solution skipping the deliveries that are not required, and performing the remaining ones in the same order as they appear in the a priori tour. This is the first approach introduced, together with the definition of PTSP, by Jaillet [40]. It has the advantage that, when the problem is solved for a multi-day planning horizon, all routes will be similar.
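The recourse step of the a priori approach can be sketched in a few lines; the function name and the list-based tour representation below are ours, chosen for illustration, and not part of the original formulation:

```python
def amend_a_priori_tour(a_priori_tour, realised):
    """Second-stage recourse of the a priori approach: skip the customers
    that turned out not to require a visit, keeping the a priori order."""
    return [c for c in a_priori_tour if c in realised]
```

For instance, with the a priori tour [3, 1, 4, 2] and realised customers {1, 2}, the amended tour visits customer 1 and then customer 2, in that order.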
At the opposite extreme there is the case in which the tour planner has total control over which deliveries to perform, giving rise to the Profitable Tour Problem (PTP) [21,28]. In this case, visiting a delivery point earns a profit, while traversing an edge incurs a cost. The objective is to select the deliveries and plan the tour that maximise the difference between collected profits and travel costs.
Intermediate cases arise when the visit requirements are stochastic, but the decision-maker has some leverage on their outcome. In the PTSPC, for example, the planner has the power of forcing a visit with the retailer's own vehicle if he/she never offers the corresponding delivery for crowdsourcing. One could imagine more complicated interactions in which, e.g., the decision-maker can increase (decrease) the compensation offered to raise (lower) the probability of customers accepting a delivery (see, e.g., [10]).

Formalisation of the PTSPC
Let G = (V′, E) be a complete undirected graph whose vertex set V′ = {0, 1, . . . , n} comprises the depot 0, so that V = {1, . . . , n} is the set of delivery locations. Let c_{ij} ∈ R_+ be the cost of traversing an edge {i, j} ∈ E and assume that, if the planner offers delivery i ∈ V for crowdsourcing, there is a probability p_i ∈ [0, 1] that some provider will accept the offer. In this case, the decision-maker pays a fee m_i ∈ R_+ and removes i from the list of customers to visit. Otherwise, if the offer is not accepted, the planner needs to visit i with the retailer's own vehicle. We assume that the probabilities p_i are independent which, under our motivating example, is a reasonable approximation: we expect the number of potential couriers to be larger than the number of deliveries, and we offer at most one delivery to each customer. Different assumptions might hold if the planner outsourced deliveries to a logistics provider (e.g., the provider could accept or refuse en bloc all deliveries in the same area). To estimate the values of p_i, a practitioner could use, for example, historical data on the success rate of crowdsourcing deliveries to the same area, or an estimation method based on the density of customers living within a small radius of the delivery point.
Denote with O ⊆ V the subset of deliveries offered for crowdsourcing and with A ⊆ O the set of accepted offers, which is only revealed at the end of the day (see Figure 1). In this sense, the PTSPC is a two-stage problem and the set of accepted offers is only revealed in the second stage. The decision-maker has to decide which deliveries to offer for crowdsourcing, i.e., the elements of O, assuming that he/she will reoptimise the end-of-day tour after set A is revealed. In other words, the planner looks for the set O with the lowest expected cost with respect to the random variable A.
This approach can appear similar to the first approach for the PTSP mentioned above (computing all possible tours and implementing the one corresponding to the realised customers), but the two differ in an important aspect. In the PTSP the decision-maker cannot affect the outcome of the random variables: he/she will wait for their realisation and then choose the tour through the customers requiring a visit. In short, the PTSP becomes equivalent to a sequence of TSPs on the operational level. On the tactical level, the PTSP decision-maker can calculate the expected cost of the reoptimised tour (using a weighted average of all the tour costs) but, differently from our problem, cannot act to decrease it. Let c_{V′\A} be the travel cost of the reoptimised tour of the PTSPC: the shortest simple tour starting and ending at the depot, visiting all delivery points which were either not offered, or whose offer for crowdsourcing was not accepted. (Cost c_X, for a generic subset X of vertices, is defined formally in Appendix A.) The cost associated with offering deliveries O and having deliveries A accepted is the sum of the crowdsourcing fees for the accepted deliveries plus the cost of the reoptimised tour:

cost(O, A) = ∑_{i∈A} m_i + c_{V′\A}.    (1)

The objective of the problem is to find the set O_opt which gives the lowest expected cost:

O_opt = arg min_{O⊆V} E_A[cost(O, A)] = arg min_{O⊆V} ∑_{A⊆O} ( ∏_{i∈A} p_i ∏_{i∈O\A} (1 − p_i) ) ( ∑_{i∈A} m_i + c_{V′\A} ).    (2)

Any algorithm which aims to solve (2) faces two challenges. First, the solution space 2^V grows exponentially with the number of delivery points. Second, and differently from most other optimisation problems, the evaluation of the objective function of even one solution is costly: one has to solve 2^{|O|} TSPs, each of which is NP-hard. Our approach will, thus, focus on tackling both aspects. In Section 4.1 we introduce an exact branch-and-bound algorithm to avoid the complete enumeration of all sets O ⊆ V; in Section 4.2 we propose alternative, heuristic, strategies to explore the solution space; in Section 4.3 we propose approximation methods to efficiently estimate the value E_A[cost(O, A)].
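To make the cost structure concrete, the following sketch evaluates the expected cost exactly by enumerating all 2^{|O|} acceptance outcomes. The names are ours, and the brute-force TSP stand-in makes this workable only on tiny instances; it is a clarifying sketch, not the paper's algorithm:

```python
from itertools import combinations, permutations

def tsp_cost(points, dist, depot=0):
    # Brute-force cost c_X of the optimal tour through `points` plus the depot.
    points = list(points)
    if not points:
        return 0.0
    best = float("inf")
    for perm in permutations(points):
        tour = [depot, *perm, depot]
        best = min(best, sum(dist[a][b] for a, b in zip(tour, tour[1:])))
    return best

def expected_cost(offered, customers, p, m, dist):
    """Exact expected cost of offering set O: enumerate every accepted
    set A ⊆ O, weight its cost (fees plus reoptimised tour) by Pr(A)."""
    offered = list(offered)
    total = 0.0
    for k in range(len(offered) + 1):
        for A in combinations(offered, k):
            A = set(A)
            prob = 1.0
            for i in offered:
                prob *= p[i] if i in A else 1.0 - p[i]
            total += prob * (sum(m[i] for i in A)
                             + tsp_cost(set(customers) - A, dist))
    return total
```

Each call to `expected_cost` solves one TSP per subset of the offered set, which is exactly the source of the evaluation cost discussed above.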

Relation with classical TSP problems
The PTSPC is a generalisation of two well-studied routing problems. When the crowdsourcing fees are large, m_i = ∞ ∀i ∈ V, there is no incentive to offer any delivery for crowdsourcing and the planner will decide to perform all deliveries with the retailer's own vehicle. In this case, the PTSPC becomes the classical Travelling Salesman Problem. The reduction to the TSP also happens when all probabilities of providers accepting a crowdsourcing offer are zero, p_i = 0 ∀i ∈ V. When the customers always accept crowdsourcing offers, p_i = 1 ∀i ∈ V, the problem reduces to the Profitable Tour Problem with prizes for performing deliveries with the retailer's own vehicle equal to the savings of the corresponding crowdsourcing fees.
We note that these extreme cases are interesting from a theoretical point of view, but unlikely to occur in practice. For the case study considered it is, thus, important to devise a solution method which works for the general PTSPC, because reduction to other problems is unlikely. At the same time, the solution methods we devise for the PTSPC heavily exploit its specificity (e.g., the complexity of evaluating its objective). Thus, we do not expect our algorithms to be competitive with state-of-the-art methods for classical problems such as the TSP, and a practitioner faced with such problems should browse the extensive existing literature on the topic.

Literature review
In this section we position the PTSPC in a broader context, highlighting the characteristics it shares with other non-deterministic routing problems and with other optimisation problems arising when integrating crowdshipping in LMD. We also briefly note that the PTSPC (and the PTSP) belong to a growing group of stochastic combinatorial optimisation problems in which the data affected by uncertainty is modelled with Bernoulli random variables (see, e.g., [1,11,36,48]). By contrast, in the majority of problems arising in logistics, stochastic quantities such as travel times [43], costs [27,60], release dates [7] or demands [13] are modelled with continuous random variables.

Crowdsourcing in last-mile delivery
In a recent survey, Alnaggar, Gzara, and Bookbinder [2] review operational research literature on crowdsourced LMD and propose a classification of problems arising in the field. Under their classification scheme, the PTSPC: (i) focuses on e-retailers, i.e., the company offering the delivery is the same that sells the product; (ii) offers a per-delivery rate determined by the company; (iii) uses pre-planned trips because drivers were already heading in the direction of the delivery points; (iv) focuses on self-scheduling individuals, because the customers enter the supermarket at their own convenience; and (v) considers short-haul deliveries within the same city. These characteristics set the PTSPC apart from most other problems considered in the literature, which focus on crowdsourcing to professional couriers rather than truly occasional drivers.
Archetti, Savelsbergh, and Speranza [7] were among the first to address a problem arising in outsourcing in LMD: the Vehicle Routing Problem (VRP) with Occasional Drivers (ODs). In this problem, the company can decide whether to serve a delivery with a vehicle of its own fleet, or to outsource it for a fixed fee. The planner assigns deliveries to ODs who are already heading towards a destination. For the assignment to take place, the delivery point needs to be close to the driver's destination. The model works under the assumption that ODs always accept requests from the company, provided that they fulfil this "closeness" condition. This assumption is important, as optimal solutions tend to use a high percentage of available ODs. The authors propose a Mixed-Integer Programming (MIP) formulation, but must resort to a multi-start heuristic to tackle instances with more than 25 deliveries. They identify three characteristics affecting the profitability of such a scheme: the number of available ODs, their flexibility (how far the delivery point can be from the OD's original destination) and the compensation amount.
Other authors extended the VRP with ODs model to incorporate real-life features such as time windows, multiple and split deliveries [44], transshipment nodes [45], and coordinating ODs on bikes or on foot with a delivery truck from which they relay [38,41]. The MIP model by Huang and Ardiansyah [38] illustrates well how even deterministic problems can be hard to solve, when they require the interaction of traditional and crowdsourced supply chain segments. The authors, in fact, could solve instances with up to 15 delivery points using the Gurobi solver. They could not solve any instance with 20 and 30 customers and, sometimes, after a 4-hour time limit the solver did not even provide a valid integer solution.
Recent literature also addressed dynamic versions of the problem, in which delivery requests and driver availability become known during the day. Arslan et al. [8] consider a real-time system in which ODs make trip announcements and the company can then assign them pickups and deliveries which are compatible with their announced travel (i.e., involving only a small detour) and the recipients' time windows. They tackle the dynamic nature of the problem with a rolling horizon approach, corresponding to a planner who takes decisions at different moments during the day. The decisions consist in matching deliveries to ODs and routing the own fleet for deliveries which were not crowdsourced. With a simulation study the authors show how this approach can produce savings, depending on the time flexibility of the ODs and their willingness to take larger detours. They also conclude that this system is more indicated when all parcels share the same origin, such as a central depot, as this greatly reduces the cost to operate the own fleet.
Other problems share some of the characteristics of the PTSPC. Dayarian and Savelsbergh [20], for example, propose a model which also addresses the case of using in-store customers as occasional drivers and an own fleet in charge of completing the distribution of parcels. Differently from the PTSPC, they simulate each customer individually; if a customer has a destination compatible with a delivery point, they assume that the customer will accept the delivery. The authors consider two cases: a static case, where all customer visits and deliveries are known in advance, and a dynamic case, where information is only known up to a certain time and the planner reoptimises following a rolling horizon approach. Data on the presence and destination of customers is collected while the customers shop inside the store. For the dynamic case, the authors also propose to base the decisions at each epoch on forecasts of future demand and in-store visits, which they obtain averaging sample historical scenarios. Through a simulation study, the authors determine a trade-off between the savings obtained by reducing the own fleet and the risk introduced when relying on customer availability.
Stochastic occasional drivers are also a feature of the Vehicle Routing Problem with Dynamic Occasional Drivers, introduced by Dahle, Andersson, and Christiansen [19]. The authors, however, model ODs as stochastic vehicles possibly appearing at random times during the day (according to a known distribution) in a VRP with time windows. They use a two-stage stochastic model in which they plan the routes of the own fleet in the first stage and, after OD appearance times are revealed, they amend the routes and assign deliveries to ODs in the second stage. The authors solve instances with 5, 10, 15 or 20 deliveries, 2 own vehicles, and 2 or 3 ODs. To do so, they use a MIP model and enumerate all possible 2^K scenarios, where K is the number of occasional drivers. The problem proves hard to solve: for example, they cannot find the optimal solution within 2 hours of computation for instances with 10 delivery points and 3 ODs.
Finally, Gdowska, Viana, and Pedroso [30] introduced a multi-vehicle problem for delivery with crowdshipping based on the VRP with ODs [7], which is most similar to our problem. In their work, the authors consider a multi-vehicle fleet of own vehicles and propose an agent-oriented bi-level stochastic model and a local search algorithm for its solution. With computational experiments on instances with 15 deliveries, they show that solutions using crowdsourcing can produce savings, but they cannot assess the accuracy of their heuristic, because they lack exact solutions.

Probabilistic TSP with profits
The literature on the Probabilistic TSP is vast and its full review is out of the scope of this paper (we refer the reader to classic works [12,14,15,39,40,42] and surveys [31,32]). In the following, we focus on a class of problems which shares common characteristics with the PTSPC: the class of Travelling Salesman Problems with Profits and Stochastic Customers (TSPPSC) introduced by Zhang et al. [63]. This class contains three problems, which mirror the three classic problems in the area of TSP with profits: the Orienteering Problem (OP) which maximises the collected profit under an upper bound on tour duration, the Prize-Collecting TSP (PCTSP) which minimises tour duration under a lower bound on collected profit, and the Profitable Tour Problem (PTP) described in Section 2.
For each of these problems, the authors describe the corresponding version with stochastic customers. Of the three new problems, the one most related to the PTSPC is the PTP with Stochastic Customers (PTPSC). In this problem the decision-maker has to select a subset of delivery points and plan an a priori tour only visiting the selected points. When the random outcomes are revealed, the planner amends the tour skipping the points not requiring service. If a delivery is not even included in the a priori tour, the delivery point will not be visited and the corresponding profit will not be collected. This is analogous to outsourcing the delivery, paying a fee equal to the lost profit to a provider who is always available to accept requests, highlighting a sort of duality between the proposed problem and our PTSPC. In our problem we have certain delivery points but uncertain outsourcing; in the PTPSC there are uncertain delivery points, but certain outsourcing. Looking at probabilities, the decision-maker of the PTPSC has the possibility of setting p_i = 1 (exclude from the tour) for those customers he/she does not select in the a priori tour; in contrast, in our problem the decision-maker can set p_i = 0 (force in the tour) for those customers he/she does not offer for crowdsourcing.
Zhang, Wang, and Liu [64] propose a genetic algorithm to solve the PTPSC with homogeneous probabilities (i.e., all p_i's are the same). Their analysis is limited to 5 instances with 8, 13, 20, 28 and 50 customers. It shows that, for the two largest instances, the genetic algorithm produces in a few seconds better results than solving the non-linear formulation with a commercial solver running for three hours.
Finally, we remark that, for the three problems with profits mentioned above, no exact algorithms for the stochastic-customer variants exist in the literature, while heuristics only exist for the PTPSC [64] and the OP with stochastic customers [5,49,63].

Algorithms
In this section we propose an exact algorithm based on branch-and-bound, and four heuristic algorithms to explore the solution space of the PTSPC in search of the optimal set of offered deliveries. We also introduce two ways to approximate the objective function of the problem, which we can use instead of exact evaluation to further speed up the heuristic algorithms.

A branch-and-bound algorithm
We develop a branch-and-bound (B&B) algorithm in which branching is associated with the inclusion or exclusion of delivery points in the set O of offered deliveries. At each node of the tree, we denote with O ⊆ V the deliveries that the planner will offer for crowdsourcing, with Ō ⊆ V the deliveries which the planner will not offer, and with F ⊆ V the deliveries for which a decision has not been made yet. At the root node of the tree O = Ō = ∅ and F = V, while at leaf nodes the planner has decided the status (offered or not offered) of all deliveries and F = ∅. Branching in the tree amounts to selecting a delivery whose status is not fixed (that is, a delivery in F) and creating two child nodes: one in which the delivery is offered and one in which it is not offered.
The effectiveness of a B&B algorithm depends on devising upper and lower bounds for the objective value of the problem at each node, allowing it to prune large portions of the tree. In the rest of this section we describe the lower and upper bounds we use during tree exploration. Each node of the tree is determined by the three sets (O, Ō, F) described above; we denote with z(O, Ō, F) the cost of the best solution in the subtree originating from node (O, Ō, F), i.e., the best solution which can be obtained by assigning the deliveries of F to either set O or Ō.

Lower bound z̲
A lower bound for z(O, Ō, F) is the following:

z̲(O, Ō, F) = ∑_{i∈O∪F} m_i + PTP(Ō, O ∪ F),    (3)

where PTP(X, Y) denotes the value of the optimal solution of a Profitable Tour Problem over delivery points X ∪ Y in which points of the set X are forced to be visited. Intuitively, eq. (3) states that one can get a lower bound by being optimistic and assuming that the accepted deliveries will be exactly those (among the offered and the unfixed ones) which give the lowest total cost. We give a formal definition of problem PTP(X, Y) in Appendix B, while the following theorem proves that eq. (3) defines a lower bound.

Theorem 1. The quantity z̲(O, Ō, F) defined in eq. (3) is a lower bound on the value of the objective function, z(O, Ō, F).

Proof. Selecting the best set of deliveries to crowdsource, when we assume that (i) the ODs will always accept offered deliveries, and (ii) deliveries in set Ō cannot be crowdsourced, corresponds to finding the lowest-cost tour which visits all vertices of Ō (they must be visited with the retailer's own vehicle), while deciding which vertices of O ∪ F to visit.
This decision is taken based on whether it is more convenient to visit vertices in O ∪ F with the own vehicle or to pay the crowdsourcing fee. We can model this problem as a PTP in which the profit associated with each vertex of O ∪ F is equal to the crowdsourcing fee saved if the retailer's own vehicle visits that vertex. Formally, the equivalence between these two problems can be expressed with the following equality:

min_{A⊆O∪F} ( ∑_{i∈A} m_i + c_{V′\A} ) = ∑_{i∈O∪F} m_i + PTP(Ō, O ∪ F),    (4)

where, in the right-hand side, we sum all the crowdsourcing fees in the first term and then we subtract those of the vertices visited by the retailer's own vehicle in the second term (these are the profits of the vertices included in the optimal PTP tour).
With eq. (4), we are ready to show that z̲(O, Ō, F) is, indeed, a lower bound for z(O, Ō, F):

z(O, Ō, F) = min_{O′⊆F} ∑_{A⊆O∪O′} ( ∏_{i∈A} p_i ∏_{i∈(O∪O′)\A} (1 − p_i) ) ( ∑_{i∈A} m_i + c_{V′\A} )    (⋆)
 ≥ min_{O′⊆F} min_{A⊆O∪O′} ( ∑_{i∈A} m_i + c_{V′\A} )
 = min_{A⊆O∪F} ( ∑_{i∈A} m_i + c_{V′\A} )
 = ∑_{i∈O∪F} m_i + PTP(Ō, O ∪ F)
 = z̲(O, Ō, F),

where the first inequality derives from the fact that

∑_{A⊆O∪O′} ∏_{i∈A} p_i ∏_{i∈(O∪O′)\A} (1 − p_i) = 1,

and therefore sum (⋆) defines a convex combination of the terms ∑_{i∈A} m_i + c_{V′\A}. Because the value of a convex combination is always between those of its smallest and largest terms, the first inequality follows. The next equality follows from the fact that every set A ⊆ O ∪ F is attainable in the double minimisation by choosing O′ = F, while the second-last equality is eq. (4) and the last one is the definition of z̲.
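On a toy instance, the bound of eq. (3) can be computed by brute force. The sketch below (our names; the enumeration stands in for the exact PTP method of Appendix B) is only meant to make the definition of PTP(X, Y) and of z̲ concrete:

```python
from itertools import combinations, permutations

def tsp_cost(points, dist, depot=0):
    # Brute-force cost c_X of the optimal tour through `points` plus the depot.
    points = list(points)
    if not points:
        return 0.0
    best = float("inf")
    for perm in permutations(points):
        tour = [depot, *perm, depot]
        best = min(best, sum(dist[a][b] for a, b in zip(tour, tour[1:])))
    return best

def ptp(forced, optional, m, dist):
    """PTP(X, Y): visit all of `forced`, choose which of `optional` to visit,
    collecting profit m[i] (the saved fee) for each optional vertex visited."""
    best = float("inf")
    optional = list(optional)
    for k in range(len(optional) + 1):
        for S in combinations(optional, k):
            best = min(best, tsp_cost(set(forced) | set(S), dist)
                             - sum(m[i] for i in S))
    return best

def lower_bound(O, O_bar, F, m, dist):
    # Eq. (3): fees over O ∪ F plus the optimal PTP value over (Ō, O ∪ F).
    free = set(O) | set(F)
    return sum(m[i] for i in free) + ptp(O_bar, free, m, dist)
```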

Upper bound z̄
We can trivially define an upper bound for z(O, Ō, F) in the following way:

z̄(O, Ō, F) = ∑_{i∈O∪F} m_i + c_V,    (5)

where c_V denotes the cost of the TSP tour visiting all delivery locations. The intuition is that in eq. (5) we assume that we will both visit all delivery points with the own vehicle (thus term c_V) and pay all crowdsourcing fees for offered or unfixed deliveries. The following theorem shows that z̄ is, indeed, an upper bound for z.

Theorem 2. The quantity z̄(O, Ō, F) defined in eq. (5) is an upper bound on the value of the objective function, z(O, Ō, F).
Proof. The thesis follows from the definition of z:

z(O, Ō, F) = min_{O′⊆F} ∑_{A⊆O∪O′} ( ∏_{i∈A} p_i ∏_{i∈(O∪O′)\A} (1 − p_i) ) ( ∑_{i∈A} m_i + c_{V′\A} )    (⋆)
 ≤ min_{O′⊆F} max_{A⊆O∪O′} ( ∑_{i∈A} m_i + c_{V′\A} )
 ≤ min_{O′⊆F} ( ∑_{i∈O∪O′} m_i + c_V )
 ≤ ∑_{i∈O∪F} m_i + c_V
 = z̄(O, Ō, F).

The first inequality is due to (⋆) defining a convex combination and, thus, being smaller than its maximal term. The second inequality follows from two other relations. First, ∑_{i∈A} m_i ≤ ∑_{i∈O∪O′} m_i, because the sum of non-negative m_i cannot decrease when enlarging the set over which the sum is computed. Second, c_{V′\A} ≤ c_V, because the cost of the optimal TSP tour cannot decrease when visiting more vertices (assuming, as usual, that the triangle inequality holds). The last inequality is valid because the minimum is smaller than any value its operand takes (here, the value at O′ = F). The last equality is the definition of z̄.

Upper bound z̄′
While z̄ is a valid upper bound and is fast to compute, it is not very tight. We can derive a tighter upper bound z̄′(O, Ō, F) with a reasoning analogous to the one used to derive z̲: in the worst case, the realised set A is the one leading to the highest total cost (accounting for both the TSP and the crowdsourcing fees). In other words, we take

z̄′(O, Ō, F) = max_{A⊆O∪F} ( ∑_{i∈A} m_i + c_{V′\A} ).    (6)

The proof that z̄′ is a valid upper bound is similar to the proof of Theorem 1, and the fact that z̄′ is tighter than z̄ follows immediately from its definition. The main issue with eq. (6) is that it requires solving a hard maximisation problem. We can rewrite this problem as

max_{A⊆O∪F} ( c_{V′\A} − ∑_{i∈V\A} m_i ),    (7)

which has the same maximiser as (6) (although the two objective functions differ by the constant ∑_{i∈V} m_i). The problem defined in eq. (7) is a bi-level integer optimisation problem, in which the inner problem is a TSP, required to compute c_X. The best and, as far as we are aware, only available exact general-purpose solver for bi-level integer optimisation (that of Fischetti et al. [29]) does not support dynamic cut generation schemes such as branch-and-cut. Therefore, we cannot model the inner TSP using an exponential number of subtour-elimination constraints, as in model (27)-(30); we must use a formulation which requires a polynomial number of variables and constraints. In particular, we use the multi-commodity flow reformulation by Sherali, Sarin, and Tsai [58], which is known to have one of the strongest continuous relaxations among compact TSP formulations (see, e.g., Öncan, Altınel, and Laporte [51]). The resulting problem is model (8)-(22), which is written for the general case of an asymmetric graph G for ease of notation.
Binary variables x_{ij} take value 1 iff arc (i, j) is used in the TSP tour, binary variables d_{ij} take value 1 iff vertex i precedes vertex j in the TSP tour, variables t^k_{ij} arise from a reformulation-linearisation strengthening of the subtour-elimination constraints (see [57] and [58]), and binary variables y_i determine whether vertex i should be included in the TSP. We also use the notation V_y = {i ∈ V : y_i = 1} and V′_y = {0} ∪ V_y. We refer the reader to the survey by Öncan, Altınel, and Laporte [51] for a complete description of each constraint. Here we note that constraints (10) and (11) are classical TSP flow constraints, constraints (12)-(15) are precedence constraints and link variables x and d, while constraints (16)-(18) are the multi-commodity flow constraints and link variables x and t.
Because solving model (8)-(22) is time-consuming, we only use bound z̄′ at the root node of the B&B tree.

Overall B&B algorithm
Algorithm 1: Branch-and-bound algorithm for the PTSPC.

Algorithm 1 shows the high-level structure of the B&B algorithm, which uses a depth-first exploration strategy and branches on the unfixed delivery with the highest crowdsourcing fee. Note that we do not need to recompute the bounds at each node: in the child nodes in which we fix a delivery in O, in fact, the value of the bounds does not change. Furthermore, because the complexity of calculating E_A[C(O)] increases with the size of O, it is convenient to explore first the child nodes where deliveries are fixed in Ō. In this way we delay the exploration of nodes with large sets O and, possibly, skip it altogether if the corresponding part of the tree is pruned.
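The depth-first scheme just described can be sketched in a few lines. This is an illustrative reconstruction, not the paper's Algorithm 1: `lower_bound` and `expected_cost` are hypothetical callables standing in for the bounds of Section 4 and for E_A[C(O)], respectively.

```python
def branch_and_bound(deliveries, fees, lower_bound, expected_cost):
    """Depth-first B&B over offer/not-offer decisions (sketch).

    `lower_bound(offered, not_offered, free)` and `expected_cost(offered)`
    must be supplied by the caller; both names are illustrative.
    """
    best = {"cost": float("inf"), "set": frozenset()}

    def visit(offered, not_offered, free):
        if not free:                      # leaf: evaluate the candidate set O
            cost = expected_cost(offered)
            if cost < best["cost"]:
                best["cost"], best["set"] = cost, frozenset(offered)
            return
        if lower_bound(offered, not_offered, free) >= best["cost"]:
            return                        # prune the subtree
        # branch on the free delivery with the highest crowdsourcing fee
        i = max(free, key=lambda j: fees[j])
        rest = free - {i}
        visit(offered, not_offered | {i}, rest)   # "not offered" child first
        visit(offered | {i}, not_offered, rest)   # larger sets O come later

    visit(frozenset(), frozenset(), frozenset(deliveries))
    return best["set"], best["cost"]
```

Exploring the "fixed in Ō" child first, as in the text, postpones the nodes whose leaf evaluation is most expensive.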

Acceleration strategies
A useful property of E_A[C(O)], as defined in eq. (1), is that truncating the sum at any point gives a valid lower bound. Thus, if at any moment during the computation the partial sum exceeds the current best solution cost z*, we can discard the leaf node.
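The truncation argument can be sketched as follows; `tsp_cost` is an illustrative stand-in for the c_{V′\A} evaluation, and the subset enumeration order is our own choice.

```python
from itertools import chain, combinations

def expected_cost_with_cutoff(offered, probs, fees, tsp_cost, cutoff):
    """Accumulate E_A[C(O)] term by term. Every partial sum is a valid
    lower bound because all terms are non-negative, so we can stop as
    soon as it exceeds `cutoff` (e.g. the incumbent cost z*)."""
    offered = list(offered)
    total = 0.0
    subsets = chain.from_iterable(
        combinations(offered, k) for k in range(len(offered) + 1))
    for A in subsets:
        p = 1.0                              # probability of realisation A
        for i in offered:
            p *= probs[i] if i in A else (1.0 - probs[i])
        total += p * (sum(fees[i] for i in A) + tsp_cost(A))
        if total > cutoff:
            return total, True    # pruned: partial sum already beats z*
    return total, False           # exact expected cost
```

When the cutoff fires early, most of the 2^|O| TSP evaluations are skipped.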
Another way to speed up the algorithm is the following bounding technique. Assume that we store the values of E_A[C(O)] computed during the exploration of the B&B tree, and that we are about to compute E_A[C(O)] for a new leaf node. We can first compute an upper bound without solving any TSP, as in eq. (23). If this upper bound, which we denote as z̄_leaf, is already better than the best solution z*, we can update z* ← z̄_leaf and flip a flag indicating that z* is not the objective value of an actual solution.
Only if, when exploring a future node, we find a new solution improving on such a z*, do we have to calculate the actual value of E_A[C(O)]. If, on the other hand, the value z* is never improved upon, a user might be content with the optimal set and an upper bound on its expected cost and, therefore, skip the computation of E_A[C(O)] altogether. Otherwise, a parallel implementation could dedicate a thread to computing the expected value and use the bound while exploring the rest of the tree. The following theorem proves the validity of bound (23).

Proof. For notational convenience we define m_A = ∑_{i∈A} m_i and let p_{A,O} = ∏_{i∈A} p_i · ∏_{i∈O\A} (1 − p_i) be the probability that A is the accepted set when O is the offered set. We can then split E_A[C(O ∪ {j})] into two sums, where the first sum considers the subsets of O ∪ {j} which contain j and the second sum considers those which do not. Denote with γ_{A,j} = c_{V′\A} − c_{V′\(A∪{j})} the difference between the costs of the TSPs over V′ \ A and over V′ \ (A ∪ {j}); note that γ_{A,j} ≥ 0 because adding one more vertex cannot reduce the cost of the TSP. The claimed bound then follows.

However, it seems reasonable that, even to evaluate a single objective value E_A[C(O)], parts of the solution needed to obtain c_{V′\A₁} (for some A₁ ⊆ O) could be reused to compute c_{V′\A₂} (for some other A₂ ⊆ O): intuitively, TSP tours over similar subsets are likely to have subpaths in common.
Following this observation, we devise and compare two strategies to solve the multiple TSPs required to evaluate E_A[C(O)]:
• The first strategy (Concorde) is to disregard the fact that the multiple TSPs over sets V′ \ A are related and simply solve each of them independently. To this end, we use the popular TSP solver Concorde [6]. It has been shown empirically by Mu et al. [50] that Concorde solve times grow exponentially with the number of vertices; moreover, this strategy requires solving 2^{|O|} TSPs.
• The second strategy (HeldKarp) is to use the classical Held-Karp Dynamic Programming (DP) algorithm [35] to compute c_{V′}. The DP algorithm is based on the following cost function: let X ⊆ V and t ∈ V′ \ X, and denote with H(X, t) the cost of the shortest Hamiltonian path from 0 to t in the subgraph induced by vertices {0, t} ∪ X. The cost of the optimal TSP tour over V′ is then c_{V′} = H(V, 0) (recall that V = V′ \ {0}) and the DP recursion used is H(X, t) = min_{j∈X} { H(X \ {j}, j) + c_{jt} }, with base case H(∅, t) = c_{0t}. The DP table to compute H(V, 0) has one column for each subset X ⊆ V and one row for each vertex i ∈ V′; the entry in the column indexed by X and the row indexed by i is H(X, i). Held and Karp's algorithm is not cheap to execute: its time grows as O(2^n n²) and its space (i.e., the size of the DP table) as O(2^n n). However, once the DP table is built, one can compute c_X for any {0} ⊂ X ⊆ V′ in constant time, simply by accessing the entry in the column indexed by X \ {0} and the row indexed by 0.
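A compact sketch of this DP follows. It assumes a symmetric distance matrix `dist` with the depot at index 0, and it materialises the optimal tour cost for every subset, so that each c_X can then be read in O(1) time at the O(2^n n) memory price discussed above.

```python
from itertools import combinations

def held_karp_table(dist):
    """Held-Karp DP over all subsets of V = {1, ..., n-1} with depot 0.

    Returns `tour`, a dict mapping each non-empty frozenset X of V to the
    optimal TSP tour cost over {0} union X.
    """
    n = len(dist)
    V = range(1, n)
    # H[(S, j)]: cheapest path 0 -> ... -> j visiting exactly S, with j in S
    H = {(frozenset([j]), j): dist[0][j] for j in V}
    for size in range(2, n):
        for S in map(frozenset, combinations(V, size)):
            for j in S:
                H[(S, j)] = min(H[(S - {j}, k)] + dist[k][j] for k in S - {j})
    # close each path back to the depot to obtain tour costs per subset
    tour = {}
    for size in range(1, n):
        for S in map(frozenset, combinations(V, size)):
            tour[S] = min(H[(S, j)] + dist[j][0] for j in S)
    return tour
```

After one call, `tour[X]` plays the role of c_{{0}∪X} for any non-empty X of delivery points.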
In a preliminary experiment, we used 3168 instances (see Section 5.1) with 8 to 16 delivery points, i.e., 9 to 17 TSP vertices. For each of them, we used strategies Concorde and HeldKarp to compute c_X for all subsets X of delivery points. The results, summarised in Figure 2, suggest that strategy HeldKarp is computationally less expensive but much more memory-hungry than strategy Concorde, to the point of being impractical for larger instances. For this reason, we resort to strategy Concorde in the computational experiments of Section 5.

Heuristic algorithms
We propose four heuristic strategies to explore the space of sets O ⊆ V, inspired by stepwise and bidirectional heuristics for variable selection in statistics (see, e.g., Hocking [37]): F-step (forward stepwise, which starts from O = ∅ and iteratively adds deliveries), B-step (backward stepwise, which starts from O = V and iteratively removes deliveries), and the bidirectional variants FB-bid and BF-bid, which combine forward and backward moves. During preliminary experiments, we implemented variations of the above heuristics which start from random sets O ⊆ V. Such variations did not produce any appreciable improvement over the more basic versions, so we decided to keep the latter for simplicity.
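The forward variant can be sketched as follows. This is our own reconstruction of the general stepwise scheme, with `evaluate` standing in for the exact or approximated E_A[C(O)]; the tie-breaking rule is an assumption.

```python
def f_step(deliveries, evaluate):
    """Forward stepwise sketch: start from O = empty and greedily add the
    delivery whose inclusion most reduces the (estimated) expected cost,
    stopping when no addition improves it."""
    O = set()
    best = evaluate(frozenset(O))
    while True:
        candidates = [(evaluate(frozenset(O | {i})), i)
                      for i in deliveries if i not in O]
        if not candidates:
            break
        cost, i = min(candidates)     # cheapest extension (ties by index)
        if cost >= best:
            break                     # local optimum reached
        O.add(i)
        best = cost
    return frozenset(O), best
```

The backward variant would mirror this loop starting from O = V and removing deliveries; the bidirectional variants would alternate the two moves.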

Approximation of E_A[C(O)]
As noted before, computing the objective value of a solution O is hard, because one has to evaluate function C(O, A) for each set A ⊆ O (i.e., 2^{|O|} times) and solve a TSP at each evaluation. Our hypothesis, however, is that it is possible to approximate E_A[C(O)] well while performing far fewer evaluations. This hypothesis is intuitively motivated by the existence of strong concentration inequalities both for the sum of independent Bernoulli random variables [47, Ch. 4] and for the length of the Euclidean TSP tour of points taken uniformly at random in a square [59, Ch. 2]. Because A is, indeed, a realisation of independent Bernoullis and c_{V′\A} is the length of a TSP over points defined by the realisation of A, we can expect C(O, A) to be concentrated around its expected value. The above argument is not formal (e.g., the points in V \ A are not chosen uniformly at random) but it motivates the use of methods which work best when the quantity to estimate is concentrated around its mean. In the rest of this subsection we present two approximation methods, which we use to speed up the heuristic algorithms presented above.
The first method is a Monte Carlo estimator: we approximate E_A[C(O)] with the average of C(O, A) over a sample A of accepted sets, where each set A ∈ A is built by selecting each point i ∈ O with probability p_i. By controlling the size of A, a user can trade off computation time against approximation accuracy. Preliminary experiments showed that setting |A| = 20 was already enough to provide high accuracy. In particular, we found that the corresponding estimation allows us to identify sets O whose expected cost lies within a 1% gap of the expected cost of the optimal set O_opt more than 95% of the time (over all instances and performing 50 re-runs for each instance) and is, in this sense, stable. We refer the reader to, e.g., Montemanni et al. [49] for more advanced methods of tuning parameter |A| for the related Probabilistic Orienteering Problem.
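The estimator amounts to a few lines; `tsp_cost` is again an illustrative stand-in for the c_{V′\A} evaluation, and the seeded generator is our addition for reproducibility.

```python
import random

def monte_carlo_estimate(offered, probs, fees, tsp_cost, n_samples=20, seed=0):
    """Monte Carlo estimate of E_A[C(O)]: draw each accepted set A by
    including every i in O independently with probability p_i, then
    average the realised cost C(O, A) over the sample."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        A = frozenset(i for i in offered if rng.random() < probs[i])
        total += sum(fees[i] for i in A) + tsp_cost(A)
    return total / n_samples
```

With `n_samples=20` this matches the |A| = 20 setting discussed above.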

Another useful observation is that calculating E_A[C(O)] is faster for small sets O, because there are fewer sets A ⊆ O. After defining some size-independent features of O, then, we propose to apply a Machine Learning (ML) algorithm that learns E_A[C(O)] from a training set of small sets O and then predicts it for larger sets.
We first describe the features we use as independent variables. To do so, we must introduce some notation. Let c̄(W), ĉ(W), and c̲(W) be, respectively, the largest, the average, and the smallest distance of a delivery point in set W ⊆ V from the depot, i.e., c̄(W) = max_{i∈W} c_{0i}, ĉ(W) = (1/|W|) ∑_{i∈W} c_{0i}, and c̲(W) = min_{i∈W} c_{0i}. Let m(W) be the sum of crowdsourcing fees for deliveries in W, m(W) = ∑_{i∈W} m_i. Finally, let d(W) be the diameter of W, d(W) = max_{i,j∈W} c_{ij}. We define, then, the following features: • One binary feature for each delivery point i ∈ V, with value 1 iff i ∈ O.
• One feature representing the fraction of crowdsourcing fees of offered deliveries, m(O)/m(V ).
• Three features, representing the ratios between the largest, average, and smallest distances from the depot of the delivery points in O and of all delivery points: c̄(O)/c̄(V), ĉ(O)/ĉ(V), and c̲(O)/c̲(V). • Three features, similar to those above, but referring to V \ O. • Two features, representing the ratio between the diameters of, respectively, O and V \ O, and the diameter of V. We decided to use the features above after data exploration and preliminary experimentation. One could further reduce their number by performing feature selection. However, the features we use are quick to compute when building the training set, and leaving any of them out neither decreases training time nor increases the model's accuracy significantly.
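A sketch of the feature computation follows, covering a representative subset of the list above (membership bits, fee fraction, depot-distance ratios, diameter ratios). The function signature and helper names are our own.

```python
def features(O, V, coords, fees):
    """Size-independent features of an offered set O (illustrative sketch).

    `coords` maps vertex -> (x, y), with the depot at key 0."""
    def d0(i):                       # distance of point i from the depot
        (x, y), (dx, dy) = coords[i], coords[0]
        return ((x - dx) ** 2 + (y - dy) ** 2) ** 0.5
    def diam(W):                     # diameter d(W): max pairwise distance
        return max((((coords[i][0] - coords[j][0]) ** 2 +
                     (coords[i][1] - coords[j][1]) ** 2) ** 0.5
                    for i in W for j in W), default=0.0)
    feats = {f"in_{i}": int(i in O) for i in V}          # membership bits
    feats["fee_frac"] = sum(fees[i] for i in O) / sum(fees[i] for i in V)
    dO, dV = [d0(i) for i in O], [d0(i) for i in V]
    feats["max_ratio"] = max(dO) / max(dV)
    feats["avg_ratio"] = (sum(dO) / len(dO)) / (sum(dV) / len(dV))
    feats["min_ratio"] = min(dO) / min(dV)
    feats["diam_O"] = diam(O) / diam(V)
    feats["diam_rest"] = diam(set(V) - set(O)) / diam(V)
    return feats
```

All features are ratios or indicators, so their scale does not depend on |O|, which is what allows training on small sets and predicting on larger ones.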
We test five simple and fast-to-train ML models: • the Elastic Net [65]; • a single Regression Tree [17]; • a modified regression tree, known as the M5 model [54]; • a Random Forest of regression trees [16]; • an ensemble of regression trees trained with the AdaBoost.R2 algorithm [23].

To create the training and test sets, we evaluate E_A[C(O)] for all sets O ⊆ V on 1250 instances of size 8 to 12 (see Section 5.1). We use the two-thirds smallest sets O for the training set, and the remaining one-third for the test set (we denote the test set as T in the following).
Let Ê_ML(O) denote the prediction of an ML algorithm for the expected cost of set O. We use two metrics to assess the accuracy of the models. The first is a classical measure of prediction accuracy for regression models, the mean absolute relative error (MARE) over the test sets. The second is a metric of interest when the prediction of the ML model is used as the objective function of an optimisation model: the error on the best set (EB). This is the relative difference between the cost of the set Ô_ML that the ML model identifies as the lowest-cost set in T and the cost of the actual best set. This metric is important because it measures the relative loss that a planner would incur when using Ô_ML instead of O_opt as the offered set. Figure 3 reports the distribution of MARE and EB over the 1250 test sets used in this analysis (we report detailed results in Appendix C). The median value for each model is reported in each box and visualised with a horizontal line; the rest of the box spans between the first and third quartiles. Whiskers extend to the rest of the distribution, except for outliers marked with fliers. Both metrics agree in selecting the Elastic Net as the best model out of the five we compare. We also note that all methods perform generally well, with the central quartiles of both distributions well below the 10% mark.
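The two metrics can be sketched as follows, assuming dictionaries mapping each test set O to its predicted and true expected costs (an assumed interface, not the paper's code):

```python
def mare(pred, true):
    """Mean absolute relative error over the test sets (assumed form)."""
    return sum(abs(pred[O] - true[O]) / true[O] for O in true) / len(true)

def error_on_best(pred, true):
    """Relative gap between the true cost of the set the model ranks best
    and the true optimum over the test set (assumed form of EB)."""
    o_ml = min(pred, key=pred.get)          # set the model believes is best
    o_opt = min(true, key=true.get)         # actual best set
    return (true[o_ml] - true[o_opt]) / true[o_opt]
```

Note that EB can be zero even when MARE is large: a model only needs to rank the best set first, not to predict costs exactly.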
As a result of this preliminary computational analysis, we decided to use the Elastic Net model.

Computational study
After describing the instance generation method that we used, we propose two main analyses of the results. The first aims at understanding how market environment factors, such as the willingness of customers to accept offers or the crowdsourcing fee amounts, affect the planner's profitability and environmental sustainability. The indicator of profitability that we use is the savings that the planner achieves when allowing crowdsourcing vs. serving all deliveries with the retailer's own vehicle. (The cost incurred when serving all deliveries with the own vehicle is the cost of the optimal TSP tour over all delivery points.) The indicator of sustainability is the number of miles saved by the retailer's own vehicle when crowdsourcing. This metric assumes that customers who accept to perform deliveries, and who use a carbon-emitting means of transport, only apply a minimal detour to their originally planned routes.
The second analysis focuses on the computational contribution of this paper. We test the performance of the exact and heuristic methods introduced, propose to speed up one of the heuristic methods using Monte Carlo and Machine Learning objective function estimation, and advise on which algorithms are more appropriate if the decision must take place within a few minutes.

Instance generation
We generate a large set of synthetic instances to analyse the impact of probabilities p and crowdsourcing fees m on the solutions. We consider instances with n = 8 up to n = 20 deliveries. In each of them, we place the depot at the origin of the Euclidean plane and distribute the delivery points uniformly within a radius R = 100 of the depot. In other words, for each delivery point, we generate a value r ∈ [0, R²] and an angle θ ∈ [0, 2π) uniformly at random, and then compute the coordinates of the point as x = √r cos θ, y = √r sin θ (taking the square root of a uniformly distributed r yields points uniformly distributed over the disk).
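This sampling scheme can be sketched as follows (function and parameter names are our own; the seed is added for reproducibility):

```python
import math
import random

def sample_delivery_points(n, R=100.0, seed=0):
    """Sample n points uniformly at random in a disk of radius R centred
    at the depot (the origin). Drawing r uniformly in [0, R^2] and using
    sqrt(r) as the radius compensates for the area growing quadratically
    with the distance from the centre."""
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        r = rng.uniform(0.0, R * R)
        theta = rng.uniform(0.0, 2.0 * math.pi)
        points.append((math.sqrt(r) * math.cos(theta),
                       math.sqrt(r) * math.sin(theta)))
    return points
```

Drawing the radius itself uniformly in [0, R] would instead over-sample points near the depot.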
We implemented several criteria to generate probabilities p_i and fees m_i in each instance. For each criterion we first choose a base probability p or base fee m from a given set of values, and then generate p_i and m_i as described below. The three criteria to assign probabilities to the delivery points are: • Uniform criterion: we first fix a base probability p and then draw each p_i uniformly at random in (p − 0.05, p + 0.05). We consider values of p in {0.05, 0.10, …, 0.60}; during preliminary experiments we noticed that higher values lead to solutions in which, for the considered fee values, it is consistently convenient to offer all deliveries for crowdsourcing.
• Direct proportional criterion (farther deliveries are easier to crowdsource): we fix a base probability p ∈ {0.25, 0.5} and then let each p i = p + γ i · 0.25, where γ i ∈ [−1, +1] varies proportionally with the distance between delivery point i and the depot. For a point i at distance 0 from the depot, γ i would be −1; for a point at distance 100, it would be +1.
• Inverse proportional criterion (closer deliveries are easier to crowdsource): analogous to the previous one, but γ i varies inversely proportionally with the distance between the delivery point and the depot.
We use two criteria to assign the crowdsourcing fees: • Direct proportional criterion: each m_i takes a value in (0, 100·m/n) proportional to the value of p_i compared to the minimum and maximum probabilities assigned to any delivery point. Here n is the number of delivery points and m ∈ {2.5, 2.6, …, 3.5} is a base fee parameter determining the fee paid to the customers.
• Inverse proportional criterion: analogous to the previous one, but the fee is inversely proportional to the probability.
Varying the criteria above we created a dataset of 4576 instances. For each of the 13 sizes (n = 8 to 20) we consider, in fact, 16 possible ways of generating probabilities (12 values of p for the uniform criterion, 2 for the direct proportional criterion and 2 for the inverse proportional criterion) and 22 ways of generating fees (11 values of m for each of the two criteria). This gives a total of 13 · 16 · 22 = 4576 instances, which we make available on GitHub [56].

Analysis of the solutions
With this analysis we want to understand how the instance generation parameters influence the optimal solutions to the PTSPC. These parameters are linked to properties of real-life scenarios; thus, analysing their impact can lead to managerial insights on which characteristics make crowdsourcing to customers more attractive. For example, if the market for crowdsourced LMD is offer-driven, one can imagine that higher fees would lead to higher crowdsourcing probabilities, a scenario modelled with the direct proportional fee criterion. If the market is demand-driven, the probability of crowdsourcing a delivery depends mainly on intrinsic characteristics (e.g., it is low for out-of-the-way delivery points) and a planner must offer high rewards for deliveries which are hard to crowdsource; this scenario is represented by the inverse proportional fee criterion. Analogously, one can use the direct proportional criterion for probabilities to model a supermarket which is mainly visited by customers living far away and driving a car, whereas the inverse proportional criterion can model a supermarket in the city centre, with most of the customers walking there.

Figure 4: Percentage of deliveries offered and percentage savings compared to no crowdsourcing (i.e., using the retailer's own vehicle for all deliveries), as a function of the instance generation parameters.
Next, we consider two relevant metrics and evaluate how they vary with the instance generation parameters. The first metric is the percentage of deliveries offered, 100 · |O_opt|/|V| (reported on the left y axis in Figure 4); the second is the percentage savings achieved when using crowdsourcing, relative to the cost of serving all deliveries with the retailer's own vehicle (reported on the right y axis). The two leftmost charts in Figure 4 show how the two metrics vary with parameters p and m, the base probability and base fee, respectively. The two rightmost charts report the variation of the two metrics depending on the criteria used to generate the probabilities and fees (see Section 5.1). The figure reports the average of these metrics over all instances which share the same instance generation parameters.
As expected, increasing the base probability has a large impact on the solution: it yields larger offered sets and increased savings. Increasing the crowdsourcing fees, instead, has the opposite effect. Note, however, that probabilities tend to have a higher discriminative power on the savings achieved. For example, when grouping instances by base probability, the instances corresponding to p = 0.6 give, on average, savings of more than 20%. If we group instances by base fee, though, even the instances with the lowest fees, m ∈ {2.5, 2.6}, give a more heterogeneous array of savings, which averages only circa 11%.
In demand-driven markets the price elasticity curve is often modelled as a sigmoid function (see, e.g., [9]). Because, as noticed above, probabilities of acceptance have a large impact on savings, a planner should price the crowdsourcing fees to maximise the corresponding increase in probabilities (i.e., up until the elasticity curve starts flattening). After that, one can conceive alternative methods of improving the probability of acceptance without further increasing the fees, e.g., via marketing or by providing alternative benefits. The specific relation between an increase in the fee and the corresponding increase in acceptance probability is market-specific and, to some extent, even customer-specific, because different people have different price sensitivities. Therefore, although the planner should estimate this relation based on empirical data, as a rule of thumb we notice that the same relative variation has a larger effect when applied to probabilities than when applied to fees. For example, increasing the base probability by +40% (from 0.25 to 0.35) causes an increase in savings of 42% (from 6.53% to 9.25%), while a similar +40% increase in the base fee (from 2.5 to 3.5) only decreases the savings by 15% (from 10.88% to 9.29%). As such, if the relation between offered fee and acceptance probability were roughly linear in this interval, we would obtain a net +27% increase in savings in the high-fee/high-probability scenario compared to the low-fee/low-probability one. Because the critical part of the sigmoid elasticity curve is approximately linear, such an analysis seems plausible.
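The percentage comparisons above can be checked directly (the savings figures are taken from the text):

```python
# Sanity check of the elasticity figures quoted in the text.
def pct_change(old, new):
    """Relative change, in percent, when moving from `old` to `new`."""
    return 100.0 * (new - old) / old

prob_effect = pct_change(6.53, 9.25)    # savings when p goes 0.25 -> 0.35
fee_effect = pct_change(10.88, 9.29)    # savings when m goes 2.5 -> 3.5
```

Here `prob_effect` rounds to +42%, `fee_effect` to −15%, and their sum to the net +27% quoted above.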
The two considered metrics are also stable when aggregating by the probability type and fee type parameters, with variations within 1-2 percentage points. This suggests that the advantages a planner can get by crowdsourcing last-mile deliveries are robust over the scenarios considered.
While the above analysis establishes the economic benefits of crowdsourcing last-mile deliveries, we also investigate the potential environmental benefits, in terms of miles saved. Let M_PTSPC be the length of the tour of the retailer's own vehicle in an optimal solution to the PTSPC and M_TSP be the length of the optimal Travelling Salesman tour for the same instance. We define the (percentage) miles saved as 100 · (M_TSP − M_PTSPC)/M_TSP. Note that, by definition, the miles saved only depend on the probabilities p_i and not on the amount of the fees m_i. Figure 5 shows how the miles saved vary as a function of the base probability p used during instance generation (detailed results are available in Appendix C). Each box spans the two central quartiles of the distribution over all instances with the given base probability. Whiskers extend to the rest of the distribution, except for outliers, which are marked by fliers. The figure shows that crowdsourcing can reduce the number of miles travelled by the retailer's own vehicle by between 10% and 40%. Under the assumption that customers accepting to crowdsource the deliveries need only a minimal detour from their planned route, almost all these savings translate into reduced vehicle-miles and into corresponding emissions savings.

Computational analysis

Table 1 shows an overview of the computational results, comparing different approaches to solve the PTSPC. Each column corresponds to an instance size, from 8 to 20 delivery points (here we report average values over the 352 instances of each size, while in Table 4, Appendix C, we report standard deviations for the main metrics considered), and each group of rows refers to an algorithm. We ran all experiments on a cluster, reserving one core of an Intel Xeon processor running at 1.7GHz with 4GB of RAM.

Table 1: Comparison of different approaches to solve the PTSPC: complete enumeration, the branch-and-bound algorithm presented in Algorithm 1, the four heuristics introduced in Section 4.2, and two versions of heuristic F-step in which we estimate the objective function using the Monte Carlo and Machine Learning (Section 4.3) estimators. Rows "Time (s)" report the runtime in seconds. Rows "Gap%" list the optimality gap computed using the lower and upper bounds introduced for the B&B algorithm. Rows "OptGap%" report the gap between the best solution found by the algorithm and the true optimum (the best values for each instance size are presented in bold). Rows "Closed%" list the fraction of instances for which the algorithms identified the optimal solution.
The first row lists the time needed to enumerate all sets O and, for each of them, compute E_A[C(O)]. This approach is infeasible in practice because, as expected, the time grows exponentially in n; the enumeration takes more than fourteen hours for instances with 20 delivery points. In the next groups of rows, we report performance indicators for the other algorithms. Rows "Time (s)" denote the elapsed time in seconds. Rows "OptGap%" list the percentage gap between the cost of the best solution found by the algorithm (UB) and the cost of the optimal solution (OPT). We compute the optimality gap as 100 · (UB − OPT)/UB. For the branch-and-bound algorithm we report two more quantities. Rows "Gap%" give the gap computed as 100 · (UB − LB)/UB, where LB denotes a lower bound on the objective value of the optimal solution; this is the gap that the algorithm can report to the user when it does not know the true optimal objective. Rows "Closed%" report the percentage of instances of the given size for which the gap was zero. Rows "Nodes" list the number of nodes visited during the exploration of the tree.
Next, we report the results of the branch-and-bound algorithm, which we run with a time limit of 1 hour. To highlight the importance of upper bound z̄′, we present the results both when we do not calculate this bound at the root node (rows "B&B, no z̄′") and when we do (rows "B&B"). When not using z̄′, the algorithm closes all instances up to size 12; on the other hand, it cannot solve to optimality any instance of size 17 and above. For these latter instances, the algorithm still has large gaps at the end of the runtime (rows "Gap%"). However, because we know (via enumeration) the cost of the optimal solution, we can also give a measure of the quality of the best feasible solution found within the time limit in rows "OptGap%". In this case, we see that the B&B algorithm finds high-quality solutions, all within 0.10% of the optimum on average. B&B, thus, identifies low-cost solutions but struggles to prove the optimality (or even the quality) of these solutions because of loose lower bounds. When using bound z̄′, computation times increase slightly for the smaller instance sizes. The gaps, however, improve and the algorithm closes all instances up to size 13, while it cannot solve to optimality any instance of size 18 and above. For smaller instances, which are solved within the time limit, using z̄′ allows the algorithm to explore fewer nodes before reaching the provably optimal solution. For larger instances, on the other hand, the algorithm explores more nodes before the time limit hits, because the tighter bound allows it to enter a node and prune it quickly on more occasions. In general, the gaps with respect to the optimal solution (rows "OptGap%") remain small, because they are influenced more by the lower than by the upper bounds.
To see the different quality of the bounds directly, Figure 6 shows the upper and lower bound gaps at the root node. These gaps are defined, respectively, as 100·(UB−OPT)/UB and 100·(OPT−LB)/OPT, where UB and LB are the values of the upper (z̄ and z̄′) and lower (z) bounds computed at the root node of the B&B tree. The leftmost chart aggregates instances by size n, the central one by base probability p, and the rightmost one by base fee m. As mentioned above, the figure shows that the lower bound is looser than the upper bounds and causes large gaps. It also shows that z̄′ is significantly tighter than z̄. Even if the main issue faced by the B&B algorithm is that the lower bound is weak, a tighter upper bound offers more chances to prune larger parts of the tree earlier and, thus, to speed up the tree exploration. This is, indeed, reflected in the results of Table 1. Figure 6 also shows that the quality of the upper bounds decreases for high probabilities and, to a lesser extent, for high fees. In the case of higher probabilities, in fact, more deliveries will be crowdsourced and the term c_{V′} in bound z̄ is a bad approximation of the routing costs. When fees are high, optimal solutions tend to exclude the deliveries with the highest fees from the offered set; therefore, the worst-case costs computed by z̄ and z̄′ can be very far from the optimum. The quality of the lower bound, on the other hand, improves for high probabilities and high fees. As mentioned in Section 2.2, when the probabilities of acceptance are 1 the PTSPC reduces to the PTP, which we use to compute z. It seems reasonable, then, that for higher probabilities the difference between the costs of the solutions of the PTSPC and of the PTP diminishes and the bound becomes tighter. When fees are high, both the solution to the PTSPC and the solution to the PTP used to compute the lower bound will tend to visit many delivery points with the retailer's own vehicle; in general, those with the highest fees.
This makes the two solutions look more similar (and, thus, the bounds tighter), compared to when fees are low.
One last remark about the B&B algorithm is that, while the final gaps remain low for all instance sizes, there is a sharp drop in the number of instances solved to optimality starting from size 17 (or even 16 when not using z̄′). This drop means that the B&B algorithm still finds very good solutions (small gaps), but not always exactly the optimal ones (low "Closed%"). Finally, we note how the number of nodes explored tends to increase up to instances of size 16; from size 17 on, because the exploration of each node starts to become time-consuming, the algorithm visits fewer nodes within its time limit.

Figure 6: Percentage gaps of the upper and lower bounds, compared to the optimal (or best-known) solution, at the root node of the B&B tree.
Going back to Table 1, the next four groups of rows refer to the heuristic algorithms introduced in Section 4.2. The F-step heuristic is faster than the B-step algorithm, because it tends to visit smaller sets (remember that it starts with O = ∅). On the other hand, it tends to have larger gaps for all instance sizes except 19 and 20. Because the optimal solution tends to offer more than half of the deliveries (see Section 5.2), we could indeed expect B-step to give slightly better results. In both cases, however, it is striking how the heuristic algorithms yield low gaps, with the vast majority lower than 1%. We attribute this phenomenon to the shape of the objective function of our problem which, as we observed empirically, tends to flatten once |O| reaches values close to |O_opt|. Figure 7 visualises this phenomenon. The figure reports the gap between the objective value of generic sets O and that of the optimal set O_opt, averaged over all sets O of sizes between |O_opt| − 5 and |O_opt| + 5 and over all instances. Note that the median gap for size difference 0, i.e., for sets O of the same size as O_opt, lies well below 5%. Considering that the heuristics build sets which are locally optimal, in light of Figure 7 it is less surprising that they manage to keep the average gaps below 1%. To summarise, it seems important to determine how many deliveries should be offered for crowdsourcing; once this is established, one can obtain further savings by carefully choosing which customers to crowdsource. This is a fact which we can exploit to devise further heuristics, as we discuss in Section 6. Table 1 also shows that there is a sharp drop in the number of instances for which the heuristic algorithms find the optimal solution (rows "Closed%") once the instance size reaches 18.
Regarding run times, even the faster of the two stepwise heuristics (F-step) takes an average of roughly 20 minutes for the larger instances and can be impractical to use in real-life decision support tools. Using the bidirectional heuristics BF-bid and FB-bid increases the run times even further, while providing little improvement in terms of solution quality. Therefore, in the last two groups of rows, we focus on speeding up the solution times: because the F-step heuristic is the fastest of the four and produces high-quality solutions, we use it as a base and replace the exact evaluation of E_A[C(O)] with its estimation using the Monte Carlo (algorithm F-MC20) and Machine Learning (algorithm F-ML) estimators. For consistency with the other results, OptGap% uses the true value of the objective function of the set returned by the algorithms (calculated a posteriori), even if the algorithms themselves use its approximation.

Figure 7: Percentage gap between the objective value of sets O and that of O_opt, for all sets O ⊆ V whose size is between −5 and +5 compared with the size of O_opt. In other words, for all instances, we consider all sets O such that |O_opt| − 5 ≤ |O| ≤ |O_opt| + 5 and compare their cost with the cost of O_opt. For each size difference, the boxes span between the first and the third quartile, with the central horizontal line denoting the median value. Whiskers extend to the rest of the distribution (omitting outliers).
For the F-MC20 algorithm, we use a sample size of 20 (i.e., |A| = 20 in the notation of Section 4.3) which allows us to speed up the run time of the algorithm considerably (almost always under two seconds) while maintaining a comparable solution quality. Such speed-up makes this algorithm suitable for interactive decision support tools, in which the decision maker can experiment and perform scenario analysis varying instance parameters such as the fee that the planner is willing to offer.
For the F-ML algorithm we took the following approach. We allocate the first five minutes of run time to building the training set; that is, during this period the algorithm computes the exact expected cost of the smaller sets O. If the algorithm completes within the first five minutes, it is equivalent to F-step (the corresponding values are in grey in Table 1). Otherwise, if sets O remain to be explored after this time, we train the Machine Learning model on the data collected during the first five minutes and then use the model to estimate the objective value of the larger sets. Because the Elastic Net model selected in Section 4.3.2 trains in a fraction of a second, this approach allows for roughly constant run times of about five minutes, even for instances larger than the ones considered in this study.
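The surrogate-model idea behind F-ML can be sketched with a small, pure-Python Elastic Net trained by coordinate descent. The feature design, hyperparameters and function names below are illustrative assumptions, not the configuration selected in Section 4.3.2; in practice one would fit the model on pairs of set features and exact expected costs collected during the first five minutes, then predict the cost of larger sets.

```python
def soft_threshold(x, t):
    """Soft-thresholding operator used by the L1 part of the penalty."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0

def elastic_net_fit(X, y, alpha=0.01, l1_ratio=0.5, n_iter=300):
    """Coordinate-descent Elastic Net on a plain list-of-lists feature
    matrix X; minimises (1/2n)||y - Xw - b||^2 plus the mixed penalty."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    b = sum(y) / n  # start the intercept at the mean target
    for _ in range(n_iter):
        for j in range(p):
            rho, zj = 0.0, 0.0
            for i in range(n):
                pred = b + sum(w[k] * X[i][k] for k in range(p))
                # partial residual: remove feature j's own contribution
                rho += X[i][j] * (y[i] - pred + w[j] * X[i][j])
                zj += X[i][j] ** 2
            rho /= n
            denom = zj / n + alpha * (1.0 - l1_ratio)
            w[j] = soft_threshold(rho, alpha * l1_ratio) / denom
        # refit the intercept on the current residuals
        b = sum(y[i] - sum(w[k] * X[i][k] for k in range(p)) for i in range(n)) / n
    return w, b

def elastic_net_predict(w, b, x):
    return b + sum(wj * xj for wj, xj in zip(w, x))
```

Because training is this cheap, the five-minute budget is dominated entirely by the exact evaluations used to build the training set, which is what makes the overall run time of F-ML roughly constant.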
The instance sizes for which F-step took longer than five minutes were 19 and 20. For these, the F-ML algorithm has run times close to 5:10 minutes and gives solutions of the same quality as the F-step algorithm. Note that for instances of size 20, the average "OptGap%" is even smaller for F-ML than for F-step, indicating that choosing the customer to add using the estimated objective value instead of the true one can ultimately yield a better set of offered customers. Analogously, "Closed%" is smaller for both sizes 19 and 20. Since the run time of this heuristic remains stable (and reasonably small) regardless of instance size, and the solutions produced are of high quality, it can be used in a non-interactive decision support tool which runs in the background roughly five minutes before the "end of day" period starts at the supermarket.

Conclusions and future research
In this paper we introduced the problem of determining the subset of last-mile deliveries that a company should open to crowdsourcing at the end of the day. We placed this problem in the context both of the optimisation of the last segment of the retail supply chain and of TSP variants in which not all customers are visited.
We proposed a branch-and-bound algorithm which has the advantage of providing optimality gaps. The gaps are high for larger instances, even with a time limit of one hour. However, the quality of the solutions found by the algorithm seems higher than the gaps suggest, because poor lower bounds inflate the gaps. Therefore, in future work, we plan to devise tighter lower bounds which can speed up the exploration of the B&B tree, provide stronger optimality guarantees for the primal solutions and allow larger instances to be solved. We note that exact algorithms for stochastic routing problems are still limited to small instances. For example, the algorithm of Laporte, Louveaux, and Mercure [42] (still the state-of-the-art exact algorithm for the PTSP) solves instances with 50 customers, but only 5 of them are stochastic. Even recent B&B algorithms for the PTSP were tested on instances of up to 10 customers [46], 18 customers [4], and 30 customers [3].
We also proposed four heuristic algorithms which explore only a small portion of the solution space. Surprisingly, even simple heuristics such as the forward stepwise F-step consistently give solutions within 1% of the optimum. We attribute this pleasant property to the shape of the objective cost landscape, which is almost flat across offered sets of the same size, for reasons similar to those discussed in Section 4.3 for the concentration of E_A[C(O)] around its mean. Using this information, we plan to investigate two-stage heuristics in which we first determine a good size for the offered set and quickly produce a feasible solution of that size, and then search among offered sets of the given size to further lower the solution cost. We also plan to investigate the performance of our heuristics on instances with a radically different topology, for example instances in which customers are strongly clustered.
Finally, we proposed two methods to approximate the objective function of our problem. Because computing the cost of one solution involves solving an exponential number of TSPs, such approximations give dramatic speed-ups in run times. The Monte Carlo simulation method runs in under two seconds, and the Machine Learning method gives an almost constant-time algorithm, whose time limit can be set by the user (we used five minutes in our experiments). Interestingly, these approximations have little impact on solution quality, compared with the heuristic algorithm to which we applied them. For example, using the F-MC20 algorithm, one could build a real-time decision support tool which enables decision makers to perform extensive scenario analyses.
An analysis of synthetic instances shows that crowdsourcing deliveries has the potential both to achieve savings and to reduce the total miles travelled by vehicles, contributing to a more sustainable last-mile supply chain. Moreover, using customers as occasional drivers can reduce the negative effects of current outsourcing policies, increase trust and promote social engagement. In the future, we also plan to extend our study to applications beyond LMD, e.g., to social care problems in which pharmacies ask their customers to deliver medicines to their elderly neighbours.

Table 2 details the data for the comparison of MARE (Mean Absolute Relative Error) and EB (Error on the Best set) for the five Machine Learning models introduced in Section 4.3.2. The values in the table refer to the same data as Figure 3. Column "Model" reports the name of the Machine Learning model, while columns "Mean", "Median" and "StdDev" list, respectively, the mean, the median and the standard deviation of the two metrics (MARE and EB). All results are in percentage.

Table 2: Detailed results on the comparison of the MARE and EB error metrics for the five considered Machine Learning models. The corresponding distributions are depicted graphically in Figure 3. All values are in percentage.

Table 3 contains a detailed breakdown of the impact of the instance generation parameter p on the miles saved in the optimal solution of the PTSPC. These data are also reported graphically in Figure 5. Column "Base prob" lists the values of parameter p. Columns "Mean", "Median" and "StdDev" list, respectively, the mean, the median and the standard deviation of the percentage of miles saved when using the optimal PTSPC solution instead of the TSP solution (which corresponds to no crowdsourcing).

Table 3: Detailed results on the miles saved as a function of the base probability p used during instance generation. The corresponding distributions are depicted graphically in Figure 5. All values are in percentage, relative to the TSP tour length.

Table 4: Standard deviations for the "OptGap%" values of Table 1. Each column refers to one of the eight algorithms considered; each row refers to the results produced by that algorithm on the instances of the given size.