Bidirectional labeling for solving vehicle routing and truck driver scheduling problems

Abstract This paper studies the vehicle routing and truck driver scheduling problem where routes and schedules must comply with hours of service regulations for truck drivers. It presents a backward labeling method for generating feasible schedules and shows how the labels generated with the backward method can be combined with labels generated by a forward labeling method. The bidirectional labeling is embedded into a branch-and-price-and-cut approach and evaluated for hours of service regulations in the United States and the European Union. Computational experiments show that the resulting bidirectional branch-and-price-and-cut approach is significantly faster than unidirectional counterparts and previous approaches.


Introduction
In long-distance haulage, truck drivers must comply with hours of service regulations mandating minimal requirements concerning breaks and rest periods. This paper studies the vehicle routing and truck driver scheduling problem (VRTDSP) which is a variant of the well-known vehicle routing problem with time windows in which hours of service regulations must be complied with.
Approaches for solving vehicle routing and truck driver scheduling problems have to ensure that all truck drivers take regular breaks and rest periods as mandated by respective hours of service regulations. This is usually achieved by evaluating routes using forward labeling methods, where labels represent possible states of a truck driver after conducting some sequence of activities. These activities include breaks, rest periods, driving periods, and other periods in which the driver is working. For each activity conducted by the truck driver, the label is updated using a so-called resource extension function (REF). Determining a feasible truck driver schedule for a given route is a computationally expensive task, because for most hours of service regulations studied in the literature, no polynomial complexity bound is known and the number of route evaluations is usually huge.
* This paper extends and replaces a previous unpublished working paper which focused on hours of service regulations in the United States.
In classical vehicle routing the evaluation of routes requires little computational effort. Forward labeling methods, which extend a label from one customer to another with a simple REF, can often be easily turned into backward labeling methods by simply reversing the orientation of the arcs on which the REF is applied (see Figure 1). For cumulative constraints, e.g., capacity constraints, the forward REF $f_{nm}$ and the backward REF $g_{nm}$ can extend the respective label attributes in an identical way. For other constraints, e.g., time windows, label attributes can often be extended by $f_{nm}$ and $g_{nm}$ in a very similar way, i.e., the backward REF is a simple inversion of the forward REF (see Irnich 2008). This ease of reversing the direction allows the use of bidirectional labeling approaches (Righini and Salani 2006). The benefit of bidirectional approaches is that forward and backward labels do not need to be propagated for the entire route. Instead, forward and backward labels can be propagated only up to a so-called half-way point, thus limiting the overall number of labels created and reducing the respective computational burden.
If hours of service regulations must be considered, a single REF between a pair of customers $n$ and $m$ does not suffice because the different driver activities, such as driving periods, breaks, and rests, must be explicitly modeled and require dedicated REFs. Furthermore, due to the asymmetry of hours of service regulations, forward and backward labeling methods cannot use the same REFs for forward and backward label propagation. Up to now, it was impossible to leverage the potential of bidirectional approaches for the VRTDSP because it was unclear how to design a backward labeling method for route evaluations subject to hours of service regulations.
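For the classical setting without hours of service regulations, the near-symmetry of forward and backward REFs can be sketched as follows. This is only an illustration under simplifying assumptions: the names `Customer`, `forward_ref`, `backward_ref`, and `can_merge` are hypothetical, a label consists of just a time and a load, and the feasibility of the individual labels with respect to the time windows is not checked.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    t_min: float    # earliest start of service
    t_max: float    # latest start of service
    service: float  # service duration
    demand: float


def forward_ref(label, n, m, d_nm):
    """Extend (earliest service start at n, load up to n) along arc (n, m)."""
    time, load = label
    return max(time + n.service + d_nm, m.t_min), load + m.demand


def backward_ref(label, n, m, d_nm):
    """Extend (latest service start at m, load from m on) against arc (n, m)."""
    time, load = label
    return min(time - n.service - d_nm, n.t_max), load + n.demand


def can_merge(fw, bw, node, capacity):
    """Both labels sit at `node`, so its demand is contained in both loads."""
    return fw[0] <= bw[0] and fw[1] + bw[1] - node.demand <= capacity


# Illustrative data (assumed): route a -> b -> c with driving times 2 and 3.
a = Customer(0.0, 10.0, 1.0, 2.0)
b = Customer(5.0, 8.0, 1.0, 3.0)
c = Customer(0.0, 20.0, 1.0, 1.0)

fw_b = forward_ref((a.t_min, a.demand), a, b, 2.0)   # forward label at b
bw_b = backward_ref((c.t_max, c.demand), b, c, 3.0)  # backward label at b
print(fw_b, bw_b, can_merge(fw_b, bw_b, b, capacity=10.0))
```

The load is extended identically in both directions, while the time attribute is extended by a mirrored rule (`max` with the earliest start versus `min` with the latest start). With hours of service regulations this simple mirroring is no longer available.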
This paper presents backward labeling methods for the US truck driver scheduling problem (US-TDSP) and the EU truck driver scheduling problem (EU-TDSP), which are the problems of determining a sequence of driver activities allowing a truck driver to visit all customer locations in a route within given time windows and without violating hours of service regulations in the United States (US) and the European Union (EU). We show how backward labels can be combined with forward labels and present a bidirectional labeling method for the US- and EU-TDSP.
The bidirectional labeling method can be used within heuristic and exact approaches for solving the VRTDSP. In heuristic approaches, routes are usually modified using neighborhood operators which make minor changes to one or several routes. Here, solution approaches based solely on forward labeling methods have a significant disadvantage: whenever a route is changed, feasibility of the new route can only be validated after calculating new labels for all customer locations in the route from the first change until the end of the route. The bidirectional approach presented in this paper makes it possible to re-evaluate a modified route by updating labels only locally, i.e., from the first change until the last change in the route, and by merging forward and backward labels accordingly. This can speed up the evaluation of modified routes significantly.
Exact approaches for vehicle routing problems are often based on column generation (CG, see Desaulniers et al. 2005) where new routes are generated solving a shortest path problem with resource constraints (SPPRC, see Irnich and Desaulniers 2005). The bidirectional approach presented in this paper can be used to speed up the solution process for the SPPRC. We present such an approach, more precisely, we present a branch-and-price-and-cut (BPC) algorithm for the solution of the VRTDSP. Computational experiments demonstrate that bidirectional labeling significantly speeds up the solution process.
Another noteworthy contribution of this paper is that our bidirectional labeling approach is very effective when minimizing cost functions containing a duration-related component. As Regulation (EC) No 561/2006 demands that transport companies do not give drivers any payment related to distances traveled, realistic cost models for European transport operators must consider labor costs that are not based on distances. A realistic model of costs in the EU considers both distance-related costs, such as fuel costs, and duration-related costs, such as labor costs. In order to calculate the realistic costs of a route, a truck driver schedule with minimal duration must be found. Finding such a truck driver schedule can be very time-consuming in the presence of hours of service regulations, and previous unidirectional approaches (Goel 2018) have performed very poorly in this regard. The BPC algorithm with bidirectional labeling presented in this paper is particularly well suited for such problems and can be used to minimize the weighted sum of distance- and duration-related costs.
The remainder of this paper is organized as follows: Section 2 briefly reviews the related literature. In Section 3, hours of service regulations in the US and EU are summarized. Section 4 presents the bidirectional labeling approach for the US- and EU-TDSP. We propose new backward labeling methods and show how forward and backward labels can be combined to find feasible truck driver schedules. In Section 5, we present our BPC algorithm for the exact solution of the VRTDSP and show how the bidirectional labeling approach can be used to solve the arising subproblem, which is an SPPRC considering hours of service regulations. Section 6 presents computational results for the BPC algorithm, and concluding remarks are given in Section 7.

Related work
The performance of exact and heuristic algorithms for solving vehicle routing problems strongly depends on the effort required to evaluate the feasibility of routes or partial routes and their contribution to the objective function. For many constraints typically found in vehicle routing problems, route evaluations can be done very efficiently (Vidal et al. 2014, Campbell and Savelsbergh 2004).
However, in the presence of hours of service regulations, determining whether a truck driver schedule complying with the regulations exists is a non-trivial and time-consuming task. Archetti and Savelsbergh (2009) were the first to present a polynomial time approach for checking compliance of a route with the US hours of service regulations in force at that time. Goel and Kok (2012) show that the problem can be solved in $O(k^2)$ time, where $k$ is the number of locations visited by the route.
However, with the change in regulations in 2013, these approaches have become obsolete. So far no polynomial complexity bound is known for route evaluation subject to hours of service regulations in the United States and the European Union (Goel 2010, 2014). Furthermore, when searching for feasible truck driver schedules with minimal duration, the computational effort is significantly higher (Goel 2012).
Early approaches for solving vehicle routing problems in the presence of hours of service regulations tried to reduce the computational effort by using simple heuristics for route evaluations (Xu et al. 2003, Zäpfel and Bögl 2008). For example, Xu et al. (2003) evaluate routes by iterating over possible starting times at the first location and determining a unique schedule for each starting time by following the constraints imposed by US hours of service regulations. Among the feasible schedules for the different starting times, the schedule with the smallest costs is selected. Goel (2009) proposes a forward labeling method for EU hours of service regulations considering alternative break and rest schedules and shows that significantly better solutions can be found compared to using simple heuristics. Similarly, Kok et al. (2010) present a forward labeling method for EU hours of service regulations considering additional provisions of the regulations. As forward labeling algorithms suffer from the effect that progressively more labels must be created, extended, and stored when the length of the routes increases, Kok et al. (2010) propose to restrict the number of alternative labels to be considered by constant values. Prescott-Gagnon et al. (2010) propose to reduce the computational effort of route evaluations by determining lower and upper bounds on label attributes using forward and backward approaches. A heuristic labeling algorithm is then used to determine a feasible schedule within the tightened bounds. Besides using heuristic labeling methods and constraints on the number of alternative labels, Goel and Vidal (2014) use lower bounds on the duration to travel a given distance in order to avoid the extension of labels to customers that cannot be reached within their time windows. The approaches presented by Goel (2009), Kok et al. (2010), and Prescott-Gagnon et al. (2010) focus on minimizing the total distance, whereas Goel and Vidal (2014) minimize the duration of each route only in a post-processing step. Rancourt et al. (2013) propose a forward labeling approach where schedule durations are considered and heuristic dominance rules are used to speed up the solution process.
For a rich vehicle routing problem with simplified break and rest requirements, Ceselli et al. (2009) present a bidirectional dynamic programming approach capable of finding optimal solutions for small scale instances. The first exact approach for hours of service regulations in the United States and the European Union is presented by Goel and Irnich (2017). Goel (2018) extends the unidirectional approach of Goel and Irnich (2017) to accommodate additional national rules within the European Union and duration-related costs. Although the algorithm does not manage to find optimal solutions for many of the 25-customer instances, the best solutions found within the runtime limit of one hour demonstrate that cost savings can be significant when using realistic cost functions based on distance and duration.

Hours of service regulations
This section describes the most important rules of hours of service regulations in the United States and the European Union for a planning horizon of one week.

United States
Hours of service regulations in the United States are imposed by the Federal Motor Carrier Safety Administration (2011). According to these regulations, a driver must not drive for more than 11 hours without taking a rest period of at least 10 consecutive hours. The regulation prohibits a driver from driving after 14 hours have elapsed since the end of the last rest period. Furthermore, no driving is allowed if 8 hours have elapsed since the end of the last rest or break period of at least 30 minutes.
Lastly, drivers may not be on duty for more than 60 hours in 7 days or 70 hours in 8 days.
Hours of service regulations in the United States do not constrain how drivers are financially compensated and most truck drivers are paid by how many miles they have driven (Bureau of Labor Statistics, U.S. Department of Labor 2019).

European Union
Most member states of the EU have night time definitions of four or seven hours duration starting between 20.00h and midnight and ending between 4.00h and 7.00h. A night time definition from 20.00h to 7.00h covers all of the different night time definitions in the EU. In all member states of the EU, the daily working time limit that applies for drivers performing night work is significantly smaller than the amount of work that can legally be conducted during a day. Therefore, we assume in the remainder that drivers take a rest in every night and do not perform night work.
The amount of driving and the amount of working within a week is restricted to at most 56 and 60 hours, respectively.

Overview of parameters
The main parameters of hours of service regulations in the United States and the European Union are summarized in Table 1.
Table 1 Parameters of hours of service regulations in the United States and the European Union

Notation          US            EU         Description
$t^{rest}$        10h           11h        The minimum duration of a rest period
$t^{rest|1st}$    -             3h         The minimum duration of the first part of a rest period taken in two parts
$t^{rest|2nd}$    -             9h         The minimum duration of the second part of a rest period taken in two parts
$t^{break}$       1/2h                     The minimum duration of a break
$t^{break|1st}$   -             1/4h       The minimum duration of the first part of a break taken in two parts
$t^{break|2nd}$                            The minimum duration of the second part of a break taken in two parts
$t^{drive|R}$     11h           9h         The daily driving time limit
$t^{drive|B}$     -             4 1/2h     The maximum driving time without a break
$t^{work|B}$      -             6h         The maximum amount of work time without a break
$t^{drive|W}$     -             56h        The maximum amount of driving time between weekly rest periods
$t^{work|W}$      60h or 70h    60h        The maximum amount of working time between weekly rest periods
$t^{elapsed|B}$   8h            -          The maximum time after the end of the last break or rest period until which a driver may drive
$t^{elapsed|R}$   14h           -          The maximum time after the end of the last rest period until which a driver may drive
$t^{day}$         24h           24h        The duration of a day
$t^{night}$       -             4h-11h     The duration of the time considered as night time
$t^{dusk}$                                 The time of the day at which night time begins
$t^{dawn}$                                 The time of the day at which night time ends

Bidirectional labeling
Optimizing vehicle routes subject to hours of service regulations requires a methodology to validate compliance of all routes with the regulations. Furthermore, if total costs are related to the schedule duration, for example, if labor costs are related to time, the cost of performing a route can only be determined if all activities conducted by the truck driver and their durations are known.
The problem of validating compliance of a given route with hours of service regulations is a truck driver scheduling problem (TDSP), which is the problem of determining a sequence of driver activities allowing a truck driver to visit a given sequence of locations $(n_1, n_2, \dots, n_k)$ in such a way that the cumulative duration of all driving activities between each pair of locations $n_i$ and $n_{i+1}$ for $1 \le i < k$ matches the given driving time $d_{n_i,n_{i+1}}$, that at each location $n_i$ for $1 \le i \le k$ a stationary activity of a given duration $s_{n_i}$ is conducted and begins within a given time window $[t^{min}_{n_i}, t^{max}_{n_i}]$, and that the sequence of all driver activities complies with applicable hours of service regulations.
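The first two requirements of this definition (driving times and time windows) can be checked independently of the regulations. The following sketch illustrates them; the names `Location` and `matches_route`, the activity kinds, and the example data are assumptions made for illustration, and compliance with hours of service regulations is deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    t_min: float    # start of the time window
    t_max: float    # end of the time window
    service: float  # duration s of the stationary activity


def matches_route(route, driving_times, activities, start=0.0):
    """Check driving times and time windows of a schedule for a route.

    route:          list of Location objects (n_1, ..., n_k)
    driving_times:  driving_times[i] is the driving time from route[i] to route[i+1]
    activities:     list of (kind, duration); every "work" activity is taken to be
                    the stationary activity at the next unserved location
    """
    time = start
    served = 0    # number of locations already served
    driven = 0.0  # driving time accumulated since the last served location
    for kind, duration in activities:
        if kind == "drive":
            driven += duration
        elif kind == "work":
            if served >= len(route):
                return False
            loc = route[served]
            required = driving_times[served - 1] if served > 0 else 0.0
            if abs(driven - required) > 1e-9:
                return False  # driving time to this location does not match
            if abs(duration - loc.service) > 1e-9:
                return False  # wrong service duration
            if not (loc.t_min <= time <= loc.t_max):
                return False  # service does not begin within the time window
            served += 1
            driven = 0.0
        time += duration
    return served == len(route) and driven == 0.0


# Assumed example: 1h loading, 11h of driving with a break, 3h waiting, 1h unloading.
route = [Location(0.0, 0.0, 1.0), Location(15.5, 15.5, 1.0)]
schedule = [("work", 1.0), ("drive", 7.0), ("break", 0.5),
            ("drive", 4.0), ("idle", 3.0), ("work", 1.0)]
print(matches_route(route, [11.0], schedule))
```

A complete TDSP check would combine such a test with the regulation-specific conditions discussed in the remainder of this section.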
Forward and backward labeling methods can be used to modify appropriate labels along a trip between customers $n$ and $m$ using the networks and REFs illustrated in Figure 2 (see Goel and Irnich 2017 and Goel 2018). In backward labeling we can similarly define REFs for backward label propagation. Unlike in classical vehicle routing, backward REFs $g^a_\Delta$ for vehicle routing and truck driver scheduling may substantially differ from their forward counterparts $f^a_\Delta$. The difficulty in developing efficient labeling methods for vehicle routing and truck driver scheduling stems from the fact that it is often not possible to decide on the best driver activity $a$ to be conducted next. In the remainder of this section, we present backward labeling methods for hours of service regulations in the United States and the European Union and show how labels generated with forward and backward methods can be combined.

United States
Before presenting a backward labeling method for hours of service regulations in the United States, let us illustrate the asymmetry of the regulations on a simple example. Figures 3a and 3b show two symmetric schedules that could be obtained by forward and backward approaches.
In the schedule illustrated in Figure 3a, the driver starts fully rested with a work activity for loading the vehicle. After 7 hours of driving, a break is required because a driver is not allowed to drive if a total of 8 hours have elapsed without a break or rest. After the break, the driver continues driving for another 4 hours, after which the destination is reached. Due to strict time requirements on the time of loading and unloading activities in this example, the driver must wait for 3 hours before unloading the vehicle. This schedule complies with hours of service regulations in the United States.
The schedule illustrated in Figure 3b is the symmetric counterpart of the schedule illustrated in Figure 3a. Again, the driver starts fully rested with a work activity for loading the vehicle. After 3 hours of waiting, the driver drives for 4 hours before a break is required. After the break, the driver continues driving for another 7 hours before unloading the vehicle. This schedule, however, violates hours of service regulations in the United States, because the driver is driving after 14 hours have elapsed without a rest.
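The asymmetry in this example can be reproduced with a small check of the three driving-related US rules. This is a minimal sketch, not the labeling method of the paper: the function name `first_violation` is hypothetical, the loading and unloading durations of one hour are an assumption (chosen so that a break becomes due after 7 hours of driving), and the weekly on-duty limit is ignored.

```python
T_DRIVE_R = 11.0    # maximum driving time between two rest periods
T_ELAPSED_R = 14.0  # no driving after 14h have elapsed since the last rest
T_ELAPSED_B = 8.0   # no driving after 8h have elapsed since the last break or rest
T_BREAK = 0.5       # minimum duration of a break
T_REST = 10.0       # minimum duration of a rest


def first_violation(schedule):
    """Return the name of the first violated limit, or None if there is none."""
    since_rest = since_break = driven = 0.0
    for kind, duration in schedule:
        if kind == "rest" and duration >= T_REST:
            since_rest = since_break = driven = 0.0
            continue
        if kind == "break" and duration >= T_BREAK:
            since_rest += duration
            since_break = 0.0
            continue
        # Driving, working, and waiting all let the clocks run on.
        since_rest += duration
        since_break += duration
        if kind == "drive":
            driven += duration
            if driven > T_DRIVE_R:
                return "drive|R"
            if since_rest > T_ELAPSED_R:
                return "elapsed|R"
            if since_break > T_ELAPSED_B:
                return "elapsed|B"
    return None


# Schedule of Figure 3a, with assumed 1h loading and 1h unloading.
schedule_a = [("work", 1.0), ("drive", 7.0), ("break", 0.5),
              ("drive", 4.0), ("wait", 3.0), ("work", 1.0)]
# Schedule of Figure 3b: the same activities in reverse order.
schedule_b = list(reversed(schedule_a))

print(first_violation(schedule_a))
print(first_violation(schedule_b))
```

In the reversed schedule the second driving period ends 15.5 hours after the start, which exceeds the 14-hour limit, whereas in the original schedule the last driving period ends after 12.5 hours.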

Figure 3 Symmetric schedules
Given the asymmetry of the regulations, we cannot simply invert the forward REFs of Goel and Irnich (2017) when developing a backward method. This section describes a backward labeling method for the US-TDSP. For the sake of conciseness, the following presentation focuses on the description of the backward labeling method and details concerning the underlying reasoning are provided in the Appendix. Similar to the forward labeling method by Goel and Irnich (2017), the backward labeling method represents the state of the driver by a multi-dimensional label $l_\leftarrow$. The index $\leftarrow$ is used to highlight that the label belongs to the backward method. Similarly, we will later use an index $\rightarrow$ indicating labels belonging to a forward method. The attributes of a backward label $l_\leftarrow = (l_\leftarrow^{time}, l_\leftarrow^{trip}, l_\leftarrow^{work|W}, l_\leftarrow^{drive|R}, l_\leftarrow^{elapsed|R}, l_\leftarrow^{elapsed|B}, l_\leftarrow^{earliest|R}, l_\leftarrow^{earliest|B})$ can be interpreted as follows:
$l_\leftarrow^{time}$ represents the start time of the earliest activity,
$l_\leftarrow^{trip}$ represents the remaining driving time on the trip to the previous customer,
$l_\leftarrow^{work|W}$ represents the accumulated working time,
$l_\leftarrow^{drive|R}$ represents the accumulated driving time preceding the next rest,
$l_\leftarrow^{elapsed|R}$ represents the time elapsed until the end of the last driving activity preceding the next rest,
$l_\leftarrow^{elapsed|B}$ represents the time elapsed until the end of the last driving activity preceding the next break or rest,
$l_\leftarrow^{earliest|R}$ represents the earliest possible time at which the last driving activity preceding the next rest must be completed,
$l_\leftarrow^{earliest|B}$ represents the earliest possible time at which the last driving activity preceding the next break or rest must be completed.
A label representing the state of a driver who ends service at location $n_k$ can be used in a labeling method as an initial label, which is then changed using the REFs presented below.
Resource extension functions. Given a backward label representing the driver state at location $m$, the possible driver states at location $n$ can be calculated by finding a path through the backward network shown in Figure 2. Table 2 shows how label attributes $l_\leftarrow$ are updated to $\hat{l}_\leftarrow$ by the REFs related to driving and other work. Blank entries indicate that the resource value is kept. Table 3 shows how the REFs related to off-duty periods update label attributes.
Feasibility conditions. Whether $g^a_\Delta(l_\leftarrow)$ complies with US hours of service regulations can be determined based on the attribute values of $l_\leftarrow$. In order to only generate labels complying with the regulations, the feasibility conditions given in Table 4 must be satisfied when using the corresponding REFs.
Dominance. We can use dominance rules to reduce the number of alternative labels to be considered. Given two feasible labels $l_\leftarrow$ and $\tilde{l}_\leftarrow$ which both represent a driver state at the beginning of the partial route $(n_i, n_{i+1}, \dots, n_k)$ with $1 \le i \le k$, we write $l_\leftarrow \preceq \tilde{l}_\leftarrow$ if $l_\leftarrow^{j} \le \tilde{l}_\leftarrow^{j}$ for all $j \in \{trip, work|W, drive|R, elapsed|R, elapsed|B, earliest|R, earliest|B\}$ and $l_\leftarrow^{time} \ge \tilde{l}_\leftarrow^{time}$. If $l_\leftarrow \preceq \tilde{l}_\leftarrow$, then we also have $g(l_\leftarrow) \preceq g(\tilde{l}_\leftarrow)$ for each REF $g$, because the REFs are non-decreasing in all resources. Hence, $l_\leftarrow$ dominates $\tilde{l}_\leftarrow$ and $\tilde{l}_\leftarrow$ can be discarded from the set of labels to be updated.
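The dominance relation stated above translates directly into a componentwise comparison. The sketch below assumes a plain container `BackwardLabelUS` with the eight attributes of a US backward label (field names are illustrative renderings of the attribute names) and a hypothetical helper `pareto_filter` that keeps only non-dominated labels.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class BackwardLabelUS:
    time: float
    trip: float
    work_W: float
    drive_R: float
    elapsed_R: float
    elapsed_B: float
    earliest_R: float
    earliest_B: float


# All attributes except `time` must be smaller or equal for dominance.
MINIMIZED = ("trip", "work_W", "drive_R", "elapsed_R",
             "elapsed_B", "earliest_R", "earliest_B")


def dominates(a, b):
    """True if label a dominates label b (a is at least as good everywhere)."""
    if a.time < b.time:  # a later start time is better in a backward label
        return False
    return all(getattr(a, name) <= getattr(b, name) for name in MINIMIZED)


def pareto_filter(labels):
    """Keep only labels that are not dominated by another label."""
    kept = []
    for cand in labels:
        if any(dominates(k, cand) for k in kept):
            continue
        kept = [k for k in kept if not dominates(cand, k)]
        kept.append(cand)
    return kept


# Illustrative labels (values are made up).
a = BackwardLabelUS(10.0, 0.0, 5.0, 3.0, 4.0, 2.0, 6.0, 7.0)
b = replace(a, time=9.0, work_W=6.0)    # worse than a in two attributes
c = replace(a, time=12.0, drive_R=4.0)  # better time, worse driving time
print(pareto_filter([b, a, c]) == [a, c])
```

Labels `a` and `c` are incomparable and both survive, while `b` is discarded because `a` starts later and has accumulated less work.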
In the Appendix it is shown which sequences of driver activities are dominated by others. Based on these findings we can find conditions telling us when a REF is inferior to another. For this we extend our labels by an additional attribute $l_\leftarrow^{last}$ indicating the last activity scheduled. Table 5 provides an overview of inferiority conditions.
Figure 4 gives an example of how a schedule for the example in Figure 3 can be found using the backward REFs. The schedule illustrated in Figure 4a is a partial schedule obtained by applying a sequence of backward REFs to the label $l_\leftarrow$ indicating the initial driver state. Exploiting our inferiority conditions, we only used the maximum possible driving time and minimum allowed break duration.
Combining forward and backward labels. Following Goel and Irnich (2017), a forward label can be represented by $l_\rightarrow = (l_\rightarrow^{time}, l_\rightarrow^{trip}, l_\rightarrow^{work|W}, l_\rightarrow^{drive|R}, l_\rightarrow^{elapsed|R}, l_\rightarrow^{elapsed|B}, l_\rightarrow^{latest|R}, l_\rightarrow^{latest|B})$, where the label attributes are interpreted as in Goel and Irnich (2017). The US-TDSP for a given route $(n_1, n_2, \dots, n_k)$ can now be solved by determining forward labels for a partial route $(n_1, n_2, \dots, n_i)$ and backward labels for a partial route $(n_i, n_{i+1}, \dots, n_k)$ and checking for a feasible combination for each pair of forward and backward labels.
We now show how a forward label $l_\rightarrow$ associated to a driver state upon completion of a partial route $(n_1, n_2, \dots, n_i)$ can be combined with a backward label $l_\leftarrow$ associated to a driver state when beginning a partial route $(n_i, n_{i+1}, \dots, n_k)$. Note that both the forward and the backward labeling method add the stationary work at location $n_i$. Thus, we need to be careful that we do not double count the respective duration $s_{n_i}$ when checking whether the respective schedules can be combined.
A forward label $l_\rightarrow$ and a backward label $l_\leftarrow$ at the same location $n_i$ can be combined if their attributes satisfy the merge conditions. Analogously, if these conditions hold for $g^{rest}_{t^{rest}}(l_\leftarrow)$ or $g^{break}_{t^{break}}(l_\leftarrow)$, then it is possible to combine the forward schedule with the backward schedule obtained by adding a rest or break.

European Union
The asymmetry of EU hours of service regulations mainly results from the possibility of taking breaks and rests in two parts with different durations. A backward label can be represented by $l_\leftarrow = (l_\leftarrow^{time}, l_\leftarrow^{trip}, l_\leftarrow^{work|W}, l_\leftarrow^{drive|W}, l_\leftarrow^{drive|R}, l_\leftarrow^{work|B}, l_\leftarrow^{drive|B}, l_\leftarrow^{elapsed|R}, l_\leftarrow^{earliest|R}, l_\leftarrow^{rest}, l_\leftarrow^{break}, l_\leftarrow^{days}, l_\leftarrow^{dawn})$ where $l_\leftarrow^{time}$, $l_\leftarrow^{trip}$, $l_\leftarrow^{work|W}$, and $l_\leftarrow^{drive|R}$ have the same interpretation as for US hours of service regulations.
For a driver who ends service on any given day of the planning horizon, we have to distinguish between the cases that the driver has already taken the first part of a rest period, or not. In the first case, 9 hours of rest are required after the last activity. In the second case, 11 hours of rest are required. In a backward labeling method, these two cases result in different initial labels with different values of $l_\leftarrow^{elapsed|R}$, $l_\leftarrow^{earliest|R}$, and $l_\leftarrow^{rest}$.
For a planning horizon covering several full days and a location $n_k$ with $s_{n_k} = 0$ and a time window spanning the full planning horizon, a backward label representing the state of a driver who ends service on the $j$th day with an 11 hour rest is
$$l_\leftarrow = \big((j-1) \cdot t^{day} + t^{dusk},\ 0,\ 0,\ 0,\ 0,\ 0,\ 0,\ t^{rest},\ (j-1) \cdot t^{day} + t^{dawn} + t^{rest},\ 0,\ 0,\ 1,\ (j-1) \cdot t^{day} + t^{dawn}\big).$$
A backward label representing the state of a driver who ends service on the $j$th day with a 9 hour rest can be obtained by changing the above label by setting $l_\leftarrow^{elapsed|R} = t^{rest|2nd}$, $l_\leftarrow^{earliest|R} = (j-1) \cdot t^{day} + t^{dawn} + t^{rest|1st} + t^{rest|2nd}$, and $l_\leftarrow^{rest} = t^{rest|1st}$. If $s_{n_k} > 0$ or the time window of location $n_k$ is narrower, the initial labels can be adjusted accordingly.
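The two initial labels for a given day can be generated as in the following sketch. The rest durations are taken from Table 1; the values of `T_DUSK` and `T_DAWN` are assumptions based on the night time definition from 20.00h to 7.00h mentioned in Section 3, and the function name `initial_labels` and the dictionary representation are purely illustrative.

```python
T_DAY = 24.0
T_REST, T_REST_1ST, T_REST_2ND = 11.0, 3.0, 9.0
T_DUSK, T_DAWN = 20.0, 7.0  # assumed night time from 20.00h to 7.00h

# Attribute order of an EU backward label ("break_" avoids the Python keyword).
ATTRS = ("time", "trip", "work_W", "drive_W", "drive_R", "work_B", "drive_B",
         "elapsed_R", "earliest_R", "rest", "break_", "days", "dawn")


def initial_labels(j):
    """Return the initial backward labels for a driver ending service on day j.

    The first label assumes a full 11 hour rest after the last activity, the
    second one a 9 hour rest as second part of a rest taken in two parts.
    """
    offset = (j - 1) * T_DAY
    full = dict.fromkeys(ATTRS, 0.0)
    full.update(time=offset + T_DUSK,
                elapsed_R=T_REST,
                earliest_R=offset + T_DAWN + T_REST,
                days=1,
                dawn=offset + T_DAWN)
    split = dict(full,
                 elapsed_R=T_REST_2ND,
                 earliest_R=offset + T_DAWN + T_REST_1ST + T_REST_2ND,
                 rest=T_REST_1ST)
    return full, split


full, split = initial_labels(2)
print(full["time"], full["earliest_R"], split["earliest_R"], split["rest"])
```

Calling `initial_labels(j)` for every day `j` of the planning horizon yields the set of labels from which the backward propagation starts.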
For each day in the planning horizon and the two cases, the respective labels can be used as initial labels of a backward labeling method using the REFs presented below.
Resource extension functions. Analogously to the case of US hours of service regulations, we can propagate backward labels along the arcs of the network shown in Figure 2. Given the differences in the regulations we need dedicated REFs, including $g^{drive}_\Delta$ and $g^{idle}_\Delta$, to update the driver state depending on the duration $\Delta$ of the respective driver activity. Note that we explicitly distinguish between a full rest and the second part of a rest taken in two parts, and between rest periods taken during day time and those covering a night. Furthermore, we only consider breaks taken in two parts because the full break requirement can also be fulfilled by taking both parts immediately after one another.
Table 6 shows how label attributes are updated by the REFs related to driving and other work. These REFs are very similar to those for US hours of service regulations with differences in determining $l_\leftarrow^{elapsed|R}$ and $l_\leftarrow^{earliest|R}$: the REF for driving sets $\hat{l}_\leftarrow^{earliest|R} = \max\{l_\leftarrow^{earliest|R},\ l_\leftarrow^{dawn} + l_\leftarrow^{elapsed|R} + l_\leftarrow^{rest} + l_\leftarrow^{break} + \Delta\}$, and the REF for other work at location $n$ sets $\hat{l}_\leftarrow^{earliest|R} = \max\{l_\leftarrow^{earliest|R},\ t^{min}_n + l_\leftarrow^{elapsed|R} + s_n\}$.
Table 7 shows the backward REFs related to rest periods taken during day time. The REFs related to rest periods covering a night additionally ensure that the rest spans the entire night. Furthermore, they increment $l_\leftarrow^{days}$ by one, reduce $l_\leftarrow^{dawn}$ by $t^{day}$, and set $\hat{l}_\leftarrow^{earliest|R}$ to the appropriate value.
Table 8 shows how the REFs related to break periods update label attributes. Depending on the part of the break that is scheduled, these REFs either set $\hat{l}_\leftarrow^{earliest|R} = \max\{l_\leftarrow^{earliest|R},\ l_\leftarrow^{dawn} + l_\leftarrow^{elapsed|R} + l_\leftarrow^{rest} + \Delta + t^{break|1st}\}$ and $\hat{l}_\leftarrow^{break} = t^{break|1st}$, or they set $\hat{l}_\leftarrow^{earliest|R} = \max\{l_\leftarrow^{earliest|R},\ l_\leftarrow^{dawn} + l_\leftarrow^{elapsed|R} + l_\leftarrow^{rest} + \Delta\}$ and $\hat{l}_\leftarrow^{break} = 0$.
In order to only consider labels complying with the regulations, the feasibility conditions given in Table 9 must be satisfied when using the corresponding REFs.
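For REF $g^{drive}_\Delta$ the duration $\Delta$ must not exceed the largest possible driving time given by
$$\Delta^{EU}_{l_\leftarrow} := \min\{\, l_\leftarrow^{trip},\ t^{drive|W} - l_\leftarrow^{drive|W},\ t^{drive|R} - l_\leftarrow^{drive|R},\ t^{drive|B} - l_\leftarrow^{drive|B},\ t^{work|W} - l_\leftarrow^{work|W},\ t^{work|B} - l_\leftarrow^{work|B},\ t^{day} - l_\leftarrow^{rest} - l_\leftarrow^{break} - l_\leftarrow^{elapsed|R},\ l_\leftarrow^{time} - l_\leftarrow^{dawn} - l_\leftarrow^{rest} - l_\leftarrow^{break} \,\}. \qquad (3)$$
The REF for the first part of a break requires that the second part of a break is already taken. Lastly, for REF $g^{idle}_\Delta$, the duration $\Delta$ must be small enough so that the respective activity begins before the next night and the rest can be completed within 24 hours after the previous rest.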
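As a worked instance of the bound (3), the following sketch evaluates the minimum for one backward label. The parameter values are the EU values from Table 1 plus a day length of 24 hours; the function name `max_driving_time`, the dictionary keys, and the label values are illustrative assumptions.

```python
T_DRIVE_W, T_DRIVE_R, T_DRIVE_B = 56.0, 9.0, 4.5
T_WORK_W, T_WORK_B = 60.0, 6.0
T_DAY = 24.0


def max_driving_time(l):
    """Largest driving duration that can be added to the backward label l."""
    return min(
        l["trip"],                                      # remaining driving on the trip
        T_DRIVE_W - l["drive_W"],                       # weekly driving limit
        T_DRIVE_R - l["drive_R"],                       # daily driving limit
        T_DRIVE_B - l["drive_B"],                       # driving without a break
        T_WORK_W - l["work_W"],                         # weekly working limit
        T_WORK_B - l["work_B"],                         # working without a break
        T_DAY - l["rest"] - l["break_"] - l["elapsed_R"],   # rest within the day
        l["time"] - l["dawn"] - l["rest"] - l["break_"],    # time left since dawn
    )


# Made-up label: driver state at 20.00h on day 2, dawn of that day at 7.00h.
label = {"time": 44.0, "dawn": 31.0, "trip": 5.0,
         "drive_W": 10.0, "drive_R": 3.0, "drive_B": 1.0,
         "work_W": 12.0, "work_B": 2.0,
         "rest": 0.0, "break_": 0.0, "elapsed_R": 13.0}
print(max_driving_time(label))
```

For this label the binding term is the maximum driving time without a break, 4.5h minus the 1h already driven, so at most 3.5 hours of driving can be scheduled before a break is needed.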

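Dominance. As in the forward labeling method, we can use dominance rules to reduce the number of alternative labels to be considered. Given two feasible labels $l_\leftarrow$ and $\tilde{l}_\leftarrow$ which both represent a driver state at the beginning of the partial route $(n_i, n_{i+1}, \dots, n_k)$ with $1 \le i \le k$, we write $l_\leftarrow \preceq \tilde{l}_\leftarrow$ if $l_\leftarrow^{j} \le \tilde{l}_\leftarrow^{j}$ for all $j \in \{days, trip, drive|W, drive|R, drive|B, work|W, work|B, elapsed|R, earliest|R, rest, break\}$ and $l_\leftarrow^{time} \ge \tilde{l}_\leftarrow^{time}$. Note that if $l_\leftarrow^{time} \ge \tilde{l}_\leftarrow^{time}$ and $l_\leftarrow^{earliest|R} \le \tilde{l}_\leftarrow^{earliest|R}$ then $l_\leftarrow^{dawn} = \tilde{l}_\leftarrow^{dawn}$.
If $l_\leftarrow \preceq \tilde{l}_\leftarrow$, then we also have $g(l_\leftarrow) \preceq g(\tilde{l}_\leftarrow)$ for each REF $g$. Hence, $l_\leftarrow$ dominates $\tilde{l}_\leftarrow$ and $\tilde{l}_\leftarrow$ can be discarded from the set of labels to be updated.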
In the Appendix it is shown which sequences of driver activities are dominated by others, why it is always better to schedule driving activities as long as possible, and why it is always better to schedule break and rest periods as short as possible.

The conditions in Table 10 show when a label generated by a backward REF is dominated by another label. In particular, they tell us that it is never beneficial to explicitly schedule idle periods, that it is always better to schedule driving activities as long as possible, and that it is always better to schedule break and rest periods as short as possible. Also, a break or rest should not be scheduled if the last activity was a break or rest or if it is possible to schedule the next visit.
Combining forward and backward labels. A bidirectional labeling method for EU hours of service regulations can be obtained by determining forward labels for a partial route $(n_1, n_2, \dots, n_i)$, backward labels for a partial route $(n_i, n_{i+1}, \dots, n_k)$, and checking the conditions for a feasible merge for each pair of forward and backward labels. Following Goel (2018), a forward label can be represented by $l_\rightarrow = (l_\rightarrow^{time}, l_\rightarrow^{trip}, l_\rightarrow^{work|W}, l_\rightarrow^{drive|W}, l_\rightarrow^{drive|R}, l_\rightarrow^{work|B}, l_\rightarrow^{drive|B}, l_\rightarrow^{elapsed|R}, l_\rightarrow^{latest|R}, l_\rightarrow^{rest}, l_\rightarrow^{break}, l_\rightarrow^{days}, l_\rightarrow^{dusk})$ where $l_\rightarrow^{time}$, $l_\rightarrow^{trip}$, $l_\rightarrow^{work|W}$, $l_\rightarrow^{drive|W}$, $l_\rightarrow^{drive|R}$, $l_\rightarrow^{work|B}$, $l_\rightarrow^{drive|B}$, and $l_\rightarrow^{days}$ are the counterparts of the respective backward label attributes and
$l_\rightarrow^{elapsed|R}$ represents the time elapsed since the end of the previous rest,
$l_\rightarrow^{latest|R}$ represents the latest possible time at which the previous rest must be completed,
$l_\rightarrow^{rest}$ represents the remaining amount of rest required, i.e., $t^{rest}$ or $t^{rest|2nd}$,
$l_\rightarrow^{break}$ represents the remaining amount of break time required, i.e., $t^{break}$ or $t^{break|2nd}$, and
$l_\rightarrow^{dusk}$ represents the time at which the next night begins.
We now show how a forward label $l_\rightarrow$ associated to a driver state upon completion of a partial route $(n_1, n_2, \dots, n_i)$ can be combined with a backward label $l_\leftarrow$ associated to a driver state when beginning a partial route $(n_i, n_{i+1}, \dots, n_k)$. Recall that both the forward and the backward labeling method add the stationary work at location $n_i$ and the respective duration $s_{n_i}$ must not be double counted when combining the labels.
A necessary condition for combining the forward label with the backward label obviously is that the cumulative amounts of driving and work of both partial schedules, counting the work at $n_i$ only once, do not exceed the weekly limits. If furthermore $\max\{l_\rightarrow^{time} + l_\rightarrow^{rest},\ l_\rightarrow^{dusk} + t^{night}\} \le l_\leftarrow^{time} + s_{n_i} - l_\leftarrow^{rest} - l_\leftarrow^{break}$, both labels can be merged because at least one night rest can be scheduled between the respective partial schedules excluding the work of the backward label. The total number of days required for the schedule corresponding to the merged pair of labels is the sum of the durations of both partial schedules plus the number of full days in between.
If no night rest can be scheduled, both labels can be merged only if further conditions on the label attributes hold. In this case, the total number of days required for the schedule corresponding to the merged pair of labels is $l_\rightarrow^{days} + l_\leftarrow^{days} - 1$.
Analogously, if the conditions hold after adding a required break or rest period to the forward label, it is possible to merge the backward label with the forward label obtained by adding the required break or rest period.

Vehicle Routing
This section describes a BPC algorithm for solving the VRTDSP using the bidirectional labeling approaches presented in the previous section.
Let $C$ denote a given set of customer locations. For each $n \in C$ let $[t^{\min}_n, t^{\max}_n]$, $s_n$, and $q_n$ denote the time window of the customer, the non-negative duration of the service, which must begin within the time window, and the non-negative demand. Furthermore, let $n^{\text{depot}}$ denote the depot at which a homogeneous fleet of $K$ vehicles is located, each having a capacity of $Q$. Analogously to customer locations, the depot has an associated time window, service time, and demand; however, the time window spans the entire planning horizon, and the service time and demand are zero. For each pair $(n, m) \in (C \cup \{n^{\text{depot}}\}) \times (C \cup \{n^{\text{depot}}\})$, let $d_{nm}$ and $c_{nm}$ denote the driving time (excluding break and rest times) and the distance-related costs of travelling between $n$ and $m$. The VRTDSP calls for the determination of at most $K$ routes, where a route $r = (n_1, n_2, \dots, n_k)$ is feasible if it starts and ends at the depot, i.e., $n_1 = n_k = n^{\text{depot}}$, if it visits a subset of customer locations between start and end, i.e., $n_i \in C$ for $1 < i < k$, if the capacity is not exceeded, i.e., $\sum_{i=1}^{k} q_{n_i} \le Q$, and if a feasible truck driver schedule exists for the route. The goal is to find a set of feasible routes such that each customer in $C$ is visited by exactly one route and the total costs for all routes are minimized. As mentioned previously, labor costs in the United States are usually based on distance travelled and, therefore, we determine the costs of a route $r = (n_1, n_2, \dots, n_k)$ by $c_r := \sum_{i=1}^{k-1} c_{n_i n_{i+1}}$. In the European Union, labor costs must not be based on distance travelled, and therefore we assume the costs of a route $r = (n_1, n_2, \dots, n_k)$ to be the weighted sum of distance-related costs, in particular, for fuel and toll, and duration-related costs, in particular, for daily driver wages.
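These costs can be determined by $c_r := \sum_{i=1}^{k-1} c_{n_i n_{i+1}} + c^{\text{day}} \cdot \min_{l \in L_r} l_{\text{days}}$, where $L_r$ denotes the set of labels corresponding to feasible truck driver schedules for route $r$ and $c^{\text{day}}$ denotes the cost for each day of operation.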
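The route conditions and the two cost functions described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation used in the paper: the function names, the dictionary-based data layout, and the argument `min_days` (standing in for the smallest number of days over all feasible schedules of the route, which in practice comes from the labeling method) are hypothetical. The existence of a feasible truck driver schedule is not checked here.

```python
def is_route_structurally_feasible(route, depot, customers, demand, capacity):
    """Check depot start/end, customer-only interior, and vehicle capacity.

    `demand` is assumed to contain an entry for the depot with value zero.
    """
    if len(route) < 2 or route[0] != depot or route[-1] != depot:
        return False
    if any(n not in customers for n in route[1:-1]):
        return False
    return sum(demand[n] for n in route) <= capacity


def distance_cost(route, arc_cost):
    """Distance-related route cost: sum of arc costs along the route."""
    return sum(arc_cost[(n, m)] for n, m in zip(route, route[1:]))


def distance_and_duration_cost(route, arc_cost, cost_per_day, min_days):
    """Distance-related cost plus a daily cost for each day of operation."""
    return distance_cost(route, arc_cost) + cost_per_day * min_days
```

With a daily cost of 150, a route with distance-related cost 22.5 and a two-day schedule would be charged 322.5 under the second function.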
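The VRTDSP can be formulated as

\begin{align}
\min \quad & \sum_{r \in R} c_r \lambda_r \tag{9a} \\
\text{s.t.} \quad & \sum_{r \in R} a_{nr} \lambda_r = 1 && \forall n \in C \tag{9b} \\
& \sum_{r \in R} \lambda_r \le K \tag{9c} \\
& \lambda_r \in \{0, 1\} && \forall r \in R \tag{9d}
\end{align}

In this formulation $R$ denotes the set of all feasible routes. The binary parameter $a_{nr}$ indicates whether customer $n$ is visited by route $r$, and the binary variable $\lambda_r$ indicates whether route $r$ is used in the solution. The objective function (9a) minimizes the cumulative cost over all routes used in the solution. Constraints (9b) ensure that each customer is visited exactly once. The number of used vehicles is limited by (9c) and the variable domains are given in (9d).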
This formulation suffers from the usually huge number of routes in the set $R$. To overcome this issue we use a CG algorithm to solve the problem. Therein, the set $R$ is replaced by a small subset $\bar{R} \subset R$ of routes, and more routes are added dynamically to $\bar{R}$ until a solution of the overall problem is found. The linear relaxation of Formulation (9) in which $R$ is replaced by $\bar{R}$ is called the restricted master program (RMP). The CG algorithm alternates between optimizing the RMP and solving a pricing problem that adds additional variables to the RMP.
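The alternation between the RMP and the pricing problem can be sketched generically as below. This is only an illustrative skeleton: `solve_rmp` and `price` are hypothetical callbacks supplied by the caller (the first returns the RMP objective and dual prices for the current route set, the second returns candidate routes with their reduced costs), and the tolerance and iteration limit are assumed values.

```python
def column_generation(initial_routes, solve_rmp, price, tol=1e-9, max_iterations=1000):
    """Generic column generation loop over a growing set of routes.

    solve_rmp(routes) -> (objective, duals)          # hypothetical callback
    price(duals)      -> iterable of (route, reduced_cost)  # hypothetical callback
    """
    routes = list(initial_routes)
    for _ in range(max_iterations):
        objective, duals = solve_rmp(routes)
        # Keep only new routes whose reduced cost is negative.
        new_routes = [route for route, reduced_cost in price(duals)
                      if reduced_cost < -tol and route not in routes]
        if not new_routes:
            # No improving column: the RMP solution is optimal for the relaxation.
            return objective, routes
        routes.extend(new_routes)
    raise RuntimeError("column generation did not converge within the iteration limit")
```

The loop stops as soon as the pricing callback returns no route with negative reduced cost, which mirrors the optimality argument for the linear relaxation.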
The pricing problem asks for a route $r$ with negative reduced cost $\bar{c}_r := c_r - \mu - \sum_{n \in C} a_{nr} \pi_n < 0$, where $\pi_n$ denote the dual prices of the constraints (9b) and $\mu$ denotes the dual price of the convexity constraint (9c) associated with the current solution.
Routes with negative reduced costs can be found by solving an SPPRC where the distance-related arc costs $c_{nm}$ are replaced by the reduced arc costs $\bar{c}_{nm} := c_{nm} - \frac{1}{2} \pi_n - \frac{1}{2} \pi_m$ with $\pi_{n^{\text{depot}}} = \mu$. The reduced costs of a route $r = (n_1, n_2, \dots, n_k)$ in the United States can be determined by $\bar{c}_r = \sum_{i=1}^{k-1} \bar{c}_{n_i n_{i+1}}$ and in the European Union by $\bar{c}_r = \sum_{i=1}^{k-1} \bar{c}_{n_i n_{i+1}} + c^{\text{day}} \cdot \min_{l \in L_r} l_{\text{days}}$. If no route with negative reduced costs can be found, the solution is optimal for the linear relaxation of the original problem. Otherwise, the route with negative reduced costs is added to $\bar{R}$ and the RMP is solved again. After an optimal solution for the linear relaxation of the original problem is found, branch-and-bound may be required to find an optimal integer solution of the original problem.
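The following sketch illustrates why splitting each dual price evenly over the incoming and outgoing arc reproduces the route's reduced cost for distance-based costs. It assumes an elementary route (each customer visited once); the function names and data layout are hypothetical.

```python
def reduced_cost_from_duals(route, arc_cost, pi, mu):
    """Route cost minus the vehicle dual and the duals of all visited customers."""
    cost = sum(arc_cost[(n, m)] for n, m in zip(route, route[1:]))
    return cost - mu - sum(pi[n] for n in route[1:-1])


def reduced_arc_cost(n, m, arc_cost, pi, mu, depot):
    """Arc cost minus half the dual of each end point (the depot carries mu)."""
    def dual(v):
        return mu if v == depot else pi[v]
    return arc_cost[(n, m)] - 0.5 * dual(n) - 0.5 * dual(m)


def reduced_cost_from_arcs(route, arc_cost, pi, mu, depot):
    """Sum of reduced arc costs along the route."""
    return sum(reduced_arc_cost(n, m, arc_cost, pi, mu, depot)
               for n, m in zip(route, route[1:]))
```

Each customer is entered once and left once, so it contributes two halves of its dual price; the depot appears at both ends of the route and contributes two halves of $\mu$.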

Shortest path problem with resource constraints
The pricing problem is an SPPRC that can be solved using labeling methods based on those presented in Section 4. Herein, forward labels are extended by additional attributes $l_{\text{visited}}$, $l_{\text{cost}}$, and $l_{\text{load}}$, and backward labels by the corresponding attributes $\overleftarrow{l}_{\text{visited}}$, $\overleftarrow{l}_{\text{cost}}$, and $\overleftarrow{l}_{\text{load}}$, where $l_{\text{visited}}$ and $\overleftarrow{l}_{\text{visited}}$ represent the set of customer locations already visited, $l_{\text{cost}}$ and $\overleftarrow{l}_{\text{cost}}$ represent the (reduced) cost of the partial route, and $l_{\text{load}}$ and $\overleftarrow{l}_{\text{load}}$ represent the cumulated demand of the visited customers.
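The REFs $f^{\text{trip}}_{nm}$ and $g^{\text{trip}}_{nm}$ are changed so that they increase the forward and backward cost attributes $l_{\text{cost}}$ and $\overleftarrow{l}_{\text{cost}}$ by $\bar{c}_{nm}$. For EU hours of service regulations, REFs $f^{\text{nightrest}}$, $g^{\text{nightrest}}$, and $g^{\text{nightrest}|2nd}$ are changed so that they increase $l_{\text{cost}}$ and $\overleftarrow{l}_{\text{cost}}$ by $c^{\text{day}}$. Furthermore, REFs $f^{\text{visit}}_m$ and $g^{\text{visit}}_n$ update the visited sets and loads of the forward and backward labels accordingly.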
The conditions that the next customer is not yet visited and that the capacity of the vehicle is not exceeded are added to the feasibility conditions of REFs $f^{\text{trip}}_{nm}$ and $g^{\text{trip}}_{nm}$ to avoid unnecessary calculations. Furthermore, the dominance criteria are extended by the conditions that a forward label $l$ can only dominate another forward label $l'$ if $l_{\text{visited}} \subseteq l'_{\text{visited}}$, $l_{\text{cost}} \le l'_{\text{cost}}$, and $l_{\text{load}} \le l'_{\text{load}}$.
Analogously, the dominance criteria are extended for backward labels. Note that for EU hours of service regulations, we remove the dominance conditions on the number of days, i.e., $l_{\text{days}} \le l'_{\text{days}}$ for a forward label $l$ dominating a forward label $l'$ and the analogous condition for backward labels, because the objective is to minimize the weighted sum of distance- and duration-related costs, which is already represented by the cost attributes of the forward and backward labels.
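The additional dominance conditions can be sketched as below. This is a minimal sketch covering only the three pricing attributes; the hours of service conditions from Section 4 would have to hold as well and are omitted here. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PricingAttributes:
    """Pricing-related part of a label (regulation-related attributes omitted)."""
    visited: frozenset
    cost: float
    load: float


def satisfies_extended_dominance(a, b):
    """True if label `a` meets the added conditions for dominating label `b`.

    `a` must have visited a subset of the customers of `b`, at no higher
    reduced cost and no higher load.
    """
    return a.visited <= b.visited and a.cost <= b.cost and a.load <= b.load
```

The subset condition matters because a label that has visited fewer customers keeps more feasible extensions open, so it can only be discarded in favor of a label that is at least as flexible.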
Bidirectional labeling is used to solve the SPPRC using a dynamic half-way point defined on the time resources $l_{\text{time}}$ and $\overleftarrow{l}_{\text{time}}$ of forward and backward labels. As proposed by Tilk et al. (2017), the bidirectional labeling iteratively selects a forward or a backward label to extend, and dynamically computes a half-way point.
Forward labels with a value of $l_{\text{time}}$ larger than this half-way point, and backward labels with a value of $\overleftarrow{l}_{\text{time}}$ smaller than this half-way point, are not extended, and the method terminates when no label remains to be extended. After termination of the method, forward and backward labels are merged. When merging a forward label $l$ for a partial route $(n_1, n_2, \dots, n_i)$ and a backward label $\overleftarrow{l}$ for a partial route $(n_i, n_{i+1}, \dots, n_k)$, the conditions $|l_{\text{visited}} \cap \overleftarrow{l}_{\text{visited}}| = 1$ (12a), i.e., the two partial routes share only the location $n_i$, as well as the other merge conditions presented in Section 4 must hold.
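The half-way point filter and merge condition (12a) can be sketched as follows. This is a simplified illustration under assumptions: the half-way point is passed in as a fixed number (its dynamic computation is not shown), the visited sets contain customer locations only, and the remaining merge conditions of Section 4 are left out. All function names are hypothetical.

```python
def may_extend_forward(l_time, half_way):
    """Forward labels beyond the half-way point are not extended."""
    return l_time <= half_way


def may_extend_backward(l_time, half_way):
    """Backward labels before the half-way point are not extended."""
    return l_time >= half_way


def visited_sets_compatible(forward_visited, backward_visited, merge_node):
    """Condition (12a): the partial routes share exactly the merge location."""
    return forward_visited & backward_visited == {merge_node}
```

Because both partial routes contain the merge location, an intersection of size one means that it is the only shared customer.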
The labeling takes by far the largest portion of the computation time in the overall BPC algorithm. To speed up the solution process, four acceleration techniques are used. First, we use the ng-path relaxation (Baldacci et al. 2011) with a neighborhood size of ten. Second, an additional set of unreachable customers is defined to strengthen the dominance as proposed by Feillet et al. (2004). Third, the labeling is solved heuristically using a limited discrepancy search (LDS, see Feillet et al. 2007). Last, a heuristic dominance rule is applied to further strengthen the dominance.
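The core of the ng-path relaxation is the update of the set of customers a label "remembers". The sketch below shows the standard update rule; the function names and the dictionary `neighborhood` (mapping each customer to its ng-neighborhood, assumed to contain the customer itself) are hypothetical, and how the rule is wired into the REFs of this paper is not shown.

```python
def extend_ng_memory(memory, next_customer, neighborhood):
    """Keep only remembered customers in the neighborhood of the new customer,
    then add the new customer itself."""
    return (memory & neighborhood[next_customer]) | {next_customer}


def ng_extension_allowed(memory, next_customer):
    """An extension is forbidden only if the customer is still remembered."""
    return next_customer not in memory
```

A customer that drops out of the memory may be revisited, which is what makes this a relaxation of the elementary problem; larger neighborhoods forbid more cycles at the price of weaker dominance.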

Branching and Cutting
To strengthen the linear relaxation, two classes of valid inequalities are used. First, 2-path inequalities (Kohl et al. 1999) are separated and added whenever they are violated. Let $W \subset C$ be a subset of customers that cannot be visited by a single vehicle due to capacity or time window restrictions. Moreover, let $\delta^-(W)$ be the set of all arcs $(i, j) \in A$ with $i \in W$ and $j \notin W$. The corresponding 2-path inequality is given by $\sum_{r \in R} \sum_{(i,j) \in \delta^-(W)} b^r_{ij} \lambda_r \ge 2$, where $b^r_{ij}$ is the number of times route $r$ traverses arc $(i, j) \in A$. We use the heuristic proposed by Kohl et al. (1999) to generate candidate sets $W$ of maximal cardinality.
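Checking whether a given candidate set $W$ yields a violated 2-path inequality amounts to evaluating the left-hand side for the current fractional solution. The sketch below assumes routes are given as node sequences with matching values of $\lambda_r$; the function names and the tolerance are hypothetical, and the generation of candidate sets is not shown.

```python
def two_path_lhs(routes, lam, W):
    """Weighted number of arcs (i, j) with i in W and j not in W."""
    total = 0.0
    for route, value in zip(routes, lam):
        crossings = sum(1 for i, j in zip(route, route[1:])
                        if i in W and j not in W)
        total += crossings * value
    return total


def is_two_path_violated(routes, lam, W, tol=1e-6):
    """True if the current solution sends less than two units across the cut."""
    return two_path_lhs(routes, lam, W) < 2.0 - tol
```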
Second, subset-row inequalities (Jepsen et al. 2008) defined on sets of three customers are separated at the root node of the branch-and-bound tree. The inequality for a customer set $U_k \subset C$ is given by $\sum_{r \in R} \lfloor h_r / 2 \rfloor \lambda_r \le 1$, where $h_r$ is the number of times route $r$ visits a customer in $U_k$. Note that the use of subset-row inequalities in the master problem requires adjustments in the pricing problem as explained by Jepsen et al. (2008).
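The left-hand side of a subset-row inequality for three customers can be evaluated as in the sketch below. The function name and data layout are hypothetical, and the enumeration of customer triples is not shown.

```python
def subset_row_lhs(routes, lam, U):
    """Sum over routes of floor(h_r / 2) * lambda_r for a customer set U."""
    total = 0.0
    for route, value in zip(routes, lam):
        visits = sum(1 for n in route if n in U)
        total += (visits // 2) * value
    return total
```

For example, three routes that each cover a different pair of the three customers and each take the value 0.5 give a left-hand side of 1.5, so the inequality with right-hand side 1 cuts off this fractional solution.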
Branching on arcs is required to finally ensure integer solutions of Formulation (9). To accelerate the solution process, we apply strong branching with up to eight candidate arcs. We choose the eight most fractional arcs in the current solution as branching candidates and perform a rough evaluation of each candidate by solving the current RMP twice, adding the constraint corresponding to each child node without generating additional columns. A similar procedure was applied for the capacitated VRP by Pecin et al. (2016). The resulting improvements in the lower bounds are usually overestimated. However, this evaluation strategy is fast and beneficial compared to just choosing the most fractional arc to branch on. The arc to branch on is then chosen according to the product rule (Achterberg 2007). As branch-and-bound node-selection rule, we apply a best-bound-first strategy, because our primary goal is to improve the dual bound.
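The candidate selection and the product rule can be sketched as below. This is a minimal sketch under assumptions: arc flows are given as a dictionary, the bound improvements of the two child nodes are already available from the rough RMP evaluation, and the small constant `eps` in the product score is an assumed value. All function names are hypothetical.

```python
def most_fractional_arcs(arc_flows, k=8, tol=1e-6):
    """Return up to k arcs whose flow is closest to 0.5 in its fractional part."""
    fractional = {arc: x % 1.0 for arc, x in arc_flows.items()
                  if tol < x % 1.0 < 1.0 - tol}
    return sorted(fractional, key=lambda arc: abs(fractional[arc] - 0.5))[:k]


def product_score(gain_down, gain_up, eps=1e-6):
    """Product rule: multiply the bound improvements of both child nodes."""
    return max(gain_down, eps) * max(gain_up, eps)


def select_branching_arc(candidates, child_gains):
    """Pick the candidate with the highest product score."""
    return max(candidates, key=lambda arc: product_score(*child_gains[arc]))
```

The product favors candidates that improve the bound in both children, whereas a candidate with a large gain in only one child receives a low score.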

Computational Results
This section reports on computational experiments conducted to evaluate the bidirectional approach and the overall performance of our BPC algorithm. We implemented our algorithm in C++ and compiled it into 64-bit single-thread code with MS Visual Studio 2013. All experiments were conducted on a standard PC with an Intel(R) Core(TM)i7-5930k clocked at 3.5 GHz and 64 GB of RAM, by allowing a single thread for each run. CPLEX 12.6.2 was used with the default parameters to solve the RMP in the column-generation algorithm and to determine an integer solution based on the columns generated so far when reaching the time limit of two hours. The time allowed to find the best integer solution in this final step was restricted to at most 600 seconds.
For US regulations, we tested our algorithm on the 56 benchmark instances for the VRTDSP-US proposed by Goel (2009), which can be obtained at https://www.telematique.eu/research/downloads. These instances are derived from the VRPTW benchmark instances of Solomon (1987), which can be grouped into six classes: randomly distributed customers (R1 and R2), clustered customers (C1 and C2), and a mixed distribution (RC1 and RC2). Instance classes R1, C1, and RC1 have tight time windows and strict vehicle capacity, while C2, RC2, and R2 have wide time windows and loose vehicle capacity. Each instance contains 100 customers, the service time at every customer is set to 60 minutes, and we assume an average speed of 70 km/h and a cost structure of 0.50 Euro per kilometer. Like Goel and Irnich (2017), we create smaller instances by considering only the first 25 or 50 customers.

For the EU regulations, we use the same instance set as for US regulations but adapt the time windows as described in Goel (2018) such that every customer can be visited during day time. For our experiments, we considered different night time definitions from 20.00h to 7.00h, from 23.00h to 6.00h, and from 0.00h to 4.00h. These night time definitions are representative for a large share of the countries in the European Union. Moreover, we assume a daily cost of $c^{\text{day}} = 150$ Euro.

Tables 11 and 12 contain aggregated results of our experiments for the linear relaxation (LP) and the integer program (IP) and compare them with results for the branch-and-price (BP) algorithms presented by Goel and Irnich (2017) and Goel (2018).

Table footnotes: 1 CPU: Intel i7-5600U, run time limit: 3600 seconds. 2 CPU: Intel i7-5930k, run time limit: 7200 seconds.

Table 12 Results for the EU regulation minimizing costs based on distance and duration

Regarding EU regulations, Table 12 shows that 152 out of 168 instances with 25 customers can be solved to optimality in around 1025 seconds on average.
The average gap over the remaining 25-customer instances for which the linear relaxation is solved is around three percent. The linear relaxation is solved for almost all 25-customer instances in less than 330 seconds on average. Furthermore, our approach was able to solve almost half of the 50-customer instances to optimality within the run time limit. The average gap over the remaining 50-customer instances for which the linear relaxation was solved is around 4.5%. For around 20% of the 50-customer instances, the linear relaxation is not solved within two hours. Although our experiments used a run time limit of two hours instead of the one hour time limit used by Goel (2018), we can see that our approach clearly outperforms the BP approach with unidirectional labeling.
In order to better understand the contribution of the bidirectional labeling proposed in this paper compared to the other algorithmic differences, we ran the same experiments replacing bidirectional labeling in our BPC with pure forward labeling and pure backward labeling. Tables 13 and 14 show aggregated results comparing our BPC algorithm when the subproblem is solved with forward, backward, or bidirectional labeling.

Table 13 Comparison of uni- and bidirectional labeling for US regulations

Table 13 shows that for US regulations, bidirectional labeling is on average between 1.5 and 2.5 times faster than the unidirectional variants. Moreover, bidirectional labeling allows six more instances to be solved to proven optimality. While backward labeling appears to be slower than forward labeling for the 25-customer instances, one more 50-customer instance can be solved to proven optimality with backward labeling.
Regarding EU regulations, an interesting observation is that backward labeling performs much worse than forward labeling. Table 14 shows that forward labeling solves 23 more instances and the computational effort is significantly lower. We ascribe this to the additional labels needed for schedules terminating with a full rest or the second part of a rest. Despite the comparably low performance of backward labeling, we can see that bidirectional labeling clearly outperforms unidirectional labeling. With bidirectional labeling, 28 more instances can be solved to optimality compared to forward labeling and 51 more instances compared to backward labeling. Bidirectional labeling is on average between 2 and 10 times faster than forward labeling and between 4 and 16 times faster than backward labeling. This shows that the bidirectional labeling method contributes significantly to the good performance of our algorithm, especially for EU regulations. One reason for this good performance is that in the EU-VRTDSP, initial labels have to be generated for each day of the planning horizon. In unidirectional labeling all of these alternative labels must be extended, which leads to a significantly higher computational burden. In our bidirectional approach many of the initial labels are never extended because they are already behind the half-way point.

Conclusion
In this paper we propose backward labeling methods for truck driver scheduling in the United States and the European Union. We show how labels generated with a forward labeling method can be combined with labels generated with our backward method. Being able to combine forward and backward labels can significantly speed up heuristic solution approaches for vehicle routing and truck driver scheduling problems, because unnecessary computational effort can be avoided when An important contribution of our approach is that it is particularly well suited for problems in which schedule durations must be minimized. As EU regulations prohibit any payment related to travel distance, labor costs cannot be included in the mileage costs. For realistic cost functions based on distance and duration, our bidirectional approach for the VRTDSP-EU is on average between 2 and 16 times faster than unidirectional variants.

European Union
For REFs associated to off-duty periods we have Thus, REF g idle can be neglected and when applying any of the other REFs, we can use the minimum duration required by the regulation and larger values of ∆ do not have to be considered.
Using any of these REFs multiple times after another may only be relevant for g nightrest , because g a ∆ 2 • g a ∆ 1 (l ) is infeasible for a ∈ {dayrest, dayrest|2nd, nightrest|2nd, rest|1st, break|2nd, break|1st}. As for any value ∆ 1 , ∆ 2 > 0 we can conclude that the first part of a rest is never taken immediately before the second part. Furthermore, rest periods taken during a day are never needed before or after a night rest because of and For any label l with l work|B = 0 we have where h represents any sequence of the REFs g drive and g visit . Therefore, we can assume that the second part of a break is only taken if l work|B > 0.
For any value ∆ > 0 we have where h represents any sequence of REFs not including g drive and g visit . Hence, we can conclude that no break or rest activities are scheduled if ∆ EU l > 0 and that driving periods are always scheduled with duration ∆ EU l . Lastly, if l trip = 0 and l time ≤ t max n + s n we have