The school bus routing and scheduling problem with transfers

In this article, we study the school bus routing and scheduling problem with transfers arising in the field of nonperiodic public transportation systems. It deals with the transportation of pupils from home to their school in the morning taking the possibility that pupils may change buses into account. Allowing transfers has several consequences. On the one hand, it allows more flexibility in the bus network structure and can, therefore, help to reduce operating costs. On the other hand, transfers have an impact on the service level: the perceived service quality is lower due to the existence of transfers; however, at the same time, user ride times may be reduced and, thus, transfers may also have a positive impact on service quality. The main objective is the minimization of the total operating costs. We develop a heuristic solution framework to solve this problem and compare it with two solution concepts that do not consider transfers. The impact of transfers on the service level in terms of time loss (or user ride time) and the number of transfers is analyzed. Our results show that allowing transfers reduces total operating costs significantly while average and maximum user ride times are comparable to solutions without transfers. © 2015 Wiley Periodicals, Inc. NETWORKS, Vol. 65(2), 180–203 2015


INTRODUCTION
In the school year 2011/2012 in Austria (population: 8.42 million [1]) about 1.1 million pupils attended one of the 6,120 schools [2]. Austria is divided into 121 districts and, on average, each district has 51 schools in total. The number of nonprimary schools ranges from one to 115 with an average of about 25 (across all districts in Austria). Each district consists of multiple municipalities, where the average number of schools in a municipality is 2.57, about half of them being nonprimary schools. If only rural areas are considered, the average number of nonprimary schools per district is 22, the maximum number is 38, and the average number of schools per municipality is about one.
The number of pupils per school differs substantially according to school type and location. Higher level secondary schools are usually located in more densely populated areas whereas primary and secondary schools are also located in rural areas. Primary schools have between 10 and 150 pupils in rural areas. Secondary schools have between 50 and 300 pupils and higher level secondary schools have about 200 to 800 pupils. All of the above data is taken from Statistik Austria [3] and approximated over the whole geographic region of Austria. The low density of schools in rural areas and the distribution of higher level secondary schools require most pupils to use some type of transportation system to get to school.
On the one hand, the safety of the pupils during transportation is a crucial factor and must be ensured (i.e., short walking distances and short travel times; we refer to this as the service level). On the other hand, the costs of providing high quality services must be considered by the funding organization (e.g., administration). These two goals conflict because a high service level often requires dedicated routes for small groups of pupils, which requires more buses and raises costs.
The pupil transportation system differs from country to country and even from county to county. In some areas, dedicated bus services for every school or group of schools (if they are located close together) are in place. Contrarily, pupil transportation can also be integrated into the public transport system, where pupils use the general public transportation services. Countries often use mixed forms of transportation: In areas where no public transport is available dedicated services are provided and where available the public transportation system must be used. For example, in urban areas with a dense service network pupils may use public transport while in rural areas, dedicated bus services are provided. Other systems provide dedicated bus services for pupils attending primary school, while older pupils have to use public transport.
Austria has a mixed system: in general the public transportation network is used for pupil transportation with the exception of rural areas where dedicated services are provided. For pupils attending primary schools transfers between school buses are not allowed. This means that the planning problem for primary schools can be solved using models without transfers (i.e., approaches proposed in literature, as in [13,16,22,33,31] can be applied). The transportation network for older pupils can be designed to utilize transfers. In this study, we are interested in the design of the bus transportation network to meet the needs of pupils in secondary schools and older.
Generally, the problem of pupil transportation arises in the morning before school begins and in the afternoon after school ends. Here, we consider the so-called morning problem only (i.e., the transportation of the pupils to their school before it begins). The service must be provided only once in the morning; therefore, it is nonperiodic. Since the process is the same every day, a feasible solution for a single day can be used throughout the school year.
The overall problem consists of the following subproblems: bus stop selection, bus routing, bus scheduling, and school begin time adjustment. Bus stop selection refers to the process of choosing a proper subset of the set of available bus stops which are then serviced by the bus. This step may also include the assignment of pupils to bus stops. Bus routing is the generation of routes which are serviced by a bus.
Typically, bus routes have to respect capacity constraints and often also tour length or duration constraints. Bus scheduling is the computation of a feasible schedule for the buses. It determines which bus route is serviced at which point in time. School begin time adjustment (or bell time adjustment) optimizes the school begin times to allow buses to service multiple schools and thus reduce the number of necessary buses. All these subproblems are strongly interconnected and should be solved in an integrated manner.
Generally, the school bus routing and scheduling problem can be modeled in different ways. If only bus routing and scheduling for a single school are considered, it can be modeled, for example, as a vehicle routing problem (VRP) [34], where the bus starts at the school, collects the pupils at their bus stops, and returns to the school where the pupils are dropped off. In the case where the buses do not start at the school but at the first pickup bus stop, the resulting problem can be modeled as an open vehicle routing problem (OVRP) with a restriction on the maximum route length. Because the (O)VRP treats all schools independently, a separate OVRP must be solved for every school. An alternative approach is to model pupil transportation as a dial-a-ride problem (DARP) where a transportation request arises for every pupil [8]. The pickup point is the assigned bus stop of the pupil, the drop-off point is the bus stop of the school, and pupils of different schools may share a single bus.
A further generalization of the problem is to allow transfers (i.e., pupils attending different schools can share a single bus and can change the bus during their way to school). Transfers may be allowed on a predefined set of bus stops or at arbitrary bus stops. Therefore, the selection of transfer bus stops itself is an optimization problem.
Our contribution is fourfold:
1. A dedicated solution concept for the school bus routing and scheduling problem with transfers, taking into account bus stop selection and pupil assignment, bus routing, and bus scheduling.
2. An evaluation of the solutions considering transfers in terms of costs and service level.
3. A comparison of the solution with transfers with two different modeling approaches without transfers, namely DARP and OVRP.
4. An optimization of the bus stops used for pupil transfers.
In the next section, we give a detailed description of the problem and an overview of the literature. In section 3, the mixed integer linear programming (MILP) model which defines our problem is presented. Then, we describe the heuristic solution concept in detail (section 4). In section 5, we describe how to model the school bus routing and scheduling problem as a DARP and OVRP and describe two different state-of-the-art variable neighborhood search (VNS)-based solvers, a DARP and an OVRP solver, which are used in section 6. Section 6 gives a detailed analysis of the results of the three approaches by comparing and analyzing the properties of the obtained solutions. Last we summarize our findings and suggest further research directions (section 7).

PROBLEM STATEMENT AND RELATED WORK
Given is a set $N$ of pupils, a set $L$ of bus stops, and a set $S$ of schools. The bus driving time $t_{ij}$ between bus stops $i \in L$ and $j \in L$ is given, and the walking time $u_{ni}$ of pupil $n \in N$ from her home to bus stop $i \in L$ is known for all pupils. Every pupil $n$ has a set of candidate pickup bus stops. The destination bus stop and the school begin time $\tau_s$ of every school $s \in S$ as well as the walking times from the destination bus stop to the school are known. The destination school and, therefore, the destination bus stop is known for every pupil $n$.
Additionally, a pupil may arrive at the earliest $\overline{\omega}_s$ and at the latest $\underline{\omega}_s$ minutes before school $s$ begins. Further, pupils have a minimum waiting time $\underline{\gamma}$ and a maximum waiting time $\overline{\gamma}$ when they change from one bus to another.
Every bus $b \in B$ has a maximum capacity $c$ which must not be exceeded. It restricts the maximum number of pupils that can be on the bus at the same time. We consider a homogeneous bus fleet.
The objective is to calculate a transportation plan of minimum cost considering the following constraints:
• bus capacity
• upper and lower bounds on waiting times at every school
• upper and lower bounds on waiting times for pupil transfers
• maximum pupil walking time from home to their assigned bus stop
Generally, for a feasible solution the following decision problems must be solved:
1. Assign pupils to bus stops
2. Calculate bus routes
3. Compute pupil routes based on the bus routes
4. Schedule buses to bus routes
Pupils with the same pickup and destination bus stop form a single entity and are not split on their way to school. This is for practical purposes, because it may be difficult to instruct pupils with the same destination waiting at the same pickup bus stop to use different buses.
Figure 1a exemplifies a simple problem instance (inst01 from the benchmark set, see section 6). There are 19 bus stops (white circles) numbered from 0 (virtual depot) to 18. The eight pupils (gray) are numbered from 0 to 7; the school each pupil attends is given in parentheses (e.g., pupil 5 attends school 1). Finally, there are two schools (black) numbered 0 and 1. Bus stop 4 is the destination bus stop for school 0 and bus stop 14 is the destination bus stop for school 1. Therefore, pupil 5 must be transported to bus stop 14. Figure 1b shows a feasible solution to the given problem. In the solution, the pupil assignment (arcs without labels), the bus routes (arcs with labels), and the scheduling data (labels on arcs) can be seen. For example, pupil 3 attends school 0 and walks to bus stop 11 to be picked up by a bus. Bus service on arc (11,4) starts at 50.54 and ends at 54.37; then pupil 3 has to walk to the school. In this example, no transfers happen.
The school bus routing and scheduling problem with transfers has not yet been studied extensively in the literature. Newton and Thomas (1969) [20] and Newton and Warren (1970) [21] made one of the first attempts to solve the bus routing problem by the use of a computer for real life cases. They consider bus capacities and maximum user ride time constraints. In their approach, first a giant tour over all bus stops and the school is created. Then, starting from the school they build feasible routes by generating subroutes which satisfy the capacity and ride time constraints and connect them to the school (i.e., every route starts and ends at the school). Bektaş and Elmastaş (2007) [4] model their real life problem of transporting pupils of an elementary school in Ankara as a capacity and distance constrained OVRP. They minimize the number of buses which represents the operator objective and they use a solver to optimize their MILP model.
Recently, Riera-Ledesma and Salazar-González (2012) [28] have proposed a branch-and-cut approach for the school bus routing problem with pupil assignment and bus stop selection. They formulate the problem as a multivehicle traveling purchaser problem and discuss their extensions to the classic traveling purchaser model. Schittekat et al. (2013) [32] investigate the influence of bus stop selection on solution quality. They solve the problem using a parameter-free greedy randomized adaptive search procedure combined with a variable neighborhood descent improvement method.
More general approaches do not separate pupils of different schools but allow mixed loads (i.e., the transportation of pupils attending different schools with the same bus). Braca et al. (1997) [6] solve the school bus routing problem with mixed loads for the region of New York City. They construct mixed load routes in a randomized way. Using restarts, they are able to generate different routes. Additionally, they consider arrival time windows at school and maximum ride times. The authors of [24] propose an improvement procedure for mixed loads. First, an initial solution without mixed loads is computed. Then, relocation operators, moving pickup bus stops and, if necessary, also school bus stops, are used to obtain mixed load routes.
Besides the routing problem, the scheduling problem also has to be solved. It consists of fixing the departure times of the buses and synchronizing them at transfer points. A scheduling algorithm for the school bus routing problem without transfers was recently proposed in [16]. The bus trips to the different schools are considered as given and their duration and the school begin times are known. Buses must then be scheduled so as to transport all pupils to school on time. Spada et al. (2005) [33] propose a heuristic solution framework for the multiobjective school bus routing and scheduling problem. In their approach, they fix the number of buses and optimize the service level.
A further extension is the consideration of transfers (i.e., pupils may change buses during their way to school). A scheduling model which considers transfers of pupils between buses was proposed by Fügenschuh (2009) [14]. He optimizes school begin times to reduce the number of required buses. In the context of the pickup and delivery problem Masson et al. (2012) [18] analyze the effects of transfers on transportation costs. They design an adaptive large neighborhood search algorithm that handles transfers explicitly by special operators. In the context of public transportation, Cortés et al. (2010) [9] propose a branch-and-cut solution method for the pickup and delivery problem with transfers.
In the approach of Cortés et al., transfers can only take place at predetermined points, so-called transfer nodes. Bouros et al. (2011) [5] also solve the pickup and delivery problem with transfers. They allow transfers at arbitrary points in space, where the detour of two different vehicles is within a certain amount.
An early approach which takes into account school bus routing and scheduling with transfers, pupil assignment, and bell time adjustment was proposed by Desrosiers et al. (1981) [12]. They distinguish between urban and rural areas. For rural areas, pupils are collected from their homes and are transported to predefined transfer points, where they change the buses and are transported to their destination school through express routes. For every pupil, it is known to which transfer point she must be transported. In urban areas, pupils are assigned to bus stops where they are picked up by the bus. Routes are generated using modified versions of the Clarke and Wright Savings algorithm [7], Newton's giant tour approach [20] and an insertion approach [29]. Then, the routes are scheduled and the school begin times are adjusted so as to minimize the number of buses. They solve the problem hierarchically. The proposed approach does not include a sophisticated solution improvement method.
Recently, a literature overview was published by Park and Kim (2010) [23] which summarizes and categorizes the work in this area.
None of the above concepts explicitly considers the overall problem of bus stop selection, bus routing, bus scheduling, and transfers within a state-of-the-art metaheuristic solution method.

MATHEMATICAL MODEL
The problem considered in this article can be modeled as a MILP using two types of binary decision variables that determine the bus network and how it is used by the pupils: bus arcs ($x_{ijb}$) and pupil arcs ($m_{nijb}$). As the bus arc variables determine the bus lines and each line is served by a single bus, this part of the model is similar to a three-index VRP formulation, except that a bus stop may be visited multiple times by different buses. The pupil arcs define the path each pupil takes from her home bus stop to the destination bus stop of her school. Pupils can only use those arcs which are serviced by a bus.
For ease of exposition, we introduce an artificial depot denoted by 0 where each bus line must start and end. Each pupil has to be assigned to a single bus stop, where she must be picked up by a bus and use some sequence of bus line arcs until she reaches her destination bus stop (i.e., the bus stop of her school).
Transfers reduce service quality, and an excessive number of transfers results in impractical solutions. In our heuristic solution concept, we penalize excessive transfers by an additional term in the objective function. Therefore, we also adhere to this approach in the model below.
In what follows, we first define the different input parameters and then the decision variables we use to model the school bus routing and scheduling problem with transfers.
It is subject to a number of constraints. The first set of constraints takes care of the assignment of pupils to bus stops. Every pupil $n$ must be assigned to a single bus stop $i$ out of a set of possible pickup bus stops $L_n$:
$$\sum_{i \in L_n} y_{ni} = 1, \quad \forall n \in N.$$
The bus line network is defined by constraints (3)-(5). A bus line, if used, must have a single origin at the virtual depot, the bus line must continue until the virtual depot, and every bus line may service each bus stop at most once.
Based on the bus line network, the paths of the pupils are determined. Pupils may only use arcs that are serviced by a bus line. If pupil $n$ is assigned to bus stop $i$ ($y_{ni} = 1$) which is not her destination bus stop ($i \neq i_{s_n}$), she must leave the pickup bus stop ($\sum_{j \in L, j \neq i} \sum_{b \in B} m_{nijb} = 1$). Similarly, if pupil $n$ arrives at bus stop $h$ ($\sum_{i \in L, i \neq h} \sum_{b \in B} m_{nihb} = 1$) and it is not her destination bus stop ($h \neq i_{s_n}$), she must travel on to another bus stop ($\sum_{j \in L, j \neq h} \sum_{b \in B} m_{nhjb} = 1$).
In reality, it is usually impossible to have pupils that attend the same school and are assigned to the same pickup bus stop use different paths in the network. To avoid this situation, we make sure that, if pupil $n$ travels along arc $(i, j)$ and she attends the same school as pupil $n'$ (i.e., $s_n = s_{n'}$) and leaves from the same pickup bus stop $h$ (i.e., $y_{nh} = y_{n'h} = 1$), pupil $n'$ uses this arc as well:
$$\sum_{b \in B} m_{nijb} \le \sum_{b \in B} m_{n'ijb} + (2 - y_{nh} - y_{n'h}), \quad \forall n, n' \in N \mid s_n = s_{n'},\ h \in L_n \cap L_{n'},\ i, j \in L.$$
We note that pupils $n$ and $n'$ are not required to use the same bus on this arc; this may not be possible because of capacity restrictions.
The following constraints ensure that only feasible transfers are considered. To make sure that each pupil uses each bus at most once during her journey to school, we use binary variables $z_{nbi}$ to indicate whether pupil $n$ leaves bus $b$ at stop $i$. These variables are set to 1 whenever pupil $n$ arrives at bus stop $i$ with bus $b$ but does not leave the bus stop with bus $b$ (i.e., if $\sum_{j \in L, j \neq i} m_{njib} = 1$ and $\sum_{j \in L, j \neq i} m_{nijb} = 0$, then $z_{nbi} = 1$). A further set of constraints ensures that each pupil $n$ may only leave (and thus use) each bus $b$ at most once. To determine whether the number of transfers of pupil $n$ exceeds the preset transfer limit $C$, we count the number of transfers of pupil $n$. It is given by the number of times pupil $n$ leaves a bus at a bus stop that is not her school ($\sum_{b \in B} \sum_{i \in L, i \neq i_{s_n}} z_{nbi}$).
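The transfer bookkeeping described above can be illustrated outside the MILP with a small Python sketch. It assumes that a pupil's path is given as the list of buses used on consecutive arcs and that the number of excessive transfers is max(0, transfers − C), which is one natural reading of the text; the function names are hypothetical and not part of the original model.

```python
def count_transfers(bus_sequence):
    """Count bus changes along a pupil's path.

    bus_sequence lists the bus used on each consecutive arc of the path,
    so every change of bus happens at a stop that is not the school stop.
    """
    return sum(1 for prev, cur in zip(bus_sequence, bus_sequence[1:]) if prev != cur)


def excessive_transfers(bus_sequence, transfer_limit):
    """Assumed definition: transfers beyond the preset limit C, never negative."""
    return max(0, count_transfers(bus_sequence) - transfer_limit)


def uses_each_bus_once(bus_sequence):
    """True if the pupil never boards a bus again after having left it."""
    left = set()
    for prev, cur in zip(bus_sequence, bus_sequence[1:]):
        if prev != cur:
            left.add(prev)
        if cur in left:
            return False
    return True
```

For a path served by buses b1, b1, b2, b3 the sketch counts two transfers, one of which is excessive if the limit is set to 1.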
Variables $r_n$ give the number of excessive transfers of pupil $n$. Transfers are only allowed in one direction: either from bus $b'$ to bus $b$ or from $b$ to $b'$. This is ensured by two sets of constraints: if at least one pupil changes from bus $b'$ to bus $b$ at bus stop $j$, then $v_{jb'b} = 1$, and constraints (15) make sure that if $v_{jb'b} = 1$, then $v_{jbb'} = 0$ and vice versa.
The following constraints ensure temporal and logical feasibility. The purpose of these constraints is to synchronize the buses at transfer bus stops, to synchronize pupils and buses, and to ensure time feasibility (i.e., to make sure that pupils arrive at their schools within the respective arrival time windows). If pupil $n$ travels from $i$ to $j$, then her arrival time at $j$ must be greater than or equal to her arrival time at $i$ plus the time necessary to travel from $i$ to $j$, given by $t_{ij}$. If pupil $n$ arrives at bus stop $i$, this is the destination bus stop of school $s$, and pupil $n$ attends school $s$, then her arrival time ($T_{n i_{s_n}}$) must lie within the school's arrival time window $[\tau_s - \overline{\omega}_s, \tau_s - \underline{\omega}_s]$. A bus may not visit a bus stop before the earliest possible time $e$, which, together with the latest possible arrival time at a school ($\max_s (\tau_s - \underline{\omega}_s)$), provides a bound on the travel times (excluding walking times) of the pupils. If bus $b$ travels from $i$ to $j$, then the arrival time $A_{jb}$ at $j$ must be equal to the arrival time at $i$ ($A_{ib}$) plus the respective travel time $t_{ij}$ (we assume that the service time is included in the travel time). Constraints (21) and (22) make sure that bus and pupil times are synchronized (i.e., if pupil $n$ travels on bus $b$, her arrival time at bus stop $i$ has to be equal to the arrival time of bus $b$).
The following constraints ensure timely synchronization of buses in the case of transfers.
If at least one pupil changes from bus $b'$ to bus $b$ at bus stop $i$, then the arrival time $A_{ib}$ of bus $b$ must not be greater than the arrival time of $b'$ plus the maximum waiting time $\overline{\gamma}$ and not lower than the arrival time of $b'$ plus the minimum waiting time $\underline{\gamma}$.
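The two temporal conditions, the school arrival window and the transfer synchronization, can be checked for a given schedule with a few lines of Python. This is a minimal sketch, not the authors' implementation: it assumes all times are in minutes, that the pupil changes to the bus that arrives later, and the function and parameter names are hypothetical.

```python
def arrival_in_school_window(arrival, school_begin, max_early, min_early):
    """Check tau_s - max_early <= arrival <= tau_s - min_early.

    max_early and min_early play the roles of the upper and lower bound
    on the number of minutes a pupil may arrive before school begins.
    """
    return school_begin - max_early <= arrival <= school_begin - min_early


def transfer_is_synchronized(arrival_from, arrival_to, min_wait, max_wait):
    """Check that the receiving bus arrives within the allowed waiting time.

    arrival_from: arrival time of the bus the pupil leaves.
    arrival_to:   arrival time of the bus the pupil changes to.
    """
    return arrival_from + min_wait <= arrival_to <= arrival_from + max_wait
```

With a school begin time of 480, an arrival at 455 satisfies a window of 5 to 30 minutes before school begins, whereas an arrival at 476 is too late.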
Finally, bus capacity constraints must also be considered. They ensure that on every arc $(i, j)$ which is serviced by bus $b$, the number of pupils does not exceed the capacity $c$. We use several so-called big-M terms. Let $K$ denote the latest feasible arrival time at a school across all schools (i.e., $K = \max_{s \in S} \{\tau_s - \underline{\omega}_s\}$). All these terms can conveniently be set to $K$.
The above formulation cannot be solved using state-of-the-art solvers like Gurobi or CPLEX for reasonably sized problems; therefore, we develop a heuristic solution concept.

HEURISTIC SOLUTION CONCEPT
The idea of the proposed solution concept is to decompose the overall problem into several simpler (hierarchical) subproblems which can be solved in reasonable time by dedicated heuristics, similar to the approach by Desrosiers et al. (1981) [12]. However, we include feedback loops that allow information exchange between the different hierarchical levels. If infeasibilities are detected at some level, then this information is conveyed to all earlier stages and, in the next loop, appropriate measures are taken at these earlier stages to avoid the reported infeasibilities at later stages. As destroy and repair (or ruin and recreate)-based neighborhood search [26,19] has been successfully applied in the context of several other rich combinatorial optimization problems, we also base our framework on this idea.
Algorithm 1 outlines our solution concept. First, a feasible solution is built, which is then improved using a destroy and repair-based optimization approach [26]. To obtain a feasible solution, the pupil assignment and bus stop selection problem has to be solved first. It determines the bus stops that have to be visited in the bus route generation step. The bus route network computed in this second step provides the basis for pupil routing (i.e., the identification of the actual path taken by each pupil in the network). This information is in turn the input to the bus scheduling step, which determines pupil and bus arrival times as well as departure times at the different stops.
Our objective is the minimization of the travel costs of the buses (i.e., the sum of the costs of the arcs serviced by the buses). The function Cost(s) called in Algorithm 1, line 7 returns the objective value of a solution as defined in (1), where W is set to 100 and C in Equation (13) is set according to the maximum allowed transfers.
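As an illustration of how such a cost function might look, the following Python sketch adds a linear penalty for excessive transfers to the bus travel costs. The linear form of the penalty, the data layout (routes as stop sequences, travel times keyed by arc), and all names are assumptions made for this example; only the weight W = 100 is taken from the text.

```python
def solution_cost(bus_routes, travel_time, excess_transfers, penalty_weight=100):
    """Travel cost of all serviced arcs plus a penalty for excessive transfers.

    bus_routes:       list of routes, each a list of consecutive bus stops
    travel_time:      dict mapping an arc (i, j) to its travel cost
    excess_transfers: one non-negative number per pupil (assumed r_n values)
    penalty_weight:   assumed to correspond to W in the objective
    """
    travel = sum(
        travel_time[(i, j)]
        for route in bus_routes
        for i, j in zip(route, route[1:])
    )
    return travel + penalty_weight * sum(excess_transfers)
```

Arcs to and from the virtual depot can be given a travel cost of zero by the caller if they should not contribute to the objective.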
Every step of the algorithm is explained in detail in the following subsections. We first describe the solution construction process (function InitialSolution(input data) in Algorithm 1) and its elements, then we illustrate the neighborhood-based search method (Algorithm 1, beginning with line 3).
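The general destroy and repair principle can be sketched in Python as follows. This is a generic skeleton under simplifying assumptions (only improving candidates are accepted, a fixed iteration budget is used) and does not reproduce Algorithm 1; the toy problem, which orders points on a line, and all function names are purely illustrative.

```python
import random


def destroy_and_repair(initial, cost, destroy, repair, iterations=200, seed=0):
    """Generic improvement loop: partially destroy the incumbent, repair it,
    and keep the candidate if it is cheaper."""
    rng = random.Random(seed)
    best, best_cost = initial, cost(initial)
    for _ in range(iterations):
        partial, removed = destroy(best, rng)
        candidate = repair(partial, removed)
        candidate_cost = cost(candidate)
        if candidate_cost < best_cost:
            best, best_cost = candidate, candidate_cost
    return best, best_cost


# Toy problem (hypothetical): visit distinct points on a line in some order.
def tour_cost(order):
    return sum(abs(a - b) for a, b in zip(order, order[1:]))


def remove_random(order, rng, k=2):
    """Destroy step: remove k randomly chosen elements."""
    removed = rng.sample(order, k)
    return [x for x in order if x not in removed], removed


def greedy_insert(partial, removed):
    """Repair step: reinsert each removed element at its cheapest position."""
    result = list(partial)
    for x in removed:
        best_pos = min(
            range(len(result) + 1),
            key=lambda p: tour_cost(result[:p] + [x] + result[p:]),
        )
        result.insert(best_pos, x)
    return result


start = [5, 1, 4, 2, 3]
best, best_cost = destroy_and_repair(start, tour_cost, remove_random, greedy_insert)
```

In the setting of this article, the destroy step would remove parts of the bus network and the repair step would have to rerun routing and scheduling, which is why the feedback between the construction components matters.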

Algorithm 2 Solution construction phase
With the exception of pupil assignment and bus stop selection, all components are tightly connected. However, it cannot be ensured that a feasible pupil routing exists for a given bus routing or that, for a given pupil routing, a feasible bus schedule respecting all temporal constraints can be determined. Therefore, we exchange information between these three solution construction components. If an infeasibility is detected at any stage, the responsible part of the solution is identified and this information is passed to the earlier stages. There, this information is used to modify the respective solution accordingly.

Pupil Assignment and Bus Stop Selection.
Every pupil must be assigned to a bus stop, where she is picked up by a bus. In our approach, pupil assignment and bus stop selection is done in a single step. We note that, currently, changing the assignment of pupils to departure bus stops is not considered in the optimization. However, to include pupil assignment and bus stop selection into the overall optimization framework, the optimization algorithm is run with several different pupil to bus stop assignment strategies.
We formulate the pupil to bus stop assignment problems as integer programs. Using state-of-the-art commercial solver software, problem instances of realistic size can be solved within a reasonable amount of time.
We use the following three alternative assignment strategies:
Minimize Distance to Pupils' Destinations (pa1).
The model is given by objective function (33) and constraints (34), (35), and (36). The objective is to minimize the distance to the pupils' destinations (i.e., pupils are assigned to bus stops which are located in the direction of their destinations).
Consider $h_n$ to be the destination bus stop of pupil $n$ and $d_{l h_n}$ to be the travel time from bus stop $l$ to bus stop $h_n$; the objective function minimizes these travel times over the assigned bus stops. It is subject to the following constraints: every pupil $n$ must be assigned to exactly one bus stop $i$, and the number of pupils that can be assigned to a location $i$ is limited by the bus capacity $c$. This problem is similar to the capacitated facility location problem (CFLP) [10], where the pupils are the customers and the bus stops are the facilities. Therefore, this problem may also be solved with special algorithms for the CFLP.
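For very small instances, the pa1 assignment can be illustrated by plain enumeration. The Python sketch below assumes that the objective is the sum, over all pupils, of the travel time from the assigned bus stop to the pupil's destination bus stop, and that each candidate set already respects the maximum walking time; it is an exhaustive toy method, not the integer program solved in the article, and all names are hypothetical.

```python
from itertools import product


def assign_pupils_pa1(candidates, dest_stop, travel_time, capacity):
    """Enumerate all assignments and return the cheapest feasible one.

    candidates:  dict pupil -> list of candidate pickup bus stops
    dest_stop:   dict pupil -> destination bus stop
    travel_time: nested dict, travel_time[l][h] from stop l to stop h
    capacity:    maximum number of pupils assigned to one bus stop
    """
    pupils = sorted(candidates)
    best_cost, best_assignment = float("inf"), None
    for choice in product(*(candidates[n] for n in pupils)):
        # Each pupil gets exactly one stop by construction of the product.
        load = {}
        for stop in choice:
            load[stop] = load.get(stop, 0) + 1
        if max(load.values()) > capacity:
            continue  # capacity limit per bus stop violated
        cost = sum(travel_time[stop][dest_stop[n]] for n, stop in zip(pupils, choice))
        if cost < best_cost:
            best_cost, best_assignment = cost, dict(zip(pupils, choice))
    return best_assignment, best_cost


candidates = {0: ["A", "B"], 1: ["A", "B"], 2: ["B"]}
dest_stop = {0: "S", 1: "S", 2: "S"}
travel_time = {"A": {"S": 10}, "B": {"S": 4}}
assignment, cost = assign_pupils_pa1(candidates, dest_stop, travel_time, capacity=2)
# assignment == {0: "A", 1: "B", 2: "B"}, cost == 18
```

Stop B is closer to the school, but the capacity of 2 forces one pupil to stop A. The enumeration grows exponentially with the number of pupils, which is why an integer programming solver is used for realistic instances.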
Minimize Number of Bus Stops (pa3).
The third model formulation, given by the objective function (42) and constraints (34)-(36), (43), (44), minimizes the number of bus stops used. This choice appears advantageous from the operator perspective: fewer bus stops might result in shorter, and therefore less costly, bus routes. However, the main drawback of this assignment strategy is that there is no information on the pupils' destinations in the assignment phase. This can lead to situations where pupils have to walk long distances in the opposite direction of their school, which may result in longer travel times and possibly in a higher number of transfer points. Let $q_i$ with $i \in L$ equal 1 if bus stop $i$ is used and 0 otherwise, and let $M \geq |N|$ be a constant.
All of the above models are solved using IBM ILOG CPLEX, which is fast enough even for large problems. In either case, the solution is an assignment of pupils to bus stops and pupils must be picked up at the selected bus stops. Our bus routing component ensures that this is the case.

Bus Routing.
After the assignment of pupils to bus stops, an initial routing solution is constructed. The bus routing ensures that there is a path for every pupil from the initial pickup bus stop to the destination bus stop. The proposed method is based on the following idea: if we consider only a single school and disregard the capacity constraints of the buses, the optimal solution with regard to the objective is a minimum spanning tree (MST). It consists of the shortest edges and connects all nodes which are part of the single school subproblem.
Our approach is the following: for every school $s$, a subset of nodes is constructed which contains the bus stops where pupils attending $s$ are waiting. Then, we use Prim's algorithm to construct an MST with directed edges for this set of nodes. The starting node is the destination bus stop for school $s$. The edges are directed toward the previously selected bus stop; therefore, from every node there is a path to the destination bus stop.
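The construction of such a rooted tree can be sketched with a heap-based version of Prim's algorithm in Python. The sketch assumes symmetric travel times and integer stop identifiers; the function name and the example weights are hypothetical. The returned parent pointers encode the edge directions: each stop points to the previously selected stop on its way to the school.

```python
import heapq


def school_tree(stops, root, weight):
    """Prim's algorithm started at the destination bus stop of a school.

    Returns parent, where parent[v] is the next stop on the way from v
    to the root, so every arc (v, parent[v]) is directed toward the school.
    """
    parent = {}
    in_tree = {root}
    heap = [(weight[root][v], v, root) for v in stops if v != root]
    heapq.heapify(heap)
    while heap and len(in_tree) < len(stops):
        _, v, u = heapq.heappop(heap)
        if v in in_tree:
            continue  # outdated heap entry
        in_tree.add(v)
        parent[v] = u
        for x in stops:
            if x not in in_tree:
                heapq.heappush(heap, (weight[v][x], x, v))
    return parent


# Hypothetical symmetric travel times between four stops; stop 4 is the school stop.
edges = {(4, 11): 4, (11, 12): 3, (11, 14): 5, (4, 12): 6, (4, 14): 9, (12, 14): 7}
weight = {}
for (u, v), w in edges.items():
    weight.setdefault(u, {})[v] = w
    weight.setdefault(v, {})[u] = w

parent = school_tree([4, 11, 12, 14], root=4, weight=weight)
# parent == {11: 4, 12: 11, 14: 11}
```

Following the parent pointers from any stop leads to stop 4, which is the property the bus routing step relies on.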
Paths in this tree may become long and, therefore, are likely to violate time constraints (i.e., they may be longer than the planning horizon). Therefore, we limit the maximum length of a path in the MST. Any path in the spanning tree must be at most as long as the planning horizon of the respective school $s$ (i.e., $\tau_s - \underline{\omega}_s - e$).
We achieve the adaptation of the spanning tree by changing the weight matrix which is used for the tree construction. During the tree construction, the length of every path from the root node to the current leaves is stored. If the length exceeds the maximum length, the weight matrix is changed in the following way: the weights of the arc which violates the length restriction and of the two preceding arcs are increased by a certain amount. Consider a path from the leaf node $p_n$ (the bus stop farthest away from the school) to the root node $p_0$ (the destination bus stop). The path length starts with $e$ at the root node and increases along the opposite direction of the path until the leaf node. If the travel time exceeds the end of the planning horizon ($\tau_s - \underline{\omega}_s$) on this path at arc $(p_t, p_{t-1})$, then the weights of the following arcs are increased: $(p_t, p_{t-1})$, $(p_{t-1}, p_{t-2})$, and $(p_{t-2}, p_{t-3})$. Using this scheme, long paths in the spanning tree can be shortened. This is repeated until all paths satisfy the maximum length restriction. Weights are only changed temporarily for the construction of the MST.
Figure 2 shows an example of an MST based on original weights and an MST based on adapted weights. The circles represent the bus stops which must be visited; the rectangle represents the destination bus stop of the pupils. Arc labels are the lengths of the arcs in the spanning tree, and the nodes are labeled with the cumulative length of the path (starting from the school bus stop). In the example, only the relevant subtree is labeled. It can be seen that the longest path in this tree has a length of 165. If we restrict the maximum length of a path in the tree to 120, then the weights are adapted iteratively until every path has a length of at most 120. The final spanning tree is shown in Figure 2b, where the longest path is 112 and thus satisfies the maximum length restriction.
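The weight adaptation scheme can be sketched in Python as an outer loop around Prim's algorithm. The sketch makes several assumptions that are not stated in the text: travel times are symmetric and stored per unordered pair, path lengths are measured with the original weights starting from 0 at the root, the increase is a fixed amount (penalty), and a round limit guards against instances for which no feasible tree is found. All names are hypothetical.

```python
import heapq


def prim_parents(stops, root, weight):
    """Prim's algorithm; parent[v] is the next stop on the way to the root."""
    parent, in_tree = {}, {root}
    heap = [(weight[frozenset((root, v))], v, root) for v in stops if v != root]
    heapq.heapify(heap)
    while heap:
        _, v, u = heapq.heappop(heap)
        if v in in_tree:
            continue
        in_tree.add(v)
        parent[v] = u
        for x in stops:
            if x not in in_tree:
                heapq.heappush(heap, (weight[frozenset((v, x))], x, v))
    return parent


def path_lengths(parent, root, weight):
    """Cumulative original travel time between each stop and the root."""
    length = {root: 0.0}

    def resolve(v):
        if v not in length:
            length[v] = resolve(parent[v]) + weight[frozenset((v, parent[v]))]
        return length[v]

    for v in parent:
        resolve(v)
    return length


def length_restricted_tree(stops, root, weight, max_length, penalty=10.0, max_rounds=1000):
    """Rebuild the tree with increased weights until no path is too long."""
    adapted = dict(weight)  # temporary copy; the original weights stay untouched
    for _ in range(max_rounds):
        parent = prim_parents(stops, root, adapted)
        length = path_lengths(parent, root, weight)
        # Arcs at which a path first exceeds the limit.
        violating = [
            v for v in parent
            if length[v] > max_length and length[parent[v]] <= max_length
        ]
        if not violating:
            return parent
        for v in violating:
            node = v
            for _ in range(3):  # violating arc and up to two preceding arcs
                if node == root:
                    break
                adapted[frozenset((node, parent[node]))] += penalty
                node = parent[node]
    raise ValueError("no tree within the length limit found")


# Hypothetical instance: the plain MST is the chain 2 -> 1 -> 0 of length 20.
weight = {frozenset((0, 1)): 10, frozenset((1, 2)): 10, frozenset((0, 2)): 15}
tree = length_restricted_tree([0, 1, 2], root=0, weight=weight, max_length=18)
# tree == {1: 0, 2: 0}
```

In this small instance a single round of weight increases is enough: stop 2 is then connected directly to the school stop, and the longest path has length 15.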
All arcs in the MST are directed toward the respective school. As soon as a feasible bus graph for each school has been found, we obtain a single directed graph by taking the union of all their edges. Therefore, for every pupil a path in this graph exists that starts at her initial bus stop (home) and ends at the destination bus stop (school).

Pupil Routing.
Based on this initial bus route graph, pupil paths are calculated. This is done iteratively for every pupil waiting at a bus stop using a shortest path algorithm on the bus graph, taking into account bus capacities (i.e., the arcs are capacitated). As bus capacities were neglected in the previous step, it is likely that not all pupils can be routed through the graph without violating arc capacities. In this case, the graph has to be augmented. Figure 3 shows an example of bus graph augmentation. Figure 3a is the bus graph generated in the previous step with labeled nodes. The labels represent the node numbers. We assume that exactly one pupil is waiting at every bus stop. The arc labels in parentheses are the arc utilizations (i.e., the number of pupils on the bus servicing this arc). On arc (14,11) the utilization is 3 and on arc (12,11) the utilization is 2. Therefore, the utilization on arc (11,10) is 3 + 2 + 1 = 6. Now assume that the bus capacity, and thus the arc capacity, is 5. On arcs (11,10) and (10,1) this constraint is violated and the graph needs to be adapted.
The idea is to augment the graph as little as possible and, therefore, minimize the additional costs necessary to transport all pupils. For every pupil who does not reach her destination, a set of reachable bus stops is identified. From this set for every bus stop the cost of an arc to the destination bus stop is calculated. The arc with the least cost is added to the bus graph. This is done iteratively until all pupils reach their destination bus stops. The order in which the pupils are considered in the routing is random but consistent (i.e., in every iteration pupils are routed in the same order).
In the example in Figure 3a suppose that pupils waiting at bus stops 13 and 12 are not yet routed through the network. Now the pupil at bus stop 12 is routed. She only reaches bus stop 10, then a capacity constraint violation occurs. Pupil 13 is routed next and reaches bus stop 11 before capacity constraint violation occurs: the residual capacity on arc (11,10) is zero at that point. For those two pupils, the set of reachable nodes is calculated. For the pupil at node 13 it is 13, 12, 11. Now the graph is augmented by adding the shortest arc which connects a node from the set of reachable nodes with the destination bus stop. In this example this is arc (12,1). Figure 3b shows the result with adapted arc utilization. Now a capacity feasible pupil routing exists and the temporal aspects of this partial solution must be checked. This is done by the bus scheduling component.
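The routing and augmentation steps can be sketched in Python as follows. This is an illustration under assumptions, not the authors' implementation: routing restarts from scratch after every augmentation, only arcs that are not yet in the graph are considered for insertion, and the names `route_one`, `reachable`, and `route_pupils` as well as the small instance are hypothetical.

```python
import heapq
import math

def route_one(arcs, residual, src, dst):
    """Dijkstra restricted to arcs with free capacity; returns a list of arcs or None."""
    best = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > best[u]:
            continue
        for (a, b), length in arcs.items():
            if a == u and residual[(a, b)] > 0 and d + length < best.get(b, math.inf):
                best[b] = d + length
                prev[b] = a
                heapq.heappush(heap, (best[b], b))
    if dst not in best:
        return None
    path, v = [], dst
    while v != src:
        path.append((prev[v], v))
        v = prev[v]
    return path[::-1]

def reachable(arcs, residual, src):
    """Bus stops that can still be reached from src over arcs with free capacity."""
    seen, stack = {src}, [src]
    while stack:
        u = stack.pop()
        for (a, b) in arcs:
            if a == u and residual[(a, b)] > 0 and b not in seen:
                seen.add(b)
                stack.append(b)
    return seen

def route_pupils(pupils, arcs, capacity, time):
    """Route pupils in a fixed order; add the cheapest arc to the school on failure."""
    arcs = dict(arcs)
    while True:
        residual = {a: capacity for a in arcs}
        paths, failed = [], None
        for stop, school in pupils:
            path = route_one(arcs, residual, stop, school)
            if path is None:
                failed = (stop, school)
                break
            for a in path:
                residual[a] -= 1
            paths.append(path)
        if failed is None:
            return arcs, paths
        stop, school = failed
        candidates = [(time(u, school), u)
                      for u in reachable(arcs, residual, stop)
                      if u != school and (u, school) not in arcs]
        if not candidates:
            raise ValueError("no new arc available in this simplified sketch")
        cost, u = min(candidates)
        arcs[(u, school)] = cost           # augment the bus graph

# Hypothetical instance: school at stop 0, one pupil at each of the stops 1, 2, 3.
coords = {0: (0, 0), 1: (1, 0), 2: (1, 1), 3: (2, 1)}
travel = lambda u, v: math.dist(coords[u], coords[v])
graph = {(3, 2): 1.0, (2, 1): 1.0, (1, 0): 1.0}
new_graph, paths = route_pupils([(1, 0), (2, 0), (3, 0)], graph, capacity=2, time=travel)
```

With a capacity of 2, the third pupil cannot pass arc (1, 0), so the sketch adds arc (2, 0), the cheapest connection from a reachable stop to the school.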

Bus Line Scheduling.
The purpose of bus line scheduling is to fix the begin times of the bus routes, such that they are synchronized at the transfer points and pupils reach their schools within the arrival time windows. This synchronization at transfer points possibly leads to waiting times for pupils and buses, and therefore, it may increase the travel time of pupils. Waiting times can only arise at transfer points.
Algorithm 3 shows the general flow of the bus line scheduling. It is done on the bus line graph, which is an aggregation of the bus route graph. A bus line is a path in the routing graph. Therefore, bus line scheduling is divided into two phases: a preprocessing phase to detect cycles in the solution for which no feasible schedule exists (Algorithm 3, lines 4-7), and the scheduling phase. Figure 5 shows an excerpt of a solution which contains such an unresolvable cycle. In this example, pupil 1 is at bus stop 1 and needs to go to bus stop 3 (indicated by the dashed arc), pupil 2 is at bus stop 2 and needs to go to bus stop 1, and pupil 3 is at bus stop 3 and needs to go to bus stop 2. A cycle here is not an arc cycle but a cycle in the sequence of arcs used by pupils. This becomes clear if we add the time dimension as in Figure 5c. There, the pupil paths are aligned according to their arc utilization. It becomes clear that if arc (1, 2) is serviced first, then arc (2, 3) and finally arc (3, 1), then no further service of arc (1, 2) is available, which pupil 3 needs to get from bus stop 1 to bus stop 2.
The preprocessing detects such situations and repairs them. This is done by building a temporary graph based on the pupil paths (Fig. 5d), and on this graph we apply a topological sort. Every node in the temporary graph is an arc of the bus graph; (1-2) is the arc from bus stop 1 to bus stop 2. The arcs represent a path of a pupil; for example, pupil 1 utilizes arcs (1, 2) and (2, 3). Therefore, an arc [(1-2), (2-3)] is inserted into the graph. This is done for every pupil. If the resulting graph does not contain a cycle (i.e., it is topologically sortable), an order of arcs exists which can physically be serviced. In this context, we refer to this step as graph sequencing. Even though this could also be integrated into the scheduling, we decided to add an extra step, because the problem of nondecomposable graphs occurs often in case of bigger instances and it is faster to check.
In case the sequencing graph is not topologically sortable (as in Fig. 5d), the pupil routing must be changed and, therefore, the underlying bus graph. To eliminate cycles two different approaches are implemented. The first approach is to determine all arcs which are contained in the cycles and to insert inverted arcs. For example, adding the arc (2,1) to the graph in Figure 5b would eliminate the cycle and result in the pupil arc graph depicted in Figure 5e. The reverse arcs are inserted into the graph and the pupil paths are calculated based on the new graph and the sequencing is done again. This is repeated until the graph is topologically sortable.
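The graph sequencing check can be illustrated with a short Python sketch based on Kahn's algorithm for topological sorting. The function name `sequence_arcs` and the concrete pupil paths are illustrative assumptions that follow the three-pupil example of Figure 5 as described in the text.

```python
from collections import defaultdict, deque

def sequence_arcs(pupil_paths):
    """Return a service order of the bus arcs, or None if the pupil paths form a cycle.

    Nodes of the temporary graph are bus arcs; an edge links two arcs that some
    pupil uses consecutively.
    """
    succ = defaultdict(set)
    indeg = defaultdict(int)
    nodes = set()
    for path in pupil_paths:
        nodes.update(path)
        for first, second in zip(path, path[1:]):
            if second not in succ[first]:
                succ[first].add(second)
                indeg[second] += 1
    queue = deque(sorted(a for a in nodes if indeg[a] == 0))
    order = []
    while queue:
        arc = queue.popleft()
        order.append(arc)
        for nxt in sorted(succ[arc]):
            indeg[nxt] -= 1
            if indeg[nxt] == 0:
                queue.append(nxt)
    # every node is emitted exactly when the graph is acyclic
    return order if len(order) == len(nodes) else None

# Pupil 1: 1 -> 3, pupil 2: 2 -> 1, pupil 3: 3 -> 2 on the arcs (1,2), (2,3), (3,1).
cyclic = [[(1, 2), (2, 3)], [(2, 3), (3, 1)], [(3, 1), (1, 2)]]
# After inserting the reverse arc (2,1), pupil 2 can travel directly.
repaired = [[(1, 2), (2, 3)], [(2, 1)], [(3, 1), (1, 2)]]
order = sequence_arcs(repaired)
```

For the cyclic paths the function returns `None`; for the repaired paths it returns an order in which arc (3, 1) is serviced before (1, 2), and (1, 2) before (2, 3).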
If the cycles cannot be removed in this way (i.e., it is detected, that all reversed arcs of the cycle are already in the bus route graph), the strategy is changed. Now shortcuts between two arbitrary nodes in the cycles are added until the graph sequencing is feasible.
Finally, the starting times of the bus routes must be fixed. This is done using the approach of Dechter et al. [11], referred to as the simple temporal problem (STP). They describe an approach where a set of inequalities of the form $a_1 \le X_1 \le b_1$, $a_2 \le X_2 - X_1 \le b_2$, ..., $a_n \le X_n - X_{n-1} \le b_n$ can be transformed into a weighted graph and solved as an all-pairs shortest path problem. A detailed description of this method is given in [11] and the application in a DARP context is shown in [18]. If a solution without negative cycles exists, a feasible schedule exists and the algorithm returns the lower and upper bounds of the variables; fixing those variables to their upper bounds yields a feasible solution. In this case, the solution construction process terminates with a feasible solution.
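A minimal Python sketch of this scheduling step is given below. It assumes the standard distance-graph construction for difference constraints (an arc with weight $b$ from $i$ to $j$ and an arc with weight $-a$ from $j$ to $i$ for $a \le X_j - X_i \le b$), uses the Floyd-Warshall algorithm for the all-pairs shortest paths, and treats variable 0 as the time origin. The function name `solve_stp` and the numbers are hypothetical.

```python
import math

def solve_stp(n, constraints):
    """Solve difference constraints a <= X_j - X_i <= b given as tuples (i, j, a, b).

    Variable 0 is the time origin (X_0 = 0). Returns a list of (lower, upper)
    bounds per variable, or None if the distance graph has a negative cycle.
    """
    d = [[0 if i == j else math.inf for j in range(n)] for i in range(n)]
    for i, j, a, b in constraints:
        d[i][j] = min(d[i][j], b)      # X_j - X_i <= b
        d[j][i] = min(d[j][i], -a)     # X_i - X_j <= -a
    for k in range(n):                 # Floyd-Warshall
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    if any(d[i][i] < 0 for i in range(n)):
        return None                    # negative cycle: no feasible schedule
    return [(-d[i][0], d[0][i]) for i in range(n)]

# X_1 in [10, 20], X_2 - X_1 in [30, 40], X_2 in [0, 50]
bounds = solve_stp(3, [(0, 1, 10, 20), (1, 2, 30, 40), (0, 2, 0, 50)])
# Tightening X_2 <= 35 contradicts X_2 >= 10 + 30.
infeasible = solve_stp(3, [(0, 1, 10, 20), (1, 2, 30, 40), (0, 2, 0, 35)])
```

In the feasible example the bounds are $X_1 \in [10, 20]$ and $X_2 \in [40, 50]$, and the upper bounds $X_1 = 20$, $X_2 = 50$ satisfy all three constraints.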
In case there is a cycle in the temporal graph, then no feasible solution exists and the infeasibility must be resolved.
It is difficult to determine which component of the solution causes the infeasibility. For example, a trip arrives too late at a school. Is the trip really too long, or is it caused by multiple transfers of pupils and therefore the required synchronization at the transfer points (i.e., induced by waiting time)? Often it is not a single element but the combination of the elements which leads to infeasibilities.
Nonetheless, to resolve the infeasibility we use a simple approach. On the temporal aspects of the bus route graph we perform a backward scheduling (i.e., we determine the latest times of all routes beginning with the latest arrival time at the schools). Iteratively, we determine the latest times of the preceding events. This allows us to identify the nodes at which the time-constraint violations occur. Please note that by doing backward scheduling only, we are more restrictive than necessary and may, therefore, exclude feasible solutions.
In case an infeasibility is detected, the bus and pupil routes are analyzed and subpaths which are identified to be part of the infeasibility are stored and must not occur in successive pupil routing attempts. For the bus line graph shown in Figure  4 consider that the waiting time from bus line 6 to bus line 5 at transfer point 5 exceeds the maximum waiting time. Then, the subpath (6, 5, 1) for the pupils is declared forbidden and a different pupil routing on the basis of the bus graph must be found. Forbidden subpaths are stored in a tabu list.
Then, the pupil paths are recalculated under consideration of the forbidden subpaths contained in the tabu list (i.e., those subpaths must not be used on pupil paths). Therefore, the bus route graph must be adapted so that all pupils reach their destination bus stop. This process is repeated until the solution is feasible.
At this point a feasible initial solution is available, but it is likely that the solution utilizes more arcs than necessary. Therefore, arcs which are not necessary are removed from the graph in an iterative manner. Before an arc is removed, it is tested if the solution remains feasible without this arc. In case the arc is necessary for feasibility, it is not removed. Arcs are tested for removal from longest to shortest regardless of the arc utilization. Outgoing arcs of nodes which have only a single outgoing arc are not considered. This way a feasible local optimal solution is generated. Based on this solution the iterative improvement scheme is started.

Neighborhood-Based Search Method
Analysis of the solutions after construction showed that in some cases the structure of the initial solution was quite different compared to the optimal routing solution calculated by the solver for small problem instances. Therefore, we designed an improvement method which is able to transform the structure of a given solution in such a way that good solutions can be obtained. Figure 6 emphasizes this by comparing the initial solution and the final, improved solution. It is a problem with two schools, 18 bus stops, and eight pupils. There, it can be seen that the pupil flows of the two solutions are completely different: Figure 6a consists of two independent bus systems, whereas the optimized solution in Figure  6b has a central transfer bus stop from which the pupils are then transported to their destination bus stop. To achieve this, the underlying structure of the bus routes must be changed completely. Conversely, in case the solution is already good and requires only slight modifications to become very good, the optimization method should allow this, too.
We use the idea of destroy and repair-based neighborhood search [26] for several reasons: First, it currently is state-of-the-art and has been successfully applied to a number of different, complex routing problems [18,30,27]. Second, it allows us to control the amount of perturbation of a solution by parametrizing the operators, and therefore, it is able to balance exploitation and exploration. Third, additional methods can easily be integrated to adapt to the requirements of slightly different settings (e.g., if some new test instances are hard to optimize with the current operators, the operators can be adapted easily).
The idea here is to destroy or perturb a solution in terms of structure and then repair this solution in terms of solution quality. In this context, the destroy operator is a solution perturbation and the repair operator is a local search. In Algorithm 1 an outline of our approach is given. After an initial feasible solution is constructed, it is iteratively perturbed and reoptimized. The perturbation changes the bus routing graph by removing and inserting arcs without considering arc capacities or temporal constraints. This may result in an infeasible solution and it is necessary to restore feasibility. As we have already developed the methodology to restore feasibility for the solution construction phase, the same methods are used in the improvement phase, namely pupil routing and bus line scheduling (see sections 4.1.3 and 4.1.4). Finally, a local search is applied to the perturbed feasible solution. It improves the solution quality by either removing arcs from the bus route graph, or by exchanging long arcs with shorter arcs while preserving feasibility.
Hence two different types of operators are needed: perturbation and local search. They are described in the following.
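The interplay of perturbation, feasibility restoration, and local search can be summarized in a generic Python sketch. The acceptance rule (keep a candidate only if it improves the current solution), the fixed iteration count, and all function names are assumptions of this sketch; the operators are passed in as callables, and the toy problem at the end only demonstrates the control flow.

```python
import random

def improve(initial, cost, perturb, restore_feasibility, local_search, iterations, rng):
    """Perturb, restore feasibility, apply local search, and keep improvements.

    restore_feasibility may return None if no feasible solution could be restored.
    """
    current = initial
    for _ in range(iterations):
        candidate = perturb(current, rng)
        candidate = restore_feasibility(candidate)
        if candidate is None:
            continue
        candidate = local_search(candidate)
        if cost(candidate) < cost(current):   # assumed acceptance rule
            current = candidate
    return current

# Toy stand-in: a solution is an integer and its cost is the absolute value.
rng = random.Random(0)
result = improve(
    initial=20,
    cost=abs,
    perturb=lambda x, r: x + r.randint(-3, 3),
    restore_feasibility=lambda x: x,
    local_search=lambda x: x - 1 if x > 0 else (x + 1 if x < 0 else 0),
    iterations=50,
    rng=rng,
)
```

Because only improving candidates are accepted in this sketch, the returned solution is never worse than the initial one.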

Algorithm 4 Solution improvement
Perturbation.
All operators are randomized and work by deleting and adding arcs in the bus route graph. To prevent cycling, all moves which are induced by an operator are stored in a tabu list. If an arc is removed from the solution, then it must not be added again until it is removed from the tabu list. If an arc is added to the solution, then it must not be removed as long as it is in the tabu list. Two different lists are used, one for forbidden arc removals and one for forbidden arc insertions. The length of the tabu list is a parameter.
For all perturbation operators, the share of the solution that is perturbed can be given as a percentage of the number of arcs in the solution. A low value changes the solution only slightly, whereas a high value changes many arcs.
Every perturbation operation consists of two steps: arc removal, where a certain number of arcs is removed, followed by arc insertion, where an operator inserts a certain number of arcs.
Long arc removal. All arcs of the graph are sorted according to their length. This list is then used to remove a certain number of arcs. Finally, the same number of arcs is inserted again at random, where short arcs are preferred. The number of arcs is either 0.2k, 0.5k, or 0.7k, where k is the number of arcs in the current solution.
Random arc removal. A certain number of random arcs is removed. The same number of arcs is inserted again with a bias toward short arcs. For arc insertion two different variants exist: connect geographically close nodes; do not connect geographically close nodes. The number of arcs is either 0.4k, 0.5k, 0.7k, or 0.9k, where k is the number of arcs in the current solution.
After applying any of the perturbation operators, the solution structure has changed but the new solution is most likely infeasible. Before the local search operator can be applied feasibility must be restored. This is done by applying the pupil routing and bus line scheduling algorithms of the solution construction.
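As an illustration, the long arc removal operator could be sketched in Python as follows. Several details are assumptions, since the text does not specify them: short arcs are preferred by sampling with weights inversely proportional to the arc length, only the tabu list for forbidden insertions is modeled (as a bounded `deque`), and the name `long_arc_perturbation` and the small instance are hypothetical.

```python
import random
from collections import deque

def long_arc_perturbation(solution, all_arcs, fraction, rng, tabu_insert):
    """Remove the longest arcs and insert the same number of other arcs.

    solution and all_arcs map arcs (u, v) to their length; tabu_insert holds
    arcs that must not be inserted again for the time being.
    """
    k = max(1, round(fraction * len(solution)))
    removed = sorted(solution, key=solution.get, reverse=True)[:k]
    kept = {a: l for a, l in solution.items() if a not in removed}
    tabu_insert.extend(removed)            # removed arcs are forbidden insertions
    pool = [a for a in all_arcs if a not in kept and a not in tabu_insert]
    for _ in range(min(k, len(pool))):
        # assumed bias: shorter arcs receive a larger sampling weight
        weights = [1.0 / all_arcs[a] for a in pool]
        arc = rng.choices(pool, weights=weights, k=1)[0]
        kept[arc] = all_arcs[arc]
        pool.remove(arc)
    return kept

# Hypothetical instance: five stops on a line, arc length = distance.
all_arcs = {(u, v): abs(u - v) for u in range(5) for v in range(5) if u != v}
solution = {(4, 0): 4, (3, 0): 3, (2, 1): 1, (1, 0): 1}
tabu = deque(maxlen=10)                    # the list length is a parameter
perturbed = long_arc_perturbation(solution, all_arcs, 0.5, random.Random(1), tabu)
```

The result contains the same number of arcs as before, the two longest arcs are gone, and the solution will in general have to be made feasible again by pupil routing and bus line scheduling.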

Local Search.
The local search operator improves or repairs a solution by removing arcs from the solution to improve the objective function value in terms of the total length of the bus arcs. It iteratively removes arcs from the current solution. An arc is only removed if the solution stays feasible without the respective arc, else it is not removed. As soon as a valid arc for removal is found, it is removed, (i.e., the arcs are removed in a first-improvement scheme). The local search continues with the modified solution. We use three different ideas with respect to the order in which arc removals should be tried and performed. They are described in the following.
Long arc repair. This operator tries to remove long arcs in the solution. To achieve this, all arcs are sorted in decreasing order according to their length. Iteratively, every arc is tested for removal with a probability proportional to $0.95^x$, where $x$ refers to the rank of the arc in the sorted list (i.e., the removal probability decreases geometrically).
Residual capacity repair. This operator tries to remove arcs with low utilization, which are not first in a path. The idea is that arcs with low utilization can likely be removed without too much solution perturbation in terms of pupil rerouting. The arcs are sorted in decreasing utilization order. Then, iteratively, every arc is tested for removal with a probability proportional to $0.95^x$, where $x$ refers to the rank of the arc in the sorted list.
Random repair. All arcs are placed in a list in random order and the probability of arc removal for every arc in the list is proportional to $0.95^x$, where $x$ refers to the rank of the arc in the list.
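The common structure of the three repair operators can be sketched in Python. The sketch assumes that $0.95^x$ is used directly as the test probability of the arc at rank $x$ and that feasibility is checked by a callback; the simple reachability test `reaches_school` merely stands in for the pupil routing and bus line scheduling checks. All names are hypothetical.

```python
import random

def rank_biased_removal(arcs, order_key, is_feasible, rng, base=0.95, reverse=True):
    """Test arcs for removal in ranked order; rank x is tested with probability base**x.

    An arc is removed as soon as the solution stays feasible without it, and the
    search continues with the modified solution.
    """
    solution = dict(arcs)
    ranked = sorted(solution, key=order_key, reverse=reverse)
    for x, arc in enumerate(ranked, start=1):
        if rng.random() >= base ** x:
            continue                       # arc is not tested in this pass
        trial = {a: l for a, l in solution.items() if a != arc}
        if is_feasible(trial):
            solution = trial
    return solution

def reaches_school(arcs, stops, school):
    """Stand-in feasibility test: every stop still has a directed path to the school."""
    def ok(stop):
        seen, stack = {stop}, [stop]
        while stack:
            u = stack.pop()
            if u == school:
                return True
            for (a, b) in arcs:
                if a == u and b not in seen:
                    seen.add(b)
                    stack.append(b)
        return False
    return all(ok(s) for s in stops)

# Long arc repair on a hypothetical solution: sort by decreasing length.
arcs = {(1, 0): 1.0, (2, 1): 1.0, (2, 0): 5.0}
repaired = rank_biased_removal(
    arcs,
    order_key=arcs.get,
    is_feasible=lambda trial: reaches_school(trial, stops=[1, 2], school=0),
    rng=random.Random(3),
)
```

Passing a different `order_key` (for example the arc utilization) or a shuffled list yields the other two variants under the same assumptions.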

Solution Evaluation.
The solution evaluation consists of two steps:
1. Determine the length of the bus arcs.
2. Determine the number of transfers for every pupil, in case it is limited.
The length of the solution is determined by summing up the lengths of the arcs of the solution. However, if the maximum number of allowed transfers is limited, we must determine which bus services which bus lines and whether a pupil needs to transfer from one bus to another; see objective function (1) and constraints (13). To determine the number of transfers for the pupils, it is necessary to assign buses to bus lines. The model below determines the minimum number of buses necessary to serve the different bus lines and it returns a feasible schedule for each of the used buses. Based on this result, the number of transfers for pupils is computed, penalized, and added to the solution quality.
Given is the set of bus lines $T$. For every bus line $i \in T$ the start time $A_i$ and its duration $d_i$ are known. The driving time between the last bus stop of line $i \in T$ and the first bus stop of line $j \in T$ is given. Additionally, a minimum and a maximum waiting time for a bus which services a consecutive line, $[\underline{\theta}, \overline{\theta}]$, is given.
Based on this information, we are able to identify all time-feasible bus line pairs $(i, j)$ that can be served consecutively by a single bus. If a bus services line $j$ after line $i$, then the starting time of $j$ must be greater than the starting time of $i$ plus the service duration of line $i$, given by $d_i$, plus the travel time $t_{ij}$ to the start location of $j$ (plus the minimum and maximum waiting time at the beginning of $j$). In case the last stop of $i$ is the same as the first stop of $j$ (i.e., $t_{ij} = 0$), waiting time does not need to be considered. More formally, let $P$ denote the set that contains all feasible bus line connections $(i, j)$, that is, all those bus line pairs for which one of these two conditions holds.
Using this information and binary variables, we are now able to give the integer program that we use to minimize the number of buses. Objective function (47) minimizes the number of first bus lines, which corresponds to minimizing the number of buses:
$$\min \sum_{j \in T} f_j \qquad (47)$$
It is subject to two sets of constraints. A line $j$ is either a first line for a bus ($f_j = 1$) or has a predecessor ($\sum_{i \mid (i,j) \in P} c_{ij} = 1$):
$$f_j + \sum_{i \mid (i,j) \in P} c_{ij} = 1 \quad \forall j \in T \qquad (48)$$
A line $i$ is either the last line for a bus ($l_i = 1$) or has a successor ($\sum_{j \mid (i,j) \in P} c_{ij} = 1$):
$$l_i + \sum_{j \mid (i,j) \in P} c_{ij} = 1 \quad \forall i \in T \qquad (49)$$
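For illustration, the minimum number of buses can also be computed without an integer programming solver. The following Python sketch swaps in a different but equivalent technique: since every bus serves a chain of consecutive lines, the problem is a minimum path cover, and the number of buses equals the number of lines minus the size of a maximum bipartite matching on the feasible pairs. The feasibility test in `feasible_pairs` is an assumed reading of the verbal description (no waiting time bounds if the travel time is zero, otherwise a waiting time within the given bounds); all names and numbers are hypothetical.

```python
def feasible_pairs(start, dur, travel, theta_min, theta_max):
    """Line pairs (i, j) that one bus can serve consecutively (assumed conditions)."""
    pairs = []
    for i in start:
        for j in start:
            if i == j:
                continue
            slack = start[j] - (start[i] + dur[i] + travel[i][j])
            if travel[i][j] == 0:
                ok = slack >= 0                      # same stop: no waiting bounds
            else:
                ok = theta_min <= slack <= theta_max
            if ok:
                pairs.append((i, j))
    return pairs

def min_buses(lines, pairs):
    """Number of buses = number of lines - maximum matching (augmenting paths)."""
    succ = {i: [j for (a, j) in pairs if a == i] for i in lines}
    pred_of = {}                                     # chosen predecessor of each line

    def try_assign(i, seen):
        for j in succ[i]:
            if j in seen:
                continue
            seen.add(j)
            if j not in pred_of or try_assign(pred_of[j], seen):
                pred_of[j] = i
                return True
        return False

    matched = sum(try_assign(i, set()) for i in lines)
    return len(lines) - matched

# Three hypothetical bus lines; every relocation between lines takes 5 minutes.
start = {"A": 0, "B": 30, "C": 25}
dur = {"A": 20, "B": 20, "C": 10}
travel = {i: {j: 5 for j in start} for i in start}
pairs = feasible_pairs(start, dur, travel, theta_min=0, theta_max=30)
buses = min_buses(list(start), pairs)
```

In this example line A can be followed by either B or C, but not by both, so two buses are needed.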

ADDITIONAL MODELING APPROACHES
The school bus routing problem (without transfers) can also be modeled as an OVRP or as a DARP. In case the problem is modeled as an OVRP every school serves as depot and every bus stop at which a pupil for the respective school waits must be serviced by the bus. Pupils waiting at the same bus stop which have the same destination are grouped and treated as a single entity. For every school an independent problem arises and, therefore, multiple OVRP instances must be solved for every school bus routing problem.
The other approach is to model the problem as a DARP. The pupils are picked up at their initial bus stop and are transported to their destination bus stop. Here, the pupils are treated independently from each other; hence for every pupil a transportation request must be serviced. In case of the DARP pupils of different schools may share a single bus. In this respect, the DARP is more general than the OVRP. Both modeling approaches are described in the following.

Open Vehicle Routing Problem
Given is a complete graph $G = (V, E)$ with the vertex set $V = \{0, 1, 2, \ldots, n\}$ and the edge set $E$. Vertex 0 is the depot and vertices $\{1, 2, \ldots, n\}$ are the customers. Every customer has a demand $d_i$. The travel time of edge $(i, j) \in E$ is given by $t_{ij}$, where $t_{ij} \ge 0$. The vehicle capacity of the homogeneous fleet is given by $c$, where $d_i < c$ for all $i \in V \setminus \{0\}$. In case of the OVRP, the vehicle routes either do not need to start at the depot (in case a pickup problem is given) or do not need to end at a depot (if a delivery problem is given), i.e., $t_{0i} = 0$ or $t_{i0} = 0$ for all $i \in V \setminus \{0\}$. The objective is to find a set of round trips of minimum cost. Every customer must be serviced exactly once with a single visit. For every round trip, the sum of the customer demands must not exceed the vehicle capacity and the length of the round trip must be within an upper bound.
The OVRP solver used here is described in [17] and based on VNS proposed by Mladenović and Hansen [15]. The main components of VNS are a set of neighborhood operators, often ordered in increasing neighborhood size, for shaking and a set of operators for local search. Shaking operators perturb a given solution and local search operators improve a solution with regard to the objective function.
At first a feasible solution is perturbed by a shaking operator and then local search is applied to improve the solution. The shaking and improvement cycle is repeated until the search cannot escape from a local optimum. The idea of VNS is to cycle through different neighborhood operators to perturb the current solution, so that the search may escape local optima and finally converge to the global optimum.
The VNS used here implements three different shaking operators: cross and icross-exchange, sequence ruin with  For a detailed description of the solution concept and its performance refer to [17].

Dial-A-Ride Problem
Given is a complete directed graph $G = (V, A)$. $V$ is the set of all vertices and $A$ is the set of all arcs. For every arc $(i, j)$ a non-negative travel time $t_{ij}$ is given. Vehicles are stationed at a depot 0 and must service $n$ transportation requests. Each transportation request has an origin $i$, a destination $n + i$, and a quantity $q_i$, where $q_{n+i} = -q_i$. A request $(i, n + i)$ may have a pickup time window $[e_i, l_i]$ (inbound request) or a delivery time window $[e_{n+i}, l_{n+i}]$ (outbound request). In case of the school bus routing problem all requests have a delivery time window which is determined by the school begin time and the maximum and minimum waiting time at school. Thus, all requests are outbound requests.
Formulating the school bus routing problem as a DARP allows pupils of different schools to be transported in the same vehicle (i.e., mixed loads). Every pupil is a single transportation request and must be picked up at her initial bus stop $i$ and transported to her destination bus stop $n + i$. A time window at the destination bus stop is given that refers to the earliest and latest arrival time at school.
We use the dial-a-ride solver proposed in [30], which is based on the work of Parragh et al. [25]; it can handle dynamic and stochastic problems. The school bus routing problem is a static problem without stochastic aspects.
The optimization strategy is based on VNS using the following neighborhood operators: move, swap, chain, and zero split. Every operator has an intensity level which ranges from one to five to control the amount of change. For a detailed description please see [30].

COMPUTATIONAL ANALYSIS
In the previous sections, a solution concept to compute a transportation plan for the school bus routing and scheduling problem with transfers was described, in the following referred to as SBR. Two additional modeling approaches were given. This section summarizes the results obtained by applying the three different solution methods to a set of benchmark problems. We want to answer the following questions:
• How do different modeling and solution techniques perform on different problem instances and what are the differences in the solution properties (i.e., quality and service level)?
• How does the number of maximum allowed transfers influence the solution quality and the service level (time loss and number of transfers)?
• What impact has the pupil assignment strategy on the solution quality and service level?
The quality is defined by the objective function (1) without the penalty term. Therefore, it is the length of the arcs traveled by the buses. The objective in all three alternative modeling approaches (SBR, DARP, and OVRP) is identical. This allows us to compare the results. To measure the service level, we use the time loss of the pupils [33] and the number of transfers. The time loss of a pupil is the difference of the actual travel time from home to school and the shortest travel time (i.e., the pupil is assigned to its nearest bus stop, is picked up by a bus and directly transported to the destination bus stop).
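The time loss measure can be stated in a few lines of Python. This is only an illustration of the definition above; the function names and the travel times are hypothetical.

```python
def time_loss(actual, shortest):
    """Time loss per pupil: actual travel time minus shortest possible travel time."""
    return {pupil: actual[pupil] - shortest[pupil] for pupil in actual}

def summarize(losses):
    """Mean and maximum time loss over all pupils."""
    values = list(losses.values())
    return sum(values) / len(values), max(values)

# Travel times in minutes for two hypothetical pupils.
losses = time_loss({"p1": 25, "p2": 40}, {"p1": 20, "p2": 28})
mean_loss, max_loss = summarize(losses)
```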
At first we describe the design of the benchmark problems and the setting of the computational experiments (section 6.1). Then an in-depth analysis of the results is given (section 6.2) by comparing results of our approach (SBR) to solutions computed by two state-of-the-art VNS-based solvers, namely a DARP solver [30] and an OVRP solver [17] as described in section 5.

Problem Instances
We use a set of 21 benchmark instances which range from eight pupils and two schools to 500 pupils and eight schools. The benchmarks are generated artificially but are designed to reflect the given situation. As we focus on nonprimary schools, pupils can enrol in the school of their choice and may, therefore, be located in the whole geographic area. The bus stops have the same geographic location in all instances, whereas pupil home locations vary. The planning horizon starts at zero (i.e., e = 0) and school begin times are 60 min. Therefore, the planning horizon is an hour. Table 1 summarizes the general properties of the problem instances. For every row in this table the instances differ only in the pupil location, all other parameters are fixed.
The bounds on the arrival time at school $s$ are $\underline{\omega}_s = 5$ and $\overline{\omega}_s = 40$, $\forall s \in S$, and the bounds on the bus change waiting time are $\underline{\gamma} = 1$ and $\overline{\gamma} = 10$ in all instances. A time window is defined, which specifies the minimum ($\underline{\theta}$) and maximum ($\overline{\theta}$) allowed waiting time for a bus if it services another bus line after the current one. These parameters are $\underline{\theta} = 0$ and $\overline{\theta} = 30$.

Analysis
All of the following computations were performed on an Intel Core i7-3770 CPU at 3.40 GHz with 16 GB RAM and a maximum runtime of 1 h per instance. The solution concept was implemented in Java and executed using the Java SE Runtime Environment 1.8.0 and the HotSpot 64-Bit Server VM. For all approaches pupils are routed in groups (i.e., pupils which have the same pickup bus stop and the same destination bus stop are routed as a single entity). In the following, we give a summary based on the results given in Tables 8-19. Solution quality and time loss are given in minutes.
How do different modeling and solution techniques perform on different problem instances and what are the differences in the solution properties (i.e., quality and service level)? In Table 2, we provide average results for the three different modeling approaches considering the proposed benchmark instances which contain multiple schools. For these experiments, we use pupil assignment strategy pa1 (i.e., we minimize the distance to the pupils' destination bus stop when determining the starting bus stop). Furthermore, more than one transfer per pupil is penalized (i.e., C = 1, W = 100, where C is the number of maximum allowed transfers and W refers to the penalty weight). We note again that the other two approaches do not consider transfers. The solution quality of the OVRP approach deteriorates with an increasing number of schools, because for every school an OVRP is solved. Therefore, the solutions for the schools are independent (i.e., pupils of different schools cannot be on the same bus) and a lot of arcs are visited multiple times by buses for different schools. For the SBR approach, the average number of transfers per pupil ranges from 0.252 to 0.535 (i.e., 25%-53% of the pupils must change the bus on their way to school) for instances with at least 100 pupils. Over all instances the average number of transfers per pupil is 0.377 (i.e., nearly 38% of the pupils must change the bus one time). Detailed results are in Tables 10 and 11.
Tables 3 and 4 show the results for the two other pupil assignment strategies (i.e., minimizing bus stop fragmentation [pa2] and minimizing the number of bus stops [pa3]). Here, the average time loss of the pupils increases for all three approaches, and the objective function value decreases. The pupil assignment strategy influences the solution quality and the time loss of the pupils. A better solution quality implies a higher pupil time loss. For the SBR approach, pupil assignment strategies 2 and 3 also lead to a higher number of average transfers compared to pupil assignment strategy 1.
The comparison of solution quality for different pupil assignment strategies for the SBR approach shows that the quality using pupil assignment strategy pa1 (minimize distance to pupils' destinations) is worse compared to the other two assignment strategies. Pupil assignment strategies pa2 and pa3 lead to better solution quality, because they tend to utilize a lower number of bus stations and, therefore, need to visit fewer stops. This is obvious for strategy pa3 (minimize number of bus stops) but also true in the case of strategy pa2 (minimize bus stop fragmentation). Detailed results are in Tables 18 and 19. Table 5 compares the results of the DARP, OVRP, and SBR approach for an unlimited number of allowed transfers (detailed results are in Table 8).
We also compare the structure of the solutions calculated with the different solution techniques. Figure 7 compares solutions of the same problem instance (inst07, bus capacity 20) for the three approaches. The symmetry of the solution in case of the SBR approach can easily be seen (Fig. 7a). First the pupils are transferred to a central transfer point, one on the upper left (4) and one on the lower right (14). There they possibly change buses and are then transported to their destination school. In comparison, in the solutions computed by the other two approaches (Fig. 7b and c) some bus stops have to be visited multiple times. Additionally, some arcs are traversed multiple times (in different directions).
How does the number of maximum allowed transfers influence the solution quality and the service level (time loss and number of transfers)? Table 6 shows the results averaged over all test instances with at least 100 pupils. The column max. transfers refers to the number of maximum allowed transfers, and the columns quality, time loss, and transfers refer to the respective solution property. For every property mean and maximum are given. We see that the solution quality improves with the number of maximum allowed transfers. If only one transfer is permitted the mean of the solution quality is 79.479, which is 17.9% higher than with an unlimited number of allowed transfers (67.406). However, the average number of transfers is 0.45 if at most one transfer is allowed, whereas the average number of transfers is 0.639 in case of an unlimited number of transfers (i.e., for an unlimited number of transfers the average number of transfers per pupil is 42% higher than in the case where at most one transfer per pupil is allowed). The difference in time loss is about 6% (at most one transfer [5.421] versus unlimited number of transfers [5.806]). From this table, we see that, as the number of allowed transfers increases, the solution quality improves, while the service level deteriorates. Especially the improvement from allowing two transfers instead of one transfer is high. This is due to the structure of the test instances. For most instances, allowing more than two transfers per pupil does not yield solutions of improved quality. Only in very few instances do some pupils utilize three transfers when the number of allowed transfers is unlimited, and only in a single case are four transfers per pupil used (inst17, Table 16). The numbers for this analysis are in Tables 8-10, 12-14, and 16-18.
What impact does the pupil assignment strategy have on the solution quality and the service level? Table 7 compares solution quality and service level for different pupil assignment strategies and maximum allowed numbers of transfers. Again, only instances with at least 100 pupils are considered. Therefore, the numbers are slightly higher than in Tables 2-4. In the cases where the maximum number of transfers is 1, the solution quality for the different assignment strategies varies in the range of 75.72-83.50, a difference of about 10%. In the case of an unlimited number of allowed transfers, the range of variation is 67.324-67.566, which is about 0.36%. We see that, by allowing transfers, the achievable solution quality increases and the solution quality range decreases. This, however, comes with an increase in the time loss as well as in the number of transfers, which can also be seen in Table 7. For pupil assignment strategy pa1 (minimize distance to pupils' destinations), where the improvement in solution quality is highest, the time loss increases from 4.831 to 5.092 (about 5.4%) and the number of transfers increases from 0.401 to 0.599 (about 49.4%). For pupil assignment strategy pa2 (minimize bus stop fragmentation), those increases are from 5.642 to 6.342 (12.1%) and from 0.478 to 0.672 (40%) for the average time loss and the average number of transfers, respectively. For the pupil assignment strategy minimizing the number of bus stops (pa3), the increase in time loss is 3.4% (from 5.789 to 5.984) and the increase in the number of transfers is 37% (from 0.471 to 0.648). As expected, a higher improvement in solution quality implies a higher deterioration in the service level. Also, the improvement in solution quality from at most two transfers to an unlimited number of allowed transfers is small because at most four transfers occur over all test cases (see Table 16, inst17).
So, allowing transfers can compensate for bad pupil assignment in terms of solution quality at the cost of a lower service level. Detailed results are in Tables 8-10, 12-14, and 16-18.

CONCLUSION AND FURTHER RESEARCH
In this article, a mathematical model, a heuristic solution concept, and an in-depth analysis of the resulting solution properties for the school bus routing and scheduling problem with transfers were given. The proposed solution concept has a modular design and can, therefore, be adapted easily in case the problem changes (e.g., if the constraints for the scheduling need to be changed, a general constraint programming solver can replace the current module).
Our computational study investigates three questions. First, it compares solutions of three different modeling approaches: two approaches without transfers (OVRP, DARP) and our approach (SBR), which allows transfers. The benefit of integrated planning can be seen by comparing the OVRP solutions to solutions calculated by the DARP or SBR solver. For an unlimited number of transfers, the trade-off between solution quality, number of transfers, and time loss becomes evident. A high solution quality implies an increase in the time loss if no transfers are allowed. In contrast, if transfers are allowed, it is possible to achieve high solution quality with low time loss at the cost of a higher average number of transfers. The second analysis shows that, with an increasing number of allowed transfers, costs decrease but, consequently, the service level decreases, too. In particular, the average number of transfers increases. Third, the computational study investigates how the assignment of pupils influences solution quality and service level. The impact of the assignment strategy decreases as the maximum number of allowed transfers increases. Again, this comes at the cost of a higher number of transfers. However, in a real-world scenario, it may not be possible to allow an arbitrary number of pupil transfers.
The proposed solution concept is a first step toward a full decision support system. To further analyze the trade-off between costs (bus travel time/distance, number of required buses) and service level (number of transfers, time loss, walking distance, waiting time), as well as the influence of problem instance properties, a multiobjective optimization approach is needed.