A Matheuristic for AGV Scheduling with Battery Constraints

This paper considers the problem of scheduling automated guided vehicles (AGVs) with battery con- straints. Each transport request involves a soft time window, and the AGV ﬂeet used to service those requests is heterogeneous with a diverse set of capabilities and travel costs. In contrast to the existing literature, each transport request may require different AGV material handling capabilities (such as lift loads, tow loads, or handle loads with a mounted robot arm), and the AGV batteries can be recharged partially under consideration of a critical battery threshold. The problem is to assign the transport and charging requests to AGVs, sequence the requests, and determine their starting times and the recharg- ing durations of the AGVs with the objective of minimizing a weighted sum of the tardiness costs of transport requests and travel costs of AGVs. A mixed-integer linear programming model is formulated. We also propose a new matheuristic that makes use of an adaptive large neighborhood search algorithm and a linear program to solve industry-size instances. We illustrate the eﬃcacy of our approach with an industry case study using real-world data.


Introduction
A long-term goal of smart industry is to have a lights-out factory, which requires the control of operations at the shop floor without human interference with the use of automated guided vehicles (AGVs). The AGVs are driverless vehicles that transport goods and materials throughout various areas, e.g., shipping and receiving areas, storage, and workstations ( Confessore, Fabiano, & Liotta, 2013;Vivaldini, Rocha, Martarelli, Becker, & Moreira, 2016 ). The applications of AGVs have grown enormously because of their benefits, such as flexibility in processes, space utilization, product safety, and computer integration and control. More recently, the AGV technology has enhanced with new advances in their flexibility, where each vehicle will have specific capabilities (e.g., one can lift light or heavy loads, while another can tow loads) to perform heterogeneous tasks ( De Ryck, Versteyhe, & Debrouwere, 2020;Riazi, Diding, Falkman, Bengtsson, & Lennartson, 2019 ). This type of tasks is prevalent in the high-tech manufacturing sector, and it necessitates the handling of materials with specialized equipment, thus requiring a fleet of AGVs with varying capabilities.
An example of a factory requiring different material handling tasks is Brainport Industries Campus (BIC), a new high-tech campus constructed in Eindhoven, The Netherlands, which serves as a home front for far-reaching partnerships between suppliers, specialist companies, and innovative educational and knowledge institutions. The campus is a joint initiative of high-tech suppliers, also called tenants, to co-exist and collaborate on multiple fronts ( Brainport Industries Campus, 2020 ). Since multiple tenants collectively make use of a pool of AGVs, practitioners are looking for a better understanding on how to manage large AGV fleets with heterogeneous capabilities that can cater to a more diverse set of requirements. Consequently, there is an increasing need for planning algorithms to manage such fleets. Industry feedback indicates that most companies rely on the prepackaged fleet management software that is provided by AGV manufacturers. However, such a software is only applicable for a homogeneous AGV fleet and does not make use of look-ahead information such as due dates and time windows. In addition, when considering long travel distances in BIC, the performance of the AGVs can be influenced by battery restrictions. Therefore, efficient management and control of such an AGV system become key factors in reducing material handling costs, work-in-process inventories, and overall operational costs.
In this paper, we address the problem of scheduling AGVs with battery constraints and heterogeneous capabilities. Our problem holds some similarities to the electric vehicle routing problem with time windows (EVRPTW) addressed by Keskin & Çatay (2016) and score of a heuristic rewarded when finding a worse solution but accepted n c number of iterations station methods run when invoked T max execution time limit of the matheuristic γ number of random tasks to remove Zhao & Lu (2019) . The EVRPTW considers a fleet of electric vehicles (EVs) having limited driving range due to their battery capacities, which may need to visit charging stations while servicing customers in their routes. Our problem, however, distinguishes from the EVRPTW literature by the following features to comply with industrial needs. First, the fleet of AGVs considered in this study is heterogeneous in terms of not only travel speeds and costs, charging rates, and discharging rates but also, more importantly, capabilities to service different types of requests from various tenants. Second, partial charging for the AGVs is allowed under consideration of a critical battery threshold. Third, we consider both travel costs of AGVs and tardiness costs of transport requests, where different types of AGVs and requests have different unit travel costs and penalty charges, respectively. Motivated from our industry collaboration at BIC, the combination of these features constitutes the main novelty of the problem. We aim to make decisions on which AGVs the transport requests are assigned, the sequence in which the requests are processed, the times when AGVs should start servicing the requests, and the recharging duration of AGVs at charging stations so that a weighted average of the total travel and total tardiness costs is minimized.
The main contributions of this paper are as follows.
(i) We formulate the problem as a mixed-integer linear programming (MILP) model. (ii) We propose a new matheuristic approach that surpasses the MILP and the dispatching policy currently used in practice (see Appendix D for details) given the same computational time limit. In our matheuristic, which combines adaptive large neighborhood search (ALNS) ( Ropke & Pisinger, 2006 ) and linear programming (LP), we propose a new ALNS-based algorithm by introducing two pairs of destroy and repair operator sets to perform exploration and exploitation in distinct phases of the search. Here, we introduce problem-specific heuristics concentrating on the removal and insertion of charging tasks in which a possibility of partial charging for AGVs is allowable. We choose the ALNS framework to solve our problem since it provides flexibility to hybridize with other approaches to make more efficient implementations for optimization problems. Also, it has recently been demonstrated that ALNS-based matheuristics yield promising results on electric vehicle routing problems ( Keskin, Laporte, & Çatay, 2019;Keskin & Çatay, 2018;Koc, Jabali, & Laporte, 2018 ) owing to the ability to find the best tradeoff between the efficacy of exact approaches and the efficiency of metaheuristics ( Guimares, Klabjan, & Almada-Lobo, 2013 ). (iii) We provide managerial insights by performing a sensitivity analysis on a real-world case study, which allows us to determine potential room for improvement in practice. In this analysis, we vary the scheduling horizon, tightness of time window, the number of requests, AGV fleet size, costs related to tardiness and AGV travel, battery thresholds, and computational time limit. Our computational results reveal that the proposed matheuristic allows finding solutions of high quality even for industry-size large instances and yields an average of 33% improvement with respect to the dispatching policy in practice. (iv) We provide data for the real-world case study used for the experiments in this paper. The problem instances in the case study are generated from a fieldlab that has been built in the BIC. Additionally, the research is conducted in close collaboration with our industrial partner, IJssel Technology, a company located in the Netherlands, which is the central fleet owner at the BIC. The outcomes of this research can be broadly applied to settings when there is a central operator of a heterogeneous AGV fleet servicing different types of transport requests.
The remainder of this paper is organized as follows. Relevant literature is discussed in Section 2 . The problem is formally described in Section 3 , and a mathematical model is formulated in Section 4 . A matheuristic is proposed in Section 5 to solve industry-sized instances. Next, we describe the industry case study, conduct the parameter tuning, evaluate impacts of operators, and present computational results in Section 6 . Finally, conclusions and future research directions are discussed in Section 7 .

Literature review
The design of an AGV control and management system requires the best strategies to solve problems associated with common functions such as dispatching, routing, and scheduling ( Qiu, Hsu, Huang, & Wang, 2002 ). Langevin, Lauzon, & Riopel (1996) define dispatching as the process of selecting and assigning tasks to vehicles, routing as the selection of the specific paths that each vehicle traverses to accomplish its transportation tasks, and scheduling as the determination of the arrival and departure times of vehicles at each station. The decisions related to these functions can be made either simultaneously or sequentially. The reviews of De Koster (2006) andDe Ryck et al. (2020) present guidelines for the control of AGVs and include often-neglected areas, such as idle-vehicle positioning and battery management.
Most dispatching rules in the literature are single-attribute ones, which dispatch vehicles based on only one parameter ( Le-Anh, van der Meer et al., 2004;Li & Kuhl, 2017 ). Commonly used dispatching rules are first-come-first-serve (FCFS) and shortest-travel-distance-first (STTD). While single-attribute dispatching rules are common in practice, multi-attribute dispatching rules, which aggregate different attributes to select a task to be serviced by a vehicle, prove to be better in general ( Le-Anh & De Koster, 2005 ). The methodology proposed in Ho, Liu, & Yih (2012) involves concurrently solving the pickup-dispatching problem along with the load-selection problem with a scoring function that makes use of the slack time of parts, waiting time of parts, and the distance to the load. Jeong & Randhawa (2001) define a scoring function governed by a trained neural net to calculate priorities of incoming jobs. This method is particularly useful if there is a lot of data available on jobs and no clearly defined score function to assess priorities.
Dispatching rules have proven to be good and easy to implement but have been criticized for having a myopic view. More advanced approaches have, therefore, been studied to improve system performance. Liu, Tan, Kurniawan, Zhang, & Sun (2018) develop a mixed-integer programming model that solves the scheduling problem of mobile robots, tasked with transporting materials among different places, in a periodic or event-driven or hybrid way (having a static schedule being modified with new events). Sabuncuoglu & Kizilisik (2003) study reactive scheduling problems in a flexible manufacturing environment, where the arrival of jobs is not known in advance. They develop a simulation-based scheduling approach and compare offline and online scheduling schemes. Ebben, van der Heijden, & van Harten (2005) present a serial scheduling method for a dynamic resourceconstrained vehicle scheduling problem and analyze its dynamic behavior using discrete event simulation. Nevertheless, these studies do not consider time windows of transport requests and a heterogeneous fleet of AGVs with battery management.
Another stream of relevant work concerns the electric vehicle routing problem (EVRP). It is a variant of the vehicle routing problem (VRP) in which a fleet of EVs is used, instead of internal combustion engine vehicles (ICEVs). Conrad & Figliozzi (2011) study the EVRP, where EVs may recharge fully or 80% of the battery capacity at certain customer locations during their services. The objectives are to minimize the number of vehicles and total costs related to travel distance, service time, and charging. They propose an iterative construction and improvement heuristic to solve the problem. Erdoan & Miller-Hooks (2012) then study the green VRP (GVRP) in which alternative fuel vehicles such as solar, electric, or biodiesel vehicles are fully refueled at stations en route within a constant amount of time. They propose two heuristics to solve the GVRP with the objective of minimizing the total travel distance. Schneider, Stenger, & Goeke (2014) introduce the EVRPTW where the recharging may take place at any battery level with a linear charging function, and EVs always leave stations with a full battery. In their work, a hybrid heuristic of variable neighborhood search and tabu search (TS) is proposed for solving the EVRPTW given the objective being to minimize the total travel distance by using the minimum number of vehicles.

Problem description
The problem in this paper concerns a set of transport requests R that is serviced by a heterogeneous AGV fleet V with battery constraints. The source (pickup) and destination (delivery) nodes of request r ∈ R are denoted with s R r and d R r , respectively. Each request r ∈ R requires a set of capabilities A R r from an AGV for transport, and it has a time window [ e R r , l R r ], where e R r is the earliest pickup time and l R r is the latest delivery time of the request. An AGV is not allowed to reach the source of a request before its earliest pickup time. On the other hand, if AGV k arrives at the destination of request r after its latest delivery time, the tardiness occurs and is measured by penalty cost c R r incurred per time unit. Each request needs to be performed by exactly one capable AGV in the heterogeneous fleet. The AGV k ∈ V has a set of capabilities A V k and is capable of servicing request r ∈ R if A R r ⊆ A V k holds, i.e., the capability requirements of a request are included in the capabilities of an AGV. For example, with a request classified as a heavy load that needs to be lifted on a pallet, only those AGVs that have the capability to lift heavy loads can carry out that request. Each AGV also has a travel cost per time unit c V k . While an AGV is traveling, the battery charge level decreases proportionally with the traversed distance at a discharging rate of ˆ d V k , and the AGV may need to visit a charging station before continuing operations. Here, the battery is recharged at a charging rate of ˆ c V k . The charge level of an AGV (measured as a percentage) must take values between the minimum and maximum battery charge level denoted by b l percent and b u percent, respectively. The AGV must recharge its battery if the charge level drops below a critical threshold b V k , and the recharging duration should be long enough to allow the charge level to reach at least this threshold. In other words, before starting a new request, the charge level of the AGV must be at or above b V k percent. The critical threshold is a prespecified parameter for a given network of nodes to assure that an AGV can reach any request destination and a charging station right afterwards if needed. It is notable that an AGV is not permitted to visit a charging station while performing a request (i.e., carrying a load), and its route starts and ends at one of the charging stations. An AGV also cannot service more than one request at the same time, i.e., on its tour from a source node to a destination node (singleload AGVs). In addition, the AGVs can detect objects on the shop floor and move around them while traveling. Consequently, collisions between AGVs can be avoided by hardware, thus not being considered in this paper.
The problem can be defined on a directed graph G = (N, A ) , where N = { 1 , 2 , . . . , n } is the set of nodes and A = { (i, j) | i, j ∈ N, i = j} is the set of arcs. A node in the graph can represent a pickup or delivery location, or a charging station. Each pickup or delivery node i ∈ N has a service time h N i that represents the time needed for loading or unloading activities, while each charging node has zero service time. Each arc (i, j) is associated with a distance d i j , and it is traveled by AGV k ∈ V in t i jk time units. Note that both distance and travel time are symmetrical, i.e., d i j = d ji and t i jk = t jik . In addition, let E ⊂ N denote the set of charging stations and B = { 1 , 2 , . . . , n B } denote the set of charging requests, where n B is a safe upper bound, computed as n B = | V | · | E| , which allows AGVs to charge as many times as needed. Therefore, not all the requests from B need to be used in a schedule. For each charging request r ∈ B , the source and destination nodes are the same charging station, and the earliest pickup and latest delivery time are set to zero and infinity, respectively ( e R r = 0 , l R Lastly, T denotes the set of all transport and charging requests such that T = R ∪ B . Note that R , V, and N in the superscript are referred to request, AGV, and node, respectively. The goal of the problem is to determine (a) an assignment of AGVs to requests including charging requests, (b) a sequence of requests on AGVs, and (c) starting times of requests and recharging duration such that problem constraints are satisfied, and a weighted sum of the total tardiness cost of requests and the total travel cost of AGVs is minimized.

Mathematical formulation
In this section, we formulate a mathematical model based on the description in Section 3 . The model formulation is presented below.

Decision variables
In addition to the notation presented in Section 3 , we introduce the following decision variables for the mathematical formulation: x r r k binary variable, equal to 1 if request r is performed prior to r by AGV k , 0 otherwise y S rk arrival time of AGV k at the source node of request r ( S refers to source) y D rk arrival time of AGV k at the destination node of request r ( D refers to destination) T rk tardiness of request r performed by AGV k z S rk percent amount of battery discharge of AGV k when it reaches the source node of request r z D rk percent amount of battery discharge of AGV k when it reaches the destination node of request r δ rk travel time of AGV k when it performs request r

Mixed-integer linear programming model
The mathematical model of the described problem can be formulated as follows: Objective: Subject to: The objective function (1) minimizes the weighted sum of tardiness costs of requests and travel costs of AGVs, where α is a weight coefficient to prioritize one objective over the other. Constraints (2) ensure that each request is attended by at most one AGV, while Constraints (3) define whether or not a charging request is performed. Constraints (4) ensure that the capabilities of the requests and AGVs match. Self-visits are avoided by Constraints (5) . Constraints (6) conserve the incoming and outgoing arcs for each request done by an AGV. Constraints (7) ensure that an AGV arrives at the source node after the earliest pickup time. Constraints (8) and (9) compute the arrival times at destination and source nodes, respectively. Also, Constraints (9) prevent subtours with M defined as a large positive constant. Constraints (10) and Constraints (18) ( T rk ≥ 0 ) define the tardiness of each request. Constraints (11) determine the travel time of an AGV to perform a request. Constraints (12) and (13) calculate the amount of battery discharge due to the travels between the source and destination of a request and between the destination and source of two requests, respectively. The charged amount due to a charging request is given by Constraints (14) . Constraints ( 15-17 ) set the lower and upper limits ( b l = 0 , b u = 100 ) for an amount of battery discharge, while Constraints ( 18 -19 ) guarantee valid domains for the remaining decision variables.

Proposed matheuristic
In this section, we propose a matheuristic based on the integration of ALNS and LP to solve the problem in Section 3 . In our matheuristic, the ALNS is responsible for solving the assignments of AGVs and sequencing of requests, while the LP model is used to determine the schedule of requests, i.e., starting times and recharging duration. The ALNS involves multiple destroy and repair operators, which compete to modify a current solution. In each iteration, a destroy and a repair operator are chosen according to their past performance, and the resulting new solution is accepted if it satisfies some criteria defined by the ALNS ( Demir, Bekta, & Laporte, 2012 ). Here, we integrate the LP into the ALNS framework to utilize the advantages of both methods to find high-quality solutions for large problem instances efficiently.
Let Q 0 and S 0 denote the initial sequence and initial schedule, respectively. Similarly, Q current and S current denote the current sequence and schedule, while Q best and S best denote the best sequence and schedule, respectively. In the proposed matheuristic, there are four sets of operators: request destroy, request repair, (charge) station destroy and station repair, which are denoted by − c , + c , − s , and + s , respectively. Let d c ∈ − c and r c ∈ + c denote a destroy and a repair operator for requests, respectively, while d s ∈ − s and r s ∈ + s denote a destroy and a repair operator for charging tasks, respectively. Also, the weights of operators in − c , + c , − s , and + s , which reflect probabilities to choose an operator, are stored in vectors P − c , P + c , P − s , and P + s , respectively. The pseudocode of the proposed matheuristic is presented in Algorithm 1 .
The matheuristic starts with the initialization of a feasible sequence Q 0 and weights of operators in P − c , P + c , P − s , and P + s (line 1). Initially, all operators have the same weight. The initial schedule S 0 is then computed by an LP model with the input of Q 0 (line 2). The matheuristic considers Q 0 ( S 0 ) the current and best so far (lines 3-4) and sets the simulated annealing (SA) temperature τ to its initial value τ 0 (line 5). Then, a loop is executed until a stopping criterion is met (lines 6-31). At the start of the loop, the matheuristic selects a pair of request destroy and repair operators d c and r c to modify the current sequence Q current to generate a new sequence Q new , which is next used to compute S new (lines 7-9). If a new best solution is found, the matheuristic focuses on exploiting its neighborhood by applying station destroy and repair operators d s and r s until it reaches n c iterations and does not obtain another new best solution (lines 10-25). Otherwise, a new solution is accepted if it is better than the current solution, or worsen than but satisfies the SA acceptance criterion, where r U (0 , 1) is a random number Algorithm 1 Matheuristic.
Require: set of AGVs V , set of transport requests R , set of charging tasks B 1: Initialize Q 0 and P − c , P + c , P − s , P + s Sections 5.2 and 5.3 Section 5.4.1 and 5.5.1 8:   τ ← τ ε Section 5.7 31: until stopping criterion is met 32: return S best generated according to the uniform distribution U (0 , 1) (lines 26-27). The weights of selected request and station operators in P − c , P + c and P − s , P + s are updated when being used, respectively (lines 23 and 29). Also, the SA temperature is the last to be updated in each iteration, where ε is the cooling rate (line 30). At the end of the matheuristic, the best schedule S best is returned (line 32). We provide details on the algorithm in separate sections, as stated in the pseudocode.

Solution representation
The solution of our problem is a schedule in which tasks are carried out at specific time moments. In this section, we describe a solution representation by presenting the types, sequence, and schedule of tasks subsequently as follows.

Types of tasks
A schedule S consists of three types of tasks such as transport request (loaded travel) T R , unloaded travel UT , and charging CH, which are depicted in Fig. 1 . Let g R r denote the type of task r. First, the task in Fig. 1 (a) represents request r of type T R ( g R r = T R ), which has the time window [ e R r , l R r ], source and destination nodes ( s R r , d R r ), and set of capability requirements A R r . Second, Fig. 1 (b) presents a task of type UT ( g R r = UT ), which indicates the travel of an AGV without any load to move from the destination of its  previous task ( s R r ) to the source of its next task ( d R r ). This travel occurs in between two tasks of type T R or types T R and CH. Finally, Fig. 1 (c) presents a task of type CH ( g R r = CH), which has the source and destination nodes at the same charging station, i.e.,

Sequence of tasks
Task sequences on AGVs are constructed from tasks of two types T R and CH. We denote by Q the set of task sequences of all AGVs, i.e., Q = { Q k | ∀ k ∈ V } , where Q k is the sequence of tasks on AGV k . Additionally, let Q k r denote the r th task in sequence Q k , and the attributes of the task are represented by . For example, Fig. 2 illustrates the set Q consisting of two AGVs, with their set of capabilities, and their respective task sequences Q 1 and Q 2 in which Q 1 3 and Q 2 2 are the third and second task of AGV 1 and 2, respectively. Here, the attribute values of Q 1 3 are (−, −, 19 , 19 , −, CH) , while those of Q 2 2 are (32 , 55 , 1 , 12 , { A, C} , T R ) . Note that the symbol "-" indicates the absence of the corresponding attribute of a task.

Schedule of tasks
To compute the schedule S of all AGVs, each sequence Q k is post-processed by adding tasks of type UT to result in sequence ˆ Q k such that it is composed of transport requests, charging tasks, and unloaded travel tasks. Let S k denote the schedule of tasks on AGV k , then S = { S k | ∀ k ∈ V } . We also denote by S k r the r th task in schedule S k . Two additional attributes, namely y S r and y D r , are included to indicate the arrival times of an AGV at the source and destination nodes of task r , respectively. Note that these time moments are determined while computing the schedule S (see Section 5.6 ). In addition, for a task of type CH, the difference between y D r and y S r (i.e., y D r − y S r ) implies the recharging duration. Fig. 3 illustrates the set S = { S 1 , S 2 } , where S 1 and S 2 are the schedule of tasks on AGV 1 and 2, respectively. In schedule S 1 , S 1 2 indicates the second task ( r = 2 ) with its attributes After obtaining the schedule S, its objective value is calculated according to Eq. (20) , which is equal to the weighted sum of tardiness costs of requests and travel costs of AGVs. Here, (y D gives the tardiness of each transport request r .

Initialization
The quality of the initial sequence may have a crucial impact on the final outcome of the matheuristic. To generate a good initial sequence, we propose a constructive heuristic, which employs the Earliest Due Date (EDD) rule and the critical charge insertion method described in Section 5.5.2 . The pseudocode of this constructive heuristic is given in Algorithm 2 . It begins with sorting Algorithm 2 Proposed constructive heuristic.
Require: set of AGVs V , set of transport requests R 1: ˆ Assign a charging task to Q v 15: end if 16: end for 17: return Q 0 all transport requests r ∈ R in non-decreasing order of their latest delivery times to create the sorted set ˆ R (line 1). Then, we create the set of capable AGVs, denoted by V C , for servicing each sorted request (lines 2-3). If there exists only one candidate in V C , it is selected (lines 4-5). Otherwise, the AGV that has the minimum additional cost when inserting request r to its sequence is chosen (lines 6-10). The cost function f (r, k ) of inserting request r to AGV k is determined in line 8, where the tentative arrival time ˆ y D r of the AGV at the request destination is calculated given that the AGV should reach the source node as early as possible but not before the earliest pickup time. After servicing the request, if the charge level b V v of the chosen AGV v drops below the criti- cal battery threshold b V v , a charging task is assigned to that AGV. The (re)charging duration should allow the charge level to reach at least the critical threshold, as mentioned in Section 3 . In general, the proposed constructive heuristic can be classified as a greedy sequencing algorithm.

Adaptive weights for operators
At each iteration, the selection of a destroy and a repair operator is regulated by using the roulette wheel mechanism ( Ropke & Pisinger, 2006 ). This gives bigger opportunities for suitable operators to be used during the search, and also allows poor performance operators to be selected with a low probability for diversification purposes. If the operator d c ∈ − c is associated with and similarly for r c ∈ + c , d s ∈ − s , and r s ∈ + s . The weight of each operator is set to 1 at the beginning. Then, the weights are adjusted automatically using statistics from earlier iterations and recent performance ( Pisinger & Ropke, 2019 ). The performance of selected operators at each iteration is measured by the score value ψ. If a new best solution is found, their scores are rewarded by ψ = ψ 1 . If the new solution is better than the current solution but not the best, the scores are rewarded by ψ = ψ 2 . If the new solution is worse than the current solution but accepted according to the SA criterion, the scores are given by ψ = ψ 3 . Here, a high score value corresponds to a successful operator. The weights of the selected destroy and repair operators are updated using the following equations: where λ is the reaction factor that controls how sensitive the weights react to changes in the performance of the selected operators, and x represents either c (request) or s (station). Note that the weights of the operators which are not used in the current iteration remain unchanged. The main aim of the adaptive weight adjustment is to encourage driving the search forward by selecting operators that work well for the instance being solved.

Destroy operators
In this section, we present nine destroy operators used in our matheuristic. The first five are classified as request destroy that removes transport requests, whereas the last two are classified as station destroy that removes charging tasks. These operators are either adapted or inspired by Ropke & Pisinger (2006) , Keskin & Çatay (2016) , and Shaw (1998) . In general, the destroy phase mainly consists of removing a number of requests from the current sequence, adding them into a removal list, and returning a partially destroyed sequence.

Request destroy operators
We now describe the operators removing requests as follows.
Shaw destroy (SHD) : This operator is designed to remove requests which are related to each other in respect of several aspects and thus are easy to reshuffle them around. In our algorithm, the relatedness between two requests r and r is defined by S(r, r ) and is computed by Eq. (23) : where φ 1 , φ 2 , and φ 3 are weights (Shaw parameters) corresponding to the distance, time window, and capability requirement terms, respectively, and K r represents the set of AGVs that are capable of servicing request r. A lower S(r, r ) indicates a higher degree of relatedness between two requests r and r . It is assumed that d xy , e R x , and l R x are normalized by scaling those such that they only take the values between [0,1], where x and y respectively represent the first and second nodes in the distance term, while x in the time-window term indicates either request r or r . The SHD first chooses randomly a transport request r to be removed from a sequence (set) Q. Then, it sorts an array of non-removed requests L in increasing order of their relatedness values with respect to request r. Next, it selects the request placed in the position defined by [ y p | L | ] , where y ∈ [0 , 1) is a random number, and p is a deterministic power introducing some variation in the selection of related requests, making related requests more likely to be selected.
[ ·] is rounded to the nearest integer. The procedure iterates in a similar fashion until q requests are removed ( Keskin & Çatay, 2016;Ropke & Pisinger, 2006 ). The value of q will be determined in the parameter tuning section ( Section 6.2 ).
Worst request destroy (WRD) : This operator attempts to remove q requests with a high cost from a sequence Q. We define the cost of a request r in a schedule S as Random request destroy (RRD) : The operator simply removes γ requests randomly chosen from the sequence Q to diversify the search mechanism.

Station destroy operators
Charging tasks at stations are crucial components of our problem. Hence, removing or repositioning them in an AGV sequence may improve the solution. We here use the following station destroy operators.
Worst charge destroy (WCD) : This operator chooses an AGV k randomly and calculates the amount of charge that it accumulates at each particular charging station. Then, the WCD selects the charging task that leads to the least increase in charge and removes it from the sequence Q k .
Random charge destroy (RCD) : The operator simply selects an AGV randomly and removes a charging task which is also chosen randomly from the sequence of that AGV.
All charge destroy (ACD) : The operator simply selects an AGV randomly and removes all charging tasks from the sequence of that AGV.

Repair operators
To repair a partially destroyed sequence, six insertion operators are employed in our algorithm. These operators are divided into two groups, namely request repair and station repair. While the operators in the first group attempt to reinsert transport requests removed by destroy methods, those in the second group insert charging tasks into partial sequences thus affecting the visits to the charging stations of AGVs along their route. The insertion operators return a feasible sequence Q that serves as the input for the LP model, presented in Section 5.6 , to compute a feasible schedule S.

Request repair operators
We adapt the greedy and regret-κ insertions from Ropke & Pisinger (2006) and propose a new method called EDD-based greedy insertion to reinsert the transport requests. These three operators are described as follows.
Greedy insertion (GRI) : This operator repeatedly inserts a request into its best position overall that is called the minimum cost position. Let Q k r→ i denote the resulting sequence when inserting request r at position i in sequence Q k and f (r, k, i ) denote the cost of the whole partial sequence Q after the insertion. The pseudocode of the greedy insertion is presented in Algorithm 3 . For each re-Algorithm 3 Greedy insertion.
Require: set of destroyed requests (removal list) D , set of AGVs V , partial sequence Q 1: while | D | > 0 do 2: for each r ∈ D do 3: Insert r * at position i * in sequence Q k * Regret-κ insertion (RKI) : One potential downside of the GRI is that it often postpones the insertion of requests that have high insertion costs. When those requests need to be reinserted, there may not be many good positions available. Hence, the regret-κ heuristic is used to overcome this shortcoming. Let f (r, k, i ) be defined as in the GRI and f † (r, k ) denote the minimum cost of inserting request r into the sequence of AGV k . In addition, let f regret−κ (r) denote the regret value that is the difference in the cost of inserting request r in its best sequence and its κ-best sequence.
The pseudocode of the RKI is given in Algorithm 4 . In each itera-Algorithm 4 Regret-κ insertion. Require: level κ, set of destroyed requests D , set of AGVs V , partial sequence Q 1: while | D | > 0 do 2: for each r ∈ D do 3: Algorithm 3, line 5 7: Insert r * at position i * in sequence Q k * , they are computationally expensive since the insertion cost has to be calculated for each available position in a sequence Q. We hence introduce a faster heuristic called the EDD-based greedy insertion. Here, the difference between the EGI and the GRI is that the EGI only considers to reinsert a destroyed request r ∈ D into the sequence of a capable AGV k ∈ V C at the first position i , where l R . The idea lies in the fact that requests with earlier delivery times (earlier due dates) should be serviced before the ones with later delivery times. It is notable that a request insertion operator is feasible with respect to capability requirements but may not meet the critical battery threshold. In this case, the critical charge insertion presented in Section 5.5.2 is applied to make repaired solutions feasible.

Station repair operators
After removing charging tasks, a partially destroyed sequence may become infeasible to the battery charge requirement of AGVs. To repair the sequence, we propose two operators inserting charging tasks, namely critical charge and non-critical charge insertions. Since AGVs can charge as many times as needed, any charging task can be inserted throughout the station repair operators, and thus it may not be necessary to reinsert all the charging tasks which are removed by the station destroy heuristics. Note that any AGV should perform a charging task at a charging station closest to the location of its previously serviced request.
Critical charge insertion (CCI) : This operator estimates the battery charge level after servicing each request in a partial sequence of AGV k and inserts a charging task whenever its charge level falls below the critical battery threshold b V k . Non-critical charge insertion (NCI) : In this operator, we introduce the non-critical battery threshold for AGV k , denoted by b V k , to represent an upper threshold for the AGV charging. The main difference between b V k and b V k is that AGV k may visit a charging station if the charge level is below b V k , but it is not necessary to do so if the visit results in a missed deadline (whereas the AGV must visit a station if its charge drops below b V k ). In other words, this operator determines any request in a partial sequence, after which the battery state of charge is under the non-critical threshold and inserts a charging task if there is enough idle time between that request and the next request. The idea is to obtain possibly better solutions if charging tasks are added in advance, and not just when the battery level falls below the critical threshold as in the CCI.
All charge insert (ACI) : The operator simply selects an AGV randomly and inserts a charging task after every request in the sequence of that AGV.

LP scheduling
The destroy and repair operators of ALNS presented thus far are employed to solve the assignment and sequencing of requests on AGVs. In this section, we propose an LP model to optimize starting times and recharging duration of each AGV k ∈ V , where V represents the set of AGVs whose sequence Q k , and thus ˆ Q k , is modified by ALNS in each iteration of the matheuristic.
To present the model, we define decision variables T r , y S r , and y D r as the tardiness, arrival time at the source, and arrival time at the destination of task r, respectively. We also introduce the variable β r to keep track of the battery charge level of the input AGV at the start of task r. The initial battery charge level, β 0 , is set to B V k . The proposed LP model is as follows: Subject to: Objective (24) minimizes the total tardiness cost. Constraints (25) ensure that there is sufficient time to move from the source to the destination of each request. Since the source of the subsequent request is the destination of the current request, Constraints (26) ensure that the time abides the same. A request cannot be serviced before the earliest pickup time as imposed by Constraints (27) . The tardiness of each request is defined by Constraints (28) and Constraints (34) ( T r ≥ 0 ). Constraints (29) initialize the battery charge of the input AGV k at a level of B V k . The charging of AGV is given by Constraints (30) , while the discharging due to loaded and unloaded travel is given by Constraints (31) . Constraints ( 32 ) and ( 33 ) define the lower and upper limits ( b l = 0 and b u = 100 ) of the given AGV's charge level, while Constraints (34) impose restrictions on the other decision variables.

Acceptance and stopping criterion
After we obtain a schedule S from the proposed LP model, it is necessary to determine whether or not the schedule can be used as the basis for the next iterations of our matheuristic. To this end, we apply the SA acceptance criterion, which is most commonly used in the ALNS framework, to avoid getting trapped in local optima ( Santini, Ropke, & Hvattum, 2018 ). That is, a new improving solution is always accepted, i.e., where τ is the current temperature, and f (S x ) is determined according to Eq. (20) . We initialize τ at τ 0 such that a solution which is w % worse than the current solution is accepted with probability 0.5. Therefore, we have where w denotes the initial temperature control parameter ( Ropke & Pisinger, 2006;Zhao & Lu, 2019 ). Then, the current temperature decreases gradually by a cooling rate ε at every iteration ( τ ← τ ε), where 0 < ε < 1 . The matheuristic stops when its running time has passed a given time limit T max .

Computational experiments
This section presents the experimentation process for assessing the performance of the MILP model and the proposed matheuristic (MH). First, we describe the generation of problem instances based on the floor layout of a fieldlab in the BIC in Section 6.1 . Next, Section 6.2 discusses the parameter tuning for MH, and Section 6.3 quantifies the impact of the destroy and repair operators. In Section 6.4 , we describe the current dispatching rule used by our industry partner at BIC as a benchmark to the MILP and the proposed MH. We then compare the performance of the three methods to validate the performance of the proposed MH with small-sized problem instances. Finally, we carry out the sensitivity analysis with larger-sized problem instances to further demonstrate the MH's effectiveness and generate managerial insights for the industrial case study in Section 6.5 .

Case study design
The problem instances in this research are generated after discussions with industry practitioners from BIC. We generate problem instances by using the distance matrix based on the real world layout at the BIC, shown in Fig. 7 (see Appendix A ). In practice, requests arrive over the layout and are made known to the scheduler at the start of a scheduling horizon H that can typically be of length 1800 or 3600 seconds. The number of requests | R | within this time horizon can range between 20 and 60 depending on the workload. Additionally, requests contain information about their priority. High priority requests are usually production critical requests, which, if delivered late, can decrease the productivity of  [E] in which A represents the capability to lift a load, B to handle heavy loads, C to handle light loads, D to tow loads, and E to handle loads with a robotic arm mounted on the AGV. Therefore, when a request arrives with capability requirements [A,C], i.e., a light load that needs to be lifted, the scheduler must assign it to an AGV that contains these capabilities. Further, AGVs incur travel costs of $0.01, $0.02, and $0.03 per second. These costs are representative of operational and maintenance costs and have been altered to protect confidentiality. The representative data from the actual AGV fleet used in practice is shown in Table 9 (see Appendix B ). The instance generation procedure is presented in Algorithm 5 (see Appendix C ). This procedure takes an input parameter set including the scheduling horizon H, the probability of tight time windows T W , and the number of requests | R | and outputs an instance set containing | R | requests with a random arrival time moment within H seconds and variable time window lengths. The tightness of a time window is measured by the gap between the earliest pickup time and the latest delivery time, and a value of T W , e.g., equal to 0.5, indicates that transport requests have a 50% chance of having a tight time window, where T W ∈ [0 , 1] . For each parameter set 1 (scenario), we generate ten instances and run each instance ten times, making a total of 100 runs. We have observed that ten runs for each instance are sufficient to obtain a stable width around the mean with 95% confidence interval ( Wang et al., 2013 ). The mean and coefficient of variation (CV) of the costs for a parameter set are calculated from costs obtained in these 100 runs. The weight coefficient in the objective function, α, is set to 0.5 for all instances.
All experiments have been run with an execution time limit of 15s (i.e., T max = 15s), unless stated otherwise, on a computer with Intel Core i7-8750H CPU @ 2.20GHz CPU and 16GB RAM with Windows 10 operating system. The MH and MILP are programmed in Python v3.6 and Gurobi Python package, respectively, while the dispatching rule is implemented in the simulation software Anylogic v8.3.1.

Parameter tuning
The parameter tuning methodology is adopted from Ropke & Pisinger (2006) . For each parameter shown in Table 1 (see columns 1-2), we consider five scenarios, generate ten instances per scenario by Algorithm 5 , and perform ten runs per instance for each of five discrete possible values. The considered scenarios and parameter values are reported in Table 10 and Table 11 , respectively (see Appendix E ). Then, for each parameter value, the tuning procedure calculates the average percent deviation from the average of the best achieved solutions and selects the parameter value that results in the least average percent deviation. This procedure is repeated until all parameter values are tuned. The detailed results of the parameter tuning are given in Table 11 , and the parameters' best values are summarized in Table 1 . Moreover, to determine parameter q (the number of requests removed via Shaw destroy operators), we carry out a separate tuning experiment across various instance sizes that depend on the number of requests ( | R | ) and the number of AGVs ( | V | ). The result shows that the value of q can be determined by the piece-wise function as given in Eq. (35) . It can be seen that the value of q is nonincreasing in the instance size since a larger q -value results in a longer running time (due to an increase in computations) per iteration, which slows down the exploration rate of the matheuristic given the computational time limit. While a successful ALNS algorithm relies on the availability of several fast-to-explore operators, having a large q for large instances goes against this principle. For small instances, however, that is not the case, which is also reflected in our findings.

Impact of destroy and repair operators on the performance of ALNS
The ALNS framework consists of different destroy and repair operators. To gain some insights into the operators, we analyze their impact on the performance of the MH by reporting the number of times they are selected ('Used'), contribute to an improvement ('Imp.'), and lead to a new best solution ('Best') in Table 2 . All the instances generated in the parameter tuning section are used to test the performance of the operators. The results show that the operators in all groups are selected with approximately equal proportions. They also contribute equally to improvements in solution quality, except for 'Station repair' group, where CCI dominates NCI and ACI. The operators in 'Request destroy' and 'Request repair' groups contribute equally to finding new best solutions. On the other hand, operators in 'Station destroy' and 'Station repair' groups have noticeable disparity in performance, i.e., RCD, ACD, NCI, and ACI operators contribute relatively less to finding the best solution. To better understand the benefit of each operator to the proposed method, we conduct a statistical analysis (see Appendix F ), inspired by the work of Schiffer & Walther (2018a) . This analysis shows that removing the SHD-D, ACD, and ACI leads to the best increase in solution quality. Hence, we removed these operators afterwards. The remaining operators will then be used for all subsequent experiments.

Comparisons of MILP, the proposed MH, and a practical scheduling policy
In this section, we study the performance of the MILP and the proposed MH. In addition, the results of our MH are assessed with respect to the current practice of scheduling AGVs at IJssel Technology, our industry partner at BIC. The current practice makes use of a bidding-based dispatching rule. In this rule, a transport request is sent to the scheduler when the material to be transported is ready at its pickup location. This rule makes use of EDDs to prioritize requests for assignments on eligible idle AGVs. An AGV is considered idle when it has sufficient charge ( b V k ≥ b V k ) and is not assigned to any request. The scheduler selects the idle AGV that is closest to the source node of the request, i.e., distance-based bidding. Note that after servicing a request, an AGV is sent to the nearest charging station if its charge level drops below the critical threshold (the CCI heuristic). A detailed explanation of this dispatching rule, referred to as EDDBID, is provided in Appendix D .
The three approaches' performance is investigated for the problem instances in which the number of requests | R | ranges from 3 to 30 with H = 1800 , T W = 0 . 5 , and | V | = 6 . To ensure a fair comparison, we run the instances on one processor core and limit the computational time to 15s (i.e., T max = 15s) for all approaches. We also extend the time limit to 3600s on the MILP to check whether the optimal value is reached and further validate the solutions of the MH. Ten iterations are run per instance of a parameter set to obtain the mean and CV of the objective value. A percentage gap is calculated from the mean objective values found by two respective approaches, either the MH and MILP or the MH and EDDBID. A negative gap indicates an improvement in the solution quality of the MH over the other approaches. Table 3 shows that the MILP model can solve small instances to optimality, but with an increase in the number of requests, it fails to reach the optimal value within the considered time limits and returns only the best solution found so far. Namely, the MILP running in 360 0s (MILP-360 0) starts to give only the best feasible solutions when there are more than 5 requests, and not all iterations find a solution beyond 20 requests. Additionally, the MILP-3600 is not able to find a feasible solution for the instances with more than 27 requests. Similar trends can be observed for the MILP running in 15s (MILP-15) and are marked by distinguishing symbols in Table 3 .
On the other hand, the MH is able to find solutions for all instances and provides a better solution than the MILP-15 in all cases and MILP-3600 in a fourth of the cases. For the remaining cases, although the objective values found by the MH are greater than those of the MILP-3600, the difference is always smaller than 2%. In addition, the MH consistently finds a better solution beyond 20 requests. We also observe that the MH can obtain the first solution within a second, which is not the case with the MILP for even small instances. Finally, the MH outperforms EDDBID in all instances, with an improvement of 27% on average.

Sensitivity analysis
This section studies the performance of the MH and EDDBID and produces managerial insights by performing a sensitivity analysis on larger instances that are typical in practice. Both approaches are run with T max of 15s, except in Section 6.5.4 , where the computational time is varied. The MILP is excluded from the analysis since it was not able to find a feasible solution for largesized instances, as shown in Table 3 . Each scenario, considered in this analysis, is represented as ' H − T W − | R | − | V | ' and consists of ten instances in which H is varied in the set { 180 0 , 360 0 } , each instance is run ten times. The data of AGV fleet V is also in Table 9 (see Appendix B ), and the fleet size | V | of, e.g., three indicates that the first three AGVs in this table are employed.
We analyze the effect of the parameters H, T W , | R | , and | V | in Section 6.5.1 , while the effects of cost parameters, battery thresholds, and time limit are investigated in Sections 6.5.2, 6.5.3 , and 6.5.4 , respectively.

Impact of scheduling horizon, time-window tightness, number of requests, and fleet size
The mentioned values of the parameters H, T W , | R | , and | V | are combined to obtain 54 scenarios, thus 540 instances in total.
The average costs ( μ) obtained by MH and EDDBID are presented in Table 4 . The column 'Gap' reports the percentage gap between the costs of the two approaches. Several selected trends across the varied parameters are also shown in Fig. 4 .   Fig. 4 (a), under a constant AGV fleet size, e.g., | V | = 6 , shows the effect of varying T W , | R | , and H on the average cost. It can be observed that there is a lower gap in the performance of MH and EDDBID when the number of requests is less compared to when the number of requests is high. The cost of EDDBID increases at a higher rate than MH when increasing from 20 to 60 requests. On the other hand, the cost decreases with an increase in the scheduling horizon in both MH and EDDBID. However, the average improvement for all scenarios with respect to 1800s and 3600s remains the same, i.e., about 33%, which implies the robustness of MH in different scheduling horizons. Moreover, it can be seen in Fig. 4 (b) that MH performs better than EDDBID under tight time windows, and the improvement is constant. In addition, Fig. 4 (c) shows that the cost goes down with an increase in fleet size. The reduction in cost is higher when moving from 3 to 6 AGVs than when moving from 6 to 9 AGVs. Additionally, Fig. 4 (d) shows the increase in cost due to a higher number of requests. It is observable that when the number of requests is low ( | R | = 20) or the available fleet size is large ( | V | = 9), the difference in costs is smaller than when there are more requests to schedule on a limited fleet. This trend is also evident from the results in Table 4 .
In general, the cost gains are substantial across a wide number of scenarios with the mean gap equal to 33% while the CV ranges between 0.7% and 9%. Additionally, the cost savings are highest when working in an extremely constrained situation, i.e., when the number of requests to schedule is high, and the available fleet size is small. MH makes use of look-ahead information to feasibly plan and dispatch AGVs, resulting in large cost savings, and in scenarios where the fleet size is large, the effects of making use of lookahead information diminish.

Impact of penalty cost for tardiness and AGV travel cost
The experiments in Section 6.5.1 contain requests with varying penalty costs for tardiness and AGVs with varying travel costs.
We report, in this section, our observations drawn from these aspects. We use three types of requests with penalty costs 1, 2, and 3 in the experiments. Table 5 shows the breakdown in tardiness values among requests with different costs when scheduled with EDDBID and MH. It can be observed that requests with a higher cost constitute a lower share of overall tardiness value in MH, whereas these requests are found to constitute a larger proportion in EDDBID. In other words, the schedule obtained via MH reduces the tardiness and, if tardiness is unavoidable due to extremely tight time windows, MH prioritizes requests with a higher cost. EDDBID, on the contrary, is indifferent to request penalties and can lead to a high total cost.
Similarly, we use three types of AGVs with travel costs 1, 2, and 3 for the experiments in Section 6.5.1 . A larger travel cost is often accorded to AGVs that have high operational costs. Table 5 shows that MH shares the travel times of AGVs inversely proportional to their travel costs. Namely, the AGV with a higher travel cost has a lower proportion of the total travel time, while the AGV with the least travel cost constitutes the largest share. On the other hand, EDDBID has approximately an equal share of travel time, since it does not account for travel costs. It is notable that travel time also depends on capability requirements of a request ( A R r ) and capability set of AGVs ( A V k ), i.e., the matching between a request and AGV should satisfy the feasibility condition ( A R r ⊆ A V k ), and preference is considered when AGVs with different travel costs satisfy the capability requirements of a particular request. It is for this reason that the AGV with c V k = 3 has the same proportion when scheduled by EDDBID and MH in Table 5 .

Impact of battery thresholds
In this experiment, we vary critical and non-critical battery thresholds of AGVs (i.e., b V k and b V k ). While the former threshold defines a must condition of visiting a charging station of an AGV, the latter only presents a possibility of the visit. We select three parameter sets, representing a small, medium, and large setting, where b V k is tested with five values from 10 to 50 (percent of the full charge) and b V k is varied with three levels 60, 80, and 100. The results are shown in Table 6 and Fig. 5 .
For the medium and large settings, the lowest cost is obtained when b V k is around 30%, whereas, in a small setting, the best performing critical thresholds are higher, which implies that having a small critical threshold is not necessarily profitable. On the other hand, there is not an evident trend when varying the non-critical threshold, which may indicate that MH is able to schedule critical charge duration in order to maximize the feasibility of serving all requests.

Impact of computational time
We analyze the effect of the computational time on the solution quality of the MH by increasing the time limit to 15, 30, 45, and 60s. This experiment is also conducted with three problem settings (scenarios). The average total costs of each scenario obtained in 15s, 30s, and 45s are compared to that found in 60s. The    Table 7 and Fig. 6 show that the gap relative to the cost obtained in 60s decreases with the increase in the computational time. However, the gap is only at a maximum of about 2%, 3%, and 7% for a small, medium, and large setting, respectively. Additionally, it can be seen in Fig. 6 that the objective values of the three scenarios converge as the computational time increases, and the rate of convergence, for the large setting, is higher than it does for the smaller ones.
To further investigate the computational efficiency of the MH, we report in Table 8 the percentage of the time spent in each of the algorithm's components relative to the total time available ( T max ). We here consider the scenario with | V | = 9 , H = 1800 , T W = 0 . 8 , and T max = 15 s . It can be seen that the maximum time is spent in the 'Request destroy/repair' component. When increasing the number of requests, the number of iterations decreases due to expensive computations of operators that aim to destroy requests from their worst locations and reinsert them to their best ones, and these candidate locations increase with an increase in the number of requests. As a result of the reduced number of iterations, the percentage gap gradually diminishes between MH and EDDBID. However, we observe at our industrial partner that the instances of 9 AGVs and 60 requests are the busiest scenario in practice. Even if with 100 requests, the MH still produces a better solution than the EDDBID.
To summarize the sensitivity analyses, the proposed MH is able to guarantee better results than the current way of scheduling Table 8 Computational time study for different number of requests with H = 1800 , T W = 0 . 8 , | V | = 9 , T max = 15 s . within a computational time acceptable in practice. We also compute the CV values for all instances to examine the volatility in the obtained results. We observe an average CV of 4% that signifies the stability of the proposed MH leading to reliable solutions in different scenarios.

Conclusions
This paper studies the problem of scheduling single-load AGVs. The main novelty of the problem lies in simultaneously addressing the following characteristics that often arise in practice: transport requests with soft due dates and different penalty charges, a heterogeneous fleet of AGVs with different travel costs and capabilities, and partial charging of AGVs with a critical battery threshold. An MILP model was formulated with the objective of minimizing a weighted average of tardiness costs of requests and AGV travel costs. Nevertheless, the MILP model could only be applied to small-scale problems, i.e., up to 22 requests. Therefore, we proposed an ALNS-based matheuristic (MH) algorithm that can account for the heterogeneity in requests and AGVs and use various strategies to handle charging tasks in order to find good-quality solutions within a reasonable computational time in practice.
The computational experiments on a large number of scenarios generated from real-world data reveal that the solutions of the proposed MH are better than those of the MILP and the current way of scheduling employed in practice under the same time restriction. A sensitivity analysis on the variation of different parameters indicates an average of 33% improvement in current practice by using our solution approach. This improvement also results in better utilization of the available AGV fleet in addition to monetary savings.
Interesting future works may include the extension of our approach to multiple pickups and deliveries along the same route (multi-load AGVs) and the inclusion of path planning in the scheduling process.
Appendix B. Real data of AGV fleet Table 9 AGV fleet parameters.

Appendix C. Generation of instances
For a given interval (a, b) , let y (a,b) and z (a,b) denote a random number and an integer, respectively. The pseudocode for the instance generation is given by Algorithm 5 . For the experiments, Algorithm 5 Test instance generation procedure.
Here, a random element from set A R , e.g., [A,B] or [A,C], is denoted by y A R .

Appendix D. Current practice
This section presents the current practice of scheduling AGVs at IJssel Technology that makes use of a bidding-based dispatching rule (EDDBID), as illustrated in Fig. 8 . In this rule, a transport request is sent to the scheduler when the material to be transported is ready at its pickup location. A new request is stored in a database. The requests in the database are sorted in increasing order of their latest delivery times. This step ensures that requests which have early deadlines are processed first according to the EDD rule. The scheduler sends a call for bids for requests in the database to idle and capable AGVs. An AGV is considered idle when it has sufficient charge ( b V k ≥ b V k ) and is not assigned to any task. On receiving this call from the scheduler, each candidate sends a bid that is simply the distance from the AGV location to the source of a request. The scheduler then receives bids from the competing AGVs and awards the request to the best bidder, in this case, the  nearest capable AGV. Note that after servicing a request, an AGV will be sent to the nearest charging station if its charge level drops below the critical threshold (the CCI heuristic).

Appendix E. Parameter tuning
The scenarios considered for parameter tuning are given in Table 10 , and the complete results are reported in Table 11 , where we tune the parameters in the sequence from top to bottom. Five different values for each parameter are considered, and for each value, the tuning procedure runs ten times per instance. An average deviation is obtained by averaging the standard deviations for all instances in a given scenario, and the deviations across all scenarios are averaged to get 'Dev%'. The parameter value with the smallest 'Dev%' is selected. This procedure is repeated until all parameters have been tuned. The selected values of the parameters are indicated in bold in Table 11 .

Appendix F. ALNS operator analysis
To better understand the benefit of each operator to the proposed method, we conduct a statistical analysis based on the work of Schiffer & Walther (2018a) . The results of this analysis are presented in Table 12 . Within this analysis, we remove each operator one at a time, and a percentage gap is calculated from the objective value obtained when none of the operators are removed. A negative gap indicates an improvement in the solution quality of the MH when the corresponding operator is excluded from the original set of operators. If the exclusion of more than one operator leads to improvement in solution quality, then we investigate all possible combinations of removing multiple operators at once to make sure that there are no adverse effects of removing them. Finally, we select the operator configuration which provides the best result. Table 12 shows that the solution quality increased without using the SHD-D, RKI, RCD, ACD, and ACI operators. We then investigated the configurations without all these five, four out of the five, three out of the five, and two out of these five operators in more detail. Since removing the SHD-D, ACD, and ACI leads to the best result (improvement of 2%), we removed these operators after the analysis.  SHD  SHD-T  SHD-C  SHD-D  RRD  WRD  GRI  RKI  EGI  RCD  ACD  WCD  NCI  CCI  ACI  Gap [%]