A hybrid expert system, clustering and ant colony optimization approach for scheduling and routing problem in courier services

Article history: Received June 29 2017 Received in Revised Format July 1 2017 Accepted August 1


Introduction
Courier services are dedicated to picking up and delivering packages and documents sent from one person to another, with the aim of fast and secure pick-up and delivery. Managing these service systems involves different decision-making levels. At the strategic level (long-range term), decisions state the objectives, resources, and policies of the organization over a long planning horizon; a typical problem is forecasting the kind of services to provide and the capacity required in the next three years. At the tactical level (medium-range term), decisions determine, in aggregate, the resources required to perform the service over a medium planning horizon; a typical problem is determining the workforce required in six months. At the operational level (short-range term), decisions determine how to carry out the specific tasks of the service operation over a short planning horizon, e.g., scheduling an operator to deliver a package to the customer's site. The operation can be on any scale, from specific towns or cities to regional, national and global services. The complexity of the problems increases from the strategic level to the operational level because the required information increases and becomes more detailed, while the available response time decreases. Scheduling problems belong to the operational level of the decision-making process and are thus difficult to solve.
Courier services have grown in coverage and in operational complexity, although in many of them the traditional expertise of the workers is still the main input for executing the operation (Rodríguez-Vásquez et al., 2016). This reliance allows human error to introduce inconsistencies in the process, triggered, among other reasons, by employee turnover and the constant loss of the accumulated knowledge and experience about the specific tasks involved. In addition, when scheduling services manually, even the most experienced planners can consider only a limited number of possibilities and need to invest a significant amount of time to obtain a feasible schedule that is generally far from the optimum.
A courier service generally consists of a crew of workers distributing thousands of packages, managed daily, to a set of geographically distributed customers (Rodríguez-Vásquez et al., 2016). Finding the best set of routes for a workforce crew is a combinatorial optimization problem known as the vehicle routing problem (Toth & Vigo, 2002). In a broad sense, this problem finds the best set of routes to be performed by a set of vehicles (crews) in order to serve a set of geographically spread customers subject to some operational constraints. Thus, courier services can be modeled as a vehicle routing problem (VRP).
Courier services have real-life characteristics that correspond to variants of the VRP. The VRP with time windows (VRPTW) is the variant most commonly included in courier services. This constraint arises when appointments can be arranged before delivery and specify a date and time for the service (Toth & Vigo, 2002). The capacitated VRP (CVRP) adds the maximum load capacity of the vehicles as a constraint (Toth & Vigo, 2002). The distance-constrained VRP (DC-VRP) adds, on top of the capacity constraint, a maximum traveling limit for each vehicle, usually given in terms of distance or time (Farahani et al., 2011). The multi-period VRP (MP-VRP) considers a planning horizon composed of a set of periods in which customers have to be reached at least once. Another extension is the VRP with due dates (VRPD), which takes into account the customers' due dates (Archetti et al., 2015).
This paper presents a scheduling and routing problem in courier services that involves the VRP variants described above: time windows, multiple periods, capacity, due dates and distance constraints. To solve this problem we propose a three-phase approach. The first phase consists of a scheduling model that finds the visit date for each customer over the planning horizon by considering the release date, the due date and the travel times between the customers and the depot. To solve this problem, we use an expert system. The second phase is a clustering model for each period that assigns customers to the crew according to the travel times, the maximum load capacity and the customers' time windows. We use two algorithms to solve this problem: a centroid-based heuristic and a sweep heuristic. Finally, the third phase consists of a routing model that finds the sequence in which to visit all customers, taking into account the customers' time windows and the available time of the vehicles. An Ant Colony Optimization (ACO) metaheuristic was developed to solve this problem.
The remainder of this paper is organized as follows. Section 2 reviews the background and literature related to scheduling and routing problems in courier services. Section 3 states the problem and its notation. Sections 4, 5 and 6 describe the scheduling, clustering and routing phases, respectively, used to solve the scheduling and routing problems in courier services. Section 7 provides numerical experiments with our proposed approach on an example inspired by a real-world case. Finally, Section 8 concludes this work and provides possible research directions.

Background and literature review
Scheduling and routing are very complex problems in service industries such as courier companies. These problems involve decisions to allocate deliveries to a limited set of resources at a specific time under several operational constraints such as time windows, capacity and due dates, among others. The complexity increases when decision makers face hundreds or thousands of packets to be scheduled in a short amount of time while considering several performance measures such as operational cost, customer service level, profitability, etc. The vehicle routing problem (VRP) is closely related to scheduling and routing problems in courier services, which involve several variants and extensions of the VRP; thus these problems are NP-hard (Toth & Vigo, 2002).
Different applications of courier services tightly coupled with the VRP can be found in the literature. Malmborg (2000) presents a preliminary application of scheduling in courier services. The author studies the bank messenger problem as a scheduling application that determines the starting time and the sequence of stop locations. The problem is solved using a heuristic based on a simplified criterion for check processing delays. Ghiani et al. (2009) describe an approach to solve the dynamic vehicle dispatching problem with pickups and deliveries. They propose a set of anticipatory algorithms to solve the problem. Their objective function minimizes the expected inconvenience to the customers, and their results show better solutions compared with other approaches. Chang and Yen (2012) propose routing and scheduling strategies for city couriers in Taiwan. They seek to reduce operational costs and improve the service level using a multi-objective multiple traveling salesman problem formulation. Their objective function simultaneously minimizes the total traveled length and the workload imbalance. They consider hard time windows and propose a multi-objective scatter search framework to find the Pareto-optimal solutions. Yan et al. (2013) study the planning and adjustment of courier routes and schedules in urban regions. They use VRP models that include stochastic demand and traveling times. Their approach adjusts planned routes according to actual operations in real time. Lin et al. (2014) show a prototype of a decision support system to solve the offline and online routing problems arising in courier services. They formulate the problem as a dynamic vehicle routing problem (DVRP) using fuzzy time windows to represent the service level. To solve the problem, a hybrid neighborhood search algorithm was developed. Their results show an improvement of the courier service level without the expense of a longer traveling distance or a larger number of couriers. Janssens et al. (2015) present a two-phase approach to solve a courier scheduling problem. The first phase consists of partitioning a distribution region into smaller zones that are assigned to a preferred vehicle. Then, in the second phase, they model the problem for each zone, based on the VRP, as a multi-objective optimization problem and develop a heuristic to solve it.
The works described above have in common the application of the VRP to courier services. Table 1 summarizes the constraints usually taken into consideration, such as capacity and time windows; typically only one or two constraints are used simultaneously. In this paper, we also consider other constraints such as distance and multiple periods; therefore, our solution considers four real-life characteristics at the same time, implying a high level of complexity.
The VRP is one of the most researched combinatorial optimization problems in the literature, and the trend is towards VRP models that include more real-life characteristics (Braekers et al., 2016). Several variants of the VRP include features related to scheduling and routing in courier services. Braekers et al. (2016) present a survey of the trends in the VRP literature. The authors found that the CVRP (capacitated VRP) and the VRPTW (VRP with time windows) are the features considered in the most publications, and these are the most important variants used in the modeling of courier services. Another important variant is the distance-constrained VRP (DC-VRP), which involves a real-life feature such as the maximum distance that a vehicle can be driven or the labor period; however, only some studies are dedicated to this constraint (Almoustafa et al., 2013; Kek et al., 2008; Nagarajan & Ravi, 2012; Tlili et al., 2014). On the other hand, the pickup and delivery constraint is generally combined with the VRPTW (Fikar & Hirsch, 2015; Küçükoğlu & Öztürk, 2015; Lin, 2011; Liu et al., 2013; López-Santana & Romero Carvajal, 2015). The MP-VRP (multi-period VRP) considers a planning horizon in which the customers need to be visited several times with periodic or non-periodic frequency according to the service operation. Francis et al. (2006) study both cases, periodic and non-periodic frequency, implement a service choice variable and present several heuristics for its solution. Rodriguez et al. (2015) present an iterative method to solve the PVRP through the construction of a unique visit schedule. Their approach allows load deconsolidation, using non-regular visit days and variable frequencies, to make it a more robust and flexible tool for a real-world environment. Also, the due date is a constraint added to customers, giving rise to a variant known as the VRPD (VRP with due dates). Archetti et al. (2015) study this variant with multiple periods and solve it with a branch-and-cut algorithm. According to Braekers et al. (2016), several applications consider real-life characteristics individually or, in some cases, together with a limited number of other characteristics; however, the literature lacks many combinations of realistic features. In the case of courier services, multiple features must be considered simultaneously: time windows, multiple periods, capacity, due dates and distance constraints. In addition, it is possible to develop efficient solution methods to solve these problems.
On the other hand, the literature offers a wide variety of solution methods for the VRP, which can be classified into exact methods, heuristics, and metaheuristics (Eksioglu et al., 2009). Among the main exact methods used to solve the VRP are Branch and Bound (B&B), Branch and Cut, and Set-covering-based methods (Farahani et al., 2011). An example of B&B can be found in Almoustafa et al. (2013), where it is used to solve a VRP with distance constraints. For Branch and Cut, an application to the VRP with multiple periods can be found in Archetti et al. (2015). An example of a Set-covering-based method is Cacchiani et al. (2014).
Among the heuristics, the most used are the two-phase heuristics: cluster-first route-second, route-first cluster-second, and petal algorithms (Toth & Vigo, 2002). For the first family, the classic example is the Sweep Algorithm (Toth & Vigo, 2002); another is the Adaptive Large Neighborhood Search (ALNS) framework, with which several types of VRP can be solved (Pillac et al., 2012). For the second family, the route-first cluster-second methods, applications of Greedy Algorithms can be found (Chu et al., 2006; Sprenger & Mönch, 2012). Other heuristics are constructive; the Clarke and Wright Savings Algorithm (Clarke & Wright, 1964) is the most famous of them. A more recent example is presented in Lin (2011), which uses a constructive heuristic as one step of a broader heuristic proposal.
The third group is the metaheuristics, which are frequently used to solve large combinatorial optimization problems. Some of the most commonly used are: Tabu Search (TS), developed from local search heuristics and additionally including a solution evaluation, local search tactics, termination criteria and elements such as a tabu list and the tabu length (Jia et al., 2013); Ant Colony Optimization (ACO), a collective intelligence algorithm inspired by the foraging process of ants and used to solve complex combinatorial optimization problems such as shortest path problems (Ding et al., 2012); and the Genetic Algorithm (GA), defined as an adaptive search heuristic that operates over a population of solutions and, based on the principle of evolution, improves the solutions using crossover and mutation (Pereira & Tavares, 2009). … (2018) present a three-stage methodology to solve the multi-depot VRP. In the first stage, the starting solutions are built using constructive heuristics. The second stage consists of improving each starting solution with an iterated local search multi-objective metaheuristic (ILSMO). In the last stage, a single front is found using dominance concepts, taking the previous results as a base.
Expert Systems (ES) are knowledge-based systems that use reasoning methods emulating the behavior and performance of a human expert to solve problems. ES are a specific field of artificial intelligence (Díez et al., 2001; Méndez-Giraldo et al., 2013). An ES is a computer program that reasons and uses a knowledge base to solve complex problems in a particular domain (Krishnamoorthy & Rajeev, 1996; Kusiak, 1990). Sahin et al. (2012) show a statistical analysis of hybrid ES approaches and their applications. The number of publications shows an increasing trend, an indicator of the popularity of hybrid ES. Dios and Framinan (2016) present a review of case studies in manufacturing scheduling tools, while Wagner (2017) reviews case studies using ES in different areas and states future trends. Both studies affirm that ES play a strategic role in scheduling methods and techniques, since they allow human expertise to be incorporated into the algorithms and can be hybridized with other techniques. However, applications of ES in service systems are scarce. López-Santana and Méndez-Giraldo (2016) present a proposal of an ES for scheduling in service systems. They state that applications of knowledge-based systems and ES for scheduling in service systems are scarce, and they propose a structure to characterize the service system and identify the tools to solve a specific scheduling problem.
When examining the literature, one notices the lack of modeling approaches for scheduling and routing problems in courier services, owing to the complexity introduced by the multiple real-life characteristics and the large scale of the problems. Indeed, most of the papers deal with applications of the VRP and its variants individually. The novelty of this work consists of exploring and combining multiple real-life characteristics, namely time windows, due dates, multiple periods, distance constraints and capacity constraints, in the VRP model applied to courier services. In line with the literature trends, we propose a hybrid solution approach that integrates an expert system to schedule deliveries, two heuristics to cluster customers and an ACO-based metaheuristic to solve a routing problem.

Problem statement
We consider a set of customers $N = \{1, 2, \ldots, n\}$, geographically distributed, in which each customer expects to receive a package. The workers (vehicles) who visit the customers are identical, are denoted by $K = \{1, 2, \ldots, m\}$, and belong to a central hub $0$ that is the starting point and the end point of all workers. We assume that a worker is assigned to a vehicle, thus we use worker and vehicle interchangeably. All customers must be visited once during the planning horizon. We define the problem on a complete directed graph $G = (V, A)$, where $V = N \cup \{0\}$ is the set of vertices and $A = \{(i, j) : i, j \in V, i \neq j\}$ is the set of arcs. Each arc $(i, j)$ has an associated non-negative value $t_{ij}$ that represents the traveling time from $i$ to $j$. The objective is to establish the set of routes that visits all customers while minimizing the total routing distance for the vehicles. The assumptions and conditions of the problem are summarized as follows:
- The service time is the same for all customers and all vehicles.
- Each customer imposes a hard time window in which the delivery must start. This means that the vehicle must arrive to start the service before the end of that time window; it can arrive before the beginning of the time window, but the customer will not be serviced before this earliest time.
- The vehicles are homogeneous, i.e., they have the same capacity, but they have different time windows for their labor period. This means that each vehicle has an availability limit for the traveling time and service time.
- The vehicles start and finish at a central depot.
- The travel times are deterministic and fulfill the triangle inequality.
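The hard time-window assumption can be illustrated with a minimal sketch. The function name `service_start` and its arguments are hypothetical, and treating an arrival exactly at the end of the window as feasible is an assumption of this sketch, not something stated in the text.

```python
from typing import Optional


def service_start(arrival: float, earliest: float, latest: float) -> Optional[float]:
    """Return the time at which service starts, or None if the window is missed.

    Assumption: arriving exactly at `latest` is still feasible.
    """
    if arrival > latest:
        # The vehicle arrived after the window closed: the visit is infeasible.
        return None
    # An early vehicle waits until the window opens; otherwise it starts on arrival.
    return max(arrival, earliest)
```

For example, a vehicle arriving at 8.0 for a window of [9.0, 11.0] waits and starts service at 9.0, while an arrival at 11.5 is rejected.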
Table 2 introduces the notation used in the mathematical models of our approach.

Table 2. Notation
Parameters: total available hours of all vehicles.
Binary variables: 1 if the customer is visited in the period, and 0 otherwise; 1 if the customer is visited by the vehicle, and 0 otherwise; 1 if the arc is used in the optimal solution, and 0 otherwise; 1 if the vehicle is used, and 0 otherwise.
Integer variables: number of customers to visit in the period.
Continuous variables: time at which the visit of the customer starts; accumulated load delivered up to reaching the customer.

Fig. 1 shows our approach, which solves three models:
1. Scheduling model: The objective function, inputs, outputs and solution method for this phase are shown in Fig. 1. This model defines the visit date of each vertex of the customer set within the planning horizon, by considering the release date, the due date and the travel times between all pairs of vertices. To solve this model, we use an expert system whose knowledge base is the know-how of the courier service. The inference engine works as a rule interpreter. The output consists of two sets: the set of customers allocated to each period and its respective subset of arcs for each period.
2. Clustering model: This model takes as input the outputs of the scheduling model, the demand and time windows of the customers, and the capacity and labor schedule of the workers (see Fig. 1). The clustering model groups all vertices of the set according to the traveling times of the arcs and allocates each group to a vehicle by considering its labor schedule, its maximum load capacity and the time windows of the customers. The outputs are the set of customers allocated to each vehicle in each period and its associated subset of arcs.
In the next three sections, we present the components of our solution approach. First, we present the scheduling model to allocate the customers to each period; second, we develop the clustering model to determine the set of customers for each worker; and finally, we present the routing model to find the sequence in which each worker visits the customers.

Scheduling model
We have a set of customers, each to be scheduled in a period within the planning horizon. Since the number of customers is very large, we need to solve a scheduling model that takes into account the appointments arranged with them, the due dates of the customers, the maximum load capacity of the vehicles, the number of vehicles available, the average visiting capacity of the vehicles per hour and the total available hours of the vehicles. The first phase consists of allocating customers to periods. The next sections explain the mathematical formulation of the problem and its solution procedure, which consists of an expert system.

Problem formulation
Given $N$ as the set of customers that must be visited during the planning horizon $P$, we need to allocate each customer to a period in which it is visited. This problem uses a binary variable $X_{ip}$ that takes the value 1 only if customer $i \in N$ is visited in period $p \in P$, and 0 otherwise.
The scheduling model can be formulated mathematically as the minimization of an objective function (1) subject to Constraints (2)–(5). The objective function (1) seeks to minimize the average distance between all customers to visit in each period of the planning horizon, which maximizes the concentration of the visiting groups. Constraints (2) represent the relation that assigns a specific customer to a period as a function of six inputs: the appointments arranged with the customers, the due dates of the customers, the maximum load capacity of the vehicles, the total number of vehicles available, the average visiting capacity of the vehicles per hour and the total available hours of the vehicles. In the scheduling model, this relation is represented by a set of rules that belongs to an expert system. The rules include basic rules that enforce the constraints, e.g., the dates of the appointments arranged with customers, and priority rules, such as ranking first the customers that are closer to their due date, or grouping a customer who has no appointment with a nearby customer who does so that both can be visited in the same period. Constraint (3) ensures that each customer has a determined visiting date in the planning horizon. Constraint (4) limits the maximum number of deliveries that can be scheduled according to the total load capacity. Finally, Constraint (5) defines the binary variable used.
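As an illustration of the performance measure behind objective function (1), the following sketch computes the average pairwise travel time among the customers assigned to each period. It is one possible reading of "average distance between all customers to visit each period", not the exact expression of Eq. (1). The function name, the dictionary layout and the assumption of symmetric travel times stored under keys `(i, j)` with `i < j` are all hypothetical.

```python
from itertools import combinations
from statistics import mean


def average_distance_per_period(assignment, travel_time):
    """Average pairwise travel time of the customers grouped in each period.

    assignment: dict mapping customer id -> period
    travel_time: dict mapping (i, j) with i < j -> travel time (assumed symmetric)
    """
    groups = {}
    for customer, period in assignment.items():
        groups.setdefault(period, []).append(customer)

    result = {}
    for period, members in groups.items():
        pairs = list(combinations(sorted(members), 2))
        # A period with a single customer has no pair, so its average is 0.
        result[period] = mean(travel_time[p] for p in pairs) if pairs else 0.0
    return result
```

A lower value for a period indicates a more concentrated visiting group, which is what the expert system uses as its performance function.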

Solution procedure
In the courier service considered in this paper, scheduling is performed by the workers, who mainly use the know-how and expertise of the business to define the period of the planning horizon in which each customer will be visited. Our plan is to provide a model that preserves the efficacy of the scheduling process and reduces the chance of human error. An expert system (ES) is a well-designed system that replicates the cognitive process experts use to solve particular problems (Turban, 1989). In addition to the rules derived from the workers' experience, the ES uses as its performance function the objective function established in Eq. (1), which consists of minimizing the average distance between all customers to visit in each period of the planning horizon.
The ES is specialized in a specific field and aims to solve problems through reasoning methods that emulate the performance of a human expert (Díez et al., 2001; Méndez-Giraldo et al., 2013). Fig. 2 shows the architecture of a traditional ES. The ES operates as described by López-Santana and Méndez-Giraldo (2016). When the ES starts the inference process, it needs a context or working memory, which represents the set of established facts. The explanation system simulates how an expert would answer the questions "How is a decision arrived at?" and "Why is a given piece of data needed?" Since the knowledge base has to be continuously updated and extended as knowledge in the domain grows, an interface between the expert and the ES is necessary. The knowledge acquisition system provides this interface; it is not an on-line component and can be implemented in many ways. In addition, another interface is necessary between the user and the ES. The user interface allows the user to interact with the ES by supplying data, defining facts and monitoring the status of the problem solving.
The knowledge base is the component where all knowledge provided by experts is stored in an orderly and structured way as a set of relationships, such as rules or probability distributions, facts and heuristics that represent the thinking of the expert. Most ES use rules as the basis of their operation, but other representation schemes, such as semantic networks and frames, can also be used. The knowledge base is independent of the inference mechanisms and search methods.
The inference engine performs two main tasks (López-Santana & Méndez-Giraldo, 2016): first, it examines the facts and rules and, if possible, adds new facts; second, it chooses the order in which inferences are made. Notably, the conclusions obtained by the inference engine are based on deterministic or stochastic data and can be simple or compound.

Fig. 3. Architecture of knowledge base and inference engine
The main components of our proposed ES are the knowledge base and the inference engine. Fig. 3 presents the structure of our ES. The knowledge base is built from the database of the courier service through a documentary review of internal procedures and service agreements, and through interviews with the employees who carry out the scheduling activities, who may be considered pseudo-experts. From these sources, rules of the type if (condition) then (action) are formulated, which together form the rule base. Together, the database and the rule base make up the knowledge base of our ES; they were built in MongoDB® and Java®.
The inference engine is a rule interpreter that decides when to apply the rules. It was built in Jess (Java Expert System Shell) and Java®. The rules of the knowledge base are grouped into three modules, which are applied in an orderly and sequential way as follows:

Module of Static rules:
The first module contains static rules, so called because they are applied to every customer package in the available stock that is to be visited. These rules are based on the internal procedures and the service agreements with the customers. The result is a modification of a feature named route status, which can take three values: pending, dismissed and scheduled. If any rule of the module is fulfilled, the feature is changed to "dismissed". If none of the rules of the module is fulfilled, the feature keeps the value "pending".

Module of Appointments:
The second module contains only one rule; however, it is the main rule and the basis for applying the next module. The rule evaluates whether the customer has a booked appointment within the planning horizon; if so, it changes the route status to "scheduled" and sets the route date to the appointment date.

Module of Dynamic rules:
The third module contains a set of rules that are activated only when they are required. These rules were designed to take advantage of the strength of an ES, which is intensive search: instead of spending computational effort on repeatedly applying different rules to the same facts, the effort is directed towards searching a large base of facts for those that activate the rules. These rules are built from the experience and know-how of the pseudo-experts who program the routes.
The set of rules that make up the knowledge base and the inference engine developed for the case study are listed in Appendix A. The user interface and all components are integrated in Java®.
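The sequential application of the three modules can be sketched as follows. This is an illustrative Python outline under stated assumptions, not the Jess implementation: the `Package` fields, the static rule `bad_address` and the dynamic rule `earliest_before_due` are hypothetical stand-ins for the rules of Appendix A. Only the route status values (pending, dismissed, scheduled) and the order of the modules come from the description above.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Package:
    customer: str
    due_date: date
    appointment: Optional[date] = None
    address_ok: bool = True           # hypothetical attribute checked by a static rule
    route_status: str = "pending"
    route_date: Optional[date] = None


def bad_address(pkg: Package) -> bool:
    """Hypothetical static rule: dismiss packages with an invalid address."""
    return not pkg.address_ok


def earliest_before_due(pkg: Package, horizon: List[date]) -> Optional[date]:
    """Hypothetical dynamic rule: earliest period that does not exceed the due date."""
    feasible = [d for d in horizon if d <= pkg.due_date]
    return min(feasible) if feasible else None


def run_modules(pkg, horizon, static_rules, dynamic_rules):
    # Module 1, static rules: any fulfilled rule dismisses the package.
    if any(rule(pkg) for rule in static_rules):
        pkg.route_status = "dismissed"
        return pkg
    # Module 2, appointments: a booked date inside the horizon fixes the route date.
    if pkg.appointment is not None and pkg.appointment in horizon:
        pkg.route_status = "scheduled"
        pkg.route_date = pkg.appointment
        return pkg
    # Module 3, dynamic rules: the first rule that proposes a date schedules the package.
    for rule in dynamic_rules:
        proposed = rule(pkg, horizon)
        if proposed is not None:
            pkg.route_status = "scheduled"
            pkg.route_date = proposed
            break
    return pkg
```

A package that no module can place keeps the status "pending", which mirrors the default value of the route status feature.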
The outputs of the scheduling model are the sets of customers scheduled to be visited in each period. The next section describes the clustering model, which builds a given number of clusters for each period and associates each cluster with a worker.

Clustering model
From the results of the scheduling model we have, for each period in the planning horizon, a set of customers scheduled to be visited. However, these sets are too large, so we group the customers into one set per vehicle using a clustering model. To solve this model we propose two algorithms: a centroid-based algorithm and a sweep algorithm. The next sections describe the mathematical formulation of the clustering model and its solution procedures.

Problem formulation
Given the set of customers $N_p$ scheduled to be visited in each period $p$, we use a binary variable $Y_{ik}$ that takes the value 1 only if vehicle $k$, from the set of vehicles $K$, visits customer $i \in N_p$, and 0 otherwise. For each period $p$, the clustering model can then be stated as the minimization of an objective function (6) subject to Constraints (7)–(10). The objective function (6) seeks to minimize the average distance among all the customers to be visited by vehicle $k$. Constraints (7) limit the maximum load capacity of each vehicle. Constraints (8) state that a customer can be allocated to only one vehicle. Constraints (9) ensure that customers with time windows are allocated to vehicles that start their shift before the time window. Constraints (10) define the binary variable used.
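The conditions described for Constraints (7) to (9) can be checked for a candidate clustering with a short sketch. The data layout and the function name `is_feasible` are hypothetical, and reading Constraints (9) as "the shift starts no later than the start of the customer's time window" is an assumption of this sketch.

```python
def is_feasible(clusters, customers, vehicles):
    """Check a candidate clustering for one period.

    clusters: list where clusters[k] is the list of customer ids of vehicle k
    customers: dict id -> (demand, tw_start, tw_end)
    vehicles: list of (capacity, shift_start, shift_end), aligned with clusters
    """
    if len(clusters) != len(vehicles):
        return False
    # Constraints (8): every customer belongs to exactly one cluster.
    assigned = [c for members in clusters for c in members]
    if sorted(assigned) != sorted(customers):
        return False
    for members, (capacity, shift_start, _shift_end) in zip(clusters, vehicles):
        # Constraints (7): the total demand must fit in the vehicle.
        if sum(customers[c][0] for c in members) > capacity:
            return False
        # Constraints (9), as assumed here: the shift starts before each window opens.
        if any(shift_start > customers[c][1] for c in members):
            return False
    return True
```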

Solution procedure
There are several methods that allow clustering customers in groups according to distances, costs, etc. (Patiño Chirva et al., 2016).We selected two procedures: Centroid-Based and Sweep algorithms.Given d( as the demand of customer ∈ , denotes the vertex of the customer , , is the time window of customer and , is the schedule labor period of cluster , i.e., the time window when a vehicle is available to deliver the packages.We assume that a cluster corresponds to a vehicle. The Centroid-Based algorithm starts from the geometry of the geometric centers, around which the cluster environment is generated.Algorithm 1 shows the pseudocode of the centroid-based algorithm for clustering the set of customers to be visited for each period.This method is divided into two steps according to Shin and Han (2012) and we modify to adapt the time windows for customers and vehicles.In the first step is the clustering construction, the algorithm selects the node farthest from the source, within nodes that have not been previously allocated and a cluster is generated; then the geometric center calculated with Eq. ( 11) must be computed between these nodes, where and are coordinates in and , respectively, and the nodes belonging to the cluster.

⁄ , ⁄
To add nodes to the first cluster ( , the cluster construction algorithm finds among un-clustered nodes, which is located closest from , and includes to only if the demand of does not exceed the available capacity of and its time window belongs to the labor schedule of .If is added to cluster , the capacity of the cluster is reduced by the demand of and is recalculated.The same processes above are conducted until the available capacity of becomes smaller than the demand of the closest node from ( ).When the demand of exceeds the available capacity of , the algorithm stops to expand , and finds the farthest node among un-clustered nodes again in order to generate another cluster, .These processes are repeated until no unvisited node exists, i.e., when ∅.In summary, the nodes closest to the geometric center are added taking into account the defined capability of each cluster, which corresponds to the capacity of a vehicle, the time windows associated with the customers and the available schedules of the vehicles.The time complexity of this step is ).
In the second step, the clusters generated in step one are adjusted. Cluster adjustment means that a customer is moved from its current cluster to another cluster if three conditions hold: the customer is closer to the geometric center of the other cluster than to that of its own cluster, its demand does not exceed the available capacity of the other cluster, and its time window falls within the available labor time of the other cluster. When a customer moves from one cluster to another, the geometric centers of both clusters are recalculated. The time complexity of this step is also .
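A single pass of the adjustment step might look like the sketch below. The data layout is the same hypothetical one used for illustration (customers as `(x, y, demand, tw_open, tw_close)` tuples), and the rule that a cluster is never emptied is an added assumption, not something stated in the text.

```python
import math

def adjust_clusters(customers, clusters, vehicles):
    """Move customers to a closer cluster when capacity and time windows allow.

    customers: {id: (x, y, demand, tw_open, tw_close)}
    clusters:  list of id lists, aligned with vehicles
    vehicles:  [(capacity, shift_start, shift_end)]
    """
    def center(ids):
        pts = [customers[c][:2] for c in ids]
        return (sum(p[0] for p in pts) / len(pts), sum(p[1] for p in pts) / len(pts))

    def load(ids):
        return sum(customers[c][2] for c in ids)

    clusters = [list(ids) for ids in clusters]  # work on a copy
    for src in range(len(clusters)):
        for c in list(clusters[src]):
            if len(clusters[src]) == 1:
                break  # assumption: never empty a cluster
            x, y, demand, opens, closes = customers[c]
            own = math.dist((x, y), center(clusters[src]))
            for dst in range(len(clusters)):
                if dst == src or not clusters[dst]:
                    continue
                capacity, start, end = vehicles[dst]
                closer = math.dist((x, y), center(clusters[dst])) < own
                fits = (load(clusters[dst]) + demand <= capacity
                        and start <= opens and closes <= end)
                if closer and fits:
                    clusters[src].remove(c)
                    clusters[dst].append(c)
                    break  # centers are recomputed on the next comparison
    return clusters
```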
Algorithm 2 shows the pseudocode for clustering the customers with the sweep algorithm. The first step is the initialization process, which consists of computing the polar coordinates of the customers and sorting them in ascending order. Moreover, the clusters are sorted by schedule time. The second step is the cluster construction. A straight line starting at an origin point is rotated, creating a zone to which customers are assigned. The area covered in the sweeping process constitutes the cluster, as long as it complies with the capacity constraint of that cluster. The algorithm selects the next un-clustered node and adds it to the cluster only if its demand does not exceed the available capacity of the cluster and its time window falls within the labor schedule of the cluster. If the node is added, the capacity of the cluster is reduced by the demand of the node. This process is repeated until the available capacity of the cluster becomes smaller than the demand of the node under consideration. Then we select the next cluster, and the process is repeated until no unvisited node exists, i.e., until the set of un-clustered nodes is empty. As with the centroid-based algorithm, the sweep heuristic was modified to take into account the time windows of the customers and the available schedules of the vehicles. The time complexity of this algorithm is .
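The sweep procedure can be illustrated with the short sketch below. It assumes the depot is the origin of the sweep, that angles are measured counter-clockwise from the positive x axis, and that a customer whose time window does not fit the current vehicle is passed on to the next one; none of these details is confirmed by the text.

```python
import math

def sweep_clusters(depot, customers, vehicles):
    """Assign customers to vehicles in order of polar angle around the depot.

    customers: {id: (x, y, demand, tw_open, tw_close)}   (hypothetical layout)
    vehicles:  [(capacity, shift_start, shift_end)]
    Returns (clusters, leftover); clusters follow vehicles sorted by shift start.
    """
    def angle(c):
        dx, dy = customers[c][0] - depot[0], customers[c][1] - depot[1]
        return math.atan2(dy, dx) % (2 * math.pi)

    order = sorted(customers, key=angle)              # initialization: sort by angle
    clusters = []
    for capacity, start, end in sorted(vehicles, key=lambda v: v[1]):
        members, remaining, rest = [], capacity, []
        for idx, c in enumerate(order):
            _, _, demand, opens, closes = customers[c]
            if not (start <= opens and closes <= end):
                rest.append(c)                        # window does not fit this shift
                continue
            if demand > remaining:
                rest.extend(order[idx:])              # capacity exhausted: close cluster
                break
            members.append(c)
            remaining -= demand
        clusters.append(members)
        order = rest
    return clusters, order
```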
Since we have two algorithms to compute the clusters, we compare their average distances and select the clustering with the smaller average distance between all customers. The outputs of this model are the sets of clusters, each representing the customers that a worker or vehicle must visit in each period. The next section describes the routing model that determines the sequence in which the customers are visited.
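The comparison between the two clusterings can be sketched as below. The exact definition of the average distance is not recoverable from the text, so this example assumes it is the mean, over clusters, of the average Euclidean distance between every pair of customers in a cluster; the function names are illustrative.

```python
import math
from itertools import combinations

def average_pairwise_distance(points):
    """Average Euclidean distance over all pairs of points (0.0 if fewer than two)."""
    pairs = list(combinations(points, 2))
    if not pairs:
        return 0.0
    return sum(math.dist(p, q) for p, q in pairs) / len(pairs)

def clustering_score(clusters):
    """Assumed measure: mean of the per-cluster average pairwise distances."""
    return sum(average_pairwise_distance(c) for c in clusters) / len(clusters)

def pick_clustering(candidates):
    """candidates: {name: list of clusters}; returns the name with the lowest score."""
    return min(candidates, key=lambda name: clustering_score(candidates[name]))
```

For example, `pick_clustering({"centroid": c1, "sweep": c2})` returns the label of the clustering whose customers are more concentrated under this measure.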

Routing model
With the results of the clustering phase, we have a set of customers to be arranged in paths in order to be served by the crew of vehicles. Because each vehicle is solved separately, the resulting routing model is a traveling salesman problem with time windows (TSPTW). The next sections explain the mathematical formulation of the routing model and the solution procedure, which is based on the Ant Colony Optimization (ACO) metaheuristic.

Problem formulation
Given the set of customers allocated to a period and a vehicle by the clustering model, the vehicle must serve those customers; thus we face a routing problem. Let $N$ be this set of customers. The problem is defined on the auxiliary directed graph $G' = (V', A')$, with $V' = N \cup \{0, n+1\}$ and $A' = \{(0, j) : j \in N\} \cup \{(i, n+1) : i \in N\} \cup \{(i, j) : i, j \in N, i \neq j\}$, where vertices $0$ and $n+1$ denote the depot. Node $n+1$ is a copy of depot $0$ and is introduced for the sake of clarity in the mathematical formulation presented hereafter. We assume a zero service time at the depot, i.e., the service time at vertices $0$ and $n+1$ is zero.
This problem consists of designing a route for each vehicle such that:
• Each customer is visited exactly once.
• The time window of each customer is met, i.e., the vehicle must arrive in time to start service no later than the upper bound of the window, and the customer will not be serviced before the beginning of the window.
• The time window of the vehicle, which represents its labor period, is met.
The problem is formulated as a TSPTW. It uses binary variables $x_{ij}$ that take the value 1 only if the vehicle traverses arc $(i, j) \in A'$, where $i \neq j$, and 0 otherwise. A second set of variables defines the start time of service at each customer. Finally, a third set of variables represents the accumulated load at each customer. For each vehicle and each period in the planning horizon, the mathematical model consists of objective function (11) subject to constraints (12) to (21).

The objective function (11) minimizes the total travel time of the visits performed by each vehicle. Constraints (12) ensure that each customer is visited only once. Constraints (13) preserve flow between trips in a route, i.e., after visiting one customer, a vehicle must immediately start another trip. Constraints (14) and (15) assign values to the accumulated load variables and prevent sub-tours. Constraints (16) state the limited capacity of the vehicle. Constraints (17) establish the relationship between the vehicle's departure time from a customer and the start time at its immediate successor. Constraints (18) and (19) enforce the time windows of the customers and the vehicles. Constraints (20) impose binary conditions on the flow variables. Finally, constraints (21) define the accumulated load variables as nonnegative.
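To make the time-window logic of the model concrete, the sketch below evaluates one candidate visiting order for a single vehicle. It is only an illustration of the requirements described above (waiting when the vehicle is early, rejecting late arrivals, respecting the vehicle's labor period); the function name, the matrix layout with the depot at index 0, and the uniform `service_time` parameter are assumptions.

```python
def evaluate_route(route, travel, windows, shift, service_time=0.0):
    """Return (feasible, total_travel_time, start_times) for one vehicle.

    route:   customer indices in visiting order; the depot is index 0
    travel:  square matrix of travel times
    windows: {customer: (open, close)}
    shift:   (start, end) labor period of the vehicle
    """
    clock, previous, total = shift[0], 0, 0.0
    starts = {}
    for customer in route:
        arrival = clock + travel[previous][customer]
        opens, closes = windows[customer]
        start = max(arrival, opens)      # wait if the vehicle arrives early
        if start > closes:
            return False, None, {}       # customer window missed
        starts[customer] = start
        total += travel[previous][customer]
        clock, previous = start + service_time, customer
    total += travel[previous][0]         # return to the depot
    if clock + travel[previous][0] > shift[1]:
        return False, None, {}           # vehicle labor period exceeded
    return True, total, starts
```

A route builder can call such a check on partial or complete paths to discard orders that violate a time window.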

Solution procedure
With the previous procedure we allocate a set of customers to each period in the planning horizon; as a result, we can simplify the problem to a TSPTW with capacity constraints and schedule capacity. To solve this problem we propose an ACO metaheuristic, since the TSPTW is NP-hard, i.e., it belongs to the combinatorial optimization problems that are considered difficult to solve. According to Glover and Kochenberger (2003), ACO has had several successful implementations in a wide variety of combinatorial optimization problems. In these problems, ACO algorithms are generally coupled with additional features, such as problem-specific local optimizers, that take the ant solutions to a local optimum (Glover & Kochenberger, 2003). Some recent examples of successful applications of the ACO metaheuristic to the TSPTW can be found in the literature (e.g., Kara & Derya, 2015; López-Ibáñez & Blum, 2010). In addition, if we consider that the ACO metaheuristic has also been used for the VRPTW, more applications can be found (e.g., Cheng & Mao, 2007; Ding et al., 2012; Pureza et al., 2012; Yu et al., 2011).

Cheng and Mao (2007) propose an ACO to solve the TSPTW. The authors developed a modified ant algorithm called ACS-TSPTW (Ant Colony System-TSPTW), based on the ACO technique. Two local heuristics are integrated in the ACS-TSPTW algorithm to manage the time-window constraints. Table 3 summarizes the main notation used for the proposed ACO.

Table 3
Notation used for the proposed ACO to solve the TSPTW
$\tau_{ij}(t)$: pheromone from node $i$ to node $j$ during iteration $t$
$\Delta\tau_{ij}(t)$: pheromone update from node $i$ to node $j$ during iteration $t$
$A$: set of arcs $(i, j)$
$J$: paths that are missing from the current ant
$g_{ij}$: heuristic that increases the importance of nodes near their closing time
$h_{ij}$: heuristic that increases the importance of nodes that do not require waiting time
$P^*$: best current path
$L^*$: travel time of the best current path

According to Cheng and Mao (2007), the ACO algorithm defines the amount of pheromone deposited on each arc $(i, j) \in A$ through a local and a global update. The local update is computed from $\tau_{ij}(t)$, the pheromone quantity on arc $(i, j)$ at time $t$, the local evaporation rate, which lies between 0 and 1, and the pheromone increase $\Delta\tau_{ij}$ on arc $(i, j)$, which is determined by $L^*$, the distance traveled along the best path $P^*$. The global update uses the global evaporation rate, which also lies between 0 and 1. The transition rule, by which an ant chooses the next node to move to, is given by Eq. (26).
Two local heuristics are defined by Eq. (28) and Eq. (29). Two user-defined parameters determine the importance of each heuristic, and Eq. (27) defines the transition probability rule, which is evaluated over the set of nodes that the ant has not yet visited in this tour.
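Because the displayed update and transition equations are not legible in this copy, the sketch below uses the standard Ant Colony System forms as a stand-in: a pseudo-random proportional transition rule with threshold `q0`, and an update of the form "new = (1 - rate) * old + rate * deposit". The names `choose_next`, `update_pheromone`, `beta`, `gamma` and `q0` are assumptions for illustration and may differ from Eqs. (26) and (27) of the paper.

```python
import random

def choose_next(current, unvisited, tau, g, h, beta, gamma, q0, rng=random):
    """Pick the next node with an ACS-style pseudo-random proportional rule.

    tau, g, h: nested mappings indexed as m[i][j] (pheromone and two heuristics)
    beta, gamma: assumed exponents weighting the two heuristics
    q0: probability of greedily taking the best-scoring node
    """
    scores = {j: tau[current][j] * g[current][j] ** beta * h[current][j] ** gamma
              for j in unvisited}
    if rng.random() <= q0:
        return max(scores, key=scores.get)          # exploitation
    total = sum(scores.values())
    if total == 0:
        return rng.choice(list(unvisited))          # no information: uniform choice
    threshold, running = rng.random() * total, 0.0
    for j, score in scores.items():                 # roulette-wheel selection
        running += score
        if running >= threshold:
            return j
    return j

def update_pheromone(tau, arcs, rate, deposit):
    """Evaporate and deposit on the given arcs: (1 - rate) * old + rate * deposit."""
    for i, j in arcs:
        tau[i][j] = (1 - rate) * tau[i][j] + rate * deposit
```

In standard ACS the local update calls this with the initial pheromone level as `deposit`, while the global update uses the arcs of the best path and a deposit of 1 divided by its length.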
The first local heuristic increases the importance of nodes near their closing time. It is defined in terms of the time remaining until node $j$ closes, i.e., the closing time of node $j$ minus the current time, together with a parameter that controls the probability curve and a reference value defined as the average of the non-negative remaining times. The intuition behind this heuristic is that the ant should visit nodes whose arrival times are close to their upper time-window bounds before nodes with later upper bounds, in order to avoid the risk of lateness (Cheng & Mao, 2007).

The second local heuristic increases the importance of nodes near their opening time. It is defined in terms of the time remaining until node $j$ opens, i.e., the opening time of node $j$ minus the current time, together with a parameter that controls the probability curve and a reference value $\nu$ defined as the average of the non-negative remaining times. Note that this heuristic takes the value 1 when the remaining time is negative, i.e., nodes that are already open are given the highest priority.
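One way to realize heuristics with the behavior described above is a decreasing logistic curve in the remaining time. The sketch below is an assumption about the functional shape, not a transcription of Eqs. (28) and (29); the parameter names `slope` and `midpoint`, and the value 0 for a node that has already closed, are illustrative choices.

```python
import math

def _logistic_decay(value, slope, midpoint):
    """Decreasing logistic curve; the exponent is clamped to avoid overflow."""
    exponent = min(slope * (value - midpoint), 700.0)
    return 1.0 / (1.0 + math.exp(exponent))

def closing_priority(close_time, now, slope, midpoint):
    """Higher when little time remains before the node closes."""
    remaining = close_time - now
    if remaining < 0:
        return 0.0          # assumption: a closed node gets no priority
    return _logistic_decay(remaining, slope, midpoint)

def opening_priority(open_time, now, slope, midpoint):
    """Equal to 1 for nodes that are already open, lower when a wait is needed."""
    remaining = open_time - now
    if remaining < 0:
        return 1.0          # open nodes are given the highest priority
    return _logistic_decay(remaining, slope, midpoint)
```

With this shape, `midpoint` plays the role of the average remaining time and `slope` controls how sharply the priority falls as the remaining time grows.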
Algorithm 3 introduces the basic processing steps of the proposed ACO. The first step initializes the variables. The second step builds a path according to the two local heuristics. The third step saves the best path found so far and its travel time, and updates the pheromone. The fourth step evaluates the stopping criterion: if it is met, the algorithm stops and reports the solution; otherwise, the second step is run again and the iterative process continues until the stopping criterion is reached. For the initial amount of pheromone on each arc, Cheng and Mao (2007) suggest basing it on an approximate estimate of the tour length. The stopping criterion is a maximum number of iterations.
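The four steps can be arranged in a loop such as the one sketched below. This is a generic skeleton rather than Algorithm 3 itself: `construct` and `cost` are caller-supplied functions (hypothetical names), the pheromone matrix is a list of lists, and the local and global updates follow the standard Ant Colony System pattern, which is an assumption.

```python
def run_aco(construct, cost, tau, tau0, local_rate, global_rate, n_ants, max_iterations):
    """Generic ACO loop: build paths, keep the best, update pheromone, repeat.

    construct(tau) -> list of nodes (a path) or None if no feasible path was built
    cost(path)     -> travel time of the path
    tau            -> pheromone matrix, modified in place (step 1 is done by the caller)
    """
    best_path, best_cost = None, float("inf")
    for _ in range(max_iterations):                  # step 4: iteration limit
        for _ in range(n_ants):
            path = construct(tau)                    # step 2: build a path
            if path is None:
                continue
            for i, j in zip(path, path[1:]):         # local update on used arcs
                tau[i][j] = (1 - local_rate) * tau[i][j] + local_rate * tau0
            path_cost = cost(path)
            if path_cost < best_cost:                # step 3: keep the best path
                best_path, best_cost = path, path_cost
        if best_path is not None:
            for i, j in zip(best_path, best_path[1:]):   # global update on best path
                tau[i][j] = (1 - global_rate) * tau[i][j] + global_rate / best_cost
    return best_path, best_cost
```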

Results
In this section, we illustrate the results of our proposed method for scheduling and routing in courier services obtained in a real world case.All tests in this work were run using Java 8 on a Windows 8 64bit machine, with an Intel i5 3337 processor (2×1.8GHz) and 6 GB of RAM.

Case study
We illustrate the performance of the proposed method with an example based on a real-world scenario, in which a courier company in Bogotá has a crew of more than two hundred motorcycle operators who deliver packages in the city. The company has three operating centers, from which approximately 2,100 packages per day are delivered. That is 700 packages on average for each center, with 20 to 32 motorcycle operators, each of whom can carry from 10 to 60 packages per day according to a schedule. We assume a planning horizon of three days.
The packages to be delivered to the customers have a set of attributes that determine, according to priorities, which packages are selected for scheduling. In the real case study the packages have more than 60 attributes; however, many of them are informative and do not affect the rules of our ES. Table 4 lists the set of attributes considered in the case study. For the sample, we requested from the courier company a database with the packages that were in its system on September 20, 2016. The sample received contained 17,492 packages. Fig. 4 shows the geographical dispersion of the packages on the map of Bogotá. We assume a capacity of 2,700 visits per day and 75 vehicles. Table 5 shows the capacity, as the number of visits per day, and the time window of each vehicle.

Results of case study
In this section, we report some of the results for the scheduling, clustering and routing models of the proposed method. Finally, we compare the results of the proposed method with those of the current method in the case study.

i. Results of scheduling model
Section 4 described the structure of our ES for the scheduling phase. For the ES we use the Jess rule engine library (available at http://www.jessrules.com/jess/index.shtml). We apply the scheduling model over a planning horizon of three days. The performance measure is the average distance between all customers for each day. Table 6 presents the results of the scheduling phase for the three days of the case study. Fig. 5 shows the geographical distribution of the customers for each day.

ii. Results of clustering model
For the clustering phase, we apply both algorithms, centroid-based and sweep, for each day in the planning horizon. Table 7 shows the average distance between all customers. The average distance obtained with the centroid-based algorithm is on average 62% less than that of the sweep algorithm, which indicates that the centroid-based algorithm found better results than the sweep algorithm for all days. To illustrate our results, Fig. 6 shows ten clusters generated with both algorithms: a) centroid-based and b) sweep. We can observe that the clusters of the centroid-based algorithm are more concentrated, while the clusters of the sweep algorithm are scattered. In addition, Table 8 shows the average distance of each cluster generated with the centroid-based algorithm for Day 1.

iii. Results of routing model
For the routing phase we apply the ACO algorithm described in Section 6 to each cluster. To set the parameters of the ACO algorithm, we use the values suggested by Cheng and Mao (2007), as follows: 3, 0.99, 0.1, 0.5, 3 and 0.05.
Table 9 shows the total routing distance for each day, the number of customers scheduled, and the average distance between customers in the route. A total of 2700 customers were not scheduled because 5% of the paths were infeasible owing to the time windows.

Table 9
Results of ACO algorithm in routing phase for three days in the case study

Table 10 shows the detailed results for the paths of the 73 vehicles in Day 1. To illustrate our results, Fig. 7 shows path 68 in Day 1. The visiting order is represented by colors and numbers.

iv. Summary and comparison
We compare the average distance between the customers for each phase. Table 11 shows the results of each phase. The average distance between the customers decreases in each phase for all days. This result indicates that the customers are concentrated, and thus the routing distance is reduced. Likewise, we compare the service level, measured as the fulfillment of the delivery date promised to the customers. In the company, the current service level is between 80% and 85%. Our approach obtains an average of 93%.
The improvement is 11%.Table 12 summarizes these results for each day.

Conclusions and future work
This work has presented a hybrid expert system, clustering, and ant colony optimization approach to solve a scheduling and routing problem in courier services, where a set of geographically distributed customers is scheduled to be served over a planning horizon and a crew of operators is routed to visit the customers' sites and deliver packages or documents. The contribution of this paper is twofold. From the scheduling and routing point of view, the problem is close to VRPs but exhibits some distinctive aspects, such as the presence of time windows for both customers and vehicles, due dates for the customers, multiple periods, and distance constraints. From the solution method point of view, courier services face large-scale problems for which no exact method can find a good solution in a short time. We therefore propose an integration of a knowledge-based model with heuristic and metaheuristic techniques, which uses the knowledge available in the courier service to identify and classify the customers, and clustering and routing methods to allocate the available crew to the customers.
We proposed a solution approach that integrates a knowledge-based model with heuristic and metaheuristic techniques in a three-phase procedure. In the first phase, a scheduling model determined the visit date for each customer over the planning horizon, considering the release date, the due date, and the travel times between the customers and the depot. We used an expert system to solve this problem, seeking to minimize the average distance between the customers. In the second phase, a clustering model used two algorithms: a centroid-based algorithm and a sweep algorithm. For each period, the model built a set of groups of customers according to the travel times, the maximum load capacity, and the customers' time windows. Finally, in the third phase, a routing model assigned the customers of each cluster to one operator and found the sequence in which to visit them, taking into account the customers' time windows and the available time of the operators. We solved a traveling salesman problem with time windows using an ACO algorithm.
Results on a real-world case inspired by a courier company have illustrated the application of our proposed procedure for allocating customers to the crew. Our proposed method sought to reduce the average distance between the customers, used as a performance measure of the concentration of the customers. These results suggest that as the average distance is reduced, the concentration increases; thus the routing distance is shorter and the customers are served with a better service level. The results of the case study suggest that our proposed method could outperform a procedure based on the intuition of the planners, which includes neither the expert system nor the clustering algorithms.
Future work should focus on extending the problem to a dynamic setting, in which unexpected customer orders or changes in operator availability occur and it is necessary to reschedule in order to serve the customers. It is also possible to incorporate additional constraints, such as the technical skills of the operators or uncertainty in the travel times and visits. Another opportunity to improve the solution is to incorporate other features in the expert system, such as a learning module that adds more information to the knowledge base, or other classification methods based on fuzzy logic to manage imprecision and uncertainty. For instance, the priority could be treated as a linguistic variable represented as a fuzzy number, and a fuzzy inference system could determine which packages to schedule. In addition, to improve the execution time, it is possible to develop a parallelizable algorithm that could be distributed among several computers (or multiple cores), thus reducing the computation time for all phases of our approach.
Appointment: date of the next appointment
Real status: process state in which the package is located
Manageable stock: internal rating
Real reason: qualification of the last visit that was made
Telephone management rating: last call rating

Table A1 lists the set of rules that are used in the Module of Static Rules of the expert system in the scheduling phase.

Table A1
List of rules of Module of Static Rules

For the Module of Appointments, only the following rule applies: "If the appointment date is contained in the planning horizon then change route status to scheduled and make route date equal to appointment date".
The rules of Module 3, which were built from the experience of the pseudo-experts, use the concept of forward chaining. For this reason they were ordered from the rule with the strictest conditions to the rule with the most flexible ones, in order to prioritize the most important and urgent packages. Table A2 describes the rules of the Module of Dynamic Rules.

Table A2
List of rules of Module of Dynamic Rules
1. If there is available capacity and the customer is close to 5 or more customers with an appointment and the package is expired and has 0 visits or less and has more than 5 calls then change from pending to scheduled.
2. If there is available capacity and the customer is close to 4 or more customers with an appointment and the package is expired and has 0 visits or less and has more than 4 calls then change from pending to scheduled.
3. If there is available capacity and the customer is close to 3 or more customers with an appointment and the package is expired and has 1 visit or less and has more than 3 calls then change from pending to scheduled.
4. If there is available capacity and the customer is close to 2 or more customers with an appointment and the package is expired and has 1 visit or less and has more than 2 calls then change from pending to scheduled.
5. If there is available capacity and the customer is close to 1 or more customers with an appointment and the package is expired and has 2 visits or less and has more than 1 call then change from pending to scheduled.
6. If there is available capacity and the customer is close to 0 or more customers with an appointment and the package is expired and has 2 visits or less and has more than 0 calls then change from pending to scheduled.
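The ordered evaluation of these rules, from the strictest to the most flexible, can be sketched in plain Python as below. The case study uses the Jess rule engine; this example is only a language-neutral illustration, and the dictionary keys (`nearby_appointments`, `visits`, `calls`, `expired`, `status`) are hypothetical field names.

```python
# (minimum nearby customers with appointment, maximum visits, calls strictly above)
DYNAMIC_RULES = [(5, 0, 5), (4, 0, 4), (3, 1, 3), (2, 1, 2), (1, 2, 1), (0, 2, 0)]

def apply_dynamic_rules(packages, capacity):
    """Schedule pending packages rule by rule until the capacity is used up.

    packages: list of dicts with hypothetical keys id, status, expired,
              nearby_appointments, visits, calls
    Returns the ids scheduled, in the order in which the rules fired.
    """
    scheduled = []
    for near, max_visits, min_calls in DYNAMIC_RULES:    # strictest rule first
        for package in packages:
            if len(scheduled) >= capacity:
                return scheduled                         # no capacity left
            if (package["status"] == "pending"
                    and package["expired"]
                    and package["nearby_appointments"] >= near
                    and package["visits"] <= max_visits
                    and package["calls"] > min_calls):
                package["status"] = "scheduled"
                scheduled.append(package["id"])
    return scheduled
```

Because the strictest rules are evaluated first, the most urgent packages take the available capacity before the more flexible rules are tried.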
3. Routing model: The inputs consist of the outputs of the clustering model, the travel times, the customers' time windows, and the vehicles' time windows. The routing model is an individual optimization process that determines the order in which each vehicle will visit all the customers assigned to it during the clustering process, taking into account the time windows of the customers and the available time of the vehicles. The output is a route for each vehicle, with the schedule detailed as the start time of service at each customer.

Fig. 1. Proposed solution approach to scheduling and routing in courier services

Fig. 4. Geographical dispersion of sample of case study

Fig. 5. Geographical distribution of customers allocated in scheduling phase

Fig. 6. Example of geographical distribution of ten clusters generated with a) centroid-based algorithm, and b) sweep algorithm

Each rule below has the form "If the package has <condition> then change from pending to dismissed".

Appointment:
- an appointment date after the planning horizon

Real status:
- Route Assigned
- Charged and verified
- Packed
- Delivered
- City Forwarded

Manageable stock:
- Not manageable stock

Real reason:
- Non covered city
- Customer has not ready documents
- Incomplete address 2
- Incomplete delivery address
- Address 2 does not exist
- Delivery address does not exist
- Complete management
- Deficient identification of the final user
- Non localized delivery
- Moved
- Entity request
- Final user request
- Rejected
- No request rejection
- Stock rejection
- Destroyed rejection
- Phone rejection
- Declined
- Unspecified time leaving
- Holder without details
- Uncovered zones

Telephonic management rating:
- Customer calls entity
- Customer has not ready documents
- Wrong customer names
- Returned to entity
- No ID
- Rejected
- Rejection because no needs
- Unspecified time leaving
- Return request
- Minor child owner

Table 1
Literature of application of VRP in courier services

Table 2
Notation used for scheduling, clustering and routing models
- demand of customer j
- lower bound of the time window for each customer
- upper bound of the time window for each customer
- lower bound for the available time of each vehicle
- upper limit for the available time of each vehicle
- maximum work time for each vehicle
- travel time for each arc
- release date of the visit with each customer
- due date to visit each customer
- average visit capacity of the vehicles per hour

Table 4
List of rules attributes of the case study

Table 5
Capacity and time window of vehicles in case study

Table 6
Results of scheduling phase for three days in the case study

Table 7
Results of clustering phase for three days in the case study

Table 8
Results of centroid-based algorithm for Day 1

Table 10
Results of ACO algorithm in routing phase for three days in the case study

Table 11
Summary of average distance between customers for each phase in the case study

Fig. 7. Example of geographical distribution of path 68 in Day 1

Table 12
Summary of service level of solution approach and actual results in the case study