Bai, Ruibin and Woodward, John R. and Subramanian, Nachiappan and Cartlidge, John (2018) Optimisation of transportation service network using κ-node large neighbourhood search. Computers & Operations

The Service Network Design Problem (SNDP) is generally considered as a fundamental problem in transportation logistics and involves the determination of an efﬁcient transportation network and corresponding schedules. The problem is extremely challenging due to the complexity of the constraints and the scale of real-world applications. Therefore, efﬁcient solution methods for this problem are one of the most important research issues in this ﬁeld. However, current research has mainly focused on various sophisticated high-level search strategies in the form of different local search metaheuristics and their hybrids. Little attention has been paid to novel neighbourhood structures which also play a crucial role in the performance of the algorithm. In this research, we propose a new efﬁcient neighbourhood structure that uses the SNDP constraints to its advantage and more importantly appears to have better reachability than the current ones. The effectiveness of this new neighbourhood is evaluated in a basic Tabu Search (TS) metaheuristic and a basic Guided Local Search (GLS) method. Experimental results based on a set of well-known benchmark instances show that the new neighbourhood performs better than the previous arc-ﬂipping neighbourhood. The performance of the TS metaheuristic based on the proposed neighbourhood is further enhanced through fast neighbourhood search heuristics and hybridisation with other approaches.

A C C E P T E D M A N U S C R I P T day 1 day 2 day 3 day 4 day 5 day 6 day 7 n1 n2 n3 Fig. 1 An example of a time-space network with 3 nodes and 7 periods. independent heuristic approaches that do not take specific advantage of a problem's underlying low-level solution structure. Examples of high-level strategies for more efficient search include the tabu-assisted guided local search by Bai et al (2012) and the hybrid tabu search with path-relinking method by Minh et al (2013). Analysis of the problem solution structure and its constraints is very limited. As indicated in Kendall et al (2016), a lot of optimization research studies merely borrow different metaphors without much deep insights on algorithmic or problem properties. These approaches do not satisfy real-world requirements either in terms of solution quality delivered or in computational time required. This is because the SNDP contains some difficult constraints and a flow distribution sub-problem, generally referred to as the Capacitated Multicommodity Min-Cost Flow (CMMCF) problem, which can be very expensive to solve if it is called many times within an iterative metaheuristic approach. This motivates us to develop more efficient metaheuristics for this important and challenging sub-problem. Therefore, unlike the above papers which focus on high-level strategies, in this paper, we propose and study a new larger neighbourhood that exploits the special structure of the SNDP constraints and has much better reachability due to the implicit constraint handling. The experiments on two basic metaheuristic approaches and a hybrid algorithm show that the new neighbourhood is very effective and could be used to develop more efficient algorithms for the SNDP.
The remainder of the paper is structured as follows: Section 2 provides a brief introduction to the SNDP and an overview of the research in freight service network design. Section 3 presents the arc-node based mathematical formulation for SNDP. Section 4 discusses the neighbourhood structure used in the previous studies. Section 5 describes the proposed κ-node neighbourhood operator whose performance is evaluated in Section 6 through a basic Tabu Search (TS) method and a basic Guided Local Search (GLS) method. Section 7 describes a hybrid algorithm based on the κ-node neighbourhood. Section 8 concludes the paper.

Literature Review
This section provides a brief overview of the previous research into SNDP which is closely related to classic network flow problems (Ahuja et al, 1993). Comprehensive reviews can be found in Crainic (2000); Crainic and Kim (2007); Wieberneit (2008).
Early work in this field includes Crainic and Rousseau (1986); Powell (1986); Crainic and Roy (1988). Crainic et al (1993) applied a TS metaheuristic to the container allocation/positioning problem. Crainic et al (2000) investigated a hybrid approach for capacitated multicommodity network design (CMND), combining a TS method with pivot-like neighbourhood functions and column generation. Ghamlouche et al (2003) continued the work and proposed a more efficient cycle-based neighbourhood function for CMND. Experiments with a simple TS framework demonstrated the superiority of the method to the earlier pivotlike neighbourhood functions in Crainic et al (2000). This approach was later enhanced by adopting a path-relinking mechanism (Ghamlouche et al, 2004). Barnhart et al (2002) addressed a real-life air cargo express delivery SNDP. The problem instances are characterised by their large sizes and the addition of further complex constraints to those in the general SNDP model. A tree formulation was introduced and the problem was solved heuristically using a method based on column generation. Armacost et al (2002) introduced a new mathematical model based on an innovative concept called the composite variable, which has a better Linear Programming bound than other

A C C E P T E D M A N U S C R I P T
Optimisation of Transportation Service Network Using κ-node Large Neighbourhood Search 5 models. A column generation method using this new model was able to solve the problem successfully within a reasonable computational time, taking advantage of the specific problem details. However, it may be difficult to generalise their model to other freight transportation applications, especially when there are several classes of services being planned simultaneously. Pedersen et al (2009) studied more generic SNDP in which a set of asset balance constraints was added to model the requirements that the number of incoming vehicles at each node must equal to the outgoing vehicles in order to maintain the continuity of freight services over time. A multi-start metaheuristic, based on TS, was developed and shown to outperform a commercially available MIP solver when computational time was limited to one hour per instance. Andersen et al (2009) compared the node-arc based formulation, the path-based formulation and a cycle-based formulation for SNDPs. Computational results on a set of small randomly generated instances indicated that the cycle-based formulation gave significantly stronger bounds and hence may allow for much faster solution methods of problems. More recent work by Bai et al (2012) attempted to further reduce the computational time and investigated a Guided Local Search (GLS) based hybrid approach. The computational study showed that GLS was able to obtain better solutions than Tabu Search (TS) but with less than two thirds of the computational time. However, GLS in that study was based on an arc-flipping neighbourhood which sometimes leads to poor solutions. Other methods of approaching SNDP have included ant colony and a branch and price method. Barcos et al (2010) investigated an ant colony optimization approach to address a simplified variant of freight SNDP. The algorithm was able to obtain solutions better than those adopted in the real-world within a reasonable computational time. Andersen et al (2011) studied a branch and price method for the SNDP. Although the proposed algorithm was able to find solutions of higher quality than the previous methods, the 10-hour computational time required by the algorithm poses a great challenge for practical applications.
Variants of SDNP have also been studied. Hoff et al (2010) investigated a variable neighbourhood search based metaheuristic approach for the service network design with stochastic demand, a problem sharing similar structure to SNDP. However, the neighbourhood functions used in their approach are mainly based on path oriented operators which, like the arc-flipping operator, have limitations in dealing with asset balance constraints. Alumur et al (2012) studied a heuristic approach for the simultaneous optimisation of hub locations and the service network. A multi-period supply chain network design problem was studied in Carle et al (2012) and an agent-based metaheuristic was proposed based on the idea of asynchronous cooperation between agents. Nickel et al (2012) studied a stochastic supply network design problem with financial decisions and risk management for which the authors only managed to solve small instances. Heuristic approaches appear to be the most promising methods for these types of problems. Yaghini et al (2012) proposed a simulated annealing metaheuristic for the CMND problem without asset-balance constraints. The approach utilised a neighbourhood structure based on the pivoting rules of the Simplex method in order to speed up the search. A multiobjective evolutionary algorithm was proposed for this same problem in Kleeman et al (2012). However, these metaheuristics do not necessarily perform well on SNDP due to the presence of the asset-balance constraints. Bai et al (2014) studied a stochastic service network design problem with rerouting. In Bai et al (2015), a service network design formulation was used to obtain the lower bound of a multi-shift full truckload transportation problem.
It can be seen that the aforementioned research mainly focused on either new models to better capture the complexities of the real-world freight transportation problems or new generic strategies to search the solution space more efficiently. However, limited research has been done to investigate new neighbourhood functions to tackle the difficult constraints and expensive flow distribution sub-problems. The goal of this paper is to address this gap by studying a new neighbourhood structures for SNDP. The effectiveness of the new structure is evaluated in two basic metaheuristic approaches (TS and GLS) and a hybrid method for a set of well-known SNDP benchmark instances.

The Freight SNDP Problem and Model
The SNDP is an important tactical/operational freight transportation planning problem. It is of particular interest for less-than truck load transportation and express delivery services, where consolidation of deliveries is widely adopted in order to maximise the utilisation of freight resources (Crainic, 2000). The SNDP involves the search for optimal or near-optimal service characteristics, including the selection of routes and the vehicle types for each route, the service frequency and the delivery timetables, the flow distribu- The set of arcs in the network. G = (N , A ) A directed graph with nodes N and arcs A .
The arc from node i to j. u i j The capacity of arc (i, j). f i j The fixed cost of arc (i, j). K The set of commodities. o(k) The origin (source) of commodity k ∈ K . s(k) The sink (destination) of commodity k. d k The flow demand of commodity k. c k i j The variable cost for shipping a unit of commodity k on the arc (i, j).
The amount of flow of commodity k on the arc (i, j). y i j The network design variables.
The vector of all flow decision variables, i.e.
The set of outward neighbouring nodes of node i.
The set of incoming neighbouring nodes of node i.
The objective of SNDP model, which represents the sum of the fixed cost and the variable cost for given solution vectors x and y, or expressed in terms of a potential solution s. g(s), g(x, y) The objective function which is actually solved, including a penalty for infeasibility, expressed in terms of a potential solution s or the decision variable component vectors x and y of s.
tion paths for each commodity, the consolidation policies, and the idle vehicle re-positioning, so that legal, social and technical requirements are met (Wieberneit, 2008). The SNDP differs from the Capacitated Multicommodity Network Design (CMND) problem, a wellknown NP-Hard problem, in that it has an additional source of complexity due to the required balance constraint for freight assets in order to ensure that vehicle routes are contiguous and that vehicles are in the correct positions after each planning cycle.
The problem of concern in this paper can be formulated in several ways. We used a node-arc based model described in Pedersen et al (2009) and also present it here for completeness. The list of notation used in the model is given in Table 1.
Let G = (N , A ) denote a directed graph with nodes N and arcs A . Let (i, j) denote the arc from node i to node j. Let K be the set of commodities. For each commodity k ∈ K , let o(k) and s(k) denote its origin and destination nodes, respectively. Let y i j be boolean decision variables, where y i j = 1 if arc (i, j) is used in the final design and 0 if it is not used. Let x k i j denote the flow of commodity k on arc (i, j). Let u i j and f i j be the capacity and fixed cost, respectively, for arc (i, j). Finally, let c k i j denote the variable cost of moving one unit of commodity k along arc (i, j). The SNDP can then be formulated as follows: minimise where x k i j ≥ 0 and y i j ∈ {0, 1} are the decision variables. The network capacity constraint (2) ensures that the maximum capacity of arc (i, j) is not violated. The flow conservation constraint (3) ensures that the entire flow of each commodity is delivered to its destination, where N + (i) denotes the set of outward neighbours of node i and N − (i) the set of inward neighbours. b k i is the outward flow of commodity k for node i, so we set Optimisation of Transportation Service Network Using κ-node Large Neighbourhood Search 7 asset-balance constraint, which is missing from the standard CMND formulation, as discussed in section 2 and which ensures the balance of transportation assets (i.e. vehicles) at the end of each planning period. For a given design variable vector y =< y 00 , ..., y i j , ... >, the problem becomes one of finding the optimal flow distribution variables. Constraint (4) is no longer relevant and the flow must be zero on all closed arcs, so only open arcs have to be considered in the model. Let A denote the set of open arcs in the design vector y and N be the set of nodes in A , then flow distribution variables (x k i j ) for all open arcs ((i, j) ∈ A ) can be obtained by solving the following CMMCF problem, where Constraint (6) ensures the total flow on each open arc in A is no more than its capacity. Constraint (7) is same as the constraint (3) which is the flow conservation constraint.

A Revisit of Previous Heuristic Approaches
In the previous efforts (Pedersen et al, 2009;Bai et al, 2012), neighbourhood search functions were primarily based on single arc state-flipping (otherwise referred to as arc adding/dropping) with the flow distribution handled separately either heuristically (based on a residual graph) or optimally by solving the corresponding CMMCF problem using an LP solver. Interested readers are referred to Pedersen et al (2009) for more details of this neighbourhood structure. However, one drawback of this neighbourhood is the inability to maintain solution feasibility in terms of asset-balance constraints. For a feasible solution satisfying the asset-balance constraints, flipping the state of a single arc will typically generate an infeasible solution (i.e. violating constraint (4)). Let us take a simple network in Figure (2012), one could generate 12 neighbouring solutions. Unfortunately none of them is feasible due to asset balance constraint violations. For example, opening arc (1,5) will lead to vehicle imbalance at both nodes 1 and 5. Similarly, closing arc (2,1) will lead to asset-balance constraint violations at nodes 1 and 2. In Pedersen et al (2009);Bai et al (2012), this constraint violation issue was addressed by using a special feasibility-recovery procedure at the end of each local search phase. Although effective in finding a feasible solution, the method may suffer from performance issues when the feasibility-recovery procedure leads to a large increase in costs, and hence inferior solutions.
Another major drawback of the arc-flipping neighbourhood function is the reachability in the search space. Observations from experimental tests in Bai et al (2012) show that considerable number of neighbourhood moves are rejected during the search and local search methods (both TS and GLS) tend to get stuck at local optima. It appears that this neighbourhood function struggles to reach certain regions of the search space regardless of the number of iterations permitted. This observation explains why the "multiple starts" used in Pedersen et al (2009) Figure 2.(b) requires closing two arcs 4→3 and 3→2 and opening arc 4→2. Since only one arc can be modified at each neighbourhood move (excluding arcs that are modified during the flow redistribution procedure), in theory it is possible to move to the neighbouring solution in Figure 2.(b) through 3 successive operators. In practice the success rate of such a move could be extremely low since the first two moves will result in asset imbalance at all three nodes involved and the penalty for this constraint violations can prevent the intermediate solutions from being accepted. In addition, if the flow redistribution during any of these three moves is infeasible, the search will not reach the solution depicted in Figure

The Proposed κ-node Neighbourhood
In this section, we describe the proposed new neighbourhood which was originated from the idea of pairedroute-flipping. The main purpose is to maintain the feasibility of the solution during the search by changing the status of two carefully selected routes. Each route is a sequence of arcs representing vehicle moves over time. We describe this idea in the following subsection.

The paired route-flipping
Instead of flipping an arc, we identify a set of arc-flipping operations with automatic feasibility satisfaction in terms of the asset-balance constraint. Figure 3 illustrates this arc-flipping operator. The solid lines represent open arcs and dotted arcs denote closed arcs. The paired-route-flipping operator involves simultaneously changing the statuses of two routes which share the same source and destination nodes. In this particular example, suppose that the algorithm decides to close a route 1→2→3. If we can find one of its paired route that also starts at node 1 and finishes at node 3 but with different statuses (i.e. route is closed), the asset balance constraint can be satisfied by simply opening the paired route (i.e. the dotted route). Although this neighbourhood operator can guarantee satisfaction of asset-balance constraints, identifying such a pair of routes is not trivial. On the contrary, it is much easier to focus on nodes rather than arcs, leading to our κ-node neighbourhood structure which we describe in the next subsection.

The κ-node neighbourhood operator
In this neighbourhood, a subset of κ nodes out of all the nodes are selected and arcs incident upon these nodes are considered for changes. Note that in order to prevent evaluating a candidate solution many times, we require that the change of arcs should involve exactly κ nodes rather than a subset of them. We focus on the small and medium sized neighbourhoods. Large neighbourhoods (e.g. κ > 4) are not considered since it is impractical to evaluate them within a realistic time limit. Figure 4 illustrates the κ-node operator when κ = 2, 3 and 4, respectively. It is not difficult to see that when κ = 2, a feasible neighbour may exist only if both arcs connecting the two nodes have a same status (i.e. either both closed or both open). If one of them is open and the other is closed, no feasible neighbouring solution exists.
When κ = 3, the maximum number of arcs between these nodes is 6. For a feasible current solution s, we denote design variables for arcs a 0 , a 1 , ..., a 5 as y 0 , y 1 , ..., y 5 , respectively. Including the current solution s, the maximum number of neighbouring solutions for a 3-node operator will be 2 6 = 64. However, not every neighbouring solution will be feasible in terms of asset-balance constraint (4). For any neighbouring solution s , to satisfy asset-balance constraint, the following constraints should be respected, each of which corresponds to one of the three nodes under consideration. We denote the corresponding design variables in s for arcs a 0 , a 1 , ..., a 5 as y 0 , y 1 , ..., y 5 , respectively.
Condition (8) is obtained from the asset balance constraint for node 0. The left side term is the difference between the number of outgoing and incoming arcs connecting node 0 and the other two nodes in solution s, while the right side term stands for the same difference for node 0 in its neighbouring solution s . In order to make sure node 0 stays asset-balanced after neighourhood moves in s , the left side term should be made equal to the right side. That is, any neighbourhood moves should not change the difference between the number of outgoing arcs and incoming arcs for node 0. The same requirements applies to node 1 and node 2, leading to conditions (9) and (10), respectively. Note that any of the two conditions will be sufficient to ensure feasibility since the third condition can be obtained from the other two conditions. For example, condition (10) can be obtained by simply adding (8) and (9) on both sides correspondingly. In theory, the total possible number of neighbouring solutions of s is 2 η − 1 where η is the number of directed arcs inter-connecting the κ nodes. Hence when κ = 3, η = 6, and 2 η − 1 = 63. However, since y 0 , y 1 , ..., y 5 take binary values only, these conditions will exclude lots of neighbouring networks that are infeasible. For example, if the left side of condition (8) equals 2 (meaning y 0 = y 2 = 1, y 1 = y 3 = 0), none of the 63 neighbours will be feasible because of this condition. If the left side of condition (8) equals 1, i.e. y 0 + y 2 − y 1 − y 3 = 1, including the original network there will be 4 possible feasible neighbouring solutions for this condition. They are: (1, 0, 0, 0), (0, 0, 1, 0), (1, 0, 1, 1), (1, 1, 1, 0). Due to variables y 4 and y 5 , more solutions are expected if both condition (8) and condition (9) are considered. Nevertheless, the number of asset-balanced neighbouring solutions for s will be significantly smaller than 63.
Similarly when κ = 4, the following conditions should be satisfied for candidate solutions to ensure the asset-balance at each node: A C C E P T E D M A N U S C R I P T y 0 + y 2 + y 4 − y 1 − y 3 − y 5 = y 0 + y 2 + y 4 − y 1 − y 3 − y 5 (11) y 1 + y 6 + y 8 − y 0 − y 7 − y 9 = y 1 + y 6 + y 8 − y 0 − y 7 − y 9 (12) y 3 + y 7 + y 10 − y 2 − y 6 − y 11 = y 3 + y 7 + y 10 − y 2 − y 6 − y 11 (13) y 5 + y 9 + y 11 − y 4 − y 8 − y 10 = y 5 + y 9 + y 11 − y 4 − y 8 − y 10 Again, only 3 out the above 4 conditions are active and the other one is redundant. For a medium sized network of 60 nodes, the number of subsets of nodes with cardinality of 4 is C 4 60 = 487635. For each node subset, as mentioned above, the maximum possible number of neighbouring solutions of s is 2 12 − 1 = 4095. However, the actual number of feasible neighbouring solutions that satisfy the above conditions is significantly smaller. The size of the neighbourhood depends on the current solution s. For example, there will be no feasible neighbours when the left side of the above conditions takes extreme values (-3 or 3) since it means difference of in-degree and out-degree for all 4 nodes is 3. Any modification of y 0 , ..., y 11 will violate at least one of these conditions. The number of feasible neighbours most probably reaches a maximum when the left side of these conditions take values in the middle of permitted range (i.e. equal to 0). That is: y 1 + y 6 + y 8 − y 0 − y 7 − y 9 = 0 (16) Through a binary tree search algorithm, one could solve the above equations and it turns out that only 121 possible feasible neighbours exist as far as the asset-balance constraint is concerned. Despite this reduction, the size of the neighbourhood in a 60-node network when κ = 4 is still more than 59 million (121 ×C 4 60 ). Considering the time taken to solve the flow distribution sub-problem for each of these candidate solutions in the neighbourhood, it is impractical to efficiently evaluate neighbourhoods larger than κ = 4. Even with κ = 4, it could still be very slow to have a complete evaluation of the neighbourhood. Faster neighbourhood search procedures are required.

Speeding up the neighbourhood search
In this section, we discuss ways that could speed up the neighbourhood search. In the previous neighbourhood structure, there may be solutions which can be discarded directly without ascertaining their objective values. Firstly, given a solution s and one of its neighbouring solutions s , if too many arcs are closed in s compared to s, there is very little chance that the flow on these arcs can be redistributed among the remaining network. It is therefore not necessary to solve the CMMCF sub-problem. Similarly if a neighbouring solution s has too many open arcs than the original solution, it is not necessary to evaluate this solution either since the fixed cost would increase dramatically, resulting in a poor solution. These two "extreme" cases are dealt with by adding cut-set inequalities and a heuristic rule respectively which we now discuss.
Let N κ be the set of κ nodes selected in the κ-node neighbourhood and A κ be the set of arcs that join any of two nodes from N κ . For a given κ, the maximum number of arcs incident with these κ nodes is P 2 κ = κ(κ − 1). For each of node i ∈ N κ , we define the following cut-sets S i and T i : Let CapST i = ∑ s∈S i ,t∈T i u st y st be the aggregated arc capacity from S i to T i in a candidate solution with design vector y. Let DemandST i be the total amount of commodity flows that originate from S i and destine to T i . Similarly, let CapT S i and DemandT S i be the total available capacity from T i to S i and total amount of commodity flows from T i to S i , respectively. The necessary conditions for the candidate solution with design variable y to be feasible are: CapT S i ≥ DemandT S i ∀i ∈ N κ (20)

A C C E P T E D M A N U S C R I P T
In addition, any modification of arcs related to a node i ∈ N κ will likely impact on the flows going through its neighbouring nodes. Therefore, similar flow cut-set inequalities can be generated for its neighbouring nodes. Let S i = N + (i) and T i = {N\S i }, ∀i ∈ N κ . We have: where CapNST i = ∑ s∈S i ∑ t∈T i u st y st is the aggregated arc capacity from S i to T i for node ∀i ∈ N κ in solution design vector y, and CapNT S i is the aggregated arc capacity available from T i to S i . DemandNST i and DemandNT S i , respectively, are the aggregated commodity flows from S i to T i and T i to S i . Although useful in avoiding unnecessary CMMCF sub-problem solving, these cut set inequalities can be computationally expensive themselves simply because of the huge number of cuts available. In our implementation, we set the cardinality of the cut set |S i | ≤ 3 and we only check against these inequalities for candidate solutions which have 3 arcs or more closed compared to the current solution.
In the case of an "excessive" number of open arcs in s compared to s, the following condition is used to check whether s will be evaluated or discarded. Neighbours that do not satisfy this condition will be discarded. ∑ where f κ is the average fixed cost of the arcs in A κ that are involved in this neighbourhood move. We discard a neighbouring solution if it contains w more open arcs than the original solution, evaluated in terms of the average fixed costs. In our implementation we set w = 2.5. The number of nodes required for κ-node neighbourhood is at least 2. For a given input κ (≥ 2), a neighbouring solution can be generated by making changes to arcs connecting exactly h (2 ≤ h ≤ κ) nodes. Therefore, here neighbourhood κ = 3 will contain neighbouring solutions with changes involved by all possible node pairs and all possible node triplets. The pseudo-code of the neighbourhood is given in Algorithm 1.
For every neighbouring design variable vector y , the procedure first checks whether asset-balance constraint is respected by this vector. If not, y is discarded and the next vector is considered. The assetbalance constraint is checked in the following way. When h = 2, as discussed in the previous section, y is feasible only if two arcs connecting the two nodes have a same status (i.e. both open or close). When h = 3 or h = 4, one can check the asset-balance at each node using conditions (8)-(10) and (11)-(14), respectively. When κ > 4, as we discussed in section 5.2, the size of the neighbourhood increases significantly. It is impractical to evaluate the entire neighbourhood. Therefore in our experiment, we set κ = 4. Note that when we generate neighbouring solutions for h = 3 or h = 4, we should not duplicate neighbours which have been generated for h = 2. That is, the neighbourhood moves for h = 3 should involve all three nodes, rather than a subset of it. For example, the neighbourhood for node set {1,2,3} should not contain the neighbourhood for node set {1,2}, neither for node set {2,3} or {1,3}. In this way, we can ensure the search starts from smaller neighbourhoods and when we explore larger neighbourhoods, we do not duplicate solution evaluations for previously visited solutions.
Once a neighbouring design variable vector y satisfies the asset-balance constraint and the net number of closed arcs is greater or equal to 3, we check it against the inequality conditions (16)-(21) to filter infeasible design variable vectors. After this, the CMMCF procedure is called to find a feasible flow if it exists. If the corresponding node set NS is in the tabu list and aspiration criterion is not met, this solution is discarded. Otherwise, it is compared against the initial solution and best solution so far. If a candidate solution improves the initial solution, the procedure returns the first-improved solution. Otherwise, it returns the best solution s * in the current neighbourhood.

Performance Evaluation
In this section, we evaluate the performance of the κ-node neighbourhood against two recent metaheuristics based on the arc-flipping neighbourhood. For purposes of comparison, we chose basic TS and basic GLS to avoid complications from other factors such as various intensification and diversification mechanisms.

A C C E P T E D M A N U S C R I P T
Algorithm 1 The pseudo-code of the proposed κ-node neighbourhood function with TabuList support. It returns a first-improvement neighbouring solution s * from the current solution s = (x, y) as well as the corresponding node set NS * which defines the neighbourhood. κ is the maximum number of nodes allowed in the κ-node neighbourhood and z(.) is the fitness function. if NS is in TabuList and the aspiration criterion is not met then 13: skip to the next node set NS. 14: else Copy the solution to s . return s * and NS * Return the best neighbouring solution 27: end procedure

A basic TS with κ-node neighbourhood function
We firstly implement a basic TS method with the proposed κ-node neighbourhood function (denoted as TS κ-node) to evaluate its performance. We compare it against the results reported in Pedersen et al (2009) by a multi-start TS method given and results reported in Bai et al (2012) by a tabu assisted multi-start GLS method. Both algorithms use the arc-flipping neighbourhood function. More details and discussions about TS can be found in the book of Glover and Laguna (1997).
The pseudo-code of TS κ-node algorithm is given in Algorithm 2. The inputs of the algorithm are a feasible initial solution s 0 , the objective function of the problem z(.), the maximum number of nodes allowed in the neighbourhood generation κ, and the maximum length of the tabu list T L. Because the neighbourhood search operates on feasible solutions only, the initial solution was generated by the tabu assisted GLS method (TA MGLS) in Bai et al (2012) which was stopped as soon as a feasible solution is found. As such, the initial solutions used by the TS method in this experiment are much inferior than the final results reported by TA MGLS (Bai et al, 2012). In our experiment, we set κ = 4 to keep the size of the neighbourhood relatively small so that it can be evaluated quickly. We used a fixed length tabu list TabuList which is maintained on the first-in-last-out basis. The maximum length is set to T L = 10 after some initial tests on a subset of the benchmark instances. Because the proposed κ-node neighbourhood is based on node sets rather than arcs, the tabu list contains the node set which leads to the adoption of the current solution returned by the procedure FirstDescent(s , z(.), k). The procedure repeatedly calls the FirstDescent(.) to search for a first-decent neighbouring solution which is not in the tabu list until the stopping criterion is met. In this case, the procedure stops when the maximum allowed time is exhausted. This was set to 2400s minus the amount of time spent in the initial feasible solution generation phase.

A C C E P T E D M A N U S C R I P T
Algorithm 2 A basic TS with κ-node neighbourhood input An initial feasible solution s 0 , the objective function z(.), κ, tabu length T L.
Initialise the TabuList, the current solution s = s 0 , and the best solution s b = s 0 . while stopping criterion is not met do s , NS ← FIRSTDESCENT(s , z(.), k) Get the first-descent solution and the node set NS if z(s ) < z(s b ) then s b = s Update the best solution end if TabuList.Add(NS) Add the corresponding node set to the TabuList if (TabuList.Length > T L) then TabuList.RemoveFrist Maintain the TabuList end if end while return s b Table 2 An initial evaluation of the performance of the proposed κ-node neighbourhood in a basic TS algorithm (TS κ-node) in comparison with two previous algorithms; TS (Pedersen et al, 2009) and TA MGLS (Bai et al, 2012). TS was used once only because it was developed into a deterministic algorithm. The best objective values are highlighted in bold.  Table 2 presents the computational results by the basic TS with the proposed neighbourhood function (denoted as TS κ-node) in comparison with two other metaheuristics for this problem; TS (Pedersen et al, 2009) and TA MGLS (Bai et al, 2012). Since TS in Pedersen et al (2009), tested on a Pentium IV 2.26GHz PC with 3600 seconds CPU time, was developed into a determinstic algorithm, only one run is required. Both TA MGLS and TS κ-node were run on a PC with 2.0GHz Intel Core 2 CPU, single-threaded and a 2400-second time limit in conjunction with Cplex12 as the linear programming solver. Therefore, both TS MGLS and TS κ-node uses much less time than TS in Pedersen et al (2009).

Instance
The experiments were based on a set of benchmark instances drawn from Pedersen et al (2009). This data set consists of 24 instances of different sizes (nodes, arcs, commodities) and distributions of fixed cost, variable cost and capacity. The first three numbers in the instance name represent the number of nodes, the number of arcs and number of commodities respectively. 'F' indicates that the fixed cost dominates the cost function while a 'V' means dominant variable costs. 'L' stands for loose capacity constraints while 'T' means capacities are tight. For each instance, 10 independent runs with different random seeds were conducted and their best, mean and worst results are reported. The best results among the three approaches are highlighted in bold. It can be seen that even with a very basic TS method, the new neighbourhood function A C C E P T E D M A N U S C R I P T is able to produce very competitive results. Both the TS method in Pedersen et al (2009) and the tabu assisted multi-start GLS method (TA MGLS) in Bai et al (2012) used a multi-start framework to diversify the search. It can be seen that the proposed neighbourhood evaluated in a basic TS, performed better than the TS method in Pedersen et al (2009). It also outperformed TA MGLS for many instances, particularly small instances. For large instances (e.g. instances with 400 commodities), TS κ-node was slightly inferior to TA MGLS. This is probably caused by longer computational time taken by each FirstDescent(.) procedure call for larger sized problems which leads to significant increase in CMMCF solution time. A possible improvement for this algorithm is then to develop some faster heuristic flow distribution procedures to reduce the number of CMMCF calls.

A basic guided local search with new neighbourhood function
We also implemented a basic GLS method with the proposed neighbourhood. The pseudo-code of the algorithm is given in Algorithm 3. GLS is a metaheuristic designed for constraint satisfaction and combinatorial optimisation problems (Voudouris and Tsang, 2003). The underlining idea is to take advantage of information gathered during the search to guide it and enable it to escape local optima. GLS adopts a transformed objective function which includes a penalty to penalise "unattractive" features in a candidate solution. We denote p r as the current penalty for the presence of a given feature r in the current solution s, and I r (s) is an indicator variable such that I r (s) = 1 if the candidate solution s contains feature r and I r (s) = 0 otherwise. h r is a cost associated with feature r. In this application, we chose all of the arcs as the GLS features and their fixed costs as the feature costs, i.e. h r = f r for each arc r ∈ A . Parameter λ is a scaling coefficient between the original objective function z(s) and the aggregated feature penalties. Since λ is problem instance dependent and is difficult to tune directly, it was estimated by λ = αg(s * )/ ∑ r I r (s * ) where s * is the current best solution and α is a parameter that is less problem-dependent than λ . At each GLS iteration, the proposed κ-node neighbourhood function FirstDescent(.) was used to find a first-descent solution except that the TabuList in FirstDescent(.) was set to empty. Therefore, the GLS we tested here is in its basic version. Find r with maximum util r , set p r = p r + 1 8: end for 9: end while 10: return s ← best solution found according to the original objective function z(s).
Similar to the experimental setup in the previous section, the basic GLS metaheuristic started from a feasible solution s 0 generated by the TA MGLS in Bai et al (2012) which stops as soon as a feasible solution is found. Similar to TS κ-node, the stopping criterion was 2400 seconds of computational time, minus the time spent for generating an initial feasible solution and the size of the neighbourhood is set to κ = 4. The arcs in the network were chosen to be GLS features R and the fixed cost of each arc is defined as the corresponding feature cost. The GLS parameter was set to α = 0.05 based on a preliminary experiment on a number of representative instances. Table 3 presents the results by GLS with κ-node neighbourhood for the same set of benchmark instances, in comparison with a same basic GLS with an arc-flipping neighbourhood operator (denoted as GLS) and the TA MGLS in Bai et al (2012). It can be seen that with a same GLS framework, the proposed A C C E P T E D M A N U S C R I P T κ-node neighbourhood function outperformed the arc-flipping neighbourhood for each of the 24 instances. Compared with TA MGLS which is much more sophisticated, GLS κ-node was inferior for most instances but obtained better results for instance C51 and C57, both of which have a small commodity size. GLS κnode is generally competitive when the problem size is small. For large instances, each FirstDescent(.) call is expensive and hence impedes the search significantly. This is compounded with influence of the transformed objective function g(s) used in GLS that leads to poor solutions since local optima were not reached when the computational time is not sufficient. This also explains why TS κ-node was able to obtain better solutions than GLS κ-node in general although both of them started from the same initial solutions and used exactly the same neighbourhood function. Nevertheless, through these two experiments, the new neighbourhood function has shown its effectiveness by producing very promising results, obtaining the new best-known results for many instances. This is largely attributed to its better reachability because of larger neighbourhood sizes and abilities to maintain feasibility. Contrary to many other neighbourhood operators, the proposed new neighbourhood operator uses the constraint violations to their advantage by ignoring lots of infeasible solutions. Compared with the previous neighbourhood function, the superiority of the κ-node operator was demonstrated by the superior results obtained both by the basic GLS and basic TS without the multi-start mechanism which was crucial in a previous hybrid method TA MGLS in order to prevent the search from getting stuck at local optima.

A C C E P T E D M A N U S C R I P T
CMMCF sub-problem which again is computationally expensive. In this section, we investigate how the proposed neighbourhood method can be improved further. We now describe the heuristics that we used in our experiments to speed up the κ-node neighbourhood search, followed by a hybrid metaheuristic to further enhance the performance of the algorithm.

Speeding up the neighbourhood search
Due to the size of the neighbourhood and the relatively large solution time for the full CMMCF subproblem, we adopted the following approximate method which solved a reduced network flow problem. More specifically, in Algorithm 1, line 11, instead of solving the CMMCF sub-problem exactly to obtain the optimal flow for y , we assume that the existing flows of the current solution s are already well distributed except that the commodity flows through nodes in set NS have to be redistributed since arcs interconnecting these nodes have changed in the neighbourhood move. Let x be the flow vector before the neighbourhood move and x be the flow vector after the neighbourhood move. Let K r be the set of commodities that have a positive flow through one of nodes in NS. Firstly, we set x = x and then delete from x all the flows of every commodity in K r . Let A r be the set of open arcs in y with a positive residual capacity (after the removal of commodity flows for K r ) and rc i j be residual capacity for arc (i, j) ∈ A r . Finally, let N r be the set of nodes joined by any of arcs in A r . The optimal flows for commodities K r are then obtained by solving the following reduced minimum cost network flow problem.
Once the reduced network flow problem is solved, the objective value of s can be computed through partial evaluation since it is easier to calculate the objective difference between s and s . In addition, in our implementation, we further speed up the search by only sampling 10% of the neighbourhood randomly when h = 4 (see Algorithm 1) since the size of this neighbourhood is extremely large. However, for cases h = 2 and h = 3, the entire neighbourhoods are evaluated.

Hybridising with other approaches
In this section we describe how the κ-node neighbourhood based tabu search approach can be hybridised with other approaches to improve its performance. More specifically we hybridise it with a variable fixing heuristic. The proposed hybrid algorithm can be illustrated by Figure 5.
The hybrid algorithm is divided into four phases, they are MIP TL, TA MGLS, TA k-node, and RMIP. The first stage (denoted as MIP TL) is the initialisation stage which adopts a Cplex MIP solver to generate a feasible solution by directly solving the SNDP model (1)-(4) within a given time limit (t 1 ). TA MGLS is the tabu assisted multi-start guided local search method proposed in Bai et al (2012). TA k-node is the tabu search method we described in the previous section. RMIP is a post optimisation procedure which solves a reduced MIP problem based on a pool of elite solutions found during the second and the third phases. The computational time for these three phases are denoted as t 2 , t 3 and t 4 , respectively. The search starts from the first phase and then goes through these phases sequentially. Each stage starts from the best solution found from the previous stage and tries to improve it, except the final stage RMIP which operates on a pool of elite solutions found in phase 2 and phase 3. In this application, the maximum size of the elite solution pool is fixed to 100. Once the pool reaches the maximum size, inferior solutions will be replaced with better solutions that are not in the elite pool yet. In the initialisation phase, if Cplex fails to generate a feasible solution, TA MGLS will start from an initial solution generated by LP Round, a procedure which solves the corresponding LP-relaxation of the problem and then rounds the design variables to the nearest integers. The flow variables are then recomputed by solving the corresponding CMMCF problem. The underlining ideas for this hybridisation are: 1) for many problem instances, Cplex MIP solver can find good quality (in a few cases, optimal) solutions fairly quickly but often converges very slowly afterwards. In addition, Cplex MIP tends to fail to produce feasible solutions for instances with a large number of arcs. 2) The Tabu Assisted Multi-start Guided Local Search (TA MGLS) (Bai et al, 2012) is very efficient in finding a good quality feasible solution thanks to the special ability of the guided local search which exploits the structure of the problem directly. 3) The κ-node neighbourhood has much better reachability than the previous neighbourhood. It can reach some local optima that otherwise cannot be found. 4) We observed, in our initial experiments, that the network design differences between the best neighbouring solutions found in stage two and stage three are not significant. Many of them share identical arc settings. In our hybrid algorithm, a reduced network design problem is modelled by prefixing the values of these identical arcs in the original MIP model. In our experiments, the reduced problem was generally solvable in 10-15 s for the majority of our tested instances with a few exceptions. We have set a time limit of 60 s for this stage (see Table 4) to ensure the majority of the computational time can be spent on the κ-node neighbourhood search stage The hybrid algorithm is applied to solve the same instances. The time limits for the four stages are set to t 1 = 2400,t 2 = 600,t 3 = 4140,t 4 = 60 seconds, respectively. Therefore, the total CPU time for each run of the hybrid algorithm is 7200 seconds. The hybrid algorithm is run 5 times (each with a different random seed) on a PC with 2.0GHz Intel Core 2 CPU and 8GB RAM and the best and the mean results are reported. Table 4 shows the detailed results of the four different stages of the hybrid algorithm and Table 5 compares the final results by the hybrid algorithm against those by Cplex12.4 MIP solver with 2 hours time limit (Cplex 2h), and a very recent metaheuristic TS-PR (Minh et al, 2013), which was run on a workstation with AMD Dual-Core Opteron 2.4GHz CPU and 16GB RAM. Due to data unavailability in the referenced article, only the best results out of 10 runs by TS-PR are included, each of which takes 7785 seconds CPU time on average. It can be seen that the proposed hybrid algorithm based on the κ-node neighbourhood performed competitively and has found new best solutions for several instances. It is particularly suitable as a quick post-optimisation approach for Cplex which appears to suffer slow convergence issues for some instances.