Edge covering with continuous location along the network

Article history: Received February 5, 2020; received in revised format February 28, 2020; accepted March 5, 2020; available online March 5, 2020.

The set covering problem is to find the minimum cardinality set of locations to site the facilities which cover all of the demand points in the network. In this classical problem, it is assumed that the potential facility locations and the demand points are limited to the set of vertices. Although this problem has some applications, there are some covering problems in which the facilities can be located along the edges and the demand exists on the edges, too. For instance, in the public service environment the demand (population) is distributed along the streets. In addition, in many applications (like bus stops), the facilities are not limited to the vertices (intersections); rather, they are allowed to be located along the edges (streets). For the first time, this paper develops a novel integer programming formulation for the set covering problem wherein the demand and facility locations lie continuously along the edges. In order to find good solutions in a reasonable time, a matheuristic algorithm is developed which iteratively adds dummy vertices along the edges and solves a simpler problem which does not allow non-nodal facility locations. Finally, a Benders decomposition reformulation of the problem is developed and the lower bounds generated by the Benders algorithm are used to evaluate the quality of the heuristic solutions. Numerical results show that the Benders lower bounds are tight and the matheuristic algorithm generates good quality solutions in a short time.

© 2020 by the authors; licensee Growing Science, Canada


Introduction
Placing facilities in appropriate locations has a long history, but it has received a great deal of formal research since the 1950s (Heragu, 2008). The set covering problem is one of the most well-known facility location problems with many applications, especially in the public sector (Farahani et al., 2012; Snyder, 2011). In this problem, a demand point is covered if its distance from at least one sited facility does not exceed a given length. The objective of the set covering problem is to minimize the number of facilities used to cover all of the demand points (Snyder, 2011). This problem was introduced by Hakimi (1965), who addressed the problem of finding the minimum number and the locations of policemen to cover the vertices of a highway network. Toregas et al. (1971) proposed the famous integer programming formulation for this problem.
The abovementioned classical set covering problem assumes that the facilities can be located only at the vertices of the network and that the demand occurs only at the vertices (Toregas et al., 1971). While these two assumptions have their own applications, there are other circumstances where the facilities have an insignificant geographic footprint and hence can be located anywhere along the network (Wei et al., 2014), and at the same time the demand for these facilities lies continuously along the edges. Examples include bus stops, interactive kiosks, ATMs, bike sharing stations, convenience shops, car charging stations, footbridges and wildlife road crossings, surveillance cameras, municipal waste collection bins, accident reporting acoustic sensors on road networks, and fire hydrants, some of which are mentioned in Berman et al. (2016) and Wei et al. (2014). Compared to the classical variation of the problem, there are two implications in dealing with this unrestricted variation (Church & Murray, 2018). Firstly, an infinite number of candidate facility locations must be considered. Secondly, there is no longer a finite set of demand points; instead, the continuous parts of the network (i.e. edges) must be considered for coverage.
The Hakimi property (also known as nodal optimality) states that in the p-median problem, there is always at least one optimal solution in which the facilities are sited only on the vertices (Laporte et al., 2015). It is known that this property does not carry over to the classical set covering problem (Snyder, 2011). Here, by using a simple counterexample, we show that this property does not hold for the case in which the demand is distributed along the edges. Hence, relaxing the restriction that the facilities must be located at the vertices can decrease the number of facilities required to cover the network. Fig. 1 represents a small network with coverage distance 10 in which the number beside each edge is its length. It can be verified that if the facilities are restricted to the vertices, at least two facilities are needed to cover the network (e.g. the ones denoted by F). However, relaxing this restriction allows the whole network to be covered by only one facility located at point A.

Fig. 1. Unrestricted location of facilities
In Fig. 1, the coverage of the network is equivalent to the coverage of its vertices. However, this is not a general rule. Consider Fig. 2 with coverage distance 10. Again, the length of each edge is written beside it and the located facilities are denoted by F. All vertices are covered by these facilities, but some part of the network (i.e. the dotted section) has been left uncovered. Hence, the coverage of the vertices is not necessarily equivalent to the coverage of the entire network.
Similar to the edge covering problem, the minimum vertex cover problem is to find the minimum-size set of vertices which covers all of the edges (Cormen et al., 2009). However, the notion of coverage is different and is not based on the coverage distance; rather, it is assumed that each edge is covered if a facility is located on one or both of its endpoints. Some researchers like Li et al. (2016) and Cai et al. (2019) addressed the weighted extension of this problem, in which the aim is to minimize the total weight of the selected vertices. They designed some efficient local search algorithms to solve this problem. Pandey and Punnen (2018) studied a generalization of the minimum vertex cover problem in which the objective function is composed of the costs of not covering, covering, and double covering of edges. They developed the integer programming formulations of the problem and analyzed its computational complexity.
The formal discussion on the edge covering problem regarding the coverage distance dates back to 1976 when ReVelle et al. studied the set covering problem with the edge demand. They assumed that an edge is called covered if at least one sited facility reaches all of its points, meaning that the coverage of one edge cannot be accomplished by the joint partial coverage from multiple facilities. Sadigh et al. (2010) extended the edge covering problem allowing each edge to be covered from two directions by partial coverage of two facilities. They solved the problem using a Tabu search algorithm equipped with size reduction rules which decrease the solution time. Similar to ReVelle et al. (1976), they assumed that the candidate sites to place the facilities are limited to the set of vertices. Conrow et al. (2018) studied the bike sharing siting problem in the city of Phoenix. They considered two objectives of maximization of the coverage of demand points and edge segments. They considered a finite set of potential locations for bike sharing stations.
Covering problems with facility location on the edges and discrete demand points have also been studied. Church and Meadows (1979) addressed the maximal and set covering problems wherein the demands exist on the vertices but the facilities are allowed to be sited anywhere along the network. For both problems, they proved that there is an optimal solution in which the facilities are located at vertices plus some specific points along the edges. They also formulated the set covering and maximal covering of edges with the restriction that the facilities can be located only on the vertices and each edge is allowed to be covered by multiple facilities. Groß et al. (2009) tackled the problems of locating new facilities (e.g. bus stops and railway stations) along the edges of a transportation network in order to satisfy either the demand of some finite points in the plane or the nodes of the network. They considered two objectives separately: minimizing the number of stations which cover all demand points, and minimizing the total distance between demand points and their nearest facilities when the number of stations is given. They also proved that there are finite sets of optimal location points of polynomial order. Schobel et al. (2009) studied adding new bus stops (or train stations) to an existing bus route (or railway network) with the objective of minimizing the total added stopping times subject to the constraint that all of the discrete demand points must be covered. They converted the problem to the classical set covering problem and found some cases that can be solved more efficiently. Jayalakshmi et al. (2017) addressed the cooperative maximum covering problem wherein the coverage is not binary; rather, its strength decreases linearly as the distance increases. Further, they assumed that each demand point is considered covered if the sum of partial coverage from all facilities reaches a threshold. They assumed that the sites for locating facilities include continuous parts of edges as well as the vertices.
A few studies have considered facility location problems in which both demand and location occur continuously along the network. De Los Mozos & Mesa (2000) tackled the single facility location problem along the network wherein the demand lies both on the vertices and along the edges. Knowing the density function of the edge demand, they developed a decomposition algorithm to minimize the variance of the distance between the located facility and all demand points. They showed that this approach finds the optimal location of the facility more efficiently than the exhaustive approach. Berman et al. (2016) tackled the maximal covering problem in which the demand is distributed along the edges and facilities can be located anywhere on the network. For the case when the demand is distributed constantly along edges and there is only one facility to be located on the network, they derived a discrete set of locations, called the finite dominating set (FDS), which always contains the optimal location. Based on the FDS, they designed a greedy adding and improvement heuristic to solve the multifacility location problem. They also addressed the minimal covering problem of obnoxious facilities. Wei et al. (2014) addressed the set covering problem where the facilities can be located anywhere along the network and the demand exists all over the edges. They designed an algorithm which iteratively generates lower and upper bounds whose gap approaches zero as the algorithm proceeds. Their algorithm was based on the assumption that the coverage distance is defined by the Euclidean distance (i.e. each demand point is covered if at least one facility is located in its circular coverage disk).
The contributions of this paper are as follows. First, this paper tackles a novel extension of the set covering problem where the facilities can be located continuously along the network and the aim is to cover all of the edges (i.e. the whole network) with the minimum number of facilities. To the best of our knowledge, the only study which has considered this problem is Wei et al. (2014). However, Wei et al. (2014) assumed that the coverage is based on the Euclidean distance between points on the plane. In contrast, we assume that the coverage is defined based on the shortest path along the network. This seems more practical for public facilities in which the customers (facilities) must pass through the network to reach the facilities (customers). Second, a novel mathematical programming formulation of the problem is developed for the first time. Third, a matheuristic algorithm is designed which can find near optimal solutions in a short time. Fourth, a Benders algorithm is developed to generate lower bounds which verify the quality of the matheuristic solutions.
The remaining parts of the paper are organized as follows. Section 2 introduces the problem and develops its mathematical formulations. Section 3 is dedicated to the solution approaches. Subsection 3.1 provides some dominance rules. Subsection 3.2 introduces the node-adding matheuristic approach. Subsection 3.3 presents the Benders decomposition formulation of the problem which is used to generate the lower bound solutions. Section 4 uses the numerical results to verify the quality of the matheuristic algorithm and the tightness of the Benders' lower bounds. Finally, Section 5 concludes the paper and sheds light on some future extensions.

Problem formulation and notations
This paper tackles a novel edge covering problem wherein the facilities can be located anywhere on the network. The objective is to determine the minimum number of required facilities and their positions on the network. An edge does not need to be covered by a single facility, rather it can be covered cooperatively. The following assumptions are considered in the sequel.
- The graph of the problem is connected and undirected.
- At most one facility can be located along each edge.
- There are no parallel edges between any pair of vertices.
- All edges must be covered.
- No edge is longer than twice the coverage distance.
It should be stated that multi-graphs can be handled by the following simple trick. As shown in Fig. 3, adding a dummy vertex (like D) at an arbitrary position on each parallel edge converts the multi-graph to an ordinary graph. The last assumption guarantees the feasibility of the problem. Although the violation of this assumption does not necessarily lead to infeasibility, feasibility can be assured by adding some dummy vertices along the edges which are longer than twice the coverage distance.
The sets and the parameters of the problem are as follows.

[Table of sets and parameters]

The parameters dij are determined before solving the problem by running a shortest path algorithm for every pair of vertices. To this end, we used Dijkstra's algorithm (Cormen et al., 2009) and implemented it in Matlab. The mathematical formulation of the problem is as follows:

P: [objective function (1), constraints (2) to (11), and the variable-domain constraints]

The objective function (1) minimizes the number of used facilities. Constraint (2) states that the sum of the distances of any located facility from the two endpoints of its edge equals the length of that edge. Recall that for the edges without a located facility, the value of variable y has no meaning.
Constraints (3) and (4) ensure that if at least one edge is covered from edge ij, a facility has to be located along it. Constraint (5) imposes that edge i′j′ is covered through vertex i′ by at most one facility located along the other edges.
Constraints (6) and (7) determine the length of edge i′j′ which is covered by a facility located along another edge. To clarify these two constraints, consider the example depicted in Fig. 5, where a facility (denoted by a diamond) is located along edge ij. Suppose that the coverage distance (DC) is 10. Assume that the shortest path from node i to node i′ is of length 5 (i.e. dii′=5). As shown in this figure, yij=1. Constraint (6) does not force the binary coverage variable indexed by iji′j′ to be zero. If this variable equals 1, then ci′j′ can be 4. Now, assume that DC is 4. In this case, the coverage variable indexed by iji′j′ has to be zero; otherwise, Constraint (6) is violated. If the coverage variables indexed by iji′j′ are zero for all edges ij, then Constraint (7) imposes that ci′j′=0.

Fig. 5. An explanatory example for the calculation of cij
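The arithmetic of this example can be written out explicitly. The sketch below assumes that the length of edge i′j′ reachable through node i′ is bounded by the coverage distance that remains after travelling from the facility to node i and then along the shortest path to node i′; this bound is inferred from the numbers in the example, and the exact form of Constraint (6) may differ.

```latex
% Assumed bound behind the example; y_{ij} is the distance of the facility from node i.
\begin{align*}
c_{i'j'} &\le DC - y_{ij} - d_{ii'} \\
DC = 10:\quad c_{i'j'} &\le 10 - 1 - 5 = 4 \\
DC = 4:\quad c_{i'j'} &\le 4 - 1 - 5 = -2 < 0
\end{align*}
```

In the second case no positive length of edge i′j′ is reachable, which is why the corresponding coverage variable must be zero.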
Constraints (8) and (9) ensure that the length of edge ij covered by its own facility is not greater than the coverage distance. Constraint (10) prevents erroneous calculation of the cij values by precluding double counting of edge parts. It is worth noting that when an edge does not contain a located facility, this constraint is redundant since yij can take arbitrary values. Constraint (11) enforces every edge to be totally covered. The remaining constraints determine the domain of the variables. It is worth noting that formulation P can also handle graphs with directed edges. Suppose that edge ij is directed from i to j. Then, it suffices to drop all variables and constraints which consider traversing edge ij from j to i.
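The distances dij used throughout formulation P are computed once, before optimization. The authors implemented this step in Matlab; the following Python sketch is only an illustration of the same preprocessing, and the function names and the dictionary-based edge representation are assumptions made here.

```python
import heapq


def shortest_paths_from(source, adjacency):
    """Dijkstra's algorithm: shortest distance from `source` to every vertex."""
    dist = {v: float("inf") for v in adjacency}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry, a shorter path was already found
        for v, length in adjacency[u].items():
            nd = d + length
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def all_pairs_distances(edges):
    """Build d[i][j] for an undirected graph given as {(i, j): length}."""
    adjacency = {}
    for (i, j), length in edges.items():
        adjacency.setdefault(i, {})[j] = length
        adjacency.setdefault(j, {})[i] = length
    return {i: shortest_paths_from(i, adjacency) for i in adjacency}


if __name__ == "__main__":
    d = all_pairs_distances({("a", "b"): 3, ("b", "c"): 4, ("a", "c"): 10})
    print(d["a"]["c"])
```

Running one single-source search per vertex is adequate for the sparse networks considered here; for dense graphs an all-pairs method could be substituted.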

Solution approach
In this section, some dominance rules which reduce the feasible region are presented first. Then, a matheuristic algorithm is developed which iteratively solves a restricted variation of the problem. Finally, to evaluate the quality of the matheuristic solutions, a Benders decomposition formulation of the problem is used to generate benchmark lower bounds.

Dominance rules
In this subsection, some dominance rules are explained with the aim of reducing the feasible region.
Observation 1. For any pair of nodes i and i′, if dii′>DC, then the binary coverage variable indexed by iji′j′ equals 0. This is clear since no facility located along edge ij can cover any part of edge i′j′ through the shortest path ii′.
Observation 2. For any pair of edges ij and i′j′, if dii′=dji′+lij, then the binary coverage variable indexed by iji′j′ equals 0. In these situations, regardless of the value of yij, the shortest path from the facility located along edge ij to node i′ passes through node j. Hence, solutions in which this variable equals 1 are dominated.
To illustrate this observation, consider the network shown in Fig. 6, in which the straight lines represent the edges and the curves denote the shortest paths if the network excludes edge ij. Since dii′=dji′+lij, the curve ii′ is not shorter than dji′+lij. It follows that the coverage of edge i′j′ through node i′ by a facility located along edge ij is always longer (or at least equal) if the coverage is considered through node j.

Fig. 6. Illustration of observation 2
Observation 3. For any pair of edges ij and i′j′, if dii′=dij′+li′j′, then the binary coverage variable indexed by iji′j′ equals 0. In this case, the shortest path from node i to node i′ passes through node j′. Hence, solutions in which this variable equals 1 are dominated.
To illustrate this observation, consider the network shown in Fig. 7, in which the straight lines represent the edges and the curves denote the shortest paths if the network excludes edge i′j′. Since dii′=dij′+li′j′, the curve ii′ is not shorter than dij′+li′j′. It follows that if the coverage of edge i′j′ through nodes i and i′ by a facility located along edge ij is positive, this facility completely covers edge i′j′ through nodes i and j′. Thus, the solutions in which the coverage variable indexed by iji′j′ equals 1 are dominated.
Observation 4. Suppose that ij is a pendant edge (that is, i or j has vertex degree 1) with lij ≥ DC. Without loss of generality, let i be the vertex with degree 1 and assume that i<j. Then, in any optimal solution xij=1. Further, there is an optimal solution in which yij=DC.
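Observations 1 to 3 can be applied as a preprocessing filter before the model is built. The sketch below is a minimal illustration under stated assumptions: the function and argument names are hypothetical, `d` is a nested dictionary of shortest path distances, `length` maps each undirected edge to its length, and equality of path lengths is tested with a small tolerance because the distances are floating-point numbers.

```python
import math


def edge_length(length, u, v):
    """Length of the undirected edge uv, whichever way it is keyed."""
    return length[(u, v)] if (u, v) in length else length[(v, u)]


def coverage_fixed_to_zero(i, j, ip, jp, d, length, dc, tol=1e-9):
    """Return True if Observations 1-3 allow fixing to zero the coverage
    of edge (ip, jp) by a facility on edge (i, j) through nodes i and ip."""
    d_ii = d[i][ip]
    # Observation 1: node ip is out of reach from node i.
    if d_ii > dc + tol:
        return True
    # Observation 2: the shortest path from i to ip runs through j.
    if math.isclose(d_ii, d[j][ip] + edge_length(length, i, j), abs_tol=tol):
        return True
    # Observation 3: the shortest path from i to ip runs through jp.
    if math.isclose(d_ii, d[i][jp] + edge_length(length, ip, jp), abs_tol=tol):
        return True
    return False


if __name__ == "__main__":
    # Path graph a - b - c - d with edge lengths 3, 4, 5.
    length = {("a", "b"): 3.0, ("b", "c"): 4.0, ("c", "d"): 5.0}
    d = {
        "a": {"a": 0.0, "b": 3.0, "c": 7.0, "d": 12.0},
        "b": {"a": 3.0, "b": 0.0, "c": 4.0, "d": 9.0},
        "c": {"a": 7.0, "b": 4.0, "c": 0.0, "d": 5.0},
        "d": {"a": 12.0, "b": 9.0, "c": 5.0, "d": 0.0},
    }
    print(coverage_fixed_to_zero("a", "b", "c", "d", d, length, dc=10.0))
```

Each variable fixed this way removes one binary variable and its linking constraints from the model.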

The Edge-division matheuristic approach (EDA)
The infinite number of candidate location points (along the edges) is one of the major sources of complexity of problem P. The mathematical formulation of the restricted variation of the problem (denoted by RV for short), in which the facilities can be located only on the vertices, is as follows.

RV: [objective function and constraints of the restricted problem, including constraints (16), (18) and (20)]

The objective is to minimize the number of located facilities. Parameter ckij is the coverable length of edge ij through vertex i if a facility is located at vertex k. It is calculated beforehand and equals min{max{0, DC-dki}, lij}. By (18) and (20), the first summation of constraint (16) is a convex combination of the scalars ckij. Thus, the maximum value of this expression for edge ij is attained when the weight indexed by kij equals 1 for the k with maximum ckij and zero for every k′ ≠ k. The same discussion holds for the second summation of constraint (16). Suppose these two expressions are at their maximum values. Now, if the sum of the covered lengths through both directions of edge ij is greater than or equal to the length of this edge, edge ij is covered completely.
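The coverage logic of RV can be checked directly for a given set of vertex facilities. The following sketch computes ckij = min{max{0, DC-dki}, lij} and tests whether the best coverage through each endpoint adds up to the edge length; the function names and data layout are assumptions made for illustration, not part of the formulation.

```python
def coverable_length(d_ki, l_ij, dc):
    """c^k_ij: length of edge ij coverable through vertex i from vertex k."""
    return min(max(0.0, dc - d_ki), l_ij)


def edge_fully_covered(i, j, l_ij, facilities, d, dc):
    """Check whether edge ij is covered by facilities sited at vertices.

    The best facility is chosen independently for each endpoint, which is
    the maximum of the convex combination discussed in the text.
    """
    through_i = max(
        (coverable_length(d[k][i], l_ij, dc) for k in facilities), default=0.0
    )
    through_j = max(
        (coverable_length(d[k][j], l_ij, dc) for k in facilities), default=0.0
    )
    return through_i + through_j >= l_ij


if __name__ == "__main__":
    # Path graph a - b - c with both edges of length 6 and DC = 5.
    d = {
        "a": {"a": 0.0, "b": 6.0, "c": 12.0},
        "b": {"a": 6.0, "b": 0.0, "c": 6.0},
        "c": {"a": 12.0, "b": 6.0, "c": 0.0},
    }
    print(edge_fully_covered("a", "b", 6.0, {"b"}, d, dc=5.0))
    print(edge_fully_covered("a", "b", 6.0, {"a", "b"}, d, dc=5.0))
```

In this small example a single facility at b reaches only 5 of the 6 length units of edge ab, whereas facilities at both endpoints cover it cooperatively.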
The optimal solution of problem P can be approximated by adding dummy vertices along the edges and solving problem RV, which is easier. Adding more dummy vertices may improve the quality of the approximate solution but increases the time it takes to solve the problem. Let N be the number of dummy vertices to be placed along the edges, and let nij be the number of dummy vertices assigned to edge ij. Fig. 8 describes our proposed algorithm to distribute the dummy vertices evenly over the network.

Initialization: l′ij ← lij, nij ← 0, ∀ ij ∈ E
For iter = 1 to N
    Find the edge in E with maximum l′ij and call it ij
    nij ← nij + 1
    l′ij ← l′ij · nij / (nij + 1)
For each edge ij ∈ E with nij > 0
    Place nij nodes along the edge so that the edge is divided into nij + 1 equidistant segments.
Fig. 8. Spreading the dummy vertices

The vertex set of the new graph includes the original and the dummy vertices. Moreover, the set of edges of this graph is composed of the undivided edges and the segments of the divided edges. The length of each segment is l′ij. Knowing l′ij, the length of the shortest path between any pair of vertices which includes at least one dummy vertex can be easily calculated. The pseudocode of the edge-division matheuristic is presented in Fig. 9.
Initialization: RemT ← T, k ← 0, G ← initial graph
Solve problem RV by calling CPLEX with run time limit RemT for the graph with no dummy vertices
BestObj ← Obj
BestSolution ← Solution
RemT ← RemT - RunTime
While RemT > 0
    Augment G to a new graph by adding N = 2^k · m dummy nodes distributed according to Fig. 8
    k ← k + 1
    Solve problem RV by calling CPLEX for the new graph with run time limit RemT
    RemT ← RemT - RunTime
    If Obj < BestObj
        BestObj ← Obj
        BestSolution ← Solution
Report BestObj and BestSolution
Fig. 9. The edge-division matheuristic
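The spreading procedure of Fig. 8 can be sketched in a few lines of Python. This is an illustrative version under stated assumptions: the function names are hypothetical, the current segment length is recomputed as the edge length divided by the number of segments (equivalent to the multiplicative update in the figure), and a heap replaces the linear search for the longest segment.

```python
import heapq


def spread_dummy_vertices(lengths, n_dummy):
    """Assign `n_dummy` dummy vertices to edges, always splitting the edge
    whose current segment length is largest.

    `lengths` maps each edge to its length; the result maps each edge to
    the number of dummy vertices placed on it.
    """
    counts = {edge: 0 for edge in lengths}
    if not lengths:
        return counts
    # Max-heap on the current segment length (negated for heapq's min-heap).
    heap = [(-length, edge) for edge, length in lengths.items()]
    heapq.heapify(heap)
    for _ in range(n_dummy):
        _, edge = heapq.heappop(heap)
        counts[edge] += 1
        segment = lengths[edge] / (counts[edge] + 1)
        heapq.heappush(heap, (-segment, edge))
    return counts


def dummy_positions(length, count):
    """Distances from the first endpoint that split the edge into
    count + 1 equal segments."""
    return [length * t / (count + 1) for t in range(1, count + 1)]


if __name__ == "__main__":
    counts = spread_dummy_vertices({("a", "b"): 10.0, ("b", "c"): 4.0}, 3)
    print(counts)
    print(dummy_positions(10.0, counts[("a", "b")]))
```

With edge lengths 10 and 4 and three dummy vertices, the long edge is split twice before the short one is split once, so the longest remaining segment shrinks as evenly as possible.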

The Benders decomposition approach (BD)
In order to evaluate the performance of the heuristic solutions, we need their corresponding optimal or lower bound solutions. Problem P is a mixed integer programming problem with a huge number of constraints and binary variables which cannot be solved optimally by the general purpose solvers in a reasonable time except for small problems. Such solvers (like CPLEX) also provide lower bound solutions, but these bounds are not necessarily tight. Benders technique is an optimization algorithm which decomposes the original problem into two smaller problems called master problem and subproblem (Taskin, 2010). This approach can solve problems with large scale formulations efficiently by alternately solving the master and subproblem (Oğuz et al., 2018). The master problem includes only a few (or even no) constraints. We use Benders decomposition algorithm to find tight lower bounds. In our case, the master problem is feasible and bounded, the subproblem is a feasibility problem, and the dual subproblem is also feasible. The optimal solution of P could be obtained by the solution of the master problem if it satisfies the constraints of the subproblem. Otherwise, the dual subproblem is unbounded and a constraint called feasibility cut needs to be appended to the master problem to exclude the corresponding extreme ray. The master problem (MP) of problem P is as follows.

MP:
Here, we provide three initial cuts (IC) with the aim of reducing the search space of the master problem.
Theorem 1. Expression (27) is a lower bound on the number of required facilities, where ε is an infinitesimal positive number and ⌊x⌋ is the integral part of x.
Proof. The number of facilities required to cover the shortest path between the farthest nodes is a lower bound on the minimum number of facilities required to cover the entire graph. First, assume that the length of the shortest path between the farthest nodes is less than or equal to 2DC. This path can be covered by locating one facility at its midpoint. Now, consider the opposite case. Let the shortest path between the farthest nodes i and j be of length K+2DC. The minimum number of facilities required to cover this path is attained when two facilities are located at distance DC from nodes i and j. If K is a multiple of 2DC, the part of this path between these two facilities can be covered by at least (K/(2DC))-1 facilities located along it. On the other hand, if K is not divisible by 2DC, the minimum number of facilities required to cover this part is attained by locating ⌊K/(2DC)⌋ facilities along it.
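The bound argued in the proof can be written compactly. The display below is a reconstruction from the proof only, not a transcription of Expression (27): it assumes that D denotes the length of the shortest path between the farthest pair of nodes and that the number of facilities is the sum of the location variables xij over the edges.

```latex
% Assumed form of the bound; D and the summation over x_{ij} are notation introduced here.
\[
\sum_{ij \in E} x_{ij} \;\ge\; \left\lfloor \frac{D - \varepsilon}{2\,DC} \right\rfloor + 1
\;=\; \left\lceil \frac{D}{2\,DC} \right\rceil ,
\qquad D = \max_{i,j} d_{ij} > 0 .
\]
```

Subtracting ε handles the case where D is an exact multiple of 2DC. For instance, with DC = 10, a longest shortest path of length 20 gives a bound of 1, whereas a length of 25 gives a bound of 2.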
IC1. The following cut reduces the search region.
IC2. If edge i′j′ does not accommodate a facility, its outer covering facilities need to be located along the edges that allow the complete coverage.
According to this constraint, as shown in Fig. 10, if two outer facilities (on edges ij and i″j″) cannot completely cover edge i′j′ when they are located at the nearest places to this edge (that is, node i and node i″), they cannot cover this edge regardless of their positions.
This cut is illustrated by the example provided in Fig. 11. No facility located along edge ij can cover edges i′j′ and i″j″ simultaneously since (DC-dii′)+(DC-dji″)<lij.
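The condition behind this cut is a simple interval argument: a facility at distance y from node i reaches node i′ only if y is at most DC-dii′, and reaches node i″ through node j only if lij-y is at most DC-dji″. The sketch below checks whether such a position y exists; the function and argument names are hypothetical and introduced only for illustration.

```python
def simultaneous_cover_interval(l_ij, d_i_ip, d_j_ipp, dc):
    """Return the interval (low, high) of distances y from node i at which a
    facility on edge ij reaches node i' through i and node i'' through j,
    or None if no such position exists."""
    slack_i = dc - d_i_ip    # largest admissible distance from node i
    slack_j = dc - d_j_ipp   # largest admissible distance from node j
    if slack_i < 0 or slack_j < 0:
        return None          # one of the target nodes is out of reach
    low = max(0.0, l_ij - slack_j)
    high = min(l_ij, slack_i)
    if low > high:
        return None          # equivalent to slack_i + slack_j < l_ij
    return (low, high)


if __name__ == "__main__":
    print(simultaneous_cover_interval(8.0, 5.0, 6.0, dc=10.0))
    print(simultaneous_cover_interval(8.0, 7.0, 6.0, dc=10.0))
```

When the function returns None for a triple of edges, the two corresponding coverage variables cannot both be 1, which is the content of the cut.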
Fig. 12. The Benders algorithm

Table 1 presents the results of solving 40 randomly generated problem instances. All run times are in seconds. Although the time limit for the EDA algorithm is 1000 seconds, in most of the instances it finds its best solution in a few seconds. LB stands for the best found lower bound and BF denotes the best found solution.

Fig. 14 compares the solutions of the EDA algorithm with the best found solution obtained by solving P with CPLEX. In other words, this figure depicts the savings in the number of required facilities when the problem is solved by EDA rather than by solving problem P with CPLEX. On average, the number of facilities used by CPLEX is 6.5 times the number used by the EDA algorithm. Further, in three-quarters of the instances, solving the problem by EDA, rather than solving P by CPLEX, results in at least a 58% reduction in the number of required facilities.

Fig. 15 shows the quality of the heuristic solutions by comparing them with the greater of the lower bounds found by CPLEX and the Benders algorithm. On average, the number of facilities used by the EDA algorithm is 0.08 larger than the maximum of the two lower bounds. In 75% of the instances, the EDA solutions match the maximum lower bound and hence are optimal. There is only one instance (i.e. #26) in which the EDA algorithm uses two more facilities than the maximum lower bound.

Concluding results and future research directions
This paper addressed the total edge covering problem in which the facilities are allowed to be located anywhere along the network. The mathematical programming formulation of the problem is developed for the first time. A matheuristic which iteratively solves the facility location on real and dummy nodes is proposed. In order to evaluate the quality of the heuristic solutions, a Benders algorithm is developed which generates tight lower bounds. Numerical results show the efficiency and effectiveness of the matheuristic algorithm for problems with 45 to 80 nodes.
This study addressed one of the facility location problems with continuous demand and location along the network. Many other facility location problems, like dispersion and center problems, can be studied with continuous location as well as distributed demand along the edges. We considered uncapacitated facilities. The study can be extended by considering the capacity of the facilities and the density of the demand along the edges. For ease of formulation, we assumed that at most one facility can be located along each edge. This assumption can be relaxed by allowing multiple facilities to be located along each edge.