Place Your Next Branch with MILE-RUN: Min-dist Location Selection over User Movement
Introduction
Min-dist location selection, as discussed in [14], is an important spatial decision problem with a wide spectrum of applications, such as marketing, urban planning, logistics, and location-based services. Informally, given a set U of users and a set F of existing locations for a specific kind of facility, the Min-dist problem selects a location c from a set C of candidates for establishing a new facility so that the average distance between users and their respective nearest facilities is minimized.
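As a rough illustration of the static Min-dist criterion just described, the following sketch evaluates every candidate by brute force. It assumes users, facilities, and candidates are plain 2-D points and uses Euclidean distance; the function name `min_dist_select` and the sample data are hypothetical and not taken from [14].

```python
from math import dist

def min_dist_select(users, facilities, candidates):
    """Return (best_candidate, average_distance) for stationary users.

    Assumption: all arguments are lists of 2-D points and distance is
    Euclidean. This is an illustrative brute-force scan only.
    """
    best, best_avg = None, float("inf")
    for c in candidates:
        total = 0.0
        for u in users:
            # Distance to the nearest existing facility ...
            nearest = min(dist(u, f) for f in facilities)
            # ... unless the new facility at c would be closer.
            total += min(nearest, dist(u, c))
        avg = total / len(users)
        if avg < best_avg:
            best, best_avg = c, avg
    return best, best_avg

users = [(0, 0), (10, 0)]
facilities = [(0, 1)]
candidates = [(10, 1), (5, 0)]
print(min_dist_select(users, facilities, candidates))  # ((10, 1), 1.0)
```

The scan costs one distance evaluation per user, facility, and candidate, which is the kind of overhead that dedicated Min-dist techniques aim to avoid.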
Several techniques [14], [15], [26], [30] have been proposed for addressing Min-dist location selection problems, and all of them assume that each user is stationary and can therefore be represented as a single point in a two-dimensional space. Unfortunately, with the proliferation of mobile devices, this assumption may not hold in the many applications where users are not static but travel from one place to another every day. In these scenarios, where a moving user is modeled as a set of spatial positions instead of a single one [21], none of the existing efforts in this field can be applied.
Consider the sharing economy, which has taken off in recent years. A car-sharing company (e.g., Autolib, Car2Go) wishes to improve its market share by optimizing its existing service network, namely the parking and charging stations. To make vehicles more accessible to users than those of its competitors, the company needs to choose, from a collection of candidate places, the location that minimizes the average pick-up distance for users if a new station is built there. A mobile user will probably pick up a car close to wherever she happens to be. As a result, for a more effective solution, the average distance cannot be evaluated from a single location per user, since all of the positions that represent a user’s movement play a role.
Another example is a franchise chain (e.g., McDonald’s, 7-Eleven). To improve customer experience and attract more customers, the company plans to add a new store to its existing branches so as to minimize the average travel cost. As mobile customers will probably shop at stores near wherever they happen to be, their movements, as in the previous example, have to be taken into consideration in the distance evaluation.
To address the Min-dist problem in the mobile setting, in this paper we present a novel problem, namely Min-dIst Location SElection oveR User MovemeNt (mile-run), which addresses the aforementioned limitation of existing Min-dist techniques. As users’ travel between spatial locations in the real world is usually confined to the road network, we focus on road network distance. Accordingly, we assume that both facilities and users are located on a road network graph G(V, E), where V and E denote the sets of vertices and edges, respectively. Additionally, as stated in [14], [15], in many real applications companies can only choose from a finite number of candidate places for rent or sale in a region or on a road, and we follow this setting in the paper. The mile-run problem can then be intuitively defined as follows. Given a set U of mobile objects (e.g., users, customers), a set F of existing locations for a specific kind of facility (e.g., stations, chain stores), and a network graph G(V, E), the mile-run problem returns the optimal location among a set C of candidates for establishing a new facility so that the average network distance between all objects and their respective nearest facilities is minimized.
Compared with existing Min-dist location selection solutions, the mile-run problem poses three challenges. Firstly, a moving object is described by a set of positions. Besides the large number of positions, which may lead to costly computation, taking every position into account would be inappropriate because some positions contribute little or nothing to finding the optimal place for the new branch. For instance, some positions may carry GPS errors or may have been visited by a user only occasionally. Hence, eliminating these noisy positions is vital to evaluating the average distance. Secondly, even if we know the worthwhile positions of each moving user, it is non-trivial to determine which facility a specific user will visit for service, as she may be present at any one of these positions. As existing solutions to the Min-dist problem are single-point based, they cannot handle the multiple-point scenario. Accordingly, the average distance is difficult to evaluate under the Min-dist criterion. Finally, the computational overhead is extremely high due to the massive number of shortest network distance computations over users, existing facilities and candidate locations, which makes Euclidean-distance-based Min-dist techniques inapplicable.
In this work, we present a systematic solution, illustrated in Fig. 1, to address the above challenges introduced by the mile-run problem. The framework consists of three major components as follows.
In the first component, we model user movement with the concept of reference locations [11]. Each reference location is one of the activity places where a user frequently shows up and is thus worth considering. With the help of the kernel method [24], we can identify for each user multiple reference locations with different presence probabilities.
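To make the idea of presence probabilities concrete, here is a minimal sketch that assumes the kernel method amounts to something like Gaussian kernel density estimation over a user's recorded positions. The candidate activity places, the bandwidth, and the function name `presence_probabilities` are all hypothetical; the procedure in [24] and in the paper's Section 3 may well differ.

```python
from math import dist, exp

def presence_probabilities(positions, places, bandwidth):
    """Score each candidate activity place by an (unnormalized) Gaussian
    kernel density over the user's positions, then normalize the scores
    so that they sum to 1.

    Assumption: `places` are given in advance; how reference locations
    are actually discovered is not specified in this excerpt.
    """
    def density(x):
        return sum(exp(-dist(x, p) ** 2 / (2 * bandwidth ** 2)) for p in positions)

    scores = [density(place) for place in places]
    total = sum(scores)
    return [s / total for s in scores]

# Three fixes near "home", one near "work" (toy coordinates).
positions = [(0, 0), (0.1, 0), (0, 0.1), (10, 10)]
places = [(0, 0), (10, 10)]
probs = presence_probabilities(positions, places, bandwidth=1.0)
print(probs)
```

In this toy input the first place receives the larger probability because more recorded positions fall near it, which matches the intuition that frequently visited places deserve more weight.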
The second component formalizes the Min-dist problem in mobile scenarios. In light of the uncertainty of movements, users that may be present at any one of their reference locations can be modeled following possible world semantics [2]. Based on the criterion of the existing Min-dist problem (i.e., with stationary objects), the optimal candidate with respect to a specific possible world can be obtained. As any possible world may occur, all possible worlds affect the result, and the average distance is therefore a random variable. Hence, candidates can be ranked according to the expected average distance over all possible worlds.
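The expected average distance over possible worlds can be illustrated by direct enumeration. The sketch below assumes that each user is present at exactly one of her reference locations, that users are independent of one another, and that the distance from each reference location to its nearest facility has already been computed; the data layout and the name `expected_avg_distance` are hypothetical.

```python
from itertools import product

def expected_avg_distance(users):
    """users: one list per user of (probability, distance) pairs, one pair
    per reference location, where `distance` is the distance from that
    reference location to its nearest facility.

    Enumerates every possible world, so the cost grows exponentially
    with the number of users.
    """
    expected = 0.0
    # A possible world picks exactly one reference location per user.
    for world in product(*users):
        prob = 1.0
        total = 0.0
        for p, d in world:
            prob *= p      # independence across users (assumption)
            total += d
        expected += prob * (total / len(users))
    return expected

users = [
    [(0.5, 2.0), (0.5, 4.0)],    # user 1: two reference locations
    [(0.25, 8.0), (0.75, 0.0)],  # user 2: two reference locations
]
print(expected_avg_distance(users))  # 2.5
```

Ranking candidates then amounts to repeating this computation with the distances updated for each candidate and keeping the smallest expected value.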
In the third component, we systematically solve the proposed mile-run problem in two steps. (1) Notably, the number of possible worlds is huge due to the massive number of users, so exploring all of them is impractical and leads to exponential complexity. To avoid exhaustive enumeration, we design a novel method that works from the perspective of reference locations and reduces the ranking process from exponential to linear complexity. (2) Furthermore, as shown in Section 7, city road networks usually have large numbers of vertices and edges, which results in costly network traversals when finding the optimal candidate for massive numbers of users. To address this problem, with the help of spatial locality index techniques, we propose two solutions, one based on the network nearest facility circle (NNFC) and one based on local networks, both of which provide superior efficiency. The former focuses only on those reference locations affected by certain candidates. The latter requires no network traversal for the selection of candidates. In addition, we present an alternative solution based on a network extension scheme for addressing mile-run from scratch, for the case where existing facilities and users are unknown in advance. In summary, our major contributions are outlined as follows.
- To the best of our knowledge, this is the first effort to formalize the problem of Min-dIst Location SElection oveR User MovemeNt (mile-run). Compared to existing works, mile-run is a generalized Min-dist problem, which takes into account the movements of objects.
- Depending on whether an index can be used, three novel solutions are developed to address the mile-run problem efficiently and accurately under different cases.
- Comprehensive experiments are conducted on both real-world and synthetic data over two real road networks. The results demonstrate that, compared to a baseline method, our proposed solutions improve efficiency by orders of magnitude.
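The reduction from exponential to linear ranking cost mentioned for the third component can be sketched with linearity of expectation: the expected average distance equals the probability-weighted sum of per-reference-location distances, divided by the number of users. The paper's own derivation is not reproduced in this excerpt, so the code below is only an assumption about how such a reduction may look; `expected_avg_distance`, `best_candidate`, and the data layout are hypothetical.

```python
def expected_avg_distance(users, cand_dist):
    """users: one list per user of (ref_id, probability, nfd) triples,
    where nfd is the distance from that reference location to its
    nearest existing facility.
    cand_dist: dict mapping ref_id to the distance from that reference
    location to the candidate under evaluation.

    Runs in time linear in the total number of reference locations.
    """
    total = 0.0
    for refs in users:
        for ref_id, p, nfd in refs:
            # The candidate only helps where it beats the nearest facility.
            total += p * min(nfd, cand_dist[ref_id])
    return total / len(users)

def best_candidate(users, candidates):
    """candidates: dict mapping candidate id to its cand_dist dict."""
    return min(candidates, key=lambda c: expected_avg_distance(users, candidates[c]))

users = [
    [("a", 0.5, 2.0), ("b", 0.5, 4.0)],
    [("c", 0.25, 8.0), ("d", 0.75, 0.0)],
]
candidates = {
    "c1": {"a": 1.0, "b": 5.0, "c": 9.0, "d": 3.0},
    "c2": {"a": 3.0, "b": 1.0, "c": 2.0, "d": 1.0},
}
print(best_candidate(users, candidates))  # c2
```

Here candidate c1 yields an expected average distance of 2.25 and c2 yields 1.0, so c2 is selected without enumerating any possible world.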
The rest of this paper is organized as follows. We give a brief overview of related work in the next section. In Section 3, we discuss user movement and model mobile objects using reference locations. Section 4 formally defines the mile-run problem. We propose our solutions to mile-run in Section 5. Section 6 analyzes the cost of the proposed methods. Section 7 reports the empirical study, and the last section concludes the paper. A set of frequently used notations and acronyms is listed in Table 1.
Section snippets
Related work
There have been increasing research efforts to address the location selection (LS) problem due to its importance in real-life applications. They are generally classified into two major categories: Min-dist and Max-inf. The former aims to find a location that, if a new facility is built there, minimizes the aggregate distance from each user to her closest facility. The latter returns an optimal location for establishing a new facility such that it attracts the largest number of users. In
Preliminaries
In this section, we first discuss the representation of mobile objects, and then introduce the kernel method, which is employed to model the movements of users in our framework.
MILE-RUN problem
In this section, we formally define the mile-run problem. Based on reference locations, mobile objects are modeled following possible world semantics. Candidate locations are then ranked by expected distance over all possible worlds. We begin by introducing some terminology that is necessary for the formal definition.
A location ℓ in this paper is a two-dimensional position on an edge in a given directed road network G(V, E), with a geographical coordinate (i.e., latitude and longitude). Each
Solutions to MILE-RUN
In light of Theorem 1 and Definition 3, a straightforward solution to the mile-run problem is to exhaustively check all candidate locations. It works as follows. For each c ∈ C, we compute its Δ(c). Then the candidate with the greatest ERD is the optimal answer. Despite the avoidance of enumerating all possible worlds for expected Min-dist, this method is still expensive due to the massive number of network traversals that occur in repeatedly finding the nearest facility of each reference
Cost analysis
In this section, we conduct a theoretical study of all proposed methods (i.e., Straightforward, NSJ, LNB and EN). Observe that the time complexity for finding the nearest facility of a reference location is O(|V|log |V|), as the network traversal terminates once a facility vertex is found, which means only vertices (1 means the facility vertex) are involved in the exploration.
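The early-terminating traversal referred to above can be sketched as Dijkstra's algorithm that stops at the first facility vertex it settles. This sketch assumes the reference location and all facilities sit on vertices of a directed graph stored as an adjacency dict, which simplifies the paper's setting of locations on edges; the function name `nearest_facility` and the toy graph are hypothetical.

```python
import heapq

def nearest_facility(graph, source, facilities):
    """graph: dict mapping vertex -> list of (neighbor, edge_weight).
    facilities: set of vertices hosting a facility.
    Returns (facility_vertex, network_distance), or (None, inf) if no
    facility is reachable from `source`.
    """
    best = {source: 0.0}
    heap = [(0.0, source)]
    settled = set()
    while heap:
        d, v = heapq.heappop(heap)
        if v in settled:
            continue  # stale heap entry
        settled.add(v)
        if v in facilities:
            # Vertices are settled in non-decreasing distance order,
            # so the first facility settled is the nearest one.
            return v, d
        for w, weight in graph.get(v, ()):
            nd = d + weight
            if nd < best.get(w, float("inf")):
                best[w] = nd
                heapq.heappush(heap, (nd, w))
    return None, float("inf")

graph = {
    "r": [("a", 1.0), ("b", 4.0)],
    "a": [("b", 1.0), ("f1", 5.0)],
    "b": [("f2", 1.0)],
    "f1": [],
    "f2": [],
}
print(nearest_facility(graph, "r", {"f1", "f2"}))  # ('f2', 3.0)
```

Because the search stops as soon as a facility is settled, only the vertices closer to the source than that facility are explored, rather than the whole network.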
The straightforward method described in the very beginning of Section 5 iteratively checks candidates based on
Performance study
In this section, we investigate the performance of our proposed mile-run solutions from a variety of aspects.
Conclusions
In this paper, we introduce a novel location selection problem, Min-dIst Location SElection oveR User MovemeNt (mile-run), which takes into account the movements of objects. Based on reference locations, which can model moving users and be captured from movement data, we present two groups of algorithms, index-based and index-free ones for solving mile-run. The first group, including NSJ and LNB, answers mile-run efficiently with carefully designed index structures, and fits the case where
Acknowledgments
The authors would like to thank the editor and anonymous reviewers for their constructive comments on this paper, and the authors of [26] for sharing their code. This work was supported by the National Natural Science Foundation of China (Nos. 61672408 and 61472298), CCF-VenustechRP2017005 and China 111 Project (No. B16037).
References (32)
- et al. On efficient k-optimal-location-selection query processing in metric spaces. Inf. Sci. (2015)
- et al. Query processing in spatial network databases. VLDB (2003)
- et al. Mobile facility location. DIAL-M (2000)
- et al. Probabilistic reverse nearest neighbor queries on uncertain data. TKDE (2010)
- et al. Efficient algorithms for optimal location queries in road networks. SIGMOD (2014)
- A note on two problems in connection with graphs. Numer. Math. (1959)
- et al. The optimal-location query. SSTD (2005)
- et al. Optimal network location queries. SIGSPATIAL/GIS (2010)
- R-trees: a dynamic index structure for spatial searching. SIGMOD (1984)
- et al. Top-k most influential locations selection. CIKM (2011)