Elsevier

Information Sciences

Volumes 463–464, October 2018, Pages 1-20

Place Your Next Branch with MILE-RUN: Min-dist Location Selection over User Movement

https://doi.org/10.1016/j.ins.2018.06.036

Abstract

Due to its wide spectrum of applications, the Min-dist location selection problem has drawn much research attention in spatial database studies. Given a group of existing locations of a specific kind of facility, the Min-dist problem aims to find the optimal location from a series of candidate places to establish a new facility such that the average distance between users and their respective nearest facilities is minimized. Although many solutions have been proposed to address Min-dist problems, they all assume that users are stationary. Unfortunately, in practice, objects (e.g., people, animals) are mobile in various scenarios. Because users move, it is non-trivial to identify an optimal candidate, and none of the existing solutions is applicable. Motivated by that, in this paper we take the mobility factor into account and present the first effort on a generalized Min-dist problem, called Min-dIst Location SElection oveR User MovemeNt (mile-run). To address the efficiency issues caused by user movement and the road network, we present two groups of algorithms based on a reference location transformation: index-based and index-free ones. The first group answers mile-run efficiently with the help of spatial-locality-based index structures and fits the case where facilities and users are known a priori, while the second group solves the problem from scratch. Extensive experiments are conducted on both real-world and synthetic datasets, and the results demonstrate that our algorithms are more efficient than the baseline method by orders of magnitude.

Introduction

Min-dist location selection, as discussed in [14], is an important spatial decision problem that has a wide spectrum of applications, such as marketing, urban planning, logistics, location-based services, etc. Informally, given a set U of users and a set F of existing locations for a specific kind of facility, the Min-dist problem selects a location c from a set of candidates, namely C, for establishing a new facility so that the average distance between users and their respective nearest facilities is minimized.
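The stationary setting described above can be illustrated with a minimal brute-force sketch. It assumes Euclidean distance in the plane and small in-memory lists; the function names `avg_nearest_dist` and `min_dist_select` and the sample coordinates are hypothetical and are not taken from [14].

```python
import math

def avg_nearest_dist(users, facilities):
    """Average Euclidean distance from each user to her nearest facility."""
    return sum(min(math.dist(u, f) for f in facilities) for u in users) / len(users)

def min_dist_select(users, facilities, candidates):
    """Return the candidate c that minimizes the average distance once c is added to F."""
    return min(candidates, key=lambda c: avg_nearest_dist(users, facilities + [c]))

# Hypothetical toy instance: three users on a line, one existing facility.
users = [(0.0, 0.0), (4.0, 0.0), (10.0, 0.0)]
facilities = [(0.0, 0.0)]
candidates = [(4.0, 0.0), (10.0, 0.0)]
best = min_dist_select(users, facilities, candidates)
```

In this toy instance the candidate at (10, 0) is selected: it lowers the average distance to 4/3, whereas the candidate at (4, 0) only lowers it to 2.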

Several techniques [14], [15], [26], [30] have been proposed for addressing Min-dist location selection problems, and all of these works assume that each user is stationary and can therefore be represented as a single point in a two-dimensional space. Unfortunately, with the proliferation of mobile devices, this assumption may not hold in various applications where users are not static and travel from one place to another every day. In these scenarios, where a moving user is modeled as a set of spatial positions instead of a single one [21], none of the existing efforts in this field can be applied.

Consider the sharing economy, which has been taking off in recent years. A car-sharing company (e.g., Autolib, Car2Go) wishes to improve its market share by optimizing its existing service network, namely the parking and charging stations. To make vehicles more accessible to users and stay competitive, the company needs to choose, from a collection of candidate places, an optimal location that, if a new station is built there, minimizes the average pick-up distance for users. A mobile user would probably take a car close to wherever she happens to be. As a result, for a more effective solution, the evaluation of the average distance cannot be attributed solely to a single location, as all of the positions that represent a user’s movement play a role.

Another example is a franchise chain (e.g., McDonald’s, 7-Eleven). In order to optimize customer experience, the company plans to build a new store in addition to its existing branches, so as to minimize the average travel cost and attract more customers. As mobile customers would probably shop in stores near wherever they happen to be, their movements, as in the previous example, have to be taken into consideration in the distance evaluation.

To address the Min-dist problem in the mobile setting, in this paper we present a novel problem, namely Min-dIst Location SElection oveR User MovemeNt (mile-run), which addresses the aforementioned limitation of existing Min-dist techniques. As users’ travels between spatial locations in the real world are usually confined to the road network, we focus on the road network distance. Accordingly, we assume both facilities and users are located on a road network graph G(V, E), where V and E denote the sets of vertices and edges, respectively. Additionally, as stated in [14], [15], in many real applications, companies can only choose from a finite number of candidate places for rent or sale in a region or on a road, and we follow this setting in the paper. As a result, the mile-run problem can be intuitively defined as follows. Given a set U of mobile objects (e.g., users, customers), a set F of existing locations for a specific kind of facility (e.g., stations, chain stores) and a network graph G(V, E), the mile-run problem returns the optimal location among a set C of candidate locations for establishing a new facility so that the average network distance between all objects and their respective nearest facilities is minimized.

Compared with existing Min-dist location selection solutions, the mile-run problem poses three challenges. Firstly, a moving object is described by a set of positions. Besides the large number of positions, which may lead to costly computation, taking every position into account would be inappropriate, as some positions contribute little or nothing to finding the optimal place for the new branch. For instance, there may be positions with GPS errors or positions that a user visits only occasionally. Hence, eliminating these noisy positions is vital to evaluating the average distance. Secondly, even if we know the worthwhile positions of each moving user, it is non-trivial to determine which facility a specific user will visit for service, as she may be present at any one of these positions. As existing solutions to the Min-dist problem are single-point based, they cannot handle the multiple-point scenario. Accordingly, the average distance is difficult to evaluate under the Min-dist criterion. Finally, the computational overhead is extremely high due to the massive number of shortest network distance computations over users, existing facilities and candidate locations, which makes the Euclidean distance based Min-dist techniques inapplicable.

In this work, we present a systematic solution, illustrated in Fig. 1, to address the above challenges introduced by the mile-run problem. The framework consists of three major components as follows.

In the first component, we model user movement with the concept of reference locations [11]. Each reference location is one of the activity places where a user frequently shows up every day and is thus worth considering. With the help of the kernel method [24], we can identify multiple reference locations for each user, each with its own presence probability.
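The idea of distilling raw positions into a few weighted reference locations can be sketched as follows. This is only an illustration under stated assumptions, not the kernel method of [24]: it uses an unnormalized Gaussian kernel over planar points, picks density peaks greedily, and estimates each presence probability as the share of positions closest to that peak. The function names and the parameters `h`, `min_sep` and `min_prob` are hypothetical.

```python
import math

def gaussian_density(p, points, h):
    """Unnormalized Gaussian kernel density of `points` evaluated at p (bandwidth h)."""
    return sum(math.exp(-math.dist(p, q) ** 2 / (2 * h * h)) for q in points)

def reference_locations(points, h=1.0, min_sep=3.0, min_prob=0.1):
    """Pick density peaks as reference locations and estimate presence probabilities."""
    # Rank observed positions by density; the densest ones are peak candidates.
    ranked = sorted(points, key=lambda p: gaussian_density(p, points, h), reverse=True)
    peaks = []
    for p in ranked:
        # Keep a position as a peak only if it is far from every peak chosen so far.
        if all(math.dist(p, r) >= min_sep for r in peaks):
            peaks.append(p)
    # Assign every position to its closest peak and count the assignments.
    counts = {r: 0 for r in peaks}
    for p in points:
        counts[min(peaks, key=lambda r: math.dist(p, r))] += 1
    # Drop rarely visited peaks (assumed noise) and renormalize the rest.
    kept = {r: n for r, n in counts.items() if n / len(points) >= min_prob}
    total = sum(kept.values())
    return {r: n / total for r, n in kept.items()}

# Hypothetical trace: six positions near home, three near work, one GPS outlier.
positions = [(0.0, 0.0), (0.2, 0.1), (0.1, -0.1), (-0.1, 0.0), (0.0, 0.2), (0.1, 0.1),
             (10.0, 10.0), (10.1, 9.9), (9.9, 10.1),
             (50.0, 50.0)]
refs = reference_locations(positions, h=1.0, min_sep=3.0, min_prob=0.15)
```

With these assumed parameters the outlier at (50, 50) is discarded, and two reference locations remain with presence probabilities of about 2/3 and 1/3.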

The second component formalizes the Min-dist problem in mobile scenarios. In light of the uncertainty of movements, users that may be present at any one of their reference locations can be modeled following possible world semantics [2]. Based on the criterion of the existing Min-dist problem (i.e., stationary objects), the optimal candidate with respect to a specific possible world can be obtained. As any possible world may occur, all possible worlds affect the result, and the average distance is therefore a random variable. Hence, candidates can be ranked according to the expected average distance over all possible worlds.
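To make the possible-world view concrete, the sketch below computes the expected average distance in two ways: by enumerating every possible world, and by summing over reference locations using linearity of expectation. It assumes that users choose their locations independently and, for brevity, uses Euclidean distance as a stand-in for network distance. The function names and sample data are hypothetical, and the per-reference-location sum is a standard identity rather than a reproduction of the paper's derivation.

```python
import math
from itertools import product

def expected_avg_dist_enumerate(users, facilities):
    """Expected average nearest-facility distance by enumerating every possible world."""
    expected = 0.0
    # One possible world picks exactly one (location, probability) pair per user.
    for world in product(*users):
        prob = math.prod(p for _, p in world)
        avg = sum(min(math.dist(loc, f) for f in facilities) for loc, _ in world) / len(users)
        expected += prob * avg
    return expected

def expected_avg_dist_linear(users, facilities):
    """Same expectation via linearity: one nearest-facility lookup per reference location."""
    total = sum(p * min(math.dist(loc, f) for f in facilities)
                for refs in users for loc, p in refs)
    return total / len(users)

# Hypothetical input: each user is a list of (reference location, presence probability).
users = [
    [((0.0, 0.0), 0.7), ((6.0, 0.0), 0.3)],
    [((2.0, 0.0), 0.5), ((9.0, 0.0), 0.5)],
]
facilities = [(0.0, 0.0), (10.0, 0.0)]
by_worlds = expected_avg_dist_enumerate(users, facilities)
by_refs = expected_avg_dist_linear(users, facilities)
```

Both functions return 1.35 on this input (up to floating-point rounding). The first visits a number of worlds that grows exponentially with the number of users, while the second touches each reference location once.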

In the third component, we systematically solve the proposed mile-run problem in two steps. (1) Notably, the number of possible worlds is huge due to the massive number of users, so exploring all of them is impractical and leads to exponential complexity. To avoid exhaustive enumeration, we design a novel method from the perspective of reference locations, which reduces the ranking process from exponential to linear complexity. (2) Furthermore, as shown in Section 7, city road networks usually have large numbers of vertices and edges, which results in costly network traversals when finding the optimal candidate for massive numbers of users. To address this problem, with the help of spatial locality index techniques, we propose two solutions, one based on the network nearest facility circle (NNFC) and one based on local networks, both of which provide superior efficiency. The former focuses only on those reference locations affected by certain candidates. For the latter, no network traversal is needed for the selection of candidates. In addition, we present an alternative solution based on a network extension scheme for addressing mile-run from scratch, for the case where existing facilities and users are unknown in advance. In summary, our major contributions are outlined as follows.

  • To the best of our knowledge, this is the first effort to formalize the problem of Min-dIst Location SElection oveR User MovemeNt (mile-run). Compared to existing works, mile-run is a generalized Min-dist problem, which takes into account the movements of objects.

  • Depending on whether an index can be used, three novel solutions are developed to address the mile-run problem efficiently and accurately in different cases.

  • Comprehensive experiments are conducted on both real-world and synthetic data over two real road networks. The results demonstrate that, compared to a baseline method, our proposed solutions improve efficiency by orders of magnitude.

The rest of this paper is organized as follows. We give a brief overview of related work in the next section. In Section 3, we discuss user movement and model mobile objects using reference locations. Section 4 formally defines the mile-run problem. We propose our solutions to mile-run in Section 5. Section 6 analyzes the cost of the proposed methods. Section 7 reports the empirical study, and the last section concludes the paper. Frequently used notations and acronyms are listed in Table 1.

Section snippets

Related work

There have been increasing research efforts to address the location selection (LS) problem due to its importance in real-life applications. They are generally classified into two major categories: Min-dist and Max-inf. The former aims to find a location that, if a new facility is built there, minimizes the aggregate distance from each user to her closest facility. The latter returns an optimal location for establishing a new facility such that it attracts the largest number of users. In

Preliminaries

In this section, we first discuss the representation of mobile objects, and then introduce the kernel method, which is employed to model the movements of users in our framework.

MILE-RUN problem

In this section, we formally define the mile-run problem. Based on reference locations, mobile objects are modeled following possible world semantics. Candidate locations are then ranked by expected distance over all possible worlds. We begin by introducing some terminology that is necessary for the formal definition.

A location ℓ in this paper is a two-dimensional position on an edge in a given directed road network G(V, E), with a geographical coordinate (i.e., latitude and longitude). Each

Solutions to MILE-RUN

In light of Theorem 1 and Definition 3, a straightforward solution to the mile-run problem is to exhaustively check all candidate locations. It works as follows. For each c ∈ C, we compute its Δ(c). Then the candidate with the greatest ERD is the optimal answer. Despite the avoidance of enumerating all possible worlds for expected Min-dist, this method is still expensive due to the massive number of network traversals that occur in repeatedly finding the nearest facility of each reference

Cost analysis

In this section, we conduct a theoretical study of all the proposed methods (i.e., Straightforward, NSJ, LNB and EN). Observe that the time complexity of finding the nearest facility of a reference location is O(|V| log |V|), as the network traversal terminates once a facility vertex is found, which means only |V|+1 vertices (the 1 denotes the facility vertex) are involved in the exploration.
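The early-terminating traversal mentioned above can be sketched with Dijkstra's algorithm and a binary heap. This is a minimal illustration, assuming that the reference location has already been mapped to a vertex and that the directed network is given as an adjacency dictionary; the function name `nearest_facility` and the toy graph are hypothetical.

```python
import heapq

def nearest_facility(graph, source, facility_vertices):
    """Return (facility, distance) for the facility vertex closest to `source`."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    settled = set()
    while heap:
        d, v = heapq.heappop(heap)
        if v in settled:
            continue  # stale heap entry
        settled.add(v)
        if v in facility_vertices:
            return v, d  # the first settled facility vertex is the nearest one
        for w, weight in graph.get(v, []):
            nd = d + weight
            if nd < dist.get(w, float("inf")):
                dist[w] = nd
                heapq.heappush(heap, (nd, w))
    return None, float("inf")  # no facility is reachable from the source

# Hypothetical directed network: vertex -> list of (neighbor, edge length).
graph = {
    "a": [("b", 2.0), ("c", 5.0)],
    "b": [("c", 1.0), ("d", 4.0)],
    "c": [("e", 1.0)],
    "d": [],
    "e": [],
}
result = nearest_facility(graph, "a", {"d", "e"})
```

Here the search from vertex a stops as soon as facility e is settled at distance 4.0, so facility d (at distance 6.0) is never settled.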

The straightforward method described at the very beginning of Section 5 iteratively checks candidates based on

Performance study

In this section, we investigate the performance of our proposed mile-run solutions from a variety of aspects.

Conclusions

In this paper, we introduce a novel location selection problem, Min-dIst Location SElection oveR User MovemeNt (mile-run), which takes into account the movements of objects. Based on reference locations, which can model moving users and be captured from movement data, we present two groups of algorithms, index-based and index-free ones for solving mile-run. The first group, including NSJ and LNB, answers mile-run efficiently with carefully designed index structures, and fits the case where

Acknowledgments

The authors would like to thank the editor and anonymous reviewers for their constructive comments on this paper, and the authors of [26] for sharing their code. This work was supported by the National Natural Science Foundation of China (Nos. 61672408 and 61472298), CCF-VenustechRP2017005 and China 111 Project (No. B16037).

References (32)

  • Y. Gao et al. On efficient k-optimal-location-selection query processing in metric spaces. Inf. Sci. (2015)
  • D. Papadias et al. Query processing in spatial network databases. VLDB (2003)
  • S. Bespamyatnikh et al. Mobile facility location. DIAL-M (2000)
  • M.A. Cheema et al. Probabilistic reverse nearest neighbor queries on uncertain data. TKDE (2010)
  • Z. Chen et al. Efficient algorithms for optimal location queries in road networks. SIGMOD (2014)
  • E.W. Dijkstra. A note on two problems in connection with graphs. Numer. Math. (1959)
  • Y. Du et al. The optimal-location query. SSTD (2005)
  • P. Ghaemi et al. Optimal network location queries. SIGSPATIAL/GIS (2010)
  • A. Guttman. R-trees: a dynamic index structure for spatial searching. SIGMOD (1984)
  • J. Huang et al. Top-k most influential locations selection. CIKM (2011)
  • F. Li et al. On trip planning queries in spatial databases. SSTD (2005)
  • Z. Li et al. Mining periodic behaviors for moving objects. KDD (2010)
  • D. Papadias et al. Aggregate nearest neighbor queries in spatial databases. TODS (2005)
  • J. Qi et al. The min-dist location selection query. ICDE (2012)
  • J. Qi et al. The min-dist location selection and facility replacement queries. World Wide Web (2014)
  • S. Shang et al. Finding the most accessible locations: reverse path nearest neighbor query in road networks. SIGSPATIAL (2011)