A Privacy-Preserving Continuous Location Monitoring System for Location-Based Services

To protect users’ private locations in location-based services, various location anonymization techniques have been proposed. The most commonly used technique is spatial cloaking, which organizes users’ exact locations into cloaked regions (CRs). This satisfies the K-anonymity requirement; that is, the querier is not distinguishable among K users within the CR. However, the practicality of cloaking techniques is limited due to the lack of privacy-preserving query processing capacity, for example, providing answers to the user's spatial queries based on knowledge of the user's cloaked location rather than the exact location. This paper proposes a cloaking system model called anonymity of motion vectors (AMV) that provides anonymity for spatial queries. The proposed AMV minimizes the CR of a mobile user using motion vectors. In addition, the AMV creates a ranged search area that includes the nearest neighbor (NN) objects to the querier who issued a CR-based query. The effectiveness of the proposed AMV is demonstrated in simulated experiments.


Introduction
With advances in mobile communication technology and the widespread use of GPS-enabled mobile devices, location-based services (LBS) are growing rapidly. In LBS, mobile users issue spatial queries to LBS providers to obtain information based on their geographical locations [1][2][3][4]. Typical LBS applications include road navigation, area-specific weather forecasts, map information, automotive traffic monitoring, location-based SNS, and nearest point of interest (POI) queries [5][6][7].
Because LBS is provided for users based on their locations, users must report their location information to an LBS provider. Although LBS provides valuable services to users, revealing one's private locations to potentially untrustworthy LBS providers poses privacy concerns [8,9]. Knowledge of a person's location can be used by an adversary to physically locate and identify the person. In fact, there are many real-life scenarios in which perpetrators abuse technologies to gain access to private location information about victims [10].
Several researchers have presented the concept of a location anonymizer, a trusted third party that acts as a middle layer between the user and the LBS server [6]. The anonymizer blurs the users' exact locations into CRs using existing location anonymization algorithms and sends the CRs, rather than the exact user locations, to the LBS server. This mechanism enables users to obtain the desired LBS without revealing their exact locations to the LBS server. The K-anonymity requirement given by the querier ensures that the CR is K-anonymous (i.e., the querier is indistinguishable among K users within the CR), thus reducing the probability of the querier's location being exposed to untrustworthy parties to 1/K. Because the LBS server does not know the querier's exact location, it can only return a set of candidate answers to the CR-based query.
This paper proposes a cloaking system model called anonymity of motion vectors (AMV), which provides anonymity for spatial queries. The proposed AMV can address user queries based on the user's cloaked location information from the location anonymizer. The AMV minimizes the CR by predicting user movements based on the motion vector. The proposed method is applicable in both outdoor and indoor environments. For example, a client in a car moves along the road, and a cloaked region (CR) can be created based on the client's speed and direction. In indoor environments, the movement range of a client is generally restricted by obstacles such as walls and doors, which might invalidate the motion vectors produced by the proposed AMV. Nevertheless, the AMV can achieve effective query processing by adjusting the update interval of the vector according to the size of the indoor space (i.e., the movement range of the client). The main features of the AMV can be summarized as follows: (1) Client movements are considered when generating a CR that satisfies the user-specified K-anonymity level. (2) The user location update cost of the anonymizer is reduced because the anonymizer does not need to check the user location between two consecutive updates. (3) Compared with the previous cloaking models, namely distributed grid based continuous cloaking (DGCC) [23] (refer to Figure 1) and minimum cycle region (MCR) [24], which produces circular CRs (refer to Figure 2), the AMV generates a smaller CR, thus lowering the query processing time.
The major notations used throughout this paper are as shown in Notations.
The rest of this paper is organized as follows. Section 2 highlights related works. Section 3 describes the proposed AMV model and depicts the creation of CRs using the previous MCR and proposed AMV. Section 4 describes the creation of ER for CR-based NN queries. Section 5 presents the equations of the AMV. The experimental results are given in Section 6, and Section 7 concludes the paper.

Related Work
To address the location privacy protection of LBS users, many location anonymization techniques have been proposed. Cloaking and K-anonymity are the most commonly used forms of location anonymization [10][11][12][13][14][15][16][17][18][19][20][21][22]. Figure 1 depicts an example of distributed grid based continuous cloaking (DGCC) area creation when the required K-anonymity level is 4 (i.e., K = 4) [23]. The DGCC method protects the privacy of the query issuer by maintaining K-anonymous clients adjacent to the querier. In Figure 1(a), the CR is represented by a minimum bounding circle and contains four mobile clients (the querier and clients 2, 3, and 5) at time t. Figure 1(b) shows the 4-anonymous CR at t + 1. Note that some of the clients have moved and that the CR is enlarged to enclose the original four clients. In Figure 1(a), the radius of the CR is 2√2, and the radius of the CR in Figure 1(b) is larger. To avoid such an increase in the CR size, a cloaking algorithm that forms a CR with the K − 1 clients that are currently nearest to the querier has been proposed. However, the weakness of this approach is that an adversary can guess the actual querier, who remains constant over time while the other K − 1 clients in the CR are continuously being updated (e.g., the members change from the querier and clients 2, 3, and 5 to the querier and clients 1, 4, and 6) [23].
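The weakness described above can be illustrated with a small sketch. It assumes that the adversary observes the member set of the CR at each update; the labels `q` and `c1` to `c6` and the function name are hypothetical and chosen only for illustration.

```python
def intersect_cloaked_sets(snapshots):
    """Return the identities that appear in every observed CR snapshot."""
    survivors = set(snapshots[0])
    for members in snapshots[1:]:
        survivors &= set(members)
    return survivors

# Hypothetical labels: "q" is the querier, "c1".."c6" are other clients.
snapshot_t = {"q", "c2", "c3", "c5"}    # CR members at time t
snapshot_t1 = {"q", "c1", "c4", "c6"}   # CR members at time t + 1

suspects = intersect_cloaked_sets([snapshot_t, snapshot_t1])
print(suspects)  # only the querier appears in both snapshots
```

Because the querier is the only member common to both snapshots, K-anonymity holds for each snapshot in isolation but not across the sequence.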
The cloaking method in [12] considers both K-anonymity and l-diversity. It creates a minimum CR by first finding l buildings. After satisfying the l-diversity requirement, it extends the CR to satisfy the K-anonymity requirement (i.e., the CR contains at least K users). The New Casper proposed by Mokbel et al. [11] is a research prototype of a framework for anonymous LBS. It satisfies the four requirements of location anonymity: accuracy, quality, efficiency, and flexibility. The cloaking algorithm in [13] employs dummy generation in creating a CR. To solve the building double-counting problem, it adds a building grouping item to the index structure of the existing privacy grid approach and minimizes the CR by finding K users (i.e., K-anonymity) in the grid cells adjacent to the buildings in the CR. However, these previous schemes ignore the client's movements.
Location anonymization techniques for LBS that consider the movement information of mobile users have been suggested [24,25]. The technique in [24] addresses the anonymization of snapshot and continuous LBS queries. It creates a minimum circular CR based on the mobile user's location trajectories. However, this requires tolerance of temporal errors, as the direction and speed of user movement are not taken into account. The technique in [25] considers the user's speed and direction of movement in addition to location. It requires information about the exact location of all user nodes and calculates the querier's movements based on the querier's direction of movement and speed. There are some additional anonymization techniques that consider client movements using the Voronoi diagram, road networks, and historical location data [26][27][28][29]. However, none of these existing techniques support spatial queries with cloaked areas (e.g., CRs). Recently, a method for private NN queries has been proposed. It creates a CR for the querier and performs query processing based on the type of objects (e.g., restaurants and cafes) in the CR [30]. This method takes object types into account but does not consider K-anonymity, so it cannot be directly compared with the proposed AMV.

Proposed AMV Model
The AMV proposed in this paper considers client movements to create a minimized K-anonymous CR and supports spatial queries with cloaked locations. The system model of the proposed AMV contains three components: the centralized LBS server, the location anonymizer, and the client. The location anonymizer is a trusted third party that is placed between the client and the LBS server [6]. To obtain LBS, the querier sends its location along with the spatial query to the location anonymizer. The location anonymizer, which periodically receives location updates from mobile users, blurs the users' exact locations into CRs and sends the query, along with the CR, to the LBS server. The anonymizer creates a session ID that is valid during the entire service period for the querier. The LBS server then processes the CR-based query from the anonymizer and returns a candidate list of answers to the anonymizer. Finally, the anonymizer computes the exact query answer from the candidate list and sends it to the querier.
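The message flow among the three components can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names are hypothetical, the CR is taken as the bounding rectangle of the querier and its K − 1 nearest clients, and the server uses a generic conservative filter (keep every object whose minimum distance to the CR does not exceed the smallest maximum distance of any object to the CR) in place of the ER construction of Section 4.

```python
import math

def cloak(querier, others, k):
    """Anonymizer: bounding rectangle of the querier and its k-1 nearest clients."""
    nearest = sorted(others, key=lambda c: math.dist(c, querier))[:k - 1]
    members = [querier] + nearest
    xs = [m[0] for m in members]
    ys = [m[1] for m in members]
    return (min(xs), min(ys), max(xs), max(ys))

def mindist(p, rect):
    """Smallest distance from point p to any point of the rectangle."""
    x1, y1, x2, y2 = rect
    dx = max(x1 - p[0], 0.0, p[0] - x2)
    dy = max(y1 - p[1], 0.0, p[1] - y2)
    return math.hypot(dx, dy)

def maxdist(p, rect):
    """Largest distance from point p to the rectangle (attained at a corner)."""
    x1, y1, x2, y2 = rect
    return max(math.dist(p, v) for v in [(x1, y1), (x1, y2), (x2, y1), (x2, y2)])

def server_candidates(cr, objects):
    """LBS server: sees only the CR and returns a superset of possible NN answers."""
    bound = min(maxdist(o, cr) for o in objects)
    return [o for o in objects if mindist(o, cr) <= bound]

def refine(querier, candidates):
    """Anonymizer: knows the exact location and picks the true nearest object."""
    return min(candidates, key=lambda o: math.dist(o, querier))

querier = (5.0, 5.0)
others = [(4.0, 5.0), (6.0, 6.0), (20.0, 20.0)]
objects = [(5.0, 7.0), (0.0, 0.0), (9.0, 5.0), (30.0, 30.0)]

cr = cloak(querier, others, k=3)
candidates = server_candidates(cr, objects)
answer = refine(querier, candidates)
print(cr, candidates, answer)
```

The server never receives `querier`; only the anonymizer, which is trusted, uses the exact location in the final refinement step.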
The proposed AMV assumes that the K − 1 clients nearest to the querier are bounded in a CR. This condition is assumed in order to lower the server overhead, which increases as the CR size increases. Figure 2 shows the CR produced by the previous MCR method, which considers only the user's speed [24]. The dotted rectangle represents a minimum CR that satisfies the 4-anonymity requirement at t + 1. The solid rectangle is a modified version of the CR that encompasses all of the grid cells that are fully or partially included in the initial CR (i.e., the grid cells that are intersected by the initial CR are also enclosed). The solid rectangle is taken as the final CR. Figure 1 shows the CR at t, and the area of this CR is 25.1. The CR at t + 1 shown in Figure 2 has a larger area of 88. In Figure 2, the movement radius of each client is represented by a circle because only the speed, not the direction, is known. Figure 3 presents the CR created by the proposed AMV, which considers the speed and direction of movement using the motion vector. It is assumed that these clients are moving at the same speed as the clients in Figure 2. The 4-anonymous CR at t + 1 shown in Figure 3 has an area of 42. In other words, the AMV produces a smaller CR than the MCR.
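The difference between the two CR constructions can be sketched in a few lines. The sketch assumes that each client is a tuple `(x, y, speed, heading)` with the heading in radians, that positions are predicted linearly over one update interval `dt`, and that the grid has unit cells; the client values are made up for illustration and do not reproduce Figures 2 and 3.

```python
import math

def snap_to_grid(rect, cell=1.0):
    """Expand a rectangle outward so that it covers every intersected grid cell."""
    x1, y1, x2, y2 = rect
    return (math.floor(x1 / cell) * cell, math.floor(y1 / cell) * cell,
            math.ceil(x2 / cell) * cell, math.ceil(y2 / cell) * cell)

def area(rect):
    x1, y1, x2, y2 = rect
    return (x2 - x1) * (y2 - y1)

def speed_only_cr(clients, dt, cell=1.0):
    """MCR-style CR: a client may be anywhere within speed*dt of its position."""
    rect = (min(x - s * dt for x, y, s, h in clients),
            min(y - s * dt for x, y, s, h in clients),
            max(x + s * dt for x, y, s, h in clients),
            max(y + s * dt for x, y, s, h in clients))
    return snap_to_grid(rect, cell)

def motion_vector_cr(clients, dt, cell=1.0):
    """AMV-style CR: each client is placed at its predicted position."""
    predicted = [(x + s * dt * math.cos(h), y + s * dt * math.sin(h))
                 for x, y, s, h in clients]
    xs = [p[0] for p in predicted]
    ys = [p[1] for p in predicted]
    return snap_to_grid((min(xs), min(ys), max(xs), max(ys)), cell)

# Hypothetical clients; the first one is a stationary querier.
clients = [
    (5.0, 5.0, 0.0, 0.0),
    (3.0, 6.0, 1.0, 0.0),
    (7.0, 4.0, 1.0, math.pi / 2),
    (6.0, 8.0, 2.0, math.pi),
]
mcr_rect = speed_only_cr(clients, dt=1.0)
amv_rect = motion_vector_cr(clients, dt=1.0)
print(area(mcr_rect), area(amv_rect))
```

Because every predicted position lies inside the corresponding movement circle, the motion-vector rectangle is always contained in the speed-only rectangle, which is the geometric reason the AMV cannot produce a larger CR than the MCR under these assumptions.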
Compared to the existing technologies, the proposed AMV involves the extra cost of maintaining the speed and direction information of the moving users. However, the user can provide this motion information when submitting the location and K-anonymity parameter information to the LBS server. Thus, the overhead associated with this additional motion information is not significant. Assume that 1 byte of storage is required for the representation of client speed in the AMV and another 1 byte for the representation of client direction. Because the coordinate information of an object has both an x-axis and a y-axis component, two coordinate values are stored per object (GPS coordinate information generally uses 4 bytes for the x-axis and 4 bytes for the y-axis; for detailed GPS measurements, the data size on the x- and y-axes may increase). The size of an object is typically assumed to be from 128 bytes to 1,024 bytes [31]. When the number of clients is 100, the size of the data stored in the server is 1,000 bytes in the AMV, 900 bytes in the MCR, and 800 bytes in the DGCC. The DGCC yields the smallest data size, but it has the drawback that its data size grows multiplicatively whenever an update is made. Note that the proposed work is not intended to reduce the data size maintained in the clients. This work aims to minimize the data to be delivered by the server by discovering ER and EN in response to the client query. In other words, the proposed method intends to address the problems caused by the growth of CR sizes. For example, the CR of the MCR is larger than the CR of the AMV, which can lead to increased ER and EN (see Sections 3 and 4). Because the EN of the MCR is greater than the EN of the AMV, the MCR incurs heavier server overhead than the AMV. The client's mobility pattern follows the widely used random waypoint mobility model [32].
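The storage figures above follow from simple per-client arithmetic. The sketch below assumes, consistent with the descriptions of the three methods, that the AMV stores coordinates, speed, and direction; the MCR stores coordinates and speed; and the DGCC stores coordinates only. The constant and function names are illustrative.

```python
COORD_BYTES = 4 + 4      # x and y coordinates, 4 bytes each
SPEED_BYTES = 1          # client speed
DIRECTION_BYTES = 1      # client direction

def stored_bytes(num_clients, with_speed, with_direction):
    """Server-side storage for the location records of num_clients clients."""
    per_client = COORD_BYTES
    if with_speed:
        per_client += SPEED_BYTES
    if with_direction:
        per_client += DIRECTION_BYTES
    return num_clients * per_client

amv = stored_bytes(100, with_speed=True, with_direction=True)     # 1000
mcr = stored_bytes(100, with_speed=True, with_direction=False)    # 900
dgcc = stored_bytes(100, with_speed=False, with_direction=False)  # 800
print(amv, mcr, dgcc)
```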

Finding ER in NN Queries
This section illustrates how to create ER for a CR-based NN query.
In Figure 4, it is assumed that there are no objects inside the CR. The CR of the AMV described in Section 3 is used to find ER. The solid rectangle in Figure 4 represents the CR of a mobile user who issued a query, and the dotted rectangle denotes a search area where the nearest object to the CR is found. To create an ER, one must first check the CR and then spread out to the neighboring grid cells around the CR one by one, until the nearest neighbor object is found. Here, the nearest object found is object 3. Then, the top-left vertex of the CR, which is the vertex farthest from object 3, is connected to object 3, creating a line segment. This vertex becomes the center of a circle whose radius is the distance between the vertex and object 3; this distance is the max distance (maxdist). The grid cells within this circle and those intersected by this circle become ER (the shaded area in Figure 4). If more than one object is nearest to the CR, that is, two objects are equidistant from the CR, the object with the lower index is prioritized.
Algorithm 1 describes the primary identification (filtering) of ER depicted in Figure 4.
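A minimal sketch of the filtering step is given below. It is not the paper's Algorithm 1 listing: the ring-by-ring expansion over grid cells is replaced by a direct minimum over all objects, the grid is assumed to have unit cells indexed by their lower-left corner, and all function names are hypothetical.

```python
import math

def rect_corners(rect):
    x1, y1, x2, y2 = rect
    return [(x1, y1), (x1, y2), (x2, y1), (x2, y2)]

def mindist_to_rect(p, rect):
    """Smallest distance from point p to any point of the rectangle."""
    x1, y1, x2, y2 = rect
    dx = max(x1 - p[0], 0.0, p[0] - x2)
    dy = max(y1 - p[1], 0.0, p[1] - y2)
    return math.hypot(dx, dy)

def cells_touching_circle(center, radius, cell=1.0):
    """Grid cells (i, j) that lie inside or are intersected by the circle."""
    cx, cy = center
    cells = set()
    for i in range(math.floor((cx - radius) / cell), math.floor((cx + radius) / cell) + 1):
        for j in range(math.floor((cy - radius) / cell), math.floor((cy + radius) / cell) + 1):
            box = (i * cell, j * cell, (i + 1) * cell, (j + 1) * cell)
            if mindist_to_rect(center, box) <= radius:
                cells.add((i, j))
    return cells

def primary_er(cr, objects, cell=1.0):
    """Filtering step: circle around the CR vertex farthest from the nearest object."""
    # min() keeps the first of several equidistant objects (lower index wins).
    nearest = min(objects, key=lambda o: mindist_to_rect(o, cr))
    anchor = max(rect_corners(cr), key=lambda v: math.dist(v, nearest))
    maxdist = math.dist(anchor, nearest)
    return nearest, anchor, maxdist, cells_touching_circle(anchor, maxdist, cell)

cr = (2.0, 2.0, 4.0, 4.0)                 # hypothetical CR
objects = [(6.0, 1.5), (10.0, 10.0)]      # hypothetical objects outside the CR
nearest, anchor, maxdist, er_cells = primary_er(cr, objects)
print(nearest, anchor, round(maxdist, 3), len(er_cells))
```

In this example the nearest object lies to the lower right of the CR, so the farthest vertex is the top-left one, matching the situation described for Figure 4.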
The initial ER returned by Algorithm 1 is insufficient to answer the user's exact NN query. In Figure 4, there is an object other than object 3 that is nearer to the querier; that is, the distance from the querier to object 3 is greater than the distance from the querier to object 4. Algorithm 1 therefore provides an inaccurate answer, because the actual nearest object to the querier is object 4. To resolve this problem, a secondary (refined) ER creation is suggested. Figure 5 depicts the secondary creation of ER. In addition to the ER found in the primary creation, other regions that can include the answers (i.e., the NN objects) to the CR-based query are located. The CR vertex that is farthest from object 3 and the CR vertex that is nearest to object 3 are excluded. Note that the distance between the nearest vertex and object 3 is denoted by min distance (mindist). The distances from object 3 to the other two vertices are denoted by minmax distance (minmaxdist). Of these two vertices, the one that is farther from object 3 defines the larger minmaxdist, and the other one defines the smaller minmaxdist. An ER is generated for each of the two minmaxdist vertices in the same way that the primary ER is generated.
The final ER (the shaded area in Figure 5) includes the grid cells in the ERs of both minmaxdist vertices, as well as those in the original ER. With this final ER, the exact answer to the user's NN query can be found.
Algorithm 2 describes the secondary creation (refinement) of ER depicted in Figure 5.
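The effect of the refinement can be sketched as follows. This is an illustration rather than the paper's Algorithm 2 listing: each ER is kept as a plain circle instead of being expanded to grid cells, and the function names and coordinates are hypothetical.

```python
import math

def corners(rect):
    x1, y1, x2, y2 = rect
    return [(x1, y1), (x1, y2), (x2, y1), (x2, y2)]

def mindist_to_rect(p, rect):
    x1, y1, x2, y2 = rect
    dx = max(x1 - p[0], 0.0, p[0] - x2)
    dy = max(y1 - p[1], 0.0, p[1] - y2)
    return math.hypot(dx, dy)

def er_circles(cr, objects, refine=True):
    """Search circles as (center, radius) pairs around selected CR vertices."""
    nearest = min(objects, key=lambda o: mindist_to_rect(o, cr))
    # Vertices ordered from farthest to nearest with respect to the nearest object:
    # [maxdist vertex, larger minmaxdist, smaller minmaxdist, mindist vertex].
    ranked = sorted(corners(cr), key=lambda v: math.dist(v, nearest), reverse=True)
    centers = ranked[:3] if refine else ranked[:1]   # the mindist vertex is skipped
    return [(v, math.dist(v, nearest)) for v in centers]

def candidates(circles, objects):
    """Objects that fall inside at least one search circle."""
    return [o for o in objects if any(math.dist(o, c) <= r for c, r in circles)]

cr = (2.0, 2.0, 4.0, 4.0)
objects = [(6.0, 1.5), (2.0, -1.5), (20.0, 20.0)]

primary = candidates(er_circles(cr, objects, refine=False), objects)
refined = candidates(er_circles(cr, objects, refine=True), objects)
print(primary, refined)
```

Here the object at (2.0, -1.5) lies outside the primary circle yet is the true nearest object for a querier standing at the CR vertex (2.0, 2.0); only the refined search area returns it as a candidate.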
In the event that objects exist inside the CR, the creation of ER does not differ, and Algorithms 1 and 2 can be applied in the same manner. This is depicted in Figure 6.

AMV Equations
This section defines the proposed AMV and the previous MCR using equations. dist(a, b) denotes the distance between a and b, and an arrow over a symbol denotes a vector. Equations (1) and (2) express how to calculate the CR of a moving user in the MCR and AMV, respectively. The DGCC does not consider (estimated) client movements, so it does not create a CR for the user who issued a query. In the DGCC, the location of the querier is represented by coordinates; because the DGCC provides K-anonymity, the coordinates of the querier are protected. $(x_1, y_1)$ and $(x_2, y_2)$ denote coordinates, with the associated conditions $x_1 \ge x_2$ and $y_1 \ge y_2$. For the AMV, the CR is the product of the extents along the two axes: AMV's CR = $\vec{x} \times \vec{y}$. In the computation of the CR that satisfies the K-anonymity requirement in the AMV (K-AMV), where $\vec{v}_1 \ge \vec{v}_2$, $\vec{x}_{\min}$ denotes the smallest value on the x-axis of vector $\vec{v}_2$, and $\vec{x}_{\max}$ denotes the largest value on the x-axis of $\vec{v}_1$. On the y-axis, the same conditions as those on the x-axis are applied.
In comparison with the AMV, the circular CR of the MCR is transformed into a square CR. The CR represented by a square is 1.17 times larger than the circular CR. If there are grid cells that intersect the CR, the CR is extended to enclose them. This decreases the error rate involved in CR-based queries. The CR of the MCR that satisfies the K-anonymity requirement (K-MCR) is computed with the same quantities as K-AMV: $\vec{x}_{\min}$ denotes the smallest value along the x-axis of $\vec{v}_2$, where $\vec{v}_1 \ge \vec{v}_2$, and $\vec{x}_{\max}$ denotes the largest value along the x-axis of $\vec{v}_1$. These x-axis conditions are applied to the y-axis as well.
To express the K-DGCC of the DGCC as an equation, the following definitions are made. Given the set of all clients, the K-anonymous clients that satisfy the DGCC requirements form a subset of K clients. In K-DGCC, the min values along the x- and y-axes denote the clients that are located close to the querier, whereas the max values denote those far from the querier. Only the clients that fall within the bounds $\{x_{\min}, x_{\max}, y_{\min}, y_{\max}\}$ can be included in the CR of K-DGCC.
The case of having objects inside the CR ($o \in \mathrm{CR}$) and the case of having no object inside the CR ($o \notin \mathrm{CR}$) can be distinguished by developing the equations associated with CR-based NN query processing. Let $o$ be the nearest object, and let $v_1$, $v_2$, $v_3$, and $v_4$ be the four vertices of the K-anonymous CR. maxdist denotes the distance from $o$ to the farthest vertex, and mindist denotes the distance from $o$ to the nearest vertex. The distances from $o$ to the other two vertices are denoted by minmaxdist: $\mathrm{minmaxdist}_1$ belongs to the vertex farther from $o$, and $\mathrm{minmaxdist}_2$ to the other one. When $\mathrm{dist}(o, v_1) > \mathrm{dist}(o, v_2) > \mathrm{dist}(o, v_3) > \mathrm{dist}(o, v_4)$, the distance of each of the CR vertices to $o$ is expressed as follows: $\mathrm{maxdist} = \mathrm{dist}(o, v_1)$, $\mathrm{minmaxdist}_1 = \mathrm{dist}(o, v_2)$, $\mathrm{minmaxdist}_2 = \mathrm{dist}(o, v_3)$, and $\mathrm{mindist} = \mathrm{dist}(o, v_4)$. With the defined distances, the search area for the NN object that answers the CR-based query can be defined as follows: Search area of maxdist = $(\mathrm{maxdist})^2 \times \pi$. (8) The search area of mindist is included in that of maxdist, so its calculation can be omitted.
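As a worked instance of these definitions, the sketch below ranks the four vertex distances for one CR and one object and evaluates the circular search area for maxdist. The CR, the object position, and the function names are assumptions chosen so that the distances come out as round numbers.

```python
import math

def vertex_distances(cr, obj):
    """Classify the distances from obj to the four CR vertices."""
    x1, y1, x2, y2 = cr
    vertices = [(x1, y1), (x1, y2), (x2, y1), (x2, y2)]
    d = sorted((math.dist(v, obj) for v in vertices), reverse=True)
    return {"maxdist": d[0], "minmaxdist": (d[1], d[2]), "mindist": d[3]}

def search_area(radius):
    """Area of the circular search region with the given radius."""
    return math.pi * radius ** 2

cr = (0.0, 0.0, 3.0, 4.0)   # hypothetical CR
obj = (6.0, 8.0)            # hypothetical nearest object

d = vertex_distances(cr, obj)
print(d["maxdist"], d["mindist"])          # distances to the farthest and nearest vertex
print(search_area(d["maxdist"]))           # search area of maxdist
print([search_area(r) for r in d["minmaxdist"]])
```

For this configuration the farthest vertex is (0, 0) at distance 10 and the nearest is (3, 4) at distance 5, so the search area of maxdist is 100π.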
The search areas of the two minmaxdist vertices may contain objects that are nearer to the CR but are not included in the search area of maxdist. Hence, the search area of minmaxdist (the generic term for the two minmaxdist values) should also be created in the same way, as a circle whose radius is the corresponding minmaxdist: Search area of minmaxdist = $(\mathrm{minmaxdist})^2 \times \pi$. (9) The final search area ER for the NN query includes the search areas of (8) and (9): all of the grid cells enclosed by the computed search areas and those intersected by the search areas are included in the final search area.
In NN queries, the nearest objects satisfying the query can be both inside and outside the CR. In this case, the final search area is calculated separately for the inside objects and for the outside objects using the equations presented in this section. The results are then combined, that is, inside CR{(8) ∪ (9)} ∪ outside CR{(8) ∪ (9)}.

Experimental Evaluation
Through simulated experiments, the performances of the proposed AMV and the previous MCR were measured for comparison. These experiments were performed on a computer with a 2.9 GHz processor and 4 GB of memory, and the simulation was implemented in C++. It was assumed that the querier was stationary and that K − 1 clients were moving. The side length of a single grid cell was set to 10 m. The MCR to which the AMV was compared created a minimum CR that satisfies the specified K-anonymity requirement (i.e., the stationary querier and K − 1 clients are bounded in a CR). The experiment results are the average of the values obtained by executing the experiments 100,000 times. Table 1 summarizes the parameter settings of the experimental environment. Figure 7 shows how the size of the CR varies according to the K-anonymity level.
In the experiments, the movement radius of the client was set to 10 grid cells, and the update interval of the anonymizer was 3 seconds. The CR of the AMV is 30.9% smaller than that of the MCR and 55.6% smaller than that of the DGCC. In K-DGCC, only the clients that fall within the min and max bounds along the x- and y-axes (see Section 5) can be included in the CR, so the slope of the K-DGCC curve increases significantly as K increases. Whereas the MCR considers only the speed of user movements, the AMV considers both the direction and speed using the motion vector, thus producing a smaller CR. Figure 8 shows the size of CRs with respect to the movement radius of the client. In the experiments, the K-anonymity level was set to 10 (K = 10), and the anonymizer update interval was 3 and 5 seconds. On average, the CR of the AMV is 49.2% smaller than that of the MCR and 67.6% smaller than that of the DGCC. In Figure 8, the CR size increases for the AMV, DGCC, and MCR as the client's movement radius increases, but the rate of increase of the AMV is less steep than that of the MCR and DGCC. When the update interval is 5 seconds, the DGCC has the largest CR. The reason is that the distance between the querier and the clients satisfying K increases as the movement radius of the client and the update time increase. Figure 9 depicts the size of CRs with regard to the location anonymizer's update interval. The location of the moving client at the end of the interval is predicted based on the speed and direction of the client's motion vector, and the CR is created by considering the predicted client location. In the experiments, the K-anonymity level was 10, and the client's movement radius was 10 and 30 grid cells. On average, the CR of the AMV is 42.7% smaller than that of the MCR and 65.2% smaller than that of the DGCC.
Figures 10 through 14 examine the search area ER for the kNN objects satisfying the CR-based query and the number of objects EN in ER. In the experiments, the number of clients was set to 100,000, and the number of objects was 50,000. The k of the kNN query was 10, and the anonymizer update interval was 3 seconds. The client's movement radius was 10 grid cells. The experiment results of a 10NN query (i.e., k = 10) are presented in Figures 10 through 13, and Figure 14 shows the result of a kNN query (i.e., k is a variable). Figure 10 shows ER for the kNN objects that satisfy the CR-based 10NN query.
The size of ER decreases as the number of objects increases from 30,000 to 50,000, 70,000, and finally 100,000. This occurs because the objects satisfying the 10NN query are located nearer to the CR when the objects are more densely populated. The ER of the AMV is 46.7% smaller than that of the MCR and 53.7% smaller than that of the DGCC. Figure 11 presents EN of the 10NN query, which varies according to the K-anonymity level. The EN of the AMV is 24.2% smaller than that of the MCR and 41.3% smaller than that of the DGCC. As demonstrated above, the AMV yields a smaller CR than the MCR and DGCC, which decreases the size of ER. Naturally, a smaller ER encloses fewer objects. Figure 12 shows EN of the 10NN query with respect to the number of clients, varying from 50,000 to 200,000. EN decreases as the number of clients increases because the CR size decreases when the clients are more densely populated. A smaller CR leads to a smaller ER, which, in turn, decreases EN. On average, the EN of the AMV is 42.0% smaller than that of the MCR and 51.7% smaller than that of the DGCC. Figure 13 shows EN in connection with the client's movement radius. The EN of the AMV is 55.4% smaller than that of the MCR and 71.6% smaller than that of the DGCC. The gap between the AMV, DGCC, and MCR graphs grows as the client's movement radius increases. Compared with the AMV, the CR sizes of the MCR and DGCC increase more rapidly as the client's movement radius increases, and a larger CR leads to a larger ER (and a larger EN). Figure 14 presents EN of the CR-based kNN query with respect to varying k, from 3 to 5 to 10 to 20. The MCR graph increases more rapidly than the AMV graph. When k = 3, there is a high likelihood that the requested objects can be found inside the CR, thus minimizing ER. When k = 20, the objects outside the CR need to be searched, which increases both ER and EN. On average, the EN of the AMV is 45.8% smaller than that of the MCR and 62.1% smaller than that of the DGCC.

Conclusion
This paper presents a cloaking system model that enables mobile users to make spatial queries without revealing their exact location information. The proposed AMV generates a minimum K-anonymous CR by predicting the future location of the moving client based on the motion vector. This enables the AMV to generate a smaller CR than the previous cloaking methods. In addition, the AMV creates a search area ER for the nearest neighbor objects that satisfy the NN query with the K-anonymized region. The ER creation algorithms are performed in two phases: filtering, which finds the initial ER, and refinement, which supplements the initial ER. By minimizing ER for the CR-based NN query, the AMV allows the location anonymizer to return the exact query answers (i.e., the NN objects) to the querier. In the future, the performance of the AMV can be further examined through experiments in which the querier is moving rather than remaining stationary.

Notations
CR: A cloaked region
o: The nearest object
ER: A range search area for the nearest objects satisfying the spatial query
EN: The number of objects in ER
dist(a, b): Distance between a and b
MBR: A minimum bounding rectangle
c_i: The i-th client
o_i: The i-th object
K-anonymity: K neighboring users to the querier
kNN: k nearest objects to the querier