Privacy-Enhancing k-Nearest Neighbors Search over Mobile Social Networks

Focusing on the diversified demands of location privacy in mobile social networks (MSNs), we propose a privacy-enhancing k-nearest neighbors search scheme over MSNs. First, we construct a dual-server architecture that incorporates location privacy and fine-grained access control. Under the above architecture, we design a lightweight location encryption algorithm to achieve a minimal cost to the user. We also propose a location re-encryption protocol and an encrypted location search protocol based on secure multi-party computation and homomorphic encryption mechanism, which achieve accurate and secure k-nearest friends retrieval. Moreover, to satisfy fine-grained access control requirements, we propose a dynamic friends management mechanism based on public-key broadcast encryption. It enables users to grant/revoke others’ search right without updating their friends’ keys, realizing constant-time authentication. Security analysis shows that the proposed scheme satisfies adaptive L-semantic security and revocation security under a random oracle model. In terms of performance, compared with the related works with single server architecture, the proposed scheme reduces the leakage of the location information, search pattern and the user–server communication cost. Our results show that a decentralized and end-to-end encrypted k-nearest neighbors search over MSNs is not only possible in theory, but also feasible in real-world MSNs collaboration deployment with resource-constrained mobile devices and highly iterative location update demands.


Motivation
With the rapid development of 5G wireless communication, mobile social networks (MSNs), represented by instant messaging and location sharing, have become essential parts of people's everyday lives. According to [1], the number of enrolled MSN users worldwide reached 862 million in 2020, and it is estimated to exceed 900 million by the end of 2021. In particular, the utilization rate of location-based MSN services, which rely on the positioning systems (e.g., GPS, WiFi and Bluetooth) embedded in mobile devices, reaches 96.9%; examples include Facebook's "Nearby friends", Foursquare's "Swarm" and Joyrun's "real-time running competition". In these services, users can broadcast their locations among friends and send location-based queries for nearby friends. Location-based services therefore provide a mobile interface to users' real-life social networks.
Nevertheless, people use the enormously popular MSN services without being aware of the privacy risks: MSN service providers can observe and accumulate the geo-locations that users transmit through the network. According to the Mobile APP Security Research [2], 35 of the 50 MSN services surveyed deliberately leak users' location data to advertisers or data analysis services without permission. In recent years, many studies have also used data analysis and machine learning to extract sensitive information from users' location data over MSNs: by analyzing search patterns, matched friends, search similarities and so forth, it is easy to predict the location conversion patterns between users and their friends [3,4]. It has also turned out that Facebook's historical spatiotemporal trajectories leak the geographical distance between each user and the spots that he frequently queries, from which the service provider can learn the access probability, that is, whether and when the user will check in next [5].
To prevent unauthorized service providers and hackers from accessing users' locations and search patterns, previous research aimed to encrypt location information before uploading it. However, traditional encryption methods limit the MSN service provider's ability to offer location-based services. To achieve privacy-preserving location-based queries, straightforward approaches deploy private information retrieval [6], searchable encryption [7] and other cryptographic primitives to make the encrypted location data searchable, but these methods come at a huge cost in computation and communication overhead. Moreover, in a privacy-preserving setting, users have their keys embedded in their mobile devices. If a user is allowed to share his/her location encryption key with friends, he/she needs to launch multiple search requests to the platform to retrieve his/her friends' locations. In addition, when granting or revoking friends' search rights, a user and his/her friends must update their keys locally and synchronously, which is not suitable for MSN platforms with high extensibility requirements. To the best of our knowledge, achieving fine-grained location access control while providing an efficient and secure location-based neighbors search service remains an open and challenging research topic in the field of privacy-enhancing MSNs.
In this work, we translate the above issues and the location privacy demands of MSNs into technical requirements and design a privacy-enhancing k-nearest neighbors search scheme whose cryptographic protocols meet them. The purpose of this work is to protect the privacy of users' location data and search patterns while allowing users to query for their k-nearest neighbors based on current distances. In terms of technical contribution, our work presents an efficient construction in which the server can compute and sort the encrypted distances between a user and his/her friends without any decryption operation; it is the first to tackle this problem through the lens of secure multi-party computation. It also achieves lightweight friend authentication and authority management by enabling users to grant/revoke their friends' search rights without updating others' keys. In terms of security, our scheme satisfies adaptive L-semantic security and revocation security in the random oracle model. We also conducted extensive experiments to validate our work, showing that the proposed scheme is possible in theory and feasible in practice.

MSNs Privacy
In recent decades, researchers have proposed many privacy-preserving approaches for MSNs. Encryption is the most common method for achieving privacy. For example, Flybynight [8] is a Facebook application that encrypts and decrypts sensitive data using client-side JavaScript. However, it is vulnerable to attack because the server holds users' keys and takes charge of key management. NOYB (short for none of your business) [9] offers privacy and preserves the functionality of MSN services based on a secret dictionary's encryption. Besides, many privacy-preserving matching solutions over MSNs have been proposed using different techniques. Some schemes are based on private set intersection protocols [10,11], which allow two users to privately compute the intersection of their private profile sets without leaking useful information about either party. For example, Niu et al. designed a spatiotemporal matching scheme for privacy-aware users in MSNs based on the profile's weight or level and the participant's social strength [10]. Zhang et al. proposed a fine-grained private information matching protocol that assigns a preference to each profile and uses a similarity function to measure the matching degree [11]. To reduce the computation cost, some works [12,13] designed non-encryption-based privacy-preserving matching protocols. Fu et al. proposed a privacy-preserving common-friend matching scheme based on a Bloom filter [12]. It transformed the common profiles of two users into an intersection of Bloom filters, which ensures the privacy of friend lists against unknown users. However, it cannot resist brute-force attacks, resulting in the leakage of private information. Sun et al. [13] proposed a privacy-preserving spatiotemporal profile matching scheme that lets each user periodically record his locations as a geographic cell index among a large set of predefined ones, which ensures spatiotemporal privacy at the cost of possibly huge communication and computation overhead.

Location Privacy
With the rapid development and enormous popularity of location-based services, scholars have paid increasing attention to the privacy and security of location data. Many approaches focus on how to perform privacy-preserving location queries. Bamba et al. proposed a k-anonymity-based scheme that relies on a server to construct an anonymous set from users' original queries, making queries indistinguishable on the server side [14]. Bordenabe et al. [15] and Shi et al. [16] both integrated differential privacy to realize nearby-friend queries. Differential privacy provides a rigorous privacy guarantee by adding noise (randomly choosing a set of fake locations) to make the data and queries differentially private. Jorgensen et al. incorporated a clustering procedure that groups users according to the social network's natural community structure and significantly reduced the noise [17]. The above works [14-17] can achieve relatively high efficiency. However, it is challenging for them to achieve provable security guarantees under formal security definitions, since they do not employ well-designed and provably secure encryption methods. Zhou et al. took advantage of private information retrieval (PIR) to realize nearby-friend queries [18]. It provides strong cryptographic guarantees but needs complex operations, and it protects only query privacy, not location privacy. Li et al. designed a private location information matching protocol over MSNs based on inner product similarity (IPS) [19], which places users' map locations into vectors and encrypts the vectors; a similarity function is then used to measure the similarity of the encrypted vectors of different users. Schlegel et al. designed an encryption method with a dynamic location grid index structure [20], achieving neighbor point search without revealing location privacy to the third party. In the above encryption-based schemes, the computation efficiency is not ideal, since they require a logarithmic number of interaction rounds between the user and the server.
Many other works are based on higher security assumptions to achieve a trade-off between security and efficiency. For example, some works [21,22] assumed that the service provider is honest and that it has the authority to access the location plaintext without leaking any information to others. Some works [21,23,24] introduced a trusted third party (TTP) to achieve the trade-off between security and efficiency. Unfortunately, there may not exist such a TTP in real MSNs scenarios. Some non-TTP solutions [15,20] are based on approximate measurements (e.g., linear programming and dynamic grid) with no accurate result. Some works [18,25] need complex operations (e.g., sending fake queries or receiving redundant results) to achieve secure guarantees, which incur high communication and computation overhead at the user-side, making them unsuitable for resource-constrained mobile devices.

Overview
The privacy-enhancing k-nearest neighbors search scheme over MSNs can be viewed as a decentralized system of end-to-end encrypted social network databases, focusing on the diversified demands of location privacy in MSNs. Our design relies on various cryptographic building blocks, including pseudo-random function, homomorphic crypto mechanism, secure multi-party computation and broadcast encryption.
- Aiming at the limited computation power of resource-constrained mobile devices, we design a lightweight end-to-end location encryption algorithm and a server-aided location re-encryption protocol based on Paillier homomorphic encryption to support further location sharing. The protocol allows the service provider to convert friends' location ciphertexts into the query user's homomorphic ciphertexts without requiring the friends to be online to participate in the calculation.
- We build a secure dual-server architecture and, under this architecture, design a secure k-nearest neighbors search protocol using secure multi-party computation and a homomorphic encryption mechanism. The server can compute and sort the distances between users and their friends without any decryption operation. Compared with the cloud-center model, where a single server holds complete knowledge, the dual-server architecture minimizes the leakage to the servers and reduces the cost of communication between the mobile user and the server.
- To achieve fine-grained access control, we design a dynamic friends management mechanism based on public-key broadcast encryption. It enables users to grant/revoke their friends' search rights without updating others' keys, achieving lightweight friend authentication and authority management. Moreover, this mechanism satisfies revocation security: an adversary cannot obtain the user's location information by colluding with the server and the revoked friends, which further improves the scheme's overall security.
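To illustrate why an additively homomorphic scheme such as Paillier lets a server work with distances on ciphertexts, the following toy sketch may help. It is not the paper's protocol: the tiny primes, the function names and the simplification that one location `x_s` is known in plaintext are assumptions made purely for illustration (in the scheme, both locations stay encrypted and the two servers cooperate). The sketch also assumes that the squared distance is obtained from the expansion $(x_s - x_j)^2 = x_s^2 - 2 x_s x_j + x_j^2$.

```python
import math
import random

def keygen(p: int, q: int):
    """Toy Paillier key generation from two small primes (illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # valid because the generator is fixed to g = n + 1
    return n, (lam, mu)

def encrypt(n: int, m: int) -> int:
    """Enc(m) = (n+1)^m * r^n mod n^2 for a random r coprime to n."""
    n2 = n * n
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def decrypt(n: int, sk, c: int) -> int:
    lam, mu = sk
    return ((pow(c, lam, n * n) - 1) // n * mu) % n

def add(n: int, c1: int, c2: int) -> int:
    """Homomorphic addition: Enc(a) * Enc(b) = Enc(a + b)."""
    return (c1 * c2) % (n * n)

def scale(n: int, c: int, k: int) -> int:
    """Homomorphic scalar multiplication: Enc(a)^k = Enc(k * a)."""
    return pow(c, k % n, n * n)

def encrypted_sq_distance(n: int, x_s: int, c_xj: int, c_xj2: int) -> int:
    """Combine Enc(x_j) and Enc(x_j^2) with a known x_s into Enc((x_s - x_j)^2)."""
    c_xs2 = encrypt(n, x_s * x_s)
    cross = scale(n, c_xj, -2 * x_s)
    return add(n, add(n, c_xs2, cross), c_xj2)

n, sk = keygen(1009, 1013)
x_s, x_j = 120, 95
c = encrypted_sq_distance(n, x_s, encrypt(n, x_j), encrypt(n, x_j * x_j))
print(decrypt(n, sk, c))  # 625
```

Only the key holder can decrypt the result; the party that combines the ciphertexts never sees `x_j` or the distance. Real deployments would use primes of cryptographic size.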

Architecture and Syntax
Our scheme is designed to be executed among three kinds of parties: $U$, $S_1$ and $S_2$. $U = \{U_1, \ldots, U_n\}$ is a set of $n$ mobile users. Each user $U_i \in U$ can dynamically connect with others as his friends. $S_1$ is the primary server that provides a mobile social network service to all users in $U$. Each user $U_i \in U$ can send a search request to $S_1$ for the k-nearest neighbors among his friends based on the current location. $S_2$ is a collaborating server that conducts secure computation with $S_1$ for the k-nearest neighbors search.
The scheme's architecture is shown in Figure 1. At a high level, users' information and their relationships are modeled by a directed graph structure $G$. To initialize the system, the primary server $S_1$ executes the Initial algorithm to output the public parameters params and an empty $G$. Each user $U_i$ uses params to generate his symmetric key $K_i$ and public/secret keys $(PK_i, SK_i)$ locally by executing the KeyGen algorithm, and interacts with $S_1$ for registration through the Join protocol. Any enrolled user $U_i \in U$ can grant/revoke the location search right of $U_j \in U$ by interacting with $S_1$ in the Grant/Revoke protocols. $U_i$ holds a friends index $F_i$ that records his granted friends. Following the location service architecture of real-life MSNs, we deploy a trusted location infrastructure that provides a tracing service by periodically sending the current location $l_i$ of each user $U_i \in U$ to his local mobile device. $U_i$ executes LocUpdate to encrypt his location $l_i$ locally under his symmetric key $K_i$ and uploads the location ciphertext $C_i$ to $S_1$. $U_i$ can then execute the Search protocol with $S_1$ by sending a k-nearest neighbors search request. $S_1$ performs the encrypted search in $G$ with the assistance of $S_2$ and returns the search result to $U_i$, without relying on the presence of any other user. The proposed scheme's syntax consists of seven polynomial-time algorithms and protocols: Initial, KeyGen, Join, LocUpdate, Grant, Revoke and Search.

Definition 1 (Correctness). Correctness implies that, for all $1^k$, all $(G, params)$ generated by $\mathrm{Initial}(1^k)$, all $(K_i, PK_i, SK_i)$ generated by $\mathrm{KeyGen}(params)$, all $(F_i; G)$ generated by $\mathrm{Join}(U_i(id_i, PK_i); S_1(G))$, and all sequences of LocUpdate, Grant and Revoke protocols, $\mathrm{Search}(U_i(K_i, F_i); S_1(G); S_2(SK_i))$ always outputs a result $R_k$ such that $R_k$ satisfies $D_1 < \cdots < D_k$, and there does not exist $U_i \in R_k$ such that

The security definition of adaptive L-semantic security is formalized by an ideal/real-world paradigm [7]. Roughly speaking, we require that the execution of the scheme in the real world is indistinguishable from that in an ideal world. In the real world $\mathrm{Real}(1^k)$, the protocols between the adversarial servers and the user execute just as in the real scheme. In the ideal world $\mathrm{Ideal}(1^k)$, there exist two simulators $Sim_1$ and $Sim_2$ that obtain the leakage information from the leakage functions and try to simulate the execution of $A_1$ and $A_2$ in $\mathrm{Real}(1^k)$.

Definition 2 (Adaptive L-Semantic Security). Given the syntax in Section 3.2, consider the following probabilistic paradigms, where $U = \{U_1, \ldots, U_n\}$ is the set of users, $A_1$ and $A_2$ are two non-colluding adversaries with probabilistic polynomial time (PPT) computation ability, $Sim_1$ and $Sim_2$ are two PPT simulators, and $L_1$ to $L_4$ are leakage functions.

$\mathrm{Real}(1^k)$: It is run among $A_1$, $A_2$ and $U$ using the real scheme.
- $A_1$ initializes an empty graph structure $G$ by $(G, params) \leftarrow \mathrm{Initial}(1^k)$;
- Every $U_i \in U$ computes $(K_i, PK_i, SK_i) \leftarrow \mathrm{KeyGen}(params)$;
- Every $U_i \in U$ interacts with $A_1$ and $A_2$ to execute the LocUpdate, Grant and Revoke protocols in any order;
- $U$ sends a polynomial number of queries $(q_1, \ldots, q_t)$ to $A_1$;
- For each $q_i$:

$\mathrm{Ideal}(1^k)$: It is run by $Sim_1$, $Sim_2$ and a challenger $C$.
- $A_1$ initializes an empty simulated $G$ and updates it by $L_1$;
- $Sim_1$ and $Sim_2$ simulate the LocUpdate, Grant and Revoke protocols in any order by $L_2$;
- $C$ sends a polynomial number of queries $(q_1, \ldots, q_t)$ to $Sim_1$;
- For each $q_i$:

The proposed scheme achieves adaptive L-semantic security if, for all polynomial-time $A_1$ and $A_2$, there exist polynomial-time simulators $Sim_1$ and $Sim_2$ such that the distribution ensembles of $\mathrm{Real}(1^k)$ and $\mathrm{Ideal}(1^k)$ are computationally indistinguishable.

Revocation Security
Revocation security guarantees that a user's revoked friend cannot issue a valid search for the user's location, even if an adversary illegally steals the revoked friend's key. We construct the experiment $\mathrm{Exp}^{\mathrm{Revoke}}_{A_{rev}}(1^k)$ to formalize the revocation security definition. The experiment is executed interactively by a challenger $C$ and an adversary $A_{rev}$, who has the ability to add friends, perform searches and revoke friends in the real scheme. $C$ deletes the user who has been added to the friends index by $A_{rev}$. $A_{rev}$ continues to generate a search token using the revoked friend's identity and makes a search request. After a polynomial number of queries, $C$ revokes all users that were queried to the Grant oracle but not subsequently queried to the Revoke oracle (i.e., all users for which $A_{rev}$ holds valid user keys).
The adversary $A_{rev}$ must then produce a search token which, when used as an input to the Search protocol, does not produce null; that is, $A_{rev}$ must produce a valid search request even though it does not hold a non-revoked key. If the probability that a PPT adversary $A_{rev}$ wins the revocation security experiment after several rounds of queries is negligible, then we say that the proposed scheme satisfies revocation security.

Version May 26, 2021 submitted to Journal Not Specified

Definition 3 (Revocation Security). Given the syntax in Section 3.2, consider the experiment $\mathrm{Exp}^{\mathrm{Revoke}}_{A_{rev}}(1^k)$, which is executed by a challenger $C$ and an adversary $A_{rev}$. Specifically, $C$ runs Initial to initialize $G$, and generates the keys $(K_i, PK_i, SK_i)$ and the state ciphertext $cst_i$ by KeyGen and Join. $C$ sends $G$ and $cst_i$ to $A_{rev}$. $A_{rev}$ can access oracles, including $O_{\mathrm{Grant}}$ and $O_{\mathrm{Revoke}}$, in which $\cdot$ denotes the parameters that are provided by $A_{rev}$ himself. After a polynomial number of rounds of queries, $C$ revokes all the users that have been queried to $O_{\mathrm{Grant}}(\cdot, G, id_j, PK_i, F_i)$ but not to $O_{\mathrm{Revoke}}(\cdot, G, id_j, PK_i, F_i)$. $A_{rev}$ then generates a search token $\tau$ for the Search protocol. If the output of Search is not $\bot$, the experiment returns 1; otherwise, it returns 0.
The proposed scheme achieves revocation security if, for all PPT $A_{rev}$ and all $1^k$, the advantage of $A_{rev}$ in winning $\mathrm{Exp}^{\mathrm{Revoke}}_{A_{rev}}(1^k)$ is negligible in $k$.

Initialization
On input of the security parameter $1^k$, $S_1$ initializes the global social network graph structure $G = (V, E)$ and the public parameters params. In graph $G$, the maximal number of vertices in $V$ is $n$, that is, $|V| = n$, which represents the maximum number of enrolled users. Each vertex $v_i \in V$ is attached with the information that $S_1$ has gathered about an enrolled user $U_i \in U$. The existence of a non-zero edge $e_{ij} \in E$ between $v_i \in V$ and $v_j \in V$ represents the friend relationship between $U_i$ and $U_j$. In other words, if $U_i$ and $U_j$ are strangers to each other, then $e_{ij} = 0$. $G$ is empty at initialization.
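The graph structure described above can be pictured with a small data-structure sketch. The class and method names below are hypothetical, and the dense adjacency matrix is only one possible representation chosen for clarity; the paper does not prescribe how $S_1$ stores $G$.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional

@dataclass
class SocialGraph:
    """G = (V, E) with at most n vertices; e_ij != 0 means U_j may search U_i."""
    n: int
    vertices: List[Optional[Any]] = field(default_factory=list)
    edges: List[List[int]] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Empty graph: no vertex carries user information and every e_ij is 0.
        self.vertices = [None] * self.n
        self.edges = [[0] * self.n for _ in range(self.n)]

    def is_empty(self) -> bool:
        no_users = all(v is None for v in self.vertices)
        no_edges = not any(any(row) for row in self.edges)
        return no_users and no_edges

    def are_friends(self, i: int, j: int) -> bool:
        # Edges are directed, so (i, j) and (j, i) are tracked separately.
        return self.edges[i][j] != 0

def initial(n: int) -> SocialGraph:
    """Stand-in for the graph part of Initial; params generation is omitted."""
    return SocialGraph(n)
```

For example, `initial(4).is_empty()` is true, and `are_friends(0, 1)` only becomes true after the edge `edges[0][1]` is set to a non-zero value.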

Key Generation
If a user $U_i$ is willing to join the system, he should generate his own keys locally. $U_i$'s keys consist of the following parts: the key for the pseudo-random function $F$ used to encrypt location data, the key pair for the broadcast encryption scheme $\mathsf{BE}$, and the key pairs for the Paillier encryption scheme $\mathsf{P}$ and the Goldwasser-Micali encryption scheme $\mathsf{GM}$. $U_i$ first takes as input the binary representation of the public parameters params and randomly selects a k-bit string $k_i \in \{0, 1\}^k$ as his key of $F$. Then he generates $(bpk_i, msk_i)$ by $\mathsf{BE.KeyGen}$, $(pk_i, sk_i)$ by $\mathsf{P.KeyGen}$ and $(pk'_i, sk'_i)$ by $\mathsf{GM.KeyGen}$. Afterwards, he forms his symmetric key $K_i = (msk_i, k_i)$, public key $PK_i = (bpk_i, pk_i, pk'_i)$ and secret key $SK_i = (sk_i, sk'_i)$. The lengths of the above keys are determined by the security parameter $1^k$. Finally, $U_i$ publishes his public key $PK_i$ throughout the system.

Join
Before joining, $U_i$ should generate his friends index $F_i$ with $d$ entries, where $d$ represents the maximum number of $U_i$'s friends. $F_i$ is a key-value data structure, which is empty at first. The key part of $F_i$ holds the granted friends' identities, and the corresponding value part holds the granted friends' session keys. More precisely, if $U_j$ is a friend of $U_i$, then $F_i[id_j]$ stores the session key $k^j_i$ that $U_j$ has shared with $U_i$, that is, $F_i[id_j] = k^j_i$, where $id_j$ represents $U_j$'s identity. To register, $U_i$ should also add the server $S_1$ to $F_i$ by generating $S_1$'s session key $k^i_{S_1}$ with $\mathsf{BE.Join}_{msk_i}(S_1)$ and setting $F_i[S_1] = k^i_{S_1}$. Afterwards, $U_i$ randomly selects a k-bit string $st_i$ as his current state value and encrypts $st_i$ to $cst_i$ by $\mathsf{BE.Enc}_{bpk_i}(S_1, st_i)$. Then $U_i$ sends $S_1$ a registration request $Re_i = (id_i \| cst_i \| k^i_{S_1})$. $S_1$ selects an empty vertex $v_i \in V$ in $G$ and attaches $Re_i$ to $v_i$.

Location Update
An enrolled user $U_i \in U$ can interact with $S_1$ to update his location through the LocUpdate protocol. First, $U_i$ obtains his current geo-location $l_i$ from the trusted location infrastructure, which periodically sends $U_i$'s geo-location to his local mobile device. $U_i$ maps $l_i$ into an integer $x_i$ from $\mathbb{Z}_k$ and computes its square $x_i^2$. To hide $l_i$ from $S_1$, $U_i$ encrypts $x_i$ and $x_i^2$ locally: he chooses two random values $r_1$ and $r_2$ from $\mathbb{Z}_k$, uses his key $k_i$ to generate $p_1 = F_{k_i}(r_1)$ and $p_2 = F_{k_i}(r_2)$ with the pseudo-random function $F$, and masks $x_i$ and $x_i^2$ as $c_{x_i} = (x_i + p_1, r_1)$ and $c_{x_i^2} = (x_i^2 + p_2, r_2)$. Finally, he forms his current location ciphertext $L_i = (c_{x_i}, c_{x_i^2})$ and sends $L_i$ to $S_1$. $S_1$ updates the information embedded in vertex $v_i$ in $G$ as $v_i \leftarrow v_i \| \{L_i\}$.
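The masking step in LocUpdate can be sketched as follows. This is a minimal illustration under stated assumptions rather than the paper's implementation: HMAC-SHA256 stands in for the pseudo-random function $F$, the additions are taken modulo $2^{128}$, and the names `prf`, `mask`, `unmask` and `loc_update` are hypothetical.

```python
import hashlib
import hmac
import secrets

K_BITS = 128          # assumed security parameter k
MOD = 1 << K_BITS     # assumed modulus for the additive mask

def prf(key: bytes, r: int) -> int:
    """Assumed PRF F_key(r): HMAC-SHA256 output reduced to K_BITS bits."""
    digest = hmac.new(key, r.to_bytes(K_BITS // 8, "big"), hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % MOD

def mask(key: bytes, value: int):
    """Return (value + F_key(r), r) for a fresh random r."""
    r = secrets.randbelow(MOD)
    return ((value + prf(key, r)) % MOD, r)

def unmask(key: bytes, ciphertext) -> int:
    """Remove the pad; only a holder of the PRF key can do this."""
    masked, r = ciphertext
    return (masked - prf(key, r)) % MOD

def loc_update(key: bytes, x: int):
    """Build L_i = (c_x, c_x2) from the integer-encoded location x."""
    return mask(key, x), mask(key, x * x)

key = secrets.token_bytes(K_BITS // 8)
c_x, c_x2 = loc_update(key, 4242)
print(unmask(key, c_x), unmask(key, c_x2))
```

The user's cost is two PRF evaluations and two additions per update, which matches the lightweight design goal; a fresh `r` per update means that two uploads of the same location produce different ciphertexts.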

Grant
When $U_i$ connects with $U_j$ as his friend, he should grant $U_j$ the right to search his location by conducting the Grant protocol with $S_1$. First, $U_i$ adds $U_j$'s identity $id_j$ as an entry in his friends index $F_i$, generates $U_j$'s session key $k^i_j$ by $\mathsf{BE.Join}_{msk_i}(id_j)$ and sends $k^i_j$ to $U_j$ over a secure channel. $U_i$ then selects a k-bit string $st'_i$ as his updated state value, encrypts $st'_i$ to $cst'_i$ for the updated friends group in $F_i$ that contains $U_j$ by $\mathsf{BE.Enc}_{bpk_i}(st'_i, F_i)$, and broadcasts $cst'_i$ to the system. After receiving his session key $k^j_i$ from $U_j$, he sets $F_i[id_j] = k^j_i$. Afterwards, he sends the grant request $(cst'_i \| id_j)$ to $S_1$. $S_1$ first checks whether there is a non-zero directed edge $e_{ij}$ in $G$. If not, it sets $e_{ij} = 1$ and updates $v_i$ in $G$ with the new $cst'_i$.
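The server-side bookkeeping in Grant is small: one edge lookup and one replacement of the stored state ciphertext. The sketch below shows that step together with its mirror image from the Revoke protocol described in the Revoke subsection. The dictionary layout and the function names are hypothetical, and the state ciphertexts are treated as opaque bytes because the broadcast encryption itself is out of scope here.

```python
from typing import Dict, Hashable

def new_graph() -> Dict[str, dict]:
    """Hypothetical sparse representation of G kept by S_1."""
    return {"vertices": {}, "edges": {}}

def handle_grant(graph: dict, i: Hashable, j: Hashable, new_cst: bytes) -> bool:
    """If e_ij is zero, set it to 1 and store U_i's new state ciphertext."""
    if graph["edges"].get((i, j), 0) != 0:
        return False  # edge already present, nothing to do
    graph["edges"][(i, j)] = 1
    graph["vertices"].setdefault(i, {})["cst"] = new_cst
    return True

def handle_revoke(graph: dict, i: Hashable, j: Hashable, new_cst: bytes) -> bool:
    """If e_ij is non-zero, reset it to 0 and store U_i's new state ciphertext."""
    if graph["edges"].get((i, j), 0) == 0:
        return False  # no such friendship, nothing to revoke
    graph["edges"][(i, j)] = 0
    graph["vertices"].setdefault(i, {})["cst"] = new_cst
    return True
```

Neither handler touches any record belonging to a friend other than $U_j$, which reflects the design goal that granting or revoking does not require other users to update their keys.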

K-Nearest Neighbors Search
Each enrolled user $U_s \in U$ can send a search request to $S_1$ to retrieve his k-nearest neighbors sorted by distance, as shown in Protocol 1.

Revoke
When $U_i$ wants to revoke $U_j$'s search right, he should conduct the Revoke protocol with $S_1$. $U_i$ first deletes $F_i[id_j]$ locally, selects a k-bit string $st'_i$ as his updated state value, and encrypts $st'_i$ to $cst'_i$ by $\mathsf{BE.Enc}_{bpk_i}(st'_i, F_i)$ for the updated group in $F_i$ that excludes $U_j$. Afterwards, he sends the revoke request $(cst'_i \| id_j)$ to $S_1$. $S_1$ first checks whether there is a non-zero directed edge $e_{ij}$ in $G$. If so, it sets $e_{ij} = 0$ and updates $v_i$ in $G$ with the new $cst'_i$: $v_i \leftarrow (v_i \setminus \{cst_i\}) \cup \{cst'_i\}$.

Theorem 1. If $F$ is a pseudo-random function, $\mathsf{P}$, $\mathsf{GM}$ and $\mathsf{BE}$ are CPA secure, and the DGK protocol [31] is proved to be semantically secure in the random oracle model, then the proposed scheme satisfies adaptive L-semantic security, which is defined in Definition 2.

Protocol 1 K-Nearest Neighbors Search
Proof. We construct two simulators $Sim_1$ and $Sim_2$ that generate the simulated values in $Ideal(1^k)$ using only the information given by the leakage functions $L_1$ to $L_4$, and prove that no PPT adversary can distinguish $Ideal(1^k)$ from $Real(1^k)$.

Given the information leaked from $L_1$, $Sim_1$ can learn $|cst|$ [...]

Given the information leaked from $L_3$, $Sim_1$ can obtain the search tokens $\{\tau_1, \ldots, \tau_q\}$. It can then choose a random value $\tilde{\tau}_i$ of size $|\tau_i|$ to simulate each $\tau_i$. Moreover, since $\{st_1, \ldots, st_d\}$ is generated by $BE.Dec$ by decrypting $\{cst_1, \ldots, cst_d\}$ using the keys $\{k_{1s}, \ldots, k_{ds}\}$, and each $k_{is}$ in $\{k_{1s}, \ldots, k_{ds}\}$ is a $k$-bit random string, each $st_j$ in $\tau_i$ is indistinguishable from its counterpart in $\tilde{\tau}_i$ by any PPT adversary. Therefore, $Sim_1$ cannot learn extra information from $\tau_1, \ldots, \tau_q$, which satisfies: [...]

The sorting network between $A_1$ and $A_2$ contains $(\log d)^2$ levels, and each level contains $(\log d)^2$ executions of the $P_3$ protocol. Therefore, the simulation of the sorting network can be reduced to proving [...]

Since $z = x + r$, where $x$ is an $l$-bit integer and $r$ is an $(l+\lambda)$-bit integer, the distribution of $z$ is indistinguishable from that of $\tilde{z}$. We get $(sk_s, [z]) \approx (sk_s, [\tilde{z}])$. Besides, since the distributions of $\tilde{z}$ and $z$ are independent of $t$, we get $(sk_s, l, [z], \|\lambda\|) \approx (sk_s, l, [\tilde{z}], \|\tilde{\lambda}\|)$.

In a similar way, for every pair $i$, $A_2$'s view can be denoted as $view_{A_2} = ([D_x]_i, [D_y]_i, l, pk_s, r, \lambda, [z_l])$. We can build $Sim_2$ to simulate $A_2$ in the following phases:
- Choose two random values $\tilde{\lambda}$, $\tilde{z}_l$, and compute $\|\tilde{\lambda}\|$, $[\tilde{z}]$;
- Output $view_{Sim_2} = ([D_x], [D_y], l, pk_s, \tilde{r}, [\tilde{z}_l])$.
In both $view_{A_2}$ and $view_{Sim_2}$, $r$ is drawn from the uniform distribution over $(0, 2^{\lambda+l}) \cap \mathbb{Z}$. Therefore, for all polynomial-time $A_1$ and $A_2$, there exist polynomial-time simulators $Sim_1$ and $Sim_2$ such that $Ideal(1^k)$ is indistinguishable from $Real(1^k)$. Hence the proposed scheme satisfies adaptive L-semantic security in the random oracle model, as defined in Definition 2. This proves Theorem 1.
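The blinding step $z = x + r$ used in the proof can be checked numerically for small parameters. The sketch below is an illustration under stated assumptions rather than part of the scheme: it takes $r$ uniform over $\{0, \ldots, 2^{l+\lambda} - 1\}$, uses toy sizes $l = 4$ and $\lambda = 6$, and computes the exact statistical distance between the distributions of $z$ for two different $l$-bit inputs. The function names are hypothetical.

```python
from fractions import Fraction


def blinded_distribution(x, l, lam):
    """Exact distribution of z = x + r for r uniform in [0, 2^(l+lam))."""
    n = 1 << (l + lam)
    p = Fraction(1, n)
    return {x + r: p for r in range(n)}


def statistical_distance(p, q):
    """Half the L1 distance between two finite distributions."""
    support = set(p) | set(q)
    return sum(abs(p.get(z, 0) - q.get(z, 0)) for z in support) / 2


l, lam = 4, 6  # toy sizes; real deployments use far larger values
reference = blinded_distribution(0, l, lam)
worst = max(
    statistical_distance(reference, blinded_distribution(x, l, lam))
    for x in range(1 << l)
)
```

For these parameters the worst case is $15/1024$, which stays below $2^{-\lambda} = 1/64$. In general the distance between the blinded values of two $l$-bit inputs $x_1$ and $x_2$ is $|x_1 - x_2| / 2^{l+\lambda}$, so it shrinks exponentially in $\lambda$.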

Revocation Security
Theorem 2. If BE is CPA secure, then the proposed scheme satisfies revocation security, as defined in Definition 3.
Proof. We proceed by contradiction: we show that if $A_{rev}$ has a non-negligible advantage in $Exp^{Revoke}_{A_{rev}}(1^k)$, then we can construct an adversary $A_{be}$ that uses $A_{rev}$ as a subroutine to break the CPA security of BE.
To make $Exp^{Revoke}_{A_{rev}}(1^k)$ output 1, $A_{rev}$ needs to provide a valid search token, for which he must know $st_i$. Each time a user is revoked from the system, a new value of $st_i$ is selected at random and encrypted by $BE.Enc_{bpk_i}(st_i, F_i \setminus U_j)$, where $F_i \setminus U_j$ is the new friends index, and the resulting ciphertext is broadcast to all users. The security of BE ensures that, with overwhelming probability, only a non-revoked friend of $U_i$ can decrypt this ciphertext to obtain $st_i$. Hence, the adversary can only create a valid search token if he is a valid friend of $U_i$ or if he breaks the security of BE. Otherwise he can do no better than guess $st_i$, and a random $k$-bit string is valid with probability $2^{-k}$, so he cannot produce a valid token with non-negligible probability.
Let $C$ be the challenger for the adversary $A_{be}$ against BE; $A_{be}$ acts as the challenger for $A_{rev}$:
1. $C$ runs $BE.KeyGen(1^k)$ to generate the keys $(msk_{be}, bpk_i)$.
2. $A_{be}$ issues a query to $C$ for the secret key of $A_{rev}$. $C$ runs $BE.Join_{msk_i}(A_{rev})$ to generate $k_{A_{rev}}$ and sends $k_{A_{rev}}$ to $A_{be}$. To fully enroll $A_{rev}$ as a valid friend, the state ciphertext also needs to be updated by $A_{be}$: $A_{be}$ sends $F_i$ and a newly generated $st_i$ to $C$, and $C$ runs $BE.Enc_{bpk_i}(st_i, F_i)$ to generate the new $cst_i$. $A_{be}$ runs Grant to generate the key $k_{iA_{rev}}$ of $A_{rev}$.
3. $A_{be}$ runs Initial to generate the graph $G$, and sends $k_{iA_{rev}}$ and $G$ to $A_{rev}$. $A_{rev}$ can access the oracles $O_{Grant}$ and $O_{Revoke}$.
4. $A_{be}$ revokes $A_{rev}$ by running Revoke. $A_{be}$ runs Revoke a second time in order to produce two values $st_{i0} \leftarrow \{0, 1\}^k$ and $st_{i1} \leftarrow \{0, 1\}^k$ for $st_i$, and sends $st_{i0}$ and $st_{i1}$ to $C$ as the challenge values, along with the set $F_i$ of non-revoked friends, which no longer contains $A_{rev}$.
5. $C$ selects a bit $b \in \{0, 1\}$, uses $BE.Enc_{bpk_i}(st_{ib}, F_i)$ to encrypt $st_{ib}$ into $cst_{ib}$, and sends $cst_{ib}$ to $A_{be}$ as the challenge ciphertext for the CPA security of BE. $A_{be}$ forwards $cst_{ib}$ to $A_{rev}$ as the challenge ciphertext of $Exp^{Revoke}_{A_{rev}}(1^k)$.
6. $A_{rev}$ generates a token $\tau$ and sends $\tau$ to $A_{be}$. Since the advantage of $A_{rev}$ in winning $Exp^{Revoke}_{A_{rev}}(1^k)$ is non-negligible, the probability that $\tau$ is valid is non-negligible.
7. If $t_0 = \perp$, then Search stops. $A_{be}$ outputs its guess for $b$ according to the following cases:
- If $t_0 \neq \perp$, this tells $A_{be}$ that $st_{i0}$ was used to generate the token, and $A_{be}$ outputs $b = 0$;
- If $t_1 \neq \perp$, this tells $A_{be}$ that $st_{i1}$ was used to generate the token, and $A_{be}$ outputs $b = 1$.
From the above analysis, the advantage $Adv^{BE}_{A_{be}}(1^k)$ of $A_{be}$ in breaking the CPA security of BE is $\delta/2$, where $\delta$ is the advantage of $A_{rev}$ in winning $Exp^{Revoke}_{A_{rev}}(1^k)$. Since $\delta$ is non-negligible, the advantage $\delta/2$ of $A_{be}$ is also non-negligible, which contradicts the CPA security of BE. Therefore, there exists no $A_{rev}$ that can win $Exp^{Revoke}_{A_{rev}}(1^k)$ with non-negligible probability, and the proposed scheme satisfies revocation security as defined in Definition 3. This proves Theorem 2.
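One way to arrive at the factor $\delta/2$ in the reduction is sketched below. It rests on two assumptions that are not stated explicitly in the text: a valid token always identifies the state value $st_{ib}$ that was encrypted, and $A_{be}$ outputs a uniformly random bit $b'$ whenever the token fails validation.

```latex
\begin{align*}
\Pr[b' = b]
  &= \Pr[\tau \text{ valid}] \cdot 1
     + \Pr[\tau \text{ invalid}] \cdot \tfrac{1}{2} \\
  &= \delta + \tfrac{1}{2}(1 - \delta)
   = \tfrac{1}{2} + \tfrac{\delta}{2}, \\
Adv^{BE}_{A_{be}}(1^k)
  &= \left| \Pr[b' = b] - \tfrac{1}{2} \right|
   = \tfrac{\delta}{2}.
\end{align*}
```

Under these assumptions, halving a non-negligible $\delta$ still leaves a non-negligible quantity, which is what the contradiction needs.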

Theoretical Analysis
The complexity analysis is shown in Table 1, where n is the maximum number of enrolled users and d is the maximum number of friends per user. We compare our scheme with the related privacy-preserving location-based query schemes [15,18,20] in Table 2. Because the existing schemes differ significantly in application scenarios, security models, evaluation indicators and other factors, we focus on comparing characteristics and security.

Stor: storage complexity; Comp: computation complexity; Comm: communication complexity.

For result accuracy, [15] achieves differential privacy for location information using linear programming techniques. It is specifically designed for simple computations and cannot provide accurate encrypted distance sorting. Ref. [20] uses a dynamic location grid structure to cluster users who are close to each other. However, the search results in [15,20] have a certain false-positive rate, which makes them better suited to similarity search. Our scheme and [18] use Euclidean distance to calculate the encrypted distance and thus achieve precise secure sorting. Ref. [18] focuses on counting the points of interest in a specific location area, whereas our scheme sorts the distances based on a proven-secure comparison protocol. In terms of security, Ref. [18] protects location search privacy by means of private information retrieval (PIR). Although it adopts an anchor technique to improve search efficiency, it still incurs communication overhead. Ref. [20] achieves sort privacy by assuming that the server only performs the search and the user sorts the results. As a result, the above methods each achieve sort privacy but incur high computation or communication costs.
Besides, compared with the other schemes, our scheme also has a flexible access control mechanism. Moreover, it achieves constant-time computation and communication costs when updating friends and encrypting locations, and a user only needs to store key-related information locally. The proposed scheme therefore combines a very light user workload with a moderate server workload while remaining secure against an honest-but-curious adversary. In today's mobile social networking environment, the storage and computation costs of lightweight user devices should be kept as low as possible. Consequently, the proposed scheme is well suited to real-life MSN deployments with thin clients.

Implementation
We implemented our scheme and analyzed its performance. The implementation was written in C++, and the experiments were run on several computers in a LAN, each running 64-bit Ubuntu Linux 18.04.2 on VMware Workstation with an Intel(R) Core(TM) i7-2600 quad-core processor (3.4 GHz) and 8 GB of memory. One computer acted as the server end and the others acted as user ends. At the server end, we implemented a job allocation mechanism in which the computer acted as the master server and used threads to simulate the collaborating server that performed the assisting jobs. Each user end stored the user's keys locally and interacted with the server end. To submit a search request, a user end communicated only with the master server.
In the simulation experiments, the security parameter k was set to 256 bits. We chose SHA256 from the OpenSSL library [32] for the pseudo-random function, and used the Relic library [33] to implement Paillier and GM homomorphic encryption. To make the implementation more secure, we increased the modulus n of Paillier and GM to 1024 bits. Besides, we used BGW2 [26] to implement public-key broadcast encryption. The key length of the above public-key encryption methods was set to 1024 bits.
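To make the role of Paillier encryption concrete, the following toy sketch shows the additive homomorphism that the blinding $z = x + r$ in the security analysis relies on: multiplying two ciphertexts yields an encryption of the sum of the plaintexts. It is a textbook Paillier instance with $g = n + 1$ written in plain Python, not the Relic-based implementation used in the experiments. The primes are deliberately tiny and insecure, and all function names are hypothetical.

```python
import math
import secrets


def keygen(p, q):
    """Toy Paillier key generation from two primes (g = n + 1)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse, requires Python 3.8+
    return n, (lam, mu)


def encrypt(n, m):
    """Encrypt m under public key n with fresh randomness."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return ((1 + m * n) * pow(r, n, n2)) % n2


def decrypt(n, sk, c):
    """Recover the plaintext from ciphertext c."""
    lam, mu = sk
    u = pow(c, lam, n * n)
    return ((u - 1) // n) * mu % n


def add(n, c1, c2):
    """Homomorphic addition: the product of ciphertexts encrypts the sum."""
    return (c1 * c2) % (n * n)


n, sk = keygen(10007, 10009)  # insecure demo primes
x, r = 37, 123456             # a value and its blinding term
blinded = add(n, encrypt(n, x), encrypt(n, r))
recovered = decrypt(n, sk, blinded)
```

Here `recovered` equals `x + r`, obtained without ever decrypting `x` on its own. With a 1024-bit modulus, as in the experiments, a Paillier ciphertext is an element modulo $n^2$ and therefore occupies 2048 bits.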
We conducted data simulations based on a real-world data set, the latest version of the Enron email dataset [34], from which we randomly selected 1000 accounts as the total user set. We represented users' friendships in the form of linked contacts. We selected a random integer in (10, 50) to simulate each user's location value, which was updated periodically. Moreover, we initialized the social network graph structure G with 1000 vertices and 3831 edges that contained the above data, and used a unique value in $Z_k$ to identify each vertex (user). We did not record the network communication time in any of the experiments, since it depends on the network connection between the user end and the server end. Each data point in the experiments is the average of 50 repetitions.

Storage Analysis
We first analyzed the storage overhead of our scheme. Table 3 compares the generation time and the server's storage cost of the encrypted and unencrypted G as the number of users increases. The server's storage cost increased almost linearly with the number of users. Since we used symmetric encryption to encrypt locations, the expansion rate of the symmetric location ciphertext is significantly lower than that of a Paillier homomorphic ciphertext, which is consistent with the theoretical analysis. Therefore, the proposed scheme achieves a trade-off between users' location confidentiality and search privacy at an acceptable additional storage cost.

Communication
In terms of communication, we mainly analyzed the amount of data transferred between (1) $U_i$ and $S_1$ and (2) $S_1$ and $S_2$ in the Search protocol. Theoretically, when $U_i$ requests the k-nearest neighbors among his d friends, $U_i$'s communication overhead increases almost linearly with k. When $S_1$ and $S_2$ interact to compute the distances from all d friends' location ciphertexts, the size of the data exchanged between them is $O((\log d)^2)$. Figure 2a,b shows how the two types of communication overhead vary in the experiment as the number of friends d and the search parameter k increase, respectively. In general, the amount of data the user transmits in the Search protocol is positively related to k. When k reaches a particular value (greater than d), the data transmission volume becomes stable. The communication overhead between $S_1$ and $S_2$ is mainly positively related to d and is independent of k. Moreover, the distance computation sub-procedure requires several rounds of interaction, so the communication overhead between the servers is relatively large, which is consistent with the theoretical analysis.
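The polylogarithmic number of interaction rounds comes from the data-oblivious sorting network that the two servers evaluate. The text does not name the network, so the sketch below uses a bitonic sorting network purely as an assumption; it builds the comparator levels for $d = 2^m$ inputs and applies them to plaintext values. In the protocol, each comparator would correspond to one secure comparison between $S_1$ and $S_2$. The function names are hypothetical.

```python
def bitonic_levels(d):
    """Comparator levels of a bitonic sorting network for d = 2^m inputs.

    Each level is a list of (lo, hi) index pairs; after the comparator
    runs, the smaller value sits at index lo and the larger at index hi.
    """
    assert d > 0 and d & (d - 1) == 0, "d must be a power of two"
    levels = []
    k = 2
    while k <= d:
        j = k // 2
        while j >= 1:
            level = []
            for i in range(d):
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    level.append((i, partner) if ascending else (partner, i))
            levels.append(level)
            j //= 2
        k *= 2
    return levels


def apply_network(levels, values):
    """Run the network on plaintext values (stand-in for secure comparisons)."""
    out = list(values)
    for level in levels:
        for lo, hi in level:
            if out[lo] > out[hi]:
                out[lo], out[hi] = out[hi], out[lo]
    return out


levels = bitonic_levels(8)
depth = len(levels)                               # number of sequential rounds
comparators = sum(len(level) for level in levels) # total secure comparisons
sorted_distances = apply_network(levels, [5, 3, 8, 1, 9, 2, 7, 4])
```

For $d = 8$ the network has 6 levels of 4 comparators each. In general a bitonic network has $\log d \,(\log d + 1)/2$ levels, which is $O((\log d)^2)$, and the sequence of comparisons is fixed in advance, so it reveals nothing about the values being sorted.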

Search Time
We also analyzed the primary sources of the search time overhead of the Search protocol. First, we divided the Search protocol at the server end into two sub-procedures: location search and distance sort. Figure 3 shows the relationship between the search time and the number of friends d. In Figure 3, the blue curve shows the total time overhead of the Search protocol, the yellow curve shows the time needed to extract and re-encrypt the location ciphertexts, and the red curve shows the time needed to compute and sort the encrypted distances. Figure 3 shows that the time overhead of both sub-procedures generally increases with d. Specifically, the location search time is far lower than the distance sort time, and its growth slows down once d reaches 4. The distance sort time is approximately linear in d. Therefore, the computation and comparison of encrypted distances are the two primary sources of time overhead in the Search protocol, which is consistent with the theoretical analysis.

Scalability
In terms of scalability, we first analyzed how the number of users who submit search requests in parallel affects the time overhead of the Search protocol. Specifically, we deployed one host to simulate one user executing the Search protocol and recorded the total time overhead. We then deployed six hosts to simulate six users, repeated the same experiment, and compared the results. In the multi-user case, the user ends sent their search requests to the server end simultaneously, and we measured the time from when the server end received the search requests until it completed each user's search. Figure 4 shows the relationship between the number of parallel search users and the total time of the Search protocol. One user's search time is slightly lower than that of six parallel users. The former is approximately linear in d, while the latter levels off to a constant as d increases. This trend indicates that, as the number of parallel search users increases, its impact on the search time overhead weakens, and the influence of the growing number of friends d on the search time is reduced as well. Therefore, multi-user parallelism has only a weak impact on the search time overhead, which helps the scheme achieve a certain level of scalability.
Besides, we analyzed how the number of remote servers affects the search time overhead. First, we deployed three servers to execute the Search protocol for six users simultaneously and recorded the total time overhead. We then deployed six servers, repeated the same experiment, and compared the results. Figure 4 shows the relationship between the number of servers and the search time. The search time of the 6-server deployment is significantly lower than that of the 3-server deployment. The growth of the former levels off to a constant after d reaches 4, whereas the latter keeps growing approximately linearly with the number of friends. Therefore, deploying multiple servers to perform parallel searches reduces the search time overhead and further weakens the influence of the growing number of friends on the search time.

Remark 1.
It is worth pointing out that the main computation costs of the search process are the homomorphic encryption/decryption operations and the broadcast decryption operation. The computation efficiency is closely related to the selected parameters of the underlying algorithms. The server-end implementation can also be optimized to reduce the search time, for example by using multiple threads for distance sorting or by using approximate sorting algorithms. In our experiments, we did not adopt any such optimization. The server completed all the computation steps of each phase in a single thread, so that the results faithfully reflect the unoptimized execution efficiency of the scheme.

Conclusions
To address the problem of location privacy disclosure in MSNs, we propose a privacy-enhancing k-nearest neighbors search scheme over MSNs. We deploy a dual-server collaborative architecture and design a k-nearest neighbors search protocol over encrypted locations based on secure multi-party computation and homomorphic encryption. Our scheme achieves accurate retrieval of nearby friends while preventing the geo-locations and the distance order from being revealed to the servers. We propose a lightweight dynamic friends management mechanism based on public-key broadcast encryption to satisfy the fine-grained access control requirement. It enables users to grant or revoke a friend's location search right without updating others' keys, and achieves constant-time identity authentication. The scheme satisfies adaptive L-semantic security and revocation security under the random oracle model. Compared with works based on a single-server architecture, the proposed scheme reduces the communication cost between users and the server and prevents location information leakage, achieving a trade-off between location availability and privacy.