Road Distance Computation Using Homomorphic Encryption in Road Networks

Road networks have been used in a wide range of applications to reduce the cost of transportation and improve the quality of related services. Shortest road distance computation is one of the most fundamental operations on road networks. To alleviate concerns about location privacy leaks during road distance computation, a secure and efficient road distance computation approach is desirable. In this paper, we propose two secure road distance computation approaches, which can compute road distance over encrypted data efficiently. An approximate road distance computation approach is designed by using Partially Homomorphic Encryption and road network set embedding. An exact road distance computation approach is built by using Somewhat Homomorphic Encryption and road network hypercube embedding. We implement our two road distance computation approaches and evaluate them on a real city-scale road network. Evaluation results show that our approaches are accurate and efficient.

in encryption form to avoid privacy leaks. As an example, an online ride-hailing service enables a rider to find the closest driver to offer a ride. To enjoy this service, both riders and drivers have to update their locations to the online ride-hailing service provider, while the service provider computes the shortest road distances from the rider to all drivers and selects the closest driver. However, service providers are not always honest: they may track users or infer their profiles for economic advantage. To alleviate these privacy concerns, the rider and the drivers submit their encrypted locations, and the service provider computes the encrypted road distance over the received ciphertexts. However, computing the shortest road distance in the ciphertext domain is not a trivial problem. Some schemes have been presented in the literature to compute the shortest road distance in a secure manner. They use cryptographic primitives, e.g., Partially Homomorphic Encryption (PHE), Somewhat Homomorphic Encryption (SHE), and Yao's garbled circuits (GCs), to encrypt the road network itself or the corresponding pre-generated index. Shen et al. [1] proposed a graph encryption scheme based on symmetric-key primitives and SHE, which enables approximate constrained shortest distance queries. Meng et al. [2] presented three schemes based on distance oracles and structured encryption for approximate shortest distance queries. Wang et al. [3] proposed a secure Graph DataBase (SecGDB) encryption scheme based on PHE and Yao's GCs, which supports exact shortest distance/path queries. Wu et al. [4] proposed an efficient cryptographic protocol for fully-private navigation based on compressing the next-hop routing matrices, symmetric Private Information Retrieval (PIR) and Yao's GCs. However, existing schemes are not efficient enough to compute large-scale shortest distances in real time.
To tackle the practical limitations of the state of the art, we propose two secure road distance computation approaches, which can compute road distance over encrypted data efficiently. We summarize our main contributions as follows:
• We propose an efficient approximate road distance computation approach over encrypted data, by using PHE and road network set embedding. Our approach only needs several additive homomorphic operations to compute an encrypted approximate road distance.
• We propose an efficient exact road distance computation approach over encrypted data, by using SHE and road network hypercube embedding. Our approach only needs several additive and multiplicative homomorphic operations over packed ciphertexts to compute an encrypted exact road distance.
• We implement our approximate road distance computation approach using the Paillier cryptosystem and our exact road distance computation approach using the FV scheme. Their performance is evaluated on a real city-scale road network. Evaluation results show that they achieve high accuracy and remain efficient.
The remainder of this paper is structured as follows. In Section 2, we briefly introduce necessary preliminaries. In Section 3, we propose two road distance computation approaches over encrypted data. In Section 4, we evaluate their performance. Finally, we review the related literature and summarize the paper.

Paillier Cryptosystem
Partially homomorphic encryption (PHE) allows certain operations to be carried out directly over ciphertexts. The Paillier cryptosystem [5] is a popular PHE scheme, which relies on the decisional composite residuosity assumption. We briefly summarize it as follows to support the description of our approaches.
• Key Generation: $(pk_{PHE}, sk_{PHE}) \leftarrow \mathrm{KeyGen}_{PHE}(1^{\lambda})$. Select two primes $p, q$ and calculate $N = p \times q$ and $\lambda = \mathrm{lcm}(p - 1, q - 1)$, where lcm denotes the least common multiple of $p - 1$ and $q - 1$. Then, choose a random $g \in \mathbb{Z}^{*}_{N^2}$ such that $\gcd(L(g^{\lambda} \bmod N^2), N) = 1$, where $L(x) = (x - 1)/N$. The public key is $pk_{PHE} = (N, g)$ and the private key is $sk_{PHE} = \lambda$.
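• Encryption: $\hat{c} \leftarrow \mathrm{Enc}_{PHE}(m, pk_{PHE})$. Given a plaintext $m \in \mathbb{Z}_N$ and a random number $r \in \mathbb{Z}_N$, the ciphertext is derived as $\hat{c} = \mathrm{Enc}_{PHE}(m \bmod N; r \bmod N) = g^{m} r^{N} \bmod N^2$.
• Decryption: $m \leftarrow \mathrm{Dec}_{PHE}(\hat{c}, sk_{PHE})$. Let $\hat{c} \in \mathbb{Z}_{N^2}$ be a ciphertext; its plaintext is given by $m = \frac{L(\hat{c}^{\lambda} \bmod N^2)}{L(g^{\lambda} \bmod N^2)} \bmod N$.
The Paillier cryptosystem has the properties of additive homomorphism and mixed multiplicative homomorphism: for any $m_1, m_2, r_1, r_2 \in \mathbb{Z}_N$, we obtain $\mathrm{Enc}_{PHE}(m_1, r_1) \cdot \mathrm{Enc}_{PHE}(m_2, r_2) = \mathrm{Enc}_{PHE}(m_1 + m_2, r_1 r_2) \bmod N^2$ and $\mathrm{Enc}_{PHE}(m_1, r_1)^{m_2} = \mathrm{Enc}_{PHE}(m_1 m_2, r_1^{m_2}) \bmod N^2$.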
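The following is a minimal sketch of the key generation, encryption and decryption steps summarized above, written to make the additive and mixed multiplicative properties concrete. It assumes toy primes (101 and 113) and the common choice $g = N + 1$; both are illustrative assumptions and are far too small for any real deployment. All function names are hypothetical.

```python
import math
import random


def keygen(p, q):
    """Toy Paillier key generation from two given primes (illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    g = n + 1  # assumed choice of g; it satisfies the gcd condition here
    return (n, g), lam


def L(x, n):
    """L(x) = (x - 1) / N, computed with exact integer division."""
    return (x - 1) // n


def encrypt(m, pk):
    """c = g^m * r^N mod N^2 for a random r coprime to N."""
    n, g = pk
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m % n, n2) * pow(r, n, n2)) % n2


def decrypt(c, pk, lam):
    """m = L(c^lam mod N^2) / L(g^lam mod N^2) mod N."""
    n, g = pk
    n2 = n * n
    mu = pow(L(pow(g, lam, n2), n), -1, n)  # modular inverse (Python 3.8+)
    return (L(pow(c, lam, n2), n) * mu) % n


pk, sk = keygen(101, 113)
n2 = pk[0] ** 2
c1, c2 = encrypt(1234, pk), encrypt(4321, pk)

# Additive homomorphism: multiplying ciphertexts adds the plaintexts.
sum_plain = decrypt((c1 * c2) % n2, pk, sk)
# Mixed multiplication: raising a ciphertext to a constant scales the plaintext.
scaled_plain = decrypt(pow(c1, 3, n2), pk, sk)
```

With these toy parameters the results stay below $N = 11413$, so no wrap-around modulo $N$ occurs; larger plaintexts would be reduced modulo $N$.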

FV Scheme
The FV scheme [6] is a widely used SHE scheme that supports a limited number of both multiplications and additions over ciphertexts. Its security relies on the hardness of the Ring Learning with Errors (RLWE) problem. Set $R_t = \mathbb{Z}_t[x]/(x^n + 1)$; in the ring $R_t$, $x^n$ is reduced to $-1$. The plaintext space of the FV scheme is $R_t$, and a ciphertext is an array of polynomials in $R_q$. Given a base $w$, $\ell + 1 = \lfloor \log_w q \rfloor + 1$ is the number of terms when an integer modulo $q$ is decomposed into base $w$. The $\ell + 1$ polynomials are obtained by decomposing a polynomial in $R_q$ into base-$w$ components coefficient-wise. By $a \xleftarrow{\$} S$ we denote sampling $a$ uniformly from the finite set $S$, and $[\cdot]_q$ denotes reduction modulo $q$ into the interval $(-q/2, q/2]$. The FV scheme is briefly introduced as follows.

Secure Distance Computation
In this section, we propose two different methods to compute road distance in ciphertext domain.

Road Network Set Embedding
By using Road Network Set Embedding (RNSE) technique [7], the planar road network can be converted into a high-dimensional space, in which we can convert the complex calculation of the shortest road distance into a simple calculation supported through existing encryption primitives.
We model the road network as a weighted planar graph $G = (V, E, W)$. Define $V$ as the set of vertices in $G$ (i.e., road junctions) and $E$ as the set of edges in $G$ (i.e., road sections). Let $R$ be a set of subsets of $V$ that describes a high-dimensional embedding space:
$$R = \{V_{1,1}, \ldots, V_{1,\alpha}, \ldots, V_{\beta,1}, \ldots, V_{\beta,\alpha}\},$$
where $\alpha$ and $\beta$ are equal to $O(\log |V|)$. Subset $V_{i,j}$ is composed of $2^i$ nodes randomly selected from $V$. The shortest road distance between node $v \in V$ and subset $V_{i,j}$ can be calculated by
$$\mathrm{dist}(v, V_{i,j}) = \min_{v' \in V_{i,j}} \mathrm{dist}(v, v').$$
Based on the above definition, the coordinate of a node $v$, which is a vector with $O(\log^2 |V|)$ dimensions, is defined as the distance from the node $v$ to each subset:
$$c_v = (\mathrm{dist}(v, V_{1,1}), \ldots, \mathrm{dist}(v, V_{\beta,\alpha})).$$
Then we can use $\Omega = \{c_v \mid v \in V\}$ to represent the embedded road network of $G$.
The coordinate of a position $l$ on the road section $(v_s, v_d) \in E$ is denoted by $c_l$. Without loss of generality, let the embedded road network have $\omega$ dimensions, s.t. $\omega \le \log^2 |V|$. By calculating the chessboard distance from $c_{l_s}$ to $c_{l_d}$, the shortest distance from location $l_s$ to $l_d$ can be approximately represented as
$$\mathrm{dist}(l_s, l_d) \approx \mathrm{dist}_C(c_{l_s}, c_{l_d}) = \max_{1 \le k \le \omega} |c_{l_s}[k] - c_{l_d}[k]|,$$
where $\mathrm{dist}_C(\cdot, \cdot)$ denotes the chessboard distance between two coordinates.
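To make the set embedding concrete, the sketch below embeds a small unit-weight grid graph and compares the chessboard distance between coordinates with the true shortest distance. The grid, the parameter values (3 subsets for each of 3 size classes) and all function names are illustrative assumptions, not the configuration evaluated in this paper. By the triangle inequality, each coordinate difference can never exceed the true distance, so the chessboard distance is always a lower bound.

```python
import heapq
import random


def grid_graph(rows, cols):
    """Undirected unit-weight grid used as a stand-in road network."""
    adj = {(r, c): [] for r in range(rows) for c in range(cols)}
    for (r, c) in list(adj):
        for dr, dc in ((0, 1), (1, 0)):
            nb = (r + dr, c + dc)
            if nb in adj:
                adj[(r, c)].append((nb, 1))
                adj[nb].append(((r, c), 1))
    return adj


def dijkstra(adj, src):
    """Shortest road distances from src to every vertex."""
    dist = {v: float("inf") for v in adj}
    dist[src] = 0
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def embed(adj, alpha, beta, seed=0):
    """Coordinate of v = distances from v to random subsets of size 2^i."""
    rng = random.Random(seed)
    nodes = sorted(adj)
    all_dist = {v: dijkstra(adj, v) for v in nodes}
    coords = {v: [] for v in nodes}
    for i in range(1, beta + 1):
        for _ in range(alpha):
            subset = rng.sample(nodes, min(2 ** i, len(nodes)))
            for v in nodes:
                coords[v].append(min(all_dist[v][u] for u in subset))
    return coords, all_dist


def chessboard(a, b):
    """Chessboard (Chebyshev) distance between two coordinates."""
    return max(abs(x - y) for x, y in zip(a, b))


adj = grid_graph(4, 4)
coords, true_dist = embed(adj, alpha=3, beta=3)
approx = chessboard(coords[(0, 0)], coords[(3, 3)])
```

How close `approx` comes to the true distance depends on the random subsets; increasing the number of dimensions can only tighten the lower bound.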

Encrypted Approximate Road Distance Computation Using Paillier Cryptosystem
Suppose that the road network is represented by $G = (V, E, W)$ and the dimension of the embedded road network is $\omega$. We can use the RNSE technique described in Section 3.1.1 to calculate the coordinate of each point in $G$, where each coordinate is a vector of $\omega$ dimensions. Then, we obtain the embedded road network denoted by $\Omega = \{c_v \mid v \in V\}$. The coordinates $c_{l_s}$ and $c_{l_d}$ can be encrypted element-by-element using the public key $pk_{PHE}$ generated by the Paillier cryptosystem. The encrypted coordinates are represented as $[[c_{l_s}]] = ([[c_{l_s}[1]]], \ldots, [[c_{l_s}[\omega]]])$ and $[[c_{l_d}]] = ([[c_{l_d}[1]]], \ldots, [[c_{l_d}[\omega]]])$. The encrypted approximate road distance between $l_s$ and $l_d$ can be computed as follows:
2) Because the Paillier cryptosystem has a plaintext space much larger than the upper limit of the road distance, several ciphertexts can be packed into one ciphertext using a ciphertext packing technique, which improves efficiency. In the Paillier cryptosystem, assuming that $p = N/(\ell + 1)$ is the number of slots in a single packed ciphertext, $p$ ciphertexts can be packed into one ciphertext. The main idea of the ciphertext packing technique is described as follows.
After performing a decryption operation, we can get the packed plaintext $[a_1 | \cdots | a_l]$, in which each $a_i$ occupies its own slot, and then recover $a_1, \ldots, a_l$. Using the above ciphertext packing technique, every encrypted element in $[[\mathrm{dist}(l_s, l_d)]]$ can be packed into the same packed ciphertext.
3) The approximate road distance between $l_s$ and $l_d$ is the maximum element of $\mathrm{dist}(l_s, l_d)$, so further post-processing is required to extract this maximum. In the simplest case, $[[\mathrm{dist}(l_s, l_d)]]$ can be decrypted with the secret key $sk_{PHE}$, and then unpacked to recover $\mathrm{dist}(l_s, l_d)$. There are, of course, more complex scenarios, where the maximum needs to be selected in an oblivious manner. For these scenarios, some well-established cryptographic tools can be integrated, such as Yao's garbled circuits and secret sharing schemes. More details about secure comparison over encrypted integers can be found in [8][9][10][11].
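The slot idea behind ciphertext packing can be illustrated on plaintexts alone. The sketch below is only an illustration under assumptions: the slot width `slot_bits` and the function names are hypothetical, and the homomorphic counterpart (scaling each ciphertext by a power of two and combining them) is omitted. It packs several small values into one integer, unpacks them, and extracts the maximum as in the simplest post-processing case.

```python
def pack(values, slot_bits):
    """Concatenate values into one integer, first value in the top slot."""
    packed = 0
    for a in values:
        if not 0 <= a < (1 << slot_bits):
            raise ValueError("value does not fit in one slot")
        packed = (packed << slot_bits) | a
    return packed


def unpack(packed, count, slot_bits):
    """Recover `count` slot values from a packed integer."""
    mask = (1 << slot_bits) - 1
    values = []
    for _ in range(count):
        values.append(packed & mask)
        packed >>= slot_bits
    return values[::-1]


# Hypothetical per-dimension absolute differences between two coordinates.
diffs = [7, 12, 3, 9]
packed = pack(diffs, slot_bits=8)
recovered = unpack(packed, len(diffs), slot_bits=8)
approx_distance = max(recovered)  # chessboard distance after unpacking
```

A single decryption of the packed value then replaces one decryption per dimension, which is where the efficiency gain comes from. Slots must be wide enough that no value overflows into its neighbour.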

Road Network Hypercube Embedding
The $m$-dimensional hypercube $H_m$ is a graph whose node set $V$ consists of $2^m$ $m$-dimensional boolean vectors, i.e., vectors with binary coordinates 0 or 1, where two nodes are adjacent whenever they differ in exactly one coordinate. Moreover, the size of $H_m$ is $m 2^{m-1}$ and its order is $2^m$. The Road Network Hypercube Embedding (RNHE) technique [7] is concerned with finding mappings between a road network and a higher-dimensional hypercube that preserve certain topological properties. Let the weighted graph $G = (V, E, W)$ represent the road network, with vertex set $V$ and edge set $E$. Each edge $(v_i, v_j) \in E$ has a weight in $W$, which denotes the road distance of the edge. For a vertex $v_i \in V$, its coordinate can be expressed by a boolean vector $v_i$ with $m$ dimensions, and we use $\Omega_H = \{v_i \mid v_i \in V\}$ to represent the embedded road network. We can obtain the shortest road distance between any two vertices by computing the Hamming distance between their coordinates. Note that a location in the road network and its position in the corresponding planar graph can typically be converted from one to the other, and hereafter they are used interchangeably.
Related definitions are as follows.
For each edge $e \in F$, there is a unique opposite edge when $F$ is an even face and two opposite edges when $F$ is an odd face. The embedded road network $\Omega_H$ can be constructed from $G$ as follows. Each alternating cut $L$ corresponds to two connected components $\{G/L\}_0$ and $\{G/L\}_1$. The coordinate of every vertex in $\{G/L\}_0$ is appended with 0 and the coordinate of every vertex in $\{G/L\}_1$ is appended with 1. We can find all alternating cuts that contain $e$ as below.
• Starting with $e$, we move in both directions, taking the opposite edge on each even face, and stop when we meet the first odd face in each direction.
• Next, we turn right on one odd face and left on the other (we can obtain more alternating cuts by changing the selection of odd face).
• Proceeding in both directions, we alternate at all odd faces and stop on reaching the outer face.
Clearly, the coordinate is an $m$-dimensional boolean vector, where $m$ is the total number of alternating cuts. Finally, $G$ can be embedded into an $m$-dimensional hypercube $H_m$.
The computational complexity of hypercube embedding is $O(|V| \sqrt{|V|})$. In Fig. 1, the road network corresponds to the hypercube $H_{14}$, and its embedded road network is shown in Tab. 1. Note that the above hypercube embedding is not affected by different road network topology.
[Fig. 1 and Tab. 1 (columns: Vertex, Label)]
Let $\Omega_H$ be the embedded road network of the road network $G$, and let $v_s$ and $v_d$ be the coordinates of two vertices. We can calculate the shortest road distance from $v_s$ to $v_d$ as below.
$$\mathrm{dist}_E(v_s, v_d) = \frac{1}{2} \mathrm{dist}_H(v_s, v_d) \qquad (9)$$
where $\mathrm{dist}_E(v_s, v_d)$ denotes the exact shortest road distance and $\mathrm{dist}_H(\cdot, \cdot)$ denotes the Hamming distance. In Fig. 1, the shortest road distance from $v_a$ to $v_e$ is calculated by $\mathrm{dist}_E(v_a, v_e) = \frac{1}{2} \mathrm{dist}_H(v_a, v_e) = 6$. Let $l = (v, \Delta)$ denote a location in the road network $G$, where $v$ is the nearest node to $l$ and $\Delta$ is the shortest road distance between $v$ and $l$. Let two locations be $l_s = (v_s, \Delta_s)$ and $l_d = (v_d, \Delta_d)$; the shortest road distance from $l_s$ to $l_d$ is then computed from $\mathrm{dist}_E(v_s, v_d)$, $\Delta_s$ and $\Delta_d$ as in Eq. (10).
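As a small check of the relation between Hamming distance and road distance, the sketch below uses a hypothetical toy network: a 5-vertex ring with unit edge lengths, whose single inner face is odd. The 5-bit labels are hand-constructed for this illustration (they are not the labels of Fig. 1 or Tab. 1), and half of the Hamming distance between labels is compared with the distance found by breadth-first search.

```python
from collections import deque

# Hand-constructed labels for a 5-cycle (hypothetical toy road network).
LABELS = {
    0: (1, 1, 0, 0, 0),
    1: (0, 1, 1, 0, 0),
    2: (0, 0, 1, 1, 0),
    3: (0, 0, 0, 1, 1),
    4: (1, 0, 0, 0, 1),
}


def hamming(a, b):
    """Number of positions in which two boolean vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def ring_distances(n, src):
    """Breadth-first search on a ring of n vertices with unit edges."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in ((u + 1) % n, (u - 1) % n):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def road_distance(u, v):
    """Exact distance as half of the Hamming distance between labels."""
    return hamming(LABELS[u], LABELS[v]) // 2


true_dist = {u: ring_distances(5, u) for u in LABELS}
```

For every pair of vertices in this toy ring, `road_distance(u, v)` agrees with `true_dist[u][v]`; for instance, vertices 0 and 2 differ in four label positions and are two edges apart.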

Encrypted Exact Road Distance Computation Using FV Scheme
For locations $l_s = (v_s, \Delta_s)$ and $l_d = (v_d, \Delta_d)$, the road distance between them can be computed in the ciphertext domain by using the FV scheme. In a basic approach, we encrypt the respective coordinates of $v_s$ and $v_d$ bit-by-bit with the public key $pk_{SHE}$ to obtain two ciphertext sequences. When the coordinate length $m$ is large, the computation overhead of this basic distance computation method is heavy, since it is inefficient to encrypt/decrypt coordinates bit-by-bit. To reduce computation overhead, we propose an optimized approach with ciphertext packing to compute the shortest road distance efficiently. We now describe the details of two packed ciphertext constructions for the coordinates $v_s$ and $v_d$ as follows.
The coordinate $v_s$ is converted into the coefficients of a polynomial $f_s(v_s)$ by packing, and the packed ciphertext of $v_s$ is calculated by encrypting $f_s(v_s)$: $\hat{v}_s = \mathrm{Enc}_{SHE}(f_s(v_s), pk_{SHE})$. Likewise, $v_d$ is converted into the coefficients of a polynomial $f_d(v_d)$ by packing, and the packed ciphertext of $v_d$ is calculated by encrypting $f_d(v_d)$: $\hat{v}_d = \mathrm{Enc}_{SHE}(f_d(v_d), pk_{SHE})$.
The encrypted $\mathrm{dist}_E(v_s, v_d)$ can be computed from the two packed ciphertexts $\hat{v}_s$ and $\hat{v}_d$. Using multiplicative homomorphism, the ciphertext of the Hamming weight of $v_s$ is calculated from a plaintext polynomial and the packed ciphertext $\hat{v}_s$ (Eq. (18)). Expanding this product with $x^n \equiv -1 \pmod{x^n + 1}$ gives Eq. (19); for a plaintext modulus $t$ that is large enough, the constant term in Eq. (19) equals the Hamming weight of $v_s$. Likewise, the ciphertext of the Hamming weight of $v_d$ is calculated from a plaintext polynomial and the packed ciphertext $\hat{v}_d$ (Eq. (20)), and the constant term of its expansion in Eq. (21) equals the Hamming weight of $v_d$. Using multiplicative homomorphism, the ciphertext of the inner product of the coordinates $v_s$ and $v_d$ is calculated from the two packed ciphertexts (Eq. (22)); the inner product of $v_s$ and $v_d$ is equal to the constant term of its expansion in Eq. (23).
Based on Eq. (9), we can use the three packed ciphertexts (18), (20) and (22) to calculate the encrypted $\mathrm{dist}_E(l_s, l_d)$ (Eq. (24)). Based on Eq. (10) and Eq. (24), the ciphertext of the distance between locations $l_s = (v_s, \Delta_s)$ and $l_d = (v_d, \Delta_d)$ is then obtained, where $\hat{\Delta}_s = \mathrm{Enc}_{SHE}(\Delta_s, pk_{SHE})$ and $\hat{\Delta}_d = \mathrm{Enc}_{SHE}(\Delta_d, pk_{SHE})$.
Only three multiplicative homomorphic operations and four subtractive/additive homomorphic operations are needed to calculate the shortest road distance between two locations.
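The role of the constant term can be checked on plaintext polynomials in $\mathbb{Z}_t[x]/(x^n + 1)$, without any encryption. The sketch below is a plaintext-only illustration under assumptions: the packing polynomials $f_s(v) = \sum_i v_i x^i$ and $f_d(v) = -\sum_i v_i x^{n-i}$ follow a well-known packing trick and are assumed here, since the exact definitions are not reproduced in the text above; the parameters $n = 8$ and $t = 97$ and all function names are hypothetical. Three ring multiplications yield the two Hamming weights and the inner product, and the Hamming distance follows as their combination.

```python
def negacyclic_mul(a, b, t):
    """Multiply two polynomials in Z_t[x]/(x^n + 1), where x^n = -1."""
    n = len(a)
    out = [0] * n
    for i, ai in enumerate(a):
        if ai == 0:
            continue
        for j, bj in enumerate(b):
            k = i + j
            if k < n:
                out[k] = (out[k] + ai * bj) % t
            else:  # wrap around with a sign flip because x^n = -1
                out[k - n] = (out[k - n] - ai * bj) % t
    return out


def pack_s(bits, n):
    """Assumed packing f_s(v) = sum_i v_i x^i."""
    coeffs = [0] * n
    for i, b in enumerate(bits):
        coeffs[i] = b
    return coeffs


def pack_d(bits, n, t):
    """Assumed packing f_d(v) = -sum_i v_i x^(n - i)."""
    coeffs = [0] * n
    for i, b in enumerate(bits):
        if i == 0:
            coeffs[0] = b % t  # -v_0 x^n equals +v_0
        else:
            coeffs[n - i] = (-b) % t
    return coeffs


def hamming_via_packing(vs, vd, n, t):
    """Hamming distance from the constant terms of three ring products."""
    ones = [1] * len(vs)
    fs, fd = pack_s(vs, n), pack_d(vd, n, t)
    hw_s = negacyclic_mul(fs, pack_d(ones, n, t), t)[0]   # weight of vs
    hw_d = negacyclic_mul(pack_s(ones, n), fd, t)[0]      # weight of vd
    inner = negacyclic_mul(fs, fd, t)[0]                  # <vs, vd>
    return (hw_s + hw_d - 2 * inner) % t


vs = [1, 0, 1, 1, 0, 1]
vd = [1, 1, 0, 1, 0, 0]
dist_h = hamming_via_packing(vs, vd, n=8, t=97)
```

The sketch requires the coordinate length to be at most $n$ and $t$ to exceed the largest possible result, which mirrors the requirement that the plaintext modulus be large enough.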

Experiment Evaluation
Our experiments are performed on the real road network of California, which consists of 21048 vertices and 21693 edges (www.cs.utah.edu/~lifeifei/SpatialDataset.htm). Following the assumptions made in [7], we delete some trivial edges and insert virtual vertices on edges at a fixed unit distance. For PHE, we use the Paillier cryptosystem library (acsc.cs.utexas.edu/libpaillier). For the modified Paillier cryptosystem, we set N and g to 1024 bits and 160 bits, respectively. For SHE, we use the FV scheme built on FV-NFLlib (github.com/CryptoExperts/FV-NFLlib), and the polynomial degree n in the FV scheme is set to 2048. All experiments are conducted on a PC running Ubuntu 18.04 LTS, with an Intel i7 processor at 3.4 GHz and 16 GB RAM.
We evaluate the accuracy and the efficiency of our proposed road distance approaches in a k-Nearest Neighbors (kNN) query application [12,13]. We first randomly generate some locations on the edges of the road network. Then, one location is randomly picked as the starting location, from which all road distances to the other locations are computed by using our approaches. Finally, the closest location is selected as the nearest neighbor. Fig. 2 depicts the accuracy of the kNN query using Euclidean distance, approximate road distance (dimension ω = 8, 16, 24, 32) and exact road distance under different location scales. Euclidean distance serves as the accuracy baseline; its accuracy always stays between 85% and 90%. The accuracy of approximate road distance rises steadily as the dimension of the embedded road network increases, and is roughly 95% when the dimension is higher than 24, because a higher dimension yields a more accurate approximation. When we vary the location scale from 1000 to 4000, the accuracy of Euclidean distance gradually increases, because a larger location scale means that closer destination locations may exist around the starting location. However, the accuracy of Euclidean distance remains below 90%. The accuracy of approximate road distance is high under every location scale, at roughly 95% if the dimension is higher than 24. As expected, the accuracy of exact road distance stays at almost 100% under every location scale. These results demonstrate that both the approximate and the exact road distance computation approaches reach a higher accuracy than Euclidean distance because they use road distance.
We use the average online computation cost per kNN query to evaluate efficiency. As shown in Fig. 3, the computation cost of the approximate road distance computation approach rises as the dimension increases.
The reason is that a higher dimension requires more encryption operations over a location coordinate. Meanwhile, the computation cost of both the approximate and the exact road distance computation approaches increases almost linearly as the location scale increases, because a larger location scale requires more distance computations. From the above evaluation, we can see that the two approaches achieve high accuracy and efficiency.

Related Works
Numerous protocols have been proposed for private shortest road distance computation in different applications, such as kNN query and navigation. For related privacy issues of distance computation, some approaches utilize structural anonymization [14], differential privacy [15][16][17] or Private Information Retrieval (PIR) [18] to guarantee privacy for the client or the server. However, these approaches suffer from limited privacy, performance or scalability. There is also a vast literature on privacy-preserving shortest distance computation using structured encryption [19] or graph encryption, which focuses on protecting graph data when outsourced to third-party servers or to the cloud [20][21][22]. The best-known class of structured encryption schemes is searchable symmetric encryption (SSE) [23]. Generally speaking, SSE schemes encrypt indexes or search trees in order to search efficiently on encrypted data. Another line of work on executing graph algorithms over encrypted graphs develops data-oblivious algorithms [24] or data structures [25]. In these solutions, the graph data is stored in an Oblivious RAM (ORAM) [26] or an oblivious data structure on the server. The client can compute the shortest distances on the server without leaking its access patterns. Also relevant are works based on secure multi-party computation (SMC), such as Yao's GCs and ORAM. The generic solution is to construct a GC that contains the entire graph structure for a shortest-path algorithm and apply Yao's protocol. However, the above approaches are often prohibitively expensive and impractical for city-scale road networks [27,28]. For instance, the GC-based approach by Carter et al. [29,30] requires several minutes to compute a single shortest path in a road network with just 100 vertices.
Another generic approach combining GCs and ORAM requires communication overhead on the order of GB and run-times ranging from tens of minutes to several hours for a single computation on a network with 1024 vertices. Recently, some schemes have been proposed to support computation over large-scale encrypted graphs. Shen et al. [1] proposed a graph encryption scheme based on symmetric-key primitives and SHE, which enables approximate constrained shortest distance queries. Meng et al. [2] presented three schemes based on distance oracles and structured encryption for approximate shortest distance queries. These two schemes provide only an estimate of the shortest distance, sacrificing accuracy. Wang et al. [3] proposed a secure Graph DataBase (SecGDB) encryption scheme based on PHE and Yao's GCs, which supports exact shortest distance/path queries. Wu et al. [4] proposed an efficient cryptographic protocol for fully-private navigation based on compressing the next-hop routing matrices, symmetric Private Information Retrieval (PIR) and Yao's GCs, which requires about 1.5 s and less than 100 KB of bandwidth for each hop in a city-scale road network. Compared with existing schemes, our two road distance computation approaches compute large-scale shortest distances more efficiently in real time.

Conclusions
In this paper, we proposed two secure road distance computation approaches, which can compute road distance over encrypted data efficiently. An approximate road distance computation approach is designed by using Partially Homomorphic Encryption and road network set embedding. An exact road distance computation approach is built by using Somewhat Homomorphic Encryption and road network hypercube embedding. The evaluation on a real city-scale road network verifies that our approaches are accurate and efficient.

Conflicts of Interest:
The authors declare that they have no conflicts of interest to report regarding the present study.