Article

Two-Party Privacy-Preserving Set Intersection with FHE

1
School of Mathematics and Information Science, Guangzhou University, Guangzhou 510006, China
2
State Key Laboratory of Cryptology, P.O. Box 5159, Beijing 100878, China
3
School of Mathematics and Systems Science, Guangdong Polytechnic Normal University, Guangzhou 510665, China
*
Author to whom correspondence should be addressed.
Entropy 2020, 22(12), 1339; https://doi.org/10.3390/e22121339
Submission received: 28 October 2020 / Revised: 18 November 2020 / Accepted: 22 November 2020 / Published: 25 November 2020

Abstract

A two-party private set intersection allows two parties, the client and the server, to compute an intersection over their private sets, without revealing any information beyond the intersecting elements. We present a novel private set intersection protocol based on Shuhong Gao’s fully homomorphic encryption scheme and prove the security of the protocol in the semi-honest model. We also present a variant of the protocol which is a completely novel construction for computing the intersection based on Bloom filter and fully homomorphic encryption, and the protocol’s complexity is independent of the set size of the client. The security of the protocols relies on the learning with errors and ring learning with error problems. Furthermore, in the cloud with malicious adversaries, the computation of the private set intersection can be outsourced to the cloud service provider without revealing any private information.

1. Introduction

In 1978, Rivest et al. first presented the idea of fully homomorphic encryption (FHE) [1]. Gentry constructed the first concrete FHE scheme in 2009 [2]. Since then, dramatic progress in FHE has been made by Gentry and many other researchers around the world. The first generation is based on the approximate GCD problem over the integers and on ideal lattices [2,3]; the second generation is based on the ring learning with errors (RLWE) and learning with errors (LWE) problems, and developed several techniques, including re-linearization, key switching and modulus reduction, for decreasing noise growth [4,5]; the third generation involves the GSW scheme, which is based on approximate eigenvectors and RLWE [6]. Shuhong Gao's scheme [7] is a compressed fully homomorphic encryption scheme, denoted by SGFHE below, and it has three features: (1) The cipher expansion is 6 with private-key encryption and $10 + \log_2(n)$ with public-key encryption, where $n$ (a power of 2) is the block length of the message; all ciphertexts are computed modulo $r$, where $r = 16n$; and the noise size is bounded by $n - 1$. (2) The bootstrapping algorithm needs only a bootstrapping key, and the noise size of the output ciphers is still bounded by $n - 1$, with no failure at all. (3) The security of Shuhong Gao's scheme is based on the LWE and RLWE problems, and for any message block length $n \ge 512$, breaking the scheme with the current approaches costs at least $2^{160}$ bit operations. In addition, with TFHE bootstrapping [8], the LWE cipher produced could be invalid with a probability of about $2^{-33}$ (for $n = 500$). That probability is very small, and it is acceptable for computing many functions; however, TFHE cannot be applied to functions that require more than $2^{33}$ bit operations (unless $n$ is increased). In SGFHE, the error size of the LWE ciphers after bootstrapping is always bounded by $n - 1$; this feature is not available in other FHE schemes.
The total time cost for the bootstrapping procedure of the SGFHE scheme is about 130 ms, that is, 10 times as much as TFHE.
Secure multi-party computing (SMPC) is mainly about how to compute a function securely without a trusted third party. It was first proposed by Yao Qizhi in 1982. After being developed by Goldreich, Micali, Wigderson et al. [9], SMPC became a very active research field in modern cryptography. Research on SMPC [10] is divided into general schemes and specific schemes designed for particular computing scenarios; a general scheme is not as efficient as a scheme that is specially optimized for a given application, so specific schemes are more widely used in practice [11]. Secret sharing [12], garbled circuits [13,14], oblivious transfer [15], commitment schemes [16] and homomorphic encryption [17] are the key technologies for realizing SMPC. SMPC is of great significance in the study of secret sharing schemes and privacy protection, and it is widely used in correlation analysis, secure data queries, trusted data exchange, etc. [18,19,20,21,22].
Privacy-preserving set intersection (PSI) computing is an important topic in secure multi-party computing. Many data in scientific computing and in real life can be represented as sets, so PSI can be used to carry out privacy-preserving computation on such data. PSI is a basic operation in many applications, such as machine learning, data mining [23] and secure distributed data connection [24], and it is especially widely used in privacy-preserving law enforcement.

1.1. Related Work

Several specialized PSI protocols have been proposed in the literature which are more efficient than general secure computation [33]. The main approaches are based on oblivious polynomial evaluation [25], oblivious pseudo-random functions [26], blind signatures [27], homomorphic encryption [28] and Bloom filters [29]. Shen Liyan et al. [30] gave a detailed overview of the development prospects of privacy-preserving set intersection computing. In the protocol developed at Google, Mihaela Ion et al. [11] applied privacy-preserving set intersection computing to advertising cooperation.

1.2. Contributions

We present three private set intersection protocols. First, we propose a novel private set intersection protocol based on Shuhong Gao's fully homomorphic encryption scheme and prove its security in the honest-but-curious (semi-honest) model. We then present an improved variant of this protocol. Finally, we present a completely novel construction for computing the intersection based on a Bloom filter and fully homomorphic encryption; this protocol's complexity is independent of the size of the client's set. The security of the protocols relies on the learning with errors and ring learning with errors problems. Furthermore, in a cloud with malicious adversaries, the computation of the private set intersection can be outsourced to the cloud service provider without revealing any private information. The ciphertext expansion of the protocols is small, which makes the protocols practical.
The remainder of the paper is structured as follows: Section 2 reviews the basic concepts and techniques used. In Section 3, we introduce the homomorphic operations used. In Section 4, we describe the basic two-party computing protocol, the improved protocol and the two-party computing protocol based on the Bloom filter. We present our conclusions in Section 5.

2. Basic Concepts and Techniques

2.1. Notation

Let $\chi$ be an error distribution; $x \leftarrow \chi$ denotes that $x$ is chosen randomly according to the distribution $\chi$. For an integer $n \ge 1$, let $R_n = \mathbb{Z}[x]/(x^n + 1)$ and $R_{n,q} = \mathbb{Z}[x]/(x^n + 1, q)$, where $(x^n + 1, q)$ represents the ideal of $\mathbb{Z}[x]$ generated by $x^n + 1$ and $q$. For any polynomial $f(x) = \sum_{i=0}^{d} f_i x^i \in \mathbb{R}[x]$, we define the $\infty$-norm as $\|f(x)\|_\infty = \max_{0 \le i \le d} |f_i|$.

2.2. LWE Ciphers and Modulus Reduction

Regev proposed the LWE problem [31,32] over $\mathbb{Z}_q$. Let $\chi$ be a probability distribution, and let $s \in \mathbb{Z}_q^n$ be an arbitrary vector that serves as the secret key of a user. An LWE sample is a pair $(a, b)$, where $a \in \mathbb{Z}_q^n$ is selected uniformly at random and $b \equiv \langle s, a \rangle + e \pmod{q}$ with $e \leftarrow \chi$.
Let $D_q = \lfloor q/4 \rfloor$ and $1 \le \tau \le D_q/2$. To encrypt one bit $x \in \{0, 1\}$, pick $a \in \mathbb{Z}_q^n$ and $e \in [-\tau, \tau]$, and compute $b \equiv \langle s, a \rangle + e + x D_q \pmod{q}$. Then $E_s(x) = (a, b) \in \mathbb{Z}_q^n \times \mathbb{Z}_q$ is the LWE ciphertext for $x$. Note that $D_q = \lfloor q/2 \rfloor$ in Regev's scheme but $D_q = \lfloor q/4 \rfloor$ in the SGFHE scheme, to support homomorphic bit operations.
Modulus reduction converts LWE ciphers over $\mathbb{Z}_q$ into LWE ciphers over $\mathbb{Z}_r$, where $r$ is far less than $q$.
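The single-bit LWE encryption described above can be illustrated with a minimal Python sketch. It is only a toy under stated assumptions: the function names (`lwe_keygen`, `lwe_encrypt`, `lwe_decrypt`) and the tiny parameters are hypothetical, the secret key is taken to be binary, the error is drawn uniformly from $[-\tau, \tau]$ rather than from a specific distribution $\chi$, and $\tau$ is assumed to be strictly below $D_q/2$ so that rounding is unambiguous.

```python
import random

def lwe_keygen(n, rng):
    """Binary secret key s in {0,1}^n (an assumption of this sketch)."""
    return [rng.randrange(2) for _ in range(n)]

def lwe_encrypt(s, x, q, tau, rng):
    """Return (a, b) with b = <s,a> + e + x*D_q mod q and |e| <= tau."""
    d_q = q // 4
    a = [rng.randrange(q) for _ in s]
    e = rng.randint(-tau, tau)
    b = (sum(si * ai for si, ai in zip(s, a)) + e + x * d_q) % q
    return a, b

def lwe_decrypt(s, cipher, q):
    """Recover x by rounding b - <s,a> to the nearest multiple of D_q."""
    a, b = cipher
    d_q = q // 4
    phase = (b - sum(si * ai for si, ai in zip(s, a))) % q
    if phase > q // 2:          # centre the residue around 0
        phase -= q
    return round(phase / d_q) % 2
```

With $D_q = \lfloor q/4 \rfloor$ the sum of two encrypted bits still fits below $q/2$, which is what leaves room for the homomorphic bit operations mentioned above.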
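Lemma 1 ([7]). Let $s, a \in \mathbb{Z}_q^n$, $e \in \mathbb{Z}$ with $|e| \le \tau$, $D_r = \lfloor r/4 \rfloor$, and $b \equiv \langle s, a \rangle + e + x D_q \pmod{q}$.
(1) Suppose $\tau \le q(n-3)/(2r)$, $q \ge 4r$ and $s \in \{0, 1\}^n$. Let $b' = \lfloor rb/q \rceil$ and $a' = \lfloor ra/q \rceil$; then
$$b' \equiv \langle s, a' \rangle + e' + x D_r \pmod{r}.$$
(2) Let $\ell = \lceil \log_2 q \rceil$, $q \ge 16$. Suppose that $\tau \le q(n-5)/(2r)$ with $s \in \mathbb{Z}_q^n$. Then there exist $s' \in \{0, 1\}^{n\ell}$, $a' \in \mathbb{Z}_r^{n\ell}$ and $b' \in \mathbb{Z}_r$ satisfying
$$b' \equiv s' (a')^t + e' + x D_r \pmod{r},$$
where $e' \in \mathbb{Z}$, $|e'| \le n$.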
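Part (1) of the lemma can be checked numerically with a short Python sketch. Everything here is an illustrative assumption rather than the scheme's implementation: the toy sizes ($n = 16$, $r = 16n$, $q = 2^{20}$), the helper names, and the uniform error. The sketch rounds $ra/q$ and $rb/q$ to the nearest integer and measures the new error $e'$ relative to $x D_r$.

```python
import random

N, R, Q = 16, 256, 1 << 20      # toy sizes with r = 16n and q >= 4r

def centred_phase(s, a, b, mod):
    """Value of b - <s,a> modulo mod, centred around 0."""
    p = (b - sum(si * ai for si, ai in zip(s, a))) % mod
    return p - mod if p > mod // 2 else p

def encrypt_bit(s, x, tau, rng):
    """LWE cipher of bit x modulo Q with D_q = Q/4 and |e| <= tau."""
    a = [rng.randrange(Q) for _ in s]
    e = rng.randint(-tau, tau)
    b = (sum(si * ai for si, ai in zip(s, a)) + e + x * (Q // 4)) % Q
    return a, b

def mod_reduce(cipher, q=Q, r=R):
    """Scale every component from modulus q to modulus r by rounding r*v/q."""
    a, b = cipher
    def scale(v):               # nearest integer to r*v/q, integer arithmetic only
        return ((2 * r * v + q) // (2 * q)) % r
    return [scale(ai) for ai in a], scale(b)
```

For a binary key, each rounding contributes at most 1/2 to the error, so the $n + 1$ roundings plus the scaled original error $r\tau/q$ stay below $n$ for these parameters.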

2.3. RLWE Ciphers

Lyubashevsky et al. introduced the RLWE problem to obtain more efficient encryption schemes [33]. An RLWE sample is a pair $v = (a(x), b(x)) \in R_{n,q}^2$, where $a(x) = \sum_{i=0}^{n-1} a_i x^i \in R_{n,q}$ and
$$b(x) := s(x) a(x) + e(x) \pmod{(x^n + 1, q)}$$
for some $e(x) \in R_n$ with $\|e(x)\|_\infty \le \tau$, where $\tau$ is the bound of the error. Equivalently,
$$v \, (-s(x), 1)^t \equiv e(x) \pmod{(x^n + 1, q)}.$$
Let $m(x) = \sum_{i=0}^{n-1} m_i x^i$, where $m_i \in \{0, 1\}$, denote an $n$-bit message. The RLWE cipher of $m(x)$ with error size $\tau$ is
$$RE_s(m(x)) = v + m(x) D_q \, (0, 1) \in R_{n,q}^2.$$
Suppose $RE_s(m(x)) = (a(x), b(x))$. We have
$$b(x) - s(x) a(x) \equiv m(x) D_q + e(x) \pmod{(x^n + 1, q)}.$$
When $\tau \le D_q/2$, the message $m(x)$ can be recovered by rounding each coefficient of $b(x) - s(x) a(x) \pmod{(x^n + 1, q)}$ to the nearest multiple of $D_q$.
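As a small illustration of RLWE encryption and decryption in $R_{n,q}$, the following Python sketch uses schoolbook multiplication modulo $x^n + 1$. The function names, the toy parameters and the uniform error are assumptions made for this example only; a practical implementation would use much larger $n$ and fast polynomial multiplication.

```python
import random

def negacyclic_mul(f, g, q):
    """Multiply f and g in Z_q[x]/(x^n + 1), using x^n = -1."""
    n = len(f)
    out = [0] * n
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            k = i + j
            if k < n:
                out[k] = (out[k] + fi * gj) % q
            else:
                out[k - n] = (out[k - n] - fi * gj) % q
    return out

def rlwe_encrypt(s, m, q, tau, rng):
    """b(x) = s(x)a(x) + e(x) + m(x)*D_q with coefficients of e in [-tau, tau]."""
    n, d_q = len(s), q // 4
    a = [rng.randrange(q) for _ in range(n)]
    e = [rng.randint(-tau, tau) for _ in range(n)]
    sa = negacyclic_mul(s, a, q)
    b = [(sa[i] + e[i] + m[i] * d_q) % q for i in range(n)]
    return a, b

def rlwe_decrypt(s, cipher, q):
    """Round each coefficient of b(x) - s(x)a(x) to the nearest multiple of D_q."""
    a, b = cipher
    d_q = q // 4
    sa = negacyclic_mul(s, a, q)
    bits = []
    for bi, ci in zip(b, sa):
        p = (bi - ci) % q
        if p > q // 2:
            p -= q
        bits.append(round(p / d_q) % 2)
    return bits
```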

2.4. GSW Ciphers and External Product

2.4.1. Gadget Matrix

Suppose that $B$ and $\ell$ are positive integers such that $B^\ell \ge q$. With $g = (1, B, \ldots, B^{\ell-1})$, an arbitrary $a \in \mathbb{Z}_q$ can be written as
$$a = a_0 + a_1 B + \cdots + a_{\ell-1} B^{\ell-1} = (a_0, a_1, \ldots, a_{\ell-1}) \, g^t,$$
where each $a_i \in \mathbb{Z}$ has a small size. If we require $-B/2 \le a_i \le B/2$, then $(a_0, a_1, \ldots, a_{\ell-1})$ is unique. If we instead allow $-2B \le a_i \le 2B$, the following lemma is straightforward to prove.
Lemma 2 ([7]). Let $B^\ell \ge q$, $a \in \mathbb{Z}$. For $0 \le i \le \ell - 1$, choose $x_i \in \mathbb{Z}$ with $|x_i| \le 3B/2$ uniformly at random and independently. Suppose that
$$a - (x_0 + x_1 B + \cdots + x_{\ell-1} B^{\ell-1}) \equiv y_0 + y_1 B + \cdots + y_{\ell-1} B^{\ell-1} \pmod{q},$$
where $|y_i| \le B/2$. Set $a_i = x_i + y_i$; then $(a_0, a_1, \ldots, a_{\ell-1})$ is a uniform random solution to
$$a \equiv a_0 + a_1 B + \cdots + a_{\ell-1} B^{\ell-1} \pmod{q}$$
with $|a_i| \le 2B$.
Hence, any list of elements in $\mathbb{Z}_q$ can be expanded in this way. That is, each polynomial $a(x) \in R_{n,q}$ can be written as
$$a(x) = a_0(x) + a_1(x) B + \cdots + a_{\ell-1}(x) B^{\ell-1} = (a_0(x), a_1(x), \ldots, a_{\ell-1}(x)) \, g^t,$$
where $\|a_i(x)\|_\infty \le 2B$. The gadget matrix of size $(2\ell) \times 2$ is defined as
$$G = \begin{pmatrix} g^t & 0 \\ 0 & g^t \end{pmatrix}.$$
Any $(a(x), b(x)) \in R_{n,q}^2$ can be written as
$$(a(x), b(x)) = u(x) \, G,$$
where $u(x) \in R_n^{2\ell}$ is selected uniformly at random with $\|u(x)\|_\infty \le 2B$. We write
$$u(x) = (a(x), b(x)) \, G^{-1}.$$
Here $G^{-1}$ is only an operator acting on the right of $(a(x), b(x))$ ($G$ is not a square matrix, so it has no inverse). The row vector $u(x)$ has $2\ell$ polynomials whose coefficients are small, at most $2B$ in absolute value; in other words, the dimension is increased in order to decrease the coefficient size. By the above definition, we have
$$(v G^{-1}) G = v, \quad v \in R_{n,q}^2.$$
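The base-$B$ expansion behind the gadget vector $g$ can be sketched in a few lines of Python. This example shows only the deterministic decomposition with balanced digits, not the randomized one of Lemma 2; the function name and parameters are hypothetical, and it assumes $B$ is even and $B^\ell$ is comfortably larger than $q$ so that the centred representative fits in $\ell$ digits.

```python
def gadget_decompose(a, q, B, ell):
    """Digits a_0..a_{ell-1} in (-B/2, B/2] with sum(a_i * B**i) congruent to a mod q."""
    a %= q
    if a > q // 2:
        a -= q                  # work with the centred representative of a
    digits = []
    for _ in range(ell):
        d = a % B
        if d > B // 2:
            d -= B              # balanced digit
        digits.append(d)
        a = (a - d) // B
    if a != 0:
        raise ValueError("B**ell is too small for this value")
    return digits
```

Multiplying the digit vector by $g^t = (1, B, \ldots, B^{\ell-1})^t$ gives back the original value modulo $q$, which is the scalar version of the identity $(vG^{-1})G = v$.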

2.4.2. External Product

For a row vector $v = (a(x), b(x)) \in R_{n,q}^2$ and an arbitrary $(2\ell) \times 2$ matrix $A \in R_{n,q}^{(2\ell) \times 2}$, define the external product of $v$ and $A$ as
$$v \odot A = (v G^{-1}) A \in R_{n,q}^2;$$
it is a random vector, because $v G^{-1}$ is a random vector of size $1 \times 2\ell$. By definition, the external product is right distributive; namely, for any two $(2\ell) \times 2$ matrices $A, B \in R_{n,q}^{(2\ell) \times 2}$, we have
$$v \odot (A + B) \equiv (v G^{-1})(A + B) = (v G^{-1}) A + (v G^{-1}) B = v \odot A + v \odot B \pmod{(x^n + 1, q)}.$$

2.4.3. GSW Ciphers

Let $s(x) = \sum_{i=0}^{n-1} s_i x^i$ with $s_i \in \{0, 1\}$ be an $n$-bit secret key, and let $A \in R_{n,q}^{(2\ell) \times 2}$ be a matrix whose rows are RLWE samples. A GSW cipher for $m(x) \in R_n$ is
$$GSW_s(m(x)) = A + m(x) G \in R_{n,q}^{(2\ell) \times 2};$$
according to the definition of RLWE samples,
$$A \, (-s(x), 1)^t \equiv w(x) \pmod{(x^n + 1, q)},$$
where $w(x) \in R_n^{2\ell}$ and $\|w(x)\|_\infty \le \tau$; $\tau$ is the error size of the GSW cipher.
Lemma 3 ([7]). Let $m_0(x), m_1(x) \in R_n$ be any two polynomials. For any $RE_s(m_0(x))$ with error size $\tau$ and any $GSW_s(m_1(x))$ with error size $\tau_1$, we have
$$RE_s(m_0(x)) \odot GSW_s(m_1(x)) = RE_s(m_0(x) m_1(x)),$$
where $RE_s(m_0(x) m_1(x))$ has an error size of at most $\tau \|m_1(x)\|_\infty + 4 B n \ell \tau_1$.

2.5. Bloom Filter

A Bloom filter [34] is a compact data structure for probabilistic set membership testing that supports efficient insertion and queries. It provides a time- and space-efficient method to check whether an element is in a set. A Bloom filter consists of a binary vector and a set of hash functions; $b_j$ denotes the $j$-th bit of the Bloom filter $b$, and all bits of an empty Bloom filter are 0. A Bloom filter $b$ supports the following three operations:
$Create(\alpha)$: Create an empty Bloom filter with $\alpha$ bits; the hash functions $\{h_i \mid 0 \le i < \beta\}$ are
$$h_i : \{0, 1\}^* \to \{0, \ldots, \alpha - 1\}.$$
$Add(x)$: Compute the $\beta$ hash values $g_i = h_i(x)$ of the element $x$ using the hash functions $h_i$ ($0 \le i < \beta$), and set the Bloom filter cell with index $g_i$ to 1:
$$g_i = h_i(x) \Rightarrow b_{g_i} = 1.$$
$Test(x)$: Test whether the element $x$ is in the Bloom filter $b$. Compute the $\beta$ hash values $g_i = h_i(x)$ of the element $x$; if all $\beta$ cells with indices $g_i$ are 1 ($b_{g_i} = 1$), then return 1 (true):
$$Test(x) = \bigwedge_{i=0}^{\beta-1} b_{h_i(x)} = \bigwedge_{i=0}^{\beta-1} b_{g_i}.$$
The Bloom filter has a negligible false positive probability: $Test(x)$ may return 1 even though $x$ was never added to the Bloom filter. Given $\omega$ elements to be added and an expected maximum false positive probability $2^{-k}$, the Bloom filter size $\alpha$ needs to satisfy
$$\alpha \ge \frac{\omega k}{\ln^2 2}.$$
Bloom filters are widely used in cryptography. Bellovin and Cheswick [35] and Goh [36] implemented secure document search using a Bloom filter. Raykov and Bellovin [37] realized a secure database query. Qiu L and Li Y [38] realized privacy-preserving data mining, and BIP-0037 put forward the application of a Bloom filter in Bitcoin. References [39,40,41] realized set intersection computing based on Bloom filters.
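The three operations $Create$, $Add$ and $Test$ can be sketched in Python as follows. The class name and the choice of hash functions (SHA-256 of a 4-byte index prefix followed by the element, reduced modulo $\alpha$) are assumptions made for illustration; the text does not prescribe a particular family $h_i$.

```python
import hashlib

class BloomFilter:
    def __init__(self, alpha, beta):
        """Create(alpha): alpha zero bits and beta hash functions."""
        self.alpha, self.beta = alpha, beta
        self.bits = [0] * alpha

    def _indices(self, x: bytes):
        # h_i(x): SHA-256 of (i || x) reduced modulo alpha (illustrative choice)
        for i in range(self.beta):
            digest = hashlib.sha256(i.to_bytes(4, "big") + x).digest()
            yield int.from_bytes(digest, "big") % self.alpha

    def add(self, x: bytes):
        """Add(x): set every cell b[g_i] to 1."""
        for g in self._indices(x):
            self.bits[g] = 1

    def test(self, x: bytes) -> int:
        """Test(x): logical AND of the beta cells b[g_i]."""
        return int(all(self.bits[g] for g in self._indices(x)))
```

An added element always tests positive; an element that was never added tests positive only if all of its $\beta$ cells happen to have been set by other elements, which is the false positive event discussed above.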

3. Homomorphic Operations

In the SGFHE scheme, given any two LWE ciphers $E_s(x_1)$ and $E_s(x_2)$ with $x_1, x_2 \in \{0, 1\}$, one bootstrapping computes the three bit operations $E_s(x_1 \wedge x_2)$, $E_s(x_1 \vee x_2)$ and $E_s(x_1 \oplus x_2)$; the scheme follows the approach of Ducas et al. [42] and Chillotti [43], but does not need to perform a key switch.

3.1. Key Generations

Let $n$ be a power of 2, $n \ge 64$. Suppose that $r$ is divisible by 8; let $m = r/2$, $B = 35 r^2 n$,
$$r \ge 16n, \quad q \ge nr, \quad 1220 r^4 n^2 \le Q < 1225 r^4 n^2 = B^2,$$
$$D_r = \lfloor r/4 \rfloor, \quad D_q = \lfloor q/4 \rfloor, \quad \tilde{D}_Q = \lfloor Q/8 \rfloor, \quad \text{and} \quad G = \begin{pmatrix} 1 & 0 \\ B & 0 \\ 0 & 1 \\ 0 & B \end{pmatrix}.$$
Secret key: Pick $s \in \{0, 1\}^n$ uniformly at random; let $s(x) = \sum_{i=0}^{n-1} s_i x^i$.
Public key: $pk = (k_0(x), k_1(x))$, where $k_0(x) \in R_{n,q}$ and
$$k_1(x) \equiv k_0(x) s(x) + e(x) \pmod{(x^n + 1, q)},$$
where $e(x) \in R_n$, $\|e(x)\|_\infty < D_q/(41n)$.
Bootstrapping key: A bootstrapping key $bk = (C_0, C_1, \ldots, C_{n-1})$ can be generated as follows:
For $0 \le i \le n - 1$ do:
(1) Pick $a_{ji}(x) \in R_{m,Q}$, $1 \le j \le 4$;
(2) Pick $e_{ji}(x) \in R_m$ with $\|e_{ji}\|_\infty \le n$, $1 \le j \le 4$;
(3) Compute $b_{ji}(x) := a_{ji}(x) s(x) + e_{ji}(x) \pmod{(x^m + 1, Q)}$, $1 \le j \le 4$;
(4) Set
$$C_i := \begin{pmatrix} a_{1i}(x) & b_{1i}(x) \\ a_{2i}(x) & b_{2i}(x) \\ a_{3i}(x) & b_{3i}(x) \\ a_{4i}(x) & b_{4i}(x) \end{pmatrix} + s_i G \pmod{Q}.$$

3.2. Bootstrapping Algorithm

Lemma 4.
Suppose that a bootstrapping key $bk$ has an error size of at most $\tau_1$, $r$ is divisible by 8, $r \ge 16n$ and $Q \ge \frac{n}{n-3} \, 16 B r^2 \tau_1$. Then, for any two LWE ciphers $E_s(x_i) = v_i \in \mathbb{Z}_r^n \times \mathbb{Z}_r$ with error size $\le D_r/4$, where $x_i \in \{0, 1\}$ for $i = 1, 2$, the bootstrapping algorithm in Algorithm 1 outputs random LWE ciphers $E_s(x_1 \wedge x_2), E_s(x_1 \vee x_2), E_s(x_1 \oplus x_2) \in \mathbb{Z}_r^n \times \mathbb{Z}_r$, all with error size $< n \le D_r/4$ [7].
Algorithm 1 Bootstrapping Algorithm: $BT_{bk}(v_1, v_2) \to c_1, c_2, c_3$.
Input: $bk = (C_0, C_1, \ldots, C_{n-1})$, $C_k \in R_{m,Q}^{(2\ell) \times 2}$: bootstrapping key;
    $(v_1, v_2)$ with $v_i \in \mathbb{Z}_r^n \times \mathbb{Z}_r$: $v_i = E_s(x_i)$, $x_1, x_2 \in \{0, 1\}$;
Output: $E_s(x_1 \wedge x_2), E_s(x_1 \vee x_2), E_s(x_1 \oplus x_2) \in \mathbb{Z}_r^n \times \mathbb{Z}_r$;
1: Compute $u := v_1 + v_2 = (u_0, u_1, \ldots, u_{n-1}, u_n) \in \mathbb{Z}_r^n \times \mathbb{Z}_r$;
2: $T := \{ j \in \mathbb{Z} : -D_r \le j \le D_r \}$, $t(x) := \sum_{j \in T} x^j$;
3: $A := (0, \; t(x) \, x^{-u_n} \tilde{D}_Q) \in R_{m,Q}^2$;
4: for $k$ from 0 to $n - 1$ do
5:    $A := A \odot (G + (x^{u_k} - 1) C_k)$;
6: end for
7: Let $A = (\sum_{i=0}^{m-1} a_i x^i, \; \sum_{i=0}^{m-1} b_i x^i)$. Set
    $a_1 := (Extract(a(x), 3m/4), \; \tilde{D}_Q + b_{3m/4}) \in \mathbb{Z}_Q^n \times \mathbb{Z}_Q$;
    $a_2 := (Extract(a(x), m/4), \; \tilde{D}_Q - b_{m/4}) \in \mathbb{Z}_Q^n \times \mathbb{Z}_Q$;
    $a_3 := a_2 - a_1 \in \mathbb{Z}_Q^n \times \mathbb{Z}_Q$;
8: for $i$ from 1 to 3 do
9:    $c_i := \lfloor r a_i / Q \rceil \in \mathbb{Z}_r^n \times \mathbb{Z}_r$;
10: end for
11: Return $c_1, c_2, c_3$;

3.3. Encryption Scheme

Lemma 5.
Suppose that $r = 2^{t+1}$ and $(a(x), b(x)) \in R_{n,r}^2$ is computed by Algorithm 2. Then there exists some $\omega_3(x)$ with $\|\omega_3(x)\|_\infty \le D_r/4$ such that
$$2^{t-4} b(x) - s(x) a(x) \equiv \omega_3(x) + m(x) D_r \pmod{(x^n + 1, r)}.$$
Specifically, if $r = 16n$, then $(u, v)$ returned by Algorithm 2 has $6n$ bits and represents an RLWE cipher $RE_s(m(x))$ with error size $< n$ [7].
Algorithm 2 Encryption with private key: $RE_s(m(x)) \to (u, v)$.
Input: $n$-bit secret key $s(x) = \sum_{i=0}^{n-1} s_i x^i$, $s_i \in \{0, 1\}$;
    $n$-bit message $m(x) = \sum_{i=0}^{n-1} m_i x^i$, $m_i \in \{0, 1\}$;
    $t := \lceil \log_2(r) \rceil$, hence $2^t \ge r \ge 2^{t-1}$;
    $P : \{0, 1\}^n \to \{0, 1\}^{n(t+1)}$;
Output: $(u, v) \in \{0, 1\}^n \times (\{0, 1\}^5)^n$
1: $u \leftarrow \{0, 1\}^n$, $a(x) := P(u, x) \in R_{n,r}$;
2: $\omega(x) \leftarrow R_n$ with $\|\omega(x)\|_\infty \le D_r/8$, $b_1(x) := a(x) s(x) + \omega(x) + m(x) D_r \pmod{(x^n + 1, r)}$;
3: $b(x) = \sum_{i=0}^{n-1} b_i x^i := \lfloor b_1(x) / 2^{t-4} \rceil$;
4: $v = (b_0, b_1, \ldots, b_{n-1}) \in (\{0, 1\}^5)^n$;
5: return $(u, v)$;
Lemma 6.
Suppose that $r = 2^{t+1}$, $r \ge 16n$, $q \ge 4r$ and $n \ge 164$. Let $RE_{pk}(m(x)) := (a(x), b(x)) \in R_{n,r}^2$ be any ciphertext output by Algorithm 3. Then $2^{t-5} b(x) - s(x) a(x) \equiv \omega_3(x) + m(x) D_r \pmod{(x^n + 1, r)}$ for some $\omega_3(x) \in R_n$ with $\|\omega_3(x)\|_\infty \le D_r/4$.
Specifically, if $r = 16n$, then any ciphertext $(a(x), b(x))$ has $n(10 + \log_2(n))$ bits and each coefficient of the error $\omega_3(x)$ lies randomly in $(-n, n)$ [7].
Algorithm 3 Encryption under public key: $RE_{pk}(m(x)) \to (a(x), b(x)) \in R_{n,r}^2$.
Input: $pk = (k_0(x), k_1(x))$, $k_0(x) \in R_{n,q}$;
    $m(x) = \sum_{i=0}^{n-1} m_i x^i$: $n$-bit message where each $m_i \in \{0, 1\}$;
    $t := \lceil \log_2(r) \rceil$;
Output: $(a(x), b(x)) \in R_{n,r}^2$
1: $u(x) \leftarrow R_n$, each coefficient random from $\{-1, 0, 1\}$;
2: $\omega_1(x) \leftarrow R_n$, $\|\omega_1(x)\|_\infty \le D_q/(41n)$;
3: $\omega_2(x) \leftarrow R_n$, $\|\omega_2(x)\|_\infty \le D_q/82$;
4: $a_1(x) := k_0(x) u(x) + \omega_1(x) \pmod{(x^n + 1, q)}$;
5: $b_1(x) := k_1(x) u(x) + \omega_2(x) + m(x) D_q \pmod{(x^n + 1, q)}$;
6: $a(x) := \lfloor \frac{r}{q} a_1(x) \rceil$, $b(x) := \lfloor \frac{r}{2^{t-5} q} b_1(x) \rceil$;
7: Return $(a(x), b(x))$
We can divide the data $x$ into $d$ blocks of length $n$. Let $N = dn$, $x = (x_1, x_2, \ldots, x_d) \in \{0, 1\}^N$ and $x_k = (x_{k,0}, x_{k,1}, \ldots, x_{k,n-1}) \in \{0, 1\}^n$. Each $x_k$ can be expressed as a polynomial $\sum_{i=0}^{n-1} x_{k,i} x^i \in R_n$. The blocks can be encrypted with the private-key scheme, $c_k = RE_s(x_k)$, $1 \le k \le d$, by Algorithm 2, in which case the total ciphertext size is about $6N$ bits; or with the public-key scheme, $c'_k = RE_{pk}(x_k)$, $1 \le k \le d$, by Algorithm 3, in which case the total ciphertext size is about $N(10 + \log_2(n))$ bits. Homomorphic computing can be performed in three steps as follows:
(1) Unpack the RLWE ciphertexts $RE(x_k)$ to get LWE ciphers in $\mathbb{Z}_r^n \times \mathbb{Z}_r$ for the bits of $x$:
$$RE(x_k) \xrightarrow{unpack} E_s(x_{k,i})$$
(2) Homomorphically compute $f(x) = y = \{y_0, y_1, \ldots, y_M\} \in \{0, 1\}^M$ on the LWE ciphers:
$$f(x) \xrightarrow{BT_{bk}} \{E_s(y_0), E_s(y_1), \ldots, E_s(y_M)\}$$
(3) Pack the LWE ciphers $\{E_s(y_0), E_s(y_1), \ldots, E_s(y_M)\}$ of the function $f$ into RLWE ciphers in $R_{n,r}^2$:
$$\{E_s(y_0), E_s(y_1), \ldots, E_s(y_M)\} \xrightarrow{pack} RE_s(y)$$

4. Privacy-Preserving Set Intersection

We abstract the private set intersection computation model as follows. The client C owns a set $\{c_1, \ldots, c_v\}$ of size $v$, and the server S holds a set $\{s_1, \ldots, s_\omega\}$ of size $\omega$. At the end of the protocol, the client C obtains only the intersection $\{c_1, \ldots, c_v\} \cap \{s_1, \ldots, s_\omega\}$; the server cannot get any information about the client's input or about the set intersection (including the size of the intersection).

4.1. The Basic Two-Party Computing Protocol

A summary of the basic private two-party intersection protocol is shown in Figure 1. The specific steps are as follows:
1. The client C encrypts its set with the private key and sends the ciphertexts to the server S.
2. The server S performs homomorphic computing with the bootstrapping key and sends the result to the client C.
3. The client C decrypts and computes the intersection of the two sets; the server S cannot acquire any information about the input and output.
Our basic two-party computing protocol is shown in Figure 2. At step C $\to$ S, the client sends $pk$, $bk$ and $RE_{sk}(c_k)$ to the server. At step S, the server unpacks $RE_{sk}(c_k)$ to get $E_{sk}(c_{k,j})$, unpacks $RE_{pk}(s_i)$ to get $E_{sk}(s_{i,j})$, samples $u \in \{0, 1\}^n$, calls bootstrapping operations to compute $E_{sk}(z_{k,i})$, computes the LWE ciphers $E_{sk}(w_{i,j})$, packs the resulting LWE ciphers $E_{sk}(w_{i,j})$ into RLWE ciphers $RE_{sk}(w_i)$ and sends them to the client. At step C, the client decrypts $RE_{sk}(w_i)$ to get $w_i$ and computes the intersection.

4.2. Correctness of the Basic Two-Party Computing Protocol

First, the correctness of the SGFHE scheme has been proven.
Let $c_k, s_i$ be the binary representations of the set elements of the client and the server, respectively. Shorter representations are padded with 0s to length $n$.
$$c_k = (c_{k,1}, \ldots, c_{k,n}) \in \{0, 1\}^n, \quad 1 \le k \le v$$
$$s_i = (s_{i,1}, \ldots, s_{i,n}) \in \{0, 1\}^n, \quad 1 \le i \le \omega$$
$$u \xleftarrow{sample} \{0, 1\}^n$$
$$z_{k,i} = \bigvee_{j=1}^{n} (c_{k,j} \oplus s_{i,j})$$
If $z_{k,i} = 1$, then $c_k \ne s_i$; if $z_{k,i} = 0$, then $c_k = s_i$.
The server can acquire $E_{sk}(z_{k,i})$ by unpacking $RE_{sk}(c_k) \xrightarrow{unpack} E_{sk}(c_{k,j})$ and $RE_{pk}(s_i) \xrightarrow{unpack} E_{sk}(s_{i,j})$ and calling $(2n - 1)$ bootstrapping operations, denoted by $z_{k,i} = \bigvee_{j=1}^{n} (c_{k,j} \oplus s_{i,j}) \xrightarrow{BT_{bk}} E_{sk}(z_{k,i})$.
Remark: RE represents an RLWE cipher; E represents an LWE cipher.
Let
$$z_i = \bigwedge_{k=1}^{v} z_{k,i};$$
$E_{sk}(z_i) \in \mathbb{Z}_r^n \times \mathbb{Z}_r$ can be computed from $E_{sk}(z_{k,i})$ by implementing $(v - 1)$ bootstrapping operations. Hence, implementing $(2n + v - 2)$ bootstrapping operations by (6) can compute $E_{sk}(z_i)$.
$$z_i = \bigwedge_{k=1}^{v} z_{k,i} = \bigwedge_{k=1}^{v} \bigvee_{j=1}^{n} (c_{k,j} \oplus s_{i,j}) \xrightarrow{BT_{bk}} E_{sk}(z_i) \tag{6}$$
If $z_i = 1$, then $c_k \ne s_i$ for all $k$, and $w_i = u$ is a random value; if $z_i = 0$, then there exists $k$ such that $c_k = s_i$, and $w_i = c_k = s_i$ is in the intersection. For $s_i$ and $u$, each bit is
$$w_{i,j} = z_i u_j \oplus (1 - z_i) s_{i,j},$$
so that
$$w_i = (w_{i,1}, \ldots, w_{i,n}) = z_i u \oplus (1 - z_i) s_i.$$
For plaintexts $u_j$ and $s_{i,j}$, an LWE cipher of any bit $z_i u_j \oplus (1 - z_i) s_{i,j}$ can be computed as
$$u_j E_{sk}(z_i) + s_{i,j} (E_{sk}(1) - E_{sk}(z_i)),$$
which still has error size $< D_r/4$.
The LWE cipher is
$$E_{sk}(w_{i,j}) = u_j E_{sk}(z_i) + s_{i,j} (E_{sk}(1) - E_{sk}(z_i)).$$
The server can pack the resulting LWE ciphers $E_{sk}(w_{i,j})$ into RLWE ciphers $RE_{sk}(w_i)$ and send them to the client.
In the end, the client decrypts $Dec(RE_{sk}(w_i)) \to w_i$ and computes the intersection
$$\{c_1, \ldots, c_v\} \cap \{w_1, \ldots, w_\omega\} \rightarrow \{c_1, \ldots, c_v\} \cap \{s_1, \ldots, s_\omega\}.$$
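The logic of the basic protocol can be followed on plaintext bits with a small Python sketch. It does not perform any encryption: every value the server handles here would be an LWE or RLWE cipher in the real protocol, and the OR, XOR and AND operations would be bootstrapping calls. The function names and the encoding of set elements as $n$-bit integers are assumptions made for this example.

```python
import random

def to_bits(value, n):
    """Binary representation of value, padded with 0s to n bits."""
    return [(value >> j) & 1 for j in range(n)]

def server_response(client_set, server_set, n, rng):
    """Plaintext analogue of the values w_i that the server returns."""
    u = [rng.randrange(2) for _ in range(n)]        # random mask u
    client_bits = [to_bits(c, n) for c in client_set]
    responses = []
    for s in server_set:
        s_bits = to_bits(s, n)
        # z_{k,i} = OR_j (c_{k,j} XOR s_{i,j}): 1 iff c_k differs from s_i
        z_ki = [int(any(cb ^ sb for cb, sb in zip(c_bits, s_bits)))
                for c_bits in client_bits]
        z_i = int(all(z_ki))                        # 1 iff s_i matches no client element
        # w_{i,j} = z_i * u_j + (1 - z_i) * s_{i,j}
        w_bits = [z_i * uj + (1 - z_i) * sj for uj, sj in zip(u, s_bits)]
        responses.append(sum(b << j for j, b in enumerate(w_bits)))
    return responses

def client_intersection(client_set, responses):
    """The client intersects its own set with the decrypted values w_i."""
    return set(client_set) & set(responses)
```

For example, with client set `[3, 17, 42]` and server set `[42, 5, 17, 99]`, the responses are `42`, `u`, `17`, `u`, so the client recovers `{17, 42}` unless the random mask `u` happens to coincide with one of its own elements.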

4.3. Security Analysis of the Basic Two-Party Computing Protocol

We analyze the security of the protocol by comparing the real model and the ideal model. The real model is the actual execution of the basic private set intersection protocol. In the ideal model, there is a trusted party for computing the intersection: the trusted party receives the input $\{c_1, \ldots, c_v\}$ of the client and the input $\{s_1, \ldots, s_\omega\}$ of the server, and returns the intersection to the client; the server cannot get any information about the output. The ideal model satisfies all security requirements by construction. In the semi-honest model, a participant's view includes its own input and the information received from other participants during the execution of the protocol. If a simulator can use only the participant's input and output to build a simulation that is computationally indistinguishable from this view, then the participant cannot obtain any information besides its input and output.
Theorem 1.
If the SGFHE scheme is secure, then the basic two-party computing protocol realizes private set intersection computing in the semi-honest model.
Proof. 
In the protocol, the server receives nothing besides RLWE ciphers. Its view can therefore be simulated with ciphertexts alone, and its security is based on the IND-CPA security of the RLWE scheme.
The client only receives the RLWE ciphers of the intersection elements and random RLWE ciphers. Therefore, its view contains only the output of the set intersection, and the view of the simulator is likewise only the output of the set intersection. □

4.4. The Improvement of the Basic Two-Party Computing Protocol

In the basic two-party computing protocol, the server returns the ciphertexts of the intersection elements or random ciphertexts, and the client computes the intersection by decrypting these ciphertexts. In our improved protocol, shown in Figure 3, the server only needs to determine whether $c_k$ is in $\{s_1, \ldots, s_\omega\}$, without computing the ciphertexts of the intersection elements. On the one hand, this reduces the computational complexity; on the other hand, it will not reveal the size of the server set.
Let $c_k, s_i$ be the binary representations of the set elements of the client and the server, respectively. Shorter representations are padded with 0s to length $n$.
$$c_k = (c_{k,1}, \ldots, c_{k,n}) \in \{0, 1\}^n, \quad 1 \le k \le v$$
$$s_i = (s_{i,1}, \ldots, s_{i,n}) \in \{0, 1\}^n, \quad 1 \le i \le \omega$$
$$z_{k,i} = \bigvee_{j=1}^{n} (c_{k,j} \oplus s_{i,j})$$
If $z_{k,i} = 1$, then $c_k \ne s_i$; if $z_{k,i} = 0$, then $c_k = s_i$.
The server can acquire $E_{sk}(z_{k,i})$ by unpacking $RE_{sk}(c_k) \xrightarrow{unpack} E_{sk}(c_{k,j})$ and $RE_{pk}(s_i) \xrightarrow{unpack} E_{sk}(s_{i,j})$ and calling $(2n - 1)$ bootstrapping operations, denoted by $z_{k,i} = \bigvee_{j=1}^{n} (c_{k,j} \oplus s_{i,j}) \xrightarrow{BT_{bk}} E_{sk}(z_{k,i})$. The server packs the LWE ciphers $E_{sk}(z_{k,i})$ into RLWE ciphers $RE_{sk}(z_k)$ and sends them to the client:
$$\{E_{sk}(z_{k,1}), E_{sk}(z_{k,2}), \ldots, E_{sk}(z_{k,\omega})\} \xrightarrow{pack} RE_{sk}(z_k). \tag{9}$$
The client decrypts $RE_{sk}(z_k)$; if all $z_{k,i}$ ($1 \le i \le \omega$) are 1, then
$$c_k \notin \{c_1, \ldots, c_v\} \cap \{s_1, \ldots, s_\omega\};$$
otherwise $c_k \in \{c_1, \ldots, c_v\} \cap \{s_1, \ldots, s_\omega\}$.
In the protocol, the server receives nothing besides RLWE ciphers, and its view can be simulated by the ciphertexts alone. Its security is based on the IND-CPA security of the RLWE scheme.
The client acquires $z_{k,i}$ by (9); however, the probability of obtaining $s_{i,j}$ from $z_{k,i}$ and $c_{k,j}$ is $2^{-n}$, which is negligible. The client only receives the output of the intersection; therefore, the view of the simulator is just the output of the set intersection.
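A plaintext Python sketch of the improved protocol makes the client's decision rule concrete. As before, this is only an illustration with hypothetical function names: no encryption is performed, and each flag computed by `server_flags` stands in for an LWE cipher $E_{sk}(z_{k,i})$ that the server would obtain by bootstrapping and then pack.

```python
def to_bits(value, n):
    """Binary representation of value, padded with 0s to n bits."""
    return [(value >> j) & 1 for j in range(n)]

def server_flags(client_set, server_set, n):
    """For each client element c_k, the row (z_{k,1}, ..., z_{k,omega})."""
    flags = []
    for c in client_set:
        c_bits = to_bits(c, n)
        row = []
        for s in server_set:
            s_bits = to_bits(s, n)
            # z_{k,i} = OR_j (c_{k,j} XOR s_{i,j}): 0 iff c_k equals s_i
            row.append(int(any(cb ^ sb for cb, sb in zip(c_bits, s_bits))))
        flags.append(row)
    return flags

def client_decide(client_set, flags):
    """c_k is in the intersection unless every z_{k,i} equals 1."""
    return {c for c, row in zip(client_set, flags) if not all(row)}
```

The server never constructs a masked copy of its own elements here; it only returns match flags, which is the simplification the improved protocol relies on.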

4.5. Two-Party Computing Protocol Based on a Bloom Filter

In this section, we construct a two-party protocol based on a Bloom filter, shown in Figure 4, in which the client C encrypts each bit of the Bloom filter with its private key and sends it to the server S. The server S homomorphically computes $Test(s_j)$ with the bootstrapping key of the client C and sends the result to the client. C obtains the intersection of the two sets by decrypting, but the server cannot get any information about the input and output (including the size of the intersection).
Let $c_k, s_j$ be the binary representations of the set elements of the client and the server, respectively. Shorter representations are padded with 0s to length $n$.
$$c_k = (c_{k,1}, \ldots, c_{k,n}) \in \{0, 1\}^n, \quad 1 \le k \le v.$$
$$s_j = (s_{j,1}, \ldots, s_{j,n}) \in \{0, 1\}^n, \quad 1 \le j \le \omega.$$
The client C constructs a Bloom filter $b = Create(\alpha)$, adds its elements $c_1, \ldots, c_v$ to it, and sends $pk$, $bk$, $RE_{sk}(b)$ to the server S.
$$z_j = Test(s_j) = \bigwedge_{i=0}^{\beta-1} b_{h_i(s_j)} \tag{10}$$
According to (10), with input $E_{sk}(b_1), \ldots, E_{sk}(b_\alpha)$,
$$E_{sk}(z_j) = \bigwedge_{i=0}^{\beta-1} E_{sk}(b_{h_i(s_j)}).$$
Call $(\beta - 1)$ bootstrapping operations to obtain $E_{sk}(z_j)$, denoted by
$$z_j = Test(s_j) = \bigwedge_{i=0}^{\beta-1} b_{h_i(s_j)} \xrightarrow{BT_{bk}} E_{sk}(z_j).$$
$$w_j = (w_{j,1}, \ldots, w_{j,n}) = z_j s_j \oplus (1 - z_j) u \tag{11}$$
If $z_j = 1$, then (up to the false positive probability of the Bloom filter) there exists $k$ such that $c_k = s_j$, and $w_j = s_j$ by (11); similarly, if $z_j = 0$, then $c_k \ne s_j$ for all $k$, and $w_j = u$ by (11). For plaintexts $s_j$ and $u$, each bit can be computed by (11),
$$w_{j,t} = z_j s_{j,t} \oplus (1 - z_j) u_t, \quad 1 \le t \le n.$$
The corresponding LWE cipher is
$$E_{sk}(w_{j,t}) = s_{j,t} E_{sk}(z_j) + u_t (E_{sk}(1) - E_{sk}(z_j)).$$
The correctness and security of the two-party computing protocol based on the Bloom filter are similar to those of the basic two-party computing protocol. Please refer to Section 4.2 and Section 4.3.
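The data flow of the Bloom-filter variant can also be traced on plaintext values with a short Python sketch. The hash family (SHA-256 with a one-byte index prefix), the function names and the encoding of elements as 8-byte integers are illustrative assumptions; in the protocol itself the filter bits are encrypted and the AND over $\beta$ cells is evaluated by bootstrapping.

```python
import hashlib
import random

def bloom_indices(x, alpha, beta):
    """Indices h_0(x), ..., h_{beta-1}(x) in {0, ..., alpha-1} (illustrative hashes)."""
    data = x.to_bytes(8, "big")
    return [int.from_bytes(hashlib.sha256(bytes([i]) + data).digest(), "big") % alpha
            for i in range(beta)]

def client_bloom(client_set, alpha, beta):
    """Client side: Create(alpha) followed by Add(c_k) for every element."""
    bits = [0] * alpha
    for c in client_set:
        for g in bloom_indices(c, alpha, beta):
            bits[g] = 1
    return bits

def server_bloom_response(bits, server_set, beta, n, rng):
    """Server side: z_j = Test(s_j), then w_j = s_j if z_j = 1 else the mask u."""
    alpha = len(bits)
    u = rng.getrandbits(n)
    responses = []
    for s in server_set:
        z = int(all(bits[g] for g in bloom_indices(s, alpha, beta)))
        responses.append(z * s + (1 - z) * u)
    return responses
```

The server's work per element is $\beta - 1$ AND operations regardless of how many elements the client inserted, which is why this variant's cost does not depend on the client's set size.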

5. Conclusions

We constructed two-party set intersection protocols based on a fully homomorphic encryption scheme. The protocols are simple and need only two rounds of communication, and their security is based on the RLWE and LWE problems in the semi-honest model. The ciphertext expansion of the protocols is small, which makes the protocols practical. Furthermore, we can extend the set intersection protocol by outsourcing the computation under the malicious model. One limitation of our schemes is that they are two-party protocols; in future work, we shall extend them to multi-party protocols. Another disadvantage is that the protocols are not efficient enough, because the bootstrapping operation is a bottleneck. On the theoretical side, the performance of fully homomorphic encryption has improved greatly as the technology has developed, but its efficiency is still worthy of in-depth study. Since the bottleneck of the SGFHE scheme is its bootstrapping operation, its parallelization and hardware implementation will be studied further to improve the overall efficiency of the protocols.

Author Contributions

Conceptualization, Y.C., C.T. and Q.X.; methodology, Y.C. and C.T.; validation Y.C., C.T. and Q.X.; writing-original draft preparation, Y.C., Q.X. and C.T.; writing-review and editing, Y.C. and Q.X. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the Foundation of National Natural Science of China under grant 61772147, in part by Guangdong Province Natural Science Foundation of major basic research and Cultivation project under grant 2015A030308016, in part by Project of Ordinary University Innovation Team Construction of Guangdong Province under grant 2015KCXTD014, in part by Basic Research Major Projects of Department of education of Guangdong Province under grant 2014KZDXM044, in part by Collaborative Innovation Major Projects of Bureau of Education of Guangzhou City under grant 1201610005 and in part by the Key-Area Research and Development Plan of Guangdong province under grant 2019B020215004.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Rvest, R.L.; Adieman, L.; DERtouzos, M.L. On data banks and privacy homomorphisms. Found. Secur. Comput. 1978, 4, 169–180. [Google Scholar]
  2. Craig, G. Fully Homomorphic Encryption Using Ideal Lattices. In Proceedings of the Annual ACM Symposium on Theory of Computing, Bethesda, MD, USA, 31 May–2 June 2009; pp. 169–178. [Google Scholar] [CrossRef] [Green Version]
  3. Van Dijk, M.; Gentry, C.; Halevi, S.; Vaikuntanathan, V. Fully homomorphic encryp-tion over the integers. IACR Cryptol. Eprint Arch. 2009, 616. [Google Scholar] [CrossRef] [Green Version]
  4. Brakerski, Z.; Vaikuntanathan, V. Efficient fully homomorphic encryption from (standard) LWE. SIAM J. Comput. 2014, 43, 831–871. [Google Scholar] [CrossRef]
  5. Brakerski, Z.; Vaikuntanathan, V. Fully homomorphic encryption from ring-LWE and security for key dependent messages. In Proceedings of the Advances in Cryptology-CRYPTO 2011, Santa Barbara, CA, USA, 14–18 August 2011; Springer: Berlin/Heidelberg, Germany, 2011; pp. 505–524. [Google Scholar] [CrossRef] [Green Version]
  6. Gentry, C.; Sahai, A.; Waters, B. Homomorphic encryption from learning with errors: Conceptually-simpler, asymptotically-faster, attribute-based. In Advances in Cryptology-CRYPTO 2013; Springer: Berlin/Heidelberg, Germany, 2013; pp. 75–92. [Google Scholar] [CrossRef] [Green Version]
  7. Gao, S. Efficient Fully Homomorphic Encryption Scheme. Cryptology ePrint Archive: Report 2018/637. 2018. Available online: https://eprint.iacr.org/2018/637 (accessed on 28 October 2020).
  8. Chillotti, I.; Gama, N.; Georgieva, M. Improving TFHE: Faster Packed Homomorphic Operations and Effcient Circuit Bootstrapping. Cryptology ePrint Archive: Report 2017/ 430. 2017. Available online: https://eprint.iacr.org/2017/430 (accessed on 28 October 2020).
  9. Oded, G.; Micali, S.; Avi, W. How to play ANY mental game. In Proceedings of the 19th Annual ACM Symposium on Theory of Computing, New York, NY, USA, 25–27 May 1987; pp. 218–229. [Google Scholar] [CrossRef]
  10. Ion, M.; Kreuter, B.; Nergiz, E. Private Intersection-Sum Protocol with Applications to Attributing Aggregate Ad Conversions. Cryptology ePrint Archive: Report 2017/738. 2017. Available online: https://eprint.iacr.org/2017/738 (accessed on 28 October 2020).
  11. Ion, M.; Kreuter, B.; Erhan, A. On Deploying Secure Computing Commercially: Private Intersection-Sum Protocols and their Business Applications. Cryptology ePrint Archive: Report 2019/723. 2019. Available online: https://eprint.iacr.org/2019/723. (accessed on 28 October 2020).
  12. Prasetyo, H.; Guo, J.M. A Note on Multiple Secret Sharing Using Chinese Remainder Theorem and Exclusive-OR. IEEE Access 2019, 7, 37473–37497. [Google Scholar] [CrossRef]
  13. Yang, Q.; Peng, G.; Gasti, P. MEG: Memory and Energy Efficient Garbled Circuit Evaluation on Smartphones. IEEE Trans. Inf. Forensics Secur. 2019, 14, 913–922. [Google Scholar] [CrossRef]
  14. Zhang, Z.; Zhang, F.G. Garbled Circuits and Indistinguishability Obfuscation. J. Cryptologic Res. 2019, 6, 541–560. [Google Scholar] [CrossRef]
  15. Qin, H.; Wang, H.; Wei, X. Privacy-Preserving Wildcards Pattern Matching Protocol for IoT Applications. IEEE Access 2019, 1. [Google Scholar] [CrossRef]
  16. Gama, M.; Mateus, P.; Souto, A. A Private Quantum Bit String Commitment. Entropy 2020, 22, 272. [Google Scholar] [CrossRef] [Green Version]
  17. Zhao, C.; Zhao, S.N.; Jia, Z.T. Advances in Practical Secure Two-party Computation and Its Application in Genomic Sequence Comparison. J. Cryptol. Res. 2019, 6, 194–204. [Google Scholar]
  18. Hazay, C.; Lindell, Y. Efficient Protocols for Set Intersection and Pattern Matching with Security against Malicious and Covert Adversaries. J. Cryptol. 2010, 23, 422–456. [Google Scholar] [CrossRef]
  19. Cristofaro, E.D.; Lu, E.; Tsudik, Y.A. Gene Efficient Techniques for Privacy-Preserving Sharing of Sensitive Information. Cryptology ePrint Archive: Report 2011/113. 2011. Available online: https://eprint.iacr.org/2011/113 (accessed on 28 October 2020).
  20. Saracevic, M.; Adamovic, S.; Miskovic, V.; Macek, N.; Sarac, M. A novel approach to steganography based on the properties of Catalan numbers and Dyck words. Future Gener. Comput. Syst. 2019, 100, 186–197. [Google Scholar] [CrossRef]
  21. Saracevic, M.; Adamovic, S.; Bisevac, E. Applications of Catalan numbers and Lattice Path combinatorial problem in cryptography. Acta Polytech. Hung. 2018, 15, 91–110. [Google Scholar]
  22. Coppolino, L.; D’Antonio, S.; Formicola, V.; Mazzeo, G.; Romano, L. VISE: Combining Intel SGX and Homomorphic Encryption for Cloud Industrial Control Systems. IEEE Trans. Comput. 2020, 99. [Google Scholar] [CrossRef]
  23. Lindell, Y.; Pinkas, B. Secure Multiparty Computation for Privacy-Preserving Data Mining. J. Priv. Confidentiality 2009. [Google Scholar] [CrossRef]
  24. Michael, L.; Nejdl, W.; Papapetrou, O.; Siberski, W. Improving Distributed Join Efficiency with Extended Bloom Filter Operations. In Proceedings of the 21st International Conference on Advanced Networking and Applications, Niagara Falls, ON, Canada, 21–23 May 2007. [Google Scholar]
  25. Naor, M.; Pinkas, B. Oblivious Transfer and Polynomial Evaluation. In Proceedings of the Thirty-First Annual ACM Symposium on Theory of Computing, Atlanta, GA, USA, 1–4 May 1999. [Google Scholar]
  26. Freedman, M.J.; Ishai, Y.; Pinkas, B. Keyword Search and Oblivious Pseudorandom Functions. In Second International Conference on Theory of Cryptography; Springer: Berlin/Heidelberg, Germany, 2005. [Google Scholar] [CrossRef] [Green Version]
  27. Chaum, D.; Rivest, R.L.; Sherman, A.T. Blind Signatures for Untraceable Payments. Adv. Cryptol. 1983. [Google Scholar] [CrossRef]
  28. Florian, K. Outsourced private set intersection using homomorphic encryption. ACM Symp. Inf. 2012, 85–86. [Google Scholar] [CrossRef]
  29. Nojima, R.; Kadobayashi, Y. Cryptographically Secure Bloom-Filters. Trans. Data Priv. 2009, 2, 131–139. [Google Scholar]
  30. Shen, L.; Chen, X.; Shi, J.; Hu, L. Survey on Private Preserving Set Intersection Technology. J. Comput. Res. Dev. 2017, 54, 2153–2169. [Google Scholar] [CrossRef]
  31. Regev, O. On lattices, learning with errors, random linear codes, and cryptography. In Proceedings of the 37th Annual ACM Symposium on Theory of Computing, Baltimore, MD, USA, 22–24 May 2005. [Google Scholar] [CrossRef]
  32. Regev, O. On lattices, learning with errors, random linear codes, and cryptography. J. ACM 2009, 34. [Google Scholar] [CrossRef]
  33. Lyubashevsky, V.; Peikert, C.; Regev, O. On ideal lattices and learning with errors over rings. In Cryptology-EUROCRYPT 2010; Springer: Berlin/Heidelberg, Germany, 2010; p. 6110. [Google Scholar]
  34. Bloom, B.H. Space/time trade-offs in hash coding with allowable errors. Commun. ACM 1970, 13, 422–426. [Google Scholar] [CrossRef]
  35. Bellovin, S.; Cheswick, W. Privacy-Enhanced Searches Using Encrypted Bloom Filters. Cryptology ePrint Archive: Report 2004/022. 2004. Available online: https://eprint.iacr.org/2004/022 (accessed on 28 October 2020).
  36. Goh, E. Secure Indexes. Cryptology ePrint Archive: Report 2003/216. 2003. Available online: https://eprint.iacr.org/2003/216 (accessed on 28 October 2020).
  37. Raykova, M.; Vo, B.; Bellovin, S.; Malkin, T. Secure Anonymous Database Search. In Proceedings of the ACM Cloud Computing Security Workshop, Chicago, IL, USA, 13 November 2009; pp. 115–126. [Google Scholar] [CrossRef] [Green Version]
  38. Qiu, L.; Li, Y.; Wu, X. Preserving privacy in association rule mining with bloom filters. J. Intell. Inf. Syst. 2007, 29, 253–278. [Google Scholar] [CrossRef]
  39. Debnath, S.K.; Dutta, R. Secure and Efficient Private Set Intersection Cardinality Using Bloom Filter. Int. Inf. Secur. Conf. 2015, 9290, 209–226. [Google Scholar] [CrossRef]
  40. Changyu, D.; Liqun, C.; Zikai, W. When private set intersection meets big data: An efficient and scalable protocol. In Proceedings of the ACM Conference on Computer and Communications Security, Berlin, Germany, 4–8 November 2013; ACM: New York, NY, USA, 2013; pp. 789–800. [Google Scholar] [CrossRef] [Green Version]
  41. Egert, R.; Fischlin, M.; Gens, D. Privately Computing Set-Union and Set-Intersection Cardinality via Bloom Filters. Eur. J. Oper. Res. 2015, 139, 371–389. [Google Scholar] [CrossRef]
  42. Ducas, L.; Micciancio, D. FHEW: Bootstrapping homomorphic encryption in less than a second. In Proceedings of the Advances in Cryptology-EUROCRYPT 2015, Sofia, Bulgaria, 26–30 April 2015; Springer: Berlin/Heidelberg, Germany, 2015. Part I. Volume 9056, pp. 617–640. [Google Scholar] [CrossRef] [Green Version]
  43. Chillotti, I.; Gama, N.; Georgieva, M.; Izabachéne, M. Faster fully homomorphic encryption: Bootstrapping in less than 0.1 seconds. In Proceedings of the Advances in Cryptology-ASIACRYPT 2016: 22nd International Conference on the Theory and Application of Cryptology and Information Security, Hanoi, Vietnam, 4–8 December 2016; Springer: Berlin/Heidelberg, Germany, 2016. Part I. Volume 10031, pp. 3–33. [Google Scholar] [CrossRef]
Figure 1. Summary of the intersection protocol.
Figure 2. The basic two-party computing protocol.
Figure 3. Improvement.
Figure 4. Protocol based on a Bloom filter.
