BY 4.0 license Open Access Published by De Gruyter February 14, 2024

Searchable encryption with randomized ciphertext and randomized keyword search

  • Marco Calderini, Riccardo Longo, Massimiliano Sala and Irene Villa

Abstract

The notion of public-key encryption with keyword search (PEKS) was introduced to search over encrypted data without performing any decryption. In this article, we propose a PEKS scheme in which both the encrypted keyword and the trapdoor are randomized so that the cloud server is not able to recognize identical queries a priori. Our scheme is Ciphertext-Indistinguishability secure in the single-user setting and Trapdoor-Indistinguishability secure in the multi-user setting with a stronger security notion, i.e., with multi-trapdoor.

1 Introduction

With the rapid development of cloud computing technology, more and more enterprises and individuals are willing to share their own data on cloud platforms.

Since data owners lose control of the data and cloud servers may be untrusted, several security and privacy issues arise in cloud storage. So, sensitive data should be encrypted before being uploaded to the cloud server to protect the data from being leaked. However, data encryption makes it extremely difficult to search for a specific file in a large number of encrypted files.

In the last few years, many prominent cryptographic primitives have been proposed for achieving secure and efficient cloud data usage, such as searchable encryption (SE) [1,2]. SE allows a remote server to search in the encrypted data on behalf of a client without the knowledge of plaintext data.

Almost all SE techniques provide search ability over encrypted documents by extracting the keywords from the plaintexts (the documents) and generating searchable ciphertexts corresponding to these keywords [3–7]. Then, data receivers can search uploaded encrypted documents to find those containing a keyword by generating a trapdoor to send to the server. Once the trapdoor has been received, the server runs an algorithm to test which documents contain the searched keyword. If there is a match, then the server returns the associated encrypted document.

The first SE schemes that appeared in the literature use a symmetric setting [2]. In 2004, Boneh et al. [1] proposed the first public-key encryption with keyword search (PEKS). In a multi-user setting, PEKS [1] allows any user to encrypt keywords that the designated holders of the search key can later search for.

However, PEKS schemes are vulnerable to offline keyword guessing attacks (KGAs) [8,9]. That is, given a trapdoor, the adversary can generate a ciphertext of a guessed keyword and then test whether it matches the trapdoor. If the keyword space has low entropy, this attack is very efficient. Indeed, several PEKS schemes have been shown to be insecure against KGAs [5,8–12].

Public-key authenticated encryption with keyword search (PAEKS) was proposed in 2017 by Huang and Li [13] to defend against KGAs. Its security model guarantees two security goals: cipher-keyword indistinguishability (CI-security) and trapdoor indistinguishability (TI-security). The first refers to the fact that an attacker is not able to distinguish which keyword is associated to a ciphertext, even when this keyword is one (randomly chosen by the challenger) of two alternatives controlled by the attacker itself. Conversely, trapdoor indistinguishability states that an attacker is not able to distinguish which keyword is associated to a trapdoor, even when this keyword is one (randomly chosen by the challenger) of two alternatives controlled by the attacker itself. In the standard setting, the attacker may ask queries to a trapdoor oracle and a ciphertext oracle as long as they are not related to the challenge, while in the Full Security setting, this restriction is relaxed.

Recently, Noroozi and Eslami [14] showed that the PAEKS scheme in the study by Huang and Li [13] is not secure in the multi-user setting, and Qin et al. [6] showed that it is not secure in the multi-cipher-keyword setting. Both works proposed adjustments to that scheme. Qin et al. [15] proposed a PAEKS scheme and proved that it is secure in the multi-cipher-keyword setting for CI-security and in the multi-user setting. However, in the study by Qin et al. [15], the trapdoor is deterministic; thus, if an attacker is allowed to issue a trapdoor query for any challenge keyword, then the scheme is not secure in a multi-trapdoor-keyword setting. In addition, Emura [16] introduced a PAEKS scheme in which each keyword is converted into an extended keyword, and a PEKS scheme is then used for extended keywords. The author defined trapdoor privacy by formalizing indistinguishability against keyword guessing attack (IND-IKGA) and proved that the proposed scheme is secure under this notion. As remarked by Emura [16], IND-IKGA does not imply full TI-security.

In this article, we propose a PAEKS scheme with an improved TI-security model with respect to the study by Qin et al. [15]. In particular, our PAEKS scheme is CI-secure in the single-user scenario and TI-secure in the multi-trapdoor and multi-user scenario. Moreover, both the encryption and the trapdoor are randomized, i.e., encrypting two times the same keyword (or creating the trapdoor for the same keyword) will produce different results, in contrast with the study by Qin et al. [15] where the trapdoor is not randomized. Obviously, if the cloud server receives two trapdoors, both testing positive on the same ciphertext, then the server learns that the two corresponding queries are the same, even if the trapdoors look different. Yet, in our system, the cloud server is not able to recognize a priori that two encrypted trapdoors correspond to the same query. This is a significant advancement w.r.t. previous schemes.

Note also that, regarding our discussion on security, if we exchange the role of the encryption and trapdoor algorithm, we would obtain a PAEKS scheme that is CI-secure in the multi-user setting and TI-secure in the single-user setting.

This article is structured as follows: in Section 2, we provide some preliminary notions that are useful to understand the rest of the article; Section 3 presents PEKS and PAEKS schemes, together with the security models for trapdoor indistinguishability and ciphertext indistinguishability; in Section 4, we describe our PAEKS scheme, commenting on the possibility of combining multiple keywords; and in Section 5, we provide the security proofs of our scheme. In particular, we prove that our PAEKS scheme is fully secure in a multi-user setting for trapdoor indistinguishability, and it is secure in a single-user setting for ciphertext indistinguishability. The conclusions of our work are in Section 6.

2 Preliminaries

In this section, we collect the notations and preliminaries needed for the rest of this work. The symbol $\mathbb{Z}$ stands for the ring of integers, and for $n$ a positive integer, $\mathbb{Z}_n$ is the ring of integers modulo $n$.

2.1 Bilinear pairing

Let $G$ and $G_T$ be two multiplicative cyclic groups of prime order $p$. An admissible pairing is defined as a map $e : G \times G \to G_T$ that satisfies the following properties:

  1. Bilinearity: for any $g, h \in G$ and $a, b \in \mathbb{Z}$, $e(g^a, h^b) = e(g, h)^{ab}$.

  2. Non-degeneracy: for any generator $g$ of $G$, $e(g, g) \in G_T$ is a generator of $G_T$.

  3. Computability: for any $g, h \in G$, we can compute $e(g, h)$ efficiently.
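Two consequences of bilinearity are used repeatedly in the correctness check and in the security proof. As a short worked illustration (the names $g_1, g_2$ are introduced here only for this example), they follow by writing every element of the cyclic group $G$ as a power of a fixed generator:

```latex
\[
e(g^a, h) = e(g, h)^a = e(g, h^a),
\qquad
e(g_1 g_2, h) = e(g_1, h)\, e(g_2, h),
\qquad
\text{for } g, h, g_1, g_2 \in G,\ a \in \mathbb{Z}.
\]
```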

Bilinear pairings play an important role in the construction of many cryptographic schemes, such as identity-based encryption schemes [17], attribute-based encryption schemes [18], key-agreement protocols [19], and signature schemes [20]. Many schemes based on pairings can be found in the recent survey on functional encryption by Mascia et al. [21].

2.2 Complexity assumptions

We recall some problems that are believed to be hard. The related assumptions will be used in the proof of security of the proposed scheme.

A function $f : \mathbb{N} \to \mathbb{R}$ is called negligible if, for any positive integer $d$, there exists an integer $N_d$ such that $f(k) < \frac{1}{k^d}$ for any $k \geq N_d$.
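As a quick illustration of this definition (the two functions below are chosen here as examples and do not appear in the original text), an exponentially small function is negligible, while an inverse polynomial is not, since it already fails the condition for $d = 4$:

```latex
\[
f(k) = 2^{-k}: \quad \lim_{k \to \infty} k^d\, 2^{-k} = 0 \ \text{ for every } d
\ \Longrightarrow\ 2^{-k} < \frac{1}{k^d} \ \text{ for all } k \geq N_d ;
\qquad
f(k) = \frac{1}{k^3}: \quad \frac{1}{k^3} \geq \frac{1}{k^4} \ \text{ for all } k \geq 1 .
\]
```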

We define the advantage of an algorithm $\mathcal{A}$ that outputs a guess $\beta'$ of a bit $\beta$ as $\mathrm{Adv}_{\mathcal{A}} = \Pr[\beta' = \beta] - \frac{1}{2}$.

Definition 2.1

(CDH) The computational Diffie–Hellman (CDH) problem over a group $G$ of order $p$ is the following. Given a generator $g \in G$ and two elements $g^x, g^y \in G$, for $x, y$ randomly chosen from $\mathbb{Z}_p$, compute the element $g^{xy}$. We say that CDH is intractable (i.e., the CDH assumption holds) if all polynomial-time algorithms have a negligible probability of solving CDH.

Note that, when we are considering an admissible bilinear map, solving the decisional version of the above problem is easy, so we have to adapt it as follows.

Definition 2.2

(DBDH) The decisional bilinear Diffie–Hellman (DBDH) problem over a bilinear pairing $(G, G_T, e)$ of order $p$ is the following. Given a generator $g \in G$ and elements $g^x, g^y, g^z \in G$, where $x$, $y$, and $z$ are randomly chosen from $\mathbb{Z}_p$, distinguish $e(g, g)^{xyz}$ from a random element of $G_T$. We say that DBDH is intractable (i.e., the DBDH assumption holds) if all polynomial-time algorithms have a negligible advantage in solving DBDH.

If we assume that the DBDH is intractable, then the CDH is also intractable.
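The implication can be made explicit with one worked step. The sketch below assumes a hypothetical CDH solver, written $\mathsf{S}$ here purely for illustration: one call to $\mathsf{S}$ followed by one pairing evaluation decides a DBDH instance $(g, g^x, g^y, g^z, Z)$.

```latex
\[
\mathsf{S}(g, g^x, g^y) = g^{xy}
\quad \Longrightarrow \quad
e(g^{xy}, g^z) = e(g, g)^{xyz} \stackrel{?}{=} Z .
\]
```

Hence an efficient CDH solver would yield an efficient DBDH distinguisher, which is the contrapositive of the statement above.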

Definition 2.3

(DLIN) The decisional linear (DLIN) problem over a group $G$ of order $p$ is the following. Given a generator $g \in G$ and the elements $g^x, g^y, g^{xr}, g^{ys} \in G$, for $x$, $y$, $s$, and $r$ randomly chosen from $\mathbb{Z}_p$, distinguish $g^{r+s}$ from a random element of $G$.

As in the study by Huang and Li [13], we consider the modified decisional linear (mDLIN) problem, which is defined as below.

Definition 2.4

(mDLIN) Given a generator $g \in G$ and the elements $g^x, g^y, g^{jx}, g^{ky} \in G$, for $x$, $y$, $j$, and $k$ randomly chosen from $\mathbb{Z}_p$, distinguish $g^{j+k}$ from a random element of $G$. We say that mDLIN is intractable (i.e., the mDLIN assumption holds) if all polynomial-time algorithms have a negligible advantage in solving mDLIN.

Similarly, as before, if we assume that the mDLIN is intractable, then the CDH is also intractable. Indeed, suppose that, given $g^a$ and $g^b$, we are able to compute $g^{ab}$ efficiently. Then, from $g^y$ and $g^{ky}$, we are able to recover $g^k$. We can therefore distinguish $Z = g^{j+k}$ from a random element by computing $Z / g^k$, solving the CDH problem for $Z / g^k$ and $g^x$, and checking whether the result coincides with $g^{jx}$.

3 Preliminaries on public-key SE schemes

In this section, we introduce PEKS and PAEKS schemes and the related security notions.

3.1 PEKS

A PEKS consists of the following (probabilistic) polynomial-time algorithms [1].

  • Setup($\lambda$): given in input a security parameter $\lambda$, the algorithm outputs a global system parameter $\mathrm{Param}$.

  • KeyGen($\mathrm{Param}$): given the system parameter, it outputs a pair of public and secret keys $(\mathrm{pk}, \mathrm{sk})$. The algorithm is run by the data receiver.

  • Encrypt($W$, $\mathrm{pk}$): given a keyword $W$ and the receiver’s public key, it outputs a ciphertext $C_W$ of $W$. The algorithm is run by the data sender.

  • Trapdoor($W$, $\mathrm{sk}$): given a keyword $W$ and the secret key, it outputs a trapdoor $T_W$. The algorithm is run by the data receiver.

  • Test($\mathrm{pk}$, $C_W$, $T_{W'}$): given the receiver’s public key, a ciphertext $C_W$, and a trapdoor $T_{W'}$, it outputs 1 (true) indicating that $C_W$ and $T_{W'}$ contain the same keyword ($W = W'$), and 0 otherwise. The algorithm is run by the cloud server.

The first bilinear pairing-based PEKS scheme was proposed in 2004 [1].

This type of scheme is vulnerable to the inside KGA. Indeed, to recover the keyword contained in a trapdoor $T_W$, an honest-but-curious cloud server could check whether a candidate keyword $W'$ equals the keyword $W$ contained in $T_W$ by computing the ciphertext $C_{W'}$ and running the test algorithm. Since in real applications the keyword space is usually not that big, the server would be able to carry out the KGA in a reasonably short time.

To address this issue, the notion of PAEKS was introduced in 2017 [13].

3.2 PAEKS

A PAEKS scheme consists of the following (probabilistic) polynomial-time algorithms.

  • Setup($\lambda$): given in input a security parameter $\lambda$, the algorithm outputs a global system parameter $\mathrm{Param}$.

  • KeyGen$_S$($\mathrm{Param}$): given the system parameter, the sender’s key pair generation algorithm outputs a pair of public and secret keys $(\mathrm{pk}_S, \mathrm{sk}_S)$ for the sender.

  • KeyGen$_R$($\mathrm{Param}$): given the system parameter, the receiver’s key pair generation algorithm outputs a pair of public and secret keys $(\mathrm{pk}_R, \mathrm{sk}_R)$ for the receiver.

  • Encrypt($W$, $\mathrm{sk}_S$, $\mathrm{pk}_R$): given a keyword $W$, the receiver’s public key, and the sender’s private key, it outputs a ciphertext $C_W$ of $W$. The algorithm is run by the data sender.

  • Trapdoor($W$, $\mathrm{pk}_S$, $\mathrm{sk}_R$): given a keyword $W$, the sender’s public key, and the receiver’s secret key, it outputs a trapdoor $T_W$. The algorithm is run by the data receiver.

  • Test($\mathrm{pk}_S$, $\mathrm{pk}_R$, $C_W$, $T_{W'}$): given the sender’s public key, the receiver’s public key, a ciphertext $C_W$, and a trapdoor $T_{W'}$, it outputs 1 (true) indicating that $C_W$ and $T_{W'}$ contain the same keyword, and 0 otherwise. The algorithm is run by the cloud server.

As explained by Huang and Li [13], the notion of PAEKS prevents a third-party, even the cloud server, from generating a valid ciphertext-keyword. It provides both confidentiality and integrity of the plaintext.

3.3 Security models

Similar to PEKS, security of PAEKS requires that there is no probabilistic polynomial-time adversary that could distinguish trapdoors or ciphertexts. Therefore, a semantic security model for PAEKS includes both CI-security and TI-security (also called trapdoor privacy). Related to the security of PAEKS schemes, we recall the notation presented in the study by Qin et al. [15], adapted to the purposes of our scheme. Suppose that $(\mathrm{pk}_S, \mathrm{sk}_S)$ and $(\mathrm{pk}_R, \mathrm{sk}_R)$ are the key pairs of the attacked data sender and data receiver, respectively. In a multi-user setting, an adversary may have the following two abilities to attack a PAEKS scheme.

3.3.1 Chosen keyword to ciphertext (CKC) attacks

In a CKC attack, the adversary has the ability to obtain a ciphertext for any keyword $W$ of its choice under a receiver’s public key $\overline{\mathrm{pk}}_R$ specified by the adversary. That is, the adversary will obtain the ciphertext $C_W = \mathrm{Encrypt}(W, \mathrm{sk}_S, \overline{\mathrm{pk}}_R)$. Formally, CKC attacks are modelled by giving the adversary $\mathcal{A}$ access to a ciphertext oracle $\mathrm{Encrypt}_{\mathrm{sk}_S}(\cdot, \cdot)$, viewed as a “black box”; the adversary can repeatedly submit any keyword $W$ and a (data receiver’s) public key $\overline{\mathrm{pk}}_R$ of its choice to this oracle, and is given in return a ciphertext $C_W = \mathrm{Encrypt}(W, \mathrm{sk}_S, \overline{\mathrm{pk}}_R)$.

3.3.2 Chosen keyword to trapdoor (CKT) attacks

In a CKT attack, the adversary has the ability to obtain a trapdoor of any keyword $W$ of its choice under a sender’s public key $\overline{\mathrm{pk}}_S$ specified by the adversary. That is, the adversary will obtain the trapdoor $T_W = \mathrm{Trapdoor}(W, \overline{\mathrm{pk}}_S, \mathrm{sk}_R)$. Similar to the previous case, CKT attacks are modelled by giving the adversary $\mathcal{A}$ access to a trapdoor oracle $\mathrm{Trapdoor}_{\mathrm{sk}_R}(\cdot, \cdot)$, viewed as a “black box”; the adversary can repeatedly submit any keyword $W$ and a (data sender’s) public key $\overline{\mathrm{pk}}_S$ of its choice to this oracle and is given in return a trapdoor $T_W = \mathrm{Trapdoor}(W, \overline{\mathrm{pk}}_S, \mathrm{sk}_R)$.

Clearly, the adversary’s access to the aforementioned oracles has to be restricted in some trivial instances. Let $W_0^*$ and $W_1^*$ be the challenge keywords; then, in the CI-security model, the adversary cannot request the trapdoors of the challenge keywords for the target public keys, lest the challenge become trivial. In many PAEKS schemes, such as [13], there are other limitations on the ciphertext oracle. For example, the adversary is not allowed to request the ciphertext corresponding to either $W_0^*$ or $W_1^*$. Qin et al. [15], in their study, considered full CKC attacks, where this limitation is removed.

Similarly, for TI-security, the adversary cannot request the ciphertexts of the challenge keywords. In addition, in this case, many PAEKS schemes (see [13,15]) impose other limitations, such as not allowing the adversary to request trapdoors corresponding to either $W_0^*$ or $W_1^*$. In this article, we consider full CKT attacks, where this limitation is removed.

In the following, we present the formal definitions of the security notions we will use. Note that in a multiuser setting, the adversary can also choose a public key to give as extra input to the oracles, while in the single-user setting, this public key is fixed.

3.3.3 (MU) full TI-security model

We describe the full TI-security game for an adversary $\mathcal{A}$ in the multi-user setting.

  1. Initialization: Given a security parameter $\lambda$, the challenger runs the setup algorithm to generate the global system parameter $\mathrm{Param}$. Then, it runs KeyGen$_S$($\mathrm{Param}$) and KeyGen$_R$($\mathrm{Param}$) to generate the target sender’s key pair $(\mathrm{pk}_S, \mathrm{sk}_S)$ and the target receiver’s key pair $(\mathrm{pk}_R, \mathrm{sk}_R)$, respectively. The challenger invokes the adversary $\mathcal{A}$ on input $(\mathrm{Param}, \mathrm{pk}_S, \mathrm{pk}_R)$.

  2. Phase 1: The adversary is allowed to adaptively issue queries to the following oracles for polynomially many times.

    • Trapdoor oracle $\mathcal{O}_T$: Given a keyword $W$ and a public key $\overline{\mathrm{pk}}_S$, the oracle computes the corresponding trapdoor $T_W$ with respect to $\overline{\mathrm{pk}}_S$ and $\mathrm{sk}_R$, and returns $T_W$ to $\mathcal{A}$.

    • Ciphertext oracle $\mathcal{O}_C$: Given a keyword $W$ and a public key $\overline{\mathrm{pk}}_R$, the oracle computes the corresponding ciphertext $C_W$ with respect to $\mathrm{sk}_S$ and $\overline{\mathrm{pk}}_R$, and returns $C_W$ to $\mathcal{A}$.

  3. Challenge: At some point, $\mathcal{A}$ chooses two keywords $(W_0^*, W_1^*)$ with the restriction that $(W_0^*, \mathrm{pk}_R)$ and $(W_1^*, \mathrm{pk}_R)$ have never been queried to $\mathcal{O}_C$ in Phase 1. These keywords are submitted to the challenger as the challenge keywords. The challenger randomly chooses a bit $\beta \in \{0, 1\}$, computes $T_{W_\beta^*} \leftarrow \mathrm{Trapdoor}(W_\beta^*, \mathrm{pk}_S, \mathrm{sk}_R)$, and returns $T_{W_\beta^*}$ to $\mathcal{A}$.

  4. Phase 2: The adversary continues to issue queries to $\mathcal{O}_T$ and $\mathcal{O}_C$ as above, with the restriction that neither $(W_0^*, \mathrm{pk}_R)$ nor $(W_1^*, \mathrm{pk}_R)$ can be submitted to the ciphertext oracle.

  5. Guess: Finally, $\mathcal{A}$ outputs a bit $\beta' \in \{0, 1\}$. It wins the game if and only if $\beta' = \beta$.

We define $\mathcal{A}$’s advantage in correctly distinguishing the scheme’s trapdoors as $\mathrm{Adv}_{\mathcal{A}}^{T}(\lambda) = \Pr[\beta' = \beta] - \frac{1}{2}$.

Definition 3.1

([MU] Full TI-security) A PAEKS scheme satisfies trapdoor indistinguishability under a full CKT attack and a CKC attack in the multi-user setting if, for all probabilistic polynomial-time adversaries $\mathcal{A}$, the advantage $\mathrm{Adv}_{\mathcal{A}}^{T}(\lambda)$ is negligible in $\lambda$.
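To see why full CKT access rules out deterministic trapdoors, the following Python sketch plays a heavily simplified version of the game: a single sender, no ciphertext oracle, and a keyed hash standing in for a generic deterministic trapdoor. All names are hypothetical and chosen for this illustration; the HMAC-based trapdoor is an assumption of the sketch, not the trapdoor of any cited scheme. Because the trapdoor oracle may be queried on the challenge keywords, the adversary simply replays a query and compares.

```python
import hashlib
import hmac
import secrets

def make_deterministic_trapdoor(key: bytes):
    """Return a trapdoor function mapping equal keywords to equal outputs."""
    def trapdoor(keyword: str) -> bytes:
        return hmac.new(key, keyword.encode(), hashlib.sha256).digest()
    return trapdoor

def replay_adversary(trapdoor_oracle, w0: str, w1: str, challenge: bytes) -> int:
    """Query the oracle on a challenge keyword and compare with the challenge."""
    return 0 if trapdoor_oracle(w0) == challenge else 1

def play_ti_game(trapdoor, adversary, w0: str = "alpha", w1: str = "beta") -> bool:
    """One round of the simplified game; True means the adversary wins."""
    beta = secrets.randbelow(2)               # challenger's hidden bit
    challenge = trapdoor((w0, w1)[beta])      # challenge trapdoor for W_beta
    # Full CKT: the oracle is handed over without restrictions on w0, w1.
    guess = adversary(trapdoor, w0, w1, challenge)
    return guess == beta

trapdoor = make_deterministic_trapdoor(secrets.token_bytes(32))
wins = sum(play_ti_game(trapdoor, replay_adversary) for _ in range(100))
```

In this sketch the adversary wins every round, so its advantage is the maximal value $\frac{1}{2}$; a randomized trapdoor is therefore necessary (though of course not sufficient) for full TI-security.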

3.3.4 Standard CI-security model

We describe the standard CI-security game for an adversary $\mathcal{A}$, in the single-user scenario.

  1. Initialization: Given a security parameter $\lambda$, the challenger generates $\mathrm{Param}$ and prepares $\mathrm{pk}_S$ and $\mathrm{pk}_R$ as in the previous game. It then invokes the adversary $\mathcal{A}$ on input $(\mathrm{Param}, \mathrm{pk}_S, \mathrm{pk}_R)$.

  2. Phase 1: The adversary issues queries to oracles $\mathcal{O}_T$ and $\mathcal{O}_C$ as before, but in the single-user setting (no public key is given in input to the oracles since it is implicitly set to $\mathrm{pk}_S$ and $\mathrm{pk}_R$, respectively).

  3. Challenge: At some point, $\mathcal{A}$ chooses two keywords $(W_0^*, W_1^*)$ for which neither trapdoors nor ciphertexts have been requested, and submits them to the challenger as the challenge keywords. The challenger randomly chooses a bit $\beta \in \{0, 1\}$, computes $C_{W_\beta^*} \leftarrow \mathrm{Encrypt}(W_\beta^*, \mathrm{sk}_S, \mathrm{pk}_R)$, and returns $C_{W_\beta^*}$ to $\mathcal{A}$.

  4. Phase 2: The adversary continues to issue queries to $\mathcal{O}_T$ and $\mathcal{O}_C$ as above, with the restriction that neither $W_0^*$ nor $W_1^*$ can be submitted to either oracle.

  5. Guess: Finally, $\mathcal{A}$ outputs a bit $\beta' \in \{0, 1\}$. It wins the game if and only if $\beta' = \beta$.

We define $\mathcal{A}$’s advantage in correctly distinguishing the ciphertexts as $\mathrm{Adv}_{\mathcal{A}}^{C}(\lambda) = \Pr[\beta' = \beta] - \frac{1}{2}$.

Definition 3.2

(Standard CI-security) A PAEKS scheme satisfies ciphertext indistinguishability under a CKT attack and a CKC attack if, for all probabilistic polynomial-time adversaries $\mathcal{A}$, the advantage $\mathrm{Adv}_{\mathcal{A}}^{C}(\lambda)$ is negligible in $\lambda$.

4 The new scheme

In this section, we describe a new public-key authenticated SE scheme. For the sake of brevity, we condense the two algorithms KeyGen S and KeyGen R into a single step KeyGen.

  • Setup. Given a security parameter $\lambda$, the algorithm constructs two multiplicative groups of prime order $p$, $G$ and $G_T$, a random generator $g$ of the group $G$, and an admissible bilinear map $e : G \times G \to G_T$. Then, it selects two hash functions $H : \{0, 1\}^* \to G$ and $H_2 : G \to \mathbb{Z}_p^*$. The global system parameters are $\mathrm{Param} = (G, G_T, e, p, g, H, H_2)$.

  • KeyGen. Given $\mathrm{Param}$, the algorithm produces the following pair of keys for sender and receiver: $(\mathrm{pk}_S, \mathrm{sk}_S) = (g^a, a)$ and $(\mathrm{pk}_R, \mathrm{sk}_R) = (g^b, b)$, with $a, b \in \mathbb{Z}_p^*$ chosen randomly. From them, the sender and the receiver can construct three common secrets: $h, t \in G$ and $s \in \mathbb{Z}_p^*$. In particular, $h = g^{ab}$, $t = g^{H_2(h)}$, and $s = H_2(t)$.

  • Encrypt. Given a keyword $W \in \{0, 1\}^*$, $\mathrm{pk}_R$, and $\mathrm{sk}_S$, the common secrets $h$, $t$, and $s$ are obtained. The sender selects a random $r \in \mathbb{Z}_p^*$ and outputs the pair:

    \[ C_W = [C_1, C_2] = [\, t \cdot H(\mathrm{pk}_S \| \mathrm{pk}_R \| W)^s \cdot g^r,\ h^r \,]. \]

  • Trapdoor. Given a keyword $W \in \{0, 1\}^*$, $\mathrm{pk}_S$, and $\mathrm{sk}_R$, the common secrets $h$, $t$, and $s$ are obtained. The receiver selects a random $\rho \in \mathbb{Z}_p$ and outputs the tuple:

    \[ T_W = [Q_1, Q_2, Q_3] = [\, e(t \cdot H(\mathrm{pk}_S \| \mathrm{pk}_R \| W)^s,\ h^\rho),\ g^\rho,\ h^\rho \,]. \]

  • Test. Given in input any trapdoor $T_W = [Q_1, Q_2, Q_3]$, corresponding to a keyword $W$, and any ciphertext $C_{W'} = [C_1, C_2]$, corresponding to a keyword $W'$, the test consists in checking the equivalence

    \[ e(C_1, Q_3) = Q_1 \cdot e(C_2, Q_2). \]

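The correctness of the scheme is verified as follows. Since

\[ e(C_1, Q_3) = e(t \cdot H(\mathrm{pk}_S \| \mathrm{pk}_R \| W')^s \cdot g^r,\ h^\rho) = e(t \cdot H(\mathrm{pk}_S \| \mathrm{pk}_R \| W')^s,\ h^\rho) \cdot e(g^r, h^\rho), \]

and

\[ Q_1 \cdot e(C_2, Q_2) = e(t \cdot H(\mathrm{pk}_S \| \mathrm{pk}_R \| W)^s,\ h^\rho) \cdot e(h^r, g^\rho) = e(t \cdot H(\mathrm{pk}_S \| \mathrm{pk}_R \| W)^s,\ h^\rho) \cdot e(g^r, h^\rho), \]

if the keywords $W$ and $W'$ are the same, then $H(\mathrm{pk}_S \| \mathrm{pk}_R \| W) = H(\mathrm{pk}_S \| \mathrm{pk}_R \| W')$ and the equivalence is satisfied. If the keywords are different ($W \neq W'$), then $H(\mathrm{pk}_S \| \mathrm{pk}_R \| W) \neq H(\mathrm{pk}_S \| \mathrm{pk}_R \| W')$ due to the collision resistance of the hash function $H$, and thus $Q_1 \cdot e(C_2, Q_2) \neq e(t \cdot H(\mathrm{pk}_S \| \mathrm{pk}_R \| W')^s \cdot g^r,\ h^\rho) = e(C_1, Q_3)$.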

Remark 1

The aim of the KeyGen algorithm is to generate three secrets only known by the sender and the receiver. Note that there are also other options in order to construct these secrets.

4.1 On the combination of keywords

Suppose that we want to allow the search also for combinations of keywords. That is, given two distinct keywords, we want to allow the search of documents containing the first keyword, the second keyword, or both keywords.

A trivial way to do this is, for the sender, to generate a ciphertext for each keyword characterizing the document. The receiver, who wants to find a document containing $n$ different keywords $W_1, \ldots, W_n$, will have to verify that, among these ciphertexts, the test algorithm is satisfied at least once for all trapdoors $T_{W_1}, \ldots, T_{W_n}$.

Another possible solution is to generate a ciphertext and a trapdoor for the combination of keywords. This allows to generate only one trapdoor and so to reduce the number of tests needed.

For example, consider the case of two keywords. A document is characterized by the keywords $W_1$ and $W_2$. The sender generates the following ciphertexts:

\[ C_{W_1} = [\, t \cdot H(\mathrm{pk}_S \| \mathrm{pk}_R \| W_1)^s \cdot g^{r_1},\ h^{r_1} \,], \quad C_{W_2} = [\, t \cdot H(\mathrm{pk}_S \| \mathrm{pk}_R \| W_2)^s \cdot g^{r_2},\ h^{r_2} \,], \quad C_{W_1 \& W_2} = [\, t \cdot H(\mathrm{pk}_S \| \mathrm{pk}_R \| W_1)^s \cdot H(\mathrm{pk}_S \| \mathrm{pk}_R \| W_2)^s \cdot g^{r_3},\ h^{r_3} \,]. \]

So, if the receiver wants to search for both $W_1$ and $W_2$, a solution would be to create the trapdoor:

\[ T_{W_1 \& W_2} = [\, e(t \cdot H(\mathrm{pk}_S \| \mathrm{pk}_R \| W_1)^s \cdot H(\mathrm{pk}_S \| \mathrm{pk}_R \| W_2)^s,\ h^\rho),\ g^\rho,\ h^\rho \,]. \]

Another solution is to encrypt the combination of $W_1$ and $W_2$ as follows:

\[ [\, t \cdot H(\mathrm{pk}_S \| \mathrm{pk}_R \| W_1 \| W_2)^s \cdot g^{r_3},\ h^{r_3} \,], \]

and then create the trapdoor for $W_1 \| W_2$. However, in this way, the sender and the receiver must first agree on a keyword ordering (e.g. lexicographic), so that the same concatenation is always used (since using $W_1 \| W_2$ instead of $W_2 \| W_1$ would change the output of the hash). Moreover, the sender also needs to compute $H(\mathrm{pk}_S \| \mathrm{pk}_R \| W_1 \| W_2)^s$, while, for the previous solution, the sender has already computed $H(\mathrm{pk}_S \| \mathrm{pk}_R \| W_1)^s$ and $H(\mathrm{pk}_S \| \mathrm{pk}_R \| W_2)^s$, so it does not need to perform another hashing and exponentiation.

As a final remark, let us note that, if we do not use the shared secret $t$ in the encryption and in the trapdoor, then the server is able to generate valid ciphertexts corresponding to combinations of keywords. Indeed, given two ciphertexts:

\[ C_{W_1} = [\, H(\mathrm{pk}_S \| \mathrm{pk}_R \| W_1)^s \cdot g^{r_1},\ h^{r_1} \,], \quad C_{W_2} = [\, H(\mathrm{pk}_S \| \mathrm{pk}_R \| W_2)^s \cdot g^{r_2},\ h^{r_2} \,], \]

the server, just by multiplying them entry-by-entry, will obtain a valid ciphertext for the combination of the keywords. That is,

\[ C_{W_1} * C_{W_2} = [\, H(\mathrm{pk}_S \| \mathrm{pk}_R \| W_1)^s \cdot g^{r_1},\ h^{r_1} \,] * [\, H(\mathrm{pk}_S \| \mathrm{pk}_R \| W_2)^s \cdot g^{r_2},\ h^{r_2} \,] = [\, H(\mathrm{pk}_S \| \mathrm{pk}_R \| W_1)^s \cdot H(\mathrm{pk}_S \| \mathrm{pk}_R \| W_2)^s \cdot g^{r_1 + r_2},\ h^{r_1 + r_2} \,] = C_{W_1 \& W_2}. \]
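For comparison, the same entry-by-entry product can be worked out when the secret $t$ is present. The following is a sketch using only the definitions of Section 4, with the abbreviation $H_i = H(\mathrm{pk}_S \| \mathrm{pk}_R \| W_i)$ introduced here for readability:

```latex
\[
C_{W_1} * C_{W_2}
= [\, t^2 \cdot H_1^s \cdot H_2^s \cdot g^{r_1 + r_2},\ h^{r_1 + r_2} \,],
\qquad
e(t^2 H_1^s H_2^s g^{r_1 + r_2},\ h^\rho)
= e(t, h^\rho) \cdot \underbrace{e(t H_1^s H_2^s,\ h^\rho)}_{Q_1} \cdot \underbrace{e(h^{r_1 + r_2}, g^\rho)}_{e(C_2, Q_2)} .
\]
```

The test against $T_{W_1 \& W_2}$ would therefore succeed only if $e(t, h^\rho) = 1$. Since $t = g^{H_2(h)}$ with $H_2(h) \in \mathbb{Z}_p^*$ and $h = g^{ab}$ with $a, b \in \mathbb{Z}_p^*$, this cannot happen whenever $\rho \neq 0$, so the extra factor $t$ blocks the combination by the server.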

5 Security proofs

In this section, we prove that our PAEKS scheme is CI-secure and (MU) fully TI-secure. Note that, if we exchange the role of the Encrypt and Trapdoor algorithms, then we obtain a scheme TI-secure and (MU) fully CI-secure.

5.1 Trapdoor indistinguishability

Before stating the security result, recall that if we assume that the DBDH problem over $(G, G_T, e)$ is intractable, then the CDH problem over $G$ is also intractable (see Definition 2.2).

Theorem 5.1

Under the DBDH assumption, our PAEKS scheme is (MU) fully TI-secure in a random oracle model.

Proof

To prove the theorem, we show that, for the proposed scheme, winning the related game with a non-negligible advantage implies solving the DBDH problem with a non-negligible advantage.

Assume that there is a probabilistic polynomial-time adversary $\mathcal{A}$ that breaks the trapdoor privacy of our scheme with a non-negligible advantage $\varepsilon_T$. We show in the following that there exists an algorithm $\mathcal{B}$ that is able to solve the DBDH problem with a non-negligible advantage.

Consider an instance of the DBDH problem, $(G, G_T, e, p, g, g^x, g^y, g^z, Z)$, where $x, y, z \in \mathbb{Z}_p$ are chosen randomly, and $Z$ is either a random element of $G_T$ or equal to $e(g, g)^{xyz}$. Let $\beta$ be a bit such that $\beta = 0$ if $Z = e(g, g)^{xyz}$ and $\beta = 1$ otherwise. The goal of the simulator $\mathcal{B}$ is to guess the bit $\beta$, and it does so by simulating the (MU) full TI-security game for $\mathcal{A}$ as follows.

5.1.1 Initialization

The simulator $\mathcal{B}$ controls the random oracles that define the hash functions $H$ and $H_2$ and sets the system parameters to $\mathrm{Param} = (G, G_T, e, p, g, H, H_2)$. Then, it selects two random values $a, b \in \mathbb{Z}_p$ and sets $\mathrm{pk}_S = g^a$, $\mathrm{pk}_R = g^b$, so $\mathrm{sk}_S = a$ and $\mathrm{sk}_R = b$, and therefore $h = g^{ab}$. Moreover, it selects another random value $u \in \mathbb{Z}_p$ and sets $t = g^u$. This implies that $H_2(g^{ab}) = u$. The last secret $s$ is implicitly set to be equal to $x$, so that $H_2(g^u) = x$. To simplify the notation, we set $v = ab$. Then, $\mathcal{B}$ calls $\mathcal{A}$ on input $(\mathrm{Param}, \mathrm{pk}_S, \mathrm{pk}_R)$.

5.1.2 Phase 1

In this phase, four oracles are involved. The oracle for $H_2$ only requires an element in $G$ as input. The oracle for $H$ requires in input a tuple in $G \times G \times \{0, 1\}^*$. The oracles for encryption and trapdoor require in input a pair in $G \times \{0, 1\}^*$. The number of queries for the different oracles is limited, specifically to at most $q_{H_2}$, $q_H$, $q_T$, and $q_C$ queries for the oracles $\mathcal{O}_{H_2}$, $\mathcal{O}_H$, $\mathcal{O}_T$, and $\mathcal{O}_C$, respectively. To simplify the description of the game, we assume that the adversary would not issue a pair $(g_i, W_i)$ to $\mathcal{O}_T$ with $g_i \neq g^a$ (or to $\mathcal{O}_C$ with $g_i \neq g^b$) before issuing the following queries: $(g_i, \mathrm{pk}_R, W_i)$ (or $(\mathrm{pk}_S, g_i, W_i)$) to $\mathcal{O}_H$; $g_i^b$ (or $g_i^a$) to $\mathcal{O}_{H_2}$; and $g^\alpha$ to $\mathcal{O}_{H_2}$, where $\alpha$ is the output of the previous query. To the hash oracles, we associate two lists $L_{H_2}$ and $L_H$ (initially empty) collecting the outputs of the hashes. The mentioned oracles operate as follows:

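  • Hash oracle $\mathcal{O}_{H_2}$. Given an element $g_i \in G$, if there is an element in $L_{H_2}$ of the form $\langle g_i, n_i \rangle$, the simulator $\mathcal{B}$ returns $n_i$. If $g_i = g^u$, then $\mathcal{B}$ aborts and outputs a random bit $\beta'$ as its guess of $\beta$. If $g_i = g^{ab}$, then $\mathcal{B}$ sets $n_i = u$. Otherwise, it selects a random $n_i \in \mathbb{Z}_p$. The pair $\langle g_i, n_i \rangle$ is added to the list $L_{H_2}$. $\mathcal{B}$ returns $H_2(g_i) = n_i$ as the hash value of $g_i$ to $\mathcal{A}$.

  • Hash oracle $\mathcal{O}_H$. Given a tuple $(\overline{\mathrm{pk}}_S, \overline{\mathrm{pk}}_R, W_i)$, the simulator $\mathcal{B}$ returns $h_i$ if there is a tuple in $L_H$ of the form $\langle (\overline{\mathrm{pk}}_S, \overline{\mathrm{pk}}_R, W_i), h_i, a_i, c_i \rangle$. Otherwise, $\mathcal{B}$ selects a random $a_i \in \mathbb{Z}_p$ and a biased bit $c_i \in \{0, 1\}$ such that $\Pr[c_i = 0] = \delta$. Then, $\mathcal{B}$ sets $h_i = g^z g^{a_i}$ if $c_i = 0$, and $h_i = g^{a_i}$ otherwise. The tuple $\langle (\overline{\mathrm{pk}}_S, \overline{\mathrm{pk}}_R, W_i), h_i, a_i, c_i \rangle$ is added to the list $L_H$. Finally, $\mathcal{B}$ returns to $\mathcal{A}$ the value $H(\overline{\mathrm{pk}}_S \| \overline{\mathrm{pk}}_R \| W_i) = h_i$ as the hash value of $W_i$.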

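  • Trapdoor oracle $\mathcal{O}_T$. Given a pair $(\overline{\mathrm{pk}}_S, W_i)$, the simulator $\mathcal{B}$ selects a random $\rho_i \in \mathbb{Z}_p$ and retrieves from $L_H$ the tuple $\langle (\overline{\mathrm{pk}}_S, \mathrm{pk}_R, W_i), h_i, a_i, c_i \rangle$. We distinguish two cases:

    – If $\overline{\mathrm{pk}}_S = g^a$, then $\mathcal{B}$ computes the trapdoor $T_i$ as follows. If $c_i = 1$, then $\mathcal{B}$ sets $T_i = [\, e(g^u (g^x)^{a_i},\ g^{v \rho_i}),\ g^{\rho_i},\ g^{v \rho_i} \,]$. If $c_i = 0$, then $\mathcal{B}$ sets $T_i = [\, e(g^u (g^x)^{a_i},\ g^{v \rho_i}) \cdot e(g^z, (g^x)^{v \rho_i}),\ g^{\rho_i},\ g^{v \rho_i} \,]$. Note that $T_i$ is a well-distributed trapdoor: in fact, in the first case, we have $H(\mathrm{pk}_S \| \mathrm{pk}_R \| W_i) = g^{a_i}$, and in the second case, $H(\mathrm{pk}_S \| \mathrm{pk}_R \| W_i) = g^{a_i} g^z$.

    – Otherwise, if $\overline{\mathrm{pk}}_S \neq \mathrm{pk}_S$, then $\mathcal{B}$ sets $\bar{h} = (\overline{\mathrm{pk}}_S)^b$ and retrieves $\langle \bar{h}, n_i \rangle$ from $L_{H_2}$. It sets $\bar{t} = g^{n_i}$, then it retrieves $\langle \bar{t}, \bar{s} \rangle$ from $L_{H_2}$. Then, it returns the trapdoor $T_i = [\, e(\bar{t}\, h_i^{\bar{s}},\ \bar{h}^{\rho_i}),\ g^{\rho_i},\ \bar{h}^{\rho_i} \,]$.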

  • Ciphertext oracle $\mathcal{O}_C$. Given a pair $(\overline{\mathrm{pk}}_R, W_i)$, the simulator $\mathcal{B}$ retrieves from $L_H$ the tuple $\langle (\mathrm{pk}_S, \overline{\mathrm{pk}}_R, W_i), h_i, a_i, c_i \rangle$ and selects a random $r_i \in \mathbb{Z}_p$. We distinguish two cases:

    – If $\overline{\mathrm{pk}}_R \neq \mathrm{pk}_R$, $\mathcal{B}$ sets $\bar{h} = (\overline{\mathrm{pk}}_R)^a$ and retrieves $\langle \bar{h}, n_i \rangle$ from $L_{H_2}$. Then, it sets $\bar{t} = g^{n_i}$ and retrieves $\langle \bar{t}, \bar{s} \rangle$ from $L_{H_2}$. Finally, $\mathcal{B}$ returns the ciphertext $C_i = [C_{i,1}, C_{i,2}] = [\, \bar{t}\, h_i^{\bar{s}}\, g^{r_i},\ \bar{h}^{r_i} \,]$.

    – If $\overline{\mathrm{pk}}_R = g^b$, then $\mathcal{B}$ checks the value of $c_i$ retrieved from $L_H$. If $c_i = 0$, then it aborts and outputs a random bit $\beta'$ as its guess of $\beta$. Otherwise, it returns the ciphertext $C_i = [C_{i,1}, C_{i,2}] = [\, g^{u + r_i} (g^x)^{a_i},\ g^{v r_i} \,]$. Note that $C_i$ is a well-distributed ciphertext.
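As a worked check of the two "well-distributed" claims for the target keys, the simulated outputs can be compared with the real algorithms of Section 4. The derivation below is a sketch that uses only the values fixed in the Initialization ($t = g^u$, $s = x$, $h = g^v$) and the programmed hash values; note that the simulator never needs $x$ or $z$ themselves, only $g^x$ and $g^z$.

```latex
\[
\text{Trapdoor, } c_i = 0:\quad
e\big(t \cdot (g^{a_i} g^z)^{s},\ h^{\rho_i}\big)
= e\big(g^u (g^x)^{a_i} \cdot g^{xz},\ g^{v \rho_i}\big)
= e\big(g^u (g^x)^{a_i},\ g^{v \rho_i}\big) \cdot e\big(g^z,\ (g^x)^{v \rho_i}\big),
\]
\[
\text{Ciphertext, } c_i = 1:\quad
\big[\, t \cdot (g^{a_i})^{s} \cdot g^{r_i},\ h^{r_i} \,\big]
= \big[\, g^{u + r_i} (g^x)^{a_i},\ g^{v r_i} \,\big].
\]
```

For a ciphertext query with $c_i = 0$, the first component would contain $g^{xz}$, which the simulator cannot compute from $g^x$ and $g^z$; this is why it aborts in that case.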

5.1.3 Challenge

The adversary $\mathcal{A}$ submits two keywords $W_0^*$ and $W_1^*$, and we assume that $(\mathrm{pk}_S, \mathrm{pk}_R, W_0^*)$ and $(\mathrm{pk}_S, \mathrm{pk}_R, W_1^*)$ have been queried to $\mathcal{O}_H$, but $(\mathrm{pk}_R, W_0^*)$ and $(\mathrm{pk}_R, W_1^*)$ have not been queried to $\mathcal{O}_C$. The simulator $\mathcal{B}$ retrieves from $L_H$ the tuples $\langle (\mathrm{pk}_S, \mathrm{pk}_R, W_0^*), h_0^*, a_0^*, c_0^* \rangle$ and $\langle (\mathrm{pk}_S, \mathrm{pk}_R, W_1^*), h_1^*, a_1^*, c_1^* \rangle$. If $c_0^* = c_1^* = 1$, then it aborts and outputs a random bit $\beta'$ as a guess of $\beta$. If $c_0^* = c_1^* = 0$, then let $\gamma$ be a bit selected at random. Otherwise, let $\gamma$ be the bit such that $c_\gamma^* = 0$. Note that $\gamma$ is uniformly distributed in $\{0, 1\}$. $\mathcal{B}$ then computes the trapdoor

\[ T^* = [Q_1^*, Q_2^*, Q_3^*] = \big[\, \big( Z \cdot e(g^x, g^y)^{a_\gamma^*} \cdot e(g, g^y)^u \big)^v,\ g^y,\ (g^y)^v \,\big]. \]

Note that, if $Z = e(g, g)^{xyz}$, then

\[ Q_1^* = \big( Z \cdot e(g^x, g^y)^{a_\gamma^*} \cdot e(g, g^y)^u \big)^v = \big( e(g^{xz}, g^y) \cdot e(g^{x a_\gamma^*}, g^y) \cdot e(g^u, g^y) \big)^v = e\big(g^u g^{x(z + a_\gamma^*)},\ g^{vy}\big) = e\big(t \cdot H(\mathrm{pk}_S \| \mathrm{pk}_R \| W_\gamma^*)^s,\ h^y\big). \]

Therefore, $T^*$ is a correct trapdoor. In this case, the random value chosen for the trapdoor corresponds to $y$. Otherwise, the first entry $Q_1^*$ is a random element of $G_T$. The tuple $T^*$ is returned to the adversary.

5.1.4 Phase 2

$\mathcal{A}$ continues issuing queries to the oracles, with the restriction that it cannot issue $(\mathrm{pk}_R, W_0^*)$ and $(\mathrm{pk}_R, W_1^*)$ to $\mathcal{O}_C$. In this phase, $\mathcal{B}$ sets $c_i = 1$ for all new queries to $\mathcal{O}_H$.

5.1.5 Guess

Finally, $\mathcal{A}$ outputs a bit $\gamma'$. If $\gamma' = \gamma$, then $\mathcal{B}$ outputs $\beta' = 0$, otherwise $\beta' = 1$.

We now analyse the success probability of $\mathcal{B}$. Denote by $\mathrm{abt}$ the event that $\mathcal{B}$ aborts during the game, which is divided into three events.

  • $\mathrm{abt}_0$: if $g_i = g^{u}$ in the simulation of $\mathcal{O}_{H_2}$. Since $u$ was selected randomly over $\mathbb{Z}_p$, determining $g^{u}$ is either a random guess or, given that $H_2(g^{ab}) = u$, corresponds to solving the CDH problem. Therefore, under some limitations on the number of queries $q_{H_2}$, we have that $\Pr[\mathrm{abt}_0]$ is negligible.

  • $\mathrm{abt}_1$: if $c_i = 0$ and $\overline{\mathrm{pk}}_R = g^{b}$ in the simulation of $\mathcal{O}_C$. We can overestimate this probability by assuming that $\overline{\mathrm{pk}}_R = g^{b}$ in every call to $\mathcal{O}_C$. Each $c_i$ is selected randomly and independently, therefore a lower bound to the probability that $\mathrm{abt}_1$ does not happen is $\Pr[\overline{\mathrm{abt}_1}] \geq (1 - \delta)^{q_C}$.

  • $\mathrm{abt}_2$: if $c_0^* = c_1^* = 1$ in the generation of the challenge trapdoor. Therefore, $\Pr[\overline{\mathrm{abt}_2}] = 1 - (1 - \delta)^{2}$.

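So, the probability that $\mathcal{B}$ does not abort in the game can be bounded as follows:

$$\Pr[\overline{\mathrm{abt}}] = \Pr[\overline{\mathrm{abt}_0}] \cdot \Pr[\overline{\mathrm{abt}_1}] \cdot \Pr[\overline{\mathrm{abt}_2}] \geq \Pr[\overline{\mathrm{abt}_0}] \cdot (1 - \delta)^{q_C} \left( 1 - (1 - \delta)^{2} \right).$$

If we set $\delta = 1 - \sqrt{\frac{q_C}{q_C + 2}}$, then the bound takes its maximal value

$$\Pr[\overline{\mathrm{abt}_0}] \cdot \left( \frac{q_C}{q_C + 2} \right)^{\frac{q_C}{2}} \cdot \frac{2}{q_C + 2},$$

which is approximately $\Pr[\overline{\mathrm{abt}_0}] \cdot \frac{2}{q_C\, e}$ and thus non-negligible.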
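The choice of $\delta$ and the resulting approximation can be checked numerically. The sketch below evaluates only the factor $(1-\delta)^{q_C}(1-(1-\delta)^2)$, leaving $\Pr[\overline{\mathrm{abt}_0}]$ aside; the function names and the value $q_C = 1000$ are hypothetical choices for illustration.

```python
import math


def non_abort_factor(delta: float, q_c: int) -> float:
    """(1 - delta)^q_C * (1 - (1 - delta)^2): the bound without Pr[not abt_0]."""
    keep = 1.0 - delta
    return keep**q_c * (1.0 - keep**2)


def optimal_delta(q_c: int) -> float:
    """Maximizer delta = 1 - sqrt(q_C / (q_C + 2)) stated in the proof."""
    return 1.0 - math.sqrt(q_c / (q_c + 2))


def closed_form_maximum(q_c: int) -> float:
    """(q_C / (q_C + 2))^(q_C / 2) * 2 / (q_C + 2)."""
    return (q_c / (q_c + 2)) ** (q_c / 2) * 2 / (q_c + 2)


q_c = 1000  # hypothetical number of ciphertext queries
best = optimal_delta(q_c)

# Brute-force search over a grid of delta values, for comparison.
grid_best = max(non_abort_factor(d / 10**5, q_c) for d in range(1, 10**5))

# Ratio between the exact maximum and the approximation 2 / (q_C * e).
ratio = closed_form_maximum(q_c) / (2 / (q_c * math.e))
```

The maximizer follows from setting the derivative of $x^{q_C}(1 - x^2)$ to zero at $x = 1 - \delta$, which gives $x^2 = q_C/(q_C+2)$. The bound decreases only linearly in the number of queries, which is why it remains non-negligible for polynomially many queries.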

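We have seen that, if $\beta = 0$ (i.e., $Z = e(g, g)^{xyz}$) and $\mathcal{B}$ does not abort, then the view of $\mathcal{A}$ is identically distributed as in a real attack. In this case, if $\mathcal{A}$ succeeds in breaking the trapdoor privacy of our scheme, then $\mathcal{B}$ succeeds in solving the DBDH problem instance. Note also that, if $\beta = 1$, then $\mathcal{A}$ acts on random inputs, so it effectively outputs a random guess, and thus the probability of guessing correctly is $\tfrac{1}{2}$. Therefore, the probability of guessing the bit $\beta$ (and thus solving the DBDH problem) is:

$$
\begin{aligned}
\Pr[\beta' = \beta] &= \Pr[\beta' = \beta \mid \beta = 0] \Pr[\beta = 0] + \Pr[\beta' = \beta \mid \beta = 1] \Pr[\beta = 1] \\
&= \tfrac{1}{2} \left( \Pr[\beta' = \beta \mid \beta = 0] + \Pr[\beta' = \beta \mid \beta = 1] \right) \\
&= \tfrac{1}{2} \left( \Pr[\beta' = \beta \mid \beta = 0] + \tfrac{1}{2} \right) \\
&= \tfrac{1}{2} \left( \Pr[\beta' = \beta \mid \beta = 0 \wedge \mathrm{abt}] \Pr[\mathrm{abt}] + \Pr[\beta' = \beta \mid \beta = 0 \wedge \overline{\mathrm{abt}}] \Pr[\overline{\mathrm{abt}}] + \tfrac{1}{2} \right) \\
&= \tfrac{1}{2} \left( \tfrac{1}{2} \left( 1 - \Pr[\overline{\mathrm{abt}}] \right) + \left( \varepsilon_T + \tfrac{1}{2} \right) \Pr[\overline{\mathrm{abt}}] + \tfrac{1}{2} \right) \\
&= \tfrac{1}{2}\, \varepsilon_T \Pr[\overline{\mathrm{abt}}] + \tfrac{1}{2}.
\end{aligned}
$$

Thus, if the advantage $\varepsilon_T$ of the adversary $\mathcal{A}$ is non-negligible, then the advantage $\Pr[\beta' = \beta] - \tfrac{1}{2}$ of $\mathcal{B}$ is also non-negligible.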
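The case analysis for $\Pr[\beta' = \beta]$ can be replayed with exact rational arithmetic. This is a minimal sketch: the function name and the sample values for the adversary's advantage and for $\Pr[\overline{\mathrm{abt}}]$ are hypothetical and serve only to confirm the algebra.

```python
from fractions import Fraction


def success_probability(eps: Fraction, p_no_abort: Fraction) -> Fraction:
    """Pr[beta' = beta], computed case by case as in the derivation above."""
    half = Fraction(1, 2)
    # beta = 0: on abort B outputs a random bit; otherwise B is right exactly
    # when A guesses gamma, which happens with probability eps + 1/2.
    given_beta0 = half * (1 - p_no_abort) + (eps + half) * p_no_abort
    # beta = 1: A sees random inputs, so B is right with probability 1/2.
    given_beta1 = half
    return half * given_beta0 + half * given_beta1


eps_t = Fraction(1, 10)          # hypothetical adversary advantage
p_no_abort = Fraction(3, 1000)   # hypothetical Pr[not abt]
result = success_probability(eps_t, p_no_abort)
advantage = result - Fraction(1, 2)
```

The advantage of the simulator is the adversary's advantage scaled by half the non-abort probability, so it stays non-negligible whenever both factors are.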

5.2 Ciphertext indistinguishability

Recall that, if we assume that the mDLIN problem over G is intractable, then the CDH problem over G is also intractable (see Definition 2.4).

Theorem 5.2

Under the mDLIN assumption, our PAEKS scheme has ciphertext indistinguishability under CKC and CKT attacks in the random oracle model.

Proof

To prove the theorem, we show that winning the related game with a non-negligible advantage implies solving the mDLIN problem with a non-negligible advantage. Assume that there is a probabilistic polynomial time adversary $\mathcal{A}$ that breaks the ciphertext indistinguishability of our scheme with a non-negligible advantage $\varepsilon_C$. We want to show that we can build an algorithm $\mathcal{B}$ that is able to solve the mDLIN problem with a non-negligible advantage.

Consider an instance of the mDLIN problem $(G, G_T, e, p, g, g^{x}, g^{y}, g^{jx}, g^{k/y}, Z)$, where $x$, $y$, $j$, and $k$ are randomly chosen from $\mathbb{Z}_p$, and $Z$ is either a random element of $G$ or $Z = g^{j + k}$. Let $\beta$ be a bit such that $\beta = 0$ if $Z = g^{j + k}$, and $\beta = 1$ otherwise. The goal of $\mathcal{B}$ is to guess the bit $\beta$, and it does so by simulating the CI-security game for $\mathcal{A}$ as follows.

5.2.1 Initialization

$\mathcal{B}$ controls the random oracles that define the hash functions $H$ and $H_2$ and sets up the parameters as $\mathrm{Param} = (G, G_T, e, p, g, H, H_2)$. Then, $\mathcal{B}$ selects a random value $a \in \mathbb{Z}_p$, and it sets $\mathrm{pk}_R = g^{a}$ and $\mathrm{pk}_S = g^{x/a}$, so $\mathrm{sk}_R = a$ and $\mathrm{sk}_S = x/a$. Therefore, the common secret $h$ corresponds to $g^{x}$. Moreover, $\mathcal{B}$ selects another random value $u \in \mathbb{Z}_p$ and sets $t = g^{u}$. The last common secret is set as $s = y$. Therefore, we have $H_2(g^{x}) = u$ and $H_2(g^{u}) = y$. Since we are in a single-user setting, we simplify the notation by writing $H(W)$ instead of $H(\mathrm{pk}_S \,\Vert\, \mathrm{pk}_R \,\Vert\, W)$. Then, $\mathcal{B}$ calls $\mathcal{A}$ on input $(\mathrm{Param}, \mathrm{pk}_S, \mathrm{pk}_R)$.

5.2.2 Phase 1

$\mathcal{B}$ answers the adversary's queries with the same oracles and with the same assumptions considered in the TI case, but in a single-user setting. We describe the differences with the oracles of the previous game.

  1. The hash oracle $\mathcal{O}_{H_2}$ aborts if it is called on $g^{u}$.

  2. The action of the hash oracle $\mathcal{O}_H$ is as follows. Given a keyword $W_i$, it selects a random $a_i \in \mathbb{Z}_p$ and a biased $c_i \in \{0, 1\}$ such that $\Pr[c_i = 0] = \delta$.

  3. It sets $h_i = g^{k/y} g^{a_i}$ if $c_i = 0$, and $h_i = g^{a_i}$ otherwise. The tuple $\langle W_i, h_i, a_i, c_i \rangle$ is added to the list $L_H$ (initially empty). It returns $H(W_i) = h_i$ as the hash value of $W_i$ to $\mathcal{A}$.

  4. For the trapdoor oracle $\mathcal{O}_T$, given in input a keyword $W_i$, $\mathcal{B}$ retrieves the tuple $\langle W_i, h_i, a_i, c_i \rangle$ from $L_H$. If $c_i = 0$, it aborts and outputs a random bit $\beta'$ as a guess of $\beta$. In the other case (where $H(W_i) = g^{a_i}$), it selects a random $\rho_i$ and outputs

    $$T_i = \left[ e\!\left( g^{u} (g^{y})^{a_i}, (g^{x})^{\rho_i} \right),\ g^{\rho_i},\ (g^{x})^{\rho_i} \right].$$

  5. Similarly, for the ciphertext oracle $\mathcal{O}_C$, given in input a keyword $W_i$, $\mathcal{B}$ retrieves the tuple $\langle W_i, h_i, a_i, c_i \rangle$ from $L_H$. If $c_i = 0$, it aborts and outputs a random bit $\beta'$ as a guess of $\beta$. Otherwise, $\mathcal{B}$ selects a random $r_i$ and outputs

    $$C_i = \left[ g^{u} (g^{y})^{a_i} g^{r_i},\ (g^{x})^{r_i} \right].$$

Note that both the ciphertext and trapdoor oracles’ answers are correctly distributed.
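The way the simulator programs the hash oracle with a biased coin can be sketched as a lazily sampled table. This is an illustrative toy, not the authors' implementation: group elements are again represented by exponents modulo a prime, and the class name `ProgrammedHashOracle`, its fields, and the sample parameters are all assumptions.

```python
import random


class ProgrammedHashOracle:
    """Lazy-sampled O_H with a biased coin c_i, Pr[c_i = 0] = delta.

    Toy model (assumption): a group element g^e is stored as its exponent e mod p.
    """

    def __init__(self, p: int, delta: float, embed_exponent: int, rng: random.Random):
        self.p = p
        self.delta = delta
        self.embed_exponent = embed_exponent  # exponent of g^(k/y) from the instance
        self.rng = rng
        self.table = {}          # plays the role of the list L_H
        self.force_one = False   # set to True in Phase 2, where c_i = 1 always

    def query(self, keyword: str):
        """Return (h_i, a_i, c_i) for the keyword; repeated queries are consistent."""
        if keyword not in self.table:
            a_i = self.rng.randrange(self.p)
            c_i = 1 if self.force_one or self.rng.random() >= self.delta else 0
            # c_i = 0 embeds the problem instance: h_i = g^(k/y) * g^(a_i).
            h_i = (a_i + self.embed_exponent) % self.p if c_i == 0 else a_i
            self.table[keyword] = (h_i, a_i, c_i)
        return self.table[keyword]

    def can_answer(self, keyword: str) -> bool:
        """O_T and O_C only answer when c_i = 1; otherwise the simulator aborts."""
        return self.query(keyword)[2] == 1


oracle = ProgrammedHashOracle(p=2**61 - 1, delta=0.25, embed_exponent=12345,
                              rng=random.Random(0))
entries = [oracle.query(f"w{i}") for i in range(2000)]
zero_fraction = sum(1 for _, _, c in entries if c == 0) / len(entries)
```

Keywords with $c_i = 1$ have a hash value whose discrete logarithm the simulator knows, so trapdoors and ciphertexts for them can be produced from $g^{x}$ and $g^{y}$ alone. Keywords with $c_i = 0$ carry the instance element and are only usable as challenge keywords.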

5.2.3 Challenge

When the adversary submits two keywords $W_0^*$ and $W_1^*$, queried to $\mathcal{O}_H$ but not to $\mathcal{O}_T$ or $\mathcal{O}_C$, $\mathcal{B}$ retrieves from $L_H$ the tuples $\langle W_0^*, h_0^*, a_0^*, c_0^* \rangle$ and $\langle W_1^*, h_1^*, a_1^*, c_1^* \rangle$. If $c_0^* = c_1^* = 1$, then it aborts and outputs a random bit $\beta'$ as a guess of $\beta$. If exactly one of $c_0^*$ and $c_1^*$ is $0$, it lets $\gamma$ be the bit such that $c_\gamma^* = 0$, so $h_\gamma^* = g^{k/y} g^{a_\gamma^*}$. If $c_0^* = c_1^* = 0$, then $\gamma$ is selected at random. Note that, as in the previous game, $\gamma$ is uniformly distributed. $\mathcal{B}$ computes the ciphertext

$$C^* = [C_1^*, C_2^*] = \left[ Z\, g^{u} (g^{y})^{a_\gamma^*},\ g^{xj} \right].$$

If $Z = g^{j + k}$, then we have $C_1^* = g^{j + k} g^{u} (g^{y})^{a_\gamma^*} = g^{u} g^{y (a_\gamma^* + k/y)} g^{j} = t\, H(W_\gamma^*)^{s} g^{j}$, and $C_2^* = h^{j}$. In this case, the random element corresponds to $j$ and $C^*$ is a proper ciphertext. If $Z$ is a random element of $G$, then so is $C_1^*$. The tuple $C^*$ is returned to the adversary.
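The well-formedness of the challenge ciphertext in the case $Z = g^{j+k}$ can be verified in the exponent. As a sketch under stated assumptions, group elements are modelled by their exponents modulo a toy prime, and all variable names are illustrative rather than taken from the scheme.

```python
import secrets

p = 2**61 - 1  # toy prime group order (assumption)


def inv(e: int) -> int:
    """Inverse of a nonzero exponent modulo the prime p."""
    return pow(e, -1, p)


def rand_exp() -> int:
    """Random nonzero exponent."""
    return secrets.randbelow(p - 1) + 1


x, y, j, k, u, a_gamma = (rand_exp() for _ in range(6))

# Values available to the simulator, as exponents: g^(jx), g^(k/y), Z.
g_jx = j * x % p
g_k_over_y = k * inv(y) % p
Z = (j + k) % p  # the beta = 0 case, Z = g^(j + k)

# Programmed hash value for the challenge keyword (c_gamma = 0).
h_gamma = (g_k_over_y + a_gamma) % p

# Challenge ciphertext as built by the simulator: [Z * g^u * (g^y)^a_gamma, g^(xj)].
c1_sim = (Z + u + y * a_gamma) % p
c2_sim = g_jx

# Honest ciphertext [t * H(W)^s * g^r, h^r] with t = g^u, s = y, h = g^x, r = j.
c1_real = (u + h_gamma * y + j) % p
c2_real = x * j % p
```

The simulator never needs $j$, $k$, $x$, or $y$ themselves: the two components are assembled from $Z$, $g^{y}$, and $g^{jx}$, which are all part of the mDLIN instance.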

5.2.4 Phase 2

$\mathcal{A}$ continues to issue queries to the oracles, with the restriction that it cannot issue $W_0^*$ or $W_1^*$ to $\mathcal{O}_T$ or $\mathcal{O}_C$. As in the previous game, in this phase, $\mathcal{B}$ sets $c_i = 1$ for all new queries to $\mathcal{O}_H$.

5.2.5 Guess

Finally, $\mathcal{A}$ outputs a bit $\gamma'$. If $\gamma' = \gamma$, then $\mathcal{B}$ outputs $\beta' = 0$, otherwise $\beta' = 1$.

Denote by $\mathrm{abt}$ the event that $\mathcal{B}$ aborts during the game. Its probability is analysed as in the previous game.

  • $\mathrm{abt}_0$: if $g_i = g^{u}$ in the simulation of $\mathcal{O}_{H_2}$. Since $u$ was selected randomly over $\mathbb{Z}_p$, determining $g^{u}$ is either a random guess or, given that $H_2(g^{x}) = u$, corresponds to solving the CDH problem, since the adversary knows $\mathrm{pk}_S = g^{x/a}$ and $\mathrm{pk}_R = g^{a}$. Therefore, under some limitations on the number of queries $q_{H_2}$, $\Pr[\mathrm{abt}_0]$ is negligible.

  • $\mathrm{abt}_1$: if $c_i = 0$ in the simulation of $\mathcal{O}_C$ or $\mathcal{O}_T$. Each $c_i$ is selected randomly and independently, therefore the probability that $\mathrm{abt}_1$ does not happen is $\Pr[\overline{\mathrm{abt}_1}] = (1 - \delta)^{q_C + q_T}$.

  • $\mathrm{abt}_2$: if $c_0^* = c_1^* = 1$ in the generation of the challenge ciphertext. Therefore, $\Pr[\overline{\mathrm{abt}_2}] = 1 - (1 - \delta)^{2}$.

So, the probability that $\mathcal{B}$ does not abort in the game is given by:

$$\Pr[\overline{\mathrm{abt}}] = \Pr[\overline{\mathrm{abt}_0}] \cdot \Pr[\overline{\mathrm{abt}_1}] \cdot \Pr[\overline{\mathrm{abt}_2}].$$

With $\delta = 1 - \sqrt{\frac{q_T + q_C}{q_T + q_C + 2}}$, we obtain:

$$\Pr[\overline{\mathrm{abt}}] = \Pr[\overline{\mathrm{abt}_0}] \cdot \left( \frac{q_T + q_C}{q_T + q_C + 2} \right)^{\frac{q_T + q_C}{2}} \cdot \frac{2}{q_T + q_C + 2},$$

which is approximately equal to $\Pr[\overline{\mathrm{abt}_0}] \cdot \frac{2}{(q_T + q_C)\, e}$ and thus non-negligible. We have seen that, if $\beta = 0$ (i.e., $Z = g^{j + k}$) and $\mathcal{B}$ does not abort, then the view of $\mathcal{A}$ is identically distributed as in a real attack. In this case, if $\mathcal{A}$ succeeds in breaking the ciphertext privacy of our scheme, then $\mathcal{B}$ succeeds in solving the mDLIN problem instance. As before, if $\beta = 1$, then $\mathcal{A}$ acts on random inputs, so it effectively outputs a random guess, and thus the probability of guessing correctly is $\tfrac{1}{2}$. Therefore, the probability of guessing the bit $\beta$ (and thus solving the mDLIN problem) is, just like in the previous game:

$$\Pr[\beta' = \beta] = \tfrac{1}{2}\, \varepsilon_C \Pr[\overline{\mathrm{abt}}] + \tfrac{1}{2}.$$

Thus, if the advantage $\varepsilon_C$ of the adversary $\mathcal{A}$ is non-negligible, then the advantage $\Pr[\beta' = \beta] - \tfrac{1}{2}$ of $\mathcal{B}$ is also non-negligible.

To better understand the improvements of the proposed scheme, we present a comparison with the PAEKS scheme presented in the study by Qin et al. [15], see Table 1.

Table 1

Comparison of our scheme with the one proposed in the study by Qin et al. [15]

| Scheme | Ciphertext | Trapdoor | TI-security | CI-security |
|---|---|---|---|---|
| [15] | Randomized | Deterministic | Standard in multi-user | Fully in multi-user |
| This article | Randomized | Randomized | Fully in multi-user | Standard in single-user |

6 Conclusions and open problems

In this work, we presented a new PAEKS scheme that randomizes not only the ciphertext but also the trapdoor. We proved that our scheme is fully TI-secure and CI-secure (or fully CI-secure and TI-secure if we swap the encryption and trapdoor algorithms). We also discussed how to use our SE scheme for combinations of keywords.

Future work could proceed in two directions:

  • To modify our scheme in order to obtain trapdoors that implement disjunctive queries, i.e., that can be satisfied simultaneously by subsets of a set of keywords. This would reduce the amount of information leaked to the cloud server when it performs the test, since it would be possible to tune the search more finely and disclose only the overall match between trapdoors and ciphertexts.

  • To extend our scheme so that it also provides CI-security in the multi-user setting.

Acknowledgements

This work has been accepted for presentation at CIFRIS23, the Congress of the Italian association of cryptography “De Componendis Cifris.” The first, second, and last authors are members of the INdAM Research Group GNSAGA.

  1. Funding information: This work has been partially supported by the project SERICS (PE00000014) under the MUR National Recovery and Resilience Plan funded by the European Union - NextGenerationEU. Funding by the MUR Excellence Department Project awarded to Dipartimento di Matematica, Università di Genova, CUP D33C23001110001, and by the European Union within the program NextGenerationEU. The third author acknowledges support from Ripple’s University Blockchain Research Initiative.

  2. Author contributions: All authors have accepted responsibility for the entire content of this manuscript and approved its submission.

  3. Conflict of interest: Prof. Massimiliano Sala is the Editor-in-Chief of the Journal of Mathematical Cryptology and was not involved in the review process of this article.

  4. Data availability statement: Data sharing is not applicable to this article as no datasets were generated or analysed during the current study.

References

[1] Boneh D, Crescenzo GD, Ostrovsky R, Persiano G. Public key encryption with keyword search. In: International conference on the theory and applications of cryptographic techniques. Berlin, Heidelberg: Springer; 2004. p. 506–22. doi:10.1007/978-3-540-24676-3_30

[2] Song DX, Wagner DA, Perrig A. Practical techniques for searches on encrypted data. In: 2000 IEEE Symposium on Security and Privacy. Berkeley, CA, USA: IEEE Computer Society; 2000. p. 44–55.

[3] Demertzis I, Chamani JG, Papadopoulos D, Papamanthou C. Dynamic searchable encryption with small client storage. In: 27th Annual Network and Distributed System Security Symposium. NDSS 2020, San Diego, California, USA, 23-26 February 2020. The Internet Society; 2020. doi:10.14722/ndss.2020.24423

[4] He D, Ma M, Zeadally S, Kumar N, Liang K. Certificateless public key authenticated encryption with keyword search for industrial internet of things. IEEE Trans Ind Inform. 2018;14(8):3618–27. doi:10.1109/TII.2017.2771382

[5] Noroozi M, Karoubi I, Eslami Z. Designing a secure designated server identity-based encryption with keyword search scheme: still unsolved. Ann des Telecommun. 2018;73(11-12):769–76. doi:10.1007/s12243-018-0653-4

[6] Qin B, Chen Y, Huang Q, Liu X, Zheng D. Public-key authenticated encryption with keyword search revisited: security model and constructions. Inf Sci. 2020;516:515–28. doi:10.1016/j.ins.2019.12.063

[7] Soleimanian A, Khazaei S. Publicly verifiable searchable symmetric encryption based on efficient cryptographic components. Des Codes Cryptography. 2019;87(1):123–47. doi:10.1007/s10623-018-0489-y

[8] Byun JW, Rhee HS, Park H, Lee DH. Off-line keyword guessing attacks on recent keyword search schemes over encrypted data. In: SDM 2006, LNCS 4165; 2006. p. 75–83. doi:10.1007/11844662_6

[9] Jeong IR, Kwon JO, Hong D, Lee DH. Constructing PEKS schemes secure against keyword guessing attacks is possible? Comput Commun. 2009;32(2):394–6. doi:10.1016/j.comcom.2008.11.018

[10] Lu Y, Wang G, Li J. Keyword guessing attacks on a public key encryption with keyword search scheme without random oracle and its improvement. Inf Sci. 2019;479:270–6. doi:10.1016/j.ins.2018.12.004

[11] Yau W-C, Heng S-H, Goi B-M. Off-line keyword guessing attacks on recent public key encryption with keyword search schemes. In: Rong C, Jaatun MG, Sandnes FE, Yang LT, Ma J, (eds.) ATC 2008. LNCS. Vol. 5060. Heidelberg: Springer; 2008. p. 100–5. doi:10.1007/978-3-540-69295-9_10

[12] Yau W, Phan RC, Heng S, Goi B. Keyword guessing attacks on secure searchable public key encryption schemes with a designated tester. Int J Comput Math. 2013;90(12):2581–7. doi:10.1080/00207160.2013.778985

[13] Huang Q, Li H. An efficient public-key searchable encryption scheme secure against inside keyword guessing attacks. Inf Sci. 2017;403:1–14. doi:10.1016/j.ins.2017.03.038

[14] Noroozi M, Eslami Z. Public key authenticated encryption with keyword search: revisited. IET Inf Secur. 2019;13(4):336–42. doi:10.1049/iet-ifs.2018.5315

[15] Qin B, Cui H, Zheng X, Zheng D. Improved security model for public-key authenticated encryption with keyword search. In: International Conference on Provable Security. Cham: Springer; 2021. p. 19–38. doi:10.1007/978-3-030-90402-9_2

[16] Emura K. Generic construction of public-key authenticated encryption with keyword search revisited: stronger security and efficient construction. Proceedings of the 9th ACM on ASIA Public-Key Cryptography Workshop; 2022. doi:10.1145/3494105.3526237

[17] Boneh D, Franklin M. Identity-based encryption from the Weil pairing. In: Annual international cryptology conference. Berlin, Heidelberg: Springer; 2001. p. 213–29. doi:10.1007/3-540-44647-8_13

[18] Bethencourt J, Sahai A, Waters B. Ciphertext-policy attribute-based encryption. In: 2007 IEEE Symposium on Security and Privacy (SP’07). IEEE; 2007. p. 321–34. doi:10.1109/SP.2007.11

[19] Joux A. A one round protocol for tripartite Diffie–Hellman. In: International algorithmic number theory symposium. Berlin, Heidelberg: Springer; 2000. p. 385–93. doi:10.1007/10722028_23

[20] Boneh D, Lynn B, Shacham H. Short signatures from the Weil pairing. In: International Conference on the theory and Application of Cryptology and Information Security. Berlin, Heidelberg: Springer; 2001. p. 514–32. doi:10.1007/3-540-45682-1_30

[21] Mascia C, Sala M, Villa I. A survey on functional encryption. Adv Math Commun. 2023;17(5):1251–89. doi:10.3934/amc.2021049

Received: 2023-09-05
Accepted: 2023-10-18
Published Online: 2024-02-14

© 2024 the author(s), published by De Gruyter

This work is licensed under the Creative Commons Attribution 4.0 International License.

Downloaded on 27.4.2024 from https://www.degruyter.com/document/doi/10.1515/jmc-2023-0029/html